1. Introduction
Recent studies deal with the problem of measuring the disparity of a particular probability density function (pdf) from the normal one [
1]. A typical approach has been to derive exact expressions for information measures of particular distributions. For example, Vidal et al. [
2] measure the sensitivity of the skewness parameter using the
distance between symmetric and asymmetric distributions. Stehlík [
3] proved results on the decomposition of Kullback–Leibler (KL) divergences [
4] in the gamma and normal family for the divergence between the Maximum Likelihood Estimator (MLE) of the canonical parameter and the canonical parameter of the regular exponential family [
5]. Contreras-Reyes and Arellano-Valle [
6] considered Jeffrey’s (J) divergence [
7] to compare the multivariate Skew-Normal (SN) with the normal distribution, and Gómez-Villegas et al. [
8] assessed the effect of kurtosis deviations from normality on conditional distributions, such as the multivariate exponential power family. Main et al. [
9] evaluated the local effect of asymmetry deviations from normality using the KL divergence measure of the SN distribution and then compared the local sensitivity with Mardia’s and Malkovich–Afifi’s skewness indexes. They also advocate the use of the SN model to accommodate the asymmetry of an empirical distribution because it reflects the deviation in a tractable way. Dette et al. [
10] characterize the “disparity” between skew-symmetric models and their symmetric counterparts in terms of the total variation distance, which is later used to construct priors. Their paper provides additional insights, beyond those in Vidal et al. [
2], on the interpretation of this distance and also discusses the use of the KL divergence among several other distances.
Some recent applications of measuring the disparity of a particular pdf from the normal one using negentropy include those by Gao and Zhang [
11] and Wang et al. [
12], where the negentropy method has been successfully applied to seismic wavelet estimation. Pires and Ribeiro [
13] used negentropy to measure the departure of independent components from Gaussianity, with application to the Northern Hemispheric winter monthly variability of a high-dimensional quasi-geostrophic atmospheric model. Furthermore, Pires and Hannachi [
14] used a tensorial invariant approximation of the multivariate negentropy in terms of a linear combination of squared coskewness and cokurtosis. The method was then applied to global sea surface temperature anomalies, after the anomalies were tested for non-Gaussianity.
In this paper, we develop a procedure, based on KL divergences, to test the significance of the skewness parameter in the Generalized Skew-Normal (GSN) distributions, a flexible class of distributions that includes the SN and normal ones as particular cases. We consider asymptotic expansions, in terms of moments and cumulants, of the negentropy for two particular cases: the SN and Modified Skew-Normal (MSN) distributions. Given that SN distributions do not satisfy the regularity condition on the Fisher Information Matrix (FIM) at
, normality is tested based on the MSN distribution [
15]. This allows one to implement an asymptotic normality test for testing the significance of the skewness parameter. Numerical results are studied by: (a) comparing numerical integration methods with proposed asymptotic expansions; (b) comparing the asymptotic test with the likelihood ratio test and the asymptotic normality test given by Arrué et al. [
15]; and (c) applying the proposed test to condition factor time series of anchovy (
Engraulis ringens).
This paper is organized as follows: information theoretic measures are described in
Section 2. In
Section 3, we provide an asymptotic expansion in terms of the corresponding cumulants for the GSN, SN and MSN negentropies. We also express the KL and J divergences between each GSN distribution and the normal one in terms of negentropies (as cumulant expansion series) to develop the hypothesis test for the significance of the skewness parameter (
Section 4). A simulation study is given in
Section 5. In
Section 6, real condition factor time series of anchovy off northern Chile illustrate the usefulness of the developed methodology. Section 7 concludes with a discussion.
2. Shannon Entropy and Related Measures
The Shannon Entropy (SE) of a random variable
Z with pdf
f is given by:
The SE of a location-scale random variable
does not depend on
and is such that
(see, e.g., [
16]). The SE could serve to define a measure of disparity from normality, the so-called negentropy [
17], which is zero for a Gaussian variable and positive for any non-Gaussian distribution. It is defined by:
where
is a normal random variable with the same mean and variance as those of
Z. Equation (
2) expresses the negentropy in terms of the standardized version of
Z, say
, as
; here,
has zero mean and unit variance. Thus, the negentropy essentially measures how far the distribution departs from the normal entropy. Furthermore, the negentropy is clearly the KL divergence (see Equation (
3) below) between
and
.
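To make the definition concrete, the following R sketch computes the SE by numerical quadrature and the negentropy as the difference from the entropy of a normal variable with the same mean and variance; the two-component normal mixture used as the density is purely an illustrative assumption, not a distribution from this paper.

```r
# Negentropy of a continuous density by numerical quadrature (illustrative sketch).
# Example density: a two-component normal mixture (purely hypothetical choice).
f <- function(z) 0.6 * dnorm(z, mean = -1, sd = 1) + 0.4 * dnorm(z, mean = 2, sd = 1.5)

# Mean and variance of Z under f, computed numerically
mu <- integrate(function(z) z * f(z), -Inf, Inf)$value
m2 <- integrate(function(z) z^2 * f(z), -Inf, Inf)$value
s2 <- m2 - mu^2

# Shannon entropy H(Z) = -E[log f(Z)]
H_Z <- integrate(function(z) ifelse(f(z) > 0, -f(z) * log(f(z)), 0), -Inf, Inf)$value

# Entropy of the normal variable with the same mean and variance
H_ZN <- 0.5 * log(2 * pi * exp(1) * s2)

# Negentropy: nonnegative, and zero only in the Gaussian case
negentropy <- H_ZN - H_Z
```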
Given that the calculation of the negentropy presents a computational challenge, because the integral involves the pdf of
Z [
16,
18], different approximations of the negentropy are used, such as cumulant expansion series [
17,
19]. Withers and Nadarajah [
19] provided exact and explicit series expansions for the SE and negentropy of a standardized pdf
f on
, in terms of cumulants. Yet, they did not perform numerical studies that allow evaluation and comparison with other procedures in some specific families of distributions.
Other measures related to the SE are KL and J divergences. They measure the degree of divergence between the distributions of two random variables
and
with pdfs
and
, respectively. The KL divergence of the pdf for
from the pdf for
is defined as:
as indicated in the notation, the expectation is defined with respect to the pdf
for
. Since in general
differs from
, the J divergence is considered as a symmetric version of the KL divergence, which is defined by:
3. Generalized Skew-Normal Distributions
An attractive class of Skew-Symmetric (SS) distributions defined in terms of the pdf appears in Azzalini [
20], Azzalini and Capitanio [
21] and Gupta et al. [
22]:
where
represents a skewness/shape parameter,
f and
G are the respective pdf and cumulative distribution function (cdf) of symmetrical continuous distributions and
is an odd function of
z, with
for any fixed value of
. Furthermore, we assume that
for all
z and some value
of
(typically
), so that
, thus recovering symmetry.
The notation
expresses that random variable
Z has a distribution with the pdf given by (
5). If
represents the pdf of the standardized normal distribution, denoted by
, then (
5) becomes a family of skew-symmetric distributions generated by the normal kernel, the GSN family. In this case,
emerges. An important property of the GSN random variable
Z is that all its moments are finite. In particular, it possesses the same even moments as
. For instance,
, and so,
, where
. The most popular GSN distribution is Skew-Normal (SN) [
23], for which
and
is the cdf of the standardized normal distribution. Therefore,
expresses that
Z follows an SN distribution. The location-scale extension of the SS pdf in (
5) follows by applying the Jacobian method to the linear random variable
, where
and
. In this case, we state that
X follows an SS distribution with location parameter
, scale parameter
and shape/skewness parameter
and obtain
. Furthermore, we write
if
,
if
and
; and
if
,
and
.
Two other members of the GSN family that have been studied recently are the Skew-Normal-Cauchy (SNC) distribution [
24,
25], which follows from (
5) by taking
,
and
, and the Modified Skew-Normal (MSN) distribution [
15], for which
,
and
. Nadarajah and Kotz [
24] note that the SNC distribution appears to attain a higher degree of sharpness than the normal distribution, i.e., there is a disparity from the normal distribution produced by the skewness parameter
. A random variable
Z with the SNC or MSN distribution is denoted, respectively, by
or
and by
or
for their respective location-scale extensions.
We consider the SE for the GSN subclass, i.e., the SE of
. Thus, assuming a normal kernel in (
5), we get the GSN-SE given by:
where
is the SE of
. It is assumed that a specific skewness value
exists so that
and so that
, thus recovering symmetry at
. Therefore, at
,
Z and
have the same distribution and thus the same SE.
Let
and
be the mean and variance of
, respectively, which are functions of the skewness parameter
. Since
and
, we get from (
2) that the negentropy of
Z becomes:
Since at
, we have by symmetry
and
, so the negentropy in this case is null, as expected. Clearly, SE and negentropy depend on the choice of the functions
and
. In this paper, we consider both families of GSN distributions for which
, with
and
, so that
and normality is recovered at
. Examples of such functions are
and
for some odd function
, with
. In this case, recalling that
, where
,
, and “
” denotes equality in distribution, we obtain:
thus
and
. That is, the entropy and negentropy of
depend on the skewness parameter
only through its absolute value
.
We are interested in both the KL and J divergences for a GSN distribution with respect to the normal distribution, that is, assuming in (
3) and (
4) that
and
. In this case, remembering that
, where
and
, we have
and
, with:
Therefore,
, with:
We also develop asymptotic expansions of the J divergence for the SN and MSN distributions from the normal distribution. To do this, we consider the following preliminary result, the proof of which stems from (
9) and (
10) by using the Taylor expansion of
at
and also because of the facts that (a) all moments of
are finite and (b)
and
have the same even moments.
Lemma 1. Consider the composite function , , assuming that both functions and are infinitely differentiable; hence, is also infinitely differentiable at . If , then: where is the k-th derivative of . Moreover, from (11), the expressions (6) and (7) for the SE and negentropy of the GSN distributions take the forms: respectively, where .
Notice in Lemma 1 that the coefficient depends on the derivatives of and at , which change across the different GSN distributions. Moreover, since the expansion of is taken around for a fixed , the approximations may not be reasonable for some values of .
3.1. Skew-Normal Distribution
If
or
represents an SN random variable, then its pdf is:
Clearly, if
, then (
12) reduces to the
-pdf. The SN random variable
Z can be conveniently represented as a linear combination of half-normal and normal variables through the following stochastic representation [
26]:
where
,
and
U are independent and identically distributed with a unit normal distribution. In particular, since the half-normal random variable
has mean
and variance one, it follows from (
12) that the mean and variance of
,
, are given by:
where
.
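The representation above also gives a direct way to simulate SN variates and to verify the stated mean and variance empirically. The R sketch below is a minimal illustration (not the authors' code), assuming the standard form Z = delta*|X0| + sqrt(1 - delta^2)*U with delta = lambda/sqrt(1 + lambda^2), as in [26].

```r
# Simulate SN(0, 1, lambda) variates via the half-normal/normal representation (sketch)
rsn_rep <- function(n, lambda) {
  delta <- lambda / sqrt(1 + lambda^2)
  X0 <- abs(rnorm(n))   # half-normal component
  U  <- rnorm(n)        # independent standard normal component
  delta * X0 + sqrt(1 - delta^2) * U
}

set.seed(1)
lambda <- 2
z <- rsn_rep(1e6, lambda)
delta <- lambda / sqrt(1 + lambda^2)

# Compare empirical moments with the theoretical mean and variance
c(mean(z), sqrt(2 / pi) * delta)        # mean: sqrt(2/pi) * delta
c(var(z),  1 - (2 / pi) * delta^2)      # variance: 1 - (2/pi) * delta^2
```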
In the SN case,
and
, which are both infinitely differentiable functions at
. Consequently, the function
is also infinitely differentiable at
, thus admitting a Taylor expansion about zero. Therefore, since
, where
is the
k-th derivative of
, the expansion (
11) in Lemma 1 of
,
, becomes:
where
and
(see
Appendix A).
In summary, since the even moments of
are also the even moments of
, Equation (
14) can be rewritten as:
Hence, considering also Equation (
13), we can compute for the SN case the results for the KL and J divergences, SE and negentropy given in Lemma 1 using the following Proposition 1.
Proposition 1. Let and . Then: where the coefficients , , are given in Appendix A.
To gain a more complete analysis of the behavior of these series, we need suitable expressions for computing the coefficients
,
(see
Appendix A).
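The series approximations of Proposition 1 can be checked against an “exact” negentropy obtained by numerical integration of the SN density, in the spirit of the Quadpack-based comparisons of Section 5. The R sketch below is illustrative only: it uses the standard SN density 2*dnorm(z)*pnorm(lambda*z) and the mean and variance given above.

```r
# "Exact" negentropy of SN(0, 1, lambda) by numerical quadrature (sketch)
negentropy_sn <- function(lambda) {
  dsn0  <- function(z) 2 * dnorm(z) * pnorm(lambda * z)  # SN(0, 1, lambda) density
  delta <- lambda / sqrt(1 + lambda^2)
  mu_z  <- sqrt(2 / pi) * delta
  s2_z  <- 1 - (2 / pi) * delta^2
  H_Z   <- integrate(function(z) ifelse(dsn0(z) > 0, -dsn0(z) * log(dsn0(z)), 0),
                     -Inf, Inf)$value
  # Negentropy = entropy of the matching normal law minus the SN entropy
  0.5 * log(2 * pi * exp(1) * s2_z) - H_Z
}

sapply(c(0, 0.5, 1, 2, 5), negentropy_sn)  # zero at lambda = 0, increasing with |lambda|
```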
3.2. Modified Skew-Normal Distribution
The pdf for a random variable
Z with MSN distribution, denoted by
, is given by:
where
. Similarly to the SN case, the MSN random variable
,
, has even moments equal to the corresponding even moments of the standardized normal random variable
[
15], i.e.,
(for odd moments
,
, see
Appendix A).
In the MSN case,
and
, both of which are infinitely differentiable at
. Thus, in Lemma 1, we have
, where
is also infinitely differentiable at
. Hence, the series expansion of
,
, can be obtained from (
11) for which we need the derivatives of the composite function
. Another way to obtain these derivatives is to define the random variable
and use (
14) with
and
replaced by
and
, respectively. Thus, we obtain the series expansion:
From Lemma 1, the KL and J divergences, SE and negentropy for the MSN case can be computed using the following Proposition 2.
Proposition 2. Let and . Then:
In order to compute the quantities given by Proposition 2, we need to calculate the new moments
,
. Since
is a random variable limited to the interval
, all its moments are finite. In particular,
clearly has the same even moments as
. Moreover, from the Jacobian method, the pdf of
becomes:
Hence, the
k-th moment of
is:
which must be computed numerically.
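Since these moments admit no closed form, they can be evaluated by quadrature. The R sketch below is generic and hedged: dW stands for the density obtained above via the Jacobian method (the reader supplies it from that expression), and the integration limits default to the whole real line but can be restricted to the bounded support of the variable.

```r
# k-th moment of a random variable with density dW, by numerical quadrature (sketch).
# dW is a placeholder for the density obtained above via the Jacobian method;
# it must be supplied by the reader from that expression.
moment_k <- function(k, dW, lower = -Inf, upper = Inf) {
  integrate(function(w) w^k * dW(w), lower, upper)$value
}

# Usage sketch (purely illustrative density in place of the actual dW):
# dW_example <- function(w) dnorm(w)
# sapply(1:6, moment_k, dW = dW_example)
```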
3.3. J Divergence between SN and MSN Distributions
In the previous sections, the SN and MSN distributions were compared with the normal distribution by means of the J divergence measure. As a byproduct, we also computed the J divergence between the SN and MSN distributions, both with the same skewness parameter. This allows one to measure the distance between these distributions for different
’s. For this, we consider in Equation (
4) that
and
and define the random variables
for
, where
. Let
and
,
. Recall that
and
for all
. Thus, using (
4) and then the Taylor expansion of
around
, Proposition 3 is obtained:
Proposition 3. Let , and . Define the random variables , . Then: where, as before:
Proposition 3 indicates that the J divergence between the SN and MSN distributions decomposes into the divergences of the normal distribution from each of these distributions, which depend only on their odd moments and cumulants.
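Proposition 3 can also be checked by direct numerical integration of the definitions in (3) and (4). The R sketch below is a minimal illustration, not the authors' code: it uses the standard SN density 2*dnorm(z)*pnorm(lambda*z), while dmsn is a placeholder for the MSN density of Section 3.2 that the reader must supply, and the symmetrized (sum) convention for the J divergence is assumed.

```r
# J divergence between SN(0, 1, lambda) and MSN(0, 1, lambda) by quadrature (sketch).
# dmsn(z, lambda) is a placeholder for the MSN density of Section 3.2.
kl_num <- function(f1, f2) {
  integrate(function(z) {
    p <- f1(z); q <- f2(z)
    ifelse(p > 0, p * (log(p) - log(q)), 0)
  }, -Inf, Inf)$value
}

j_sn_msn <- function(lambda, dmsn) {
  f_sn  <- function(z) 2 * dnorm(z) * pnorm(lambda * z)
  f_msn <- function(z) dmsn(z, lambda)
  kl_num(f_sn, f_msn) + kl_num(f_msn, f_sn)  # symmetrized (J) divergence
}
```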
4. Asymptotic Tests
Let
,
,
, be the pdf of a regular parametric class of distributions, i.e., for which the sample space
does not depend on
, the parametric space
is an open subset of
, and the regularity conditions (i)–(iii) stated in Salicrú et al. [
27] are satisfied. As in Salicrú et al. [
27], we denote the KL divergence between
and
,
, by:
Consider the partition
, where
and
. Let
and consider the null hypothesis
for a known
. Let
and
be the (unrestricted) MLE of
and
, respectively, both based on a random sample of size
n from
X with pdf
. Under these conditions, we have from Part (b) of Theorem 2 presented in Salicrú et al. [
27] that:
where “
” denotes convergence in distribution and
denotes the chi-squared distribution function with
s degrees of freedom. From (
17), the above null hypothesis can be tested by the statistic
, which is asymptotically chi-squared distributed with
degrees of freedom. Specifically, for large values of
n, if we observe
, then
is rejected at level
if
.
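In practice, once the divergence has been evaluated at the MLEs, the decision rule reduces to a chi-squared comparison. The R sketch below is a hedged illustration of this step, assuming (as in the discussion above) that the test statistic is 2n times the estimated KL divergence; the values of kl_hat, n and s in the example are purely hypothetical.

```r
# Asymptotic KL-divergence test: reject H0 when 2 * n * KL_hat exceeds the
# chi-squared critical value with s degrees of freedom (sketch).
kl_test <- function(kl_hat, n, s, alpha = 0.05) {
  stat    <- 2 * n * kl_hat
  crit    <- qchisq(1 - alpha, df = s)
  p_value <- pchisq(stat, df = s, lower.tail = FALSE)
  list(statistic = stat, critical = crit, p.value = p_value, reject = stat > crit)
}

# Example with hypothetical values: kl_hat = 0.015, n = 200, s = 1
kl_test(kl_hat = 0.015, n = 200, s = 1)
```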
4.1. One-Sample Case: Test for Normality
The result in (
17) can be applied, for example, to construct a normality test from the KL divergence between a regular GSN distribution and the normal distribution. Specifically, consider a random sample
from
and the null hypothesis
under which
; thus, the GSN random variable
X becomes a
random variable. Let
be the MLE of
and
. Therefore, under
, we have:
where
is the MLE of
, which is defined in Equation (
11) of Lemma 1 and depends only on
. As stated in the Introduction, normality is typically obtained from the GSN class at
or equivalently
.
Azzalini [
20], Arellano-Valle and Azzalini [
28] and Azzalini and Capitanio [
23] recall the singularity of the SN FIM at
, which prevents the use of the asymptotic distribution of the above test statistics. As suggested by Azzalini [
20], a solution to recover the non-singularity of the information matrix under the symmetry hypothesis comes from the use of the so-called centered parametrization defined in terms of the mean, variance and the skewness parameters of the SN distribution (see also [
28,
29]). In contrast, the FIM of the MSN model is non-singular at
[
15]. Thus, this model satisfies all the standard regularity conditions of Salicrú et al. [
27], leading to consistency and asymptotic normality of the MLEs under the null hypothesis of normality. Therefore, the MSN model serves to test the null hypothesis of normality using (
18). Hence, the symmetry null hypothesis
is rejected at level
if
, with
.
4.2. Two-Sample Case
Consider two independent samples of sizes
and
from
and
, respectively; where
, and
and
have pdfs
and
, respectively. Suppose partition
,
, and assume
, so that
,
. Let
be the MLE of
,
, which correspond to the MLE of the full model parameters
under null hypothesis
. Thus, Part (b) of Corollary 1 in Salicrú et al. [
27] establishes that if the null hypothesis
holds and
, with
, then:
Thus, a test of level
for the above homogeneity null hypothesis consists of rejecting
if:
where
is the
-th percentile of the
-distribution.
Contreras-Reyes and Arellano-Valle [
6] considered the result of Kupperman [
30] to develop an asymptotic test of complete homogeneity in terms of the J divergence between two SN distributions. The SN distribution satisfies all the aforementioned regularity conditions when the skewness parameter
. Thus, considering this condition, we can also apply (
17) and (
19) to obtain, respectively, asymptotic tests with one or two samples of other hypotheses not covered by Kupperman’s test.
5. Simulations
In this section, we study the behavior of the series expansions of the SE and negentropy for the SN and MSN distributions. In both cases, we compare the SE and negentropies obtained from their series expansions with their corresponding “exact” versions computed from the Quadpack numerical integration method of Piessens et al. [
31]. More precisely, the “exact” expected values
and
are computed using the Quadpack method as in Arellano-Valle et al. [
16], Contreras-Reyes and Arellano-Valle [
6] or Contreras-Reyes [
18]. From the series expansions, the SE and negentropies were computed for
as in Withers and Nadarajah [
19]. However, they tend to converge for
as in the Gram–Charlier and Edgeworth expansion methods (see, e.g., Hyvärinen et al. [
17] and Stehlík et al. [
1], respectively). All proposed methods are implemented with R software [
32].
From
Figure 1, we observe that the approximations by series expansions are better in the MSN case (Panels C and D) than in the SN case (Panels A and B). Furthermore, the series expansion approximations are quite accurate for small to moderate values of the skewness parameter
; more specifically, for
in the SN case, and
in the MSN case. Additionally, Panels A and C show that the SE decreases as
increases, while Panels B and D indicate that the negentropy increases with
. Finally, as expected in both GSN models, the SE is less than or equal to the SE of the normal model, namely
[
6,
33].
Panel A of
Figure 2 shows the behavior of the KL divergences of the SN and MSN distributions from the normal one, obtained from the series expansions given in Equations (
15) and (
16). As in
Figure 1, the KL divergence between the SN and normal distributions increases smoothly for values of
, but rises sharply for
. Meanwhile, the increase in KL divergence between the MSN and normal distributions seems more stable, at least for
. Crucially, for
, the SN model is close to its maximum level of asymmetry, while the MSN model does so for
(see [
15] (Figure 2)).
Table 1 presents the observed power of the asymptotic test of normality obtained from Equation (
18) in
Section 4.1, for different sample sizes and values of the skewness parameter. All these results were obtained from 2000 simulations for a nominal level of 5%. In each simulation, the MLE of
was obtained by maximizing the log-likelihood function:
for shape parameter
and a random sample of size
n from
Z [
15].
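The maximization of the log-likelihood in (20) is one-dimensional and can be carried out directly. The R sketch below is illustrative only: dmsn(z, lambda) is a placeholder for the MSN density of Section 3.2, and the bounded search interval is an assumption.

```r
# MLE of the shape parameter by maximizing the MSN log-likelihood of Equation (20)
# (sketch; dmsn is a placeholder for the MSN density of Section 3.2).
loglik_msn <- function(lambda, z, dmsn) sum(log(dmsn(z, lambda)))

mle_lambda <- function(z, dmsn, bounds = c(-20, 20)) {
  optimize(function(l) loglik_msn(l, z, dmsn), interval = bounds, maximum = TRUE)$maximum
}
```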
Table 1 shows that the proposed test is rather conservative, since the observed rate of incorrect rejections of the normality hypothesis is always lower than the nominal level. The proposed test is also considerably more powerful in large samples (
) and values of the skewness parameter far from zero (
). As expected, the power of the test increases with sample size, particularly for small values of the skewness parameter (close to normality), given that statistic
depends on
n despite
being small (
Figure 2).
Now, we compare the proposed asymptotic test with two additional tests considered by Arrué et al. [
15] for null hypothesis
versus
: the Likelihood Ratio Test (LRT) (see
Appendix A) and the asymptotic normality-based test. Since the regularity condition on MSN’s FIM at
is satisfied, the authors proposed a normal distributional theory for testing
, i.e., based on the asymptotic normality of the MLE given by
, as
, where
is the MLE of
,
and
is the inverse FIM component related to
. For the asymptotic normality test and the LRT, they conclude that
is rejected for large values of
, and for large values of
n, the coverage rate increases when
exists (
is rejected) (see [
15] (Tables 3–5)). Analogously, in Table 6 of Arrué et al. [
15], the coverage rate increases when
exists for large values of
n.
6. Application to Condition Factor Time Series
To apply our results to a real-world problem, we considered the Condition Factor (CF) index [
34], which serves as an important indicator of the fatness condition of fish [
18]. The CF index,
of an individual of length
L is computed in terms of the observed weight
and an estimation
obtained from the morphometric relationships of the expected weight
at length
L. Then, the CF index is interpreted in terms of food deficit (<100%) or food abundance (>100%) conditions. The expected length-weight relationship is described by the non-linear relationship:
where
is the theoretical weight at length zero and
is the weight growth rate [
35]. According to (
21),
is computed as
, where
and
are obtained by fitting the non-linear regression induced by (
21) to the length-weight data obtained from a sample of the species under study.
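As an illustration of the fitting step just described, the following R sketch fits the usual allometric form W = a*L^b by non-linear least squares and computes the CF index; the data frame lw, its column names and the starting values are assumptions introduced only for the example.

```r
# Fit the length-weight relationship W = a * L^b by non-linear least squares and
# compute the condition factor CF = 100 * observed / expected weight (sketch).
# 'lw' is an assumed data frame with columns 'length' and 'weight'.
fit <- nls(weight ~ a * length^b, data = lw, start = list(a = 0.005, b = 3))

a_hat <- coef(fit)["a"]
b_hat <- coef(fit)["b"]

# Expected weight at each observed length, and the CF index (>100%: food abundance)
w_expected <- a_hat * lw$length^b_hat
cf <- 100 * lw$weight / w_expected
```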
The CF index can be mainly affected by environmental factors such as El Niño (warm events) or La Niña (cold events). These effects drive threshold biological processes due to food limitation. For these reasons, Contreras-Reyes [
18] considered a threshold autoregressive model based on the stochastic representation (
12) to model CF time series. That is, by assuming an SN distribution with skewness parameter
for the CF index [
20], the condition
ensures the weak stationarity of the process. Additionally, when
is positive, CF values fall below 100% (food deficit). Otherwise, CF values are greater than 100% (food abundance).
We applied the hypothesis tests developed in
Section 4 to monthly CF time series of anchovy from Chile’s northern coast during the period 1990–2010, classified by length and sex, for length classes 12,...,18 cm and ALL (all length classes). The sample size of each classification depends on the availability of the routine biological sampling program (see more details in [
18]). The CF series were previously standardized, since the shape parameter
is not affected by a linear transformation of the CF [
23].
Table 2 shows the
’s assuming an SN and MSN distribution based on the MLE method of Azzalini [
36] and Arrué et al. [
15], respectively. For MSN, we considered the log-likelihood function of Equation (
20). In both models, negative and positive values of
correspond to asymmetry to the right and left, respectively (see Contreras-Reyes [
18] (Figure 5)). This means that the CF values of the above-mentioned classes are affected by extreme events. As expected, we generally find that for low values of the empirical skewness index, the shape parameter of both distributions is close to zero.
The values of
obtained from the SN and MSN models are presented in
Table 2. Since the SN model is not regular at
, we used only the MSN model to perform the normality test and the LRT for each data sample. The results of this analysis appear in
Table 3 and are not analogous across all the length classes in the two groups. In fact, for the group of males, the null hypothesis
is not rejected only in length class 15 (95% confidence level) and in class ALL (90% confidence level). In contrast, for the group of females, the null hypothesis is not rejected for length classes 12, 15 and 17 (95% confidence level) and for class ALL (90% confidence level). For both tests, we obtained similar decisions for each time series.
According to Contreras-Reyes [
18], the time series for which the shape parameter is close to zero or the null hypothesis is not rejected are influenced simultaneously by both normal and extreme events, as in length class ALL, where the whole fish population is included in the analysis. For length class 17 in males, for example, the CF is susceptible to atypical events such as the moderate-to-strong El Niño event between 1991 and 1992 (high negative empirical skewness and high empirical kurtosis). For length class 13 in both sexes, the CF is susceptible to the strong El Niño event that occurred between 1997 and 1998.
7. Discussion
We have presented a methodology to compute the Shannon entropy, the negentropy and the Kullback–Leibler and Jeffrey’s divergences for a broad family of asymmetric distributions with a normal kernel, the generalized skew-normal distributions. Our method considers asymptotic expansions in terms of moments and cumulants for two particular cases: the skew-normal and modified skew-normal distributions. We then measured the degree of disparity of these distributions from the normal distribution by using exact expressions for the negentropy in terms of moments and cumulants. Additionally, given the regularity conditions satisfied by the modified skew-normal distribution, normality was tested based on this distribution. This test considered the asymptotic behavior of the Kullback–Leibler divergence, which is determined by the negentropy measuring the disparity from normality.
Numerical results showed that the Shannon entropy and negentropy of the modified skew-normal distribution are better approximated than those of the skew-normal one, at least over a wider range of the shape parameter. For small to moderate values of the asymmetry parameter, where the approximations are appropriate, we find that the series expansions converge from the fourth moment/cumulant onward, as in the Gram–Charlier and Edgeworth expansion methods [
17]. For large values of the skewness parameter, where the expansions are inappropriate, the functions related to the negentropy are not well approximated by Taylor expansions around zero, owing to divergence of the moment and cumulant terms, i.e., of the Taylor expansions for the expected values of the functions
and
(SN and MSN case, respectively) if
is too large. When this happens, the normal cdf,
and
(SN and MSN case, respectively), tends to one, since according to the stochastic representation in (
12), for large values of
, the distribution of
converges to the standardized half-normal distribution [
37].
However, the normality test considered in the application used skewness parameters inside the appropriate range. Furthermore, we plan to investigate the negentropy of the modified skew-normal-Cauchy distribution or similar models. In addition, although the approximations are appropriate over the range of variation of the asymmetry admitted by both models, more work should be done to improve the asymptotic approximations for a greater range of the skewness parameter values. That said, this is not an easy task, since it is generally difficult to approximate KL divergences involving asymmetric and heavy-tailed distributions [
38].
A statistical application to condition factor time series of anchovies off northern Chile was given. The results show that the proposed methodology serves to detect non-normal events in these time series, which produce empirical distributions with a high presence of skewness [
18]. The proposed test for normality is therefore useful to detect anomalies in condition factor time series, linked to food deficit (positive shape parameter) or food abundance (negative shape parameter) influenced by environmental conditions.