Article

Information–Theoretic Aspects of Location Parameter Estimation under Skew–Normal Settings

by
Javier E. Contreras-Reyes
Instituto de Estadística, Facultad de Ciencias, Universidad de Valparaíso, Valparaíso 2360102, Chile
Entropy 2022, 24(3), 399; https://doi.org/10.3390/e24030399
Submission received: 14 February 2022 / Revised: 9 March 2022 / Accepted: 11 March 2022 / Published: 13 March 2022
(This article belongs to the Special Issue Distance in Information and Statistical Physics III)

Abstract
In several applications, the assumption of normality is violated because the data exhibit some level of skewness, which affects the estimation of the mean. The class of skew–normal distributions is considered, given its flexibility for modeling data through an asymmetry parameter. In this paper, two estimation methods for the location parameter ($\mu$) are considered in the skew–normal setting, where the coefficient of variation and the skewness parameter are known: the least square estimator (LSE) and the best unbiased estimator (BUE). The properties of the BUE (which dominates the LSE) are explored using classic theorems of information theory, which provide a way to measure the uncertainty of location parameter estimates. Specifically, inequalities based on the convexity property yield lower and upper bounds for the differential entropy and the Fisher information. Some simulations illustrate the behavior of these bounds.

1. Introduction

A typical problem in statistical inference is estimating the parameters from a data sample [1], especially when the data have some level of skewness; the estimation of these parameters is therefore affected by the asymmetry. Recent research has addressed data asymmetry with the class of skew–normal distributions, given their flexibility for modeling data through a skewness (asymmetry/symmetry) parameter [2]. In particular, Ref. [3] focused on estimating the location parameter ($\mu$), assuming that the coefficient of variation and the skewness parameter are known; specifically, they presented the least square estimator (LSE) and the best unbiased estimator (BUE) of $\mu$. The precision of the location parameter estimation is directly influenced by the skewness [4] and, hence, affects confidence intervals and sample size determination [5,6].
Given that complex parametric distributions with several parameters are often considered [2], information measures (entropies and/or divergences) play an important role in quantifying the uncertainty provided by a random process about itself, and they suffice to study the reproduction of a marginal process through a noiseless system. One main application is related to model selection and detection of the number of clusters [7], or the interpretation of physical phenomena [8,9]. Moreover, entropies and/or divergences are widely used to compare estimations [1]. For example, Ref. [10] considered the Kullback–Leibler (KL) divergence as a method to compare sample correlation matrices, with an application to financial markets, assuming two multivariate normal densities. Using parameters estimated by maximum likelihood, Refs. [11,12,13] considered the KL divergence in an asymptotic test to evaluate the skewness and/or bimodality of the data.
Given that precision was evaluated with confidence intervals in [5], the quantification of the uncertainty of location parameter estimation under skew–normal settings motivated this study. The properties of the LSE and, especially, the BUE are explored using classic theorems and properties of information theory, which enable measuring the uncertainty of location parameter estimates through differential entropy and Fisher information [1]. The Cramér–Rao inequality [14] links the Fisher information with the variance of an unbiased estimator and is used here to find a lower bound for the Fisher information. In addition, considering a stochastic representation [15] of a skew–normal random variable, the convexity property of the Fisher information is used to find an upper bound.
This paper is organized as follows: some properties and inferential aspects based on information theory are presented in Section 2. In Section 3, the computation and description of information–theoretic theorems related to location parameter estimation of skew–normal distribution are presented. In Section 4, some simulations illustrate the usefulness of the results. Final remarks conclude the paper in Section 5.

2. Information-Theoretic Aspects

In this section, some main theorems and properties of information theory are described. Specifically, these properties are based on differential entropy and Fisher information.
Definition 1.
Let X be a random variable with support in $\mathbb{R}$ and continuous probability density function (pdf) $f(x;\theta)$, which depends on a parameter $\theta$. The differential entropy of X [1] is defined by
$$H(X) = -E[\log f(X;\theta)] = -\int_{\mathbb{R}} f(x;\theta)\log f(x;\theta)\,dx,$$
where the notation $E[g(X)] = \int_{\mathbb{R}} g(x)\,f(x;\theta)\,dx$ is used.
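As a quick numerical illustration of Definition 1 (not taken from the paper), the integral can be evaluated with base R's integrate and checked against a known closed form, here the Gaussian entropy that reappears in Theorem 3 below.

```r
# Numerical differential entropy H(X) = -E[log f(X)] for a density f on R.
# Illustrative sketch only.
diff_entropy <- function(f, lower = -Inf, upper = Inf) {
  integrand <- function(x) {
    fx <- f(x)
    ifelse(fx > 0, -fx * log(fx), 0)  # convention 0*log(0) = 0
  }
  integrate(integrand, lower, upper)$value
}

sigma <- 2
diff_entropy(function(x) dnorm(x, sd = sigma))  # numerical value
0.5 * log(2 * pi * exp(1) * sigma^2)            # closed form, approx. 2.112
```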
Differential entropy depends only on the pdf of the random variable. In the following theorem, the scaling property of differential entropy is presented.
Theorem 1.
For any real constant $a\neq 0$, the differential entropy of $aX$ (Theorem 8.6.4 of [1]) is given by
$$H(aX) = H(X) + \log|a|.$$
In particular, for two random variables, the following differential entropy bounds hold.
Theorem 2.
Let X and Y be two independent random variables, and suppose $Z \stackrel{d}{=} X+Y$, where "$\stackrel{d}{=}$" denotes equality in distribution. Then:
(i) 
$$\frac{H(X)+H(Y)+\log 2}{2} \le H(Z) \le H(X)+H(Y).$$
(ii) 
For any constant $\rho$ such that $0\le\rho\le 1$,
$$\rho\,H(X) + (1-\rho)\,H(Y) \le H\!\left(\sqrt{\rho}\,X+\sqrt{1-\rho}\,Y\right).$$
Proof. 
For part (i), consider first the general case of $X_1,X_2,\dots,X_n$ independent and identically distributed (i.i.d.) random variables (see Equations (5) and (6) of [16]); then
$$H(X_1) \le H\!\left(\frac{X_1+X_2}{\sqrt{2}}\right) \le \cdots \le H\!\left(\frac{1}{\sqrt{n-1}}\sum_{i=1}^{n-1}X_i\right) \le H\!\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_i\right).$$
Considering the latter inequality for two variables, X and Y, and the scaling property of Theorem 1, we obtain $2H(X+Y)-\log 2 \ge H(X)+H(Y)$, yielding the left side of the inequality. For the right side, see [17]. The inequality of part (ii) is proved in Theorem 7 of [14].    □
The inequality of Theorem 2(ii) is based on the convexity property and allows obtaining lower bounds for differential entropy. The following result (Theorem 8.6.5 of [1]) provides an upper bound.
Theorem 3.
Let X be a random variable with zero mean and finite variance $\sigma^{2}$; then
$$H(X) \le \frac{1}{2}\log\left(2\pi e\,\sigma^{2}\right),$$
and equality is achieved if, and only if, $X\sim N(0,\sigma^{2})$.
Theorem 3, also known as the maximum entropy principle, implies that the Gaussian distribution maximizes the differential entropy over all distributions with the same variance. This theorem has several implications in information theory, mainly when the differential entropy of an unknown distribution is hard to obtain; in such cases, this upper bound is a good alternative. Another consequence is the relationship between estimation error and differential entropy, which includes the Cramér–Rao bound, as described next. First, the Fisher information for continuous densities needs to be defined.
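For instance, the bound of Theorem 3 can be checked numerically for a skew–normal density (Definition 3 in Section 3): its differential entropy never exceeds that of a normal distribution with the same variance. The sketch below uses base R only; the variance formula $\sigma^2(1-b^2\delta^2)$, with $b=\sqrt{2/\pi}$ and $\delta=\lambda/\sqrt{1+\lambda^2}$, is the standard skew–normal result.

```r
# Check H(X) <= (1/2) log(2*pi*e*Var[X]) for X ~ SN_1(0, 1, lambda) (Theorem 3).
# Sketch only; the skew-normal pdf of Definition 3 is built from base R.
dsn_manual <- function(x, mu = 0, sigma = 1, lambda = 0)
  2 / sigma * dnorm((x - mu) / sigma) * pnorm(lambda * (x - mu) / sigma)

lambda <- 3
H_sn <- integrate(function(x) {
  fx <- dsn_manual(x, lambda = lambda)
  ifelse(fx > 0, -fx * log(fx), 0)
}, -Inf, Inf)$value

b <- sqrt(2 / pi); delta <- lambda / sqrt(1 + lambda^2)
v_sn <- 1 - b^2 * delta^2                  # Var[X] for SN_1(0, 1, lambda)
H_sn <= 0.5 * log(2 * pi * exp(1) * v_sn)  # TRUE: the Gaussian bound dominates
```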
Definition 2.
Let X be a random variable with support in $\mathbb{R}$ and continuous density function $f(x;\theta)$, which depends on a parameter $\theta$, so that $\int_{\mathbb{R}}f(x;\theta)\,dx=1$. The Fisher information of X [1] is defined by
$$J(X) = E\!\left[\left(\frac{\partial}{\partial x}\log f(x;\theta)\right)^{2}\right] = \int_{\mathbb{R}}\left[\frac{\partial}{\partial x}f(x;\theta)\right]^{2}\frac{1}{f(x;\theta)}\,dx. \tag{1}$$
The Fisher information is a measure of the minimum error in estimating a parameter $\theta$ of a distribution. Classical definitions of the Fisher information consider differentiation with respect to $\theta$ to define $J(\theta)$; however, for a location family of the form $f(x-\theta)$, differentiation with respect to x is equivalent to differentiation with respect to $\theta$, as in Equation (1) [1]. The following inequality links the Fisher information and the variance.
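As a hedged numerical illustration of Equation (1) (not from the paper), the location Fisher information of a Gaussian density is $1/\sigma^2$, which a simple quadrature with a finite-difference derivative reproduces.

```r
# Numerical Fisher information J(X) = integral of (f'(x))^2 / f(x) (Equation (1)),
# using a central finite difference for f'(x). Illustrative sketch only.
fisher_info <- function(f, eps = 1e-5) {
  integrand <- function(x) {
    fx  <- f(x)
    dfx <- (f(x + eps) - f(x - eps)) / (2 * eps)
    ifelse(fx > 0, dfx^2 / fx, 0)
  }
  integrate(integrand, -Inf, Inf)$value
}

sigma <- 2
fisher_info(function(x) dnorm(x, sd = sigma))  # close to 1/sigma^2 = 0.25
```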
Theorem 4.
Let $\mathbf{X}=(X_1,X_2,\dots,X_n)$ be a sample of n random variables drawn i.i.d. from $f(x;\theta)$; the mean-squared error of an unbiased estimator $T(\mathbf{X})$ of the parameter $\theta$ is lower bounded by the reciprocal of the Fisher information (Theorem 11.10.1 of [1]):
$$\mathrm{Var}[T(\mathbf{X})] \ge \frac{1}{J(X)},$$
where $J(X)$ is defined in Equation (1); if the inequality is attained, $T(\mathbf{X})$ is said to be efficient.
Theorem 4, also known as the Cramér–Rao inequality, allows determining the best estimator of $\theta$ and obtaining a lower bound for the Fisher information. The Cramér–Rao inequality was first stated for any estimator $T(\mathbf{X})$ (not necessarily unbiased) of $\theta$ in terms of the mean-squared error; in this case,
$$E\!\left[\{T(\mathbf{X})-\theta\}^{2}\right] \ge \frac{\left[1+\frac{\partial}{\partial\theta}\mathrm{Bias}(\theta)\right]^{2}}{J(X)} + \mathrm{Bias}(\theta)^{2}, \tag{2}$$
where $\mathrm{Bias}(\theta) = E[T(\mathbf{X})-\theta]$; see Equation (11.290) of [1]. Clearly, if $T(\mathbf{X})$ is an unbiased estimator of $\theta$, Theorem 4 is a particular case of the latter inequality. Inequality (2) is obtained through the Cauchy–Schwarz inequality applied to the variance of unbiased estimators. The following inequality, also known as the Fisher information inequality, is based on the convexity property and is useful to obtain an upper bound for the Fisher information.
Theorem 5.
For any two independent random variables X and Y, and any constant $\rho$ such that $0\le\rho\le 1$,
$$J\!\left(\sqrt{\rho}\,X+\sqrt{1-\rho}\,Y\right) \le \rho\,J(X) + (1-\rho)\,J(Y).$$
Proof. 
See proof of Theorem 13 in [14].    □

3. Location Parameter Estimation

The skew–normal distribution is an extension of the normal one, allowing for the presence of skewness.
Definition 3.
X is called a skew–normal random variable [15], denoted $X\sim SN_1(\mu,\sigma^{2},\lambda)$, if it has pdf
$$f(x;\theta) = \frac{2}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{x-\mu}{\sigma}\right), \qquad x\in\mathbb{R},\quad \theta=(\mu,\sigma^{2},\lambda),$$
with location $\mu\in\mathbb{R}$, scale $\sigma^{2}\in\mathbb{R}^{+}$, and shape $\lambda\in\mathbb{R}$ parameters. In addition, $\phi(x)$ is the pdf of the standard normal distribution with mean 0 and variance 1, denoted $N(0,1)$, and $\Phi(x)$ is the corresponding cumulative distribution function (cdf).
Random variable X admits the following stochastic representation:
$$X \stackrel{d}{=} \mu + \sigma\left(\delta\,|U_0| + \sqrt{1-\delta^{2}}\,U\right), \tag{3}$$
where $\delta = \dfrac{\lambda}{\sqrt{1+\lambda^{2}}}$, and $U_0$ and $U\sim N(0,1)$ are independently distributed; see Equation (2.14) of [15].
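The representation in Equation (3) gives a direct way to simulate skew–normal variates with base R alone (the simulations of Section 4 use the sn package's rsn function for the same task). The sketch below also checks the simulated mean against $\mu+\sigma b\delta$, $b=\sqrt{2/\pi}$, the standard skew–normal mean.

```r
# Draw skew-normal variates via the stochastic representation (3):
# X = mu + sigma*(delta*|U0| + sqrt(1 - delta^2)*U), with U0, U ~ N(0,1) independent.
rsn_rep <- function(m, mu = 0, sigma = 1, lambda = 0) {
  delta <- lambda / sqrt(1 + lambda^2)
  u0 <- rnorm(m); u <- rnorm(m)
  mu + sigma * (delta * abs(u0) + sqrt(1 - delta^2) * u)
}

set.seed(1)
x <- rsn_rep(1e5, mu = 1, sigma = 2, lambda = 5)
mean(x)                                   # Monte Carlo mean
1 + 2 * sqrt(2 / pi) * 5 / sqrt(1 + 5^2)  # theoretical mean: mu + sigma*b*delta
```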
Additionally, X admits a representation based on a link between differential entropy and Fisher information, due to de Bruijn's identity. By matching the stochastic representation (3) with Equation (20) of [16], it is possible to set $Y = \mu+\sigma\delta|U_0|$ with fixed $\delta$. Then,
$$H(Y) = \frac{1}{2}\log(2\pi e) + \int\frac{\lambda}{1+\lambda^{2}}\,J(X)^{-1}\,d\lambda,$$
where an approximation for the Fisher information $J(X)$ appears in the proof of Proposition 5 below (with n = 1 observation).
Definition 4.
$\mathbf{X}$ is called a multivariate skew–normal random vector [18], denoted $\mathbf{X}\sim SN_n(\boldsymbol{\mu},\Sigma,\boldsymbol{\lambda})$, if it has pdf
$$f_n(\mathbf{x};\theta) = 2\,\phi_n(\mathbf{x};\boldsymbol{\mu},\Sigma)\,\Phi\!\left[\boldsymbol{\lambda}^{\top}\Sigma^{-1/2}(\mathbf{x}-\boldsymbol{\mu})\right], \qquad \mathbf{x}\in\mathbb{R}^{n},\quad \theta=(\boldsymbol{\mu},\Sigma,\boldsymbol{\lambda}),$$
with location vector $\boldsymbol{\mu}\in\mathbb{R}^{n}$, scale matrix $\Sigma\in\mathbb{R}^{n\times n}$, and skewness vector $\boldsymbol{\lambda}\in\mathbb{R}^{n}$ parameters. In addition, $\phi_n(\mathbf{x};\boldsymbol{\mu},\Sigma)$ is the n-dimensional normal pdf with location parameter $\boldsymbol{\mu}$ and scale matrix $\Sigma$.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\boldsymbol{\mu},\Sigma,\boldsymbol{\lambda})$, with $\boldsymbol{\mu}=\mathbf{1}_n\mu$, $\Sigma=\sigma^{2}I_n$, and $\boldsymbol{\lambda}=\mathbf{1}_n\lambda$, where $\mathbf{1}_n=(1,\dots,1)^{\top}\in\mathbb{R}^{n}$ and $I_n$ denotes the $n\times n$ identity matrix. Following [3] and Corollary 2.2 of [5], the following properties hold.
Property 1.
$X_i \sim SN_1(\mu,\sigma^{2},\lambda^{*})$, $i=1,\dots,n$, with $\lambda^{*} = \dfrac{\lambda}{\sqrt{1+(n-1)\lambda^{2}}}$.
Property 1 indicates that $X_1,\dots,X_n$ is a random sample of identically distributed, but not independent, random variables from a univariate skew–normal population with location $\mu$, scale $\sigma^{2}$, and shape $\lambda^{*}$ parameters.
Property 2.
$E[X_i] = \mu + \sigma b\delta^{*}$ and $\mathrm{Var}[X_i] = \sigma^{2}\left(1 - b^{2}\delta^{*2}\right)$, with $b=\sqrt{2/\pi}$ and $\delta^{*} = \dfrac{\lambda}{\sqrt{1+n\lambda^{2}}}$.
Property 3.
$\bar{X} = \dfrac{1}{n}\displaystyle\sum_{i=1}^{n}X_i \sim SN_1\!\left(\mu,\ \dfrac{\sigma^{2}}{n},\ \sqrt{n}\,\lambda\right)$.
Property 4.
$\dfrac{(n-1)}{\sigma^{2}}\,S^{2} \sim \chi^{2}_{n-1}$, with $S^{2} = \dfrac{1}{n-1}\displaystyle\sum_{i=1}^{n}(X_i-\bar{X})^{2}$, where $\chi^{2}_{n-1}$ denotes the chi-square distribution with $n-1$ degrees of freedom; moreover, the sample mean $\bar{X}$ and the sample variance $S^{2}$ are independent.
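Properties 3 and 4 can be checked by simulation. The sketch below assumes the rmsn function of the sn package for drawing from $SN_n$ (only rsn is mentioned in the paper, so the rmsn interface is an assumption here).

```r
# Monte Carlo check of Properties 3 and 4 for X ~ SN_n(mu*1_n, sigma^2*I_n, lambda*1_n).
# Assumes the sn package and its rmsn() interface; illustrative sketch only.
library(sn)

n <- 5; mu <- 1; sigma <- 2; lambda <- 3; B <- 2e4
X <- rmsn(B, xi = rep(mu, n), Omega = sigma^2 * diag(n), alpha = rep(lambda, n))

xbar <- rowMeans(X)
s2   <- apply(X, 1, var)
b <- sqrt(2 / pi); delta_star <- lambda / sqrt(1 + n * lambda^2)

c(mean(xbar), mu + sigma * b * delta_star)  # Property 3: E[Xbar] = mu + sigma*b*delta*
c(mean((n - 1) * s2 / sigma^2), n - 1)      # Property 4: mean of a chi-square_{n-1}
```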

3.1. Least Square Estimator

Assuming that the coefficient of variation $\tau=|\sigma/\mu|$ and the shape parameter $\lambda$ are known, Theorem 4.1 of [3] provides the least square estimator of $\mu$ and its variance:
$$T_{LSE}(\mathbf{X}) = \omega\,\bar{X}, \tag{4}$$
$$\mathrm{Var}[T_{LSE}(\mathbf{X})] = \omega^{2}\,(1-n\delta_n^{2})\,\frac{\sigma^{2}}{n}, \qquad \omega = \frac{n\,(1+\delta_n\tau)}{n+\tau(\tau+2n\delta_n)}, \tag{5}$$
where $\delta_n = b\,\delta^{*}$ and $\delta^{*}$ is defined in Property 2. The least square estimator of $\mu$ is obtained by minimizing the MSE of $c\bar{X}$ with respect to the constant c. The MSE of $T_{LSE}(\mathbf{X})$ is
$$MSE[T_{LSE}(\mathbf{X})] = \left(\frac{\sigma^{2}}{n}+\mu^{2}+2\mu\delta_n\sigma\right)\omega^{2} - 2\mu\,\omega\,(\mu+\sigma\delta_n) + \mu^{2}. \tag{6}$$
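As a cross-check of the expressions above (written here as reconstructed, so treat the closed form of $\omega$ as such), $\omega$ can also be obtained by numerically minimizing the MSE in Equation (6) over the constant multiplying $\bar{X}$; both routes should coincide.

```r
# Least square estimator T_LSE = omega * Xbar: compare the closed form of omega
# with a direct numerical minimization of the MSE in Equation (6). Sketch only,
# under the expressions as reconstructed above.
mse_lse <- function(w, mu, sigma, n, delta_n) {
  (sigma^2 / n + mu^2 + 2 * mu * delta_n * sigma) * w^2 -
    2 * mu * w * (mu + sigma * delta_n) + mu^2
}

mu <- 1; tau <- 0.5; sigma <- tau * abs(mu); n <- 10; lambda <- 2
b <- sqrt(2 / pi); delta_n <- b * lambda / sqrt(1 + n * lambda^2)

omega_closed  <- n * (1 + delta_n * tau) / (n + tau * (tau + 2 * n * delta_n))
omega_numeric <- optimize(mse_lse, c(0, 2), mu = mu, sigma = sigma,
                          n = n, delta_n = delta_n)$minimum
c(omega_closed, omega_numeric)  # should agree up to optimize() tolerance
```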
Proposition 1.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\mu\mathbf{1}_n,\sigma^{2}I_n,\mathbf{1}_n\lambda)$, with known $\tau$ and $\lambda$. Thus,
(i) 
$$H(T_{LSE}(\mathbf{X})) = \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n}\,\omega^{2}\right) - H_N(\eta), \quad H_N(\eta)=E[\log\{2\Phi(\eta W)\}], \quad W\sim SN_1(0,1,\eta), \quad \eta=\sigma\lambda.$$
(ii) 
$$H(T_{LSE}(\mathbf{X})) \le \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n}\,\omega^{2}\,(1-n\delta_n^{2})\right).$$
Proof. 
Part (i) follows straightforwardly from Theorem 1, Property 3, Equation (4), and Proposition 2.1 of [19] (for the univariate case). Part (ii) is straightforward from Theorem 3 and Equation (5).    □
The differential entropy of $T_{LSE}(\mathbf{X})$ is the difference between the normal differential entropy and a term called negentropy, $H_N(\eta)$, which depends on the $\sigma$ and $\lambda$ parameters. Additionally, note that part (ii) yields a lower bound for the negentropy of part (i), $H_N(\eta) \ge -\frac{1}{2}\log(1-n\delta_n^{2})$.
As a particular case of Proposition 1, the differential entropy of the sample mean $\bar{X}$ is obtained by choosing $\omega=1$:
$$H(\bar{X}) = \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n}\right) - E[\log\{2\Phi(\eta W)\}];$$
its respective upper bound is
$$H(\bar{X}) \le \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n}\,(1-n\delta_n^{2})\right);$$
and, from Equation (6), its respective MSE is
$$MSE[\bar{X}] = \frac{\sigma^{2}}{n}.$$

3.2. Best Unbiased Estimator

Assuming that the coefficient of variation $\tau=|\sigma/\mu|$ and the shape parameter $\lambda$ are known, Theorem 5.1 of [3] provides the best unbiased estimator (BUE) of $\mu$, given by
$$T_{BUE}(\mathbf{X}) = (1-\alpha)\,d_1(\mathbf{X}) + \alpha\,d_2(\mathbf{X}), \qquad d_1(\mathbf{X}) = \frac{\bar{X}}{1+\delta_n\tau}, \qquad d_2(\mathbf{X}) = c_n\sqrt{n-1}\,S, \tag{9}$$
$$c_n = \frac{1}{\sqrt{2\tau^{2}}}\,\frac{\Gamma\!\left(\frac{n-1}{2}\right)}{\Gamma\!\left(\frac{n}{2}\right)}, \tag{10}$$
$$\alpha = \frac{1}{(1+\delta_n\tau)\left[(n-1)\,c_n\right]^{2}}, \tag{11}$$
where $\Gamma(x)$ denotes the usual gamma function and S is defined in Property 4.
Remark 1.
Equation (10) can be approximated using an asymptotic expression for the gamma function, $\Gamma(x+a)\approx\sqrt{2\pi}\,x^{x+a-1/2}e^{-x}$ for $a<\infty$ as $|x|\to\infty$ [19]. Then,
$$c_n \approx \frac{1}{\sqrt{n\tau^{2}}} \tag{12}$$
as $n\to\infty$. Since the exact form (10) can be numerically undefined for large samples ($n>200$), approximation (12) is very useful in these cases. Note that, from (11) and (12), $\delta_n,\alpha\to 0$ as $n\to\infty$, which implies that the estimator $T_{BUE}(\mathbf{X})$ is only influenced by $d_1(\mathbf{X})$ for large samples.
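A short numerical companion to Remark 1 (with $c_n$ written as reconstructed in Equation (10)): for large n the direct ratio of gamma functions overflows in floating point, whereas the approximation (12), or an lgamma-based evaluation, remains stable.

```r
# Exact c_n of Equation (10) (as reconstructed) versus the approximation (12).
# For large n, gamma() overflows and the direct ratio becomes NaN; lgamma() or
# the approximation avoids this. Illustrative sketch only.
c_n_exact  <- function(n, tau) gamma((n - 1) / 2) / (sqrt(2 * tau^2) * gamma(n / 2))
c_n_lgamma <- function(n, tau) exp(lgamma((n - 1) / 2) - lgamma(n / 2)) / sqrt(2 * tau^2)
c_n_approx <- function(n, tau) 1 / sqrt(n * tau^2)

tau <- 0.5
sapply(c(10, 50, 200, 1000), function(n)
  c(exact = c_n_exact(n, tau), lgamma = c_n_lgamma(n, tau), approx = c_n_approx(n, tau)))
```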
From Properties 3 and 4, Ref. [3] also proved that
$$d_1(\mathbf{X}) \sim SN_1\!\left(\frac{\mu}{1+\delta_n\tau},\ \frac{\sigma^{2}}{n\,(1+\delta_n\tau)^{2}},\ \sqrt{n}\,\lambda\right), \tag{13}$$
$$\mathrm{Var}[d_1(\mathbf{X})] = \frac{(\mu\tau)^{2}\,(1-n\delta_n^{2})}{n\,(1+\delta_n\tau)^{2}}, \tag{14}$$
$$\mathrm{Var}[d_2(\mathbf{X})] = 2(\mu\tau)^{2}\left[1-\frac{1}{2}(n-1)(\tau c_n)^{2}\right]. \tag{15}$$
Given that $\mathrm{Cov}(d_1,d_2)=0$, from Equations (9), (14) and (15) we obtain
$$\mathrm{Var}[T_{BUE}(\mathbf{X})] = (1-\alpha)^{2}\,\mathrm{Var}[d_1(\mathbf{X})] + \alpha^{2}\,\mathrm{Var}[d_2(\mathbf{X})]. \tag{16}$$
The following proposition provides two upper bounds of differential entropy for T B U E ( X ) based on Theorem 3.
Proposition 2.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\mu\mathbf{1}_n,\sigma^{2}I_n,\mathbf{1}_n\lambda)$, with known $\tau$ and $\lambda$. Thus,
(i) 
$$H(T_{BUE}(\mathbf{X})) \le \frac{1}{2}\log\left\{\frac{8\,(\pi e\sigma^{2})^{2}\,\alpha^{2}(1-\alpha)^{2}}{n\,(1+\delta_n\tau)^{2}}\left[1-\frac{1}{2}(n-1)(\tau c_n)^{2}\right]\right\},$$
(ii) 
$$H(T_{BUE}(\mathbf{X})) \le \frac{1}{2}\log\left\{2\pi e\,\mathrm{Var}[T_{BUE}(\mathbf{X})]\right\}.$$
Proof. 
From Theorem 3 and Equations (14) and (15), the differential entropies of $d_1(\mathbf{X})$ and $d_2(\mathbf{X})$ are, respectively, upper bounded by
$$H(d_1(\mathbf{X})) \le \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n\,(1+\delta_n\tau)^{2}}\right), \tag{17}$$
$$H(d_2(\mathbf{X})) \le \frac{1}{2}\log\!\left(2\pi e\cdot 2\sigma^{2}\left[1-\frac{1}{2}(n-1)(\tau c_n)^{2}\right]\right). \tag{18}$$
Considering the right side of the inequality of Theorem 2(i), with $Z=X+Y$, $X=(1-\alpha)d_1(\mathbf{X})$ and $Y=\alpha d_2(\mathbf{X})$ (thus $\mathrm{Cov}(X,Y)=0$), we obtain
$$H(T_{BUE}(\mathbf{X})) \le H\big((1-\alpha)d_1(\mathbf{X})\big) + H\big(\alpha d_2(\mathbf{X})\big) = H(d_1(\mathbf{X})) + H(d_2(\mathbf{X})) + \log|(1-\alpha)\alpha|,$$
where Theorem 1 is applied in the last equality. Then, Equations (17) and (18) yield part (i). On the other hand, applying Theorem 3 directly to $T_{BUE}(\mathbf{X})$, Equation (16) implies part (ii).    □
The following proposition provides two lower bounds of differential entropy for T B U E ( X ) .
Proposition 3.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\mu\mathbf{1}_n,\sigma^{2}I_n,\mathbf{1}_n\lambda)$, with known $\tau$ ($0<\tau<1$) and $\lambda$. Thus,
(i) 
$$H(d_1(\mathbf{X})) + H(d_2(\mathbf{X})) + \log\!\left(2\,\alpha^{2}(1-\alpha)^{2}\right) \le 2\,H(T_{BUE}(\mathbf{X})),$$
(ii) 
$$(1-\alpha^{2})\,H(d_1(\mathbf{X})) + \alpha^{2}\,H(d_2(\mathbf{X})) \le H(T_{BUE}(\mathbf{X}));$$
where
$$H(d_1(\mathbf{X})) = \frac{1}{2}\log\!\left(2\pi e\,\frac{\sigma^{2}}{n\,(1+\delta_n\tau)^{2}}\right) - H_N(\eta_1), \quad H_N(\eta_1)=E[\log\{2\Phi(\eta_1 W_1)\}], \quad W_1\sim SN_1(0,1,\eta_1), \quad \eta_1=\left|\frac{\lambda\sigma}{1+\delta_n\tau}\right|;$$
and
$$H(d_2(\mathbf{X})) = \log\left\{\frac{|\alpha|\,\Gamma\!\left(\frac{n-1}{2}\right)^{n}}{2\left[\tau\,\Gamma\!\left(\frac{n}{2}\right)\right]^{n-1}}\right\} - \frac{n-2}{2}\left[\psi\!\left(\frac{n-1}{2}\right)+\log\!\left(2c_n^{2}\right)\right] + \frac{n-1}{2}.$$
Proof. 
The differential entropy of $d_1(\mathbf{X})$ follows directly from evaluating (13) in Proposition 2.1 of [19] (for the univariate case). Given that the distribution of $d_2(\mathbf{X})$ is not of a standard form, Ref. [3] provided its pdf,
$$f_{d_2}(x;\mu,\sigma) = \frac{2\left[\tau\,\Gamma\!\left(\frac{n}{2}\right)\right]^{n-1}}{\Gamma\!\left(\frac{n-1}{2}\right)^{n}}\;x^{\,n-2}\,e^{-\frac{x^{2}}{2c_n^{2}}}, \qquad x>0. \tag{19}$$
Through Equations (19) and (3.381.4) of [20], the moments of $d_2$ are given by
$$E_{d_2}[X^{m}] = \frac{\Gamma\!\left(\frac{m+n-1}{2}\right)}{\Gamma\!\left(\frac{n-1}{2}\right)}\left(2c_n^{2}\right)^{m/2}, \qquad m=0,1,\dots; \tag{20}$$
and, using Equation (4.352.1) of [20], the moment of $\log X$ is
$$E_{d_2}[\log X] = \int_{0}^{\infty} f_{d_2}(x;\mu,\sigma)\log x\,dx = \frac{2\left[\tau\,\Gamma\!\left(\frac{n}{2}\right)\right]^{n-1}}{\Gamma\!\left(\frac{n-1}{2}\right)^{n}}\int_{0}^{\infty} x^{\,n-2}\,e^{-\frac{x^{2}}{2c_n^{2}}}\log x\,dx = \frac{1}{2}\left[\psi\!\left(\frac{n-1}{2}\right)+\log\!\left(2c_n^{2}\right)\right], \tag{21}$$
where $\psi(x)=\frac{d}{dx}\log\Gamma(x)$ is the digamma function. Therefore, by Definition 1, the differential entropy of $d_2(\mathbf{X})$ is computed as
$$H(d_2(\mathbf{X})) = -\int_{0}^{\infty} f_{d_2}(x;\mu,\sigma)\log f_{d_2}(x;\mu,\sigma)\,dx = \log\left\{\frac{|\alpha|\,\Gamma\!\left(\frac{n-1}{2}\right)^{n}}{2\left[\tau\,\Gamma\!\left(\frac{n}{2}\right)\right]^{n-1}}\right\} - (n-2)\underbrace{\int_{0}^{\infty} f_{d_2}(x;\mu,\sigma)\log x\,dx}_{E_{d_2}[\log X]} + \frac{1}{2c_n^{2}}\underbrace{\int_{0}^{\infty} x^{2}\,f_{d_2}(x;\mu,\sigma)\,dx}_{E_{d_2}[X^{2}]}.$$
Thus, Equations (20) and (21) are evaluated in the latter expression to obtain $H(d_2(\mathbf{X}))$. Taking $Z=X+Y$ in Theorem 2(i), with $X=(1-\alpha)d_1(\mathbf{X})$ and $Y=\alpha d_2(\mathbf{X})$ (thus $\mathrm{Cov}(X,Y)=0$), the inequality of part (i) is obtained.
Taking $Z=X+Y$ in Theorem 2(ii), with $X=d_2(\mathbf{X})$, $Y=d_1(\mathbf{X})$ (thus $\mathrm{Cov}(X,Y)=0$) and $\rho=\alpha^{2}$, and since $d_1(\mathbf{X})$ and $d_2(\mathbf{X})$ are two unbiased estimators of $\mu$ [3], the inequality of part (ii) is obtained.    □
The following proposition provides a lower bound for Fisher information of parameter μ based on T B U E ( X ) .
Proposition 4.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\mu\mathbf{1}_n,\sigma^{2}I_n,\mathbf{1}_n\lambda)$, with known $\tau$ and $\lambda$. Thus,
$$J(\mu) \ge \left\{\frac{(1-\alpha)^{2}\sigma^{2}(1-n\delta_n^{2})}{n\,(1+\delta_n\tau)^{2}} + 2\,\alpha^{2}\sigma^{2}\left[1-\frac{1}{2}(n-1)(\tau c_n)^{2}\right]\right\}^{-1}.$$
Proof. 
Considering that $T_{BUE}(\mathbf{X})$ is an unbiased estimator of $\mu$, the Cramér–Rao inequality of Theorem 4 and Equations (14)–(16) give
$$J(\mu) \ge \frac{1}{(1-\alpha)^{2}\,\mathrm{Var}[d_1(\mathbf{X})] + \alpha^{2}\,\mathrm{Var}[d_2(\mathbf{X})]},$$
yielding the result.    □
The following Proposition provides an upper bound of Fisher information for parameter μ based on T B U E ( X ) and the convexity property.
Proposition 5.
Let $\mathbf{X}=(X_1,\dots,X_n)^{\top}\sim SN_n(\mu\mathbf{1}_n,\sigma^{2}I_n,\mathbf{1}_n\lambda)$, with known $\tau$ ($0<\tau<1$) and $\lambda$. Thus,
$$J(\mu) \le (1-\alpha^{2})\,J(d_1(\mathbf{X})) + \alpha^{2}\,J(d_2(\mathbf{X})),$$
where
$$J(d_1(\mathbf{X})) \approx 1 + \frac{n\,(b\lambda)^{2}}{\sqrt{1+2nb^{4}\lambda^{2}}}, \qquad J(d_2(\mathbf{X})) = \frac{2n-5}{(n-3)\,c_n^{2}}, \quad n>3.$$
Proof. 
Taking $Z=X+Y$ in Theorem 5, with $X=d_2(\mathbf{X})$, $Y=d_1(\mathbf{X})$ (thus $\mathrm{Cov}(X,Y)=0$) and $\rho=\alpha^{2}$, and since $d_1(\mathbf{X})$ and $d_2(\mathbf{X})$ are two unbiased estimators of $\mu$ [3], we obtain $J(\mu)\le\alpha^{2}J(d_2(\mathbf{X}))+(1-\alpha^{2})J(d_1(\mathbf{X}))$. Note that the condition $0<\tau<1$ ensures that $0\le\alpha^{2}\le 1$.
For $J(d_1(\mathbf{X}))$, the steps of Section 3.2 of [9] are followed. By Equations (1) and (13), and the change of variable $z=(x-\mu^{*})/\sigma^{*}$, with $\mu^{*}=\mu/(1+\delta_n\tau)$, $\sigma^{*}=\sigma/[\sqrt{n}(1+\delta_n\tau)]$ and $\lambda^{*}=\sqrt{n}\,\lambda$, $J(d_1(\mathbf{X}))$ can be computed as
$$J(d_1(\mathbf{X})) = \int_{-\infty}^{\infty}\left[\frac{\partial}{\partial x}f(x;\theta)\right]^{2}\frac{1}{f(x;\theta)}\,dx = \int_{-\infty}^{\infty} f(z;\lambda^{*})\left[z-\lambda^{*}\zeta(\lambda^{*}z)\right]^{2}dz = \int_{-\infty}^{\infty} z^{2}f(z;\lambda^{*})\,dz - 2\lambda^{*}\!\int_{-\infty}^{\infty} z\,\zeta(\lambda^{*}z)\,f(z;\lambda^{*})\,dz + [\lambda^{*}]^{2}\!\int_{-\infty}^{\infty}\zeta(\lambda^{*}z)^{2}f(z;\lambda^{*})\,dz = \int_{-\infty}^{\infty} z^{2}f(z;\lambda^{*})\,dz - 4\lambda^{*}\!\int_{-\infty}^{\infty} z\,\phi(\lambda^{*}z)\,\phi(z)\,dz + 2[\lambda^{*}]^{2}\!\int_{-\infty}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{\Phi(\lambda^{*}z)}\,dz, \tag{22}$$
where $\zeta(x)=\phi(x)/\Phi(x)$ is the zeta function. In Equation (22), the first and second terms are the second moment of a standardized skew–normal random variable ($E[Z^{2}]=1$) and (proportional to) the first moment of a standardized normal random variable ($E[R]=0$, $R\sim N(0,1)$), respectively. The third term is, using the symmetry $z\to -z$ for the integral over $(-\infty,0)$,
$$\int_{-\infty}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{\Phi(\lambda^{*}z)}\,dz = \int_{0}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{\Phi(\lambda^{*}z)}\,dz + \int_{0}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{1-\Phi(\lambda^{*}z)}\,dz = \int_{0}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{\Phi(\lambda^{*}z)\,[1-\Phi(\lambda^{*}z)]}\,dz.$$
The following approximation for normal densities (see p. 83 of [15]),
$$\frac{\phi(y)^{2}}{\Phi(y)\,[1-\Phi(y)]} \approx 2\pi b^{2}\,\phi(b^{2}y)^{2}, \qquad y\in\mathbb{R},$$
together with some basic algebraic operations on normal densities, is useful to approximate the third term of Equation (22) as
$$\int_{0}^{\infty}\frac{\phi(z)\,\phi(\lambda^{*}z)^{2}}{\Phi(\lambda^{*}z)\,[1-\Phi(\lambda^{*}z)]}\,dz \approx 2\pi b^{2}\int_{0}^{\infty}\phi(z)\,\phi(b^{2}\lambda^{*}z)^{2}\,dz = \frac{2\pi b^{2}}{2\pi\sqrt{1+2b^{4}[\lambda^{*}]^{2}}}\int_{0}^{\infty}\phi\!\left(z;0,\{1+2b^{4}[\lambda^{*}]^{2}\}^{-1}\right)dz = \frac{b^{2}}{2\sqrt{1+2b^{4}[\lambda^{*}]^{2}}}.$$
Given that $\lambda^{*}=\sqrt{n}\,\lambda$, we obtain
$$J(d_1(\mathbf{X})) \approx 1 + \frac{n\,(b\lambda)^{2}}{\sqrt{1+2nb^{4}\lambda^{2}}}.$$
Using Equation (19), $J(d_2(\mathbf{X}))$ is computed as
$$J(d_2(\mathbf{X})) = \int_{0}^{\infty}\left[\frac{\partial}{\partial x}f_{d_2}(x;\mu,\sigma)\right]^{2}\frac{1}{f_{d_2}(x;\mu,\sigma)}\,dx = \int_{0}^{\infty}f_{d_2}(x;\mu,\sigma)\left[\frac{n-2}{x}-\frac{x}{c_n^{2}}\right]^{2}dx = (n-2)^{2}\underbrace{\int_{0}^{\infty}x^{-2}f_{d_2}(x;\mu,\sigma)\,dx}_{M_1} - \frac{2(n-2)}{c_n^{2}} + \frac{1}{c_n^{4}}\underbrace{\int_{0}^{\infty}x^{2}f_{d_2}(x;\mu,\sigma)\,dx}_{M_2} = \frac{(n-2)^{2}}{(n-3)\,c_n^{2}} - \frac{2(n-2)}{c_n^{2}} + \frac{n-1}{c_n^{2}} = \frac{2n-5}{(n-3)\,c_n^{2}},$$
where Equation (3.381.4) of [20] is applied to solve the integrals $M_1$ and $M_2$ (which requires $n>3$).    □
Remark 2.
Considering the same argument as in Remark 1, it can be noted that, for large samples, the inequalities of Propositions 4 and 5 are only affected by $d_1(\mathbf{X})$; that is, we obtain
$$\frac{1}{\mathrm{Var}[d_1(\mathbf{X})]} \le J(\mu) \le J(d_1(\mathbf{X})).$$
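Before turning to the simulations, the closed-form approximation for $J(d_1(\mathbf{X}))$ in Proposition 5 can be compared with a direct quadrature of the standardized skew–normal Fisher information; the sketch below (base R only, not from the paper) performs that comparison.

```r
# Compare J(d_1) ~ 1 + n*(b*lambda)^2 / sqrt(1 + 2*n*b^4*lambda^2) (Proposition 5)
# with a numerical evaluation of the Fisher information of the standardized
# skew-normal density f(z) = 2*phi(z)*Phi(lambda_star*z), lambda_star = sqrt(n)*lambda.
J_sn_numeric <- function(lstar) {
  f  <- function(z) 2 * dnorm(z) * pnorm(lstar * z)
  df <- function(z) 2 * (-z * dnorm(z) * pnorm(lstar * z) +
                           lstar * dnorm(z) * dnorm(lstar * z))
  integrate(function(z) { fz <- f(z); ifelse(fz > 0, df(z)^2 / fz, 0) },
            -Inf, Inf)$value
}

b <- sqrt(2 / pi)
J_approx <- function(n, lambda) 1 + n * (b * lambda)^2 / sqrt(1 + 2 * n * b^4 * lambda^2)

n <- 10; lambda <- 1
c(numeric = J_sn_numeric(sqrt(n) * lambda), approx = J_approx(n, lambda))
```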

4. Simulations

All location parameter estimators, variances, Fisher information values and differential entropies were calculated with the R software [21]. Skew–normal samples were drawn using the stochastic representation (3) and the rsn function of the sn package. All R code used in this paper is available upon request from the corresponding author.
In general, $\tau$ takes a value between 0 and 1. If $\tau$ is close to 0, the sample has low variability; if it is close to 1, the sample has high variability and the mean loses reliability (for example, if $\tau>0.3$, the mean is less representative of the sample). Sometimes, when $\mu$ is close to zero, $\tau$ takes high values (high variability) and could exceed unity. Therefore, for illustrative purposes, all simulations consider a set of coefficients of variation $\tau=0.1,\dots,1$; positive asymmetry parameters $\lambda=0.1,\dots,5$; sample sizes $n=10$ and 250; and theoretical location parameters $\mu=0.1$, 0.5 and 1. For the computation of the information measures, $\sigma$ is replaced by $\tau|\mu|$ and the location parameter $\mu$ is evaluated at its respective estimators.
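A minimal sketch of the simulation grid just described (the paper's full R code is available from the author on request); the step sizes of the $\tau$ and $\lambda$ sequences are assumptions for illustration.

```r
# Parameter grid for the simulations, with sigma replaced by tau*|mu|.
# Illustrative sketch of the setup only; step sizes are assumed.
grid <- expand.grid(tau    = seq(0.1, 1, by = 0.1),
                    lambda = seq(0.1, 5, by = 0.1),
                    n      = c(10, 250),
                    mu     = c(0.1, 0.5, 1))
grid$sigma <- grid$tau * abs(grid$mu)
head(grid)
```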
The MSE of $T_{LSE}(\mathbf{X})$ is given in Equation (6), and the MSE of $T_{BUE}(\mathbf{X})$ is the variance of this (unbiased) estimator (see Equation (16)). Without loss of generality, $\tau=1$ is considered in Figure 1 because the same pattern is repeated for values of $\tau$ between 0 and 1. Comparing the MSEs of both estimators, Figure 1 shows that, in all cases, the differences between MSEs tend to increase for large values of $\mu$, and the MSEs stabilize around a specific value when the sample size increases. Moreover, the MSEs of the unbiased estimator are smaller than those obtained by the LSE, i.e., the BUE dominates the LSE (Equation (11.263) of [1]). Therefore, the analysis focuses on the BUE in what follows.
The behavior of the differential entropy bounds given in Propositions 2 and 3 is illustrated in Figure 2 as 3D plots. Without loss of generality, $\tau\in(0,1]$ is considered in Figure 2 because the same pattern is repeated for values of $\tau>1$, i.e., the entropies keep increasing. The upper bound corresponds to the minimum of the bounds given in Proposition 2(i) and (ii), which is the one given in (ii); thus, the upper bound of $H(T_{BUE}(\mathbf{X}))$ is determined by the variance (or MSE) of the estimator. In contrast, the lower bound corresponds to the maximum of the bounds given in Proposition 3(i) and (ii) [17].
Large sample sizes ($n=250$) imply that $\alpha\approx 0$ and the lower bounds only depend on $H(d_1(\mathbf{X}))$. For small sample sizes ($n=10$), $\alpha$ can take an intermediate value in the interval (0,1); thus, the lower bounds depend on both $H(d_1(\mathbf{X}))$ and $H(d_2(\mathbf{X}))$. For $n=10$, the surfaces are rough, given the randomness of the bounds produced by the small sample. When $\lambda\to 0$ (symmetry condition), the bounds decay to negative values. This is analogous to considering the skew–normal density as a non-stationary process [15]: when $\lambda$ is near zero, the Hurst exponent decreases abruptly [8]. On the other hand, for $n=250$, the surfaces are smooth and the bounds increase slightly for large $\lambda$. In all cases, the information increases when $\tau$ tends to 1 because this produces more variability in the samples.
For practical purposes, the average of the two bounds can be taken as an approximation of the differential entropy [7], in a similar way to the average length of a confidence interval [3]. Given that all lower bounds of the differential entropy depend on the entropy of $d_1(\mathbf{X})$, which in turn depends on the variance and the sample size, they can take negative values and tend to zero when $\tau$ tends to 1. Therefore, the difference between the lower and upper bounds could increase, yielding an inadequate approximation when the lower bound is negative. For this reason, attention turns next to the Fisher information, which only takes positive values.
The Fisher information bounds given in Propositions 4 and 5 are illustrated in Figure 3 as 3D and 2D plots, respectively. As in the differential entropy case, and without loss of generality, $\tau\in(0,1]$ is considered in Figure 3 because the same pattern is repeated for values of $\tau>1$, i.e., the bounds keep decreasing. Following the Cramér–Rao theorem, the lower bound corresponds to the reciprocal of the variance of the BUE. In contrast, the upper bound corresponds to a combination of the Fisher information of $d_1(\mathbf{X})$ and $d_2(\mathbf{X})$.
As in the differential entropy case, large sample sizes ($n=250$) imply that $\alpha\approx 0$ and the bounds only depend on $d_1(\mathbf{X})$, as mentioned in Remark 2. For small sample sizes ($n=10$), $\alpha$ can take an intermediate value in the interval (0,1); thus, the bounds depend on both $d_1(\mathbf{X})$ and $d_2(\mathbf{X})$. When $\tau\to 0$ (low variability condition), the lower bounds take the highest values. This reciprocal relationship is determined by the Cramér–Rao theorem: more variability, less Fisher information. In addition, the 2D plot shows that the smallest upper bounds of $J(T_{BUE}(\mathbf{X}))$ are produced when $\lambda\to 0$ [9]. Given that the upper bounds do not depend on $\tau$ and $\mu$ (because the skew–normal densities are standardized), these measures are illustrated with respect to n and $\lambda$. When $\lambda$ and n increase simultaneously, the upper bounds of the Fisher information take the largest values.

5. Concluding Remarks

In this paper, some properties of the best unbiased estimator proposed by [3] were presented using classic theorems of information theory, which provide a way to measure the uncertainty of location parameter estimates. Given that the BUE dominates the LSE, this paper focused on the former estimator. Inequalities based on differential entropy and Fisher information allowed lower and upper bounds for these measures to be obtained. Some simulations illustrated the behavior of the differential entropy and Fisher information bounds.
Classical theorems of information theory were considered to obtain additional properties of unbiased location parameter estimators. However, these theorems could be applied to other estimators, such as Bayesian [22] (as long as the prior density is known), shrinkage [23], or bootstrap-based [24] ones. The assumption that the sample comes from a multivariate skew–normal distribution is strong and not always applicable in the real world, so the properties revised here could be extended to more complex densities, for example, those that accommodate bimodality and heavy tails in the data [7,11,13,19]. On the other hand, given that Fisher information bounds under skew–normal settings were considered in this study, further work could focus on developing a time-dependent Fisher information for the skew–normal density [25], which could be applied to real data in survival analysis.

Funding

This research was funded by FONDECYT (Chile) grant number 11190116.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

The author thanks the editor and three anonymous referees for their helpful comments and suggestions.

Conflicts of Interest

The author declares that there is no conflict of interest in the publication of this paper.

References

  1. Cover, T.M.; Thomas, J.A. Elements of Information Theory, 2nd ed.; Wiley & Sons, Inc.: New York, NY, USA, 2006.
  2. Adcock, C.; Azzalini, A. A selective overview of skew-elliptical and related distributions and of their applications. Symmetry 2020, 12, 118.
  3. Wang, Z.Y.; Wang, C.; Wang, T.H. Estimation of location parameter in the skew normal setting with known coefficient of variation and skewness. Int. J. Intel. Technol. Appl. Stat. 2016, 9, 191–208.
  4. Trafimow, D.; Wang, T.; Wang, C. From a sampling precision perspective, skewness is a friend and not an enemy! Educ. Psychol. Meas. 2019, 79, 129–150.
  5. Wang, C.; Wang, T.; Trafimow, D.; Myüz, H.A. Necessary sample sizes for specified closeness and confidence of matched data under the skew normal setting. Comm. Stat. Simul. Comput. 2019, in press.
  6. Wang, C.; Wang, T.; Trafimow, D.; Talordphop, K. Estimating the location parameter under skew normal settings: Is violating the independence assumption good or bad? Soft Comput. 2021, 25, 7795–7802.
  7. Abid, S.H.; Quaez, U.J.; Contreras-Reyes, J.E. An information-theoretic approach for multivariate skew-t distributions and applications. Mathematics 2021, 9, 146.
  8. Contreras-Reyes, J.E. Analyzing fish condition factor index through skew-gaussian information theory quantifiers. Fluct. Noise Lett. 2016, 15, 1650013.
  9. Contreras-Reyes, J.E. Fisher information and uncertainty principle for skew-gaussian random variables. Fluct. Noise Lett. 2021, 20, 21500395.
  10. Tumminello, M.; Lillo, F.; Mantegna, R.N. Correlation, hierarchies, and networks in financial markets. J. Econ. Behav. Organ. 2010, 75, 40–58.
  11. Contreras-Reyes, J.E. An asymptotic test for bimodality using the Kullback–Leibler divergence. Symmetry 2020, 12, 1013.
  12. Contreras-Reyes, J.E.; Kahrari, F.; Cortés, D.D. On the modified skew-normal-Cauchy distribution: Properties, inference and applications. Comm. Stat. Theor. Meth. 2021, 50, 3615–3631.
  13. Contreras-Reyes, J.E.; Maleki, M.; Cortés, D.D. Skew-Reflected-Gompertz information quantifiers with application to sea surface temperature records. Mathematics 2019, 7, 403.
  14. Dembo, A.; Cover, T.M.; Thomas, J.A. Information theoretic inequalities. IEEE Trans. Inform. Theory 1991, 37, 1501–1518.
  15. Azzalini, A. The Skew-Normal and Related Families; Cambridge University Press: Cambridge, UK, 2013; Volume 3.
  16. Madiman, M.; Barron, A. Generalized entropy power inequalities and monotonicity properties of information. IEEE Trans. Inform. Theory 2007, 53, 2317–2329.
  17. Xie, Y. Sum of Two Independent Random Variables. ECE587, Information Theory, 2012. Available online: https://www2.isye.gatech.edu/~yxie77/ece587/SumRV.pdf (accessed on 15 January 2022).
  18. Azzalini, A.; Dalla-Valle, A. The multivariate skew-normal distribution. Biometrika 1996, 83, 715–726.
  19. Contreras-Reyes, J.E. Asymptotic form of the Kullback–Leibler divergence for multivariate asymmetric heavy-tailed distributions. Phys. A Stat. Mech. Appl. 2014, 395, 200–208.
  20. Gradshteyn, I.S.; Ryzhik, I.M. Table of Integrals, Series, and Products, 7th ed.; Academic Press/Elsevier: London, UK, 2007.
  21. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2021. Available online: http://www.R-project.org (accessed on 15 January 2022).
  22. Bayes, C.L.; Branco, M.D. Bayesian inference for the skewness parameter of the scalar skew-normal distribution. Braz. J. Prob. Stat. 2007, 21, 141–163.
  23. Kubokawa, T.; Strawderman, W.E.; Yuasa, R. Shrinkage estimation of location parameters in a multivariate skew-normal distribution. Comm. Stat. Theor. Meth. 2020, 49, 2008–2024.
  24. Ye, R.; Fang, B.; Wang, Z.; Luo, K.; Ge, W. Bootstrap inference on the Behrens–Fisher-type problem for the skew-normal population under dependent samples. Comm. Stat. Theor. Meth. 2021, in press.
  25. Kharazmi, O.; Asadi, M. On the time-dependent Fisher information of a density function. Braz. J. Prob. Stat. 2018, 32, 795–814.
Figure 1. Mean square errors (MSE) for $T_{LSE}(\mathbf{X})$ [blue dots] and $T_{BUE}(\mathbf{X})$ [red dots], considering $\tau=1$ and several skewness ($\lambda$) and location ($\mu$) parameters in the simulations.
Figure 2. Differential entropy bounds for $T_{BUE}(\mathbf{X})$, considering $n=10$ and 250; $\mu=0.1$, 0.5 and 1; and several skewness ($\lambda$) and coefficient of variation ($\tau$) parameters in the simulations.
Figure 3. Fisher information lower bounds for $T_{BUE}(\mathbf{X})$, considering $n=250$; $\mu=1$, 2.5 and 5; and several skewness ($\lambda$) and coefficient of variation ($\tau$) parameters in the simulations. The fourth panel shows the upper bounds for $T_{BUE}(\mathbf{X})$ considering $n=100,\dots,1000$ and several skewness parameters $\lambda$.