Robust Mixture Modeling Based on Two-Piece Scale Mixtures of Normal Family

Maleki, Mohsen; Contreras-Reyes, Javier E.; Mahmoudi, Mohammad R.

doi:10.3390/axioms8020038

by Mohsen Maleki ¹, Javier E. Contreras-Reyes ²,* and Mohammad R. Mahmoudi ³

¹ Department of Statistics, College of Sciences, Shiraz University, Shiraz 71946 85115, Iran

² Departamento de Estadística, Facultad de Ciencias, Universidad del Bío-Bío, Concepción 4081112, Chile

³ Department of Statistics, Faculty of Science, Fasa University, Fasa 74616 86131, Iran

* Author to whom correspondence should be addressed.

Axioms 2019, 8(2), 38; https://doi.org/10.3390/axioms8020038

Submission received: 18 February 2019 / Revised: 15 March 2019 / Accepted: 27 March 2019 / Published: 1 April 2019


Abstract

In this paper, we examine the finite mixture (FM) model with components drawn from a flexible class of two-piece distributions based on the scale mixtures of normal (TP-SMN) family. This family allows the development of a robust estimation of FM models. The TP-SMN is a rich class of distributions that covers symmetric/asymmetric and light/heavy tailed distributions. It represents an alternative family to the well-known scale mixtures of the skew normal (SMSN) family studied by Branco and Dey (2001). The TP-SMN also covers the SMN family (normal, t, slash, and contaminated normal distributions) as its symmetric members and two-piece versions of them as its asymmetric members. A key feature of this study is the use of a suitable hierarchical representation of the family to obtain maximum likelihood estimates of the model parameters via an EM-type algorithm. The performance of the proposed robust model is demonstrated using simulated and real data and compared with that of finite mixtures of SMSN models.

Keywords:

ECME algorithm; finite mixture model; maximum likelihood estimates; scale mixtures of normal family; two-piece distributions

1. Introduction

Finite mixture models are in high demand in machine-learning analysis because of their properties, their computational tractability, and their ability to approximate continuous densities well [1]. They are also an important statistical tool for many applications in clustering, discriminant analysis, image processing and satellite imaging [2]. Beyond the known results for the finite mixture of normal distributions (FM-NOR) model [1], recent developments cover symmetric/asymmetric and light/heavy tailed distributions. One of these is the class of finite mixtures of multivariate skew-normal (FM-SN) models [3,4], which offers some advantages over normal mixtures: although normal components allow arbitrarily close modeling of any distribution by increasing the number of components, in the context of supervised learning, groups of observations that are asymmetrically distributed can lead to wrong classification. The components of skew-normal mixture models, by contrast, capture skewness because of their flexibility [1]. In addition, the FM-SN model has been extended to the robust finite mixture of skew-t (FM-ST) model in the influential works of [3,5,6,7]. The FM-ST components capture both skewness and extreme observations because of their flexibility [8].

The SMSN family is a rich and flexible class of distributions that covers light/heavy-tailed distributions, e.g., the skew-normal (SN), skew-t (ST), skew-slash (SSL) and skew contaminated-normal (SCN) distributions, and has been widely considered in many statistical models, especially FM models (see e.g., [5,9,10,11,12,13,14,15]). The SMSN family is a skewed extension of the well-known symmetric scale mixtures of the normal (SMN) family, which contains the light/heavy-tailed members: the normal (N), t (T), slash (SL) and contaminated-normal (CN) distributions [16]. Lange et al. [17], Lange and Sinsheimer [18], and Maleki and Nematollahi [19] used the SMN family in applications of robust statistical modeling. A two-piece distribution based on symmetrical distributions with different scales is an alternative approach to modeling atypical data (see e.g., [10,20,21,22,23,24]). In our approach, we use the two-piece distributions based on the SMN family. This family, called the two-piece distributions based on the scale mixtures of normal (TP-SMN), is an analog of the SMSN family and contains the light/heavy-tailed members: the two-piece normal (TP-N), two-piece t (TP-T), two-piece slash (TP-SL) and two-piece contaminated-normal (TP-CN) distributions.

In this paper, we consider the TP-SMN family of distributions as a two-component mixture of truncated SMN distributions on a special two-set partition of the real domain ($\mathbb{R}$), and then propose the finite mixture of this family, called FM-TP-SMN models. It represents an alternative family to the well-known scale mixtures of skew normal (SMSN) family studied by [25]. We also use a hierarchical representation of the FM-TP-SMN and implement an expectation-maximization (EM)-type algorithm for finding the maximum likelihood (ML) estimates of the proposed model. Studies by [21,23] show that truncating the distribution over the two parts of the partition makes it possible to obtain a better fit to the empirical distribution, because the process underlying the complete likelihood is modeled. In this way, the “two-piece” modeling is a direct competitor of the FM-SMSN family of distributions [21].

The rest of this paper is organized as follows. In Section 2, we review some main properties of the TP-SMN family and represent this family as a two-component mixture of the truncated SMN distributions. In Section 3, the FM-TP-SMN model is introduced and the ML estimates of the proposed model parameters via an EM-type algorithm are provided. In Section 4, numerical studies with an application of the proposed models and estimates are considered. Some conclusions and ideas for future research are offered in Section 5.

2. The Two-Piece Scale Mixtures of Normal Distributions

In this section, we analyze some necessary properties of the TP-SMN family of distributions for our proposed FM model.

The well-known SMN family introduced by [16] (the basis of the robust asymmetric TP-SMN family), has the following probability density function (PDF) and stochastic representation. Let

X \sim S M N (μ, σ, ν)

, then its PDF is

f_{S M N} (x | μ, σ, ν) = \int_{0}^{\infty} ϕ (x | μ, u^{- 1} σ^{2}) d H (u | ν), x \in R,

(1)

and its stochastic representation is

X = μ + σ U^{- 1 / 2} W,

(2)

where

ϕ (\cdot | μ, σ^{2})

represents the density of

N (μ, σ^{2})

distribution,

H (\cdot | ν)

is the cumulative distribution function (CDF) of the scale mixing random variable U, which can be indexed by a scalar or vector of parameters

ν

, and W is a standard normal random variable that is independent of U.
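The stochastic representation (2) can be read as a sampling recipe. The following Python sketch is illustrative only: the function names `sample_scale` and `sample_smn` are hypothetical, only the normal and t members are covered, and the t case assumes the standard mixing law $U \sim Gamma(ν/2, ν/2)$ (shape and rate), which the standard library expresses through a scale parameter $2/ν$.

```python
import math
import random

def sample_scale(family, nu, rng):
    """Draw the mixing variable U for one SMN member (hypothetical helper)."""
    if family == "normal":
        return 1.0
    if family == "t":
        # Gamma with shape nu/2 and rate nu/2; gammavariate takes a scale, so scale = 2/nu.
        return rng.gammavariate(nu / 2.0, 2.0 / nu)
    raise ValueError("unsupported family: " + family)

def sample_smn(mu, sigma, family, nu, rng):
    """One draw of X = mu + sigma * U**(-1/2) * W, with W ~ N(0, 1) independent of U."""
    u = sample_scale(family, nu, rng)
    w = rng.gauss(0.0, 1.0)
    return mu + sigma * w / math.sqrt(u)

rng = random.Random(1)
draws = [sample_smn(10.0, 2.0, "t", 4.0, rng) for _ in range(20000)]
sample_median = sorted(draws)[len(draws) // 2]   # should sit close to mu = 10
```

Because every SMN member is symmetric about μ, the sample median is a convenient sanity check even when the variance is large or undefined.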

The TP-SMN is a rich family of distributions that covers the asymmetric light-tailed TP-N (also called the epsilon-skew-normal; [26]), the asymmetric heavy-tailed TP-T, TP-SL and TP-CN distributions, and their corresponding symmetric members. Note that symmetric members of the TP-SMN and SMSN classes are the SMN family. In terms of density, for

y \in R

this family can be represented as

g (y | μ, σ, γ, ν) = \{\begin{matrix} 2 (1 - γ) f_{S M N} (y | μ, σ (1 - γ), ν), & y \leq μ, \\ 2 γ f_{S M N} (y | μ, σ γ, ν), & y > μ, \end{matrix}

(3)

where

0 < γ < 1

is the slant parameter,

f_{S M N} (\cdot | μ, σ, ν)

is given by (1); this distribution is denoted by

Y \sim T P - S M N (μ, σ, γ, ν)

with

E (Y) = μ - b σ (1 - 2 γ)

and

V a r (Y) = σ^{2} [c_{2} k_{2} (ν) - b^{2} c_{1}^{2}]

, for which

b = \sqrt{2 / π} k_{1} (ν)

,

c_{r} = γ^{r + 1} + {(- 1)}^{r} {(1 - γ)}^{r + 1}

,

k_{r} (ν) = E (U^{- r / 2})

, and U is the scale mixing random variable in (2).

Different members of the TP-SMN family in (3) are obtained by choosing different distributions for the scale mixing random variable U in (2), as follows:

Two-piece normal (TP-N): $U = 1$ with probability one,
Two-piece t (TP-T): $U \sim G a m m a (ν / 2, ν / 2)$ , i.e., $\boldsymbol{ν} = ν$ ,
Two-piece slash (TP-SL): $U \sim B e t a (ν, 1)$ , i.e., $\boldsymbol{ν} = ν$ ,
Two-piece contaminated normal (TP-CN): $h (u | \boldsymbol{ν}) = ν I_{(u = τ)} + (1 - ν) I_{(u = 1)}$ , i.e., $\boldsymbol{ν} = {(ν, τ)}^{⊤}$ .

For more details and statistical properties of the TP-SMN family, see [20,23].
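As a numerical illustration of the density (3) and the mean formula, the sketch below evaluates the two-piece density for the normal and t members and checks it by quadrature. It is a sketch under stated assumptions, not the authors' code: the names `smn_pdf`, `tp_smn_pdf` and `integrate` are hypothetical, the parameter values are arbitrary, and the mean check uses $k_{1}(ν) = 1$, which holds for TP-N because $U = 1$.

```python
import math

def smn_pdf(x, mu, sigma, family="normal", nu=None):
    """Density of two SMN members: normal, or Student-t with nu degrees of freedom."""
    z = (x - mu) / sigma
    if family == "normal":
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    if family == "t":
        log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
        return math.exp(log_c - (nu + 1) / 2 * math.log1p(z * z / nu)) / sigma
    raise ValueError("unsupported family: " + family)

def tp_smn_pdf(y, mu, sigma, gamma, family="normal", nu=None):
    """Two-piece density (3): scale sigma*(1-gamma) left of mu, sigma*gamma right of mu."""
    if y <= mu:
        return 2.0 * (1.0 - gamma) * smn_pdf(y, mu, sigma * (1.0 - gamma), family, nu)
    return 2.0 * gamma * smn_pdf(y, mu, sigma * gamma, family, nu)

def integrate(f, lo, hi, n=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

mu, sigma, gamma = 10.0, 2.0, 0.7
total = integrate(lambda y: tp_smn_pdf(y, mu, sigma, gamma), mu - 30, mu + 30)
mean_num = integrate(lambda y: y * tp_smn_pdf(y, mu, sigma, gamma), mu - 30, mu + 30)
b = math.sqrt(2.0 / math.pi)                     # b = sqrt(2/pi) * k_1(nu), with k_1 = 1
mean_formula = mu - b * sigma * (1.0 - 2.0 * gamma)
```

With γ > 0.5 the right piece is the wider one, so the mean lies to the right of μ, in agreement with the sign of $- b σ (1 - 2 γ)$.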

Further, the two-piece distributions can be represented as a two-component mixture with separated supports, i.e., left and right half basic distributions [20] (Equation (4)). In particular, when

Y \sim T P - S M N (μ, σ, γ, ν)

with PDF given in (3), its density is a two-component mixture of left and right half SMN distributions with special component probabilities, as follows:

g (y | μ, σ_{1}, σ_{2}, ν) = 2 \frac{σ_{1}}{σ_{1} + σ_{2}} f_{S M N} (y | μ, σ_{1}, ν) I_{(- \infty, μ]} (y) + 2 \frac{σ_{2}}{σ_{1} + σ_{2}} f_{S M N} (y | μ, σ_{2}, ν) I_{(μ, + \infty)} (y) .

(4)

Note in Equation (4) that, the scale parameter

σ

and slant parameter

γ

in Equation (3) are recovered in the form of

σ = σ_{1} + σ_{2}

and

γ = σ_{2} / (σ_{1} + σ_{2})

.

By using auxiliary (latent) variables

S_{j}

,

j = 1, 2

; in terms of the components of the mixture in Equation (4), the TP-SMN random variable can have the following stochastic representation

\{\begin{matrix} Y | S_{1} = 1 \sim S M N (μ, σ_{1}, ν) I_{A} (y), \\ Y | S_{2} = 1 \sim S M N (μ, σ_{2}, ν) I_{A^{c}} (y), \end{matrix}

(5)

where

A = (- \infty, μ]

and

S M N (\cdot) I_{A} (\cdot)

denotes the truncated SMN distribution on the interval A, and

S = {(S_{1}, S_{2})}^{⊤}

has a multinomial distribution with following probability mass function (PMF):

P (S = s) = {(\frac{σ_{1}}{σ_{1} + σ_{2}})}^{s_{1}} {(\frac{σ_{2}}{σ_{1} + σ_{2}})}^{s_{2}}; s_{1}, s_{2} = 0, 1,

(6)

and is denoted by

S \sim M (1, σ_{1} / (σ_{1} + σ_{2}), σ_{2} / (σ_{1} + σ_{2}))

. Note that each component-label is a Bernoulli random variable

S_{k} \sim B i n o m i a l (1, σ_{k} / (σ_{1} + σ_{2}))

;

k = 1, 2

, such that

S_{1} + S_{2} = 1

.
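The latent-label representation in (5) and (6) suggests a two-stage sampler: pick the side with probabilities $σ_{1} / (σ_{1} + σ_{2})$ and $σ_{2} / (σ_{1} + σ_{2})$, then draw from the half SMN distribution on that side. The sketch below does this for the TP-T member. The function name `sample_tp_t` and the parameter values are hypothetical, and the half distribution is generated as $μ \mp σ_{k} U^{-1/2} |W|$, which relies on the symmetry of the SMN family about μ.

```python
import math
import random

def sample_tp_t(mu, sigma1, sigma2, nu, rng):
    """One TP-T draw: choose the side S, then draw a half-t on that side."""
    u = rng.gammavariate(nu / 2.0, 2.0 / nu)         # U ~ Gamma(shape nu/2, rate nu/2)
    half = abs(rng.gauss(0.0, 1.0)) / math.sqrt(u)   # |W| * U**(-1/2), always >= 0
    if rng.random() < sigma1 / (sigma1 + sigma2):    # S_1 = 1: left piece, scale sigma1
        return mu - sigma1 * half
    return mu + sigma2 * half                        # S_2 = 1: right piece, scale sigma2

rng = random.Random(7)
mu, sigma1, sigma2, nu = 10.0, 0.6, 1.4, 4.0
ys = [sample_tp_t(mu, sigma1, sigma2, nu, rng) for _ in range(50000)]
left_share = sum(y <= mu for y in ys) / len(ys)      # estimates sigma1 / (sigma1 + sigma2)
```

Here $σ = σ_{1} + σ_{2} = 2$ and $γ = σ_{2} / (σ_{1} + σ_{2}) = 0.7$, so roughly 30% of the draws should fall at or below μ.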

3. Finite Mixtures TP-SMN

In this section, we introduce the finite mixture of TP-SMN (FM-TP-SMN) model and obtain the ML estimates of this model’s parameters.

3.1. FM-TP-SMN Model

Here, we consider a distribution represented as a g-component mixture of TP-SMN distributions. In terms of density, this mixture distribution is characterized by the following density:

f (y | Θ) = \sum_{j = 1}^{g} π_{j} g (y | μ_{j}, σ_{j}, γ_{j}, ν_{j}), y \in R,

(7)

where

Θ = {(π_{1}, \dots, π_{g}, μ_{1}, \dots, μ_{g}, σ_{1}, \dots, σ_{g}, γ_{1}, \dots, γ_{g}, ν_{1}, \dots, ν_{g})}^{⊤}

, for which

\boldsymbol{π} = {(π_{1}, \dots, π_{g})}^{⊤}

with

π_{j} > 0

,

j = 1, \dots, g

,

\sum_{j = 1}^{g} π_{j} = 1

, and, for

j = 1, \dots, g

,

g (y | μ_{j}, σ_{j}, γ_{j}, ν_{j})

is a

T P - S M N (μ_{j}, σ_{j}, γ_{j}, ν_{j})

-component density as defined in (3). Also, we write

Y \sim F M - T P - S M N (Θ)

to say that a random variable Y has an FM-TP-SMN distribution as defined by (7).

Concerning the parameter

ν_{j}

of the mixing distribution

H (\cdot | ν_{j})

, for

j = 1, \dots, g

, it is worth noting that it can be a vector of parameters, e.g., for the contaminated normal distribution. Thus, for computational convenience we assume that

ν_{1} = \dots = ν_{g} = ν

(see also [5]).

In terms of the components of the mixtures, Equation (7) can be equivalently obtained by

Y | Z_{j} = 1 \sim T P - S M N (μ_{j}, σ_{j}, γ_{j}, ν), j = 1, \dots, g,

(8)

where

Z = {(Z_{1}, \dots, Z_{g})}^{⊤} \sim M u l t i n o m i a l (1, π_{1}, \dots, π_{g})

is a multinomial (component-label) vector with probability mass function

P (Z_{1} = z_{1}, \dots, Z_{g} = z_{g}) = π_{1}^{z_{1}} π_{2}^{z_{2}} \dots π_{g}^{z_{g}}

,

z_{j} = 0, 1

;

j = 1, \dots, g

,

\sum_{j = 1}^{g} z_{j} = 1

.

Since only one component of

Z

can be equal to one (remaining ones are zero), events

{Z_{j} = 1}

and

{Z_{j} = 1, Z_{r} = 0; \forall j \neq r}

are equivalent, thus indicating that the distribution of Y corresponds to the j-th component of the mixture; for further details, see e.g., [1].

Remark 1.

Let

Y \sim F M - T P - S M N (Θ)

, then the mean and variance of Y are, respectively, given by

E [Y] = \sum_{j = 1}^{g} π_{j} E [Y | Z_{j} = 1] = \sum_{j = 1}^{g} π_{j} E [X_{j}],

and

V a r [Y] = \sum_{j = 1}^{g} π_{j} {V a r [Y | Z_{j} = 1] + {(E [Y | Z_{j} = 1] - E [Y])}^{2}} = \sum_{j = 1}^{g} π_{j} {V a r [X_{j}] + {(E [X_{j}] - \bar{μ})}^{2}},

where

X_{j} \sim T P - S M N (μ_{j}, σ_{j}, γ_{j}, ν)

,

j = 1, \dots, g

, and

\bar{μ} = \sum_{j = 1}^{g} π_{j} E [X_{j}]

(see e.g., [2]).
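Remark 1 can be checked numerically by combining it with the component moments of Section 2. The sketch below does so for TP-N components, for which $k_{1}(ν) = k_{2}(ν) = 1$. The helper names `tp_n_moments` and `mixture_moments` are hypothetical and the parameter values are only illustrative.

```python
import math

def tp_n_moments(mu, sigma, gamma):
    """Mean and variance of TP-N(mu, sigma, gamma), using k_1 = k_2 = 1."""
    b = math.sqrt(2.0 / math.pi)
    c1 = gamma ** 2 - (1.0 - gamma) ** 2       # c_r with r = 1
    c2 = gamma ** 3 + (1.0 - gamma) ** 3       # c_r with r = 2
    mean = mu - b * sigma * (1.0 - 2.0 * gamma)
    var = sigma ** 2 * (c2 - b ** 2 * c1 ** 2)
    return mean, var

def mixture_moments(weights, components):
    """Mean and variance of the mixture as given in Remark 1."""
    moments = [tp_n_moments(*c) for c in components]
    mean = sum(w * m for w, (m, _) in zip(weights, moments))
    var = sum(w * (v + (m - mean) ** 2) for w, (m, v) in zip(weights, moments))
    return mean, var

mix_mean, mix_var = mixture_moments([0.75, 0.25], [(10.0, 2.0, 0.1), (15.0, 4.0, 0.7)])
```

Note that in this parameterization a symmetric component ($γ = 0.5$) has standard deviation $σ / 2$, since each half has scale $σ γ$.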

The FM-TP-SMN densities in (7) are an extremely flexible class which includes the finite mixtures of SMN densities as a special case, when

γ_{j} = 0.5

,

j = 1, \dots, g

.

For each i.i.d. sample in the form of

Y = {(Y_{1}, \dots, Y_{n})}^{⊤}

, by considering the PDF (7), the log-likelihood function is

ℓ (Θ | Y) = \sum_{i = 1}^{n} \log (\sum_{j = 1}^{g} π_{j} g (y_{i} | μ_{j}, σ_{j}, γ_{j}, ν)) .

(9)
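A direct evaluation of the log-likelihood (9) may help fix ideas. The sketch below covers TP-N components only and is not the authors' implementation. The names `tp_n_pdf` and `log_likelihood` are hypothetical, and no protection against numerical underflow is included.

```python
import math

def tp_n_pdf(y, mu, sigma, gamma):
    """TP-N density from (3); both pieces share the constant sqrt(2/pi)/sigma."""
    scale = sigma * (1.0 - gamma) if y <= mu else sigma * gamma
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-0.5 * ((y - mu) / scale) ** 2)

def log_likelihood(ys, weights, components):
    """Log-likelihood (9): sum over observations of the log mixture density."""
    total = 0.0
    for y in ys:
        total += math.log(sum(w * tp_n_pdf(y, *c) for w, c in zip(weights, components)))
    return total

ys = [8.7, 9.9, 10.1, 14.2, 16.5, 18.0]
ll = log_likelihood(ys, [0.75, 0.25], [(10.0, 2.0, 0.1), (15.0, 4.0, 0.7)])
```

The shared constant arises because the piece weight, $2 (1 - γ)$ or $2 γ$, cancels against the piece scale in the normal density.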

3.2. ML Estimates of Model Parameters

We can use latent indicator (allocation) variables

Z_{i} = {(Z_{i 1}, \dots, Z_{i g})}^{⊤}

,

i = 1, \dots, n

, to assign observations belonging to different components of the mixture (

j = 1, \dots, g

), so in terms of

Z_{i j}

, we can conclude that

Y_{i} | Z_{i j} = 1 \sim^{i n d .} T P - S M N (μ_{j}, σ_{j}, γ_{j}, ν), P (Z_{i j} = 1) = π_{j}; i = 1, \dots, n, j = 1, \dots, g,

and so using Equations (2) and (5) with

S_{i j} = {(S_{i j 1}, S_{i j 2})}^{⊤}

,

i = 1, \dots, n

, we have that

\begin{matrix} Y_{i} | U_{i j k} = u_{i j k}, Z_{i j} = 1, S_{i j k} = 1 & \sim^{i n d .} & N (μ_{j}, u_{i j k}^{- 1} σ_{k j}^{2}) I_{A_{j}} {(y_{i})}^{2 - k} I_{A_{j}^{c}} {(y_{i})}^{k - 1}, \\ U_{i j k} | Z_{i j} = 1, S_{i j k} = 1 & \sim^{i n d .} & H (u_{i j k} | ν), \\ S_{i j} | Z_{i j} = 1 & \sim^{i i d .} & M (1, \frac{σ_{1 j}}{σ_{1 j} + σ_{2 j}}, \frac{σ_{2 j}}{σ_{1 j} + σ_{2 j}}), \\ Z_{i} & \sim^{i i d .} & M (1, π_{1}, \dots, π_{g}), \end{matrix}

(10)

for

i = 1, \dots, n

;

j = 1, \dots, g

;

k = 1, 2

,

A_{j} = (- \infty, μ_{j}]

and

N (\cdot) I_{A} (\cdot)

denotes the truncated normal distribution on the interval A.

The above hierarchical representation of the FM-TP-SMN model will be used to obtain the ML estimates via an ECME-algorithm. This algorithm is a generalization of the ECM-algorithm introduced by [27], which is an extension of the EM-algorithm [28]. It can be obtained by replacing some CM-steps, which maximize the constrained expected complete-data log-likelihood function, with steps that maximize the corresponding constrained actual likelihood function. As [27,29] indicated, the joint ML estimates obtained by ECME-algorithms are much more efficient than other EM-type algorithms.

Let

C = {Y, S, Z}

denote the complete data, where

Y = {(Y_{1}, \dots, Y_{n})}^{⊤}

is the observed sample, and

S = {(S_{1}^{⊤}, \dots, S_{n}^{⊤})}^{⊤}

and

Z = {(Z_{1}^{⊤}, \dots, Z_{n}^{⊤})}^{⊤}

are the latent or unobserved variables from the FM-TP-SMN model with vector of parameters

Θ = {(π_{1}, \dots, π_{g}, μ_{1}, \dots, μ_{g}, σ_{11}, \dots, σ_{1 g}, σ_{21}, \dots, σ_{2 g}, ν)}^{⊤}

. Considering the hierarchical representation (10), the completed (augmented) likelihood function is given by

L_{C} (Θ) = \prod_{i = 1}^{n} \prod_{j = 1}^{g} \prod_{k = 1}^{2} {(ϕ (y_{i} | μ_{j}, u_{i j k}^{- 1} σ_{k j}^{2}) h (u_{i j k} | ν) p (z_{i} | π_{1}, \dots, π_{g}) p (s_{i j} | σ_{1 j}, σ_{2 j}) I_{A_{j}} {(y_{i})}^{2 - k} I_{A_{j}^{c}} {(y_{i})}^{k - 1})}^{z_{i j} s_{i j k}},

where

A_{j} = (- \infty, μ_{j}]

. After ignoring constants and using auxiliary (latent) variables the completed log-likelihood function is in the form:

ℓ (Θ) = - \sum_{i = 1}^{n} \sum_{j = 1}^{g} z_{i j} \log (σ_{1 j} + σ_{2 j}) - \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{g} \sum_{k = 1}^{2} \frac{z_{i j} s_{i j k} u_{i j k}}{σ_{k j}^{2}} {(y_{i} - μ_{j})}^{2} + \sum_{i = 1}^{n} \sum_{j = 1}^{g} \sum_{k = 1}^{2} z_{i j} s_{i j k} \log h (u_{i j k} | ν) .

(11)

Quantities

{\hat{z}}_{i j} = E [Z_{i j} | \hat{Θ}, y_{i}]

,

{\hat{s}}_{i j k} = E [S_{i j k} | \hat{Θ}, y_{i}, Z_{i j} = 1]

and

{\hat{w}}_{i j k} = E [Z_{i j} S_{i j k} U_{i j k} | \hat{Θ}, y_{i}]

, must be defined, and using known properties of conditional expectation and PDF in (4), we obtain

{\hat{w}}_{i j k} = {\hat{z}}_{i j} {\hat{s}}_{i j k} {\hat{κ}}_{i j k}

, where

{\hat{κ}}_{i j k} = E [U_{i j k} | \hat{Θ}, y_{i}, Z_{i j} = 1, S_{i j k} = 1]

,

i = 1, \dots, n

,

j = 1, \dots, g, k = 1, 2

, and

\begin{matrix} {\hat{z}}_{i j} & = & E [Z_{i j} | \hat{Θ}, y_{i}] = \frac{{\hat{π}}_{j} g (y_{i} | {\hat{μ}}_{j}, {\hat{σ}}_{1 j}, {\hat{σ}}_{2 j}, \hat{ν})}{\sum_{l = 1}^{g} {\hat{π}}_{l} g (y_{i} | {\hat{μ}}_{l}, {\hat{σ}}_{1 l}, {\hat{σ}}_{2 l}, \hat{ν})}; i = 1, \dots, n, j = 1, \dots, g, \end{matrix}

(12)

\begin{matrix} {\hat{s}}_{i j 1} & = & \frac{2 (\frac{{\hat{σ}}_{1 j}}{{\hat{σ}}_{1 j} + {\hat{σ}}_{2 j}}) f_{S M N} (y_{i} | {\hat{μ}}_{j}, {\hat{σ}}_{1 j}, \hat{ν}) I_{(- \infty, {\hat{μ}}_{j}]} (y_{i})}{g (y_{i} | {\hat{μ}}_{j}, {\hat{σ}}_{1 j}, {\hat{σ}}_{2 j}, \hat{ν})} = I_{(- \infty, {\hat{μ}}_{j}]} (y_{i}), \end{matrix}

(13)

where

g (\cdot | \cdot)

is the TP-SMN PDF defined in Equation (4), and

{\hat{s}}_{i j 2} = 1 - {\hat{s}}_{i j 1}

, and the conditional expectation

{\hat{κ}}_{i j k}

for the TP-SMN distribution members are given by:

Two-piece normal (TP-N): ${\hat{κ}}_{i j k} = 1$ ,
Two-piece t (TP-T): ${\hat{κ}}_{i j k} = \frac{\hat{ν} + 1}{\hat{ν} + d_{i j k}}$ ,
Two-piece slash (TP-SL): ${\hat{κ}}_{i j k} = \frac{2 \hat{ν} + 1}{d_{i j k}} \frac{P_{1} (\hat{ν} + 3 / 2, d_{i j k} / 2)}{P_{1} (\hat{ν} + 1 / 2, d_{i j k} / 2)}$ ,
Two-piece contaminated normal (TP-CN): ${\hat{κ}}_{i j k} = \frac{{\hat{τ}}^{3 / 2} \hat{ν} e^{- \hat{τ} d_{i j k} / 2} + (1 - \hat{ν}) e^{- d_{i j k} / 2}}{{\hat{τ}}^{1 / 2} \hat{ν} e^{- \hat{τ} d_{i j k} / 2} + (1 - \hat{ν}) e^{- d_{i j k} / 2}}$ ,

where

d_{i j k} = {(\frac{y_{i} - {\hat{μ}}_{j}}{{\hat{σ}}_{k j}})}^{2}

and

P_{x} (a, b)

denotes the distribution function of the

G a m m a (a, b)

distribution evaluated at x.
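The conditional expectations above act as observation weights: a large standardized squared distance $d_{i j k}$ yields a small weight, which is where the robustness of the heavy-tailed members comes from. The sketch below evaluates the TP-T and TP-CN formulas as listed. The function names `kappa_t` and `kappa_cn` are hypothetical, and the slash case is omitted because it needs an incomplete gamma function that the Python standard library does not provide.

```python
import math

def kappa_t(d, nu):
    """TP-T weight: (nu + 1) / (nu + d)."""
    return (nu + 1.0) / (nu + d)

def kappa_cn(d, nu, tau):
    """TP-CN weight: ratio of the two contaminated-normal sums in the formula above."""
    a = nu * math.exp(-tau * d / 2.0)
    b = (1.0 - nu) * math.exp(-d / 2.0)
    return (tau ** 1.5 * a + b) / (tau ** 0.5 * a + b)

for d in (0.0, 1.0, 4.0, 16.0):
    print(d, round(kappa_t(d, 4.0), 4), round(kappa_cn(d, 0.3, 0.3), 4))
```

For TP-N the weight is identically one, so every observation contributes equally regardless of its distance from $μ_{j}$.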

Now, the expectation step (E-step) at the

(r + 1)

th iteration of the ECME-algorithm requires the calculation of

Q (Θ | {\hat{Θ}}^{r}) = E_{Θ} [ℓ_{c} (Θ) | {\hat{Θ}}^{r}, y]

. So,

E-step.

Q (Θ | {\hat{Θ}}^{r}) = - \sum_{i = 1}^{n} \sum_{j = 1}^{g} {\hat{z}}_{i j} \log (σ_{1 j} + σ_{2 j}) - \frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{g} \sum_{k = 1}^{2} \frac{{\hat{w}}_{i j k}}{σ_{k j}^{2}} {(y_{i} - μ_{j})}^{2} + \sum_{i = 1}^{n} \sum_{j = 1}^{g} \sum_{k = 1}^{2} E [Z_{i j} S_{i j k} \log h (U_{i j k} | ν) | {\hat{Θ}}^{r}, y_{i}] .

For the conditionally maximizing steps (CM-steps) at the

(r + 1)

-th iteration of the ECME-algorithm we have:

CM-steps.

Update

π_{j}

,

j = 1, \dots, g

, as:

π_{j}^{(r + 1)} = \frac{\sum_{i = 1}^{n} {\hat{z}}_{i j}^{(r)}}{n} .

(14)

Update

μ_{j}

,

j = 1, \dots, g

, as:

μ_{j}^{(r + 1)} = \frac{\sum_{i = 1}^{n} {\hat{α}}_{i j}^{(r)} y_{i}}{\sum_{i = 1}^{n} {\hat{α}}_{i j}^{(r)}},

(15)

where

{\hat{α}}_{i j}^{(r)} = \sum_{k = 1}^{2} {\hat{w}}_{i j k}^{(r)} / σ_{k j}^{2 (r)}

.
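The updates (14) and (15) amount to an average of the posterior probabilities and a weighted mean of the data. The sketch below is a minimal illustration with hypothetical names (`cm_update_pi_mu`, `z_hat`, `kappa`). It takes the denominator of (15) as the sum of the weights over i, which is what makes the update a weighted mean, and it uses the fact that ${\hat{s}}_{i j 1}$ is an indicator, so only the side on which $y_{i}$ falls contributes to ${\hat{α}}_{i j}$.

```python
def cm_update_pi_mu(ys, z_hat, mus, sig1, sig2, kappa=None):
    """One pass of updates (14) and (15); kappa[i][j] defaults to 1 (TP-N case)."""
    n, g = len(ys), len(mus)
    new_pi = [sum(z_hat[i][j] for i in range(n)) / n for j in range(g)]
    new_mu = []
    for j in range(g):
        num = den = 0.0
        for i, y in enumerate(ys):
            # s_hat_{ij1} is the indicator of y_i <= mu_j, so exactly one k contributes.
            scale = sig1[j] if y <= mus[j] else sig2[j]
            k_ij = 1.0 if kappa is None else kappa[i][j]
            alpha = z_hat[i][j] * k_ij / scale ** 2    # alpha_hat_{ij}
            num += alpha * y
            den += alpha
        new_mu.append(num / den)
    return new_pi, new_mu

ys = [0.0, 1.0, 2.0, 3.0]
z_hat = [[1.0], [1.0], [1.0], [1.0]]
new_pi, new_mu = cm_update_pi_mu(ys, z_hat, mus=[1.2], sig1=[1.0], sig2=[1.0])
```

With one component and equal scales on both sides, the update reduces to the ordinary sample mean, which gives a quick sanity check.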

Update

{\hat{σ}}_{k j}^{(r + 1)}

;

k = 1, 2

;

j = 1, \dots, g

, by solving the following depressed cubic equations

σ_{k j}^{3} + p σ_{k j} + q = 0; k = 1, 2,

(16)

where

p = - \frac{\sum_{i = 1}^{n} {\hat{w}}_{i j k}^{(r)} {(y_{i} - μ_{j}^{(r + 1)})}^{2}}{\sum_{i = 1}^{n} {\hat{z}}_{i j}^{(r)}}

, for which

q = p σ_{2 j} I_{(k = 1)} + p σ_{1 j} I_{(k = 2)}

. Note that

p, q < 0

, so by Descartes' rule of signs this cubic equation has exactly one root in the

(0, + \infty)

interval.
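Since the cubic in (16) is negative at zero and has a single positive root when p, q < 0, that root can be bracketed and found by bisection. The sketch below is one simple way to do it, not necessarily the solver used by the authors. The names `positive_root` and `update_sigma` are hypothetical, and `update_sigma` takes the weighted sum of squares, the sum of the ${\hat{z}}_{i j}$, and the current value of the other scale as inputs.

```python
def positive_root(p, q, tol=1e-12):
    """Unique positive root of s**3 + p*s + q = 0, assuming p < 0 and q < 0."""
    def f(s):
        return s ** 3 + p * s + q
    lo, hi = 0.0, 1.0
    while f(hi) <= 0.0:           # f(0) = q < 0, so grow hi until the sign changes
        hi *= 2.0
    for _ in range(200):          # bisection keeps f(lo) <= 0 < f(hi)
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def update_sigma(weighted_ss, z_sum, sigma_other):
    """Scale update from (16) with p = -weighted_ss / z_sum and q = p * sigma_other."""
    p = -weighted_ss / z_sum
    q = p * sigma_other
    return positive_root(p, q)

root = positive_root(-7.0, -6.0)      # s**3 - 7 s - 6 = (s - 3)(s + 1)(s + 2)
new_sigma = update_sigma(8.0, 2.0, 1.0)
```

Bisection is slower than Newton's method but cannot leave the bracket, which is convenient inside an iterative algorithm where the coefficients change at every step.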

CML-step of the ECME-algorithm.

ν^{(r + 1)} = a r g m a x_{ν} ℓ ({\hat{Θ}}_{- ν}^{(r + 1)}, ν | Y),

(17)

where

ℓ (\cdot | Y)

is the log-likelihood function given in (9) and

{\hat{Θ}}_{- ν}^{(r + 1)}

denotes the

(r + 1)

-th update of

\hat{Θ}

except

ν

.

The ECME-algorithm iterates until a sufficient convergence rule is satisfied, e.g., if

| ℓ ({\hat{Θ}}^{(r + 1)} | y) / ℓ ({\hat{Θ}}^{(r)} | y) - 1 | \leq ϵ

, under the determined tolerance

ϵ

.

4. Numerical Studies

In this section, we assess the performance of the proposed FM model using simulated and real datasets. The algorithms were implemented in R [30], version 3.5.1, on a Core i7 760 processor (2.8 GHz), and a relative tolerance of

10^{- 5}

was used as the convergence criterion for the ECME-algorithms. A sample copy of the R code is available upon request from the authors and will be made available in an R package dedicated to the proposed model.

4.1. Simulations

In this section, we present three simulations. The first shows the robustness of the FM-TP-SMN models in classifying heterogeneous data; the second examines the behavior of the proposed FM-TP-SMN models under misspecification; and the third examines the asymptotic properties of the proposed model estimates.

4.1.1. Clustering

The FM models are useful for clustering observations by allocating them into groups that are similar in some sense. In fact, by considering the estimated (posterior) probabilities, we can assign observations to given groups. However, atypical data can have an undesirable effect on the clustering (see e.g., [1,2,8]). Our models account for skewness, and we use clustering based on them to show their robustness to atypical data in the components. We generated 1000 samples from the FM-TP-SMN with two components, and for each sample we considered the k-means clustering while ignoring the true classification.

We simulated 1000 samples with sample sizes

n = 100, 350, 800

, from the FM-TP-SMN models with parameters

π_{1} = 0.75

,

μ_{1} = 10

,

μ_{2} = 15

,

σ_{1} = 2

,

σ_{2} = 4

,

γ_{1} = 0.1

,

γ_{2} = 0.7

and

γ_{1} = γ_{2} = 0.5

(FM-Normal model), for which

ν = 4

for TP-T and TP-SL, and

ν = (0.3, 0.3)

for TP-CN. According to the FM-TP-SMN estimated (posterior) probabilities given in (12) and the threshold value 0.5, we allocated each observation to a specific component. Over the samples

t = 1, \dots, 1000

, the mean rates of correct allocation are given in Table 1, which shows that, in the presence of atypical data, clustering based on the FM-TP-T, FM-TP-SL and FM-TP-CN models is more reasonable than clustering based on the ordinary FM-Normal model. Also note that when the true model is FM-TP-T, the FM-TP-CN model also outperforms the remaining models.
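The allocation rule can be sketched in a few lines: compute the posterior probabilities (12) and assign each observation to the component with the largest one, which for two components is the same as thresholding at 0.5. The code below uses TP-N components and the first parameter setting of this simulation for illustration only. The names `tp_n_pdf`, `posteriors` and `allocate` are hypothetical, and no safeguard is included for observations whose density underflows to zero under every component.

```python
import math

def tp_n_pdf(y, mu, sigma, gamma):
    """TP-N density from (3)."""
    scale = sigma * (1.0 - gamma) if y <= mu else sigma * gamma
    return math.sqrt(2.0 / math.pi) / sigma * math.exp(-0.5 * ((y - mu) / scale) ** 2)

def posteriors(y, weights, components):
    """Posterior component probabilities z_hat_{ij} of (12) for one observation."""
    dens = [w * tp_n_pdf(y, *c) for w, c in zip(weights, components)]
    total = sum(dens)
    return [d / total for d in dens]

def allocate(y, weights, components):
    """Index of the component with the largest posterior probability."""
    post = posteriors(y, weights, components)
    return max(range(len(post)), key=post.__getitem__)

weights = [0.75, 0.25]
components = [(10.0, 2.0, 0.1), (15.0, 4.0, 0.7)]
labels = [allocate(y, weights, components) for y in (9.0, 10.1, 13.0, 17.0)]
```

Comparing such labels with the known simulated labels gives the rate of correct allocation reported per sample.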

4.1.2. Misspecification

For this section, we simulated 2000 samples of size

n = 150

from FM-SN (asymmetric and light tailed components) and FM-ST (asymmetric and heavy tailed components) separately, with parameters of the previous simulation structure and with

(λ_{1}, λ_{2}) = (- 2, 3)

. Then, we fitted the proposed FM-TP-SMN models to these data. In Table 2, the FM-TP-SMN models were first compared with the ordinary FM-NOR model (symmetric and light tailed components) and then with each other (asymmetric components). The first four rows of Table 2 report how often each model was preferred and show that the preferred models belong to the FM-TP-SMN class rather than being the FM-NOR model. Also, the model most often preferred for the FM-SN data is FM-TP-N, and the other preferred models are those similar to it (for example, FM-TP-T with a large number of degrees of freedom

ν

), i.e., the preferred fitted models for the FM-SN data, with asymmetric and light tailed components, are the FM-TP-SMN models with light tailed components. For the FM-ST data, with asymmetric but heavy tailed components, the FM-TP-SMN models with heavy tailed components were likewise preferred. In this section and in the real applications, the model selection criteria used to choose the best model are the logarithm of the maximized likelihood function (log-like), which is

ℓ (\hat{Θ} | y)

, the Akaike information criterion (AIC; [31]) and the Bayesian information criterion (BIC; [32]), in the form of

A I C = 2 k - 2 ℓ (\hat{Θ} | y) and B I C = k \log n - 2 ℓ (\hat{Θ} | y),

respectively, where k is the number of the model parameters.
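Both criteria are simple functions of the maximized log-likelihood. The sketch below states them in code, with hypothetical function names and made-up inputs. Smaller values indicate the preferred model.

```python
import math

def aic(loglik, k):
    """AIC = 2k - 2 * loglik, where k is the number of model parameters."""
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    """BIC = k * log(n) - 2 * loglik, where n is the sample size."""
    return k * math.log(n) - 2.0 * loglik

# Illustrative values only: two fitted models on a sample of size n = 150.
candidates = {"model A": (-310.2, 5), "model B": (-308.9, 8)}
ranking = sorted(candidates, key=lambda m: bic(candidates[m][0], candidates[m][1], 150))
```

Because $\log n > 2$ for $n \geq 8$, BIC penalizes extra parameters more strongly than AIC at the sample sizes used here.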

4.1.3. Asymptotical results

For this section, we simulated 400 samples each one with sample sizes

n = 150

, 600, 1000, 2000, 4000, from FM-TP-T models with two components that are weakly separated (WS), moderately separated (MS) and strongly separated (SS), i.e., with large, medium and little overlap between components, respectively (see Figure 1), for which

π_{1} = 0.4

,

μ_{1} = 10

,

μ_{2} = 15

,

σ_{1} = 1

,

σ_{2} = 1

,

γ_{1} = 0.65

,

γ_{2} = 0.35

,

ν = 4

and

(μ_{1}, μ_{2}) = (- 0.2, 0.2)

for WS data;

(μ_{1}, μ_{2}) = (- 1.5, 1.5)

for MS data; and

(μ_{1}, μ_{2}) = (- 5, 5)

for SS data.

Using the proposed ECME algorithm to find the ML estimates, we evaluate the Monte Carlo average bias (MC-bias) and mean squared error (MSE) of the ML estimates over the samples

i = 1, \dots, 400

, given in Table 3, Table 4 and Table 5 and defined respectively by

M C_{B i a s} (ξ_{j}) = \frac{1}{400} \sum_{i = 1}^{400} {\hat{ξ}}_{j}^{(i)} - ξ_{j}, M S E (ξ_{j}) = \frac{1}{400} \sum_{i = 1}^{400} {({\hat{ξ}}_{j}^{(i)} - ξ_{j})}^{2},

where

{\hat{ξ}}_{j}^{(i)}

is the ML estimate of the parameter

ξ_{j}

in the i-th sample.
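For a single parameter, the two summaries reduce to an average deviation and an average squared deviation over the replicated estimates. A minimal sketch, with hypothetical function names and a made-up list of three estimates in place of the 400 replications:

```python
def mc_bias(estimates, true_value):
    """Monte Carlo bias: mean of the estimates minus the true parameter value."""
    return sum(estimates) / len(estimates) - true_value

def mc_mse(estimates, true_value):
    """Monte Carlo mean squared error of the estimates around the true value."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)

estimates = [9.8, 10.1, 10.4]     # illustrative ML estimates of a parameter equal to 10
bias = mc_bias(estimates, 10.0)
mse = mc_mse(estimates, 10.0)
```

The MSE equals the squared bias plus the Monte Carlo variance of the estimates, so both quantities shrinking with n indicates that the estimates concentrate around the true value.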

These results in Table 3, Table 4 and Table 5 are obtained from the different fitted FM-TP-SMN models and show the performance of the proposed models as well as their parameters estimates. As the sample size increased we naturally observed that the Monte-Carlo average bias of ML estimates and MSE were tending toward zero.

4.2. Applications

In this section, we apply the FM-TP-SMN models to real data sets to show the performance of the proposed models and estimates in applications.

BMI Data

We considered the body mass index (BMI) data set collected for men aged between 18 and 80 years. The data were gathered in the National Health and Nutrition Examination Survey by the US National Center for Health Statistics (NCHS) of the Center for Disease Control (CDC). A strong relationship between obesity and many chronic diseases has attracted attention in recent years; that is, most people with an obesity problem will have chronic diseases. The BMI, the ratio of body weight in kilograms to squared height in meters, is a measure used to identify overweight and obesity. A person with BMI > 25 is considered overweight, while a person with BMI > 30 is considered obese.

This dataset had 4579 participants with BMI records, but for modeling with finite mixture models, only participants with weights within 39.50–70.00 kg (1069 participants) and 95.01–196.80 kg (1054 participants) were considered, forming the first and second subgroups, respectively. Lin et al. [7] first analyzed this dataset using the reports from 1999–2000 and 2001–2002, fitting the FM-normal, FM-T, FM-SN and FM-ST models, always with two components; [5,13] then fitted the FM-SMSN models to this dataset. The results obtained by [13] were general and included those of [5,7]. We therefore fitted the proposed FM-TP-SMN models to this dataset and compared our results with those in [13].

Table 6 contains the ML estimates of the FM-TP-SMN models with two components. The log-likelihood, AIC and BIC values of the proposed FM-TP-SMN models, together with those of the FM-SMSN models taken from Table 1 of [13], appear in Table 7.

As noted by Lin et al. [4] and Prates et al. [13], the criteria values in Table 7 indicate that the heavy tailed FM-SMSN models (FM-ST, FM-SCN and FM-SSL) fit better than the ordinary FM-NOR and FM-SN models, with the FM-SSL and FM-ST being the best fitted models. Similar results hold for the FM-TP-SMN models relative to their FM-SMSN counterparts, while the FM-TP-SMN models were more reasonable than the FM-SMSN models. Among them, the FM-TP-SL and FM-TP-T were the best models. In Figure 2, we plot the fitted FM-TP-T and FM-ST density curves over the histogram of the BMI data.

4.3. UScrime Data

As a further application of the FM-TP-SMN models and the proposed methodology, we consider the effect of punishment regimes on crime rates [33,34], which is of high interest to criminologists. This has been studied using aggregate data on 47 US states for 1960. We consider the 13th column of this data frame, which measures income inequality. The data are available as the UScrime data frame in the MASS R package.

Table 8 contains the ML estimates of the FM-TP-SMN models with two components, and the log-likelihood, AIC and BIC values of the proposed FM-TP-SMN and FM-SMSN models.

The log-likelihood values in Table 8 indicate that the FM-TP-N and FM-TP-T are the best models within the FM-TP-SMN class, while the FM-SCN is the best model within the FM-SMSN class. The AIC and BIC criteria chose the FM-TP-N model (asymmetric components) within the FM-TP-SMN class, while they chose the FM-NOR model (symmetric components) within the FM-SMSN class. Among all competitors, the criteria chose the FM-TP-N model, which belongs to the proposed FM-TP-SMN class and is more reasonable. Figure 3 shows the histogram of the US crime data with the fitted curves of the FM-TP-SMN and FM-SMSN models. These graphical visualizations show the suitability of the asymmetrical components and of the proposed FM-TP-SMN models.

5. Conclusions

We have proposed a flexible family of TP-SMN distributions for application in clustering problems. The TP-SMN family is capable of representing symmetric/asymmetric and light/heavy tailed distributions; it contains the well-known symmetric SMN family as a special case and is a reasonable competitor to the asymmetric SMSN family. Estimation of the parameters of the FM-TP-SMN models is relatively straightforward via the ECME algorithm, which converges quickly (within a few iterations). A Bayesian approach, and the use of the flexible TP-SMN family in the autoregressive and ARMA processes of [35,36], are possible topics for further research.

Author Contributions

M.M., J.E.C.-R. and M.R.M. wrote the paper and contributed the reagents, materials and analysis tools; M.M. conceived, designed, and performed the experiments, and analyzed the data. All authors have read and approved the final manuscript.

Acknowledgments

The authors thank the editor and three anonymous referees for their helpful comments and suggestions.

Conflicts of Interest

The authors declare that there is no conflict of interest in the publication of this paper.

References

1. McLachlan, G.; Peel, D. Finite Mixture Models; John Wiley and Sons: New York, NY, USA, 2000.
2. Contreras-Reyes, J.E.; Cortés, D.D. Bounds on Rényi and Shannon Entropies for Finite Mixtures of Multivariate Skew-Normal Distributions: Application to Swordfish (Xiphias gladius Linnaeus). Entropy 2016, 18, 382.
3. Frühwirth-Schnatter, S.; Pyne, S. Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions. Biostatistics 2010, 11, 317–336.
4. Lin, T.I.; Lee, J.C.; Yen, S.Y. Finite mixture modelling using the skew normal distribution. Stat. Sin. 2007, 17, 909–927.
5. Basso, R.M.; Lachos, V.H.; Cabral, C.R.B.; Ghosh, P. Robust mixture modeling based on scale mixtures of skew-normal distributions. Comput. Stat. Data Anal. 2010, 54, 2926–2941.
6. Lin, T.I. Robust mixture modeling using multivariate skew t distributions. Stat. Comput. 2009, 20, 343–356.
7. Lin, T.I.; Lee, J.C.; Hsieh, W.J. Robust Mixture Modelling Using the Skew t Distribution. Stat. Comput. 2007, 17, 81–92.
8. Contreras-Reyes, J.E.; López Quintero, F.O.; Yáñez, A.A. Towards Age Determination of Southern King Crab (Lithodes santolla) Off Southern Chile Using Flexible Mixture Modeling. J. Mar. Sci. Eng. 2018, 6, 157.
9. Maleki, M.; Arellano-Valle, R.B. Maximum a-posteriori estimation of autoregressive processes based on finite mixtures of scale-mixtures of skew-normal distributions. J. Stat. Comput. Sim. 2017, 87, 1061–1083.
10. Maleki, M.; Arellano-Valle, R.B.; Dey, D.K.; Mahmoudi, M.R.; Jalali, S.M.J. A Bayesian Approach to Robust Skewed Autoregressive Processes. Calcutta Stat. Assoc. Bull. 2017, 69, 165–182.
11. Maleki, M.; Wraith, D.; Arellano-Valle, R.B. Robust finite mixture modeling of multivariate unrestricted skew-normal generalized hyperbolic distributions. Stat. Comput. 2018, in press.
12. Maleki, M.; Wraith, D.; Arellano-Valle, R.B. A flexible class of parametric distributions for Bayesian linear mixed models. Test 2018, in press.
13. Prates, M.O.; Lachos, V.H.; Cabral, C. mixsmsn: Fitting finite mixture of scale mixture of skew-normal distributions. J. Stat. Soft. 2013, 54, 1–20.
14. Hajrajabi, A.; Maleki, M. Nonlinear semiparametric autoregressive model with finite mixtures of scale mixtures of skew normal innovations. J. Appl. Stat. 2019, in press.
15. Maleki, M.; Wraith, D. Mixtures of multivariate restricted skew-normal factor analyzer models in a Bayesian framework. Comput. Stat. 2019, in press.
16. Andrews, D.F.; Mallows, C.L. Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B 1974, 36, 99–102.
17. Lange, K.L.; Little, R.; Taylor, J. Robust statistical modeling using the t distribution. J. Am. Stat. Assoc. 1989, 84, 881–896.
18. Lange, K.L.; Sinsheimer, J.S. Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Stat. 1993, 2, 175–198.
19. Maleki, M.; Nematollahi, A.R. Autoregressive Models with Mixture of Scale Mixtures of Gaussian innovations. Iranian J. Sci. Technol. Trans. A 2017, 41, 1099–1107.
20. Arellano-Valle, R.B.; Gómez, H.; Quintana, F.A. Statistical inference for a general class of asymmetric distributions. J. Stat. Plan. Inf. 2005, 128, 427–443.
21. Hoseinzadeh, A.; Maleki, M.; Khodadadi, Z.; Contreras-Reyes, J.E. The Skew-Reflected-Gompertz distribution for analyzing symmetric and asymmetric data. J. Comput. Appl. Math. 2019, 349, 132–141.
22. Moravveji, B.; Khodadadi, Z.; Maleki, M. A Bayesian Analysis of Two-Piece distributions based on the Scale Mixtures of Normal Family. Iranian J. Sci. Technol. Trans. A 2018, in press.
23. Maleki, M.; Mahmoudi, M.R. Two-Piece Location-Scale Distributions based on Scale Mixtures of Normal family. Commun. Stat. Theor. Meth. 2017, 46, 12356–12369.
24. Rubio, F.J.; Steel, M.F.G. Inference in Two-Piece Location-Scale Models with Jeffreys Priors. Bayesian Anal. 2014, 9, 1–22.
25. Branco, M.D.; Dey, D.K. A general class of multivariate skew-elliptical distributions. J. Multivar. Anal. 2001, 79, 99–113.
26. Mudholkar, G.S.; Hutson, A.D. The epsilon-skew-normal distribution for analyzing near-normal data. J. Stat. Plan. Inf. 2000, 83, 291–309.
27. Meng, X.; Rubin, D.B. Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 1993, 80, 267–278.
28. Dempster, A.P.; Laird, N.M.; Rubin, D.B. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 1977, 39, 1–38.
29. Liu, C.; Rubin, D.B. The ECME algorithm: A simple extension of EM and ECM with faster monotone convergence. Biometrika 1994, 81, 633–648.
30. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018; ISBN 3-900051-07-0. Available online: http://www.R-project.org (accessed on 12 December 2018).
31. Akaike, H. A new look at the statistical model identification. IEEE Trans. Autom. Control 1974, 19, 716–723.
32. Schwarz, G. Estimating the dimension of a model. Ann. Stat. 1978, 6, 461–464.
33. Ehrlich, I. Participation in illegitimate activities: A theoretical and empirical investigation. J. Political Econ. 1973, 81, 521–565.
34. Vandaele, W. Participation in illegitimate activities: Ehrlich revisited. In Deterrence and Incapacitation; Blumstein, A., Cohen, J., Nagin, D., Eds.; US National Academy of Sciences: Washington, DC, USA, 1978; pp. 270–335.
35. Ghasami, S.; Khodadadi, Z.; Maleki, M. Autoregressive Processes with Generalized Hyperbolic Innovations. Commun. Stat. Comput. Sim. 2019, in press.
36. Zarrin, P.; Maleki, M.; Khodadadi, Z.; Arellano-Valle, R.B. Time series process based on the unrestricted skew normal process. J. Stat. Comput. Sim. 2019, 89, 38–51.

Figure 1. Simulated data of length n = 400 from two-component finite mixtures of two-piece distributions based on the scale mixtures of normal (FM-TP-SMN): weakly separated components (WS), medium separated components (MS), and strongly separated components (SS), shown with the curves of the probability density functions (PDFs) from which the datasets were drawn.

Figure 2. Histogram of body mass index (BMI) data with fitted FM-TP-T (left) and FM-ST (right) models with two components.

Figure 3. Histogram of UScrime data with fitted FM-TP-SMN and FM-SMSN models with two components.

Table 1. Mean rates of correct allocation for the fitted finite mixtures of two-piece distributions based on the scale mixtures of normal (FM-TP-SMN) models.

		Fitted Model
True Model	Sample Size	FM-Normal	FM-TP-N	FM-TP-T	FM-TP-CN	FM-TP-SL
FM-TP-T	100	0.3745	0.7026	0.7674	0.8026	0.7902
	350	0.2836	0.7693	0.8372	0.8462	0.8450
	800	0.2231	0.7990	0.8418	0.8490	0.8469
FM-TP-CN	100	0.6023	0.6512	0.7835	0.7941	0.7829
	350	0.6258	0.7654	0.8542	0.8622	0.8510
	800	0.6395	0.7810	0.8599	0.8665	0.8563
FM-TP-SL	100	0.5374	0.7735	0.7810	0.7747	0.7840
	350	0.5903	0.8235	0.8420	0.8250	0.8491
	800	0.6034	0.8264	0.8437	0.8281	0.8507
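
As an illustration of the quantity averaged in Table 1, the following is a minimal sketch of a correct allocation rate for a single simulated replicate. It assumes hard component labels (for example, from the largest posterior membership probability) and handles label switching by taking the best relabeling of the fitted components; that relabeling step, the function name `allocation_rate`, and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from itertools import permutations


def allocation_rate(true_labels, fitted_labels, n_components):
    """Share of observations assigned to their true component.

    The rate is maximised over relabelings of the fitted components,
    an assumed way of guarding against label switching.
    """
    n = len(true_labels)
    best = 0.0
    for perm in permutations(range(n_components)):
        # perm[f] is the true-component index that fitted label f is mapped to
        hits = sum(1 for t, f in zip(true_labels, fitted_labels) if perm[f] == t)
        best = max(best, hits / n)
    return best


# Hypothetical labels for eight observations from a two-component mixture
true_labels = [0, 0, 0, 1, 1, 1, 1, 0]
fitted_labels = [1, 1, 0, 0, 0, 0, 1, 1]
rate = allocation_rate(true_labels, fitted_labels, 2)
print(rate)
```

Averaging such rates over the simulated replicates would give one cell of the table, under the stated assumptions.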

Table 2. Number of times (out of 2000) the true FM model was chosen under each of the seven proposed hypotheses.

	True Model:	FM-SN		FM-ST
Condition Examined	Criteria:	AIC	BIC	AIC	BIC
FM-TP-N vs. FM-Normal		1934	1958	1986	1975
FM-TP-T vs. FM-Normal		1511	1645	2000	2000
FM-TP-SL vs. FM-Normal		1387	1410	1823	1865
FM-TP-CN vs. FM-Normal		1503	1532	1746	1732
FM-TP-T vs. FM-TP-N		83	92	1861	1883
FM-TP-SL vs. FM-TP-N		112	123	1821	1793
FM-TP-CN vs. FM-TP-N		143	196	1732	1746

Table 3. Monte Carlo average bias (MC-Bias) and mean squared error (MSE) for maximum likelihood (ML) estimates in the weakly separated components (WS) FM-TP-T model.

		Sample Size
Measure	Parameter	$n = 150$	$n = 600$	$n = 1000$	$n = 2000$	$n = 4000$
MC-Bias	$π_{1}$	$-1.64533 \times 10^{-2}$	$2.50755 \times 10^{-3}$	$-1.849 \times 10^{-5}$	$8.427 \times 10^{-6}$	$3.452 \times 10^{-7}$
	$μ_{1}$	$3.48475 \times 10^{-2}$	$-9.48375 \times 10^{-4}$	$-2.287 \times 10^{-3}$	$7.427 \times 10^{-4}$	$6.584 \times 10^{-4}$
	$μ_{2}$	$3.93636 \times 10^{-1}$	$-1.98365 \times 10^{-2}$	$-3.276 \times 10^{-3}$	$9.4978 \times 10^{-4}$	$-6.436 \times 10^{-4}$
	$σ_{1}$	$-4.98656 \times 10^{-1}$	$3.44827 \times 10^{-2}$	$6.487 \times 10^{-3}$	$-6.775 \times 10^{-3}$	$-7.864 \times 10^{-3}$
	$σ_{2}$	$4.57463 \times 10^{-1}$	$4.03741 \times 10^{-2}$	$-6.284 \times 10^{-3}$	$-6.103 \times 10^{-3}$	$9.903 \times 10^{-4}$
	$γ_{1}$	$1.57363 \times 10^{-2}$	$1.93845 \times 10^{-3}$	$1.574 \times 10^{-6}$	$1.427 \times 10^{-6}$	$5.047 \times 10^{-7}$
	$γ_{2}$	$1.40384 \times 10^{-2}$	$1.73644 \times 10^{-3}$	$2.037 \times 10^{-6}$	$6.948 \times 10^{-7}$	$4.765 \times 10^{-7}$
	$ν$	$1.14024 \times 10^{2}$	$1.36253 \times 10^{-1}$	$-6.017 \times 10^{-2}$	$1.284 \times 10^{-2}$	$-1.201 \times 10^{-2}$
MSE	$π_{1}$	$7.15248 \times 10^{-3}$	$7.35733 \times 10^{-4}$	$3.854 \times 10^{-4}$	$6.036 \times 10^{-5}$	$6.729 \times 10^{-5}$
	$μ_{1}$	$1.10375 \times 10^{-1}$	$8.99164 \times 10^{-2}$	$3.927 \times 10^{-3}$	$2.889 \times 10^{-4}$	$2.960 \times 10^{-4}$
	$μ_{2}$	$1.97364 \times 10^{0}$	$1.23473 \times 10^{-1}$	$4.920 \times 10^{-2}$	$3.276 \times 10^{-3}$	$3.328 \times 10^{-3}$
	$σ_{1}$	$1.69475 \times 10^{0}$	$1.11763 \times 10^{0}$	$2.118 \times 10^{-1}$	$3.2849 \times 10^{-3}$	$3.453 \times 10^{-3}$
	$σ_{2}$	$1.67568 \times 10^{1}$	$1.43855 \times 10^{1}$	$4.548 \times 10^{-1}$	$6.786 \times 10^{-2}$	$6.903 \times 10^{-2}$
	$γ_{1}$	$8.00264 \times 10^{-3}$	$3.49566 \times 10^{-4}$	$2.801 \times 10^{-4}$	$6.104 \times 10^{-5}$	$6.003 \times 10^{-5}$
	$γ_{2}$	$1.98374 \times 10^{-2}$	$8.20183 \times 10^{-3}$	$2.102 \times 10^{-4}$	$6.352 \times 10^{-5}$	$6.102 \times 10^{-5}$
	$ν$	$0.97803 \times 10^{2}$	$5.46093 \times 10^{-1}$	$1.112 \times 10^{-1}$	$1.684 \times 10^{-2}$	$1.521 \times 10^{-2}$

Table 4. MC-Bias and MSE for ML estimates in the medium separated components (MS) FM-TP-T model.

		Sample Size
Measure	Parameter	$n = 150$	$n = 600$	$n = 1000$	$n = 2000$	$n = 4000$
MC-Bias	$π_{1}$	$5.03746 \times 10^{-3}$	$-4.21374 \times 10^{-3}$	$-1.40712 \times 10^{-3}$	$-5.83744 \times 10^{-4}$	$6.10927 \times 10^{-5}$
	$μ_{1}$	$2.18723 \times 10^{-2}$	$-1.72451 \times 10^{-3}$	$-6.28474 \times 10^{-4}$	$2.99837 \times 10^{-5}$	$-2.48576 \times 10^{-5}$
	$μ_{2}$	$-1.30284 \times 10^{-2}$	$8.29374 \times 10^{-4}$	$6.99386 \times 10^{-4}$	$3.95645 \times 10^{-5}$	$-3.43927 \times 10^{-5}$
	$σ_{1}$	$1.25344 \times 10^{-2}$	$1.28374 \times 10^{-4}$	$-5.98375 \times 10^{-5}$	$-4.99380 \times 10^{-5}$	$-5.99837 \times 10^{-6}$
	$σ_{2}$	$1.27364 \times 10^{-2}$	$-2.04634 \times 10^{-3}$	$-1.47364 \times 10^{-3}$	$2.98476 \times 10^{-4}$	$6.97484 \times 10^{-5}$
	$γ_{1}$	$-1.10264 \times 10^{-2}$	$-1.02172 \times 10^{-3}$	$-4.93846 \times 10^{-5}$	$-6.10283 \times 10^{-7}$	$7.98375 \times 10^{-7}$
	$γ_{2}$	$-0.78725 \times 10^{-2}$	$-0.90273 \times 10^{-3}$	$5.97367 \times 10^{-5}$	$5.83753 \times 10^{-7}$	$9.01274 \times 10^{-7}$
	$ν$	$2.40926 \times 10^{0}$	$1.12027 \times 10^{-1}$	$3.98365 \times 10^{-2}$	$-2.48462 \times 10^{-3}$	$-2.83765 \times 10^{-4}$
MSE	$π_{1}$	$5.85644 \times 10^{-3}$	$6.45364 \times 10^{-4}$	$3.57464 \times 10^{-5}$	$6.20183 \times 10^{-5}$	$7.10993 \times 10^{-5}$
	$μ_{1}$	$1.08744 \times 10^{-1}$	$9.37436 \times 10^{-1}$	$1.03764 \times 10^{-2}$	$2.57463 \times 10^{-4}$	$2.03464 \times 10^{-4}$
	$μ_{2}$	$2.03937 \times 10^{-1}$	$2.00713 \times 10^{-1}$	$4.67483 \times 10^{-2}$	$3.47367 \times 10^{-3}$	$3.24536 \times 10^{-3}$
	$σ_{1}$	$3.46357 \times 10^{-1}$	$2.47464 \times 10^{-1}$	$2.23433 \times 10^{-1}$	$3.39283 \times 10^{-3}$	$6.58475 \times 10^{-4}$
	$σ_{2}$	$8.38474 \times 10^{-1}$	$6.37364 \times 10^{-1}$	$4.38475 \times 10^{-1}$	$6.87364 \times 10^{-2}$	$6.00836 \times 10^{-2}$
	$γ_{1}$	$1.84746 \times 10^{-2}$	$4.03847 \times 10^{-4}$	$3.11972 \times 10^{-4}$	$7.00374 \times 10^{-5}$	$6.21002 \times 10^{-5}$
	$γ_{2}$	$2.04464 \times 10^{-2}$	$7.64533 \times 10^{-3}$	$2.30283 \times 10^{-4}$	$5.89472 \times 10^{-5}$	$7.00353 \times 10^{-5}$
	$ν$	$6.47465 \times 10^{1}$	$5.87957 \times 10^{-1}$	$1.93845 \times 10^{-2}$	$1.74534 \times 10^{-2}$	$1.48375 \times 10^{-2}$

Table 5. MC-Bias and MSE for ML estimates in the strongly separated components (SS) FM-TP-T model.

		Sample Size
Measure	Parameter	$n = 150$	$n = 600$	$n = 1000$	$n = 2000$	$n = 4000$
MC-Bias	$π_{1}$	$5.13426 \times 10^{-3}$	$5.60483 \times 10^{-3}$	$-1.40712 \times 10^{-3}$	$-6.26826 \times 10^{-4}$	$-6.95662 \times 10^{-5}$
	$μ_{1}$	$-1.90273 \times 10^{-2}$	$-1.24751 \times 10^{-3}$	$1.20183 \times 10^{-3}$	$3.45342 \times 10^{-5}$	$2.74634 \times 10^{-5}$
	$μ_{2}$	$1.15379 \times 10^{-2}$	$-3.06344 \times 10^{-3}$	$1.47333 \times 10^{-3}$	$4.28144 \times 10^{-5}$	$3.35242 \times 10^{-5}$
	$σ_{1}$	$-1.03046 \times 10^{-2}$	$-8.34452 \times 10^{-5}$	$6.73645 \times 10^{-5}$	$-4.87354 \times 10^{-5}$	$-7.28374 \times 10^{-6}$
	$σ_{2}$	$-1.38724 \times 10^{-2}$	$-1.27844 \times 10^{-3}$	$-1.00904 \times 10^{-3}$	$3.49383 \times 10^{-4}$	$9.88365 \times 10^{-5}$
	$γ_{1}$	$-0.58746 \times 10^{-2}$	$0.34452 \times 10^{-3}$	$5.09847 \times 10^{-5}$	$-5.46353 \times 10^{-7}$	$8.73645 \times 10^{-7}$
	$γ_{2}$	$-0.60273 \times 10^{-2}$	$-0.24533 \times 10^{-3}$	$6.08422 \times 10^{-5}$	$5.27363 \times 10^{-7}$	$8.69384 \times 10^{-7}$
	$ν$	$2.37264 \times 10^{0}$	$0.99775 \times 10^{-1}$	$4.65635 \times 10^{-2}$	$-1.74632 \times 10^{-2}$	$-2.46354 \times 10^{-3}$
MSE	$π_{1}$	$2.94256 \times 10^{-3}$	$3.83748 \times 10^{-4}$	$2.67464 \times 10^{-4}$	$7.78374 \times 10^{-5}$	$9.73646 \times 10^{-5}$
	$μ_{1}$	$5.01324 \times 10^{-3}$	$4.11293 \times 10^{-4}$	$1.69837 \times 10^{-4}$	$8.32847 \times 10^{-6}$	$2.37462 \times 10^{-6}$
	$μ_{2}$	$5.73648 \times 10^{-3}$	$5.03744 \times 10^{-4}$	$8.98474 \times 10^{-5}$	$6.26353 \times 10^{-6}$	$3.03733 \times 10^{-6}$
	$σ_{1}$	$3.48373 \times 10^{-2}$	$1.20385 \times 10^{-3}$	$1.65464 \times 10^{-4}$	$1.92736 \times 10^{-6}$	$1.82037 \times 10^{-6}$
	$σ_{2}$	$7.92746 \times 10^{-2}$	$5.38474 \times 10^{-3}$	$1.16354 \times 10^{-3}$	$1.98274 \times 10^{-5}$	$1.48263 \times 10^{-5}$
	$γ_{1}$	$3.26354 \times 10^{-3}$	$4.24846 \times 10^{-4}$	$2.63846 \times 10^{-4}$	$6.03947 \times 10^{-5}$	$8.48375 \times 10^{-5}$
	$γ_{2}$	$2.73145 \times 10^{-3}$	$3.38475 \times 10^{-4}$	$9.83746 \times 10^{-5}$	$5.64544 \times 10^{-5}$	$8.66555 \times 10^{-5}$
	$ν$	$1.28736 \times 10^{2}$	$3.01763 \times 10^{-1}$	$9.02635 \times 10^{-2}$	$7.27647 \times 10^{-3}$	$8.83645 \times 10^{-3}$
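
For readers who want to reproduce summaries like those in Tables 3 to 5, here is a minimal sketch of the usual Monte Carlo bias and mean squared error of an estimator over replicates. It assumes the standard definitions (average of the errors and average of the squared errors against the true parameter value); the function name `mc_bias_and_mse` and the replicate estimates are hypothetical and not taken from the paper.

```python
def mc_bias_and_mse(estimates, true_value):
    """Monte Carlo bias and MSE of one parameter over replicates.

    bias = mean(estimate - true_value)
    mse  = mean((estimate - true_value) ** 2)
    """
    m = len(estimates)
    errors = [e - true_value for e in estimates]
    bias = sum(errors) / m
    mse = sum(err ** 2 for err in errors) / m
    return bias, mse


# Hypothetical ML estimates of a mixing weight whose true value is 0.5
estimates = [0.52, 0.47, 0.50, 0.55, 0.46]
bias, mse = mc_bias_and_mse(estimates, 0.5)
print(bias, mse)
```

In a study of this kind the same computation would be repeated for each parameter and each sample size, giving one bias cell and one MSE cell per combination.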

Table 6. ML estimation results for fitting FM-TP-SMN models to the body mass index (BMI) data.

	Fitted Model
Parameter	FM-Normal	FM-TP-N	FM-TP-T	FM-TP-SL	FM-TP-CN
$π_{1}$	0.391	0.5400	0.4600	0.5374	0.5282
$μ_{1}$	21.412	20.7520	20.744	20.7982	21.0629
$μ_{2}$	32.548	30.1324	30.155	30.0169	29.9851
$σ_{1}$	2.0176	5.0343	5.0303	4.2785	1.5215
$σ_{2}$	6.4180	7.4040	7.4038	6.3213	2.4601
$γ_{1}$	–	0.6798	0.6806	0.6709	0.6302
$γ_{2}$	–	0.8695	0.8673	0.8771	0.8785
$ν$	–	–	7.9939	2.1950	(0.971, 0.081)

Table 7. Model selection criteria for fitting FM-TP-SMN and FM-scale mixtures of the skew normal (SMSN) models to the BMI data. The best values are marked in bold.

	Fitted Model
Criterion	FM-Normal	FM-TP-N	FM-SN	FM-TP-T	FM-ST	FM-TP-SL	FM-SSL	FM-TP-CN	FM-SCN
Log-like	−6911.76	−6870.30	−6979.47	−6856.65	−6869.03	−6857.14	−6867.99	−6871.65	−6865.25
AIC	13833.35	13754.72	13972.95	13729.41	13754.06	13730.41	13751.98	13812.31	13748.51
BIC	13961.61	13763.79	13821.22	13737.61	13782.33	13735.85	13780.24	13775.89	13776.77

Table 8. ML estimation results and model selection criteria for fitting FM-TP-SMN models to the UScrime data. The best values are marked in bold.

	Fitted Model
Parameter/Criterion	FM-Normal	FM-TP-N	FM-SN	FM-TP-T	FM-ST	FM-TP-SL	FM-SSL	FM-TP-CN	FM-SCN
$π_{1}$	0.615	0.463	0.607	0.462	0.607	0.463	0.607	0.509	0.582
$μ_{1}$	166.674	173.927	175.448	173.924	174.558	173.931	175.350	170.931	175.390
$μ_{2}$	237.632	246.160	227.951	246.060	228.018	245.998	227.888	204.645	224.584
$σ_{1}$	18.283	20.399	20.170	20.263	19.597	20.294	19.835	18.626	16.362
$σ_{2}$	20.455	59.464	22.778	59.268	22.634	58.813	22.478	42.728	20.201
$γ_{1} (λ_{1})$	–	0.071	−0.687	0.071	−0.613	0.071	−0.679	0.176	−0.872
$γ_{2} (λ_{2})$	–	0.257	0.555	0.258	0.548	0.259	0.560	0.782	0.579
$ν$	–	–	–	39.934	100	39.726	37.590	0.745	0.672
$τ$	–	–	–	–	–	–	–	0.999	0.990
Log-like	−232.226	−228.215	−232.274	−228.865	−231.275	−228.211	−231.265	−229.864	−230.281
AIC	474.452	470.437	478.548	473.734	478.549	472.426	478.531	477.748	478.562
BIC	483.703	483.386	491.499	488.535	491.500	487.228	491.482	494.399	491.513
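
The AIC and BIC rows follow from the log-likelihood row through the standard definitions of [31,32]. The sketch below applies them to the FM-Normal column of Table 8. The parameter count k = 5 is read off the table (one mixing weight, two locations, two scales), while the sample size n = 47 is an assumption inferred from the reported BIC rather than a value stated in this section.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: k * ln(n) - 2 * log-likelihood."""
    return k * math.log(n) - 2 * loglik


# FM-Normal column of Table 8: log-likelihood and number of free parameters
loglik = -232.226
k = 5    # pi_1, mu_1, mu_2, sigma_1, sigma_2
n = 47   # assumed sample size, inferred from the reported BIC

aic_value = round(aic(loglik, k), 3)
bic_value = round(bic(loglik, k, n), 3)
print(aic_value, bic_value)
```

Under these assumptions the two values agree with the tabulated 474.452 and 483.703. Lower values indicate the preferred model, with BIC penalising extra parameters more heavily than AIC whenever ln(n) > 2.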

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
