*2.5. Model Selection*

Within the MCLUST framework, the number of clusters can be selected using the Bayesian information criterion (BIC). Given a random sample of *n* independent *d*-vectors *y* = (*y*<sub>1</sub>, ..., *y<sub>n</sub>*) drawn from (4) and (9) with some value of *G*, the BIC for this *G*-component mixture model is given by:

$$\mathrm{BIC}_G = 2\, l_O(\hat{\Theta}; \mathbf{y}) - \nu_G \log(n), \tag{21}$$

where *Θ̂* is the MLE of the model parameters, *l<sub>O</sub>* is the observed log-likelihood as in (5) or (10), and *ν<sub>G</sub>* is the number of independent parameters to be estimated. In the simplest case, we allow the mean and covariance of each component to vary freely; this is the case we focus on in this paper. A *G*-component mixture model then has *G* − 1 free mixing proportions, *Gd* mean parameters, and *d*(*d* + 1)/2 free entries in each of the *G* symmetric covariance matrices, so that *ν<sub>G</sub>* = (*G* − 1) + *Gd* + *Gd*(*d* + 1)/2. For comparison, we evaluate MCLUST-ME and MCLUST results at the same number of components.
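
As a minimal sketch of how (21) could be used to choose *G*, the Python snippet below counts the free parameters of an unconstrained Gaussian mixture and picks the *G* with the largest BIC. The function names and the log-likelihood values are hypothetical and chosen only for illustration; they are not part of MCLUST or MCLUST-ME, and the maximized log-likelihoods are assumed to come from a separate model fit. The covariance term uses the fact that a symmetric *d* × *d* matrix has *d*(*d* + 1)/2 free entries.

```python
import math
from typing import Mapping


def num_free_parameters(G: int, d: int) -> int:
    """Free parameters of a G-component, d-dimensional Gaussian mixture
    with unconstrained means and covariances."""
    mixing = G - 1                       # proportions sum to one
    means = G * d                        # one d-vector per component
    covariances = G * d * (d + 1) // 2   # symmetric d x d matrix per component
    return mixing + means + covariances


def bic(loglik: float, G: int, d: int, n: int) -> float:
    """BIC as in (21): twice the maximized log-likelihood minus the penalty."""
    return 2.0 * loglik - num_free_parameters(G, d) * math.log(n)


def select_num_components(logliks: Mapping[int, float], d: int, n: int) -> int:
    """Return the G whose fitted model has the largest BIC."""
    return max(logliks, key=lambda G: bic(logliks[G], G, d, n))


# Hypothetical maximized log-likelihoods for G = 1, 2, 3 (illustrative only).
example_logliks = {1: -520.0, 2: -480.0, 3: -475.0}
best_G = select_num_components(example_logliks, d=2, n=200)
print(best_G)
```

With these made-up values, moving from two to three components raises the log-likelihood only slightly, so the larger penalty *ν<sub>G</sub>* log(*n*) outweighs the gain and the two-component model is preferred.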
