2.6.2. MCLUST-ME Boundary

Consider the data $\mathcal{S} = \{\mathbf{y}_i\}_{i=1}^{N}$, where each $\mathbf{y}_i$ is associated with a known error covariance $\boldsymbol{\Lambda}_i$. Suppose our goal is to partition $\mathcal{S}$ into two clusters. Let $(\tilde{\tau}_k, \tilde{\boldsymbol{\mu}}_k, \tilde{\boldsymbol{\Sigma}}_k)$, $k = 1, 2$, be the MLEs from the MCLUST-ME model. If we assign each observation to the more probable cluster, the two clusters can be expressed as follows:

$$E_1^* = \{ \mathbf{y}_i \in \mathcal{S} : \tilde{\tau}_1 g_1(\mathbf{y}_i; \tilde{\boldsymbol{\mu}}_1, \tilde{\boldsymbol{\Sigma}}_1, \boldsymbol{\Lambda}_i) - \tilde{\tau}_2 g_2(\mathbf{y}_i; \tilde{\boldsymbol{\mu}}_2, \tilde{\boldsymbol{\Sigma}}_2, \boldsymbol{\Lambda}_i) > 0 \}; \quad E_2^* = \mathcal{S} \setminus E_1^*,$$

where $g_k$ is defined in (8). The decision rule for classifying each point $\mathbf{y}_i$, and therefore the boundary, now depends not only on the values of the MLEs but also on the error covariance matrix $\boldsymbol{\Lambda}_i$ of $\mathbf{y}_i$. Instead of producing a common boundary for all points in $\mathcal{S}$, the MCLUST-ME model specifies an individualized classification boundary for each $\mathbf{y}_i$ as follows:

$$B^*(\boldsymbol{\Lambda}_i) = \{ \mathbf{t} \in \mathbb{R}^d : \tilde{\tau}_1 g_1(\mathbf{t}; \tilde{\boldsymbol{\mu}}_1, \tilde{\boldsymbol{\Sigma}}_1, \boldsymbol{\Lambda}_i) - \tilde{\tau}_2 g_2(\mathbf{t}; \tilde{\boldsymbol{\mu}}_2, \tilde{\boldsymbol{\Sigma}}_2, \boldsymbol{\Lambda}_i) = 0 \}.$$
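The rule above can be illustrated with a minimal Python sketch for $d = 2$. It assumes that $g_k(\cdot\,; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}_i)$ is the bivariate normal density with covariance $\boldsymbol{\Sigma} + \boldsymbol{\Lambda}_i$, as a stand-in for the definition in (8), which is not reproduced here. The function names and the parameter values are hypothetical and chosen only for illustration; they are not estimates from any data set.

```python
import math

def normal_density_2d(t, mean, cov):
    """Bivariate normal density at t; cov is a 2x2 nested list."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = t[0] - mean[0], t[1] - mean[1]
    # (t - mean)^T cov^{-1} (t - mean), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def g(t, mean, sigma, lam):
    """ASSUMPTION: g_k is the N(mean, sigma + lam) density (stand-in for (8))."""
    total = [[sigma[r][c] + lam[r][c] for c in range(2)] for r in range(2)]
    return normal_density_2d(t, mean, total)

def discriminant(t, lam, params):
    """tau_1 g_1 - tau_2 g_2 at t: positive means E_1*, zero means t lies on B*(lam)."""
    (tau1, mu1, sig1), (tau2, mu2, sig2) = params
    return tau1 * g(t, mu1, sig1, lam) - tau2 * g(t, mu2, sig2, lam)

def assign(y, lam, params):
    """Cluster label (1 or 2) for observation y with error covariance lam."""
    return 1 if discriminant(y, lam, params) > 0 else 2

# Hypothetical fitted values: a tight cluster 1 and a diffuse cluster 2.
params = [
    (0.5, (0.0, 0.0), [[0.25, 0.0], [0.0, 0.25]]),
    (0.5, (3.0, 0.0), [[4.0, 0.0], [0.0, 4.0]]),
]
y = (1.5, 0.0)
no_error = [[0.0, 0.0], [0.0, 0.0]]
large_error = [[4.0, 0.0], [0.0, 4.0]]

print(assign(y, no_error, params))     # 2
print(assign(y, large_error, params))  # 1
```

In this toy setup the same location is assigned to cluster 2 when it is measured without error and to cluster 1 when its error covariance is large, because the boundary $B^*(\boldsymbol{\Lambda}_i)$ moves with $\boldsymbol{\Lambda}_i$.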

Similar to our argument in Section 2.6.1, when $d = 2$, $B^*(\boldsymbol{\Lambda}_i)$ is either a straight line or a conic section.
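A brief sketch of the reasoning, under the assumption that $g_k(\cdot\,; \boldsymbol{\mu}, \boldsymbol{\Sigma}, \boldsymbol{\Lambda}_i)$ is the normal density with covariance $\boldsymbol{\Sigma} + \boldsymbol{\Lambda}_i$ (a stand-in for the definition in (8), which is not reproduced here): taking logarithms of the condition defining $B^*(\boldsymbol{\Lambda}_i)$, the common factor $(2\pi)^{-d/2}$ cancels and the boundary is the set of $\mathbf{t}$ satisfying

$$\log \tilde{\tau}_1 - \tfrac{1}{2} \log \lvert \tilde{\boldsymbol{\Sigma}}_1 + \boldsymbol{\Lambda}_i \rvert - \tfrac{1}{2} (\mathbf{t} - \tilde{\boldsymbol{\mu}}_1)^\top (\tilde{\boldsymbol{\Sigma}}_1 + \boldsymbol{\Lambda}_i)^{-1} (\mathbf{t} - \tilde{\boldsymbol{\mu}}_1) = \log \tilde{\tau}_2 - \tfrac{1}{2} \log \lvert \tilde{\boldsymbol{\Sigma}}_2 + \boldsymbol{\Lambda}_i \rvert - \tfrac{1}{2} (\mathbf{t} - \tilde{\boldsymbol{\mu}}_2)^\top (\tilde{\boldsymbol{\Sigma}}_2 + \boldsymbol{\Lambda}_i)^{-1} (\mathbf{t} - \tilde{\boldsymbol{\mu}}_2).$$

This is a quadratic equation in $\mathbf{t}$ whose second-order term has matrix $(\tilde{\boldsymbol{\Sigma}}_2 + \boldsymbol{\Lambda}_i)^{-1} - (\tilde{\boldsymbol{\Sigma}}_1 + \boldsymbol{\Lambda}_i)^{-1}$. Under this assumption, the quadratic term vanishes exactly when $\tilde{\boldsymbol{\Sigma}}_1 = \tilde{\boldsymbol{\Sigma}}_2$, in which case the equation is linear in $\mathbf{t}$.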

When $\boldsymbol{\Lambda}_i = \boldsymbol{\Lambda}_j$ for some $i \neq j$, that is, when two points are associated with the same error covariance, it can be seen that $B^*(\boldsymbol{\Lambda}_i) = B^*(\boldsymbol{\Lambda}_j)$, meaning that the two points share a common classification boundary. In the special case where $\boldsymbol{\Lambda}_i = \boldsymbol{\Lambda}_j$ for all $i \neq j$, all boundaries $B^*(\boldsymbol{\Lambda}_i)$ coincide.

One consequence of the existence of multiple decision boundaries is that the classification uncertainty of each point depends on its own value of $\boldsymbol{\Lambda}_i$. In MCLUST, points with high uncertainty (≈ 0.5) lie along the single classification boundary, whereas in MCLUST-ME, each highly uncertain point is close to its own boundary. Consequently, as we will see in Section 3.1, our method allows intermixing of points belonging to different clusters, while MCLUST creates clear-cut separation between clusters.
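The dependence of uncertainty on $\boldsymbol{\Lambda}_i$ can be sketched numerically for $d = 2$. The sketch below makes two assumptions that are not stated in this section: that uncertainty is measured as one minus the largest posterior membership probability, and that $g_k$ is the bivariate normal density with covariance $\boldsymbol{\Sigma} + \boldsymbol{\Lambda}_i$. The function names and parameter values are hypothetical.

```python
import math

def normal_density_2d(t, mean, cov):
    """Bivariate normal density at t; cov is a 2x2 nested list."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = t[0] - mean[0], t[1] - mean[1]
    # (t - mean)^T cov^{-1} (t - mean), using the closed-form 2x2 inverse.
    quad = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

def uncertainty(y, lam, params):
    """ASSUMPTION: uncertainty = 1 - max_k posterior probability of cluster k.

    Each component density is taken to be N(mean_k, sigma_k + lam), so the
    result depends on the point's own error covariance lam.
    """
    weighted = []
    for tau, mean, sigma in params:
        total = [[sigma[r][c] + lam[r][c] for c in range(2)] for r in range(2)]
        weighted.append(tau * normal_density_2d(y, mean, total))
    return 1.0 - max(weighted) / sum(weighted)

# Hypothetical fitted values: a tight cluster 1 and a diffuse cluster 2.
params = [
    (0.5, (0.0, 0.0), [[0.25, 0.0], [0.0, 0.25]]),
    (0.5, (3.0, 0.0), [[4.0, 0.0], [0.0, 4.0]]),
]
y = (1.5, 0.0)
u_no_error = uncertainty(y, [[0.0, 0.0], [0.0, 0.0]], params)
u_large_error = uncertainty(y, [[4.0, 0.0], [0.0, 4.0]], params)

# Same location, different error covariance, different uncertainty.
print(round(u_no_error, 3), round(u_large_error, 3))
```

With two clusters the value is bounded above by 0.5, and it approaches 0.5 only for a point lying close to the boundary determined by its own error covariance.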
