*Proceeding Paper* **Robust Methods for Soft Clustering of Multidimensional Time Series †**

**Ángel López-Oriona 1,\*, Pierpaolo D'Urso 2, José A. Vilar 1,3 and Borja Lafuente-Rego <sup>1</sup>**


**Abstract:** Three robust algorithms for clustering multidimensional time series from the perspective of underlying processes are proposed. The methods are robust extensions of a fuzzy *C*-means model based on estimates of the quantile cross-spectral density. Robustness to the presence of anomalous elements is achieved by using the so-called metric, noise and trimmed approaches. Analyses from a wide simulation study indicate that the algorithms are substantially effective in coping with the presence of outlying series, clearly outperforming alternative procedures. The usefulness of the suggested methods is also highlighted by means of a specific application.

**Keywords:** multidimensional time series; fuzzy *C*-means; unsupervised learning

**Citation:** López-Oriona, Á.; D'Urso, P.; Vilar, J.A.; Lafuente-Rego, B. Robust Methods for Soft Clustering of Multidimensional Time Series. *Eng. Proc.* **2021**, *7*, 60. https://doi.org/10.3390/engproc2021007060


Academic Editors: Joaquim de Moura, Marco A. González, Javier Pereira and Manuel G. Penedo

Published: 12 November 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

#### **1. Introduction**

Clustering of time series is a pivotal problem in statistics with several applications [1,2]. Generally, the goal is to divide a collection of unlabelled time series into homogeneous groups so that the intra-cluster similarity is maximized whereas the inter-cluster similarity is minimized. Most of the current techniques deal with univariate time series (UTS), while clustering of multidimensional time series (MTS) has received limited attention. This paper proposes three robust clustering methods for MTS. All of them are aimed at neutralizing the effect of outlying series while detecting the underlying grouping structure.

#### **2. Robust Clustering Methods for Multivariate Time Series**

Let $\{X_t, t \in \mathbb{Z}\} = \{(X_{t,1}, \ldots, X_{t,d}), t \in \mathbb{Z}\}$ be a $d$-variate real-valued strictly stationary stochastic process. Let $F_j$ be the marginal distribution function of $X_{t,j}$, $j = 1, \ldots, d$, and let $q_j(\tau) = F_j^{-1}(\tau)$, $\tau \in [0, 1]$, be the corresponding quantile function. For a fixed $l \in \mathbb{Z}$ and an arbitrary pair of quantile levels $(\tau, \tau') \in [0, 1]^2$, consider the cross-covariance of the indicator functions $I\{X_{t,j_1} \le q_{j_1}(\tau)\}$ and $I\{X_{t+l,j_2} \le q_{j_2}(\tau')\}$,

$$\gamma_{j_1,j_2}(l, \tau, \tau') = \text{Cov}\left(I\{X_{t, j_1} \le q_{j_1}(\tau)\}, I\{X_{t+l, j_2} \le q_{j_2}(\tau')\}\right),\tag{1}$$

for $1 \le j_1, j_2 \le d$. Taking $j_1 = j_2 = j$, the function $\gamma_{j,j}(l, \tau, \tau')$, with $(\tau, \tau') \in [0, 1]^2$, the so-called quantile autocovariance function (QAF) of lag $l$, generalizes the traditional autocovariance function.
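As an illustration of Eq. (1), the following minimal sketch computes a plug-in sample version of the quantile cross-covariance. It is not the estimator used by the authors: the function name `quantile_cross_covariance`, the use of empirical quantiles in place of $q_{j_1}(\tau)$ and $q_{j_2}(\tau')$, and the restriction to non-negative lags are all assumptions made for this example.

```python
import numpy as np

def quantile_cross_covariance(x, j1, j2, lag, tau1, tau2):
    """Plug-in sample analogue of Eq. (1) for a (T, d) array x and lag >= 0.

    Hypothetical helper: empirical quantiles replace the true quantile functions.
    """
    x = np.asarray(x, dtype=float)
    T = x.shape[0]
    # Empirical quantiles stand in for q_{j1}(tau) and q_{j2}(tau').
    q1 = np.quantile(x[:, j1], tau1)
    q2 = np.quantile(x[:, j2], tau2)
    # Indicator series I{X_{t,j1} <= q1} and I{X_{t+lag,j2} <= q2}, aligned in t.
    a = (x[: T - lag, j1] <= q1).astype(float)
    b = (x[lag:, j2] <= q2).astype(float)
    # Sample covariance of the two aligned indicator series.
    return float(np.mean(a * b) - np.mean(a) * np.mean(b))
```

With `j1 == j2` the function returns a sample quantile autocovariance. Because both arguments of the covariance are indicator variables, the result always lies in $[-0.25, 0.25]$.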

For the multivariate process $\{X_t, t \in \mathbb{Z}\}$, we can consider the $d \times d$ matrix $\boldsymbol{\Gamma}(l, \tau, \tau') = \left(\gamma_{j_1,j_2}(l, \tau, \tau')\right)_{1 \le j_1, j_2 \le d}$, which simultaneously gives information about both the cross-dependence (when $j_1 \ne j_2$) and the serial dependence (since there is a lag $l$).

Under appropriate summability conditions (mixing conditions), we can define the Fourier transform of the cross-covariances. In this regard, the *quantile cross-spectral density* is given by

$$f_{j_1,j_2}(\omega,\tau,\tau') = (1/2\pi) \sum_{l=-\infty}^{\infty} \gamma_{j_1,j_2}(l,\tau,\tau') e^{-il\omega},\tag{2}$$

for $1 \le j_1, j_2 \le d$, $\omega \in \mathbb{R}$ and $\tau, \tau' \in [0, 1]$. Note that $f_{j_1,j_2}(\omega, \tau, \tau')$ is complex-valued.

The quantile cross-spectral density contains information about the general dependence patterns of a given stochastic process. For a specific realization of the process, this quantity can be consistently estimated by means of the so-called smoothed CCR-periodogram, $\hat{G}^{j_1,j_2}_{T,R}(\omega, \tau, \tau')$, proposed by [3].

Based on the previous remarks, a simple dissimilarity measure between two realizations of the $d$-variate process (MTS) can be defined as follows. Given the $i$-th MTS, $X_t^{(i)}$, consider the set $G^{(i)} = \{\hat{G}^{j_1,j_2}_{T,R}(\omega, \tau, \tau'),\ j_1, j_2 = 1, \ldots, d,\ \omega \in \Omega,\ \tau, \tau' \in \mathcal{T}\}$, where $\Omega$ is the set of Fourier frequencies and $\mathcal{T} = \{0.1, 0.5, 0.9\}$. Let $\boldsymbol{\Psi}^{(i)}$ be the vector formed by concatenating the elements of the set $G^{(i)}$. The dissimilarity measure between the series $X_t^{(1)}$ and $X_t^{(2)}$ is defined as the Euclidean distance between the complex vectors $\boldsymbol{\Psi}^{(1)}$ and $\boldsymbol{\Psi}^{(2)}$. We call this dissimilarity $d_{QCD}$.
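The distance computation itself can be sketched as follows, assuming the complex feature vectors have already been obtained from the smoothed CCR-periodogram estimates (that step is not reproduced here). The function name `d_qcd` is hypothetical.

```python
import numpy as np

def d_qcd(psi_1, psi_2):
    """Euclidean distance between two complex feature vectors of equal length.

    Hypothetical helper: psi_1 and psi_2 are assumed to hold the concatenated
    periodogram estimates of two MTS, in the same order.
    """
    psi_1 = np.asarray(psi_1, dtype=complex)
    psi_2 = np.asarray(psi_2, dtype=complex)
    # The squared modulus of each complex difference adds the squared real
    # and imaginary parts, so this is the usual Euclidean norm on C^p.
    return float(np.sqrt(np.sum(np.abs(psi_1 - psi_2) ** 2)))
```

By the definition of $G^{(i)}$, each feature vector has $9\,d^2\,|\Omega|$ complex entries, since $\mathcal{T} \times \mathcal{T}$ contains nine pairs of quantile levels.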

The dissimilarity $d_{QCD}$ is used to develop three robust fuzzy clustering methods. All of them assume that we want to group $n$ MTS into $C$ clusters, and are based on the traditional fuzzy $C$-means clustering algorithm. They look for the set of centroids $\overline{\boldsymbol{\Psi}} = \{\overline{\boldsymbol{\Psi}}^{(1)}, \ldots, \overline{\boldsymbol{\Psi}}^{(C)}\}$, and the $n \times C$ matrix of fuzzy coefficients, $\boldsymbol{U} = (u_{ic})$, $i = 1, \ldots, n$, $c = 1, \ldots, C$, which define the solution of a given minimization problem. The quantity $u_{ic}$ represents the membership degree of the $i$-th MTS in the $c$-th cluster. The minimization problem for the first method is the following:

$$\min_{\overline{\boldsymbol{\Psi}}, \boldsymbol{U}} \sum_{i=1}^{n} \sum_{c=1}^{C} u_{ic}^{m} \left[ 1 - \exp \left\{ -\beta \left\| \boldsymbol{\Psi}^{(i)} - \overline{\boldsymbol{\Psi}}^{(c)} \right\|_{2}^{2} \right\} \right] \quad \text{w.r.t. } \sum_{c=1}^{C} u_{ic} = 1 \text{ and } u_{ic} \ge 0,\tag{3}$$

where *β* is a hyperparameter that needs to be set in advance and *m* is a parameter which determines the fuzziness of the partition, frequently called the fuzziness parameter.

The exponential distance is used in the previous model because it is capable of neutralizing the effect of outlying series by spreading out their membership degrees between the different clusters [4].
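To make this mechanism concrete, here is a minimal sketch of one alternating update for the exponential-distance model. The update rules are the stationarity conditions of the objective under the sum-to-one constraint (memberships proportional to the exponential distance raised to $-1/(m-1)$, centroids as averages weighted by $u_{ic}^m \exp\{-\beta d_{ic}^2\}$). They are not taken from the paper, and the function name `exp_fcm_step` and the numerical floor `1e-12` are assumptions of this example.

```python
import numpy as np

def exp_fcm_step(psi, centroids, beta, m):
    """One membership and centroid update for the exponential-distance model.

    Hypothetical helper: psi is (n, p), centroids is (C, p), and m > 1.
    """
    psi = np.asarray(psi, dtype=complex)
    centroids = np.asarray(centroids, dtype=complex)
    # Squared Euclidean distances between every series and every centroid.
    sq = np.sum(np.abs(psi[:, None, :] - centroids[None, :, :]) ** 2, axis=2)
    kernel = np.exp(-beta * sq)              # equals 1 at distance 0, tends to 0 far away
    cost = np.maximum(1.0 - kernel, 1e-12)   # exponential distance, floored to avoid 1/0
    # Membership update: u_ic proportional to cost_ic^(-1/(m-1)); rows sum to 1.
    w = cost ** (-1.0 / (m - 1.0))
    u = w / w.sum(axis=1, keepdims=True)
    # Centroid update: average of the series weighted by u_ic^m * exp(-beta * d_ic^2).
    g = (u ** m) * kernel
    new_centroids = (g.T @ psi) / g.sum(axis=0)[:, None]
    return u, new_centroids
```

For a series far from every centroid, the exponential distance is close to 1 for all clusters, so its memberships approach $1/C$ each. Its kernel weight is close to 0, so it barely moves the centroids.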

The second robust procedure follows the noise cluster approach, and takes into account the following minimization problem:

$$\min_{\overline{\boldsymbol{\Psi}}, \boldsymbol{U}} \sum_{i=1}^{n} \sum_{c=1}^{C-1} u_{ic}^{m} \left\| \boldsymbol{\Psi}^{(i)} - \overline{\boldsymbol{\Psi}}^{(c)} \right\|_{2}^{2} + \sum_{i=1}^{n} \delta^{2} \left( 1 - \sum_{c=1}^{C-1} u_{ic} \right)^{m} \quad \text{w.r.t. } \sum_{c=1}^{C} u_{ic} = 1 \text{ and } u_{ic} \ge 0,\tag{4}$$

where *δ* > 0 is a parameter known as the noise distance, which has to be specified in advance.

The previous model includes *C* groups, but only (*C* − 1) are "real" clusters. The noise cluster is artificially created for outlier identification purposes. The aim is to locate the outliers and place them in the noise cluster, which is represented by a fictitious prototype that has a constant distance from every MTS (the noise distance, *δ*).
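A minimal sketch of the resulting membership computation is given below. It treats the noise cluster as one more fuzzy *C*-means cluster whose squared distance to every series is $\delta^2$, which follows from the objective above. The function name `noise_memberships` and the numerical floor are assumptions of this example, and the centroid update is omitted.

```python
import numpy as np

def noise_memberships(sq_dist, delta, m):
    """Memberships for the C-1 real clusters plus the noise cluster (last column).

    Hypothetical helper: sq_dist is the (n, C-1) matrix of squared distances
    between the series and the real centroids, and m > 1.
    """
    sq_dist = np.maximum(np.asarray(sq_dist, dtype=float), 1e-12)
    n = sq_dist.shape[0]
    # The fictitious noise prototype lies at squared distance delta^2 from every series.
    all_sq = np.hstack([sq_dist, np.full((n, 1), delta ** 2)])
    # Standard fuzzy C-means membership rule applied to the C columns.
    w = all_sq ** (-1.0 / (m - 1.0))
    return w / w.sum(axis=1, keepdims=True)
```

A series whose distance to all real centroids is large relative to $\delta$ receives most of its membership in the last column, which is how outliers end up in the noise cluster.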

The third technique can be expressed by means of the minimization problem:

$$\min_{\boldsymbol{Y}, \boldsymbol{U}} \sum_{i=1}^{H(\alpha)} \sum_{c=1}^{C} u_{ic}^{m} \left\| \boldsymbol{\Psi}^{(i)} - \overline{\boldsymbol{\Psi}}^{(c)} \right\|_{2}^{2} \quad \text{w.r.t. } \sum_{c=1}^{C} u_{ic} = 1 \text{ and } u_{ic} \ge 0,\tag{5}$$

where $\boldsymbol{Y}$ ranges over all the subsets of $\boldsymbol{\Psi} = \{\boldsymbol{\Psi}^{(1)}, \ldots, \boldsymbol{\Psi}^{(n)}\}$ of size $H(\alpha) = n(1 - \alpha)$. The model attains its robustness by removing a certain proportion of the series and requires the specification of the fraction $\alpha$ of the data to be trimmed.
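One way to picture the trimming step, for fixed centroids, is sketched below. Each series is scored by the contribution it would make to the fuzzy *C*-means objective under its optimal memberships, namely $\left(\sum_{c} d_{ic}^{-2/(m-1)}\right)^{-(m-1)}$, and the series with the largest scores are trimmed. This selection rule, the function name `trimmed_subset`, and the rounding of $n\alpha$ to an integer are assumptions of this example rather than the authors' stated algorithm.

```python
import numpy as np

def trimmed_subset(sq_dist, alpha, m):
    """Split series indices into kept and trimmed sets for fixed centroids.

    Hypothetical helper: sq_dist is the (n, C) matrix of squared distances
    between the series and the current centroids, and m > 1.
    """
    sq_dist = np.maximum(np.asarray(sq_dist, dtype=float), 1e-12)
    n = sq_dist.shape[0]
    # Number of retained series H(alpha); n * alpha is rounded (an assumption).
    h = n - int(round(n * alpha))
    # Objective contribution of each series under its optimal memberships.
    s = np.sum(sq_dist ** (-1.0 / (m - 1.0)), axis=1)
    score = s ** (-(m - 1.0))
    # Keep the h series with the smallest contributions and trim the rest.
    order = np.argsort(score, kind="stable")
    return np.sort(order[:h]), np.sort(order[h:])
```

In a full procedure this step would alternate with membership and centroid updates computed on the retained series only.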

The three previously presented robust models have been analysed by means of a broad simulation study containing a wide variety of generating processes. Two alternative dissimilarities were taken into account for comparison purposes [5,6]. In all cases, the three proposed algorithms outperformed the competitors.

#### **3. Application to Real Data**

The three techniques proposed in Section 2 were applied to perform clustering in a real MTS database. Specifically, we considered daily stock returns and trading volume of the top 20 companies of the S&P 500 index, thus obtaining 20 bivariate MTS. Table 1 shows the membership degrees of the series concerning the trimmed approach.

**Table 1.** Membership degrees for the top 20 companies in the S&P 500 index by considering the trimmed approach and a 6-cluster partition.


The symbols in bold correspond to the companies which were trimmed away, Berkshire Hathaway (BRK.B), Walmart (WMT) and Home Depot (HD). Similar clustering solutions were obtained with the remaining two methods.

#### **4. Conclusions**

This work proposes three robust methods to perform fuzzy clustering of MTS. They are based on the so-called exponential, noise and trimmed ideas. Each approach attains robustness to outlying series in a different way. The three procedures have been presented and assessed through a wide simulation study, substantially outperforming alternative approaches. A real data application has also been carried out in order to show the usefulness of the presented techniques.

**Acknowledgments:** This research has been supported by MINECO (MTM2017-82724-R and PID2020-113578RB-100), the Xunta de Galicia (ED431C-2020-14), and "CITIC" (ED431G 2019/01).


MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Engineering Proceedings* Editorial Office E-mail: engproc@mdpi.com www.mdpi.com/journal/engproc
