2.2.1. Similarity Matrix

To cluster different items, a measure of similarity must first be chosen. In this study, the similarity of two stock indices (*i*, *j*) is denoted by *Wi*,*j*. The goal is to penalize differences and reward similarities. Logarithmic returns are easy to handle and preserve all information in the price process:

$$r\_i(t) = \ln \left( S\_i(t) / S\_i(t-1) \right),\tag{1}$$

where *Si*(*t*) represents the price of index *i* at time *t*. The current study analyses multiple similarity approaches. First, the Markowitz-based squared correlation is considered as a similarity metric:

$$\mathcal{W}\_{i,j} = \mathrm{Corr}^2(r\_i, r\_j),\tag{2}$$

We question this approach because logarithmic returns are not normally distributed, so non-linear effects may also be important. However, since correlation is a linear measure, squared-correlation similarities capture only linear dependences.
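As a concrete illustration (our own sketch in NumPy, not code from the study; the function names are ours), Eqs. (1) and (2) can be computed as follows:

```python
import numpy as np

def log_returns(prices):
    """Logarithmic returns r_i(t) = ln(S_i(t) / S_i(t-1)), Eq. (1)."""
    prices = np.asarray(prices, dtype=float)
    return np.log(prices[1:] / prices[:-1])

def corr2_similarity(r_i, r_j):
    """Squared-correlation similarity W_ij = Corr^2(r_i, r_j), Eq. (2)."""
    return np.corrcoef(r_i, r_j)[0, 1] ** 2
```

Note that the squared correlation maps both perfectly correlated and perfectly anti-correlated return series to a similarity of 1, which is consistent with treating strongly (anti-)dependent indices as similar.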

The problem of higher-order moments can be easily solved by using symmetric and positive-definite kernel functions. The idea comes from functional analysis. Data can be transformed into a reproducing kernel Hilbert space (RKHS), where applying the usual statistics provides the same outcomes as can be attained by using non-linear statistics in the original Hilbert space (Berlinet and Christine 2011); in practice, the Gaussian kernel is widely used (Gregory et al. 2008).

$$\mathcal{W}\_{i,j} = \exp\left( -\left\| r\_i - r\_j \right\|^2 / \left( 2\sigma^2 \right) \right),\tag{3}$$

where *σ* > 0 denotes the kernel bandwidth.

We notice that, if the sets of the relevant information and sensitivities are similar, then the relative entropy of the distribution of return processes is small. Otherwise, we can say stock indices are sensitive to different sets of information in a different manner (Ormos and Zibriczky 2014). This means that the similarity function has to be monotonically decreasing in symmetric Kullback–Leibler distance, and so we can construct a similarity measure such that:

$$\mathcal{W}\_{i,j} = 2/(2 + \left| \text{KL}(p(r\_i) \parallel p(r\_j)) + \text{KL}(p(r\_j) \parallel p(r\_i)) \right|), \tag{4}$$

where *p*(*ri*) denotes the probability distribution function of the logarithmic returns of index *i*, and $\mathrm{KL}(p(r\_i) \parallel p(r\_j)) \stackrel{\mathrm{def}}{=} \sum\_x p(r\_i = x) \ln\left( p(r\_i = x)/p(r\_j = x) \right)$ is the relative entropy of indices *i* and *j*.
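Eq. (4) can be sketched as follows. The discretization of the returns onto a common histogram grid and the epsilon smoothing of empty bins are our own implementation choices, not prescribed by the study:

```python
import numpy as np

def symmetric_kl_similarity(r_i, r_j, bins=20):
    """Similarity from the symmetrized Kullback-Leibler divergence, Eq. (4)."""
    r_i, r_j = np.asarray(r_i, dtype=float), np.asarray(r_j, dtype=float)
    lo = min(r_i.min(), r_j.min())
    hi = max(r_i.max(), r_j.max())
    # Estimate p(r_i) and p(r_j) on a shared grid so the bins are comparable.
    p, _ = np.histogram(r_i, bins=bins, range=(lo, hi))
    q, _ = np.histogram(r_j, bins=bins, range=(lo, hi))
    # Epsilon keeps empty bins from producing log(0) or division by zero.
    eps = 1e-12
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    kl_pq = np.sum(p * np.log(p / q))
    kl_qp = np.sum(q * np.log(q / p))
    return 2.0 / (2.0 + abs(kl_pq + kl_qp))
```

By construction the measure lies in (0, 1], equals 1 for identical empirical distributions, and decreases monotonically as the symmetric relative entropy grows.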

Another perspective argues that large deviations are riskier, so similarities should be defined over the tail distributions. We take the differences of the normalized return series and count the peaks of at least two standard deviations. The logic is that indices are similar if their price processes jump together. The similarity function has to be decreasing in the number of large deviations, hence we propose the following metric:

$$\mathcal{W}\_{i,j} = 1/\left(1 + \sum\_{t=1}^{T} \delta\left( \left| z\_i(t) - z\_j(t) \right| > 2 \right)\right),\tag{5}$$

where *zi*(*t*) represents the normalized return of index *i* and *δ*(·) is the indicator function.
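A sketch of Eq. (5), assuming z-score normalization of each return series (the function name is ours):

```python
import numpy as np

def tail_similarity(r_i, r_j):
    """Tail-based similarity, Eq. (5): count the periods in which the
    normalized returns differ by more than two standard deviations."""
    r_i, r_j = np.asarray(r_i, dtype=float), np.asarray(r_j, dtype=float)
    z_i = (r_i - r_i.mean()) / r_i.std()
    z_j = (r_j - r_j.mean()) / r_j.std()
    n_deviations = np.sum(np.abs(z_i - z_j) > 2)
    return 1.0 / (1.0 + n_deviations)
```

Indices whose normalized returns jump together produce few large differences, so the similarity stays close to 1; every additional co-deviation of more than two standard deviations shrinks it further.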

In the current study we compare each of these approaches.
