**1. Introduction**

Diversifying risk and managing cross-border equity portfolios requires a sound understanding of the global stock market structure. Appropriate portfolio construction is complicated, however, because the linear dependence structure of the network is not stable (Erdős et al. 2011; Song et al. 2011; Maldonado and Anthony 1981). Moreover, exogenous shocks have a major impact on the correlation structure, so previously uncorrelated assets can start moving together (Heiberger 2014). Correlation-based techniques can therefore produce unwanted variance peaks.

Institutional economic surveys (such as MSCI 2018) provide qualitatively identified network structures, e.g., emerging markets and developed markets, to stabilize their classification.

The main goal of this study is to provide more suitable quantitative techniques, generalize the widely used correlation-based portfolio construction framework, uncover the equity index network, and make diversification more reliable.

The baseline concept follows the Capital Asset Pricing Model (CAPM) of Sharpe (1964), in which similarity measures are calculated from correlations between logarithmic returns (Yalamova 2009). The anomalies of the CAPM indicate that the two-dimensional mean-beta framework gives only a simplified picture of the real market structure. To explain the residuals, additional financial variables were added to the well-known regression (Fama and French 1996).
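As a minimal sketch of the correlation-based similarity mentioned above, the following Python snippet computes logarithmic returns from price series and takes their Pearson correlation matrix as the similarity matrix. The function names `log_returns` and `correlation_similarity` and the simulated price paths are illustrative assumptions, not part of the original study.

```python
import numpy as np

def log_returns(prices):
    """Logarithmic returns r_t = ln(P_t / P_{t-1}), computed per column (index)."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def correlation_similarity(prices):
    """Pearson correlation matrix of the log returns (one row/column per index)."""
    return np.corrcoef(log_returns(prices), rowvar=False)

# Hypothetical data: three simulated indices (rows: dates, columns: indices).
# The first two share a common factor, the third is independent.
rng = np.random.default_rng(0)
common = rng.normal(0.0, 0.01, size=250)
idio = rng.normal(0.0, 0.01, size=(250, 3))
returns = np.column_stack([common + idio[:, 0], common + idio[:, 1], idio[:, 2]])
prices = 100.0 * np.exp(np.cumsum(returns, axis=0))

S = correlation_similarity(prices)
print(np.round(S, 2))
```

In this toy setting, the two indices driven by the common factor should appear more similar to each other than either is to the independent third index.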

In this paper, we apply a graph-theory-based approach to unveil embedded network-level information (Shi and Malik 2000). We propose non-linear similarity kernels that can handle higher-order terms. We introduce a novel jump-based similarity to investigate the effect of shocks. In addition, we test whether the relative entropy of the distribution functions, which captures non-Gaussian behavior, conveys network-level information. We also investigate the widely used Gaussian smoothing and correlation (Von Luxburg 2007). We compare different spectral clustering techniques and introduce the use of the normalized Newman–Girvan cut (Bolla 2011).
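To illustrate the idea of a spectral split based on a normalized Newman–Girvan (modularity) criterion, the sketch below uses one common formulation of the normalized modularity matrix, $D^{-1/2}(W - dd^{T})D^{-1/2}$, where $W$ is scaled to sum to one and $d$ holds its row sums. This formulation is an assumption here, since the section itself does not define the cut; the function names and the toy similarity matrix are hypothetical, and $W$ is assumed to be symmetric and non-negative with strictly positive row sums.

```python
import numpy as np

def normalized_modularity(W):
    """Return D^{-1/2} (W - d d^T) D^{-1/2} for a symmetric, non-negative W.

    W is rescaled so that its entries sum to 1; d is the vector of row sums.
    Assumes every row sum is strictly positive.
    """
    W = np.asarray(W, dtype=float)
    W = W / W.sum()
    d = W.sum(axis=1)
    M = W - np.outer(d, d)            # modularity matrix
    inv_sqrt_d = 1.0 / np.sqrt(d)
    return inv_sqrt_d[:, None] * M * inv_sqrt_d[None, :]

def two_way_split(W):
    """Assign each node to one of two clusters by the sign of the leading eigenvector."""
    M_D = normalized_modularity(W)
    eigvals, eigvecs = np.linalg.eigh(M_D)   # eigenvalues in ascending order
    leading = eigvecs[:, -1]
    return (leading > 0).astype(int)

# Hypothetical similarity matrix: two groups of three nodes,
# strong similarity within groups (0.9), weak similarity across groups (0.1).
block = np.full((3, 3), 0.9)
cross = np.full((3, 3), 0.1)
W = np.block([[block, cross], [cross, block]])
np.fill_diagonal(W, 0.0)

labels = two_way_split(W)
print(labels)
```

Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector, so only the grouping itself is meaningful. Splitting into more than two clusters would typically use several leading eigenvectors followed by a clustering step such as k-means, which is omitted here.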

Analyzing historical data supports the *a priori* assumption that clusters are homogeneously connected. Thus, normalized Laplacian-based techniques (Takumasa et al. 2015) are not applicable. However, the proposed Newman–Girvan cut yields suitable, stationary clustering results. We calculate correlation-, jump-, relative entropy- and Gaussian-based similarities. The figures show that the Newman–Girvan cut outperforms the normalized Laplacian technique. Analyzing the spectral properties of the jump-based similarity matrix reveals that exogenous shocks have a minor effect on the network. Thus, our novel results imply that shocks do not convey sufficient information about the equity index graph. Regression analysis demonstrates the stationarity and explanatory power of the clusters. Moreover, we shed some light on the node-level equity index structure and show that the index network has scale-free properties. Nevertheless, we show that geographical and qualitative categorizations are in line with the clusters.

The article is structured as follows. In Section 2, we introduce our spectral clustering-based concept. In Section 3, we analyze the equity index graph and compare different similarity matrices and clustering techniques. Section 4 summarizes the article.

### **2. Materials and Methods**
