*2.1. Correlation-Based Analysis in the Financial Markets*

The topic of correlation analysis has a long history in connection with stock markets, spanning various historical economic crises and employing different correlation metrics. In [24], the authors estimated the correlations among 116 *S*&*P*500 stocks between 1982 and 2000 using the Pearson coefficient. They then used a minimum spanning tree (MST) to build a correlation-based network and observed time-varying correlations through three network metrics: *normalized tree length*, *survival ratio* and *mean occupation layer*. As a result, they identified a large change in the network structure during Black Monday. More recently, the authors of [6] proposed a neural-network approach to construct a graph and found a dramatic difference in the network structure during the downturns of 2008, 2011 and 2020. In [1], Pearson correlation matrices of 200 and 400 stocks from the CSI 300 and *S*&*P*500 indices, respectively, were used to find an optimized portfolio following the Markowitz optimization scheme. Instead of the Pearson method, Liu et al. [25] used mutual information to generate a distance metric that accounts for non-linear effects in intra-day *S*&*P* stock data. Other methods to estimate correlation coefficients (e.g., wavelet coherence, the fast Fourier transform) and to construct correlation-based networks (e.g., the PMFG, the threshold method) have been introduced in several studies [2,11,26].
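The pipeline common to these studies converts the Pearson correlation matrix into Mantegna's distance metric, *d_ij* = sqrt(2(1 − *ρ_ij*)), and extracts the MST, from which metrics such as the normalized tree length follow. A minimal sketch with `networkx` is given below; the tickers and the synthetic returns are placeholders for illustration, not data from any of the cited studies:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
# Hypothetical daily log-returns for 5 assets over 250 days; real studies
# use actual price series, e.g. S&P 500 constituents.
returns = rng.normal(size=(250, 5))
tickers = ["A", "B", "C", "D", "E"]

rho = np.corrcoef(returns, rowvar=False)   # Pearson correlation matrix
dist = np.sqrt(2.0 * (1.0 - rho))          # Mantegna distance, in [0, 2]

# Fully connected graph weighted by distance, then its MST
G = nx.Graph()
for i in range(len(tickers)):
    for j in range(i + 1, len(tickers)):
        G.add_edge(tickers[i], tickers[j], weight=dist[i, j])
mst = nx.minimum_spanning_tree(G)          # keeps the N-1 shortest-distance links

# Normalized tree length: mean edge distance over the MST's N-1 edges
ntl = sum(d["weight"] for _, _, d in mst.edges(data=True)) / mst.number_of_edges()
```

Tracking `ntl` over rolling windows is what lets studies such as [24] quantify how the network contracts or expands through a crisis.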

Several approaches originally developed to study correlations in the stock market have also been applied to digital coins. A common conclusion of existing articles is that the cryptocurrency network changes over time, but Ethereum tends to act as a central node in the whole network, i.e., it is a densely connected node [5,27,28]. A few works address the data-frequency limitation noted in traditional markets, where studies have tended to rely on low-frequency data such as daily or weekly series; however, such works account for only a small portion of the existing literature. For example, Antonio et al. [29] used fine time resolutions, such as one hour and four hours, alongside daily data for 25 large-capitalization entities traded on the FTX exchange to examine how cryptocurrency network structures evolve across time frequencies. Using a Pearson correlation-based MST, they found that the networks' shape becomes more complex at coarser time resolutions; in other words, cryptocurrencies converge into a bigger group as the resolution coarsens. In contrast, the authors in [20], using multiple timescales ranging from 10 min to 360 min, reached the opposite conclusion: at short timescales the network is centralized, while at long timescales it is distributed and more correlated. They explained this result by the liquidity and capitalization differences among the assets: since cryptocurrencies with low capitalization are traded less frequently than those with large capitalization, it takes more time for a piece of market information to spread to them, so their correlations emerge at longer timescales. Notably, this is one of the very few studies that remove the trend effect from the original dataset. Interestingly, instead of using return time series like other researchers, one study used hourly realized volatility values to observe the risk spillover among 7 highly capitalized cryptocurrencies [3].

Different methods have been introduced to detect communities given a correlation matrix. The authors in [4] applied the Louvain method to the MST of 119 cryptocurrencies to identify potential communities. The time-varying dynamics of the community structures found suggest collective behaviour among these communities. Using the communities found by the same method, the authors in [30] went one step further by applying Principal Component Analysis (PCA) to find an optimal portfolio out of 200 cryptocurrencies in circulation. Another community detection method worth considering is Girvan–Newman, which has been widely adopted for multiple purposes such as link prediction and portfolio diversification [31,32]. A few other, less popular methods are also used to group similar entities, such as the Clauset algorithm, the stochastic block model (SBM), latent Dirichlet allocation (LDA) and Markov random fields (MRF) [33]. One limitation of existing studies is that some rely on a single community detection algorithm, which raises doubts about the robustness of the community structures found. To this end, we first use the Louvain method to detect communities in our dataset and then adopt the Girvan–Newman method to examine the robustness of the communities found earlier.
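The robustness check described above, running two independent detection algorithms and comparing their partitions, can be sketched as follows. This is a minimal illustration with `networkx` on a toy barbell graph, which merely stands in for a correlation-based network and is not derived from our dataset:

```python
import networkx as nx

# Toy graph with two obvious clusters: two 5-cliques joined by a single edge.
# In our setting, nodes would be cryptocurrencies and edges correlation-based.
G = nx.barbell_graph(5, 0)

# Louvain partition (modularity-based; requires networkx >= 2.8)
louvain = nx.community.louvain_communities(G, seed=42)

# Girvan-Newman: iteratively removes the highest-betweenness edge;
# next() yields the first split into two communities
girvan = next(nx.community.girvan_newman(G))

# Robustness check: do the two methods recover the same partition?
same = {frozenset(c) for c in louvain} == {frozenset(c) for c in girvan}
```

On this toy graph both methods should recover the two cliques; on real data, agreement between the partitions (e.g., measured by normalized mutual information) is what supports the claim that the communities are not an artifact of one algorithm.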
