**3. Methods**

Because this study investigates the evolution of the network structure, the sliding window technique was applied. In this method, a fragment of fixed length (the time window size) is selected from the time series and analysed; the beginning and the end of the fragment are then shifted by one point and all calculations are repeated. The procedure continues until the time window reaches the end of the series.
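The sliding window procedure can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `sliding_windows` and the toy series are assumptions made for the example, not part of the study.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], window: int
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, fragment) for every window of fixed length.

    The window is shifted by one point at a time until its end reaches
    the end of the series.
    """
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    for start in range(len(series) - window + 1):
        yield start, series[start:start + window]


# Toy series of six points and a window of four points (illustrative values).
toy_series = [0.1, -0.2, 0.05, 0.3, -0.1, 0.0]
windows = list(sliding_windows(toy_series, window=4))
```

Each yielded fragment would then be passed to the distance, network, and parameter calculations described in the following subsections.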

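The analysis carried out in this work can be divided into the following main steps, each described in its own subsection: calculation of the distance matrix (Section 3.1), network construction (Section 3.2), and network analysis (Section 3.3).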


#### *3.1. Distance Matrix*

The distance between the log-returns time series is calculated based on the ultrametric distance [26,27,39,40] as in Equation (2),

$$DM(A, B)_{t,T} = \sqrt{\frac{1}{2}\left(1 - \rho(A, B)_{t,T}\right)}, \tag{2}$$

where the correlation *ρ*(*A*, *B*)<sub>*t*,*T*</sub> is calculated using the Pearson correlation coefficient, as in Equation (3):

$$\rho(A, B)_{t,T} = \frac{\langle AB \rangle_{t,T} - \langle A \rangle_{t,T} \langle B \rangle_{t,T}}{\sigma(A)_{t,T}\, \sigma(B)_{t,T}}, \tag{3}$$

where the indices ()<sub>*t*,*T*</sub> denote the interval (*t*, *t* + *T*) and *T* stands for the time window size. A distance *DM* equal to zero indicates a perfect positive linear correlation between the time series (*ρ* = 1). The distance grows as the correlation decreases and reaches its maximum, *DM* = 1, for *ρ* = −1. A lack of linear correlation does not mean that the time series are not related by other functions [39].

The literature also contains an alternative formulation of the ultrametric distance, Equation (2), which uses a different normalization [20]; the choice of normalization does not affect the conclusions. The ultrametric distance *DM* is calculated for all possible pairs of time series, and the results are collected in the distance matrix. The distance matrix *DM* is symmetric because the ultrametric distance defined by Equation (2) is symmetric in *A* and *B*.
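As an illustration, the distance matrix for one time window could be computed as below. This is a minimal sketch in plain Python, assuming the square-root form of Equation (2), that is, the square root of (1 − *ρ*)/2. The function names and toy series are hypothetical, and series with zero variance inside the window are not handled.

```python
import math
from typing import List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation as in Equation (3): (<AB> - <A><B>) / (sigma_A sigma_B)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    mean_ab = sum(x * y for x, y in zip(a, b)) / n
    # Standard deviations computed over the window.
    sigma_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sigma_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    return (mean_ab - mean_a * mean_b) / (sigma_a * sigma_b)


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed form of Equation (2): sqrt((1 - rho) / 2)."""
    rho = pearson(a, b)
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(0.0, 0.5 * (1.0 - rho)))


def distance_matrix(windowed: List[Sequence[float]]) -> List[List[float]]:
    """Symmetric matrix of pairwise distances with a zero diagonal."""
    n = len(windowed)
    dm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dm[i][j] = dm[j][i] = distance(windowed[i], windowed[j])
    return dm


# Toy window: the second series is perfectly correlated with the first,
# the third is perfectly anticorrelated with it (illustrative values only).
toy_window = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
dm = distance_matrix(toy_window)
```

In this toy case the distance between the first two series is close to zero, and the distance between the first and third series is close to one.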

#### *3.2. Network Construction*

Each distance matrix contains *n*(*n* − 1)/2 distinct elements, which here (*n* = 432 time series) gives 93096 numbers. The analysis of the distance matrix therefore requires the construction of a higher-order structure: a network. Although the minimum spanning tree (MST) is one of the most popular structures in the literature [6,16,18,19,41,42], it imposes a very strong bias on the generated network. For example, the imposed tree structure makes it impossible to observe cliques, which are important elements in the analysis of economic relationships. With some additional effort, clusters can also be distinguished in an MST [16], but such an analysis is not straightforward because of the tree structure. MST analysis often singles out one prominent node, e.g., [16,42], but in a different network structure the same node could be a member of a clique, and the conclusion about its special role would not be possible.

Therefore, in this paper, the threshold method is used. The distances are categorised into defined groups, and, in each case, the network is constructed based on the appropriately filtered distance matrix.

Distance categorisation:

• Strongly connected time series—the companies are connected when the distance is shorter than the first quartile of the distances in the analysed distance matrix, so the network is built on a set of the 25% shortest links;
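The first-quartile filter described above can be sketched as follows. This is an illustrative sketch only: the function name `threshold_network` and the toy matrix are hypothetical, and the quartile interpolation rule (`method="inclusive"` in `statistics.quantiles`) is an assumption, since the interpolation rule is not specified here.

```python
import statistics
from typing import Dict, List, Set


def threshold_network(dm: List[List[float]]) -> Dict[int, Set[int]]:
    """Link two nodes when their distance is shorter than the first quartile."""
    n = len(dm)
    # Upper-triangle distances: one value per pair of time series.
    pairs = [(dm[i][j], i, j) for i in range(n) for j in range(i + 1, n)]
    # First quartile of all pairwise distances (interpolation rule assumed).
    q1 = statistics.quantiles([d for d, _, _ in pairs], n=4, method="inclusive")[0]
    adjacency: Dict[int, Set[int]] = {i: set() for i in range(n)}
    for d, i, j in pairs:
        if d < q1:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


# Toy 5 x 5 distance matrix (illustrative values only).
toy_dm = [
    [0.0, 0.1, 0.2, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6, 0.7],
    [0.2, 0.3, 0.0, 0.8, 0.9],
    [0.4, 0.6, 0.8, 0.0, 1.0],
    [0.5, 0.7, 0.9, 1.0, 0.0],
]
toy_network = threshold_network(toy_dm)
```

In the toy matrix the three shortest distances link nodes 0, 1, and 2 into a clique, a structure that a tree-based filter such as the MST could not reproduce.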


Examples of the networks generated in the study are presented in Appendix B. Because of the huge number of graphs generated in the analysis (the number of windows equals the time series length reduced by the time window size) and the size of the networks, only a few examples are presented: the state before the COVID-19 pandemic (July/August 2019) and two examples during the pandemic (March 2020 and August/September 2020).

On the other hand, MST analysis allows a dominating node to be distinguished, usually the one with the highest number of links, e.g., GE in [16]. However, this result partially depends on the imposed tree structure. In the threshold method, such situations are less probable: a very large number of companies have a high number of links, so such prominent nodes are not observed.

#### *3.3. Network Analysis*

The last step of the analysis is the calculation of network parameters. Because more than a thousand networks are constructed in the study (due to the sliding window technique) and each network consists of 432 nodes, a direct inspection of every network is impractical. Instead, the general state of the system can be characterised by appropriately chosen parameters.

The study aims to observe changes in the structure of the network of correlations. In the case of economic systems, some structures are of special interest. Usually, one of the first issues analysed is leadership, i.e., the presence of dominating companies, which are network hubs. The second most important structures are clusters, which correspond to strongly cooperating companies or to sets with strong mutual relationships, e.g., belonging to the same highly specialised sector, having the same ownership, or sharing another common factor. The question of the presence of dominating companies is answered by the rank node analysis, which ranks nodes with respect to their number of links. It was shown in [22,32,42] that during crises the dominating structure is a star-like network with a well-defined centre. On the other hand, for independently developing companies, one can expect the statistical distances among the time series to be similar (with some fluctuations). Moreover, from the point of view of the questions raised, the most interesting aspects are the changes in the network structure. A measure that properly exposes such structures and their changes is the Shannon entropy. Therefore, the rank node distribution is characterised by its information entropy, here called the **rank node entropy** and defined by Equation (4),

$$SN = -\sum_{i \in L} p_i \ln p_i, \tag{4}$$

where *L* represents the list of all observed ranks, and *p<sub>i</sub>* represents the probability of the *i*-th rank node.
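A minimal sketch of Equation (4) in plain Python is given below. It assumes that the rank of a node is its number of links and that *p<sub>i</sub>* is the fraction of nodes with rank *i*. The function name and the two toy graphs are hypothetical and serve only to illustrate the calculation.

```python
import math
from collections import Counter
from typing import Dict, Set


def rank_node_entropy(adjacency: Dict[int, Set[int]]) -> float:
    """Shannon entropy of the distribution of node ranks (numbers of links)."""
    ranks = [len(neighbours) for neighbours in adjacency.values()]
    counts = Counter(ranks)
    total = len(ranks)
    # p_i is taken as the fraction of nodes having rank i (assumption).
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Star: node 0 is linked to four leaves, so two ranks are observed (4 and 1).
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
# Ring: every node has exactly two links, so only one rank is observed.
ring = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}

star_entropy = rank_node_entropy(star)
ring_entropy = rank_node_entropy(ring)
```

When all nodes share the same rank, as in the ring, the entropy is zero; any spread of ranks gives a positive value.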

The second feature investigated is the formation of particular structures, specifically triangles and cycles. Triangles expose companies that form closely interacting groups; analogously, cycles are groups with significant relationships (a chain of dependence). These two features are analysed by calculating the transitivity and the cycle entropy. The transitivity is defined as the fraction of all possible triangles that are present in the graph, Equation (5):

$$T = 3 \frac{\text{\#triangles}}{\text{\#triads}}, \tag{5}$$

where a *triad* is two edges with a shared vertex. **The cycle entropy** is defined as the information entropy of the cycle length distribution:

$$SC = -\sum_{i \in \mathcal{C}} p_i \ln p_i, \tag{6}$$

where *C* indicates the list of all observed cycle lengths, and *p<sub>i</sub>* represents the probability of observing a cycle of length *i*.
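Equations (5) and (6) can be illustrated with the plain-Python sketch below. The function names and toy data are hypothetical. How the cycles are enumerated is not specified here, so `cycle_entropy` simply takes a list of already observed cycle lengths as input.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def transitivity(adjacency: Dict[int, Set[int]]) -> float:
    """Equation (5): 3 * (number of triangles) / (number of triads)."""
    triangles = 0
    triads = 0
    for v, neighbours in adjacency.items():
        k = len(neighbours)
        # Each pair of edges meeting at v forms one triad.
        triads += k * (k - 1) // 2
        # Count each triangle once, at its smallest-labelled vertex.
        for u in neighbours:
            for w in neighbours:
                if v < u < w and w in adjacency[u]:
                    triangles += 1
    return 3 * triangles / triads if triads else 0.0


def cycle_entropy(cycle_lengths: List[int]) -> float:
    """Equation (6): Shannon entropy of the cycle length distribution."""
    counts = Counter(cycle_lengths)
    total = len(cycle_lengths)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Toy graph: a triangle 0-1-2 with an extra node 3 attached to node 2.
paw = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
paw_transitivity = transitivity(paw)

# Hypothetical list of observed cycle lengths (illustrative values only).
toy_cycle_entropy = cycle_entropy([3, 3, 4, 5])
```

In the toy graph there is one triangle and five triads, which gives a transitivity of 0.6.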

The last analysed network parameter is the clustering coefficient, which is the standard characteristic of the link density. Here, the averaged clustering coefficient is used, which is defined by Equation (7):

$$\mathcal{C} = \frac{1}{n} \sum_{v \in G} c_v, \quad c_v = \frac{2T(v)}{\deg(v)(\deg(v) - 1)}, \tag{7}$$

where *n* is the number of nodes in the graph *G*, *T*(*v*) is the number of triangles through node *v*, and *deg*(*v*) represents the degree of node *v*.
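Equation (7) can be sketched in plain Python as follows. The function name and toy graph are hypothetical. Nodes with fewer than two links are assigned *c<sub>v</sub>* = 0, which is a common convention assumed here because Equation (7) is undefined for them.

```python
from typing import Dict, Set


def average_clustering(adjacency: Dict[int, Set[int]]) -> float:
    """Equation (7): mean of the local clustering coefficients c_v."""
    total = 0.0
    for v, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            # c_v is taken as 0 when deg(v) < 2 (assumed convention).
            continue
        # T(v): triangles through v, i.e. links among the neighbours of v.
        t_v = sum(
            1
            for u in neighbours
            for w in neighbours
            if u < w and w in adjacency[u]
        )
        total += 2 * t_v / (k * (k - 1))
    return total / len(adjacency)


# Toy graph: a triangle 0-1-2 with an extra node 3 attached to node 2.
paw = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
paw_clustering = average_clustering(paw)
```

For the toy graph the local coefficients are 1, 1, 1/3, and 0, so the average is 7/12. Graph libraries such as NetworkX provide comparable `transitivity` and `average_clustering` functions, which would be the more practical choice for networks of several hundred nodes.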

The last element of the analysis procedure to be defined is the time window length. Because daily time series are analysed, three time window lengths have been chosen: 5 days, 20 days, and 60 days, which correspond to weekly, monthly, and quarterly periods, respectively.

A summary of the analysis algorithm is as follows:


Finally, the time evolution of the network characteristics is obtained and discussed.
