*Article* **Linear and Nonlinear Effects in Connectedness Structure: Comparison between European Stock Markets**

**Renata Karkowska \* and Szczepan Urjasz \***

Faculty of Management, University of Warsaw, Szturmowa Street 1/3, 02-678 Warsaw, Poland

**\*** Correspondence: rkarkowska@wz.uw.edu.pl (R.K.); surjasz@wz.uw.edu.pl (S.U.)

**Abstract:** The purpose of this research is to compare the risk transfer structure in Central and Eastern European and Western European stock markets during the 2007–2009 financial crisis and the COVID-19 pandemic. Similar to the global financial crisis (GFC), the spread of coronavirus (COVID-19) created a significant level of risk, causing investors to suffer losses in a very short period of time. We use a variety of methods, including nonstandard ones such as mutual information and transfer entropy. Our results indicate that there are significant nonlinear correlations in the capital markets that can be practically applied for investment portfolio optimization. From an investor perspective, our findings suggest that in the wake of a global crisis and a pandemic outbreak, the benefits of diversification will be limited by the transfer of funds between developed and developing country markets. Our study provides an insight into the risk transfer theory in developed and emerging markets as well as a cutting-edge methodology designed for analyzing the connectedness of markets. We contribute to the studies which have examined the different stock markets' response to different turbulences. The study confirms that specific market effects can still play a significant role because of the interconnection of different sectors of the global economy.

**Keywords:** stock market; market connectedness; mutual information; transfer entropy; COVID-19; crisis

**1. Introduction**

Correlation estimates are crucial not only for asset allocation decisions but also for risk management and hedging. Following the global financial crisis (GFC), the financial markets faced another critical period: the global outbreak of the coronavirus (COVID-19) [1]. The pandemic is influencing a number of channels, including commercial activities, consumption, labor markets, and international supply chains. Among these channels, the stock market is one of the most important [2,3].

As a result, investors are more active and efficient in transferring their investments from one market to another in the event of a financial crisis, particularly at the first signs of economic or political instability. However, at a time when financial crises and pandemic turbulences are systemic in nature, the process of international diversification of assets may not fulfill its basic role of risk reduction. Additionally, empirical studies confirm that correlations between markets change over time, which calls into question the diversification benefits assumed in portfolio selection theory [4]. The main goal of this paper is to verify the risk transfer between the US stock market and six European stock market indices during the 2007–2009 global financial crisis and the COVID-19 outbreak.

In our study, we compare the Central and Eastern European (CEE) and Western European markets, even though these countries together form a common area of the European Union. The motivation for this division is to compare markets from countries with different levels of economic development, including the financial market. Keeping this in mind, the risk transfer structure may be different for these two regions. Our previous research confirms this relationship [5]. Our interest in that group of countries stems from several insights. Firstly, CEE countries have made major structural changes and reforms to integrate into European structures. Therefore, verification of how the financial markets of transition countries interact with other markets is of interest to both policy makers and investors. Secondly, CEE countries offer high returns on capital market investments with relatively low risk. Additionally, as the financial systems of CEE countries are strongly bank-based, an analysis of stock market development may still provide useful information.

**Citation:** Karkowska, R.; Urjasz, S. Linear and Nonlinear Effects in Connectedness Structure: Comparison between European Stock Markets. *Entropy* **2022**, *24*, 303. https://doi.org/10.3390/e24020303

Academic Editor: Joanna Olbryś

Received: 10 February 2022 Accepted: 18 February 2022 Published: 21 February 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The main contributions of this paper can be summarized as follows. Firstly, we contribute to the studies which have examined the different stock markets' response to different turbulences (financial crisis and pandemic outbreak). Thus, we answer the question whether they can be equally responsible for the intensification of the impact of the US stock market on the stock exchanges of Central and Eastern Europe. Secondly, we employ a variety of methods to separately analyze the linear and nonlinear effects of connectedness structures for international equity markets. The area of transfer entropy has not been explored in depth. Therefore, using linear and nonlinear methodology, we can compare the complexity of the behavior of stock markets. Interesting results were obtained by Olbryś and Majewska [6], who examined the benefits of international portfolio diversification across the largest European stock markets (i.e., the UK, France, and Germany) during the period 2003–2013. To the best of our knowledge, no current study has analyzed connectedness structures by verifying the linear and nonlinear effects in CEE stock markets compared to Western European markets during the COVID-19 pandemic.

Thirdly, we observe that the correlations between the US and European markets are unstable. Additionally, we confirm that Western European markets displayed higher correlations with the US stock market than CEE markets did [7].

Fourthly, the study emphasizes that while globalization has contributed to a more integrated financial system, specific market effects can still play a significant role because of the interconnection in different countries of the global economy. From an investor perspective, our findings suggest that in the wake of the global crisis and pandemic outbreak, the benefits of diversification will be limited by the transfer of funds between developed and developing country markets.

The analysis by Gao and Mei [8] examined the structure of the correlation between the US and Asian stock indices during the global financial crisis of 2007–2009 with the use of a sliding window. As part of our article, we carried out verification of the method used by Gao and Mei [8] in relation to European indices, extending the research sample to the period of the COVID-19 pandemic. The sliding window is a technique used by [8–10] to obtain dynamically changing results in observation windows. Using various parameters of sliding windows allowed for receiving distinctive outputs that presented slightly different trends in the time series. Using the methods of linear correlations, mutual information, and transfer entropy, which take into account the sliding window, it was possible to build a network of risk transfer structure relationships for the daily rates of return of selected Western European markets and Central and Eastern European equity markets. We show that these networks detect significant differences in the behavior of individual stock indices, especially in turbulent market periods, thus highlighting the strongly changing relationships between stock markets in different countries.

The rest of the paper is organized as follows. Section 2 presents the literature review, while Section 3 provides the description of the data. Section 4 presents methodology. Section 5 analyzes the results of the linear and nonlinear effect in connectedness structures. Finally, Section 6 concludes with some discussion regarding the implications of the findings and possible extensions to future work.

#### **2. Literature Review**

Although there is no consensus in studies on the reasons for increasing inter-market correlations in times of market turbulences, most researchers accept that correlations change fundamentally during market crises. The empirical results of Boubaker and Raza [4] provide strong evidence of cross-market movement between US and CEE stock markets and show that joint movement exhibits large time differences and asymmetry in the tails of return distributions. The analysis demonstrated that changes in volatility in the US and the euro area are relevant factors causing risk shocks in European markets.

Studies on the impact of COVID-19 on the financial market spread rapidly; however, they still do not cover all economic aspects of the pandemic. The overall economic impacts are not yet clear, and there is no consensus in the research. For example, Ashraf [11], Zhang et al. [12], Akhtaruzzaman et al. [13], and Zaremba et al. [14] confirm that the recent pandemic has led to a growth in global financial market risk. On the other hand, Sharif et al. [15] indicate that the COVID-19 pandemic affects the US economic risk much less than the geopolitical risk. Given slower economic growth and relatively illiquid capital markets, it is possible that emerging markets have limited resources to cope with the pandemic. According to Topcu and Gulal [16], the negative impact of COVID-19 on emerging stock markets gradually fell and began to taper off by mid-April 2020. The recent result of the TGARCH model estimated in Visegrad group countries' markets reveals that there is a negative link between the stock market indices and COVID-19 spread [17].

Even though the correlation coefficient and regression models are measures of linear relation between the markets, there are also nonlinear effects that may not be captured with the linear methods. The vast majority of research in transfer entropy estimation concerns developed markets. For example, Qiu and Yang [18] verify the estimation of transfer entropy for short time sequences, using 38 important stock market indices from four continents to create further financial networks, omitting nevertheless Central and Eastern European markets. Similarly, Kuang [19] aims to construct the information flow networks on multi-time-scales among 31 international stock markets between 2007 and 2018, finding that developed markets are more dominant but vulnerable to short-term risk contagion. An interesting study was conducted by Karaca, Zhang, and Muhammad [20] to optimize the stock indices' forecasting model in the stock indices dataset; however, in their study, they used only the French and German indices. Nevertheless, developing stock market connectedness based on nonlinear methods such as mutual information and transfer entropy is still at a very early stage [21–27].

Mutual information and transfer entropy are frequently used methods to study the effect of long-memory volatility. Long-memory volatility can be seen as evidence of market participants' inability to use the information available on the market and can, therefore, be linked to the issue of market (in)efficiency. For example, Dima and Dima [28] analyze the case of the Bucharest stock exchange, where they suspect endogenous and exogenous causes of nonlinear volatility effects. They suggest that mutual information can be an alternative method of checking persistence, which can be understood as evidence of long memory in the financial market. Caginalp and Desantis [29] emphasize that the role of long-term volatility is not the explicit opposite of a risk/return relationship but rather that there is an ambiguous and complex relationship between volatility and return. Khoojine and Han [30] used the mutual information method to build a structure describing the return and trading volume network of the Chinese stock market. You, Fiedor, and Hołda [24] use mutual information to analyze the correlation structure of the stock market in Shanghai and find that the Chinese stock market is not structurally riskier than US and Western European markets. Barbi and Prataviera [21] study nonlinear dependencies on the Brazilian equity network and underline the particular benefit of mutual information network analysis to identify the characteristics of financial markets due to nonlinear relationships. Ferreira, Dionísio, Almeida, Quintino, and Aslam [31] review the influential dynamics of CEE stock indices as well as US, German, UK, and Chinese indices and find strongly influential correlations between some CEE indices and the impactful character of the US index. They argue that the COVID-19 pandemic could intensify the influence of Chinese and US indices.

Thus, we believe that there is a need for development of a study that provides an insight into the cutting-edge methodology for analyzing the connectedness of stock markets, together with a structural and time analysis of the stock exchange in CEE and Western Europe comparing the 2007–2009 financial crisis and the COVID-19 pandemic outbreak.

#### **3. Data Characteristics**

The data used in this study were taken from the Stooq website and consist of daily logarithmic returns of one US stock market index: SPX (S&P500 Index–New York) and six European market indices, of which three are from developed countries: UKX (FTSE 100 Index–London), CAC (CAC40 Index–Paris), DAX (DAX Index–Frankfurt), and three are from developing countries: WIG20 (WIG20 Index–Warsaw), PX (PX Index–Praha), BUX (BUX Index–Budapest). The allocation was made in accordance with the classification used by MSCI Inc. [32].

There are 4773 observations for each time series in the period between January 2000 and August 2020. Table 1 presents preliminary statistics of the daily logarithmic returns for all indices. The measure of skewness demonstrates that all time series are skewed. On the basis of excess kurtosis, we can see that almost all series are highly leptokurtic with respect to the normal distribution. The Doornik–Hansen tests show a rejection (at the 5% level) of the null hypothesis of normality for each of the return series.
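As an illustration of the quantities summarized above, the following minimal sketch computes daily logarithmic returns and the moment-based skewness and excess kurtosis of a series. The function names and the price values are hypothetical, and the biased (population) moment estimators are an assumption; the text does not state which estimator variant was used.

```python
import math

def log_returns(prices):
    """Daily logarithmic returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def skewness_and_excess_kurtosis(xs):
    """Biased (population) skewness and excess kurtosis of a sample."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in xs) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 99.5, 102.0, 101.2]
returns = log_returns(prices)
skew, excess_kurtosis = skewness_and_excess_kurtosis(returns)
```

A positive excess kurtosis corresponds to the leptokurtic behavior described above; a formal normality test such as Doornik–Hansen is outside the scope of this sketch.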

**Table 1.** Summarized statistics for daily returns.


#### **4. Methods**

*4.1. Cross-Market Correlations*

As a first step, we use the Pearson correlation coefficient to measure the linear relationship. Next, we use an adjusted correlation coefficient following studies by Forbes and Rigobon [33], Olbryś and Majewska [6], and Rigobon [34]:

$$\hat{\rho}_{VA} = \frac{\hat{\rho}_{C}}{\sqrt{1 + \delta \left[1 - \left(\hat{\rho}_{C}\right)^2\right]}} \tag{1}$$

where:

$\hat{\rho}_{VA}$: the adjusted correlation coefficient;

$\hat{\rho}_{C}$: the conditional (unadjusted) correlation coefficient;

$\delta$: the change in turbulent period (crisis) volatility compared to the tranquil period (pre-crisis):

$$
\delta = \frac{\hat{\sigma}_{C}^{2}}{\hat{\sigma}_{PC}^{2}} - 1 \tag{2}
$$

where $\hat{\sigma}_{C}^{2}$ and $\hat{\sigma}_{PC}^{2}$ are the variances in the turbulent and tranquil periods.
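Equations (1) and (2) can be sketched in a few lines. The function names are hypothetical; the two variances are the SPX values reported in the results for the 2007–2009 crisis, while the unadjusted correlation of 0.6 is an assumed value chosen only for illustration.

```python
import math

def variance_increase(var_crisis, var_pre_crisis):
    """delta from Eq. (2): relative increase in variance in the turbulent period."""
    return var_crisis / var_pre_crisis - 1.0

def adjusted_correlation(rho_c, delta):
    """Volatility-adjusted correlation from Eq. (1)."""
    return rho_c / math.sqrt(1.0 + delta * (1.0 - rho_c ** 2))

# SPX variances reported in the text for the crisis and pre-crisis periods.
delta = variance_increase(0.0006661542, 0.0000864396)

# Assumed unadjusted crisis correlation, for illustration only.
rho_c = 0.6
rho_adj = adjusted_correlation(rho_c, delta)
```

Because the denominator in Eq. (1) exceeds one whenever the variance increases ($\delta > 0$) and $|\hat{\rho}_{C}| < 1$, the adjusted coefficient is always smaller in magnitude than the unadjusted one in that case.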

Following that, the Fisher *Z* transformation of the Pearson correlation is [35]:

$$\rho_{VA}^{*} = \frac{1}{2} \left[\ln(1 + \hat{\rho}_{C}) - \ln(1 - \hat{\rho}_{C})\right] \tag{3}$$

To obtain approximately standard normal distributed *z*-statistic values, the difference is formed as follows:

$$Z = \frac{\rho_{C} - \rho_{PC}}{\sqrt{\frac{1}{n_{C} - 3} + \frac{1}{n_{PC} - 3}}} \tag{4}$$

where $\rho_{C}$ and $\rho_{PC}$ are the Fisher-transformed cross-correlation coefficients in the turbulent and tranquil periods, and $n_{C}$ and $n_{PC}$ are the sample sizes of the turbulent and tranquil periods.

To verify the existence of significant change in cross-market correlations, we can test the hypotheses as follows:

$$H_0: \rho_{VA} = \rho_{PC} \qquad H_1: \rho_{VA} \neq \rho_{PC} \tag{5}$$

where $H_0$ states that there are no significant changes in adjusted correlation.
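The test built from the Fisher transformation and the *Z*-statistic can be sketched as follows. The function names are hypothetical, and the two-sided *p*-value uses the standard normal approximation, which is an assumption of this sketch rather than the decision rule reported with the tables.

```python
import math

def fisher_z(rho):
    """Fisher transformation: 0.5 * [ln(1 + rho) - ln(1 - rho)]."""
    return 0.5 * (math.log(1.0 + rho) - math.log(1.0 - rho))

def z_statistic(rho_c, n_c, rho_pc, n_pc):
    """Difference of Fisher-transformed correlations scaled by its standard error."""
    standard_error = math.sqrt(1.0 / (n_c - 3) + 1.0 / (n_pc - 3))
    return (fisher_z(rho_c) - fisher_z(rho_pc)) / standard_error

def two_sided_p_value(z):
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Assumed correlations for illustration; 290 is the subsample size used in the text.
z = z_statistic(0.55, 290, 0.45, 290)
p = two_sided_p_value(z)
```

A small *p*-value would indicate rejection of $H_0$, that is, a significant change in correlation between the tranquil and turbulent periods.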

#### *4.2. Larntz–Perlman Procedure*

We used the Larntz–Perlman procedure [36] for testing the equality of correlation matrices computed over non-overlapping subsamples: the pre-crisis and crisis periods in the group of markets investigated. Longin and Solnik [37] affirmed that the knowledge about international covariance and correlation matrices of asset returns and their behaviors is essential for the calculation of portfolios.

To examine the equality of correlation matrices, we can test the pair of hypotheses:

$$H_0: P_{C} = P_{PC} \qquad H_1: P_{C} \neq P_{PC} \tag{6}$$

where $P_{C}$ and $P_{PC}$ are population correlation matrices in the turbulent and tranquil periods. Rejection of $H_0$ indicates a lack of equality of correlation matrices in a turbulent episode.

In this article, we used the test statistic proposed by Larntz and Perlman [36]:

$$T_{LP} = \sqrt{\frac{n-3}{2}} \cdot \max_{1 \le i < j \le p} \left| z_{ij}^{C} - z_{ij}^{PC} \right| \tag{7}$$

where $z_{ij}^{C}$ and $z_{ij}^{PC}$ are the Fisher $z$-transforms of the sample correlations $\hat{\rho}_{ij}^{C}$ and $\hat{\rho}_{ij}^{PC}$.
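A minimal sketch of the statistic in Eq. (7) is given below. The function name and the example matrices are hypothetical, and the sketch assumes that both subsamples have the same size $n$, as is the case for the equally sized subsamples used in this paper. It computes only the statistic; the critical value needed for the decision is not included.

```python
import math

def larntz_perlman_statistic(corr_c, corr_pc, n):
    """T_LP from Eq. (7): scaled maximum absolute difference of Fisher z-values.

    corr_c, corr_pc: square correlation matrices (lists of lists) of equal size.
    n: common sample size of the two subsamples (assumption of this sketch).
    """
    p = len(corr_c)
    max_diff = 0.0
    for i in range(p):
        for j in range(i + 1, p):  # upper triangle only: 1 <= i < j <= p
            diff = abs(math.atanh(corr_c[i][j]) - math.atanh(corr_pc[i][j]))
            max_diff = max(max_diff, diff)
    return math.sqrt((n - 3) / 2.0) * max_diff

# Hypothetical 3x3 correlation matrices, for illustration only.
crisis = [[1.0, 0.6, 0.4], [0.6, 1.0, 0.5], [0.4, 0.5, 1.0]]
pre_crisis = [[1.0, 0.5, 0.4], [0.5, 1.0, 0.3], [0.4, 0.3, 1.0]]
t_lp = larntz_perlman_statistic(crisis, pre_crisis, 290)
```

Here `math.atanh` is the Fisher $z$-transformation, so the largest pairwise change in transformed correlation drives the statistic.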

#### *4.3. Mutual Information*

Mutual information (MI) is a measure of statistical dependence between two random variables, and it is used to evaluate both linear and nonlinear relationships [9]. Moreover, MI is defined as the amount of information transferred between studied systems [27].

There is no single commonly used MI estimator, but there are studies that compare them [38–44]. Depending on the sample size and the underlying distribution or process, the estimated MI rises as the partition of the interval for the time series becomes finer. There are three main groups of estimators: histogram-based estimators, k-nearest neighbors, and kernel estimators [39,40]. Among histogram-based estimators we can distinguish three main subgroups: equidistant partitioning, with bins of equal length [44]; equiprobable partitioning, where each bin has the same occupancy, i.e., marginal equiquantization [45]; and adaptive partitioning as an extension of the previous two proposed by Darbellay and Vajda [41]. The k-nearest neighbors method takes into account the probability distributions for the distance between the point at which the density is to be estimated and its k-th nearest neighbor [40]. Another approach is to apply the kernel mutual information estimator constructed by Moon et al. [39], which centers a kernel function at the data samples. According to the approach proposed by Darbellay [45], the marginal equiquantization estimation process allows one to maximize mutual information. Furthermore, Dionísio et al. [46] emphasize that the comparison of MI is difficult in some contexts; therefore, a normalized measure of MI should be applied. Nevertheless, in order to ensure the comparability of our results with the study conducted by Gao and Mei [8], we will use the equidistant partitioning estimation process for our calculations.
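For contrast with the equidistant scheme adopted here, the following minimal sketch illustrates equiprobable partitioning (marginal equiquantization), in which every bin receives the same number of observations. The function name, the rank-based construction, and the sample values are assumptions made for illustration and are not taken from the cited estimators.

```python
from collections import Counter

def equiprobable_bins(xs, n_bins=4):
    """Assign each observation to one of n_bins bins of (nearly) equal occupancy."""
    n = len(xs)
    # Indices sorted by value; sorted() is stable, so ties keep their input order.
    order = sorted(range(n), key=lambda i: xs[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_bins // n  # rank quantile mapped to a bin label
    return labels

# Hypothetical observations, for illustration only.
sample = [5.0, 1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 7.0]
labels = equiprobable_bins(sample)
occupancy = Counter(labels)
```

With eight observations and four bins, each bin holds exactly two observations, whereas equidistant bins of equal length can have very uneven occupancy for heavy-tailed returns.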

In the study of MI, the selected method to discretize the time series is the binning method [9]. We partition the range of the time series into $n$ disjoint intervals $x_n$ ($n = 1, 2, 3, \ldots, N$; $x_n = 0, 1, 2, 3$) with fraction of all measurements equal to $p(x_n) = 1/n$. By grouping the time series into bins $I: x_n$ ($n = 1, 2, 3, \ldots, N$; $x_n = 0, 1, 2, 3$) and $J: y_n$ ($n = 1, 2, 3, \ldots, N$; $y_n = 0, 1, 2, 3$) that share identical length $N$, we create two discrete processes. The MI is given as:

$$M(X;Y) = \sum_{x_n, y_n} p(x_n, y_n) \log \frac{p(x_n, y_n)}{p(x_n)\,p(y_n)} \tag{8}$$
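The binning step and Eq. (8) can be sketched as follows. The function names are hypothetical; the choice of four bins follows the symbol set {0, 1, 2, 3} used above, while the natural logarithm and the plug-in (relative frequency) probability estimates are assumptions of this sketch.

```python
import math
from collections import Counter

def equidistant_bins(xs, n_bins=4):
    """Discretize a series into n_bins bins of equal length over its range."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_bins
    if width == 0.0:
        return [0] * len(xs)  # constant series: everything falls into one bin
    # The maximum value is clipped into the last bin.
    return [min(int((x - lo) / width), n_bins - 1) for x in xs]

def mutual_information(xs, ys):
    """Plug-in estimate of Eq. (8) for two discrete series of equal length."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical return series, for illustration only.
a = equidistant_bins([0.01, -0.02, 0.03, 0.00, -0.01, 0.02])
b = equidistant_bins([0.02, -0.01, 0.02, 0.01, -0.02, 0.03])
mi_ab = mutual_information(a, b)
```

Because Eq. (8) is symmetric in its arguments, `mutual_information(a, b)` equals `mutual_information(b, a)`, which is why MI alone carries no directional information.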

#### *4.4. Transfer Entropy*

Transfer entropy (TE) was introduced by Schreiber [47] as an approach to measuring the direct exchange of the flow of information between two systems evolving in time. Considering two stationary and discrete processes $I: x_n$ ($n = 1, 2, 3, \ldots, N$; $x_n = 0, 1, 2, 3$) and $J: y_n$ ($n = 1, 2, 3, \ldots, N$; $y_n = 0, 1, 2, 3$) that share identical length $N$, we measure the TE with $J \to I$ as the deviation of information collected from the previous state of $I$ that comes purely from the latest state of $I$, which in turn was received from the last joint state of $I$ and $J$ [8,48]. The information about the subsequent state $x_{n+1}$ of $I$ received from the last joint state of $I$ and $J$ is:

$$h_1 = -\sum_{x_{n+1}, x_n, y_n} p(x_{n+1}, x_n, y_n) \log p(x_{n+1} \mid x_n, y_n) \tag{9}$$

If the state of the subsequent observation $x_{n+1}$ of $I$ is not based on the state of $J$, the information is received only from the state of $I$:

$$h_2 = -\sum_{x_{n+1}, x_n} p(x_{n+1}, x_n) \log p(x_{n+1} \mid x_n) \tag{10}$$

The transfer entropy with processes $J \to I$:

$$T_{J \to I} = h_2 - h_1 = \sum_{x_{n+1}, x_n, y_n} p(x_{n+1}, x_n, y_n) \log \frac{p(x_{n+1} \mid x_n, y_n)}{p(x_{n+1} \mid x_n)} \tag{11}$$
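Equation (11) can be sketched for two already discretized series as follows. The function name and the example sequences are hypothetical; the sketch assumes a history length of one step, the natural logarithm, and plug-in (relative frequency) probability estimates.

```python
import math
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate of T_{J->I} in Eq. (11): information flow from y (J) to x (I)."""
    triples = list(zip(x[1:], x[:-1], y[:-1]))      # (x_{n+1}, x_n, y_n)
    n = len(triples)
    c_xxy = Counter(triples)
    c_xx = Counter((a, b) for a, b, _ in triples)   # (x_{n+1}, x_n)
    c_xy = Counter((b, c) for _, b, c in triples)   # (x_n, y_n)
    c_x = Counter(b for _, b, _ in triples)         # x_n
    te = 0.0
    for (a, b, c), count in c_xxy.items():
        p_given_xy = count / c_xy[(b, c)]           # p(x_{n+1} | x_n, y_n)
        p_given_x = c_xx[(a, b)] / c_x[b]           # p(x_{n+1} | x_n)
        te += (count / n) * math.log(p_given_xy / p_given_x)
    return te

# Hypothetical binary example in which x copies y with a one-step delay.
y = [0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1]
x = [0] + y[:-1]
te_y_to_x = transfer_entropy(x, y)
te_from_constant = transfer_entropy(x, [0] * len(x))
```

In this example the past of `y` fully determines the next value of `x`, so the estimate is positive, while a constant source series adds no information and yields zero. Swapping the arguments gives the estimate in the opposite direction, which is what makes the measure asymmetric.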

#### *4.5. Summary of Methods*

We use a variety of methods, such as the cross-correlation, volatility-adjusted cross-correlation, Larntz–Perlman procedure [36], and the mutual information and transfer entropy approaches, to separately analyze the correlation structures for testing the linear and nonlinear relationships in returns between selected markets. Each method has advantages and disadvantages.

There is a sizeable empirical literature that presents nonlinear effects in financial time series [9]. It is not possible to model such behavior in a sufficient manner using Pearson correlation, due to the fact that it explores only linear relationships, ignoring a meaningful amount of information [49]. For this reason, it would be favorable to model both linear and nonlinear information using different methods.

Mutual information has solid foundations in the mathematical concept of information theory and can be used to model both linear and nonlinear connections but is easily influenced by dependencies that are not found in the covariance [40]. On the other hand, MI does not provide directional or dynamical information because of its static, symmetric property [47]. Furthermore, the amount of received information relies on discretization algorithms and bin size [9]. In comparison to MI, transfer entropy is more adequate for detecting the direct exchange of information between two systems, but, as Kaiser and Schreiber [50] pointed out, no similar monotonic convergence seems to hold. In contrast to MI, transfer entropy is created to avoid static correlations due to the common input signals [47]. This tool is widely used due to its close relationship to the concept of Granger causality [51], which is the reason for combining two approaches (information-theoretic and predictive) to analyze directional relations between processes [52].

#### **5. Results**

#### *5.1. Cross-Market Correlations*

In the first step, using linear correlations, we examine whether the degree of stock market connectedness between the US stock market and CEE differs from that in developed markets. Figure 1 shows the mean linear correlations between each index and the rest of the indices obtained by using overlapping windows. We split the time series into sequences using a fixed-size sliding window of 220 days (Figure 1a) and 1000 days (Figure 1b), with a window step length of 1 trading day. After exploring different values, we identified the parameters that ensure smoothly but dynamically changing results. Using various parameters of sliding windows produced distinctive outputs that presented slightly different trends in the time series. The selected values are similar to those of Onnela, Chakraborti, Kaski, Kertész, and Kanto [10]. The mean linear correlations of the Western European markets are higher than those of the CEE indices. We can observe that UKX, CAC, and DAX indices move together throughout the complete sample, and the mean linear correlation of the CAC index is the highest. On the other hand, the mean linear correlation of the UKX index from 2016 (Brexit) to March 2020 (COVID-19 pandemic) has a weaker relationship with other Western European indices. The relationship between the mean linear correlations of CEE markets fluctuates during the whole period. In the time of the crisis, the mean linear correlation of the BUX index rose until 2013 and then dropped dramatically. Between 2009 and 2015, the mean correlation of the WIG20 is higher than that of other CEE indices. From 2016, the mean correlation of the PX index is higher than that of the WIG20 and BUX. Out of the CEE markets, the mean correlation of the BUX index increased the most during the COVID-19 pandemic. This evidence is consistent with the study on CEE indices during the COVID-19 period [17].
When the fixed-size sliding window is 220 days, the mean linear correlations of European markets bounce after falling in 2005, 2015, 2018, and in early 2020. The mean linear correlation of stock exchanges in the US (presented as a black line) declined from 2007 to 2009 and then began to rise again. Even with the 1000-day fixed size sliding window, it is still clear that the trend is going up, especially starting from March 2020.
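The sliding-window computation behind the mean linear correlations can be sketched as follows. The function names and the tiny example series are hypothetical; the default window of 220 observations and step of 1 follow the values stated above, and the mean is taken over the Pearson correlations of one index with every other index in each window.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)

def rolling_mean_correlation(returns, target, window=220, step=1):
    """Mean correlation of `target` with all other series in each sliding window.

    returns: dict mapping an index name to its list of returns (equal lengths).
    """
    others = [name for name in returns if name != target]
    length = len(returns[target])
    means = []
    for start in range(0, length - window + 1, step):
        end = start + window
        cors = [pearson(returns[target][start:end], returns[name][start:end])
                for name in others]
        means.append(sum(cors) / len(cors))
    return means

# Hypothetical toy series; a short window is used only so the example is small.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "C": [5.0, 4.0, 3.0, 2.0, 1.0],
}
toy_means = rolling_mean_correlation(toy, "A", window=3)
```

A longer window averages over more observations and therefore gives a smoother curve, which matches the difference between the 220-day and 1000-day panels described above.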

For further observation, the data were split into five short, distinctive periods: pre-crisis (1 September 2006 to 30 November 2007), crisis (1 December 2007 to 28 February 2009), post-crisis (1 March 2009 to 25 May 2010), pre-COVID-19 (30 September 2019 to 11 March 2020), and COVID-19 (12 March 2020 to 14 August 2020) in order to provide information on the strength and direction of the linear relationship. The results of the preliminary analysis are presented in Figure 2. In all analyzed periods, linear correlations between the SPX and Western European indices are higher than those between the SPX and CEE indices. The results show that COVID-19 has a considerable impact on all analyzed indices. The mean linear correlations of European and US markets prove to be higher during the COVID-19 period than in the crisis period. Furthermore, Western European indices are more affected by COVID-19 compared to CEE indices. During the COVID-19 period, the highest value of the correlation coefficient was observed in three cases: between the SPX and UKX, the SPX and CAC, and the SPX and DAX. In the group of CEE indices in the pre-crisis period, the linear correlation coefficients between the US and the WIG20 were at the highest level. During the crisis, this role is taken over by the BUX index; after the crisis, the PX index; and after that, during pre-COVID-19 and COVID-19 periods, again by the BUX index. Excluding the BUX index, all linear correlation coefficients between the US equity markets and selected European stock exchanges were higher in the post-crisis period than during and before the crisis. It is worth noting that only the linear correlation coefficient between the US equity markets and UKX index was lower in the COVID-19 period than in the pre-COVID-19 period.
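The five sub-periods can be encoded directly from the dates given above. The names `PERIODS`, `period_of`, and `split_by_period` are hypothetical, and treating both boundary dates as inclusive is an assumption of this sketch.

```python
from datetime import date

# Sub-period boundaries as stated in the text (assumed inclusive on both ends).
PERIODS = {
    "pre-crisis": (date(2006, 9, 1), date(2007, 11, 30)),
    "crisis": (date(2007, 12, 1), date(2009, 2, 28)),
    "post-crisis": (date(2009, 3, 1), date(2010, 5, 25)),
    "pre-COVID-19": (date(2019, 9, 30), date(2020, 3, 11)),
    "COVID-19": (date(2020, 3, 12), date(2020, 8, 14)),
}

def period_of(day):
    """Return the name of the sub-period containing `day`, or None."""
    for name, (start, end) in PERIODS.items():
        if start <= day <= end:
            return name
    return None

def split_by_period(dated_returns):
    """Group (date, return) pairs by sub-period; other dates are dropped."""
    groups = {name: [] for name in PERIODS}
    for day, value in dated_returns:
        name = period_of(day)
        if name is not None:
            groups[name].append(value)
    return groups

# Hypothetical observations, for illustration only.
groups = split_by_period([(date(2007, 12, 3), 0.01), (date(2015, 1, 2), 0.02)])
```

Observations between May 2010 and September 2019 belong to none of the five windows and are therefore excluded from the period comparison, although they remain part of the complete sample.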

**Figure 1.** The mean linear correlations between each index and the rest of the indices using overlapping windows. The upper part is a 220-day fixed-size sliding window (**a**), and the one below is a 1000-day fixed-size sliding window (**b**).

**Figure 2.** The linear correlations between the US and European stock market indices in the selected periods.

Table 2 shows the standard contemporaneous cross-market correlations and adjusted correlation coefficients, as seen in (1), of daily logarithmic returns on pairs of indices (the SPX and each stock market index). We take into consideration the dependencies in the complete sample (January 2000–mid-August 2020) as well as in two equally sized subsamples: the pre-crisis period, September 2006–November 2007 (290 days), and the crisis period, December 2007–February 2009 (290 days). We analyze the changes in cross-market linkages after the economic shock to the US financial market. The supporting values are equal to: $\hat{\sigma}_{C}^{2} = 0.0006661542$ (the variance in the turbulent period in the US stock market) and $\hat{\sigma}_{PC}^{2} = 0.0000864396$ (the variance in the tranquil period in the US stock market), while the relative increase in the variance of the SPX returns, given by (2), is equal to $\delta = 6.706584$.

**Table 2.** Contemporaneous cross-correlations and adjusted correlations of daily logarithmic returns in pairs—the SPX/stock market index—subsamples: the pre-crisis and crisis.


Notes: The table presents the data received through the analysis of the complete sample period of January 2000–December 2019 (4623 days); the pre-crisis period of September 2006–November 2007 (290 days); and the crisis period of December 2007–February 2009 (290 days). The numbers in brackets are *p*-values. The Fisher *Z*-statistic tests the null hypothesis of no change in correlation. Critical value of Student's t distribution is 1.711 (at the 10% significance level).

The results in Table 2 for the crisis period indicate that the contemporaneous correlations between the US and other stock exchanges were higher than during the pre-crisis period, but the differences were small. In both periods, the values of contemporaneous correlations were higher in Western Europe than in CEE. The results of the Forbes and Rigobon methodology [33] show the absence of significant changes in cross-market linkages. The value of adjusted correlation between US and European stock markets decreased during the crisis. There is no reason to reject the null hypothesis that states that there are no significant changes in the adjusted correlation for all analyzed markets. For this method as well, the values of adjusted correlations were higher in Western Europe than in CEE.

Moreover, we take into consideration the dependencies in the complete sample (January 2000–mid-August 2020) as well as in two equally sized subsamples: the pre-COVID-19 period of 30 September 2019–11 March 2020 (103 days) and the COVID-19 period of 12 March 2020–14 August 2020 (103 days). As shown in Table 3, we analyze the changes in cross-market linkages after the COVID-19 shock to the US financial market. The supporting values are equal to: $\hat{\sigma}_{C}^{2} = 0.0008037915$ (the variance in the COVID-19 period in the US stock market) and $\hat{\sigma}_{PC}^{2} = 0.0002521314$ (the variance in the tranquil period in the US stock market), while the relative increase in the variance of the SPX returns, given by (2), is equal to $\delta = 2.187987$.


**Table 3.** Contemporaneous cross-correlations and adjusted correlations of daily logarithmic returns in pairs—the SPX/stock market index—subsamples: pre-COVID-19 and COVID-19.

Notes: The table presents the data received through the analysis of the complete sample period of January 2000–mid-August 2020 (4773 days); the pre-COVID-19 period of 30 September 2019–11 March 2020 (103 days); and the COVID-19 period of 12 March 2020–14 August 2020 (103 days). The numbers in brackets are *p*-values. The Fisher *Z*-statistic tests the null hypothesis of no change in correlation. Critical value of Student's t distribution is 1.711 (at the 10% significance level).

The results in Table 3 for the COVID-19 period indicate that the contemporaneous correlations between the US and other stock exchanges (except UKX) were higher than during the pre-COVID-19 period; however, the differences were small. These results provide support for the theory of Ferreira, Dionísio, Almeida, Quintino, and Aslam [31] that the pandemic crisis may be a factor for the intensification of the influence of US indices. Similar results were obtained in the study by Czech, Wielechowski, Kotyza, Benešová, and Laputková [17] and Aslam et al. [53], who emphasize that the COVID-19 pandemic had a strong impact on CEE stock markets. In both periods, the values of contemporaneous correlations were higher in Western Europe than in CEE. For DAX, WIG20, PX, and BUX, we reject the null hypothesis, which suggests the existence of changes in correlation. On the other hand, the results of the Forbes and Rigobon methodology [33] show the absence of significant changes in cross-market linkages. The value of adjusted correlation between US and European stock markets decreased during the pandemic. There is no reason to reject the null hypothesis that states that there are no significant changes in the adjusted correlation for all analyzed markets. For this method as well, the values of adjusted correlations were higher in Western Europe than in CEE.

We observed that, compared to the 2007–2009 crisis, contemporaneous correlations between the US and other stock exchanges increased significantly during the pre-COVID-19 and COVID-19 periods (Tables 2 and 3). In the case of the 2007–2009 crisis, we find one market (BUX) which indicates the lack of equality of correlation matrices, while during the COVID-19 period we find as many as four markets (DAX, WIG20, PX, BUX).

#### *5.2. Larntz–Perlman Procedure*

Table 4 summarizes the Larntz–Perlman test [36] performed on the SPX and the six European stock indices. We have reason to reject the null hypothesis (6), which suggests the stability of the correlation matrix via three adjacent sub-periods:



**Table 4.** Results of the Larntz–Perlman test.

#### *5.3. Mutual Information*

Figure 3 shows the outcome of average mutual information evolving in time. When the fixed-size sliding window equals 220 days, the average mutual information of European markets bounced after the fall that happened at the end of 2005, which is consistent with the mean linear correlation. Starting from March 2020, we can observe another soaring growth in the average mutual information of European markets. For the 1000-day fixed-size sliding window, the average mutual information showed an upward trend until 2013, when it peaked. It is worth noting that, starting from March 2020, we can see the growing tendency again; however, the UKX index is no longer so closely associated with other Western countries.

Our main interest is in analyzing the connection between the US equity markets and European stock exchanges in the financial crisis of 2007–2009 and during the COVID-19 pandemic. The results received by comparing the MI in pre-crisis, crisis, and post-crisis periods are shown in Figure 4. Except for Hungary's stock exchange, the MI between the US equity markets and other European stock indices is lower during the crisis in comparison to the pre-crisis period. We observe similar results for COVID-19 in comparison to the pre-COVID-19 period, except for Hungary's and Czech Republic's stock exchanges.
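The sliding-window computation described above can be sketched as follows. This is only an illustration: it assumes a simple equal-width histogram (plug-in) estimator with a hypothetical number of bins, which is not necessarily the estimator used for Figures 3 and 4, and the data are simulated.

```python
import math
import random
from collections import Counter

def discretize(values, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0   # guard against a constant series
    return [min(int((v - lo) / width), bins - 1) for v in values]

def mutual_information(x, y, bins=8):
    """Plug-in estimate of I(X; Y) in nats from a 2-D histogram."""
    sx, sy = discretize(x, bins), discretize(y, bins)
    n = len(sx)
    joint, px, py = Counter(zip(sx, sy)), Counter(sx), Counter(sy)
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def rolling_mi(x, y, window, bins=8):
    """Mutual information over overlapping fixed-size windows."""
    return [mutual_information(x[s:s + window], y[s:s + window], bins)
            for s in range(len(x) - window + 1)]

# Simulated data: y_dependent is a nonlinear function of x, so its linear
# correlation with x is close to zero while the mutual information is not.
random.seed(0)
x = [random.gauss(0.0, 1.0) for _ in range(600)]
y_dependent = [v * v for v in x]
y_independent = [random.gauss(0.0, 1.0) for _ in range(600)]
print(mutual_information(x, y_dependent), mutual_information(x, y_independent))
print(len(rolling_mi(x, y_dependent, window=220)))
```

The quadratic example shows why mutual information is useful alongside Pearson correlation: it picks up dependence that a linear measure misses.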

#### *5.4. Transfer Entropy*

Figure 5 presents quickly changing outcomes of the average transfer entropy. We can observe that the average transfer entropy of the US stock market index reaches higher levels in comparison to the other markets. When the fixed-size sliding window is 220 days, the average transfer entropy of the US stock market index before January 2009 soars, but the peaks that it exhibits are sharp and narrow. A similar situation can be observed in March 2020. When the fixed-size sliding window is 1000 days, the average transfer entropy of the US stock market index grows continuously, then starts to decline after 2009, and rises again in March 2020.

Figure 6 shows the TE values from the US equity markets to six European stock exchanges during the pre-crisis, crisis, post-crisis, pre-COVID-19, and COVID-19 periods. The TE from the US equity markets to Western Europe stock indices presents higher values than to CEE ones in the pre-crisis period. We observe the opposite situation in the pre-COVID-19 period. On the other hand, the TE from the US equity markets to CEE stock indices in the crisis period is higher than to Western Europe indices. In the COVID-19 period, the TE from the US equity markets to DAX and BUX stock indices was the highest. In the pre-crisis period, the TE from the US equity market to Poland is the weakest in comparison to other countries, but, during the crisis, it increased the most, reaching a level similar to Western Europe. On the other hand, in the pre-COVID-19 period, the TE from the US equity market to Germany is the weakest in comparison to other countries, but during the pandemic it increased the most. The TE from the US equity markets to selected European stock indices in the crisis period reaches a higher level in comparison to the pre-crisis period, with France being the exception.
Contrary to that, the TE from the US equity markets to selected European stock indices in the COVID-19 period reaches lower levels in comparison to the pre-COVID-19 period, with Germany being the exception. During the crisis, the TE from the US equity markets to the BUX index is the highest in the group of CEE countries and the UKX index in the group of Western Europe. In the post-crisis period, the TE from the US equity markets to other indices decreased dramatically, especially the BUX and UKX indices. During COVID-19, the TE from the US equity markets to the BUX index is the highest in the group of CEE countries and the DAX index in the group of Western Europe.

Based on the presented outcomes, we deduce that when the fixed-size sliding window equals 1000 days, the growth of mean linear correlations slows down considerably after 2009. At the same time, the average mutual information continues to rise until it peaks around 2013. Thus, we conclude that the stronger dependencies between all indices that can be observed after 2009 are due to the nonlinear effect. Similar results have been obtained by Gao and Mei [8] and Haluszczynski et al. [9].
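The directional nature of transfer entropy discussed above can be illustrated with a short sketch. It assumes a plug-in estimator with one lag of history and a quantile-based discretization into three symbols; these choices, the function names, and the simulated series are assumptions made for illustration and are not the settings behind Figures 5 and 6.

```python
import math
import random
from collections import Counter

def quantile_symbols(values, bins):
    """Assign each observation to one of `bins` equally populated classes."""
    order = sorted(range(len(values)), key=values.__getitem__)
    symbols = [0] * len(values)
    for rank, idx in enumerate(order):
        symbols[idx] = rank * bins // len(values)
    return symbols

def transfer_entropy(source, target, bins=3):
    """Plug-in estimate (nats) of the transfer entropy from source to target."""
    s, t = quantile_symbols(source, bins), quantile_symbols(target, bins)
    nxt, cur, src = t[1:], t[:-1], s[:-1]
    n = len(nxt)
    triple = Counter(zip(nxt, cur, src))
    cur_src = Counter(zip(cur, src))
    nxt_cur = Counter(zip(nxt, cur))
    cur_only = Counter(cur)
    te = 0.0
    for (a, b, c), count in triple.items():
        p_full = count / cur_src[(b, c)]        # p(next | current, source)
        p_own = nxt_cur[(a, b)] / cur_only[b]   # p(next | current)
        te += count / n * math.log(p_full / p_own)
    return te

# Simulated example: `follower` reacts to `leader` with a one-step delay.
random.seed(1)
leader = [random.gauss(0.0, 1.0) for _ in range(3000)]
follower = [0.0] + [0.8 * leader[i - 1] + 0.2 * random.gauss(0.0, 1.0)
                    for i in range(1, 3000)]
print(transfer_entropy(leader, follower), transfer_entropy(follower, leader))
```

Unlike correlation or mutual information, the estimate is asymmetric: it is large in the direction in which information actually flows and close to zero in the opposite direction.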

**Figure 3.** The average mutual information between each index and the rest of the indices using overlapping windows. The upper part is a 220-day fixed-size sliding window (**a**), and the one below is a 1000-day fixed-size sliding window (**b**).

**Figure 4.** The mutual information between the US stock index and six European stock indices during the selected periods.

**Figure 5.** The average transfer entropy between each index and the rest of the indices using overlapping windows. The upper part is a 220-day fixed-size sliding window (**a**), and the one below is a 1000-day fixed-size sliding window (**b**).

**Figure 6.** The transfer entropy from the US equity markets to six European equity markets during selected periods.

#### *5.5. Comparison of Results*

We would like to model linear and nonlinear behavior in financial time series through the evaluation of information on dynamic correlations. For that reason, we used not only linear Pearson correlation but also mutual information, which can be used both for linear and nonlinear connections, as well as transfer entropy, which allows one to examine nonlinear connections.

Table 5 shows the comparison of different methods used to measure the dependence between the US stock index and selected European stock indices. For each of the three methods, we compute the values for pre-crisis, crisis, post-crisis, pre-COVID-19, and COVID-19 periods. The correlation coefficient values range from 0.246 to 0.729. In the case of examined countries, there is a clear separation between two strongly connected groups: Western European indices and CEE indices. We recognize that Western Europe has higher linear correlation coefficient values (from 0.568 to 0.729) than CEE (from 0.246 to 0.658). The levels of correlation increased significantly in the pre-COVID-19 and COVID-19 periods in all markets (the highest for CAC and DAX indices from Western Europe and for PX and BUX from CEE in the COVID-19 period). The results confirm that the COVID-19 pandemic has led to a growth in European financial market risk, which is in line with Zhang et al. [12], Akhtaruzzaman et al. [13], Shehzad et al. [54], and Zaremba et al. [14]. It should be stressed that the amplitude of growth was much higher in CEE markets, which is similar to the findings of Topcu and Gulal [16] and Tilfani, Ferreira, and Boukfaoui [55]. The most stable level of correlation in all analyzed periods is presented by the UKX index (from 0.595 to 0.726) and the CAC index (from 0.598 to 0.729). On the other hand, BUX increased the most between pre-crisis and crisis periods (from 0.246 to 0.524). After the crisis, BUX began to behave like other CEE countries. Overall, relationships between local centers are greater within these groups than between them. These results are in line with those obtained by Stoica and Diaconașu [56] and Gradojević and Dobardžić [57]. The results demonstrated that regional market integration is strengthened in times of crisis or pandemic.


**Table 5.** Comparison of different methods used to measure the dependence between the US stock index and European indices during pre-crisis, crisis, post-crisis, pre-COVID-19, and COVID-19 periods.

Notes: The rows of a heat map represent stock indices in specific periods, and the columns represent the methods used to measure the dependence between the US stock index and six European stock indices during pre-crisis, crisis, post-crisis, pre-COVID-19, and COVID-19 periods. Each cell in the particular methods is colorized based on the values (from green for the lowest values to red for the highest ones).

As can be observed in Table 5, mutual information leads to conclusions similar to those obtained with linear correlation. For both methods, Western Europe is the region that attains the largest values. Furthermore, the highest values of mutual information are achieved in the pre-COVID-19 period for Western Europe and in the COVID-19 period for CEE regions. It is interesting to note that the transfer entropy presents slightly different results. The values of transfer entropy in CEE are higher (from 0.017 to 0.079) than in Western Europe (from 0.015 to 0.025), which can be especially observed in the crisis and pre-COVID-19 periods. According to our results, notable information cannot be expressed well by a linear measure, hence the usage of different methods that capture both linear and nonlinear correlations. In conclusion, our analysis suggests that stock indices quickly responded to the GFC as well as the COVID-19 pandemic, and these responses changed over time depending on the information flowing through markets.

#### **6. Discussion and Conclusions**

This study provides an analysis of the effect of the GFC and the COVID-19 pandemic on European stock markets. The main goal of this paper is to compare the risk transfer between US stock market indices and six European stock market indices before, during, and after the GFC, as well as before and during the COVID-19 outbreak. In our study, we also emphasize the differences in the correlation structure between CEE and Western European markets. We used a variety of methods to separately analyze the correlation structures for testing the linear and nonlinear structure of relationships in returns between the US stock index and selected European stock indices.

When testing connectedness during the crisis period, we find that the correlation between SPX and CEE indices grew more than the correlation between SPX and Western European indices. This is only a partial confirmation of earlier research [7], stating that the CEE stock exchanges are not more vulnerable to contagion, even if they are less liquid than Western European markets. Additionally, our findings stress that the amplitude of growth in the pre-COVID-19 period is much higher in CEE markets. Given slower economic growth and relatively illiquid capital markets, emerging markets probably have limited resources to cope with the pandemic.

Nevertheless, the relationship between the mean linear correlations of CEE markets fluctuates during the whole period. In the years 2009–2015, the mean linear correlation for WIG20 is higher than for other CEE indices, but, starting from 2016, the correlation index for the PX is higher than for the WIG20 and the BUX. In the analyzed period, the stock markets in CEE were not stable or resistant to crisis shocks. This result may be explained by the smaller integration of CEE stock markets with global capital markets. For investors, this means another source of risk diversification in CEE markets.

Compared to the GFC, our findings emphasize that the linear correlations between the S&P 500 and all European indices increased significantly in the pre-COVID-19 period. The negative impact of COVID-19 on stock markets continued or slightly increased by mid-August 2020. The results show that the COVID-19 pandemic has led to a growth in European financial market risk. These findings confirm those of earlier studies, such as Ferreira [58] and Grabowski [59]. An analysis of the volatility spillovers indicates that CEE markets are the recipients of volatility. As opposed to the previous research of Topcu and Gulal [16], our findings do not confirm that the influence of COVID-19 on emerging stock markets gradually fell and began to taper off by mid-April 2020.

The results that we obtained indicate that there are relatively significant differences between linear and nonlinear estimation. The transfer entropy from the US equity markets to CEE stock indices during the crisis is higher than to Western Europe indices. Before the crisis, the transfer entropy from the US equity market to Poland is the weakest compared to other countries, but, during the crisis, it increased the most. During the crisis, the transfer entropy from the US equity market to Poland is similar to Western Europe. Additionally, we infer that nonlinear effects lead to stronger dependencies between all indices after 2009. Starting from the COVID-19 pandemic period, we can observe soaring growth in the average mutual information and transfer entropy of all European markets.

Our study of European stock markets shows that cases of intensified and broken links between markets are particularly visible in CEE countries. This evidence may suggest that emerging equity markets are increasingly integrated into mature markets, thus becoming dependent on certain crises and pandemic outbreaks. This may be explained by short-term capital flows from less stable markets and by changing political circumstances. This research should provide an interesting insight for potential investors diversifying their stock portfolio. Our research has implications for risk management and asset pricing. Although CEE countries are considered a homogeneous group by international investors, the financial markets of these countries show varying degrees of integration. Therefore, from a portfolio diversification perspective, less developed markets may offer risk diversification opportunities that investors can capitalize on. For the purpose of portfolio risk management, information about the linkages between markets can be important for investors in making decisions. In addition, information on the increasing connectedness between markets may be relevant when portfolios are reallocated.

We believe that this study may be a benchmark for financial market network structure for further research in this area. Therefore, future researchers should test whether the results remain insignificant over a longer time horizon. Additionally, similar to the vast majority of research on contagion in emerging economies, our research focuses on the analysis of daily and weekly data. However, it would be worthwhile to investigate the connectedness of European stock markets with high-frequency information.

**Author Contributions:** Conceptualization, R.K.; methodology, R.K.; software, S.U.; validation, S.U.; formal analysis, S.U.; investigation, S.U.; resources, S.U.; data curation, S.U.; writing—original draft preparation, R.K. and S.U.; writing—review and editing, R.K. and S.U.; visualization, S.U.; supervision, R.K.; project administration, R.K.; funding acquisition, R.K. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Data Availability Statement:** All the data supporting reported results come from: https://stooq.pl (accessed on 9 February 2022).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


### *Article* **Replication in Energy Markets: Use and Misuse of Chaos Tools**

**Loretta Mastroeni \*,† and Pierluigi Vellucci †**

Department of Economics, University of Roma Tre, 00145 Roma, Italy; pierluigi.vellucci@uniroma3.it

**\*** Correspondence: loretta.mastroeni@uniroma3.it

† These authors contributed equally to this work.

**Abstract:** As pointed out by many researchers, replication plays a key role in the credibility of applied sciences and the confidence in all research findings. With regard, in particular, to energy finance and economics, replication papers are rare, probably because they are hampered by inaccessible data, but their aim is crucial. We consider two ways to avoid misleading results on the ostensible chaoticity of price series. The first one is represented by the proper mathematical definition of chaos and the related theoretical background, while the second is represented by the hybrid approach that we propose here, which consists of treating the dynamical system underlying the price time series as a deterministic system with noise. We find that both chaotic and stochastic features coexist in the energy commodity markets, although the misuse of some tests in the established practice in the literature may say otherwise.

**Keywords:** nonlinear dynamics; chaos; butterfly effect; energy futures

**JEL Classification:** C650; G140; Q470

#### **1. Introduction**

As pointed out by many researchers (see, for example, [1]), replication is the key to credibility in applied sciences and confidence in all research findings. With regard in particular to energy finance and economics, replication papers are rare, probably because they are hampered by inaccessible data [1], but their aim is crucial and twofold. First, they ask whether the old results hold up when more recent data are added and the methods are updated and, if not, why this is so. Second, they take into account a large number of recent (or older) articles to check whether the results are still valid when compared with other contributions.

For instance, the same data may be examined by different authors with different methodological approaches. Can the difference in results be explained? Is it possible to distinguish credible results from others that are less so?

Recently, we started to focus on this question by considering, in particular, the findings of the so-called "chaos theory" on the energy commodity markets [2–4]. An important reason to be interested in chaotic behavior is that it resembles random behavior (even if they cannot be treated as the same).

In particular, it is interesting to know whether the fluctuations in many time series are really random or are instead the product of a (complex) deterministic system [3–6]. The behavior of a completely random system is not predictable. By contrast, if the system were completely deterministic, even if chaotic, its behavior could be predicted in the short term.

It is straightforward that evidence on deterministic chaos would have important implications for regulators and short-term trading strategies, in all financial markets and in particular in energy markets.

Energy commodity prices have been examined over the last 20 years to detect the presence of chaos as an alternative to stochastic models, but they revealed contrasting results: some papers highlighted the presence of chaos, while some others did not, and this has led to a gradual loss of interest in the chaos theory applied to energy commodity markets. For example, the papers we have examined in this field (we have selected only those relating to crude oil, diesel, natural gas and copper) are refs. [7–17], but eight of them fall before 2009 and only three after. (For the discussion of the previous literature, see [2–4]).

**Citation:** Mastroeni, L.; Vellucci, P. Replication in Energy Markets: Use and Misuse of Chaos Tools. *Entropy* **2022**, *24*, 701. https://doi.org/10.3390/e24050701

Academic Editor: Joanna Olbryś

Received: 16 April 2022; Accepted: 14 May 2022; Published: 16 May 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The conflicting results of identifying chaos in the energy commodity markets can be seen as a replication problem.

Hence, in this paper, we highlight the role of the theoretical assumptions of the methods employed in the literature on energy markets. In particular, we show that the mathematical definition of chaos and the theoretical background recalled and discussed here help to avoid misleading results on the ostensible chaoticity of the price series.

After showing the importance of the theoretical background in the light of the problem of replication, we also discuss the hybrid approach introduced in [3,4]—i.e., consisting in considering the dynamical system underlying the price time series as a deterministic system with noise—in order to re-evaluate the presence of a chaotic feature in the energy commodity markets. This hybrid approach is based on the introduction of tools that take into account the co-existence of stochastic and chaotic behavior in the same time series, such as modified correlation entropy, noise level estimation and recurrence analysis.

The result is that chaotic characteristics coexist with stochastic ones in the time series of energy commodity prices.

The remainder of this article is structured as follows. Section 2 introduces the chaos definition. Section 3 presents the tools we employ in our analysis, while Section 4 discusses the results. Finally, Section 5 provides the conclusions of our paper.

#### **2. The "Core" of Chaos: Its Definition**

Who remembers Ian Malcolm, the mathematician of Jurassic Park? In a scene where he tries to explain the chaos theory to Ellie Sattler, he says: "It simply deals with unpredictability in complex systems. The shorthand is the Butterfly Effect. A butterfly can flap its wings in Peking and in Central Park you get rain instead of sunshine." That is very effective, simple and straightforward.

The chaos definition, however, goes deeper. According to one of the most widely accepted definitions of chaos, introduced by Robert L. Devaney [18] (hence known as *Devaney's chaos definition*), sensitive dependence on initial conditions, topological transitivity and density of periodic points are the "ingredients" of chaos (for the self-consistency of Devaney's definition, see the references in [2]). The intuitive meaning of sensitive dependence on initial conditions is straightforward: tiny differences become amplified. It is the most popular property of a chaotic system. Also called "butterfly effect", it is immediate enough to be cited in a popular film, as we said. This is probably why the "butterfly effect" becomes so predominant that in many contexts, it constitutes, itself, a definition of chaos. There is a lot of numerical evidence for this experimental definition of chaos, but it is not satisfactory, both theoretically and experimentally.

From a theoretical point of view, see, for example, the counterexample 3.3 introduced by Martelli et al. in [19]. Their counterexample shows that, although the "experimental" definition of chaos is easy to check, it defines as chaotic systems those which are not.

As far as the experimental point of view is concerned, however, it has been noted that the time series generated by stochastic systems can also show a sensitive dependence on the initial conditions [20–22] and, since chaos theory is an alternative paradigm to the stochastic approach, a problem arises with the definitions—what is chaotic and what is not.

In addition, while some tests for sensitive dependence on initial conditions have been introduced, for the other two properties that build the Devaney chaos definition, we have far fewer tests, and further, no tests for transitivity conditions of the chaos definition have been found [23].

For this reason, it is inappropriate to talk about chaos tests. We should instead refer to the specific property we are going to test. For example, all the papers considered in this article [7–17] resort to the experimental definition of chaos, testing sensitive dependence on initial conditions. However, the implications that the butterfly effect may have in the energy markets make this property interesting to study, as remarked in [2], but. . . how?

Is there a dichotomy between the butterfly effect and stochastic features? Or is it possible to think of a paradigm that can include both? The answer is that such a paradigm is possible: the dichotomy does not need to be a strict rule, as proved in [3,4]. Hence, in the following, we propose a systematic approach to identify the correct tests to work with in this "hybrid" framework.

#### **3. Methodologies**

In this paper, entropy and recurrence analysis tools represent the key methodologies to assess the presence of the butterfly effect. Moreover, we extend some of them in order to deal with the coexistence of chaotic and stochastic behaviors.

In the following, *p<sub>t</sub>* and *κ<sub>t</sub>* = ln(*p<sub>t</sub>*/*p*<sub>*t*−1</sub>) are, respectively, the price and the log return at time *t*. The time series we will work on is defined as follows: {*κ<sub>t</sub>*, *t* = 1, 2, …, *n*}, *n* ∈ ℕ.

#### *3.1. Phase Space Reconstruction*

Embedding the time series in a phase space is an important research topic on chaotic time series analysis [24]. In this case, the time evolution of returns is represented by the dynamical system that comes out of the phase space independent variables. The asymptotic behavior of the dynamical system is described by an *attractor*, whose dimension provides a measure of the minimum number of independent variables able to describe the dynamical system.

The scalar time series is topologically equivalent to the attractor, which can be reconstructed from a time series by using the method of the time delay coordinate [25,26]. The reconstructed attractor of the original system is given by the vector sequence

$$\zeta(i) = \left( \kappa\_{i}, \kappa\_{i+\tau}, \kappa\_{i+2\tau}, \dots, \kappa\_{i+(m-1)\tau} \right) \tag{1}$$

where *m* is the embedding dimension, and *τ* is an appropriate time delay.
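As a minimal sketch of the time delay reconstruction in Equation (1), the following code computes log returns from prices and builds the delay vectors. The function names and the price values are hypothetical and serve only to illustrate the indexing.

```python
import math

def log_returns(prices):
    """Log returns ln(p_t / p_{t-1}) of a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def delay_embed(series, m, tau):
    """Delay vectors (x_i, x_{i+tau}, ..., x_{i+(m-1)tau}) as in Equation (1)."""
    n_vectors = len(series) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this choice of m and tau")
    return [tuple(series[i + j * tau] for j in range(m))
            for i in range(n_vectors)]

prices = [50.0, 51.0, 49.5, 50.2, 52.0, 51.3, 50.8]   # hypothetical prices
returns = log_returns(prices)
print(delay_embed(returns, m=3, tau=2))
```

A series of length *n* yields *n* − (*m* − 1)*τ* delay vectors, so large values of *m* and *τ* shorten the reconstructed trajectory.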

The choice of the time delay *τ* could be a potential issue. For example, the authors in [27] showed that the chaos measures estimation for stock price data is affected by the wrong choice of *τ*.

The authors in [8] estimated the optimal time delay as the one where average mutual information reaches its first minimum, obtaining a time lag greater than 1.
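The first-minimum criterion can be sketched as follows. The histogram estimator, the number of bins, the maximum lag, and the test signal are all assumptions made for illustration; the function returns `None` when no interior local minimum is found within the scanned lags.

```python
import math
import random
from collections import Counter

def average_mutual_information(series, lag, bins=16):
    """Histogram estimate (nats) of the AMI between x_t and x_{t+lag}."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / bins or 1.0
    symbols = [min(int((v - lo) / width), bins - 1) for v in series]
    sx, sy = symbols[:-lag], symbols[lag:]
    n = len(sx)
    joint, px, py = Counter(zip(sx, sy)), Counter(sx), Counter(sy)
    return sum(c / n * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def first_local_minimum(values):
    """Index of the first strict interior local minimum, or None."""
    for k in range(1, len(values) - 1):
        if values[k - 1] > values[k] < values[k + 1]:
            return k
    return None

def select_delay(series, max_lag=30, bins=16):
    """Time delay at which the AMI reaches its first local minimum."""
    ami = [average_mutual_information(series, lag, bins)
           for lag in range(1, max_lag + 1)]
    k = first_local_minimum(ami)
    return None if k is None else k + 1   # ami[0] corresponds to lag 1

# Hypothetical test signal: a noisy oscillation.
random.seed(2)
signal = [math.sin(2.0 * math.pi * t / 40.0) + 0.1 * random.gauss(0.0, 1.0)
          for t in range(1000)]
print(select_delay(signal))
```

In practice the estimated AMI curve can be noisy, so the first local minimum is often checked visually as well.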

In [3,4], we employed the average mutual information (AMI) technique to select a proper value of *τ*, taking the first minimum of the AMI function, as done in [8]. The method of false nearest neighbors (FNN), introduced by [28], is an algorithm to estimate the minimal embedding dimension *m*. Let *r* be the threshold on the distance between two neighboring points, *k*(*i*) be the index of the time series element for which |*ζ*(*k*(*i*)) − *ζ*(*i*)| is minimal, *ζ*(*k*(*i*))<sup>(*m*)</sup> be the closest neighbor to *ζ*(*i*) in *m* dimensions, *σ* be the standard deviation of the data, and Θ(·) the Heaviside step function, i.e.,

$$\Theta(\mathbf{x}) = \begin{cases} 0, & \mathbf{x} < 0, \\ 1, & \mathbf{x} \ge 0. \end{cases}$$

Hence, the *false nearest neighbor* (FNN) metric is defined as

$$\text{FNN}(r) = \frac{\sum\_{i=1}^{n-m-1} \Theta\left(\frac{|\zeta(i)^{(m+1)} - \zeta(k(i))^{(m+1)}|}{|\zeta(i)^{(m)} - \zeta(k(i))^{(m)}|} - r\right) \Theta\left(\frac{\sigma}{r} - |\zeta(i)^{(m)} - \zeta(k(i))^{(m)}|\right)}{\sum\_{i=1}^{n-m-1} \Theta\left(\frac{\sigma}{r} - |\zeta(i)^{(m)} - \zeta(k(i))^{(m)}|\right)}.\tag{2}$$

A proper value of *m* can be selected by imposing a threshold FNN∗ (in our case FNN∗ = 0.5%, as done in [3,4]) so that, if FNN is larger than FNN∗, the neighbor is false. Since the FNN decreases with the threshold *r*, this is the equivalent of selecting as the embedding dimension the minimum value of *m* such that FNN < FNN∗.
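The selection rule above can be sketched with a brute-force implementation of the FNN fraction. Several choices here are assumptions rather than details taken from the text: the Euclidean distance, the use of the standard deviation of the data in the second step function, the skipping of points whose nearest neighbor is at zero distance, and the default value of *r*. The test series are simulated.

```python
import math
import random

def delay_embed(series, m, tau):
    """Delay vectors of dimension m with time delay tau."""
    n_vectors = len(series) - (m - 1) * tau
    return [tuple(series[i + j * tau] for j in range(m))
            for i in range(n_vectors)]

def fnn_fraction(series, m, tau, r):
    """Fraction of false nearest neighbors when going from m to m + 1."""
    mean = sum(series) / len(series)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    high = delay_embed(series, m + 1, tau)
    low = delay_embed(series, m, tau)[:len(high)]
    false_count = candidates = 0
    for i, point in enumerate(low):
        # nearest neighbor of point i in m dimensions (brute force)
        d_low, k = min((math.dist(point, other), j)
                       for j, other in enumerate(low) if j != i)
        if d_low == 0.0 or d_low > sigma / r:
            continue                      # second step function is zero
        candidates += 1
        if math.dist(high[i], high[k]) / d_low >= r:
            false_count += 1              # neighbor separates in m + 1 dims
    return false_count / candidates if candidates else 0.0

def minimal_embedding_dimension(series, tau, r=10.0, fnn_star=0.005, max_m=10):
    """Smallest m whose FNN fraction falls below the threshold fnn_star."""
    for m in range(1, max_m + 1):
        if fnn_fraction(series, m, tau, r) < fnn_star:
            return m
    return None

# Simulated series: a deterministic logistic map and uniform noise.
x, logistic = 0.3, []
for _ in range(300):
    x = 3.9 * x * (1.0 - x)
    logistic.append(x)
random.seed(3)
noise = [random.random() for _ in range(300)]
print(fnn_fraction(logistic, 1, 1, 10.0), fnn_fraction(noise, 1, 1, 10.0))
print(minimal_embedding_dimension(logistic, tau=1))
```

For the one-dimensional deterministic map the neighbors stay close when a coordinate is added, whereas for noise almost every neighbor turns out to be false.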

#### *3.2. Modified Correlation Entropy*

Let {*ζ*(*i*)} be the result of the phase space reconstruction described by Equation (1). The authors in [29] showed that the Kolmogorov–Sinai (KS) entropy can be approximated by means of the correlation sum

$$C\_{m}(r) = \frac{1}{n(n-1)} \sum\_{i,j=1 \atop i \neq j}^{n} \Theta(r - \|\zeta(i) - \zeta(j)\|), \tag{3}$$

where the distance metric is given by the Euclidean norm. From Equation (3), it is possible to achieve an early estimate of the KS entropy

$$K \simeq \frac{1}{\tau} \ln \frac{C\_m(r)}{C\_{m+1}(r)} \tag{4}$$

and its adjusted estimation

$$K \simeq \frac{1}{\tau} \ln \frac{C\_m(r)}{C\_{m+1}(r)} - \frac{D}{2\tau} \ln \frac{m+1}{m}, \tag{5}$$

given by [30], where *D* is the correlation dimension.
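A brute-force sketch of the correlation sum and of the unadjusted entropy estimate is given below. It assumes that the prefactor of the logarithm is the reciprocal of the time delay *τ*, as in the standard correlation entropy estimate, and it truncates the *m*-dimensional vectors so that both sums run over the same index pairs; both choices, and the simulated series, are assumptions for illustration.

```python
import math

def delay_embed(series, m, tau):
    """Delay vectors of dimension m with time delay tau."""
    n_vectors = len(series) - (m - 1) * tau
    return [tuple(series[i + j * tau] for j in range(m))
            for i in range(n_vectors)]

def correlation_sum(vectors, r):
    """Fraction of distinct pairs of vectors closer than r (Euclidean norm)."""
    n = len(vectors)
    close = sum(1 for i in range(n) for j in range(i + 1, n)
                if math.dist(vectors[i], vectors[j]) <= r)
    return 2.0 * close / (n * (n - 1))

def correlation_entropy(series, m, tau, r):
    """Unadjusted estimate (1 / tau) * ln(C_m(r) / C_{m+1}(r))."""
    high = delay_embed(series, m + 1, tau)
    low = delay_embed(series, m, tau)[:len(high)]   # same index pairs
    return math.log(correlation_sum(low, r) / correlation_sum(high, r)) / tau

# Simulated deterministic series (logistic map), for illustration only.
x, logistic = 0.3, []
for _ in range(300):
    x = 3.9 * x * (1.0 - x)
    logistic.append(x)
print(correlation_entropy(logistic, m=2, tau=1, r=0.2))
```

The double loop costs a number of operations quadratic in the series length, which is acceptable for a sketch but slow for long price series.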

Nevertheless, the computation of the correlation sum is affected by noise, which introduces errors into these formulas, even though they have been widely used in the literature so far.

The authors in [31] introduced the *modified correlation entropy* (MCE), which estimates the KS entropy for noisy time series. It is based on the correlation integral derived in [32] and assumes the presence of Gaussian additive noise.

#### *3.3. Noise Level*

Let 0.1 = *r*<sub>1</sub> < *r*<sub>2</sub> < ··· < *r<sub>i</sub>* < ··· < *r<sub>L</sub>* = 0.3 with a uniform step Δ*r* = *r*<sub>*i*+1</sub> − *r<sub>i</sub>*. The noise level is estimated by means of a linear least-squares method

$$\sigma^2 = \frac{\sum\_{i=2}^{L-2} (v\_{i+1} - v\_i)(u\_{i+1} - u\_i)}{2\sum\_{i=2}^{L-2} (u\_{i+1} - u\_i)^2} \,. \tag{6}$$

as obtained in [33]. It is based on an auxiliary time series (*u<sub>i</sub>*, *v<sub>i</sub>*), *i* = 1, …, *L*

$$\begin{aligned} u\_i &= \frac{(m-1)\Delta r(c\_i - c\_{i-1}) - r\_i(c\_{i-1} - 2c\_i + c\_{i+1}) - r\_i(c\_i - c\_{i-1})^2}{r\_i(\Delta r)^2} \\ v\_i &= r\_i \frac{c\_i - c\_{i-1}}{\Delta r} \end{aligned} \tag{7}$$

where *c<sub>i</sub>* = ln *C*<sub>0</sub>(*r<sub>i</sub>*).
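Equations (6) and (7) can be sketched in code as follows. The sketch assumes that the auxiliary series is evaluated only at interior grid points, where both neighboring values of *c* exist, so that the differences in Equation (6) run over those points; the input values of *c* are hypothetical and are not correlation sums computed from real data.

```python
import math

def auxiliary_series(r, c, m):
    """Auxiliary series (u_i, v_i) of Equation (7) at interior grid points."""
    dr = r[1] - r[0]                          # uniform step of the grid
    u, v = [], []
    for i in range(1, len(r) - 1):
        first = c[i] - c[i - 1]               # backward first difference
        second = c[i - 1] - 2.0 * c[i] + c[i + 1]
        u.append(((m - 1) * dr * first - r[i] * second - r[i] * first ** 2)
                 / (r[i] * dr ** 2))
        v.append(r[i] * first / dr)
    return u, v

def noise_variance(u, v):
    """Least-squares estimate of sigma^2 from Equation (6)."""
    du = [b - a for a, b in zip(u, u[1:])]
    dv = [b - a for a, b in zip(v, v[1:])]
    return sum(x * y for x, y in zip(dv, du)) / (2.0 * sum(x * x for x in du))

# Hypothetical inputs: a grid from 0.1 to 0.3 and made-up values of c.
grid = [0.1 + 0.01 * i for i in range(21)]
c_values = [2.0 * math.log(x) + 0.5 * x for x in grid]
u, v = auxiliary_series(grid, c_values, m=3)
print(noise_variance(u, v))
```

Equation (6) is the slope of a regression of the differenced *v* on the differenced *u*, divided by two, so a series satisfying *v* = 2*σ*²*u* + const returns *σ*² exactly.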

#### *3.4. Recurrence Analysis*

*Recurrence quantification analysis* (RQA) can be considered as another important tool in chaotic time series analysis [34,35]. The *recurrence plot* (RP), introduced by [36], is defined by the matrix

$$M\_{ij} = \Theta(\epsilon - \|\zeta(i) - \zeta(j)\|), \tag{8}$$

where *ε* is a tolerance parameter to be chosen and *ζ*(*i*) is derived by Equation (1). Since the distance is symmetric, we have that the matrix *M* is in turn symmetric and, then, the recurrence plot is symmetric with respect to the diagonal, by definition.

The parameter *ε*, which determines the density of RP, can be selected according to the criterion introduced in [37]:

$$\epsilon = k \cdot \max\_{i,j} \|\zeta(i) - \zeta(j)\|, \tag{9}$$

provided that *k* < 10% [34,38,39].

Related to the RP is the *recurrence rate* [34], which can be defined as follows:

$$RR(\tau) = \frac{1}{N - \tau} \sum\_{i=1}^{N-\tau} M\_{i,i+\tau}. \tag{10}$$
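The recurrence matrix, the tolerance rule, and the recurrence rate can be sketched as follows. The sketch reads the sum in Equation (10) as running along the diagonal offset by *τ* from the main diagonal, i.e., over the entries *M* with indices (*i*, *i* + *τ*); this reading, the function names, and the toy input are assumptions for illustration.

```python
import math

def recurrence_matrix(vectors, k=0.05):
    """Binary recurrence matrix with tolerance k times the maximum distance."""
    distances = [[math.dist(a, b) for b in vectors] for a in vectors]
    epsilon = k * max(max(row) for row in distances)
    return [[1 if d <= epsilon else 0 for d in row] for row in distances]

def recurrence_rate(matrix, tau):
    """Density of recurrence points on the diagonal offset by tau."""
    n = len(matrix)
    return sum(matrix[i][i + tau] for i in range(n - tau)) / (n - tau)

# Toy input: a trajectory alternating between two states.
vectors = [(0.0,), (1.0,), (0.0,), (1.0,)]
matrix = recurrence_matrix(vectors, k=0.05)
print([recurrence_rate(matrix, tau) for tau in range(3)])
```

For the alternating trajectory, recurrences appear only at even offsets, which is the kind of periodic structure the recurrence rate is meant to expose.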

The *recurrence quantification analysis* contains several measures of complexity. Its aim is to go beyond the visual impression yielded by RPs [34].

Some of them resort to the histogram *P*(*l*) of diagonal lines of length *l*, i.e.,

$$P(l) = \sum\_{i,j=1}^{N} \left(1 - M\_{i-1,j-1}\right) \left(1 - M\_{i+l,j+l}\right) \prod\_{k=0}^{l-1} M\_{i+k,j+k}.$$

As recalled in [34], "processes with uncorrelated or weakly correlated, stochastic or chaotic behaviour cause none or very short diagonals, whereas deterministic processes cause longer diagonals and less single, isolated recurrence points". From this, it is natural to take

$$DET = \frac{\sum\_{l=l\_{\min}}^{N} l P(l)}{\sum\_{l=1}^{N} l P(l)} \tag{11}$$

as a measure of the *determinism* of the system, i.e., the percentage of recurrence points which form diagonal structures (of at least length *l*<sub>min</sub>) over the total number of recurrence points.

Moreover, given the histogram *P*(*v*) of vertical lines of length *v*, i.e.,

$$P(v) = \sum\_{i,j=1}^{N} \left(1 - M\_{i,j-1}\right) \left(1 - M\_{i,j+v}\right) \prod\_{k=0}^{v-1} M\_{i,j+k},$$

it is possible to define the percentage of recurrence points which form vertical structures in the RP, the so-called *laminarity*:

$$LAM = \frac{\sum\_{v=v\_{\min}}^{N} vP(v)}{\sum\_{v=1}^{N} vP(v)}$$

whereas the average length of vertical structures is given by

$$TT = \frac{\sum\_{v=v\_{\min}}^{N} vP(v)}{\sum\_{v=v\_{\min}}^{N} P(v)}$$

and is called the *trapping time*.
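Laminarity and trapping time can be computed from the vertical-line histogram in the same spirit. The sketch below assumes that a vertical line is a maximal run of recurrence points along the second index *j* for fixed *i*, and that entries outside the matrix are non-recurrent; the function names are hypothetical.

```python
import numpy as np
from collections import Counter

def vertical_histogram(M):
    """P(v): number of vertical lines of exact length v in M."""
    M = np.asarray(M)
    hist = Counter()
    for row in M:  # fixed i, the second index j varies along the row
        run = 0
        for b in row:
            if b:
                run += 1
            elif run:
                hist[run] += 1
                run = 0
        if run:
            hist[run] += 1
    return hist

def laminarity_and_trapping_time(M, v_min=2):
    """Return (LAM, TT) for lines of length at least v_min."""
    hist = vertical_histogram(M)
    total_points = sum(v * c for v, c in hist.items())
    long_points = sum(v * c for v, c in hist.items() if v >= v_min)
    long_lines = sum(c for v, c in hist.items() if v >= v_min)
    lam = long_points / total_points if total_points else float("nan")
    tt = long_points / long_lines if long_lines else float("nan")
    return lam, tt
```

Since *M* is symmetric, scanning rows or columns gives the same histogram, so the choice of axis here is only a matter of convention.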

#### **4. Implications of the New Approach**

We now recall the main findings of [3,4] and discuss them within the framework of our approach, i.e., the coexistence of the stochastic and chaotic paradigms.

Before embracing this hybrid paradigm for energy markets, it is very important to determine the two embedding parameters for the reconstruction of the phase space, namely, the time delay *τ* and the embedding dimension *m*. In Table 1, we recall the embedding parameters of some of the futures contracts analyzed in [4], as collected by the U.S. Energy Information Administration (EIA). As we can see, the optimal time lags are not always equal to 1.



According to our framework, the impact of the stochastic component can be initially estimated through the modified correlation entropy. An example of MCE estimation is depicted in Figure 1, where MCE and CE are compared depending on the threshold *r* [4].

**Figure 1.** MCE vs. CE; Cushing Crude Oil Contract 1 (on the **left**) and Natural Gas (on the **right**).

In Figure 1, we see the following:


Since MCE ≡ CE for noise-free data, these two points show the relevance of the stochastic component in our dataset of prices. The steadiness of MCE is typical of deterministic systems with noise (see Figure 11.3 of [40]).

Connected to this point is the noise level estimation. A few examples of noise level estimation are reported in Table 2; as discussed in [4], they show that the level of noise cannot be ignored.

**Table 2.** Noise level estimation.


We now corroborate these insights through the use of recurrence analysis. Figure 2 shows an example of the recurrence plot for the copper dataset examined in [3], with the tolerance *ε* set by *k* = 6%.

**Figure 2.** Recurrence plot, copper (6%).

In Figure 2, black rectangles and single dots alternate along the entire picture. In the recurrence analysis, single points denote noisy behavior [34] because they indicate strongly uncorrelated, fluctuating data, whereas black rectangles characterize *laminar* behaviors. The latter are indicative of states that do not change or change slowly for some time [34,41]. Such periods are related to *intermittency*, a behavior of dynamical systems which has been extensively studied in the literature [42–45].

In economics and finance, intermittency results in the irregular alternation of phases of boom and of depression [46,47].

The authors in [48] showed "how economic intermittency is induced by an attractor merging crisis and how to recognize different recurrent patterns in the intermittent time series of economic cycles by separating them into laminar (weakly chaotic) and bursty (strongly chaotic) phases". Moreover, intermittency is related to the emergence of bubbles [3,35,49,50].

Intermittency is one of the common routes to chaos [51]. In such a state, the dynamical system switches between two different kinds of behavior called phases. Complex systems which exhibit intermittency can be described by a control parameter *p*, with a critical threshold *pT* that marks the switch between different dynamic regimes [51]. For example, the dynamical system underlying the copper time series is such that *p* > *pT*, because the laminar phases in Figure 2 are still clearly recognizable [3].

White areas or bands in the RPs are caused by abrupt changes and extreme events in the dynamics (*disrupted* typology [36]). They are indicative of transient activities and may reflect an underlying state change [34]. White bands with no recurrent points appear in Figure 2.

Pomeau and Manneville introduced three types of intermittency [42], whose structures were examined afterwards in [52]. According to [52], it is possible to distinguish the kind of intermittency shown by the system by looking at the patterns of RPs. Hence, following [52], the pattern in Figure 2 suggests the presence of type I intermittency (Figure 3).

**Figure 3.** Type I intermittency, positioning of the rectangles in the RP (see Figure 8 in [52]).

Quite different is the RP depicted in Figure 4, for natural gas. We can spot the presence of a larger number of black rectangles, even if they are smaller.

**Figure 4.** Recurrence plot, natural gas (6%).

Then it is clear that, in this context, we cannot talk about purely chaotic (or stochastic) time series and that the energy commodity markets follow instead a hybrid paradigm—both chaotic and stochastic. However, do you remember Ian Malcolm's words? Rearranging them, *the shorthand of chaos is the butterfly effect*. In Section 2, we explained why this cannot be true, and the energy commodity markets give us a *counterexample*. Actually, we estimated the maximal Lyapunov exponent (MLE) for some of the datasets previously examined in [3,4] obtaining: MLE (copper) = −0.78; MLE (oil contract 1) = −0.68; MLE (natural gas) = 0.14. From these findings, according to the experimental definition of chaos, we may infer that the natural gas time series is chaotic [2].

MCE, noise level estimation and RP tell us a different story: the stochastic component is too large to be neglected. This result is also confirmed by the measure for determinism in Equation (11). For natural gas, DET = 0.22, which denotes a very large stochastic component. The choice of *lmin* = 10 satisfies the suggestions contained in [34,40]; the choice of *ε* (*k* = 6%) follows the criterion fixed by Equation (9).

#### **5. Conclusions**

As pointed out by many researchers, replication is the key to credibility in applied sciences and to confidence in all research findings. In energy finance and economics in particular, replication papers are rare, probably because they are hampered by inaccessible data, but their aim is crucial and twofold. First, they ask whether earlier results hold up when more recent data and updated methods are added and, if not, why this is so. Second, they take into account a large number of recent (or older) articles to check whether the results are still valid when compared with other contributions.

While in [3,4] we proved that the contrasting results in chaos theory applied to energy economics are due to replication issues, in this paper, we consider two ways to avoid misleading results on the ostensible chaoticity of price series. The first is the proper mathematical definition of chaos and the related theoretical background; the second is the hybrid approach that we propose here, which consists in considering the dynamical system underlying the price time series as a superposition of deterministic and stochastic systems. This hybrid approach is based on the introduction of tools that take into account the coexistence of stochastic and chaotic behaviors in the same time series, such as modified correlation entropy, noise level estimation and recurrence analysis.

We find that chaotic and stochastic features coexist in the energy commodity markets, although the misuse of some tests established in the literature, such as CE or MLE, may suggest otherwise.

Our results are in line with the seminal paper by Barnett and Serletis who, more than 20 years ago, conjectured that controversies concerning the application of chaos theory in economics "might stem from the high noise level that exists in most aggregated economic time series and the relatively low sample sizes that are available with economic data" [53]. However, we should observe that the long debate produced by this paper did not answer the question, and, instead, papers dealing with the existence of chaos in economic and financial data continued to be published in the subsequent years [3,4]. Moreover, we do not completely agree with the conclusions enclosed in [53]: "However, it also appears that the controversies are produced by the nature of the tests themselves, rather than by the nature of the hypothesis, since linearity is a very strong null hypothesis, and hence should be easy to reject with any test and any economic or financial time series on which an adequate sample size is available". We do not believe that "the controversies are produced by the nature of the tests themselves", and instead we showed here that it would be more correct to speak of the superposition of chaotic and stochastic systems.

The consequences of these findings, though not investigated here, deserve further study and suggest, for future works, the adoption of different approaches to predict the behavior of energy commodity prices.

As for future works, artificial intelligence (AI) methods, such as machine learning, offer new possibilities to forecast energy consumption prices. Unlike conventional algorithms, which tend to follow explicit instructions to perform a specific task, machine learning (ML) takes into account various context variables and their mutual relationship while training. For example, in price prediction, supervised learning algorithms can already produce good results, which in turn are applied to time series data. There are already several studies on the predictability of time series data for various applications, including in the energy sector [54–57].

Future work should therefore assess these AI/ML-driven techniques for a robust evaluation and estimation of energy consumption prices.

**Author Contributions:** Conceptualization, L.M. and P.V.; Data curation, L.M. and P.V.; Formal analysis, L.M. and P.V.; Validation, L.M. and P.V.; Writing—original draft, L.M. and P.V.; Writing review & editing, L.M. and P.V. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

