**1. Introduction**

A proper risk assessment is one of the key prerequisites of any prospective financial investment. Even for an asset of moderate volatility, underestimating the probability of occurrence of an event of a given magnitude can lead to severe outcomes. Among the methods of dealing with risk assessment is the determination of a correct probability distribution function for asset price fluctuations in order to construct an adequate model of that asset's price dynamics. This issue has been of central interest since the early years of econometrics. It was Bachelier who proposed a model of the stock option price dynamics based on an uncorrelated random walk with a Gaussian distribution of fluctuations [1]. Later, it was found that the Gaussian noise hypothesis was only a poor approximation of the empirical data, which shows non-vanishing higher moments of the fluctuation distributions, i.e., skewness and positive excess kurtosis. Based on an observation of the cotton price dynamics, Mandelbrot proposed to model the logarithmic price increments (returns) with a process of Lévy flights, which is described by a heavy-tailed probability distribution function that is stable [2,3]. These distributions are defined by their characteristic function as they do not have a closed analytic form. However, their tails decrease as a power law in the limit of large *x*:

$$L_\alpha(x) \sim \frac{1}{|x|^{1+\alpha}}, \quad x \to \pm \infty, \tag{1}$$

**Citation:** Wątorek, M.; Kwapień, J.; Drożdż, S. Financial Return Distributions: Past, Present, and COVID-19. *Entropy* **2021**, *23*, 884. https://doi.org/10.3390/e23070884

Academic Editor: Ryszard Kutner

Received: 15 June 2021 Accepted: 9 July 2021 Published: 12 July 2021


**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

where 0 < *α* < 2.

According to Mandelbrot [4], such a process can account for the absence of a convergence of the aggregated return distribution to the normal distribution as expected by the central limit theorem (CLT). The heavy tails are thus viewed as a natural limit of the aggregated independent or weakly dependent factors provided they are described by the stable distributions. However, this hypothesis has a weak point because the empirical data cannot exhibit the infinite variance required to maintain the distribution stability under aggregation. After the pioneering work of Mandelbrot, many researchers investigated financial time series in order to verify his outcomes. For example, Fama reported that the daily returns of stocks are better approximated by the infinite-variance distribution than the normal distribution or a mixture of the normal distributions [5]. The Lévy stability of the return pdfs in their central parts was also confirmed, among others, by Blume (*α* ≈ 1.7 − 1.8) [6], Teichmoeller (*α* ≈ 1.6) [7], and Blattberg and Gonedes (*α* ≈ 1.6) [8]. Some reports pointed out that, although central parts of the return distribution can be approximated by the stable distributions, the same cannot be said about the distant parts of their tails, which decay faster than expected. Officer found that the tails of the daily and monthly return distributions are no doubt thicker than Gaussian but at the same time thinner than Lévy-stable [9]. Barnea et al. observed that the daily return distributions for some stocks are well approximated by stable distributions, while for other stocks, they are not [10]. Much later, Young and Graff reported that the real-estate annual return distribution can be fitted by a stable function using *α* ≈ 1.5 [11].

Along with the research on empirical data, much effort was devoted to developing models that could mimic the market dynamics. Among such models, the subordinated stochastic processes do not require an assumption of the Lévy-stable character of the underlying dynamics and assume that the price movement is a Brownian motion that takes place in time, which itself is a stochastic process with positive increments and finite variance (e.g., a lognormal process) [12]. In practice, the subordinating process is assumed to be the volume or the number of transactions. As an alternative, Engle proposed that the distribution tails are heavy because of the heteroskedasticity of the return-generating process, in which large returns are caused by a locally large variance of the process [13]. Mantegna and Stanley found a dual structure of the stock index return distribution (S&P500 index during the years 1984–1989), with its central part being in agreement with a Lévy-stable distribution and with exponentially decaying distant tails [14]:

$$L_{\alpha,\gamma}^{\text{tr}}(x) \sim \frac{1}{|x|^{1+\alpha}} e^{-\gamma|x|}, \quad \gamma > 0. \tag{2}$$

While considering the aggregated returns at different time horizons, they did not find any trace of a convergence to the normal distribution. Based on these findings, they proposed a new model for the price return dynamics: a truncated Lévy flight process. They also showed that the heteroskedastic model (GARCH) does not fit the data well [14]. This type of distribution (*α* ≈ 1.6–1.7) was also reported from an analysis of the same S&P500 index recorded over a longer interval (1986–2000). In contrast, the aggregated returns showed a crossover to a CLT regime around a time scale of 20 days [15].
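The role of aggregation can be illustrated with a toy experiment. The sketch below is not taken from the cited studies: it assumes independent, identically distributed Laplace-distributed returns (an arbitrary finite-variance, heavy-tailed choice) and uses hypothetical helper names. Under these assumptions, the CLT predicts that the excess kurtosis of the returns summed over *n* steps decays as 1/*n*; a slower decay in empirical data would hint at dependence between the returns.

```python
import random


def excess_kurtosis(xs):
    """Sample excess kurtosis (close to 0 for Gaussian data)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0


def aggregate(returns, n):
    """Sum consecutive, non-overlapping blocks of n log-returns."""
    return [sum(returns[i:i + n]) for i in range(0, len(returns) - n + 1, n)]


def laplace_sample(rng):
    """One draw from a unit-scale Laplace distribution (excess kurtosis 3)."""
    # The difference of two independent Exp(1) variables is Laplace(0, 1).
    return rng.expovariate(1.0) - rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed, for reproducibility only
base = [laplace_sample(rng) for _ in range(200_000)]

# Excess kurtosis at three aggregation levels; i.i.d. theory gives 3 / n.
kurtosis_by_scale = {n: excess_kurtosis(aggregate(base, n)) for n in (1, 4, 16)}
for n, kurt in kurtosis_by_scale.items():
    print(f"n = {n:2d}: excess kurtosis = {kurt:.3f}")
```

The same `aggregate` and `excess_kurtosis` helpers could be applied to an empirical return series to compare its convergence rate with the i.i.d. benchmark.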

Plerou et al. and Gopikrishnan et al. presented two parallel, comprehensive studies of the stock market high-frequency data representing stock price returns for 1000 American companies and S&P500 index returns [16,17], in which they observed the cumulative distribution function tails obtained from aggregated returns over a substantial spectrum of time scales from 5 min (stocks) and 1 min (index) to 4 years. They found that the return distributions have power-law tails, with the exponent 2.5 < *α* < 4 depending on the stock. However, despite the fact that they did not fit the Lévy-stable domain (*α* < 2), these distributions were invariant under a change in the time scale up to Δ*t* = 16 days. Only for sampling intervals longer than 16 days was a slow transition to a normal distribution observed [16]. An analogous invariance of the return distribution shape with the power exponent *α* ≈ 3 under the time-scale change was observed for the S&P500 index, but in that case, the crossover occurred earlier at Δ*t* = 4 days. Only for time scales longer than 4 days was a slow convergence to a Gaussian distribution seen. A similar behavior was found in the indices from other stock markets (Nikkei and Hang Seng) [17]. This surprising behavior of the stock markets led the authors to formulate the so-called "inverse cubic law", a conjecture that the power-law tails of the return distributions with the scaling exponent *α* ≈ 3 are a universal property of all stock markets at short and medium time scales [18]. Indeed, similar statistical characteristics were found by other researchers in data collected from other stock markets [19–35], Forex [36], commodity markets [36,37], and the cryptocurrency market [36,38–40].
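A tail exponent such as *α* ≈ 3 can be estimated from data in several ways. The sketch below uses the Hill estimator, a standard textbook tool that is not necessarily the procedure applied in the cited studies. It is run on synthetic Pareto samples whose cumulative tail is exactly *x*<sup>−3</sup>; the function name, the sample size, and the number of order statistics *k* are illustrative assumptions.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the cumulative-tail exponent from the k largest |samples|."""
    ordered = sorted((abs(x) for x in samples), reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest absolute value
    log_excess = sum(math.log(ordered[i] / threshold) for i in range(k))
    return k / log_excess


rng = random.Random(42)  # fixed seed, for reproducibility only
# Synthetic data with P(X > x) = x**(-3) for x >= 1 (an "inverse cubic" tail).
synthetic = [rng.paretovariate(3.0) for _ in range(20_000)]
alpha_hat = hill_estimator(synthetic, k=2_000)
print(f"estimated tail exponent: {alpha_hat:.2f}")
```

For empirical returns the estimate depends on the choice of `k`, so in practice it is usually inspected over a range of `k` values rather than read off at a single one.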

The only possible explanation of this result is that the analyzed data violated the assumptions of the central limit theorem, i.e., the returns were significantly correlated. Indeed, the cross-correlations among the stock returns representing different companies are an obvious characteristic of all stock markets [34,41–45]. It was shown that the interstock cross-correlation strength has a strong impact on the index return distributions and can even modify their tail behavior, leading to a kind of alternation between different power-law regimes: stable and unstable [43]. Likewise, the cross-correlations between different stock markets can also induce a significant regime change [21,22]. The existence of autocorrelation in returns is a more delicate issue: while the returns reveal some short-term memory lasting for a few minutes, the existence of long-term memory is doubtful [16,17,24,46,47], even though there were reports stating that the returns can show some autocorrelation or persistence over long terms [48–52]. On the other hand, there is a consensus that long-term autocorrelation is present in absolute returns (volatility) and in some more fundamental observables such as fluctuations in stock market orders, transaction size, and market liquidity [53–55]. The existence of a return autocorrelation can be considered an important factor that can destroy market efficiency [56,57]. These ubiquitous manifestations of the inverse cubic scaling in the financial data encouraged Gabaix et al. to propose a model that was able to account for this phenomenon [58]. According to this model, the inverse-cubic return fluctuations were a result of two processes: the volume fluctuation that forms a probability distribution function with the tail index 1.5 and a specific square-root form of the price impact function, which together produce a tail index equal to 3 [58].
However, Farmer and Lillo pointed out that the price impact function is specific to individual markets and even to individual stocks; thus, it cannot produce any universal behavior. Also, the dependence on transactions is slower than the square root and the volumes are not power-law distributed, so they cannot lead to a power-law behavior of the returns with *α* ≈ 3 [59]. The price changes are driven by more factors than simply volume and transaction number fluctuations; the order book structure is one example [60,61]. Moreover, there is plenty of published evidence that various financial assets either do not have the power-law distribution tails [29,62–67] or their scaling exponent *α* differs from 3 even for the short time scales [36,68–73]. Given these results together, the inverse cubic scaling cannot be considered a universal property of financial returns and, thus, cannot be called "a law". However, it manifests itself sufficiently often to allow us to view it as one of the possible reference models describing the empirical return distribution tails (there is a plethora of volatility models, which take into account various factors; a review of such models can be found in Poon and Granger [74]).
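For reference, the tail-index arithmetic behind the model of Gabaix et al. [58] discussed above can be written out explicitly. This is only a sketch under the model's own two assumptions: a square-root price impact, |*r*| = *kV*<sup>1/2</sup> (the constant *k* is introduced here purely for illustration), and a volume *V* whose cumulative distribution tail has the index 3/2:

$$P(|r| > x) = P\left(V > \frac{x^2}{k^2}\right) \sim \left(x^2\right)^{-3/2} = x^{-3}.$$

The objections of Farmer and Lillo concern precisely these two inputs: if either the square-root impact or the power-law volume tail fails, the exponent 3 no longer follows.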

The power law tails of the return distributions, which are among the financial stylized facts, can be reproduced with a broad range of the scaling exponent by means of various models based on stochastic processes [63,75–83], including multiplicative processes [84–86], the minority game and other agent-related dynamics [87–91], as well as spin dynamics [23,92].

Apart from power-law functions, the tail behavior of the return distributions can also be approximated by exponential functions and stretched exponential functions [93]. The latter are defined by the following expression:

$$f(x) \sim \exp\left(-x^{\beta}\right), \qquad 0 < \beta < 1. \tag{3}$$

Such a functional form allows the stretched exponentials to resemble power laws locally. Many published studies approximated the return distributions successfully with exponential or stretched exponential functions, and some researchers advocate using these functions instead of the power laws [23,29,31,63–67,69,80,94,95]. Another type of exponential function that is sometimes considered in the context of financial data is the Laplace distribution function *p*(*x*) ∼ exp(−|*x*|), whose tails are also heavier than Gaussian. It was observed that some empirical return distributions can be approximated by this function [62,96].
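The local resemblance to a power law can be made explicit with a short calculation. Taking the stretched exponential in the form *f*(*x*) ∼ exp(−*x*<sup>*β*</sup>) with *x* > 0, the local slope on a log–log plot is

$$\frac{d \ln f}{d \ln x} = x \frac{d}{dx}\left(-x^{\beta}\right) = -\beta x^{\beta}.$$

A pure power law would give a constant slope. Here the slope changes only by the factor 10<sup>*β*</sup> per decade of *x*, e.g., by a factor of about 2 for *β* = 0.3 (a value chosen only for illustration), so over a limited range of *x* the two forms can be difficult to distinguish.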

The functions that have been discussed so far do not exhaust the possible models that can be used to approximate the empirical return distributions. In a financial context, a particularly important class is the *q*-Gaussian functions. They were derived as a part of the formalism of nonextensive statistical mechanics based on the Tsallis nonadditive entropy [97]:

$$S_q = k_{\text{B}} \frac{1 - \int [p(x)]^q \, dx}{q - 1}, \tag{4}$$

where *p*(*x*) is some probability distribution and *k*<sub>B</sub> is a positive constant. Under certain conditions, this entropy is maximized by a family of *q*-Gaussian distributions given by

$$G_q(x) \sim \exp_q\left[-\mathcal{B}_q (x - \mu_q)^2\right], \tag{5}$$

where

$$\exp_q x = [1 + (1 - q)x]^{\frac{1}{1 - q}}, \quad \mathcal{B}_q = [(3 - q)\sigma_q^2]^{-1}, \tag{6}$$

provided that 0 < *q* < 3 and that *μ*<sub>*q*</sub> and *σ*<sub>*q*</sub><sup>2</sup> are the *q*-mean and *q*-variance, respectively. The *q*-Gaussians generalize both the normal distribution (*q* = 1) and the Lévy distributions (5/3 < *q* < 3). Their attractiveness comes from the fact that, for the correlated random variables, the *q*-Gaussians become stable distributions. Moreover, their tail behavior can also resemble the power laws [98]. As the price returns are correlated, one can expect that these functions can describe the statistical properties of returns. Indeed, there is growing evidence that the *q*-Gaussian distributions can approximate the empirical return distributions [30,32,66,99–101].
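To make Equations (5) and (6) concrete, the minimal Python sketch below evaluates the *q*-exponential, [1 + (1 − *q*)*x*]<sup>1/(1−*q*)</sup>, and an unnormalized *q*-Gaussian, and measures the local log–log slope of the tail numerically. The function names and the helper `local_log_slope` are illustrative and not taken from the cited works. For *q* > 1, the *q*-Gaussian decays as |*x*|<sup>−2/(*q*−1)</sup>; matching this with the convention of Equation (1) gives *α* = (3 − *q*)/(*q* − 1), so that *q* = 3/2 corresponds to *α* = 3 and the range 5/3 < *q* < 3 corresponds to *α* < 2.

```python
import math


def exp_q(x, q):
    """q-exponential [1 + (1 - q) x]^(1 / (1 - q)); reduces to exp(x) at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(x)
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        # Outside the domain: cut-off (0) for q < 1, divergence for q > 1.
        return 0.0 if q < 1.0 else math.inf
    return base ** (1.0 / (1.0 - q))


def q_gaussian_unnormalized(x, q, mu_q=0.0, sigma_q=1.0):
    """exp_q(-B_q (x - mu_q)^2) with B_q = 1 / ((3 - q) sigma_q^2); no normalization."""
    b_q = 1.0 / ((3.0 - q) * sigma_q ** 2)
    return exp_q(-b_q * (x - mu_q) ** 2, q)


def local_log_slope(f, x, h=1e-4):
    """Central-difference estimate of d ln f / d ln x at x > 0."""
    numerator = math.log(f(x * (1.0 + h))) - math.log(f(x * (1.0 - h)))
    return numerator / (math.log(1.0 + h) - math.log(1.0 - h))


# Far in the tail, q = 3/2 should give a pdf slope close to -(1 + alpha) = -4.
slope = local_log_slope(lambda x: q_gaussian_unnormalized(x, q=1.5), 1e4)
print(f"local tail slope for q = 1.5: {slope:.4f}")
```

The normalization constant is omitted because it does not affect the tail slope; a fit to data would need it, together with an estimate of the *q*-mean and *q*-variance.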

The *q*-Gaussians are among the functions borrowed from nonextensive statistical mechanics that have been exploited in this context. Another example is the *q*-exponential function given by Equation (6), which was also reported to fit the empirical returns from a stock market [102]. Finally, some researchers consider the normal-inverse Gaussian function to be a prospective model that can successfully be fitted to the data [71].

This short review of the approaches to modeling return distributions shows that there is a wealth of reported results, some of which contradict one another. The only firm observation that is shared by all the studies is that the return distributions reveal heavy tails, at least at short time scales. On medium and long time scales, the situation depends strongly on the data set, the market, and the financial instrument. Drożdż et al. attempted to resolve this problem by noticing that the most well-known results regarding the return distributions, i.e., Mandelbrot's Lévy stability (*α* < 2) [4]; Mantegna and Stanley's truncated Lévy flights [14]; Plerou and Gopikrishnan's unstable power-law tails (*α* ≈ 3), which are persistent under aggregation of the returns until the time scales of days or even a month [16,17]; and their own results with the *α* ≈ 3 regime already breaking at the time scale of hours [24], were based on the data covering different epochs: 1816–1958 (Mandelbrot), 1984–1989 (Mantegna), 1926–1995 (Plerou and Gopikrishnan), and 1998–1999 (Drożdż). One can follow the whole historical process of the financial market development, introduction of new financial instruments, technological innovations, transition from the classic "floor-based" markets to the digital markets, computing power increase, telecommunication revolution, etc. from past to present. This inevitably leads to a constantly increasing number of investors, transactions, and pieces of information that arrive at the market. These are accompanied by an increasing amount of money and information processing speed, which, taken together, result in an overall acceleration of the market time flow. Any unit of time nowadays corresponds to a much longer interval in the past. From this perspective, the market properties once observed, say, at a daily scale, now can be observed at scales of hours, minutes, or even seconds. This may be the very reason why Mandelbrot observed the Lévy-stable distributions that are hardly seen today and why Plerou and Gopikrishnan reported the crossover to the CLT-related convergence of distribution tails at the time scale of many days, while today, such a behavior is observed within hours or minutes. This hypothesis formulated by Drożdż et al. was later supported by other analyses as well [24,30,32,69,72].

However, based on data covering a given time interval, one can observe an analogous phenomenon by considering, e.g., the stocks representing companies with different capitalization. Since there is a statistical relation between the capitalization of a company and the number of transactions involving its shares, the highly capitalized stocks "feel" that time flows faster than their lower-capitalized counterparts do. In consequence, the properties of the corresponding return distributions differ substantially between the two groups, with the former displaying thinner tails than the latter [24,34,35,100,103]. A qualitatively similar observation can be made by comparing the distributions for data from markets at different developmental stages, e.g., the mature markets and the emerging markets. The former are characterized by higher liquidity and a higher transaction number than the latter; therefore, the situation is generally parallel to the previous cases. Studies of data from the emerging markets report thick tails with small scaling exponents more frequently than studies of the mature markets [25,26,28,52,66,94,104–110].

Another issue related to return distributions is their asymmetry between positive and negative parts. It was investigated in various works as it is also an important factor in investment risk assessments (the gain–loss asymmetry). Typically, this property was tested by means of the third moment (skewness) of return distributions, in which a negative value means a higher probability of a significant loss with respect to a significant gain, while a positive value means the opposite. Negative skewness is thus associated with the negative tail of the distribution being heavier than the positive tail. There are mixed outcomes of the empirical data reported in the literature, including indications of either positive, negative, or neutral skewness as well as the scaling exponent difference between the left and the right tails (in the case of power-law tails) dependent on the analyzed time intervals, markets, and securities (e.g., References [14,16,17,20,28,31,36,62,71,94,111–119]). However, even though a difference between the positive and negative tails exists in the data, it has a much weaker impact on the distribution shape and the related investment risk than the heavy tails. Therefore, in many studies reported in the literature, only absolute returns are considered, neglecting their actual signs (e.g., References [16,17,24,30,32,39]). As our study is focused on an investigation of the tail exponent stability with respect to the time scale Δ*t* and, based on literature and our previous experience, we expect larger effects due to the time-scale change than due to the left–right tail asymmetry, we neglect the return sign and consider both tails together by analyzing the absolute return values. In fact, our major new finding is that, in recent years, the market's "internal" time stopped accelerating with respect to our ordinary "clock" time. Other factors also affect the convergence of return distributions to the Gaussian with increasing Δ*t*, especially those that cause extreme volatility and strong cross-correlations between assets, such as the COVID-19 pandemic. We discuss the interplay of these two factors in the following sections.

The remainder of our paper is organized as follows: in Section 2, we present the data sets that were analyzed; in Section 3, we discuss the results; and in Section 4, we collect the main conclusions of our study.

### **2. Data**

We analyzed recent tick-by-tick recordings of the contracts for differences (CFDs) representing (1) six major stock market indices, CAC40 (Euronext), DAX30 (Deutsche Börse), FTSE100 (London SE), DJIA (New York SE), S&P500 (New York SE & NASDAQ), and NASDAQ100 (NASDAQ); (2) 240 U.S. stock shares and 30 stock shares with the highest capitalization from Germany, France, and the U.K. (see Appendix A for their list); (3) four commodities, U.S. crude oil (CL), high grade copper (HG), silver (XAG), and gold (XAU); (4) the currency exchange rates (not CFDs) involving five major currencies, USD, EUR, GBP, CHF, and JPY; and (5) two cryptocurrencies, bitcoin (BTC) [120] and ethereum (ETH) [121]. The commodity CFD prices are expressed in U.S. dollars. The data comes from Dukascopy (the index, stock share, commodity CFDs, and currency exchange rates) [122] and Kraken exchange (cryptocurrencies) [123] and covers 4 years from January 2017 to December 2020 (except for the stock share CFDs that cover a shorter interval starting from January 2018). Different instrument types have different trading hours, with the stock market index and commodity CFDs quoted from Monday to Friday (00:00–23:00 hours CET, daylight saving time-adjusted), the stock share CFDs quoted from Monday to Friday (U.S.: 15:30–22:00 CET, European: 09:00–17:30 CET), the currency exchange rates quoted around the clock from Monday to Friday, and the cryptocurrency exchange rates quoted continuously 24/7.

The price *P*(*t*) of an asset is defined only at the moment of a transaction and remains undefined otherwise. Therefore, in order to construct an evenly sampled time series of the price quotations, we assume that the price remains constant between the consecutive transactions, which is standard practice. The quotations of all the instruments were sampled at intervals of Δ*t* = 1 s, 10 s, 1 min, 10 min, and 1 h and transformed into the normalized logarithmic returns *r*<sub>Δ*t*</sub> according to

$$
r_{\Delta t}(t) = \frac{R_{\Delta t}(t) - \mu_R}{\sigma_R}, \quad R_{\Delta t}(t) = \log P(t + \Delta t) - \log P(t), \tag{7}
$$

where *μ*<sub>*R*</sub> and *σ*<sub>*R*</sub> are the mean and standard deviation of *R*<sub>Δ*t*</sub>(*t*), respectively, and Δ*t* is a sampling interval. For each asset, we obtained five time series representing the returns for different time scales Δ*t*. Figure 1 shows the evolution of *P*(*t*) for various assets that are analyzed in our work. The COVID-19 outburst in the U.S. in March and April 2020 that had a strong impact on all financial markets has been distinguished by vertical lines. A few corresponding time series of the normalized returns *r*<sub>Δ*t*</sub>(*t*) with Δ*t* = 1 min are shown in Figure 2 together with a simulated Gaussian noise of the same length.
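The preprocessing described above, i.e., holding the last transaction price constant between ticks and then forming normalized logarithmic returns, can be sketched in a few lines of Python. This is an illustration only and not the code used in this study: the function names, the (time, price) tick format, and the use of the population standard deviation are assumptions.

```python
import math
from statistics import fmean, pstdev


def resample_previous_tick(ticks, start, stop, dt):
    """Evenly sampled prices: each grid time takes the last trade price at or before it.

    `ticks` is a time-sorted list of (time, price) pairs (a hypothetical format).
    """
    if not ticks or ticks[0][0] > start:
        raise ValueError("need at least one tick at or before `start`")
    steps = int(math.floor((stop - start) / dt + 1e-9))
    prices, i = [], 0
    for step in range(steps + 1):
        t = start + step * dt
        # Advance to the most recent tick that is not later than the grid time.
        while i + 1 < len(ticks) and ticks[i + 1][0] <= t:
            i += 1
        prices.append(ticks[i][1])
    return prices


def normalized_log_returns(prices):
    """r = (R - mean(R)) / std(R), where R(t) = log P(t + dt) - log P(t)."""
    raw = [math.log(b) - math.log(a) for a, b in zip(prices, prices[1:])]
    mu, sigma = fmean(raw), pstdev(raw)  # population std: an assumption
    return [(r - mu) / sigma for r in raw]


# Three made-up ticks (time in seconds, price), sampled with dt = 1 s.
ticks = [(0.0, 100.0), (2.5, 101.0), (7.0, 99.0)]
prices = resample_previous_tick(ticks, start=0.0, stop=8.0, dt=1.0)
returns = normalized_log_returns(prices)
print(prices)
```

With sparse ticks, most sampled returns are exactly zero because the price is held constant, which is worth keeping in mind when the shortest time scales are interpreted.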

**Figure 1.** Evolution of the CFD price and exchange rate quotations of various assets over the 4 year interval 2017–2020 (data source: Dukascopy [122]) and the cryptocurrency prices (data source: Kraken [123]). The quotations have been standardized in order to facilitate comparison. The vertical dashed lines indicate the COVID-19 outburst in March–April 2020.

**Figure 2.** Time series of the standardized 1-min returns of sample financial instruments, S&P500 CFDs, gold CFDs (XAU/USD), and EUR/USD, together with a Gaussian noise of the same length. Note the leptokurtic character of the empirical data.
