**1. Introduction**

Many empirical data sets and theoretical models have been investigated using the tool of spectral analysis. Many researchers across different fields find the power spectral density (abbr. PSD) of the 1/ *f β* form (with 0.5 - *β* - 1.5) to be of a particular interest [1–10], both because of its apparent omnipresence and the implication of slowly decaying autocorrelation, which indicates the presence of the long-range memory phenomenon. Long-range memory is also one of the established stylized facts of the financial markets [11–19]. Consequently, as our group was investigating 1/ *f* noise [20–23], we have become naturally interested in the rapidly growing field of econophysics. The term "econophysics" was coined by H. E. Stanley in the Statphys conference in Kolkata in 1995 [24]. Over the last three decades, econophysics has matured both from the theoretical and the applied perspectives. Here, we review mostly our own and directly adjacent approaches, and we would like to recommend a couple of broader reviews, which can be found in [25,26].

Our first publications were devoted to the modeling of the financial markets [27,28]. In those works, we have considered trades occurring in the financial markets as point events driven by a point process proposed in [21–23]. Thanks to the organizers of the international conference Applications of Physics in Financial Analysis 4, held in Warsaw in 2003, we were able to present our findings to econophysicists. Our first results, inspired by interaction with the participants of the APFA 4 conference, have been published in [29,30]. We presented our ideas in a more general context of complex systems in [31,32].

**Citation:** Kazakeviˇcius, R.; Kononovicius, A.; Kaulakys, B.; Gontis, V. Understanding the Nature of the Long-Range Memory Phenomenon in Socioeconomic Systems. *Entropy* **2021**, *23*, 1125. https://doi.org/10.3390/e23091125

Academic Editor: Ryszard Kutner

Received: 4 August 2021 Accepted: 25 August 2021 Published: 29 August 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https:// creativecommons.org/licenses/by/ 4.0/).

Later, we took part in the COST Action P10 "Physics of Risk" and the follow-up COST Action MP0801 "Physics of Competition and Conflicts". Bronislovas Kaulakys and Vygintas Gontis were executive committee members of both COST Actions, while the other group members gave talks and poster presentations during the annual meetings and helped organize an annual action meeting in Vilnius in 2006. This COST action meeting has helped us embrace econophysics and be recognized as econophysicists.

While it may be natural to see trades in the financial markets as point events [27–30], modeling volatility and return as a point process was not as straightforward. We have developed our approach further by abstracting the point process away and considering a continuous framework of Langevin stochastic differential equations (abbr. SDEs). First, we have shown that the continuous interpretation of the point process model works well for trading activity [33]; thus, we have refined the SDE approach with model for volatility and return [34–38]. Interestingly, similar SDEs can be derived from a simple agent-based model (abbr. ABM) [39,40], too. With time, we have developed more complicated ABMs to account for the separation of time scales and order flow [41,42]. We have even branched out into sociophysics [43–46] as we have understood that the herding ABM we used to model the financial market is essentially equivalent to the well-known voter model [47–49].

For 10 months (in 2015 and 2016), Vygintas Gontis, with the support of the Baltic American Freedom Foundation, has stayed as a visiting researcher at the Center of Polymer Studies of Boston University. Discussions with the founding fathers of econophysics, H. E. Stanley, professors Sh. Havlin, B. Podobnik, and S. Buldyrev, resulted in a paper [50]. Together, we have considered volatility return intervals (term inspired by the studies [51–54]) of the financial time series at various time scales. In the paper, we have shown that the time intervals between large financial fluctuations is distributed according to a power–law probability density function (abbr. PDF) *p*(*τ*) ∼ *τ*<sup>−</sup>3/2 [50]. The same distribution arise in our models and from many other one-dimensional Markov processes [55], while the long-range memory process would exhibit a different distribution, such as *p*(*τ*) ∼ *<sup>τ</sup>*2−*H*, which is a well-known result for the fractional Brownian motion (abbr. FBM) [56].

Here, we provide an overview of our approach to understanding and modeling the long-range memory phenomenon in financial markets and other complex systems and share our most recent result. In Section 2, we introduce the original point process and discuss how to derive a non-linear SDE, which can reproduce the long-range memory phenomenon. We also discuss numerous extensions of both the point process model and non-linear SDE. Next, in Section 3, we show how we can obtain a similar SDE from a simple herding ABM. Following the overview, we also present a novel result, which concerns understanding the nature of the self-similarity and long-range memory phenomenon from the perspective of fractional Lèvy stable motion (abbr. FLSM) and auto-regressive fractionally integrated moving average (abbr. ARFIMA) time series. In Section 4, we tested various long-range memory estimators such as mean squared displacement, method of absolute value estimator, Higuchi's method, and burst and interburst duration analysis on fractional Lèvy stable motion (ARFIMA(0,d,0) time series). Finally, in Section 5, we share our future considerations.

#### **2. The Multiplicative Point Process, the Class of Stochastic Differential Equations, and Their Applications**

In this section, we overview how the physically motivated point process proposed in [21–23] was applied to model trading activity and absolute returns in the financial markets. We also discuss numerous extensions of the model into some related research topics, such as superstatistics and anomalous and non-homogeneous diffusion.

#### *2.1. The Multiplicative Point Process Model*

Let us consider signal *I*(*t*) composed of pulses with profiles given by *Ak*(*x*):

$$I(t) = \sum\_{k} A\_{k}(t - t\_{k})\_{\prime} \tag{1}$$

where *tk* is the event (pulse) time. There are many physical and social systems, which generate signals of such nature: electric current [57], music [58], human heartbeat [59], internet traffic [32], or trading activity [29] to name a few.

As most profiles of the pulses are brief, it is trivial that they would influence only high frequencies corresponding to the typical inverse pulse length. If we are interested in longer-term dynamics, it is sufficient to assume that the Kronecker delta function well approximates the profile, *Ak*(*x*) = *ak<sup>δ</sup>*(*x*). Many such systems are driven by the flow of identical or similar objects, such as electrons, packets, or trades. This lets us simplify (1) and investigate it as a temporal point process with unit events. Such a process can be either described by the event times {*tk*} or by the inter-event times {*<sup>τ</sup>k* = *tk*+<sup>1</sup> − *tk*}.

The inter-event times are a far more convenient choice to model as they at least can give a semblance of the stationarity, while event times are obviously non-stationary as {*tk*} is monotonically increasing series. In [21–23], it was analytically shown that a relatively slow autoregressive AR(1) Brownian motion of *τk* yield 1/ *f* fluctuations of the signal *<sup>I</sup>*(*t*). The author of [29] has built upon this observation and introduced multiplicative point process for the inter-event time

$$
\pi\_{k+1} = \tau\_k + \sigma^2 \gamma \tau\_k^{2\mu-1} + \sigma \tau\_k^{\mu} \varepsilon\_k. \tag{2}
$$

In the above, it is assumed that inter-event time fluctuates due to exogenous perturbations. Perturbations are assumed to be standard uncorrelated Gaussian random variables, *εk*. The general rate of change is governed by *σ*, while *γ* is the damping constant. Multiplicativity, specified by *μ*, ensures that *I*(*t*) is multifractal and has a power–law PDF. This point process model has found its use for the analysis of 1/ *f* noise and long-range memory in many diverse phenomena such as musical rhythm spectra [58], human cognition [60], human interaction dynamics [61], turbulence [62], and few others [63–66]. Inspired by this model, [67] has shown under which conditions 1/ *f β* spectrum can arise from reversible Markov chains.

After closer examination, it should be evident that Equation (2) can be seen as an iterative solution of a certain SDE if Euler–Maruyama method was used [68]. Hence the corresponding Langevin SDE can be trivially recovered from the iterative relation (2):

$$\operatorname{d}\pi = \sigma^2 \gamma \tau^{2\mu - 1} \operatorname{d}k + \sigma \tau^\mu \operatorname{d}\mathcal{W}\_k. \tag{3}$$

Here *W* is uncorrelated standard Wiener process. Note that this SDE is in the event space (or *k*–space) and not in the real time. Further, this SDE must be solved by restricting the diffusion of the inter-event time *τ* to some arbitrary interval [*<sup>τ</sup>*min, *<sup>τ</sup>*max] on the positive half-plane as otherwise this SDE may not have a stationary distribution. If stationary distribution exists, then the stationary PDF of *τ* is a power–law:

$$p\_k(\tau) = \frac{a+1}{\tau\_{\text{max}}^{a+1} - \tau\_{\text{min}}^{a+1}} \tau^a, \quad a = 2(\gamma - \mu). \tag{4}$$

Yet the main result of [29] is the power–law statistical properties of *<sup>I</sup>*(*t*). In the limit *τ*min → 0 and *τ*max → ∞ PSD of *I*(*t*) in arbitrarily long range of frequencies has a power–law slope:

$$S(f) \sim 1/f^{\beta}, \quad \beta = 1 + \frac{2(\gamma - \mu)}{3 - 2\mu}. \tag{5}$$

The number of events in a selected time window, for example number of trades per minute, also has a power–law distribution [29]:

$$p(N) \sim N^{-2(\gamma - \mu) - 3}.\tag{6}$$

Formally, one could define the number of events in a window of length *w* as *N*[*t*] = *t*+*w t <sup>I</sup>*(*u*) d *u* (here the square brackets indicate that *N* is in discrete time). These analytical results can be confirmed by numerical simulation (see Figure 1).

**Figure 1.** Statistical properties of the point process by numerically solving Equation (2): (**a**) sample fragment of corresponding *N*[*t*] time series, (**b**) PDF of the inter-event times, and (**c**) PSD of the process. Red curves correspond to numerical results, while black curves are theoretical power–law fits with (**b**) *α* = 0 and (**c**) *β* = 1. Model parameter values: *γ* = 0, *μ* = 0, *σ* = 0.1, *w* = 1.

#### *2.2. The Class of Non-Linear Stochastic Differential Equations*

In [33,69–71], we have made a transition from *k*-space to real time and this enabled us to model trading activity and absolute returns in the financial markets not only qualitatively, but quantitatively, too. The transition from SDE in *k*-space, Equation (3), to real time is achieved by substitution d *t* = *τ* d *k*, which yields:

$$\mathbf{d}\,\pi = \sigma^2 \gamma \tau^{2\mu-2} \,\mathrm{d}\,t + \sigma \tau^{\mu-1/2} \,\mathrm{d}\,\mathcal{W}.\tag{7}$$

Modeling inter-event time in real time makes less sense than in the *k*-space, so let us change the variable to the number of events per unit time *x* = 1*τ* . Applying Itô transformation yields:

$$\operatorname{ad}\mathbf{x} = \sigma^2 \left(\eta - \frac{\lambda}{2}\right) \mathbf{x}^{2\eta - 1} \operatorname{d}t + \sigma \mathbf{x}^{\eta} \operatorname{d}W. \tag{8}$$

In the above, we have introduced a more convenient set of parameters:

$$
\eta = \frac{5}{2} - \mu, \quad \lambda = 2(\gamma - \mu) + 3. \tag{9}
$$

As far as SDE (8) corresponds to the point process defined by Equation (2), the results for stationary PDF and PSD should apply:

$$p(\mathbf{x}) \sim \mathbf{x}^{-\lambda}, \qquad S(f) \sim 1/f^{\beta}, \quad \beta = 1 + \frac{\lambda - 3}{2\eta - 2}. \tag{10}$$

The validity of these theoretical predictions was extensively checked numerically (see Figure 2 for a quick example) and also, in [72], proven analytically. The analytical proof provided in [72] allows interpreting the process modeled by SDE (8) in a more general context. In fact we can model any process possessing these power–law statistical properties, even processes, which make less sense from the perspective of the original point process.

**Figure 2.** Various slopes of PDF (**a**) and PSD (**b**) reproduced by the numerical solutions of SDE (8). Model parameter values: *σ* = 1, *η* = 2.5 (all cases) and *λ* = 2 (red curves in both (**<sup>a</sup>**,**b**)), 3 (blue curves), 4 (green curves), and 5 (magenta curves). Black dashed lines correspond to (**a**) *p*(*x*) ∼ *x*<sup>−</sup>*<sup>λ</sup>* with *λ* = 1.5 and *λ* = 6 (upper and lower curves), (**b**) *S*(*f*) ∼ 1/ *f β* with *β* = 0.5 and *β* = 2 (upper and lower curves).

Equation (8) and similar random walk models have been used to model the EUR/CHF exchange rate [73]. It has also lead to numerous modifications by our group, which we discuss in detail in the following subsections.

#### *2.3. Reproducing the Long-Range Memory Using GARCH(1,1) Process*

Autoregressive conditional heteroscedasticity (abbr. ARCH) family models [74–79] are quite popular forecasting tools among professional traders as well as researchers interested in the long-range memory phenomenon. Unlike SDEs, ARCH family models have explicitly built-in memory, which is built-in either via explicit dependence on the numerous previous states, infinitely many in the case of the ARCH(∞) model [80–82], or via fractional integration procedure, which introduces memory similar to the one present in the fractional Brownian motion, as in the fractionally integrated GARCH (abbr. FIGARCH) model [83–85]. In [86], we have shown that it is possible to modify the GARCH(1,1) model, which is Markovian in nature, to reproduce 1/ *f* spectrum.

Generalized autoregressive conditional heteroskedasticity (abbr. GARCH) processes can be approximated by the diffusion processes. There are two competing approaches, which yield continuous approximations of GARCH processes using sets of SDEs. One of the approaches was proposed by Nelson [87] and the other by Kluppelberg et al. [88,89]. In the GARCH(1,1), Nelson's approach is easier to apply, but has a drawback that the resulting COGARCH(1,1) would be driven by two sources of noise, instead of the one in the GARCH(1,1). Yet, we can circumvent the problem by ignoring the observed heteroskedastic economic variable *zt* and focusing on the approximation of the volatility process, *σ*2*t* , of GARCH(1,1):

$$
\omega\_l = \sigma\_l \omega\_{l,l} \tag{11}
$$

$$
\sigma\_t^2 = a + bz\_{t-1}^2 + c\sigma\_{t-1}^2 = a + b\sigma\_{t-1}^2\omega\_{t-1}^2 + c\sigma\_{t-1}^2.\tag{12}
$$

In the above, *ωt* is the noise, while *a*, *b*, and *c* are the GARCH(1,1) model parameters. For Nelson's approach to work, we need to compute first and second moments of change in volatility. With the usual GARCH(1,1) we obtain SDE for geometric Brownian motion [86].

Now lets introduce non-linearity into Equation (12). In [86], we have explored two such options:

$$
\sigma\_t^2 = a + b\sigma\_{t-1}^\mu \omega\_{t-1}^\mu + c\sigma\_{t-1}^2. \tag{13}
$$

$$
\sigma\_t^2 = a + b \sigma\_{t-1}^\mu |\omega\_{t-1}|^\mu + \sigma\_{t-1}^2 - \varepsilon \sigma\_{t-1}^\mu. \tag{14}
$$

Both of these options can be approximated by SDEs belonging to the class of SDEs (8) with *λ* = *μ* and *η* = *μ*/2. Consequently both of these options reproduce 1/ *f* spectrum with *μ* = 3. Other parameters, *a*, *b*, and *c*, influence only the additional terms, which restrict the diffusion of *σ*<sup>2</sup> *t* . Setting these values too high shrinks the interval and the power–law distribution becomes extremely hard to observe.

#### *2.4. Anomalous Diffusion in the Long-Range Memory Process*

SDE (8) can be also seen to describe a heterogeneous diffusion in a non-linear potential. Such diffusion leads to anomalous growth in variance [90]

$$
\left< \left[ \mathbf{x}(t) - \left< \mathbf{x}(t) \right> \right]^2 \right> \sim t^{\theta}, \quad \theta = \frac{1}{1 - \eta}. \tag{15}
$$

This phenomenon is also known as anomalous diffusion [91–93]. If *θ* = 1 then the process exhibits normal diffusion. Otherwise if 0 < *θ* < 1, the diffusion is slower than normal and is referred to as sub-diffusion. The diffusion may also be faster, if 1 < *θ* < 2, in that case it is called super-diffusion.

The anomalous diffusion can be obtained from SDE (8) only for specific parameter values such as *λ* < 1 and *η* < 1/2 [90]. Because power–law slope of the PSD, *β*, varies between 0 and 2, from Equation (10), it follows that anomalous diffusion and power–law noise can be observed at the same time only for negative parameter *η* values, specifically for *η* < (*λ* − 1)/2 and *λ* < 1; however, for these parameters values numerical simulation would become very slow and inefficient [72]; therefore, we have considered generalizing SDE (8) by considering non-Gaussian white noise.

In [94], we have considered Lévy *α*-stable noise. SDE equivalent to SDE (8), but with Lévy *α*-stable noise takes the following form:

$$\frac{\mathbf{d}\,\mathbf{x}}{\mathbf{d}\,t} = \gamma(\eta,\lambda,a)\mathbf{x}^{a(\eta-1)+1} + \mathbf{x}^{\eta}\xi\_a(t). \tag{16}$$

Here, *ξα*(*t*) is a white noise, the intensity of which is distributed according to the symmetric Lévy *α*-stable distribution. The characteristic function of the noise intensity is given by:

$$
\langle \exp(ik\xi\_a) \rangle = \exp\left(-\sigma^a |k|^a\right). \tag{17}
$$

Here, *α* is the index of stability and *σ* is the scale parameter. We interpret SDE (16) in an Itô sense and it can also be written in the form

$$\mathbf{d} \ge \mathbf{x} = \gamma(\eta, \lambda, a) \mathbf{x}^{a(\eta - 1) + 1} \mathbf{d} \, t + \mathbf{x}^{\eta} \, \mathbf{d} \, L^{a}\_{t}. \tag{18}$$

Here, d *Lα t* stands for the increments of Lévy *α*-stable motion *Lα t* . If SDE (16) is solved with reflective boundary conditions and

$$\gamma(\eta,\lambda,a) = \frac{\sin\left[\pi(\frac{\mu}{2} - a\eta + \lambda)\right]}{\sin\left[\pi(a(\eta - 1) - \lambda)\right]} \frac{\Gamma(a\eta - \lambda + 1)}{\Gamma(a(\eta - 1) - \lambda + 2)}\tag{19}$$

then generalized SDE (16) generate time series with power–law steady-state PDF and power–law PSD:

$$p(\mathbf{x}) \sim \mathbf{x}^{-\lambda}, \qquad S(f) \sim \frac{1}{f^{\beta'}} \quad \beta = 1 + \frac{\lambda - 3}{\kappa(\eta - 1)}.\tag{20}$$

Extensive numerical simulations have shown that due to the presence of the multiplicative Lévy *α*-stable noise in Equation (16) both sub-diffusion and super-diffusion can be observed together with power–law noise even for positive *η* values [95]; however, no analytical expression for anomalous diffusion exponent dependence on SDE parameters has been derived yet.

In Figure 3, we show a sample series of the solutions of SDE (16) and the statistical properties of the series when the noise is Lévy *α*-stable noise with *α* = 1. The other SDE (16) parameters were picked so 1/ *f* spectrum would be reproduced. As can be seen in the subfigure (**a**), ongoing diffusion is disrupted by huge jumps, which are characteristic to Lévy flights.

**Figure 3.** Statistical properties of the time series obtained by solving SDE with Lévy *α*-stable noise, Equation (16): (**a**) sample fragment of the time series, (**b**) PDF, and (**c**) PSD of time series. Red curves correspond to numerical results, while black curves are power–law best fits with exponents (**b**) *λ* ≈ 3.3, (**c**) *β* ≈ 1.

If we consider modeling only sub-diffusive processes, then we can study another generalization of SDE (8), originally proposed in [96]. If we start with a Markovian process described by the Itô SDE

$$\mathbf{d}\,\mathbf{x}(\tau) = f[\mathbf{x}(\tau)]\,\mathbf{d}\,\tau + \mathbf{g}[\mathbf{x}(\tau)]\,\mathbf{d}\,\mathcal{W}(\tau). \tag{21}$$

The drift and diffusion functions of the above SDE are given by

$$f(\mathbf{x}) = \sigma^2 \left(\eta - \frac{\lambda}{2}\right) \mathbf{x}^{2\eta - 1}, \qquad \mathbf{g}(\mathbf{x}) = \sigma \mathbf{x}^{\eta}. \tag{22}$$

We interpret the time *τ* as an internal (operational) time. For the trapping processes that have a distribution of the trapping times with power–law tails, the physical time *t* = *<sup>T</sup>*(*τ*) is given by the strictly increasing *<sup>α</sup>*+-stable Lévy motion defined by the Laplace transform

$$
\left\langle e^{-kT(\tau)} \right\rangle = e^{-\tau k^{a\_+}}.\tag{23}
$$

Here, the parameter *α*+ takes the values from the interval 0 < *α*+ < 1. Thus, the physical time *t* obeys the SDE

$$\operatorname{d}t(\pi) = \operatorname{d}L^{\mathfrak{a}\_+}(\pi),\tag{24}$$

where *dL<sup>α</sup>*+ (*τ*) stands for the increments of the strictly increasing *<sup>α</sup>*+-stable Lévy motion *L<sup>α</sup>*+ (*τ*). For such physical time *t* the operational time *τ* is related to the physical time *t* via the inverse *<sup>α</sup>*+-stable subordinator

$$S(t) = \inf\{\tau : T(\tau) > t\}.\tag{25}$$

Such subordination leads to power spectral density

$$S(f) \sim \begin{cases} \frac{1}{\omega^{\beta}}, & 1 - a\_{+} < \beta < 1 + a\_{+},\\ \frac{1}{\omega^{1+a\_{+}}}, & \beta > 1 + a\_{+}. \end{cases}, \qquad \beta = 1 + \frac{a\_{+}(\lambda - 3)}{(\eta - 1)} \tag{26}$$

Proposed SDEs (8), (16), and (21) have served as a basis to study heterogeneous diffusion in a non-homogeneous medium [90,96,97] and time subordinated processes [98,99] as well as the effects of non-linear variable transformations [100,101].

In paper [98], we investigated the distinction between the internal time of the system and the physical time as a source of 1/ *f* noise. We have introduced the internal (operational) time into the earlier point process [21–23] together with additional equations relating the internal time to the physical time. In this scenario, we can still recover power–law statistical features similar to the ones obtained by solving Equation (8). In the financial markets, the internal time could reflect the fluctuating human activity, e.g., trading activity, yielding the long-range correlations in the volatility. The effective approach for the solution of highly non-linear SDEs was proposed [98] by a suitable choice of the internal time and variable steps of integration.

The effects of non-linear variable transformations [100,101] sugges<sup>t</sup> that long-range memory in certain cases can be just a measurement effect. As far as the non-linear transformation of the observable *x* to *y*

$$\alpha = \frac{1}{y^{\delta}} \tag{27}$$

with *δ* being the transformation exponent, yields SDE for the variable *y* of the same form such as Equation (8) for *x*.

#### *2.5. Inverse Cubic Law for Long-Range Correlated Processes*

The inverse cubic law is an established stylized fact stating that the cumulative distributions of various financial market time series such as the number of trades, the trade volume, or the return [12,14,15,19]. Thus, this law is as important for the modeling as the consideration of long-range memory and fractal scaling, which are also stylized facts [6,12,14,15,19]. We have in proposed [102] that the non-linear SDE yields both the power–law behavior of the PSD and the inverse cubic law of the cumulative distribution. This was achieved using the idea that when the market evolves from calm to violent behavior there is a decrease of the delay time of multiplicative feedback of the system in comparison to the driving noise correlation time. This results in a transition from the Itô to the Stratonovich sense of the SDE and yields a long-range memory process.

We start from a simple quadratic SDE

$$\mathbf{d}\,\mathbf{x} = \mathbf{x}^2 \circ\_{\mathbb{A}} \mathbf{d}\,\mathcal{W} \tag{28}$$

where *α* is the interpretation parameter, defining the *α*-dependent stochastic integral of the SDE (28),

$$\int\_0^T f(\mathbf{x}(t)) \diamond\_{\mathbf{d}} \, d\mathcal{W}\_l \equiv \lim\_{N \to \infty} \sum\_{n=0}^{N-1} f(\mathbf{x}(t\_n)) \Delta \mathcal{W}\_{l\_n}.\tag{29}$$

Here, *tn* = *n*+*α N T* with 0 ≤ *α* ≤ 1. Natural choices of the parameter *α* are: (i) *α* = 0, pre-point (Itô convention), (ii) *α* = 1/2, mid-point (Stratonovich convention), and (iii) *α* = 1, post-point (Hänggi–Klimontovich, kinetic, or isothermal convention) [103].

The quadratic SDE (28) is the simplest multiplicative SDE without the drift term symmetric for the positive and negative deviations of some observable *x*. More generally, the same process can be described by the delayed SDE [103]

$$\mathbf{d}\,\mathbf{x}(t) = f(\mathbf{x}(t))\,\mathbf{d}\,t + \mathcal{g}(\mathbf{x}(t-\delta))\mathbb{J}\_t^\tau \,\mathbf{d}\,t.\tag{30}$$

Here, *f*(*x*) represents arbitrary deterministic drift of the observable *x*, while *g*(*x*) effectively controls the diffusion as *ζτt* is the noise term, which is assumed to have correlation time *τ*. Note that the diffusion function depends on the delayed value of the observable *x* (by time interval *δ*).

It may be shown [103] that in the limit *δ* → 0 and *τ* → 0 (under the condition *δ*/*τ* = *const*) SDE (30) can be transformed into

$$\mathbf{d}\,\mathbf{x} = f(\mathbf{x}(t))\,\mathbf{d}\,t + \mathbf{g}(\mathbf{x}(t)) \circ\_a \mathbf{d}\,\mathcal{W} \tag{31}$$

with the interpretation parameter being determined by

$$a\left(\frac{\delta}{\tau}\right) \simeq \frac{1}{2(1+\delta/\tau)}.\tag{32}$$

Under the perturbation by the white noise, in a case of *τ δ*, even for a short delay in feedback *δ*, we achieve the Itô outcome, because there is no correlation between the sign of the noise *ζt* and the time-derivative of the feedback *g*(*x*). On the contrary, under the perturbation by the correlated noise, *τ δ*, a correlation emerges between the sign of *ζt* and the time-derivative of *g*(*x*). In this case the correlation yields the Stratonovich outcome [103].

In general, the value of *α* may depend on the coordinate *x* and/or other system' parameters. SDE (28) with *α* = 0 may be transformed into SDE in Itô sense

$$\mathbf{d}\,\mathbf{x} = 2a\mathbf{x}^3 \,\mathrm{d}\,t + \mathbf{x}^2 \,\mathrm{d}\,\mathcal{W}.\tag{33}$$

This SDE is a particular case of the general Itô Equation (8) yielding the power–law steady-state PDF and the power–law PSD (10). These SDEs become identical for *η* = 2 and *λ* = 4(1 − *<sup>α</sup>*).

Let us note that 1/ *f β* noise emerges due to the large fluctuations in the time series, while the finite time studies reveal the commonly observed magnitudes of the observable. The common fluctuations can be modeled by the familiar in the financial application's Itô SDEs. On the other hand, the large rapid fluctuations of the violent market arise due to the strong correlated influences; the processes of such a market are fast, all durations become short in comparison to the herding correlation time, and, consequently, the market should be modeled by the Stratonovich version of SDE.

For the modeling of such dynamics, we generalize Equations (28) and (33) with *x*-dependent parameter *<sup>α</sup>*(*x*). Let

$$\mathbf{d}\,\mathbf{x} = 2\boldsymbol{\alpha}(\mathbf{x})\mathbf{x}^3\,\mathbf{d}\,\mathbf{t} + \mathbf{x}^2\,\mathbf{d}\,\mathcal{W},\tag{34}$$

with, e.g.,

$$u(\mathbf{x}) = \frac{1}{2} \left[ 1 - \exp\left\{ -\left(\frac{\mathbf{x}}{\chi\_{\mathcal{E}}}\right)^2 \right\} \right],\tag{35}$$

where *xc* is the Itô to Stratonovich interpretations crossover parameter. Equations (34) and (35) represent transition from Itô to Stratonovich convention with an increase in the variable *x* and decrease of the delay time of multiplicative feedback for larger *x*, according to the Wong–Zakai theorem [103]. Detailed numerical analysis of the model represented by Equations (34) and (35) is presented in paper [102].

#### *2.6.* 1/ *f β Noise with Distributions Other Than Power–Law*

Solutions of the SDE (8) will always have power–law statistical properties of the (10) form; however, often noise with 1/ *f β* PSD is distributed according to PDF, which is not power–law, but Gaussian or some other distribution. Here, we review two different approaches, which allow for other distributions to be observed in time series with 1/ *f β* spectrum: superstatistical and coupled SDE approaches.

In [104], it was suggested that the Poissonian-like process with the slowly changing average inter-event time may be represented as the superstatistical process exhibiting 1/ *f* noise. It was assumed that the inter-event time *τk*, obtained by solving Equation (2), represents not the actual (observed) inter-event time, but its average (reciprocal of the event rate). In this setup, the actual inter-event time *τ*ˆ*k* would be given by the conditional probability

$$
\varphi(\mathfrak{t}\_k|\mathfrak{r}\_k) = \frac{1}{\mathfrak{r}\_k} e^{-\mathfrak{t}\_k/\mathfrak{r}\_k} \,\,\,\,\,\,\tag{36}
$$

similar to the non-homogeneous Poisson process. This additional randomization has no influence on the lower frequencies of the PSD and the intensity of the signal.

The PDF of the observed inter-event time *τ*ˆ*k* may be derived from the superstatistical model,

$$p(\mathfrak{t}\_k) = \int\_0^\infty p(\mathfrak{t}\_k|\mathfrak{r}\_k) p\_k(\mathfrak{r}\_k) \, \mathrm{d}\, \mathfrak{r}\_k. \tag{37}$$

Equations (36) and (37) generate the *q*-exponential distribution used in the nonextensive statistical mechanics and many real systems [105]. Detailed analytical derivations and the numerical verification were presented in [104].

In the paper [38], a similar superstatistical approach was taken with respect to the intensity of the signal *x*, obtained by solving SDE (8). The observed series *x*ˆ is assumed to be generated from *x* series by applying exogenous noise, which is described by an arbitrary conditional distribution *ϕ*(*x*<sup>ˆ</sup>|*x*). In this approach, the steady-state distribution of *x*ˆ is given by

$$p(\mathfrak{x}) = \int\_0^\infty p(\mathfrak{x}|\mathfrak{x}) p(\mathfrak{x}) \, \mathrm{d}\mathfrak{x}.\tag{38}$$

Analytical and numerical analysis of inter-trade duration, the trading activity, and the return using the superstatistical method with the exponential and normal distributions of the local signal, driven by the stochastic process, were discussed in detail in [38].

In later sections of this paper, we show that the superstatistical approach is not the only approach that allows us to change the observed signal PDF. The coupled SDE approach, proposed in [99], allows for more flexibility and easier interpretation of how the statistical properties become independent of each other. The general form of the set of coupled SDEs was derived from the scaling properties needed for the realization of 1/ *f β* noise [99]

$$\mathbf{d} \, \mathbf{x} = f(\mathbf{x}) y^{2\eta} \, \mathbf{d} \, \mathbf{t} + \mathbf{g}(\mathbf{x}) y^{\eta} \, \mathbf{d} \, \mathcal{W}\_{1\prime} \tag{39}$$

$$\operatorname{d}y = \sigma^2 \left(\eta + 1 - \frac{\lambda}{2}\right) y^{2\eta+1} \operatorname{d}t + \sigma y\_t^{\eta+1} \operatorname{d}\mathcal{W}\_2. \tag{40}$$

Here, *f*(*x*) and *g*(*x*) are arbitrary drift and diffusion functions, which determine the stationary PDF of *x*; *W*1 and *W*2 are uncorrelated standard Wiener processes. The first equation describes the changes in the intensity of the signal, while the second equation represents fluctuations in the rate of change. These coupled SDEs allow for 1/ *f β* spectrum to be reproduced together with arbitrary steady-state PDF of the observed value *x*. It was shown that the power–law slope of the PSD, *β*, of the time series of *x* generated by solving SDEs (39) and (40) depends on the parameters *η* and *λ* as follows

$$
\beta = 1 + \frac{\lambda - 1}{2\eta}.\tag{41}
$$

In Figure 4, we show that one can obtain a Gaussian distribution of *x* (subfigure (**b**)) together with 1/ *f* spectrum (subfigure (**c**)). In subfigure (**a**), one can visually see the impact of the variations in the rate of change.

**Figure 4.** Statistical properties of the time series obtained by solving coupled SDEs (39) and (40): (**a**) sample fragment of *x*(*t*) time series, (**b**) PDF of the externally observed values *x*, and (**c**) PSD of *<sup>x</sup>*(*t*). Red curves correspond to numerical results, while black curves are theoretical fits: (**b**) standard Gaussian PDF, (**c**) *S*(*f*) ∼ 1/ *f β*.

#### *2.7. Reproducing Statistical Properties of the Financial Markets*

While qualitatively, the trading activity and the absolute returns have power–law distributions and exhibit long-range memory property [14,19], corresponding empirical statistical properties have a finer structure. In order to reproduce the empirical statistical properties in detail, some modifications to the SDE are needed.

The author of [13] has determined that Hurst exponents of the trading activity time series of 1000 US stocks are remarkably close: *H* ≈ 0.85. This implies that the PSD of the trading activity should have a power–law slope *β* = 2*H* − 1 ≈ 0.7. The author of [13] has also discovered the that slope of the PDFs of the trading activity also has a power–law tail with exponent *λ* ≈ 4.4. It would be impossible to reproduce such values by using SDE (8), because Equation (10) implies that if *λ* > 3, then *β* > 1. In our analysis of 26 US stocks [106], we have confirmed the slope of the PDF, but we have observed a more complicated PSD, with two slopes instead of one (*β* < 1 for both slopes).

Both of these issues are resolved by a modified SDE for trade intensity, *n* [33]:

$$\operatorname{ad} n = \sigma^2 \left[ \eta - \frac{\lambda}{2} + \left( \frac{n\_0}{n} \right)^2 \right] \frac{n^{2\eta - 1}}{\left( n\varepsilon + 1 \right)^2} \operatorname{d} t + \sigma \frac{n^{\eta}}{n\varepsilon + 1} \operatorname{d} \mathcal{W}. \tag{42}$$

The problem of the two PSD slopes is resolved, because this SDE has two different effective *η* values. For *n* <sup>−</sup><sup>1</sup> the effective *η* is equal to the specified parameter value (in the numerical simulations we have used *η* = 5/2, thus *η*ˆ1 = 5/2). For *n* <sup>−</sup><sup>1</sup> the effective *η* is one smaller than the specified parameter value *η*ˆ2 = *η* − 1 = 3/2). The slope of the PDF increases from the value predicted in Equation (10) due to integration, as trading activity is defined as number of trades per time window *w*, or in the current parametrization, an integral of trade intensity: *N*[*t*] = *t*+*w t n*(*u*) d *u*.

Parameter *n*0 and the related term in the drift function ensure that *n* would not become very small as the term causes the potential to rapidly grow for *n* < *n*0. This helps us avoid negative trade intensities, which are impossible by definition, as well as ensure some level of minimal trading activity, which in our experience may differ for different stocks and different markets [37,106].

In Figure 5, we have shown that the stochastic model can match statistical properties of MMM stock traded on NYSE. While the matches are not perfect, some of the noticeable differences can be explained by the fact that the stochastic model does not take into account intraday seasonalities.

**Figure 5.** Trading activity (**a**) PDF and (**b**) PSD for MMM stock traded on NYSE (red curve) and the numerical solutions of SDE (42). Model parameters values: *η* = 2.5, *λ* = 4.3, *σ*<sup>2</sup> = 0.045, = 0.36, *n*0 = 0.14. Empirical and numerical PDF was obtained by considering trades in the 300 s time window.

Reproducing statistics of absolute return requires another modification of the SDE [36]. Our empirical analysis, confirmed by the other authors [105], indicated that the *q*-Gaussian distribution [38,107] seems to be a good fit for the empirical absolute return, defined as the log–price difference, distribution. This is achieved by:

$$\operatorname{ad}\mathbf{x} = \sigma^2 \left[ \eta - \frac{\lambda}{2} - \left( \frac{\mathbf{x}}{\mathbf{x}\_{\text{max}}} \right)^2 \right] \frac{\left( 1 + \mathbf{x}^2 \right)^{\eta - 1}}{\left( 1 + \epsilon \sqrt{1 + \mathbf{x}^2} \right)^2} \operatorname{x} \operatorname{d}t + \sigma \frac{\left( 1 + \mathbf{x}^2 \right)^{\frac{\eta}{2}}}{1 + \epsilon \sqrt{1 + \mathbf{x}^2}} \operatorname{d}W. \tag{43}$$

To reproduce the full complexity of the empirical data, another ingredient is needed, namely external noise, which can be understood as an effect of news flow or the distortions caused by the discrete order flow:

$$r\_t = \xi \left\{ r\_0 = 1 + \frac{2}{w} \left| \int\_{t-w}^t \mathbf{x}(u) \, \mathrm{d}u \right|, q = 1 + 2/\lambda\_2 \right\}. \tag{44}$$

This relation was inspired by the superstatistical approach (discussed in Section 2.6) and determined by trying to fit the empirical data as best we can. We have empirically determined that the best fit is obtained when *ξ* is a process that generates uncorrelated random variates from a *q*-Gaussian distribution with *q* ≈ 1.4 (*λ*2 ≈ 5) and *r*0 being one minute (*w* ≈ 60 s) moving average filter of the solutions of SDE (43). Using this model, we were able to reproduce empirical statistical properties of stock from New York (abbr. NYSE) and Vilnius stock exchanges (abbr. VSE) [36,37].

In Figure 6, we have demonstrated that the stochastic model reproduces empirical data reasonably well from NYSE and VSE. Some of the noticeable differences can be observed because we do not take into account the intraday seasonality, and we do not directly take into account that VSE had relatively low liquidity (many one minute time intervals have zero returns). Differing liquidity is a likely explanation for the differences seen between NYSE and VSE, too.

**Figure 6.** Comparison of empirical (**a**) PDFs and (**b**) PSDs of absolute one minute return as observed in NYSE (red curves) and VSE (blue curves) stocks. Empirical results are compared against the model, generated by the SDE (43) and exogenous noise Equation (44), (black curves). Model parameter values: *η* = 2.5, *λ* = 3.6, = 0.017, *x*max = 103, *λ*2 = 5.

#### *2.8. Variable Step Method for Solving Non-Linear Stochastic Differential Equations*

Note that SDEs (8), (42), and (43) are not Lipshitz continuous [68]; thus, they have to be solved by imposing boundary conditions, which would prevent the explosion of the solutions. An alternative way to achieve Lipshitz continuity is to include additional terms for restricting diffusion, which would have no detrimental effects on the PSD and PDF of the time series. Such is the role of the *n*0 term in SDE (42) and *x*max term in SDE (43).

Lacking Lipshitiz continuity causes another complication in solving the SDEs: the standard Euler–Maruyama or Milsten methods [68] do not yield good results with reasonable step sizes. This complication is resolved by using a variable step size. The core idea is to use a larger step size whenever the anticipated changes would be small and use the smaller step size whenever significant changes are coming. The mathematical form of the variable step size is often unique to the SDE being solved, but a good rule of thumb would be to linearize the drift and the diffusion functions. See [69,70] for more details.

For example, SDE (8) in our works is solved by the following set of difference equations:

$$\mathbf{x}\_{i+1} = \mathbf{x}\_i + \kappa^2 \left(\eta - \frac{\lambda}{2}\right) \mathbf{x}\_i + \kappa \mathbf{x}\_i \varepsilon\_{i\prime} \tag{45}$$

$$t\_{i+1} = t\_i + \kappa^2 \mathbf{x}^{2-2\eta}.\tag{46}$$

In the above *κ* is a small number that acts as an error tolerance parameter. The smaller it becomes, the better *xi* reproduces desired statistical properties given by Equation (10), but at the expense of numerical computation time.

Similarly, this variable step method can be also applied to SDEs with *α*-stable Lévy noise. For example, we can solve SDE (16) numerically by using the following set of difference equations

$$\mathbf{x}\_{k+1} = \mathbf{x}\_k + \kappa^a \gamma \mathbf{x}\_k + \frac{\kappa}{\sigma} \mathbf{x}\_k \mathbf{j}\_{k'}^a \tag{47}$$

$$t\_{k+1} = t\_k + \frac{\kappa^a}{\sigma^a} \mathbf{x}\_k^{-a(\eta -1)} \, , \tag{48}$$

where *ξαk* is a random variable having *α*-stable Lévy distribution. This set of difference equations should be solved only with the reflective boundaries at *x* = *x*min and *x* = *x*max using the projection method [108]. In nutshell, if the variable *xk*+<sup>1</sup> acquires the value outside of the interval [*<sup>x</sup>*min, *<sup>x</sup>*max] then the value of the nearest reflective boundary is assigned to *xk*+1. Iterative equations for SDEs (42) and (43) are a bit more complicated [36,106], but they still remain qualitatively the same.

Note that the introduction of the variable time step into the numerical solution of an SDE is equivalent to introducing the subordination scheme directly into the SDE, when internal time and physical time are related by a non-linear transformation [98].

#### **3. Agent-Based Model of the Long-Range Memory in the Financial Markets**

In the previous section, we have discussed how our group has started from the physically motivated point process model and arrived at the general class of SDEs reproducing long-range memory phenomenon; however, this generality has its drawback: microscopic mechanisms of the modeled systems are ignored. We then tried to investigate some existing financial ABMs for the possibility to derive SDE of a similar form to SDE (8). We have failed to do so with some prominent ye<sup>t</sup> complicated ABMs, such as the ones proposed in [109,110] (for more prominent ABMs of the time, which include some other candidates we have tried, see [111]); however, we have found success with Kirman's herding model, initially proposed in [112] and later analyzed in financial market context by [113,114].

#### *3.1. Kirman's Herding Model*

Kirman's herding model can be defined via two one-step transition probabilities in a system with two possible states:

$$p(X \to X+1) = (N-X)[\sigma\_1 + hX]\Delta t,\tag{49}$$

$$p(X \to X - 1) = X[\sigma\_2 + h(N - X)]\Delta t. \tag{50}$$

In the above, *X* is the number of agents in state 1 and *N* is the total number of agents within the system. Total number of agents is conserved, so the number of agents in state 2 is trivially given by *N* − *X*. Here, Δ*t* is a short time window during which only one transition should be likely. Transitions may occur either due to independent behavior (governed by parameters *<sup>σ</sup>i*), or due to recruitment (governed by parameter *h*). Using birth–death process formalism [115] it is easy to find SDE corresponding to Kirman's herding model with *x* = *X*/*N*:

$$\mathbf{d}\,\mathbf{x} = \left[ (1-\mathbf{x})\sigma\_1 - \mathbf{x}\sigma\_2 \right] \mathbf{d}\,t + \sqrt{2h\mathbf{x}(1-\mathbf{x})} \,\mathbf{d}\,\mathcal{W}.\tag{51}$$

#### *3.2. Kirman's Herding Model for the Financial Markets*

Evidently, SDE (51) is not of the same form as SDE (8), but we have not ye<sup>t</sup> discussed the meaning of states 1 and 2. In many financial ABMs of the time, it was a common choice to assume that agents represent chartist and fundamentalist traders [111]. Assuming that chartist traders trade based on the wide variety of technical trading tools, which often produce conflicting predictions, their excess demand (difference between the supply and demand generated by the group as a whole) is given by:

$$D\_{\mathfrak{c}} = r\_0 X\_{\mathfrak{c}}(t) \mathfrak{z}(t),\tag{52}$$

where *Xc*(*t*) is the number of chartist traders and *ξ*(*t*) is their average mood (describing average sentiment to buy or sell). The relative impact of the chartists' traders in comparison to fundamentalist traders is given by *r*0. Fundamentalist traders on the other hand are often assumed to trade based on the quantity known as a fundamental price, *Pf* , with the expectation that the price, *<sup>P</sup>*(*t*), in the long run, will converge towards the fundamental price. Under this assumption, their excess demand is given by:

$$D\_f = X\_f(t) \ln \frac{P\_f}{P(t)}.\tag{53}$$

Using the excess demand functions of the both groups, we can use Walras law [116] to obtain the expression for the price [40,113]:

$$P(t) = P\_f \exp\left[r\_0 \frac{\mathbf{X}\_c(t)}{\mathbf{X}\_f(t)} \xi(t)\right]. \tag{54}$$

The log–return of the price is evidently given by:

$$
\tau\_w(t) = \ln P(t) - \ln P(t - w) = r\_0 \frac{\mathbf{x}\_\mathcal{E}(t)}{\mathbf{x}\_f(t)} \zeta\_w(t). \tag{55}
$$

In the above, *ζw*(*t*) is the mood change function over time window *w*. As the mood changes on a very short time scale and we are interested in the long-term dynamics, we can simply assume that *ζw*(*t*) is some kind of uncorrelated noise and consider only a more slowly varying ratio between fractions of chartists and fundamentalists. As the total number of agents is fixed, we can define long-term component of return, modulating return, as:

$$y(t) = \frac{x(t)}{1 - x(t)}.\tag{56}$$

SDE for the modulating return is given by:

$$\operatorname{d}y = [\sigma\_1 + (2 - \sigma\_2)y](1 + y)\operatorname{d}t + \sqrt{2hy}(1 + y)\operatorname{d}W\_\prime \tag{57}$$

which is roughly similar to the SDE (8) with *η* = 3/2 and *λ* = *σ*2 *h*+ 1.

 This SDE can be generalized by introducing variable event rate *τ*(*y*) = *y*<sup>−</sup>*α*. This addition can be explained by the fact that it is well known that returns and trading volume correlate and the best correlation is achieved between squared returns and volume [16–18,117], hence suggesting that *α* = 2 is a likely candidate. With this extension and when considering only the highest powers of *y* (as the large *y* tend to influence the PSD), we obtain [40]:

$$\operatorname{d}y = h(2 - \sigma\_2)y^{2+n} \operatorname{d}t + \sqrt{2hy^{3+n}} \operatorname{d}W. \tag{58}$$

Now this SDE is completely equivalent to the SDE (8) with *η* = 3+*α* 2 and *λ* = *σ*2 *h* + *α* +1. Consequently PSD of *y* will have a frequency range in which:

$$S\_{\mathcal{Y}}(f) \sim 1/f^{\mathcal{G}}, \quad \beta = 1 + \frac{\frac{\sigma\_2}{\hbar} + \alpha - 2}{1 + \alpha}. \tag{59}$$

In the later papers, we modified this herding ABM until it was able to reproduce the absolute return PDF and PSD close to the empirical absolute return PDFs and PSDs. In [118], we have shown that considering mood dynamics can help in reproducing fractured PSD. In [41], we have reliably introduced the exogenous noise, much similar to what was achieved with the SDE driven model in [36], into this ABM, thus producing a consentaneous model. In [119,120], we have explored the opportunities to control the fluctuations in the artificial financial markets driven by the herding ABM, showing that the random trading, control strategy suggested in [121], may also destabilize the market. In [42], we have removed the assumption about the exogenous noise and replaced it with order book dynamics, thus presenting another possible explanation for fracture in the PSD: it also arises due to market price lagging behind the changes in the equilibrium price, Equation (54). Notably, the order book version of the model was able to reproduce both trading activity and absolute return statistical properties at the same time.

In Figure 7, we have reproduced one of the figures from [41] to show how well the ABM can reproduce the empirical data from New York, Vilnius, and Warsaw stock exchanges (abbr. WSE). Here, we have shown that the model was able to reproduce 10 min absolute return PDFs and PSDs from the different stock exchanges, but in the original article, more intraday time scales are covered, and seasonality was also taken into account.

**Figure 7.** Comparison of empirical (**a**) PDFs and (**b**) PSDs of absolute ten minute return as observed in NYSE (red curves), VSE (blue curves), and WSE (green curves) stocks. Empirical results are compared against the consentaneous model, defined in [41]. Model parameter values are the same as in Figure 2 of [41].

#### *3.3. Kirman's Herding Model, Voter Model, and the Opinion Dynamics Context*

Attentive reader with a background in opinion dynamics will likely notice that Kirman's model is remarkably similar to the well-known voter model [47–49]. They are identical, which has prompted us to question whether the voter model is truly a model for voters, which Fernandez–Garcia et al. in [122] also raised. This has lead us to explore and model statistical properties of spatially heterogeneous electoral data [43]. As we have noticed segregation effects in the electoral data, we have continued our investigation by considering the migratory nature of census and electoral data [44]. Similar approaches were taken by others as well. Sano and Mori [123] have looked into spatiotemporal Japanese election data in their model, assuming a noticeable fraction of stubborn voters who do not allow for the party's popularity to drop below a certain threshold. Braha et al. [124] have considered spatiotemporal US election data and have also emphasized the role of opinion leaders and spatial variability of external influences. Fenner et al. [125,126] have started from a generative model inspired by survival analysis, but in later works transition to the SDE framework [127,128]. Michaud and Szilva [129] have fixed issues with the model originally proposed by Fernandez–Garcia et al. [122], mainly, they have redefined how the noise term is handled so that the model would be more mathematically well-posed. Marmani et al. [130] have provided a similar empirical analysis of Italian electoral data and provided an additional perspective from the point of view of Shannon entropy.

As is common in opinion dynamics [47–49], we have also explored the influence of network topologies on the statistical properties of Kirman's herding model. Namely, we have demonstrated [131] a continuous transition from extensive case, characterized by localized interactions, Gaussian distributions, and Boltzmann entropy, to a non-extensive case, characterized by global interactions, *q*-Gaussian distribution, and Tsallis entropy. Similar results were demonstrated earlier by Alfarano and Milakovic [132], who have explored how Kirman's herding model works on random, Barabasi–Albert, and smallworld network topologies. Similar observations were also made in [133], but Carro et al. have used the so-called annealed approximation, which takes into account network structures better than the usual mean-field approximation.

Recently, we have also used the noisy voter model to model parliamentary presence [45]. A paper by Vieira et al. [134] has inspired us to look into the Lithuanian parliamentary presence data. Unlike Vieira et al., we have observed not a ballistic diffusion regime but superdiffusive behavior; however, both of these regimes can be obtained from the noisy voter model with imperfectly acting agents. Namely, agents can internally intend to attend the parliamentary session or skip, but the action itself may be random despite being conditioned on the intended action. As Vieira et al. have used fractional diffusion equation as a model, this result implies that it may be possible to fake long-range memory

encoded in the fractional diffusion equation by using Markov models employing non-linear transformations of the voter model [101].

The classical voter model incorporates only a recruitment mechanism, despite other responses to social interaction being possible. For example, diamond model [135] posits that independence and anti-conformity mechanism may be important to understanding human social behaviors. Similarly, Latane social impact theory [136] predicts the importance of supportive interactions—namely, individuals strengthening the conviction of their like-minded peers. While this theory was recently studied in the opinion dynamics context [137,138], it has not been combined with the voter model. One could also consider majority-vote models [139–141] and q-voter models [142,143] as implementing some kind of support by the like-minded agents. In majority-vote models, recruitment is only possible if a majority of agents have opposing opinions (therefore, the majority becomes harder to convince, but the minority remains as susceptible to change). In most q-voter models, a group of *q* agents must share an opinion to convince a single agent. We have implemented supportive interactions by decreasing the transitions rates of the agents by an amount proportional to the number of like-minded agents. In some cases, these modifications cause the transition rates go to zero, which freezes the system state. Similar qualitative behavior is observed in works, which consider non-Markovian mechanisms, such as implicit opinion freezing or aging [144–147]. This serves as another example that highly non-linear Markovian models can lead to similar dynamics as the dynamics generated by the non-Markovian models.

#### **4. Searching for the True Long-Range Memory Test**

We have reviewed our experience of modeling long-range memory phenomena using Markovian models in the earlier sections. We have shown numerous examples of nonlinearity causing behaviors and dynamics reminiscent of the models with true long-range memory (such as delayed feedback, aging, freezing, and fractional dynamics). In this section, we present our latest endeavor to find a statistical test, which would distinguish whether the real-life systems possess true or spurious long-range memory. We proposed a test earlier, based on the specific first-passage times, which we refer to as the burst and interburst duration analysis (abbr. BDA) [148–151].

Investigating empirical PDF of burst and interburst duration compared with the model properties, we have interpreted the observed long-range memory in the financial markets by ordinary non-linear SDEs representing multifractal stochastic processes with non-stationary increments [152,153]. One has to take into account the interplay of endogenous and exogenous fluctuations in the financial markets to build a comprehensive model of this complex system [154]. Non-linear SDEs might be applicable in the modeling of other social systems, where models of opinion or population dynamics lead to the macroscopic description by these equations [148–151]. The description by SDEs is an alternative to the modeling incorporating fractional dynamics, if power–law statistical properties are observed in the empirical data.

The BDA employs the dependence of first-passage time PDF on Hurst exponent *H* for the fractional Brownian motion [56,152,153,155].

FBM, FLSM, and ARFIMA [156–158] form the theoretical background of long-range memory and self-similar processes. These processes, first of all, served for the modeling of systems with anomalous diffusion and expected fractional dynamics [159]. We can consider fractional models possessing true long-range memory as they have correlated increments. Self-similar processes with non-Gaussian stable increments are essential for the modeling of social systems as well. In the financial markets, power–law distributions of noise often interplay with autocorrelations [160–162]. In [163], we implemented BDA for the order disbalance time series seeking to confirm or reject the long-range memory in the order flow. Further, we analyzed the same LOBSTER data of order flow in the financial markets [164] from the perspective of FLSM and ARFIMA models seeking to identify the impact of increment distributions and correlations on estimated parameters of self-similarity [165]. The revealed peculiarities of non-Gaussian fractional dynamics in this financial system

raise new questions about whether used sample estimators are reliable. In this section, we test various long-range memory estimators such as mean squared displacement, absolute value estimator, Higuchi's method, and BDA on discrete fractional Lèvy stable motion represented by the ARFIMA sample series.

#### *4.1. Fractional Processes with Non-Gaussian Noise*

FBM serves as a model of the correlated time series with stationary Gaussian increments and generalizes the classical Brownian motion [1]. One can define FBM, *BH*(*t*), of the index *H* (Hurst parameter) in the interval 0 < *H* < 1 as the Itô integration over classical Brownian motion *B*

$$B\_H(t) = \int\_{-\infty}^{\infty} \left( (t - u)\_+^d - (-u)\_+^d \right) \mathrm{d} \, B(u) \, \tag{60}$$

where *d* = *H* − 1/2, (*x*)+ = max(*<sup>x</sup>*, <sup>0</sup>). The parameter *H* in FBM quantifies fractal behavior, long-range memory, and anomalous diffusion. This is not the case for the other more general stochastic processes. Thus, in this contribution the Hurst parameter *H* is responsible only for the fractal properties of the trajectories. We will consider fractional Lèvy stable motion as more general process with non-Gaussian distribution *<sup>L</sup><sup>α</sup>H*(*t*) representing an integrated process of independent and stable stationary increments d *<sup>L</sup><sup>α</sup>*(*u*) [156]

$$L\_H^a(t) = \int\_{-\infty}^{\infty} \left( (t - u)\_+^d - (-u)\_+^d \right) \mathrm{d} \, L^a(u),\tag{61}$$

where parameter *d* depends on *H* and parameter of stable distribution *α*, *d* = *H* − 1/*<sup>α</sup>*. The parameter *α* characterizes special class of stable, invariant under summation, distributions [166], useful in the modeling both super and sub-diffusion [159]. Here, we are interested in the symmetric zero mean, stable distribution defined by the stability index in the region 0 < *α* < 2. This new parameter is responsible for the power–law tails of the new PDF *<sup>P</sup>*(*x*) ∼ |*x*|−<sup>1</sup>−*α*.

FBM and FLSM exhibit identical self-similar scaling behavior in statistical sense,

$$B\_H(\mathfrak{c}t) \sim \mathfrak{c}^H B\_H(\mathfrak{t}), \qquad L\_H^a(\mathfrak{c}t) \sim \mathfrak{c}^H L\_H^a(\mathfrak{t}), \tag{62}$$

where *x* ∼ *y* means that *x* and *y* have identical distributions. One can establish the relation with the fractal dimension of trajectories *D* = 2 − *H* [167]. In analogy to the notions used in fractal geometry, these types of processes can be considered self-similar.

Mean squared displacement (abbr. MSD) is another important statistical property of various complex systems. Mathematically it was introduced as an ensemble average of the possible microscopic trajectories *x*(*t*) [159]

$$
\langle (\mathbf{x}(t) - \mathbf{x}(0))^2 \rangle \sim t^{\lambda}, \quad \lambda = 2d + 1. \tag{63}
$$

Note that Equation (63) is valid for the FBM, while the ensemble average of FLSM diverges [156]. For the FBM *d* = *H* − 1/2, while for the FLSM *λ* is not defined. When *d* < 0, one observes dynamics as sub-diffusion and for *d* > 0 as super-diffusion.

In experimental or empirical data analysis, one usually deals with discrete-time sample data series {*Xi*}. It is challenging to decide which model to apply in the description of empirical data when diffusion is anomalous *d* = 0, as observed dynamics in the sample data can originate from the long-range memory or power–law of the noise. We will use the sample MSD defined as

$$M\_N(k) = \frac{1}{N - k + 1} \sum\_{i=0}^{N-k} (X\_{i+k} - X\_k)^2. \tag{64}$$

Let us also introduce increment process {*Yi* = *Xi* − *Xi*−<sup>1</sup>}, which is extracted from the sample data series. In the case of the FBM increment process, it is called fractional Gaussian noise (abbr. FGN), and in the case of FLSM, it is called fractional Lèvy stable noise (abbr. FLSN). The authors in [156] provide evidence of FLSM non-ergodicity and that *MN*(*k*) ∼ *kλ*, where *λ* = 2*d* + 1, for large *N*, *k*, and *N*/*k*. Thus, the MSD sample analysis of time series with FLSM assumption becomes very important providing estimation of the memory parameter *d*. The long-range memory usually is defined through the divergence of autocovariance *ρ*(*k*), ∑∞*<sup>k</sup>*=<sup>1</sup> *ρ*(*k*) = <sup>∞</sup>, [11]

$$\rho(k) = \frac{1}{N - k + 1} \sum\_{i=1}^{N-k+1} Y\_i Y\_{i+k} = 2^{-1} \{ (k+1)^{2H} - 2k^{2H} + |k - 1|^{2H} \} \tag{65}$$

$$\sim H(2H - 1)k^{-\gamma}, \quad k \to \infty.$$

For the FGN, the exponent of autocorrelation is defined by the Hurst parameter *γ* = 2 − 2*H*. We see that FBM is an essential long-range memory process with various statistical properties defined by the Hurst parameter. Thus, researchers use an extensive choice of statistical estimators to determine *H* and evaluate memory effects even when investigated time series deviate from the Gaussian distribution.

Accepting a more general FLSM approach, one has to reevaluate previously used estimators [163], as we now have more independent parameters. The stability index 0 < *α* < 2 and the memory parameter *d* both contribute to the observed sample properties. Since in the Lèvy stable case, the second moment is infinite the measure of noise autocorrelation, e.g., the co-difference [166,168], is used instead of covariance

$$
\pi(k) = \sim k^{-(a-aH)}.\tag{66}
$$

Note that the parameter *γ* = *α* − *αH* = *α* − *αd* − 1, has a strong dependency on *α*, when for the Gaussian processes, it was considered just as the indicator of long-range memory. Consequently, the previously used sample power spectral density analysis, the rescaled range analysis [169–171], or multifractal detrended fluctuation analysis [172,173] has to be reevaluated from the perspective of FLSM [163,165].

Earlier, we have introduced the burst and interburst duration analysis (BDA) as one more method to quantify the long-range memory through the evaluation of *H* [149,152,153,163]. For the one dimensional bounded sample time series, any threshold divides these series into a sequence of burst *Tbj* and interburst *Tij* duration, *j* = 1, ...*Nb*. The notion of burst and interburst duration follows from the threshold first-passage problem initiated at the nearest vicinity of the threshold. The burst duration is the first-passage time from above and interburst from below the threshold, see [149,152,153,163] for more details. The empirical (sample) PDF (histogram) of *Tj* gives us the information about *H*, as the power–law part of this PDF should be *T*2−*<sup>H</sup>* [56]. We have to revise the method of BDA from the more general perspective of FLSM [165], as the question of which properties can be recovered using this method is open and has to be investigated.

The method of absolute value estimator (abbr. AVE) works correctly even for the time series with infinite variance [11,167,168,174]. The method is based on mean value *δn* calculated from sample series *Yi* and evaluating its scaling with length of sub-series *n*. Divide the increment series *Yi* into blocks of size *n*, so that *m* · *n* = *N*, and average within each block to obtain the aggregated series *Y*(*n*) *j* = 1*n* <sup>∑</sup>*jni*=(*j*−<sup>1</sup>)*n*+<sup>1</sup> *Yi*. Calculate *δn*

$$\delta\_{\mathfrak{u}} = \frac{1}{m} \sum\_{j=1}^{m} |\mathcal{Y}\_j^{(n)} - \langle \mathcal{Y} \rangle|,\tag{67}$$

where *X* is the overall series mean. Then the absolute value scaling parameter *HAV* can be evaluated from the scaling relation

$$
\delta\_n \sim n^{HAV}^{-1}.\tag{68}
$$

One more almost equivalent estimator of scaling properties regarding the FLSM is Higuchi's method [11,175]. It relies on finding fractional dimension *D* of the length of the path. The normalized path length *Ln* in this method is defined as follows

$$L\_n = \frac{N-1}{n^3} \sum\_{i=1}^n \frac{1}{m-1} \sum\_{j=1}^{m-1} |X\_{i+jn} - X\_{i+(j-1)n}|\_{\prime} \tag{69}$$

and *Ln* ∼ *<sup>n</sup>*<sup>−</sup>*D*, where *D* = 2 − *H*.

We investigate four methods: AVE, Higuchi's, MSD, and BDA for the analysis of ARFIMA time series as a test sample of FLSM.

#### *4.2. Numerical Exploration of the Accumulated ARFIMA(0,d,0) Time Series*

Let us consider the discrete process {*Xi*} defined as a cumulative sum,

$$X\_{i+1} = X\_i + Y\_{i\prime} \tag{70}$$

of correlated increments {*Yi*}. Let the increments be generated by the ARFIMA(0,d,0) process [158,176]:

$$Y\_i = \sum\_{j=0}^{\infty} \frac{\Gamma(j+d)}{\Gamma(d)\Gamma(j+1)} Z\_{i-j\prime} \tag{71}$$

with random *Zi*−*<sup>j</sup>* from the domain of attraction of an *α*-stable law with 0 < *α* ≤ 2. One can calculate the sum in Equation (71) using the fast Fourier transform algorithm. The approximate relation between FLSM and ARFIMA can be derived using Riemann-sum approximation, see [176] for details.

Seeking to generate comparable time series with that analyzed in [165], the order disbalance time series of the financial markets we choose is *N* = 7 × 106, nine values of *d* = {−0.4, −0.3, −0.2, −0.1, 0.0, 0.1, 0.2, 0.3, 0.4} and four values of *α* = {2, 1.5, 1.25, 1.0}. The sample time series for any set of parameters have been evaluated using four estimators described above: MSD, AVE, Higuchi's estimator, and BDA. We evaluate *H* as described in the previous subsection. First of all, we partition time series *Yi* in subsets with 5 × 10<sup>5</sup> time steps and accumulate them to obtain 14 subseries *Xi*. Then, the exponent *λ* or the Hurst parameter are evaluated for each subseries using MSD, AVE, and Higuchi's sample estimators. Finally, we calculate the mean and standard deviation of defined 14 *λ* and *H* sets. Estimated *d* we calculate using *d* = *H* − 1/*α* or *d* = (*λ* − 1)/2 in MSD case. The graphs in Figure 8 of estimated *d* versus used ARFIMA model d serve as a good test of used estimators.

**Figure 8.** Comparison of the MSD (**a**), AVE (**b**), and Higuchi (**c**) estimator performance when estimating *d* from the accumulated ARFIMA(0,d,0) series in the unbounded case, {*Xi*} generated by Equation (70). Different curves correspond to the different values of the noise distribution stability parameter: *α* = 2 (red triangles), 1.5 (green diamonds), 1.25 (blue squares), and 1 (black circles).

Our numerical result given in subfigure (**a**) confirms the theoretical prediction for the sample MSD *MN*(*k*) ∼ *k*2*d*+<sup>1</sup> [156] as estimated *d* using this relation almost coincides with model *d* for all values of *α*. It is accepted that two estimators, absolute value and Higuchi's, are almost equivalent and should be applicable for the analysis of fractional processes with stable distribution [11,167,168,174]. Indeed, the results of our numerical investigation, see (**b**) and (**c**) subfigures in Figure 8b,c, confirm the equivalence of these estimators. Nevertheless, the estimated values of memory parameter *d* deviate considerably from its model value, when *α* → 1, and these deviations are much more prominent for the super-diffusion case *d* > 0. These deviations do not arise as a computational effect, as the estimated relative standard deviation decreases from 0.15 to 0.02 for the evaluated *H* in the investigated interval of *d*. Fortunately, this result does not contradict the study [165], where we used these estimators to evaluate *d* in empirical order disbalance time series exhibiting sub-diffusion.

It is important to note that the estimators, MSD, AVE, and Higuchi's should work well only for the unbounded time series when the most physical systems and processes are of finite size and duration. In all such cases, boundary effects might become important, and one must choose or propose more reliable estimators [167]. The BDA considered in our previous work [149,152,153,163], probably, can serve as an alternative approach. This method works better for the bounded time series, where more intersections of series with the threshold can be expected. Thus, in this contribution for the BDA, we restrict the diffusion of *Xi* to the interval [− *Xmax*, *Xmax*] (in our analysis we use *Xmax* = (10<sup>5</sup>)<sup>2</sup>*d*+1). This restriction is implemented as a soft boundary condition:

$$X\_{i+1} = \max(\min(X\_i + Y\_{i\prime}X\_{\max}), -X\_{\max}).\tag{72}$$

This iterative relation replaces Equation (70) in the {*Xi*} series generation algorithm.We define the PDF of the burst and interburst duration *Tj* for the whole set of time steps *N* = 7 × 10<sup>6</sup> and the series threshold equal to zero mean. Note that only in this symmetric case PDF's of burst and interbust duration coincide. Seeking to understand how the diffusion restriction mechanism impacts the results of other estimators, we use the same restriction mechanism for the 14 subseries obtained after the partition procedure. We present the results of this analysis in Figure 9.

**Figure 9.** Comparison of the MSD (**a**), AVE (**b**), Higuchi (**c**), and BDA (**d**) estimator performance when estimating *d* from the accumulated ARFIMA(0,d,0) series in the bounded case, {*Xi*} generated by Equation (72). Different curves correspond to the different values of the noise distribution stability parameter: *α* = 2 (red triangles), 1.5 (green diamonds), 1.25 (blue squares), and 1 (black circles).

Though the used diffusion restriction is relatively soft and changes the direction of movement in the limited number of trajectories points, the results of MSD, AVE, and Higuchi's estimators changed very considerably—compare subfigures (**<sup>a</sup>**–**<sup>c</sup>**) with the corresponding results in Figure 9. Contrary, the results obtained using *H* defined by BDA, see subfigure (**d**), resembles AVE (**b**) and Higuchi's estimator (**c**) subfigures from unbounded

series Figure 9. Further investigation is needed to define the best methods and sample estimators for evaluating parameters of fractional time series impacted by various diffusion restrictions. The vast amount of data available from the financial markets can serve as empirical time series considered from the perspective of FLSM.

#### **5. Future Considerations**

Here, we have reviewed our approaches to modeling the long-range memory phenomenon and power–law statistics in a variety of complex systems. Our approach differs from the usual approach taken by mathematicians in that we have used Markovian models instead of the non-Markovian alternatives. We were able to reproduce similar behaviors due to our models being driven by various non-linear dependencies. In the case of SDEs, non-linearity may cause the increments of the stochastic process to be non-stationary and, by consequence, cause spurious long-range memory [177,178]. The many models we have built over the years are not models of true long-range memory; however, the critical question is whether our models capture the memory as observed in the financial markets and possibly other socioeconomic complex systems. Section 4, which describes our most recent endeavor, hints at three components that are needed to provide an answer.

The first component is a statistical test, which should distinguish between spurious and true long-range memory. Currently, we are considering the BDA method [148–151], which performs reasonably well in comparison to the alternatives. The core idea of the method is that for any one-dimensional Markovian random walk first-passage time PDF should be a power–law with exponent −3/2 at least for some of the duration. Deviations from this law could indicate the presence of true long-range memory. Though the method may fail when the stochastic process is not one-dimensional, the study of what happens in the multidimensional case, e.g., as in [99], is pending. Other challenges may also arise, as discussed in Section 4.

The second component would be a selection of models exhibiting both spurious and true long-range memory. Our prior research has introduced a variety of models of spurious long-range memory; hence, the next steps would be formulating comparable alternative models and studying properties of the existing long-range memory models. Here, we have focused on estimating long-range memory in the fractional Lévy stable motion (modeled using ARFIMA(0,d,0) discrete process), which is a generalization of the fractional Brownian motion; however, in general, other models could also be considered, for example, the multiplicative point process (see Section 2) could be generalized by replacing uncorrelated Gaussian noise with fractional Gaussian noise. Other correlation structures or variable pulse duration could also be considered as an extension [179]. Other notable alternatives and extensions include continuous-time random walk [180] and complex contagion frameworks [181,182].

The third component would be a variety of data from socioeconomic complex systems. Many of our earlier approaches relied on high-frequency absolute return and trading activity time series, but in our most recent works, we have shifted our attention to the order book data obtained from LOBSTER [164]. Order book data seem to invite a more general approach by understanding the data within FLSM or ARFIMA mindset for a broad class of anomalous diffusion processes [157,167,168]. The vast data in social and financial systems have to be investigated to identify and validate the fractional dynamics and long-range memory. Our first results in this direction [163,165] question the interpretation of longrange memory in the order flow data of financial markets. First of all, a prudent choice of estimators based on FLSM and ARFIMA assumptions are needed. After extensive analysis from this perspective, it would be possible to decide whether the investigated social system exhibits true long-range memory or observed power–law statistical properties are just the outcome of strong non-linear effects.

Research effort combining all these three components could yield a better understanding of the long-range memory phenomenon as it is observed in the variety of complex systems. The comprehensive interpretation of long-range memory observed in the financial

and other social systems should considerably contribute to developing advanced analytical tools for applications in financial markets. Thus, we have focused on the description and explanation of the long-range memory phenomenon. Notably, a few more recent works refer to or use some of our results and are more application-minded. In [73] a non-linear SDE was derived, providing both physical and economic arguments, to study the performance of EUR/CHF exchange rate. The derived SDE belongs to the class described by (8). The author of [183] has considered the relationship between aging and long-range memory phenomena in a couple of physics experiments: blinking-quantum-dots, single-file diffusion, and Brownian motion in a logarithmic potential. The author of [184] has shown that SDE (8) applies to the modeling of the dynamics on microblogging networks. The author of [185] has considered the effects of perturbations on the stability of power–law distributions in general with an application to wealth distributions. The author of [186] tested the applicability of simple stochastic models to the modeling of non-stationary behavior of intraday tick-by-tick returns. The author of [187] has tested forecast robustness of non-linear GARCH model when time series exhibit high positive autocorrelation. Mean reversion phenomenon was studied in Karachi Stock Exchange data from the perspective of GARCH models in [188]. The authro of [189] has compared the performance of nonlinear SDE models against Black and Scholes model, which is one of the models used by the practitioners. Various modifications of Heston model, another model favored by the practitioners, are also reminiscent of SDE (8) [190]. We hope to inspire and maybe take up more application–minded endeavors.

**Author Contributions:** Conceptualization, R.K., A.K., B.K. and V.G.; methodology, R.K., A.K., B.K. and V.G.; software, A.K. and V.G.; validation, R.K., A.K., B.K. and V.G.; formal analysis, B.K.; investigation, R.K., A.K., B.K. and V.G.; resources, R.K., A.K., B.K. and V.G.; data curation, R.K., A.K., B.K. and V.G.; writing—original draft preparation, R.K., A.K., B.K. and V.G.; writing—review and editing, R.K.; visualization, R.K., A.K., B.K. and V.G.; supervision, V.G.; project administration, R.K.; funding acquisition, R.K. and V.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This project was funded by the European Union (project No 09.3.3-LMT-K-712-19-0017) under the agreemen<sup>t</sup> with the Research Council of Lithuania (LMTLT).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
