*Editorial* **Multiscale Entropy Approaches and Their Applications**

#### **Anne Humeau-Heurtier**

LARIS—Laboratoire Angevin de Recherche en Ingénierie des Systèmes, University of Angers, 49035 Angers, France; anne.humeau@univ-angers.fr

Received: 28 May 2020; Accepted: 2 June 2020; Published: 10 June 2020

**Keywords:** multiscale entropy; multivariate data; entropy

#### **1. Introduction**

Multiscale entropy (MSE) measures have been proposed since the early 2000s to evaluate the complexity of time series by taking into account the multiple time scales in physical systems. Since then, these approaches have received a great deal of attention and have been used in a large range of applications. Multivariate approaches have also been developed.

The algorithms for an MSE approach are composed of two main steps: (i) a coarse-graining procedure to represent the system's dynamics on different scales; and (ii) the entropy computation for the original signal and for the coarse-grained time series to evaluate the irregularity at each scale. Moreover, different entropy measures have been associated with the coarse-graining approach, each one having its advantages and drawbacks: approximate entropy, sample entropy, permutation entropy, fuzzy entropy, distribution entropy, dispersion entropy, etc.
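As an illustration, the two steps above translate directly into a generic recipe. The following minimal Python/NumPy sketch (function names are illustrative; `entropy_fn` stands for any of the single-scale measures listed above) computes one entropy value per scale after the usual non-overlapping coarse-graining:

```python
import numpy as np

def multiscale_entropy(x, scales, entropy_fn):
    """Generic two-step MSE recipe: (i) coarse-grain the series at each
    scale, (ii) compute a single-scale entropy of the result."""
    x = np.asarray(x, float)
    values = []
    for tau in scales:
        n = len(x) // tau                              # number of windows
        cg = x[:n * tau].reshape(n, tau).mean(axis=1)  # step (i): coarse-graining
        values.append(entropy_fn(cg))                  # step (ii): entropy per scale
    return values
```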

In this Special Issue, we gathered 24 papers focusing on either the theory or applications of MSE approaches. These papers can be divided into two groups: papers that either propose new developments of entropy-based measures or improve the understanding of existing ones (nine papers); and papers that propose new applications of existing entropy-based measures (14 papers), as described below. Moreover, one paper proposes a review of cross-entropy methods and their multiscale approaches [1].

#### **2. New Developments in Entropy-Based Measures**

Lee et al. proposed a multiscale distribution entropy based on a moving averaging multiscale process and distribution entropy to study short-term heart rate variability (HRV) [2]. The authors show that the new entropy-based measure outperforms MSE and multiscale permutation entropy as it is insensitive to the length of signals. The new measure shows a decrease in the complexity of HRV with aging and for congestive heart failure patients.

Zhao et al. proposed the multiscale entropy difference (MED) to assess the predictability of nonlinear financial time series on several time scales [3]. MED quantifies the contribution of past values to reducing the uncertainty of forthcoming values in signals on several time scales. The algorithm was validated on simulated data and then applied to the analysis of Chinese stock markets.

Cheng et al. proposed a method based on multimodal multiscale dispersion entropy for the biometric characterization of heart sounds [4]. The work relies on the use of the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN) and refined composite multiscale dispersion entropy. The authors show that the proposed method is effective for heart sound biometric recognition.

Dong et al. proposed a method, KeepSampEn, to minimize the error due to missing values in sample entropy calculation [5]. For this purpose, they modified the computation process but not the data. The results reveal that KeepSampEn consistently shows a lower average percentage error than other methods such as skipping the missing values, linear interpolation, and bootstrapping.

Tiwari et al. investigated the multiscale features of the mental workload for ambulant users [6]. Features that outperform benchmark ones are proposed and they exhibit complementarity when used in combination. Thus, the authors reported that composite coarse-graining via a new second moment moving average scaling method, combined with the modified permutation entropy method, outperforms other combinations.

From a Taylor series expansion, Dávalos et al. developed an explicit expression for the variance of the multiscale permutation entropy (MPE) estimator as a function of the time scale and the ordinal pattern distribution [7]. They also determined the Cramér–Rao lower bound of the MPE. The results show that the MPE variance is related to the MPE measurement and increases linearly with the time scale, except when the MPE reaches its maximum value. Moreover, for time scales that are short compared to the signal length, the MPE variance resembles the MPE Cramér–Rao lower bound.

Bajic et al. proposed a method that enables an application of MSE to an arbitrary number of signals [8]. The authors also tested whether their method recognizes changes in the dependency level (coupling strength, level of interaction) of joint multivariate signals in different biomedical experiments. For this purpose, they used the copula density to determine the coupling strength. Moreover, the authors applied the composite MSE to the systolic blood pressure, the pulse interval, and the body temperature of rats exposed to different ambient temperatures.

Azami et al. introduced the multivariate multiscale dispersion entropy (mvMDE) to quantify the complexity of multivariate time series [9]. When applied to different kinds of signals, the results show that mvMDE has some advantages over multivariate multiscale entropy (mvMSE) and multivariate multiscale fuzzy entropy (mvMFE).

Martins et al. introduced a new method to assess the complexity of multivariate time series [10]. This method takes into account the presence of short-term dynamics and long-range correlations and uses vector autoregressive fractionally integrated (VARFI) models, which lead to a linear parametric representation of vector stochastic processes. An analytical formulation is then obtained to derive the MSE measures. The authors tested this new approach on cardiovascular and respiratory signals to assess the complexity of the heart period, the systolic arterial pressure, and the respiration variability in different physiological conditions. The results show that, by taking into account long-range correlations, the proposed method outperforms existing ones, as it captures significant variations in complexity that are not observed with standard methods.

#### **3. Applications of Existing Entropy-Based Measures**

In this Special Issue, 14 papers propose to use existing entropy-based measures for different kinds of applications, as mentioned below.

Harezlak et al. studied eye movement signal characteristics [11]. For this purpose, the authors used several methods: approximate entropy, fuzzy entropy, and the largest Lyapunov exponent. For these three methods, multilevel maps are defined. The results show better accuracy for saccadic latency and saccades than previous studies using eye movement dynamics.

Liau et al. evaluated the changes in the complexity of the center of pressure (COP) during walking at different speeds and for different durations [12]. For this purpose, the MSE was used. The authors show that both the walking speed and walking duration factors significantly affect the complexity of COP.

Based on ensemble empirical mode decomposition (EEMD) and MSE and using an accelerometer, Nurwulan et al. proposed a measure, the postural stability index (PSI), to distinguish different stability states in healthy subjects [13]. PSI is able to discriminate between normal walking and walking with obstacles in healthy subjects.

McDonough et al. were interested in post-encoding memory consolidation mechanisms in a sample of young, middle-aged, and older adults [14]. For this purpose, they tested a novel measure of information processing, network complexity, and studied whether it was sensitive to these post-encoding mechanisms. Network complexity was determined by assessing the irregularity of brain signals within a network over time. This was performed through MSE. The results show that network complexity is sensitive to post-encoding consolidation mechanisms that enhance memory performance.

Menon and Krishnamurthy mapped neuronal and functional complexities from the MSE of resting-state functional magnetic resonance imaging (rfMRI) blood oxygen-level dependent (BOLD) signals and BOLD phase coherence connectivity [15].

De Wel et al. proposed a novel unsupervised method to discriminate quiet sleep from non-quiet sleep in preterm infants, from the decomposition of a multiscale entropy tensor [16]. This was performed according to the difference in the electroencephalography (EEG) complexity between the neonatal sleep stages.

Jelinek et al. investigated the efficacy of applying multiscale Rényi entropy on heart rate variability (HRV) to obtain information on the sign, magnitude, and acceleration of the signals with time [17]. The results show that their quantification using multiscale Rényi entropy leads to statistically significant differences between the disease classes of normal, early cardiac autonomic neuropathy (CAN), and definite CAN.

El-Yaagoubi et al. studied the dynamics, consistency, and robustness of MSE, multiscale time irreversibility (MTI), and the multifractal spectrum in HRV characterization in long-term scenarios (7 days) [18]. The results show that the congestive heart failure (CHF) and atrial fibrillation (AF) populations display significant differences at long-term and very long-term scales (MSE is higher for AF, while MTI is lower for AF).

For an early Alzheimer's disease (AD) diagnosis, Perpetuini et al. used the sample entropy and the MSE of functional near infrared spectroscopy (fNIRS) signals recorded in the frontal cortex of early AD patients and healthy controls during three tests assessing visuo-spatial and short-term-memory abilities [19]. A multivariate analysis revealed promising results (good specificity and sensitivity) regarding the capability of fNIRS and complexity analysis for an early diagnosis.

Keshmiri et al. studied the effect of the physical embodiment on older people's prefrontal cortex (PFC) activity when they are listening to stories [20]. For this purpose, they used MSE. Their results show that, in older people, physical embodiment leads to a significant increase of MSE for PFC activity. Moreover, this increase reflects the perceived feeling of fatigue.

Xu et al. used the short-time series MSE (sMSE) to study the complexities and temporal correlations of Wikipedia page views for four selected topics [21]. The goal was to understand the complexity of human website searching activities. The results show that sMSE is useful for analyzing the temporal variations of the complexity of page view data for some topics. Nevertheless, the regular variations of sample entropy cannot be taken at face value when different topics are compared.

Lin et al. developed an entropy-based structural health monitoring system to solve the problem of unstable entropy values observed when multiscale cross-sample entropy is used to assess damage in a laboratory-scale structure [22]. The results could be interesting for long-term monitoring.

Ge et al. proposed a bearing fault diagnosis technique that uses local robust principal component analysis to remove background noise (it decomposes the signal trajectory matrix into multiple low-rank matrices) and multiscale permutation entropy to identify the low-rank matrices corresponding to the bearing's fault feature [23]. The latter matrices are then combined into a one-dimensional signal that represents the extracted fault feature component.

Shang et al. used variational mode decomposition and multiscale dispersion entropy to propose a novel feature extraction method for partial discharge fault analysis [24]. Moreover, a hypersphere multiclass support vector machine was used for partial discharge pattern recognition.

Let us now hope that these papers will bring other interesting applications and lead to new ideas to further improve the study of the irregularity and complexity of data (1D, 2D, *n*-D).

**Funding:** This research received no external funding.

**Acknowledgments:** I express my thanks to the authors of the above contributions and to the *Entropy* Editorial Office and MDPI for their support during this work.

**Conflicts of Interest:** The author declares no conflict of interest.

#### **References**


© 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Review* **(Multiscale) Cross-Entropy Methods: A Review**

**Antoine Jamin <sup>1,2,\*</sup> and Anne Humeau-Heurtier <sup>2</sup>**


Received: 9 December 2019; Accepted: 26 December 2019; Published: 29 December 2019

**Abstract:** Cross-entropy was introduced in 1996 to quantify the degree of asynchronism between two time series. In 2009, a multiscale cross-entropy measure was proposed to analyze the dynamical characteristics of the coupling behavior between two sequences on multiple scales. Since their introduction, many improvements and other methods have been developed. In this review, we offer a state of the art of cross-entropy measures and their multiscale approaches.

**Keywords:** cross-entropy; multiscale cross-entropy; asynchrony; complexity; coupling; cross-sample entropy; cross-approximate entropy; cross-distribution entropy; cross-fuzzy entropy; cross-conditional entropy

#### **1. Introduction**

To quantify the asynchronism between two time series, Pincus and Singer adapted the approximate entropy algorithm into a cross-approximate entropy (cross-ApEn) method [1]. Other cross-entropy methods that improve on cross-ApEn have since been developed [2–7]. Furthermore, additional cross-entropy methods have been introduced to quantify the degree of coupling between two signals, or the complexity between two cross-sequences [8–10]. Cross-entropy methods have recently been used in different research fields, including medicine [5,11,12], mechanics [13], and finance [7,10].

The multiscale approach of entropy measures was proposed by Costa et al. in 2002 to analyze the complexity of a time series [14]. In 2009, Yan et al. proposed a multiscale approach for cross-entropy methods to quantify the dynamical characteristics of coupling behavior between two sequences on multiple scale factors [15]. Then, other multiscale procedures have been published with different cross-entropy methods [16,17]. Multiscale cross-entropy methods have recently been used in different research fields, including medicine [18–21], finance [6,9], civil engineering [22], and the environment [23].

Cross-entropy methods and their multiscale approaches are used to obtain information on the possible relationship between two time series. For example, Wei et al. applied percussion entropy to the amplitudes of digital volume pulse signals and the changes in R-R intervals of successive cardiac cycles to assess baroreflex sensitivity [18]. The results showed that the method is able to identify markers of diabetes through the nonlinear coupling behavior of the two cardiovascular time series. Moreover, Zhu and Song computed cross-fuzzy entropy on vibration time series to assess the bearing performance degradation process of a motor [13]. The results showed that the method detects the trend of the bearing degradation process over the whole lifetime. In addition, Wang et al. applied multiscale cross-trend sample entropy to analyze the asynchrony between air quality impact factors (fine particulate matter, nitrogen dioxide, etc.) and the air quality index (AQI) in different regions of China [23]. The results showed that the degree of synchrony between fine particulate matter and the AQI is higher than for the other air quality impact factors, which reveals that fine particulate matter has become the main source of air pollution in China.

Our paper presents this state of the art in three sections: first, cross-entropy methods are introduced; second, different multiscale procedures are detailed; third, a multiscale cross-entropy generalization is presented, together with specific multiscale cross-entropy algorithms.

#### **2. Cross-Entropy Methods**

In this section, we classify cross-entropy methods according to their entropy measures: Cross-approximate entropy, cross-sample entropy, and cross-distribution entropy. Other methods that use different cross-entropy-based measures are also detailed. Table 1 shows the twelve measures that are detailed in this section.

**Table 1.** Cross-entropy measures, in chronological order, that are presented in this review. Authors, year, reference, and section location are indicated for each item.


#### *2.1. Cross-Approximate Entropy-Based Measures*

#### 2.1.1. Cross-Approximate Entropy

Cross-approximate entropy (cross-ApEn), introduced by Pincus and Singer [1], quantifies the asynchrony between two time series. For two vectors **u** and **v** of length *N*, cross-ApEn is computed as:

$$\text{cross-ApEn}(m, r, N)(\mathbf{v}\|\mathbf{u}) = \Phi^m(r)(\mathbf{v}\|\mathbf{u}) - \Phi^{m+1}(r)(\mathbf{v}\|\mathbf{u}),\tag{1}$$

where $\Phi^m(r)(\mathbf{v}\|\mathbf{u}) = \frac{1}{N-m+1}\sum_{i=1}^{N-m+1} \log C_i^m(r)(\mathbf{v}\|\mathbf{u})$, and $C_i^m(r)(\mathbf{v}\|\mathbf{u})$ is the number of sequences of $m$ consecutive points of $\mathbf{u}$ that are approximately (within a resolution $r$) the same as sequences of the same length of $\mathbf{v}$. One major drawback of this approach is that $C_i^m(r)(\mathbf{v}\|\mathbf{u})$ must not be equal to zero; this is why cross-ApEn is not well suited to short time series. Furthermore, it is direction-dependent because $\Phi^m(r)(\mathbf{v}\|\mathbf{u})$ is generally not equal to its direction conjugate $\Phi^m(r)(\mathbf{u}\|\mathbf{v})$ [2]. The value of cross-ApEn computed from two signals can be interpreted as a degree of synchrony or mutual relationship.
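For illustration, a minimal Python/NumPy sketch of Equation (1) is given below; the function name, the defaults, and the use of the Chebyshev (maximum) distance are choices made for this sketch, not a reproduction of the authors' code:

```python
import numpy as np

def cross_apen(u, v, m=2, r=0.2):
    """Cross-ApEn sketch (Eq. (1)); u and v are equal-length 1-D arrays
    and r is an absolute tolerance."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    N = len(u)

    def phi(m):
        xu = np.array([u[i:i + m] for i in range(N - m + 1)])  # templates of u
        xv = np.array([v[i:i + m] for i in range(N - m + 1)])  # templates of v
        # C_i^m(r)(v||u): fraction of u-templates within r of the i-th v-template.
        c = [(np.max(np.abs(xu - xv[i]), axis=1) <= r).mean()
             for i in range(len(xv))]
        # A zero count makes the log undefined: the short-series drawback above.
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)
```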

#### 2.1.2. Binarized Cross-Approximate Entropy

Binarized cross-approximate entropy (XBinEn), introduced by Škorić et al. [5] in 2017, is an evolution of cross-ApEn to quantify the similarity between two time series. It has the advantage of being faster than cross-ApEn. XBinEn encodes the time series and divides them into vectors of length *m*. For two vectors **u** and **v** of length *N*, the XBinEn algorithm follows these six steps:

1. Binary encoding series are obtained as:

$$x_i = \begin{cases} 0 & \text{if } u_{i+1} - u_i \leq 0 \\ 1 & \text{if } u_{i+1} - u_i > 0 \end{cases}, \quad y_i = \begin{cases} 0 & \text{if } v_{i+1} - v_i \leq 0 \\ 1 & \text{if } v_{i+1} - v_i > 0 \end{cases}, \tag{2}$$

where $i = 1, 2, \ldots, N-1$, $x_i \in \mathbf{X}_m^{(i)} = [x_i, x_{i+t}, \ldots, x_{i+(m-1)t}]$, and $y_i \in \mathbf{Y}_m^{(i)} = [y_i, y_{i+t}, \ldots, y_{i+(m-1)t}]$. The time lag $t$ allows a vector decorrelation to be performed;

2. Vector histograms $N_X^{(m)}(k)$ and $N_Y^{(m)}(n)$ are computed as:

$$N\_X^{(m)}(k) = \sum\_{i=1}^{N-(m-1)t} I\{\sum\_{l=0}^{m-1} x\_{i+l \cdot t} \times 2^l = k\}, \quad N\_Y^{(m)}(n) = \sum\_{j=1}^{N-(m-1)t} I\{\sum\_{l=0}^{m-1} y\_{j+l \cdot t} \times 2^l = n\}, \tag{3}$$

where $k, n = 0, 1, \ldots, 2^m - 1$, and $I\{\cdot\}$ is a function equal to 1 if the indicated condition is fulfilled and 0 otherwise;

3. The probability mass functions are obtained as:

$$P\_X^{(m)}(k) = \frac{N\_X^{(m)}(k)}{N - (m - 1)t}, \quad P\_Y^{(m)}(n) = \frac{N\_Y^{(m)}(n)}{N - (m - 1)t},\tag{4}$$

where $k, n = 0, 1, \ldots, 2^m - 1$;

4. A distance measure is applied:

$$d(\mathbf{X}\_m^{(i)}, \mathbf{Y}\_m^{(j)}) = \sum\_{k=0}^{m-1} I\{\mathbf{x}\_{i+k \cdot t} \neq \mathbf{y}\_{j+k \cdot t}\},\tag{5}$$

where $i, j = 1, \ldots, N - (m-1)t$;

5. The probability $p_k^m(r)$ that a vector is within the distance $r$ from a particular vector is estimated:

$$p_k^m(r) = \Pr\{d(\mathbf{X}_m^{(k)}, \mathbf{Y}_m) \leq r\};\tag{6}$$

6. XBinEn is finally obtained as:

$$\text{XBinEn}(m, r, N, t) = \Phi^{(m)}(r, N, t) - \Phi^{(m+1)}(r, N, t), \tag{7}$$

where $\Phi^{(m)}(r, N, t) = \sum_{k=0}^{2^m-1} P_X^{(m)}(k) \cdot \ln p_k^m(r)$.

This method gives almost the same results as cross-ApEn for time series that are not short, while being computationally more efficient. Its main disadvantage is that it cannot identify small signal changes. XBinEn is adapted to environments where processor resources and energy are limited, but it is not a substitute for cross-ApEn [5]; it is proposed for cases where the cross-ApEn procedure cannot be applied. The value of XBinEn computed from two signals can be interpreted as a degree of relationship between a related pair of time series.
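The six steps above can be condensed by noting that the Hamming distance of Equation (5) depends only on the binary pattern codes, so the probability of Equation (6) can be obtained from the histograms of Equation (4). The following Python/NumPy sketch uses this reading; names and defaults are illustrative, and it is not the authors' implementation:

```python
import numpy as np

def xbinen(u, v, m=2, r=0, t=1):
    """XBinEn sketch (Eqs. (2)-(7)): binarize increments, histogram the
    2^m binary patterns, compare pattern classes by Hamming distance."""
    x = (np.diff(np.asarray(u, float)) > 0).astype(int)  # Eq. (2)
    y = (np.diff(np.asarray(v, float)) > 0).astype(int)

    def pattern_probs(s, m):
        n = len(s) - (m - 1) * t
        codes = sum(s[l * t:l * t + n] << l for l in range(m))  # Eq. (3)
        return np.bincount(codes, minlength=2 ** m) / n         # Eq. (4)

    def phi(m):
        px, py = pattern_probs(x, m), pattern_probs(y, m)
        # Hamming distance between two pattern codes = popcount of their XOR.
        ham = np.array([[bin(k ^ n).count('1') for n in range(2 ** m)]
                        for k in range(2 ** m)])
        pk = (py * (ham <= r)).sum(axis=1)                      # Eq. (6)
        mask = (px > 0) & (pk > 0)
        return np.sum(px[mask] * np.log(pk[mask]))              # Phi of Eq. (7)

    return phi(m) - phi(m + 1)
```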

#### *2.2. Cross-Sample Entropy-Based Measures*

#### 2.2.1. Cross-Sample Entropy

Cross-sample entropy (cross-SampEn) quantifies the degree of asynchronism of two time series. This method was introduced by Richman and Moorman in 2000 to overcome the limitations of cross-ApEn (see Section 2.1.1) [2]. Cross-SampEn is a conditional probability measure: it quantifies the probability that a sequence of *m* consecutive points (called a sample) of a time series **u** that matches another sequence of the same length of another time series **v** will still match that sequence when their length is increased by one sample (*m* + 1). For two vectors **u** and **v**, cross-SampEn is computed as:

$$\text{cross-SampEn}(m, r, N)(\mathbf{v}\|\mathbf{u}) = -\ln \frac{A^m(r)(\mathbf{v}\|\mathbf{u})}{B^m(r)(\mathbf{v}\|\mathbf{u})},\tag{8}$$

where $m$ is the sample length, $N$ is the length of the vectors $\mathbf{u}$ and $\mathbf{v}$, and $A^m(r)(\mathbf{v}\|\mathbf{u})$ and $B^m(r)(\mathbf{v}\|\mathbf{u})$ are, respectively, the probabilities that a sequence of $\mathbf{u}$ and a sequence of $\mathbf{v}$ match for $m+1$ and $m$ points (within a tolerance $r$).

For two time series **u** and **v** of length *N*, cross-SampEn can also be described as:

$$\text{cross-SampEn}(\mathbf{u}, \mathbf{v}, m, r, N) = -\ln \frac{n^{(m+1)}}{n^{(m)}},\tag{9}$$

where $n^{(m)}$ represents the total number of sequences of $m$ consecutive points of $\mathbf{u}$ that match other sequences of $m$ consecutive points of $\mathbf{v}$.

The main difference between cross-ApEn and cross-SampEn is that cross-SampEn shows relative consistency whereas cross-ApEn does not. Unlike cross-ApEn, cross-SampEn is not direction-dependent. However, cross-SampEn sometimes generates undefined values for short time series. The value of cross-SampEn computed from two time series can be interpreted as a measure of the similarity of the two time series.
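A minimal Python/NumPy sketch of Equations (8) and (9) follows; names and defaults are illustrative, and the Chebyshev distance is assumed for template matching:

```python
import numpy as np

def cross_sampen(u, v, m=2, r=0.2):
    """Cross-SampEn sketch: -ln(A/B), with A and B the numbers of
    template matches of length m+1 and m between u and v."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    N = len(u)

    def matches(m):
        xu = np.array([u[i:i + m] for i in range(N - m)])
        xv = np.array([v[i:i + m] for i in range(N - m)])
        # Count the (i, j) pairs whose Chebyshev distance is within r.
        return sum(int((np.max(np.abs(xu - xv[j]), axis=1) <= r).sum())
                   for j in range(len(xv)))

    B, A = matches(m), matches(m + 1)
    # Undefined when no matches occur -- the short-series issue noted above.
    return -np.log(A / B) if A > 0 and B > 0 else np.inf
```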

#### 2.2.2. Modified Cross-Sample Entropy

Modified cross-sample entropy (mCSE), introduced by Yin and Shang in 2015, was developed to detect the asynchrony of financial time series [4]. Inspired by the generalized sample entropy proposed by Silva and Murta, Jr. [25], the authors adapted this method to cross-SampEn. The method combines cross-SampEn and nonadditive statistics. For two vectors **u** and **v** of length *N*, mCSE is computed as:

$$\text{mCSE}(m, r, N) = -\log\_q \frac{\sum\_{i=1}^{N-m} n\_i^{(m+1)}}{\sum\_{i=1}^{N-m} n\_i^{(m)}},\tag{10}$$

where $m$ is the sample length, $q$ is the entropic index, and $n_i^{(m)}$ is the number of times that the distance between the vectors $\mathbf{y}_m = \{v(i), v(i+1), \ldots, v(i+m-1) : 1 \leq i \leq N-m+1\}$ and $\mathbf{x}_m = \{u(i), u(i+1), \ldots, u(i+m-1) : 1 \leq i \leq N-m+1\}$ is less than or equal to the tolerance $r$. The distance is calculated as $d(\mathbf{x}_m(i), \mathbf{y}_m(j)) = \max\{|u(i+k) - v(j+k)| : 0 \leq k \leq m-1\}$.

The value of mCSE computed from two time series can be interpreted as a degree of synchrony between the two time series and it can illustrate some intrinsic relations between the two time series.

#### 2.2.3. Modified Cross-Sample Entropy Based on Symbolic Representation and Similarity

Modified cross-sample entropy based on symbolic representation and similarity (MCSEBSS), introduced by Wu et al. in 2018, was developed to quantify the degree of asynchrony of two financial time series with various trends (stock markets from different areas) [6]. In comparison with cross-SampEn, this method reduces the probability of obtaining undefined entropy values and is more robust to noise. For two vectors **u** and **v** of length *N*, MCSEBSS is computed as:

$$\text{MCSEBSS}(\mathbf{u}, \mathbf{v}, m, r, N) = -\ln \frac{n^{(m+1)}}{n^{(m)}},\tag{11}$$

where $m$ is the sample length and $n^{(m)}$ is the number of template matches obtained by comparing $s(\mathbf{u}_m(i), \mathbf{v}_m(j))$ with $r$. For $\mathbf{u}_m = \{u(i+k)\}$ and $\mathbf{v}_m = \{v(j+k)\}$ ($0 \leq k \leq m-1$ and $1 \leq i \leq N-m$), the similarity function $s(\mathbf{u}_m(i), \mathbf{v}_m(j))$ is calculated as:

$$s(\mathbf{u}_m(i), \mathbf{v}_m(j)) = \frac{\#\text{ of 1s in } count(i, j)}{m}, \qquad 1 \leq i, j \leq N - m,\tag{12}$$

where *count*(*i*, *j*) is obtained by the function *f* defined as:

$$f = \begin{cases} 1 & \text{if } u_m(i+k) = v_m(j+k) \\ 0 & \text{if } u_m(i+k) \neq v_m(j+k) \end{cases}, \quad 0 \leq k \leq m-1. \tag{13}$$

The parameter $r$ must be fixed between $\frac{m-n}{m+1}$ and $\frac{m-n}{m}$, where $n$ is the maximum number of zeros obtained with $count(i, j)$ for $\mathbf{u}$ and $\mathbf{v}$ to be considered similar.

The value of MCSEBSS computed from two time series can be interpreted as a degree of asynchrony of the two time series. A low cross-entropy value indicates a strong synchrony between two signals.

#### 2.2.4. Kronecker-Delta-Based Cross-Sample Entropy

The Kronecker-delta-based cross-sample entropy (KCSE), introduced by He et al. in 2018, was developed to quantify the dissimilarity between two time series [7]. KCSE is based on the Kronecker delta function $\delta_{x,y}$, which returns 1 if its two variables are equal and 0 otherwise. For two vectors **u** and **v** of length *N*, KCSE is calculated as:

$$\text{KCSE}(m) = -\ln \frac{B^{m+1}}{B^m} \,\text{,}\tag{14}$$

where $B^m = \frac{\sum_{i=1}^{N-m+1} \text{KrD}_{\mathbf{u}_m(i),\mathbf{v}_m(i)}}{N-m+1}$ and $B^{m+1} = \frac{\sum_{i=1}^{N-m} \text{KrD}_{\mathbf{u}_{m+1}(i),\mathbf{v}_{m+1}(i)}}{N-m}$. The dissimilarity between $\mathbf{u}_m(i) = [u(i), u(i+1), \ldots, u(i+m-1)]$ and $\mathbf{v}_m(i) = [v(i), v(i+1), \ldots, v(i+m-1)]$ is calculated as:

$$\text{KrD}_{\mathbf{u}_m(i),\mathbf{v}_m(i)} = \frac{\delta_{u(i),v(i)} + \delta_{u(i+1),v(i+1)} + \dots + \delta_{u(i+m-1),v(i+m-1)}}{m}.\tag{15}$$

The authors show that KCSE classifies financial data better than multidimensional scaling based on the Chebyshev distance [7]. The value of KCSE computed from two time series can be interpreted as a degree of irregularity between the two time series.
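A short Python/NumPy sketch of Equations (14) and (15) is given below. Since the Kronecker delta requires exact equality, the inputs are assumed to be symbolized or discretized beforehand; all names are illustrative:

```python
import numpy as np

def kcse(u, v, m=2):
    """KCSE sketch: B^m averages, over all aligned template pairs, the
    fraction of position-wise equal samples (the KrD of Eq. (15))."""
    u, v = np.asarray(u), np.asarray(v)
    N = len(u)

    def B(m):
        return np.mean([(u[i:i + m] == v[i:i + m]).mean()   # KrD per pair
                        for i in range(N - m + 1)])

    bm, bm1 = B(m), B(m + 1)
    return -np.log(bm1 / bm) if bm > 0 and bm1 > 0 else np.inf  # Eq. (14)
```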

#### 2.2.5. Permutation-Based Cross-Sample Entropy

The permutation-based cross-sample entropy (PCSE), introduced by He et al. in 2018, is quite similar to KCSE (see Section 2.2.4) [7]; only a permutation step is added. For two vectors **u** and **v** of length *N*, PCSE is calculated as:

$$\text{PCSE}(m) = -\ln \frac{B^{m+1}}{B^m} \,\text{,}\tag{16}$$

where $B^m = \frac{\sum_{i=1}^{N-m+1} \text{KrD}_{\text{permuX}_m(i),\text{permuY}_m(i)}}{N-m+1}$ and $B^{m+1} = \frac{\sum_{i=1}^{N-m} \text{KrD}_{\text{permuX}_{m+1}(i),\text{permuY}_{m+1}(i)}}{N-m}$. The KrD function is defined in Section 2.2.4. The two vectors $\text{permuX}_m(i)$ and $\text{permuY}_m(i)$ are obtained with the permutation algorithm defined for permutation entropy [26]. Video S1 shows an example of this permutation algorithm.

PCSE shows better results than KCSE for synthetic data (ARFIMA model) [7]. However, the two approaches give the same results for financial data [7]. The value of PCSE computed from two time series can be interpreted as a degree of irregularity between the two time series.

#### 2.2.6. Cross-Trend Sample Entropy

Inspired by MCSEBSS (see Section 2.2.3), Wang et al. developed the cross-trend sample entropy (CTSE) to quantify the synchronism between two time series with strong trends [23]. For two time series **u** and **v** of length *N*, CTSE is calculated with the following four-step algorithm:

1. The two time series are symbolized as:

$$U(j) = \begin{cases} 1 & \text{if } \tilde{u}(j) > u(j) \\ 0 & \text{otherwise} \end{cases}, \qquad V(j) = \begin{cases} 1 & \text{if } \tilde{v}(j) > v(j) \\ 0 & \text{otherwise} \end{cases}, \qquad 1 \leq j \leq N,\tag{17}$$

where $\tilde{\mathbf{u}}$ and $\tilde{\mathbf{v}}$ are, respectively, the trends of $\mathbf{u}$ and $\mathbf{v}$ obtained by polynomial fitting (linear, quadratic, or higher order).

2. The template vectors **u***m* and **v***m* are constructed as:

$$\mathbf{u}_m(i) = \{U(i+k)\}, \qquad \mathbf{v}_m(i) = \{V(i+k)\}, \tag{18}$$

where $0 \leq k \leq m-1$ and $1 \leq i \leq N-m$.

3. The similarity between $\mathbf{u}_m(i)$ and $\mathbf{v}_m(i)$ is calculated as:

$$d(\mathbf{u}_m(i), \mathbf{v}_m(i)) = \frac{\#\text{ of 1s in } C_m(i)}{m}, \qquad 1 \leq i \leq N - m,\tag{19}$$

where the $i$-th symbol vector $C_m(i)$ is determined with $f$, a symbolic function between the two template vectors $\mathbf{u}_m$ and $\mathbf{v}_m$:

$$f = \begin{cases} 1 & \text{if } u_m(i+k) = v_m(i+k) \\ 0 & \text{otherwise} \end{cases}, \qquad 0 \leq k \leq m-1. \tag{20}$$

4. CTSE is finally computed as:

$$\text{CTSE}(\mathbf{u}, \mathbf{v}, m, r, N) = -\ln \frac{n^{(m+1)}}{n^{(m)}},\tag{21}$$

where $n^{(m)}$ is obtained by comparing $d(\mathbf{u}_m(i), \mathbf{v}_m(i))$ with the tolerance $r$ for $1 \leq i \leq N - m$.

CTSE has two advantages over MCSEBSS: it is more sensitive to differences in the dynamical characteristics of two signals, and it works well with signals with trends (linear, quadratic, cubic, and sinusoidal) [23]. The value of CTSE computed from two time series can be interpreted as an indicator of the dynamical structure of the two time series with potential trends.

#### *2.3. Cross-Distribution Entropy-Based Measures*

#### 2.3.1. Cross-Distribution Entropy

In 2018, Wang and Shang introduced the cross-distribution entropy (cross-DistEn) to quantify the complexity between two cross-sequences [9]. To generalize the standard statistical mechanics, the authors replaced the standard distribution entropy (DistEn), based on Shannon entropy, by a DistEn based on Tsallis entropy [9]. The authors showed that cross-DistEn illustrates the relationships between two vectors better than cross-SampEn does [9]. For two time series **u** and **v** of length *N*, cross-DistEn follows these four steps:

1. Template vectors of $m$ consecutive points are constructed from $\mathbf{u}$ and $\mathbf{v}$;
2. The distance matrix between all pairs of template vectors is computed as:

$$d_{i,j} = \max\{|u_{i+k} - v_{j+k}|, \ 0 \leq k \leq m-1\};\tag{22}$$

3. The empirical probability distribution $\{P_t, t = 1, \ldots, M\}$ of the distances $d_{i,j}$ is estimated with a histogram of $M$ bins;
4. Cross-DistEn is finally computed as:

$$\text{cross-DistEn}(\mathbf{u}, \mathbf{v}) = \frac{1}{\ln a} \cdot \frac{1}{q-1}\left(1 - \sum_{t=1}^{M} P_t^q\right),\tag{23}$$

where $q$ is the order of the Tsallis entropy and $a$ is the logarithm base of the entropy computation.

The main advantage of cross-DistEn is that it is suitable for short time series. With financial data, cross-DistEn illustrates the relationship between signals better than cross-SampEn [9]. The value of cross-DistEn computed from two time series can be interpreted as a degree of linkage between the two time series.
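The four steps translate into the short Python/NumPy sketch below; the uniform binning of the distances is an assumption of this sketch (the original binning details are not reproduced here), as are the names and defaults:

```python
import numpy as np

def cross_disten(u, v, m=2, M=512, q=2, a=np.e):
    """Cross-DistEn sketch (Eqs. (22)-(23)): Tsallis entropy (order q != 1,
    logarithm base a) of the histogram (M bins) of all template distances."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    N = len(u)
    xu = np.array([u[i:i + m] for i in range(N - m + 1)])
    xv = np.array([v[j:j + m] for j in range(N - m + 1)])
    # All pairwise Chebyshev distances d_{i,j} of Eq. (22).
    d = np.max(np.abs(xu[:, None, :] - xv[None, :, :]), axis=2).ravel()
    p, _ = np.histogram(d, bins=M)
    p = p / p.sum()                                     # empirical {P_t}
    return (1.0 / np.log(a)) * (1.0 - np.sum(p ** q)) / (q - 1)  # Eq. (23)
```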

#### 2.3.2. Permutation Cross-Distribution Entropy

The permutation cross-distribution entropy (PCDE), introduced by He et al. in 2019, is a variant of cross-DistEn (see Section 2.3.1) [10]. The permutation makes it possible to characterize fluctuations and prevents spatial distances from impacting the results. The PCDE algorithm is the same as that of cross-DistEn, detailed in Section 2.3.1, except that an additional step is added before step 2 to permute $\mathbf{X}(i)$ and $\mathbf{Y}(j)$ with the permutation algorithm mentioned in Section 2.2.5. The distance matrix is therefore constructed with the permuted vectors. The value of PCDE computed from two time series can be interpreted as a degree of dissimilarity between the two time series.

#### *2.4. Other Cross-Entropy-Based Measures*

#### 2.4.1. Cross-Conditional Entropy

Cross-conditional entropy (CCE), introduced by Porta et al. in 1999, quantifies the degree of coupling between two signals [8]. A corrected conditional entropy had been introduced to improve the approximate entropy, which suffers from limitations when a finite number of samples is considered [27]. CCE is an adaptation of this corrected conditional entropy. For two signals $\mathbf{u} = \{u(i), i = 1, \ldots, N\}$ and $\mathbf{v} = \{v(i), i = 1, \ldots, N\}$, CCE is computed as:

$$\text{CCE}\_{\mathbf{v}/\mathbf{u}}(L) = -\sum\_{L-1} p(\mathbf{u}\_{L-1}) \sum\_{i/L-1} p(\upsilon(i)/\mathbf{u}\_{L-1}) \times \log p(\upsilon(i)/\mathbf{u}\_{L-1}),\tag{24}$$

where $L$ is the length of the patterns extracted for comparison, $p(\mathbf{u}_{L-1})$ is the joint probability of the pattern $\mathbf{u}_{L-1}(i)$, with patterns built recursively as $\mathbf{u}_L(i) = (u(i), \mathbf{u}_{L-1}(i-1))$, and $p(v(i)/\mathbf{u}_{L-1})$ is the conditional probability of the sample $v(i)$ given the pattern $\mathbf{u}_{L-1}(i)$. If a mixed pattern composed of one sample of $\mathbf{v}$ and $L-1$ samples of $\mathbf{u}$, $(v(i), u(i), \ldots, u(i-L+2)) = (v(i), \mathbf{u}_{L-1})$, is defined, then with the Shannon entropy $E(\mathbf{u}_L) = -\sum_L p(\mathbf{u}_L)\log p(\mathbf{u}_L)$, CCE can also be written as:

$$\text{CCE}\_{\mathbf{v}/\mathbf{u}}(L) = E(\boldsymbol{\upsilon}(i), \mathbf{u}\_{L-1}) - E(\mathbf{u}\_{L-1}).\tag{25}$$

For a limited number of samples, the estimate of CCE always decreases to zero as $L$ increases. To solve this problem, a correction has been introduced:

$$\text{CCE}\_{\mathbf{v}/\mathfrak{u}}(L) = \widehat{\text{CCE}\_{\mathbf{v}/\mathfrak{u}}}(L) + \text{perc}\_{\mathbf{v}/\mathfrak{u}}(L) \times \widehat{E}(\mathbf{v}),\tag{26}$$

where $\text{perc}_{\mathbf{v}/\mathbf{u}}(L)$ is the ratio of mixed patterns found only once to the total number of mixed patterns, and $\widehat{\text{CCE}}_{\mathbf{v}/\mathbf{u}}(L)$ and $\widehat{E}(\mathbf{v})$ are, respectively, the estimates of $\text{CCE}_{\mathbf{v}/\mathbf{u}}(L)$ and $E(\mathbf{v})$ based on the limited dataset considered.

CCE can be seen as a measure of the unpredictability of one signal when the second one is observed, because it quantifies the amount of information carried by one signal that cannot be derived from the other. It is not fully a measure of synchronization. The main disadvantage of CCE is that it is not well suited to short time series.

#### 2.4.2. Cross-Fuzzy Entropy

Cross-fuzzy entropy (C-FuzzyEn), introduced by Xie et al. in 2010 [3], is an adaptation of the fuzzy entropy introduced by Chen et al. [28]; it quantifies the synchrony or similarity of patterns between two signals [3]. C-FuzzyEn is an improvement of cross-SampEn that is better suited to short time series and more robust to noise. For two time series **u** and **v** of length *N*, C-FuzzyEn is obtained with the following three-step algorithm:

1. The distance $d_{ij}^m$ between $\mathbf{X}_i^m$ and $\mathbf{Y}_j^m$ is computed as:

$$d_{ij}^{m} = d[\mathbf{X}_i^m, \mathbf{Y}_j^m] = \max_{k \in \{0, \ldots, m-1\}} |(u(i+k) - \bar{u}(i)) - (v(j+k) - \bar{v}(j))|,\tag{27}$$

where $m$ is the number of consecutive data points to compare, $\mathbf{X}_i^m = \{u(i), u(i+1), \ldots, u(i+m-1)\} - \bar{u}(i)$, and $\mathbf{Y}_j^m = \{v(j), v(j+1), \ldots, v(j+m-1)\} - \bar{v}(j)$. The baselines $\bar{u}(i)$ and $\bar{v}(j)$ are calculated as $\bar{u}(i) = \frac{1}{m}\sum_{l=0}^{m-1} u(i+l)$ and $\bar{v}(j) = \frac{1}{m}\sum_{l=0}^{m-1} v(j+l)$;

2. The synchrony or similarity degree $D_{ij}^m$ is computed as $D_{ij}^m = \mu(d_{ij}^m, n, r)$, where $\mu(d_{ij}^m, n, r)$ is the fuzzy function obtained as:

$$\mu(d_{ij}^{m}, n, r) = \exp\left(-\frac{(d_{ij}^{m})^{n}}{r}\right),\tag{28}$$

where *r* and *n* determine the width and the gradient of the boundary of the exponential function, respectively;

3. Finally, C-FuzzyEn is computed as:

$$\text{C-FuzzyEn}(m, n, r) = \ln \Phi^m - \ln \Phi^{m+1},\tag{29}$$

where $\Phi^m = \frac{1}{N-m} \sum_{i=1}^{N-m} \left(\frac{1}{N-m} \sum_{j=1}^{N-m} D_{ij}^m\right)$ and $\Phi^{m+1} = \frac{1}{N-m} \sum_{i=1}^{N-m} \left(\frac{1}{N-m} \sum_{j=1}^{N-m} D_{ij}^{m+1}\right)$.

The value of C-FuzzyEn computed from two time series can be interpreted as the synchronicity of patterns.
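A compact Python/NumPy sketch of the three steps is given below (names and defaults are illustrative):

```python
import numpy as np

def cross_fuzzyen(u, v, m=2, n=2, r=0.2):
    """C-FuzzyEn sketch (Eqs. (27)-(29)): baseline-removed templates,
    exponential fuzzy membership, then ln(Phi^m) - ln(Phi^{m+1})."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    N = len(u)

    def phi(m):
        xu = np.array([u[i:i + m] - u[i:i + m].mean() for i in range(N - m)])
        xv = np.array([v[j:j + m] - v[j:j + m].mean() for j in range(N - m)])
        d = np.max(np.abs(xu[:, None, :] - xv[None, :, :]), axis=2)  # Eq. (27)
        return np.mean(np.exp(-(d ** n) / r))   # mean of D_ij, Eqs. (28)-(29)

    return np.log(phi(m)) - np.log(phi(m + 1))
```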

#### 2.4.3. Joint-Permutation Entropy

Joint permutation entropy (JPE), introduced by Yin et al. in 2019, quantifies the synchronism between two time series. It is based on permutation entropy, which compares the neighboring values of each point and maps them to ordinal patterns to quantify the complexity of a signal [26]. For two signals $\mathbf{u}$ and $\mathbf{v}$, JPE is computed as the Shannon entropy of the $d! \times d!$ distinct motif combinations $\{(\pi_i^{d,t}, \pi_j^{d,t})\}$:

$$\text{JPE}(d, t) = -\sum\_{i, j \colon (\pi\_i^{d, t}, \pi\_j^{d, t})} p(\pi\_i^{d, t}, \pi\_j^{d, t}) \cdot \ln p(\pi\_i^{d, t}, \pi\_j^{d, t}), \tag{30}$$

where $d$ is the embedding dimension and $p(\pi_i^{d,t}, \pi_j^{d,t})$ is the joint probability of $(\pi_i^{d,t}, \pi_j^{d,t})$ appearing in $\mathbf{X}_l^{d,t} = \{u_l, u_{l+t}, \ldots, u_{l+(d-1)t}\}$ and $\mathbf{Y}_l^{d,t} = \{v_l, v_{l+t}, \ldots, v_{l+(d-1)t}\}$; it is defined as:

$$p(\pi_i^{d,t}, \pi_j^{d,t}) = \frac{\|\{l : l \leq T, \ \text{type}(\mathbf{X}_l^{d,t}, \mathbf{Y}_l^{d,t}) = (\pi_i^{d,t}, \pi_j^{d,t})\}\|}{T},\tag{31}$$

where $T = N - (d-1)t$, $\text{type}(\cdot)$ is the map from pattern space to symbol space, and $\|\cdot\|$ denotes the cardinality of a set.

The main advantages of JPE are its simplicity, robustness, and low computational cost. The value of JPE computed from two time series can be interpreted as a degree of correlation between the two time series [29].
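For illustration, a minimal Python sketch of Equations (30) and (31) follows, with `np.argsort` standing in for the type(·) map; names are illustrative:

```python
import numpy as np
from collections import Counter

def jpe(u, v, d=3, t=1):
    """JPE sketch: Shannon entropy of the joint distribution of the
    ordinal patterns of u and v (Eqs. (30)-(31))."""
    u, v = np.asarray(u, float), np.asarray(v, float)
    T = len(u) - (d - 1) * t

    def motif(s, l):
        # Ordinal pattern of (s_l, s_{l+t}, ..., s_{l+(d-1)t}).
        return tuple(np.argsort(s[l:l + (d - 1) * t + 1:t]))

    joint = Counter((motif(u, l), motif(v, l)) for l in range(T))
    p = np.array(list(joint.values()), float) / T       # Eq. (31)
    return -np.sum(p * np.log(p))                       # Eq. (30)
```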

#### **3. Multiscale Procedures**

To study entropy or cross-entropy measures of time series across scales, a multiscale procedure can be used. In this part we detail, in chronological order, three multiscale methods: the coarse-graining, the time-shift, and the composite coarse-graining approaches.

#### *3.1. Coarse-Graining Procedure*

In 2002, Costa et al. introduced the coarse-graining procedure to analyze complexity, defined through the analysis of irregularity across scale factors [14]. This method is an improvement, better adapted to biological time series, of the coarse-graining procedure introduced by Zhang [30]. The procedure has been used in multiscale entropy and cross-entropy methods [6,9,15,20,31–33]. For each scale factor, it derives a set of vectors illustrating the system dynamics. For a monovariate discrete signal $\mathbf{x}$ of length $N$, the coarse-grained time series $\mathbf{y}^{(\tau)}$ is calculated as:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+1}^{j\tau} x_i, \tag{32}$$

where $\tau$ is the scale factor and $1 \leq j \leq \frac{N}{\tau}$. The length of the coarse-grained time series is $\lfloor \frac{N}{\tau} \rfloor$. An example of the coarse-graining procedure is presented in Figure 1A.
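Equation (32) amounts to averaging over non-overlapping windows of length $\tau$, as in this minimal Python/NumPy sketch:

```python
import numpy as np

def coarse_grain(x, tau):
    """Coarse-graining of Eq. (32): average x over non-overlapping
    windows of length tau; floor(N / tau) points remain."""
    x = np.asarray(x, float)
    n = len(x) // tau
    return x[:n * tau].reshape(n, tau).mean(axis=1)
```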

**Figure 1.** Examples of multiscale procedures for the first ten points of a time series **x**. (**A**) represents the coarse-graining procedure (modified from [34]), (**B**) shows the time-shift procedure, and (**C**) illustrates the composite coarse-graining procedure (modified from [35]).

#### *3.2. Time-Shift Procedure*

As with the coarse-graining procedure, the time-shift procedure is used to decompose a signal across scale factors and to perform a multiscale analysis. While the coarse-graining procedure averages the time series over successive intervals, the time-shift procedure applies time shifts to the time series. The main disadvantage of the coarse-graining procedure is the loss of pattern information hidden in the time series. To overcome this limitation, Pham used Higuchi's fractal dimension (HFD) [36] and proposed a new multiscale analysis [37]. The time-shift procedure illustrates the fractal dimension of a signal. This method has recently been used with entropy and cross-entropy measures [17,37–39]. HFD shows stable numerical results for stationary, non-stationary, deterministic, and stochastic time series [40]. For a monovariate discrete signal $\mathbf{x}$ of length $N$, the $\beta$-th time-shifted signal $\mathbf{y}_\beta^{(\tau)}$ is calculated as:

$$\mathbf{y}_\beta^{(\tau)} = (x_\beta, x_{\beta+\tau}, \ldots, x_{\beta+\lfloor \frac{N-\beta}{\tau} \rfloor \tau}). \tag{33}$$

For each time scale $\tau$, $\tau$ time-shifted time series are computed, one for each $\beta = 1, 2, \ldots, \tau$. An illustration of the time-shift procedure is presented in Figure 1B.
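A sketch of Equation (33) in Python (0-based indices, unlike the 1-based $\beta$ above):

```python
import numpy as np

def time_shift(x, tau):
    """Time-shift decomposition of Eq. (33): for scale tau, return the
    tau sub-series (x_beta, x_{beta+tau}, ...), one per offset beta."""
    x = np.asarray(x, float)
    return [x[beta::tau] for beta in range(tau)]
```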

#### *3.3. Composite Coarse-Graining Procedure*

The coarse-graining procedure introduced by Costa et al. [14] increases the variance of the estimated entropy values at large scales. To overcome this limitation, Wu et al. introduced a composite coarse-graining procedure in 2013 [35]. This method has been used with entropy and cross-entropy measures [16,32]. For a monovariate discrete signal $\mathbf{x}$ of length $N$, the $k$-th composite coarse-grained time series $\mathbf{y}_k^{(\tau)}$ is computed as:

$$y_{k,j}^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau+k}^{j\tau+k-1} x_i, \tag{34}$$

where 1 *j* - *N <sup>τ</sup>* . For each time scale *τ*, *k* composite coarse-grained time series are computed (1 *k τ*). An illustration of the composite coarse-graining procedure is presented in Figure 1C.

#### **4. Multiscale Cross-Entropy Methods**

#### *4.1. Generalization*

Multiscale cross-entropy (MCE) methods consist of applying a cross-entropy measure for each scale factor obtained by a specific procedure. For each scale factor *τ*, MCE is computed as:

$$\mathcal{MCE}(\mathbf{X}^{(\tau)}, \mathbf{Y}^{(\tau)}) = \frac{1}{k} \sum\_{\beta=1}^{k} \operatorname{crossEn}(\mathbf{X}\_{\beta}^{(\tau)}, \mathbf{Y}\_{\beta}^{(\tau)}),\tag{35}$$

where $\mathbf{X}^{(\tau)}$ and $\mathbf{Y}^{(\tau)}$ are computed with a multiscale procedure (see Section 3), $k$ is the number of time series generated by the multiscale procedure ($k = 1$ for the coarse-graining procedure and $k = \tau$ for the time-shift and composite coarse-graining procedures), and crossEn is the cross-entropy method used (see Section 2). Table 2 shows the multiscale cross-entropy methods that can be generalized with Equation (35). Before the computation of MCE, a pre-processing step can be performed. For example, the asymmetric multiscale cross-SampEn (AMCSE) method [33] decomposes each signal into two signals, one for the positive trends and one for the negative trends, before applying the coarse-graining procedure and cross-SampEn.
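Equation (35) can be sketched generically in Python as below; `procedure` is expected to return the list of sub-series generated at scale $\tau$ (one for coarse-graining, $\tau$ otherwise), and all names are illustrative:

```python
import numpy as np

def multiscale_cross_entropy(u, v, scales, procedure, cross_en):
    """Generalized MCE of Eq. (35): average a cross-entropy over the k
    sub-series produced by a multiscale procedure at each scale."""
    out = {}
    for tau in scales:
        us, vs = procedure(u, tau), procedure(v, tau)
        out[tau] = np.mean([cross_en(a, b) for a, b in zip(us, vs)])
    return out

# Usage sketch, combining the earlier snippets:
# mce = multiscale_cross_entropy(u, v, range(1, 11),
#                                lambda s, t: [coarse_grain(s, t)],
#                                cross_sampen)
```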


**Table 2.** Multiscale cross-entropy methods, in chronological order, that can be generalized with Equation (35). For each method, the multiscale procedure and the cross-entropy measure used and the reference are mentioned.

#### *4.2. Particular Cases*

Some multiscale cross-entropy methods do not follow the generalization introduced above. In this part we detail three particular methods, in chronological order: the adaptive multiscale cross-SampEn, the refined composite multiscale cross-SampEn, and percussion entropy.

#### 4.2.1. Adaptive Multiscale Cross-Sample Entropy

The adaptive multiscale cross-sample entropy (AMCSE), introduced by Hu and Liang in 2011, assesses the nonlinear interdependency between different visual cortical areas [41]. The method uses the multivariate empirical mode decomposition (MEMD), introduced by Rehman and Mandic [42], to decompose two time series into intrinsic mode functions (IMFs) that represent the oscillation modes embedded in the data. For two time series **u** and **v**, AMCSE is calculated with the following three-step algorithm:


1. The two time series are decomposed into IMFs with MEMD;
2. For each scale factor $\tau$, the fine-to-coarse and coarse-to-fine reconstructions of the IMFs are computed as:

$$S_{f2c}^{\tau} = \sum_{i=\tau}^{N} \text{IMF}_i \quad (\tau \leq N), \tag{36}$$

$$S_{c2f}^{\tau} = \sum_{i=1}^{N+1-\tau} \text{IMF}_i \quad (\tau \leq N), \tag{37}$$

where the IMFs are indexed from fine to coarse and $N$ here denotes their number. The two directions can be used separately or in tandem to reveal the underlying dynamics of complex time series;

3. For each scale factor $\tau$, cross-SampEn (see Section 2.2.1) is applied between the two scales of data ($S_{f2c}^{\tau}$ and $S_{c2f}^{\tau}$) extracted from the vectors **u** and **v**.

#### 4.2.2. Refined Composite Multiscale Cross-Sample Entropy

Yin et al. introduced in 2016 the composite multiscale cross-sample entropy (CMCSE), which follows the generalization (see Section 4.1) with the composite coarse-graining procedure and cross-SampEn [16]. The main disadvantage of this method is that cross-SampEn generates undefined values when the number of matched samples is zero. To overcome this limitation, Yin et al. introduced the refined CMCSE (RCMCSE), which leads to better results with short time series. For two time series **u** and **v** of length *N*, RCMCSE is computed with the following three-step algorithm:


1. For each scale factor $\tau$, the composite coarse-grained time series of **u** and **v** are computed for $k = 1, \ldots, \tau$ (see Section 3.3);
2. For each $k$, the numbers of matched vector pairs $n_{k,\tau}^{m+1}$ and $n_{k,\tau}^{m}$ are computed;
3. RCMCSE is finally computed as:

$$\text{RCMCSE}(\mathbf{u}, \mathbf{v}, \tau, m, r) = -\ln \frac{\sum_{k=1}^{\tau} n_{k,\tau}^{m+1}}{\sum_{k=1}^{\tau} n_{k,\tau}^{m}},\tag{38}$$

where $m$ is the dimension of the matched vector pairs and $r$ is the distance tolerance for the matched vector pairs.

4.2.3. Percussion Entropy

Wu et al. introduced, in 2013, the multiscale small-scale entropy index (MEI$_{SS}$), which is obtained by summing the entropy values over the first five scale factors [43]. Percussion entropy, introduced by Wei et al. in 2019, allows one to quantify a percussion entropy index (PEI) [18]. The method was introduced to assess baroreflex sensitivity. PEI compares the similarity in the tendency of change between two time series. This index has been compared to MEI$_{SS}$. For two time series **u** and **v** of length *N*, PEI is computed with the following three-step algorithm:

1. A binary transformation of $\mathbf{u}$ and $\mathbf{v}$ is used to obtain $\mathbf{x} = \{x_1, x_2, \ldots, x_{N-1}\}$ and $\mathbf{y} = \{y_1, y_2, \ldots, y_{N-1}\}$:

$$x_i = \begin{cases} 0 & u(i+1) \leq u(i) \\ 1 & u(i+1) > u(i) \end{cases}, \quad y_i = \begin{cases} 0 & v(i+1) \leq v(i) \\ 1 & v(i+1) > v(i) \end{cases};\tag{39}$$

2. The percussion rate for each scale factor *τ* is computed as:

$$P\_{\tau}^{m} = \frac{1}{n - m - \tau + 1} \sum\_{i=1}^{n-m-\tau+1} \text{count}(i),\tag{40}$$

where $m$ is the embedding dimension of the vectors and $\text{count}(i)$ represents the number of matches between $\mathbf{A}(i) = \{x_i, x_{i+1}, \ldots, x_{i+m-1}\}$ and $\mathbf{B}(i+\tau) = \{y_{i+\tau}, y_{i+\tau+1}, \ldots, y_{i+\tau+m-1}\}$;

3. PEI is calculated as:

$$\text{PEI}(m, n\_{\tau}) = \phi^m - \phi^{m+1},\tag{41}$$

where $\phi^m = \ln \sum_{\tau=1}^{n_\tau} P_\tau^m$ and $n_\tau$ is the number of scales to consider. Wei et al. [18] chose $n_\tau = 5$ in accordance with MEI$_{SS}$.
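A Python sketch of the three steps above is given below. Equation (40)'s count(i) is read here as a full match between the binarized templates $\mathbf{A}(i)$ and $\mathbf{B}(i+\tau)$; this reading, like all names and defaults below, is an assumption of the sketch:

```python
import numpy as np

def pei(u, v, m=2, n_tau=5):
    """Percussion entropy index sketch (Eqs. (39)-(41))."""
    x = (np.diff(np.asarray(u, float)) > 0).astype(int)  # Eq. (39)
    y = (np.diff(np.asarray(v, float)) > 0).astype(int)
    n = len(x)

    def rate(m, tau):
        L = n - m - tau + 1
        # count(i) read as a full template match (assumption of this sketch).
        hits = sum(np.array_equal(x[i:i + m], y[i + tau:i + tau + m])
                   for i in range(L))
        return hits / L                                  # Eq. (40)

    def phi(m):
        return np.log(sum(rate(m, tau) for tau in range(1, n_tau + 1)))

    return phi(m) - phi(m + 1)                           # Eq. (41)
```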

This algorithm is a generalization of the method developed by Wei et al. [18] for specific time series, namely the amplitudes of successive digital volume pulse signals and the changes in R-R intervals of successive cardiac cycles. To date, it has not been used to process other kinds of signals.

#### **5. Conclusions**

In this review we have presented a state of the art of cross-entropy measures, multiscale procedures, and multiscale cross-entropy methods. Multiscale cross-entropy methods offer interesting perspectives for time series analysis. Furthermore, all the cross-entropy methods detailed in this review can be turned into multiscale cross-entropy methods with the multiscale procedures presented here.

**Supplementary Materials:** The following are available online at http://www.mdpi.com/1099-4300/22/1/45/s1, Video S1: Permutation entropy–An example to obtain permutation vectors.

**Author Contributions:** Investigation, A.J. and A.H.-H.; supervision, A.H.-H.; writing–original draft, A.J.; writing–review and editing, A.H.-H. All authors have read and agreed to the published version of the manuscript.

**Funding:** A CIFRE grant No. 2017/1165 was awarded by the ANRT to the company COTTOS Médical to support the work of graduate student A.J.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
