**Complexity Analysis of Surface Electromyography for Assessing the Myoelectric Manifestation of Muscle Fatigue: A Review**

#### **Susanna Rampichini 1, Taian Martins Vieira 2,3,\*, Paolo Castiglioni <sup>4</sup> and Giampiero Merati 1,4**


Received: 16 March 2020; Accepted: 2 May 2020; Published: 7 May 2020

**Abstract:** The surface electromyography (sEMG) records the electrical activity of muscle fibers during contraction: one of its uses is to assess changes taking place within muscles in the course of a fatiguing contraction to provide insights into our understanding of muscle fatigue in training protocols and rehabilitation medicine. Until recently, these myoelectric manifestations of muscle fatigue (MMF) have been assessed essentially by linear sEMG analyses. However, sEMG shows a complex behavior, due to many concurrent factors. Therefore, in the last years, complexity-based methods have been tentatively applied to the sEMG signal to better individuate the MMF onset during sustained contractions. In this review, after describing concisely the traditional linear methods employed to assess MMF we present the complexity methods used for sEMG analysis based on an extensive literature search. We show that some of these indices, like those derived from recurrence plots, from entropy or fractal analysis, can detect MMF efficiently. However, we also show that more work remains to be done to compare the complexity indices in terms of reliability and sensibility; to optimize the choice of embedding dimension, time delay and threshold distance in reconstructing the phase space; and to elucidate the relationship between complexity estimators and the physiologic phenomena underlying the onset of MMF in exercising muscles.

**Keywords:** sEMG; approximate entropy; sample entropy; fuzzy entropy; fractal dimension; recurrence quantification analysis; detrended fluctuation analysis; correlation dimension; largest Lyapunov exponent

#### **1. Introduction**

#### *1.1. General Aspects*

The analysis of surface electromyography (sEMG) is widely used to characterize the electrical activity of muscle fibers during a contraction, both in isometric (force generation without changing the length of the muscles) and isotonic conditions (force generation by either lengthening [eccentric contraction] or shortening [concentric contraction] the muscles). Whatever the type of contraction, the prolongation of muscle contractions over time invariably causes the onset of muscle fatigue, defined as the inability to sustain force generation over time. To date, sEMG revealed that signs of muscle fatigue may manifest prior to the fatigue onset, suggesting the susceptibility of muscles to fatigue could be assessed noninvasively from the skin. These early signs of myoelectric alterations are often

termed myoelectric manifestations of muscle fatigue (MMF) and are of utmost interest in physiology, pathophysiology, training and rehabilitation studies. However, from the first studies on sEMG analysis during fatiguing contractions it has become apparent that the sEMG signal shows a complex behavior, due to many concurrent factors. Therefore, in recent years, different complexity-based methods of analysis previously applied to physical and other biological time series have been tentatively applied to the sEMG, searching for new techniques to individuate early and efficiently the MMF onset during sustained isotonic and isometric muscle contraction.

In this review, we briefly describe what MMF is and how it has been assessed, we introduce sEMG as a tool to study the mechanisms underpinning muscle fatigue and explain the main linear and spectral methods to detect MMF in exercising muscles. Then, we review the principal complexity methods for sEMG analysis based on an extensive literature search over different databases to be maximally descriptive of all the methodology used, without further considerations on the methodological approach, experimental design, data analysis, and results. For each index of sEMG complexity, we provide a brief description of its meaning, the algorithm for its estimation, the typical parameters setting in sEMG analysis and the main articles employing it in investigating different muscles activations. The relationships reported in previous studies between each index and the physiological mechanisms underpinning muscular activation are propaedeutic to better understand the impact of MMF on each index. Indeed, muscular activation occurring at the beginning of a fatiguing contraction represents the preliminary phase of the fatigued condition. The main results obtained in studies on muscle fatigue are presented and finally, the interpretative theories hypothesized by the investigators are introduced without any personal endorsement but as an objective representation of the state of the art of this field of research.

#### *1.2. Muscle Fatigue*

Muscle fatigue, a reversible reduction in force generation capacity, continues to generate great interest in the scientific community worldwide [1–4]. Its manifestation in several neuromuscular disorders [5] and its influence on sports performance [6] and rehabilitation [7] have led to deeply explore the underlying mechanisms of this phenomenon, which seem to be multifactorial. Beyond psychological aspects, many neuromuscular features ascribable to the central and peripheral nervous system head for electrochemical alterations. Following a classic two-domain concept, central and peripheral fatigue can be distinguished whenever the involved mechanism relates to the spinal and supra-spinal tract (central origin) or to structures distal to the neuromuscular junction (peripheral origin).

At the central level, within the cerebral motor cortex fatigue causes the alteration of cells excitability, the inhibition of motor cortex output and the interruption of action potential conductions at axonal branching sites. As a consequence, the recruiting strategy of muscle fibers, based on increasing the number of muscle fibers and their discharge rate, is deprived of both mechanisms. Moreover, the recruitment of motor units, initially asynchronous, shifts toward a more synchronized pattern and the fatigued motor neurons require a higher excitatory input to ensure their firing rate. Finally, the firing rate of the motor units decreases [3,4,8].

At the peripheral level, electrophysiological adjustments consequent to fatigue onset include accumulation of both inorganic phosphate in the sarcoplasm and increase of intracellular pH. Imbalance of intra- and extra-cellular sodium and potassium concentration combines with impairment in calcium release and reuptake at the sarcoplasmic level and the inhibition of cross-bridges interactions [3,4]. As a result, altered neuromuscular transmission and action potential propagation occur [6,9]. These phenomena, combined with a changing strategy of motor unit recruitment, contribute to span the shape of the action potential, the electrical signal generated by all the motor units recruited by the central nervous system. A reduction of the conduction velocity, the speed at which the action potential propagates along the sarcolemma membrane, is attributed to fatigue onset and represents a focus point in the study of muscle contraction [1,4,6,9].

#### *1.3. The Surface Electromyography*

Muscle contraction is preceded by a cascade of electrophysiological events, from the excitation of motor neurons in the spinal cord to the propagation of action potentials across the muscle T-tubules. All these events, to a certain degree, contribute to the generation and propagation of electric potential in the surrounding tissues, referred to as electromyogram. The electromyogram is often termed as an interference signal, as it coalesces the contribution of many different motor units; depending on the contracting muscle and on the contraction intensity, the number of excited motor units may indeed range from tens to hundreds [10,11]. As schematically illustrated in Figure 1, the interference electromyography (EMG), *x*(*t*), may be modelled as the sum of trains of motor unit action potentials *mi*(*t*), each defined as the time convolution between the discharge instants δ - *<sup>t</sup>* <sup>−</sup> *tij* and the waveform *si*(*t*) of the action potential of each single unit:

$$\mathbf{x}(t) = \sum\_{i}^{N} m\_{i}(t) = \sum\_{i}^{N} \sum\_{j}^{M\_{i}} \delta(t - t\_{ij}) \* s\_{i}(t) \tag{1}$$

where *N* and *Mi* respectively correspond to the number of motor units recruited and the total number of discharges (*j* = 1, 2, 3, ... , *Mi*) for the *i*-th motor unit. The degree of EMG interference is therefore clearly dependent on how often motor units discharge and on the number of motor units excited. Overtly, the degree of interference increases with the contraction level.

**Figure 1.** Schematic representation of the generation of electromyograms from motor unit action potentials. The recorded surface electromyography (sEMG) differs from the physiological electromyogram because of noise and filtering introduced by the detection; *g*(*t*) is the recorded signal on which spectral or complexity-based analyses are conducted, *x*(*t*) is the true signal of interest, based on neurophysiological backgrounds, *e*(*t*) is additive noise, and *H*(*f*) is the transfer function of the recording apparatus.

According to Equation (1), two main sources explain *mi*(*t*): the discharge instants *tij* and the waveform representing the motor unit action potential, *si*(*t*). Being the signal arising from the spinal cord and determining the onset and frequency of muscle excitation, the train of impulses characterising the motor unit discharge instants is regarded as the neural drive to the muscle [12]. The mathematical (Equation (1)) and conceptual (Figure 1) definitions for EMG do not necessarily imply a central origin for the discharge instants as often inappropriately conceived [9,13]; synaptic inputs arising from corticospinal pathways, spinal interneurons, and peripheral afferent feedback collectively determine the net neural drive to muscles [14]. Differently from the muscle neural drive, the waveform of motor unit action potentials does not carry any information from the spinal cord. It is entirely defined by peripheral factors, related to physiological, anatomical and detection aspects [15–18]. Physiological (e.g., conduction velocity, intracellular action potential duration) and anatomical (e.g., depth and length of muscle fibres) aspects are not under the direct control of the experimenter. On the other hand, detection aspects, as position and size of electrodes, should be cautiously defined according to the muscle studied and the purpose of the study. Considering t he widespread sampling of sEMG with a couple of surface electrodes, i.e., bipolar electrodes, here we therefore focus attention on the effect of bipolar montages. The magnitude of the bipolar montage transfer function may be approached as [19]:

$$\left| H(f) \right| \propto \sin^2(\pi f d) \tag{2}$$

with *d* being the centre-to-centre distance between electrodes. Because of its high-pass filtering response for spatial frequencies smaller than 1/2*d*, the bipolar montage is a simple procedure for attenuating common mode signals associated with power line interference and far-field potentials [20–22]. Benefits of attenuation of the latter factor are well conceived in studies aimed at estimating conduction velocity [23] but may be questionable when the intention is to estimate force from EMGs [24]. Clearly from Equation (2), *H*(*f*) <sup>=</sup> 0 at the frequencies *<sup>f</sup>* <sup>=</sup> *<sup>n</sup>*/*d*, *<sup>n</sup>* <sup>∈</sup> <sup>N</sup>. Considering the multiplicative effect of *H*(*f*) on the EMG spectrum), the bipolar montage leads therefore to dips in the frequency spectrum *G*(*f*) of the recorded EMG [17,25].

The electrode filter function *H*(*f*) is particularly relevant when bipolar electrodes are aligned parallel to the underlying muscle fibres, whereby space and time are intertwined. In this case, the argument of the sine function in Equation (2) can be rewritten as π*f d*/*v*, with *v* corresponding to the action potential conduction velocity. This relationship between *d* and *v* could motivate attempts to define the appropriate inter-electrode distance not leading to spectral dips and methods for the estimation of conduction velocity from dips location in *G*(*f*) [26,27]. Both possibilities are arguable though, given they are valid for the specific case electrodes and fibres reside in parallel directions and because the conduction velocity differs between motor units. Moreover, the definition of appropriate inter-electrode distance in bipolar recording should not be based on the avoidance of spectral dips and of spatial aliasing [28] but on whether and how much both affect the possibility of extracting physiologically relevant information from the electromyogram. Although short distances may help attenuating the detection of undesired sources, non-targeted muscles, it may result in the detection of signals unrepresentative of the whole, target muscles. Notwithstanding the selectivity-specificity issue has been traditionally acknowledged [29,30], reports on this matter are incipient [31,32]. Throughout this review, we assume the bipolar EMG is both selective and specific, sampling exclusively from all fibres of the target muscle.

#### *1.4. Surface EMG Analysis in Time and Frequency Domains*

Different indices have been proposed to characterize the surface EMG (sEMG) in both time and frequency domains. Here we refer to these indices as sEMG descriptors. Time descriptors often convey information related to the amplitude of sEMG (i.e., amplitude descriptors) whereas spectral descriptors typically relate to the distribution of energy across the sEMG frequency or power spectrum. Restating the repertoire of time and spectral descriptors so far proposed appears pointless given recent reviews on this issue [33–35]. Our focus is rather on the most widely used descriptors and on their sensitivity to physiological, anatomical and detection aspects.

The sEMG can be conceived as a Gaussian random process with limited bandwidth [36]. The presence of random components in the signal makes unsuitable the use of specific waveform features, such as the peak or peak-to-peak value, to describe the amplitude of sEMGs. The sEMG amplitude is therefore more appropriately defined in statistical terms. Let's consider the measured sEMG as a zero mean signal *g*(*t*) conveying trains of action potentials of different motor units, uncorrelated between themselves (Figure 1), and let's call the power of individual trains of action

potentials of each (*i*) of the *N* excited motor units, represented in time domain as σ<sup>2</sup> *mi* or in frequency domain as *PMi* , and the discharge rate and energy of the action potential of each motor unit as *DRi* and *Ei* respectively. Then, the following relationships holds for the standard deviation or root mean square value of *g*(*t*) [12]:

$$\sigma\_{\mathcal{S}} = \sqrt{\frac{1}{T} \int\_{0}^{T} g^2(t)} = \sqrt{\sum\_{i=1}^{N} \sigma\_{m\_i}^2(t)} = \sqrt{\sum\_{i=1}^{N} P\_{M\_i}(f)} = \sqrt{\sum\_{i=1}^{N} DR\_i E\_i} \tag{3}$$

where *T* corresponds to the period over which the sEMG has been recorded. According to Equation (3), the variance (power) of the recorded signal equals the sum of the power of individual trains of action potentials. Note that the additive property does not hold for the standard deviations: σ*<sup>g</sup>* ≤ σ*mi* . Another interesting aspect in Equation (3) is the monotonic relationship between σ*<sup>g</sup>* and the discharge rate *DRi* and the energy *Ei* of the action potential of each motor unit. These two aspects lead to considerations of practical relevance. First, owing to the temporal overlapping of positive and negative phases of excited motor units, an issue known as amplitude cancellation [37], not all motor units contribute to σ*g*. Keenan et al [38] have shown however that normalization of σ*<sup>g</sup>* with respect to amplitude values obtained during a reference condition (e.g., maximal voluntary contraction) helps contending with the cancellation issue. Second, even though σ*<sup>g</sup>* is sensitive to both discharge rate and number of unit excited, it is also sensitive to any factors affecting the shape, and thus *Ei*, of motor unit action potentials, be them of physiological origin or not. The impossibility of distinguishing the contribution of both origins demands caution when drawing inferences from σ*<sup>g</sup>* [39], in particular when physiological and non-physiological factors may change abruptly and unpredictably like during dynamic contractions [40].

Different descriptors have been also proposed to characterize the EMG spectrum [13,18,41]. The most widely considered are the mean frequency (MNF) and the median frequency (MDF) defined as:

$$\text{MNF} = \frac{\int\_{f\_{\text{min}}}^{f\_{\text{max}}} f \left| G(f) \right|^2}{\int\_{f\_{\text{min}}}^{f\_{\text{max}}} \left| G(f) \right|^2} \tag{4}$$

$$\int\_{f\_{\rm min}}^{\rm MDF} \left| \mathbf{G}(f) \right|^2 = \int\_{\rm MDF}^{f\_{\rm max}} \left| \mathbf{G}(f) \right|^2 \tag{5}$$

with *fmin* and *fmax* defining the EMG bandwidth (typically ranging from 20 to 400 Hz). MDF is less sensible to noise [41] and more sensitive to simulated variations in the EMG spectrum [42] than MNF. Theoretical and experimental considerations upon the effect of discharge instants on the EMG spectrum revealed the rate of discharge of motor units (delta function in Equation (1)) contributes equally to frequencies over 30–40 Hz [18,43,44]. Consequently, and differently from its amplitude, the EMG spectrum is mostly dependent on the waveform of action potentials and not on the discharge rate of motor units. Factors affecting the waveform of action potentials may either change or scale its shape, as the filtering effect of the tissue interposed between electrodes and the excited fibers and the muscle fiber conduction velocity [16,45]. As for amplitude descriptors, the possibility of discerning the relative contribution of physiological, anatomical and detection source affecting spectral descriptors demands careful reflection.

Before commenting on the use and validity of amplitude and spectral descriptors during fatiguing conditions, a general consideration is necessary on EMG stationarity. The above descriptors presume the recorded EMG is stationary, at least in the wide-sense. Wide-sense stationarity is well accepted in applications for which variations in contraction intensity and in muscle shape and properties may be regarded marginal. These circumstances are often limited to laboratory applications, whereby isometric, constant force contractions may be applied. Even so, during such a controlled condition, non-stationarities may manifest, often related to the building up of muscle fatigue. On this regard, Bonato et al [42] wisely classified the sources of non-stationarities in sEMG as being either slow or fast. Slow non-stationarities are mostly associated with sluggish events, as the accumulation of metabolites in the muscle tissue or changes in temperature. Fast non-stationarities are related to any abrupt changes that could be triggered, e.g., by sudden variations in contraction level or in muscle length, both typically occurring in dynamic contractions. The effect of both non-stationarities may be circumvented by appropriately dimensioning the window over which spectral descriptors in isometric contractions are computed [41,46] or by averaging spectral descriptors across a few cycles, if possible, during dynamic conditions [42]. The crucial point though is not the non-stationarity itself but whether EMG descriptors are sufficiently sensitive and robust to detect physiological changes induced by the process under study and nothing else, be it fatigue or any other matter of applied relevance.

#### *1.5. Myoelectric Manifestation of Muscle Fatigue in Time and Frequency Domains*

Experienced sEMG users may wisely contest the potential of the technique to assess muscle fatigue. As defined here, and in agreement with others [9,14,18,47], muscle fatigue may be well assessed by any measurements of performance directly related to the reversible reduction of muscle force. Even the eye of an expert observer could accurately judge the onset of muscle fatigue. In these terms, the use of sEMG finds limited, if any, relevance. It is then that distinction between muscle fatigue and electrophysiological events leading to muscle fatigue must be distinguished. This discrepancy is well discussed in the classical review by De Luca [18]. The failure point, defining the onset of muscle fatigue and thus of a relevant reduction in force, power or performance in general, is preceded by alterations in the chain of events leading to voluntary contraction. These alterations, summarized in the illuminating work of Kirkendal [47], are hardly observable to the naked eye or to performance-measuring sensors. However, these alterations affect the electric potential generated in the surrounding tissues during muscle contraction, making of the sEMG a valid and popular means for studying signs of muscle fatigue. That is, the MMF [18,48,49]. The crucial point though is determining which sEMG descriptors are specifically sensitive to which of the physiological alterations most likely leading to muscle fatigue.

Both amplitude and spectral descriptors have been considered to assess MMF during fatiguing conditions. It is well established indeed that when performance is maintained at a constant level, before the failure point, the amplitude and the frequency spectrum of sEMG change [48–51]. The value of sEMG amplitude and spectral descriptors in studying MMF is however dissimilar, with spectral descriptors typically exhibiting more consistent variations during fatiguing contractions than amplitude descriptors. Multiple factors may account for this. When the sEMG is detected from muscles in which the fibers are aligned parallel to the electrodes, for example, the location of spectral dips depends on the conduction velocity (*fdip* = *nv*/*d*; Section 1.3). In this circumstance, although the decrease in conduction velocity often reported in fatiguing contractions increases the energy of low frequency components, it also shifts the spectral dips to lower frequencies; both effects may therefore cancel out, not altering the total signal power and thus signal amplitude (see Figure 8 in [18]). Similarly, the decreased EMG amplitude expected for when the discharge rate of fatigued motor units decreases (Equation (3)) may be cancelled by the recruitment of additional, fresh units [52] (see Figure 8 in [14]). Motor unit recruitment is another—and possibly the most crucial confounding—factor affecting EMG amplitude. Motor units are known to have different sizes, with bigger units exhibiting a greater number of muscle fibers and thus greater action potentials. Even though one may argue the contribution provided by the recruitment of a big unit may outweigh that resulting from the recruitment of a small unit, the effect on EMG amplitude depends on the average distance of fibers of each unit to the electrodes (see Figure 2 in [39]). This issue is further aggravated if evidence on the rotation of motor units during fatiguing contractions is taken into consideration. Within a single muscle, different motor units have been shown to (rotate) be alternately recruited and de-recruited during prolonged, constant-force contractions [51,53,54]. If sEMGs recorded from a single muscle location do not convey information from the whole muscle [32], motor unit rotation may lead to decreases in sEMG amplitude and inferences on decreased excitation due to fatigue could be incorrect (see Figure 1 in [53]). Given all

these competing factors cannot be controlled for, at least not during voluntary fatiguing contractions, amplitude descriptors may change unpredictably and their use to assess MMF may be unsuitable.

Physiological and non-physiological factors manifesting during fatiguing contractions are known to affect not just amplitude but also the spectral, sEMG descriptors. Differently, though, before the failure point is observed during a constant-performance condition, changes in MNF and MDF consistently indicate a relative shift in energy from high to low frequencies [15,18,41,42,46,48,50,55,56]. Such spectral compression is often attributable to decreases in conduction velocity with fatigue, possibly triggered by altered distribution of H<sup>+</sup> and K<sup>+</sup> across the sarcolemma [47]. The altered membrane excitability with fatigue may also lead to increased duration of intracellular action potentials, similarly leading to spectral compression [15]. The popular use of EMG spectral descriptors to study MMF is therefore presumably attributable to the fact they are equivalently affected by the different culprits of fatigue. The key question is which of these spectral descriptors is mostly sensitive and robust to describe MMF. Different indices have been proposed to characterize the spectral changes taking place with fatigue in the sEMG, based on different, time-frequency distribution approaches [13,55,57–59]. These studies have however devoted to much attention to comparing changes between traditional (MNF, MDF) and the proposed spectral indices without apparently caring for the validity of these changes. All these indices may indeed be flawed as none of these studies has controlled for actual variations in EMG spectrum. Comparing the performance of different indices from experimental data only seems unwise given the relative contribution of physiological and non-physiological sources arising in fatiguing conditions may be unpredictable. Rigorous, simulation studies have been published on this matter though [41,42,46,56]. From synthesized signals, for example, Bonato et al [42] observed that MDF computed from the Choi–Williams time-frequency distribution was shown to most accurately and robustly track abrupt and slow changes in the EMG spectrum typically occurring during dynamic contractions. The ability of MDF to capture the simulated changes was strictly related to focusing analysis on the most biomechanically repeatable portion of the cycle and to the averaging of the spectral descriptors over a few consecutive cycles; i.e., assessing MMF in dynamic conditions demands the underlying movement is repeated as consistently as possible until endurance. Collectively, these results indicate the traditional spectral descriptors may be well suited to study MMF during both isometric and dynamic condition, when certain methodological precautions are taken. EMG users must however be careful when inferences are to be drawn on the mechanisms underpinning fatigue from these spectral descriptors, as different mechanisms may affect them equally.

The considerations just presented for the EMG descriptors traditionally used to assess MMF apply likewise to any other proposed descriptors, many of which are illustrated in the next section. The validity of these indices may be acceptable only after they have been evaluated for robustness and sensitivity, during well-controlled, experimental and simulation conditions.

#### *1.6. Myoelectric Manifestation of Muscle Fatigue in the Complexity Domain*

The complex patterns of sEMG could be attributed to the mechanisms underlying its generation, which seem to be non-linear or even chaotic in nature, as it reproduces the non-linear electrical activity of the neuromuscular system [60]. In addition, the complex properties of sEMG seem to change with fibers contraction during muscle activation [61], potentially giving additional means to the linear sEMG analysis methods in assessing MMF [62]. Therefore, many different methods belonging to the classic non-linear time series analysis of biological signals have been proposed so far to obtain information on fatigue-induced adaptations of neuromuscular processes that could go unnoticed by linear analysis approaches [63]. The hypothesis is that, compared to linear and spectral indices, complexity measurements may detect additional EMG changes occurring with MMF. In the following of this review, readers will find the state of the art about complexity analysis applied to EMG signals, their qualities and the pitfalls that are settled in the procedures [64,65]. Awareness on the limitations of complexity-based methods will be also provided.

#### **2. Materials and Methods**

The measures of complexity of biological signals refer to the predictability of a time-series independently from the amplitude of its fluctuations [66], quantify its temporal irregularity [67] or its long-range (fractal) correlations [68] and estimate the amount of chaos in the underlying system [69]. To address all these aspects of complexity analysis, this review is based on the literature search of the PubMed and Scopus scientific databases using the following terms: EMG, fatigue, nonlinear analysis, complexity, fractal, nonlinear dynamic, entropy, approximate entropy (ApEn), sample entropy (SampEn), fuzzy entropy (FuzzyEn), multiscale entropy (MSE), recurrence plot analysis, detrended fluctuation analysis (DFA), largest Lyapunov exponent (LLE), correlation dimension (CD). Initially, a list of 333 articles was obtained. After having excluded duplicates papers and manuscripts dealing with pattern recognition and EMG classification, a subgroup of 109 studies was considered for the final analysis. Then, the review was limited to the 106 papers written in English without applying any other exclusion criteria. The collected papers were classified into four methodological groups: (1) fractals and self-similarity; (2) correlation; (3) entropy; and (4) deterministic chaos. For each method, we described its mathematical implementation and the influence of muscle activation and fatigue on the complexity indices. The physiological interpretation of the sEMG changes with muscle contraction, when available, aims at providing the reader a key to interpret the results when fatiguing contractions are investigated.

#### **3. Results**

#### *3.1. Fractals and Self-Similarity*

#### 3.1.1. Fractal Dimension

In 1977 Mandelbrot coined the term "fractal" to describe geometric shapes that reveal more details at increasing degree of magnification [70]. Three related features are accredited to fractal forms: heterogeneity, self-similarity and the absence of a well-defined scale of length. Heterogeneity reflects the property of showing emerging details the more closely the shape is examined. Self-similarity defines the characteristics of resembling similar structures at different size scale [68]. The description of fractal structures goes through the determination of fractal dimension (FD), an index characterizing "the complexity and space filling propensity of a structure" [71]. Transposed to time series signals, FD has been demonstrated to describe the self-similarity of a pattern over multiple time-scale [71,72]. FD can be estimated with different algorithms and a popular one is the Katz's method [73] which, however, provides FD estimates that may depend on the length of the time series [74]. The Katz's method has been revised by Anmuth et al. [61] to be applied to sEMG signal during isometric contractions. Given a signal lasting 3 seconds, FD was estimated for the middle 1 s as:

$$FD = \frac{\log N}{\left[\log N + \log\left(\frac{d}{L}\right)\right]}\tag{6}$$

where *N* is the number of samples in the signal, *d* is the planar extent of the waveform (computed as the distance between the first point of the sequence and the point of the series that provides the farthest distance), and *L* the total length of the signal (sum of distances between successive points) [61,73].

Another popular FD estimator is the box-counting method. This algorithm superimposes the time series waveform with a regular grid of square boxes. The size (*S*) of the boxes is increased from small to large dimensions and the number (*N*) of boxes crossed by the waveform is computed for each size. FD is thus estimated as:

$$FD = \frac{\log N}{\log \frac{1}{S}} \tag{7}$$

Since the fractals structures show an inverse power law relationship between *N* and *S*, *FD* in Equation (7) corresponds to the slope of the linear relationship between log*N* and log1/*S* [71]. In the study of Gitter, box sizes were chosen as a multiple of the -amplifier bit resolution and the sampling rate and its range varied from 2 to 500 boxes [71] (where a unit box had a physical dimension of 5580 μV/μs). FD values close to 1 reflect smoothed signals whereas values approaching 2 are typical of signals with high space-filling propensity [75]. The box-counting algorithm has been used to evaluate sEMG signals during isometric and isotonic contractions [76–78].

*FD and muscle activation.* Anmuth et al. [61], and Gitter and Czerniecki [71] investigated the behavior of FD as a function of force and found that, similarly to other traditional EMG indices, the average FD increased almost linearly with the force intensity for force values below 50% of the maximal force (Figure 2). Conversely, above this level the FD rise declined, deviating from the linear increase [61,71,79]. Similarly, Beretta-Piccoli et al. [80] found a low dependence of FD on force intensity. Indeed, they observed a linear relationship between FD and the level of force from 10% till 30% of the maximum voluntary contraction (MVC), but at higher force intensity FD leveled to a plateau. Even though these results led the authors to speculate the FD descriptor is "a reliable indicator of motor unit synchronization, less dependent from the firing rate" no direct evidence appears to confirm the sensitivity of FD to motor unit synchronization.

**Figure 2.** Averaged fractal dimension (FD) as a fraction of maximal voluntary contraction (MVC) force (redrawn from Gitter and Czerniecki, [71], with permission).

Xu et al. [79] determined FD on simulated EMG signals in which motor unit recruitment and firing rate was varied. They found that FD increased with the recruitment but the rate of the increment tended to plateau when recruitment was high. Moreover, firing rate influenced FD, but only for low values of recruitment [79].

Noticeably, not all the investigators found linear correlations between FD and force, neither at low level of force. Indeed, Troiano et al. [81] did not found any relationship between FD and percentage of MVC force in trapezius muscle, and similar results were obtained by Poosapadi and Kumar [82].

*FD and fatigue:* FD has been proposed to monitor changes in EMG signal as a consequence of fatiguing contractions [75,77,78]. Beretta-Piccoli et al. [75] used FD to investigate MMF in knee-extensors muscles, reporting the time-course of FD values in vastus lateralis and vastus medialis muscles during sustained contractions at different intensities. Analyzing the time course of FD during the development of fatigue a clear significant negative slope appeared, although different in the two muscles. The authors, citing the study of Mesin et al [78] in which a decline in FD was associated with a progressive MU synchronization, ascribed this behavior to an increase in MU synchronization as expression of the central nervous system adaptation to fatigue progression. Moreover, the investigators attributed the different slopes found between the two muscle bellies to the different proportion of slow and fast twitch fibers constituting the muscles.

The decay of FD during sustained isometric contractions is the common denominator of the studies of Mesin et al. [78], Beretta-Piccoli et al. [75,83], Troiano et al. [81] and Boccia et al. [77]. Indeed, they found a linear decrease of FD during fatiguing contractions and attributed this response to an increase in motor unit synchronization (Figure 3). In [78], FD values showed no association with motor unit conduction velocity, supporting the idea that FD is more sensible to central rather than peripheral fatigue. Despite this, the authors drew these conclusions using advanced signal analysis techniques, the interference nature of the EMG signal makes questionable any speculation on the origin of the fatigue components (central rather than peripheral).

Lin et al. [84] investigated the FD during isotonic repeated submaximal contractions (pedaling) but observed no change. Meduri et al. [85] also tested the existence of different gender-related resistance to fatigue in biceps brachii muscle. The time courses of conduction velocity and FD were determined during the time-to-exhaustion task. Investigators found a lower initial FD in females compared to males. Moreover, the rate of FD decrease at low level of contraction intensity was not different between genders whereas males showed a significantly higher decrement of FD during 60% MVC exhausting contraction. Importantly, the authors speculated the initial values of FD seem to be affected by motor unit synchronization as well as by subject fat layers and skin properties (Figure 3).

**Figure 3.** Mean percentage of changes in FD versus time in males (blue) and females (red) during 60% MVC prolonged contraction. The time scale is expressed as a percentage of the total exhaustion time for each subject (from Meduri et al., [85] with permission).

Troiano et al. [81] investigated the behavior of FD during fatiguing contraction at 50% MVC and found a significant fatigue influence on FD. Indeed, the rate of changes of FD determined during fatiguing tasks strongly correlated with endurance time, making this parameter a valuable tool to predict the time to exhaustion during an isometric task.

Finally, Mesin et al. [86] explored the influence on FD on both different percentages of motor unit synchronization (from 0–20%) and different motor units firing rates (5–40 Hz). As previously anticipated, the Authors evidenced the existence of an inverse relationship between FD and motor

unit synchronization and the positive relation with motor units firing rate. These findings have shed new light on the interpretation of fatigue-induced changes of FD, making FD no more considered as an exclusive index of motor unit synchronization [86].

Table 1 summarizes methodological aspects and results of studies on FD and muscle fatigue.

**Table 1.** Estimation parameters and fractal dimension (FD) in studies comparing fresh vs. fatigued muscles.


\* Contraction type was simulated in this work and isometric in all the others. BB = biceps brachii; FDS = flexor digitalis superficialis; VL = vastus lateralis; T = trapezius.

#### 3.1.2. Detrended Fluctuation Analysis.

A process g(*k*) is "self-similar" when it holds the same statistical properties of *a*−H*g*(*ak*), with H the Hurst exponent. This means that subsets of the original series properly rescaled to the size of the original one look statistically similar to the original, a property called "self-similarity". The Detrended fluctuation analysis (DFA) is a complexity method to assess the scaling properties of self-similar signals. The algorithm returns a scale parameter α which is strictly related to the Hurst exponent, with α = 0.5 in case of no correlation (white noise), α = 1 in case of "1/f" (or pink) noise, and α = 1.5 in case of Brownian motion (or random walk). In particular, 0 < α < 0.5 indicates anti-correlation between samples whereas α > 0.5 indicates long-range correlation [87].

To estimate α of a series *g*(*k*) of *N* samples, first *y*(*k*), cumulative sum of *g*(*k*), is calculated. Then:


$$F(n) = \sqrt{\frac{1}{N'} \sum\_{k=1}^{N'} \left[ y(k) - y\_n(k) \right]^2} \tag{8}$$

The Steps 1–4 are repeated for different box sizes *n* and α is estimated as the slope of the regression line fitting *F*(*n*) vs. *n* in a log-log plot [87]. Successive improvements of the DFA method considered least-square detrending polynomials of order greater than one and were able to employ the whole series of *N* samples for each block *n* with properly overlapped boxes [88,89]. The popularity of the DFA method lies in the fact that unlike other estimators of the Hurst exponent it does not require to know in advance whether the fractal series belongs to the family of the fractional-Gaussian noises (fGn) or Brownian motions (fBm) [65]. The DFA provides acceptable estimates of H for both these classes, being α = H for an fGn process, and α = H + 1 for an fBm process [90].

*DFA and muscle activation*: Different studies demonstrated an increase of the DFA scaling exponent with muscle effort [91–93]. In addition, concentric contractions result in lower α values compared to isometric and eccentric contractions, with scale exponents close to one (the characteristic value of 1/*f* power law phenomena) suggesting a higher level of complexity [93]. Such difference were explained by the different levels of motor unit recruitment which occur during concentric versus eccentric contraction [94,95] and possibly by the different motor control strategy which regulates concentric and isometric contraction [96,97].

*DFA and fatigue*. The MMF as assessed by the DFA scale exponent results in a significant loss of signal complexity. Interestingly, Hernandez et al. [93] recently found a significant multivariate effect from fatigue status and muscle contraction type. They found that α DFA was significantly lower during non-fatigued compared to fatigued conditions and during concentric compared to isometric contractions. During fatigued condition, α was close to 1.5, value characteristics of Brownian motion.

#### 3.1.3. Multifractality

Complex systems may also generate multifractal time series. A multifractal series is composed by interwoven fractal processes and specific methods of analysis should be applied to identify the components of the multifractal dynamics.

One multifractal method used in sEMG analysis is based on the evaluation of the singularity spectrum over successive epochs of 1s duration [62]. The measured signal is covered with boxes of size *l* and the probability *Pi*(*l*) in each box *i* is calculated. For monofractal series, *Pi*(*l*) increases as the power α*<sup>i</sup>* of the size *l*, the exponent α*<sup>i</sup>* being called the singularity strength.

For the multifractal analysis, a normalized *Pi*(*l*) measure is used:

$$\mu\_i(q,l) = \frac{[P\_i(l)]^q}{\sum\_j [P\_i(l)]^q}.\tag{9}$$

The exponent *q* allows highlighting the different components of the multifractal time series. The normalized measure in fact amplifies the fractal components with greater singularity when *q* > 1 and those with lower singularity when *q* < 1. In particular, if the series is monofractal the singularity strength does not change with *q*. Thus, averaging α*<sup>i</sup>* over all the boxes *i* one obtains the function α(*q*) that provides a measure of the degree of multifractality. The singularity α determines the Hausdorff fractal dimension *f* of the data and therefore, as α changes with *q*, also *f* changes with α. The function *f*(α) that describes the fractal dimension as a function of the singularity strength is called the singularity spectrum.

Another way to assess the multifractality of a time series is to extend the DFA method, which was originally proposed for monofractal series. This is done modifying the definition of the variability function *F*(*n*) in Equation (8) and calculating a variability function *Fq*(*n*) which depends on the moment order *q* as:

$$\begin{cases} \; \, \_qF\_q(n) = \left( \frac{1}{M} \sum\_{k=1}^M \left( \sigma\_n^2(k) \right)^{q/2} \right)^{1/q} \quad \text{for } q \neq 0\\ \; \, \_qF\_q(n) = e^{\frac{1}{2M} \sum\_{k=1}^M \ln \left( \sigma\_n^2(k) \right)^{q/2}} \quad \text{for } q = 0 \end{cases} \tag{10}$$

where *M* is the number of blocks of size *n* and σ<sup>2</sup> *<sup>n</sup>*(*k*) is the variance of the residuals in each block [98]. When *q* = 2, *Fq*(*n*) coincides with the "monofractal" variability function *F*(*n*). The multifractal variability function amplifies the fractal components with greater amplitude when *q* > 0 and those with lower amplitude when *q* < 0. At each moment order *q*, a multifractal DFA coefficient, α(*q*), is estimated as the slope of the regression line fitting *Fq*(*n*) vs. *n* in a log-log scale. If α(*q*) depends on *q* the series is multifractal while monofractal series are characterized by constant α(*q*) functions.

*Multifractality in muscle action*. Li et al. applied the method of multifractal DFA to the cross-correlation function between force and sEMG [99]. The results show a strong statistical self-similarity in the correlation sequences between force and sEMG signals, with fractal characteristics similar to 1/f noise or fractional Brownian motion. The multifractal DFA has been applied to the biceps brachii contraction, and it was observed that the sEMG signal is mono- and multifractal in different time scales, with "several fractal-scaling breaks" [100].

*Multifractality and fatigue*. The singularity spectrum *f*(α) of the sEMG signal of the biceps brachii was estimated during isometric contractions and the area of the singularity spectrum was taken as a concise index of the degree of the sEMG multifractality [62]. The results demonstrated that the area of *f*(α) consistently increased during the static contraction suggesting the use of *f*(α) for assessing muscle fatigue.

The multifractal DFA approach was used also to evaluate whether the effects of fatigue on the EMG signal could be estimated with greater accuracy than that of conventional indices of EMG such as the MDF of the sEMG power spectrum [100]. The observed changes in Hurst exponent in the fatigued muscle may be due to a reduction in conduction velocity in muscles fibers and to the enlarged motor unit action potential, which may increase the long-range correlation in sEMG at small time scales.

#### *3.2. Correlation*

#### 3.2.1. Correlation Dimension

In 1996, Nieminem and Takala demonstrated that sEMG is better modeled as the output of a non-linear dynamic system rather than as a random stochastic signal [101], suggesting the use of non-linear analysis methods. Among the non-linear methods, the evaluation of the correlation dimension (CD) [102] has been used to classify the sEMG dynamics, both at rest and during light and fatiguing muscle contractions. CD is a measure of the amount of correlation contained in a signal connected to the fractal dimension. The CD estimation requires the calculation of the correlation integral C(*r*), which is the mean probability that the states of the dynamical systems at two different times are close, i.e., within a sphere of radius *r* in the space of the phases. Given a time series *g*(*k*), the phase space is reconstructed by the vectors *G*(*k*) = [*g*(*k*), *g*(*k* + <sup>τ</sup>), ... , *g*(*k* + (*m* <sup>−</sup> *1*)τ)]T with *m* the embedding dimension and τ a delay. The correlation integral is then estimated by the sum:

$$\mathbf{C}(r) = \frac{1}{N^2} \sum\_{\substack{i,j=1 \\ i \neq j}}^N \Theta(r - \|\mathbf{G}(i) - \mathbf{G}(j)\|) \tag{11}$$

where *N* is the number of states, Θ the Heavyside function and ... the Euclidean norm. If *g*(*k*) is the output of a complex system, when *N* increases and *r* decreases, *C*(*r*) tends to increase as a power of *r*, *C*(*r*)~*rCD*. Thus, CD, the correlation dimension of the system can be estimated as the slope of the straight line of best fit in the linear scaling range region in a plot of ln (*r*) versus ln *r*.

The algorithm requires a large amount of data to provide reliable estimates, a restraint in the analysis of sEMG. Furthermore, the estimates are unreliable for *m* greater than 14 (Nieminem and Takala, 1996) and the computational time increases exponentially with the number of samples (Bai-Lin, 1990).

*Correlation dimension and muscle activation*. The studies on correlation dimension applied to sEMG firstly confirm the non-linear character of muscle electrical activity, which shows a structure different from a pure random noise [103]. Thus, during the dynamic muscle contraction, neuromuscular system has been demonstrated to "progressively changes from narrow band orderly recruitment pattern to a broadband chaotic pattern" [103].

As it concerns muscle activity, EMG signals from lower limbs muscles during walking were found to exhibit signs of chaotic behavior by the computation of CD values between two and three [104,105]. Furthermore, the study of the electrical activity of paravertebral muscles during different bending postures demonstrated that CD is a reliable method to compare the EMG signal in various muscle contraction conditions [106]. Finally, during a submaximal test of isometric loading Meigal et al. demonstrated that correlation dimension was able to distinguish the sEMG characteristics between two groups of young and old healthy individuals [107].

More recently, Wang et al. used a mixed mathematical approach, based on decomposing the sEMG signal by the wavelet transform for calculating CD, to distinguish four types of forearm movements. This could be prospectively useful to classify muscle movements in the conception and design of new powered limb prostheses [108].

*Correlation dimension and fatigue*. Muscle fatigue seems to reduce the dimensionality of the system, as assessed by CD: this has been ascribed to motor unit synchronization and reduction in action potential velocity and firing rate, which may reduce the neuro-muscular system adaptability [101]. However, a precise connection between the physiologic adaptation to fatigue in muscle activity and the changes in correlation dimension of sEMG signals is still lacking.

#### 3.2.2. Recurrence Quantification Analysis

The recurrence quantification analysis (RQA) is a nonlinear geometrical tool used "to bring out temporal correlations in a manner that is instantly apparent to the eye" [109]. This analysis was proposed by Eckmann in 1987 to detect recurring patterns and non-stationarities in a dynamic system [110]. Given a data set *x*(*i*) of points, RQA is constituted by a recurrence plot in which an array of dots is arranged in a square map and darkened pixels are plotted at specific coordinate *i*, *j* whenever the point *x*(*j*) is closer than a distance threshold *r* to the point *x*(*i*). When the distance between *x*(*i*) and *x*(*j*) is below *r*, *x*(*j*) is considered as recurrent and then, a dot is signed on the recurring map at the coordinate (*i*, *j*). Given a time series *g*(*i*) its recurrence plot is obtained as follows:


Since *i* and *j* are times the resulting recurrence plot provides information on the time correlation of the data set.

Different recurrent structures might be found looking at the recurrence plots [111,112]. Single isolated points result from chance recurrences in the signal; upward diagonal lines reflect the presence of a deterministic rule into the signal as they appear "whenever strings of vectors reoccur further down the dynamic" [113]; vertical and horizontal lines indicate the occurrence of isolated vectors of data set that match with a repeated string of vectors separated in time; and blank bands are the consequence of transients in time series. Given that subtle patterns are not always detected, different quantitative descriptors can be determined. Readers can found an exhaustive description of the recurrence plot descriptors in the brilliant paper of Webber and Zbilut [112]. The most often used are:

(i) Percent determinism (%DET), that quantifies the percentage of recurrent points forming diagonal line structures

$$\%DET = \frac{\sum\_{l=I\_{\text{min}}}^{N} IP(l)}{\sum\_{i,j}^{N} R\_{i,j}} \tag{12}$$

where *P*(*l*) is the frequency distribution (i.e., the probability) of diagonal lines with length *l*, being *l* an integer number;

(ii) Percent recurrence (%REC), that quantifies the density of recurrent points in the plot:

$$\%REC = \frac{1}{N^2} \sum\_{i,j=1}^{N} R\_{i,j} \tag{13}$$

A critical aspect of RQA is the need to carefully tuning the embedding dimension, the delay τ and the threshold distance to obtain reliable estimates [78,111,114]. A typical value of the delay τ is the first zero of the autocorrelation function.

*RQA and muscle activation*. Several studies investigated the sensitivity of RQA to sEMG shifts towards more deterministic behaviors under different contraction intensities and characteristics in both small and large muscles [114–118]. Filligoi and Felici [113] evaluated %DET during voluntary contractions at three different force levels, each sustained for 20 seconds. Although the initial %DET value was insensitive to force levels, the slope correlated with the contraction intensity.

RQA behavior was also investigated in response to different levels of motor unit synchronization by computing %DET before, during, and after the injection of a drug to increase the motor units synchronization [118]. %DET rose as a function of synchronization in most of the investigated muscles, leading the authors to consider it as a suitable tool to monitor changes in motor unit synchronization [118]. Different results were reported by Schmied et al. [119] that did not find a correlation between %DET and the amount of synchronous impulses when contractions were performed at a low level intensity. No correlations were also found between %DET and potentiation phenomena neither in endurance-trained nor in power-trained athletes [120].

Some studies compared recurrence analysis to frequency analysis finding prompter response and higher magnitude of %DET compared to spectral indices. This supports the idea that recurrence indices present a higher sensitivity than spectral indices to detect sEMG drifts [114,116,121].

*RQA and fatigue.* RQA also explored the effects of fatigue on muscular activation in different studies, which found a continuous rise of %DET as a function of time, although the results could be influenced by factors such as contraction intensity, muscle size [78,111,116,121–124], altitude and other muscles characteristics [122]. The role of contraction intensity was explored by Webber and Zbilut who found an almost-steady-state behavior of %DET during sustained light loading whereas during heavy loading a progressive rise occurred [111].

RQA was also adopted to characterize fatigue effects in different groups: power-trained athletes, endurance athletes, wheelchair basketball players and sedentary control subjects [116,120,122,125,126]. While %DET increased in all the athletes' phenotypes, it did not in control group. Changes in %DET in athletes were ascribed to a more regular and more similar bursts pattern, while differences between groups were explained with the different proportion in fibers composition. Figure 4 shows an example, in a representative subject, of the computation of EMG power spectrum (with the calculation of the median spectral frequency) and RQA plot (personal data) during non-fatigued and fatigued muscle conditions. During fatigue, the computed mean spectral frequency decreases and the spectral power increases. Furthermore, the density of recurrent points remains relatively unchanged (constant %REC), but the arrangement of points is altered, indicating an increased periodic component in the EMG during fatigue.

Muscle endurance was also evaluated by RQA after exposure to high altitude. Similarly to normobaric condition, %DET progressively increased during the sustained contraction; however, the slope became steeper under exposure to hypobaric hypoxia [122]. Two studies evaluated the behavior of spectral variables and recurrence-plot indicators (%DET and %REC) on experimental as well as simulated EMG signals. In these latter the response to two typical signs of muscular fatigue, like reduction of conduction velocity and the increase in motor unit synchronization, were explored. %DET and %REC showed to be influenced by the conduction velocity and by the degree of synchronization [78,116]. Ito and Hotta, by the use of RQA, recently explored sEMG behavior during exhausting contraction under blood flow restriction. They found an increase in %DET during contraction and even higher values when blood flow restriction was applied [127]. Table 2 summarizes the parameters adopted for RQA analysis in previous studies and the results in terms of %DET and %REC in fresh and fatigued muscles.

**Figure 4.** Power spectra with median frequency (MDF) (left panels) and recurrence plots with percent determinism (%DET) and percent recurrence (%REC) from recurrence quantification analysis (RQA, right panels) of sEMG signals for the non-fatigued (**A**) and fatigued (**B**) vastus lateralis muscle in one representative subject (personal data); analysis parameters are: *N* = 1024; τ = 4; *m* = 4; *r* = 15.

**Table 2.** Percent determinism (%DET) and percent recurrence (%REC) in fresh vs. fatigued muscles by RQA.


*m* = embedding dimension; τ = delay; τ<sup>0</sup> = first zero of the autocorrelation function (typically between 3–5 ms); *r* = radius as % of maximum distance or (a) of mean distance; BB=biceps brachii; BM =back muscles; BR=brachioradialis; D = deltoid; EC = extensor carpi radialis; FD = first dorsal interosseous; MF = multifidus; Q = quadriceps; VL = vastus lateralis.

#### *3.3. Entropy*

3.3.1. Approximate Entropy, Sample Entropy and Fuzzy Entropy

In 1991, Pincus coined the term approximate entropy (ApEn), to indicate a method estimating the "likelihood that runs of patterns that are similar remain similar on next incremental comparisons" [67]. An advantage of this method is its applicability in noisy and short datasets [128–130]. To calculate ApEn of a series *g*(*i*) of *N* equally-spaced values, one should first set an embedding dimension *m* and a distance threshold *r* and then:


$$\mathcal{C}\_i^m(r) = \frac{n\_i^m(r)}{N - m + 1} \tag{14}$$


Then,

$$ApEn(m, r) = -ln\left[\frac{\mathbb{C}^{m+1}(r)}{\mathbb{C}^m(r)}\right] \tag{15}$$

Deterministic sequences present a high degree of regularity, i.e., if they are similar for *m* points they are likely similar also for the next point, *m* + 1. Therefore, higher is the regularity, lower is ApEn. Since each sequence matches itself, ApEn is a biased estimator and it is lower than expected for short records [128]. This also implies that it lacks relative consistency, making it difficult to interpret the comparison of different datasets. Moreover, because of its bias, ApEn depends on the signal length. When two time-series are compared, care must be taken to estimate ApEn on the same signal durations [130].

Sample Entropy (SampEn) addresses the drawbacks caused by self-matching and provides better consistency and performance than ApEn [128]. SampEn reduces the bias avoiding self-comparison between vectors [130]. This is done by calculating *nm <sup>i</sup>* (*r*), the number of vectors similar to *G*(*i*), for all the vectors *G*(*j*) excluding *j* = *i*. This leads to defining SampEn as:

$$\text{SampEn}(m, r) = -\ln \frac{A^m(r)}{B^m(r)}\tag{16}$$

where:

$$A\_i^m(r) = \frac{n\_i^{m+1}(r)}{N - m - 1} \tag{17}$$

$$B\_i^m(r) = \frac{n\_i^m(r)}{N - m - 1} \tag{18}$$

Boasting better consistency and robustness, the fuzzy approximate entropy (FuzzyEn) was proposed in 2010 for noisy and short datasets [129,131]. Additionally, FuzzyEn was independent of the tolerance *r* introducing the concept of fuzzy membership functions for determining the degree of similarity between patterns. Therefore, the similarity between *G*(*i*) and *G*(*j*) is quantified by a fuzzy continuous and convex function [129,132]:

$$C\_i^m(r) = \frac{1}{N - m + 1} \sum\_{j=1, \ j \neq i}^{N-m+1} \Omega\left(d\_{i,j'}^m r\right) \tag{19}$$

with [121,131,133,134]

Finally,

$$
\Omega \Big( d\_{i,j'}^{\text{in}} r \Big) = \varepsilon^{-\frac{d\_{i,j}^2}{r}} \tag{20}
$$

$$\text{Fuzz}zyEn(m, r) = -\ln\left[\frac{\mathbb{C}^{m+1}(r)}{\mathbb{C}^m(r)}\right] \tag{21}$$

where *Cm*(*r*) is the average of *Cm <sup>i</sup>* (*r*) for all the vectors *G*(*i*).

#### 3.3.2. Multiscale Entropy

The measures of entropy like SampEn cannot properly distinguish whether the irregularity of the time series just reflects random components or whether it is generated as the output of a genuine complex system. To better detect the presence of complexity in the time series some authors proposed a multiscale approach to entropy [135]. The multiscale entropy method is based on the evaluation of SampEn on progressively coarse-grained series. A coarse graining of order *n* consists in applying a moving average filter of order *n* on the original series *g*(*i*) and in decimating the filtered series taking one sample every *n*. Then, SampEn is estimated over the coarse-grained series obtaining the multiscale entropy at the scale *n*, MSE(*n*). Clearly, MSE (*n* = 1) coincides with SampEn by definition. Like SampEn, also the multiscale entropy needs the preliminary choice of the proper embedding dimension *m* and threshold distance *r*. In addition, it is still unclear whether the same threshold *r* should be used at all the scales *n* or whether it should be adjusted at each scale, *r*(*n*) [136]. Recently, the coarse-graining procedure has been improved to allow stable estimates at large scales even when analyzing relatively short data segments and to reduce leakage from the shorter to the larger scales due to the wide transition band of the moving average filter [137]. A concise way to quantify the MSE(*n*) profile is to sum all scales shorter than a critical scale τ*<sup>c</sup>* to obtain a short-term complexity index, *CS*, and to sum all the scales larger than τ*<sup>c</sup>* up to the largest estimated scale, *nmax*, to obtain a long term complexity index, *CL*, as:

$$\mathcal{C}\_{\mathcal{S}} = \sum\_{n=1}^{\tau\_{\mathcal{E}}} MSE(n) \tag{22}$$

$$\mathcal{C}\_L = \sum\_{n=\tau\_c+1}^{n\_{\text{max}}} MSE(n) \tag{23}$$

To identify the critical scale τ*<sup>c</sup>* analyzing sEMG during isometric contractions, Cashaback et al. performed a piecewise-linear regression on MSE(*n*) estimates for scales *n* between 1 and 50 samples (corresponding to the range between 0.004 and 0.2 s) and found a single breakpoint demarcating two linear scaling regions [138]. The intersection of the two-piece regression defined τ*<sup>c</sup>* (see Figure 5).

*Entropy and muscle activation*. Several studies used entropy-based methods in characterizing the complexity of EMG signals during relaxed conditions [139] and contractions [117,131,140]. From a physiological viewpoint, as healthy biological systems show markedly higher complexity than compromised ones, low entropy values could be read as a sign of impairment [141].

Despite the different studies using ApEn on EMG signals, its consistency and reliability have recently been questioned [72]. Zhou et al. employed SampEn and FuzzyEn to interpret sEMG collected at different intensity levels of contraction and found a very weak correlation between SampEn and muscle torque while FuzzyEn showed a direct positive correlation with the effort [134]. These authors concluded that FuzzyEn could be a useful alternative to force estimation whereas SampEn might be determined as a biomarker of EMG able to overcome interference due to changing muscular contractions intensity. A relationship between entropy measures and force production was also examined by Troiano et al., [81] who found no effect of fatigue on entropy values.

**Figure 5.** Example of sEMG multiscale entropy and identification of the critical scale τ<sup>c</sup> for the definition of short-term and long-term complexity (from [138] with permission).

Finally, MSE analysis was applied to sEMG signal by Cashaback [138] to evaluate the short-term complexity of sEMG at three different intensity contractions. The authors reported a correlation between MSE and contraction intensity although the level of complexity at 100% was only slightly different compared to the one found at 70%. The investigators hypothesized that, given that force production above 70% is mainly attributed to an increase in temporal firing, signal complexity might be mainly influenced by rate discharge rather than motor unit recruitment [138].

*Entropy and fatigue*. The use of entropy algorithms to study MMF has been recently evaluated [121,129,132,142]. Hernandez et al. [93] recently studied the individual influence of fatiguing contractions and of different contraction types on the complexity of sEMG signal by SampEn and DFA. The effect of the combination of both factors were also evaluated. Given that SampEn values decreased in fatigued conditions and different values were found among the contraction types, the Authors concluded that "sEMG complexity is affected by fatigue status and contraction type, with the degree of fatigue-mediated loss of complexity dependent on the type of contraction used to elicit fatigue" [93] (Figure 6).

**Figure 6.** Sample entropy (mean ± SEM) of vastus lateralis sEMG signals from non-fatigued to fatigued conditions during concentric, eccentric, and isometric contractions (from Hernandez et al., [93] with permission).

Lin et al. applied the SampEn algorithm to sEMG signals collected from quadriceps muscles during cycling. Comparing the results obtained under fatigued and un-fatigued conditions they found no differences in SampEn values [84]. The absence of any changes in signal EMG complexity was attributed to the different type of contraction (isometric and cyclic).

FuzzyEn was also used to characterize the determinism of sEMG signal during fatigue [129,132,142]. The study of Xie et al. compared the time course of FuzzyEn with that of ApEn and of the MDF and found that FuzzyEn decreased linearly during muscle contraction as well as the MDF, where ApEn did not [129]. Successively the Authors compared the performance of FuzzyEn with SampEn and ApEn and concluded in favor of FuzzyEn, due to its better robustness to the analysis length [132]. Navaneethakrishna et al. [142] applied FuzzyEn to explore determinism in sEMG signal under fatigued and un-fatigued conditions and, similarly to previous studies, found a decline in entropy throughout fatigue development.

Kahl and Hofmann [121] compared six different algorithms (including SampEn and FuzzyEn) in the detection of local MMF. The sEMG signal was analyzed by spectral, entropy and recurrence quantification analysis. Authors found that entropy-based variables performed better than recurrence methods, though ApEn provided a low MMF detection quality. Better results were found from SampEn. Moreover, a limit of FuzzyEn method was recognized on the high computational effort.

The above cited work of Cashaback et al. [138], based on MSE approach, found that entropy values significantly decreased after fatigue. The authors hypothesized that the reduction of signal complexity might have resulted from a decrease of action potential amplitude and velocity as a consequence of alterations in the metabolic and enzymatic events involved in muscle contractions. Similarly, Navaneethakrishna et al. [142] observed a clear reduction of MSE values with MMF and attributed the finding to the fatigue-induced synchronization of motor unit recruitment that in turn would have led to the generation of more regular pattern in the neuromuscular signal.

MSE was used to investigate MMF also in a group of children with cerebral palsy to have a deeper insight into the central nervous system and neuropathological mechanisms underpinning muscle contractions [143]. Investigators noticed a decreasing pattern of MSE along with fatigue development and ascribed it to a reduction of motor unit synchronization.

Table 3 shows settings and results obtained using entropy algorithms in the studies taken into considerations.


**Table 3.** Entropy of sEMG during contractions.

*r* = threshold expressed as a fraction of standard deviation; BB = biceps brachii; Q = quadriceps; VL = vastus lateralis; EC = extensor carpi radialis; FC = flexor carpi ulnaris.

#### *3.4. Deterministic Chaos*

#### Largest Lyapunov Exponent

The determination of the chaotic properties of a nonlinear system may be performed through the computation of largest Lyapunov exponent (λLLE), which estimate the rate of exponential divergence of neighboring trajectories into the phase space. This measure can therefore quantify the "amount of chaos" in a system. Different algorithms have been implemented to determine λLLE from finite amounts of experimental data. The first implementation by Wolf estimated the non-negative Lyapunov exponent and determined the grade of unpredictability by the magnitude of the exponent, but it was rather inefficient [144]; later, the Rosenstein's method proved to be more efficient and overcame the drawbacks of the Wolf algorithm [145]. Rosenstein's algorithm requires four input variables: time delay, minimum embedding dimension, mean period and maximum number of iterations. Briefly, the EMG time series of *N* points is considered as a trajectory in the embedding space. The algorithm locates the nearest neighbor of each point *j* of the trajectory, and considers the distance between these two close points as a small perturbation, Δ*j*(0). It is assumed that the *j*-th pair of nearest neighbors diverges in time at the exponential rate given by the largest Lyapunov exponent λLLE, which means that *ln*Δj(*i*) = C*<sup>j</sup>* + λLLE*i*. This equation, evaluated λLLE for all the *j* pairs, represents a set of parallel lines. To reliably estimate λLLE from short and noisy data, the average of the parallel lines is computed. In general, the average line shows a long linear region after a short transition, and is estimated as the slope of the regression line fitting the average line.

*Muscle activation and* λ*LLE*. The Rosenstein method for calculating λLLE was applied on EMG signals by Chakraborty and Parbat [72] for the assessment of chaotic patterns during isotonic contractions of biceps brachii muscle (arm flexion with 1 kg load). Considering the stochastic nature of EMG, the authors used Cao's method for determining the embedding dimension [146], whereas the time delay was determined through Kraskov's mutual information function [147], the mean period was obtained as the reciprocal of the median frequency found by the average Welch periodogram technique and 100 iterations were used as the last input variable. The results obtained by this application suggested the presence of deterministic chaos in EMG signal, and found an, although very limited, variability with the applied load. In another study, when applied to the electrical activity of paravertebral muscles during various bending postures, the positive Lyapunov exponent could not discriminate the contraction conditions, differently from CD [106].

*Muscle fatigue and* λ*LLE*. The estimation of the largest Lyapunov exponent had limited applications in the evaluation of muscle manifestation of fatigue. The λLLE value did not change with the increase of the muscle load in [72], although it was unlikely that the load used in this work provoked a significant fatigue state in the tested muscle (biceps brachii). Significant reductions in the dynamic stability of low back EMG were found during a fatiguing task (30 repetitions of trunk extension) by means of the maximum Lyapunov exponent [148]. Interestingly, the index was lower in subjects with chronic low back pain (in whom paravertebral muscles are often contracted for antalgic reasons) compared to control subjects, with a trend more pronounced in people with low back pain toward a reduction during asymmetric versus symmetric tasks [148]. In a work of Padmanabhan and Puthusserypady [104] sEMG signals exhibited chaotic behavior with a greater number of positive Lyapunov exponent for signals recorded during maximal voluntary contraction than during walking. Finally, Sbriccoli et al., [149] demonstrated a significant reduction (by 14–42%) of λLLE in EMG from muscles with exercise-induced muscle damage (by 35 maximal contractions of biceps brachii), with complete recovery after two weeks.

#### **4. Discussion**

This review aimed at describing the main linear and complexity analysis methods in the literature which were applied to the EMG signal to determine the effects of fatigue on muscle electric activity (the scheme we followed is summarized in Figure 7). The issue we reviewed plays an important role in physiology (e.g., exercise physiology, neurophysiology, training, etc.) and pathophysiology settings (physical rehabilitation, neurology, prosthesis development, etc.).

Some linear and spectral descriptors of EMG, as the σ<sup>g</sup> and MDF have been demonstrated to be sensitive to fatigue-induced variations of EMG. However, intriguingly, it has been shown that the EMG signal also exhibits many complexity characteristics deserving to be evaluated, especially to understand whether these features have an onset time and a sensitivity to MMF development different from those of the classic linear descriptors.

Several papers focusing on the complex behavior of EMG demonstrated so far that the EMG signal is non-linear in nature and expresses the features of a low dimension chaotic system [72,100,104]. Many complexity indices have been therefore used in characterizing the changes occurring in EMG with muscle activation and with fatigue. Some of them seems to be more informative and shows early changes compared to traditional linear and spectral analysis. In addition, fatigue results in a significant loss in EMG complexity [124,127,143,148].

**Figure 7.** Scheme representing the considered linear and complexity-based indices for the sEMG analysis.

Among the complexity analyses applied so far to the EMG the fractal analysis had many applications. Though not universally accepted [81,82], FD typically reveals an increase during muscle activation at low intensity levels of force production [61,76,79,82] with a decrease in response to MMF [64–66,79,84,86]. The common finding of the reported studies suggests an inverse relationship

between FD and motor unit synchronization. By contrast, FD seems to be positively related to motor unit firing rate [78,86]. Finally, it showed to be suitable to estimate the exhausting time during an isometric contraction [80].

The RQA approach is another widely used index of complexity applied to EMG. A rise in %DET was attributed to an increase in motor unit synchrony and in a more similar bursts of motor unit potential action generation patterns. Local MMF is accompanied by an increase in recurrent statistics in sEMG signal therefore, %DET represents a promising tool in revealing early onset the MMF during a challenging motor task [45,78,111,121–124].

In addition, the entropy-based measurement has been widely used to evaluate how fatigue influences the determinism of EMG signal. During fatigue development entropy parameters show a clear decline, reflecting a shift of EMG towards more regular pattern. The decay observed in sEMG complexity by entropy has been ascribed to a decrease of both action potential amplitude and velocity probably due to alterations in metabolic and enzymatic events involved in muscle contractions [93,129,142,143].

A promising extension of MMF detection capabilities by complexity indices applied to EMG was introduced by the study of multifractality [62]. This method has shown a higher degree of correlation and accuracy with the progress of fatigue compared to the median spectral frequency, and presents possible applications, such as discrimination between normal and pathological sEMG (e.g., in those neuromuscular disease where a reduction of the number of motoneurons occurs and the action potential of the residual motor units changes in shape and duration) [100].

Finally, the determination of the largest Lyapunov exponent from sEMG demonstrated the chaotic properties of this nonlinear system but its potential in detecting MMF seems to be limited [72,106]. Therefore, despite some intriguing results [104,148,149], future standardized fatiguing protocols are needed to confirm whether λLLE of sEMG can be diagnostic tool to assess MMF and impairments, as well as the effectiveness of treatment in different settings, as clinic (rehabilitation) and sporting contexts.

All these findings, collectively, might make the use of complexity analysis tempting. However, readers have to consider the several pitfalls and tricks thronging the analysis process. Indeed, almost all the complexity procedures present some limitations in their use that should be considered. First, the quality of the estimates of complexity indices increases with the length of the dataset and for this reason complexity methods generally require long time series: this may be a critical point because EMG data during fatiguing muscle contractions are usually of reduced length. Therefore, there is a need to develop indices and estimation algorithms which can be meaningfully applied to short dataset. In this regard, recent lines of research in the complexity analysis of physiological signals are aimed at specifically designing algorithms for short time series, for instance by reducing the estimator bias and variance in multiscale entropy analysis [137,150,151] or by improving the consistency of multifractal DFA estimates [88]. It is, therefore, desirable that these algorithms are properly adapted to the analysis of sEMG and applied to detect the electromyographic manifestation of muscle fatigue.

Second, many of these analyses are based on highly recursive calculation procedures and therefore needs high computational times. Third, from a statistic viewpoint, there is a requirement for surrogate data analysis, in order to test the EMG signal for non-linearity in different conditions (e.g., fatigued vs. non-fatigued states). In the vast majority of the studies cited in this review no surrogate data analysis has been performed. Fourth, only in some cases an accurate parameterization of the variables used in the specific complexity analysis (in particular the parameters used to reconstruct the phase space, as the embedding dimension, the time delay, the critical scale and the threshold distance) has been performed. This latter point has been deeply stressed in those studies. Indeed, given that an inaccurate setting of the algorithms parameters severely impacts on final results, a meticulous detection of the most appropriate setting is absolutely required to achieve reliable results and avoid improper conclusions. We encourage the interested readers to undertake the endeavor of assessing the sensitivity of complexity descriptors with synthetic EMG signals, whereby the effect of different sources leading to MMF can be controlled for. It is our understanding that only then would it be possible to reveal the added

value of complexity analysis in screening the various physiologic phenomena that may manifest in experimental EMG signal during fatiguing contractions (synchronization of the motor units generating the action potentials, changes in the shape of action potentials, in the firing rate, in the biochemical conditions and metabolism of the muscle fiber, etc.). Indeed, in some analyses reported in this review, the authors attempted to correlate the behavior of the complexity indices of EMG to the changes in the physiological phenomena that underlie the MMF during a protracted muscle contraction. However, this should be possible only when working with synthetic signals, in which several phenomena, such as fiber recruitment and action potential synchronization, can be controlled. Differently, in an interference signal such as surface EMG, it is virtually impossible, even with sophisticated algorithms, to distinguish the peripheral components of fatigue from the central ones. The conclusions of many authors on this topic should, therefore, be evaluated with caution and considered to be eminently speculative.

In conclusion, although some complexity indices seem to detect MMF efficiently, more work remains to be done to compare these indices in terms of reliability and sensibility, to optimize the choice of the parameters used to reconstruct the phase space and to elucidate their relationship with the physiologic phenomena underlying the onset of fatigue in exercising muscles.

**Author Contributions:** Conceptualization, S.R., T.M.V., P.C. and G.M.; literature search: S.R.; writing—original draft preparation: S.R., T.M.V. and G.M.; writing—review and editing: G.M. and P.C.; supervision: G.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Relative Consistency of Sample Entropy Is Not Preserved in MIX Processes**

**Sebastian Zurek ˙ 1, Waldemar Grabowski 1, Klaudia Wojtiuk 1, Dorota Szewczak 1, Przemysław Guzik <sup>2</sup> and Jarosław Piskorski 1,***<sup>∗</sup>*


Received: 11 May 2020; Accepted: 19 June 2020; Published: 21 June 2020

**Abstract:** Relative consistency is a notion related to entropic parameters, most notably to Approximate Entropy and Sample Entropy. It is a central characteristic assumed for e.g., biomedical and economic time series, since it allows the comparison between different time series at a single value of the threshold parameter *r*. There is no formal proof for this property, yet it is generally accepted that it is true. Relative consistency in both Approximate Entropy and Sample entropy was first tested with the *MIX* process. In the seminal paper by Richman and Moorman, it was shown that Approximate Entropy lacked the property for cases in which Sample Entropy did not. In the present paper, we show that relative consistency is not preserved for *MIX* processes if enough noise is added, yet it is preserved for another process for which we define a sum of a sinusoidal and a stochastic element, no matter how much noise is present. The analysis presented in this paper is only possible because of the existence of the very fast NCM algorithm for calculating correlation sums and thus also Sample Entropy.

**Keywords:** time series analysis; sample entropy; relative consistency

#### **1. Introduction**

Relative consistency of Sample Entropy (*SampEn*) has been assumed in all clinical and economic applications [1–4]. Indeed, if we say that a one time series is more complex than another on the basis of their value at a certain threshold *r*, we assume this either explicitly or tacitly. It is quite surprising that there are no comprehensive studies on this property of Sample Entropy. The analytic proof of this property would be very hard to derive. In fact, very little theoretical work has been done on these parameters, most of which is limited to the Moorman and Richman paper [4]. We do not know the distribution of Sample Entropy (the *t* distribution assumed in [4] is based on data, not analytical properties), we do not know entropy profiles of most common processes or the influence of noise on these processes. The experimental verification of relative consistency requires calculation of *SampEn* across many different thresholds for many time series e.g., many *RR* (i.e., the distance between two consecutive R-vawes in the ECG) intervals time series, acquired with the same equipment (the same sampling rate as well as other external conditions). This is difficult because of the computational burden of this task. In this paper, we overcome this difficulty by using the NCM algorithm [5]. Unlike a formal proof, this procedure would not yield certainty about relative entropy, but it could either corroborate, or decisively refute it.

We believe that the methodology developed in this paper is systematic and applicable to all types of time series studied with the use of Sample Entropy. Furthermore, we provide the software necessary to perform such an analysis quickly and without large hardware investments.

This paper is not the first one to study relative consistency of Sample Entropy as a universal property. Indeed, even the creators of *SampEn* allow for the possibility of Sample Entropy not holding universally for all time series [4].

In [6], the authors perform an analysis of short time series acquired by recording gait data. The most significant result in the context of the present paper is the finding that *SampEn* is not relatively consistent at *r* = 0.2 for the time series of step time of length 200—the averaged entropy profiles for short data of young subject cross with the profiles of older subjects. The authors do not provide specific results, i.e., how many curves cross, but nonetheless this is a significant finding for this time series. The cross between entropy profiles in [6] was observed for very short recordings, but this finding was corroborated in longer recordings [7]. The authors find that, for one hour recordings of time series of step time, the relative consistency is lost between overground and treadmill walking recordings at *r* = 0.2 for *m* = 2 and *m* = 3. The authors attribute this result to *r* · *SD* being close to the precision of the data. Still, this study demonstrates that relative consistency is at least subject to some technical conditions.

In [8], it is found that relative consistency is not preserved in very short (*N* = 50) sinusoidal signals for sample entropy, while it is for the Fuzzy Entropy, a parameter that is introduced in that paper. While analyzing a similar set of measures, i.e., Approximate Entropy, Sample Entropy, and Fuzzy Measure Entropy, Zhao et al. [9] find that sample entropy does not behave consistently in distinguishing between normal sinus rhythm and congestive heart failure groups, which may be indicative of a lack of relative consistency. This paper is not entirely conclusive in this respect as it uses a segmented approach, and thus it is quite dissimilar to our study as well as the above-mentioned papers, but they do find that Fuzzy Measure Entropy is, as expected, totally consistent in this respect.

In this paper, we concentrate on the process which was used to demonstrate and study the properties of Approximate Entropy and *SampEn*, i.e., the *MIX* process. We contrast the results obtained for this process with a closely related process and show that their properties with respect to relative consistency are widely different. The considerations in this paper are limited to Sample Entropy because the fact that Approximate Entropy is not relatively consistent with respect to the *MIX* process has already been shown in [4].

#### *1.1. Sample Entropy*

Given a time series

$$\mathcal{U}\_{l} = \{ \mu(1), \mu(2), \dots, \mu(N) \}, \tag{1}$$

where *N* is the number of data points, let us build an auxiliary object

$$\mathcal{V}\_{i}^{m,\mathsf{r}} = \{\vec{v}(1), \vec{v}(2), \dots, \vec{v}(N - (m - 1)\mathsf{r})\},\tag{2}$$

which is a set of vectors in an *m*-dimensional *embedding space* [10,11], Vectors *vm*,*τ*(*i*)=[*u*(*i*), *u*(*i* + *τ*), *u*(*i* + 2*τ*), ... , *u*(*i* + (*m* − 1)*τ*)], i.e., the*vm*,*τ*(*i*) consist of *m* ordered points, beginning at position *i*. The parameter *<sup>τ</sup>* is known as time lag. Therefore, we have *Lm* <sup>=</sup> *<sup>N</sup>* <sup>−</sup> (*<sup>m</sup>* <sup>−</sup> <sup>1</sup>)*<sup>τ</sup>* vectors *<sup>V</sup>m*,*<sup>τ</sup> <sup>i</sup>* for a fixed *τ*—these are often called templates.

Let us define the so called correlation sum:

$$\mathbf{C}^{\mathbf{m}}(r) = L\_{\mathbf{m}}^{-1} \sum\_{l=1}^{L\_{\mathbf{m}}} \mathbf{C}\_{l}^{\mathbf{m}}(r). \tag{3}$$

*C<sup>m</sup> <sup>i</sup>* are defined as

$$\mathbb{C}\_{i}^{m}(r) = \begin{cases} \left(L\_{\mathfrak{m}} - 1\right)^{-1} \sum\_{j=1, j\neq i}^{L\_{\mathfrak{m}}} \Theta\left(r - \left|\vec{v}\_{\mathfrak{m}}(i) - \vec{v}\_{\mathfrak{m}}(j)\right|\right) & \text{if } i \le L\_{\mathfrak{m}}\\ 0 & \text{if } i > L\_{\mathfrak{m}} \end{cases} \tag{4}$$

Θ is called the *Heaviside* function

$$\Theta(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{x} \ge \mathbf{0} \\ 0 & \text{if } \mathbf{x} < \mathbf{0} \end{cases}. \tag{5}$$

where *r* is called the radius of comparison [12], which is used to check the similarity of two vectors by checking their distance with respect to a norm. The distance between two vectors in can be defined in many ways, but the following maximum coordinate distance definition seems to have the best mathematical properties [13]:

$$\left|\vec{\boldsymbol{\sigma}}\_{m}(\boldsymbol{i})-\vec{\boldsymbol{\sigma}}\_{m}(\boldsymbol{j})\right|=\max\_{k=1,2,\ldots,m}\left(\left|\boldsymbol{\mu}(\boldsymbol{i}+(k-1)\boldsymbol{\tau})-\boldsymbol{\mu}(\boldsymbol{j}+(k-1)\boldsymbol{\tau})\right|\right).\tag{6}$$

*SampEn* (just like Approximate Entropy) is an attempt to build an estimator the Eckmann and Ruelle [14] entropy:

$$ER = \lim\_{r \to 0} \lim\_{m \to \infty} \lim\_{N \to \infty} [\Phi^m(r) - \Phi^{m+1}(r)],\tag{7}$$

where *N*, *r*, and *m* have the same definition as before. This expression involves limits, so it cannot be directly applied to a realistic, measured time series. In order to make this possible, this definition was rewritten by Richman and Moorman [4] to the following form for a finite time series:

$$\Phi^{\mathfrak{m}}(r) = (N - m + 1)^{-1} \sum\_{i=1}^{N-m+1} \log B\_i^{\mathfrak{m}}(r),\tag{8}$$

*B<sup>m</sup> <sup>i</sup>* has the same definition as *<sup>C</sup><sup>m</sup> <sup>i</sup>* , with the only difference that self matches are included in *<sup>B</sup><sup>m</sup> <sup>i</sup>* and excluded from *C<sup>m</sup> <sup>i</sup>* . Richman and Moorman propose using the following, closely related quantity as complexity measure instead of the Eckmann–Ruelle entropy

$$Samp\mathbb{E}n(m,r) = \lim\_{N \to \infty} [-\ln \frac{\mathbb{C}^{m+1}(r)}{\mathbb{C}^m(r)}],\tag{9}$$

which, for a finite time series, can be estimated by

$$\operatorname{SampleEn}(m, r, N) = -\ln \frac{\mathbb{C}^{m+1}(r)}{\mathbb{C}^m(r)}.\tag{10}$$

A detailed look into the constitutive elements of these formulas lead to the conclusion that Sample Entropy is the negative logarithm of the conditional probability that two sequences, which are within the *r* radius of tolerance of one another for *m* points, remain at the same radius of tolerance for *m* + 1st point. A more detailed treatment of *SampEn* may be found in [4].

#### *1.2. Relative Consistency*

The notion of relative consistency was introduced by Pincus in [1,3], and this property follows from the properties of the Kolmogorov–Smirnov (KS) entropy. Rewritten in terms of *SampEn*, we have the following property: for deterministic dynamical processes *A, B*, we should have that, from KS entropy(A) < KS entropy(B), it follows that *SampEn*(*m*,*r*)(*A*) < *SampEn*(*m*,*r*)(*B*) and, conversely, for a wide range of *m* and *r*. This entails that, if *SampEn*(*m*1,*r*1)(*A*) < *SampEn*(*m*1,*r*1)(*B*), then *SampEn*(*m*2,*r*2)(*A*) < *SampEn*(*m*2,*r*2)(*B*) and vice versa. In other words, if *SampEn* for one process is lower than that for another

process for a set of parameters (*m*1,*r*1), then this holds true for any other set (*m*2,*r*2) [4]. It should be stated clearly that this is an expectation and a desirable property of *SampEn*, rather than a mathematically proven property. If this holds true, then we are able to compare two processes at a single point (*m*1,*r*1) and draw conclusions for all points. This is what is actually happens in applications.

#### *1.3. The MIX and MIXTURE Processes*

Let us now define the two processes which will be used to test the relative consistency of *SampEn* under different conditions.

#### 1.3.1. *MIX*(*p*) Process

Let 0 ≤ *p* ≤ 1 be discrete probability. Let us define three time series [4]:

$$X\_{\bar{j}} = \sqrt{2}\sin(\bar{j}),\tag{11}$$

In the above, we do not use the frequency modifying factor <sup>2</sup>*<sup>π</sup>* <sup>12</sup> since our sampling is quite dense, as will become apparent in the Data Analysis section.

$$\mathcal{Y}\_{\vec{\jmath}} = \mathcal{U}(-\sqrt{3}, \sqrt{3}), \tag{12}$$

i.e., uniform independent, identically distributed random variable, and

$$Z\_{\bar{\jmath}} = B(1, p), \tag{13}$$

i.e., a Bernoulli random variable with probability of success equal to *p*. We can now define the *MIX*(*p*) process as

$$MIX(p) = (1 - Z\_j)X\_j + Z\_j Y\_j. \tag{14}$$

#### 1.3.2. *MIXTURE*(*λ*) Process

This is a very simple process which is composed of a sum of two processes: a deterministic and stochastic process, the second of which is controlled by a tuning parameter *λ*. Let us define

$$X\_j = \sqrt{2}\sin(j),\tag{15}$$

$$Z\_j = \mathcal{U}(0, 1),\tag{16}$$

and the final process

$$MIXTLIRE(\lambda) = X\_{\bar{j}} + \lambda(Z\_{\bar{j}} - 0.5). \tag{17}$$

It can be seen that the *λ* parameter controls the amplitude of the added noise. Figure 1 presents a few examples of of the above processes.

#### *1.4. The NCM Algorithm*

The NCM algorithm is the fastest algorithm for calculating correlation sums and *SampEn* which at the same time uses the whole time series, without any sub-sampling or simplifications.

This algorithm is of the *look-up* table type and uses many tricks limiting the number of operations as compared to the brute-force approach. The central objects of NCM are triangular matrices *N*ˆ whose elements are defined in the following way:

$$m\_{i\bar{j}} = ||u\_i - u\_{i + (j+1) \cdot \mathbf{r}}||\_\prime \tag{18}$$

where *u* are the elements of the *U* time series and *τ* is the time lag. For a time series with *N* points, the dimensionality of this matrix is *N* − *τ* − 1 × (*N* − 1)/*τ* − 1. It is quite obvious that for any realistic time series this amounts to a very large matrix. The first operation-reducing technique is limiting calculation to sub-blocks. Using the symmetry of the matrix, elements with indices not meeting the condition *i* + (*j* + 1) · *τ* ≤ *N* − *τ* − 1 are set to zero, thus halving the number of summations in the correlation sum. Another result of this approach is removing redundant operations in calculating *C<sup>m</sup>* by reducing the number of operations for maximum norm from *m*<sup>2</sup> to *m*. The last operation is searching for the first occurrence of 0 from an arithmetic operation instead of a loop, with

$$(n\_{i,j}^m - b) / a, \quad a = -\frac{r\_{\max} - r\_{\min}}{n - 1}, \quad b = r\_{\max}.\tag{19}$$

Other, more standard optimization and algorithmic techniques as well as hardware scaling can be used to further improve the performance of the procedure. The full description of the algorithm may be found in [5] and Python code on the GPL license may be downloaded from https://github.com/sebzur/ NCM-algorithm.

#### **2. Materials and Methods**

All the signals were processed with the use of the Python programming language. The correlation sums were calculated with the use of the NCM algorithm using the software available at https: //github.com/sebzur/NCM-algorithm. The results were analyzed with the R programming language and statistical system.

#### *2.1. The MIX Process*

One hundred signals were generated with four thousand samples, and four periods were generated at the frequency of 5 *Hz*. The tuning parameter *p* for these signals changed from 0 to 1. For each of the signals, the correlation sums *Cm* and *Cm*+<sup>1</sup> were calculated for *m* = 2 (compare Equation (9)) and *SampEn* was calculated using these results. The *SampEn* profiles were drawn for values of threshold *r* spanning the segment (0 · *SD*, 4 · *SD*). As a result, we obtained 1100 plots for the different levels of the tuning parameter. We searched for crossing profiles within the plots by subtracting the profiles and counting how many times the difference changed sign.

#### *2.2. The MIXTURE Process*

One hundred signals were generated according to the definition from formula (17). The parameters assumed were the same as for the *MIX* process, and the *λ* tuning parameter also changed in the range *λ* ∈ (0, 1). We calculated correlation sums and *SampEn* profiles for the same sets of parameters as in the *MIX* process case. As before, we got 1100 plots and we checked for crossing profiles.

#### **3. Results**

#### *3.1. The MIX Process*

Figure 2 presents the entropy profiles for *m* = 2.

**Figure 2.** In this figure, entropy profiles of the *MIX* process calculated for *m* = 2 embedding are presented. Each line corresponds to different *p*—starting from 0.1 with step 0.1. The inset presents the close up of crosses between entropy profiles for *p* = 0.8 and *p* = 0.9 in linear scale.

We have observed that there are crossings in the entropy profiles. All the crosses in the present study behave in the same way: the lines always cross at two points, which means that there is a finite region for which the order of the lines reverses. The crosses have been summarized in Table 1 below. For each line, it contains two values of *r* at which the lines cross, and the *SampEn* value at the crossing point. It is interesting to notice that many entropy profiles cross with the maximum randomness *MIX*(1) process, and there is also one more case which does not involve this process, namely *MIX*(0.8) and *MIX*(0.9).


**Table 1.** Crosses between entropy profiles for the *MIX*(*p*) process.

#### *3.2. The MIXTURE Process*

Figure 3 presents the entropy profiles for *m* = 2. There are no crossings in this type of process, irrespective of the value of *λ*.

**Figure 3.** In this figure entropy profiles of *MIXTURE* process calculated for *m* = 2 embedding are presented. Each line corresponds to different *λ*—starting from 0.1 with step 0.1. The lowest *SampEn* value corresponds to *λ* = 0.1.

#### **4. Discussion**

In the present paper, we have studied the relative consistency property of the Sample Entropy parameter. We have experimentally studied two synthetic time series—the *MIX*(*p*) and *MIXTURE*(*λ*) processes. Both of these processes have one tuning parameter; however, in the *MIX*(*p*) process, the amount of randomness is controlled, and, in the *MIXTURE*(*λ*) process, the size of the random effect is controlled. It turns out that relative consistency is not preserved for the former, while it is preserved in the latter.

The nature of Sample Entropy profiles crossing is different from that found in [6], or the behavior observed in Approximate Entropy [2], which is a flip behavior. In our analysis, the entropy profiles for *MIX*(*p*) always cross twice. Of course, the reason could be the lack of possibility to observe the other cross in the above cited studies.

It is also interesting to notice that, though the amount of variance in a time series has been reported to influence the results of entropy calculations [15,16], this does not seem to be the case for the relative consistency of *MIXTURE*(*λ*), which is preserved for all values or *λ*.

A question may arise whether the values of *r* at which the crosses were found are important for practical applications. In our opinion, it is impossible to relate the value of *r* for the *MIX* process with a value of *r* from, say, an *RR* intervals time series. These are two completely different processes and the values cannot be compared. The *MIXTURE* process is in fact a good example here—there are no crosses for this process, so no crossing value of *r* in *MIX* corresponds to any *r* value in the *MIXTURE* process.

We believe that finding a process which is clearly relatively consistent for all values of the parameters, as opposed to a process which is mostly relatively consistent, is one of the most important results of this paper. This finding may have deep consequences.

In applications to medicine and economy relative consistency is assumed anywhere where two signals are compared and conclusions are drawn on the basis of a single set of parameters (*m*,*r*). It is absolutely necessary to study real signals, e.g., time series of *RR* intervals, using the methodology we have presented in this paper or a similar one. If the studied real signals behave more like the *MIXTURE*(*λ*) process, then we can continue assuming relative consistency, if they behave more like the *MIX*(*p*) processes, the approach to comparing signals may need to be modified.

Looking at the results obtained in this manuscript as well as the cited papers, we can notice that various processes have various *regions of relative consistency*, i.e., a region in a multidimensional space within which the process is relatively consistent. For the *MIX*(*p*) region, we can see that process is relatively consistent for a certain region in the (*r*, *p*) space, and, for the *MIXTURE*(*λ*) relative consistency, holds for the entire studied region in the same space. These regions have been demonstrated in Figure 4.

**Figure 4.** Relative consistency region for the *MIX*(*p*) and *MIXTURE*(*λ*) processes for *p* ∈ (0, 1) and *r* ∈ (0, 1.82). The region for *MIX*(*p*) has been smoothed by natural splines to interpolate the values between measured points.

We suggest that such regions be found for real-life processes, and, from the data gleaned from other studies, we can suspect that they will be different for different processes and for different spaces. For example, the results presented in [6,7] suggest that for the gait signal there is at least one region where the signal is not relatively consistent in the (*r*, *f*) space, where *f* is sampling frequency, and the point (0.2, 148*Hz*) belongs to this non-consistency region (for a study on the influence of sampling frequency on *SampEn* see also [17]). The same two studies by Yentes et.al. suggest that other spaces of interest could be (*r*, *N*), where *N* is the length of the time series, and (*N*, *f*) or even some multidimensional spaces involving these variables can be studied in this way.

The *RR* intervals time series seems to be a good candidate for such a study because of its widespread use, equipment availability, and the fact that many researchers are already very familiar with this time series. Such a study would require a group of recordings from similar subjects, taken with the same equipment, in stationary conditions. They would also need to be long enough to let the underlying process assume as many intermediate stages as possible. In our opinion, the main obstacle in studying longer time series is the computational burden of calculating entropy profiles. The method described in this paper as well as the open source software we make available makes this possible.

We believe that the answer to the question of the regions in which relative consistency holds in various types of signals is one of the most important problems for the applicability of *SampEn* because, in order to be able to safely compare the values of *SampEn* at selected points, we need to make sure that the comparisons take place within a region of relative consistency.

**Author Contributions:** Conceptualization, S.Z.; Formal analysis, J.P.; Funding acquisition, J.P.; Investigation, K.W.; Resources, S.Z.; Software, K.W. and D.S.; Supervision, J.P.; Visualization, K.W.; Writing—original draft, S.Z. and W.G.; ˙ Writing—review and editing, P.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


c 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## *Article* **Suppressing the Influence of Ectopic Beats by Applying a Physical Threshold-Based Sample Entropy**

### **Lina Zhao 1, Jianqing Li 1,2,\*, Jinle Xiong 1, Xueyu Liang <sup>1</sup> and Chengyu Liu 1,\***


Received: 17 February 2020; Accepted: 1 April 2020; Published: 4 April 2020

**Abstract:** Sample entropy (SampEn) is widely used for electrocardiogram (ECG) signal analysis to quantify the inherent complexity or regularity of RR interval time series (i.e., heart rate variability (HRV)), with the hypothesis that RR interval time series in pathological conditions output lower SampEn values. However, ectopic beats can significantly influence the entropy values, resulting in difficulty in distinguishing the pathological situation from normal situations. Although a theoretical operation is to exclude the ectopic intervals during HRV analysis, it is not easy to identify all of them in practice, especially for the dynamic ECG signal. Thus, it is important to suppress the influence of ectopic beats on entropy results, i.e., to improve the robustness and stability of entropy measurement for ectopic beats-inserted RR interval time series. In this study, we introduced a physical threshold-based SampEn method, and tested its ability to suppress the influence of ectopic beats for HRV analysis. An experiment on the PhysioNet/MIT RR Interval Databases showed that the SampEn use physical meaning threshold has better performance not only for different data types (normal sinus rhythm (NSR) or congestive heart failure (CHF) recordings), but also for different types of ectopic beat (atrial beats, ventricular beats or both), indicating that using a physical meaning threshold makes SampEn become more consistent and stable.

**Keywords:** sample entropy; heart rate variability; ECG; ectopic beat

#### **1. Introduction**

Entropy is a valuable tool for quantifying the complexity or regularity of cardiovascular time series and provides important insights for understanding the underlying mechanisms of the cardiovascular system. Since the concept of 'information entropy' was first proposed by Shannon in 1948 [1], entropy was used as a tool to quantify the quantity of information. Approximate entropy (ApEn) [2], proposed by Pincus et al., is an entropy algorithm initially used in physiological signal analysis as it is adaptive in short-term time series processing. However, ApEn introduces self-matching in calculations, resulting in estimation bias and poor relative consistency [3]. To solve this problem, Richman and Moorman developed an improved version of sample entropy (SampEn) [3], which is based on the calculation of the conditional probability that any two segments of *m* beats that are similar remain similar when their length increases by one beat. Compared with ApEn, SampEn has a lower estimate bias, better relative consistency and less dependence on data length, which makes it more appropriate in physiological signal processing. SampEn is now the most widely used entropy algorithm in physiological signal analysis.

For entropy calculation, three intrinsic parameters, i.e., the embedding dimension *m*, the tolerance threshold *r* and the time series length *N* need to be initialized. SampEn was reported to not be sensitive to the time series length *N* if *N* ≥ 200 ∼ 300 [4,5]. Parameter *m* is based on the length *N* under the suggested relationship of *N* ≈ 10*<sup>m</sup>* ∼ 20*<sup>m</sup>* [6]. Among all three parameters, the tolerance threshold *r* is the most difficult to be determined. Usually, the recommended *r* is between 0.10 and 0.25 times the standard deviation (SD) of the physiological data [3,7]. If the *r* value is too small, the number of matched vectors will be small, and by contrast, if the *r* value is too big, detailed information within time series will be ignored [8,9]. Moreover, in practice, RR interval time series in different physiological/pathological groups usually have variable SD values, inducing that the comparison between different groups uses different threshold criteria, and it is not easy to find an appropriate *r* value to achieve an optimal result if simply using the suggested range of 0.10 to 0.25 times the SD.

Researchers have made several useful attempts to improve the performances of entropy measures. One is multiscale analysis. Costa et al. developed a multiscale entropy (MSE) method [10,11], with the hypothesis that MSE can better describe cardiovascular complexity. MSE is based on the evaluation of SampEn in coarse-grained RR interval time series with a coarse-graining order from 1 to a preset scale (such as 10) [12,13]. However, coarse-graining changes the SD of time series and thus changes the corresponding *r* value [14], resulting in different opinions on the selection of *r* values, i.e., whether using a fixed tolerance *r* or using a varying tolerance *r* adjusted at each scale as a fraction of the SD of the coarse-grained time series is better [10]. Another attempt is the use of fuzzy theory-assisted entropy methods, such as fuzzy entropy developed by Chen et al. [15] and fuzzy measure entropy developed by Liu et al. [16,17], where fuzzy functions are employed to replace the traditional Heaviside function used in SampEn, to improve the statistical stability of SampEn outputs. Herein, although the determination rule for vector similarity is changed, the tolerance *r* still uses the fixed range of 0.10 to 0.25 times the SD. There are also entropy developments focusing on specific disease detection, such as for the detection of atrial fibrillation (AF) [18–21], heart failure [22,23], diabetes [24], etc. Specially, Lake and colleagues developed a new AF entropy detector, named the coefficient of sample entropy (COSEn), for AF determination within an extremely short RR interval time series (only 12 RR intervals). COSEn allowed flexibility in choosing the tolerance *r* and suggested an appropriate choice of a fixed *r* value of 30 ms [25].

In a previous study, we found that SampEn reported higher values in the normal sinus rhythm (NSR) group than the congestive heart failure (CHF) group when selecting a small threshold *r* value (*r* = 0.10), but reported lower values when using large threshold *r* values (*r* = 0.20 or 0.25) [4]. The opposite entropy change trend brings difficulty to defining a unified threshold *r* to distinguish CHF patients from NSR subjects in heart rate variability (HRV) analysis. To solve this problem, we proposed a physical threshold-based SampEn method to discriminate the opposite entropy change trend in the task of classifying CHF and NSR subjects [26], where the physical threshold-based SampEn was demonstrated to have a better stability than the traditional SampEn.

HRV analysis is based on the analysis of normal RR intervals from the beats generated by the sinoatrial node. Unlike the normal beats generated by the sinoatrial node, ectopic beats are generated by additional electrical impulses imposed by other latent pacemakers [27]. Ectopic beats may cause bias in the reliable measurement of HRV in both the time and frequency domains [28,29], as well as in entropy measurement [30]. Even the presence of only one ectopic beat can introduce an increase in the high frequency power in HRV of around 10% [31]. Although many detection and editing methods for ectopic beats have been proposed [32–34], there is no agreed conclusion on how to efficiently remove them. More importantly, the efficiency of editing ectopic beats dramatically decreases when dealing with the dynamic ECG signals due to signal noise. In dynamic ECGs, noises caused by the body's activities, motion artifacts, electrode interferences etc., are inevitable [35,36]. A recent study demonstrated that even when using state-of-the-art QRS detectors, an 80% or higher accuracy of QRS detection is not achieved. By contrast, these methods can easily obtain a 99% accuracy using conventional ECG databases such as the PhysioNet/MIT Arrythmias database [37]. Potential detection errors from the automatic analysis of dynamic ECGs also bring abnormal RR intervals, i.e., RR intervals

lasting for too much or too little time. The existence of either the ectopic beats or the falsely detected QRS locations can significantly contaminate the entropy outputs.

Thus, the effectiveness of entropy measures, typically SampEn, should be re-checked for analyzing the dynamic ECG signals. A predictable situation is that SampEn may change a lot if moving the analysis window from an ectopic-free RR interval time series to an entopic one. Thus, it is necessary to further develop an entropy method, which can keep relatively stable when randomly dealing with the ectopic or ectopic-free RR interval time series for a specific subject/patient. Due to the fact that it is difficult to identify the abnormal RR intervals caused by noises or true ectopic beats in the automatic analysis for dynamic ECGs, this necessity becomes urgent and practical for real signal processing. In this study, we aimed to test the performance of a new physical threshold-based SampEn when applied to RR interval time series with ectopic beats, to explore if it can efficiently suppress the sudden change in entropy results due to the appearance of ectopic beats, i.e., to verify its ability to suppress the influence of ectopic beats for HRV analysis.

#### **2. Methods**

#### *2.1. Data*

All data used were from the PhysioNet/MIT RR Interval Databases from http://www.physionet. org [38], a free-access, online archive of physiological signals. The NSR RR Interval database includes 54 long-term RR interval recordings of subjects with normal sinus rhythms aged from 29 to 76. The CHF RR Interval database includes 29 long-term RR interval recordings of subjects aged from 34 to 79, with CHF diagnoses (NYHA classes I, II and III). Each of the long-term RR interval recordings is a 24-h recording, including both day-time and night-time. Both the NSR and CHF subjects took the Holter ECG measurement under a similar level of physical activity. The original ECG signals were digitized at 128 Hz, and the beat annotations were obtained by automated analysis with manual review and correction.

A 5-min time window was used to segment the long-term RR interval records. The 5-min RR segments with at least one ectopic beat were extracted as ectopic segments used in this study. Information regarding ectopic beats was manually annotated by experts and was given in the database, classifying them into two types: atrial (A) or ventricular (V) beats, depending on the localization of the ectopic focus. In each 5-min RR segment, RR intervals greater than 2 s, but not ectopic intervals, were removed, since they are all noisy intervals arising from artificial influences [4]. Figure 1 shows examples of ectopic RR segments from an NSR subject and a CHF patient. Tables 1 and 2 summarize the numbers of ectopic beats and ectopic 5-min segments in each of the 54 NSR and 29 CHF records. For each recording (subject), we only chose the recordings with more than 10 ectopic segments, while excluding the ectopic segments with more than 6 ectopic beats, since the majority of ectopic segments have 1–5 ectopic beats.

**Figure 1.** Examples of 5-min ectopic RR segments. (**A**) An ectopic segment with ventricular (V) ectopic beats from an normal sinus rhythm (NSR) subject. (**B**) An ectopic segment with atrial (A) ectopic beats from a congestive heart failure (CHF) patient. Please note there are other atrial ectopic beats in this 5-min RR segment, where the RR interval values have sudden changes.

**Table 1.** A summary of the ectopic beats and segments in the PhysioNet/MIT RR Interval Databases for the NSR group.


\* indicates the recordings excluded for the analysis since there are no 10 or more ectopic 5-min RR segments including 5 or fewer ectopic beats.


**Table 2.** A summary of the ectopic beats and segments in the PhysioNet/MIT RR Interval Databases for the CHF group.

\* indicates the recordings excluded for the analysis since there are no 10 or more ectopic 5-min RR segments including 5 or fewer ectopic beats.

#### *2.2. Physical Threshold-Based SampEn*

The calculation process for the physical threshold-based SampEn is summarized as follows [26]: For the RR segment *x*(*i*) (1 ≤ *i* ≤ *N*), given the parameters *m* and *r*, first formed is the vector sequence *X<sup>m</sup> i* :

$$X\_i^m = \{ \mathbf{x}(i), \mathbf{x}(i+1), \dots, \mathbf{x}(i+m-1) \} \text{ 1} \le i \le N-m \tag{1}$$

The vector *X<sup>m</sup> <sup>i</sup>* represents *<sup>m</sup>* consecutive *<sup>x</sup>*(*i*) values. Then, the distance between *<sup>X</sup><sup>m</sup> <sup>i</sup>* and *<sup>X</sup><sup>m</sup> <sup>j</sup>* based on the maximum absolute difference is defined as:

$$d\_{i,j}^m = d\left[X\_i^m, X\_j^m\right] = \max\_{0 \le k \le m-1} \left| x(i+k) - x(j+k) \right| \tag{2}$$

For each *X<sup>m</sup> <sup>i</sup>* , denote *Bm <sup>i</sup>* (*r*) as (*N* − *m*) <sup>−</sup><sup>1</sup> times the number of *X<sup>m</sup> <sup>j</sup>* (<sup>1</sup> <sup>≤</sup> *<sup>j</sup>* <sup>≤</sup> *<sup>N</sup>* <sup>−</sup> *<sup>m</sup>*) that meets *<sup>d</sup><sup>m</sup> <sup>i</sup>*,*<sup>j</sup>* ≤ *r*. Similarly, set *A<sup>m</sup> <sup>i</sup>* (*r*) is (*N* − *m*) <sup>−</sup><sup>1</sup> times the number of *Xm*+<sup>1</sup> *<sup>j</sup>* that meets *dm*<sup>+</sup><sup>1</sup> *<sup>i</sup>*,*<sup>j</sup>* ≤ *r* for all 1 ≤ *j* ≤ *N* − *m*. Instead of using the traditional threshold, which is between 0.10 and 0.25 times the SD of the data, herein, a physical threshold *r* is used to form a unified comparison baseline for determining the vector similarity. As the raw ECG signals were digitized at 128 Hz, which means that the difference between any two vectors is approximately an integer multiple of 8 ms, here we used *r* = 12 ms as the physical threshold according to the previous suggestion [10].

Then, SampEn is defined by:

$$\text{SampEn}(m, r, N) = -\ln\left(\sum\_{i=1}^{N-m} A\_i^m(r) / \sum\_{i=1}^{N-m} B\_i^m(r)\right) \tag{3}$$

In addition, previous studies suggested that using an embedding dimension of *m* = 1 or 2 can obtain better results for classifying NSR and CHF groups when setting the RR time series length as *N* = 300 [4]. In this study, we kept this suggestion of *m* = 1 and 2.

To test the performance of physical threshold-based SampEn, traditional SampEn was used as the comparative method. Entropy values were first calculated from the raw ectopic 5-min RR segments. Then, the ectopic RR intervals in these ectopic RR segments were removed to form the ectopic-free RR segments. Finally, entropy values were re-calculated from these constructed ectopic-free RR segments. Entropy variances before and after ectopic beat removal were calculated, and the variation could be regarded as an index for evaluating the performance of entropy measures' abilities to suppress the influence of ectopic beats.

#### **3. Results**

#### *3.1. Demonstration of the Influence of Ectopic Beats on Entropy Values*

Figure 2 shows the entropy results from an NSR subject (NSR002). As shown in Table 1, NSR002 has a total of 146 5-min ectopic RR segments. The left panels in Figure 2 show the entropy values for these 146 ectopic RR segments before ectopic RR interval removal (red dotted line) and after ectopic RR interval removal (blue line). The traditional SampEn has a large variation before and after ectopic RR interval removal, while the new physical threshold-based SampEn has very small changes when analyzing ectopic free segments. The right panels show the corresponding variance ratios, i.e., the entropy value of the ectopic free segment minus the entropy value of ectopic segment, divided by the entropy value of the ectopic segment. The entropy variance ratios in SampEn varied from −65.24% to 2.25%, with an average of −16.32% and an SD of 21.93%. The corresponding variance ratios for the physical threshold-based SampEn varied from 0% to 3.34% (*m* = 1, *r* = 12 ms), with an average of 0.81% and an SD of 0.66%; and from −0.51% to 3.21% (*m* = 2, *r* = 12 ms), with an average of 0.57% and an SD of 0.72%. Compared with the traditional SampEn, the physical threshold-based SampEn showed significantly lower variance ratios, demonstrating the better robustness of the new SampEn method.

**Figure 2.** An example of the influence of ectopic beats. Entropy values and their variance ratios for subject NSR002 before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

By contrast, Figure 3 shows similar results from a CHF patient (CHF202), which has a total of 150 ectopic RR segments, as shown in Table 2. The entropy variance ratios in SampEn varied from −62.50% to 3.53%, with an average of −3.18% and an SD of 11.36%. The corresponding variance ratios for physical threshold-based SampEn varied from −0.35% to 2.01% (*m* = 1, *r* = 12 ms), with an average of 0.55% and an SD of 0.49%; and from −0.98% to 1.39% (*m* = 2, *r* = 12 ms), with an average of 0.20% and

an SD of 0.42%. Compared with the traditional SampEn, the physical threshold-based SampEn also showed significantly lower variance ratios in the demonstrated CHF patient.

**Figure 3.** An example of the influence of ectopic beats. Entropy values and their variance ratios for subject CHF202 before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

#### *3.2. Demonstration of the Influence of Atrial Beats on Entropy Values*

There are two types of ectopic beat in the used PhysioNet/MIT RR Interval Databases, atrial and ventricular beats (shown in Figure 1). To further test the robustness of physical threshold-based SampEn method, we analyzed the ectopic segments only containing atrial or ventricular beats. For NSR002, there are 17 segments containing atrial beats and 137 segments containing ventricular beats among all 146 ectopic RR segments. For CHF202, there are 41 segments containing atrial beats and 123 segments containing ventricular beats among all 150 ectopic RR segments.

Figure 4 shows the results of 17 atrial ectopic RR segments from NSR002. Entropy variance ratios in SampEn varied from −53.40% to 1.77%, with an average of −8.48% and an SD of 19.54%. The corresponding variance ratios for physical threshold-based SampEn varied from 0% to 1.38% (*m* = 1, *r* = 12 ms), with an average of 0.42% and an SD of 0.45%; and from −0.51% to 1.77% (*m* = 2, *r* = 12 ms), with an average of 0.32% and an SD of 0.56%. Compared with the traditional SampEn, the physical threshold-based SampEn showed significantly lower variance ratios for the analysis of atrial ectopic RR segments. Figure 5 shows the similar results from CHF202, which includes 41 atrial ectopic RR segments. The entropy variance ratios in the SampEn varied from −43.10% to 3.53%, with an average of −2.34% and an SD of 8.51%. The corresponding variance ratios for physical threshold-based SampEn varied from −0.19% to 0.97% (*m* = 1, *r* = 12 ms), with an average of 0.24% and an SD of 0.33%; and from −0.39% to 1.09% (*m* = 2, *r* = 12 ms), with an average of 0.10% and an SD of 0.30%. The results for CHF also support that the physical threshold-based SampEn had significantly lower variance ratios in the analysis of atrial ectopic RR segments.

**Figure 4.** An example of the influence of atrial ectopic beats. Entropy values and their variance ratios for subject NSR002 (only 17 atrial ectopic RR segments) before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

**Figure 5.** An example of the influence of atrial ectopic beats. Entropy values and their variance ratios on subject CHF202 (only 41 atrial ectopic RR segments) before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

#### *3.3. Demonstration of the Influence of Ventricular Beats on Entropy Values*

Figure 6 shows the results of 137 ventricular ectopic RR segments from NSR002. Entropy variance ratios in SampEn varied from −65.24% to 2.46%, with an average of −16.15% and an SD of 21.57%. The corresponding variance ratios for physical threshold-based SampEn varied from 0% to 3.34% (*m* = 1, *r* = 12 ms), with an average of 0.82% and an SD of 0.66%; and from −0.89% to 3.22% (*m* = 2, *r* = 12 ms), with an average of 0.57% and an SD of 0.73%. Compared with the traditional SampEn, the physical threshold-based SampEn also showed significantly lower variance ratios in the analysis of ventricular ectopic RR segments. Figure 7 shows the similar results from CHF202, which includes 123 ventricular ectopic RR segments. The entropy variance ratios in SampEn varied from −48.55% to 1.56%, with an average of −2.97% and an SD of 10.89%. The corresponding variance ratios for the physical threshold-based SampEn varied from −0.35% to 2.01% (*m* = 1, *r* = 12 ms), with an average of 0.59% and an SD of 0.49%; and varied from −0.98% to 1.63% (*m* = 2, *r* = 12 ms), with an average of 0.22% and an SD of 0.43%. The results for CHF also support the idea that the physical threshold-based SampEn had lower variance ratios in the analysis of ventricular ectopic RR segments.

**Figure 6.** An example of the influence of ventricular ectopic beats. Entropy values and their variance ratios for subject NSR002 (only 137 ventricular ectopic RR segments) before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

**Figure 7.** An example of the influence of ventricular ectopic beats. Entropy values and their variance ratios for subject CHF202 (only 123 ventricular ectopic RR segments) before and after the ectopic beat removal: (**A1**) entropy results and (**A2**) their variance ratios for the traditional SampEn (*m* = 2, *r* = 0.2), (**B1**) entropy results and (**B2**) their variance ratios for the physical threshold-based SampEn (*m* = 1, *r* = 12 ms), and (**C1**) entropy results and (**C2**) their variance ratios for the physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

#### *3.4. Total Results*

Table 3 and Figure 8 show the entropy variance ratios and standard deviations for each subject in the NSR group (in total, 45 recordings with the required numbers of ectopic segments, as indicated in Table 1) when comparing the entropy values from both before and after ectopic beat removal. The absolute variance ratio and standard deviation of SampEn for each subject were obviously larger than those from the two physical threshold-based SampEn methods, and the mean variance ratios were −6.91%, 0.63% and 0.43% for SampEn and the two physical threshold-based SampEn methods (*m* = 1 and *m* = 2 respectively, and, for both, *r* = 12 ms). In addition, SampEn showed significantly larger standard deviations of entropy variance ratios within subjects than the two physical threshold-based SampEn methods. The average standard deviations were 13.93%, 0.62% and 0.68% for SampEn and the two physical threshold-based SampEn methods (*m* = 1 and *m* = 2 respectively, and, for both, *r* = 12 ms).


**Table 3.** Entropy variance ratios and standard deviations for each subject in the NSR group.

**Figure 8.** Box plots of the entropy variance ratios and standard deviations for each subject in the NSR group. (**A**) Traditional SampEn (*m* = 2, *r* = 0.2), (**B**) physical threshold-based SampEn (*m* = 1, *r* = 12 ms) and (**C**) physical threshold-based SampEn (*m* = 2, *r* = 12 ms).

Similarly, Table 4 and Figure 9 show the entropy variance ratios and standard deviations for each patient in the CHF group (24 recordings). The absolute variance ratio and standard deviation for each subject of SampEn were obviously larger than those from the two physical threshold-based SampEn methods, and the mean variance ratios were −5.01%, 1.54% and 1.41% for SampEn and the two physical threshold-based SampEn methods (*m* = 1 and *m* = 2 respectively, and, for both, *r* = 12 ms). Meanwhile, SampEn showed significantly larger standard deviations of entropy variance ratios within patients than the two physical threshold-based SampEn methods. The average standard deviations were 11.69%, 1.28% and 1.46% for SampEn and the two physical threshold-based SampEn methods (*m* = 1 and *m* = 2 respectively, and, for both, *r* = 12 ms). These results further confirmed the better stability of SampEn using the physical threshold.

**Figure 9.** Box plots of the entropy variance ratios and standard deviations for each subject in the CHF group. (**A**) Traditional SampEn (*m* = 2, *r* = 0.2), (**B**) physical threshold-based SampEn (*m* = 1, *r* = 12 ms) and (**C**) physical threshold-based SampEn (*m* = 2, *r* = 12 ms).


**Table 4.** Entropy variance ratios and standard deviations for each subject in the CHF group

When comparing the group differences of variance ratios between the NSR and CHF groups, the traditional SampEn showed no significant difference (*P* = 0.3) while the physical threshold-based SampEn showed significant differences (both *<sup>P</sup>* <sup>&</sup>lt; 0.01 for two parameter *<sup>m</sup>* settings), with *<sup>P</sup>* <sup>=</sup> <sup>4</sup> <sup>×</sup> <sup>10</sup>−<sup>7</sup> for *<sup>m</sup>* = 1 and *<sup>P</sup>* <sup>=</sup> <sup>2</sup> <sup>×</sup> <sup>10</sup>−<sup>6</sup> for *<sup>m</sup>* = 2 respectively.

#### **4. Discussion and Conclusions**

In all of the three intrinsic parameters of SampEn, the parameter *r* is the most difficult to be determined. Different opinions regarding the selection of threshold *r* would lead to different entropy outputs. In a previous study, researchers developed different methods for the selection of the threshold *r* [8,39], and tried to make the selection method more rigorous and standardized [4,40]. However, there is no unified standard for *r* value selection now. Special selection methods only perform well under specific circumstances, and the influencial factors may include data type, data length, disease type, etc. Therefore, the argument has always been whether to use a fixed tolerance *r* or a varying tolerance *r*. Researchers first explored this issue in the MSE method, which performed SampEn analysis on several different scales and thus induced the question of whether using a fixed or a varying tolerance *r* at different scales was better. Angelini et al. reported that using a fixed and a varying tolerance *r* in MSE generated similar changes in CHF analysis [41]. Silva et al. also confirmed this finding in a rat model of hypertension and CHF [42], suggesting that the selection of the tolerance *r* in the MSE method is not relevant. However, the fixed tolerance *r* at different scales only stays the same for special subjects. For different subjects, there is also an inter-variability of the tolerance *r,* since different subjects have different signal variabilities of time series.

In a previous study, we found that SampEn reported lower values in CHF patients when using a small threshold *r* value (*r* = 0.10), but higher values when using large threshold *r* values (*r* = 0.20 or 0.25). The opposite entropy change trend brings difficulty to the clinical explanation. To solve this problem, we proposed a physical threshold-based SampEn method to discriminate the opposite entropy change trend in classifying CHF and NSR subjects. This previous study was performed only on RR segments without any ectopic beats. The raw ECG signal had a sample rate of 128 Hz, generating differences of roughly 8 ms and its multiples for RR intervals. Thus, we tested the effects of different *r* values of *r* = 12 ms, *r* = 20 ms, *r* = 28 ms etc., and found that *r* = 12 ms provided the best discrimination between the CHF and NSR groups. In this study, we used the previously proposed fixed tolerance *r* method with *r* = 12 ms [26] with physical meaning to analyze the RR interval time series with ectopic beats, to explore if the new *r* method has better performance for ectopic time series. Forty-five NSR and 24 CHF recordings were enrolled in this study, all of which had an appreciable number of ectopic beats, including atrial and ventricular beats. SampEn entropy results from both the traditional varying threshold (a fraction of the SD of time series) and the new fixed physical meaning threshold were compared before and after ectopic beat removal. For both the NSR and CHF groups, the entropy variance of SampEn with the traditional threshold is obviously larger than that when using the physical meaning threshold, which verifies the better consistency of the new physical meaning threshold method.

Ectopic beats are routinely removed or edited from the RR interval time series prior to HRV analysis. Salo et al. found that both time- and frequency-domain indices were sensitive to the editing of RR intervals [28]. This finding was consistent with our current study, where we showed that the SampEn calculated by the traditional method was sensitive to the removal of ectopic beats (one to five beats). The reason is that the ectopic beats usually result in sudden changes in the RR interval time series. This effect is significant on the transient change of HRV reflected by both the time- and frequency-domain indices, as well as nonlinear indices like SampEn [29,43]. However, for each subject, after ectopic beats were removed, the entropy value only changed significantly in specific segments. The entropy value variance for all segments in subject NSR002 was between −65.24% and 2.25% for the traditional threshold; and between 0% and 3.34% (*m* = 1), and −0.51% and 3.21% (*m* = 2) for the two physical meaning thresholds. The results in subject CHF202 were similar, i.e., between −62.50% and 3.53% for the traditional threshold; and −0.35% and 2.01% (*m* = 1), and −0.98% and 1.39% (*m* = 2) for the two physical meaning thresholds. The absolute change in SampEn with the traditional threshold was much more significant than that in SampEn with the physical meaning threshold.

In addition, we also analyzed the effect of different ectopic beats (atrial or ventricular) on the tested SampEn output. Results from the segments only containing atrial or ventricular beats showed that SampEn using the physical meaning threshold still performed better than SampEn using the traditional threshold. When atrial beats or ventricular beats were removed, the absolute entropy value variation in the former SampEn was significantly smaller than that in the latter.

In conclusion, SampEn using the physical meaning threshold has better performance, not only for different data types (NSR or CHF recordings), but also for different types of ectopic beat (atrial beats, ventricular beats, or both), and using the physical meaning threshold makes SampEn become more consistent and stable.

**Author Contributions:** Conceptualization, L.Z. and C.L.; Data curation, L.Z.; Formal analysis, L.Z.; Funding acquisition, J.L. and C.L.; Investigation, L.Z. and C.L.; Methodology, L.Z., J.X., X.L.; Project administration, C.L.; Resources, C.L.; Software, L.Z.; Supervision, J.L. and C.L.; Validation, L.Z., J.X. and X.L.; Writing—original draft, L.Z.; Writing—review & editing, J.X. and C.L. All authors have read and agreed the publication of the final version of the manuscript.

**Funding:** This research was funded by the Distinguished Young Scholars of Jiangsu Province (BK20190014), the National Natural Science Foundation of China (81871444), the Primary Research & Development Plan of Jiangsu Province (BE2017735) and the China Postdoctoral Science Foundation funded project (2019M661696).

**Acknowledgments:** The authors thank the support from the Southeast–Lenovo Wearable Heart-Sleep-Emotion Intelligent Monitoring Lab.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
