1. Introduction
Entropy-related features have been used extensively for time series classification purposes, with great results. They have been applied in many scientific and technical domains, although medicine is probably the most exploited field, with outstanding results in healthy-ill subject classification and early diagnosis tasks [1].
The recently proposed time series entropy measure termed Slope Entropy (SlpEn) [2] is able to achieve high classification accuracy on a diverse set of records [2,3,4]. Despite its short life, it has already been implemented in scientific software tools such as EntropyHub (https://github.com/MattWillFlood/EntropyHub.jl, accessed on 29 December 2022) and CEPS, Complexity and Entropy in Physiological Signals [5].
However, more steps have to be taken towards a complete optimisation of this measure, including the refinement of the initial algorithm. In this way, the method will become more convenient to use, more robust, and endowed with enhanced generalisation capabilities, as has happened to older entropy methods over the years in what is an ongoing and dynamic process.
For example, an already classical method, Approximate Entropy (ApEn) [6], has been quite extensively studied and characterised to provide guidelines regarding input parameter values [7], its behaviour depending on the length of the time series [8], its statistical properties [9], its robustness against noise [10], its sensitivity to missing data [11], and its generic properties [12], among others.
The same applies to a more recent method, Permutation Entropy (PE) [13]. Generic optimisations [14], generic recommendations [15], the use of non-uniform embedding [16], the influence of time series length [17], improvements to its robustness against ties [18], specific recommendations for parameter selection [19], the exploitation of the information related to forbidden ordinal patterns [20], and the real influence of ties in classification tasks [21] are examples of the multiple characterisation studies applied to PE.
Most of these methods are based on computing the relative frequency of some direct or derived specific time series features or symbolic patterns, and applying the computation of, among others, the Shannon entropy to the resulting histogram [22]. Therefore, the estimated relative frequencies are always statistically bounded so that their sum equals 1, and the resulting entropy is bounded by a finite constant value known in advance, such as log(m!) in the case of PE (being m the embedded dimension, as defined later).
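As a worked illustration of such a bound (using PE notation for concreteness; the symbols here are ours and are not taken verbatim from the cited works):

```latex
H = -\sum_{i=1}^{m!} \hat{p}_i \log \hat{p}_i,
\qquad 0 \le H \le \log(m!),
```

where \hat{p}_i is the relative frequency of the i-th ordinal pattern among the m! possible ones. The upper bound log(m!) corresponds to a uniform histogram, which is why dividing by it is the usual way of mapping PE into the [0,1] range.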
Nevertheless, other recent studies have demonstrated that the total number of symbolic patterns theoretically possible or expected, and the actual number found or extracted even from an infinite-length time series, can differ greatly due to the so-called forbidden patterns [23]. Since the number of forbidden patterns is strongly related to the degree of determinism of the time series under analysis [24,25], other works proposed to use the real number of patterns found, instead of the theoretical one, to compute the relative frequency histogram, significantly improving the classification performance based on the modified features [20], but introducing a bias whereby the estimated probabilities no longer add up to a constant known in advance, depending instead on the time series length N. Therefore, these features are no longer an entropy in the sense of the Shannon entropy. However, the term entropy is kept for simplicity, since they are still based on that formulation.
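To make the forbidden-pattern effect concrete, the following sketch (our own illustrative code, using PE-style ordinal patterns rather than the slope patterns of SlpEn) counts the patterns actually found, k, and contrasts normalisation by the theoretical bound log(m!) with normalisation by log(k):

```python
import numpy as np
from math import factorial, log

def normalised_entropies(x, m):
    """Shannon entropy of the ordinal pattern histogram of x, normalised
    by the theoretical bound log(m!) and by log(k), with k the number of
    patterns actually found (a minimal sketch, not the exact algorithm
    of the cited works)."""
    patterns = [tuple(np.argsort(x[i:i + m])) for i in range(len(x) - m + 1)]
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    h = -(p * np.log(p)).sum()
    k = len(counts)                        # patterns actually observed
    return h / log(factorial(m)), h / log(k), k

# Deterministic (chaotic) signal: many ordinal patterns are forbidden,
# so k falls well below m! and the two normalisations differ noticeably.
x = np.empty(5000)
x[0] = 0.4
for i in range(1, x.size):
    x[i] = 4.0 * x[i - 1] * (1.0 - x[i - 1])   # logistic map, r = 4
print(normalised_entropies(x, m=5))
```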
From a time series classification perspective, this lack of statistical rigour not only does not produce inaccurate classification results, but rather contributes to unearthing additional information, increasing the accuracy of the classification process. Moreover, considering a generic feature set for object classification, there is no need at all for the individual features to satisfy any specific statistical property; only their segmentation power really matters. This is the case for the modified version of PE proposed in [20], and for SlpEn [2]. Despite such performance improvements, researchers in the time series entropy realm are more familiar with results within the range 0 to 1, where 0 usually corresponds to completely deterministic series, and 1 to full randomness. In addition, this range is more easily interpreted and dealt with.
Many optimisation methods have been proposed in the scientific literature to improve the performance and robustness of already widely used entropy methods for time series classification. For example, the work in [26] describes how to use entropy profiling to improve short-length time series analysis using KS-entropy measures, which are very dependent on input parameter values; it removes the need for manual r parameter selection almost entirely. In [27], the authors investigated normalised first-order differences of time series to obtain more robust estimates of signal complexity under diverse conditions. Other studies, such as [28], illustrate the possible negative effects of normalisation on method performance, which should also be taken into account and are avoided in the present work.
There are also many applicable normalisation schemes described in the literature to scale back any numerical results to the above-mentioned [0,1] range [29,30,31], and to reduce the influence of time series length. This is also the case in [32], where the authors apply normalisation to ApEn to reduce the possible influence of different record lengths on the calculations.
Along this line, we propose in this work a specific normalisation method for SlpEn to keep its results in the [0,1] range and to make them less dependent on the length of the time series. This method is based on a-priori estimations of the real number of unique patterns likely to be found, without any detrimental effect on classification performance, using an approach similar to that applied in [33] to Lempel-Ziv complexity normalisation.
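In its most general form, the scheme amounts to a Max-Min rescaling; a minimal sketch follows, assuming the bounds slpen_min and slpen_max are available a priori (their derivation is precisely the subject of this paper):

```python
def normalise_slpen(slpen_raw, slpen_min, slpen_max):
    """Max-Min rescaling of a raw SlpEn value into the [0, 1] range.

    slpen_min and slpen_max are a-priori bounds (analytic or heuristic);
    any concrete values would be placeholders at this point.
    """
    return (slpen_raw - slpen_min) / (slpen_max - slpen_min)
```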
The main contribution of this paper is to propose simple exact and approximate values on which to base the SlpEn boundaries used for normalisation. The practical implications of the study are illustrated by means of a classification analysis applying the proposed SlpEn customisation to both synthetic and real time series of different lengths and properties.
4. Discussion
Results in Table 6, Table 7, Table 8, Table 9 and Table 10 show that normalised SlpEn values remain between 0 and 1, and that classification performance is unaffected regardless of the normalisation method used, which were the main goals of the present study.
The comparison in Table 6, using the analytic bounds for normalisation, confirmed that the sensitivity and the specificity of both SlpEn variants were exactly the same. The minimum normalised SlpEn value was achieved by the most deterministic data set, the periodic one, with values in the vicinity of 0.9. At the opposite end, the random dataset achieved the maximum normalised value, around 1.0. Although this range falls within the goal of [0,1], it is probably too narrow to provide a good perspective of the randomness of the datasets, mainly at the minimum level.
That is why heuristic bounds better matching real entropy analysis schemes were devised and applied. The corresponding results are shown in Table 9. In this case, the lowest result, 0, corresponded to the Periodic data set, with the House dataset at the opposite end of the range. The differences were mainly due to the new normalisation scheme, but the specific case of the Periodic database, with a SlpEn of 0, was due to the thresholding applied, which assigns 0 SlpEn to those time series considered deterministic. Obviously, this results in a 100% classification accuracy for the Periodic database. If that is not acceptable, the threshold can be customised depending on the application.
However, comparing Table 6 and Table 9, it becomes apparent that the maximum heuristic bound is not really necessary, since the logarithmic nature of the expressions expands mainly the lower part of the interval. This is confirmed by the results in Table 10, where, using the minimum heuristic bound and the exact analytic maximum, the results are still reasonably distributed according to their degree of determinism. Therefore, this seems to be the most efficient solution for the objective of the present paper.
Finally, in order to illustrate the effect of length on the standard SlpEn calculation, further experiments were conducted varying the length of the records from 1000 up to 4000 samples, in steps of 1000 samples, with the results reported in Table 7 and Table 8 (only for the Random and Bonn databases). As can be observed in those tables, the normalisation method devised also reduces the dependence of SlpEn on time series length. On this same matter, the results in Table 11 show that the performance of the normalisation method in its final recommended version also equals that of the original one even under the difficult conditions that very short records entail, as with the downsampling shown in Table 12.
5. Conclusions
This paper proposes the use of a Max-Min normalisation scheme to keep the results of SlpEn within the interval [0,1]. The main difficulty in applying any normalisation to SlpEn is that some parameters are not known until the record is processed; in this case, the number of unique patterns found, k, from which the maximum value of SlpEn can be derived.
The first approach proposed uses an analytic technique that computes the limits from assumptions about the minimum possible SlpEn value, which coincides with the SlpEn of a constant-gradient time series, and the maximum SlpEn value, based on a uniform histogram and maximising values of k. This approach is easily implemented, keeps SlpEn within the desired interval, and does not damage the classification performance. Its main weakness is that the SlpEn results are confined too close to the 0.9–1.0 region, and differences are not visually very apparent.
The second approach shifts the bounds to values based on real cases, where very deterministic time series such as constant or periodic records are of no interest in entropy terms, and therefore it is not necessary to reserve a part of the interval for them beyond the 0 value itself. At the opposite bound, it can be shown empirically that the number of patterns found is usually several orders of magnitude smaller than the number theoretically expected, and this relationship can be estimated and applied to refine the upper bound.
Using this last approach, the SlpEn results are better distributed. However, from a global analysis of all the results, the optimal combination of bounds appears to be the minimum heuristic bound and the exact analytic maximum. These are the final bounds recommended for inclusion in the computation of SlpEn to obtain a method with less disparity in the result values.
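A sketch of this final recommended combination follows; the concrete heuristic lower bound and determinism threshold are application-dependent and not fixed by this excerpt, and the exact analytic maximum derived in the paper is only modelled here, under stated assumptions, as log(k), the uniform-histogram entropy over the k patterns actually found:

```python
import math

def normalised_slpen(slpen_raw, k, lower_heuristic, det_threshold=0.0):
    """Recommended normalisation, sketched under illustrative assumptions:
    lower_heuristic and det_threshold are application-dependent, and
    log(k) stands in for the exact analytic maximum derived in the paper."""
    if slpen_raw <= det_threshold:
        return 0.0                          # deterministic records map to 0
    upper_analytic = math.log(k)            # uniform histogram over k patterns
    return (slpen_raw - lower_heuristic) / (upper_analytic - lower_heuristic)
```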
In future studies, this kind of normalisation, or one derived from it, could also be customised to make SlpEn almost independent of N, improving accuracy when applied to datasets that are non-uniform in terms of length. That is the case of records such as body temperature or blood pressure [43,44], where each time series frequently has a different length due to acquisition artifacts [11]. Other length reduction techniques, such as trace segmentation [45], should also be assessed.