**4. Results**

To understand the relationship between the CFS and prediction performance for a given model, experiments were carried out using two different datasets: synthetic TS and M4 TS. In the first case, the purpose is to understand how the *ESC* complexities behave on data whose underlying mechanisms can be controlled. In the second case, the value of the CFS in a real-world setting is explored to obtain a better idea of its potential for identifying regions (perhaps groups) of forecastability.

## *4.1. Complexities and Forecastability of the Synthetic TS*

This section is divided into the two subsections described below. The forecastability is analyzed only with the ARIMA forecasting method, which was executed from the *forecast* package. Its parameters *p*, *d*, and *q* are tuned by following the procedure in [44]: the *ARIMA* function is executed with different combinations of *p* ∈ [0, 10], *d* ∈ [1, 3], and *q* ∈ [0, 10], and the combination that obtains the smallest Akaike Information Criterion (AIC) value is selected, in order to obtain the best forecast for each TS in the synthetic subset.
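The order search described above can be sketched as an exhaustive grid search. The snippet below is a minimal Python illustration, not the R code used with the *forecast* package: the callback `aic_of` is a hypothetical stand-in for any routine that fits an ARIMA model of a given order and returns its AIC, and `toy_aic` is a made-up scoring function used only to exercise the loop.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Order = Tuple[int, int, int]


def candidate_orders(p_max: int = 10,
                     d_values: Sequence[int] = (1, 2, 3),
                     q_max: int = 10) -> List[Order]:
    """All (p, d, q) combinations in the ranges quoted in the text."""
    return list(product(range(p_max + 1), d_values, range(q_max + 1)))


def select_order_by_aic(series: Sequence[float],
                        aic_of: Callable[[Sequence[float], Order], float]) -> Order:
    """Return the candidate order with the smallest AIC."""
    best_order, best_aic = None, float("inf")
    for order in candidate_orders():
        try:
            aic = aic_of(series, order)  # hypothetical: fit a model, report its AIC
        except (ValueError, ArithmeticError):
            continue  # assumption: orders whose fit fails are simply skipped
        if aic < best_aic:
            best_order, best_aic = order, aic
    if best_order is None:
        raise RuntimeError("no candidate order could be fitted")
    return best_order


def toy_aic(series: Sequence[float], order: Order) -> float:
    """Made-up score with a single minimum at (2, 1, 3); not a real AIC."""
    p, d, q = order
    return abs(p - 2) + abs(d - 1) + abs(q - 3)


n_candidates = len(candidate_orders())
chosen = select_order_by_aic([0.0] * 50, toy_aic)
print(n_candidates, chosen)
```

With the ranges above the grid contains 11 × 3 × 11 = 363 candidate orders per series, which is why the search is feasible but not cheap when repeated over every synthetic TS.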

## 4.1.1. The Logistic Map

To start the discussion of synthetic TS results, the logistic map is analyzed. This is a common benchmark used for the elucidation of the relationship between entropy-based complexities and forecastability [20,21,28]. Hence, the well-known Feigenbaum diagram along with its corresponding *ESC* measurements (from top to bottom) obtained for the logistic map are shown in Figure 3. Recall that the Feigenbaum diagram is a visual summary of the values (*xt*) visited by a system as a function of a bifurcation parameter. Thus, in this case, as the parameter *r* grows the logistic map transitions from permanent oscillations between fixed-point pairs to the chaotic regime. Colors in the Feigenbaum diagram correspond to the *log*(*MASE*) obtained by an ARIMA model: lower errors are shown in dark blue whereas those with higher values are displayed in bright yellow. Hence, as the logistic map dynamics becomes more chaotic, the TS become less forecastable by the ARIMA model.
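As a reminder of how one column of the Feigenbaum diagram is produced, the following minimal sketch iterates the standard logistic map, *x*(*t*+1) = *r*·*x*(*t*)·(1 − *x*(*t*)), discards a transient, and collects the values that are still visited. The initial condition, transient length, number of retained points, and rounding precision are assumptions chosen for illustration, not the settings used for Figure 3.

```python
from typing import List


def logistic_orbit(r: float, x0: float = 0.4,
                   n_transient: int = 2000, n_keep: int = 200) -> List[float]:
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the post-transient values."""
    x = x0
    for _ in range(n_transient):  # let the orbit settle on its attractor
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n_keep):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


def visited_values(r: float, decimals: int = 6) -> List[float]:
    """Distinct (rounded) values visited for one r: one column of the diagram."""
    return sorted({round(x, decimals) for x in logistic_orbit(r)})


for r in (3.2, 3.5, 3.9):
    print(r, len(visited_values(r)))
```

For *r* = 3.2 the orbit settles on two values and for *r* = 3.5 on four, whereas for *r* = 3.9 almost every retained point is distinct, which is the transition from periodic to chaotic behavior described above.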

For the *ESC* plots, colors correspond to different entropy-based complexities: red for *Hdist*, green for *Hspct*, blue for *Hperm*, and purple for *H2reg*. Observe that all entropy-based complexities are constant when oscillating between two values (*r* ≤ 3.44), except for *Hperm*. In this case, *ESC* values using a binary alphabet would be (*E* = 1, *S* = 0, *C* = 0) for *Hdist*, *H2reg*, and *Hperm*; however, by forcing an arbitrarily large alphabet size, the self-organization is revealed. In fact, for *Hperm* and *r* ≤ 3.44, (*E* = 0, *S* = 1, *C* = 0) in most cases, as a consequence of a Dirac delta PD, with the exception of some spikes in which new ordinal patterns emerge. Observe that several of these spikes have worse *log*(*MASE*) than those obtained for contiguous *r* values. *Hdist* grows immediately after *r* = 3.44 due to the doubling of the limit cycle, but remains steady until *r* = 3.54. This contrasts with *H2reg*, which does not grow, and with *Hspct* and *Hperm*, which increase more slowly. In fact, *Hdist* seems to be a measure that is more sensitive to the alphabet size and not necessarily to TS intricacy, since its *E* becomes very high in [3.54, 3.63] in comparison to the rest of the complexities. Eventually, *Hspct* and *Hdist* concur in that the emergence of new states (*E* ∼ 1) (or, complementarily, the reduction in self-organization, *S* ∼ 0) is similar to that of a random process. Conversely, *H2reg* and *Hperm* increase more slowly as *r* grows, although the former does not change until *r* ∼ 3.68, indicating that the regimes displayed by the logistic map for *r* ≤ 3.68 are constant.
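The effect of the alphabet size on the *ESC* triple can be checked with a small sketch. It assumes the usual definitions, which are not restated in this section: *E* is the Shannon entropy normalized by the logarithm of the alphabet size *K*, *S* = 1 − *E*, and *C* = 4·*E*·*S*. The function name and the alphabet size of 16 are illustrative choices, not values taken from the experiments.

```python
import math
from collections import Counter
from typing import Hashable, Sequence, Tuple


def esc_from_symbols(symbols: Sequence[Hashable],
                     alphabet_size: int) -> Tuple[float, float, float]:
    """Emergence, self-organization and complexity of a symbol sequence.

    Assumes E = H / log(K), S = 1 - E and C = 4 * E * S, where H is the
    Shannon entropy of the symbol frequencies and K is the alphabet size.
    """
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    if not symbols:
        raise ValueError("symbols must not be empty")
    n = len(symbols)
    counts = Counter(symbols)
    # H = sum_i p_i * log(1 / p_i), with p_i = count_i / n
    entropy = sum((c / n) * math.log(n / c) for c in counts.values())
    e = entropy / math.log(alphabet_size)
    s = 1.0 - e
    return e, s, 4.0 * e * s


two_cycle = [0, 1] * 50   # a series oscillating between two values
constant = [0] * 100      # a single symbol: a Dirac delta distribution

print(esc_from_symbols(two_cycle, alphabet_size=2))    # binary alphabet
print(esc_from_symbols(two_cycle, alphabet_size=16))   # forced larger alphabet
print(esc_from_symbols(constant, alphabet_size=2))
```

With a binary alphabet the two-value oscillation gives (*E* = 1, *S* = 0, *C* = 0); with the larger alphabet the same series uses only two of the available symbols, so *E* drops, *S* rises, and *C* becomes nonzero. The constant series gives (*E* = 0, *S* = 1, *C* = 0).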

**Figure 3.** The Logistic Map and its ESC (Emergence, Self-Organization, and Complexity). The top plot shows the bifurcation diagram, whereas below the corresponding ESC for different entropy measures is shown.

The interplay between new states and the self-organization of the system, displayed by *C*, is very interesting. For *Hdist*, when the logistic map has 2 fixed points (*r* ≤ 3.44), *C* = 0.5 is obtained; when the period doubles it increases to *C* ∼ 0.8, and it shows maximal complexity (*C* ∼ 1) at points between double-periods and chaotic regimes (*at the edge of chaos*). Hence, for obtaining a lower *log*(*MASE*) it is necessary that 0.5 ≤ *Cdist* ≤ 1, *S* ≥ 0.5, and *E* ≤ 0.5. For *Hspct* a similar relationship is observed, in the sense that high *C* values are associated with lower *log*(*MASE*) due to the *E* ≤ 0.5 and *S* ≥ 0.5 proportions. Notice that this *C* separates ARIMA performance into two regions, with the worst *log*(*MASE*) corresponding to complexities below 0.6, dropping even to *C* ∼ 0 as the logistic map becomes more chaotic. In contrast, *C* for *Hperm* and *H2reg* takes larger values for worse forecasting performance; *C2reg* separates ARIMA performance into two regions, similarly to *Cspct*.

## 4.1.2. The CFS of All Synthetic Data

All the synthetic data were mapped as 2D points in the CFS, which is displayed in Figure 4.

In Figure 4A, the *ESC* variables are projected onto the CFS plane to display their loadings. Notice that the first two Principal Components (PCs) explain a large amount of the variance in the data (*PC*1 ∼ 73%, *PC*2 ∼ 10.6%), because most of the series in the data set belong to the logistic map. *Cperm*, *Edist*, *Eperm*, and *E2reg* have positive loadings on PC1, whereas *Sdist*, *Sperm*, and *S2reg* have negative loadings. *Espct* and *Sspct* are parallel to their *Hdist* counterparts, but with lower loadings on PC1. The rest of the complexities have lower loadings on these two PCs. Convex hulls are used to denote each TS source; note that these are constrained to specific regions in the CFS. Hulls corresponding to the corrupted sine waves mostly overlap each other, and share a large portion with the GRATIS data.

**Figure 4.** The Logistic Map and its ESC. The top plot shows the bifurcation diagram, whereas below the corresponding ESC for different entropy measures is shown. (**A**) The *ESC* variables are projected onto the Complexity Feature Space (CFS) plane to display their loadings; (**B**) the TS, mapped as two-dimensional points, are colored according to their *log*(*MASE*); (**C**) results of the K-means clustering algorithm using four centroids.

In Figure 4B, the TS, mapped as 2D points, are colored according to their *log*(*MASE*). Observe that forecasting performance changes in a clockwise fashion: ARIMA's best performance lies in the upper left quadrant and its worst results in the lower right. It is interesting that the worst *log*(*MASE*) values correspond to noisy time series, instead of the chaotic source, and that they are *conveniently* confined to specific regions in the CFS. By *convenient* we mean that a clustering algorithm may be used to group TS characterized by the *ESC* variables into performance clusters, which can then be employed to determine whether a forecasting method is useful for a given TS. Encouraged by this, the results obtained by the popular K-means clustering algorithm using four centroids are shown in Figure 4C. Notice that the resulting clusters correspond to the performance regions mentioned before.
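To make the clustering step concrete, the sketch below runs Lloyd's algorithm (the standard K-means iteration) with four centroids on a handful of 2D points. It is only an illustration: the points are made-up stand-ins for CFS coordinates, and the deterministic farthest-point initialization is an assumption introduced here for reproducibility, not necessarily the initialization used for Figure 4C.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def kmeans(points: Sequence[Point], k: int = 4,
           n_iter: int = 100) -> Tuple[List[int], List[Point]]:
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    # Start from the first point, then repeatedly add the point that is
    # farthest from all centroids chosen so far.
    centroids: List[Point] = [tuple(points[0])]
    while len(centroids) < k:
        centroids.append(tuple(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids))))
    labels = None
    for _ in range(n_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


# Made-up 2D coordinates: four tight groups, one per quadrant.
centers = [(-3.0, 3.0), (3.0, 3.0), (3.0, -3.0), (-3.0, -3.0)]
offsets = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2)]
demo_points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

demo_labels, demo_centroids = kmeans(demo_points, k=4)
print(demo_labels)
```

Each of the four groups receives its own label, which mirrors the idea of using cluster membership in the CFS as a proxy for the expected forecasting performance of a TS.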

## *4.2. Complexities and Forecastability of the M4 Competition TS*

Before delving into the analysis of the M4 Competition results, we display the relationships of the different *ESC* measures of the M4 set in the CFS. In Figure 5, all TS (Yearly, Quarterly, Monthly, Weekly, and Daily) are displayed as 2D points; we focus on the *C* measure for each entropy measure (Figure 5a 2-regimen, Figure 5b distribution, Figure 5c permutation, and Figure 5d spectral); colors range from brighter (corresponding to higher values, *C* ∼ 1) to darker (corresponding to lower values, *C* ∼ 0). Observe that both *C2reg* and *Cperm* show a linear gradient, with high complexity values as PC1 becomes more negative and lower complexity values as it becomes more positive. Interestingly, regarding PC2, they are on opposite sides. On the other hand, the visual gradient of *Cdist* is perceived more along the y-axis (lower *C* values at positive coordinates and higher *C* values at negative ones), in contrast to PC1, where no clear relationship between high and low *C* values is observed. Similarly, *Cspct* shows high values over most of the plane of the two PCs. However, for both *Cdist* and *Cspct* this behavior may be a product of the dimensionality reduction by the linear method.

These intuitions are corroborated by the loadings of these variables on the four most important PCs, which are presented in Table 3. Notice that PC1 and PC2 are mainly represented by the Permutation and 2-regimen complexities. On the other hand, for PC3 the most significant variable is *Cdist*, which has a negative loading, whereas for PC4 the most significant variable is *Cspct*. In particular, the *C* part of the *ESC* measures will be used for the analysis in Section 4.3. Table 4 shows the proportion of explained variance corresponding to each principal component. Observe that the first two PCs account for most of the variance (≈77%) in the data, and that with only 4 PCs we account for 100% of the variance.

In Figure 6a, selected M4 TS are shown in the CFS, color-coded by the period that corresponds to their frequency. Observe that Daily and Monthly TS are readily identifiable in the 2D projection: the former are restricted to a specific region of the CFS, whereas the latter are spread across it. Weekly TS are constrained to the middle section of the CFS, while Yearly and Quarterly TS are barely noticeable. On the other hand, in Figure 6b the M4 TS are colored according to the winning method, where 6838 points correspond to ARIMA, 6384 to Smyl, 5064 to ETS, and 4324 to Theta. It is worth mentioning that even though ARIMA wins on more TS than the Smyl algorithm, the error magnitudes of the former are larger than those of the latter. Moreover, there are no specific regions in which any of the tested methods performs better than the rest, which is consistent with the *No Free Lunch* theorem.
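The distinction between winning on more TS and having smaller errors can be illustrated with a short sketch. The error values below are invented for the example and the helper names are hypothetical; the winner of a series is taken to be the method with the smallest error, with ties resolved in favor of the method listed first (an assumption).

```python
from collections import Counter
from typing import Dict


def winning_method(errors_by_method: Dict[str, float]) -> str:
    """Name of the method with the smallest error for one series."""
    return min(errors_by_method, key=errors_by_method.get)


def count_wins(errors_by_series: Dict[str, Dict[str, float]]) -> Counter:
    """Number of series won by each method."""
    return Counter(winning_method(e) for e in errors_by_series.values())


def mean_error(errors_by_series: Dict[str, Dict[str, float]], method: str) -> float:
    """Average error of one method over all series."""
    values = [e[method] for e in errors_by_series.values()]
    return sum(values) / len(values)


# Invented errors for three series and two methods.
toy_errors = {
    "ts1": {"ARIMA": 0.9, "Smyl": 1.0},
    "ts2": {"ARIMA": 0.8, "Smyl": 0.9},
    "ts3": {"ARIMA": 3.0, "Smyl": 1.0},
}

wins = count_wins(toy_errors)
print(wins)
print(mean_error(toy_errors, "ARIMA"), mean_error(toy_errors, "Smyl"))
```

In this toy case ARIMA wins two of the three series by a narrow margin, yet its average error is larger because of one poor forecast, which is the same kind of situation described for ARIMA and Smyl above.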

**Figure 5.** Four complexity measures and the Principal Components Analysis (PCA) of 12 features (ESC).


**Table 3.** Principal Components Analysis (PCA) results.

**Table 4.** Proportion of variance for the principal components.


*Entropy* **2020**, *22*, 89

**Figure 6.** Analysis of TS regarding its Period frame and Winning method by TS. (**a**) Selected M4 Time Series are shown in the Complexity Feature Space (CFS), and each one is colored according to the period of its frequency; (**b**) M4 Time Series are colored according to the winning method.

Continuing with the experiments on the M4 dataset, one of our main interests is to determine the forecastability of the M4 Competition TS through their complexity measures. Therefore, we consider four methods of the M4 Competition (Smyl, Theta, ARIMA, and ETS) in order to establish whether there is a relationship between the complexity measures and the MASE error (*log*(*MASE*) is used for practical purposes) of each forecasting method. The first activity was to divide the complete dataset into four quartiles. In Figure 7, each gray point represents one TS of the complete M4 Competition dataset, and each dark green point represents a TS whose *log*(*MASE*) value falls in the first quartile. Specifically, Figure 7a shows the *log*(*MASE*) values corresponding to the Smyl forecasting method, Figure 7b those corresponding to the Theta forecasting method, and so on. This figure shows that the TS with low *log*(*MASE*) values are concentrated at negative values of the second principal component and at high values of the first principal component; according to Figure 6a, these TS belong mainly to the Monthly period.

**Figure 7.** Relationship between *log*(*MASE*) and Complexity measures of the first quartile.

In the same way, Figure 8 represents the TS in the fourth quartile according to their *log*(*MASE*) values, where each purple point corresponds to a TS that belongs to this quartile. Comparing Figure 8a–d, the *log*(*MASE*) values of the forecasting methods are close to each other. In terms of the area over which these TS are distributed, we observe that when the complexity measures are higher, the *log*(*MASE*) value is higher too. Moreover, compared with the distribution of TS by period (Yearly, Quarterly, Monthly, Weekly, and Daily), most of the Daily TS belong to this quartile.

**Figure 8.** Relationship between *log*(*MASE*) and Complexity measures of the fourth quartile.
