3.4.1. Parameter Settings

So far, we have omitted some details about the parameters used to build the CFS for analyzing forecasting performance. First, we detail the parameters used to generate the synthetic TS data. Then, we present the entropy-based parameters and the performance metric.

All synthetic TS have 10<sup>4</sup> observations, and all sine waves have an amplitude of *α* = 1 and a frequency of *ω* = 2. The SNRs for the TS corrupted by uniform and by Gaussian noise are *SNR* = 10<sup>−3</sup>, 10<sup>−2</sup>, 10<sup>−1</sup>, 10<sup>−0.9</sup>, 10<sup>−0.5</sup>, 10<sup>−0.1</sup>, 1, 10. In particular, for the Gaussian noise, we used *σ*<sup>2</sup> = 1 (unit variance, hence unit standard deviation). Thus, for each noise type, we generated 8 TS, giving a total of 16 noisy sine waves. Regarding the logistic map, we employed *r* ∈ [3.1, 4] with a step of Δ*r* = 0.005 and a starting point of *x*<sub>0</sub> = 0.1, which produces 181 TS. The last subset of synthetic TS consists of 16 TS generated with the tool GRATIS described in a previous section. We chose parameters such as a length of 600, a spectral entropy in the range 0.25 to 1.00, and Monthly frequency. In total, the Synthetic dataset is composed of 215 TS.

In addition, we selected a subset of the M4 Competition data composed of 22,610 TS. Regarding the forecasting horizon, the M4 Competition dataset provides, for each TS, a training part, a test part, and a predefined horizon: 6 for Yearly, 8 for Quarterly, 18 for Monthly, 13 for Weekly, and 14 for Daily. Since the generated synthetic TS correspond to the Monthly period, we followed the M4 Competition scheme and selected a horizon of 18.
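The logistic-map subset and the noise grid described above can be illustrated with a minimal sketch. The Python below is illustrative only and is not the code used in the experiments; the names `logistic_map`, `R_VALUES`, `SNR_LEVELS`, and `NOISE_TYPES` are assumptions introduced for this example, while the values of *r*, Δ*r*, *x*<sub>0</sub>, the SNR levels, and the series length are taken from the text.

```python
# Hypothetical helper names; only the numeric settings come from the text.

def logistic_map(r, x0=0.1, n=10_000):
    """Return n observations of x_{t+1} = r * x_t * (1 - x_t), starting at x0."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Grid r = 3.1, 3.105, ..., 4.0 with step 0.005.
# Integer stepping is an assumption made here so that the last value is
# exactly 4.0 and no grid point drifts above 4 through float accumulation.
R_VALUES = [(3100 + 5 * k) / 1000 for k in range(181)]

# SNR levels and noise types listed in the text.
SNR_LEVELS = [10**-3, 10**-2, 10**-1, 10**-0.9, 10**-0.5, 10**-0.1, 1, 10]
NOISE_TYPES = ["uniform", "gaussian"]

if __name__ == "__main__":
    print(len(R_VALUES))                       # 181 logistic-map series
    print(len(SNR_LEVELS) * len(NOISE_TYPES))  # 16 noisy sine waves
    series = logistic_map(R_VALUES[-1])        # one series at r = 4.0
    print(len(series))                         # 10000 observations
```

The grid contains (4 − 3.1)/0.005 + 1 = 181 values of *r*, which matches the 181 TS reported above, and 8 SNR levels times 2 noise types gives the 16 noisy sine waves.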

Regarding the entropy-based complexities, some parameters must be established beforehand. In the case of *H*<sub>spct</sub>, we employed the implementation of the package *forecast* [42], which is an already normalized version of entropy, i.e., 0 ≤ *H*<sub>spct</sub> ≤ 1. In contrast, *H*<sub>dist</sub>, *H*<sub>2reg</sub>, and *H*<sub>perm</sub> require the selection of an alphabet size. It is worth mentioning that the choice of alphabet size was rather arbitrary; it may be a parameter worth tuning. In the case of *H*<sub>dist</sub> and *H*<sub>2reg</sub>, we used *d* = 8; thus, the alphabet size was 2<sup>*d*</sup> = 256. The case of *H*<sub>perm</sub> is special, since it depends on the time series and the reconstructed phase space. For this purpose, the Mutual Information method and the Approximate Nearest Neighbor method are employed to estimate *τ* and *d*<sub>e</sub>, respectively, in accordance with Cao's practical method [43]. In particular, for the logistic map, *τ* = 2 and *d*<sub>e</sub> = 8 in accordance with [28]; in this case, the alphabet size is 8! = 40,320 permutations (however, only those with *P*(*π*<sub>j</sub>) > 0 are considered). To estimate the PD required by *H*<sub>perm</sub>, the package *pdc* is employed [35]. The *H*<sub>2reg</sub> and *ESC* measures from the entropy-based complexities were calculated using our own implementation in R based on [8]. To estimate the error of the forecasting methods, we use the forecast values of 4 methods provided in the *M4comp2018* package. The MASE is calculated using the *forecast* package. Furthermore, during the experimentation we noticed a logarithmic relationship between the MASE and some *ESC* values; thus, MASE values are scaled by log<sub>10</sub> to highlight this relationship.
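To make two of the quantities above concrete, the following Python sketch computes a permutation entropy from ordinal patterns and a log<sub>10</sub>-scaled MASE. It is a sketch under stated assumptions, not the *pdc* or *forecast* implementation used in the experiments: the function names are hypothetical, ties are broken by position, the entropy is normalized by log(*d*<sub>e</sub>!) (the text does not state whether normalization was applied), and the MASE follows the common definition that scales the out-of-sample mean absolute error by the in-sample mean absolute error of a (seasonal) naive forecast.

```python
import math
from collections import Counter

def permutation_entropy(series, embedding_dim, delay, normalize=True):
    """Shannon entropy of the ordinal patterns of delay-embedded vectors.

    Only patterns that actually occur (P > 0) enter the sum. Normalizing
    by log(embedding_dim!) is an assumption made for this sketch.
    """
    n_vectors = len(series) - (embedding_dim - 1) * delay
    if n_vectors <= 0:
        raise ValueError("series too short for this embedding")
    counts = Counter()
    for start in range(n_vectors):
        window = [series[start + k * delay] for k in range(embedding_dim)]
        # Ordinal pattern: indices that sort the window (stable, so ties
        # are broken by position; this tie rule is an assumption).
        pattern = tuple(sorted(range(embedding_dim), key=window.__getitem__))
        counts[pattern] += 1
    entropy = -sum((c / n_vectors) * math.log(c / n_vectors)
                   for c in counts.values())
    if normalize:
        entropy /= math.log(math.factorial(embedding_dim))
    return entropy

def log10_mase(train, actual, forecast, season=1):
    """log10 of the MASE; `season` is the lag of the naive benchmark.

    Raises an error if the MASE is zero or the training series is constant
    at the chosen lag.
    """
    scale = sum(abs(train[t] - train[t - season])
                for t in range(season, len(train))) / (len(train) - season)
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return math.log10(mae / scale)

if __name__ == "__main__":
    print(math.factorial(8))  # 40320 possible patterns for an embedding dimension of 8
    print(permutation_entropy([4, 7, 9, 10, 6, 11, 3], embedding_dim=3, delay=1))
```

With an embedding dimension of 8, the alphabet contains 8! = 40,320 possible patterns, but a series of finite length visits only a subset of them, which is why the sum runs over the observed patterns only.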
