**2. Materials**

The complete dataset of the time series used in this paper is divided into two subsets: Synthetic and M4 Competition. Each of these is described in the following subsections.

## *2.1. Synthetic Time Series*

Three generating mechanisms were considered for the construction of the synthetic TS: (a) sine waves; (b) the logistic map; and (c) a time series generation tool, namely *GRATIS* [15]. To generate several time series from the same mechanism, either the parameters of the generating function are modified or noise is introduced at a specific Signal-to-Noise Ratio (SNR). The synthetic TS considered are *Sine Wave corrupted by uniform noise*, *Sine Wave corrupted by Gaussian noise*, *1-D logistic Map*, and the *GRATIS tool*.

## 2.1.1. Sine Waves TS

A stationary and seasonal TS is generated using a sine wave of the form:

$$
x_t = \alpha \cdot \sin(\omega t),
\tag{1}
$$

where $x_t$ is the observation at time $t$, $\alpha$ is the wave amplitude, and $\omega$ is the wave frequency. A family of time series is spawned from Equation (1) by corrupting the wave with noise at specific SNRs. The contaminated series is defined as

$$X = f(x) + k \cdot \epsilon,$$

where $f(x)$ is the sine wave, $k$ is an increasing constant, and $\epsilon \sim P(X)$ is a shock that follows a uniform or Gaussian distribution. In these cases, the SNR is determined by

$$SNR = \frac{Var(f(x))}{Var(\epsilon)},$$

where larger SNR values make the signal easier to detect and smaller values make it harder.
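The construction above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the default amplitude and frequency, and the seed are hypothetical, and it assumes that the SNR is measured against the already scaled shock, so the scale factor plays the role of $k$ and is chosen to hit a target SNR.

```python
import math
import random
import statistics


def sine_wave(n, amplitude=1.0, omega=0.1):
    """Return x_t = amplitude * sin(omega * t) for t = 0, ..., n - 1."""
    return [amplitude * math.sin(omega * t) for t in range(n)]


def add_noise(signal, target_snr, kind="gaussian", seed=0):
    """Corrupt `signal` with uniform or Gaussian shocks at a target SNR.

    Assumption: SNR = Var(signal) / Var(scaled shock), with the scale
    factor (the role of k in the text) solved from the target SNR.
    Returns the noisy series and the scaled noise that was added.
    """
    rng = random.Random(seed)
    if kind == "gaussian":
        shocks = [rng.gauss(0.0, 1.0) for _ in signal]
    elif kind == "uniform":
        shocks = [rng.uniform(-1.0, 1.0) for _ in signal]
    else:
        raise ValueError("kind must be 'gaussian' or 'uniform'")
    # Rescale so that Var(noise) = Var(signal) / target_snr on this sample.
    scale = math.sqrt(
        statistics.pvariance(signal)
        / (target_snr * statistics.pvariance(shocks))
    )
    noise = [scale * e for e in shocks]
    noisy = [s + e for s, e in zip(signal, noise)]
    return noisy, noise


def snr(signal, noise):
    """Empirical SNR as the ratio of population variances."""
    return statistics.pvariance(signal) / statistics.pvariance(noise)
```

Calling `add_noise(sine_wave(500), target_snr=2.0, kind="uniform")` would return one member of the uniform-noise family; sweeping `target_snr` over a grid yields the whole family.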

## 2.1.2. Logistic Map TS

The logistic map is a one-dimensional chaotic dynamical system that is commonly employed as a benchmark to study tools and methods used to characterize chaotic dynamics [19–21]. The logistic map function is defined as

$$x_{t+1} = r \cdot x_t (1 - x_t),\tag{2}$$

where the initial value $x_0$ is a random number with $0 < x_0 < 1$, and $r$ is a constant that determines the behavior of Equation (2). More precisely, when $r < 1$ the system always collapses to zero, for $1 \le r \le 3$ the system tends to a single value, for $3 < r < 3.6$ the system oscillates between period-doubling points, and from $r \sim 3.6$ onward the system exhibits chaotic behavior.
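The regimes listed above can be checked numerically by iterating Equation (2). The following is a minimal sketch; the function name, the seed, and the range used to draw the initial value are assumptions made for illustration.

```python
import random


def logistic_map(r, n, x0=None, seed=0):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return n values.

    If x0 is not given, it is drawn uniformly from (0.01, 0.99); this
    range is an assumption that keeps the start away from 0 and 1.
    """
    if x0 is None:
        x0 = random.Random(seed).uniform(0.01, 0.99)
    series = [x0]
    for _ in range(n - 1):
        x = series[-1]
        series.append(r * x * (1.0 - x))
    return series
```

For example, `logistic_map(0.5, 200, x0=0.3)` decays toward zero, `logistic_map(2.5, 200, x0=0.3)` settles near the fixed point $1 - 1/r = 0.6$, `logistic_map(3.2, 500, x0=0.3)` alternates between two values, and `logistic_map(3.9, 500, x0=0.3)` shows no visible periodicity.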

## 2.1.3. GRATIS TS

The last subset of time series was generated using the GRATIS tool [15], which is based on Gaussian Mixture Autoregressive (MAR) models. These models contain multiple stationary or non-stationary autoregressive components and can capture non-linearity, non-Gaussianity, and heteroscedasticity. To tune the parameters of the MAR models, the GRATIS authors use a Genetic Algorithm that searches until the distance between the target feature vector and the feature vector of the generated series is close to zero. This tool generates time series with diverse characteristics such as length, frequency, and behavior features.

## *2.2. M4 Competition TS*

The complete set is composed of 100,000 real-life series divided into sets named "periods" (Yearly, Quarterly, Monthly, Weekly, Daily, and Hourly) and subdivided into subsets or types (Demographics, Finance, Industry, Macro, Micro, and Other). Our criteria for selecting time series were: (1) the time series must have a minimum of 250 observations; and (2) the frequency group must contain more than one type. Consequently, TS from the Hourly group were not selected, since that group only contains time series of the type *Other*. According to this last criterion, our dataset considers only the subsets Yearly, Quarterly, Monthly, Weekly, and Daily; the total number of TS in our dataset is 22,610. The complete dataset is shown in Table 1, which has two final columns named size and percentage (%); the former is the number of time series selected from each frequency group, and the latter is the corresponding percentage of selected time series with respect to that group.
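The two selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the record schema (dictionaries with `period`, `type`, and `length` keys) and the function name are hypothetical, and it assumes that criterion (2) is evaluated on the full frequency group before the length filter is applied.

```python
from collections import defaultdict

MIN_OBSERVATIONS = 250  # criterion (1)


def select_series(records, min_obs=MIN_OBSERVATIONS):
    """Keep records that satisfy both selection criteria.

    `records` is a list of dicts with the hypothetical keys
    'period', 'type', and 'length' (number of observations).
    """
    # Collect the distinct types present in each frequency group.
    types_per_period = defaultdict(set)
    for rec in records:
        types_per_period[rec["period"]].add(rec["type"])
    return [
        rec
        for rec in records
        if rec["length"] >= min_obs                      # criterion (1)
        and len(types_per_period[rec["period"]]) > 1     # criterion (2)
    ]
```

Under this reading, every Hourly record is dropped regardless of its length, because the Hourly group contains a single type.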


**Table 1.** M4 Competition time series.
