3.1.1. Spectral Entropy

Power Spectral Density (PSD) estimation is commonly used in the signal-processing literature. By transforming a given time series *xt* from the time domain to the frequency domain with the discrete Fourier transform, the PSD provides information about the power carried by each frequency component. These powers define a spectral probability distribution, which can be used to assess the uncertainty about a future prediction through the spectral entropy *Hspct* (a cartoon of this process is shown in Figure 1B). To calculate it from a TS (assumed to be stationary), we first require its Autocovariance Function (ACVF), defined as

$$\gamma\_x(k) = \mathbb{E}[(\mathbf{x}\_t - \mu\_x)(\mathbf{x}\_{t-k} - \mu\_x)], \qquad k \in \mathbb{Z}, \tag{4}$$

where *μx* is the TS mean value and *k* corresponds to the lag. With the ACVF, the spectrum of the TS is obtained through the Fourier transform as

$$S\_{\mathbf{x}}(\lambda) = \frac{1}{2\pi} \sum\_{j=-\infty}^{\infty} \gamma\_{\mathbf{x}}(j) e^{ij\lambda}, \qquad \qquad \lambda \in [-\pi, \pi], \tag{5}$$

where *i* = √−1 and *Sx* : [−*π*, *π*] → R+. To understand the implications of Equation (5), consider a white-noise TS *ω*. In that case, *γω*(*k*) = 0 for *k* ≠ 0; thus, the spectrum is constant for all *λ* ∈ [−*π*, *π*] [24]. Then, if we define $\sigma\_{\mathbf{x}}^2 = \int\_{-\pi}^{\pi} S\_{\mathbf{x}}(\lambda)\,d\lambda = \gamma\_{\mathbf{x}}(0)$, the spectral density of *xt* is

$$f\_{\mathbf{x}}(\lambda) = \frac{S\_{\mathbf{x}}(\lambda)}{\sigma\_{\mathbf{x}}^2} = \frac{1}{2\pi} \sum\_{j=-\infty}^{\infty} \rho\_{\mathbf{x}}(j) e^{ij\lambda},\tag{6}$$

where *ρx*(*k*) = *γx*(*k*)/*γx*(0) corresponds to the Autocorrelation Function (ACF). The density *fx*(*λ*) can be interpreted as the probability density function of a random variable defined on the unit circle. For instance, in the case of the white noise *ω*, *fω*(*λ*) = 1/(2*π*), which corresponds to the spectral density of the uniform distribution [24].
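As a concrete illustration of Equation (6), the sketch below estimates the normalized spectral density of a finite sample by replacing *ρx*(*k*) with the sample ACF and truncating the infinite sum at a maximum lag. This is a minimal, illustrative implementation rather than the procedure used in this work; the helper names (`sample_acf`, `spectral_density`) and the truncation parameter `max_lag` are assumptions made for the example.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Biased sample autocorrelation rho_hat(k) for k = 0, ..., max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the sample mean mu_x
    n = len(x)
    acvf = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag + 1)])
    return acvf / acvf[0]                  # rho(k) = gamma(k) / gamma(0)

def spectral_density(x, lambdas, max_lag):
    """Truncated Eq. (6): f_x(lambda) ~ (1/2pi) * sum_k rho(k) exp(i*k*lambda)."""
    rho = sample_acf(x, max_lag)
    k = np.arange(-max_lag, max_lag + 1)
    rho_sym = np.concatenate([rho[:0:-1], rho])           # rho(-k) = rho(k)
    # Only the real part survives because the series (and hence rho) is real.
    f = np.exp(1j * np.outer(lambdas, k)) @ rho_sym / (2 * np.pi)
    return np.real(f)
```

Note that this naive truncated estimator can dip below zero for strongly autocorrelated series; in practice, smoothed periodogram or lag-window estimators are typically preferred.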

Using Equation (6), the spectral entropy *Hspct* is defined as

$$H\_{\rm spct} = -\int\_{-\pi}^{\pi} f\_{\mathbf{x}}(\lambda) \log\_a f\_{\mathbf{x}}(\lambda)\, d\lambda. \tag{7}$$

If the value obtained from Equation (7) is relatively small, *xt* is more *forecastable*, since it contains more signal than noise; a larger value indicates the opposite [18]. This analysis can be simplified by normalizing *Hspct* to the range 0 ≤ *Hspct* ≤ 1, dividing it by the entropy of the uniform distribution, i.e., log*a*(2*π*), which is the maximum entropy for a density with finite support. In this sense, the uncertainty about a required prediction *xt*+*h* is given by the characteristics of the process itself [24].
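A corresponding sketch of the normalized spectral entropy of Equation (7) is given below, assuming the hypothetical `spectral_density()` helper from the previous snippet. The integral is approximated by a Riemann sum over a grid of *λ* values and the result is divided by log(2*π*), the entropy of the uniform (white-noise) spectrum; the grid size and example series are illustrative choices.

```python
import numpy as np

def spectral_entropy(x, max_lag, n_grid=1024):
    """Normalized spectral entropy: values near 1 indicate noise-like series."""
    lambdas = np.linspace(-np.pi, np.pi, n_grid)
    f = np.maximum(spectral_density(x, lambdas, max_lag), 1e-12)   # guard log(0)
    d_lambda = lambdas[1] - lambdas[0]
    h = -np.sum(f * np.log(f)) * d_lambda    # Riemann-sum approximation of Eq. (7)
    return h / np.log(2 * np.pi)             # divide by the maximum (uniform) entropy

# Illustration: white noise should score close to 1 (hard to forecast),
# while a pure sinusoid should score much lower (highly forecastable).
rng = np.random.default_rng(0)
white_noise = rng.standard_normal(500)
sinusoid = np.sin(2 * np.pi * 0.05 * np.arange(500))
print(spectral_entropy(white_noise, max_lag=50))
print(spectral_entropy(sinusoid, max_lag=50))
```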
