## *2.2. Pre-Processing*

The pre-processing stage consists of the following steps. Since the total amount of data recorded differs between subjects, a new subset of data is extracted so that every subject has the same number of observations. Then, the new set of data is segmented into one-hour intervals. This segmentation allows depressive episodes to be classified per hour.
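The two steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: each subject's recording is taken to be a plain list of per-minute activity counts, the common length is assumed to be that of the shortest recording (the text does not say how it is chosen), and the names `equalize_lengths` and `segment_hourly` are hypothetical.

```python
from typing import Dict, List

MINUTES_PER_HOUR = 60  # the actigraph records one sample per minute


def equalize_lengths(recordings: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Truncate every subject's recording to the shortest one (assumed rule)."""
    common = min(len(series) for series in recordings.values())
    return {subject: series[:common] for subject, series in recordings.items()}


def segment_hourly(series: List[float]) -> List[List[float]]:
    """Split a per-minute series into complete one-hour segments.

    A trailing partial hour is dropped (an assumption of this sketch).
    """
    n_hours = len(series) // MINUTES_PER_HOUR
    return [
        series[h * MINUTES_PER_HOUR:(h + 1) * MINUTES_PER_HOUR]
        for h in range(n_hours)
    ]
```

Each returned segment then holds the 60 samples of one hour, which is the unit that is later classified.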

Based on the hourly segmentation, three subsets are constructed: night motor activity (from 21 to 7 h, taking into account the standard sunrise hours) [21], day motor activity (from 8 to 20 h) and full-day motor activity, which covers all 24 h. The number of observations contained in each dataset is shown in Table 1. After the data were separated into day, night and 24 h subsets, missing data were removed from each subset.
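The assignment of hourly segments to the three subsets could look like the sketch below. It assumes that each segment is labeled with the clock hour at which it starts and that both bounds of each range are inclusive; the function names are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def subset_for_hour(hour: int) -> str:
    """Return 'night' for 21-7 h and 'day' for 8-20 h, per the ranges in the text."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    return "night" if hour >= 21 or hour <= 7 else "day"


def split_by_period(
    hourly_segments: Iterable[Tuple[int, List[float]]]
) -> Dict[str, List[List[float]]]:
    """Group (hour, segment) pairs into day, night and full-day subsets."""
    subsets: Dict[str, List[List[float]]] = {"day": [], "night": [], "full": []}
    for hour, segment in hourly_segments:
        subsets[subset_for_hour(hour)].append(segment)
        subsets["full"].append(segment)  # the full-day set keeps every hour
    return subsets
```

Under these assumptions, one complete day contributes 11 night segments and 13 day segments.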

**Table 1.** Datasets created from Depresjon dataset.


Finally, the last pre-processing step is cleaning the data by eliminating missing values, represented as NA, and standardizing the motor activity. The standardization centers the data on the mean, showing how far each point of the signal lies from the mean in units of standard deviation. It is calculated with Equation (1).

$$z_i = \frac{x_i - \bar{x}}{s},\tag{1}$$

where $x_i$ is the current point of the activity data, $\bar{x}$ is the mean of the total motor activity data and $s$ is the standard deviation of the total motor activity data.
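A minimal sketch of the cleaning and standardization step is given below. It assumes that missing values are encoded as `None` or NaN and that *s* is the sample standard deviation (the text does not say whether the sample or population form is used); `standardize` is a hypothetical name.

```python
import math
import statistics
from typing import List, Optional


def standardize(activity: List[Optional[float]]) -> List[float]:
    """Drop missing values, then apply z_i = (x_i - mean) / s from Equation (1)."""
    # Missing values (NA) are assumed to be encoded as None or NaN.
    clean = [x for x in activity if x is not None and not math.isnan(x)]
    mean = statistics.mean(clean)
    s = statistics.stdev(clean)  # sample SD (n - 1); an assumption of this sketch
    return [(x - mean) / s for x in clean]
```

For example, the input `[1.0, None, 2.0, 3.0]` is reduced to three values with mean 2 and standard deviation 1, so the standardized values are -1, 0 and 1.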

## *2.3. Feature Extraction*

For each dataset, the 24 features shown in Table 2 are extracted. This process is based on similar works that extract features from an accelerometer signal [22–24].

Of the extracted features, ten are time-domain features, as shown in Table 2, computed from the data collected by the actigraph every minute.

To transform the time-domain data into the frequency domain, the fast Fourier transform (FFT) is applied, which can be calculated with Equation (2),

$$x(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi \frac{kn}{N}},\tag{2}$$

where *x*(*n*) represents each motor activity value collected per minute within an hour, *N* represents the total number of observations in an hour, *k* represents the current frequency index, taking values from 0 to *N* − 1, and *x*(*k*) represents the spectral components of the samples.
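For illustration, the transform of one hourly segment can be evaluated directly from its definition, as sketched below. This is not an FFT: it is the plain O(N²) sum, which yields the same values that an FFT routine computes more efficiently and is inexpensive for N = 60. The function name `dft` is hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct evaluation of x(k) = sum_n x(n) * exp(-j*2*pi*k*n/N), k = 0..N-1."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]
```

A constant segment, for instance, puts all of its content in the component at *k* = 0, while every other component is zero up to rounding error.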

In this FFT process, the representation of the original signal in the frequency domain is computed using the discrete Fourier transform (DFT). This representation is formed by complex numbers, so the imaginary part of each number in the frequency-domain signal is eliminated. To do this, the power spectral density (PSD) is calculated, as shown in Equation (3),

$$P = \lim_{T \to \infty} \frac{1}{T} \int_0^T |x(k)|^2 \, dt,\tag{3}$$

where *P* represents the energy from the signal, *T* represents the length of the signal lapse and *x*(*k*) represents the frequency-domain signal. The spectrum is normalized by the length of the signal.
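For a finite hourly segment, a discrete counterpart of this quantity can be sketched as the squared magnitude of each spectral component divided by the number of samples. This periodogram-style estimate is an assumption of the sketch; the exact discrete formula used in the study is not given in the text, and `power_spectrum` is a hypothetical name.

```python
from typing import List, Sequence


def power_spectrum(spectrum: Sequence[complex]) -> List[float]:
    """Squared magnitude of each spectral component, normalized by signal length."""
    N = len(spectrum)
    return [abs(c) ** 2 / N for c in spectrum]
```

Because only magnitudes are kept, the result is a list of real numbers, which matches the description of discarding the imaginary part of the frequency-domain signal.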

The 14 remaining features are extracted from the PSD of the signal to obtain the best characterization of the signal.


**Table 2.** Features extracted for the day, night and full day datasets, in time and frequency domain.

1—*n* = total number of samples, *xi* = current sample. 2—*μ*4 = *μ* of the fourth moment, *σ*4 = *SD* of the fourth moment. 3—*Q*3 = third quartile, *Q*1 = first quartile. 4—*pi*(*xi*) = probability of *xi*. It represents the average uncertainty of a random variable. 5—*μ*3 = third standardized moment. 6—*x*(*n*) = magnitude of bin number *n*.
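Several of the quantities named in these footnotes can be sketched with their standard textbook definitions, as below. The exact formulas of Table 2 are not reproduced here, so the sketch makes assumptions: population (divide by *n*) moments, base-2 logarithm for the entropy, probabilities estimated from the relative frequency of each value, and the default quantile method of Python's `statistics` module. All function names are hypothetical.

```python
import math
import statistics
from collections import Counter
from typing import Sequence


def central_moment(x: Sequence[float], order: int) -> float:
    """Population central moment of the given order."""
    mean = statistics.fmean(x)
    return sum((v - mean) ** order for v in x) / len(x)


def skewness(x: Sequence[float]) -> float:
    """Third standardized moment: mu_3 / sigma^3."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def kurtosis(x: Sequence[float]) -> float:
    """Fourth central moment over the squared variance: mu_4 / sigma^4."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2


def interquartile_range(x: Sequence[float]) -> float:
    """Q3 - Q1, using the default method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(x, n=4)
    return q3 - q1


def shannon_entropy(x: Sequence[float]) -> float:
    """-sum p(x_i) * log2 p(x_i), with p estimated from value frequencies."""
    counts = Counter(x)
    n = len(x)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

These helpers expect a segment with non-zero variance; a constant segment would lead to a division by zero in `skewness` and `kurtosis`.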

## *2.4. Feature Selection*

The next step consists of reducing the dimension of the feature sets and selecting the best model to describe the data. To accomplish this, a feature selection (FS) approach is applied to the three sets of features (day, night and full day), using 70% of the data for training the model and the remaining 30% for testing it [25].
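The 70/30 partition can be sketched as follows. The text does not state how the observations are assigned to each part, so the random shuffle, the fixed seed and the name `train_test_split` are assumptions of this sketch.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def train_test_split(rows: Sequence[T], train_fraction: float = 0.7,
                     seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle the rows and split them into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

The same function would be applied separately to the day, night and full-day feature sets, so that each set has its own training and testing partition.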

FS is implemented using the logistic regression (LR) classifier, since the outcome is binary (depressive episode, "1", or non-depressive episode, "0"). LR is therefore used to model the features selected by FS for the classification of depressive episodes. For simplicity, each feature is labeled with a number, as shown in Table 3.
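To make the binary setting concrete, the sketch below shows how a fitted LR model turns the selected features of one hourly segment into a class label. It covers prediction only: the coefficient values would come from training, the 0.5 decision threshold is an assumption, and the names `sigmoid`, `predict_proba` and `classify` are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(weights: Sequence[float], bias: float,
                  features: Sequence[float]) -> float:
    """Probability of class 1 (depressive episode) under a logistic model."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def classify(weights: Sequence[float], bias: float,
             features: Sequence[float], threshold: float = 0.5) -> int:
    """Return 1 (depressive) or 0 (not depressive); the threshold is assumed."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0
```

With all coefficients equal to zero the model returns a probability of 0.5 for any input, which is the uninformative baseline against which selected features add discriminative value.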


**Table 3.** Features extracted from the hourly motor activity segments.
