**4. Discussion**

This section discusses the results obtained at each stage of this work. First, the features extracted from the data undergo feature selection to remove redundant or non-significant features, preserving those that contribute most to describing depressive subjects at different times of the day. Next, classification is carried out by modeling, for each time of day, the feature set that performed best in the previous step. Finally, a validation is applied to statistically evaluate the performance of the resulting models.

According to the validation values shown in the previous section, the feature selection and classification stages yielded statistically significant results.

The feature selection step produced a series of feature sets along with their calculated accuracies, shown in Table 4. The main objective of this step is to select the smallest set of features that achieves the best accuracy. As can be seen, in most cases the accuracy increases each time a feature is added to the set. However, for all three data sets the accuracy stops increasing when the tenth feature is added, and it even decreases for the Day Data and Full Day Data sets. For the Night Data and the Full Day Data, the best accuracy is obtained with the set of nine features (0.7818 and 0.7792, respectively), while for the Day Data the best accuracy is obtained with the set of eight features (0.7744). Overall, across all feature sets selected for the three data sets, the best accuracy is obtained with the nine-feature set of the Night Data. Based on this, and in order to allow a direct comparison, the nine-feature sets were selected as the best for each data set. The features included in these sets are described in Table 5.
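The selection rule used in this step (the smallest feature set that reaches the best accuracy) can be illustrated with a minimal Python sketch. The function name `smallest_best_subset` and the accuracy curves are hypothetical and serve only as an illustration: only the values 0.7818 (Night Data, nine features) and 0.7744 (Day Data, eight features) come from the text, and all remaining values are placeholders, not the results of Table 4.

```python
from typing import Sequence, Tuple


def smallest_best_subset(accuracies: Sequence[float], tol: float = 0.0) -> Tuple[int, float]:
    """Return (number_of_features, accuracy) for the smallest feature set
    whose accuracy is within `tol` of the best accuracy on the curve.

    accuracies[i] is the accuracy of the set containing i + 1 features.
    """
    if not accuracies:
        raise ValueError("accuracies must not be empty")
    if tol < 0:
        raise ValueError("tol must be non-negative")
    best = max(accuracies)
    for n_features, acc in enumerate(accuracies, start=1):
        # The first set that reaches the best accuracy is also the smallest one.
        if acc >= best - tol:
            return n_features, acc
    raise RuntimeError("unreachable: the maximum is always found")


# Placeholder curves (hypothetical), except the reported best accuracies.
# Night Data: accuracy stops increasing at the tenth feature.
night_curve = [0.70, 0.72, 0.73, 0.74, 0.75, 0.76, 0.77, 0.775, 0.7818, 0.7818]
# Day Data: accuracy peaks at eight features and then decreases.
day_curve = [0.69, 0.71, 0.72, 0.73, 0.74, 0.75, 0.76, 0.7744, 0.7740, 0.7700]

print(smallest_best_subset(night_curve))
print(smallest_best_subset(day_curve))
```

With these placeholder curves the rule returns nine features for the Night Data and eight for the Day Data. The `tol` parameter is an assumption added for illustration: a small positive tolerance would favor an even smaller set when the accuracy gain from extra features is negligible.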

Comparing the best nine-feature sets shown in Table 5, it can be observed that the maximum (time) value is selected in the feature selection (FS) only for the Night Data. This may be due to sleep disorders that make patients with depression more active at night, which makes it possible to distinguish their level of motor activity from that of a person without this condition, whose nighttime activity is usually lower.

Another important detail in the description of the feature sets concerns the frequency-related features, since the same features in this domain were selected for all three data sets. This indicates that the information provided by these features is robust and generalizes well: regardless of the time of day, it is possible to identify subjects with depression from the activity levels that occur.

The next step is to model the feature set selected for each data set in a classification task based on the RF technique. For this purpose, the nine-feature set selected for each data set was used for modeling. In addition, since the nine-feature set of the Night Data had the best accuracy of all the selected feature sets, a further classification was performed on the Full Day Data and the Day Data using this nine-feature set. To avoid confusion, these classifications are labeled Best Model Full Day and Best Model Day, for the Full Day Data and the Day Data, respectively.

It is important to mention that RF was selected for the classification because it has been used to classify the motor activity of depressed subjects in other works. Zanella-Calzada et al. [22] present the classification of depressive and non-depressive episodes using RF, obtaining an accuracy of 0.893, while M. Pal et al. [32] compare the performance of RF and SVM, finding RF to be more efficient even though it requires fewer parameters to perform the classification.

To assess the classification, the TP, TN, FP, FN, sensitivity, specificity, PPV, NPV and accuracy metrics were computed, with the results shown in Table 6. First, it can be seen that the FP and FN values, which are the incorrectly classified subjects, are small compared with the TP and TN values, which are the correctly classified conditions and controls, respectively. An important point is that, for the Day Data and Full Day Data, fewer FP and FN are obtained when the classification uses the nine features selected for the Night Data, as can be seen in Best Model Day and Best Model Full Day. Overall, however, the lowest number of FP and FN is obtained on the Night Data set. This shows that these nine features, specifically in this period of the day, capture motor activity levels that allow depressive subjects to be identified, reducing the ambiguities present in the activity levels of the Day Data and Full Day Data sets.
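For reference, the derived metrics follow from the four confusion-matrix counts. The following minimal Python sketch assumes the standard definitions of sensitivity, specificity, PPV, NPV and accuracy; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from Table 6.

```python
import math
from typing import Dict


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division: returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute the derived metrics from confusion-matrix counts.

    tp: conditions correctly classified, tn: controls correctly classified,
    fp: controls classified as conditions, fn: conditions classified as controls.
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
        "npv": _ratio(tn, tn + fn),           # negative predictive value
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
    }


# Hypothetical counts, for illustration only.
example = confusion_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Because FP appears only in the denominators of specificity and PPV, and FN only in those of sensitivity and NPV, a reduction in either count raises the corresponding pair of metrics as well as the accuracy.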

For the remaining metrics, the highest values were obtained using the Night Data feature set. Comparing the accuracy obtained by classifying the Day Data with its own nine selected features against the accuracy obtained on the same data set using the nine features of the Night Data (Best Model), an increase of 0.45% can be observed. The Full Day Data shows the same behavior, with an increase in accuracy of 0.43% when the nine features of the Night Data are used. For the Night Data, the classification accuracy is almost perfect, reaching 99.72%.

Based on this, the great contribution of the feature set selected for the Night Data can be noted, and specifically that of the maximum (time) feature, since it is the main difference between this feature set and the other two, and it contributes to improved performance not only on the Night Data set but on all sets. The maximum (time) feature represents the highest activity level recorded, and specifically at night it allows depressive subjects to be identified almost perfectly. This may be because, according to Armitage et al. [33], around 80% of patients diagnosed with MDD suffer from sleep disorders. This represents an important difference in circadian rhythm between patients suffering from depression and healthy persons, causing depressive subjects to have more motor activity. In Figure 2 it can be noticed that, even at an hour when both control and condition subjects are usually asleep (4 a.m.), the patient suffering from MDD shows more disturbances than the control subject.

Finally, as mentioned above, the best results are obtained using the Night Data set with the nine features specifically selected for it. The lowest results are obtained using the Day Data set, although these values increase if the nine features selected for the Night Data are used. For the Full Day Data set, the values are intermediate between those of the other two data sets. This indicates that the ambiguities in the classification of depressive subjects are greater during the day than during the night. This may be because during the day people regularly must carry out daily activities, such as work or studies, regardless of whether they suffer from depression. At night, by contrast, the presence of this condition may be more evident due to the sleep irregularities it can cause, whereas people who do not suffer from depression can sleep more quietly and therefore show much lower activity levels.
