*5.5. Accuracies of the Supervised Classifiers*

The classification accuracies for the LA, AC, Po and PSCOM tasks are shown in Table 11. They were obtained by applying the leave-one-out and 10-fold cross-validation methods to the MLR, SVM and kNN classifiers. The accuracies refer to two different classification goals per task: discriminating PD from HC subjects (two-class classifier, binary problem) and classifying PD subjects into three UPDRS severity classes (three-class classifier, multiclass problem).
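As an illustration of the two validation schemes, the following minimal sketch runs a 3-nearest-neighbour classifier under leave-one-out and 10-fold cross-validation. It is not the implementation used in this study: the helper names (`knn_predict`, `k_fold_indices`, `cross_val_accuracy`), the Euclidean distance, the fixed shuffling seed and the toy data are all assumptions made for the example.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Squared Euclidean distance is sufficient for ranking the neighbours.
    order = sorted(range(len(train_X)),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(train_X[j], x)))
    votes = Counter(train_y[j] for j in order[:k])
    return votes.most_common(1)[0][0]

def k_fold_indices(n, n_folds, seed=0):
    """Shuffle the indices 0..n-1 and split them into n_folds disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]

def cross_val_accuracy(X, y, folds, k=3):
    """Fraction of samples classified correctly when each fold is held out in turn."""
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_X = [X[j] for j in range(len(X)) if j not in held_out]
        train_y = [y[j] for j in range(len(y)) if j not in held_out]
        correct += sum(knn_predict(train_X, train_y, X[j], k) == y[j]
                       for j in test_idx)
    return correct / len(X)

# Hypothetical toy data: two well-separated groups standing in for HC (0) and PD (1).
X = [(0.1 * i, 0.0) for i in range(10)] + [(5.0 + 0.1 * i, 1.0) for i in range(10)]
y = [0] * 10 + [1] * 10

loo_folds = [[j] for j in range(len(X))]        # leave-one-out: one sample per fold
ten_folds = k_fold_indices(len(X), n_folds=10)  # 10-fold: ten disjoint folds

loo_acc = cross_val_accuracy(X, y, loo_folds)
ten_fold_acc = cross_val_accuracy(X, y, ten_folds)
print(loo_acc, ten_fold_acc)
```

Leave-one-out is the special case in which every fold contains a single sample, so the same evaluation routine serves both schemes.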

The kNN with k = 3 and the SVM with polynomial degree d = 2 gave the best performance under leave-one-out cross-validation, so these values were chosen for the system classifiers. In general, the accuracies of the SVM classifiers are better than those of the kNN and MLR classifiers. Furthermore, the results for the binary problem of classifying HC and PD subjects are considerably better than those for the multiclass problem. This behavior was not unexpected because, in general, classifiers perform worse on the same training data as the number of classification labels (i.e., classes) increases. When reporting multiclass classification accuracy, it is more appropriate to indicate the per-class accuracy (fourth column for leave-one-out and sixth column for 10-fold in Table 11), in which the classification accuracies are averaged over the classes [62].

The absolute classification error (ec) was defined as the difference between the UPDRS class C assigned by the neurologists and the estimated class C' assigned by the system to each motor performance *i* (ec = |C*i* − C'*i*|). The ec value for the kNN and MLR classifiers is sometimes larger than 1 UPDRS class, even when their average accuracies are better than those of the SVM classifiers. In contrast, the ec value for the best SVM classifiers was never greater than 1 UPDRS class for any task; this means that the automatic assessments are always close to the neurologists' ones. For system reliability, this property is preferable to a higher average agreement accompanied by occasional large disagreements.
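The two quantities discussed above can be sketched as follows. The exact per-class accuracy formula is not given in the text, so the code assumes one common reading: a one-vs-rest accuracy is computed for each class and then averaged over the classes. Another frequent reading, the mean of the per-class recalls, would give different values. The function names and the label vectors are hypothetical and are not data from this study.

```python
def per_class_accuracy(true_classes, predicted_classes):
    """Average of one-vs-rest accuracies over the classes (assumed definition).

    For class c, a sample counts as correct when the true and predicted labels
    agree on "is c" versus "is not c".
    """
    classes = sorted(set(true_classes) | set(predicted_classes))
    n = len(true_classes)
    per_class = []
    for c in classes:
        agree = sum((t == c) == (p == c)
                    for t, p in zip(true_classes, predicted_classes))
        per_class.append(agree / n)
    return sum(per_class) / len(classes)

def absolute_class_errors(true_classes, predicted_classes):
    """ec_i = |C_i - C'_i| for each motor performance i."""
    return [abs(t - p) for t, p in zip(true_classes, predicted_classes)]

def overall_accuracy(true_classes, predicted_classes):
    """Plain fraction of exactly matching labels."""
    hits = sum(t == p for t, p in zip(true_classes, predicted_classes))
    return hits / len(true_classes)

# Hypothetical labels: UPDRS severity classes coded as 1, 2, 3.
neurologist = [1, 1, 2, 2, 3, 3]
classifier_a = [1, 1, 2, 2, 3, 1]  # a single miss, but two classes away
classifier_b = [1, 2, 2, 3, 3, 2]  # three misses, each one class away

for name, pred in (("A", classifier_a), ("B", classifier_b)):
    print(name,
          round(overall_accuracy(neurologist, pred), 3),
          round(per_class_accuracy(neurologist, pred), 3),
          max(absolute_class_errors(neurologist, pred)))
```

In this toy case classifier A has the higher accuracy but a maximum ec of 2, whereas classifier B is less accurate on average but never deviates by more than one class, which is the trade-off described in the text.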


**Table 11.** Classification accuracies for the supervised classifiers.

<sup>a</sup> 100 iterations.

In addition, the results in Table 11 show that the two-class accuracy is higher for LA and PSCOM, while it is slightly lower for the other two tasks. This agrees with Figure 8, in which the AC and Po graphs show more overlap between UPDRS classes than the LA and PSCOM ones. The partial incoherence of some parameters in separating the different classes probably has an impact on classifier performance. The behavior of the two-class accuracy is not repeated in the three-class case, for which the worst performance is obtained for the PSCOM task. This could again be due to the CoM parameters not being directly comparable to the PIGD subscale assessments. Looking at the error distribution, a large contribution comes from the UPDRS 3 class (i.e., the most impaired PD subjects). The limited number of observations assessed as UPDRS 3 suggests that some significant parameters that should have been considered are probably missing, and that they could be included among the selected features only by increasing the number of UPDRS 3 observations in the training set.
