In order to find the best method for detecting outlying segments within human physical activities, a series of classification experiments was carried out. These experiments were aimed at verifying the effectiveness of the classifiers in this task and at examining which factor influenced that effectiveness to the greatest extent. Each dataset was divided into two samples: a training sample and a testing sample. Standard training-to-testing proportions were used. The selection was carried out randomly, so that the segments marked as anomalous did not occur in one contiguous sequence but actually imitated outliers.
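The random train/test split described above can be sketched as follows. The 70/30 ratio, the seed, and the segment representation are illustrative assumptions; the text only states that standard proportions were used.

```python
import random

def split_segments(segments, train_frac=0.7, seed=42):
    """Randomly split activity segments into training and testing samples.

    The 70/30 ratio is an illustrative assumption; shuffling before the
    cut scatters anomalous segments so they do not form one sequence.
    """
    rng = random.Random(seed)
    shuffled = segments[:]           # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

# usage: ten dummy segment ids
train, test = split_segments(list(range(10)))
```

Because the split is random rather than contiguous, any anomalous segments end up interspersed among the normal ones, as the experimental design requires.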
6.1. Detecting Outlying Activities Using a Nested Binary Classifier
The first experiment used the nested binary classifier discussed in Section 4 at four levels. Tests were performed sequentially with the following classifiers: k-NN, logistic regression, the naive Bayes classifier, the CART decision tree, and the support vector machine (SVM). As described in Section 4, at each level a nested binary classifier learned to detect one activity. At the last level, all the remaining unclassified activities were indicated as the outlying activities that were searched for.
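The cascade mechanism can be sketched in a few lines. The per-level classifiers below are hypothetical stand-ins for the trained k-NN, SVM, or other models described in the text: each one answers whether a sample belongs to that level's activity, and whatever survives every level is reported as outlying.

```python
def nested_binary_cascade(classifiers, samples):
    """Apply a cascade of one-vs-rest binary classifiers.

    Each level accepts the samples its classifier recognizes and passes
    the rest on; samples rejected by every level are the outliers.
    """
    remaining = list(samples)
    per_level = []
    for is_activity in classifiers:              # levels I, II, III, IV
        accepted = [s for s in remaining if is_activity(s)]
        remaining = [s for s in remaining if not is_activity(s)]
        per_level.append(accepted)
    return per_level, remaining                  # remaining = outlying activities

# toy example: samples encoded as labels, perfect level classifiers
labels = ["sit", "walk", "down", "up", "stand", "sit", "run"]
cascade = [lambda s, a=a: s == a for a in ("sit", "walk", "down", "up")]
levels, outliers = nested_binary_cascade(cascade, labels)
```

With imperfect real classifiers, samples wrongly rejected at every level accumulate in `remaining`, which is exactly the overmatching effect analyzed in the experiments below.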
For the Inertia dataset (Dataset No. 1), four activities were classified: sitting, walking, going down the stairs, and going up the stairs. The fifth activity, which was unknown to the classifier, constituted the outlying activity. Five actual outliers occurred in this set.
The nested k-NN algorithm (NestBC_k-NN) reached a high match level.
Table 3 lists the evaluation measures for all nested classifiers at each level of nesting. A decrease in accuracy was noted for two activities, namely going down the stairs and going up the stairs, which is due to the high similarity between these activities. The nested k-NN classifier left eleven samples as unknown, of which three were true anomalies; the others belonged to other activities. The two missing samples were mistakenly labeled as going up the stairs. The decision tree, as expected, fared much better, although only three out of five true anomalous segments were detected in the last nesting. The other two actually belonged to the previous activity. One outlier activity was misclassified.
For the nested SVM classifier (NestBC_SVM), a decrease in accuracy was observed at nesting levels III and IV for the last two activities. In this case, however, it was much more significant. It is likely that the model was overtrained towards the anomalous class due to the large number of descriptive attributes, to which this algorithm is sensitive. Among the 23 samples diagnosed as outliers, NestBC_SVM detected all true outlying activities; the remaining 17 belonged to other activities, and as many as 11 of these came from the main class of the last nested model.
In the case of the nested naive Bayes classifier (NestBC_NB), so-called undertraining of the model was observed. The confusion matrix is illustrated in Figure 1.
At level I, the algorithm coped very well with recognizing the numerous sitting class. However, as soon as more dynamic activities appeared, its effectiveness began to decrease significantly, and the algorithm had great difficulty distinguishing them. The reason for this is probably the small number of samples available during training. It should be emphasized, however, that all five anomalous segments were found by the final nesting. Interestingly, the remaining four samples belonged to the walking activity, and these were thus the only ones misclassified at the first level. The algorithm correctly detected all fragments of going down the stairs, but it also classified all going-up segments as going down. This reinforces the assumption that the model was initially unable to distinguish between the two activities.
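A confusion matrix like the one in Figure 1 can be assembled in a few lines. The toy labels below are illustrative, mirroring the going-down/going-up confusion described above, where every going-up segment is labeled as going down.

```python
def confusion_matrix(y_true, y_pred, classes):
    """Build a confusion matrix as a nested dict: counts[true][pred]."""
    counts = {t: {p: 0 for p in classes} for t in classes}
    for t, p in zip(y_true, y_pred):
        counts[t][p] += 1
    return counts

# toy example: all 'up' segments mislabeled as 'down'
y_true = ["down", "down", "up", "up"]
y_pred = ["down", "down", "down", "down"]
cm = confusion_matrix(y_true, y_pred, ["down", "up"])
```

Reading the matrix row by row (true class) against its columns (predicted class) makes the asymmetric stairs confusion immediately visible: the `up` row has all of its mass in the `down` column.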
In the case of the nested logistic regression model, the confusion matrix illustrated in Figure 2 shows a small number of misclassified samples at each level. The model achieved the best results for all of the calculated classifier-performance measures given in Table 3. However, only three of the five true anomalous observations were detected; the other two were classified as walking. The anomalous class was dominated by the more numerous going-up-the-stairs class.
For the second dataset, WISDM (Dataset No. 2), the following four activities were considered: sitting, walking, going down the stairs, and going up the stairs. Standing and running were the outlier activities. The number of actual outlier activities was 5219. As before, experiments were performed for each classifier. The evaluation measures of the nested classifiers for the WISDM dataset are given in Table 4.
With each successive nesting, the accuracy of the model slightly decreased, but it still remained at a high level. In the last phase, 5287 activities were flagged as anomalous, of which 5158 actually belonged to this class; the remaining 129 were misclassified activities.
The nested decision tree algorithm performed slightly worse than the nested k-NN model, although this time it showed a significant improvement compared to the previous dataset: fewer static-activity segments were classified incorrectly than for the Inertia dataset. However, the number of false positive and false negative observations among the dynamic activities increased. Accuracy did not fall below a high level in any of the nestings, and precision and sensitivity also remained high. Specificity was likewise very high. The exact values of all assessment measures are given in Table 4. In total, 5105 true outlier cases were found, while 663 samples were overmatched.
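The four evaluation measures reported in the tables can be computed directly from the binary confusion counts of a single nesting level. The counts in the usage line are illustrative, not taken from the experiments.

```python
def binary_measures(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall), and specificity for one
    nesting level, computed from binary confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy":    (tp + tn) / total,   # all correct / all samples
        "precision":   tp / (tp + fp),      # flagged positives that are real
        "sensitivity": tp / (tp + fn),      # real positives that were found
        "specificity": tn / (tn + fp),      # real negatives kept negative
    }

# illustrative counts for one level
m = binary_measures(tp=90, fp=10, fn=5, tn=95)
```

Reporting all four measures per level, as Tables 3 through 5 do, matters here because the class sizes are highly unbalanced: accuracy alone can stay high even when a small anomalous class is handled poorly.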
For the WISDM dataset, the nested support vector machine selected the sitting segments at the first level without major problems. With the emergence of the outlier activity, its effectiveness decreased; at levels III and IV, the anomalous class was clearly dominant. In the end, 5173 anomalies were correctly classified, but a huge share, as many as 2829 samples, were matched incorrectly by the nested SVM classifier.

For the nested naive Bayes classifier, it was noticed that the model was guided by the predominance of the most numerous class (again, the anomalous one). A total of 4221 valid anomalous samples were detected, while the missing 998 were probably lost in the first two nestings.

The nested logistic regression model performed very well on the WISDM dataset, and accuracy remained at a high level in every nesting. It was noticed, however, that the model coped worse with activities that involve movement: the numbers of incorrectly classified samples were largest at the second, third, and fourth levels. In the end, 4942 real outliers were detected, while the remaining 2309 flagged samples belonged to other classes.
For the third analyzed dataset of human movement activities, UCI (Dataset No. 3), the correct activities were sitting, walking, going down the stairs, and going up the stairs; standing and lying down were the outlier activities. The number of actual outlier activities was 1069. The nested binary k-nearest neighbors algorithm coped very well with this last dataset. It is worth noting that even distinguishing the very similar activities at the last two levels was not a major problem: very small numbers of incorrectly diagnosed segments were obtained (nesting I: 37, II: 14, III: 16, IV: 42). Accuracy, precision, sensitivity, and specificity all remained within a narrow, high range. The evaluation measures of the nested classifiers at particular levels are given in Table 5. Accuracy stayed high at every level of nesting, even with the dominance of the anomalous class. In the end, 1032 of the real outliers were correctly detected, and 130 samples were overmatched.
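From the counts reported for the nested k-NN on the UCI dataset, the precision and recall of the outlier detection itself follow directly; this short calculation only restates those figures.

```python
# Outlier-detection quality of the nested k-NN on the UCI dataset,
# using the counts reported in the text: 1032 true outliers detected,
# 130 samples overmatched (falsely flagged), 1069 actual outliers.
true_detected = 1032
overmatched = 130
actual_outliers = 1069

precision = true_detected / (true_detected + overmatched)  # ≈ 0.89
recall = true_detected / actual_outliers                   # ≈ 0.97
```

The same two ratios can be computed for each classifier and dataset from the detected/overmatched counts quoted throughout this section, which makes the trade-off between finding true outliers and generating spurious ones easy to compare.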
For the nested CART decision tree, 998 of the 1069 anomalous segments were classified correctly, and 367 observations were overmatched. According to Table 5, the lowest accuracy value was obtained in the last nesting.

The nested binary support vector machine turned out to be much more efficient than in the case of the WISDM dataset: the nested SVM model correctly diagnosed 1041 anomalous segments, whereas 131 fragments were incorrectly assigned to this class, and accuracy never fell to a low level.

The nested naive Bayes classifier behaved in a rather surprising way on the UCI dataset. For the Inertia and WISDM datasets, its accuracy decreased with subsequent nestings; here, the first level was characterized by the lowest percentage of correctly matched labels, while the last level was characterized by the highest. Nevertheless, the matching was severely off: only 78 true outlier segments were detected, alongside only 18 incorrectly flagged segments.

For the nested logistic regression model, 1056 anomalies were detected; only 134 of the finally obtained outliers were incorrect. The evaluation measures of the nested logistic regression model (Table 5) confirm its high quality for the detection of outlier activities. Accuracy remained high at every level. Precision had its biggest drop at the second level, contrary to sensitivity, which then reached its highest value. No major differences were observed between individual nestings.
In conclusion, the nested classifier works correctly and can be used to detect anomalous activities in phases of human movement. Its disadvantage, however, is that it can generate many additional, spurious outliers at the same time, which is certainly not a desirable phenomenon.