1. Introduction
Falls are one of the leading causes of death among the elderly [1]. Approximately 28–38% of people over 65 suffer a fall each year [2]. Falls can result in bruises and swelling, as well as fractures and trauma [3]. In addition to the physical consequences, the fear of falling can impact the elderly's quality of life: it is associated with a decline in physical and mental health and an increased risk of falling [4]. Therefore, falls and fall-related injuries are major healthcare challenges to overcome.
Many studies have tried to improve the physical performance of the elderly through rehabilitation programs that help prevent falls. Røyset et al. [5] conducted a fall prevention program using the Norwegian version of the fall risk assessment tool STRATIFY (score 0–5), but achieved no significant improvement over the control group during a short stay in an orthopedic department. Gürler et al. [6] proposed a recurrent fall prevention program including assessment of fall risk factors, education on falls and home modification; this program was effective in reducing fall-related risk factors and increasing fall knowledge. Palestra et al. [7] presented a rehabilitation system based on a customizable exergame protocol (KINOPTIM) to prevent falls in the elderly. After 6 months of training, postural response performance improved by an average of 80%.
Prevention of falls through long-term rehabilitation programs is important for improving the elderly's quality of life, but preparing for the moment a fall occurs is equally important. Falls may have serious consequences; however, most of these consequences are not directly attributable to the falls themselves, but to the lack of timely assistance and treatment [8]. Vellas et al. [9] reported that 70% of older adults who had fallen at home were unable to get up unaided, and that more than 20% of patients admitted to hospital as a result of a fall had been on the ground for an hour or more. Moreover, 50% of people affected by falls who remain unassisted for more than an hour die within the six months following the accident [10]. A fall-detection algorithm that detects and reports falls as quickly as possible is therefore important.
In general, inertial measurement unit (IMU) sensors have been used for fall detection, with threshold-based methods being the most common [11,12,13]. These methods are advantageous because of their low computational cost and can detect falls before the impact occurs; however, rapid movements can sometimes be misrecognized as falls. Machine learning-based algorithms require a relatively long computational time but can distinguish similar actions accurately, and they have been reported to outperform threshold-based algorithms [14]. For this reason, threshold-based algorithms are mainly used for pre-impact fall detection, e.g., to trigger wearable airbags before impact [15], whereas machine learning-based algorithms are used for post-fall detection.
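To illustrate why the threshold approach is computationally cheap, the pre-impact logic reduces to a single comparison against the acceleration magnitude. The sketch below is illustrative only; the 2.5 g threshold, sampling rate and function name are assumptions, not values from the cited studies:

```python
import numpy as np

def detect_fall_threshold(acc, g_threshold=2.5, fs=100):
    """Flag a fall when the acceleration magnitude exceeds a fixed
    threshold (in g). Threshold and sampling rate are illustrative,
    not taken from the cited studies. acc: (N, 3) array in m/s^2."""
    mag = np.linalg.norm(acc, axis=1) / 9.81   # magnitude in units of g
    impacts = np.where(mag > g_threshold)[0]
    if impacts.size == 0:
        return None                            # no threshold crossing
    return impacts[0] / fs                     # time (s) of first crossing
```

Because only one magnitude and one comparison are computed per frame, such a detector can run in real time on a wearable device, which is what makes it suitable for pre-impact airbag triggering.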
Several classifiers, such as the support vector machine (SVM), k-nearest neighbor (k-NN), naïve Bayes (NB), least squares method (LSM), artificial neural network (ANN) and others, have been used for post-fall detection, and researchers have compared them to determine which are most suitable [16,17,18]. Vallabh et al. [16] used smartphones placed in the trouser pockets to distinguish seven activities of daily living (ADLs) and four falls. They extracted data within the interval of acceleration between −20 m/s² and 20 m/s², and computed feature vectors, such as the mean, median and skewness, from this interval. Five classifiers were compared: LSM (75.4%) < NB (80.0%) < ANN (85.9%) < SVM (86.8%) < k-NN (87.5%). Özdemir et al. [17] used six IMU sensors to distinguish 16 ADLs and 20 falls. They extracted data within a 2 s window around the impact, and computed feature vectors, such as the mean, variance, skewness and kurtosis, from this interval. Six classifiers were compared: ANN (95.68%) < dynamic time warping (DTW) (97.85%) < Bayesian decision making (BDM) (99.26%) < SVM (99.48%) < LSM (99.65%) < k-NN (99.91%). Gibson et al. [18] used one IMU sensor on the chest. They extracted data within a 2 s window around the impact and used wavelet acceleration signal coefficients as feature vectors. Five classifiers were compared: ANN (92.2%) < probabilistic principal component analysis (PPCA) (92.2%) < linear discriminant analysis (LDA) (94.7%) < radial basis function (RBF) (95.0%) < k-NN (97.5%). These studies defined specific data sections and extracted discrete-type feature vectors by compressing multiple frames into one. In addition, ANN exhibited poor performance compared with the other classifiers.
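The discrete feature-extraction step shared by these studies can be sketched as follows: a window of samples is compressed into a single vector of summary statistics. The exact feature sets differ per study, so this helper is an illustrative assumption rather than any one paper's implementation:

```python
import numpy as np

def discrete_features(window):
    """Compress a window of acceleration magnitudes into one discrete
    feature vector: [mean, variance, skewness, excess kurtosis].
    Illustrative only; the cited studies use differing feature sets."""
    x = np.asarray(window, dtype=float)
    mu, sigma = x.mean(), x.std()
    z = (x - mu) / sigma                      # standardized samples
    return np.array([mu,                      # mean
                     x.var(),                 # variance
                     np.mean(z ** 3),         # skewness
                     np.mean(z ** 4) - 3.0])  # excess kurtosis
```

The key property, noted in the text, is that the time dimension is discarded: many frames collapse into one fixed-length vector, regardless of window length.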
Conversely, some studies have reported good post-fall detection performance with ANN [19,20]. Yodpijit et al. [19] used one IMU sensor on the waist to distinguish four ADLs and one fall. They used 128 data samples around the impact and extracted the magnitudes of the vectorial sums of the acceleration and the angular velocity as feature vectors. Their ANN algorithm, fused with a threshold, achieved an accuracy of 99.23%. Yoo et al. [20] used one IMU sensor on the wrist to distinguish six ADLs and one fall. All trials were unified to 175 values based on the longest trial. The signal magnitude vector (SMV) of the acceleration and the raw acceleration values were used as feature vectors. They used only an ANN and achieved an accuracy of 100%. These studies [19,20] defined a specific data section and extracted time-series feature vectors for the ANN classifier. Even though a good performance was achieved, it depended on the subjects, motions and classifiers; therefore, direct comparison among different studies is relatively difficult.
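A minimal sketch of the SMV-based time-series features described above: the per-frame magnitude is computed and every trial is brought to a common length so the classifier sees fixed-size inputs. The length 175 follows Yoo et al. [20], but the zero-padding scheme and function name are assumptions:

```python
import numpy as np

def smv_series(acc, target_len=175):
    """Signal magnitude vector (SMV) per frame, unified to a fixed
    length. target_len=175 follows Yoo et al.; zero-padding of short
    trials is an assumption. acc: (N, 3) acceleration array."""
    smv = np.sqrt((np.asarray(acc, dtype=float) ** 2).sum(axis=1))
    if smv.size >= target_len:
        return smv[:target_len]               # truncate long trials
    return np.pad(smv, (0, target_len - smv.size))  # zero-pad short trials
```

Unlike the discrete statistics above, this representation preserves the temporal shape of the motion, which is the distinguishing property of the time-series feature vectors used with ANN.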
In general, part of a dataset is used to develop an algorithm, and the remaining data are used to test it. However, some previous studies tested on completely different datasets for a more rigorous evaluation, referred to here as cross-dataset evaluation. Cao et al. [21] and Delgado-Escaño et al. [22] used different datasets for training and testing. Cao et al. [21] suggested an adaptive action detection algorithm for human video with high accuracy (95.02%) and used four different datasets to generalize the action detection model. Delgado-Escaño et al. [22] presented a new cross-dataset classifier based on a deep architecture and a k-NN classifier for fall detection and people identification, and tested it on four different public IMU datasets. Evaluation through a cross-dataset is necessary before an algorithm can be applied to real situations.
In this study, the performance of post-fall detection algorithms was evaluated according to the classifier (ANN and SVM) and the feature vector type (time-series and discrete data) when untrained motions were used as test data. Some previous studies [19,20], using an ANN alone, showed a high accuracy of over 99%, but others [16,17,18], comparing ANN with traditional classifiers, reported relatively low ANN accuracy. The accuracy of an algorithm depends on the subjects, motions and classifiers, and therefore a direct comparison among different studies is relatively difficult. SVM was selected as a representative traditional classifier to compare with ANN, since it has been used frequently in other studies and has shown good performance in fall detection. The SisFall dataset [23] was used for the cross-dataset evaluation. In addition, four different processing conditions (normalization, equalization, an increase in the number of training data and additional training with external data) were applied to determine their effect on the performance of the two classifiers.
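Two of the four processing conditions can be sketched as follows, assuming min-max normalization fitted on the training set only and class equalization by random undersampling; the study's exact implementation of both steps may differ:

```python
import numpy as np

def normalize(train, test):
    """Min-max normalization fitted on training data only, then applied
    to both sets. Min-max (vs. z-score) is an assumption here."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)   # guard constant features
    return (train - lo) / scale, (test - lo) / scale

def equalize(X, y, seed=0):
    """Balance classes by random undersampling to the smallest class;
    the equalization scheme used in the study is an assumption."""
    rng = np.random.default_rng(seed)
    n = min(np.bincount(y))
    idx = np.concatenate([rng.choice(np.where(y == c)[0], n, replace=False)
                          for c in np.unique(y)])
    return X[idx], y[idx]
```

Fitting the normalization on the training set alone matters particularly for the cross-dataset case, where the external test data may occupy a different range than the training data.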
3. Results
Table 3 shows the performance of the ANN and SVM classifiers in the internal and external tests according to the feature vectors and the different processing conditions. When no processing was used, the following characteristics were observed. For ANN, SWF showed the highest accuracy in both the internal and external tests. For SVM, raw data exhibited the highest accuracy in the internal test, and IDWF exhibited the highest accuracy in the external test. Notably, SWF performed poorly with SVM, while it generally performed well with ANN. Furthermore, both ANN and SVM performed well when only raw data were used in the internal test; however, performance in the external test was better when feature vectors were used.
For the internal test, normalization of the feature vectors decreased performance for ANN but increased it for SVM. For the external test with ANN, raw data and IDWF showed increased performance, but SWF showed decreased performance; with SVM, raw data showed decreased performance, but SWF and IDWF showed increased performance. The equalization processing was not as effective as normalization: in most cases, the sensitivity increased, but the specificity decreased.
When the number of training data was increased, ANN performance decreased in the internal test but increased in the external test. For SVM, all feature vectors exhibited increased performance in the internal test; in the external test, however, the performance of SWF increased while that of raw data and IDWF decreased. Finally, the result of additional training with external data was compared with that of increasing the number of training data: for both ANN and SVM, the overall performance decreased in the internal test but increased in the external test.
False alarms were compared between increasing the number of training data from our laboratory and additional training with external data, as shown in Table 4. The top two or three false alarms are listed in order. The following major false alarms were detected when the number of training data was increased: (a) ADLs with a rapid change in the body COM (YD05, 11, 06, 07; SD05, 06, 11, 13, 16, 17), (b) fall motions with a slow change in the body COM (YF04, 05; SF10, 11, 12, 14, 15), (c) lateral lying motions (SD12, 13, 14) and (d) some lateral falls (SF03, 12, 15).
Table 5 and Table 6 present the false alarms in the external test with ANN and SWF. The major false alarms when the number of training data was increased came from significant lateral motions (SD12, 13, 14 and SF03, 12). The additional training with external data significantly reduced those major false alarms. However, some ADLs involving a rapid change in the body COM (SD13, 16) and fall motions involving a slow change in the body COM (SF11, 14) were still falsely detected.