First, we will compare our results to the existing literature. Then, we will discuss the results of the lab study, including a detailed evaluation of the individual algorithms with respect to their accuracy and their advantages and disadvantages in a smart shoe scenario. Special emphasis will be placed on the number of sensors that are needed to run the algorithms and the underlying power consumption of these sensors. Finally, the results of the field study on the tartan track will be discussed.
4.2. Lab Study
In this section, we will discuss the results of the lab study for each algorithm in detail with respect to their advantages/disadvantages and the number of sensors needed for their implementation.
Stride time: The Stride time algorithm yields the lowest accuracy on the lab study dataset. Even though stride time and stride length relative to the subject's height correlate non-linearly, the correlation does not seem to be high enough to compute velocity and stride length accurately. The low correlation is also visible in Figure 10. The gray dots are the relative stride length values obtained from the lab study dataset, and the red line is the step function for male subjects defined in Table 3a. We see that the step function does not approximate the underlying data accurately: the standard deviation of the relative stride length within a single stride time range of the step function is high. This is because runners control velocity via both stride frequency and stride length, which the Stride time algorithm cannot capture, as it depends on stride frequency alone.
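The core of the Stride time algorithm can be sketched in a few lines. Note that the bin boundaries and relative stride length values below are illustrative placeholders, not the actual entries of Table 3a:

```python
# Illustrative step function: (upper stride time bound in s, stride length / body height).
# The real bin edges and values come from Table 3a and differ per sex.
STEP_FUNCTION = [
    (0.80, 1.15),  # slowest bin
    (0.70, 1.35),
    (0.60, 1.60),  # fastest bin
]

def stride_length_and_velocity(stride_time_s, height_m):
    """Look up the relative stride length for a stride time via the step
    function, scale by body height, and derive the average velocity."""
    rel_len = STEP_FUNCTION[0][1]  # default: slowest bin
    for upper_bound, value in STEP_FUNCTION:
        if stride_time_s <= upper_bound:
            rel_len = value  # a tighter bound matched, take its value
    stride_length = rel_len * height_m
    return stride_length, stride_length / stride_time_s
```

Because the only per-stride input is its duration, this lookup cannot distinguish a runner who lengthens strides from one who quickens them, which is exactly the limitation discussed above.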
In the Bland–Altman plots for the stride length metric (Figure 8), the other three algorithms showed overlapping sample clouds. This indicates that people reached higher velocities both by increasing their stride length and by decreasing their stride time. That the other algorithms can handle this effect is evident from the separated sample clouds in their Bland–Altman plots of the velocity metric, which is not observable in the plots for the Stride time algorithm. Thus, the other algorithms deal better with velocity control via stride frequency and stride length.
Furthermore, we want to briefly discuss the shape of the Stride time algorithm's Bland–Altman plots. The long diagonal lines in the plots (Figure 8b) originate from the steps in the step function introduced in Table 3; one line belongs to one stride time range. The small deviations within the diagonals originate from the different body heights. We observed that for some stride time ranges, the gold standard velocity ranged from 2–6 m/s (color coded within one diagonal), showing that the stride time ranges of the step function obtained from the literature do not generalize well. Furthermore, the relative stride lengths presented in Table 3 are averaged over specific study populations. Even if a subject controls their stride frequency in exactly the manner encoded by the stride time ranges of the step function, the resulting stride length could be incorrect due to an incorrect relative stride length.
Despite its low accuracy, an advantage of the Stride time algorithm is that it can be implemented in a very energy-efficient manner. In an IMU scenario, only a stride segmentation is necessary to compute the stride time. The stride segmentation presented in this paper only requires sampling the acceleration in the dorsoventral direction; thus, a 1D-accelerometer would be sufficient. In fact, strides could even be segmented without an IMU using sensors such as piezoelectric switches to detect ground contact [40].
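To illustrate how little sensing this requires, here is a minimal stride segmentation sketch on a single dorsoventral acceleration channel. The threshold and minimum peak gap are assumed values, not the parameters used in this work:

```python
def stride_times(acc_dv, fs_hz, threshold=3.0, min_gap_s=0.4):
    """Detect one characteristic acceleration peak per stride (e.g., initial
    contact) and return the durations between consecutive peaks in seconds."""
    min_gap = int(min_gap_s * fs_hz)
    peaks = []
    for i in range(1, len(acc_dv) - 1):
        is_local_max = acc_dv[i] > acc_dv[i - 1] and acc_dv[i] >= acc_dv[i + 1]
        if is_local_max and acc_dv[i] >= threshold:
            if not peaks or i - peaks[-1] >= min_gap:  # debounce within one stride
                peaks.append(i)
    return [(b - a) / fs_hz for a, b in zip(peaks, peaks[1:])]
```

A piezoelectric ground-contact switch would replace the peak detection with a simple edge detection on a binary signal, reducing the energy demand even further.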
Acceleration: The plot of the ME for the different velocity ranges in Figure 7 shows that the Acceleration algorithm works best for the two velocity ranges from 3–4 m/s and 4–5 m/s. In addition, the Bland–Altman plots in Figure 8 show outliers, especially in the highest velocity range, for both the stride length and the average velocity. The reason can be observed in Figure 3: the second-degree polynomial used to map the velocity integration value to the velocity approximates the reference data better in the range from 3–5 m/s and poorly in the highest velocity range. This can be explained by the spread of the underlying data being too large to be represented by the polynomial.
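The mapping itself is an ordinary least-squares fit of a quadratic to (integration value, reference velocity) pairs. A dependency-free sketch of such a fit follows; the actual coefficients in this work are determined by the reference data shown in Figure 3:

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x**2 + b*x + c via the normal equations."""
    n = len(xs)
    s = [sum(x ** k for x in xs) for k in range(5)]              # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the coefficients [a, b, c].
    A = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], n,    t[0]]]
    for col in range(3):                                         # forward elimination
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            A[r] = [v - f * w for v, w in zip(A[r], A[col])]
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                                          # back substitution
        coeffs[r] = (A[r][3] - sum(A[r][c] * coeffs[c]
                                   for c in range(r + 1, 3))) / A[r][r]
    return coeffs  # [a, b, c]
```

This also makes the failure mode above plain: a single quadratic can only follow the data where its spread is moderate; where the high-velocity samples scatter widely, the fit averages over them and produces outliers.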
However, the Acceleration algorithm outperforms the Stride time algorithm and shows comparable performance to the Deep Learning algorithm for the velocity range of 3–4 m/s. The advantage of the Acceleration algorithm over the better performing Trajectory and slightly better performing Deep Learning algorithms is its energy efficiency. For the computation of the stride length and the velocity, only a triaxial accelerometer needs to be sampled, which consumes less energy than sampling the gyroscope or both sensors. For example, for the MPU9250 from InvenSense, the supply current needed for sampling only the accelerometer is less than 15% of the current needed for sampling both the accelerometer and the gyroscope [41]. Furthermore, the sampling rate can be reduced for the Acceleration algorithm [6]. We tested this reduction on the lab study dataset and observed that a reduction to 60 Hz does not affect the algorithm's accuracy; such a low sampling rate reduces the energy consumption even further. Another advantage of the algorithm is its generalizability and its applicability to other movements like side stepping [6].
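The robustness to a 60 Hz rate can be prototyped offline by decimating the recorded signal before feeding it to the algorithm. A naive sketch (a production version would low-pass filter first to avoid aliasing):

```python
def decimate(signal, fs_in_hz, fs_out_hz):
    """Reduce the sampling rate by an integer factor by keeping every
    n-th sample; requires fs_in_hz to be a multiple of fs_out_hz."""
    if fs_in_hz % fs_out_hz != 0:
        raise ValueError("fs_in_hz must be an integer multiple of fs_out_hz")
    return signal[:: fs_in_hz // fs_out_hz]
```

On the embedded device itself, the same saving is obtained more directly by configuring a lower output data rate on the accelerometer.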
Foot trajectory: The Trajectory algorithm performs best on the lab study dataset. For velocities up to 5 m/s in particular, the algorithm achieves an ME of less than 0.012 m for the stride length and 0.014 m/s for the average velocity. For velocities above 5 m/s, the accuracy drops; in the Bland–Altman plots (Figure 8e,f), outliers are visible in this velocity range. The zero-velocity update, based on detecting the minimum energy in the gyroscope signal, is error prone at such high velocities: the foot has no real zero-velocity phase and is always in motion, so the underlying zero-velocity assumption does not hold. One way to improve the algorithm for higher running velocities is a better estimate of the initial condition. Future work could evaluate whether a regression model based on the velocity during the swing phase would provide a better initial condition.
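A minimal version of the minimum-energy search behind the zero-velocity update looks as follows (the window length is an assumed parameter):

```python
def zero_velocity_index(gyr_norm, window):
    """Return the sample index at the centre of the window with minimum
    gyroscope energy; used as the zero-velocity instant for the ZUPT."""
    best_start, best_energy = 0, float("inf")
    for start in range(len(gyr_norm) - window + 1):
        energy = sum(g * g for g in gyr_norm[start:start + window])
        if energy < best_energy:
            best_start, best_energy = start, energy
    return best_start + window // 2
```

The failure mode described above follows directly from this definition: the search always returns some minimum-energy window, even when the foot never actually stands still, so at high velocities the returned index no longer corresponds to zero velocity.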
For the Trajectory algorithm, we were also interested in the applicability of the zero-velocity update to the different strike types, because the foot is never flat on the ground for forefoot runners. Hence, we evaluated the accuracy of the Trajectory algorithm separately for the two strike types. The violin plots for forefoot and rearfoot runners are depicted in Figure 11. The plots show that the MEs do not differ significantly between the two strike types. However, the standard deviation is higher for forefoot runners. The low ME for both strike types can be explained by the fact that we align the foot with gravity during the zero-velocity phase. The higher standard deviation originates from the more dynamic nature of the forefoot running style: the zero-velocity phase cannot be detected as accurately, which results in higher errors.
An advantage of the Trajectory algorithm is that it provides more information about the stride than just the velocity and the stride length. During the computation of these parameters, the orientation of the shoe in space is also calculated, which allows the determination of other parameters like the sole angle, which defines the strike pattern, or the range of motion in the frontal plane, which is associated with pronation [42]. Furthermore, the algorithm relies solely on signal processing and has no training phase, which makes it transfer well to unseen data, including lower velocities and the transition to walking.
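As an illustration of this extra information, the sole (pitch) angle can be read directly from the estimated orientation. The sketch below assumes a unit quaternion (w, x, y, z) with the pitch axis pointing mediolaterally; the actual axis conventions depend on the sensor mounting:

```python
import math

def sole_angle_deg(q):
    """Sagittal-plane pitch of the sole from a unit quaternion (w, x, y, z).
    Its sign at initial contact separates rearfoot from forefoot strikes."""
    w, x, y, z = q
    sin_pitch = 2.0 * (w * y - z * x)
    # Clamp against numerical drift before taking the arcsine.
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_pitch))))
```

Evaluated at initial contact, a positive angle indicates a heel-down (rearfoot) landing and a negative angle a toe-down (forefoot) landing under the assumed convention.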
In terms of an embedded implementation and energy efficiency, the Trajectory algorithm needs both accelerometer and gyroscope data. Thus, it needs more energy than the Stride time and the Acceleration algorithms for acquiring 6D-IMU data.
Deep learning: The Deep Learning algorithm produced an ME of less than 0.095 m/s for the velocity and 0.104 m for the stride length over all velocity ranges in the lab study dataset. Compared to Hannink et al. [16], we reduced both the number of filters in the second convolutional layer and the number of outputs in the fully-connected layer, because the identical structure yielded worse results for our use case. The differences in the architecture are listed in Table 6. Overall, the performance of the DCNN is worse than the results reported in [16].
Our approach needs fewer parameters due to the reduced number of filters in the second convolutional layer and the smaller output size of the fully-connected layer. However, our results show a larger error. The reason might be the larger variation in our training data and the different strike patterns in running. The range of the stride length target parameter is 3.62 m in the lab study of this work versus 1.16 m in the geriatric patient dataset of Hannink et al. [16]. Moreover, the strike patterns in running differ significantly between forefoot and rearfoot runners, which introduces further variation in the input data.
In addition, we observed that the training and validation errors still varied after the five training epochs, even though we had more training samples than Hannink et al. [16]. Increasing the number of epochs or batches did not stabilize the validation error. This indicates that the DCNN does not generalize well; the results might therefore be further improved by incorporating more data samples in the training process.
An embedded implementation of the presented method is a challenge, as the DCNN model comprises 85,425 parameters. However, this is still in a range that can be implemented on a microcontroller. For this method, both the accelerometer and the gyroscope have to be sampled, which further increases the energy demand compared to the Acceleration approach. Taking computational effort and performance into account, the Acceleration method is the better trade-off for an embedded implementation.
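For such feasibility estimates, the parameter count can be sanity-checked from the layer sizes in Table 6 with two small helpers. The dimensions in the example line are hypothetical; only the helper formulas are general:

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a 1-D convolutional layer."""
    return in_channels * out_channels * kernel_size + out_channels

def dense_params(in_units, out_units):
    """Weights plus biases of a fully-connected layer."""
    return in_units * out_units + out_units

# Example with hypothetical dimensions: a 6-channel IMU input convolved with
# 16 filters of width 5, followed by a 100-to-50 fully-connected layer.
total = conv1d_params(6, 16, 5) + dense_params(100, 50)
```

At 4 bytes per 32-bit float, the full model's 85,425 parameters occupy roughly 334 KiB, which already dictates the minimum flash budget of the target microcontroller.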