*4.2. Lab Study*

In this section, we will discuss the results of the lab study for each algorithm in detail with respect to their advantages/disadvantages and the number of sensors needed for their implementation.

**Stride time:** The *Stride time* algorithm yields the lowest accuracy on the lab study dataset. Even though stride time and stride length relative to the subject's height correlate non-linearly, the correlation does not seem to be high enough to compute velocity and stride length accurately. The low correlation is also visible in Figure 10. The gray dots are the relative stride length values obtained from the lab study dataset, and the red line is the step function for male subjects defined in Table 3a. We see that the step function does not approximate the underlying data accurately. The standard deviation of the relative stride length within a given stride time range of the step function (e.g., 0.748 < *tstride* ≤ 0.800) is high. This is because velocity is controlled by both stride frequency and stride length. The *Stride time* algorithm cannot account for this, as it depends only on stride frequency.

**Figure 10.** Visualization of the correlation between the stride time *tstride* and the relative stride length *dstride*,*rel* for male subjects. The light gray dots depict the data obtained from the field study, whereas the red curve and the black dashed lines visualize the step function obtained from literature and implemented in the *Stride time* algorithm.
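The dependence of the *Stride time* algorithm on stride time alone can be made concrete with a minimal sketch. The range edges and relative stride lengths below are hypothetical placeholders, not the values of Table 3a; only the interval convention (lower < *tstride* ≤ upper) and the scaling by body height follow the description above.

```python
from bisect import bisect_left
from typing import Tuple

# Hypothetical placeholder values, NOT the entries of Table 3a.
# Upper edges of the stride time ranges in seconds (ascending).
UPPER_EDGES_S = [0.700, 0.748, 0.800, 0.900]
# One relative stride length (stride length / body height) per range,
# plus one entry for stride times above the last edge.
REL_STRIDE_LENGTH = [1.60, 1.40, 1.20, 1.00, 0.80]


def relative_stride_length(t_stride: float) -> float:
    """Step function: pick the range with lower < t_stride <= upper."""
    idx = bisect_left(UPPER_EDGES_S, t_stride)
    return REL_STRIDE_LENGTH[idx]


def stride_length_and_velocity(t_stride: float, height_m: float) -> Tuple[float, float]:
    """Return (stride length in m, average velocity in m/s) for one stride."""
    d_stride = relative_stride_length(t_stride) * height_m
    return d_stride, d_stride / t_stride
```

Two subjects of equal height with stride times of 0.76 s and 0.80 s receive the same stride length estimate in this sketch, regardless of how long their strides actually are. This is the limitation visible as the spread of the gray dots within one step.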

In the Bland–Altman plots for the stride length metric (Figure 8), the other three algorithms showed overlapping sample clouds. This indicates that people increased their velocity both by increasing their stride length and by decreasing their stride time at higher velocities. That the other algorithms can handle this effect is evident from the Bland–Altman plots of the velocity metric, in which the sample clouds are separated. This separation is not observable in the plots for the *Stride time* algorithm. Thus, the other algorithms deal better with velocity being controlled via both stride frequency and stride length.

Furthermore, we want to briefly discuss the shape of the *Stride time* algorithm's Bland–Altman plots. The long diagonal lines in the plots (Figure 8b) originate from the steps of the step function introduced in Table 3. Each line belongs to one stride time range. The small deviations within the diagonals originate from the different body heights. We observed that for some stride time ranges, the gold standard velocity ranged from 2 to 6 m/s (color coded within one diagonal), showing that the stride time ranges of the step function obtained from the literature do not generalize well. Furthermore, the relative stride lengths presented in Table 3 are averaged over specific study populations. Even if a subject controls their stride frequency in exactly the manner encoded by the stride time ranges of the step function, the resulting stride length could be incorrect due to an incorrect relative stride length.
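The diagonal structure follows directly from how a Bland–Altman plot is constructed. The sketch below assumes that the difference is defined as reference minus estimate and treats the estimate within one stride time range as constant, neglecting the variation in body height and stride time within the range. All numbers are hypothetical.

```python
import numpy as np


def bland_altman(reference, estimate):
    """Return the Bland-Altman coordinates (mean, difference) per sample."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    mean = (reference + estimate) / 2.0
    diff = reference - estimate  # assumed sign convention
    return mean, diff


# Hypothetical strides of one stride time range: the reference velocity
# spans 2 to 6 m/s, while the step function yields one constant estimate.
v_ref = np.linspace(2.0, 6.0, 9)
v_est = np.full_like(v_ref, 3.5)

mean, diff = bland_altman(v_ref, v_est)
slope, intercept = np.polyfit(mean, diff, 1)  # straight line through the points
```

With a constant estimate c, every point satisfies diff = 2 · mean − 2c, so all strides of one range fall on a straight line with slope 2. Each stride time range has its own c and therefore its own diagonal.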

Despite its low accuracy, the *Stride time* algorithm has the advantage that it can be implemented very energy efficiently. In an IMU scenario, only a stride segmentation is necessary to compute the stride time. The stride segmentation presented in this paper only needs the acceleration sampled in the dorsoventral direction; thus, a 1D-accelerometer would be sufficient. In fact, strides could be segmented without an IMU by using sensors such as piezo-electric switches to detect ground contact [40].

**Acceleration:** The plot of the ME for the different velocity ranges in Figure 7 shows that the *Acceleration* algorithm works better for the two velocity ranges from 3–4 m/s and 4–5 m/s. In addition, the Bland–Altman plots in Figure 8 show outliers, especially in the highest velocity range, for both the stride length and the average velocity. The reason can be observed in Figure 3: the second degree polynomial used to map the velocity integration value *ι* to the velocity value approximates the reference data better in the velocity range from 3–5 m/s and approximates it particularly poorly in the highest velocity range. This can be explained by the spread of the underlying data being too large to be represented by the polynomial.
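The mapping from the velocity integration value *ι* to the velocity can be illustrated with a least-squares fit of a second degree polynomial. The data, coefficients, and noise level below are synthetic assumptions standing in for the reference data of Figure 3; they are not the values used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for (iota, reference velocity) pairs.
# The coefficients and the noise level are hypothetical.
TRUE_COEFFS = (0.1, 0.8, 1.0)  # a2, a1, a0
iota = rng.uniform(1.0, 5.0, size=200)
v_ref = np.polyval(TRUE_COEFFS, iota) + rng.normal(0.0, 0.05, size=iota.size)

# Least-squares fit of v = a2 * iota**2 + a1 * iota + a0.
coeffs = np.polyfit(iota, v_ref, deg=2)


def velocity_from_iota(iota_value):
    """Map the velocity integration value to a velocity estimate in m/s."""
    return np.polyval(coeffs, iota_value)


# Spread of the data around the fitted curve that the polynomial cannot explain.
residual_std = float(np.std(v_ref - velocity_from_iota(iota)))
```

A single curve can only reproduce the mean trend of the data. If the spread around the curve grows in one velocity range, the error of the estimates in that range grows with it, independent of how well the coefficients are fitted.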

However, the *Acceleration* algorithm outperforms the *Stride time* algorithm and shows performance comparable to the *Deep Learning* algorithm for the velocity range of 3–4 m/s. The advantage of the *Acceleration* algorithm over the better performing *Trajectory* algorithm and the slightly better performing *Deep Learning* algorithm is its energy efficiency. For the computation of the stride length and the velocity, only a triaxial accelerometer needs to be sampled. Sampling only an accelerometer consumes less energy than sampling the gyroscope or sampling both sensors. For example, for the MPU9250 from InvenSense, the supply current needed for sampling only the accelerometer is less than 15% of the current needed for sampling both the accelerometer and the gyroscope [41]. In addition, the sampling rate can be reduced for the *Acceleration* algorithm [6]. We also tested a reduction of the sampling rate on the lab study dataset and observed that a reduction to 60 Hz does not affect the accuracy of the algorithm. With such a low sampling rate, the energy consumption can be reduced further. Another advantage of the algorithm is its generalizability and its applicability to other movements like side stepping [6].

**Foot trajectory:** The *Trajectory* algorithm performs best on the lab study dataset. Especially for velocities up to 5 m/s, the algorithm achieves an ME of less than 0.012 m for the stride length and 0.014 m/s for the average velocity. For velocities higher than 5 m/s, the accuracy drops. In the Bland–Altman plots (Figure 8e,f), outliers for this velocity range are visible. The zero-velocity update based on the detection of the minimum energy in the gyroscope signal is error prone at such high velocities: the foot has no real zero-velocity phase and is always in motion, so the underlying zero-velocity assumption does not hold. One way to improve this algorithm is to find a better initial condition for higher running velocities. Future work could evaluate whether a regression model based on the velocity during the swing phase would provide a better initial condition.
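The zero-velocity detection can be sketched as a search for the window with minimal gyroscope energy. This is a minimal illustration under assumed conventions (window length in samples, energy as the summed squared angular rate), not the exact implementation of the *Trajectory* algorithm, and the two strides are synthetic.

```python
import numpy as np


def min_gyro_energy_window(gyro, window=10):
    """Find the window with minimal gyroscope energy within one stride.

    gyro: array of shape (n_samples, 3) with angular rates.
    Returns (center index of the window, mean energy inside the window).
    """
    gyro = np.asarray(gyro, dtype=float)
    energy = np.sum(gyro ** 2, axis=1)  # squared norm per sample
    # Sliding sum over `window` samples; entry i covers samples i .. i+window-1.
    window_energy = np.convolve(energy, np.ones(window), mode="valid")
    start = int(np.argmin(window_energy))
    return start + window // 2, float(window_energy[start] / window)


# Hypothetical strides: one with a quiet stance phase, one without.
slow = np.full((200, 3), 2.0)
slow[80:100] = 0.0  # foot at rest for 20 samples
fast = np.full((200, 3), 2.0)
fast[80:100] = 0.5  # foot keeps rotating during stance

idx_slow, e_slow = min_gyro_energy_window(slow)
idx_fast, e_fast = min_gyro_energy_window(fast)
```

In the second stride, a minimum is still found, but the energy at that index is not zero. Resetting the velocity to zero there introduces an error, which mirrors the problem described above for high running velocities.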

For the *Trajectory* algorithm, we were also interested in the applicability of the zero-velocity update to the different strike types, because the foot is never flat on the ground for forefoot runners. Hence, we also evaluated the accuracy of the *Trajectory* algorithm for the different strike types. The violin plots for forefoot and rearfoot runners are depicted in Figure 11. The plots show that the MEs do not differ significantly between the two strike types. However, the standard deviation is higher for forefoot runners. The low ME for both the forefoot and the rearfoot strike type can be explained by the fact that we align the foot with gravity during the zero-velocity phase. The higher standard deviation originates from the more dynamic nature of the forefoot running style: the zero-velocity phase cannot be detected accurately, which results in higher errors.

**Figure 11.** Violin plots of the error (*sref* − *sstride*) in the velocity computation for forefoot strikers and rearfoot strikers.

An advantage of the *Trajectory* algorithm is that it provides more information about the stride than just the velocity and the stride length. During the computation of these parameters, the orientation of the shoe in space is also calculated, which allows other parameters to be determined, such as the sole angle, which defines the strike pattern, or the range of motion in the frontal plane, which is associated with pronation [42]. Furthermore, the algorithm relies solely on signal processing and has no training phase, which makes it well applicable to unseen data. This also holds for lower velocities and the transition to walking.

In terms of an embedded implementation and energy efficiency, the *Trajectory* algorithm needs both accelerometer and gyroscope data. Thus, acquiring the 6D-IMU data requires more energy than the data acquisition for the *Stride time* and *Acceleration* algorithms.

**Deep learning:** The *Deep Learning* algorithm produced an ME of less than 0.095 m/s for the velocity and 0.104 m for the stride length for all velocity ranges in the lab study dataset. Compared to Hannink et al. [16], we reduced both the number of filters in the second convolutional layer and the number of outputs in the fully-connected layer, because the identical structure yielded worse results for our use case. The differences in the architecture are listed in Table 6. Generally, the performance of our DCNN is worse than the results reported in [16].


**Table 6.** Differences in the study setup and architecture presented in [16] from our DCNN implementation.

We see that our approach needs fewer parameters due to the reduction of filters in the second convolutional layer and the smaller number of outputs of the fully-connected layer. However, our results show a larger error. Possible reasons are a larger variation in our training data and the different strike patterns in running. The range of the target parameter stride length is 3.62 m in the lab study of this work and 1.16 m in the dataset of geriatric patients of Hannink et al. [16]. The strike patterns in running differ significantly between forefoot and rearfoot runners, which also introduces more variation in the input data.

In addition, we observed that the training and validation errors still varied after the five training epochs, even though we had more training samples than Hannink et al. [16]. Increasing the number of epochs or batches did not reduce the variation in the validation errors. This indicates that the DCNN does not generalize well. Thus, the results might be further improved by incorporating more data samples in the training process of the network.

The embedded implementation of the presented method is challenging, as the DCNN model comprises 85,425 parameters. However, this is still in a range where it can be implemented on a microcontroller. For this method, both the accelerometer and the gyroscope have to be sampled. This further increases the energy demand compared to the *Acceleration* approach. Taking computational effort and performance into account, the *Acceleration* method would be a better trade-off for an embedded implementation.
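To put the parameter count into perspective, the storage needed for the weights can be estimated. Only the number of parameters is taken from the text; the numeric formats are assumptions for illustration, and the estimate ignores activations, buffers, and program code.

```python
N_PARAMETERS = 85_425  # parameter count of the DCNN as stated in the text

# Assumed storage formats; the text does not state which one would be used.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "int8": 1}


def weight_storage_kib(n_parameters: int, bytes_per_parameter: int) -> float:
    """Storage for the weights alone, in KiB (1 KiB = 1024 bytes)."""
    return n_parameters * bytes_per_parameter / 1024


storage_kib = {
    fmt: weight_storage_kib(N_PARAMETERS, size)
    for fmt, size in BYTES_PER_PARAMETER.items()
}

for fmt, kib in storage_kib.items():
    print(f"{fmt}: {kib:.1f} KiB")
```

At 32-bit precision, the weights occupy roughly 334 KiB, and roughly 83 KiB if they were quantized to 8 bits. Whether this fits depends on the flash and RAM of the target microcontroller.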
