*4.1. Effect of Frame Length on Classification Performance*

The objective of this section is to choose the appropriate data windowing method and frame length. We investigate the influence of disjoint and overlapped windowing with different frame lengths on the classification accuracy of the UIR system. We use disjoint windowing with frame lengths *Lf* = {100, 150, 200, 250} ms, and overlapped windowing with frame lengths *Lf* = {200, 250, 300} ms and increments *I* = {50, 150, 200} ms. We extract time-domain (TD) and frequency-domain (FD) features from the data frames generated by the three measurement signals collected from able-bodied subjects. Note that the size of the data set depends on the parameters of the windowing methods. For instance, a 10-second walking sequence with disjoint windows of length 100 ms provides 100 frames and consequently 100 training patterns.
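The relationship between windowing parameters and data set size can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments; the function names `count_frames` and `split_into_frames` are hypothetical, and we assume that only complete frames are kept.

```python
def count_frames(duration_ms, frame_ms, increment_ms=None):
    """Number of complete frames in a recording of the given duration.

    increment_ms=None means disjoint windowing, where the increment
    equals the frame length. Incomplete trailing frames are discarded
    (an assumption made for this sketch).
    """
    if increment_ms is None:
        increment_ms = frame_ms
    if duration_ms < frame_ms:
        return 0
    return (duration_ms - frame_ms) // increment_ms + 1


def split_into_frames(samples, frame_len, increment):
    """Slice a sequence of samples into frames.

    frame_len and increment are given in samples, not milliseconds.
    """
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len]
            for s in range(0, last_start + 1, increment)]


if __name__ == "__main__":
    # 10-second sequence, disjoint windows with Lf = 100 ms
    print(count_frames(10_000, 100))      # 100
    # 10-second sequence, overlapped windows with Lf = 250 ms, I = 50 ms
    print(count_frames(10_000, 250, 50))  # 196
```

Under these assumptions, overlapped windowing with a small increment yields more training patterns from the same recording than disjoint windowing with the same frame length.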

In this experiment, LDA, QDA, MLP, and SVM with an RBF kernel are trained and tested separately for each subject using 10-fold cross-validation (CV). Each classifier is trained with only one feature type at a time, extracted from each of the measurements: WL, VAR, MAV, RMS, WAMP, ANG, or AR4. These features are used because they are considered the most representative TD and FD features. Figure 5 illustrates the mean classification accuracy of LDA with different frame lengths for able-bodied subjects using 10-fold CV. A single value on the horizontal axis of the figure indicates the frame length of disjoint windowing. A pair of values indicates the frame length and increment length of overlapped windowing; for instance, 200–50 denotes *Lf* = 200 ms and *I* = 50 ms. Figure 5 shows that the classification accuracy typically improves as the frame length increases. We observed similar trends with the other classifiers, so the classifier type does not influence our choice of frame length. Therefore, we only provide classification results for LDA as a representative result.
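As an illustration of how single-feature patterns can be computed from one frame, the sketch below implements five of the TD features. It assumes commonly used definitions of these features (for instance, sample variance for VAR and a hypothetical threshold parameter `wamp_threshold` for WAMP); the definitions given elsewhere in the paper take precedence if they differ. ANG and AR4 are omitted.

```python
import math
import statistics


def mav(frame):
    """Mean absolute value."""
    return sum(abs(x) for x in frame) / len(frame)


def rms(frame):
    """Root mean square."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def var(frame):
    """Sample variance (assumed definition; needs at least two samples)."""
    return statistics.variance(frame)


def wl(frame):
    """Waveform length: cumulative absolute difference of adjacent samples."""
    return sum(abs(b - a) for a, b in zip(frame, frame[1:]))


def wamp(frame, wamp_threshold):
    """Willison amplitude: count of adjacent-sample differences whose
    magnitude exceeds wamp_threshold (a hypothetical tuning parameter)."""
    return sum(1 for a, b in zip(frame, frame[1:])
               if abs(b - a) > wamp_threshold)


def td_features(frame, wamp_threshold):
    """Collect the five TD features of one frame in a dictionary."""
    return {
        "MAV": mav(frame),
        "RMS": rms(frame),
        "VAR": var(frame),
        "WL": wl(frame),
        "WAMP": wamp(frame, wamp_threshold),
    }
```

In the single-feature setting described above, a classifier would be trained on one of these values per measurement signal rather than on the full dictionary.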

Figure 5 shows that a larger frame is more likely to include more information, which in turn lowers the bias and variance of the classification performance. For instance, the accuracy with WL increases by approximately 18% when the frame length increases from 100 ms to 200 ms. The corresponding increases are 11.6% and 9.8% for the VAR and AR4 features, respectively. However, the accuracy does not vary significantly with frame length for the remaining features in Figure 5. The figure also illustrates that all representative TD features except ANG provide better classification performance than AR4, which is the representative FD feature.

**Figure 5.** Mean LDA performance for the able-bodied subjects for different data windowing methods and frame lengths. On the horizontal axis, a single value indicates the frame length of disjoint windowing, and a pair of values indicates the frame length and increment length of overlapped windowing. For instance, 200–50 indicates *Lf* = 200 ms and *I* = 50 ms for overlapped windowing.

In this experiment, very small frame lengths are not used because they would result in poor prediction accuracy. Conversely, large frame lengths are not used because they would violate the real-time constraint. To find the best frame length, we statistically compare performance using Wilcoxon signed-rank tests. For this purpose, LDA is trained multiple times, each time using one TD or FD feature type. We train LDA individually for each feature type, rather than on the full feature set, to provide a sufficient number of samples for statistical comparison. The null hypothesis of the test is that the differences between the mean classification accuracies (the average accuracy of all LDAs trained individually with every single TD and FD feature type) corresponding to two different frame lengths come from a distribution with zero median; the test is carried out at the specified level of significance. If the null hypothesis cannot be rejected, then we conclude that the two compared frame lengths are not statistically significantly different, as indicated by an ≈ sign and a T (tie). If we can reject the null hypothesis, then the two frame lengths are statistically significantly different, as indicated by a + sign. In that case, the frame length with the higher mean classification accuracy is marked B (better) and the other is marked W (worse). Table 2 provides the results of the statistical tests at a 10% significance level.
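The comparison procedure described above can be sketched as follows. This is an illustrative implementation of an exact two-sided Wilcoxon signed-rank test by enumeration of sign assignments, which is only practical for small sample sizes; it is not necessarily the routine used to produce Table 2. The function names are hypothetical, and the accuracy values in the usage example are made up for illustration.

```python
from itertools import product
from statistics import mean


def signed_rank_p_value(x, y):
    """Exact two-sided Wilcoxon signed-rank p-value for paired samples.

    Zero differences are dropped and tied absolute differences receive
    average ranks. All 2**n sign assignments are enumerated, so this
    sketch is intended for small n only.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    observed = min(w_plus, total - w_plus)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= observed:
            extreme += 1
    return extreme / 2 ** n


def compare_frame_lengths(acc_row, acc_col, alpha=0.10):
    """Return 'T' (tie), 'B' (row better), or 'W' (row worse)."""
    if signed_rank_p_value(acc_row, acc_col) >= alpha:
        return "T"
    return "B" if mean(acc_row) > mean(acc_col) else "W"


if __name__ == "__main__":
    # Made-up per-feature accuracies (%) for two frame lengths
    longer = [90, 85, 88, 92, 80, 75, 70]
    shorter = [72, 74, 78, 83, 72, 68, 64]
    print(compare_frame_lengths(longer, shorter))  # B
```

For larger sample sizes, a library routine with a normal approximation would be the more practical choice.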

Table 2 shows that frames with length larger than 200 ms perform better than frames with length 150 or 100 ms. It also shows that the two overlapped windowing configurations with *Lf* = 250 ms, *I* = 50 ms and *Lf* = 300 ms, *I* = 200 ms tie with each other and perform better than the other frame lengths.

Taking into account the real-time constraint, the length of the MVF filter, and processing time, we choose overlapped windowing with *Lf* = 250 ms, *I* = 50 ms throughout the remainder of the paper as the best trade-off, except where specifically mentioned otherwise.

**Table 2.** Comparison of mean classification performance for different frame lengths (row values versus column values) using Wilcoxon signed-rank tests at a 10% significance level. Mean classification performance is considered as the average of all linear discriminant analysis (LDA) classifiers trained individually with every single time-domain (TD) and frequency-domain (FD) feature type. ≈ indicates that the two compared frame lengths tie (T) with similar performance and are not statistically significantly different. + indicates that the two frame lengths are statistically significantly different, and B or W indicates that the row frame length performs better or worse than the column frame length, respectively. ∗ indicates that the lower triangular half of the table is equal to its upper triangular half.


We use principal component analysis (PCA) [46] and Fisher linear discriminant analysis (FLDA) [47] to visualize the training set. A training pattern is a vector of all TD and FD features extracted from a frame of raw data. We reduce the training patterns to two dimensions for three different frame lengths. Figure 6 illustrates the resulting 2D scatter plots for able-bodied subject AB01. To save space, we do not provide the corresponding figures for the other subjects, for which we obtained similar results.

Figure 6 shows that FLDA provides better visualization than PCA in terms of gait mode separability. It also confirms that a longer frame length leads to better gait mode separation and, ultimately, better classification performance. Most importantly, Figure 6 verifies the effectiveness of the TD and FD features.

**Figure 6.** Two-dimensional scatter plot for visualization using principal component analysis (PCA) (**left column**) and Fisher linear discriminant analysis (FLDA) (**right column**) for able-bodied subject AB01.
