6.1. EMG-Based Movement Prediction: Classification Performance, Computing Time, Resource Utilization
Classification Performance: A grid search was used for optimization of
,
, and
p. Based on previous experience, the chosen parameter ranges for optimization were
ms,
ms,
. Regarding the channel setup, we estimate the performance of each single channel as a baseline for comparison, and the following combinations of channels:
for channels
and
for channels
, where
denotes the number of channels that have to be over the computed threshold simultaneously (see
Section 4.1).
To this end, the classification performances of eight single channels and five different channel combinations (see
Figure 6) were analyzed by a two-way repeated measures ANOVA with the channel setup (CS, 13 levels as described above) and hardware architecture (HA, 2 levels: CPU-based system with double precision floating point vs. FPGA-based system with fixed-point arithmetic) as within-subjects factors. Greenhouse-Geisser correction was applied when necessary and the corrected
p-value is reported. We performed a post-hoc analysis to compare the channel setups and hardware architecture. Bonferroni-Holm correction was applied for multiple comparisons.
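The Bonferroni-Holm procedure used for the multiple comparisons can be sketched as follows. This is a minimal illustration of the standard step-down rule, not the analysis code used in the study; the function name `holm_bonferroni` and the example p-values are hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return one boolean per hypothesis: True if it is rejected.

    Standard Holm step-down rule: sort the p-values in ascending order and
    compare the k-th smallest (k = 0, 1, ...) against alpha / (m - k).
    Stop at the first p-value that fails its threshold.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all remaining (larger) p-values are not rejected
    return reject


# Hypothetical p-values from four pairwise comparisons (illustration only).
example_p = [0.001, 0.04, 0.03, 0.005]
decisions = holm_bonferroni(example_p)
print(decisions)
```

With these illustrative inputs, the two smallest p-values pass their thresholds (0.0125 and about 0.0167), while 0.03 fails against 0.025, so the procedure stops there.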
We obtained a significant main effect for CS [], but no effect of HA []. The best channel setups are ‘’ and ‘’ [these setups vs. all other setups: ]. There was no significant difference between CPU and FPGA.
Furthermore, we observe a significant interaction between CS and HA []. We found no significant differences between both types of hardware architectures for all channel setups [], except for channel 1, and for all combinations except ‘’.
As expected, the channel placed on the left arm shows worse performance compared to the other channel setups []. Combinations of multiple channels improve the performance; the best performance is obtained by channel setups ‘’, ‘’ and ‘’. Based on these results, all subsequent investigations related to the EMG-based movement prediction are based on the setup ‘’.
Computing Time: Computing times were measured using all 40 ms segments from the datasets of the offline study; in total, this data comprises 381,991 segments. All measured times include the time spent in the core algorithmic kernel and the time required to make the results accessible to the Python-based top-level framework pySPACE; e.g., for the FPGA-based system, this includes the time required to perform the data transfer between main memory and the hardware accelerator.
The required time to process a segment of the proposed system (FPGA) and the reference systems (MRS and SRS, see
Section 5.5) is shown in
Figure 7 (left part). We can observe that the required time to process a segment depends on the number of channels and the utilized computing device. For all channel setups, the MRS requires the highest amount of time (2.431–15.97 ms, depending on the number of channels) and has the highest variability of time (
ms). The SRS provides a significant speedup over the MRS system (5.6–11.8×, depending on the number of channels) and has a smaller variability of the processing time (
ms). The highest performance and lowest variability of processing time can be obtained by the FPGA-based system (speedup over SRS: 9.0–21.6×, speedup over MRS: 50.6–253.5×,
ms). The required time to process a segment represents a delay that might be, depending on the actual application and the related time constraints, unacceptable.
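The relation between per-segment processing time, speedup, and the segment length can be made concrete with a small sketch. It assumes that a system is usable online if one segment is processed in less time than the segment lasts (40 ms here); the helper names and the fast processing time of 0.5 ms are hypothetical and are not measured values.

```python
SEGMENT_MS = 40.0  # segment length stated in the text


def speedup(t_reference_ms, t_accelerated_ms):
    """Speedup factor of an accelerated system over a reference system."""
    return t_reference_ms / t_accelerated_ms


def meets_online_budget(t_processing_ms, segment_ms=SEGMENT_MS):
    """Assumed criterion: a segment must be processed before the next one is complete."""
    return t_processing_ms < segment_ms


mrs_worst_ms = 15.97        # slowest MRS time reported in the text
hypothetical_fast_ms = 0.5  # illustrative value only, NOT a measurement

print(speedup(mrs_worst_ms, hypothetical_fast_ms))
print(meets_online_budget(mrs_worst_ms))
```

Whether a delay below the segment length is acceptable still depends on the time constraints of the actual application.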
Resource Utilization: The resource utilization is shown in
Figure 7 (right part). It can be observed that there is only a small increase in the amount of utilized LUTs, FFs, and DSP48 for an increasing number of channels. In contrast to this, the increase of utilized BRAMs is more pronounced (10.5 BRAMs are required for a single channel, 37.5 BRAMs for four channels, and 73 BRAMs for eight channels). The reason for this increase of utilized BRAMs is the requirement to increase the buffer sizes of the variance filter and adaptive threshold computations to compute the sliding mean and variance (see
Section 4.1), whereas the actual computational units are used in a multiplexed fashion for different data samples and channels.
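As a small worked example, the BRAM figures quoted above can be turned into an approximate per-channel cost. The sketch only uses the three reported data points; the function name `per_channel_increments` is hypothetical, and the piecewise-linear reading is an assumption rather than a statement about the actual buffer design.

```python
# BRAM utilization reported in the text: number of channels -> BRAMs
bram_by_channels = {1: 10.5, 4: 37.5, 8: 73.0}


def per_channel_increments(table):
    """Average BRAM increase per added channel between consecutive data points."""
    points = sorted(table.items())
    return [
        (b1 - b0) / (c1 - c0)
        for (c0, b0), (c1, b1) in zip(points, points[1:])
    ]


increments = per_channel_increments(bram_by_channels)
print(increments)
```

Both intervals give roughly nine BRAMs per additional channel, which is consistent with per-channel buffers growing with the channel count while the computational units are shared.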
6.2. MRCP-Based Movement Prediction: Classification Performance, Computing Time, Resource Utilization
Classification Performance: The classification performance for different numbers of channels according to the extended 10–20 system and both hardware architectures was analyzed using a repeated measures ANOVA with two within-subjects factors: numbers of channels (, 4 levels: channels with ) and type of hardware architecture (, 2 levels: ). Greenhouse-Geisser correction was applied when necessary. Bonferroni-Holm correction was applied for multiple comparisons.
The obtained classification performance is shown in
Figure 8. A significant main effect can be observed for
[
], but not for
[
,
]. The effect of number of channels is not straightforward for each architecture type. In our application, the highest BA is achieved with 96 channels (except for 124 channels) [
vs.
:
,
vs.
:
,
vs.
:
].
Computing Times: The computing times for the different reference systems discussed in
Section 5.5 are shown in
Figure 9. Reported is the amount of time required to perform a single prediction, i.e., to process 200 ms of EEG data. Two values are reported for each setup: the time required for the preprocessing (which corresponds to the Preprocessing DFHWA shown in
Section 4.2) as well as the time for the actual prediction (which corresponds to the MRCP Processing DFHWA). It can be observed that, independent of the number of channels, the processing times of the SRS are sufficiently fast to predict movements online, i.e., they require less than 40 ms to process a segment of 40 ms of EEG data. Parallelizing the preprocessing yields only a small speedup (the smallest speedup of 47% is obtained for 32 channels, the highest speedup of 66% for 124 channels). The reason for this observation is a high proportion of serial code related to object construction and data exchange between nodes in pySPACE. A separate investigation (not shown) revealed that the speedup increases if the size of the data segments is increased. However, this would also increase the latency of each prediction. Hence, a larger segment size is not feasible for online movement prediction.
The MRS requires a significantly higher amount of time for each prediction. For a small number of channels (), it is possible to use the MRS for online prediction (i.e., less than 40 ms are required to process a segment). However, for these channel setups we obtain a classification performance worse than for . Hence, an optimal system configuration would consist of a configuration with a higher number of channels, i.e., . For such a configuration the MRS does not provide a sufficient amount of computing performance for an online application.
If the DFHWAs are used, the computation times can be considerably reduced and the time constraints for real-time prediction are met. The achievable speedup depends on the number of channels. For 32 channels, the achievable speedup is lowest (FPGA vs. SRS: –, FPGA vs. MRS: –); the highest speedups are achieved for 124 channels (FPGA vs. SRS: –, FPGA vs. MRS: –).
Resource Utilization: A further concern was again the amount of required resources in relation to the number of channels. The resource utilization for different PL resources and number of EEG channels is shown in
Figure 8 (right). It can be observed that the increase of resources for an increasing number of EEG channels is generally low; the highest increase can be observed for the required LUTs in the preprocessing part of the MRCP (32 vs. 124 channels: 249.7%). A smaller increase can be observed for the FFs (32 vs. 124 channels: 10.3%), while no increase can be observed for the BRAMs and DSP48 slices. This observation is attributable to the
delay scaling technique applied to scale the system to different numbers of channels. As discussed in
Section 3.2, the application of delay scaling increases the number of registers in a DFHWA (which are implemented as LUTs and FFs), but not the number of multipliers and adders (which are mapped to DSP48 slices).
6.4. Hybrid Movement Prediction: Classification Performance, Prediction Time
The last step in the offline analysis was to determine if the combination of different modalities as described in
Section 3.5 can improve the classification performance. The classification performance shown in
Figure 6 and
Figure 8 reveals that the EMG-based movement prediction provides a higher classification performance than the MRCP-based movement prediction. However, it is well known that the EEG allows a prediction of movements earlier than the EMG. Hence, an important point for the usage in a rehabilitation device is to study the effect of combining multiple predictions on the overall classification performance, reliability and prediction time. The proposed system provides the possibility to process several data streams in parallel. Hence, it permits applying the hybrid combinations described in
Section 3.5.
Classification Performance: The classification performance of the hybrid combinations for the CPU- and FPGA-based systems is shown in
Figure 12. For comparison, we performed a three-way repeated measures ANOVA with the movement prediction method (method, 4 levels: EMG, MRCP, MoE, MaE), combination with P3 (P3, 2 levels: NP3, CP3), and hardware architecture (HA, 2 levels: CPU, FPGA) as within-subjects factors. Greenhouse-Geisser correction was applied when necessary and the corrected
p-value is reported. For post-hoc comparisons, we performed paired t-tests. Bonferroni-Holm correction was applied for multiple comparisons. We obtained significant main effects for method [
] and P3 [
], but no effect of HA [
].
The best method is EMG, followed by MoE, MRCP, and MaE [ for all pairs of methods]. A combination with the P3 improves the classification performance [CP3 vs. NP3: ]. There was no difference in classification performance between CPU and FPGA [].
Furthermore, we observe a significant interaction between movement prediction method and P3 combination []. The best performance is obtained by EMG, followed by MoE, irrespective of whether it is combined with P3 or not []. However, for the MRCP we observe an increased performance compared to MaE only when it is combined with the P3 [MRCP vs. MaE: p < 0.05 for P3, for NP3]. P3 improves the classification performance for all methods except for MaE [ for MaE, for other methods]. The hardware architecture interacts with neither P3 nor methods.
Finally, there was no interaction between HA, P3 and methods []. The post-hoc analysis reveals that the combination with the P3 improves the classification performance for all methods [], except for MaE []. This pattern is observed for both architectures. Furthermore, for both CP3 and NP3 there was no difference in classification performance between both architectures for all methods []. A combination with the P300 improves the classification performance for all methods except for MaE [ for MaE, for other methods]. This pattern was observed again for both architectures.
Another important point in the direct comparison of different combinations is a consideration of the error rates and precision. The mean FNRs, FPRs and recall for the combinations and both hardware architectures are given in
Table 2. For comparison, we performed three-way ANOVAs with movement detection method, combination with P300, and hardware architecture as within-subjects factors, which were performed separately for the FNR, FPR and precision as dependent variables, respectively.
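For reference, the dependent variables named above can be computed from confusion-matrix counts as in the following sketch. It uses the standard textbook definitions of FNR, FPR, precision, and balanced accuracy (BA); the section itself does not spell them out, and the function name `error_rates` and the example counts are hypothetical.

```python
def error_rates(tp, fn, fp, tn):
    """Standard definitions computed from confusion-matrix counts.

    tp/fn: movements that were detected / missed
    fp/tn: non-movement segments that were flagged / correctly ignored
    """
    fnr = fn / (fn + tp)        # false negative rate (missed movements)
    fpr = fp / (fp + tn)        # false positive rate (false alarms)
    precision = tp / (tp + fp)  # share of predicted movements that are real
    balanced_accuracy = 0.5 * ((1.0 - fnr) + (1.0 - fpr))
    return {
        "FNR": fnr,
        "FPR": fpr,
        "precision": precision,
        "BA": balanced_accuracy,
    }


# Hypothetical counts for illustration only.
metrics = error_rates(tp=80, fn=20, fp=10, tn=90)
print(metrics)
```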
For the FNR, we observe a significant main effect for the movement prediction method [], but neither for combination with P300 [] nor hardware architecture []. The lowest FNR is obtained by MoE, followed by EMG, and MRCP; the highest FNR is obtained for MaE [ for all pairs of methods].
For the FPR, we observe significant main effects for movement prediction method [] and combination with P300 [], but not for []. Regarding the FPR, the best method is EMG, followed by MoE, MRCP, and MaE [ for all pairs of methods]. A combination with the P3 improves the FPR [CP3 vs. NP3: ]. There was no difference in FPR between CPU and FPGA [].
For the FPR, we observe again a significant interaction between movement prediction method and P3 combination []. A further analysis reveals that the combination with the P3 reduces the FPR for all methods []. This pattern is observed for both architectures. Furthermore, for both NP3 there was a difference in FPR between both architectures for all methods []. A combination with the P300 improves the FPR for all methods []. This pattern was observed again for both architectures.
The precision depends substantially on the applied signal combination method. For the precision, we observe significant main effects for movement prediction method [], combination with P300 [], and for []. Regarding the precision, the best method is MaE, followed by EMG, MRCP, and MoE [ for all pairs of methods except MRCP vs. MoE: ]. A combination with the P3 improves the precision [CP3 vs. NP3: ]. For the precision, we observed a difference between CPU and FPGA [].
Furthermore, we observe again a significant interaction between movement prediction method and P3 combination [] and additionally a significant interaction between method and []. A further analysis reveals that the combination with the P3 improves the precision for all methods []. This pattern is observed for both architectures. Moreover, for NP3 there was a difference in precision between both architectures for EMG and MaE [], but not for MRCP and MoE []. For CP3, there was no difference in precision for both architectures for all methods []. A combination with the P300 improves the precision for all methods []. This pattern was observed again for both architectures.
Summary: We observe that the MRCP-based movement prediction is less accurate than the EMG-based prediction. This can also be observed for the individual trials shown in
Figure 13. The reason for the lower accuracy might be the detection of a movement planning or intention that does not result in a movement execution [
10,
158,
159]. The classification performance is improved by the combination of the MRCP with the P300 and/or EMG. Depending on the type of combination, different properties regarding FPR, FNR and precision can be observed. The combination of the MRCP with the EMG by a logical AND decreases the FPR, but increases the FNR. The opposite can be observed for the FNR and precision. In an additional combination with the P300, we observe a decrease of the FPR, but an increase of the FNR for the MRCP and MoE. While the decrease of the FNR is only small for the MRCP, we can observe a significantly higher increase for MaE. In this case, up to
of the movements are missed. A decrease of the FPR can be observed for the MoE combination. The precision is increased in general by a combination with the P300.
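The trade-off between FPR and FNR for the two combination types can be illustrated with a toy example. The sketch assumes that MaE denotes the logical AND and MoE the logical OR of the per-segment MRCP and EMG predictions (the summary only spells out the AND case); all labels and predictions below are made up for illustration and are not data from the study.

```python
def fpr_fnr(predictions, truth):
    """Return (false positive rate, false negative rate) for boolean sequences."""
    false_pos = sum(1 for p, t in zip(predictions, truth) if p and not t)
    false_neg = sum(1 for p, t in zip(predictions, truth) if t and not p)
    negatives = sum(1 for t in truth if not t)
    positives = sum(1 for t in truth if t)
    return false_pos / negatives, false_neg / positives


# Made-up ground truth and single-modality predictions (1 = movement).
truth = [bool(x) for x in [1, 1, 1, 1, 0, 0, 0, 0]]
mrcp = [bool(x) for x in [1, 1, 0, 1, 1, 0, 1, 0]]
emg = [bool(x) for x in [1, 0, 1, 1, 0, 0, 1, 0]]

# Assumed combination rules: MaE = MRCP AND EMG, MoE = MRCP OR EMG.
mae = [m and e for m, e in zip(mrcp, emg)]
moe = [m or e for m, e in zip(mrcp, emg)]

for name, pred in [("MRCP", mrcp), ("EMG", emg), ("MaE", mae), ("MoE", moe)]:
    print(name, fpr_fnr(pred, truth))
```

In this toy data the AND combination can only remove predicted movements, so its FPR cannot exceed that of either input while its FNR cannot fall below them; the OR combination behaves in the opposite way.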
6.9. Comparison to Previous Work
Although combining ERD/ERS or MRCPs with EMG is a promising approach for the development of hybrid BCIs [
42], there are still only a few BCI systems that rely on this combination. Hence, the first part of the following comparison is limited to systems that combine ERD/ERS or MRCPs with EMG or other modalities. In the BCI systems listed in the first part of this comparison, the data processing is performed on standard personal computers or laptops, in contrast to our FPGA-based approach. However, as discussed in
Section 1, neurorehabilitation applications would benefit from mobile or embedded BCI systems. Although FPGAs are well suited to build such systems (see
Section 2.4 and
Section 2.5), only a few FPGA-based BCI systems exist. The second part compares our system with FPGA-based BCIs.
Similar to our work, the combination of EEG and EMG for the detection of arm movements has been investigated in [
82]. Two different methods for the combination of EEG and EMG were investigated. It was shown that certain combinations obtain a better and more stable performance for the detection of movements in comparison to a single signal. Due to the EEG/EMG combination, the system can be used to compensate exhaustion or fatigue of the user. In contrast to our work, the subjects could move the left or right arm. However, the combination with further modalities, such as the P300, has not been investigated.
Multiple additional modalities next to EEG and EMG (e.g., eye gaze, Electrooculography (EOG), hand position) and their combinations to predict targets of human reaching motions were investigated in [
84]. Similar to our work, it was shown that EEG predicts movements earlier than EMG, but with a lower accuracy. Furthermore, it was shown that EOG or eye tracking can also be used to improve the performance of EEG-based movement predictions. However, in contrast to our work, the study is limited to the combination of two modalities; a combination with the P300 has not been investigated.
In a previous work [
25], we investigated the effect of combining EEG and EMG on the classification performance and reliability for the prediction of movements in an offline evaluation. However, in that work we neither considered the combination with further signals (i.e., the additional combination with P300), nor did we use a mobile FPGA-based system. Furthermore, we did not validate the approach in an online test.
In most other works, the combination of EEG and EMG is used for neurorehabilitation applications, e.g., to control orthoses or prostheses. For instance, in [
83] it was shown that the real-time control of an upper limb wearable robot can be improved by such a combination. However, EEG was only used to compensate for a lack of EMG, i.e., to estimate the torques of elbow and forearm based on EEG signals if the EMG does not have enough power. The work shows that the torque estimation accuracy decreases if an EEG/EMG combination is used. The study does not investigate the effect of the combination on the classification performance when movements are to be predicted.
The control of active orthoses for gait rehabilitation based on EEG and EMG has been presented in [
43]. The work proposes an approach for the control of active orthoses that incorporates motion intention recognition based on the detection of ERD/ERS to predict the gait phase. In addition, EMG was used to estimate the active torque generated by the patient’s muscles. However, the control approach also did not consider the fusion of EEG and EMG to enhance the classification accuracy for movement prediction.
A hybrid control approach for an upper limb prosthesis based on EEG and EMG was investigated in [
86]. EEG was used to control the exoskeleton asynchronously, while joint angles were estimated using EMG. The study shows that it is possible to estimate the joint angles with root-mean-square errors of less than
. However, similar to the studies discussed above, it was not investigated whether a combination of EEG and EMG can be used to predict movements with an improved accuracy.
A combined EEG/EMG classification for hand and wrist movements in transhumeral amputees was investigated in an offline study in [
87]. It was shown that the EEG/EMG combination outperforms the classification by either EEG or EMG. However, in contrast to the study in this paper, the aim was to differentiate between different movements instead of predicting upcoming movements. The authors point out that a miniature and portable system is required for future practical applications.
In the literature discussed above, the data processing was based on standard PCs. The aim of FPGA-based BCI systems is to replace a standard PC. In [
107,
109], Shyu et al. present an FPGA-based system for the detection of SSVEPs. The system can be used for the control of a multimedia system [
107] or a hospital bed [
109]. Especially the tight integration of EEG signal processing and motor control to actuate the hospital bed represents an interesting opportunity for other rehabilitation devices. However, since these systems target the detection of SSVEPs, they have been designed to process only two EEG channels, and are hence not usable for the detection of, e.g., MRCPs.
The work in [
108] presents a low-cost FPGA-based P300 speller. Similar to our system, it achieves a classification performance that is comparable to a standard CPU. However, most parts of the system are implemented using softcore CPUs (only a band-pass Butterworth filter is implemented in hardware), and only seven EEG channels can be used. Moreover, the aim of the system is to detect the P300; the detection of ERD/ERS or MRCPs is not considered.
A recent work uses an FPGA-based prototype for fall prediction in everyday life [
110], which is based on a previous software-implementation [
85]. Similar to our work, they combine EEG processing (for the detection of movement-related potentials) with EMG and show in an offline evaluation that it is possible to obtain a small numerical error despite the usage of fixed-point arithmetic in the FPGA. However, for the fall prediction, a fixed number of only 7 EEG and 8 EMG channels is sufficient; the classification performance, resource utilization and computing time for different numbers of channels have not been investigated. The proposed system requires 56 ms for the prediction of a movement, which is within the reported application time limit of 300 ms for fall prediction, while the MRCP prediction of our system requires approx.
ms for the prediction of a movement using 124 EEG channels (see
Figure 9). The study does not investigate the effect of different combination methods on the classification performance.
In previous works [
162,
163], we presented preliminary systems for the detection of the P300 and MRCPs, respectively, using FPGA-based systems. However, these systems were inflexible and did not exploit the full parallel processing capabilities to process multiple physiological signals at the same time, so that they were only capable of detecting either the P300 or the MRCP. These systems did not support the fusion of multiple signals, which can improve the classification performance significantly, as shown in this work. Furthermore, these preliminary systems have not been validated in an online evaluation.
Overall, this comparison shows that our FPGA-based system has, similar to other recent hybrid BCI systems, the capability to process EEG and EMG data in combination. However, to the best of our knowledge, the combination of MRCP, EMG and additionally the P300 has not been addressed before. Furthermore, the capabilities of our FPGA-based system can be useful in applications that require miniaturized BCI-systems with high requirements on performance and flexibility that cannot be fulfilled by other previously proposed FPGA-based BCI-systems. Hence, it can help to improve the capabilities of future neurorehabilitation devices.
6.10. Limitations
The evaluation of the proposed FPGA-based system shows that it is feasible to use it for movement prediction using EEG and EMG data. However, several limitations of the study should be noted.
First, the study has been conducted under laboratory conditions. Typical rehabilitation applications will be performed under more uncontrolled conditions. Thus, the EEG and EMG will be affected by artifacts and different kinds of noise. We have chosen the present study design to provide the possibility to determine the time of the movement onset exactly and to be able to compare the obtained results to other studies [
25,
64]. However, the methodology for the detection of the MRCP and P300 has already been successfully applied in other applications, e.g., the usage of MRCP-based movement prediction to enhance the user-experience of an exoskeleton by adapting the control algorithms [
58] and the P300 detection for the monitoring of the user’s state [
57]. Hence, since we also observe no significant performance difference between the CPU and FPGA-based implementations, we assume that the developed system will work under such conditions without major modifications.
Secondly, the current study was conducted using healthy subjects. Although it has been shown that patients who suffer from, e.g., stroke are able to perform movement imagination despite chronic or severe motor impairments [
68,
69,
164], different kinds of brain damage can affect the EEG [
165,
166,
167,
168,
169,
170] and EMG [
171,
172], and hence, possibly the performance of the proposed system. However, MRCP-based BCIs can be successfully used by patients suffering from stroke [
173,
174,
175]. For other neurological diseases, such as amyotrophic lateral sclerosis, it has been shown that certain MRCP characteristics do not differ from those of healthy subjects [
176]. Hence, we assume that the proposed system can be used by patients with neurological motor impairments. Since the system supports different types of signal combinations for movement prediction, it is possible to choose an operation mode depending on the specific conditions of a patient. Nevertheless, the specific difference in classification performance between healthy and impaired subjects depending on impairment and operation mode has to be investigated in the future.
Thirdly, there are mobile systems available with more advanced CPU architectures and higher clock frequencies than the MRS that was used in the experiments for comparison. However, although these might provide a higher computational performance than the MRS, they are less extensible than the proposed FPGA-based system. Due to the low resource utilization of the DFHWAs, we are able to integrate further features like artifact removal methods, eye-tracking [
177], physiological signals to monitor fatigue or exhaustion, and also control algorithms and motor commutation components to drive a rehabilitation device directly.