Article

Independent Vector Analysis for Feature Extraction in Motor Imagery Classification

by Caroline Pires Alavez Moraes 1,*, Lucas Heck dos Santos 1, Denis Gustavo Fantinato 2, Aline Neves 1 and Tülay Adali 3

1 Center for Engineering, Modeling and Applied Social Sciences (CECS), Federal University of ABC (UFABC), Santo André 09280-560, SP, Brazil
2 Department of Computer Engineering and Automation (DCA), Universidade Estadual de Campinas (UNICAMP), Campinas 13083-852, SP, Brazil
3 Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County (UMBC), Baltimore, MD 21250, USA
* Author to whom correspondence should be addressed.
Sensors 2024, 24(16), 5428; https://doi.org/10.3390/s24165428 (registering DOI)
Submission received: 18 June 2024 / Revised: 19 August 2024 / Accepted: 20 August 2024 / Published: 22 August 2024
(This article belongs to the Section Biomedical Sensors)

Abstract

Independent vector analysis (IVA) can be viewed as an extension of independent component analysis (ICA) to multiple datasets. It exploits the statistical dependency between different datasets through mutual information. In the context of motor imagery classification based on electroencephalogram (EEG) signals for the brain–computer interface (BCI), several methods have been proposed to extract features efficiently, mainly based on common spatial patterns, filter banks, and deep learning. However, most methods use only one dataset at a time, which may not be sufficient for dealing with a multi-source retrieving problem in certain scenarios. From this perspective, this paper proposes an original approach for feature extraction through multiple datasets based on IVA to improve the classification of EEG-based motor imagery movements. The IVA components were used as features to classify imagined movements using consolidated classifiers (support vector machines and K-nearest neighbors) and deep classifiers (EEGNet and EEGInception). The results show an interesting performance concerning the clustering of MI-based BCI patients, and the proposed method reached an average accuracy of 86.7%.

1. Introduction

The brain–computer interface (BCI) enables a direct connection between the brain and the external world. Electroencephalography is a technique capable of translating brain activities into commands based on scalp-recorded measurements [1,2]. Some benefits of using this BCI method are its non-invasive nature, safety, high temporal resolution, and relatively low cost. All these advantages have attracted the interest of the scientific community, and as a result, electroencephalogram (EEG) signals have been used in the study of several areas, such as epilepsy and brain tumor detection [3], alternative communication channels for disabled patients [4], emotion recognition [5], and neuromuscular disorders [6]. Among all EEG-based applications, the motor imagery (MI) paradigm is probably one of the most popular. It refers to the imagination or mental rehearsal of a motor movement without any real motor execution [7,8].
The MI paradigm has been employed in cognitive psychology and cognitive neuroscience to explore the unconscious structure that anticipates a movement execution. In some contexts, such as medical, athletic, and musical areas [9,10,11], a mental rehearsal can be as effective as an authentic physical proceeding, which leads to a promising future therapeutic tool to improve the performance of motor functions in patients with damage to the central nervous system [9].
There are several proposed methods to identify the MI movements from EEG signals as accurately as possible, most of them focused on feature selection or classification algorithms. However, feature extraction over EEG signals for BCI systems has shown to be a crucial stage for classification performance.
BCI Competition III Dataset 4a (DS4a) is a widely recognized motor imagery dataset that has been extensively studied, leading to the development of various techniques. For instance, Na Lu et al. [12] introduced a method known as structure-constrained semi-nonnegative matrix factorization (NMF), which extracts key EEG patterns in the time domain by enforcing mean envelopes of event-related potentials (ERPs) as constraints. This approach, called SCS-NMF, achieved an accuracy of 68.94%. On the other hand, Rasool Ameri et al. [13] developed a dictionary pair learning (DPL) method for EEG classification, using L0- and L1-norm calculation to obtain sparse coefficients via linear projection, resulting in an accuracy of around 80%. More recently, researchers in [14] proposed a framework combining bispectrum, entropy, and common spatial pattern (BECSP) for feature extraction from MI-EEG signals, achieving an accuracy of 84.91% after selecting the most interesting ones, through a tree-based method. Additionally, the study in [15] compared three popular signal decomposition techniques—empirical mode decomposition, discrete wavelet transform, and wavelet packet decomposition—for EEG classification, with wavelet packet decomposition (WPD) sub-bands yielding an average accuracy of 92.8% for DS4a. Despite these consistent results, DS4a classification remains a challenge. Considering the previous methods, the possibility of working with only one subject may not be enough to deal with a multi-source retrieving problem in some cases. Assuming that the motor imagery data are collected from several subjects executing the same task, the problem can be extended to a multi-model approach, such as independent vector analysis (IVA), which could explore the dependences across subjects.
IVA was first proposed for the separation of convolutive mixtures in the frequency domain [16], considering the joint blind source separation (JBSS) problem [17]. Since then, it has been applied to fMRI (functional magnetic resonance imaging) data [18,19], HD-sEMG (high-density surface electromyography) [20], multimodal neuroimaging data fusion [21], and EEG data for muscle artifact removal [22]. However, the application of IVA as a feature extraction technique in MI-based BCI is a pioneering approach. This work proposes a novel perspective on MI classification through a new feature extraction method: a combination of IVA and autoregressive (AR) models. The method is applied to the motor imagery dataset from the BCI Competition III Dataset 4a. Classification is performed with different methods, namely support vector machines (SVM), K-nearest neighbors (KNN), EEGNet, and EEG-Inception, in order to evaluate the efficiency of the obtained features. Compared with the results in the literature, this novel approach showed a homogeneous accuracy performance.
In Section 2, we describe the JBSS problem and the IVA method. Section 3 shows the AR model and the classifiers applied in this work. Section 4 presents BCI competition III dataset 4a and the data preprocessing. The simulation results are presented and analyzed in Section 5. Finally, we conclude this paper in Section 6.

2. Joint Blind Source Separation

In certain applications, such as neurodiagnostic applications [23], dealing with multiple datasets is a necessity, which leads to multisubject/multimodal data fusion. In these cases, the task of blind source separation may be extended to JBSS, which exploits correlations across datasets (inter-set dependence) while still searching to recover independent latent sources within a dataset (intra-set independence) [19]. The general concept of the JBSS problem involves K datasets, each containing M independent sources and N samples. Such a mixing process can be modeled by the following equation:
$$\mathbf{x}^{[k]}(n) = \mathbf{A}^{[k]}\,\mathbf{s}^{[k]}(n), \quad 1 \le n \le N, \quad 1 \le k \le K, \tag{1}$$
where $\mathbf{s}^{[k]}(n) = [s_1^{[k]}(n), \ldots, s_M^{[k]}(n)]^T \in \mathbb{R}^{M}$ is the concatenated source vector of the k-th dataset, $(\cdot)^T$ denotes the vector transpose, $\mathbf{A}^{[k]} \in \mathbb{R}^{M \times M}$ is the k-th invertible mixing matrix, both assumed unknown, and $\mathbf{x}^{[k]}(n) = [x_1^{[k]}(n), \ldots, x_M^{[k]}(n)]^T \in \mathbb{R}^{M}$ is the concatenated mixture vector of the k-th dataset.
Following the recommendation of [17], it is important to perform data whitening prior to performing source separation. Considering $\mathbf{V}^{[k]}$ as the whitening matrix of the k-th dataset, $\mathbf{z}^{[k]}(n) = \mathbf{V}^{[k]}\mathbf{x}^{[k]}(n)$ is the whitened mixture signal.
The demixing process aims to find matrices $\mathbf{W}^{[k]}$ and the corresponding source vector estimates $\mathbf{y}^{[k]}(n)$ for each one of the K datasets. Hence, the separation system is given by
$$\mathbf{y}^{[k]}(n) = \mathbf{W}^{[k]}\,\mathbf{z}^{[k]}(n), \quad 1 \le n \le N, \quad 1 \le k \le K. \tag{2}$$
The mixing matrices are potentially distinct for each dataset and are not necessarily related, admitting permutation and/or scale ambiguity.
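The mixing model (1) and the whitening step can be illustrated with a short NumPy sketch. This is only a toy simulation under stated assumptions: the sizes `K`, `M`, `N`, the way dependent sources are generated, and the symmetric (eigendecomposition-based) whitening in the hypothetical helper `whiten` are choices made here for illustration and are not taken from the paper.

```python
import numpy as np

def whiten(x):
    """Return (z, V) such that z = V @ x has identity sample covariance.

    x has shape (M, N): M mixtures, N samples.
    """
    x = x - x.mean(axis=1, keepdims=True)              # remove the mean of each mixture
    cov = x @ x.T / x.shape[1]                         # M x M sample covariance
    eigval, eigvec = np.linalg.eigh(cov)
    V = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T    # symmetric whitening matrix
    return V @ x, V

rng = np.random.default_rng(0)
K, M, N = 3, 4, 5000                                   # illustrative sizes only

# Toy sources: for each index m, the K signals share a common latent signal, so they
# are dependent across datasets while remaining independent across m.
common = rng.standard_normal((M, N))
S = np.stack([0.8 * common + 0.6 * rng.standard_normal((M, N)) for _ in range(K)])

A = rng.standard_normal((K, M, M))                     # one mixing matrix per dataset
X = np.einsum("kij,kjn->kin", A, S)                    # x^[k](n) = A^[k] s^[k](n)

Z, V = zip(*(whiten(X[k]) for k in range(K)))          # z^[k](n) = V^[k] x^[k](n)
Z, V = np.stack(Z), np.stack(V)
```

After this step each `Z[k]` has (numerically) identity covariance, which is the input assumed by the demixing model (2).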

2.1. Independent Vector Analysis

Independent vector analysis is a powerful approach for solving the JBSS problem, and is an extension of independent component analysis (ICA) to multiple datasets by leveraging the dependence across datasets [16,17]. In [19], the role of diversity, i.e., different statistical properties, is explained for both ICA and IVA, and the application of both to medical image analysis is discussed. When applied to multiple datasets, Group ICA (GICA) is a widely used approach [24,25]; however, it has limitations when compared to IVA in terms of common group-level spatial maps and inter-subject variability preservation [26,27]. The statistical dependence modeled through a multivariate probability density model provides full interaction across the datasets, making IVA an attractive method for problems such as subgroup identification when used with multisubject data [19]. Ref. [28] showed encouraging results of IVA application to subgroup identification, revealing significant differences between the identified subgroups. Such studies demonstrate the ability of IVA to deal with subject variability and subgroup identification, highlighting the advantages of IVA in dealing with multiple datasets.
In IVA, the components from a particular dataset are assumed to be statistically independent of each other, as in ICA methods. However, in contrast to ICA, IVA also exploits the dependence between correlated components from different datasets. These correlated components are regrouped into the so-called source component vectors (SCVs). The mth SCV can be written as $\mathbf{y}_m = [y_m^{[1]}, \ldots, y_m^{[K]}]^T \in \mathbb{R}^{K}$, which is statistically independent of all the other SCVs [17]. The IVA cost function is given by
$$\mathcal{I}_{IVA} = \sum_{m=1}^{M} \mathcal{H}[\mathbf{y}_m] - \sum_{k=1}^{K} \log\left|\det(\mathbf{W}^{[k]})\right| - C_1, \tag{3}$$
where $\mathcal{H}[\cdot]$ is the entropy function, $\det(\mathbf{W}^{[k]})$ is the determinant of matrix $\mathbf{W}^{[k]}$ and $C_1$ is a constant term that depends only on $\mathbf{x}^{[k]}$.
The mutual information part of the IVA cost function is responsible for solving the permutation ambiguity that occurs in the JBSS problem [17,19]. Furthermore, the minimization of the cost function (3) simultaneously minimizes the entropy of all components and maximizes the mutual information within each estimated SCV [17]. In addition, IVA has shown, in most cases, good performance in capturing variability in spatial components across datasets [25,27]. In this paper, we work with IVA-G [17] that only takes second-order statistical information into account, assuming multivariate Gaussian distributions for the SCVs.
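To make the role of the SCV entropy in (3) concrete, the sketch below evaluates the cost under the Gaussian assumption of IVA-G, where the entropy of a K-dimensional Gaussian vector with covariance Σ is ½ log det Σ + (K/2) log(2πe). The function name `iva_g_cost`, the toy data, and the sizes are assumptions made for illustration; the constant $C_1$ is dropped.

```python
import numpy as np

def iva_g_cost(W, Z):
    """IVA-G cost (3), up to the constant C_1.

    W: (K, M, M) demixing matrices; Z: (K, M, N) whitened datasets.
    """
    K, M, N = Z.shape
    Y = np.einsum("kij,kjn->kin", W, Z)          # y^[k](n) = W^[k] z^[k](n)
    cost = 0.0
    for m in range(M):
        scv = Y[:, m, :]                         # m-th SCV, shape (K, N)
        sigma = scv @ scv.T / N                  # K x K SCV covariance estimate
        # Entropy of a zero-mean Gaussian vector with covariance sigma
        cost += 0.5 * np.linalg.slogdet(sigma)[1] + 0.5 * K * np.log(2 * np.pi * np.e)
    cost -= sum(np.linalg.slogdet(W[k])[1] for k in range(K))
    return cost

# Toy data in which component m is correlated across the K datasets
rng = np.random.default_rng(0)
K, M, N = 3, 4, 20000
common = rng.standard_normal((M, N))
Z = np.stack([0.8 * common + 0.6 * rng.standard_normal((M, N)) for _ in range(K)])

W_aligned = np.tile(np.eye(M), (K, 1, 1))        # components aligned across datasets
W_swapped = W_aligned.copy()
W_swapped[0][[0, 1]] = W_swapped[0][[1, 0]]      # permute two rows in dataset 0 only

cost_aligned = iva_g_cost(W_aligned, Z)
cost_swapped = iva_g_cost(W_swapped, Z)
```

In this toy setting the misaligned solution has a higher cost than the aligned one, which illustrates how the dependence within each SCV penalizes inconsistent permutations across datasets.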

2.2. IVA Using Vector Gradient Descent

In [17], the authors describe four algorithms for minimizing the IVA cost function given by (3). Within these alternatives, the vector gradient descent algorithm was chosen in this paper due to the decoupling method that enables the tailoring of the step size for each direction, resulting in faster convergence per iteration than that achieved in traditional methods [29]. Using this approach, the IVA cost function (3) is differentiated with respect to $\mathbf{w}_m^{[k]}$ [17], where $\mathbf{w}_m^{[k]}$ is the mth row of $\mathbf{W}^{[k]}$:
$$\frac{\partial \mathcal{I}_{IVA}}{\partial \mathbf{w}_m^{[k]}} = E\left\{\phi^{[k]}(\mathbf{y}_m)\,\mathbf{z}^{[k]}\right\} - \frac{\mathbf{h}_m^{[k]}}{(\mathbf{h}_m^{[k]})^T \mathbf{w}_m^{[k]}}, \tag{4}$$
where $\phi^{[k]}(\mathbf{y}_m) = -\dfrac{\partial \log p(\mathbf{y}_m)}{\partial y_m^{[k]}}$, $p(\mathbf{y}_m)$ is the pdf (probability density function) of $\mathbf{y}_m$, and $\mathbf{h}_m^{[k]}$ can be defined as a unit length vector such that $\tilde{\mathbf{W}}_m^{[k]} \mathbf{h}_m^{[k]} = \mathbf{0}$, where $\tilde{\mathbf{W}}_m^{[k]}$ is the $(M-1) \times M$ matrix obtained by removing the mth row of the demixing matrix $\mathbf{W}^{[k]}$ [17,29].
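The gradient obtained by (4) is used to iteratively adapt each demixing row of $\mathbf{W}^{[k]}$:
$$(\mathbf{w}_m^{[k]})^{it+1} \leftarrow (\mathbf{w}_m^{[k]})^{it} - \mu \frac{\partial \mathcal{I}_{IVA}}{\partial \mathbf{w}_m^{[k]}}, \tag{5}$$
followed by a normalization step:
$$(\mathbf{w}_m^{[k]})^{it+1} \leftarrow \frac{(\mathbf{w}_m^{[k]})^{it+1}}{\left\| (\mathbf{w}_m^{[k]})^{it+1} \right\|}, \tag{6}$$
where $\mu$ is the adaptation step size and $it$ represents each iteration.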
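The update of a single demixing row can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Gaussian SCV model of IVA-G, for which the score function reduces to the k-th entry of $\Sigma_m^{-1}\mathbf{y}_m$ (with $\Sigma_m$ the SCV covariance), and it obtains $\mathbf{h}_m^{[k]}$ from the null space of $\tilde{\mathbf{W}}_m^{[k]}$ via an SVD. The function name `vector_gradient_step` and the random test data are hypothetical.

```python
import numpy as np

def vector_gradient_step(W, Z, k, m, mu=1.0):
    """One gradient step plus normalization for row m of W[k] (Gaussian SCV model).

    W: (K, M, M) demixing matrices; Z: (K, M, N) whitened datasets.
    Returns a copy of W in which only row m of W[k] has been updated.
    """
    K, M, N = Z.shape
    Y = np.einsum("kij,kjn->kin", W, Z)
    scv = Y[:, m, :]                               # m-th SCV, shape (K, N)
    sigma = scv @ scv.T / N                        # K x K SCV covariance estimate
    phi = np.linalg.solve(sigma, scv)[k]           # score: k-th entry of sigma^{-1} y_m

    # h: unit vector orthogonal to every row of W[k] except row m
    W_tilde = np.delete(W[k], m, axis=0)           # (M-1) x M
    h = np.linalg.svd(W_tilde)[2][-1]              # last right-singular vector

    grad = Z[k] @ phi / N - h / (h @ W[k, m])      # sample version of the gradient
    w_new = W[k, m] - mu * grad                    # gradient descent update
    W_out = W.copy()
    W_out[k, m] = w_new / np.linalg.norm(w_new)    # normalization step
    return W_out

rng = np.random.default_rng(0)
K, M, N = 3, 4, 2000
Z = rng.standard_normal((K, M, N))
W0 = rng.standard_normal((K, M, M))
W1 = vector_gradient_step(W0, Z, k=0, m=1, mu=0.1)
```

A full algorithm would sweep this step over all rows m and datasets k until a convergence criterion is met; the stopping rule is not specified here.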

3. Classification Algorithms

Having extracted the features through IVA, a dimensional reduction is highly recommended before classification. Thus, an autoregressive (AR) model is used [30,31], and its weights are extracted for the classification step. This step will be detailed in the sequel, followed by a brief description of the classification methods SVM, KNN, EEGNet, and EEG-Inception.

3.1. Autoregressive Model

The AR model is frequently used to represent a random process while preserving its important attributes and reducing the data dimension. This is possible due to the model structure, where the output variable depends linearly on its own previous values [32]. Thus, an autoregressive model of order q describes the signal u as follows:
$$u(n) = a_1 u(n-1) + a_2 u(n-2) + \cdots + a_q u(n-q) + \nu(n), \tag{7}$$
where $\nu(n)$ is white noise with zero mean and variance $\sigma_\nu^2$, and $\{a_1, \ldots, a_q\}$ are the AR parameters, which can also be written as $\mathbf{a} = [a_1, \ldots, a_q]$.
Based on the Yule–Walker equations [33], the coefficients of the AR model can be estimated by
$$\hat{\mathbf{a}} = \mathbf{R}_u^{-1}\,\mathbf{r}_u, \tag{8}$$
where $\mathbf{R}_u = E[\mathbf{u}(n-1)\,\mathbf{u}^T(n-1)]$, $\mathbf{r}_u = E[\mathbf{u}(n-1)\,u(n)]$ and $\mathbf{u}(n-1) = [u(n-1), u(n-2), \ldots, u(n-q)]^T$. In this paper, an AR model is obtained for each estimated source ($y_m^{[k]}$) in each k-th dataset.
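As an illustration of the Yule–Walker estimate, the sketch below builds the Toeplitz autocorrelation matrix from sample autocorrelations and solves the resulting linear system. The function name `ar_yule_walker`, the biased autocorrelation estimator, and the simulated AR(2) signal are assumptions made for this example; the paper does not state which estimator was used.

```python
import numpy as np

def ar_yule_walker(u, q):
    """Estimate the AR(q) coefficients a_1..a_q by solving R_u a = r_u."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    n = len(u)
    # Biased sample autocorrelation r(0), ..., r(q)
    r = np.array([u[: n - lag] @ u[lag:] / n for lag in range(q + 1)])
    # R_u is Toeplitz with entries r(|i - j|); r_u has entries r(1), ..., r(q)
    R = np.array([[r[abs(i - j)] for j in range(q)] for i in range(q)])
    return np.linalg.solve(R, r[1:])

# Simulate a stable AR(2) process and recover its coefficients
rng = np.random.default_rng(0)
a_true = np.array([0.6, -0.3])
n_samples = 20000
u = np.zeros(n_samples)
noise = rng.standard_normal(n_samples)
for n in range(2, n_samples):
    u[n] = a_true[0] * u[n - 1] + a_true[1] * u[n - 2] + noise[n]

a_hat = ar_yule_walker(u, q=2)
```

With q = 4, as used later in the paper, each estimated source is summarized by four coefficients regardless of the window length, which is what makes the AR step a dimensionality reduction.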

3.2. Classifiers

SVM is an efficient supervised algorithm based on statistical learning theory that can be used for classification or regression problems [34]. While other methods typically obtain the separating hyperplane by assuming normally distributed class-conditional data, SVM seeks the separating hyperplane with the largest margin between classes.
The KNN classifier is one of the most popular neighborhood classifiers in pattern recognition [35]. It is a nonparametric supervised learning classifier that uses the majority class among the K closest training examples to classify or predict an individual data point. The previously described algorithms are well known in the machine learning field.
Nevertheless, deep learning approaches have been attracting the attention of the scientific community, and have presented promising results in biomedical engineering applications. EEGNet [36] is a compact convolutional neural network for EEG-based BCIs. The method uses depthwise and separable convolutions to construct an EEG-specific network that encapsulates several well-known EEG feature extraction concepts, such as optimal spatial filtering and filterbank construction, while simultaneously reducing the number of trainable parameters when compared to other networks. More recently, a deep learning model called EEG-Inception has been proposed by E. Santamaria-Vazquez et al. [37]. This method integrates inception modules for event-related potential (ERP) detection, which can be efficiently combined with other structures in a lightweight architecture, and requires very few calibration trials.
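The majority-vote rule behind KNN can be written in a few lines of NumPy. This is a generic sketch with Euclidean distance and a hypothetical function name `knn_predict`; the distance metric, the number of neighbors, and the toy data are assumptions and do not reflect the configuration used in the experiments.

```python
import numpy as np

def knn_predict(X_train, y_train, X_query, n_neighbors=5):
    """Classify each query point by majority vote among its nearest training points.

    X_train: (n_train, d) features; y_train: (n_train,) non-negative integer labels;
    X_query: (n_query, d) features.
    """
    # Pairwise Euclidean distances, shape (n_query, n_train)
    dist = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]   # indices of closest points
    votes = y_train[nearest]                               # labels of those neighbors
    return np.array([np.bincount(v).argmax() for v in votes])

# Two well-separated toy classes in a 4-dimensional feature space
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.3, (30, 4)), rng.normal(3.0, 0.3, (30, 4))])
y_train = np.array([0] * 30 + [1] * 30)
X_query = np.array([[0.1, 0.0, -0.1, 0.2], [2.9, 3.1, 3.0, 2.8]])
y_pred = knn_predict(X_train, y_train, X_query, n_neighbors=5)
```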

4. Experimental Setup

In the previous section, we described a method for feature extraction based on IVA. In order to better understand the proposed method, in this section, we investigate the performance of the algorithm by applying it to a real EEG dataset for motor imagery movements. In the following, we describe this dataset and the preprocessing stages.

4.1. Dataset Description—BCI Competition III Dataset 4a

Dataset 4a from BCI Competition III (DS4a) is provided by B. Blankertz et al. [38], and contains data recorded from 5 subjects—identified as “aa”, “al”, “av”, “aw”, and “ay”—using 118 channels sampled at 1000 Hz, which were downsampled to 100 Hz. The cue-based BCI paradigm involves two motor imagery tasks: imagining right-hand movement and right-foot movement, totaling 280 trials per subject.
During each trial, a fixation cross was shown to each subject, followed by a short acoustic warning tone, indicating the beginning of the trial. Two seconds later, a cue arrow pointing either right or down appeared on the screen for 3.5 s, instructing the subjects to perform the corresponding motor imagery task. Subjects were to continue the motor imagery task until the arrow disappeared, after which a brief black screen signaled a short break.

4.2. Proposed Method

To evaluate the proposed method, we first split the dataset into training and test data using k-fold cross-validation with $k_f = 10$. Because the initialization of the IVA matrices (e.g., drawn from a Gaussian distribution) is a relevant stage for feature extraction, and in order to keep the test data unknown, we also held out 10% of the training data as a validation set to investigate the effect of the IVA initialization on the performance of the method. In addition, each EEG time series was separated into windows of 4 s according to each motor imagery class.
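The data partitioning described above can be sketched as an index generator. The shuffling, the fixed seed, the rounding of the validation size, and the function name `kfold_with_validation` are assumptions made for illustration; whether the original splits were shuffled or stratified by class is not stated in the text.

```python
import numpy as np

def kfold_with_validation(n_trials, n_folds=10, val_fraction=0.10, seed=0):
    """Yield (train_idx, val_idx, test_idx) for each cross-validation fold.

    The validation indices are held out from the training portion of the fold,
    so the test indices are never used for model selection.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_trials), n_folds)
    for i in range(n_folds):
        test_idx = folds[i]
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        rest = rng.permutation(rest)
        n_val = int(round(val_fraction * len(rest)))
        yield rest[n_val:], rest[:n_val], test_idx

# 280 trials per subject, as in DS4a
splits = list(kfold_with_validation(280))
train_idx, val_idx, test_idx = splits[0]
```

With 280 trials, each fold holds 28 test trials, and the remaining 252 trials are divided into 227 training and 25 validation trials under the rounding used here.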

4.2.1. Training Stage

First, in the training stage, the data were whitened, as recommended in [17], for each subject separately. The EEG signal collected from each subject was considered to be one dataset. IVA was applied to the training data for each class separately to obtain the $\mathbf{W}_c^{[k]}$ matrices that correspond to the extraction of the main features for the c-th class and k-th subject, with $k \in \{1, \ldots, 5\}$ for DS4a, and $c = 1, 2$. This procedure is presented in Figure 1. Then, considering the k-th subject, the estimated SCV components were obtained by applying to the training and validation data each whitening matrix $\mathbf{V}_1^{[k]}$ and $\mathbf{V}_2^{[k]}$, followed by the corresponding class matrix $\mathbf{W}_1^{[k]}$ and $\mathbf{W}_2^{[k]}$, resulting in $\mathbf{y}_{c,\mathrm{train}}^{[k]}$ and $\mathbf{y}_{c,\mathrm{valid}}^{[k]}$, as shown in Algorithm 1. Using both matrices at this point is necessary considering that validation and test data are assumed to be completely unknown. The choice of obtaining IVA matrices for each class can leverage the feature extraction process, leading to a possible classification improvement. Subsequently, $\mathbf{y}_1^{[k]}$ and $\mathbf{y}_2^{[k]}$ were stacked and AR modeling was applied to each extracted feature (corresponding to each EEG channel) in order to reduce and adjust the data dimension. The resulting AR parameters were used as the classifier inputs. Finally, the data of each subject were classified according to the two considered classes. This second step is exemplified in Figure 2.
Optimization of the IVA cost function, given by (3), is not an easy task. As usually occurs with gradient descent-based algorithms, initialization plays a crucial role. In order to better explore the method’s potential, a search for a good $\mathbf{W}_c^{[k]}$ initialization was implemented using the validation data. The suitable initialization for $\mathbf{W}_c^{[k]}$ is denoted as $\mathbf{W}_{\mathrm{selected}}^{c}$.
To select the appropriate $\mathbf{W}_{\mathrm{selected}}^{c}$, SVM and KNN classifiers were chosen (block diagram of Figure 2), since both are low-cost, well-established algorithms and can provide a feasible direction to find the suitable $\mathbf{W}_{\mathrm{selected}}^{c}$. More details will be discussed in Section 5.1. When using deep learning approaches, given by EEGNet and EEG-Inception, the AR parameter extraction step is not necessary. Thus, $\mathbf{W}_{\mathrm{selected}}^{c}$ and the training data were used directly.

4.2.2. Test Stage

After selecting the suitable initialization, $\mathbf{W}_{\mathrm{selected}}^{c}$, the procedure described in Figure 2 is reapplied using the test data, where the estimated SCV components are represented by $\mathbf{y}_{c,\mathrm{test}}^{[k]}$. The whole method is summarized in Algorithm 1. The IVA and classifier weights obtained in the training stage are kept constant and applied to the test data. This procedure was used to classify the motor imagery movements between the right hand (RH) and right foot (RF) for DS4a.
In the following, methods will be named after the classifier used: IVAS for SVM, IVAK for KNN, IVAEN for EEGNet, and IVAEI for EEG-Inception.
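Algorithm 1 IVAS, IVAK, IVAEN or IVAEI
Initialization parameters: $q$, $\mu$
Training stage:
for each initialization of $\mathbf{W}_c^{[k]}$ do
    $\mathbf{W}_c^{[k]} \leftarrow$ random initialization
    for each class c do
        Apply IVA. Input: $\mathbf{z}_{\mathrm{train}}^{[k]}$; output: $\mathbf{W}_c^{[k]}$, $k \in \{1, \ldots, 5\}$ and $c = 1, 2$
    end for
    for each subject k do
        for each class c do
            $\mathbf{y}_{c,\mathrm{train}}^{[k]} = \begin{bmatrix} \mathbf{W}_1^{[k]} \mathbf{V}_1^{[k]} \mathbf{x}_{c,\mathrm{train}}^{[k]} \\ \mathbf{W}_2^{[k]} \mathbf{V}_2^{[k]} \mathbf{x}_{c,\mathrm{train}}^{[k]} \end{bmatrix}$ and $\mathbf{y}_{c,\mathrm{valid}}^{[k]} = \begin{bmatrix} \mathbf{W}_1^{[k]} \mathbf{V}_1^{[k]} \mathbf{x}_{c,\mathrm{valid}}^{[k]} \\ \mathbf{W}_2^{[k]} \mathbf{V}_2^{[k]} \mathbf{x}_{c,\mathrm{valid}}^{[k]} \end{bmatrix}$
        end for
        $\mathbf{y}_{\mathrm{train}}^{[k]} \leftarrow [\,\mathbf{y}_{1,\mathrm{train}}^{[k]} \;\; \mathbf{y}_{2,\mathrm{train}}^{[k]}\,]$ and $\mathbf{y}_{\mathrm{valid}}^{[k]} \leftarrow [\,\mathbf{y}_{1,\mathrm{valid}}^{[k]} \;\; \mathbf{y}_{2,\mathrm{valid}}^{[k]}\,]$
        Apply the AR model to each channel and subject, according to Equation (7)
        Train the SVM and KNN classifiers with $\mathbf{y}_{\mathrm{train}}^{[k]}$ and evaluate them with $\mathbf{y}_{\mathrm{valid}}^{[k]}$. Output: movement classification accuracy
    end for
end for
Select the $\mathbf{W}_c^{[k]}$ with the highest accuracy as $\mathbf{W}_{\mathrm{selected}_c}^{[k]}$ for each subject and class
Test stage:
for each subject k do
    $\mathbf{y}_{\mathrm{test}}^{[k]} = \begin{bmatrix} \mathbf{W}_{\mathrm{selected}_1}^{[k]} \mathbf{V}_1^{[k]} \mathbf{x}_{\mathrm{test}}^{[k]} \\ \mathbf{W}_{\mathrm{selected}_2}^{[k]} \mathbf{V}_2^{[k]} \mathbf{x}_{\mathrm{test}}^{[k]} \end{bmatrix}$
    if SVM or KNN then
        Input: AR parameters extracted from $\mathbf{y}_{\mathrm{test}}^{[k]}$; output: MI classification
    end if
    if EEGNet or EEG-Inception then
        Input: $\mathbf{y}_{\mathrm{test}}^{[k]}$ applied directly; output: MI classification
    end if
end for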
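The feature construction used for the SVM and KNN branches of Algorithm 1 can be sketched as follows: a window is projected with both class-specific matrices, the two projections are stacked, and q AR coefficients are extracted per row. The helper names `scv_features`, `ar_coefficients`, and `ar_features`, the vertical stacking, the Yule–Walker estimator, and the random matrices standing in for trained $\mathbf{W}_c^{[k]}$ and $\mathbf{V}_c^{[k]}$ are all assumptions made for illustration.

```python
import numpy as np

def scv_features(x, W1, V1, W2, V2):
    """Stack the two class-specific projections of one window x of shape (M, N)."""
    return np.vstack([W1 @ V1 @ x, W2 @ V2 @ x])     # shape (2M, N)

def ar_coefficients(u, q):
    """Yule-Walker estimate of the q AR coefficients of a 1-D signal."""
    u = u - u.mean()
    n = len(u)
    r = np.array([u[: n - lag] @ u[lag:] / n for lag in range(q + 1)])
    R = np.array([[r[abs(i - j)] for j in range(q)] for i in range(q)])
    return np.linalg.solve(R, r[1:])

def ar_features(y, q=4):
    """Concatenate the q AR coefficients of every row of y into one feature vector."""
    return np.concatenate([ar_coefficients(row, q) for row in y])

# Stand-ins for one 4 s window at 100 Hz and for the trained matrices
rng = np.random.default_rng(0)
M, N = 3, 400
x = rng.standard_normal((M, N))
W1, V1, W2, V2 = (rng.standard_normal((M, M)) for _ in range(4))

features = ar_features(scv_features(x, W1, V1, W2, V2), q=4)
```

Under these assumptions each window yields a vector of length 2·M·q (24 in this toy case), which would then be passed to the classifier.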

5. Results and Discussion

In order to evaluate the algorithm’s performance, in this section, we analyze the effect of IVA initialization, the number of EEG channels considered, and the cross-subject correlation for dataset DS4a. To analyze such aspects, we fixed the IVA adaptation step size to $\mu = 1$ and the number of AR coefficients to $q = 4$, based on our previous work in [39].

5.1. IVA Initialization

In Section 4.2, we described the IVA matrix selection methodology, which is grounded on a random IVA initialization search. Concerning the number of initialization iterations, for the sake of computational efficiency and based on pre-analysis, 100 iterations were used to select the appropriate $\mathbf{W}_{\mathrm{selected}}^{c}$. In that sense, IVA was randomly initialized 100 times using a Gaussian distribution with zero mean and unit variance, and for each initialization, accuracy was computed based on the parameters of the classifiers and the validation dataset, considering the same IVA matrix initialization for all subjects. Figure 3 shows the kernel density estimation (KDE) for DS4a and two algorithms: IVAS and IVAK. In both cases, it is possible to verify the occurrence of an initialization that maximizes accuracy, even if it may be a rare event. In Figure 3a, the subjects “aw” and “ay” show a longer tail and similar pattern, finding initializations with accuracies over 90%, and the subject “al” has the highest accuracy probability. On the other hand, in Figure 3b, for instance, three out of five subjects present a greater likelihood for accuracy around 80%, while the curves obtained for subjects “aa” and “av” present a mean achieved accuracy lower than the one obtained by the other subjects, showing a probable higher classification complexity.
Based on these analyses of the initialization that maximizes accuracy, the matrix that leads to the greatest accuracy is chosen for the dataset and subjects (measured in the validation set), and applied to the test dataset.
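The random initialization search can be summarized by a simple loop that keeps the initialization with the highest validation accuracy. In this sketch, `fit_and_score` is a hypothetical callback standing for the full chain "run IVA from this initialization, extract AR features, train the classifier, and return the validation accuracy"; the `toy_score` function below is only a placeholder so that the example runs and does not represent real results.

```python
import numpy as np

def select_initialization(fit_and_score, M, n_init=100, seed=0):
    """Draw n_init Gaussian M x M initializations and keep the best-scoring one."""
    rng = np.random.default_rng(seed)
    best_W0, best_acc, accuracies = None, -np.inf, []
    for _ in range(n_init):
        W0 = rng.standard_normal((M, M))     # zero mean, unit variance entries
        acc = fit_and_score(W0)              # validation accuracy for this start
        accuracies.append(acc)
        if acc > best_acc:
            best_W0, best_acc = W0, acc
    return best_W0, best_acc, np.array(accuracies)

def toy_score(W0):
    """Placeholder score in (0, 1]; a real callback would train and validate a model."""
    return 1.0 / (1.0 + np.linalg.norm(W0 - np.eye(len(W0))))

W_selected, acc_selected, all_acc = select_initialization(toy_score, M=4, n_init=100)
```

The array of per-initialization accuracies is what a kernel density estimate, such as the one reported for each subject, would be computed from.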

5.2. Number of EEG Channels

Another interesting analysis is to investigate the algorithm’s performance with respect to the number of EEG channels used as each IVA dataset input. Considering that DS4a EEG signals have 118 channels, the number of channels was analyzed using 13, 21, 37, 80, and 118. To reduce the number of channels, those located in brain regions known for higher activity during motor imagery tasks were chosen [40]. The results, shown in Figure 4, indicate that using all the available channels leads to a decrease in performance. The best results for DS4a were obtained by IVAS, with 37 EEG channels (FAF5, FAF1, FAF2, FAF6, F7, F5, F3, F1, Fz, F2, F4, F6, F8, FFC7, FFC5, FFC1, FFC2, FFC4, FFC6, FFC8, FT9, FT7, FC3, FC1, FCz, FC2, FC4, FC6, FT8, FT10, CFC7, CFC5, CFC3, CFC1, CFC2, CFC4, CFC6). Ideally, as the number of channels increases, more information is available. However, we hypothesize that the amount of noise also increases and could impair the feature extraction process. Among the classifiers tested, SVM performed best, whereas the IVAK algorithm presented the lowest performance when all 118 EEG channels were used.

5.3. Correlation Cross-Subjects

In Section 2.1, we mentioned the SCVs and how they are extracted through IVA. In this section, we present the relation between the results achieved from SCV covariance matrices, obtained through the use of the estimated sources $y_m^{[k]}$, and the DS4a cross-subjects. Figure 5 shows two SCV covariance matrix examples extracted from IVA components (IVA Cp.). These results are based on the use of 37 channels, which achieved the best outcome in the previous analysis. In Figure 5a, we present the covariance matrix obtained from the SCVs for the right-hand movement and IVA component 6 as an example. As can be seen, two cases present a higher cross-correlation: “aa” with “av”, which achieves a value of 0.909; and “aa” with “aw”, which achieves 0.708. The second example is based on the right-foot movement and IVA component 23, shown in Figure 5b, where the highest correlation of 0.936 was achieved also among subjects “aa” and “av”, but other relevant correlations were reached between subjects “av” and “aw”, with a value of 0.887, and subjects “al” and “aw”, with a value of 0.749.
Based on the covariance measure obtained from the SCVs, Table 1 and Table 2 show the results obtained for the five highest cross-subject correlations, for each MI class (the same considered in the discussion above), IVA component, and subject. For this reason, the same IVA component or subject may appear more than once, meaning that it contributed again to one of the highest correlation situations. The correlation values shown were averaged over the 10 folds.
In Section 5.1, we computed the KDE and investigated the IVA initialization for the five subjects, having noted a similar distribution between subjects “aa” and “av”, and subjects “aw” and “ay” in Figure 3. Comparing these results with the ones derived from Table 1 and Table 2, we can observe an analogous behavior, i.e., subjects “aa” and “av” present a high cross-correlation when considering IVA components 9 and 13, for right-hand movement, and components 24 and 3 for right-foot MI. Additionally, for subject “av”, four of the five selected correlations were related to subject “aa” in Table 2, which represents a strong relation between them. In the second case, for subjects “aw” and “ay”, higher correlations emerged from IVA components 9 and 2 for the right hand and right foot, respectively. These results present a significant correlation across subjects (around 0.45) and an intriguing perspective, since the KDE distribution of the subjects could lead to a potential clustering of MI-based BCI patients, from which similar features could be exploited to improve classification performance or aid the development of a global model.
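The cross-subject analysis can be sketched by computing, for each IVA component, the correlation matrix of its SCV across subjects and then ranking subject pairs. The function names `scv_correlation` and `top_pairs`, the use of absolute correlation for ranking, and the synthetic sources are assumptions made for this example; the values it produces are unrelated to the ones reported in the tables.

```python
import numpy as np
from itertools import combinations

def scv_correlation(Y, m):
    """K x K correlation matrix of the m-th SCV; Y has shape (K, M, N)."""
    return np.corrcoef(Y[:, m, :])

def top_pairs(Y, subjects, n_top=5):
    """Return the n_top (|correlation|, component, subject_i, subject_j) tuples."""
    K, M, _ = Y.shape
    rows = []
    for m in range(M):
        C = np.abs(scv_correlation(Y, m))
        for i, j in combinations(range(K), 2):
            rows.append((float(C[i, j]), m, subjects[i], subjects[j]))
    return sorted(rows, reverse=True)[:n_top]

# Synthetic estimated sources: component 1 of "aa" and "av" is made strongly correlated
rng = np.random.default_rng(0)
subjects = ["aa", "al", "av", "aw", "ay"]
Y = rng.standard_normal((5, 3, 2000))
Y[2, 1] = Y[0, 1] + 0.1 * rng.standard_normal(2000)

ranking = top_pairs(Y, subjects, n_top=5)
```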

5.4. Deep Learning Approaches

Thus far, we have combined IVA feature extraction with two different classification algorithms. IVAS presented the best performance and provided valuable features using, as parameters, $q = 4$ and $\mu = 1$. In order to explore the deep learning approach and evaluate the influence of the components extracted from IVA, the obtained independent components were applied to the EEGNet model (IVAEN) and EEG-Inception model (IVAEI). It is important to note that the AR step was not applied in this case. These methods were implemented and trained using the Braindecode library [41]. Moreover, we applied a data augmentation technique based on Gaussian white noise and/or replication [42]. Having chosen the parameters for each algorithm, Table 3 presents the final results for the DS4a dataset.
The second-best result for subjects “aa” and “al” was obtained using IVAEN. IVAEI achieved 92.1% and 91.4% for subjects “aw” and “ay”, respectively. The latter matched the WPD method’s performance, while the former showed a slight difference of 3.3%. The IVAS algorithm was able to find the third-best result for subject “av” (the most difficult subject to be classified) when compared to other results in the literature. On average, it is possible to note that the IVAEI results showed the second-best average accuracy performance of 86.7% and a standard deviation of 9.4.
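The data augmentation mentioned above (Gaussian white noise and/or replication) can be illustrated with a small NumPy function. The function name `augment_trials`, the noise level relative to each channel's standard deviation, and the number of copies are hypothetical choices; the settings actually used with the deep models are not given in the text.

```python
import numpy as np

def augment_trials(X, y, n_copies=1, noise_std=0.05, seed=0):
    """Append noisy copies of every trial to the training set.

    X: (n_trials, n_channels, n_samples); y: (n_trials,) labels.
    noise_std is relative to the standard deviation of each channel in each trial;
    noise_std = 0 reduces to plain replication.
    """
    rng = np.random.default_rng(seed)
    scale = noise_std * X.std(axis=-1, keepdims=True)
    noisy = [X + scale * rng.standard_normal(X.shape) for _ in range(n_copies)]
    X_aug = np.concatenate([X, *noisy])
    y_aug = np.concatenate([y] * (n_copies + 1))
    return X_aug, y_aug

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 37, 400))       # 10 toy trials, 37 channels, 4 s at 100 Hz
y = np.array([0, 1] * 5)
X_aug, y_aug = augment_trials(X, y, n_copies=2)
```

Augmentation of this kind would be applied to the training portion only, so that the validation and test data remain untouched.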

6. Conclusions and Future Perspectives

In this work, we have presented a feature extraction method for motor imagery classification through EEG signals. This approach minimizes the mutual information to achieve independent vector analysis through multiple datasets. The proposed method was evaluated using the BCI Competition III Dataset 4a with five subjects. Although there are some limitations in terms of tested datasets, concentrating on a single well-established one allowed us to conduct a more in-depth and focused analysis, using this dataset as a proof of concept. For DS4a, the IVAEI algorithm obtained the best results, reaching an accuracy of 86.7%, considering the average between all subjects. Moreover, we showed how the selection of the algorithm parameters, such as step sizes, number of AR coefficients, and number of EEG channels, affects the algorithm’s performance, and how the components’ correlations identified from IVA could lead to another interesting result, concerning the clustering of MI-based BCI patients. In the future, we consider investigating a generalization of the method, extending the work using more complex datasets, exploring the clustering problem, or even integrating multimodal data for enhanced feature extraction. Furthermore, IVA can be incorporated into a number of remaining challenging tasks in the EEG context, such as online analysis, transfer learning, and feedback systems. While IVA has been traditionally used as an offline method, new extensions allow for the regression of previous results using a new subject’s data without the need to perform a complete decomposition [43,44], which would allow for the required flexibility. A possible transfer learning approach is another interesting concept that can be exploited, considering that the correlation is naturally incorporated into the process to enhance the model’s performance and robustness.
Concerning the deep learning approaches, we focused on comparing different feature extraction methods where deep learning algorithms were used only for the final classification stage, since the feature extraction stage plays an important role in the classification of EEG signals for BCI systems. Additionally, the comparison between methods with and without feature extraction could yield insightful results regarding the use of IVA to obtain network weights, something that was not within the scope of this paper. In this sense, IVA might be better suited for a separate stage, but incorporating a feedback system could be a promising approach for future developments.
Finally, despite IVA's strong identifiability properties, the direct interpretation of sensor-domain components remains challenging. Future research could enhance interpretability by transforming the data to the spectral domain or by using features such as event-related potentials (ERPs), as proposed in previous studies [21,45,46].

Author Contributions

Conceptualization, C.P.A.M., L.H.d.S., D.G.F., A.N. and T.A.; methodology, C.P.A.M., D.G.F. and A.N.; software, C.P.A.M. and L.H.d.S.; validation, C.P.A.M. and L.H.d.S.; formal analysis, C.P.A.M., D.G.F. and A.N.; writing—review and editing, C.P.A.M., D.G.F., A.N. and T.A. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by São Paulo Research Foundation (FAPESP-Process #2023/00640-1), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES-Process 88887.595656/2020-00), and US National Science Foundation (NSF-2316420).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
AR: Autoregressive
BCI: Brain–computer interface
BSS: Blind source separation
DS4a: BCI Competition III Dataset 4a
EEG: Electroencephalogram
ICA: Independent component analysis
IVA: Independent vector analysis
JBSS: Joint blind source separation
KDE: Kernel density estimation
KNN: K-nearest neighbors
MI: Motor imagery
SCV: Source component vectors
SVM: Support vector machines

References

  1. Wolpaw, J.R. Brain-Computer Interfaces as new brain output pathways. J. Physiol. 2007, 579, 613–619.
  2. Nijholt, A.; Tan, D.; Pfurtscheller, G.; Brunner, C.; Millán, J.d.R.; Allison, B.; Graimann, B.; Popescu, F.; Blankertz, B.; Müller, K.R. Brain-Computer Interfacing for intelligent systems. IEEE Intell. Syst. 2008, 23, 72–79.
  3. Song, Z.; Fang, T.; Ma, J.; Zhang, Y.; Le, S.; Zhan, G.; Zhang, X.; Wang, S.; Li, H.; Lin, Y.; et al. Evaluation and Diagnosis of Brain Diseases based on Non-invasive BCI. In Proceedings of the 2021 9th International Winter Conference on Brain-Computer Interface (BCI), Gangwon, Republic of Korea, 22–24 February 2021; pp. 1–6.
  4. Geronimo, A.; Simmons, Z.; Schiff, S. Performance predictors of brain-computer interfaces in patients with amyotrophic lateral sclerosis. J. Neural Eng. 2016, 13, 026002.
  5. Grilo, M.; Ribeiro, L.; Moraes, C.; Melo, C.; Fantinato, D.; Sampaio, L.; Neves, A.; Ramos, R. Artifact Removal in EEG based Emotional Signals through Linear and Nonlinear Methods. In Proceedings of the 2019 E-Health and Bioengineering Conference (EHB), Iasi, Romania, 21–23 November 2019; pp. 1–4.
  6. López-Larraz, E.; Antelis, J.M.; Montesano, L.; Gil-Agudo, A.; Minguez, J. Continuous decoding of motor attempt and motor imagery from EEG activity in spinal cord injury patients. In Proceedings of the 2012 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, San Diego, CA, USA, 28 August–1 September 2012; pp. 1798–1801.
  7. Padfield, N.; Zabalza, J.; Zhao, H.; Masero, V.; Ren, J. EEG-based brain-computer interfaces using motor-imagery: Techniques and challenges. Sensors 2019, 19, 1423.
  8. Pfurtscheller, G.; Neuper, C.; Flotzinger, D.; Pregenzer, M. EEG-based discrimination between imagination of right and left hand movement. Electroencephalogr. Clin. Neurophysiol. 1997, 103, 642–651.
  9. Jackson, P.L.; Lafleur, M.F.; Malouin, F.; Richards, C.; Doyon, J. Potential role of mental practice using motor imagery in neurologic rehabilitation. Arch. Phys. Med. Rehabil. 2001, 82, 1133–1141.
  10. Kappes, H.B.; Morewedge, C.K. Mental simulation as substitute for experience. Soc. Personal. Psychol. Compass 2016, 10, 405–420.
  11. Debarnot, U.; Guillot, A. When music tempo affects the temporal congruence between physical practice and motor imagery. Acta Psychol. 2014, 149, 40–44.
  12. Lu, N.; Li, T.; Pan, J.; Ren, X.; Feng, Z.; Miao, H. Structure constrained semi-nonnegative matrix factorization for EEG-based motor imagery classification. Comput. Biol. Med. 2015, 60, 32–39.
  13. Ameri, R.; Pouyan, A.; Abolghasemi, V. Projective dictionary pair learning for EEG signal classification in brain computer interface applications. Neurocomputing 2016, 218, 382–389.
  14. Hou, Y.; Chen, T.; Lun, X.; Wang, F. A novel method for classification of multi-class motor imagery tasks based on feature fusion. Neurosci. Res. 2022, 176, 40–48.
  15. Kevric, J.; Subasi, A. Comparison of signal decomposition methods in classification of EEG signals for motor-imagery BCI system. Biomed. Signal Process. Control 2017, 31, 398–406.
  16. Kim, T.; Eltoft, T.; Lee, T.W. Independent vector analysis: An extension of ICA to multivariate components. In International Conference on Independent Component Analysis and Signal Separation; Springer: Berlin/Heidelberg, Germany, 2006; pp. 165–172.
  17. Anderson, M.; Adali, T.; Li, X.L. Joint blind source separation with multivariate Gaussian model: Algorithms and performance analysis. IEEE Trans. Signal Process. 2011, 60, 1672–1683.
  18. Lee, J.H.; Lee, T.W.; Jolesz, F.A.; Yoo, S.S. Independent vector analysis (IVA): Multivariate approach for fMRI group study. Neuroimage 2008, 40, 86–109.
  19. Adali, T.; Anderson, M.; Fu, G.S. Diversity in independent component and vector analyses: Identifiability, algorithms, and applications in medical imaging. IEEE Signal Process. Mag. 2014, 31, 18–33.
  20. Wang, K.; Chen, X.; Wu, L.; Zhang, X.; Chen, X.; Wang, Z.J. High-density surface EMG denoising using independent vector analysis. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 1271–1281.
  21. Adali, T.; Levin-Schwartz, Y.; Calhoun, V.D. Multimodal data fusion using source separation: Two effective models based on ICA and IVA and their properties. Proc. IEEE 2015, 103, 1478–1493.
  22. Chen, X.; Peng, H.; Yu, F.; Wang, K. Independent vector analysis applied to remove muscle artifacts in EEG data. IEEE Trans. Instrum. Meas. 2017, 66, 1770–1779.
  23. Du, Y.; Fu, Z.; Calhoun, V.D. Classification and prediction of brain disorders using functional connectivity: Promising but challenging. Front. Neurosci. 2018, 12, 525.
  24. Allen, E.A.; Erhardt, E.B.; Wei, Y.; Eichele, T.; Calhoun, V.D. Capturing inter-subject variability with group independent component analysis of fMRI data: A simulation study. Neuroimage 2012, 59, 4141–4159.
  25. Ma, S.; Phlypo, R.; Calhoun, V.D.; Adalı, T. Capturing group variability using IVA: A simulation study and graph-theoretical analysis. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 3128–3132.
  26. Michael, A.M.; Anderson, M.; Miller, R.L.; Adalı, T.; Calhoun, V.D. Preserving subject variability in group fMRI analysis: Performance evaluation of GICA vs. IVA. Front. Syst. Neurosci. 2014, 8, 106.
  27. Laney, J.; Westlake, K.P.; Ma, S.; Woytowicz, E.; Calhoun, V.D.; Adalı, T. Capturing subject variability in fMRI data: A graph-theoretical analysis of GICA vs. IVA. J. Neurosci. Methods 2015, 247, 32–40.
  28. Yang, H.; Akhonda, M.A.; Ghayem, F.; Long, Q.; Calhoun, V.D.; Adali, T. Independent vector analysis based subgroup identification from multisubject fMRI data. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 1471–1475.
  29. Li, X.L.; Zhang, X.D. Nonorthogonal joint diagonalization free of degenerate solution. IEEE Trans. Signal Process. 2007, 55, 1803–1814.
  30. Brüggemann, R. Model Reduction Methods for Vector Autoregressive Processes; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2012; Volume 536.
  31. Kargin, V.; Onatski, A. Curve forecasting by functional autoregression. J. Multivar. Anal. 2008, 99, 2508–2526.
  32. Pfurtscheller, G.; Neuper, C.; Schlogl, A.; Lugger, K. Separability of EEG signals recorded during right and left motor imagery using adaptive autoregressive parameters. IEEE Trans. Rehabil. Eng. 1998, 6, 316–325.
  33. Stoica, P.; Friedlander, B.; Söderström, T. A high-order Yule-Walker method for estimation of the AR parameters of an ARMA model. Syst. Control Lett. 1988, 11, 99–105.
  34. Vapnik, V. Statistical Learning Theory; Wiley New York: New York, NY, USA, 1998.
  35. Fix, E.; Hodges, J.L. Discriminatory analysis. Nonparametric discrimination: Consistency properties. Int. Stat. Rev. Int. Stat. 1989, 57, 238–247.
  36. Lawhern, V.J.; Solon, A.J.; Waytowich, N.R.; Gordon, S.M.; Hung, C.P.; Lance, B.J. EEGNet: A compact convolutional neural network for EEG-based brain-computer interfaces. J. Neural Eng. 2018, 15, 056013.
  37. Santamaria-Vazquez, E.; Martinez-Cagigal, V.; Vaquerizo-Villar, F.; Hornero, R. EEG-inception: A novel deep convolutional neural network for assistive ERP-based brain-computer interfaces. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 2773–2782.
  38. Blankertz, B.; Muller, K.R.; Curio, G.; Vaughan, T.M.; Schalk, G.; Wolpaw, J.R.; Schlogl, A.; Neuper, C.; Pfurtscheller, G.; Hinterberger, T.; et al. The BCI competition 2003: Progress and perspectives in detection and discrimination of EEG single trials. IEEE Trans. Biomed. Eng. 2004, 51, 1044–1051.
  39. Moraes, C.P.A.; Aristimunha, B.; Dos Santos, L.H.; Pinaya, W.H.L.; de Camargo, R.Y.; Fantinato, D.G.; Neves, A. Applying Independent Vector Analysis on EEG-Based Motor Imagery Classification. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
  40. Tiwari, A.; Chaturvedi, A. A novel channel selection method for BCI classification using dynamic channel relevance. IEEE Access 2021, 9, 126698–126716.
  41. Schirrmeister, R.T.; Springenberg, J.T.; Fiederer, L.D.J.; Glasstetter, M.; Eggensperger, K.; Tangermann, M.; Hutter, F.; Burgard, W.; Ball, T. Deep learning with convolutional neural networks for EEG decoding and visualization. Hum. Brain Mapp. 2017, 38, 5391–5420.
  42. Rommel, C.; Paillard, J.; Moreau, T.; Gramfort, A. Data augmentation for learning predictive models on EEG: A systematic comparison. J. Neural Eng. 2022, 19, 066020.
  43. Gabrielson, B.; Sun, M.; Akhonda, M.A.; Calhoun, V.D.; Adali, T. Independent vector analysis with multivariate Gaussian model: A scalable method by multilinear regression. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
  44. Vu, T.; Yang, H.; Laport, F.; Gabrielson, B.; Calhoun, V.D.; Adalı, T. A Robust and Scalable Method with an Analytic Solution for Multi-Subject FMRI Data Analysis. In Proceedings of the ICASSP 2024—2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 1831–1835.
  45. Adali, T.; Akhonda, M.; Calhoun, V.D. ICA and IVA for data fusion: An overview and a new approach based on disjoint subspaces. IEEE Sens. Lett. 2018, 3, 7100404.
  46. Belyaeva, I.; Gabrielson, B.; Wang, Y.P.; Wilson, T.W.; Calhoun, V.D.; Stephen, J.M.; Adali, T. Learning Spatiotemporal Brain Dynamics in Adolescents via Multimodal MEG and fMRI Data Fusion Using Joint Tensor/Matrix Decomposition. IEEE Trans. Biomed. Eng. 2024, 71, 2189–2200.
Figure 1. Procedure description to obtain the IVA matrices $\mathbf{W}_c^{[k]}$ for each class based on the training data.
Figure 2. Procedure description for the k-th subject used in training and test datasets.
Figure 3. Performance analysis of IVAS and IVAK concerning IVA initialization based on the KDE for subjects from Dataset4a. (a) Dataset4a with IVAS; (b) Dataset4a with IVAK.
Figure 4. IVAS and IVAK performance analysis with respect to the number of EEG channels.
Figure 5. Examples of SCV covariance matrices obtained through IVA for the DS4a considering right hand and right foot movements.
Table 1. Main IVA component correlations per subject for right hand (DS4a) and the similarities between subjects compared with the KDE analysis.

| Subject | | Right Hand | | | | |
|---|---|---|---|---|---|---|
| “aa” | IVA Cp. | Cp. 9 | Cp. 21 | Cp. 18 | Cp. 13 | Cp. 17 |
| | Cross-Subj. | “av” | “al” | “ay” | “av” | “ay” |
| | Correlation | 0.493 | 0.463 | 0.450 | 0.435 | 0.432 |
| “al” | IVA Cp. | Cp. 21 | Cp. 30 | Cp. 21 | Cp. 8 | Cp. 23 |
| | Cross-Subj. | “aa” | “av” | “av” | “aa” | “aa” |
| | Correlation | 0.463 | 0.458 | 0.431 | 0.430 | 0.422 |
| “av” | IVA Cp. | Cp. 9 | Cp. 30 | Cp. 13 | Cp. 9 | Cp. 21 |
| | Cross-Subj. | “aa” | “al” | “aa” | “ay” | “al” |
| | Correlation | 0.493 | 0.458 | 0.435 | 0.434 | 0.431 |
| “aw” | IVA Cp. | Cp. 9 | Cp. 30 | Cp. 23 | Cp. 26 | Cp. 23 |
| | Cross-Subj. | “ay” | “aa” | “al” | “aa” | “aa” |
| | Correlation | 0.418 | 0.418 | 0.414 | 0.407 | 0.398 |
| “ay” | IVA Cp. | Cp. 18 | Cp. 9 | Cp. 17 | Cp. 2 | Cp. 9 |
| | Cross-Subj. | “aa” | “av” | “aa” | “aa” | “aw” |
| | Correlation | 0.450 | 0.434 | 0.432 | 0.429 | 0.418 |
Table 2. Main IVA component correlations per subject for right foot (DS4a) and the similarities between subjects compared with the KDE analysis.

| Subject | | Right Foot | | | | |
|---|---|---|---|---|---|---|
| “aa” | IVA Cp. | Cp. 16 | Cp. 2 | Cp. 24 | Cp. 6 | Cp. 3 |
| | Cross-Subj. | “al” | “al” | “av” | “ay” | “av” |
| | Correlation | 0.535 | 0.470 | 0.465 | 0.450 | 0.444 |
| “al” | IVA Cp. | Cp. 16 | Cp. 16 | Cp. 2 | Cp. 2 | Cp. 16 |
| | Cross-Subj. | “aw” | “aa” | “aa” | “ay” | “av” |
| | Correlation | 0.540 | 0.535 | 0.470 | 0.463 | 0.448 |
| “av” | IVA Cp. | Cp. 24 | Cp. 16 | Cp. 3 | Cp. 11 | Cp. 8 |
| | Cross-Subj. | “aa” | “al” | “aa” | “aa” | “aa” |
| | Correlation | 0.465 | 0.448 | 0.444 | 0.437 | 0.426 |
| “aw” | IVA Cp. | Cp. 16 | Cp. 2 | Cp. 16 | Cp. 16 | Cp. 11 |
| | Cross-Subj. | “al” | “ay” | “av” | “aa” | “al” |
| | Correlation | 0.540 | 0.449 | 0.420 | 0.420 | 0.410 |
| “ay” | IVA Cp. | Cp. 2 | Cp. 6 | Cp. 2 | Cp. 7 | Cp. 27 |
| | Cross-Subj. | “al” | “aa” | “aw” | “aa” | “aa” |
| | Correlation | 0.463 | 0.450 | 0.449 | 0.438 | 0.436 |
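The entries of Tables 1 and 2 pair an IVA component with the other subject whose corresponding component is most correlated. A minimal sketch of how such a ranking could be produced is given here, assuming the estimated sources are stacked in an array of shape (subjects, components, samples). The function names `scv_correlations` and `top_pairs`, the array layout, and the synthetic data are illustrative assumptions, not the authors' code or data.

```python
import numpy as np


def scv_correlations(sources):
    """Return one subject-by-subject correlation matrix per component.

    sources: array of shape (K, N, T) = (subjects, components, samples).
    Output:  array of shape (N, K, K).
    """
    n_subjects, n_components, _ = sources.shape
    corr = np.empty((n_components, n_subjects, n_subjects))
    for n in range(n_components):
        # Rows of sources[:, n, :] are the n-th component of each subject.
        corr[n] = np.corrcoef(sources[:, n, :])
    return corr


def top_pairs(corr, subject, top=5):
    """List (component, other_subject, |correlation|), strongest first."""
    n_components, n_subjects, _ = corr.shape
    entries = [
        (float(abs(corr[n, subject, other])), n, other)
        for n in range(n_components)
        for other in range(n_subjects)
        if other != subject
    ]
    entries.sort(reverse=True)
    return [(n, other, value) for value, n, other in entries[:top]]


# Synthetic stand-in for IVA outputs (NOT data from the paper):
# 3 subjects, 4 components, 2000 samples; component 0 is shared
# between subjects 0 and 1, everything else is independent noise.
rng = np.random.default_rng(0)
sources = rng.standard_normal((3, 4, 2000))
shared = rng.standard_normal(2000)
sources[0, 0] = shared + 0.5 * rng.standard_normal(2000)
sources[1, 0] = shared + 0.5 * rng.standard_normal(2000)

corr = scv_correlations(sources)
for component, other, value in top_pairs(corr, subject=0, top=3):
    print(f"component {component}, subject {other}: |corr| = {value:.3f}")
```

In this toy setting, the strongest pair for subject 0 is component 0 with subject 1, mirroring how each row block of the tables lists a subject's most correlated components and partners.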
Table 3. Accuracy (%) and standard deviation obtained in classifying BCI Competition III Dataset 4a.

| Methods | Subject “aa” | Subject “al” | Subject “av” | Subject “aw” | Subject “ay” | Average ± Sd |
|---|---|---|---|---|---|---|
| SCS-NMF | 64.2 | 92.67 | 60.0 | 72.6 | 55.3 | 68.9 ± 14.7 |
| DPL | 81.5 | 100 | 60.2 | 83.0 | 79.4 | 80.8 ± 14.1 |
| BECSP | 77.7 | 100 | 73.9 | 84.8 | 88.1 | 84.9 ± 10.1 |
| WPD | 96 | 92.3 | 88.9 | 95.4 | 91.4 | 92.8 ± 2.9 |
| IVAS | 71.8 | 96.1 | 70.0 | 84.3 | 85.0 | 81.4 ± 10.7 |
| IVAK | 59.6 | 93.6 | 61.1 | 72.1 | 68.6 | 71.0 ± 12.2 |
| IVAEN | 87.8 | 98.5 | 66.4 | 68.6 | 82.6 | 80.8 ± 13.4 |
| IVAEI | 84.3 | 96.4 | 69.3 | 92.1 | 91.4 | 86.7 ± 9.4 |
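The Average column of Table 3 can be recomputed from the per-subject accuracies. The sketch below does so with Python's standard library. This section does not state which standard deviation convention (sample or population) was used, so both are printed for comparison rather than asserted. The dictionary names and the `summarize` helper are illustrative.

```python
import statistics

# Per-subject accuracies (%) transcribed from Table 3, in the order
# "aa", "al", "av", "aw", "ay".
accuracies = {
    "SCS-NMF": [64.2, 92.67, 60.0, 72.6, 55.3],
    "DPL": [81.5, 100.0, 60.2, 83.0, 79.4],
    "BECSP": [77.7, 100.0, 73.9, 84.8, 88.1],
    "WPD": [96.0, 92.3, 88.9, 95.4, 91.4],
    "IVAS": [71.8, 96.1, 70.0, 84.3, 85.0],
    "IVAK": [59.6, 93.6, 61.1, 72.1, 68.6],
    "IVAEN": [87.8, 98.5, 66.4, 68.6, 82.6],
    "IVAEI": [84.3, 96.4, 69.3, 92.1, 91.4],
}

# Averages as reported in the last column of Table 3.
reported_average = {
    "SCS-NMF": 68.9, "DPL": 80.8, "BECSP": 84.9, "WPD": 92.8,
    "IVAS": 81.4, "IVAK": 71.0, "IVAEN": 80.8, "IVAEI": 86.7,
}


def summarize(values):
    """Return (mean, sample SD, population SD) of a list of accuracies."""
    return (
        statistics.fmean(values),
        statistics.stdev(values),   # divides by n - 1
        statistics.pstdev(values),  # divides by n
    )


for method, values in accuracies.items():
    mean, sd_sample, sd_population = summarize(values)
    print(
        f"{method:8s} mean={mean:5.1f} (reported {reported_average[method]:.1f}) "
        f"sample_sd={sd_sample:4.1f} population_sd={sd_population:4.1f}"
    )

# Best-performing IVA-based variant according to the recomputed means.
best_iva = max(
    (m for m in accuracies if m.startswith("IVA")),
    key=lambda m: statistics.fmean(accuracies[m]),
)
print("best IVA variant:", best_iva)
```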
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Moraes, C.P.A.; dos Santos, L.H.; Fantinato, D.G.; Neves, A.; Adali, T. Independent Vector Analysis for Feature Extraction in Motor Imagery Classification. Sensors 2024, 24, 5428. https://doi.org/10.3390/s24165428

