4.2.1. Motor Imagery Results

Figure 6 depicts the average classification accuracy for all subjects in the MI database as a function of the number of selected features during the training-validation stage, for TE*κα* and TE*<sup>θ</sup> κα*. These results show a small improvement in the ability to discriminate between the MI tasks when using features extracted through phase TE, as compared with real-valued TE. In addition, they reveal that the CKA-based feature selection strategy successfully identified the most relevant connections for MI task classification; that is, the classification system performs stably even with a very small number of connectivity features. This is fundamental for any practical BCI application that intends to use phase TE as a characterization strategy, since estimating single-trial phase TE is computationally expensive [8]. Therefore, it is important to reduce as much as possible the number of channel-pair connectivity features required to achieve peak classification performance. Additionally, while the classification accuracies in Figure 6 and in Table 1 are in the same range as those obtained through other connectivity-based characterization approaches [10,23], they are far below those obtained from methods such as common spatial patterns [59–61]. A possible explanation is that bivariate TE might be more robust at describing long-range interactions than local ones [41], such as those arising from MI-related activity, centered on the sensorimotor area. In addition, the differences with the results in [10], where we used TE*κα* to characterize the same database, lie mostly in the fact that in this study we select and analyze one 2 s long time window covering the period right after the end of the visual cue, while in [10] we report results from multiple overlapping windows covering the entirety of the task. Lastly, the large standard deviations around the average accuracies in Figure 6 point to disparate performances for different subjects.

**Figure 6.** Average classification accuracies, and their standard deviations, for all subjects in the MI database as a function of the number of features selected to train the classifiers.
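
The kind of summary shown in Figure 6 can be sketched as follows. This is a minimal illustration, not the code used in this study: the function names, the use of the sample standard deviation, the `tolerance` parameter, and the toy accuracies are all assumptions introduced here. It averages per-subject accuracies for each number of selected features and then looks for the smallest feature count whose mean accuracy is close to the peak.

```python
from statistics import mean, stdev


def accuracy_curve(acc_by_subject):
    """Mean and sample standard deviation across subjects per feature count.

    acc_by_subject[s][k] is the accuracy of subject s when the classifier is
    trained with the k + 1 top-ranked features (hypothetical layout).
    """
    n_steps = len(acc_by_subject[0])
    means, stds = [], []
    for k in range(n_steps):
        column = [subject_acc[k] for subject_acc in acc_by_subject]
        means.append(mean(column))
        stds.append(stdev(column))  # sample std (n - 1), an assumption
    return means, stds


def smallest_feature_count(means, tolerance=0.01):
    """Smallest number of features whose mean accuracy is within
    `tolerance` of the peak mean accuracy."""
    peak = max(means)
    for k, value in enumerate(means):
        if value >= peak - tolerance:
            return k + 1
    return len(means)


# Toy accuracies for three hypothetical subjects and four feature counts.
toy = [
    [0.55, 0.60, 0.61, 0.60],
    [0.70, 0.80, 0.82, 0.81],
    [0.60, 0.72, 0.71, 0.72],
]
means, stds = accuracy_curve(toy)
best_k = smallest_feature_count(means, tolerance=0.01)
```

The tolerance makes the trade-off explicit: a slightly lower mean accuracy is accepted in exchange for fewer channel-pair features, which matters when each feature requires a costly single-trial estimate.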

Figure 7A shows the highest average classification accuracy per subject for TE*<sup>θ</sup> κα*, GC*<sup>θ</sup>* and PSI, during the training-validation stage. The subjects are ordered from highest to lowest performance. The analogous information for the testing stage is presented in Figure 7B. In both stages, the TE*<sup>θ</sup> κα*-based classifier performs slightly better than those based on alternative connectivity estimation strategies in most subjects. In addition, as inferred from Figure 6, there are large variations in performance for the different subjects in the database, consistent across the two classification stages. This behavior has been reported elsewhere [10,59–62].

**Figure 7.** (**A**) Highest average classification accuracy for each subject in the MI database during the training-validation stage. (**B**) Accuracies obtained for each subject during the testing stage. The subjects are ordered from highest to lowest performance according to the accuracies obtained for the TE*<sup>θ</sup> κα*-based classifier in the training-validation stage.

To gain insight into the observed performance differences in the case of TE*<sup>θ</sup> κα*, we exploited the second advantage provided by the CKA-based relevance analysis. The relevance vector index not only allows us to perform feature selection but also provides a one-to-one mapping between relevance values and connectivity features. That is to say, we can reconstruct normalized relevance connectivity matrices by properly reshaping the relevance vector, so as to visualize the connectivity pairs and frequency ranges that are discriminant for the task of interest. Along those lines, we followed the approach proposed in [23] to interpret the relevance information by clustering the subjects according to common relevance patterns.
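
The reshaping step can be illustrated with a short sketch. The feature layout is not specified in this section, so the ordering below is an assumption: features are taken to be grouped by frequency band and, within each band, stored row by row as source-to-target pairs with the diagonal skipped. Normalizing by the largest relevance value is likewise an assumption, and `relevance_to_matrices` is a hypothetical helper name.

```python
def relevance_to_matrices(relevance, n_channels, n_bands):
    """Reshape a flat relevance vector into one C x C matrix per band.

    Assumed layout: bands are concatenated; inside a band, entries follow
    row-major order over (source, target) pairs, excluding source == target.
    """
    per_band = n_channels * (n_channels - 1)
    if len(relevance) != per_band * n_bands:
        raise ValueError("relevance length does not match the assumed layout")

    peak = max(relevance)
    matrices = []
    idx = 0
    for _ in range(n_bands):
        matrix = [[0.0] * n_channels for _ in range(n_channels)]
        for src in range(n_channels):
            for tgt in range(n_channels):
                if src == tgt:
                    continue  # no self-connections
                # Normalize to [0, 1] by the largest relevance (assumption).
                matrix[src][tgt] = relevance[idx] / peak if peak > 0 else 0.0
                idx += 1
        matrices.append(matrix)
    return matrices


# Toy example: 3 channels, 2 bands, so 3 * 2 * 2 = 12 directed connections.
toy_relevance = [float(v) for v in range(1, 13)]
toy_matrices = relevance_to_matrices(toy_relevance, n_channels=3, n_bands=2)
```

Because the mapping is one-to-one, every entry of each matrix can be traced back to exactly one connectivity feature, which is what makes the relevance values interpretable in terms of channel pairs and frequency bands.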

First, for each subject and frequency band of interest Δ*f*, we obtained a nodal relevance vector in ℝ<sup>*C*</sup> whose elements were associated with each node (EEG channel) in the data, by computing the relevance of the total information flow of every node. This magnitude was defined as the sum of the relevance values, obtained from all data in the training dataset, corresponding to all directed interactions targeting and originating from a particular node. Then, we concatenated the per-band vectors for all frequency bands to obtain a single relevance vector in ℝ<sup>2*C*</sup> per subject. Next, we reduced the dimension of the relevance vectors of each subject through t-Distributed Stochastic Neighbor Embedding (t-SNE), which aims to preserve the neighborhood relationships existing in the initial higher-dimensional space [63]. Figure 8A shows the obtained two-dimensional representation of the relevance vectors for each subject in the MI database, colored according to their respective classification accuracy. Note that the distribution of the subjects in the plot is related to their classification accuracies. This indicates that shared relevance patterns are related to the obtained classification results, meaning that subjects with similar relevance vectors had similar performances. Then, we grouped the subjects into two clusters using the k-means algorithm. The number of clusters was selected by visual inspection of the t-SNE results. Figure 8B displays the two groups, termed G. I and G. II. The TE*<sup>θ</sup> κα*-based classifier has average accuracies of 0.59 ± 0.05 and 0.80 ± 0.09 for the subjects in G. I and G. II, respectively.
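
The nodal relevance computation described above can be sketched as follows. The matrix convention (rows as source channels, columns as target channels) and the function names are assumptions made for illustration. For every node, the sketch sums the relevance of its outgoing and incoming connections, and then concatenates the per-band results into a single vector of length 2*C* when two bands are used. The subsequent t-SNE step is not reproduced here.

```python
def nodal_relevance(matrix):
    """Total information-flow relevance per node for one frequency band.

    matrix[src][tgt] holds the relevance of the directed connection
    src -> tgt (assumed convention); the diagonal is ignored.
    """
    n = len(matrix)
    totals = []
    for node in range(n):
        outgoing = sum(matrix[node][j] for j in range(n) if j != node)
        incoming = sum(matrix[i][node] for i in range(n) if i != node)
        totals.append(outgoing + incoming)
    return totals


def subject_relevance_vector(band_matrices):
    """Concatenate the nodal relevance of every band into one flat vector."""
    vector = []
    for matrix in band_matrices:
        vector.extend(nodal_relevance(matrix))
    return vector


# Toy relevance matrices for 3 channels and 2 bands (illustrative values).
toy_alpha = [[0, 1, 2], [3, 0, 4], [5, 6, 0]]
toy_beta = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
toy_vector = subject_relevance_vector([toy_alpha, toy_beta])
```

With 3 channels and 2 bands the result has 6 entries, one per channel and band, matching the 2*C*-dimensional vector that is later embedded and clustered.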

**Figure 8.** (**A**) Two-dimensional representation of the relevance vectors for each subject in the MI database, obtained after applying t-SNE on the concatenated nodal relevance vectors. (**B**) Groups identified by k-means. For the TE*<sup>θ</sup> κα*-based classifier the subjects grouped in G. I have an average accuracy of 0.59 ± 0.05, while those in G. II have an average accuracy of 0.80 ± 0.09.
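
The grouping and the per-group accuracy summary can be sketched with a small two-cluster k-means (Lloyd's algorithm) on already embedded two-dimensional points. This is an illustrative sketch, not the implementation used in this study: the deterministic initialization, the function names, and the toy coordinates and accuracies are assumptions, and the toy values are not taken from the MI database.

```python
from statistics import mean, stdev


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_means(points, n_iter=100):
    """Lloyd's k-means with k = 2.

    Initialization (an assumption): the first point and the point
    farthest from it serve as the initial centroids.
    """
    centroids = [points[0], max(points, key=lambda p: sq_dist(p, points[0]))]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min((0, 1), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels


def group_summary(labels, accuracies):
    """Mean and sample standard deviation of the accuracies per cluster."""
    summary = {}
    for c in sorted(set(labels)):
        accs = [a for a, lab in zip(accuracies, labels) if lab == c]
        summary[c] = (mean(accs), stdev(accs) if len(accs) > 1 else 0.0)
    return summary


# Toy 2-D embedding and accuracies for six hypothetical subjects.
toy_points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
              (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
toy_acc = [0.58, 0.61, 0.57, 0.82, 0.75, 0.88]
toy_labels = two_means(toy_points)
toy_summary = group_summary(toy_labels, toy_acc)
```

Note that cluster labels produced by k-means are arbitrary, so which cluster is called G. I or G. II has to be decided afterwards, for example from the group accuracies.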

Finally, Figure 9 shows the average nodal relevance, as defined by the nodal relevance vectors, and the most relevant connectivities for each group, discriminated by frequency band. For G. I we observe high node relevance mostly in the *α* band in right fronto-central, left-central, and centro-parietal regions. The most relevant connections in the *α* band tend to originate from or target fronto-central nodes, while the ones in the *β<sup>l</sup>* band favor parietal and centro-parietal areas. For G. II, the node relevance is concentrated around the right centro-parietal region, particularly channel CP4, for both frequency bands. The most relevant connections in the *α* band involve short-range interactions mainly between centro-parietal and central regions. The most relevant connections in the *β<sup>l</sup>* band, which display higher values than those of *α*, originate from CP3 and CP4 and target central and fronto-central nodes. Since G. II includes all the subjects with good classification performances, we can conclude that the information that allows the left and right hand MI tasks to be satisfactorily classified from TE*<sup>θ</sup> κα* features corresponds mostly to the incoming and outgoing information flow coded in the phases of the oscillatory activity in the centro-parietal region. These results are in line, in terms of spatial location, with those we found in [10], and with physiological interpretations that argue that MI activates motor representations in the parietal area and the premotor cortex [64].

**Figure 9.** Topoplots of the average node (channel) relevance for each group of clustered subjects and frequency band of interest in the MI database (see Figure 8). The arrows represent the most relevant connectivities for each group. For visualization purposes, only 3% of the connections, those with the highest average relevance values per group, are depicted.
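
The selection of the connections drawn in Figure 9 can be sketched as follows, assuming the input is a group-averaged relevance matrix for one frequency band with rows as sources and columns as targets. The function name and the rounding rule (rounding up, with at least one connection kept) are assumptions, since the caption only states that 3% of the connections are depicted.

```python
import math


def top_connections(matrix, fraction=0.03):
    """Return the most relevant directed connections as (src, tgt, value).

    matrix[src][tgt] is the group-averaged relevance of src -> tgt
    (assumed convention). The number of connections kept is rounded up.
    """
    n = len(matrix)
    edges = [
        (matrix[s][t], s, t) for s in range(n) for t in range(n) if s != t
    ]
    n_keep = max(1, math.ceil(fraction * len(edges)))
    edges.sort(key=lambda edge: edge[0], reverse=True)
    return [(s, t, value) for value, s, t in edges[:n_keep]]


# Toy 4-channel matrix with distinct off-diagonal values.
toy_matrix = [[0 if s == t else s * 4 + t for t in range(4)] for s in range(4)]
toy_top = top_connections(toy_matrix, fraction=0.25)
```

Each returned triple can be drawn as an arrow from the source channel to the target channel on the corresponding topoplot.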
