**1. Introduction**

Neural oscillations are observed in the mammalian brain at different temporal and spatial scales [1]. Oscillations in specific frequency bands are present in distinct neural networks, and their interactions have been linked to fundamental cognitive processes such as attention and memory [2,3] and to information processing at large [4]. Three properties characterize such oscillations: amplitude, frequency, and phase, the latter referring to the position of a signal within an oscillation cycle [5]. Oscillation amplitudes are related to the expansion of neural synchrony in a local assembly, while the relationships between the phases of neural oscillations, such as phase synchronization, are involved in the coordination of anatomically distributed processing [6]. Moreover, from a functional perspective, phase synchronization and amplitude correlations are independent phenomena [7], hence the interest in studying phase-based interactions independently from other spectral relationships. Additionally, phase relationships are linked to neural synchronization and information flow within networks of connected neural assemblies [8]. Therefore, a measure that aims to capture phase-based interactions among signals from distributed brain regions should ideally include a description of the direction of interaction. A fitting framework for such a measure is that of brain effective connectivity [9].

**Citation:** De La Pava Panche, I.; Álvarez-Meza, A.; Herrera Gómez, P.M.; Cárdenas-Peña, D.; Ríos Patiño, J.I.; Orozco-Gutiérrez, Á. Kernel-Based Phase Transfer Entropy with Enhanced Feature Relevance Analysis for Brain Computer Interfaces. *Appl. Sci.* **2021**, *11*, 6689. https://doi.org/10.3390/app11156689

Academic Editor: Gabriele Cervino

Received: 2 June 2021 Accepted: 19 July 2021 Published: 21 July 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Effective brain connectivity, also known as directed functional connectivity, measures the influence that one neural assembly has over another, establishing a direction for their interaction by estimating statistical causation from their signals [10]. Directed interactions between oscillations of similar frequency can be captured through measures such as Geweke-Granger causality statistics, partially directed coherence, and the directed transfer function [9,11]. However, since these metrics depend on both the amplitude and phase components of the signals, they do not identify phase-specific information flow [8]. The phase slope index (PSI), introduced in [12], measures the direction of coupling between oscillations from the slope of their phases; still, it only captures linear phase relationships [13]. The concept of phase transfer entropy, a phase-specific nonlinear directed connectivity measure introduced in [8], arises in this context. Transfer entropy (TE) is an information-theoretic quantity, based on Wiener's definition of causality, that estimates the directed interaction, or information flow, between two dynamical systems [14,15]. In [8], the authors first extract instantaneous phase time series by complex filtering the signals of interest at a particular frequency, since a signal's phase is only physically meaningful when its spectrum is narrowband [16]. Such a filtering-based approach has also been explored to obtain phase-specific versions of other information-theoretic metrics such as permutation entropy and time-delayed mutual information [7,16]. Then, the authors compute TE from the obtained phase time series. Nonetheless, since conventional TE estimators are not well suited for periodic variables, in [8] phase TE estimates are obtained through a binning approach performed over multiple trials simultaneously, in a procedure termed trial collapsing.
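As an illustration of the phase-extraction step described above, the following minimal sketch obtains an instantaneous phase time series by convolving a signal with a complex Morlet wavelet, which is one common form of complex narrowband filtering. The function name `morlet_phase`, the `n_cycles` parameter, and the truncation of the wavelet at four standard deviations are assumptions made for this example and are not taken from [8]; the sketch also assumes the signal is longer than the wavelet.

```python
import numpy as np

def morlet_phase(signal, fs, freq, n_cycles=7.0):
    """Instantaneous phase (radians) of `signal` around `freq` Hz.

    Hypothetical helper: convolves the signal with a complex Morlet
    wavelet and returns the angle of the complex-valued result.
    """
    # Temporal standard deviation of the Gaussian envelope, in seconds.
    sigma_t = n_cycles / (2.0 * np.pi * freq)
    # Symmetric, odd-length time axis truncated at +/- 4 standard deviations
    # (an assumption of this sketch).
    half = int(np.ceil(4.0 * sigma_t * fs))
    t = np.arange(-half, half + 1) / fs
    # Complex exponential at the target frequency under a Gaussian envelope.
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    # Amplitude normalization; it does not change the extracted phase.
    wavelet /= np.abs(wavelet).sum()
    # mode="same" keeps the output aligned with the input samples.
    filtered = np.convolve(signal, wavelet, mode="same")
    return np.angle(filtered)

# Usage: phase of a 10 Hz oscillation sampled at 250 Hz for 2 s.
fs = 250.0
t = np.arange(0.0, 2.0, 1.0 / fs)
x = np.cos(2.0 * np.pi * 10.0 * t + 0.5)
phase = morlet_phase(x, fs, freq=10.0)
```

Samples within half a wavelet length of either end of the signal are affected by edge effects and are usually discarded before any further analysis.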

Phase TE has found multiple applications in neuroscience, such as gaining insight into reduced levels of consciousness by evaluating brain connectivity [17], analyzing resting-state networks [18], and assessing brain connectivity changes in children diagnosed with attention deficit hyperactivity disorder following neurofeedback training [19]. It has even been used to detect fluctuations in financial market data [20]. Nonetheless, phase TE, estimated as in [8], cannot be employed as a characterization strategy for brain–computer interfaces (BCI), since BCI systems require features extracted on an independent trial basis, i.e., each trial must be associated with a set of features. Effective connectivity measures, such as phase TE, can be used to assess the induced physiological variations in the brain occurring during BCI tasks [21]. Discriminative information may be hidden in the dynamical interactions among spatially separated brain regions that characterization methods commonly employed in BCI are not able to detect [22]. This information could be relevant to address issues such as the inefficiency problem in some BCI systems [23]. In that context, the authors in [6] applied a binning strategy to estimate single-trial phase TE to set up classification systems for visual attention. Nonetheless, binning estimators for single-trial estimation of information-theoretic measures exhibit systematic bias [8]. Furthermore, spectrally resolved TE estimation methods that can obtain single-trial TE estimates have been recently proposed in the literature [24,25]. Yet, phase TE is conceptually different from them [25], as they are not phase-specific metrics.

Here, we propose a novel methodology to estimate TE between single pairs of instantaneous phase time series. Our approach combines the kernel-based TE estimator we introduced in [10] with phase time series obtained by convolving neural signals with a Morlet wavelet. The kernel-based TE estimator expresses TE as a linear combination of Rényi's entropy measures of order *α* [26,27] and then approximates them through functionals defined on positive definite and infinitely divisible kernel matrices [28]. Its most important property is that it sidesteps the need to obtain the probability distributions underlying the data. Instead, the estimator computes TE directly from kernel matrices that, in turn, capture the similarity relations among data. It is robust to varying noise levels and data sizes and to the presence of multiple interaction delays in a network [10]. In this work, we hypothesize that the above-described estimator could overcome the hurdles other single-trial TE estimators face when obtaining TE values from instantaneous phase time series, since it would not have to explicitly obtain probability distributions from circular variables [8]. Additionally, since our primary motivation to introduce a robust phase TE estimation methodology is the use of such measures in the context of BCI applications, we also explore a relevance analysis strategy based on centered kernel alignment (CKA) [29]. The CKA-based analysis allows us to identify the set of pairwise channel connectivities relevant to discriminate between specific conditions, favoring the neurophysiological interpretation of our results and providing an option to avoid carrying out all-to-all channel connectivity estimations in practical BCI systems based on phase TE.
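To make the idea of computing entropies directly from kernel matrices more concrete, the sketch below illustrates a matrix-based Rényi *α*-entropy functional, $S_\alpha(A) = \frac{1}{1-\alpha}\log_2 \operatorname{tr}(A^\alpha)$, evaluated on a trace-normalized Gram matrix $A$, with joint entropies taken from Hadamard products of Gram matrices. It then combines four such entropies into a TE value. This is a simplified illustration under stated assumptions and may differ from the estimator in [10]: the function names, the Gaussian kernel with bandwidth `sigma`, the default *α* = 2, and the use of one-sample histories (embedding dimension 1, lag 1) are all choices made for this example only. In particular, the Gaussian kernel is applied here to raw values and does not account for the circular nature of phase.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix of the samples in `x` (one row per sample)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma**2))

def renyi_entropy(K, alpha=2.0):
    """Matrix-based Renyi entropy of order `alpha` (alpha != 1), in bits."""
    A = K / np.trace(K)                          # normalize to unit trace
    eigvals = np.linalg.eigvalsh(A)              # A is symmetric
    eigvals = np.clip(eigvals, 0.0, None)        # remove tiny negative values
    return np.log2(np.sum(eigvals**alpha)) / (1.0 - alpha)

def joint_renyi_entropy(grams, alpha=2.0):
    """Joint entropy from the Hadamard (element-wise) product of Gram matrices."""
    H = np.ones_like(grams[0])
    for K in grams:
        H = H * K
    return renyi_entropy(H, alpha)

def kernel_transfer_entropy(x, y, alpha=2.0, sigma=1.0):
    """TE(x -> y) as a linear combination of four Renyi entropies.

    Assumption of this sketch: the past of each series is its previous sample.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    K_y_now = gaussian_gram(y[1:], sigma)     # y_t
    K_y_past = gaussian_gram(y[:-1], sigma)   # y_{t-1}
    K_x_past = gaussian_gram(x[:-1], sigma)   # x_{t-1}
    return (joint_renyi_entropy([K_y_past, K_x_past], alpha)
            - joint_renyi_entropy([K_y_now, K_y_past, K_x_past], alpha)
            + joint_renyi_entropy([K_y_now, K_y_past], alpha)
            - renyi_entropy(K_y_past, alpha))

# Usage on two synthetic series where y follows x with a one-sample delay.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.roll(x, 1) + 0.1 * rng.standard_normal(200)
te_xy = kernel_transfer_entropy(x, y)
te_yx = kernel_transfer_entropy(y, x)
```

No probability density is estimated at any point: every quantity is a function of the eigenvalues of a Gram matrix, which is the property the paragraph above highlights.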

We employ simulated and real-world EEG data to test the introduced effective connectivity measure. The simulated data are obtained from neural mass models, mathematical models of neural mechanisms that generate time series with oscillatory behavior similar to electrophysiological signals. The results obtained for these data show that the proposed kernel-based phase TE estimation method successfully detects the direction of interaction imposed by the model. Indeed, it detects statistically significant connections in the frequency bands of interest, even for weak couplings and narrowband bidirectional interactions. It also displays robustness to realistic levels of noise and signal mixing. Regarding the EEG data, we consider two databases containing signals recorded under two different cognitive paradigms, consisting of motor imagery tasks and a change detection task designed to study working memory. The classification results show that our approach is competitive with real-valued and phase-based directed connectivity measures. Thus, this proposal extends the approach described in [10] by introducing a measure that captures directed interactions between the phases of oscillations at specific frequencies. Unlike alternative approaches in the literature, it can be obtained from single-trial data, which allows it to be used as a characterization strategy in BCI applications. In addition, the results obtained for the EEG data show that our approach, coupled with the CKA-based relevance analysis, largely outperforms the real-valued kernel-based transfer entropy in [10] as a characterization strategy for cognitive tasks such as working memory.

The remainder of the paper is organized as follows: in Section 2 we formally introduce the concept of phase TE and our kernel-based approach for single-trial phase TE estimation. We also describe the proposed CKA-based relevance analysis. Section 3 details the experiments we carried out using simulated and real EEG data in order to evaluate the performance of our proposal. In Section 4 we present and discuss our results, and finally, Section 5 contains our conclusions.
