**1. Introduction**

A Brain-Computer Interface (BCI) is a method of communication between a user and a system in which the intention of the subject is translated into a control signal by classifying the specific pattern that is characteristic of the imagined task, for example, the movement of the hand and/or foot [1]. The most widely used technique to register electrical activity for BCI applications is electroencephalography (EEG), which is non-invasive and low-cost. The recording is done by placing electrodes on the scalp according to the 10–20 system [2]; the electrodes record electrical impulses associated with neuronal activity in the brain cortex. A BCI can be based on exogenous potentials, such as the event-related P300 potential and Visual Evoked Potentials (VEPs), or on endogenous potentials, such as those produced by Motor Imagery (MI), which is widely used in BCI applications and is the dynamic state in which a subject evokes a movement or gesture. The event-related phenomena represent frequency-specific changes in the ongoing EEG activity and may consist, in general terms, of either decreases or increases of power in given frequency bands [1]. Most of the brain activity is concentrated in electrophysiological bands called delta *δ* (0.5–4 Hz), theta *θ* (4–7.5 Hz), alpha *α* (8–13 Hz), and beta *β* (14–26 Hz) [2]. Another important rhythm for BCI applications is the *μ* or sensorimotor rhythm, which occupies the same frequency band as *α* but is located in the motor cortex instead of the visual cortex, where *α* is mainly generated [3]. Several works report the importance of *μ* frequencies for MI detection [3–7], including those published by Pfurtscheller et al. [8–12], which demonstrate the changes of EEG activity in the *μ* and *β* rhythms caused by voluntary movements.
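To make the idea of band-specific power changes concrete, the following is a minimal sketch that measures the relative power change in a band between a rest segment and a task segment. The band edges are the ones listed above; the function names, the 250 Hz sampling rate, the simple periodogram estimate, and the synthetic signals are assumptions made only for illustration and are not taken from the cited works.

```python
import numpy as np

# Band edges in Hz as listed in the text; mu shares the alpha range.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 7.5),
    "alpha": (8.0, 13.0),
    "mu": (8.0, 13.0),
    "beta": (14.0, 26.0),
}

def band_power(signal, fs, band):
    """Mean periodogram power of `signal` inside `band` = (low, high) in Hz."""
    signal = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal - signal.mean())) ** 2 / signal.size
    low, high = band
    mask = (freqs >= low) & (freqs <= high)
    return psd[mask].mean()

def relative_change(rest, task, fs, band):
    """Percentage power change from rest to task; negative means a decrease."""
    p_rest = band_power(rest, fs, band)
    p_task = band_power(task, fs, band)
    return 100.0 * (p_task - p_rest) / p_rest

# Hypothetical synthetic data: a 10 Hz rhythm that is attenuated during the task.
fs = 250.0  # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
rest = 2.0 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
task = 0.5 * np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)
mu_change = relative_change(rest, task, fs, BANDS["mu"])
```

Here `mu_change` is strongly negative because the 10 Hz amplitude drops during the simulated task, which is the kind of power decrease in a given band that the text refers to.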

An endogenous MI-based BCI system does not require external stimuli, hence it is more acceptable to users [4]. Nonetheless, MI depends on the user's ability to control the electrophysiological activity, which makes feature extraction and classification for MI-based BCI systems more difficult than for exogenous responses. Two major limitations of EEG recordings are the low signal-to-noise ratio and the fact that the signals picked up at the electrodes are a mixture of sources that cannot be observed directly by non-invasive methods. Therefore, for endogenous BCI approaches, a preprocessing step is required to identify the independent sources of the mixtures observed at the electrodes. A well-known preprocessing method, Common Spatial Patterns (CSP), decomposes multi-channel EEG data into spatial patterns that are calculated from two classes of MI [13,14]. CSP is a supervised method: class information must be available a priori, and its effectiveness relies on subject-specific frequency bands [7,15].
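The supervised nature of CSP can be seen in a short sketch: the filters are computed from the covariance matrices of two labelled sets of trials. The version below is one common textbook formulation (whitening the composite covariance, then diagonalizing one class), written with NumPy only; it is not necessarily the variant used in [13,14]. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def mean_covariance(trials):
    """Average trace-normalized spatial covariance.

    trials: array of shape (n_trials, n_channels, n_samples).
    """
    covs = []
    for x in trials:
        x = x - x.mean(axis=1, keepdims=True)  # remove each channel's mean
        c = x @ x.T
        covs.append(c / np.trace(c))
    return np.mean(covs, axis=0)

def csp_filters(trials_a, trials_b):
    """Return (filters, eigenvalues); each row of `filters` is a spatial filter."""
    c_a = mean_covariance(trials_a)
    c_b = mean_covariance(trials_b)
    # Whiten the composite covariance c_a + c_b (assumed full rank).
    evals, evecs = np.linalg.eigh(c_a + c_b)
    whitening = np.diag(evals ** -0.5) @ evecs.T
    # Diagonalize the whitened class-A covariance; eigenvalues lie in [0, 1].
    d, b = np.linalg.eigh(whitening @ c_a @ whitening.T)
    order = np.argsort(d)[::-1]  # largest class-A variance first
    return b[:, order].T @ whitening, d[order]

# Hypothetical synthetic data: 4 channels mixing 4 sources.
rng = np.random.default_rng(0)
mixing = rng.standard_normal((4, 4))

def make_trials(source_scales, n_trials=20, n_samples=200):
    scales = np.asarray(source_scales, dtype=float)[None, :, None]
    sources = rng.standard_normal((n_trials, 4, n_samples)) * scales
    return mixing @ sources  # shape (n_trials, 4, n_samples)

trials_a = make_trials([3, 1, 1, 1])  # class A: source 0 is strong
trials_b = make_trials([1, 1, 1, 3])  # class B: source 3 is strong
filters, eigvals = csp_filters(trials_a, trials_b)

# Variance of the first filter's output in each class.
var_a = np.var(filters[0] @ trials_a, axis=-1).mean()
var_b = np.var(filters[0] @ trials_b, axis=-1).mean()
```

The first filter yields a larger output variance for class A than for class B, and the last filter does the opposite. Note that `csp_filters` cannot be called without knowing which trials belong to which class, which is the a priori class information the text mentions.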

As an unsupervised alternative for estimating independent sources, Blind Source Separation (BSS) algorithms have been incorporated into EEG preprocessing, mainly in medical applications to improve disease diagnosis [16]. BSS algorithms estimate the sources from the mixed observations using statistical information. It has been shown that BSS is especially suitable for removing a wide variety of artifacts in EEG recordings [17] and for separating *μ* rhythms generated in both brain hemispheres [18]. Therefore, BSS is a useful method for constructing spatial filters for preprocessing raw multi-channel EEG data in BCI research [15].

Due to its unsupervised and statistical nature, BSS does not require a priori information about MI classes or specific frequency bands, which is an advantage over CSP approaches. Nonetheless, an inherent disadvantage of BSS algorithms is that the order of the estimated sources is not preserved from one processed trial to the next. This limits their direct application in the classifier stages used in BCI, where the order of the input vectors must be conserved to avoid loss of the adjustment parameters for each new data entry. Some automated BSS approaches based on statistical concepts have been proposed to discern between sources of interest and artifacts, and thus minimize this inconvenience [19–21].
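The ordering problem follows from the linear mixing model commonly assumed in BSS. As a sketch, with notation that is an assumption and not taken from the text, let $\mathbf{x}(t)$ be the electrode signals, $\mathbf{A}$ the unknown mixing matrix, $\mathbf{s}(t)$ the sources, $\mathbf{P}$ any permutation matrix, and $\mathbf{D}$ any invertible diagonal matrix:

$$
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) = \left(\mathbf{A}\,\mathbf{D}^{-1}\mathbf{P}^{-1}\right)\left(\mathbf{P}\,\mathbf{D}\,\mathbf{s}(t)\right)
$$

The reordered and rescaled sources $\mathbf{P}\,\mathbf{D}\,\mathbf{s}(t)$ explain the observations equally well, so statistical independence alone cannot fix the order in which the sources are returned, and two trials may yield the same sources in different positions.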

In the classification stage, the most widely used approaches are Linear Discriminant Analysis (LDA) [22], Support Vector Machine (SVM) [23], Multilayer Perceptron [24], and Bayesian classifier [25]. A recent approach that has given excellent results, mainly in computer vision, is deep learning [26]. However, deep learning techniques have not been widely used for EEG-BCI applications, due to factors such as noise, the correlation between channels, and the high dimensionality of EEG data [27]. Several works have used deep learning for MI classification [27–35]. However, for the MI-based BCI paradigm the datasets are small because of the fatigue to which participants are exposed in each session. Therefore, it has been difficult to use deep learning for this purpose [32].

In this research, the fastICA BSS algorithm is used to obtain estimated independent components. A typical spectral profile of Movement Related Independent Components (MRIC), with significant components in the *μ* and *β* frequencies, is used to sort the components in each processed trial, thus ensuring that the sources estimated to be the most active in MI frequencies remain at the beginning of the array, while the artifacts are placed in the final positions. For each estimated source, the Continuous Wavelet Transform (CWT) is calculated for a given time window, generating an image containing temporal, frequency, and spatial information. This process is carried out for all trials, forming a set of images to train and test a Convolutional Neural Network (CNN).
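The sorting and CWT steps described above can be sketched as follows, assuming the sources of one trial have already been estimated (for example, by a fastICA implementation). The sorting score used here, the fraction of spectral power in the *μ* (8–13 Hz) and *β* (14–26 Hz) bands, is only a stand-in for the MRIC spectral-profile criterion, whose details are not given in this section. The complex Morlet wavelet, the `n_cycles` parameter, the 250 Hz sampling rate, the way maps are stacked into an image, and all function names are likewise assumptions.

```python
import numpy as np

def mi_band_fraction(source, fs, bands=((8.0, 13.0), (14.0, 26.0))):
    """Fraction of spectral power inside the mu and beta bands (stand-in score)."""
    freqs = np.fft.rfftfreq(source.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(source - source.mean())) ** 2
    in_band = np.zeros(freqs.shape, dtype=bool)
    for low, high in bands:
        in_band |= (freqs >= low) & (freqs <= high)
    return psd[in_band].sum() / psd.sum()

def sort_sources(sources, fs):
    """Order sources so the most active in mu/beta come first."""
    scores = np.array([mi_band_fraction(s, fs) for s in sources])
    order = np.argsort(scores)[::-1]
    return sources[order], order

def morlet_cwt(signal, fs, freqs, n_cycles=6.0):
    """Magnitude of a complex Morlet transform, shape (len(freqs), len(signal))."""
    signal = np.asarray(signal, dtype=float)
    out = np.empty((len(freqs), signal.size))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2.0 * np.pi * f)  # width of the Gaussian envelope (s)
        half = int(np.ceil(4.0 * sigma_t * fs))
        tw = np.arange(-half, half + 1) / fs
        wavelet = np.exp(2j * np.pi * f * tw) * np.exp(-tw ** 2 / (2.0 * sigma_t ** 2))
        if wavelet.size > signal.size:
            raise ValueError("time window is too short for this frequency")
        wavelet /= np.abs(wavelet).sum()  # same peak gain at every frequency
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

# Hypothetical estimated sources for one trial, deliberately out of order.
fs = 250.0  # assumed sampling rate in Hz
t = np.arange(0, 2.0, 1.0 / fs)
rng = np.random.default_rng(0)
sources = np.stack([
    np.sin(2 * np.pi * 1 * t) + 0.1 * rng.standard_normal(t.size),   # slow drift
    rng.standard_normal(t.size),                                     # broadband noise
    np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size),  # mu-like rhythm
])

sorted_sources, order = sort_sources(sources, fs)
freqs = np.arange(8.0, 31.0, 1.0)
# One time-frequency map per source, stacked along the first axis.
image = np.stack([morlet_cwt(s, fs, freqs) for s in sorted_sources])
```

After sorting, the 10 Hz source occupies the first position and the slow drift the last, so the same position in `image` carries comparable content from trial to trial. Each slice of `image` is a time-frequency map, and the index of the slice carries the source (spatial) information.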

A contribution of the present work is the use of BSS instead of the widely used CSP. Even though CSP has worked well for MI-based BCI, its spatial filters require prior information about the classes to be separated in order to maximize the differences between them. In contrast, BSS is an unsupervised approach that does not require prior information about the classes. The need for large training datasets is reduced by using the MRIC criterion to sort the estimated sources. The paper is structured as follows: Section 2 explains the background of BSS, CWT, and CNN. Section 3 describes the proposed methodology to obtain CWT maps from estimated sources, along with the details of the CNN architecture. The experimental results and discussion are presented in Section 4. Finally, conclusions and an overview of future work are presented in Section 5.
