1. Introduction
Over the years, our world has transformed into a digital community in which each individual holds a unique digital identifier [1]. Many such identifiers exist, such as passwords and identification cards. However, these identifiers can easily be circumvented, stolen, or forgotten [2]. Personal characteristics or behaviors can therefore be used to strengthen identification applications. Such techniques, known as biometrics, draw on several kinds of personal information to build more robust identification systems, such as face and voice recognition, fingerprint information, and iris data, among others [3].
On the other hand, the widespread deployment of biometric systems has led to a new challenge known as “spoofing” [1,4]. Such an attack is classified as the most dangerous in security systems since it is designed to break the security of the biometric system, thus allowing unauthorized persons to gain access to it [2].
In real life, there have already been several spoofing attacks on biometric systems, such as face spoofing (printed photos and 3D mask attacks [5,6]), fake fingerprints (gummy fingers), finger–vein systems fooled with a piece of paper [7], iris recognition systems fooled by an eyeball placed in front of the iris scanner, and voice recognition systems fooled by replaying a voice recording to the recognition system [7]. Therefore, there is growing interest in biometric authentication systems that grant access based on characteristics that are not externally visible, which makes them harder for an external threat to attack. In this context, one option is user authentication based on brain signals, which can be captured by the well-known electroencephalogram (EEG) exam [8].
The EEG is a clinical test that places electrodes on the person’s scalp to detect the brain’s electrical activity, which is then recorded for visualization purposes. Such information reflects the voltage fluctuations inside the brain that arise from ionic flows associated with the neurons’ activity [2]. Approaches to capture electrical brain signals can be categorized as invasive and non-invasive [9], where the former require surgery to embed electrodes in the brain. The electrocorticography brain–computer interface (ECoG BCI) is an example, which is usually intended for recording the movements of the arm [2]. Other signal types are used in the non-invasive approaches, such as functional magnetic resonance imaging and magnetoencephalography.
Many studies have addressed identification issues in biometric applications. For instance, Jayarathne [10] gathered EEG signals from 21 test subjects to verify their identity. The authors employed the EMOTIV EEG Headset with 14 channels. Common Spatial Patterns were used for feature extraction and Linear Discriminant Analysis for classification. The proposed approach achieved a recognition rate, which motivated the authors to claim that EEG signals might be an excellent approach to replace PINs when accessing ATMs. However, selecting the relevant channels that produce the optimal subset of EEG features is of prime importance for (i) reducing computational complexity, (ii) reducing over-fitting, and (iii) eliminating inconveniences during clinical application [11].
Table 1 presents studies relevant to EEG channel selection.
According to Table 1, many studies have worked on the channel selection problem with different methodologies such as Common Spatial Patterns, optimization, Pearson correlation coefficient, and additional connectivity metrics. Most existing works were implemented based on the data extracted from 64 EEG channels. Furthermore, most works reduced the number of channels and presented significant classification accuracy. In contrast, refs. [10,11] reported good classification performances, but the number of selected channels is still high. On the other hand, some studies have reduced channels up to , but with a moderate classification rate.
Recently, several researchers proposed the use of optimization approaches to solve challenges with non-stationary signals [1,6,12,13]. In addition, EEG-based user identification with supervised classification and optimization methods has shown significant improvements compared to traditional techniques [2,14,15].
Signal acquisition, which is performed by placing electrodes on a person’s head, is one of the significant problems in EEG-based user identification [19,20,21]. The process is usually uncomfortable, and it requires good knowledge to place the sensors correctly. Two questions must therefore be considered: “Is it essential to place all these electrodes on a person’s head?” and “If not, can we detect the most significant ones for user identification and then use fewer electrodes?”.
The above questions led our work to model the EEG channel selection as an optimization problem. The Flower Pollination Algorithm (FPA) is a robust optimization method and has been successfully applied to many real-world problems [22]. Although FPA has proved very successful in finding optimal solutions to many problems, it suffers, like other metaheuristic algorithms, from the inability to generate new solutions when it is stuck in local minima [23]. In [1], the authors tested several meta-heuristic algorithms for EEG channel selection, with FPA achieving the most accurate results. However, it still has some problems, such as being stuck in local minima. For this reason, we propose to hybridize FPA with the local search optimizer β-hill climbing (β-hc) [24].
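To make the local search component more concrete, the following is a minimal sketch of β-hill climbing on a binary channel mask (1 = channel kept, 0 = channel dropped). It is not the authors' implementation: the binary adaptation of the two operators, the parameter values, and the names `beta_hill_climbing`, `toy_fitness`, and `TARGET` are assumptions made for illustration, and the toy fitness merely stands in for a classifier-based recognition rate.

```python
import random

# Hypothetical "ideal" mask, used only to give the toy fitness something to find.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_fitness(mask):
    """Toy stand-in for a recognition rate: number of bits matching TARGET."""
    return sum(1 for m, t in zip(mask, TARGET) if m == t)


def beta_hill_climbing(fitness, n_channels, beta=0.05, iterations=200, seed=0):
    """Maximize `fitness` over binary masks with a beta-hill-climbing style search."""
    rng = random.Random(seed)
    current = [rng.randint(0, 1) for _ in range(n_channels)]
    current_fit = fitness(current)
    for _ in range(iterations):
        candidate = current[:]
        # N-operator (assumed binary form): move to a neighbor by flipping one bit.
        i = rng.randrange(n_channels)
        candidate[i] = 1 - candidate[i]
        # Beta-operator: with probability beta, reset each bit to a random value,
        # which gives the search a chance to leave a local optimum.
        for j in range(n_channels):
            if rng.random() < beta:
                candidate[j] = rng.randint(0, 1)
        candidate_fit = fitness(candidate)
        # Greedy acceptance: keep the candidate only if it is at least as good.
        if candidate_fit >= current_fit:
            current, current_fit = candidate, candidate_fit
    return current, current_fit


mask, score = beta_hill_climbing(toy_fitness, n_channels=len(TARGET), seed=1)
print(mask, score)
```

The random resets of the β-operator supply the exploration that plain hill climbing lacks, which is the property the hybridization relies on to avoid stagnation in local minima.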
This work is one of the first to employ hybrid optimization methods with supervised classification methods for biometric user identification using EEG. The main point of hybridizing two approaches is to combine their advantages and offset their shortcomings. This work aims to identify the most critical EEG channels by proposing a hybrid approach composed of β-hc and FPA, named “FPAβ-hc”. We therefore expect to obtain more accurate results when applying optimization approaches to select optimal EEG channels. The main contributions of this work are summarized as follows:
To evaluate the proposed FPAβ-hc for EEG-based user identification. Such a hybrid approach aims to improve local pollination in FPA to avoid being stuck in local minima.
To perform an extensive study to select the most suitable classifier to guide the optimization process using FPAβ-hc. Our experiments showed that the Support Vector Machine with a Radial Basis Function kernel (SVM-RBF) obtained the most effective results, thus being the preferred approach in this work.
The remainder of this article is organized as follows: Section 1 presents the main concepts regarding EEG signals, as well as related works about EEG-based identification. The proposed method is detailed in Section 2. The results are discussed in Section 3, the discussion is provided in Section 4, and the conclusions and future works are set out in Section 5.
4. Discussions
As mentioned earlier, the primary purpose of this study is to evaluate the proposed FPAβ-hc-SVM for EEG-based user identification. In this work, we modeled the channel selection task as an optimization problem and introduced the SVM classifier for EEG-based biometric user identification. The evaluated methods achieved similar accuracy rates using SVM across the three different autoregressive coefficients and the wavelet features, with an advantage for FPAβ-hc-SVM over the standard FPA and β-hc optimizers.
Concerning the number of selected channels, FPAβ-hc-SVM succeeded in removing nearly half of the electrodes. The proposed algorithm reduced the total number of electrodes from 64 to 34, 36, 35, and 39 for , , , and , respectively. Moreover, different coefficients provide different accuracy rates in the EEG-based person identification task. The proposed approach obtained of accuracy using only 35 sensors and with features.
Another interesting feature of FPAβ-hc-SVM concerns the location of the selected electrodes. The most commonly selected sensors are located on the frontal, occipital, and parietal lobes, although the selected sensors are also spread across the head. This suggests that FPAβ-hc-SVM tended to select channels that are not too close to each other, so as to obtain relevant details from all over the human brain.
Table 6 shows a comparison of the proposed method against some state-of-the-art techniques. It is worth noting that FPAβ-hc-SVM achieved the highest accuracy rate among the compared methods. However, the number of selected channels still needs to be reduced further, for example by applying multi-objective optimization.
Compared with the previous work, the proposed approach achieved more accurate results using the same feature extraction method (AR). For the previous work, the accuracy rate was , with 32 channels selected. Here, the proposed method achieved an accuracy rate of using 34 channels for the AR features. Overall, the proposed method achieved the best accuracy results () with only 35 channels, whereas the previous work achieved its best results with an accuracy of 96 with the same number of selected channels (i.e., 35 channels).
5. Conclusions and Future Works
In this work, we proposed a hybrid approach composed of the Flower Pollination Algorithm and the β-hill climbing algorithm (FPAβ-hc-SVM) to address the challenge of channel selection in EEG-based biometric person identification. The hybridization of FPA with the β-hc algorithm has been designed to improve the local pollination part of the FPA to overcome local minima. It is worth mentioning that another version of hybrid FPA was also introduced in [2]. The main differences between the proposed approach and the previous one are: (i) the hybridization in [2] is used to enhance the quality of the best-achieved solution, whereas here it is applied to the local pollination solutions; (ii) the feature extraction techniques used in [2] were time domain, frequency domain, and time-frequency domain, whereas the features used here are wavelet features and auto-regressive features; (iii) in [2], 10-fold cross-validation is used for the training–testing stage, while in this study, a training–validation–testing split is used.
The primary purpose of this work is to demonstrate that not all electrodes are needed to achieve a high accuracy rate. To this end, this paper models channel selection as an optimization problem. The recognition rate that a channel subset achieves over a validation set is employed as the fitness function, and the subset that maximizes it is sought.
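The fitness evaluation described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the pipeline used in the experiments: a pure-Python 1-nearest-neighbor rule stands in for the SVM-RBF classifier so that the example has no dependencies, and the data layout (one small feature vector per channel) and all names (`select_channels`, `channel_subset_fitness`, `train`, `val`) are hypothetical.

```python
def select_channels(sample, mask):
    """Concatenate the feature vectors of the channels whose mask bit is 1."""
    return [value for channel, keep in zip(sample, mask) if keep for value in channel]


def nearest_neighbor_predict(train_x, train_y, x):
    """Stand-in classifier (assumption): label of the closest training sample."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    best = min(range(len(train_x)), key=lambda i: sq_dist(train_x[i], x))
    return train_y[best]


def channel_subset_fitness(mask, train, val, predict=nearest_neighbor_predict):
    """Recognition rate on the validation set using only the selected channels."""
    if not any(mask):
        return 0.0  # an empty channel subset cannot identify anyone
    train_x = [select_channels(sample, mask) for sample, _ in train]
    train_y = [label for _, label in train]
    correct = sum(
        predict(train_x, train_y, select_channels(sample, mask)) == label
        for sample, label in val
    )
    return correct / len(val)


# Toy data: 2 subjects, 3 channels, 2 features per channel (all values invented).
# Channel 0 separates the subjects, channel 1 is uninformative, channel 2 misleads.
train = [
    ([[0.0, 0.1], [5.0, 5.0], [1.0, 9.0]], "A"),
    ([[1.0, 0.9], [5.1, 4.9], [9.0, 1.0]], "B"),
]
val = [
    ([[0.1, 0.0], [5.0, 5.1], [9.0, 1.0]], "A"),
    ([[0.9, 1.0], [5.1, 5.0], [1.0, 9.0]], "B"),
]

print(channel_subset_fitness([1, 0, 0], train, val))  # informative channel only
print(channel_subset_fitness([1, 1, 1], train, val))  # all channels
```

In this toy setting the single informative channel scores higher than the full channel set, which mirrors the motivation for searching over subsets instead of always using every electrode.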
The proposed approach (FPAβ-hc-SVM) is tested using a standard EEG dataset with 64 EEG channels recorded from 109 individuals. The performance of the proposed method is evaluated using five criteria: (i) Accuracy, (ii) F-Score, (iii) Recall, (iv) Specificity, and (v) the number of channels selected. FPAβ-hc-SVM was tested using two different feature extraction methods, i.e., wavelet features () and auto-regressive models with three different coefficients (i.e., AR, AR, and AR). The experimental results showed that the proposed method outperformed both standard FPA and the methods proposed in [1,2,26,27,28]. It is worth noting that, while retaining high accuracy rates, the number of sensors has been reduced by about half. Additionally, the results displayed a positive correlation between the number of features obtained from the EEG signal and the accuracy rate. Such a finding suggests that the proposed approach can remove redundant and undesirable features while retaining the relevant ones.
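For reference, the four classification criteria listed above can be computed from confusion-matrix counts as in the sketch below. It uses the standard binary definitions; how the counts are aggregated over the 109 subjects is not stated here, so the function name `classification_metrics` and the example counts are illustrative assumptions only.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary metrics from true/false positive/negative counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Recall (sensitivity): share of actual positives that were detected.
        "recall": tp / (tp + fn),
        # Specificity: share of actual negatives that were correctly rejected.
        "specificity": tn / (tn + fp),
        # F-score (F1): harmonic mean of precision and recall.
        "f_score": 2 * tp / (2 * tp + fp + fn),
    }


# Invented counts, chosen so that every metric equals 0.8.
metrics = classification_metrics(tp=40, fp=10, tn=40, fn=10)
print(metrics)
```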
On the other hand, the current version of the proposed algorithm has some limitations, which are as follows:
The proposed algorithm was tested by splitting the EEG dataset into three subsets, i.e., training, validation, and test sets. This approach may lead to overfitting. We recommend evaluating FPAβ-hc-SVM with k-fold cross-validation instead.
The FPAβ-hc-SVM technique was tested using wavelet features and auto-regressive models only. Future work should test the proposed method with different features. In addition, we recommend investigating the use of a multi-objective approach.
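As an illustration of the k-fold cross-validation recommended in the first limitation, the sketch below splits sample indices into k contiguous folds and averages a score over them. It is a minimal, dependency-free outline under assumptions: the names `k_fold_indices` and `cross_validated_score` are hypothetical, no shuffling or per-subject stratification is performed, and `evaluate` is a placeholder for training and scoring a classifier on the given indices.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Distribute the remainder so fold sizes differ by at most one sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_score(evaluate, n_samples, k):
    """Average evaluate(train_indices, test_indices) over the k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


# Placeholder evaluation (assumption): returns the held-out fraction of the data.
held_out_fraction = lambda train, test: len(test) / (len(train) + len(test))
print(cross_validated_score(held_out_fraction, n_samples=10, k=5))
```

Every sample is used for testing exactly once, so the averaged score depends less on one particular training, validation, and test split.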