Article

EEG Topography Amplification Using FastGAN-ASP Method

School of Cryptography Engineering, Information Engineering University, Zhengzhou 450001, China
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(24), 4944; https://doi.org/10.3390/electronics12244944
Submission received: 26 October 2023 / Revised: 2 December 2023 / Accepted: 7 December 2023 / Published: 8 December 2023
(This article belongs to the Special Issue AI Security and Safety)

Abstract

Electroencephalogram (EEG) signals are bioelectrical activities generated by the central nervous system. As a unique information factor, they are correlated with the genetic information of the subjects and are robust against forgery. Biometric identity recognition based on EEG signals has significantly improved the security and accuracy of biometric recognition. However, EEG signals obtained from incompatible acquisition devices have low universality and are prone to noise, making them challenging to use directly in practical identity recognition scenarios. Employing deep learning models for data augmentation can address the issue of data scarcity, yet the time–frequency–space characteristics of EEG signals make it difficult for such models to extract features and generate data efficiently. To tackle these challenges, this paper proposes a data generation method based on channel attention normalization and a spatial pyramid in a generative adversarial network (FastGAN-ASP). The method introduces attention mechanisms in both the generator and the discriminator to locate crucial feature information, enhancing the training performance of the generative model for EEG data augmentation. The EEG data used here are preprocessed EEG topographic maps, which effectively represent the spatial characteristics of EEG data. Experiments were conducted on the BCI Competition IV-Ⅰ and BCI Competition IV-2b standard datasets. Quantitative and usability evaluations were performed using the Fréchet inception distance (FID) metric and a ResNet-18 classification network, validating the quality and usability of the generated data from both theoretical and applied perspectives. The FID metric confirms that FastGAN-ASP outperforms FastGAN, WGAN-GP, and WGAN-GP-ASP. Moreover, classification on the datasets augmented with this method achieved accuracies of 95.47% and 92.43%, respectively.

1. Introduction

With the rapid advancement of technology, the analysis of biological features plays an increasingly important role in identity authentication. Traditional biometric features such as the face [1], fingerprints, voiceprints, and gait are susceptible to tampering and forgery and, once compromised, cannot be revoked [2]. The electroencephalogram (EEG), a novel biometric feature directly generated by the nervous system and associated with cortical brain activity, possesses unique neural pathway patterns. From a physiological perspective, it is difficult to forge, and it can be revoked at any time through changes in the individual's behavior and mental state. Therefore, utilizing EEG for identity authentication, where individual differences serve as a unique identifier, enhances the accuracy and security of biometric authentication.
Currently, there are two main ways to collect electroencephalogram (EEG) signals: invasive and noninvasive [3]. Invasive methods include intracranial recordings during surgery and chronically implanted electrodes in animal models, among others; however, they involve physical trauma and ethical issues. Noninvasive methods, such as electrode caps, encounter environmental and physiological noise during measurement, termed “artifacts”, which can have amplitudes similar to or larger than the underlying EEG signals [4]. Despite efforts to reduce noise using various filtering and denoising techniques, the collected raw EEG signals still suffer from a low signal-to-noise ratio (SNR) [5], making them unsuitable for direct application in EEG recognition scenarios. Furthermore, when utilizing EEG signals for identity authentication, the processed signals, which involve decoding and feature extraction from multidimensional temporal data with time–frequency–space characteristics, differ significantly from the original signals. The brain atlas (BA) is a computational representation of EEG signals that expresses the brain’s electrophysiological information using imaging techniques [6]. It uses a seven-color spectrum to describe the spatial distribution of delta, theta, alpha, and beta waves in the brain’s electrical activity, replacing the waveform curves of the original EEG signals with colored planar images that vividly represent the time–frequency–space characteristics of EEG data.
Data augmentation is an alternative approach that alleviates the problem of data scarcity by artificially synthesizing new samples from existing ones. The emergence of generative adversarial networks (GANs) [7] has introduced adversarial learning concepts to address data insufficiency. However, EEG data differs from regular data as it represents brain responses formed by the fusion of temporal, spectral, and spatial features. When augmenting EEG data, it is essential to consider these time–frequency–space characteristics and select appropriate data formats for augmentation purposes.
Therefore, this paper proposes an attention-based generative adversarial network for augmenting preprocessed EEG topographic maps to address the issue of insufficient EEG data in identity recognition scenarios. The main contributions are as follows:
(1)
Integrating channel attention normalization and spatial pyramid structures into the GAN, enabling data generation based on a small sample size.
(2)
Using ‘color images’ as a replacement for original waveform EEG signals as the model input, allowing the dataset to visualize the fusion of time–frequency–space features.
(3)
Conducting comparative experiments on the BCI Competition IV-Ⅰ and BCI Competition IV-2b datasets. The proposed model’s ability to generate high-quality EEG topographic maps is quantitatively evaluated using Fréchet inception distance (FID) [8]. Furthermore, classification experiments validate that the generated EEG image data from this paper can enhance classification recognition accuracy.

2. Related Work

2.1. Data Augmentation of EEG Signals

Brain–computer interface (BCI) connects the brain to the external world by recognizing brain activities and transforming them into information or commands [9]. Brainwave signals, as the intermediary between these two aspects, have demonstrated significant potential in classification and recognition tasks.
Varone et al. [10] proposed an ensemble method using five networks, namely VGG19, Xception, DenseNet201, MobileNet V2, and ResNet152V2, for probability fusion to perform classification tasks on two datasets, Mex and MI. The datasets consist of time series obtained through unsupervised methods and TF spectrogram images obtained through supervised methods. The ensemble method achieved an accuracy of 97.9%, significantly outperforming current methods. This confirms the utility of the proposed classification approach in such tasks and further validates the reliability of data augmentation in brainwave signals.
Dong et al. [11] proposed a RESIT interpolation method, which can quickly and effectively reconstruct the signals of bad channels accompanied by an ideal infinity reference. They validated the effectiveness of the generated EEG signals using this method through resting-state and event-related EEG datasets. Additionally, Huang et al. [12] applied the Hilbert–Huang transform (HHT) to EEG signals and improved the classification model in separable convolutional networks by incorporating displacement variables through bilinear interpolation. This optimization enhanced the model’s feature extraction capabilities, leading to higher classification accuracy. This study provides confirmation of the effectiveness of interpolation algorithms in enhancing and classifying EEG signals.
Lee et al. [13] proposed a technique called borderline-SMOTE, a minority oversampling method that synthesizes data for the minority class by generating synthetic instances using the m-nearest neighbors of minority class instances. These synthetic instances are then added to the actual data through weighted calculations. Evaluation results from EEG data collected from the P300 task indicate that this method enhances the robustness of decision boundaries, leading to improved classification accuracy on the P300 dataset.
Gubert et al. [14] suggested incorporating additional information about intra-electrode and inter-electrode correlations into the information matrix used by the traditional common spatial pattern (CSP) method. In experiments distinguishing binary (left and right) movements in two common motor imagery datasets, their approach demonstrated a 5% increase in overall classification accuracy compared to CSP-based methods, with individual subject classification improving by up to 31.7%.
Schwabedal et al. [15] proposed using the Fourier transform to augment EEG data. They randomized the Fourier transform phases within the [0, 2π] range and applied the inverse Fourier transform for data augmentation. Validated on the CAPSLPDB sleep database using a convolutional neural network (CNN), their method showed a 7% increase in classification accuracy.
Zhang et al. [16] applied empirical mode decomposition to decompose motor imagery EEG data. They combined intrinsic mode functions to form artificial EEG signals, transformed EEG signals into tensors using wavelet transform, and inputted the features into neural networks. Experimental results using both CNN and WNN (wavelet neural network) models showed accuracy rates of 77.9% and 88%, respectively.
So far, prior research on data augmentation of EEG signals has explored various methods such as altering data representations, SMOTE, interpolation algorithms, and decomposition/reconstruction. These methods have been validated through classification experiments. Due to the nonstationary nature and unique structure of EEG signals, enhancing the acquisition of temporal–spatial features in EEG signals is a challenging and crucial aspect of data augmentation. Generative models, particularly GAN, excel in capturing data distributions from real samples to generate synthetic samples. GANs have shown remarkable performance in medical image data augmentation, making them a focal point for investigating data augmentation in EEG signals.

2.2. Data Augmentation in Medical Imaging Using GAN

The emergence of medical imaging technologies such as X-ray computed tomography (CT), magnetic resonance imaging (MRI), and X-ray has made significant contributions to improving healthcare [17]. However, due to the unique nature of the data and constraints related to privacy, the bottleneck in its research lies in the expensive data collection and processing. The development of deep learning has revolutionized the generation of medical imaging data [18]. Generative adversarial networks, employing adversarial learning, have been applied to the augmentation of medical imaging data.
Ma et al. [19] proposed the structure and illumination constrained GAN (StillGAN), which introduces illumination regularization and structural loss. It treats low-quality and high-quality images as two different domains and achieves uniform illumination by training to transfer features from high-quality images to low-quality images. This process preserves fine structural details. Experimental results indicate that data generated by this model can enhance the performance of neural fiber segmentation, neural distortion grading, central fovea localization, and disease diagnosis.
Yao et al. [20] introduced the weighted feature transfer GAN (WFT-GAN) for medical image data augmentation. They established a weighted feature transfer channel in the model, preserving the impact of encoded features on image synthesis while reducing interference between different features. Additionally, they incorporated local perceptual adversarial loss to enhance the extraction and learning of local features. Experiments on three datasets demonstrate that this method can synthesize higher-quality medical images.
Dong et al. [21] proposed a supervised 3D GAN for generating CT images from MRI data. The inclusion of a loss function based on image gradient differences effectively alleviated the issue of image blurriness in the generated images. Additionally, an ACM strategy was employed to achieve GAN context awareness. Experimental results indicate that this method produces CT images that are accurate and robust in predicting MRI outcomes.
Costa et al. [22] utilized GAN to synthesize fundus images. They employed blood vessel segmentation techniques to pair real fundus images with their respective vascular trees. Subsequently, the model learned the mapping from binary vascular trees to new retinal images. The generation of new images was accomplished by sampling the appropriate parameter space. Experimental findings suggest that, despite sharing the same vascular tree, original and generated images exhibit visual differences in global appearance. Furthermore, quantitative analysis of the synthesized retinal images confirmed the retention of detailed features present in the real image dataset.
GAN has made significant strides in data augmentation for medical images. The electroencephalogram topography proposed in this paper can be considered as a type of medical image encompassing spatiotemporal features. Therefore, the central focus of this study is on applying GAN to enhance the data of electroencephalogram topography.

2.3. Enhancing EEG Signals Using Wasserstein GAN

The adversarial training of the original GAN minimizes the Jensen–Shannon (JS) divergence between real and generated data. However, the discontinuity of the JS divergence makes it difficult to provide useful gradients for optimizing the generator, which is one of the main reasons for the instability of training the original GAN. To address this issue, the Wasserstein GAN (WGAN) replaces the Jensen–Shannon divergence with the earth mover’s distance (EMD), also known as the Wasserstein-1 distance. The EMD is continuous and differentiable almost everywhere, thus offering meaningful gradients for generator optimization and ensuring the convergence of the GAN. Leveraging these advantages, the application of WGAN to augmenting EEG signals has garnered significant attention.
Zhang et al. [23] introduced label constraints to guide the generation process within WGAN and modified the gradient penalty to a centered gradient penalty. This modification led to improved classification accuracy in SEED dataset recognition.
Luo et al. [24] introduced a conditional Wasserstein GAN (CWGAN) framework within the original GAN model for EEG data augmentation to enhance emotion recognition in EEG signals. By inputting differential entropy (DE) forms of EEG signals into CWGAN, realistic-like EEG data were generated. The proposed CWGAN framework was evaluated on two public EEG datasets (SEED and DEAP) for emotion recognition. Experimental results demonstrated that using EEG data generated by CWGAN significantly improved the classification accuracy of emotion recognition models.
Panwar et al. [25] proposed a WGAN model with gradient penalty (WGAN-GP) for synthesizing EEG data in rapid serial visual presentation (RSVP) tasks. They validated the effectiveness of the generated data by classifying RSVP target events.
Aznan et al. [26] utilized WGAN to generate EEG data, optimizing the interaction efficiency in steady-state visually evoked potential (SSVEP) tasks. Subsequently, they fine-tuned a pretrained classifier using real-time EEG collected during offline phases, applied for robot control, and achieved real-time navigation. The results indicated a significant improvement in the accuracy of real-time navigation across multiple subjects.
From previous studies employing WGAN for enhancing EEG signals, it is evident that enabling models to adequately learn the spatiotemporal features within EEG signals poses a challenging task for data augmentation. In low-data scenarios, generative models often need to capture various features of input data for learning. Hence, constructing reasonable network structures to extract feature information while employing appropriate data representations to fully express the spatiotemporal features of the data is crucial.

3. Materials and Methods

The low signal-to-noise ratio and high redundancy of EEG signals can significantly reduce the accuracy of EEG classification and increase computational costs. This paper proposes FastGAN-ASP, which utilizes an attention mechanism to swiftly identify crucial features and their positions in images, thereby enhancing the training efficiency of the generative model. At the same time, the dataset adopts the EEG topographic map representation, which visualizes the EEG signal at each sampling point and moment in image form. Ultimately, EEG signals are transformed into “color images” that serve as inputs to the FastGAN-ASP model for training and generation. Figure 1 illustrates the model training framework for generating EEG topographic maps using FastGAN-ASP.
As shown in Figure 1, the waveform EEG signals are transformed into EEG topographic maps, corresponding to Section 3.1 in the paper. The obtained EEG topographic maps are used as initial input feature maps for training the FastGAN-ASP model. The model employs attention normalization and spatial pyramid to extract key features from the initial feature maps. Ultimately, the model outputs the generated EEG topographic maps, corresponding to Section 3.2 in the paper.

3.1. Transformation of EEG Signals into EEG Topographic Maps

The brainwave signals encompass multiple time series corresponding to measurements from electrodes placed at different spatial positions on the scalp. Aggregating the measurement data from these electrodes into feature vectors is a standard method for analyzing EEG signal data. However, due to the discrete nature of individual channels, spatial features cannot be extracted when extracting time–frequency domain features. The EEG topographic maps used in this study employ graph theory to represent the static structure or regional correlations of brain activity as images. In addition to spatial features reflecting brain region connectivity, the power spectral density is quantitatively mapped into color values within the EEG topographic maps. This method not only intuitively reflects the functional state activities of specific brain regions but also quantitatively describes connectivity network relationships between brain regions and their temporal variations.
Principles of EEG Topographic Map Formation:
(1)
Operational Principles:
Raw EEG signals are collected from electrodes at different positions on the scalp, filtered with a bandpass filter to extract components between 0.5 Hz and 30 Hz, and amplified using EEG signal amplifiers. Common EEG frequency bands include delta (0.5–3 Hz), theta (4–7 Hz), alpha (8–13 Hz), and beta (18–30 Hz).
After amplification, the EEG signals are converted from analog waveforms to digital signals by the computer’s analog-to-digital converter. The digital signals are segmented into sections of length N. For each segment, its x(n) sampling points are processed with the fast Fourier transform to compute the power at each frequency, followed by the average power spectrum (a brief numerical sketch of this step is given at the end of this subsection).
(2)
Imaging Principles:
Power spectrum data from the finite and discrete electrodes are used to estimate the power spectrum distribution across the entire brain cortex. Utilizing a two-dimensional interpolation method, the power spectral density matrix from known electrode positions infers the values of other points on the topographic map. Through the principle of equipotential effects, these values are quantified into corresponding color pixel values and displayed on an idealized top-down view of the brain.
The dimensions of the image represent frequency, spatial position, and amplitude. The obtained EEG topographic map serves as an input for the generative adversarial network model proposed in this paper. This transformation method can be applied to EEG signals of any paradigm, demonstrating its versatility.
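The following sketch illustrates the segmentation and FFT-based average power computation described in the operational principles above; the sampling rate, segment length, and band edges are illustrative assumptions rather than values taken from the datasets.

```python
import numpy as np

def band_power_per_electrode(eeg, fs=250, seg_len=256,
                             bands={"delta": (0.5, 3), "theta": (4, 7),
                                    "alpha": (8, 13), "beta": (18, 30)}):
    """Average power spectrum per electrode and frequency band.

    eeg: array of shape (n_channels, n_samples), already band-pass filtered.
    Returns a dict mapping band name -> vector of length n_channels.
    """
    n_ch, n_samp = eeg.shape
    n_seg = n_samp // seg_len
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    # Accumulate the power spectrum over all length-N segments (averaged periodogram).
    psd = np.zeros((n_ch, freqs.size))
    for k in range(n_seg):
        seg = eeg[:, k * seg_len:(k + 1) * seg_len]
        spec = np.fft.rfft(seg, axis=1)
        psd += (np.abs(spec) ** 2) / (fs * seg_len)
    psd /= max(n_seg, 1)
    # Integrate the power inside each clinical band for every electrode.
    return {name: psd[:, (freqs >= lo) & (freqs <= hi)].sum(axis=1)
            for name, (lo, hi) in bands.items()}
```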

3.2. GAN Based on Channel Attention Normalization and Spatial Pyramid

For the original GAN, random noise (z) is input into the generator (G). G learns to generate samples that mimic the distribution of real samples. Subsequently, both generated and real samples are fed into the discriminator (D), which evaluates and outputs judgments of whether the samples are real or fake. The objective function of the original GAN is represented in Equation (1):
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim P_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \tag{1}$$
Here, $P_{data}(x)$ denotes the distribution of real samples, $P_z(z)$ denotes the prior distribution of the input noise $z$, $D(x)$ is the probability assigned to real images, and $D(G(z))$ is the probability assigned to generated samples. During training, the discriminator aims to maximize the objective function, pushing $D(x)$ toward 1 for real samples and $D(G(z))$ toward 0 for generated samples. Simultaneously, the generator seeks to minimize the objective function, aiming to push $D(G(z))$ toward 1. In the ideal state of adversarial training, the discriminator cannot distinguish between real and generated samples, resulting in output probabilities close to 0.5 for both types of samples. At this point, the generator effectively captures the true data distribution, generating realistically convincing samples and artificially increasing the size of the original dataset to address data scarcity.
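As a concrete illustration of Equation (1), the sketch below expresses the two sides of the min–max game with binary cross-entropy, assuming a discriminator that ends in a sigmoid and outputs probabilities; the non-saturating generator loss is the variant commonly used in practice rather than the literal $\log(1 - D(G(z)))$ term.

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()  # implements the log terms of Equation (1)

def discriminator_loss(D, real, fake):
    # Maximize log D(x) + log(1 - D(G(z))): label real samples 1, generated samples 0.
    real_pred, fake_pred = D(real), D(fake.detach())
    return bce(real_pred, torch.ones_like(real_pred)) + \
           bce(fake_pred, torch.zeros_like(fake_pred))

def generator_loss(D, fake):
    # Non-saturating form: push D(G(z)) toward 1 instead of minimizing log(1 - D(G(z))).
    fake_pred = D(fake)
    return bce(fake_pred, torch.ones_like(fake_pred))
```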
The proposed GAN in this paper introduces an attention mechanism in both channel and spatial dimensions. Specifically, it incorporates normalization constraints based on channel attention and enhances spatial attention by introducing a pyramid structure. These modifications are designed to extract crucial channel features, swiftly locate key areas where features are concentrated, suppress noise interference during model training, improve training efficiency, and consequently enhance the clarity and precision of generated images.
Taking into account the characteristics of EEG datasets, the GAN model adopts the faster and stabilized GAN (FastGAN) [27] for generating high-resolution images from small samples. In comparison to the original GAN, FastGAN incorporates skip-layer excitation (SLE) modules and a self-supervised autoencoding discriminator, providing increased stability in generating high-quality synthetic images. The basic structure of the FastGAN-ASP generator is illustrated in Figure 2.
In the generator model, SLE represents the skip layer excitation module, and orange arrows denote identical skip layer excitation modules. Blue boxes indicate upsampling operations, and blue arrows signify the execution of the same upsampling operations. Within the blue boxes, BatchNorm represents batch normalization, GLU stands for gated linear units, Atten-Norm represents attention normalization, and SP denotes spatial pyramid. The generator takes a 256-dimensional noise vector and undergoes transpose convolutions, resulting in a feature map with dimensions of 4 × 4 × 1024. It then passes through a series of upsampling operations and skip layer excitation modules, ultimately outputting a feature map with dimensions of 256 × 256 × 3.
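For reference, a minimal sketch of one skip-layer excitation module follows; the pooling size, channel counts, and the LeakyReLU slope are assumptions in line with the published FastGAN design rather than values stated in this paper.

```python
import torch
import torch.nn as nn

class SkipLayerExcitation(nn.Module):
    """Sketch of an SLE block: a low-resolution feature map gates the channels
    of a high-resolution one, forming a shortcut between distant generator layers."""

    def __init__(self, ch_low, ch_high):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(4),                    # squeeze spatial dims of the low-res map
            nn.Conv2d(ch_low, ch_high, kernel_size=4),  # 4x4 conv -> 1x1 spatial size
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(ch_high, ch_high, kernel_size=1),
            nn.Sigmoid(),                               # per-channel gating weights in (0, 1)
        )

    def forward(self, x_high, x_low):
        return x_high * self.gate(x_low)                # broadcast over H x W of the high-res map
```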
The fundamental structure of the FastGAN-ASP discriminator is illustrated in Figure 3, where blue boxes represent residual downsampling operations, and blue arrows indicate the execution of identical residual downsampling operations. The orange box signifies the decoder structure.
Here, the discriminator is regarded as an encoder, assisted by a compact decoder for supplementary training. The encoder extracts features from real images, and the decoder reconstructs images based on these features. The discriminator decodes the feature maps $f_1$ (resolution 16 × 16) and $f_2$ (resolution 8 × 8). The decoder comprises four convolutional layers, ultimately generating images with a resolution of 128 × 128. Specifically, the height and width of $f_1$ are randomly cropped to 1/8 of their original size, and the real image is cropped in the same region to obtain $I_{part}$; resizing the real image yields $I$. The decoder uses the cropped $f_1$ to generate $I'_{part}$ and $f_2$ to generate $I'$. This enables the discriminator to learn local features from $f_1$ and global features from $f_2$. Finally, the discriminator and decoder are trained together by matching $I'_{part}$ with $I_{part}$ and $I'$ with $I$, minimizing the reconstruction loss.
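The auxiliary decoding objective just described can be sketched as follows; `decoder_part` and `decoder_full` are hypothetical four-layer convolutional decoders producing 128 × 128 outputs, and a plain L1 distance stands in for whatever reconstruction loss is used in practice.

```python
import random
import torch
import torch.nn.functional as F

def reconstruction_loss(decoder_part, decoder_full, f1, f2, real):
    """Sketch: reconstruct a random crop of the real image from f1 (local features)
    and the whole image from f2 (global features), then match them with L1."""
    b, _, h, w = f1.shape
    ch, cw = h // 8, w // 8                      # crop f1 to 1/8 of its height and width
    y, x = random.randint(0, h - ch), random.randint(0, w - cw)
    f1_crop = f1[:, :, y:y + ch, x:x + cw]

    i_part_hat = decoder_part(f1_crop)           # -> 128 x 128 local reconstruction
    i_hat = decoder_full(f2)                     # -> 128 x 128 global reconstruction

    scale = real.shape[-1] // w                  # map the feature-map crop back to image pixels
    i_part = real[:, :, y * scale:(y + ch) * scale, x * scale:(x + cw) * scale]
    i_part = F.interpolate(i_part, size=128, mode="bilinear", align_corners=False)
    i_full = F.interpolate(real, size=128, mode="bilinear", align_corners=False)

    return F.l1_loss(i_part_hat, i_part) + F.l1_loss(i_hat, i_full)
```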

3.2.1. Channel Attention Normalization

The FastGAN model struggles to locate key features within real images, resulting in unclear generated image details. Additionally, the model faces challenges in effectively filtering information, imposing a heavy burden on the network structure. In order to model interdependencies among channels, allowing the network to recalibrate features selectively and extract useful features while suppressing irrelevant ones in the channel dimension, and thus efficiently allocate limited computational resources, we introduce the channel attention mechanism into both the generator and discriminator. The channel attention module is illustrated in Figure 4.
The input $X$ is mapped into a feature map $U$ by the $F_{tr}$ operation, where $X \in \mathbb{R}^{H \times W \times C}$, $U \in \mathbb{R}^{H \times W \times C}$, and $V = [v_1, v_2, \ldots, v_C]$ represents the set of learned convolution kernels. $v_c$ refers to the parameters of the c-th convolution kernel, and the output after the $F_{tr}$ operation is denoted as $U = [u_1, u_2, \ldots, u_C]$, as illustrated in Equation (2).
$$u_c = v_c * X = \sum_{s=1}^{C} v_c^{s} * x^{s} \tag{2}$$
The symbol $*$ denotes convolution, $v_c = [v_c^{1}, v_c^{2}, \ldots, v_c^{C}]$, $X = [x^{1}, x^{2}, \ldots, x^{C}]$, $u_c \in \mathbb{R}^{H \times W}$, and $v_c^{s}$ represents a two-dimensional convolutional kernel, where $s$ denotes a single channel of $v_c$ and $x^{s}$ represents the input of the s-th channel. When inputting the spatial features of a channel, the model not only learns the channel feature relationships but also sums the convolution results of each channel. Therefore, the channel feature relationships and spatial feature relationships learned by the convolutional kernels are mixed together. The channel attention module performs feature extraction in the channel dimension, allowing the model to directly learn channel feature relationships. The resulting feature map has dimensions $H \times W \times C$.
This channel attention module models channel relationships in a squeeze-and-excitation (SE) block fashion. First is the squeeze module that compresses the image. Since the convolution operation only learns local information, the obtained feature map U cannot capture inter-channel relationships. To address this issue, global average pooling is used in the spatial dimension to compress the global information of the input feature map into a channel descriptor, as specifically shown in Equation (3).
$$z_c = F_{sq}(u_c) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} u_c(i, j), \quad z \in \mathbb{R}^{C} \tag{3}$$
The original feature map U has dimensions H × W × C (where H is height, W is width, and C is the number of channels). It compresses H × W × C to 1 × 1 × C , effectively compressing the two-dimensional feature channels into one-dimensional parameters, embedding global information. Moreover, the input and output dimensions still match in terms of channel numbers. Following this compression, the excitation module performs activation correction. The purpose is to learn nonlinear interactions between channels, capturing dependencies among channels. This objective is achieved using Equation (4).
$$s = F_{ex}(z, W) = \sigma\left(g(z, W)\right) = \sigma\left(W_2\,\mathrm{ReLU}(W_1 z)\right) \tag{4}$$
$\sigma$ represents the sigmoid activation function, and $W_1$ and $W_2$ are the weight matrices of the two fully connected layers, where $W_1 \in \mathbb{R}^{\frac{c}{r} \times c}$ and $W_2 \in \mathbb{R}^{c \times \frac{c}{r}}$. $r$ is the scaling parameter, aimed at reducing the number of channels to lower the computational load, and $c$ represents the number of channels. The compressed parameter $z$ undergoes the first fully connected operation to reduce dimensionality: $W_1$ has dimension $\frac{c}{r} \times c$, $z$ has dimension $1 \times 1 \times C$, and $W_1 z$ has dimension $1 \times 1 \times \frac{c}{r}$. After applying the ReLU activation function, another fully connected operation restores the original dimension: the result is multiplied by $W_2$, giving an output of dimension $1 \times 1 \times C$. Finally, the sigmoid function is applied to obtain $s$.
$u_c$ is a two-dimensional matrix, $s_c$ represents the learned activation value for each channel, and their element-wise product yields the final output, as shown in Equation (5).
$$\tilde{X}_c = F_{scale}(u_c, s_c) = s_c \cdot u_c \tag{5}$$
The channel attention network structure is illustrated in Figure 5.
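A compact sketch of the squeeze-and-excitation channel attention defined by Equations (3)–(5) is given below; the reduction ratio $r = 16$ is an illustrative assumption.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze (global average pooling),
    excitation (two fully connected layers), and per-channel rescaling."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),   # W1: c -> c/r
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),   # W2: c/r -> c
            nn.Sigmoid(),
        )

    def forward(self, u):
        b, c, _, _ = u.shape
        z = u.mean(dim=(2, 3))                 # Eq. (3): squeeze H x W x C to 1 x 1 x C
        s = self.fc(z).view(b, c, 1, 1)        # Eq. (4): excitation weights s
        return u * s                           # Eq. (5): per-channel rescaling
```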
The channel attention module can extract useful features and suppress irrelevant ones through self-training neural networks. However, it overlooks the consideration of feature weight information, which can further suppress irrelevant features within channels. Therefore, in addition to incorporating the channel attention foundation, a normalization constraint is introduced. The importance of weights is represented through the scaling factor of batch normalization, further suppressing unimportant features and enhancing the channel attention mechanism. An attention normalization module is embedded within the channel module, aiming to distinguish the importance of features by measuring the variance of model weights. The variance within channels is quantified using the scaling factor in batch normalization (BN), indicating its significance, as shown in Equation (6).
$$B_{out} = BN(B_{in}) = \gamma \frac{B_{in} - \mu_\beta}{\sqrt{\sigma_\beta^{2} + \varepsilon}} + \beta \tag{6}$$
$\mu_\beta$ and $\sigma_\beta$ represent the mean and standard deviation of mini-batch $B$, while $\gamma$ and $\beta$ denote the trainable affine transformation parameters (scale and shift). The structure of the channel attention normalization module is illustrated in Figure 6 and mathematically represented by Equations (7) and (8), where $\gamma$ stands for the channel scaling factor and $M_c$ denotes the output features.
$$W_\gamma = \frac{\gamma_i}{\sum_{j} \gamma_j} \tag{7}$$
$$M_c = \mathrm{Sigmoid}\left(W_\gamma \cdot BN(F_1)\right) \tag{8}$$
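The normalization-based weighting of Equations (6)–(8) can be sketched as follows, reusing the BatchNorm scale factors $\gamma$ as channel-importance scores; this is a schematic reading of the equations rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

class AttentionNormalization(nn.Module):
    """Channel attention normalization sketch: BatchNorm gamma measures each
    channel's importance and is turned into attention weights."""

    def __init__(self, channels):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)     # Eq. (6): affine BN with learnable gamma, beta

    def forward(self, f):
        out = self.bn(f)
        gamma = self.bn.weight.abs()
        w = gamma / gamma.sum()                # Eq. (7): normalized per-channel weights W_gamma
        out = out * w.view(1, -1, 1, 1)        # weight each channel by its importance
        return torch.sigmoid(out)              # Eq. (8): M_c = Sigmoid(W_gamma * BN(F_1))
```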

3.2.2. Spatial Pyramid

The channel attention mechanism captures dependencies along the channel axis. However, in image data, structural information is also a crucial feature in addition to channels; this includes the overall framework of the image as well as detailed edge information. Capturing these details would ordinarily require more convolutional layers in the network, which unavoidably increases the number of parameters, leading to more complex computations and longer training times. To address this issue and enable the model to quickly identify the spatial location of key image features, a spatial attention module is introduced into both the generator and discriminator, allowing the model to locate relevant spatial targets and assign weights. The original feature map has dimensions of $H \times W \times C$ (where $H$ is height, $W$ is width, and $C$ is the number of channels). The spatial attention module performs average pooling and max pooling along the channel dimension, compressing the feature map from $H \times W \times C$ to $H \times W \times 1$. After the channel dimension is compressed to one, each spatial parameter has a global view of all $C$ channels, which broadens the receptive field. The two pooled maps are then concatenated along the channel dimension, yielding a feature map of dimensions $H \times W \times 2$. A convolution with a 7 × 7 kernel reduces it to a single channel while keeping the height and width unchanged, giving an output of dimensions $H \times W \times 1$. A sigmoid activation then generates the spatial weighting coefficient M. Multiplying this coefficient with the input feature map yields the rescaled feature, as shown in Equation (9).
$$M_s(F) = \sigma\left(f^{7 \times 7}\left(\left[\mathrm{AvgPool}(F); \mathrm{MaxPool}(F)\right]\right)\right) = \sigma\left(f^{7 \times 7}\left(\left[F_{avg}^{s}; F_{max}^{s}\right]\right)\right) \tag{9}$$
$\sigma$ represents the sigmoid function, $f^{7 \times 7}$ denotes a convolution with a 7 × 7 filter, and $F_{avg}^{s}$ and $F_{max}^{s}$ both have dimensions $H \times W \times 1$. The schematic diagram of the spatial attention module is illustrated in Figure 7.
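A minimal sketch of this spatial attention path (Equation (9)) is shown below, following the CBAM-style formulation described above.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Spatial attention sketch: channel-wise average and max pooling are
    concatenated and passed through a 7 x 7 convolution and a sigmoid."""

    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f):
        avg = f.mean(dim=1, keepdim=True)                 # H x W x 1 average over channels
        mx, _ = f.max(dim=1, keepdim=True)                # H x W x 1 max over channels
        m = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))  # spatial weight map M_s
        return f * m                                      # rescale the input feature map
```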
In the spatial attention module, both average pooling and max pooling are simultaneously applied to the feature map. While this helps prevent overfitting and regularizes each feature map, it tends to overly emphasize regularization and overlooks the representation of original features and structural information. Aggregating the original feature maps into a single average value can result in a loss of feature representation capability, thus affecting the learning ability of features. Therefore, to address this issue and enhance feature learning, a pyramid structure is introduced on top of the spatial attention module. In this structure, adaptive average pooling from the pyramid replaces the average pooling and max pooling in the spatial attention module. This integration of structural regularization and information into the attention path ensures that the model learns both the structure and features effectively. The multi-layer perceptron learns weight feature maps from the output of the spatial pyramid structure, allowing the generation of higher-quality images with fewer network layers. The spatial pyramid network structure is illustrated in Figure 8.
The term “AAP” stands for adaptive average pooling, incorporating three adaptive average pooling layers at different scales: 4 × 4, 2 × 2, and 1 × 1. The 4 × 4 average pooling aims to capture more feature representations and structural information. The 1 × 1 average pooling serves as a traditional average pooling with strong structural regularization, while the goal of the 2 × 2 average pooling is to balance the relationship between structural information and structural regularization. The outputs of these three layers are then adjusted into three one-dimensional feature vectors, which are connected to form a one-dimensional feature map.
Assuming a CNN with layers $l \in [1, L]$, where each layer produces output feature maps $X_l$; $P(\cdot, \cdot)$ represents adaptive average pooling, $F_{fc}(\cdot)$ represents fully connected layers, $C(\cdot)$ represents the concatenation operation, $\sigma(\cdot)$ represents the sigmoid activation function, $R(\cdot)$ represents the resize function, and $X_l \in \mathbb{R}^{H \times W \times C}$ denotes the intermediate feature mapping. The spatial pyramid’s output for $X_l$ is expressed as shown in Equation (10).
$$S(X_l) = C\left(R\left(P(X_l, 4)\right),\ R\left(P(X_l, 2)\right),\ R\left(P(X_l, 1)\right)\right) \tag{10}$$
Following this, after passing through fully connected layers, batch normalization, and ReLU activation functions, another concatenation operation is performed. The output, after passing through the sigmoid activation function, is expressed as shown in Equation (11).
$$\xi(X_l) = \sigma\left(BN\left(F_{fc}\left(\mathrm{ReLU}\left(BN\left(F_{fc}\left(S(X_l)\right)\right)\right)\right)\right)\right) \tag{11}$$
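The pyramid pooling and the two fully connected stages of Equations (10) and (11) can be sketched as follows; the hidden width of the perceptron is an illustrative assumption, and the module is written as a stand-alone layer rather than the authors' exact network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidAttention(nn.Module):
    """Spatial pyramid sketch: adaptive average pooling at 4x4, 2x2, and 1x1 scales,
    flattened and concatenated (Eq. 10), then FC -> BN -> ReLU -> FC -> BN -> sigmoid
    (Eq. 11) produces per-channel attention weights."""

    def __init__(self, channels, hidden=128):
        super().__init__()
        in_dim = channels * (16 + 4 + 1)               # concatenated pyramid features
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.BatchNorm1d(hidden), nn.ReLU(inplace=True),
            nn.Linear(hidden, channels), nn.BatchNorm1d(channels),
        )

    def forward(self, x):
        b, c, _, _ = x.shape
        pyramid = [F.adaptive_avg_pool2d(x, s).flatten(1) for s in (4, 2, 1)]
        s_xl = torch.cat(pyramid, dim=1)               # Eq. (10): S(X_l)
        xi = torch.sigmoid(self.mlp(s_xl))             # Eq. (11): attention weights
        return x * xi.view(b, c, 1, 1)
```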

4. Experimental Results and Analysis

4.1. Experimental Environment and Experimental Dataset

The experiments were run on an Intel(R) Core(TM) i7-10700K CPU @ 3.80 GHz and an NVIDIA GeForce RTX 2080 SUPER GPU under 64-bit Windows 10, using Python 3.6.

4.2. Experimental Data

This section describes the dataset used in the experiment, as well as the preprocessing steps applied to the data. Finally, the obtained data are visualized in the form of images.

4.2.1. Dataset Description

The experiment was conducted on the BCI Competition IV-Ⅰ [28] and BCI Competition IV-2b [29] EEG datasets. The BCI Competition IV-Ⅰ dataset consists of data from seven healthy subjects who performed motor imagery tasks based on visual cues of arrows pointing to the left, right, or down. Throughout the experiment, motor imagery was conducted without any external stimulation. Subjects were instructed to choose two out of three categories (left hand, right hand, and foot) for execution. The BCI Competition IV-2b dataset includes data from nine healthy right-handed subjects with normal or corrected-to-normal vision. Subjects were seated approximately 1 meter away from a computer screen while wearing EEG recording equipment. The EEG data collected from each subject were divided into five modules. The first two modules consisted of EEG motor imagery data without visual feedback, while the last three modules included data with visual feedback. Prior to data collection, an eye movement signal test was conducted to assess and eliminate the interference of eye movement signals on EEG signals. EEG signals were captured during motor imagery tasks using three channels: C3, C4, and Cz. These channels are located in the motor cortex region of the brain, providing a better reflection of signal changes during movement while minimizing interference from irrelevant signals. During the execution of motor imagery tasks, the signals from the C4 and C3 electrode sites corresponding to the movement cortex areas for the left and right hands were affected. Cz retained different electrode information depending on the task. This resulted in changes in the power spectral density values in corresponding frequency bands. The differences in power spectral density values were quantified and mapped into a 256 × 256 EEG topographic map for visualization representation.

4.2.2. Data Preprocessing and Feature Extraction

The signal-to-noise ratio of EEG signals is often low, with various unwanted noise components present. Therefore, preprocessing and feature extraction are two crucial steps in the processing of EEG signals. Preprocessing helps eliminate unwanted artifacts from EEG signals, thereby improving the signal-to-noise ratio. Feature extraction aims to capture relevant characteristics from the signals. In the referenced literature [10], a custom pipeline is employed. It utilizes a version of the multichannel Wiener filter to remove high-frequency noise. Subsequently, the process involves blink detection based on a threshold and automatic removal of artifacts. Taking inspiration from this, our approach initially involves the use of a notch filter at 50 Hz to eliminate power line interference. Subsequently, a bandpass filter in the range of 0.1–45 Hz is applied to extract the target frequency band. Independent component analysis (ICA) is then employed to remove ocular artifacts, resulting in smoother EEG data with a noticeable reduction in noise artifacts.
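A minimal MNE-Python sketch of this pipeline (50 Hz notch, 0.1–45 Hz band-pass, ICA-based ocular artifact removal) is given below; the number of ICA components and the index of the excluded ocular component are hypothetical and would be chosen per recording.

```python
import mne

def preprocess(raw):
    """Sketch of the preprocessing pipeline described above."""
    raw.notch_filter(freqs=50.0)                  # remove 50 Hz power-line interference
    raw.filter(l_freq=0.1, h_freq=45.0)           # band-pass to the target frequency range
    ica = mne.preprocessing.ICA(n_components=15, random_state=0)
    ica.fit(raw)
    ica.exclude = [0]                             # hypothetical index of the ocular component
    return ica.apply(raw.copy())
```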

4.2.3. EEG Imaging

Preprocessed EEG signals are analyzed with the MNE-Python 1.6.0 software: the mne.plot_psd() function visualizes the power spectral density, yielding the power spectrum distribution density of the brain model, and the mne.plot_topomap() function visualizes the EEG topography.
Figure 9 illustrates the transformation of waveform EEG signals into EEG topographic maps using the mne.plot_topomap() function after preprocessing, where different colors represent different power spectral density values of the electroencephalogram signals.
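The sketch below shows one way to produce such a map with MNE-Python, assuming the Raw object carries electrode positions; the alpha band limits and the Welch PSD estimator are illustrative choices, and passing an Info object directly to plot_topomap assumes a reasonably recent MNE release.

```python
import mne
from mne.time_frequency import psd_array_welch

def alpha_topomap(raw, fmin=8.0, fmax=13.0):
    """Sketch: map band-limited power spectral density per electrode onto a scalp topography."""
    data = raw.get_data()                                  # (n_channels, n_samples)
    psd, freqs = psd_array_welch(data, sfreq=raw.info["sfreq"], fmin=fmin, fmax=fmax)
    band_power = psd.mean(axis=1)                          # average PSD inside the band
    im, _ = mne.viz.plot_topomap(band_power, raw.info, show=False)
    return im
```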

4.3. Experimental Design

This section describes the experimental phase of generating EEG topographic maps using FastGAN. Firstly, the proposed model and three other generative models are employed to generate EEG topographic maps, each applied to an augmented dataset. Subsequently, both quantitative and qualitative analyses are conducted, providing a comparative assessment of the experimental results across different augmented datasets.

4.3.1. Training the Generator

For the BCI Competition IV-Ⅰ and BCI Competition IV-2b datasets, the EEG signals from each individual are transformed into EEG topographic maps of size 256 × 256, serving as the original datasets. From each original dataset, 500 randomly selected EEG topographic maps are extracted for training. Therefore, the total number of images in the BCI Competition IV-Ⅰ dataset is 3500, and in the BCI Competition IV-2b dataset, it is 4500. These images are then fed into the generative models for training. The generative models employed in the experiment include FastGAN-ASP and comparative models FastGAN, WGAN-GP, and WGAN-GP-ASP. The generator and discriminator are trained for 100,000 iterations, with a batch size of 64. Every 1000 iterations, the weights of the generator and discriminator are saved. Additionally, the generated samples corresponding to each generator are also saved.

4.3.2. Quantitative Analysis

From a theoretical perspective, we employ the FID score for quantitative evaluation of augmented EEG topographic maps. FID originates from statistical aspects of computer vision features calculated from original images. It measures the dissimilarity between generated and real images by quantifying the distance between the feature vectors of these two image sets. This is expressed specifically as shown in Equation (12).
$$FID = \left\| \mu_r - \mu_g \right\|^{2} + \mathrm{Tr}\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{\frac{1}{2}} \right) \tag{12}$$
where $\mu_r$ represents the mean of features from real images, $\mu_g$ represents the mean of features from generated images, $\Sigma_r$ denotes the covariance matrix of features from real images, and $\Sigma_g$ denotes the covariance matrix of features from generated images. $\mathrm{Tr}$ signifies the trace of a matrix, i.e., the sum of the elements along its main diagonal.
For each individual in both the original dataset and the augmented dataset, 2000 EEG topographic maps were randomly sampled. The FID score was then computed to measure the distributional distance between the features of the two sets of images.
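Equation (12) translates directly into the following sketch, which fits a Gaussian to each set of feature vectors (e.g., Inception activations of real and generated topographic maps) and computes their Fréchet distance.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(features_real, features_gen):
    """Equation (12): Frechet distance between Gaussian fits of two feature sets,
    each of shape (n_samples, n_features)."""
    mu_r, mu_g = features_real.mean(axis=0), features_gen.mean(axis=0)
    cov_r = np.cov(features_real, rowvar=False)
    cov_g = np.cov(features_gen, rowvar=False)
    cov_sqrt = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(cov_sqrt):          # discard tiny imaginary parts from numerical error
        cov_sqrt = cov_sqrt.real
    return np.sum((mu_r - mu_g) ** 2) + np.trace(cov_r + cov_g - 2.0 * cov_sqrt)
```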

4.3.3. Qualitative Analysis

From an applied perspective, employing classification networks for assessing augmented EEG topographic maps qualitatively is the focus. These networks are utilized for conducting classification experiments on the aforementioned datasets. Given the datasets’ characteristics of being small scale yet high resolution, a basic CNN might not adequately meet the demands for classification tasks. Consequently, the ResNet-18 model, a deep convolutional neural network, is employed for training and subsequent classification. The objective is to compare the accuracy achieved in classification experiments using datasets augmented by four distinct models: FastGAN-ASP, FastGAN, WGAN-GP, and WGAN-GP-ASP.
From the original dataset, 1000 EEG topographic maps are randomly selected for each individual. These samples are then divided into training and testing sets using a 7:3 ratio. Prior to augmentation, the original training set for each individual consists of 700 images, while the testing set comprises 300 images. Subsequent to augmenting samples using FastGAN-ASP, FastGAN, WGAN-GP, and WGAN-GP-ASP, the original training set is expanded by a factor of five, resulting in each individual’s training set now containing 3500 images. The testing set remains unchanged.
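A minimal setup for the ResNet-18 classifier described above might look as follows; the optimizer, learning rate, and the choice to train from scratch are illustrative assumptions rather than the exact training configuration used in the experiments.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_classifier(num_classes, device="cuda"):
    """Sketch of the ResNet-18 classifier used for the usability evaluation."""
    model = models.resnet18(weights=None)                    # train from scratch on topographic maps
    model.fc = nn.Linear(model.fc.in_features, num_classes)  # replace the final layer
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()
    return model, optimizer, criterion
```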

4.4. Experimental Results and Analysis

4.4.1. Quantitative Analysis Result

The closer the feature distributions of real and generated images are, the more similar the two sets of images. A smaller FID score indicates a closer resemblance. When the FID score is 0, it signifies identical datasets. The FID scores for the generated EEG topographic maps compared to the real EEG topographic maps for each individual are presented in Table 1 and Table 2.
From Table 1, it can be observed that, for the BCI Competition IV-Ⅰ dataset, except for the fifth participant, the FID scores for the other individuals are lower when the model incorporates attention mechanisms compared to when it does not. Additionally, the FID score for the FastGAN model is lower than that of the WGAN-GP model.
As for Table 2, for the BCI Competition IV-2b dataset, FastGAN-ASP shows a significantly greater reduction in FID compared to FastGAN. For the fifth participant, FastGAN-ASP demonstrates a 10.3% decrease in FID relative to the FastGAN model, and WGAN-GP-ASP exhibits a 15.4% decrease in FID compared to the WGAN-GP model.
At the same time, to further demonstrate the impact of incorporating channel attention normalization and spatial pyramid on the generated results, ablation experiments were conducted. The baseline model for ablation experiments was FastGAN. Three variations were considered for comparison: individually adding channel attention normalization (AN), individually adding spatial pyramid (SP), and simultaneously adding both channel attention normalization and spatial pyramid (ASP).
From Table 3, it can be observed that, after individually adding attention normalization, the model’s FID scores decreased by 5.4% and 2.0% compared to the baseline model. However, after individually adding spatial pyramid, the FID scores showed a slight increase. The model with both attention normalization and spatial pyramid had the lowest FID score, showing a reduction of 6.3% and 2.4% compared to the baseline model. This indicates that the combination of attention normalization and spatial pyramid can effectively enhance model performance, making the generated EEG images closer to real images.

4.4.2. Qualitative Analysis Result

From Figure 10, it can be observed that, in the BCI Competition IV-Ⅰ dataset, after training with four different generative models for the same number of epochs, the proposed FastGAN-ASP model achieves the highest accuracy on both the training and testing sets.
From Table 4, it can be observed that, after enhancement using FastGAN-ASP, the average accuracy of the test set increased by 2.83% compared to FastGAN, by 11.91% compared to WGAN-GP, and by 10.7% compared to WGAN-GP-ASP. The average accuracy of the WGAN-GP-ASP test set classification increased by 1.2% compared to WGAN-GP.
According to Figure 11, it is evident that in the training and test sets of the BCI Competition IV-2b dataset, the classification accuracy after enhancement with FastGAN-ASP is higher than the classification accuracy after enhancement with FastGAN. Additionally, when conducting classification experiments on the dataset augmented with WGAN-GP-ASP, which incorporates an attention mechanism, the classification accuracy is higher compared to the classification accuracy of the dataset augmented with the WGAN-GP model.
From Table 5, it can be observed that, after enhancement with FastGAN-ASP, the average accuracy of the test set increased by 1.62% compared to FastGAN, by 21.67% compared to WGAN-GP, and by 18.12% compared to the enhancement using the WGAN-GP-ASP model. The average accuracy of the test set enhanced by WGAN-GP-ASP increased by 3.55% compared to WGAN-GP.
From the comparison of different models in Table 6, it is evident that employing the method proposed in this paper for data augmentation significantly improves the performance of downstream classification recognition tasks. This demonstrates the learning and generative capabilities of the FastGAN-ASP model on real EEG topographic maps. It also validates the authenticity and usefulness of augmented samples.
Through quantitative and qualitative analyses on the BCI Competition IV-Ⅰ and BCI Competition IV-2b standard datasets using different generative models, it can be observed that when trained on high-resolution images, a small dataset tends to result in WGAN-GP generating relatively uniform samples. In contrast, the FastGAN model, incorporating SLE and autoencoders with skip connections instead of deeper network structures, effectively prevents the problem of mode collapse. Consequently, the FID score of the generated images is significantly lower than that of WGAN-GP. The average classification accuracy in classification experiments is higher than WGAN-GP. Furthermore, the incorporation of channel attention mechanisms and spatial pyramid structures in the model allows it to rapidly and accurately learn key features in the data. This results in generated images that closely match the feature distribution of real images. In summary, the generation performance of the FastGAN-ASP model is optimal.
In addition, when compared with the accuracy of current state-of-the-art methods on the BCI Competition IV-2b dataset, Table 7 shows that the method proposed in this paper achieved the highest accuracy, 92.43%.

5. Conclusions

The brain–computer interface technology provides a new connection pathway between the human brain and external computers, offering new perspectives for identity recognition. However, the development bottleneck for identity recognition based on EEG signals lies in the expensive data collection and processing. EEG signals, distinct from ordinary data, pose challenges for data augmentation due to their unique time–frequency–space characteristics. This paper proposes a data generation method based on channel and spatial attention using generative adversarial networks (FastGAN-ASP). It transforms nonstationary and low signal-to-noise ratio waveform EEG signals into intuitive and distinct EEG topographic maps, representing the time–frequency–space features of EEG data in an image format. FastGAN-ASP is employed to augment EEG topographic maps. This method demonstrates promising results on the BCI Competition IV-Ⅰ and BCI Competition IV-2b standard datasets. Quantitative and qualitative analyses using FID scores and ResNet-18 classification networks show that the image quality generated by this model surpasses those generated by FastGAN, WGAN-GP, and WGAN-GP-ASP models. Moreover, it enhances the accuracy of downstream classification recognition tasks. However, EEG signals are time-series data with time–frequency–space characteristics, and effectively capturing and representing the information they contain is a challenge in data augmentation. In future work, we aim to further improve the representation of EEG data and implement more effective methods for augmentation to enhance the accuracy of EEG signal classification.

Author Contributions

Conceptualization, M.Z. and S.Z.; methodology, M.Z.; software, M.Z.; validation, M.Z.; formal analysis, M.Z.; investigation, M.Z.; resources, M.Z. and L.S.; data curation, M.Z.; writing—original draft preparation, M.Z.; writing—review and editing, M.Z., X.M. and L.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Xiao, Z.; Gao, X.; Fu, C.; Dong, Y.; Gao, W.; Zhang, X.; Zhou, J.; Zhu, J. Improving Transferability of Adversarial Patches on Face Recognition with Generative Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 11840–11849. [Google Scholar] [CrossRef]
  2. Zhang, S.; Sun, L.; Mao, X.; Hu, C.; Liu, P. Review on EEG-based authentication technology. Comput. Intell. Neurosci. 2021, 20, 5229576. [Google Scholar] [CrossRef] [PubMed]
  3. Krucoff, M.O.; Rahimpour, S.; Slutzky, M.W.; Edgerton, V.R.; Turner, D.A. Enhancing nervous system recovery through neurobiologics, neural interface training, and neurorehabilitation. Front. Neurosci. 2016, 10, 584. [Google Scholar] [CrossRef] [PubMed]
  4. Urigüen, J.A.; Garcia-Zapirain, B. EEG artifact removal—State-of-the-art and guidelines. J. Neural Eng. 2015, 12, 031001. [Google Scholar] [CrossRef] [PubMed]
  5. Jas, M.; Engemann, D.A.; Bekhti, Y.; Raimondo, F.; Gramfort, A. Autoreject: Automated artifact rejection for MEG and EEG data. NeuroImage 2017, 159, 417–429. [Google Scholar] [CrossRef] [PubMed]
  6. Duffy, F.H.; Burchfiel, J.L.; Lombroso, C.T. Brain electrical activity mapping (BEAM): A method for extending the clinical utility of EEG and evoked potential data. Ann. Neurol. Off. J. Am. Neurol. Assoc. Child Neurol. Soc. 1979, 5, 309–321. [Google Scholar] [CrossRef]
  7. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
  8. Dowson, D.C.; Landau, B.V. The Fréchet distance between multivariate normal distributions. J. Multivar. Anal. 1982, 12, 450–455. [Google Scholar] [CrossRef]
  9. Wolpaw, J.; Birbaumer, N.; McFarland, D.; Pfurtscheller, G.; Vaughan, T. Brain-computer interfaces for communication and control. Clin. Neurophysiol. 2002, 113, 767–791. [Google Scholar] [CrossRef]
  10. Varone, G.; Boulila, W.; Driss, M.; Kumari, S.; Khan, M.K.; Gadekallu, T.R.; Hussain, A. Finger pinching and imagination classification: A fusion of CNN architectures for IoMT-enabled BCI applications. Inf. Fusion 2024, 101, 102006. [Google Scholar] [CrossRef]
  11. Dong, L.; Zhao, L.; Zhang, Y.; Yu, X.; Li, F.; Li, J.; Lai, Y.; Liu, T.; Yao, D. Reference electrode standardization interpolation technique (RESIT): A novel interpolation method for scalp EEG. Brain Topogr. 2021, 34, 403–414. [Google Scholar] [CrossRef]
  12. Huang, W.; Xue, Y.; Hu, L.; Liuli, H. S-EEGNet: Electroencephalogram signal classification based on a separable convolution neural network with bilinear interpolation. IEEE Access 2020, 8, 131636–131646. [Google Scholar] [CrossRef]
  13. Lee, T.; Kim, M.; Kim, S.P. Data Augmentation Effects Using Borderline-SMOTE on Classification of a P300-Based BCI. In Proceedings of the 2020 8th International Winter Conference on Brain-Computer Interface (BCI), Gangwon, Republic of Korea, 26–28 February 2020; pp. 1–4. [Google Scholar] [CrossRef]
  14. Gubert, P.H.; Costa, M.H.; Silva, C.D.; Trofino-Neto, A. The performance impact of data augmentation in CSP-based motor-imagery systems for BCI applications. Biomed. Signal Process. Control. 2020, 62, 102152. [Google Scholar] [CrossRef]
  15. Schwabedal, J.T.C.; Snyder, J.C.; Cakmak, A.; Nemati, S.; Clifford, G.D. Addressing class imbalance in classification problems of noisy signals by using Fourier transform surrogates. arXiv 2018, arXiv:1806.08675. [Google Scholar] [CrossRef]
  16. Zhang, Z.; Duan, F.; Solé-Casals, J.; Dinarès-Ferran, J.; Cichocki, A.; Yang, Z.; Sun, Z. A novel deep learning approach with data augmentation to classify motor imagery signals. IEEE Access 2019, 7, 15945–15954. [Google Scholar] [CrossRef]
  17. Shung, K.K.; Smith, M.; Tsui, B.M.W. Principles of Medical Imaging; Academic Press: San Diego, CA, USA, 2012. [Google Scholar]
  18. He, C.; Liu, J.; Zhu, Y.; Du, W. Data augmentation for deep neural networks model in EEG classification task: A review. Front. Hum. Neurosci. 2021, 15, 765525. [Google Scholar] [CrossRef]
  19. Ma, Y.; Liu, J.; Liu, Y.; Fu, H.; Hu, Y.; Cheng, J.; Qi, H.; Wu, Y.; Zhang, J.; Zhao, Y. Structure and illumination constrained GAN for medical image enhancement. IEEE Trans. Med. Imaging 2021, 40, 3955–3967. [Google Scholar] [CrossRef]
  20. Yao, S.; Tan, J.; Chen, Y.; Gu, Y. A weighted feature transfer gan for medical image synthesis. Mach. Vis. Appl. 2021, 32, 22. [Google Scholar] [CrossRef]
  21. Nie, D.; Trullo, R.; Lian, J.; Petitjean, C.; Ruan, S.; Wang, Q.; Shen, D. Medical Image Synthesis with Context-Aware Generative Adversarial Networks. In Medical Image Computing and Computer Assisted Intervention—MICCAI 2017: 20th International Conference; Springer: Cham, Switzerland, 2017; pp. 417–425. [Google Scholar] [CrossRef]
  22. Costa, P.; Galdran, A.; Meyer, M.I.; Abramoff, M.D.; Niemeijer, M.; Mendonça, A.M.; Campilho, A. Towards adversarial retinal image synthesis. arXiv 2017, arXiv:1701.08974. [Google Scholar] [CrossRef]
  23. Zhang, A.; Su, L.; Zhang, Y.; Fu, Y.; Wu, L.; Liang, S. EEG data augmentation for emotion recognition with a multiple generator conditional Wasserstein GAN. Complex Intell. Syst. 2021, 8, 3059–3071. [Google Scholar] [CrossRef]
  24. Luo, Y.; Zhu, L.Z.; Wan, Z.Y.; Lu, B.L. Data augmentation for enhancing EEG-based emotion recognition with deep generative models. J. Neural Eng. 2020, 17, 056021. [Google Scholar] [CrossRef]
  25. Panwar, S.; Rad, P.; Jung, T.P.; Huang, Y. Modeling EEG data distribution with a Wasserstein generative adversarial network to predict RSVP events. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 1720–1730. [Google Scholar] [CrossRef] [PubMed]
  26. Aznan, N.K.N.; Connolly, J.D.; Al Moubayed, N.; Breckon, T.P. Using Variable Natural Environment Brain-Computer Interface Stimuli for Real-Time Humanoid Robot Navigation. In Proceedings of the 2019 International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4889–4895. [Google Scholar] [CrossRef]
27. Liu, B.; Zhu, Y.; Song, K.; Elgammal, A. Towards faster and stabilized GAN training for high-fidelity few-shot image synthesis. In Proceedings of the International Conference on Learning Representations (ICLR), Virtual Event, 3–7 May 2021. [Google Scholar]
28. Blankertz, B.; Dornhege, G.; Krauledat, M.; Müller, K.-R.; Curio, G. The non-invasive Berlin brain–computer interface: Fast acquisition of effective performance in untrained subjects. NeuroImage 2007, 37, 539–550. [Google Scholar] [CrossRef] [PubMed]
29. Leeb, R.; Brunner, C.; Müller-Putz, G.; Schlögl, A.; Pfurtscheller, G. BCI Competition 2008–Graz Data Set B. Graz Univ. Technol. Austria 2008, 16, 1–6. [Google Scholar]
  30. Zhang, K.; Xu, G.; Han, Z.; Ma, K.; Zheng, X.; Chen, L.; Duan, N.; Zhang, S. Data augmentation for motor imagery signal classification based on a hybrid neural network. Sensors 2020, 20, 4485. [Google Scholar] [CrossRef]
  31. Dai, G.; Zhou, J.; Huang, J.; Wang, N. HS-CNN: A CNN with hybrid convolution scale for EEG motor imagery classification. J. Neural Eng. 2020, 17, 016025. [Google Scholar] [CrossRef]
  32. Majidov, I.; Whangbo, T. Efficient classification of motor imagery electroencephalography signals using deep learning methods. Sensors 2019, 19, 1736. [Google Scholar] [CrossRef]
Figure 1. The model framework diagram of FastGAN-ASP.
Figure 2. Structure of the FastGAN-ASP generator.
Figure 3. Structure of the FastGAN-ASP discriminator.
Figure 4. The channel attention module.
Figure 5. The channel attention network structure.
Figure 6. Channel attention normalization.
Figure 7. Spatial attention module.
Figure 8. Spatial pyramid network structure.
Figure 9. Transforming raw EEG signals into EEG topographic maps.
Figure 10. Accuracy of training and test sets after enhancement of BCI Competition IV-I dataset.
Figure 11. Accuracy of training and test sets after enhancement of BCI Competition IV-2b dataset.
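Figure 9 above illustrates how raw EEG signals are converted into topographic maps before augmentation. The following is a minimal sketch of one way to render such a map with MNE-Python; the library, the channel subset, the montage, and the one-second averaging window are assumptions for illustration only, not the authors' preprocessing pipeline.

```python
# Minimal sketch: rendering an EEG topographic map from one time window of
# multichannel EEG, roughly in the spirit of Figure 9. MNE-Python, the channel
# subset, and the standard 10-20 montage are assumptions for illustration.
import numpy as np
import mne
import matplotlib.pyplot as plt

ch_names = ["C3", "Cz", "C4", "FC3", "FC4", "CP3", "CP4"]  # hypothetical subset
info = mne.create_info(ch_names, sfreq=250.0, ch_types="eeg")
info.set_montage("standard_1020")  # standard 10-20 electrode positions

window = np.random.randn(len(ch_names), 250)  # 1 s of placeholder EEG at 250 Hz
mean_amplitude = window.mean(axis=1)          # one value per channel

fig, ax = plt.subplots()
mne.viz.plot_topomap(mean_amplitude, info, axes=ax, show=False)
fig.savefig("topomap.png", dpi=150)           # image later used as GAN training data
```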
Table 1. Quantitative comparison results of FID scores of different models in the BCI Competition IV-I dataset.

| Model | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 |
|---|---|---|---|---|---|---|---|
| FastGAN (original) | 82.095 | 74.479 | 100.599 | 80.752 | 93.697 | 107.768 | 137.607 |
| FastGAN (add ASP) | 72.627 | 73.516 | 93.553 | 78.686 | 96.826 | 86.886 | 134.569 |
| WGAN-GP (original) | 449.195 | 380.209 | 374.615 | 324.697 | 443.304 | 444.959 | 446.427 |
| WGAN-GP (add ASP) | 394.320 | 249.502 | 327.360 | 289.148 | 463.536 | 379.110 | 407.473 |
Table 2. Quantitative comparison results of FID scores of different models in the BCI Competition IV-2b dataset.

| Model | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 | Subject 8 | Subject 9 |
|---|---|---|---|---|---|---|---|---|---|
| FastGAN (original) | 101.001 | 83.345 | 82.780 | 69.010 | 94.844 | 80.746 | 84.537 | 69.814 | 105.578 |
| FastGAN (add ASP) | 96.765 | 78.533 | 79.578 | 77.690 | 85.998 | 80.826 | 81.310 | 93.024 | 79.664 |
| WGAN-GP (original) | 339.890 | 348.734 | 313.376 | 325.627 | 365.457 | 292.676 | 294.096 | 285.703 | 350.812 |
| WGAN-GP (add ASP) | 324.131 | 292.719 | 305.735 | 330.319 | 316.657 | 273.903 | 293.357 | 276.299 | 297.627 |
Table 3. Quantitative comparison of FID scores of different models on different datasets (↓ indicates that a smaller value is preferable).

| Model | FID (↓), BCI Competition IV-I | FID (↓), BCI Competition IV-2b |
|---|---|---|
| baseline | 96.723 | 85.739 |
| baseline + AN | 91.722 | 84.032 |
| baseline + SP | 102.979 | 90.117 |
| baseline + ASP | 90.952 | 83.710 |
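The FID scores in Tables 1–3 compare the distribution of generated EEG topographic maps against the real ones. The snippet below is a minimal sketch of such an evaluation, assuming the real and generated maps are available as uint8 image tensors and using the torchmetrics FID implementation; the library and tensor layout are assumptions for illustration, not the authors' exact evaluation code.

```python
# Minimal sketch of a per-subject FID evaluation in the spirit of Tables 1-3.
# real_maps / fake_maps: uint8 tensors of shape (N, 3, H, W).
import torch
from torchmetrics.image.fid import FrechetInceptionDistance


def compute_fid(real_maps: torch.Tensor, fake_maps: torch.Tensor) -> float:
    """Frechet inception distance between real and generated topographic maps."""
    fid = FrechetInceptionDistance(feature=2048)  # InceptionV3 pooling features
    fid.update(real_maps, real=True)    # accumulate statistics of real maps
    fid.update(fake_maps, real=False)   # accumulate statistics of generated maps
    return float(fid.compute())         # lower is better (arrow in Table 3)


# Hypothetical usage: one (real, fake) pair per subject, mirroring the
# per-subject columns of Tables 1 and 2.
# for subject, (real, fake) in enumerate(per_subject_maps, start=1):
#     print(f"Subject {subject}: FID = {compute_fid(real, fake):.3f}")
```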
Table 4. Classification accuracy (%) of BCI Competition IV-I dataset after enhancement by different methods.

| Model | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 | Average Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| FastGAN (original) | 91.20 | 93.32 | 92.31 | 93.10 | 90.15 | 94.23 | 94.14 | 92.64 |
| FastGAN (add ASP) | 93.43 | 95.40 | 94.51 | 94.17 | 93.85 | 96.36 | 96.58 | 95.47 |
| WGAN-GP (original) | 83.63 | 83.95 | 84.56 | 85.21 | 81.03 | 83.65 | 82.87 | 83.56 |
| WGAN-GP (add ASP) | 85.42 | 83.56 | 84.78 | 85.81 | 85.30 | 83.77 | 84.70 | 84.76 |
Table 5. Classification accuracy (%) of BCI Competition IV-2b dataset after enhancement by different methods.

| Model | Subject 1 | Subject 2 | Subject 3 | Subject 4 | Subject 5 | Subject 6 | Subject 7 | Subject 8 | Subject 9 | Average Accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| FastGAN (original) | 89.90 | 88.92 | 90.00 | 92.20 | 92.03 | 92.49 | 91.07 | 91.11 | 90.12 | 90.87 |
| FastGAN (add ASP) | 89.79 | 90.65 | 92.55 | 93.00 | 92.81 | 93.13 | 93.89 | 93.06 | 93.02 | 92.43 |
| WGAN-GP (original) | 66.76 | 75.02 | 71.30 | 69.32 | 62.50 | 73.56 | 76.13 | 72.19 | 70.02 | 70.76 |
| WGAN-GP (add ASP) | 76.08 | 74.35 | 78.55 | 71.49 | 69.30 | 73.67 | 75.98 | 77.61 | 71.78 | 74.31 |
Table 6. Classification accuracy (%) of different datasets enhanced by different methods.

| Model | Accuracy (%), BCI Competition IV-I | Accuracy (%), BCI Competition IV-2b |
|---|---|---|
| FastGAN enhance | 92.64 | 90.87 |
| FastGAN-ASP enhance | 95.47 | 92.43 |
| WGAN-GP enhance | 83.56 | 70.76 |
| WGAN-GP-ASP enhance | 84.76 | 74.31 |
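The accuracies in Tables 4–6 come from training a ResNet-18 classifier on the augmented topographic-map datasets. The sketch below shows a minimal version of such a classifier built with torchvision; the pretrained weights, optimizer, learning rate, and two-class output head are assumptions for illustration, not the authors' exact training configuration.

```python
# Minimal sketch of the usability evaluation behind Tables 4-6: a ResNet-18
# classifier trained on augmented EEG topographic-map images.
import torch
import torch.nn as nn
from torchvision import models


def build_classifier(num_classes: int = 2) -> nn.Module:
    # Start from ImageNet-pretrained ResNet-18 (torchvision >= 0.13 weights API)
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    # Replace the final fully connected layer for the two motor-imagery classes
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


model = build_classifier()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # assumed hyperparameters


def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch of (augmented) topographic maps."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```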
Table 7. Classification accuracy (%) of BCI Competition IV-2b dataset by advanced methods.

| Number | References | Enhancement Strategy | Classification Method | Accuracy (%) |
|---|---|---|---|---|
| 1 | Huang [12] | Feature transformation | CNN | 81.52 |
| 2 | Zhang [30] | DCGAN | CNN | 80.6~92.3 |
| 3 | Dai [31] | Feature transformation | HS-CNN | 85.6~87.6 |
| 4 | Majidov [32] | — | SWRM classifier | 82.39 |
| 5 | Ours | FastGAN-ASP | ResNet-18 | 92.43 |