The model was evaluated using ten-fold cross-validation, with accuracy, precision, recall, F1-score, and AUC as metrics. The uncertainty of the classification results was then used to assess the reliability of the model’s predictions. In addition, DU-former was compared with Transformer, EEGNet, and CNN models to demonstrate its superior performance in classifying EEG signals related to episodic memory.
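The evaluation protocol above can be sketched as follows. This is a minimal illustration only: a logistic-regression classifier on synthetic data stands in for DU-former and the EEG features, since the paper's model and data pipeline are not reproduced here.

```python
# Sketch of ten-fold cross-validation with the five reported metrics.
# The classifier and data are hypothetical stand-ins, not the paper's setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

# Synthetic two-class data in place of band-filtered EEG features.
X, y = make_classification(n_samples=500, n_features=20, random_state=0)

scores = {m: [] for m in ("accuracy", "precision", "recall", "f1", "auc")}
for train_idx, test_idx in StratifiedKFold(n_splits=10, shuffle=True,
                                           random_state=0).split(X, y):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    pred = clf.predict(X[test_idx])
    prob = clf.predict_proba(X[test_idx])[:, 1]  # positive-class probability
    scores["accuracy"].append(accuracy_score(y[test_idx], pred))
    scores["precision"].append(precision_score(y[test_idx], pred))
    scores["recall"].append(recall_score(y[test_idx], pred))
    scores["f1"].append(f1_score(y[test_idx], pred))
    scores["auc"].append(roc_auc_score(y[test_idx], prob))

# Metrics are averaged over the ten folds, as in Table 3.
mean_scores = {m: float(np.mean(v)) for m, v in scores.items()}
print(mean_scores)
```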
5.2.1. Model Training Results
Table 3 shows the classification results of the DU-former model across seven frequency bands. Overall, classification performance improved from the low- to the high-frequency bands. Performance was weakest in the Alpha2 band, with accuracy, precision, recall, F1, and AUC values of 0.850, 0.833, 0.809, 0.820, and 0.817, respectively, and strongest in the Gamma band, with values of 0.975, 0.954, 0.989, 0.971, and 0.977. Accuracy reached at least 0.85 in every frequency band, demonstrating that DU-former has a strong ability to classify EEG signals related to episodic memory.
Figure 14 illustrates the uncertainty distribution of classification results across the seven frequency bands. In this context, uncertainty refers to the degree of confidence the DU-former model has in its classification predictions, with lower uncertainty indicating higher confidence and more reliable decisions. The horizontal axis, labeled “Uncertainty Level”, represents the degree of uncertainty in the model’s classification predictions, while the vertical axis, labeled “Occurrence Rate”, indicates the proportion of predictions that fall within each uncertainty level.
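The occurrence rates plotted in Figure 14 amount to binning the per-prediction uncertainty values and normalizing the counts. A small sketch, with bin edges assumed for illustration since the paper does not state them:

```python
# Sketch of an occurrence-rate histogram over uncertainty levels.
# Bin edges are an assumption for illustration, not the paper's choice.
import numpy as np

def occurrence_rates(uncertainty, edges):
    """Proportion of predictions falling into each uncertainty bin."""
    counts, _ = np.histogram(uncertainty, bins=edges)
    return counts / counts.sum()

# Hypothetical uncertainty values for six predictions.
u = np.array([0.02, 0.05, 0.08, 0.15, 0.35, 0.60])
rates = occurrence_rates(u, edges=[0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0])
print(rates)  # first bin holds 3 of 6 values -> 0.5
```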
Uncertainty plays a crucial role in EEG classification, as high uncertainty may indicate ambiguous neural patterns, signal noise, or overlapping feature distributions between different classes [22,23]. By quantifying uncertainty, researchers can assess the reliability of the model’s decisions and determine the proportion of classifications that can be considered trustworthy. As shown in Figure 14, predictions with uncertainty values less than 0.1 account for over 65% of the total data, suggesting that the model is highly confident in most cases.
If an uncertainty threshold of ≤0.3 is considered acceptable, then the acceptable classification results for each frequency band account for 83.0%, 83.2%, 89.1%, 94.5%, 87.1%, 97.3%, and 83.6% of the total data, respectively. This demonstrates that the DU-former provides highly reliable classification results for EEG signals related to episodic memory.
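The threshold rule above reduces to counting the share of predictions at or below the cutoff. A minimal sketch, using synthetic uncertainty values in place of DU-former's outputs:

```python
# Fraction of predictions whose uncertainty is at or below a threshold
# (0.3 here, matching the acceptability cutoff discussed above).
# The uncertainty values are synthetic placeholders.
import numpy as np

def acceptable_fraction(uncertainty, threshold=0.3):
    """Share of predictions with uncertainty <= threshold."""
    u = np.asarray(uncertainty)
    return float((u <= threshold).mean())

rng = np.random.default_rng(0)
# Synthetic distribution skewed toward low uncertainty, as in Figure 14.
uncertainty = rng.beta(1.2, 6.0, size=1000)
rate = acceptable_fraction(uncertainty)
print(f"acceptable: {rate:.1%}")
```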
5.2.2. Model Comparison Results
To evaluate whether DU-former outperforms existing models in classifying episodic memory EEG signals, this paper compares it with EEGNet, Transformer, and CNN.
- (1) Delta Band Model Comparison Results
Table 4 presents the classification results of the four models in the Delta frequency band. The DU-former model achieves the best overall performance, with accuracy, precision, recall, F1, and AUC scores of 0.883, 0.868, 0.856, 0.861, and 0.839, respectively. CNN performs the worst, with an accuracy of only 0.772.
- (2) Theta Band Model Comparison Results
Table 5 shows the classification results of the four models in the Theta frequency band. The DU-former model performs the best, with accuracy, precision, recall, F1, and AUC scores of 0.986, 0.986, 0.983, 0.984, and 0.986, respectively. CNN performs the worst, with an accuracy of only 0.908.
- (3) Alpha1 Band Model Comparison Results
Table 6 shows the classification results of the four models in the Alpha1 frequency band. DU-former outperforms the other three models on all metrics, with accuracy, precision, recall, F1, and AUC values of 0.852, 0.819, 0.838, 0.827, and 0.828, respectively. CNN performs the worst, with an accuracy of only 0.758.
- (4) Alpha2 Band Model Comparison Results
Table 7 shows the classification results of the four models in the Alpha2 frequency band. The DU-former model outperforms the other models overall, with accuracy, precision, recall, F1, and AUC scores of 0.850, 0.833, 0.809, 0.820, and 0.817, respectively. CNN performs the worst, with scores of 0.800, 0.804, 0.787, 0.795, and 0.788, respectively.
- (5) Beta1 Band Model Comparison Results
Table 8 shows the classification results of the four models in the Beta1 frequency band. The DU-former model achieves the best performance, with accuracy, precision, recall, F1, and AUC scores of 0.908, 0.909, 0.872, 0.889, and 0.882, respectively. CNN performs the worst, with an accuracy of only 0.826.
- (6) Beta2 Band Model Comparison Results
Table 9 shows the classification results of the four models in the Beta2 frequency band. The DU-former model performs the best, with accuracy, precision, recall, F1, and AUC scores of 0.928, 0.921, 0.909, 0.914, and 0.936, respectively. CNN performs the worst, with an accuracy of only 0.886.
- (7) Gamma Band Model Comparison Results
Table 10 presents the classification results of the four models in the Gamma frequency band. The DU-former model performs the best, with accuracy, precision, recall, F1, and AUC scores of 0.975, 0.954, 0.989, 0.971, and 0.977, respectively. CNN performs the worst, with scores of 0.817, 0.748, 0.834, 0.789, and 0.820, respectively.
5.2.3. Ablation Study Results
This paper conducted ablation experiments to analyze the impact of various modules on the performance of the DU-former model in classifying EEG signals across different frequency bands. The results shown in Table 11 demonstrate that the exclusion of key modules, such as the SMHSA and the reparameterization module, significantly degrades the model’s classification performance.
In the case of the Alpha1 band, the model’s accuracy dropped from 0.852 to 0.536 when the SMHSA module was excluded, showing a stark decline in the model’s ability to capture critical temporal and spatial dependencies. This sharp reduction highlights the importance of the self-attention mechanism in learning relevant features from the EEG signals. In contrast, excluding the reparameterization module led to a less severe decline, with accuracy dropping to 0.780. This indicates that while the reparameterization module plays a critical role in handling uncertainty and improving model stability, the absence of SMHSA has a more pronounced impact on performance.
Similar trends were observed in other frequency bands such as Alpha2, Beta1, and Beta2. For instance, when the SMHSA module was excluded, the accuracy decreased substantially in the Alpha2 band from 0.850 to 0.555, and in the Beta2 band from 0.928 to 0.582. This consistent degradation underscores the pivotal role of the self-attention mechanism in maintaining classification performance across various EEG frequency bands.
Compared to the removal of the SMHSA module, the removal of the reparameterization module generally caused smaller reductions in performance, as seen in the Beta1 band, where accuracy dropped from 0.908 to 0.860, and in the Gamma band, where accuracy decreased from 0.975 to 0.971.
The results of these ablation studies confirm the significance of both the SMHSA and reparameterization modules in the DU-former model. The SMHSA module is essential for efficiently learning the complex relationships within the data, while the reparameterization module helps the model handle data uncertainty and noise effectively. Removing either module produces a notable drop in classification accuracy, with the removal of SMHSA having by far the larger effect. These findings show that DU-former’s performance depends strongly on both modules, underscoring their integral role in the model’s robustness.