4.1. Dataset
The dataset used in this study comprised a total of 25 h of EEG recordings collected from five healthy volunteer participants, all of whom were students, engaged in a low-intensity control task [24]. This task involved controlling a computer-simulated train using the “Microsoft Train Simulator” program. Each experiment required the participants to control the train for 35–55 min along a primarily featureless route. The study focused on three mental states: focused but passive attention, unfocused or detached but awake, and drowsy.
The first mental state, “focused”, involved participants passively supervising the train while maintaining continuous concentration without active engagement. The second state, “unfocused”, was characterized by participants being awake but not paying attention to the screen, a potentially dangerous state that should trigger an alert. Because this state is difficult to detect through external cues such as video monitoring, it requires sophisticated discrimination methods. The third state was explicit drowsiness [24].
Each participant controlled the simulated train for three distinct 10 min phases during each experiment. Initially, the participants focused closely on the simulator’s controls. In the second phase, they stopped providing control inputs and ceased paying attention to the screen while remaining awake. In the final phase, the participants were allowed to relax, close their eyes, and doze off. Each participant completed seven experiments, with a maximum of one per day. The first two experiments were for habituation, and data collection was conducted during the last five trials. To facilitate the transition to the drowsy state, all experiments were conducted between 7:00 p.m. and 9:00 p.m. The raw data from these experiments are available to researchers on the Kaggle website (https://www.kaggle.com/datasets/inancigdem/eeg-data-for-mental-attention-state-detection).
4.2. Simulation Results
Table 1 shows the classification accuracy results when using an open-source classification tool for six different classification models [46]. The performance of the classifiers was evaluated for both the SCTN-based and FFT features. The FFT features were extracted for various frame durations (5, 10, 15, and 20 ms), and the frame duration that yielded the best classification was selected for comparison. The results show that EEG classification based on our proposed approach (SCTN) outperformed the FFT-based classification for all models, demonstrating an accuracy improvement of up to 6.9%.
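For illustration, the FFT feature extraction described above can be sketched as follows. This is a minimal sketch, not the paper’s implementation: the 128 Hz sampling rate, the band edges, and the frame durations used in the example are placeholder assumptions.

```python
import numpy as np

def fft_band_features(frame, fs=128):
    """Magnitude-spectrum features for one EEG frame (illustrative sketch).

    `frame` is a 1-D signal segment; `fs` is the assumed sampling rate in Hz.
    Returns the average spectral magnitude in five standard EEG bands.
    """
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
             "beta": (14, 30), "gamma": (30, 45)}
    return np.array([spectrum[(freqs >= lo) & (freqs < hi)].mean()
                     for lo, hi in bands.values()])

fs = 128
signal = np.random.randn(fs * 10)      # 10 s of synthetic "EEG"
for duration_s in (0.5, 1.0, 2.0):     # placeholder frame lengths to compare
    n = int(fs * duration_s)
    frames = signal[:len(signal) // n * n].reshape(-1, n)
    feats = np.stack([fft_band_features(f, fs) for f in frames])
    print(duration_s, feats.shape)     # (num_frames, 5 band features)
```

In such a comparison, each candidate frame duration produces one feature set, and the duration giving the best downstream classification accuracy is kept.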
We employed two evaluation methods for situational awareness detection: subject-specific and group-level classification. For the subject-specific classification, the network was trained individually for each participant using only that participant’s data: a random 80% of the subject’s EEG samples were used for training, and the remaining 20% were held out for testing.
Similarly, the group-level classification used an 80–20 split. In this scenario, the training set comprised a random 80% of the joint EEG recordings of all participants, and testing was carried out on the remaining 20%, which was not used during the training phase.
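The two splitting schemes can be sketched as follows. The sample counts, feature dimensionality, and labels are synthetic stand-ins, not the dataset’s actual values.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_80_20(features, labels, rng):
    """Random 80/20 split of samples into train and test sets."""
    idx = rng.permutation(len(features))
    cut = int(0.8 * len(features))
    tr, te = idx[:cut], idx[cut:]
    return (features[tr], labels[tr]), (features[te], labels[te])

# Synthetic stand-in: 5 subjects, 200 samples each, 5 features, 3 classes.
subjects = {f"P{i}": (rng.normal(size=(200, 5)), rng.integers(0, 3, 200))
            for i in range(5)}

# Subject-specific: split each participant's data independently.
per_subject = {s: split_80_20(x, y, rng) for s, (x, y) in subjects.items()}

# Group-level: pool all participants, then split the joint dataset.
all_x = np.concatenate([x for x, _ in subjects.values()])
all_y = np.concatenate([y for _, y in subjects.values()])
(group_train, group_test) = split_80_20(all_x, all_y, rng)

print(per_subject["P0"][0][0].shape)   # (160, 5): 80% of one subject
print(group_train[0].shape)            # (800, 5): 80% of the pooled data
```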
To further evaluate the SCTN-based classifier, we compared our proposed approach with previous related works applied to EEG classification.
Table 2 summarizes the results of four previous studies on detecting mental attention states, as described in [24], compared with the results of our proposed SCTN-based classifier. As seen in the table, our method demonstrated better results than all previous studies, including the one using the SVM classifier. Average classification accuracies of 96.8% and 91.7% were achieved for classifying the three mental states using our method (SCTN) and the SVM method [24], respectively.
The proposed architecture was evaluated both with individual training for each participant and in a common-subject paradigm, where a single generic classifier was jointly trained for all participants. The results show that the generic mental state detector performed slightly worse than the subject-specific detectors, with an accuracy degradation of about 4–6% compared with subject-specific training. This good generalization ability was achieved thanks to the proposed feature extraction method and the EEG feature mapping, which successfully mitigated inter-user differences and highlighted the relevant characteristics.
The performance evaluation is presented for each subject for the subject-specific, transfer learning, and generic models.
Table 3 depicts the results of applying a transfer learning (TL) approach (training on one subject and testing on another) for the five participants (P0–P4). The table also provides the results of the generic model (training with samples from all participants and testing on a specific participant).
The results show success rates in the range of 94.7% to 98.3% for the subject-specific case (shown on the table’s diagonal) and an average accuracy of 92.53% for the transfer learning model. The generic model demonstrated comparable results, with a 92.64% success rate on average. These findings underscore the model’s ability to effectively classify situational awareness based on EEG data under both training paradigms.
Figure 7 depicts topography maps of the EEG signal magnitude in five frequency bands (delta, theta, alpha, beta, and gamma) for each of the three mental states (focused, unfocused, and drowsy). Both the focused and unfocused mental states were characterized mainly by the delta and theta channels, i.e., by low-frequency EEG activity within the 1–8 Hz range. However, whereas the focused state showed increased activity in the frontal lobe, the unfocused state showed decreased activity in this area. The drowsy state was characterized mainly by the alpha sub-band (8–14 Hz). This outcome is consistent with existing research findings on the correlation between the alpha EEG band and the drowsy mental state [24].
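The per-band, per-electrode magnitudes that such topography maps interpolate over the scalp can be sketched as follows. The band edges match those quoted above; the sampling rate and the 14-electrode montage size are placeholder assumptions for this sketch.

```python
import numpy as np

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 30), "gamma": (30, 45)}

def band_magnitudes(eeg, fs=128):
    """Mean spectral magnitude per band for each electrode.

    `eeg` has shape (n_electrodes, n_samples); returns a dict mapping
    band name -> per-electrode magnitude vector, i.e. the scalp values
    a topography map interpolates.
    """
    spec = np.abs(np.fft.rfft(eeg, axis=1))
    freqs = np.fft.rfftfreq(eeg.shape[1], d=1.0 / fs)
    return {name: spec[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
            for name, (lo, hi) in BANDS.items()}

eeg = np.random.randn(14, 128 * 60)        # 14 electrodes, 60 s at 128 Hz
mags = band_magnitudes(eeg)
print(sorted(mags), mags["alpha"].shape)   # five bands, one value per electrode
```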
Figure 8 depicts the temporal activity of the F3 EEG channel, located in the frontal lobe, for the delta sub-band. The spike rate corresponded accurately to the three mental states, showing decreased activity in the unfocused state compared with the focused state.
We evaluated the relative contribution of the various EEG electrodes in discerning the three mental states (focused, unfocused, and drowsy). Remarkably, we discovered that nearly the best classification performance could be achieved using only five EEG electrodes: F7, F3, AF3, F4, and AF4. Notably, three of those five electrodes are situated over the frontal lobe.
Table 4 shows the mental state classification results achieved with a reduced set of only 5 EEG electrodes compared with the 14 available EEG electrodes, employing either all five EEG sub-bands or only the three low-frequency sub-bands (delta, theta, and alpha). Although using all 14 EEG channels and all five sub-bands yielded the best performance, with 96.8% accuracy, the accuracy loss when using only 5 EEG channels and three sub-bands was minor, a degradation of only 3.2%. The benefit, however, was a reduction in area and power, since only 1202 neurons were required compared with 2776 neurons in the best-case scenario. Applying only the three dominant sub-bands (delta, theta, and alpha) for mental state classification resulted in 94.1% accuracy, compared with 96.8% with all five available sub-bands, while reducing the number of required neurons by about 9%. Using all of the frequency sub-bands with only five EEG electrodes resulted in a slight accuracy loss of 1.9% but saved about 50% of the required neurons.
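The channel and sub-band reduction amounts to slicing the (electrodes × sub-bands) feature map. A minimal sketch is shown below; the 14-channel montage ordering is an assumption (an Emotiv-style layout), while the five retained electrodes and three retained sub-bands are those named above.

```python
import numpy as np

# Assumed Emotiv-style 14-channel montage (ordering is illustrative).
CHANNELS = ["AF3", "F7", "F3", "FC5", "T7", "P7", "O1",
            "O2", "P8", "T8", "FC6", "F4", "F8", "AF4"]
BANDS = ["delta", "theta", "alpha", "beta", "gamma"]

def reduce_features(feat, keep_channels, keep_bands):
    """Slice a (channels x bands) feature map down to a reduced set."""
    ci = [CHANNELS.index(c) for c in keep_channels]
    bi = [BANDS.index(b) for b in keep_bands]
    return feat[np.ix_(ci, bi)]

full = np.random.randn(14, 5)                 # full feature map
small = reduce_features(full,
                        ["F7", "F3", "AF3", "F4", "AF4"],
                        ["delta", "theta", "alpha"])
print(full.size, "->", small.size)            # 70 -> 15 input features
```

Shrinking the input feature map in this way is what drives the reduction in required network neurons, traded off against a small accuracy loss.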
4.3. Performance Evaluation
The performance evaluation of the proposed method was carried out using the four following metrics:
Accuracy, calculated as the ratio of the sum of the true positive ($TP$) and true negative ($TN$) predictions to the total number of predictions ($TP$, $TN$, false positive ($FP$), and false negative ($FN$)) made by the model:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Sensitivity, determined as the ratio of true positive ($TP$) predictions to the sum of true positive ($TP$) and false negative ($FN$) predictions:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$
Specificity, calculated as the ratio of true negative ($TN$) predictions to the sum of true negative ($TN$) and false positive ($FP$) predictions:
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$
False positive rate (FPR), determined as the ratio of false positives ($FP$) to the sum of false positives ($FP$) and true negatives ($TN$):
$$\mathrm{FPR} = \frac{FP}{FP + TN}$$
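In code, the four metrics reduce to a few lines. This is a generic sketch: for the three-class problem, the $TP$/$TN$/$FP$/$FN$ counts would be computed per class in a one-vs-rest fashion, and the example counts below are illustrative, not the paper’s data.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity, and FPR from confusion counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
    }

# Illustrative counts (not the paper's data):
m = classification_metrics(tp=96, tn=98, fp=2, fn=4)
print(m)  # accuracy 0.97, sensitivity 0.96, specificity 0.98, fpr 0.02
```

Note that specificity and FPR are complements ($\mathrm{FPR} = 1 - \mathrm{Specificity}$), which is why a high specificity in Table 5 necessarily coincides with a low FPR.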
Table 5 shows the proposed method’s performance in terms of the four metrics, including the accuracy, sensitivity, specificity, and false positive rate (FPR). An average accuracy of 96.8% was demonstrated for the subject-specific model, and precise identification of all three EEG-based mental states was achieved.
The high average sensitivity (96.9%) highlights the model’s robust ability to detect true positive cases, while an average specificity of 98.33% underscores its proficiency in correctly identifying true negative cases, thereby minimizing the risk of misclassifying normal brain activity as an incorrect situational awareness state.
Furthermore, the low FPR average of 1.63% indicates a low incidence of false positive predictions, ensuring reliable differentiation between different mental states based on the EEG recordings.
Table 6 depicts a confusion matrix for evaluating the classification of the three mental states, demonstrating highly accurate classification results.