1. Introduction
Driving is a dynamic and complex set of synchronous actions that includes various secondary tasks, i.e., simultaneous cognitive, spatial and visual tasks. The rapid growth of in-vehicle systems such as telematics and infotainment systems increases the number of secondary tasks performed alongside the primary task of driving. Along with the workload of natural driving, secondary tasks and different road environments increase the Mental Workload (MWL) of drivers. Excessive in-vehicle MWL, which eventually causes mental fatigue if prolonged, can significantly deteriorate driving performance and make the driver more prone to mistakes [1, 2]. A study revealed that 72% of all road accidents each year are due to driver errors [3]. The increase in traffic fatalities due to elevated MWL creates the need to determine in-vehicle drivers’ MWL efficiently. Researchers from diverse domains have investigated mechanisms for assessing drivers’ MWL in both simulated and real environments [2, 4, 5]. Physiological measures, particularly Electroencephalography (EEG), have been shown to be suitable measures of MWL [6, 7, 8]. However, acquiring EEG signals during natural driving requires complex equipment in addition to the in-vehicle systems. As a result, in-vehicle recording of EEG is not well suited to natural driving. An approach to monitoring drivers’ MWL that makes minimal use of EEG signals is therefore a sine qua non.
Several studies have exploited vehicular parameters such as lateral speed, steering wheel angle and lane change as a complementary measure to EEG to gain insight into the driver’s psycho-physiological state [9, 10]. Moreover, recording vehicular parameters is unobtrusive during driving in comparison to EEG recording. Therefore, it would be possible to (i) utilize the association between vehicular parameters and EEG signals, in terms of Mutual Information (MI) [11], to develop a feature template that establishes their combined effect on MWL, and then (ii) use this feature template to evaluate the in-vehicle driver’s MWL from the vehicular parameters alone, which can be easily extracted from the built-in systems of a vehicle. More specifically, the conceived application is to record EEG signals once for a specific driver and a specific vehicle, along with different vehicular features while driving, taking advantage of the added value of neurophysiological data (i.e., EEG). A feature template will be created that combines the underlying characteristics of EEG and vehicular signals, thus enhancing the statistical power of prediction models. Afterwards, this feature template will be fed with only vehicular data to generate an online assessment of the driver’s in-vehicle MWL, thus avoiding both the repeated use of intrusive devices for recording EEG signals in the vehicle and complex computations.
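For reference, the mutual information between two continuous random variables is commonly defined as shown below. The symbols V (a vehicular feature), E (an EEG feature) and p (probability density) are notation assumed here for illustration only and are not taken from the study.

```latex
I(V; E) = \int\!\!\int p(v, e)\, \log \frac{p(v, e)}{p(v)\, p(e)} \, dv \, de \;\ge\; 0,
\qquad
I(V; E) = 0 \iff p(v, e) = p(v)\, p(e).
```

Unlike a linear correlation coefficient, this quantity is zero only when the two variables are statistically independent, so it can capture non-linear associations as well.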
In this context, this study investigated the possible association between vehicular and EEG signals and their relationship with the MWL of drivers while driving. In particular, the present work validates the fusion of these signals with the aim of developing a feature set that can be used for in-vehicle drivers’ MWL evaluation while reducing the need to record EEG signals repeatedly. The aims of this study can be outlined as follows:
Develop a new feature fusion methodology for producing a “feature template” from vehicular and EEG signals. This template can be used to generate a feature set using only vehicular signals for evaluating in-vehicle drivers’ MWL.
Assess the reliability of the feature set developed with the proposed methodology.
Validate the performance of machine learning (ML) models in quantifying and classifying drivers’ MWL using the features extracted with the proposed methodology.
The remaining sections of this article are organized as follows. The background of the research domain and several related works are described in Section 2. Section 3 contains a detailed description of the experimental setup, data collection, analysis, feature set generation and validation of the feature set using regression and classification. The outcomes of the methodology and the discussion of these outcomes are provided in Section 4 and Section 5, respectively. Finally, a summary and possible future directions of this work are discussed in Section 6.
2. Background and Related Works
The task of driving is a combination of several dynamic and complex activities that include simultaneous visual, cognitive and spatial tasks [1]. Fastenmeier and Gstalter defined driving as a human–machine system that continuously changes with the environment. The components of the environment are traffic flow (high or low), road layout (straight, junctions, roundabouts or curves), road design (motorways, city or rural), weather (rainy, snowy or windy), time of day (morning, midday or evening), etc. These components define the overall complexity of the driving task [12]. Furthermore, various studies outline driving as a hierarchy of tasks on three levels. Strategic tasks such as decision making constitute the first level. Following the strategic tasks, the second level, termed the tactical level, comprises tasks such as maneuvering or reacting to changes in the environment. The third level is called the operational level and includes controlling the vehicle. The first two levels demand that drivers voluntarily process and observe various elements of the environment. On the other hand, tasks on the third level are performed automatically depending on the driver’s experience and involve less processing of surrounding information. Miscellaneous tasks associated with the primary task, i.e., controlling the vehicle, tend to increase the MWL of drivers, which results in errors [1, 13].
In the twenty-first century, driving a vehicle causes extensive irregularities in the MWL of drivers [14]. With the increasing number of vehicles on the road and of in-vehicle technologies, the task of driving is becoming more complex, resulting in high MWL. The term workload can relate to physical and/or mental resources and task demands. In the case of driving, MWL is the more appropriate term, and it varies considerably depending on the driver’s capabilities and the task demands [15]. It has been observed that both high and low MWL can impede driving performance [16]. Higher-than-normal MWL can lead to diverted attention, distraction, and inadequate time and capacity for information processing. On the other hand, low MWL can result in slower reactions to events and reduced attention and alertness. Thus, as a complex task, driving demands both psychological and physiological effort, in which MWL is an ineluctable aspect [17]. A study dedicated to finding the causes of road accidents demonstrates that human error directly or indirectly contributes to 90% of accidents [18]. Because the driver’s MWL is associated with committing errors while driving, and since these errors have been demonstrated to be a principal contributing factor to road accidents, research on determining the in-drive MWL of drivers is urgent and important.
Assessment of Drivers’ Mental Workload
A substantial amount of research has been performed on assessing the MWL of humans during operational activities, but most of it concerns the aviation sector rather than automobiles [14]. However, aviation involves only a small, select group of pilots, which makes it easier to study, whereas the automobile domain comprises a comparatively larger number of drivers with diverse backgrounds, experience, skills and age groups, which makes the research more complex. Generally, irrespective of the domain, MWL is assessed in different ways. The methods can be grouped into three classes [19].
Subjective Measures, e.g., NASA Task Load Index (NASA-TLX), workload profile (WP), etc.
Task Performance Measures, e.g., time to complete a task, reaction time to a secondary task, etc.
Physiological Measures, e.g., EEG, heart rate (HR), etc.
In contrast to subjective measures, physiological measures are primarily objective in nature and can be collected without imposing additional tasks on the participant. Gathering task performance measures, on the contrary, requires additional secondary tasks while driving, although the primary task is already overloaded with diverse secondary tasks. Physiological measures can assess the mental impairment of the participant without imposing additional tasks or degrading performance on the primary task [6, 8]. According to Guzik, physiological measures are often selected over other measures as a means of assessing MWL because of cheaper and smaller technologies [20]. Respiration, blood pressure, skin conductance, cardiac activity, brain measures, ocular measures, etc., are noteworthy examples of physiological measures. Owing to the wide availability and portability of the technology and, more specifically, to their ability to indicate neural activation, EEG signals have been widely chosen by researchers to assess the MWL of drivers while driving. In a recent review of works on drivers’ MWL, Charles and Nixon mention that most research is carried out using EEG signals as a tool to measure MWL [21, 22]. In addition, research has established a significant association between MWL and EEG features extracted in the time and frequency domains. Features such as waveform length, zero crossings, mean absolute value and slope sign changes are extracted from EEG in the time domain and further utilized in classification tasks in the domain of brain–computer interfacing [23]. On the other hand, the Alpha and Theta rhythms of EEG signals over the parietal and frontal regions of the brain, respectively, significantly reflect the MWL variation of participants [24, 25].
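The time-domain features named above can be sketched as follows. This is a minimal illustration using their textbook definitions on a single signal window; the function name `time_domain_features` and the toy window are assumptions made here, not the implementation used in the cited works. Practical implementations usually add a small amplitude threshold to the zero-crossing and slope-sign-change counts to suppress noise, which is omitted here.

```python
import numpy as np

def time_domain_features(x):
    """Return four common time-domain features of a 1-D signal window."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)  # first difference between consecutive samples
    return {
        # mean absolute value: average magnitude of the samples
        "mav": float(np.mean(np.abs(x))),
        # waveform length: cumulative absolute change over the window
        "wl": float(np.sum(np.abs(d))),
        # zero crossings: consecutive samples with opposite signs
        "zc": int(np.sum(x[:-1] * x[1:] < 0)),
        # slope sign changes: consecutive differences with opposite signs
        "ssc": int(np.sum(d[:-1] * d[1:] < 0)),
    }

# Toy window (hypothetical values, not EEG data)
window = [1.0, -1.0, 2.0, -2.0, 1.0]
print(time_domain_features(window))
```

In practice such a function would be applied to each channel and each sliding window of the recording, producing one feature vector per window.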
Computationally expensive methods like statistical analysis and signal processing are widely deployed to transform EEG signals into features that can be directly used for measuring MWL. The literature indicates a variety of approaches to extract features from EEG signals, for example, a non-linear approach using fractal dimensions, discrete wavelet transform, non-negative matrix factorization, time and frequency domain analysis, etc. [26, 27, 28, 29, 30]. Recently, the use of Deep Learning (DL) techniques has increased in this domain to reduce the complexity of adopting these methods. A Convolutional Neural Network (CNN) was used by Wen et al. for unsupervised feature learning from EEG signals in classifying epilepsy patients [31]. In addition to CNN, the use of Long Short-Term Memory (LSTM) [32], Deep Belief Network (DBN) [33], Stacked Denoising Autoencoder (SDAE) [34], etc., has also been observed in the literature. After extracting features from the EEG signal, different ML algorithms are widely used, namely, Support Vector Machine (SVM), k-Nearest Neighbors (k-NN), Fuzzy c-Means Clustering, Multi-Layer Perceptron (MLP), etc. [35].
In summary, the prevailing methods of assessing the in-vehicle MWL of drivers require an extensive setup to collect physiological signals. On top of that, complex analysis and computation are required to extract the expected outcome, let alone to deploy it further. However, almost all modern vehicles provide means to record different parameters of vehicle maneuvering, e.g., velocity and acceleration. Solovey et al. utilized such vehicular data, aligned with physiological data, to evaluate automotive user interfaces [9]. To our knowledge, no work to date has considered only vehicular data in assessing drivers’ MWL. The prior work builds the foundation for this work, which employs vehicular data with a pre-compiled hybrid template of vehicular and physiological data for assessing the in-vehicle MWL of drivers and which may help reduce the complexity of the in-vehicle setup and of the extensive analysis of physiological measures.
5. Discussion
An increase in secondary tasks, e.g., reaching for the mobile phone, interacting with the mobile phone (touching the screen, dialing and texting), talking, reading the screen, glancing at the phone momentarily and talking or listening to a hands-free device, together with the primary task of driving causes increased MWL. According to the state-of-the-art (SotA) approaches, Electroencephalography (EEG) has been proven to be a good measure of MWL and is widely used in research [6, 7, 8], although it is not sufficiently feasible in terms of data acquisition, processing and decision making while driving a car in a naturalistic environment. The aim of this study was therefore to identify a methodology for constructing a novel mutual information-based feature set from the fusion of electroencephalography and vehicular signals and to deploy it in evaluating drivers’ mental workload. In this study, EEG and vehicular signals were recorded through a driving experiment in real scenarios that varied in different factors: “HOUR” and “ROAD” [24]. Two different events were also introduced to investigate their effects on drivers’ MWL. Since the experiment was conducted in a real environment, other road users might be present or absent. The events made it possible to analyze uniformly, for all participants, the effect of specific road users other than the regular traffic on the road. According to the initial data analysis at group level, different situations and road users affect the MWL of drivers and their vehicle handling. The results of the observation (Section 3.2) confirmed the experimental hypothesis, i.e., “the driving task in terms of road complexity as well as events induced differences in driving behaviors and drivers’ experienced MWL”. Statistical hypothesis tests were conducted on average driving velocity and drivers’ MWL, and significant (p < 0.05) differences were observed. The tests are described in detail in Section 3.2.3. In addition, several comparative plots were drawn to assess the effects visually, which are illustrated in Figure 2 and Figure 3. In short, the comparisons pointed out that MWL and vehicle handling both change when the road condition or the events on the road are altered. However, the effects of a change in events on MWL and driving behavior are stronger than those of a change in road condition. These findings, together with the prior literature review on the advantages and disadvantages of EEG features as a measure of MWL, formed the basis of further analysis and strengthened the motivation to utilize mostly vehicular features in association with EEG for evaluating the MWL of drivers.
To combine EEG features and vehicular features, the correlation between them was calculated, and the resulting correlation coefficients were negligible. In contrast, prior investigations of the average driving velocity and MWL (Section 3.2) showed changes when the driving environment was varied (Section 3.2.3). Thus, the motivation for exploiting MI between EEG and vehicular signals arose from the low correlation coefficients combined with the significant similarity in the changes of MWL and the vehicular signals. On this basis, the novel concept of utilizing MI was proposed. The reference values of MI between two continuous variables should be in the range [0, ∞) [11]. The MI was calculated based on the relation between EEG and vehicular features, and the average value was found to be approximately 8.5, which is very low but not null. The data for this study were recorded in a specific experiment with specific participants and represent their brain activity and vehicle handling together for the respective population distribution. However, the low MI values could be due to the small number of vehicular features. Even though the MI values were low, the proposed features in some cases outperformed established objective measures in MWL evaluation. With more vehicular features, there would be a wider variety of ways to capture how participants handle the vehicle, and systems could consequently attain higher performance in MWL evaluation. Experiments are underway to increase the number of vehicular features by adding other parameters from inertial measurement unit (IMU) devices.
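The contrast between a negligible correlation coefficient and a non-zero MI can be illustrated with a small synthetic example. The sketch below uses a simple histogram (plug-in) estimator, which is only one of several possible MI estimators and is not necessarily the one used in this study; the function `histogram_mi`, the bin count and the synthetic signals are all assumptions made for illustration.

```python
import numpy as np

def histogram_mi(x, y, bins=16):
    """Plug-in estimate of mutual information (in nats) from a 2-D histogram."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()              # joint probability per cell
    px = pxy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = pxy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(0)
v = rng.uniform(-1.0, 1.0, 5000)            # synthetic "vehicular" feature
e = v ** 2 + rng.normal(0.0, 0.05, 5000)    # synthetic "EEG" feature, non-linear in v
noise = rng.normal(0.0, 1.0, 5000)          # signal independent of v

r = np.corrcoef(v, e)[0, 1]                 # Pearson correlation, close to zero here
mi_dependent = histogram_mi(v, e)           # clearly above zero
mi_independent = histogram_mi(v, noise)     # close to zero (small estimator bias)
print(r, mi_dependent, mi_independent)
```

Because the dependence in this toy example is symmetric and non-linear, the Pearson coefficient stays near zero while the MI estimate is clearly positive, which mirrors the reasoning given above for preferring MI over correlation.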
One of the objectives of this study was to quantify the MWL of drivers from the proposed feature set. To test the performance of the proposed feature set, four different ML regression methods were investigated: LnR, MLP, RF and SVM, considering the MWL score extracted by expert-defined methods as true values. For the regression, the true values of the MWL score fall in the range [0, 1], where 0 represents no MWL and 1 represents the highest MWL from an individual point of view [24]. For each of the regression models, the average MAE and MSE were around 0.16 and 0.04, respectively (Table 4). These errors were then compared with the results of regression models trained using EEG-based features. In comparison, the two feature sets produced approximately similar errors when predicting the MWL scores of drivers; the comparison of MAE in 10-fold CV is illustrated in Figure 6 and Figure 7. From the visualizations, it was observed that the difference in average error was lowest for the RF regression model among the considered models, which might be an effect of functional differences in terms of the ensemble technique [63], as described in Section 3.4.
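The error metrics and the 10-fold protocol mentioned above can be sketched as follows. This is an illustration only: it uses an ordinary least-squares model on synthetic data as a stand-in for the regressors of the study, and the helper names (`mae`, `mse`, `kfold_linear_errors`) and all data values are assumptions introduced here.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def kfold_linear_errors(X, y, k=10, seed=0):
    """Per-fold (MAE, MSE) of a least-squares linear model under k-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)  # shuffled, near-equal folds
    Xb = np.column_stack([np.ones(len(y)), X])          # prepend intercept column
    results = []
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False                         # hold out the current fold
        coef, *_ = np.linalg.lstsq(Xb[train], y[train], rcond=None)
        pred = Xb[test_idx] @ coef
        results.append((mae(y[test_idx], pred), mse(y[test_idx], pred)))
    return np.array(results)

# Synthetic data: three features and a target score clipped to [0, 1]
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = np.clip(0.5 + 0.1 * X[:, 0] + rng.normal(0.0, 0.1, 200), 0.0, 1.0)

fold_errors = kfold_linear_errors(X, y)
print("mean MAE, mean MSE:", fold_errors.mean(axis=0))
```

Averaging the per-fold errors gives the kind of summary figures reported per model, and the per-fold values are what a fold-wise comparison plot between two feature sets would display.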
In addition to MWL quantification, the performance of MWL and event classification using MI-based features was also examined against EEG-based features. Classifier-wise average performance on MWL and event classification was tested using a one-sided Wilcoxon signed-rank test [59]. Unlike in MWL quantification, the average performance of the SVM classifier with the MI-based feature set was significantly higher in both classification tasks (Table 5). According to Shah, SVM is the most widely used algorithm for classification tasks on the basis of features extracted from EEG signals [35]. The initial finding of this study aligns with this statement. On the other hand, the other three classifiers, LgR, MLP and RF, performed better in event classification with MI-based features. To assess the binary classification capacity, AUC-ROC curves were plotted, where RF outperformed all other classifiers in terms of AUC values. For simplicity, Figure 8 illustrates the AUC-ROC curves only for the RF and MLP classifiers, which achieved the higher AUC values when tested on the holdout set. In addition, DeLong’s test [75] for comparing AUC values demonstrated significant differences similar to those shown by the one-sided Wilcoxon signed-rank test [59]. It can be observed from Table 6 that all the calculated AUC values are within the 95% confidence interval for the true AUC values. Moreover, the values of Z and p are consistent, i.e., in the case of significant values of p, we accept the alternative hypothesis that the AUC values for classifiers trained on MI-based features are higher than those for classifiers trained on EEG-based features, and the signs of the test statistic Z express the same relation between the AUC values. However, according to the performance metrics, in MWL classification, RF achieved the highest AUC value of 0.92 with an accuracy of 82% with MI-based features, whereas the AUC value was 0.96 (Figure 8a) with an accuracy of 88% (Table 7) with EEG-based features. The performance on event classification (Car or Pedestrian) was evaluated with the same ML algorithms considering both feature sets. In event classification, RF with MI-based features, with an AUC value of 0.98, outperformed EEG-based features, with an AUC value of 0.95 (Figure 8b). The accuracy on the test set in classifying events was 94% for the RF classifier using MI-based features, which is the best performance achieved in this whole study (Table 8).
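As a reminder of what the AUC values above measure, the sketch below computes the area under the ROC curve through its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The function `auc_from_scores` and the toy labels and scores are assumptions for illustration; this is not the evaluation code of the study.

```python
import numpy as np

def auc_from_scores(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos, neg = y_score[y_true], y_score[~y_true]
    diff = pos[:, None] - neg[None, :]     # all pairwise score differences
    correct = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(correct / (len(pos) * len(neg)))

# Toy example: two negatives (label 0) and two positives (label 1)
labels = [0, 0, 1, 1]
predicted = [0.1, 0.4, 0.35, 0.8]
print(auc_from_scores(labels, predicted))  # 0.75: three of four pairs are ordered correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why AUC complements accuracy when comparing classifiers trained on different feature sets.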