#### *2.4. Measures*

Two methods of evaluating driving stress were studied and compared: one based on the dependency between stress and road traffic, and one based on the dependency between stress and road type. The vehicle speed and the driver's EDA signals were recorded on different road segments, such as city and highway segments, as shown in Figure 3.

The road traffic conditions were classified as low traffic or high traffic based on the vehicle speed. The high traffic criteria, which characterize potential traffic congestion, are set in Table 3. The occurrence of high traffic corresponds to high driving stress and its absence to low driving stress. The average speed and the standard deviation of the speed were calculated for every six-minute window of speed data. Each window was then automatically compared with the adjusted average speed and speed standard deviation criteria in Table 3 to identify potential traffic congestion. If both the average and the standard deviation of the vehicle speed during a window were lower than the criteria (Table 3), that window was classified as high traffic; otherwise, it was classified as low traffic. The driver's stress level was assumed to be low in low traffic conditions and high in high traffic conditions.
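The window-based rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_traffic`, the use of non-overlapping windows, the population standard deviation, and the numeric limits in the demo are all hypothetical, and the actual criteria are those listed in Table 3.

```python
from statistics import mean, pstdev


def classify_traffic(speeds, samples_per_window, mean_limit, sd_limit):
    """Label each non-overlapping window of speed samples as 'high' or 'low' traffic.

    A window is labeled 'high' when both its average speed and its speed
    standard deviation are below the given limits, following the rule in the text.
    Trailing samples that do not fill a whole window are ignored (an assumption).
    """
    labels = []
    last_start = len(speeds) - samples_per_window
    for start in range(0, last_start + 1, samples_per_window):
        window = speeds[start:start + samples_per_window]
        avg = mean(window)
        sd = pstdev(window)  # population SD; the paper does not state which form was used
        is_high = avg < mean_limit and sd < sd_limit
        labels.append("high" if is_high else "low")
    return labels


# Hypothetical demo: two 4-sample windows and made-up limits (not the Table 3 values).
demo_speeds = [5, 6, 5, 6, 60, 100, 60, 100]
print(classify_traffic(demo_speeds, 4, mean_limit=20, sd_limit=10))  # prints ['high', 'low']
```

For a real recording, `samples_per_window` would be the number of speed samples in six minutes, for example 360 if the speed were logged once per second (the logging rate is an assumption here).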

**Figure 3.** Line plot of the collected speed and electrodermal activity (EDA) data in the same time series for an evening session.


**Table 3.** Models with various averages and standard deviations of vehicle speed.

For the second model, which relates road type to stress state, city driving was considered to result in a high stress level and highway driving in a low stress level. City driving was taken to cause higher stress because of the large number of pedestrians, traffic lights, and traffic congestion. Such difficult driving conditions are less likely to occur on highways, so highway driving was classified as low stress driving.

We analyzed the extracted EDA features for each road segment, with a driving time of about 35–40 min on highway segments and 20–25 min on city segments, for a total duration of approximately 60 min. Segments in which periods of high and low stress overlapped were excluded. Feature extraction is an important signal processing step for isolating the dominant information in a signal and for reducing the amount of subsequent processing. Depending on the type of signal, different features can be extracted. In this study, for the physiological EDA signals, we extracted the features proposed by Healey and Picard [5]: the amplitude (OM) and duration (OD), calculated from the signal peaks and valleys.

The processed EDA signal was resampled at 15.5 Hz. Based on the calculation of mean OD ± 3 × standard deviation (SDOD), it was found that 99.7% of OD values fall within the confidence interval. This means that the six-minute window size (approximately equal to 5580 samples) meets the accuracy requirements. A one-minute sliding window, approximately equal to 930 samples, was used, and the moving average window was 60 samples. To find the signal peaks and valleys, the "findpeaks" function in MATLAB (R2017 version) was used. The feature extraction process yields the following EDA characteristics: minimum (min OD and min OM), maximum (max OD and max OM), mean (mean OD and mean OM), standard deviation (SD OD and SD OM), summation (sum of ODs and sum of OMs), and the number of occurrences of duration and amplitude (nOD and nOM). Figure 4 shows the results of applying the OM and OD extraction algorithm to the morning driving session on 11 January 2018.

**Figure 4.** EDA signal processing for January 11, 2018 morning session (peaks and valleys are marked by - and ×, respectively).
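The OM and OD extraction described above can be sketched as follows. This is a minimal illustration, not the authors' MATLAB procedure: the strict local-extrema search is a simple stand-in for "findpeaks", and pairing each peak with the nearest preceding valley (OM as the rise in amplitude, OD as the rise time in seconds) is an assumption about how the features are defined. The helper names are hypothetical.

```python
from statistics import mean, pstdev

FS_HZ = 15.5                           # resampling rate stated in the text
SIX_MIN_SAMPLES = round(360 * FS_HZ)   # 5580 samples
ONE_MIN_SAMPLES = round(60 * FS_HZ)    # 930 samples


def local_extrema(signal):
    """Return index lists (peaks, valleys) of strict local maxima and minima."""
    peaks, valleys = [], []
    for i in range(1, len(signal) - 1):
        if signal[i - 1] < signal[i] > signal[i + 1]:
            peaks.append(i)
        elif signal[i - 1] > signal[i] < signal[i + 1]:
            valleys.append(i)
    return peaks, valleys


def om_od_pairs(signal, fs=FS_HZ):
    """Pair each peak with the nearest preceding unused valley.

    Returns two lists: OM (peak value minus valley value) and
    OD (time from valley to peak, in seconds).
    """
    peaks, valleys = local_extrema(signal)
    events = sorted([(i, "p") for i in peaks] + [(i, "v") for i in valleys])
    oms, ods = [], []
    last_valley = None
    for idx, kind in events:
        if kind == "v":
            last_valley = idx
        elif last_valley is not None:
            oms.append(signal[idx] - signal[last_valley])
            ods.append((idx - last_valley) / fs)
            last_valley = None  # each valley is used at most once
    return oms, ods


def summarize(values, name):
    """Return the six summary features named in the text for one variable."""
    if not values:
        return {f"n{name}": 0}
    return {
        f"min {name}": min(values),
        f"max {name}": max(values),
        f"mean {name}": mean(values),
        f"SD {name}": pstdev(values),  # population SD, an assumption
        f"sum {name}": sum(values),
        f"n{name}": len(values),
    }
```

Calling `om_od_pairs` on one six-minute window of the smoothed signal and passing the two returned lists to `summarize(oms, "OM")` and `summarize(ods, "OD")` gives the twelve features listed above for that window.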

#### *2.5. Analysis Method*

Logistic regression analyses were performed with IBM SPSS Statistics Version 23 for the different traffic conditions and road types. Figure 5 gives a schematic description of the development process for both models and their application areas. The physiological features of EDA (OD and OM) and the OBD-II data (vehicle speed) were used to construct the same framework in this study. In particular, the model development section of Figure 5 summarizes in detail the study, which is designed to identify the abstract content shown in Figure 1.

The development process in Figure 5 was performed in five steps: data collection, data pre-processing, analysis, results, and comparison of classifiers. The data collection step shows the period, place, and sensors used during the experiment. Data pre-processing introduces the preliminary processing steps applied to the obtained data for each method. Analysis and results show the analytical methods used and the main results obtained. Comparison of classifiers provides a general comparison of both methods. The model application describes the most applicable areas for the developed methods, such as road traffic management, medicine, and electronic devices.

Based on the previous studies discussed in the introductory section, key parameters of the physiological signals, road types, and traffic situation characteristics were selected as classifiers. The most important EDA features related to driving stress are the minimum, maximum, mean, standard deviation, sum, and number of occurrences of OD and OM. Driving conditions affect the mental state of the driver, and road type [5,6] and traffic jams [24] are representative elements. On this basis, the current study classified the road types and identified traffic jams using the vehicle speed and the standard deviation of the speed [27,28]. A summary of the models used in this study is shown in Table 4.

**Figure 5.** Stepwise development and application of the models.



For a complete evaluation of the efficiency of the classification models, the accuracy (A), sensitivity (Sn), specificity (Sp), and positive predictive value (PPV) were additionally calculated as characterizing measures (see Table 3):

$$\text{A} = \text{(cases of high stress + cases of low stress)} / \text{(all cases of stress)} = \text{(TP} + \text{TN)} / \text{(TP} + \text{TN} + \text{FP} + \text{FN)}, \tag{1}$$

$$\text{Sn} = \text{(cases of high stress)} / \text{(all cases of stress)} = \text{TP} / \text{(TP} + \text{FN)}, \tag{2}$$

$$\text{Sp} = \text{(cases of low stress)} / \text{(all cases of low stress)} = \text{TN} / \text{(TN} + \text{FP)}, \tag{3}$$

$$\text{PPV} = \text{(cases of high stress)} / \text{(all cases of high stress)} = \text{TP} / \text{(TP} + \text{FP)}. \tag{4}$$

In Equations (1)–(4), FP, FN, TP, and TN denote the numbers of false positive, false negative, true positive, and true negative cases, respectively. The accuracy is the ratio of correct predictions to the total number of analyzed cases. False positives and false negatives are cases in which the developed classifier erroneously recognizes low stress as high stress and high stress as low stress, respectively. True positives and true negatives are cases in which the developed classifier recognizes the stress level correctly.
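As a worked illustration of Equations (1)–(4), the short Python sketch below computes the four measures from confusion-matrix counts, with high stress treated as the positive class. The function name `classification_metrics` and the counts in the demo are hypothetical and are not results from this study.

```python
def _ratio(numerator, denominator):
    """Divide safely; return NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and PPV from counts.

    tp, tn, fp, fn are the numbers of true positive, true negative,
    false positive, and false negative cases (positive = high stress).
    """
    return {
        "A": _ratio(tp + tn, tp + tn + fp + fn),  # Equation (1)
        "Sn": _ratio(tp, tp + fn),                # Equation (2)
        "Sp": _ratio(tn, tn + fp),                # Equation (3)
        "PPV": _ratio(tp, tp + fp),               # Equation (4)
    }


# Made-up counts, for illustration only.
metrics = classification_metrics(tp=40, tn=30, fp=10, fn=20)
for name, value in metrics.items():
    print(f"{name} = {value:.3f}")
```

With these illustrative counts, the accuracy is 70/100 = 0.7, the sensitivity 40/60, the specificity 30/40 = 0.75, and the PPV 40/50 = 0.8.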
