Article

Naturalistic Driving Data-Based Anomalous Driving Behavior Detection Using Hypertuned Deep Autoencoders

Shafqat Abbas, Muhammad Ozair Malik, Abdul Rehman Javed and Seng-Phil Hong
1 Department of Cyber Security, Air University, Islamabad 44000, Pakistan
2 Department of Electrical and Computer Engineering, Lebanese American University, Byblos P.O. Box 36, Lebanon
3 AI Advanced School, aSSIST University, 46 Ewhayeodae 2-gil, Fintower, Sinchon-ro, Seodaemun-gu, Seoul 03767, Republic of Korea
* Author to whom correspondence should be addressed.
Electronics 2023, 12(9), 2072; https://doi.org/10.3390/electronics12092072
Submission received: 30 March 2023 / Revised: 19 April 2023 / Accepted: 28 April 2023 / Published: 30 April 2023
(This article belongs to the Special Issue IoT for Intelligent Transportation Systems)

Abstract

Autonomous driving is predicted to play a large part in future transportation systems, providing benefits such as enhanced road usage and mobility schemes. However, self-driving cars must be perceived as safe drivers by other road users and contribute to traffic safety in addition to being operationally safe. Despite efforts to develop machine learning algorithms and solutions for the safety of automated vehicles, researchers have yet to agree upon a single approach to categorizing and accurately detecting safe and unsafe driving behaviors. This paper proposes a modified Z-score method-based autoencoder for anomalous behavior detection using multiple driving indicators. The experiments are performed on the benchmark Next Generation Simulation (NGSIM) vehicle trajectories and supporting datasets to discover anomalous driving behavior to assess our proposed approach’s performance. The experiments reveal that the proposed approach detected 81 anomalous driving behaviors out of 1031 naturalistic driving behavior instances (7.86%) with an accuracy of 96.31% without early stopping. With early stopping, our method successfully detected 147 anomalous driving behaviors (14.26%) with an accuracy of 95.25%. Overall, the proposed approach provides promising results for detecting anomalous driving behavior in automated vehicles using multiple driving indicators.

1. Introduction

Automated driving is widely anticipated to have a role in future mobility systems [1,2]. Lane keep assist (LKA) and adaptive cruise control (ACC), two popular examples of contemporary vehicle automation, will advance and expand further. These so-called advanced driving assistance systems (ADASs) are expected to improve safety and traffic flow [3,4]. Expectations for self-driving cars are much higher since they will enable the transportation system to be overhauled by enabling mobility-as-a-service plans, effective ground logistics, and better road use [5]. Automated vehicles need to be technologically and operationally safe to gain these advantages [6,7]. They should appear to be driving safely to their passengers and other drivers on the road. This is especially important when automated and non-automated vehicles share the road, or during "mixed-traffic" conditions. Finally, the impact of autonomous vehicles on traffic safety should be demonstrably neutral or, ideally, positive.
International functional safety standards govern the process of certifying the safety of vehicular automation technology [8,9]. However, to the best of our knowledge, there is still no agreement on how to guarantee that autonomous vehicles (AVs) operate safely in traffic, even under ideal technological circumstances. Although a positive impact is expected, there is still no agreement on what "driving safely" entails or how AVs will affect traffic safety [10].
Machine and deep learning have been used for various automated vehicle-related applications, such as traffic object detection, sensor anomaly detection, and attack detection [11,12,13]. The notions of traffic safety and safe driving are linked yet separate [14]. Regarding traffic safety, national and local governments frequently worry about the serious injuries, fatalities, or property damage caused by interactions in traffic [15,16,17]. It is often assessed using statistical measurements, such as the number of fatalities per 100 million kilometers driven [18]. When poor traffic safety results in the loss of life, way of life, and economic productivity, there is a strong motivation to develop traffic safety policies and laws [19]. As a result, numerous advancements have been made in road infrastructure, driver information systems, electronic stability control, and other areas of vehicle technology. Determining whether new automotive innovations positively or negatively affect traffic safety is more complex. AV technology can be classified into advanced driver assistance systems (ADASs), comfort ADASs (such as LKA and ACC), and self-driving technologies. Since ADASs are made specifically to prevent collisions, the connection between ADAS and traffic safety is obvious (although proving their usefulness is another matter [18]). The effects of comfort ADASs and self-driving cars on traffic safety are less obvious and more indirect. Both technologies impact how drivers operate their vehicles and change how they behave in ways that other drivers might not expect [20,21], perhaps increasing the likelihood of collisions. To avoid these negative consequences, AVs should operate in ways that all road users perceive as predictable and safe, especially in mixed traffic (i.e., with low collision risk) [22]. This is what is meant when we talk about safe driving practices. The following are the major contributions of this paper:
  • An efficient modified Z-score-based autoencoder approach is proposed for detecting anomalous driving behaviors in moving traffic.
  • Experiments are performed on the benchmark Next Generation Simulation (NGSIM) vehicle trajectories and supporting datasets to discover anomalous driving behavior and to assess our proposed approach’s performance.
  • Results reveal that the proposed method detected 81 anomalous driving behaviors out of 1031 naturalistic driving behavior instances (7.86%) with an accuracy of 96.31%, without early stopping, whereas, with early stopping, our method successfully detected 147 anomalous driving behaviors (14.26%) with an accuracy of 95.25%. Results demonstrate that combining multiple parameters helps better and more efficiently categorize safe and unsafe driving behaviors.

2. Literature Review

Despite its importance, the field of safe driving behaviors has yet to receive due attention, even after years of research. Many aspects of this field still need to be explored.
According to studies [3,23,24], cities have become increasingly smart as technology has evolved. Smart mobility is a vital component of smart cities, with autonomous vehicles playing a key role [25]. However, vulnerabilities in autonomous vehicles can have severe consequences for human safety and quality of life. As a result, security researchers have been studying attacks and defenses for autonomous vehicles, although these had not previously been explored systematically. The authors of [23] conducted a survey analyzing 151 papers published from 2008 to 2019. They categorized attacks on autonomous vehicles into three categories: autonomous control systems, autonomous driving system components, and vehicle-to-everything communications. They also divided defenses against such attacks into three categories: security architecture, intrusion detection, and anomaly detection. Techniques for detecting anomalies using artificial intelligence and machine learning are gradually being developed as big data and communication technologies advance. Based on their systematic survey, the authors propose implications for future research on attacks against autonomous vehicles and corresponding countermeasures. They strongly recommend combining artificial intelligence with the major components of smart cities to create a comprehensive approach to protecting autonomous vehicles.
Another research work [26] demonstrates that as autonomous vehicles become increasingly popular, they are also becoming a target for a new generation of attackers who seek to exploit vulnerabilities in the technology for various reasons, including curiosity, financial gain, criminal activity, and state-sponsored attacks. One example of such exploitation may be employing self-driving vehicles as weapons in terrorist operations, such as driving into densely populated regions to cause mass casualties. To address this growing security risk, the survey [26] provides a detailed overview of the security approaches created to secure the many technologies that support autonomous cars, such as sensing, location, vision, and network technologies. By utilizing tailored machine learning models, these technologies have the potential to be further strengthened and made more secure. In addition, this survey also identifies future research opportunities in the field of autonomous vehicle security to continue developing more robust defense mechanisms against potential attacks. Overall, it is crucial to address the security vulnerabilities in autonomous vehicles to ensure the safety and well-being of individuals and communities that rely on this technology.
The authors of [27] present a preference system that offers route recommendations to drivers based on an understanding of their driving behavior. According to [28], the safety of roadway inspections by connected and automated vehicles (CAVs) is at risk due to their heavy reliance on sensor readings, information from other vehicles, and roadside units. To address these concerns, the authors developed an efficient approach that identifies inconsistencies in CAVs using Bayesian deep learning (BDL) with discrete wavelet transform (DWT), which enhances security and safety in CAVs. The DWT is designed primarily for smoothing CAV sensor readings and detecting and identifying aberrant sensor behavior or data points generated by malicious cyber assaults or defective vehicle sensors. The data are then fed to a BDL module. According to numerical experiments, the proposed method in [28] significantly improves the accuracy, sensitivity, precision, and F1-score evaluation metrics for detecting anomalies. On average, the proposed method in [28] demonstrates performance gains of 7.95%, 9%, 8.77%, and 7.33% compared to the convolutional neural network (CNN). The corresponding gains are 5%, 7.9%, 7.54%, and 4.1% when compared to BDL.
The study [29], which served as the foundation for this study, offers a novel definition of safe driving behavior for autonomous vehicles. The method is based on simulations of ordinary human driving behavior, known to produce interactions of medium to low severity. By modeling such behavior, autonomous vehicles can interact with other traffic participants predictably and safely [30]. Preliminary results indicate that the proposed approach effectively distinguishes between typical and anomalous driving behavior within the examined data set. In addition, in [31], the authors present a brand new observer-based method to improve connected and autonomous vehicle transportation security and safety. The strategy uses a car-following model and the adaptive extended Kalman filter (AEKF) to estimate a vehicle’s condition by utilizing its position, speed, and the condition of the surrounding traffic. Data from the leading vehicle are used to identify sensor anomalies using one-class support vector machine models that have already been trained on the subject vehicle.
A different study [32] also emphasizes the significance of connected and automated vehicles (CAVs) as a crucial infrastructure advancement for realizing a smart world. CAVs enable seamless and immediate data transfer. However, sensor-generated data are subject to anomalies caused by flaws, mistakes, and cyber attacks, which can result in mishaps and fatalities. The suggested method transforms data streams into vectors before processing them with WAVED to detect anomalies in CAVs. It makes use of the average predicted probability of several classifiers. In the study [12], a multi-stage attention-based long short-term memory convolutional neural network (MSALSTM-CNN) is used to detect anomalies, achieving gains in F-score of up to 3.24% for a variety of single anomaly types as well as mixed anomaly types.
The research paper [33] states that the emergence of connected and autonomous vehicles (CAVs) is expected to change the automotive market landscape. CAVs, like any other cyber–physical system, are prone to transmission faults, hardware damage, software failures, power instability, and cyber attacks. To address these issues, the author proposes a deep learning approach that uses hierarchical models and an LSTM autoencoder to extract signal features. This method accurately classifies each signal sequence in real time, allowing CAVs to operate safely and reliably despite external disruptions. Although the research discussed above touches on this domain, it covers it only partially, and many aspects remain unexplored.
The authors in [34] suggest that the future of intelligent transportation systems (ITSs) and connected and automated vehicles (CAVs) will involve a highly interconnected system. Access to appropriately anonymized CAV mobility data is necessary for traffic centers to enable the best decision making and supervision. Using vehicle locations and received signal strength indicators as features, the authors of this study present a novel unsupervised learning model based on a deep auto-encoder for identifying self-reported location anomalies in CAVs [34]. The use of autonomous vehicles has skyrocketed in recent years across the globe [35]. However, research in this area now uses adaptive machine learning methods rather than conventional statistical models. Existing machine learning models may not be immediately relevant in this context due to the complex nonlinear interaction between the spatial and temporal data collected from the environment throughout the cars’ adaptive decision-making process. To solve this difficulty, the study analyzes numerous factors for a relative evaluation of multiple deep learning models.
Threats to CAVs are another area of concern; since automated vehicles rely on automated software and hardware, they are highly vulnerable to cyber attacks, and any such incident can lead to severe consequences, such as road accidents. This literature review also covers six research papers on topics related to autonomous systems for precision agriculture, vehicle control, and estimation. Ref. [36] presents a novel algorithm called YOLOv5-tassel that can detect tassels in maize crops from RGB imagery acquired by UAVs. The suggested technique outperformed other well-known object detection approaches with a mean average precision (mAP) value of 44.7%. Ref. [37] provides an algorithm for estimating the sideslip angle of an autonomous vehicle based on consensus and vehicle kinematics/dynamics synthesis. The developed framework uses velocity-based and consensus Kalman state observers to estimate velocity, attitude, gyro bias, and heading errors. Ref. [38] proposes a method to autonomously estimate the yaw misalignment of inertial measurement units (IMUs) mounted on vehicles using onboard sensors. The proposed method uses a Kalman filter to estimate yaw misalignment and velocity error. Ref. [39] proposes a novel kinematic-model-based vehicle slip angle (VSA) estimation method that fuses information from a global navigation satellite system (GNSS) and an IMU. The proposed method uses a vehicle attitude angle observer and integration of reverse smoothing and grey prediction to estimate the VSA. Ref. [40] presents a model-based approach to estimate vehicle sideslip and roll angles based on lateral and longitudinal acceleration measurements. The proposed algorithm uses an extended Kalman filter (EKF) and a nonlinear tire model to estimate the slip and roll angles. Finally, [41] proposes an algorithm to detect and track people in a UAV-based search-and-rescue scenario. The proposed algorithm uses a deep learning model to detect people and a tracking algorithm based on the Hungarian algorithm to track the detected individuals.
Limitations of existing work: The related work mentioned above is the only content we could find on safe and unsafe driving behavior. This indicates that no conclusive work has been conducted in this area. However, all the works mentioned above have some limitations. The limitations of the three most relevant research works are summarized below in Table 1.

3. Proposed Approach

The autoencoder is an effective solution for both supervised and unsupervised anomaly detection. This section explains the foundation of our suggested method, including deep learning, autoencoders, and the Z-score method. Figure 1 illustrates the general flow and steps of our approach: raw data input, pre-processing, train/test split, auto-encoding, outlier detection, and testing. The input step ingests sensor data from the automated vehicle, while pre-processing includes data cleaning, filtering, and normalization. The autoencoder handles the remaining steps automatically; model training fits a machine learning model using the features chosen automatically by the autoencoder. Finally, the testing phase evaluates the performance of the trained model in detecting abnormal driving behavior in automated vehicles.

3.1. Dataset

The Next Generation Simulation (NGSIM) vehicle trajectories and supporting data were used in the study [42]. During the NGSIM initiative, researchers collected detailed vehicle trajectory data on southbound US 101 and Lankershim Boulevard in Los Angeles, CA, USA, eastbound I-80 in Emeryville, CA, USA, and Peachtree Street in Atlanta, GA, USA. A network of synchronized digital video cameras was used to collect these data. The NGSIM program’s NGVIDEO software application derived vehicle trajectory information from the video. The resulting vehicle trajectory data supplied precise location information for each car inside the research region every one-tenth of a second, allowing for comprehensive lane positions and other vehicle locations.

3.2. Feature Selection

The original dataset had 25 columns, of which the following 7 least important features were removed: O_Zone, D_Zone, Int_ID, Section_ID, Direction, Movement, and Location. The remaining 18 features were used for experimentation in this paper. They are as follows: Vehicle_ID, Frame_ID, Total_Frames, Global_Time, Local_X, Local_Y, Global_X, Global_Y, v_length, v_Width, v_Class, v_Vel, v_Acc, Lane_ID, Preceding, Following, Space_Headway, and Time_Headway.
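For illustration, the column selection described above can be performed with a few lines of pandas. This is a minimal sketch; the file name ngsim_trajectories.csv is only a placeholder for the NGSIM export actually used.

```python
import pandas as pd

# Placeholder file name; the NGSIM export used in the paper may be named differently.
df = pd.read_csv("ngsim_trajectories.csv")

# Drop the seven least informative columns, keeping the 18 features listed above.
dropped = ["O_Zone", "D_Zone", "Int_ID", "Section_ID", "Direction", "Movement", "Location"]
features = df.drop(columns=dropped, errors="ignore")
print(features.columns.tolist())  # expect the 18 remaining feature names
```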

3.3. Z-Score Method

To solve the aforementioned issue, this research provides a new technique for detecting safe driving behavior for automated vehicles in mixed traffic based on models of typical human driving behavior. The paper focuses on detecting unsafe driving behavior using a combination of indicators rather than waiting for an automated vehicle to encounter an unsafe situation. Unsafe driving behaviors appear as outliers in the data, and the primary goal is to detect those outliers. Conventional approaches for outlier detection depend on the number of observations. In the current scenario, the data are approximately normally distributed, making the Z-score method a suitable option for outlier detection. However, the challenge is that the mean value is susceptible to outliers, which can lead to wrong conclusions. The approach proposed in this paper therefore uses the same method with modifications: instead of the mean and standard deviation, the median and the median absolute deviation are used. The median is a robust statistic that is not affected by outliers. This approach is known as the robust Z-score method and uses the median absolute deviation (MAD). It is defined using the following quantities:
  • x̄, the median value of the sample;
  • MAD, the median absolute deviation, calculated as in Equation (1):

MAD = median(|x_i − x̄|)    (1)

The modified Z-score is then calculated as in Equation (2):

M_i = 0.6745 (x_i − x̄) / MAD    (2)
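The modified Z-score in Equations (1) and (2) can be computed directly with NumPy. The sketch below is illustrative only; the 3.5 cutoff used to flag outliers is a common rule of thumb and not a value stated in this paper.

```python
import numpy as np

def modified_z_scores(x: np.ndarray) -> np.ndarray:
    """Robust (modified) Z-scores based on the median absolute deviation (MAD)."""
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    mad = np.median(np.abs(x - median))   # Equation (1)
    return 0.6745 * (x - median) / mad    # Equation (2)

# Flag observations whose modified Z-score exceeds a chosen cutoff (3.5 is a common
# rule of thumb; the paper does not state the exact threshold used).
speeds = np.array([12.1, 12.4, 11.9, 12.2, 35.0, 12.0])
outliers = np.abs(modified_z_scores(speeds)) > 3.5
print(outliers)  # the 35.0 reading is flagged as an outlier
```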

3.4. Deep Learning Model

Autoencoders, a common technique for supervised and unsupervised anomaly detection tasks, are used in this paper. Autoencoders have been used for various tasks, including data compression, image denoising, image colorization, and dimensionality reduction. A neural network autoencoder learns a low-dimensional representation of the input data. It comprises an encoder, which learns to map input data to a low-dimensional representation, and a decoder, which learns to map this representation back to the original input data. The encoder can then use this information to train a useful “compression” function that converts input data into a lower-dimensional representation. The reconstruction error is used to train the model, which is the difference (mean squared error) between the original input and the decoder’s reconstructed output.
The deep learning model used in this experiment can be depicted as follows.
The autoencoder is a sequential model consisting of two parts, an encoder and a decoder, as shown in Figure 2. The encoder takes in the input data and reduces its dimensionality to a lower-dimensional representation, which is then passed to the decoder. The decoder takes the reduced-dimensional representation and reconstructs the original input data. The encoder consists of five fully connected (dense) layers, with the activation function set to “elu” for each layer. The number of neurons in the input layer matches the number of input features, while the last hidden layer has two neurons; the encoder therefore reduces the input data from its original dimensionality to two dimensions. The decoder consists of four fully connected (dense) layers, also with the “elu” activation function. Its first layer has four neurons, and its last layer has the same number of neurons as the input data, so the decoder takes the two-dimensional representation generated by the encoder and reconstructs the original input. The autoencoder model is compiled with the Adam optimizer and the mean squared error (MSE) loss function, and the accuracy metric is also computed during training. Table 2 presents the hyperparameters used for anomalous behavior detection.
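The architecture described above can be sketched in Keras as follows. Only the “elu” activations, the two-neuron bottleneck, the four-neuron first decoder layer, the input-sized outer layers, and the Adam/MSE configuration come from the text; the intermediate layer widths are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_autoencoder(input_dim: int = 18) -> tf.keras.Model:
    """Sequential autoencoder mirroring the architecture described in the text.
    The intermediate widths (14, 10, 6 and 10, 14) are illustrative assumptions; only
    the 2-neuron bottleneck, the 4-neuron first decoder layer, the 'elu' activation,
    and the Adam/MSE configuration are stated in the paper."""
    model = models.Sequential([
        # Encoder: five dense layers reducing the input to a 2-dimensional code.
        layers.Input(shape=(input_dim,)),
        layers.Dense(input_dim, activation="elu"),
        layers.Dense(14, activation="elu"),
        layers.Dense(10, activation="elu"),
        layers.Dense(6, activation="elu"),
        layers.Dense(2, activation="elu"),
        # Decoder: four dense layers reconstructing the original input.
        layers.Dense(4, activation="elu"),
        layers.Dense(10, activation="elu"),
        layers.Dense(14, activation="elu"),
        layers.Dense(input_dim, activation="elu"),
    ])
    model.compile(optimizer="adam", loss="mse", metrics=["accuracy"])
    return model

autoencoder = build_autoencoder()
autoencoder.summary()
```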
The model described in [32] serves as the foundation for the traffic interaction model in this study. It categorizes interactions based on severity and suggests that ordinary human driving behavior results in medium- to low-severity interactions. Current traffic statistics corroborate our assumption that such interactions have minimal collision probability [16]. As a result, if automated vehicles follow such usual driving behavior, they will engage with other traffic participants recognizably and predictably, minimizing the likelihood of collisions and driving safely. Driving interactions can be difficult to describe, but when alert, drivers can assess the safety of their surroundings and take appropriate action based on implicitly learned behaviors and worldviews [16]. This study proposes training deep learning (DL) algorithms, specifically autoencoders, on real driving data to capture such implicit driving patterns. Using anomaly detection principles, the method is specifically intended for common transverse and longitudinal driving interactions and can be used to assess whether an individual vehicle, whether automated or not, usually acts (safe) or not (unsafe) [33]. These models are computationally light.
We train an autoencoder on normal driving data and then use it to detect anomalous behavior in new driving data, as shown in Algorithm 1. The algorithm takes driving data X as input and splits it into training and testing sets. We train an autoencoder on the training data to learn how to reconstruct normal driving behavior. Once the autoencoder is trained, we calculate reconstruction errors on the testing data. To detect anomalous behavior, we calculate the mean and standard deviation of these reconstruction errors and then compute an anomaly score for each test sample based on how far its reconstruction error is from the mean. Samples with high anomaly scores are considered anomalous. This algorithm can detect anomalous driving behavior, such as sudden lane changes, abrupt braking, or speeding. By training an autoencoder to reconstruct normal driving behavior, the algorithm can identify anomalies that may indicate unsafe or reckless driving.
Algorithm 1 Anomalous driving behavior detection using deep learning
Input: Driving behavior data Xi, labels Yi
Output: Anomaly types
Evaluation metrics: Accuracy, loss
 1: x[Behavioral] ← Dataset
 2: Split X and Y
 3: Train autoencoder (training data)
 4: Compute reconstruction errors (testing data)
 5: Calculate the mean and standard deviation of the reconstruction errors
 6: S_i = |error_i − μ| / σ
 7: Return anomaly type
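Steps 4–7 of Algorithm 1 can be sketched as follows. The three-standard-deviation cutoff is an assumed threshold; the paper does not report the exact value used to declare a sample anomalous.

```python
import numpy as np

def anomaly_scores(autoencoder, x_test: np.ndarray, threshold: float = 3.0):
    """Score test samples by how far their reconstruction error deviates from the mean
    (steps 4-7 of Algorithm 1). The 3-sigma threshold is an assumption; the paper does
    not report the exact cutoff used."""
    reconstructed = autoencoder.predict(x_test, verbose=0)
    errors = np.mean(np.square(x_test - reconstructed), axis=1)  # per-sample MSE
    mu, sigma = errors.mean(), errors.std()
    scores = np.abs(errors - mu) / sigma                         # S_i = |error_i - mu| / sigma
    return errors, scores, scores > threshold

# Usage (x_test is the held-out split of the pre-processed NGSIM features):
# errors, scores, is_anomalous = anomaly_scores(autoencoder, x_test)
# print(f"{is_anomalous.sum()} anomalous behaviors out of {len(x_test)} instances")
```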
Hardware Requirements: The following are the minimum hardware requirements to carry out this experimentation:
  • Minimum Processor: Core i5 (5th Generation)
  • Minimum RAM: 8 GB
  • No external GPU is required
Software Requirements: The following are the software requirements to carry out this experimentation:
  • The concerned dataset;
  • Software tools for data preprocessing, such as cleaning and formatting the driving data for deep learning algorithms;
  • Software tools for implementing and training the deep learning model, such as TensorFlow or PyTorch.

4. Results and Discussion

We implemented the above-mentioned approach successfully. We started with 5034 instances of driving behaviors using both longitudinal and traversal parameters. In total, 4000 instances were used for training with an 80–20% train–test split ratio. The trained model was tested on the remaining 1034 instances. Throughout this research, we used two approaches, i.e., with an early-stopping mechanism and without an early-stopping mechanism. We did so to judge which approach gives better results. Our ultimate findings are discussed as follows.
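The two training regimes can be reproduced with a Keras EarlyStopping callback, using the hyperparameters from Table 2 (batch size 100, up to 50 epochs). The monitored quantity, the patience value, and the internal validation split are assumptions made for illustration only.

```python
from tensorflow.keras.callbacks import EarlyStopping

def train(autoencoder, x_train, use_early_stopping: bool):
    """Train the autoencoder with or without early stopping.
    The monitored metric, patience, and validation split are illustrative assumptions."""
    callbacks = []
    if use_early_stopping:
        callbacks.append(EarlyStopping(monitor="val_loss", patience=5,
                                       restore_best_weights=True))
    return autoencoder.fit(x_train, x_train,        # the autoencoder reconstructs its input
                           validation_split=0.2,    # assumed validation fraction
                           epochs=50, batch_size=100,
                           callbacks=callbacks, verbose=0)
```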

4.1. Standard (without Early Stopping)

Figure 3 shows the training loss versus the validation loss. At epoch#0, both the training and validation losses were significant, with the training loss exceeding the validation loss. However, as the number of epochs increased, we observed a decrease in the loss. At epoch#0, the training loss was approximately 0.07, while the validation loss was about 0.049. By epoch#8, the training and validation losses reached a minimum, nearly zero, and remained at that level from epoch#8 to epoch#50.
Similarly, Figure 4 depicts the training accuracy vs. the validation accuracy. In this case, the two curves followed a similar pattern across epochs. At epoch#0, both the validation and the training accuracy were at 0.885. At epoch#10, the training accuracy reached a maximum of 0.93, while the validation accuracy went up to 0.94. With an increasing number of epochs, both accuracy measures tracked each other, reaching a maximum of 98.34% between epoch#2 and epoch#13. From then onward, a constantly changing pattern is observed up to epoch#50.
The training loss as a function of the number of examples is illustrated in Figure 5. No significant training loss was observed as the number of samples increased. At 0 samples, the training loss was also 0, and it remained at 0 for approximately 2900 samples. However, for the first few samples above 0, there was a minor training loss of at most 0.057.
Similarly, the test loss with respect to the number of examples is depicted in Figure 6. As in the previous case, we do not observe any significant test loss as the number of samples increases. At 0 samples, the test loss is also 0, and this pattern is preserved up to around 980 samples. However, slightly above 0 samples, we observed a small test loss of at most 0.058.
As a result, our model successfully detected 81 anomalous driving behaviors out of a total of 1031 naturalistic driving behavior instances (7.86%), with a maximum accuracy of 96.31%. The final result is depicted in Figure 7, which shows the 1031 driving behavior instances in blue and the 81 detected anomalous driving behaviors in red.
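A plot in the style of Figure 7 (and later Figure 12) can be produced with matplotlib from the per-sample reconstruction errors and the boolean anomaly mask obtained in the scoring step sketched earlier. The snippet below is an illustrative sketch rather than the exact plotting code used for the figures.

```python
import matplotlib.pyplot as plt
import numpy as np

def plot_detections(errors: np.ndarray, is_anomalous: np.ndarray) -> None:
    """Scatter the reconstruction error of each test instance, highlighting
    detected anomalies in red and normal behavior in blue (Figure 7 style)."""
    idx = np.arange(len(errors))
    plt.scatter(idx[~is_anomalous], errors[~is_anomalous], s=8, c="blue", label="Normal")
    plt.scatter(idx[is_anomalous], errors[is_anomalous], s=12, c="red", label="Anomalous")
    plt.xlabel("Driving behavior instance")
    plt.ylabel("Reconstruction error")
    plt.legend()
    plt.show()
```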

4.2. Using Early-Stopping

Figure 8 depicts the training vs. the validation loss when using the early-stopping mechanism. Across all epochs, the validation loss is noticeably greater than the training loss. At the start (epoch#0), the training loss is 0.0025 and the validation loss is 0.0034. Both losses gradually decrease as the epoch number increases. Finally, the training loss approaches a minimum value of 0.0018, while the validation loss decreases below 0.0027.
Similarly, Figure 9 depicts the training vs. the validation accuracy. We observed a constantly changing pattern in this case, with the validation accuracy fluctuating more than the training accuracy. Notably, the training accuracy remains greater than the validation accuracy throughout. At epoch#0, the training accuracy is 0.952 and the validation accuracy is 0.943. After a series of fluctuations, the validation accuracy finally increases to 0.948 and the training accuracy to 0.953.
Figure 10 depicts the training loss with respect to the number of samples. A very consistent pattern is observed in this case: at 0 samples, the training loss is 0, and it does not change as the number of samples increases, remaining at 0 for up to 2900 samples.
Figure 11 shows the testing loss vs. the number of samples. We see a very similar pattern to the previous case: at 0 samples, the test loss is 0, and it remains at 0 with no discernible change as the number of samples increases, up to 900 samples.
Finally, using the early-stopping approach, our method successfully detected 147 anomalous driving behaviors (14.26%) with an accuracy of 95.25%. Figure 12 shows the final results: the 1031 naturalistic driving behavior instances are shown in blue, and the 147 detected anomalous driving behaviors are shown in red.
Summing up the experimentation, this research takes a step forward in categorizing autonomous vehicles’ safe and unsafe driving behaviors. The approach proposed in this paper offers a coherent path toward this objective, and the use of multiple indicators for detection is an additional strength. The research achieves 95.25% accuracy with the early-stopping approach and 96.31% accuracy without the early-stopping mechanism.

5. Conclusions and Future Work

To conclude, this paper provides an efficient technique for detecting outliers, especially when categorizing safe and unsafe driving behaviors according to naturalistic driving data. In our experience, autoencoders are an efficient way to detect such outliers. Unlike previous research, we used multiple indicators, which produced better results. When considering the driving behavior of automated vehicles, it must be borne in mind that a vehicle’s behavior is characterized not only by the longitudinal direction but also by the transverse dimension, i.e., left and right. Additionally, this research focused on the threat posed by an automated vehicle rather than threats to the automated vehicle. Using the early-stopping approach gives better results than omitting early stopping; it makes the model more efficient and faster to train. In this research, we successfully identified abnormal driving behaviors from thousands of driving instances. Shortfall of the current research: this work differentiates between safe and unsafe driving behaviors but does not categorize the detected unsafe behaviors. As future work, the classification of these unsafe behaviors remains to be carried out; they can be further divided into multiple categories, such as overspeeding, reckless driving, zig-zag driving, driving under the speed limit, etc. Additionally, the accuracy achieved in this experiment, i.e., 96.31%, could be further improved by testing techniques other than the autoencoder, and adding more parameters for judging driving behaviors would be beneficial. So far, we find the autoencoder to be the best method for this classification.

Author Contributions

Conceptualization, S.A. and M.O.M.; methodology, S.A., M.O.M., A.R.J. and S.-P.H.; software, M.O.M. and A.R.J.; validation, S.A., A.R.J. and S.-P.H.; formal analysis, S.A. and M.O.M.; investigation, M.O.M.; resources, A.R.J.; data curation, M.O.M. and S.-P.H.; writing—original draft preparation, S.A., M.O.M. and A.R.J.; writing—review and editing, S.A., M.O.M., A.R.J. and S.-P.H.; visualization, M.O.M., A.R.J., S.A. and S.-P.H.; supervision, A.R.J.; project administration, S.A. and M.O.M.; funding acquisition, S.-P.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by AI Advanced School, aSSIST University, Seoul, Korea.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Sajid, F.; Javed, A.R.; Basharat, A.; Kryvinska, N.; Afzal, A.; Rizwan, M. An efficient deep learning framework for distracted driver detection. IEEE Access 2021, 9, 169270–169280. [Google Scholar] [CrossRef]
  2. Min, H.; Fang, Y.; Wu, X.; Lei, X.; Chen, S.; Teixeira, R.; Zhu, B.; Zhao, X. A Fault Diagnosis Framework for Autonomous Vehicles with Sensor Self-Diagnosis. Expert Syst. Appl. 2023, 120002. [Google Scholar] [CrossRef]
  3. Javed, A.R.; Shahzad, F.; ur Rehman, S.; Zikria, Y.B.; Razzak, I.; Jalil, Z.; Xu, G. Future smart cities requirements, emerging technologies, applications, challenges, and future aspects. Cities 2022, 129, 103794. [Google Scholar] [CrossRef]
  4. Xu, J.; Park, S.H.; Zhang, X.; Hu, J. The improvement of road driving safety guided by visual inattentional blindness. IEEE Trans. Intell. Transp. Syst. 2021, 23, 4972–4981. [Google Scholar] [CrossRef]
  5. Xiao, Z.; Fang, H.; Jiang, H.; Bai, J.; Havyarimana, V.; Chen, H.; Jiao, L. Understanding private car aggregation effect via spatio-temporal analysis of trajectory data. IEEE Trans. Cybern. 2023, 53, 2346–2357. [Google Scholar] [CrossRef] [PubMed]
  6. Javed, A.R.; Ahmed, W.; Pandya, S.; Maddikunta, P.K.R.; Alazab, M.; Gadekallu, T.R. A Survey of Explainable Artificial Intelligence for Smart Cities. Electronics 2023, 12, 1020. [Google Scholar] [CrossRef]
  7. Javed, A.R.; Hassan, M.A.; Shahzad, F.; Ahmed, W.; Singh, S.; Baker, T.; Gadekallu, T.R. Integration of blockchain technology and federated learning in vehicular (iot) networks: A comprehensive survey. Sensors 2022, 22, 4394. [Google Scholar] [CrossRef] [PubMed]
  8. Beshah, T.; Ejigu, D.; Abraham, A.; Snasel, V.; Kromer, P. Pattern recognition and knowledge discovery from road traffic accident data in Ethiopia: Implications for improving road safety. In Proceedings of the 2011 World Congress on Information and Communication Technologies, Mumbai, India, 11–14 December 2011; pp. 1241–1246. [Google Scholar]
  9. Hadi, A.S.; Imon, A.R.; Werner, M. Detection of outliers. Wiley Interdiscip. Rev. Comput. Stat. 2009, 1, 57–70. [Google Scholar] [CrossRef]
  10. Rodrigues, J. Outliers Make Us Go MAD: Univariate Outlier Detection. 2018. Available online: https://medium.com/@joaopedroferrazrodrigues/outliers-make-us-go-mad-univariate-outlier-detection-3a72f1ea8c7 (accessed on 26 January 2023).
  11. Zhang, H.; Luo, G.; Li, J.; Wang, F.Y. C2FDA: Coarse-to-fine domain adaptation for traffic object detection. IEEE Trans. Intell. Transp. Syst. 2021, 23, 12633–12647. [Google Scholar] [CrossRef]
  12. Javed, A.R.; Usman, M.; Rehman, S.U.; Khan, M.U.; Haghighi, M.S. Anomaly detection in automated vehicles using multistage attention-based convolutional neural network. IEEE Trans. Intell. Transp. Syst. 2020, 22, 4291–4300. [Google Scholar] [CrossRef]
  13. Wang, H.; Gao, Q.; Li, H.; Wang, H.; Yan, L.; Liu, G. A structural evolution-based anomaly detection method for generalized evolving social networks. Comput. J. 2022, 65, 1189–1199. [Google Scholar] [CrossRef]
  14. Wang, F.; Wang, H.; Zhou, X.; Fu, R. A Driving Fatigue Feature Detection Method Based on Multifractal Theory. IEEE Sens. J. 2022, 22, 19046–19059. [Google Scholar] [CrossRef]
  15. Xu, J.; Zhang, X.; Park, S.H.; Guo, K. The alleviation of perceptual blindness during driving in urban areas guided by saccades recommendation. IEEE Trans. Intell. Transp. Syst. 2022, 23, 16386–16396. [Google Scholar] [CrossRef]
  16. Tejada, A.; Legius, M.J. Towards a quantitative “safety” metric for autonomous vehicles. In Proceedings of the 26th International Technical Conference on the Enhanced Safety of Vehicles (ESV), Eindhoven, The Netherlands, 10–13 June 2019. [Google Scholar]
  17. Xu, J.; Guo, K.; Sun, P.Z. Driving performance under violations of traffic rules: Novice vs. experienced drivers. IEEE Trans. Intell. Veh. 2022, 7, 908–917. [Google Scholar] [CrossRef]
  18. Shinar, D. Traffic Safety and Human Behavior; Emerald Group Publishing: Bingley, UK, 2017. [Google Scholar]
  19. Belin, M.; Johansson, R.; Lindberg, J.; Tingvall, C. The Vision Zero and its consequences. In Proceedings of the 4th International Conference on Safety and the Environment in the 21st Century, Vienna, Austria, 23–27 November 1997; pp. 23–27. [Google Scholar]
  20. Yao, Z.; Yoon, H.S. Control strategy for hybrid electric vehicle based on online driving pattern classification. SAE Int. J. Altern. Powertrains 2019, 8, 91–102. [Google Scholar] [CrossRef]
  21. Schwarting, W.; Pierson, A.; Alonso-Mora, J.; Karaman, S.; Rus, D. Social behavior for autonomous vehicles. Proc. Natl. Acad. Sci. USA 2019, 116, 24972–24978. [Google Scholar] [CrossRef] [PubMed]
  22. Juhlin, O. Traffic behaviour as social interaction-implications for the design of artificial drivers. In Proceedings of the 6th World Congress on Intelligent Transport Systems (ITS), Toronto, ON, Canada, 8–12 November 1999. [Google Scholar]
  23. Kim, K.; Kim, J.S.; Jeong, S.; Park, J.H.; Kim, H.K. Cybersecurity for autonomous vehicles: Review of attacks and defense. Comput. Secur. 2021, 103, 102150. [Google Scholar] [CrossRef]
  24. Xu, J.; Pan, S.; Sun, P.Z.; Park, S.H.; Guo, K. Human-Factors-in-Driving-Loop: Driver Identification and Verification via a Deep Learning Approach using Psychological Behavioral Data. IEEE Trans. Intell. Transp. Syst. 2022, 24, 3383–3394. [Google Scholar] [CrossRef]
  25. Xiao, Z.; Shu, J.; Jiang, H.; Min, G.; Chen, H.; Han, Z. Perception Task Offloading with Collaborative Computation for Autonomous Driving. IEEE J. Sel. Areas Commun. 2023, 41, 457–473. [Google Scholar] [CrossRef]
  26. De La Torre, G.; Rad, P.; Choo, K.K.R. Driverless vehicle security: Challenges and future research opportunities. Future Gener. Comput. Syst. 2020, 108, 1092–1111. [Google Scholar] [CrossRef]
  27. Chen, P.; Wu, J.; Li, N. A Personalized Navigation Route Recommendation Strategy Based on Differential Perceptron Tracking User’s Driving Preference. Comput. Intell. Neurosci. 2023, 2023, 8978398. [Google Scholar] [CrossRef]
  28. Eziama, E.; Awin, F.; Ahmed, S.; Marina Santos-Jaimes, L.; Pelumi, A.; Corral-De-Witt, D. Detection and identification of malicious cyber-attacks in connected and automated vehicles’ real-time sensors. Appl. Sci. 2020, 10, 7833. [Google Scholar] [CrossRef]
  29. Tejada, A.; Manders, J.; Snijders, R.; Paardekooper, J.P.; de Hair-Buijssen, S. Towards a Characterization of Safe Driving Behavior for Automated Vehicles Based on Models of “Typical” Human Driving Behavior. In Proceedings of the 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), Rhodes, Greece, 20–23 September 2020; pp. 1–6. [Google Scholar]
  30. Cao, B.; Zhang, W.; Wang, X.; Zhao, J.; Gu, Y.; Zhang, Y. A memetic algorithm based on two_Arch2 for multi-depot heterogeneous-vehicle capacitated arc routing problem. Swarm Evol. Comput. 2021, 63, 100864. [Google Scholar] [CrossRef]
  31. Wang, Y.; Masoud, N.; Khojandi, A. Real-time sensor anomaly detection and recovery in connected automated vehicle sensors. IEEE Trans. Intell. Transp. Syst. 2020, 22, 1411–1421. [Google Scholar] [CrossRef]
  32. Svensson, Å.; Hydén, C. Estimating the severity of safety related behaviour. Accid. Anal. Prev. 2006, 38, 379–385. [Google Scholar] [CrossRef] [PubMed]
  33. Zhou, C.; Paffenroth, R.C. Anomaly detection with robust deep autoencoders. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 665–674. [Google Scholar]
  34. Wang, X.; Mavromatis, I.; Tassi, A.; Santos-Rodriguez, R.; Piechocki, R.J. Location anomalies detection for connected and autonomous vehicles. In Proceedings of the 2019 IEEE 2nd Connected and Automated Vehicles Symposium (CAVS), Honolulu, HI, USA, 22–23 September 2019; pp. 1–5. [Google Scholar]
  35. Miglani, A.; Kumar, N. Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges. Veh. Commun. 2019, 20, 100184. [Google Scholar] [CrossRef]
  36. Liu, W.; Quijano, K.; Crawford, M.M. YOLOv5-Tassel: Detecting tassels in RGB UAV imagery with improved YOLOv5 based on transfer learning. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 2022, 15, 8085–8094. [Google Scholar] [CrossRef]
  37. Xia, X.; Hashemi, E.; Xiong, L.; Khajepour, A. Autonomous Vehicle Kinematics and Dynamics Synthesis for Sideslip Angle Estimation Based on Consensus Kalman Filter. IEEE Trans. Control. Syst. Technol. 2022, 31, 179–192. [Google Scholar] [CrossRef]
  38. Xia, X.; Xiong, L.; Huang, Y.; Lu, Y.; Gao, L.; Xu, N.; Yu, Z. Estimation on IMU yaw misalignment by fusing information of automotive onboard sensors. Mech. Syst. Signal Process. 2022, 162, 107993. [Google Scholar] [CrossRef]
  39. Liu, W.; Xia, X.; Xiong, L.; Lu, Y.; Gao, L.; Yu, Z. Automated vehicle sideslip angle estimation considering signal measurement characteristic. IEEE Sens. J. 2021, 21, 21675–21687. [Google Scholar] [CrossRef]
  40. Xiong, L.; Xia, X.; Lu, Y.; Liu, W.; Gao, L.; Song, S.; Yu, Z. IMU-based automated vehicle body sideslip angle and attitude estimation aided by GNSS using parallel adaptive Kalman filters. IEEE Trans. Veh. Technol. 2020, 69, 10668–10680. [Google Scholar] [CrossRef]
  41. Gao, L.; Xiong, L.; Xia, X.; Lu, Y.; Yu, Z.; Khajepour, A. Improved vehicle localization using on-board sensors and vehicle lateral velocity. IEEE Sens. J. 2022, 22, 6818–6831. [Google Scholar] [CrossRef]
  42. U.S. Department of Transportation. Next Generation Simulation (NGSIM) Vehicle Trajectories and Supporting Data; US Department of Transportation: Washington, DC, USA, 2018.
Figure 1. Experimental methodology for detecting abnormal driving behavior in automated vehicles.
Figure 2. Network structure for the deep learning model.
Figure 3. Training vs. validation loss without early-stopping.
Figure 4. Training and validation accuracy without early-stopping.
Figure 5. Train loss vs. no. of examples without early-stopping.
Figure 6. Test loss vs. no. of examples without early-stopping.
Figure 7. Normal vs. abnormal driving behavior detected without early-stopping.
Figure 8. Training vs. validation loss with early-stopping.
Figure 9. Training vs. validation accuracy with early-stopping.
Figure 10. Training loss vs. no. of samples with early-stopping.
Figure 11. Testing loss vs. no. of samples with early-stopping.
Figure 12. Normal vs. abnormal driving behavior detected with early-stopping.
Table 1. Review of previous works.

Ref. | Focus | Proposed Approach | Dataset | Results | Limitations
[28] | Identification of malicious cyber attacks | Bayesian deep learning (BDL) combined with the discrete wavelet transform (DWT) | VeReMi (vehicular reference misbehavior) | Performance gain compared to CNN | Poor performance at low network density
[29] | Characterization of autonomous cars’ safe driving behavior | DL model (autoencoders) | Longitudinal, naturalistic driving data (from NGSIM) | Distinguishes between normal (safe) and abnormal (unsafe) driving styles | Only outliers and longitudinal interactions between two vehicles were considered
[31] | Real-time sensor anomaly detection | One-class support vector machine models | Self-generated | Better anomaly detection and time-delay factor PoC | Only single-vehicle sensor data are used
Table 2. Hyperparameters used in this experiment.

Hyperparameter | Value
Batch size | 100
Epochs | 50
Activation function | elu
Optimizer | Adam
Loss function | MSE
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

