Review

Factors, Prediction, and Explainability of Vehicle Accident Risk Due to Driving Behavior through Machine Learning: A Systematic Literature Review, 2013–2023

by Javier Lacherre 1,*, José Luis Castillo-Sequera 2 and David Mauricio 1

1 Faculty of Systems Engineering and Informatics, National University of San Marcos, Lima 15081, Peru
2 Department of Computer Sciences, Polytechnic School, University of Alcala, 28871 Alcala de Henares, Spain
* Author to whom correspondence should be addressed.
Computation 2024, 12(7), 131; https://doi.org/10.3390/computation12070131
Submission received: 25 April 2024 / Revised: 20 June 2024 / Accepted: 26 June 2024 / Published: 28 June 2024
(This article belongs to the Section Computational Engineering)

Abstract

Road accidents are on the rise worldwide, causing 1.35 million deaths per year, thus encouraging the search for solutions. The promising proposal of autonomous vehicles stands out in this regard, although fully automated driving is still far from being an achievable reality. Therefore, efforts have focused on predicting and explaining the risk of accidents using real-time telematics data. This study aims to analyze the factors, machine learning algorithms, and explainability methods most used to assess the risk of vehicle accidents based on driving behavior. A systematic review of the literature produced between 2013 and July 2023 on factors, prediction algorithms, and explainability methods to predict the risk of traffic accidents was carried out. Factors were categorized into five domains, and the most commonly used predictive algorithms and explainability methods were determined. We selected 80 articles from journals indexed in the Web of Science and Scopus databases, identifying 115 factors within the domains of environment, traffic, vehicle, driver, and management, with speed and acceleration being the most extensively examined. Regarding machine learning advancements in accident risk prediction, we identified 22 base algorithms, with convolutional neural network and gradient boosting being the most commonly used. For explainability, we discovered six methods, with random forest being the predominant choice, particularly for feature importance analysis. This study categorizes the factors affecting road accident risk, presents key prediction algorithms, and outlines methods to explain the risk assessment based on driving behavior, taking vehicle weight into consideration.

1. Introduction

There are around 1.35 million deaths worldwide per year due to vehicle accidents [1]; in Europe, 60% of such deaths occur on two-lane roads [2]. In this regard, the United Nations Organization has proposed 17 sustainable development goals (SDGs) for the year 2030, where SDG-3, “Good health and well-being”, aims to reduce deaths and injuries resulting from traffic incidents by 50% worldwide [3]. One potential option is the implementation of autonomous vehicles. Nevertheless, complete automation in driving is still a considerable distance away, making it unlikely in the foreseeable future [4]; furthermore, extensive research is still needed, especially in the field of prediction.
Since 1975, research has focused on predicting vehicle accident risk (VAR). Chipman and Morgan [5] studied various factors such as demerit points, age, gender, license class, and accident history. Their findings highlighted demerit points as the key factor influencing future accident risk, offering a chance to prevent accidents when linked with driving behavior (DB). Extensive research over time has led to modifications and regulations in the environmental, vehicular, traffic, driver, and management domains, such as the use of deceleration devices and central protection barriers on roads for risk mitigation [2]. Additionally, mechanisms have been implemented for collision prevention, pedestrian identification, lane change alerts, and detection of driver distraction and drowsiness with feedback to the driver, among other capabilities [6]. These advancements prompted governments to implement safety manuals, such as the Highway Safety Manual, which includes a widely used VAR prediction model [7,8]; however, this model does not consider DB in its statistical analysis [9].
This article focuses on DB and its impact on the incidence of traffic accidents. DB refers to the actions and responses of a driver during various driving scenarios, encompassing the journey from an initial point to a final destination, taking into account factors such as travel time [10]. DB can be categorized into distinct groups with similar patterns, facilitating the estimation of driving risk levels [11]. These groups include normal, drowsy, and aggressive behaviors [12].
In the VAR context, DB holds utmost significance as it accounts for the highest incidence of traffic accidents—surpassing 70% of accidents in certain countries, such as Peru [13]. Therefore, the vehicle accident risk due to driving behavior (DBVAR) refers to the probability of a traffic accident occurring due to actions taken by drivers behind the wheel, which can increase the chances of suffering an accident and endanger road safety. Identifying this risk is fundamental to protecting human lives, promoting road safety, reducing the costs associated with traffic accidents, and developing effective safety policies.
Research has been conducted to determine factors for predicting traffic incidents using machine learning (ML) methods. Xu et al. [14] found that there was a strong correlation between aggressive DB and aspects of the driver, vehicle, and environment. In a similar vein, Li et al. [15] included the environment, vehicle, driver, and traffic. Likewise, Niu et al. [16] and Yang et al. [17] included the management domain. It is important to study each of these factors, not only to find a better model but also to mitigate the risk of accidents and their consequences, in addition to improving road safety [18].
Regarding prediction, different artificial intelligence algorithms have been used to predict the risk of vehicle accidents. Geng et al. [2] presented an extensive modeling framework for evaluating truck safety on two-lane rural roads using extreme gradient boosting (XGBoost), achieving an accuracy of 96.67%. Peng et al. [19] noted that long short-term memory (LSTM) is suitable for extracting significant and continuous information from vehicles, such as accelerations and decelerations; they applied it to DBVAR prediction and achieved an accuracy of 93.5%.
On the other hand, various algorithms have also been used for DBVAR explainability. Masello et al. [20] applied Shapley additive explanations (SHAP) and found that the speed limit was a very relevant factor for the riskiest events. Similarly, Al-refai et al. [21], using the random forest (RF) feature importance method, found that the most significant predictors for DBVAR were the mean speed of the vehicle, the vehicle’s instantaneous speed, and its longitudinal acceleration.
The amount of research on DBVAR has motivated various researchers to perform state-of-the-art studies. In the study by Bouhsissin et al. [22], 93 articles were reviewed between 2015 and 2022, from which it was highlighted that ML algorithms occupied the predominant position with 60%, followed by deep learning (DL) algorithms and statistical methods (with 34.87% and 5.15%, respectively). The most-used algorithms were support vector machine (SVM), logistic regression (LR), LSTM, artificial neural network (ANN), k-nearest neighbors (KNN), RF, and convolutional neural network (CNN). In parallel, 39 relevant factors were identified in this area. In the study by Paredes et al. [23], 27 articles were analyzed between 2015 and 2020, finding 17 ML algorithms in which Bayesian algorithms and decision trees mainly stood out. In addition, 21 relevant factors were identified in this context, coinciding with the results of Bouhsissin et al. [22], where the most used were acceleration, deceleration, and speed. Likewise, in the research of Elassad et al. [24], 82 articles from the period 2009–2019 were reviewed, and the factors and prediction aspects were analyzed. A total of 14 general factors grouped into the dimensions of driver, vehicle, and environment were identified, and it was found that SVM, neural network (NN), Bayesian learners (BL), and ensemble learners (EL) were the four most used algorithms, present in 72% of the selected studies. On the other hand, in the study by Silva et al. [25], the prediction and explainability aspects were studied in relation to the frequency of accidents and severity classification, based on 26 articles from the period 2003–2020, and it was found that the main techniques were KNN and decision tree (DT); however, ANN was found to be the most suitable for predicting accident frequency. 
Furthermore, they highlighted the road environment, human behaviors, accident characteristics, and vehicle-related elements as the main contributors to the elucidation of accident causes.
Studies in the field have revealed that a wealth of knowledge exists that needs to be inventoried, analyzed, and classified. However, in the context of ML, there is a tendency to use algorithms that evaluate risk based on accident frequency and DBVAR, without differentiating between light and heavy vehicles or associated factors related to vehicle trip management. These factors include the estimated delay time to the destination or whether a heavy vehicle is loaded or empty. Furthermore, current approaches focus on contributing factors that explain the frequency or severity of accidents but do not identify the factors contributing to DBVAR. This gap is crucial as regulations increasingly mandate the incorporation of mechanisms for reading trajectory and security data. Through analyzing these data, conducting prediction in real time, and explaining the causes, we can significantly mitigate the number of accidents.
This study aims to systematically review all the important developed aspects related to the factors, prediction, and explainability of DBVAR based on ML and aims to answer the following research question: Which factors, ML advances for prediction, and explainability methods have been investigated in relation to DBVAR?
The main contributions of this article are as follows:
  • Providing a comprehensive catalog of traffic accident risk factors, classified into five dimensions;
  • Identifying the various prediction algorithms, data sets used, and performance metrics employed in the analysis;
  • Compiling the various studies utilizing multiple methods to explain factors contributing to DBVAR;
  • Providing the reader with a wide range of bibliographic references that they can utilize to delve deeper into understanding the models based on ML that facilitate prediction and explanation of DBVAR.
This article is organized into five sections, as follows. In Section 2, the methodology followed for the systematic review of the literature is presented. Section 3 presents the results, focused on answering the research questions, the discussion of which is presented in Section 4. Finally, the conclusions follow in Section 5.

2. Methodology

For this article, a systematic review of the literature was carried out based on the model applied by Silva et al. [25] and Shiguihara et al. [26] to ensure scientific rigor, which consisted of the following phases:
  • Planning: Define the research questions to be addressed, establish the sequence of steps to be carried out to search, and identify primary studies in indexed databases, also including the inclusion/exclusion criteria used for the selection of articles.
  • Development: The selection of primary studies is carried out in accordance with planning, following which the quality is evaluated and the data are extracted and synthesized.
  • Results: Statistics on publications are shown, and the research questions are answered in Section 2.3 and Section 3, respectively.

2.1. Planning

Three research questions were proposed in order to determine the aspects developed to understand the factors, prediction, and explainability of the DBVAR:
  • RQ1: What are the factors considered in predicting DBVAR?
  • RQ2: What are the advances of ML in DBVAR prediction?
  • RQ3: What advances in explainable artificial intelligence (XAI) exist for DBVAR prediction?
To address the research questions, we conducted a review of primary publications in journals indexed in the Scopus and Web of Science (WoS) databases, using the following search string:
(“vehicle accident risk” OR “car accident risk” OR “car following” OR “driving behavior” OR “driving style” OR “driver behavior” OR “driving risk” OR “driver risk” OR “road safety”) AND ((factors OR features OR causes) OR (predicti* OR forecast* OR progno*) OR (explainability OR explainable OR interpretabl* OR xai)) AND (“machine learning” OR “deep learning” OR lstm).
As shown in Table 1, the string was applied in “title-abs-key” format for Scopus and “topic” format for WoS, considering the period from January 2013 to July 2023. Additionally, the search was limited to publications in journals with a SCImago Journal Rank (SJR) indicator. Finally, the inclusion and exclusion criteria established in Table 2 were applied.
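The search string above can be assembled programmatically, which makes it easier to reproduce the query in both databases. The following is a minimal sketch, not the authors' tooling: the helper names are hypothetical, the term lists are copied from the string above, and the year filters are an assumption about how the period could be expressed. `TITLE-ABS-KEY` and `PUBYEAR` are Scopus advanced-search field codes, and `TS` and `PY` are WoS field tags. A year-level filter cannot express the July 2023 cut-off, which would have to be applied separately.

```python
# Term lists copied from the search string in Section 2.1.
TOPIC_TERMS = [
    "vehicle accident risk", "car accident risk", "car following",
    "driving behavior", "driving style", "driver behavior",
    "driving risk", "driver risk", "road safety",
]
ASPECT_GROUPS = [
    ["factors", "features", "causes"],
    ["predicti*", "forecast*", "progno*"],
    ["explainability", "explainable", "interpretabl*", "xai"],
]
METHOD_TERMS = ["machine learning", "deep learning", "lstm"]


def quote(term: str) -> str:
    """Quote multi-word phrases; leave single tokens and wildcards bare."""
    return f'"{term}"' if " " in term else term


def any_of(terms: list[str]) -> str:
    """Join terms into one parenthesised OR group."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"


def build_core_query() -> str:
    """Topic AND (factors OR prediction OR explainability) AND method."""
    aspects = "(" + " OR ".join(any_of(g) for g in ASPECT_GROUPS) + ")"
    return " AND ".join([any_of(TOPIC_TERMS), aspects, any_of(METHOD_TERMS)])


def scopus_query(start_year: int = 2013, end_year: int = 2023) -> str:
    # Assumed year filter: strict inequalities bracket the inclusive range.
    return (f"TITLE-ABS-KEY({build_core_query()}) "
            f"AND PUBYEAR > {start_year - 1} AND PUBYEAR < {end_year + 1}")


def wos_query(start_year: int = 2013, end_year: int = 2023) -> str:
    # Assumed year filter using the WoS publication-year tag.
    return f"TS=({build_core_query()}) AND PY=({start_year}-{end_year})"


if __name__ == "__main__":
    print(scopus_query())
    print(wos_query())
```

Keeping the three term groups as data rather than as one long literal makes it straightforward to audit which synonyms were searched and to rerun the query with an extended period.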

2.2. Development

The possible original investigations found during the search were subjected to a selection procedure based on the criteria detailed in Table 2, covering both inclusion and exclusion criteria. To achieve this, it was necessary to carry out a prior review of the content, in order to determine its relevance for the present study and find those studies related to the factors, prediction, or explainability of DBVAR using ML. Most of the works were discarded as they corresponded to unrelated topics such as driver identification, energy consumption, autonomous vehicles, vehicles with fewer than four wheels, racing cars, pollution, level of accident severity, traffic study, or time and cost optimization. Figure 1 explains the applied process and identifies the activities carried out to select or reject studies.

2.3. Results

Potentially eligible studies and selected studies
The systematic review search conducted in Scopus and WoS resulted in 1674 articles, of which 80 were selected (see Table 3).
Trend of studies by year
The number of publications in the aspects of factors, prediction, or explainability of DBVAR showed exponential growth both in potential articles (see Figure 2a) and in selected articles (see Figure 2b). This could be explained by the increasing number of traffic accidents and the introduction of ML technologies for accident prediction and explainability.
Study trends across different countries
Figure 3 illustrates the distribution of studies based on the authors’ country of affiliation, with China and the United States together accounting for 45% of the total.
Articles selected by journal quality factor
Regarding the journal quality factor, 60% (48) of the articles were categorized in quartile Q1 and 35% (28) in quartile Q2, indicating that 95% of the articles fell within the top two quartiles (see Figure 4). This highlights the quality of the studies.
Articles selected by journal
Figure 5 illustrates that the two most prominent journals—Accident Analysis and Prevention and IEEE Access—were situated in the Q1 quartile and collectively accounted for 25% of the publications. Notably, there were 27 other journals categorized under “Others”, each contributing a single article.

3. Results

This section addresses the research questions posed in Section 2.1 based on the selected studies.
A. RQ1: What are the factors considered in predicting DBVAR?
DB encompasses a driver’s actions, awareness, and adherence to road regulations. Various factors can directly shape or change this behavior, and understanding them helps to improve safety standards [28]. In this context, 115 factors were found in 48 studies. They were classified using three of the four categories of Silva et al. [25], with the following changes: traffic-related factors were separated from the environment category, a management category was added, and the accident category (characteristics of the type of accident that occurred) was excluded because it describes an outcome rather than a risk and therefore does not correspond to DBVAR. The resulting categories were as follows:
(1) Environment: environment and geographical distribution.
(2) Traffic: related to the vehicles surrounding the one being studied.
(3) Vehicle: static or moving mode features.
(4) Driver: related to the human who drives the vehicle.
(5) Management: efficient control and coordination of the vehicle fleet and drivers.
Environment factors: A total of 20 factors were found from 23 articles, where the weather was the most used (in 9), followed by date–time and slope (in 8 and 5, respectively; see Table 4).
Traffic factors: A total of 17 were identified, where the most studied were the distance between two vehicles, the time to collision, and the traffic density, in 13, 10, and 9 studies out of 25, respectively (see Table 5).
Vehicle factors: A total of 44 factors were identified in 39 articles, where the most used were speed, acceleration, and steering angle, in 27, 23, and 9 studies, respectively (see Table 6).
Driver factors: A total of 25 were identified, where the most used were heart rate and eye, in 4 studies each, representing 39% of the 18 studies (see Table 7).
Management factors: A total of nine were identified in two studies (see Table 8).
On the other hand, the DBVAR prediction studies considered four variables of interest that described the accident, which were presented in two studies (see Table 9) and were used as a prediction object.
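The per-domain counts reported above can be tallied to confirm that they add up to the 115 factors stated for RQ1. The following is a small sketch: the counts come from this section (Tables 4–8), while the percentage shares are derived here for illustration and are not reported in the source.

```python
# Number of factors per domain, as reported in this section (Tables 4-8).
FACTORS_PER_DOMAIN = {
    "environment": 20,
    "traffic": 17,
    "vehicle": 44,
    "driver": 25,
    "management": 9,
}


def domain_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each domain's share of all factors, in percent (one decimal)."""
    total = sum(counts.values())
    return {domain: round(100 * n / total, 1) for domain, n in counts.items()}


total_factors = sum(FACTORS_PER_DOMAIN.values())
shares = domain_shares(FACTORS_PER_DOMAIN)
most_studied = max(FACTORS_PER_DOMAIN, key=FACTORS_PER_DOMAIN.get)

if __name__ == "__main__":
    print(f"total factors: {total_factors}")
    for domain, share in shares.items():
        print(f"{domain:12s} {share:5.1f}%")
    print(f"largest domain: {most_studied}")
```

The tally matches the reported total and shows the vehicle domain contributing the largest number of factors.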
B. RQ2: What are the advances of ML in DBVAR prediction?
Prediction based on statistical or ML methods makes it possible to anticipate probable future outcomes, such as DB or traffic accidents, from observed events [30]. These models take the factors as input to make predictions; however, once the result is obtained, the reasoning behind the decision is not visible, and it is not possible to determine which of the factors contributed most to the result [65]. For this reason, they are called “closed box” techniques, and additional explainability techniques are needed to understand them fully.
To address this question, we examined 76 studies, identifying 22 core algorithms. Among these, CNN and GB emerged as the primary choices, which were featured in 19 and 15 studies, respectively. These algorithms were employed either independently or in hybrid models, resulting in the highest accuracy with XGBoost at 100%, as detailed in Table 10. Additionally, Table 11 reveals that out of all the conducted studies, four were centered on heavy vehicles, while two focused on rural roads. Moreover, Table 12 underscores that the primary aspect under scrutiny in the analysis of DB was driving style, comprising 45 studies.
C. RQ3: What advances in XAI exist for DBVAR prediction?
XAI allows for adequate interpretation of the prediction process [17], for which models are used to analyze the importance and dependence of the factors that contribute to explaining the result [40]; in this way, confidence and transparency in the predictions can be ensured, such that they can reasonably be applied in the field of transportation safety [14].
To answer this question, 18 studies were found that used six methods to explain the factors with the greatest contribution to DBVAR. RF and GB feature importance were the most used (in 50% of these studies), followed by SHAP (in 33%). The studies mainly focused on explaining DB in accident risk and were mostly applied in China and the United States (see Table 13).
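SHAP, mentioned above, attributes a prediction to individual factors using Shapley values. The sketch below computes exact Shapley values by enumerating all coalitions of features, which is only feasible for a handful of factors; it is not the SHAP library and not the method of any reviewed study. The risk function, the feature names, and the choice to replace absent features with a fixed baseline value are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one instance `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value
    (an illustrative simplification).
    """
    names = list(x)
    n = len(names)

    def value(coalition):
        inputs = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(inputs)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        contribution = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                contribution += weight * (value(members | {name}) - value(members))
        phi[name] = contribution
    return phi


def toy_risk(f):
    """Hypothetical risk score with a speed-acceleration interaction."""
    speed, accel = f["speed_kmh"], abs(f["accel_ms2"])
    return 0.02 * speed + 0.5 * accel + 0.01 * speed * accel


# Hypothetical reference trip and trip to explain.
baseline = {"speed_kmh": 50.0, "accel_ms2": 0.0, "heading_deg": 0.0}
trip = {"speed_kmh": 100.0, "accel_ms2": 3.0, "heading_deg": 15.0}

phi = shapley_values(toy_risk, trip, baseline)

if __name__ == "__main__":
    for name, contribution in phi.items():
        print(f"{name:12s} {contribution:+.3f}")
```

The contributions sum to the difference between the score of the explained trip and the score of the baseline trip, and a factor the model ignores (here the heading angle) receives a contribution of zero. These two properties are what make Shapley-based explanations attractive for ranking risk factors.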

4. Discussion

The result of this systematic literature review is a catalog of factors, prediction algorithms, and methods used to explain the importance of the factors. Researchers can use these results to understand progress in the field and to propose new approaches that reduce the risk of accidents, protect human lives, promote road safety, reduce the costs associated with traffic accidents, and support effective safety policies. The relevance of this information is supported by the fact that 95% of the studies were published in journals within the first two quartiles. The research questions are discussed below.

4.1. About Factors

In this study, it was observed that the factors were classified into five dimensions (vehicle, environment, traffic, driver, and management), where vehicle was the most studied. Speed, acceleration, and distance between two vehicles stood out as the most-used factors due to their direct influence on the driver’s ability to control the vehicle in various risk situations. In addition, they also determine the level of severity of an accident. Additional crucial factors include the geographical location, determined through the Global Positioning System (GPS), as it enables us to comprehend other related elements, such as the geographical environment. The increasing prevalence of cost-effective sensors and cameras in vehicles is driving a trend toward greater data acquisition in real time, consequently enhancing the precision of models. At present, China leads research on DBVAR factors, probably due to the growth, leadership, and expansion of its automotive industry.
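Two of the most-used factors above, speed and the distance between two vehicles, combine into time to collision, which this review lists among the most studied traffic factors. The sketch below uses the common constant-speed definition (gap divided by closing speed); the function name, parameter names, and units are assumptions, and the reviewed studies may define the measure differently.

```python
import math


def time_to_collision(gap_m: float, follower_speed_ms: float,
                      leader_speed_ms: float) -> float:
    """Time to collision in seconds, assuming both speeds stay constant.

    Returns math.inf when the follower is not closing the gap.
    """
    if gap_m < 0:
        raise ValueError("gap_m must be non-negative")
    closing_speed = follower_speed_ms - leader_speed_ms
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


if __name__ == "__main__":
    # Follower at 25 m/s, leader at 20 m/s, 30 m apart.
    print(time_to_collision(30.0, 25.0, 20.0))
    # Leader is faster, so the gap is opening.
    print(time_to_collision(30.0, 20.0, 25.0))
```

Returning infinity for a non-closing gap keeps the value usable as a model input without special-casing, although a real pipeline would probably cap or bin it.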
Some studies have considered the accident domain; however, this refers to the results and not the causes. Furthermore, they tend to focus on accident characterization, and so, they have not been considered as factors; however, they could be considered as an object to predict. Likewise, it is important to highlight that there were no factors associated with trip management, such as delay in delivery, the driver’s experience on the route, or whether the vehicle was loaded. Therefore, it is important to consider management-related factors (i.e., those in the management dimension) to evaluate commercial vehicles and improve the understanding of vehicle accidents as a whole.

4.2. About Prediction

In this study, the algorithms, vehicles, and road types used in accident risk prediction research were identified. The novelty lies in the growing use of algorithms that combine convolutional and recurrent neural networks, taking advantage of the diversity of sensors integrated into vehicles, which generate data and images. The most commonly used algorithm is CNN, alone or in combination with other models, as combining models makes it possible to exploit the individual strengths of each one when processing data such as sequences, text, and images. Furthermore, the most accurate algorithm for detecting distracted driving was MobileNetV3, which showed high accuracy in real-time pattern recognition. However, for driving style recognition, the algorithm with the best accuracy was XGBoost. Overall, there is a trend toward the use of deep learning, possibly due to the availability of larger volumes of data and advances in hardware and software, as well as the ability to achieve better performance overall. Regarding the metrics used to evaluate the algorithms, accuracy was one of the main indicators in most studies. However, as observed in Table 10, eight different metrics were identified, and not all studies considered the same metrics. Additionally, only 2.6% of the studies focused on rural roads and 5% on heavy vehicles. To improve precision in this topic, we suggest exploring transformer neural networks and dynamic Bayesian networks, which can capture long-term relationships in time series data. Additionally, alerts on the risk level could be implemented for drivers and fleet managers, so that they can take preventive actions based on the provided information.
Moreover, based on Table 10, the most commonly used models (present in 80% of the studies) are MLP, CNN, GB, LSTM, and RF. This could be explained by the fact that MLP is one of the pioneering models in ML, while the other four stand out for their high accuracy, averaging 92.94%. Furthermore, an analysis of Tables 4–9 on factors and Table 10 on models reveals that the common factors influencing the performance of the most commonly used models are speed, acceleration, and heading angle.
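The observation above that accuracy dominates while studies differ in the other metrics they report matters most when risky events are rare. The sketch below computes four standard classification metrics from scratch and applies them to a hypothetical, deliberately imbalanced sample; the data and the "always safe" predictor are invented for illustration and do not come from any reviewed study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary labelling."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical sample: 95 safe trips (0) and 5 risky trips (1).
y_true = [0] * 95 + [1] * 5
# A trivial predictor that labels every trip as safe.
always_safe = [0] * 100

metrics = classification_metrics(y_true, always_safe)

if __name__ == "__main__":
    for name, value in metrics.items():
        print(f"{name:9s} {value:.2f}")
```

In this sample the trivial predictor reaches 95% accuracy while detecting none of the risky trips (recall of zero), which is why comparing models across studies is difficult when only accuracy is reported.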

4.3. About Explainability

In this study, six explainability methods were identified in 18 studies, where the most studied was “RF feature importance,” with influencing factors related to the environment, such as road shape, road network, and weather. The increasing adoption of deep learning algorithms has highlighted the importance of understanding and trusting model decisions, driving the use of explainability methods to identify influential risk factors that might not be obvious to humans. Although the reviewed research barely addressed management factors, it is relevant to study their importance in explainability. Furthermore, there exist very successful methods, such as local interpretable model-agnostic explanations (LIME), which could provide good results in this context.

5. Conclusions

For this study, we conducted a systematic literature review related to DBVAR through ML. Out of the 1674 articles identified, 80 research papers were selected through analysis, enabling the discovery of advancements in the field with respect to factors, prediction, and explainability. Within this review, we identified 115 factors across 48 studies, 22 prediction algorithms within 76 studies, and six explainability methods across 18 studies, which elucidated the influence of certain factors on prediction outcomes. Unlike other state-of-the-art studies on DBVAR, this work considered three crucial aspects: the influencing factors, accident prediction, and explainability. In relation to factors, we identified five dimensions: environment (20 factors), traffic (17 factors), vehicle (44 factors), driver (25 factors), and management (9 factors). In particular, speed, acceleration, and distance between two vehicles were the most-studied factors. In the realm of ML advancements, CNN and GB emerged as the most commonly employed algorithms. Moreover, there is a growing trend in leveraging deep learning and hybrid models for enhanced precision. Notably, XGBoost achieved the highest accuracy at 100% on a DBD data set of Turkish origin. It is worth noting that the majority of studies focused on light vehicles, with limited research conducted on heavy vehicles and rural roads. In reference to advances in explainability, it was found that the most-used method was the RF algorithm with feature importance. Additionally, the most studied models were MLP, CNN, GB, LSTM, and RF, and the common factors influencing their performance were speed, acceleration, and heading angle.
This study had some limitations that should be considered. Only studies in English were included, and only the WoS and Scopus databases were used as sources of information. Based on our findings, future research should focus on developing practices and strategies to address DBVAR factors in order to reduce the occurrence of traffic accidents, as well as extending this study to include other languages and additional databases.

Author Contributions

Conceptualization, J.L.; methodology, J.L.; formal analysis, J.L.; investigation, J.L.; resources, J.L.; writing—original draft preparation, J.L.; writing—review and editing, J.L. and D.M.; supervision, D.M. and J.L.C.-S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. World Health Organization. Global Status Report on Road Safety—Time for Action; World Health Organization: Geneva, Switzerland, 2018. [Google Scholar]
  2. Geng, Z.; Ji, X.; Cao, R.; Lu, M.; Qin, W. A Conflict Measures-Based Extreme Value Theory Approach to Predicting Truck Collisions and Identifying High-Risk Scenes on Two-Lane Rural Highways. Sustainability 2022, 14, 11212. [Google Scholar] [CrossRef]
  3. Naciones Unidas. La Agenda 2030 y los Objetivos de Desarrollo Sostenible: Una Oportunidad para América Latina y el Caribe; Comisión Económica para América Latina y el Caribe (CEPAL): Santiago de Chile, Chile, 2018; ISBN 978-92-1-058643-6. [Google Scholar]
  4. Kashevnik, A.; Shchedrin, R.; Kaiser, C.; Stocker, A. Driver Distraction Detection Methods: A Literature Review and Framework. IEEE Access 2021, 9, 60063–60076. [Google Scholar] [CrossRef]
  5. Chipman, M.L.; Morgan, P. The Role of Driver Demerit Points and Age in the Prediction of Motor Vehicle Collisions. J. Epidemiol. Community Health 1975, 29, 190–195. [Google Scholar] [CrossRef]
  6. Celaya-Padilla, J.M.; Galván-Tejada, C.E.; Lozano-Aguilar, J.S.A.; Zanella-Calzada, L.A.; Luna-García, H.; Galván-Tejada, J.I.; Gamboa-Rosales, N.K.; Rodriguez, A.V.; Gamboa-Rosales, H. “Texting & Driving” Detection Using Deep Convolutional Neural Networks. Appl. Sci. 2019, 9, 2962. [Google Scholar] [CrossRef]
  7. AASHTO. Highway Safety Manual; American Association of State Highway and Transportation Officials: Washington, DC, USA, 2010. [Google Scholar]
  8. Das, S.; Tsapakis, I.; Khodadadi, A. Safety Performance Functions for Low-Volume Rural Minor Collector Two-Lane Roadways. IATSS Res. 2021, 45, 347–356. [Google Scholar] [CrossRef]
  9. Tola, A.M.; Demissie, T.A.; Saathoff, F.; Gebissa, A. Crash Distribution Dataset: Development and Validation for the Undivided Rural Roads in Oromia, Ethiopia. Transp. Telecommun. J. 2022, 23, 11–24. [Google Scholar] [CrossRef]
  10. Halim, Z.; Sulaiman, M.; Waqas, M.; Aydın, D. Deep Neural Network-Based Identification of Driving Risk Utilizing Driver Dependent Vehicle Driving Features: A Scheme for Critical Infrastructure Protection. J. Ambient Intell. Humaniz. Comput. 2023, 14, 11747–11765. [Google Scholar] [CrossRef]
  11. Shi, X.; Wong, Y.D.; Li, M.Z.-F.; Palanisamy, C.; Chai, C. A Feature Learning Approach Based on XGBoost for Driving Assessment and Risk Prediction. Accid. Anal. Prev. 2019, 129, 170–179. [Google Scholar] [CrossRef]
  12. Yi, D.W.; Su, J.Y.; Liu, C.J.; Quddus, M.; Chen, W.H. A Machine Learning Based Personalized System for Driving State Recognition. Transp. Res. Part C Emerg. Technol. 2019, 105, 241–261. [Google Scholar] [CrossRef]
  13. Observatorio Nacional de Seguridad Vial Boletín Estadístico de Siniestralidad Vial, 2021. Available online: https://www.onsv.gob.pe/post/boletin-estadistico-de-siniestralidad-vial-2021/ (accessed on 20 May 2022).
  14. Xu, W.; Wang, J.; Fu, T.; Gong, H.; Sobhani, A. Aggressive Driving Behavior Prediction Considering Driver’s Intention Based on Multivariate-Temporal Feature Data. Accid. Anal. Prev. 2022, 164, 106477.
  15. Li, D.; Wang, Y.; Xu, W. A Deep Multichannel Network Model for Driving Behavior Risk Classification. IEEE Trans. Intell. Transp. Syst. 2023, 24, 1204–1219.
  16. Niu, Y.; Li, Z.M.; Fan, Y.X. Analysis of Truck Drivers’ Unsafe Driving Behaviors Using Four Machine Learning Methods. Int. J. Ind. Ergon. 2021, 86, 103192.
  17. Yang, C.; Chen, M.Y.; Yuan, Q. The Application of XGBoost and SHAP to Examining the Factors in Freight Truck-Related Crashes: An Exploratory Analysis. Accid. Anal. Prev. 2021, 158, 106153.
  18. Zhong, S.; Fu, X.; Lu, W.; Tang, F.; Lu, Y. An Expressway Driving Stress Prediction Model Based on Vehicle, Road and Environment Features. IEEE Access 2022, 10, 57212–57226.
  19. Peng, L.; Wang, Y.; Zhang, F.; Zhang, J.; Li, Z. Evaluation of Emergency Driving Behaviour and Vehicle Collision Risk in Connected Vehicle Environment: A Deep Learning Approach. IET Intell. Transp. Syst. 2021, 15, 584–594.
  20. Masello, L.; Castignani, G.; Sheehan, B.; Guillen, M.; Murphy, F. Using Contextual Data to Predict Risky Driving Events: A Novel Methodology from Explainable Artificial Intelligence. Accid. Anal. Prev. 2023, 184, 106997.
  21. Al-refai, G.; Elmoaqet, H.; Ryalat, M. In-Vehicle Data for Predicting Road Conditions and Driving Style Using Machine Learning. Appl. Sci. 2022, 12, 8928.
  22. Bouhsissin, S.; Sael, N.; Benabbou, F. Driver Behavior Classification: A Systematic Literature Review. IEEE Access 2023, 11, 14128–14153.
  23. Paredes, J.J.; Yepes, S.F.; Salazar-Cabrera, R.; de la Cruz, Á.P.; Molina, J.M.M. Intelligent Collision Risk Detection in Medium-Sized Cities of Developing Countries, Using Naturalistic Driving: A Review. J. Traffic Transp. Eng. (Engl. Ed.) 2022, 9, 912–929.
  24. Elassad, Z.E.A.; Mousannif, H.; Moatassime, H.A.; Karkouch, A. The Application of Machine Learning Techniques for Driving Behavior Analysis: A Conceptual Framework and a Systematic Literature Review. Eng. Appl. Artif. Intell. 2020, 87, 103312.
  25. Silva, P.B.; Andrade, M.; Ferreira, S. Machine Learning Applied to Road Safety Modeling: A Systematic Literature Review. J. Traffic Transp. Eng. (Engl. Ed.) 2020, 7, 775–790.
  26. Shiguihara, P.; De Andrade Lopes, A.; Mauricio, D. Dynamic Bayesian Network Modeling, Learning, and Inference: A Survey. IEEE Access 2021, 9, 117639–117648.
  27. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 Statement: An Updated Guideline for Reporting Systematic Reviews. Syst. Rev. 2021, 10, 89.
  28. Alkinani, M.H.; Khan, W.Z.; Arshad, Q. Detecting Human Driver Inattentive and Aggressive Driving Behavior Using Deep Learning: Recent Advances, Requirements and Open Challenges. IEEE Access 2020, 8, 105008–105030.
  29. Elassad, Z.E.A.; Mousannif, H.; Moatassime, H.A. A Real-Time Crash Prediction Fusion Framework: An Imbalance-Aware Strategy for Collision Avoidance Systems. Transp. Res. Part C Emerg. Technol. 2020, 118, 102708.
  30. Shangguan, Q.; Fu, T.; Wang, J.; Luo, T.; Fang, S. An Integrated Methodology for Real-Time Driving Risk Status Prediction Using Naturalistic Driving Data. Accid. Anal. Prev. 2021, 156, 106122.
  31. Zhao, H.; Li, X.; Cheng, H.; Zhang, J.; Wang, Q.; Zhu, H. Deep Learning-Based Prediction of Traffic Accidents Risk for Internet of Vehicles. China Commun. 2022, 19, 214–224.
  32. Hu, Z.; Zhou, J.; Zhang, E. Improving Traffic Safety through Traffic Accident Risk Assessment. Sustainability 2023, 15, 3748.
  33. Wang, J.; Xu, W.; Fu, T.; Jiang, R. Recognition of Trip-Based Aggressive Driving: A System Integrated With Gaussian Mixture Model Structured of Factor-Analysis, and Hierarchical Clustering. IEEE Trans. Intell. Transp. Syst. 2022, 23, 20442–20451.
  34. Ghandour, R.; Potams, A.J.; Boulkaibet, I.; Neji, B.; Barakeh, Z.A. Driver Behavior Classification System Analysis Using Machine Learning Methods. Appl. Sci. 2021, 11, 10562.
  35. Khodairy, M.A.; Abosamra, G. Driving Behavior Classification Based on Oversampled Signals of Smartphone Embedded Sensors Using an Optimized Stacked-LSTM Neural Networks. IEEE Access 2021, 9, 4957–4972.
  36. Li, D.C.; Lin, M.Y.-C.; Chou, L.-D. Macroscopic Big Data Analysis and Prediction of Driving Behavior with an Adaptive Fuzzy Recurrent Neural Network on the Internet of Vehicles. IEEE Access 2022, 10, 47881–47895.
  37. Arumugam, S.; Bhargavi, R. Road Rage and Aggressive Driving Behaviour Detection in Usage-Based Insurance Using Machine Learning. Int. J. Softw. Innov. 2023, 11, 1–29.
  38. Kanwal, K.; Rustam, F.; Chaganti, R.; Jurcut, A.D.; Ashraf, I. Smartphone Inertial Measurement Unit Data Features for Analyzing Driver Driving Behavior. IEEE Sens. J. 2023, 23, 11308–11323.
  39. Shangguan, Q.; Fu, T.; Wang, J.; Fang, S.; Fu, L. A Proactive Lane-Changing Risk Prediction Framework Considering Driving Intention Recognition and Different Lane-Changing Patterns. Accid. Anal. Prev. 2022, 164, 106500.
  40. Nikolaou, D.; Ziakopoulos, A.; Dragomanovits, A.; Roussou, J.; Yannis, G. Comparing Machine Learning Techniques for Predictions of Motorway Segment Crash Risk Level. Safety 2023, 9, 32.
  41. Guo, H.; Xie, K.; Keyvan-Ekbatani, M. Modeling Driver’s Evasive Behavior during Safety–Critical Lane Changes: Two-Dimensional Time-to-Collision and Deep Reinforcement Learning. Accid. Anal. Prev. 2023, 186, 107063.
  42. Lyu, N.; Wang, Y.; Wu, C.; Peng, L.; Thomas, A.F. Using Naturalistic Driving Data to Identify Driving Style Based on Longitudinal Driving Operation Conditions. J. Intell. Connect. Veh. 2022, 5, 17–35.
  43. Abdelrahman, A.E.; Hassanein, H.S.; Abu-Ali, N. Robust Data-Driven Framework for Driver Behavior Profiling Using Supervised Machine Learning. IEEE Trans. Intell. Transp. Syst. 2022, 23, 3336–3350.
  44. Wang, H.; Wang, X.; Han, J.; Xiang, H.; Li, H.; Zhang, Y.; Li, S. A Recognition Method of Aggressive Driving Behavior Based on Ensemble Learning. Sensors 2022, 22, 644.
  45. Liu, Z.; Ren, S.; Peng, M. Identification of Driver Distraction Based on SHRP2 Naturalistic Driving Study. Math. Probl. Eng. 2021, 2021, 6699327.
  46. Zhao, L.; Xu, T.; Zhang, Z.; Hao, Y. Lane-Changing Recognition of Urban Expressway Exit Using Natural Driving Data. Appl. Sci. 2022, 12, 9762.
  47. van der Wall, H.E.C.; Doll, R.J.; van Westen, G.J.P.; Koopmans, I.; Zuiker, R.G.; Burggraaf, J.; Cohen, A.F. The Use of Machine Learning Improves the Assessment of Drug-Induced Driving Behaviour. Accid. Anal. Prev. 2020, 148, 105822.
  48. Yurtsever, E.; Yamazaki, S.; Miyajima, C.; Takeda, K.; Mori, M.; Hitomi, K.; Egawa, M. Integrating Driving Behavior and Traffic Context Through Signal Symbolization for Data Reduction and Risky Lane Change Detection. IEEE Trans. Intell. Veh. 2018, 3, 242–253.
  49. Wang, K.; Yang, Y.; Wang, S.; Shi, Z. Research on Car-Following Model Considering Driving Style. Math. Probl. Eng. 2022, 2022, 7215697.
  50. Rahman, M.J.; Beauchemin, S.S.; Bauer, M.A. Predicting Driver Behaviour at Intersections Based on Driver Gaze and Traffic Light Recognition. IET Intell. Transp. Syst. 2020, 14, 2083–2091.
  51. Misra, A.; Samuel, S.; Cao, S.; Shariatmadari, K. Detection of Driver Cognitive Distraction Using Machine Learning Methods. IEEE Access 2023, 11, 18000–18012.
  52. Malik, M.; Nandal, R.; Dalal, S.; Jalglan, V.; Le, D.N. Driving Pattern Profiling and Classification Using Deep Learning. Intell. Autom. Soft Comput. 2021, 28, 887–906.
  53. Seo, H.; Shin, J.; Kim, K.-H.; Lim, C.; Bae, J. Driving Risk Assessment Using Non-Negative Matrix Factorization with Driving Behavior Records. IEEE Trans. Intell. Transp. Syst. 2022, 23, 20398–20412.
  54. Lattanzi, E.; Freschi, V. Machine Learning Techniques to Identify Unsafe Driving Behavior by Means of In-Vehicle Sensor Data. Expert Syst. Appl. 2021, 176, 114818.
  55. Kadri, N.; Ellouze, A.; Ksantini, M.; Turki, S.H. New LSTM Deep Learning Algorithm for Driving Behavior Classification. Cybern. Syst. 2023, 54, 387–405.
  56. Zhang, X.; Yan, X. Predicting Collision Cases at Unsignalized Intersections Using EEG Metrics and Driving Simulator Platform. Accid. Anal. Prev. 2023, 180, 106910.
  57. Tran, D.; Do, H.M.; Sheng, W.H.; Bai, H.; Chowdhary, G. Real-Time Detection of Distracted Driving Based on Deep Learning. IET Intell. Transp. Syst. 2018, 12, 1210–1219.
  58. Nakano, K.; Chakraborty, B. Real-Time Distraction Detection from Driving Data Based Personal Driving Model Using Deep Learning. Int. J. Intell. Transp. Syst. Res. 2022, 20, 238–251.
  59. Panagopoulos, G.; Pavlidis, I. Forecasting Markers of Habitual Driving Behaviors Associated with Crash Risk. IEEE Trans. Intell. Transp. Syst. 2020, 21, 841–851.
  60. Shi, L.; Qian, C.; Guo, F. Real-Time Driving Risk Assessment Using Deep Learning with XGBoost. Accid. Anal. Prev. 2022, 178, 106836.
  61. Fan, X.; Wang, F.; Song, D.; Lu, Y.; Liu, J. GazMon: Eye Gazing Enabled Driving Behavior Monitoring and Prediction. IEEE Trans. Mob. Comput. 2021, 20, 1420–1433.
  62. Albadawi, Y.; AlRedhaei, A.; Takruri, M. Real-Time Machine Learning-Based Driver Drowsiness Detection Using Visual Features. J. Imaging 2023, 9, 91.
  63. Alotaibi, M.; Alotaibi, B. Distracted Driver Classification Using Deep Learning. Signal Image Video Process. 2020, 14, 617–624.
  64. Taherisadr, M.; Asnani, P.; Galster, S.; Dehzangi, O. ECG-Based Driver Inattention Identification during Naturalistic Driving Using Mel-Frequency Cepstrum 2-D Transform and Convolutional Neural Networks. Smart Health 2018, 9–10, 50–61.
  65. Haque, M.M.; Sarker, S.; Dewan, M.A.A. Driving Maneuver Classification from Time Series Data: A Rule Based Machine Learning Approach. Appl. Intell. 2022, 52, 16900–16915.
  66. Hu, J.; Zhang, X.; Maybank, S. Abnormal Driving Detection with Normalized Driving Behavior Data: A Deep Learning Approach. IEEE Trans. Veh. Technol. 2020, 69, 6943–6951.
  67. Jahan, I.; Uddin, K.M.A.; Murad, S.A.; Miah, M.S.U.; Khan, T.Z.; Masud, M.; Aljahdali, S.; Bairagi, A.K. 4D: A Real-Time Driver Drowsiness Detector Using Deep Learning. Electronics 2023, 12, 235.
  68. Khan, T.; Choi, G.; Lee, S. EFFNet-CA: An Efficient Driver Distraction Detection Based on Multiscale Features Extractions and Channel Attention Mechanism. Sensors 2023, 23, 3835.
  69. Abosaq, H.A.; Ramzan, M.; Althobiani, F.; Abid, A.; Aamir, K.M.; Abdushkour, H.; Irfan, M.; Gommosani, M.E.; Ghonaim, S.M.; Shamji, V.R.; et al. Unusual Driver Behavior Detection in Videos Using Deep Learning Models. Sensors 2023, 23, 311.
  70. Huang, C.; Wang, X.; Cao, J.; Wang, S.; Zhang, Y. HCF: A Hybrid CNN Framework for Behavior Detection of Distracted Drivers. IEEE Access 2020, 8, 109335–109349.
  71. Li, T.; Zhang, Y.; Li, Q.; Zhang, T. AB-DLM: An Improved Deep Learning Model Based on Attention Mechanism and BiFPN for Driver Distraction Behavior Detection. IEEE Access 2022, 10, 83138–83151.
  72. Aljohani, A.A. Real-Time Driver Distraction Recognition: A Hybrid Genetic Deep Network Based Approach. Alex. Eng. J. 2023, 66, 377–389.
  73. Lin, Y.C.; Cao, D.X.; Fu, Z.H.; Huang, Y.M.; Song, Y.Y. A Lightweight Attention-Based Network towards Distracted Driving Behavior Recognition. Appl. Sci. 2022, 12, 4191.
  74. Kabir, M.F.; Roy, S. Real-Time Vehicular Accident Prevention System Using Deep Learning Architecture. Expert Syst. Appl. 2022, 206, 117837.
  75. Hossain, M.U.; Rahman, M.A.; Islam, M.M.; Akhter, A.; Uddin, M.A.; Paul, B.K. Automatic Driver Distraction Detection Using Deep Convolutional Neural Networks. Intell. Syst. Appl. 2022, 14, 200075.
  76. Lin, P.-W.; Hsu, C.-M. Innovative Framework for Distracted-Driving Alert System Based on Deep Learning. IEEE Access 2022, 10, 77523–77536.
  77. Xiao, W.; Liu, H.; Ma, Z.; Chen, W.; Sun, C.; Shi, B. Fatigue Driving Recognition Method Based on Multi-Scale Facial Landmark Detector. Electronics 2022, 11, 4103.
  78. Ezzouhri, A.; Charouh, Z.; Ghogho, M.; Guennoun, Z. Robust Deep Learning-Based Driver Distraction Detection and Classification. IEEE Access 2021, 9, 168080–168092.
  79. Xue, Q.; Gao, K.; Xing, Y.; Lu, J.; Qu, X. A Context-Aware Framework for Risky Driving Behavior Evaluation Based on Trajectory Data. IEEE Intell. Transp. Syst. Mag. 2023, 15, 70–83.
  80. Fan, Y.; Gu, F.; Wang, J.; Wang, J.; Lu, K.; Niu, J. SafeDriving: An Effective Abnormal Driving Behavior Detection System Based on EMG Signals. IEEE Internet Things J. 2022, 9, 12338–12350.
  81. Zhang, Y.; Chen, Y.; Gao, C. Deep Unsupervised Multi-Modal Fusion Network for Detecting Driver Distraction. Neurocomputing 2021, 421, 26–38.
  82. Boucetta, Z.; Fazziki, A.E.; Adnani, M.E. Integration of Ensemble Variant CNN with Architecture Modified LSTM for Distracted Driver Detection. Int. J. Adv. Comput. Sci. Appl. 2022, 13, 440–458.
  83. Safarov, F.; Akhmedov, F.; Abdusalomov, A.B.; Nasimov, R.; Cho, Y.I. Real-Time Deep Learning-Based Drowsiness Detection: Leveraging Computer-Vision and Eye-Blink Analyses for Enhanced Road Safety. Sensors 2023, 23, 6459.
  84. Jardin, P.; Moisidis, I.; Kartal, K.; Rinderknecht, S. Adaptive Driving Style Classification through Transfer Learning with Synthetic Oversampling. Vehicles 2022, 4, 1314–1331.
  85. Wang, H.; Chen, J.; Huang, Z.; Li, B.; Lv, J.; Xi, J.; Wu, B.; Zhang, J.; Wu, Z. FPT: Fine-Grained Detection of Driver Distraction Based on the Feature Pyramid Vision Transformer. IEEE Trans. Intell. Transp. Syst. 2023, 24, 1594–1608.
  86. Ping, P.; Huang, C.; Ding, W.; Liu, Y.; Chiyomi, M.; Kazuya, T. Distracted Driving Detection Based on the Fusion of Deep Learning and Causal Reasoning. Inf. Fusion 2023, 89, 121–142.
  87. Jahangiri, A.; Berardi, V.J.; Machiani, S.G. Application of Real Field Connected Vehicle Data for Aggressive Driving Identification on Horizontal Curves. IEEE Trans. Intell. Transp. Syst. 2018, 19, 2316–2324.
  88. Siddiqui, H.U.R.; Saleem, A.A.; Brown, R.; Bademci, B.; Lee, E.; Rustam, F.; Dudley, S. Non-Invasive Driver Drowsiness Detection System. Sensors 2021, 21, 4833.
  89. Zhang, Y.; Chen, Y.; Gu, X.; Sze, N.N.; Huang, J. A Proactive Crash Risk Prediction Framework for Lane-Changing Behavior Incorporating Individual Driving Styles. Accid. Anal. Prev. 2023, 188, 107072.
  90. Cai, B.; Di, Q. Different Forecasting Model Comparison for Near Future Crash Prediction. Appl. Sci. 2023, 13, 759.
  91. Singh, G.; Bansal, D.; Sofat, S. A Smartphone Based Technique to Monitor Driving Behavior Using DTW and Crowdsensing. Pervasive Mob. Comput. 2017, 40, 56–70.
Figure 1. Systematic review process according to PRISMA [27].
Figure 2. Number of publications per year: (a) potentially eligible and (b) selected studies.
Figure 3. Studies by authors’ country of affiliation.
Figure 4. Articles by quality factor.
Figure 5. Articles by journal.
Table 1. Database search string.
| Database | Search String |
|---|---|
| Scopus | TITLE-ABS-KEY ((“vehicle accident risk” OR “car accident risk” OR “car following” OR “driving behavior” OR “driving style” OR “driver behavior” OR “driving risk” OR “driver risk” OR “ road safety”) AND ((factors OR features OR causes) OR (predicti* OR forecast* OR progno*) OR (explainability OR explainable OR interpretabl* OR xai)) AND (“machine learning” OR “deep learning”)) |
| WoS | Results for (“vehicle accident risk” OR “car accident risk” OR “car following” OR “driving behavior” OR “driving style” OR “driver behavior” OR “driving risk” OR “driver risk” OR “road safety”) AND ((factors OR aspects OR causes) OR (predicti* OR forecast*) OR (explainability OR explainable OR interpretable OR xai)) AND (“machine learning” OR “deep learning”) (Topic) |
Table 2. Inclusion and exclusion criteria.
| Inclusion Criteria | Exclusion Criteria |
|---|---|
| CI1: Studies that answer the research questions (factors, prediction models, or explainability) | CE1: Studies aimed at cost reduction |
| CI2: Primary type studies | CE2: Studies not related to vehicular transportation |
| CI3: Studies that present metrics to evaluate the quality of predictive models | CE3: Studies that do not present test results |
| CI4: Studies presented in English | CE4: Studies that are not of the “journal” type of article |
Table 3. Potentially eligible studies and selected studies.
| Source | Potentially Eligible Studies | Selected Studies |
|---|---|---|
| Scopus | 1115 | 52 |
| WoS | 559 | 28 |
| Total | 1674 | 80 a |
a 26 studies removed from WoS for being duplicates in Scopus.
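The duplicate removal mentioned in the footnote can be illustrated with a minimal sketch. The review does not state how duplicates were detected, so matching on a case-insensitive DOI is an assumption here, and the function name, record layout, and DOIs are hypothetical.

```python
def merge_without_duplicates(scopus_records, wos_records):
    """Keep every Scopus record; add a WoS record only if its DOI is new.

    Assumption: each record is a dict with a "doi" key, and two records are
    duplicates when their DOIs match case-insensitively.
    """
    seen = {record["doi"].lower() for record in scopus_records}
    merged = list(scopus_records)
    removed = 0
    for record in wos_records:
        key = record["doi"].lower()
        if key in seen:
            removed += 1  # duplicate of a record already kept
        else:
            seen.add(key)
            merged.append(record)
    return merged, removed


# Hypothetical toy data, not taken from the review.
scopus = [{"doi": "10.0000/example.1"}, {"doi": "10.0000/example.2"}]
wos = [{"doi": "10.0000/EXAMPLE.2"}, {"doi": "10.0000/example.3"}]
merged, removed = merge_without_duplicates(scopus, wos)
```

With the toy data, one WoS record is dropped as a duplicate and three unique records remain.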
Table 4. Environmental factors used in DBVAR.
| ID | Factor | Description | # | Studies |
|---|---|---|---|---|
| 01 | Weather | Atmospheric conditions affecting visibility and traction, increasing accident risk. | 9 | [14,15,16,20,29,30,31,32,33] |
| 02 | Date–time | Specific time and date of travel, influencing traffic congestion and driver fatigue. | 8 | [30,31,32,34,35,36,37,38] |
| 03 | Slope | Terrain inclination impacting vehicle speed and control. | 5 | [2,14,15,18,20] |
| 04 | Lane | Vehicle position on the road, influencing collision risk. | 5 | [20,34,39,40,41] |
| 05 | Road condition | Pavement quality and obstacles compromising safety. | 4 | [15,16,18,20] |
| 06 | Meteorological conditions | Atmospheric elements such as rain and snow, affecting visibility and vehicle adherence. | 4 | [15,20,29,31] |
| 07 | Light conditions | Level of available light impacting visibility and driver reaction time. | 4 | [20,30,31,32] |
| 08 | Road type | Road design affecting speed and maneuverability. | 3 | [12,20,42] |
| 09 | Road obstruction | Roadside obstacles posing hazards. | 3 | [16,41,43] |
| 10 | Curve type | Shape and degree of road curves that affect the driver’s driving. | 3 | [14,18,20] |
| 11 | Segment length | Distance between road reference points. | 3 | [18,40,42] |
| 12 | Curve radius | Measure of road curve curvature. | 2 | [2,40] |
| 13 | Road safety | Presence of safety measures on the road. | 1 | [18] |
| 14 | Number of lanes | Quantity of lanes available on the road. | 1 | [40] |
| 15 | Weekday | Day of the week of travel. | 1 | [30] |
| 16 | Road Measurements | Specific road data. | 1 | [40] |
| 17 | Crosswalk | Designated pedestrian crossing areas. | 1 | [31] |
| 18 | Population density | Number of individuals living in a specific area. | 1 | [17] |
| 19 | Employment density | Concentration of workplaces in a given area. | 1 | [17] |
| 20 | Land use | Utilization of land along the road. | 1 | [17] |
Table 5. Traffic factors used in DBVAR.
| ID | Factor | Description | # | Studies |
|---|---|---|---|---|
| 01 | Distance between two vehicles | Space separating two vehicles on the road, influencing collision likelihood. | 13 | [2,11,14,30,33,34,35,39,42,44,45,46,47] |
| 02 | Time to collision | Estimated time before a collision between vehicles, assuming current speeds and trajectories remain unchanged. | 10 | [11,14,29,33,34,35,42,46,48,49] |
| 03 | Traffic density | Volume of vehicles on the road, impacting accident frequency. | 9 | [2,14,17,18,20,30,31,34,35] |
| 04 | Overspeeding | Driving at a speed exceeding legal limits, elevating accident risk. | 5 | [14,15,20,31,40] |
| 05 | Speed difference between two vehicles | Variation in velocity between two vehicles, affecting collision potential. | 4 | [30,33,39,45] |
| 06 | Road signals | Signs indicating traffic regulations or hazards, influencing driver actions and accident likelihood. | 3 | [15,17,20] |
| 07 | Time headway | Time interval between vehicles, affecting collision risk. | 3 | [42,46,48] |
| 08 | Accident risk level | Degree of vulnerability to vehicular accidents, influenced by various factors. | 3 | [11,31,40] |
| 09 | Average speed | Mean velocity of vehicles, affecting accident probability. | 2 | [2,40] |
| 10 | Density by vehicle type | Distribution of vehicle types on the road, impacting accident dynamics. | 2 | [2,17] |
| 11 | Non-compliance with regulations | Failure to adhere to traffic laws, elevating accident risk. | 2 | [16,43] |
| 12 | Lateral distance with objects | Distance between vehicles and roadside objects, affecting collision probability. | 1 | [44] |
| 13 | Acceleration difference between two vehicles | Variation in acceleration rates between vehicles, influencing collision potential. | 1 | [39] |
| 14 | There is a surrounding vehicle | Presence of neighboring vehicles, affecting driving dynamics and accident risk. | 1 | [15] |
| 15 | Lack of laws | Absence or lax enforcement of traffic regulations, increasing accident likelihood. | 1 | [16] |
| 16 | Traffic light status | State of traffic signals, influencing driver behavior and accident risk. | 1 | [50] |
| 17 | Vehicle in front with high beams | Leading vehicle using high beam headlights, impacting visibility and accident risk. | 1 | [16] |
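Two of the traffic factors above, time to collision and time headway, are simple kinematic quantities. The following is a minimal sketch assuming a one-lane car-following situation with constant speeds. The function and parameter names are hypothetical, and the reviewed studies may define these measures differently (for example, headway is often measured between front bumpers rather than over the gap).

```python
import math


def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """Seconds until the follower reaches the leader if both speeds stay constant.

    Returns math.inf when the follower is not closing in on the leader.
    """
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0:
        return math.inf
    return gap_m / closing_speed


def time_headway(gap_m, follower_speed_mps):
    """Seconds the follower needs to cover the current gap at its own speed.

    Simplifying assumption: the gap is used directly as the headway distance.
    """
    if follower_speed_mps <= 0:
        return math.inf
    return gap_m / follower_speed_mps


# Hypothetical example: 30 m gap, follower at 25 m/s, leader at 20 m/s.
ttc = time_to_collision(30.0, 25.0, 20.0)
thw = time_headway(30.0, 25.0)
```

In this example the closing speed is 5 m/s, so the time to collision is 6 s, while the time headway is 1.2 s. A short headway with a long time to collision indicates close following at similar speeds.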
Table 6. Vehicle factors used in DBVAR.
| ID | Factor | Description | # | Studies |
|---|---|---|---|---|
| 01 | Speed | The rate at which a vehicle is traveling, measured in distance per unit of time, directly impacting the vehicle’s ability to respond to hazards and increasing the severity of potential collisions. | 27 | [10,11,14,15,21,29,30,33,35,36,37,39,42,43,44,45,47,48,50,51,52,53,54,55,56,57,58] |
| 02 | Acceleration | The rate of change of velocity of a vehicle over time, either increasing or decreasing, crucial for determining the vehicle’s ability to adjust its speed and navigate safely through traffic, influencing collision potential. | 23 | [11,12,14,15,19,21,30,33,34,36,38,39,42,44,45,48,51,53,55,56,58,59,60] |
| 03 | Steering angle | Degree of wheel rotation, affecting vehicle trajectory. | 9 | [14,19,29,47,48,50,51,54,57] |
| 04 | Vehicle GPS position | Global position coordinates, crucial for navigation and accident location determination. | 8 | [11,12,19,31,32,35,37,41] |
| 05 | Heading angle | Direction of vehicle travel, crucial for navigation and collision avoidance. | 7 | [33,35,37,48,51,58,59] |
| 06 | Braking | Deceleration of the vehicle, critical for collision avoidance. | 6 | [10,15,40,43,53,58] |
| 07 | Pedals position | Position of accelerator and brake pedals, impacting vehicle speed control. | 6 | [29,48,51,52,54,57] |
| 08 | Lane number | Assigned lane on the road, influencing collision risk during lane changes. | 5 | [11,34,47,51,58] |
| 09 | Yaw angle | Angle of rotation around the vertical axis, affecting vehicle stability. | 5 | [12,29,35,44,55] |
| 10 | RPM | Engine speed, influencing vehicle acceleration and control. | 5 | [21,29,51,52,57] |
| 11 | Coolant temperature | Temperature of the engine coolant, impacting vehicle performance. | 4 | [21,29,36,52] |
| 12 | Vehicle type | Classification of the vehicle, influencing handling characteristics and collision dynamics. | 3 | [11,29,39] |
| 13 | Lane change | Change in lane position, increasing collision risk due to potential blind spots. | 3 | [33,51,53] |
| 14 | Vehicle length | Length of the vehicle, influencing maneuverability and collision severity. | 3 | [2,11,19] |
| 15 | Balancing angle | Angle of vehicle balance, affecting stability and risk of rollover. | 3 | [12,35,55] |
| 16 | Engine load | Demand placed on the engine, impacting vehicle performance and stability. | 3 | [21,52,54] |
| 17 | Fuel | Remaining combustible in the tank, critical for propulsion and impacting vehicle range. | 3 | [21,29,57] |
| 18 | Turn type | The specific maneuver a vehicle intends to execute, such as a left turn, right turn, or U-turn, crucial for anticipating traffic flow and collision avoidance. | 3 | [10,16,53] |
| 19 | Jerk | Rate of change of acceleration, affecting passenger comfort and vehicle control. | 2 | [11,30] |
| 20 | Pitch angle | Angle of vehicle tilt, influencing stability and collision risk. | 2 | [12,55] |
| 21 | Brake temperature | Temperature of the braking system, affecting braking efficiency and collision avoidance. | 2 | [16,29] |
| 22 | Altitude | Elevation above sea level, influencing engine performance and vehicle handling. | 2 | [21,37] |
| 23 | Traveled distance | Distance covered by the vehicle, impacting fatigue and collision risk. | 2 | [29,36] |
| 24 | Brake failure | Malfunction of the braking system, increasing collision risk. | 2 | [16,19] |
| 25 | Directional | Vehicle direction of travel, crucial for collision avoidance and navigation. | 2 | [10,57] |
| 26 | Suspension height | Height of the vehicle suspension, affecting stability and collision risk. | 2 | [19,29] |
| 27 | Vehicle Width | Width of the vehicle, impacting maneuverability and collision risk. | 1 | [19] |
| 28 | Harsh accelerations | Abrupt changes in acceleration, impacting passenger comfort and vehicle control. | 1 | [40] |
| 29 | Clutch | Mechanism for engaging and disengaging the engine from the transmission, crucial for vehicle control. | 1 | [15] |
| 30 | Wheel angle | Angle of the vehicle wheels, influencing steering and collision risk. | 1 | [51] |
| 31 | G force in all three axes | Forces acting on the vehicle in three-dimensional space, impacting vehicle stability and control. | 1 | [29] |
| 32 | Oil | Lubricant for the vehicle engine, crucial for engine function and longevity. | 1 | [29] |
| 33 | Water pressure | Pressure in the vehicle cooling system, impacting engine temperature regulation. | 1 | [29] |
| 34 | Air pressure | Pressure in the vehicle tires, crucial for tire performance and vehicle stability. | 1 | [21] |
| 35 | Tires | Contact points between the vehicle and the road, crucial for traction and vehicle control. | 1 | [29] |
| 36 | Damaged rear-view mirror | Impaired visibility to the rear of the vehicle, increasing collision risk. | 1 | [16] |
| 37 | Overspeed alarm | Warning system for exceeding speed limits, crucial for collision avoidance. | 1 | [16] |
| 38 | Loaded with hazardous material | Transporting dangerous substances, increasing collision risk and potential for environmental damage. | 1 | [16] |
| 39 | Damaged windshield wiper | Impaired visibility in adverse weather conditions, increasing collision risk. | 1 | [16] |
| 40 | Gear | Transmission setting, impacting vehicle speed and acceleration. | 1 | [10] |
| 41 | Transmission | Mechanism for transferring engine power to the wheels, crucial for vehicle propulsion. | 1 | [19] |
| 42 | Reverse | Gear setting for backward vehicle movement, crucial for maneuvering and collision avoidance. | 1 | [10] |
| 43 | Horn | Audible warning device, crucial for communication and collision avoidance. | 1 | [10] |
| 44 | Vehicle exterior light | Illumination for visibility in low-light conditions, crucial for collision avoidance. | 1 | [16] |
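Jerk and harsh accelerations in the table above are both derived from an acceleration signal. As a minimal sketch, assuming evenly spaced samples, jerk can be approximated by a forward finite difference, and harsh events flagged with a fixed threshold. The function names and the 3.0 m/s² threshold are hypothetical and are not taken from the reviewed studies.

```python
def jerk_series(accel_mps2, dt_s):
    """Approximate jerk (m/s^3) as the difference of consecutive acceleration samples.

    Assumption: samples are evenly spaced dt_s seconds apart.
    """
    return [(a1 - a0) / dt_s for a0, a1 in zip(accel_mps2, accel_mps2[1:])]


def harsh_acceleration_events(accel_mps2, threshold_mps2=3.0):
    """Indices of samples whose absolute acceleration reaches the threshold.

    The default threshold is an illustrative value, not a standard.
    """
    return [i for i, a in enumerate(accel_mps2) if abs(a) >= threshold_mps2]


# Hypothetical acceleration trace sampled every 0.5 s.
accel = [0.0, 0.5, 2.0, 3.5, 1.0]
jerk = jerk_series(accel, 0.5)
events = harsh_acceleration_events(accel)
```

For this trace the jerk values are 1.0, 3.0, 3.0, and -5.0 m/s³, and only the sample at index 3 is flagged as harsh.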
Table 7. Driver factors used in DBVAR.
| ID | Factor | Description | # | Studies |
|---|---|---|---|---|
| 01 | Heart rate | Pulse rate indicating stress or fatigue levels affecting driving performance. | 4 | [29,37,51,59] |
| 02 | Eye | Eye movements and tracking, influencing attention and reaction times. | 4 | [50,51,61,62] |
| 03 | Head | Head position and movement, indicating focus and awareness. | 3 | [50,61,62] |
| 04 | Age | Driver’s age, impacting reflexes and driving abilities. | 3 | [14,16,42] |
| 05 | Distraction | Level of attentional diversion from driving tasks. | 3 | [14,16,63] |
| 06 | Electrocardiogram (ECG) | Heart activity measurement, indicating stress or health issues. | 2 | [18,64] |
| 07 | Electrodermal Activity | Skin conductance reflecting stress or arousal levels. | 2 | [51,59] |
| 08 | Breathing frequency | Rate of breathing, indicating stress or fatigue. | 2 | [32,59] |
| 09 | Gender | A driver attribute used to analyze differences in driving behavior and accident risk. | 2 | [14,42] |
| 10 | Driving experience | Duration of driving practice, affecting skill and accident risk. | 2 | [16,42] |
| 11 | Driver’s mood | Emotional state impacting focus and decision-making. | 2 | [16,34] |
| 12 | Electroencephalogram (EEG) | Brain activity measurement indicating alertness levels. | 1 | [56] |
| 13 | Temperature | Body temperature, affecting comfort and concentration. | 1 | [51] |
| 14 | Sleep | Amount of rest influencing alertness and reaction times. | 1 | [14] |
| 15 | Driver video | Visual monitoring of driver behavior and attention. | 1 | [18] |
| 16 | Educational background | Education level influencing knowledge and adherence to traffic rules. | 1 | [16] |
| 17 | Birthplace | Origin of driver, potentially affecting driving habits and risk perception. | 1 | [16] |
| 18 | Driver’s license type | Classification of the license, indicating permitted vehicle types and driver qualifications. | 1 | [16] |
| 19 | Extreme excitement | High arousal levels impacting decision-making and control. | 1 | [16] |
| 20 | Unaware of road conditions | Lack of awareness about current road status, increasing accident risk. | 1 | [16] |
| 21 | Perinasal perspiration | Sweat around the nose indicating stress or discomfort. | 1 | [59] |
| 22 | Face | Facial expressions reflecting emotions and attention levels. | 1 | [61] |
| 23 | Mouth | Mouth movements indicating speech or stress levels. | 1 | [62] |
| 24 | Reaction time | Speed of response to stimuli, crucial for accident avoidance. | 1 | [45] |
| 25 | Driving time | Duration of driving, impacting fatigue and alertness. | 1 | [15] |
Table 8. Management factors used in DBVAR.
| ID | Factor | Description | # | Studies |
|---|---|---|---|---|
| 01 | Driver evaluation | Assessment of a driver’s performance and skills, impacting safety and accident risk. | 1 | [16] |
| 02 | Overtime work | Extended work hours contributing to driver fatigue and increased accident risk. | 1 | [16] |
| 03 | Vehicle management | Oversight of vehicle operations, ensuring safety and reducing accident likelihood. | 1 | [16] |
| 04 | Safety training | Programs aimed at improving driver safety and reducing risky behaviors. | 1 | [16] |
| 05 | Drivers’ care | Measures ensuring driver well-being, impacting alertness and accident risk. | 1 | [16] |
| 06 | Workload | Amount of work assigned to drivers, affecting fatigue and focus. | 1 | [16] |
| 07 | Units monitoring | Surveillance of vehicles to ensure compliance with safety standards. | 1 | [16] |
| 08 | Distance to destination point | Remaining distance influencing driver fatigue and decision-making. | 1 | [17] |
| 09 | Density of warehousing facilities | The concentration of storage locations in an area, influencing traffic patterns and accident risks through truck and delivery vehicle flow, potentially increasing congestion and interactions with other traffic. | 1 | [17] |
Table 9. Variables that describe the accident used in DBVAR prediction.
| Studies | Factor |
|---|---|
| [31] | Severity, number of accidents |
| [32] | Accident type, accident causes |
Table 10. Algorithms used in the DBVAR.
| Studies | Algorithm (a) | Data Set | Study Area | Result |
|---|---|---|---|---|
| [54] | ANN: Backpropagation Levenberg–Marquardt | D.B.D. | Turkey | acc = 90.00% |
| [52] | ANN | Own | India | acc = 99.00% |
| [66] | SdsAE | D.B.D. | Turkey | acc = 98.33% |
| [67] | CNN: 4D | MRL Eye | Czech Republic | acc = 97.53% |
| [6] | CNN: Inception v3 | Own | Mexico | acc = 92.80% |
| [68] | CNN: EFFNet-CA | SF3D | USA | acc = 99.58% |
| [69] | CNN | SF3D | USA | acc = 95.00% |
| [70] | CNN: HCF | SF3D | USA | acc = 96.74% |
| [58] | CNN | Own | Japan | acc = 83.00% |
| [71] | CNN: BiFPN | DMD | - | acc = 95.60% |
| [64] | CNN: DCNN | Own | USA | acc = 95.51% |
| [72] | CNN: DenseNet + GA | SF3D | USA | acc = 99.80% |
| [57] | CNN: GoogleNet | Own | - | acc = 89.00% |
| [73] | CNN: LWANet (VGG16) | SF3D | USA | acc = 99.37% |
| [74] | CNN: MobileNet | COCO | USA | acc = 90.00% |
| [75] | CNN: MobileNetV2 | SF3D | USA | acc = 99.68% |
| [76] | CNN: MobileNetV3 | 3D KITTI | Germany | acc = 99.95% |
| [77] | CNN: MSFLD | HNUFDD | China | acc = 99.13% |
| [78] | CNN: VGG-19 | AUCD2 | Greece | acc = 95.77% |
| [61] | CNN + LSTM | Own | - | acc = 94.00% |
| [60] | CNN-GRU + XGBoost | SHRP 2 | USA | acc = 97.50% |
| [15] | DMNM | NavInfo | - | acc = 99.00% |
| [47] | GB | BEBO | The Netherlands | acc = 81.00% |
| [10] | GB | Own | Pakistan | acc = 97.00% |
| [34] | GB | UAH-DriveSet | Spain | acc = 67.00% |
| [16] | GB: GBDT | Own | China | acc = 80.00% |
| [79] | GB: LightGBM | HighD | Germany | acc = 97.58% |
| [80] | GRU | Own | China | acc = 93.94% |
| [11] | GB: XGBoost | NGSIM | USA | acc = 89.00% |
| [2] | GB: XGBoost | Own | China | acc = 96.66% |
| [59] | GB: XGBoost | SIM 1 | USA | acc = 89.24% |
| [38] | GB: XGBoost | D.B.D. | Turkey | acc = 100.00% |
| [33] | GMM: HC + FA | SH-NDS | China | acc = 87.00% |
| [50] | AIO-HMM | RoadLab | Canada | acc = 86.40% |
| [39] | LSTM | HighD | Germany | acc = 97.00% |
| [19] | LSTM | Own | - | acc = 93.50% |
| [44] | LSTM: Ensemble Classifier | Own | China | acc = 90.50% |
| [14] | LSTM + HMM | SH-NDS | China | acc = 84.00% |
| [32] | LSTM: BCDU-Net | Own | China | acc = 98.48% |
| [81] | ConvLSTM: UMMFN | Own | China | acc = 97.79% |
| [63] | ResNet + HRNN + Inception | SF3D | USA | acc = 96.23% |
| [82] | EV-CNN + LSTM | SF3D | USA | acc = 93.68% |
| [35] | LSTM: Stacked-LSTM | UAH-DriveSet | Spain | acc = 99.47% |
| [55] | LSTM: Stacked-LSTM | UAH-DriveSet | Spain | acc = 94.00% |
| [45] | LSTM-NN | SHRP 2 | USA | acc = 88.00% |
| [83] | MediaPipe face mesh | Own | - | acc = 95.80% |
| [84] | MLP | Own | Germany | acc = 87.00% |
| [56] | MLP | Own | China | acc = 88.00% |
| [42] | MLP | Own | China | acc = 69.60% |
| [30] | MLP | SH-NDS | China | acc = 89.20% |
| [85] | MLP + CNN + Transformer | SF3D | USA | acc = 99.91% |
| [86] | ResNet: TSD-DLN | AUCD2 | Greece | acc = 89.50% |
| [62] | RF | NTHUDDD | Taiwan | acc = 99.00% |
| [40] | RF | Own | Greece | acc = 89.30% |
| [51] | RF | Own | Canada | acc = 91.78% |
| [46] | RF | Own | China | acc = 93.00% |
| [37] | RF | Own | India | acc = 98.00% |
| [43] | RF | SHRP 2 | USA | acc = 90.00% |
| [21] | RF | Traffic, Driving Style and Road Surface Condition | Italy | acc = 95.00% |
| [12] | RF | UAH-DriveSet | Spain | acc = 91.60% |
| [87] | RF | SPMD | USA | acc = 92.77% |
| [31] | RF | UK Car Accident 2015 | United Kingdom | acc = 99.00% |
| [65] | Sequential Covering | D.B.D. | Turkey | acc = 96.25% |
| [88] | SVM | Own | Pakistan | acc = 87.00% |
| [41] | DDPG | SPMD | USA | RMSE = 0.4254 |
| [18] | GB: LightGBM | Own | China | RMSE = 0.004 |
| [89] | GB: LightGBM | HighD | Germany | RMSE = 0.0447 |
| [20] | GB: XGBoost | Own | Germany | RMSE = 0.0463 |
| [17] | GB: XGBoost | SWITRS | USA | RMSE = 4.058 |
| [36] | LSTM | Own | Taiwan | RMSE = 0.733 |
| [90] | NB | Freeway-USA | USA | RMSD = 0.7 |
| [29] | MCS: BL + KNN + SVM + MLP | Own | Morocco | F1 = 93.56% |
| [91] | DTW | Own | India | dr = 100 |
| [53] | NMF | Own | South Korea | drs = 72.9 |
| [49] | K-Means | NGSIM | USA | TTCi = 3.1602 |
| [48] | sHDP-HMM/NPYLM-K-Means | NUDrive corpus | USA | ROC = 0.953 |
Metric abbreviations: accuracy = acc; root mean squared deviation = RMSD; F1 score = F1; root mean square error = RMSE; detection rate = dr; mean absolute percentage error = MAPE; time to collision = TTC; driving risk score = drs; area under the curve = AUC; receiver operating characteristic = ROC.

(a) Acronyms for the algorithms used in DBVAR: multi-classifier system (MCS), Bayesian learning (BL), multi-layer perceptron (MLP), EfficientNet with channel attention (EFFNet-CA), stacked denoising sparse AutoEncoders (SdsAE), hybrid CNN framework (HCF), bi-directional feature pyramid network (BiFPN), deep CNN (DCNN), genetic algorithm (GA), lightweight attention-based network (LWANet), multi-scale facial landmark detector (MSFLD), gated recurrent unit (GRU), deep deterministic policy gradient (DDPG), deep multichannel network model (DMNM), dynamic time warping (DTW), gradient boosting (GB), gradient boosting decision trees (GBDT), Gaussian mixture model (GMM), hierarchical clustering (HC), factor analysis (FA), hidden Markov model (HMM), auto-regressive input output HMM (AIO-HMM), convolutional LSTM (ConvLSTM), bi-directional ConvLSTM U-Net (BCDU-Net), unsupervised multi-modal fusion network (UMMFN), hierarchical recurrent neural network (HRNN), ensemble variant CNN (EV-CNN), non-negative matrix factorization (NMF), temporal–spatial double-line DL network (TSD-DLN), negative binomial (NB), and sticky hierarchical Dirichlet process hidden Markov model (sHDP-HMM).
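Most results in Table 10 are reported as accuracy, RMSE (equivalently RMSD), or F1. As a reading aid, the following minimal sketch shows how these three metrics are conventionally computed. The function names and the toy labels are illustrative assumptions and are not taken from any of the reviewed studies.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted numeric values."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = risky driving event, 0 = normal driving.
labels_true = [1, 0, 1, 1, 0, 0, 1, 0]
labels_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Hypothetical continuous risk scores for the regression-style metric.
scores_true = [0.2, 0.5, 0.9]
scores_pred = [0.1, 0.5, 1.0]

acc_value = accuracy(labels_true, labels_pred)
f1_value = f1_score(labels_true, labels_pred)
rmse_value = rmse(scores_true, scores_pred)
```

Accuracy and F1 are bounded between 0 and 1, whereas RMSE is expressed in the units of the predicted quantity, so RMSE values from different studies in Table 10 are not directly comparable unless the targets share a scale.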
Table 11. Studies applied in heavy vehicles and rural roads.
| Vehicle/Road | Studies |
|---|---|
| Heavy vehicles | [2,16,17,53] |
| Type of rural road | [2,40] |
Table 12. Studies applied by type of driving risk.
| DB | # | Studies |
|---|---|---|
| Lane change | 4 | [39,41,48,89] |
| Distraction | 22 | [6,45,51,57,58,61,63,64,68,70,71,72,73,74,75,76,77,80,81,82,85,86] |
| Driving style | 45 | [2,10,11,12,14,15,16,17,19,20,21,29,30,31,32,33,34,35,36,37,38,40,42,43,44,46,47,49,50,52,53,54,55,56,59,60,65,66,69,79,80,84,87,90,91] |
| Stress | 1 | [18] |
| Drowsiness | 4 | [62,67,83,88] |
Table 13. Methods used in the explainability of the DBVAR.
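| Studies | Method | Explanation | #Factors | Country |
|---|---|---|---|---|
| [15] | SHAP | DB | 20 | China |
| [89] | SHAP | Lane change | 6 | Germany |
| [18] | SHAP | Driving stress | 22 | China |
| [40] | SHAP | Accident risk | 10 | Greece |
| [17] | SHAP | Injuries in accident | 16 | USA |
| [20] | SHAP | Driving risk | 26 | Germany |
| [90] | RF features importance | Driving risk | 6 | USA |
| [21] | RF features importance | Driving style | 7 | Italy |
| [31] | RF features importance | Accident risk | 16 | United Kingdom |
| [37] | RF features importance | DB | 5 | India |
| [16] | RF features importance | DB | 15 | China |
| [87] | RF features importance | Aggressive/risk DB on horizontal curves | 23 | USA |
| [39] | GB features importance | Lane change | 7 | Germany |
| [47] | GB features importance | Driving under the influence of different substances | 36 | The Netherlands |
| [11] | GB features importance | DB | 3 | USA |
| [10] | Laplacian punctuation | DB | 14 | Pakistan |
| [51] | ExtraTrees | Driver distraction | 19 | Canada |
| [14] | Average attention weight | Aggressive DB | 8 | China |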
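SHAP, one of the explainability methods listed in Table 13, attributes a prediction to individual factors using Shapley values. The sketch below computes exact Shapley values for a deliberately small, hypothetical risk function by averaging each factor's marginal contribution over all orderings. The model `risk_score`, its coefficients, and the factor names are assumptions made for illustration only; this is not the `shap` library and not the procedure of any reviewed study, which typically rely on efficient approximations.

```python
from itertools import permutations


def risk_score(features):
    """Hypothetical toy risk model with one interaction term (assumption)."""
    return (
        0.5 * features["speed"]
        + 0.3 * features["acceleration"]
        + 0.2 * features["speed"] * features["night"]
    )


def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every ordering of the factors.

    Factors not yet "revealed" in an ordering keep their baseline value.
    The cost grows factorially, so this is only feasible for a few factors.
    """
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = instance[name]
            updated = model(current)
            # Marginal contribution of `name` given the factors added so far.
            totals[name] += updated - previous
            previous = updated
    return {name: totals[name] / len(orders) for name in names}


# Hypothetical normalized observation and an all-zero reference point.
instance = {"speed": 1.0, "acceleration": 1.0, "night": 1.0}
baseline = {"speed": 0.0, "acceleration": 0.0, "night": 0.0}

contributions = shapley_values(risk_score, instance, baseline)
prediction_gap = risk_score(instance) - risk_score(baseline)
```

In this toy case the interaction between speed and night driving is split equally between the two factors, and the contributions sum to the difference between the prediction and the baseline output, which is the property that makes Shapley-based explanations additive.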
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Lacherre, J.; Castillo-Sequera, J.L.; Mauricio, D. Factors, Prediction, and Explainability of Vehicle Accident Risk Due to Driving Behavior through Machine Learning: A Systematic Literature Review, 2013–2023. Computation 2024, 12, 131. https://doi.org/10.3390/computation12070131
