Drivers’ Behavior and Traffic Accident Analysis Using Decision Tree Method

Abdullah, Pires; Sipos, Tibor

doi:10.3390/su141811339


by Pires Abdullah ¹,²,* and Tibor Sipos ¹,³

¹ Department of Transport Technology and Economics, BME Faculty of Transportation Engineering and Vehicle Engineering, Budapest University of Technology and Economics, 1111 Budapest, Hungary

² Department of Spatial Planning, College of Spatial Planning, University of Duhok, Kurdistan Region, Duhok 42001, Iraq

³ KTI Institute for Transport Sciences, Directorate for Strategic Research and Development, 1119 Budapest, Hungary

* Author to whom correspondence should be addressed.

Sustainability 2022, 14(18), 11339; https://doi.org/10.3390/su141811339

Submission received: 9 August 2022 / Revised: 3 September 2022 / Accepted: 6 September 2022 / Published: 9 September 2022

(This article belongs to the Section Sustainable Transportation)


Abstract

This study examines the severity level of crashes by analyzing traffic accidents. Its goal is to identify the major factors contributing to traffic accidents in connection with driver behavior and socioeconomic characteristics. To find the most probable causes with respect to the target variable, which is the severity level of the crash, the study identifies the main attributes induced by the decision tree (DT) method. Local residents were interviewed using a semi-structured questionnaire with closed-ended questions. The survey asked about drivers’ attitudes and behavior, as well as other contributing factors such as the time of the accident and the road type. The attributes were analyzed with a machine-learning approach, using DT in the Python programming language. This method was able to determine the relationship between severe and non-severe crashes and other significant influencing elements. The survey was conducted among residents of Duhok city, in the Kurdistan Region of northern Iraq. The results demonstrate that the number of lanes, the time of the accident, and human attitudes, represented by adherence to the speed limit, are the primary causes of accidents with victims.

Keywords:

machine learning; decision tree classifier; severe accident analysis; social factors; road safety

1. Introduction

Around the world, many urban regions face serious problems with traffic accidents and road safety. Safety is one of the most important aspects of transportation for society [1]. According to the World Health Organization, 1.35 million people worldwide die every year as a result of traffic accidents. Additionally, 20 to 50 million people suffer non-fatal injuries that may lead to short-term or long-term illness or disability. Traffic accidents also cause significant economic losses and negatively affect society [2]. Lost productivity and medical expenses are estimated to cost the majority of nations up to 3% of their GDP [3]. In the United States of America, the cost of traffic-related incidents, such as accidents and traffic congestion, can reach up to USD 160 billion annually, and it was projected to rise to USD 192 billion by the end of 2020 [4]. If current trends continue, road traffic accidents are expected to rank as the second or third leading cause of mortality for people between the ages of 5 and 44. Furthermore, traffic accidents are predicted to become the third biggest cause of fatalities [5]. Crashes are accidents that involve the interaction of different components, such as the road, driver, vehicle, and environment [6]. Numerous studies in the area of traffic accident analysis recognize several components as having a direct impact on traffic accidents: human factors, vehicle design, physical condition, traffic conditions, geometric features of the road, and weather conditions [7]. The vehicle, the driver, and the road environment all have a reasonable relationship with each other in terms of self-driving behavior [8]. Approximately 95% of accidents are caused by human error [9]. Changing the attitudes of travelers reduces the negative impact of mobility [10]. 
According to data from the World Health Organization (WHO), using cell phones and other electronic devices while driving is the primary source of distraction. It was found that usage of mobile devices has increased by up to 11% over the last 5 to 10 years. This study noted a four-fold increase in the likelihood of a traffic collision when these devices are used. The authors stated that around 71% of road accidents are related to activities that drivers engaged in that were not connected with driving [7]. The high cost of vehicle accidents has emerged as a major concern for the development of driverless technology. According to estimates, the Google Car can reduce this cost by 90%. Furthermore, the Google Car has the potential to prevent over 2 million injuries and 30,000 fatalities, while saving nearly USD 270 billion in the United States each year [11]. Additionally, Tesla Motors was founded to create a project to accelerate the global shift to sustainable transportation, which includes self-driving capabilities [12].

Generally speaking, speed is seen as a primary factor in vehicle accidents. The characteristics of traffic vary according to changes in the speed distribution on a specific roadway over a specified period of time [13]. Researchers agree that speed-related indicators are a necessary variable for the severity dimension. However, the severity dimension has received less attention in the modeling of crash frequency [14]. The benefit of high-speed traffic flow is reduced travel time. However, this advantage may come with an increase in the number of accidents, and injuries are likely to be more severe when accidents happen at higher speeds [15]. Public authorities address this issue by reducing speed limits; in Angers (France), for example, the 30 km/h speed limit was extended throughout the urban area. A study conducted there is based on the theory of planned behavior (TPB) and the prediction of young drivers’ intentions to comply with speed limits. By projecting the results onto a decision tree, the authors were able to identify the most influential variables for predicting intentions. The interest in using a decision tree is that it makes it possible to compare self-reported intentions and expected outcomes [16]. In another study, the behavior of the frequency of risky and risk-free clusters was evaluated using the Gaussian function, and the relationship between the accident frequency rate and the difference between the posted speed limit and the operating speed was tested. It was found that the accident frequency rate is reduced by an average of 0.99 per km/h increase in the difference between the posted speed limit and the operating speed, thereby decreasing the number of accidents per length. It was concluded that drivers in safe clusters do not exceed the speed limits, while drivers in risky and unsafe clusters do [17].

Road safety and fatal accidents are significantly influenced by a number of important elements. One of the most important aspects is believed to be the motorization level [18]. Researchers studied numerous significant aspects that might affect the creation of the high-risk traffic situation. It was found that there is a considerably stronger association between high-risk traffic accidents and road geometry and traffic circumstances. The road traffic accident probability is always higher when there are poor traffic conditions and improper road infrastructure. It was confirmed by statistical data that 20–25% of accidents occur due to the poor condition of the roads. The majority of studies based on road safety are focused on external causes such as the probability of crashes, type of driver, existing conditions of the pavement surface, or searching for patterns that can explain accident causes [7]. The mobile LiDAR system (MLS), supported by an inductive reasoning process, was taken as a means to assess road safety. The study was performed based on a decision tree, which provides a potential risk assessment based on geometric parameters exclusively. It was highlighted that in future research, an extension of the categorization evaluation of road risk by DT would be desirable [19]. The performance of LiDAR sensors has been investigated under adverse weather conditions. The results demonstrate that LiDAR must overcome the challenges posed by inclement weather to ensure safety by obtaining information about dynamic objects such as pedestrians, traffic lights, and surrounding vehicles, and improve the driving safety of automated vehicles [20]. The study by [21] investigated the rate of injury and deaths of vehicle traffic accidents associated with road types based on their main function. The road type definition included functional roads, administrative roads, urban expressways, and urban general roads. 
The study shows that, among all road types, the death rate from road traffic accidents is highest on administrative roads, followed by functional roads. Moreover, the incidence of traffic accidents is 11.6 times higher on urban general roads than on urban expressways [21]. An analysis conducted using the decision tree method shows that the three most important factors in fatal injury are the driver’s seat belt usage, the light condition of the roadway, and the driver’s alcohol usage [2]. The findings of another study, conducted in Belgium, indicate significant differences between countries in road safety performance. National culture plays a large role in these discrepancies; it is strongly related to differences in wealth and prosperity between regions. In Europe, the overall percentage of individuals supporting the measures is over 70%. Thus, the social standard of people’s perspectives regarding road safety is substantially more important. This indicates a general willingness to accept policy measures that help to improve road safety [22]. To create a composite safety indicator for various countries, the safety target was investigated using principal component analysis and weighting from common factor analysis. A national safety program in line with the European policy of a 50% decline in fatalities by 2010 was marked as having an “ambitious” target. This mark (value “a”), which is the highest value, was given to many countries. However, some nations (such as Italy) reported having no national targets and were given value “c” [23].

A decision tree (DT) can be used in different fields to find out which variables affect the outcome and to build a model that adjusts accordingly. Ordinary trees consist of a root, branches, nodes, and leaves [16]. The first node is the root, from which two or more branches may grow. The last node of a chain is a leaf, and no branches grow from it. Each node represents a variable, and the branches give sets of values, which can be predicted from observation of individuals, social groups, or specific characteristics. Accidents are relatively unpredictable and infrequent. Behavioral factors in road accidents are difficult to study by traditional research methods for a number of reasons [24]. In a study in the UK, two hundred police case files on right-turning accidents were randomly selected from the records held at police headquarters for Nottinghamshire; in 100 of them the right turn was made off the main road, and in the other 100 it was made onto the main road. The machine-learning method was used to create decision trees distinguishing the characteristics of accidents that resulted in injury from those that resulted in damage only. It was found that middle-aged drivers are generally safer than either young or old drivers [24].
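The root, branch, and leaf terminology described above can be made concrete with a small sketch. The node layout, the feature names ("lanes", "exceeds_limit"), the thresholds, and the labels are all hypothetical choices made for illustration; they are not taken from the tree in this study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One node of a binary decision tree (illustrative structure only)."""
    feature: Optional[str] = None      # variable tested here; None for a leaf
    threshold: Optional[float] = None  # split value for the tested variable
    left: Optional["Node"] = None      # branch followed when value <= threshold
    right: Optional["Node"] = None     # branch followed when value > threshold
    label: Optional[str] = None        # class stored in a leaf; None otherwise


def predict(node: Node, record: dict) -> str:
    """Walk from the root along the branches until a leaf is reached."""
    while node.label is None:
        if record[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# Hypothetical tree: the root tests the number of lanes, one inner node tests speeding.
tree = Node(
    "lanes", 2.5,
    left=Node(label="non-severe"),
    right=Node(
        "exceeds_limit", 0.5,
        left=Node(label="non-severe"),
        right=Node(label="severe"),
    ),
)

print(predict(tree, {"lanes": 4, "exceeds_limit": 1}))
print(predict(tree, {"lanes": 2, "exceeds_limit": 1}))
```

The root is the only node without a parent, inner nodes hold a test on one variable, and leaves hold only a class label, which mirrors the description in the text.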

In a research project by [25], drivers from various socioeconomic backgrounds filled out a paper survey for each type of traffic sign; the survey served as the basis for a decision tree that was used to determine the most important variables affecting drivers’ comprehension. By splitting the data, the algorithm aimed to identify groups that are homogeneous with regard to the dependent variable.

The DT algorithm proceeds iteratively until the stopping level is attained. When the tree is used for classification, the splitting criteria are based on entropy and the Gini index [26]. Entropy is a measure of the inhomogeneity of the input data for the classification. Decision tree construction has three objectives: to reduce entropy (the randomness of the target variable), to be consistent with the data set, and to have the lowest number of nodes. The Gini index, which was developed by Corrado Gini in 1912, measures the degree of heterogeneity of the data. Therefore, it can be used for measuring node impurity: when this index is zero, the node is pure; when it approaches one, the node is impure [26]. The decision tree method allows classification based on crash severity and provides an alternative to parametric models, as it can identify patterns in data without the need to establish a functional relationship between variables. Unlike ordinary statistical modeling techniques such as regression models, it does not require a functional form to be specified. One of the most important advantages of the DT is that the outcomes of the analysis are easy to understand and interpret due to the graphical nature of its results. It can easily find the important variables of the model [27]. Network-level optimization was the aim of the model in a study by [28]. The traffic demand was managed through binary integer modeling, considering a fully autonomous vehicle transport system. The developed model investigated the transport processes at the vehicle level. The study’s intentions were to ensure traffic safety, as well as capacity management. According to a study by [29], a precise algorithm for predicting congestion was critical for reducing casualties. That study compared decision trees, logistic regression, and neural networks as traffic congestion prediction systems. For data processing, model training, and testing, “TensorFlow” and “Clementine Machine Learning” were used. The confusion matrix shows that the decision tree has the best prediction performance, leading the other two methods with an accuracy of 97% [29].
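As a rough illustration of the Gini index as a node impurity measure, the sketch below computes it from class counts and compares a parent node with a candidate split. The class counts are hypothetical and chosen only to make the arithmetic easy to follow. Note that with k classes the index can reach at most 1 - 1/k, so for a two-class target such as severe versus non-severe its maximum is 0.5.

```python
def gini(counts):
    """Gini impurity of a node, given the number of samples in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_gini(children):
    """Size-weighted average Gini impurity over the child nodes of a split."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * gini(child) for child in children)


# Hypothetical class counts, written as [non_severe, severe].
pure_node = [40, 0]
mixed_node = [20, 20]
split_children = [[18, 2], [2, 18]]   # a candidate split of mixed_node

print("pure node:", gini(pure_node))
print("evenly mixed node:", gini(mixed_node))
print("after split:", round(weighted_gini(split_children), 3))
print("impurity decrease:", round(gini(mixed_node) - weighted_gini(split_children), 3))
```

A split is attractive when the weighted impurity of the children is clearly lower than the impurity of the parent, which is the sense in which the tree construction tries to reduce randomness in the target variable.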

To sum up, the literature documents the negative impact of car accidents on society, the main causes of traffic accidents, and the DT-based machine-learning approaches used in the field of transportation. However, the driver’s attitude and socioeconomic aspects, in connection with other elements related to car accidents, have not been comprehensively examined in the literature across different areas. The objective of this paper is to identify the most probable causes of both severe and non-severe accidents that are related to drivers’ personal attributes and behavioral factors. As humans are the substantial component, their behavioral changes need to be the focus [30].

2. Materials and Methods

The information was gathered in the city by conducting simple random sample interviews with citizens. A total of 1172 members of the public participated. The questionnaire forms were distributed in all regions of the city to various groups of individuals, taking into account their age, gender, education level, and other factors. The percentages of participants by age and gender are shown in Figure 1a,b. Statistics from 2018 indicated that about 450,000 people were living in the city of Duhok. It was chosen as the study area for investigating traffic accidents because it is one of the cities that has seen a significant number of vehicle crashes over the years, including a distressing number of fatalities and injuries.

Table 1 shows the number of accidents, taken from a prior study, together with the related number of fatalities and injuries that were reported in the city over the course of ten years [31].

The data were analyzed in Python 3.7, which was used to manipulate and transform the data and to create charts and graphs that summarize the information collected. Data mining techniques are rarely used in transportation, but their use is increasing by the day [32]. The decision tree algorithm was used in this research to analyze the dataset. In general, DT is a supervised classification method with the following procedure: a model is built from a dataset of observations called the training set, and it is then applied to a different set of observations called the test set. The variable to be predicted (classified) is called the class variable, and the rest of the variables in the dataset are called predictive attributes or features [33]. The questions were designed to ask for general information, specific opinions, driving experiences, and driving behaviors. The questionnaire items posed detailed questions regarding general driving behavior and individual traffic accidents that occurred throughout the city. The severity of the crash is one of the topics covered in the questions. An accident was classified as non-severe if it only caused property damage, and as severe if it resulted in human injuries or deaths. All respondents’ answers were converted to numeric codes and entered into an Excel sheet, which was then saved as a CSV file. This was necessary in order to run the DT algorithm in Python. The main reason DT was chosen for this study is that the target variable under investigation is binary, i.e., the level of severity (“Severe Crashes” and “Non-Severe”), which is seen as appropriate for this method. The accuracy of the analysis obtained by the DT using the Python “sklearn” library’s “score()” method is 79.34%. This value is within the range of values obtained in other studies in which classification methods have similar objectives [27]. The score() method reports the accuracy of the assignments made by the classifier, i.e., the fraction of observations classified correctly. It takes the input and target variables as arguments. The score value for this study indicates that classifications made by the model should be correct approximately 80% of the time.
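The workflow described above (numerically coded answers, a binary severity target, a decision tree, and the "score()" accuracy) might look roughly like the following sketch. It runs on randomly generated stand-in data, not on the survey; the column names other than "No_of_Lane" and "Support_SpeedLimit_Radars", the coding ranges, the 70/30 split, the tree depth, and the noise level are all assumptions made for illustration, so the printed accuracy has no relation to the 79.34% reported in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 368  # same size as the learning data set reported in the paper; the values are synthetic

# Hypothetical numeric codes for four survey items (one column per item).
feature_names = ["No_of_Lane", "Support_SpeedLimit_Radars", "Age_Group", "Exceed_Speed_Limit"]
X = np.column_stack([
    rng.integers(1, 4, n),   # coded number of lanes
    rng.integers(1, 6, n),   # coded level of support for speed radars
    rng.integers(1, 4, n),   # coded age group
    rng.integers(0, 3, n),   # 0 = never, 1 = sometimes, 2 = always exceeds the limit
])

# Synthetic binary target: 0 = non-severe, 1 = severe, with 15% of labels flipped as noise.
y = ((X[:, 0] >= 2) & (X[:, 3] == 2)).astype(int)
flip = rng.random(n) < 0.15
y = np.where(flip, 1 - y, y)

# Assumed 70/30 split into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)
clf.fit(X_train, y_train)

# score() returns the mean accuracy: the fraction of test rows classified correctly.
accuracy = clf.score(X_test, y_test)
manual_accuracy = float(np.mean(clf.predict(X_test) == y_test))
print(f"score(): {accuracy:.2%}, computed by hand: {manual_accuracy:.2%}")
```

With real data, the synthetic arrays would be replaced by the contents of the CSV file, for example loaded into a table with one row per respondent and the severity column used as the target.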

3. Results

3.1. Road Traffic Accidents

People were asked whether they had experienced a car accident in the past several (up to 10) years, and whether they were the driver or a passenger at the time. The results show that more than 45% of the interviewees have had a traffic accident at least once; car accidents thus affect nearly half of the respondents. Figure 2a,b display the percentage of respondents who were in car accidents based on their position inside the car at the time of the accident.

3.2. Severity of the Accidents

Severity varies from one crash to another. In this study, the severity of a crash is indicated by human injuries or loss of life, as well as the physical damage to the vehicle. The results reveal that more than a quarter of the accidents are marked as severe crashes (Figure 3).

3.3. Driver Behavior

People were questioned about their driving practices to determine the likelihood that they might speed on a certain road link, that is, drive faster than the posted speed limit when no speed limit detectors are present. The possible answers were “Sometimes”, “Never”, and “Always”. The three answers have nearly the same percentage, according to Figure 4a. The respondents’ views on the local government’s placement of speed limit radars on the road vary; some express support, while others do not. According to the results, respondents who said they “Totally Support” or “Generally Support” the radars together make up a majority of more than 50% of the respondents. This suggests that half of the interviewed people wish to drive within the speed limits to avoid crashes. Figure 4b shows the exact percentages. Figure 4c demonstrates that drivers who tend to exceed the speed limits on the road have more than 30% of their accidents classified as severe. Government legislation states that all traffic violations, including exceeding speed limits, have specific consequences and legal penalties. People were questioned about their thoughts on the fines imposed by the traffic police. Figure 4d demonstrates that the response with the highest percentage is “Neither-Nor.”

3.4. The Relation of Age Groups to Vehicle Accidents

When taking into account the age of the driver, Figure 5 reveals that the age group 40–65 has the highest percentage of accidents, with 33.8 percent, followed by the 18–39 age group with 27 percent. Drivers over 65 years old account for the smallest percentage of recorded accidents.

3.5. The Decision Tree Method

Figure 6 shows the results of executing the DT method in Python. The tree graph indicates the primary contributing factors to the accidents at both severity levels. The primary factor is positioned at the top of the tree hierarchy, while the secondary factors are positioned lower down.
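For readers who want to see how such a hierarchy can be inspected in code, the sketch below fits a tree on a handful of hand-made records and prints it as text with scikit-learn's "export_text". The eight records and the second feature name are invented for illustration and are not the survey data; the paper does not state how Figure 6 itself was rendered.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy records: [number of lanes, exceeds speed limit (0/1)]. Not the survey data.
X = [[2, 0], [2, 0], [2, 1], [4, 0], [4, 1], [4, 1], [3, 1], [2, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 0]  # 0 = non-severe, 1 = severe

clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

# The first line of the text report is the root test, i.e. the most informative variable.
report = export_text(clf, feature_names=["No_of_Lane", "Exceed_Speed_Limit"])
print(report)
```

In this toy data the lane count separates the classes best, so it appears at the root and the speeding variable appears one level below it, which is the same reading rule applied to Figure 6.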

4. Discussion

In the current paper, the effect of drivers’ behavior on traffic accident types is studied. The variables listed in Table 2 are part of the collected dataset. A decision tree was built using the numerically coded data to evaluate the variables that influence the severity of traffic accidents.

Figure 6 presents several results of the DT analysis. The variable “No_of_Lane”, which is the tree’s root, comes first at the top of the tree, indicating that it is the most important element in classifying the observations. The branches to the left are for accidents on roads with a lower number of lanes. Each root and intermediate node contains the decision factor, the entropy, and the number of respondents who fit the criterion at that point in the tree. For example, the root node indicates that 368 observations make up the learning data set. Those are “drivers” who have been in “traffic accidents,” of which 272 are non-severe and 96 are severe. At the next level, we can see “Support_SpeedLimit_Radars”; it indicates that the majority of the 180 people who are less supportive had been in accidents on roads with more than two lanes, such as highways and major arterials. On the left, it can be observed that 188 drivers were involved in accidents that happened on two-lane roads. On the third level, at the far right, drivers from the higher age group are in the node reached by the right arrow (false direction) of the upper node. This means that the older drivers are more supportive of the speed limits. However, of its 156 samples, 68 are marked as severe crashes. Furthermore, out of 20 samples of the oldest driver group, 12 exceed the speed limit, and all accidents are classified as severe, with 0 entropy. Additionally, the tree shows that the older groups have the majority of their accidents on weekends, while the younger groups have accidents on weekdays. At the end, the leaf nodes for the intermediate nodes indicate that middle-aged drivers have accidents on multiple roadway types in daytime hours, all of which are non-severe accidents. 
On the other side, to the far left of the tree, it shows that younger drivers have accidents on two-lane roadways in the afternoon and evening and at night, with a majority of non-severe accidents. It is also observed that female drivers are more likely to be involved in car accidents on two-lane roads at night.

Finally, the elements in the value array show the severity level. The first value is the number of non-severe crashes, and the second is the number of severe crashes for each criterion. Out of the gathered data, the root node reveals that 272 people experienced non-severe accidents and 96 had severe accidents.

Entropy is the measure of noise in the decision, and noise can be viewed as uncertainty. For example, in nodes where the two counts in the severity value array are equal, the entropy is at its highest value, which is 1.0. This means that the model is unable to make a definitive classification decision based on the input variables. For nodes with very low entropy, the decision is much more clear-cut, and the difference between the numbers of severe and non-severe crashes is much larger.
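The entropy values discussed here can be checked with a few lines of code. The sketch below uses the base-2 Shannon entropy over class counts, which is the usual definition behind the "entropy" criterion; the root counts [272, 96] come from the text, the evenly split node is a hypothetical example, and the pure node assumes that the 12-sample group mentioned in the discussion contains only severe crashes.

```python
from math import log2


def entropy(counts):
    """Base-2 Shannon entropy of a node, given the number of samples in each class."""
    total = sum(counts)
    # Empty classes contribute nothing, so they are skipped to avoid log2(0).
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)


# Value arrays are written as [non_severe, severe].
root_node = [272, 96]    # counts reported for the root node
balanced_node = [10, 10] # hypothetical node with equal counts
pure_node = [0, 12]      # assumed reading of the 12-sample node with 0 entropy

print("root:", round(entropy(root_node), 3))       # about 0.828
print("balanced:", entropy(balanced_node))
print("pure:", entropy(pure_node))
```

The root is fairly mixed but not maximally uncertain, since about three quarters of its samples are non-severe, whereas the balanced node reaches the maximum of 1.0 and the pure node drops to 0.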

Like many other studies, this one has a number of limitations. The data collection phase was the most challenging. Respondents were reluctant to accept the questionnaire or to give truthful answers regarding their driving habits and attitudes. As previously stated in this paper, a severe accident is one that results in loss of life or injury. A further limitation therefore relates to the section on severe accidents: only those who survived, with or without injuries, could respond to the questions. The respondent might have been either the driver or a passenger, and such respondents are the only ones able to report on the fatalities or injuries in an accident.

5. Conclusions

The findings of this study reveal that over 30% of drivers who tend to drive faster than the speed limit have been involved in crashes. Nearly 70% of those who had been in at least one accident were drivers, while 30% were passengers. About 65% of the public respondents believe that it is normal to drive faster than the posted speed limit. Older drivers are involved in fewer traffic accidents in general, but the DT shows that a significant percentage of their accidents are reported as severe. The type of road and the attitude toward fast driving, which is related to the habit of speeding on the road, are the main factors in traffic accidents in general. Over 50% of the respondents are not satisfied with the current legal penalties imposed by the local authorities of the city. The decision tree method, implemented in Python, is a simple and effective technique for analyzing transportation research data efficiently. The accuracy of the results presented in this paper is acceptable. A notable aspect of this method is that the output is displayed as a graph that contains all the key variables that have to be examined, which makes the outcomes clearer and easier to notice. The attributes are placed at different branch levels in the hierarchy of the DT output according to their importance. The graph displays the attitudes and the socioeconomic variables of the drivers at different levels of the tree. Finally, the variables or attributes that yield the most information, that is, those that result in the greatest reduction in entropy, are chosen to represent the influential elements of the target variable. This paper concludes that younger drivers are those who violate the speed limit. Most of the severe accidents happen on multi-lane roadways on weekends. Younger drivers are involved in accidents on two-lane roadways during the day, while older drivers are involved in accidents at night. 
Older drivers cause fewer crashes; however, nearly 75% of them are severe accidents.

Author Contributions

Conceptualization, P.A. and T.S.; methodology, P.A. and T.S.; software, P.A.; validation, P.A. and T.S.; formal analysis, P.A.; investigation, P.A.; resources, P.A. and T.S.; data curation, P.A.; writing—original draft preparation, P.A. and T.S.; writing—review and editing, P.A. and T.S.; visualization, P.A. and T.S. All authors have read and agreed to the published version of the manuscript.

Funding

The research was supported by OTKA-K20-134760-Heterogeneity in user preferences and its impact on transport project appraisal led by Adam TOROK and by OTKA-K21-138053- Life Cycle Sustainability Assessment of road transport technologies and interventions by Mária Szalmáné Csete.

Data Availability Statement

All data used during the study are presented in the paper.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Turoń, K.; Czech, P.; Tóth, J. Safety and Security Aspects in Shared Mobility Systems. Sci. J. Silesian. Univ. Technol. Ser. Transp. 2019, 104, 169–175.
2. Chong, M.M.; Abraham, A.; Paprzycki, M. Traffic Accident Analysis Using Decision Trees and Neural Networks. arXiv 2004, arXiv:cs/0405050.
3. Bertoli, P.; Grembi, V. The political cycle of road traffic accidents. J. Health. Econ. 2021, 76, 102435.
4. Ali, F.; Ali, A.; Imran, M.; Naqvi, R.A.; Siddiqi, M.H.; Kwak, K.-S. Traffic accident detection and condition analysis based on social networking data. Accid. Anal. Prev. 2021, 151, 105973.
5. Konkor, I. Examining the relationship between transportation mode and the experience of road traffic accident in the upper west region of Ghana. Case Stud. Transp. Policy 2021, 9, 715–722.
6. Martín, L.; Baena, L.; Garach, L.; López, G.; de Oña, J. Using Data Mining Techniques to Road Safety Improvement in Spanish Roads. Procedia Soc. Behav. Sci. 2014, 160, 607–614.
7. Chand, A.; Jayesh, S.; Bhasi, A. Road traffic accidents: An overview of data sources, analysis techniques and contributing factors. Mater. Today Proc. 2021, 47, 5135–5141.
8. Ismael, K.; Razzaq, N.A. Traffic Accidents Analysis on Dry and Wet Road Bends Surfaces in Greater Manchester-UK. Kurd. J. Appl. Res. 2017, 2, 284–291.
9. Csiszar, C.; Foldes, D. System Model for Autonomous Road Freight Transportation. Promet 2018, 30, 93–103.
10. Esztergár-Kiss, D.; Rózsa, Z.; Tettamanti, T. An activity chain optimization method with comparison of test cases for different transportation modes. Transp. A Transp. Sci. 2019, 16, 293–315.
11. Poczter, S.L.; Jankovic, L.M. The Google Car: Driving Toward A Better Future? J. Bus. Case Stud. 2013, 10, 7–14.
12. Naor, M.; Coman, A.; Wiznizer, A. Vertically Integrated Supply Chain of Batteries, Electric Vehicles, and Charging Infrastructure: A Review of Three Milestone Projects from Theory of Constraints Perspective. Sustainability 2021, 13, 3632.
13. Zefreh, M.; Torok, A. Theoretical Comparison of the Effects of Different Traffic Conditions on Urban Road Environmental External Costs. Sustainability 2021, 13, 3541.
14. Borsos, A. Application of Bivariate Extreme Value models to describe the joint behavior of temporal and speed related surrogate measures of safety. Accid. Anal. Prev. 2021, 159, 106274.
15. Aljanahi, A.; Rhodes, A.; Metcalfe, A. Speed, speed limits and road traffic accidents under free flow conditions. Accid. Anal. Prev. 1998, 31, 161–168.
16. Bordarie, J. Predicting intentions to comply with speed limits using a ‘decision tree’ applied to an extended version of the theory of planned behaviour. Transp. Res. Part F Traffic Psychol. Behav. 2019, 63, 174–185.
17. Afandizadeh, S.; Hassanpour, S. Evaluating the Effect of Roadway and Development Factors on the Rural Road Safety Risk Index. Adv. Civ. Eng. 2020, 2020, 7820565.
18. Török, Á. Comparative Analysis between the Theories of Road Transport Safety and Emission. Transport 2015, 32, 192–197.
19. Martín-Jiménez, J.A.; Zazo, S.; Justel, J.J.A.; Rodríguez-Gonzálvez, P.; González-Aguilera, D. Road safety evaluation through automatic extraction of road horizontal alignments from Mobile LiDAR System and inductive reasoning based on a decision tree. ISPRS J. Photogramm. Remote Sens. 2018, 146, 334–346.
20. Kim, J.; Park, B.-J.; Roh, C.-G.; Kim, Y. Performance of Mobile LiDAR in Real Road Driving Conditions. Sensors 2021, 21, 7461.
21. Sun, T.-J.; Liu, S.-J.; Xie, F.-K.; Huang, X.-F.; Tao, J.-X.; Lu, Y.-L.; Zhang, T.-X.; Yu, A.-Y. Influence of road types on road traffic accidents in northern Guizhou Province, China. Chin. J. Traumatol. 2020, 24, 34–38.
22. Berghe, W.V.D.; Schachner, M.; Sgarra, V.; Christie, N. The association between national culture, road safety performance and support for policy measures. IATSS Res. 2020, 44, 197–211.
23. Gitelman, V.; Doveh, E.; Hakkert, S. Designing a composite indicator for road safety. Saf. Sci. 2010, 48, 1212–1224.
24. Clarke, D.D.; Forsyth, R.; Wright, R. Machine learning in road accident research: Decision trees describing road accidents during cross-flow turns. Ergonomics 1998, 41, 1060–1079.
25. Taamneh, M. Investigating the role of socio-economic factors in comprehension of traffic signs using decision tree algorithm. J. Saf. Res. 2018, 66, 121–129.
26. Figueira, A.D.C.; Pitombo, C.S.; Oliveira, D.; Paulo, T.M.S.; Larocca, A.P.C. Identification of rules induced through decision tree algorithm for detection of traffic accidents with victims: A study case from Brazil. Case Stud. Transp. Policy 2017, 5, 200–207.
27. Griselda, L.; Juan, D.O.; Joaquín, A. Using Decision Trees to Extract Decision Rules from Police Reports on Road Accidents. Procedia Soc. Behav. Sci. 2012, 53, 106–114.
Pauer, G.; Török, Á. Binary integer modeling of the traffic flow optimization problem, in the case of an autonomous transportation system. Oper. Res. Lett. 2020, 49, 136–143. [Google Scholar] [CrossRef]
Tamir, T.S.; Xiong, G.; Li, Z.; Tao, H.; Shen, Z.; Hu, B.; Menkir, H.M. Traffic Congestion Prediction using Decision Tree, Logistic Regression and Neural Networks. IFAC-PapersOnLine 2020, 53, 512–517. [Google Scholar] [CrossRef]
Földes, D.; Csiszár, C. Conception of future integrated smart mobility. In Proceedings of the 2016 Smart Cities Symposium Prague (SCSP), Prague, Czech Republic, 26–27 May 2016; pp. 1–6. [Google Scholar] [CrossRef]
Abdullah, P.H.; Perschon, J.; Ameen, A.M. The Relationship between Car Dependency and Use of Public Transport in Duhok City-Barriers Analysis and Recommendations. J. Univ. Duhok 2020, 23, 59–68. [Google Scholar] [CrossRef]
Gaál, B.; Horváth, B. System Approach for Strategic Planning in Transport. Acta Tech. Jaurinensis 2017, 10, 13. [Google Scholar] [CrossRef]
Abellán, J.; López, G.; de Oña, J. Analysis of traffic accident severity using Decision Rules via Decision Trees. Expert Syst. Appl. 2013, 40, 6047–6054. [Google Scholar] [CrossRef] [Green Version]

Figure 1. (a) Survey participants by gender; (b) survey participants by age.

Figure 2. (a) Respondents who had been in a road traffic accident; (b) position of the respondents who had been in a car accident.

Figure 3. Percentage of severe crashes.

Figure 4. (a) Respondents’ answers about exceeding speed limits; (b) respondents’ opinions on the availability of speed limit radars on the roads; (c) accident types, by percentage, among drivers who report always exceeding speed limits; (d) respondents’ satisfaction with the legal penalties for traffic violations.

Figure 5. Drivers involved in accidents, by age group.

Figure 6. Python output displaying the DT method’s graph.

Table 1. Number of vehicle accidents and casualties in Duhok city [31].

Year	No. of Accidents	No. of Dead People	No. of Injured People
2010	632	88	754
2011	707	83	691
2012	629	129	722
2013	1319	260	4290
2014	1132	189	4213
2015	1220	172	3967
2016	1177	205	4003
2017	1152	145	1658
2018	1091	117	1071
2019	1002	98	1102

Table 2. Dataset variable names and their numerical encoding.

Variable	Description
Age	Age groups (0 = 1st; 1 = 2nd; 2 = 3rd; 3 = 4th)
Gender	Male = 0; female = 1
Support_SpeedLimit_Radars	Totally unsupported = 0; … totally supported = 4
Had_a_RoadAccident	No = 0; yes = 1
Position	Passenger = 0; driver = 1
Time_of_accident	Time of the accident (morning = 0; … night = 3)
Day_ofTheWeek	Weekend = 0; weekday = 1
No_of_Lane	Number of lanes (two = 0; more than two (Mult) = 1)
Severity	Physical damage only = 0; human injuries/loss = 1
Exceeding_Speed_limit	Never = 0; sometimes = 1; always = 2
SatisfiedWithLegalPanalties	Very unsatisfied = 0; … very satisfied = 4
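
The encoding in Table 2 can be illustrated with a minimal Python sketch. It is not the authors' code: the helper name `encode_response` and the sample response are hypothetical, and the label strings are taken from the table. Variables whose intermediate levels are elided in Table 2 (or given only as ordinal groups) are accepted as integer codes and range-checked rather than mapped from labels.

```python
# Label-to-code mappings copied from Table 2. Only variables whose levels are
# fully listed in the table are mapped by label.
CATEGORICAL_CODES = {
    "Gender": {"male": 0, "female": 1},
    "Had_a_RoadAccident": {"no": 0, "yes": 1},
    "Position": {"passenger": 0, "driver": 1},
    "Day_ofTheWeek": {"weekend": 0, "weekday": 1},
    "No_of_Lane": {"two": 0, "mult": 1},
    "Severity": {"physical damage only": 0, "human injuries/loss": 1},
    "Exceeding_Speed_limit": {"never": 0, "sometimes": 1, "always": 2},
}

# Variables given in Table 2 only as an ordinal range; the intermediate
# labels are not listed there, so integer codes are expected as input.
ORDINAL_RANGES = {
    "Age": (0, 3),
    "Support_SpeedLimit_Radars": (0, 4),
    "Time_of_accident": (0, 3),
    "SatisfiedWithLegalPanalties": (0, 4),
}


def encode_response(response):
    """Return integer codes for one survey response (hypothetical helper)."""
    encoded = {}
    for name, value in response.items():
        if name in CATEGORICAL_CODES:
            key = str(value).strip().lower()
            if key not in CATEGORICAL_CODES[name]:
                raise ValueError(f"unknown level {value!r} for {name}")
            encoded[name] = CATEGORICAL_CODES[name][key]
        elif name in ORDINAL_RANGES:
            low, high = ORDINAL_RANGES[name]
            code = int(value)
            if not low <= code <= high:
                raise ValueError(f"{name} must be in {low}..{high}, got {code}")
            encoded[name] = code
        else:
            raise KeyError(f"variable not in Table 2: {name}")
    return encoded


# Hypothetical response, for illustration only (not survey data).
sample = {
    "Gender": "Female",
    "Position": "driver",
    "No_of_Lane": "Mult",
    "Exceeding_Speed_limit": "always",
    "Time_of_accident": 3,
    "Severity": "physical damage only",
}
print(encode_response(sample))
```

Rows encoded this way are all-integer and could be passed to a decision tree learner, with `Severity` as the target variable and the remaining columns as attributes.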

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

