Article

Interpretable Dynamic Ensemble Selection Approach for the Prediction of Road Traffic Injury Severity: A Case Study of Pakistan’s National Highway N-5

1 The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, 4800 Cao’an Highway, Jiading District, Shanghai 201804, China
2 Department of Civil Engineering, College of Engineering, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
3 Department of Civil and Resource Engineering, Technical University of Kenya, P.O. Box 52428-00200, Haile Sellasie Avenue, Nairobi 00200, Kenya
4 Department of Civil and Construction Engineering, University of Nairobi, P.O. Box 30197-00100, Harry Thuku Road, Nairobi 00625, Kenya
* Author to whom correspondence should be addressed.
Sustainability 2022, 14(19), 12340; https://doi.org/10.3390/su141912340
Submission received: 23 July 2022 / Revised: 8 September 2022 / Accepted: 22 September 2022 / Published: 28 September 2022
(This article belongs to the Special Issue Sustainable Transportation and Road Safety)

Abstract

Road traffic accidents are among the top ten major causes of fatalities in the world, taking millions of lives annually. Machine-learning ensemble classifiers have been frequently used for the prediction of traffic injury severity. However, their inability to explain complex models due to their “black-box” nature may lead to unrealistic traffic safety judgments. First, in this research, we propose three state-of-the-art Dynamic Ensemble Selection (DES) algorithms, including Meta-Learning for Dynamic Ensemble Selection (META-DES), K-Nearest Oracle Elimination (KNORAE), and Dynamic Ensemble Selection Performance (DES-P), with Random Forest (RF), Adaptive Boosting (AdaBoost), Classification and Regression Tree (CART), and Binary Logistic Regression (BLR) as the base learners. When making a prediction, the DES algorithm automatically chooses the subset of classifiers most likely to perform well for each new test instance, making it more efficient and flexible. The META-DES model using RF as the base learner outperforms the other models, with an accuracy of 75%, recall of 69%, precision of 71%, and F1-score of 72%. Afterwards, the risk factors are analyzed with SHapley Additive exPlanations (SHAP). The driver’s age, month of the year, day of the week, and vehicle type influence the SHAP estimation the most. Young drivers are at a heightened risk of fatal accidents. Weekends and summer months see the most fatal injuries. The proposed novel META-DES-RF algorithm with SHAP for predicting injury severity may be of interest to traffic safety researchers.

1. Introduction

The World Health Organization’s (WHO) Global Status Report on Road Safety indicates that road traffic accidents cause between 20 and 50 million non-fatal injuries and close to 1.35 million deaths annually, with significant economic impacts. The number of traffic accidents is expected to increase even further over the next decade, and road traffic crashes will become the seventh leading cause of mortality by the year 2030 unless effective accident mitigation measures are taken [1,2]. High rates of road-related injuries and fatalities are prevalent in developing countries such as Pakistan [3,4], Saudi Arabia [5,6,7], and Bangladesh [8,9]. Globally, road traffic crashes are the eighth leading cause of mortality and the leading cause of death among those aged 15 to 29, and this burden falls disproportionately on developing countries, which is an alarming problem [1]. As a result of road traffic accidents, society as a whole experiences a significant loss of productivity, and the families of victims are subjected to substantially increased financial hardship, physical stress, and mental anguish. To improve road design and reduce the number of accidents, it is necessary to identify and quantify the major crash-causing factors. Improving road safety and planning interventions to reduce the frequency and mortality associated with road trauma requires a comprehensive understanding of the chain of events leading to road traffic injuries.
Pakistan, a developing country with a population of approximately 207.774 million people, is among the ten most populous countries in the world. Pakistan’s economy has been expanding at a moderately slow rate over the past few years, according to the Social Indicators of Pakistan (SIP), whereas the country’s road network has been expanding rapidly [10,11]. The total length of roads increased from 251,661 km in 2002 to 264,212 km in 2015. Furthermore, the number of motorized vehicles on the road increased significantly, from 5.3 million in 2002 to 11 million in 2012 [12]. Road traffic accidents on Pakistani roads are increasing rapidly, which can be attributed to the country’s rapidly expanding automobile industry. In 2017, the Punjab Emergency Services (PES) received reports of 9648 traffic accidents, including 1753 fatalities. Because these numbers are frequently understated, the actual figures may be higher [13,14]. In order to reduce the number of traffic accidents, it is essential to identify their root causes.
Researchers have spent many years attempting to establish a correlation between the severity of crash injuries and risk factors, such as highway geometry, vehicle type and age, month, crash type, traffic laws, driver-specific characteristics, weather conditions, and temporal factors, using various statistical approaches, such as the Ordered Probit Model [15,16,17], the Mixed Logit Model [18,19,20], and the Random Parameter Ordered Probit Model [21,22,23]. These statistical approaches are rigorous and have clearly defined functional forms, but they necessitate stringent assumptions regarding the relationship between risk factors and the target variable. If these assumptions are violated, the statistical model outputs may be inaccurate. Similarly, integrating modern data sources typically results in massive, high-dimensional datasets that are difficult to evaluate using traditional statistical modelling techniques. In contrast, machine learning, a data mining technique and branch of artificial intelligence, is highly adaptive and requires few or no assumptions about big data [24,25,26]. Healthcare, finance, education, and other data-intensive industries have already experienced the benefits of machine learning [27,28,29,30,31,32,33,34]. The field of traffic safety research has also shown considerable interest in machine-learning strategies. It has emerged as one of the most popular and intriguing approaches to assessing the injury severity caused in road traffic accidents [35,36,37,38,39,40,41].
Prior to the development of ensemble learning strategies, researchers focused primarily on individual machine-learning methods. Ensemble learning has demonstrated superior performance over a single machine-learning strategy in a variety of real-world applications [42,43,44]. Bagging, boosting, and stacking are the three primary concepts of ensemble learning, which encompasses techniques and strategies for model fusion. The primary objective of ensemble learning is to combine the merits of multiple models in order to draw a conclusion. In a binary classification problem, a dataset containing multiple factors or characteristics for each instance is provided. One of these factors, the decision label, is categorical and identifies the class to which each instance belongs. Classification strategies aim to develop classification models in order to predict and categorize the decision label for a particular instance. The two broad classification strategies are dynamic and static. Figure 1 illustrates the distinctions between ensemble selection and classification model selection methods for both static and dynamic classification strategies [45,46]. The primary distinction between dynamic and static classification strategies is whether or not all test instances are predicted using the same classification model. Similarly, the primary distinction between ensemble classifier selection and single classifier selection is whether multiple base classification models or a single classification model is used to predict test instances; their various combinations give rise to a variety of classification strategies. Different classification models perform better in different contexts; therefore, a static classification strategy does not necessarily achieve the best performance for every instance.
To develop an accurate classification prediction model for the severity of road traffic injuries, we propose three state-of-the-art DES algorithms, including Meta-Learning for Dynamic Ensemble Selection (META-DES) [47], K-Nearest Oracle Elimination (KNORAE) [48], and Dynamic Ensemble Selection Performance (DES-P) [49]. To the best of our knowledge, no study utilizing DES algorithms has been carried out in the area of traffic safety. In recent years, DES has become a popular research topic in multi-classifier systems. In this paradigm, for each query instance to be classified, one or more base classification models are selected. These strategies have proven to be superior to conventional (static) algorithms, such as boosting or majority voting. The DES algorithm functions by estimating the competence level of each classification model in a pool of classification models. To predict the label of a particular test instance, only the most competent ensemble of classification models from the pool is chosen. Not every classification model in the pool is skilled at categorizing all unknown instances, but each base classification model is skilled in a different local region of the feature space.
Furthermore, it is essential to note that machine-learning strategies lacking interpretability and comprehension are known as “black-box” algorithms [50]. Therefore, in addition to the classification and prediction of injury severity by the proposed DES algorithms, a post hoc explanation is carried out by employing the SHapley Additive exPlanations (SHAP) framework [51]. This provides a better understanding of the impact of various risk factors on injury severity. The remainder of this paper is organized as follows: The next section presents the research methodology, which includes a description of the route and data, the static ensemble classifiers, the proposed DES algorithms, and the SHAP interpretation system. Section 3 discusses the DES algorithm results and their comparison with other models, as well as the interpretation of the DES algorithms by SHAP analysis. In Section 4, conclusions and recommendations are made.

2. Materials and Methods

For the prediction and classification of injury severity in road traffic accidents, this study proposes state-of-the-art DES algorithms using various base learners. For model development, the National Highway N-5 (Peshawar–Rahim Yar Khan Section) crash data were acquired from the NH & MP, Pakistan. The data were initially preprocessed for missing values and then divided into training–validation and testing datasets. In dynamic ensemble selection modeling, the pure validation dataset is referred to as the dynamic selection dataset; the dynamic selection approach chooses the base learners from this partition. As our base ensemble learners, we employed Binary Logistic Regression, Adaptive Boosting, Classification and Regression Trees, and Random Forest. The pure training dataset was used for fitting the classification models in the pool, whereas the pure validation dataset was used for fitting the dynamic selection models as well as hyperparameter tuning. The testing dataset was used to assess the performance of the developed models. Following that, SHAP analysis was conducted for model interpretation, and important risk factors were identified and evaluated. Figure 2 shows the entire framework of this study.
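A minimal sketch of this workflow is given below, assuming the crash records are already encoded in a pandas DataFrame df with a binary Injury_Severity target (1 = fatal); the column name, split sizes, and random seeds are illustrative rather than taken from the study’s code, and the DESlib package is assumed to accept a fitted scikit-learn ensemble as the classifier pool.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from deslib.des import METADES, KNORAE, DESP   # pip install deslib

X = df.drop(columns="Injury_Severity").values   # df is an assumed, preprocessed DataFrame
y = df["Injury_Severity"].values

# 80/20 split into training-validation and testing data
X_trval, X_test, y_trval, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=42)
# carve the DSEL (pure validation) partition out of the training-validation data
X_train, X_dsel, y_train, y_dsel = train_test_split(
    X_trval, y_trval, test_size=0.15, stratify=y_trval, random_state=42)

# pool of base classifiers: a Random Forest fitted on the pure training partition
pool_rf = RandomForestClassifier(n_estimators=933, max_depth=6, random_state=42)
pool_rf.fit(X_train, y_train)

# dynamic ensemble selection models fitted on the DSEL partition
for name, des in [("META-DES-RF", METADES(pool_rf)),
                  ("KNORAE-RF", KNORAE(pool_rf)),
                  ("DES-P-RF", DESP(pool_rf))]:
    des.fit(X_dsel, y_dsel)
    print(name, "test accuracy:", des.score(X_test, y_test))
```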

2.1. Study Route

The data for this study come from records of traffic accidents that occurred between 2015 and 2019 on the Peshawar to Rahim Yar Khan section of National Highway N-5. Figure 3 shows the two-way divided, two-lane N-5, which links Torkham in Khyber Pakhtunkhwa with Karachi in Sindh, Pakistan. At 2108 km, it is the longest road in Pakistan. It passes through the provinces of Khyber Pakhtunkhwa, Sindh, and Punjab, connecting major towns and cities along its course. This route carries the great majority of the freight from the port of Karachi to the upcountry cities. The posted speed limit for light transport vehicles (LTVs), which include vans, pickup trucks, and passenger cars, is 100 kmph. The posted speed limit is 90 kmph for heavy transport vehicles such as buses, lorries, and other trucks.

2.2. Data Description

The task of maintaining the security and safety of the country’s highways falls to Pakistan’s National Highways and Motorway Police (NH & MP). A crash investigation form (CIF) is completed at each accident scene by this department in order to keep track of traffic collisions. A four-page “Microcomputer Accident Analysis Proforma” (MAAP) document is used by the NH & MP to maintain crash-related data. The MAAP logs each accident on the N-5 based on twenty-two risk factors, including the environmental conditions, the type of crash, the condition of the road, the temporal characteristics, the vehicle type, and the characteristics of drivers. The MAAP also divides injuries into four categories: property damage only or no injury, minor injury, major injury, and fatal injury. In this study, we categorized the property damage only/no injuries, minor, and major injuries as “non-fatal injuries”, and all other injuries as “fatal injuries” to develop a binary classification problem. The descriptions and frequency of occurrences of these risk factors are shown in Table 1.
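For illustration, the recoding of the four MAAP injury classes into the binary target used in this study might look like the sketch below; the file name, column name, and category labels are assumptions, not the actual CIF/MAAP field names.

```python
import pandas as pd

# hypothetical export of the MAAP records to CSV
df = pd.read_csv("n5_crashes_2015_2019.csv")

# property damage only, minor, and major injuries -> non-fatal (0); fatal injury -> fatal (1)
fatal_map = {"Property damage only": 0, "Minor injury": 0,
             "Major injury": 0, "Fatal injury": 1}
df["Injury_Severity"] = df["MAAP_Injury_Class"].map(fatal_map)   # assumed column name
```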

2.3. Static Ensemble Classification Models

In this research, we consider three static ensembles of classification models to use as base learners in the DES algorithms, namely RF, AdaBoost, and CART. A pool of other classification models could also be considered. These ensemble classification models were chosen because of their superior performance across various domains in recent studies, as summarized in Table 2.

2.3.1. Adaptive Boosting (AdaBoost)

Freund and Schapire [57] developed AdaBoost as a machine-learning ensemble classifier. It is an iterative method that creates a robust classifier by combining several low-performing (weak) classifiers, resulting in a classifier with high accuracy. The fundamental principle of this classifier is to set the weights of the training instances and retrain the weak learner in each iteration so that observations misclassified in earlier rounds receive more attention. As a base classifier, any machine-learning algorithm that accepts weights on the training set can be used. Algorithm 1 explains how AdaBoost works.
Algorithm 1: Adaptive Boosting (AdaBoost)
1: Input: training data $D = \{(X_k, Y_k)\}_{k=1}^{M}$ and a weak learner $\ell$
2: Initialize the weight distribution $\omega_1(k) = 1/M$
3: for $t = 1$ to $T$ do
4:   Train the weak learner $\ell_t: X \rightarrow \mathbb{R}$ using the weight distribution $\omega_t$
5:   Compute the weight $\psi_t$ of $\ell_t$
6:   Update the weight distribution over the training dataset:
     $\omega_{t+1}(k) = \dfrac{\omega_t(k)\,\exp\!\left(-\psi_t\, Y_k\, \ell_t(X_k)\right)}{\Omega_t}$
     Here, $\Omega_t$ is a normalization factor, selected such that $\omega_{t+1}$ will be a distribution.
7: end for
8: Return $f(X) = \sum_{t=1}^{T} \psi_t\, \ell_t(X)$ and $H(X) = \mathrm{sign}\!\left(f(X)\right)$
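For illustration, a scikit-learn counterpart of Algorithm 1 on synthetic data is sketched below; the weak learner (a depth-one decision stump) and the hyperparameter values are placeholders rather than the study’s tuned settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# synthetic stand-in for the crash data (roughly 62%/38% class balance)
X, y = make_classification(n_samples=1500, n_features=20,
                           weights=[0.62, 0.38], random_state=0)

ada = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),  # weak learner
                         n_estimators=200, learning_rate=0.1, random_state=0)
ada.fit(X, y)
print("training accuracy:", ada.score(X, y))
```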

2.3.2. Random Forest (RF)

RF was developed by Breiman [58] as a bagging ensemble classifier that simultaneously trains multiple Classification and Regression Trees with bootstrapping and aggregation, also known as bagging (Figure 4). Through bootstrapping, numerous independent Classification and Regression Trees are trained in parallel on distinct subsets of the training dataset and distinct subsets of the available factors. Bootstrapping ensures that each Classification and Regression Tree of the Random Forest is distinct, thereby reducing the overall variance of the Random Forest. The classifier aggregates the votes of the individual trees to reach a final decision; consequently, the RF classifier demonstrates superior generalization.
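The same idea can be sketched briefly in scikit-learn; the tuned values from Table 3 are reused here only for illustration, and the synthetic data are again a stand-in for the crash records.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1500, n_features=20,
                           weights=[0.62, 0.38], random_state=0)

rf = RandomForestClassifier(n_estimators=933, max_depth=6,        # tuned values from Table 3
                            bootstrap=True, max_features="sqrt",  # bootstrap samples + random factor subsets
                            random_state=0)
rf.fit(X, y)
```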

2.3.3. Classification and Regression Tree (CART)

CART is a decision-tree learning technique that classifies data using the Gini index [59]. As illustrated in Figure 5, it employs repeated binary splitting of parent nodes in the tree based on a decision rule until all child nodes are homogeneous. The tree-growing process entails repeated partitioning of the target variable in order to make the terminal nodes as pure as possible. As shown in Equation (1), the Gini index is a measure of the impurity of a particular node in a tree. The lower the Gini index, the purer the dataset and the more accurate the classification.
$$\mathrm{Gini}(X) = 1 - \sum_{\alpha} \Theta_{\alpha}^{2} \qquad (1)$$
Here, $X$ is the crash training dataset, and $\Theta_{\alpha}$ is the probability that category $\alpha$ appears in $X$.
CART continues tree growth until homogeneous nodes are obtained. As a result, tree pruning is necessary to avoid an overly complex maximal tree. The branches that are irrelevant to the classifier are eliminated. The tree-pruning technique entails removing branches from the largest tree, resulting in smaller and simpler trees. Smaller trees, however, result in a higher rate of misclassification errors. Equation (2) gives the misclassification error rate.
$$\mathrm{ER} = \Theta_{\alpha}\,\mathrm{Gini}(X) \qquad (2)$$
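A toy computation of the Gini index in Equation (1) is sketched below, assuming node impurity is evaluated from the class labels that fall into a candidate node; the function and example labels are purely illustrative.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a node, computed from its class label counts."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # class proportions in the node
    return 1.0 - np.sum(p ** 2)

print(gini([0, 0, 1, 1]))   # 0.5 -> maximally impure binary node
print(gini([1, 1, 1, 1]))   # 0.0 -> pure node
```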

2.4. Dynamic Ensemble Selection (DES) Algorithms

As discussed earlier, in order to develop an accurate classification and prediction model for the road traffic injury severity, we propose three state-of-the-art DES algorithms including Meta-Learning for Dynamic Ensemble Selection (META-DES), K-Nearest Oracle Elimination (KNORAE), and Dynamic Ensemble Selection Performance (DES-P).

2.4.1. META-DES

Dynamic classification is viewed by the META-DES algorithm as a meta-problem [47]. The goal of this meta-problem is to ascertain whether a classification model chosen from a pool of candidate classification models is capable of classifying the provided test data [47]. This meta-problem is resolved primarily in two steps.
Step 1: Finding the meta-features for every classification model in the pool. Four types of meta-features exist: (a) the posterior likelihood/probability for each target label (the likelihood that the training data in the region of competence belong to the target label); (b) the overall local accuracy (OLA) of the classification model in the region of competence; (c) the neighbors’ hard classification (NHC), a vector of length “n”, where “n” is the number of training instances in the region of competence (if the classification model correctly classifies an instance in the region of competence, the corresponding entry of the vector is set to 1; otherwise, it is set to 0); and (d) the confidence of the classification model (the orthogonal distance between the input instance and the classification model’s decision boundary).
Step 2: Determine, using the meta-features, whether a particular classification model is able to make accurate predictions for a specific set of test data. The ensemble of classification models for the provided test data then consists of all classification models that the meta-classifier has judged competent [47].
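As a rough sketch of Step 1, the two accuracy-based meta-features (the OLA and the NHC vector) could be computed for one base classifier and one query instance as follows; the neighborhood size k, the fitted classifier clf, and the DSEL arrays are assumed inputs, and DESlib’s METADES class implements the full meta-feature set internally.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ola_and_nhc(clf, x_query, X_dsel, y_dsel, k=7):
    # region of competence: the k nearest DSEL neighbours of the query instance
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    idx = nn.kneighbors(x_query.reshape(1, -1), return_distance=False)[0]
    correct = (clf.predict(X_dsel[idx]) == y_dsel[idx]).astype(int)
    ola = correct.mean()   # (b) overall local accuracy in the region of competence
    nhc = correct          # (c) hard classification vector (1 = neighbour classified correctly)
    return ola, nhc
```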

2.4.2. KNORAE

The KNORAE algorithm finds, from the pool of available classification models, the subset of classification models that correctly classifies all K nearest neighbors of a given test instance. The ensemble of these chosen classification models then votes on the class of the test instance (the KNORAE algorithm employs the majority voting rule for prediction). In other words, the algorithm discards classification models that misclassify any of the nearby data [48]. If no such classification model can be found, the algorithm keeps reducing the number of nearest neighbors considered and looks for at least one classification model that can classify all training samples in close proximity to the test instance [48].
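A minimal sketch of this selection rule is shown below, assuming a list of fitted base classifiers and the DSEL partition; DESlib’s KNORAE class implements the same idea (plus the final majority vote) more efficiently.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knorae_select(pool, x_query, X_dsel, y_dsel, k=7):
    """Return the base classifiers that classify all remaining neighbours correctly."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    idx = nn.kneighbors(x_query.reshape(1, -1), return_distance=False)[0]
    while len(idx) > 0:
        selected = [clf for clf in pool
                    if np.all(clf.predict(X_dsel[idx]) == y_dsel[idx])]
        if selected:          # at least one local "oracle" found
            return selected
        idx = idx[:-1]        # drop the farthest neighbour and retry
    return list(pool)         # fall back to the whole pool
```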

2.4.3. DES-P

This DES approach removes incompetent classification models by comparing their effectiveness to that of a random classification model. The expected accuracy of the random classification model is 1/C, where C is the number of classes in the training dataset (see the explanation in [49]). The dynamic selection of classification models is carried out for each test instance by comparing the classification models’ performance to that of a random classification model in the neighborhood defined by the test instance. If the performance of a classification model is superior to that of the random classification model, then it is eligible for inclusion in the ensemble of classification models for the specified test instance. If no classification model from the pool is picked, then the entire pool is selected for the specified test instance [49].
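The DES-P rule can be sketched in a few lines, assuming the same inputs as in the KNORAE sketch; DESlib’s DESP class provides the production implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def desp_select(pool, x_query, X_dsel, y_dsel, k=7):
    """Keep a base classifier only if its local accuracy beats a random classifier (1/C)."""
    random_baseline = 1.0 / len(np.unique(y_dsel))          # 1/C
    nn = NearestNeighbors(n_neighbors=k).fit(X_dsel)
    idx = nn.kneighbors(x_query.reshape(1, -1), return_distance=False)[0]
    selected = [clf for clf in pool
                if (clf.predict(X_dsel[idx]) == y_dsel[idx]).mean() > random_baseline]
    return selected if selected else list(pool)              # fall back to the whole pool
```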

2.5. Hyperparameter Tuning

The tuning of hyperparameters is an essential training step for machine-learning algorithms. This contributes to the enhancement of the learning algorithm, the prevention of over-fitting, and the simplification of the algorithm. In this study, the F1-score is used as a performance metric for hyperparameter tuning. Existing hyperparameter tuning strategies include Grid Search Cross-Validation, Random Search Cross-Validation, and the Bayesian Optimization approach [60]. Random Search Cross-Validation and Grid Search Cross-Validation search the entire space of available values of hyperparameters iteratively without regard for previous outcomes, which is inefficient for large parameter spaces. In contrast, Bayesian Optimization uses previous evaluations to determine which hyperparameter set to test next. Using intelligent parameter combinations, it is able to focus on the regions of the parameter space that it believes will produce the most promising scores. This method typically requires fewer iterations to determine the optimal hyperparameter values.
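One way to reproduce this tuning step is sketched below, assuming scikit-optimize’s BayesSearchCV is used for the Bayesian Optimization; the search ranges mirror Table 3 (with the lower bound of max_depth shifted to 1, since it must be positive), the F1-score is the scoring metric as stated above, and X_dsel/y_dsel denote the validation partition from the earlier split.

```python
from skopt import BayesSearchCV          # pip install scikit-optimize
from sklearn.ensemble import RandomForestClassifier

search = BayesSearchCV(
    RandomForestClassifier(random_state=42),
    {"n_estimators": (300, 2000),        # ranges taken from Table 3
     "max_depth": (1, 10)},
    n_iter=30, scoring="f1", cv=10, random_state=42)

search.fit(X_dsel, y_dsel)
print(search.best_params_)               # e.g., tuned n_estimators and max_depth
```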

2.6. Performance Evaluation

We carried out comparative experiments based on four well-known machine-learning metrics, namely, classification accuracy, recall value, precision value, and F1-score, in order to conduct a thorough and reliable assessment of the DES algorithms and the contemporary static base ensemble learning models. These performance metrics were computed from the contingency or confusion matrix of the binary classification problem, as shown in Figure 6. One class is the majority, with $n_1$ instances; the other class is the minority, with $n_2$ instances. Let $n = n_1 + n_2$ represent the total size of the training dataset. A binary classification algorithm predicts whether each instance is positive or negative; therefore, it generates outcomes of four types: true positives ($t_p$), false negatives ($f_n$), false positives ($f_p$), and true negatives ($t_n$). Equations (3)–(6) give the expressions for the classification accuracy value, recall value, precision value, and F1-score.
$$\mathrm{Accuracy} = \frac{t_n + t_p}{f_n + t_p + t_n + f_p} \qquad (3)$$
$$\mathrm{Recall} = \frac{t_p}{f_n + t_p} \qquad (4)$$
$$\mathrm{Precision} = \frac{t_p}{f_p + t_p} \qquad (5)$$
$$\mathrm{F1\text{-}score} = \frac{t_p}{t_p + \frac{1}{2}\left(f_p + f_n\right)} \qquad (6)$$
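Equations (3)–(6) can be checked directly against scikit-learn’s metric functions, as in the sketch below; y_test and y_pred are assumed to be the test labels and the predictions of a fitted model from the earlier pipeline.

```python
from sklearn.metrics import (accuracy_score, recall_score, precision_score,
                             f1_score, confusion_matrix)

tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()   # binary labels: 0 = non-fatal, 1 = fatal
print("accuracy :", (tp + tn) / (tp + tn + fp + fn), accuracy_score(y_test, y_pred))
print("recall   :", tp / (tp + fn),                  recall_score(y_test, y_pred))
print("precision:", tp / (tp + fp),                  precision_score(y_test, y_pred))
print("F1-score :", tp / (tp + 0.5 * (fp + fn)),     f1_score(y_test, y_pred))
```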

2.7. Model Interpretation by SHAP

Lundberg and Lee [51] developed the SHAP analysis approach, which is based on a game-theory mechanism, for machine-learning model interpretation. The fundamental concept behind the SHAP tool is to compute the marginal contribution of factors to the machine-learning model output, so that a “black-box” model can be interpreted from both global and local perspectives. During training or testing of machine-learning models, a prediction value is computed for each instance, and the SHAP value corresponds to the value assigned to each factor in that instance. The contribution of each factor, denoted by its Shapley value, is computed using Equation (7).
$$\varphi_i = \sum_{\Upsilon \subseteq \Pi \setminus \{i\}} \frac{|\Upsilon|!\,\left(n - |\Upsilon| - 1\right)!}{n!}\left[f\!\left(\Upsilon \cup \{i\}\right) - f\!\left(\Upsilon\right)\right] \qquad (7)$$
where $\varphi_i$ indicates the contribution of the $i$th factor, $\Pi$ represents the set of all factors, $\Upsilon$ represents a subset of factors that does not contain the $i$th factor, and $f\left(\Upsilon \cup \{i\}\right)$ and $f\left(\Upsilon\right)$ represent the model outputs with and without the $i$th factor, respectively. The SHAP analysis tool produces an interpretable machine-learning model via an additive factor attribution strategy, wherein the model output is defined as a linear sum of the input factors (Equation (8)).
$$g\!\left(z'\right) = \varphi_0 + \sum_{i=1}^{\Lambda} \varphi_i\, z'_i \qquad (8)$$
Here, $z' \in \{0, 1\}^{\Lambda}$ (an entry equals 1 when the corresponding factor is observed and 0 otherwise), $\Lambda$ denotes the number of input factors, $\varphi_0$ is the base value, i.e., the predicted outcome without any factors, and $\varphi_i$ denotes the Shapley value of the $i$th factor. The SHAP model is used in this study for the interpretation of the DES algorithms with various base learners and for the extraction of important risk factors that are likely to cause fatal injuries in road traffic accidents.
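A hedged sketch of this post hoc explanation step is shown below: because a DES model is not a single tree ensemble, the model-agnostic KernelExplainer from the shap package is applied to its predict_proba output; the fitted model meta_des_rf, the data partitions, and feature_names are assumed from the earlier pipeline, and the sample sizes are chosen only to keep the kernel estimation tractable.

```python
import shap

background = shap.sample(X_train, 100)                      # background sample for the explainer
predict_fatal = lambda data: meta_des_rf.predict_proba(data)[:, 1]   # probability of the fatal class

explainer = shap.KernelExplainer(predict_fatal, background)
shap_values = explainer.shap_values(X_test[:200])           # SHAP values for a subset of test instances

# bee swarm (summary) plot of factor contributions, as discussed in Section 3.2
shap.summary_plot(shap_values, X_test[:200], feature_names=feature_names)
```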

3. Results and Discussion

Machine-learning models have the ability to deal efficiently with small data sizes [61,62]. In this research, crash data from the records of the National Highway and Motorway Police (NH & MP) from 2015 to 2019 were used to predict injury severity. The crash dataset contained 1557 instances, which were initially split into two sets, namely, a training–validation set and a test set; 80% (1246) of the data were used for training–validation, and 20% (311) of the data were used for testing/performance evaluation purposes. The training–validation set contained 762 non-fatal and 484 fatal cases, and the testing set contained 201 non-fatal and 110 fatal cases; hence, similar class proportions were maintained across the training–validation and testing sets. Hyperparameter tuning using Bayesian Optimization with 10-fold Cross-Validation was also performed on the pure validation dataset, which constituted 15% of the whole training–validation dataset. The ranges and optimal values of the hyperparameters are shown in Table 3.
In the DES literature [45], the pure validation dataset is frequently referred to as the dynamic selection dataset (DSEL), as dynamic selection techniques choose the base classification models from this partition. The pool of classification models was fitted on the pure training dataset, whereas the dynamic selection models were fitted on the pure validation dataset. The effectiveness of the classification models was then assessed on the testing dataset. In order to compare different models, we also implemented static ensemble models, including Random Forest, Adaptive Boosting, and Classification and Regression Tree (CART), as well as statistical Binary Logistic Regression (BLR).

3.1. Model Comparison and Performance Assessment

In this study, the positive and negative classes were labelled as fatal and non-fatal, respectively. All models were evaluated based on the performance metrics, namely, classification accuracy, recall value, precision value, and F1-score, extracted from the confusion matrices (Figure 7) of each static ensemble model as well as each DES algorithm with its base learners. The precision value, also known as the positive predictive value, is the ratio of correct positive predictions to the total positive predictions, while the recall value, also known as sensitivity, reflects the ratio of correct positive predictions to the total positive instances (the fatal class in our case). According to the results (Table 4), all ensemble models were capable of classifying injury severity with an accuracy of 60% or greater. In the case of static ensemble classification models, Random Forest achieved the highest classification accuracy value (70%), precision value (68%), recall value (67%), and F1-score (67%), followed by the Classification and Regression Tree model with a classification accuracy value of 68%, a precision value of 66%, a recall value of 65%, and an F1-score of 65%. Adaptive Boosting showed a classification accuracy value, recall value, precision value, and F1-score equal to 60%, 58%, 57%, and 57%, respectively. BLR showed a classification accuracy value, recall value, precision value, and F1-score equal to 53%, 52%, 50%, and 51%, respectively.
The pools of Random Forest, Classification and Regression Tree, Adaptive Boosting, and BLR were then used in conjunction with KNORAE, DES-P, and META-DES. In the case of DES algorithms with Random Forest (Table 5), META-DES-RF resulted in the highest performance measures, with a classification accuracy value of 75%, a precision value of 71%, a recall value of 69%, and an F1-score of 72%. The second-best DES model in conjunction with the Random Forest pool of classifiers was KNORA-RF, which resulted in a classification accuracy value of 72%, a precision value of 69%, a recall value of 69%, and an F1-score of 67%. Similarly, in the case of DES algorithms with Adaptive Boosting (Table 6), META-DES-AdaBoost resulted in higher performance measures, with a classification accuracy value of 64%, a precision value of 61%, a recall value of 61%, and an F1-score of 61%. The KNORA-CART performed well among the DES algorithms using the Classification and Regression Tree pool of classifiers (Table 7), showing a classification accuracy value of 69%, a precision value of 67%, a recall value of 66%, and an F1-score of 66%. The META-DES-BLR performed best among the DES algorithms using BLR (Table 8), although its performance was not as good as that of the machine-learning classifiers; it showed a classification accuracy value of 65%, a precision value of 60%, a recall value of 66%, and an F1-score of 63%. Overall, it was observed that META-DES-RF outperformed the static ensemble classifiers as well as the other DES models and was therefore used in conjunction with SHAP analysis for feature importance analysis.

3.2. META-DES-RF Framework Interpretation by SHAP

The aim of the SHAP interpretation mechanism is to describe the expected behavior of a machine-learning model with regard to the entire distribution of its input factors’ values. This is accomplished with the SHAP tool by averaging the SHAP values for each instance across the entire dataset. The SHAP tool can determine the relative importance of risk factors via a SHAP factor importance plot, and the factors’ contributions via a bee swarm plot. It is pertinent to mention that factors’ contributions are not the same as factor importance: the bee swarm plot reveals not just the relative importance of factors, but their actual relationships with the predicted outcome. Based on the experimental findings reported above, the META-DES framework with RF, which achieved the best predictive performance, was selected for interpretation. SHAP can interpret the predicted META-DES-RF results in different ways. Figure 7a illustrates the importance analysis of the input factors. It can be seen that the importance of each input factor is non-negligible, meaning that each input factor has a varying degree of impact on the predicted outcomes. The results demonstrate that Driver_Age had the greatest impact on the severity of injuries, followed by Month_of_Year, Day_of_Week, and Vehicle_Type. The factors with the least influence were Presence_of_Shoulder and Presence_of_Median. Although these results indicate the significance of each risk factor, they do not indicate the contribution of each risk factor to the likelihood of fatal injuries.
Figure 7b is a SHAP contribution plot (bee swarm plot) of the risk factors, which illustrates the distribution of the SHAP values of each input factor across instances and the overall influence trends. The x-axis (horizontal axis) of the plot represents the SHAP value of the input factors, while the y-axis lists the input factors ranked by importance. The vertical line at a SHAP value of 0.00 represents the baseline. The points in the figure represent the instances in the dataset, and their colors represent the values of the corresponding factors, ranging from light blue (low values) to red (high values); pink points correspond to moderate factor values. In the case of Driver_Age, the majority of light blue points fall to the right side of the reference vertical line, as shown in the plot. This indicates that young drivers are more likely to be involved in fatal accidents, which is consistent with previous studies [63,64,65]. Similarly, the majority of the pink points and a few blue points for Month_of_Year lie to the right of the vertical line, which indicates the occurrence of fatal injuries in the months of May, June, and July (during the summer) and a few in the winter. This result is also aligned with previous studies related to road traffic injury severity [66,67].
Higher summer temperatures, which can cause tire rupture, and monsoon precipitation, which causes runoff, may be contributing factors in summer fatalities. For Day_of_Week, the blue points fall to the left side of the reference vertical line, indicating non-fatal injuries on weekdays; however, several pink to red points fall to the right, indicating fatal injuries on Friday, Saturday, and Sunday, which is also aligned with previous studies [68,69]. On Fridays and weekends, many individuals travel back to their hometowns, causing heavy traffic on the N-5. In addition, Friday and Sunday markets along the N-5 are frequented by a large number of people, making these days more prone to accidents.

4. Conclusions and Recommendations

This study used the National Highway N-5 Peshawar–Rahim Yar Khan section dataset from 2015 to 2019 and proposed DES algorithms in conjunction with pools of ensemble classifiers, including Random Forest, Classification and Regression Tree, Adaptive Boosting, and Binary Logistic Regression, to predict injury severity. The results from the DES algorithms were compared, and it was revealed that META-DES-RF outperformed the other models in terms of classification accuracy value, precision value, recall value, and F1-score. Therefore, the proposed META-DES framework offers an alternative strategy for modelling injury severity.
Machine-learning ensemble classification models’ lack of transparency and interpretability is widely criticized. This impacts the wide adoption of such modeling techniques for prediction in traffic and transportation safety, despite the fact that these models are quite often more adaptable and reliable than conventional statistical models. To address the interpretability issue posed by META-DES-RF, the SHAP strategy was employed to evaluate the results, to identify important risk factors, and to assess their influence on injury severity in road traffic accidents. The top four crucial risk factors that increase the probability of fatal injury severity are Driver_Age, Month_of_Year, Day_of_Week, and Vehicle_Type. Young drivers are more vulnerable than older drivers to experiencing fatalities. Lowering the overall fatal crash rate could be accomplished by strengthening young driver education programs, enforcing stricter driving laws and standards, tightening driving tests, and arming parents with the necessary information. There are more fatal injuries in the months of May, June, and July than non-fatal injuries, which are more common in the months of November and December. Along with tire bursting in hot summer temperatures, the Punjab region’s rainy season may also be to blame for skidding, because it reduces friction between the tires and the wet pavement. Driving should be done with more caution and at lower speeds, taking runoff into account, during the monsoon rainy days (July to August). It was also observed that higher risks of fatal injuries occur during the weekends (Saturday and Sunday). The majority of people make weekend trips back to their towns or villages, which causes a significant amount of traffic on the highway. Likewise, Friday and Sunday markets along the N-5 route are frequently visited by people on the weekends.
This study’s innovative approach could be applied to a comprehensive analysis of road traffic accidents and is a useful resource for traffic safety specialists and transport policymakers. This study covered only road traffic accident injury severity, as modelled by the dynamic ensemble selection algorithms and interpreted by SHAP analysis. It is also pertinent to mention that our study only employed META-DES, KNORAE, and DES-P with RF, AdaBoost, CART, and BLR as their base learners. Among those, META-DES-RF showed the highest accuracy and F1-score. However, the performance could be further improved by using other advanced machine-learning models as base learners, such as eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), and Categorical Gradient Boosting (CatBoost). Other DES algorithms with heterogeneous pools of classification models and other risk factors could be applied to large-scale data from different road sections in future research work.

Author Contributions

Data curation, A.K. and H.A.; formal analysis, A.K.; funding acquisition, H.A. and A.E.; investigation, C.M.M. and A.K.; methodology, A.K.; project administration, A.E. and H.A.; resources, A.K.; software, A.K.; supervision, C.M.M. and H.A.; validation, A.E.; visualization, A.K.; writing—original draft, H.A.; writing—review and editing, A.K. and A.E. All authors have read and agreed to the published version of the manuscript.

Funding

The research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

We are very thankful to the National Highway and Motorway Police, Pakistan, for providing National Highway N-5 crash data.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. World Health Organization. Global Status Report on Road Safety 2015; World Health Organization: Geneva, Switzerland, 2015.
  2. World Health Organization. Global Status Report on Road Safety (2018); World Health Organization: Geneva, Switzerland, 2019.
  3. Umair, M.; Rana, I.A.; Lodhi, R.H. The impact of urban design and the built environment on road traffic crashes: A case study of Rawalpindi, Pakistan. Case Stud. Transp. Policy 2022, 10, 417–426. [Google Scholar] [CrossRef]
  4. Hussain, M.; Shi, J. Modelling and examining the influence of predictor variables on the road crashes in functionally classified vehicles in Pakistan. Int. J. Crashworthiness 2021, 27, 1118–1127. [Google Scholar] [CrossRef]
  5. Almoshaogeh, M.; Abdulrehman, R.; Haider, H.; Alharbi, F.; Jamal, A.; Alarifi, S.; Shafiquzzaman, M. Traffic accident risk assessment framework for qassim, saudi arabia: Evaluating the impact of speed cameras. Appl. Sci. 2021, 11, 6682. [Google Scholar] [CrossRef]
  6. Rahman, M.M.; Islam, M.K.; Al-Shayeb, A.; Arifuzzaman, M. Towards sustainable road safety in Saudi Arabia: Exploring traffic accident causes associated with driving behavior using a Bayesian belief network. Sustainability 2022, 14, 6315. [Google Scholar] [CrossRef]
  7. Al-Garawi, N.; Dalhat, M.A.; Aga, O. Assessing the Road Traffic Crashes among Novice Female Drivers in Saudi Arabia. Sustainability 2021, 13, 8613. [Google Scholar] [CrossRef]
  8. Rahman, M.H.; Zafri, N.M.; Akter, T.; Pervaz, S. Identification of factors influencing severity of motorcycle crashes in Dhaka, Bangladesh using binary logistic regression model. Int. J. Inj. Control Saf. Promot. 2021, 28, 141–152. [Google Scholar] [CrossRef]
  9. Zafri, N.M.; Prithul, A.A.; Baral, I.; Rahman, M. Exploring the factors influencing pedestrian-vehicle crash severity in Dhaka, Bangladesh. Int. J. Inj. Control Saf. Promot. 2020, 27, 300–307. [Google Scholar] [CrossRef]
  10. SIP. Social Indicator of Pakistan; Government Pakistan, Statistics Division, Pakistan Bureau of Statistics: Islamabad, Pakistan, 2016.
  11. SIP. Social Indicator of Pakistan; Government of Pakistan, Statistics Division, Pakistan Bureau of Statistics: Islamabad, Pakistan, 2020. Available online: http://www.pbs.gov.pk/content/population-census (accessed on 27 January 2022).
  12. Shoaib, M. Pakistan Economic Survey 2012–2013; Ministry of Finance, Government of Pakistan: Islamabad, Pakistan, 2013.
  13. Batool, Z.; Carsten, O.; Jopson, A. Road safety issues in Pakistan: A case study of Lahore. Transp. Plan. Technol. 2012, 35, 31–48. [Google Scholar] [CrossRef]
  14. Kayani, A.; Fleiter, J.J.; King, M.J. Underreporting of road crashes in Pakistan and the role of fate. Traffic Inj. Prev. 2014, 15, 34–39. [Google Scholar] [CrossRef]
  15. Xie, Y.; Zhang, Y.; Liang, F. Crash injury severity analysis using Bayesian ordered probit models. J. Transp. Eng. 2009, 135, 18–25. [Google Scholar] [CrossRef] [Green Version]
  16. Zhang, Y.; Li, Z.; Liu, P.; Zha, L. Exploring contributing factors to crash injury severity at freeway diverge areas using ordered probit model. Procedia Eng. 2011, 21, 178–185. [Google Scholar]
  17. Yasmin, S.; Eluru, N.; Bhat, C.R.; Tay, R. A latent segmentation based generalized ordered logit model to examine factors influencing driver injury severity. Anal. Methods Accid. Res. 2014, 1, 23–38. [Google Scholar] [CrossRef]
  18. Chen, C.; Zhang, G.; Tarefder, R.; Ma, J.; Wei, H.; Guan, H. A multinomial logit model-Bayesian network hybrid approach for driver injury severity analyses in rear-end crashes. Accid. Anal. Prev. 2015, 80, 76–88. [Google Scholar] [CrossRef] [PubMed]
  19. Kim, J.K.; Ulfarsson, G.F.; Shankar, V.N.; Mannering, F.L. A note on modeling pedestrian-injury severity in motor-vehicle crashes with the mixed logit model. Accid. Anal. Prev. 2010, 42, 1751–1758. [Google Scholar] [CrossRef] [PubMed]
  20. Wu, Q.; Chen, F.; Zhang, G.; Liu, X.C.; Wang, H.; Bogus, S.M. Mixed logit model-based driver injury severity investigations in single-and multi-vehicle crashes on rural two-lane highways. Accid. Anal. Prev. 2014, 72, 105–115. [Google Scholar] [CrossRef]
  21. Alogaili, A.; Mannering, F. Unobserved heterogeneity and the effects of driver nationality on crash injury severities in Saudi Arabia. Accid. Anal. Prev. 2020, 144, 105618. [Google Scholar] [CrossRef]
  22. Chen, F.; Song, M.; Ma, X. Investigation on the injury severity of drivers in rear-end collisions between cars using a random parameters bivariate ordered probit model. Int. J. Environ. Res. Public Health 2019, 16, 2632. [Google Scholar] [CrossRef]
  23. Russo, B.J.; Savolainen, T.; Schneider, W.H., IV; Anastasopoulos, C. Comparison of factors affecting injury severity in angle collisions by fault status using a random parameters bivariate ordered probit model. Anal. Methods Accid. Res. 2014, 2, 21–29. [Google Scholar] [CrossRef]
  24. Clarke, B.; Fokoue, E.; Zhang, H.H. Principles and Theory for Data Mining and Machine Learning; Springer: Berlin/Heidelberg, Germany, 2009; pp. 304–310. [Google Scholar]
  25. Raschka, S. Python Machine Learning; Packt Publishing Ltd.: Birmingham, UK, 2015. [Google Scholar]
  26. Zaki, M.J.; Meira, W., Jr.; Meira, W. Data Mining and Analysis: Fundamental Concepts and Algorithms; Cambridge University Press: Cambridge, UK, 2014. [Google Scholar]
  27. Ahmad, M.A.; Eckert, C.; Teredesai, A. Interpretable machine learning in healthcare. In Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, Washington, DC, USA, 29 August–1 September 2018; pp. 559–560. [Google Scholar]
  28. Char, D.S.; Abràmoff, M.D.; Feudtner, C. Identifying ethical considerations for machine learning healthcare applications. Am. J. Bioeth. 2020, 20, 7–17. [Google Scholar] [CrossRef]
  29. Shailaja, K.; Seetharamulu, B.; Jabbar, M.A. Machine learning in healthcare: A review. In Proceedings of the 2018 Second international conference on electronics, communication and aerospace technology (ICECA), Coimbatore, India, 29–31 March 2018; pp. 910–914. [Google Scholar]
  30. Gogas, P.; Papadimitriou, T. Machine learning in economics and finance. Comput. Econ. 2021, 57, 1–4. [Google Scholar] [CrossRef]
  31. Goodell, J.W.; Kumar, S.; Lim, W.M.; Pattnaik, D. Artificial intelligence and machine learning in finance: Identifying foundations, themes, and research clusters from bibliometric analysis. J. Behav. Exp. Financ. 2021, 32, 100577. [Google Scholar] [CrossRef]
  32. Rundo, F.; Trenta, F.; di Stallo, A.L.; Battiato, S. Machine learning for quantitative finance applications: A survey. Appl. Sci. 2019, 9, 5574. [Google Scholar] [CrossRef]
  33. Halde, R.R. Application of Machine Learning algorithms for betterment in education system. In Proceedings of the 2016 International Conference on Automatic Control and Dynamic Optimization Techniques (ICACDOT), Pune, India, 9–10 September 2016; pp. 1110–1114. [Google Scholar]
  34. Luan, H.; Tsai, C.C. A review of using machine learning approaches for precision education. Educ. Technol. Soc. 2021, 24, 250–266. [Google Scholar]
  35. Labib, M.F.; Rifat, A.S.; Hossain, M.M.; Das, A.K.; Nawrine, F. Road accident analysis and prediction of accident severity by using machine learning in Bangladesh. In Proceedings of the 2019 7th International Conference on Smart Computing & Communications (ICSCC), Sarawak, Malaysia, 28–30 August 2019; pp. 1–5. [Google Scholar]
  36. Wen, X.; Xie, Y.; Jiang, L.; Pu, Z.; Ge, T. Applications of machine learning methods in traffic crash severity modelling: Current status and future directions. Transp. Rev. 2021, 41, 855–879. [Google Scholar] [CrossRef]
  37. Kuşkapan, E.; Çodur, M.Y.; Atalay, A. Speed violation analysis of heavy vehicles on highways using spatial analysis and machine learning algorithms. Accid. Anal. Prev. 2021, 155, 106098. [Google Scholar] [CrossRef]
  38. Nasrollahzadeh, A.A.; Sofi, A.R.; Ravani, B. Identifying factors associated with roadside work zone collisions using machine learning techniques. Accid. Anal. Prev. 2021, 158, 106203. [Google Scholar] [CrossRef] [PubMed]
  39. Lei, T.; Peng, J.; Liu, X.; Luo, Q. Crash prediction on expressway incorporating traffic flow continuity parameters based on machine learning approach. J. Adv. Transp. 2021, 2021, 1–13. [Google Scholar] [CrossRef]
  40. Chen, H.; Chen, H.; Zhou, R.; Liu, Z.; Sun, X. Exploring the mechanism of crashes with autonomous vehicles using machine learning. Math. Probl. Eng. 2021, 2021, 1–10. [Google Scholar] [CrossRef]
  41. Zhang, S.; Khattak, A.; Matara, C.M.; Hussain, A.; Farooq, A. Hybrid feature selection-based machine learning Classification system for the prediction of injury severity in single and multiple-vehicle accidents. PLoS ONE 2022, 17, e0262941. [Google Scholar] [CrossRef]
  42. Dong, S.; Khattak, A.; Ullah, I.; Zhou, J.; Hussain, A. Predicting and analyzing road traffic injury severity using boosting-based ensemble learning models with SHAPley Additive exPlanations. Int. J. Environ. Res. Public Health 2022, 19, 2925. [Google Scholar] [CrossRef]
  43. Cui, S.; Yin, Y.; Wang, D.; Li, Z.; Wang, Y. A stacking-based ensemble learning method for earthquake casualty prediction. Appl. Soft Comput. 2021, 101, 107038. [Google Scholar] [CrossRef]
  44. Zhu, Y.; Zhou, L.; Xie, C.; Wang, G.J.; Nguyen, T.V. Forecasting SMEs’ credit risk in supply chain finance with an enhanced hybrid ensemble machine learning approach. Int. J. Prod. Econ. 2019, 211, 22–33. [Google Scholar] [CrossRef] [Green Version]
  45. Cruz, R.M.; Sabourin, R.; Cavalcanti, G.D. Dynamic classifier selection: Recent advances and perspectives. Inf. Fusion 2018, 41, 195–216. [Google Scholar] [CrossRef]
  46. Zhang, Z.L.; Chen, Y.Y.; Li, J.; Luo, X.G. A distance-based weighting framework for boosting the performance of dynamic ensemble selection. Inf. Process. Manag. 2019, 56, 1300–1316. [Google Scholar] [CrossRef]
  47. Cruz, R.M.; Sabourin, R.; Cavalcanti, G.D.; Ren, T.I. META-DES: A dynamic ensemble selection framework using meta-learning. Pattern Recognit. 2015, 48, 1925–1935. [Google Scholar] [CrossRef]
  48. Ko, A.H.; Sabourin, R.; Britto, A.S., Jr. From dynamic classifier selection to dynamic ensemble selection. Pattern Recognit. 2008, 41, 1718–1731. [Google Scholar] [CrossRef]
  49. Woloszynski, T.; Kurzynski, M.; Podsiadlo, P.; Stachowiak, G.W. A measure of competence based on random classification for dynamic ensemble selection. Inf. Fusion 2012, 13, 207–213. [Google Scholar] [CrossRef]
  50. Rudin, C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 2019, 1, 206–215. [Google Scholar] [CrossRef]
  51. Lundberg, S.M.; Lee, S.I. A unified approach to interpreting model predictions. In Proceedings of the NIPS’17: 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
  52. Pham, B.T.; Nguyen, M.D.; Nguyen-Thoi, T.; Ho, L.S.; Koopialipoor, M.; Quoc, N.K.; Armaghani, D.J.; Van Le, H. A novel approach for classification of soils based on laboratory tests using Adaboost, Tree and ANN modeling. Transp. Geotech. 2021, 27, 100508. [Google Scholar] [CrossRef]
  53. Song, Y.; Zhao, J.; Ostrowski, K.A.; Javed, M.F.; Ahmad, A.; Khan, M.I.; Aslam, F.; Kinasz, R. Prediction of compressive strength of fly-ash-based concrete using ensemble and non-ensemble supervised machine-learning approaches. Appl. Sci. 2021, 12, 361. [Google Scholar] [CrossRef]
  54. Hayadi, B.H.; Kim, J.M.; Hulliyah, K.; Sukmana, H.T. Predicting Airline Passenger Satisfaction with Classification Algorithms. Int. J. Inform. Inf. Syst. 2021, 4, 82–94. [Google Scholar]
  55. Ting, C.Y.; Tan, N.Y.Z.; Hashim, H.H.; Ho, C.C.; Shabadin, A. Malaysian road accident severity: Variables and predictive models. In Computational Science and Technology; Springer: Berlin/Heidelberg, Germany, 2020; pp. 699–708. [Google Scholar]
  56. Ahsan, M.M.; Mahmud, M.P.; Saha, K.; Gupta, K.D.; Siddique, Z. Effect of data scaling methods on machine learning algorithms and model performance. Technologies 2021, 9, 52. [Google Scholar] [CrossRef]
  57. Freund, Y.; Schapire, R.E. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 1997, 55, 119–139. [Google Scholar] [CrossRef] [Green Version]
  58. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
  59. Lewis, R.J. An introduction to classification and regression tree (CART) analysis. In Proceedings of the Annual Meeting of the Society for Academic Emergency Medicine, San Francisco, CA, USA, 27 October–1 November 2020; Volume 14. [Google Scholar]
  60. Wu, J.; Chen, X.Y.; Zhang, H.; Xiong, L.D.; Lei, H.; Deng, S.H. Hyperparameter optimization for machine learning models based on Bayesian optimization. J. Electron. Sci. Technol. 2019, 17, 26–40. [Google Scholar]
  61. Siddik, M.; Bakkar, A.; Arman, M.; Hasan, A.; Jahan, M.R.; Islam, M.; Biplob, K.B.B. Predicting the Death of Road Accidents in Bangladesh Using Machine Learning Algorithms. In Proceedings of the International Conference on Advances in Computing and Data Sciences 2021, Nashik, India, 23–24 April 2021; Springer: Berlin/Heidelberg, Germany, 2021; pp. 160–171. [Google Scholar]
  62. Rezapour, M.; Molan, A.M.; Ksaibati, K. Analyzing injury severity of motorcycle at-fault crashes using machine learning techniques, decision tree and logistic regression models. Int. J. Transp. Sci. Technol. 2020, 9, 89–99. [Google Scholar] [CrossRef]
  63. Wang, K.; Bhowmik, T.; Yasmin, S.; Zhao, S.; Eluru, N.; Jackson, E. Multivariate copula temporal modeling of intersection crash consequence metrics: A joint estimation of injury severity, crash type, vehicle damage and driver error. Accid. Anal. Prev. 2019, 125, 188–197. [Google Scholar] [CrossRef]
  64. Yu, M.; Zheng, C.; Ma, C. Analysis of injury severity of rear-end crashes in work zones: A random parameters approach with heterogeneity in means and variances. Anal. Methods Accid. Res. 2020, 27, 100126. [Google Scholar] [CrossRef]
  65. Rahimi, E.; Shamshiripour, A.; Samimi, A.; Mohammadian, A.K. Investigating the injury severity of single-vehicle truck crashes in a developing country. Accid. Anal. Prev. 2020, 137, 105444. [Google Scholar] [CrossRef]
  66. Ouni, F.; Belloumi, M. Spatio-temporal pattern of vulnerable road user’s collisions hot spots and related risk factors for injury severity in Tunisia. Transp. Res. Part F Traffic Psychol. Behav. 2018, 56, 477–495. [Google Scholar] [CrossRef]
  67. Haq, M.T.; Zlatkovic, M.; Ksaibati, K. Assessment of tire failure related crashes and injury severity on a mountainous freeway: Bayesian binary logit approach. Accid. Anal. Prev. 2020, 145, 105693. [Google Scholar] [CrossRef] [PubMed]
  68. Osman, M.; Paleti, R.; Mishra, S. Analysis of passenger-car crash injury severity in different work zone configurations. Accid. Anal. Prev. 2018, 111, 161–172. [Google Scholar] [CrossRef] [PubMed]
  69. Zheng, Z.; Lu, P.; Lantz, B. Commercial truck crash injury severity analysis using gradient boosting data mining model. J. Saf. Res. 2018, 65, 115–124. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Types of binary classification strategies.
Figure 2. The framework of the proposed research for the prediction of injury severity.
Figure 3. Peshawar–Rahim Yar Khan section of National Highway N-5.
Figure 4. Bootstrapping and aggregation in a Random Forest.
Figure 5. Classification and Regression Tree (CART).
Figure 6. Confusion matrix plot and corresponding performance indicators.
Figure 7. SHAP analysis importance and contribution plots.
Table 1. Description of various risk factors from the crash data record.

Type of Factor | Risk Factor | Description | Marginal Frequency (%)
Injury Severity | Injury Severity Level | Non-Fatal/Fatal | 61.91/38.09
Vehicle Specific | Vehicle_Age (Years) | 0–10/11–20/21–30/31–40/41+ | 32.01/36.49/15.30/9.30/6.90
 | Number_of_Vehicles | Multiple/Single | 66.54/33.46
 | Type_of_Vehicle | Truck/Dumper/Trailer/Tractor/Car/Pickup/Minibus/Bus/Rickshaw/Motorcycle/Bicycle | 20.08/4.43/16.44/2.30/12.69/3.50/9.19/8.48/5.34/6.78/10.77
Driver Specific | Driver_Age (Years) | 18–25/26–30/31–35/36–40/41–45/46–50/51–55/55+ | 18.18/16.83/14.90/14.58/13.62/10.92/5.84/5.14
 | Driver_Gender | Male/Female | 99.99/0.001
 | Driving_License | No/Yes | 46.52/53.48
Temporal Specific | Month_of_Year | January/February/March/April/May/June/July/August/September/October/November/December | 5.65/6.29/10.08/8.73/5.27/6.87/14.96/10.08/13.17/6.10/7.32/5.46
 | Type_of_Day | Weekday/Weekend | 68.22/31.78
 | Time_of_Day | 12:00:00 a.m.–3:59:59 a.m./4:00:00 a.m.–7:59:59 a.m./8:00:00 a.m.–11:59:59 a.m./12:00:00 p.m.–3:59:59 p.m./4:00:00 p.m.–7:59:59 p.m./8:00:00 p.m.–11:59:59 p.m. | 8.97/14.41/23.09/21.13/21.02/11.38
 | Day_of_Week | Monday/Tuesday/Wednesday/Thursday/Friday/Saturday/Sunday | 10.76/12.67/14.52/13.51/16.87/16.82/14.85
Environment Specific | Lighting_Condition | Night with road lights/Night without road lights/Daylight | 5.33/25.56/69.11
 | Weather_Condition | Sunny/Rainy/Cloudy | 89.85/6.56/3.59
 | Visibility_Condition | Clear/Smog/Fog | 96.41/0.50/3.08
Roadway Specific | Alignment | Horizontal curve/Grade/Combined horizontal and grade/Straight segment | 5.66/4.43/5.55/84.36
 | Road_Type | Urban/Rural | 52.86/47.14
 | Presence_of_Median | No/Yes | 3.64/96.36
 | Surface_Condition | Wet/Dry | 7.51/92.49
 | Work_Zone | No/Yes | 98.64/1.35
 | Pavement_Roughness | Smooth/Potholes/Rough | 94.23/3.25/2.52
 | Presence_of_Shoulder | No/Yes | 2.63/97.37
Crash Specific | Collision_Type | Run off the highway/Nearby trees hitting/Fell off bridge/Head-on collision/Rear-end collision/Side-collision/Rolling over/Skidding/Hitting obstacle on road/Hitting pedestrian/Hitting animal on road | 0.78/0.11/0.17/5.21/43.55/19.17/12.44/3.08/4.88/10.31/0.28
 | Cause_of_Accident | Driver at-fault/Dozing at-wheel/Over speeding/Motorcycle rider at-fault/Low visibility/Vehicle at-fault (mechanical failure)/Sight obstruction/Slippery road/Vehicle out of control/Bicycle rider at-fault/Overtaking-wrong side/Pedestrian at-fault/Pavement distress/Others | 56.33/1.40/3.87/3.14/0.39/7.74/1.79/2.35/0.90/2.35/0.56/1.46/7.29/1.51/8.91
Table 2. Superior performance of ensemble classification models in recent studies.

Algorithm Used | Purpose of Modeling | Best Algorithm | Best Score | Reference
AdaBoost, Tree, ANN | Prediction and classification of soils based on laboratory test | AdaBoost | 0.87 (Accuracy) | [52]
DT, ANN, RF, GB | Predicting fly-ash-based concrete compressive strength | RF | 0.89 (R-square) | [53]
KNN, LR, NB, DT, RF | Predicting airline passenger satisfaction | RF | 0.99 (Accuracy) | [54]
RF, XGBoost, CART, NN, NB and SVM | Predicting road traffic accident fatality | RF | 0.95 (Accuracy) | [55]
LR, LDA, KNN, ANN, CART, NB, SVM, XGBoost, RF, AdaBoost, ET | Heart disease prediction | CART | 1.00 (F1-score) | [56]
Table 3. Hyperparameter tuning of model parameters.

Algorithm | Hyperparameters | Range | Optimal Values
Random Forest | n_estimators | [300, 2000] | 933
 | max_depth | [0, 10] | 6
Classification and Regression Tree | max_depth | [0, 10] | 8
 | learning_rate | [0.05, 0.2] | 0.12
Adaptive Boosting | max_depth | [0, 10] | 5
 | n_estimators | [300, 1500] | 740
Table 4. Comparison of Performance Measures of Dynamic and Static Ensemble Classifiers.

Approach | Injury Severity Class | Precision | Recall | Accuracy | F1-Score
Random Forest | Non-Fatal | 0.73 | 0.84 | 0.69 | 0.67
 | Fatal | 0.63 | 0.49 | |
 | Average | 0.68 | 0.67 | |
Classification and Regression Tree | Non-Fatal | 0.73 | 0.79 | 0.67 | 0.65
 | Fatal | 0.59 | 0.51 | |
 | Average | 0.66 | 0.65 | |
Adaptive Boosting | Non-Fatal | 0.69 | 0.71 | 0.60 | 0.57
 | Fatal | 0.47 | 0.44 | |
 | Average | 0.58 | 0.57 | |
Binary Logistic Regression | Non-Fatal | 0.58 | 0.61 | 0.53 | 0.51
 | Fatal | 0.42 | 0.42 | |
 | Average | 0.50 | 0.52 | |
Table 5. Comparison of Performance Measures of Dynamic Ensemble Selection Algorithms based on the pool of Random Forest.

Approach | Injury Severity Class | Precision | Recall | Accuracy | F1-Score
KNORA-RF | Non-Fatal | 0.73 | 0.86 | 0.72 | 0.67
 | Fatal | 0.65 | 0.46 | |
 | Average | 0.69 | 0.66 | |
DES-P-RF | Non-Fatal | 0.73 | 0.86 | 0.71 | 0.66
 | Fatal | 0.65 | 0.46 | |
 | Average | 0.69 | 0.66 | |
META-DES-RF | Non-Fatal | 0.74 | 0.86 | 0.75 | 0.72
 | Fatal | 0.67 | 0.51 | |
 | Average | 0.71 | 0.69 | |
Table 6. Comparison of Performance Measures of Dynamic Ensemble Selection Algorithms based on the pool of Adaptive Boosting.

Approach | Injury Severity Class | Precision | Recall | Accuracy | F1-Score
KNORA-AdaBoost | Non-Fatal | 0.69 | 0.71 | 0.61 | 0.57
 | Fatal | 0.46 | 0.44 | |
 | Average | 0.58 | 0.57 | |
DES-P-AdaBoost | Non-Fatal | 0.69 | 0.70 | 0.61 | 0.57
 | Fatal | 0.46 | 0.45 | |
 | Average | 0.57 | 0.57 | |
META-DES-AdaBoost | Non-Fatal | 0.72 | 0.72 | 0.64 | 0.61
 | Fatal | 0.51 | 0.51 | |
 | Average | 0.61 | 0.61 | |
Table 7. Comparison of Performance Measures of Dynamic Ensemble Selection Algorithms based on the pool of Classification and Regression Tree.

Approach | Injury Severity Class | Precision | Recall | Accuracy | F1-Score
KNORA-CART | Non-Fatal | 0.74 | 0.80 | 0.69 | 0.66
 | Fatal | 0.60 | 0.52 | |
 | Average | 0.67 | 0.66 | |
META-DES-CART | Non-Fatal | 0.72 | 0.74 | 0.65 | 0.64
 | Fatal | 0.53 | 0.51 | |
 | Average | 0.63 | 0.62 | |
DES-P-CART | Non-Fatal | 0.61 | 0.74 | 0.65 | 0.62
 | Fatal | 0.36 | 0.24 | |
 | Average | 0.49 | 0.49 | |
Table 8. Comparison of Performance Measures of Dynamic Ensemble Selection Algorithms based on Binary Logistic Regression.

Approach | Injury Severity Class | Precision | Recall | Accuracy | F1-Score
KNORA-BLR | Non-Fatal | 0.62 | 0.73 | 0.64 | 0.62
 | Fatal | 0.60 | 0.51 | |
 | Average | 0.61 | 0.62 | |
META-DES-BLR | Non-Fatal | 0.63 | 0.79 | 0.65 | 0.63
 | Fatal | 0.56 | 0.54 | |
 | Average | 0.60 | 0.66 | |
DES-P-BLR | Non-Fatal | 0.57 | 0.70 | 0.51 | 0.49
 | Fatal | 0.39 | 0.31 | |
 | Average | 0.48 | 0.51 | |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
