1. Introduction
Road crash injuries are projected to become the fifth leading cause of mortality globally by 2030, highlighting the pressing need for effective prevention strategies in transportation safety management [
1]. Accurate forecasting of traffic crashes is a critical component of these strategies, as it enables the identification of potential risks and the development of targeted interventions [
2,
3]. Crash pattern analysis, which involves examining the factors contributing to road crashes, plays a vital role in identifying these risks. However, it is essential to recognize that crash patterns and contributing factors can vary significantly across cultural and geographical contexts. Traditional crash analysis methods often rely on statistical models that struggle to capture the nuanced and non-linear relationships present in crash narratives. Additionally, there is a lack of cross-cultural comparisons that leverage advanced AI techniques to address these gaps. Studying crash patterns in diverse regions provides valuable insights into how cultural, social, and environmental factors influence road safety, enabling the formulation of more tailored and effective prevention measures [
4].
This study focuses on crash patterns in two culturally and infrastructurally distinct regions, the United States and Jordan, to explore how varying socio-economic, legal, and environmental factors shape traffic safety outcomes. The United States, as a developed nation with extensive road networks and advanced traffic safety measures, presents a markedly different context compared to Jordan, a developing country facing challenges such as older vehicle fleets, variable enforcement of traffic laws, and unique cultural driving behaviors. By analyzing these contrasting settings, the study aims to uncover shared and region-specific factors influencing road safety, enhancing understanding of diverse traffic safety challenges.
By leveraging state-of-the-art natural language processing (NLP) techniques, this research performs a detailed analysis of crash narratives from the two regions, focusing on fatal and non-fatal crashes. Bidirectional Encoder Representations from Transformers (BERT) is employed for topic modeling and classification tasks, while Shapley Additive Explanations (SHAP) enhance model interpretability. This approach fills a critical research gap by providing a detailed, AI-driven comparative analysis of crash patterns across developed and developing countries, enabling a more comprehensive understanding of global traffic safety challenges. The study also seeks novel insights into the cultural, infrastructural, and legal factors influencing crash outcomes and develops actionable recommendations for improving road safety in both settings. The objectives of this study are threefold:
To analyze crash patterns in the USA and Jordan, identifying cultural, infrastructural, and legal factors that contribute to crash outcomes.
To apply BERT-based models and SHAP to enhance predictive accuracy and model transparency, uncovering critical risk factors.
To provide region-specific policy recommendations for road safety, including infrastructure improvements, vehicle safety standards, and emergency response strategies.
By integrating AI-driven methods and focusing on two distinct cultural contexts, this research aims to bridge methodological gaps and contribute to global road safety efforts.
2. Literature Review
Traditional data-driven regression models have been widely used to model crash severity, offering valuable mathematical interpretations and insights into individual predictor variables. However, these models are limited by their reliance on underlying assumptions, such as linear link functions and error distribution terms, which, if violated, can lead to biased estimates and reduced predictive accuracy [
5,
6]. This limitation is particularly critical when modeling complex crash dynamics, where interactions between factors are non-linear and highly context-dependent. Advanced data collection techniques, such as mobile LiDAR systems and point cloud applications, have significantly improved traffic safety analysis by capturing detailed spatial and environmental data [
7]. Despite their contributions, these technologies often fall short of providing contextual insights, such as driver behavior and cultural factors that significantly influence crash dynamics.
In contrast, text-based Natural Language Processing (NLP) methods like BERT enable the extraction of nuanced crash factors from narrative reports, offering insights into the cultural and behavioral elements surrounding crash incidents. A recent study highlighted the potential of deep NLP approaches, including ensemble learning with pre-trained transformers, for improving crash severity classification, demonstrating their capability in harnessing unconventional data sources for traffic safety analysis [
8]. Integrating narrative analysis with spatial data facilitates a holistic understanding of road safety factors, addressing gaps left by traditional approaches.
Various methodologies, including association rule mining, spatial statistical analysis, kernel density estimation (KDE), and sequence analysis, have been used to investigate and understand crash patterns [
1,
2,
3,
4,
5]. These methods have proven effective in identifying spatial and temporal crash trends, crash-prone areas, and crash sequences, forming a foundation for targeted safety measures [
1,
2,
3,
4,
5]. Machine learning-based models, including support vector machines, decision trees, and deep learning, have emerged as promising tools in road safety research, particularly in addressing the limitations of traditional statistical approaches [
6,
9,
10,
11,
12]. These models leverage real-time traffic data, such as traffic flow, speed, and volume, to identify crash-related trends and circumstances [
13,
14,
15,
16,
17]. Resampling approaches have been recommended to enhance the predictive performance of machine learning algorithms in handling imbalanced crash datasets [
18,
19], thereby improving the reliability of predictions. A recent study by Jaradat et al. (2024) introduced a multitask learning framework that utilizes social media data, specifically Twitter, to analyze and detect real-time crash patterns, enabling faster and more targeted traffic management responses in dynamic conditions [
20]. This innovation demonstrates the potential of AI in harnessing unconventional data sources for road safety improvements.
NLP techniques, such as BERT, facilitate the extraction of nuanced crash factors, allowing for a richer understanding of the elements influencing road safety in diverse cultural contexts. The flexibility and scalability of BERT in processing unstructured text have expanded the horizons of crash data analysis, enabling researchers to explore complex relationships within narrative data [
21].
Numerous studies have investigated the factors affecting road crashes. One study evaluated artificial neural network models for simulating motorway crashes and found that average daily traffic volume and average vehicle speed were the main contributors to crash occurrence [
22]. Giummarra et al. (2022) used text mining to classify crash circumstances by road user group and showed that it can uncover the features of road traffic injury collisions, demonstrating its utility for injury prevention [
23]. These studies emphasize the value of text analysis in understanding crash dynamics across various contexts. Wang et al. (2017) analyzed the severity of traffic crash injuries at intersections and on different road sections, showing the importance of understanding differences in collision injury severity and their contributing factors [
19]. This is complemented by studies like Darus et al. (2022), which highlight road characteristics and traffic conditions as critical contributors to collision severity and emphasize the necessity of incorporating multiple parameters into injury classification models [
24]. Donnelly-Swift and Kelly (2015) used generalized linear regression models to identify variables related to fatal or serious injuries in single-vehicle road traffic crashes. Their study was important because it provided insight into the multidimensional aspects that result in injury severity [
25].
Text mining methods, including thematic analysis, content analysis, and NLP, have been used to mine and analyze crash narrative textual data [
26,
27,
28]. These methods can be employed to identify the contributing components that characterize crash patterns and explore the causation of crashes within unstructured narrative reports [
27,
29]. Association rule mining has played an instrumental role in identifying parameters associated with crash types, such as senior-driver crossing crashes [
28]. Combining real-time traffic data with insights gleaned from crash narratives yields a more complete understanding of crash risk and causation. This holistic mechanism establishes the foundation for developing forecasting models for crashes in real-time, which are reliable, effective, and fundamental for proactive traffic management and safety enhancement [
6,
30].
Topic modeling is an unsupervised machine learning technique aimed at identifying abstract “topics” by clustering groups of words within a set of documents. The evolution of topic modeling techniques has significantly advanced the field of NLP, beginning with the introduction of Latent Semantic Analysis (LSA) and culminating in the development of Bidirectional Encoder Representations from Transformers (BERT) [
31]. LSA, introduced by Deerwester et al. [
32], marked the inception of extracting latent topics from text by decomposing term-document matrices, thereby uncovering the underlying semantic structure of the corpus. Following LSA, Blei et al. [
33] introduced Latent Dirichlet Allocation (LDA). This generative probabilistic model improved upon LSA by allowing documents to be represented as mixtures of multiple topics, thus providing a more flexible and detailed method for topic discovery. Despite their effectiveness, both LSA and LDA are limited by their reliance on bag-of-words representations, which ignore the order of words and the contextual nuances of language. The advent of deep learning brought about significant advancements in topic modeling, with the introduction of models that could understand the context and semantics of words in text. BERT [
34] represents a paradigm shift, leveraging a transformer architecture to generate deep contextualized word embeddings. Although LDA and BERT are established techniques in natural language processing, the present study distinguishes itself by applying these methods to a novel cross-cultural context in traffic safety analysis. The integration of BERT for topic modeling and text classification with the addition of SHAP for model interpretability offers a fresh perspective that enhances the understanding of crash patterns in different cultural settings. This approach applies advanced NLP techniques to a new domain and introduces a comprehensive methodology that can be adapted for broader applications in road safety research. This work, therefore, contributes meaningfully to the field by bridging the gap between methodological rigor and practical application in a critical public safety domain.
Numerous studies have examined traffic crashes from a cross-cultural perspective, assessing risks in road behavior, the incidence of aggressive driving, and driving behaviors across various nations and cultures [
35,
36,
37,
38]. The results reveal that driver anger and road safety behaviors are significantly influenced by cultural factors, underscoring the interplay between societal norms and driving practices [
39,
40,
41].
Given the global nature of road safety challenges, there is a critical need for research that transcends national boundaries and incorporates cross-cultural comparisons. Existing studies have left significant gaps in understanding how diverse socio-economic, infrastructural, and cultural contexts influence crash patterns and outcomes. Moreover, while advancements in analytical methods, such as Natural Language Processing (NLP) and machine learning, have greatly enhanced crash analysis, their application to cross-cultural comparisons remains limited. Despite progress in leveraging advanced models, many studies predominantly focus on single-country analyses or specific datasets, failing to account for the complex interplay of cultural, legal, and environmental contexts in crash causation.
This gap is particularly evident in the comparison of crash narratives between countries with vastly different socio-economic and cultural contexts. For instance, the United States, a highly motorized and developed nation, features extensive roadway networks and advanced traffic safety measures. In contrast, Jordan, a developing nation, presents challenges such as varied traffic law enforcement and distinct driving behaviors. This study addresses these gaps by integrating BERT-based topic modeling and text classification with SHAP interpretability to analyze crash data from the USA and Jordan. Unlike prior studies focusing solely on infrastructural or behavioral factors, this research incorporates cultural, legal, and environmental elements to inform region-specific road safety interventions. The contributions of this study are as follows:
Evaluate state-of-the-art AI models, such as BERT, for crash severity classification, thereby enhancing predictive accuracy.
Leverage SHAP for AI model transparency, offering actionable insights into crash severity factors for informed policymaking.
Analyze crash narratives to uncover nuanced safety risks and support comprehensive safety measures.
Provide cross-cultural safety insights, facilitating the adaptation of successful interventions globally.
Identify unique and shared crash severity factors in the USA and Jordan, guiding targeted safety interventions.
Recommend safety countermeasures to prevent crashes and reduce their severity in both countries.
3. Methodology
3.1. Proposed Framework
This study utilizes state-of-the-art AI techniques, including BERT and BERT/Bi-LSTM, to analyze factors contributing to fatal and non-fatal crashes using datasets from the USA and Jordan. The proposed research framework, illustrated in
Figure 1, involves multiple stages:
Data Collection from USA and Jordan crash datasets, encompassing both tabular and narrative data.
Preprocessing to standardize and prepare the data for analysis.
Exploratory and Textual Visualizations, including Exploratory Data Analysis (EDA), Topic Modeling, unigrams, bigrams, similarity matrices, and hierarchical structuring.
Text Classification using BERT/Bi-LSTM with SHAP interpretability.
Generating actionable insights and policy recommendations based on the analysis.
This framework supports a thorough comparison of crash patterns and highlights region-specific factors influencing road safety.
3.2. Dataset
This study utilizes crash data from two distinct regions: Jordan and the USA. Each dataset contains both tabular and narrative crash data. The tabular data includes core variables such as Severity, Crash Type, Light Condition, and Number of Vehicles. Additional contextual features like Weekday, Month, Season, Car Type, Accident Time, and Driver Age were extracted for deeper analysis.
3.2.1. Jordan Dataset
The Jordan dataset consists of 6359 traffic narrative reports from five major freeways: Airport Roads, Desert Highways, Jordan Highway, Route 30, and Route 35. These reports, obtained from the Jordan Traffic Institute (JTI), include both tabular and narrative data. Crashes were categorized into four levels: Fatal, Severe Injury, Moderate Injury, and Minor Injury.
3.2.2. USA Dataset
The USA dataset was obtained from the Missouri State Highway Patrol, covering reports from 2019–2020. Crashes were classified into three categories: Fatal, Property Damage, and Personal Injury. An equal sample size of 6359 crashes was selected from the USA dataset to ensure comparability, maintaining balance across categories.
3.2.3. Narrative Data and NLP Analysis
Crash narratives provided a rich source of contextual information. Given structural differences between datasets, these narratives became the primary focus for comparison. Using BERT and BERT/Bi-LSTM, the study analyzed key factors contributing to crash severity. This approach enhanced understanding of underlying crash dynamics in Jordan and the USA, as illustrated by sample narratives in
Table 1 and
Table 2.
3.3. Data Preprocessing
3.3.1. Standardization for Cross-Cultural Analysis
To enable meaningful comparisons between the Jordan and USA crash datasets, crash severity levels were standardized into two categories: Fatal and Non-fatal. Shared variables, including Severity, Crash Type, Light Condition, and Number of Vehicles, were retained for core analysis. Additional contextual features, such as Weekday, Month, Season, Car Type, Accident Time, and Driver Age, were extracted for deeper investigation. Unique variables were excluded to maintain cross-dataset comparability.
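The severity standardization described above can be sketched as a simple label mapping. This is a minimal illustration only: it assumes the original labels are spelled as in Sections 3.2.1 and 3.2.2, and the names `SEVERITY_MAP` and `to_binary_severity` are hypothetical rather than taken from the study's code.

```python
# Hypothetical mapping from the original severity labels to the two
# standardized classes. Label spellings are assumed from the dataset
# descriptions; real records may use different codes.
SEVERITY_MAP = {
    # Jordan dataset (four levels)
    "fatal": "Fatal",
    "severe injury": "Non-fatal",
    "moderate injury": "Non-fatal",
    "minor injury": "Non-fatal",
    # USA dataset (three levels; "fatal" is shared with the entry above)
    "property damage": "Non-fatal",
    "personal injury": "Non-fatal",
}


def to_binary_severity(label: str) -> str:
    """Map an original severity label to 'Fatal' or 'Non-fatal'."""
    key = label.strip().lower()
    if key not in SEVERITY_MAP:
        # Fail loudly instead of silently guessing a class.
        raise ValueError(f"Unknown severity label: {label!r}")
    return SEVERITY_MAP[key]


if __name__ == "__main__":
    for raw in ["Fatal", "Severe Injury", "Property Damage"]:
        print(raw, "->", to_binary_severity(raw))
```

Raising an error on unknown labels is a deliberate choice in this sketch, since a silent default would hide reporting differences between the two datasets.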
3.3.2. Text Preprocessing
A systematic preprocessing approach was implemented to ensure consistency and data quality. Key steps included:
Text Normalization: Text was converted to lowercase, and non-textual characters, URLs, and commonly repeated irrelevant terms (e.g., “VEHICLE”, “KIN”, “PRONOUNCED”, “FATALITY”) were removed to reduce noise while preserving critical crash-related terms, such as “death”, “killing”, and “notified”.
Minimal Preprocessing: Additional steps such as stop-word removal and stemming were deliberately avoided, and no tokenization was applied beyond that required by the model encoder, to retain important contextual information, particularly region-specific nuances such as rollover dynamics in the USA or vehicle-related terms in Jordan.
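The normalization steps listed above can be sketched as follows. This is a rough illustration under stated assumptions: the full list of removed terms is not given in the text, so `NOISE_TERMS` holds only the four examples mentioned, and treating everything except letters and whitespace as "non-textual" is an assumption. The function name `normalize_narrative` is hypothetical.

```python
import re

# Noise terms named in the text, lowercased because removal happens after
# lowercasing. The complete list used in the study is not available here.
NOISE_TERMS = {"vehicle", "kin", "pronounced", "fatality"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
# Assumption: "non-textual characters" means anything that is not a
# lowercase letter or whitespace (digits and punctuation are dropped).
NON_TEXT_RE = re.compile(r"[^a-z\s]")


def normalize_narrative(text: str) -> str:
    """Lowercase, strip URLs and non-letter characters, drop noise terms."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = NON_TEXT_RE.sub(" ", text)
    # Whitespace splitting is used only to filter whole-word noise terms;
    # no stemming or stop-word removal is performed.
    kept = [word for word in text.split() if word not in NOISE_TERMS]
    return " ".join(kept)


if __name__ == "__main__":
    sample = "VEHICLE 1 overturned; next of KIN notified. See http://example.com"
    print(normalize_narrative(sample))
```

Terms such as "notified" survive the filter, which matches the stated intent of preserving crash-related vocabulary.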
3.3.3. Advanced Encoding
Advanced transformer models like BERT were employed to process crash narratives effectively. Raw text inputs were encoded using DistilBERT, with each narrative truncated or padded to a maximum length of 200 tokens to standardize input dimensions. This approach aligns with modern research emphasizing minimal preprocessing for deep learning models, ensuring that nuanced meanings and contextual details are preserved.
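The truncate-or-pad step can be illustrated without any model dependency. The sketch below assumes that token ids have already been produced by a tokenizer and that id 0 is the padding token; both are assumptions for illustration, and `pad_or_truncate` is a hypothetical helper, not a library function. A real tokenizer would also keep its special end-of-sequence token when truncating, which this simplification ignores.

```python
MAX_LEN = 200  # maximum sequence length stated in the text
PAD_ID = 0     # assumption: id 0 denotes the padding token


def pad_or_truncate(token_ids, max_len=MAX_LEN, pad_id=PAD_ID):
    """Return (ids, attention_mask), each exactly max_len items long."""
    ids = list(token_ids)[:max_len]   # truncate long narratives
    mask = [1] * len(ids)             # 1 marks real tokens
    n_pad = max_len - len(ids)        # number of padding slots needed
    return ids + [pad_id] * n_pad, mask + [0] * n_pad


if __name__ == "__main__":
    ids, mask = pad_or_truncate([101, 7, 8, 9, 102], max_len=8)
    print(ids)
    print(mask)
```

The attention mask lets a downstream model ignore padded positions, so short and long narratives can share one fixed input shape.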
Following preprocessing, exploratory data analysis (EDA) was conducted to uncover initial insights and patterns. This included descriptive statistics and visualization techniques to highlight prevalent terms and relationships in the data. The standardized and refined datasets provided a robust foundation for downstream analyses, including the application of BERT/Bi-LSTM models and SHAP-based interpretations to uncover key crash factors.
3.4. Exploratory Data Analysis (EDA)
Following preprocessing, we conducted Exploratory Data Analysis (EDA) to uncover patterns, trends, and potential outliers in the data. Descriptive statistics and graphical representations, such as unigram and bigram frequency visualizations, highlighted prevalent terms and relationships, providing valuable insights into crash dynamics across the two datasets. However, we acknowledge that aligning crash severity categories during data standardization may introduce potential biases, as cultural and contextual differences in reporting practices could affect the comparability of datasets. For example, terms describing technical defects in Jordan may differ in granularity from terminology used in the USA, necessitating careful interpretation of results.
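The unigram and bigram frequency counts mentioned above can be sketched with the standard library. This is an illustrative sketch that assumes whitespace-separated, already-normalized narratives; the function name `ngram_counts` and the sample sentences are hypothetical.

```python
from collections import Counter


def ngram_counts(narratives, n):
    """Count n-grams (as tuples of words) across a list of narratives."""
    counts = Counter()
    for text in narratives:
        words = text.split()
        # zip over n shifted copies of the word list yields consecutive n-grams.
        counts.update(zip(*(words[i:] for i in range(n))))
    return counts


if __name__ == "__main__":
    docs = ["driver lost control", "driver lost control of truck"]
    print(ngram_counts(docs, 1).most_common(3))
    print(ngram_counts(docs, 2).most_common(3))
```

Counting n-grams per dataset and comparing the top entries side by side is one simple way to surface differences in reporting vocabulary before any modeling.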
The refined dataset, which retained its contextual integrity, provided a solid foundation for subsequent analysis, as summarized in
Table 3 and
Table 4.
Based on the descriptive statistics provided in
Table 3 and
Table 4, we conducted a comparative analysis of traffic crash patterns between Jordan and the USA, revealing notable differences and similarities across various dimensions. In terms of crash severity, both countries predominantly experience personal injury outcomes, yet Jordan has a significantly lower fatality rate (1.93%) compared to the USA (7.49%). Crash types in Jordan are overwhelmingly collisions (93.22%), in stark contrast to the USA’s diverse crash types, including fixed object collisions and overturns. Light conditions further differentiate the two, with most crashes in Jordan occurring during daylight (69.52%) and the USA experiencing a majority under dark-unlighted conditions (67.40%). Jordan exhibits a higher incidence of multi-vehicle collisions (71.62%), whereas single-vehicle crashes are more prevalent in the USA (54.85%).
Seasonal trends show Jordan peaked in crashes during Autumn (71%), aligning more closely with the USA’s increased crashes in colder months. Vehicle types involved in crashes also vary, with Jordan dominated by small ride-on cars (64.08%) and the USA showing a broader distribution, including motor vehicles and trucks. Crash timing indicates Jordan’s crashes peak during evening and midday, contrasting with the USA’s pattern of nighttime crashes. Lastly, the driver age group involved in crashes in Jordan is predominantly the 25–54 age group (73.91%), whereas the USA sees a broader age distribution, including a notable involvement of younger drivers. These findings underscore the complex interplay of environmental, societal, and regulatory factors influencing road safety in each country, highlighting the need for tailored road safety measures and policies. In the USA, a vast majority of crashes occur during the day, with a significant drop-off as lighting conditions worsen, suggesting that visibility plays a critical role in traffic safety.
Conversely, in Jordan, crashes are more evenly distributed between daylight and dark but lighted conditions, with fewer incidents occurring in the dark unlighted, which may indicate better adaptation or less traffic during these times. This comparison reveals stark differences in how light conditions affect driving safety in the two countries. While the USA clearly prefers daytime driving safety, Jordan displays a higher tolerance for less optimal lighting conditions. This insight could inform targeted strategies for improving road safety tailored to the unique driving environments of each country.
3.5. Topic Modeling Using BERT
BERT’s ability to capture bidirectional contexts has advanced the development of sophisticated topic modeling approaches that go beyond simple word co-occurrence, allowing for the extraction of semantically coherent and contextually relevant topics. This evolution from earlier models like Latent Semantic Analysis (LSA) to BERT highlights the shift from linear algebra-based and probabilistic methods to deep learning, marking significant progress toward richer, more nuanced text representations for topic modeling. The use of contextual embeddings has made BERT a valuable tool for identifying underlying themes in crash narratives. A critical component of topic modeling involves selecting a hyperparameter, denoted as
k, which represents the number of latent topics. However, determining the optimal number of topics is often challenging, as it affects the quality and meaningfulness of the extracted topics [
34].
BERT continues to be widely employed for natural language processing (NLP) tasks due to its ability to capture deep contextual information. This strength lies in its foundation on the transformer architecture introduced by Vaswani et al., which employs stacked self-attention encoder blocks to process text in parallel rather than sequentially [
42]. BERT leverages its bidirectional and context-sensitive nature to improve language understanding and prediction accuracy. Since textual data can often include symbols and numbers that introduce noise during data processing, standard preprocessing techniques like text cleaning, stop-word removal, tokenization, stemming, and word embedding are applied. However, these preprocessing steps can sometimes remove valuable contextual or semantic information from sentences or phrases. BERT addresses this limitation by being fine-tuned with an additional output layer, making it adaptable for a wide range of NLP tasks—such as question answering and language inference—without significant task-specific architecture changes.
Pre-trained on large corpora such as English Wikipedia and BooksCorpus (800 M words), BERT is also available in multilingual variants covering over 100 languages. Its pre-training objectives, masked language modeling and next-sentence prediction, further enhance its versatility. BERT incorporates three embedding layers: token, segment, and position embeddings [
31]. In classification tasks, the Bi-Directional Long Short-Term Memory (Bi-LSTM) network processes the 768-dimensional hidden state of the [CLS] token representation produced by BERT.
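The way the three embedding layers combine can be shown with a toy example. The sketch below sums token, segment, and position vectors element-wise for each input position; the tiny two-dimensional tables are made-up values for illustration (BERT-base uses 768 dimensions and also applies layer normalization and dropout afterwards, which are omitted here). The function name `embed` is hypothetical.

```python
def embed(token_ids, segment_ids, tok_table, seg_table, pos_table):
    """Sum token, segment, and position embeddings for each position."""
    output = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vector = [
            t + s + p
            for t, s, p in zip(tok_table[tok], seg_table[seg], pos_table[pos])
        ]
        output.append(vector)
    return output


if __name__ == "__main__":
    # Made-up lookup tables: 2 tokens, 2 segments, 2 positions, dimension 2.
    tok_table = [[1, 2], [3, 4]]
    seg_table = [[10, 20], [30, 40]]
    pos_table = [[100, 200], [300, 400]]
    print(embed([1, 0], [0, 1], tok_table, seg_table, pos_table))
```

Because position vectors differ by index, the same word receives a different input vector depending on where it appears in the narrative.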
Models like TK-BERT and BERT-LDA have been shown to outperform traditional methods such as Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) by integrating contextual semantics and thematic narratives [
43]. Coherence was the primary metric in this study, as it reflects the logical association between words within a topic, ensuring the validity and relevance of the generated topics. The number of topics, k, was selected in collaboration with domain experts to maximize coherence and interpretability. While coherence was central to our evaluation, future studies may benefit from incorporating additional metrics, such as exclusivity, to further enhance the uniqueness and interpretability of topics.
3.6. Text Classification Using BERT/Bi-LSTM
Long Short-Term Memory (LSTM), a form of recurrent neural network (RNN), has been extensively used in natural language processing and sequential data analysis, due to its ability to capture long-term dependencies [
44]. Unlike standard RNNs, LSTM networks use memory cells and gating mechanisms to store and retrieve information over long sequences. This design mitigates the vanishing-gradient problem that arises when training recurrent networks on long sequences, making LSTMs particularly suitable for language modeling, text classification, and time series prediction, among other applications. Their ability to model sequential relationships is critical in analyzing crash narratives, where temporal context plays a vital role.
The LSTM architecture centers on a memory unit whose central memory cell carries information across time steps. This structure enables the network to capture both short- and long-term dependencies in sequential data, which is particularly advantageous in applications where understanding such relationships is essential [
44].
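The gating mechanism can be made concrete with a single-unit LSTM step. The sketch below uses the standard LSTM update equations with scalar weights; the parameter layout (a dictionary mapping each gate name to an input weight, a recurrent weight, and a bias) and the function names are illustrative assumptions, not the study's implementation. A bidirectional pass is included by running the same recurrence forward and backward and pairing the hidden states.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM; params maps gate -> (w_x, w_h, b)."""
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre("input"))        # how much new information to write
    o = sigmoid(pre("output"))       # how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate values for the cell state
    c = f * c_prev + i * g           # updated memory cell
    h = o * math.tanh(c)             # updated hidden state
    return h, c


def run_lstm(xs, params):
    """Run the cell over a sequence, returning the hidden state per step."""
    h, c, states = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, params)
        states.append(h)
    return states


def bilstm(xs, params_fwd, params_bwd):
    """Pair forward and backward hidden states for each position."""
    forward = run_lstm(xs, params_fwd)
    backward = run_lstm(xs[::-1], params_bwd)[::-1]
    return list(zip(forward, backward))


if __name__ == "__main__":
    zero = {g: (0.0, 0.0, 0.0) for g in ("forget", "input", "output", "candidate")}
    print(lstm_step(1.0, 0.0, 2.0, zero))
    print(bilstm([1.0, 2.0, 1.0], zero, zero))
```

With all parameters at zero, each gate sits at 0.5, so the cell state is simply halved at every step; this makes the additive cell update easy to check by hand.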
3.7. Evaluation Metrics
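The BERT/Bi-LSTM model was validated on the test dataset. Classification performance was assessed using accuracy, precision, recall, and F1-score; SHAP values were used to interpret predictions, and the coherence score was used to evaluate topic quality.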
Accuracy measures the overall correctness of the model in classifying fatal and non-fatal crashes. The BERT/Bi-LSTM model achieved 99.5% accuracy for the USA dataset and 99% accuracy for the Jordan dataset, highlighting its effectiveness in classifying crash severity.
Precision indicates the proportion of correctly predicted fatal crashes out of all predicted fatal crashes. High precision reduces the occurrence of false positives, which is critical for reliable crash severity prediction.
Recall measures the proportion of actual fatal crashes that the model correctly identifies. This metric is vital for ensuring that fatal crashes are not overlooked.
The F1-score balances precision and recall, ensuring the model performs well in both aspects. It is particularly important in this study to ensure both false positives and false negatives are minimized. The F1-scores for BERT/Bi-LSTM models exceeded 98% for both datasets.
SHAP values were used to interpret the model predictions, identifying key features such as “overturned” and “attempted” for fatal crashes in the USA, and “damage” and “technical” issues for Jordan. This interpretability is crucial for understanding the underlying factors contributing to crashes in different cultural contexts.
For topic modeling, the coherence score was used to evaluate the semantic association of words within each topic. Higher coherence indicates better quality topics. The USA dataset achieved the highest coherence with 25 topics, while Jordan peaked with 10 topics, reflecting the differences in crash narratives between the two regions.
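The four classification metrics defined above follow directly from the confusion counts. The sketch below treats "Fatal" as the positive class, as in the definitions of precision and recall given here; the function name `classification_metrics` and the example labels are illustrative only and do not reproduce the study's results.

```python
def classification_metrics(y_true, y_pred, positive="Fatal"):
    """Compute accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    truth = ["Fatal", "Fatal", "Non-fatal", "Non-fatal", "Non-fatal"]
    preds = ["Fatal", "Non-fatal", "Fatal", "Non-fatal", "Non-fatal"]
    print(classification_metrics(truth, preds))
```

When fatal crashes are a small share of the data, accuracy alone can look high even if many fatal cases are missed, which is why precision, recall, and F1 are reported alongside it.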
The coherence score $C$ was computed using four different statistics: the segment of confirmed co-occurrences of two words, the probability of these two words appearing together, and their probabilities. The $C$ for a single topic is as follows [45]:
$$C = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \mathrm{score}(w_i, w_j)$$
where $C$ is the coherence score for a single topic, measuring the logical association between words within that topic; $N$ is the number of words in a topic; $i$ and $j$ are the positions of words in a topic; and $\mathrm{score}(w_i, w_j)$ is a function that calculates a score for the relationship between two words, $w_i$ and $w_j$, at positions $i$ and $j$ within the topic.
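The pairwise aggregation described above can be sketched as follows. The study does not state which word-pair scoring function was used, so this example substitutes a smoothed pointwise mutual information (PMI) based on document co-occurrence; that choice, the smoothing constant `eps`, and the function names are assumptions made purely for illustration.

```python
import math
from itertools import combinations


def pmi_score(w_i, w_j, doc_sets, eps=1e-12):
    """Smoothed PMI of two words from document-level co-occurrence."""
    n = len(doc_sets)
    p_i = sum(w_i in d for d in doc_sets) / n
    p_j = sum(w_j in d for d in doc_sets) / n
    p_ij = sum(w_i in d and w_j in d for d in doc_sets) / n
    if p_i == 0 or p_j == 0:
        return 0.0  # assumption: unseen words contribute nothing
    return math.log((p_ij + eps) / (p_i * p_j))


def topic_coherence(topic_words, docs, score=pmi_score):
    """Sum the pairwise score over all word pairs (i < j) in one topic."""
    doc_sets = [set(d.split()) for d in docs]
    return sum(
        score(w_i, w_j, doc_sets) for w_i, w_j in combinations(topic_words, 2)
    )


if __name__ == "__main__":
    docs = [
        "truck overturned on highway",
        "truck overturned near exit",
        "brake defect caused damage",
        "brake defect reported",
    ]
    print(topic_coherence(["truck", "overturned"], docs))
    print(topic_coherence(["truck", "brake"], docs))
```

Words that tend to appear in the same narratives yield a higher score than words that never co-occur, which is the behavior a coherence measure is meant to capture when comparing candidate numbers of topics.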
6. Discussion
The study presented in this paper provides a comprehensive analysis of traffic crash patterns between the United States and Jordan, employing advanced Natural Language Processing (NLP) techniques, specifically Bidirectional Encoder Representations from Transformers (BERT), combined with Shapley Additive Explanations (SHAP) for interpretability. This cross-cultural examination sheds light on significant differences and underlying factors contributing to crash types and outcomes in developed versus developing country contexts.
The utilization of BERT and SHAP in this research represents a significant advancement in crash pattern analysis. Traditional statistical models, while helpful, often fall short of capturing the nuanced and complex nature of crash data narratives. The application of BERT allows for a more nuanced understanding of textual data, capturing the context and subtleties within crash reports. Coupled with SHAP, this approach provides clear insights into factors contributing to crash severity, enabling a more detailed and understandable analysis than previously possible with conventional methods. For example, SHAP analysis in Jordan identified “technical issues” as a critical factor, pointing to the need for stricter vehicle inspection policies and public awareness about vehicle maintenance. Similarly, in the USA, “alcohol-related incidents” emerged as a significant risk factor, emphasizing the importance of enhanced DUI enforcement and community education campaigns. These insights demonstrate the potential of SHAP to bridge the gap between data-driven findings and actionable policy measures.
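To make the attribution idea behind SHAP concrete, the sketch below computes exact Shapley values for a toy scoring function over three words. It is not the SHAP library's API and not the study's model: the words come from the findings reported here, but the weights, the interaction bonus, and the names `toy_fatal_score` and `shapley_values` are invented for illustration. Exact enumeration is feasible only for a handful of features; practical tools approximate these values.

```python
from itertools import combinations
from math import factorial

# Made-up additive weights for a toy "fatal crash" score.
WEIGHTS = {"overturned": 0.4, "attempted": 0.2, "the": 0.0}


def toy_fatal_score(present):
    """Toy model: additive word weights plus one interaction bonus."""
    score = sum(WEIGHTS[w] for w in present)
    if "overturned" in present and "attempted" in present:
        score += 0.2  # hypothetical interaction between the two words
    return score


def shapley_values(features, value_fn):
    """Exact Shapley values by enumerating all subsets of other features."""
    n = len(features)
    phi = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of adding this feature to subset s.
                total += weight * (value_fn(s | {feature}) - value_fn(s))
        phi[feature] = total
    return phi


if __name__ == "__main__":
    print(shapley_values(["overturned", "attempted", "the"], toy_fatal_score))
```

The attributions sum to the difference between the score with all words present and the score with none, and the interaction bonus is split evenly between the two words that produce it.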
The findings highlight distinct differences in crash characteristics between the two countries. In the United States, factors such as vehicle overturns and attempted maneuvers are prevalent in fatal crashes, possibly reflecting a higher incidence of high-speed or evasive driving scenarios. Conversely, in Jordan, crash severity is more closely associated with damage extent and resultant actions, suggesting variations in vehicle safety standards and road conditions. Seasonal variation in crash occurrences, with higher rates in Autumn for Jordan and winter for the USA, indicates environmental and cultural influences on driving patterns.
Figure 18 (Comparative Analysis of Key Crash Factors (SHAP Values): USA vs. Jordan) provides a clearer visualization of these regional differences. This figure consolidates the SHAP-derived insights for fatal and non-fatal crashes across the two regions, highlighting the key contributing factors and their relative importance.
Moreover, the high accuracy of the BERT/Bi-LSTM models in predicting crash severity demonstrates the potential of machine learning in enhancing traffic safety research by identifying specific risk factors and informing targeted interventions. The approach is also adaptable to other regions, as BERT models can be fine-tuned for additional languages and cultural contexts, providing a flexible framework for crash narrative analysis worldwide.
Table 9 compares our study’s findings with other relevant studies, showing the relative accuracy and objectives across different research contexts.
6.1. Results in Relation to Existing Literature
The findings of this research align with existing literature on traffic safety and accident analysis. Key elements such as driver error, vehicle overturning, and mechanical issues are consistent with studies that identify these factors as significant contributors to accident severity. Both datasets highlight the substantial impact of human error, reinforcing earlier research that emphasizes the universal influence of driver conduct on road incidents [
60,
61,
62,
63,
64,
65]. Additionally, the increase in crash severity during nighttime driving is supported by road safety trends, where reduced visibility is linked to higher accident risks [
66].
This study also advances the understanding of vehicle-related factors by identifying technical defects as a major predictor of crash severity in Jordan, which reflects challenges seen in other developing regions where aging vehicle fleets and inadequate maintenance standards prevail. Unlike in Western contexts, where impaired driving, particularly related to alcohol use, is a dominant factor, this study emphasizes the importance of vehicle age and technical defects in shaping crash risks, offering insights into region-specific road safety challenges [
67]. These insights illustrate the methodology’s ability to generalize findings for diverse cultural and infrastructural contexts, enabling its application to datasets from other countries with minimal adaptation.
6.2. Implications for Road Safety Policy and Practice
This study’s cross-cultural analysis highlights important regional differences and shared risk factors in road safety for the USA and Jordan, underscoring the necessity of tailoring road safety strategies to local conditions. By utilizing BERT and SHAP models, we ensure that key crash patterns are interpretable, enhancing the validity of the cross-cultural analysis by identifying distinct and shared contributors without relying solely on observed correlations. The methodology’s modular design allows it to be extended to other countries, facilitating comparative studies across regions with varying socio-economic and infrastructural conditions.
6.2.1. United States Road Safety Policy
The frequent occurrence of rollover accidents and high-speed incidents in rural areas suggests a need for targeted rural highway safety strategies. Decision-makers could consider increased guardrail usage, road curve optimization, and adjusted speed limits in rural zones to reduce rollover risks. Additionally, the prevalence of terms like “attempted” in crash reports indicates that evasive maneuvers may contribute to crash severity. Strengthening driver training programs, especially for emergency responses, could help mitigate such risks.
6.2.2. Jordan Road Safety Policy
In Jordan, where crash reports frequently highlight issues related to vehicle age and technical defects, policy efforts could benefit from more rigorous inspection criteria and improved maintenance regulations. Given the aging vehicle fleet, periodic inspections to enforce safety standards are essential. Public education campaigns that emphasize the importance of vehicle maintenance could also reduce crash risks associated with mechanical failures.
6.2.3. Global Road Safety Insights
This study reveals common factors affecting road safety in both countries, including driver behavior and environmental conditions, such as nighttime driving and adverse weather. These insights offer valuable guidance for international road safety organizations in developing awareness campaigns that address global risks, such as inadequate lighting and poor weather conditions.
6.2.4. AI Application in Road Safety Monitoring
The use of BERT/Bi-LSTM models with SHAP values improves the precision and transparency of crash predictions. These technologies can be integrated into road safety monitoring systems for real-time crash analysis, enabling authorities to identify emerging risk factors more effectively. SHAP’s explainability makes the models accessible to policymakers, allowing them to make informed, data-driven decisions.
7. Policy Recommendations
The comparative analysis of crash patterns between the USA and Jordan highlights critical policy implications for both nations. This research suggests several key areas where targeted interventions can improve road safety outcomes tailored to the unique driving environments and crash dynamics observed in each country.
7.1. Road Infrastructure Improvements
In the USA, the high frequency of overturned vehicles in crash narratives indicates a need for enhanced road infrastructure, particularly on highways and rural roads. Investing in improved guardrails, better road curvature design, and enhanced road signage could help mitigate rollovers and vehicle ejections, which are prevalent in fatal crashes. On the other hand, Jordan’s crash narratives emphasize multi-vehicle collisions and technical issues, suggesting that improving urban road designs—such as clearer lane markings and dedicated lanes for high-traffic areas—would address common road usage challenges. Enhancing road maintenance efforts to accommodate Jordan’s aging vehicle fleet could also play a pivotal role in reducing crashes related to technical defects.
7.2. Vehicle Safety Standards and Maintenance Regulations
The prominence of technical issues in Jordanian crashes highlights the need for more rigorous vehicle inspection and maintenance standards. Implementing regular vehicle safety checks and enforcing strict penalties for non-compliance could help prevent crashes caused by mechanical failures. Additionally, promoting public awareness campaigns on the importance of vehicle maintenance would ensure better upkeep, especially in regions where older vehicles are more common. In contrast, the USA might benefit from focusing on driver safety education to prevent rollovers and single-vehicle crashes. Advanced driver assistance systems (ADAS), such as electronic stability control, should be more widely promoted to help reduce the number of fatal crashes.
7.3. Emergency Response and Medical Preparedness
The USA’s crash data reveals a higher fatality rate, with many fatalities involving multiple vehicles and complex crash dynamics like vehicle ejections and rollovers. Improving emergency response times and equipping rural and highway areas with advanced medical response systems could save lives in these fatal cases. Furthermore, training first responders to deal with high-severity crash scenarios, especially involving overturned vehicles, could lead to better outcomes. Jordan’s crash narratives, while generally involving fewer fatal crashes, suggest that enhancing emergency medical services (EMS), particularly in rural and underdeveloped regions, could significantly reduce fatality rates by ensuring faster and more effective medical interventions.
7.4. Targeted Road Safety Campaigns
The distinct differences in crash factors between the USA and Jordan underscore the need for tailored road safety campaigns. In the USA, the prominence of terms like “alcohol”, “roadway”, and “overturn” suggests a need for campaigns targeting substance abuse prevention and safe driving practices on rural roads. Conversely, Jordan’s crash narratives, which emphasize technical issues and vehicle defects, indicate the need for public education around safe vehicle operation, regular maintenance, and road safety measures. Focusing on these specific areas would better align safety interventions with the realities of driving conditions in each country.
7.5. Data Collection and Reporting Improvements
The study’s findings highlight the importance of standardizing data collection and reporting practices to enable more accurate cross-cultural comparisons. In the USA, the detailed reporting of environmental conditions and vehicle dynamics provides valuable insights but could be enhanced by capturing real-time driving conditions, such as weather, road surface, and visibility at the time of the crash. In Jordan, improving the completeness and consistency of crash reports, especially in rural areas, could provide more reliable data for analysis. Both countries could benefit from adopting unified data collection protocols, incorporating comprehensive vehicle, environmental, and behavioral factors to enhance the quality of crash data and, ultimately, road safety policies.
7.6. International Collaboration and Knowledge Sharing
The cross-cultural differences observed in this study highlight the need for global collaboration in road safety. By establishing data-sharing partnerships and engaging in international research efforts, both countries can benefit from shared insights into effective road safety strategies. The USA’s focus on mitigating rollovers and addressing wildlife interactions, and Jordan’s emphasis on technical defects and urban collisions, suggest that combining these insights could lead to innovative road safety interventions that address a broader range of risk factors across different regions.
8. Conclusions
This study pioneers a cross-cultural examination of road traffic crashes in the United States and Jordan by deploying advanced text-mining techniques to elucidate the nuanced factors that influence injury severity outcomes. Utilizing traffic crash narratives alongside quantitative data, the research aims to extract distinctive contributory elements from two divergent driving contexts, thereby offering a richer understanding of the factors correlating with injury severities. The study is grounded in a methodical framework encompassing standardization and harmonization of datasets, exploratory data analysis (EDA), and sophisticated AI modeling with BERT classifiers, supplemented by SHAP interpretability to ensure transparency in AI predictions. This multifaceted approach allows for an in-depth comparative analysis of crash severity determinants between the culturally distinct regions, highlighting unique and shared factors. An equal sample of 6359 crashes from each country’s national transportation agency data keeps the comparison balanced across the two countries.
The USA models display high accuracy in crash severity classification (99.5%), while Jordan’s models maintain 99% accuracy with slightly lower recall, underscoring the complexity of narrative classification in this context. SHAP analysis reveals that ‘overturned’ and ‘attempted’ actions dominate USA fatalities, while ‘damage’ and ‘resulted’ are more critical in Jordan. For non-fatal crashes, ‘failed’, ‘one’, and ‘yield’ issues dominate in the USA, whereas ‘technical’ concerns and ‘collision’ are prevalent in Jordan. Topic modeling further distinguishes the datasets: the USA narratives are diverse, covering wildlife and legal entities, whereas Jordan focuses on technical vehicle details. The USA’s thematic complexity is mirrored in its broader range of topics discussed in crash narratives, as opposed to Jordan’s focused themes.
This analysis underscores the significance of environmental, societal, and regulatory factors shaping road safety. The cultural disparity is evident, for instance, in the prominence of alcohol-related incidents in the USA, absent in Jordan’s data due to cultural differences. Jordan’s emphasis on technical aspects, like vehicle defects, contrasts with the USA’s broader safety and legal discussions. The study presents an intricate portrayal of crash dynamics, emphasizing the importance of culturally informed interventions. It showcases the need for tailored road safety measures reflecting the specific conditions and behavioral patterns in each country to effectively mitigate the occurrence of road traffic crashes and their repercussions.
Limitations and Future Research Directions
While this study provides valuable insights into crash severity factors across diverse cultural contexts, several limitations must be acknowledged. The reliance on SHAP for interpreting model predictions, though effective, is constrained by its dependency on observed data, leaving unobserved variables—such as road conditions, driver fatigue, and environmental factors—unaccounted for [
68]. Latent factors, such as infrastructure quality or vehicle maintenance, which could significantly influence crash severity, remain unexplored due to data unavailability. These constraints are particularly critical in complex crash scenarios, where such variables play a significant role but are challenging to capture.
To address causality more comprehensively, future research could integrate causal inference frameworks such as Directed Acyclic Graphs (DAGs), structural equation modeling, or Bayesian Networks for Latent Variables (BN-LV) [
69,
70]. These approaches would complement SHAP by systematically identifying and modeling relationships between observed and unobserved factors, enabling a more nuanced understanding of crash dynamics and mitigating current methodological limitations.
The topic modeling approach primarily relies on the Coherence metric to assess topic relevance. While useful, this reliance may lead to broad topics dominated by common terms, reducing the granularity of insights and limiting their practical application for tailored road safety interventions. Future research could refine topic modeling by incorporating additional metrics like exclusivity or by integrating expert input, to ensure more contextually meaningful and actionable results.
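As an illustration of what an exclusivity-style metric could look like, the sketch below scores each topic by how much of its top words' probability mass belongs to that topic alone. This is a simplified ratio defined here for illustration, not the FREX formulation and not a metric used in this study; the topic names and word probabilities are invented.

```python
def topic_exclusivity(topic_word_probs, top_n=3):
    """Mean share of each top word's probability that falls in its own topic.

    A score near 1 means the topic's top words rarely appear in other
    topics; a lower score means the topic is dominated by shared terms.
    """
    scores = {}
    for topic, probs in topic_word_probs.items():
        top_words = sorted(probs, key=probs.get, reverse=True)[:top_n]
        ratios = []
        for word in top_words:
            # Total probability of this word summed over every topic.
            total = sum(other.get(word, 0.0) for other in topic_word_probs.values())
            ratios.append(probs[word] / total if total else 0.0)
        scores[topic] = sum(ratios) / len(ratios)
    return scores


# Invented topic-word distributions, for illustration only.
toy_topics = {
    "mechanical": {"brake": 0.4, "tire": 0.3, "vehicle": 0.3},
    "collision": {"vehicle": 0.5, "lane": 0.3, "rear": 0.2},
}
exclusivity = topic_exclusivity(toy_topics)
for topic, score in exclusivity.items():
    print(f"{topic}: {score:.3f}")
```

Here the shared term "vehicle" lowers the score of both topics, which is the kind of signal that coherence alone does not expose.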
Although model performance was validated using precision, recall, and F1 metrics, future studies could benefit from simulation-based approaches to enhance findings. Simulations replicating various crash scenarios would provide a dynamic perspective on real-world implications, enabling evaluations of road safety interventions under diverse environmental and cultural contexts. This approach would supplement performance metrics and deepen the analysis of causal relationships between crash factors and outcomes.
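For reference, the three validation metrics named above can be computed from prediction counts as in the following minimal sketch. The labels are invented to show how overall accuracy can remain high while metrics for the rarer fatal class are much lower; they are not results from this study.

```python
def precision_recall_f1(y_true, y_pred, positive="fatal"):
    """Precision, recall, and F1 for one class, from paired label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented labels: 2 fatal and 8 non-fatal crashes.
y_true = ["fatal"] * 2 + ["non-fatal"] * 8
y_pred = ["fatal", "non-fatal"] + ["non-fatal"] * 7 + ["fatal"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```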
The imbalance between fatal and non-fatal crashes in the dataset poses another challenge, as the prevalence of non-fatal cases may bias model predictions. Future research should explore techniques such as data augmentation or resampling to address this imbalance, thereby enhancing model robustness. Furthermore, incomplete crash reports and differences in reporting practices between regions limit the comprehensiveness of the analysis. Standardizing crash reporting protocols across regions could mitigate these gaps and provide a more uniform basis for analysis.
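One of the resampling options mentioned above, random oversampling of the minority class, can be sketched as follows. The record layout, the `severity` key, and the function name are assumptions made for illustration; in practice such resampling would normally be applied to the training split only, so that duplicated records do not leak into evaluation.

```python
import random
from collections import Counter


def random_oversample(records, label_key="severity", seed=0):
    """Duplicate minority-class records at random until classes are equal."""
    rng = random.Random(seed)
    by_label = {}
    for rec in records:
        by_label.setdefault(rec[label_key], []).append(rec)
    # Every class is brought up to the size of the largest class.
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Invented records: 2 fatal and 8 non-fatal narratives.
crashes = (
    [{"narrative": f"fatal narrative {i}", "severity": "fatal"} for i in range(2)]
    + [{"narrative": f"non-fatal narrative {i}", "severity": "non-fatal"}
       for i in range(8)]
)
balanced = random_oversample(crashes)
counts = Counter(rec["severity"] for rec in balanced)
print(counts)
```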
Finally, incorporating real-time data and additional contextual variables—such as road type, weather conditions, and traffic density—would improve the model’s applicability to road safety research. Leveraging real-time monitoring systems could enhance prediction accuracy and enable proactive safety measures. Extending this cross-cultural analysis to other regions and adapting the methodology to various cultural and infrastructural contexts would support targeted interventions tailored to specific regional road safety challenges. These directions build upon the unique elements of this study, providing a roadmap for future advancements in traffic safety research.