Article

Investigation on Hazardous Material Truck Involved Fatal Crashes Using Cluster Correspondence Analysis

Road Safety Research Center, Research Institute of Highway Ministry of Transport, No. 8 Xitucheng Road, Haidian District, Beijing 100088, China
*
Author to whom correspondence should be addressed.
Sustainability 2023, 15(12), 9369; https://doi.org/10.3390/su15129369
Submission received: 11 April 2023 / Revised: 4 June 2023 / Accepted: 8 June 2023 / Published: 9 June 2023

Abstract

Although hazardous material (HAZMAT) truck-involved crashes are uncommon compared to other types of traffic crashes, they pose considerable threats to the public, property, and the environment because they combine low probability with high consequences. Using ten-year (2010–2019) crash data from the Fatality Analysis Reporting System (FARS) database, this study applies cluster correspondence analysis to identify the underlying patterns and the associations between the risk factors for HAZMAT-truck-involved fatal crashes. The method projects the categorical variables (including the crash, road, driver, vehicle, and environmental characteristics) into a low-dimensional space and groups the crashes into clusters, with the number of clusters selected by a clustering validation criterion. This study reveals that fatal HAZMAT-truck-involved crashes are highly distinguishable with respect to collision types (angle and front-to-front crashes, single-vehicle crashes, and front-to-rear crashes) and roadway geometric variables, such as two-way undivided roadways, curve alignments, and high-speed (65 mph or more) urban interstate highways. Driver behavior (distraction, sleep or fatigue, and physical impairment), lighting conditions (dark–lighted and dark–not lighted), and adverse weather are also interrelated. The findings from this study will help HAZMAT carriers, transportation management authorities, and policymakers develop targeted countermeasures for HAZMAT-truck-involved crash reduction and safety improvement.

1. Introduction

Due to the rapid and broad development of urbanization and industrialization, the demand for hazardous materials (HAZMAT) has increased significantly. According to the 2017 Economic Census: Transportation report, HAZMAT shipments were approximately 3 billion tons in the United States in 2017. Trucks as a single mode accounted for nearly 65% of all HAZMAT shipments in value and over 60% of all HAZMAT shipments in tonnage [1]. HAZMAT-truck-involved crashes are relatively infrequent compared to general traffic crashes. However, these crashes tend to cause more fatalities and severe injuries. In the United States, 5005 heavy trucks were involved in fatal crashes in 2019, with 120 (2.4%) carrying HAZMAT [2]. The US Department of Transportation compiled a review of HAZMAT-related incidents during the ten-year period (2012–2021), which showed that HAZMAT highway transportation accounted for 100% of the total number of fatalities (83) and 74.02% of the total number of injuries (1732) [3].
Because of their unique physical and chemical properties, particularly the higher possibility of spillage, fire, and explosion during or after an incident, HAZMAT-truck-involved crashes are frequently described as “low probability with high consequences” events [4]. According to guidance issued by the Federal Emergency Management Agency in 2019, these crashes may cause harm to people, the environment, critical infrastructure, and property. Their potential for harm exists regardless of whether hazardous materials are released by an accident, a fire, or a weather-related event [5]. Hazardous material incidents affect various community stakeholders. First responders, first receivers, transportation personnel, local residents, and students are all at risk of negative health effects from these substances. Understanding where, when, and why crashes happen is critical to reducing the number of crashes and improving HAZMAT road transportation safety.
Many research efforts aim to identify the risk factors and the relationships among variables that influence HAZMAT-truck-involved crashes. Previous studies have shown that traditional statistical regression models can effectively establish the associations between different explanatory variables and crash occurrence or severity. However, these parameter-oriented methods often assume that the crash data follow a specific distribution, which makes it difficult to reveal heterogeneous relationships in the crash data. A growing number of researchers apply data mining and machine learning techniques to investigate the risk factors and their impacts on crashes. Because of their greater flexibility and comparable or superior performance, these methods can assist in comprehensive crash analysis and decision-making. Nevertheless, little research has used these approaches for HAZMAT-truck-involved crash analysis. Thus, there is a need for research efforts with additional resources and newer approaches and techniques in HAZMAT safety analysis. By focusing on the following objectives, this study aims to contribute to safer HAZMAT road transportation.
  • Analyze the fatal HAZMAT-truck-involved crash characteristics and the risk factors;
  • Investigate the crash patterns and the collective associations between risk factors and fatal HAZMAT-truck-involved crashes;
  • Apply machine learning techniques for HAZMAT crash pattern recognition.
Using ten-year (2010–2019) crash data from the Fatality Analysis Reporting System (FARS) database, this study applies cluster correspondence analysis to identify the underlying patterns and the associations between the risk factors for HAZMAT-truck-involved fatal crashes, intending to present the collective associations of risk factors that have yet to be explored or hidden in the whole crash dataset. This approach integrates cluster analysis and correspondence analysis by grouping the crash data into different clusters. A low-dimensional space based on the optimal scaling values projects the categorical variables (including the crash, road, driver, vehicle, and environmental characteristics) in a multivariate dataset. The findings from this study will help HAZMAT carriers, transportation management authorities, and policymakers develop potential targeted countermeasures for HAZMAT-truck-involved crash reduction and safety improvement.

2. Literature Review

Many past studies have identified and investigated the risk factors influencing the frequency and severity of HAZMAT-truck-involved crashes. The commonly used methods for determining the relationship between explanatory variables and crashes are exploratory data analysis with statistical description [6,7,8] and regression modeling [9,10,11,12]. Shen et al. [8] conducted a statistical distribution study of 708 HAZMAT road transportation crashes in China from 2004 to 2011. The most common collision types were rollover, run-off-road, and rear-end collisions. Freeways, early morning (4:00–6:00 a.m.) and midday (10:00 a.m.–12:00 p.m.) hours, vehicle-related defects, and human-related errors were all major influencing factors for HAZMAT tanker crashes. Uddin and Huynh [9] investigated factors affecting crash severity involving HAZMAT large trucks through fixed- and random-parameters ordered probit models. The model results showed that male occupants, dark–unlighted and dark–lighted conditions, crashes in rural areas, and weekday crashes are associated with a higher probability of severe injuries. Xing et al. [10] collected 1721 HAZMAT crashes in China from 2014 to 2017. They developed a random-parameters ordered probit model to quantify the influence of risk factors on HAZMAT crash severity, accounting for the unobserved heterogeneity in the crash data. HAZMAT category (compressed gas, explosive, and poison), roadway characteristics (tunnel, slope, county road, and dry road surface), collision types (rear-end and multi-vehicle crashes), driver characteristics (misoperation, fatigue, and speeding), and environmental characteristics (winter and dark lighting conditions) were associated with increased probability of severe injury crashes. Song et al. [11] developed an ordered logistic model for factor analysis of HAZMAT road transportation crashes. 
According to their findings, the six most important factors affecting HAZMAT crash severity were traffic volume, driver fatigue or falling asleep, number of lanes, speeding, adverse weather conditions, and lighting conditions. Similarly, Ma et al. [12] applied the ordered logistic model to investigate the relationship between risk factors and HAZMAT crashes. The model estimation results revealed that violations, unsafe driving behaviors, vehicle defects, vehicle types, weather, lighting, seasonal distribution, and road grade are closely related to more severe crashes. Other regression models commonly used in large-truck-involved crash studies include ordered probit [13], multinomial logit [14,15], mixed logit [16,17,18,19], hierarchical Bayesian random intercept [20], Bayesian binary logit [21], and Bayesian model averaging-based logistic regression [22].
Traditional statistical models can successfully identify the correlations between HAZMAT crashes and independent variables. However, they usually require certain assumptions about the crash data, and it is challenging to discover the underlying crash patterns and the interactions among variables with them. Data mining and machine learning approaches for truck-related crash data processing and analysis, including classification trees [23], taxicab correspondence analysis [24], association rules mining [25], and Bayesian networks [26,27,28], have become increasingly popular in recent years. For example, Das et al. [24] used taxicab correspondence analysis to uncover the complex interactions between multiple risk factors and large truck-involved fatal crashes. The study identified five clusters displaying the association patterns of different variables, such as intersection types, posted speed limits, two-lane undivided roadways, collision types, number of vehicles, driver impairment, and weather conditions. Using crash data from 2008 to 2017, Hong et al. [25] performed association rules mining to investigate the risk factors contributing to HAZMAT vehicle-related crashes on expressways. Male drivers, daytime, mainline segments, single-vehicle crashes, and clear weather conditions were highly associated with HAZMAT vehicle-related crashes. Zhao et al. [26] used the Bayesian networks approach to prioritize the risk factors that affect HAZMAT road transportation crashes. According to their findings, the most important attributes were packing and loading of HAZMAT, vehicle and facility-related factors, and human factors. Similarly, Ma et al. [27] used Bayesian networks to explore the most probable factors or combinations of factors leading to a crash. A recent HAZMAT crash severity prediction study [28] combined random forest and Bayesian networks. 
Random forest ranked the importance of risk factors, and Bayesian networks developed the probabilistic inference. The results showed that driver characteristics (driver age less than 25, fatigue, distraction, and violation), roadway characteristics (posted speed limits over 66 mph, intersections, ramps, and bridges), crash characteristics (fire/explosion/spill, midnight and early morning hours, and head-on crashes), vehicle characteristics (more than four vehicles), and environmental characteristics (poor lighting conditions and adverse weather) were all related to fatal and severe injury HAZMAT crashes.
Table 1 summarizes HAZMAT road transportation crash studies, including data sources, methods, and considered variables. Although data mining and machine learning approaches can provide comparable or superior performance on crash modeling, such applications in HAZMAT crash risk factor analysis were not a significant focus in the literature. Thus, there still is a need for research efforts with additional resources and innovative methods. This study applies cluster correspondence analysis to explore the complex HAZMAT-truck-involved fatal crash data, reveal the underlying crash patterns, and investigate the relationships between variables (including the crash, road, driver, vehicle, and environmental characteristics).

3. Materials and Methods

3.1. Crash Data

This study collected ten-year (2010–2019) fatal crash data involving HAZMAT trucks from the FARS database. FARS is a nationwide census providing the National Highway Traffic Safety Administration (NHTSA), Congress, and the American public with yearly data on fatal injuries in motor vehicle traffic crashes. This study used the unique crash identification number to retrieve and match data from the various datasets (crash data, vehicle data, and person data). A thorough data cleaning process addressed incorrect values (such as a speed limit recorded as 200 mph) and missing values when developing the final dataset. The fatalities recorded in the source data include all that occurred for any reason in a crash involving HAZMAT trucks. There were 1322 HAZMAT-truck-involved fatal crashes after the data cleaning process. Figure 1 shows the distribution of these crashes across US states from 2010 to 2019.
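The matching and cleaning step described above can be sketched as follows. This is a minimal illustration, not the study's actual pipeline: the field names (`case_id`, `hazmat`, `speed_limit`), the 85 mph plausibility ceiling, and the choice to drop rather than correct bad records are all assumptions made for the example, and the real FARS files use their own variable names.

```python
def build_dataset(crashes, vehicles, max_speed_limit=85):
    """Join vehicle records to crash records by case id and drop implausible rows.

    All field names and the speed-limit ceiling are hypothetical.
    """
    crash_by_id = {c["case_id"]: c for c in crashes}
    merged = []
    for v in vehicles:
        crash = crash_by_id.get(v["case_id"])
        if crash is None or not v["hazmat"]:
            continue  # no matching crash record, or not a HAZMAT vehicle
        limit = crash.get("speed_limit")
        if limit is None or not 0 < limit <= max_speed_limit:
            continue  # missing or implausible speed limit (e.g. 200 mph)
        merged.append({**crash, **v})
    return merged


# Made-up records for illustration only
crashes = [
    {"case_id": 1, "speed_limit": 65},
    {"case_id": 2, "speed_limit": 200},   # recording error
    {"case_id": 3, "speed_limit": None},  # missing value
]
vehicles = [
    {"case_id": 1, "hazmat": True},
    {"case_id": 2, "hazmat": True},
    {"case_id": 3, "hazmat": True},
    {"case_id": 1, "hazmat": False},
]
dataset = build_dataset(crashes, vehicles)
```

Only the first crash survives in this toy input; in practice, one might prefer to recode or impute questionable values instead of discarding the whole record.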
Based on their availability and relevance in explaining fatal crash occurrence, this study selected 26 risk factors, including crash, road, driver, vehicle, and environmental characteristics, as explanatory variables. Table 2 summarizes the descriptive statistics of the HAZMAT-truck-involved fatal crashes. The majority of fatal crashes involving HAZMAT trucks occurred on two-way undivided roadways with speed limits of 50–65 mph on segments of arterials or collectors. Notably, fatal crashes occurring in dark–lighted and dark–not lighted conditions accounted for over 36% of the whole dataset. In addition, more HAZMAT-truck-involved fatal crashes occurred in rural areas (67.32%). Regarding driver characteristics, this study grouped the driver’s age into six levels. Drivers aged 46–55 accounted for the highest proportion in this fatal crash dataset. Some variables are highly skewed. For example, approximately 98% of drivers involved in the crashes are male. The dataset also contains driver status records, such as previously recorded crashes, previously recorded suspensions, revocations, and withdrawals, previous speeding convictions, and previous other moving violation convictions.

3.2. Cluster Correspondence Analysis

Parametric and non-parametric models are the two primary classes of crash data analysis methods used to investigate the associations between risk factors and crash occurrence. In a parametric model, individual variables can explain the impact of the analyzed risk factors on either crash probability or injury severity levels. Random parameter models are one example of a parametric technique that addresses data heterogeneity. However, these parametric modeling techniques require distributional assumptions, which can make it difficult to capture the variation among observations with shared unobserved heterogeneity [29]. Crash data are large and complex and contain many categorical variables, so the input is high-dimensional, and there is a need for advanced methods that can handle nonlinear relationships.
Correspondence analysis is an extension of principal component analysis suited to exploring relationships among categorical data. This multivariate statistical tool uses contingency tables for dimension reduction and data visualization. A low-dimensional space displays the associations between the column elements and row elements in the data matrix, and the positions of the row and column points are consistent with their associations in the table. Many researchers have applied different forms of correspondence analysis in transportation safety studies, such as taxicab correspondence analysis [24,30], multiple correspondence analysis [31,32,33,34,35,36], and joint correspondence analysis [37,38]. As an unsupervised machine learning approach, correspondence analysis does not require any distributional assumption about the data. A previous study has demonstrated its ability to efficiently reduce dimensionality and compile results into easy-to-read plots for in-depth crash analysis [32]. However, distinguishing different clusters depends on subjective judgment, which relies on how the categorical variable categories are approximated in a low-dimensional space [36]. In recent years, some transportation safety studies have used cluster correspondence analysis because of its ability to discover the underlying cluster structures and reduce multicollinearity among data [24,39,40,41,42]. In terms of cluster convergence, this approach has outperformed earlier approaches that apply dimension reduction and clustering either sequentially or concurrently [42].
In general, cluster analysis is an exploratory approach that classifies a set of multivariate observations into groups by maximizing their similarity within each group. Correspondence analysis is a process of identifying, quantifying, separating, and plotting associations among the characteristics and relationships among the various levels. It complements cluster analysis in terms of dimension reduction and data visualization. By combining cluster analysis and correspondence analysis, this approach groups the data points into different clusters based on the profiles of the categorical variables with optimal scaling values, and it depicts the clusters and variables in a low-dimensional approximation. A brief description of cluster correspondence analysis follows. Readers can refer to van de Velden et al. [43,44] for the complete theoretical background.
Suppose the dataset contains $n$ individuals (for example, HAZMAT-truck-involved fatal crashes) described by $p$ categorical variables (for example, driver, road, vehicle, crash, and environmental characteristics), where the $j$-th variable has $q_j$ categories ($j = 1, 2, \ldots, p$). A super indicator matrix $Z$ of size $n \times Q$, where $Q = \sum_{j=1}^{p} q_j$, represents this dataset. Cluster membership can be expressed as an $n \times K$ indicator matrix $Z_K$, where $K$ is the number of clusters. The table cross-tabulating cluster memberships with the categorical variables can be expressed as $F = Z_K' Z$. This approach automatically finds the optimal scaling values for rows (clusters) and columns (categories) so that the between-cluster variance is maximized. That is, it optimally separates the clusters with respect to their distributions over the categorical variables. Similarly and simultaneously, it optimally separates the categories with differing distributions over the clusters. The optimal cluster allocation $Z_K$ can be calculated as [44]:
$$\max\; \phi_{\mathrm{clusca}}(Z_K, B^{*}) = \frac{1}{p}\,\mathrm{trace}\; B^{*\prime} D_Z^{-1/2} Z' M Z_K D_K^{-1} Z_K' M Z D_Z^{-1/2} B^{*}$$
The objective function of cluster correspondence analysis is shown below [44]:
$$\min\; \phi_{\mathrm{clusca}}(Z_K, G) = \left\| \sqrt{\frac{n}{p}}\; M Z D_Z^{-1/2} B^{*} - Z_K G \right\|^{2}$$
where $Z$ is the $n \times Q$ super indicator matrix; $Z_j$ is the $n \times q_j$ indicator matrix for the $j$-th categorical variable, with $Z_j 1_{q_j} = 1_n$, where $1_q$ denotes a $q$-dimensional vector of ones; $Z_K$ is the $n \times K$ cluster membership indicator matrix; $G$ is the cluster centroid matrix; $B$ is the column coordinate matrix of rank $k$, where $k$ denotes the dimensionality of the optimally scaled quantifications; $B^{*} = \frac{1}{\sqrt{np}} D_Z^{1/2} B$; $M = I_n - 1_n 1_n'/n$ is the centering matrix; $D_Z$ is the diagonal matrix such that $D_Z 1_Q = Z' 1_n$; and $D_K = Z_K' Z_K$ is the diagonal matrix of cluster sizes.
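To make the notation concrete, the following sketch builds a super indicator matrix $Z$ and the cluster-by-category table $F = Z_K' Z$ for four made-up crash records. The variable names, category labels, and cluster memberships are hypothetical and serve only to illustrate the definitions.

```python
def indicator_matrix(records, variables):
    """Return (Z, columns): one 0/1 column per category of each variable."""
    columns = [(v, c) for v in variables
               for c in sorted({r[v] for r in records})]
    Z = [[1 if r[v] == c else 0 for (v, c) in columns] for r in records]
    return Z, columns


def cross_tab(labels, Z, n_clusters):
    """F = Z_K' Z: category counts within each cluster."""
    F = [[0] * len(Z[0]) for _ in range(n_clusters)]
    for k, row in zip(labels, Z):
        for c, value in enumerate(row):
            F[k][c] += value
    return F


# Hypothetical records with p = 2 categorical variables
crashes = [
    {"light": "daylight", "collision": "angle"},
    {"light": "daylight", "collision": "angle"},
    {"light": "dark", "collision": "single"},
    {"light": "dark", "collision": "angle"},
]
Z, columns = indicator_matrix(crashes, ["light", "collision"])
labels = [0, 0, 1, 1]  # hypothetical cluster membership (rows of Z_K)
F = cross_tab(labels, Z, n_clusters=2)
```

Each row of `Z` sums to $p$ because every record takes exactly one category per variable, and each row of `F` gives the category profile of one cluster.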
The optimal category quantifications (i.e., column coordinates) and an optimal cluster allocation are obtained by iterating until convergence between two steps: correspondence analysis of the contingency matrix $F = Z_K' Z$, and K-means cluster analysis applied to the reduced-space coordinates obtained from the correspondence analysis category quantifications. The cluster correspondence analysis approach can analyze two-way and multi-way tables that contain relationships between the rows and columns from datasets with various nominal variables. Interpreting the distributions over the variables in the different clusters is straightforward. Crash data are highly heterogeneous, which makes it challenging to identify specific patterns. With its dimension reduction and visualization characteristics, the cluster correspondence analysis approach performs well in representing crash scenarios, analyzing high-dimensional categorical crash data, and recognizing the underlying patterns behind crash data.
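The alternating procedure described above can be illustrated with a deliberately simplified sketch. It is not the `clustrd` implementation: it keeps a single dimension ($k = 1$), replaces the eigendecomposition with power iteration, uses one reassignment pass per cycle in place of a full K-means run, and does not handle empty clusters. The function name and the toy data are assumptions made for the example.

```python
import math


def clusca_1d(Z, labels, n_clusters, n_vars, n_iter=20):
    """Alternate a one-dimensional CA step and a cluster reassignment step."""
    n, Q = len(Z), len(Z[0])
    col_tot = [sum(row[c] for row in Z) for c in range(Q)]  # diagonal of D_Z
    # X = M Z D_Z^{-1/2}: column-centred, column-scaled indicator matrix
    X = [[(row[c] - col_tot[c] / n) / math.sqrt(col_tot[c]) for c in range(Q)]
         for row in Z]
    labels = list(labels)
    for _ in range(n_iter):
        sizes = [labels.count(k) for k in range(n_clusters)]
        # A = D_K^{-1/2} Z_K' X, one row per cluster
        A = [[sum(X[i][c] for i in range(n) if labels[i] == k) / math.sqrt(sizes[k])
              for c in range(Q)] for k in range(n_clusters)]
        # CA step: leading right singular vector of A (B* with k = 1)
        b = [1.0] + [0.0] * (Q - 1)
        for _ in range(200):
            Ab = [sum(A[k][c] * b[c] for c in range(Q)) for k in range(n_clusters)]
            b = [sum(A[k][c] * Ab[k] for k in range(n_clusters)) for c in range(Q)]
            norm = math.sqrt(sum(v * v for v in b))
            b = [v / norm for v in b]
        # Object scores: sqrt(n/p) M Z D_Z^{-1/2} B*
        y = [math.sqrt(n / n_vars) * sum(X[i][c] * b[c] for c in range(Q))
             for i in range(n)]
        # Clustering step: reassign each object to the nearest centroid
        centroids = [sum(y[i] for i in range(n) if labels[i] == k) / sizes[k]
                     for k in range(n_clusters)]
        new_labels = [min(range(n_clusters), key=lambda k: (y[i] - centroids[k]) ** 2)
                      for i in range(n)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, y


# Toy indicator matrix: two variables with two categories each,
# three records of one profile followed by three of the other
Z = [[1, 0, 1, 0]] * 3 + [[0, 1, 0, 1]] * 3
labels, scores = clusca_1d(Z, [0, 1, 0, 1, 0, 1], n_clusters=2, n_vars=2)
```

Starting from a deliberately poor allocation, the toy run separates the two profiles after one reassignment. A practical analysis would rely on a tested implementation with several random starts.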
This study used an open-source R package, “clustrd”, to perform the cluster correspondence analysis in R programming [45,46]. The clustering validation criterion used in the research is the Calinski–Harabasz index. This index (also known as the variance ratio criterion) determines how similar an object is to its own cluster (cohesion) when compared to other clusters (separation) [47]. This index uses the distance between the data points in a cluster and its cluster centroid to estimate cohesion. In contrast, it uses the distance between the cluster centroids and the global centroid to estimate separation. A higher index value suggests that the clusters are well separated.
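As a rough illustration of the criterion, the sketch below computes the Calinski–Harabasz index in its usual variance-ratio form, between-cluster dispersion divided by $K - 1$ over within-cluster dispersion divided by $n - K$. The function name and the four toy points are assumptions; in the study the index would be evaluated on the reduced-space coordinates.

```python
def calinski_harabasz(points, labels):
    """Variance ratio criterion: [B / (K - 1)] / [W / (n - K)]."""
    n, dim = len(points), len(points[0])
    clusters = sorted(set(labels))
    K = len(clusters)
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for k in clusters:
        members = [p for p, lab in zip(points, labels) if lab == k]
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        # Separation: cluster centroid versus global centroid, weighted by size
        between += len(members) * sum((centroid[d] - overall[d]) ** 2
                                      for d in range(dim))
        # Cohesion: members versus their own centroid
        within += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return (between / (K - 1)) / (within / (n - K))


# Made-up 2-D coordinates: two tight pairs far apart
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = calinski_harabasz(points, [0, 0, 1, 1])  # groups the nearby points
bad = calinski_harabasz(points, [0, 1, 0, 1])   # splits them
```

The partition that groups nearby points scores 50, while the one that splits them scores 0.08, which is the behavior the criterion relies on when comparing candidate solutions.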

4. Results and Discussion

4.1. Cluster Overview

To select the number of clusters, this study ran the K-means clustering multiple times with random starts to obtain the optimal value of the criterion. The study determined the final number of clusters as four, with an optimal criterion value of 4.3396 for the HAZMAT-truck-involved fatal crash dataset. Thus, the study grouped the HAZMAT-truck-involved fatal crashes into four clusters. Table 3 summarizes the key measurements of the clusters, including cluster centroid, number and percentage of crashes, and the sum of squares. Cluster 1 includes over 40% of the crash data. Cluster 2 and Cluster 3 account for 30.41% and 24.51% of the crash data, respectively. Cluster 4 contains the smallest number of crashes, only 2.64% of the crash data.
Figure 2 illustrates the projection of crash data points and individual variable categories onto a two-dimensional biplot, which approximates and visualizes their associations. The crash data points are grouped into four clusters colored red (C1), green (C2), blue (C3), and purple (C4). The cluster centroids are marked with a circle (C1), triangle (C2), square (C3), and plus sign (C4). The small black circles represent the variable categories, with their names alongside. This biplot visualizes the locations and associations of all variables in a two-dimensional space. However, it is difficult to observe individual variable categories within specific clusters. For better visualization and interpretation, four separate bar plots in Figure 3 display the variable contributions as standardized residuals.
The bar plots in Figure 3 show the top 30 variable categories with the highest standardized residuals in absolute value (positive or negative) in each cluster. They help to identify the variable categories that deviate most from the independence condition. A positive residual indicates that the variable category has an above-average frequency within the cluster. A negative residual indicates that the variable category has a below-average frequency. A larger absolute residual indicates a stronger association and a more significant contribution to cluster segmentation. The purpose of this research is to identify HAZMAT-truck-involved fatal crash patterns. Thus, the following sections discuss only variable categories with positive standardized residuals.
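The sign convention above can be checked on a small cluster-by-category table. The sketch assumes the Pearson form of the residual, $(O - E)/\sqrt{E}$ with $E$ the count expected under independence; the exact standardization used to draw Figure 3 is not stated here, and the counts are made up.

```python
import math


def pearson_residuals(F):
    """(observed - expected) / sqrt(expected) for each cell of a count table."""
    total = sum(sum(row) for row in F)
    row_tot = [sum(row) for row in F]
    col_tot = [sum(col) for col in zip(*F)]
    residuals = []
    for i, row in enumerate(F):
        out = []
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / total  # independence condition
            out.append((observed - expected) / math.sqrt(expected))
        residuals.append(out)
    return residuals


# Hypothetical counts: rows are clusters, columns are two collision categories
F = [[30, 10],
     [10, 30]]
res = pearson_residuals(F)
```

Here every expected count is 20, so the first cluster has a residual of about +2.24 for the first category (above-average frequency) and about -2.24 for the second (below-average frequency).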

4.1.1. Cluster 1

Several variable categories have positive standardized residuals in this cluster: angle crashes, two-vehicle crashes, two-way not divided roadways, four-way intersections, front-to-front crashes, arterials or collectors, daylight, two-way not divided roadways with a continuous left-turn lane, speed limits between 30 and 45 mph, T-intersections, crash hours between 12:00 p.m. and 5:59 p.m., normal driver conditions, and no distraction. Cluster 1 indicates an association of fatal crashes involving a HAZMAT truck and another vehicle at intersections on two-way undivided roadways during the daytime.

4.1.2. Cluster 2

This cluster shows a group of variable categories closely associated with each other (measured by the positive standardized residuals on the right side in Figure 3b). These variable categories include single-vehicle crashes, crashes not involving a collision with a motor vehicle, two-way divided roadways with an unprotected median, roadway functional class–local, roadway segment, driver age under 25, dark–not lighted conditions, roadway with grade, cloudy weather conditions, driver age between 25 and 35, curve alignment, manner of collision–others, crash hours between 6:00 p.m. and 11:59 p.m., drivers with previously recorded suspensions, revocations, and withdrawals, and adverse weather conditions (rain or snow). Cluster 2 indicates an association of single-vehicle fatal crashes involving HAZMAT trucks on local roadways with relatively younger drivers under dark lighting conditions.

4.1.3. Cluster 3

This cluster contains several variable categories with positive standardized residuals: interstate highways, two-way divided roadways with a positive median barrier, speed limits greater than 65 mph, front-to-rear crashes, crash hours between 12:00 a.m. and 5:59 a.m., multi-vehicle crashes, fire, dark–lighted conditions, abnormal driver conditions (asleep, fatigue, physical impairment, or others), entrance/exit ramps, urban areas, distraction, roadway segment, HAZMAT class–explosives, other distraction, and female drivers. Cluster 3 indicates an association of fatal crashes involving HAZMAT trucks and multiple vehicles on urban interstate highway segments and entrance/exit ramps with relatively high speed limits, under dark lighting conditions around midnight and in the early morning. Results show that driver behavior, such as distraction, sleep or fatigue, and physical impairment, is closely associated with fatal crashes under these conditions. Female drivers are also associated with fatal crashes in this cluster.

4.1.4. Cluster 4

This cluster includes the following variable categories with positive standardized residuals: traffic way–others, speed limits less than 25 mph, roadway grade–others, roadway alignment–others, dark–lighted conditions, roadway surface condition–non-dry, four-way intersections, angle crashes, roadway functional class–local, urban areas, drivers with previously recorded crashes, crash hours between 12:00 a.m. and 5:59 a.m., and crash hours between 6:00 p.m. and 11:59 p.m. This cluster contains the least information and mainly collects the "other" values of different variable categories.

4.2. HAZMAT Crash Characteristics Analysis

The cluster-based analysis identifies the patterns of the risk factors associated with HAZMAT-truck-involved fatal crashes. The standardized residuals measure the significance of variable categories within the clusters. Table 4 illustrates the variable distribution in each cluster in a heatmap format. Risk factors highlighted in dark blue indicate overrepresentation. This table effectively illustrates the “between clusters” proportions of variable categories. The combination of variable categories identified in the clusters helps to discover the distinct and interpretable patterns for fatal crashes involving HAZMAT trucks.
  • HAZMAT-truck-involved fatal crashes at intersections on two-way undivided roadways are overrepresented in Cluster 1. Intersection crashes (including four-way intersections and T-intersections) account for nearly 20% of the whole dataset, as shown in Table 2, and over 70% of them fall in Cluster 1. Similarly, 51.89% of fatal HAZMAT truck-related crashes in the whole dataset occurred on two-way undivided roadways, and this category accounts for 63.85% of the crashes in Cluster 1. This cluster represents over 40% of the crash data and characterizes HAZMAT-truck-involved fatal crashes under these specific conditions. The greater number of conflict points and interfering factors at intersections, including restricted field of vision, pedestrian crossings, and traffic lights, could explain the higher probability of fatalities once a crash occurs. This finding is consistent with previous studies [20,22,23,27,28,48].
  • Intersection crashes, especially angle (80.64%) and front-to-front (77.50%) crashes on arterials or collectors, are prominently associated with fatalities, as shown in Cluster 1. Previous studies have shown that head-on and angle crashes are more harmful than other collision types at such locations, as these crashes commonly result in fatalities and severe injuries [13,28].
  • Curve alignments also have higher associations with HAZMAT-truck-involved fatal crashes. The variable accounts for nearly 20% of the whole dataset, as shown in Table 2, and it is 34.68% and 35.89% in Cluster 1 and Cluster 2, respectively. Intersections combined with curves challenge drivers because of their design and function, increasing the risk of run-off-road crashes and rollovers, which lead to fatalities and severe injuries. This finding is consistent with previous studies [10,14,15,16,20,21,25,49,50].
  • The result shows that driver status is a critical risk factor in HAZMAT-truck-involved fatal crashes in Cluster 3, specifically in crashes involving multiple vehicles on urban interstate highway segments with relatively high speed limits under dark lighting conditions. Distraction, sleep or fatigue, and physical impairment could impair driving ability and cause significant cognitive inadequacies, increasing the possibility of fatal and severe injury crashes. This result is consistent with previous studies [10,11,12,13,14,16,20,27,28,47,51]. Commercial truck drivers are more commonly subjected to longer travel times and driving distances than passenger vehicle drivers, which exposes them to a higher risk of distraction and fatigue [52,53]. The driver behavior-related crash pattern identified in this study emphasizes the need for safety training, education programs, and advanced driver supervision (e.g., a driver monitoring system with fatigue and distracted driving detection and alert functions) for HAZMAT truck drivers. Crash prevention training is one of the most effective ways of promoting the application of safe HAZMAT road transportation practices and procedures [54].
  • The dark–lighted condition is associated with interstate highways, predominantly on two-way divided roadways with a positive median barrier in urban areas (40.46% in Cluster 3). In contrast, the dark–not lighted condition is associated with a lower functional class in rural areas (41.66% in Cluster 2). Poor visibility caused by adverse lighting conditions during nighttime is a critical risk factor for HAZMAT-truck-involved fatal crashes. This is consistent with past research [9,11,12,15,16,17,20,24,28,47,55]. According to a Florida study, installing lighting reduces crashes across all crash types and severity levels by 37% [56].
  • This study identifies the association between HAZMAT-truck-involved fatal crashes and adverse weather conditions. Adverse weather conditions could pose challenges to drivers due to the reduced road friction coefficient and relatively poor visibility. The presence of driver inattention, fatigue, physical impairment (Cluster 2), dark lighting conditions (dark–lighted and dark–not lighted), or special road locations such as entrance/exit ramps (Cluster 3) could intensify the challenges.
  • Front-to-end crashes on high-speed (65 mph or more) urban interstate highways and freeways are overrepresented in Cluster 3. Higher speed limits are associated with HAZMAT-truck-involved fatal crashes. A likely explanation for these rear-end crashes on high-speed interstate highways and freeways is the failure to maintain a safe speed and a following distance that allow for maneuver adjustment. In addition, multiple-vehicle crashes at such locations increase the potential risk of HAZMAT release and explosion. Fire is overrepresented in Cluster 3 (47.66%), as shown in Table 4.
  • The overrepresentation of the "others" categories in Cluster 4, combined with angle crashes, dark–lighted conditions, speed limits below 25 mph, and local roadways, suggests a higher possibility of hit-and-run situations. This finding is similar to that of a previous study [57]. However, this group contains only 35 crashes, which makes it difficult to draw a conclusive result.
  • Additionally, some factors with few observations in the whole dataset may not have a chance to show their impact. However, these factors are distributed differently across the clusters, which further explains their correspondence with fatal HAZMAT crashes. For example, drivers with previously recorded suspensions, revocations, and withdrawals (40.47%) and drivers aged under 25 (58.33%) are overrepresented in Cluster 2. Such combinations of factors may be rare but strongly associated with fatality given a crash, an association that is hidden in the descriptive statistics of the whole dataset.
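One way to read the "between clusters" proportions is to compare each category's share in a cluster with that cluster's share of all crashes. The sketch below is illustrative only: the percentages are copied from Tables 3 and 4, while the `lift` ratio and the function names are assumptions introduced here and are not measures reported by the study.

```python
# Overall cluster shares (%) from Table 3.
CLUSTER_SHARE = {"C1": 42.44, "C2": 30.41, "C3": 24.51, "C4": 2.64}

# Within-category cluster shares (%) from Table 4 for two small categories.
CATEGORY_SHARE = {
    "Driver age <25": {"C1": 16.67, "C2": 58.33, "C3": 25.00, "C4": 0.00},
    "Previous suspensions: yes": {"C1": 38.10, "C2": 40.47, "C3": 19.05, "C4": 2.38},
}


def lift(category):
    """Ratio of a category's cluster share to the overall cluster share.

    A value above 1 means the category is overrepresented in that cluster.
    This ratio is a hypothetical illustration, not the study's statistic.
    """
    shares = CATEGORY_SHARE[category]
    return {c: shares[c] / CLUSTER_SHARE[c] for c in CLUSTER_SHARE}


def most_overrepresented(category):
    """Return the cluster with the largest lift and the lift value."""
    ratios = lift(category)
    best = max(ratios, key=ratios.get)
    return best, ratios[best]


if __name__ == "__main__":
    for name in CATEGORY_SHARE:
        cluster, ratio = most_overrepresented(name)
        print(f"{name}: most overrepresented in {cluster} (lift {ratio:.2f})")
```

Under these assumptions, both categories point to Cluster 2, even though each accounts for only a few dozen crashes in the full dataset.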

4.3. Implications of Study Findings

Research on crash safety analysis ultimately aims to provide a scientific basis for crash and injury reduction by advancing the state of the art in modeling, management, and policymaking. This study applies an advanced categorical data mining technique, cluster correspondence analysis, to identify the critical clusters that describe the patterns of influencing factors associated with fatal crashes involving HAZMAT trucks. The factors examined in the current study for various scenarios could help transportation authorities build data-driven interventions and countermeasures. Given the study scope, the following recommendations are based on the results and focus on the planning level rather than the project level.
  • Considering the high percentage of HAZMAT-truck-related fatal crashes at intersections on rural two-way undivided roadways, strategies suggested by the Federal Highway Administration (FHWA) could improve safety. For example, improving driver awareness of intersection approaches by providing enhanced signing, delineation, and supplementary messages through the systemic application of multiple low-cost countermeasures may help to reduce fatal and injury crashes by 10–27% at both signalized and stop-controlled intersections [58,59].
  • To reduce fatal intersection crashes, especially angle and front-to-front crashes involving HAZMAT trucks, dedicated left- and right-turn lanes, which physically separate slowing or stopped turning traffic from adjacent through traffic at intersection approaches, may serve as a potential countermeasure. Offset turn lanes can provide added safety benefits, reducing fatal and injury crashes by 36% and total crashes by 14–26% [60].
  • HAZMAT shipment carriers and stakeholders can utilize the findings of driver-behavior-related factors to develop safety training materials and education programs.
  • Enhancing roadway lighting and installing a roadside weather-responsive warning system could reduce HAZMAT-truck-related fatal crashes during nighttime and adverse weather conditions.
  • Real-time advisory speed limits on high-speed urban interstate highways and freeways can mitigate the corresponding rear-end crash risks.
One of the most important factors in risk mitigation is the location of crashes involving HAZMAT trucks. Crash locations close to businesses and residential areas increase the possibility of catastrophic effects in the event of a leak or explosion. Based on the cluster correspondence analysis results, roadway conditions, such as intersections on two-way undivided roadways, curve alignments, and high-speed urban interstate highways, are strongly associated with HAZMAT-truck-involved fatal crashes. As part of the permitting process, jurisdictions and policymakers may set guidelines for an interactive mechanism that allows HAZMAT shipment carriers, transportation management authorities, and stakeholders to communicate with local first responders and create emergency plans. Such a mechanism would help ensure that first responders are informed of, and ready to address, the potential risks under different roadway conditions.

5. Conclusions

HAZMAT-truck-involved crashes pose considerable threats to the public, property, and environment. To reduce crashes, particularly fatal crashes, it is critical to identify the crash patterns and risk factors associated with crash occurrence. This study contributes to research on HAZMAT-truck-involved crashes in several respects. Using ten-year (2010–2019) fatal crash data from the FARS database, this study applies a comparatively new categorical data analysis approach that combines cluster and correspondence analyses. It depicts the clusters and variable categories in a low-dimensional approximation through a biplot. Furthermore, the estimation of the top positive residuals demonstrates the associations between variables, highlighting the most prevalent fatal crash scenarios involving HAZMAT trucks. Rather than relying on a subjective method, the cluster correspondence analysis approach segments the data automatically while reducing the dimensionality of the categorical data.
A crash is the complicated and interconnected result of risk factors, including driver, road, vehicle, and the environment. The four clusters for fatal HAZMAT-truck-involved crashes provide interesting insights. Through interactions with associated factors, this study identifies the underlying patterns and the high-risk scenarios where fatal HAZMAT-truck-involved crashes are more likely to occur. This study reveals that fatal HAZMAT-truck-involved crashes are highly distinguishable concerning collision types (angle and front-to-front crashes, single-vehicle crashes, and front-to-end crashes) and roadway geometric variables, such as two-way undivided roadways, curve alignments, and high-speed (65 mph or more) urban interstate highways. In addition to the factors mentioned above, driver behavior (distraction, asleep or fatigue, and physical impairment), lighting conditions (dark–lighted and dark–not lighted), and weather are also interrelated. The study also provides "between clusters" proportions by variable category, which help identify important risk factors with few observations and combinations of variables strongly associated with HAZMAT-truck-involved fatal crashes that have yet to be explored or are hidden in the whole crash dataset.
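The idea behind ranking categories by positive residuals can be sketched with a small cluster-by-category table. The example below assumes Pearson residuals, (observed − expected) / sqrt(expected), under independence of category and cluster; the exact residual definition used by the study's software is not stated here, so this formula and the function names are assumptions. The counts are back-calculated from the impairment percentages in Table 4.

```python
import math

CLUSTERS = ["C1", "C2", "C3", "C4"]

# Counts back-calculated from Table 4 (impairment variable); each row sums to
# the category total and each column to the cluster size (561, 402, 324, 35).
COUNTS = {
    "None/apparently normal": [498, 302, 173, 25],
    "Asleep, fatigue or physical impairment": [2, 14, 36, 1],
    "Others": [61, 86, 115, 9],
}


def pearson_residuals(counts):
    """Pearson residuals for every (category, cluster) cell.

    Expected counts assume independence between category and cluster.
    This is an assumed definition, not necessarily the study's.
    """
    row_totals = {cat: sum(row) for cat, row in counts.items()}
    col_totals = [sum(row[j] for row in counts.values())
                  for j in range(len(CLUSTERS))]
    grand_total = sum(row_totals.values())
    residuals = {}
    for cat, row in counts.items():
        for j, observed in enumerate(row):
            expected = row_totals[cat] * col_totals[j] / grand_total
            residuals[(cat, CLUSTERS[j])] = (observed - expected) / math.sqrt(expected)
    return residuals


def top_positive(residuals, k=3):
    """Return the k cells with the largest residuals, largest first."""
    return sorted(residuals.items(), key=lambda kv: kv[1], reverse=True)[:k]


if __name__ == "__main__":
    for (cat, cluster), value in top_positive(pearson_residuals(COUNTS)):
        print(f"{cat} in {cluster}: {value:.2f}")
```

With these counts, the largest positive residual falls on the asleep, fatigue or physical impairment category in Cluster 3, which matches the driver-status pattern described for that cluster.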
The associations of risk factors identified in this study provide implications for potential preventive measures to improve HAZMAT road transportation safety. Understanding the crash mechanism and recognizing why, where, and how HAZMAT-truck-involved fatal crashes occurred is important. HAZMAT shipment carriers, transportation agencies, and stakeholders can use the results to develop countermeasures that prioritize safety goals and objectives in the emphasized areas of targeted crash patterns. Several critical tasks need further consideration: developing a data-driven summary of the causes of crashes and primary risk factors, creating an information guide for the risk factors to enhance managerial and operational awareness and expertise of preventive measures, and enhancing knowledge of effective technologies, safety culture, and prevention practices and procedures to improve HAZMAT road transportation safety.
This study has several limitations. The research identifies risk factors from ten years (2010–2019) of fatal crash data in the FARS database. Because of data availability, the scope of this study is limited to fatal HAZMAT-truck-involved crashes. With injury and no-injury crash data, crash patterns for different severity levels could shed more light on HAZMAT-truck-involved crash characteristics and HAZMAT road transportation safety improvement. Future studies should explore and examine the integration of spatial, temporal, and additional data (such as HAZMAT-crash-related environmental damage and the population along the HAZMAT transportation route).

Author Contributions

Conceptualization, M.S.; methodology, M.S.; software, M.S.; data curation, M.S.; formal analysis, M.S.; writing—original draft preparation, M.S.; writing—review and editing, M.S. and R.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (grant number: 2021YFC3001500).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used in this study is from the Fatality Analysis Reporting System (FARS), which can be requested at https://www.nhtsa.gov/research-data/fatality-analysis-reporting-system-fars (accessed on 5 March 2022).

Acknowledgments

The authors thank the Fatality Analysis Reporting System (FARS) for providing the data.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the study’s design; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

References

  1. Census Bureau. 2017 Commodity Flow Survey, Hazardous Materials Series; U.S. Department of Commerce: Washington, DC, USA, 2020. Available online: https://www.census.gov/content/dam/Census/library/publications/2017/econ/ec17tcf-us.pdf (accessed on 24 May 2022).
  2. Federal Motor Carrier Safety Administration. Large Truck and Bus Crash Facts 2019; United States Department of Transportation: Washington, DC, USA, 2021. Available online: https://www.fmcsa.dot.gov/safety/data-and-statistics/large-truck-and-bus-crash-facts-2019 (accessed on 24 May 2022).
  3. Pipeline and Hazardous Materials Safety Administration. Office of Hazardous Material Safety; 10 Year Incident Summary Reports; U.S. Department of Transportation: Washington, DC, USA, 2021. Available online: https://portal.phmsa.dot.gov/analytics/saw.dll?Portalpages&PortalPath=%2Fshared%2FPublic%20Website%20Pages%2F_portal%2F10%20Year%20Incident%20Summary%20Reports (accessed on 24 May 2022).
  4. Zhou, L.; Guo, C.; Cui, Y.; Wu, J.; Lv, Y.; Du, Z. Characteristics, cause, and severity analysis for hazmat transportation risk management. Int. J. Environ. Res. Public Health 2020, 17, 2793.
  5. Federal Emergency Management Agency. Hazardous Materials Incidents Guidance for State, Local, Tribal, Territorial, and Private Sector Partners; U.S. Department of Homeland Security: Washington, DC, USA, 2019. Available online: https://www.fema.gov/sites/default/files/2020-07/hazardous-materials-incidents.pdf (accessed on 28 May 2023).
  6. Oggero, A.; Darbra, R.M.; Munoz, M.; Planas, E.; Casal, J. A survey of accidents occurring during the transport of hazardous substances by road and rail. J. Hazard. Mater. 2006, 133, 1–7.
  7. Yang, J.; Li, F.; Zhou, J.; Zhang, L.; Huang, L.; Bi, J. A survey on hazardous materials accidents during road transport in China from 2000 to 2008. J. Hazard. Mater. 2010, 184, 647–653.
  8. Shen, X.; Yan, Y.; Li, X.; Xie, C.; Wang, L. Analysis on tank truck accidents involved in road hazardous materials transportation in China. Traffic Inj. Prev. 2014, 15, 762–768.
  9. Uddin, M.; Huynh, N. Factors influencing injury severity of crashes involving HAZMAT trucks. Int. J. Transp. Sci. Technol. 2018, 7, 1–9.
  10. Xing, Y.; Chen, S.; Zhu, S.; Zhang, Y.; Lu, J. Exploring risk factors contributing to the severity of hazardous material transportation accidents in China. Int. J. Environ. Res. Public Health 2020, 17, 1344.
  11. Song, X.; Wu, J.; Zhang, H.; Pi, R. Analysis of crash severity for hazard material transportation using highway safety information system data. SAGE Open 2020, 10, 21582440209.
  12. Ma, C.; Zhou, J.; Yang, D. Causation analysis of hazardous material road transportation accidents based on the ordered logit regression model. Int. J. Environ. Res. Public Health 2020, 17, 1259.
  13. Zhu, H.; Srinivasan, S. A comprehensive analysis of factors influencing the injury severity of large-truck crashes. Accid. Anal. Prev. 2011, 43, 49–57.
  14. Chen, F.; Chen, S. Injury severities of truck drivers in single- and multi-vehicle accidents on rural highways. Accid. Anal. Prev. 2011, 43, 1677–1688.
  15. Naik, B.; Tung, L.; Zhao, S.; Khattak, A.J. Weather impacts on single-vehicle truck crash injury severity. J. Saf. Res. 2016, 58, 57–65.
  16. Islam, S.; Jones, S.L.; Dye, D. Comprehensive analysis of single- and multi-vehicle large truck at-fault crashes on rural and urban roadways in Alabama. Accid. Anal. Prev. 2014, 67, 148–158.
  17. Uddin, M.; Huynh, N. Truck-involved crashes injury severity analysis for different lighting conditions on rural and urban roadways. Accid. Anal. Prev. 2017, 108, 44–55.
  18. Uddin, M.; Huynh, N. Injury severity analysis of truck-involved crashes under different weather conditions. Accid. Anal. Prev. 2020, 141, 105529.
  19. Islam, M.; Hosseini, P.; Jalayer, M. An analysis of single-vehicle truck crashes on rural curved segments accounting for unobserved heterogeneity. J. Saf. Res. 2022, 80, 148–159.
  20. Haq, M.T.; Zlatkovic, M.; Ksaibati, K. Assessment of commercial truck driver injury severity based on truck configuration along a mountainous roadway using hierarchical Bayesian random intercept approach. Accid. Anal. Prev. 2021, 162, 106392.
  21. Ahmed, M.M.; Franke, R.; Ksaibati, K.; Shinstine, D.S. Effects of truck traffic on crash injury severity on rural highways in Wyoming using Bayesian binary logit models. Accid. Anal. Prev. 2018, 117, 106–113.
  22. Iranitalab, A.; Khattak, A.; Bahouth, G. Statistical modeling of cargo tank truck crashes: Rollover and release of hazardous materials. J. Saf. Res. 2020, 74, 71–79.
  23. Eustace, D.; Alqahtani, T.; Hovey, P.W. Classification tree modelling of factors impacting severity of truck-related crashes in Ohio. In Proceedings of the 97th Annual Meeting of the Transportation Research Board, Washington, DC, USA, 7–11 January 2018.
  24. Das, S.; Islam, M.B.; Dutta, A.; Shimu, T.H. Uncovering deep structure of determinants in large truck fatal crashes. Transp. Res. Rec. J. Transp. Res. Board 2020, 2674, 742–754.
  25. Hong, J.; Tamakloe, R.; Park, D. Application of association rules mining algorithm for hazardous materials transportation crashes on expressway. Accid. Anal. Prev. 2020, 142, 105497.
  26. Zhao, L.; Wang, X.; Qian, Y. Analysis of factors that influence hazardous material transportation accidents based on Bayesian networks: A case study in China. Saf. Sci. 2012, 50, 1049–1055.
  27. Ma, X.; Xing, Y.; Lu, J. Causation analysis of hazardous material road transportation accidents by Bayesian network using Genie. J. Adv. Transp. 2018, 2018, 6248105.
  28. Sun, M.; Zhou, R.; Jiao, C.; Sun, X. Severity analysis of hazardous material road transportation crashes with a Bayesian network using highway safety information system data. Int. J. Environ. Res. Public Health 2022, 19, 4002.
  29. Mannering, F.L.; Shankar, V.; Bhat, C.R. Unobserved heterogeneity and the statistical analysis of highway accident data. Anal. Methods Accid. Res. 2016, 11, 1–16.
  30. Das, S. Identifying key patterns in motorcycle crashes: Findings from taxicab correspondence analysis. Transp. A 2021, 17, 593–614.
  31. Das, S.; Sun, X. Factor association with multiple correspondence analysis in vehicle-pedestrian crashes. Transp. Res. Rec. J. Transp. Res. Board 2015, 2519, 95–103.
  32. Das, S.; Sun, X. Association knowledge for fatal run-off-road crashes by multiple correspondence analysis. IATSS Res. 2016, 39, 146–155.
  33. Das, S.; Avelar, R.; Dixon, K.; Sun, X. Investigating on the wrong way driving crash patterns using multiple correspondence analysis. Accid. Anal. Prev. 2018, 111, 43–55.
  34. Baireddy, R.; Zhou, H.; Jalayer, M. Multiple correspondence analysis of pedestrian crashes in rural Illinois. Transp. Res. Rec. J. Transp. Res. Board 2018, 2672, 116–127.
  35. Jalayer, M.; Pour-Rouholamin, M.; Zhou, H. Wrong-way driving crashes: A multiple correspondence approach to identify contributing factors. Traffic Inj. Prev. 2018, 19, 35–41.
  36. Hossain, M.M.; Rahman, M.A.; Sun, X.; Mitran, E. Investigating underage alcohol-intoxicated driver crash patterns in Louisiana. Transp. Res. Rec. J. Transp. Res. Board 2021, 2675, 769–782.
  37. Das, S.; Jha, K.; Fitzpatrick, K.; Brewer, M.; Shimu, T.H. Pattern identification from older bicyclist fatal crashes. Transp. Res. Rec. J. Transp. Res. Board 2019, 2673, 638–649.
  38. Hossain, M.M.; Sun, X.; Mitran, E.; Rahman, M.A. Investigating fatal and injury crash patterns of teen drivers with unsupervised learning algorithms. IATSS Res. 2021, 45, 561–573.
  39. Das, S.; Mousavi, S.M.; Shirinzad, M. Pattern recognition in speeding related motorcycle crashes. J. Transp. Saf. Secur. 2021, 14, 1121–1138.
  40. Das, S.; Dutta, A.; Rahman, M.A. Pattern recognition from light delivery vehicle crash characteristics. J. Transp. Saf. Secur. 2021, 14, 2055–2073.
  41. Das, S.; Hossain, M.M.; Rahman, M.A.; Kong, X. Understanding patterns of moped and seated motor scooter (50 cc or less) involved fatal crashes using cluster correspondence analysis. Transp. A 2022, 19, 2.
  42. Rahman, M.A.; Das, S.; Sun, X. Using cluster correspondence analysis to explore rainy weather crashes in Louisiana. Transp. Res. Rec. J. Transp. Res. Board 2022, 2676, 159–173.
  43. Van de Velden, M.; D’Enza, A.I.; Palumbo, F. Cluster Correspondence Analysis; EI Report Serie EI 2014-24 ed. Econometric Institute, Erasmus University: Rotterdam, The Netherlands, 2014; Available online: https://econpapers.repec.org/paper/emseureir/77010.htm (accessed on 28 May 2023).
  44. Van de Velden, M.; D’Enza, A.I.; Palumbo, F. Cluster correspondence analysis. Psychometrika 2017, 82, 158–185.
  45. Markos, A.; D’Enza, A.I.; Van de Velden, M. Beyond tandem analysis: Joint dimension reduction and clustering in R. J. Stat. Softw. 2019, 91, 1–24.
  46. R Development Core Team. R: A Language and Environment for Statistical Computing. Available online: http://www.r-project.org (accessed on 23 March 2022).
  47. Caliński, T.; Harabasz, J. A dendrite method for cluster analysis. Commun. Stat. 1974, 3, 1–27.
  48. Wenhui, L.; Fengtian, C.; Chuna, W.; Xingkai, M.; Lv, S. Bayesian network-based knowledge graph inference for highway transportation safety risks. Adv. Civ. Eng. 2021, 2021, 6624579.
  49. Savolainen, P.T.; Tarko, A.P. Safety impacts at intersections on curved segments. Transp. Res. Rec. J. Transp. Res. Board 2005, 1908, 130–140.
  50. Pahukula, J.; Hernandez, S.; Unnikrishnan, A. A time of day analysis of crashes involving large trucks in urban areas. Accid. Anal. Prev. 2015, 75, 155–163.
  51. Mashhadi, M.M.R.; Wulff, S.S.; Ksaibati, K. A comprehensive study of single and multiple truck crashes using violation and crash data. Open Transp. J. 2018, 12, 43–56.
  52. Chen, G.X.; Amandus, H.E.; Wu, N. Occupational fatalities among driver/sales workers and truck drivers in the United States, 2003–2008. Am. J. Ind. Med. 2014, 57, 800–809.
  53. Lemke, M.K.; Apostolopoulos, Y.; Hege, A.; Sonmez, S.; Wideman, L. Understanding the role of sleep quality and sleep duration in commercial driving safety. Accid. Anal. Prev. 2016, 97, 79–86.
  54. Pipeline and Hazardous Materials Safety Administration. Prevention/Mitigation Guidelines; United States Department of Transportation: Washington, DC, USA, 2017. Available online: https://hazmat.dot.gov/grants/hazmat/preventionmitigation-guidelines (accessed on 15 December 2022).
  55. Islam, M.; Hernandez, S. Large truck–involved crashes: Exploratory injury severity analysis. J. Transp. Eng. 2013, 139, 596–604.
  56. Abdel-Aty, M.; Lee, C.; Park, J.; Wang, J.; Abuzwidah, M.; Al-Arifi, S. Validation and Application of Highway Safety Manual (Part D) in Florida; University of Central Florida: Orlando, FL, USA, 2014.
  57. Tay, R.; Barua, U.; Kattan, L. Factors contributing to hit-and-run in fatal crashes. Accid. Anal. Prev. 2009, 13, 227–233.
  58. Le, T.Q.; Gross, F.; Harmon, T. Safety effects of low-cost systemic safety improvements at signalized and stop-controlled intersections. Transp. Res. Rec. J. Transp. Res. Board 2017, 2636, 80–87.
  59. Federal Highway Administration. Systemic Application of Multiple Low-Cost Countermeasures at Stop-Controlled Intersections; United States Department of Transportation: Washington, DC, USA, 2021. Available online: https://highways.dot.gov/sites/fhwa.dot.gov/files/2022-06/03_Systemic%20Application%20at%20Stop-Controlled%20Intersections_508.pdf (accessed on 15 December 2022).
  60. Federal Highway Administration. Dedicated Left- and Right-Turn Lanes at Intersections; United States Department of Transportation: Washington, DC, USA, 2021. Available online: https://highways.dot.gov/sites/fhwa.dot.gov/files/2022-08/13_Left-%20and%20Right-Turn%20Lanes_508.pdf (accessed on 15 December 2022).
Figure 1. HAZMAT-truck-involved fatal crashes in the US (2010–2019).
Figure 2. Cluster biplot.
Figure 3. Top 30 largest standardized residuals in clusters 1–4: (a) Cluster 1; (b) Cluster 2; (c) Cluster 3; and (d) Cluster 4.
Table 1. Summary of HAZMAT road transportation crash studies.
| Reference | Data Source | Method | Variables |
|---|---|---|---|
| [6] | Beginning of the 20th century to 2004, Major Hazard Incidents Data Service | Exploratory data analysis and statistical description | Human, mechanical, and external characteristics |
| [7] | 2000–2007, Journal of Safety and Environment | Exploratory data analysis and statistical description | Driver, crash, road, vehicle, management, and weather characteristics |
| [8] | 2004–2011, China Chemical Safety Association | Exploratory data analysis and statistical description | Driver, roadway class, crash, temporal variables, vehicle, and weather characteristics |
| [9] | 2005–2011, Highway Safety Information System, California | Fixed- and random-parameters ordered probit | Occupant, crash, vehicle, roadway, environmental, and temporal characteristics |
| [10] | 2014–2017, China Chemical Safety Association | Random-parameters ordered probit | Driver, crash, vehicle, roadway, and environmental characteristics |
| [11] | Five-year period, Highway Safety Information System | Ordered logit | Driver, roadway, and environmental characteristics |
| [12] | 2018–2019, Ministry of Emergency Management | Ordered logit | Driver, vehicle, roadway, and environmental characteristics |
| [25] | 2008–2017, Korea Expressway Corporation | Association rules mining | Driver, crash, vehicle, roadway, environmental, and temporal characteristics |
| [26] | 2005–2009 | Bayesian network | Human, vehicle, mechanical, and external characteristics |
| [27] | 2015–2016, State Work Accident Briefing System and Chemical Accidents Information Network | Bayesian network | Driver, crash, vehicle, roadway, and environmental characteristics |
| [28] | 2013–2017, Highway Safety Information System | Random forest and Bayesian network | Driver, crash, vehicle, roadway, and environmental characteristics |
Table 2. Descriptive statistics of HAZMAT-truck-involved fatal crashes.
| Group | Variable | Category | Count | % |
|---|---|---|---|---|
| Driver characteristics | Driver age (DrA) | <25 | 24 | 1.82% |
| | | 25–35 | 183 | 13.84% |
| | | 36–45 | 312 | 23.60% |
| | | 46–55 | 430 | 32.53% |
| | | 56–65 | 312 | 23.60% |
| | | >65 | 61 | 4.61% |
| | Driver gender (DrG) | Female | 27 | 2.04% |
| | | Male | 1295 | 97.96% |
| | Driver violation (DrV) | No | 1264 | 95.61% |
| | | Yes | 58 | 4.39% |
| | Commercial motor vehicle license status (CDL) | Valid | 1290 | 97.58% |
| | | Not Valid | 32 | 2.42% |
| | Previous recorded crashes (PrA) | No | 1112 | 84.11% |
| | | Yes | 210 | 15.89% |
| | Previous recorded suspensions, revocations, and withdrawals (PrvSs) | No | 1280 | 96.82% |
| | | Yes | 42 | 3.18% |
| | Previous speeding convictions (PrvSp) | No | 1106 | 83.66% |
| | | Yes | 216 | 16.34% |
| | Previous other moving violation convictions (PrO) | No | 1040 | 78.67% |
| | | Yes | 282 | 21.33% |
| | Distraction (Dst) | No | 956 | 72.32% |
| | | Yes | 75 | 5.67% |
| | | Others | 291 | 22.01% |
| | Impairment (Imp) | None/apparently normal | 998 | 75.49% |
| | | Asleep, fatigue or physical impairment | 53 | 4.01% |
| | | Others | 271 | 20.50% |
| Road characteristics | Trafficway description (TrW) | Two-way, not divided | 686 | 51.89% |
| | | Two-way, divided, unprotected median | 289 | 21.86% |
| | | Two-way, divided, positive median barrier | 261 | 19.74% |
| | | Two-way, not divided with a continuous left-turn lane | 37 | 2.80% |
| | | Entrance/exit ramp | 24 | 1.82% |
| | | Others | 25 | 1.89% |
| | Speed limit (SpL) | <25 mph | 56 | 4.24% |
| | | 30–45 mph | 206 | 15.58% |
| | | 50–65 mph | 790 | 59.76% |
| | | >65 mph | 270 | 20.42% |
| | Roadway alignment (Alg) | Straight | 1028 | 77.76% |
| | | Curve | 248 | 18.76% |
| | | Others | 46 | 3.48% |
| | Roadway grade (Grd) | Level | 896 | 67.78% |
| | | Grade | 374 | 28.29% |
| | | Others | 52 | 3.93% |
| | Roadway surface condition (SrC) | Dry | 1071 | 81.01% |
| | | Non-dry | 251 | 18.99% |
| | Setting (Stt) | Rural | 890 | 67.32% |
| | | Urban | 432 | 32.68% |
| | Roadway function class (RdF) | Interstate | 335 | 25.34% |
| | | Arterial/collector | 936 | 70.80% |
| | | Local | 51 | 3.86% |
| | Intersection type (InT) | Segment | 1050 | 79.43% |
| | | Four-way intersection | 177 | 13.39% |
| | | T-intersection | 84 | 6.35% |
| | | Others | 11 | 0.83% |
| Crash characteristics | Crash hour (Hor) | 12:00–5:59 a.m. | 296 | 22.39% |
| | | 6:00–11:59 a.m. | 405 | 30.64% |
| | | 12:00–5:59 p.m. | 406 | 30.71% |
| | | 6:00–11:59 p.m. | 215 | 16.26% |
| | Manner of collision (MnC) | Not collision with motor vehicle | 408 | 30.86% |
| | | Angle | 377 | 28.52% |
| | | Front-to-front | 200 | 15.13% |
| | | Front-to-rear | 227 | 17.17% |
| | | Sideswipe | 87 | 6.58% |
| | | Others | 23 | 1.74% |
| | Fire (Fir) | No | 1129 | 85.40% |
| | | Yes | 193 | 14.60% |
| Environmental characteristics | Lighting condition (LgC) | Daylight | 783 | 59.23% |
| | | Dark–lighted | 131 | 9.91% |
| | | Dark–not lighted | 348 | 26.32% |
| | | Dawn/dusk | 60 | 4.54% |
| | Weather (Wth) | Clear | 909 | 68.75% |
| | | Cloudy | 212 | 16.04% |
| | | Rain/snow | 136 | 10.29% |
| | | Others | 65 | 4.92% |
| | Visual obstruction (VsO) | No | 1259 | 95.23% |
| | | Yes | 63 | 4.77% |
| Vehicle characteristics | Hazardous material class (HzC) | Flammable/combustible liquid | 727 | 54.99% |
| | | Gases | 191 | 14.45% |
| | | Corrosive | 94 | 7.11% |
| | | Explosives | 29 | 2.19% |
| | | Others | 281 | 21.26% |
| | Number of total vehicles (VhT) | Single | 314 | 23.75% |
| | | Two | 765 | 57.87% |
| | | Multi | 243 | 18.38% |
Table 3. Key cluster measurements.
| Cluster | Cluster Centroid (Dimension 1) | Cluster Centroid (Dimension 2) | Size | Percentage | Sum of Squares |
|---|---|---|---|---|---|
| C1 | −0.0191 | −0.0047 | 561 | 42.44% | 0.0492 |
| C2 | 0.0229 | −0.0116 | 402 | 30.41% | 0.0323 |
| C3 | 0.0071 | 0.0226 | 324 | 24.51% | 0.0447 |
| C4 | −0.0099 | −0.0354 | 35 | 2.64% | 0.0324 |
Table 4. Proportion distributions of the variables by clusters.
| Variables | Category | Number of Crashes | Cluster 1 (561) | Cluster 2 (402) | Cluster 3 (324) | Cluster 4 (35) |
|---|---|---|---|---|---|---|
| Driver age (DrA) | <25 | 24 | 16.67% | 58.33% | 25.00% | 0.00% |
| | 25–35 | 183 | 35.52% | 37.70% | 22.95% | 3.83% |
| | 36–45 | 312 | 45.19% | 32.69% | 18.91% | 3.21% |
| | 46–55 | 430 | 44.64% | 27.91% | 25.12% | 2.33% |
| | 56–65 | 312 | 40.39% | 27.56% | 29.81% | 2.24% |
| | >65 | 61 | 47.54% | 24.59% | 26.23% | 1.64% |
| Driver gender (DrG) | Female | 27 | 18.52% | 25.93% | 51.85% | 3.70% |
| | Male | 1295 | 42.93% | 30.50% | 23.94% | 2.63% |
| Driver violation (DrV) | No | 1264 | 42.09% | 30.85% | 24.53% | 2.53% |
| | Yes | 58 | 50.00% | 20.69% | 24.14% | 5.17% |
| Commercial motor vehicle license status (CDL) | Valid | 1290 | 42.79% | 30.31% | 24.26% | 2.64% |
| | Not Valid | 32 | 28.13% | 34.37% | 34.37% | 3.13% |
| Previous recorded crashes (PrA) | No | 1112 | 43.07% | 30.31% | 24.46% | 2.16% |
| | Yes | 210 | 39.05% | 30.95% | 24.76% | 5.24% |
| Previous recorded suspensions, revocations, and withdrawals (PrvSs) | No | 1280 | 42.57% | 30.08% | 24.69% | 2.66% |
| | Yes | 42 | 38.10% | 40.47% | 19.05% | 2.38% |
| Previous speeding convictions (PrvSp) | No | 1106 | 43.13% | 30.11% | 24.23% | 2.53% |
| | Yes | 216 | 38.89% | 31.94% | 25.93% | 3.24% |
| Previous other moving violation convictions (PrO) | No | 1040 | 44.52% | 30.48% | 22.31% | 2.69% |
| | Yes | 282 | 34.76% | 30.14% | 32.62% | 2.48% |
| Distraction (Dst) | No | 956 | 49.27% | 29.18% | 19.14% | 2.41% |
| | Yes | 75 | 13.33% | 33.33% | 48.01% | 5.33% |
| | Others | 291 | 27.49% | 33.68% | 36.08% | 2.75% |
| Impairment (Imp) | None/Apparently Normal | 998 | 49.90% | 30.26% | 17.33% | 2.51% |
| | Asleep, Fatigue or Physical Impairment | 53 | 3.77% | 26.42% | 67.92% | 1.89% |
| | Others | 271 | 22.51% | 31.73% | 42.44% | 3.32% |
| Trafficway Description (TrW) | Two-Way, Not Divided | 686 | 63.85% | 30.17% | 4.81% | 1.17% |
| | Two-Way, Divided, Unprotected Median | 289 | 28.03% | 42.90% | 28.72% | 0.35% |
| | Two-Way, Divided, Positive Median Barrier | 261 | 2.68% | 24.52% | 72.80% | 0.00% |
| | Two-Way, Not Divided With a Continuous Left-Turn Lane | 37 | 94.59% | 5.41% | 0.00% | 0.00% |
| | Entrance/Exit Ramp | 24 | 0.00% | 20.83% | 75.00% | 4.17% |
| | Others | 25 | 0.00% | 0.00% | 0.00% | 100.00% |
| Speed limit (SpL) | <25 mph | 56 | 14.29% | 19.64% | 14.29% | 51.78% |
| | 30–45 mph | 206 | 63.11% | 30.58% | 4.85% | 1.46% |
| | 50–65 mph | 790 | 48.99% | 31.77% | 18.86% | 0.38% |
| | >65 mph | 270 | 13.33% | 28.52% | 58.15% | 0.00% |
| Roadway alignment (Alg) | Straight | 1028 | 44.66% | 30.54% | 23.44% | 1.36% |
| | Curve | 248 | 34.68% | 35.89% | 29.03% | 0.40% |
| | Others | 46 | 28.26% | 4.35% | 23.91% | 43.48% |
| Roadway grade (Grd) | Level | 896 | 45.98% | 29.35% | 23.33% | 1.34% |
| | Grade | 374 | 35.83% | 36.10% | 28.07% | 0.00% |
| | Others | 52 | 26.92% | 9.62% | 19.23% | 44.23% |
| Roadway surface condition (SrC) | Dry | 1071 | 43.23% | 31.19% | 24.09% | 1.49% |
| | Non-Dry | 251 | 39.05% | 27.09% | 26.29% | 7.57% |
| Setting (Stt) | Rural | 890 | 46.52% | 30.22% | 21.69% | 1.57% |
| | Urban | 432 | 30.32% | 30.79% | 34.03% | 4.86% |
| Roadway function class (RdF) | Interstate | 335 | 0.30% | 22.69% | 76.41% | 0.60% |
| | Arterial/Collector | 936 | 58.02% | 32.05% | 6.94% | 2.99% |
| | Local | 51 | 33.33% | 50.99% | 5.88% | 9.80% |
| Intersection type (InT) | Segment | 1050 | 30.76% | 34.67% | 33.05% | 1.52% |
| | Four-Way Intersection | 177 | 81.36% | 10.17% | 0.00% | 8.47% |
| | T-Intersection | 84 | 73.81% | 21.43% | 0.00% | 4.76% |
| | Others | 11 | 72.73% | 18.18% | 9.09% | 0.00% |
| Crash hour (Hor) | 12:00–5:59 a.m. | 296 | 18.24% | 27.03% | 50.00% | 4.73% |
| | 6:00–11:59 a.m. | 405 | 52.60% | 31.11% | 15.06% | 1.23% |
| | 12:00–5:59 p.m. | 406 | 55.42% | 29.80% | 13.30% | 1.48% |
| | 6:00–11:59 p.m. | 215 | 32.09% | 34.89% | 28.37% | 4.65% |
| Manner of collision (MnC) | Not Collision with Motor Vehicle | 408 | 13.24% | 48.53% | 35.78% | 2.45% |
| | Angle | 377 | 80.64% | 10.34% | 3.18% | 5.84% |
| | Front-to-Front | 200 | 77.50% | 20.50% | 2.00% | 0.00% |
| | Front-to-Rear | 227 | 24.23% | 34.80% | 40.09% | 0.88% |
| | Sideswipe | 87 | 42.53% | 35.63% | 18.39% | 3.45% |
| | Others | 23 | 0.00% | 43.48% | 56.52% | 0.00% |
| Fire (Fir) | No | 1129 | 45.79% | 31.18% | 20.55% | 2.48% |
| | Yes | 193 | 22.80% | 25.91% | 47.66% | 3.63% |
| Number of total vehicles (VhT) | Single | 314 | 3.18% | 54.46% | 39.81% | 2.55% |
| | Two | 765 | 62.74% | 21.70% | 12.55% | 3.01% |
| | Multi | 243 | 29.22% | 28.40% | 40.73% | 1.65% |
| Hazardous material class (HzC) | Flammable/Combustible Liquid | 727 | 43.06% | 30.67% | 22.83% | 3.44% |
| | Gases | 191 | 52.88% | 29.32% | 15.18% | 2.62% |
| | Corrosive | 94 | 39.36% | 26.60% | 34.04% | 0.00% |
| | Explosives | 29 | 37.93% | 20.69% | 41.38% | 0.00% |
| | Others | 281 | 35.23% | 32.74% | 30.25% | 1.78% |
| Lighting condition (LgC) | Daylight | 783 | 55.04% | 29.63% | 14.18% | 1.15% |
| | Dark–Lighted | 131 | 23.66% | 25.19% | 40.46% | 10.69% |
| | Dark–Not Lighted | 348 | 22.13% | 41.66% | 33.62% | 2.59% |
| | Dawn/Dusk | 60 | 36.67% | 33.33% | 25.00% | 5.00% |
| Weather (Wth) | Clear | 909 | 43.90% | 29.15% | 23.87% | 3.08% |
| | Cloudy | 212 | 35.85% | 42.92% | 18.87% | 2.36% |
| | Rain/Snow | 136 | 29.41% | 36.76% | 33.09% | 0.74% |
| | Others | 65 | 32.31% | 32.31% | 33.84% | 1.54% |
| Visual obstruction (VsO) | No | 1259 | 43.05% | 30.50% | 23.83% | 2.62% |
| | Yes | 63 | 30.16% | 28.57% | 38.10% | 3.17% |
Share and Cite

MDPI and ACS Style

Sun, M.; Zhou, R. Investigation on Hazardous Material Truck Involved Fatal Crashes Using Cluster Correspondence Analysis. Sustainability 2023, 15, 9369. https://doi.org/10.3390/su15129369
