1. Introduction
In recent years, bicycling has gradually become one of the most common means of commuting for the urban populace in the United States, as it is economical, energy-saving, and environmentally friendly. Many cities across the world are currently developing programs designed to promote bicycle riding as a means of reducing road congestion, controlling air pollution, and promoting healthier and more sustainable transportation alternatives [1,2]. Specific modifications of city bicycle infrastructure, including increasing bicycle lane mileage [3,4], adding bicycle share programs [5], and improving signage and street markings, are among the extensive integrated measures that have proved most effective in increasing bicycle use [6]. However, bicyclists have a higher risk of severe crashes compared to vehicle drivers, which deters many people from adopting bicycling as their main mode of transportation [7,8]. For example, bicycle crashes were responsible for more than half a million emergency visits to hospitals in the United States in 2018 [9] and more than half a million emergency department visits for traumatic brain injury during 2009–2018 [10]. According to the National Highway Traffic Safety Administration (NHTSA), the percentage of bicyclists in total fatalities steadily increased from 1.8% to 2.2% between 2004 and 2013 [11]. Thus, the safety challenges associated with bicyclists remain a major concern in transportation planning. Several factors can contribute to bicycle injuries/fatalities and can be broadly classified as roadway-related, person-related, and environment-related. The common bicycle crash contributing factors include poor compliance with traffic laws and improper use of facilities, speeding, inadequate separation, crossing locations, inadequate conspicuity, and impairment and distraction [12].
San Antonio (located in Bexar County, Texas) is one of the fastest growing cities in the United States and the seventh most populous, with a population of 1.55 million [13]. In 2017, the population of San Antonio grew by 24,408 people, more than any other city in the United States [14]. The population of San Antonio is expected to increase by over one million by 2040, with a proportionate increase in the number of road users including pedestrians, bicyclists, and motorists. With more pedestrians, bicycles, and vehicles occupying the roads, conflicts on the road are also expected to increase. San Antonio is one of the premier places for bicycling in the United States, and the city has been enhancing its bicycle infrastructure by adding wider shoulders, multiuse paths, cycle tracks, bicycle lanes, bicycle boulevards, better routes, and shared lane markings, all of which are aimed at supporting bicycle traffic safety in the city [15]. In 2016, 62 bicyclists were killed in motor vehicle crashes in Texas, a 17% increase from 2015. San Antonio, the second largest city in Texas, witnessed five deaths in 2016, a 25% increase from the previous year [16].
Previous studies have used several modeling techniques in traffic safety analysis, including simultaneous equations [17], negative binomial [18,19,20], random effect ordered logit [21], ordinal probit [22], random effect negative binomial [23], and Bayesian hierarchical binomial and logistic models [24]. Logistic regression has also been widely used, especially in predicting motor-vehicle crashes by young drivers [25], determining the effect of age and/or gender on injury severity in head-on motor vehicle collisions [26], analyzing circumstances of bicycle crashes and injury patterns of cyclist casualties [27], improving motor vehicle/bicycle crash databases [28], examining factors associated with bicycle injuries [29], determining risk factors related to e-bike and bicycle crashes [30], and determining the association of crash-related factors with mobile phone use for motorcyclists and e-bikers [31].
The safety of bicyclists can be enhanced by reducing collision risk and/or reducing the severity risk of a crash. Crash severity can be influenced by the behavior of the party at fault in a crash [32,33,34]. Pedestrian crashes support the hypothesis that the crash severity increases when the non-motorist is at fault, whereas bicycle crashes suggest otherwise [33]. Some studies also tried to determine the most suitable type of bicycle facility (e.g., on-street facility/shared-use path/curb lane) for safety and mobility [35,36], as well as the effect of specific roadway facilities [37,38]. However, studies focused on the effect of bicycle facilities (e.g., bicycle lanes, curb lanes, shared lane arrows, and road signage), as well as the effect of combinations of these strategies, are still limited [6]. Previous studies found several factors that significantly affect bicyclist injury severity resulting from bicycle–motor vehicle crashes, such as bicyclist race and gender, older bicyclist age, speeding, alcohol influence of driver/bicyclist, use of helmet, lighting condition, day of week, road type, and presence of intersection [12,33,39,40,41,42,43,44].
This study aims to contribute to a safer roadway for bicyclists by identifying high-risk locations and analyzing the effect of contributing factors on bicycle crash severity. This will assist the City of San Antonio, the Texas Department of Transportation, and other traffic management stakeholders in prioritizing the allocation of available resources to high-risk locations, modifying existing facilities at high-risk locations, making informed decisions regarding future designs, and running campaigns aimed at targeted audiences. This study also contributes to the state of the literature by estimating the significant variables associated with different levels of injury severity of bicyclists, and by examining how crash-related variables differ by party at fault for different injury levels and how the presence of a bicycle facility affects crash severity.
2. Materials and Methods
The crash data used in this study were acquired from the Texas Department of Transportation’s (TxDOT) Crash Records Information System (CRIS) for the 5 year study period (January 2014 to December 2018). Starting from 2003, all crashes that occurred on Texas roads and were reported by law enforcement officers, when predefined criteria were met, were included in the database. If any crash resulted in injury or death of any person involved or property damage ($1000 or more), a report was forwarded by the law enforcement officer to TxDOT no later than the 10th day after the date of the crash. The CRIS database includes the location and time of crash occurrence along with other relevant information pertaining to the crash (e.g., environmental, temporal, road, and bicyclists’ characteristics). The bike facility data were collected from the Traffic Engineering Division of the Transportation and Capital Improvements Department (TCI) of San Antonio. The facility demarcation is representative of the centerline of the facility.
Injury severity of the bicyclists was divided into two categories: KA (fatal or incapacitating injury; i.e., severe injury) and KAB (fatal, incapacitating, or non-incapacitating injury; i.e., any confirmed injury). Bicycle crashes that occurred on roads not maintained by the City of San Antonio were excluded from the analysis. Five datasets were prepared for analyses: all bicyclist-related crashes, bicyclist-not-at-fault crashes, bicyclist-at-fault crashes, on-facility bicycle crashes, and off-facility bicycle crashes. Any crash that occurred within 15 m of either side of the centerline of a road with a bicycle facility was considered an on-facility crash. A bicyclist was assumed to be at fault if associated with any of the following primary contributing factors: changed lane when unsafe, disregarded stop and go signal, disregarded warning sign at construction, inattention, failed to control speed/speeding, disregarded stop sign or light, disregarded turn marks at intersection, faulty evasive action, fleeing or evading police, followed too closely, was drinking, overtook and passed with insufficient clearance, turned improperly, under influence of alcohol/drug, driving on wrong side/way, and was using a cellphone.
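The dataset preparation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: coordinates are taken to be planar and in meters, the facility centerline is a simple polyline, and the names `AT_FAULT_FACTORS`, `is_on_facility`, and `bicyclist_at_fault` are hypothetical rather than part of CRIS or any GIS tool. Only a subset of the contributing factors is listed.

```python
import math

# Hypothetical subset of the primary contributing factors listed in the text.
AT_FAULT_FACTORS = {
    "changed lane when unsafe",
    "disregarded stop sign or light",
    "failed to control speed/speeding",
    "driving on wrong side/way",
}

ON_FACILITY_BUFFER_M = 15.0  # distance threshold stated in the text


def point_segment_distance(p, a, b):
    """Shortest planar distance (m) from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def is_on_facility(crash_xy, centerline):
    """True if the crash lies within the buffer of any centerline segment."""
    return any(
        point_segment_distance(crash_xy, a, b) <= ON_FACILITY_BUFFER_M
        for a, b in zip(centerline, centerline[1:])
    )


def bicyclist_at_fault(factors):
    """True if any reported contributing factor is in the at-fault set."""
    return any(f.lower() in AT_FAULT_FACTORS for f in factors)
```

With a straight centerline from (0, 0) to (100, 0), a crash at (50, 10) would fall in the on-facility dataset and one at (50, 20) in the off-facility dataset.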
A heatmap (Figure 1) was created using reported locations of bicycle crashes to represent crash density over the study area. The density was calculated with the kernel density estimation (KDE) method, which visualizes crash data as a continuous surface [46], and was displayed using a color scheme consisting of a set of smoothly varying colors [45]. KDE fits a kernel density surface over each crash point, where the density value peaks at the center and gradually declines away from the center [47,48]. The KDE tool uses the quartic kernel function represented by the following equation:
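$$K(\mathbf{x}) = \begin{cases} \dfrac{3}{\pi}\left(1 - \mathbf{x}^{T}\mathbf{x}\right)^{2} & \text{if } \mathbf{x}^{T}\mathbf{x} < 1 \\ 0 & \text{otherwise} \end{cases} \quad (1)$$
where $K$ is the kernel function for two-dimensional $\mathbf{x}$. Generally, $K$ is a radially symmetric unimodal probability density function [47].
The predicted density at a $(x, y)$ location is determined by the following formula:
$$\text{Density} = \frac{1}{(\text{radius})^{2}} \sum_{i=1}^{n} \left[ \frac{3}{\pi} \cdot \text{pop}_i \left( 1 - \left( \frac{\text{dist}_i}{\text{radius}} \right)^{2} \right)^{2} \right], \quad \text{for } \text{dist}_i < \text{radius} \quad (2)$$
where $i = 1, \ldots, n$ are input points or point crashes, $\text{pop}_i$ is the population field value of point $i$, and $\text{dist}_i$ is the distance between point $i$ and the location $(x, y)$; radius denotes the search bandwidth.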
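To make the density calculation concrete, the following is a minimal sketch of a quartic-kernel density estimate at a single location. It assumes the common formulation in which each point within the search radius contributes (3/π)·pop·(1 − (dist/radius)²)² and the sum is scaled by 1/radius²; whether this matches the GIS tool used in the study in every detail is an assumption. The function name `quartic_density` and its parameters are hypothetical.

```python
import math


def quartic_density(location, points, radius, pop=None):
    """Quartic-kernel density at `location` from planar `points`.

    `pop` holds optional per-point weights (population field values);
    every point counts once when it is omitted.
    """
    x, y = location
    if pop is None:
        pop = [1.0] * len(points)
    total = 0.0
    for (px, py), weight in zip(points, pop):
        dist = math.hypot(px - x, py - y)
        if dist < radius:  # points at or beyond the radius contribute nothing
            total += (3.0 / math.pi) * weight * (1.0 - (dist / radius) ** 2) ** 2
    return total / radius ** 2
```

Evaluating this function over a regular grid of locations yields the continuous surface that a heatmap displays. A larger `radius` spreads each crash over a wider area and produces a smoother surface.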
The hotspot analysis (Getis-Ord Gi*) was used to identify statistically significant spatial clusters of high values (hotspots) and low values (cold spots) [49]. For each input feature, an output feature is created containing a z-score, p-value, and confidence level bin. The Getis-Ord local statistic is defined as
$$G_i^{*} = \frac{\sum_{j=1}^{n} w_{ij} x_j - \bar{X} \sum_{j=1}^{n} w_{ij}}{S \sqrt{\dfrac{n \sum_{j=1}^{n} w_{ij}^{2} - \left( \sum_{j=1}^{n} w_{ij} \right)^{2}}{n - 1}}} \quad (3)$$
where $x_j$ = the attribute value for feature $j$, $w_{ij}$ = the spatial weight between feature $i$ and $j$, and $n$ = the total number of features.
The average of the observed values is expressed as
$$\bar{X} = \frac{\sum_{j=1}^{n} x_j}{n} \quad (4)$$
The standard deviation is expressed as
$$S = \sqrt{\frac{\sum_{j=1}^{n} x_j^{2}}{n} - \bar{X}^{2}} \quad (5)$$
The complete spatial randomness (CSR) of the features or the values associated with those features is the underlying null hypothesis for this analysis. The null hypothesis is rejected when the z-score is relatively high/low and the p-value is very small, indicating statistically significant clustering or dispersion of features or values associated with features. The p-value represents the probability of randomness in clustering, and z-scores represent standard deviations. A very small p-value along with a very high/low z-score indicates a small probability that a cluster is the product of a random distribution. When the search bandwidth is excessively large, a very smooth pattern is produced, which makes it harder to differentiate between local hotspot locations. On the other hand, a narrow search bandwidth produces a spiky density pattern that highlights individual hotspot locations. Therefore, the use of an excessively large or narrow bandwidth might lead to false conclusions. To overcome this limitation, a trial-and-error method was adopted as recommended by previous studies [49,50,51]. For point data, a fixed distance band is a suitable conceptualization of spatial relationships; the selected threshold distance was 250 m.
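The Gi* calculation with a fixed distance band can be sketched as follows. This is an illustrative implementation of the standard Getis-Ord Gi* z-score with binary weights (1 for features within the threshold distance, including the feature itself, and 0 otherwise), which is an assumption about how the fixed distance band is applied. The function name `getis_ord_gi_star` is hypothetical, and coordinates are assumed to be planar and in meters.

```python
import math


def getis_ord_gi_star(coords, values, threshold):
    """Return the Gi* z-score of every feature using a fixed distance band."""
    n = len(values)
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean ** 2
    s = math.sqrt(max(variance, 0.0))  # guard against tiny negative round-off
    z_scores = []
    for xi, yi in coords:
        # Binary weights: 1 inside the distance band (self included), else 0.
        weights = [
            1.0 if math.hypot(xi - xj, yi - yj) <= threshold else 0.0
            for xj, yj in coords
        ]
        sum_w = sum(weights)
        sum_w2 = sum(w * w for w in weights)
        numerator = sum(w * v for w, v in zip(weights, values)) - mean * sum_w
        spread = max((n * sum_w2 - sum_w ** 2) / (n - 1), 0.0)
        denominator = s * math.sqrt(spread)
        # The statistic is undefined when the denominator is zero; report 0.
        z_scores.append(numerator / denominator if denominator > 0 else 0.0)
    return z_scores
```

In this sketch, `values` could hold a severity-weighted crash score per location and `threshold` the 250 m band. Large positive z-scores mark candidate hotspots and large negative z-scores mark candidate cold spots.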
Heatmaps derived from crash density alone cannot identify locations prone to severe crashes. To identify hot/cold spots on the basis of crash severity, each crash must be assigned a weight based on its severity. The “compromise approach” (more severe crashes are given greater weight) has been popular for weighting crashes, but there is no established standard weighting system [52]. Crashes involving a fatality, serious injury, other injury, and property damage only were assigned severity indices of 3.0, 1.8, 1.3, and 1.0, respectively, by the Roads and Traffic Authority of New South Wales [53]. Another study based on Flanders, Belgium, used 5.0, 3.0, and 1.0 as severity indices for fatal, serious, and light injury crashes, respectively [54]. This study was primarily focused on identifying the high-risk locations for bicyclists in its spatial analysis segment and placed relatively greater weights on severe bicycle crashes. The following equation was used to determine the severity index of any location:
where $Y_1$, $Y_2$, $Y_3$, $Y_4$, and $Y_5$ represent the total number of crashes involving a fatality, serious injury, non-serious injury, possible injury, and no injury, respectively.
Bicycle–motor vehicle crash-associated variables selected for statistical analyses (Table 1) were based on a literature review [12,30,33,39,40,42,43,44,55] to study their effect on the severity of bicycle crashes as stand-alone variables and in conjunction with others. Bivariate analysis is an exploratory tool for testing the hypothesis of association between a dependent and an independent variable, and it was used in this study to explore the relationships between bicyclist injury severity and bicycle–motor vehicle crash-associated variables (as stand-alone variables). Chi-squared tests were performed for each categorical variable in the bivariate analysis to determine statistically significant differences among two or more classes in the distribution of the variable. Although a chi-squared test can determine the association between two variables, it fails to account for possible confounding factors. Hence, a definite causal relationship between two variables cannot be established from the chi-squared test. The strength of association was determined using the odds ratio (OR), which represents the ratio of the odds of an event occurring in the presence of the independent variable to the odds of that event occurring in its absence. Crash severity was used as the response variable in the logistic regression models to test the relationship of bicycle crash severity with other crash-associated variables (weather condition, lighting condition, speed limit, road class, collision type, time, period of week, month, and intersection presence) and bicyclist-related variables (age, gender, ethnicity, and helmet wearing practice) for all five datasets described above. The developed logit models were used to identify statistically significant classes within the selected variables (in terms of crash severity) and to check the strength of association of each significant class. This study used the logit as the natural logarithm of the odds, as shown in Equation (7).
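$$\operatorname{logit}(A) = \ln\left(\frac{A}{1 - A}\right) = \beta_0 + \sum_{i=1}^{n} \beta_i Z_i \quad (7)$$
where $A$ is the probability of a severe crash, $Z_i$ is the $i$th independent variable, $\beta_0$ is the intercept, and $\beta_i$ is the model coefficient, which directly determines the odds ratio as $\exp(\beta_i)$.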
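The relationship between the chi-squared test, the odds ratio, and the logit can be illustrated with a single binary predictor. The sketch below uses a hypothetical 2×2 table (the counts are invented for illustration and are not results from this study), and the helper names are assumptions. For one binary predictor, the logistic regression coefficient equals the difference of the two group logits, so its exponential reproduces the odds ratio.

```python
import math


def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a, b = severe, non-severe with the factor present;
    c, d = severe, non-severe with the factor absent."""
    return (a / b) / (c / d)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def logit(p):
    """Natural logarithm of the odds of probability p."""
    return math.log(p / (1.0 - p))


# Hypothetical counts: 30 of 100 crashes severe with the factor, 15 of 100 without.
a, b, c, d = 30, 70, 15, 85
oratio = odds_ratio(a, b, c, d)
chi2 = chi_squared_2x2(a, b, c, d)

# Single-predictor logistic model: beta_1 is the difference of the group logits.
beta_1 = logit(a / (a + b)) - logit(c / (c + d))
```

Here `chi2` exceeds 3.841, the 5% critical value for one degree of freedom, so the association would be judged significant, and `math.exp(beta_1)` matches `oratio`. With several predictors, the coefficients must instead be estimated jointly, and each exponentiated coefficient is an odds ratio adjusted for the other variables.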
The injury severity is determined by the officer on duty without cross-checking with hospital data or other external sources, and the party at fault is determined on the basis of the contributing factors reported by the officer; both can be subject to errors. Motor vehicle–bicycle crashes resulting in minor injuries or no visible injuries are often underreported, which might have led to a bias in results. Traffic volume data were not included in the spatial analysis, and only basic demographic variables were included, due to the unavailability of detailed city-wide traffic volume data. To the best of our knowledge, the traffic policies and environmental factors did not change significantly during the study period. However, treating the dataset as static over the study period might be a limitation of this study due to the lack of control of environmental factors. Bias in sample selection was also possible, as is often the case in crash data analyses, because people involved in crashes might not be representative of the general road user population. Coordinates were unavailable for 17.6% of the crashes; thus, the heatmap and hotspots were prepared using only the crashes with available coordinates.
4. Conclusions
This study examined the spatial distribution of bicycle crashes and analyzed the factors that influence the injury severity of bicyclists involved in bicycle–motor vehicle crashes using 5 year crash data for San Antonio, TX. Bivariate analysis and logistic regression modeling were used to examine the relationships among different human-, environment-, and crash-related factors for two injury severity levels. Bicyclist age and ethnicity, lighting condition, road class, time of day, and day of the week had the most significant association with severe bicycle crashes. Overall, bicycle crashes resulting in severe injury of bicyclists were more strongly influenced by the variables studied compared to non-severe crashes.
In the absence of bicycle facilities, severe bicycle crashes had several strong predictors (bicyclist age and ethnicity, intersection presence, and temporal variables), while on-facility severe bicycle crashes had almost none, implying the effectiveness of bicycle facilities in reducing the influence of some variables on crash severity. Facilities such as separate/protected bicycle lanes might contribute to decreasing the chance of a deadly collision with motor vehicles, thereby reducing severe injury risk. Although the presence of bicycle facilities reduced the effects of some variables to some extent, their presence made no statistically significant difference in crash severity when compared to roads without bicycle facilities.
Wearing a helmet is not mandatory for bicyclists in Texas [59], but some major cities in Texas have ordinances mandating helmet use. The results indicate very limited helmet use among crash-involved bicyclists in San Antonio and an increase in the injury severity of bicyclists in the absence of a helmet for bicyclist-at-fault and on-facility crashes. Our findings suggest that older, male, and non-Hispanic bicyclists in San Antonio are more likely to sustain severe injuries. Identifying zones with a higher proportion of older bicyclists and introducing bicycle facilities in these zones, a targeted campaign to encourage the wearing of helmets and protective gear, and the introduction of a mandatory helmet use ordinance in San Antonio might reduce the severe injuries of older bicyclists.
Bicycle crashes on roads with a relatively higher speed limit (e.g., highways and FM roads) notably increased the bicyclist injury severity risk, especially for crashes where bicyclists were not at fault and those which occurred on roads with bicycle facilities. Identification of the optimal bicycle lane width and separation of bicycle lanes from the roadway were effective in reducing the crash rate [35], and raised bicycle crossings were effective in increasing the safety of bicyclists [60]. These techniques should be introduced on roads that combine a relatively higher speed limit with bicycling activity.
The weekend period had a relatively lower bicycle crash count but a higher proportion of severe injuries. Nighttime during the weekend period had a substantially high severe injury risk, probably due to the higher frequency of DWI and distracted drivers [33]; encouraging the use of ridesharing services during this period might help reduce this risk. Bicyclists exhibited less faulty behavior during the weekend, whereas the proportion of faulty motor vehicle drivers in San Antonio was greater during the weekend [44].
The relatively higher bicyclist injury risk during summer and lower injury risk during winter might be attributed to the seasonal differences in clothing practice. Bicycle crashes, especially on-facility bicycle crashes, occurred at a much higher proportion at intersections. This might be an indicator of the effectiveness of facility implementation, as fewer crashes occurred on roadway segments where facilities were introduced.
Intersections in the city center were more prone to a higher bicycle crash frequency but lower injury severity of bicyclists, which is analogous to the overall crash pattern at intersections in San Antonio [41]. The primary contributing factors to bicycle–motor vehicle crashes were driver inattention and disregard of stop sign/light for both bicyclists and vehicle drivers. Implementation of bike-activated signal detection and bicycle signal heads at signalized intersections, encouraging and endorsing the use of autonomous vehicles, as well as the implementation of automated red-light cameras and shared lane markings along with campaigns aimed at targeted audiences, can significantly reduce bicycle crash incidents. Future studies focusing on in-depth analysis of the most significant variables might be helpful in providing case-specific recommendations.