1. Introduction
Traffic accidents impose a huge social and economic burden, and countries around the world are working to reduce the damage they cause [1,2]. By the end of 2019, the total mileage of rural roads in China had reached 4.2 million kilometers [3]. With the gradual improvement of the rural road network, traffic safety in rural areas has become a research hotspot. According to the “Administrative Measures for Rural Road Construction” promulgated by the Ministry of Transport of the People’s Republic of China in 2018, rural roads are classified into three types: county roads, township roads, and village roads [4]. China prioritizes rural revitalization policies and is committed to improving road safety at the county level to achieve sustainable traffic development [5,6]. This paper analyzes accidents in the county area.
The reform of China’s economic system, implemented in the late 1970s, led to the integration of urban-rural relations [7]. Geographically, the boundary between urban and rural areas has gradually blurred, and the two areas increasingly merge with and permeate each other. In terms of spatial characteristics, population attributes, land use, and regional economy, urban and rural characteristics coexist, which gives rise to prominent contradictions, complex problems, and serious traffic safety issues. In the mid-1980s, China’s planning community and land management departments first put forward the concept of an “urban-rural junction” [8]. This region goes by many names, such as urban-rural ecotone and urban-rural junction zone, but the essence is the same. The urban-rural junction is a dynamic regional entity shaped by social, economic, cultural, and other factors [8,9]. In 2002, the State Council of China defined the urban-rural boundary area as the planned area for construction land, the areas where state-owned land and collectively owned land are mixed, and the planned area for agricultural land that includes state-owned construction land [8]. Data show that traffic accidents occurring at or near intersections account for 19.8% of annual accidents, and that 53.1% of these intersection accidents occur at intersections in urban-rural areas [10]. Government investment in transportation facilities in the urban-rural area is limited [11], which suggests that the safety situation of the urban-rural junction is not optimistic. The town-center area also suffers from chaotic traffic organization, mixed traffic flow, and heavy transit traffic, so its traffic safety problems need to be solved urgently. Therefore, this paper takes the town-rural area, which belongs to the urban-rural junction, and the town-center area as the key areas.
Figure 1, adapted from Reference [9], shows a schematic diagram of an urban-rural junction. The urban fringe and the rural fringe in Figure 1 are the key areas. This paper analyzes in depth the factors influencing traffic accidents in these key areas, with the aim of improving traffic safety in key areas of rural roads, reducing the accident rate, and ensuring rural traffic safety.
Previous traffic safety studies have compared cities, suburbs, and rural areas. So far, traffic accident severity has mainly been analyzed from five aspects: driver characteristics, vehicle characteristics, road characteristics, environment characteristics, and traffic management attributes [12,13,14,15,16,17]. Analyzing the mechanism of accidents requires a clear understanding of how the various influencing factors interact [18]. Islam analyzed the severity of motorcycle accidents on rural and urban roads and found that some variables (such as motorcyclists under the influence of alcohol, non-use of a helmet, and high-speed roadways) were significant in both areas [14], and Nguyen et al. found that perceived risks, beliefs, and environmental characteristics had a significant impact on motorcycle drivers at intersections [17]. According to 2019 data from the National Bureau of Statistics, motorcycle traffic accidents accounted for 18.4% of all traffic accidents and caused 10,474 deaths; this paper therefore analyzes the impact of motorcycle involvement on accident severity. Scholars have also studied drivers in urban and rural areas. Wu et al. found significant differences between the factors contributing to driver injury severity in single-vehicle crashes in rural and urban areas [15]. Watson et al. found that drivers in rural areas were less likely to use seat belts than drivers in urban areas [19]. Kmet et al. analyzed the accident mortality of children and adolescents in rural and urban areas and found that the mortality rate in rural areas was five times higher than that in urban areas [20]. Singh et al. found that the main cause of most traffic accidents was driver behavior, including whether the driver chose to reduce speed in adverse weather [21]. In terms of road characteristics, Pokorny found that rural roads in Norway with a lane width of 1.50–2.50 m and a shoulder width of 0.50–0.75 m are safer [18]. Colonna proposed that safety principles need to be followed rigorously in road geometric design [22]. Calvo found that wide signs are more effective at reducing speed on weekends and in heavy traffic [23]. In addition, traffic safety management policies have attracted scholars’ attention [24,25]. Based on previous studies and the collected accident data, this paper examines four aspects: driver characteristics, vehicle characteristics, road characteristics, and environmental characteristics.
At present, mathematical-statistical methods are widely used to analyze accident severity, and the model is generally chosen according to the collected data and the research needs. Traditional statistical methods applied in traffic safety research include the binary logit model [26,27], the ordered probability model (both probit and logit) [28,29,30,31,32], and the multinomial logit model [12,33]. The binary logit model is also widely used in various fields of sustainability to identify the factors influencing an event [34,35,36,37]. This paper divides accidents into severe and non-severe accidents, so the dependent variable is binary and a binary logistic model is used for the analysis. With the advance of urbanization in China, scholars have carried out some research on the urban-rural junction because of its particularity and unique traffic characteristics. However, few studies have compared the town-rural area with the town-center area. It is therefore necessary to explore the mechanism of accidents in each area specifically, in order to provide theoretical support for improving the safety of key areas of rural roads. Analyzing the factors of accident severity in the town-rural area and the town-center area within the county area is of great significance.
This study conducts qualitative and quantitative analysis of traffic accident data from the town-rural area and the town-center area in Hunan Province and uses statistical modeling to extract the important factors affecting accidents. The factors identified can serve as a reference for improving traffic safety in key areas of rural roads in the county and also offer guidance for other rural road areas. The next section explains the source of the collected data and the theoretical basis of the analysis method. Section 3 presents the results of the model and the main factors influencing accident severity. Section 4 discusses the findings of the research. Section 5 draws conclusions and outlines prospects for future work.
3. Results
This section presents the results of the multicollinearity test for each independent variable and the parameter estimates of the accident severity models for key areas of rural roads. The goodness of fit and prediction accuracy of each model are also verified.
3.1. Accident Model of the Town-Rural Area
Table 8 shows the results of the multicollinearity diagnosis. The variance inflation factor (VIF) of every variable is less than 10 and the tolerance is greater than 0.1, which indicates that there is no serious collinearity problem among the variables. IBM SPSS Statistics 22.0 was used to fit a binary logistic regression model based on the stepwise regression method, and the parameter estimates are shown in Table 9.
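The collinearity screen described above can be illustrated with a minimal Python sketch. It is not the SPSS procedure used in this study; it assumes that each VIF is obtained as 1/(1 − R²) from an ordinary least-squares regression of one predictor on the remaining predictors, with the tolerance as its reciprocal. The function names and the toy indicator columns are hypothetical and are not the accident data.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def r_squared(y, columns):
    """R^2 of an ordinary least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * y[i] for i, row in enumerate(design)) for a in range(k)]
    beta = _solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif_and_tolerance(columns):
    """Return one (VIF, tolerance) pair per predictor column."""
    results = []
    for j, target in enumerate(columns):
        others = columns[:j] + columns[j + 1:]
        tolerance = 1.0 - r_squared(target, others)
        results.append((1.0 / tolerance, tolerance))
    return results


# Hypothetical 0/1 indicator columns, for illustration only.
toy_columns = [[0, 0, 1, 1], [0, 1, 1, 1]]
for vif, tolerance in vif_and_tolerance(toy_columns):
    # Screening rule quoted in the text: VIF < 10 and tolerance > 0.1.
    print(round(vif, 3), round(tolerance, 3), vif < 10 and tolerance > 0.1)
```

A predictor that is nearly a linear combination of the others drives its tolerance toward 0 and its VIF upward, which is what the thresholds of 10 and 0.1 are meant to catch.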
As can be seen from Table 9, the p-values of the number of involved vehicles, whether a motorcycle is involved, and whether the accident occurred at an intersection are less than 0.05, indicating that these independent variables have a significant impact on the severity of the accident. Although whether the road section is normal is significant in the single-variable analysis (contingency table analysis), it is not included in the model. Because the multivariate model accounts for the influence of every independent variable simultaneously, its results are more objective, so when such contradictions occur, the statistical results of the multivariate model should prevail [45]. The above three significant variables are therefore included in the model, and the regression equation is obtained as follows:
The results of the model show the following. (i) For the number of involved vehicles (1), that is, multi-vehicle accidents, the OR value is 0.051, indicating that the odds of a multi-vehicle accident being serious are only 0.051 times those of a single-vehicle accident. That is, single-vehicle accidents are more likely to be serious than multi-vehicle accidents. (ii) For whether motorcycles are involved (1), that is, motorcycle involvement, the OR value is 6.780, indicating that the odds of a serious accident when a motorcycle is involved are 6.780 times those when no motorcycle is involved. Serious accidents are thus more likely when motorcycles are involved. (iii) For whether the accident is at an intersection (1), the OR value is 2.303, indicating that the odds of a serious accident at an intersection are 2.303 times those at a non-intersection, so accidents are more likely to be serious at intersections.
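Because an OR value scales the odds rather than the probability, its effect on the probability of a serious accident depends on the baseline. The following minimal Python sketch illustrates this using the three OR values reported above. The baseline probability of 0.20 is purely hypothetical and is not taken from the data, and the function and dictionary names are illustrative. The sketch also recovers each logit coefficient as the natural logarithm of its OR; the model constant is not reported here and is therefore not reconstructed.

```python
import math


def probability_from_odds_ratio(baseline_probability, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    odds = baseline_odds * odds_ratio
    return odds / (1.0 + odds)


# OR values reported for the town-rural model.
odds_ratios = {
    "multi_vehicle": 0.051,
    "motorcycle_involved": 6.780,
    "at_intersection": 2.303,
}

# In a logistic model, each coefficient is the natural log of its odds ratio.
coefficients = {name: math.log(value) for name, value in odds_ratios.items()}

# Hypothetical baseline probability of a serious accident (an assumption).
baseline = 0.20
for name, value in odds_ratios.items():
    shifted = probability_from_odds_ratio(baseline, value)
    print(name, round(coefficients[name], 3), round(shifted, 3))
```

An OR below 1 corresponds to a negative coefficient and lowers the probability relative to the baseline, while an OR above 1 raises it, but never by the same multiple as the OR itself unless the baseline probability is very small.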
A goodness-of-fit test and a prediction accuracy test are carried out to validate the model. Table 10 shows the results of the goodness-of-fit test and the prediction accuracy test. The p-value of the Hosmer–Lemeshow test is 0.322, which is greater than 0.05, indicating that the test is not significant and that the goodness of fit of the model is relatively good.
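The logic of the Hosmer–Lemeshow test can be sketched as follows. This is a simplified illustration and not the SPSS implementation: it assumes equal-sized groups formed by sorting cases on their predicted probability, a chi-square reference distribution with (groups − 2) degrees of freedom, and a closed-form tail probability that holds only for even degrees of freedom. All names are hypothetical, and the toy data are constructed so that observed and expected counts coincide.

```python
import math


def hosmer_lemeshow_statistic(outcomes, probabilities, groups=10):
    """Sum of (observed - expected)^2 / (n * p_bar * (1 - p_bar)) over risk groups.

    Assumes the mean predicted probability of every group lies strictly
    between 0 and 1.
    """
    pairs = sorted(zip(probabilities, outcomes))
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)  # expected number of severe cases
        observed = sum(y for _, y in chunk)  # observed number of severe cases
        mean_p = expected / size
        statistic += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return statistic


def chi_square_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even number of degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


# Toy data: ten risk groups of 20 cases whose observed counts match the predictions.
toy_probabilities, toy_outcomes = [], []
for g in range(10):
    severe = 2 * g + 1
    toy_probabilities += [0.05 + 0.1 * g] * 20
    toy_outcomes += [1] * severe + [0] * (20 - severe)

statistic = hosmer_lemeshow_statistic(toy_outcomes, toy_probabilities)
p_value = chi_square_sf_even_df(statistic, df=10 - 2)
print(p_value > 0.05)  # a non-significant result means no evidence of poor fit
```

A large p-value, such as the 0.322 reported for the town-rural model, means that the observed and predicted numbers of severe accidents do not differ significantly across risk groups.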
IBM SPSS Statistics 22.0 software is used to draw the ROC curve to test the prediction accuracy of the model, as shown in
Figure 2. SPSS was originally developed by three graduate students from Stanford University in the United States and is currently provided by IBM.
Figure 2 is the ROC curve of the traffic accident severity prediction model in the town-rural area. The green line in the figure is the diagonal reference line. The area under the green line is 0.5, which corresponds to a model with no predictive value. The blue line is the ROC curve drawn from the predicted values calculated by the binary logistic model, and its AUC value is shown in Table 11. An outer curve, which is farther from the diagonal, has higher sensitivity and specificity than an inner curve, which is closer to the diagonal [49].
Table 11 shows the estimated value and standard error of the AUC. The AUC value is 0.770, and its asymptotic 95% confidence interval is 0.695–0.845. The significance p-value is less than 0.001 (reported as 0.000), which indicates that the regression model of accident severity in the town-rural area has a good prediction effect and a high predictive diagnostic value.
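To make the AUC concrete, the following minimal Python sketch computes it as the share of (severe, non-severe) case pairs in which the severe case receives the higher predicted probability, with ties counted as one half; this pairwise share equals the area under the ROC curve. The sketch also shows a symmetric normal-approximation interval of the form estimate ± 1.96 × standard error. It is an illustration under these assumptions rather than the SPSS procedure, and the function names and toy scores are hypothetical.

```python
def auc_by_pairs(outcomes, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for y, s in zip(outcomes, scores) if y == 1]
    negatives = [s for y, s in zip(outcomes, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one severe and one non-severe case")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def normal_interval(estimate, standard_error, z=1.96):
    """Symmetric interval: estimate plus or minus z standard errors."""
    return estimate - z * standard_error, estimate + z * standard_error


# Toy example: 1 = severe accident, 0 = non-severe accident (hypothetical scores).
toy_outcomes = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
print(auc_by_pairs(toy_outcomes, toy_scores))
```

An AUC of 0.5 corresponds to the diagonal reference line, and values closer to 1 indicate that severe accidents are consistently assigned higher predicted probabilities than non-severe ones.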
3.2. Accident Model of the Town-Center Area
Table 12 shows the results of the multicollinearity diagnosis. Except for the variable of whether the road section is normal, the variance inflation factor of every variable is less than 10 and the tolerance is greater than 0.1; therefore, whether the road section is normal is excluded from the model on collinearity grounds. Table 12 shows the collinearity diagnosis results of the remaining variables, among which there is no serious collinearity problem. IBM SPSS Statistics 22.0 was used to build a binary logistic model based on the stepwise regression method, and the parameter estimates are shown in Table 13.
As can be seen from Table 13, the overall p-values of age, number of vehicles involved, and accident period are all less than 0.05, indicating that these independent variables have a significant impact on the severity of traffic accidents. Although road alignment is significant in the single-variable analysis, it is eliminated in the stepwise regression screening process in order to maximize the likelihood of the model. Therefore, the above three significant variables are included in the model, and the regression equation is obtained as follows:
The interpretation of the model is as follows. (i) For Age (2), that is, the elderly, the OR value is 16.689 and the effect is significant, indicating that the odds of a serious accident for the elderly are 16.689 times those for young people. That is, older people are more likely to have serious accidents than younger people. Age (1), that is, middle age, has no significant effect (p-value 0.961 > 0.05). Because age is coded as a set of dummy variables that are kept in the model together, Age (1) is also included in the model. (ii) For the number of involved vehicles (1), that is, multi-vehicle accidents, the OR value is 0.040 and the effect is significant, indicating that the odds of a multi-vehicle accident being serious are 0.040 times those of a single-vehicle accident. Single-vehicle accidents are thus more likely to be severe. (iii) For the accident period (1), that is, night, the OR value is 4.251 and the effect is significant, indicating that the odds of a serious accident at night are 4.251 times those during the day, so nighttime accidents are more likely to be severe than daytime accidents.
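The dummy-variable coding behind Age (1) and Age (2) can be sketched as follows. The sketch assumes three age levels with the young group as the reference category, which is consistent with the OR values above being stated relative to young people; the level labels and the function name are illustrative and are not the variable names used in SPSS.

```python
def dummy_code(value, levels, reference):
    """Return one 0/1 indicator per non-reference level, in the order of `levels`."""
    if reference not in levels or value not in levels:
        raise ValueError("value and reference must both be listed in levels")
    return [int(value == level) for level in levels if level != reference]


# Assumed labels; the first indicator plays the role of Age (1), the second Age (2).
AGE_LEVELS = ["young", "middle-aged", "elderly"]

for age_group in AGE_LEVELS:
    print(age_group, dummy_code(age_group, AGE_LEVELS, reference="young"))
```

Since the two indicators jointly represent a single categorical variable, dropping the non-significant one would merge middle-aged drivers into the reference group and change the meaning of the remaining coefficient.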
A goodness-of-fit test and a prediction accuracy test are carried out to validate the model. Table 14 shows the results of the goodness-of-fit test and the prediction accuracy test. The significance p-value of the Hosmer–Lemeshow test is 0.663, which is greater than 0.05, indicating that the test is not significant, that is, the goodness of fit of the model is relatively good.
IBM SPSS Statistics 22.0 software is used to draw the ROC curve to test the prediction accuracy of the model, as shown in
Figure 3.
Figure 3 is the ROC curve of the traffic accident severity prediction model in the town-center area. The green and blue lines in Figure 3 have the same meaning as in Figure 2.
Table 15 shows the estimated value and standard error of the AUC. The AUC value is 0.813, and its asymptotic 95% confidence interval is 0.664–0.961. The significance p-value is less than 0.001 (reported as 0.000), which indicates that the regression model of accident severity in the town-center area has a good prediction effect and a high predictive diagnostic value.