1. Introduction
Urban road networks (URNs) support the daily lives of city dwellers and form the physical infrastructure that carries all urban transport activity. Transport networks are susceptible to various disruptions, such as infrastructure failures and temporary closures due to construction works. More severe disruptions, such as strikes or extreme weather conditions (e.g., droughts, heavy snow, and high winds), can make parts of a transport network unavailable. Such an incident is likely to cause vehicle queues in the affected section and sudden changes in the distribution of traffic flows across the URN. It may also trigger cascading failures in nearby road sections, leading to widespread network paralysis. Therefore, in addition to the growing imbalance between supply and demand, URNs must bear the risk of congestion caused by unexpected events.
URNs are complex networks consisting of a physical road network and a transport demand network. They are typical examples of complex dynamic networks, characterized by nodes, which represent locations, and edges, which represent the adjacency relationships created by road segments. Traffic flow in the transport demand network reflects how vehicles move along the roads of the URN. However, the junctions and segments of a URN are located in different places and differ in importance. It is therefore essential to evaluate the importance of roads and intersections, which helps to find the weak roads and junctions in the network. Identifying critical roads and intersections is important for urban congestion relief, emergency response planning, and urban traffic planning and management.
To analyze the importance of nodes in urban road networks effectively, researchers have proposed identifying key nodes based on their vulnerability, drawing on complex network theory. However, in actual road networks, the significance of a node is influenced not only by the network structure but also by critical factors such as geographical location and traffic characteristics. Most existing studies have examined differences in network structure based on characteristic models of road networks, often overlooking traffic characteristics and geographical location. The data generated by taxis provide a rich source of information for understanding mobility in cities, as taxis usually cover a broad range of routes. The data usually consist of GPS traces and ridership information that, compared with traditional data collection approaches (namely travel surveys), provide more accurate and accessible information for describing movements and discovering mobility patterns. To name a few uses, the availability of GPS data has allowed researchers to analyze section identification [1], collaborative public transportation [2], and travel characteristics [3].
Therefore, to address this gap, this paper proposes an improved K-shell model that incorporates both the structural characteristics of the URN (node degree and node betweenness) and traffic characteristics (node strength, number of points of interest (POIs), and average road width) to identify key nodes within the network. This comprehensive approach aims to make the identification of critical nodes more accurate and reliable by considering a broader range of influential factors. To construct weighted URNs for commuting days and rest days, traffic flows are allocated by a user equilibrium model based on the differences in daily travel demand on city streets. Then, the vulnerability of the weighted URNs is analyzed under deliberate attacks by comparing their cascading failure processes.
The remainder of this paper is organized as follows. Section 2 presents a comprehensive review of the relevant literature. The feature analysis based on taxi GPS data is described in Section 3. Section 4 presents the proposed methodology in detail. Section 5 analyzes the weighted URNs for commuting days and rest days. Finally, discussions are presented in Section 6.
2. Related Work
Complex networks have been an extensive research focus over the past 20 years. Since Watts and Strogatz introduced the ‘small-world’ model in Nature in 1998 and Barabási and Albert [4] introduced the ‘scale-free’ model in Science in 1999, research on the structure and properties of complex networks has drawn increasing attention and has penetrated mathematics, computer science, sociology, traffic networks [5], power networks [6], etc. The robustness of networks has been studied widely in the transportation field, including road networks, maritime networks, subway networks [7], and bus transit networks [8,9]. The concept of vulnerability can be used to measure the performance of networks. In transportation systems, vulnerability is often defined as susceptibility to events that could lead to reduced network availability [10].
A robust transportation network facilitates the movement of passengers, goods, and services and contributes to a thriving economy. Vulnerability has taken on several different meanings in the context of URNs. Jenelius et al. [11] argue that the concept of vulnerability with respect to road infrastructure should address two components: the probability of hazard occurrence and the consequences of the hazard event. Furthermore, Jenelius and Mattsson [12] state that the impact on a single user depends on their exposure to the disruption scenario. Berdica [13] defines vulnerability more specifically as “the susceptibility to incidents that can result in considerable reductions in road network serviceability.” Wen et al. [14] evaluated the urban road network in terms of layout and internal structure based on fractal analysis, analyzing the relationship between the length of the URN and the built-up area. Moreover, Taylor [15] defines network vulnerability in terms of the ease of access to activities from different locations within a regional network.
URNs have a heterogeneous traffic distribution. Jiang’s research showed that a few roads carry most of the traffic flow [16]. Failures and disruptions on these roads therefore often have more severe consequences. Identifying critical elements in the network is thus a crucial part of vulnerability analysis and of designing resilient transportation systems [17]. According to network science, there are various methods for measuring the importance of nodes in complex networks, such as degree centrality, betweenness centrality, and closeness centrality [18,19,20]. Many researchers have quantified the vulnerability of URNs using essential infrastructure recognition methods [21,22]. Sohn [21] analyzed the vulnerability of the road network following the disruption of critical links by considering the effects of distance decay and traffic volume on the traffic network. Ricci curvature [23] has been used to study the topological characteristics of road networks by measuring the transmission state between nodes through an optimal transmission metric. Jenelius and Mattsson [12] analyzed the vulnerability of URNs under single link closures and area-covering disruptions by combining spatial patterns with regional variations in geographic location, travel patterns, and network density. Lu [24] considered the alignment and type of roads in the URNs and studied the vulnerability of URNs with respect to traffic demand.
In practice, a purely topological model does not properly reflect traffic flow dynamics. URNs are the skeleton of the overall urban planning layout and provide safe, rapid, economical, and comfortable driving conditions for all kinds of transportation. Traffic flow refers to the flow generated by vehicles traveling in the URN. The study of traffic flow is generally divided into macroscopic traffic flow [25] and microscopic traffic flow [26]. A more realistic road traffic network can therefore be simulated by using a traffic flow model that considers route choice and travel costs. Traffic assignment is essential for constructing a weighted transportation network. Commonly used traffic assignment models include all-or-nothing [25], multi-path logit [9], system optimum [27], and user equilibrium [28]. The system optimum and user equilibrium models, which take capacity and congestion into account, have gained more attention in the vulnerability analysis of urban traffic networks [29]. Traffic assignment based on the user equilibrium model can be used to determine the segments with the most severe congestion failures [30].
While the cascading impacts of targeted attacks on road infrastructure remain largely unknown, researchers have conducted numerous studies to quantify the vulnerability of weighted networks. Zhang and Wang [22] developed a metric for road network capacity performance following natural hazards, and they discussed improving resilience by strengthening existing roads and building strategic new ones. He et al. [31] developed a multi-linked freight network and found that traffic impacts vulnerability differently. Wang et al. [32] proposed a novel framework for analyzing the vulnerability of URNs based on the tracks of real taxis. When weighted URNs fail, the structure and capacity of the network change, so travelers must change their planned routes, which reduces the efficiency of their trips. Therefore, in addition to quantifying road network vulnerability to hazards, it is also vital to understand cascading vulnerability in the underlying road traffic network [26]. Wu et al. [33] analyzed the spatial and temporal patterns of cascading failures in URNs during real rainstorms. They used the relationship between degrees of centrality as a measure of road loading and investigated the spatial correlation between URN vulnerability and cascading failures. Jia et al. [34] proposed a cascading failure model to analyze the propagation of failures in URNs, providing guidance for planning freight routes during emergencies. Wang et al. [35] analyzed the vulnerability of URNs to traffic situations at different moments. To measure cascading failure vulnerability, these researchers analyzed the post-failure road network structure, vehicle travel time, and accessibility.
However, the URN is inherently dynamic. Existing studies primarily focus on virtual flow or hypothetical traffic flow based on network topological features, with limited research on dynamic traffic flow modeling using actual data. To address this gap, this paper proposes a comprehensive method to analyze the vulnerability of urban road networks. First, areas with higher densities of order data are identified using GPS. By combining the OD demand matrix and user equilibrium model across different dates, traffic flow within road sections is accurately captured. In defining node importance, this study incorporates critical factors such as population and points of interest (POIs) surrounding the URN. This approach ensures a more realistic and data-driven analysis of node importance in the URN. Furthermore, under three deliberate attacks on three different dates, considering the geographic characteristics of the nodes, the cascade failure vulnerability of the URNs is analyzed.
3. Analysis and Mining of Traffic Features Based on Taxi GPS Tracks
With the development of urban construction and the advancement of data technology, floating car systems have been established in most Chinese cities. Once GPS devices are installed in taxis, the database receives real-time information about taxi locations and passenger status. Because GPS data have broad coverage, good continuity, and a low acquisition cost, they are widely used in studies of residents’ travel characteristics [36,37]. GPS data from taxi trips can provide useful insights into the mobility of city residents. This section analyzes the travel characteristics of residents in the central area of Harbin in terms of taxi order duration, order number, GPS spatial distribution, and geographically weighted regression.
In this section, the road network in the main urban area of Harbin City is taken as the study area. The GPS data are the trajectories of 2000 taxis, sampled every 30 s throughout the day on 14 April 2019 (rest day) and 17 April 2019 (commute day), provided by the Harbin City Transportation Bureau. Each record includes the vehicle number, time, latitude, longitude, speed, and vehicle status (0 indicates that there is no passenger and 1 indicates that there is a passenger), as shown in Table 1. The taxi OD order data for the main urban area of Harbin City are obtained by filtering and processing the GPS trajectory data.
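The filtering step can be illustrated with a minimal sketch. It assumes that an order starts when a vehicle's status changes from 0 to 1 and ends when it changes back to 0; the paper does not state its exact filtering rules, so this rule, the field names, the timestamp encoding, and the sample coordinates are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class GpsRecord:
    vehicle_id: str
    timestamp: int   # seconds since midnight (assumed encoding)
    lat: float
    lon: float
    occupied: int    # 1 = passenger on board, 0 = empty


@dataclass
class Order:
    vehicle_id: str
    start_time: int
    end_time: int
    origin: Tuple[float, float]
    destination: Tuple[float, float]


def extract_orders(records: Iterable[GpsRecord]) -> List[Order]:
    """Turn status changes 0 -> 1 (pick-up) and 1 -> 0 (drop-off) into orders."""
    orders: List[Order] = []
    previous = None  # previous record of the same vehicle
    pickup = None    # record at which the current trip started
    for rec in sorted(records, key=lambda r: (r.vehicle_id, r.timestamp)):
        if previous is not None and previous.vehicle_id != rec.vehicle_id:
            previous, pickup = None, None  # new vehicle: drop any unfinished trip
        if previous is not None:
            if previous.occupied == 0 and rec.occupied == 1:
                pickup = rec
            elif previous.occupied == 1 and rec.occupied == 0 and pickup is not None:
                orders.append(Order(rec.vehicle_id, pickup.timestamp, rec.timestamp,
                                    (pickup.lat, pickup.lon), (rec.lat, rec.lon)))
                pickup = None
        previous = rec
    return orders


def orders_per_hour(orders: List[Order]) -> List[int]:
    """Count orders by the hour in which they start (period 0 = 00:00 to 01:00)."""
    counts = [0] * 24
    for order in orders:
        counts[(order.start_time // 3600) % 24] += 1
    return counts


# Illustrative records only; not taken from the Harbin data set.
records = [
    GpsRecord("A", 0, 45.750, 126.630, 0),
    GpsRecord("A", 30, 45.751, 126.631, 1),
    GpsRecord("A", 60, 45.755, 126.635, 1),
    GpsRecord("A", 90, 45.760, 126.640, 0),
    GpsRecord("B", 0, 45.770, 126.650, 1),   # trip already under way: no pick-up seen
    GpsRecord("B", 30, 45.771, 126.651, 0),
    GpsRecord("B", 60, 45.772, 126.652, 1),  # trip never finishes in the sample
]
orders = extract_orders(records)
```

Trips whose pick-up or drop-off falls outside the observed records are discarded in this sketch, which is one possible reading of "filtering".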
(1) Travel orders per hour
The taxi OD data record the origin and destination of each passenger trip, that is, of each taxi order. This paper uses the GPS data to extract taxi order data for the Harbin study area on commuting days and rest days.
In Figure 1 and Figure 2, commuting day and rest day trips are shown according to their time-varying characteristics. Resident trips are divided into 24 one-hour periods, where period 0 denotes 00:00 to 01:00. The hours of 03:00–05:00 are the off-peak period for both commuting days and rest days. However, peak travel times differ markedly between commuting days and rest days. On commuting days, the morning peak occurs at 09:00–10:00 and evening peaks occur at 16:00 and 20:00. On rest days, the morning peak occurs at 08:00 and the evening peak at 16:00–17:00.
(2) Duration of order
As can be seen from Figure 3, the majority of taxi rides last less than twenty minutes on both commuting days and rest days, indicating that taxi trips in Harbin are mostly short. Commuting days have a longer average order duration than rest days, and peak order durations also differ between the two. As shown in Figure 3, order durations at 07:00 and 17:00 on commuting days are noticeably higher than at other times, possibly due to traffic congestion.
(3) Spatial distribution
The geographical distribution of urban travel is examined from two perspectives: the spatial distribution of GPS points and the spatial distribution of travel ODs.
This paper analyzes spatial distribution via heat maps. Figure 4 shows the GPS data for commuting days and rest days. On both kinds of day, GPS points are concentrated in the main urban area, and the difference between days is slight. Moreover, because the GPS data include track points recorded while taxis are empty, Figure 4 also reflects the tendency of taxi drivers to choose areas with high passenger demand.
The spatial distribution of taxi order data on rest days and commuting days is shown in Figure 5. For all-day ODs, the overall spatial distributions of origins and destinations do not differ markedly. The highest travel density occurs in the main urban area, reflecting its extremely strong urban centrality. However, the spatial distribution density of the OD differs slightly. On both rest days and commuting days, orders are concentrated in Hehexing Road Street, Haxi Street Office, and Wanggang Town.
(4) Geographically weighted regression analysis
Geographically weighted regression (GWR) is a statistical model for local parameter estimation. Compared with a global model, the GWR model provides a better fit for spatially varying elements because it responds to the heterogeneity of their spatial distribution [38]. The model structure of GWR is as follows:
$$y_i = \beta_0(u_i, v_i) + \sum_{k} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i$$
where $y_i$ and $x_{ik}$ are the dependent variable and the $k$-th explanatory variable for traffic zone $i$, $(u_i, v_i)$ is the coordinates of the geographic center of traffic zone $i$, $\beta_k(u, v)$ is a continuous function of location (the Gaussian kernel function is used for GWR in this paper), $\beta_k(u_i, v_i)$ are the $k$ regression parameters for traffic zone $i$, and $\varepsilon_i$ is the random error term.
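To make the local estimation concrete, the following is a minimal sketch of GWR with a single explanatory variable, in which each zone gets its own weighted least squares fit. The function names, the fixed bandwidth, and the use of planar coordinates are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_weight(distance: float, bandwidth: float) -> float:
    """Gaussian kernel: nearby zones receive weights close to 1."""
    return math.exp(-0.5 * (distance / bandwidth) ** 2)


def gwr_fit(coords: Sequence[Tuple[float, float]],
            x: Sequence[float],
            y: Sequence[float],
            bandwidth: float) -> List[Tuple[float, float]]:
    """Return (beta0_i, beta1_i) for every zone i using weighted least squares.

    Assumes one explanatory variable whose values are not all identical.
    """
    results = []
    for (ui, vi) in coords:
        # Weight of every zone j as seen from zone i.
        w = [gaussian_weight(math.hypot(ui - uj, vi - vj), bandwidth)
             for (uj, vj) in coords]
        sw = sum(w)
        x_bar = sum(wj * xj for wj, xj in zip(w, x)) / sw
        y_bar = sum(wj * yj for wj, yj in zip(w, y)) / sw
        sxx = sum(wj * (xj - x_bar) ** 2 for wj, xj in zip(w, x))
        sxy = sum(wj * (xj - x_bar) * (yj - y_bar)
                  for wj, xj, yj in zip(w, x, y))
        beta1 = sxy / sxx
        beta0 = y_bar - beta1 * x_bar
        results.append((beta0, beta1))
    return results


# Illustrative data: four zones on a line, with y = 2x + 1 exactly,
# so every local fit should recover the same coefficients.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
local_params = gwr_fit(coords, x, y, bandwidth=1.5)
```

When the relationship between the variables changes across space, the local coefficients differ from zone to zone, which is the spatial heterogeneity that a single global regression cannot represent.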
Figure 6 depicts the geographically weighted regression between urban travel demand, POI, and population size for commuting days and rest days. The strength of the correlation between the variables is indicated by the colored bands. In densely populated areas, the morning and evening peaks affect taxi ridership differently. The regression analysis shows that taxi passenger flow is positively correlated with population density in the urban core, where travel demand is higher. For the morning peak on rest days, the correlation between POI and taxi passenger flow is lower in the core area. The correlation is stronger during evening peak hours, which is consistent with rest day travel patterns.
4. Evaluation Framework: Vulnerability Analysis Method
The vulnerability analysis framework proposed in this paper consists of four main components. First, weighted URNs are constructed based on complex network theory. Second, using the GPS data extracted in the previous section, the OD travel volumes on commuting days and rest days are assigned to the links of the road network. Third, a method of node importance assessment based on residents’ travel characteristics is proposed by combining the population and POI numbers of traffic zones with methods of critical node identification in complex networks. Finally, the characteristics of the two weighted road networks for commuting days and rest days are analyzed using structural vulnerability metrics. As shown in Figure 7, this section presents a framework for analyzing network vulnerability based on structural and functional assessment indicators.
4.1. Flow Distribution Model
In a real URN, road managers are concerned mainly with flow changes on road sections, while travelers are concerned with the difficulty of travel. The user equilibrium model formalizes the traffic problem as a route choice: travelers complete their journey by choosing a path $k$. The flows that arise when each driver minimizes their own travel time are known as user equilibrium flows. In the resulting system state, no driver can reduce their travel time by unilaterally changing route. This is the Nash equilibrium on roads described by Wardrop’s first principle: the travel times on all routes actually used between an origin–destination (OD) pair are equal, and no greater than the travel time that a single vehicle would experience on any unused route. The flow distribution under user equilibrium is taken as the flow distribution of the network. The convex program for the user equilibrium problem has been formulated [39] as follows:
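$$\min\; Z(x) = \sum_{l \in L} \int_{0}^{x_l} t_l(\omega)\, d\omega$$
$$\text{s.t.}\quad \sum_{k} f_{k}^{st} = q_{st}, \qquad f_{k}^{st} \ge 0, \qquad x_l = \sum_{s}\sum_{t}\sum_{k} f_{k}^{st}\, \delta_{l,k}^{st}$$
where $L$ is the set of all links $l$ in a traffic network and $t_l(x_l)$ is the link characteristic function that describes the relationship between the travel time $t_l$ and the passenger flow $x_l$ on link $l$. This paper considers the BPR function as the link characteristic function of URNs. $f_{k}^{st}$ is the passenger flow between the origin node $s$ and the destination node $t$ on path $k$, $q_{st}$ is the total traffic between the origin node $s$ and the destination node $t$, and $\delta_{l,k}^{st} = 1$ when section $l$ is on path $k$ (and 0 otherwise). The BPR function is as follows:
$$t_l(x_l) = t_l^{0}\left[1 + \alpha \left(\frac{x_l}{C_l}\right)^{\beta}\right]$$
where $C_l$ is the capacity of road segment $l$ and $t_l^{0}$ is the free-flow travel time. $\alpha$ and $\beta$ are the shape coefficients, the values of which are generally set as $\alpha = 0.15$ and $\beta = 4.0$. Since a simple topological structure ignores the heterogeneity of traffic flows in a traffic network, edge weights are introduced to quantify the distribution of traffic flows through the travel times $t_l$.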
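As a small illustration of the equilibrium condition, the sketch below evaluates the BPR function and solves the special case of one OD pair served by two parallel routes, where user equilibrium reduces to equal travel times on both used routes. The bisection solver and all names are hypothetical; a full network assignment would require an iterative method such as Frank–Wolfe, which is not shown here.

```python
from typing import Tuple


def bpr_time(flow: float, free_flow_time: float, capacity: float,
             alpha: float = 0.15, beta: float = 4.0) -> float:
    """BPR link travel time for a given flow."""
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)


def two_route_equilibrium(demand: float,
                          route_a: Tuple[float, float],
                          route_b: Tuple[float, float],
                          tol: float = 1e-9) -> Tuple[float, float]:
    """Split demand over two parallel routes so that used routes have equal time.

    Each route is given as (free_flow_time, capacity).
    """
    def gap(flow_a: float) -> float:
        # Travel time on A minus travel time on B; increases with flow_a.
        return bpr_time(flow_a, *route_a) - bpr_time(demand - flow_a, *route_b)

    # Corner cases: one route is no slower even when it carries all the demand.
    if gap(demand) <= 0.0:
        return demand, 0.0
    if gap(0.0) >= 0.0:
        return 0.0, demand

    lo, hi = 0.0, demand
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid   # route A too slow: shift flow to B
        else:
            lo = mid
    flow_a = 0.5 * (lo + hi)
    return flow_a, demand - flow_a


# Illustrative values: a short low-capacity route and a longer high-capacity route.
flow_a, flow_b = two_route_equilibrium(300.0, (10.0, 100.0), (12.0, 200.0))
time_a = bpr_time(flow_a, 10.0, 100.0)
time_b = bpr_time(flow_b, 12.0, 200.0)
```

At the solution both routes carry flow and `time_a` and `time_b` agree to within the tolerance, so no traveler can shorten the trip by switching routes.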
4.2. Critical Node Identification
In Section 3, the distribution of OD orders and its relationship with population and POI were analyzed. Building on that analysis, this section proposes node importance indicators that consider both complex network theory and traffic characteristics.
Improving the K-shell algorithm:
The K-shell algorithm traditionally produces a coarse result for node importance division, as a large number of nodes are assigned the same K value, making it difficult to distinguish the importance of nodes within the same layer.
The GWR results indicate that road intersections exhibit distinct characteristics depending on their structure and traffic conditions. This paper therefore proposes an improved K-shell algorithm that takes into account both the structure of the road network and its traffic characteristics, defining the importance of an intersection from its K-shell value and a combined characteristic value,
where $W_i$ is the importance of intersection $i$, $K_s$ is the K-shell value of intersection $i$, and $K_i$ is the combined characteristic value of intersection $i$.
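The coarse layering produced by the traditional method can be reproduced on a toy graph. The following is a minimal sketch of the classical K-shell decomposition only, not of the improved algorithm; the function name `k_shell` and the five-node graph are illustrative assumptions.

```python
from typing import Dict, Hashable, Set


def k_shell(adjacency: Dict[Hashable, Set[Hashable]]) -> Dict[Hashable, int]:
    """Classical K-shell decomposition of an undirected graph without self-loops.

    Nodes of degree <= k are peeled repeatedly; a node's shell is the k at
    which it is removed.
    """
    remaining = {node: set(neigh) for node, neigh in adjacency.items()}
    shell: Dict[Hashable, int] = {}
    k = 0
    while remaining:
        peel = [n for n, neigh in remaining.items() if len(neigh) <= k]
        if not peel:
            k += 1          # nothing left to peel at this level
            continue
        for n in peel:
            shell[n] = k
            for m in list(remaining[n]):
                remaining[m].discard(n)   # n no longer counts toward m's degree
            del remaining[n]
    return shell


# Toy network: a triangle (a, b, c) with a chain a - d - e attached to it.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a", "e"},
    "e": {"d"},
}
shells = k_shell(graph)
```

In this example `a`, `b`, and `c` all fall in the same shell even though `a` is the only connection between the triangle and the chain, which is the kind of tie that additional structural and traffic indicators are meant to break.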
The combined characteristic value of an intersection comprises two parts: the structural characteristic index and the traffic characteristic index. The former includes classical metrics in complex networks, such as node degree and node betweenness. The latter is unique to road networks and better characterizes the actual situation of the intersection, including node strength, number of points of interest, and average road width. The intersection combined characteristic value is defined as follows:
where $K_{\alpha,i}$ is the intersection structural characteristic index, $K_{\beta,i}$ is the intersection traffic characteristic index, and $\lambda$ is the coefficient to be determined.
(1) Index of Structural Characteristics. This index measures the importance of an intersection in terms of its structural characteristics by taking into account the connectivity of the node and the number of shortest paths passing through it. The calculation is as follows:
where $d_i$ is the degree of node $i$, $B_i$ denotes the betweenness of node $i$, and $\sigma$ is a coefficient to be determined.
The node degree represents the number of edges connected to the node and is defined as follows:
where $j$ ranges over the set of nodes connected to $i$.
Node betweenness is the ratio of the number of shortest paths passing through the node to the number of shortest paths in the entire network. Travelers typically choose the shortest path, so a node that lies on many shortest paths will have a higher traffic demand. The node betweenness can be calculated as follows:
$$B_i = \sum_{m \neq n \neq i} \frac{e_{mn}(i)}{e_{mn}}$$
where $e_{mn}(i)$ is the number of shortest paths of the node pair (node $m$ and node $n$) through node $i$ and $e_{mn}$ is the total number of shortest paths between the node pair (node $m$ and node $n$).
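The two structural metrics can be computed on a small example as follows. This is a minimal sketch for an unweighted, undirected graph, assuming hop counts as path lengths and an unnormalized betweenness summed over unordered node pairs; it uses Brandes' accumulation scheme, which is a standard technique and not necessarily the procedure used in the paper.

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def degree(adjacency: Graph) -> Dict[Hashable, int]:
    """Number of edges incident to each node."""
    return {n: len(neigh) for n, neigh in adjacency.items()}


def betweenness(adjacency: Graph) -> Dict[Hashable, float]:
    """Sum over unordered pairs (m, n) of the share of shortest m-n paths through i."""
    score = {n: 0.0 for n in adjacency}
    for s in adjacency:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {n: 0 for n in adjacency}
        sigma[s] = 1
        preds = {n: [] for n in adjacency}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes.
        delta = {n: 0.0 for n in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both endpoints.
    return {n: b / 2.0 for n, b in score.items()}


# Toy network: a four-node ring, where every opposite pair has two shortest paths.
ring = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
ring_degree = degree(ring)
ring_betweenness = betweenness(ring)
```

Dividing the scores by the number of node pairs would give a normalized ratio if one is needed for combining the metric with the degree.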
(2) Traffic Characteristic Indicators. Traffic characteristic indicators are calculated by considering node strength based on traffic, the number of points of interest (POIs), and population. These indicators are used to evaluate the intersection in terms of its traffic characteristics and are calculated as follows:
where $M_{1,i}$ is the strength of node $i$, $M_{2,i}$ is the number of POIs of node $i$, $M_{3,i}$ is the average population of node $i$ (the number of people in the region where node $i$ is located), and $\mu_1$, $\mu_2$, and $\mu_3$ are coefficients to be determined.
Node strength is calculated as the sum of the weights of all edges adjacent to the node. In this paper, the traffic on the roads adjacent to node $i$ is used as the node weight. Traffic is a quantitative indicator that characterizes the size of traffic demand. The formula for calculating node strength is as follows:
$$M_{1,i} = \sum_{j \in \Gamma_i} q_{ij}$$
where $\Gamma_i$ is the set of nodes adjacent to node $i$ and $q_{ij}$ is the flow between node $i$ and node $j$.
The quantity of M2,i is determined by the number of POIs within a 200 m diameter centered on node i. POIs may indicate the strength of traffic attraction; a higher number of POIs indicates a denser distribution of public service facilities, resulting in a greater volume of trips and an increased likelihood of traffic congestion. We obtained a total of 80,255 POIs, categorized into 14 major and 139 medium categories.
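Both quantities are simple to compute once link flows and POI locations are available. The sketch below assumes undirected links keyed by node pairs and planar coordinates in metres, so that distances can be taken directly; the function names and sample values are hypothetical, and real longitude and latitude data would first need to be projected.

```python
import math
from typing import Dict, Hashable, Iterable, Tuple

Point = Tuple[float, float]


def node_strength(edge_flows: Dict[Tuple[Hashable, Hashable], float]) -> Dict[Hashable, float]:
    """Sum of the flows on all links adjacent to each node."""
    strength: Dict[Hashable, float] = {}
    for (i, j), flow in edge_flows.items():
        strength[i] = strength.get(i, 0.0) + flow
        strength[j] = strength.get(j, 0.0) + flow
    return strength


def poi_count(node_xy: Dict[Hashable, Point],
              poi_xy: Iterable[Point],
              diameter_m: float = 200.0) -> Dict[Hashable, int]:
    """Number of POIs inside a circle of the given diameter centred on each node."""
    radius = diameter_m / 2.0
    pois = list(poi_xy)
    return {
        node: sum(1 for (px, py) in pois if math.hypot(px - x, py - y) <= radius)
        for node, (x, y) in node_xy.items()
    }


# Illustrative values only.
flows = {("a", "b"): 100.0, ("b", "c"): 50.0}
strength = node_strength(flows)
counts = poi_count({"a": (0.0, 0.0)}, [(50.0, 0.0), (0.0, 100.0), (101.0, 0.0)])
```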
4.3. Cascading Failure Model
The failure of one critical node or component in an urban road network may cause adjacent nodes or edges to fail, which can lead to a severe collapse of the network. The purpose of this section is to analyze the cascading failure characteristics of the weighted road network after a failure. The capacity of a node, defined as the maximum flow that an intersection in the URN can tolerate, often reflects its ability to collect and disperse traffic [40]. The node capacity is defined as follows:
In practice, a higher-capacity intersection or roadway segment can carry more traffic, so if a roadway is disrupted, more of the diverted vehicles will use the higher-capacity intersections or segments. The node connected to $v_j$ is defined as $v_k$, and the increase in traffic flow on the edge from $v_j$ to $v_k$ is as follows:
where $\text{Neighbor}_j$ is the set of neighboring nodes of node $j$. After a failure occurs, the traffic flow on each affected edge is updated accordingly; the edge $e_{jk}$ is removed from the road network when its updated flow exceeds its capacity, which triggers further cascading failure.
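The propagation mechanism can be illustrated with a simplified, node-level sketch. It rests on assumptions that are not taken from the paper: capacity is set to (1 + tolerance) times the initial load, the load of a failed node is shared among its surviving neighbors in proportion to their capacity, and a neighbor fails as soon as its load exceeds its capacity. The names `simulate_cascade` and `tolerance` are hypothetical.

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def simulate_cascade(adjacency: Graph,
                     load: Dict[Hashable, float],
                     tolerance: float,
                     attacked: Hashable) -> Set[Hashable]:
    """Return the set of failed nodes after attacking one node.

    Assumed rules (illustrative only):
      capacity[n] = (1 + tolerance) * initial load[n]
      a failed node's load is split among surviving neighbours by capacity share
      a neighbour fails when its load exceeds its capacity
    """
    capacity = {n: (1.0 + tolerance) * load[n] for n in adjacency}
    current = dict(load)
    failed = {attacked}
    queue = deque([attacked])
    while queue:
        j = queue.popleft()
        alive = [k for k in adjacency[j] if k not in failed]
        total_capacity = sum(capacity[k] for k in alive)
        if total_capacity > 0.0:
            for k in alive:
                # Higher-capacity neighbours absorb a larger share.
                current[k] += current[j] * capacity[k] / total_capacity
        current[j] = 0.0
        for k in alive:
            if current[k] > capacity[k]:
                failed.add(k)
                queue.append(k)
    return failed


# Toy network: a hub connected to two equally loaded intersections.
star = {"hub": {"x", "y"}, "x": {"hub"}, "y": {"hub"}}
loads = {"hub": 10.0, "x": 10.0, "y": 10.0}
fragile = simulate_cascade(star, loads, tolerance=0.2, attacked="hub")
robust = simulate_cascade(star, loads, tolerance=1.0, attacked="hub")
```

With a small tolerance the extra load pushed onto `x` and `y` exceeds their capacity and the whole toy network fails, whereas a larger tolerance confines the failure to the attacked node.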
4.4. Vulnerability Indicator
The measurement of network operational loss (the degree of network dysfunction after a node or link is disrupted) is a practical tool for analyzing the dynamic performance of URNs over time. Following a node or link disruption, the largest connected part of the network that remains operational is considered, since the isolated components of the network can only perform partially and independently after the disruption.
In actual URNs, when an intersection becomes congested or an emergency occurs, the connected network is divided into multiple sub-networks [40]. The sub-network containing the most nodes is known as the maximum giant component (MGC), and its relative size is calculated as follows:
$$MGC = \frac{N(t)}{N}$$
where $N(t)$ represents the number of nodes in the MGC under cascading failures and $N$ is the number of nodes in the MGC at the beginning.
Network efficiency is calculated as the average of the inverse of the shortest path distances between all pairs of nodes in the network. This provides an overall measure of efficiency between all pairs of nodes. The global efficiency decreases as the average distance between any two nodes in the network increases. The global efficiency is calculated as follows:
$$E = \frac{1}{N(N-1)} \sum_{i \neq j} \frac{1}{d_{ij}}$$
where $E$ is the network efficiency, $N$ is the total number of nodes in the network, and $d_{ij}$ is the shortest path length between node $i$ and node $j$.
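Both indicators can be evaluated with breadth-first search on a small example. The sketch below assumes an unweighted graph with hop counts as path lengths, keeps $N$ fixed at the number of nodes in the original network so that removed or unreachable nodes contribute zero, and uses hypothetical function names; a weighted version would replace the search with a shortest path algorithm on travel times.

```python
from collections import deque
from typing import Dict, FrozenSet, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def bfs_distances(adjacency: Graph, source: Hashable,
                  removed: FrozenSet[Hashable]) -> Dict[Hashable, int]:
    """Hop distances from source to every reachable node that has not failed."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in removed and w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def largest_component_size(adjacency: Graph, removed: FrozenSet[Hashable]) -> int:
    """Number of nodes in the largest connected component of the surviving graph."""
    seen: Set[Hashable] = set()
    best = 0
    for n in adjacency:
        if n in removed or n in seen:
            continue
        component = bfs_distances(adjacency, n, removed)
        seen.update(component)
        best = max(best, len(component))
    return best


def giant_component_ratio(adjacency: Graph,
                          removed: FrozenSet[Hashable] = frozenset()) -> float:
    """N(t) / N for a non-empty graph: largest component after vs. before failure."""
    return (largest_component_size(adjacency, removed)
            / largest_component_size(adjacency, frozenset()))


def global_efficiency(adjacency: Graph,
                      removed: FrozenSet[Hashable] = frozenset()) -> float:
    """Average inverse shortest path length over all ordered node pairs."""
    n_total = len(adjacency)
    if n_total < 2:
        return 0.0
    total = 0.0
    for n in adjacency:
        if n in removed:
            continue
        for m, d in bfs_distances(adjacency, n, removed).items():
            if m != n:
                total += 1.0 / d
    return total / (n_total * (n_total - 1))


# Toy network: a four-node path a - b - c - d, before and after node b fails.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
ratio_before = giant_component_ratio(path)
ratio_after = giant_component_ratio(path, frozenset({"b"}))
eff_before = global_efficiency(path)
eff_after = global_efficiency(path, frozenset({"b"}))
```

Removing `b` isolates `a` and leaves only the pair `c`, `d` connected, so both indicators drop, which is the behavior that the vulnerability curves track as a cascade unfolds.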
6. Conclusions
The measurement of vulnerability is a critical issue in the field of transportation networks. This paper analyzes the travel characteristics of urban residents on commuting days and rest days, and the relationship between OD orders, population, and POI, using GPS data from taxis. The paper proposes a new method for identifying critical nodes based on the geographic features and POI of URNs. Moreover, it utilizes three different indicators to evaluate the importance of nodes in traffic networks based on their structural and functional characteristics. Then, a method for constructing weighted URNs based on OD orders is proposed to compare and analyze the cascading failure vulnerability of the weighted road networks on commuting days and rest days. Meanwhile, four malicious attack methods (DMA, BHA, KHA, and IKHA) are proposed to imitate an emergency, using the main urban area of Harbin to demonstrate the feasibility and effectiveness of the model.
The improved K-shell (IKS) algorithm proposed in this paper takes into account the structural and traffic characteristics of the urban road network and introduces the concept of intersection importance. It classifies the importance of intersections more effectively than the traditional K-shell algorithm and makes up for the shortcomings of that algorithm in identifying the key intersections of the road network.
The results indicate that the weighted networks for both commuting days and rest days are highly vulnerable under IKHA. Consequently, road traffic management should differ between commuting days and rest days, and the critical segments and intersections identified should be protected. Traffic management departments can upgrade intersections, restrict traffic flow, adjust signal timing, increase road capacity, and take other appropriate traffic management measures at key intersections. For weekends and holidays, when traffic flows fluctuate greatly, different strategies for congestion prevention and evacuation can be formulated according to the identification results, in order to maximize the safety and efficiency of road traffic.
The presented approach has some limitations. First, the sample of this study is relatively small. This study estimates the traffic flow on roads from taxi trip data. However, real traffic flow is mixed, and the complexity of its composition and behavior means that it differs considerably from taxi-only flow. Thus, a prospective study with larger samples is warranted in the future. Second, this paper presents a novel method for the vulnerability analysis of road networks without proposing measures for reducing network vulnerability. Third, this study does not consider real failures such as natural disasters and traffic accidents, although it simulates failure scenarios through random attacks and deliberate attacks.
To further optimize the proposed method, more detailed issues need to be investigated in our future research, such as the vulnerability of networks under the mutual influence of multiple transportation systems and recovery schemes of transportation networks with constrained emergency resources.