1. Introduction
Extreme weather increases the variability of urban road traffic and poses a significant challenge for traffic prediction and management. In coastal areas, for example, tropical cyclones often bring strong winds and heavy rain, resulting in flooding, massive traffic delays, and accidents [
1]. Rush-hour commuting time in Shenzhen City, China, was reported to be 30%~60% longer during the typhoon season than on usual workdays (
http://www.sz.gov.cn/en_szgov/news/notices/content/post_8000824.html, accessed on 13 October 2023). During Hurricane Sandy in 2012, it took 132 h for traffic in New York City to return to normal [
2]. The frequency and intensity of extreme weather events such as tropical cyclones seem to be increasing due to climate change [
3]. Sustainable Development Goal 11 (SDG11) aims to make cities inclusive, safe, resilient, and sustainable [
4]. Predicting spatiotemporal patterns of urban road traffic accurately under extreme weather is critical to strengthening city safety and resilience [
5,
6].
While prior research has extensively examined infrastructure damage [
7] and economic impacts [
8,
9] from extreme weather, understanding how urban traffic systems dynamically respond remains a critical research gap. Meteorological factors create multifaceted transportation challenges: precipitation affects road surfaces by reducing friction coefficients and increasing braking distances, while intense rainfall significantly impairs visibility and driving conditions. Strong crosswinds present additional hazards to vehicle stability and control [
10]. Beyond direct effects, extreme weather can cause secondary disruptions through fallen trees, flooding, and infrastructure damage [
11], triggering complex ripple effects across transportation networks [
11]. These impacts are further compounded when navigation systems redirect traffic, potentially overloading alternative routes during peak periods. The intricate interplay between environmental conditions, infrastructure vulnerability, and traffic flow dynamics represents a significant and underexplored area in transportation research.
Data-driven methods are increasingly used to improve the performance of traffic prediction. Classical traffic forecasting models typically focus on extracting the temporal correlation of traffic flow. Recent studies have demonstrated the feasibility and superiority of deep learning in traffic prediction. Various algorithms based on Recurrent Neural Network (RNN) such as Long Short-Term Memory (LSTM) [
12], bidirectional LSTM [
13], sequence-to-sequence learning [
14,
15], Gated Recurrent Unit (GRU) [
16], and an attention mechanism [
17] are well-suited for capturing time dependence and widely used in traffic prediction tasks. Among them, GRU is particularly effective in terms of solution quality and inference speed [
16]. Researchers have also realized the importance of the strong spatial interaction of transportation networks. As a result, algorithms based on Convolutional Neural Network (CNN) such as CNN-LSTM [
18], Conv-LSTM [
19], and 3D CNN [
20] have been developed to extract spatiotemporal features from traffic information. These methods use Euclidean distance to measure the spatial correlation of raster data. However, real-world traffic data has a non-Euclidean, directed topological structure. Thus, the Graph Convolutional Network (GCN) has been introduced to extract spatial features from graph-structured data, and GCN-based deep learning algorithms are a promising direction for further improvement.
While deep learning has demonstrated significant capabilities in processing spatiotemporal data patterns from large datasets [
21,
22], current architectures still present notable limitations requiring further development. First, model architecture remains a critical factor—optimal structural design directly influences predictive performance when working with adequate training data [
23,
24]. This architectural optimization represents a persistent research challenge across various deep learning applications. Additionally, while traffic data sequences capture fundamental state characteristics, they often fail to incorporate all relevant influencing factors [
25]. These limitations highlight the need for improvement in neural network design and feature representation for traffic prediction tasks.
Studies have found that future traffic states are not only dependent on historical traffic information but also impacted by external factors, such as the natural environment [
26], surrounding infrastructure [
27], and especially weather conditions [
28] during extreme weather. Knowledge-driven information fusion offers a new way to predict urban traffic under extreme weather. According to the United Nations Office for Disaster Risk Reduction (UNDRR), disaster risk results from the complex interaction between development processes that generate conditions of exposure, vulnerability, and hazard (
https://www.preventionweb.net/understanding-disaster-risk/component-risk/disaster-risk, accessed on 7 October 2023). Urban roads are critical infrastructures exposed to the natural environment. Extensive research has modeled the spatial–temporal correlation of traffic flow itself, or has considered only a limited set of external factors [
29,
30,
31]. While previous research has advanced traffic prediction, the specific compound effects of multi-hazards and varying environmental conditions have not been thoroughly investigated, especially concerning extreme weather scenarios.
Addressing SDG11, there is a need for a traffic prediction model that combines disaster knowledge with spatiotemporal correlation to predict traffic status under extreme weather, which is more practical for building resilient cities.
Therefore, we developed a data-driven and knowledge-driven traffic prediction framework. Our work improves the structures of traffic prediction models, imitates the cognitive process of experts with respect to real-time changes in traffic, and optimizes the network to extract spatiotemporal correlation from high-dimensional massive data. Moreover, a new data fusion module is designed by integrating hazards and environment knowledge. Here, the two hazards considered are compound precipitation and wind, as they are most likely to occur in China, particularly in the southeast coastal areas during the summer, which is related to the frequent occurrence of tropical cyclones [
32]. The environment information includes social environment and natural environment. By identifying these potential changes in traffic flow early on, particularly for urban road systems that are often heavily impacted by extreme weather events, the framework can serve as an invaluable tool for both early warning and traffic management. The ability to anticipate traffic disruptions can help adapt traffic management strategies quickly and minimize the adverse effects of such events on urban transportation, such as delays, accidents, and increased travel times. Overall, our work has the potential to greatly enhance the resilience and responsiveness of urban road systems to extreme weather events.
The remainder of this paper is organized as follows.
Section 2 outlines the methodology employed in the study.
Section 3 describes the data and the experimental setup.
Section 4 presents the numerical results and discusses the model’s performance.
Section 5 concludes the paper.
2. Methodology
2.1. Framework
This study proposes a novel Knowledge-driven Attribute-Augmented Attention Spatio-Temporal Graph Convolutional Network (KA3STGCN) framework (
Figure 1) for urban traffic prediction under extreme weather.
We develop a physics-informed attribute-augmented unit that fundamentally advances beyond traditional feature concatenation approaches through its dynamic coupling mechanism. This unit uniquely integrates the following: (1) dynamic hazard attributes including wind speed and precipitation with adaptive weighting based on real-time intensity, (2) static environment attributes, including Points of Interest (POI) and Digital Elevation Model (DEM), representing social and natural infrastructure vulnerability, and (3) historical traffic states—all processed through a parallel architecture that preserves feature distinctiveness while enabling nonlinear interactions.
The output of the attribute-augmented unit is fed into the deep learning model to capture and predict the spatiotemporal pattern of urban road traffic. The model comprises three main components: GCN, GRU, and an attention mechanism. GCN is used to extract spatial dependence while accounting for hazard-modulated road vulnerabilities. GRU is well-suited for modeling the temporal dependence of road traffic. Compared with LSTM, GRU has fewer parameters and trains and converges faster, while maintaining comparable performance. Furthermore, the attention mechanism is employed to adjust the relative importance of different horizons. While these components process spatial and temporal dependencies sequentially, the attribute-augmented unit employs a parallel architecture. The hazard attributes and traffic data are processed independently before fusion, thereby preserving their distinct characteristics.
The definitions of variables in
Figure 1 are as follows:
Definition 1. Urban road network $G$. The urban roads are modeled as an unweighted network $G = (V, E)$. $V = \{v_1, v_2, \ldots, v_N\}$ is the set of $N$ roads; $E$ is the set of edges connecting different roads. The adjacency matrix $A \in \mathbb{R}^{N \times N}$ is used to illustrate the connectivity of $G$ and is composed of 0 and 1, where 1 means the corresponding roads are connected, and 0 otherwise.
Definition 2. Traffic state matrix $X \in \mathbb{R}^{N \times P}$, where $P$ is the number of time steps. $x_t^i$ denotes the traffic state on the $i$-th road at time $t$. The traffic states are usually described as the road speed, density, or traffic flow. Without loss of generality, traffic speed is used as an example of traffic information in experiments.
Definition 3. Hazard attribute matrix $H$. $H = \{H_1, H_2, \ldots, H_p\}$ is a collection of $p$ different dynamic hazard factors. For the $j$-th hazard attribute $H_j$, $H_j^i$ is the time series of the $j$-th hazard attribute of the $i$-th road. Here, the hazards include two meteorological factors: wind speed and precipitation.
Definition 4. Environment attribute matrix $S$. $S = \{S_1, S_2, \ldots, S_q\}$ is a collection of $q$ different static environment factors. For the $j$-th environment attribute $S_j$, $S_j^i$ is the $j$-th environment attribute of the $i$-th road. Here, the environment factors include POI and terrain.
The traffic prediction problem aims to learn a function $f$ that is able to predict $T$ future traffic states given the urban road network $G$, the historical traffic matrix $X$, the hazard attribute matrix $H$, and the environment attribute matrix $S$, as shown in Equation (1):
$$[X_{t+1}, \ldots, X_{t+T}] = f(G, X \mid H, S) \quad (1)$$
2.2. Attribute-Augmented Unit
The attribute-augmented unit serves as the core innovation of our framework, designed to effectively integrate disaster knowledge with traffic prediction through three novel technical contributions.
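At time $t$, if disaster information is not considered, the input $\tilde{X}_t$ can be expressed as the following:
$$\tilde{X}_t = X_t \quad (2)$$
The attribute-augmented unit joins the traffic speed matrix $X_t$ with the hazard attributes $H$ and the environment attributes $S$. Notably, we introduce a dynamic cumulative hazard window mechanism that captures both immediate and delayed disaster impacts. Specifically, for each hazard attribute $H_j$, we construct an extended time window $H_j^{t-m,t} = [H_j^{t-m}, H_j^{t-m+1}, \ldots, H_j^{t}]$ to model the temporal propagation of disaster effects, where $m$ represents the historical horizon. This design explicitly accounts for the lagged consequences of extreme weather events, such as gradual flooding after heavy rainfall; that is, the hazard attributes $H_j^{t-m,t}$ are picked for each submatrix $H_j$ when generating $\tilde{X}_t$. The environment attribute matrix $S$ is only calculated once and used repeatedly without introducing additional uncertainty. Finally, the complete attribute-augmented matrix $\tilde{X}_t$, including both time-variant hazard and time-invariant environment attributes as well as traffic speed at time $t$, is formed as the following:
$$\tilde{X}_t = [X_t, H_1^{t-m,t}, \ldots, H_p^{t-m,t}, S_1, \ldots, S_q] \quad (3)$$
Here, $\tilde{X}_t \in \mathbb{R}^{N \times (1 + p(m+1) + q)}$, where $N$ is the number of roads and $p$ and $q$ are the numbers of hazard and environment attributes.
Thus, Equation (1) can be transformed as the following, where $n$ is the length of the historical input sequence:
$$[X_{t+1}, \ldots, X_{t+T}] = f(G; (\tilde{X}_{t-n}, \ldots, \tilde{X}_{t-1}, \tilde{X}_t)) \quad (4)$$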
Our attribute-augmented unit is not a static feature concatenation, but rather a physics-informed dynamic coupling mechanism. This parallel design preserves the distinct characteristics of each feature type while allowing for learned interactions through subsequent network layers. The transformation in Equation (4) demonstrates how these augmented features are incorporated into the prediction framework.
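The assembly of the attribute-augmented matrix can be illustrated with a minimal sketch. It assumes the layout described above (traffic speed at time t, then an (m+1)-step window per hazard, then the static attributes); the function names `hazard_window` and `augment` and the toy values are hypothetical and are not taken from the authors' implementation.

```python
def hazard_window(series, t, m):
    """Return the m+1 values series[t-m..t] of one hazard for one road."""
    if t - m < 0:
        raise ValueError("not enough history for the requested window")
    return series[t - m : t + 1]


def augment(speed, hazards, environment, t, m):
    """Build one row per road: [speed_t, hazard windows..., static attributes...].

    speed:       N x P list of lists (traffic speed per road and time step)
    hazards:     list of p matrices, each N x P (e.g. wind speed, precipitation)
    environment: list of q vectors, each of length N (e.g. POI class, DEM class)
    """
    rows = []
    for i in range(len(speed)):
        row = [speed[i][t]]
        for hz in hazards:
            row.extend(hazard_window(hz[i], t, m))
        row.extend(env[i] for env in environment)
        rows.append(row)
    return rows


# Toy example (made-up numbers): 2 roads, 4 hourly steps, 2 hazards, 2 static attributes.
speed = [[50, 48, 40, 35], [60, 59, 55, 52]]
wind = [[2, 3, 8, 9], [1, 2, 7, 8]]
rain = [[0, 0, 5, 12], [0, 1, 4, 10]]
poi = [3, 7]
dem = [1, 2]

x_aug = augment(speed, [wind, rain], [poi, dem], t=3, m=2)
```

With p = 2 hazards, m = 2, and q = 2 static attributes, each road's row has 1 + p(m+1) + q = 9 entries. The static columns are simply repeated at every time step, which mirrors the statement that they are calculated once and reused.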
2.3. Models
A deep learning model is designed to capture spatiotemporal dependencies by combining the GCN, GRU, and attention mechanism.
2.3.1. Spatial Dependence Modeling
In this paper, spatial dependency is modeled by GCN. Graph convolution learns by convolving and encoding node information: it mainly aggregates neighboring nodes through the adjacency matrix, with parameters shared during the aggregation process [33]. Given an adjacency matrix $A$ and the augmented matrix $\tilde{X}$ of the road network $G$, the GCN model constructs a filter in the Fourier domain. The hidden layers in GCN can be represented as $Z^{(l)}$. The hazard and environment attributes are scaled, where $\tilde{X}$ includes hazard-modulated road vulnerabilities.
The propagation rule $f(A, \tilde{X})$ in the spatial domain-based GCN model is defined as follows [34]:
$$Z^{(l+1)} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} Z^{(l)} \theta^{(l)}\right) \quad (5)$$
where $Z^{(l+1)}$ is the output and $Z^{(0)} = \tilde{X}$. $\sigma(\cdot)$ is the nonlinear activation function. $\tilde{A} = A + I_N$ represents the adjacency matrix with added self-loops, and $I_N$ is the identity matrix. $\tilde{D}$ is the corresponding degree matrix, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. $\theta^{(l)}$ is the trainable weight matrix of the $l$-th layer. Equation (5) demonstrates how to normalize the graph into a regular network, obtain parameters and weights, and then return the output to the following layer.
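As an illustration of the propagation rule, the following sketch applies one graph-convolution step with symmetric normalization and self-loops to a three-road chain. It assumes ReLU as the nonlinearity and uses plain Python lists for readability; the helper names and toy inputs are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalized_adjacency(adj):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a 0/1 adjacency matrix."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One propagation step; ReLU is an assumed choice of activation."""
    a_hat = normalized_adjacency(adj)
    z = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Three roads connected in a chain 0-1-2, two input features, one output unit.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0], [1.0]]

a_hat = normalized_adjacency(adj)
out = gcn_layer(adj, features, weights)
```

Each output row mixes a road's own features with those of its direct neighbors, scaled by the degrees of both endpoints, which is why the normalized matrix stays symmetric.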
2.3.2. Temporal Dependence Modeling
The temporal dependency is modeled by GRU after the GCN cell. GRU is regarded as a simplification and improvement of LSTM [35,36]. GRU mitigates the vanishing and exploding gradient problems that arise when back-propagating through long sequences by using an update gate $u_t$ and a reset gate $r_t$ [37]. The “gate” here refers to an element-wise multiplication that selectively controls the flow of information.
At time $t$, the internal processes of a GRU cell are shown below.
Firstly, the update gate $u_t$ is calculated as follows:
$$u_t = \sigma\left(W_u [f(A, \tilde{X}_t), h_{t-1}] + b_u\right)$$
where $f(A, \tilde{X}_t)$ represents the graph convolution process and is defined in (5), and $\tilde{X}_t$ is the current input. $h_{t-1}$ is the hidden state from the previous time step. The activation function σ converts data into values in the range of 0~1 as the gating signal. $W_u$ and $b_u$ are the weight and bias. $u_t$ is used to limit how much old information is incorporated into the current state.
Secondly, the reset gate $r_t$ and the new state $c_t$ are calculated as follows:
$$r_t = \sigma\left(W_r [f(A, \tilde{X}_t), h_{t-1}] + b_r\right)$$
$$c_t = \tanh\left(W_c [f(A, \tilde{X}_t), (r_t \odot h_{t-1})] + b_c\right)$$
Here, $W_r$, $W_c$ and $b_r$, $b_c$ are the weights and biases. The activation function tanh converts data into values in the range of −1~1, which keeps the state bounded. The old state is added into the new state under the control of $r_t$.
Finally, the current output state $h_t$ is calculated based on the new state $c_t$ and the last output $h_{t-1}$, as follows:
$$h_t = u_t \odot h_{t-1} + (1 - u_t) \odot c_t$$
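The gate arithmetic can be traced with a deliberately reduced sketch: a single road, a scalar graph-convolved input, and a scalar hidden state. It assumes the standard GRU form in which the update gate blends the previous state with a tanh candidate; the parameter names in `params` are hypothetical, and a real model would use weight matrices instead of scalars.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(g_t, h_prev, params):
    """One GRU update for a single road with scalar input and hidden state.

    g_t:    graph-convolved input at time t
    h_prev: hidden state from the previous time step
    params: scalar weights (w_*x for the input, w_*h for the hidden state) and biases
    """
    u = sigmoid(params["w_ux"] * g_t + params["w_uh"] * h_prev + params["b_u"])
    r = sigmoid(params["w_rx"] * g_t + params["w_rh"] * h_prev + params["b_r"])
    c = math.tanh(params["w_cx"] * g_t + params["w_ch"] * (r * h_prev) + params["b_c"])
    h = u * h_prev + (1.0 - u) * c
    return h, (u, r, c)


# With all parameters at zero, both gates equal 0.5 and the candidate state is 0,
# so the new hidden state is half of the previous one.
zero = dict.fromkeys(
    ["w_ux", "w_uh", "b_u", "w_rx", "w_rh", "b_r", "w_cx", "w_ch", "b_c"], 0.0
)
h, gates = gru_step(g_t=0.3, h_prev=0.8, params=zero)
```

Because the new state is a convex combination of the previous state and a tanh output, a hidden state that starts inside (−1, 1) stays inside that interval, which is the bounding behavior the text attributes to tanh.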
2.3.3. Attention Mechanism
In the previous calculation process, a large number of feature maps of different channels were generated. However, the importance of the information transmitted via channels varies. So, the proposed model employs the attention mechanism to dynamically adjust the contribution of hazard features based on their predictive utility. Here, a multi-layer perceptron is added to the model after GCN and GRU [38].
Given a query $q$ and the hidden layer vectors $h = \{h_1, h_2, \ldots, h_n\}$, $n$ is the length of the time series. For each $h_i$, the probability $\alpha_i$ of selecting $h_i$ is as follows:
$$\alpha_i = \frac{\exp(e_i)}{\sum_{k=1}^{n} \exp(e_k)}$$
Here, $e_i = w_{(2)}(w_{(1)} h_i + b_{(1)}) + b_{(2)}$ is calculated based on the additive model of two hidden layers using linear transformation [39]. $w_{(1)}$ and $b_{(1)}$ are the weight and bias of the first hidden layer, and $w_{(2)}$ and $b_{(2)}$ are the weight and bias of the second hidden layer, respectively. The higher the information relevance with $q$, the higher the weight of $h_i$. The attention score $C$ was then determined using a weighted average, as follows:
$$C = \sum_{i=1}^{n} \alpha_i h_i$$
Finally, the fully connected layer is used to output the prediction results.
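A small sketch of the weighting step may help. It assumes the scores come from two stacked linear transformations with a scalar intermediate value (a simplification of the two hidden layers described above), followed by a softmax and a weighted average of the hidden states; the function name and toy numbers are hypothetical.

```python
import math


def attention_context(hidden_states, w1, b1, w2, b2):
    """Softmax attention over a sequence of hidden-state vectors.

    hidden_states: list of n vectors (one per time step)
    Scores are e_i = w2 * (w1 . h_i + b1) + b2, with w1 a vector and w2 a scalar.
    Returns the attention weights and the weighted-average context vector.
    """
    scores = [w2 * (sum(w * v for w, v in zip(w1, h_i)) + b1) + b2
              for h_i in hidden_states]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    alphas = [e / total for e in exps]
    dim = len(hidden_states[0])
    context = [sum(a * h_i[d] for a, h_i in zip(alphas, hidden_states))
               for d in range(dim)]
    return alphas, context


# Three time steps with two hidden units each (made-up values).
hidden = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.7]]
alphas, context = attention_context(hidden, w1=[1.0, 1.0], b1=0.0, w2=2.0, b2=0.0)
```

In this toy case the last time step has the largest score and therefore the largest weight, so the context vector is pulled toward it.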
2.3.4. Loss Function
In the training process, the goal is to minimize the error between the real traffic speed on the roads and the predicted value. Thus, the loss function is defined as follows:
$$loss = \lVert Y_t - \hat{Y}_t \rVert + \lambda L_{reg}$$
Here, $Y_t$ and $\hat{Y}_t$ are the real traffic speed and the predicted speed, respectively. $L_{reg}$ is the L2 regularization term to avoid overfitting, and $\lambda$ is a hyperparameter.
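The loss can be sketched in a few lines. This version assumes the error term is the L2 norm of the residuals and that the regularization term is the plain sum of squared weights; the exact scaling of the regularizer in the authors' TensorFlow code is not stated, so that part is an assumption, as is the function name.

```python
import math


def prediction_loss(y_true, y_pred, weights, lam=0.0015):
    """L2 norm of the prediction error plus lam times an L2 weight penalty.

    y_true, y_pred: flat lists of observed and predicted speeds
    weights:        flat list of trainable parameters
    lam:            regularization strength (0.0015 is the value used in this study)
    """
    error = math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)))
    l_reg = sum(w * w for w in weights)  # assumed form of the L2 term
    return error + lam * l_reg


# Residuals (3, 4) give an error norm of 5; weights (1, 2) give a penalty of 5.
loss_value = prediction_loss([3.0, 4.0], [0.0, 0.0], [1.0, 2.0], lam=0.5)
```

A larger `lam` shrinks the weights more aggressively at the cost of a looser fit to the training speeds.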
3. Data and Experiments
3.1. Data Description
In our study, the KA3STGCN model is applied to a real-world dataset in Shenzhen City. Shenzhen is a densely populated and economically developed coastal megacity in China with a massive and continuously growing transportation infrastructure (
Figure 2a). The city has an area of 1997.47 km², a permanent population of 13.44 million as of 2019 (https://www.sz.gov.cn/en_szgov/aboutsz/profile/content/post_11666623.html, accessed on 2 January 2025), a total road mileage of 8066.1 km, and a civilian car ownership of 3.53 million as of 2020 (http://tjnj.gdstats.gov.cn:8080/tjnj/2021/directory/15/html/15-11-0.htm, accessed on 2 January 2025). During summer, Shenzhen is vulnerable to frequent tropical cyclones from the western Pacific Ocean, which bring wind and rain. In 2018, the city experienced the most severe damage from tropical cyclones in the last decade, including Typhoon Mangkhut. The study period is from 1 April 2018 to 30 September 2018, which is the high-incidence period of tropical cyclones. Six categories of data were used in this study, as shown in
Table 1, and the multi-source data distributions are presented in
Figure 2b–d.
3.2. Data Preprocessing
To address the heterogeneity of multi-source data with varying formats and standards, we established a rigorous data consistency framework through the following steps: First, the urban road network was converted into an adjacency matrix to represent topological relationships. Second, 10 min traffic speed data were aggregated into hourly averages to align with the temporal resolution of meteorological data. For environmental feature extraction, we calculated elevation values averaged across each road segment from 30 m resolution DEM data and determined the dominant POI category per road through kernel density analysis of 14 POI classifications.
Spatial alignment was achieved through a multi-step process: we sampled points along each road polyline at 10 m intervals, computed the average Euclidean distance to neighboring meteorological grid centroids, and assigned grid-recorded wind speed and precipitation values using nearest-neighbor interpolation weighted by these distances. Temporal alignment incorporated an n-hour cumulative window for hazard attributes to capture delayed weather impacts, while maintaining static environment features for each road.
All features underwent Min-Max normalization to [0,1] ranges. The unit can weight hazard impacts based on real-time intensity and road vulnerability. Missing values were excluded to ensure data quality. The final attribute-augmented matrix combined these normalized features through Equation (3), where sliding hazard windows preserved temporal dependencies and distance-based weighting maintained spatial relationships—a significant advancement over simple concatenation approaches like ASTGCN. All normalized values were denormalized post-prediction for interpretation.
After pre-processing, 1054 road segments were available (two-way roads are regarded as two road segments). The size of the adjacency matrix was 1054 × 1054. The size of each environment factor was 1054 × 1. The size of each hazard factor was 1054 × 4392 (183 days, 4392 h).
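Two of the preprocessing steps, hourly aggregation of 10 min speeds and Min-Max scaling with later denormalization, can be sketched as follows. The function names and sample readings are hypothetical, and missing-value handling is left out for brevity.

```python
def hourly_mean(values_10min):
    """Average consecutive groups of six 10-min readings into hourly values."""
    if len(values_10min) % 6 != 0:
        raise ValueError("expected a whole number of hours")
    return [sum(values_10min[k:k + 6]) / 6 for k in range(0, len(values_10min), 6)]


def min_max_fit(values):
    """Return the (min, max) pair used for scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant feature cannot be min-max scaled")
    return lo, hi


def normalize(values, lo, hi):
    """Scale values to the [0, 1] range."""
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(values, lo, hi):
    """Map scaled values back to the original units."""
    return [v * (hi - lo) + lo for v in values]


# Two hours of made-up 10-min speeds for one road segment.
speeds_10min = [60, 58, 62, 60, 59, 61, 30, 32, 28, 31, 29, 30]
hourly = hourly_mean(speeds_10min)
lo, hi = min_max_fit(hourly)
scaled = normalize(hourly, lo, hi)
restored = denormalize(scaled, lo, hi)
```

Keeping the fitted minimum and maximum is what allows predictions to be converted back to speeds after inference.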
Furthermore, we verified the impact of disaster-related factors on urban road traffic (
Figure 3).
Figure 3a shows the changes in traffic speed of one road segment during Typhoon Mangkhut in 2018 compared with a normal sunny day. It is evident that extreme weather drastically reduced traffic speed.
Figure 3b depicts the traffic speeds of two road segments dominated by two classes of DEM (Class 1 is lower than Class 2 in altitude). Similar time-varying traffic patterns were observed in both sample groups; however, overall traffic speeds were lower in DEM Class 1 than in DEM Class 2.
Figure 3c shows the traffic speeds of two roads dominated by two types of urban POI. The traffic speed on roads around living services decreased around 7 a.m. and 6 p.m. The traffic speed on roads neighboring enterprises reached its minimum around 8 a.m. and was lower than that of the former roads most of the time.
3.3. Evaluation Metric
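Four metrics were used to evaluate the performance of the KA3STGCN model: Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), Accuracy, and Coefficient of Determination ($R^2$), which are defined as follows:
$$RMSE = \sqrt{\frac{1}{MN} \sum_{j=1}^{M} \sum_{i=1}^{N} \left(y_i^j - \hat{y}_i^j\right)^2}$$
$$MAPE = \frac{100\%}{MN} \sum_{j=1}^{M} \sum_{i=1}^{N} \left|\frac{y_i^j - \hat{y}_i^j}{y_i^j}\right|$$
$$Accuracy = 1 - \frac{\lVert Y - \hat{Y} \rVert_F}{\lVert Y \rVert_F}$$
$$R^2 = 1 - \frac{\sum_{j=1}^{M} \sum_{i=1}^{N} \left(y_i^j - \hat{y}_i^j\right)^2}{\sum_{j=1}^{M} \sum_{i=1}^{N} \left(y_i^j - \bar{Y}\right)^2}$$
where $M$ is the number of time samples; $N$ is the number of roads. $y_i^j$ and $\hat{y}_i^j$ are the observation and prediction of the $i$-th road in the $j$-th time sample. $Y$ and $\hat{Y}$ represent the set of $y_i^j$ and $\hat{y}_i^j$, respectively, and $\bar{Y}$ is the average of $Y$. $\lVert \cdot \rVert_F$ is the Frobenius norm.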
RMSE measures the average magnitude of prediction errors, penalizing larger deviations more heavily. In the context of extreme weather, where traffic speed fluctuations can be abrupt and severe, such as sudden drops due to rain or wind, RMSE is critical for quantifying the model’s ability to handle such anomalies. MAPE expresses errors as a percentage of actual values, making it intuitive for assessing relative accuracy. Under extreme weather, traffic speeds may drop to near-zero (e.g., road closures). MAPE helps evaluate whether the model’s relative errors remain acceptable even in such scenarios. Accuracy reflects the overall proportion of correctly predicted traffic states across the entire road network. For urban resilience planning, holistic accuracy is key. $R^2$ quantifies the proportion of variance in traffic speeds explained by the model. A high $R^2$ indicates that the model accounts for most variability induced by extreme weather. The chosen metrics collectively address the unique challenges of traffic forecasting under extreme weather: RMSE and MAPE quantify error magnitudes, Accuracy evaluates network-wide reliability, and $R^2$ validates the model’s explanatory power.
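For concreteness, the four metrics can be computed as in the sketch below. It assumes the usual definitions (Accuracy as one minus the ratio of Frobenius norms, MAPE averaged over all road and time entries) and hypothetical toy speeds; MAPE is undefined when an observed speed is exactly zero, so such entries would need separate handling.

```python
import math


def evaluate(y_true, y_pred):
    """Compute RMSE, MAPE (%), Accuracy, and R2 for M x N speed matrices.

    y_true, y_pred: lists of M rows (time samples), each with N road values.
    """
    t = [v for row in y_true for v in row]
    p = [v for row in y_pred for v in row]
    n = len(t)
    sq_err = sum((a - b) ** 2 for a, b in zip(t, p))
    rmse = math.sqrt(sq_err / n)
    mape = 100.0 * sum(abs(a - b) / abs(a) for a, b in zip(t, p)) / n
    # Frobenius norm of the error relative to the norm of the observations.
    accuracy = 1.0 - math.sqrt(sq_err) / math.sqrt(sum(a * a for a in t))
    mean_t = sum(t) / n
    r2 = 1.0 - sq_err / sum((a - mean_t) ** 2 for a in t)
    return {"RMSE": rmse, "MAPE": mape, "Accuracy": accuracy, "R2": r2}


# Two time samples on two roads (made-up speeds in km/h).
observed = [[10.0, 20.0], [30.0, 40.0]]
predicted = [[12.0, 18.0], [33.0, 36.0]]
scores = evaluate(observed, predicted)
```

In this example the squared errors sum to 33, so RMSE is the square root of 8.25, MAPE is 12.5%, and $R^2$ is 1 − 33/500.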
3.4. Parameter Settings
In the model training, some parameters were determined based on the experience of existing studies [
41]: the optimizer was Adaptive Moment Estimation (Adam) [
42]; the learning rate was 0.001; the batch size was 64; λ in loss function was 0.0015; and the proportion of the training set was 0.8. Other parameters were searched by experiment, as shown in
Figure 4.
(1) Learning horizons. Considering the cumulative effects of hazards on traffic, we expanded the time window size when constructing the attribute-augmented unit. Figure 4a shows the model performance for the learning horizons of {1, 2, 3, 4, 5}. The model performed best when considering the last 3 h. The model performance was still robust when the learning horizon was 4 or 5, while the time cost increased with more hidden parameters.
(2) Predicting horizons. Figure 4b shows the model performance when the predicting horizon was {1, 2, 3}. Short-term prediction was better than long-term prediction, which was consistent with the expectation that a longer predicting horizon carries greater uncertainty.
(3) Training epochs. Figure 4c shows the model performance when the number of training epochs was {500, 1000, 1500, 2000, 3000, 3500, 4000}. As the number of training epochs increased, the evaluation metrics tended to stabilize, with a turning point of 3000.
(4) Hidden units. Figure 4d shows the model performance when the number of units in the hidden layer was {8, 16, 32, 64, 100}. With a turning point of 64, the evaluation metrics tended to stabilize as the number of hidden units rose. With 128 hidden units, memory overflowed due to too many parameters.
In conclusion, we identified the optimal configuration for model training: a learning horizon of 3, a predicting horizon of 1, 3000 training epochs, and 64 hidden layer units.
4. Results and Discussion
4.1. The Performance of KA3STGCN Model
The experimental results demonstrate that our KA3STGCN framework achieves robust performance in predicting urban traffic under extreme weather. Implemented using TensorFlow, the model converges after 3000 training epochs; its RMSE, MAPE, Accuracy, and $R^2$ values are reported alongside the baselines in Table 2. These quantitative measures indicate the model’s capability to effectively capture the complex spatiotemporal patterns of urban traffic during disaster events.
We also conducted a time-aware 5-fold cross-validation considering the spatiotemporal dependencies. The dataset is sequentially partitioned into five temporal blocks, ensuring each fold maintains continuous time segments. During validation, we preserved temporal order by using earlier folds for training and subsequent folds for testing. Each fold retained the complete urban road network topology in all splits and included all traffic patterns (peak/off-peak, weekdays/weekends). We reported both temporal metrics (time-wise RMSE) and spatial metrics (node-level RMSE) across folds, with final performance calculated as the average of all out-of-fold predictions, demonstrating consistent Accuracy of 0.79 ± 0.02 on Shenzhen data.
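One way to realize such a time-ordered validation is forward chaining over contiguous blocks, sketched below. This is an assumed reading of the procedure: with five blocks it yields four train/test pairs, since the first block can only serve as training data. The function names are hypothetical.

```python
def temporal_blocks(n_steps, n_blocks=5):
    """Split indices 0..n_steps-1 into contiguous blocks of near-equal size."""
    base, extra = divmod(n_steps, n_blocks)
    blocks, start = [], 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def forward_chaining_splits(n_steps, n_blocks=5):
    """Yield (train, test) index ranges where training always precedes testing."""
    blocks = temporal_blocks(n_steps, n_blocks)
    for k in range(1, n_blocks):
        train = range(blocks[0].start, blocks[k].start)
        yield train, blocks[k]


# 4392 hourly samples, as in the Shenzhen dataset.
splits = list(forward_chaining_splits(4392))
```

Only the time axis is partitioned, so every split keeps the full road network, in line with the description above.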
Figure 5 presents a comparative analysis between observed and predicted traffic speeds for two representative roads in Shenzhen over the whole test set (
Figure 5a,b) and during Typhoon Mangkhut (
Figure 5c,d). The model effectively captures fundamental traffic periodicity and trend patterns across different temporal scales. However, performance variations emerge between Road 1 (
RMSE = 8.15, MAPE = 22.3%) and Road 2
(RMSE = 6.42, MAPE = 18.1%) during peak typhoon conditions. This divergence primarily stems from the following: (1) differential implementation of emergency traffic controls that affected Road 1 more severely; (2) inherent variations in infrastructure vulnerability between the two roads; and (3) current limitations in modeling compound disaster effects beyond core meteorological factors.
The observed performance differences between test cases highlight important considerations for practical deployment. Most notably, they emphasize the need to incorporate additional data sources—particularly real-time information about traffic control policies and infrastructure conditions—to further enhance prediction accuracy during complex disaster scenarios. We discuss these implementation challenges and potential solutions in greater depth in
Section 4.6.
From an operational perspective, our model demonstrates measurable improvements for urban traffic management systems. These technical improvements translate to several concrete benefits: (1) enhanced decision-making for traffic managers through fewer false alarms when issuing congestion warnings; (2) more precise rerouting recommendations during floods and strong winds, particularly for high-risk areas like DEM Class 1 roads; and (3) optimized resource allocation for post-disaster cleanup operations based on improved traffic resumption predictions.
The framework’s superior performance yields broader societal and economic impacts. By reducing prediction errors, the model helps mitigate indirect costs including fuel waste and productivity loss through dynamic signal timing and preemptive lane closures. These capabilities are particularly valuable for coastal cities like Shenzhen that frequently experience tropical cyclones. Furthermore, the system supports SDG11 by minimizing disruptions to critical infrastructure access routes, including roads serving hospitals and other essential services. The combination of improved accuracy and operational applicability positions our approach as a valuable tool for enhancing urban resilience against extreme weather events.
4.2. Model Performance Comparison Results
We contrasted the proposed KA3STGCN model with the following baselines: Historical Average method (HA), Autoregressive Integrated Moving Average model (ARIMA), Support Vector Regression model (SVR), eXtreme Gradient Boosting (XGBoost) [
43], Temporal Graph Convolution Network model (TGCN) [
44], Attention Spatial-Temporal Graph Convolutional Network (ASTGCN) [
39], Attribute-Augmented Spatial-Temporal Graph Convolutional Network (A2STGCN) [
41], physics-informed neural networks [
45], and Bayesian GCN [
46]. The hyperparameters in the above baselines were kept consistent with KA3STGCN.
Table 2 shows that our KA3STGCN model performed best among all the models tested.
Comparative analysis demonstrates significant performance differences across model categories. The traditional time series models (HA, ARIMA) exhibit limited predictive capability (Accuracy = 0.60~0.63) owing to their static linear assumptions, which prove inadequate for modeling the non-stationary traffic patterns characteristic of extreme weather events. While shallow learning-based algorithms (SVR, XGBoost) show improved performance (Accuracy = 0.62~0.74) through engineered temporal features, their failure to account for spatial dependencies results in suboptimal performance during network-wide disruptions.
The evaluation reveals that deep learning architectures consistently outperform other approaches. Baseline models including TGCN, ASTGCN, and A2STGCN achieve accuracy levels exceeding 0.7. These models are ablated variants of the KA3STGCN model in terms of the attention mechanism and external disaster information fusion, and their performance was slightly inferior. Through systematic evaluation of four graph-based architectures, we observe progressive performance improvements that highlight the importance of different architectural components for extreme weather traffic prediction. The baseline TGCN (GCN + GRU) achieves an RMSE of 7.90 and MAPE of 23.66%, demonstrating the fundamental capability of spatiotemporal modeling but showing limitations in handling sudden weather-induced traffic variations. The ASTGCN (GCN + GRU + attention) model lowers RMSE to 7.39, although its MAPE rises slightly to 24.46%, with the attention mechanism proving particularly effective for prioritizing critical temporal segments during weather events (6.79% RMSE improvement over TGCN). However, its performance degrades during prolonged extreme conditions due to insufficient incorporation of environmental context. A2STGCN (GCN + GRU + attribute-augmented unit) shows different strengths, achieving 8.92 RMSE but superior attribute-specific performance (12.96% better MAPE than ASTGCN for DEM Class 1 roads). This suggests that while attribute augmentation improves physical interpretability, the lack of attention mechanisms limits its ability to dynamically adjust to rapidly changing conditions.
Our KA3STGCN (GCN + GRU + attention + attribute-augmented unit) combines the strengths of both approaches and performs best on all four evaluation metrics. The synergistic combination yields three key advantages: (1) the attention mechanism dynamically weights important temporal segments during extreme events; (2) the attribute-augmented unit provides physics-informed feature representation; and (3) their joint operation enables adaptive focus on both temporal criticality and spatial vulnerability. This is particularly evident during Typhoon Mangkhut (
Figure 5c,d), where KA3STGCN maintains stable performance while other models show significant error spikes during peak wind/rain periods.
In addition, we tested physics-informed neural networks and Bayesian GCN because they share key characteristics with our approach, particularly their ability to dynamically weight features based on real-time conditions. The physics-informed neural networks outperformed most methods but still lagged behind KA3STGCN. The incorporation of physical laws likely improved their robustness, but they lacked the spatiotemporal attention and attribute-augmented features of KA3STGCN, which are critical for capturing complex dependencies in the data. The Bayesian GCN implementation demonstrates robust performance (RMSE = 7.15, MAPE = 21.3%) by effectively quantifying prediction uncertainty during extreme weather events. However, our KA3STGCN requires 18% less computational resources. This advantage stems from our physics-informed attribute processing, which provides more direct modeling of disaster dynamics compared to the purely data-driven uncertainty estimation in Bayesian approaches.
The comparative results demonstrate that while individual components provide partial improvements, their integrated implementation in KA3STGCN yields nonlinear performance gains for extreme weather prediction. This suggests that effective disaster-aware traffic modeling requires both dynamic temporal weighting and physics-informed feature representation working in concert.
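The two ingredients named here, attribute augmentation and temporal attention, can be illustrated with a small NumPy sketch. It is a simplified illustration under stated assumptions, not the KA3STGCN implementation: the function names, the concatenation layout, the single scoring vector used for attention, and all array shapes are hypothetical.

```python
import numpy as np

def augment_features(speed, static_attrs, dynamic_attrs):
    """Concatenate traffic speeds with disaster-related attributes per road.

    speed:         (N, T) speeds for N roads over a window of T steps
    static_attrs:  (N, p) static environment attributes (e.g., POI, DEM)
    dynamic_attrs: (N, T, w) dynamic hazard attributes (e.g., wind, rain)
    returns:       (N, T + p + T * w) augmented feature matrix
    """
    n_roads = speed.shape[0]
    dyn_flat = dynamic_attrs.reshape(n_roads, -1)
    return np.concatenate([speed, static_attrs, dyn_flat], axis=1)

def temporal_attention(hidden, score_vector):
    """Softmax-weighted sum of hidden states over time.

    hidden:       (T, N, H) hidden states, e.g., from a recurrent unit
    score_vector: (H,) hypothetical learned scoring vector
    returns:      context (N, H) and attention weights (T, N)
    """
    scores = hidden @ score_vector                       # (T, N)
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=0, keepdims=True)     # softmax over time
    context = (alpha[..., None] * hidden).sum(axis=0)    # (N, H)
    return context, alpha

rng = np.random.default_rng(0)
speed = rng.uniform(10, 60, size=(5, 12))                # 5 roads, 12 steps
static = rng.integers(0, 3, size=(5, 2)).astype(float)   # POI and DEM classes
dynamic = rng.uniform(0, 10, size=(5, 12, 2))            # wind and rain
augmented = augment_features(speed, static, dynamic)

hidden = rng.normal(size=(12, 5, 8))
context, alpha = temporal_attention(hidden, rng.normal(size=8))
```

In this sketch the augmented matrix would feed the graph convolution, while the attention weights `alpha` indicate which time steps contribute most to each road's prediction.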
4.3. Significant Variables and Interpretations
The ablation experiment aimed to assess the impact of different disaster information and their combinations. Table 3 presents the model performances of 16 cases, including no external information (none), hazards (wind, rain), environments (POI, DEM), and their combinations. The average results of five repeated experiments demonstrated that traffic prediction under extreme weather could benefit from the combination of hazard and environment information.
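The 16 cases correspond to every subset of the four disaster-related variables (2^4 = 16, including the empty set). A minimal sketch of how such an ablation grid can be enumerated is given below; the variable labels and the function name are illustrative assumptions, and the ordering need not match Table 3.

```python
from itertools import combinations

# Labels for the four disaster-related variables (illustrative).
VARIABLES = ("POI", "DEM", "wind", "rain")

def ablation_cases(variables=VARIABLES):
    """Return every subset of the variables, from none to all four."""
    cases = []
    for k in range(len(variables) + 1):
        cases.extend(combinations(variables, k))
    return cases

cases = ablation_cases()
labels = ["none" if not case else "+".join(case) for case in cases]
```

Each label would then be trained and evaluated several times (five in this study) and the metrics averaged per case.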
As shown in Table 3, adding a single disaster-related variable (urban POI, DEM, wind, or rain) worsened most of the evaluation metrics. A possible explanation is that a single feature is informative for only a small portion of the data; given the complexity of the model and the limited training data, the feature is ineffective on the whole dataset and reduces generalization on the test set. A single disaster-related variable therefore has minimal effect on urban road traffic prediction and may even degrade performance because of its sparse utility. This finding highlights the need to explore the compounding effects of various complex factors on urban road traffic changes.
Considering the impact of two variables, the combination of static environment attributes (POI + DEM) improved the model prediction precision (Accuracy and ), while the combination of dynamic hazard attributes (wind + rain) reduced the model error (RMSE and MAPE). For wind, the rain–wind, DEM–wind, and POI–wind combinations were all better than wind alone. For rain, the wind–rain combination had a favorable effect, followed by POI–rain. However, the DEM–rain combination was the worst, indicating a synergistic inhibitory effect. Coupling wind and rain hazards can improve the accuracy of urban road traffic prediction, especially in China, where tropical cyclones making landfall in coastal areas typically bring wind and rain together [47]. In addition, we found that adding environment information on top of the wind–rain coupling further improved prediction accuracy. With the POI–DEM–wind–rain combination, the model outperformed all previous combinations of variables. The model’s ability to capture the fluctuations in urban road traffic speed was enhanced after considering all the disaster information. The mechanisms by which compounding external factors influence road traffic during disasters deserve further study.
4.4. Robustness Analyses
The proposed KA3STGCN model aims to predict road traffic response to natural hazards during disasters. According to the key components of disaster risk, strong winds and heavy rains induced by tropical cyclones are two main hazards that affect urban roads. Urban roads are critical infrastructures exposed to the natural environment, and their technical grade is closely related to their hazard vulnerability. Here, the model sensitivity is analyzed on different hazard intensities and road vulnerabilities.
Figure 6 shows that the average RMSE across different road grades, precipitation intensities, and wind intensities was mostly between 3 and 8, indicating a robust overall performance. For different hazard intensities, we matched the average RMSE of all roads in the test set with the meteorological data, including hourly precipitation and wind speed. Then, we calculated the average RMSE for four wind speed ranges: 0–3.4 m/s, 3.4–8 m/s, 8–13.9 m/s, and 13.9–17.2 m/s, and for four hourly precipitation ranges: 0–2.5 mm/h (light rain), 2.5–8 mm/h (moderate rain), 8–16 mm/h (heavy rain), and 16–50 mm/h (rainstorm).
Figure 6a,b reveals that the model’s error increased slightly as the precipitation or wind speed (below 13.9 m/s) increased. The RMSE at wind speeds of 13.9–17.2 m/s is lower than that at 3.4–13.9 m/s. One possible reason is that data at the highest wind speeds are sparse, so the average for this range is affected by the small sample and other factors.
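The binning step described above can be sketched as follows. This is a minimal illustration under assumptions, not the analysis code of this study: the function name is hypothetical, intervals are treated as left-closed and right-open (the text does not state how boundary values are assigned), and the toy RMSE and wind values are invented.

```python
import numpy as np

WIND_EDGES = [0.0, 3.4, 8.0, 13.9, 17.2]   # m/s, from the ranges in the text
RAIN_EDGES = [0.0, 2.5, 8.0, 16.0, 50.0]   # mm/h, from the ranges in the text

def mean_rmse_by_bin(hourly_rmse, intensity, edges):
    """Average the hourly RMSE within each intensity interval.

    Intervals are [edges[i], edges[i + 1]); a value equal to the last edge
    is placed in the final interval. Values outside the edges are ignored,
    and empty intervals yield NaN.
    """
    hourly_rmse = np.asarray(hourly_rmse, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    idx = np.digitize(intensity, edges) - 1
    idx[intensity == edges[-1]] = len(edges) - 2
    means = []
    for b in range(len(edges) - 1):
        selected = idx == b
        if selected.any():
            means.append(float(hourly_rmse[selected].mean()))
        else:
            means.append(float("nan"))
    return means

# Hypothetical hourly values: network-average RMSE and matching wind speed.
toy_rmse = [4.0, 6.0, 5.0, 7.0, 3.0]
toy_wind = [1.0, 2.0, 5.0, 10.0, 15.0]
wind_means = mean_rmse_by_bin(toy_rmse, toy_wind, WIND_EDGES)
```

Reporting the number of hours in each interval alongside the mean would make it easier to judge how much sparse samples influence the highest-intensity ranges.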
For different road vulnerabilities, we classified DiDi roads into five grades by matching them with the open-source OpenStreetMap (OSM), as shown in Table 4. We calculated the RMSE of each road over all hours of the test set and then determined the average RMSE of each road grade.
Figure 6c shows that the difference in RMSE among the five urban road grades was not significant, demonstrating that the proposed KA3STGCN model is robust across different urban road vulnerabilities.
4.5. Spatiotemporal Differences
Figure 7 shows the KA3STGCN model performance at different hours of the day under different wind or rain intensities. The red projection shows that the RMSE outliers were mainly observed during rush hours (5–9 a.m. and 2–6 p.m.). The extreme RMSE values appeared in the morning rush hour, which could be explained by the concentration of extreme weather in that period, as shown in the yellow projection. In summer, daytime temperatures in coastal cities such as Shenzhen can be very high. When the air near the ground absorbs enough heat from the earth’s surface, its temperature increases, its density decreases, and it rises. When this warm, humid air carrying a large amount of water vapor reaches a certain height, it cools, especially at night, and the water vapor condenses into ice crystals or water drops, making thunderstorms and strong winds likely from midnight to morning. The coincidence of extreme weather with the morning peak brings more uncertainty to traffic changes.
Figure 8 shows the prediction RMSE of the KA3STGCN (ours) and ASTGCN (without disaster knowledge) models on different roads. The results indicate that KA3STGCN performed better than ASTGCN, with most roads having RMSE values below 10. The roads with RMSE over 10 in the ASTGCN model are long-distance roads that pass through varied external environments; these errors are mitigated in the KA3STGCN model. This comparison illustrates the importance of integrating disaster information into traffic prediction under extreme weather, especially for accurate urban disaster management.
4.6. Generalizability and Limitations
While the KA3STGCN framework demonstrates strong performance in Shenzhen and is designed with generalizability in mind—its architecture does not rely on region-specific assumptions—several limitations warrant discussion regarding its broader applicability. First, the current validation is limited to Shenzhen, China, due to data availability constraints. The model requires high-resolution meteorological data, road segment-level speed measurements, and detailed natural and social environment attributes (e.g., DEM, POI), which are rarely available in consistent formats across different regions. These requirements pose significant challenges for implementation in areas with less comprehensive monitoring infrastructure, particularly during extreme weather events when traditional monitoring systems may fail.
The integration of multi-source data (traffic, weather, and urban infrastructure) introduces additional challenges of data sparsity and heterogeneity, mirroring common problems in smart city applications where data gaps during emergencies remain persistent [48,49]. Future implementations could benefit from advanced data imputation techniques and multi-sensor fusion methods to enhance robustness under incomplete data conditions. Furthermore, practical deployment faces computational constraints that may limit real-time applications, especially for large-scale urban networks. The model’s reliance on high-resolution spatiotemporal data leads to significant computational demands, potentially causing latency issues in emergency response scenarios.
Despite these limitations, our framework provides a replicable blueprint for disaster-aware traffic prediction. The modular design allows adjustments for local data conditions, for instance by substituting missing hazard variables with proxy indicators or leveraging coarser-resolution inputs when necessary. While we employed optimization strategies like model quantization in our experiments, further work is needed to develop lightweight versions suitable for edge computing implementations. These technical limitations, common to many data-intensive urban analytics systems [50,51], highlight the need for continued research into efficient computation methods without sacrificing prediction accuracy.
Our future work will prioritize multi-city validation as compatible datasets emerge, with a focus on standardizing data requirements for global applicability. Moreover, we will incorporate more recent models such as hybrid knowledge-infused frameworks, Granger causality graphs [52], or uncertainty-aware Bayesian frameworks [53] to better capture uncertainties during extreme events. These directions align with recent efforts to bridge data gaps in smart city research, ensuring the model’s potential for broader adoption while maintaining robustness under extreme weather scenarios. Addressing these data and computational challenges will also be crucial for the framework’s adoption across diverse urban contexts with varying technological capabilities.
5. Conclusions
This paper presents KA3STGCN, a novel framework that advances urban traffic prediction under extreme weather through three key contributions. First, we developed a disaster-knowledge attribute augmentation method that combines dynamic hazard data with static infrastructure vulnerability data. This integration helps the model capture the complex interplay between weather extremes and road network resilience. Second, our hybrid architecture departs from conventional spatiotemporal models by simultaneously processing spatial, temporal, and hazard dimensions through dedicated GCN, GRU, and attention mechanisms. Third, the framework demonstrates overall robustness in prediction accuracy across varying hazard intensities, addressing a critical limitation of existing approaches that often fail during extreme events.
Future research could further integrate traffic flow theory and more accurate information into the prediction model. We will pursue multi-city validation as future work when additional datasets become available. We also expect the model to benefit from more accurate weather forecasts and from more samples covering morning peak hours and extreme weather. The framework could be incorporated into traffic management systems to offer system-level real-time routing services. Based on weather forecasts and the proposed model, our method can predict the traffic speed of urban roads in advance, especially under extreme weather, to provide better decision support for individual drivers’ travel planning and government agencies’ disaster preparedness. The proposed model has strong potential for coping with extreme weather events and improving transportation resilience at the urban scale under climate change.