Hierarchical Spatial-Temporal Neural Network with Attention Mechanism for Traffic Flow Forecasting
Abstract
1. Introduction
- (1) We propose the Spatial-Temporal Attention Block (STA-Block) structure, designed to capture local spatial-temporal correlations. A temporal attention layer captures temporal correlations, while a spatial attention layer captures local spatial correlations within a one-hop radius. Both layers are built on a multi-headed self-attention mechanism.
- (2) We stack STA-Blocks in a hierarchical structure to extract and integrate spatial-temporal correlations step by step. The hierarchy enlarges the receptive field of local multi-headed self-attention, thereby achieving a level of performance equivalent to full attention.
- (3) We evaluate the predictive performance of the proposed model on two real-world traffic datasets. The model shows a significant improvement in traffic forecasting accuracy over the baseline models. We also carry out ablation studies to assess how each component of the model contributes to its overall performance.
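
The receptive-field argument in the second contribution can be illustrated with a small sketch. It assumes each layer lets a node attend only to itself and its one-hop neighbours, and it tracks which nodes can influence a given node after several stacked layers. The function name `receptive_field` and the five-road chain graph are hypothetical and only serve the illustration; this is not the authors' implementation.

```python
def receptive_field(adjacency, node, num_layers):
    """Return the set of nodes that can influence `node` after stacking
    `num_layers` layers, where each layer mixes a node only with itself
    and its one-hop neighbours."""
    reached = {node}
    for _ in range(num_layers):
        expanded = set(reached)
        for i in reached:
            for j, connected in enumerate(adjacency[i]):
                if connected:
                    expanded.add(j)
        reached = expanded
    return reached


# Hypothetical chain of five roads: 0 - 1 - 2 - 3 - 4
chain = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

for depth in range(1, 5):
    print(depth, sorted(receptive_field(chain, 0, depth)))
```

On this chain, one layer reaches only the immediate neighbour of road 0, while four stacked layers reach every road. In general, the depth needed to cover the whole graph depends on how far apart its nodes are.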
2. Related Works
3. Methodology
3.1. Problem Formulation
3.2. Overview of Model Architecture
3.3. Input Embedding Layer
3.4. STA-Blocks
3.4.1. Temporal Attention Layer
3.4.2. Spatial Attention Layer
3.5. Hierarchical Structure
3.6. Output Layer
3.7. Algorithm Description
Algorithm 1. HSTAN Training Algorithm
Input: the historical traffic flow sequence after input embedding; the number of training epochs, epochs; the number of heads in the temporal attention mechanism; the number of heads in the spatial attention mechanism; the number of layers with residuals, L; the loss function, loss
Output: the traffic flow prediction value
For each training epoch do
    For each layer do
        For each temporal attention head do
            # Temporal Attention
            Initialize
            ComputerAttention(, , )
        End For
        For each spatial attention head do
            # Spatial Attention
            Initialize
            ComputerCorrelation()
            Softmax()
        End For
    End For
End For
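
The arguments of the attention steps in Algorithm 1 did not survive in this copy, so the following is only a sketch of what such a step could look like. It assumes the standard single-head scaled dot-product attention of Vaswani et al. [30], with an optional mask that restricts each position to an allowed set (for example, one-hop neighbours plus the node itself). Learned projections, multiple heads, and residual connections are omitted, and the names `softmax`, `attention`, `features`, and `one_hop` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values, mask=None):
    """Single-head scaled dot-product attention.

    queries, keys, values: lists of float vectors (one per position).
    mask[i][j] truthy means position i may attend to position j.
    mask=None means full attention. Each row of the mask is assumed
    to allow at least one position (for example, a self-loop).
    """
    dim = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        allowed = [j for j in range(len(keys)) if mask is None or mask[i][j]]
        scores = [
            sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(dim)
            for j in allowed
        ]
        weights = softmax(scores)
        outputs.append([
            sum(w * values[j][d] for w, j in zip(weights, allowed))
            for d in range(len(values[0]))
        ])
    return outputs


# Hypothetical 3-node example: nodes 0 and 2 are not neighbours.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
one_hop = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]  # adjacency with self-loops

local = attention(features, features, features, mask=one_hop)
full = attention(features, features, features)
print(local[0])  # mixes nodes 0 and 1 only
print(full[0])   # mixes all three nodes
```

With the mask, the output for node 0 is a weighted average of nodes 0 and 1 only. Without it, node 2 also contributes, which is the difference between local and full attention.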
4. Experiments
4.1. Datasets
- SZ-taxi: This dataset contains Shenzhen taxi trajectories from 1 January to 31 January 2015. The study area covers 156 major roads in Luohu District. The experimental data consist of two parts. The first is a 156 × 156 adjacency matrix describing the spatial relationships between roads, in which each row represents a road and each entry indicates the connectivity between two roads. The second is a feature matrix describing how speed changes over time, in which each row represents a road and each column holds the traffic speed on that road at one time interval. The traffic speed on each road is calculated every 15 min.
- Los-loop: This dataset was collected in real time by loop detectors on motorways in Los Angeles County. It covers 207 sensors, with traffic speeds recorded from 1 March 2012 to 7 March 2012 and aggregated every 5 min.
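
The date ranges and sampling intervals above determine how many observations each road or sensor contributes. The sketch below works this out, assuming every interval in the stated range is present, and then shows one common way to cut a series into history/target pairs for forecasting. The helper names and the window lengths in the usage lines are hypothetical; the experiment settings are not reproduced here.

```python
def steps_per_series(num_days, interval_minutes):
    """Number of observations per road or sensor, assuming no gaps."""
    return num_days * 24 * 60 // interval_minutes


def sliding_windows(series, history_len, horizon_len):
    """Split one series into (history, target) pairs of consecutive steps."""
    samples = []
    last_start = len(series) - history_len - horizon_len
    for start in range(last_start + 1):
        split = start + history_len
        samples.append((series[start:split], series[split:split + horizon_len]))
    return samples


# 1-31 January 2015 at 15-minute resolution
sz_taxi_steps = steps_per_series(num_days=31, interval_minutes=15)
# 1-7 March 2012 at 5-minute resolution
los_loop_steps = steps_per_series(num_days=7, interval_minutes=5)
print(sz_taxi_steps, los_loop_steps)

# Hypothetical window lengths on a toy series of ten values
toy_series = list(range(10))
pairs = sliding_windows(toy_series, history_len=4, horizon_len=2)
print(len(pairs), pairs[0], pairs[-1])
```

Under the no-gap assumption, the SZ-taxi feature matrix would have 156 rows and 2976 columns, and Los-loop would have 2016 time steps for each of its 207 sensors.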
4.2. Benchmarks
- ARIMA: Autoregressive Integrated Moving Average, a well-known statistical model for analyzing time series data and predicting future values.
- T-GCN [3]: Temporal Graph Convolutional Network, which combines GCN with GRU for the extraction of spatial-temporal correlations for traffic forecasting.
- STGCN [14]: Spatio-Temporal Graph Convolutional Network, which uses graph convolution and one-dimensional convolution to capture spatial-temporal dependencies.
- AGCRN [18]: Adaptive Graph Convolutional Recurrent Network, which uses adaptive graphs and a combination of recurrent networks to capture spatial-temporal correlations.
- A3TGCN [4]: Attention Temporal Graph Convolutional Network, which builds on T-GCN by adding an attention mechanism to capture the global dynamics of spatial-temporal correlations.
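
Several of the baselines above rely on graph convolution over the road network. As a rough illustration of that building block, the sketch below performs one propagation step using the widely used symmetric normalization with self-loops, D^{-1/2}(A + I)D^{-1/2}. Whether each baseline uses exactly this form is not stated here, and learned weights and nonlinearities are left out. The function names and the two-road toy graph are hypothetical.

```python
import math


def normalized_adjacency(adjacency):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square 0/1 adjacency matrix
    whose diagonal is zero (self-loops are added here)."""
    n = len(adjacency)
    with_loops = [
        [adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
        for i in range(n)
    ]
    degree = [sum(row) for row in with_loops]
    return [
        [with_loops[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
        for i in range(n)
    ]


def propagate(norm_adj, features):
    """One parameter-free propagation step: multiply the normalized
    adjacency by the node feature matrix."""
    n = len(norm_adj)
    width = len(features[0])
    return [
        [sum(norm_adj[i][k] * features[k][d] for k in range(n)) for d in range(width)]
        for i in range(n)
    ]


# Hypothetical toy graph: two connected roads with one speed feature each
toy_adjacency = [[0, 1], [1, 0]]
toy_speeds = [[2.0], [4.0]]
smoothed = propagate(normalized_adjacency(toy_adjacency), toy_speeds)
print(smoothed)
```

Each road's feature becomes a weighted mix of its own value and its neighbour's, which is how spatial information spreads along the graph in these models.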
4.3. Experiment Settings
4.4. Experiment Result Analysis
4.5. Visualized Analysis
4.6. Effect of Spatial-Temporal Attention
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Veres, M.; Moussa, M. Deep Learning for Intelligent Transportation Systems: A Survey of Emerging Trends. IEEE Trans. Intell. Transp. Syst. 2020, 21, 3152–3168.
- Jiang, W.; Luo, J. Graph Neural Network for Traffic Forecasting: A Survey. Expert Syst. Appl. 2022, 207, 117921.
- Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Li, H. T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction. IEEE Trans. Intell. Transp. Syst. 2020, 21, 3848–3858.
- Bai, J.; Zhu, J.; Song, Y.; Zhao, L.; Hou, Z.; Du, R.; Li, H. A3T-GCN: Attention Temporal Graph Convolutional Network for Traffic Forecasting. ISPRS Int. J. Geo-Inf. 2021, 10, 485.
- Xu, M.; Dai, W.; Liu, C.; Gao, X.; Lin, W.; Qi, G.J.; Xiong, H. Spatial-Temporal Transformer Networks for Traffic Flow Forecasting. arXiv 2020, arXiv:2001.02908.
- Feng, A.; Tassiulas, L. Adaptive Graph Spatial-Temporal Transformer Network for Traffic Flow Forecasting. In Proceedings of the 31st ACM International Conference on Information and Knowledge Management (CIKM’22), New York, NY, USA, 17–21 October 2022; pp. 3933–3937.
- Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to Sequence Learning with Neural Networks. arXiv 2014, arXiv:1409.3215.
- Zhao, Z.; Chen, W.; Wu, X.; Chen, P.C.Y.; Liu, J. LSTM network: A deep learning approach for short-term traffic forecast. IET Intell. Transp. Syst. 2017, 11, 68–75.
- Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. Part C Emerg. Technol. 2015, 54, 187–197.
- Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU neural network methods for traffic flow prediction. In Proceedings of the 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; pp. 324–328.
- Tran, Q.H.; Fang, Y.-M.; Chou, T.-Y.; Hoang, T.-V.; Wang, C.-T.; Vu, V.T.; Ho, T.L.H.; Le, Q.; Chen, M.-H. Short-term traffic speed forecasting model for a parallel multi-lane arterial road using GPS-monitored data based on deep learning approach. Sustainability 2022, 14, 6351.
- Yang, D.; Chen, K.; Yang, M.; Zhao, X. Urban rail transit passenger flow forecast based on LSTM with enhanced long-term features. IET Intell. Transp. Syst. 2019, 13, 1475–1482.
- Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In Proceedings of the 6th International Conference on Learning Representations (ICLR’18), Vancouver, BC, Canada, 30 April–3 May 2018; pp. 1–16.
- Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI’18), Stockholm, Sweden, 13–19 July 2018; pp. 3634–3640.
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Zhang, C. Graph wavenet for deep spatial-temporal graph modeling. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI’19), Macao, China, 10–16 August 2019; pp. 1907–1913.
- Feng, N.; Guo, S.N.; Song, C.; Zhu, Q.C.; Wan, H.Y. Multi-component spatial-temporal graph convolution networks for traffic flow forecasting. Ruan Jian Xue Bao/J. Softw. 2019, 30, 759–769.
- Song, C.; Lin, Y.; Guo, S.; Wan, H. Spatial-temporal synchronous graph convolutional networks: A new framework for spatial-temporal network data forecasting. Proc. AAAI Conf. Artif. Intell. 2020, 34, 914–921.
- Bai, L.; Yao, L.; Li, C.; Wang, X.; Wang, C. Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting. arXiv 2020, arXiv:2007.02842.
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Chang, X.; Zhang, C. Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020; pp. 753–763.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Liò, P.; Bengio, Y. Graph Attention Networks. arXiv 2017, arXiv:1710.10903.
- Wu, T.; Feng, C.; Yun, W. Graph Attention LSTM Network: A New Model for Traffic Flow Forecasting. In Proceedings of the 5th International Conference on Information Science and Control Engineering (ICISCE’18), Zhengzhou, China, 20–22 July 2018.
- Shih, S.Y.; Sun, F.K.; Lee, H.Y. Temporal Pattern Attention for Multivariate Time Series Forecasting. arXiv 2019, arXiv:1809.04206.
- Fang, X.; Huang, J.; Wang, F.; Zeng, L.; Wang, H. ConSTGAT: Contextual Spatial-Temporal Graph Attention Network for Travel Time Estimation at Baidu Maps. In Proceedings of the 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’20), Virtual Event, 6–10 July 2020.
- Liu, C.H.; Piao, C.; Ma, X.; Yuan, Y.; Leung, K.K. Modeling citywide crowd flows using attentive convolutional LSTM. In Proceedings of the IEEE 37th International Conference on Data Engineering (ICDE’21), Chania, Greece, 19–22 April 2021; pp. 217–228.
- Guo, S.; Lin, Y.; Feng, N.; Song, C.; Wan, H. Attention Based Spatial-Temporal Graph Convolutional Networks for Traffic Flow Forecasting. Proc. AAAI Conf. Artif. Intell. 2019, 33, 922–929.
- Luo, X.; Zhu, C.; Zhang, D.; Li, Q. Dynamic Graph Convolution Network with Spatio-Temporal Attention Fusion for Traffic Flow Prediction. arXiv 2023, arXiv:2302.12598.
- Zheng, C.; Fan, X.; Wang, C.; Qi, J. GMAN: A Graph Multi-Attention Network for Traffic Prediction. Proc. AAAI Conf. Artif. Intell. 2020, 34, 1234–1241.
- Guo, S.; Lin, Y.; Wan, H.; Li, X.; Cong, G. Learning Dynamics and Heterogeneity of Spatial-Temporal Graph Data for Traffic Forecasting. IEEE Trans. Knowl. Data Eng. 2021, 34, 5415–5428.
- Yan, H.; Ma, X.; Pu, Z. Learning Dynamic and Hierarchical Traffic Spatiotemporal Features With Transformer. IEEE Trans. Intell. Transp. Syst. 2021, 23, 22386–22399.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. arXiv 2017, arXiv:1706.03762.
Data | Model | 15 min MAE | 15 min RMSE | 15 min MAPE | 30 min MAE | 30 min RMSE | 30 min MAPE | 45 min MAE | 45 min RMSE | 45 min MAPE | 60 min MAE | 60 min RMSE | 60 min MAPE
---|---|---|---|---|---|---|---|---|---|---|---|---|---
SZ-Taxi | ARIMA | 4.98 | 7.24 | -- | 4.67 | 6.79 | -- | 4.67 | 6.78 | -- | 4.75 | 6.77 | --
SZ-Taxi | TGCN | 3.57 | 4.86 | 34.65 | 3.62 | 4.92 | 35.16 | 3.65 | 4.96 | 35.46 | 3.69 | 5.01 | 35.90
SZ-Taxi | STGCN | 3.16 | 4.46 | 28.70 | 3.21 | 4.53 | 28.99 | 3.24 | 4.55 | 29.22 | 3.27 | 4.60 | 29.35
SZ-Taxi | AGCRN | 3.15 | 4.44 | 28.74 | 3.20 | 4.51 | 29.09 | 3.24 | 4.55 | 29.31 | 3.27 | 4.59 | 29.17
SZ-Taxi | A3TGCN | 2.83 | 4.24 | 26.82 | 2.89 | 4.27 | 26.98 | 2.88 | 4.28 | 27.14 | 2.93 | 4.26 | 27.31
SZ-Taxi | HSTAN (Ours) | 2.68 | 4.17 | 25.37 | 2.71 | 4.21 | 25.45 | 2.79 | 4.29 | 24.38 | 2.84 | 4.38 | 25.89
Los-loop | ARIMA | 7.68 | 10.04 | -- | 7.69 | 9.34 | -- | 7.69 | 10.05 | -- | 7.70 | 9.87 | --
Los-loop | TGCN | 3.13 | 5.09 | 8.67 | 3.66 | 5.99 | 10.21 | 4.17 | 6.68 | 11.39 | 4.23 | 7.09 | 12.10
Los-loop | STGCN | 2.73 | 5.06 | 7.18 | 3.13 | 5.96 | 8.85 | 3.32 | 6.46 | 9.38 | 3.49 | 6.87 | 9.94
Los-loop | AGCRN | 2.75 | 5.17 | 7.30 | 3.12 | 6.01 | 8.66 | 3.33 | 6.49 | 9.37 | 3.49 | 6.82 | 9.95
Los-loop | A3TGCN | 3.12 | 5.55 | 8.55 | 3.65 | 6.57 | 10.46 | 4.06 | 7.30 | 12.13 | 4.46 | 7.95 | 13.78
Los-loop | HSTAN (Ours) | 2.65 | 4.96 | 6.86 | 3.03 | 5.86 | 8.25 | 3.28 | 6.40 | 9.18 | 3.49 | 6.83 | 9.94
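
The three error measures reported in the comparison table can be computed as in the following sketch, which uses their usual textbook definitions. Two choices here are assumptions rather than details taken from the paper: MAPE is expressed as a percentage, and observations with a true value of zero are skipped because the relative error is undefined there. The sample values are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: pairs whose true value is zero are skipped, since the
    relative error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when every true value is zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Made-up speeds for illustration only
true_speeds = [10.0, 20.0]
forecast_speeds = [8.0, 24.0]
print(mae(true_speeds, forecast_speeds))
print(rmse(true_speeds, forecast_speeds))
print(mape(true_speeds, forecast_speeds))
```

RMSE penalizes large individual errors more heavily than MAE, which is why a model can lead on one measure and trail on the other at the same horizon.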
Data | Model | MAE 15 min | MAE 30 min | MAE 45 min | MAE 60 min | MAE Avg. | RMSE 15 min | RMSE 30 min | RMSE 45 min | RMSE 60 min | RMSE Avg.
---|---|---|---|---|---|---|---|---|---|---|---
SZ-Taxi | HSTAN w/o TA | 4.80 | 4.82 | 4.84 | 4.88 | 4.83 | 6.24 | 6.29 | 6.60 | 6.36 | 6.37
SZ-Taxi | HSTAN w/o SA | 3.57 | 3.65 | 3.83 | 3.95 | 3.75 | 4.77 | 4.89 | 5.17 | 5.60 | 5.10
SZ-Taxi | HSTAN | 2.68 | 2.71 | 2.79 | 2.84 | 2.75 | 4.17 | 4.21 | 4.29 | 4.38 | 4.26
Los-loop | HSTAN w/o TA | 7.64 | 7.72 | 7.78 | 7.93 | 7.76 | 10.71 | 10.86 | 10.88 | 11.01 | 10.86
Los-loop | HSTAN w/o SA | 3.03 | 3.87 | 5.36 | 5.12 | 4.34 | 5.47 | 6.93 | 8.67 | 8.81 | 7.47
Los-loop | HSTAN | 2.65 | 3.03 | 3.28 | 3.49 | 3.11 | 4.96 | 5.86 | 6.40 | 6.83 | 6.01
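
To make the ablation results easier to compare, the sketch below recomputes the average MAE over the four horizons from the per-horizon values in the table and expresses each ablated variant's degradation relative to the full model. It reads "w/o TA" and "w/o SA" as the model without temporal attention and without spatial attention, which is an interpretation of the labels. The helper names are hypothetical.

```python
# MAE per horizon (15, 30, 45, 60 min), copied from the ablation table.
ablation_mae = {
    "SZ-Taxi": {
        "HSTAN w/o TA": [4.80, 4.82, 4.84, 4.88],
        "HSTAN w/o SA": [3.57, 3.65, 3.83, 3.95],
        "HSTAN": [2.68, 2.71, 2.79, 2.84],
    },
    "Los-loop": {
        "HSTAN w/o TA": [7.64, 7.72, 7.78, 7.93],
        "HSTAN w/o SA": [3.03, 3.87, 5.36, 5.12],
        "HSTAN": [2.65, 3.03, 3.28, 3.49],
    },
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def relative_increase(variant, reference):
    """Percentage by which the variant's mean error exceeds the reference's."""
    return 100.0 * (mean(variant) - mean(reference)) / mean(reference)


for dataset, rows in ablation_mae.items():
    full_model = rows["HSTAN"]
    for name, values in rows.items():
        if name != "HSTAN":
            increase = relative_increase(values, full_model)
            print(f"{dataset} {name}: +{increase:.1f}% average MAE")
```

On both datasets, removing the temporal attention layer raises the average MAE considerably more than removing the spatial attention layer.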
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lian, Q.; Sun, W.; Dong, W. Hierarchical Spatial-Temporal Neural Network with Attention Mechanism for Traffic Flow Forecasting. Appl. Sci. 2023, 13, 9729. https://doi.org/10.3390/app13179729