Multi-View Multi-Attention Graph Neural Network for Traffic Flow Forecasting
Abstract
1. Introduction
- The AMGC-AT model proposed in this paper is grounded in transport domain knowledge and strikes a better balance between the complexity of the model framework and the comprehensive exploitation of that knowledge.
- The AMGC-AT model is built on three types of traffic flow pattern (recent, daily and weekly) and combines two kinds of self-attention mechanism to mine the high-order spatiotemporal information of subway ridership (a slicing sketch of the three views follows this list). A suitably configured output layer allows the features learned by the network to be fully expressed.
- In this study, we conduct extensive comparative and ablation experiments, which show that the AMGC-AT model outperforms the eight baseline models at every time interval and at randomly chosen stations. The ablation experiments confirm the validity and rationality of each component of the model framework.
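For a concrete picture of the three input views mentioned above, the following is a minimal sketch of how recent, daily and weekly segments could be sliced from the 10-min inflow matrix described in Section 2. The segment lengths, the assumption of 105 ten-minute intervals per service day (06:00–23:30), and all function names are illustrative, not the authors' implementation.

```python
import numpy as np

def split_periodic_inputs(flows, t, recent_len=6, daily_len=6, weekly_len=6,
                          steps_per_day=105, days_per_week=7):
    """Slice three input views ending at (exclusive) time index t from a
    (num_intervals, num_stations) flow matrix.

    flows         : historical inflow counts, one row per 10-min interval
    recent_len    : number of immediately preceding intervals
    daily_len     : intervals ending at the same clock time one day earlier
    weekly_len    : intervals ending at the same clock time one week earlier
    steps_per_day : 10-min intervals per service day (06:00-23:30 -> 105)
    """
    recent = flows[t - recent_len:t]                                 # recent view
    daily = flows[t - steps_per_day - daily_len:t - steps_per_day]   # daily view
    weekly = flows[t - days_per_week * steps_per_day - weekly_len:
                   t - days_per_week * steps_per_day]                # weekly view
    return recent, daily, weekly

# toy usage: 3 weeks of synthetic data for 81 stations
flows = np.random.poisson(30, size=(21 * 105, 81))
recent, daily, weekly = split_periodic_inputs(flows, t=20 * 105 + 50)
print(recent.shape, daily.shape, weekly.shape)   # (6, 81) (6, 81) (6, 81)
```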
2. Data
2.1. Dataset Description
2.2. Dataset Preprocessing
2.3. Problem Definition
3. Methodology
3.1. Overview of the Proposed Model
3.2. Pre-Defined Affinity Graph Representation
3.2.1. Neighborhood Graph
3.2.2. SimRank Graph
3.2.3. Functional Similarity Graph
3.2.4. Cosine Similarity Graph
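The subsection bodies are not reproduced here; as a hypothetical illustration only, the sketch below shows one common way a cosine-similarity affinity matrix can be derived from historical station flow profiles. The sparsification rule, the 0.9 cut-off and all names are assumptions and may differ from the construction used in the paper.

```python
import numpy as np

def cosine_similarity_graph(flows, threshold=0.9):
    """Build a station-by-station affinity matrix from historical flow profiles.

    flows     : (num_intervals, num_stations) historical inflow counts
    threshold : similarities below this value are zeroed to keep the graph sparse
    """
    profiles = flows.T.astype(float)                        # one flow profile per station
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = profiles / np.clip(norms, 1e-8, None)            # L2-normalised profiles
    sim = unit @ unit.T                                     # pairwise cosine similarity
    adj = np.where(sim >= threshold, sim, 0.0)              # sparsify with a threshold
    np.fill_diagonal(adj, 0.0)                              # no self-loops
    return adj

flows = np.random.poisson(30, size=(105 * 25, 81))          # 25 days, 81 stations
A_cos = cosine_similarity_graph(flows)
print(A_cos.shape)                                          # (81, 81)
```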
3.3. Prediction Network Module
3.3.1. Multi-Graph Convolution Module
3.3.2. Spatial and Temporal Self-Attention Module
3.3.3. Output Layer
4. Experiments
4.1. Experimental Settings
4.2. Baselines
4.3. Results and Analyses
4.3.1. Overall Comparison
- The traditional HA and ARIMA models performed the worst in both the short and the long term, because they capture only limited temporal correlations and ignore indispensable influences such as the cyclical effect of residents' daily travel patterns on subway traffic. They also discard the spatial and topological information of the subway network.
- LSTM, TCN and GCN perform better than the traditional models: the first two capture richer temporal correlations, while GCN captures spatial correlations. However, the performance of LSTM drops markedly for long-term forecasts. In most cases, composite deep-learning architectures such as AGCRN, ST-GCN and MTGNN yield more favorable results than single models, mainly because they extract spatial and temporal features simultaneously.
- Notably, our self-attention-based multi-graph approach extracts joint spatiotemporal features better than AGCRN, ST-GCN and MTGNN. Compared with ST-GCN, our model shows a marked increase in prediction accuracy, because AMGC-AT employs two self-attention mechanisms whereas ST-GCN has none. AGCRN and MTGNN differ from our model in several respects. First, AGCRN and MTGNN rely on their own adaptive graph-learning modules to learn spatial correlations, whereas AMGC-AT relies on four graph structures built from domain knowledge, which makes the learning of spatial correlations more direct and effective. Second, MTGNN uses mix-hop propagation on the graph, whereas ST-GCN, AGCRN and AMGC-AT all use spectral convolution, which is better suited to node prediction on a metro network of this size. AGCRN also uses an RNN to extract temporal features, which harms robustness under severe sample fluctuations and gradient explosion, while AMGC-AT uses a more efficient gated convolution to extract and output temporal features. Finally, among these state-of-the-art models, only ours incorporates the two self-attention mechanisms that are integral to AMGC-AT's performance (a hedged sketch of the two attention operations follows this list). The effectiveness of each component of AMGC-AT is analyzed and validated in the ablation experiments in Section 4.4.
- In terms of RMSE, the improvements over the best baseline model at the three time intervals were 16.74%, 14.47% and 7.12%, respectively. The corresponding MAE improvements were 5.54%, 3.71% and 5.93%, and the WMAPE improvements were 12.52%, 5.20% and 8.53%.
- Figure 6 shows the distribution of RMSE for passenger inflow at 10-min granularity for the AMGC-AT and ST-GCN models. Yellow indicates a relatively small error and red a relatively large one. It is clear that the proposed AMGC-AT model captures the variations of morning and evening peak passenger flow more accurately, thereby reducing the prediction error.
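To make the two attention mechanisms referred to above concrete, the sketch below applies standard scaled dot-product self-attention once along the temporal axis and once along the spatial (station) axis of a feature tensor. The tensor layout, dimensions and the shared projection weights are simplifying assumptions for illustration, not the AMGC-AT implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over the second-to-last axis of x."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
n_stations, n_steps, d = 81, 6, 16
x = rng.normal(size=(n_stations, n_steps, d))     # features from the graph convolutions

# shared projections purely to keep the sketch short; a real model would use
# separate weights for the spatial and temporal branches
wq, wk, wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

# temporal self-attention: attend across time steps, independently per station
temporal_out = self_attention(x, wq, wk, wv)      # (81, 6, 16)

# spatial self-attention: transpose so attention runs across stations per time step
x_s = np.swapaxes(x, 0, 1)                        # (6, 81, 16)
spatial_out = np.swapaxes(self_attention(x_s, wq, wk, wv), 0, 1)  # back to (81, 6, 16)
print(temporal_out.shape, spatial_out.shape)
```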
4.3.2. Results Analysis
- Station 4: Jianglin Road Station on Line 1, located in the Binjiang district of Hangzhou, is surrounded by hospitals and schools and lies less than 500 m from the Binjiang district government.
- Station 18: Jiuhe Road Station on Line 1 is situated on the upper side of Hangzhou. It is currently surrounded by farmhouses on the urban-rural fringe, and the surrounding areas are expected to see further commercial and residential development.
- Station 46: Qianjiang Road Station on Line 2 has a degree of 4 in the metro network, is situated in the upper city of Hangzhou, and serves as a transfer station between Lines 2 and 4. It is surrounded by many commercial and residential areas.
- Station 76: Citizen Center Station on Line 4 has a degree of 2 in the metro network and is located in the upper city of Hangzhou. The station has eleven entrances and exits and is surrounded by large complexes such as the Hangzhou Grand Theater, the Hangzhou International Convention Center, and the Civic Center.
4.4. Ablation Study
- AMGC-S: This variant eliminates the temporal self-attention mechanism and uses representations learned from spatial graphs directly to infer traffic flow.
- AMGC-T: This variant eliminates the spatial self-attention mechanism, while the other components remain the same; the spatial features used to infer traffic flow are taken directly from the multi-graph convolution.
- AGC-AT: This variant eliminates the multi-graph convolution component, so the predefined graph representations built from prior knowledge are not used for feature extraction; the other components remain unchanged.
- AMG-AT: This variant removes the shared-parameter part of the multi-graph convolution component, i.e., it assumes that the predefined graph representations are not sufficiently correlated. The remaining components, including the spatiotemporal attention component and the causal convolution output component, stay the same.
- AGC1-AT: This variant removes the graph constructed from station-based POI information (the functional similarity graph) from the multi-graph convolution component, leaving the rest of the component unchanged.
- AGC2-AT: This variant removes the graph constructed from station traffic-volume similarity (the cosine similarity graph) from the multi-graph convolution component, leaving the rest of the component unchanged.
- AGC3-AT: This variant removes the SimRank-based graph from the multi-graph convolution component, leaving the rest of the component unchanged.
- AMGC-AT1: This variant eliminates the division of traffic flow into adjacent (recent), short-term (daily) and long-term (weekly) views in the spatial and temporal self-attention component; the traffic flow is instead fed directly into that component, and the rest remains the same.
- AMGC-AT2: This variant eliminates the short- and long-term views in the spatial and temporal self-attention component; only the adjacent (recent) traffic flows are fed into it, and the rest remains the same.
- AMGC-AT3: This variant eliminates the adjacent (recent) view in the spatial and temporal self-attention component and inputs only the short- and long-term traffic flows, leaving the remainder unchanged.
- AMGC-AT: The complete model presented in this paper. (A hedged sketch of these variants as configuration toggles follows this list.)
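As an illustration of how the variants relate to the full model, the configuration sketch below expresses each ablation as a toggle. The class and field names are hypothetical and do not come from the authors' code; they only mirror the component names used above.

```python
from dataclasses import dataclass

@dataclass
class AblationConfig:
    """Hypothetical switches reproducing the ablation variants in Section 4.4."""
    spatial_attention: bool = True          # off in AMGC-T
    temporal_attention: bool = True         # off in AMGC-S
    multi_graph_convolution: bool = True    # off in AGC-AT
    shared_graph_parameters: bool = True    # off in AMG-AT
    use_functional_graph: bool = True       # off in AGC1-AT (POI-based graph)
    use_cosine_graph: bool = True           # off in AGC2-AT (flow-similarity graph)
    use_simrank_graph: bool = True          # off in AGC3-AT
    input_views: tuple = ("recent", "daily", "weekly")  # trimmed in AMGC-AT1/2/3

AMGC_S   = AblationConfig(temporal_attention=False)
AMGC_T   = AblationConfig(spatial_attention=False)
AGC_AT   = AblationConfig(multi_graph_convolution=False)
AMGC_AT2 = AblationConfig(input_views=("recent",))
AMGC_AT3 = AblationConfig(input_views=("daily", "weekly"))
```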
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Smith, B.L.; Demetsky, M.J. Traffic flow forecasting: Comparison of modeling approaches. J. Transp. Eng. 1997, 123, 261–266.
- Shekhar, S.; Williams, B.M. Adaptive seasonal time series models for forecasting short-term traffic flow. Transp. Res. Rec. J. Transp. Res. Board 2007, 2024, 116–125.
- Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. Part C Emerg. Technol. 2015, 54, 187–197.
- Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU neural network methods for traffic flow prediction. In Proceedings of the 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC 2016), Wuhan, China, 11–13 November 2016; pp. 324–328.
- Jiang, W.; Zhang, L. Geospatial data to images: A deep-learning framework for traffic forecasting. Tsinghua Sci. Technol. 2019, 24, 52–64.
- Ma, X.; Dai, Z.; He, Z.; Ma, J.; Wang, Y.; Wang, Y. Learning traffic as images: A deep convolutional neural network for large-scale transportation network speed prediction. Sensors 2017, 17, 818.
- Zhao, Z.; Chen, W.; Wu, X.; Chen, P.C.Y.; Liu, J. LSTM network: A deep learning approach for short-term traffic forecast. IET Intell. Transp. Syst. 2017, 11, 68–75.
- Jiang, W.; Luo, J. Graph neural network for traffic forecasting: A survey. Expert Syst. Appl. 2022, 207, 117921.
- Liu, Z.; Tan, H. Traffic prediction with graph neural network: A survey. CICTP 2021, 2021, 467–474.
- Scarselli, F.; Gori, M.; Tsoi, A.C.; Hagenbuchner, M.; Monfardini, G. The graph neural network model. IEEE Trans. Neural Netw. 2009, 20, 61–80.
- Defferrard, M.; Bresson, X.; Vandergheynst, P. Convolutional neural networks on graphs with fast localized spectral filtering. arXiv 2017, arXiv:1606.09375.
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2017, arXiv:1609.02907.
- Atwood, J.; Towsley, D. Diffusion-convolutional neural networks. In Advances in Neural Information Processing Systems 29 (NIPS 2016). Available online: https://papers.nips.cc/paper/2016/hash/390e982518a50e280d8e2b535462ec1f-Abstract.html (accessed on 12 December 2022).
- Gilmer, J.; Schoenholz, S.S.; Riley, P.F.; Vinyals, O.; Dahl, G.E. Message passing neural networks. In Machine Learning Meets Quantum Physics; Springer: Berlin/Heidelberg, Germany, 2020; pp. 199–214.
- Hamilton, W.L.; Ying, R.; Leskovec, J. Inductive representation learning on large graphs. arXiv 2018, arXiv:1706.02216.
- Mattos, J.P.; Marcacini, R.M. Semi-supervised graph attention networks for event representation learning. In Proceedings of the 2021 IEEE International Conference on Data Mining (ICDM), Auckland, New Zealand, 7–10 December 2021.
- Yu, B.; Yin, H.; Zhu, Z. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Stockholm, Sweden, 13–19 July 2018.
- Zhang, J.; Chen, F.; Guo, Y.; Li, X. Multi-graph convolutional network for short-term passenger flow forecasting in urban rail transit. IET Intell. Transp. Syst. 2020, 14, 1210–1217.
- Zhao, L.; Song, Y.; Zhang, C.; Liu, Y.; Wang, P.; Lin, T.; Deng, M.; Li, H. T-GCN: A temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 2019, 21, 3848–3858.
- Li, Y.; Yu, R.; Shahabi, C.; Liu, Y. Diffusion convolutional recurrent neural network: Data-driven traffic forecasting. In Proceedings of the 6th International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018.
- Wu, Z.; Pan, S.; Long, G.; Jiang, J.; Chang, X.; Zhang, C. Connecting the dots: Multivariate time series forecasting with graph neural networks. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual Event, 6–10 July 2020.
- Geng, X.; Li, Y.; Wang, L.; Zhang, L.; Yang, Q.; Ye, J.; Liu, Y. Spatiotemporal multi-graph convolution network for ride-hailing demand forecasting. Proc. AAAI Conf. Artif. Intell. 2019, 33, 3656–3663.
- Li, D.; Lasenby, J. Spatiotemporal attention-based graph convolution network for segment-level traffic prediction. IEEE Trans. Intell. Transp. Syst. 2022, 23, 8337–8345.
- Qiao, S.; Li, T.; Li, H.; Zhu, Y.; Peng, J.; Qiu, J. SimRank: A page rank approach based on similarity measure. In Proceedings of the 2010 IEEE International Conference on Intelligent Systems and Knowledge Engineering, Hangzhou, China, 15–16 November 2010.
- Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
- Bai, L.; Yao, L.; Li, C.; Wang, X.; Wang, C. Adaptive graph convolutional recurrent network for traffic forecasting. In Proceedings of the 34th International Conference on Neural Information Processing Systems, Online, 6–12 December 2020.
Time | Line ID | Station ID | Device ID | Status | User ID | Pay Type |
---|---|---|---|---|---|---|
1 January 2019 6:54 | B | 30 | 1482 | 0 | D92a70cb | 3 |
1 January 2019 7:24 | A | 72 | 3311 | 1 | Af95c8cc | 0 |
1 January 2019 8:03 | C | 46 | 2176 | 0 | Bf8ed2b8 | 1 |
… | … | … | … | … | … | … |
Station Index | 06:00–06:10 | 06:10–06:20 | 06:20–06:30 | 23:20–23:30 |
---|---|---|---|---|
1 | 21 | 40 | 37 | 8 |
2 | 17 | 14 | 29 | 28 |
3 | 25 | 56 | 77 | 15 |
… | … | … | … | … |
81 | 10 | 11 | 15 | 10 |
Model | RMSE (10 min) | MAE (10 min) | WMAPE (10 min) | RMSE (15 min) | MAE (15 min) | WMAPE (15 min) | RMSE (30 min) | MAE (30 min) | WMAPE (30 min) |
---|---|---|---|---|---|---|---|---|---|
HA [1] | 58.45 | 31.28 | 17.20% | 101.73 | 51.17 | 18.92% | 312.10 | 159.71 | 29.99% |
ARIMA [2] | 51.53 | 29.38 | 15.58% | 81.94 | 42.21 | 15.97% | 189.89 | 102.36 | 19.25% |
LSTM [3] | 38.21 | 22.85 | 13.51% | 42.92 | 29.03 | 12.86% | 97.35 | 57.82 | 11.79% |
TCN [25] | 36.18 | 19.77 | 12.71% | 40.28 | 26.34 | 10.45% | 65.21 | 39.68 | 7.47% |
GCN [12] | 37.21 | 19.82 | 12.82% | 42.59 | 28.65 | 11.78% | 67.23 | 40.05 | 7.62% |
AGCRN [26] | 31.09 | 17.78 | 11.95% | 38.18 | 25.63 | 10.22% | 57.66 | 35.81 | 6.89% |
ST-GCN [17] | 30.31 | 17.74 | 12.49% | 36.94 | 21.31 | 8.67% | 56.30 | 33.85 | 6.64% |
MTGNN [21] | 29.56 | 16.94 | 11.65% | 35.76 | 20.40 | 8.89% | 54.41 | 32.50 | 6.36% |
AMGC-AT (ours) | 25.32 | 16.65 | 10.62% | 31.24 | 19.67 | 8.45% | 50.79 | 30.68 | 5.86% |
Improvement | 16.74% | 5.54% | 12.52% | 14.47% | 3.71% | 5.20% | 7.12% | 5.93% | 8.53% |
Model | RMSE (10 min) | MAE (10 min) | WMAPE (10 min) |
---|---|---|---|
AMGC-S | 28.81 | 31.28 | 17.20% |
AMGC-T | 28.56 | 29.30 | 15.58% |
AGC-AT | 29.53 | 19.56 | 13.01% |
AMG-AT | 26.29 | 16.71 | 11.24% |
AGC1-AT | 25.97 | 16.31 | 11.04% |
AGC2-AT | 27.02 | 17.78 | 11.95% |
AGC3-AT | 26.81 | 17.74 | 12.49% |
AMGC-AT1 | 26.45 | 16.94 | 11.65% |
AMGC-AT2 | 25.85 | 15.53 | 8.99% |
AMGC-AT3 | 26.47 | 15.74 | 9.18% |
AMGC-AT | 25.32 | 16.65 | 10.62% |