Convolutional Long-Short Term Memory Network with Multi-Head Attention Mechanism for Traffic Flow Prediction
Abstract
1. Introduction
- A decoder network is introduced to decompose the original traffic flow data into features in a higher-dimensional feature space.
- A convolutional LSTM network is introduced to capture the correlations among these high-dimensional features together with the temporal correlation of the traffic flow data.
- A multi-head attention mechanism is introduced so that predictions are made from the most relevant portions of the traffic data, improving prediction performance (a minimal sketch of this attention computation follows this list).
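To make the last point concrete, the sketch below implements standard multi-head scaled dot-product self-attention (the formulation of Vaswani et al., cited in the references). The head count, feature dimension, and random weight initialization are illustrative assumptions, not the configuration used in this paper.

```python
# Minimal NumPy sketch of multi-head scaled dot-product self-attention.
# Illustrative only: head count, dimensions, and random weights are assumptions.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, num_heads, rng):
    """x: (time_steps, d_model) sequence of traffic-flow features."""
    t, d_model = x.shape
    d_head = d_model // num_heads
    heads = []
    for _ in range(num_heads):
        # Per-head query/key/value projections (randomly initialized here).
        wq = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        wk = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        wv = rng.standard_normal((d_model, d_head)) / np.sqrt(d_model)
        q, k, v = x @ wq, x @ wk, x @ wv
        # Attention weights: how strongly each time step attends to every other.
        weights = softmax(q @ k.T / np.sqrt(d_head), axis=-1)   # (t, t)
        heads.append(weights @ v)                                # (t, d_head)
    concat = np.concatenate(heads, axis=-1)                      # (t, d_model)
    wo = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
    return concat @ wo

rng = np.random.default_rng(0)
x = rng.standard_normal((24, 96))   # 24 hourly observations, 96-dim features
print(multi_head_self_attention(x, num_heads=4, rng=rng).shape)  # (24, 96)
```

The attention-weighting step is what lets the model emphasize the most informative time steps; in the proposed architecture the attention input is the convolutional LSTM output rather than the raw traffic data.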
2. Literature Review
3. Convolutional LSTM with Multi-Head Attention Mechanism
3.1. Decoder Network for Traffic Data Decomposition
3.2. Convolutional LSTM Cell
3.3. Multi-Head Attention Model
4. Case Study
4.1. Dataset Description
4.2. Evaluation Metric
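The result tables in Sections 4.4 and 4.5 report RMSE and MAE. Assuming the standard definitions over N test samples with ground truth y_i and prediction ŷ_i, these are:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\bigl(y_i - \hat{y}_i\bigr)^{2}},
\qquad
\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\bigl|y_i - \hat{y}_i\bigr|.
```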
4.3. Model Architecture and Hyperparameters
4.4. Traffic Flow Prediction Results for the First Case
4.5. Traffic Flow Prediction Results for the Second Case
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
LSTM | long short-term memory
ANN | artificial neural network
PeMS | Caltrans Performance Measurement System
VMT | vehicle miles traveled
VHT | vehicle hours traveled
RMSE | root mean squared error
MAE | mean absolute error
FC | fully connected
ConvLSTM | convolutional long short-term memory
References
- Bazzan, A.L.; Oliveira, D.d.; Klügl, F.; Nagel, K. To adapt or not to adapt–consequences of adapting driver and traffic light agents. In Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning; Springer: Berlin/Heidelberg, Germany, 2005; pp. 1–14. [Google Scholar]
- Ahmad, A.; Arshad, R.; Mahmud, S.A.; Khan, G.M.; Al-Raweshidy, H.S. Earliest-deadline-based scheduling to reduce urban traffic congestion. IEEE Trans. Intell. Transp. Syst. 2014, 15, 1510–1526. [Google Scholar] [CrossRef]
- Zhang, Y.; Li, C.; Luan, T.H.; Fu, Y.; Shi, W.; Zhu, L. A mobility-aware vehicular caching scheme in content centric networks: Model and optimization. IEEE Trans. Veh. Technol. 2019, 68, 3100–3112. [Google Scholar] [CrossRef]
- Falcocchio, J.C.; Levinson, H.S. Road Traffic Congestion: A Concise Guide; Springer: Berlin/Heidelberg, Germany, 2015; Volume 7. [Google Scholar]
- Wu, Y.; Tan, H.; Qin, L.; Ran, B.; Jiang, Z. A hybrid deep learning based traffic flow prediction method and its understanding. Transp. Res. Part C Emerg. Technol. 2018, 90, 166–180. [Google Scholar] [CrossRef]
- Shi, R.; Du, L. Multi-Section Traffic Flow Prediction Based on MLR-LSTM Neural Network. Sensors 2022, 22, 7517. [Google Scholar] [CrossRef] [PubMed]
- Wang, S.; Zhao, J.; Shao, C.; Dong, C.; Yin, C. Truck traffic flow prediction based on LSTM and GRU methods with sampled GPS data. IEEE Access 2020, 8, 208158–208169. [Google Scholar] [CrossRef]
- Chen, Z.; Wu, B.; Li, B.; Ruan, H. Expressway exit traffic flow prediction for ETC and MTC charging system based on entry traffic flows and LSTM model. IEEE Access 2021, 9, 54613–54624. [Google Scholar] [CrossRef]
- Zhou, Q.; Chen, N.; Lin, S. FASTNN: A Deep Learning Approach for Traffic Flow Prediction Considering Spatiotemporal Features. Sensors 2022, 22, 6921. [Google Scholar] [CrossRef]
- Yu, C.; Chen, J.; Xia, G. Coordinated Control of Intelligent Fuzzy Traffic Signal Based on Edge Computing Distribution. Sensors 2022, 22, 5953. [Google Scholar] [CrossRef]
- Feng, X.; Ling, X.; Zheng, H.; Chen, Z.; Xu, Y. Adaptive multi-kernel SVM with spatial–temporal correlation for short-term traffic flow prediction. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2001–2013. [Google Scholar] [CrossRef]
- Kumar, S.V. Traffic flow prediction using Kalman filtering technique. Procedia Eng. 2017, 187, 582–587. [Google Scholar] [CrossRef]
- Mingheng, Z.; Yaobao, Z.; Ganglong, H.; Gang, C. Accurate multisteps traffic flow prediction based on SVM. Math. Probl. Eng. 2013, 2013, 418303. [Google Scholar] [CrossRef] [Green Version]
- Lv, Y.; Duan, Y.; Kang, W.; Li, Z.; Wang, F.Y. Traffic flow prediction with big data: A deep learning approach. IEEE Trans. Intell. Transp. Syst. 2014, 16, 865–873. [Google Scholar] [CrossRef]
- Miglani, A.; Kumar, N. Deep learning models for traffic flow prediction in autonomous vehicles: A review, solutions, and challenges. Veh. Commun. 2019, 20, 100184. [Google Scholar] [CrossRef]
- Sun, Y.; Leng, B.; Guan, W. A novel wavelet-SVM short-time passenger flow prediction in Beijing subway system. Neurocomputing 2015, 166, 109–121. [Google Scholar] [CrossRef]
- Liu, Y.; Wu, H. Prediction of road traffic congestion based on random forest. In Proceedings of the 2017 10th International Symposium on Computational Intelligence and Design (ISCID), Hangzhou, China, 9–10 December 2017; IEEE: Piscataway, NJ, USA, 2017; Volume 2, pp. 361–364. [Google Scholar]
- Sun, S.; Xu, X. Variational inference for infinite mixtures of Gaussian processes with applications to traffic flow prediction. IEEE Trans. Intell. Transp. Syst. 2010, 12, 466–475. [Google Scholar] [CrossRef]
- Pascale, A.; Nicoli, M. Adaptive Bayesian network for traffic flow prediction. In Proceedings of the 2011 IEEE Statistical Signal Processing Workshop (SSP), Nice, France, 28–30 June 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 177–180. [Google Scholar]
- Tang, J.; Chen, X.; Hu, Z.; Zong, F.; Han, C.; Li, L. Traffic flow prediction based on combination of support vector machine and data denoising schemes. Phys. Stat. Mech. Its Appl. 2019, 534, 120642. [Google Scholar] [CrossRef]
- Zhang, L.; Alharbe, N.R.; Luo, G.; Yao, Z.; Li, Y. A hybrid forecasting framework based on support vector regression with a modified genetic algorithm and a random forest for traffic flow prediction. Tsinghua Sci. Technol. 2018, 23, 479–492. [Google Scholar] [CrossRef]
- Xu, Y.; Yin, F.; Xu, W.; Lin, J.; Cui, S. Wireless traffic prediction with scalable Gaussian process: Framework, algorithms, and verification. IEEE J. Sel. Areas Commun. 2019, 37, 1291–1306. [Google Scholar] [CrossRef] [Green Version]
- Wang, W.; Zhou, C.; He, H.; Wu, W.; Zhuang, W.; Shen, X. Cellular traffic load prediction with LSTM and Gaussian process regression. In Proceedings of the ICC 2020-2020 IEEE International Conference on Communications (ICC), Dublin, Ireland, 7–11 June 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–6. [Google Scholar]
- Zhu, Z.; Peng, B.; Xiong, C.; Zhang, L. Short-term traffic flow prediction with linear conditional Gaussian Bayesian network. J. Adv. Transp. 2016, 50, 1111–1123. [Google Scholar] [CrossRef]
- Li, Z.; Jiang, S.; Li, L.; Li, Y. Building sparse models for traffic flow prediction: An empirical comparison between statistical heuristics and geometric heuristics for Bayesian network approaches. Transp. B Transp. Dyn. 2017, 7, 107–123. [Google Scholar] [CrossRef]
- Wei, W.; Wu, H.; Ma, H. An autoencoder and LSTM-based traffic flow prediction method. Sensors 2019, 19, 2946. [Google Scholar] [CrossRef] [Green Version]
- Xiao, Y.; Yin, Y. Hybrid LSTM neural network for short-term traffic flow prediction. Information 2019, 10, 105. [Google Scholar] [CrossRef] [Green Version]
- Fu, R.; Zhang, Z.; Li, L. Using LSTM and GRU neural network methods for traffic flow prediction. In Proceedings of the 2016 31st Youth Academic Annual Conference of Chinese Association of Automation (YAC), Wuhan, China, 11–13 November 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 324–328. [Google Scholar]
- Shu, W.; Cai, K.; Xiong, N.N. A short-term traffic flow prediction model based on an improved gate recurrent unit neural network. IEEE Trans. Intell. Transp. Syst. 2021, 23, 16654–16665. [Google Scholar] [CrossRef]
- Yang, B.; Sun, S.; Li, J.; Lin, X.; Tian, Y. Traffic flow prediction using LSTM with feature enhancement. Neurocomputing 2019, 332, 320–327. [Google Scholar] [CrossRef]
- Xiangxue, W.; Lunhui, X.; Kaixun, C. Data-driven short-term forecasting for urban road network traffic based on data processing and LSTM-RNN. Arab. J. Sci. Eng. 2019, 44, 3043–3060. [Google Scholar] [CrossRef]
- Li, Z.; Xiong, G.; Chen, Y.; Lv, Y.; Hu, B.; Zhu, F.; Wang, F.Y. A hybrid deep learning approach with GCN and LSTM for traffic flow prediction. In Proceedings of the 2019 IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 27–30 October 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1929–1933. [Google Scholar]
- Chen, J.; Liao, S.; Hou, J.; Wang, K.; Wen, J. GST-GCN: A Geographic-Semantic-Temporal Graph Convolutional Network for Context-aware Traffic Flow Prediction on Graph Sequences. In Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), Melbourne, Australia, 17–20 October 2021; IEEE: Piscataway, NJ, USA, 2020; pp. 1604–1609. [Google Scholar]
- Jiang, W.; Luo, J. Graph neural network for traffic forecasting: A survey. Expert Syst. Appl. 2022, 207, 117921. [Google Scholar] [CrossRef]
- Tian, Y.; Zhang, K.; Li, J.; Lin, X.; Yang, B. LSTM-based traffic flow prediction with missing data. Neurocomputing 2018, 318, 297–305. [Google Scholar] [CrossRef]
- Dai, G.; Ma, C.; Xu, X. Short-term traffic flow prediction method for urban road sections based on space–time analysis and GRU. IEEE Access 2019, 7, 143025–143035. [Google Scholar] [CrossRef]
- Zhene, Z.; Hao, P.; Lin, L.; Guixi, X.; Du, B.; Bhuiyan, M.Z.A.; Long, Y.; Li, D. Deep convolutional mesh RNN for urban traffic passenger flows prediction. In Proceedings of the 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Guangzhou, China, 8–12 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 1305–1310. [Google Scholar]
- Luo, X.; Li, D.; Yang, Y.; Zhang, S. Spatiotemporal traffic flow prediction with KNN and LSTM. J. Adv. Transp. 2019, 2019, 4145353. [Google Scholar] [CrossRef] [Green Version]
- Zhu, H.; Xie, Y.; He, W.; Sun, C.; Zhu, K.; Zhou, G.; Ma, N. A novel traffic flow forecasting method based on RNN-GCN and BRB. J. Adv. Transp. 2020, 2020, 7586154. [Google Scholar] [CrossRef]
- Yu, B.; Lee, Y.; Sohn, K. Forecasting road traffic speeds by considering area-wide spatio-temporal dependencies based on a graph convolutional neural network (GCN). Transp. Res. Part C Emerg. Technol. 2020, 114, 189–204. [Google Scholar] [CrossRef]
- Ye, J.; Zhao, J.; Ye, K.; Xu, C. How to build a graph-based deep learning architecture in traffic domain: A survey. IEEE Trans. Intell. Transp. Syst. 2022, 23, 3904–3924. [Google Scholar] [CrossRef]
- Shi, X.; Chen, Z.; Wang, H.; Yeung, D.Y.; Wong, W.K.; Woo, W.C. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. Adv. Neural Inf. Process. Syst. 2015, 28, 802–810. [Google Scholar]
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473. [Google Scholar]
- Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International Conference on Machine Learning, PMLR, Lille, France, 7–9 July 2015; pp. 2048–2057. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems 30, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Li, J.; Tu, Z.; Yang, B.; Lyu, M.R.; Zhang, T. Multi-head attention with disagreement regularization. arXiv 2018, arXiv:1810.10183. [Google Scholar]
- Caltrans. Performance Measurement System (PeMS). Available online: https://pems.dot.ca.gov/ (accessed on 5 February 2022).
- Sun, S.; Huang, R.; Gao, Y. Network-scale traffic modeling and forecasting with graphical lasso and neural networks. J. Transp. Eng. 2012, 138, 1358–1367. [Google Scholar] [CrossRef]
- Rahman, F.I. Short term traffic flow prediction using machine learning-KNN, SVM and ANN with weather information. Int. J. Traffic Transp. Eng. 2020, 10, 371–389. [Google Scholar]
Layer No. | Description | Output Dimensions |
---|---|---|
1 | Input layer | 100 × 24 × 1 |
2 | FC layer | 100 × 24 × 100 |
3 | Convolutional LSTM | 100 × 24 × 99 |
4 | Multi-head attention | 100 × 24 × 99 |
5 | Flatten layer | 100 × 2376 |
6 | Dense layer | 100 × 1 |
Layer No. | Description | Output Dimensions |
---|---|---|
1 | Input layer | 100 × 24 × 1 |
2 | FC layer | 100 × 24 × 100 |
3 | FC layer | 100 × 24 × 100 |
4 | FC layer | 100 × 24 × 100 |
5 | Convolutional LSTM | 100 × 24 × 91 |
6 | Multi-head attention | 100 × 24 × 91 |
7 | Flatten layer | 100 × 2184 |
8 | Dense layer | 100 × 1 |
Layer No. | Description | Output Dimensions |
---|---|---|
1 | Input layer | 100 × 24 × 1 |
2 | FC layer | 100 × 24 × 100 |
3 | FC layer | 100 × 24 × 100 |
4 | FC layer | 100 × 24 × 100 |
5 | FC layer | 100 × 24 × 100 |
6 | FC layer | 100 × 24 × 100 |
7 | Convolutional LSTM | 100 × 24 × 91 |
8 | Multi-head attention | 100 × 24 × 91 |
9 | Flatten layer | 100 × 2184 |
10 | Dense layer | 100 × 1 |
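The three configurations above differ only in the depth of the fully connected decoder (1, 3, or 5 layers) and in the width reduction produced by the convolutional LSTM (100 → 99 or 100 → 91 features); the leading dimension of 100 is the batch size. A hedged TensorFlow/Keras sketch that reproduces the listed output shapes is given below; the activation functions, number of attention heads, single ConvLSTM filter, and 'valid' padding are assumptions, since the tables report only layer types and output dimensions.

```python
# Hedged Keras sketch of the layer stacks listed above (not the authors' code).
# Assumptions: ConvLSTM1D with one filter and 'valid' padding, so the feature
# width shrinks from 100 to 99 (kernel size 2) or 91 (kernel size 10);
# ReLU in the decoder; 4 attention heads.
from tensorflow.keras import layers, Model

def build_model(n_decoder_layers: int, kernel_size: int,
                time_steps: int = 24, hidden: int = 100,
                num_heads: int = 4) -> Model:
    inp = layers.Input(shape=(time_steps, 1))              # 24 x 1 per sample
    x = inp
    for _ in range(n_decoder_layers):                      # FC decoder layers
        x = layers.Dense(hidden, activation="relu")(x)     # 24 x 100
    width = hidden - kernel_size + 1                       # 99 or 91
    x = layers.Reshape((time_steps, hidden, 1))(x)         # 24 x 100 x 1
    x = layers.ConvLSTM1D(filters=1, kernel_size=kernel_size,
                          padding="valid",
                          return_sequences=True)(x)        # 24 x width x 1
    x = layers.Reshape((time_steps, width))(x)             # 24 x width
    x = layers.MultiHeadAttention(num_heads=num_heads,
                                  key_dim=width)(x, x)     # 24 x width
    x = layers.Flatten()(x)                                 # 24 * width
    out = layers.Dense(1)(x)                                # one-step prediction
    return Model(inp, out)

# First table:  1 decoder layer,  kernel size 2  -> flattened width 24*99 = 2376
# Second table: 3 decoder layers, kernel size 10 -> flattened width 24*91 = 2184
# Third table:  5 decoder layers, kernel size 10 -> flattened width 24*91 = 2184
model = build_model(n_decoder_layers=1, kernel_size=2)
model.summary()
```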
Method Symbol | Description |
---|---|
D-ConvLSTM | Decoder with convolutional LSTM |
D-Attention | Decoder with multi-head attention |
LSTM | Long short-term memory network |
LASSO | Regression with ℓ1-norm regularization |
ANN | Artificial neural network |
Metric | Method | 1 h VMT | 1 h VHT | 5 h VMT | 5 h VHT | 10 h VMT | 10 h VHT
---|---|---|---|---|---|---|---
RMSE | Proposed | 0.032 | 0.066 | 0.080 | 0.128 | 0.084 | 0.167
RMSE | D-ConvLSTM | 0.044 | 0.079 | 0.099 | 0.128 | 0.094 | 0.157
RMSE | D-Attention | 0.043 | 0.086 | 0.105 | 0.179 | 0.113 | 0.199
RMSE | LSTM [30] | 0.038 | 0.064 | 0.065 | 0.145 | 0.104 | 0.191
RMSE | LASSO [48] | 0.088 | 0.141 | 0.142 | 0.242 | 0.141 | 0.240
RMSE | ANN [49] | 0.054 | 0.103 | 0.137 | 0.245 | 0.138 | 0.241
MAE | Proposed | 0.024 | 0.048 | 0.059 | 0.090 | 0.058 | 0.116
MAE | D-ConvLSTM | 0.034 | 0.058 | 0.072 | 0.097 | 0.064 | 0.115
MAE | D-Attention | 0.034 | 0.066 | 0.076 | 0.135 | 0.077 | 0.138
MAE | LSTM [30] | 0.029 | 0.045 | 0.046 | 0.107 | 0.064 | 0.130
MAE | LASSO [48] | 0.063 | 0.096 | 0.099 | 0.165 | 0.098 | 0.163
MAE | ANN [49] | 0.039 | 0.072 | 0.090 | 0.172 | 0.089 | 0.168
Metric | Method | 1 h VMT | 1 h VHT | 5 h VMT | 5 h VHT | 10 h VMT | 10 h VHT
---|---|---|---|---|---|---|---
RMSE | Proposed | 0.053 | 0.100 | 0.084 | 0.135 | 0.100 | 0.172
RMSE | D-ConvLSTM | 0.088 | 0.153 | 0.094 | 0.157 | 0.118 | 0.184
RMSE | D-Attention | 0.055 | 0.087 | 0.112 | 0.168 | 0.141 | 0.253
RMSE | LSTM [30] | 0.055 | 0.106 | 0.113 | 0.187 | 0.119 | 0.225
RMSE | LASSO [48] | 0.091 | 0.145 | 0.149 | 0.256 | 0.145 | 0.248
RMSE | ANN [49] | 0.063 | 0.112 | 0.143 | 0.255 | 0.146 | 0.256
MAE | Proposed | 0.042 | 0.078 | 0.062 | 0.093 | 0.064 | 0.109
MAE | D-ConvLSTM | 0.060 | 0.107 | 0.060 | 0.104 | 0.078 | 0.125
MAE | D-Attention | 0.043 | 0.107 | 0.075 | 0.123 | 0.104 | 0.187
MAE | LSTM [30] | 0.046 | 0.076 | 0.077 | 0.129 | 0.080 | 0.153
MAE | LASSO [48] | 0.063 | 0.099 | 0.102 | 0.175 | 0.100 | 0.168
MAE | ANN [49] | 0.044 | 0.079 | 0.096 | 0.175 | 0.099 | 0.179