Traffic Status Prediction Based on Multidimensional Feature Matching and 2nd-Order Hidden Markov Model (HMM)
Abstract
1. Introduction
- (1) Extraction and Matching of Multidimensional Spatiotemporal Features in Urban Road Traffic. We extract traffic features from urban road sections along both the temporal and spatial dimensions. Using multidimensional spatiotemporal data, we capture the attributes that influence traffic conditions in these sections, together with their mutual correlations and impacts. The attributes are quantified through normalization, and nearest-neighbor matching is used to reduce the influence of long-term cyclic patterns on the prediction results. Because the approach accounts for the interconnections among these factors, it also addresses the problem of traffic data that are missing not at random (MNAR).
- (2) Integration of Feature Matching and 2nd-order HMM for Traffic Status Prediction. To improve the efficiency and accuracy of traffic status prediction, we adopt a “match first, then predict” strategy within specific time intervals. Within a designated timeframe, feature matching on the spatiotemporal traffic features yields the traffic status for the first few timeslices. Starting from these matched statuses, a 2nd-order HMM forecasts the traffic status for the subsequent timeslices.
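
The “match first, then predict” idea can be illustrated with a minimal sketch. It covers only the second-order transition part, in which the next status depends on the two preceding statuses; the observation (emission) model of a full HMM is left out. The function names, the toy sequences, and the fallback rule for unseen status pairs are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_second_order(sequences):
    """Estimate P(s[t] | s[t-2], s[t-1]) by counting status triples."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for a, b, c in zip(seq, seq[1:], seq[2:]):
            counts[(a, b)][c] += 1
    # Normalize the counts of each (s[t-2], s[t-1]) pair into probabilities.
    return {
        pair: {s: n / sum(nxt.values()) for s, n in nxt.items()}
        for pair, nxt in counts.items()
    }


def predict_next(model, prev2, prev1):
    """Most probable next status; falls back to prev1 for unseen pairs (assumption)."""
    dist = model.get((prev2, prev1))
    if not dist:
        return prev1
    return max(dist, key=dist.get)


# Toy status sequences (0 = Smooth ... 4 = Severe Congestion), illustrative only.
history = [[0, 0, 1, 1, 2, 2, 1, 0], [0, 1, 1, 2, 3, 2, 1, 1]]
model = fit_second_order(history)

# The two most recent statuses would come from feature matching ("match first").
matched = [1, 1]
print(predict_next(model, *matched))
```

In this toy setting the matched statuses seed the predictor, and each predicted status can be appended to the pair to forecast further timeslices.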
2. Literature Review
2.1. Traffic Status Recognition
2.2. Prediction and Imputation Algorithms
3. Methods
3.1. Traffic Status Labeling
3.2. Multidimensional Feature Extraction
3.3. Feature Normalization and Matching
Algorithm 1 Topology-first Feature Matching Algorithm
Input:
- Feature matrix of all topology units
- Section number of the section to be predicted
- Timeslice to be predicted
Output:
- S: Traffic status matched to the section to be predicted
3: Initialize similar_topologies as None.
4: for each topology unit in the feature matrix:
5:     Calculate the standard deviation of its topology features relative to the section to be predicted
6:     similar_topologies ← the matrix and the calculated standard deviation
7: end for
8: Keep the top-5 matrices of similar_topologies with the smallest standard deviation
10: Initialize the matched matrix as None and min_time_diff as +∞
11: for each of the top-5 matrices:
12:     Calculate the standard deviation between its time features and the timeslice to be predicted
13:     if the standard deviation < min_time_diff then:
            the matched matrix and min_time_diff ← this matrix and the calculated standard deviation
14:     end if
15: end for
16: Calculate the average traffic status for middle road sections in the matched matrix
17: S ← the average traffic status
18: return S
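
The two-stage structure of Algorithm 1 (topology match first, then time match) can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary keys `topology`, `time`, and `middle_status` are hypothetical, and the root-mean-square difference between feature vectors stands in for the standard-deviation measure, whose exact definition is not recoverable from the listing.

```python
import math


def rms_diff(u, v):
    """Root-mean-square difference between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))


def topology_first_match(units, target_topology, target_time, k=5):
    """Pick the k units closest in topology, then the one closest in time.

    `units` is a list of dicts with hypothetical keys 'topology' and 'time'
    (normalized feature vectors) and 'middle_status' (statuses of the
    middle road sections of the unit).
    """
    # Steps 3-8: rank all units by topology distance and keep the top k.
    ranked = sorted(units, key=lambda u: rms_diff(u["topology"], target_topology))
    candidates = ranked[:k]

    # Steps 10-15: among the candidates, keep the smallest time distance.
    # The running minimum starts at infinity so the first candidate is accepted.
    matched, min_time_diff = None, math.inf
    for unit in candidates:
        d = rms_diff(unit["time"], target_time)
        if d < min_time_diff:
            matched, min_time_diff = unit, d
    if matched is None:
        return None

    # Steps 16-18: average status of the matched unit's middle sections.
    statuses = matched["middle_status"]
    return sum(statuses) / len(statuses)


# Toy data with normalized features, illustrative only.
units = [
    {"topology": [0.75, 0.25, 1.0], "time": [0.5, 1.0, 0.25], "middle_status": [0.25]},
    {"topology": [0.75, 0.25, 1.0], "time": [1.0, 1.0, 0.5], "middle_status": [0.5, 0.75]},
    {"topology": [0.0, 1.0, 0.25], "time": [1.0, 1.0, 0.5], "middle_status": [1.0]},
]
print(topology_first_match(units, [0.75, 0.25, 1.0], [1.0, 1.0, 0.5], k=2))
```

Restricting the time match to the k topologically closest units keeps the second stage cheap, which is the point of matching topology first.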
3.4. Traffic Status Prediction Based on 2nd-Order HMM
4. Experiment
4.1. Traffic Status Labeling and Road Feature Construction
4.2. Traffic Status Matching and 2nd-Order HMM Prediction
4.3. Comparison Experiment
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Sharma, A.; Sharma, A.; Nikashina, P.; Gavrilenko, V.; Tselykh, A.; Bozhenyuk, A.; Masud, M.; Meshref, H. A Graph Neural Network (GNN)-Based Approach for Real-Time Estimation of Traffic Speed in Sustainable Smart Cities. Sustainability 2023, 15, 11893.
- Wang, J.; Wang, C.; Lv, J.; Zhang, Z.; Li, C. Modeling Travel Time Reliability of Road Network Considering Connected Vehicle Guidance Characteristics Indexes. J. Adv. Transp. 2017, 2017, 2415312.
- Van Buuren, S. Flexible Imputation of Missing Data, 1st ed.; Chapman and Hall/CRC: New York, NY, USA, 2012; ISBN 978-0-429-06540-8.
- Chen, X.; He, Z.; Wang, J. Spatial-Temporal Traffic Speed Patterns Discovery and Incomplete Data Recovery via SVD-Combined Tensor Decomposition. Transp. Res. Part C Emerg. Technol. 2018, 86, 59–77.
- Bae, B.; Kim, H.; Lim, H.; Liu, Y.; Han, L.D.; Freeze, P.B. Missing Data Imputation for Traffic Flow Speed Using Spatio-Temporal Cokriging. Transp. Res. Part C Emerg. Technol. 2018, 88, 124–139.
- Li, L.; Zhang, J.; Wang, Y.; Ran, B. Missing Value Imputation for Traffic-Related Time Series Data Based on a Multi-View Learning Method. IEEE Trans. Intell. Transp. Syst. 2019, 20, 2933–2943.
- Sun, Y.; Li, J.; Xu, Y.; Zhang, T.; Wang, X. Deep Learning versus Conventional Methods for Missing Data Imputation: A Review and Comparative Study. Expert Syst. Appl. 2023, 227, 120201.
- Soumare, H.; Benkahla, A.; Gmati, N. Deep Learning Regularization Techniques to Genomics Data. Array 2021, 11, 100068.
- Harleman, M.; Harris, L.; Willis, M.D.; Ritz, B.; Hystad, P.; Hill, E.L. Changes in Traffic Congestion and Air Pollution Due to Major Roadway Infrastructure Improvements in Texas. Sci. Total Environ. 2023, 898, 165463.
- Janwari, M.M.; Tiwari, G.; Popli, S.K.; Mir, M.S. Traffic Analysis of Srinagar City. Transp. Res. Procedia 2016, 17, 3–15.
- Ukam, G.; Adams, C.; Adebanji, A.; Ackaah, W. Factors Affecting Paratransit Travel Times at Route and Segment Levels. Int. J. Transp. Sci. Technol. 2023.
- Cui, S.; Gu, X.; Xie, W.; Wu, D. Research on Cold Chain Routing Optimization of Multi-Distribution Center Considering Traffic Performance Index. Procedia Comput. Sci. 2023, 221, 1343–1350.
- Tamir, T.S.; Xiong, G.; Li, Z.; Tao, H.; Shen, Z.; Hu, B.; Menkir, H.M. Traffic Congestion Prediction Using Decision Tree, Logistic Regression and Neural Networks. IFAC Pap. 2020, 53, 512–517.
- Saleem, M.; Abbas, S.; Ghazal, T.M.; Adnan Khan, M.; Sahawneh, N.; Ahmad, M. Smart Cities: Fusion-Based Intelligent Traffic Congestion Control System for Vehicular Networks Using Machine Learning Techniques. Egypt. Inform. J. 2022, 23, 417–426.
- Afandizadeh Zargari, S.; Amoei Khorshidi, N.; Mirzahossein, H.; Heidari, H. Analyzing the Effects of Congestion on Planning Time Index—Grey Models vs. Random Forest Regression. Int. J. Transp. Sci. Technol. 2023, 12, 578–593.
- Gao, Y.; Li, J.; Xu, Z.; Liu, Z.; Zhao, X.; Chen, J. A Novel Image-Based Convolutional Neural Network Approach for Traffic Congestion Estimation. Expert Syst. Appl. 2021, 180, 115037.
- Guo, J.; Liu, Y.; Yang, K.Q.; Wang, Y.; Fang, S. GPS-Based Citywide Traffic Congestion Forecasting Using CNN-RNN and C3D Hybrid Model. Transp. A Transp. Sci. 2021, 17, 190–211.
- Narmadha, S.; Vijayakumar, V. Spatio-Temporal Vehicle Traffic Flow Prediction Using Multivariate CNN and LSTM Model. Mater. Today Proc. 2023, 81, 826–833.
- Zheng, W.; Yang, H.F.; Cai, J.; Wang, P.; Jiang, X.; Du, S.S.; Wang, Y.; Wang, Z. Integrating the Traffic Science with Representation Learning for City-Wide Network Congestion Prediction. Inf. Fusion 2023, 99, 101837.
- Fowe, A.J.; Chan, Y. A Microstate Spatial-Inference Model for Network-Traffic Estimation. Transp. Res. Part C Emerg. Technol. 2013, 36, 245–260.
- Weerakody, P.B.; Wong, K.W.; Wang, G. Cyclic Gate Recurrent Neural Networks for Time Series Data with Missing Values. Neural Process. Lett. 2023, 55, 1527–1554.
- Yang, B.; Kang, Y.; Yuan, Y.; Huang, X.; Li, H. ST-LBAGAN: Spatio-Temporal Learnable Bidirectional Attention Generative Adversarial Networks for Missing Traffic Data Imputation. Knowl. Based Syst. 2021, 215, 106705.
- Guo, Z.; Yang, C.; Wang, D.; Liu, H. A Novel Deep Learning Model Integrating CNN and GRU to Predict Particulate Matter Concentrations. Process Saf. Environ. Prot. 2023, 173, 604–613.
- Xu, M.; Di, Y.; Ding, H.; Zhu, Z.; Chen, X.; Yang, H. AGNP: Network-Wide Short-Term Probabilistic Traffic Speed Prediction and Imputation. Commun. Transp. Res. 2023, 3, 100099.
- Haliduola, H.N.; Bretz, F.; Mansmann, U. Missing Data Imputation Using Utility-Based Regression and Sampling Approaches. Comput. Methods Programs Biomed. 2022, 226, 107172.
- Huang, L.; Li, Z.; Luo, R.; Su, R. Missing Traffic Data Imputation with a Linear Generative Model Based on Probabilistic Principal Component Analysis. Sensors 2023, 23, 204.
- Wang, L.; Geng, X.; Ma, X.; Liu, F.; Yang, Q. Cross-City Transfer Learning for Deep Spatio-Temporal Prediction. arXiv 2018, arXiv:1802.00386.
- Qi, Y.; Ishak, S. A Hidden Markov Model for Short Term Prediction of Traffic Conditions on Freeways. Transp. Res. Part C Emerg. Technol. 2014, 43, 95–111.
- Zaki, J.F.; Ali-Eldin, A.; Hussein, S.E.; Saraya, S.F.; Areed, F.F. Traffic Congestion Prediction Based on Hidden Markov Models and Contrast Measure. Ain Shams Eng. J. 2020, 11, 535–551.
- Raskar, C.; Nema, S. Metaheuristic Enabled Modified Hidden Markov Model for Traffic Flow Prediction. Comput. Netw. 2022, 206, 108780.
- Wang, Y.; Chen, Y.; Li, G.; Lu, Y.; He, Z.; Yu, Z.; Sun, W. City-Scale Holographic Traffic Flow Data Based on Vehicular Trajectory Resampling. Sci. Data 2023, 10, 57.
- Chen, J.; Hu, Z.; Li, F. An Estimation Method of Traffic Flow State Based on Matching of Temporal-spatial Feature Sequences. J. Transp. Inf. Saf. 2021, 39, 68–76+120.
- Tang, J.; Zhang, G.; Wang, Y.; Wang, H.; Liu, F. A Hybrid Approach to Integrate Fuzzy C-Means Based Imputation Method with Genetic Algorithm for Missing Traffic Volume Data Estimation. Transp. Res. Part C Emerg. Technol. 2015, 51, 29–40.
- Duan, Y.; Lv, Y.; Liu, Y.-L.; Wang, F.-Y. An Efficient Realization of Deep Learning for Traffic Data Imputation. Transp. Res. Part C Emerg. Technol. 2016, 72, 168–181.
Variables | Interpretations |
---|---|
The average speed of the section at the current moment, taking the average of the sample speeds within the timeslice, km/h. | |
The length of the road section, m. | |
The ratio of section travel time within the timeslice. | |
The actual travel time on the section within the timeslice, s. | |
The travel time at the desired vehicle speed, s. | |
The traffic performance index of the road section in timeslice. | |
The conversion relation. | |
The road section number from which each feature sequence was generated to perform a road network topology search in a GIS mapping system. | |
The number of all road sections grouped into a topology unit, arranged in the order inflow road sections, current road section, outflow road sections; within the inflow and outflow groups, the data are sorted by road section number. | |
The time series. | |
Adjacency hierarchy. Because the strong correlation at the first adjacency level has been demonstrated [31], and to reduce the matrix dimension, we consider only the influence of directly adjacent road sections on the road section to be predicted. This study therefore uses the road sections adjacent at the first level of the road network topology as the features in the adjacency hierarchy part, i.e., the adjacency hierarchy is labeled as 1. | |
Number of road sections, i.e., the number of adjacent road sections. | |
Road classification, i.e., the urban road classification of the road section, including highway, expressway, trunk road, secondary trunk road, etc. | |
Road length. | |
Number of lanes of the road section. | |
Traffic flow of the road section within the timeslice. | |
Average speed of the road section within the timeslice. | |
Traffic status of the road section within the timeslice. | |
The five congestion degrees in Table 2. | |
The set of all possible observations, which in our algorithm can be either traffic flow or velocity. | |
The set of all possible traffic statuses. | |
The initial probability of each traffic status , estimated with a frequency on the set . | |
The status transfer probability matrix. | |
The observation probability matrix. | |
or | The probability that the first moment status is transferred to the second moment status in . |
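
For orientation, the HMM quantities listed above can be written in standard textbook notation as follows. The symbols used here ($\lambda$, $\pi$, $A$, $B$, $q_t$, $o_t$, $s_i$) are assumed for illustration and are not necessarily the paper's own; the second-order transition conditions the next status on the two preceding statuses.

```latex
\begin{aligned}
\lambda &= (\pi, A, B), \\
\pi_i &= P(q_1 = s_i), \\
a_{ij} &= P(q_2 = s_j \mid q_1 = s_i), \\
a_{ijk} &= P(q_{t+1} = s_k \mid q_{t-1} = s_i,\ q_t = s_j), \qquad t \ge 2, \\
b_j(o) &= P(o_t = o \mid q_t = s_j).
\end{aligned}
```

The first-order term $a_{ij}$ is needed only for the step from the first to the second moment, where no second predecessor exists.
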
TPI | R | Congestion Degree |
---|---|---|
[0, 1] | (0, 4/3] | Smooth |
(1, 2] | (4/3, 20/11] | Basic Smooth |
(2, 3] | (20/11, 2.5] | Light Congestion |
(3, 4] | (2.5, 5] | Moderate Congestion |
(4, 5] | (5, +∞) | Severe Congestion |
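
The interval table above translates directly into a lookup. The sketch below is illustrative only: the function and constant names are assumptions, and the right-closed interval bounds are read off the TPI and R columns of the table.

```python
from bisect import bisect_left

DEGREES = [
    "Smooth",
    "Basic Smooth",
    "Light Congestion",
    "Moderate Congestion",
    "Severe Congestion",
]

# Right-closed upper bounds of the first four intervals in each column.
TPI_UPPER = [1, 2, 3, 4]
R_UPPER = [4 / 3, 20 / 11, 2.5, 5]


def degree_from_tpi(tpi):
    """Map a traffic performance index in [0, 5] to a congestion degree."""
    if not 0 <= tpi <= 5:
        raise ValueError("TPI must lie in [0, 5]")
    # bisect_left keeps a value equal to a bound in the lower interval.
    return DEGREES[bisect_left(TPI_UPPER, tpi)]


def degree_from_ratio(r):
    """Map a travel time ratio R > 0 to a congestion degree."""
    if r <= 0:
        raise ValueError("R must be positive")
    return DEGREES[bisect_left(R_UPPER, r)]


print(degree_from_tpi(2.4), "/", degree_from_ratio(2.0))
```

Using `bisect_left` on the upper bounds reproduces the half-open intervals of the table without a chain of comparisons.
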
Features | Values | Topology Structure | Normalized Value |
---|---|---|---|
Time series | Weekday, Non-weekday | — | 0.5, 1 |
Time series | Light periods, Rush periods | — | 0.5, 1 |
Time series | 1st 5 min, 2nd 5 min, 3rd 5 min, 4th 5 min (in an hour) | — | 0.25, 0.5, 0.75, 1 |
Number of adjacent road sections | 0, 1, 2, 3, 4 | Inflow road sections, Outflow road sections | 0, 0.25, 0.5, 0.75, 1 |
Urban road classification | Highway, Freeway, Trunk Road, Secondary Trunk Road | Inflow road sections, Middle road sections, Outflow road sections | 0.25, 0.5, 0.75, 1 |
Road length | — | Inflow road sections, Middle road sections, Outflow road sections | A linear method was used to normalize with 1. |
Lanes number of road section | 0, 1, 2, 3, 4, 5 | Inflow road sections, Middle road sections, Outflow road sections | 0, 0.2, 0.4, 0.6, 0.8, 1 |
Traffic flow of road section | — | Inflow road sections, Outflow road sections | A linear method was used to normalize with 2. |
Average speed of road section | — | Inflow road sections, Outflow road sections | A linear method was used to normalize with 3. |
Road traffic status of road section | Smooth, Basic Smooth, Light Congestion, Moderate Congestion, Severe Congestion | Inflow road sections, Outflow road sections | 0, 0.25, 0.5, 0.75, 1 |
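
The normalization scheme in the table above can be sketched as a handful of lookups and scalings. The names below are hypothetical, and min-max scaling is assumed for the features that the table describes only as normalized by "a linear method"; the discrete values follow the table.

```python
# Categorical features map to the fixed values listed in the table.
ROAD_CLASS = {
    "Highway": 0.25,
    "Freeway": 0.5,
    "Trunk Road": 0.75,
    "Secondary Trunk Road": 1.0,
}
STATUS = {
    "Smooth": 0.0,
    "Basic Smooth": 0.25,
    "Light Congestion": 0.5,
    "Moderate Congestion": 0.75,
    "Severe Congestion": 1.0,
}


def time_features(is_weekday, is_rush, slot):
    """Encode day type, period type, and the slot index (1 to 4) within the hour."""
    if slot not in (1, 2, 3, 4):
        raise ValueError("slot must be 1, 2, 3, or 4")
    return [0.5 if is_weekday else 1.0, 1.0 if is_rush else 0.5, slot / 4]


def scale_count(n, max_n):
    """Scale a count to [0, 1]; max_n is 4 for adjacent sections, 5 for lanes."""
    return n / max_n


def scale_linear(x, lo, hi):
    """Min-max scaling (an assumption for the table's 'linear method')."""
    if hi == lo:
        return 0.0
    return (x - lo) / (hi - lo)


# Example: a weekday rush-hour timeslice in the third slot of the hour.
print(time_features(True, True, 3), ROAD_CLASS["Trunk Road"], scale_count(2, 5))
```

Mapping every feature into [0, 1] keeps any single feature from dominating the distance used in matching.
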
Road Class | Average Speed (km/h) |
---|---|
Highway | 81.60 |
Freeway | 69.55 |
Trunk Road | 39.58 |
Secondary Trunk Road | 28.61 |
ID | Road Classification | UpNum | DownNum | Road Length (m) | Diff (m) |
---|---|---|---|---|---|
443301 | Secondary Trunk Road | 3 | 1 | 1366 | 6 |
43203 | Secondary Trunk Road | 3 | 1 | 1348 | 24 |
681302 | Secondary Trunk Road | 3 | 1 | 1399 | 27 |
604301 | Secondary Trunk Road | 3 | 1 | 1331 | 41 |
593104 | Secondary Trunk Road | 3 | 1 | 1416 | 44 |
Congestion Degree | Smooth | Basic Smooth | Light Congestion | Moderate Congestion | Severe Congestion |
---|---|---|---|---|---|
Initial Status Probabilities | 0.607 | 0.264 | 0.079 | 0.043 | 0.007 |
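
Initial status probabilities of this kind are relative frequencies of the labeled statuses. The sketch below shows that estimate on toy labels; the function name and the sample data are assumptions, and only the final consistency check uses the values from the table above.

```python
from collections import Counter

STATES = [
    "Smooth",
    "Basic Smooth",
    "Light Congestion",
    "Moderate Congestion",
    "Severe Congestion",
]


def initial_probabilities(labels, states=STATES):
    """Relative frequency of each status among the observed labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {s: counts[s] / len(labels) for s in states}


# Toy labels, illustrative only.
labels = ["Smooth", "Smooth", "Basic Smooth", "Smooth"]
pi = initial_probabilities(labels)
print(pi["Smooth"], pi["Basic Smooth"], pi["Severe Congestion"])

# The reported probabilities form a valid distribution (they sum to 1).
reported = [0.607, 0.264, 0.079, 0.043, 0.007]
print(abs(sum(reported) - 1.0) < 1e-9)
```
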
Dataset | Road Classification | Average Precision (%): Ours | Average Precision (%): HMM-C |
---|---|---|---|
Data of Shenzhen City | Highway | 92.685 | 91.823 |
Data of Shenzhen City | Freeway | 91.750 | 91.663 |
Data of Shenzhen City | Trunk Road | 91.364 | 91.315 |
Data of Shenzhen City | Secondary Trunk Road | 90.559 | 92.219 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, F.; Liu, K.; Chen, J. Traffic Status Prediction Based on Multidimensional Feature Matching and 2nd-Order Hidden Markov Model (HMM). Sustainability 2023, 15, 14671. https://doi.org/10.3390/su152014671