Optimal Aggregate Size of Traffic Sequence Data Based on Fuzzy Entropy and Mutual Information
Abstract
1. Introduction
2. The Measure of Degree of Dependence between Sequences—Mutual Information
3. Measure Methods of Sequence Complexity
3.1. Approximate Entropy
- 1. Suppose a time series of N points, obtained by sampling at equal time intervals, is u(1), u(2), u(3), …, u(N);
- 2. Define two hyperparameters m and r, where m is an integer greater than or equal to 2 that gives the length of the compared vectors, and r is a real number that sets the similarity tolerance. The smaller r is, the stricter the similarity requirement; r generally takes a value of 0.15~0.25;
- 3. Reconstruct the time series as X(1), X(2), …, X(N − m + 1), where X(i) is the vector [u(i), u(i + 1), …, u(i + m − 1)];
- 4. Pair up any two vectors X(i), X(j) of the reconstructed sequence, where i and j are integers in the interval [1, N − m + 1], and count the number of vector pairs that satisfy the distance condition $d[X(i), X(j)] \le r$. The Chebyshev distance is often used as the vector distance: $d[X(i), X(j)] = \max_{0 \le k \le m-1} |u(i+k) - u(j+k)|$;
- 5. Calculate the average number of paired vectors satisfying the distance condition for each vector: $C_i^m(r) = \frac{1}{N-m+1} \left| \{\, j : d[X(i), X(j)] \le r \,\} \right|$;
- 6. Calculate the approximate entropy: $\mathrm{ApEn}(m, r) = \Phi^m(r) - \Phi^{m+1}(r)$, where $\Phi^m(r) = \frac{1}{N-m+1} \sum_{i=1}^{N-m+1} \ln C_i^m(r)$.
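The six steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `chebyshev`, `phi`, and `approximate_entropy` are hypothetical, r is treated as an absolute tolerance (no scaling by the standard deviation), and self-matches are counted, as the pairing rule in step 4 allows i = j.

```python
import math


def chebyshev(a, b):
    """Maximum absolute component-wise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def phi(u, m, r):
    """Mean of ln C_i^m(r) over all reconstructed vectors of length m."""
    n = len(u) - m + 1                           # number of vectors X(i)
    vectors = [u[i:i + m] for i in range(n)]     # step 3: reconstruction
    total = 0.0
    for xi in vectors:
        # Step 4: count vectors within tolerance r (self-match included,
        # so the count is at least 1 and the logarithm is always defined).
        matches = sum(1 for xj in vectors if chebyshev(xi, xj) <= r)
        total += math.log(matches / n)           # step 5, then ln
    return total / n


def approximate_entropy(u, m, r):
    """Step 6: ApEn = phi(m) - phi(m + 1). Assumes len(u) >= m + 1."""
    return phi(u, m, r) - phi(u, m + 1, r)


if __name__ == "__main__":
    periodic = [0.0, 1.0] * 20
    chaotic = [0.3]
    for _ in range(199):                         # logistic map, hypothetical test signal
        chaotic.append(4.0 * chaotic[-1] * (1.0 - chaotic[-1]))
    print(approximate_entropy(periodic, 2, 0.2))
    print(approximate_entropy(chaotic, 2, 0.2))
```

A regular series such as the alternating 0, 1 signal gives a value close to zero, while an irregular series gives a larger one, which is the behaviour the complexity measure is meant to capture. The double loop is quadratic in N, so long sequences would need a vectorised variant.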
3.2. Sample Entropy
- 1–3. The first three steps of the sample entropy calculation are the same as for approximate entropy;
- 4. Count the number of vector pairs that satisfy the distance condition. As in approximate entropy, the vector distance is the Chebyshev distance. Unlike approximate entropy, however, sample entropy requires not only that i and j be integers in the interval [1, N − m + 1] but also that i not be equal to j, so self-matches are excluded;
- 5. Calculate the average number of paired vectors that satisfy the distance condition for each vector, denoted here $B^m(r)$ for vector length m;
- 6. Calculate the sample entropy: $\mathrm{SampEn}(m, r) = -\ln \frac{B^{m+1}(r)}{B^m(r)}$.
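The difference from approximate entropy is easiest to see in code. The sketch below is illustrative only: `match_fraction` and `sample_entropy` are hypothetical names, r is an absolute tolerance, and the match counts are normalised by the number of ordered pairs with i ≠ j, which is one of several conventions in use. Returning infinity when no matches exist is also an assumption of this sketch.

```python
import math


def chebyshev(a, b):
    """Maximum absolute component-wise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def match_fraction(u, m, r):
    """Fraction of ordered pairs (i, j), i != j, whose vectors lie within r."""
    n = len(u) - m + 1
    vectors = [u[i:i + m] for i in range(n)]
    matches = 0
    for i in range(n):
        for j in range(n):
            # Self-matches are excluded: this is the key change from ApEn.
            if i != j and chebyshev(vectors[i], vectors[j]) <= r:
                matches += 1
    return matches / (n * (n - 1))


def sample_entropy(u, m, r):
    """SampEn = -ln(A / B) with B, A the match fractions for lengths m, m + 1.

    Assumes len(u) >= m + 2 so that at least two vectors of length m + 1 exist.
    """
    b = match_fraction(u, m, r)
    a = match_fraction(u, m + 1, r)
    if a == 0 or b == 0:
        # No matches: the logarithm is undefined; this sketch returns infinity.
        return math.inf
    return -math.log(a / b)


if __name__ == "__main__":
    periodic = [0.0, 1.0] * 20
    print(sample_entropy(periodic, 2, 0.2))
```

Because self-matches are dropped, the logarithm is no longer guaranteed to be defined for short or very irregular sequences, which is why the sketch handles the zero-match case explicitly.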
3.3. Fuzzy Entropy
- 1–3. The first three steps of the fuzzy entropy calculation are the same as for approximate entropy and sample entropy;
- 4. Sum the fuzzy memberships of all vector pairs. In the fuzzy entropy calculation, the vector distance no longer uses the Chebyshev distance, and the distance is no longer checked against a hard threshold. Instead, the fuzzy membership of each vector pair is calculated and the memberships are summed;
- 5. Calculate the average fuzzy membership for each vector, denoted here $\Omega^m(r)$ for vector length m;
- 6. Calculate the fuzzy entropy: $\mathrm{FuzzyEn}(m, r) = \ln \Omega^m(r) - \ln \Omega^{m+1}(r)$.
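A sketch of the fuzzy entropy steps follows. The text does not state the distance or the membership function, so everything specific here is an assumption: each vector is centred by subtracting its own mean, the distance is the maximum absolute difference between centred vectors (as the Dist subfunction of Algorithm A3 suggests), and the membership is the exponential form exp(−dⁿ / r) with n = 2. The names `centred`, `membership`, `mean_membership`, and `fuzzy_entropy` are hypothetical.

```python
import math


def centred(v):
    """Remove the vector's own mean (assumed baseline removal)."""
    mean = sum(v) / len(v)
    return [x - mean for x in v]


def max_abs_diff(a, b):
    """Maximum absolute component-wise difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


def membership(d, r, n=2):
    """Assumed exponential membership: 1 at d = 0, decaying smoothly with d."""
    return math.exp(-(d ** n) / r)


def mean_membership(u, m, r, n=2):
    """Average membership over all ordered vector pairs with i != j."""
    count = len(u) - m + 1
    vectors = [centred(u[i:i + m]) for i in range(count)]
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i != j:
                # Soft similarity replaces the hard test "distance <= r".
                total += membership(max_abs_diff(vectors[i], vectors[j]), r, n)
    return total / (count * (count - 1))


def fuzzy_entropy(u, m, r, n=2):
    """FuzzyEn = ln(mean membership at m) - ln(mean membership at m + 1)."""
    return (math.log(mean_membership(u, m, r, n))
            - math.log(mean_membership(u, m + 1, r, n)))


if __name__ == "__main__":
    periodic = [0.0, 1.0] * 20
    print(fuzzy_entropy(periodic, 2, 0.2))
```

Since every pair contributes a strictly positive membership, the logarithms are always defined, which avoids the zero-match problem that sample entropy can run into on short sequences.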
4. Simulation to Obtain Experimental Data
5. Calculation of Sequence Information
5.1. Calculation of the Mutual Information of and
5.2. Calculation of the Mutual Information of and
5.3. Calculation of the Mutual Information of and
5.4. Calculation of the Complexity of Sequence
5.5. Select the Optimal Aggregate Size
6. Compare Different Aggregation Sizes
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
Algorithm A1: Calculate the approximate entropy of the sequence
Input:
U: an array holding a time series of N points sampled at equal time intervals.
r: a hyperparameter that controls the similarity tolerance of the sequence.
m: the length of the compared vectors.
Output:
ApEn: approximate entropy of the sequence.
function ApEnCalculate (U, r, m) /*main function, calculate approximate entropy*/
  ApEn ← PhiCalculate (U, m, r) − PhiCalculate (U, m + 1, r)
  return ApEn
end function
function PhiCalculate (U, m, r) /*subfunction, return Φ of the reconstructed sequence*/
  N ← length (U)
  for i in 1:(N − m + 1)
    X(i) ← [U(i), U(i + 1), …, U(i + m − 1)] /*reconstruct the sequence*/
  end
  C ← array of N − m + 1 zeros
  for i in 1:(N − m + 1)
    for j in 1:(N − m + 1)
      if Dist (X(i), X(j), m) <= r then C(i) ← C(i) + 1
    end
  end
  Φ ← mean over i of ln (C(i) / (N − m + 1))
  return Φ
end function
function Dist (a, b, m) /*subfunction, returns the maximum absolute difference between the two input vectors*/
  d ← 0
  for i in 1:m
    if |a(i) − b(i)| > d then d ← |a(i) − b(i)|
  end
  return d
end function
Algorithm A2: Calculate the sample entropy of the sequence
Input:
U: an array holding a time series of N points sampled at equal time intervals.
r: a hyperparameter that controls the similarity tolerance of the sequence.
m: the length of the compared vectors.
Output:
SpEn: sample entropy of the sequence.
function SpEnCalculate (U, r, m) /*main function, calculate sample entropy*/
  SpEn ← ln (PsiCalculate (U, m, r)) − ln (PsiCalculate (U, m + 1, r))
  return SpEn
end function
function PsiCalculate (U, m, r) /*subfunction, return Ψ of the reconstructed sequence*/
  N ← length (U)
  for i in 1:(N − m + 1)
    X(i) ← [U(i), U(i + 1), …, U(i + m − 1)] /*reconstruct the sequence*/
  end
  C ← array of N − m + 1 zeros
  for i in 1:(N − m + 1)
    for j in 1:(N − m + 1)
      if i == j then continue /*exclude self-matches*/
      if Dist (X(i), X(j), m) <= r then C(i) ← C(i) + 1
    end
  end
  Ψ ← mean over i of C(i) / (N − m)
  return Ψ
end function
function Dist (a, b, m) /*subfunction, returns the maximum absolute difference between the two input vectors*/
  d ← 0
  for i in 1:m
    if |a(i) − b(i)| > d then d ← |a(i) − b(i)|
  end
  return d
end function
Algorithm A3: Calculate the fuzzy entropy of the sequence
Input:
U: an array holding a time series of N points sampled at equal time intervals.
r: a hyperparameter that controls the similarity tolerance of the sequence.
m: the length of the compared vectors.
Output:
FuzzyEn: fuzzy entropy of the sequence.
function FuzzyEnCalculate (U, r, m) /*main function, calculate fuzzy entropy*/
  FuzzyEn ← ln (OmegaCalculate (U, m, r)) − ln (OmegaCalculate (U, m + 1, r))
  return FuzzyEn
end function
function OmegaCalculate (U, m, r) /*subfunction, return Ω of the reconstructed sequence*/
  N ← length (U)
  for i in 1:(N − m + 1)
    X(i) ← [U(i), U(i + 1), …, U(i + m − 1)] /*reconstruct the sequence*/
  end
  D ← array of N − m + 1 zeros
  for i in 1:(N − m + 1)
    for j in 1:(N − m + 1)
      if i == j then continue /*exclude self-matches*/
      D(i) ← D(i) + μ (Dist (X(i), X(j), m), r) /*μ: fuzzy membership function*/
    end
  end
  Ω ← mean over i of D(i) / (N − m)
  return Ω
end function
function Dist (a, b, m) /*subfunction, returns the maximum absolute difference between the two input vectors*/
  d ← 0
  for i in 1:m
    if |a(i) − b(i)| > d then d ← |a(i) − b(i)|
  end
  return d
end function
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, J.; Li, W.; Lian, G. Optimal Aggregate Size of Traffic Sequence Data Based on Fuzzy Entropy and Mutual Information. Sustainability 2022, 14, 14767. https://doi.org/10.3390/su142214767