Open Access Article

Peer-Review Record

Multi-Step Traffic Speed Prediction Based on Ensemble Learning on an Urban Road Network

Appl. Sci. 2021, 11(10), 4423; https://doi.org/10.3390/app11104423

by Bin Feng ¹, Jianmin Xu ¹, Yonggang Zhang ²,* and Yongjie Lin ¹,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 21 April 2021 / Revised: 8 May 2021 / Accepted: 11 May 2021 / Published: 13 May 2021

(This article belongs to the Topic Applied Computer Vision and Pattern Recognition)

Round 1

Reviewer 1 Report

Forecasting road traffic is an urgent issue for intelligent traffic systems. In their paper, the authors propose a multi-step algorithm based on machine learning for traffic prediction. After carefully evaluating the manuscript, I can list a number of notable strengths of the work:

It is clearly written and well-illustrated;
The proposed method is compared with its counterparts;
The comparison is performed at a high level, using statistical analysis;
The results are based on real data;
References include a lot of recent works.

Weaknesses of the manuscript include:

The time range for prediction is not mentioned. The authors call it 'short-term prediction', but this still needs clarification, e.g. in minutes;
The authors should pay more attention to the possible applications of such algorithms. E.g., are they suitable for intelligent route planning in mobile software?
There are some minor spelling errors, especially missing commas and articles.

Despite these non-dramatic flaws, I believe the work may be accepted after minor revision.

Author Response

Dear Reviewers:

Thank you so much for preparing and quickly sending us your comments concerning our revised manuscript “Multi-step traffic speed prediction based on ensemble learning on urban road network”. All comments are closely related to our study, and we have tried our best to address all the suggestions and concerns raised in the review letters. The major corrections and changes made in the revised manuscript are summarized as follows.

The time range for prediction is not mentioned. The authors call it 'short-term prediction', but this still needs clarification, e.g. in minutes;

ANS: Thank you for your comment. In this paper, we used a real dataset to predict link travel speed. The collected field data interval was 1 hour. We have modified the statement in the revised manuscript in the first paragraph in Section 4. The specific revisions in the resubmitted manuscript are cited as follows:

“In this study, the pilot dataset with the time interval of 1 hour, which was aggregated into 60-min intervals on Xingzhong Rd, was recorded over five weeks from October 21 to November 24, 2018.”

Moreover, we have also modified the statement in the revised manuscript in Section 5.1 first paragraph. The specific revisions in the resubmitted manuscript are cited as follows:

“Each subfigure shows one performance index of five prediction models under three scenarios with three kinds of prediction steps [(1h(60 min), 2h(120 min), 3h(180 min))] into future.”

The authors should pay more attention to the possible applications of such algorithms. E.g., are they suitable for intelligent route planning in mobile software?

ANS: Thank you for your comments. The objective of this study is to present a new method to predict link travel speed at urban signalized corridors. Definitely, the output of the developed algorithm can be used for traffic control, management and route planning, etc. In Section 6, we have added its application as below: “Moreover, the proposed model can also be integrated into some advanced ITS to alleviate traffic congestion, for example real-time route planning system, traffic management system and traffic signal control system.”

There are some minor spelling errors, especially missing commas and articles.

ANS: Thank you for your comment. We have corrected grammatical and punctuation errors in the article.

Author Response File: Author Response.docx

Reviewer 2 Report

In this document, the authors propose multi-step traffic speed forecasting using an ensemble learning model with a traffic speed detrending algorithm. I have the following comments to improve the quality of the article:
-Authors must submit the manuscript in the correct format provided by MDPI.
-The authors are using acronyms incorrectly. For example, line 20 has << support vector machine (SVM) >>, whereas the correct form is << Support Vector Machine (SVM) >>. This type of error should be corrected for all acronyms used throughout the manuscript. Also, every acronym must be defined.
-I suggest that the authors not split the Related Works Section into Subsections. In addition, I suggest further developing the state of the art by including these articles:
--Zambrano-Martinez, J. L., Calafate, C. T., Soler, D., Lemus-Zúñiga, L. G., Cano, J. C., Manzoni, P., & Gayraud, T. (2019). A centralized route-management solution for autonomous vehicles in urban areas. Electronics, 8(7), 722.
--Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. C Emerg. Technol. 2018, 54, 187–197.
-The authors must separate Section 4, leave only a Section of Case Studies where the experiments performed are entered, and the results can be presented in another Section. The other separate section is Discussion.
-In line 210, the word "the" is colored red.
-Who gave the authors the traffic dataset? For what reason are there only 5 weeks of data in this dataset?
-What are the authors' criteria for choosing 10 representative characteristics for the correlation analysis?
-What other case studies can be added to the article?
-Have the authors used any simulator to corroborate the data they have obtained?
-The authors mention 4 predictors. It would be advisable to give a brief introduction to each of them, with their advantages, disadvantages and characteristics.
-What is the difference between the prediction and reality after 1 hour, 2 hours, 3 hours? What is the reason that the prediction is made for each hour?
-Can the authors change Figure 4 from bars to a CDF, to give the reader a clearer overview of the prediction?
-Figure 6 does not need to be included in the article, because the reader already knows the meaning of a boxplot.
-Authors must improve the conclusions, and describe future work.

Author Response

Dear Reviewer:

Thank you so much for your comments concerning our revised manuscript “Multi-step traffic speed prediction based on ensemble learning on urban road network”. All comments are closely related to our study, and we have tried our best to address all the suggestions and concerns raised in the review letters. The major corrections and changes made in the revised manuscript are summarized as follows.

Authors must submit the manuscript in the correct format provided by MDPI

ANS: Thank you for your comment. We have re-submitted this paper in the correct template provided by MDPI.

The authors are using acronyms incorrectly. For example, line 20 has << support vector machine (SVM) >>, whereas the correct form is << Support Vector Machine (SVM) >>. This type of error should be corrected for all acronyms used throughout the manuscript. Also, every acronym must be defined.

ANS: Thank you for your comments. We have corrected the acronyms throughout the manuscript, for example: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) in the Abstract; Intelligent Transportation System (ITS) and Detrending Direct Strategy in Section 1; Kalman Filtering (KF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBOOST), Convolutional Neural Network (CNN), Evolving Fuzzy Neural Network (EFNN), and Attention Graph Convolutional Sequence-to-Sequence model (AGC-Seq2Seq) in Related Work.

I suggest that the authors not split the Related Works Section into Subsections. In addition, I suggest further developing the state of the art by including these articles:

--Zambrano-Martinez, J. L., Calafate, C. T., Soler, D., Lemus-Zúñiga, L. G., Cano, J. C., Manzoni, P., & Gayraud, T. (2019). A centralized route-management solution for autonomous vehicles in urban areas. Electronics, 8(7), 722.

--Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. C Emerg. Technol. 2018, 54, 187–197.

ANS: Thank you for your comments. We have merged the four subsections of Related Work into a single section. We also added the first reference to the second paragraph of Section 2, where it is cited as reference [17]. The second reference in the reviewer’s comment was already cited in this manuscript as reference [23].

The authors must separate Section 4, leave only a Section of Case Studies where the experiments performed are entered, and the results can be presented in another Section. The other separate section is Discussion.

ANS: Thank you for your comment. We have separated Section 4 into two sections. The new Section 4 describes the case study, while the new Section 5 covers the discussion of model validation.

In line 210, the word "the" is colored red

ANS: Thank you for your comment. That was a typo, and we have removed the red color.

Who gave the authors the traffic dataset? For what reason are there only 5 weeks of data in this dataset?

ANS: Thank you for your comment. We obtained the traffic dataset from the Zhongshan Traffic Police Detachment; it had been collected by ITS with Internet Plus. We described this in the first paragraph of Section 4. This study aims to investigate the speed trend over one month, so we used four weeks of data as the training set and one week of data as the testing set.

What are the authors' criteria for choosing 10 representative characteristics for the correlation analysis?

ANS: Thank you for your comment. In order to determine the appropriate model inputs, this study initially chose ten spatiotemporal candidate variables of travel speed and flow for correlation analysis, as shown in Table 1. These cover time of day, day of week, and upstream and downstream connected links. The variables found to be highly relevant in the correlation analysis became the final input variables. We have modified the statement in the revised manuscript in the first paragraph of Section 4.
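
For illustration, the correlation-based screening described in this response might be sketched as follows. This is a minimal sketch, not the authors' procedure: the use of the Pearson coefficient, the 0.5 threshold, the variable names and the toy values are all assumptions, since the record does not state how relevance was measured.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no linear information
    return cov / sqrt(var_x * var_y)


def select_inputs(candidates, target, threshold=0.5):
    """Keep candidates whose |r| with the target meets the (assumed) threshold."""
    return [
        name
        for name, values in candidates.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Hypothetical toy values (km/h); not taken from the paper's dataset.
target = [30.0, 28.0, 22.0, 25.0, 31.0]
candidates = {
    "upstream_speed": [31.0, 27.0, 21.0, 26.0, 32.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0, 1.0],
}
selected = select_inputs(candidates, target)
```

With these toy values only the upstream speed passes the screen, because the constant series has zero variance and is treated as uncorrelated.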

What other case studies can be added to the article?

ANS: Thank you for your comment. So far, we have employed only two links to validate the proposed prediction model with the five-week field dataset, due to the difficulty and high cost of data collection. However, this dataset was collected in a typical signalized intersection scenario, where the scenario factors included traffic flow, traffic speed, etc. In future studies, we will do our best to collect more data sources to validate the accuracy of the model, including for other cities.

Have the authors used any simulator to corroborate the data they have obtained?

ANS: Thank you for your comment. We used a real dataset to predict the traffic flow speed. This dataset comes from the same data collection system as the dataset in reference 39, and we applied the same pre-processing procedure as in reference 39 for data verification. In fact, we did not build a simulator to verify the data due to the signal timing plan and vehicle trajectory data.

The authors mention 4 predictors, it would be advisable to give a brief introduction to each of them with their advantages and disadvantages and of course their characteristics.

ANS: Thank you for your comment. We have modified the statement in the revised manuscript in Section 5. The specific revisions in the resubmitted manuscript are cited as follows:

“The proposed forecasting models in this study are evaluated by comparing with four other predictors: SVM, CATBOOST, KNN, and BAGGING (the average result of SVM, CATBOOST and KNN into an ensemble learning). Among, SVM could deal with overfitting problem and have good generalization performance because SVM can construct a mapping from one dimensional input vector into high-dimensional space by the use of reproducing kernels. Furthermore, the SVM is also slow in the test phase due to the high algorithm complexity and needs a large memory capacity to calculate. CATBOOST uses an efficient gradient modification of ordered boosting to overcome the problem of target leakage, and it performs well in small datasets, but training a CATBOOST model requires a lot of time and compute memory. KNN is suitable for small data sets but it is usually hysteretic in time series. BAGGING is a combination of KNN, SVM and CATBOOST, and outperforms each individual method.”
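
As a reading aid, the BAGGING benchmark as described in the passage above (the average result of SVM, CATBOOST and KNN) can be sketched as a plain per-step average. This is a minimal sketch under that description only: the function name and the prediction values are hypothetical, and it does not implement bootstrap resampling, which the record does not mention.

```python
def average_ensemble(predictions_by_model):
    """Average the predictions of several models, time step by time step."""
    series = list(predictions_by_model.values())
    if not series:
        raise ValueError("at least one model is required")
    length = len(series[0])
    if any(len(s) != length for s in series):
        raise ValueError("all models must predict the same number of steps")
    # zip(*series) groups the predictions of all models for each time step.
    return [sum(step) / len(series) for step in zip(*series)]


# Hypothetical speed predictions (km/h) from three already-trained models.
predictions = {
    "SVM": [30.0, 27.0, 24.0],
    "CATBOOST": [32.0, 29.0, 25.0],
    "KNN": [31.0, 25.0, 26.0],
}
ensemble = average_ensemble(predictions)
```

Averaging tends to cancel errors that the individual models do not share, which is one plausible reading of why the combination is reported to outperform each single method.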

What is the difference between the prediction and reality after 1 hour, 2 hours, 3 hours? What is the reason that the prediction is made for each hour?

ANS: Thank you for your comment. We used four MOEs (MAPE, MAE, MSE, CV) to evaluate the proposed model. Prediction accuracy tended to decrease as the prediction horizon grew (1 hour, 2 hours, 3 hours). Section 5.1 shows the prediction accuracy of each model. The second paragraph of Section 5.1 points out that the MAPE of northbound DDSELM was 1.16 percentage points lower than that of KNN in one-step-ahead prediction (7.08% versus 8.24%), 1.58 percentage points lower in two-step-ahead prediction (8.77% versus 10.35%), and 1.56 percentage points lower in three-step-ahead prediction (10.34% versus 11.90%).

In this paper, the time interval of the data source is 1 hour, so the minimal prediction interval is set to 1 hour in this study. We also tested the prediction model with longer prediction time steps, namely 2 h and 3 h.
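
The four measures named in this response can be sketched as below, to make the comparison concrete. MAE, MSE and MAPE follow their standard definitions. The CV formula (RMSE divided by the mean observed value, in percent) is an assumption, because the record does not define it; the helper `mape_gap` and the two-point toy series are likewise hypothetical.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (observations must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def cv(actual, predicted):
    """ASSUMED definition: RMSE divided by the mean observation, in percent."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * sqrt(mse(actual, predicted)) / mean_actual


def mape_gap(model_mape, baseline_mape):
    """Return the gap in percentage points and as a relative reduction (%)."""
    points = baseline_mape - model_mape
    return points, 100.0 * points / baseline_mape


# Hypothetical observed and predicted speeds (km/h).
actual = [40.0, 50.0]
predicted = [36.0, 55.0]
scores = (mae(actual, predicted), mse(actual, predicted),
          mape(actual, predicted), cv(actual, predicted))

# One-step-ahead northbound figures quoted in the response above.
points, relative = mape_gap(7.08, 8.24)
```

The `mape_gap` helper separates the two ways of reading the quoted difference: 7.08% versus 8.24% is a gap of about 1.16 percentage points, which corresponds to a relative reduction of roughly 14%.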

Can the authors change Figure 4 from bars to a CDF, to give the reader a clearer overview of the prediction?

ANS: Thank you for your comment. The CDF does provide a clear description of the model's prediction accuracy. We used bars in Figure 4 to illustrate the overall prediction accuracy, while a CDF curve is also provided in Figure 8 to give an overview of the prediction reliability.

Figure 6 does not need to be included in the article, because the reader already knows the meaning of a boxplot.

ANS: Thank you for your comment. Figure 6 has been removed.

Authors must improve the conclusions, and write what are the future works.

ANS: Thank you for your comment. We have modified the statement in the revised manuscript in Section 6. The specific revisions in the resubmitted manuscript are cited as follows:

“In order to tackle the challenge of multi-step traffic speed prediction, we proposed an ensemble model, i.e., Detrending and Direct Strategy Ensemble Learning Model (DDSELM). Detrending technique could separate original dataset into mean trends and residuals, and direct strategy could decrease the cumulative error in prediction process. To validate the effectiveness of our model, we used several benchmark models as comparison model including SVM, CATBOOST, KNN and BAGGING, based on a field dataset collected in the city of Zhongshan, China. Predictive result showed that our model outperformed four benchmark ones in terms of the MAPE, MAE, MSE and CV under three prediction intervals. For one-step-ahead prediction, the MAPE of DDSELM for northbound segments is 7.08% (14.90% for southbound) segments. For two-step-ahead and three-step-ahead prediction, the MAPE of DDSELM for north-bound segments is 8.77% and 10.34% (16.99% and 17.82% for southbound segments), respectively. In future works, it’s necessary to consider the impact of road network characteristics and specific incidents on prediction accuracy. Moreover, the proposed model can also be integrated into some advanced ITS to alleviate traffic congestion, for example real-time route planning system, traffic management system and traffic signal control system.”
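
The two ideas named in the conclusion above, detrending and the direct strategy, can be illustrated with a short sketch. It is written under stated assumptions and is not the DDSELM implementation: the mean trend is taken here as the average value at each position in a repeating period (for hourly data, a time-of-day profile with `period=24`), and the direct strategy is shown only as the construction of a separate training set per horizon. The function names, lag count and toy series are hypothetical.

```python
def detrend(series, period=24):
    """Split a series into a periodic mean profile and residuals.

    ASSUMPTION: the mean trend is the average at each position in the period.
    """
    sums = [0.0] * period
    counts = [0] * period
    for i, value in enumerate(series):
        sums[i % period] += value
        counts[i % period] += 1
    profile = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    residuals = [value - profile[i % period] for i, value in enumerate(series)]
    return profile, residuals


def direct_training_pairs(residuals, n_lags, horizon):
    """Build (lagged inputs, target `horizon` steps ahead) pairs for ONE horizon.

    Under the direct strategy each horizon gets its own pairs and its own
    model, so no prediction is fed back as an input to a later step.
    """
    pairs = []
    for t in range(n_lags - 1, len(residuals) - horizon):
        inputs = residuals[t - n_lags + 1 : t + 1]
        pairs.append((inputs, residuals[t + horizon]))
    return pairs


# Hypothetical toy series with a period of 2 instead of 24, for brevity.
series = [10.0, 20.0, 12.0, 22.0, 14.0, 24.0]
profile, residuals = detrend(series, period=2)

# One training set per prediction step (1, 2 and 3 steps ahead).
datasets = {h: direct_training_pairs(residuals, n_lags=2, horizon=h)
            for h in (1, 2, 3)}
```

A forecast for step h would then be the output of the model trained on `datasets[h]` plus the profile value at the target time. Longer horizons leave fewer training pairs, which is a cost of the direct strategy on a short dataset.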

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

Thanks to the authors for performing the relevant changes that were suggested by the reviewers. However, the title of Table 2 must be next to the same Table.
