Article
Peer-Review Record

Traffic Signal Control System Based on Intelligent Transportation System and Reinforcement Learning

Electronics 2021, 10(19), 2363; https://doi.org/10.3390/electronics10192363
by Julián Hurtado-Gómez 1, Juan David Romo 1, Ricardo Salazar-Cabrera 1,*, Álvaro Pachón de la Cruz 2 and Juan Manuel Madrid Molina 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 30 August 2021 / Revised: 24 September 2021 / Accepted: 24 September 2021 / Published: 28 September 2021
(This article belongs to the Special Issue Intelligent Transportation Systems (ITS), Volume II)

Round 1

Reviewer 1 Report

The purpose of the paper, “Traffic Signal Control System based on Intelligent Transportation System and Reinforcement Learning”, is to introduce a traffic signal control system containing various modules with advanced models. Although the paper generally presents its purpose and the process of applying the proposed methodology logically, here are a few comments that might strengthen the paper and address potential weaknesses.

First, the test site is too small, meaning that the application of Q-learning at a single crossroad does not seem valuable. When we look at the paper “Traffic Signal Optimization for Oversaturated Urban Networks: Queue Growth Equalization”, it handles a 3-by-3 traffic network with an optimization algorithm to demonstrate its validity. Therefore, a better justification needs to be given if the paper continues with the single-crossroad case.

Second, more algorithms should be considered if the paper wants to support the claim that Q-learning is the best choice in this case. I cannot deny that Q-learning is one of the most widely used RL algorithms, but there are other algorithms, as the paper mentions, including deep Q-network (DQN) in “Deep reinforcement learning for traffic signal control under disturbances: A case study on Sunway city, Malaysia” and Queue Growth Equalization in “Traffic Signal Optimization for Oversaturated Urban Networks: Queue Growth Equalization”. Since the paper only compares with two basic cases, it needs to explain whether the basic and fixed-time cases are sufficient.

Finally, there are a few minor mistakes. For example, a period is missing at the end of the sentence in line 215, and the spacing from lines 738 to 741 and from lines 747 to 745 differs from the other paragraphs. Formulas should also be set in italic font.

Reference

Jang, K.; Kim, H.; Jang, I.G. Traffic Signal Optimization for Oversaturated Urban Networks: Queue Growth Equalization. IEEE Transactions on Intelligent Transportation Systems. 2015, 16(4), pp. 2121–2128.

Rasheed, F.; Yau, K.L.A.; Low, Y.C.; Wen, P. Deep reinforcement learning for traffic signal control under disturbances: A case study on Sunway city, Malaysia. Future Generation Computer Systems. 2020, 109, pp. 431–445.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper tackles the problem of traffic signal control in densely populated urban areas by applying ITS and reinforcement learning. The topic is very timely, and much research is currently being done to improve the applicability of reinforcement learning in real-world adaptive traffic signal control. A new ITS architecture is proposed for developing countries, and a simple RL-based traffic signal control is implemented. The first is the strong point of the paper. The second must be significantly improved. Namely, traffic flows at an intersection constitute a very dynamic process to be controlled using RL. In Fig. 8, if I understand correctly, you generate your Q-table without applying the Q-update function. You just update the generated Q-table using a database of collected traffic states and hard-coded reward values. The problem is that such an approach can only be applied to a stationary process, like the basic RL example of navigating a mobile robot in a stationary grid world. Once you change your signal program, you affect all your future states, as vehicles arrive randomly at your intersection. Thus, you cannot learn an optimal traffic controller without a simulator. You can use your baseline approach to generate an initial Q-table, but will it be optimal? Surely not! It is very unclear how you generate your first Q-table and refine it without a simulator. When writing a paper, you can only claim a comparison with other approaches that you have either implemented yourself or for which you use data from the same experiment in another paper. Have you built a new simulation model using your own data, if I understand everything correctly?

For example, use the papers 10.1109/ELMAR49956.2020.9219024, 10.1109/ITSC.2012.6338707, 10.1016/j.ifacol.2016.07.002, and 10.1016/j.aei.2018.08.002 for help.
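For reference, the tabular Q-update step referred to above looks roughly as follows. This is only a minimal sketch, not the authors' implementation: the state encoding (e.g., discretized queue lengths), the action set (signal phases), the reward, and the parameter values are all illustrative assumptions.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate (assumed value)
GAMMA = 0.9   # discount factor (assumed value)

Q = defaultdict(float)   # Q[(state, action)] -> estimated long-term return

def q_update(state, action, reward, next_state, actions):
    """Apply one Q-learning update for a single observed transition."""
    best_next = max(Q[(next_state, a)] for a in actions)
    td_target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
```

The key point is that the update depends on the state actually reached after applying the chosen phase, which is why filling the table offline from hard-coded rewards does not capture the dynamics of the controlled intersection.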

To improve the quality of the paper, pay attention to the following:

  • Use the same style when writing keywords, equations, figure captions, and literature entries throughout the paper;
  • Do not create paragraphs consisting of only one sentence; group more sentences into one paragraph;
  • Always carefully analyze what the methods you use do, and explain them accordingly;
  • Use an abbreviation only after you have defined it;
  • Subsection 2.1 describes the methodology of the literature review, but the authors' state-of-the-art review is missing (it is needed to argue the novelty compared to existing research);
  • Do not use vague words like various, many, etc.; mention at least some examples in brackets;
  • Give more details about your learning process; RL-based systems learn during interaction with the environment in the exploration phase (see the sketch after this list). Or did you use a learning process where your traffic controller worked in parallel with a traditional traffic signal control system?
  • Use a longer video for testing your queue-length estimation based on computer vision;
  • You cannot claim to have similar results as DLR without implementing this approach.
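Regarding the learning-process point above, exploration is typically realized with an epsilon-greedy policy such as the one sketched below; the Q-table, the action set, and the exploration rate are illustrative assumptions rather than anything taken from the paper.

```python
import random
from collections import defaultdict

Q = defaultdict(float)   # Q[(state, action)] -> current value estimate
EPSILON = 0.1            # exploration rate (assumed value)

def choose_action(state, actions):
    """Epsilon-greedy selection: pick a random phase with probability
    EPSILON (exploration), otherwise the best phase in the current
    Q-table (exploitation)."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])
```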

There are additional comments in the attached PDF.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The paper is now significantly improved. The author's response describes more clearly how the learning of the proposed TSC is performed.

To improve the quality of the paper for publication, pay attention to the following. The paragraph in lines 454-458 is unclear. It must be emphasized that you use an offline approach to generate the initial Q-table from a previously recorded response of the underlying traffic process (the intersection) to be controlled. This is done to speed up the learning phase needed before the system can be implemented. After that, the Q-table is refined in direct interaction with the environment, as described in the paper.
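A minimal sketch of this two-stage scheme is given below, assuming the recorded responses are available as (state, action, reward, next_state) tuples and that the live intersection is wrapped in a hypothetical env interface with reset()/step(); none of these names or values come from the paper.

```python
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.9          # assumed learning rate and discount factor
Q = defaultdict(float)           # Q[(state, action)] -> value estimate

def q_update(s, a, r, s_next, actions):
    best_next = max(Q[(s_next, a2)] for a2 in actions)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

# Stage 1: offline warm-start of the Q-table from previously recorded
# transitions of the intersection (speeds up the learning phase).
def warm_start(logged_transitions, actions):
    for s, a, r, s_next in logged_transitions:
        q_update(s, a, r, s_next, actions)

# Stage 2: online refinement in direct interaction with the environment
# (env is a hypothetical interface exposing reset() and step()).
def refine_online(env, actions, episodes=100):
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            a = max(actions, key=lambda x: Q[(s, x)])   # greedy for brevity
            s_next, r, done = env.step(a)
            q_update(s, a, r, s_next, actions)
            s = s_next
```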

There are additional comments in the attached PDF.

Comments for author File: Comments.pdf

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
