Article
Peer-Review Record

A Resilient Intelligent Traffic Signal Control Scheme for Accident Scenario at Intersections via Deep Reinforcement Learning

Sustainability 2023, 15(2), 1329; https://doi.org/10.3390/su15021329
by Zahra Zeinaly 1, Mahdi Sojoodi 1,* and Sadegh Bolouki 2
Reviewer 1:
Reviewer 2:
Reviewer 3: Anonymous
Reviewer 4:
Submission received: 5 November 2022 / Revised: 27 December 2022 / Accepted: 6 January 2023 / Published: 10 January 2023

Round 1

Reviewer 1 Report

The key problem of this study is that the simulation experiments DID NOT prove that the proposed traffic signal control system is resilient to accidents. Every time the location of the accident(s) was changed, the control algorithm was RE-TRAINED. A robust control algorithm should be able to handle accidents occurring anywhere WITHOUT re-training. I will need to see experiments in which, for example, the algorithm is trained with the accident occurring westbound but tested with the accident occurring southbound.
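
The experiment requested above could be organised as a grid of training and testing locations in which the two always differ. The following is only a minimal sketch of that bookkeeping: the function name is hypothetical, and the eastbound and northbound entries are assumed for illustration, since the comment itself names only westbound and southbound.

```python
from itertools import product

def cross_location_pairs(locations):
    """Return (train_location, test_location) pairs where the two differ.

    Hypothetical helper: each pair means "train the agent with the accident
    at train_location, then evaluate it WITHOUT re-training at test_location".
    """
    return [(train, test)
            for train, test in product(locations, repeat=2)
            if train != test]

# Assumed list of approaches; only westbound and southbound appear in the review.
approaches = ["westbound", "eastbound", "northbound", "southbound"]
pairs = cross_location_pairs(approaches)

for train, test in pairs:
    print(f"train: {train:<10} -> test: {test}")
```

With four approaches this gives twelve train/test combinations, including the westbound-to-southbound case mentioned in the comment.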

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

This is a very interesting article with promising results worth publishing. I have no complaints about the descriptions. However, I would like to point out the lack of connection between the simulated state and practical use. I would therefore ask the authors to extend the Conclusions. Please summarize the possibilities of solving the problems by answering the following questions:

1) What type of traffic light controllers can support such complex control algorithms?

2) Can the algorithm be used only in computer control systems as an additional application, or can it be part of an isolated signaling controller?

3) What are the limitations of using the proposed algorithm?

4) What kind of detection devices will be required in practice, and how should they be calibrated to obtain satisfactory prediction results?

5) Have the authors tried, or do they intend, to implement the developed solution in reality and check the simulation results against empirical results for the length of vehicle queues? If so, what are the results of the research?

6) Have the authors performed, or do they intend to perform, a simulation analysis of a variable traffic load at the intersection, i.e. one containing short-term fluctuations (e.g. variable periods of 15-minute intervals, sometimes larger, sometimes smaller, sometimes constant)? If so, what are the results of the research?

 

If possible, please enrich the last chapter with answers to the above questions.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

The abstract should be improved to be more concrete about the novelty and the findings.

The paper reproduces the scenario, procedures and data referenced in citation [28] in order to improve the state coding scheme for the Deep Reinforcement Learning strategy, adding the new set of variables “queue length” as previously described in other references such as [33]. This strategy is applied to a spurious incident on the road, analyzing how the system learns from this new situation and showing that the learning curve is shortened and traffic congestion is reduced.

The paper is clearly written, but some typos, errors and clarifications should be addressed:

-          [General] Please review indentation of equations, tables and figures. They should be centered and not left anchored.

-          [line 51] The agent concept is not clearly presented and is probably left to the reading of the references, but as it is a central concept it should be clarified. Moreover, the original ref [28] presents it much more clearly. In my understanding, the agent should represent the whole intersection system.

-          [line 112] There is no justification for the usage of the Q-Learning algorithm other than its usage in previous references.

-          [Figure 2] Why not add the light control to the scenario representation, as the lights are the main actuator?

-          [Table 1] Some elements are cut-off. Please correct it.

-          [Eq.2] Equation is trivial, suggested removal.

-          [Figure 3] Elements are cut-off. Please correct it.

-          [219-225] Some concepts are repeated. Please simplify.

-          [227] The concept T is used without having been described before.

-          [227] Cell sizes and grouping are not described. They are critical for the problem statement.

-          [237] States, queue lengths and P-state are all of T-cardinality? I don’t understand this.

-          [276] The input layer consists of 164 neurons. Why? How is this number justified?

-          [276-277] There is no argumentation for the structure of the Deep Neural Network. Why 4 layers? Why 100 neurons in layers 2 and 3? The only explanation I could find is that [28] uses this configuration. Why? Moreover, if the input is changed from 80 [28] to 164, shouldn’t the internal structure of the DNN be changed?

-          [227] Why is ReLU used?

-          [334] We can find in Ref. [28] an argumentation about usage of Stochastic Gradient applied to MSE. Why is it not explained in the paper?

-          [4.1 – 343] Traffic demand is described as a Weibull distribution, but it is not clarified how the demand is applied to every flow in the scenario.

-          [362] One episode = 5400 steps, one step = one second → one episode = 90 minutes. How is a whole-day traffic demand curve used for this 90-minute simulation?

-          [Figure 20] The footer is displaced from the graphic.

-          [Table 4] Elements are cut-off. Please correct it.

-          [500] Multiple-accident scenarios are mentioned but not represented. In fact, the paper compares single-accident scenarios at 30 or 50 m from the intersection against multiple-accident scenarios at distances of 700–730 from the intersection. Wouldn’t it be logical to compare multiple accidents at a similar distance from the intersection?

-          [517] “The simulation results show that the algorithm learned a stable strategy regardless of the location and time of accidents”. There is little or no evidence for this statement, as no time-related experiment is shown and only incidents close to the intersection (before entering it) are considered, while there are many other sensitive locations that have not been addressed.
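
The questions at [4.1 – 343] and [362] concern how a Weibull-shaped demand curve fits into a 5400-step episode. One possible reading, sketched below, is that departure times are drawn from a Weibull distribution and then rescaled onto the episode length, so that only the shape of the curve is kept. This is an assumption for illustration and not the authors' confirmed procedure; the function name, shape parameter and vehicle count are hypothetical.

```python
import random

EPISODE_STEPS = 5400                   # from the review: one step = one second
episode_minutes = EPISODE_STEPS / 60   # 5400 s / 60 = 90 minutes

def weibull_departure_steps(n_vehicles, max_steps=EPISODE_STEPS,
                            shape=2.0, seed=0):
    """Draw Weibull-distributed values and rescale them to [0, max_steps - 1].

    Hypothetical sketch: 'shape' and 'n_vehicles' are assumed values,
    not parameters reported in the manuscript.
    """
    rng = random.Random(seed)
    # weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    samples = sorted(rng.weibullvariate(1.0, shape) for _ in range(n_vehicles))
    lo, hi = samples[0], samples[-1]
    span = (hi - lo) or 1.0            # guard against a zero range
    # Min-max rescaling keeps the shape of the curve, not its time scale.
    return [int(round((s - lo) / span * (max_steps - 1))) for s in samples]

steps = weibull_departure_steps(1000)
print(episode_minutes, steps[0], steps[-1])
```

Under this reading the demand curve is a profile compressed into 90 minutes rather than a literal day of traffic, and the assignment of each departure to a particular flow would still need to be specified separately.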

While reading the paper several questions appear:

-          The paper only considers training the agent in a single incidental situation, for a concrete scenario. But incidents could occur in any position. Does this mean that the system should be trained considering every incidental situation? If not, what would be the behaviour of the system when it has been trained for one incidental situation and an incident occurs at a different location?

-          The paper considers that incidents occur during the whole training period, as if they were permanent. But many incidents are temporary. What would be the behaviour of the trained agent in this case?

-          Only training scenarios are considered. How it would be

-          Only incidents before the intersection are considered. Nevertheless, it could be argued that accidents would occur in the intersection with higher probability. What would happen when the accident occurs in the intersection or just after it?

 

The main reference cited should be properly addressed as found in https://ceur-ws.org/Vol-2404/.

 

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

I find the topic of the paper interesting. While the provided framework is interesting, there are still major deficiencies in the paper. Although the entire paper requires a major rewrite, I can see the potential for the paper to make a nice contribution to Sustainability.

 

·      The sentence in lines 8-10 is too long. Please divide it into two separate sentences.

 

·      The literature review mostly gives details on RL and its applications to traffic. It needs to be expanded to include other methods (such as simulations, etc.).

 

·      Throughout the paper, I don't think "resilience" is used with its correct meaning.

 

·      The section on the contribution of the paper is too long and is not clear. It mostly explains the proposed model but does not say how it differs from the current literature. The authors should explain the novelty of this study. The current form is not clear.

 

·      Figure 3 needs to be fixed (the last letters have disappeared).

 

·      Lines 219-225 need to be rewritten. They are unclear and repetitive.

 

·      Lines 252-254 are just a repetition. This part was already mentioned earlier.

 

·      In section 3.4, how did you obtain the test and training data? How did you decide the size of the test and training data?

 

·      You compare your study with reference 28, but no information is provided on how your model performs compared to reality. No error rate is provided.

 

·      Did you do hyperparameter optimization?

 

·      The conclusion is just a short summary of the results. The implications of the study should be provided.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thanks for addressing my comments.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
