Article
Peer-Review Record

Mean Field Multi-Agent Reinforcement Learning Method for Area Traffic Signal Control

Electronics 2023, 12(22), 4686; https://doi.org/10.3390/electronics12224686
by Zundong Zhang 1, Wei Zhang 1,*, Yuke Liu 1 and Gang Xiong 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 15 October 2023 / Revised: 10 November 2023 / Accepted: 15 November 2023 / Published: 17 November 2023

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The problem considered in the manuscript is important and timely. The tools used are also appropriate and current. Moreover, important learning algorithms for the regional road network environment are given. However, the authors should pay more attention to the theoretical part, specifically:

- formulas (2) and (3) should be redefined. As written, the definitions are circular and not clear enough;

- p. 3, l. 134-135: give at least a reference for the claim that "for multi-agent stochastic games, there will always exist at least one Nash equilibrium ...". Explain why and when it exists;

- p.3, l.140: "Under certain assumptions ..." Which assumptions? Are they restrictive?

- p. 4, formula (5): the function Q of three variables is not defined;

- p. 4, l. 165: change the symbol in the formula for \tilde{a}; it is currently too similar to the symbol "\in";

- p. 5 formula (12): what is \beta?

- p. 6, l. 210: define \nu^{MF}.

Please comment on the differences in methodology between your work and the work titled "Dynamic traffic signal control using mean field multi-agent reinforcement learning in large scale road-networks" from IET Intelligent Transport Systems.

 

Comments on the Quality of English Language

No comments

Author Response

Only attachments uploaded

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

 

The study aims to address the limitations of traditional algorithms applied to multiple intersections in a network by utilizing mean field theory. Although the significance of this research is commendable, major improvements are required and certain issues must be addressed to make the research ready for publication.

Abbreviations must be defined at their first occurrence in the manuscript, especially if they are used in the abstract, e.g., MFQ-ATSC, MFAC-ATSC, etc.

An explanation of online and offline methods, with proper references, needs to be added. Similarly, details of the DQN and A3C methods used for comparison should be added. Preferably, a 'Related work' section should be added after the Introduction to hold these details.

Figure 4 can be further improved by adding satellite imagery of the area with other relevant details, to help readers understand the environment of the study area.

In section 3.2.2, the discrete action space set would be better expressed in the form of a table.

Why were 400 iterations conducted? Why not more or fewer? Please cite the reference, if any was used.

On page 11, the average loss time reduction calculations for MFQ-ATSC and MFAC-ATSC are interchanged and need to be corrected.

The results and discussion must be improved. For example, why is there a massive difference in the reward curves for J17, and why does MFQ-ATSC fluctuate at J9? All the results need to be discussed in detail. Moreover, the robustness and stability of the proposed algorithms in varying scenarios should also be discussed.

Proofreading is required, in addition to rechecking the citations. Reference formatting is also not uniform; see, e.g., references 20 and 21.

Lastly, there is no mention of the limitations and future directions of the research.

 

The authors are advised to incorporate the comments in order to improve the manuscript. 

 

Comments on the Quality of English Language

Proof-reading required. Detailed comments already mentioned above.

Author Response

Only attachments uploaded

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This work presents a region traffic signal control method based on mean-field reinforcement learning, where neural networks based on real-time traffic state information are trained to determine the optimal signal timing plan at intersections. In general, the work is of value and well organized. This reviewer has the following comments. 

1. The contributions of this paper should be further summarized, especially at the end of Introduction. 

2. Literature review is insufficient to support the motivation of this work. 

3. In section 2, existing reinforcement learning formulas are used, which should be combined with the traffic signal control problem. Obviously, in traffic signal control, existing symbols have physical meanings.

4. The conclusion should be simplified to state the obtained results directly. Future work should also be added. 

Comments on the Quality of English Language

The English should be further polished in the revised round. 

Author Response

Only attachments uploaded

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Previous comment 2: the reference should be added in the manuscript text.

Previous comments 5 and 6: the explanations should be added in the text.

Comments on the Quality of English Language

-

Author Response

Only attachments uploaded

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

I appreciate the quick response from the authors. However, a few comments still have not been addressed in detail. A few instances from the response report are given below:

Response 2: Agree. Therefore, we have completed adjusting parts of the article to emphasize this point. Details comparing the DQN and A3C methods are in the revised Chapter 3 content, starting at line 192.

No such improvement or revision in the manuscript is found.

Response 5: Thanks for pointing this out. To this question, our answer is that running 400 training rounds is enough to see the control effect of the algorithm. After 400 rounds of training, the algorithm we proposed has become stable. In order to avoid unnecessary waste of resources, we took 400 training epochs.

The query needs to be answered technically with relevant citations for selecting the optimal number of iterations.

Response 6: Thanks for pointing this out. To this question, our answer is that the control effects of the MFQ-ATSC and MFAC-ATSC algorithms are reflected in the average loss time. There is only a slight difference in the 0-4000 s range, but the control effects of the two algorithms are still different. This can be reflected after 4000 s.

There is a calculation mistake in the avg loss time reduction for MFQ-ATSC and MFAC-ATSC. Kindly recheck.

Response 7: Thanks for pointing this out. For this problem, we can give an explanation. The huge difference in the reward curve of J17 is because J17 is only one agent in the multi-agent, and its reward fluctuation cannot fully control the level of the entire regional traffic signal. From the perspective of the simulation scenario, it is possible that at the intersection where the J17 signal controller is located, the vehicle density in the road network has changed, such as congestion, queuing, sudden braking and other operations. The same is true for J9. Regarding the discussion of the results, we have made changes in the text.

These discussions need to be added in the manuscript. I do not see them in the revised manuscript.

Response 8: Thanks for pointing this out. For this problem, we have modified the format of the two references [20], [21].

References 20 and 21 are still in the same format in the revised manuscript.

In addition to the above, the Discussion section at the end of page 11 has been added erroneously, and the discussion section at the end of page 12 is also misplaced (the discussion should always come before the conclusion). You can either include the new paragraph on future research in the conclusion section or write a proper discussion section covering the findings, limitations, and future work.

Comments on the Quality of English Language

Already indicated.

Author Response

Only attachments uploaded

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

Comments and Suggestions for Authors

Thank you for incorporating the given suggestions and improving the manuscript substantially. I have no further comments.
