Article
Peer-Review Record

Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm

Appl. Sci. 2024, 14(7), 2889; https://doi.org/10.3390/app14072889
by Jingpeng Gan 1, Jiancheng Zhang 2,* and Yuansheng Liu 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 January 2024 / Revised: 16 March 2024 / Accepted: 19 March 2024 / Published: 29 March 2024

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

The article analyses behavioural decision-making for automated driving at an unsignalized roundabout, proposes a learning method based on proximal policy optimization, and conducts simulation training based on deep reinforcement learning.
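For readers unfamiliar with the algorithm the reviews refer to, the clipped surrogate objective at the core of standard PPO can be sketched as follows. This is the textbook formulation, not the paper's optimized variant; the function name and the default clipping range are illustrative.

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective.

    ratio:     pi_new(a|s) / pi_old(a|s), the policy probability ratio
    advantage: estimated advantage A(s, a)
    eps:       clipping range (0.2 is a common default)
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Take the pessimistic (elementwise minimum) of the two terms,
    # which removes the incentive for overly large policy updates.
    return np.minimum(unclipped, clipped)

# Example: a ratio far above 1 + eps is clipped, limiting the update.
print(ppo_clipped_objective(np.array([1.5]), np.array([2.0])))  # [2.4]
```

The clipping is what makes PPO "proximal": updates that would move the new policy too far from the old one contribute no extra objective value.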


The research is very interesting. However, some improvements are required as follows:

- Please improve the quality of the figures; the current images are very poor;

- Please improve the quality of Tables 1-4;

- The conclusions need to be rewritten and refined (the text is too short);

- The authors should strictly follow the paper template on the official website of the journal for formatting.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

This paper describes the control algorithm of an autonomous vehicle, in terms of adaptation to urban traffic conditions. It focuses on the example of an unsignalled roundabout, as it is a difficult place to parameterise in autonomous control systems. A timely and interesting topic.

The form of the article is correct. Three algorithms have been proposed, which are described in the body and referred to in the introduction. Simulations of the proposed solutions have been performed and evaluated, which is also discussed in the last chapter.

However, the authors should work on the quality of the manuscript's content. An error has crept in as early as the authors' mailing address. Often, there is no space after the full stops ending sentences. A parenthesis is missing in the reference to literature item 3. The literature list is numbered twice.
The text mentions four scenarios, while only three are used in the research.
The quality of the drawings is unacceptable, as they are completely illegible. This needs to be corrected.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

The structure of the paper is fine, although I found some discrepancies regarding the use of the template. The mathematical formulas are set too small; please check them. Please also check the placement of the figures. The references are numbered twice in the list.

On the content: for autonomous vehicles, safety is the number one priority. It is not clear to me how you can reconcile safety AND efficient usage of the roundabout.

If the simulated roundabout is an existing one, did you test it with human drivers to compare your suggested method to the real world? Is your method better than human drivers? These questions are not clear to me based on the manuscript.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 4 Report

Comments and Suggestions for Authors

Figures 2, 4, 5, 6, 7, 8, and 9 are unavailable for evaluation. These figures provide some insights into the experiment's results. However, to fully assess the effectiveness and practicality of the proposed method, it is crucial to have a comprehensive understanding of the specific adjustments made during training and how they influenced the overall performance of the autonomous vehicle. Without a thorough description or analysis of these figures and their relationship to the research findings, it is not easy to draw definitive conclusions.

Future studies should provide more detailed explanations and interpretations of these figures, establishing a clear link between the visuals and the obtained results. Additionally, while expanding the sample size to incorporate a broader range of scenarios and traffic conditions is recommended, it is essential to recognize that the authors may still need to address concerns about representativeness adequately. It is necessary to ensure that the selected sample accurately reflects real-world conditions and encompasses the various challenges and complexities encountered in actual traffic situations.

In conclusion, while the inclusion of additional figures for evaluation is appreciated, it is imperative to address the concerns raised regarding the sample's representativeness, the limitations of the scenario involving 200 vehicles, and the lack of detailed information about the adjustments made during training. Addressing these concerns can significantly enhance the validity and relevance of the research findings.

Comments on the Quality of English Language

The English is mostly clear, but there are a few minor improvements that the authors can make:

As shown in Figure 8, in the 200-vehicle social environment, autonomous vehicles will be affected by their driving status and surrounding vehicles. The average reward value of autonomous vehicles using the optimized PPO algorithm in unsignalized roundabouts is significantly superior to that of the PPO+CCMR algorithm. Additionally, the decision-making in unsignalized roundabouts with the optimized PPO algorithm is more reasonable, resulting in higher reward values.

The reward value function of the optimized PPO algorithm tends to stabilize at 540,000 steps, while the reward value function of the PPO+CCMR algorithm tends to stabilize at 560,000 steps. Compared to the PPO+CCMR algorithm, the reward value is increased by 19.58%, and the simulation training efficiency of the optimized PPO algorithm is improved by 7.4%.
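As a point of reference for figures like these, relative improvements are simple ratios. The sketch below uses only the stabilization step counts quoted above; the 19.58% and 7.4% figures themselves rest on reward and efficiency metrics not defined in this passage, so it computes a different, directly reproducible quantity: the raw reduction in steps to stabilization.

```python
# Illustrative comparison of the two training runs, using the
# stabilization step counts quoted in the passage. The quoted
# 19.58% reward and 7.4% efficiency figures rely on metrics not
# given here, so only the raw step reduction is computed.
steps_optimized_ppo = 540_000   # optimized PPO stabilizes here
steps_ppo_ccmr = 560_000        # PPO+CCMR stabilizes here

step_reduction = (steps_ppo_ccmr - steps_optimized_ppo) / steps_ppo_ccmr
print(f"Steps to stabilize reduced by {step_reduction:.2%}")
```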

The changes need to be implemented throughout the entire research paper.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 4 Report

Comments and Suggestions for Authors

The authors are encouraged to provide support for the experiments conducted on the research question. It is important that the conclusions be measured against the number of Proximal Policy Optimization algorithms proposed.

Comments on the Quality of English Language

A native English speaker should review the paper.


Author Response

Please see the attachment.

Author Response File: Author Response.pdf
