Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The article analysed behavioural decision-making for autonomous driving at an unsignalized roundabout, proposed a learning method based on proximal policy optimization, and conducted simulation training based on deep reinforcement learning.
The research is very interesting. However, some improvements are required as follows:
- Please improve the quality of figures because the current images are very poor;
- Please improve the quality of Tables 1-4;
- Conclusions need to be rewritten and refined (the text is too short);
- Authors should strictly follow the paper template on the official website of the conference for formatting.
Author Response
Please see the attachment
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
This paper describes the control algorithm of an autonomous vehicle, in terms of adaptation to urban traffic conditions. It focuses on the example of an unsignalled roundabout, as it is a difficult place to parameterise in autonomous control systems. A timely and interesting topic.
The form of the article is correct. 3 algorithms have been proposed, which are described in the body and referred to in the introduction. Simulations of the proposed solutions have been performed and evaluated, which is also alluded to in the last chapter.
However, the authors should work on the quality of the content of the manuscript. There is an error even in the authors' mailing address. Often, there is no space after the full stops ending sentences. A parenthesis is missing in the reference to literature item 3. The literature list is numbered twice.
The content mentions 4 scenarios, while only 3 are used in the research.
The quality of the drawings is unacceptable as they are completely illegible. This needs to be corrected.
Author Response
Please see the attachment
Author Response File: Author Response.docx
Reviewer 3 Report
Comments and Suggestions for Authors
The structure of the paper is fine, although I found some discrepancies in the use of the template. Mathematical formulas are written too small; please check them. Please also check the placement of the figures. The references are numbered twice in the list.
Regarding the content: for autonomous vehicles, the number one priority is safety. It is not clear to me how you can combine safety AND effective usage of the roundabout.
If the simulated roundabout is an existing one, did you test it with human drivers, to compare your suggested method with the real world? Is your method better than human drivers? These questions are not clear to me based on the manuscript.
Author Response
Please see the attachment
Author Response File: Author Response.docx
Reviewer 4 Report
Comments and Suggestions for Authors
Figures 2, 4, 5, 6, 7, 8, and 9 are unavailable for evaluation. These figures provide some insights into the experiment's results. However, to fully assess the effectiveness and practicality of the proposed method, it is crucial to have a comprehensive understanding of the specific adjustments made during training and how they influenced the overall performance of the autonomous vehicle. Without a thorough description or analysis of these figures and their relationship to the research findings, it is not easy to draw definitive conclusions.
Future studies should provide more detailed explanations and interpretations of these figures, establishing a clear link between the visuals and the obtained results. Additionally, while expanding the sample size to incorporate a broader range of scenarios and traffic conditions is recommended, it is essential to recognize that more than this may be needed to address concerns about representativeness adequately. It is necessary to ensure that the selected sample accurately reflects real-world conditions and encompasses various challenges and complexities encountered in actual traffic situations.
In conclusion, while the inclusion of additional figures for evaluation is appreciated, it is imperative to address the concerns raised regarding the sample's representativeness, the limitations of the scenario involving 200 vehicles, and the lack of detailed information about the adjustments made during training. Addressing these concerns can significantly enhance the validity and relevance of the research findings.
Comments on the Quality of English Language
The English is mostly clear, but there are a few minor improvements that the authors can make:
As shown in Figure 8, in the 200-vehicle social environment, autonomous vehicles will be affected by their driving status and surrounding vehicles. The average reward value of autonomous vehicles using the optimized PPO algorithm in unsignalized roundabouts is significantly superior to that of the PPO+CCMR algorithm. Additionally, the decision-making in unsignalized roundabouts with the optimized PPO algorithm is more reasonable, resulting in higher reward values.
The reward value function of the optimized PPO algorithm tends to stabilize at 540,000 steps, while the reward value function of the PPO+CCMR algorithm tends to stabilize at 560,000 steps. Compared to the PPO+CCMR algorithm, the reward value is increased by 19.58%, and the simulation training efficiency of the optimized PPO algorithm is improved by 7.4%.
The changes need to be implemented throughout the entire research paper.
Author Response
Please see the attachment
Author Response File: Author Response.docx
Round 2
Reviewer 4 Report
Comments and Suggestions for Authors
The authors are encouraged to provide support for the experiments conducted on the research question. It is important to measure the conclusions based on the number of proposed Proximal Policy Optimization Algorithms.
Comments on the Quality of English Language
A native English speaker should review the paper.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf