Article
Peer-Review Record

Research on Dynamic Path Planning of Multi-AGVs Based on Reinforcement Learning

Appl. Sci. 2022, 12(16), 8166; https://doi.org/10.3390/app12168166
by Yunfei Bai 1,2, Xuefeng Ding 1,2, Dasha Hu 1,2,* and Yuming Jiang 1,2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 8 July 2022 / Revised: 2 August 2022 / Accepted: 10 August 2022 / Published: 15 August 2022

Round 1

Reviewer 1 Report

The article describes using distance-measurement sensors on the AGVs to perform reinforcement learning for path planning in a multi-vehicle setup. The authors have shown some limited test scenarios to showcase the effectiveness of the system.

1. There are numerous articles the authors have referred to for which there are no citations in the text. As a result, the literature-study part of the article is unclear to me.

2. The equations are unclear: there is no proper explanation of why each step is done, how it is done, and what the terms indicate.

For example, in the very first equation, what are Lmin and Lmax, what is 1-2, and why is it normalized?

The issue persists in all the equations.

3. The abstract is a summary of the work, the details of which appear anyway in the later sections of the article. I strongly discourage repetitive content in the abstract.

4. The adaptive clustering method using the Kohonen method is a generalized feedback method; the enhancement that the authors claim is not clear.

5. Although the enhancement of the Q-learning algorithm is justified, the article does not clarify the selection of new actions or the size of the new action set.

 

Author Response

Dear Reviewer,

 

We appreciate your attention and insightful comments on our manuscript entitled "Research on dynamic path planning of multi-AGVs based on reinforcement learning" (ID: 1832560). These comments are valuable in improving the quality and readability of the paper and will be an important guide for our future research. We have carefully studied them and revised the paper accordingly. The revisions in the article are highlighted in red, and the main revisions corresponding to your comments are as follows.

 

  1. There are numerous articles the authors have referred to for which there are no citations in the text. As a result, the literature-study part of the article is unclear to me.

 

Response to Comment 1: Based on your comment, we have revised the citations to ensure that every reference is cited in the text. For details, see the reference section of this article.

 

  2. The equations are unclear: there is no proper explanation of why each step is done, how it is done, and what the terms indicate. For example, in the very first equation, what are Lmin and Lmax, what is 1-2, and why is it normalized?

 

Response to Comment 2: According to your comments, we have added the purpose of each formula and what each parameter indicates. For example, for Equation 10, its use is explained in lines 367 to 390, while the description of its parameters appears in line 411.

 

  3. The abstract is a summary of the work, the details of which appear anyway in the later sections of the article. I strongly discourage repetitive content in the abstract.

 

Response to Comment 3: In response to your comments, we have rewritten the abstract and added numerical results. The details can be found in lines 27 to 35 on the first page of this paper.

 

  4. The adaptive clustering method using the Kohonen method is a generalized feedback method; the enhancement that the authors claim is not clear.

 

Response to Comment 4: Based on your comments, we have added a discussion of the choice of each technique, as seen in the second subsection of the introduction and in Subsection 2.2.1 of the article. In Subsection 2.2.1, we explain why the Kohonen network is able to optimize the Q-learning algorithm. Finally, the experiments in Subsection 5.3 also verify our claim.
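For context, the combination this response describes, a Kohonen self-organizing layer clustering raw sensor states into a discrete state set consumed by tabular Q-learning, can be sketched roughly as follows. This is a minimal illustration under assumed sizes and update rules, not the paper's actual implementation; all names (`kohonen_state`, `q_update`, the layer size `K`) are invented here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Kohonen (self-organizing) layer: K prototype vectors quantize the
# continuous sensor space into K discrete states; Q-learning then runs
# on the discrete indices.  All sizes below are purely illustrative.
K, DIM, N_ACTIONS = 16, 4, 4
prototypes = rng.uniform(0.0, 1.0, (K, DIM))
Q = np.zeros((K, N_ACTIONS))

def kohonen_state(x, lr=0.1):
    """Return the best-matching unit for x and nudge it toward x (online SOM)."""
    winner = int(np.argmin(np.linalg.norm(prototypes - x, axis=1)))
    prototypes[winner] += lr * (x - prototypes[winner])
    return winner

def q_update(s, a, r, s_next, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update on the clustered state indices."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# One illustrative interaction step with synthetic sensor vectors.
s = kohonen_state(rng.uniform(0.0, 1.0, DIM))
a = int(np.argmax(Q[s]))                 # greedy action over the Q-table row
s_next = kohonen_state(rng.uniform(0.0, 1.0, DIM))
q_update(s, a, 1.0, s_next)
```

The point of the clustering step is that the Q-table stays small and discrete even when the raw sensor readings are continuous.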

 

  5. Although the enhancement of the Q-learning algorithm is justified, the article does not clarify the selection of new actions or the size of the new action set.

 

Response to Comment 5: Based on your comments, we have added a description of the new actions and how they are selected. This can be found in Subsection 2.3, lines 278 to 282, and Subsection 2.4, lines 309 to 319. We have also added a discussion of the technical choices and some technical details.

 

In addition to the revisions above, we have revised the article as follows:

  1. We have revised the entire introduction and divided it into three main sections. For motivation and incitement, we clarify the problem this paper addresses and its purpose. For literature review and contributions, we explain the reasons for choosing these techniques by analyzing the literature; for example, on page 2, lines 61-64, we explain why we chose the Q-learning algorithm for dynamic path planning. For the organization of the paper, we provide an outline, which can be found on page 4, line 159.
  2. To improve the innovativeness of the paper, we have rewritten some technical details of the K-L algorithm, highlighting the novelty of the method and the reasons for the choice of technique. The improvements can be found on pages 6 to 7. This paper also adds our latest results, which incorporate scheduling policies into global path planning and design a weight-based polling algorithm to complete the queuing of AGVs. The details can be found on pages 10 to 11.
  3. We have reorganized and supplemented the experiments related to the GA-KL method. We compared the GA-KL method with a multi-destination path planning method published in 2022, a hybrid A* algorithm with reinforcement-learning path planning published in 2021, and a path-integral-based reinforcement learning algorithm; the GA-KL method achieves good results in terms of both total path length and total completion time. The details can be found on pages 20 to 22 of this paper.
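The "weight-based polling algorithm" mentioned in point 2 is not detailed in this record. Purely for illustration, a classic smooth weighted round-robin is one standard way to realize weight-based polling for queuing clients such as AGVs; the AGV names and weights below are invented:

```python
# Smooth weighted round-robin (the variant popularized by nginx): each round,
# every client's counter grows by its weight, the largest counter is served,
# and the served client's counter drops by the total weight.  This spreads
# high-weight clients evenly through the service order.
def smooth_wrr(weights, n_picks):
    current = {agv: 0 for agv in weights}
    total = sum(weights.values())
    order = []
    for _ in range(n_picks):
        for agv, w in weights.items():
            current[agv] += w
        pick = max(current, key=current.get)  # ties broken by insertion order
        current[pick] -= total
        order.append(pick)
    return order

# AGV-A has twice the priority weight of AGV-B and AGV-C.
print(smooth_wrr({"A": 2, "B": 1, "C": 1}, 4))  # prints ['A', 'B', 'C', 'A']
```

Over any window, each AGV is served in proportion to its weight, which is the property a weight-based queuing scheme typically targets.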

 

 

Finally, we are very grateful for your help in pointing out the shortcomings of our manuscript and the areas where we have not done enough. It has served as an important guide for our future research and helped us to improve it further.

 

Thanks and best regards,

 

Yunfei Bai,

 

July 30, 2022.

Reviewer 2 Report

Manuscript ID: applsci1832560
Title: Research on dynamic path planning of multi-AGVs based on reinforcement learning

Comments:

1. There are no numerical results in the abstract. The abstract of a scientific paper should include the numerical results that lead to the conclusion.
2. The problem definition, motivation, and more recent related work must be added in the introduction section. My suggestion is to divide the introduction into three subsections: motivation and incitement; literature review and contribution; and paper organization.
3. The novelty and new outcomes of the present work are not sufficient for this high-impact-factor journal. There are many similar works, and I could not find any significant difference.
4. There is no discussion of user requirements, technological options, or support for the decisions made in the design. The authors should include more technical details and explanations.
5. A comparison to other improved schemes (within the last 3 years) is required. This paper should summarize those results and give a comprehensive performance comparison with previous works.
6. More experiments and comparisons with other up-to-date methods should be added to back your claims and to expand your experiments and analysis of results further.
7. The conclusion and future work part can be extended for a better understanding of the approach and related issues that can be considered in future work.


 

Author Response

Dear Reviewer,

 

We appreciate your attention and insightful comments on our manuscript entitled "Research on dynamic path planning of multi-AGVs based on reinforcement learning" (ID: 1832560). These comments are valuable in improving the quality and readability of the paper and will be an important guide for our future research. We have carefully studied them and revised the paper accordingly. The revisions in the article are highlighted in red, and the main revisions corresponding to your comments are as follows.

 

  1. There are no numerical results in the abstract. The abstract of a scientific paper should include the numerical results that lead to the conclusion.

 

Response to Comment 1: Based on your suggestion, we have added numerical results to the abstract. The details can be found on the first page of this paper, lines 27-36.

 

  2. The problem definition, motivation, and more recent related work must be added in the introduction section. My suggestion is to divide the introduction into three subsections: motivation and incitement; literature review and contribution; and paper organization.

 

Response to Comment 2: Thank you for your comment. This was an important deficiency of the paper; we have therefore revised the entire introduction and divided it into three main sections. For motivation and incitement, we clarify the problem this paper addresses and its purpose. For literature review and contributions, we explain the reasons for choosing these techniques by analyzing the literature; for example, on page 2, lines 61-64, we explain why we chose the Q-learning algorithm for dynamic path planning. For the organization of the paper, we provide an outline, which can be found on page 4, line 159.

 

  3. The novelty and new outcomes of the present work are not sufficient for this high-impact-factor journal. There are many similar works, and I could not find any significant difference.

 

Response to Comment 3: To improve the innovativeness of the paper, we have rewritten some technical details of the K-L algorithm, highlighting the novelty of the method and the reasons for the choice of technique. The improvements can be found on pages 6 to 7. This paper also adds our latest results, which incorporate scheduling policies into global path planning and design a weight-based polling algorithm to complete the queuing of AGVs. The details can be found on pages 11 to 12.

 

  4. There is no discussion of user requirements, technological options, or support for the decisions made in the design. The authors should include more technical details and explanations.

 

Response to Comment 4: Based on your suggestions, we have added a discussion of the choice of each technique, as seen in the second subsection of the introduction, Subsection 2.2.1, and Subsection 2.3 of the article, among others. For example, in Subsection 2.2.1, we explain why the Kohonen network is able to optimize the Q-learning algorithm. We have also added details of techniques such as the Kohonen network, the SSA-based adaptive exploration strategy, and the potential-based reward function.
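As background on the last item: a potential-based reward function adds a shaping term F(s, s') = γΦ(s') − Φ(s) to the environment reward, a form known to leave the optimal policy unchanged while densifying the learning signal. A minimal sketch follows; the potential, goal cell, and discount factor are assumptions for illustration, not taken from the paper.

```python
import math

GAMMA = 0.9          # discount factor (illustrative)
GOAL = (9, 9)        # grid-world goal cell (illustrative)

def phi(state):
    """Potential function: negative Euclidean distance to the goal."""
    return -math.dist(state, GOAL)

def shaped_reward(r, s, s_next):
    """Add the potential-based shaping term F = gamma*phi(s') - phi(s) to
    the environment reward r; this preserves the optimal policy."""
    return r + GAMMA * phi(s_next) - phi(s)

# A step toward the goal earns a positive bonus, a step away a penalty.
print(shaped_reward(0.0, (0, 0), (1, 1)) > 0)   # True
print(shaped_reward(0.0, (1, 1), (0, 0)) < 0)   # True
```

Any potential Φ over states works; distance-to-goal is simply a common choice for path-planning tasks because it rewards progress before the goal is ever reached.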

 

  5. A comparison to other improved schemes (within the last 3 years) is required. This paper should summarize those results and give a comprehensive performance comparison with previous works.

 

Response to Comment 5: Based on your suggestion, we have reorganized and supplemented the experiments related to the GA-KL method. We compared the GA-KL method with a multi-destination path planning method published in 2022, a hybrid A* algorithm with reinforcement-learning path planning published in 2021, and a path-integral-based reinforcement learning algorithm published in 2021. The GA-KL method achieves good results in terms of both total path length and total completion time. The details can be found on pages 20 to 22 of this paper.

 

  6. More experiments and comparisons with other up-to-date methods should be added to back your claims and to expand your experiments and analysis of results further.

 

Response to Comment 6: Based on your suggestions, the experiments on the GA-KL method have been supplemented and refined to fully consider the situations AGVs may encounter in dynamic environments. The first addition tests the ability of the GA-KL method in narrow environments, as described in Subsection 5.7 of this paper. The second examines the performance of the GA-KL method with multiple dynamic obstacles and a high congestion probability, as described in Subsection 5.8.

 

  7. The conclusion and future work part can be extended for a better understanding of the approach and related issues that can be considered in future work.

 

Response to Comment 7: We have added a description of our future work at the end of the article.

 

Finally, we are very grateful for your help in pointing out the shortcomings of our manuscript and the areas where we have not done enough. It has served as an important guide for our future research and helped us to improve it further.

 

Thanks and best regards,

 

Yunfei Bai,

 

July 30, 2022.

Round 2

Reviewer 1 Report

The authors have addressed all the concerns. I do not have any other issues and the article could be considered for publication in the present form. 

Author Response

Dear Reviewer,

 

Thank you again very much for taking the time to review the manuscript. In the review report form, you mentioned that you would like us to improve the introductory part of the article. We have carefully studied your comments and have revised the article based on your comments. The revisions in the article are highlighted in red and the major revisions corresponding to your comments are as follows.

 

  1. The introductory section could be improved

 

Response to Comment 1: We have added a new reference to support our description of the current problems faced by AGVs and the urgency of solving them. This can be found on page 2, line 48. We have also added some descriptions of clustering algorithms to highlight the advantages of Kohonen neural networks; this can be found on page 3, line 98.

 

Finally, we would like to take this opportunity to thank you for all your time involved and this great opportunity for us to improve the manuscript. We hope you will find this revised version satisfactory.

 

Yunfei Bai,

 

August 2, 2022.

Reviewer 2 Report

The authors have replied to the reviewer's queries.

English language and style are fine; only a minor spell check is required.

Author Response

Dear Reviewer,

 

Thank you again very much for taking the time to review the manuscript. In the review report form, you mentioned that you would like us to improve the study design and the description of the experimental results of the article. We have carefully studied your comments and have revised the article based on your comments. The revisions in the article are highlighted in red and the major revisions corresponding to your comments are as follows.

 

  1. The study design can be improved

 

Response to Comment 1: Based on your suggestion, we have added some details about the study design, further explaining how our approach addresses, step by step, the problems faced by AGVs in dynamic environments and strengthening the rationale for and effectiveness of the approach. This can be found on page 4, line 150.

 

  2. The description of the experimental results can be improved

 

Response to Comment 2: Based on your suggestion, we have added descriptions of the experimental results in Subsections 5.3 and 5.4, which can be found on page 15, line 519 and page 16, line 551. We have also refined Figures 11, 12, 18, and 19, among others, to make the values clearer and better formatted.

 

In addition, we had the paper reviewed by professionals, corrected the spelling of some words, and standardized the abbreviations.

Finally, we would like to take this opportunity to thank you for all your time involved and this great opportunity for us to improve the manuscript. We hope you will find this revised version satisfactory.

 

Yunfei Bai,

 

August 2, 2022.
