Robotic Manipulator in Dynamic Environment with SAC Combing Attention Mechanism and LSTM
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The paper proposes a framework for reinforcement learning of robotic manipulators in a dynamic environment. Its main contribution is the introduction of an attention mechanism fused with an LSTM network and a modified reward function. The simulation results are promising, so the paper may be of interest to the robotics community. However, it has some shortcomings that should be addressed before publication:
-The authors should better describe the simulation conditions, such as the characteristics of the manipulator, the obstacles, and the simulated sensors.
-The authors should clearly state how the proposed method improves on similar methods in the literature by listing the differences between them.
-Many acronyms in the paper, such as LSTM and AGV, are used without being introduced. Even if their meaning is well known, they should be defined at first use.
-Some variables are not defined. For example, omega_i in Equation 1 is not defined.
Author Response
1. The experimental section now describes in detail the type of robotic arm used, how the distance between the robotic arm and the obstacle is obtained, the environment as a whole, and the specific parameter indicators.
2. The differences between the proposed method and the literature, as well as the improvements made, are now listed in the Related Work section.
3. The acronyms used in the text are now defined.
4. A detailed explanation of the variables in the text is also provided.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
1. The article occasionally presents intricate concepts and methods in a simplified form that could be clarified. For instance, readers may find it easier to understand the improvements over conventional models if the paper describes more thoroughly how the attention mechanism and LSTM networks directly enhance the SAC algorithm.
2. Although the paper discusses a number of tests, it should include a more thorough comparative analysis with other current algorithms, such as other cutting-edge deep reinforcement learning techniques, rather than only SAC variants.
3. Including a wider range of situations during the testing stage might aid in proving the suggested method's resilience in a variety of circumstances, not simply the particular examples examined.
4. The technique would be easier to understand if it included flowcharts or diagrams to illustrate the algorithmic stages and integration of components (such as LSTM and attention mechanisms inside SAC).
The paper is well-written with professional language. No significant spelling or grammar errors are noticeable in the provided excerpts. All things considered, the work offers an exciting method for robotic manipulator motion planning that makes use of sophisticated reinforcement learning algorithms.
Author Response
1. In the method introduction section, we have added a description of how the attention mechanism and the LSTM are combined, together with the related formulas.
2. In the experimental section, the SAC algorithm is now compared with today's mainstream algorithms PPO and TD3, as well as with the improved AL-SAC and LSTM-SAC algorithms.
3. Since the environment in this paper is dynamic, the obstacles and moving points move randomly. In the future, we will run comparison experiments in different experimental scenarios.
4. We improved the Attn-LSTM structure diagram shown in Fig. 2 to refine the specific flow of the attention mechanism.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
Review Report
Journal: Electronics (ISSN 2079-9292)
Manuscript ID: electronics-2981912
Article Type: Article
Title: Robotic Manipulator in Dynamic Environment with SAC Combing Attention Mechanism and LSTM
Authors: Xinghong Kuang and Sucheng Zhou
Summary: This work deals with the motion planning of a manipulator in a dynamic environment. The paper proposes using an improved Soft Actor-Critic (SAC) algorithm, with its maximum entropy advantage, as the benchmark algorithm. To address insufficient robustness and difficulty in adapting to environmental changes, the authors suggest combining Euclidean distance and distance difference to enhance target approach accuracy. Furthermore, they propose an attention network fused with LSTM to address the instability and uncertainty of input states in dynamic environments, improving the SAC algorithm. Simulation experiments were conducted, and the results demonstrated that the fused neural network increased the success rate of approaching the target and enhanced the SAC algorithm's convergence speed, success rate, and avoidance capabilities.
However, this work in its present form needs a minor revision for the following reasons:
1. The authors should provide a comparative analysis with existing motion planning algorithms to demonstrate superiority.
2. The authors should clearly define and discuss the evaluation metrics used to assess the performance of the proposed approach. It is important to establish appropriate metrics to measure the success rate, convergence speed, and avoidance capabilities, and provide a detailed analysis of the results.
3. The authors should include comprehensive real-world experiments or case studies to validate the effectiveness and robustness of the proposed approach in dynamic environments.
4. The authors should provide sufficient information about the selection and tuning of hyperparameters for the proposed approach. The choice of hyperparameters can significantly impact performance, and their influence should be thoroughly discussed and analyzed.
5. The authors should offer a more extensive comparison with related works to strengthen the novelty and significance of the proposed approach.
6. The authors should mention the availability of the implemented algorithm or code for reproducibility and validation, which would facilitate other researchers in validating and building upon the proposed work.
Author Response
1. Comparison experiments against the current mainstream algorithms PPO and TD3, together with the related analysis, have been added to the experimental section.
2. The success rate, obstacle avoidance rate and convergence rate have been explained and illustrated in detail in the Experiments and Evaluation section.
3. Because our current experimental conditions are insufficient, we plan to carry out experiments on a real robotic arm in future work and to improve the approach continuously.
4. In the experimental parameter setting section, we explain some identifiable hyperparameter settings and cite literature supporting their reliability. We also consider three candidate hidden-layer structures for the SAC algorithm and run experiments to determine the final structure. The loss curves of the actor and critic networks, as well as the reward, are used as evaluation criteria.
5. There are not many studies on robotic arm motion planning in dynamic environments. We have used the algorithms mentioned in the existing literature in this research area for comparative experiments and validation, and we will continue to follow up and improve them in the future.
6. In the Experiments and Evaluation section, the specific information about the dynamic environment has been described in detail so that researchers can continue the related work, and the code download address has been provided at the end of the article. The code documentation is currently being organized and will be uploaded to GitHub later.
Author Response File: Author Response.pdf