Article
Peer-Review Record

Synergistic Pushing and Grasping for Enhanced Robotic Manipulation Using Deep Reinforcement Learning

Actuators 2024, 13(8), 316; https://doi.org/10.3390/act13080316
by Birhanemeskel Alamir Shiferaw 1, Tayachew F. Agidew 1, Ali Saeed Alzahrani 2 and Ramasamy Srinivasagan 2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 12 July 2024 / Revised: 17 August 2024 / Accepted: 19 August 2024 / Published: 20 August 2024
(This article belongs to the Section Actuators for Robotics)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper focuses on robotic manipulation tasks that involve grasping objects. A number of problems can occur with this seemingly simple task, mainly when the shape, dimensions, or orientation of the grasped object changes. In that case, a different approach is needed. The authors tried to solve these problems using deep learning. I find this research very relevant and interesting.

The introduction of the paper describes the situation and justifies the need for this research. The authors describe artificial intelligence as a frequently used tool for solving similar problems. In a separate chapter, the authors provide an overview of related works.

The main contribution of this work, and how the authors achieved it, is clearly defined.
Next, the methods used by the authors in this research are described. An experimental setup is described, which includes a UR5 collaborative robot, an end effector, a vision system with a depth camera, and a computer system.
The experimental procedure is described in detail. Subsequently, the task of trajectory planning is solved.

The mathematical model of the robotic system is then described. The standard Denavit–Hartenberg method is used.
The forward and inverse kinematics problems are solved, followed by the dynamic modeling.
The use of artificial intelligence methods is then documented, followed by the training and testing procedures.

The presented results show the suitability of solving the defined problems.

Overall, it is an excellent article, but the authors did not avoid several fundamental mistakes, which can be corrected.
I recommend the article for major revision.



Comments:
1. The article is very well done, but the mathematical entries are incorrect and sloppy. Such a scientific article requires a precisely written mathematical formulation. By using an inconsistent style, it degrades the level of this otherwise excellent work.
Authors often use the designation of quantities using numbers, for example "c2". I highly recommend fixing this and using a subscript to differentiate quantities using numeric subscripts. Since the authors do not use subscripts, the mathematical notations are unclear and confusing. It is correct for some quantities, but many quantities are written without subscripts and it looks confusing.
Scalar quantities must be listed in italic style. Please check the entire text, including images and tables. It must be clear which quantity is scalar and which is matrix quantity. For example, the quantity "PDD" is alternately indicated as italics and as normal text. This is how it is in several places throughout the article. One should check the entire article and be consistent with mathematical expressions.

2. There is chaos with the numbering of images.
Why is there again picture no. 12 on page 12? 1? In this image, it is cropped from the text. You have to be careful when processing images.
Then again on page 13 there is picture no. 2
Some images are missing references in the text. That's probably why the authors didn't mind.
Also on page 14 there is again figure 3 and there are also texts from which it is cut. The authors used small text fields and the text is not visible in its entirety.
Other images are also incorrectly numbered and lack references in the text.

3. Incorrect mathematical notation: Row no. 506 - there should be a space between the number and the quantity unit.

4. Tables 3, 4, 5 should show the units (%)

5. Also check the grammar, for example line 690 "dif-ferent", line 693 "ob-jects"

6. Why is table 6 in a different style. It looks bad. It needs to be revised.

7. Also some references are wrong. There are no complete bibliographic data, e.g. reference no. 2, 3
Journal titles, article titles, page numbers and other data are missing.

Author Response

Comment 1: [The article is very well done, but the mathematical entries are incorrect and sloppy. Such a scientific article requires a precisely written mathematical formulation. By using an inconsistent style, it degrades the level of this otherwise excellent work. Authors often use the designation of quantities using numbers, for example "c2". I highly recommend fixing this and using a subscript to differentiate quantities using numeric subscripts. Since the authors do not use subscripts, the mathematical notations are unclear and confusing. It is correct for some quantities, but many quantities are written without subscripts and it looks confusing. Scalar quantities must be listed in italic style. Please check the entire text, including images and tables. It must be clear which quantity is scalar and which is matrix quantity. For example, the quantity "PDD" is alternately indicated as italics and as normal text. This is how it is in several places throughout the article. One should check the entire article and be consistent with mathematical expressions.]

Response 1: [We thank the reviewer for his/her valuable comments. It is important that all mathematical equations are uniform and well formatted. Accordingly, we reformatted the equations with subscripts and a consistent style. Page 7, line 261; page 9, line 320; page 9, line 331]

Comment 2: [There is chaos with the numbering of images. Why is there again picture no. 12 on page 12? 1? In this image, it is cropped from the text. You have to be careful when processing images. Then again on page 13 there is picture no. 2 Some images are missing references in the text. That's probably why the authors didn't mind. Also, on page 14 there is again figure 3 and there are also texts from which it is cut. The authors used small text fields and the text is not visible in its entirety. Other images are also incorrectly numbered and lack references in the text.]

Response 2: [We value the reviewer's comment, and we have corrected the numbering and the drafting of the pictures. The pictures are drawn and processed by the authors. Page 9, line 307; page 11, line 394; page 12, line 418; page 13, line 425; page 14, line 446]

Comment 3: [Incorrect mathematical notation: Row no. 506 - there should be a space between the number and the quantity unit.]

Response 3: [We thank the reviewer for his/her detailed and valuable comments. There should be a space between a number and its unit, and we have made the correction. Page 13, line 430]

Comment 4: [Tables 3, 4, 5 should show the units (%)]

Response 4: [Exactly, the numbers were presented without a unit. Thank you again; we added a unit to each value presented. Page 17, line 577; page 17, line 595; page 18, line 610]

Comment 5: [Also check the grammar, for example line 690 "dif-ferent", line 693 "ob-jects"]

Response 5: [We value the comment. We reviewed the text and corrected the hyphenation errors. Page 17, line 590; page 17, line 593; page 18, line 614; page 18, line 617; page 19, line 643; page 20, line 648]

Comment 6: [Why is table 6 in a different style. It looks bad. It needs to be revised.]

Response 6: [Yes, the table was distorted and not uniform with the other tables. We redrew the table and made it uniform with the other tables. Page 18-19, line 619]

Comment 7: [Also some references are wrong. There are no complete bibliographic data, e.g. reference no. 2, 3 Journal titles, article titles, page numbers and other data are missing.]

Response 7: [We removed the references which lack full information and added other references. Page 21, line 697-705]

 

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

In this paper, the authors present a novel approach that integrates pushing and grasping actions using deep reinforcement learning. The proposed model employs two fully convolutional neural networks, Push-Net and Grasp-Net, that predict pixel-wise Q-values for potential pushing and grasping actions from heightmap images of the scene. The research field of the paper is consistent with the topic of robotic grasping and deep reinforcement learning, and the content is substantial. To further improve the quality of the paper, the following suggestions are put forward:

1. The relevant work part of the paper lists a lot of relevant literature, but it has not been effectively summarized and discussed, which makes the research idea and innovation of the paper unclear.

2. The D-H method includes the standard D-H method (SDH) and the modified D-H method (MDH). The method used in this paper is the MDH method, which should be clearly explained in the paper.

3. The forward kinematics, inverse kinematics and dynamics modeling process of the open 6-link arm UR5 is very mature, and it is not the original work of this paper. Therefore, it is not necessary to show the process in detail, but to list the model parameters, application methods and final results. The derivation process of this part makes the paper too long and too many pages.

4. The grasping device of the end of the robot arm used in the final simulation of the paper should be explained, whether its grasping ability can make the robot arm effectively grasp various objects with preset shapes? This device is not considered in the modeling process, and its parameters will affect the kinematics and dynamics of the robot arm.

5. The final verification of the paper is based on the simulation environment. Is the method proposed in the paper effective in the real environment with the characteristics of light change, object color saturation, noise interference, etc.?

6. How does the proposed method deal with the sim-to-real gap?

 

Comments on the Quality of English Language

Minor editing of English language required

Author Response

Comment 1: [The relevant work part of the paper lists a lot of relevant literature, but it has not been effectively summarized and discussed, which makes the research idea and innovation of the paper unclear.]

Response 1: [The reviewer raises valid points that could have been explained better in our paper. We recognize that a good research paper rests on a systematic, critical review of state-of-the-art references. As a result, we have added a summary of the references, together with their drawbacks and a critique of each. Page 2, line 66-68, line 73-75, line 83-85; page 2-3, line 90-94; page 3, line 99-104, line 122-125]

Comment 2: [The D-H method includes the standard D-H method (SDH) and the modified D-H method (MDH). The method used in this paper is the MDH method, which should be clearly explained in the paper.]

Response 2: [The authors would like to thank the reviewer for this comment. We used the modified D-H method in our paper and added the relevant details for the UR5 robot. Page 7, line 245-249, line 253-258]
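For readers unfamiliar with the distinction raised in this comment, the block below gives the link transform of the modified D-H convention in its commonly used (Craig) form. It is a general textbook expression, not a reproduction of the paper's equations, and the symbols (link length a, link twist α, link offset d, joint angle θ) are assumed here rather than taken from the manuscript. No UR5 parameter values are implied.

```latex
% Modified D-H (Craig convention): transform from frame i-1 to frame i,
% composed as Rot_x(alpha_{i-1}) Trans_x(a_{i-1}) Rot_z(theta_i) Trans_z(d_i).
{}^{i-1}T_{i} =
\begin{bmatrix}
\cos\theta_i & -\sin\theta_i & 0 & a_{i-1} \\
\sin\theta_i\cos\alpha_{i-1} & \cos\theta_i\cos\alpha_{i-1} & -\sin\alpha_{i-1} & -d_i\sin\alpha_{i-1} \\
\sin\theta_i\sin\alpha_{i-1} & \cos\theta_i\sin\alpha_{i-1} & \cos\alpha_{i-1} & d_i\cos\alpha_{i-1} \\
0 & 0 & 0 & 1
\end{bmatrix}
```

In the standard convention the twist and length parameters carry index i instead of i-1 and the rotation order differs, which is why a parameter table is only meaningful once the convention is named.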

Comment 3: [The forward kinematics, inverse kinematics and dynamics modeling process of the open 6-link arm UR5 is very mature, and it is not the original work of this paper. Therefore, it is not necessary to show the process in detail, but to list the model parameters, application methods and final results. The derivation process of this part makes the paper too long and too many pages.]

Response 3: [Thank you for your feedback about the lengthy and messy equations presented in the paper. We agree that the kinematic and dynamic modeling of the UR5 robot is well established, and we removed these derivations from the paper as suggested by the reviewer. Page and line numbers: removed]

Comment 4: [The grasping device of the end of the robot arm used in the final simulation of the paper should be explained, whether its grasping ability can make the robot arm effectively grasp various objects with preset shapes? This device is not considered in the modeling process, and its parameters will affect the kinematics and dynamics of the robot arm.]

Response 4: [The UR5 robotic arm is equipped with an RG2 parallel jaw gripper. Key specifications include:

Grip Force: 20-235 N

Grip Stroke: 0-85 mm

The gripper is capable of handling various objects with different shapes and sizes. Its parameters significantly affect the kinematics and dynamics of the robotic arm. The gripper's effectiveness in grasping tasks is validated through simulations involving diverse objects, which test its robustness and adaptability. We agree that the dynamics and kinematics of the UR5 robot can vary due to the end-effector. However, as stated, we do not model the dynamics and kinematics of the robot in this work. Page 5, line 189-192]

Comment 5: [The final verification of the paper is based on the simulation environment. Is the method proposed in the paper effective in the real environment with the characteristics of light change, object color saturation, noise interference, etc.?]

Response 5: [We thank the reviewer for his/her valuable comment. Light Changes: The RGB-D camera's depth information compensates for lighting variations.

Color Saturation: Variations in saturation, hue, brightness, and contrast were generated with the data augmentation features available in PyTorch, which makes the model more robust and supports accurate object detection regardless of color and light intensity.

Noise Interference: We believe that the reward structure and the deep learning algorithms are resilient to noise. Page 20, line 667-671]

Comment 6: [How does the proposed method deal with Sim-to-Real Gap.]

Response 6: [Domain Randomization: Varying simulation parameters to expose the model to diverse scenarios.

Transfer Learning: Fine-tuning the model with real-world data to enhance performance. (We will incorporate this in future work.)

Robust Reward Structures: Ensuring consistency between simulated and real-world environments. These techniques collectively help bridge the gap and support the model's effectiveness in real-world applications. Page 20, line 672-677]
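As a rough illustration of the domain randomization idea named in this response, the sketch below samples a fresh set of simulation parameters for each training episode. It is a minimal sketch under stated assumptions, not the authors' implementation: the parameter names (`num_objects`, `friction`, `light_intensity`, `depth_noise_std`) and their ranges are hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneParams:
    """Hypothetical per-episode simulation parameters (names are assumptions)."""
    num_objects: int
    friction: float
    light_intensity: float
    depth_noise_std: float


def sample_scene(rng: random.Random) -> SceneParams:
    """Draw one randomized scene; all ranges are illustrative placeholders."""
    return SceneParams(
        num_objects=rng.randint(5, 20),           # how cluttered the scene is
        friction=rng.uniform(0.3, 1.0),           # surface friction coefficient
        light_intensity=rng.uniform(0.5, 1.5),    # relative lighting scale
        depth_noise_std=rng.uniform(0.0, 0.005),  # depth sensor noise level
    )


# Usage: one new parameter set per episode, seeded for reproducibility.
rng = random.Random(0)
episodes = [sample_scene(rng) for _ in range(3)]
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, so the same seed yields the same sequence of scenes across training runs.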

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This paper is about robotic grasping of objects in cluttered environments. It uses a deep reinforcement learning method for pushing and grasping actions and it uses UR5 to show the effectiveness of the method.

 

The paper begins with basic definitions of AI and robotics. In the related work, it presents some works with their reported grasping success rates. The introductory material ends with five points that state the major contributions of the paper.

In the 3rd section, an overview of the software and the hardware components as well as the experimental procedure is presented. An extensive analysis of the UR5 robot kinematics, followed by a brief analysis of robotic dynamics and trajectory planning is presented. Data collection includes the acquiring of RGB-D images of (a) randomly arranged cluttered objects, (b) challenging well-ordered configurations and (c) novel object configurations. Then the deep reinforcement learning method as well as the densely connected convolutional neural networks are presented. The primitive actions for pushing and grasping are then explained. This section ends with the reward modelling, the training and testing procedures.

 

In the results section, the proposed method is compared to various methods in various scenarios. Generally, the method shows very good performance in all scenarios.

 

The paper is interesting, but there are a lot of issues to be addressed:


1. The key gaps of the SoA are not explained enough. Therefore, the major contribution of the paper remains a question.

2. All the experiments are simulated, something that is not clearly shown in the paper.

3. The kinematics modelling of the UR5 robot is extensively presented, but it is not needed. This is already presented in technical reports.

4. The dynamic modelling and the trajectory planning are too general. They should be adapted for the UR5 or deleted. It is knowledge found in every introductory textbook for industrial robots.

5. The presentation of DRL and the remainder of Section 3 should be adapted to the problem. They are too general.

6. Concerning the results, the selection of the methods that are compared to the proposed method should be justified.

Comments on the Quality of English Language

· There are a lot of “-” in words in the text.

· There are some wrongly cited figures in the text.

Author Response

Comment 1: [The key gaps of the SoA are not explained enough. Therefore, the major contribution of the paper remains a question.]

Response 1: [Thank you for this interesting comment. We have addressed the issue through further reviewing the papers and highlighted the gap between the state-of-the-art references and our proposed method. page 3-4, line 132-152]

Comment 2: [All the experiments are simulated, something that is not clearly shown in the paper.]

Response 2: [All of the experiments were carried out with the PyBullet physics engine in a simulated setting. To assess the effectiveness of the suggested method in a comprehensive way, the simulation incorporates unique object scenarios, well-organized configurations, and cluttered objects arranged randomly. page 15, line 485-490]

Comment 3: [The kinematics modelling of UR5 robot is extensively presented but it is not needed. This is already presented in technical reports.]

Response 3: [We agree with this comment and decided to remove the kinematics and the dynamics of the robot. removed]

Comment 4: [The dynamic modelling and the trajectory planning are too general. They should be adapted for the UR5 or deleted. It is knowledge found in every introductory textbook for industrial robots.]

Response 4: [We agree with this comment and decided to remove the kinematics and the dynamics of the robot. removed]

Comment 5: [The presentation of DRL and the remainder of Section 3 should be adapted to the problem. They are too general.]

Response 5: [The proposed method utilizes DRL to integrate pushing and grasping actions. The DRL framework consists of two fully convolutional neural networks: Push-Net and Grasp-Net. These networks predict pixel-wise Q-values for potential actions based on heightmap images of the scene. The training process involves collecting RGB-D images of various object configurations and using a reward structure tailored to enhance both pushing and grasping actions. The DRL approach is specifically designed to improve manipulation performance in cluttered environments by enabling the robot to strategically rearrange objects before attempting a grasp. page 9, line 300-359]
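To make the pixel-wise Q-value idea in this response concrete, the sketch below picks an action from two Q-maps, one for pushing and one for grasping. It is a minimal sketch under stated assumptions, not the authors' implementation: the greedy rule (take the primitive and pixel with the highest Q-value) is assumed here, the function names `best_pixel` and `select_action` are hypothetical, and the end-effector rotation that such networks typically also predict is omitted for brevity.

```python
from typing import List, Tuple

QMap = List[List[float]]  # one Q-value per heightmap pixel


def best_pixel(q_map: QMap) -> Tuple[float, int, int]:
    """Return (q_value, row, col) of the highest-valued pixel in a Q-map."""
    best = (float("-inf"), -1, -1)
    for r, row in enumerate(q_map):
        for c, q in enumerate(row):
            if q > best[0]:
                best = (q, r, c)
    return best


def select_action(push_q: QMap, grasp_q: QMap) -> Tuple[str, int, int]:
    """Greedy choice (an assumption): the primitive whose best pixel scores highest.

    Ties go to grasping, since a successful grasp removes an object.
    """
    push_best = best_pixel(push_q)
    grasp_best = best_pixel(grasp_q)
    if grasp_best[0] >= push_best[0]:
        return ("grasp", grasp_best[1], grasp_best[2])
    return ("push", push_best[1], push_best[2])


# Usage with toy 2x2 Q-maps standing in for network outputs.
push_q = [[0.1, 0.2], [0.9, 0.3]]
grasp_q = [[0.4, 0.5], [0.2, 0.1]]
action = select_action(push_q, grasp_q)  # push wins here: 0.9 > 0.5
```

The returned pixel indexes a location in the heightmap, which a controller would then map back to a workspace position before executing the push or grasp.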

Comment 6: [Concerning the results, the selection of the methods that are compared to the proposed method should be justified.]

Response 6: [The proposed method is compared against several state-of-the-art approaches [6], [7], [8], [9], [10] to evaluate its performance comprehensively. These methods were selected based on their relevance to robotic manipulation in cluttered environments, their use of reinforcement learning techniques, and their reported success rates in similar tasks. The comparison focuses on grasp success rates, completion rates, and the ability to handle cluttered and novel object scenarios. page 18-19, line 619-620]

Comment 7: [Comments on the Quality of English Language: There are a lot of “-” in words in the text.]

Response 7: [Thank you for the detailed comments. We corrected the typographical errors accordingly and modified the figures. Page 9, line 307; page 17, line 590, line 593; page 18, line 614,...]

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The article is of much better quality and the authors have corrected almost all errors and processed my comments. However, there are still errors, but the authors can correct them in the proof version of the article.

Comments:
Image 2 is of poor quality. It is in bad resolution. Also, some texts are crossed out with reference lines (Link 5, Joint 5, Joint 3).

on line 111 - the sentence should not start with a reference. But for example: "The work [12] explore ..."
on line 115 - it is better to state: "the study [13] proposed a model ..."

Mathematical notations are still not listed correctly. For example, in equations 3, 4, 5, 6, 7, italic style should be used for scalar quantities. Just as it is stated in the text and in other equations.

Author Response

Comment 1: [Image 2 is of poor quality. It is in bad resolution. Also, some texts are crossed out with reference lines (Link 5, Joint 5, Joint 3).]

Response 1: [Thanks again for your valuable feedback and we modified figure 2. page 6, line 243]

Comment 2: [on line 111 - the sentence should not start with a reference. But for example: "The work [12] explore ..."]

Response 2: [We appreciate the comment and corrections are done as per the reviewer comment. page 3, line 111]

Comment 3: [on line 115 - it is better to state: "the study [13] proposed a model ..."]

Response 3: [We appreciate the comment and corrections are done as per the reviewer comment. page 3, line 115]

Comment 4: [Mathematical notations are still not listed correctly. For example, in equations 3, 4, 5, 6, 7, italic style should be used for scalar quantities. Just as it is stated in the text and in other equations.]

Response 4: [Yes, we made the corrections. Page 9, line 320, line 331, line 337; page 20, line 348, line 355]

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have made careful revisions to the paper according to the previous suggestions. The revised version resolves most of the existing problems; the content is more accurate and the conclusions are more credible.

Author Response

Thanks for the tremendous support and critical feedback so far.
