Article
Peer-Review Record

Active Object Detection and Tracking Using Gimbal Mechanisms for Autonomous Drone Applications

by Jakob Grimm Hansen * and Rui Pimentel de Figueiredo *
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 4 September 2023 / Revised: 24 January 2024 / Accepted: 24 January 2024 / Published: 6 February 2024
(This article belongs to the Section Drone Design and Development)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper specifically assesses the significance of gimbals in active perception systems, which are fundamental in numerous applications, particularly in inspection and surveillance activities. Their methodical analysis and findings offer crucial insights for the application of active vision-based processes in autonomous drone applications, such as building inspection and vehicle tracking.

The authors demonstrate the benefits of active object tracking for UAV applications by first reducing motion blur brought on by rapid camera movement and vibrations, and then by fixing the object of interest in the field-of-view's center and thereby reducing reprojection errors brought on by peripheral distortion. The results show that active techniques significantly outperform classic passive ones in terms of object position estimate accuracy. In more detail, a series of experiments suggests that active gimbal tracking can improve the spatial estimation accuracy of moving objects of known size, even in the face of difficult motion patterns and image distortion.

Some deficiencies seen in the article review are as follows:

1. The references given in the 1. Introduction and 2. Background sections are very inadequate. Also, the expressions in section 2.1 are used verbatim from another article:

Pimentel de Figueiredo, R., Le Fevre Sejersen, J., Grimm Hansen, J., & Brandão, M. (2022). Integrated design-sense-plan architecture for autonomous geometric-semantic mapping with UAVs. Frontiers in Robotics and AI, 9, 911974.

2. There are some unnecessary references used in the article. It is considered that there is no need to use references 1-4, 6, 21, 31 and 32.

3. References 16, 23-29 are out of date. Current studies should be used instead.

4. Some recommended articles are:

Pan, N., Zhang, R., Yang, T., Cui, C., Xu, C., & Gao, F. (2021). Fast‐Tracker 2.0: Improving autonomy of aerial tracking with active vision and human location regression. IET Cyber‐Systems and Robotics, 3(4), 292-301.

Kiyak, E., & Unal, G. (2021). Small aircraft detection using deep learning. Aircraft Engineering and Aerospace Technology, 93(4), 671-681.

Kraft, M., Piechocki, M., Ptak, B., & Walas, K. (2021). Autonomous, onboard vision-based trash and litter detection in low altitude aerial images collected by an unmanned aerial vehicle. Remote Sensing, 13(5), 965.

Wu, H., Jiang, L., Liu, X., Li, J., Yang, Y., & Zhang, S. (2021, September). Intelligent explosive ordnance disposal UAV system based on manipulator and real-time object detection. In 2021 4th International Conference on Intelligent Robotics and Control Engineering (IRCE) (pp. 61-65). IEEE.

Unal, G. (2021). Visual target detection and tracking based on Kalman filter. Journal of Aeronautics and Space Technologies, 14(2), 251-259.

Qingqing, L., Taipalmaa, J., Queralta, J. P., Gia, T. N., Gabbouj, M., Tenhunen, H., ... & Westerlund, T. (2020, November). Towards active vision with UAVs in marine search and rescue: Analyzing human detection at variable altitudes. In 2020 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR) (pp. 65-70). IEEE.

Lin, L., Yang, Y., Cheng, H., & Chen, X. (2019). Autonomous vision-based aerial grasping for rotorcraft unmanned aerial vehicles. Sensors, 19(15), 3410.

Nguyen, P. H., Arsalan, M., Koo, J. H., Naqvi, R. A., Truong, N. Q., & Park, K. R. (2018). LightDenseYOLO: A fast and accurate marker tracker for autonomous UAV landing by visible light camera sensor on drone. Sensors, 18(6), 1703.

Chen, P., Dang, Y., Liang, R., Zhu, W., & He, X. (2017). Real-time object tracking on a drone with multi-inertial sensing data. IEEE Transactions on Intelligent Transportation Systems, 19(1), 131-139.

5. How was Equation 7 obtained? Can details be given?

Author Response

This paper specifically assesses the significance of gimbals in active perception systems, which are fundamental in numerous applications, particularly in inspection and surveillance activities. Their methodical analysis and findings offer crucial insights for the application of active vision-based processes in autonomous drone applications, such as building inspection and vehicle tracking.

 

The authors demonstrate the benefits of active object tracking for UAV applications by first reducing motion blur brought on by rapid camera movement and vibrations, and then by fixing the object of interest in the field-of-view's center and thereby reducing reprojection errors brought on by peripheral distortion. The results show that active techniques significantly outperform classic passive ones in terms of object position estimate accuracy. In more detail, a series of experiments suggests that active gimbal tracking can improve the spatial estimation accuracy of moving objects of known size, even in the face of difficult motion patterns and image distortion.

 

We thank the reviewer for the positive comments and suggestions. All identified questions below are marked with Q and answers with A.

 

Some deficiencies seen in the article review are as follows:

 

Q1. The references given in the 1. Introduction and 2. Background sections are very inadequate. Also, the expressions in section 2.1 are used verbatim from another article:

 

Pimentel de Figueiredo, R., Le Fevre Sejersen, J., Grimm Hansen, J., & Brandão, M. (2022). Integrated design-sense-plan architecture for autonomous geometric-semantic mapping with UAVs. Frontiers in Robotics and AI, 9, 911974.

 

A1. We made significant changes to these sections. In particular, we revised Section 2 to remove the overlap with our previous paper.

 

Q2. There are some unnecessary references used in the article. It is considered that there is no need to use references 1-4, 6, 21, 31 and 32. References 16, 23-29 are out of date. Current studies should be used instead.

 

4. Some recommended articles are:

Pan, N., Zhang, R., Yang, T., Cui, C., Xu, C., & Gao, F. (2021). Fast‐Tracker 2.0: Improving autonomy of aerial tracking with active vision and human location regression. IET Cyber‐Systems and Robotics, 3(4), 292-301.

Kiyak, E., & Unal, G. (2021). Small aircraft detection using deep learning. Aircraft Engineering and Aerospace Technology, 93(4), 671-681.

Kraft, M., Piechocki, M., Ptak, B., & Walas, K. (2021). Autonomous, onboard vision-based trash and litter detection in low altitude aerial images collected by an unmanned aerial vehicle. Remote Sensing, 13(5), 965.

Wu, H., Jiang, L., Liu, X., Li, J., Yang, Y., & Zhang, S. (2021, September). Intelligent explosive ordnance disposal UAV system based on manipulator and real-time object detection. In 2021 4th International Conference on Intelligent Robotics and Control Engineering (IRCE) (pp. 61-65). IEEE.

Unal, G. (2021). Visual target detection and tracking based on Kalman filter. Journal of Aeronautics and Space Technologies, 14(2), 251-259.

Qingqing, L., Taipalmaa, J., Queralta, J. P., Gia, T. N., Gabbouj, M., Tenhunen, H., ... & Westerlund, T. (2020, November). Towards active vision with UAVs in marine search and rescue: Analyzing human detection at variable altitudes. In 2020 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR) (pp. 65-70). IEEE.

Lin, L., Yang, Y., Cheng, H., & Chen, X. (2019). Autonomous vision-based aerial grasping for rotorcraft unmanned aerial vehicles. Sensors, 19(15), 3410.

Nguyen, P. H., Arsalan, M., Koo, J. H., Naqvi, R. A., Truong, N. Q., & Park, K. R. (2018). LightDenseYOLO: A fast and accurate marker tracker for autonomous UAV landing by visible light camera sensor on drone. Sensors, 18(6), 1703.

Chen, P., Dang, Y., Liang, R., Zhu, W., & He, X. (2017). Real-time object tracking on a drone with multi-inertial sensing data. IEEE Transactions on Intelligent Transportation Systems, 19(1), 131-139.

 

A2. We included some of the references suggested by the reviewer.

 

Q3. How was Equation 7 obtained? Can details be given?

 

A3. Equation 7 was obtained using the systematic procedure known as the Denavit-Hartenberg (DH) convention. Details on how Equation 7 is derived have been added to Section 3.2.4.
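For readers unfamiliar with the procedure, the following is a minimal sketch (not taken from the manuscript) of how successive DH transforms are composed into a forward-kinematics matrix such as Equation 7. The pan-tilt joint layout and all link parameters below are illustrative placeholders, not the values used in the paper.

import numpy as np

def dh_transform(theta, d, a, alpha):
    # Standard Denavit-Hartenberg homogeneous transform for a single joint.
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [ct, -st * ca,  st * sa, a * ct],
        [st,  ct * ca, -ct * sa, a * st],
        [0.0,      sa,       ca,      d],
        [0.0,     0.0,      0.0,    1.0],
    ])

# Hypothetical 2-DOF pan-tilt gimbal: yaw about z, then pitch about the rotated axis.
# Link offsets (d, a) and twist angles (alpha) are placeholders.
pan, tilt = np.deg2rad(30.0), np.deg2rad(-10.0)
T_base_cam = dh_transform(pan, 0.05, 0.0, -np.pi / 2) @ dh_transform(tilt, 0.0, 0.02, 0.0)
print(T_base_cam)  # pose of the camera frame expressed in the gimbal base frame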

 

Reviewer 2 Report

Comments and Suggestions for Authors

Summary: This manuscript focuses on gimbal systems for object detection and tracking applications in UAVs. The authors systematically evaluate the significance of using a gimbal in active object tracking for UAV applications. This inclusion not only reduces motion blur but also minimizes re-projection errors. As a result, it substantially enhances the accuracy of object pose estimation, particularly in challenging scenarios such as image distortion and complex motion patterns.

Merits: The authors provide sufficient details to well understand the proposed systems. Overall, the paper is well-written with a clear motivation.

Major Issues:

1. The current version of the manuscript looks more like a technical report without important details to evaluate the contribution and the novelty of the authors’ work. I would suggest the authors rewrite the method section to focus on introducing the work proposed by the authors.

2. In Line 153, the authors mentioned that the input to the system is the RGB-D point clouds generated from multiple stereo cameras. Generating point clouds from stereo images itself consumes considerable computational resources, especially for edge devices like drones. There are LiDAR sensors available for both research and industrial projects, and some of them even have built-in algorithms for active detection and tracking. The authors should have a section to elaborate on this choice for active detection and tracking on drones. I would like to see a comparison between the proposed work and existing works on LiDAR-based active detection and tracking for drones.

3. In Line 191, the authors mention that the image is converted from RGB to HSV to facilitate color detection; however, the authors do not elaborate on the techniques used for color detection. As far as I am concerned, techniques like color matching could be done in RGB space as well. I would like to see more details on the motivation behind converting RGB into HSV.

4. The details of tuning the control system are not described. I would like to see more details on the control system design and the experiments on the applied control system, e.g., the tuning of PID (I suppose the PD control should be sufficient in the application context).

5. More experiments should be conducted on the low level of the system, e.g., control, and object detection evaluated with a proper metric to compare, e.g., IoU.

Minor Issue: Object detection and tracking are very active research areas. However, the author’s literature study stops at 2021. The authors highlight that in the context of active tracking, existing methods were computationally expensive, as indicated in a survey paper from 2013 ([13]). It is worth noting that in 2013, edge computing was in its nascent stages of development. However, over the past few years, there has been a proliferation of edge computing platforms and the introduction of numerous small-scale algorithms. Therefore, I would suggest the authors revise the literature, particularly the active tracking section, and shift the focus towards more recent works in edge computing and the smaller-scale or non-parametric methods for vision detection and tracking.

 

Author Response

Reviewer 2’s report (Major) 

Summary: This manuscript focuses on gimbal systems for object detection and tracking applications in UAVs. The authors systematically evaluate the significance of using a gimbal in active object tracking for UAV applications. This inclusion not only reduces motion blur but also minimizes re-projection errors. As a result, it substantially enhances the accuracy of object pose estimation, particularly in challenging scenarios such as image distortion and complex motion patterns.

 

We would like to thank the reviewer for the positive insights. The manuscript has been revised, although extra experimental work is not possible at this time. Identified questions are preceded by Q and the respective answers by A.

 

Merits: The authors provide sufficient details to well understand the proposed systems. Overall, the paper is well-written with a clear motivation. 

 

Major Issues: 

 

Q1. The current version of the manuscript looks more like a technical report without important details to evaluate the contribution and the novelty of the authors’ work. I would suggest the authors rewrite the method section to focus on introducing the work proposed by the authors. 

 

A1. The introduction and methodology sections of the paper were significantly improved, in particular regarding the mathematical formalism behind the active object tracking approach.

 

Q2. In Line 153, the authors mentioned that the input to the system is the RGB-D point clouds generated from multiple stereo cameras. Generating point clouds from stereo images itself consumes considerable computational resources, especially for edge devices like drones. There are LiDAR sensors available for both research and industrial projects, and some of them even have built-in algorithms for active detection and tracking. The authors should have a section to elaborate on this choice for active detection and tracking on drones. I would like to see a comparison between the proposed work and existing works on LiDAR-based active detection and tracking for drones.

 

A2. We focused our work on object tracking using a monocular camera and removed details on navigation aspects that use RGB-D information. Unfortunately, at this stage, more experiments and comparisons with other works are not possible.

 

Q3. In Line 191, the authors mention that the image is converted from RGB to HSV to facilitate color detection; however, the authors do not elaborate on the techniques used for color detection. As far as I am concerned, techniques like color matching could be done in RGB space as well. I would like to see more details on the motivation behind converting RGB into HSV.

 

A3. We changed the sentence: 

“The input image is converted from RGB to HSV” 

to 

“The input image is converted from RGB to HSV to facilitate color-based detection and tracking, since HSV separates the color information from the brightness information and is therefore more robust to varying lighting conditions.”
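As an illustration of this design choice, below is a minimal sketch of HSV-based color thresholding using OpenCV. It is not taken from the manuscript; the threshold bounds and file name are placeholders, not the calibrated values used in the paper.

import cv2
import numpy as np

def detect_colored_object(bgr_image, lower_hsv, upper_hsv):
    # Convert to HSV so hue/saturation are decoupled from brightness, then threshold.
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, lower_hsv, upper_hsv)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    m = cv2.moments(largest)
    if m["m00"] == 0:
        return None
    # Return the (u, v) pixel centroid of the largest blob.
    return int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])

# Example call with placeholder bounds for a roughly red target.
frame = cv2.imread("frame.png")
centroid = detect_colored_object(frame, np.array([0, 120, 70]), np.array([10, 255, 255]))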

 

Q4. The details of tuning the control system are not described. I would like to see more details on the control system design and the experiments on the applied control system, e.g., the tuning of PID (I suppose the PD control should be sufficient in the application context). 

 

A4. The parameters of the PID controller were tuned using the Ziegler-Nichols method. Unfortunately, providing further details is not possible at this stage.
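For context, the sketch below illustrates the classic Ziegler-Nichols rules, which map an experimentally determined ultimate gain K_u and oscillation period T_u to PID gains. All numerical values are placeholders and not those used on the gimbal controller in the paper.

def ziegler_nichols_pid(k_u, t_u):
    # Classic Ziegler-Nichols tuning: Kp = 0.6*Ku, Ti = Tu/2, Td = Tu/8.
    kp = 0.6 * k_u
    ki = 1.2 * k_u / t_u
    kd = 0.075 * k_u * t_u
    return kp, ki, kd

class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        # Standard discrete PID update on the tracking error.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Placeholder ultimate gain/period from a sustained-oscillation test.
kp, ki, kd = ziegler_nichols_pid(k_u=2.0, t_u=0.5)
gimbal_yaw_pid = PID(kp, ki, kd)
command = gimbal_yaw_pid.update(error=0.1, dt=0.02)  # e.g. angular offset of target from image center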

 

Q5. More experiments should be conducted on the low level of the system, e.g., control, and object detection evaluated with a proper metric to compare, e.g., IoU.

Minor Issue: Object detection and tracking are very active research areas. However, the author’s literature study stops at 2021. The authors highlight that in the context of active tracking, existing methods were computationally expensive, as indicated in a survey paper from 2013 ([13]). It is worth noting that in 2013, edge computing was in its nascent stages of development. However, over the past few years, there has been a proliferation of edge computing platforms and the introduction of numerous small-scale algorithms. Therefore, I would suggest the authors revise the literature, particularly the active tracking section, and shift the focus towards more recent works in edge computing and the smaller-scale or non-parametric methods for vision detection and tracking.

 

A5. Unfortunately, more experimental work is not possible at this stage.

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed my concerns.

Comments on the Quality of English Language

none

Author Response

The English has been improved overall.
