Towards Real-Time On-Drone Pedestrian Tracking in 4K Inputs
Abstract
1. Introduction
- It develops a real-time pedestrian tracker that runs on the onboard mission computer and takes 4K aerial video streams as input. The fundamental ideas are to determine when and where to execute the detection and tracking algorithms and to combine their results effectively.
- It proposes a novel tracker-assisted confidence boosting algorithm to enhance detection accuracy.
- It empirically demonstrates the efficacy of the proposed methods on real-world aerial videos captured by drones at a height of 50 m.
- To the best of our knowledge, this paper is the first work that enables real-time on-drone tracking for 4K aerial inputs.
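The first contribution above — running the heavy detector only intermittently while a lightweight tracker updates every frame, then merging the two result streams — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation; `detect_fn`, `track_fn`, and `associate_fn` are hypothetical stand-ins for the detector, the per-frame trackers, and the identity-association step:

```python
import queue
import threading

def detection_worker(frame_q, result_q, detect_fn):
    """Run the heavy detector (GPU-bound in the paper) at its own, slower rate."""
    while True:
        frame_id, frame = frame_q.get()
        if frame is None:                      # sentinel: shut down
            break
        result_q.put((frame_id, detect_fn(frame)))

def run_pipeline(frames, detect_fn, track_fn, associate_fn):
    """Lightweight per-frame tracking on the CPU; detector results are
    folded in asynchronously whenever the worker finishes a frame."""
    frame_q = queue.Queue(maxsize=1)
    result_q = queue.Queue()
    threading.Thread(target=detection_worker,
                     args=(frame_q, result_q, detect_fn),
                     daemon=True).start()

    tracks = []
    for frame_id, frame in enumerate(frames):
        if frame_q.empty():                    # detector idle: hand it the newest frame
            frame_q.put((frame_id, frame))
        tracks = track_fn(tracks, frame)       # fast update, every frame
        while not result_q.empty():            # merge any detections that just arrived
            _, detections = result_q.get()
            tracks = associate_fn(tracks, detections)
        yield frame_id, tracks
    frame_q.put((None, None))                  # stop the worker
```

The single-slot frame queue means the detector always works on the freshest frame available rather than falling behind a backlog, which matches the paper's goal of keeping tracking at camera rate while detection runs at whatever rate the GPU sustains.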
2. Related Work
2.1. Single Object Tracking (SOT)
2.2. Multiple Object Tracking (MOT)
2.3. Drone Datasets
3. Design and Implementation
3.1. Overview
3.1.1. Target Hardware Architectures
3.1.2. Overview of the Proposed Tracker
3.2. Implementation Details
3.2.1. Intermittent Detection and Parallel Execution
3.2.2. Input Slicing and Output Stitching
3.2.3. Tracker-Assisted Confidence Boosting
3.2.4. Identity Association for Drone-Captured Inputs
Algorithm 1: An Ensemble for Identity Association
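Algorithm 1 itself is not reproduced here, but the per-module measurements later in the paper list separate ID-A stages for IoU, Euclidean distance, and color, which suggests a cascade of association cues. A minimal greedy two-stage sketch of such an ensemble — IoU first, center distance as fallback, with a color-histogram stage following the same pattern — is given below; the thresholds are hypothetical, not the paper's values:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def center_dist(a, b):
    """Euclidean distance between box centers."""
    ca = ((a[0] + a[2]) / 2, (a[1] + a[3]) / 2)
    cb = ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)
    return ((ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2) ** 0.5

def associate(tracks, detections, iou_min=0.3, dist_max=50.0):
    """Greedy two-stage ensemble: match by IoU, then by center distance."""
    matches, unmatched = {}, list(range(len(detections)))
    # stage 1: IoU overlap
    for t_id, t_box in tracks.items():
        best, best_iou = None, iou_min
        for d in unmatched:
            s = iou(t_box, detections[d])
            if s > best_iou:
                best, best_iou = d, s
        if best is not None:
            matches[t_id] = best
            unmatched.remove(best)
    # stage 2: Euclidean center distance for the leftovers
    for t_id, t_box in tracks.items():
        if t_id in matches:
            continue
        best, best_d = None, dist_max
        for d in unmatched:
            dd = center_dist(t_box, detections[d])
            if dd < best_d:
                best, best_d = d, dd
        if best is not None:
            matches[t_id] = best
            unmatched.remove(best)
    return matches, unmatched
```

A distance fallback is useful for drone footage because fast camera motion can destroy box overlap between consecutive detector runs even when the target has barely moved in the scene.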
4. Experiments
4.1. Efficiency Evaluation
4.2. Accuracy Evaluation
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
UAV | Unmanned Aerial Vehicle |
SOT | Single Object Tracking |
MOT | Multiple Object Tracking |
IoU | Intersection Over Union |
TACB | Tracker-Assisted Confidence Boosting |
CPU | Central Processing Unit |
GPU | Graphics Processing Unit |
FC | Flight Controller |
MC | Mission Computer |
FPS | Frames Per Second |
SoC | System-on-Chip |
CNN | Convolutional Neural Network |
PSR | Peak-to-Sidelobe Ratio |
KCF | Kernelized Correlation Filters |
CSRT | Discriminative Correlation Filter with Channel and Spatial Reliability Tracker |
MOSSE | Minimum Output Sum of Squared Error |
Module | Processor | Latency | Update Freq. |
---|---|---|---|
Detection | GPU | 785.3 ms | 1.3 Hz |
ID-A (IoU) | CPU | 4.1 ms | 1.3 Hz |
ID-A (Euc.) | CPU | 0.9 ms | 1.3 Hz |
ID-A (Color) | CPU | 9.1 ms | 1.3 Hz |
Tracking | CPU | 28.7 ms | 30.0 Hz |
Module | Processor | Latency | Update Freq. |
---|---|---|---|
Detection | GPU | 145.7 ms | 4.4 Hz |
ID-A (IoU) | CPU | 2.4 ms | 4.4 Hz |
ID-A (Euc.) | CPU | 0.7 ms | 4.4 Hz |
ID-A (Color) | CPU | 6.5 ms | 4.4 Hz |
Tracking | CPU | 18.6 ms | 30.0 Hz |
Method | HOTA | DetA | AssA | LocA | Update Freq. |
---|---|---|---|---|---|
SORT | 10.81 | 10.45 | 12.24 | 66.70 | 9.1 Hz |
Ours | 18.38 | 18.78 | 18.71 | 71.48 | 24.0 Hz |
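One plausible way tracker-assisted confidence boosting (Section 3.2.3) could contribute to the DetA gain shown above is by retaining a below-threshold detection whenever an active track corroborates it. The sketch below is a hedged reading of that idea, not the paper's algorithm; all threshold values are hypothetical:

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

def boost_confidence(detections, tracks, conf_thresh=0.5,
                     low_thresh=0.25, iou_min=0.5):
    """Accept a detection below the normal confidence threshold when an
    active track overlaps it, raising it to the acceptance threshold."""
    kept = []
    for box, conf in detections:
        if conf >= conf_thresh:
            kept.append((box, conf))
        elif conf >= low_thresh and any(iou(box, t) >= iou_min for t in tracks):
            kept.append((box, conf_thresh))    # boosted: tracker corroborates it
    return kept
```

The rationale is that pedestrians are tiny in 4K aerial frames, so the detector often scores true positives just below threshold; an overlapping live track is independent evidence that the object is real.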
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Oh, C.; Lee, M.; Lim, C. Towards Real-Time On-Drone Pedestrian Tracking in 4K Inputs. Drones 2023, 7, 623. https://doi.org/10.3390/drones7100623