Open Access Article
Text-Guided Multi-Class Multi-Object Tracking for Fine-Grained Maritime Rescue
by Shuman Li, Zhipeng Lin, Haotian Wang *, Wenjing Yang and Hengzhu Liu
Department of Intelligent Data Science, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(19), 3684; https://doi.org/10.3390/rs16193684 (registering DOI)
Submission received: 13 August 2024 / Revised: 28 September 2024 / Accepted: 1 October 2024 / Published: 2 October 2024
Abstract
The rapid development of remote sensing technology has provided new sources of data for marine rescue and has made it possible to find and track survivors. Because multiple survivors must be tracked at the same time, multi-object tracking (MOT) has become a key subtask of marine rescue. However, a significant gap exists between the fine-grained objects found in realistic marine rescue remote sensing data and the capabilities of existing MOT technologies, which mainly focus on coarse-grained object scenarios and fail to track fine-grained instances. This gap limits the practical application of MOT to realistic marine rescue remote sensing data, especially when rescue forces are limited. Given the promising fine-grained classification performance of recent text-guided methods, we explore how labels and attributes can narrow the gap between MOT and fine-grained maritime rescue. We propose a text-guided multi-class multi-object tracking (TG-MCMOT) method. To handle the problems raised by fine-grained classes, we design a multi-modal encoder that aligns external textual information with visual inputs. We use decoding information at different levels to simultaneously predict the category, location, and identity embedding features of objects. In addition, to improve small object detection, we develop a data augmentation pipeline that generates pseudo-near-infrared images from RGB images. Extensive experiments demonstrate that TG-MCMOT not only performs well on typical metrics in the maritime rescue task (SeaDronesSee dataset) but also effectively tracks open-set categories on the BURST dataset. Specifically, on the SeaDronesSee dataset, the Higher Order Tracking Accuracy (HOTA) reached a score of 58.8, and on the BURST test dataset, the HOTA score for the unknown class improved by 16.07 points.
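As a rough illustration of what aligning textual information with visual inputs can mean for fine-grained classification, the sketch below scores one visual embedding against text embeddings of class labels by cosine similarity and turns the scores into probabilities. It is a generic similarity-based classifier, not the TG-MCMOT encoder described in the abstract. The function names, the `temperature` value, the label strings, and the three-dimensional toy vectors are all assumptions made for this example; a real system would obtain embeddings from trained image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def classify_by_text(visual_embedding, text_embeddings, temperature=0.07):
    """Return a probability per label from cosine similarity to each label's text embedding.

    `temperature` is a hypothetical scaling constant chosen for this sketch.
    """
    v = l2_normalize(visual_embedding)
    labels = list(text_embeddings)
    logits = []
    for label in labels:
        t = l2_normalize(text_embeddings[label])
        cosine = sum(a * b for a, b in zip(v, t))
        logits.append(cosine / temperature)
    # Numerically stable softmax over the scaled similarities.
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Toy embeddings (made up for illustration, not taken from any dataset or model).
text_embeddings = {
    "swimmer": [0.9, 0.1, 0.0],
    "swimmer with life jacket": [0.7, 0.7, 0.0],
    "boat": [0.0, 0.1, 0.9],
}
visual_embedding = [0.6, 0.8, 0.1]

probs = classify_by_text(visual_embedding, text_embeddings)
best = max(probs, key=probs.get)
print(best)  # the label whose text embedding is closest to the visual embedding
```

Because the class set is given as text, a new fine-grained label can be added at inference time by supplying one more text embedding, which is one reason text-guided approaches are attractive for open-set categories.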