Multi-Object Tracking with Predictive Information Fusion and Adaptive Measurement Noise
Abstract
1. Introduction
- To improve the quality of detection boxes and make tracking less sensitive to detector output, we propose Candidate Boxes Enhanced Filtering (CEF), which uses predictive information to filter candidate boxes. NMS is not applied during the detection phase; it is deferred to the tracking phase, where it can better support the tracking task and improve overall MOT performance.
- To mitigate the impact of localization errors on the motion model, we propose Adaptive Measurement Noise (AMN), which dynamically adjusts the ratio between the prediction and measurement weights and thereby improves overall tracking performance.
- To mitigate the impact of complex motion on tracking stability, we propose Dilatation Height IoU (DHIoU), which adjusts the IoU distance using height information to make association more stable.
- We validate the proposed method with comprehensive experiments on the DanceTrack dataset and also achieve competitive results on the MOT20 dataset.
2. Materials and Methods
2.1. Candidate Boxes Enhanced Filtering (CEF)
2.1.1. Score Update
2.1.2. Dynamic Threshold
2.1.3. Initializing and Executing
Algorithm 1: CEF
Input: the list of initial detection boxes; the corresponding detection scores after the update; the corresponding NMS thresholds after the update.
Output: the detection boxes that remain after CEF is executed.
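Algorithm 1 takes candidate boxes together with their updated scores and updated NMS thresholds, and returns the boxes that survive. The final suppression step could be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the (x1, y1, x2, y2) box format, and the rule that each kept box applies its own threshold are all assumptions, and the score update and dynamic threshold of Sections 2.1.1 and 2.1.2 are not modelled.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_per_box_threshold(boxes, scores, thresholds):
    """Greedy NMS in which every box carries its own suppression threshold.

    Returns the indices of the kept boxes, highest score first.
    """
    # Visit candidates from the highest (updated) score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Assumption: a candidate is dropped when its IoU with an already
        # kept box exceeds that kept box's threshold.
        if all(iou(boxes[i], boxes[k]) <= thresholds[k] for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms_per_box_threshold(boxes, scores, [0.5, 0.5, 0.5]))  # [0, 2]
print(nms_per_box_threshold(boxes, scores, [0.9, 0.5, 0.5]))  # [0, 1, 2]
```

In the second call the looser threshold on the first box lets its heavily overlapping neighbour survive. A per-box threshold is useful for exactly this reason when two real targets occlude each other.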
2.2. Adaptive Measurement Noise (AMN)
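To make concrete what adjusting "the ratio between prediction and measurement weights" means, the sketch below runs a scalar Kalman update with two different measurement noise values. The scaling used here is the NSA-style rule R' = (1 - c) R, which appears as the "NSA" baseline in the measurement noise ablation. It is shown only as a stand-in, since the AMN formula itself is not reproduced in this section. The function names and the numbers are illustrative assumptions.

```python
def kalman_update_1d(x_pred, p_pred, z, r):
    """One scalar Kalman update.

    Returns (posterior mean, posterior variance, Kalman gain).
    """
    gain = p_pred / (p_pred + r)           # weight given to the measurement
    x_post = x_pred + gain * (z - x_pred)  # blend prediction and measurement
    p_post = (1.0 - gain) * p_pred
    return x_post, p_post, gain


def confidence_scaled_noise(r_base, confidence):
    """NSA-style scaling (a stand-in, not AMN): R' = (1 - c) * R."""
    return (1.0 - confidence) * r_base


x_pred, p_pred, z, r_base = 100.0, 4.0, 110.0, 4.0
for c in (0.3, 0.9):
    r = confidence_scaled_noise(r_base, c)
    x_post, _, gain = kalman_update_1d(x_pred, p_pred, z, r)
    print(f"confidence={c:.1f}  gain={gain:.3f}  estimate={x_post:.2f}")
```

A higher detection confidence shrinks the measurement noise, which raises the gain and pulls the estimate toward the detection. A lower confidence keeps the estimate closer to the motion-model prediction.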
2.3. Dilatation Height IoU (DHIoU)
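The definition of DHIoU is not reproduced in this section, so the following is only a hypothetical illustration of the general idea of adjusting an IoU distance with height information. It multiplies the ordinary IoU by the overlap ratio of the two boxes' vertical extents. The function names and this particular combination are assumptions, and the dilatation step implied by the method's name is not modelled.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def height_iou(a, b):
    """Overlap ratio of the vertical extents [y1, y2] of two boxes."""
    overlap = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    union = max(a[3], b[3]) - min(a[1], b[1])
    return overlap / union if union > 0 else 0.0


def height_modulated_distance(a, b):
    """Hypothetical association cost: 1 - IoU * height-IoU (not DHIoU itself)."""
    return 1.0 - iou(a, b) * height_iou(a, b)


track = (0, 0, 10, 20)
same_height = (0, 0, 5, 20)    # IoU 0.5 with the track, identical height
half_height = (0, 0, 10, 10)   # IoU 0.5 with the track, half the height
print(height_modulated_distance(track, same_height))  # 0.5
print(height_modulated_distance(track, half_height))  # 0.75
```

Both candidates have the same plain IoU with the track, but the one whose height agrees receives the lower cost. This is the kind of disambiguation a height term can add when box widths change quickly.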
3. Results
3.1. Datasets and Metrics
3.1.1. Datasets
3.1.2. Evaluation Metrics
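For reference, the metrics reported in the result tables follow the standard definitions below (CLEAR MOT for MOTA, identity measures for IDF1, and HOTA with its DetA and AssA components). The notation is the conventional one and is given here as a reminder; it is not taken from the authors' own text.

```latex
\begin{align}
\mathrm{MOTA} &= 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t}, \\
\mathrm{IDF1} &= \frac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}}, \\
\mathrm{HOTA}_\alpha &= \sqrt{\mathrm{DetA}_\alpha \cdot \mathrm{AssA}_\alpha}, \qquad
\mathrm{HOTA} = \frac{1}{19} \sum_{\alpha \in \{0.05,\, 0.10,\, \ldots,\, 0.95\}} \mathrm{HOTA}_\alpha .
\end{align}
```

As a rough sanity check, the full-method ablation row reports DetA = 79.8 and AssA = 42.3, and sqrt(79.8 × 42.3) ≈ 58.1, close to the reported HOTA of 58.0. The match is only approximate because HOTA averages the per-threshold geometric means rather than taking the geometric mean of the averaged DetA and AssA.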
3.1.3. Implementation Details
3.2. Comparison Results
3.2.1. Comparison on DanceTrack
3.2.2. Comparison on MOT20
3.3. Ablation Study
3.3.1. Component Ablation
3.3.2. Measurement Noise
3.3.3. Affinity Matrix
4. Discussion
5. Limitations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Comparison on DanceTrack.
Method | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|
SORT | 47.9 | 91.8 | 50.8 | 72.0 | 31.2 |
DeepSORT | 45.6 | 87.8 | 47.9 | 71.0 | 29.7 |
CenterTrack | 41.8 | 86.8 | 35.7 | 78.1 | 22.6 |
FairMOT | 39.7 | 82.2 | 40.8 | 66.7 | 23.8 |
QDTrack | 45.7 | 83.0 | 44.8 | 72.1 | 29.2 |
TransTrack | 45.5 | 88.4 | 45.2 | 75.9 | 27.5 |
TraDes | 43.3 | 86.2 | 41.2 | 74.5 | 25.4 |
ByteTrack | 47.3 | 89.5 | 52.5 | 71.6 | 31.4 |
OC-SORT | 54.6 | 89.6 | 54.6 | 80.4 | 40.2 |
Our method | 57.5 | 92.5 | 57.3 | 82.3 | 40.3 |
Comparison on MOT20.
Method | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|
FairMOT | 54.6 | 61.8 | 67.3 | 54.7 | 54.7 |
TransMOT | 61.9 | 77.5 | 75.2 | 64.0 | 60.1 |
ByteTrack | 61.3 | 77.8 | 75.2 | 63.4 | 59.6 |
OC-SORT | 62.1 | 75.5 | 75.9 | 62.4 | 62.0 |
Our method | 62.1 | 75.1 | 76.0 | 62.0 | 62.3 |
| CEF | AMN | DHIoU | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
|---|---|---|---|---|---|---|---|
|  |  |  | 46.8 | 88.3 | 51.6 | 70.5 | 31.2 |
| ✓ |  |  | 47.0 | 88.5 | 51.9 | 70.6 | 31.2 |
|  | ✓ |  | 56.9 | 90.3 | 56.3 | 79.7 | 40.8 |
|  |  | ✓ | 57.4 | 90.3 | 56.7 | 79.7 | 41.5 |
| ✓ | ✓ |  | 55.6 | 90.2 | 56.2 | 79.1 | 39.2 |
| ✓ |  | ✓ | 56.0 | 90.3 | 56.7 | 78.7 | 40.1 |
|  | ✓ | ✓ | 57.5 | 90.2 | 57.0 | 79.6 | 41.7 |
| ✓ | ✓ | ✓ | 58.0 | 90.4 | 57.3 | 79.8 | 42.3 |
Measurement Noise | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|
Constant | 55.6 | 90.3 | 56.0 | 78.7 | 39.5 |
NSA | 56.6 | 90.2 | 56.1 | 79.8 | 40.3 |
AMN | 58.0 | 90.4 | 57.3 | 79.8 | 42.3 |
Affinity | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|
IoU | 55.6 | 90.1 | 56.2 | 79.1 | 39.2 |
) | 56.7 | 90.4 | 56.3 | 80.0 | 40.4 |
DHIoU | 58.0 | 90.4 | 57.3 | 79.8 | 42.3 |
First Association | Second Association | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|---|
IoU | IoU | 55.6 | 90.1 | 56.2 | 79.1 | 39.2 |
DHIoU | IoU | 57.7 | 90.2 | 56.8 | 79.8 | 41.9 |
DHIoU | DHIoU | 58.0 | 90.4 | 57.3 | 79.8 | 42.3 |
Affinity | HOTA (%) | MOTA (%) | IDF1 (%) | DetA (%) | AssA (%) |
---|---|---|---|---|---|
IoU | 55.6 | 90.1 | 56.2 | 79.1 | 39.2 |
) | 51.4 | 90.0 | 49.4 | 79.3 | 33.5 |
) | 56.7 | 90.4 | 56.3 | 80.0 | 40.4 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cheng, X.; Zhao, H.; Deng, Y.; Shen, S. Multi-Object Tracking with Predictive Information Fusion and Adaptive Measurement Noise. Appl. Sci. 2025, 15, 736. https://doi.org/10.3390/app15020736