CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking
Abstract
1. Introduction
- We propose CMDN, a novel adversarial defense approach tailored to UAV tracking. CMDN defends both efficiently and effectively, meeting the high-precision, real-time demands of UAV tracking.
- Experiments on three widely used benchmarks show that our defense network considerably strengthens the robustness of a Siamese network-based tracker against adversarial attacks in both adaptive and non-adaptive scenarios. CMDN also transfers well to heterogeneous trackers, making it convenient to deploy in a plug-and-play manner.
- Real-world tests on a UAV platform verify the efficiency and effectiveness of our defense network on edge devices.
2. Related Works
2.1. UAV Tracking
2.2. Adversarial Attack and Defense in Visual Object Tracking
2.3. Masked Image Modeling
3. Methodology
3.1. Problem Definition
3.2. Reconstruction Loss Function
3.3. Complementary Reconstruction Process
Algorithm 1 Framework of the pre-training process of the proposed CMDN
Input: training set D, training epochs E, tracker backbone T
Output: trained defense network parameters for the encoder and the decoder
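As a minimal sketch of the complementary masking idea behind the pre-training process (the patch count and mask ratio here are illustrative assumptions, not the paper's exact settings), a random patch mask and its complement together guarantee that every patch is reconstructed in one of the two passes:

```python
import numpy as np

def complementary_masks(num_patches: int, ratio: float = 0.5, seed: int = 0):
    """Draw a random patch mask at the given ratio and return it together
    with its complement, so that two reconstruction passes jointly cover
    every patch exactly once."""
    rng = np.random.default_rng(seed)
    n_masked = int(num_patches * ratio)
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=n_masked, replace=False)] = True
    return mask, ~mask

# e.g. a 224x224 input split into 16x16 patches -> 14*14 = 196 patches
mask_a, mask_b = complementary_masks(196)
assert not np.any(mask_a & mask_b)  # the two masks are disjoint
assert np.all(mask_a | mask_b)      # together they cover every patch
```

Unlike a single random mask (as in plain MAE), the complementary pair ensures no patch is left unreconstructed, which is what the MAE_comp and CMDN rows in the ablation study isolate.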
4. Experiments
4.1. Implementation Details
4.2. Testing Datasets and Metrics
4.3. Robustness on Non-Adaptive Attacks
4.4. Robustness on Adaptive Attacks
4.5. Performance on Clean Examples
4.6. Transferability on Different UAV Trackers
4.7. Test of Defense Efficiency
4.8. Ablation Studies
4.9. Real-World Tests and Visualization
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Morando, L.; Recchiuto, C.T.; Calla, J.; Scuteri, P.; Sgorbissa, A. Thermal and visual tracking of photovoltaic plants for autonomous UAV inspection. Drones 2022, 6, 347. [Google Scholar] [CrossRef]
- Xie, X.; Xi, J.; Yang, X.; Lu, R.; Xia, W. Stftrack: Spatio-temporal-focused siamese network for infrared uav tracking. Drones 2023, 7, 296. [Google Scholar] [CrossRef]
- Gao, Z.; Li, D.; Wen, G.; Kuai, Y.; Chen, R. Drone based RGBT tracking with dual-feature aggregation network. Drones 2023, 7, 585. [Google Scholar] [CrossRef]
- Li, B.; Wu, W.; Wang, Q.; Zhang, F.; Xing, J.; Yan, J. Siamrpn++: Evolution of siamese visual tracking with very deep networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4282–4291. [Google Scholar]
- Fu, C.; Cao, Z.; Li, Y.; Ye, J.; Feng, C. Onboard real-time aerial tracking with efficient Siamese anchor proposal network. IEEE Trans. Geosci. Remote. Sens. 2021, 60, 1–13. [Google Scholar] [CrossRef]
- Cao, Z.; Fu, C.; Ye, J.; Li, B.; Li, Y. SiamAPN++: Siamese attentional aggregation network for real-time UAV tracking. In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Prague, Czech Republic, 27 September–1 October 2021; pp. 3086–3092. [Google Scholar]
- Cao, Z.; Fu, C.; Ye, J.; Li, B.; Li, Y. Hift: Hierarchical feature transformer for aerial tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 15457–15466. [Google Scholar]
- Szegedy, C. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199. [Google Scholar]
- Goodfellow, I.J.; Shlens, J.; Szegedy, C. Explaining and harnessing adversarial examples. arXiv 2014, arXiv:1412.6572. [Google Scholar]
- Guo, Q.; Xie, X.; Juefei-Xu, F.; Ma, L.; Li, Z.; Xue, W.; Feng, W.; Liu, Y. Spark: Spatial-aware online incremental attack against visual tracking. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 202–219. [Google Scholar]
- Jia, S.; Ma, C.; Song, Y.; Yang, X. Robust tracking against adversarial attacks. In Proceedings of the Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, 23–28 August 2020; Proceedings, Part XIX 16. Springer: Berlin/Heidelberg, Germany, 2020; pp. 69–84. [Google Scholar]
- Jia, S.; Song, Y.; Ma, C.; Yang, X. Iou attack: Towards temporally coherent black-box adversarial attack for visual object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 6709–6718. [Google Scholar]
- Jiang, Y.; Yin, G. Attention-Enhanced One-Shot Attack against Single Object Tracking for Unmanned Aerial Vehicle Remote Sensing Images. Remote. Sens. 2023, 15, 4514. [Google Scholar] [CrossRef]
- Yan, B.; Wang, D.; Lu, H.; Yang, X. Cooling-shrinking attack: Blinding the tracker with imperceptible noises. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 990–999. [Google Scholar]
- Suttapak, W.; Zhang, J.; Zhang, L. Diminishing-feature attack: The adversarial infiltration on visual tracking. Neurocomputing 2022, 509, 21–33. [Google Scholar] [CrossRef]
- Fu, C.; Li, S.; Yuan, X.; Ye, J.; Cao, Z.; Ding, F. Ad2Attack: Adaptive adversarial attack on real-time UAV tracking. In Proceedings of the 2022 International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 5893–5899. [Google Scholar]
- Wu, Z.; Yu, R.; Liu, Q.; Cheng, S.; Qiu, S.; Zhou, S. Enhancing Tracking Robustness with Auxiliary Adversarial Defense Networks. arXiv 2024, arXiv:2402.17976. [Google Scholar]
- Chen, J.; Ren, X.; Guo, Q.; Juefei-Xu, F.; Lin, D.; Feng, W.; Ma, L.; Zhao, J. LRR: Language-Driven Resamplable Continuous Representation against Adversarial Tracking Attacks. arXiv 2024, arXiv:2404.06247. [Google Scholar]
- Peng, Z.; Dong, L.; Bao, H.; Ye, Q.; Wei, F. Beit v2: Masked image modeling with vector-quantized visual tokenizers. arXiv 2022, arXiv:2208.06366. [Google Scholar]
- He, K.; Chen, X.; Xie, S.; Li, Y.; Dollár, P.; Girshick, R. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 16000–16009. [Google Scholar]
- Xie, Z.; Zhang, Z.; Cao, Y.; Lin, Y.; Bao, J.; Yao, Z.; Dai, Q.; Hu, H. Simmim: A simple framework for masked image modeling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9653–9663. [Google Scholar]
- Mueller, M.; Smith, N.; Ghanem, B. A benchmark and simulator for UAV tracking. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 445–461. [Google Scholar]
- Wu, Y.; Lim, J.; Yang, M.H. Online object tracking: A benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 2411–2418. [Google Scholar]
- Kristan, M.; Leonardis, A.; Matas, J.; Felsberg, M.; Pflugfelder, R.; Čehovin Zajc, L.; Vojir, T.; Bhat, G.; Lukezic, A.; Eldesokey, A.; et al. The sixth visual object tracking VOT2018 challenge results. In Proceedings of the European Conference on Computer Vision (ECCV) Workshops, Munich, Germany, 8–14 September 2018. [Google Scholar]
- Dosovitskiy, A. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Bertinetto, L.; Valmadre, J.; Henriques, J.F.; Vedaldi, A.; Torr, P.H. Fully-convolutional siamese networks for object tracking. In Proceedings of the Computer Vision–ECCV 2016 Workshops, Amsterdam, The Netherlands, 8–10 and 15–16 October 2016; Proceedings, Part II 14. Springer: Berlin/Heidelberg, Germany, 2016; pp. 850–865. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 60, 84–90. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar]
- Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in vision: A survey. ACM Comput. Surv. (CSUR) 2022, 54, 1–41. [Google Scholar] [CrossRef]
- Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 87–110. [Google Scholar] [CrossRef]
- Rao, Y.; Zhao, W.; Liu, B.; Lu, J.; Zhou, J.; Hsieh, C.J. Dynamicvit: Efficient vision transformers with dynamic token sparsification. Adv. Neural Inf. Process. Syst. 2021, 34, 13937–13949. [Google Scholar]
- Yin, H.; Vahdat, A.; Alvarez, J.M.; Mallya, A.; Kautz, J.; Molchanov, P. A-vit: Adaptive tokens for efficient vision transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 10809–10818. [Google Scholar]
- Li, S.; Yang, Y.; Zeng, D.; Wang, X. Adaptive and background-aware vision transformer for real-time uav tracking. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 13989–14000. [Google Scholar]
- Deng, A.; Han, G.; Chen, D.; Ma, T.; Liu, Z. Slight aware enhancement transformer and multiple matching network for real-time UAV tracking. Remote. Sens. 2023, 15, 2857. [Google Scholar] [CrossRef]
- Devlin, J. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Brown, T.B. Language models are few-shot learners. arXiv 2020, arXiv:2005.14165. [Google Scholar]
- Zhao, H.; Wang, D.; Lu, H. Representation learning for visual object tracking by masked appearance transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 18696–18705. [Google Scholar]
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252. [Google Scholar] [CrossRef]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755. [Google Scholar]
- Čehovin, L.; Leonardis, A.; Kristan, M. Visual object tracking performance measures revisited. IEEE Trans. Image Process. 2016, 25, 1261–1274. [Google Scholar] [CrossRef] [PubMed]
- Carlini, N.; Wagner, D. Adversarial examples are not easily detected: Bypassing ten detection methods. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, Dallas, TX, USA, 3 November 2017; pp. 3–14. [Google Scholar]
- Athalye, A.; Carlini, N.; Wagner, D. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 274–283. [Google Scholar]
- Tramer, F.; Carlini, N.; Brendel, W.; Madry, A. On adaptive attacks to adversarial example defenses. Adv. Neural Inf. Process. Syst. 2020, 33, 1633–1645. [Google Scholar]
- Amovlab. Available online: https://amovlab.com/product/detail?pid=43 (accessed on 21 October 2024).
| Hyperparameter | Value |
|---|---|
| Learning rate | 1 × 10⁻⁴ |
| AdamW β₁, β₂ | 0.9, 0.95 |
| Weight decay | 0.05 |
| Batch size B | 16 |
| Training epochs E | 10 |
| Loss weights | 1, 1 |
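For illustration, the AdamW update implied by these settings can be sketched as follows; the actual training presumably uses a standard library optimizer (e.g. `torch.optim.AdamW`), so this NumPy version only serves as a reference for what the β₁, β₂, and weight-decay values control:

```python
import numpy as np

# Settings taken from the hyperparameter table above.
LR, BETAS, WEIGHT_DECAY = 1e-4, (0.9, 0.95), 0.05

def adamw_step(w, g, m, v, t, lr=LR, betas=BETAS, wd=WEIGHT_DECAY, eps=1e-8):
    """One AdamW update: Adam moment estimates plus *decoupled* weight decay."""
    b1, b2 = betas
    m = b1 * m + (1 - b1) * g            # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g        # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)            # bias corrections for step t
    v_hat = v / (1 - b2 ** t)
    # Decay is applied directly to the weights, not folded into the gradient.
    w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
    return w, m, v
```

The decoupled decay term is why weight decay (0.05) behaves differently from L2 regularization under adaptive learning rates.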
| Dataset | Exp. No. | Attack Method | Defense Pattern | Metric | Attack Result | Non-Adaptive Defense | Δ (%) | Adaptive Defense | Δ (%) |
|---|---|---|---|---|---|---|---|---|---|
| UAV123 | 1-1 | CSA | Template Only | Success ↑ | 0.478 | 0.579 | 0.101 (21.13%) | 0.583 | 0.105 (21.97%) |
| | | | | Precision ↑ | 0.656 | 0.782 | 0.126 (19.21%) | 0.783 | 0.127 (19.36%) |
| | 1-2 | CSA | Search Only | Success ↑ | 0.154 | 0.575 | 0.421 (273.38%) | 0.551 | 0.397 (257.79%) |
| | | | | Precision ↑ | 0.320 | 0.776 | 0.456 (142.50%) | 0.747 | 0.427 (133.44%) |
| | 1-3 | CSA | Both | Success ↑ | 0.156 | 0.569 | 0.413 (264.74%) | 0.550 | 0.394 (252.56%) |
| | | | | Precision ↑ | 0.327 | 0.776 | 0.449 (137.31%) | 0.759 | 0.432 (132.11%) |
| | 1-4 | IoU Attack | Search Only | Success ↑ | 0.459 | 0.552 | 0.093 (20.26%) | 0.568 | 0.109 (23.75%) |
| | | | | Precision ↑ | 0.595 | 0.749 | 0.154 (25.88%) | 0.757 | 0.162 (27.23%) |
| | 1-5 | Ad2Attack | Search Only | Success ↑ | 0.343 | 0.564 | 0.221 (64.43%) | 0.541 | 0.198 (57.73%) |
| | | | | Precision ↑ | 0.501 | 0.777 | 0.276 (55.09%) | 0.758 | 0.257 (51.30%) |
| OTB100 | 2-1 | CSA | Template Only | Success ↑ | 0.527 | 0.628 | 0.101 (19.17%) | 0.625 | 0.098 (18.60%) |
| | | | | Precision ↑ | 0.713 | 0.834 | 0.121 (16.97%) | 0.829 | 0.116 (16.27%) |
| | 2-2 | CSA | Search Only | Success ↑ | 0.349 | 0.627 | 0.278 (79.66%) | 0.614 | 0.265 (75.93%) |
| | | | | Precision ↑ | 0.491 | 0.836 | 0.345 (70.26%) | 0.824 | 0.333 (67.82%) |
| | 2-3 | CSA | Both | Success ↑ | 0.324 | 0.624 | 0.300 (92.59%) | 0.616 | 0.292 (90.12%) |
| | | | | Precision ↑ | 0.471 | 0.835 | 0.364 (77.28%) | 0.825 | 0.354 (75.16%) |
| | 2-4 | IoU Attack | Search Only | Success ↑ | 0.499 | 0.603 | 0.104 (20.84%) | 0.613 | 0.114 (22.85%) |
| | | | | Precision ↑ | 0.644 | 0.800 | 0.156 (24.22%) | 0.817 | 0.173 (26.86%) |
| | 2-5 | Ad2Attack | Search Only | Success ↑ | 0.259 | 0.459 | 0.200 (77.22%) | 0.442 | 0.183 (70.66%) |
| | | | | Precision ↑ | 0.315 | 0.636 | 0.321 (101.90%) | 0.623 | 0.308 (97.78%) |
| VOT2018 | 3-1 | CSA | Template Only | EAO ↑ | 0.123 | 0.286 | 0.163 (132.52%) | 0.266 | 0.143 (116.26%) |
| | | | | Accuracy ↑ | 0.541 | 0.599 | 0.058 (10.72%) | 0.600 | 0.059 (10.91%) |
| | | | | Robustness ↓ | 1.147 | 0.421 | 0.726 (63.30%) | 0.529 | 0.618 (53.88%) |
| | 3-2 | CSA | Search Only | EAO ↑ | 0.073 | 0.261 | 0.188 (257.53%) | 0.250 | 0.177 (242.47%) |
| | | | | Accuracy ↑ | 0.486 | 0.589 | 0.103 (21.19%) | 0.583 | 0.097 (19.96%) |
| | | | | Robustness ↓ | 2.074 | 0.529 | 1.545 (74.49%) | 0.567 | 1.507 (72.66%) |
| | 3-3 | CSA | Both | EAO ↑ | 0.073 | 0.248 | 0.175 (239.73%) | 0.234 | 0.161 (220.55%) |
| | | | | Accuracy ↑ | 0.467 | 0.583 | 0.116 (24.84%) | 0.591 | 0.124 (26.55%) |
| | | | | Robustness ↓ | 2.013 | 0.515 | 1.498 (74.42%) | 0.632 | 1.381 (68.60%) |
| | 3-4 | IoU Attack | Search Only | EAO ↑ | 0.129 | 0.184 | 0.055 (42.64%) | 0.195 | 0.066 (51.16%) |
| | | | | Accuracy ↑ | 0.568 | 0.583 | 0.015 (2.64%) | 0.591 | 0.023 (4.05%) |
| | | | | Robustness ↓ | 1.171 | 0.810 | 0.361 (30.83%) | 0.740 | 0.431 (36.81%) |
| | 3-5 | Ad2Attack | Search Only | EAO ↑ | 0.103 | 0.163 | 0.060 (58.25%) | 0.154 | 0.051 (49.51%) |
| | | | | Accuracy ↑ | 0.434 | 0.532 | 0.098 (22.58%) | 0.530 | 0.096 (22.12%) |
| | | | | Robustness ↓ | 1.428 | 0.894 | 0.534 (37.39%) | 0.985 | 0.443 (31.02%) |
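The Δ and percentage columns in the results tables follow one convention: the absolute change from the attacked baseline to the defended result (with the sign flipped for Robustness, where lower is better), with the percentage taken relative to the attacked baseline. A small helper makes this explicit:

```python
def defense_gain(attack: float, defense: float, higher_is_better: bool = True):
    """Absolute improvement of the defended result over the attacked result,
    and the same improvement relative to the attacked baseline (in %)."""
    delta = (defense - attack) if higher_is_better else (attack - defense)
    return delta, 100.0 * delta / attack

# Exp. 1-2 (UAV123, CSA, search only): Success 0.154 -> 0.575
delta, pct = defense_gain(0.154, 0.575)          # ~0.421 (273.38%)
# Exp. 3-2 (VOT2018, CSA): Robustness 2.074 -> 0.529, lower is better
delta_r, pct_r = defense_gain(2.074, 0.529, higher_is_better=False)
```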
| Dataset | Defense Method | Metric | Attack Result | Non-Adaptive Defense | Δ (%) | Adaptive Defense | Δ (%) |
|---|---|---|---|---|---|---|---|
| UAV123 | AADN | Success ↑ | 0.156 | 0.525 | 0.369 (236.54%) | 0.476 | 0.320 (205.13%) |
| | | Precision ↑ | 0.327 | 0.722 | 0.395 (120.80%) | 0.678 | 0.351 (107.34%) |
| | CMDN | Success ↑ | 0.156 | 0.569 | 0.413 (264.74%) | 0.550 | 0.394 (252.56%) |
| | | Precision ↑ | 0.327 | 0.776 | 0.449 (137.31%) | 0.759 | 0.432 (132.11%) |
| OTB100 | AADN | Success ↑ | 0.324 | 0.559 | 0.235 (72.53%) | 0.403 | 0.079 (24.38%) |
| | | Precision ↑ | 0.471 | 0.777 | 0.306 (64.97%) | 0.573 | 0.102 (21.66%) |
| | CMDN | Success ↑ | 0.324 | 0.624 | 0.300 (92.59%) | 0.616 | 0.292 (90.12%) |
| | | Precision ↑ | 0.471 | 0.835 | 0.364 (77.28%) | 0.825 | 0.354 (75.16%) |
| VOT2018 | AADN | EAO ↑ | 0.073 | 0.140 | 0.067 (91.78%) | 0.109 | 0.036 (49.32%) |
| | | Accuracy ↑ | 0.486 | 0.546 | 0.079 (16.91%) | 0.488 | 0.021 (4.50%) |
| | | Robustness ↓ | 2.074 | 1.063 | 0.950 (47.19%) | 1.395 | 0.618 (30.70%) |
| | CMDN | EAO ↑ | 0.073 | 0.248 | 0.175 (239.73%) | 0.234 | 0.161 (220.55%) |
| | | Accuracy ↑ | 0.486 | 0.583 | 0.116 (24.84%) | 0.591 | 0.124 (26.55%) |
| | | Robustness ↓ | 2.074 | 0.515 | 1.498 (74.42%) | 0.632 | 1.381 (68.60%) |
| Dataset | Metric | Original Result | Defense Result | Δ (%) |
|---|---|---|---|---|
| UAV123 | Success ↑ | 0.611 | 0.578 | −0.033 (−5.40%) |
| | Precision ↑ | 0.804 | 0.776 | −0.028 (−3.48%) |
| OTB100 | Success ↑ | 0.695 | 0.644 | −0.051 (−7.34%) |
| | Precision ↑ | 0.905 | 0.856 | −0.049 (−5.41%) |
| VOT2018 | EAO ↑ | 0.352 | 0.283 | −0.069 (−19.60%) |
| | Accuracy ↑ | 0.601 | 0.597 | −0.004 (−0.67%) |
| | Robustness ↓ | 0.290 | 0.393 | −0.103 (−35.52%) |
| Victim Tracker | Attack Method | Metric | Original Result | Attack Result | Defense Result | Δ (%) |
|---|---|---|---|---|---|---|
| SiamAPN | Ad2Attack | Success ↑ | 0.575 | 0.139 | 0.489 | 0.350 (251.80%) |
| | | Precision ↑ | 0.765 | 0.343 | 0.694 | 0.351 (102.33%) |
| HiFT | Ad2Attack | Success ↑ | 0.589 | 0.263 | 0.473 | 0.210 (79.85%) |
| | | Precision ↑ | 0.787 | 0.399 | 0.652 | 0.253 (63.41%) |
| Aba-ViTrack | IoU Attack | Success ↑ | 0.664 | 0.581 | 0.630 | 0.049 (8.43%) |
| | | Precision ↑ | 0.864 | 0.805 | 0.832 | 0.027 (3.35%) |
| Defense Method | Frames per Second | Cost per Frame (ms) | Δ Cost per Frame (ms) |
|---|---|---|---|
| Original SiamRPN++ | 107 | 9.35 | – |
| AADN | 68 | 14.71 | −5.36 |
| CMDN | 55 | 18.18 | −8.83 |
| LRR | 26 | 38.46 | −29.11 |
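The per-frame cost column above is simply the reciprocal of the throughput, which is worth keeping in mind when comparing defenses at different frame rates:

```python
def latency_ms(fps: float) -> float:
    """Per-frame cost in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps

# The SiamRPN++ baseline at 107 FPS costs about 9.35 ms per frame; with
# CMDN the tracker runs at 55 FPS (~18.18 ms), an added ~8.83 ms per frame.
```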
| Exp. No. | Network Type | Metric | Attack Result | Defense Result | Δ (%) |
|---|---|---|---|---|---|
| 1 | MAE50% | Success | 0.324 | 0.576 | 0.252 (77.78%) |
| | | Precision | 0.471 | 0.775 | 0.304 (64.54%) |
| 2 | MAE_rec50% | Success | 0.324 | 0.608 | 0.284 (87.65%) |
| | | Precision | 0.471 | 0.805 | 0.334 (70.91%) |
| 3 | MAE_comp | Success | 0.324 | 0.538 | 0.214 (66.05%) |
| | | Precision | 0.471 | 0.718 | 0.247 (52.44%) |
| 4 | CMDN | Success | 0.324 | 0.624 | 0.300 (92.59%) |
| | | Precision | 0.471 | 0.835 | 0.364 (77.28%) |
| 5 | CMDN(0.5,1) | Success | 0.324 | 0.582 | 0.258 (79.63%) |
| | | Precision | 0.471 | 0.784 | 0.313 (66.45%) |
| 6 | CMDN(1,0.5) | Success | 0.324 | 0.611 | 0.287 (88.58%) |
| | | Precision | 0.471 | 0.810 | 0.339 (71.97%) |
| Defense Method | Frames per Second | Cost per Frame (ms) | Δ Cost per Frame (ms) |
|---|---|---|---|
| Original SiamAPN | 65 | 15.38 | – |
| CMDN | 27 | 37.04 | −21.66 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yu, R.; Wu, Z.; Liu, Q.; Zhou, S.; Gou, M.; Xiang, B. CMDN: Pre-Trained Visual Representations Boost Adversarial Robustness for UAV Tracking. Drones 2024, 8, 607. https://doi.org/10.3390/drones8110607