Article

Security in Transformer Visual Trackers: A Case Study on the Adversarial Robustness of Two Models

by Peng Ye, Yuanfang Chen, Sihang Ma, Feng Xue, Noel Crespi, Xiaohan Chen and Xing Fang
1 School of Cyberspace, Hangzhou Dianzi University, Hangzhou 310018, China
2 Key Laboratory of Discrete Industrial Internet of Things of Zhejiang, Hangzhou 310018, China
3 DBAPPSecurity Co., Ltd., Hangzhou 310051, China
4 ZJU-Hangzhou Global Scientific and Technological Innovation Center, Zhejiang University, Hangzhou 310058, China
5 Institut Polytechnique de Paris, Institut Mines-Telecom, 91120 Paris, France
* Author to whom correspondence should be addressed.
Sensors 2024, 24(14), 4761; https://doi.org/10.3390/s24144761
Submission received: 16 May 2024 / Revised: 9 July 2024 / Accepted: 10 July 2024 / Published: 22 July 2024
(This article belongs to the Special Issue Advances in Automated Driving: Sensing and Control)

Abstract

Visual object tracking is an important technology in camera-based sensor networks and has a wide range of applications in autonomous driving systems. The transformer is a deep learning model that adopts the mechanism of self-attention, differentially weighting the significance of each part of its input data, and it has been widely applied in the field of visual tracking. Unfortunately, the security of the transformer model is unclear, which exposes transformer-based applications to security threats. In this work, the security of the transformer model was investigated for an important component of autonomous driving, i.e., visual tracking. Such deep-learning-based visual tracking is vulnerable to adversarial attacks, and thus, adversarial attacks were implemented as the security threats under investigation. First, adversarial examples were generated on top of video sequences to degrade the tracking performance, and frame-by-frame temporal motion was taken into consideration when generating perturbations against the predicted tracking results. Then, the influence of the perturbations on tracking performance was investigated and analyzed. Finally, extensive experiments on the OTB100, VOT2018, and GOT-10k data sets demonstrated that the generated adversarial examples effectively degraded the performance of transformer-based visual tracking. White-box attacks were the most effective, with attack success rates exceeding 90% against the transformer-based trackers.
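As a concrete illustration of the attack pipeline described above, the sketch below shows a PGD-style white-box attack on a generic differentiable tracker. It is a minimal sketch under stated assumptions: tracker is a hypothetical stand-in for any differentiable module that maps a search-region frame to a response map (it is not the paper's actual model), and the temporal warm start, which carries the perturbation from one frame to the next, is one simple way to exploit frame-by-frame motion continuity rather than the paper's exact scheme.

```python
import torch

def attack_sequence(tracker, frames, eps=8 / 255, alpha=2 / 255, steps=10):
    """Perturb each frame so that the tracker's response peak is suppressed.

    frames: (T, C, H, W) float tensor with values in [0, 1].
    The perturbation found for frame t initializes frame t + 1
    (temporal warm start -- an assumption, not the paper's exact scheme).
    """
    delta = torch.zeros_like(frames[:1])  # perturbation, carried across frames
    adv_frames = []
    for t in range(frames.shape[0]):
        x = frames[t:t + 1]
        for _ in range(steps):
            delta = delta.detach().requires_grad_(True)
            response = tracker(torch.clamp(x + delta, 0.0, 1.0))
            loss = response.max()                     # strongest response peak
            grad, = torch.autograd.grad(loss, delta)  # white-box gradient access
            # Descend on the peak response; keep the perturbation bounded.
            delta = (delta - alpha * grad.sign()).clamp(-eps, eps)
        adv_frames.append(torch.clamp(x + delta, 0.0, 1.0).detach())
    return torch.cat(adv_frames)
```

For example, any torch.nn.Module that returns a response map can be passed as tracker, and attack_sequence(model, video) returns the adversarial frames; eps, alpha, and steps trade off perceptibility against attack strength.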
Keywords: autonomous driving; visual tracking; adversarial attacks; transformer model

Share and Cite

MDPI and ACS Style

Ye, P.; Chen, Y.; Ma, S.; Xue, F.; Crespi, N.; Chen, X.; Fang, X. Security in Transformer Visual Trackers: A Case Study on the Adversarial Robustness of Two Models. Sensors 2024, 24, 4761. https://doi.org/10.3390/s24144761

AMA Style

Ye P, Chen Y, Ma S, Xue F, Crespi N, Chen X, Fang X. Security in Transformer Visual Trackers: A Case Study on the Adversarial Robustness of Two Models. Sensors. 2024; 24(14):4761. https://doi.org/10.3390/s24144761

Chicago/Turabian Style

Ye, Peng, Yuanfang Chen, Sihang Ma, Feng Xue, Noel Crespi, Xiaohan Chen, and Xing Fang. 2024. "Security in Transformer Visual Trackers: A Case Study on the Adversarial Robustness of Two Models" Sensors 24, no. 14: 4761. https://doi.org/10.3390/s24144761

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.
