Stripe-Assisted Global Transformer and Spatial–Temporal Enhancement for Vehicle Re-Identification
Abstract
1. Introduction
- (1) We propose SaGT, a novel vehicle re-identification method that learns a discriminative global feature with the assistance of local (stripe) feature learning and, to avoid the redundancy of an extra local branch at test time, uses only the global feature for inference.
- (2) We design an SFM to construct stripe-based features that effectively capture details in stripe regions, and we introduce an SaGL to encourage the global feature to absorb discriminative information from these stripe-based features (a minimal sketch of this stripe construction is given after this list).
- (3) We introduce STPro, which provides an additional spatial–temporal metric that complements re-identification based only on visual features. We also explore fusing the visual feature metric with STPro to further improve re-identification.
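To make contribution (2) more concrete, the following is a minimal illustrative sketch, not the authors' released implementation, of how stripe-based features can be constructed from ViT-style patch tokens: the patch tokens are reshaped back to their spatial grid, split into horizontal stripes, and each stripe is pooled into one local feature, while the class token serves as the global feature. The module name `StripeFeatureSketch`, the tensor shapes, the pooling choice, and the number of stripes are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class StripeFeatureSketch(nn.Module):
    """Illustrative stripe-based feature construction from ViT tokens
    (assumed shapes and pooling; not the paper's exact SFM)."""

    def __init__(self, embed_dim: int = 768, num_stripes: int = 4):
        super().__init__()
        self.num_stripes = num_stripes
        self.stripe_norm = nn.LayerNorm(embed_dim)  # shared head; the real module may differ

    def forward(self, tokens, grid_hw):
        # tokens: (B, 1 + H*W, C) = class token followed by patch tokens from a ViT backbone
        B, _, C = tokens.shape
        H, W = grid_hw
        global_feat = tokens[:, 0]                                  # (B, C) class token
        patches = tokens[:, 1:].reshape(B, H, W, C)                 # restore the spatial grid

        stripe_h = H // self.num_stripes
        stripe_feats = []
        for s in range(self.num_stripes):
            stripe = patches[:, s * stripe_h:(s + 1) * stripe_h]    # one horizontal stripe
            stripe_feats.append(self.stripe_norm(stripe.mean(dim=(1, 2))))  # pool to (B, C)

        return global_feat, torch.stack(stripe_feats, dim=1)        # (B, C), (B, S, C)

# Example with a 16x16 patch grid (e.g., a 256x256 image and 16-pixel patches):
# g, s = StripeFeatureSketch()(torch.randn(2, 1 + 16 * 16, 768), (16, 16))
```

In such a setup, only `global_feat` would be retained at inference time, which matches the stated goal of discarding the stripe branch after training.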
2. Related Work
2.1. Visual Feature-Based Vehicle Re-Identification
2.2. Spatial–Temporal Information-Based Vehicle Re-Identification
2.3. Vision Transformers
3. Methodology
3.1. SaGT
3.1.1. Embedding Basic Features
3.1.2. Extracting Global and Stripe-Based Features
3.1.3. Model Optimization
3.2. STPro
3.2.1. Histogram Statistic
3.2.2. Kernel Density Estimation
3.3. Fusion Module
4. Experiment
4.1. Dataset
4.1.1. VeRi-776
4.1.2. VehicleID
4.2. Implementation Details
4.3. Comparisons with State-of-the-Art Methods
4.3.1. Comparisons on VeRi-776
4.3.2. Comparisons on VehicleID
4.4. Ablation Study and Analysis
4.4.1. Effectiveness of STPro
4.4.2. Effectiveness of SFM and SaGL
4.4.3. Visualization Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Khan, S.D.; Ullah, H. A survey of advances in vision-based vehicle re-identification. Comput. Vis. Image Underst. 2019, 182, 50–63.
2. Lou, Y.; Bai, Y.; Liu, J.; Wang, S.; Duan, L. Veri-wild: A large dataset and a new method for vehicle re-identification in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 3235–3243.
3. Bai, Y.; Liu, J.; Lou, Y.; Wang, C.; Duan, L.Y. Disentangled feature learning network and a comprehensive benchmark for vehicle re-identification. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6854–6871.
4. He, S.; Luo, H.; Wang, P.; Wang, F.; Li, H.; Jiang, W. Transreid: Transformer-based object re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 15013–15022.
5. Lian, J.; Wang, D.H.; Wu, Y.; Zhu, S. Multi-Branch Enhanced Discriminative Network for Vehicle Re-Identification. IEEE Trans. Intell. Transp. Syst. 2023, 25, 1263–1274.
6. Sun, K.; Pang, X.; Zheng, M.; Nie, X.; Li, X.; Zhou, H.; Yin, Y. Heterogeneous context interaction network for vehicle re-identification. Neural Netw. 2024, 169, 293–306.
7. Xu, Z.; Wei, L.; Lang, C.; Feng, S.; Wang, T.; Bors, A.G.; Liu, H. SSR-Net: A Spatial Structural Relation Network for Vehicle Re-identification. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 19, 216.
8. Wang, H.; Hou, J.; Chen, N. A survey of vehicle re-identification based on deep learning. IEEE Access 2019, 7, 172443–172469.
9. Guo, H.; Zhu, K.; Tang, M.; Wang, J. Two-level attention network with multi-grain ranking loss for vehicle re-identification. IEEE Trans. Image Process. 2019, 28, 4328–4338.
10. Jiang, N.; Xu, Y.; Zhou, Z.; Wu, W. Multi-attribute driven vehicle re-identification with spatial-temporal re-ranking. In Proceedings of the 2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 858–862.
11. Li, H.; Lin, X.; Zheng, A.; Li, C.; Luo, B.; He, R.; Hussain, A. Attributes guided feature learning for vehicle re-identification. IEEE Trans. Emerg. Top. Comput. Intell. 2021, 6, 1211–1221.
12. Li, Y.; Liu, K.; Jin, Y.; Wang, T.; Lin, W. VARID: Viewpoint-aware re-identification of vehicle based on triplet loss. IEEE Trans. Intell. Transp. Syst. 2020, 23, 1381–1390.
13. Li, K.; Ding, Z.; Li, K.; Zhang, Y.; Fu, Y. Vehicle and person re-identification with support neighbor loss. IEEE Trans. Neural Netw. Learn. Syst. 2020, 33, 826–838.
14. Chen, X.; Sui, H.; Fang, J.; Feng, W.; Zhou, M. Vehicle re-identification using distance-based global and partial multi-regional feature learning. IEEE Trans. Intell. Transp. Syst. 2020, 22, 1276–1286.
15. Zhang, X.; Zhang, R.; Cao, J.; Gong, D.; You, M.; Shen, C. Part-Guided Attention Learning for Vehicle Instance Retrieval. IEEE Trans. Intell. Transp. Syst. 2022, 23, 3048–3060.
16. Liu, X.; Liu, W.; Zheng, J.; Yan, C.; Mei, T. Beyond the parts: Learning multi-view cross-part correlation for vehicle re-identification. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 907–915.
17. Teng, S.; Zhang, S.; Huang, Q.; Sebe, N. Multi-view spatial attention embedding for vehicle re-identification. IEEE Trans. Circuits Syst. Video Technol. 2020, 31, 816–827.
18. Yu, Z.; Pei, J.; Zhu, M.; Zhang, J.; Li, J. Multi-attribute adaptive aggregation transformer for vehicle re-identification. Inf. Process. Manag. 2022, 59, 102868.
19. Chen, H.; Lagadec, B.; Bremond, F. Partition and reunion: A two-branch neural network for vehicle re-identification. In Proceedings of the CVPR Workshops, Long Beach, CA, USA, 16–20 June 2019; pp. 184–192.
20. Wang, H.; Peng, J.; Jiang, G.; Xu, F.; Fu, X. Discriminative feature and dictionary learning with part-aware model for vehicle re-identification. Neurocomputing 2021, 438, 55–62.
21. Qian, J.; Zhao, J. PFNet: Part-guided feature-combination network for vehicle re-identification. Multimed. Tools Appl. 2024, 1–18.
22. Yu, Z.; Huang, Z.; Pei, J.; Tahsin, L.; Sun, D. Semantic-Oriented Feature Coupling Transformer for Vehicle Re-Identification in Intelligent Transportation System. IEEE Trans. Intell. Transp. Syst. 2023, 25, 2803–2813.
23. Qian, J.; Jiang, W.; Luo, H.; Yu, H. Stripe-based and attribute-aware network: A two-branch deep model for vehicle re-identification. Meas. Sci. Technol. 2020, 31, 095401.
24. Shen, Y.; Xiao, T.; Li, H.; Yi, S.; Wang, X. Learning deep neural networks for vehicle re-id with visual-spatio-temporal path proposals. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1900–1909.
25. Lv, K.; Du, H.; Hou, Y.; Deng, W.; Sheng, H.; Jiao, J.; Zheng, L. Vehicle Re-Identification with Location and Time Stamps. In Proceedings of the CVPR Workshops, Long Beach, CA, USA, 16–20 June 2019; pp. 399–406.
26. Tong, P.; Li, M.; Li, M.; Huang, J.; Hua, X. Large-scale vehicle trajectory reconstruction with camera sensing network. In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking, New Orleans, LA, USA, 25–29 October 2021; pp. 188–200.
27. Yao, H.; Duan, Z.; Xie, Z.; Chen, J.; Wu, X.; Xu, D.; Gao, Y. City-scale multi-camera vehicle tracking based on space-time-appearance features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 3310–3318.
28. Sun, Y.; Zheng, L.; Yang, Y.; Tian, Q.; Wang, S. Beyond Part Models: Person Retrieval with Refined Part Pooling (and A Strong Convolutional Baseline). In Lecture Notes in Computer Science; Springer International Publishing: Berlin/Heidelberg, Germany, 2018; pp. 501–518.
29. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30.
30. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16 × 16 Words: Transformers for Image Recognition at Scale. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020.
31. Shen, F.; Xie, Y.; Zhu, J.; Zhu, X.; Zeng, H. Git: Graph interactive transformer for vehicle re-identification. IEEE Trans. Image Process. 2023, 32, 1039–1051.
32. Li, H.; Li, C.; Zheng, A.; Tang, J.; Luo, B. MsKAT: Multi-scale knowledge-aware transformer for vehicle re-identification. IEEE Trans. Intell. Transp. Syst. 2022, 23, 19557–19568.
33. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
34. Liu, X.; Liu, W.; Mei, T.; Ma, H. A deep learning-based approach to progressive vehicle re-identification for urban surveillance. In Proceedings of the ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; pp. 869–884.
35. Liu, H.; Tian, Y.; Yang, Y.; Pang, L.; Huang, T. Deep relative distance learning: Tell the difference between similar vehicles. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2167–2175.
36. Shi, Y.; Zhang, X.; Tan, X. Local-guided Global Collaborative Learning Transformer for Vehicle Reidentification. In Proceedings of the 2022 IEEE 34th International Conference on Tools with Artificial Intelligence (ICTAI), Macao, China, 31 October–2 November 2022; pp. 793–798.
37. Shen, F.; Zhu, J.; Zhu, X.; Xie, Y.; Huang, J. Exploring spatial significance via hybrid pyramidal graph network for vehicle re-identification. IEEE Trans. Intell. Transp. Syst. 2021, 23, 8793–8804.
38. Tu, J.; Chen, C.; Huang, X.; He, J.; Guan, X. DFR-ST: Discriminative feature representation with spatio-temporal cues for vehicle re-identification. Pattern Recognit. 2022, 131, 108887.
39. Zhu, W.; Wang, Z.; Wang, X.; Hu, R.; Liu, H.; Liu, C.; Wang, C.; Li, D. A Dual Self-Attention mechanism for vehicle re-Identification. Pattern Recognit. 2023, 137, 109258.
40. Li, Z.; Deng, Y.; Tang, Z.; Huang, J. Sfmnet: Self-guided feature mining network for vehicle re-identification. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Gold Coast, Australia, 18–23 June 2023; pp. 1–8.
41. Lu, Z.; Lin, R.; Hu, H. MART: Mask-aware reasoning transformer for vehicle re-identification. IEEE Trans. Intell. Transp. Syst. 2022, 24, 1994–2009.
42. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 618–626.
Comparison with state-of-the-art methods on VeRi-776 (GF: global feature; LF: local feature; ST: spatial–temporal; all values in %).
Method | Type | Reference | mAP | Rank-1 |
---|---|---|---|---|
SN [13] | GF | TNNLS’22 | 75.70 | 95.10 |
VARID [12] | GF | TITS’22 | 79.30 | 96.00 |
VAT [18] | GF | IPM’22 | 80.40 | 97.50 |
MsKAT [32] | GF | TITS’22 | 82.00 | 97.10 |
PGAN [15] | LF | TITS’22 | 79.30 | 96.50 |
DGPM [14] | LF | TITS’21 | 79.39 | 96.19 |
LG-CoT [36] | LF | ICTAI’22 | 79.70 | 97.00 |
HPGN [37] | LF | TITS’22 | 80.18 | 96.72 |
DFR [38] | LF | PR’22 | 84.47 | 93.02 |
DSN [39] | LF | PR’23 | 76.30 | 94.80 |
SFMNet [40] | LF | IJCNN’23 | 80.00 | 97.00 |
GiT [31] | LF | TIP’23 | 80.34 | 96.86 |
SOFCT [22] | LF | TITS’23 | 80.70 | 96.60 |
MART [41] | LF | TITS’23 | 82.70 | 97.60 |
DPGM-ST [14] | LF and ST | TITS’21 | 82.17 | 98.45 |
DFR-ST [38] | LF and ST | PR’22 | 86.00 | 95.67 |
Baseline | GF | Ours | 78.38 | 95.71 |
SaGT | GF | Ours | 80.67 | 96.96 |
SaGT-ST | GF and ST | Ours | 86.59 | 98.75 |
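For context, the mAP and Rank-1 columns follow the standard re-identification evaluation protocol: each query feature is ranked against the gallery, Rank-1 checks whether the top-ranked gallery image shares the query identity, and mAP averages precision over all correct matches. The sketch below is a simplified version of that protocol (it omits the same-camera/junk filtering normally applied on VeRi-776); the function name and inputs are illustrative rather than the authors' evaluation code.

```python
import numpy as np

def evaluate_reid(qf, gf, q_ids, g_ids):
    """Simplified mAP / Rank-1 computation for re-ID (no camera filtering; illustration only).

    qf: (Q, D) query features, gf: (G, D) gallery features,
    q_ids / g_ids: integer identity labels.
    """
    # Cosine distance between L2-normalized features.
    qf = qf / np.linalg.norm(qf, axis=1, keepdims=True)
    gf = gf / np.linalg.norm(gf, axis=1, keepdims=True)
    dist = 1.0 - qf @ gf.T                               # smaller = more similar

    aps, rank1_hits = [], 0
    for i in range(len(qf)):
        order = np.argsort(dist[i])                      # gallery indices, best match first
        matches = g_ids[order] == q_ids[i]
        if not matches.any():
            continue                                     # query identity absent from gallery
        rank1_hits += int(matches[0])
        hit_positions = np.where(matches)[0]
        # Average precision: precision evaluated at each correct-match position.
        aps.append(np.mean([(k + 1) / (pos + 1) for k, pos in enumerate(hit_positions)]))

    return float(np.mean(aps)), rank1_hits / len(qf)     # (mAP, Rank-1)

# Example with random toy data (numbers are meaningless, shapes are what matters):
# mAP, r1 = evaluate_reid(np.random.rand(4, 8), np.random.rand(10, 8),
#                         np.arange(4), np.random.randint(0, 4, size=10))
```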
Comparison with state-of-the-art methods on the VehicleID Small, Medium, and Large test splits (all values in %).
Method | Type | Reference | Small mAP | Small Rank-1 | Medium mAP | Medium Rank-1 | Large mAP | Large Rank-1 |
---|---|---|---|---|---|---|---|---|
SN [13] | GF | TNNLS’22 | 78.80 | 76.70 | 76.80 | 74.80 | 76.30 | 73.90 |
VARID [12] | GF | TITS’22 | 88.50 | 85.80 | 84.70 | 81.20 | 82.40 | 79.50 |
VAT [18] | GF | IPM’22 | 89.90 | 84.50 | 87.10 | 80.50 | 85.00 | 78.20 |
MsKAT [32] | GF | TITS’22 | - | 86.30 | - | 81.80 | - | 79.40 |
DFR [38] | LF | PR’22 | 87.55 | 82.15 | 84.94 | 79.33 | 83.18 | 77.93 |
HPGN [37] | LF | TITS’22 | 89.60 | 83.91 | 86.16 | 79.97 | 83.60 | 77.32 |
PGAN [15] | LF | TITS’22 | - | - | - | - | 83.90 | 77.80 |
LG-CoT [36] | LF | ICTAI’22 | 90.50 | 85.20 | 86.60 | 80.50 | 84.40 | 78.00 |
DSN [39] | LF | PR’23 | 81.70 | 80.60 | 79.10 | 78.20 | 75.50 | 75.00 |
SOFCT [22] | LF | TITS’23 | 89.80 | 84.50 | 86.40 | 80.90 | 84.30 | 78.70 |
GiT [31] | LF | TIP’23 | 90.12 | 84.65 | 86.77 | 80.52 | 84.26 | 77.94 |
SFMNet [40] | LF | IJCNN’23 | - | 85.10 | - | 80.50 | - | 77.60 |
Baseline | GF | Ours | 85.06 | 77.38 | 81.63 | 74.13 | 78.04 | 69.75 |
SaGT | GF | Ours | 91.36 | 86.33 | 87.30 | 81.44 | 84.38 | 78.13 |
Effectiveness of STPro with different fusion strategies on VeRi-776 (all values in %).
Method | Fusion | mAP | Rank-1 |
---|---|---|---|
Baseline | - | 78.38 | 95.71 |
Baseline-ST | Add | 82.94 (+4.56) | 98.27 (+2.56) |
Baseline-ST | Multiply | 84.93 (+6.55) | 98.75 (+3.04) |
SaGT | - | 80.67 | 96.96 |
SaGT-ST | Add | 84.61 (+3.94) | 98.33 (+1.37) |
SaGT-ST | Multiply | 86.59 (+5.92) | 98.75 (+1.79) |
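The comparison above indicates that multiplicative fusion of the visual metric with STPro outperforms additive fusion. The sketch below illustrates one way such a fusion can be wired up, assuming STPro assigns each query–gallery pair a probability obtained from a kernel density estimate over time gaps per camera pair (echoing Sections 3.2.1 and 3.2.2). The helper names `build_stpro` and `fuse`, the probability floor `eps`, and the exact score formulation are assumptions for illustration, not the paper's formulation.

```python
import numpy as np
from scipy.stats import gaussian_kde

def build_stpro(train_time_gaps):
    """Fit a KDE over observed time gaps for each (camera_a, camera_b) pair (illustrative).

    train_time_gaps: dict mapping a camera pair to the time gaps observed for the same
    vehicle in training; pairs without at least two distinct gaps are skipped because
    gaussian_kde needs a non-degenerate sample.
    """
    return {pair: gaussian_kde(np.asarray(gaps, dtype=float))
            for pair, gaps in train_time_gaps.items()
            if len(set(gaps)) > 1}

def fuse(visual_dist, q_cams, g_cams, q_times, g_times, stpro, mode="multiply", eps=1e-6):
    """Fuse a (Q, G) visual distance matrix with spatial-temporal probabilities.

    Returns a distance-like matrix (smaller = better match); the weighting is illustrative.
    """
    Q, G = visual_dist.shape
    st_prob = np.full((Q, G), eps)
    for i in range(Q):
        for j in range(G):
            pair = (q_cams[i], g_cams[j])
            if pair in stpro:
                gap = abs(float(q_times[i]) - float(g_times[j]))
                st_prob[i, j] = max(float(stpro[pair](gap)[0]), eps)

    visual_sim = 1.0 - visual_dist                        # distance -> similarity
    fused = visual_sim * st_prob if mode == "multiply" else visual_sim + st_prob
    return 1.0 - fused                                    # back to a distance for ranking
```

Intuitively, multiplication lets an implausible spatial–temporal transition suppress a visually similar but wrong gallery match, whereas addition only softly penalizes it, which is consistent with the larger gains reported for the multiplicative variant.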
Ablation study on VeRi-776 and the VehicleID Small/Medium/Large test splits (all values in %).
Method | VeRi-776 mAP | VeRi-776 Rank-1 | Small mAP | Small Rank-1 | Medium mAP | Medium Rank-1 | Large mAP | Large Rank-1 |
---|---|---|---|---|---|---|---|---|
Baseline | 78.38 | 95.71 | 85.06 | 77.38 | 81.63 | 74.13 | 78.04 | 69.75 |
SaGT-SFM | 80.30 | 96.96 | 89.66 | 84.00 | 86.01 | 80.09 | 83.84 | 77.67 |
SaGT | 80.67 | 96.96 | 91.36 | 86.33 | 87.30 | 81.44 | 84.38 | 78.13 |
Inference cost comparison on VeRi-776 for different feature choices.
Method | Feature | mAP (%) ↑ | Parameters (M) ↓ | Time (ms/Image) ↓ | Storage (KB/Image) ↓ |
---|---|---|---|---|---|
SaGT-SFM | GF | 80.30 | 85.6 | 8.13 | 6 |
SaGT-SFM | LF | 80.33 | 85.7 | 13.45 | 48 |
SaGT-SFM | GF and LF | 80.34 | 92.7 | 15.17 | 54 |
SaGT | GF | 80.67 | 85.6 | 8.13 | 6 |