TensorTrack: Tensor Decomposition for Video Object Tracking
Abstract
1. Introduction
2. Related Work
3. Preliminary
3.1. Overview of Correlation Filters in Object Tracking
3.2. Principles of Tucker Tensor Decomposition
4. Proposed Method
4.1. Video Decomposition Process
4.2. Tucker2 Decomposition and Motion Pattern
4.3. Randomized Singular Value Decomposition for Efficiency
Algorithm 1 Randomized singular value decomposition
Input: target matrix, rank parameter k, power iteration parameter p
Output: left singular vectors, singular values, right singular vectors
Step 1: Randomized Initialization
Step 2: Power Iterations (Optional), repeated p times
Step 3: QR Decomposition
Step 4: Projection onto Lower-Dimensional Subspace
Step 5: Compute SVD on the Smaller Matrix
Step 6: Map Back to the Original Space
Step 7: Return the Decomposition
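The seven steps of Algorithm 1 can be sketched in NumPy roughly as follows. This is a minimal illustration of the standard randomized SVD scheme, not the authors' implementation: the function name `randomized_svd`, the `oversample` and `seed` parameters, the Gaussian test matrix, and the re-orthonormalization inside the power iterations are all assumptions introduced here.

```python
import numpy as np


def randomized_svd(A, k, p=2, oversample=5, seed=0):
    """Approximate rank-k SVD of A (hypothetical helper, assumptions noted inline)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]

    # Step 1: randomized initialization. A Gaussian test matrix samples the
    # column space of A. Oversampling is an assumption, not taken from the source.
    omega = rng.standard_normal((n, k + oversample))
    Y = A @ omega

    # Step 2: optional power iterations. Each pass emphasizes the dominant
    # singular directions; re-orthonormalizing first is a common stability
    # measure (assumption).
    for i in range(p):
        Q, _ = np.linalg.qr(Y)
        Y = A @ (A.T @ Q)

    # Step 3: QR decomposition gives an orthonormal basis Q for the sketch.
    Q, _ = np.linalg.qr(Y)

    # Step 4: project A onto the lower-dimensional subspace spanned by Q.
    B = Q.T @ A

    # Step 5: exact SVD on the much smaller matrix B.
    U_small, S, Vt = np.linalg.svd(B, full_matrices=False)

    # Step 6: map the left singular vectors back to the original space.
    U = Q @ U_small

    # Step 7: return the decomposition, truncated to rank k.
    return U[:, :k], S[:k], Vt[:k, :]
```

For a matrix whose true rank does not exceed k, `(U * S) @ Vt` should reproduce the input up to floating-point error. For full-rank inputs the result is only an approximation, and its quality depends on how quickly the singular values decay.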
4.4. Integrating Appearance and Motion
Algorithm 2 Integration of appearance and motion for heatmap fusion
Input: appearance heatmap, motion heatmap
Output: fused heatmap
Step 1: Compute Standard Deviation for Weighting
Step 2: Normalize Heatmaps for Consistency
Step 3: Compute Weighting Factors
Step 4: Compute Fused Heatmap
Step 5: Normalize the Fused Heatmap
Step 6: Return Fused Heatmap
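The six steps of Algorithm 2 could be realized along the following lines. The exact formulas are not recoverable from the listing, so this sketch makes several assumptions that should not be read as the authors' method: min-max normalization is used in Steps 2 and 5, and each weight is taken to be proportional to the standard deviation of its heatmap, so the more sharply peaked map contributes more. The names `fuse_heatmaps` and `_minmax` are hypothetical.

```python
import numpy as np


def _minmax(h, eps=1e-12):
    """Rescale a heatmap to approximately [0, 1] (assumed normalization)."""
    h = np.asarray(h, dtype=float)
    return (h - h.min()) / (h.max() - h.min() + eps)


def fuse_heatmaps(appearance, motion, eps=1e-12):
    """Fuse an appearance heatmap and a motion heatmap of equal shape."""
    # Step 1: standard deviation of each raw heatmap, used for weighting.
    s_a = float(np.std(appearance))
    s_m = float(np.std(motion))

    # Step 2: normalize both heatmaps so their value ranges are comparable.
    a = _minmax(appearance, eps)
    m = _minmax(motion, eps)

    # Step 3: weighting factors. ASSUMPTION: weight proportional to the
    # standard deviation; the two weights sum to one.
    w_a = s_a / (s_a + s_m + eps)
    w_m = 1.0 - w_a

    # Step 4: fused heatmap as a convex combination of the two inputs.
    fused = w_a * a + w_m * m

    # Steps 5 and 6: normalize the fused heatmap and return it.
    return _minmax(fused, eps)
```

With this weighting, a nearly flat (uninformative) heatmap has little influence on the fused result. Other choices, such as inverse-variance weights, would fit the same step names equally well.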
5. Experiments
5.1. OTB100 Dataset
5.2. Implementation Details
5.3. Impact of Tucker2 Decomposition
5.4. Impact of Randomized SVD
5.5. Benchmark Evaluations
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Method | Precision | AUC | FPS |
---|---|---|---|
CF (Baseline) | 0.484 | 0.699 | 42 |
CF+Motion Model | 0.507 | 0.729 | 39 |
CF+Appearance Model | 0.581 | 0.784 | 32 |
CF+Appearance Model+Tucker2 (Ours) | 0.709 | 0.812 | 48 |
Method | Peak Compactness | Information Entropy |
---|---|---|
CF (Baseline) | - | - |
CF+Motion Model | 1.45 | 2.01 |
CF+Appearance Model | 1.31 | 1.86 |
CF+Appearance Model+Tucker2 (Ours) | 1.08 | 1.62 |
Method | Reconstruction Error (L2) | Speed (Seconds) |
---|---|---|
Regular SVD | 5.45 × 10^-16 | 0.85 |
Randomized SVD | 3.78 × 10^-13 | 0.007 |
Tracker (Year) | Precision | AUC | Time Complexity | FPS | GPU Requirements |
---|---|---|---|---|---|
MOSSE (2010) | 0.414 | 0.311 | | 669 | RTX 1060 |
HCF (2015) | 0.837 | 0.562 | | 10.4 | RTX 1060 |
KYS (2020) | - | 0.695 | | 20 | RTX 1060 |
DR2Track (2021) | 0.657 | 0.447 | | 28 | RTX 1060 |
RCBSCF (2019) | 0.711 | 0.485 | | 36 | RTX 1060 |
Siamese Trackers | Precision | AUC | Time Complexity | FPS | GPU Requirements |
SINT (2016) | 0.788 | 0.592 | | 4 | RTX 1060 |
RTINET (2019) | - | 0.682 | | 9 | RTX 1060 |
SiamR-CNN (2020) | 0.891 | 0.701 | | 4.7 | RTX 4070 |
Deep Learning-Based | Precision | AUC | Time Complexity | FPS | GPU Requirements |
OSTrack (2022) | 0.710 | 0.690 | | 35 | RTX 4070 |
MixFormer (2022) | 0.720 | 0.700 | | 30 | RTX 4070 |
TransT (2021) | 0.715 | 0.691 | | 25 | RTX 4070 |
STARK (2021) | 0.710 | 0.698 | | 20 | RTX 4070 |
KeepTrack (2021) | 0.725 | 0.685 | | 10 | RTX 4070 |
ToMP (2022) | 0.730 | 0.710 | | 20 | RTX 4070 |
UncertaintyTrack (2024) | 0.740 | 0.720 | | 15 | RTX 4070 |
Ego-Motion (2024) | 0.735 | 0.738 | | 18 | RTX 4070 |
RaTrack (2023) | 0.720 | 0.728 | | 22 | RTX 4070 |
Tracking by 3D (2023) | 0.725 | 0.730 | | 12 | RTX 4070 |
Ours (2024) | 0.709 | 0.812 | | 48 | RTX 1060 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Gu, Y.; Zhao, P.; Cheng, L.; Guo, Y.; Wang, H.; Ding, W.; Liu, Y. TensorTrack: Tensor Decomposition for Video Object Tracking. Mathematics 2025, 13, 568. https://doi.org/10.3390/math13040568