MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices
Abstract
1. Introduction
- (1) We introduce a tracker that pinpoints the position of the tracked object in each frame. This eliminates the need to store previous segmentation results, avoiding any growth in VRAM usage.
- (2) We employ two lightweight components: the Discriminative Correlation Filter with Channel and Spatial Reliability (CSR-DCF) and the Mobile Segment Anything Model (MobileSAM). Together, these components give our method fast inference speed and minimal VRAM usage.
- (3) We design a diffusion module that attempts to diffuse the mask over the complete tracked object, improving segmentation accuracy without increasing the model's VRAM usage.
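The division of labor described in the contributions above (a tracker propagates a box between frames; a segmenter turns that box into a mask; nothing from past frames is cached) can be sketched as a per-frame loop. The sketch below is illustrative only: `track` and `segment` are placeholder stubs standing in for a real CSR-DCF tracker (e.g., OpenCV's TrackerCSRT) and for MobileSAM prompted with a box, and their bodies are toy implementations.

```python
import numpy as np

def track(frame, prev_box):
    # Placeholder for a CSR-DCF tracker update; a real tracker would
    # re-locate the target here. This stub simply returns the old box.
    return prev_box

def segment(frame, box):
    # Placeholder for MobileSAM with a box prompt; this stub just fills
    # the box region as a dummy mask.
    mask = np.zeros(frame.shape[:2], dtype=bool)
    x, y, w, h = box
    mask[y:y + h, x:x + w] = True
    return mask

def run_pipeline(frames, init_box):
    """Per-frame loop: only the current box is carried between frames,
    so segmentation results never accumulate in memory."""
    box = init_box
    masks_out = []
    for frame in frames:
        box = track(frame, box)      # propagation step
        mask = segment(frame, box)   # segmentation step
        masks_out.append(mask)       # written out, not fed back
    return masks_out

frames = [np.zeros((48, 64, 3), dtype=np.uint8) for _ in range(3)]
masks = run_pipeline(frames, init_box=(10, 8, 20, 16))
print(len(masks), int(masks[0].sum()))  # → 3 320
```

The key design point is that the only state passed from frame to frame is a bounding box, which is what keeps VRAM usage constant regardless of video length.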
2. Preliminary
3. Materials and Methods
3.1. Task Definition
3.2. Mask Generator
3.3. Tracker
3.4. Diffusion Module
4. Experiments
4.1. Dataset
4.2. Experimental Environment
4.3. Evaluation Metrics
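The J, F, and J&F scores reported in the tables below are the standard DAVIS-style measures: region similarity J (mask IoU), contour accuracy F (an F-measure over boundary pixels), and their mean. A minimal NumPy sketch follows; note this is a simplified zero-tolerance variant of F, whereas the official DAVIS protocol matches boundary pixels within a small distance.

```python
import numpy as np

def region_similarity(pred, gt):
    """J: intersection-over-union (Jaccard index) of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def boundary(mask):
    """Boundary pixels: mask pixels with at least one 4-neighbor
    outside the mask (a simple erosion-based contour)."""
    m = mask.astype(bool)
    pad = np.pad(m, 1, constant_values=False)
    interior = (pad[:-2, 1:-1] & pad[2:, 1:-1] &
                pad[1:-1, :-2] & pad[1:-1, 2:])
    return m & ~interior

def contour_accuracy(pred, gt):
    """F: F-measure between the two boundary pixel sets."""
    bp, bg = boundary(pred), boundary(gt)
    if bp.sum() == 0 and bg.sum() == 0:
        return 1.0
    hits = (bp & bg).sum()
    precision = hits / max(bp.sum(), 1)
    recall = hits / max(bg.sum(), 1)
    return (2 * precision * recall / (precision + recall)
            if precision + recall else 0.0)

# Toy example: prediction covers 3/4 of a square ground-truth object.
gt = np.zeros((32, 32), dtype=bool); gt[8:24, 8:24] = True
pred = np.zeros_like(gt); pred[8:24, 8:20] = True
j, f = region_similarity(pred, gt), contour_accuracy(pred, gt)
print(round(j, 3), round((j + f) / 2, 3))  # J and J&F
```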
4.4. Accuracy Improvement Validation of Diffusion Module
4.5. Lightweight Validation of MobileSAM
4.6. Comparison with Other Related Methods
4.7. Generalization Ability Validation
5. Discussion
5.1. Interpretation and Evaluation of Experimental Results
5.2. Implications of This Work
- (1) Inspired by propagation-based S-VOS networks, we decompose the S-VOS task into two subtasks, tracking and segmentation, which leverage mature achievements in Visual Object Tracking (VOT) [45,46,47,48,49] and instance segmentation [50,51,52,53,54]. Our model inherits its lightweight design and fast inference speed from CSR-DCF [30] and MobileSAM [24].
- (2)
- (3) We introduce a diffusion module that enhances the accuracy and robustness of segmentation without increasing VRAM usage. Our model demonstrates clear advantages in inference efficiency, VRAM utilization, and segmentation accuracy.
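The diffusion idea referenced above can be pictured as constrained region growing: starting from an initial mask, the covered region is repeatedly expanded and clipped to the area the segmenter accepts as the object. The sketch below is a toy stand-in, not the paper's actual module: `object_support` substitutes for the union of sub-masks that repeated MobileSAM prompts would return, and reusing the "number of sub-masks" parameter as an iteration count is our assumption.

```python
import numpy as np

def dilate(mask):
    """One step of 4-neighbor binary dilation, implemented with shifts."""
    pad = np.pad(mask, 1, constant_values=False)
    return (pad[1:-1, 1:-1] | pad[:-2, 1:-1] | pad[2:, 1:-1] |
            pad[1:-1, :-2] | pad[1:-1, 2:])

def diffuse_mask(seed_mask, object_support, num_steps=5):
    """Grow the seed mask over the object: each round expands the
    current mask by one dilation step and clips it to the region the
    segmenter accepts (`object_support` is a toy stand-in for the
    union of sub-masks from repeated prompts)."""
    mask = seed_mask.copy()
    for _ in range(num_steps):
        grown = dilate(mask) & object_support
        if (grown == mask).all():   # converged: nothing new covered
            break
        mask = grown
    return mask

# Toy example: the object occupies a 10x10 square, but the initial
# mask covers only its upper-left quarter.
obj = np.zeros((20, 20), dtype=bool); obj[5:15, 5:15] = True
seed = np.zeros_like(obj); seed[5:10, 5:10] = True
out = diffuse_mask(seed, obj, num_steps=10)
print(int(seed.sum()), int(out.sum()), int(obj.sum()))  # → 25 100 100
```

Because the diffusion operates on the current frame's mask alone, it adds accuracy without requiring any extra stored state, which is consistent with the VRAM claim above.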
5.3. Limitations of Our Method and Future Research Directions
- (1) The proposed model has weak anti-interference ability under occlusion. When the target is partially or completely occluded by other objects, the segmentation results tend to be erroneous or unstable. This is mainly because the mask generator lacks a prior estimate from the previous frame's mask, which hinders its ability to maintain the continuity and integrity of the target.
- (2) This paper adopts a correlation-filter-based tracker as the propagation component, which offers fast speed and strong robustness but also has limitations. For example, the tracker is insensitive to appearance changes and motion patterns of the target, which may cause tracking drift or loss; segmentation accuracy therefore largely depends on tracker performance. In future work, we plan to improve the tracker's accuracy and stability.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Luo, S.; Yu, J.; Xi, Y.; Liao, X. Aircraft Target Detection in Remote Sensing Images Based on Improved YOLOv5. IEEE Access 2022, 10, 5184–5192. [Google Scholar] [CrossRef]
- Zhou, L.; Yan, H.; Shan, Y.; Zheng, C.; Liu, Y.; Zuo, X.; Qiao, B. Aircraft Detection for Remote Sensing Images Based on Deep Convolutional Neural Networks. J. Electr. Comput. Eng. 2021, 2021, 4685644. [Google Scholar] [CrossRef]
- Li, Y.; Zhao, J.; Zhang, S.; Tan, W. Aircraft Detection in Remote Sensing Images Based on Deep Convolutional Neural Network. In Proceedings of the 2018 IEEE 3rd International Conference on Cloud Computing and Internet of Things (CCIOT), Dalian, China, 20–21 October 2018; pp. 135–138. [Google Scholar]
- Wu, S.; Zhang, K.; Li, S.; Yan, J. Learning to Track Aircraft in Infrared Imagery. Remote Sens. 2020, 12, 3995. [Google Scholar] [CrossRef]
- Oh, S.W.; Lee, J.-Y.; Xu, N.; Kim, S.J. Video Object Segmentation Using Space-Time Memory Networks. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9225–9234. [Google Scholar]
- Cheng, H.K.; Tai, Y.-W.; Tang, C.-K. Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation. In Proceedings of the Advances in Neural Information Processing Systems, New Orleans, LA, USA, 9 June 2021; Volume 15, pp. 11781–11794. [Google Scholar]
- Wang, H.; Jiang, X.; Ren, H.; Hu, Y.; Bai, S. SwiftNet: Real-Time Video Object Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 1296–1305. [Google Scholar]
- Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.-Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 4015–4026. [Google Scholar]
- Chen, K.; Liu, C.; Chen, H.; Zhang, H.; Li, W.; Zou, Z.; Shi, Z. RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation Based on Visual Foundation Model. arXiv 2023, arXiv:2306.16269. [Google Scholar]
- Wang, Y.; Zhao, Y.; Petzold, L. An Empirical Study on the Robustness of the Segment Anything Model (SAM). arXiv 2023, arXiv:2305.06422. [Google Scholar]
- Huang, Y.; Yang, X.; Liu, L.; Zhou, H.; Chang, A.; Zhou, X.; Chen, R.; Yu, J.; Chen, J.; Chen, C.; et al. Segment Anything Model for Medical Images? arXiv 2023, arXiv:2304.14660. [Google Scholar]
- Caelles, S.; Maninis, K.K.; Pont-Tuset, J.; Leal-Taixé, L.; Cremers, D.; Van Gool, L. One-Shot Video Object Segmentation. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 5320–5329. [Google Scholar]
- Perazzi, F.; Pont-Tuset, J.; McWilliams, B.; Van Gool, L.; Gross, M.; Sorkine-Hornung, A. A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 30 June 2016; pp. 724–732. [Google Scholar]
- Perazzi, F.; Khoreva, A.; Benenson, R.; Schiele, B.; Sorkine-Hornung, A. Learning Video Object Segmentation from Static Images. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3491–3500. [Google Scholar]
- Cheng, H.K.; Tai, Y.W.; Tang, C.K. Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 5555–5564. [Google Scholar] [CrossRef]
- Cheng, H.K.; Schwing, A.G. XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2022; Volume 13688, pp. 640–658. [Google Scholar]
- Li, M.; Hu, L.; Xiong, Z.; Zhang, B.; Pan, P.; Liu, D. Recurrent Dynamic Embedding for Video Object Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 24 June 2022; pp. 1322–1331. [Google Scholar]
- Liang, Y.; Li, X.; Jafari, N.; Chen, Q. Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 15 October 2020; Volume 2020, pp. 3430–3441. [Google Scholar]
- Li, X.; Loy, C.C. Video Object Segmentation with Joint Re-Identification and Attention-Aware Mask Propagation; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2018; Volume 11207, pp. 93–110. [Google Scholar]
- Rahmatulloh, A.; Gunawan, R.; Sulastri, H.; Pratama, I.; Darmawan, I. Face Mask Detection Using Haar Cascade Classifier Algorithm Based on Internet of Things with Telegram Bot Notification. In Proceedings of the 2021 International Conference Advancement in Data Science, E-Learning and Information Systems, ICADEIS 2021, Nusa Dua Bali, Indonesia, 13–14 October 2021. [Google Scholar]
- Lakhan, A.; Elhoseny, M.; Mohammed, M.A.; Jaber, M.M. SFDWA: Secure and Fault-Tolerant Aware Delay Optimal Workload Assignment Schemes in Edge Computing for Internet of Drone Things Applications. Wirel. Commun. Mob. Comput. 2022, 2022, 5667012. [Google Scholar] [CrossRef]
- Mostafa, S.A.; Mustapha, A.; Gunasekaran, S.S.; Ahmad, M.S.; Mohammed, M.A.; Parwekar, P.; Kadry, S. An Agent Architecture for Autonomous UAV Flight Control in Object Classification and Recognition Missions. Soft Comput. 2023, 27, 391–404. [Google Scholar] [CrossRef]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image Is Worth 16 × 16 Words: Transformers for Image Recognition At Scale. In Proceedings of the ICLR 2021—9th International Conference on Learning Representations, Virtual Event, Austria, 3–7 May 2021. [Google Scholar]
- Zhang, C.; Han, D.; Qiao, Y.; Kim, J.U.; Bae, S.-H.; Lee, S.; Hong, C.S. Faster Segment Anything: Towards Lightweight SAM for Mobile Applications. arXiv 2023, arXiv:2306.14289. [Google Scholar]
- Wu, K.; Zhang, J.; Peng, H.; Liu, M.; Xiao, B.; Fu, J.; Yuan, L. TinyViT: Fast Pretraining Distillation for Small Vision Transformers; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2022; Volume 13681, pp. 68–85. [Google Scholar]
- Shelhamer, E.; Long, J.; Darrell, T. Fully Convolutional Networks for Semantic Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 640–651. [Google Scholar] [CrossRef] [PubMed]
- Held, D.; Thrun, S.; Savarese, S. Learning to Track at 100 FPS with Deep Regression Networks; Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Springer: Berlin/Heidelberg, Germany, 2016; Volume 9905, pp. 749–765. [Google Scholar]
- Bolme, D.S.; Beveridge, J.R.; Draper, B.A.; Lui, Y.M. Visual Object Tracking Using Adaptive Correlation Filters. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; pp. 2544–2550. [Google Scholar]
- Henriques, J.F.; Caseiro, R.; Martins, P.; Batista, J. High-Speed Tracking with Kernelized Correlation Filters. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 583–596. [Google Scholar] [CrossRef] [PubMed]
- Lukežič, A.; Vojíř, T.; Čehovin Zajc, L.; Matas, J.; Kristan, M. Discriminative Correlation Filter Tracker with Channel and Spatial Reliability. Int. J. Comput. Vis. 2018, 126, 671–688. [Google Scholar] [CrossRef]
- Feng, Q.; Xu, X.; Wang, Z. Deep Learning-Based Small Object Detection: A Survey. Math. Biosci. Eng. 2023, 20, 6551–6590. [Google Scholar] [CrossRef] [PubMed]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 2017, pp. 5999–6009. [Google Scholar]
- Li, R.Y.M.; Tang, B.; Chau, K.W. Sustainable Construction Safety Knowledge Sharing: A Partial Least Square-Structural Equation Modeling and a Feedforward Neural Network Approach. Sustainability 2019, 11, 5831. [Google Scholar] [CrossRef]
- Nguyen, A.; Pham, K.; Ngo, D.; Ngo, T.; Pham, L. An Analysis of State-of-the-Art Activation Functions for Supervised Deep Neural Network. In Proceedings of the 2021 International Conference on System Science and Engineering, ICSSE 2021, Ho Chi Minh City, Vietnam, 26–28 August 2021; pp. 215–220. [Google Scholar]
- Tancik, M.; Srinivasan, P.P.; Mildenhall, B.; Fridovich-Keil, S.; Raghavan, N.; Singhal, U.; Ramamoorthi, R.; Barron, J.T.; Ng, R. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–12 December 2020; Volume 2020. [Google Scholar]
- Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 20–25 June 2005; pp. 886–893. [Google Scholar]
- Zamir, S.W.; Arora, A.; Gupta, A.; Khan, S.; Sun, G.; Khan, F.S.; Zhu, F.; Shao, L.; Xia, G.-S.; Bai, X. ISAID: A Large-Scale Dataset for Instance Segmentation in Aerial Images. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 28–37. [Google Scholar]
- Xia, G.S.; Bai, X.; Ding, J.; Zhu, Z.; Belongie, S.; Luo, J.; Datcu, M.; Pelillo, M.; Zhang, L. DOTA: A Large-Scale Dataset for Object Detection in Aerial Images. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 3974–3983. [Google Scholar]
- Shermeyer, J.; Hossler, T.; Van Etten, A.; Hogan, D.; Lewis, R.; Kim, D. RarePlanes: Synthetic Data Takes Flight. In Proceedings of the 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2021; pp. 207–217. [Google Scholar]
- Li, F.; Kim, T.; Humayun, A.; Tsai, D.; Rehg, J.M. Video Segmentation by Tracking Many Figure-Ground Segments. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2192–2199. [Google Scholar]
- Pont-Tuset, J.; Perazzi, F.; Caelles, S.; Arbeláez, P.; Sorkine-Hornung, A.; Van Gool, L. The 2017 DAVIS Challenge on Video Object Segmentation. arXiv 2017, arXiv:1704.00675. [Google Scholar]
- Xu, N.; Yang, L.; Fan, Y.; Yang, J.; Yue, D.; Liang, Y.; Price, B.; Cohen, S.; Huang, T. YouTube-VOS: Sequence-to-Sequence Video Object Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 603–619. [Google Scholar]
- Oh, S.W.; Lee, J.Y.; Sunkavalli, K.; Kim, S.J. Fast Video Object Segmentation by Reference-Guided Mask Propagation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7376–7385. [Google Scholar]
- Chiroma, H.; Herawan, T.; Fister, I.; Fister, I.; Abdulkareem, S.; Shuib, L.; Hamza, M.F.; Saadi, Y.; Abubakar, A. Bio-Inspired Computation: Recent Development on the Modifications of the Cuckoo Search Algorithm. Appl. Soft Comput. J. 2017, 61, 149–173. [Google Scholar] [CrossRef]
- Chen, X.; Yan, B.; Zhu, J.; Lu, H.; Ruan, X.; Wang, D. High-Performance Transformer Tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8507–8523. [Google Scholar] [CrossRef] [PubMed]
- Zhao, J.; Dai, K.; Zhang, P.; Wang, D.; Lu, H. Robust Online Tracking with Meta-Updater. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 6168–6182. [Google Scholar] [CrossRef] [PubMed]
- Zhu, J.; Lai, S.; Chen, X.; Wang, D.; Lu, H. Visual Prompt Multi-Modal Tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 9516–9526. [Google Scholar]
- Chen, X.; Peng, H.; Wang, D.; Lu, H.; Hu, H. SeqTrack: Sequence to Sequence Learning for Visual Object Tracking. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 14572–14581. [Google Scholar]
- Liu, S.; Li, X.; Lu, H.; He, Y. Multi-Object Tracking Meets Moving UAV. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; Volume 2022, pp. 8866–8875. [Google Scholar]
- Li, R.; He, C.; Li, S.; Zhang, Y.; Zhang, L. DynaMask: Dynamic Mask Selection for Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 11279–11288. [Google Scholar]
- Li, R.; He, C.; Zhang, Y.; Li, S.; Chen, L.; Zhang, L. SIM: Semantic-Aware Instance Mask Generation for Box-Supervised Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 7193–7203. [Google Scholar]
- Zhang, T.; Wei, S.; Ji, S. E2EC: An End-to-End Contour-Based Method for High-Quality High-Speed Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 4443–4452. [Google Scholar]
- Zhu, C.; Zhang, X.; Li, Y.; Qiu, L.; Han, K.; Han, X. SharpContour: A Contour-Based Boundary Refinement Approach for Efficient and Accurate Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; Volume 2022, pp. 4382–4391. [Google Scholar]
- Cheng, T.; Wang, X.; Chen, S.; Zhang, W.; Zhang, Q.; Huang, C.; Zhang, Z.; Liu, W. Sparse Instance Activation for Real-Time Instance Segmentation. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; Volume 2022, pp. 4423–4432. [Google Scholar]
Dataset | Number of Videos | Number of Frames | Resolution | Maximum Frame Number | Average Pixel Ratio
---|---|---|---|---|---
SegTrack [40] | 24 | 1516 | - | 279 | 5.39%
DAVIS-16 [13] | 50 | 3455 | 854 × 480 | 104 | 8.09%
DAVIS-17 [41] | 209 | 13,586 | 854 × 480 | 104 | 6.09%
YouTube-VOS [42] | 6559 | 160,697 | 1280 × 720 | 36 | 10.77%
Ours | 28 | 6280 | 854 × 480 | 327 | 4.76%
Device | Cropping Scaling Factor | Mask Binarization Threshold | Number of Sub-Masks |
---|---|---|---|
GTX 1080 Ti | 0.15 | 0.3 | 5 |
Jetson Xavier NX | 0.15 | 0.3 | 2 |
Image Encoder Structure | With Mask Diffusion Module | J (%) | F (%) | J&F (%) | FPS | VRAM Usage |
---|---|---|---|---|---|---|
Tiny-ViT | Yes | 60.4 | 72.4 | 66.4 | 12.31 | 425 MB |
ViT-B | Yes | 62.4 | 73.2 | 67.8 | 3.30 | 3269 MB |
ViT-L | Yes | 63.2 | 75.8 | 69.5 | 1.50 | 5047 MB |
ViT-H | Yes | 64.5 | 76.4 | 70.5 | 0.95 | 6570 MB |
Tiny-ViT | No | 51.1 | 63.1 | 57.1 | 16.95 | 425 MB |
ViT-B | No | 52.5 | 63.5 | 58.0 | 3.73 | 3269 MB |
ViT-L | No | 52.7 | 65.4 | 59.1 | 1.73 | 5047 MB |
ViT-H | No | 52.1 | 66.0 | 59.1 | 0.97 | 6570 MB |
Methods | Init Methods | J (%) | F (%) | J&F (%) | FPS | Maximum VRAM Usage |
---|---|---|---|---|---|---|
OSVOS [12] | Mask | 37.0 | 46.6 | 41.8 | 15.99 | 3246 MB |
RGMP [43] | Mask | 53.9 | 66.3 | 60.1 | 7.29 | 3584 MB |
STM [5] | Mask | 44.6 | 60.1 | 52.4 | 6.89 | 7308 MB |
SwiftNet [7] | Mask | 49.8 | 60.3 | 55.1 | 10.27 | 6998 MB |
MiVOS [15] | Click | 45.7 | 64.1 | 54.9 | 6.38 | 7254 MB |
STCN [6] | Mask | 44.5 | 66.2 | 55.4 | 9.53 | 7841 MB |
XMem [16] | Click | 55.8 | 69.2 | 62.5 | 10.76 | 4211 MB |
MobileSAM-Track | Box | 61.4 | 71.4 | 66.4 | 12.31 | 425 MB |
Liu, Y.; Zhao, Y.; Zhang, X.; Wang, X.; Lian, C.; Li, J.; Shan, P.; Fu, C.; Lyu, X.; Li, L.; et al. MobileSAM-Track: Lightweight One-Shot Tracking and Segmentation of Small Objects on Edge Devices. Remote Sens. 2023, 15, 5665. https://doi.org/10.3390/rs15245665