Thermal Video Enhancement Mamba: A Novel Approach to Thermal Video Enhancement for Real-World Applications
Abstract
1. Introduction
- We introduce a novel Mamba model for thermal video enhancement that integrates the SS2D module with CNNs to handle complex motion and challenging lighting conditions (a simplified sketch of the pipeline follows this list). The model includes:
- (a) The Basic Denoising module, which reduces noise and improves image quality.
- (b) The Optical Flow Attention module, which provides blur-resistant motion deblurring and preserves scene details even under challenging circumstances.
- We create a labeled thermal video dataset using entropy-based measures to produce meaningful labels for training and evaluation (a toy example of entropy scoring also follows this list). The dataset includes over three video sequence pairs with 4k frame pairs.
- We evaluate the proposed framework on real-world scenarios such as wildlife monitoring and autonomous systems. Our experiments cover diverse thermal video datasets, including BIRDSAI [20], FLIR [21], CAMEL [22], Autonomous Vehicles [23], and Solar Panel [24], each presenting unique challenges. Compared with two traditional methods, DCRGC [25] and RLBHE [26], and five deep learning-based approaches, IE-CGAN [13], BBCNN [27], IDTransformer [28], AverNet [18], and Shift-Net [17], the proposed Mamba model consistently outperforms existing solutions. This is demonstrated through qualitative improvements and quantitative assessments using state-of-the-art thermal image quality measures such as EME [29], BDIM [30], DMTE [31], MDIMTE [31], LGTA [32], and BIE [33].
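For intuition, the following minimal sketch shows how a denoise → flow-attention → deblur pipeline of this kind can be composed. It is an illustrative skeleton under our own assumptions, not the authors' implementation: the module names (`BasicDenoise`, `OpticalFlowAttention`, `TVEMambaSketch`), layer widths, and the residual/fusion rules are placeholders, and the SS2D state-space blocks at the heart of the actual model are omitted for brevity.

```python
# Hypothetical, heavily simplified skeleton of a denoise -> flow-attention ->
# deblur pipeline. Module names, layer widths, and the fusion rule are our
# own placeholders; the SS2D/Mamba state-space blocks of the real model are
# omitted entirely.
import torch
import torch.nn as nn

class BasicDenoise(nn.Module):
    """Stand-in for the Basic Denoising module: predicts a noise residual."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, 1, H, W) single-channel thermal frame
        return x - self.body(x)  # subtract the predicted noise

class OpticalFlowAttention(nn.Module):
    """Stand-in for Optical Flow Attention: fuses the current frame with the
    previous one through a learned per-pixel attention map."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.attn = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, cur: torch.Tensor, prev: torch.Tensor) -> torch.Tensor:
        a = self.attn(torch.cat([cur, prev], dim=1))  # (B, 1, H, W) in [0, 1]
        return a * prev + (1.0 - a) * cur  # trust prev only where it helps

class TVEMambaSketch(nn.Module):
    """Toy composition: denoise both frames, fuse, then deblur."""
    def __init__(self):
        super().__init__()
        self.denoise = BasicDenoise()
        self.fuse = OpticalFlowAttention()
        self.deblur = BasicDenoise()  # placeholder for the deblurring network

    def forward(self, cur: torch.Tensor, prev: torch.Tensor) -> torch.Tensor:
        return self.deblur(self.fuse(self.denoise(cur), self.denoise(prev)))

# Toy usage on two consecutive 256x256 thermal frames.
model = TVEMambaSketch()
cur, prev = torch.rand(1, 1, 256, 256), torch.rand(1, 1, 256, 256)
print(model(cur, prev).shape)  # torch.Size([1, 1, 256, 256])
```

The design point illustrated is the fusion step: a per-pixel attention map decides how much of the previous frame to trust, which is what makes temporal aggregation robust to motion blur.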
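The entropy-based labeling idea in the second contribution can be pictured with plain Shannon entropy over intensity histograms; the scoring function and the keep/discard criterion below are our own simplifications, not the paper's exact measure.

```python
# Hypothetical illustration of entropy-based frame labeling; the paper's
# actual entropy measure and selection rule may differ.
import numpy as np

def shannon_entropy(frame: np.ndarray, bins: int = 256) -> float:
    """Shannon entropy (bits) of a grayscale frame's intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log2 is defined
    return float(-(p * np.log2(p)).sum())

def keep_pair(degraded: np.ndarray, enhanced: np.ndarray) -> bool:
    """Toy labeling rule: keep a training pair only if the enhanced frame
    carries more information (higher entropy) than the degraded one."""
    return shannon_entropy(enhanced) > shannon_entropy(degraded)

frame = np.random.randint(0, 256, (512, 640), dtype=np.uint8)
print(f"entropy: {shannon_entropy(frame):.2f} bits")
```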
2. Related Works
2.1. Thermal Imaging Enhancement Models
2.2. Video Enhancement Models
2.3. State Space Models
3. Materials and Methods
3.1. Network Structure
3.2. Training and Dataset
3.2.1. Dataset Generation
3.2.2. Sharpening and Denoising Network
3.2.3. Blur-Resistant Motion Estimation Network
3.2.4. Motion Deblurring Network
4. Results
4.1. Qualitative Comparison
4.2. Quantitative Comparison
4.3. Evaluation Metrics for Object Detection
4.4. Ablation Study
5. Discussion
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Shidik, G.; Noersasongko, E.; Nugraha, A.; Andono, P.N.; Jumanto, J.; Kusuma, E.J. A systematic review of intelligence video surveillance: Trends, techniques, frameworks, and datasets. IEEE Access 2019, 7, 170457–170473. [Google Scholar] [CrossRef]
- Karpuzov, S.; Petkov, G.; Ilieva, S.; Petkov, A.; Kalitzin, S. Object tracking based on optical flow reconstruction of motion-group parameters. Information 2024, 15, 296. [Google Scholar] [CrossRef]
- Alsrehin, N.; Klaib, A.; Magableh, A. Intelligent transportation and control systems using data mining and machine learning techniques: A comprehensive study. IEEE Access 2019, 7, 49830–49857. [Google Scholar] [CrossRef]
- Zhang, L.; Xiong, N.; Gao, W.; Wu, P. Improved detection method for micro-targets in remote sensing images. Information 2024, 15, 108. [Google Scholar] [CrossRef]
- Dong, Y.; Pan, W.D. A survey on compression domain image and video data processing and analysis techniques. Information 2023, 14, 184. [Google Scholar] [CrossRef]
- Yoshida, E.; Kato, S.; Sato, T.; Suzuki, T.; Koyama, H.; Kato, S. Proposal and prototyping on wildlife tracking system using infrared sensors. In Proceedings of the International Conference on Information Networking (ICOIN), Jeju-si, Republic of Korea, 7–10 January 2022; pp. 292–297. [Google Scholar]
- Kim, E.; Kim, W.; Park, J.; Yeo, K. Human detection in infrared image using daytime model-based transfer learning for military surveillance system. In Proceedings of the 14th International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 11–13 October 2023; pp. 1306–1308. [Google Scholar]
- Yuan, C.-J.; Lan, S.-J.; Yan, G.-D.; Wang, D.; Lu, J.-H.; Meng, Q.-F. Application of near-infrared spectroscopy in rapid determination of adenosine and polysaccharide in Cordyceps militaris. In Proceedings of the Fifth International Conference on Natural Computation, Tianjin, China, 14–16 August 2009; pp. 578–582. [Google Scholar]
- Alheeti, K.; McDonald-Maier, K. An intelligent security system for autonomous cars based on infrared sensors. In Proceedings of the 23rd International Conference on Automation and Computing (ICAC), Huddersfield, UK, 7–8 September 2017; pp. 1–5. [Google Scholar]
- Mo, F.; Li, H.; Yao, X.; Wang, Q.; Jing, Q.; Zhang, L. Intelligent onboard processing and multichannel transmission technology for infrared remote sensing data. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 28 July–2 August 2019; pp. 9063–9066. [Google Scholar]
- Wu, H.; Chen, B.; Guo, Z.; He, C.; Luo, S. Mini-infrared thermal imaging system image denoising with multi-head feature fusion and detail enhancement network. Opt. Laser Technol. 2024, 179, 111311. [Google Scholar] [CrossRef]
- Shi, Y.; Zhang, H.; Li, J.; Wang, X.; Zhao, Q.; Liu, T. GAPANet: Group alternate perceived attention network for optical imaging infrared thermal radiation effect correction. Opt. Express 2024, 32, 35888–35902. [Google Scholar] [CrossRef]
- Kuang, X.; Sui, X.; Liu, Y.; Chen, Q.; Gu, G. Single infrared image enhancement using a deep convolutional neural network. Neurocomputing 2019, 332, 119–128. [Google Scholar] [CrossRef]
- Xu, Z.; Zhao, H.; Zheng, Y.; Guo, H.; Li, S.; Lyu, Z. A dual nonsubsampled contourlet network for synthesis images and infrared thermal images denoising. PeerJ Comput. Sci. 2024, 10, e1817. [Google Scholar] [CrossRef] [PubMed]
- Marnissi, M.; Fathallah, A. Revolutionizing thermal imaging: GAN-based vision transformers for image enhancement. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–12 October 2023; pp. 2735–2739. [Google Scholar]
- Yang, Y.; Aviles-Rivero, A.I.; Fu, H.; Liu, Y.; Wang, W.; Zhu, L. Video adverse-weather-component suppression network via weather messenger and adversarial backpropagation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–5 October 2023; pp. 13200–13210. [Google Scholar]
- Li, D.; Shi, X.; Zhang, Y.; Cheung, K.C.; See, S.; Wang, X.; Qin, H.; Li, H. A simple baseline for video restoration with grouped spatial-temporal shift. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 9822–9832. [Google Scholar]
- Zhao, H.; Chen, L.; Wang, M.; Li, J.; Xu, T.; Zhou, Q. AverNet: All-in-one video restoration for time-varying unknown degradations. In Proceedings of the Thirty-Eighth Annual Conference on Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 10–15 December 2024. [Google Scholar]
- Park, J.; Kim, H.; Lee, J.; Choi, Y.; Song, M. VideoMamba: Spatio-temporal selective state space model. In Proceedings of the European Conference on Computer Vision (ECCV), Milan, Italy, 29 September–4 October 2024; Springer: Cham, Switzerland, 2024. [Google Scholar]
- Bondi, E.; Jain, R.; Aggrawal, P.; Anand, S.; Hannaford, R.; Kapoor, A.; Piavis, J.; Shah, S.; Joppa, L.; Dilkina, B.; et al. BIRDSAI: A dataset for detection and tracking in aerial thermal infrared videos. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020; pp. 1736–1745. [Google Scholar]
- FLIR Thermal Dataset. Available online: https://www.kaggle.com/datasets/deepnewbie/flir-thermal-images-dataset (accessed on 21 September 2024).
- Gebhardt, E.; Wolf, M. CAMEL dataset for visual and thermal infrared multiple object detection and tracking. In Proceedings of the 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018; pp. 1–6. [Google Scholar]
- Takumi, K.; Watanabe, K.; Ha, Q.; Tejero-De-Pablos, A.; Ushiku, Y.; Harada, T. Multispectral object detection for autonomous vehicles. In Proceedings of the Thematic Workshops of ACM Multimedia, Mountain View, CA, USA, 23–27 October 2017; pp. 35–43. [Google Scholar]
- Bommes, L.; Buerhop-Lutz, C.; Pickel, T.; Hauch, J.; Brabec, C.; Peters, I.M. Georeferencing of photovoltaic modules from aerial infrared videos using structure-from-motion. Prog. Photovolt. 2022, 30, 1122–1135. [Google Scholar] [CrossRef]
- Wang, Z.; Liang, Z.; Liu, C. A real-time image processor with combining dynamic contrast ratio enhancement and inverse gamma correction for PDP. Displays 2009, 30, 133–139. [Google Scholar] [CrossRef]
- Zuo, C.; Chen, Q.; Sui, X. Range limited bi-histogram equalization for image contrast enhancement. Optik 2013, 124, 425–431. [Google Scholar] [CrossRef]
- Lee, K.; Lee, J.; Lee, J.; Hwang, S.; Lee, S. Brightness-based convolutional neural network for thermal image enhancement. IEEE Access 2017, 5, 26867–26879. [Google Scholar] [CrossRef]
- Shen, Z.; Wang, H.; Zhang, X.; Li, J.; Liu, T.; Chen, M. IDTransformer: Infrared image denoising method based on convolutional transposed self-attention. Alex. Eng. J. 2024, 110, 310–321. [Google Scholar] [CrossRef]
- Agaian, S.; Panetta, K.; Grigoryan, A. A new measure of image enhancement. In Proceedings of the IASTED International Conference on Signal Processing and Communications (SPC), Marbella, Spain, 19–21 September 2000. [Google Scholar]
- Trongtirakul, T.; Agaian, S. Unsupervised and optimized thermal image quality enhancement and visual surveillance application. Signal Process. Image Commun. 2022, 105, 116714. [Google Scholar] [CrossRef]
- Agaian, S.; Roopaei, M.; Akopian, D. Thermal-image quality measurements. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 1–5. [Google Scholar]
- Agaian, S.; Ayunts, H.; Trongtirakul, T.; Hovhannisyan, S. A new method for judging thermal image quality with applications. Signal Process. 2024, 229, 109769. [Google Scholar] [CrossRef]
- Ayunts, H.; Grigoryan, A.; Agaian, S. Novel entropy for enhanced thermal imaging and uncertainty quantification. Entropy 2024, 26, 374. [Google Scholar] [CrossRef] [PubMed]
- Gonzalez, R.; Woods, R. Digital Image Processing, 2nd ed.; Prentice-Hall: Upper Saddle River, NJ, USA, 2002; Volume 793. [Google Scholar]
- Dhariwal, S. Comparative analysis of various image enhancement techniques. Int. J. Electron. Commun. Technol. 2011, 2, 91–95. [Google Scholar]
- Mudavath, T.; Niranjan, V. Thermal image enhancement for adverse weather scenarios: A wavelet transform and histogram clipping approach. Signal Image Video Process. 2024, in press. [Google Scholar] [CrossRef]
- Grigoryan, A.; Agaian, S. Asymmetric and symmetric gradient operators with application in face recognition in Renaissance portrait art. In Proceedings of the SPIE Defense + Commercial Sensing, Mobile Multimedia/Image Processing, Security, and Applications, Baltimore, MD, USA, 14–18 April 2019; Volume 10993, p. 12. [Google Scholar]
- Agaian, S.; Panetta, K.; Grigoryan, A. Transform-based image enhancement algorithms with performance measure. IEEE Trans. Image Process. 2001, 10, 367–382. [Google Scholar] [CrossRef]
- Kastrinaki, V.; Zervakis, M.; Kalaitzakis, K. A survey of video processing techniques for traffic applications. Image Vis. Comput. 2003, 21, 359–381. [Google Scholar] [CrossRef]
- González-Cepeda, J.; Ramajo, Á.; Armingol, J. Intelligent video surveillance systems for vehicle identification based on multinet architecture. Information 2022, 13, 325. [Google Scholar] [CrossRef]
- Rao, Y.; Lin, W.; Chen, L. Image-based fusion for video enhancement of nighttime surveillance. Opt. Eng. 2010, 49, 120501. [Google Scholar] [CrossRef]
- Agaian, S.; Blair, S.; Panetta, K. Transform coefficient histogram-based image enhancement algorithms using contrast entropy. IEEE Trans. Image Process. 2007, 16, 741–758. [Google Scholar] [CrossRef] [PubMed]
- Reinhard, E.; Ward, G.; Pattanaik, S.; Debevec, P. High Dynamic Range Imaging: Acquisition, Display, and Image-Based Lighting; Morgan: San Francisco, CA, USA, 2005. [Google Scholar]
- Lee, S. An efficient content-based image enhancement in the compressed domain using Retinex theory. IEEE Trans. Circuits Syst. Video Tech. 2007, 17, 199–213. [Google Scholar] [CrossRef]
- Balster, E.; Zheng, Y.F.; Ewing, R.L. Combined spatial and temporal domain wavelet shrinkage algorithm for video denoising. IEEE Trans. Circuits Syst. Video Technol. 2006, 16, 220–230. [Google Scholar]
- Wan, T.; Tzagkarakis, G.; Tsakalides, P.; Canagarajah, C.N.; Achim, A. Context enhancement through image fusion: A multi-resolution approach based on convolution of Cauchy distributions. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Las Vegas, NV, USA, 31 March–4 April 2008; pp. 1309–1312. [Google Scholar]
- Li, J.; Li, S.Z.; Pan, Q.; Yang, T. Illumination and motion-based video enhancement for night surveillance. In Proceedings of the IEEE workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance, Beijing, China, 15–16 October 2005; pp. 169–175. [Google Scholar]
- Land, E.H.; McCann, J.J. Lightness and Retinex theory. J. Opt. Soc. Am. 1971, 61, 1–11. [Google Scholar] [CrossRef] [PubMed]
- Stauffer, C.; Grimson, W.E.L. Learning patterns of activity using real-time tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 747–757. [Google Scholar] [CrossRef]
- Liang, J.; Yang, Y.; Li, B.; Duan, P.; Xu, Y.; Shi, B. Coherent event guided low-light video enhancement. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–5 October 2023; pp. 10615–10625. [Google Scholar]
- Cho, Y.; Lee, H.; Park, D.; Kim, C.Y. Enhancement for temporal resolution of video based on multi-frame feature trajectory and occlusion compensation. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 7–10 November 2009; pp. 389–392. [Google Scholar]
- Gu, A.; Dao, T. Mamba: Linear-time sequence modeling with selective state spaces. arXiv 2023, arXiv:2312.00752. [Google Scholar]
- Yuan, C.; Zhao, D.; Agaian, S.S. MUCM-Net: A Mamba powered UCM-Net for skin lesion segmentation. Explor. Med. 2024, 5, 694–708. [Google Scholar] [CrossRef]
- Zhang, H.; Zhu, Y.; Wang, D.; Zhang, L.; Chen, T.; Ye, Z. A survey on visual Mamba. arXiv 2024, arXiv:2404.15956v2. [Google Scholar]
- Bai, J.; Yin, Y.; He, Q.; Li, Y.; Zhang, X. Retinexmamba: Retinex-based mamba for low-light image enhancement. arXiv 2024, arXiv:2405.03349. [Google Scholar]
- Zhang, Z.; Jiang, H.; Singh, H. NeuFlow: Real-time, high-accuracy optical flow estimation on robots using edge devices. arXiv 2024, arXiv:2403.10425. [Google Scholar]
- Zhang, Z.; Gupta, A.; Jiang, H.; Singh, H. NeuFlow v2: High-efficiency optical flow estimation on edge devices. arXiv 2024, arXiv:2408.10161. [Google Scholar]
- Lucas, B.D.; Kanade, T. An iterative image registration technique with an application to stereo vision. In Proceedings of the 7th International Joint Conference on Artificial Intelligence (IJCAI '81), Vancouver, BC, Canada, 24–28 August 1981; Volume 2, pp. 674–679. [Google Scholar]
- Wedel, A.; Pock, T.; Zach, C.; Bischof, H.; Cremers, D. An improved algorithm for TV-L1 optical flow. In Proceedings of the Statistical and Geometrical Approaches to Visual Motion Analysis: International Dagstuhl Seminar, Dagstuhl Castle, Germany, 13–18 July 2008; Springer: Berlin/Heidelberg, Germany, 2009; pp. 23–45. [Google Scholar]
- Deng, G.; Galetto, F.; Al–nasrawi, M.; Waheed, W. A guided edge-aware smoothing-sharpening filter based on patch interpolation model and generalized gamma distribution. IEEE Open J. Signal Process. 2021, 2, 119–135. [Google Scholar] [CrossRef]
- Liu, D.; Wen, B.; Fan, Y.; Loy, C.C.; Huang, T.S. Non-local recurrent network for image restoration. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Montréal, QC, Canada, 3–8 December 2018; Volume 31. [Google Scholar]
- Murphy, A.H. Skill scores based on the mean square error and their relationships to the correlation coefficient. Mon. Weather Rev. 1988, 116, 2417–2424. [Google Scholar] [CrossRef]
- Butler, D.J.; Wulff, J.; Stanley, G.B.; Black, M.J. A naturalistic open source movie for optical flow evaluation. In Proceedings of the European Conference on Computer Vision (ECCV), Florence, Italy, 7–13 October 2012. [Google Scholar]
- Geiger, A.; Lenz, P.; Urtasun, R. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Providence, RI, USA, 16–21 June 2012; pp. 3354–3361. [Google Scholar]
- Kondermann, D.; Nair, R.; Honauer, K.; Krispin, K.; Andrulis, J.; Brock, A.; Gussefeld, B.; Rahimimoghaddam, M.; Hofmann, S.; Brenner, C.; et al. The HCI benchmark suite: Stereo and flow ground truth with uncertainties for urban autonomous driving. In Proceedings of the CVPR Workshops, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 19–28. [Google Scholar]
- Su, S.; Delbracio, M.; Wang, J.; Sapiro, G.; Heidrich, W.; Wang, O. Deep video deblurring for hand-held cameras. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 237–246. [Google Scholar]
- Son, H.; Lee, J.; Lee, J.; Cho, S.; Lee, S. Recurrent video deblurring with blur-invariant motion estimation and pixel volumes. ACM Trans. Graph. 2021, 40, 5. [Google Scholar] [CrossRef]
- Wang, C.; Yeh, I.; Liao, H. You only learn one representation: Unified network for multiple tasks. J. Inf. Sci. Eng. 2023, 39, 691–709. [Google Scholar]
- Feng, Y.; Huang, J.; Du, S.; Ying, S.; Yong, J.-H.; Li, Y.; Ding, G.; Ji, R.; Gao, Y. Hyper-YOLO: When visual object detection meets hypergraph computation. arXiv 2024, arXiv:2408.04804. [Google Scholar] [CrossRef]
Feature | DCRGC | RLBHE | IE-CGAN | BBCNN | AverNet | Shift-Net | IDTransformer | TVEMamba
---|---|---|---|---|---|---|---|---
Noise reduction | ✓ | ± | ✓ | ✓ | ✓ | | | 
Balanced contrast | ✓ | ✓ | ± | ± | ± | ± | ✓ | 
Handles underexposed areas | ± | ± | ✓ | | | | | 
Handles overexposed areas | ± | ✓ | ± | ± | ± | ± | ✓ | 
Edge preservation | ± | ± | ± | ± | ✓ | ✓ | | 
Maintains natural brightness | ± | ✓ | ✓ | ✓ | ✓ | | | 
Handles complex textures | ± | ± | ✓ | | | | | 
Artifact-free output | ✓ | | | | | | | 
Benefits | Limitations |
---|---|
Objects can be observed in no light conditions (dark environments). | Difficulty distinguishing between objects in proximity or of similar temperatures. |
High performance in all weather conditions (rain, fog, snow, smoke). | Generally lower resolution than visible light images.
Opportunities for surveillance over large distances and areas, detecting motion over a wide range. | Cannot see through glass or water, as these materials reflect infrared radiation, limiting use cases like capturing images of individuals in cars. |
Detection of objects even when partially hidden by vegetation. | More expensive than visible light cameras. |
Promotes early detection of thermal anomalies (e.g., equipment overheating, fire hazards), contributing to preventive safety measures. | Cannot identify detected individuals, as infrared imagery lacks the detail needed for identification.
Metric | BBCNN | DCRGC | IE-CGAN | RLBHE | AverNet | Shift-Net | IDTransformer | TVEMamba
---|---|---|---|---|---|---|---|---|
BIRDSAI | ||||||||
EME | 10.060 | 20.264 | 17.748 | 18.377 | 9.721 | 8.772 | 9.929 | 22.942 |
DMTE | 0.297 | 0.297 | 0.296 | 0.297 | 0.299 | 0.260 | 0.297 | 0.299 |
MDIMTE | 45.060 | 42.620 | 31.620 | 46.001 | 44.994 | 43.784 | 46.824 | 47.132 |
BDIM | 0.974 | 0.986 | 0.988 | 0.986 | 0.970 | 0.867 | 0.967 | 0.991 |
LGTA | 1.158 | 1.167 | 1.423 | 1.154 | 1.119 | 1.122 | 1.151 | 1.172 |
BIE | 0.085 | 0.098 | 0.076 | 0.088 | 0.086 | 0.084 | 0.099 | 0.109 |
CAMEL | ||||||||
EME | 14.633 | 24.214 | 24.010 | 23.796 | 17.140 | 17.520 | 14.728 | 25.371 |
DMTE | 0.293 | 0.292 | 0.290 | 0.294 | 0.296 | 0.295 | 0.293 | 0.296 |
MDIMTE | 39.833 | 41.309 | 32.004 | 40.747 | 40.680 | 41.117 | 41.731 | 42.786 |
BDIM | 0.990 | 0.988 | 0.992 | 0.990 | 0.984 | 0.983 | 0.981 | 0.994 |
LGTA | 1.239 | 1.235 | 1.381 | 1.089 | 1.166 | 1.135 | 1.141 | 1.548 |
BIE | 0.070 | 0.091 | 0.074 | 0.087 | 0.078 | 0.069 | 0.070 | 0.098 |
FLIR | ||||||||
EME | 10.743 | 13.424 | 10.560 | 11.185 | 9.842 | 9.679 | 8.345 | 14.152 |
DMTE | 0.295 | 0.296 | 0.295 | 0.294 | 0.298 | 0.297 | 0.297 | 0.298 |
MDIMTE | 43.801 | 40.627 | 41.024 | 42.486 | 49.002 | 48.060 | 50.012 | 50.146 |
BDIM | 0.972 | 0.977 | 0.965 | 0.971 | 0.963 | 0.961 | 0.958 | 0.982 |
LGTA | 1.137 | 1.146 | 1.105 | 1.080 | 1.116 | 1.111 | 1.120 | 1.167 |
BIE | 0.183 | 0.180 | 0.192 | 0.196 | 0.199 | 0.202 | 0.207 | 0.227 |
Autonomous Vehicles | ||||||||
EME | 2.929 | 3.088 | 7.260 | 8.130 | 5.492 | 3.115 | 7.141 | 12.513 |
DMTE | 0.299 | 0.298 | 0.297 | 0.297 | 0.299 | 0.298 | 0.300 | 0.310 |
MDIMTE | 51.517 | 48.326 | 53.659 | 47.925 | 55.380 | 41.529 | 56.013 | 57.369 |
BDIM | 0.937 | 0.959 | 0.943 | 0.957 | 0.937 | 0.922 | 0.923 | 0.963 |
LGTA | 1.180 | 1.393 | 1.189 | 1.411 | 1.253 | 1.205 | 1.223 | 1.499 |
BIE | 0.093 | 0.182 | 0.088 | 0.150 | 0.095 | 0.092 | 0.096 | 0.097 |
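For reference, the EME columns in the table above follow the blockwise measure of enhancement of Agaian et al. [29]. The sketch below is a minimal implementation assuming 8×8 blocks, a base-10 logarithm, and a small epsilon guard against division by zero; all three are our own choices, not necessarily those used in the paper's evaluation.

```python
# Minimal sketch of the blockwise EME measure of enhancement [29]; the 8x8
# block size, base-10 log, and epsilon guard are our own choices.
import numpy as np

def eme(img: np.ndarray, block: int = 8, eps: float = 1e-6) -> float:
    """Average over blocks of 20 * log10(Imax / Imin)."""
    h, w = img.shape
    scores = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            b = img[y:y + block, x:x + block].astype(np.float64)
            scores.append(20.0 * np.log10((b.max() + eps) / (b.min() + eps)))
    return float(np.mean(scores))

frame = np.random.randint(0, 256, (256, 256), dtype=np.uint8)
print(f"EME = {eme(frame):.3f}")
```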
Configuration | EME | DMTE | MDIMTE | BDIM | LGTA | BIE
---|---|---|---|---|---|---|
w/o SD-Net | 16.245 | 0.291 | 42.121 | 0.964 | 1.148 | 0.081 |
w/o MD-Net | 20.187 | 0.295 | 46.345 | 0.981 | 1.159 | 0.093 |
w/o BRME-Net | 21.145 | 0.298 | 46.899 | 0.989 | 1.168 | 0.101 |
TVEMamba | 22.942 | 0.299 | 47.132 | 0.991 | 1.172 | 0.109 |
Dataset | BIRDSAI | BIRDSAI | BIRDSAI | BIRDSAI | BIRDSAI | BIRDSAI | FLIR | FLIR
---|---|---|---|---|---|---|---|---
Classes | 2 | 2 | 3 | 3 | 2 | 2 | 2 | 2
Architecture | YOLOR1 | YOLOR2 | YOLOR1 | YOLOR2 | Hyper-YOLO1 | Hyper-YOLO2 | Hyper-YOLO1 | Hyper-YOLO2
mAP0.5 | 38.1 | 44.2 | 25.0 | 29.7 | 38.0 | 43.9 | 89.8 | 89.9 |
mAP0.5:0.9 | 13.2 | 16.8 | 9.3 | 10.9 | 12.9 | 16.4 | 56.6 | 56.7 |
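For context on the metric rows above: mAP0.5 counts a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5, while mAP0.5:0.9 averages precision over stricter IoU thresholds. The helper below (our own illustration, not the evaluation code used in the paper) shows the IoU test underlying these scores; a full mAP computation additionally requires per-class precision-recall integration and is omitted.

```python
# Hypothetical IoU helper underlying the mAP rows above; the full mAP
# pipeline (matching, precision-recall curves, class averaging) is omitted.
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# A detection counts as a true positive at mAP0.5 when IoU >= 0.5.
pred = np.array([10, 10, 50, 50], dtype=float)
gt = np.array([12, 8, 48, 52], dtype=float)
print(iou(pred, gt) >= 0.5)  # True (IoU is about 0.83)
```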