Advancing Nighttime Object Detection through Image Enhancement and Domain Adaptation
Abstract
1. Introduction
- We propose NUDN, a teacher-student co-learning method that combines high-confidence teacher pseudo-labels with low-confidence student proposals, substantially improving the utilization of pseudo-labels.
- To enable cross-domain research on existing nighttime datasets, we propose LightImg, an image enhancement technique that converts nighttime features into daytime-like features.
- Random image scaling is used to mitigate limited training data (a brief augmentation sketch follows this list).
- We conducted comprehensive evaluations on two nighttime datasets, demonstrating that NUDN exhibits strong discriminative capability in the source domain, robust adaptability to a single target domain, and strong generalization to unseen domains.
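As a brief illustration of the random image scaling mentioned above, the following is a minimal PyTorch sketch assuming a standard resize-style augmentation; the scale range and function names are illustrative and not taken from the paper.

```python
import random

import torch
import torch.nn.functional as F


def random_scale(image, boxes, scale_range=(0.5, 1.5)):
    """Randomly rescale an image (C, H, W) and its boxes (N, 4, xyxy format).

    The scale range here is an illustrative choice, not the paper's setting.
    """
    s = random.uniform(*scale_range)
    _, h, w = image.shape
    new_size = (int(round(h * s)), int(round(w * s)))
    # Bilinear resize of the image; box coordinates scale by the same factor.
    image = F.interpolate(image.unsqueeze(0), size=new_size,
                          mode="bilinear", align_corners=False).squeeze(0)
    return image, boxes * s
```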
2. Related Work
2.1. Image Enhancement
2.2. Unsupervised Domain Adaptation (UDA) for Object Detection
3. Methodology
3.1. Consistency Learning
3.2. Label Randomization
3.3. LightImg
Algorithm 1 LightImg Image Enhancement
Require: Nighttime image
Ensure: Enhanced image
return Enhanced image
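The body of Algorithm 1 is not reproduced above. As a rough, assumed stand-in only (not the authors' LightImg procedure), the sketch below applies a classical low-light enhancement: gamma correction followed by CLAHE on the luminance channel.

```python
import cv2
import numpy as np


def enhance_night_image(bgr, gamma=0.6):
    """Brighten a nighttime BGR image via gamma correction plus CLAHE.

    Assumed stand-in for LightImg; the paper's actual enhancement steps may differ.
    """
    # Gamma correction with gamma < 1 lifts dark regions.
    lut = np.clip(((np.arange(256) / 255.0) ** gamma) * 255.0, 0, 255).astype(np.uint8)
    bright = cv2.LUT(bgr, lut)

    # CLAHE on the L channel in LAB space restores local contrast.
    lab = cv2.cvtColor(bright, cv2.COLOR_BGR2LAB)
    l, a, b = cv2.split(lab)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    enhanced_lab = cv2.merge((clahe.apply(l), a, b))
    return cv2.cvtColor(enhanced_lab, cv2.COLOR_LAB2BGR)
```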
Algorithm 2 NUDN
Initialize student network S with daytime images
Initialize teacher network T as the EMA of S
for each iteration do
    Enhanced_Night_Images ← LightImg(Dn)
    High_Confidence_Labels ← GenerateAndFilterLabels(T, Enhanced_Night_Images)
    Combined_Boxes ← Merge(Candidate_Boxes, High_Confidence_Labels)
    ClassifyAndRegress(S, Combined_Boxes)
    Consistency_Loss ← ComputeConsistencyLoss(S, T, Combined_Boxes)
    UpdateNetworks(S, T, Consistency_Loss)
end for
return Trained student network S
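A minimal PyTorch sketch of the teacher-student machinery behind Algorithm 2, assuming a standard mean-teacher setup [37]: the teacher is an exponential moving average of the student (decay 0.999, as listed in the configuration table below), and teacher detections below a confidence threshold are discarded before being merged with student proposals (assuming the 0.8 threshold in that table is the pseudo-label cut-off). All names are illustrative, not the authors' code.

```python
import torch


@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Teacher weights follow an exponential moving average of the student's."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(decay).add_(s_p, alpha=1.0 - decay)


@torch.no_grad()
def filter_pseudo_labels(boxes, scores, labels, threshold=0.8):
    """Keep only teacher detections whose confidence exceeds the threshold."""
    keep = scores > threshold
    return boxes[keep], labels[keep]


def merge_boxes(student_proposals, teacher_boxes):
    """Combine low-confidence student proposals with high-confidence teacher labels."""
    return torch.cat([student_proposals, teacher_boxes], dim=0)
```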
4. Discussion
4.1. Experimental Environment
4.1.1. ExDark [49] Dataset
4.1.2. SHIFT [50] Dataset
4.2. Experimental Setup
4.3. Evaluation Indicators
4.4. Experimental Results
4.4.1. Comparative Experiments on the ExDark [49] Dataset
4.4.2. Comparative Experiments on the SHIFT Dataset
4.4.3. Ablation Studies
4.4.4. LightImg Time and Performance Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Chen, Y.; Lin, Z.; Jin, L.; Wang, J.; Han, L. Knowledge distillation-based nighttime unsupervised cross-domain object detection network. Sensors 2020, 20, 7031. [Google Scholar]
- Zhu, X.; Xie, L.; Lin, H. CycleGAN-based domain adaptation for nighttime object detection. IEEE Trans. Image Process. 2021, 30, 1234–1245. [Google Scholar]
- Wang, J.; Wang, S.; Chen, X. Hierarchical feature alignment for cross-domain object detection. Pattern Recognit. 2019, 94, 42–53. [Google Scholar]
- Jin, R.; Wang, J.; Sun, Z. Domain adaptation for nighttime object detection using diversified samples. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 1231–1242. [Google Scholar]
- Li, Y.; Zhu, H.; Chen, J. Dynamic convolutional neural network for nighttime object detection. IEEE Access 2022, 10, 25431–25440. [Google Scholar]
- Wang, P.; Cheng, J.; Yang, L. Unsupervised image translation for nighttime object detection in traffic scenes. Neurocomputing 2021, 455, 210–219. [Google Scholar]
- Chen, W.; Huang, Z.; Li, X. GAN-based low-light image enhancement for nighttime object detection. Multimed. Tools Appl. 2020, 79, 33657–33671. [Google Scholar]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778. [Google Scholar]
- Lin, T.Y.; Goyal, P.; Girshick, R.; He, K.; Dollar, P. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2980–2988. [Google Scholar]
- Zhou, X.; Wang, D.; Krähenbühl, P. Objects as points. arXiv 2019, arXiv:1904.07850. [Google Scholar]
- Chen, X.; Gupta, A. An implementation of faster R-CNN with fewer anchor boxes for object detection. arXiv 2017, arXiv:1703.07465. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLOv3: An incremental improvement. arXiv 2018, arXiv:1804.02767. [Google Scholar]
- Sun, J.; Zhang, Y. Multi-scale feature extraction for nighttime object detection using deep learning. J. Real-Time Image Process. 2019, 16, 585–595. [Google Scholar]
- Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448. [Google Scholar]
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Computer Vision–ECCV 2016, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 21–37. [Google Scholar]
- Zhu, H.; Chen, Y. Domain adaptation for nighttime object detection using cycle-consistent generative adversarial networks. IEEE Trans. Image Process. 2020, 29, 6842–6852. [Google Scholar]
- Johnson, J.; Zhang, J. Adversarial learning for unsupervised domain adaptation in object detection. Pattern Recognit. Lett. 2021, 139, 25–32. [Google Scholar]
- Zhou, Z.; Chen, L. Multi-task learning for object detection in nighttime images. IEEE Trans. Image Process. 2019, 28, 5147–5158. [Google Scholar]
- Zhang, W.; Wang, H. Efficient nighttime object detection using deep learning and low-light enhancement. IEEE Access 2020, 8, 181875–181886. [Google Scholar]
- Rahman, Z.U.; Jobson, D.J.; Woodell, G.A. Retinex processing for automatic image enhancement. J. Electron. Imaging 2004, 13, 100–111. [Google Scholar]
- Ma, Y.; Liu, Y.; Cheng, J.; Zheng, Y.; Ghahremani, M.; Chen, H.; Liu, J.; Zhao, Y. Cycle structure and illumination constrained GAN for medical image enhancement. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2020, Proceedings of the 23rd International Conference, Lima, Peru, 4–8 October 2020; Springer: Cham, Switzerland, 2020; pp. 667–677. [Google Scholar]
- Fu, L.; Yu, H.; Juefei-Xu, F.; Li, J.; Guo, Q.; Wang, S. Let there be light: Improved traffic surveillance via detail preserving night-to-day transfer. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 8217–8226. [Google Scholar] [CrossRef]
- Zhu, Z.; Meng, Y.; Kong, D.; Zhang, X.; Guo, Y.; Zhao, Y. To see in the dark: N2DGAN for background modeling in nighttime scene. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 492–502. [Google Scholar] [CrossRef]
- Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
- Srinivasan, S.; Balram, N. Adaptive contrast enhancement using local region stretching. In Proceedings of the 9th Asian Symposium on Information Display, New Delhi, India, 8–12 October 2006; pp. 152–155. [Google Scholar]
- Jobson, D.J.; Rahman, Z.; Woodell, G.A. Properties and performance of a center/surround retinex. IEEE Trans. Image Process. 1997, 6, 451–462. [Google Scholar] [CrossRef] [PubMed]
- Rahman, Z.; Jobson, D.J.; Woodell, G.A. Multi-scale retinex for color image enhancement. In Proceedings of the 3rd IEEE International Conference on Image Processing, Lausanne, Switzerland, 19 September 1996; pp. 1003–1006. [Google Scholar]
- Guo, X.; Li, Y.; Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Trans. Image Process. 2017, 26, 982–993. [Google Scholar] [CrossRef] [PubMed]
- Sharma, A.; Tan, R.T. Nighttime Visibility Enhancement by Increasing the Dynamic Range and Suppression of Light Effects. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 11972–11981. [Google Scholar]
- Jin, Y.; Yang, W.; Tan, R. Unsupervised night image enhancement: When layer decomposition meets light-effects suppression. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 404–421. [Google Scholar]
- Zheng, Z.; Wu, Y.; Han, X.; Shi, J. Forkgan: Seeing into the rainy night. In Proceedings of the European Conference on Computer Vision (ECCV), Glasgow, UK, 23–28 August 2020; pp. 155–170. [Google Scholar]
- He, M.; Wang, Y.; Wu, J.; Wang, Y.; Li, H.; Li, B.; Gan, W.; Wu, W.; Qiao, Y. Cross Domain Object Detection by Target-Perceived Dual Branch Distillation. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 9560–9570. [Google Scholar]
- Li, M.; Liu, J.; Yang, W.; Sun, X.; Guo, Z. Structure-revealing low-light image enhancement via robust retinex model. IEEE Trans. Image Process. 2018, 27, 2828–2841. [Google Scholar] [CrossRef]
- Wang, W.; Chen, Z.; Yuan, X.; Guan, F. An adaptive weak light image enhancement method. Proc. SPIE 2021, 11719, 1171902. [Google Scholar]
- Kwon, H.-J.; Lee, S.-H. Raindrop-removal image translation using target-mask network with attention module. Mathematics 2023, 11, 3318. [Google Scholar] [CrossRef]
- Tarvainen, A.; Valpola, H. Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 1195–1204. [Google Scholar]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015, arXiv:1409.1556. [Google Scholar]
- Kennerley, M.; Wang, J.; Veeravalli, B.; Tan, R. 2PCNet: Two-Phase Consistency Training for Day-to-Night Unsupervised Domain Adaptive Object Detection. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 11484–11493. [Google Scholar]
- Shi, J.; Pang, G. Low-light image enhancement for nighttime object detection using generative adversarial networks. Multimed. Tools Appl. 2020, 79, 15423–15434. [Google Scholar]
- Chen, D.; Gao, X. A comprehensive review of nighttime object detection using deep learning. IEEE Access 2020, 8, 103759–103772. [Google Scholar]
- Li, S.; Huang, H. Enhancing nighttime object detection with multi-scale feature fusion. Neurocomputing 2021, 438, 271–282. [Google Scholar]
- Huang, X.; Belongie, S. Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1501–1510. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Computer Vision–ECCV 2016, Proceedings of the 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 630–645. [Google Scholar]
- Deng, J.; Li, W.; Chen, Y.; Duan, L. Unbiased mean teacher for cross-domain object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 4089–4099. [Google Scholar]
- Xu, H.; Zhao, X. Nighttime object detection using unsupervised domain adaptation and image translation. J. Vis. Commun. Image Represent. 2021, 75, 103049. [Google Scholar]
- Loh, Y.P.; Chan, C.S. Getting to know low-light images with the Exclusively Dark dataset. Comput. Vis. Image Underst. 2019, 178, 30–42. [Google Scholar] [CrossRef]
- Sun, T.; Segu, M.; Postels, J.; Wang, Y.; Van Gool, L.; Schiele, B.; Tombari, F.; Yu, F. Shift: A synthetic driving dataset for continuous multi-task domain adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 21339–21350. [Google Scholar]
- Zhang, Y.; Guo, X.; Ma, J.; Liu, W.; Zhang, J. Beyond brightening low-light images. Int. J. Comput. Vis. 2021, 129, 1013–1037. [Google Scholar] [CrossRef]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Zitnick, C.L. Microsoft COCO: Common objects in context. In Computer Vision–ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
- Everingham, M.; Van Gool, L.; Williams, C.K.I.; Winn, J.; Zisserman, A. The Pascal visual object classes (VOC) challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
- Ouyang, W.; Wang, X.; Zeng, X. DeepID-net: Deformable deep convolutional neural networks for object detection. Pattern Recognit. 2017, 76, 230–241. [Google Scholar]
| Parameters | Configuration |
|---|---|
| CPU | Intel Xeon Gold 5320 @ 2.20 GHz |
| GPU | RTX A4000 (16 GB) × 3 |
| Operating system | Ubuntu 20.04 |
| Deep learning framework | PyTorch 1.11.0 + Python 3.8 + CUDA 11.3 |
| Training epochs | 60,000 |
| Batch size | 12 |
| Exponential moving average (EMA) decay | 0.999 |
| Base learning rate | 0.04 |
| Confidence threshold | 0.8 |
| Method | AP | APs | APm | APl |
|---|---|---|---|---|
| TDD [33] | 34.6 | 12.1 | 28.3 | 39.1 |
| 2PCNet [41] | 39.7 | 8.3 | 25.0 | 36.6 |
| UMT [47] | 36.2 | 10.8 | 27.4 | 34.6 |
| NUDN (ours) | 37.4 | 14.1 | 28.8 | 42.7 |
| Method | AP | APs | APm | APl |
|---|---|---|---|---|
| TDD [33] | 33.2 | 10.1 | 29.2 | 37.2 |
| 2PCNet [41] | 44.7 | 11.3 | 28.0 | 39.6 |
| UMT [47] | 32.4 | 9.1 | 24.8 | 29.3 |
| NUDN (ours) | 39.6 | 16.2 | 30.4 | 45.2 |
| Structure | I | CL | LR | AP | APs | APm | APl |
|---|---|---|---|---|---|---|---|
| Single | | | | 31.7 | 4.8 | 16.5 | 30.4 |
| | ✓ | | | 36.8 | 9.1 | 23.3 | 41.4 |
| | ✓ | ✓ | | 36.2 | 9.4 | 24.7 | 42.9 |
| NUDN (ours) | ✓ | ✓ | ✓ | 37.4 | 14.1 | 28.8 | 42.7 |
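The AP, APs, APm, and APl columns in the tables above appear to follow COCO-style evaluation (overall average precision plus AP on small, medium, and large objects). As an illustration of how such metrics are typically computed — the paper does not specify its evaluation toolchain, and the file names below are placeholders — a minimal pycocotools sketch:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Ground-truth annotations and detection results in COCO JSON format
# (file names are hypothetical placeholders).
coco_gt = COCO("nighttime_val_annotations.json")
coco_dt = coco_gt.loadRes("nudn_detections.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()
evaluator.accumulate()
evaluator.summarize()  # prints AP, AP50, AP75, and AP for small/medium/large objects
```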