Fully Synthetic Videos and the Random-Background-Pasting Method for Flame Segmentation
Abstract
1. Introduction
- We train segmentation models on virtual images that require no manual annotation. Fully synthetic video frames carry enough information about the targets for the models to learn from.
- We demonstrate a significant improvement in generalization performance on segmentation tasks. Models trained on synthetic video frames learn in a source domain and then apply what they have learned in a different target domain.
- We augment the training dataset efficiently with synthetic videos for any real test case. This study shows that a segmentation model can be trained without any real training data, which addresses both the lack of data and the cost of time-consuming annotation.
2. Materials and Methods
2.1. Related Works
2.2. Method
1. Generate videos of objects.
2. Extract object masks.
3. Paste a randomly selected background.
4. Train a segmentation model using generated data.
5. Test the model using real flame images.
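The mask-extraction and background-pasting steps can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the synthetic frames are rendered on a near-black solid background, so that a simple brightness threshold recovers the object mask. The function names `extract_mask` and `paste_on_background`, the threshold value, and the toy array sizes are all hypothetical.

```python
import numpy as np


def extract_mask(frame: np.ndarray, threshold: int = 30) -> np.ndarray:
    """Return a boolean mask of pixels brighter than a solid dark background.

    Assumption: `frame` is an H x W x 3 uint8 array rendered on a near-black
    background, so any pixel whose brightest channel exceeds `threshold`
    is treated as part of the object (here, the flame).
    """
    return frame.max(axis=2) > threshold


def paste_on_background(frame, mask, backgrounds, rng):
    """Composite the masked object onto one randomly chosen background.

    `backgrounds` is a list of H x W x 3 uint8 arrays with the same size as
    `frame`; resizing and cropping are left out of this sketch.
    """
    background = backgrounds[rng.integers(len(backgrounds))]
    # Keep object pixels from the synthetic frame, all others from the background.
    return np.where(mask[..., None], frame, background)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy 4 x 4 frame: black background with a bright 2 x 2 "flame" patch.
    frame = np.zeros((4, 4, 3), dtype=np.uint8)
    frame[1:3, 1:3] = (255, 160, 0)
    backgrounds = [np.full((4, 4, 3), v, dtype=np.uint8) for v in (50, 100, 200)]

    mask = extract_mask(frame)
    composite = paste_on_background(frame, mask, backgrounds, rng)
    print("object pixels:", int(mask.sum()))
    print("composite shape:", composite.shape)
```

Because the mask comes from the rendering itself, each composited frame already has its segmentation label, which is what removes the need for manual annotation in this kind of pipeline.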
2.2.1. Virtual Video Frames for Model Training
2.2.2. Bridging the Reality Gap with Background Paste
2.3. Experiment
2.3.1. Dataset for Model Training and Testing
2.3.2. Segmentation Models
3. Results
3.1. Quality Assessment
3.2. Quantity Assessment
3.2.1. Analysis of the Segmentation Result
3.2.2. Measurements of the Dataset for Learning
3.3. Analysis of the Random-Background-Pasting Method
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Models | mPA (original virtual images with a solid background) | mIoU (original virtual images with a solid background) | mPA (virtual images with a real background) | mIoU (virtual images with a real background)
---|---|---|---|---
FCN | 0.531 | 0.169 | 0.573 | 0.425
U-net | 0.586 | 0.249 | 0.552 | 0.464
Deeplabv3 | 0.506 | 0.098 | 0.506 | 0.098
Mask R-CNN | 0.729 * | 0.459 * | 0.783 ** | 0.515 **
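
The two metrics in the table are mean pixel accuracy (mPA) and mean intersection over union (mIoU). The sketch below computes both from a confusion matrix using their common definitions. The paper's exact formulas are not reproduced here, so the binary background/flame setup (class 0 and class 1), the function names, and the toy labels are assumptions for illustration only.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes=2):
    """Rows index the ground-truth class, columns index the predicted class."""
    idx = target.ravel() * num_classes + pred.ravel()
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_pixel_accuracy(cm):
    # Per-class accuracy: correctly labelled pixels / ground-truth pixels of that class.
    # Assumes every class occurs in the ground truth (no zero rows).
    return (np.diag(cm) / cm.sum(axis=1)).mean()


def mean_iou(cm):
    # Per-class IoU: TP / (TP + FP + FN), then averaged over classes.
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    return (tp / union).mean()


if __name__ == "__main__":
    # Toy 2 x 4 label maps: 0 = background, 1 = flame.
    target = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1]])
    pred = np.array([[0, 0, 1, 0],
                     [0, 0, 1, 1]])
    cm = confusion_matrix(pred, target)
    print("mPA :", mean_pixel_accuracy(cm))
    print("mIoU:", mean_iou(cm))
```

mIoU penalizes false positives as well as missed pixels, which is why a model can show a moderate mPA alongside a much lower mIoU, as several rows of the table do.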
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jia, Y.; Mao, Z.; Zhang, X.; Kuang, Y.; Chen, Y.; Zhang, Q. Fully Synthetic Videos and the Random-Background-Pasting Method for Flame Segmentation. Electronics 2023, 12, 2492. https://doi.org/10.3390/electronics12112492