High Quality Coal Foreign Object Image Generation Method Based on StyleGAN-DSAD
Abstract
1. Introduction
- We introduced a dual self-attention module (DSAM) into the StyleGAN generator to strengthen long-range dependencies among features across the spatial and channel dimensions, which refines the details of the generated images and alleviates artifacts, shape distortion, and foreground-background adhesion in the generated images.
- Through research and experiments, we found that the discriminator has little effect on the quality of the generated images; thus, we replaced the standard convolutions in the discriminator with depthwise separable convolutions (DSC) to reduce the time and space complexity of StyleGAN and improve training efficiency.
- Compared with the baseline method, the proposed method generates higher-quality and more diverse foreign object images. Meanwhile, the accuracy of coal foreign object detection improved markedly after data augmentation with the proposed method, indicating that applying StyleGAN-DSAD to coal foreign object image augmentation is feasible.
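To make the dual self-attention idea concrete, the following is a minimal NumPy sketch of the two branches such a module typically combines (as in the dual attention design of [65]): a position branch that relates every spatial location to every other, and a channel branch that relates every channel to every other. It is a simplified illustration, not the paper's implementation: learned query/key/value projections, the learnable scaling parameters, and batch handling are all omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def position_attention(feat):
    # feat: (C, H, W). Each of the H*W positions attends to all others,
    # capturing long-range spatial dependencies.
    C, H, W = feat.shape
    X = feat.reshape(C, H * W)            # (C, N)
    A = softmax(X.T @ X, axis=-1)         # (N, N) position-to-position affinity
    out = X @ A.T                         # aggregate features over all positions
    return feat + out.reshape(C, H, W)    # residual connection

def channel_attention(feat):
    # Each channel attends to all other channels.
    C, H, W = feat.shape
    X = feat.reshape(C, H * W)
    A = softmax(X @ X.T, axis=-1)         # (C, C) channel-to-channel affinity
    out = A @ X
    return feat + out.reshape(C, H, W)

def dual_self_attention(feat):
    # The two branches run in parallel and are fused by summation.
    return position_attention(feat) + channel_attention(feat)

feat = np.random.rand(8, 4, 4).astype(np.float32)
print(dual_self_attention(feat).shape)  # (8, 4, 4): shape is preserved
```

Because the output shape matches the input shape, a module of this kind can be dropped between existing synthesis blocks of a generator without changing the surrounding architecture.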
2. Related Work
2.1. Generative Adversarial Network
2.2. StyleGAN
- Mapping network
- Synthesis network
- Discriminator
- Loss function
3. Proposed Methods
- The limited receptive field of the convolutional structure makes it difficult to learn global, long-range dependencies between features, resulting in missing details in key parts of the generated foreign object images [27] and producing artifacts, shape distortion, and foreground-background adhesion.
- The multi-level convolutional structure leads to a large number of model parameters, which increases the time and space complexity of the model training process.
3.1. DSAM
3.2. DSC
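The parameter savings that motivate replacing standard convolutions with depthwise separable ones can be sketched with a simple count. A standard k x k convolution needs one k x k filter per (input channel, output channel) pair, while DSC factors this into a per-channel depthwise k x k convolution followed by a 1 x 1 pointwise convolution [66]. The channel sizes below are illustrative, not taken from the paper's discriminator:

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: one k x k filter per (input, output) channel pair
    return c_in * c_out * k * k

def dsc_params(c_in, c_out, k):
    # Depthwise separable convolution: depthwise k x k (one filter per input
    # channel) followed by a 1 x 1 pointwise convolution mixing channels
    return c_in * k * k + c_in * c_out

c_in, c_out, k = 256, 512, 3
std = conv_params(c_in, c_out, k)   # 1,179,648 parameters
dsc = dsc_params(c_in, c_out, k)    # 133,376 parameters
print(dsc / std)                    # ~0.113, i.e. roughly 1/c_out + 1/k^2
```

The ratio 1/c_out + 1/k^2 is why a 3 x 3 DSC layer uses close to one ninth of the parameters of its standard counterpart, consistent with the parameter reduction reported for StyleGAN-DSC in Section 4.3.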
4. Experiments and Results
4.1. Dataset
4.2. Experimental Settings
4.3. Performance Evaluation of StyleGAN-DSAD
- Comparison of model generation quality and diversity
- Comparison of model complexity
- Overall performance evaluation
- Comparison of actual generation effect
4.4. Practical Effects of Data Augmentation for Foreign Object Detection
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Liu, F.; Guo, L.F.; Zhao, L.Z. Research on coal safety range and green low-carbon technology path under the dual-carbon background. J. China Coal Soc. 2022, 47, 1–15. [Google Scholar]
- Kiseleva, T.V.; Mikhailov, V.G.; Karasev, V.A. Management of local economic and ecological system of coal processing company. In Proceedings of the IOP Conference Series: Earth and Environmental Science, Novokuznetsk, Russia, 7–10 June 2016; Volume 45, p. 012013. [Google Scholar]
- Liu, F.; Cao, W.J.; Zhang, J.M.; Cao, G.M.; Guo, L.F. Current technological innovation and development direction of the 14th five-year plan period in China coal industry. J. China Coal Soc. 2021, 46, 1–14. [Google Scholar]
- Cao, X.G.; Liu, S.Y.; Wang, P.; Xu, G.; Wu, X.D. Research on coal gangue identification and positioning system based on coal-gangue sorting robot. Coal Sci. Technol. 2022, 50, 237–246. [Google Scholar]
- Wang, Y.; Wang, Y.; Dang, L. Video detection of foreign objects on the surface of belt conveyor underground coal mine based on improved SSD. J. Ambient Intell. Humaniz. Comput. 2020, 1–10. [Google Scholar] [CrossRef]
- Shorten, C.; Khoshgoftaar, T.M. A survey on image data augmentation for deep learning. J. Big Data 2019, 6, 1–48. [Google Scholar] [CrossRef]
- Zhang, K.; Wang, W.; Lv, Z.; Fan, Y.; Song, Y. Computer vision detection of foreign objects in coal processing using attention CNN. Eng. Appl. Artif. Intell. 2021, 102, 104242. [Google Scholar] [CrossRef]
- Hao, S.; Zhang, X.; Ma, X.; Sun, S.Y.; Wen, H. Foreign object detection in coal mine conveyor belt based on CBAM-YOLOv5. J. China Coal Soc. 2022, 47, 4147–4156. [Google Scholar]
- Zhong, Z.; Zheng, L.; Kang, G. Random erasing data augmentation. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 13001–13008. [Google Scholar]
- Liang, D.; Yang, F.; Zhang, T.; Yang, P. Understanding mixup training methods. IEEE Access 2018, 6, 58774–58783. [Google Scholar] [CrossRef]
- Hu, J.; Gao, Y.; Zhang, H.J.; Jin, B.Q. Research on the identification method of non-coal foreign object of belt conveyor based on deep learning. Ind. Mine Autom. 2021, 47, 57–62+90. [Google Scholar]
- Li, M.; Yang, M.L.; Liu, C.Y.; He, X.L.; Duan, Y. Study on illuminance adjustment method for image-based coal and gangue separation. J. China Coal Soc. 2021, 46, 1149–1158. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 2, pp. 2672–2680. [Google Scholar]
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
- Gulrajani, I.; Ahmed, F.; Arjovsky, M. Improved training of Wasserstein GANs. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5767–5777. [Google Scholar]
- Radford, A.; Metz, L.; Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. arXiv 2015, arXiv:1511.06434. [Google Scholar]
- Brock, A.; Donahue, J.; Simonyan, K. Large scale GAN training for high fidelity natural image synthesis. arXiv 2018, arXiv:1809.11096. [Google Scholar]
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of the International Conference on Learning Representations (ICLR), Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Shi, Z.; Sang, M.; Huang, Y. Defect detection of MEMS based on data augmentation, WGAN-DIV-DC, and a YOLOv5 model. Sensors 2022, 22, 9400. [Google Scholar] [CrossRef]
- Deng, Y.; Shi, Y.P.; Liu, J.; Jiang, Y.Y.; Zhu, Y.M.; Liu, J. Multi-angle facial expression recognition algorithm combined with dual-channel WGAN-GP. Laser Optoelectron. Prog. 2022, 59, 137–147. [Google Scholar]
- Wang, X.; Gao, F.; Chen, J.; Hao, P.C.; Jing, Z.J. Generative adversarial networks based sample generation of coal and rock images. J. China Coal Soc. 2021, 46, 3066–3078. [Google Scholar]
- Wang, L.; Wang, X.; Li, B. A data expansion strategy for improving coal-gangue detection. Int. J. Coal Prep. Util. 2022, 1–19. [Google Scholar] [CrossRef]
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 4401–4410. [Google Scholar]
- Ma, D.; Liu, J.; Fang, H. A multi-defect detection system for sewer pipelines based on StyleGAN-SDM and fusion CNN. Constr. Build. Mater. 2021, 312, 125385. [Google Scholar] [CrossRef]
- Hussin, S.; Yildirim, R. StyleGAN-LSRO method for person re-identification. IEEE Access 2021, 9, 13857–13869. [Google Scholar] [CrossRef]
- Li, M.; Zhou, G.; Chen, A. FWDGAN-based data augmentation for tomato leaf disease identification. Comput. Electron. Agric. 2022, 194, 106779. [Google Scholar] [CrossRef]
- Zhang, H.; Goodfellow, I.; Metaxas, D.; Odena, A. Self-attention generative adversarial networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 7354–7363. [Google Scholar]
- Yang, Y.; Sun, L.; Mao, X. Data augmentation based on generative adversarial network with mixed attention mechanism. Electronics 2022, 11, 1718. [Google Scholar] [CrossRef]
- Fu, J.; Liu, J.; Tian, H. Dual attention network for scene segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 3146–3154. [Google Scholar]
- Howard, A.G.; Zhu, M.; Chen, B. Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv 2017, arXiv:1704.04861. [Google Scholar]
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Advances in Neural Information Processing Systems (NeurIPS); Curran Associates, Inc.: New York, NY, USA, 2017; Volume 30, pp. 6629–6640. [Google Scholar]
- Salimans, T.; Goodfellow, I.; Zaremba, W.; Cheung, V.; Radford, A.; Chen, X. Improved Techniques for Training GANs. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29, pp. 2234–2242. [Google Scholar]
- Chen, H.; Sun, K.; Tian, Z.; Shen, C.; Yan, Y. Blendmask: Top-down meets bottom-up for instance segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8573–8581. [Google Scholar]
| Model | IS | FID | Params | Params Ratio/% | Training Time/min | Time Ratio/% |
|---|---|---|---|---|---|---|
| StyleGAN | 4.90 | 34.99 | 44,105,664 | / | 4155 | / |
| StyleGAN-DSAM | 7.42 | 29.10 | 45,884,736 | 104.0 | 4312 | 103.8 |
| StyleGAN-DSC | 4.57 | 34.80 | 17,849,340 | 38.9 | 2396 | 57.6 |
| StyleGAN-DSAD | 7.41 | 29.30 | 19,628,412 | 44.5 | 2443 | 58.8 |
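The table's FID column (lower is better) measures the Fréchet distance between Gaussians fitted to Inception-v3 activations of real and generated images [67]: FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2(Sigma_r Sigma_g)^(1/2)). The sketch below shows the special case of diagonal covariances, where the matrix square root reduces to an elementwise one; a full implementation would extract Inception features and use a matrix square root (e.g. `scipy.linalg.sqrtm`):

```python
import numpy as np

def fid_diagonal(mu_r, var_r, mu_g, var_g):
    # Frechet distance between two Gaussians with diagonal covariances:
    # ||mu_r - mu_g||^2 + sum(var_r + var_g - 2*sqrt(var_r * var_g))
    mu_r, var_r = np.asarray(mu_r, float), np.asarray(var_r, float)
    mu_g, var_g = np.asarray(mu_g, float), np.asarray(var_g, float)
    mean_term = np.sum((mu_r - mu_g) ** 2)
    cov_term = np.sum(var_r + var_g - 2.0 * np.sqrt(var_r * var_g))
    return mean_term + cov_term

# Identical distributions give a distance of zero
print(fid_diagonal([0, 0], [1, 1], [0, 0], [1, 1]))  # 0.0
```

A mean shift of 3 in one feature dimension, with matched variances, yields a distance of 9, showing how FID grows with the gap between the real and generated feature statistics.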
| Training Set | AP^box/% | AP^mask/% |
|---|---|---|
| 1000 generated images | 52.3 | 41.6 |
| 1000 real images | 59.9 | 50.7 |
| 3000 generated images | 60.0 | 51.5 |
| 3000 real images | 68.0 | 59.6 |
| 5000 generated images | 67.8 | 59.5 |
| 5477 real images | 75.8 | 67.5 |
| 7000 generated images | 70.0 | 61.6 |
| 8500 generated images | 71.8 | 61.4 |
| 10,000 generated images | 71.9 | 62.6 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Cao, X.; Wei, H.; Wang, P.; Zhang, C.; Huang, S.; Li, H. High Quality Coal Foreign Object Image Generation Method Based on StyleGAN-DSAD. Sensors 2023, 23, 374. https://doi.org/10.3390/s23010374