A Novel Two-Stage Approach for Automatic Extraction and Multi-View Generation of Litchis
Abstract
1. Introduction
2. Materials and Methods
2.1. Image Data Collection
2.2. Data Annotation
2.3. Data Augmentation
2.4. Semantic Segmentation Network Architecture
2.5. Combining Semantic Segmentation with the HSV Color Space for Extracting Litchi Branches
2.5.1. Establishing the Litchi Branch Dataset
2.5.2. Denoising Diffusion Probabilistic Models (DDPM)
2.5.3. Expanding the Litchi Branch Dataset Using DDPM
2.5.4. Extracting Crucial Litchi Components Using Semantic Segmentation Results and the HSV Color Space
2.6. Multi-View Generation Network Architecture
2.7. Evaluation Metrics
3. Results
3.1. Model Training
3.2. Comparison of Different Semantic Segmentation Models
3.3. Experiment on HSV Color Space Thresholding for Litchi Branches
3.3.1. Results of Litchi Branch Generation by DDPM
3.3.2. Histogram of Average Hue Values for Litchi Branches
3.4. Comparison of Improved Litchi Branch Extraction Methods
3.5. Comparison of Different Multi-View Generation Models
3.5.1. Qualitative Comparison
3.5.2. Quantitative Comparison
3.6. Litchi Multi-View Generation Network Based on the Two-Stage Model
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
Annotation labels used for semantic segmentation of the litchi images.

| Label | Explanation |
| --- | --- |
| Background | Image background |
| Litchi | Litchi fruit |
| Branch | Litchi branch |
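When building the segmentation masks, the three labels above reduce to a simple class-index mapping. The sketch below is an assumed illustration only: the class indices and display colors are placeholders, not values taken from the paper.

```python
import numpy as np

# Hypothetical class-index mapping for the three annotation labels.
CLASS_MAP = {"Background": 0, "Litchi": 1, "Branch": 2}

# Hypothetical RGB palette, used only to visualize predicted masks.
PALETTE = np.array([
    [0, 0, 0],        # Background
    [220, 20, 60],    # Litchi fruit
    [34, 139, 34],    # Litchi branch
], dtype=np.uint8)

def colorize(mask: np.ndarray) -> np.ndarray:
    """Map an H x W array of class indices to an H x W x 3 RGB image."""
    return PALETTE[mask]
```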
Mask2Former (Swin-S backbone) segmentation performance at different input resolutions, with and without data augmentation.

| Model | Backbone | Input Resolution | Data Augmentation Applied | mIoU (%) | mPA (%) |
| --- | --- | --- | --- | --- | --- |
| Mask2Former | Swin-S | 256 × 256 | Yes | 78.52 | 83.36 |
| Mask2Former | Swin-S | 1024 × 1024 | Yes | 79.04 | 87.39 |
| Mask2Former | Swin-S | 512 × 512 | Yes | 79.79 | 85.82 |
| Mask2Former | Swin-S | 512 × 512 | No | 75.64 | 80.40 |
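The "Data Augmentation Applied" column refers to training-time transforms applied jointly to images and masks. The snippet below is a minimal sketch of such a pipeline using the albumentations library; the specific transforms and probabilities are assumptions for illustration, not the augmentations reported in the paper.

```python
import albumentations as A

# Assumed joint image/mask augmentation pipeline (placeholder transforms).
train_transform = A.Compose([
    A.Resize(512, 512),                 # match the best-performing input resolution
    A.HorizontalFlip(p=0.5),
    A.RandomRotate90(p=0.5),
    A.RandomBrightnessContrast(p=0.3),
])

# Usage: augmented = train_transform(image=image_rgb, mask=label_mask)
#        image_aug, mask_aug = augmented["image"], augmented["mask"]
```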
Comparison of different semantic segmentation models on the litchi dataset.

| Model | Backbone | mIoU (%) | mPA (%) |
| --- | --- | --- | --- |
| DeepLabV3+ | ResNet-101 | 73.96 | 77.55 |
| PSPNet | ResNet-101 | 73.29 | 79.03 |
| SegFormer | MIT-B5 | 74.45 | 80.48 |
| KNet | Swin-L | 76.44 | 81.26 |
| Mask2Former | Swin-L | 78.64 | 83.35 |
| Mask2Former | Swin-S | 79.79 | 85.82 |
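The mIoU and mPA values above follow the standard definitions of mean intersection-over-union and mean pixel accuracy. For reference, a minimal NumPy sketch of these metrics computed from a confusion matrix is shown below; this is the textbook formulation, not the authors' evaluation code.

```python
import numpy as np

def confusion_matrix(gt: np.ndarray, pred: np.ndarray, num_classes: int) -> np.ndarray:
    """Pixel-level confusion matrix; rows = ground truth, columns = prediction."""
    idx = gt.astype(int) * num_classes + pred.astype(int)
    return np.bincount(idx.ravel(), minlength=num_classes ** 2).reshape(num_classes, num_classes)

def miou_mpa(gt: np.ndarray, pred: np.ndarray, num_classes: int = 3):
    """Return (mIoU, mPA) for per-pixel class maps gt and pred."""
    cm = confusion_matrix(gt, pred, num_classes)
    tp = np.diag(cm).astype(float)
    iou = tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp + 1e-10)  # per-class IoU
    pa = tp / (cm.sum(axis=1) + 1e-10)                         # per-class pixel accuracy
    return iou.mean(), pa.mean()
```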
Quantitative comparison of multi-view generation methods (higher PSNR/SSIM and lower LPIPS are better).

| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ |
| --- | --- | --- | --- |
| SyncDreamer | 8.01 | 0.4578 | 0.492 |
| Zero123 | 14.16 | 0.7411 | 0.283 |
| Zero123++ | 16.51 | 0.7959 | 0.199 |
| Wonder3D | 18.89 | 0.8199 | 0.114 |
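PSNR, SSIM, and LPIPS compare each generated view against its reference view. The sketch below shows one common way to compute them with scikit-image and the lpips package; the library choices and preprocessing are assumptions, not necessarily the implementation used by the authors.

```python
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def view_metrics(ref: np.ndarray, gen: np.ndarray):
    """ref, gen: H x W x 3 uint8 RGB images of identical size."""
    psnr = peak_signal_noise_ratio(ref, gen, data_range=255)
    ssim = structural_similarity(ref, gen, channel_axis=-1, data_range=255)

    # LPIPS expects NCHW float tensors scaled to [-1, 1].
    to_tensor = lambda a: torch.from_numpy(a).permute(2, 0, 1).float()[None] / 127.5 - 1.0
    loss_fn = lpips.LPIPS(net="alex")
    lp = loss_fn(to_tensor(ref), to_tensor(gen)).item()
    return psnr, ssim, lp
```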
Multi-view generation on litchi images with and without the segmentation stage of the two-stage model.

| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ |
| --- | --- | --- | --- |
| Wonder3D | 19.24 | 0.8231 | 0.168 |
| Mask2Former + Wonder3D | 19.45 | 0.8352 | 0.108 |
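To tie the two stages together: the first stage isolates the litchi fruit and branch (semantic segmentation plus HSV thresholding for branch pixels), and the cleaned, white-background image is then passed to the multi-view generator (Wonder3D in the last row above). The OpenCV sketch below illustrates only this preprocessing step; the HSV threshold range is illustrative, and the segmentation mask is assumed to come from the trained Mask2Former model.

```python
import cv2
import numpy as np

def branch_mask_hsv(image_bgr: np.ndarray) -> np.ndarray:
    """Binary mask of branch-colored pixels via HSV thresholding (range is illustrative)."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    lower = np.array([10, 30, 40], dtype=np.uint8)
    upper = np.array([30, 255, 255], dtype=np.uint8)
    return cv2.inRange(hsv, lower, upper)

def prepare_for_multiview(image_bgr: np.ndarray, seg_mask: np.ndarray) -> np.ndarray:
    """Composite segmented litchi and branch pixels onto a white background.

    seg_mask: per-pixel class map from the segmentation stage
    (0 = background, 1 = litchi, 2 = branch). The returned image is what the
    multi-view generation stage would receive as input.
    """
    keep = np.where(seg_mask > 0, 255, 0).astype(np.uint8)
    keep = cv2.bitwise_or(keep, branch_mask_hsv(image_bgr))

    white = np.full_like(image_bgr, 255)
    fg = cv2.bitwise_and(image_bgr, image_bgr, mask=keep)
    bg = cv2.bitwise_and(white, white, mask=cv2.bitwise_not(keep))
    return cv2.add(fg, bg)
```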
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).