A Practical Deep Learning Architecture for Large-Area Solid Wastes Monitoring Based on UAV Imagery
Abstract
1. Introduction
- (1) Approximately 450 km² of high-resolution UAV imagery covering the Nanhu District of Jiaxing City and some neighboring areas was collected, and a high-precision solid waste pile segmentation dataset spanning multiple scene types was created through pixel-wise labeling.
- (2) A dual-branch semantic segmentation model is proposed that obtains a more comprehensive feature representation by fusing a contextual feature branch with a spatial feature branch (a minimal sketch of this dual-branch idea follows this list). Compared with other semantic segmentation methods, it achieves state-of-the-art performance on the high-resolution solid waste pile segmentation dataset and shows excellent capability in segmenting the complex edges of solid waste piles across multiple scenes.
- (3) This work offers a new approach to large-area, high-precision solid waste monitoring that effectively reduces the cost of manual inspection and supervision. It can provide scientific and theoretical support for urban ecological civilization construction and sustainable development, and contribute to protecting land resources and the ecological environment.
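As a companion to contribution (2), the following is a minimal PyTorch sketch of the dual-branch idea: a deep, heavily downsampled context branch captures global semantics while a shallow, high-resolution spatial branch preserves edge detail, and the two are fused before the segmentation head. The module layout, channel widths, and class names (`DualBranchSegNet`, `ConvBNReLU`) are illustrative assumptions, not the exact architecture proposed in this paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvBNReLU(nn.Sequential):
    """3x3 convolution -> batch norm -> ReLU, the basic unit of both branches."""

    def __init__(self, c_in, c_out, stride=1):
        super().__init__(
            nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False),
            nn.BatchNorm2d(c_out),
            nn.ReLU(inplace=True),
        )


class DualBranchSegNet(nn.Module):
    """Illustrative dual-branch segmentation model: the context branch
    downsamples aggressively for global semantics; the spatial branch
    stays at higher resolution to keep waste-pile boundary detail."""

    def __init__(self, num_classes=2):
        super().__init__()
        # Context branch: downsample to 1/16 resolution.
        self.context = nn.Sequential(
            ConvBNReLU(3, 32, stride=2),     # 1/2
            ConvBNReLU(32, 64, stride=2),    # 1/4
            ConvBNReLU(64, 128, stride=2),   # 1/8
            ConvBNReLU(128, 128, stride=2),  # 1/16
        )
        # Spatial branch: stop at 1/8 resolution.
        self.spatial = nn.Sequential(
            ConvBNReLU(3, 64, stride=2),     # 1/2
            ConvBNReLU(64, 64, stride=2),    # 1/4
            ConvBNReLU(64, 128, stride=2),   # 1/8
        )
        self.head = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        ctx = self.context(x)  # (B, 128, H/16, W/16)
        spa = self.spatial(x)  # (B, 128, H/8,  W/8)
        # Fuse by upsampling context features to the spatial resolution.
        ctx = F.interpolate(ctx, size=spa.shape[2:], mode="bilinear",
                            align_corners=False)
        logits = self.head(torch.cat([ctx, spa], dim=1))
        return F.interpolate(logits, size=x.shape[2:], mode="bilinear",
                             align_corners=False)


# Example: one 512x512 RGB tile -> per-pixel waste/background logits.
out = DualBranchSegNet()(torch.randn(1, 3, 512, 512))  # (1, 2, 512, 512)
```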
2. Study Area
3. Methods
3.1. Dataset
3.1.1. Data Acquisition
3.1.2. Data Labeling
- (1) Area selection: priority is given to large, concentrated waste areas. A garbage pile is sampled when expert judgment puts its area at no less than 1 square meter, which, given the imagery's spatial resolution, corresponds to more than 900 pixels (see the worked example after this list). Objects of a similar nature adjacent to a large garbage pile are also labeled as garbage.
- (2) Covering multiple scene types: solid waste piles commonly found in daily life occur in many environments, such as around buildings, roads, grasslands, rivers, and forests. So that the model can recognize solid waste across these settings, the sampling areas must span all of these scene types.
- (3) Balancing garbage category proportions: surveying the full imagery, garbage falls into three categories: construction waste, household waste, and mixed waste. Construction waste comprises materials such as wood, bricks, and concrete generated during construction and demolition; household waste comprises plastic, kitchen waste, paper, and similar refuse from daily life; mixed waste is garbage of mixed sources and types. This study does not distinguish between garbage types in the labels, but the sampling areas must keep the three categories roughly balanced to meet monitoring requirements.
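The 900-pixel floor in criterion (1) follows directly from the imagery's ground sampling distance (GSD): a 1 m² pile covers 900 pixels when each pixel spans about 3.3 cm on the ground. A short sketch of that conversion follows; the ~3.3 cm/pixel GSD is inferred from the stated numbers, not quoted from the paper.

```python
def min_pixel_count(min_area_m2: float, gsd_m: float) -> int:
    """Number of pixels a pile of min_area_m2 covers when each pixel
    spans gsd_m metres on the ground."""
    return round(min_area_m2 / gsd_m ** 2)


# An assumed GSD of ~3.3 cm/pixel (1/30 m) reproduces the paper's
# 900-pixel labeling floor for a 1 square-metre pile.
print(min_pixel_count(1.0, 1 / 30))  # -> 900
```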
3.1.3. Training Dataset
3.2. Segmentation Model
3.2.1. Backbone and Segmentation Head
3.2.2. Context Feature Branch
3.2.3. Spatial Feature Branch
3.2.4. Feature Fusion Module
- (1) Different processing scales. The GSM block operates at 1/16 resolution in stage 1 and 1/32 in stage 2, while the DCR operates at 1/8 in both stages. Because global semantic information covers the entire image, the GSM block can concentrate on highlighting global semantic features in low-resolution feature maps; spatial detail, by contrast, is concentrated in local regions, so the high-resolution feature maps let the DCR focus on highlighting spatial detail. This separate extraction mechanism makes feature extraction more efficient.
- (2) Multi-level fusion and complementary information. The learning of the GSM block is conditioned on DCR features, and the learning of the DCR on SWR features. In the contextual branch, downsampling and global semantic extraction lose feature map detail; the high-resolution spatial features supplement that detail and counteract the blurring this information loss causes. The spatial branch's low-level detail features, in turn, typically contain a large amount of noise and redundancy; after this exchange, they acquire high-level global semantics, and the detail information is weighted by semantic importance (see the fusion sketch after this list).
- (3) Improving the joint training and prediction accuracy of the GSM and DCR blocks. A joint training framework trains multiple components together in an integrated manner. Because the two branches depend on each other, integrating them into such a framework better exploits their complementary information, enhancing the model's expressive power and prediction accuracy while also reducing the reliance on sample size.
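The following is a hedged sketch of the bidirectional fusion pattern described in points (1) and (2): the low-resolution semantic features weight the high-resolution detail features by semantic importance, while the detail features supplement the blurred, upsampled semantics. The block names follow the text (GSM, DCR), but the internals shown (a sigmoid gate plus a merge convolution) are illustrative assumptions rather than the paper's exact module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureFusion(nn.Module):
    """Illustrative fusion of GSM-style semantic features (1/16 or 1/32
    resolution) with DCR-style detail features (1/8 resolution)."""

    def __init__(self, channels=128):
        super().__init__()
        # Semantic importance weights used to re-weight detail features.
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )
        # Merge the two streams and smooth the result.
        self.refine = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, semantic, detail):
        # Upsample low-resolution semantics to the detail resolution.
        semantic = F.interpolate(semantic, size=detail.shape[2:],
                                 mode="bilinear", align_corners=False)
        # Weight noisy low-level detail by semantic importance (point 2).
        detail = detail * self.gate(semantic)
        # Detail supplements the blurring caused by downsampling (point 2).
        return self.refine(torch.cat([semantic, detail], dim=1))
```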
4. Experiments and Discussion
4.1. Implementation Details
4.2. Accuracy Metrics
4.3. Ablation Study
4.4. Method Comparison
4.4.1. Quantitative Analysis
4.4.2. Qualitative Analysis
Multi-Scene Comparison
Extensive Examples
4.5. Effectiveness of Comprehensive Monitoring and Management
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Mohee, R.; Mauthoor, S.; Bundhoo, Z.M.A.; Somaroo, G.; Soobhany, N.; Gunasee, S. Current status of solid waste management in small island developing states: A review. Waste Manag. 2015, 43, 539–549.
- Grazhdani, D. Assessing the variables affecting on the rate of solid waste generation and recycling: An empirical analysis in Prespa Park. Waste Manag. 2016, 48, 3–13.
- Incekara, A.H.; Delen, A.; Seker, D.Z.; Goksel, C. Investigating the Utility Potential of Low-Cost Unmanned Aerial Vehicles in the Temporal Monitoring of a Landfill. ISPRS Int. J. Geo-Inf. 2019, 8, 22.
- Manzo, C.; Mei, A.; Zampetti, E.; Bassani, C.; Paciucci, L.; Manetti, P. Top-down approach from satellite to terrestrial rover application for environmental monitoring of landfills. Sci. Total Environ. 2017, 584–585, 1333–1348.
- Li, H.; Hu, C.; Zhong, X.; Zeng, C.; Shen, H. Solid Waste Detection in Cities Using Remote Sensing Imagery Based on a Location-Guided Key Point Network With Multiple Enhancements. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 191–201.
- Bui, T.D.; Tsai, F.M.; Tseng, M.L.; Ali, M.H. Identifying sustainable solid waste management barriers in practice using the fuzzy Delphi method. Resour. Conserv. Recycl. 2020, 154, 104625.
- Zeng, L.; Sun, H.; Peng, T.; Hui, T. Effect of glass content on sintering kinetics, microstructure and mechanical properties of glass-ceramics from coal fly ash and waste glass. Mater. Chem. Phys. 2021, 260, 124120.
- Malche, T.; Maheshwary, P.; Tiwari, P.K.; Alkhayyat, A.H.; Bansal, A.; Kumar, R. Efficient solid waste inspection through drone-based aerial imagery and TinyML vision model. Trans. Emerg. Telecommun. Technol. 2023, e4878.
- Padubidri, C.; Kamilaris, A.; Karatsiolis, S. Accurate Detection of Illegal Dumping Sites Using High Resolution Aerial Photography and Deep Learning. In Proceedings of the 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), Pisa, Italy, 21–25 March 2022.
- Ramos, R.d.O.; de Sousa Fernandes, D.D.; de Almeida, V.E.; Gonsalves Dias Diniz, P.H.; Lopes, W.S.; Leite, V.D.; Ugulino de Araujo, M.C. A video processing and machine vision-based automatic analyzer to determine sequentially total suspended and settleable solids in wastewater. Anal. Chim. Acta 2022, 1206, 339411.
- Gill, J.; Faisal, K.; Shaker, A.; Yan, W.Y. Detection of waste dumping locations in landfill using multi-temporal Landsat thermal images. Waste Manag. Res. 2019, 37, 386–393.
- Begur, H.; Dhawade, M.; Gaur, N.; Dureja, P.; Gao, J.; Mahmoud, M.; Huang, J.; Chen, S.; Ding, X. An edge-based smart mobile service system for illegal dumping detection and monitoring in San Jose. In Proceedings of the 2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), San Francisco, CA, USA, 4–8 August 2017; pp. 1–6.
- DeVries, B.; Huang, C.; Armston, J.; Huang, W.; Jones, J.W.; Lang, M.W. Rapid and robust monitoring of flood events using Sentinel-1 and Landsat data on the Google Earth Engine. Remote Sens. Environ. 2020, 240, 111664.
- Enegbuma, W.I.; Bamgbade, J.A.; Ming, C.P.H.; Ohueri, C.C.; Tanko, B.L.; Ojoko, E.O.; Dodo, Y.A.; Kori, S.A. Real-Time Construction Waste Reduction Using Unmanned Aerial Vehicle. In Handbook of Research on Resource Management for Pollution and Waste Treatment; IGI Global: Hershey, PA, USA, 2020.
- Osco, L.P.; Marcato Junior, J.; Marques Ramos, A.P.; de Castro Jorge, L.A.; Fatholahi, S.N.; de Andrade Silva, J.; Matsubara, E.T.; Pistori, H.; Gonçalves, W.N.; Li, J. A review on deep learning in UAV remote sensing. Int. J. Appl. Earth Obs. Geoinf. 2021, 102, 102456.
- Manfreda, S.; McCabe, M.F.; Miller, P.E.; Lucas, R.; Pajuelo Madrigal, V.; Mallinis, G.; Ben Dor, E.; Helman, D.; Estes, L.; Ciraolo, G.; et al. On the Use of Unmanned Aerial Systems for Environmental Monitoring. Remote Sens. 2018, 10, 641.
- Tmušić, G.; Manfreda, S.; Aasen, H.; James, M.R.; Gonçalves, G.; Ben-Dor, E.; Brook, A.; Polinova, M.; Arranz, J.J.; Mészáros, J.; et al. Current Practices in UAS-based Environmental Monitoring. Remote Sens. 2020, 12, 1001.
- Shen, C. A transdisciplinary review of deep learning research and its relevance for water resources scientists. Water Resour. Res. 2018, 54, 8558–8593.
- Gonçalves, G.; Andriolo, U.; Pinto, L.; Bessa, F. Mapping marine litter using UAS on a beach-dune system: A multidisciplinary approach. Sci. Total Environ. 2020, 706, 135742.
- Li, J.; Chen, J.; Sheng, B.; Li, P.; Yang, P.; Feng, D.D.; Qi, J. Automatic detection and classification system of domestic waste via multimodel cascaded convolutional neural network. IEEE Trans. Ind. Inform. 2021, 18, 163–173.
- Sheng, T.J.; Islam, M.S.; Misran, N.; Baharuddin, M.H.; Arshad, H.; Islam, M.R.; Chowdhury, M.E.; Rmili, H.; Islam, M.T. An internet of things based smart waste management system using LoRa and tensorflow deep learning model. IEEE Access 2020, 8, 148793–148811.
- Gupta, T.; Joshi, R.; Mukhopadhyay, D.; Sachdeva, K.; Jain, N.; Virmani, D.; Garcia-Hernandez, L. A deep learning approach based hardware solution to categorise garbage in environment. Complex Intell. Syst. 2022, 8, 1129–1152.
- Abdu, H.; Noor, M.H.M. A Survey on Waste Detection and Classification Using Deep Learning. IEEE Access 2022, 10, 128151–128165.
- Majchrowska, S.; Mikołajczyk, A.; Ferlin, M.; Klawikowska, Z.; Plantykow, M.A.; Kwasigroch, A.; Majek, K. Deep learning-based waste detection in natural and urban environments. Waste Manag. 2022, 138, 274–284.
- Yun, K.; Kwon, Y.; Oh, S. Vision-based garbage dumping action detection for real-world surveillance platform. ETRI J. 2019, 41, 494–505.
- Li, X.; Tian, M.; Kong, S.; Wu, L.; Yu, J. A modified YOLOv3 detection method for vision-based water surface garbage capture robot. Int. J. Adv. Robot. Syst. 2020, 17, 1729881420932715.
- Chu, Y.; Huang, C.; Xie, X.; Tan, B.; Kamal, S.; Xiong, X. Multilayer hybrid deep-learning method for waste classification and recycling. Comput. Intell. Neurosci. 2018, 2018, 5060857.
- Shahab, S.; Anjum, M. Solid waste management scenario in India and illegal dump detection using deep learning: An AI approach towards the sustainable waste management. Sustainability 2022, 14, 15896.
- Shi, C.; Tan, C.; Wang, T.; Wang, L. A waste classification method based on a multilayer hybrid convolution neural network. Appl. Sci. 2021, 11, 8572.
- Fasana, C.; Pasini, S. Learning to Detect Illegal Landfills in Aerial Images with Scarce Labeling Data. 2022. Available online: https://hdl.handle.net/10589/196992 (accessed on 20 December 2022).
- Torres, R.N.; Fraternali, P. AerialWaste dataset for landfill discovery in aerial and satellite images. Sci. Data 2023, 10, 63.
- Yu, X.; Chen, Z.; Zhang, S.; Zhang, T. A street rubbish detection algorithm based on Sift and RCNN. In Proceedings of the MIPPR 2017: Automatic Target Recognition and Navigation, Xiangyang, China, 19 February 2018; pp. 97–104.
- Niu, B.; Feng, Q.; Yang, J.; Chen, B.; Gao, B.; Liu, J.; Li, Y.; Gong, J. Solid waste mapping based on very high resolution remote sensing imagery and a novel deep learning approach. Geocarto Int. 2023, 38, 2164361.
- Altikat, A.; Gulbe, A.; Altikat, S. Intelligent solid waste classification using deep convolutional neural networks. Int. J. Environ. Sci. Technol. 2022, 19, 1285–1292.
- Li, N.; Chen, Y. Municipal solid waste classification and real-time detection using deep learning methods. Urban Clim. 2023, 49, 101462.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
- Brostow, G.J.; Fauqueur, J.; Cipolla, R. Semantic object classes in video: A high-definition ground truth database. Pattern Recognit. Lett. 2009, 30, 88–97.
- Neuhold, G.; Ollmann, T.; Rota Bulo, S.; Kontschieder, P. The Mapillary Vistas dataset for semantic understanding of street scenes. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4990–4999.
- Ogunrinde, I.; Bernadin, S. A review of the impacts of defogging on deep learning-based object detectors in self-driving cars. SoutheastCon 2021, 2021, 1–8.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 26 June–1 July 2016; pp. 770–778.
- Chen, F.; Wu, F.; Xu, J.; Gao, G.; Ge, Q. Adaptive deformable convolutional network. Neurocomputing 2021, 453, 853–864.
- Sandler, M.; Howard, A.; Zhu, M.; Zhmoginov, A.; Chen, L. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 4510–4520.
- Yu, C.; Gao, C.; Wang, J.; Yu, G.; Shen, C.; Sang, N. BiSeNet V2: Bilateral network with guided aggregation for real-time semantic segmentation. Int. J. Comput. Vis. 2021, 129, 3051–3068.
- Qi, J.; Nguyen, M.; Yan, W. Waste classification from digital images using ConvNeXt. In Image and Video Technology: Proceedings of the 10th Pacific-Rim Symposium, PSIVT 2022, Virtual Event, 12–14 November 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 1–13.
- Minaee, S.; Boykov, Y.; Porikli, F. Image segmentation using deep learning: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 3523–3542.
- Hong, Y.; Pan, H.; Sun, W.; Jia, Y. Deep dual-resolution networks for real-time and accurate semantic segmentation of road scenes. arXiv 2021, arXiv:2101.06085.
- Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Computer Vision—ECCV 2018: Proceedings of the 15th European Conference, Munich, Germany, 8–14 September 2018; Springer International Publishing: Cham, Switzerland, 2018; Volume 11211, pp. 833–851.
- Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: Proceedings of the 18th International Conference, Munich, Germany, 5–9 October 2015; Springer International Publishing: Cham, Switzerland, 2015; Volume 9351, pp. 234–241.
- Chao, P.; Kao, C.-Y.; Ruan, Y.-S.; Huang, C.-H.; Lin, Y. HarDNet: A low memory traffic network. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3552–3561.
- Peng, C.; Zhang, X.; Yu, G.; Luo, G.; Sun, J. Large kernel matters–improve semantic segmentation by global convolutional network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4353–4361.
- Zhou, Z.; Siddiquee, M.M.R.; Tajbakhsh, N.; Liang, J. UNet++: Redesigning Skip Connections to Exploit Multiscale Features in Image Segmentation. IEEE Trans. Med. Imaging 2020, 39, 1856–1867.
- Zhu, X.; Cheng, Z.; Wang, S. Coronary angiography image segmentation based on PSPNet. Comput. Methods Programs Biomed. 2021, 200, 105897.
- Lee, J.; Kim, D.; Ponce, J.; Ham, B. SFNet: Learning object-aware semantic correspondence. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2278–2287.
- Xie, E.; Wang, W.; Yu, Z.; Anandkumar, A.; Alvarez, J.M.; Luo, P. SegFormer: Simple and efficient design for semantic segmentation with transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12077–12090.
- Aleissaee, A.; Kumar, A.; Anwer, R.; Khan, S.; Cholakkal, H. Transformers in remote sensing: A survey. Remote Sens. 2023, 15, 1860.
| Method | Backbone + Head | Context | Spatial | Fusion | F1 (%) | OA (%) | Rec (%) | Prec (%) |
|---|---|---|---|---|---|---|---|---|
| A | √ | | | | 85.791 | 92.972 | 83.915 | 87.752 |
| B | √ | √ | | | 87.869 | 93.956 | 86.574 | 89.203 |
| C | √ | | √ | | 88.142 | 94.123 | 86.392 | 89.964 |
| D | √ | √ | √ | | 88.564 | 94.245 | 88.146 | 88.986 |
| E | √ | √ | √ | √ | 88.875 | 94.390 | 88.626 | 89.125 |
| Method | F1 (%) | OA (%) | Rec (%) | Prec (%) |
|---|---|---|---|---|
| PSPNet | 81.387 | 90.985 | 77.956 | 85.134 |
| DANet | 83.076 | 91.632 | 81.230 | 85.008 |
| DDRNet | 83.246 | 91.546 | 83.068 | 83.424 |
| BiSeNet V2 | 84.482 | 92.413 | 81.683 | 87.479 |
| SegFormer | 85.113 | 92.719 | 82.318 | 88.104 |
| FCHarDNet | 85.189 | 92.713 | 82.889 | 87.620 |
| ConvNeXt | 85.286 | 92.738 | 83.239 | 87.436 |
| U-Net | 85.389 | 92.865 | 82.457 | 88.537 |
| DeepLabv3+ | 85.843 | 93.037 | 83.495 | 88.328 |
| SFNet | 86.094 | 93.162 | 83.724 | 88.602 |
| Swin Transformer | 86.173 | 93.071 | 85.405 | 86.956 |
| NestedUNet | 86.572 | 93.505 | 82.803 | 90.701 |
| GCN | 87.683 | 93.885 | 86.087 | 89.340 |
| SWM | 88.875 | 94.390 | 88.626 | 89.125 |
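For completeness, the following is a minimal sketch of the standard per-pixel definitions behind the four metrics reported in the tables above (overall accuracy, precision, recall, F1) for a binary waste-versus-background mask. The paper's exact evaluation protocol (for example, per-image versus dataset-wide aggregation) is not restated here, so this reflects the conventional definitions only.

```python
import numpy as np


def pixel_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-12) -> dict:
    """Per-pixel metrics (in %) for binary masks where 1 = solid waste."""
    tp = np.sum((pred == 1) & (gt == 1))  # waste pixels found
    fp = np.sum((pred == 1) & (gt == 0))  # background mislabeled as waste
    fn = np.sum((pred == 0) & (gt == 1))  # waste pixels missed
    tn = np.sum((pred == 0) & (gt == 0))  # background correctly ignored
    prec = tp / (tp + fp + eps)           # correctness of predicted waste
    rec = tp / (tp + fn + eps)            # completeness of true waste
    return {
        "OA": 100 * (tp + tn) / (tp + tn + fp + fn),  # overall accuracy
        "Prec": 100 * prec,
        "Rec": 100 * rec,
        "F1": 100 * 2 * prec * rec / (prec + rec + eps),
    }
```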