1. Introduction
Marine waste poses severe and far-reaching threats to marine life. Of particular concern is the pervasive distribution of microplastics in the oceans, which presents an insidious danger to marine ecosystems and the delicate balance of our planet [1,2,3]. The rapid increase in ocean plastic waste poses multifaceted threats, imperiling marine ecosystems, human health, and crucial ecosystem services [4]. Among the sources of ocean plastic pollution, plastic waste from terrestrial river basins, which accounts for roughly 80% of ocean plastic waste, has an outsized impact and constitutes a severe global environmental issue that must be urgently addressed [5]. Widespread urbanization and human activities in river basins have caused massive waste outfluxes via rivers and streams, resulting in unprecedented ocean plastic pollution. Detecting, classifying, and quantitatively counting river plastic waste is therefore essential for assessing river quality and the resulting environmental impacts on the ocean [6].
Traditionally, water quality control has relied on manual assessments of river conditions, with subsequent efforts dedicated to collecting, monitoring, and quantifying floating waste through labor-intensive field surveys [7,8,9]. The quantifying approach [7] used a net with a 2.5 cm mesh at Noda Bridge on the Edo River in Japan to measure the waste; according to their findings, 6% of the total waste by weight was anthropogenic. The collecting approach [8] reported on the prevalence of plastic waste, ranging from 0.8 to 5.1% of the total macro-plastics, and intercepted an annual average of 22 to 36 tons of floating plastic waste using floating waste-retention booms. As for the monitoring approach [9], a visual observation method was developed to systematically collect data on floating macro-plastics through collaboration with European countries. While informative, these manual methodologies were labor-intensive and incurred substantial costs. As the demand for more efficient and cost-effective water quality control methods intensifies, there is a growing imperative to explore and implement advanced technologies and automated systems that can augment or replace these traditional approaches, ensuring more comprehensive and sustainable management of water resources on a global scale [10,11].
An automated waste detection system has emerged as a linchpin in the optimization of river cleaning and waste removal efforts. Grounded in cutting-edge computer vision technologies, such a system is poised to revolutionize the monitoring of polluted waterways, offering heightened efficiency and accuracy. At its technological core lies object detection, a computer vision technique adept at pinpointing the exact locations of waste items within images or video frames [12,13]. This capability not only ensures the early identification of pollution but also facilitates swift response and cleanup operations, transforming the landscape of environmental conservation. Despite these advancements, challenges persist in continuously monitoring the temporal fluctuations in waste quantities. Kataoka et al. [14] addressed the problem of tracking waste in rivers by developing an image analysis method designed specifically for this task, making it easier to monitor and manage river waste. However, this method faced hurdles in fully detecting plastic waste across various water types, as it relied on distinguishing color differences in each water body to classify the type of waste. While these automated systems excel at identifying and categorizing types of waste, the accurate quantification of the volume of debris flowing from rivers into the ocean remains a complex challenge, necessitating ongoing refinements and innovations in this dynamic and vital field of environmental technology.
In advancing the automated waste detection system for river pollution monitoring, the incorporation of object tracking emerges as a transformative element, particularly in the dynamic context of flowing water. Continuously tracing the motion of identified waste objects across successive video frames allows the system not only to detect pollution but also to monitor its trajectory and assess downstream implications. Applications in other domains illustrate the versatility of such tracking technologies: L. Gatelli et al. [15] and Jingyi et al. [16] used YOLOv4 for vehicle detection at road intersections, and S. Charran et al. [17] proposed automating the ticketing process for traffic violations using image recognition. Similarly, Y. Ge et al.'s [18] study on tomato growth monitoring with the YOLO-DeepSORT model underlines the potential for accurate data collection and decision-making in diverse fields. By combining object detection and tracking, the system not only identifies and categorizes waste but also quantifies it, enabling precise waste counting and offering a comprehensive understanding of pollution levels. This innovation empowers environmental authorities to optimize resource allocation, prioritize cleanup initiatives, and contribute to the sustainable preservation of aquatic ecosystems, providing a robust tool for comprehending the types and quantities of waste flowing from rivers into the ocean.
This paper aims to pioneer the development of an automated waste measurement method tailored for real river environments. To achieve this, we considered different scenarios involving waste conditions using an innovative automated waste detection system, leveraging advancements in object detection methods and introducing enhancements to the established YOLOv5 [19] model architecture. The proposed system aspires to surpass conventional methods by incorporating cutting-edge features and methodologies, with a specific focus on addressing the challenges associated with monitoring flowing river environments. Beyond its object detection capabilities, the paper introduces DeepSORT [20] as an object-tracking method tailored for video frames of flowing water, offering a solution to the intricate task of waste counting in dynamic river scenarios. By amalgamating these enhancements into the model, the system aims not only to detect but also to accurately quantify waste, providing a more comprehensive understanding of pollution levels in water bodies. This innovative approach holds potential for significant advancements in river pollution monitoring, offering a robust tool for environmental conservation and resource management.
2. Materials and Methods
Figure 1 depicts a system diagram outlining the proposed floating waste measurement method. It illustrates the flow from image input to YOLOv5-based detection, then to DeepSORT tracking, and finally, to data analysis and output.
2.1. Implementation of the Proposed Method
The method we propose in this section combines the strengths of YOLOv5 for detecting waste and YOLOv5_DeepSORT for counting waste, creating an efficient system to monitor and assess river waste in video frames. The research project consists of four important parts. First, we created a diverse and expanded dataset covering seven different waste classes to ensure that our model is well trained across various types of waste. The second part focuses on detecting and classifying waste, using the powerful YOLOv5 architecture to accurately identify different types of river waste. The third part introduces innovation by smoothly integrating the DeepSORT tracking algorithm, allowing real-time tracking of detected waste objects in video frames; this ensures not only precise identification but also continuous monitoring and counting of waste over time. The study concludes with a thorough presentation and analysis of results, showing how effective the integrated YOLOv5-DeepSORT system is. This comprehensive approach is a useful tool for environmental managers, providing real-time insights into the changing nature of river waste and contributing to cleaner and more sustainable water ecosystems. Beyond its environmental use, this research highlights the versatility of the method, demonstrating its potential for accurate object detection and counting in other areas, which is a significant step forward in the field.
2.2. Dataset and Environmental Scenario
The dataset for this research was created from a diverse array of waste types and quantities for both training and prediction in deep learning. The collection process involved capturing images of waste, particularly plastics, under controlled conditions. A 12 MP RGB camera (MAPIR, Inc., San Diego, CA, USA) with specific spectral bands (red: 660 nm, green: 550 nm, blue: 475 nm) was employed for image capture, ensuring high resolution and an accurate representation of the waste in the laboratory setting. The dataset categorized simulated waste into seven distinct classes: cans, cartons, plastic bottles, foam, glass, paper, and plastic, reflecting a comprehensive spectrum of environmental debris. This categorization serves as the foundation for training and validating the deep learning model. The dataset not only provides diversity in waste types but also encompasses varying quantities, enriching the model's ability to discern and classify different levels of pollution in controlled laboratory conditions.
The study places a particular emphasis on scenario diversity in the creation of the dataset, incorporating three waste detection scenarios that emulate different environmental conditions. In the first scenario (Case 1), characterized by clear visibility, individual items of waste are easily distinguishable. The second scenario (Case 2) considers partially submerged waste, evaluating the model’s adaptability to changes in visibility and underwater scenarios. The third scenario (Case 3) involves waste forming a collective mass, requiring the model to accurately detect individual items within clusters. By incorporating these distinct scenarios, the study comprehensively evaluates the model’s behavior in response to variations in environmental conditions and the morphology of the waste. This approach has the potential to enhance the reliability of the detection system in real river scenarios.
2.3. Waste Detection Algorithms
Waste detection models represent a crucial frontier in leveraging advanced technologies to tackle environmental challenges. These models, employing computer vision and machine learning techniques, play a pivotal role in swiftly and accurately identifying and categorizing waste within images and video data. The landscape of available models includes prominent ones such as YOLOv5, the Faster Region-based Convolutional Neural Network (Faster R-CNN) [21], the Single-Shot Multibox Detector (SSD) [22], and Mask R-CNN [23]. Each model brings its unique strengths to the table, striking a different balance between detection speed and accuracy: YOLO is known for its real-time capabilities, Faster R-CNN offers high accuracy, SSD provides rapid scanning, and Mask R-CNN adds pixel-level segmentation, exemplifying the diverse methodologies employed in waste detection.
In the context of this research, YOLOv5 emerges as the model of choice for waste detection, a well-established model in the You Only Look Once (YOLO) series. Its selection is driven by its exceptional balance between speed and accuracy, making it particularly well suited to the dynamic requirements of waste detection in environmental monitoring. YOLOv5 excels in the simultaneous detection of multiple objects in a single inference, a critical feature for real-time responsiveness. Additionally, the model demonstrates high accuracy in identifying various types of waste, aligning with the research's objectives. The choice of YOLOv5 reflects a strategic decision based on its robust performance, adaptability to diverse environmental conditions, and efficiency in delivering accurate and timely insights.
The implementation of YOLOv5 involves a meticulous fine-tuning process, starting with the preparation of a comprehensive waste detection dataset containing images and corresponding annotations. The subsequent setup of YOLOv5 includes the installation of essential libraries and dependencies, followed by the initialization of the model using either pretrained weights or custom weight configurations tailored to the specific dataset. Fine-tuning is a crucial step that requires adjustments to hyperparameters in
Table S1, including the initial learning rate (lr0) set to 0.00872, the number of epochs set to 80, and batch size set to 8. This process allows the model to adapt to the nuances of the dataset and refine its performance for accurate waste detection. The choice of hyperparameters, including learning rates and regularization techniques, is pivotal in optimizing the model’s performance. The evaluation phase, utilizing a test dataset, ensures that the fine-tuned model meets the desired standards of accuracy and precision. The adaptability, speed, and accuracy of YOLOv5 position it as an ideal solution for automating waste detection across a spectrum of environmental monitoring scenarios, providing a robust tool for addressing environmental challenges.
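For concreteness, the fine-tuning described above can be expressed as a call to the training entry point of the Ultralytics YOLOv5 repository. The sketch below is illustrative rather than the authors' exact configuration: the dataset and hyperparameter file names (waste.yaml, hyp.waste.yaml) are hypothetical placeholders, while the numeric values follow Table S1.

```python
# Hypothetical fine-tuning call, run from inside a clone of
# github.com/ultralytics/yolov5; file names are placeholders.
import train  # yolov5/train.py

train.run(
    data="waste.yaml",       # train/val image paths and the seven waste class names
    hyp="hyp.waste.yaml",    # hyperparameter file with lr0 set to 0.00872 (Table S1)
    weights="yolov5s.pt",    # start from COCO-pretrained YOLOv5 weights
    imgsz=640,               # training image size
    epochs=80,               # number of epochs (Table S1)
    batch_size=8,            # batch size (Table S1)
)
```

The same run can equivalently be launched from the command line with the repository's train.py script; the fine-tuned weights produced by either route are what the detection and tracking stages below consume.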
2.4. Waste Tracking and Counting Algorithms
Waste tracking models, including SORT (Simple Online and Realtime Tracking) [24] and its extension, DeepSORT, are pivotal in continuously monitoring and tracking the movement and positions of waste objects. SORT employs Kalman filtering and the Hungarian algorithm for real-time online tracking, predicting object motion and associating detections effectively. DeepSORT, extending SORT, incorporates deep learning for appearance-based re-identification matching, mitigating identity switching and ensuring robust tracking even in scenarios involving occlusions. By integrating appearance information with its tracking components, DeepSORT excels at the real-time tracking of multiple objects in video streams. These models enhance the precision of waste outflow measurements, contributing to effective environmental monitoring and waste management with minimized environmental impact.
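To make the association step concrete, the following simplified sketch shows how predicted track boxes can be matched to new detections by minimizing a (1 − IoU) cost with the Hungarian algorithm, as SORT does; DeepSORT additionally blends an appearance (re-identification) distance into the same cost matrix. This is an illustrative reimplementation for exposition, not the tracker used in this study.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def associate(track_boxes, detection_boxes, iou_threshold=0.3):
    """Match predicted track boxes to new detections (Hungarian algorithm).

    Returns (matched index pairs, unmatched track indices, unmatched detection indices).
    """
    if not track_boxes or not detection_boxes:
        return [], list(range(len(track_boxes))), list(range(len(detection_boxes)))
    # Cost = 1 - IoU, so the optimal assignment maximizes total overlap.
    cost = np.array([[1.0 - iou(t, d) for d in detection_boxes] for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= iou_threshold]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_tracks = [i for i in range(len(track_boxes)) if i not in matched_t]
    unmatched_dets = [j for j in range(len(detection_boxes)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets
```

Unmatched detections would spawn new tracks and unmatched tracks would age out, which is the mechanism that later allows each waste item to be counted exactly once.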
In this study, we chose DeepSORT (SORT with a Deep Association Metric) as the waste-tracking model. Building upon SORT, DeepSORT uses deep learning to improve tracking, especially in recognizing and matching object appearances. It helps overcome challenges such as objects being occluded for long periods or changing in appearance, making tracking more accurate and reducing confusion when re-identifying objects. The model's ability to smoothly combine appearance cues with its tracking components aligns with the study's focus on handling complicated waste-tracking situations. Including DeepSORT in the study allows waste flow to be measured more precisely, advancing environmental monitoring and waste management efforts with a focus on accurate tracking in real-time situations.
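The sketch below outlines how the detection, tracking, and counting stages can be chained in a single loop. It assumes a fine-tuned YOLOv5 weights file and the third-party deep-sort-realtime package; the file names, parameter values, and the det_class attribute used to recover each track's class label are assumptions for illustration, not the authors' exact implementation.

```python
# Sketch of a detection -> tracking -> counting loop; illustrative only.
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")  # fine-tuned YOLOv5
tracker = DeepSort(max_age=30)   # drop tracks unseen for 30 frames
counted_ids = set()              # unique waste objects seen so far
class_counts = {}                # per-class totals

cap = cv2.VideoCapture("river_clip.mp4")  # placeholder input video
while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = model(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))  # BGR -> RGB for inference
    detections = []
    for x1, y1, x2, y2, conf, cls in results.xyxy[0].tolist():
        # deep-sort-realtime expects ([left, top, width, height], confidence, class_label)
        detections.append(([x1, y1, x2 - x1, y2 - y1], conf, int(cls)))
    tracks = tracker.update_tracks(detections, frame=frame)
    for track in tracks:
        if not track.is_confirmed() or track.track_id in counted_ids:
            continue
        counted_ids.add(track.track_id)          # count each track ID only once
        label = model.names[track.det_class]     # class attached at detection time (assumed attribute)
        class_counts[label] = class_counts.get(label, 0) + 1
cap.release()
print(len(counted_ids), class_counts)
```

Counting unique confirmed track IDs, rather than per-frame detections, is what prevents the same floating item from being counted repeatedly as it drifts through successive video frames.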
2.5. Performance and Evaluation
Evaluating the performance of object detection and tracking models, such as YOLOv5, is a critical aspect of deep learning and computer vision. This assessment relies on several essential metrics, including precision (P), recall (R), the F1 score, the precision–recall (PR) curve, average precision (AP), and mean average precision (mAP) [25]. Precision measures the model's ability to correctly identify positive instances, recall gauges its capability to detect all actual positives, and the F1 score offers a balanced evaluation that avoids favoring one metric over the other. The PR curve graphically illustrates the trade-off between precision and recall at different confidence thresholds, aiding in threshold selection. AP quantifies object detection accuracy by measuring the area under the precision–recall curve for a specific class, while mAP averages the APs across multiple classes [25]. In the mAP calculation, 'n' indexes the confidence thresholds and 'class' denotes the number of waste classes. Together, these metrics empower researchers and engineers to optimize YOLOv5, tailoring it to specific needs, whether emphasizing accuracy, comprehensiveness, or a balance between the two, thereby ensuring precise and efficient object detection and tracking in diverse applications. Precision, recall, the F1 score, and mAP can be calculated using Equations (1)–(4).
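Since Equations (1)–(4) are not reproduced here, the standard definitions assumed for these metrics are summarized below, where TP, FP, and FN denote true positives, false positives, and false negatives, P_n and R_n are the precision and recall at the n-th confidence threshold on the PR curve, and class is the number of waste classes.

```latex
% Standard definitions assumed for Equations (1)-(4).
\begin{align}
P  &= \frac{TP}{TP + FP} \\
R  &= \frac{TP}{TP + FN} \\
F1 &= \frac{2 \cdot P \cdot R}{P + R} \\
\mathrm{mAP} &= \frac{1}{\mathit{class}} \sum_{k=1}^{\mathit{class}} \mathrm{AP}_k,
\qquad \mathrm{AP}_k = \sum_{n} \left( R_n - R_{n-1} \right) P_n
\end{align}
```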
4. Discussion
The proposed research achieved an 88.0% mAP for seven different waste classes, demonstrating high accuracy (Table 2). This result shows excellent performance on a large test image set, and it maintains a very high level of accuracy when compared to other studies. Other studies have focused on different waste classes or object classes, and they have achieved varying mAP scores. For instance, the YOLOv3 study focused on four waste classes and achieved a 77.2% mAP on 37 test images. In contrast, the Faster R-CNN study concentrated on three waste classes and achieved an 81.0% mAP. Furthermore, the YOLOv4 and DeepSORT study centered on different vehicle classes, attaining a 78% mAP. The primary advantage of the proposed research is its ability to achieve high accuracy while focusing on the seven distinct waste classes. This study's significant improvement in accuracy is attributed to factors such as fine-tuning and the abundance of training images. This makes it highly effective in detecting and classifying a wide range of waste types.
This study, similar to the work by Jingyi et al. [16], underscores the challenges associated with accurately classifying objects that share similar shapes. Notably, the present study shows the lowest accuracy in detecting plastic bottles and plastic objects. Both studies recognize the difficulty of accurately classifying objects with similar shapes and share an emphasis on addressing this challenge. To address this issue, future work should incorporate specialized training data that helps the model discern subtle differences in shape and features.
The presented results across these cases shed light on how the characteristics of the water environment and the morphology of the waste affect detection. In Case 1, where visibility is clear, individual waste items are easily discernible, leading to high recall and precision scores across most waste categories; this scenario represents ideal conditions for detection. Case 2 introduces partially submerged waste, simulating decreased visibility and underwater scenarios. Despite the added complexity, the model demonstrates adaptability, albeit with slightly lower performance metrics than in Case 1. Case 3 presents a more challenging scenario, with waste forming clusters, demanding that the model accurately detect individual items within these masses; here, performance varies, with some categories experiencing decreased precision and recall. Overall, these findings underline the importance of scenario diversity in dataset creation for waste detection models. By encompassing various environmental conditions and waste morphologies, such as clear water, underwater, and clustered waste, the model's robustness and reliability in real river scenarios can be significantly enhanced, supporting effective waste management and environmental preservation efforts.
The model demonstrates an overall precision of 80%, indicating that it is capable of maintaining a consistent level of accuracy despite environmental influences. However, Y. Ge et al.'s study [18] highlighted challenges associated with adverse environmental conditions during video capture, potentially resulting in missed detections and negative impacts on target tracking and counting. This raises concerns about the practical applicability of the model in real-world scenarios, where unpredictable weather or complex scenes may hinder its performance and reliability. To address these concerns, future efforts should prioritize enhancing the model's robustness so that it can handle challenging real-world conditions effectively. Improving reliability and stability, especially in adverse environmental contexts, emerges as a pivotal focus for upcoming research.
In practical implementation, our study utilized a high-performance computer for machine learning training, which lasted approximately 6 h. Remarkably, results were obtained in less than 30 s when the fine-tuned model was executed, as indicated in Table 1. While acknowledging the need for powerful computers during training, the feasibility of real-time use on actual rivers is promising. A future challenge involves adapting the model for execution on portable devices or smartphones, ensuring convenience and user-friendliness for broader applicability.
Overall, this research addresses the critical issue of riverine waste management through an innovative approach that integrates cutting-edge technologies, namely YOLOv5 and DeepSORT. The system's ability to accurately detect and classify various types of waste, combined with its real-time tracking capabilities, marks a significant advancement in environmental monitoring. The comprehensive dataset, consideration of seasonal and weather conditions, and incorporation of underwater waste contribute to the model's robustness. However, challenges related to dataset diversity and regional adaptability remain. As outlined in the PC specifications, the reliance on a high-performance computer underscores the computational demands of the approach and the importance of computational efficiency.
5. Conclusions
The proposed research successfully demonstrated the ability to accurately quantify waste across seven categories (cans, cartons, plastic bottles, foam, glass, paper, and plastic) by integrating deep learning architectures, namely YOLOv5 and DeepSORT. Its practicality in natural river environments and its accurate classification and object-tracking capabilities would make it a valuable tool for environmental conservation. Additionally, it offers a promising solution for addressing the pressing issue of plastic pollution in rivers and oceans. However, it is essential to acknowledge the system's limitations, particularly in detecting and counting small objects, objects submerged in water, and objects forming clusters. Furthermore, since this study was conducted in a laboratory setting, considerations such as data requirements, the types of waste encountered, and camera deployment in natural river environments are necessary for its practical application.