A Floating-Waste-Detection Method for Unmanned Surface Vehicle Based on Feature Fusion and Enhancement
Abstract
1. Introduction
- A high proportion of small objects affects neural-network feature extraction;
- Complex water-surface environments strongly degrade the robustness of object-detection networks.
- We design a low-level representation-enhancement module (LREM), which consists of a multiscale-convolution part and a region-detection head. The module strengthens the low-level representation capability of the feature maps, effectively addressing the loss of small-object features during downsampling in the backbone (see the first sketch after this list).
- We propose a new attention fusion module (AFM) to fuse the highest-resolution and lowest-resolution feature maps. By learning the correlations between feature maps at different levels, the module fuses features effectively while reducing the transfer of redundant information (see the second sketch after this list).
- We construct a new floating-waste dataset called FloatingWaste-I (detailed in Section 4.1). The proposed dataset covers more floating-waste categories (carton and bottle) and more lighting conditions (cloudy, rainy, sunny and evenfall). In addition, YOLO-Float is compared with current state-of-the-art models on the public FloW-Img and FloatingWaste-I datasets to verify its effectiveness.
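To make the first contribution concrete, the following PyTorch sketch shows one plausible form of the multiscale-convolution part of the LREM. It is a minimal illustration under stated assumptions, not the paper's exact design: the branch kernel sizes (1, 3 and 5), the residual connection and the name `MultiScaleConv` are ours, and the region-detection head is omitted.

```python
import torch
import torch.nn as nn


class MultiScaleConv(nn.Module):
    """Hypothetical multiscale-convolution block in the spirit of the LREM.

    Parallel 1x1/3x3/5x5 branches are concatenated and fused by a 1x1
    projection; a residual connection preserves the original low-level
    detail. Kernel sizes and branch count are assumptions.
    """

    def __init__(self, channels: int):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, k, padding=k // 2) for k in (1, 3, 5)
        )
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Concatenate the multiscale responses along the channel axis.
        multi = torch.cat([branch(x) for branch in self.branches], dim=1)
        # Fuse back to the input width and add the residual.
        return self.act(x + self.fuse(multi))


if __name__ == "__main__":
    feat = torch.randn(1, 64, 80, 80)  # a high-resolution backbone map
    print(MultiScaleConv(64)(feat).shape)  # torch.Size([1, 64, 80, 80])
```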
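Likewise, the sketch below illustrates one plausible cross-level fusion matching the second contribution's stated goal of fusing the highest- and lowest-resolution maps. The channel-attention gating, the 1x1 alignment convolution and the class name `AttentionFusion` are assumptions for exposition, not the paper's exact AFM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionFusion(nn.Module):
    """Hypothetical cross-level attention fusion.

    Channel weights learned from the semantic (lowest-resolution) map
    gate the detailed (highest-resolution) map before the two are summed,
    suppressing redundant channels during fusion.
    """

    def __init__(self, high_ch: int, low_ch: int):
        super().__init__()
        self.align = nn.Conv2d(low_ch, high_ch, kernel_size=1)  # match widths
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(high_ch, high_ch, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, high: torch.Tensor, low: torch.Tensor) -> torch.Tensor:
        low = self.align(low)
        low = F.interpolate(low, size=high.shape[-2:], mode="nearest")
        # Per-channel weights derived from the semantic map gate the
        # high-resolution features before element-wise fusion.
        return high * self.attn(low) + low


if __name__ == "__main__":
    high = torch.randn(1, 64, 80, 80)  # highest-resolution level
    low = torch.randn(1, 512, 10, 10)  # lowest-resolution level
    print(AttentionFusion(64, 512)(high, low).shape)  # (1, 64, 80, 80)
```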
2. Related Work
2.1. Water-Surface-Object Dataset
- The sea presents a less distracting background than inland waterways: it contains mainly light patches and ripples, and lacks the reflections of trees, houses and roads;
- Large objects such as ships and ferries make up a large proportion of marine datasets.
2.2. Object Detection
3. Proposed Method
3.1. Low-Level Representation-Enhancement Module
3.2. Attention Fusion Module
3.3. Loss Functions
4. Experiments
4.1. Dataset and Implementation Details
4.1.1. FloW-Img Dataset
4.1.2. FloatingWaste-I Dataset
4.1.3. Evaluation Metrics
4.1.4. Experimental Platform and Parameters
4.2. Results of the Experiments
4.2.1. Ablation Studies
4.2.2. Effectiveness of Loss Curve
4.2.3. Effectiveness of AFM
4.3. Comparison with Other Methods
4.3.1. Object Detection on FloW-Img
4.3.2. Object Detection on FloatingWaste-I
5. Discussion and Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Li, W.C.; Tse, H.F.; Fok, L. Plastic waste in the marine environment: A review of sources, occurrence and effects. Sci. Total Environ. 2016, 566, 333–349. [Google Scholar] [CrossRef]
- Jambeck, J.R.; Geyer, R.; Wilcox, C.; Siegler, T.R.; Perryman, M.; Andrady, A.; Narayan, R.; Law, K.L. Plastic waste inputs from land into the ocean. Science 2015, 347, 768–771. [Google Scholar] [CrossRef] [PubMed]
- Lebreton, L.; Van Der Zwet, J.; Damsteeg, J.W.; Slat, B.; Andrady, A.; Reisser, J. River plastic emissions to the world’s oceans. Nat. Commun. 2017, 8, 1–10. [Google Scholar] [CrossRef]
- Akib, A.; Tasnim, F.; Biswas, D.; Hashem, M.B.; Rahman, K.; Bhattacharjee, A.; Fattah, S.A. Unmanned floating waste collecting robot. In Proceedings of the TENCON 2019—2019 IEEE Region 10 Conference (TENCON), Kochi, India, 17–20 October 2019; pp. 2645–2650. [Google Scholar]
- Ruangpayoongsak, N.; Sumroengrit, J.; Leanglum, M. A floating waste scooper robot on water surface. In Proceedings of the 2017 17th International Conference on Control, Automation and Systems (ICCAS), IEEE, Jeju, Republic of Korea, 18–21 October 2017; pp. 1543–1548. [Google Scholar]
- Chang, H.C.; Hsu, Y.L.; Hung, S.S.; Ou, G.R.; Wu, J.R.; Hsu, C. Autonomous water quality monitoring and water surface cleaning for unmanned surface vehicle. Sensors 2021, 21, 1102. [Google Scholar] [CrossRef]
- Hasany, S.N.; Zaidi, S.S.; Sohail, S.A.; Farhan, M. An autonomous robotic system for collecting garbage over small water bodies. In Proceedings of the 2021 6th International Conference on Automation, Control and Robotics Engineering (CACRE), IEEE, Dalian, China, 15–17 July 2021; pp. 81–86. [Google Scholar]
- Li, N.; Huang, H.; Wang, X.; Yuan, B.; Liu, Y.; Xu, S. Detection of Floating Garbage on Water Surface Based on PC-Net. Sustainability 2022, 14, 11729. [Google Scholar] [CrossRef]
- Yang, X.; Zhao, J.; Zhao, L.; Zhang, H.; Li, L.; Ji, Z.; Ganchev, I. Detection of river floating garbage based on improved YOLOv5. Mathematics 2022, 10, 4366. [Google Scholar] [CrossRef]
- Ouyang, C.; Hou, Q.; Dai, Y. Surface Object Detection Based on Improved YOLOv5. In Proceedings of the 2022 IEEE 5th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), Chongqing, China, 16–18 December 2022; Volume 5, pp. 923–928. [Google Scholar]
- Cheng, Y.; Zhu, J.; Jiang, M.; Fu, J.; Pang, C.; Wang, P.; Sankaran, K.; Onabola, O.; Liu, Y.; Liu, D.; et al. Flow: A dataset and benchmark for floating waste detection in inland waters. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 10953–10962. [Google Scholar]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision—ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13. Springer: Cham, Switzerland, 2014; pp. 740–755. [Google Scholar]
- Yang, G.; Feng, W.; Jin, J.; Lei, Q.; Li, X.; Gui, G.; Wang, W. Face mask recognition system with YOLOV5 based on image recognition. In Proceedings of the 2020 IEEE 6th International Conference on Computer and Communications (ICCC), Chengdu, China, 11–14 December 2020; pp. 1398–1404. [Google Scholar]
- Ieamsaard, J.; Charoensook, S.N.; Yammen, S. Deep learning-based face mask detection using yolov5. In Proceedings of the 2021 9th International Electrical Engineering Congress (iEECON), IEEE, Pattaya, Thailand, 10–12 March 2021; pp. 428–431. [Google Scholar]
- Wu, T.H.; Wang, T.W.; Liu, Y.Q. Real-time vehicle and distance detection based on improved yolo v5 network. In Proceedings of the 2021 3rd World Symposium on Artificial Intelligence (WSAI), IEEE, Guangzhou, China, 18–20 June 2021; pp. 24–28. [Google Scholar]
- Dong, X.; Yan, S.; Duan, C. A lightweight vehicles detection network model based on YOLOv5. Eng. Appl. Artif. Intell. 2022, 113, 104914. [Google Scholar] [CrossRef]
- Chen, Z.; Wu, R.; Lin, Y.; Li, C.; Chen, S.; Yuan, Z.; Chen, S.; Zou, X. Plant disease recognition model based on improved YOLOv5. Agronomy 2022, 12, 365. [Google Scholar] [CrossRef]
- Yao, J.; Qi, J.; Zhang, J.; Shao, H.; Yang, J.; Li, X. A real-time detection algorithm for Kiwifruit defects based on YOLOv5. Electronics 2021, 10, 1711. [Google Scholar] [CrossRef]
- Kristan, M.; Kenk, V.S.; Kovačič, S.; Perš, J. Fast image-based obstacle detection from unmanned surface vehicles. IEEE Trans. Cybern. 2015, 46, 641–654. [Google Scholar] [CrossRef] [PubMed]
- Bovcon, B.; Mandeljc, R.; Perš, J.; Kristan, M. Stereo obstacle detection for unmanned surface vehicles by IMU-assisted semantic segmentation. Robot. Auton. Syst. 2018, 104, 1–13. [Google Scholar] [CrossRef]
- Moosbauer, S.; Konig, D.; Jakel, J.; Teutsch, M. A benchmark for deep learning based object detection in maritime environments. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019. [Google Scholar]
- Papageorgiou, C.P.; Oren, M.; Poggio, T. A general framework for object detection. In Proceedings of the Sixth International Conference on Computer Vision (IEEE Cat. No. 98CH36271), Bombay, India, 7 January 1998; pp. 555–562. [Google Scholar]
- Ojala, T.; Pietikainen, M.; Harwood, D. Performance evaluation of texture measures with classification based on Kullback discrimination of distributions. In Proceedings of the 12th International Conference on Pattern Recognition, IEEE, Jerusalem, Israel, 9–13 October 1994; Volume 1, pp. 582–585. [Google Scholar]
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 37, 1904–1916. [Google Scholar] [CrossRef] [PubMed]
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster r-cnn: Towards real-time object detection with region proposal networks. Adv. Neural Inf. Process. Syst. 2015, 28, 91–99. [Google Scholar] [CrossRef]
- Cai, Z.; Vasconcelos, N. Cascade r-cnn: Delving into high quality object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162. [Google Scholar]
- Redmon, J.; Farhadi, A. YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7263–7271. [Google Scholar]
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788. [Google Scholar]
- Bochkovskiy, A.; Wang, C.Y.; Liao, H.Y.M. Yolov4: Optimal speed and accuracy of object detection. arXiv 2020, arXiv:2004.10934. [Google Scholar]
- Jocher, G.; Chaurasia, A.; Stoken, A.; Borovec, J.; Kwon, Y.; Michael, K.; Fang, J.; Yifu, Z.; Wong, C.; Montes, D.; et al. ultralytics/yolov5: v7.0 - YOLOv5 SOTA Realtime Instance Segmentation; Zenodo: Geneva, Switzerland, 2022. [Google Scholar]
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125. [Google Scholar]
- Liu, S.; Qi, L.; Qin, H.; Shi, J.; Jia, J. Path aggregation network for instance segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8759–8768. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-end object detection with transformers. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 213–229. [Google Scholar]
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized intersection over union: A metric and a loss for bounding box regression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 658–666. [Google Scholar]
- Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
- Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 20–22 June 2023; pp. 7464–7475. [Google Scholar]
| Datasets | Year | Frames | Object | Classes | Resolution | Env. | USV-Based | Condition |
|---|---|---|---|---|---|---|---|---|
| MODD [19] | 2016 | 4454 | obstacle | 4 | | M | Y | L |
| SMD [21] | 2017 | 16,000 | boat | 10 | | M | N | L&D |
| MODD2 [20] | 2018 | 11,675 | obstacle | 2 | | M | Y | L |
| FloW-Img [11] | 2021 | 2000 | waste | 1 | | I | Y | L |
| FloatingWaste-I | 2023 | 1867 | waste | 2 | | I | Y | L&D |
| Method | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| YOLOv5 | 41.8% | 35.3% | 26.7% | 62.4% | 80.1% | 48.8% | 36.1% | 68.9% | 83.2% |
| YOLOv5 + LREM | 42.3% | 40.3% | 28.2% | 62.4% | 79.3% | 49.6% | 37.1% | 69.6% | 83.2% |
| YOLOv5 + AFM | 43.8% | 40.3% | 29.2% | 63.4% | 82.9% | 51% | 38.9% | 69.9% | 85.9% |
| YOLO-Float | 44.2% | 42% | 30.7% | 62.8% | 82.6% | 51.4% | 39.8% | 69.4% | 85.9% |
| Method | | | | | | | |
|---|---|---|---|---|---|---|---|
| YOLOv5 | 41.8% | 82.9% | 35.3% | 26.7% | 80.1% | 36.1% | 83.2% |
| YOLOv5 + Upsample | 43% | 84% | 39.1% | 28.5% | 82.2% | 37.8% | 84.1% |
| YOLOv5 + AFM | 43.8% | 83.5% | 40.3% | 29.2% | 82.9% | 38.9% | 85.9% |
| ALGORITHM | | | | | | | Param | FPS |
|---|---|---|---|---|---|---|---|---|
| YOLOv5-X | 41.8% | 82.9% | 35.3% | 26.7% | 48.8% | 36.1% | 83.2 M | 38 |
| YOLOR-D6 | 41% | 77% | 39.1% | 28.6% | 48.4% | 34.9% | 151.7 M | 34 |
| YOLOX-X | 41.5% | 80.5% | 38.4% | 27.6% | 47.6% | 34.2% | 99.1 M | 58 |
| YOLOv7-E6E | 40.8% | 81.8% | 36.9% | 28% | 40% | 39% | 110.3 M | 36 |
| YOLO-Float(-X) | 44.2% | 83.3% | 42% | 30.7% | 51.4% | 39.8% | 91.8 M | 35 |
| Faster R-CNN [11] | 18.4% | | | | | | | 9.3 |
| Cascade R-CNN [11] | 43.4% | | | | | | | 3.9 |
| DSSD [11] | 27.5% | | | | | | | 28.6 |
| RetinaNet [11] | 24.9% | | | | | | | 7.6 |
| FPN [11] | 18.4% | | | | | | | 7.4 |
| ALGORITHM | | | | | | |
|---|---|---|---|---|---|---|
| YOLOv5-X | 34.8% | 82.9% | 22.9% | 20.8% | 41.9% | 31.4% |
| YOLOR-D6 | 26.3% | 51% | 22% | 9.1% | 31.1% | 13.5% |
| YOLOX-X | 35.3% | 81.5% | 26.7% | 21.1% | 39.3% | 26.6% |
| YOLOv7-E6E | 30.4% | 70.5% | 20.6% | 20.2% | 43.4% | 36.2% |
| YOLO-Float(-X) | 37.8% | 82.5% | 30.4% | 23.6% | 44.4% | 32.8% |
| ALGORITHM | Bottle | Bottle | Carton | Carton |
|---|---|---|---|---|
| YOLOv5-X | 32.9% | 77.8% | 36.3% | 87.1% |
| YOLOR-D6 | 26.9% | 51.4% | 29.4% | 57.2% |
| YOLOX-X | 32.8% | 75.9% | 37.7% | 86% |
| YOLOv7-E6E | 30.8% | 73.3% | 28.7% | 68.3% |
| YOLO-Float(-X) | 38.6% | 80.7% | 36.8% | 83.3% |