Towards Lightweight Neural Networks for Garbage Object Detection
Abstract
1. Introduction
- The proposed dilated–deformable convolution combines dilated convolution and deformable convolution to expand the local receptive field precisely without increasing the number of parameters or the amount of computation.
- Building on YOLOv4 to maintain accuracy, we optimized the network structure and significantly reduced the number of parameters and the amount of computation.
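The claim that dilation enlarges the receptive field at no extra parameter cost can be checked with a little arithmetic. The sketch below is illustrative only: it assumes a standard square 2D convolution, covers just the dilated half of the dilated–deformable convolution, and uses hypothetical helper names (`effective_kernel_size`, `conv_weight_count`) and channel counts that do not come from the paper.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv_weight_count(in_channels: int, out_channels: int,
                      kernel_size: int, bias: bool = True) -> int:
    """Learnable parameters of a square 2D convolution.

    Dilation does not appear here: it only spreads the same k x k
    weights over a wider area.
    """
    count = in_channels * out_channels * kernel_size * kernel_size
    return count + (out_channels if bias else 0)


# Hypothetical layer: 64 -> 64 channels, 3 x 3 kernel.
for dilation in (1, 2, 3):
    extent = effective_kernel_size(3, dilation)
    params = conv_weight_count(64, 64, 3)
    print(f"dilation={dilation}: covers {extent}x{extent}, {params} parameters")
```

The covered extent grows with the dilation rate while the parameter count stays fixed, which is the property the first contribution relies on.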
2. Related Work
3. Methodology
3.1. Data Set
3.2. Pre-Processing and Data Augmentation
3.3. Algorithm Design
3.3.1. YOLOv4
3.3.2. Focus
3.3.3. DCSPResNet
3.3.4. YOLOG
3.4. Performance Indices of the Object Detection Model
3.5. Training Strategies and Experimental Setup
4. Results
4.1. Results of the Ablation Experiment
4.2. Comparison of Recognition Performance
4.3. Performance on the Public Dataset
4.4. Object Detection Test
5. Conclusions
1. We improved the CSPResNet structure using dilated–deformable convolution to accurately expand the receptive field and extract features more effectively.
2. A lightweight garbage object detection network, named YOLOG, was designed to ensure real-time and accurate results. It simplifies the hardware requirements, reduces computational costs, and meets the needs of practical applications.
3. We presented comparative experiments with other advanced networks on both a garbage dataset and a public dataset to demonstrate the effectiveness of YOLOG.
4. YOLOG allows for the efficient detection and classification of all types of domestic garbage using edge devices (e.g., Jetson AGX Xavier).
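As a rough illustration of how dilation and deformation can act together on a kernel's sampling grid, the sketch below assumes the standard formulations: dilation scales the regular grid points, deformable convolution adds a per-point offset, and fractional positions are read by bilinear interpolation. How the paper combines the two is not shown in this excerpt, so this is not the authors' layer. The function names, the offsets, and the tiny feature map are hypothetical.

```python
import math
from itertools import product


def sampling_positions(center, kernel_size=3, dilation=1, offsets=None):
    """Positions sampled around `center`: p0 + dilation * pn + offset_n."""
    half = kernel_size // 2
    grid = list(product(range(-half, half + 1), repeat=2))
    if offsets is None:
        offsets = [(0.0, 0.0)] * len(grid)  # no deformation
    if len(offsets) != len(grid):
        raise ValueError("need one (dy, dx) offset per kernel point")
    cy, cx = center
    return [(cy + dilation * gy + dy, cx + dilation * gx + dx)
            for (gy, gx), (dy, dx) in zip(grid, offsets)]


def bilinear(feature, y, x):
    """Bilinear interpolation on a 2D list; points outside count as zero."""
    height, width = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy, xx in ((y0, x0), (y0, x0 + 1), (y0 + 1, x0), (y0 + 1, x0 + 1)):
        if 0 <= yy < height and 0 <= xx < width:
            weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
            value += weight * feature[yy][xx]
    return value


# Hypothetical example: 3 x 3 kernel, dilation 2, every point shifted by (0.5, -0.25).
shifted = sampling_positions((4, 4), 3, 2, offsets=[(0.5, -0.25)] * 9)
print(shifted[4])  # the kernel's centre point after the shift
```

In a trained deformable layer the offsets would be predicted from the input rather than fixed by hand as they are here.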
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Backbone | GFLOPs | Params (M) |
---|---|---|
CSPDarknet | 34.49 | 26.617 |
DCSPDarknet | 1.255 | 0.304 |
Epoch | Strategy | Method |
---|---|---|
1–500 | Optimizer | Adam (weight decay = 0.0005) |
1–500 | Loss | CIoU |
1–500 | Learning rate scheduler | CosineAnnealingLR |
1–500 | Data augmentation | Mosaic |
500–600 | Optimizer | Adam (weight decay = 0.0005) |
500–600 | Loss | CIoU |
500–600 | Learning rate scheduler | CosineAnnealingLR |
500–600 | Data augmentation | None |
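The training table names a cosine-annealing learning-rate schedule and the CIoU loss. The pure-Python sketch below writes out the standard closed-form expressions for both so the quantities are concrete. It is not the authors' training code: the learning rates, step counts, and box coordinates are hypothetical, and boxes are assumed to be in `(x1, y1, x2, y2)` corner format.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine decay from lr_max at step 0 to lr_min at step total_steps."""
    cos_term = 1.0 + math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term


def ciou_loss(box_a, box_b, eps=1e-9):
    """CIoU loss: 1 - IoU + centre-distance term + aspect-ratio term."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # Intersection over union.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    iou = inter / (wa * ha + wb * hb - inter + eps)

    # Squared distance between box centres.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Squared diagonal of the smallest box enclosing both.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (math.atan(wb / (hb + eps))
                                - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - iou + rho2 / c2 + alpha * v


# Hypothetical values, for illustration only.
print(cosine_annealing_lr(step=250, total_steps=500, lr_max=1e-3))
print(ciou_loss((0, 0, 2, 2), (1, 1, 3, 3)))
```

Unlike plain IoU loss, the centre-distance term still gives a useful signal when the predicted and ground-truth boxes do not overlap at all.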
Method | | GFLOPs | Params (M) |
---|---|---|---|
YOLOv4 baseline | 64.85 | 59.65 | 63.99 |
+ restructuring + SiLU | 62.95 (−1.9) | 5.55 (−54.1) | 5.86 (−58.13) |
+ dilated conv | 64.66 (+1.71) | 8.42 (+2.87) | 8.81 (+2.95) |
+ dilated–deformable conv | 66.7 (+2.04) | 6.05 (−2.37) | 6.17 (−2.64) |
+ FReLU | 61.50 (−5.2) | 8.45 (+2.4) | 8.81 (+2.64) |
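The parenthesized changes in the ablation table are easiest to read once it is clear what they are measured against. The sketch below recomputes them from the row values, and they come out as differences from the row directly above rather than from the YOLOv4 baseline. It assumes that the three value columns are, in order, the unlabeled first metric, GFLOPs, and Params (M); the helper name `stepwise_deltas` is hypothetical.

```python
# Row values copied from the ablation table.
# Assumed column order: unlabeled first metric, GFLOPs, Params (M).
ROWS = [
    ("YOLOv4 baseline", 64.85, 59.65, 63.99),
    ("+ restructuring + SiLU", 62.95, 5.55, 5.86),
    ("+ dilated conv", 64.66, 8.42, 8.81),
    ("+ dilated-deformable conv", 66.70, 6.05, 6.17),
    ("+ FReLU", 61.50, 8.45, 8.81),
]


def stepwise_deltas(rows):
    """Difference of each row from the row directly above it."""
    deltas = []
    for previous, current in zip(rows, rows[1:]):
        change = tuple(round(c - p, 2)
                       for p, c in zip(previous[1:], current[1:]))
        deltas.append((current[0],) + change)
    return deltas


for name, d_metric, d_gflops, d_params in stepwise_deltas(ROWS):
    print(f"{name}: {d_metric:+.2f}, {d_gflops:+.2f}, {d_params:+.2f}")
```

Read this way, the FReLU row is a change applied on top of the dilated–deformable configuration, not an alternative to it.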
Method | Backbone | Size | FPS | | | | GFLOPs | Params (M) |
---|---|---|---|---|---|---|---|---|
YOLOv3 | Darknet-53 | 416 | 46.58 | 64.53 | 94.22 | 73.41 | 65.42 | 61.58 |
YOLOv4 | CSPDarknet-53 | 416 | 41.45 | 64.85 | 93.88 | 73.11 | 59.65 | 63.99 |
YOLOv4-tiny | CSPDarknet-53 | 416 | 230.9 | 37.97 | 76.23 | 32.12 | 6.84 | 5.9 |
YOLOv5-S | Modified CSP v5 | 416 | 125.2 | 64.47 | 95.07 | 73.84 | 6.93 | 7.09 |
YOLOG (ours) | DCSPDarknet | 416 | 127.0 | 66.7 | 94.58 | 75.11 | 6.05 | 6.17 |
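To put the comparison in relative terms, the sketch below computes how much YOLOG reduces cost and raises speed against the YOLOv4 row. It rests on two assumptions about the table: that the last two values in each row are GFLOPs and Params (M), and that the YOLOG FPS entry reads 127.0. The helper name `reduction_percent` is hypothetical.

```python
def reduction_percent(baseline: float, value: float) -> float:
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


# Values copied from the comparison table (assumed column reading, see above).
yolov4 = {"fps": 41.45, "gflops": 59.65, "params_m": 63.99}
yolog = {"fps": 127.0, "gflops": 6.05, "params_m": 6.17}

gflops_cut = reduction_percent(yolov4["gflops"], yolog["gflops"])
params_cut = reduction_percent(yolov4["params_m"], yolog["params_m"])
speedup = yolog["fps"] / yolov4["fps"]

print(f"GFLOPs reduced by {gflops_cut:.1f}%")
print(f"Parameters reduced by {params_cut:.1f}%")
print(f"Frame rate higher by a factor of {speedup:.2f}")
```

Under these assumptions both cost figures drop by roughly 90% while the frame rate is about three times higher.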
Method | Size | | | GFLOPs |
---|---|---|---|---|
YOLOv4 | 416 | 60.07 | 88.26 | 59.71 |
YOLOv4-tiny | 416 | 42.43 | 79.07 | 3.43 |
YOLOv5-S | 416 | 53.10 | 79.30 | 6.98 |
YOLOG (ours) | 416 | 55.74 | 79.96 | 6.08 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Cai, X.; Shuang, F.; Sun, X.; Duan, Y.; Cheng, G. Towards Lightweight Neural Networks for Garbage Object Detection. Sensors 2022, 22, 7455. https://doi.org/10.3390/s22197455