Application of Task-Aligned Model Based on Defect Detection
Abstract
1. Introduction
- The TOOD algorithm [32] is applied to metal defect detection for the first time.
- On the NEU-DET metal defect dataset, the mAP increases from 75.4% to 77.9%, a gain of 2.5 percentage points, and it also exceeds the mAP of the existing models compared in Section 4.6.
2. Related Work
2.1. Class and Location Independent Problem
2.2. Complexity of Metal Defective Images
2.3. Optimizing the Selection of the Best Bounding Box
3. Methodology and Design
3.1. Baseline Convolution Architecture
- Comparison with plain CNNs: ResNet achieves a high mAP with comparatively few parameters, shortens training time, and reduces overfitting.
- Experimental findings: on the same hardware, ResNet-50/101 can be trained with a larger batch size than ResNeXt-101, and their mAP is also higher (76.1 and 76.2 versus 75.1).
3.2. FPN
3.3. DCNv2
3.4. Task-Aligned
3.4.1. T-head
3.4.2. TAL
3.5. Soft-NMS
4. Experiments and Results
4.1. Implementation Details
4.1.1. Experimental Platform
4.1.2. Experimental Parameters
4.2. Datasets
4.3. Evaluation Metrics
4.4. Experiment Results and Analysis
4.4.1. ResNet-50, ResNet-101, and ResNeXt-101
4.4.2. Anchor-Based and Anchor-Free
4.4.3. IoU Threshold Adjustment for NMS and Soft-NMS
4.4.4. Result
4.5. Ablation Study
4.6. Comparison with State-of-the-Art
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Song, K.; Yan, Y. A noise robust method based on completed local binary patterns for hot-rolled steel strip surface defects. Appl. Surf. Sci. 2013, 285, 858–864.
- Wang, L. Support Vector Machines: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 2005; pp. 159–179.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Smith, L.I. A Tutorial on Principal Components Analysis. 2002. Available online: https://www.semanticscholar.org/paper/A-tutorial-on-Principal-Components-Analysis-Smith/462bf829634e3ffaef794de5e58809994d30f8ec (accessed on 2 September 2023).
- Abeywickrama, T.; Aamir Cheema, M.; Taniar, D. k-Nearest Neighbors on Road Networks: A Journey in Experimentation and In-Memory Implementation. arXiv 2016, arXiv:1601.01549.
- Wang, W.; Mi, C.; Wu, Z.; Lu, K.; Long, H.; Pan, B.; Li, D.; Zhang, J.; Chen, P.; Wang, B. A Real-Time Steel Surface Defect Detection Approach With High Accuracy. IEEE Trans. Instrum. Meas. 2022, 71, 1357.
- He, Y.; Song, K.; Meng, Q.; Yan, Y. An End-to-End Steel Surface Defect Detection Approach via Fusing Multiple Hierarchical Features. IEEE Trans. Instrum. Meas. 2020, 69, 1493–1504.
- Cheng, X.; Yu, J. RetinaNet With Difference Channel Attention and Adaptively Spatial Feature Fusion for Steel Surface Defect Detection. IEEE Trans. Instrum. Meas. 2021, 70, 2503911.
- Yu, J.; Cheng, X.; Li, Q. Surface Defect Detection of Steel Strips Based on Anchor-Free Network With Channel Attention and Bidirectional Feature Fusion. IEEE Trans. Instrum. Meas. 2022, 71, 5000710.
- Li, Z.; Tian, X.; Liu, X.; Liu, Y.; Shi, X. A Two-Stage Industrial Defect Detection Framework Based on Improved-YOLOv5 and Optimized-Inception-ResnetV2 Models. Appl. Sci. 2022, 12, 834.
- Lv, X.; Duan, F.; Jiang, J.J.; Fu, X.; Gan, L. Deep Metallic Surface Defect Detection: The New Benchmark and Detection Network. Sensors 2020, 20, 1562.
- Wang, J.; Xu, P.; Li, L.; Zhang, F. DAssd-Net: A Lightweight Steel Surface Defect Detection Model Based on Multi-Branch Dilated Convolution Aggregation and Multi-Domain Perception Detection Head. Sensors 2023, 23, 5488.
- Niu, M.; Wang, Y.; Song, K.; Wang, Q.; Zhao, Y.; Yan, Y. An Adaptive Pyramid Graph and Variation Residual-Based Anomaly Detection Network for Rail Surface Defects. IEEE Trans. Instrum. Meas. 2021, 70, 5020013.
- Yang, H.; Wang, Y.; Hu, J.; He, J.; Yao, Z.; Bi, Q. Deep Learning and Machine Vision-Based Inspection of Rail Surface Defects. IEEE Trans. Instrum. Meas. 2022, 71, 5005714.
- Jin, X.; Wang, Y.; Zhang, H.; Zhong, H.; Liu, L.; Wu, Q.M.J.; Yang, Y. DM-RIS: Deep Multimodel Rail Inspection System With Improved MRF-GMM and CNN. IEEE Trans. Instrum. Meas. 2020, 69, 1051–1065.
- Nieniewski, M. Morphological Detection and Extraction of Rail Surface Defects. IEEE Trans. Instrum. Meas. 2020, 69, 6870–6879.
- Su, B.; Chen, H.; Liu, K.; Liu, W. RCAG-Net: Residual Channelwise Attention Gate Network for Hot Spot Defect Detection of Photovoltaic Farms. IEEE Trans. Instrum. Meas. 2021, 70, 3510514.
- Tao, X.; Zhang, D.; Hou, W.; Ma, W.; Xu, D. Industrial Weak Scratches Inspection Based on Multifeature Fusion Network. IEEE Trans. Instrum. Meas. 2021, 70, 5000514.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Commun. ACM 2017, 60, 84–90.
- Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks. arXiv 2013, arXiv:1312.6229.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.-Y.; Berg, A.C. SSD: Single Shot MultiBox Detector. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; pp. 21–37.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Girshick, R. Fast R-CNN. In Proceedings of the 2015 IEEE International Conference on Computer Vision (ICCV), Washington, DC, USA, 7–13 December 2015; pp. 1440–1448.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. arXiv 2015, arXiv:1506.01497.
- Cai, Z.; Vasconcelos, N. Cascade R-CNN: Delving into High Quality Object Detection. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6154–6162.
- Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
- Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2999–3007.
- Duan, K.; Bai, S.; Xie, L.; Qi, H.; Huang, Q.; Tian, Q. CenterNet: Keypoint Triplets for Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6568–6577.
- Tian, Z.; Shen, C.; Chen, H.; He, T. FCOS: Fully Convolutional One-Stage Object Detection. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9626–9635.
- Zhang, S.; Chi, C.; Yao, Y.; Lei, Z.; Li, S.Z. Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 9756–9765.
- Feng, C.; Zhong, Y.; Gao, Y.; Scott, M.R.; Huang, W. TOOD: Task-aligned One-stage Object Detection. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, BC, Canada, 11–17 October 2021; pp. 3490–3499.
- Bodla, N.; Singh, B.; Chellappa, R.; Davis, L.S. Soft-NMS—Improving Object Detection with One Line of Code. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5562–5570.
- Zhu, X.; Hu, H.; Lin, S.; Dai, J. Deformable ConvNets V2: More Deformable, Better Results. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 9300–9308.
- Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable Convolutional Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 764–773.
- Hosang, J.; Benenson, R.; Schiele, B. Learning Non-maximum Suppression. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 6469–6477.
- Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision—ECCV 2014, Zurich, Switzerland, 6–12 September 2014; pp. 740–755.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Xie, S.; Girshick, R.; Dollár, P.; Tu, Z.; He, K. Aggregated Residual Transformations for Deep Neural Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5987–5995.
- Rezatofighi, H.; Tsoi, N.; Gwak, J.; Sadeghian, A.; Reid, I.; Savarese, S. Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 658–666.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. In Proceedings of the Computer Vision—ECCV 2016, Amsterdam, The Netherlands, 11–14 October 2016; pp. 630–645.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A. Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. In Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017; Volume 31.
- Zhao, Q.; Sheng, T.; Wang, Y.; Tang, Z.; Chen, Y.; Cai, L.; Ling, H. M2Det: A Single-Shot Object Detector based on Multi-Level Feature Pyramid Network. arXiv 2018, arXiv:1811.04533.
- Redmon, J.; Farhadi, A. YOLOv3: An Incremental Improvement. arXiv 2018, arXiv:1804.02767.
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv 2022, arXiv:2207.02696.
- Tan, M.; Pang, R.; Le, Q.V. EfficientDet: Scalable and Efficient Object Detection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10778–10787.
- Zhu, C.; Chen, F.; Shen, Z.; Savvides, M. Soft Anchor-Point Object Detection. arXiv 2019, arXiv:1911.12448.

Backbone | mAP |
---|---|
ResNet-50 | 76.1 |
ResNet-101 | 76.2 |
ResNeXt-101 (64×4d) | 75.1 |

Backbone | Anchor | mAP |
---|---|---|
ResNet-50 | Anchor-based | 75.4 |
ResNet-50 | Anchor-free | 76.1 |

Backbone | NMS method | IoU threshold | mAP |
---|---|---|---|
ResNet-101 | NMS | 0.55 | 77.2 |
ResNet-101 | NMS | 0.60 | 77.2 |
ResNet-101 | Soft-NMS | 0.40 | 76.7 |
ResNet-101 | Soft-NMS | 0.45 | 77.9 |
ResNet-101 | Soft-NMS | 0.50 | 77.8 |
ResNet-101 | Soft-NMS | 0.55 | 77.8 |
ResNet-101 | Soft-NMS | 0.60 | 77.1 |
ResNet-101 | Soft-NMS | 0.65 | 76.4 |
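
The table above compares hard NMS with Soft-NMS at several IoU thresholds. To illustrate how the two differ, the following is a minimal pure-Python sketch of the linear variant of Soft-NMS as described by Bodla et al. It is not the implementation used in the experiments: the choice of the linear decay, the `min_score` cut-off of 0.05, and the names `iou` and `soft_nms_linear` are assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms_linear(
    boxes: List[Box],
    scores: List[float],
    iou_thr: float = 0.45,    # threshold under study in the table
    min_score: float = 0.05,  # assumed cut-off, not taken from the paper
) -> List[Tuple[int, float]]:
    """Return (box index, rescored confidence) pairs in selection order."""
    remaining = [(s, i) for i, s in enumerate(scores)]
    kept: List[Tuple[int, float]] = []
    while remaining:
        # Select the box with the highest current score.
        best = max(remaining)
        remaining.remove(best)
        best_score, best_idx = best
        kept.append((best_idx, best_score))

        rescored = []
        for s, i in remaining:
            overlap = iou(boxes[best_idx], boxes[i])
            # Hard NMS would discard the box here; linear Soft-NMS
            # decays its score in proportion to the overlap instead.
            if overlap > iou_thr:
                s *= 1.0 - overlap
            if s >= min_score:
                rescored.append((s, i))
        remaining = rescored
    return kept


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    for idx, score in soft_nms_linear(demo_boxes, demo_scores):
        print(idx, round(score, 3))
```

In the demo, the second box overlaps the first by more than the threshold, so hard NMS would remove it, whereas this sketch keeps it with a reduced score. This is why the two methods can react differently to the same IoU threshold.
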

Backbone | Anchor | Type | IoU | mAP |
---|---|---|---|---|
ResNet-101 | Anchor-free | DCNv2 (C3-C5) | Soft-NMS 0.45 | 77.9 |

DCNv2 (C3-C5) | DCNv2 (C4-C5) | Backbone | mAP | Δ mAP vs. baseline |
---|---|---|---|---|
✗ | ✗ | ResNet-101 | 76.2 | - |
✓ | ✗ | ResNet-101 | 77.2 | +1.0 |
✗ | ✓ | ResNet-101 | 76.1 | −0.1 |

DCNv2 (C4-C5) | Backbone | mAP | Δ mAP vs. baseline |
---|---|---|---|
✗ | ResNet-101 | 76.2 | - |
✓ | ResNet-101 | 76.1 | −0.1 |
✗ | ResNeXt-101 (64×4d) | 75.1 | −1.1 |
✓ | ResNeXt-101 (64×4d) | 76.7 | +0.5 |
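
The last column of the two ablation tables above appears to report the change in mAP relative to the ResNet-101 model without DCNv2 (76.2). The short check below recomputes those differences from the listed mAP values. The reading of the column and the dictionary labels are assumptions made for this example.

```python
# mAP values copied from the two ablation tables above.
BASELINE_MAP = 76.2  # ResNet-101 without DCNv2 (assumed reference row)

ablation_map = {
    "ResNet-101 + DCNv2 (C3-C5)": 77.2,
    "ResNet-101 + DCNv2 (C4-C5)": 76.1,
    "ResNeXt-101 (64x4d)": 75.1,
    "ResNeXt-101 (64x4d) + DCNv2 (C4-C5)": 76.7,
}


def delta_vs_baseline(map_value: float, baseline: float = BASELINE_MAP) -> float:
    """Difference in mAP (percentage points), rounded to one decimal."""
    return round(map_value - baseline, 1)


deltas = {name: delta_vs_baseline(m) for name, m in ablation_map.items()}

if __name__ == "__main__":
    for name, d in deltas.items():
        print(f"{name}: {d:+.1f}")
```

The recomputed differences match the tabulated ones, so even the ResNeXt-101 rows are measured against the ResNet-101 baseline rather than against ResNeXt-101 without DCNv2.
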

Model | mAP | Patches AP | Scratches AP | Pitted_surface AP | Crazing AP | Inclusion AP | Rolled-in_scale AP |
---|---|---|---|---|---|---|---|
M2Det-320 [43] | 61.1 | 81.7 | 70.8 | 72.0 | 28.5 | 64.9 | 48.4 |
ATSS [6] | 67.8 | 85.8 | 81.8 | 75.7 | 33.0 | 70.8 | 59.6 |
YOLOv3 [44] | 69.4 | 71.4 | 73.7 | 68.3 | 68.4 | 61.9 | 72.3 |
EfficientDet [45] | 70.1 | 83.5 | 73.1 | 85.5 | 45.9 | 62.0 | 72.7 |
Improved YOLOv3 [6] | 70.7 | 84.9 | 92.1 | 87.8 | 24.8 | 72.1 | 62.3 |
FCOS [6] | 71.3 | 86.5 | 78.2 | 79.8 | 44.1 | 76.1 | 63.3 |
SAPD [46] | 73.2 | 93.9 | 97.8 | 87.4 | 44.6 | 73.3 | 42.9 |
Cascade [6] | 73.3 | 88.4 | 88.2 | 81.3 | 38.3 | 76.0 | 67.8 |
YOLOv4 [47] | 74.6 | 92.5 | 77.9 | 83.6 | 64.9 | 74.2 | 54.3 |
SSD300 [6] | 74.8 | 90.6 | 84.1 | 83.8 | 46.9 | 75.9 | 67.3 |
Improved Faster R-CNN [8] | 74.8 | 89.1 | 83.4 | 82.1 | 52.2 | 77.3 | 64.7 |
YOLOv5-s [12] | 75.0 | 87.0 | 92.0 | 90.0 | 30.0 | 79.0 | 72.0 |
RetinaNet [9] | 75.3 | 93.3 | 73.5 | 91.4 | 53.0 | 78.7 | 62.0 |
YOLOv5-v6.1-s [12] | 75.5 | 85.0 | 92.0 | 91.0 | 38.0 | 82.0 | 65.0 |
YOLOv7-tiny [12] | 75.6 | 86.0 | 95.0 | 86.0 | 42.0 | 77.0 | 69.0 |
CABF-FCOS [9] | 76.7 | 93.5 | 84.4 | 88.9 | 55.4 | 75.0 | 62.9 |
CenterNet [6] | 77.1 | 91.4 | 94.2 | 87.4 | 44.2 | 82.7 | 62.9 |
YOLOv8-s [12] | 77.3 | 91.0 | 93.0 | 92.0 | 43.0 | 84.0 | 60.0 |
TAMD (Ours) | 77.9 | 92.0 | 92.4 | 82.6 | 56.8 | 82.8 | 60.5 |
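
The mAP column is conventionally the unweighted mean of the per-class AP values. As a quick consistency check under that assumption, the sketch below recomputes the mAP of the TAMD row from its six class APs. The dictionary keys and the helper name `mean_ap` are chosen for this example only.

```python
# Per-class AP values for the TAMD row, copied from the comparison table.
tamd_ap = {
    "patches": 92.0,
    "scratches": 92.4,
    "pitted_surface": 82.6,
    "crazing": 56.8,
    "inclusion": 82.8,
    "rolled-in_scale": 60.5,
}

REPORTED_MAP = 77.9  # value listed in the mAP column for TAMD


def mean_ap(per_class_ap: dict) -> float:
    """Unweighted mean of per-class average precision values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


computed_map = mean_ap(tamd_ap)

# The table reports one decimal place, so agreement is checked
# to within rounding rather than exactly.
agrees_with_table = abs(computed_map - REPORTED_MAP) < 0.06

if __name__ == "__main__":
    print(f"computed mAP: {computed_map:.2f}, reported: {REPORTED_MAP}")
    print("consistent with table:", agrees_with_table)
```

The per-class breakdown also shows where the model stands out: its crazing AP (56.8) is above that of most compared detectors, while its pitted_surface and rolled-in_scale APs are closer to the middle of the field.
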
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Hung, M.-H.; Ku, C.-H.; Chen, K.-Y. Application of Task-Aligned Model Based on Defect Detection. Automation 2023, 4, 327-344. https://doi.org/10.3390/automation4040019