GM-DETR: Research on a Defect Detection Method Based on Improved DETR
Abstract
1. Introduction
- We combine the global attention mechanism (GAM) with the transformer's self-attention. Integrating global and local information lets the network separate defects from the background more clearly. Recalibrating local features reduces information diffusion and amplifies interactions between global features, which improves the network's defect detection performance.
- To keep an excessive parameter count from driving up computational cost, we filter out unnecessary parameters with a layer pruning strategy that streamlines the decoding layers and reduces the number of model parameters.
- The original loss function is insufficiently sensitive to small differences between targets, so we replace it with the mean squared error (MSE) loss. Because of its simplicity, the MSE loss is cheaper to compute and more sensitive to small differences in targets, allowing better differentiation. This change improves model accuracy and speeds up convergence.
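The loss change in the last contribution can be illustrated with a small, dependency-free sketch. It compares a mean absolute (L1) error, which the original DETR uses as one term of its box loss, with the MSE on a single predicted box. The box format (normalized center x, center y, width, height), the example numbers, and the function names are assumptions made for illustration; the text here does not state which loss term was replaced.

```python
from typing import List, Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error over the box coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mse_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over the box coordinates."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mse_grad(pred: Sequence[float], target: Sequence[float]) -> List[float]:
    """Gradient of the MSE with respect to each predicted coordinate."""
    n = len(pred)
    return [2.0 * (p - t) / n for p, t in zip(pred, target)]


# Hypothetical boxes in normalized (cx, cy, w, h) form, not taken from the paper.
pred_box = [0.50, 0.50, 0.20, 0.30]
target_box = [0.52, 0.49, 0.20, 0.34]

print(f"L1  loss: {l1_loss(pred_box, target_box):.6f}")
print(f"MSE loss: {mse_loss(pred_box, target_box):.6f}")
print("MSE grad:", [round(g, 4) for g in mse_grad(pred_box, target_box)])
```

The MSE gradient scales with the size of each coordinate error, whereas the L1 gradient has constant magnitude; whether that trade-off helps on a given dataset is an empirical question.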
2. Related Work
3. Methods
3.1. The DETR Architecture
3.1.1. Self-Attention
3.1.2. Multi-Head Attention
3.2. Improving the DETR Algorithm
3.2.1. Attention Optimization
3.2.2. Optimization of Transformer Structure
3.2.3. Optimization of the Loss Function
4. Experimental Results and Analysis
4.1. Datasets and Evaluation Indicators
4.2. Experimental Results and Analysis
4.2.1. Ablation Experiment
4.2.2. Comparison of Different Attention Mechanisms
4.2.3. Performance Comparison of Different Models
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Environment Parameter | Value |
---|---|
Operating system | Windows 11 |
Deep learning framework | PyTorch 1.11.0 |
Programming language | Python 3.10 |
CPU | Intel(R) Core(TM) i5-11400F |
GPU | GTX 1660 Ti |
RAM | 16 GB |

Hyperparameter | Value |
---|---|
Learning Rate | 0.01 |
Image Size | 720 × 720 |
Momentum | 0.9 |
Optimizer Type | AdamW |
Weight Decay | 0.001 |
Batch Size | 8 |
Epoch | 300 |
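
The hyperparameters above can be collected in one configuration object, sketched below using only the Python standard library. The class and function names are hypothetical. The keyword names `lr`, `betas`, and `weight_decay` follow the `torch.optim.AdamW` constructor; treating the table's momentum as the first beta and using 0.999 (the PyTorch default) for the second beta are assumptions, since the table does not say how momentum is applied with AdamW.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class TrainConfig:
    """Training hyperparameters as listed in the table."""
    learning_rate: float = 0.01
    image_size: int = 720          # inputs are 720 x 720
    momentum: float = 0.9
    optimizer: str = "AdamW"
    weight_decay: float = 0.001
    batch_size: int = 8
    epochs: int = 300


def adamw_kwargs(cfg: TrainConfig) -> Dict[str, Any]:
    """Build keyword arguments in the shape torch.optim.AdamW accepts.

    Assumption: "Momentum" is used as beta1; beta2 = 0.999 is the
    PyTorch default and does not come from the table.
    """
    return {
        "lr": cfg.learning_rate,
        "betas": (cfg.momentum, 0.999),
        "weight_decay": cfg.weight_decay,
    }


cfg = TrainConfig()
print(adamw_kwargs(cfg))
```

With PyTorch installed, the dictionary could be passed as `torch.optim.AdamW(model.parameters(), **adamw_kwargs(cfg))`, where `model` stands for whatever detector is being trained.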

GAM | Decoder | Loss | mAP@0.5 (%) | Params (M) | FPS |
---|---|---|---|---|---|
| | | | 76.05 | 41.00 | 17.10 |
√ | | | 76.95 | 43.76 | 15.20 |
| | √ | | 78.83 | 35.52 | 20.78 |
| | | √ | 79.32 | 41.00 | 17.60 |
√ | √ | 77.96 | 42.34 | 15.67 | |
√ | √ | √ | 79.77 | 35.71 | 20.40 |
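
The net effect of the three changes can be read off the first and last rows of the ablation table. The sketch below does that arithmetic; the dictionary keys and the helper name are illustrative, while the numbers are copied from the table.

```python
def relative_change(new: float, old: float) -> float:
    """Percentage change of `new` relative to `old`."""
    return (new - old) / old * 100.0


# First row (no modification) and last row (all three modifications).
baseline = {"map": 76.05, "params_m": 41.00, "fps": 17.10}
gm_detr = {"map": 79.77, "params_m": 35.71, "fps": 20.40}

map_gain = gm_detr["map"] - baseline["map"]
param_change = relative_change(gm_detr["params_m"], baseline["params_m"])
fps_change = relative_change(gm_detr["fps"], baseline["fps"])

print(f"mAP gain:         {map_gain:+.2f} points")
print(f"Parameter change: {param_change:+.2f} %")
print(f"FPS change:       {fps_change:+.2f} %")
```

Relative to the baseline, the full model gains about 3.7 mAP points with roughly 12.9% fewer parameters and roughly 19.3% higher FPS.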

Models | mAP@0.5 (%) | Params (M) | FPS |
---|---|---|---|
DETR | 76.05 | 41.00 | 17.10 |
DETR + CA | 76.04 | 41.32 | 17.02 |
DETR + SE | 76.56 | 42.20 | 15.90 |
DETR + DWR | 75.35 | 41.50 | 16.73 |
DETR + ECA | 75.92 | 41.78 | 17.56 |
DETR + GAM | 76.95 | 43.76 | 15.20 |

Models | mAP@0.5 (%) | Params (M) | Precision (%) | Recall (%) | F1 |
---|---|---|---|---|---|
SSD | 72.50 | 26.28 | 88.03 | 55.98 | 0.68 |
FCOS | 77.97 | 36.76 | 77.91 | 72.83 | 0.75 |
RetinaNet | 70.47 | 37.76 | 90.65 | 52.72 | 0.67 |
Faster R-CNN | 75.62 | 137.09 | 35.07 | 80.43 | 0.49 |
YOLOv4 | 77.70 | 63.50 | 84.00 | 74.00 | 0.78 |
YOLOv5 | 75.32 | 47.92 | 80.60 | 58.70 | 0.68 |
DETR | 76.05 | 41.00 | 58.72 | 80.43 | 0.68 |
GM-DETR | 79.77 | 35.71 | 61.54 | 82.61 | 0.71 |
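
The F1 column follows from the precision and recall columns as their harmonic mean. The short sketch below recomputes it for three rows; the function name is illustrative and the inputs are the percentages from the table.

```python
def f1_score(precision_pct: float, recall_pct: float) -> float:
    """Harmonic mean of precision and recall, both given in percent."""
    p = precision_pct / 100.0
    r = recall_pct / 100.0
    if p + r == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * p * r / (p + r)


# (precision %, recall %) copied from the comparison table.
rows = {
    "Faster R-CNN": (35.07, 80.43),
    "DETR": (58.72, 80.43),
    "GM-DETR": (61.54, 82.61),
}

for name, (precision, recall) in rows.items():
    print(f"{name}: F1 = {f1_score(precision, recall):.2f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a model with high recall but low precision, such as Faster R-CNN here, ends up with a low F1.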
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Liu, X.; Yang, X.; Shao, L.; Wang, X.; Gao, Q.; Shi, H. GM-DETR: Research on a Defect Detection Method Based on Improved DETR. Sensors 2024, 24, 3610. https://doi.org/10.3390/s24113610