Vision Transformer-Based Automatic Crack Detection on Dam Surface
Abstract
1. Introduction
- This work is the first attempt to detect dam surface cracks with a pure ViT-based encoder–decoder network (DCST-net); the approach achieves better crack segmentation performance than state-of-the-art models on a dam crack dataset collected from a real dam surface and on two open benchmark crack datasets;
- To establish long-range pixel interaction, we propose an improved SwinT-block as the fundamental unit of DCST-net; the block extracts contextual information across feature channels using depth-wise separable convolution (DWConv) kernels, and it integrates spatial contextual information through an efficient position-encoding scheme, thereby capturing a wide receptive field;
- To alleviate the loss of semantic details, we introduce a weighted attention module; it uses encoder features to produce an attention mask whose values serve as attention coefficients; these coefficients are multiplied element-wise with the corresponding decoder features, suppressing non-crack features while enhancing crack features;
- To facilitate the training of deep networks, we propose a multi-level label supervision training method, which directly supervises feature layers at different depths with crack labels; in addition, we design a hybrid loss function to address the class imbalance in crack images.
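
The element-wise gating described in the weighted attention contribution above can be illustrated with a minimal, dependency-free sketch. It is not the DCST-net implementation: the per-channel projection `weights`, the `bias`, the sigmoid squashing, and the function name `weighted_attention` are all assumptions made for illustration, since this excerpt does not specify how the mask is computed.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def weighted_attention(encoder_feat, decoder_feat, weights, bias=0.0):
    """Gate decoder features with a mask derived from encoder features.

    encoder_feat, decoder_feat: nested lists shaped [channels][height][width].
    weights: one weight per encoder channel (a hypothetical 1x1 projection).
    Returns (mask, gated), where mask is [height][width] with values in (0, 1).
    """
    channels = len(encoder_feat)
    height = len(encoder_feat[0])
    width = len(encoder_feat[0][0])

    # Attention mask: one coefficient per spatial location (assumed form).
    mask = [
        [
            sigmoid(sum(weights[c] * encoder_feat[c][i][j] for c in range(channels)) + bias)
            for j in range(width)
        ]
        for i in range(height)
    ]

    # Element-wise multiplication of every decoder channel with the mask.
    gated = [
        [[channel[i][j] * mask[i][j] for j in range(width)] for i in range(height)]
        for channel in decoder_feat
    ]
    return mask, gated


# Toy example: one encoder channel, two decoder channels, a 1 x 2 feature map.
encoder = [[[10.0, -10.0]]]
decoder = [[[2.0, 2.0]], [[-1.0, 3.0]]]
mask, gated = weighted_attention(encoder, decoder, weights=[1.0])
```

In this toy input the first location receives a coefficient close to 1 and the second a coefficient close to 0, so decoder features at the second location are suppressed regardless of channel.
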
2. Methodology
2.1. Architecture of DCST-Net
2.2. The Improved Swin Transformer Block
2.3. Swin Transformer-Based Encoder
2.4. Swin Transformer-Based Decoder
2.5. Weighted Attention Block
2.6. Loss Function
2.7. Multiple-Level Label Supervision
3. Experiment Preparation
3.1. Dataset Description
3.2. Experiment Settings
3.3. Evaluation Metrics
4. Experimental Results and Discussion
4.1. Analysis of Training Results
4.2. Ablation Study
4.3. Comparative Study
4.4. Generalization Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Kang, F.; Li, J.; Zhao, S.; Wang, Y. Structural health monitoring of concrete dams using long-term air temperature for thermal effect simulation. Eng. Struct. 2019, 180, 642–653.
2. Zhang, G.; Liu, Y.; Zheng, C.; Feng, F. Simulation of influence of multi-defects on long-term working performance of high arch dam. Sci. China Technol. Sci. 2011, 54, 1–8.
3. Ye, X.W.; Jin, T.; Li, Z.X.; Ma, S.Y.; Ding, Y.; Ou, Y.H. Structural crack detection from benchmark data sets using pruned fully convolutional networks. J. Struct. Eng. 2021, 147, 04721008.
4. Li, Y.; Bao, T.; Shu, X.; Gao, Z.; Gong, J.; Zhang, K. Data-driven crack behavior anomaly identification method for concrete dams in long-term service using offline and online change point detection. J. Civ. Struct. Health Monit. 2021, 11, 1449–1460.
5. Hamishebahar, Y.; Guan, H.; So, S.; Jo, J. A comprehensive review of deep learning-based crack detection approaches. Appl. Sci. 2022, 12, 1374.
6. Graham, W. A Procedure for Estimating Loss of Life Caused by Dam Failure; Bureau of Reclamation, Dam Safety Office: Denver, CO, USA, 1999; p. 10.
7. Rich, T.P. Lessons in social responsibility from the Austin dam failure. Int. J. Eng. Educ. 2006, 22, 1287–1296.
8. Chen, B.; Zhang, H.; Wang, G.; Huo, J.; Li, Y.; Li, L. Automatic concrete infrastructure crack semantic segmentation using deep learning. Autom. Constr. 2023, 152, 104950.
9. Shi, P.; Shao, S.; Fan, X.; Zhou, Z.; Xin, Y. MCL-CrackNet: A Concrete Crack Segmentation Network Using Multi-level Contrastive Learning. IEEE T. Instrum. Meas. 2023, 72, 5030415.
10. Bhowmick, S.; Nagarajaiah, S.; Veeraraghavan, A. Vision and deep learning-based algorithms to detect and quantify cracks on concrete surfaces from UAV videos. Sensors 2020, 20, 6299.
11. Shi, P.; Fan, X.; Ni, J.; Wang, G. A detection and classification approach for underwater dam cracks. Struct. Health Monit. 2016, 15, 541–554.
12. Fan, X.N.; Wu, J.J.; Shi, P.F.; Zhang, X.W.; Xie, Y.J. A Novel Automatic Dam Crack Detection Algorithm Based on Local-Global Clustering. Multimed. Tools Appl. 2018, 77, 26581–26599.
13. Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018, 57, 787–798.
14. Cao, W.; Liu, Q.; He, Z. Review of Pavement Defect Detection Methods. IEEE Access 2020, 8, 14531–14544.
15. Li, B.; Wang, K.; Zhang, A.; Yang, E.; Wang, G. Automatic classification of pavement crack using deep convolutional neural network. Int. J. Pavement. Eng. 2020, 21, 457–463.
16. Zhang, J.; Bao, T. An improved resnet-based algorithm for crack detection of concrete dams using dynamic knowledge distillation. Water 2023, 15, 2839.
17. Li, Y.T.; Bao, T.F.; Xu, B.; Shu, X.S.; Zhou, Y.H.; Du, Y.; Wang, R.J.; Zhang, K. A deep residual neural network framework with transfer learning for concrete dams patch-level crack classification and weakly-supervised localization. Measurement 2022, 188, 110641.
18. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1137–1149.
19. Deng, J.; Lu, Y.; Lee, V.C.S. Concrete crack detection with handwriting script interferences using faster region-based convolutional neural network. Comp. Aided Civ. Infrastruct. Eng. 2020, 35, 373–388.
20. Ciaparrone, G.; Serra, A.; Covito, V.; Finelli, P.; Scarpato, C.A.; Tagliaferri, R. A deep learning approach for road damage classification. In Proceedings of Advanced Multimedia and Ubiquitous Engineering; Springer: Singapore, 2018; pp. 655–661.
21. Xu, G.; Han, X.; Zhang, Y.; Wu, C. Dam crack image detection model on feature enhancement and attention mechanism. Water 2022, 15, 64.
22. Ben, H.; Fei, K.; Yu, T. A real-time detection method for concrete dam cracks based on an object detection algorithm. J. Tsinghua Univ. 2023, 63, 1078–1086.
23. Li, Y.; Bao, T. A real-time multi-defect automatic identification framework for concrete dams via improved YOLOv5 and knowledge distillation. J. Civ. Struct. Health Monit. 2023, 13, 1333–1349.
24. Zhang, J.; Zhang, J. An improved nondestructive semantic segmentation method for concrete dam surface crack images with high resolution. Math. Probl. Eng. 2020, 2020, 5054740.
25. Pang, J.; Zhang, H.; Feng, C.; Li, L. Research on crack segmentation method of hydro-junction project based on target detection network. KSCE J. Civ. Eng. 2020, 24, 2731–2741.
26. Feng, C.; Zhang, H.; Wang, H.; Wang, S.; Li, Y. Automatic pixel-level crack detection on dam surface using deep convolutional network. Sensors 2020, 20, 2069.
27. Chen, B.; Zhang, H.; Li, Y.; Wang, S.; Zhou, H.; Lin, H. Quantify pixel-level detection of dam surface crack using deep learning. Meas. Sci. Technol. 2022, 33, 065402.
28. Kang, D.; Cha, Y. Efficient attention-based deep encoder and decoder for automatic crack segmentation. Struct. Health Monit. 2022, 21, 2190–2205.
29. Lv, Z.; Tian, J.; Zhu, Y.; Li, Y. Automatic crack detection of dam concrete structures based on deep learning. Comput. Concr. 2023, 32, 615.
30. Li, J.; Lu, X.; Zhang, P.; Li, Q. Intelligent Detection Method for Concrete Dam Surface Cracks Based on Two-Stage Transfer Learning. Water 2023, 15, 2082.
31. Wu, Z.; Tang, Y.; Hong, B.; Liang, B.; Liu, Y. Enhanced precision in dam crack width measurement: Leveraging advanced lightweight network identification for pixel-level accuracy. Int. J. Intell. Syst. 2023, 2023, 9940881.
32. Zhu, Y.; Tang, H. Automatic damage detection and diagnosis for hydraulic structures using drones and artificial intelligence techniques. Remote Sens. 2023, 15, 615.
33. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Houlsby, N. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
34. Paul, S.; Chen, P.Y. Vision Transformers Are Robust Learners. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 2071–2081.
35. Zhao, S.; Kang, F.; Li, J. Intelligent segmentation method for blurred cracks and 3D mapping of width nephograms in concrete dams using UAV photogrammetry. Autom. Constr. 2024, 157, 105145.
36. Liu, H.; Miao, X.; Mertz, C.; Xu, C.; Kong, H. Crackformer: Transformer network for fine-grained crack detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 3783–3792.
37. Shamsabadi, E.A.; Xu, C.; Rao, A.S.; Nguyen, T.; Ngo, T.; Dias-da-Costa, D. Vision transformer-based autonomous crack detection on asphalt and concrete surfaces. Autom. Constr. 2022, 104316.
38. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022.
39. Huang, H.; Hao, X.; Pei, L.; Ding, J.; Hu, Y.; Li, W. Automated detection of through-cracks in pavement using three-instantaneous attributes fusion and Swin Transformer network. Autom. Constr. 2024, 158, 105179.
40. Sun, Z.; Zhai, J.; Pei, L.; Li, W.; Zhao, K. Automatic Pavement Crack Detection Transformer Based on Convolutional and Sequential Feature Fusion. Sensors 2023, 23, 3772.
41. Luo, H.; Li, J.; Cai, L.; Wu, M. STrans-YOLOX: Fusing swin transformer and YOLOX for automatic pavement crack detection. Appl. Sci. 2023, 13, 1999.
42. Guo, F.; Qian, Y.; Liu, J.; Yu, H. Pavement crack detection based on transformer network. Autom. Constr. 2023, 145, 104646.
43. Guo, F.; Liu, J.; Lv, C.; Yu, H. A novel transformer-based network with attention mechanism for automatic pavement crack detection. Constr. Build. Mater. 2023, 391, 131852.
44. Zhang, E.; Shao, L.; Wang, Y. Unifying transformer and convolution for dam crack detection. Autom. Constr. 2023, 147, 104712.
45. Cao, H.; Wang, Y.; Chen, J.; Jiang, D.; Zhang, X.; Tian, Q.; Wang, M. Swin-unet: Unet-like pure transformer for medical image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 205–218.
46. Ozan, O.; Jo, S.; Loic, L.F.; Matthew, L.; Mattias, H.; Kazunari, M.; Kensaku, M.; Steven, M.; Nils, Y.H.; Bernhard, K.; et al. Attention u-net: Learning where to look for the pancreas. arXiv 2018, arXiv:1804.03999.
47. Zhao, F.; Chao, Y.; Li, L. A Crack Segmentation Model Combining Morphological Network and Multiple Loss Mechanism. Sensors 2023, 23, 1127.
48. Liu, Y.; Yao, J.; Lu, X.; Renping, X.; Li, L. DeepCrack: A deep hierarchical feature learning architecture for crack segmentation. Neurocomputing 2019, 338, 139–153.
49. Yang, F.; Zhang, L.; Yu, S.; Prokhorov, D.; Mei, X.; Ling, H. Feature pyramid and hierarchical boosting network for pavement crack detection. IEEE Trans. Intell. Transp. Syst. 2019, 21, 1525–1535.
50. Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
51. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 3431–3440.
52. Chen, L.C.; Papandreou, G.; Schroff, F.; Adam, H. Rethinking Atrous Convolution for Semantic Image Segmentation. arXiv 2017, arXiv:1706.05587.
53. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention (MICCAI), Munich, Germany, 5–9 October 2015; pp. 234–241.
54. Howard, A.; Sandler, M.; Chu, G.; Chen, L.C.; Chen, B.; Tan, M.; Adam, H. Searching for mobilenetv3. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 1314–1324.
55. Dung, C.V. Autonomous concrete crack detection using deep fully convolutional neural network. Autom. Constr. 2019, 99, 52–58.
56. Dais, D.; Bal, I.E.; Smyrou, E.; Sarhosis, V. Automatic crack classification and segmentation on masonry surfaces using convolutional neural networks and transfer learning. Autom. Constr. 2021, 125, 103606.
57. Hsieh, Y.-A.; Tsai, Y.J. Machine learning for crack detection: Review and model performance comparison. J. Comput. Civ. Eng. 2020, 34, 04020038.
58. Alipour, M.; Harris, D.K.; Miller, G.R. Robust pixel-level crack detection using deep fully convolutional neural networks. J. Comput. Civ. Eng. 2019, 33, 04019040.
59. Liu, Z.; Cao, Y.; Wang, Y.; Wang, W. Computer vision-based concrete crack detection using U-net fully convolutional networks. Autom. Constr. 2019, 104, 129–139.
60. Geirhos, R.; Rubisch, P.; Michaelis, C.; Bethge, M.; Wichmann, F.A.; Brendel, W. ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness. arXiv 2018, arXiv:1811.12231.
61. Tuli, S.; Dasgupta, I.; Grant, E.; Griffiths, T.L. Are Convolutional Neural Networks or Transformers more like human vision? arXiv 2021, arXiv:2105.07197.
62. Azulay, A.; Weiss, Y. Why do deep convolutional networks generalize so poorly to small image transformations? arXiv 2018, arXiv:1805.12177.
63. Chen, L.C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision—ECCV, Munich, Germany, 8–14 September 2018; pp. 833–851.
64. Mei, Q.; Gül, M.; Azim, M.R. Densely connected deep neural network considering connectivity of pixels for automatic crack detection. Autom. Constr. 2020, 110, 103018.
65. Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jegou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021; Volume 139, pp. 10347–10357.

Hardware/Software | Parameters/Version |
---|---|
CPU | Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10 GHz |
GPU | 2 × NVIDIA GTX TITAN Xp, 24 GB memory |
RAM | DDR4, 32 GB |
Operating System | Ubuntu 18.04 |
Python | 3.6 |
CUDA | 10.2 |
cuDNN | 8.0.2 |
TensorRT | 7.0 |

Method | Improved SwinT-Blocks | Weighted Attention Block | Hybrid Loss Function | Multiple-Level Label Supervision |
---|---|---|---|---|
a | × | × | × | × |
b | √ | × | × | × |
c | × | √ | × | × |
d | √ | √ | × | × |
e | √ | √ | √ | × |
f (DCST-net) | √ | √ | √ | √ |

Method | Precision (%) | Recall (%) | F1-score (%) | mIoU (%) |
---|---|---|---|---|
a | 68.16 | 73.46 | 69.10 | 54.42 |
b | 72.66 | 74.26 | 72.02 | 57.66 |
c | 72.94 | 75.15 | 72.40 | 58.08 |
d | 69.80 | 86.40 | 76.43 | 62.84 |
e | 77.81 | 81.73 | 79.48 | 66.65 |
f (DCST-net) | 78.41 | 82.02 | 79.96 | 67.21 |

Method | Precision (%) | Recall (%) | F1-score (%) | mIoU (%) |
---|---|---|---|---|
SegNet [50] | 61.05 | 58.87 | 57.53 | 42.26 |
FCN-8s [51] | 54.90 | 71.17 | 61.98 | 44.91 |
DeepLab v3+ [52] | 69.75 | 79.69 | 73.49 | 58.86 |
U-Net [53] | 72.32 | 76.47 | 73.02 | 58.80 |
LR-ASPP [54] | 54.15 | 66.76 | 58.18 | 42.11 |
DCST-net (ours) | 78.41 | 82.02 | 79.96 | 67.21 |

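
The four metric columns in the result tables above can be read through a short, dependency-free sketch of pixel-level precision, recall, F1-score and IoU for binary crack masks. The definitions below are the standard ones, but this excerpt does not state the averaging convention behind the reported values, so treating each reported number as a per-image mean of the crack class is an assumption. The function names `crack_metrics` and `mean_metrics` are hypothetical.

```python
def crack_metrics(pred_mask, true_mask):
    """Pixel-level metrics for one pair of binary masks (1 = crack, 0 = background)."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1          # crack pixel correctly detected
            elif p and not t:
                fp += 1          # background pixel flagged as crack
            elif t and not p:
                fn += 1          # crack pixel missed

    # Empty denominators are mapped to 0.0 here; this convention is an assumption.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


def mean_metrics(mask_pairs):
    """Average each metric over a list of (pred_mask, true_mask) pairs."""
    per_image = [crack_metrics(pred, true) for pred, true in mask_pairs]
    return {
        key: sum(m[key] for m in per_image) / len(per_image)
        for key in ("precision", "recall", "f1", "iou")
    }


pred = [[1, 1, 0],
        [0, 1, 0]]
true = [[1, 0, 0],
        [0, 1, 1]]
print(crack_metrics(pred, true)["iou"])  # 0.5 (2 true positives, 1 false positive, 1 false negative)
```

Because the F1-score and the crack-class IoU are both functions of the same true-positive, false-positive and false-negative counts, they rise and fall together for a single image, which matches the ordering of the methods in the tables.
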
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhou, J.; Zhao, G.; Li, Y. Vison Transformer-Based Automatic Crack Detection on Dam Surface. Water 2024, 16, 1348. https://doi.org/10.3390/w16101348
Zhou J, Zhao G, Li Y. Vison Transformer-Based Automatic Crack Detection on Dam Surface. Water. 2024; 16(10):1348. https://doi.org/10.3390/w16101348
Chicago/Turabian Style: Zhou, Jian, Guochuan Zhao, and Yonglong Li. 2024. "Vison Transformer-Based Automatic Crack Detection on Dam Surface" Water 16, no. 10: 1348. https://doi.org/10.3390/w16101348
APA Style: Zhou, J., Zhao, G., & Li, Y. (2024). Vison Transformer-Based Automatic Crack Detection on Dam Surface. Water, 16(10), 1348. https://doi.org/10.3390/w16101348