Article

Tunnel Lining Crack Recognition Algorithm Integrating SK Attention and Cascade Neural Network

1 CRRC Qingdao Sifang Vehicle Research Institute Co., Ltd., Qingdao 266111, China
2 College of Ocean Engineering, Harbin Institute of Technology, Weihai 264200, China
3 Shanghai Tongyan Civil Engineering Technology Co., Ltd., Shanghai 200092, China
* Author to whom correspondence should be addressed.
Electronics 2023, 12(15), 3307; https://doi.org/10.3390/electronics12153307
Submission received: 17 June 2023 / Revised: 19 July 2023 / Accepted: 20 July 2023 / Published: 1 August 2023
(This article belongs to the Section Artificial Intelligence)

Abstract

Cracks are one of the main types of tunnel-lining defects, and no existing method identifies them particularly well: current approaches suffer from poor robustness, low detection efficiency, and inconsistent defect identification. Vision-based crack identification is further complicated by the nature of tunnel-lining crack images, in which features are weak, shapes are irregular, development is random, and cracks occupy only a small proportion of the image. In light of this, we propose a neural network that integrates the structural characteristics of U-net and the FPN (feature pyramid network). To exploit its deep feature output, an SK (selective kernel) attention mechanism was integrated to increase the weight of effective feature regions and improve the detection accuracy of the feature maps. An RPN (region proposal network) was used for multi-scale region screening, and non-maximum suppression was used to eliminate duplicate anchor boxes between regions. The continuity of cracks was ensured using adaptive region expansion, and the region segmentation network was combined with the multi-scale feature maps. Finally, the segmentation results for fine-feature regions were mapped back to the whole image to identify cracks, thereby solving the problem that cracks with weak textural features cannot be accurately identified. The algorithm was tested on photographs of tunnel-lining crack defects. By combining deep and shallow features, it identified richer crack features: the recognition accuracy of the crack classification network reached 98.51%, the segmentation accuracy reached 94.55%, and the processing time for a single image was 60.76 ms, indicating more accurate and efficient segmentation than FCNs (fully convolutional networks) and the classical U-net.

1. Introduction

Globally, subway mileage has been increasing rapidly for many years [1]. As more subway lines enter operation, problems gradually emerge in tunnels already open to traffic. Under the combined action of internal and external factors, the performance of such tunnels degrades, producing a variety of defects that endanger their continued operation. Cracks are among the most important types of lining defects [2] and adversely affect the durability and mechanical properties of the tunnel. Traditional detection methods rely mainly on the naked eye or manual instruments and require highly qualified personnel; it is often difficult to ensure the accuracy and completeness of results obtained in this way, and the detection process can be inefficient. With the development of machine vision technology, studies of automatic crack detection have been carried out in a number of countries [3,4].
Crack recognition involves identifying meaningful feature parts of the image, such as the edges and regions of the defect [5,6], and judging whether the identified features belong to the crack region [3,4]. However, the edge features of tunnel-lining cracks are weak, their shapes are irregular, their development is random, and the cracks occupy only a small proportion of the image. The quick and accurate identification of cracks is therefore a challenge facing researchers today [7,8]. Seung Yeol Lee et al. proposed a rapid detection system for tunnel-lining cracks using the Dijkstra method, which can detect cracks effectively [9]. Ukai Masato applied a brightness-change method to image edges, identifying the feature with the largest variation in brightness as the crack, and then used a threshold method to define its width and length [10]. However, in tunnels, crack features are small and background interference is strong, so acquired images are often unclear; the ability of traditional algorithms to identify tunnel cracks is therefore limited.
With the development of convolutional neural networks [11,12], automatic crack detection technology has been applied more widely [13]. Cha et al. used a convolutional neural network to detect cracks in concrete, effectively demonstrating the robustness and effectiveness of convolutional neural networks for crack identification [14]. Protopapadakis et al. proposed a detection method for cracks in concrete based on image processing and a convolutional deep-learning model; however, the segmentation efficiency of the network was not high [15,16]. Dung et al. proposed a vision-based method for concrete-crack detection and density evaluation using an FCN, achieving approximately 90% average precision [17]. Ren et al. proposed a fully convolutional deep neural network named CrackSegNet to conduct dense pixel-wise crack segmentation, which improved the network's overall crack-segmentation ability; however, its memory usage was large, and detection was slow [18]. Zhe combined image pixel processing with deep-learning-based classification to determine whether tunnel features were cracks; however, in images with heavier background interference, cracks could not be identified consistently [19]. Weng improved on the fully convolutional method and applied it to pavement-crack recognition [20]. Li integrated a segmentation branch into the discriminator, combining a generative adversarial network with a semantic segmentation network [21]. Zou et al. proposed a data fusion scheme based on a convolutional neural network and naive Bayesian analysis to analyze cracks in single video frames, improving the overall performance of the system [22]. However, to some degree at least, all these algorithms suffer from poor robustness, low detection efficiency, and discontinuous defect identification, which limits their effectiveness in actual detection work.
Attention mechanisms focus on important information with high weights and ignore irrelevant information with low weights; they can also continually adjust these weights so that different information is selected as important under different circumstances. They therefore offer high scalability and robustness and have produced good results in a growing number of application scenarios. Li et al. proposed the SK (selective kernel) attention mechanism and used it to achieve excellent performance in various visual tasks [23]. Chen et al. proposed an underwater image-enhancement method based on an SK residual attention network and found that, in addition to enhanced image color, key information and detailed features were greatly improved [24].
Because of the aforementioned problems of poor robustness, low detection efficiency, and discontinuous defect identification, the method described in this paper uses a three-layer cascade network to identify tunnel-lining cracks [25]. The method combines a multi-scale feature-fusion algorithm, a neural network, and an FPN (feature pyramid network) structure, applying an image-feature pyramid algorithm to identify image features. The SK attention mechanism performs effective feature screening, thus addressing the difficulty of identifying crack features. To ensure the continuity of the crack region, a grid-division method compares and selects grids of different scales at the same feature location and merges adjacent regions. The segmentation network uses multi-scale feature fusion to segment the crack region and maps the segmentation result to the entire image to extract crack information.

2. Structure of Cascade Neural Network

For the present work, a cascading network structure was adopted, as shown in Figure 1. The first layer is a feature extraction network integrating U-net, an FPN pyramid network, and an SK attention mechanism [25,26]. The second layer is a feature-region detection network based on an RPN (region proposal network) structure. The third layer is a region segmentation network based on an FCN (fully convolutional network) structure. The first-layer feature extraction network fuses U-net with the FPN feature pyramid to obtain a multi-scale feature map of cracks; the SK attention mechanism is introduced to increase the weight of effective features in the crack region and enhance the network's precision in extracting fine crack features. The second-layer feature-region detection network detects feature regions in the feature maps produced by the upper layer and passes the boundary information of regions judged to be cracks to the lower layer. The third-layer region segmentation network combines the multi-scale feature-map information and the corresponding region information obtained from the first two layers to complete the pixel-level segmentation.

2.1. Feature Extraction Network

Cracks are small features. The feature extraction network detects targets at different scales, obtains multi-scale feature maps, and constructs a feature pyramid [27,28]. The SK attention module is integrated to screen out effective small features according to their weight coefficients, thereby enhancing small features such as cracks. The feature-pyramid information obtained by the feature extraction network is then passed to the region detection network and the region segmentation network, respectively, so that features can be judged at different scales.

2.1.1. Structure of Network

The key to the feature extraction network is the integration of the structural features of U-net and FPN, which enables feature extraction at different scales and replaces the sliding-window operation of a traditional convolutional network, significantly improving feature extraction efficiency compared with cyclic traversal. At the same time, multi-scale feature maps can be extracted to improve the precision of feature extraction. The trunk structure is shown in Figure 2.
First, the input image is assumed to be 512 × 512. To extract primary features, two 3 × 3 convolutions and max pooling are performed at levels C1–C5. The primary features are further processed by 1 × 1 convolution, which uniformly reduces the number of channels to s1. The feature map from the level above is transmitted from top to bottom through up-sampling, which uses nearest-neighbor interpolation to expand the feature map to twice its original size. The information at the current scale and the up-sampled information are then fused by element-wise addition, and the fused features are refined by a 3 × 3 convolution. Because high-level features carry rich semantic information, propagating them from top to bottom allows the lower-level features, which are rich in textural information, to acquire rich semantic information as well.
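For illustration, the following PyTorch sketch mirrors the top-down pathway just described: 1 × 1 lateral convolutions unify the channel counts, nearest-neighbor interpolation doubles the coarser map, element-wise addition fuses the two levels, and a 3 × 3 convolution smooths the result. The channel numbers are assumptions for the sketch, and the encoder that produces C1–C5 is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopDownFusion(nn.Module):
    """Minimal FPN-style top-down pathway (illustrative channel counts)."""
    def __init__(self, in_channels=(64, 128, 256, 512, 1024), out_channels=256):
        super().__init__()
        # 1x1 lateral convolutions reduce each level to a common channel count
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        # 3x3 convolutions extract the fused features at each level
        self.smooth = nn.ModuleList(
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1)
            for _ in in_channels)

    def forward(self, feats):              # feats: C1..C5, fine to coarse
        laterals = [l(f) for l, f in zip(self.lateral, feats)]
        outs = [laterals[-1]]              # start from the coarsest level
        for i in range(len(laterals) - 2, -1, -1):
            up = F.interpolate(outs[0], scale_factor=2, mode="nearest")
            outs.insert(0, laterals[i] + up)   # element-wise fusion
        return [s(o) for s, o in zip(self.smooth, outs)]
```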

2.1.2. SK Attention Mechanism

The SK attention mechanism computes a weight for each channel of the input feature map, improving feature expression, correcting the feature map, and preserving valuable features. Its structure is shown in Figure 3.
First, the feature map X input to the network is processed by depth-wise convolution with kernels at two scales, 3 × 3 and 5 × 5, generating the corresponding feature maps Ũ and Û. The 5 × 5 convolution is implemented as a 3 × 3 dilated convolution with a dilation rate of 2 to reduce the amount of computation. Element-wise addition yields the feature map U of size [C, H, W], which integrates the multi-scale receptive-field information. A vector of length C × 1 × 1 is then obtained by averaging over the H and W dimensions with global average pooling Fgp. A fully connected layer Ffc compresses the C-dimensional vector into a Z-dimensional vector, which is transformed back to C dimensions by a linear mapping and normalized by the softmax function to produce the attention-weight vectors a and b. These weight each channel of Ũ and Û, respectively, by element-wise multiplication, and the weighted maps are summed element-wise to obtain the information-fusion feature map V. Because the SK attention mechanism applies different convolution-kernel weights to different images, effectively generating kernels dynamically for images at different scales, it significantly improves the effectiveness of crack feature extraction.
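A minimal PyTorch sketch of this selective-kernel computation is given below. The reduction ratio and layer sizes are assumptions, since the paper does not list its exact implementation; this is an outline of the mechanism, not the authors' code.

```python
import torch
import torch.nn as nn

class SKAttention(nn.Module):
    """Selective-kernel attention sketch: two depth-wise branches with
    different receptive fields, fused by per-channel softmax weights."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        z = max(channels // reduction, 8)              # compressed dimension Z
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1,
                                 groups=channels)      # 3x3 depth-wise conv
        self.branch5 = nn.Conv2d(channels, channels, 3, padding=2, dilation=2,
                                 groups=channels)      # 5x5 field via dilation 2
        self.fc_reduce = nn.Linear(channels, z)        # Ffc: C -> Z
        self.fc_a = nn.Linear(z, channels)             # Z -> C for weights a
        self.fc_b = nn.Linear(z, channels)             # Z -> C for weights b

    def forward(self, x):
        u_t, u_h = self.branch3(x), self.branch5(x)    # U-tilde, U-hat
        u = u_t + u_h                                  # element-wise fusion U
        s = u.mean(dim=(2, 3))                         # global average pool Fgp
        zvec = torch.relu(self.fc_reduce(s))
        logits = torch.stack([self.fc_a(zvec), self.fc_b(zvec)])
        a, b = torch.softmax(logits, dim=0)            # per-channel weights a, b
        a, b = a[:, :, None, None], b[:, :, None, None]
        return a * u_t + b * u_h                       # fused feature map V
```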

2.2. Region Detection Network

The second-layer network performs region detection. To account for background interference in the tunnel environment, a region-division approach was adopted: based on analysis of crack image features, the image is divided into a grid to reduce interference from background regions and improve detection efficiency. The operation procedure is shown in Figure 4.

2.2.1. Structure of Network

Firstly, the input of the region detection network consists of FeatureMap1–6 delivered by the upper-layer network. In the feature maps of different scales, multiple anchor boxes are generated for the detection region, and the RPN then classifies each anchor box and regresses its bounding box. Finally, the information of each anchor box judged to contain a crack is recorded, including its length and width (l, w), its center point (x, y), the box offsets (Δx, Δy), and the level i of the input feature map. The structure of the region detection network is shown in Figure 5.
Reduction criteria are set according to the anchor-box scores and a preset overlap rate. By applying these criteria, effective anchor boxes are screened out, and the remaining boxes are expanded to ensure regional continuity. The anchor-box information is also corrected and then passed as input to the segmentation network in the next layer.

2.2.2. Generation of Anchor Box

The pixels of the feature map at each scale are traversed to generate anchor boxes of corresponding size. For example, FeatureMap2 has a size of 256 × 256 and a stride of 16, so each of its pixels generates a 16 × 16 anchor box. To enlarge the field of view and facilitate detection, the sides of the anchor box are doubled in length to 32 × 32. Because crack features are narrow and non-directional, three rectangular anchor-box shapes are used: 16 × 64, 32 × 32, and 64 × 16. This covers the same region while expanding the search range in length and width. The anchor boxes generated by FeatureMap2 are shown in Figure 6.
To reduce the amount of computation, inappropriate anchor boxes are deleted according to the criterion in Equation (1):

    label(OV) = positive sample,  if OV > 0.7
                negative sample,  if OV < 0.3
                delete,           otherwise        (1)

where OV is the overlap rate.
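As a sketch of the anchor generation and the labeling rule in Equation (1): the three box shapes and the stride come from the text above, while the overlap computation itself is omitted and the function names are placeholders.

```python
import numpy as np

def generate_anchors(fmap_size=256, stride=16):
    """Emit the three anchor shapes (16x64, 32x32, 64x16, in image pixels)
    centred on every cell of a feature map with the given stride."""
    shapes = [(16, 64), (32, 32), (64, 16)]
    anchors = []
    for gy in range(fmap_size):
        for gx in range(fmap_size):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride
            for w, h in shapes:
                anchors.append([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2])
    return np.asarray(anchors)

def label_anchor(ov):
    """Equation (1): OV > 0.7 -> positive, OV < 0.3 -> negative, else delete."""
    if ov > 0.7:
        return 1      # positive sample
    if ov < 0.3:
        return 0      # negative sample
    return -1         # deleted (excluded from training)
```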

2.2.3. Classification of Regions

Samples randomly selected from the positive and negative samples are used to train the classification branch (cls) and the bounding-box regression branch (bbox_reg). During region detection, anchor boxes are input to the trained branches to achieve preliminary screening of feature and background information, as well as bounding-box regression.
In the classification branch (cls), the anchor-box score threshold for judging crack information is set as θ, and the region category is determined by the anchor-box scores within the region. The specific operation is shown as a flowchart in Figure 7.
Because of interference from a large amount of background information, the anchor boxes inside a region need to be screened during region detection. Because cracks are long and narrow, a single anchor box rarely covers the entire crack region; moreover, anchor boxes differ in size across image scales, so overlapping anchor boxes occur, and a feature merely similar to a crack may be misjudged when a single small anchor box is evaluated. To solve these problems, a secondary screening of all anchor boxes is carried out, taking the characteristics of crack images into account. The specific process involves the following steps:
(i)
If the overlap ratio of the two anchor boxes exceeds 65%, the anchor box with the lower score is deleted.
(ii)
Expansion is carried out in proportion with the aspect ratio of the anchor box until 90% of the anchor boxes have regions of 25% overlap with their surrounding anchor boxes. Those anchor boxes with independent positions and low scores are then removed.
Step (i) removes anchor boxes with excessive overlap, reducing their number. Step (ii) addresses the misjudgment of similar features and, by extending the anchor boxes, ensures that all the information of the narrow crack region is included. In the illustrations, Figure 8a shows the input image, and Figure 8b contains two kinds of anchor boxes, red and green. The red anchor box is small, and the overlap between the red and green boxes exceeds 65% of the red box's area, so the lower-scoring green box is removed, as shown in Figure 8c. Figure 8d shows multiple non-overlapping anchor boxes, and Figure 8e shows the adaptive expansion of the anchor boxes. In Figure 8e, there is no other anchor box around the green one; it is therefore judged to be an independent anchor box and removed, as shown in Figure 8f.
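The two screening steps might be sketched as follows. The 65% overlap rule and the removal of isolated boxes follow the text; the fixed 1.25× expansion is a simplification of the iterative 90%/25% criterion in step (ii), and the score condition on isolated boxes is dropped for brevity, so treat this as an assumption-laden outline rather than the authors' exact procedure.

```python
import numpy as np

def overlap_ratio(a, b):
    """Intersection area over the smaller box's area (per the Figure 8 text)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return ix * iy / min(area(a), area(b))

def secondary_screening(boxes, scores, overlap_thr=0.65, expand=1.25):
    """Step (i): drop the lower-scoring box of any pair overlapping > 65%.
    Step (ii): expand survivors along their aspect ratio, then drop boxes
    that still touch no neighbour (isolated, likely misjudged features)."""
    order = np.argsort(-scores)
    keep = []
    for i in order:
        if all(overlap_ratio(boxes[i], boxes[j]) <= overlap_thr for j in keep):
            keep.append(i)
    kept = boxes[keep].astype(float)
    cx, cy = (kept[:, 0] + kept[:, 2]) / 2, (kept[:, 1] + kept[:, 3]) / 2
    w = (kept[:, 2] - kept[:, 0]) * expand     # proportional expansion
    h = (kept[:, 3] - kept[:, 1]) * expand
    expanded = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)
    isolated = [i for i in range(len(expanded))
                if not any(overlap_ratio(expanded[i], expanded[j]) > 0
                           for j in range(len(expanded)) if j != i)]
    return np.delete(expanded, isolated, axis=0)
```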

2.3. Region Segmentation Network

The region segmentation network is built on the FCN structure and adopts a similar skip architecture, with the feature maps of different scales as its inputs. The prediction at each level is obtained by fusing that level's score map with the previous level's prediction, up-sampled by a factor of two via deconvolution; the up-sampled result at the corresponding scale is then taken as the level's final prediction. The specific network structure is shown in Figure 9.
The information of each crack region passed in by the RPN can now be obtained and its corresponding feature-level number i confirmed. The feature maps FeatureMapN (N ≥ i) are then applied to classify the pixels in the region. For example, if a crack region comes from FeatureMap2, its prediction is a combination of FeatureMap2, FeatureMap3, FeatureMap4, FeatureMap5, and FeatureMap6. Because crack features are small and narrow, extracting deeper feature maps and fusing their classifications makes the segmentation results more accurate. Finally, using the anchor-box information passed in from the region detection network, the result is mapped back to the original image to produce the crack segmentation of the whole image, as shown in Figure 10.
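A sketch of such a skip-fusion head, under the assumption that each FeatureMap level halves the spatial resolution: the deepest prediction is repeatedly up-sampled by a learned 2× deconvolution and added to the score map of the next finer level, down to the region's own level. The class count and channel width are placeholders.

```python
import torch
import torch.nn as nn

class RegionSegHead(nn.Module):
    """FCN-style skip fusion: the deepest prediction is 2x-upsampled by
    deconvolution and added to each finer level's score map in turn."""
    def __init__(self, channels=256, num_classes=2, levels=6):
        super().__init__()
        self.score = nn.ModuleList(
            nn.Conv2d(channels, num_classes, kernel_size=1)
            for _ in range(levels))
        self.up = nn.ConvTranspose2d(num_classes, num_classes, kernel_size=4,
                                     stride=2, padding=1)  # exact 2x upsampling

    def forward(self, fmaps, level):
        # fmaps[k] is FeatureMap(k+1); `level` is the region's 0-based level i-1
        pred = self.score[-1](fmaps[-1])           # start from the deepest map
        for k in range(len(fmaps) - 2, level - 1, -1):
            pred = self.up(pred) + self.score[k](fmaps[k])   # skip fusion
        return pred                                # mask logits at `level`
```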

3. Experimental Results and Analysis

3.1. Experimental Environment and Samples

The experiments were implemented in Python with the PyTorch deep-learning framework, developed in PyCharm under Windows. The hardware, software, and programming environments are shown in Table 1.
For training the multi-layer network, a large number of tunnel-lining images were first collected with a camera at a shooting distance of 2.4 m and a resolution of 5120 × 5120. Each image covered a region of 2.88 × 2.88 m², so a typical 1 mm-wide crack occupies about 2 pixels.
A total of 12,000 images with crack characteristics were selected, and the image set was divided into training and test sets at a 5:1 ratio, with a unified image size of 512 × 512 pixels. The feature extraction network used manually annotated crack images as training samples, as shown in Figure 11a, to focus training on the crack regions. For the region detection network, region maps divided on an 8 × 8 grid were taken as input images; regions containing cracks were treated as positive samples and those without cracks as negative samples, as shown in Figure 11b,c. Because cracks occupy only a small proportion of an image, the positive and negative samples are naturally imbalanced; the ratio of positive to negative samples was therefore set to 1:3 to restore balance.
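The 1:3 rebalancing could be realized with a simple subsampling step such as the following sketch (the seed and function name are illustrative):

```python
import random

def balance_regions(positives, negatives, ratio=3, seed=0):
    """Keep every crack region and subsample the background regions so the
    positive:negative ratio becomes 1:ratio, as used for the region network."""
    rng = random.Random(seed)
    rng.shuffle(negatives)
    return positives, negatives[: len(positives) * ratio]
```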

3.2. Result Analysis

3.2.1. Analysis of Feature Extraction Network Results

In this paper, the SK attention module is integrated into the multi-scale feature maps output by the fused U-net and FPN pyramid network to improve the accuracy of feature extraction. Changes in the illumination of the tunnel environment affect feature extraction, so in this experiment a nonlinear gamma transformation was used to adjust the brightness of the images [29]. A comparison of feature extraction effects is presented in Figure 12.
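The cited method [29] is an adaptive two-dimensional gamma correction; as a simpler illustration of the basic pointwise transform, one might write:

```python
import numpy as np

def gamma_correct(image, gamma=0.6):
    """Pointwise gamma transformation of a uint8 image: normalise to [0, 1],
    raise to the power gamma (<1 brightens dark areas, >1 darkens bright
    ones), and rescale to [0, 255]."""
    x = image.astype(np.float32) / 255.0
    return np.clip((x ** gamma) * 255.0, 0, 255).astype(np.uint8)
```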
It can be seen intuitively from Figure 12 that the crack images processed with the gamma transformation yield more precise extraction of deep features. To analyze the feature maps FeatureMap1–6, the output of each of their channel maps was compared. The results are shown in Figure 13.
From FeatureMap6 in Figure 13l, it can be seen that the extracted features are more abstract. In Figure 13g, FeatureMap1 exhibits a greater similarity to the input crack image. Therefore, integrating SK attention with a cascade neural network can combine deep features with shallow features so that richer crack features may be obtained.

3.2.2. Analysis of Region Classification Network Results

To evaluate the classification effect of the classification network, the combination of the real category and the predicted category was divided into TP (true positive), FP (false positive), TN (true negative), and FN (false negative). The confusion matrix of the classification results is shown in Table 2.
Three evaluation criteria commonly used in network performance testing were adopted: classification accuracy (A), precision (P), and recall (R), defined as follows:
A = (TP + TN) / (TP + FP + TN + FN)
P = TP / (TP + FP)
R = TP / (TP + FN)
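These criteria reduce to a few lines of arithmetic over the confusion-matrix counts:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, and recall from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Note: the whole-image test below has no negative samples, so only
# accuracy and recall are meaningful for that row of Table 3.
```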
In this experiment, the initial learning rate of the RPN was set to 0.01 with cosine decay; the cosine cycle was 10 epochs, and a total of 100 epochs were trained. A softmax classifier was used in the model, and regions with a score greater than 0.65 were judged to contain cracks. A total of 1000 test samples were randomly selected from the test set, and each sample map was divided into an 8 × 8 grid of regions, giving 64,000 test regions in total. To verify the rationality of the 8 × 8 grid division, whole images and 8 × 8 region maps were input to the network as a control experiment. All sample images contained cracks, so the whole-image test had zero negative samples, and only classification accuracy and recall were counted for it. The results are shown in Table 3.
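The training schedule described above maps naturally onto PyTorch's cosine-annealing-with-restarts scheduler; the model and optimizer below are stand-ins, since the paper does not list its exact RPN definition:

```python
import torch

model = torch.nn.Conv2d(3, 2, kernel_size=3)          # placeholder for the RPN
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Cosine decay restarting every 10 epochs, for 100 epochs in total.
scheduler = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=10)

for epoch in range(100):
    # ... one training pass over the 8 x 8 region samples would run here ...
    scheduler.step()                                   # advance the cosine cycle
```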
As can be seen from Table 3, the RPN achieved higher classification accuracy when processing the 8 × 8 region maps. Region division screened out background interference and retained the effective feature regions; missed cracks were then recovered through adaptive region expansion, ensuring the continuity of the detected crack regions. At the same time, non-crack regions misjudged as positive cases were screened out as independent regions, so the number of false positives decreased and the accuracy of the network rose to 98.51%.

3.2.3. Analysis of Region Segmentation Network Results

Because of subjective differences in hand-marked images, there are always transition areas between cracked and non-cracked regions in the actual images, so a certain range of allowable deviation must be set [30,31]. In this study, because the actual width of cracks was generally about 1 mm, occupying about 2 pixels in the image, these values were used to determine the deviation range and thus ensure the accuracy of the segmentation-precision calculation.
From the region detection network results, 7142 regions were judged as positive cases, of which 7036 contained real cracks. In the experiment, the reference maps for the positive regions containing real cracks were obtained by manually annotating the images. The performance of the segmentation algorithm was evaluated using commonly used parameters such as segmentation accuracy, under-segmentation rate, and over-segmentation rate [32]. The network segmentation results were compared with the manually annotated maps, allowing a deviation of 2 pixels in the vertical crack width at each pixel. The statistics of the segmentation results are shown in Table 4.
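The paper does not give the exact formula used with the 2-pixel allowance; one common way to implement such a tolerance is to dilate each mask by the allowed deviation before counting matches, as in this sketch (SciPy assumed, boolean masks as input):

```python
import numpy as np
from scipy.ndimage import binary_dilation

def tolerant_match(pred, truth, tol=2):
    """Count a predicted crack pixel as correct if a ground-truth crack pixel
    lies within `tol` pixels, and vice versa."""
    truth_zone = binary_dilation(truth, iterations=tol)
    pred_zone = binary_dilation(pred, iterations=tol)
    matched = (pred & truth_zone).sum() / max(pred.sum(), 1)    # ~ precision
    covered = (truth & pred_zone).sum() / max(truth.sum(), 1)   # ~ recall
    return matched, covered
```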
Because false positive regions contain no cracks, their segmentation accuracy was defined as the ratio of the number of such regions in which no crack was segmented to the total number of false positives. As can be seen from Table 4, the segmentation network reached an accuracy of 94.55% for regions that contained cracks.
Both FCN and U-net can achieve pixel-level target segmentation. Using the same test samples, training batches, and output thresholds, FCN and U-net were compared with the crack recognition network introduced in this paper. The test results are shown in Figure 14.
As can be seen from Figure 14, an experimental comparison of cracks in three different directions (transverse, longitudinal, and oblique) shows that the network described in this paper outperformed FCN in the continuity of crack-feature extraction and U-net in segmentation integrity. A total of 1000 test-set images were selected for comparative experimental analysis; the statistical results are shown in Table 5. The proposed algorithm significantly improved segmentation accuracy and greatly reduced single-image processing time.

4. Conclusions

In this paper, we propose an algorithm for recognizing cracks in tunnel lining. To fully extract the tiny features of cracks, the algorithm is based on the SK attention mechanism and a cascaded network. After multi-scale region detection and adaptive region expansion, background screening is completed, and the crack contour of the whole image is output by the segmentation network. The algorithm combines deep and shallow features to extract richer crack features. The experimental results showed that the recognition accuracy of the region classification network reached 98.51%; for crack segmentation, the segmentation accuracy reached 94.55%, and the processing time for a single image was only 60.76 ms. Compared with the two classical methods, FCN and U-net, better segmentation accuracy and efficiency were obtained.
The research reported in this paper can be applied to the automatic detection of tunnel-lining cracks, and further research should be carried out on the extraction of characteristic crack parameters to produce a complete tunnel-lining crack detection algorithm with even higher potential value in engineering applications.

Author Contributions

Conceptualization, K.L.; methodology, P.Z. and X.Z.; software, P.Z.; validation, R.W. and X.L.; formal analysis, K.L.; investigation, S.Z.; resources, X.L.; data curation, P.Z.; writing—original draft preparation, P.Z. and X.Z.; writing—review and editing, K.L. and S.Z.; visualization, K.L.; supervision, S.Z.; project administration, K.L.; funding acquisition, R.W. and X.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Key Research and Development Program of China (2018YFB2101004) and the National Natural Science Foundation of China (51975157).

Data Availability Statement

The data that support the findings of this study are not openly available due to sensitivity but are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. China Urban Transportation Association. Statistics and Analysis Report of China's Urban Rail Transit in 2020; China Urban Transportation Association: Beijing, China, 2021.
  2. Wang, H.; Liu, X. Safety Evaluation of Tunnel Lining with Longitudinal Cracks and Reinforcement Design. Chin. J. Rock Mech. Eng. 2010, 29, 2651–2656.
  3. Koch, C.; Georgieva, K. A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure. Adv. Eng. Inform. 2015, 29, 196–210.
  4. Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018, 57, 787–798.
  5. Yu, L.; He, S.; Liu, X.; Jiang, S.; Xiang, S. Intelligent Crack Detection and Quantification in the Concrete Bridge: A Deep Learning-Assisted Image Processing Approach. Adv. Civ. Eng. 2022, 2022, 1813821.
  6. Chen, J.; Zhou, M.; Huang, H.; Zhang, D.; Peng, Z. Automated extraction and evaluation of fracture trace maps from rock tunnel face images via deep learning. Int. J. Rock Mech. Min. Sci. 2021, 142, 16.
  7. Huang, H.; Zhao, S.; Zhang, D.; Chen, J. Deep learning-based instance segmentation of cracks from shield tunnel lining images. Struct. Infrastruct. Eng. 2022, 18, 183–196.
  8. Xu, H.; Su, X.; Wang, Y.; Cai, H.; Cui, K.; Chen, X. Automatic Bridge Crack Detection Using a Convolutional Neural Network. Appl. Sci. 2019, 9, 2867.
  9. Lee, S.Y.; Lee, S.H. Development of an inspection system for cracks in a concrete tunnel lining. Can. J. Civ. Eng. 2007, 34, 966–975.
  10. Ukai, M. Advanced Inspection System of Tunnel Wall Deformation using Image Processing. Q. Rep. RTRI 2007, 48, 94–98.
  11. Zhang, F.; Li, G. Study on Highway Crack Diagnosis Based on Cellular Neural Network. Inf. Technol. 2014, 7, 42–44.
  12. Liu, H.; Wang, X. Detection and Recognition of Bridge Crack Based on Convolutional Neural Network. J. Hebei Univ. Sci. Technol. 2016, 37, 485–490.
  13. Gupta, P.; Dixit, M. Image-based crack detection approaches: A comprehensive survey. Multimed. Tools Appl. 2022, 81, 40181–40229.
  14. Cha, Y.J.; Choi, W.; Büyüköztürk, O. Deep Learning-Based Crack Damage Detection Using Convolutional Neural Networks. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 361–378.
  15. Protopapadakis, E.; Doulamis, N. Image Based Approaches for Tunnels' Defects Recognition via Robotic Inspectors; Springer International Publishing: Cham, Switzerland, 2015.
  16. Makantasis, K.; Protopapadakis, E.; Doulamis, A.; Doulamis, N.; Loupos, C. Deep Convolutional Neural Networks for efficient vision based tunnel inspection. In Proceedings of the 2015 IEEE International Conference on Intelligent Computer Communication and Processing (ICCP), Cluj-Napoca, Romania, 3–5 September 2015.
  17. Dung, C.V.; Anh, L.D. Autonomous concrete crack detection using deep fully convolutional neural network. Autom. Constr. 2019, 99, 52–58.
  18. Ren, Y.; Huang, J.; Hong, Z.; Lu, W.; Yin, J.; Zou, L.; Shen, X. Image-based concrete crack detection in tunnels using deep fully convolutional networks. Constr. Build. Mater. 2020, 234, 117367.
  19. Zhe, C. Research on Image Recognition Algorithm of Complex Crack Disease in Subway Tunnel; Beijing Jiaotong University: Beijing, China, 2019.
  20. Weng, P.; Lu, Y. Pavement Crack Segmentation Technology Based on Improved Fully Convolutional Networks. Comput. Eng. Appl. 2019, 55, 235–239.
  21. Li, L.; Hu, M. Method for Small-bridge-crack Segmentation Based on Generative Adversarial Network. Laser Optoelectron. Prog. 2019, 56, 1–11.
  22. Zou, Q.; Zhang, Z. DeepCrack: Learning Hierarchical Convolutional Features for Crack Detection. IEEE Trans. Image Process. 2018, 28, 1498–1512.
  23. Li, X.; Wang, W.; Hu, X.; Yang, J. Selective kernel networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 510–519.
  24. Chen, H.; Liu, L. Underwater image enhancement based on the SK attention residual network. J. Nanjing Univ. Inf. Technol. Eng. 2022, 21, 248–252.
  25. Rahman, Z.U.; Jobson, D.J.; Woodell, G.A. Retinex Processing for Automatic Image Enhancement. Proc. SPIE-Int. Soc. Opt. Eng. 2004, 13, 100–110.
  26. Lin, T.-Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature Pyramid Networks for Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 936–944.
  27. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015; Springer International Publishing: Cham, Switzerland, 2015; pp. 234–241.
  28. Kansizoglou, I.; Bampis, L.; Gasteratos, A. Deep feature space: A geometrical perspective. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6823–6838.
  29. Liu, Z.; Wang, D. Adaptive correction algorithm for lighting uneven images based on two-dimensional gamma function. J. Beijing Inst. Technol. 2016, 36, 191–196.
  30. Amhaz, R.; Chambon, S.; Idier, J.; Baltazart, V. Automatic Crack Detection on Two-dimensional Pavement Images: An Algorithm Based on Minimal Path Selection. IEEE Trans. Intell. Transp. Syst. 2016, 17, 2718–2729.
  31. Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic Road Crack Detection Using Random Structured Forests. IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445.
  32. Zhou, L.; Jiang, F. Survey on Image Segmentation Methods. Appl. Res. Comput. 2017, 34, 1921–1928.
Figure 1. Structure of cascade network.
Figure 2. Structure of feature extraction network.
Figure 3. Structure of SK attention mechanism.
Figure 4. Schematic diagram of the operation steps of the region detection network: (a) input image; (b) region division; (c) region traversal screening; and (d) region screening results.
Figure 5. Structure of the region detection network.
Figure 6. Anchor box generation diagram.
Figure 7. Region classification operation flowchart.
Figure 8. Region classification and screening: (a) input image; (b) schematic diagram of anchor box overlap; (c) the overlapping anchor box is removed; (d) the overall anchor frame is obtained; (e) anchor box size is expanded; and (f) the independent anchor box is removed.
Figure 9. Structure of region segmentation network diagram.
Figure 10. Schematic diagram of crack segmentation steps for the whole image.
Figure 11. Network training samples: (a) label samples; (b) negative samples of crack; and (c) positive samples of crack.
Figure 12. Examples of feature maps before and after lighting: (a) FeatureMap1 (untreated); (b) FeatureMap2 (untreated); (c) FeatureMap3 (untreated); (d) FeatureMap1 (treated); (e) FeatureMap2 (treated); (f) FeatureMap3 (treated); (g) FeatureMap4 (untreated); (h) FeatureMap5 (untreated); (i) FeatureMap6 (untreated); (j) FeatureMap4 (treated); (k) FeatureMap5 (treated); and (l) FeatureMap6 (treated).
Figure 13. Examples of feature maps at different scales (after illumination processing): (a) FeatureMap1 (untreated); (b) FeatureMap2 (untreated); (c) FeatureMap3 (untreated); (d) FeatureMap4 (untreated); (e) FeatureMap5 (untreated); (f) FeatureMap6 (untreated); (g) FeatureMap1 (treated); (h) FeatureMap2 (treated); (i) FeatureMap3 (treated); (j) FeatureMap4 (treated); (k) FeatureMap5 (treated); and (l) FeatureMap6 (treated).
Figure 14. Comparison of segmentation results of different networks: (a) original image; (b) manually annotated graphs; (c) FCN; (d) U-net; and (e) network described in this paper.
Table 1. Experimental environment configuration.

Experimental Environment | Configuration Instructions
Hardware environment | CPU: Intel(R) Core(TM) i7-6500, 3.2 GHz
Software environment | Operating system: Windows 10
Programming environment | PyTorch deep-learning framework
Table 2. Classification result confusion matrix.

Real Result | Predicted Positive | Predicted Negative
Positive example | TP (true positive) | FN (false negative)
Negative example | FP (false positive) | TN (true negative)
Table 3. Classification network test result statistics.

Image Type | Positive Samples | Negative Samples | Classification Accuracy (%) | Precision (%) | Recall (%)
Whole image | 1000 | 0 | 96.51 | / | 96.41
Small region map | 7142 | 56,858 | 99.17 | 95.24 | 97.42
Table 4. Segmentation network test result statistics.

Regional Category | Number | Segmentation Accuracy (%) | Under-Segmentation (%) | Over-Segmentation (%)
TP | 7036 | 94.55 | 1.72 | 1.89
FP | 106 | 92.26 | / | /
Table 5. Comparison of segmentation results of different networks.

Network | Segmentation Accuracy (%) | Under-Segmentation (%) | Over-Segmentation (%) | Average Single-Image Processing Time (ms)
FCN | 88.53 | 8.43 | 1.98 | 80.65
U-net | 91.34 | 5.57 | 1.92 | 78.58
Network described in this paper | 94.55 | 1.72 | 1.89 | 60.76
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
