Review

Evolution of Crack Analysis in Structures Using Image Processing Technique: A Review

by Zakrya Azouz, Barmak Honarvar Shakibaei Asli * and Muhammad Khan *
Centre for Life-Cycle Engineering and Management, School of Aerospace, Transport and Manufacturing, Cranfield University, Cranfield, Bedford MK43 0AL, UK
* Authors to whom correspondence should be addressed.
Electronics 2023, 12(18), 3862; https://doi.org/10.3390/electronics12183862
Submission received: 3 August 2023 / Revised: 2 September 2023 / Accepted: 6 September 2023 / Published: 12 September 2023
(This article belongs to the Special Issue Signal and Image Processing Applications in Artificial Intelligence)

Abstract

Structural health monitoring (SHM) involves the control and analysis of mechanical systems to monitor the variation of geometric features of engineering structures. Damage processing is one of the issues that can be addressed by using several techniques derived from image processing. There are two types of SHM: contact-based and non-contact methods. Sensors, cameras, and accelerometers are examples of contact-based SHM, whereas photogrammetry, infrared thermography, and laser imaging are non-contact SHM techniques. In this review, we focus on image processing algorithms that identify cracks and analyze their properties in order to detect damage that has occurred. Based on the literature review, several preprocessing approaches have been employed, including image enhancement, image filtering to remove noise and blur, and dynamic response measurement to predict crack propagation.

1. Introduction

The process of observing structures like buildings, bridges, and even aircraft through various sensors and technologies is commonly referred to as structural health monitoring (SHM). The key aim is to recognize any change that has occurred in the structure, making sure that the structure remains secure and functioning over an extended period. Cracks, which are fractures or gaps that develop within materials like metals, concrete, or ceramics, are a common occurrence and one of the critical concerns in SHM. A variety of factors can cause these fractures, including excessive stress, environmental impacts, material flaws, or natural wear. Detecting and monitoring cracks is a fundamental aspect of SHM. By utilizing SHM techniques to detect early signs of cracks, engineers and researchers can intervene promptly and implement necessary maintenance, contributing to the overall longevity and safety of the structure.
Crack incidence and propagation are two essential aspects that influence the structure’s performance [1]. When a structure is under load and the stress level surpasses a specific threshold, a crack initiates; as the applied load increases further, the crack propagates. Crack propagation can lead to a deterioration in performance and even failure of the structure [2]. Hence, crack propagation analysis is crucial for ensuring the quality and reliability of structures. As a result, many crack detection and propagation analysis techniques have been studied and developed over the previous decades in the domain of SHM and non-destructive assessment.
The conventional approach to contact detection necessitates the employment of sensors that are directly coupled with the structure to evaluate dynamic reactions, for instance, accelerometers, strain gauges, and fiber optic sensors, among others [3]. Nonetheless, various challenges abound. One such challenge is that wired contact sensors necessitate a time-consuming and labor-intensive installation process and require extensive maintenance to ensure long-term monitoring and upkeep [4]. Moreover, the magnitude of the structure and its intricate shape and dimensions exacerbate the situation [5,6].
In the past few years, there has been a significant focus on the advancement of technologies that revolve around the use of alternative approaches. These include cameras, unmanned aerial vehicles, and mobile phones tailored for structural health monitoring (SHM) [7]. Recently, there has been notable progress in the development of affordable vision-sensing technology. Through the application of image and video analysis, it has become feasible to perform high-quality condition assessments of structures from remote locations [8].
Kou et al. proposed a fully non-contact inspection technique using nonlinear laser ultrasonic testing to measure closed surface cracks [9]. Similarly, Zhu et al. used laser-induced ultrasound, proposing a differential two-wave mixing interferometer to detect cracks in metallic structures. Their proposed system provides a strong tool for contactless detection problems [10]. Another non-contact laser ultrasound approach to detecting cracks was proposed by Kang et al. using a hydrophone. The generated ultrasound signals propagated through the specimen and were received by the hydrophone in the water [11]. Gao et al. developed a phased-array laser ultrasonic technique using a fiber picosecond laser for crack detection along with analyzing four factors using the finite element method as a laser diffraction technology [12]. Wen et al. advanced electronic speckle pattern interferometry (ESPI) as a valuable tool for the expeditious identification of cracks in photovoltaic (PV) cells [13]. Kaczmarek et al. conducted an experiment to minimize the thermal resistance of speckle pattern, black body radiation, and heat haze. High-temperature digital image correlation techniques for the full-field strain were used to observe the evolution of crack length and compare the fracture behavior between 1200 °C and 20 °C [14]. During alloy processing, solidification cracks can form and spread over the entire measured surface. Wang et al. produced Al-Si-Zn-Mg-Cu samples using Laser Powder Bed Fusion from mixed AlSi10Mg and 7075 powders. The incorporation of silicon into an Al-Zn-Mg-Cu alloy was noted to have a substantial influence on the reduction of crack density, presumably stemming from the decrease in solidification range, enhanced fluidity of the molten phase, and lower coefficient of thermal expansion [15]. Wall et al. provided a comprehensive assessment of 16 solidification crack testing protocols that have been established for both casting and welding processes, encompassing both self-restrained and externally loaded configurations [16]. Liu et al. presented the development of a laser interferometric sensing measurement (ISM) system based on a 4R manipulator system for real-time, online detection of mechanical targets with high precision during processing [17]. Erka et al. investigated the use of laser scanners and images to detect and quantify surface damage on structures. They proposed a novel method that combines surface-normal-based damage detection with color information to enhance the identification of cracks, corrosion, and other surface defects [18].
Image processing techniques have been applied extensively to crack detection in a variety of settings. The fundamental framework for detecting cracks through image processing entails acquiring high-quality images with a camera, smartphone, unmanned aerial vehicle (UAV), or other imaging device. This is followed by preprocessing steps such as resizing, denoising, segmentation, and morphology [19], all of which are intended to remove shadows and prepare the image for crack detection.
The methodology for detecting cracks employs various image processing approaches such as edge detection [20], segmentation, and pixel analysis to effectively delineate or separate the section of the image that contains the crack [21,22]. The estimation of parameters [23] such as length, width, and depth of the detected crack assists in assessing its seriousness.
The application of machine learning for crack detection through image processing is a crucial aspect in the automated identification and localization of surface faults in various infrastructure elements including bridges, buildings, and concrete structures [24]. This approach provides an efficient and accurate alternative to manual inspection. The integration of machine learning algorithms, such as Convolutional Neural Networks (CNN), Support Vector Machines (SVM), and Artificial Neural Networks (ANN), with image processing techniques has yielded promising outcomes in crack detection and classification [25]. These methods extract significant features from digital images and generalize classification boundaries to classify different kinds of cracks and defects. The utilization of machine learning in crack detection guarantees improved performance results, robustness, and reliability in evaluating and determining the condition of infrastructure elements [26,27,28,29,30].
Video-based structural health monitoring (SHM) techniques have gained popularity as a non-contact and cost-effective method for monitoring structural systems. These techniques utilize video technology for the purpose of capturing the dynamics of structures during their dynamic response [31]. By examining the alterations in pixel intensities induced by structural vibrations, virtual visual sensors (VVS) can be employed to determine the fundamental frequency of vibration [32,33]. To enhance data accuracy and spatial density, various optical measurement techniques have been combined with video technology [34]. Real-time monitoring of structural damage is possible with high-speed cameras and artificial intelligence algorithms [35]. These video-based SHM techniques have shown promising results in laboratory experiments [36] and in practical monitoring examples of bridges [37] and other civil structures [38]. If further developed, these techniques could revolutionize the field of earthquake engineering and structural health monitoring, providing valuable data for the maintenance and repair of structures.
Crack propagation has been detected using video cameras and digital photos through different methods such as image processing techniques, digital image correlation, and image analysis in 2D or 3D. These approaches are classified into three categories. The first directly monitors crack propagation and estimates the structure’s health. The second determines the dynamic response parameters of the structure, including vibration amplitudes and natural frequencies, and predicts crack propagation. The third uses machine learning approaches to obtain possible damage predictions from the data measured by the first two. A variety of applications are used to evaluate cracked images from digital photos and video cameras, utilizing methods such as pixel detection, subpixel, threshold, binarization, RGB models, target tracking, and template matching.
The aim of this paper is to provide a summary of research community experience with vision-based crack detection, concentrating on health monitoring applications. This review paper delves comprehensively into crack detection methodologies. The paper is organized as follows: Section 2 covers image processing techniques for identifying cracks. This section covers various crucial steps including image acquisition, preprocessing, and a comprehensive exploration of edge detection methods for crack detection in addition to traditional segmentation methods, etc. Section 3 sheds light on the pivotal role of machine learning algorithms in crack detection through visual data, emphasizing the impact of enhancing accuracy by reviewing some of the machine learning algorithms such as support vector machine (SVM), decision tree algorithm, etc. Section 4 explores integrating image methods with dynamic response measurements for robust assessments. This involves the utilization of techniques such as motion magnification, multithresholding, and diverse edge detection methods, all harmonized with target tracking structures. The dynamic response measurements are used for the prediction of damages (cracks). Finally, Section 5 concludes the paper with a summary and outlook of future directions of crack detection vision based on SHM.

2. Crack Detection Based on Image Processing Techniques

Crack detection based on image processing techniques relies on images or visual data to detect and locate cracks or fractures in various materials or structures. This approach utilizes computer algorithms and methods to analyze the visual information captured through images and then determines the presence, size, shape, and location of cracks within the material or object. Various methods of image processing are Canny edge detection, Otsu method, histogram equalization method, morphological operation, segmentation, and Sobel edge detection method. The different image processing techniques utilized to detect cracks are covered in this section [39]. Figure 1 provides the main structure for a crack detection method based on image processing. First, the high-resolution images are collected, which are captured by the camera or another imaging device [40]. The images are then preprocessed, where resizing the image, denoising, segmentation, morphology (smoothing edges), and other techniques for mitigating shadows in images may be employed. In certain instances, grayscale or binary conversion may be necessary for crack detection. The outcome derived from the preprocessing stage is subsequently implemented in the process of detecting cracks. This process involves techniques of image processing methodologies such as edge detection, segmentation, or pixel analysis to accentuate or divide the fractured region within the image. The determination of parameters necessitates the computation of distinct characteristics of the identified crack, including its depth, length, and width. These measures serve as aids in the process of decision-making concerning the seriousness of a crack [28].

2.1. Image Acquisition

The process of collecting or obtaining digital pictures from different sources such as cameras, scanners, or other imaging equipment is referred to as image acquisition. The process involves converting real-world visual information into a digital format that computers can understand and manipulate. Depending on the equipment used, different factors such as exposure duration, focus, resolution, color settings, and more can be changed throughout the image capture process. After images are captured, they may be saved, analyzed, and improved by using a range of image processing techniques. Digital images may be obtained in a variety of ways, including utilizing Unmanned Aerial Systems (UAS) or a digital camera, as mentioned in [41,42,43,44,45]; accessing existing datasets [41]; or using a smart mobile phone. Figure 2 shows different types of image acquisition.

2.2. Image Preprocessing

Image quality often degrades for several reasons, including varying lighting conditions such as sunny or cloudy skies, random textures, uneven lighting, irregular shadows, and watermarks. These aspects can have a substantial impact on the accuracy of crack identification using image processing techniques. Image preprocessing primarily consists of lowering the negative impacts of those influences, which might increase image processing efficacy. Preprocessing is a significant step in image processing because it enhances the quality of the input images and reduces noise, making subsequent processing stages more accurate in recognizing and analyzing fractures in structures. The preprocessing aims are to increase contrast, minimize noise, and optimize the picture for feature extraction and analysis.

2.2.1. Image Cropping and Scaling

In image cropping, a specific region of interest is selected and extracted from an image while the surrounding area is discarded. Image resizing, on the other hand, changes an image’s dimensions (width and height). It can increase or decrease the total size of the image while maintaining the aspect ratio. Sometimes, the image dataset is large and high resolution, resulting in a time-consuming image analysis process. In such cases, it is advisable to crop and resize the images appropriately to reduce the processing time. Cropping and scaling are both typical image processing operations that may be accomplished using a variety of software programs or computer libraries. In the field of crack detection, image cropping and resizing techniques are employed to enhance the quality, efficiency, and reliability of crack analysis algorithms. By appropriately cropping and resizing images, researchers can reduce noise, eliminate irrelevant background information, and ensure consistent image dimensions. These preprocessing steps contribute to the accurate identification and characterization of cracks, leading to more reliable crack detection results. The authors in [46] used images from several sources that had various faults. The images were then cropped and scaled to create the dataset that was utilized to train the proposed algorithm. Enlarging the dataset or applying artificial data expansion techniques, such as rotating or cropping images, can effectively mitigate overfitting caused by a limited dataset size [47]. A 256 × 256 pixel sliding window was used to crop large-scale images acquired from the laboratory and earthquake locations. Bilinear interpolation was used to reduce the size of images acquired from the Internet and the reference to 256 × 256 pixels [48]. Ye et al. [49] built a dataset for training and validation with 762 raw training images cropped to 80 × 80 pixels. Han et al. [50] decreased image resolution to save memory space; catenary images with a resolution of 6600 × 4400 pixels were reduced to 660 × 440 pixels before being sent to the network. Ren et al. [51] avoided using raw images with 4032 × 3016 resolution, which led to excessive GPU memory usage and a failed model, and instead cropped images to a size of 512 × 512, enabling training with many batches of network input at the same time. Figure 3 illustrates data annotation and augmentation.
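The sliding-window cropping described above can be sketched as follows. This is a minimal illustration, not the procedure of any cited work: the function name `sliding_window_boxes`, the default stride, and the choice to shift the last window back so that it stays inside the image are all assumptions.

```python
def sliding_window_boxes(height, width, tile=256, stride=256):
    """Return (top, left, bottom, right) crop boxes that cover an image.

    Hypothetical helper: the last window along each axis is shifted back so
    that it ends exactly at the image border (an assumed edge policy).
    """
    if height < tile or width < tile:
        raise ValueError("image is smaller than the tile size")

    def starts(extent):
        positions = list(range(0, extent - tile + 1, stride))
        if positions[-1] != extent - tile:
            positions.append(extent - tile)  # extra window flush with the border
        return positions

    return [(top, left, top + tile, left + tile)
            for top in starts(height)
            for left in starts(width)]


# A 512 x 768 image is covered by a 2 x 3 grid of 256 x 256 tiles.
boxes = sliding_window_boxes(512, 768)
```

A stride smaller than the tile size would produce overlapping crops, which is one common way to enlarge a small training set.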

2.2.2. Noise and Blur Reduction

In crack detection approaches, image denoising refers to reducing undesirable noise in images of structures in order to improve the visibility of cracks. Cracks in structures can be difficult to identify and measure, particularly when they are tiny or in hard-to-reach regions. Image denoising methods can enhance image quality by removing noise and other artifacts that might hide or distort the appearance of fractures. Several kinds of filters are commonly used for this purpose, including Gaussian, median, and Wiener filters. Figure 4 shows the imaging system that generates the acquired image from its original form through a blur model and added noise.
This figure presents an image model using the point spread function (PSF) symbolized by  h ( x , y )  and adding random noise  n ( x , y )  to generate the obtained image as follows:
$$g(x, y) = f(x, y) * h(x, y) + n(x, y), \tag{1}$$
where  f ( x , y )  is the original image,  g ( x , y )  is the acquired image, and ∗ represents the 2D convolution. To recover the original image from its noisy and blurry version, we would use an inverse filter, which involves a blind deconvolution part and a suitable denoising filter, as shown in Figure 5.
The reconstruction process could be performed in the Fourier domain as follows:
$$G(u, v) = F(u, v)\, H(u, v) + N(u, v), \tag{2}$$
where $F(u, v)$, $G(u, v)$, $H(u, v)$, and $N(u, v)$ are the 2D Fourier transforms of the original image, acquired image, PSF, and noise, respectively. By using the Wiener filter, we can estimate the original image from its degraded version using the following formula:
$$\tilde{F}(u, v) = \left[ \frac{1}{H(u, v)} \, \frac{|H(u, v)|^2}{|H(u, v)|^2 + K} \right] G(u, v), \tag{3}$$
where K is a specified constant.
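As a short worked check on the Wiener formula, the identity $|H|^2 = H H^{*}$ (with $H^{*}$ the complex conjugate of $H$) gives an equivalent form that avoids dividing by $H$. The notation $H^{*}$ is introduced here only for this illustration.

```latex
\tilde{F}(u, v)
  = \frac{1}{H(u, v)} \, \frac{H(u, v)\, H^{*}(u, v)}{|H(u, v)|^2 + K} \, G(u, v)
  = \frac{H^{*}(u, v)}{|H(u, v)|^2 + K} \, G(u, v).
% Limiting cases:
%   K = 0 (and H \neq 0):  \tilde{F} = G / H, the plain inverse filter.
%   |H|^2 \ll K:           \tilde{F} \approx (H^{*} / K)\, G, a small gain.
```

The second limiting case shows why the constant $K$ matters: at frequencies where the PSF response is weak, the gain stays bounded instead of amplifying the noise term as a plain inverse filter would.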
Hoang et al. employed median-filter-based noise suppression and cleaning of small objects to help with crack detection [52]. In image-based concrete crack detection in tunnels, median filtering with a window size of 5 × 5 was employed as a typical preprocessing step to decrease noise and improve the results of further processing. Stentoumis et al. [53] proposed a line enhancement that makes use of the intensity features of cracks that generate “salt and pepper” noise. The enhancement stage was followed by a noise reduction process that makes use of a median filter. Chen et al. [54] removed noise by selecting a threshold on region area. Ni et al. [55] enhanced the contrast between cracks and background by using a Gaussian filter to reduce the noise. A saliency map is a primary approach used in preprocessing, while other methods are employed for removing noise and accentuating cracks [56]. The same research [55] investigated crack measuring systems using image processing approaches on Android phones. To reduce noise and improve the input, thresholding methods and morphological operators were used. Otero et al. [57] removed noise blobs in two stages. In the first stage, the areas of all image blobs are normalized into the range [0, 1]. In the second stage, noise blobs are identified using a threshold value of 0.25: any blob with a normalized area smaller than this threshold is deleted from the image. Vijayan et al. [58] converted an RGB image to a grayscale image and subsequently eliminated noise by applying median filters. Following this, histogram equalization was tested on the input image to ascertain whether blur or noise in the image could be recovered using the Wiener filter.
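To make the median filtering step concrete, the following is a minimal pure-Python sketch on a nested-list grayscale image. The function name `median_filter` and the border handling (replicating edge pixels) are assumptions for illustration; the cited works do not specify their implementations.

```python
from statistics import median


def median_filter(image, size=3):
    """Replace each pixel by the median of its size x size neighbourhood.

    `image` is a list of rows of gray values. Pixels outside the image are
    taken from the nearest border pixel (an assumed border policy).
    """
    if size % 2 == 0:
        raise ValueError("window size must be odd")
    height, width = len(image), len(image[0])
    radius = size // 2
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                image[min(max(y + dy, 0), height - 1)][min(max(x + dx, 0), width - 1)]
                for dy in range(-radius, radius + 1)
                for dx in range(-radius, radius + 1)
            ]
            output[y][x] = median(window)
    return output


# A single "salt" pixel on a flat background is removed by a 3 x 3 window.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter(noisy, size=3)
```

Passing `size=5` gives the 5 × 5 window mentioned in the text. Because the median ignores isolated outliers, thin impulsive noise is suppressed while step edges are largely preserved.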

2.2.3. Image Enhancement

Image enhancement is the process of increasing an image’s visual quality. Examples include spatial domain techniques that operate on pixel values, such as brightness and contrast adjustment, color correction, and filtering, and frequency domain techniques that operate on image frequency components, such as high-pass and low-pass filtering. Edge enhancement, sharpening, and deblurring are further examples. Talab et al. [59] eliminated noise from concrete crack images and applied different adaptive filtering and contrast enhancement methods to help identify the cracks. Liu et al. [45] developed a method based on multiscale enhancement and visual characteristics. To begin, a multiscale enhancement strategy based on a guided filter and gradient information was presented to cope with the effect of low contrast. The adaptive threshold technique was then used to generate the binary image. Finally, the cracks were purified using a mix of morphological processing and visual characteristics [60]. A novel crack detection approach that combines the Hat-transform with HSV thresholding was presented. An algorithm was created that blends the outputs of these two filters, resulting in a better output image with improved crack identification characteristics. Histogram equalization was used by Cho et al. to increase the identification rate of black-and-white images [61]. This was followed by the use of an adaptive binary approach, which automatically selected an ideal threshold based on the image’s attributes. The Min–Max Gray Level Discrimination (M2GLD) image enhancement approach was used to improve the Otsu method for fracture identification. The resultant model was created as a tool for properly identifying crack objects and analyzing their properties, such as area, perimeter, breadth, length, and orientation. The experimental findings validated the M2GLD technique’s accuracy in identifying cracks, and it was found to improve the performance of the Otsu approach. A drawback of this study is that users must tune two parameters, namely, the margin and ratio parameters [62]. Top-hat and bottom-hat filtering techniques are used in the preprocessing phase in order to enhance image contrast. Top-hat filters bring out bright objects of interest against dark backgrounds, whereas bottom-hat filters bring out dark objects of interest against bright backgrounds. This novel threshold selection technique is based on relative standard deviation, which is significant in image segmentation [63].
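Histogram equalization, mentioned above as a contrast enhancement step, can be sketched in a few lines. This is a minimal version of the standard cumulative-histogram mapping for integer gray levels; the function name `equalize_histogram` is hypothetical and the code is not taken from any cited work.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of an integer image over [0, levels - 1]."""
    pixels = [p for row in image for p in row]
    histogram = [0] * levels
    for p in pixels:
        histogram[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest level present
    n = len(pixels)
    if n == cdf_min:  # flat image: nothing to stretch
        return [row[:] for row in image]

    lookup = [max(0, round((c - cdf_min) / (n - cdf_min) * (levels - 1)))
              for c in cdf]
    return [[lookup[p] for p in row] for row in image]


# A low-contrast patch (values 100..102) is stretched to the full range.
stretched = equalize_histogram([[100, 100], [101, 102]])
```

On low-contrast crack images, this kind of mapping widens the gap between crack and background intensities before thresholding.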

2.3. Edge Detection Methods for Crack Detection

The term “edge” refers to the region of significant transition in image intensity or contrast. The detection of regions that possess strong intensity contrasts is commonly known as edge detection. It is plausible for a particular pixel to exhibit variability, leading to a possible misconception of it as an edge. This can occur in conditions with poor lighting or high levels of noise, both of which can display features similar to those of an edge. Therefore, it is imperative to exercise greater caution when identifying variations that may appear as edge points (pixels) [64]. The utilization of edge detection techniques enables their application in the detection of fractures. The Sobel operator, Roberts operator, Prewitt operator, and Canny operator are among the frequently used edge detection operators. The effects of these varied operators on edges of the same type differ significantly. The categorization of edge detection algorithms consists of two unique classifications, namely, Gradient-based (first derivative) and Gaussian-Based (second derivative) [65].

2.3.1. Roberts Edge Detection

The Roberts Cross operator is a rudimentary, two-dimensional spatial gradient measurement technique utilized on images. The output pixel values indicate the approximated absolute magnitude of the spatial gradient of the input image at the given point. The operator encompasses a duo of  2 × 2  convolution kernels, illustrated in Equations (4) and (5). One kernel is identical to the other, merely rotated by 90° [66].
$$G_x = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \tag{4}$$
$$G_y = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. \tag{5}$$
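A minimal sketch of the Roberts Cross operator on a nested-list grayscale image follows. It assumes the common sign convention in which one kernel is [[0, 1], [-1, 0]] and the other is [[1, 0], [0, -1]]; the function name `roberts_cross` is hypothetical. The gradient magnitude does not depend on which sign convention is chosen.

```python
def roberts_cross(image):
    """Return the Roberts Cross gradient magnitude of a grayscale image.

    The output is one row and one column smaller than the input because
    each 2 x 2 kernel needs a full 2 x 2 neighbourhood.
    """
    height, width = len(image), len(image[0])
    magnitude = [[0.0] * (width - 1) for _ in range(height - 1)]
    for y in range(height - 1):
        for x in range(width - 1):
            top_left, top_right = image[y][x], image[y][x + 1]
            bottom_left, bottom_right = image[y + 1][x], image[y + 1][x + 1]
            gx = top_right - bottom_left   # kernel [[0, 1], [-1, 0]]
            gy = top_left - bottom_right   # kernel [[1, 0], [0, -1]]
            magnitude[y][x] = (gx * gx + gy * gy) ** 0.5
    return magnitude


# A vertical step edge between columns 1 and 2 gives a response only there.
step = [[0, 0, 10, 10] for _ in range(3)]
response = roberts_cross(step)
```

Because the kernels are only 2 × 2, the operator is cheap but sensitive to noise, which is one reason a smoothing step usually precedes it.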

2.3.2. Canny Edge Detection

The Canny technique is an essential method for detecting edges in images, which involves isolating noise from the image prior to detecting the edges. This method is particularly useful as it does not affect the features of the edges in the image and allows for the application of a tendency to find the edges and the critical value for the threshold [67]. The algorithmic steps involved in the Canny edge detection technique are as follows:
Step 1: The image $f(x, y)$ is convolved with a Gaussian function $G_\sigma(x, y)$ to generate a smooth image, $\hat{f}(x, y)$, which is defined as follows:
$$\hat{f}(x, y) = f(x, y) * G_\sigma(x, y), \tag{6}$$
where
$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{x^2 + y^2}{2\sigma^2}\right). \tag{7}$$
Furthermore, $G_\sigma(x, y)$ represents a Gaussian function characterized by the variance $\sigma^2$.
Step 2: The first difference gradient operator is applied to compute the edge strength, from which the edge magnitude and direction are obtained. The following matrices are Sobel operators and use a pair of 3 × 3 convolution masks (see Equations (8) and (9)):
$$G_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} \tag{8}$$
$$G_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}. \tag{9}$$
Step 3: Non-maximal suppression is applied to the gradient magnitude.
Step 4: A threshold is applied to the non-maximal suppression image.
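The Gaussian smoothing in Step 1 is usually applied with a small sampled kernel. The sketch below samples the 2D Gaussian of Equation (7) on an integer grid and renormalizes it so that the weights sum to one. The function name `gaussian_kernel` and the default radius of about three standard deviations are assumptions, not part of the reviewed methods.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Sample a 2D Gaussian on a (2 * radius + 1)^2 grid and normalize it."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        radius = max(1, math.ceil(3 * sigma))  # assumed truncation rule
    kernel = [
        [math.exp(-(x * x + y * y) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)
         for x in range(-radius, radius + 1)]
        for y in range(-radius, radius + 1)
    ]
    total = sum(sum(row) for row in kernel)
    # Renormalize so that smoothing does not change the mean brightness.
    return [[value / total for value in row] for row in kernel]


kernel = gaussian_kernel(1.0)  # 7 x 7 kernel for sigma = 1
```

A larger `sigma` removes more noise but also blurs thin cracks, so the value is typically tuned to the expected crack width in pixels.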
Figure 6 shows the result of edge detection for the test cracked image by using the Canny and hyperbolic tangent algorithms.
Various researchers have used Canny edge detection to detect cracks. Pereira and Pereira [68] introduced a computer vision framework that incorporates a camera and a laser rangefinder to gauge the width of cracks precisely at considerable distances and from any perspective. The precision of the measurements is influenced by a range of challenges, such as the intricacy of calculating crack edges and the non-uniformity of data in the image. To overcome these obstacles, the researchers employed a blend of the Canny edge detection method and an upgraded U-net convolutional network algorithm to isolate the cracks. Syahrian et al. [69] implemented the Canny edge detection method as an image processing technique to identify cracks within a pipe. The algorithm, comprising multiple steps, is capable of detecting edges in an image. Using this method, the edges or lines of any cracks within the pipe were identified, and the resulting differences in the image allowed only the cracks to be presented, which were subsequently analyzed. The five processing steps involved in the Canny edge detection approach (smoothing, gradient determination, non-maximum suppression, double thresholding, and edge tracking by hysteresis) were employed in detecting the cracks inside the pipe. The Canny edge detection method was highly effective in detecting cracks, providing additional insights into both cracks and noise [43]. Figure 7 shows the result of the Canny edge detection processes.
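The last two Canny steps named above, double thresholding and edge tracking by hysteresis, can be sketched as follows. This is a minimal illustration on a nested-list gradient magnitude map, assuming 8-connectivity; the function name `hysteresis_threshold` and the threshold values in the example are hypothetical.

```python
def hysteresis_threshold(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) connected to them."""
    height, width = len(magnitude), len(magnitude[0])
    edges = [[False] * width for _ in range(height)]

    # Seed the search with all strong pixels.
    stack = [(y, x) for y in range(height) for x in range(width)
             if magnitude[y][x] >= high]
    for y, x in stack:
        edges[y][x] = True

    # Grow edges into 8-connected neighbours that pass the low threshold.
    while stack:
        y, x = stack.pop()
        for ny in range(max(0, y - 1), min(height, y + 2)):
            for nx in range(max(0, x - 1), min(width, x + 2)):
                if not edges[ny][nx] and magnitude[ny][nx] >= low:
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges


# The weak pixels next to the strong one are kept; the isolated weak pixel is not.
gradient = [[0, 0, 0, 0, 0],
            [9, 5, 5, 0, 5],
            [0, 0, 0, 0, 0]]
edge_map = hysteresis_threshold(gradient, low=4, high=8)
```

Using two thresholds lets faint sections of a crack survive when they are attached to a clearly visible section, while isolated weak responses from noise are discarded.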

2.3.3. Sobel Edge Detection

The Sobel operator performs a 2D spatial gradient measurement on an image, emphasizing regions of high spatial frequency that correspond to edges. The image is converted to grayscale, and the absolute gradient magnitude is calculated at each pixel. Equations (8) and (9) give the Sobel operator masks, which are the same as those used in the Canny method.
The Sobel kernels produce separate estimates of the gradient component in the x and y orientations, which are subsequently combined to obtain the gradient magnitude. The gradient magnitude can be computed using Equation (10) or approximated using Equation (11), whereas the orientation can be determined by applying Equation (12):
$$|G| = \sqrt{G_x^2 + G_y^2} \tag{10}$$
$$|G| \approx |G_x| + |G_y| \tag{11}$$
$$\theta = \arctan\left(\frac{G_y}{G_x}\right), \tag{12}$$
where  G x  and  G y  are the components on the x and y axis;  θ  is the gradient direction.
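The Sobel gradient computation can be sketched in pure Python as below. The kernel signs follow one common convention (an assumption, since conventions differ between references), the direction is computed with `atan2(G_y, G_x)` so that a zero x component causes no division error, and the names `sobel_gradient` and `correlate3x3` are hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]


def correlate3x3(image, kernel, y, x):
    """Weighted sum of the 3 x 3 neighbourhood centred on (y, x)."""
    return sum(kernel[j][i] * image[y + j - 1][x + i - 1]
               for j in range(3) for i in range(3))


def sobel_gradient(image):
    """Return (magnitude, L1 approximation, direction) for interior pixels."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(1, height - 1):
        row = []
        for x in range(1, width - 1):
            gx = correlate3x3(image, SOBEL_X, y, x)
            gy = correlate3x3(image, SOBEL_Y, y, x)
            row.append((math.hypot(gx, gy),      # Euclidean magnitude
                        abs(gx) + abs(gy),       # cheaper L1 approximation
                        math.atan2(gy, gx)))     # gradient direction in radians
        result.append(row)
    return result


# Vertical step edge: the gradient points along +x, so the direction is 0 rad.
step = [[0, 0, 10, 10] for _ in range(3)]
gradients = sobel_gradient(step)
```

The L1 form avoids the square root and is often used when speed matters more than an exact magnitude.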
In [70], edge detection is employed to extract and visualize crack information from thermal images in eddy current thermography. The authors present an approach for flaw visualization using an edge detection operator (EDO) for dynamic detection. Four different EDOs are used to detect edges in thermal images; the outcomes of the four operators are then assessed, and the Roberts operator is shown to exhibit the most effective detection performance. Furthermore, the Sobel and LoG operators segment images in a comparable manner while preserving less noise [43]. In [71], cracks were extracted from an image containing a dark noise block resembling a shadow. That paper compares its proposed method with traditional methods such as Sobel, Laplacian, the Canny edge detector with two threshold values, Valley edge detection, Otsu, Minimum Spanning Tree (MST), a dynamic thresholding algorithm, clustering analysis, and Fuzzy C-Means (FCM). Although the image presented in Figure 8 does not depict a crack obscured by shadowing, the black noise block covers roughly 40% of the overall image area and lies on one of the two cracks. The Sobel edge detector detects the edges of both the cracks and the black noise block.
Wang et al. [72] employed a multistage filtering approach for surface crack detection. Specifically, Sobel filtering and multiple median filtering were used to eliminate residual noise, and the Otsu method was adopted to separate foreground and background regions. Finally, a hybrid filtering process was employed to identify the cracks accurately. It should be noted that the non-crack zones were left in the input data. Figure 9b shows the segmentation of the crack area obtained with the improved threshold segmentation algorithm, which combines local adaptive Otsu with Sobel edge gradient detection.
A three-step process was used by Talab et al. [59] to calculate the surface area of cracks. The initial phase entailed the conversion of the image to a grayscale image, followed by the application of Sobel’s filter for the detection of cracks. The next step involved the classification of images into foreground and background images, followed by noise removal using Sobel filtering. Subsequently, Otsu’s technique was applied to identify cracks. The researchers employed a real dataset and achieved an accuracy rate of over 85%.

2.3.4. Prewitt Edge Detection

The Prewitt operator, like the Sobel operator, is used to estimate both the magnitude and orientation of edges. It computes the edge responses in the horizontal and vertical directions with two $3 \times 3$ convolution masks, $P_x$ and $P_y$, commonly given by the following matrices [73]:
$$P_x = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} \tag{13}$$
$$P_y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} \tag{14}$$
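The behavior of the two Prewitt masks can be illustrated with a short sketch. The sign convention of the masks, the helper `apply_mask`, and the synthetic image are assumptions chosen for illustration, not taken from the reviewed works.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Prewitt masks with a common sign convention (assumed; texts differ on sign).
P_X = np.array([[-1, 0, 1],
                [-1, 0, 1],
                [-1, 0, 1]], dtype=float)
P_Y = np.array([[-1, -1, -1],
                [ 0,  0,  0],
                [ 1,  1,  1]], dtype=float)

def apply_mask(image, mask):
    """Cross-correlate a 3x3 mask with the image (no padding)."""
    windows = sliding_window_view(image, mask.shape)  # shape (H-2, W-2, 3, 3)
    return np.einsum("ijkl,kl->ij", windows, mask)

# Synthetic image with a horizontal edge: dark top rows, bright bottom rows.
img = np.zeros((6, 5))
img[3:, :] = 1.0

gx = apply_mask(img, P_X)  # zero everywhere: intensity does not change along x
gy = apply_mask(img, P_Y)  # non-zero only on output rows straddling the edge
edge_rows = np.where(np.abs(gy).max(axis=1) > 0)[0]
```

Unlike the Sobel kernels, the Prewitt masks weight all three rows (or columns) equally, so they apply no extra smoothing weight to the central pixel.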
Hoang and Nguyen [52] developed an image-processing-based approach for automatically identifying fractures on concrete wall surfaces. The research introduces MO-EDCR, the “Metaheuristic Optimized Edge Detection model for concrete wall Crack Recognition”, a novel strategy that uses the Roberts, Prewitt, Canny, and Sobel algorithms as edge detection methods to expose fracture patterns in concrete walls. The differential flower pollination approach (DFP) is also used as a metaheuristic in the research to optimize the image-processing-based crack detection model. According to the experimental data, the proposed technique applying the Prewitt algorithm yields an acceptable prediction outcome with an 89.95% classification accuracy rate and an area under the curve of 0.90. Table 1 illustrates that Prewitt has a shorter processing time than the other models.

2.3.5. Laplacian of Gaussian (LoG) Operator

Edge points in an image can be detected by identifying zero-crossings in the second derivative of the image intensity. Unfortunately, the second derivative is highly susceptible to noise, so noise filtering is needed before edge detection. To this end, the LoG operator performs Gaussian smoothing before applying the Laplacian [74]. The image is first convolved with a Gaussian filter, which smooths the image and reduces noise. Because smoothing widens the edges, only the point with the local maximum gradient is considered an edge; the Laplacian, a second-derivative operator, is used to locate it, since the second derivative crosses zero where the first derivative peaks. To avoid spurious edge pixels, only zero-crossings whose first-order differential value exceeds a given threshold are deemed edge points. The LoG operator’s output, denoted by $h(x, y)$, is obtained by convolution.
$$h(x, y) = \nabla^2 \left[ G(x, y) \ast f(x, y) \right] = \left[ \nabla^2 G(x, y) \right] \ast f(x, y), \tag{15}$$
$$\nabla^2 G(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \exp\left( -\frac{x^2 + y^2}{2\sigma^2} \right), \tag{16}$$
where Equation (16) is normally called the Mexican hat operator.
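The following sketch samples the Mexican hat function of Equation (16) on a discrete grid and uses it to locate a step edge through a thresholded zero-crossing. The kernel size, the value of $\sigma$, the zero-sum correction, the threshold, and the synthetic image are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def log_kernel(sigma, size):
    """Sample Equation (16) on a size x size grid centred at the origin."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    r2 = x**2 + y**2
    kernel = (r2 - 2 * sigma**2) / sigma**4 * np.exp(-r2 / (2 * sigma**2))
    return kernel - kernel.mean()  # force a zero sum so flat regions give ~0

def filter_valid(image, kernel):
    """Apply the (symmetric) kernel without padding."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

kernel = log_kernel(sigma=1.0, size=7)

# Vertical step edge between columns 7 and 8 of a 9 x 16 image.
img = np.zeros((9, 16))
img[:, 8:] = 1.0
response = filter_valid(img, kernel)

# Zero-crossing: a sign change between horizontal neighbours whose jump
# exceeds a threshold (1.0 here is an arbitrary illustrative value).
row = response[1]
sign_change = row[:-1] * row[1:] < 0
strong = np.abs(row[:-1] - row[1:]) > 1.0
crossings = np.where(sign_change & strong)[0]
edge_column = crossings + kernel.shape[1] // 2  # map back to image columns
```

The jump threshold plays the role of the first-order differential test described above: it discards sign changes caused by numerical noise in flat regions.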
In [75], a camera-equipped mobile robot collects images of the bridge deck, and the Laplacian of Gaussian (LoG) algorithm is used to detect cracks. The LoG algorithm is a suitable choice in this context because cracks appear as regions of rapid intensity change on the bridge deck.
Lim et al. [76] proposed a system that uses a mobile robot to inspect bridge deck cracks with a high-resolution camera and a Laplacian of Gaussian algorithm. The system also includes a complete coverage path planning algorithm for the robot to ensure all images are collected. By applying the LoG filter to the bridge deck image, zero-crossings can be obtained, which correspond to the locations of cracks. The size of the Gaussian used for the smoothing stage of the LoG operator affects the resulting zero crossings. As the smoothing is increased, fewer zero-crossing contours will be found, and those that remain will correspond to features of a larger scale in the image. Therefore, the LoG algorithm is suitable for detecting cracks of different sizes on the bridge deck. Figure 10 shows the results of the applied LoG algorithm.
Dorafshan et al. [42] developed a generic image processing algorithm for crack detection. The algorithm performs filter design, edge detection, image enhancement, and segmentation, with the objective of comparing different edge detectors on a uniform basis. Six filters were used for edge detection: four spatial domain filters (Roberts, Prewitt, Sobel, and Laplacian of Gaussian) and two frequency domain filters (Butterworth and Gaussian). Each image was labeled as defective or sound based on a physical inspection of the concrete surface aided by a crack microscope. The detection results were then classified as true positives (TPs), true negatives (TNs), false positives (FPs), and false negatives (FNs), and the filters were assessed using Accuracy (Ac), Precision (Pr), and Missed Crack Width (MCW). The performance of the six filters was evaluated by applying the algorithms to fifty images of defective and sound concrete and comparing the results in terms of accuracy, precision, minimum visible crack width, calculation time, and signal-to-noise ratio. According to the findings, the Laplacian of Gaussian filter in the spatial domain is recommended for prospective real-time crack detection using UAS. Figure 11 shows the original image and Figure 12 shows the result of edge detection for the test cracked image using different edge detection algorithms.
Table 2 shows that the LoG filter produced the best accuracy (92%) and precision (88%), the narrowest minimum detectable crack width, and the quickest processing time (1.18 s per image) for edge identification in the spatial domain.
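For reference, accuracy and precision can be computed from the four confusion counts as sketched below. The formulas are the usual textbook definitions, which are assumed here to match those used in [42]; the counts are hypothetical and are not results from that study.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all images that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    """Fraction of images flagged as cracked that really are cracked."""
    return tp / (tp + fp)

# Hypothetical counts for 50 images (illustrative only, not results from [42]).
tp, tn, fp, fn = 20, 22, 3, 5
ac = accuracy(tp, tn, fp, fn)
pr = precision(tp, fp)
```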

2.4. Traditional Segmentation Methods

Segmenting an image into multiple parts is a fundamental task in digital image processing. These segments, also known as image objects, are sets of pixels characterized by specific features. The primary objective of segmentation is to simplify or change the representation of an image into a form that is more meaningful and easier to analyze. A typical application is the detection of objects and boundaries, such as curves and lines. More precisely, image segmentation assigns a label to every pixel in an image such that pixels with the same label share certain properties.

Thresholding-Based Segmentation

The thresholding-based segmentation process can be regarded as the process of separating foreground from background. The following algorithms could be considered as thresholding-based segmentation approaches.
(a) Global thresholding
Global thresholding is a fundamental image segmentation technique used to distinguish objects or regions of interest from the background in a grayscale image. It is also known as global threshold segmentation or simple thresholding. It entails choosing a single threshold value that divides pixels into two groups: foreground and background.
Shen [77] presented a method to detect road cracks from video images. Using an object recognition technique, crack images were selected from the moving video, and a skeleton extraction algorithm was used to track the cracks. The algorithm converts the image to grayscale and binarizes it with a global threshold. The image was then segmented and identified using MATLAB 9.7.0 (R2019b, Update 4).
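A minimal sketch of grayscale conversion followed by global thresholding is given below. The luma weights are the common ITU-R BT.601 values; the frame, the threshold of 128, and the function names are hypothetical and do not reproduce the MATLAB implementation in [77].

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (ITU-R BT.601 luma weights)."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def global_threshold(gray, t):
    """Mark as foreground every pixel darker than the single threshold t."""
    return gray < t

# Hypothetical 3 x 4 RGB frame: bright pavement with a dark diagonal crack.
rgb = np.full((3, 4, 3), 200.0)
for r, c in [(0, 0), (1, 1), (2, 2)]:
    rgb[r, c] = (40.0, 45.0, 50.0)

gray = to_grayscale(rgb)
crack_mask = global_threshold(gray, t=128.0)
```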
(b) Otsu thresholding
Otsu thresholding is a technique for automatically determining an optimal threshold value for image segmentation. It seeks the threshold that minimizes the intra-class variance of the foreground and background pixels.
Crack detection on airport runway pavement is frequently affected by signs and markings, which may even be mistaken for cracks and lower the accuracy. The research in [78] implemented a crack detection technique that uses twice-threshold (two-stage) segmentation. In the first step, a more precise Otsu threshold segmentation algorithm was used to remove the road markings from the runway image. In the second step, the image was segmented with an improved adaptive iterative threshold segmentation algorithm to extract the cracks. The final crack image was obtained after denoising. Table 3 shows that the suggested algorithm is more accurate than crack detection based on Otsu segmentation alone and has better performance. Nevertheless, the techniques used in this study for removing unwanted elements are not useful when dealing with larger and more complex problems.
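The basic Otsu method (not the improved variant of [78]) can be sketched as follows. The implementation maximizes the between-class variance, which is equivalent to minimizing the intra-class variance because the total variance is fixed; the 8-bit test image is hypothetical.

```python
import numpy as np

def otsu_threshold(gray):
    """Return t such that pixels <= t form one class and pixels > t the other."""
    counts = np.bincount(gray.ravel(), minlength=256)
    total = counts.sum()
    levels = np.arange(256)
    n0 = np.cumsum(counts)                     # number of pixels with value <= t
    w0 = n0 / total                            # weight of the darker class
    w1 = 1.0 - w0                              # weight of the brighter class
    cum_mean = np.cumsum(counts * levels) / total
    total_mean = cum_mean[-1]
    valid = (n0 > 0) & (n0 < total)            # both classes must be non-empty
    between = np.zeros(256)
    between[valid] = ((total_mean * w0[valid] - cum_mean[valid]) ** 2
                      / (w0[valid] * w1[valid]))
    return int(np.argmax(between))

# Hypothetical 8-bit image: bright surface (190-210) with a dark crack (40-60).
gray = np.array([[200, 190, 210, 200],
                 [ 40, 200, 190, 210],
                 [200,  50, 200, 190],
                 [210, 200,  60, 200]], dtype=np.uint8)
t = otsu_threshold(gray)
crack_mask = gray <= t
```

Because the two intensity groups are well separated, every threshold in the gap between them gives the same between-class variance; `argmax` simply returns the first such value.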
(c) Adaptive thresholding Segmentation
The adaptive thresholding algorithm (ATA) is a technique employed to differentiate the crucial foreground—specifically, crack and leakage defects—from the background using the disparity in the pixel gray values of each area. This is one of the commonplace conventional approaches for image segmentation.
Senthikumar et al. [79] proposed an efficient and precise method for identifying defects in metallic surfaces through an iterative thresholding technique. The approach identifies the defect region, including but not limited to cracks and shrinkages, in the metal surface image by binarization via iterative thresholding. Adaptive double thresholding is used to obtain a binarized image that discriminates between the regions affected by cracks and those that are not. The adaptive thresholding method adjusts the threshold value based on the local characteristics of the image, which can enhance crack detection accuracy. Figure 13 shows the visualization outcome of the iterative threshold methodology.
Fan et al. [80] proposed a road crack detection algorithm based on deep learning and adaptive image segmentation. A deep convolutional neural network classifies the images with an accuracy of 99.92%. The images containing cracks are then smoothed using bilateral filtering, and an adaptive thresholding method extracts the cracks from the road surface. Adaptive thresholding is used because it adjusts the threshold value for each pixel based on the local characteristics of the image, which helps to minimize the number of noisy pixels and to extract the cracks accurately.
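The benefit of a per-pixel threshold can be illustrated with a simple local-mean rule. This is a generic sketch, not the method of [79] or [80]; the window size, the offset, and the synthetic image with uneven lighting are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_threshold(gray, window=5, offset=15.0):
    """Foreground where a pixel is darker than its local mean minus `offset`."""
    pad = window // 2
    padded = np.pad(gray, pad, mode="edge")
    local_mean = sliding_window_view(padded, (window, window)).mean(axis=(2, 3))
    return gray < local_mean - offset

# Hypothetical 7 x 12 image: brightness rises from left to right (uneven
# lighting) and a horizontal crack in row 3 is 40 grey levels darker.
background = 60.0 + 10.0 * np.arange(12)
gray = np.tile(background, (7, 1))
truth = np.zeros_like(gray, dtype=bool)
truth[3, :] = True
gray[truth] -= 40.0

adaptive_mask = adaptive_threshold(gray)
global_mask = gray < 131.0  # single threshold just high enough to keep the crack
false_positives = int((global_mask & ~truth).sum())
```

In this example, any single threshold that retains the whole crack also marks the darker part of the background, whereas the local rule recovers only the crack row.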
(d) Region-based Segmentation
In region-growing segmentation methods, pixels with similar features are grouped together. The study in [81] put forth an enhanced directional region growing algorithm for crack identification. The crack detection algorithm for photovoltaic images, which uses a multiscale pyramid and an improved region growing technique, involves the following main steps. In Step 1, the photovoltaic image is preprocessed with filtering techniques. In Step 2, a multiscale pyramid decomposition is carried out. In Step 3, the edges of the processed image are detected and the crack profile is extracted. Step 4 optimizes the edge information to eliminate suspicious edges. Finally, Step 5 executes a directed region growing algorithm to identify and complete the cracks. Where pavement cracks exhibit good continuity and high contrast, digital image processing techniques can yield favorable detection outcomes. However, the cracks typically acquired are slender dark areas of irregular shape whose continuity depends on the texture of the road. Road shadows, stains, and other interfering factors can introduce background noise and impede detection. Figure 14 compares the Canny algorithm with the proposed method.
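The grouping principle can be sketched with plain seeded region growing. This is a basic 4-connected variant, not the directional algorithm of [81]; the seed position, the tolerance, and the test image are hypothetical.

```python
from collections import deque

import numpy as np

def region_grow(gray, seed, tol):
    """Grow a 4-connected region from `seed`, adding neighbouring pixels
    whose intensity is within `tol` of the seed intensity."""
    h, w = gray.shape
    seed_value = float(gray[seed])
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and not region[nr, nc]:
                if abs(float(gray[nr, nc]) - seed_value) <= tol:
                    region[nr, nc] = True
                    queue.append((nr, nc))
    return region

# Hypothetical 5 x 5 image: bright background, a connected dark crack,
# and one isolated dark pixel at (0, 4) that is not part of the crack.
gray = np.full((5, 5), 200, dtype=np.uint8)
for r, c in [(0, 1), (1, 1), (2, 1), (2, 2), (3, 2), (4, 2)]:
    gray[r, c] = 50
gray[0, 4] = 55

crack = region_grow(gray, seed=(0, 1), tol=20)
```

The isolated dark pixel has a similar intensity but is not connected to the seed, so it is excluded; a global threshold alone would have kept it.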

2.5. Morphological Operations

Morphological operations are image processing techniques that use the shape or morphology of objects in an image to perform operations. These techniques are commonly used to extract certain elements or improve specific parts of an image.
Erosion and dilation are the two most prevalent morphological operations. Erosion removes pixels from the boundaries of objects in an image, whereas dilation adds pixels to those boundaries. These operations may be used for noise reduction, edge detection, segmentation, and feature extraction, among other things. In a binary image, erosion can be used to eliminate small parts and dilation to fill gaps. Opening (erosion followed by dilation) can also be used to remove small parts and smooth the edges of existing cracks.
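A minimal sketch of binary erosion and dilation with a $3 \times 3$ square structuring element is shown below, together with the opening and closing built from them. The structuring element, the zero padding at the border, and the test image are illustrative assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def _windows(binary):
    """All 3x3 neighbourhoods, treating pixels outside the image as background."""
    padded = np.pad(binary, 1, mode="constant", constant_values=False)
    return sliding_window_view(padded, (3, 3))

def dilate(binary):
    return _windows(binary).any(axis=(2, 3))  # on if any neighbour is on

def erode(binary):
    return _windows(binary).all(axis=(2, 3))  # on only if all neighbours are on

# Hypothetical binary crack map.
img = np.zeros((9, 11), dtype=bool)
img[3:6, 1:10] = True   # crack band, three pixels thick
img[3:6, 5] = False     # one-column gap in the band
img[0, 0] = True        # isolated noise speck

opened = dilate(erode(img))      # opening: removes the speck, keeps the band
closed = erode(dilate(opened))   # closing: bridges the one-column gap
```

Opening cannot be applied to cracks thinner than the structuring element without deleting them, which is why the element size must be chosen relative to the expected crack width.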
Morphological techniques are used to detect surface cracks because regular curve tracing methods fail to detect non-continuous cracks [82]. Morphological operations can enhance discontinuities in the image and join missing pixels, making surface cracks easier to detect. Combining edge detection as preprocessing with filtering as postprocessing appears to be an effective way to detect surface cracks. In [45], morphological processing is used to remove small cracks and fill gaps in the detected cracks, which improves the accuracy of crack detection. Koshy et al. [83] proposed a comprehensive approach for assessing the strength of civil structures by integrating image processing and SHM principles. The approach is designed to identify cracks and assess surface degradation in buildings. The proposed methods for detecting cracks and quantifying surface deterioration were found to be well suited to civil inspection. The paper aims to provide a solution for the automated inspection of civil infrastructure to improve the performance of structural health monitoring. Ni et al. [55] suggested a concrete crack measurement technique using image processing in an Android smartphone application. Morphological operators and thresholding techniques were employed to remove distortions in the input image, thereby simplifying the detection process.

2.6. Smartphone

Modern smartphones are packed with features that can be used to analyze the state of structures efficiently. Owing to their low cost, ubiquity, mobility, large storage capacity, substantial computing power, and easily customizable software, there is an emerging trend of employing smartphones in SHM applications, including structural monitoring and retrofitting. These characteristics give smartphones high potential for SHM of large-scale buildings. Built-in cameras have become increasingly capable of capturing high-resolution photographs as smartphone technology has advanced. These images can be used in SHM applications to provide visual information on structural conditions such as fractures, deformations, and other types of damage.
Smartphone photos can be used to estimate the dynamic response parameters or vibration characteristics of a structure in SHM. Visual information on the structure’s displacement or condition can be obtained by photographing it with a smartphone. Image processing and analysis techniques can then be used to extract significant information and highlight areas of interest, such as fractures or possible damage. Orak and Ozturk [84] employed smartphones and computer vision algorithms to estimate the vibration characteristics of a slender cantilever beam. They applied the local multithreshold technique to extract the natural frequencies of vibration of the beam, facilitating the analysis of its dynamic behavior. A smartphone-based multipoint displacement monitoring method based on a fully convolutional network (FCN) was suggested in [85].
Images captured by smartphones can also be used in machine learning or deep learning algorithms for the classification or detection of cracks. These photos may be used to train algorithms that automatically detect and categorize fractures in buildings. By learning the patterns and attributes associated with fractures from the training photos, the algorithms can identify and categorize cracks in new images with greater accuracy.
Li et al. [86] presented an FCN model for the detection of four distinct classes of concrete damage (cracks, spalling, efflorescence, and holes) using an image database established from a smartphone-based platform. The FCN was developed using transfer learning (TL), specifically the weights and biases of DenseNet-121 for feature extraction. The algorithm was then trained and validated using a set of 2200 images. The proposed approach outperformed SegNet in detecting various types of concrete damage. In [87], a crack detection approach was introduced that uses convolutional neural networks to learn discriminative features directly from raw image patches. Each image patch of the collected images is classified by a deep convolutional neural network trained for this task. A quantitative evaluation was performed on 500 images of resolution $3264 \times 2448$ captured with a low-cost smartphone. The learned deep features outperformed those extracted by conventional hand-crafted methods.

2.7. Unmanned Aerial Vehicle (UAV)

When the surface is inaccessible or many sensors must be mounted, the cost of traditional contact-based sensors becomes a limitation. UAVs, or drones, have lately gained popularity as a portable option in the realm of non-contact measuring technology. With the rapid advancement of UAV technology for structural health monitoring, researchers outfit UAVs with lightweight, high-resolution cameras to record images of structural condition. Drones are often equipped with multiple technologies, such as several types of cameras and a Global Positioning System (GPS), to capture data during flight, with the photos analyzed subsequently at ground control centers. UAVs are used in a variety of engineering applications, including structural health monitoring on highways.
Sankara Srinivasan [60] developed an algorithm for detecting cracks that combines the hat transform with HSV thresholding. Merging the outcomes of both filters yielded significant improvements in image quality. The approach was founded on mathematical morphology, and the investigation demonstrated that the bottom-hat transform was more effective in identifying fractures than the top-hat transform. The bottom-hat transform was therefore paired with HSV thresholding to obtain highly accurate crack detection outcomes. The authors also proposed using a drone to detect fractures in real time. In addition, they devised a MATLAB graphical user interface (GUI) that enabled them to promptly identify and treat fractures, leading to cost savings. Cho et al. [61] devised a method for detecting no-crack in UAV-based systems using Harris corner detection, a feature-based image recognition technique that utilized Haar-like features and subsequently converted the image from color to grayscale. To enhance the identification rate on monochromatic images, histogram equalization was applied, followed by an adaptive binarization approach that automatically identified a threshold based on the image’s contents. The suggested technology inspects high-rise structures safely and may also be used in other sectors, such as inspecting steep cliffs and moored boats.
Kim et al. [88] presented a crack detection approach based on UAV-captured images and image processing. Field tests were carried out on a concrete wall with various types of cracks caused by loads, creep, and shrinkage. The acquired images were subjected to a hybrid image binarization technique to determine the crack width. The proposed methodology proved effective in identifying cracks wider than 0.1 mm, with an error rate of 7.3%.
Kim et al. [89] proposed an approach for the automated evaluation of faults in large-scale infrastructure that merges UAV technology with image processing. The UAV is outfitted with a Raspberry Pi, a camera, and an ultrasonic displacement sensor to collect crack photos and compute the distance while in flight. In these tests, image processing techniques such as median filter subtraction, Sauvola’s binarization algorithm, image revision based on eccentricity and pixel connection, crack decomposition, and width computation were applied. The height of the inspection area was approximately 1.5 m. Crack gauge readings were used as the reference for the computed crack widths, and the two were comparable in the field experiment.
Pereira and Pereira [68] introduced an unmanned aerial vehicle (UAV) for independent assessment of building pathologies in civil construction, along with several image processing algorithms for identifying cracks in building structures. These algorithms are intended to run on an embedded computer platform installed on the UAV. Two image processing techniques were employed for crack detection. The first used the Sobel operator to detect edges, computing either the gradient vector or the norm of this vector at each point in the image. The second was the particle filter, a non-parametric filter based on the Bayes filter. The particle filter estimates the likelihood that an image segment contains a crack based on pixel intensity and the number of pixels in its vicinity. This algorithm detected fractures in the tested samples with 74% accuracy for the parameters studied, although the approach is susceptible to false positives since it does not consider crack pattern features. The method displays the image with detection spots in the most likely crack region.

3. The Role of Machine Learning Algorithms Based on Vision for Crack Detection

Machine learning is a subfield of artificial intelligence (AI) and computer science that enables software programs to improve their predictive accuracy without being explicitly programmed to do so. As seen in Figure 15, machine learning comes in a variety of forms. In supervised learning, an algorithm learns to predict outputs from inputs using training data that contain both. In reinforcement learning, the system learns by interacting with the environment, with good behavior rewarded and undesirable behavior penalized. In recent years, machine learning approaches [90] have gained popularity. These techniques include support vector machines [91,92], random forest, random structured forest [93], and neural networks [94,95].

3.1. Support Vector Machine (SVM)

The Support Vector Machine (SVM) algorithm identifies the optimal line or decision boundary separating two classes, referred to as a hyperplane. The SVM algorithm also identifies the points from both classes that are closest to the hyperplane, known as support vectors. The margin is the distance between the support vectors and the hyperplane, and the objective of the SVM is to maximize this distance. The optimal hyperplane is the one with the maximum margin (see Figure 16).
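The margin maximization described above can be written compactly for the linearly separable (hard-margin) case. The notation is an assumption introduced here for illustration: $\mathbf{w}$ is the normal vector of the hyperplane, $b$ its offset, and $y_i \in \{-1, +1\}$ the class label of sample $\mathbf{x}_i$.

```latex
% Separating hyperplane
\mathbf{w}^{\top}\mathbf{x} + b = 0

% Distance from a sample x_i to the hyperplane
d(\mathbf{x}_i) = \frac{\lvert \mathbf{w}^{\top}\mathbf{x}_i + b \rvert}{\lVert \mathbf{w} \rVert}

% Scale w and b so that support vectors satisfy y_i (w^T x_i + b) = 1;
% their distance to the hyperplane is then 1 / ||w||, and maximizing it
% is equivalent to the following problem
\min_{\mathbf{w},\, b} \; \tfrac{1}{2} \lVert \mathbf{w} \rVert^{2}
\quad \text{subject to} \quad
y_i \left( \mathbf{w}^{\top}\mathbf{x}_i + b \right) \ge 1 \quad \forall i
```

Only the support vectors make the constraint tight, which is why the remaining training samples do not influence the position of the optimal hyperplane.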
Gavilán et al. [97] described a methodology for identifying road distress. The authors employed a vehicle outfitted with line scan cameras, laser beams, and the requisite hardware and software (HW-SW) to acquire road images. Following image preprocessing, a technique termed multiple directional non-minimum suppression (MDNMS) was applied to locate cracks. To determine suitable crack detection parameters, a linear support vector machine (SVM) classifier was used to differentiate among diverse pavement types throughout Spain. Adjusting the parameters to the pavement type improved the efficacy of the crack detection method. The methodology yielded a precision of 98.29% and a recall of 93.86%.
Ersoz et al. [98] concentrated on the identification of cracks in images collected by drones. To extract features, image segmentation was carried out manually by setting a threshold for each training image and computing the geometric properties of the image segments. These features were then classified using the SVM algorithm. Although the reported precision was 97%, the manual thresholding used for image segmentation introduced a bias toward the dataset. Nevertheless, the studies cited above demonstrate that SVMs perform well when the features are chosen correctly.

3.2. Decision Tree Algorithm

Decision tree classifiers build a tree-like hierarchical structure in which the most informative attribute forms the root node and the remaining attributes form the branches, leading to the final classification [99]. The entropy of the system is determined first, and a hierarchy of attributes is then designed to reduce this entropy. Decision trees have predominantly been used as an aid to decision-making.
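The entropy criterion can be made concrete with a short sketch that computes the Shannon entropy of a label set and the entropy reduction (information gain) of a candidate split. The labels, the feature values, and the feature name are hypothetical.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy (in bits) of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(labels, mask):
    """Entropy reduction obtained by splitting `labels` with a boolean mask."""
    n = len(labels)
    left, right = labels[mask], labels[~mask]
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Hypothetical data: 1 = crack, 0 = no crack, with one shape feature per sample.
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])
elongation = np.array([0.90, 0.80, 0.85, 0.40, 0.30, 0.20, 0.35, 0.10])

gain_a = information_gain(labels, elongation > 0.50)  # leaves one crack misplaced
gain_b = information_gain(labels, elongation > 0.37)  # separates the classes
```

A tree-building algorithm would prefer the second split, since it removes all the entropy of the parent node.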
Wu et al. [100] employed the contourlet transform, a technique based on the wavelet transform, after converting the input image to grayscale. The contourlet transform partitioned the image into a high-pass image and a low-pass image, which were then processed with directional filters. This technique was expected to yield smoother edge detection. Features were extracted from the preprocessed images using the co-occurrence matrix and Tamura features. The extracted features were classified using an array of ensemble techniques: AdaBoost, random forest, rotation forest, and RotBoost. These ensemble methods enhance the efficacy of a classifier by combining it with multiple other classifiers; they differ in how the decision trees are constructed and how their outcomes are combined. The outcomes of the ensemble techniques were compared with those of neural-network-based methods.

3.3. k-Nearest Neighbor Algorithm (KNN)

k-nearest neighbors [101] is a simple yet highly effective algorithm widely used in machine learning for classification, pattern recognition, and regression tasks. The KNN model identifies the data points nearest to a query point using the Euclidean distance. Figure 17b illustrates the k-nearest neighbor algorithm. As can be seen, the three nearest neighbors are from category A; hence, the new data point is assigned to category A.
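The voting rule can be sketched in a few lines. The training points, the labels, and the query points below are hypothetical and only mirror the two-category situation described above.

```python
from collections import Counter

import numpy as np

def knn_predict(train_x, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    dists = np.linalg.norm(train_x - query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two categories.
train_x = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
                    [5.0, 5.0], [6.0, 5.5], [5.5, 6.5]])
train_y = ["A", "A", "A", "B", "B", "B"]

label_1 = knn_predict(train_x, train_y, np.array([2.0, 2.0]))
label_2 = knn_predict(train_x, train_y, np.array([5.0, 6.0]))
```

An odd value of k avoids ties in two-class problems; larger k values smooth the decision boundary at the cost of blurring small clusters.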
Zhang et al. [103] examined subway tunnel cracks using k-NN, support vector machines, radial basis function neural networks, and extreme learning machine classifiers. The study further employed a diverse array of techniques, such as average smoothing, morphological operations like the top-hat transform, thresholding for image segmentation, and statistical feature extraction based on the standard deviation of the shape distance histogram. Although the test accuracies of the classifiers were similar, the extreme learning machine exhibited the highest performance at 91.6%. The KNN classifier achieved a test accuracy of 88.7%.

3.4. Random Structured Forests

Random structured forests are a type of ensemble model that can be utilized for predicting the nearest neighbor. The fundamental principle of ensemble techniques is that combining multiple models yields a more robust model. Among ensemble methods, the random forest algorithm resembles the conventional decision tree methodology. A decision tree begins with a single input and buckets the data according to the path the data take through the tree. The random forest takes this concept to the next level by integrating many trees in an ensemble. A random forest classifier has the advantages of a short runtime, effective handling of imbalanced data, and the ability to handle missing data [104].
The random forest creates multiple decision trees by randomly selecting rows and features from the dataset (see Figure 18). Each decision tree learns to make predictions independently. An individual tree has low bias, which suggests that it may perform well on the training data, but high variance, which indicates that it may not generalize well to new, unseen data. Aggregating the predictions of many such trees reduces the variance, leading to a significantly more precise and resilient model able to handle diverse tasks, including regression and classification [105].
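The row and feature sampling can be illustrated with a deliberately simplified sketch in which every "tree" is a single-split stump; real random forests grow full decision trees. The data, the stump rule (threshold halfway between class means), and all function names are assumptions made for illustration.

```python
import numpy as np

def fit_stump(x, y, feature):
    """One-split 'tree': threshold halfway between the two class means."""
    m0 = x[y == 0, feature].mean()
    m1 = x[y == 1, feature].mean()
    return feature, (m0 + m1) / 2.0, m1 > m0

def predict_stump(stump, x):
    feature, threshold, class1_above = stump
    above = x[:, feature] > threshold
    return np.where(above == class1_above, 1, 0)

def fit_forest(x, y, n_trees, rng):
    n_samples, n_features = x.shape
    forest = []
    while len(forest) < n_trees:
        rows = rng.integers(0, n_samples, size=n_samples)  # bootstrap rows
        if len(np.unique(y[rows])) < 2:
            continue                                       # resample if a class is missing
        feature = int(rng.integers(0, n_features))         # random feature
        forest.append(fit_stump(x[rows], y[rows], feature))
    return forest

def predict_forest(forest, x):
    votes = np.stack([predict_stump(s, x) for s in forest])
    return (votes.mean(axis=0) > 0.5).astype(int)          # majority vote

rng = np.random.default_rng(0)
# Hypothetical two-feature data: class 0 near 0, class 1 near 3.
x0 = rng.uniform(-0.5, 0.5, size=(20, 2))
x1 = rng.uniform(2.5, 3.5, size=(20, 2))
x = np.vstack([x0, x1])
y = np.array([0] * 20 + [1] * 20)

forest = fit_forest(x, y, n_trees=15, rng=rng)
pred = predict_forest(forest, x)
train_accuracy = float((pred == y).mean())
```

An odd number of trees is used so that the majority vote cannot tie in this two-class example.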
Shi et al. [106] used the random structured forest to construct a crack classifier for detecting cracks in photo patches. After performing image erosion and dilation procedures on each patch, the final crack map was recovered. The picture erosion process may be used to remove pixels from crack borders, reject small crack fragments, limit the detected region, and connect neighboring crack fragments. Yang et al. [107] developed a method that takes advantage of randomly structured forests. The model aims to tackle the problem of heterogeneity in the intensity of fissures in images of roads. Integral channel characteristics were used to improve the portrayal of fractures in such pictures. Following this, a method called random structured forests was used to locate fractures. This approach is capable of detecting arbitrary and complicated fractures in pictures with high accuracy. An SVM model was used to categorize the fractures according to their nature.
Santur et al. [108] employed the random forest approach, a decision-tree-based ensemble method. While the study focuses on railroads, image classification challenges for detecting faults in visual data are comparable to crack detection problems in structures. Several dimensionality reduction techniques, such as principal component analysis, kernel principal component analysis, singular value decomposition, and histogram matching, were applied individually in order to evaluate the impact of the feature extraction phase on the accuracy. The random forest algorithm was then trained on the extracted features. Combining principal component analysis with histogram matching resulted in an accuracy of 85%.

3.5. Logistic Regression

The logistic regression method is used to solve binary classification problems through supervised learning. It is a mathematical model that uses the logistic function to describe binary classification, and multiple advanced extensions of it exist. In essence, logistic regression uses a regression model to predict whether a specific data point or entry is likely to belong to a designated class. As seen in Figure 19, logistic regression models the data using a sigmoid function with a proper decision boundary [109]. Its main strengths include ease of implementation, computational efficiency, efficient training, and ease of regularization. Scaling of input features is deemed to be unnecessary. Nevertheless, its ability to tackle nonlinear problems is limited, and it is susceptible to overfitting.
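As a minimal sketch of the sigmoid-based decision rule described above, the following Python snippet maps a weighted sum of features to a probability and applies a decision threshold. The weights, bias, feature values, and the 0.5 threshold are hypothetical and serve only as an illustration; they are not taken from any of the cited studies.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number to the interval (0, 1)."""
    # Two algebraically equivalent branches avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


def predict_proba(features, weights, bias):
    """Probability that a sample belongs to the positive class."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict(features, weights, bias, threshold=0.5):
    """Assign class 1 when the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical weights for two illustrative features (assumed values).
weights, bias = [2.0, -1.0], -0.5
crack_like = predict([1.0, 0.5], weights, bias)   # weighted sum z = 1.0
background = predict([0.0, 1.0], weights, bias)   # weighted sum z = -1.5
print(crack_like, background)
```

The decision boundary is the set of points where the weighted sum equals zero, which is why the model is limited to linearly separable problems unless nonlinear features are added.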
Landstrom and Thurley [110] demonstrated a morphological image-processing-based fracture identification and measuring method. Initially, segmentation was used to extract 80% of the length of a fracture in an image, and tiny faults or cracks were removed [67,110]. Following that, statistical classification was performed on these segmented pictures using logistic regression, which finds all the main fractures. The system’s overall accuracy was greater than 80%.

3.6. K-Means Clustering Algorithm

K-means is an unsupervised learning technique that is frequently used for clustering. Based on their similarity, the data are grouped into k clusters, where k is an integer whose value must be known in advance for the procedure to work [111]. K-means is the most often used clustering method because it can assign new data to the correct cluster based on the distance to the cluster centroids. The initial k cluster centroids are selected randomly; thereafter, all points are assigned to their nearest centroids, and the centroids of the newly formed groups are recalculated. Because the centroids are computed as means, K-means is particularly susceptible to noise and outliers. One advantage of the K-means method is that it is simple to apply and explain as well as computationally efficient [112]. Figure 20 presents a graphical depiction of the K-means algorithm. The initial step consists of two sets of objects, whose centroids are then determined. The clusters are then formed again based on the centroids, and this process is repeated until the optimal clusters are obtained [113].
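The assignment and update steps described above can be sketched in a few lines of Python. This is an illustrative implementation for one-dimensional data (for example, gray-level intensities); the function name `k_means_1d`, the fixed random seed, and the sample intensities are assumptions made for the example.

```python
import random


def k_means_1d(values, k, max_iter=100, seed=0):
    """Cluster scalar values into k groups using the standard K-means iteration."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)          # random initial centroids
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: recompute centroids (an empty cluster keeps its old centroid).
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:         # converged: assignments are stable
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical gray levels: dark (crack-like) and bright (background) pixels.
intensities = [12, 15, 10, 18, 200, 210, 205, 198, 14, 202]
centroids, clusters = k_means_1d(intensities, k=2)
print(sorted(centroids))  # [13.8, 203.0]
```

Because the centroids are plain means, a single extreme value shifts its cluster centroid noticeably, which illustrates the sensitivity to outliers noted above.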
Oliveira and Correia [90] proposed a method for crack identification and classification that does not depend on hand labeling of the dataset images. A total of 84 road photos were acquired with a digital camera to train the system. The system was trained in an unsupervised manner using photos from the training dataset. To detect cracks in the input photos, a K-means clustering approach and a mixture of two Gaussian models were used. A confusion matrix is shown in Table 4 to define the performance of a classification algorithm. The crack detection performance of the proposed method was evaluated by calculating the accuracy rate, sensitivity, specificity, precision, and F-measure using the equations below.
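\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \]
\[ \mathrm{Sensitivity} = \frac{TP}{TP + FN} \]
\[ \mathrm{Specificity} = \frac{TN}{FP + TN} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]
\[ F\text{-}\mathrm{measure} = \frac{2 \times \mathrm{Sensitivity} \times \mathrm{Precision}}{\mathrm{Sensitivity} + \mathrm{Precision}} \]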
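The five measures can be computed directly from the four entries of a confusion matrix. The sketch below is illustrative only: the function name and the counts are hypothetical and do not reproduce the results of the cited study, and no guard against zero denominators is included.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the five performance measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)            # also called recall
    specificity = tn / (fp + tn)
    precision = tp / (tp + fp)
    f_measure = 2 * sensitivity * precision / (sensitivity + precision)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Hypothetical counts for 1000 image blocks (assumed values).
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With heavily imbalanced data, as is typical when crack pixels are rare, accuracy alone can be misleading, which is why sensitivity, precision, and the F-measure are reported alongside it.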
The findings indicated that the combination of Gaussian models had the greatest F-measure (93.5%) and the lowest error rate (0.6%). In terms of recall, this strategy achieved the second-highest rate of 95.5%. The detected cracks can be classified into three major categories, namely, longitudinal, transversal, and miscellaneous. This is achieved by examining each crack’s connected components and calculating a crack skeleton. Crack width is determined using the crack skeleton and is then analyzed further to assess the severity of the crack. One concern is that the system’s precision seems to be reduced when it tries to find tiny cracks that measure 2 mm in width.

3.7. Artificial Neural Network

Among the various classification techniques, neural networks exhibit the highest level of sophistication. Neural networks are multilayered systems with several nodes in each layer. Each node computes a basic linear function of its inputs. Neural networks learn from the data provided during training by adjusting the weights and the influence of the functions in the nodes. Misclassified samples are used to determine the error, which is then sent back to the nodes to adjust their influence. Each node in a layer is linked to the nodes in the adjacent layer. As a result, a fully connected structure is built in order to establish a relationship between each feature. Figure 21 shows the components of an active node of a neural network. These comprise the inputs, labeled $x_1$ to $x_n$; the weights, shown as $w_1$ to $w_n$; and the activation function, $\varphi$. Additionally, the sum of the weighted inputs with bias $b$ is denoted as ‘Sum’, while the output of the activation function is indicated as ‘Outputs’.
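The computation performed by a single node can be written as $\varphi\left(\sum_{i=1}^{n} w_i x_i + b\right)$. The short Python sketch below evaluates this expression; the input values, weights, bias, and the choice of ReLU and tanh as activation functions are assumptions made purely for illustration.

```python
import math


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through an activation function."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(weighted_sum)


def relu(z):
    """Rectified Linear Unit: passes positive values, blocks negative ones."""
    return max(0.0, z)


# Hypothetical inputs x_1..x_3, weights w_1..w_3, and bias b.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, 0.1]

active = neuron_output(inputs, weights, bias=0.1, activation=relu)      # sum = 0.2
inactive = neuron_output(inputs, weights, bias=-0.5, activation=relu)   # sum = -0.4
squashed = neuron_output(inputs, weights, bias=0.1, activation=math.tanh)
print(active, inactive, squashed)
```

During training, the weights and bias would be adjusted from the classification error; that step is omitted here.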
Moon et al. [115] devised a technique for identifying cracks in camera images using a series of preprocessing methods, including median subtraction, Gaussian low-pass filtering, segmentation thresholding, and morphological operations for feature extraction, so that the data could be refined and examined with improved precision and accuracy. These features were then used to train an artificial neural network. Averaged over two tests, the proposed process achieved an accuracy of 90.25%. Xu et al. [116] split large raw images of steel structures into 24 × 24-pixel patches. They then built a framework for crack identification by classifying the crack condition of the subimage patches using a restricted Boltzmann machine-based artificial neural network (ANN). They recovered the crack distribution on the surface of the steel structures by merging all the crack classification results from the subimage patches. Additionally, they found that the size of the image patches used for classification might affect the identification accuracy.

3.8. Deep Learning

Deep learning, a field within machine learning, employs a diverse array of nonlinear transformations. Its algorithms learn representations of incoming data through multiple processing layers with sophisticated architectures. Prominent deep learning models include convolutional neural networks (CNNs), deep autoencoders, and recurrent neural networks (RNNs). These techniques have been extensively employed across various fields, such as speech recognition and natural language processing.

Convolutional Neural Networks (CNN)

The convolutional neural network (CNN), a type of machine learning classifier, distinguishes itself from other models by its built-in feature extraction stage, which eliminates the need for image segmentation as a preprocessing step. The CNN’s network structure builds on neural network techniques. While conventional neural networks are constrained by high processing costs due to their fully connected topology, CNNs are not subject to this limitation and may incorporate numerous layers in their design. While the early layers of the CNN architecture are used to extract features, the last layers are built similarly to conventional neural networks and function as a classifier. The architecture of the CNN is shown in Figure 22, which includes three distinctive hidden layers, namely, the convolutional layer, the max pooling layer, and the fully connected layer. In the domain of deep learning, there are numerous activation functions available, though the Rectified Linear Unit (ReLU) is the most widely used [117].
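To illustrate how the feature extraction layers operate, the following sketch applies one convolution, a ReLU activation, and 2 × 2 max pooling to a small synthetic image. The image, the hand-picked edge kernel, and the function names are assumptions for illustration; in a trained CNN the kernel values are learned rather than chosen by hand.

```python
def convolve2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the operation used in convolutional layers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(feature_map):
    """Element-wise ReLU: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [[max(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size))
             for c in range(cols)]
            for r in range(rows)]


# Synthetic 6 x 6 image: bright surface (9) with a dark vertical crack (1) in column 2.
image = [[9, 9, 1, 9, 9, 9] for _ in range(6)]
# Hypothetical kernel responding to dark-to-bright transitions from left to right.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

feature_map = convolve2d_valid(image, kernel)   # 4 x 4 map
pooled = max_pool(relu(feature_map))            # 2 x 2 map
print(pooled)  # [[0, 24], [0, 24]]
```

The pooled map retains the strong response at the crack edge while halving each spatial dimension, which is the role max pooling plays between convolutional layers.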
Fan et al. [118] proposed using a CNN to identify pavement cracks in photos of Beijing’s pavements taken with an iPhone. Hundreds of thousands of monochrome and RGB image patches were used. The proposed methodology was shown to have an accuracy of roughly 92%, which was superior to those of established techniques such as local thresholding, CrackForest, Canny, minimal path selection, and free-form anisotropy. Tan et al. [119] developed a novel approach for automated crack identification by applying a recently proposed algorithm called the mask regional convolutional neural network (Mask R-CNN), which is used to recognize, localize, and segment objects in natural images. Mask R-CNN does this by means of object detection, making it possible to identify distinct entities within an image and concurrently generate a segmentation mask for each instance. Mask R-CNN is a two-stage model based on Faster R-CNN. First, it scans the image and generates proposals. Second, it classifies the proposals and generates bounding boxes and masks. The proposed approach, which is based on Mask R-CNN, is extremely quick and excels at crack detection in video and images. Additionally, it was shown that, by learning intrinsic features, the Mask R-CNN-based crack detector is capable of identifying the presence, position, and shape of cracks in real time and on site. Fan et al. [80] created a novel FCN with an adaptive thresholding approach for the identification of road cracks in images. The FCN first classified images as positive or negative based on the presence of cracks. The positive images were then segmented, and the cracks were localized using an adaptive threshold approach that minimized the within-cluster sum of squares. A total of 40,000 RGB images were used in the study for training, validation, and testing. The proposed technique achieved accuracies of 99.92% and 98.70% for classifying images and for identifying pavement cracks at the pixel level, respectively.
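The idea of choosing a threshold from the within-cluster sum of squares can be sketched as follows. This is a simple exhaustive search over candidate gray levels, given as an illustration of the criterion only; it is not the implementation used in [80], and the function names and pixel values are hypothetical.

```python
def within_cluster_ss(values):
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_threshold(pixels):
    """Threshold t that minimises the total within-cluster sum of squares
    of the two classes {p <= t} and {p > t}. Returns None for a flat image."""
    best_t, best_cost = None, float("inf")
    # The largest gray level is excluded so that both classes are non-empty.
    for t in sorted(set(pixels))[:-1]:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        cost = within_cluster_ss(low) + within_cluster_ss(high)
        if cost < best_cost:
            best_t, best_cost = t, cost
    return best_t


# Hypothetical gray levels from a patch containing a dark crack on a bright surface.
pixels = [20, 25, 22, 30, 180, 190, 185, 200, 28, 195]
threshold = best_threshold(pixels)
crack_mask = [p <= threshold for p in pixels]
print(threshold)  # 30
```

Applying such a search to local image regions rather than to the whole image is one way to make the threshold adaptive to uneven illumination.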
Cha et al. [120] developed a system with four convolutional layers for detecting concrete cracks in construction settings. The study examined the effect of the training dataset on the effectiveness of the network, which was trained on datasets of various sizes, ranging from 2 K to 40 K images. On the basis of the validation scores, it is recommended to use more than 10 K images for training. Dorafshan et al. [121] investigated the viability of employing small commercially available unmanned aerial vehicles (UAVs) to assess concrete decks and structures using CNNs. The model was first trained using images of a laboratory-scale bridge deck obtained with a low-resolution camera, achieving an accuracy of 94.7%. The CNN was then used to inspect a building, achieving 97.1% accuracy using transfer learning (TL) and AlexNet. However, if network efficiency is a concern, the AlexNet design may be replaced with a more sophisticated and accurate architecture such as ResNet.
Gopalakrishnan et al. [122] applied transfer learning to crack detection tasks using pretrained networks and fine-tuning techniques. The well-known pretrained convolutional neural network (CNN) VGG-16 was employed to identify signs of distress in pavements. The network was trained on a dataset of 760 images and then evaluated on an additional 212 images. For comparison, the classifier layers of the CNN were replaced with classifiers such as random forest, extremely randomized trees, SVM, and logistic regression. The results show a precision of 90% for the original pretrained network, which emerged as the most effective alternative.
Feng et al. [123] proposed an active learning system for automatically detecting and classifying cracks, deposits, and water leaks in concrete structures without the need for time-consuming labeling. A deep residual network (ResNet) was used to classify and detect these defects. The classifiers were continually retrained on newly annotated images via the active learning network, leading to a significant reduction in the manual annotation and labeling of images. The authors obtained an accuracy of 87.5% for 235,200 image patches using a positive-sampling approach.
Li et al. [124] presented an approach to detecting cracks in images using a deep convolutional neural network (CNN). A CNN model was developed by modifying AlexNet and was then trained and validated using a dataset comprising 60,000 images. The authors experimented with a range of base learning rates: 0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, and 0.0001. The validation accuracy and convergence speed increased gradually as the base learning rate rose from 0.0001 to 0.01, peaking at 0.01. However, a larger base learning rate of 0.1 led to non-convergence of the CNN. Based on the comparison of validation accuracy under the different base learning rates, 0.01 was chosen as the best base learning rate, with the highest validation accuracy of 99.06%. The model trained with this base learning rate was used in the subsequent testing phase. The robustness and adaptability of the trained CNN were evaluated with 205 images, with resolutions ranging from 3120 to 4160 pixels, that were not used for training or validation. Crack propagation has also been monitored using image processing. Guo et al. [125] demonstrated a non-contact approach for measuring the length of a crack using a CNN together with a crack length computation algorithm. The crack was first identified using an improved CNN, and the crack length propagation was then computed using an improved Canny edge detection algorithm. A center-hole specimen and a solidified drywall specimen were tested experimentally, and the approach was shown to be both effective and precise. Additionally, the approach may be used to predict the distribution of cracks inside concrete.
As a result, the measurement error of the proposed technique is less than 15 μm. This approach is valuable for identifying cracks and reproducing crack propagation studies, and it will aid research in fracture mechanics. Jia and Luo [126] proposed a novel methodology for the identification and parameter estimation of crack images. The proposed methodology uses digital image processing and CNNs to increase the accuracy of image classification. By modifying the CNN framework and incorporating digital image processing as a special layer, a new image can be created from the extracted feature map, which enables the crack length to be determined from the number of pixels in the image. The experimental results demonstrate a classification accuracy of 95%, and the crack length can be measured with an error of less than 4%. These results indicate that the proposed technique may be useful for identifying cracks and estimating their parameters. Tong et al. [127] used a deep convolutional neural network (DCNN) to determine the length of cracks in asphalt pavements. To do this, a training database of 8000 photos of cracked and uncracked pavements was created, of which 500 were randomly chosen to serve as the test database. The photos were converted to grayscale in order to determine their threshold. The BMP format was employed to allow precise extraction of the length and shape of each pavement crack using K-means clustering analysis. The DCNN for crack length identification consisted of two convolutional layers (C1 and C2), two subsampling layers (S1 and S2), two fully connected layers (FC1 and FC2), and the output layer.
Max pooling was used in the two subsampling layers, whereby the maximal values in 2 × 2 submatrices of the convolutional maps were determined. The approach entails designing the structure of the DCNN, training it, and testing it. If the testing outcome did not meet the requirements, the DCNN had to be restructured and retrained. For crack lengths ranging from 0 to 8 cm, the DCNN achieved an accuracy of 94.35% with a mean squared error of 0.2377 cm. Additionally, it was determined that image quality and lighting conditions had little effect on the accuracy of the proposed crack detection method. The length range of 7–8 cm, on the other hand, had a higher error rate than the other ranges.

4. Integration of Image Processing Techniques and Dynamic Response Measurements

As image collection equipment grows more affordable and quicker, dynamic response measurement and frequency or amplitude estimates using image sequence analysis continue to gain popularity. Dynamic response measurement is common in civil, mechanical, and other engineering approaches. Over the previous few decades, a noteworthy amount of focus has been directed toward utilizing cameras to capture visible light for conducting non-contact dynamic response studies.
This section describes the fundamental experimental technique of vibration measurement, which usually occurs by physically stimulating the structure at various frequencies and measuring its reaction to the vibration. A shaker, hammer, and shaking table are commonly used to excite the structure.

4.1. Motion Magnification

Zimmermann et al. [32] aimed to validate the applicability of a non-contact system, specifically a camera. This was achieved by examining the displacement–time records of the original and motion-magnified recordings using a particle tracking velocimetry (PTV) computation. Initially, the displacement data obtained from PTV were calibrated by applying a correction factor that relates the spatial distance in the video in pixels to a known reference spatial distance of the tracked object. The structure’s natural frequencies were obtained by analyzing the power spectral density plots, and the mode shapes were extracted through subspace identification algorithms. Finally, the natural frequencies were used to determine the spectral range necessary for the motion magnification algorithm. The displacement response signals obtained via PTV exhibit a strong correlation with the signals inferred from accelerometers across all three reference locations, as shown in Figure 23.
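The calibration and frequency identification steps can be sketched as follows. The snippet converts a tracked displacement from pixels to millimetres using a known reference length and then locates the dominant spectral peak with a plain discrete Fourier transform (a periodogram, used here as a simple stand-in for a power spectral density estimate). The frame rate, reference dimensions, and the synthetic 8 Hz signal are assumptions for illustration and are not taken from the cited study.

```python
import cmath
import math


def pixels_to_mm(displacement_px, reference_mm, reference_px):
    """Scale pixel displacements by a factor derived from a known reference length."""
    scale = reference_mm / reference_px        # millimetres per pixel
    return [d * scale for d in displacement_px]


def power_spectrum(signal, sample_rate):
    """One-sided periodogram of a mean-removed signal via a direct DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        freqs.append(k * sample_rate / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def dominant_frequency(signal, sample_rate):
    """Frequency (Hz) of the largest non-DC peak in the periodogram."""
    freqs, power = power_spectrum(signal, sample_rate)
    peak = max(range(1, len(power)), key=lambda k: power[k])
    return freqs[peak]


# Synthetic tracked displacement: 3-pixel amplitude at 8 Hz, recorded at 100 fps for 2 s.
fps = 100.0
displacement_px = [3.0 * math.sin(2 * math.pi * 8.0 * n / fps) for n in range(200)]
# Hypothetical calibration: a 104 mm wide reference spans 520 pixels.
displacement_mm = pixels_to_mm(displacement_px, reference_mm=104.0, reference_px=520.0)
print(dominant_frequency(displacement_mm, fps))  # 8.0
```

The frequency resolution equals the frame rate divided by the number of frames (0.5 Hz here), so longer recordings are needed to separate closely spaced modes.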
In real-world applications, the cost-effectiveness of this approach has made it a popular option for data acquisition. In the accelerometer plane, the width of the video frame was 104 mm. The cantilever beam was struck with a hammer, and the resulting vibration was measured for comparative analysis. The velocity time series obtained from the laser vibrometer was integrated to obtain displacement, which was then cross-verified with the displacements derived from the camera measurements of the optical flow of the accelerometer movement. As there was no time synchronization between the camera and laser vibrometer datasets, the time series had to be manually aligned in the data analysis process. The laser vibrometer, accelerometer, and camera-derived data were integrated to displacement where necessary and transformed using the Fast Fourier Transform (FFT), thereby enabling a direct comparison of the frequency peaks and noise floors (see Figure 24) [128]. The proposed methodology entails acquiring a video recording of a vibrating structure and then computing the displacement signal throughout the entire structure within the image, using a technique related to phase-based motion magnification. In the laboratory, the performance of the model was assessed against accelerometer and laser vibrometer measurements using a cantilever beam. The presented model proved robust in the detection of shape deformation and damage estimation and achieved competitive performance in real-world applications.
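Integrating a sampled velocity record to displacement, as was done for the laser vibrometer data, can be sketched with cumulative trapezoidal integration. The sampling interval, amplitude, and 5 Hz test signal below are assumptions chosen so that the result can be checked against a known analytical displacement; the snippet does not reproduce the processing chain of the cited study.

```python
import math


def integrate_velocity(velocity, dt, initial_displacement=0.0):
    """Cumulative trapezoidal integration of a uniformly sampled velocity signal."""
    displacement = [initial_displacement]
    for previous, current in zip(velocity, velocity[1:]):
        displacement.append(displacement[-1] + 0.5 * (previous + current) * dt)
    return displacement


# Synthetic velocity of a 5 Hz vibration with 1 mm displacement amplitude (assumed values).
f, amplitude_mm, dt = 5.0, 1.0, 0.001
times = [n * dt for n in range(1001)]
velocity = [2 * math.pi * f * amplitude_mm * math.cos(2 * math.pi * f * t) for t in times]

displacement = integrate_velocity(velocity, dt)
# Compare with the analytical displacement A*sin(2*pi*f*t).
max_error = max(abs(d - amplitude_mm * math.sin(2 * math.pi * f * t))
                for d, t in zip(displacement, times))
print(f"maximum deviation from the analytical displacement: {max_error:.2e} mm")
```

Measured velocity signals usually contain a small offset or low-frequency noise that grows into drift under integration, so in practice some detrending or high-pass filtering is typically applied before the spectra are compared.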
The phase-based motion magnification (PBMM) algorithm was utilized to obtain the time history of spatial phase variations, which consequently led to the amplification of displacement within the scrutinized frequency range [129]. This was achieved by comparing the nth frame with the initial frame. To record the output, the extrados upward boundary at the center point cross-section served as a “virtual” sensor. Considering only the vertical component, it was assumed to represent the entire transverse motion, with a minute error for extremely small oscillations. Subsequent to the initial stage, the collected data underwent a Fourier transformation, the depiction of which is visually presented in Figure 25 to facilitate comprehension.
Chen et al. [130] employed video cameras situated at a distance exceeding 80 m to evaluate the displacements arising from lift-induced vibrations. As a result of this approach, the vibration frequencies and mode shapes of the bridge were determined. To accomplish this, the motion was converted to physical displacements, with a calibration factor determined by the ratio between the length of a known object in the video and its corresponding pixel length. It is important to note that the calibration factor may be influenced by the object’s depth and location within the video frame. An identical methodology was employed to gauge the resonant frequencies and mode configurations of edifices in a controlled setting and over considerable distances [54,131].

4.2. Multithresholding Technique

Multithresholding is a method utilized for segmenting images into various segments based on their gray levels. In this approach, a number of thresholds are chosen for a certain image, and the image is segmented into several brightness zones corresponding to various objects and the backdrop.
A video camera was used to capture the frequency of small, low-amplitude signals. The presented model requires a video camera that captures videos at a suitable frame rate in order to measure the vibration frequency using multilevel thresholding. The ordinary camera used in the model, equipped with AVI–JPEG compression, can robustly measure vibration occurring in the horizontal or vertical direction of the camera sensor. Ferrer et al. [132] proposed searching for subpixel movements as a means of detecting changes that may only be perceptible in small bright sparkles, middle grays, or dark areas. Consequently, predicting the specific gray levels that will be affected is a challenging task. Rather than constructing predictive models of the changes in illumination resulting from movement, the authors suggested examining pixel changes at various levels simultaneously. The low resolution and considerable noise did not affect the performance of the proposed method, which robustly measures the vibration of real-world objects like bridges, loudspeakers, and forks. The presented technique proved its robustness compared with the relevant methods. Figure 26a shows the percentage of pixel variations in relation to the ROI size (18 × 18 pixels) for each thresholded level. It is notable that in all levels apart from level 2, the variations are below 5%, that is, about 16 pixels. Figure 26b presents the Fourier transform of each of the eight signals.
The monitoring of structural health was addressed in this research through a novel model. The model employs a camera and computer vision techniques to measure the vibration of a cantilever beam in a contactless manner [84]. The proposed model involves the use of a conventional smartphone recording in slow motion and image processing methods to robustly extract the vibration frequency of the cantilever beam. The use of a local multithreshold technique enabled the extraction of the beam’s natural vibration frequencies. The region of interest (ROI) consisted of a square frame of 15 × 15 pixels, with a pixel size of 1.4 μm. The maximum and minimum luminance in the area were determined, and a total of eight thresholds were applied, resulting in eight binarized sequences. A Fourier transform was computed for the temporal signal produced by each sequence. By averaging the frequencies obtained from the eight threshold levels, the main frequency peak of the vibrating beam in the considered ROI was obtained. The results of the presented model indicated excellent agreement between contactless and contact-based vibration measurement techniques.
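The multithreshold procedure described above can be sketched as follows. The snippet is an illustrative interpretation, not the authors' implementation: it assumes that the temporal signal of each binarized sequence is the number of ROI pixels above the threshold, that the thresholds are equally spaced between the minimum and maximum luminance, and that levels showing no variation are skipped before averaging. The synthetic frames (a slanted luminance edge oscillating by 0.3 pixels at 12 Hz, recorded at 240 fps) are likewise assumed.

```python
import cmath
import math


def peak_frequency(signal, sample_rate):
    """Frequency (Hz) of the largest non-DC DFT magnitude, or None for a flat signal."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    if all(c == 0 for c in centred):
        return None
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centred))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * sample_rate / n


def multithreshold_frequency(frames, sample_rate, levels=8):
    """Average the peak frequencies of the pixel-count signals of several binarized sequences."""
    pixels = [p for frame in frames for row in frame for p in row]
    lo, hi = min(pixels), max(pixels)
    thresholds = [lo + (hi - lo) * i / (levels + 1) for i in range(1, levels + 1)]
    peaks = []
    for t in thresholds:
        # Temporal signal: number of ROI pixels brighter than the threshold in each frame.
        counts = [sum(p > t for row in frame for p in row) for frame in frames]
        f = peak_frequency(counts, sample_rate)
        if f is not None:                      # skip levels that never change
            peaks.append(f)
    return sum(peaks) / len(peaks) if peaks else None


def synthetic_frames(n_frames=240, fps=240.0, freq=12.0, size=15):
    """ROI sequence with a slanted luminance edge oscillating at subpixel amplitude."""
    frames = []
    for n in range(n_frames):
        shift = 0.3 * math.sin(2 * math.pi * freq * n / fps)
        frame = []
        for y in range(size):
            edge = 3.0 + 0.37 * y + shift
            frame.append([min(255.0, max(0.0, 128.0 + 40.0 * (x - edge)))
                          for x in range(size)])
        frames.append(frame)
    return frames


frames = synthetic_frames()
estimated_hz = multithreshold_frequency(frames, sample_rate=240.0)
print(estimated_hz)
```

Because the edge moves by less than one pixel, each threshold level is affected at different pixels, which is why several levels are examined at once instead of a single binarization.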

4.3. Edge Detection Techniques

Patsias and Staszewski [133] applied the wavelet transform for the purpose of damage detection. The study presents a novel damage detection method based on optically measured mode shape data. To demonstrate the efficacy of this approach, they used a simple experiment employing a cantilever beam. The proposed methodology involves analyzing the captured image sequence using a wavelet transform. This includes calculating the argument and magnitude images based on the partial derivatives in the horizontal and vertical directions. The final edge representation is then obtained via a threshold operation. The procedure is repeated for all the images, and the displacement of the cantilever is evaluated in terms of the distance from the clamped end, employing a previously established routine. The plotted displacement data were analyzed to derive the corresponding power spectral density (PSD). The natural frequencies corresponding to the first four mode shapes were accurately identified and distinctly marked.
Figure 27 depicts the acquisition of an image sequence using sets of images. The sequence of images is subsequently utilized to construct the trajectories of the scrutinized features, also referred to as markers. Prior to this, an identification routine was executed to ascertain the coordinates of the markers. The edge contours were derived from these image sequences, capturing the dynamics of the structure’s motion. By applying a wavelet-based edge detection method to each image, the edge features were effectively highlighted. Furthermore, the process established a correlation between the various images, ensuring their coherence. To enhance precision, the measurements were converted from pixels to real-world measurements (mm).
Gupta et al. [134] recorded a video and tracked the edges of the object using the Canny edge detection technique. This algorithm comprises several steps that enable it to identify a diverse set of edges present within the images. A Gaussian filter was used to reduce noise, and the edges of interest were cropped. The algorithm identifies sharp changes or discontinuities in the intensity gradients, and the edge of interest was tracked across successive frames to obtain the motion. The averaged pixel displacement time-series data were obtained, and the unscaled mode shapes, damping ratios, and natural frequencies were extracted using the Eigensystem realization algorithm (ERA). The ERA involves arranging the singular values in a specific order to recognize the primary ‘real’ modes of the system. The singular values are derived from a singular value decomposition of a Hankel matrix created from the estimated response. Based on the number of modes selected as potentially ‘real’, the system matrices are rearranged. Following this, a truncated observability matrix and a shifted Hankel matrix are constructed. After obtaining the discrete system realization, the eigenvalues and eigenvectors of the system matrix are determined. Transforming the complex modes into real modes and converting from discrete time to continuous time yields the natural frequencies and damping ratios.
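The ERA steps outlined above can be summarized in a compact form. The notation below follows the common textbook formulation and is an assumption made for illustration (it is not taken from [134]): $y_k$ denotes the sampled response, $r$ and $c$ the chosen numbers of Hankel rows and columns, $n$ the number of retained singular values, and $\Delta t$ the sampling interval.

\[
H(k) = \begin{bmatrix}
y_{k} & y_{k+1} & \cdots & y_{k+c-1} \\
y_{k+1} & y_{k+2} & \cdots & y_{k+c} \\
\vdots & \vdots & & \vdots \\
y_{k+r-1} & y_{k+r} & \cdots & y_{k+r+c-2}
\end{bmatrix},
\qquad
H(0) = U \Sigma V^{T} \approx U_n \Sigma_n V_n^{T}
\]

\[
\hat{A} = \Sigma_n^{-1/2}\, U_n^{T}\, H(1)\, V_n\, \Sigma_n^{-1/2}
\]

\[
s_i = \frac{\ln \lambda_i}{\Delta t}, \qquad
\omega_i = \lvert s_i \rvert, \qquad
\zeta_i = -\frac{\operatorname{Re}(s_i)}{\lvert s_i \rvert}
\]

Here $\lambda_i$ are the eigenvalues of the identified discrete-time system matrix $\hat{A}$, $H(1)$ is the shifted Hankel matrix, and $\omega_i$ and $\zeta_i$ are the natural frequencies (in rad/s) and damping ratios of the corresponding continuous-time modes.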

4.4. Target Tracking

Liu and Yang [135] proposed a neural-network-based method for robust vibration frequency prediction. The networks are capable of accurately and reliably predicting the vibration frequency from image sequences obtained from a single camera. In the proposed image sequence analysis, the video is read as an image sequence, the region of interest (ROI) is selected, and the brightness of each pixel is saved as a separate vibration signal. The time-domain data obtained from the vibration signals are then used to create frequency-domain data. Figure 28 shows the implementation pipeline for frequency prediction.
Similar to the previous study, Liu et al. [136] introduced a new approach for measuring vibration frequency through the application of machine learning and a confidence kernel, using an industrial camera as a sensor. The vibration frequency predictions of the proposed method were compared with industry-standard vibration sensor results in the frequency domain. The frequency measurement results from nine excitations, from 5 to 45 Hz in 5 Hz increments, are presented in Table 5. The findings indicate that the proposed method can effectively predict the vibration frequency of the target object, with accuracy comparable to that of an industry-level vibration sensor. Notably, these predictions hold even in challenging real-field conditions without any additional enhancements or signal processing techniques.
The research in [137] presented a comparative analysis of classical and cutting-edge computer vision tracking algorithms. Their capacity to track the oscillatory movements that characterize vibrations was assessed using low- and high-frame-rate videos. The researchers conducted two sets of experiments, one with a cantilever and the other with a robot. The primary purpose of the study was to explore how vision-based systems can be used to analyze vibration from recorded videos and to identify the most suitable tracker available in OpenCV for motion tracking. The findings indicate that magnifying the motion in the videos using MATLAB and tracking it using OpenCV was successful in analyzing the recorded vibrations. Moreover, based on this qualitative study and the experiments, the CSRT tracker (Channel and Spatial Reliability Tracking) available in OpenCV was found to be the most suitable for motion tracking. In the cantilever experiments, the resonance frequency and damping ratios were comparable to those obtained with the laser vibrometer method. Figure 29 shows the fast Fourier transform (FFT) of the cantilever beam response obtained from the camera method using the CSRT tracker. Two amplitude peaks appear in Figure 29, at 5.8 Hz and 10.1 Hz, and these frequencies can therefore be considered the resonance frequencies of the cantilever.
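Reading resonance frequencies off an FFT plot such as Figure 29 amounts to picking the strongest spectral peaks of the tracked displacement. A minimal sketch is given below. It assumes the tracker output has already been reduced to a one-dimensional displacement trace (for example, the vertical centre of the tracked bounding box). The function `spectral_peaks` and its parameters are hypothetical, and the trace is synthetic rather than data from [137].

```python
import numpy as np

def spectral_peaks(displacement, fps, n_peaks=2, min_freq=0.5):
    """Return the frequencies (Hz) of the n_peaks strongest local maxima of the FFT."""
    x = np.asarray(displacement, dtype=float)
    x = x - x.mean()                                    # drop the static offset
    amplitude = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    # A bin is a peak if it is larger than both of its neighbours.
    inner = np.arange(1, len(amplitude) - 1)
    is_peak = (amplitude[inner] > amplitude[inner - 1]) & (amplitude[inner] > amplitude[inner + 1])
    candidates = inner[is_peak & (freqs[inner] >= min_freq)]
    strongest = candidates[np.argsort(amplitude[candidates])[::-1][:n_peaks]]
    return np.sort(freqs[strongest])

# Synthetic trace with components placed at 5.8 Hz and 10.1 Hz to mirror Figure 29.
fps = 60.0
t = np.arange(600) / fps
trace = 3.0 * np.sin(2 * np.pi * 5.8 * t) + 1.5 * np.sin(2 * np.pi * 10.1 * t)
resonances = spectral_peaks(trace, fps)
```

The frequency resolution is `fps / len(trace)`, so longer recordings or higher frame rates are needed to separate closely spaced modes.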
Lee and Shinozuka [138] proposed attaching a target panel with known geometry to the measuring spot and tracking it. As image frames are captured, the displacement of the target is determined using several image processing steps: target recognition, computation of pixel movements, calculation of the actual displacement using a transformation matrix and scaling factors, and display and storage of the calculated displacement. The amount of information to be processed depends on the number of pixels per frame and the number of frames per second. It is recommended that the region of interest (ROI) selected for the prior calculation of the transformation matrix and scaling factors encompass the four white spot regions. This precalculation is carried out over multiple frames, for example 30, and the resulting values are averaged to obtain a more robust transformation matrix and scaling factors, particularly under structural vibration. The ROI for target identification during the measurement phase, however, need not extend to all four white spots and may be reduced further; tracing only a single spot reduces the information to be processed in real time. To validate the method, field experiments were conducted in which the target coordinates were tracked over time and then plotted in the time domain and the frequency domain. The test results demonstrated adequate dynamic resolution in both amplitude and frequency.
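The calibration idea (four spots of known geometry, values averaged over about 30 frames, then conversion of pixel motion to actual displacement) can be sketched as below. The source does not state the form of the transformation matrix, so the least-squares affine map used here is an assumption, as are the function names and the panel dimensions in the usage example.

```python
import numpy as np

def fit_affine(pixel_pts, world_pts):
    """Least-squares 2x3 affine map from pixel (u, v) to physical (x, y) coordinates."""
    pixel_pts = np.asarray(pixel_pts, dtype=float)
    world_pts = np.asarray(world_pts, dtype=float)
    design = np.hstack([pixel_pts, np.ones((len(pixel_pts), 1))])   # rows [u, v, 1]
    coeffs, *_ = np.linalg.lstsq(design, world_pts, rcond=None)     # shape (3, 2)
    return coeffs.T                                                 # shape (2, 3)

def calibrate(spot_tracks, world_pts):
    """Average the affine map over calibration frames.

    spot_tracks: detected spot centres, shape (n_frames, 4, 2), in pixels.
    world_pts: known spot positions on the panel, shape (4, 2), e.g. in mm.
    """
    return np.mean([fit_affine(pts, world_pts) for pts in spot_tracks], axis=0)

def to_displacement(pixel_track, transform):
    """Convert one spot's pixel track (n, 2) to displacement relative to the first frame."""
    pixel_track = np.asarray(pixel_track, dtype=float)
    world = pixel_track @ transform[:, :2].T + transform[:, 2]
    return world - world[0]

# Hypothetical 100 mm square panel imaged at 0.5 mm per pixel, 30 calibration frames.
panel = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
pixels = np.repeat((panel / 0.5 + [320.0, 240.0])[None], 30, axis=0)
transform = calibrate(pixels, panel)
displacement_mm = to_displacement([[100.0, 100.0], [100.0, 110.0]], transform)
```

Because the displacement is taken relative to the first frame, only the linear part of the averaged map (scale and rotation) affects the result. This is consistent with tracing a single spot during the measurement phase.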

5. Conclusions

This review summarizes the image processing techniques used for crack analysis to monitor the variation of geometric features of various engineering structures. In this regard, structural health monitoring (SHM) plays a vital role in providing quantitative and reliable data on the real condition of a mechanical structure, examining its evolution, and detecting any degradation that appears on it. The study was conducted as a comprehensive review of a substantial number of published papers. Two types of SHM techniques were discussed: contact-based and non-contact methods. SHM can also be classified into four levels of damage identification: determining the presence of damage, locating the damage, recognizing the damage severity level, and estimating the structure’s remaining lifetime. The reviewed papers covered both contact-based (e.g., sensors, cameras, and accelerometers) and non-contact (e.g., infrared thermography, laser imaging, and photogrammetry) SHM techniques and investigated their advantages and disadvantages in terms of cost, time constraints, and accuracy. Additionally, crack analysis was classified according to the methodologies applied to crack detection, such as machine learning, image processing, artificial neural networks, support vector machines, and convolutional neural networks.
The study points to several techniques for detecting and predicting possible damage to structures that can be considered more effective than conventional methods. The main part of the study reviews the image processing algorithms used in crack detection investigations. These algorithms are not limited to detection and prediction; image acquisition, image preprocessing, image cropping and scaling, image enhancement, image detection, and segmentation are also reported. After the image is acquired, preprocessing techniques can be applied to remove possible noise and blur. The resulting image is then ready for postprocessing tasks such as detection and segmentation. Classical and modern techniques are available to perform those tasks, such as the Canny, Sobel, Prewitt, and hyperbolic tangent detectors. Traditional segmentation methods, such as Otsu thresholding, adaptive thresholding, and morphological segmentation, are also discussed. This review may inspire researchers to make these techniques portable and able to detect cracks remotely in distant structures (e.g., towers, bridges, and wind turbines).
Further research is recommended regarding the laser-based, non-contact measurement techniques previously discussed. By delving deeper into these pioneering methodologies, the paper’s overall comprehensiveness and relevance in the current crack detection landscape can be enhanced. These laser-based techniques, encompassing laser ultrasonic testing, laser interferometry, laser diffraction technology, and laser speckle measurement, have the potential to revolutionize non-contact crack measurement. Aligned with the swift progression of smart manufacturing, integrating these laser technologies into our paper offers a forward-looking perspective at the forefront of modern crack detection practices. This extension not only enhances the practicality of our paper but also contributes to advancing the entire field, offering heightened impact and applicability to researchers and practitioners alike.

Author Contributions

Conceptualization, Z.A., M.K. and B.H.S.A.; methodology, Z.A. and M.K.; resources, B.H.S.A.; writing—original draft preparation, Z.A. and B.H.S.A.; writing—review and editing, Z.A., M.K. and B.H.S.A.; visualization, Z.A. and B.H.S.A.; supervision, B.H.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Yao, Y.; Tung, S.T.E.; Glisic, B. Crack detection and characterization techniques—An overview. Struct. Control Health Monit. 2014, 21, 1387–1413. [Google Scholar] [CrossRef]
  2. Dong, C.Z.; Catbas, F.N. A review of computer vision–based structural health monitoring at local and global levels. Struct. Health Monit. 2021, 20, 692–743. [Google Scholar] [CrossRef]
  3. Sony, S.; Laventure, S.; Sadhu, A. A literature review of next-generation smart sensing technology in structural health monitoring. Struct. Control Health Monit. 2019, 26, e2321. [Google Scholar] [CrossRef]
  4. Flah, M.; Suleiman, A.R.; Nehdi, M.L. Classification and quantification of cracks in concrete structures using deep learning image-based techniques. Cem. Concr. Compos. 2020, 114, 103781. [Google Scholar] [CrossRef]
  5. LeBlanc, B.; Niezrecki, C.; Avitabile, P.; Chen, J.; Sherwood, J. Damage detection and full surface characterization of a wind turbine blade using three-dimensional digital image correlation. Struct. Health Monit. 2013, 12, 430–439. [Google Scholar] [CrossRef]
  6. Li, J.; Xie, X.; Yang, G.; Zhang, B.; Siebert, T.; Yang, L. Whole-field thickness strain measurement using multiple camera digital image correlation system. Opt. Lasers Eng. 2017, 90, 19–25. [Google Scholar] [CrossRef]
  7. Dabous, S.A.; Feroz, S. Condition monitoring of bridges with non-contact testing technologies. Autom. Constr. 2020, 116, 103224. [Google Scholar] [CrossRef]
  8. Feng, D.; Feng, M.Q.; Ozer, E.; Fukuda, Y. A vision-based sensor for noncontact structural displacement measurement. Sensors 2015, 15, 16557–16575. [Google Scholar] [CrossRef]
  9. Kou, X.; Pei, C.; Chen, Z. Fully noncontact inspection of closed surface crack with nonlinear laser ultrasonic testing method. Ultrasonics 2021, 114, 106426. [Google Scholar] [CrossRef]
  10. Zhu, D.; Cheng, Q.; He, J.; Hong, W.; Liu, W.; Yang, S.; Wang, D. Differential two-wave mixing interferometer for crack detection in metallic structures based on laser-induced ultrasound. Opt. Lasers Eng. 2023, 164, 107485. [Google Scholar] [CrossRef]
  11. Kang, K.C.; Park, K.K. Noncontact laser ultrasound detection of cracks using hydrophone. Sensors 2021, 21, 3371. [Google Scholar] [CrossRef] [PubMed]
  12. Gao, F.; Zhou, H.; Huang, C. Defect detection using the phased-array laser ultrasonic crack diffraction enhancement method. Opt. Commun. 2020, 474, 126070. [Google Scholar] [CrossRef]
  13. Wen, T.K.; Yin, C.C. Crack detection in photovoltaic cells by interferometric analysis of electronic speckle patterns. Sol. Energy Mater. Sol. Cells 2012, 98, 216–223. [Google Scholar] [CrossRef]
  14. Kaczmarek, R.; Dupré, J.C.; Doumalin, P.; Pop, O.; Teixeira, L.; Huger, M. High-temperature digital image correlation techniques for full-field strain and crack length measurement on ceramics at 1200 °C: Optimization of speckle pattern and uncertainty assessment. Opt. Lasers Eng. 2021, 146, 106716. [Google Scholar] [CrossRef]
  15. Wang, T.; Wang, Y.; Yang, X.; Chen, B.; Zhu, H. Cracks and process control in laser powder bed fusion of Al-Zn-Mg alloy. J. Manuf. Process. 2022, 81, 571–579. [Google Scholar] [CrossRef]
  16. Wall, A.; Benoit, M.J. A Review of Existing Solidification Crack Tests and Analysis of Their Transferability to Additive Manufacturing. J. Mater. Process. Technol. 2023, 320, 118090. [Google Scholar] [CrossRef]
  17. Liu, T.; Ji, Z.; Ding, Y.; Zhu, Y. Real-Time Laser Interference Detection of Mechanical Targets Using a 4R Manipulator. Sensors 2023, 23, 2794. [Google Scholar] [CrossRef]
  18. Erkal, B.G.; Hajjar, J.F. Laser-based surface damage detection and quantification using predicted surface properties. Autom. Constr. 2017, 83, 285–302. [Google Scholar] [CrossRef]
  19. Liu, N.; Song, W.; Zhao, Q. Morphology and maximum entropy image segmentation based urban pavement cracks detection. J. Liaoning Tech. Univ. Nat. Sci. Ed. 2015, 34, 57–61. [Google Scholar]
  20. Othman, Z.; Abdullah, A.; Kasmin, F.; Ahmad, S.S.S. Road crack detection using adaptive multi resolution thresholding techniques. TELKOMNIKA Telecommun. Comput. Electron. Control. 2019, 17, 1874–1881. [Google Scholar] [CrossRef]
  21. Song, C.; Wu, L.; Chen, Z.; Zhou, H.; Wu, Z. Pixel-Level Crack Detection in Images Using SegNet. In Multi-Disciplinary Trends in Artificial Intelligence Proceedings of the 13th International Conference, MIWAI 2019, Kuala Lumpur, Malaysia, 17–19 November 2019; Springer: Berlin, Germany, 2019. [Google Scholar]
  22. Prabakar, C.V.; Nagarajan, C.K. A novel approach of surface crack detection using super pixel segmentation. Mater. Today Proc. 2021, 42, 1043–1049. [Google Scholar] [CrossRef]
  23. Harjit, K.; Rajandeep, K. A Review on Crack Detection and Parameters Estimation on Road Images. Int. J. Res. Appl. Sci. Eng. Technol. 2017, 5, 1–4. [Google Scholar]
  24. Parente, L.; Castagnetti, C.; Falvo, E.; Rossi, P.; Grassi, F.; Mancini, F.; Capra, A. Towards an automated machine learning and image processing supported procedure for crack monitoring. In Proceedings of the 5th Joint International Symposium on Deformation Monitoring (JISDM 2022)—Editorial Universitat Politècnica de València, València, Spain, 20–22 June 2023; pp. 237–242. [Google Scholar]
  25. Hsieh, Y.A.; Tsai, Y.J. Machine learning for crack detection: Review and model performance comparison. J. Comput. Civ. Eng. 2020, 34, 04020038. [Google Scholar] [CrossRef]
  26. Peng, J.; Zhang, S.; Peng, D.; Liang, K. Research on bridge crack detection with neural network based image processing methods. In Proceedings of the 2018 12th International Conference on Reliability, Maintainability, and Safety (ICRMS), Shanghai, China, 17–19 October 2018; pp. 419–428. [Google Scholar]
  27. Hoang, N.D. Image processing-based recognition of wall defects using machine learning approaches and steerable filters. Comput. Intell. Neurosci. 2018, 2018, 7913952. [Google Scholar] [CrossRef]
  28. Munawar, H.S.; Hammad, A.W.; Haddad, A.; Soares, C.A.P.; Waller, S.T. Image-based crack detection methods: A review. Infrastructures 2021, 6, 115. [Google Scholar] [CrossRef]
  29. Kim, C.N.; Kawamura, K.; Nakamura, H.; Tarighat, A. Automatic crack detection for concrete infrastructures using image processing and deep learning. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Tokyo, Japan, 26–29 February 2020; IOP Publishing: Bristol, UK, 2020; Volume 829, p. 012027. [Google Scholar]
  30. Miao, Y.; Jeon, J.Y.; Park, G. An image processing-based crack detection technique for pressed panel products. J. Manuf. Syst. 2020, 57, 287–297. [Google Scholar] [CrossRef]
  31. Shariati, A.; Schumacher, T.; Ramanna, N. Exploration of Video-Based Structural Health Monitoring Techniques; Technical Report; Rutgers University, Center for Advanced Infrastructure & Transportation: New Brunswick, NJ, USA, 2014. [Google Scholar]
  32. Zimmermann, M.; Gülan, U.; Harmanci, Y.E.; Chatzi, E.N.; Holzner, M. Structural health monitoring through video recording. In Proceedings of the 8th European Workshop on Structural Health Monitoring (EWSHM 2016), Bilbao, Spain, 5–8 July 2016; pp. 5–8. [Google Scholar]
  33. Schumacher, T.; Shariati, A. Monitoring of structures and mechanical systems using virtual visual sensors for video analysis: Fundamental concept and proof of feasibility. Sensors 2013, 13, 16551–16564. [Google Scholar] [CrossRef]
  34. Taghavi Larigani, S.; Heaton, T.H. Characterizing Deformation of Buildings from Videos; California Institute of Technology: Pasadena, CA, USA, 2016. [Google Scholar]
  35. Medhi, M.; Dandautiya, A.; Raheja, J.L. Real-time video surveillance based structural health monitoring of civil structures using artificial neural network. J. Nondestruct. Eval. 2019, 38, 1–16. [Google Scholar] [CrossRef]
  36. Dworakowski, Z.; Kohut, P.; Gallina, A.; Holak, K.; Uhl, T. Vision-based algorithms for damage detection and localization in structural health monitoring. Struct. Control Health Monit. 2016, 23, 35–50. [Google Scholar] [CrossRef]
  37. Fukuda, Y.; Feng, M.Q.; Narita, Y.; Kaneko, S.; Tanaka, T. Vision-based displacement sensor for monitoring dynamic response using robust object search algorithm. IEEE Sens. J. 2013, 13, 4725–4732. [Google Scholar] [CrossRef]
  38. Xu, Y.; Brownjohn, J.M. Review of machine-vision based methodologies for displacement measurement in civil structures. J. Civ. Struct. Health Monit. 2018, 8, 91–110. [Google Scholar] [CrossRef]
  39. Scholar, P. Review and analysis of crack detection and classification techniques based on crack types. Int. J. Appl. Eng. Res 2018, 13, 6056–6062. [Google Scholar]
  40. Mohan, A.; Poobal, S. Crack detection using image processing: A critical review and analysis. Alex. Eng. J. 2018, 57, 787–798. [Google Scholar] [CrossRef]
  41. Ali, R.; Gopal, D.L.; Cha, Y.J. Vision-based concrete crack detection technique using cascade features. In Proceedings of the Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems, Denver, CO, USA, 5–8 March 2018; Volume 10598, pp. 147–153. [Google Scholar]
  42. Dorafshan, S.; Thomas, R.J.; Maguire, M. Benchmarking image processing algorithms for unmanned aerial system-assisted crack detection in concrete structures. Infrastructures 2019, 4, 19. [Google Scholar] [CrossRef]
  43. Han, H.; Deng, H.; Dong, Q.; Gu, X.; Zhang, T.; Wang, Y. An advanced Otsu method integrated with edge detection and decision tree for crack detection in highway transportation infrastructure. Adv. Mater. Sci. Eng. 2021, 2021, 9205509. [Google Scholar] [CrossRef]
  44. Wang, P.; Huang, H. Comparison analysis on present image-based crack detection methods in concrete structures. In Proceedings of the 2010 3rd International Congress on Image and Signal Processing, Yantai, China, 16–18 October 2010; Volume 5, pp. 2530–2533. [Google Scholar]
  45. Liu, X.; Ai, Y.; Scherer, S. Robust image-based crack detection in concrete structure using multi-scale enhancement and visual features. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 2304–2308. [Google Scholar]
  46. Daneshgaran, F.; Zacheo, L.; Stasio, F.D.; Mondin, M. Use of deep learning for automatic detection of cracks in tunnels: Prototype-2 developed in the 2017–2018 time period. Transp. Res. Rec. 2019, 2673, 44–50. [Google Scholar] [CrossRef]
  47. Perez, H.; Tah, J.H.; Mosavi, A. Deep learning for detecting building defects using convolutional neural networks. Sensors 2019, 19, 3556. [Google Scholar] [CrossRef]
  48. Zhang, L.; Shen, J.; Zhu, B. A research on an improved Unet-based concrete crack detection algorithm. Struct. Health Monit. 2021, 20, 1864–1879. [Google Scholar] [CrossRef]
  49. Ye, X.W.; Jin, T.; Chen, P.Y. Structural crack detection using deep learning–based fully convolutional networks. Adv. Struct. Eng. 2019, 22, 3412–3419. [Google Scholar] [CrossRef]
  50. Han, Y.; Liu, Z.; Lyu, Y.; Liu, K.; Li, C.; Zhang, W. Deep learning-based visual ensemble method for high-speed railway catenary clevis fracture detection. Neurocomputing 2020, 396, 556–568. [Google Scholar] [CrossRef]
  51. Ren, Y.; Huang, J.; Hong, Z.; Lu, W.; Yin, J.; Zou, L.; Shen, X. Image-based concrete crack detection in tunnels using deep fully convolutional networks. Constr. Build. Mater. 2020, 234, 117367. [Google Scholar] [CrossRef]
  52. Hoang, N.D.; Nguyen, Q.L. Metaheuristic optimized edge detection for recognition of concrete wall cracks: A comparative study on the performances of roberts, prewitt, canny, and sobel algorithms. Adv. Civ. Eng. 2018, 2018, 1–16. [Google Scholar] [CrossRef]
  53. Stentoumis, C.; Protopapadakis, E.; Doulamis, A.; Doulamis, N. A holistic approach for inspection of civil infrastructures based on computer vision techniques. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 131–138. [Google Scholar] [CrossRef]
  54. Chen, J.G.; Davis, A.; Wadhwa, N.; Durand, F.; Freeman, W.T.; Büyüköztürk, O. Video camera–based vibration measurement for civil infrastructure applications. J. Infrastruct. Syst. 2017, 23, B4016013. [Google Scholar] [CrossRef]
  55. Ni, T.; Zhou, R.; Gu, C.; Yang, Y. Measurement of concrete crack feature with android smartphone APP based on digital image processing techniques. Measurement 2020, 150, 107093. [Google Scholar] [CrossRef]
  56. Kang, S.M.; Chun, C.J.; Shim, S.B.; Ryu, S.K.; Baek, J.D. Real Time Image Processing System for Detecting Infrastructure Damage: Crack. In Proceedings of the 2019 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 11–13 January 2019; pp. 1–3. [Google Scholar]
  57. Otero, L.D.; Moyou, M.; Peter, A.; Otero, C.E. Towards a Remote Sensing System for Railroad Bridge Inspections: A Concrete Crack Detection Component. In Proceedings of the SoutheastCon 2018, St. Petersburg, FL, USA, 19–22 April 2018; pp. 1–4. [Google Scholar]
  58. Vijayan, V.; Joy, C.M.; Shailesh, S. A Survey on Surface Crack Detection in Concretes using Traditional, Image Processing, Machine Learning, and Deep Learning Techniques. In Proceedings of the 2021 International Conference on Communication, Control and Information Sciences (ICCISc), Idukki, India, 16–18 June 2021; Volume 1, pp. 1–6. [Google Scholar]
  59. Talab, A.M.A.; Huang, Z.; Xi, F.; HaiMing, L. Detection crack in image using Otsu method and multiple filtering in image processing techniques. Optik 2016, 127, 1030–1033. [Google Scholar] [CrossRef]
  60. Sankarasrinivasan, S.; Balasubramanian, E.; Karthik, K.; Chandrasekar, U.; Gupta, R. Health monitoring of civil structures with integrated UAV and image processing system. Procedia Comput. Sci. 2015, 54, 508–515. [Google Scholar] [CrossRef]
  61. Cho, O.H.; Kim, J.C.; Kim, E.K. Context-aware high-rise structure cracks image monitoring system using unmanned aerial vehicles. Int. J. Control Autom. 2016, 9, 11–18. [Google Scholar] [CrossRef]
  62. Hoang, N.D. Detection of surface crack in building structures using image processing technique with an improved Otsu method for image thresholding. Adv. Civ. Eng. 2018, 2018, 3924120. [Google Scholar] [CrossRef]
  63. Peng, T.; Kavya, T.S.; Jang, Y.M.; Kim, B.W. Concrete Crack Detection using Relative Standard Deviation for Image Thresholding. Int. J. Eng. Res. Technol. 2020, 13, 2720. [Google Scholar] [CrossRef]
  64. Hussain, Z.; Agarwal, D. A comparative analysis of edge detection techniques used in flame image processing. Int. J. Adv. Res. Sci. Eng. IJARSE 2015, 4, 3703–3711. [Google Scholar]
  65. Sia, J.S.Y.; Tan, T.S.; Yahya, A.B.; Tiong, M.F.T.; Sia, J.Y.X. Mini Kirsch Edge Detection and Its Sharpening Effect. Indones. J. Electr. Eng. Inform. IJEEI 2021, 9, 228–244. [Google Scholar]
  66. Öztürk, Ş.; Akdemir, B. Comparison of edge detection algorithms for texture analysis on glass production. Procedia-Soc. Behav. Sci. 2015, 195, 2675–2682. [Google Scholar] [CrossRef]
  67. Al-Amri, S.S.; Kalyankar, N.; Khamitkar, S. Image segmentation by using edge detection. Int. J. Comput. Sci. Eng. 2010, 2, 804–807. [Google Scholar]
  68. Pereira, F.C.; Pereira, C.E. Embedded image processing systems for automatic recognition of cracks using UAVs. IFAC PapersOnLine 2015, 48, 16–21. [Google Scholar] [CrossRef]
  69. Syahrian, N.M.; Risma, P.; Dewi, T. Vision-based pipe monitoring robot for crack detection using canny edge detection method as an image processing technique. Kinet. Game Technol. Inf. Syst. Comput. Netw. Comput. Electron. Control. 2017, 2, 243–250. [Google Scholar] [CrossRef]
  70. He, M.; Li, J.; Zhang, Y.; Li, W. Research on crack visualization method for dynamic detection of eddy current thermography. NDT E Int. 2020, 116, 102361. [Google Scholar] [CrossRef]
  71. Wang, W.; Li, L.; Han, Y. Crack detection in shadowed images on gray level deviations in a moving window and distance deviations between connected components. Constr. Build. Mater. 2021, 271, 121885. [Google Scholar] [CrossRef]
  72. Wang, Y.; Zhang, J.Y.; Liu, J.X.; Zhang, Y.; Chen, Z.P.; Li, C.G.; He, K.; Yan, R.B. Research on crack detection algorithm of the concrete bridge based on image processing. Procedia Comput. Sci. 2019, 154, 610–616. [Google Scholar] [CrossRef]
  73. Shrivakshan, G.; Chandrasekar, C. A comparison of various edge detection techniques used in image processing. Int. J. Comput. Sci. Issues IJCSI 2012, 9, 269. [Google Scholar]
  74. Lee, C.; Zhang, A.; Yu, B.; Park, S. Comparison study between RMS and edge detection image processing algorithms for a pulsed laser UWPI (Ultrasonic wave propagation imaging)-based NDT technique. Sensors 2017, 17, 1224. [Google Scholar] [CrossRef] [PubMed]
  75. Lim, R.S.; La, H.M.; Sheng, W. A robotic crack inspection and mapping system for bridge deck maintenance. IEEE Trans. Autom. Sci. Eng. 2014, 11, 367–378. [Google Scholar] [CrossRef]
  76. Lim, R.S.; La, H.M.; Shan, Z.; Sheng, W. Developing a crack inspection robot for bridge maintenance. In Proceedings of the 2011 IEEE International Conference on Robotics and Automation, Shanghai, China, 9–13 May 2011; pp. 6288–6293. [Google Scholar]
  77. Shen, G. Road crack detection based on video image processing. In Proceedings of the 2016 3rd International Conference on Systems and Informatics (ICSAI), Shanghai, China, 19–21 November 2016; pp. 912–917. [Google Scholar]
  78. Peng, L.; Chao, W.; Shuangmiao, L.; Baocai, F. Research on crack detection method of airport runway based on twice-threshold segmentation. In Proceedings of the 2015 Fifth International Conference on Instrumentation and Measurement, Computer, Communication and Control (IMCCC), Qinhuangdao, China, 18–20 September 2015; pp. 1716–1720. [Google Scholar]
  79. Senthikumar, M.; Palanisamy, V.; Jaya, J. Metal surface defect detection using iterative thresholding technique. In Proceedings of the Second International Conference on Current Trends in Engineering and Technology-ICCTET 2014, Coimbatore, India, 8 July 2014; pp. 561–564. [Google Scholar]
  80. Fan, R.; Bocus, M.J.; Zhu, Y.; Jiao, J.; Wang, L.; Ma, F.; Cheng, S.; Liu, M. Road crack detection using deep convolutional neural network and adaptive thresholding. In Proceedings of the 2019 IEEE Intelligent Vehicles Symposium (IV), Paris, France, 9–12 June 2019; pp. 474–479. [Google Scholar]
  81. Song, M.; Cui, D.; Yu, C.; An, J.; Chang, C.I.; Song, M. Crack detection algorithm for photovoltaic image based on multi-scale pyramid and improved region growing. In Proceedings of the 2018 IEEE 3rd International Conference on Image, Vision and Computing (ICIVC), Chongqing, China, 27–29 June 2018; pp. 128–132. [Google Scholar]
  82. Shivaprasad, K.; Vishwanath, M.; Narasimha, K. Morphology based surface crack detection. J. Adv. Res. Sci. 2015, 1, 15–20. [Google Scholar]
  83. Koshy, S.; Radhakrishnan, B.; Suresh, L.P. Strength analysis of buildings using image processing and SHM principles. In Proceedings of the 2016 International Conference on Emerging Technological Trends (ICETT), Kollam, India, 21–22 October 2016; pp. 1–6. [Google Scholar]
  84. Orak, M.S.; Ozturk, T. Monitoring cantilever beam with a vision-based algorithm and smartphone. Vibroeng. Procedia 2018, 17, 107–111. [Google Scholar] [CrossRef]
  85. Zhang, Y.; Zhao, X.; Liu, P. Multi-point displacement monitoring based on full convolutional neural network and smartphone. IEEE Access 2019, 7, 139628–139634. [Google Scholar] [CrossRef]
  86. Li, S.; Zhao, X.; Zhou, G. Automatic pixel-level multiple damage detection of concrete structure using fully convolutional network. Comput.-Aided Civ. Infrastruct. Eng. 2019, 34, 616–634. [Google Scholar] [CrossRef]
  87. Zhang, L.; Yang, F.; Zhang, Y.D.; Zhu, Y.J. Road crack detection using deep convolutional neural network. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3708–3712. [Google Scholar]
  88. Kim, H.; Lee, J.; Ahn, E.; Cho, S.; Shin, M.; Sim, S.H. Concrete crack identification using a UAV incorporating hybrid image processing. Sensors 2017, 17, 2052. [Google Scholar] [CrossRef]
  89. Kim, H.; Sim, S.H.; Cho, S. Unmanned aerial vehicle (UAV)-powered concrete crack detection based on digital image processing. In Proceedings of the International Conference on Advances in Experimental Structural Engineering, Chicago, IL, USA, 1–2 August 2015. [Google Scholar]
  90. Cao, J.; Zhang, K.; Yuan, C.; Xu, S. Automatic road cracks detection and characterization based on mean shift. Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao J. Comput.-Aided Des. Comput. Graph. 2014, 26, 1450–1459. [Google Scholar]
  91. Hu, Y.; Zhao, C.X.; Wang, H.N. Automatic pavement crack detection using texture and shape descriptors. IETE Tech. Rev. 2010, 27, 398–405. [Google Scholar] [CrossRef]
  92. Jahanshahi, M.R.; Masri, S.F.; Padgett, C.W.; Sukhatme, G.S. An innovative methodology for detection and quantification of cracks through incorporation of depth perception. Mach. Vis. Appl. 2013, 24, 227–241. [Google Scholar] [CrossRef]
  93. Prasanna, P.; Dana, K.J.; Gucunski, N.; Basily, B.B.; La, H.M.; Lim, R.S.; Parvardeh, H. Automated crack detection on concrete bridges. IEEE Trans. Autom. Sci. Eng. 2014, 13, 591–599. [Google Scholar] [CrossRef]
  94. Lee, B.Y.; Kim, Y.Y.; Yi, S.T.; Kim, J.K. Automated image processing technique for detecting and analysing concrete surface cracks. Struct. Infrastruct. Eng. 2013, 9, 567–577. [Google Scholar] [CrossRef]
  95. Cord, A.; Chambon, S. Automatic road defect detection by textural pattern recognition based on AdaBoost. Comput.-Aided Civ. Infrastruct. Eng. 2012, 27, 244–259. [Google Scholar] [CrossRef]
  96. Salehi, H.; Biswas, S.; Burgueño, R. Data interpretation framework integrating machine learning and pattern recognition for self-powered data-driven damage identification with harvested energy variations. Eng. Appl. Artif. Intell. 2019, 86, 136–153. [Google Scholar] [CrossRef]
  97. Gavilán, M.; Balcones, D.; Marcos, O.; Llorca, D.F.; Sotelo, M.A.; Parra, I.; Ocaña, M.; Aliseda, P.; Yarza, P.; Amírola, A. Adaptive road crack detection system by pavement classification. Sensors 2011, 11, 9628–9657. [Google Scholar] [CrossRef] [PubMed]
  98. Ersoz, A.B.; Pekcan, O.; Teke, T. Crack identification for rigid pavements using unmanned aerial vehicles. In Proceedings of the IOP Conference Series: Materials Science and Engineering, Prague, Czech Republic, 21–22 September 2017; IOP Publishing: Bristol, UK, 2017; Volume 236, p. 012101. [Google Scholar]
  99. Kotsiantis, S.B. Decision trees: A recent overview. Artif. Intell. Rev. 2013, 39, 261–283. [Google Scholar] [CrossRef]
  100. Wu, W.; Liu, Z.; He, Y. Classification of defects with ensemble methods in the automated visual inspection of sewer pipes. Pattern Anal. Appl. 2015, 18, 263–276. [Google Scholar] [CrossRef]
  101. Cover, T.; Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory 1967, 13, 21–27.
  102. Uddin, S.; Haque, I.; Lu, H.; Moni, M.A.; Gide, E. Comparative performance analysis of K-nearest neighbour (KNN) algorithm and its different variants for disease prediction. Sci. Rep. 2022, 12, 6256.
  103. Zhang, W.; Zhang, Z.; Qi, D.; Liu, Y. Automatic crack detection and classification method for subway tunnel safety monitoring. Sensors 2014, 14, 19307–19328.
  104. Kirasich, K.; Smith, T.; Sadler, B. Random forest vs logistic regression: Binary classification for heterogeneous datasets. SMU Data Sci. Rev. 2018, 1, 9.
  105. Binte Kibria, H.; Matin, A. The Severity Prediction of The Binary And Multi-Class Cardiovascular Disease–A Machine Learning-Based Fusion Approach. arXiv 2022, arXiv:2203.04921.
  106. Shi, Y.; Cui, L.; Qi, Z.; Meng, F.; Chen, Z. Automatic road crack detection using random structured forests. IEEE Trans. Intell. Transp. Syst. 2016, 17, 3434–3445.
  107. Yang, X.; Li, H.; Yu, Y.; Luo, X.; Huang, T.; Yang, X. Automatic pixel-level crack detection and measurement using fully convolutional network. Comput.-Aided Civ. Infrastruct. Eng. 2018, 33, 1090–1109.
  108. Santur, Y.; Karaköse, M.; Akin, E. Random forest based diagnosis approach for rail fault inspection in railways. In Proceedings of the 2016 National Conference on Electrical, Electronics and Biomedical Engineering (ELECO), Bursa, Turkey, 1–3 December 2016; pp. 745–750.
  109. Ibrahim, I.; Abdulazeez, A. The Role of machine learning algorithms for diagnosing diseases. J. Appl. Sci. Technol. Trends 2021, 2, 10–19.
  110. Landstrom, A.; Thurley, M.J. Morphology-based crack detection for steel slabs. IEEE J. Sel. Top. Signal Process. 2012, 6, 866–875.
  111. Park, K.; Torbol, M. Visual-based laser speckle pattern recognition method for structural health monitoring. In Proceedings of the Sensors and Smart Structures Technologies for Civil, Mechanical, and Aerospace Systems 2017, Portland, OR, USA, 25–29 March 2017; Volume 10168, pp. 114–120.
  112. Lei, B.; Wang, N.; Xu, P.; Song, G. New crack detection method for bridge inspection using UAV incorporating image processing. J. Aerosp. Eng. 2018, 31, 04018058.
  113. Shukla, S.; Naganna, S. A review on K-means data clustering approach. Int. J. Inf. Comput. Technol. 2014, 4, 1847–1860.
  114. Wang, X.; Liu, Y.; Xin, H. Bond strength prediction of concrete-encased steel structures using hybrid machine learning method. Structures 2021, 32, 2279–2292.
  115. Moon, H.G.; Kim, J.H. Intelligent crack detecting algorithm on the concrete crack image using neural network. Proc. 28th ISARC 2011, 2011, 1461–1467.
  116. Xu, Y.; Li, S.; Zhang, D.; Jin, Y.; Zhang, F.; Li, N.; Li, H. Identification framework for cracks on a steel structure surface by a restricted Boltzmann machines algorithm based on consumer-grade camera images. Struct. Control Health Monit. 2018, 25, e2075.
  117. Yang, A.Y.; Cheng, L. Two-step surface damage detection scheme using convolutional neural network and artificial neural neural. arXiv 2020, arXiv:2003.10760.
  118. Fan, Z.; Wu, Y.; Lu, J.; Li, W. Automatic pavement crack detection based on structured prediction with the convolutional neural network. arXiv 2018, arXiv:1802.02208.
  119. Tan, C.; Uddin, N.; Mohammed, Y.M. Deep learning-based crack detection using mask R-CNN technique. In Proceedings of the 9th International Conference on Structural Health Monitoring of Intelligent Infrastructure, St. Louis, MO, USA, 5–7 August 2019.
  120. Cha, Y.J.; Choi, W.; Büyüköztürk, O. Deep learning-based crack damage detection using convolutional neural networks. Comput.-Aided Civ. Infrastruct. Eng. 2017, 32, 361–378.
  121. Dorafshan, S.; Thomas, R.J.; Coopmans, C.; Maguire, M. Deep learning neural networks for sUAS-assisted structural inspections: Feasibility and application. In Proceedings of the 2018 International Conference on Unmanned Aircraft Systems (ICUAS), Dallas, TX, USA, 12–15 June 2018; pp. 874–882.
  122. Gopalakrishnan, K.; Khaitan, S.K.; Choudhary, A.; Agrawal, A. Deep convolutional neural networks with transfer learning for computer vision-based data-driven pavement distress detection. Constr. Build. Mater. 2017, 157, 322–330.
  123. Feng, C.; Liu, M.Y.; Kao, C.C.; Lee, T.Y. Deep active learning for civil infrastructure defect detection and classification. Comput. Civ. Eng. 2017, 2017, 298–306.
  124. Li, S.; Zhao, X. Image-based concrete crack detection using convolutional neural network and exhaustive search technique. Adv. Civ. Eng. 2019, 2019, 6520620.
  125. Guo, X.; Yuan, Y.; Liu, Y. Crack propagation detection method in the structural fatigue process. Exp. Tech. 2021, 45, 169–178.
  126. Jia, X.; Luo, W. Crack Damage Detection of Bridge Based on Convolutional Neural Networks. In Proceedings of the 2019 Chinese Control and Decision Conference (CCDC), Nanchang, China, 3–5 June 2019; pp. 3995–4000.
  127. Tong, Z.; Gao, J.; Han, Z.; Wang, Z. Recognition of asphalt pavement crack length using deep convolutional neural networks. Road Mater. Pavement Des. 2018, 19, 1334–1349.
  128. Chen, J.G.; Wadhwa, N.; Cha, Y.J.; Durand, F.; Freeman, W.T.; Buyukozturk, O. Structural modal identification through high speed camera video: Motion magnification. In Topics in Modal Analysis I, Volume 7: Proceedings of the 32nd IMAC, A Conference and Exposition on Structural Dynamics, Orlando, FL, USA, 3–6 February 2014; Springer: Berlin/Heidelberg, Germany, 2014; pp. 191–197.
  129. Civera, M.; Zanotti Fragonara, L.; Surace, C. An experimental study of the feasibility of phase-based video magnification for damage detection and localisation in operational deflection shapes. Strain 2020, 56, e12336.
  130. Chen, J.G.; Adams, T.M.; Sun, H.; Bell, E.S.; Büyüköztürk, O. Camera-based vibration measurement of the World War I Memorial Bridge in Portsmouth, New Hampshire. J. Struct. Eng. 2018, 144, 04018207.
  131. Chen, J.G.; Wadhwa, N.; Cha, Y.J.; Durand, F.; Freeman, W.T.; Buyukozturk, O. Modal identification of simple structures with high-speed video using motion magnification. J. Sound Vib. 2015, 345, 58–71.
  132. Ferrer, B.; Espinosa, J.; Roig, A.B.; Perez, J.; Mas, D. Vibration frequency measurement using a local multithreshold technique. Opt. Express 2013, 21, 26198–26208.
  133. Patsias, S.; Staszewskiy, W. Damage detection using optical measurements and wavelets. Struct. Health Monit. 2002, 1, 5–22.
  134. Gupta, P.; Rajput, H.S.; Law, M. Vision-based modal analysis of cutting tools. CIRP J. Manuf. Sci. Technol. 2021, 32, 91–107.
  135. Liu, J.; Yang, X. Learning to see the vibration: A neural network for vibration frequency prediction. Sensors 2018, 18, 2530.
  136. Liu, J.; Yang, X.; Zhu, M. Neural network with confidence kernel for robust vibration frequency prediction. J. Sens. 2019, 2019, 6573513.
  137. Muralidharan, P.K.; Yanamadala, H. Comparative Study of Vision Camera-based Vibration Analysis with the Laser Vibrometer Method. 2021. Available online: https://www.diva-portal.org/smash/record.jsf?pid=diva2%3A1608598&dswid=-3097 (accessed on 3 August 2023).
  138. Lee, J.J.; Shinozuka, M. Real-time displacement measurement of a flexible bridge using digital image processing techniques. Exp. Mech. 2006, 46, 105–114.
Figure 1. The architecture of image-processing-based crack detection.
Figure 2. Different types of image acquisition.
Figure 3. Illustration of data annotation and augmentation [51].
Figure 4. Imaging system: acquired image can be modeled by blur and noise functions.
Figure 5. Image reconstruction process: blind deconvolution block for deblurring and filter block for denoising are used.
Figure 6. Canny and hyperbolic edge detectors [60].
Figure 7. The results of Canny edge detection processes [69].
Figure 8. Shadow-like black noise block image segmentation by different algorithms [71].
Figure 9. The improved edge segmentation results [72].
Figure 10. Crack detection results on a real bridge deck: (a) original image, (b) crack detection result, and (c) cracks superimposed on the original image [76].
Figure 11. The original image [42].
Figure 12. Edge detection for the test cracked images on the original image using (a) Roberts, (b) Prewitt, (c) Sobel, (d) LoG, (e) Butterworth, and (f) Gaussian filters [42].
Figure 13. Defect detection proposed in [79]: (a) color image and (b) iterative threshold output.
Figure 14. Comparisons of Canny algorithm with the proposed method [81].
Figure 15. Machine learning types [69].
Figure 16. Support Vector Machine [96].
Figure 17. k-nearest neighbor algorithm (KNN) [102].
Figure 18. Generalized structure for the random forest [105].
Figure 19. Logistic regression: sigmoid function and decision boundary.
Figure 20. K-means clustering algorithm process [113].
Figure 21. The architecture of the Artificial Neural Network (ANN) [114].
Figure 22. The architecture of the convolutional neural network [117].
Figure 23. Comparison of displacements derived via accelerometers and PTV for three reference locations [32].
Figure 24. Frequency space comparison between displacements derived from the camera, laser vibrometer, and accelerometers [128].
Figure 25. Preliminary depiction of the time-history extraction and the subsequent Fourier transformation [129].
Figure 26. (a) Relative variation with respect to the first frame (absolute value) of the pixel number, and (b) Fourier transform of the signals in (a) [132].
Figure 27. Feature-based image sequence analysis.
Figure 28. Implementation pipeline of the proposed method: (a) read in the ROI video as an image sequence and save as separate pixel brightness variation signals, then feed in the ConvNet; (b) network output prediction result visualization; (c) optional edge enhancement operation [135].
Figure 29. FFT analysis using CSRT tracker for cantilever beam [137].
Table 1. Processing time of different models.
| | DFP-Prewitt | DFP-Roberts | DFP-Sobel | DFP-Canny |
|---|---|---|---|---|
| Processing time (s) | 74.69 | 78.13 | 85.94 | 79.69 |
Table 2. The evaluation of several edge detection approaches in the suggested algorithm for crack detection.
| Domain | Edge Detector | TPR ¹ (%) | TNR ² (%) | FPR ³ (%) | FNR ⁴ (%) | Ac ⁵ (%) | Pr ⁶ (%) | MCW ⁷ (mm) | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| Spatial | Roberts | 64 | 90 | 10 | 36 | 77 | 86 | 0.4 | 1.67 |
| Spatial | Prewitt | 82 | 82 | 18 | 18 | 82 | 82 | 0.2 | 1.4 |
| Spatial | Sobel | 86 | 84 | 16 | 14 | 85 | 84 | 0.2 | 1.4 |
| Spatial | LoG | 98 | 86 | 14 | 2 | 92 | 88 | 0.1 | 1.18 |
| Frequency | Butterworth | 80 | 86 | 14 | 20 | 83 | 85 | 0.2 | 1.81 |
| Frequency | Gaussian | 80 | 88 | 12 | 20 | 84 | 87 | 0.2 | 1.92 |
¹ True Positive Rate, ² True Negative Rate, ³ False Positive Rate, ⁴ False Negative Rate, ⁵ Accuracy, ⁶ Precision, and ⁷ Missed Crack Width.
Table 3. Comparative results for Otsu algorithm and Twice-threshold.
| Test Project | Crack Detection Rate | False Positive Rate |
|---|---|---|
| Otsu algorithm | 40% | 60% |
| Twice-threshold | 98% | 2% |
Table 4. A confusion matrix.
| Detection results | Correct detection: cracked area | Correct detection: noncracked area |
|---|---|---|
| Cracked area | TP | FP |
| Noncracked area | FN | TN |
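As a hedged illustration of how the entries of the confusion matrix in Table 4 relate to the rates listed in the footnote of Table 2, the sketch below computes the true and false positive and negative rates, accuracy, and precision from TP, FP, FN, and TN counts using their standard definitions. The function name `confusion_metrics` is hypothetical, and the example counts assume 100 cracked and 100 noncracked samples; the source does not state the sample sizes, so these counts are chosen only for illustration.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return standard rates (as fractions) from confusion-matrix counts.

    Assumes every denominator is non-zero, i.e. both classes are present
    and at least one sample was detected as cracked.
    """
    return {
        "TPR": tp / (tp + fn),                  # true positive rate (recall)
        "TNR": tn / (tn + fp),                  # true negative rate
        "FPR": fp / (fp + tn),                  # false positive rate
        "FNR": fn / (fn + tp),                  # false negative rate
        "Ac": (tp + tn) / (tp + fp + fn + tn),  # accuracy
        "Pr": tp / (tp + fp),                   # precision
    }


# Hypothetical counts: 100 cracked and 100 noncracked samples (an assumption).
metrics = confusion_metrics(tp=64, fp=10, fn=36, tn=90)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.0f}%")
```

Under these assumed balanced counts, the computed percentages (64, 90, 10, 36, 77, and 86) coincide with the Roberts row of Table 2, which suggests, but does not confirm, that the reported accuracy and precision follow these definitions.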
Table 5. The frequency measurement results from nine excitation and prediction frequencies [136].
| Scenario | Excitation Frequency (Hz) | Predicted Frequency (Hz) |
|---|---|---|
| 1 | 5.0 | 5.0 |
| 2 | 10.0 | 10.1 |
| 3 | 15.0 | 14.9 |
| 4 | 20.0 | 20.0 |
| 5 | 25.0 | 24.9 |
| 6 | 30.0 | 30.0 |
| 7 | 35.0 | 35.0 |
| 8 | 40.0 | 40.0 |
| 9 | 45.0 | 45.0 |
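To make the agreement in Table 5 concrete, the following minimal sketch computes the absolute error between excitation and predicted frequency for each scenario, together with the maximum and mean error. The values are transcribed from Table 5; the variable names are illustrative and not taken from [136].

```python
# Values transcribed from Table 5 (Hz).
excitation_hz = [5.0, 10.0, 15.0, 20.0, 25.0, 30.0, 35.0, 40.0, 45.0]
predicted_hz = [5.0, 10.1, 14.9, 20.0, 24.9, 30.0, 35.0, 40.0, 45.0]

# Absolute prediction error per scenario.
abs_errors = [abs(p - e) for e, p in zip(excitation_hz, predicted_hz)]

max_abs_error = max(abs_errors)
mean_abs_error = sum(abs_errors) / len(abs_errors)

print(f"max abs error:  {max_abs_error:.2f} Hz")
print(f"mean abs error: {mean_abs_error:.3f} Hz")
```

For the tabulated values, the largest deviation is about 0.1 Hz, and six of the nine scenarios show no deviation at the reported precision.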
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.