1. Introduction
The Structural Health Monitoring (SHM) system, as a crucial part of the bridge’s lifecycle, plays an essential role in ensuring the safety, reliability, and maintainability of bridges [1]. During a bridge’s service life, it is subjected to various loads, such as seismic forces from the natural environment and common wind loads. These loads cause mechanical responses in the structure, which the SHM system monitors to evaluate the structural mechanical performance. The most direct expression of the structural mechanical response is displacement, which varies with different input loads. Displacement, as a fundamental quantity, can reflect the bridge’s vibration, stiffness, and other mechanical signals. Therefore, accurately extracting the bridge’s structural displacement is key to achieving a highly accurate bridge monitoring system.
Traditional displacement monitoring technologies primarily rely on sensors, such as strain sensors, linear displacement sensors, dial gauge sensors, laser displacement sensors, and accelerometers [2,3]. Although sensors can provide accurate monitoring results, data transmission from sensors heavily relies on data cables and signal wires. Over long transmission distances [4], the installation and maintenance of these cables is a significant challenge that cannot be ignored [5]. Furthermore, sensors can only monitor displacements at certain points in the structure, and transmitting the data quickly and accurately is also a key issue. Hence, there is an urgent need for more advanced methods to upgrade SHM systems for bridges. SHM systems utilizing the Global Navigation Satellite System (GNSS), laser Doppler vibrometers, and radar have been proven reliable and useful [6], but these systems depend heavily on power and communication infrastructure.
With the rapid development of computer technology, image-processing technology has advanced as well, and machine vision-based bridge displacement monitoring, which analyzes the optical signals in images by computer, has gradually matured. Xudong Jian et al. [7] used deep learning and computer vision methods, employing deep multilayer perceptrons to identify the influence surface of bridge structures and study the traffic load and state evaluation of the structure; experiments demonstrated the framework’s high accuracy and robustness. Xu et al. [8] used machine vision technology to identify vehicle trajectories, combined with millimeter-wave radar to monitor the structure’s displacement response, and calculated the vehicle’s axle load using the structural displacement influence line, achieving vehicle–bridge coupling analysis. Ye et al. [9] used UAV technology to monitor the structural health of bridges and extract structural displacement; UAV monitoring offers low cost, flexibility, and high accuracy in extracting bridge structural displacement. Shengfei Zhang et al. [10] researched vision-based displacement monitoring (VDM) and developed a 2D VDM method, extending it to three-dimensional space using two cameras.
Khuc T, Nguyen T A, Dao H et al. [11] proposed a UAV-based structural monitoring method to measure sway displacement, addressing the challenges of traditional SHM systems by bypassing fixed camera positions and improving measurement accuracy through advanced algorithms. However, the method faces high computational demands, limited UAV flight time, and sensitivity to wind. Similarly, Tian Y et al. introduced a UAV-based non-contact cable force estimation method using line segment matching to address displacement calculation issues. Although it provides a cost-effective and robust solution for dynamic displacement, it is suitable only for cables with large vibrations under low-wind conditions, and UAV motion can affect the results. Han Y et al. [12] developed a vision-based method for measuring structural displacement with UAVs and lasers, using black-and-white markers and laser projection. This system is non-contact, adaptable to various environments, and capable of working in low-light conditions, but it is limited to in-plane displacement and constrained by UAV camera frame rates. The method also faces challenges with setup and with measuring out-of-plane displacements.
Jana D et al. [13] focused on cable tension estimation for the Dubrovnik cable-stayed bridge using handheld camera video and image analysis techniques. The approach provides stable displacement measurements and tension estimates comparable to design values but struggles with small-amplitude or high-frequency vibrations. Michel Guzman-Acevedo G et al. presented a reliable UAV-based method for estimating cable tension using template matching and optical flow algorithms. This method has shown high accuracy in both lab and field tests, with minimal deviation from reference values, but it is limited by UAV battery life, high computational costs, and environmental instability. Future work for both methods could explore integration with IMUs for real-time monitoring and improved detection of small cable vibrations, enhancing their applicability for SHM in cable-stayed bridges. Chen et al. discussed displacement monitoring technology based on machine vision, highlighting its advantages and exploring methods for synchronous multi-point displacement detection; they also analyzed the current limitations of machine vision technology and suggested future research directions. Duan et al. proposed a method utilizing the Scale-Invariant Feature Transform (SIFT) to extract structural feature points, facilitating the study of Full-Field Displacement Vectors (FFDVs) to achieve multi-point displacement monitoring across an entire structure. Meanwhile, Xing et al. introduced a multi-point displacement measurement method for bridges using a low-cost camera, with the accuracy of the method validated through experimental tests.
It can be seen that, compared to traditional contact sensor displacement monitoring, non-contact displacement monitoring is easier to install and does not require contact with the structure. Typically, bridge deformation monitoring systems rely on sensors and other equipment. For small and medium-sized bridge monitoring projects that use sensors, the costs, including sensors, labor, installation, and others, can easily reach millions of CNY. In contrast, machine vision-based monitoring methods only require a visual acquisition system, targets, and data storage and processing equipment. This results in a cost reduction of 50–70% compared to sensor-based systems [14,15]. It can achieve wide-area monitoring coverage and, by avoiding the limitations of high-precision sensors, can be applied in various environments. Additionally, it does not require frequent replacement or maintenance of sensors, is cost-effective, and does not interfere with the normal operation of bridges. This paper proposes a machine vision system for monitoring displacement in standard bridge structures. The system mainly consists of a high-definition camera and a data processing unit, with the latter comprising only a regular PC and MATLAB R2023b software. Compared to other methods, this system offers advantages such as simple usage conditions, low cost (requiring only a single camera setup), and a low learning curve (due to its simple algorithms). It is referred to as the DoG Structural Displacement Recognition System, where DoG stands for Difference of Gaussians. During the validation process, a theoretical and practical comparison method was employed, using the DoG algorithm to monitor a simulated structure and comparing the results with measured displacement data to verify the algorithm’s accuracy and applicability.
2. The DoG Algorithm Applied for Edge Detection
Image data, once processed algorithmically, can be subjected to deeper signal processing by the computer. Common preprocessing methods include grayscale conversion and binarization, which translate the information of each pixel into a format understandable by computers. However, interference from noisy pixels can affect the desired results. Therefore, during preprocessing, it is crucial to apply denoising techniques to enhance image quality. Common denoising methods include median filtering, mean filtering, bilateral filtering, and Gaussian filtering.
To mitigate Gaussian noise, the utilization of a Gaussian filter based on the noise distribution function is advisable. The expression for the two-dimensional Gaussian low-pass filter function is as follows:

G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left( -\frac{x^{2} + y^{2}}{2\sigma^{2}} \right)

where (x, y) represents the pixel coordinates of the Gaussian kernel, G(x, y) denotes the value of the two-dimensional Gaussian function at coordinates (x, y), and σ is the standard deviation that controls the degree of smoothing.
During the processing stage, the Gaussian filter operates by smoothing the image, wherein the gray value of pixels is represented by the weighted average of neighboring pixels. Particularly at the image edges, where the gray value tends to increase, the weighting of surrounding pixels in the Gaussian filter is comparatively reduced. This preservation of edge details is essential during noise reduction processing.
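As an illustration of this weighted-average smoothing, the following sketch samples the Gaussian function above into a discrete kernel and convolves it with a synthetic step-edge image. The paper’s processing is done in MATLAB; this is a minimal Python stand-in, and the kernel size, σ, and test image are chosen arbitrarily for demonstration:

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Sample the 2-D Gaussian G(x, y) on a size x size grid and normalize."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    g = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)
    return g / g.sum()  # normalize so smoothing preserves overall brightness

def gaussian_blur(img, size=5, sigma=1.0):
    """Blur by direct convolution; each output pixel is the weighted
    average of its neighborhood (replicate padding at the borders)."""
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[i:i + size, j:j + size] * k)
    return out

# A synthetic 16x16 "step edge" image: dark left half, bright right half.
img = np.zeros((16, 16))
img[:, 8:] = 100.0
blurred = gaussian_blur(img, size=5, sigma=1.0)
```

Because the kernel is normalized, flat regions are left unchanged while the step is smoothed into a gradual transition, which is the behavior the DoG construction below exploits.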
Due to this property, feature detection at a certain scale can be performed by subtracting two adjacent Gaussian scale–space images. The blurred image at a given scale is obtained by convolution:

g_{\sigma}(x, y) = G_{\sigma}(x, y) * f(x, y)

where g_{\sigma} represents the blurred result of the image at Gaussian scale σ, and f represents the value of a specific pixel or pixel region in the input image. Subtracting the two images g_{\sigma_{1}} and g_{\sigma_{2}} yields the expression for the DoG (Difference of Gaussians) function:

D(x, y) = g_{\sigma_{1}}(x, y) - g_{\sigma_{2}}(x, y) = \left( G_{\sigma_{1}} - G_{\sigma_{2}} \right) * f(x, y)

This results in the DoG response image.
The main purpose of applying differential processing to images is to enhance edges and details while suppressing noise. Through multi-scale analysis, features at different scales can be extracted. The precision of edge extraction is higher after performing the Gaussian difference operation, and it does not introduce large-scale noise. The main steps of edge detection include gradient calculation, calculation of gradient magnitude and direction, and non-maximum suppression.
The main purpose of gradient calculation is to compute the gradient of the difference image to obtain the edge strength and direction. The specific calculation formula is as follows:

\nabla D = \left( G_{x}, G_{y} \right) = \left( \frac{\partial D}{\partial x}, \frac{\partial D}{\partial y} \right)

where ∇D represents the image gradient obtained using the Gaussian difference method, and G_{x} and G_{y} represent the gradient components along the horizontal and vertical pixel axes, respectively.
The formulas for calculating gradient magnitude and direction are as follows:

M(x, y) = \sqrt{G_{x}^{2} + G_{y}^{2}}, \qquad \theta(x, y) = \arctan\!\left( \frac{G_{y}}{G_{x}} \right)

where M(x, y) represents the gradient magnitude of the pixel, and θ(x, y) represents the gradient direction of the pixel.
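The three steps named above (DoG response, gradient magnitude/direction, non-maximum suppression) can be sketched end to end as follows. This is an illustrative Python implementation under assumed parameters (σ₁ = 1, σ₂ = 2, a synthetic step-edge image), not the paper’s MATLAB code:

```python
import numpy as np

def blur(img, sigma):
    """Separable Gaussian blur using 1-D kernels with replicate padding."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    conv = lambda m: np.convolve(np.pad(m, r, mode="edge"), k, "valid")
    out = np.apply_along_axis(conv, 0, img.astype(float))
    return np.apply_along_axis(conv, 1, out)

def dog_edges(img, sigma1=1.0, sigma2=2.0):
    """DoG response, gradient magnitude/direction, and non-maximum suppression."""
    D = blur(img, sigma1) - blur(img, sigma2)   # D = (G_s1 - G_s2) * f
    Gy, Gx = np.gradient(D)                     # gradients of the DoG image
    M = np.hypot(Gx, Gy)                        # magnitude M = sqrt(Gx^2 + Gy^2)
    theta = np.arctan2(Gy, Gx)                  # direction
    # Non-maximum suppression: keep a pixel only if it is a local maximum
    # of M along its (quantized) gradient direction.
    thin = np.zeros_like(M)
    ang = (np.rad2deg(theta) + 180.0) % 180.0
    for i in range(1, M.shape[0] - 1):
        for j in range(1, M.shape[1] - 1):
            a = ang[i, j]
            if a < 22.5 or a >= 157.5:
                n1, n2 = M[i, j - 1], M[i, j + 1]
            elif a < 67.5:
                n1, n2 = M[i - 1, j + 1], M[i + 1, j - 1]
            elif a < 112.5:
                n1, n2 = M[i - 1, j], M[i + 1, j]
            else:
                n1, n2 = M[i - 1, j - 1], M[i + 1, j + 1]
            if M[i, j] >= n1 and M[i, j] >= n2:
                thin[i, j] = M[i, j]
    return D, M, thin

# Vertical step edge at column 10 of a 20x20 image.
img = np.zeros((20, 20))
img[:, 10:] = 1.0
D, M, theta = dog_edges.__wrapped__ if False else None, None, None  # placeholder removed below
D, M, thin = dog_edges(img)
edge_col = int(np.argmax(M[10, :]))  # strongest response sits at the edge
```

The suppression step thins the broad DoG response down to a narrow edge ridge, which is what makes the later feature-point localization precise.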
Multiple convolutions with the DoG (Difference of Gaussians) function can be applied to an image, as depicted in Figure 1. This technique aids in amplifying the detection of relatively subtle structures, which might otherwise be overlooked or challenging to identify in subsequent processes. It mitigates biases in structural performance status information and enables more precise localization for feature point and edge extraction, as shown in Figure 2.
In the operation of the Difference of Gaussians (DoG) method, image processing operations, such as dilation and erosion, need to be applied to the images. The dilation algorithm enlarges the target area, amalgamating background points that touch the target area into the target object, consequently extending the boundary of the target outward. Its purpose is to fill in some holes in the target area and eliminate small grain noise contained in the target area. Conversely, the erosion algorithm diminishes the extent of the target area, leading to a reduction in the image’s boundary and the elimination of small and insignificant target objects. The formulas for dilation and erosion are as follows:

A \oplus B = \left\{ z \mid (\hat{B})_{z} \cap A \neq \varnothing \right\}, \qquad A \ominus B = \left\{ z \mid (B)_{z} \subseteq A \right\}

where A is the binary image, B is the structuring element, \hat{B} is the reflection of B, and (B)_{z} denotes B translated by z.
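The hole-filling and speck-removal behavior described above can be demonstrated directly. The sketch below is a minimal Python implementation of binary dilation and erosion with an assumed 3×3 square structuring element and a synthetic binary image; it is for illustration only:

```python
import numpy as np

def dilate(img, se):
    """Binary dilation: a pixel turns on if the structuring element,
    centered there, overlaps any foreground pixel (grows the target,
    fills small holes)."""
    h, w = se.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.any(padded[i:i + h, j:j + w] & se)
    return out

def erode(img, se):
    """Binary erosion: a pixel stays on only if the structuring element,
    centered there, fits entirely inside the foreground (shrinks the
    target, removes small specks)."""
    h, w = se.shape
    ph, pw = h // 2, w // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.all(padded[i:i + h, j:j + w][se.astype(bool)])
    return out

se = np.ones((3, 3), dtype=int)     # 3x3 square structuring element
img = np.zeros((9, 9), dtype=int)
img[2:7, 2:7] = 1                   # 5x5 target block
img[4, 4] = 0                       # a one-pixel "hole" in the target
img[0, 8] = 1                       # an isolated noise speck
dilated = dilate(img, se)           # dilation fills the hole
eroded = erode(img, se)             # erosion removes the speck
```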
Figure 3 illustrates the application of the DoG system for structural displacement extraction. The DoG system proposed in this paper is divided into four main steps.
In the validation experiments described in this paper, data collection is mainly divided into algorithm data collection and validation data collection. As shown in Step 1 of Figure 3, this section focuses on algorithm data collection. By adjusting the field of view (FOV) of the calibrated camera, we ensure that the optical signals of the entire bridge structure can be captured and preserved. This is essential for the subsequent comparison of algorithm displacement with LVDT displacement measurements to determine the accuracy of the DoG algorithm.
As shown in Step 2 of Figure 3, the DoG algorithm proposed in this paper is mainly applied in the edge extraction phase of structural optical signal processing. By processing the structure under fixed multi-scale DoG filtering, the edge results can reflect the position of the structure in space. Under simulated load conditions, the structure will respond to different loads, resulting in deformation. The DoG edge processing can capture these deformation signals, leading to pixel-level deformation in the collected images. Therefore, when processing the structure, since the camera’s attitude angle and field of view are fixed under the same working conditions, it is only necessary to compare the structural displacement response results at different load times and points to determine the degree of deformation.
In Step 2, the collected video is segmented to obtain the optical signal of the structure at a specific moment. The DoG algorithm is then used to extract the edge information of the structure and track feature points. Since the edge signals are continuous [16,17], feature points can be arbitrarily selected. Compared to traditional LVDT displacement meters, the DoG algorithm can extract an effectively unlimited number of feature points, solving the problem of insufficient data completeness in traditional displacement extraction. Moreover, the feature points are derived from the natural edge contours of the structure itself, eliminating the need for preprocessing results during system operation and avoiding any impact on the bridge’s traffic capacity in actual conditions [18].
By tracking the feature points, the displacement of the feature points and the edges near the feature points can be obtained, thereby determining the mechanical response of the structure. The mechanical response of the structure [19] is reflected by the pixel displacement, and through the above steps, the pixel-level displacement of the structure in the image can be obtained [20,21].
In a normal working state, the structure responds to changes in load size or shifts in the point of application with corresponding mechanical signal feedback. This feedback is reflected in the optical information as subtle displacements of the structure in the image [22], typically at the pixel level, occupying only a few or dozens of pixels. In the images captured by the camera, this is specifically manifested as pixel displacement at observation points/feature points.
In the Difference of Gaussians (DoG) algorithm, the Gaussian-blurred image difference can highlight the high-gradient areas of the image. Since the pixel values at the edges change drastically, and the pixel value differences at the edges of the Gaussian-blurred image are significant, calculating the difference can enhance edge information.
Since displacement is a relative value, we can use the DoG algorithm to extract edges in the initial state and the loaded state. The relative deformation between the two states can be considered as the structural mechanical response. By locating the feature points, the pixel displacement of the structure in the image can be determined. Because the feature points and adjusted edge extraction proposed in this paper are obtained through differences, it is necessary to ensure that the parameters of the DoG algorithm are consistent under the same working conditions. Otherwise, different Gaussian filter bases may lead to overall deviations in the extracted results, significantly affecting the accuracy of subsequent structural mechanical performance evaluations.
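The two-state comparison described above can be sketched in one dimension: locate the edge in the reference (unloaded) frame and in the loaded frame with identical DoG parameters, and take the difference of the two positions as the pixel displacement. The Python sketch below uses a synthetic intensity profile with a known 3-pixel shift standing in for real camera data; the σ values are assumed:

```python
import numpy as np

def dog_edge_position(profile, sigma1=1.0, sigma2=2.0):
    """Locate an edge along a 1-D intensity profile as the peak of the
    |gradient| of the DoG response. The same sigma1/sigma2 must be used
    for the reference and the loaded state, as noted in the text."""
    def blur(p, s):
        r = int(3 * s)
        x = np.arange(-r, r + 1)
        k = np.exp(-x**2 / (2 * s**2))
        k /= k.sum()
        return np.convolve(np.pad(p.astype(float), r, mode="edge"), k, "valid")
    d = blur(profile, sigma1) - blur(profile, sigma2)  # DoG response
    return int(np.argmax(np.abs(np.gradient(d))))      # edge = gradient peak

# Reference state: step edge at pixel 40; loaded state: edge shifted by 3 px.
ref = np.zeros(100)
ref[40:] = 1.0
loaded = np.zeros(100)
loaded[43:] = 1.0
disp_px = dog_edge_position(loaded) - dog_edge_position(ref)  # pixel displacement
```

Because both frames are processed with identical DoG parameters, any constant bias in the detected edge position cancels in the subtraction, which is exactly why parameter consistency is required.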
In the system proposed in this paper, the spatial state of the structure and the camera remains unchanged under the same time–space conditions. The structure, in reality, will be recorded on the image according to a certain scale, known as the pixel scale factor, which can be calculated based on the camera’s posture and the spatial distance between the camera and the structure.
The pixel scale factor (Z) is the parameter used to convert the image signal into the displacement signal in this system. The calculation formula is as follows:

Z = \frac{L}{l}

where L represents the actual length of the object and l represents the pixel length of the object in the image. The unit of the pixel scale factor is mm/pixel.
Using the pixel scale factor, we can convert the deformation of the structure’s edge in the image into the actual deformation of the structure, thereby assessing the mechanical performance of the bridge under normal working conditions and monitoring the displacement of the bridge.
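This conversion is a single ratio. The sketch below applies it with hypothetical numbers: the 5640 mm calculated span from the experiment is assumed to occupy 1880 px in the frame, and a 2.5 px edge shift is assumed as the measured image displacement:

```python
def pixel_scale_factor(actual_length_mm, pixel_length_px):
    """Z = L / l, in mm/pixel: actual object length over its image length."""
    return actual_length_mm / pixel_length_px

# Hypothetical numbers: 5640 mm span imaged across 1880 px.
Z = pixel_scale_factor(5640.0, 1880.0)  # mm/pixel
actual_disp_mm = 2.5 * Z                # a 2.5 px image shift in millimeters
```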
After determining the edge of the structure using the Gaussian difference method, the displacement of the structure is calculated and detected. In this study, a comparison between the calculated displacement obtained using our proposed method and the actual displacement allows for the evaluation and monitoring of the overall structural performance during regular operation [23].
3. Experimental Structure Composition
The experimental model employed in this investigation is a novel assembled I-beam aluminum-supported bridge, depicted in Figure 4. The bridge configuration, akin to a simply supported beam structure, is illustrated in Figure 5, which also depicts the dimensions and specific construction of the test piece. The main beam consists of two prefabricated aluminum alloy I-beams spliced together. The I-beams are standardized with a height of 200 mm, a width of 50 mm, a flange thickness of 5 mm, and a web thickness of 5 mm. Another aspect of standardization involves the ratio of flange thickness to web thickness. The aluminum alloy I-beams on both sides are welded together with three hollow rectangular steel sections of the same material as the main beams; these are welded to the beam segments and at the mid-span of the aluminum alloy beams. The bridge deck is made of organic glass, with a rectangular cross-section measuring 12 mm in height and 500 mm in width, and is bonded to the longitudinal beams using acrylic adhesive. The model bridge is hinge-supported at both ends, and its calculated span length is 5.64 m. The completed experimental beam and the restraint conditions are shown in the following figure:
In this experiment, it is essential to investigate the degree of deformation of the main beam structure of the bridge under various loads simulated to represent normal working conditions. The experimental design includes the following key points:
- (1)
Image acquisition
The theoretical framework proposed in this paper relies on non-contact computer vision detection technology for analyzing structural displacement measurements. Hence, the utilization of structural imagery data as initial data is paramount for this purpose. Using different acquisition devices can lead to variations in the collected data. Variations in factors like sampling frequency, focal length, resolution, and other device parameters can influence the accuracy of subsequent measurement analyses. Additionally, the outcomes from the same device can vary due to factors like shooting environments, lighting conditions, shooting distances, and camera angles.
- (2)
Displacement sensor (LVDT) for deflection measurement
The data acquisition system (DH5902N, DHDAS, Jiangsu Donghua Testing Technology Co., Ltd., Jingjiang, Jiangsu, China) is used for data collection. The sensors are positioned at the lower edge of the simply supported beam, with one sensor placed at every 1/8 span point, totaling seven sensors.
- (3)
Loading method
To assess the suitability of the algorithm introduced in this paper for deflections under various load conditions, we utilize a manual loading approach for incremental loading. This approach allows for the generation of time-history displacement curves for the supported beam. Through the comparison of time-history displacement curves with data retrieved from displacement sensors, one can ascertain the precision of the acquired displacement values.
As illustrated in Figure 6, the system’s components are interconnected to work together, completing the monitoring system for bridge structure displacement. The front-end camera serves as the primary image acquisition system, transmitting the image signal to the computer for processing. Subsequently, the computer outputs the structure monitoring results, facilitating the evaluation of structural health.
In the laboratory, a model car equipped with weights is used to simulate the state of a bridge under the action of loads during normal operation. The feasibility of the proposed algorithm for structural detection is initially assessed through static load monitoring. Following this, its practical application capability is evaluated under dynamic loads.
- (1)
In the image acquisition system, a Canon 5DS R camera is utilized, maintaining consistent camera parameters under different load conditions (Figure 7).
The camera outputs images with a resolution of 8688 × 5792 pixels for photography and 1920 × 1080 pixels for videography, capturing static and dynamic loading, respectively. The camera is a full-frame camera with a sensor size of 36 × 24 mm. The required focal length for the shooting process can be calculated using the following formula:

f = \frac{s \cdot d}{w}

where f represents the focal length of the camera, d represents the distance from the camera to the object being photographed, s represents the width of the sensor, and w represents the width of the object being photographed.
In the experiment, the distance from the test beam to the camera is 5 m, the sensor width is 36 mm, and the width of the test beam is 6 m. Therefore, it can be determined as follows:

f = \frac{s \cdot d}{w} = \frac{36\ \text{mm} \times 5\ \text{m}}{6\ \text{m}} = 30\ \text{mm}
Therefore, in the validation experiments for the algorithm presented in this paper, fixing the camera’s focal length at 30 mm ensures that the proportion of the structure in the image is maximized while also guaranteeing that the mechanical characteristics of the structure can be effectively captured. To simulate the bridge’s response to loadings under real working conditions, loading is applied continuously over a short period, starting from the unloaded state to extract bridge displacement. Subsequently, the captured images and videos are transmitted to the data processing terminal via the camera data transmission cable for further processing.
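The focal-length choice follows directly from the pinhole similar-triangles relation. A one-line check with the experiment’s values (36 mm sensor width, 5 m shooting distance, 6 m beam width):

```python
def required_focal_length_mm(sensor_width_mm, distance_m, object_width_m):
    """f = s * d / w: focal length that makes an object of width w at
    distance d just fill a sensor of width s (pinhole similar triangles)."""
    return sensor_width_mm * distance_m / object_width_m

f = required_focal_length_mm(36.0, 5.0, 6.0)  # values from the experiment
```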
- (2)
To validate the accuracy of the algorithm proposed in this paper, LVDT displacement sensors are used to obtain precise displacement data for comparison. The displacement data from the seven LVDT sensors, which are in contact with the main beam, are collected using a Donghua tester (Table 1, Figure 8). Subsequently, the displacement extracted using the DoG algorithm proposed in this paper is compared with the accurate data obtained from the LVDT sensors to assess the accuracy and reliability of the algorithm’s displacement measurements [24].
- (3)
The loading is manually applied using weights of 10 kg and 5 kg during the experiment. A model car is used to simulate the force applied to the bridge under working conditions.
- (4)
In this study, the algorithm will be used for deformation recognition based on image signals, where the smallest unit in the image is the pixel. Therefore, the minimum resolution of the algorithm system corresponds to the pixel factor in the experiment. The specific calculation method and value of the pixel factor will be provided in Section 4.3.1.
- (5)
The sampling rates of the LVDT and the camera offer multiple options. To select the most appropriate rates, the Nyquist–Shannon sampling theorem was referenced, which states that a band-limited signal must be sampled at more than twice its highest frequency to avoid aliasing. Before the validation experiments, the finite element software Midas Civil 2022 was used to analyze the simply supported beam, and the first five modal frequencies of the structure were calculated as 0.511 Hz, 2.035 Hz, 4.544 Hz, 7.995 Hz, and 12.317 Hz. Based on this, the camera sampling rate was set to 25 Hz and the LVDT sampling rate to 500 Hz, ensuring accurate extraction of the structural response.
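The rate selection reduces to comparing each sampling frequency against twice the highest modal frequency of interest; with the fifth mode at 12.317 Hz, the Nyquist rate is 24.634 Hz, which the 25 Hz camera rate just exceeds:

```python
# Modal frequencies from the Midas Civil 2022 model (Hz).
modal_freqs_hz = [0.511, 2.035, 4.544, 7.995, 12.317]
camera_fs, lvdt_fs = 25.0, 500.0  # sampling rates chosen in the experiment

def satisfies_nyquist(fs, f_max):
    """Sampling theorem: fs must exceed twice the highest frequency present."""
    return fs > 2.0 * f_max

f_max = max(modal_freqs_hz)
camera_ok = satisfies_nyquist(camera_fs, f_max)  # 25 Hz vs. 24.634 Hz
lvdt_ok = satisfies_nyquist(lvdt_fs, f_max)
```

Note the camera margin is thin (25 Hz vs. 24.634 Hz); the LVDT’s 500 Hz rate oversamples by roughly a factor of twenty, which is why it serves as the reference measurement.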
In the laboratory setting, the experiment is arranged according to the specific layout depicted in Figure 9:
In the experiments described in this paper, simulating the structural stress response of a bridge under working conditions requires applying loads to the structure. To verify the basic feasibility and practicality of the proposed algorithm, the load tests are divided into static load tests and dynamic load tests. The static load test serves as a fundamental test to evaluate the edge extraction effect of the DoG algorithm. By applying the image-to-space scale transformation to obtain the algorithm’s displacement and comparing it with LVDT data, the accuracy of the DoG algorithm can be analyzed. The dynamic load test aims to simulate the structural response of the bridge under working conditions [25], thereby assessing the applicability of the DoG algorithm proposed in this paper.
To avoid any forward or backward movement of the vehicle caused by the applied load during the static load experiment, the vehicle is firmly secured in position prior to incremental loading and recording. Therefore, the weight of the vehicle does not need to be considered. The weights used in the laboratory are 10 kg, 5 kg, and 2 kg. The configuration of the loaded vehicle and supplementary weights is depicted in Figure 10:
In the static load experiment, to avoid structural displacement fluctuations caused by impact effects during loading, which could lead to significant fluctuations in the displacement measured by the sensors and extracted by the algorithm, the structure is allowed to settle for one minute after each loading increment. Subsequently, the mechanical condition of the bridge is captured through photography, and displacement sensor data are collected (Table 2).
During the dynamic load experiment, it is necessary to simulate the mechanical performance generated by a car passing over the bridge under normal working conditions. Therefore, after loading, the vehicle is propelled at a constant speed over the bridge by a motor that generates steady traction force. Concurrently, the structure is monitored by the camera, and data are gathered (Table 3).