Bridge Deflection Prediction Based on Cascaded Residual Smoothing and Multiscale Spatiotemporal Attention Network

Wu, Xi; Qian, Hai-Min; Liao, Juan; He, Liu-Sheng; Wang, Cheng-Quan

doi:10.3390/app15063147


by Xi Wu ^1,2, Hai-Min Qian ^1,2, Juan Liao ^1,2,*, Liu-Sheng He ^3 and Cheng-Quan Wang ^1,2

^1 Department of Civil Engineering, Hangzhou City University, Hangzhou 310015, China

^2 Zhejiang Engineering Research Center of Intelligent Urban Infrastructure, Hangzhou City University, Hangzhou 310015, China

^3 College of Civil Engineering, Tongji University, Shanghai 200092, China

^* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(6), 3147; https://doi.org/10.3390/app15063147

Submission received: 20 January 2025 / Revised: 3 March 2025 / Accepted: 6 March 2025 / Published: 13 March 2025

(This article belongs to the Special Issue Risk Control and Performance Design of Bridge Structures)


Abstract

Bridge deflection is an important indicator of structural health and safety, but current deflection prediction methods suffer from anomalous data and low prediction accuracy. To address the anomalous bias and loss of short-term trends in traditional smoothing methods, this paper proposes a cascade residual smoothing preprocessing method. The method first uses Gaussian filtering to remove the high-frequency noise in the signal while retaining the overall trend. Then, the residuals between the original data and the initially filtered data are smoothed by quadratic exponential smoothing to extract the short-term trend in the deflection data, so that the processed data are both stable and retain small fluctuations. In addition, to simultaneously capture the temporal dependence and spatial features of long- and short-term signals, this paper proposes a multiscale spatiotemporal attention network based on Multiscale Convolutional Neural Networks (MSCNNs), Gated Recurrent Units (GRUs), and self-attention (SA). The method obtains spatial information at multiple receptive-field scales within each period through the MSCNN, models the connection between different time steps using a GRU, and employs SA to automatically focus on the deflection features that have a significant impact while ignoring unimportant perturbations, thus improving the prediction ability of the model. Compared with CNN-Attention-LSTM, the proposed method reduces the MAE by 25.79% and the RMSE by 24.69% and increases the R² by 2.36%, which demonstrates its superiority.

Keywords:

bridges; deflection prediction; smoothing; spatiotemporal networks; attention

1. Introduction

Deflection plays a vital role in bridge structural health monitoring systems [1]. It is a direct indicator of the overall vertical stiffness and load-carrying capacity of the bridge structure and, to a certain extent, reflects changes in the bridge line shape. When deformation exceeds the allowable deflection range, the bridge becomes prone to collapse, so deflection monitoring in bridge projects is necessary [2].

With the rapid rise of large structures, the safety and reliability of bridges have become an increasing concern [3]. In general, establishing a structural health monitoring system is the primary method for understanding the operational status of a bridge [4]. However, due to various unfavorable factors, such as sensor failures and environmental influences, the monitoring data from bridge structural health systems often suffer from data distortion and abnormal offsets [5], making it difficult to assess the health of bridges promptly. Therefore, preprocessing the signals collected from the static components of bridges to improve data quality has become an important issue [6]. Study [7] described the sliding average method, which reduces noise by smoothing and filtering and is simple to implement and computationally efficient. However, the method may introduce bias when processing signal boundaries, potentially distorting transient features in bridge deflection monitoring. Study [8] used the least squares method to smooth the signal and identified signal singularities in the monitoring data by wavelet transform to obtain the characteristics of the real signal. Although this method can effectively identify singularities and perform preliminary denoising, it may be insufficient in dealing with complex noise environments, non-stationary signals, and dynamic noise characteristics. Study [9] applied Gaussian smoothing to filter noise in vibration signals, which preserves the overall trend and reduces high-frequency noise. However, it may misclassify important short-term fluctuations as anomalous, resulting in the loss of critical information. Study [10] employed an exponential smoothing method for rapid identification of the fundamental frequency of bridge ties, adjusting smoothing coefficients for non-stationary signals. However, it only performs simple exponential weighting and cannot accommodate multiscale features, limiting its ability to capture both short-term and long-term changes, which affects accurate bridge health monitoring.

On the other hand, developing an effective bridge deflection prediction model remains a challenge [11]. At present, the primary focus is on statistical and data-driven approaches to predicting bridge health.

On the statistical side, study [12] analyzed the time-series signals of bridges using an autoregressive integrated moving average (ARIMA) model, which effectively isolated the trend component of the data. However, this method only captures linear relationships and cannot capture nonlinear features in the time series. Study [13] designed a seasonal differenced autoregressive moving average method with seasonal factors to address this limitation. However, this approach requires highly stationary data and suffers from low computational efficiency.

On the data-driven side, study [14] obtained spatial information within the same period using convolutional neural networks (CNNs), but this method can only extract features within the extent of its convolutional kernel and cannot capture long-range temporal features at the same time. Study [15] fed historical signals into a bidirectional long short-term memory network (BiLSTM), focusing on connections between different time steps, but the method ignores the spatial information within each segment. To capture temporal and spatial information simultaneously, study [16] combined the CNN and the Gated Recurrent Unit (GRU) into a CNN-GRU that takes bridge deflection data as input and predicts the deflection value. However, this method cannot identify the importance of data features and easily picks up abnormal features, which reduces prediction accuracy.

In addition, there are emerging technologies, such as those used by Mirko et al., to study the deformation of simply supported concrete girder bridges over time using the information content provided by satellite data and combining it with other available sources [17]. Subsequently, they proposed a GIS plug-in called Bridge Assessment System via MTInSAR (BAS-MTInSAR), which aims at assessing the deformation of existing simply supported concrete girder bridges by means of Multi-temporal Interferometric Synthetic Aperture Radar (MTInSAR) [18]. However, the limitations of the characterization of the information contained in the satellite data are such that the proposed method cannot replace the traditional monitoring methods.

To solve the problems of ignoring short-term fluctuations and failing to extract temporal and spatial information at the same time in existing studies, this paper proposes a bridge deflection prediction method based on cascaded residual smoothing and multiscale spatiotemporal attention networks. Cascade residual smoothing addresses the inadequate capture of short-term fluctuating trends by traditional smoothing methods while maintaining the overall trend, improving data quality. Specifically, cascaded residual smoothing starts by removing high-frequency noise from the signal through Gaussian filtering, preserving the overall trend. Then, the residuals between the raw signal and the initially smoothed signal are computed to highlight the short-period fluctuations in the signal, making it easy to capture detailed changes. Finally, quadratic exponential smoothing is applied to the residuals to extract short-term trends or small-period variations in the deflection data, ensuring that the predictions are stable and reflect subtle fluctuations.

On the other hand, to simultaneously acquire spatiotemporal information between different periods and focus on the important features while ignoring the abnormal features, we designed the multiscale spatiotemporal attention network (MSSAN). The network first extracts spatial information of the deflection data at multiple scales using a Multiscale Convolutional Neural Network (MSCNN), which can capture both short-term fluctuations and long-term trends in the data. Then, the GRU captures the time dependence of the bridge deflection data, and self-attention (SA) focuses on the important features, namely the critical time steps in the deflection data that greatly impact future deflection values. Finally, the predicted values are output by linear transformation. The validity of the method is verified through experiments: predictions on the data taken at several static levels show good results, and a comparison with similar techniques reflects the superiority of the proposed method.

The specific contributions of this paper are as follows:

(1) This paper designs a cascade residual smoothing method to solve the problem of ignoring short-term fluctuations that exists in traditional smoothing methods. The method separates the long-term trend from the short-term fluctuations and denoises them in cascaded stages, which reduces the risk of over-smoothing the data. After Gaussian filtering extracts the overall trend, quadratic exponential smoothing is applied only to the residual part, thus smoothing short-term fluctuations more gently and reducing strong interference with the original data.
(2) This paper proposes a multiscale spatiotemporal attention network to solve the problem of not being able to obtain spatiotemporal information simultaneously, which exists in traditional prediction methods. The network has multi-level feature extraction and time-dependent modeling capabilities and screens important time-series features to improve prediction accuracy.

2. Smoothing Method Analysis

2.1. Gaussian Smoothing

Gaussian smoothing is a technique commonly used to smooth data and remove noise. It can help in bridge deflection prediction by reducing measurement noise and environmental disturbances for smoother data. In Gaussian smoothing, each data point is replaced with a weighted average within its neighborhood, with weights determined by a Gaussian function. Points further away from the center are given less weight, which enables a smoothing of the noise while preserving the main trends [19].

Since the deflection data in this paper are one-dimensional, a one-dimensional Gaussian filtering kernel is used, whose Gaussian function is given below:

f (j) = \frac{1}{\sqrt{2 π} σ} e^{- \frac{{(j - μ)}^{2}}{2 σ^{2}}}

(1)

g (j) = \frac{f (j)}{\sum_{i = - (t - 1) / 2}^{(t - 1) / 2} f (i)}

(2)

j = \frac{- (t - 1)}{2}, \frac{- (t - 1)}{2} + 1, \dots, \frac{(t - 1)}{2}

(3)

where j is an integer, μ is the template mean, σ is the standard deviation of the template, and t is the template length. When calculating the Gaussian template, first, select the template length, take the center of the template as the origin, and carry out the calculation by Formula (1); taking the template length of three as an example, the resulting Gaussian template can be f = [f [- 1], f [0], f [1]], as shown in Figure 1. The normalized Gaussian template g = [g [- 1], g [0], g [1]] can then be derived from Equation (2).

In bridge deflection prediction, deflection data usually come from sensor measurements, which are susceptible to environmental factors (e.g., wind, temperature changes, traffic loads, etc.) and measurement errors, resulting in more high-frequency noise. Modeling directly with these data may result in large model errors. Therefore, Gaussian smoothing as a preprocessing step for data can effectively improve data quality.
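As an illustration of Equations (1)–(3), the following minimal Python sketch builds the normalized Gaussian template and applies it to a one-dimensional signal. The function names `gaussian_template` and `gaussian_smooth` are hypothetical, and repeating the first and last samples at the signal boundaries is an assumption, since the paper does not state how the edges are treated.

```python
import numpy as np


def gaussian_template(t, sigma, mu=0.0):
    """Normalized 1-D Gaussian template of odd length t, Equations (1)-(3)."""
    if t % 2 == 0:
        raise ValueError("template length t must be odd")
    half = (t - 1) // 2
    j = np.arange(-half, half + 1)                       # Equation (3)
    f = np.exp(-((j - mu) ** 2) / (2 * sigma ** 2))
    f = f / (np.sqrt(2 * np.pi) * sigma)                 # Equation (1)
    return f / f.sum()                                   # Equation (2)


def gaussian_smooth(y, t, sigma):
    """Replace each sample by the Gaussian-weighted mean of its neighborhood.

    Boundary samples are padded by repeating the edge value
    (an assumption, not taken from the paper).
    """
    y = np.asarray(y, dtype=float)
    half = (t - 1) // 2
    padded = np.pad(y, half, mode="edge")
    # The template is symmetric for mu = 0, so convolution equals weighted averaging.
    return np.convolve(padded, gaussian_template(t, sigma), mode="valid")


# Template of length three, as in the example above, with sigma = 1.25.
g = gaussian_template(t=3, sigma=1.25)
smoothed = gaussian_smooth([1.0, 4.0, 2.0, 5.0, 3.0], t=3, sigma=1.25)
```

The weights in `g` sum to one, so a constant signal passes through unchanged while isolated spikes are spread over their neighbors.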

2.2. Quadratic Exponential Smoothing

Quadratic exponential smoothing introduces a trend component to simple exponential smoothing and is suitable for dealing with time series with linear trends [20]. The basic idea is to perform exponential smoothing on the data twice, the first time for extracting the smoothed base-level values and the second time for extracting the trend components. The model equation for quadratic exponential smoothing is as follows:

S_{t} = α \cdot Y_{t} + (1 - α) \cdot (S_{t - 1} + T_{t - 1})

(4)

T_{t} = β \cdot (S_{t} - S_{t - 1}) + (1 - β) \cdot T_{t - 1}

(5)

{\hat{Y}}_{t + k} = S_{t} + k \cdot T_{t}

(6)

where Equation (4) represents the smoothed level update, Equation (5) represents the trend update, Equation (6) represents the prediction equation, S_{t} represents the smoothed value at time t, T_{t} is the trend value at time t, α and β are the coefficients controlling the strength of the smoothing and the trend, respectively, Y_{t} is the measured bridge deflection value at time t, and {\hat{Y}}_{t + k} is the value predicted at time t for k time steps ahead.

In bridge health monitoring, deflection data reflect the deformation of a bridge under load. These data usually contain a short-term noise component and a long-term trend component. Therefore, quadratic exponential smoothing is good at extracting deflection data trends while reducing short-term fluctuation effects.
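Equations (4)–(6) can be sketched in a few lines of Python. The initialization S_0 = Y_0 and T_0 = Y_1 − Y_0 is an assumption (the paper does not specify it), and the function names are hypothetical.

```python
def quadratic_exponential_smoothing(y, alpha, beta):
    """Return the level series S_t and trend series T_t of Equations (4) and (5)."""
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level = [float(y[0])]              # S_0 := Y_0 (assumed initialization)
    trend = [float(y[1] - y[0])]       # T_0 := Y_1 - Y_0 (assumed initialization)
    for t in range(1, len(y)):
        s = alpha * y[t] + (1 - alpha) * (level[-1] + trend[-1])   # Equation (4)
        b = beta * (s - level[-1]) + (1 - beta) * trend[-1]        # Equation (5)
        level.append(s)
        trend.append(b)
    return level, trend


def forecast(level, trend, k):
    """Forecast k steps ahead from the last level and trend, Equation (6)."""
    return level[-1] + k * trend[-1]


# A perfectly linear series is reproduced exactly, so the forecast continues the line.
level, trend = quadratic_exponential_smoothing([1.0, 3.0, 5.0, 7.0, 9.0], alpha=0.5, beta=0.01)
next_value = forecast(level, trend, k=1)
```

A small β, such as the 0.01 used here, makes the trend estimate change slowly, so single outliers have little effect on it.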

2.3. Cascade Residual Smoothing

This paper proposes a cascade residual smoothing method to solve the problem wherein traditional smoothing methods ignore short-term fluctuations. The method preserves long-term trends and short-term fluctuations in the raw data and improves the smoothness of the data. As shown in Figure 2, the method uses Gaussian smoothing to extract the main trend of the data and quadratic exponential smoothing to extract short-term volatility features from the residuals. The new data that are ultimately generated contain long-term trends and retain information on short-term fluctuations, allowing the model to capture subtle changes in the data on a smoothed basis.

The core of this method lies in applying quadratic exponential smoothing to the residual part after the initial extraction of the overall trend by Gaussian smoothing, which is used to extract the short-term trends in the residuals. Quadratic exponential smoothing can take into account both the smoothing component and the trend component of the residuals. Further, it removes noise from the residuals through smoothing and trend extraction while retaining the effective short-term variations. Bridge deflection data often contain fluctuations at multiple frequencies, and the cascade residual smoothing method effectively addresses this complexity through stepwise processing, making it particularly suitable for data preprocessing in bridge health monitoring.
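The two-stage procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the template length `t=5`, the edge padding, the initialization of the second stage, and the recombination of the two stages by addition are all assumptions, while σ = 1.25, α = 0.5, and β = 0.01 are the values the paper reports as best.

```python
import numpy as np


def gaussian_smooth(y, t, sigma):
    """Stage 1 helper: Gaussian-weighted moving average with edge padding (assumed)."""
    half = (t - 1) // 2
    j = np.arange(-half, half + 1)
    g = np.exp(-(j ** 2) / (2 * sigma ** 2))
    g /= g.sum()                                   # normalized template
    return np.convolve(np.pad(y, half, mode="edge"), g, mode="valid")


def holt_level(y, alpha, beta):
    """Stage 2 helper: smoothed level of quadratic exponential smoothing."""
    level, trend = y[0], y[1] - y[0]               # assumed initialization
    out = [level]
    for value in y[1:]:
        new_level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
        out.append(level)
    return np.array(out)


def cascade_residual_smoothing(y, t=5, sigma=1.25, alpha=0.5, beta=0.01):
    y = np.asarray(y, dtype=float)
    overall_trend = gaussian_smooth(y, t, sigma)          # long-term trend
    residual = y - overall_trend                          # what stage 1 removed
    short_term = holt_level(residual, alpha, beta)        # smoothed short-term part
    return overall_trend + short_term                     # assumed recombination


# A single spike is flattened by Gaussian smoothing alone but partly restored here.
spike = np.zeros(21)
spike[10] = 10.0
cascaded = cascade_residual_smoothing(spike)
gaussian_only = gaussian_smooth(spike, 5, 1.25)
```

In this toy example, `cascaded[10]` is larger than `gaussian_only[10]`, which mirrors the claim that the cascade keeps more of a short-term peak than Gaussian smoothing alone.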

3. Analysis of Predictive Modeling Methods

3.1. Multiscale Convolutional Neural Networks

The structure of a traditional CNN generally includes an input layer, an output layer, and multiple hidden layers, as shown in Figure 3. The hidden layers can be convolutional, pooling, or fully connected layers. In convolutional layers, convolutional kernels generate local features due to sparse connectivity between neurons in neighboring layers. The desired features can be rearranged and mined through a series of convolution operations that exploit the statistical properties of similarity between feature maps. A nonlinear activation function is applied after the convolution. The output features of the convolutional layer can be written as follows:

V_{j}^{r} = φ (\sum_{i} V_{i}^{r - 1} * l_{i j}^{r} + b_{j}^{r})

(7)

where * denotes the convolution operator, V_{i}^{r - 1} and V_{j}^{r} are the ith input feature of the (r − 1)th layer and the jth output feature of the rth layer in the convolution process, respectively, l_{i j}^{r} and b_{j}^{r} are the convolution kernel and bias, and φ is the nonlinear activation function.

From the structure of the traditional CNN, it is clear that it extracts features with a single convolutional kernel size and therefore cannot perceive global features. Parallel convolutional structures, by contrast, can extract multiscale temporal features and fuse them step by step. In this paper, to obtain different levels of temporal–spatial features, we use three different sizes of convolutional kernels to form the MSCNN, the structure of which is shown in Figure 4. The three kernel sizes acquire spatial information under receptive fields of different sizes, which compensates for the limited features acquired by a single convolutional kernel. Once the different feature maps are extracted, they are concatenated into a fused feature layer and fed into subsequent modules. The MSCNN acquires spatial information from bridge deflection data, focusing on trends within each period.
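The parallel-kernel idea can be illustrated with a small NumPy sketch. It is not the paper's network: the kernels below are random stand-ins for learned weights, ReLU is assumed as the activation φ, and the kernel sizes 3, 7, and 11 are the ones reported later in the parameter settings. Zero padding keeps every feature map the same length so that the maps can be stacked.

```python
import numpy as np


def conv1d_same(x, kernel):
    """Slide an odd-length kernel over x (cross-correlation, as in deep learning
    libraries), zero-padding so the output has the same length as x."""
    pad = (len(kernel) - 1) // 2
    padded = np.pad(x, pad)
    windows = np.lib.stride_tricks.sliding_window_view(padded, len(kernel))
    return windows @ kernel


def multiscale_features(x, kernels):
    """Apply each kernel in parallel, apply ReLU, and stack the feature maps."""
    maps = [np.maximum(conv1d_same(x, k), 0.0) for k in kernels]
    return np.stack(maps, axis=0)          # shape: (number of scales, len(x))


rng = np.random.default_rng(0)
signal = rng.normal(size=24)                           # one day of hourly samples
kernels = [rng.normal(size=k) for k in (3, 7, 11)]     # stand-ins for learned kernels
features = multiscale_features(signal, kernels)
```

Each row of `features` sees a different neighborhood width (3, 7, or 11 samples), which is what allows the later layers to combine short-range and longer-range patterns.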

3.2. Gated Recurrent Unit

A traditional GRU merges the forget and input gates of the LSTM into a single update gate and uses a reset gate, which maintains a balance between prediction accuracy and complexity. The update gate controls the extent to which previous feature information affects feature information at the current moment, and the reset gate controls the extent to which current information is combined with information from the previous moment. The GRU structure is shown in Figure 5, and its specific calculation steps are as follows.

\left\{\begin{array}{l} r_{t} = σ (W_{r} \cdot [h_{t - 1}, x_{t}] + b_{r}) \\ z_{t} = σ (W_{z} \cdot [h_{t - 1}, x_{t}] + b_{z}) \\ {\tilde{h}}_{t} = \tanh (W_{h} \cdot [r_{t} \cdot h_{t - 1}, x_{t}] + b_{h}) \\ h_{t} = (1 - z_{t}) \cdot h_{t - 1} + z_{t} \cdot {\tilde{h}}_{t} \end{array}\right.

(8)

where x_{t} is the input data; h_{t} is the output result; h_{t - 1} is the output of the previous period; σ is the sigmoid activation function; r_{t} and z_{t} are the outputs of the reset gate and the update gate at time t, respectively; {\tilde{h}}_{t} denotes the candidate hidden state at the current moment; W_{r}, W_{z}, and W_{h} denote the weight matrices of the reset gate, the update gate, and the hidden state, respectively; and b_{r}, b_{z}, and b_{h} denote the biases of the reset gate, the update gate, and the hidden state.

Since the inputs are time-series data on bridge deflections, attention must be paid to the links between the different periods. The GRU can mine the temporal information in the data and account for the effect of preceding periods on the current prediction, which is conducive to improving prediction accuracy.
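Equation (8) can be traced step by step with the following NumPy sketch of a single GRU update. The random weights stand in for trained parameters, and the input size of 3 is an assumption for illustration; the hidden size of 32 and the history length of 5 follow the settings reported in Section 4.2.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def gru_step(x_t, h_prev, params):
    """One GRU update following Equation (8)."""
    W_r, W_z, W_h, b_r, b_z, b_h = params
    concat = np.concatenate([h_prev, x_t])            # [h_{t-1}, x_t]
    r_t = sigmoid(W_r @ concat + b_r)                 # reset gate
    z_t = sigmoid(W_z @ concat + b_z)                 # update gate
    gated = np.concatenate([r_t * h_prev, x_t])       # [r_t * h_{t-1}, x_t]
    h_tilde = np.tanh(W_h @ gated + b_h)              # candidate hidden state
    return (1.0 - z_t) * h_prev + z_t * h_tilde       # new hidden state h_t


def init_params(input_size, hidden_size, seed=0):
    """Random stand-ins for trained weights; biases start at zero."""
    rng = np.random.default_rng(seed)
    shape = (hidden_size, hidden_size + input_size)
    weights = [rng.normal(scale=0.1, size=shape) for _ in range(3)]
    biases = [np.zeros(hidden_size) for _ in range(3)]
    return (*weights, *biases)


params = init_params(input_size=3, hidden_size=32)
h = np.zeros(32)
for x_t in np.random.default_rng(1).normal(size=(5, 3)):   # five history steps
    h = gru_step(x_t, h, params)
```

Because h_t is a gate-weighted mix of h_{t-1} and the tanh-bounded candidate, every entry of `h` stays within [−1, 1] when the initial state is zero.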

3.3. Self-Attention

Conventional convolutional layers can only acquire local feature information and cannot simultaneously attend to long-range features, ignoring the correlation between overall and local information. The Transformer changed this situation: it relies mainly on SA to acquire features at different distances, which compensates for the CNN’s limited ability to acquire global information. The relationship between the CNN and SA is shown in Figure 6. SA essentially linearly projects the input data to obtain the query matrix Q, the key matrix K, and the value matrix V. The dot product between Q and all K is then computed and scaled, and a Softmax activation function is used to determine the weight coefficients. Finally, multiplying the weights by V yields an attention-weighted feature map. The specific formula for SA is shown below:

\mathrm{Attention} (Q, K, V) = \mathrm{Softmax} \left(\frac{Q K^{T}}{\sqrt{d_{k}}}\right) V

(9)

where Q, K, and V correspond to dimensions d_{q}, d_{k}, and d_{v}, respectively, \sqrt{d_{k}} is a scaling factor to keep the gradient stable, and Softmax is the normalized activation function. The structure of SA is shown in Figure 7.

SA can dynamically adjust the feature weights by calculating the correlation between different locations in the bridge deflection input sequence, ensuring that the model pays more attention to the spatiotemporal features that have a greater impact on deflection when predicting. This mechanism allows the model to selectively down-weight unnecessary anomalous features.
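Equation (9) can be written out directly in NumPy. The projection matrices below are random stand-ins for learned weights; the sequence length of 5, the feature size of 32, and the projection size of 16 follow the settings reported in Section 4.2, and the function names are hypothetical.

```python
import numpy as np


def softmax(scores):
    """Row-wise Softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention, Equation (9)."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v          # linear projections of the input
    d_k = K.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))    # one weight row per time step
    return weights @ V, weights


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 32))                     # 5 time steps, 32 features each
W_q, W_k, W_v = (rng.normal(scale=0.1, size=(32, 16)) for _ in range(3))
attended, weights = self_attention(X, W_q, W_k, W_v)
```

Each row of `weights` sums to one, so every output time step is a weighted average of the value vectors; time steps with small weights contribute little to the result.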

3.4. Multiscale Spatiotemporal Attention Network

To obtain the temporal and spatial characteristics of bridge deflection at the same time, the MSSAN is designed for bridge deflection prediction in this paper; its structure is shown in Figure 8. The time-series data of bridge deflections are first input into three convolution kernels of different sizes, which extract multiscale spatial features of the deflection data. Convolution kernels of different sizes capture local information at different scales, helping the model to identify the feature variations of bridges at different spatial scales and thus improving its ability to capture the spatial features of bridge deflection. Next, these spatial features are fused and fed into the GRU to obtain dynamic deflection features in the time dimension. The GRU captures trends in deflection data in the time dimension by remembering long-term dependencies through its gating mechanism. Then, a self-attention mechanism is introduced to further enhance the features and automatically identify those that are important for deflection prediction in the time-series data. Finally, the prediction is obtained by linear mapping.

3.5. Designed Bridge Deflection Prediction Method

Traditional bridge deflection prediction faces two difficulties: (1) previous signal smoothing methods are unable to retain the overall trend and short-term fluctuation trends at the same time and are prone to losing part of the time-series information; (2) the prediction model lacks access to multi-level spatiotemporal information and fails to make full use of long-range feature correlations, which affects the accuracy of bridge deflection prediction. To solve these problems, this paper proposes a bridge deflection prediction method based on cascaded residual smoothing and the multiscale spatiotemporal attention network, which achieves high-precision deflection prediction when deflection characteristics are complex and vary drastically. The specific process is shown in Figure 9.

The bridge deflection prediction method proposed in this paper consists of the following parts. Firstly, the raw bridge deflection dataset is smoothed by the Gaussian smoothing stage of cascade residual smoothing, which filters out high-frequency noise and captures the overall trend of bridge deflection. Secondly, the Gaussian-smoothed data are compared with the raw data, and the residuals are calculated. The residuals contain the subtle features and short-term fluctuations that Gaussian smoothing fails to capture, so they preserve short-term variations and localized features in the deflection data. To further extract the trend information in the residuals, quadratic exponential smoothing is applied to them so that the new data ultimately generated retain both the long-term trend of the bridge deflection and its short-term fluctuation characteristics. Then, the multiscale spatiotemporal attention network captures the multi-level spatiotemporal information of bridge deflection, filters the important features while ignoring unnecessary information, and outputs the prediction results through linear mapping. Finally, the model is trained on the training set, and its effectiveness is verified on the test set.

4. Case Study Analysis

4.1. Data Presentation

This paper uses the deflection data collected from Xin’anjiang Bridge in China. The main bridge is a three-span, unequal-span, center-bearing steel-tube concrete arch bridge, and the approach bridge is a hollow plate girder structure. The bridge has a total length of 374.4 m, a span arrangement of (13 + 10 + 86.5 + 125 + 86.5 + 13 + 2 × 20) m, and a deck width of 10.5 m. Xin’anjiang Bridge was built more than 30 years ago; during operation, its materials, components, and structure have aged and become damaged, and the gradual decline in structural performance can easily lead to safety hazards. Ensuring the structure’s safety during operation requires real-time monitoring and early warning of possible defects and risks. Therefore, it is necessary to establish a health monitoring system for the bridge.

The bridge deflection data acquisition scheme for this experiment uses six static levels, with the static level near the bridge abutment as the reference point, and the remaining five levels arranged at key locations, including the 1/4 position in the middle of the span and the top of the arch area, as shown in Figure 10. These location choices effectively capture the deformation of the bridge under different loads and environmental factors. The overall deformation trend of the bridge is analyzed by using the readings at the reference point as a reference to obtain relative deflection data at other measurement points.

For this paper, we collected data from 1 May 2024 to 30 May 2024 with a sampling interval of 1 h. Five static level values were collected for each hour, giving 720 datasets.

4.2. Model Parameters and Hyperparameter Selection

The cascade residual smoothing and MSSAN proposed in this paper have several parameters that need to be determined; their values are determined by grid search, as shown in Table 1 and Table 2. In Table 1, the MSSAN employs multiscale convolution with kernel sizes of 3, 7, and 11, each with a matching level of padding so that the outputs keep the same number of features and can be concatenated easily. The GRU maps the convolved feature dimensions to 32 to obtain the bridge deflection correlation between different periods. Then, SA linearly projects the input data to obtain the query matrix Q, the key matrix K, and the value matrix V, each with dimension 16. Finally, the predicted values are output by linear mapping. In Table 2, the Adam optimizer is used with a learning rate of 0.001, the number of training epochs and the batch size are 500 and 256, respectively, the history step is 5, and the prediction step is 1.
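A history step of 5 and a prediction step of 1 mean that each training sample pairs five consecutive readings with the reading that follows them. The sketch below shows one common way to build such samples; the function name `make_windows` and the exact windowing scheme are assumptions, since the paper does not give its data-loading code.

```python
import numpy as np


def make_windows(series, history=5, horizon=1):
    """Pair each block of `history` readings with the reading `horizon` steps later."""
    series = np.asarray(series, dtype=float)
    n = len(series) - history - horizon + 1
    if n <= 0:
        raise ValueError("series is too short for the requested window")
    X = np.stack([series[i:i + history] for i in range(n)])
    y = np.array([series[i + history + horizon - 1] for i in range(n)])
    return X, y


# 720 hourly readings (30 days) yield 715 input/target pairs.
X, y = make_windows(np.arange(720), history=5, horizon=1)
```

With an index series as input, the first pair is readings 0 to 4 with target 5, and the last pair is readings 714 to 718 with target 719.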

The Mean Absolute Error (MAE), Root Mean Square Error (RMSE) and coefficient of determination (R²) are used as the evaluation indexes of the model in this experiment, and their calculation formulas are shown in Equations (10)–(12):

MAE = \frac{1}{n} \sum_{i = 1}^{n} | {\hat{y}}_{i} - y_{i} |

(10)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}

(11)

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(12)

where \hat{y} = \{{\hat{y}}_{1}, {\hat{y}}_{2}, \dots, {\hat{y}}_{n}\} is the predicted value, y = \{y_{1}, y_{2}, \dots, y_{n}\} is the measured value, \bar{y} is the mean of the measured values, and n is the number of samples. The RMSE and MAE depend on the scale of the data, so large raw values may produce large metric values. R², on the other hand, typically ranges from 0 to 1, with values close to 1 indicating that the model fits the data well and values close to 0 indicating that the model fits poorly. The cascade residual smoothing method proposed in this paper has three hyperparameters: the Gaussian smoothing standard deviation σ, the smoothing intensity coefficient α, and the trend coefficient β. They are selected by averaging the coefficient of determination, R², over five experiments, as shown in Figure 11.

In Figure 11, it can be seen that cascade residual smoothing works best with a Gaussian smoothing standard deviation σ of 1.25. In addition, a larger smoothing intensity coefficient α in quadratic exponential smoothing is not always better; the best results are reached at 0.5, and the trend coefficient β is optimal at 0.01.
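The three evaluation metrics of Equations (10)–(12) can be computed as in the following sketch. The function names and the toy values are illustrative only and are not taken from the experiments.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean Absolute Error, Equation (10)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_pred - y_true))


def rmse(y_true, y_pred):
    """Root Mean Square Error, Equation (11)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_pred - y_true) ** 2))


def r2(y_true, y_pred):
    """Coefficient of determination, Equation (12)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)            # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)     # total sum of squares
    return 1.0 - ss_res / ss_tot


measured = [1.0, 2.0, 3.0, 4.0]      # toy values
predicted = [1.0, 2.0, 3.0, 6.0]     # toy values with one error of 2
scores = (mae(measured, predicted), rmse(measured, predicted), r2(measured, predicted))
```

For these toy values the MAE is 0.5, the RMSE is 1.0, and R² is 0.2. Note that Equation (12) gives a negative R² when the predictions are worse than simply predicting the mean of the measured values.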

4.3. Prediction Results

In this experiment, the deflection signals collected from five static levels of Xin’anjiang Bridge are selected as input data; the last seven days of data, i.e., the last 168 sets of data, are used as the test set, and the rest are used as the training set. Figure 12 shows the prediction results of the cascade residual smoothing method and the MSSAN for the five static levels. As can be seen from Figure 12, the predicted values of the proposed method match the true values well, reflecting the effectiveness of the method.
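The split described above is chronological: the most recent week is held out and nothing is shuffled. A minimal sketch follows, with a hypothetical helper name and an index array standing in for the real readings; how windows that straddle the boundary are handled is not stated in the paper and is not modeled here.

```python
import numpy as np


def chronological_split(data, test_size):
    """Hold out the last `test_size` samples as the test set, without shuffling."""
    data = np.asarray(data)
    if not 0 < test_size < len(data):
        raise ValueError("test_size must be between 1 and len(data) - 1")
    return data[:-test_size], data[-test_size:]


hours = np.arange(30 * 24)                               # 720 hourly samples
train, test = chronological_split(hours, test_size=7 * 24)
```

This leaves 552 hourly samples (23 days) for training and 168 (7 days) for testing, and every test sample comes after every training sample in time.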

In Figure 12a, it can be seen that the predicted values of the method approximate the real values very well at the beginning. However, a small bias remains at local maxima and local minima. This bias becomes more obvious in long-term tasks, such as after 100 h of the test set; the longer the prediction time, the higher the cumulative bias, which is an unavoidable drawback of today’s common prediction methods [21]. However, compared to other prediction methods, the proposed method has less bias at local maxima and local minima. In Figure 12b, there are no obvious ‘peaks’ and ‘troughs’, so the predicted values match the real values better, whereas the other methods show different degrees of deviation. In the experiments of this paper, the overall trend is still retained in the long-term prediction without completely deviating from the true labels, as shown in Figure 12c–e; even though some local minima at the later stage cannot be matched exactly, the proposed method still approximates them better than the other methods in the face of short-term fluctuations, which reflects its superiority.

In addition, to explore whether the cascade residual smoothing method proposed in this paper can effectively solve the problem that traditional smoothing methods cannot obtain the overall trend and short-term fluctuation information at the same time, this paper visualizes the raw data, the Gaussian-smoothed data, and the cascade residual-smoothed data, taking part of the data of static level 3 as an example, as shown in Figure 13. As can be seen from the figure, traditional Gaussian smoothing does capture the overall trend of the raw data, but in the face of short-term fluctuations, it ignores the local maxima and local minima and replaces them with curves of smaller curvature, which results in the loss of short-term fluctuation information. The cascade residual smoothing method proposed in this paper can solve this problem. The red line in the figure represents the cascade residual-smoothed data; in the face of short-term fluctuations, it does not ignore the change but retains the overall trend while following the ‘peaks’ and ‘troughs’, preserving the original short-term oscillation information that traditional methods lose. The cascade residual smoothing method extracts more short-term fluctuation information than the Gaussian smoothing method because it builds on the latter by analyzing the residuals with quadratic exponential smoothing. This matters for bridges, where unusual peaks or troughs over a short period may mean that the structure has been subjected to unusual loads or that material properties have degraded. Timely attention to and analysis of these short-term fluctuations can help engineers identify potential safety hazards early and avoid sudden structural failures.

To verify the effectiveness of the proposed cascaded residual smoothing-MSSAN, this paper compares it with similar techniques: the BiLSTM, the CNN-GRU, and CNN-Attention-LSTM [22]. The results of the comparison experiments are presented in Table 3, with MAE, RMSE, and R² as the evaluation indicators. Smaller MAE and RMSE values are better, a larger R² is better, and bold type marks the best value of each indicator at each static level position.
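For reference, the three indicators can be computed as in the following minimal sketch. It assumes the standard definitions of MAE, RMSE, and the coefficient of determination R²; the function name and the sample values in the usage note are illustrative and are not taken from the paper's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R2) for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                 # undefined if y_true is constant
    return mae, rmse, r2
```

For example, `regression_metrics([1, 2, 3, 4], [2, 2, 3, 3])` has two errors of magnitude 1 out of four samples, so the MAE is 0.5 and R² is 1 - 2/5 = 0.6.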

As can be seen from Table 3, the proposed method is not the best at static level position 5, where CNN-Attention-LSTM has the lowest MAE and the BiLSTM has the lowest RMSE and the highest R², but it is optimal on all three indicators at the other four positions. This paper also averages the metrics of each method over all static level positions, as shown in Figure 14. In terms of MAE, the mean for the cascaded residual smoothing-MSSAN is 2.4525, which is 28.17% lower than the BiLSTM (3.4143), 37.37% lower than the CNN-GRU (3.9158), and 25.79% lower than CNN-Attention-LSTM (3.3047). In terms of RMSE, the mean for the cascaded residual smoothing-MSSAN is 3.7556, which is 27.14% lower than the BiLSTM (5.1543), 38.41% lower than the CNN-GRU (6.0979), and 24.69% lower than CNN-Attention-LSTM (4.9869). In terms of R², the mean for the cascaded residual smoothing-MSSAN is 0.9384, which is 7.29% higher than the BiLSTM (0.8746), 9.59% higher than the CNN-GRU (0.8563), and 2.36% higher than CNN-Attention-LSTM (0.9168).
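The percentages quoted above are relative changes with respect to each baseline's mean. The short sketch below reproduces them from the mean values stated in the text; the helper names and the dictionary layout are illustrative.

```python
def pct_reduction(baseline, proposed):
    """Relative decrease of `proposed` with respect to `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


def pct_increase(baseline, proposed):
    """Relative increase of `proposed` with respect to `baseline`, in percent."""
    return (proposed - baseline) / baseline * 100.0


# Mean values over the five static level positions, as quoted in the text.
proposed = {"MAE": 2.4525, "RMSE": 3.7556, "R2": 0.9384}
baselines = {
    "BiLSTM": {"MAE": 3.4143, "RMSE": 5.1543, "R2": 0.8746},
    "CNN-GRU": {"MAE": 3.9158, "RMSE": 6.0979, "R2": 0.8563},
    "CNN-Attention-LSTM": {"MAE": 3.3047, "RMSE": 4.9869, "R2": 0.9168},
}

changes = {
    name: {
        "MAE": pct_reduction(vals["MAE"], proposed["MAE"]),
        "RMSE": pct_reduction(vals["RMSE"], proposed["RMSE"]),
        "R2": pct_increase(vals["R2"], proposed["R2"]),
    }
    for name, vals in baselines.items()
}

for name, vals in changes.items():
    print(f"{name}: MAE -{vals['MAE']:.2f}%, RMSE -{vals['RMSE']:.2f}%, "
          f"R2 +{vals['R2']:.2f}%")
```

Because MAE and RMSE are errors, improvement is measured as a reduction; because a larger R² is better, improvement is measured as an increase.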

This paper argues that the poor performance of the BiLSTM in bridge deflection prediction stems from its inability to comprehensively model nonlinear relationships and complex spatial features. While the CNN-GRU excels in extracting local features, bridge deflection prediction requires capturing multiscale features and modeling long-term dependencies. The CNN-GRU struggles to effectively capture these complex relationships, limiting its predictive accuracy. CNN-Attention-LSTM incorporates an attention mechanism to focus on important features but fails to preprocess the raw data effectively, leading to input data containing high-frequency noise and abnormal fluctuations.

Compared to these methods, the method based on cascaded residual smoothing and an MSSAN proposed in this paper achieves better predictions. This is because the method first removes noise by cascaded residual smoothing while preserving both the long-term trend and the short-term fluctuation features of the bridge deflection. The spatial features at different scales are then captured by the MSCNN, the connection between different time steps is obtained through the GRU to extract the temporal features, and SA focuses on the deflection features that have a large impact. In addition, after the anomalous data are removed, the method still attends to changes such as extreme deflection, peaks, and troughs, which improves the robustness and accuracy of the prediction and reflects the superiority of the proposed method.
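The MSCNN, GRU, and SA pipeline described above can be sketched in PyTorch using the layer sizes listed in Table 1. This is only an illustrative sketch: the class name, the single input channel, the ReLU activation, the channel-wise concatenation of the three branches, and the use of the last time step for the final prediction are assumptions, not details confirmed by the paper.

```python
import math

import torch
from torch import nn


class MSSANSketch(nn.Module):
    """Multiscale Conv1d branches -> GRU -> single-head self-attention -> linear."""

    def __init__(self, in_channels=1, conv_channels=16, gru_hidden=32, attn_dim=16):
        super().__init__()
        # Kernel sizes 3/7/11 with padding 1/3/5 keep the sequence length (Table 1).
        self.branches = nn.ModuleList([
            nn.Conv1d(in_channels, conv_channels, kernel_size=k, stride=1, padding=p)
            for k, p in ((3, 1), (7, 3), (11, 5))
        ])
        self.gru = nn.GRU(3 * conv_channels, gru_hidden, batch_first=True)
        self.q = nn.Linear(gru_hidden, attn_dim)
        self.k = nn.Linear(gru_hidden, attn_dim)
        self.v = nn.Linear(gru_hidden, attn_dim)
        self.head = nn.Linear(attn_dim, 1)
        self.scale = math.sqrt(attn_dim)

    def forward(self, x):
        # x: (batch, time_steps, in_channels)
        z = x.transpose(1, 2)                       # (batch, in_channels, time)
        z = torch.cat([torch.relu(b(z)) for b in self.branches], dim=1)
        h, _ = self.gru(z.transpose(1, 2))          # (batch, time, gru_hidden)
        q, k, v = self.q(h), self.k(h), self.v(h)
        weights = torch.softmax(q @ k.transpose(1, 2) / self.scale, dim=-1)
        context = weights @ v                       # (batch, time, attn_dim)
        return self.head(context[:, -1, :])         # (batch, 1), assumed pooling


model = MSSANSketch()
window = torch.randn(4, 5, 1)   # batch of 4 windows, time step 5 (Table 2)
prediction = model(window)
```

With a time step of 5, each branch returns a sequence of length 5, so the three 16-channel outputs can be concatenated into a 48-channel input for the GRU.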

5. Conclusions

In this paper, a bridge deflection prediction method based on cascaded residual smoothing and a multiscale spatiotemporal attention network is proposed. First, the bridge deflection signal is preprocessed by the designed cascade residual smoothing method: Gaussian filtering extracts the overall trend, and quadratic exponential smoothing of the residuals recovers the short-term trend, which solves the problem of traditional smoothing methods ignoring short-term fluctuation information. Then, the bridge deflection is predicted by the designed multiscale spatiotemporal attention network, which obtains the spatiotemporal information of bridge deflection at multiple scales and filters the important fluctuation features by self-attention. Compared with prediction methods that require observational data for physical modeling, the proposed method can automatically learn multilayer features from raw data and make predictions, avoiding the complicated manual feature design process. In addition, compared with the BiLSTM, the CNN-GRU, and CNN-Attention-LSTM, the mean MAE of the proposed method is reduced by 28.17%, 37.37%, and 25.79%, respectively; the mean RMSE is reduced by 27.14%, 38.41%, and 24.69%, respectively; and the mean R² is increased by 7.29%, 9.59%, and 2.36%, respectively.

In future work, we hope to overcome the low prediction accuracy of the proposed method over the long term and will try to perform feature extraction in the frequency domain so that the model acquires better features.

Author Contributions

Conceptualization, X.W.; Methodology, J.L.; Data collection, H.-M.Q.; Writing—original draft preparation, X.W., L.-S.H. and C.-Q.W.; Writing—review and editing, X.W. and H.-M.Q.; Funding acquisition, X.W., J.L. and L.-S.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research work was supported by the Scientific Research Foundation of Hangzhou City University under Grants No. X-202107 and X-202109 and the Zhejiang Engineering Research Center of Intelligent Urban Infrastructure under Grant No. IUI2022-YB-05.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Acknowledgments

The authors thank everyone who helped with and contributed to this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Chung, W.; Kim, S.; Kim, N.-S.; Lee, H.-U. Deflection estimation of a full scale prestressed concrete girder using long-gauge fiber optic sensors. Constr. Build. Mater. 2008, 22, 394–401.
2. Fu, Z.; Wang, H.; Fan, P.; An, P. Development and application research of bridge structure health monitoring technology. Sci. Innov. 2024, 21, 173–175+178.
3. Worden, K.; Cross, E.J. On switching response surface models, with applications to the structural health monitoring of bridges. Mech. Syst. Signal Process. 2018, 98, 139–156.
4. Zhang, R.; Meng, L.; Mao, Z.; Sun, H. Spatiotemporal deep learning for bridge response forecasting. J. Struct. Eng. 2021, 147, 04021070.
5. Liu, H.; Geng, D.; Wang, L. Research on data processing methods for bridge vibration monitoring. Sci. Technol. Innov. 2023, 13, 145–148.
6. Liu, X.; Wang, Y.; Tang, C. Application of automatic hydrostatic leveling measurement technology in subway tunnel structure monitoring. Bull. Surv. Mapp. 2021, 8, 69–73.
7. Pei, Y.; Guo, M. The fundamental principle and application of sliding average method. Gun Launch Control J. 2001, 1, 21–23.
8. Chen, J.; Wan, C.; Xiao, C.; Li, L. Research on signal processing for bridge health monitoring. In Proceedings of the 2009 9th International Conference on Electronic Measurement & Instruments, Beijing, China, 16–19 August 2009; pp. 4-338–4-342.
9. Lu, X.; Han, F.; Ma, L.; Zhang, L.; Sun, P. Noise-resistant bearing fault diagnosis method based on Gaussian filtering and multiscale CNN. Noise Vib. Control 2024, 44, 132–137.
10. Li, L.; Li, X.; Tan, S. Research and application of rapid identification method for bridge cable fundamental frequency based on microwave interferometric radar. World Bridges 2023, 51, 91–97.
11. Xin, J.; Jiang, Y.; Zhou, J.; Peng, L.; Liu, S.; Tang, Q. Bridge deformation prediction based on SHM data using improved VMD and conditional KDE. Eng. Struct. 2022, 261, 114285.
12. Fang, R.; Yang, J. Application of the ARMA model to bridge structural health monitoring. Intell. Autom. Soft Comput. 2010, 16, 755–762.
13. Chen, Z.; Chang, J. Application of combined ridge regression and SARIMA methods in bridge health monitoring data analysis. Sci. Technol. Eng. 2023, 23, 8846–8853.
14. Li, S.; Zuo, X.; Li, Z.; Wang, H. Applying Deep Learning to Continuous Bridge Deflection Detected by Fiber Optic Gyroscope for Damage Detection. Sensors 2020, 20, 911.
15. Nie, H.; Ying, J.; Deng, J. Bridge health monitoring deflection prediction method based on Bi-LSTM model. Highway 2024, 69, 213–219.
16. Gu, S.; Chang, S.; Han, M.; Zhao, J. Bridge deflection prediction driven by millimeter-wave radar data. Sci. Technol. Eng. 2023, 23, 4874–4880.
17. Calò, M.; Ruggieri, S.; Nettis, A.; Uva, G. A MTInSAR-based early warning system to appraise deformations in simply supported concrete girder bridges. Struct. Control Health Monit. 2024, 2024, 8978782.
18. Calò, M.; Ruggieri, S.; Nettis, A.; Uva, G. A GIS Plugin for the Assessment of Deformations in Existing Bridge Portfolios via MTInSAR Data. Remote Sens. 2024, 16, 4293.
19. Qiu, X.; Wang, F.; Zhang, Q.; Tao, G.; Zhou, S. An improved Gaussian process for filling the missing data in GNSS position time series considering the influence of adjacent stations. Sci. Rep. 2024, 14, 19268.
20. Wu, D.; Ma, W.; Yang, L. Power load forecasting based on a multi-objective combined model with double exponential smoothing. Comput. Eng. Des. 2023, 44, 2541–2547.
21. Sun, B.; Zhang, B.; Yu, X.; Chen, X.; Jiang, Z.; Wu, J. Research on bridge technical condition prediction based on deep learning. China Transp. Informatiz. 2024, S1, 568–571.
22. Shi, Y.; Zheng, D.J.; Zhao, H.; Zhou, X. Dam deformation prediction model based on CNN-Attention-LSTM. Water Resour. Hydropower Eng. 2024, 55, 121–132. (In Chinese and English)

Figure 1. Gaussian smoothing.

Figure 2. Cascade residual smoothing.

Figure 3. Convolutional neural network.

Figure 4. Multiscale convolutional neural network.

Figure 5. Gated recurrent unit.

Figure 6. Relationship diagram.

Figure 7. Self-attention.

Figure 8. Multiscale spatiotemporal attention network.

Figure 9. The method designed in this paper.

Figure 10. Xin’anjiang Bridge data collection.

Figure 11. Hyperparameter selection.

Figure 12. Prediction results based on cascade residual smoothing and MSSAN.

Figure 13. Data after cascade residual smoothing.

Figure 14. Average value.

Table 1. MSSAN parameters.

Module	Kernel size	Channels	Stride	Dimension	Padding
Conv1	3	16	1	/	1
Conv2	7	16	1	/	3
Conv3	11	16	1	/	5
GRU	/	/	/	32	/
SA-Q	/	/	/	16	/
SA-K	/	/	/	16	/
SA-V	/	/	/	16	/
Linear	/	/	/	1	/

Table 2. Hyperparameter selection.

Hyperparameter	Value
Optimizer	Adam
Learning rate	0.001
Epochs	500
Batch size	256
Time step	5
$σ$	1.25
$α$	0.5
$β$	0.01

Table 3. Comparison of experimental results.

Static Level Position	Methodology	MAE	RMSE	R²
1	Cascade residual smoothing-MSSAN	9.6932	15.0834	0.9775
	BiLSTM	13.3812	20.5796	0.9581
	CNN-GRU	15.6298	24.3279	0.9281
	CNN-Attention-LSTM	13.5982	21.0110	0.9477
2	Cascade residual smoothing-MSSAN	0.1173	0.1674	0.9868
	BiLSTM	0.3329	0.3781	0.8923
	CNN-GRU	0.8363	1.6303	0.7839
	CNN-Attention-LSTM	0.1764	0.2003	0.9637
3	Cascade residual smoothing-MSSAN	0.6403	0.9998	0.9467
	BiLSTM	1.0948	1.6285	0.8586
	CNN-GRU	0.8853	1.4356	0.8902
	CNN-Attention-LSTM	0.8005	1.095	0.9361
4	Cascade residual smoothing-MSSAN	0.8013	1.0842	0.9277
	BiLSTM	1.2577	1.7821	0.8047
	CNN-GRU	1.0098	1.416	0.8751
	CNN-Attention-LSTM	0.9637	1.116	0.8978
5	Cascade residual smoothing-MSSAN	1.0105	1.4434	0.8533
	BiLSTM	1.0049	1.4034	0.8597
	CNN-GRU	1.2181	1.6798	0.8046
	CNN-Attention-LSTM	0.9848	1.5126	0.8390

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
