1. Introduction
Structural health monitoring (SHM) is essential for evaluating the safety and operational status of buildings, playing a key role in preserving structural integrity and ensuring the safety of human lives [1,2]. Dam health monitoring (DHM) is a specialized application of SHM that addresses the challenges associated with the aging and performance of dam structures. The importance of dam safety has grown in light of numerous accidents and failures, underscoring the need for effective health monitoring and early warning systems. Dam failure is a dynamic, time-evolving process, necessitating continuous surveillance to ensure structural integrity [3]. Throughout their operational life, dams are subjected to the dual effects of external loads and the gradual aging of internal materials [4], leading to a progressive decline in structural safety [5]. This degradation typically manifests as anomalies or sudden changes in monitoring data, with deformation being a critical metric for assessing a dam's operational health [6]. Therefore, in-depth studies of dam deformation patterns and the development of robust anomaly detection models are crucial for evaluating structural safety and implementing timely interventions.
Dam monitoring systems, which include pendulums installed within and on the dam's surface, are pivotal for collecting deformation data [7]. However, these data often include outliers that deviate from expected patterns owing to complex factors such as physical processes, structural defects, and instrumentation errors, which complicates accurate outlier identification. Such outliers can be categorized as either reasonable or unreasonable according to their origins. Like other monitoring variables, the deformation series exhibits several characteristics, including a large number of monitoring points, rich information content, and long-term and cyclical changes related to environmental factors [8].
Existing studies on dam anomaly detection generally follow a unified approach: a predictive model is built to establish a baseline series, the residuals between the baseline and observed series are calculated, and confidence intervals are set to identify anomalies. For instance, Li et al. [8] introduced an anomaly identification and warning system for dams using M-robust regression methods. This was further refined by Han et al. [9], who enhanced the M-robust linear regression technique and developed an efficient method for the online identification of anomalies in monitoring data. Xu et al. [10] devised a model to pinpoint anomalies in dam monitoring data, introducing a three-stage online process for outlier differentiation. Zhang et al. [11] proposed a data type-based self-matching model for detecting anomalies in dams, addressing the limitations of single-method approaches to outlier identification. Although these traditional methods play a significant role in processing dam monitoring data, they often struggle to capture the nonlinear relationships between series, particularly when dealing with data such as deformation sequences [12].
With advancements in machine learning (ML), the deployment of ML algorithms for DHM, particularly in anomaly detection, has received heightened interest. The application of ML in this context is divided into the following three categories based on data labeling: supervised learning, which employs classifiers trained on extensive labeled datasets to detect anomalies; unsupervised learning, which uses models trained on unlabeled data encompassing both normal and anomalous conditions to identify outliers; and semi-supervised learning, which begins with model training on unlabeled data, followed by refinement with a limited amount of labeled data for enhanced anomaly detection accuracy.
Recently, anomaly detection methods based on supervised learning have emerged. Salazar et al. [13] implemented anomaly detection in dam monitoring data using boosted regression tree models and compared the performance of causal, non-causal, and autoregressive models. They emphasized the interpretability benefits of causal and non-causal models, alongside the simplicity and efficiency offered by autoregressive models. Further research by Salazar et al. [14] investigated anomaly detection within DHM using support vector machines and random forests, discussing the potential and limitations of multi-class, two-class, and one-class classifiers. Despite the high accuracy of these methods, their development faces significant challenges, including managing large deformation monitoring datasets, the difficulty of data labeling, and the scarcity of adequate training samples for supervised learning.
Conversely, unsupervised learning techniques, which do not necessitate labeled data, encompass a wide array of anomaly detection algorithms, including clustering-based, distance-based, density-based, and prediction model-based strategies [15,16,17,18,19,20,21]. Researchers have explored the application of unsupervised learning techniques to detecting anomalies in DHM. Shao et al. [22] introduced a general and robust anomaly detection method from the perspectives of image processing and artificial intelligence. Rong et al. [23] developed a multipoint anomaly identification model, integrating an enhanced local outlier factor with mutual verification to account for spatiotemporal correlations. Ji et al. [24] proposed an anomaly detection strategy based on refined spectral clustering. Su et al. [25] presented a diagnostic approach for dam structural behavior that combines probabilistic reasoning and data fusion. Dong et al. [26] described a monitoring data anomaly identification method using an improved cloud model and a radial basis function neural network. Liu et al. [27] developed an arch dam deformation anomaly detection model based on long short-term memory networks. These unsupervised approaches mitigate the label scarcity that hampers supervised methods for dam anomaly detection. Nonetheless, their effectiveness depends heavily on the precision of the models used [28], which limits their application. In response to these limitations, researchers have advanced the use of variational autoencoders (VAEs) [29] in anomaly detection, offering a promising direction for overcoming the obstacles associated with accurate modeling.
VAEs have become a prominent unsupervised learning technique in dam anomaly detection, focusing on training classifiers to learn the probability distribution of normal operational states. Zhou et al. [30] developed an innovative model that merges generative adversarial networks (GANs) with VAEs. Shu et al. [31] proposed an anomaly assessment framework based on sequential variational autoencoders coupled with evidence theory, enabling both the detection and fusion of anomalies. Subsequently, Shu et al. [32] integrated spatiotemporal correlations into their model, employing temporal VAEs and graph convolutional neural networks for enhanced anomaly detection. While VAE-based models perform well in identifying anomalies in dam monitoring data, they often presuppose a Gaussian distribution of the underlying data, a presumption that may not reflect the true distribution of dam deformation data. This theoretical discrepancy can affect the accuracy of reconstructing monitoring data. Furthermore, these methods are generally trained on normal data sequences, a practice that overlooks the presence of random anomalies in sensor-collected data, thereby complicating data preprocessing and diminishing the methods' overall effectiveness and applicability. Semi-supervised learning, which bridges the gap between supervised and unsupervised learning and can enhance anomaly detection accuracy without requiring extensive labeled data, presents a viable alternative. However, its adoption within the DHM sector is still nascent, with more common use in sectors such as mechanics [33] and environmental studies [34].
Addressing the limitations of current data-driven approaches for anomaly detection in dam deformation, this study identifies the following three primary challenges: (1) the dependency of supervised learning on labeled data and complex preprocessing, which restricts its applicability in large-scale engineering projects; (2) the constraints of unsupervised learning methods due to theoretical inaccuracies and reliance on dataset quality, which affect their precision; and (3) the prevalent focus on the temporal characteristics of deformation, which often neglects or merely qualitatively analyzes spatial correlations, missing the accuracy gains that a detailed spatial analysis can provide. To address these issues, this paper proposes an efficient anomaly detection method for dam deformation based on self-supervised learning. This approach comprehensively considers both the spatial correlations and temporal attributes of dam deformation, aiming to overcome the limitations associated with label generation in supervised learning and the accuracy dependence and theoretical discrepancies characteristic of unsupervised learning. For details on self-supervised learning, see Section 2.1.
The proposed methodology constructs a correlation matrix incorporating spatial associations and employs sliding window techniques to generate time sequences. It integrates convolutional blocks and transformer technology for effective information extraction and applies spatiotemporal contrastive learning to pre-train the encoder, enabling it to distinguish unique dataset representations; a classifier is then fine-tuned for anomaly discrimination. The efficacy of this methodology is demonstrated through a case study. The primary contributions of this research are summarized as follows: (1) It proposes a self-supervised spatiotemporal contrastive pretraining (STCLP) method for representation learning in dam health monitoring, which, to the best of the authors' knowledge, represents the first application of contrastive learning in this domain. (2) It proposes an anomaly detection method for dam deformation based on STCLP, which leverages large amounts of unlabeled data to enhance generalization performance, offering more timely and robust detection than other state-of-the-art techniques. (3) It improves the integration of spatial relationships by incorporating spatial contrastive learning on top of temporal contrastive learning, fully utilizing the spatial features of dam deformation for information extraction and thereby increasing the accuracy and soundness of the method.
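To make the contrastive idea concrete, the following minimal sketch shows a standard NT-Xent loss of the kind used in such pretraining; the function name, temperature value, and batching convention are illustrative assumptions, not the paper's exact formulation (which is given in Section 2).

```python
import torch
import torch.nn.functional as F

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.5):
    """NT-Xent contrastive loss over two batches of projected embeddings.

    z1[i] and z2[i] are two augmented views of the same window (a positive
    pair); every other embedding in the batch serves as a negative."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # 2n unit vectors
    sim = z @ z.t() / temperature                       # cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-pairs
    # The positive partner of index i is at i + n (and vice versa).
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)
```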
This paper is structured as follows: Section 2 presents the background knowledge and implementation details of the proposed dam anomaly detection method. Section 3 details the case study analysis and discussion of results. Finally, Section 4 concludes this study and suggests directions for future research.
3. Case Study
The proposed anomaly detection method was evaluated using real-world engineering data on horizontal deformation. The development and testing of the models took place in a Python 3.7 and PyTorch 1.7 environment on a computer configured with an Intel(R) Core(TM) i7-8700K CPU at 3.70 GHz and an NVIDIA GeForce RTX 2080 Ti graphics card with 11 GB of VRAM.
3.1. Data Collection and Processing
The case study focuses on a concrete arch dam in Yunnan Province, China. Figure 5 presents photographs of the dam from upstream and downstream perspectives. Horizontal deformation, a critical monitoring aspect, is tracked by a pendulum system installed within the dam structure. Figure 6 illustrates the overall arrangement of the dam's pendulum system. In seven dam sections (9#, 15#, 19#, 22#, 25#, 29#, and 35#) across various gallery levels (at elevations of 1014 m, 1054 m, 1100 m, 1150 m, and 1190 m), vertical pendulums are placed in segments within the foundation gallery, the grouting gallery, and at the inverted pendulum connections to monitor the dam's horizontal deformation and deflection. The pendulums are primarily monitored automatically with capacitive pendulum inclinometers, which record observations three times a day. For additional accuracy, manual observations using optical pendulum inclinometers are conducted monthly.
Figure 7 presents the long-term monitoring series of horizontal deformations at various measuring points, with the sample series spanning from 1 January 2012, to 3 December 2018, at a daily data sampling frequency. An analysis reveals that changes in upstream water levels predominantly influence the dam’s horizontal deformation, resulting in downstream deformation as water levels increase and an upstream rebound when they decrease. Moreover, deformation trends and periodic changes at different locations exhibit remarkable similarity, and nearly all measuring points show anomalies around 1 January 2013, indicating spatial correlation among the deformations. The collected data contain missing values and outliers and involve numerous measuring points, making label creation time-consuming and labor-intensive, thereby rendering supervised learning impractical.
A Pearson correlation analysis was performed on the deformation data from each measuring point, with the results displayed in Figure 8. The color gradient from white to blue indicates the absolute value of the correlation coefficient, ranging from 0 to 1, and asterisks denote significance levels. The analysis reveals that all measuring points are correlated, with correlation coefficients above 0.5 and significance values (p) less than 0.001, indicating that the correlations between measuring points are statistically significant. In this study, a threshold (S) of 0.6 was selected; point pairs with correlations above this threshold were considered to have significant relationships. The original data were then multiplied by the corresponding correlation matrix to generate a sequence with spatial features. This sequence underwent initialization and sliding window operations to create a dataset with spatiotemporal features, which was divided into training, testing, and validation sets in a 4:1:1 ratio.
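This preprocessing pipeline can be sketched as follows; the function and variable names are illustrative rather than taken from the original implementation, and a window length of 10 with stride 1 is assumed for demonstration.

```python
import numpy as np
import pandas as pd

def build_spatiotemporal_dataset(df: pd.DataFrame, threshold: float = 0.6,
                                 window: int = 10, stride: int = 1):
    """Weight each point's series by its correlated neighbours, then window it.

    df: rows = daily observations, columns = measuring points.
    """
    corr = df.corr(method="pearson").abs()      # Pearson correlation matrix
    corr[corr < threshold] = 0.0                # keep only links above S = 0.6
    spatial = df.to_numpy() @ corr.to_numpy()   # inject spatial features

    # Sliding windows over the spatially weighted series.
    windows = np.stack([spatial[i:i + window]
                        for i in range(0, len(spatial) - window + 1, stride)])

    # Chronological 4:1:1 split into training / testing / validation sets.
    n = len(windows)
    n_train, n_test = 4 * n // 6, n // 6
    return (windows[:n_train],
            windows[n_train:n_train + n_test],
            windows[n_train + n_test:])
```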
3.2. Implementation Details
After dividing the dataset, fivefold cross-validation was performed using a grid search with different random seeds to identify the optimal hyperparameters for the network's pretraining and downstream tasks, as detailed in Table 1. The Adam optimizer was employed for both pretraining and downstream tasks owing to its fast convergence when training neural networks. During pretraining, the model was trained for up to 100 epochs, with an early stopping algorithm halting training once optimal performance was achieved. This optimization revealed that model performance generally peaked at around 40 epochs, which was then adopted for the downstream tasks. After extensive experimentation and comparison, the optimal weights for the temporal and spatial losses were found to be 1.0 and 0.3, respectively; the selection process is elaborated in the subsequent sensitivity analysis of parameters.
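A minimal sketch of such a pretraining loop follows, assuming the temporal and spatial contrastive loss functions and the data loader are defined elsewhere; the learning rate and patience values are placeholders, while the epoch budget and loss weights follow the settings reported above.

```python
import copy
import torch

def pretrain(encoder, temporal_loss_fn, spatial_loss_fn, loader,
             lambda_t=1.0, lambda_s=0.3, max_epochs=100, patience=10):
    """Contrastive pretraining with Adam and early stopping (a sketch)."""
    opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)
    best_loss, best_state, stale = float("inf"), None, 0
    for epoch in range(max_epochs):
        epoch_loss = 0.0
        for batch in loader:
            opt.zero_grad()
            # Weighted sum of the temporal and spatial contrastive terms,
            # using the optimal weights (1.0 and 0.3) reported in the text.
            loss = (lambda_t * temporal_loss_fn(encoder, batch)
                    + lambda_s * spatial_loss_fn(encoder, batch))
            loss.backward()
            opt.step()
            epoch_loss += loss.item()
        if epoch_loss < best_loss:            # track the best epoch so far
            best_loss, stale = epoch_loss, 0
            best_state = copy.deepcopy(encoder.state_dict())
        else:
            stale += 1
            if stale >= patience:             # early stopping
                break
    if best_state is not None:
        encoder.load_state_dict(best_state)
    return encoder
```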
Furthermore, in the parameter settings for data augmentation, a scaling factor of 1.1 was used during the weak augmentation phase, the maximum number of segments into which a sequence is split during strong augmentation was set to five, and the standard deviation of the random noise was set to 0.1 to enhance the diversity of the dataset. These parameters are designed to help the model learn robust features from diverse and noisy data. As for the transformer network's configuration, the hidden dimension was set to 100, the network comprises four attention layers with four heads each, and the hidden layer dimension of the multilayer perceptron was set to 64. These settings provide sufficient model complexity to capture the temporal and spatial dependencies of the input sequences effectively.
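Under a common interpretation of these settings in contrastive time-series learning (weak view: amplitude scaling; strong view: segment permutation plus jitter), the two augmentations might look as follows. This is an assumption-laden sketch, not the authors' exact implementation.

```python
import numpy as np

def weak_augment(x: np.ndarray, scale: float = 1.1) -> np.ndarray:
    # Weak view: simple amplitude scaling of the window.
    return x * scale

def strong_augment(x: np.ndarray, max_segments: int = 5,
                   jitter_std: float = 0.1) -> np.ndarray:
    # Strong view: split the window into up to `max_segments` pieces,
    # shuffle them, then add Gaussian noise with std 0.1.
    n_seg = np.random.randint(2, max_segments + 1)
    segments = np.array_split(x, n_seg, axis=0)
    np.random.shuffle(segments)
    permuted = np.concatenate(segments, axis=0)
    return permuted + np.random.normal(0.0, jitter_std, size=x.shape)
```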
3.3. Anomaly Detection
In this setup, the model is initially pretrained on a selection from the unlabeled training dataset. It is then fine-tuned on a small, randomly selected subset that has been labeled. Owing to the limited size of the fine-tuning dataset, data augmentation techniques are employed to enhance data diversity. Taking measuring point A09-PL-01 as an example, the pretraining and fine-tuning processes are illustrated in Figure 9. The figure shows that during pretraining, the loss decreases as the number of training epochs increases, stabilizing at around 40 epochs. In the fine-tuning phase, the loss continues to decrease, with accuracy progressively exceeding 0.9. After fine-tuning, the model's ability to detect anomalies in the unlabeled test set is evaluated. The model outputs predictions as probability values: probabilities above a given threshold indicate the positive class (anomalies), and those below it indicate the negative class (non-anomalies).
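The final decision step reduces to simple probability thresholding; a sketch, assuming the fine-tuned classifier outputs one logit per window for the anomaly class and using 0.5 as a placeholder threshold:

```python
import torch

@torch.no_grad()
def detect_anomalies(encoder, classifier, windows: torch.Tensor,
                     threshold: float = 0.5) -> torch.Tensor:
    # Convert the classifier's logits to probabilities and compare
    # them against the decision threshold.
    probs = torch.sigmoid(classifier(encoder(windows))).squeeze(-1)
    return probs > threshold      # True = anomaly, False = normal
```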
3.4. Result Analysis
Following the procedure described above, the proposed method was applied to detect anomalies in the deformation data of 31 measuring points on the dam, with the results presented in Table 2. The table provides a detailed breakdown of the evaluation metrics for the training and validation phase and the testing phase. Overall, the model demonstrated strong performance, achieving accuracy and precision above 95%, recall above 75%, and F1 scores exceeding 80% across most of the test dataset, except for a few measuring points. Reduced performance at certain locations, such as A09-PL-01 and A35-PL-02, is linked to a higher proportion of outliers in the training set, which hindered the unsupervised pretraining process and negatively affected the model's training efficiency and anomaly detection capability.
Additionally, the table indicates less consistent performance across measuring points during the testing phase than during the training and validation phases. In some instances, the detection rate during the testing phase dropped below that of the earlier phases, while in others it reached 100%. A thorough analysis of the original data identifies the underlying cause: anomalous data are randomly distributed throughout the dataset. Given the 4:1 length ratio between the training set and the test set, the latter contains limited data and consequently few anomalies, which produces more extreme detection outcomes under certain conditions.
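For reference, the four metrics reported in Table 2 follow their standard definitions and can be computed from the confusion counts of each measuring point; a minimal helper:

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int):
    """Accuracy, precision, recall, and F1 from confusion counts."""
    accuracy  = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall    = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1
```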
Figure 10 presents the anomaly detection results for the test sets of seven representative measuring points. In the figure, black lines represent the test set data, red circles indicate correctly identified anomalies, blue squares denote anomalies that were not detected, and green triangles signify normal values incorrectly labeled as anomalies. The results show that the proposed method accurately detects both missing and abrupt values, with false positives and false negatives being exceedingly rare, underscoring the method's strong performance.
3.5. Comparison with Other Methods
To evaluate the performance of the proposed anomaly detection method, it was benchmarked against the following three leading unsupervised learning approaches: temporal contrastive learning (TC), transformer-based anomaly detection (TAD), and the generative adversarial network variational autoencoder (GAN-VAE). The TC method relies on a temporal contrast model for training and a small dataset for fine-tuning, employing a classification task for anomaly detection; the proposed method expands upon TC by integrating spatial correlations and utilizing spatial contrast for improved performance. TAD leverages a transformer to reconstruct normal sequences and calculates anomaly scores from the reconstruction errors; transformers are similarly used in the proposed method to extract sequence information. GAN-VAE models combine the adversarial learning power of GANs with the probabilistic generative modeling of VAEs, constructing samples that closely resemble the original data distribution and detecting anomalies through the relationship between reconstruction errors and thresholds. Unlike the generative learning approach of GAN-VAE models, the proposed method falls under the contrastive learning branch of unsupervised learning. Because the proposed method focuses specifically on displacement analysis, the information fusion component of Zhou et al. was not included in the calculations.
As before, the dataset was divided into training, testing, and validation sets in a 4:1:1 ratio. The parameter configurations for the three methods are detailed in Table 3. The parameters of the TC method are identical to those of the proposed method. In the TAD method, the number of attention heads is four and the hidden layer size is 32. For the GAN-VAE method, both the encoding and decoding phases use a one-dimensional convolutional neural network with 32 convolutional kernels and a kernel size of seven. The learning rates are set to 0.001 for the VAE and generator and 0.0002 for the discriminator. A KL loss weighting coefficient of 0.1 is applied, with an additional GAN loss coefficient of 0.5. The noise dimension for GAN training is set to 100. The threshold values for the TAD and GAN-VAE methods are set to 10% of the data range.
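For the two reconstruction-based baselines, the anomaly rule reduces to comparing the reconstruction error against 10% of the data range; a sketch, where x_hat denotes the model's reconstruction of the series x:

```python
import numpy as np

def reconstruction_anomalies(x: np.ndarray, x_hat: np.ndarray) -> np.ndarray:
    """Flag points whose reconstruction error exceeds 10% of the data
    range, as configured for the TAD and GAN-VAE baselines."""
    threshold = 0.1 * (x.max() - x.min())   # 10% of the data range
    return np.abs(x - x_hat) > threshold
```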
Figure 11 displays the comparative anomaly detection results of the various methods for representative measuring points. The four axes represent the precision, recall, accuracy, and F1 score metrics, with identical value ranges and scales across axes; a metric value closer to one indicates better detection performance. The shaded areas in different colors illustrate the performance of each method on these metrics: red for the proposed method, blue for TC, green for TAD, and black for GAN-VAE. The size of a shaded area visually represents the method's overall effectiveness, with larger areas indicating superior anomaly detection capability.
Overall, the red-shaded area completely encompasses the other colored areas, indicating that the proposed model outperforms the other models across all measuring points. The shape of the red area is nearly square, signifying that the proposed method performs in a balanced manner across all four metrics and thus provides stable anomaly detection. Although the accuracy values of the methods all reach around 90%, with only minor differences, significant disparities exist in the other three metrics. This is because accuracy reflects the overall detection performance across all data, both normal and anomalous; since normal values dominate the original data, the high accuracy values mainly indicate that all models detect normal values reliably.
Specifically, both the TC and TAD methods show advantages and outperform the GAN-VAE method. Except at A09-PL-01, the TC method consistently achieves metric values above 0.7, indicating stable performance. In contrast, the TAD method achieves high precision, yet its recall and F1 scores are comparatively low, with instances in which precision exceeds recall by 0.6; this signifies that while the anomalies TAD flags are usually genuine, it misses a large share of true anomalies. The GAN-VAE method shows inferior precision, recall, and F1 scores, significantly influenced by its assumptions about the data distribution, the selection of thresholds, and the inherent instability of adversarial training.
Although the TC method also employs contrastive learning, its metric values are approximately 0.2 lower than those of the proposed method across almost all indicators, which highlights the importance of spatial correlations in the analysis of dam deformation. The proposed method enhances the TC approach by incorporating a spatial contrast loss, thus considering spatial correlations more comprehensively. Like the proposed method, the TAD approach processes temporal information using transformers, yet it exhibits a higher rate of missed detections, further demonstrating the superiority of contrastive training over conventional mean squared error (MSE) training.
In summary, the proposed method demonstrates exceptional performance in anomaly detection, particularly when accounting for spatial correlations, which further enhances its efficacy.
3.6. Sensitivity Analysis of Parameters
The efficacy of the proposed anomaly detection method depends on the selection of time steps and the allocation of weights to the temporal and spatial losses. This section provides a preliminary discussion of the choice of time step and the values of the temporal and spatial loss weights. Taking the data from measuring point A09-PL-01 as an example, anomaly detection was conducted over various ranges of time steps and weights, and the preliminary patterns of these parameters were analyzed with precision as the evaluation metric.
Figure 12a illustrates the impact of time step selection on overall performance, where the x-axis represents the ratio of time step length to feature length and the y-axis shows the variation in precision, with all weights set to 1.0. The graph indicates that moderately increasing the time step ratio can enhance performance, but an excessive ratio impairs it, because larger time steps reduce the amount of data available for training, leading to poorer outcomes. On our dataset, the model performs best when the time step is approximately 30% of the feature length (batch size); the time step was therefore set to 10 in the configuration.
Figure 12b displays the impact of the contrast loss weights on overall performance. The horizontal axis shows the range of weight variations, and the vertical axis shows the precision values. Green markers denote the results of varying the temporal loss weight while holding the spatial loss weight at 1.0; orange markers indicate the effects of varying the spatial loss weight with the temporal loss weight fixed at 1.0. The model exhibits optimal performance when the temporal weight is 1.0 and the spatial weight is 0.3. Moreover, the model is more sensitive to variations in the temporal weight than to changes in the spatial weight. This sensitivity can be attributed to the fact that, under real-world conditions, dam deformation is significantly influenced by seasonal variations, whereas the impact of spatial correlations within the dam structure is comparatively minor; the spatial relationships in dam deformation generally remain at a consistent level.
4. Conclusions
This study introduced a dam deformation anomaly detection method based on self-supervised spatiotemporal contrastive pretraining. The method constructs temporal and spatial contrast modules that learn representations by maximizing the similarity of positive sample pairs and minimizing the similarity of negative pairs within each module. Transformers and nonlinear projection heads are utilized to extract temporal and spatial features from the inputs. The framework comprises dataset construction, unsupervised pretraining, parameter transfer, semi-supervised fine-tuning, and anomaly detection. The performance of this method was comprehensively studied, yielding the following conclusions:
The model exhibits excellent performance, achieving over 95% in both accuracy and precision, above 75% in recall, and over 80% in the F1 score, with only a few measuring points falling below these levels.
When applied to a real arch dam engineering case and compared with three benchmark models, the proposed method outperforms the other unsupervised learning approaches. The analysis underscores the critical importance of spatial correlations in dam deformation analysis, with spatially aware methods achieving superior anomaly detection outcomes compared with those that do not consider such correlations.
Sensitivity analysis of the model's hyperparameters indicates that an appropriate increase in the time step ratio can enhance performance, whereas excessive time steps may impair it. The model is particularly sensitive to changes in the temporal loss weight and less so to changes in the spatial loss weight.
Proving adept at identifying local changes in input data, the proposed method is not limited to deformation metrics but is also applicable to detecting anomalies in dam body stress, seepage, cracks, and other relevant data types. Future research will explore the application of this method in comprehensive anomaly detection for dams.