**1. Introduction**

In the field of structural health monitoring (SHM), the problem of data accumulation has received increasing attention. During the real-time monitoring of bridges, a large volume of data is generated every day. These data, which contain damage information about the bridge structure, are the basis of bridge state assessment and long-term performance prediction. However, the installed sensors are exposed to harsh environments, and their performance degrades with operating time, which may cause sensor failure or data anomalies [1]. In the absence of an effective data processing mechanism, anomalous data not only increase storage costs but also fail to guide the formulation of bridge maintenance strategies.

Existing data anomaly detection methods can generally be divided into model-based methods and data-driven methods. Model-based methods rely on finite element models to reflect inherent structural characteristics, and a series of statistical and mechanical models have been established to predict the measured output [2–5]. Model-based methods can achieve good detection accuracy. However, when dealing with large volumes of SHM data, it is difficult to create a reliable explicit finite element model that describes the behavior of the structure in service [6].

**Citation:** Zhang, Y.; Lei, Y. Data Anomaly Detection of Bridge Structures Using Convolutional Neural Network Based on Structural Vibration Signals. *Symmetry* **2021**, *13*, 1186. https://doi.org/10.3390/sym13071186

Academic Editor: Sergei Alexandrov

Received: 5 June 2021; Accepted: 25 June 2021; Published: 30 June 2021

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Data-driven methods include statistical process control and machine learning methods. They do not rely on finite element models and analyze the measured time series data directly, which may alleviate the shortcomings of model-based methods [7]. Among data-driven methods, deep-learning-based methods have the potential to learn from large data sets that contain anomalies and to diagnose various anomaly types automatically. Recently, deep learning has been increasingly applied to time-series-related tasks [8–10], including time series classification, time series prediction, and time series anomaly detection. Bao et al. [11] proposed a data anomaly detection method based on computer vision and deep learning. The original time series measurements are first converted into image vectors, and these image vectors are then input to a deep neural network (DNN) to identify various anomalies. Tang et al. [12] proposed an anomaly detection method that also uses computer vision and deep learning. This method first converts the original time series data into images, imitating human-vision-based data collection, and then trains a CNN for anomaly classification. Mao et al. [13] combined a generative adversarial network with an autoencoder to improve the performance of existing unsupervised learning methods and used two data sets from full-scale bridges to verify the proposed method.

Supervised deep learning relies heavily on a large amount of labeled training data. However, many abnormal data patterns in actual projects do not have enough labeled data. Therefore, how to efficiently generate a large amount of labeled synthetic data from few samples is a problem worthy of attention. As an effective tool for improving the quantity and quality of training data, data augmentation is essential for the successful application of deep learning models. The basic idea of data augmentation is to extract more value from limited data without substantially adding new data, while keeping the labels correct. Data augmentation has achieved good results in many application scenarios [14]. Sun et al. [15] proposed a simple but effective data augmentation method for generating multi-view 2D pose annotations. Liu et al. [16] proposed an image generation technique to enhance the robustness of the convolutional neural network model. For time series data, time-domain transformation is the most direct data augmentation method; most such transformations operate directly on the original input time series. Cui et al. [17] proposed a sliding window method combined with a Multi-scale Convolutional Neural Network (MCNN) to solve the time series classification problem and achieved good results on a large number of benchmark data sets. Fawaz et al. [18] proposed a method that generates new time series with dynamic time warping (DTW) and ensembles them using a weighted version of the DTW barycenter averaging (DBA) algorithm. Wen et al. [19] applied data augmentation methods such as random mutation and the addition of random trends to different data sets and proposed a time series segmentation approach based on convolutional neural networks (CNNs) and transfer learning. Gao et al. [20] proposed a label expansion method that relabels the data points near labeled anomalies as anomalies, which improves the performance of time series anomaly detection.
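As a rough illustration of label-preserving time-domain augmentation, the following minimal sketch cuts one labeled series into overlapping windows that all inherit the original label. It is written in the spirit of the sliding window idea mentioned above, not as a reproduction of any cited method. The function name `window_slice` and the `window` and `stride` parameters are hypothetical choices made for this example, and NumPy is assumed.

```python
import numpy as np


def window_slice(series, label, window, stride):
    """Cut a 1D series into overlapping windows that inherit its label.

    Hypothetical helper for illustration only; `window` and `stride`
    are measured in samples.
    """
    series = np.asarray(series, dtype=float)
    if series.ndim != 1:
        raise ValueError("series must be one-dimensional")
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if window > series.size:
        raise ValueError("window is longer than the series")

    # Start index of every window that fits completely inside the series.
    starts = np.arange(0, series.size - window + 1, stride)
    slices = np.stack([series[s:s + window] for s in starts])

    # Every slice keeps the label of the series it was cut from.
    labels = np.full(len(starts), label)
    return slices, labels


# Toy usage: one 10-sample series with label 3 becomes four labeled windows.
demo_slices, demo_labels = window_slice(np.arange(10), label=3, window=4, stride=2)
```

Whether a slice still shows the anomaly pattern of the full series depends on the pattern and on the window length, so the assumption that the label carries over would need to be checked for each anomaly type.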

For time series classification, most studies recast the task as a computer-vision-based classification problem, while classification that works directly on vibration signals is rarely studied. In addition, few studies use time series data augmentation to obtain a more balanced sample set. Moreover, one-dimensional convolutional networks, which are faster for time series problems, are also rarely used. In this paper, a data anomaly identification method using a one-dimensional CNN is proposed based on bridge monitoring acceleration data, in which data augmentation is employed to process the samples.

### **2. Data Anomaly Classification Method Based on 1D-CNN**

*2.1. Bridge Overview and Data Set Composition*

This research uses the health monitoring data set of a long-span cable-stayed bridge in China. The main span of the bridge is 1088 m long, each of the two side spans is 300 m long, and the bridge has two 306 m-high towers. The structural health monitoring system of the bridge consists of 38 sensors, whose positions on the bridge are shown in Figure 1. Sensors include accelerometers, anemometers, strain gauges, global positioning systems (GPS), and thermometers. For this research, one month (1 January–31 January 2012) of acceleration data from all 38 sensors of the SHM system was used for data anomaly detection. The sampling frequency of the accelerometers is 20 Hz. The original continuous measurements are divided into non-overlapping one-hour windows, which yields 744 time series per sensor over the month and a total of 28,272 (744 × 38) samples. Each sample has dimensions 1 × 72,000 (3600 s × 20 Hz). Figure 2a–g shows an example of each type of data pattern. Table 1 describes the quantity and characteristics of the normal data and the six types of abnormal data. Each sample has a ground-truth category label. The normal time series are labeled 1, and the six abnormal data patterns are labeled 2–7. Nearly 52% of the data are abnormal. "Trend" is the main abnormal pattern and constitutes 20% of the data set, followed by "missing" and "square", each accounting for about 10%. In contrast, "outlier" accounts for only 1.9% of the data set and "drift" for 2.4%.
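The sample counts quoted above follow directly from the sampling rate and the window length. The short sketch below reproduces that arithmetic and shows one way the non-overlapping segmentation could be done with NumPy. The helper `segment_hourly` and the constant names are hypothetical, and dropping a trailing partial window is an assumption of this sketch rather than something stated in the paper.

```python
import numpy as np

FS_HZ = 20                                   # accelerometer sampling frequency
WINDOW_S = 3600                              # one-hour window, in seconds
SAMPLES_PER_WINDOW = FS_HZ * WINDOW_S        # 72,000 samples per window

HOURS_IN_MONTH = 31 * 24                     # 744 windows per sensor in January
N_SENSORS = 38
TOTAL_SAMPLES = HOURS_IN_MONTH * N_SENSORS   # 28,272 labeled time series


def segment_hourly(signal, samples_per_window=SAMPLES_PER_WINDOW):
    """Split a 1D record into non-overlapping windows of equal length.

    Hypothetical helper; a trailing partial window is dropped
    (an assumption made for this sketch).
    """
    signal = np.asarray(signal)
    if signal.ndim != 1:
        raise ValueError("signal must be one-dimensional")
    n_windows = signal.size // samples_per_window
    trimmed = signal[: n_windows * samples_per_window]
    return trimmed.reshape(n_windows, samples_per_window)


# Toy usage: three full hours plus five extra samples give three windows.
demo_windows = segment_hourly(np.zeros(3 * SAMPLES_PER_WINDOW + 5))
```

Applied to a full one-month record from a single sensor, the same call would return an array of shape (744, 72,000), one row per labeled sample.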
