1. Introduction
Marine strata contain rich oil and natural gas resources. In recent years, many countries have invested in the development of deep-sea oil resources. Pipeline transportation has become one of the most important transportation modes for deep-sea resource development because of its high safety, good continuity, and large transportation volume. Offshore pipelines are therefore used extensively [1,2].
Deep-water risers can be roughly divided into steel catenary risers, tower risers, top-tensioned risers, flexible risers, etc., according to their functions and structural forms. Among them, the steel catenary riser (SCR) has been studied and developed in the offshore oil and gas industry for more than 20 years. Its advantages include strong adaptability to floating body movement, fast construction, low cost, and the ability to operate in high-temperature and high-pressure environments [3,4]. Therefore, the SCR is increasingly valued by the engineering community. Because the SCR works in a complex marine environment for a long time, it is very vulnerable to damage caused by waves, currents, temperature, and other factors. SCR damage decreases the local stiffness of the structure and thus affects the performance of the riser. The touchdown zone (TDZ) of the SCR is in direct contact with the highly nonlinear seabed soil, which produces high bending stress and fatigue stress that shorten the SCR life and further lead to SCR damage [5]. As damage severity increases, the bearing capacity and function of the SCR gradually deteriorate, which can lead to major safety accidents and even threaten human life. Therefore, it is very important to conduct structural health monitoring on the TDZ of the SCR [6].
Previous research on SCR damage has mostly focused on the mechanism and characterization of various kinds of damage, and there has been little research on damage localization and severity quantification. The main purpose of this study is to locate and quantify the fatigue damage at the touchdown zone of SCRs. The damage simulation follows other damage studies of steel pipe and steel plate structures [7,8,9]. Traditional vibration-based damage identification methods have been widely used and can be divided into two categories: parametric approaches and non-parametric approaches [10,11,12]. Parametric approaches detect differences in modal parameters, or in quantities derived from them, before and after structural damage to obtain damage-sensitive features [13,14]. However, as a global property of a structure, the natural frequency is not sensitive to small local damage. Therefore, identification methods based on mode shapes, stress, and other parameters have been developed [15,16]. Provided that accurate vibration response data are available, mode shapes, stress, modal strain energy, and other characteristic parameters are highly sensitive to local microdamage of the structure [17]. However, practical applications face challenging problems, such as incomplete modal information and environmental noise, which lead to deviations between the measured modal parameters and the true values [18,19,20]. The non-parametric method, also known as the data-driven method, aims to extract damage features directly from the structural vibration response signals without identifying structural modal parameters. Denby et al. analyzed the fatigue state of SCRs caused by vortex-induced vibration through model tests and concluded that the spacing and velocity of risers are the key factors affecting riser fatigue damage [21]. Bai et al. discussed the SCR fatigue damage caused by spar platform movement under the action of an eddy current, proposed a detailed process for calculating fatigue damage, and gave a method for calculating the cumulative fatigue damage of risers due to various factors. Although non-parametric methods do not need to identify modal parameters or solve global optimization problems, they also face difficulties such as massive data processing and measurement noise [22].
In recent years, artificial intelligence (AI) methods have received extensive attention [23]. Because of their strong adaptive learning ability, AI methods are well suited to processing the massive data generated by complex structures [24]. AI has therefore been employed in the field of SCR damage identification. Wong et al. proposed a simplified method based on an artificial neural network (ANN) to detect fatigue damage to top-tensioned risers (TTRs) at the early design stage [25]. A total of 21,532 simulation models were established by changing six pipeline geometry and sea state variables, and random sample data were then generated by Latin hypercube sampling (LHS). In addition, some studies have combined intelligent algorithms to support SCR engineering work [26,27,28,29,30]. However, these approaches tend to be affected by noise in the data source, and they do not take the high-dimensional characteristics of the data into account.
At present, deep learning is widely used in autonomous driving, face recognition, energy, intelligent robots, drug efficacy prediction, etc. [31,32]. Compared with traditional AI approaches (SVM, DT, naive Bayes, and BP neural networks), deep learning methods can process big data more easily and identify high-level features in the data. The convolutional neural network (CNN) is a classical deep learning model and has been applied in structural damage detection. Oh et al. proposed a CNN-based damage detection approach for bridges [33]. The dynamic displacement response is taken as the damage-sensitive feature. Using the nonlinear relationship established by the CNN, the damaged story is located by detecting the difference in vibration response between the healthy state and the damaged state. Generally, CNN-based damage identification methods are applied to relatively simple damage scenarios, and their handling of time-series characteristics needs further study. The recurrent neural network (RNN) is widely used in time-series information processing [34]. The essence of the RNN is its memory, which allows it to generate and process pattern sequences of any length. The RNN can establish the mapping from historical data to the target vector and store historical information in a variable form in the internal state of the network. The RNN has two variants with similar but different structures: the long short-term memory network (LSTM) and the gated recurrent unit neural network (GRU) [35,36]. The GRU has an efficient structure that can memorize historical information and integrate the current state, new input, and historical information through recurrent gating. Therefore, the GRU network has great potential in structural damage identification [37].
Compared with prediction methods based on traditional artificial intelligence, the CNN and RNN have shown superior performance in many application fields. The CNN readily captures the spatial information of data, and the RNN captures the long-term time-dependency information of data. However, a single CNN or RNN cannot consider spatial and temporal information at the same time, so either network alone gives only a partial treatment of the damage identification problem [38]. In addition, the CNN and RNN have their own shortcomings. CNN-based methods generally have high memory consumption, weak interpretability, and insufficient robustness [39]. Because of the vanishing gradient problem, RNN-based methods are inefficient and easily lose the long-term time-dependence characteristics of sequences. Therefore, combining the advantages of the CNN and RNN has become an important research direction for solving complex prediction problems. Many studies have successfully solved practical prediction problems by combining the CNN with the RNN. Khaki et al. established a prediction method combining the CNN-GRU model with weather and soil conditions and successfully predicted the yield of corn and soybean [40]. Yan et al. established a CNN-LSTM air quality prediction model that takes into account the spatial–temporal distribution characteristics of air quality [41]. These practical results show that the combination of the CNN and RNN is very effective in processing spatial and temporal data characteristics simultaneously.
Although the hybrid model is very effective for big data classification and prediction, its hyperparameters (such as the activation functions, the learning rate, the number of CNN convolution kernels, the number of GRU hidden layer nodes, etc.) require repeated tuning and often depend on experience [42]. Therefore, more systematic and effective methods, such as heuristic optimization algorithms, are needed to solve the hyperparameter optimization problem [43,44,45]. Heuristic algorithms are generally bio-inspired or experience-based algorithms that give an approximately optimal solution to the optimization problem at an acceptable computational cost [46]. At present, common heuristic algorithms include the whale optimization algorithm (WOA), genetic algorithm (GA), arithmetic optimization algorithm (AOA), particle swarm optimization (PSO), etc. PSO is a swarm intelligence algorithm inspired by bird foraging behavior, whose core idea is group cooperation and information sharing. PSO is commonly applied in several fields because of its simple operation and fast convergence [47,48]. The hyperparameters of the hybrid approach include both integers and decimals, and PSO is good at moving beyond local optimal solutions to find better values. Therefore, PSO is employed to solve the hyperparameter optimization problem of the hybrid model in this study.
Specifically, an SCR fatigue damage identification method based on the CNN-GRU hybrid model is proposed, and the hyperparameters of the CNN-GRU are optimized by PSO. The main purpose of this study is to locate and quantify the fatigue damage at the touchdown zone of SCRs, which is caused by floating platform movement, waves, and current load. The proposed method takes advantage of the ability of the CNN and GRU to jointly obtain the spatial and temporal characteristics in damage identification data. To obtain the spatial–temporal feature information in the original data, the collected one-dimensional time-series data are converted into two-dimensional spatial–temporal data. The CNN establishes the nonlinear mapping between the structural data acquisition points and the damage, and the GRU extracts the long-term time features of the data. The convolution layer, pooling layer, GRU layer, and fully connected layer are connected in series, and the dimension between every two adjacent layers is adjusted, so as to establish the CNN-GRU model. To further improve the identification ability of the CNN-GRU model, PSO is used to optimize its hyperparameters (the number of CNN convolution kernels, the number of GRU hidden layer nodes, the number of fully connected layer nodes, and the learning rate), thereby establishing the PSO-CNN-GRU (PCG) model. Subsequently, the SCR model based on an engineering application is established, its working environment and damage characteristics data sources are introduced, and the time-series database of damaged SCR acceleration in the touchdown zone is constructed. By analyzing the damage feature database, an SCR damage identification method based on the PCG model is established. To verify the feasibility and effectiveness of this method, the proposed PCG is compared with the CNN, GRU, and CNN-GRU models. The experimental results show that the PCG model can effectively realize damage identification for the SCR touchdown zone.
The rest of this study is organized as follows: In Section 2, the structures of the CNN, GRU, and PSO are briefly introduced. Moreover, the method of combining the CNN with the GRU and the working principle of the proposed PCG model are explained. In Section 3, the SCR simulation model, SCR working conditions, and the construction method of the acceleration characteristic database of the damaged SCR model are introduced. An SCR damage identification method based on the PCG model is presented in this section. In Section 4, the damage identification performance of the proposed PCG model is tested by structural acceleration time-series data, and the prediction results obtained by the PCG are compared with those obtained by other methods. The conclusions are given in Section 5.
2. Establishment of PSO-CNN-GRU Model
In this section, the structure and basic principle of the CNN, GRU, PSO, and the construction method of the PSO-CNN-GRU (PCG) model are briefly introduced.
2.1. Structure of CNN
Compared with traditional techniques, the CNN has unique advantages, such as weight sharing, automatic feature extraction, and high adaptability [49]. A typical CNN is generally composed of convolution layers, pooling layers, and fully connected layers. The convolution layer repeatedly convolves the input signal with convolution kernels to generate the corresponding features. The pooling layer down-samples the output of the convolution layer to reduce the data dimension and improve the operation speed. Generally, one or more convolution layers are followed by a pooling layer, and the fully connected layer outputs the data processing results. A CNN can extract the spatial characteristics of the original data, and the data processed by the CNN do not lose their temporal characteristics. The mechanism of the convolution layer is based on the convolution operation: the output of the layer is obtained by convolving the input signal with a convolution kernel of a given size. The detailed computing method is shown in Equation (1) [50].
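As a minimal illustration of the convolution and pooling operations described above, the following pure-Python sketch applies a single kernel with stride 1 and no padding, followed by non-overlapping max pooling. The function names, the stride and padding choices, and the use of cross-correlation (the usual deep learning convention) are assumptions made for illustration; they are not taken from Equation (1).

```python
def conv2d_valid(x, kernel, bias=0.0):
    """Slide one kernel over a 2-D input (stride 1, no padding).

    Implements cross-correlation, which is what most deep learning
    frameworks call "convolution". Assumed form, for illustration only.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = bias
            for u in range(kh):
                for v in range(kw):
                    s += x[i + u][j + v] * kernel[u][v]
            row.append(s)
        out.append(row)
    return out


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; incomplete trailing windows are dropped."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        pooled.append([
            max(feature_map[i + u][j + v]
                for u in range(size) for v in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ])
    return pooled


if __name__ == "__main__":
    x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    features = conv2d_valid(x, [[1, 0], [0, 1]])
    print(features)
    print(max_pool2d(features))
```

The pooling step shrinks the feature map, which is the dimension reduction the text refers to; a real layer would apply many kernels and a nonlinearity between the two steps.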
2.2. Structure of GRU
The GRU is an RNN variant that processes sequence data with high efficiency and precision. Owing to its gating mechanism, a GRU can extract high-dimensional temporal characteristics of long sequence data and has the advantages of simple operation and stable gradients [51]. In a GRU neural network, there is no distinction between the internal state and the external state. The core mechanism of the GRU is the parameter update of the gates and the state. The GRU stores and learns temporal characteristics through the gate- and state-related parameters. The calculation of the GRU gates and states is shown in Equations (2)–(5) [51]. The framework of the GRU is presented in Figure 1.
where $h_t$ is the hidden state at time $t$; $x_t$ is the input feature; $\tilde{h}_t$ is the candidate state; $W_r$, $W_z$, and $W_h$ are the weights shared at all time steps; $\sigma$ is the activation function; $z_t$ is the update gate; and $r_t$ is the reset gate.
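For reference, one commonly used textbook form of the GRU update is sketched below. It is an assumed form, not a reproduction of Equations (2)–(5). The symbols are chosen here for illustration: $h_t$ is the hidden state, $x_t$ the input, $\tilde{h}_t$ the candidate state, $z_t$ and $r_t$ the update and reset gates, $W_z$, $W_r$, and $W_h$ the shared weights, $\sigma$ the sigmoid function, $[\cdot,\cdot]$ concatenation, and $\odot$ the element-wise product. Bias terms are omitted.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z \cdot [h_{t-1}, x_t]\right) \\
r_t &= \sigma\left(W_r \cdot [h_{t-1}, x_t]\right) \\
\tilde{h}_t &= \tanh\left(W_h \cdot [r_t \odot h_{t-1}, x_t]\right) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Some references swap the roles of $z_t$ and $1 - z_t$ in the last line; the two conventions are equivalent up to relabeling the gate.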
2.3. Structure of PSO
As one of the most common swarm intelligence optimization algorithms, particle swarm optimization (PSO) was inspired by animal behavior and was first proposed in 1995. Three years after the PSO framework was first presented, Shi and Eberhart introduced a control parameter called “inertia weight” into the initial PSO and proposed an improved PSO, which is widely accepted as the classical particle swarm optimization method [52]. In this section, the working principle of the classical PSO is briefly explained.
The solving process of the classical PSO depends on the motion of particles. Each particle moves through the solution space according to the established rule and is regarded as a candidate solution to be evaluated. The motion of every particle is influenced by the group experience. Assume that the $i$-th individual in the PSO algorithm is $x_i^t$ ($i = 1, 2, \ldots, N$), where $t$ is the current iteration and $N$ is the population size. $x_i^t$ denotes a $D$-dimensional vector $(x_{i1}^t, x_{i2}^t, \ldots, x_{iD}^t)$. The corresponding particle velocity is $v_i^t = (v_{i1}^t, v_{i2}^t, \ldots, v_{iD}^t)$. The dimensions of each particle store a possible solution to the problem. The particle velocity represents the displacement of the particle in the next iteration relative to the previous one. Each particle obtains velocity information from its personal historical experience and from the group experience. Generally, $g_{best}$ is the optimal solution obtained so far by the group, and $p_{best,i}$ is the best solution in the history of individual $i$. In each iteration, the individual $x_i^t$ and velocity $v_i^t$ are updated according to Equations (6)–(8) [52].
where $w$ is the current inertia weight, which decreases as the iterations proceed; $w_{\max}$ is the maximum weight; $w_{\min}$ is the minimum weight; $T$ is the maximum number of iterations; $c_1$ and $c_2$ are two acceleration coefficients in (0, 2); and $r_1$ and $r_2$ are two random numbers in the range (0, 1). The group is thus continually drawn toward both the individual best positions and the group best position. The stopping criterion of PSO can be set as a minimum precision or a maximum number of iterations, and the iteration stops when the criterion is met.
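The following pure-Python sketch shows one way the classical inertia-weight PSO described above can be written. It assumes a linearly decreasing inertia weight and clips positions to the search bounds. The function name `pso_minimize`, the default values (`w_max=0.9`, `w_min=0.4`, `c1=c2=1.5`), and the clipping rule are illustrative assumptions, not values reported in this study.

```python
import random


def pso_minimize(f, bounds, n_particles=20, max_iter=100,
                 w_max=0.9, w_min=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimize f over box bounds with inertia-weight PSO (illustrative sketch)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # personal best positions
    pbest_val = [f(p) for p in pos]        # personal best fitness values
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(max_iter):
        # Assumed schedule: inertia weight decreases linearly from w_max to w_min.
        w = w_max - (w_max - w_min) * t / max_iter
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and keep it inside the search range.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    sphere = lambda p: sum(v * v for v in p)
    print(pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)]))
```

Here the stopping criterion is simply the maximum number of iterations; a precision threshold on the best fitness could be added as an early exit.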
2.4. Establishment of CNN-GRU Model
The CNN is suitable for processing spatial data, and the convolution and pooling operations maintain the continuity of the spatial information of the data. Through repeated iteration and error backpropagation, the CNN can improve output accuracy by extracting the spatial information of the data. The GRU is suitable for processing time-series data. The connections between the recurrent units in the GRU pass information between adjacent time steps. This forms an internal feedback state, so the GRU can dynamically extract the temporal information of the data. The CNN-GRU model, composed of a convolution operation, pooling operation, dropout operation, and GRU in series, can simultaneously obtain the spatial and temporal information of the data. The CNN-GRU mainly consists of an input layer, a convolution layer, a pooling layer, a GRU layer, a fully connected (FC) layer, and an output layer. The structure of the CNN-GRU model is shown in Figure 2.
The input of the CNN-GRU model is the time-series data obtained from multiple measurement points, and the output is the damage identification result of the structure. First, the one-dimensional time-series data are converted into two-dimensional data and input into the model. Second, the spatial characteristics of the data are extracted by the convolution and pooling operations. Then, the output of the pooling layer is sent to the GRU layer to learn long-term time-dependence characteristics. Finally, the structural damage state is output by the FC layer. When the activation function of the FC layer is “Softmax”, the output is the damage location of the structure; when the activation function is “Linear”, the output is the damage degree of the structure.
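The text does not specify how the one-dimensional series are arranged into two-dimensional inputs. One plausible arrangement, sketched below purely as an assumption, stacks the series from all measurement points side by side and cuts them into fixed-length windows, so that each sample is a (time steps × measurement points) matrix. The function name `to_windows` and the `window` and `stride` parameters are hypothetical.

```python
def to_windows(series_by_sensor, window, stride):
    """Turn per-sensor 1-D series into 2-D samples of shape (window, n_sensors).

    Assumed layout for illustration: rows are consecutive time steps,
    columns are measurement points.
    """
    n = min(len(s) for s in series_by_sensor)  # use the common length only
    samples = []
    for start in range(0, n - window + 1, stride):
        samples.append([
            [s[start + k] for s in series_by_sensor]
            for k in range(window)
        ])
    return samples


if __name__ == "__main__":
    sensor_a = [0, 1, 2, 3, 4]
    sensor_b = [10, 11, 12, 13, 14]
    for sample in to_windows([sensor_a, sensor_b], window=3, stride=2):
        print(sample)
```

With this layout, convolution across columns mixes information between measurement points, while the row order preserves the time sequence passed on to the GRU layer.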
2.4.1. Z-Score Standardization
The original feature data need to be standardized so that the neural network trains better. To ensure that the input of the neural network is not affected by the units of the features and that the feature ranges are roughly the same, different features require separate preprocessing. In this study, Z-score standardization is selected to preprocess the initial data; it rescales each feature to zero mean and unit standard deviation, which is more conducive to model training. The calculation method is:

$$x^{*} = \frac{x - \mu}{\sigma}$$

where $x$ is the original data; $\mu$ is the mean value of samples; and $\sigma$ is the standard deviation of samples.
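A minimal sketch of Z-score standardization for a single feature is given below. It assumes the population standard deviation (dividing by the number of samples) and returns zeros for a constant feature; both choices, and the function name `zscore`, are assumptions made for illustration.

```python
import math


def zscore(values):
    """Standardize one feature to zero mean and unit standard deviation."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divides by n); an assumed convention.
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


if __name__ == "__main__":
    print(zscore([2, 4, 4, 4, 5, 5, 7, 9]))
```

In practice the mean and standard deviation are usually computed on the training set only and then reused for the test set, so that no test information leaks into training.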
2.4.2. Activation Functions and Loss Functions
To construct the nonlinear transformation of data and learn the complex mapping relationship, activation functions are used in neural networks. At present, the common activation functions include Sigmoid, Tanh, and ReLU, which are calculated as follows [53]:

$$\mathrm{Sigmoid}(x) = \frac{1}{1 + e^{-x}}$$

$$\mathrm{Tanh}(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

$$\mathrm{ReLU}(x) = \max(0, x)$$

ReLU is employed in both the convolution layer and the GRU layer. Tanh is only applied in the GRU layer.
SCR damage localization is a multi-class classification problem with discrete labels; thus, the cross-entropy function is applied as the loss function of the CNN-GRU, as given in Equation (13) [53]:

$$L = -\sum_{i} y_i \log \hat{y}_i \tag{13}$$

where $i$ is the index over the entries of the label vector; $y$ represents the true label vector; and $\hat{y}$ represents the predicted value. The cross-entropy function measures the distance between the two probability distributions $y$ and $\hat{y}$. The smaller the cross-entropy value, the stronger the prediction ability of the approach. The identification of SCR damage degree is a regression problem that measures the deviation between continuous values. Therefore, the mean square error function is used as the loss function of the hybrid model for this task.
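The two loss functions named above can be sketched in a few lines of pure Python. The `softmax` helper, the small `eps` clamp that avoids `log(0)`, and all function names are assumptions added for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label vector and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def mean_squared_error(y_true, y_pred):
    """Mean of squared deviations, used here for the damage-degree regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    probs = softmax([0.0, 0.0])
    print(cross_entropy([0, 1], probs))          # classification loss
    print(mean_squared_error([1, 2], [1, 4]))    # regression loss
```

A confident correct prediction drives the cross-entropy toward zero, while a uniform guess over two classes gives $\log 2$.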
2.5. Establishment of PSO-CNN-GRU
In deep learning algorithms, both parameters and hyperparameters affect the model’s capability. The hyperparameters are set before the model is trained and cannot be learned from the data. The choice of hyperparameters directly influences the prediction ability of the model. Hyperparameters are generally set manually by experienced personnel. This process is inefficient, poorly directed, and computationally expensive, and it must be repeated whenever the data change even slightly. Manually designing the hyperparameters of complex models can no longer meet the requirements of big data analysis.
To efficiently find suitable CNN-GRU hyperparameters and further improve the prediction accuracy, PSO is used to optimize the CNN-GRU hyperparameters in this study. The main optimized hyperparameters include the number of convolution kernels in the CNN layer, the number of hidden layer nodes in the GRU, the number of FC layer nodes, and the learning rate. The optimized parameters include both discrete and continuous values, which challenges the optimization capability of PSO. The hyperparameter setting problem of the CNN-GRU model is thus transformed into a PSO-based optimization problem. Each particle is evaluated by the fitness function, which is the fundamental element in establishing the PCG model. In this study, the particle fitness value is obtained by training CNN-GRU models with different hyperparameters. The fitness function of PSO comprehensively considers the root mean square error (RMSE) and the mean absolute error (MAE), and it is calculated by Equations (14)–(16) [51].
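The sketch below computes the RMSE and MAE and combines them into a single fitness value. The exact combination used in Equation (16) is not reproduced here, so the weighted sum with a hypothetical weight `alpha` is only an assumed placeholder; the function names are likewise illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fitness(y_true, y_pred, alpha=0.5):
    """Hypothetical fitness: weighted sum of RMSE and MAE (lower is better).

    The weighting is an assumption for illustration, not the paper's formula.
    """
    return alpha * rmse(y_true, y_pred) + (1.0 - alpha) * mae(y_true, y_pred)


if __name__ == "__main__":
    y_true = [1, 2, 3, 4]
    y_pred = [1, 2, 3, 8]
    print(rmse(y_true, y_pred), mae(y_true, y_pred), fitness(y_true, y_pred))
```

Because the RMSE penalizes large errors more heavily than the MAE, a fitness that uses both balances sensitivity to outliers against average accuracy.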
In the PCG hybrid model, the aim of the CNN-GRU model is to solve the damage identification problem, and PSO is employed to automatically optimize the hyperparameters of the CNN-GRU model. As a module that can be adjusted repeatedly, the CNN-GRU model is embedded into the fitness function of PSO.
Figure 3 shows the flow chart of the PCG model. The process of optimizing the hyperparameters of the CNN-GRU model with PSO is as follows:
Step 1: Initialize the particle positions, particle velocities, acceleration coefficients, and other necessary parameters. Encode each individual with four dimensions, corresponding to the four hyperparameters: the number of convolution filters, the number of GRU hidden layer nodes, the number of FC layer nodes, and the learning rate.
Step 2: Decode the individuals and build the corresponding CNN-GRU models. Obtain prediction results on the test set data. Calculate the fitness values of all individuals by Equation (16) based on the prediction results.
Step 3: Update the particle velocity, the individual best solution, and the group best solution by Equation (8).
Step 4: Generate the next generation of individuals according to Equation (7). Check the boundary conditions and limit the range of each dimension of the individuals.
Step 5: Determine whether the stopping criterion is met. If yes, end the iteration and extract the optimal hyperparameters. Otherwise, return to Step 2.
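The decoding and boundary handling in Steps 2 and 4 can be sketched as follows. A four-dimensional particle is clipped to its search range, and the three count-type dimensions are rounded to integers while the learning rate stays continuous. The function name `decode_particle`, the dictionary keys, and the bounds in the usage example are hypothetical; the search ranges used in this study are not stated in this section.

```python
def decode_particle(position, bounds, integer_dims=(0, 1, 2)):
    """Map a 4-D particle position to CNN-GRU hyperparameters.

    Each dimension is clipped to its bounds; dimensions listed in
    integer_dims are rounded to the nearest integer.
    """
    names = ("n_filters", "gru_units", "fc_units", "learning_rate")
    params = {}
    for d, (name, value) in enumerate(zip(names, position)):
        lo, hi = bounds[d]
        value = min(max(value, lo), hi)   # enforce the boundary condition
        params[name] = int(round(value)) if d in integer_dims else value
    return params


if __name__ == "__main__":
    # Hypothetical search ranges, for illustration only.
    bounds = [(8, 64), (16, 128), (16, 128), (1e-4, 1e-2)]
    print(decode_particle([31.6, 200.0, 15.2, 0.005], bounds))
```

In a full loop, each decoded dictionary would be used to build and train one CNN-GRU model, and the resulting prediction error would be returned to PSO as that particle's fitness.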