1. Introduction
Aging bridge populations compel owners to investigate effective, low-cost approaches for locating deficiencies across their inventory. Integrating automated structural health monitoring (SHM) and affiliated damage identification methods into bridge condition assessment and management processes could help bridge owners better manage their assets and maintain safe operations more cost-effectively. In SHM, Machine Learning (ML) methods are often used to predict damage location and intensity from measured or simulated structural response [1]. Proper Orthogonal Decomposition (POD) has proven to be a powerful method for extracting damage information from measured response [2]. However, this method, like most ML methods, is applicable to a single structure under specific damage scenarios [3]. In addition, obtaining ground truth information from real world structural damage states can be challenging, if not impossible. To alleviate these limitations, this study aims to use Transfer Learning (TL) as a tool for transferring knowledge about a feature space from a structure with pre-trained classifiers of damage location and intensity with known labels to another structure with unknown labels. The proposed method utilizes POD to detect damage and TL to generalize techniques to a wide variety of bridges with some degree of similarity in behavior and damage states by transferring a priori knowledge from one structure to similar structures in the population.
Data-driven methodologies for damage identification in structures are often developed using ML and, more specifically, pattern recognition algorithms [4]. The most common approach for implementing data-driven damage identification is to find a robust damage feature from measured data. One of the main challenges in feature selection is finding features that are sensitive to damage and insensitive to other operational variables. Additionally, the correlation of feature behavior with damage level and potential low dimensionality of the feature vector are crucial factors affecting successful damage identification [5]. Although modal parameters are common features for most vibration-based damage identification methods, they cannot be directly used for damage identification in highly nonlinear structural systems [6]. In addition, modal parameters need to be decoupled from sensor noise to achieve improved damage detection accuracy [7]. Malekjafarian et al. [8] investigated the fault detection of an in-service railway track using measured acceleration. They demonstrated that, for a certain range of train forward speed, the extracted amplitude of acceleration after applying Peak-Based Decomposition (PBD) corresponded to the data observed from the Track Recording Vehicle (TRV) and could potentially be considered a damage indicator. Song et al. [9] proposed ensemble empirical mode decomposition (EEMD) to eliminate unnecessary information from the original signal for analyzing the pantograph–catenary system.
Principal Component Analysis (PCA), a POD method, is a powerful technique for capturing dominant features in a multi-dimensional system using a few modes [10]. To eliminate environmental effects, Bellino et al. [11] proposed a PCA-based damage identification approach that chose the first natural frequency as the damage feature and assessed its performance by examining an experimental time-varying system under controlled temperature. Galvanetto and Violaris [12] numerically studied structural damage detection using Singular Value Decomposition (SVD) by computing Proper Orthogonal Modes (POMs) and examined differences between POMs in healthy and damaged structure models. Shane and Jha [13] developed a POD-based damage identification algorithm for a composite beam using vibration data. Eftekhar Azam et al. [14] developed an automated damage detection framework utilizing POD and Artificial Neural Networks (ANNs) to detect the location of simulated fatigue cracks in a steel railway bridge. More recently, Ardani et al. [15] examined the effectiveness of a POD framework for damage identification in three simply supported bridges by imposing actual damage scenarios.
A damage identification framework for complex structural systems is developed by training a model that maps damage features to their corresponding damage labels. ANNs are utilized to model this relationship between input (damage features) and output (damage labels) data. ANNs have been widely used for pattern recognition and classification and, in structural engineering, for the identification of deficiencies. Xu and Humar [16] proposed an ANN-focused, two-step algorithm that implemented modal energies for simulated damage identification using an FE girder bridge model. Mehrjoo et al. [17] demonstrated the efficacy of ANNs in identifying damage location and severity in truss bridge connections using modal shape parameters. Gu et al. [18] proposed using a multilayer ANN that focused on changes in natural frequencies. Novelty indices that quantified damage severity were determined to distinguish changes in natural frequencies caused by damage from those caused by temperature variations.
Despite the large amount of research that has successfully utilized ANNs as a learning method for structural damage identification and SHM across various engineering problems, training time and output accuracy depend heavily on network structure. In many cases, the computational effort required to retrain a developed network for another structure in the population, even one with some degree of similarity to the structure on which the network was trained, is excessive.
Finite Element (FE) modeling uncertainties (MUs) can lead to unreliable ANN results [19]. MU can influence damage detection algorithm accuracy even if there is a good match between FE model predictions and experimental data [20]. Lee et al. [21] used differences between mode shape ratios for damage identification on a simple beam and a multi-girder bridge using ML to examine the effect of MU on damage detection. The proposed method was applied to an in-service, multi-girder bridge, and minor false positives in estimated damage intensity were observed. Bakhary et al. [19] utilized statistical ANNs involving modal parameters from an FE model of a single span steel portal frame to investigate the effects of MU on vibration-based damage detection by considering random errors. They concluded that the statistical ANN detected damage with higher accuracy than a normal ANN. Rageh et al. [22] investigated the effects of MU on an operational railway bridge using a hybrid damage detection algorithm based on POD and ANNs. A series of numerical investigations involving different MUs was completed, and a robust damage feature that was less sensitive to MUs was developed. Result accuracy varied with the examined MU, with less accurate results being obtained at higher MU levels.
In addition to issues associated with the effects of MUs on damage identification accuracy, one of the drawbacks of most conventional SHM ML techniques is the inability to transfer knowledge from one structure under specific damage scenarios to another, similar, structure in a population, potentially one with different damage scenarios. Difficulty associated with transferring a developed and trained algorithm within a population of structures has motivated the SHM community to incorporate Population-Based SHM (PBSHM). Recently developed TL methods and their applications to SHM can provide systematic approaches to knowledge transfer within a population of structures [3,23]. Traditionally, the main assumption in implementing conventional ML methods is that training and testing data are selected from the same distribution [24]. To overcome this limitation, TL methods generalize the classifier trained for one structure to be applicable to another structure. In the context of SHM, TL methods have been used for image processing, computer vision, and pattern recognition. Gardner et al. [3] proposed a PBSHM approach that employed TL methods in the form of Domain Adaptation (DA). They assessed TL performance with respect to labeling a target domain using Transfer Component Analysis (TCA) [25], Joint Domain Adaptation (JDA) [26], and Adaptation Regularization-based TL (ARTL) [27] in classification-type problems for homogeneous- and heterogeneous-type populations. The efficiency of each TL method was examined for two heterogeneous populations. The first case population encompassed two numerical simulations of a three degree of freedom structure, each with different geometric and material properties, with one simulation being the source and the other the target domain. The second case involved using a numerical simulation of a structure as the source domain and an experimental replica as the target domain. Recently, Zhang et al. [28] proposed a TL-based method and Bayesian model updating (BMU) to reduce the effect of MUs on model updating performance. Modal parameters, including normalized frequency change ratios and mode shapes, were used as features. They utilized ARTL to transfer knowledge from a source domain consisting of an eight-floor numerically simulated structure to a target six-floor experimental structure. Zhang et al. [29] used JDA as a TL method to map wave signals from one plate to another and a convolutional long short-term memory (ConvLSTM) network to learn mapping relationships from the source plate so that the damage image was detected in the targeted plate. Yan et al. [30] developed a structural anomaly detection framework using the transmissibility function and statistical threshold selection, and examined its robustness against uncertainty. Mei et al. [31] demonstrated the better performance of the Bhattacharyya distance-driven algorithm for novelty detection against transmissibility functions that follow a Gaussian distribution.
This study was motivated by research showing that a trained source domain TL-based classifier can be generalized to detect deficiencies in unlabeled target domains and that, when a JDA–kernel method is implemented, higher accuracy target domain label predictions are obtained [3]. To date, research regarding TL-based SHM, while promising, has focused on experimental and numerical simulations under controlled environments. Therefore, there is a need to extend this approach to bridges and examine its effectiveness for actual structures under actual loads with larger feature spaces. In addition, coupling this TL method with POD provides a robust damage identification framework with minimal sensitivity to noise.
This study focused on developing and studying a TL approach for transferring knowledge from a base class, a model of an existing railway bridge with known labels in the feature space, to a modeled bridge class with realistic MU and unknown labels in the feature space. Bridge FE models under simulated damage scenarios, subjected to real train loadings measured during passages over the actual bridge, were used to validate the TL approach for bridge damage identification. JDA coupled with a linear kernel, herein referred to as JDA–kernel, was implemented to map between POMs of the two bridge models. The derived relationship and the Kernel Nearest Neighbor (KNN) approach were then implemented to obtain POM labels for a target bridge with unknown labels. The resulting JDA–kernel approach for damage detection and intensity identification was evaluated using three scenarios: known damage intensity (DI) and unknown damage location (DL); known DL and unknown DI; and unknown DL and DI.
4. TL Using Coupled POD and JDA–kernel
Before implementing the JDA–kernel method, two models were selected from the experiment set so that knowledge could be mapped from one model, identified as the source domain, to another model, identified as the target domain, to predict labels for the target domain. The base model was selected as the source domain, and each model with MU from Table 1 as the target domain. Each model had eleven damage labels, one for the healthy state (0% damage) and ten imposed damage levels with different intensities at different locations, identified in Figure 1b. Feature spaces were the first POMs. The base model was simulated for all 4824 scenarios using SAP2000 OAPI and validated using field measured data from the bridge.
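As an illustration of how a first POM can be obtained from response data, the following minimal sketch applies SVD to a sensor-by-time response matrix. It is not the study's implementation: the function name `first_pom`, the mean removal, the sign convention, and the synthetic six-sensor signal are all assumptions made for the example.

```python
import numpy as np

def first_pom(responses):
    """Return the first proper orthogonal mode (POM) and its energy share.

    responses: array of shape (n_sensors, n_time_steps), one row per sensor.
    """
    # Remove each sensor's mean (assumed preprocessing step).
    centered = responses - responses.mean(axis=1, keepdims=True)
    # Left singular vectors of the response matrix are the POMs.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pom = u[:, 0]
    # Fix the sign so POMs from different runs are comparable.
    if pom[np.argmax(np.abs(pom))] < 0:
        pom = -pom
    # Fraction of total signal energy captured by the first mode.
    energy = s[0] ** 2 / np.sum(s ** 2)
    return pom, energy

# Hypothetical data: six sensors responding mostly in one spatial shape.
rng = np.random.default_rng(0)
shape = np.sin(np.pi * np.arange(1, 7) / 7)
t = np.linspace(0.0, 10.0, 500)
responses = np.outer(shape, np.sin(2 * np.pi * 1.5 * t))
responses += 0.01 * rng.standard_normal((6, 500))

pom, energy = first_pom(responses)
```

In this sketch each simulated scenario would contribute one such vector, so a feature matrix is formed by stacking the first POMs from all scenarios.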
The JDA–kernel method and KNN classifier were implemented for damage identification in the target domain using trained knowledge from the source domain. Each TL implementation assumed that estimated POMs for source and target domains from [22] were classified and that labels for the source domain were known. Predicted target domain labels from this supervised TL implementation were compared with the previously known labels to determine their accuracy. Since the JDA–kernel generates pseudo labels, the algorithm was implemented over several iterations until it converged.
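The iterative pseudo-labeling described above can be sketched as follows. This is a minimal illustration of JDA with a linear kernel as it is commonly formulated, paired with a one-nearest-neighbor classifier; it is not the study's code. The function names, the per-sample normalization, the default values of `n_components` and `lam`, the stopping rule, and the synthetic data are assumptions.

```python
import numpy as np

def nearest_neighbor(train_x, train_y, test_x):
    """1-nearest-neighbor labels for test_x (rows are samples)."""
    d2 = ((test_x[:, None, :] - train_x[None, :, :]) ** 2).sum(axis=2)
    return train_y[np.argmin(d2, axis=1)]

def jda_linear(xs, ys, xt, n_components=2, lam=1.0, n_iter=10):
    """Predict target labels with linear-kernel JDA and a 1-NN classifier."""
    ns, nt = len(xs), len(xt)
    n = ns + nt
    x = np.vstack([xs, xt]).astype(float)
    x /= np.linalg.norm(x, axis=1, keepdims=True)   # unit-length samples
    k = x @ x.T                                     # linear kernel matrix
    h = np.eye(n) - np.ones((n, n)) / n             # centering matrix
    classes = np.unique(ys)
    yt = None
    for _ in range(n_iter):
        # Marginal distribution (MMD) term.
        e = np.concatenate([np.full(ns, 1.0 / ns), np.full(nt, -1.0 / nt)])
        m = np.outer(e, e) * len(classes)
        # Class-conditional terms built from the current pseudo labels.
        if yt is not None:
            for c in classes:
                src = np.flatnonzero(ys == c)
                tgt = np.flatnonzero(yt == c) + ns
                e = np.zeros(n)
                e[src] = 1.0 / len(src)
                if len(tgt) > 0:
                    e[tgt] = -1.0 / len(tgt)
                m += np.outer(e, e)
        m /= np.linalg.norm(m, "fro")
        # Generalized eigenproblem (K M K + lam I) w = phi (K H K) w,
        # solved via a Cholesky factor of the positive-definite left side.
        a = k @ m @ k + lam * np.eye(n)
        b = k @ h @ k
        chol = np.linalg.cholesky((a + a.T) / 2)
        s = np.linalg.solve(chol, np.linalg.solve(chol, b).T)
        _, u = np.linalg.eigh((s + s.T) / 2)
        # Largest eigenvalues of s correspond to the smallest phi.
        w = np.linalg.solve(chol.T, u[:, -n_components:])
        z = k @ w                                   # embedded samples (rows)
        z /= np.linalg.norm(z, axis=1, keepdims=True) + 1e-12
        new_yt = nearest_neighbor(z[:ns], ys, z[ns:])
        if yt is not None and np.array_equal(new_yt, yt):
            break                                   # pseudo labels stable
        yt = new_yt
    return yt

# Hypothetical two-class data with a shifted target domain.
rng = np.random.default_rng(0)
centers = np.array([[1.0, 0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0, 0.0]])
ys = np.repeat([0, 1], 20)
true_yt = np.repeat([0, 1], 20)
xs = centers[ys] + 0.05 * rng.standard_normal((40, 5))
shift = np.array([0.1, 0.1, 0.2, 0.0, 0.0])
xt = centers[true_yt] + shift + 0.05 * rng.standard_normal((40, 5))

pred = jda_linear(xs, ys, xt, n_components=2, lam=1.0)
accuracy = float(np.mean(pred == true_yt))
```

The first pass uses only the marginal term because no target labels exist yet; later passes add class-conditional terms from the pseudo labels, and the loop stops once those labels no longer change.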
Three scenarios were investigated. In Scenario 1, DI was known and the algorithm determined DLs. The feature space in both the source and target domains contained matrices of the first POMs, where each column represented the first POM associated with a particular DL, and the label space comprised the DLs. Scenario 2 identified DIs in the target domain with DL being known. Here, the feature space in both the source and target domains contained matrices of the first POMs, where each column represented the first POM associated with a particular DI, and the label space comprised the DIs. For Scenario 3, both DIs and DLs were unknown and the algorithm was applied to the entire dataset in the source and target domains to determine target domain labels. Categorization was performed for various DIs, which defined the label space. The feature space comprised 24 features for the healthy case and 480 features for each DI. In this scenario, redundant healthy POMs were added to the feature space so that the same number of features from each damage state was included, which prevented the training process from underestimating healthy states.
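The balancing step in Scenario 3 can be illustrated with a short sketch. The counts (24 healthy features, 480 features for each of ten DIs) come from the text; the POM length `n_dof`, the integer label encoding, the random placeholder features, and the use of simple repetition to create redundant healthy POMs are assumptions.

```python
import numpy as np

n_healthy, n_per_intensity, n_intensities = 24, 480, 10
n_dof = 8  # hypothetical POM length (number of measurement points)

rng = np.random.default_rng(1)
healthy = rng.standard_normal((n_healthy, n_dof))  # placeholder healthy POMs
damaged = rng.standard_normal((n_intensities * n_per_intensity, n_dof))

# Repeat the healthy POMs until they match the per-intensity count.
repeats = n_per_intensity // n_healthy             # 480 / 24 = 20
healthy_balanced = np.tile(healthy, (repeats, 1))

features = np.vstack([healthy_balanced, damaged])
labels = np.concatenate([
    np.zeros(len(healthy_balanced), dtype=int),    # label 0: healthy
    np.repeat(np.arange(1, n_intensities + 1), n_per_intensity),
])
```

Before balancing, these counts give 24 + 10 × 480 = 4824 feature vectors, matching the number of simulated scenarios; after balancing, each of the eleven classes holds 480, for 5280 in total.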
Cross-validation of the JDA–kernel algorithm was implemented using the k-fold method [39]. Four partitions were used, with one partition reserved for validation and training carried out using the remaining three partitions. Hyperparameter tuning was also implemented using the grid search method [40] to find optimum values for the number of transferred components and the regularization parameter used in each simulation.
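A four-fold cross-validated grid search of this kind can be sketched generically as below. The helper names, the candidate parameter values, and the stand-in scoring function `toy_score` are hypothetical; in practice the scoring function would fit the JDA–kernel and classifier on the training partitions and return accuracy on the held-out partition.

```python
import itertools
import numpy as np

def kfold_indices(n_samples, n_folds=4, seed=0):
    """Split shuffled sample indices into n_folds nearly equal partitions."""
    order = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(order, n_folds)

def grid_search(score_fn, n_samples, grid, n_folds=4):
    """Return the parameter set with the best mean validation score.

    score_fn(train_idx, val_idx, **params) must return an accuracy.
    """
    folds = kfold_indices(n_samples, n_folds)
    best_params, best_score = None, -np.inf
    for values in itertools.product(*grid.values()):
        params = dict(zip(grid.keys(), values))
        scores = []
        for i, val_idx in enumerate(folds):
            # Train on every partition except the held-out one.
            train_idx = np.concatenate(
                [f for j, f in enumerate(folds) if j != i]
            )
            scores.append(score_fn(train_idx, val_idx, **params))
        mean_score = float(np.mean(scores))
        if mean_score > best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score

# Hypothetical candidate values; the study's search ranges are not given.
grid = {"n_components": [2, 5, 10], "lam": [0.1, 1.0, 10.0]}

def toy_score(train_idx, val_idx, n_components, lam):
    # Stand-in for fitting on train_idx and scoring on val_idx.
    return 1.0 / (1.0 + abs(n_components - 5) + abs(np.log10(lam)))

best_params, best_score = grid_search(toy_score, n_samples=100, grid=grid)
folds = kfold_indices(100)
```

Averaging the score over the four held-out partitions before comparing parameter sets keeps a single favorable split from deciding the choice.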
Figure 3 details the TL process. As illustrated in the figure, the process consisted of four steps. Preparing features and defining source and target models were addressed in Step 1. JDA–kernel implementation, cross-validation, classifier training, and hyperparameter tuning were implemented in Step 2. In Steps 3 and 4, the JDA–kernel and classifier were reimplemented using partial data and entire data sets, respectively, with the optimized parameters obtained in Step 2. The scores in Steps 3 and 4 represent the mean accuracy of the JDA–kernel and classifier calculated by applying the optimized parameters in the algorithm.