*4.3. Methods Comparison and Results Discussion*

In this section, four methods are compared with the proposed FDACNN on the test tasks T1~T5 to illustrate its superiority. The compared methods are described below.

The standard DANN method [35] is used for comparison. Its training data are one-dimensional sequences obtained by flattening the fused data.
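The core mechanism of DANN is a gradient reversal layer between the feature extractor and the domain classifier: it acts as the identity in the forward pass and negates (and scales) the gradient in the backward pass, so the extractor is trained to confuse the domain classifier. A minimal framework-free sketch of that behavior, with `lam` as the usual trade-off coefficient (names are illustrative, not from the original paper):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales gradients by -lam in the backward pass."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off coefficient lambda

    def forward(self, x):
        # Features pass through unchanged to the domain classifier.
        return x

    def backward(self, grad_output):
        # Reversed gradient flows back into the feature extractor,
        # pushing it toward domain-indistinguishable features.
        return -self.lam * grad_output
```

In an autograd framework this is typically implemented as a custom function with these two passes; the sketch above only illustrates the sign flip.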

Another popular domain-adaptation method, based on maximum mean discrepancy (DA-MMD) [36], is also applied for comparison. The training data and network structure used in DA-MMD are consistent with those of the proposed method.
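MMD-based adaptation minimizes the discrepancy between source and target feature distributions in a reproducing kernel Hilbert space. A minimal sketch of the (biased) squared-MMD estimator with a Gaussian kernel, as commonly used in DA-MMD-style methods (the kernel choice and bandwidth `sigma` are assumptions, not details from [36]):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise squared Euclidean distances between rows of x and y.
    d2 = np.sum(x**2, 1)[:, None] + np.sum(y**2, 1)[None, :] - 2 * x @ y.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2(xs, xt, sigma=1.0):
    """Biased estimate of squared MMD between source features xs and target features xt."""
    kxx = gaussian_kernel(xs, xs, sigma).mean()
    kyy = gaussian_kernel(xt, xt, sigma).mean()
    kxy = gaussian_kernel(xs, xt, sigma).mean()
    return kxx + kyy - 2 * kxy
```

During training, this term is added to the classification loss so that the extractor maps both domains to similar feature distributions.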

The DACNN model trained with vibration signals only (DACNN-SV). This model is trained on frequency-domain signals and the CS2 sequence (2 × 64 × 32 format).

The DACNN model trained with infrared thermal images only (DACNN-SI). This model is trained on the three RGB channels of the infrared thermal images (3 × 64 × 32 format).
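Given the 2 × 64 × 32 vibration representation and the 3 × 64 × 32 infrared representation above, one natural way to realize the two-dimensional fusion used by the full model is channel-wise stacking into a 5 × 64 × 32 tensor. This is a sketch under that assumption; the paper does not spell out the exact fusion operator here, and the function name is illustrative:

```python
import numpy as np

def fuse_channels(vib, ir):
    """Stack vibration channels (2 x H x W) and infrared RGB channels (3 x H x W)
    into one multi-channel input (5 x H x W), assuming channel-wise fusion."""
    assert vib.shape[1:] == ir.shape[1:], "spatial sizes must match"
    return np.concatenate([vib, ir], axis=0)

# Hypothetical single samples in the formats stated above.
vib = np.random.rand(2, 64, 32)   # frequency-domain signal + CS2 sequence
ir = np.random.rand(3, 64, 32)    # RGB infrared thermal image
fused = fuse_channels(vib, ir)    # shape (5, 64, 32)
```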

The results of the comparison methods are listed in Table 5; for a more intuitive comparison, they are also plotted as a bar diagram in Figure 5. It can be observed that DACNN-SV has the lowest accuracy on the five tasks: 47.38%, 40.88%, 67.44%, 35.56% and 35.44%, respectively. This is because the vibration signal is not sensitive to the two non-structural faults, which makes them impossible to classify and therefore lowers the accuracy. DACNN-SI performs well on T1 and T3 but is unsatisfactory on the other test tasks; in particular, its accuracy on T5 is only 83%. This illustrates that infrared thermal images can effectively identify structural and non-structural failure states but are susceptible to environmental interference. The accuracies of DANN and DA-MMD are lower than that of the proposed FDACNN on all test tasks. This is because the two-dimensional data fusion effectively preserves the fault information contained in the infrared images and vibration signals, while the adversarial domain adaptation network enables the feature extractor to extract easily distinguishable target-domain features.


**Table 5.** The results of different comparison methods.

Additionally, to demonstrate the feature extractor's ability to extract domain-invariant features, principal component analysis (PCA) is used to map the extracted features into a two-dimensional space. Figure 6 shows the two-dimensional visualizations for the different test tasks, in which PCA 1 and PCA 2 denote the first and second principal components, respectively. In T1 and T3, points of the same color are clustered together, and clusters of different colors are clearly isolated. In T2, T4 and T5, only a few same-colored points are confused, and most points of the same color remain relatively concentrated. The extracted features are therefore relatively separable in all test tasks, which suggests that the trained feature extractor can extract distinguishable features from the unlabeled target-domain fusion samples.

**Figure 6.** Feature visualization of different test tasks.
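The PCA projection used for this visualization can be sketched as follows: center the feature matrix, take its SVD, and project onto the first two right singular vectors. This is a minimal numpy version (the paper presumably uses a standard PCA implementation; the function name is illustrative):

```python
import numpy as np

def pca_2d(features):
    """Project feature vectors (n_samples x n_features) onto their
    first two principal components for 2-D visualization."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T  # shape (n_samples, 2)
```

Plotting the two columns of the result, colored by class label, yields scatter plots like those in Figure 6, where cluster separation indicates feature separability.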
