#### **1. Introduction**

Gearboxes play an irreplaceable role in mechanical power systems and usually work in harsh and complex environments [1,2]. Gearbox failures may cause unexpected accidents and economic losses. Therefore, accurately identifying and diagnosing faults is key to ensuring that gearboxes operate normally [3].

Intelligent diagnosis methods have received wide attention from researchers because they detect faults automatically and are not limited by manual experience. Methods based on deep learning, such as the long short-term memory network (LSTM) [4], recurrent neural network (RNN) [5] and convolutional neural network (CNN) [6], are particularly prominent because they can adaptively learn the fault information hidden in the collected signals. In addition, several extensions of standard deep learning models have been proposed for rotating machinery fault diagnosis, such as the deep convolutional auto-encoder (DCAE) [7], CNN with capsule network [8] and multiscale CNN [9]. Yao et al. [10] proposed a stacked inverted residual CNN (SIRCNN), which achieved stable and reliable fault diagnosis accuracy. Shao et al. [11] established an ensemble deep autoencoder (EDAE) consisting of several DAEs with different activation functions; the results indicated good accuracy in rolling bearing fault diagnosis. However, the methods mentioned above assume that adequate high-quality data collected from the machine of interest are available for estimating the underlying data distributions. In addition, these methods require the training and testing data to be drawn from the same probability distribution [12]. In practical applications, it is impractical to obtain a large amount of labeled data.

**Citation:** Mao, G.; Zhang, Z.; Qiao, B.; Li, Y. Fusion Domain-Adaptation CNN Driven by Images and Vibration Signals for Fault Diagnosis of Gearbox Cross-Working Conditions. *Entropy* **2022**, *24*, 119. https://doi.org/10.3390/e24010119

Academic Editor: Daniel Abasolo

Received: 15 November 2021 Accepted: 10 January 2022 Published: 13 January 2022

**Publisher's Note:** MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

In addition, the performance of the aforementioned methods may degrade when recognizing unlabeled data collected from another machine or under different working conditions, owing to data discrepancy [13].

Transfer learning can transfer learned knowledge to related machinery or fields, and it is widely applied to address the cross-domain fault diagnosis problem described above. Currently, parameter transfer and feature domain adaptation are two popular ways of implementing transfer learning. Parameter transfer is suitable for scenarios where a few labeled data from the target domain are available but are not enough to train the model. Qian et al. [14] proposed a method for rolling bearing fault diagnosis under variable working conditions by transferring the parameters of a stacked auto-encoder (SAE). Chen et al. [15] proposed a transfer neural network to diagnose the faults of rotary machinery, which pre-trains a 1D-CNN with the source data and then uses the limited target data to fine-tune the model, yielding a transfer convolutional neural network. However, in most practical applications, no labeled target data are available to participate in model training. In this case, domain adaptation techniques based on feature transfer are preferred. One implementation of domain adaptation adds a domain adaptation term to the loss function, such as the Maximum Mean Discrepancy (MMD) [16–18] or the Wasserstein distance [19]. Another implementation relies on domain adversarial training, in which a feature extractor learns to extract features common to the source and target domains [20–22]. Furthermore, to improve the transfer and generalization capabilities of the models, data from multiple source domains have been used to extract transferable features for diagnosing the faults of rotating machinery [23–25].
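To make the idea of a domain adaptation term concrete, the following is a minimal sketch of a biased estimate of the squared MMD between two sets of feature vectors. The Gaussian kernel, the `bandwidth` value, the function names and the synthetic samples are all assumptions chosen for illustration; they are not taken from the cited works or from the method proposed in this paper.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    def mean_kernel(xs, ys):
        total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Synthetic 2-D "features" (illustrative only): two sets drawn from the same
# distribution and one set whose mean is shifted, mimicking a domain shift.
random.seed(0)
source = [[random.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
same = [[random.gauss(0.0, 1.0) for _ in range(2)] for _ in range(50)]
shifted = [[random.gauss(2.0, 1.0) for _ in range(2)] for _ in range(50)]

mmd_same = mmd_squared(source, same)
mmd_shifted = mmd_squared(source, shifted)
print(f"same distribution:    {mmd_same:.4f}")
print(f"shifted distribution: {mmd_shifted:.4f}")
```

In an MMD-based approach, a quantity of this kind is typically computed on the features produced by the network for source and target batches and added, with a weighting factor, to the classification loss, so that minimizing the total loss also pulls the two feature distributions together.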

However, most existing studies on transfer diagnosis focus on single-channel signals, mainly vibration signals. This is because the vibration signal can be collected by an acceleration sensor attached to the surface of the component and is sensitive to the impacts caused by structural damage, such as gear fracture and bearing outer race cracks. Vibration signals are not sensitive, however, to some non-structural faults, such as gearbox oil shortage. These failures can also have serious consequences and should not be ignored. Infrared thermal images can effectively reflect non-structural fault information and are widely applied in fault diagnosis [26,27]. However, an infrared thermal image used alone is very sensitive and is easily affected by external factors such as oil temperature [28]. Therefore, fault diagnosis based on multi-source heterogeneous data fusion is an issue worthy of study. Bai et al. [29] proposed a method for coupling fault diagnosis of rotary machinery using infrared images and vibration signals, in which the enhanced infrared thermal image and two-dimensional vibration signals are spliced and input into a CNN to obtain the final diagnosis result. Shao et al. [30] pre-trained multiple novel SAEs using multisensory signals from the source domain and fine-tuned each novel SAE using a target domain sample; the diagnosis result is obtained by a modified voting strategy. In the above studies, multi-source heterogeneous signals are applied to fault diagnosis because they supply abundant fault information. However, infrared thermal images and vibration signals have rarely been used together to diagnose structural and non-structural failure states in unlabeled target domains.

This paper proposes a fusion domain-adaptation CNN (FDACNN) driven by images and vibration signals. The FDACNN consists of two main stages: data-level fusion and domain-adaptation network training. In the data-level fusion stage, raw signals are transformed into frequency spectra and squared envelope spectra, which are arranged into a two-dimensional format. The two-dimensional data are then combined with the infrared thermal image to form fusion data samples for model training. In the domain-adaptation network training stage, a feature extractor, a domain discriminator and a state classifier are constructed. After several rounds of adversarial training, domain-invariant features can be extracted from the fusion samples and used to classify health states. In an actual industrial gearbox, both accelerometers and an infrared camera can be installed to collect the vibration signals and infrared images. The vibration signal can be used to effectively diagnose structural failures such as tooth breakage, missing teeth and gear wear, whereas the infrared thermal image is sensitive to non-structural failures, such as oil shortage and excessive oil temperature. In this study, more comprehensive features can be extracted from infrared thermal images and vibration signals than from a single sensor. Moreover, the proposed method has lower computational complexity, which makes it suitable for online fault diagnosis of gearboxes. The main contributions and insights of this study are listed below:


The rest of this article is arranged as follows: Section 2 presents preliminaries and basic knowledge. Section 3 details the proposed FDACNN. Section 4 validates the proposed method and analyzes the results. Finally, Section 5 concludes the paper.
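As a rough illustration of the data-level fusion stage outlined earlier in this section, the sketch below computes a frequency spectrum and a squared envelope spectrum from a vibration signal, arranges each into a two-dimensional array, and stacks them with a thermal image. The synthetic signal, the 32 × 32 image size, the min-max scaling and the channel-stacking layout are assumptions made for this example; the arrangement actually used by the FDACNN is described in Section 3.

```python
import numpy as np


def amplitude_spectrum(signal):
    """One-sided amplitude spectrum of a real-valued signal."""
    return np.abs(np.fft.rfft(signal)) / len(signal)


def analytic_signal(signal):
    """Analytic signal computed with an FFT-based Hilbert transform."""
    n = len(signal)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(signal) * weights)


def squared_envelope_spectrum(signal):
    """Spectrum of the squared envelope, with the mean (DC) removed."""
    envelope_sq = np.abs(analytic_signal(signal)) ** 2
    envelope_sq = envelope_sq - envelope_sq.mean()
    return np.abs(np.fft.rfft(envelope_sq)) / len(signal)


def to_image(spectrum, side):
    """Arrange the first side*side bins into a square array scaled to [0, 1]."""
    patch = spectrum[:side * side].reshape(side, side)
    span = patch.max() - patch.min()
    return (patch - patch.min()) / span if span > 0 else np.zeros_like(patch)


# Synthetic vibration signal (assumed): a 400 Hz carrier modulated at 20 Hz.
fs = 2048                      # sampling rate in Hz, one second of data
t = np.arange(fs) / fs
vibration = (1.0 + 0.5 * np.cos(2 * np.pi * 20 * t)) * np.sin(2 * np.pi * 400 * t)

freq = amplitude_spectrum(vibration)
ses = squared_envelope_spectrum(vibration)

side = 32
freq_img = to_image(freq, side)
ses_img = to_image(ses, side)

# Placeholder standing in for a preprocessed infrared thermal image in [0, 1].
thermal_img = np.random.default_rng(0).random((side, side))

# Assumed fusion layout: the three 2-D arrays stacked as channels.
fused = np.stack([freq_img, ses_img, thermal_img], axis=0)
print(fused.shape)
```

The squared envelope spectrum is included because it exposes the modulation frequency of the signal (20 Hz here), which the raw frequency spectrum shows only as sidebands around the carrier.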

#### **2. Preliminaries**
