### *4.2. Compared Methods*

To demonstrate the performance advantages of the proposed model, the following four comparative methods are adopted:


### *4.3. Implementation Details*

As detailed in Table 1, we randomly discarded a number of fault types to design different partial transfer diagnosis tasks on the basis of the two fault diagnosis datasets. Each sample consists of 2400 data points, and the fast Fourier transform (FFT) is applied to convert the time-domain signal into a frequency-domain signal containing 1200 Fourier coefficients. The structure of the framework is illustrated in Table 2. The learning rate is set to 0.0001, and the maximum number of training epochs is 1000. To avoid the effects of random factors, we conducted 10 experiments on each task. The running steps of the proposed model are shown in Algorithm 1. In the test process, the spectral data of the target domain can be directly input into the model for classification. The model is implemented on the PyTorch platform.
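The FFT preprocessing described above can be sketched as follows. The function name is hypothetical, and the use of the magnitude spectrum is an assumption, since the text does not state whether magnitudes or raw coefficients are fed to the network; keeping the first 1200 of 2400 coefficients follows from the symmetry of the spectrum of a real-valued signal.

```python
import numpy as np

def preprocess_sample(signal: np.ndarray) -> np.ndarray:
    """Convert one 2400-point time-domain sample into a
    1200-coefficient frequency-domain feature vector via FFT.
    (Magnitude spectrum is an assumption; the paper only states
    that 1200 Fourier coefficients are retained.)"""
    assert signal.shape[-1] == 2400
    spectrum = np.abs(np.fft.fft(signal))
    # The spectrum of a real signal is symmetric, so keep the first half.
    return spectrum[..., :1200]

# Example on a synthetic vibration-like sample.
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 2400)
x = np.sin(2 * np.pi * 50 * t) + 0.1 * rng.standard_normal(2400)
features = preprocess_sample(x)
print(features.shape)  # (1200,)
```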


**Step 4:** Train the auxiliary classifier *C<sub>A</sub>* to obtain the optimal parameters *θ̂<sub>CA</sub>* by minimizing *F*(*θ<sub>CA</sub>*);
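A minimal sketch of this step is given below. All module names and sizes (`G`, `C_A`, the feature dimensions, the number of classes) are assumptions, and *F*(*θ<sub>CA</sub>*) is assumed here to be the source-domain cross-entropy of the auxiliary classifier; the paper's actual objective is defined in its earlier sections and is not reproduced.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's feature generator G and
# auxiliary classifier C_A (architectures and sizes are assumed).
G = nn.Sequential(nn.Linear(1200, 512), nn.ReLU(), nn.Linear(512, 256))
C_A = nn.Linear(256, 5)  # 5 source fault classes, assumed

opt_ca = torch.optim.Adam(C_A.parameters(), lr=1e-4)
ce = nn.CrossEntropyLoss()

def step4_update_auxiliary_classifier(xs, ys):
    """One optimisation step of Step 4: update only C_A by
    minimising the (assumed) objective F(theta_CA)."""
    with torch.no_grad():          # Step 4 leaves G fixed
        feats = G(xs)
    loss = ce(C_A(feats), ys)
    opt_ca.zero_grad()
    loss.backward()
    opt_ca.step()
    return loss.item()

xs = torch.randn(8, 1200)              # a dummy source batch
ys = torch.randint(0, 5, (8,))
loss = step4_update_auxiliary_classifier(xs, ys)
```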


**Table 1.** Descriptions of the diagnosis tasks.



### *4.4. Experimental Results*

As mentioned in Section 2, the deep features in different activation layers of the model are involved in subdomain adaptation. To obtain the best performance for subdomain adaptation, the deep features with dimensions 128, 256 and 512 in the fully connected layers of the classifier and feature generator (named L1, L2 and L3, respectively) are extracted and combined for comparison. The B7 task was selected to verify the different layer combinations, and the experiment was repeated 15 times. As shown in Figure 5, L1 achieves the best performance among the single layers, while L3 performs the worst. This means that the model needs deep operations to extract more separable domain-invariant features. Among the multi-layer combinations, L1 + L2 performs better than any single layer, while L1 + L2 + L3 performs worse than L1 + L2. This indicates that some non-invariant features may exist in the shallow layers, and using subdomain adaptation to align these features degrades the performance of the model. Therefore, we apply the combination of L1 + L2 for the designed tasks.
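The multi-layer combination above can be sketched as follows. The network layout is illustrative only (layer sizes match the stated 128/256/512 dimensions but nothing else from Table 2), and `mmd_fn` is a placeholder for the paper's WLMMD, which is not reproduced here; a trivial mean-difference stand-in is used to make the sketch runnable.

```python
import torch
import torch.nn as nn

class FeatureTap(nn.Module):
    """Illustrative network exposing the 128- and 256-dim fully
    connected activations (the paper's L1 and L2). Layer sizes are
    taken from the text; everything else is assumed."""
    def __init__(self, in_dim=1200):
        super().__init__()
        self.fc_l3 = nn.Linear(in_dim, 512)
        self.fc_l2 = nn.Linear(512, 256)
        self.fc_l1 = nn.Linear(256, 128)

    def forward(self, x):
        l3 = torch.relu(self.fc_l3(x))
        l2 = torch.relu(self.fc_l2(l3))
        l1 = torch.relu(self.fc_l1(l2))
        return [l1, l2]  # the L1 + L2 combination used in the paper

def combined_alignment_loss(src_feats, tgt_feats, mmd_fn):
    # Sum the discrepancy over every selected layer.
    return sum(mmd_fn(s, t) for s, t in zip(src_feats, tgt_feats))

net = FeatureTap()
src_feats = net(torch.randn(4, 1200))
tgt_feats = net(torch.randn(4, 1200))
# Dummy stand-in for WLMMD: squared difference of feature means.
dummy_mmd = lambda a, b: ((a.mean(0) - b.mean(0)) ** 2).sum()
loss = combined_alignment_loss(src_feats, tgt_feats, dummy_mmd)
```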

**Figure 5.** Boxplot for the performance of different layer combinations under the same task.

The average accuracies of the proposed method and the comparison methods on all tasks are shown in detail in Table 3. In general, our method obtains the highest average accuracy and the lowest standard deviation, which indicates that WSAN has excellent and stable performance in both global and partial domain adaptation tasks. Since the basic approach does not include any domain adaptation operations, it obtains the worst performance on all tasks. MKMMD achieves the highest accuracy on the non-partial transfer fault diagnosis task B1 but performs poorly on the partial transfer tasks. This indicates that domain adaptation methods based on MMD have superior performance in fault diagnosis tasks under variable conditions, but directly applying them to partial transfer scenarios is not feasible. DSAN performs better than MKMMD on most tasks, and its average accuracy is 4.2% higher than that of MKMMD. However, it still lags behind the other two partial transfer methods because it does not carry out any weight learning operation. WSAN achieves an average accuracy of 97.7%, which is 4.7% higher than ETN, 13.2% higher than DSAN, and 17.4% higher than MKMMD.

It can be noted that ETN and WSAN, as two domain adaptation methods with weight learning, perform significantly better than the other methods in partial transfer diagnosis tasks. In addition, the advantage of the proposed method over ETN grows as the degree of domain class asymmetry increases. On task B2, the accuracy of WSAN is 1.4% higher than that of ETN, while on task B8 it is 5.6% higher. The same phenomenon can be observed for tasks G1 and G6.


**Table 3.** Experimental results of the average testing accuracies in all tasks (%).

To intuitively demonstrate the feature classification effect of our method, the high-dimensional features extracted by the model are processed with the well-known *t*-SNE [33] technique for dimension reduction. The dimension reduction results on B3 are shown in Figure 6. In Figure 6a, we can see that the feature separability and clustering effect obtained by the basic method are poor. When domain adaptation is adopted, the features become separable, but the shared types and outliers still cannot be distinguished in Figure 6b,c. Although MKMMD and DSAN perform efficient global domain adaptation, the existence of outlier types leads the model to extract classification knowledge that is not applicable to the target domain. This also indicates that the global adaptation methods only pay attention to the alignment of the two domains and do not consider the relationship between the subdomains within each domain. In Figure 6d, it can be seen that ETN basically separates the outlier samples, but the alignment of the shared-type features is not accurate enough, which indicates that the classifier cannot carry out effective sample-level alignment after obtaining class-level weights, and this may lead to inaccurate classification. There is some confusion between the source samples of the RO2 and RF1 types. In this case, ETN may treat the RF1 samples as outliers and filter out some useful classification knowledge. For the proposed method, precise alignment of the related subdomains is performed while the outlier types are blocked, as shown in Figure 6e. After obtaining accurate class-level weights, WSAN can use the proposed WLMMD to perform effective subdomain alignment involving sample-level weight learning.
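The dimension reduction step can be sketched with scikit-learn's *t*-SNE implementation. The feature array here is random and merely stands in for the deep features extracted by the trained model; the perplexity and initialization settings are assumptions, as the paper does not report its *t*-SNE hyperparameters.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical deep features: 100 source + 100 target samples,
# 256-dim, standing in for the model's extracted activations.
rng = np.random.default_rng(0)
features = rng.standard_normal((200, 256))

# Reduce to 2-D for scatter plotting, as done for Figure 6.
embedded = TSNE(n_components=2, perplexity=30, init="pca",
                random_state=0).fit_transform(features)
print(embedded.shape)  # (200, 2)
```

The 2-D `embedded` array would then be scattered with per-class colors and source/target markers to produce a panel like those in Figure 6.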

To further explore how the learned weights affect the alignment of deep features, the similarity matrix of the source and target features in the deep layer is drawn for task G4. According to [30], the similarity matrix can be calculated by *G*(*x<sub>i</sub>*, *x<sub>j</sub>*) = exp(−‖*x<sub>i</sub>* − *x<sub>j</sub>*‖<sup>2</sup>/200), wherein *x<sub>i</sub>* ∈ *D<sub>s</sub>* and *x<sub>j</sub>* ∈ *D<sub>t</sub>*. Figure 7a shows the actual correspondence between the source and target labels. In Figure 7b, only the samples of the SW type can be identified to a certain extent, while the features extracted from the other two target types are highly similar to various source types, which is extremely unfavorable for classification. Obviously, the deep features extracted by the basic method are chaotic owing to the lack of domain adaptation operations. In Figure 7c,d, the corresponding samples of the SP and SW types have a low degree of similarity, and some of the samples are highly similar to other types. Consequently, global domain adaptation methods may extract fuzzy deep features when dealing with the partial transfer problem. Figure 7e shows that ETN can assign large weights to the shared types, but some outlier samples still receive large weights, resulting in a higher similarity between the target features of SP and the source features of PP and PW. By comparison, Figure 7f indicates that WSAN obtains more accurate weights, which is reflected in the high similarity between the extracted target-domain features and the corresponding source-domain features; only a few samples are weakly similar to other source types. In general, the proposed method can make the shared samples fully participate in the subdomain adaptation and block the outliers. Thus, the extracted domain-invariant features have high similarity among the corresponding shared types.
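The similarity matrix defined above can be computed directly from the cited formula. The feature dimension and sample counts below are placeholders; only the kernel *G*(*x<sub>i</sub>*, *x<sub>j</sub>*) = exp(−‖*x<sub>i</sub>* − *x<sub>j</sub>*‖²/200) comes from the text.

```python
import numpy as np

def similarity_matrix(src: np.ndarray, tgt: np.ndarray) -> np.ndarray:
    """G(x_i, x_j) = exp(-||x_i - x_j||^2 / 200) between every
    source feature x_i and target feature x_j, as used for Figure 7."""
    # Pairwise squared Euclidean distances via broadcasting.
    d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / 200.0)

rng = np.random.default_rng(0)
src = rng.standard_normal((6, 128))   # 6 source samples (dims assumed)
tgt = rng.standard_normal((4, 128))   # 4 target samples
G = similarity_matrix(src, tgt)
print(G.shape)  # (6, 4)
```

Plotting `G` as a heatmap, with source samples on the abscissa and target samples on the ordinate, reproduces the layout of Figure 7.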

**Figure 6.** *t*-SNE visualization results of (**a**) Basic, (**b**) MKMMD, (**c**) DSAN, (**d**) ETN, and (**e**) WSAN in task B3. The samples circled in red are outlier types.

**Figure 7.** Similarity matrices of the learned features: (**a**) true label correspondence, (**b**–**e**) the comparison methods, and (**f**) the proposed method. The abscissa and ordinate represent the source sample sequence and target sample sequence, respectively. The depth of the color indicates the similarity between the corresponding samples.

### **5. Conclusions**

A weighted subdomain adaptation network (WSAN) is presented to solve the partial transfer fault diagnosis problem of machinery. Different from previous global domain adaptation approaches, we divide all samples into different subdomains according to the sample types of the source domain and design WLMMD to perform accurate subdomain alignment. In addition, to obtain class-level weights, an additional auxiliary classifier is set up to conduct adversarial training with the feature generator. Under the guidance of the class-level weights, the prediction probabilities output by the classifier for the target domain are used as the sample-level weights, so that the model can capture fine-grained transferable information within the relevant subdomains. The optimal layer combination was found by exploring the performance of the deep features in different activation layers participating in the subdomain adaptation; the best diagnostic performance is obtained with the combination of the fully connected layers (L1 + L2) with dimensions 128 and 256. Experimental results on the bearing and gear datasets collected in our laboratory indicate that the average accuracy of the proposed method on the designed fault diagnosis tasks is 97.7%, which is higher than that of several comparison methods. This means that WSAN can solve the partial transfer fault diagnosis problem more efficiently than several popular methods. The *t*-SNE dimension reduction and similarity matrix results show that WSAN can learn accurate weights and carry out accurate weighted subdomain adaptation.

Although the proposed weighted subdomain adaptation approach achieves superior performance on the partial transfer fault diagnosis tasks, it works on the premise that the target data are available during training. It is difficult to guarantee the performance of such a model under unknown working conditions, and such approaches may fail when real-time diagnosis is needed. However, this problem may be solved with the help of domain generalization technology [34], and we will explore this issue in depth in our future work.

**Author Contributions:** Writing—review and editing, S.J.; visualization, J.W.; supervision, X.Z.; project administration, B.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** The research was supported by the National Natural Science Foundation of China (52005303), the Natural Science Foundation of Shandong Province (ZR201911100329), the Project of China Postdoctoral Science Foundation (2019M662399), and the Postdoctoral Innovation Project of Shandong Province (202003029).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are available on request from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.
