4.2.1. CWRU Dataset Analysis
In this section, experimental scenarios are designed using the CWRU bearing dataset. To divide the data into samples, a sliding window of length 400 is used to segment the bearing signals. To meet the input requirements of the CNN, each segment is then reshaped into a 20 × 20 two-dimensional array. After the sample dataset is constructed, multiple experimental scenarios are set up to validate the performance of the models. The comparison methods are described in Table 5.
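The segmentation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `segment_signal` and the `stride` parameter are assumptions, since the text specifies only the window length (400) and the 20 × 20 target shape, not the step between consecutive windows.

```python
import numpy as np

def segment_signal(signal, window=400, stride=400, side=20):
    """Cut a 1-D vibration signal into windows and reshape each to side x side.

    `stride` is a hypothetical parameter; the source gives only the window length.
    """
    if window != side * side:
        raise ValueError("window length must equal side * side")
    signal = np.asarray(signal, dtype=np.float32)
    # Number of complete windows that fit in the signal.
    n = (len(signal) - window) // stride + 1
    if n <= 0:
        return np.empty((0, side, side), dtype=np.float32)
    starts = np.arange(n) * stride
    windows = np.stack([signal[s:s + window] for s in starts])
    # Each row of 400 points becomes one 20 x 20 "image" for the CNN.
    return windows.reshape(n, side, side)

# Toy signal of 2000 points with 50% overlap between windows.
samples = segment_signal(np.arange(2000), stride=200)
print(samples.shape)  # (9, 20, 20)
```

A stride smaller than the window length produces overlapping samples, which increases the sample count from a fixed-length recording at the cost of correlation between neighbouring samples.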
The experimental scenarios are listed in Table 6. In Experimental Scenario 1, the number of labeled samples is 4 × 10, while the number of unlabeled samples is 4000, 6000, and 8000, with 50, 100, and 200 unlabeled samples screened, respectively. In Experimental Scenario 2, the number of labeled samples is reduced to 4 × 5, while the number of unlabeled samples and the number of screened unlabeled samples remain the same as in Scenario 1.
First, the performance of the models is tested in Experimental Scenario 1. The CNN-based method serves as a baseline that does not use any unlabeled sample information. Subsequently, by adjusting the number of labeled samples and gradually incorporating unlabeled sample information, we investigate the performance trends of the π-model and the VAE-SSL model. Furthermore, comparative experiments are conducted to show how the number of selected unlabeled samples affects the CNN-SSL, AUS-SSL, and ACUS-SSL models under unlabeled datasets of different sizes. The model architecture parameters of the proposed method are presented in Table 7.
Under Experimental Scenario 1, the diagnostic results of the comparison methods and the proposed method are presented in Table 8. As Table 8 shows, it is difficult to obtain an effective diagnostic model by training a CNN with only 10 labeled samples per category and no unlabeled samples. However, comparing the results in the second to fourth rows shows that, with the number of labeled samples held constant, increasing the number of unlabeled samples improves the diagnostic accuracy of the semi-supervised models. This is because a large number of unlabeled samples provides rich fault feature information for these models, thereby making their feature extraction more reliable.
Comparing the experimental results in the fourth and fifth columns of the second row shows that the π-model, by introducing a feature consistency loss as a regularization term, can effectively extract fault-related feature information from unlabeled samples, and therefore outperforms the CNN model trained solely on labeled samples. The CNN-SSL model adopts a different strategy from the π-model: it applies a screening mechanism to the predictions on unlabeled samples to obtain relatively reliable pseudo-labeled data, which leads to a certain improvement in diagnostic accuracy. However, the CNN-SSL model cannot guarantee that the screened pseudo-labels are fully reliable. Moreover, even when relatively reliable pseudo-labels are obtained through a strict screening mechanism, the strictness of the screening conditions makes it difficult to obtain a large number of pseudo-labels, which in turn limits the reliability of the model.
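The trade-off between pseudo-label reliability and pseudo-label quantity can be illustrated with a small sketch. It assumes that screening is done by thresholding the maximum predicted class probability, which is a common self-training criterion but is not confirmed by the text as the criterion used in CNN-SSL; the function name `screen_pseudo_labels` and the probability values are hypothetical.

```python
import numpy as np

def screen_pseudo_labels(probs, threshold=0.95):
    """Keep unlabeled samples whose top predicted probability passes a threshold.

    Confidence thresholding is an assumed screening rule, used for illustration.
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)
    keep = np.flatnonzero(confidence >= threshold)
    # The pseudo-label is the predicted class of each retained sample.
    return keep, probs[keep].argmax(axis=1)

# Hypothetical softmax outputs for four unlabeled samples and four classes.
probs = np.array([
    [0.97, 0.01, 0.01, 0.01],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.90, 0.03, 0.02],
    [0.25, 0.25, 0.25, 0.25],
])

for t in (0.5, 0.95):
    idx, labels = screen_pseudo_labels(probs, t)
    print(t, idx.tolist(), labels.tolist())
# 0.5 [0, 2] [0, 1]
# 0.95 [0] [0]
```

Raising the threshold from 0.5 to 0.95 halves the number of retained samples in this toy case, which mirrors the point made above: stricter screening yields more trustworthy but fewer pseudo-labels.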
Furthermore, comparing the experimental results in the fifth and sixth columns of the second row reveals that the diagnostic accuracy of VAE-SSL is higher than that of CNN-SSL. This is because VAE-SSL employs a generative semi-supervised strategy, which balances the data distribution of the different categories during training and thereby reduces the impact of data imbalance on diagnostic accuracy. This capability enables VAE-SSL to extract and represent fault features more accurately than CNN-SSL. However, comparing the sixth and seventh columns of the second row shows that VAE-SSL performs worse than AUS-SSL. This is because the information-based screening mechanism in the AUS-SSL model can effectively select, from the unlabeled samples, those that are rich in classification information. The selected samples are then passed to an expert system, which provides fully reliable labels for them. Incorporating this reliable label information into model training improves the diagnostic performance of the model.
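The information-based screening step can be sketched as follows. The sketch assumes predictive entropy as the information measure, which is a standard choice in active learning; the text does not state which measure AUS-SSL actually uses, and the function name `select_most_informative` and the probability values are hypothetical.

```python
import numpy as np

def select_most_informative(probs, k):
    """Return the indices of the k samples with the highest predictive entropy.

    Entropy is an assumed stand-in for the paper's information criterion.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # avoids log(0) for one-hot predictions
    entropy = -(probs * np.log(probs + eps)).sum(axis=1)
    order = np.argsort(-entropy, kind="stable")
    return order[:k], entropy

# Hypothetical softmax outputs for four unlabeled samples and four classes.
probs = np.array([
    [0.97, 0.01, 0.01, 0.01],
    [0.40, 0.30, 0.20, 0.10],
    [0.05, 0.90, 0.03, 0.02],
    [0.25, 0.25, 0.25, 0.25],
])

selected, entropy = select_most_informative(probs, k=2)
print(selected.tolist())  # [3, 1]
```

In contrast to confidence-based pseudo-labeling, this rule selects the samples on which the model is least certain; their labels are then supplied by the expert system rather than by the model's own prediction.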
Comparing the experimental results in the seventh and eighth columns of the second row, it is evident that the proposed ACUS-SSL method achieves the highest diagnostic accuracy. Although AUS-SSL ensures the reliability of the screened unlabeled information, it ignores the class imbalance of that information, which biases the model towards the majority class and compromises its ability to recognize minority-class samples. In contrast, the proposed ACUS-SSL introduces an imbalance-information-driven cost-sensitive strategy, which ensures the reliability of the model and yields satisfactory results. Specifically, with only 50 unlabeled samples screened at an imbalance ratio of 1:3:4:2, the model achieves a diagnostic accuracy of 93.75%, an improvement of 13.75% over the AUS-SSL model.
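To make the imbalance concrete, 50 screened samples at a ratio of 1:3:4:2 correspond to 5, 15, 20, and 10 samples per class. The sketch below derives these counts and shows one simple way to turn them into class weights for a cost-sensitive loss. Inverse-frequency weighting is an assumption chosen for illustration; it is not claimed to be the cost-sensitive function defined for ACUS-SSL, and all function names are hypothetical.

```python
import numpy as np

def counts_from_ratio(total, ratio):
    """Split `total` samples across classes according to an integer ratio."""
    ratio = np.asarray(ratio, dtype=int)
    if (total * ratio % ratio.sum()).any():
        raise ValueError("total does not split evenly at this ratio")
    return total * ratio // ratio.sum()

def inverse_frequency_weights(counts):
    """Assumed weighting rule: weight_c = N / (C * n_c), so rare classes cost more."""
    counts = np.asarray(counts, dtype=float)
    return counts.sum() / (len(counts) * counts)

def weighted_cross_entropy(probs, labels, weights):
    """Class-weighted cross-entropy, normalised by the sum of the applied weights."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    w = np.asarray(weights, dtype=float)[labels]
    picked = probs[np.arange(len(labels)), labels]
    return float((w * -np.log(picked + 1e-12)).sum() / w.sum())

counts = counts_from_ratio(50, [1, 3, 4, 2])
weights = inverse_frequency_weights(counts)
print(counts.tolist())  # [5, 15, 20, 10]
print(weights.round(3).tolist())  # [2.5, 0.833, 0.625, 1.25]
```

Under this weighting, an error on the smallest class (5 samples) contributes four times as much to the loss as an error on the largest class (20 samples), which counteracts the bias towards the majority class described above.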
Additionally, the results of Experiments 2 and 3 show that as the number of screened unlabeled samples increases, the imbalance ratio of the samples also rises. Nevertheless, the proposed ACUS-SSL method still overcomes this challenge and achieves high diagnostic accuracy.
The confusion matrix for Experiment 1 is illustrated in Figure 6. Figure 6 shows that the proposed ACUS-SSL method achieves precise identification of each fault category, with higher fault discrimination accuracy than the comparative methods. This is attributed to its reliable screening mechanism for unlabeled sample information, which accurately selects high-quality unlabeled data to augment the original labeled training dataset, thereby enhancing the model’s diagnostic capability. Additionally, by constructing an imbalance-driven cost-sensitive function, the method addresses the potential information imbalance encountered during unlabeled sample screening, preserving recognition accuracy for minority fault categories while maintaining overall fault diagnosis performance.
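For readers reproducing this kind of analysis, the following sketch shows how a confusion matrix and per-class recall can be computed from predictions. The labels below are made-up toy values for four classes, not results from the paper, and the function names are hypothetical.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def per_class_recall(cm):
    """Fraction of each true class that is correctly identified."""
    totals = cm.sum(axis=1)
    return np.divide(np.diag(cm), totals,
                     out=np.zeros(len(cm)), where=totals > 0)

# Toy predictions (hypothetical): classes 1 and 3 are each confused once.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 0, 2, 2, 3, 1]

cm = confusion_matrix(y_true, y_pred, n_classes=4)
print(cm.tolist())  # [[2, 0, 0, 0], [1, 1, 0, 0], [0, 0, 2, 0], [0, 1, 0, 1]]
print(per_class_recall(cm).tolist())  # [1.0, 0.5, 1.0, 0.5]
print(np.trace(cm) / cm.sum())  # 0.75
```

Per-class recall is the quantity to inspect when checking minority-class behaviour, because overall accuracy can remain high even when a rare fault category is frequently misclassified.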
To validate the diagnostic capability of the proposed method under extreme conditions, especially when only a very limited number of original labeled samples is available, we designed Experiments 4 to 6 in Scenario 2. The experimental results are presented in Table 9.
As Table 9 shows, the performance of each diagnostic model decreases as the number of labeled samples decreases. This is because the reliability of supervised models trained with fewer labeled samples is greatly compromised, which in turn affects the credibility of the unlabeled data features extracted by these models. Specifically, comparing the data in Row 2 of Table 8 and Table 9 shows that the unreliability of the supervised models directly reduces the effectiveness of unlabeled samples in enhancing model performance, thereby degrading the performance of the different semi-supervised models. Furthermore, comparing the results of Experiment 5 and Experiment 6 shows that, provided the unlabeled sample selection is reliable, incorporating the selected information-rich unlabeled samples into the training of supervised models can significantly improve the accuracy of sample recognition. Moreover, the proposed ACUS-SSL method consistently outperforms the comparative methods in diagnosing the various faults. This demonstrates the effectiveness of the dynamic optimization strategy for supervised models and unlabeled sample feature extraction capabilities designed in Section 3.3. It also underscores the superiority of the proposed method in screening unlabeled sample class information and handling imbalanced data scenarios.
The confusion matrix for Experiment 4 is shown in Figure 7. Even with an extremely limited number of labeled samples, which results in a relatively unreliable supervised model, the proposed ACUS-SSL method is still able to accurately identify each type of fault. This demonstrates the robustness of the ACUS-SSL method, namely, its ability to maintain high diagnostic accuracy with very few labeled samples. This provides a new solution to the scarcity of labeled samples and the model reliability issues encountered in practical applications.
4.2.2. SMU Dataset Analysis
The SMU bearing dataset differs significantly from the CWRU dataset in terms of experimental design, fault types, and data sampling. In particular, the SMU bearing data not only covers multiple fault types of bearings but also includes varying degrees of damage, providing a more comprehensive testing platform for the validation of various diagnostic algorithms.
First, for the vibration data of bearings monitored under different health states, a sliding window with a length of 400 is used to extract samples. Second, to meet the input requirements of the CNN, each extracted sample is transformed into 20 × 20 two-dimensional data. Finally, multiple experimental scenarios with the same number of labeled samples but different numbers of unlabeled samples are constructed to verify the performance of the diagnostic models. Specifically, in Experimental Scenario 3, the number of labeled samples is 10 × 10, and the number of unlabeled samples is 10,000, 15,000, and 20,000, with 50, 100, and 200 unlabeled samples selected, respectively. In Experimental Scenario 4, the number of labeled samples is reduced to 10 × 5, while the number of unlabeled samples and the number of selected unlabeled samples remain the same as in Scenario 3. The detailed experimental scenario settings are shown in Table 10.
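The scenario settings stated in the text for both datasets can be collected into a small configuration sketch. Two points are assumptions rather than facts from the source: that the unlabeled-set sizes and the selected-sample counts are paired in order (4000 with 50, 6000 with 100, and so on), and that "10 × 10" for the SMU data means 10 classes with 10 labeled samples each, by analogy with the CWRU setting. The `Experiment` class and `build_experiments` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Experiment:
    number: int
    scenario: int
    dataset: str
    n_classes: int          # assumed reading of the first factor in "C x n"
    labeled_per_class: int
    n_unlabeled: int
    n_selected: int

    @property
    def n_labeled(self):
        return self.n_classes * self.labeled_per_class

def build_experiments():
    # (scenario, dataset, classes, labeled per class, unlabeled-set sizes)
    scenarios = [
        (1, "CWRU", 4, 10, (4000, 6000, 8000)),
        (2, "CWRU", 4, 5, (4000, 6000, 8000)),
        (3, "SMU", 10, 10, (10000, 15000, 20000)),
        (4, "SMU", 10, 5, (10000, 15000, 20000)),
    ]
    selected = (50, 100, 200)
    experiments = []
    for scenario, dataset, n_classes, per_class, unlabeled in scenarios:
        # Assumed pairing: i-th unlabeled size goes with i-th selected count.
        for n_unlabeled, n_selected in zip(unlabeled, selected):
            experiments.append(Experiment(len(experiments) + 1, scenario, dataset,
                                          n_classes, per_class, n_unlabeled, n_selected))
    return experiments

experiments = build_experiments()
print(len(experiments))  # 12
print(experiments[6])
```

Enumerating the settings this way gives three experiments per scenario, which matches the numbering used in the text (Experiments 4 to 6 in Scenario 2, 7 to 9 in Scenario 3, and 10 to 12 in Scenario 4).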
The detailed test results for Experiments 7 to 9 in Scenario 3 are listed in Table 11. Table 11 shows that traditional CNNs struggle to build a reliable fault diagnosis model when only a very limited number of labeled samples is available. The reason is that when the diagnostic task requires not only accurate differentiation of fault types but also further subclassification of fault severity, a large amount of sample data is needed to support the learning of clear decision boundaries. Comparing the diagnostic results of CNN and the π-model, it can be seen that when the selected unlabeled samples are added, the π-model enhances the model’s feature extraction capability by utilizing the feature information of unlabeled data, resulting in higher fault diagnosis accuracy than CNN. Furthermore, comparing the diagnostic results of the π-model with those of CNN-SSL, VAE-SSL, and AUS-SSL, it is apparent that these methods outperform the π-model. This is attributed to their use of self-training to assign pseudo-labels to unlabeled data, thereby enriching the sample feature information required for supervised model training and enhancing the model’s diagnostic performance. However, despite these improvements, the accuracy of fault diagnosis is still not ideal. This is primarily due to class imbalance in the selected unlabeled samples, which adversely affects the learning direction of the model parameters and thus constrains further improvement of the model’s fault diagnosis performance. The proposed ACUS-SSL method uses a cost-sensitive function designed for imbalanced information to guide and correct the learning direction of the model parameters, thereby significantly enhancing the model’s fault recognition performance. Therefore, ACUS-SSL maintains excellent diagnostic capability even when faced with imbalanced sample data.
Finally, comparing the overall results of Experiments 7 to 9 shows that after a large number of unlabeled samples is introduced to enhance the models’ feature capture capabilities, the π-model, CNN-SSL, VAE-SSL, and AUS-SSL all improve in fault diagnosis accuracy. This underscores the importance of unlabeled samples in strengthening the models’ fault feature extraction and recognition capabilities. By effectively leveraging these unlabeled samples, these methods not only enrich the training data but also enhance the models’ sensitivity and precision in detecting intricate fault features. Furthermore, the proposed ACUS-SSL method consistently exhibits the highest diagnostic accuracy. This is because, when facing complex diagnostic tasks, the proposed method on the one hand ensures the reliability of the label information of the selected unlabeled samples. On the other hand, by constructing a cost-sensitive function driven by data imbalance, it addresses the information imbalance in unlabeled sample selection, thereby adaptively adjusting the weights of the different classes during model training and improving the model’s recognition ability for minority or difficult-to-classify categories.
Figure 8 presents the confusion matrices of the diagnostic models in Experiment 7. As observed from Figure 8, the proposed ACUS-SSL method achieves precise identification of each fault category in complex fault diagnosis tasks.
To validate the applicability of the proposed method in scenarios with fewer labeled samples, we designed Experiments 10 to 12 in Scenario 4 for comparison with Experiments 7 to 9. The diagnostic results of each method are presented in Table 12. As can be seen from Table 12, even with an extremely limited number of labeled samples, the proposed method still outperforms the comparison methods. This advantage not only verifies the strong decision-making and recognition capabilities of the proposed method but also highlights its ability to accurately mine distinguishable fault features from limited samples, which is consistent with the conclusions drawn from Table 9.
Furthermore, comparing the results of all 12 experiments on the two datasets, we can conclude that the proposed method exhibits better diagnostic performance than the other methods, demonstrating a strong advantage in complex diagnostic tasks.
Figure 9 displays the confusion matrices of the diagnostic models in Experiment 10. It is evident from Figure 9 that, even with extremely limited labeled samples, the proposed method can accurately obtain high-quality unlabeled data by selecting informative unlabeled samples. Additionally, through the designed cost-sensitive strategy, the model can accurately capture fault features in an imbalanced data environment, thereby achieving precise identification of each fault category.
Furthermore, to highlight the performance of the proposed method across the different experimental scenarios, Figure 10 compares the accuracy of the diagnostic models in all 12 experiments. The figure shows that the diagnostic performance of the proposed method is superior to that of the comparative methods in every experiment. Therefore, when labeled data is relatively scarce, the proposed method can efficiently utilize a large amount of unlabeled data, thereby enhancing the fault diagnosis performance of the model. The proposed method not only reduces the dependence on a large amount of labeled data but also significantly improves the accuracy and reliability of fault diagnosis. This provides a novel approach for addressing the challenges of scarce labeled samples and imbalanced semi-supervised fault diagnosis in practical industrial applications.