Domain Adversarial Transfer Learning Bearing Fault Diagnosis Model Incorporating Structural Adjustment Modules
Abstract
1. Introduction
- (1) There are no publicly available large-scale bearing fault datasets suitable for training deep neural networks, so transfer learning performs poorly on complex multi-fault problems.
- (2) Finding the optimal hyperparameters for a deep learning model consumes a great deal of developer time and effort.
- (3) Existing AutoML platforms and tools generally demand substantial computational resources when performing Neural Architecture Search (NAS).
- (1) Applying pre-trained models to the target domain via transfer learning avoids using AutoML for Neural Architecture Search (NAS), reduces the computational resources required during model development, and significantly shortens the development cycle compared with building models from scratch.
- (2) Compared with the fine-tuning strategy [36] commonly used in transfer learning, this paper embeds a module that dynamically adjusts the depth and width of the neural network, which substantially improves the model’s ability to classify complex faults and effectively mitigates the current lack of large public bearing fault datasets.
- (3) Introducing Optuna to optimize the model’s hyperparameters means specialists no longer need to spend large amounts of time on manual tuning, which lowers the barrier to model development and use and helps promote intelligent methods across industries.
2. Theoretical Foundation
2.1. Adversarial Domain Adaptation
2.1.1. Basic Idea
2.1.2. Processes for Adversarial Domain Adaptation
- Feature extractor: extracts features from the source- and target-domain data; usually implemented with a deep neural network (e.g., a convolutional network or a fully connected network).
- Discriminator: determines whether the input features come from the source or the target domain. Ideally, after training, the discriminator can no longer distinguish source-domain features from target-domain features.
- Classifier: performs the classification task (e.g., image classification, text classification) on the extracted features.
- Adversarial training: optimizes the feature extractor and the discriminator by backpropagation. The discriminator tries to separate source-domain data from target-domain data as well as possible, while the feature extractor (together with the classifier) tries to make that discrimination fail through the adversarial process.
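The adversarial process above is commonly implemented with a gradient reversal layer (GRL), as in DANN: the forward pass is the identity, and the backward pass flips the sign of the gradient so that minimizing the domain loss trains the discriminator while pushing the feature extractor to confuse it. A minimal PyTorch sketch (the paper's exact implementation is not given; `lambd` is an assumed trade-off weight):

```python
import torch
from torch.autograd import Function

class GradReverse(Function):
    """Gradient reversal layer: identity in the forward pass,
    gradient multiplied by -lambd in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed, scaled gradient for x; None for the non-tensor lambd.
        return grad_output.neg() * ctx.lambd, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)
```

A hypothetical call site would be `domain_logits = discriminator(grad_reverse(features, lambd))`: one backward pass then updates the discriminator normally while updating the feature extractor in the opposite direction.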
2.1.3. Principles and Formulas: An Example of Domain Adversarial Neural Network (DANN)
1. Basic settings and symbol definitions
2. Classification loss
3. Adversarial loss
4. Adversarial training objective
5. Joint optimization
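The five pieces above combine into a single saddle-point objective. As a sketch in the standard notation of Ganin et al.'s DANN (the paper's own symbols may differ slightly): $G_f$, $G_y$, $G_d$ are the feature extractor, label predictor, and domain discriminator with parameters $\theta_f$, $\theta_y$, $\theta_d$, and $\lambda$ weights the adversarial term:

```latex
E(\theta_f,\theta_y,\theta_d)
  = \frac{1}{n}\sum_{i=1}^{n} L_y\!\bigl(G_y(G_f(x_i;\theta_f);\theta_y),\,y_i\bigr)
  - \lambda \sum_{i} L_d\!\bigl(G_d(G_f(x_i;\theta_f);\theta_d),\,d_i\bigr)

(\hat\theta_f,\hat\theta_y) = \arg\min_{\theta_f,\theta_y} E(\theta_f,\theta_y,\hat\theta_d),
\qquad
\hat\theta_d = \arg\max_{\theta_d} E(\hat\theta_f,\hat\theta_y,\theta_d)
```

Minimizing over $(\theta_f,\theta_y)$ while maximizing over $\theta_d$ realizes the joint optimization: the classification loss is driven down while the domain discriminator is made as ineffective as possible.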
3. Proposed Methodology
3.1. Adversarial Domain Adaptive Transfer Learning
3.1.1. Source Domain Model Selection and Training
1. Data preprocessing
2. Building the CNN model
3. Model training
4. Model evaluation and testing
3.1.2. Domain Adaptive Migration of Pre-Trained Models
1. Forward propagation of the pre-trained model
2. Loss computation
3. Backpropagation and optimization
4. Training and validation
5. Updating the best metrics
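One way to realize these steps is a GAN-style alternating loop: each batch first updates the domain discriminator on detached features, then updates the feature extractor and classifier with a weighted sum of the task loss and a "fool the discriminator" loss. This is a sketch under assumptions; the paper uses a DANN-style setup, and the loaders, weighting, and schedule here are illustrative:

```python
import torch
import torch.nn as nn

def adapt_epoch(feat, clf, disc, opt_fc, opt_d, src_loader, tgt_loader, lambd=1.0):
    """One epoch of adversarial domain adaptation (illustrative schedule)."""
    ce, bce = nn.CrossEntropyLoss(), nn.BCEWithLogitsLoss()
    for (xs, ys), (xt,) in zip(src_loader, tgt_loader):
        fs, ft = feat(xs), feat(xt)
        # Step 1-2: discriminator learns to label source (1) vs. target (0)
        # on detached features, so only disc is updated here.
        d_in = torch.cat([fs.detach(), ft.detach()])
        d_lab = torch.cat([torch.ones(len(fs), 1), torch.zeros(len(ft), 1)])
        opt_d.zero_grad()
        bce(disc(d_in), d_lab).backward()
        opt_d.step()
        # Step 3: extractor + classifier minimize task loss while pushing
        # target features toward the "source" label to fool the discriminator.
        opt_fc.zero_grad()
        loss = ce(clf(fs), ys) + lambd * bce(disc(ft), torch.ones(len(ft), 1))
        loss.backward()
        opt_fc.step()
    return loss.item()
```

Steps 4 and 5 wrap this in an outer loop: validate after each epoch and keep a copy of the model whenever the validation metric improves.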
3.2. Structural Adjustment Module Incorporating the Optuna Hyperparameter Optimization Algorithm
3.2.1. Defining the Network Structure
3.2.2. Setting Up the Optuna Optimization Space
3.2.3. Depth and Width Adjustment
3.2.4. Freezing and Unfreezing Layers
3.2.5. Training and Assessment
3.3. Hyperparameter Optimization of the Model
3.3.1. Defining the Objective Function
3.3.2. Defining the Hyperparameter Search Space
3.3.3. Creating an Optuna Study
3.3.4. Implementing the Optimization Process
3.3.5. Recording and Selecting Optimal Hyperparameters
4. Experimental Verification
4.1. Description of the Dataset
- (1) Suitability for cross-domain learning: the CWRU dataset, as the source domain, is well standardized and consistent and can be used to train a base model, whereas the Paderborn dataset, as the target domain, is more challenging and complex, which matches the aim of this paper.
- (2) Adaptability to real-world environments: through cross-domain learning, the model learns generalizable feature-extraction methods from the clean, standardized CWRU data and then transfers to the noisy, complex Paderborn data, so it can better cope with the variations and disturbances of real industrial environments.
- (3) Improved robustness: training the base model on the source domain and then transferring it to the target domain for tuning enhances the model's robustness to the complex environments, noise, and disturbances encountered in real applications.
Feature | CWRU Dataset | Paderborn Dataset |
---|---|---|
Fault Type | Single faults | Compound faults |
Fault Creation | Artificial EDM machining | Artificial machining + natural wear |
Operating Conditions | Fixed load | Dynamic loads, variable speed operation |
Signal Types | Vibration signals | Vibration + AE + current signals |
Data Scale | Small | Large |
Authenticity | Lab environment | Industrial-like scenarios |
Application | Basic algorithm validation | Testing in complex scenarios |
4.1.1. Introduction to CWRU Bearing Dataset
- (1) The fault types include inner-ring faults, outer-ring faults, rolling-element faults, and compound faults (simultaneous inner- and outer-ring damage). This paper uses the inner-ring and outer-ring fault data.
- (2) Experimental setup: data were acquired under four load settings (0, 1/3, 2/3, and 1× the rated load), each with a corresponding rotational speed (1797, 1772, 1750, and 1730 rpm).
- (3) Signal type: mainly vibration signals, acquired along the X-, Y-, and Z-axes at a sampling frequency of 12 kHz. The acquisition environment is relatively simple, with little interference.
4.1.2. Introduction to Paderborn University Bearing Dataset
- (1) The fault types include inner-ring faults, outer-ring faults, and mixed inner- and outer-ring faults, produced by a combination of artificial machining and natural run-to-failure wear. These fault types are used in the experiments.
- (2) Experimental setup: the dataset uses dynamic-load and variable-speed operating conditions, so the load variations are closer to those found in industrial environments. The signals are collected by multiple sensors, including acoustic emission (AE) and current sensors in addition to accelerometers.
- (3) Signal types: vibration signals (X- and Y-axes), acoustic emission (AE) signals, and current signals, typically sampled at 25.6 kHz or higher. Only the vibration signals are studied in this paper.
4.2. Experimental Setup
- (1) To show that the proposed method consumes fewer computational resources than an AutoML tool alone, it is trained alongside AutoGluon and Auto-Keras under the same conditions, and their GPU running times are compared;
- (2) To illustrate the need for model transfer and hyperparameter optimization, the proposed method is compared with an RNN and a CNN;
- (3) To demonstrate that adding the dynamic depth-and-width adjustment module promotes positive transfer of the pre-trained model and improves complex-fault classification, the proposed method is compared with DANN, MMD, and DAN.
4.3. Analysis of Experimental Results
- (1) Comparing with methods 2 and 3, the accuracy of the proposed method exceeds that of AutoGluon and Auto-Keras, indicating that it automatically optimizes the model's hyperparameters while achieving higher fault-diagnosis accuracy than these state-of-the-art AutoML tools. This is because, when the target-domain task is related to the source-domain task, the proposed method builds the model with transfer learning combined with the structural adjustment module, which is more advantageous than the model-construction approaches of AutoGluon and Auto-Keras.
- (2) As Figure 9 shows, the GPU time of the proposed method is lower than those of AutoGluon and Auto-Keras, which partly reflects its smaller demand for computational resources. Transfer learning applies a pre-trained model to the target domain, which sharply reduces the search space and thus accelerates the NAS process.
- (3) Comparing methods 4 and 7, the diagnostic accuracy of DANN is slightly higher than that of the CNN, indicating that adversarial domain-adaptive transfer learning effectively improves the model's cross-domain diagnosis and promotes positive transfer.
- (4) Comparing methods 1, 4, 5, and 6 shows that the classification accuracy and F1 score of the proposed method are better than those of the transfer-learning baselines, indicating that the structural adjustment module effectively improves the pre-trained model's adaptability to a complex fault dataset.
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. 2018, 108, 33–47.
- Zhang, S.; Zhang, S.; Wang, B.; Habetler, T.G. Deep Learning Algorithms for Bearing Fault Diagnostics—A Review. In Proceedings of the 2019 IEEE 12th International Symposium on Diagnostics for Electrical Machines, Power Electronics and Drives (SDEMPED), Toulouse, France, 27–30 August 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 257–263.
- Lei, Y.; Yang, B.; Jiang, X.; Jia, F.; Li, N.; Nandi, A.K. Applications of machine learning to machine fault diagnosis: A review and roadmap. Mech. Syst. Signal Process. 2020, 138, 106587.
- Lee, D.; Sabater, A. Visibility Graphs, Persistent Homology, and Rolling Element Bearing Fault Detection. In Proceedings of the 2022 IEEE International Conference on Prognostics and Health Management (ICPHM), Detroit (Romulus), MI, USA, 6–8 June 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–5.
- Shinde, P.V.; Desavale, R.G.; Jadhav, P.M.; Sawant, S.H. A multi fault classification in a rotor-bearing system using machine learning approach. J. Braz. Soc. Mech. Sci. Eng. 2023, 45, 121.
- Phan, Q.N.X.; Le, T.M.; Tran, H.M.; Van Tran, L.; Dao, S.V.T. Novel Machine Learning Techniques for Classification of Rolling Bearings. IEEE Access 2024, 12, 176863–176879.
- He, M. Development of Deep Learning Based Methodology on Rotating Machine Fault Diagnosis. Ph.D. Thesis, University of Illinois Chicago, Chicago, IL, USA, 2018.
- Hoang, D.-T.; Kang, H.-J. A survey on Deep Learning based bearing fault diagnosis. Neurocomputing 2019, 335, 327–335.
- Saufi, S.R.; Ahmad, Z.A.B.; Leong, M.S.; Lim, M.H. Challenges and Opportunities of Deep Learning Models for Machinery Fault Detection and Diagnosis: A Review. IEEE Access 2019, 7, 122644–122662.
- Shan, S.; Liu, J.; Wu, S.; Shao, Y.; Li, H. A motor bearing fault voiceprint recognition method based on Mel-CNN model. Measurement 2023, 207, 112408.
- Li, R.; Yu, P.; Cao, J. Rolling bearing fault diagnosis method based on SOA-BiLSTM. In Proceedings of the 2023 7th International Conference on Electronic Information Technology and Computer Engineering, Xiamen, China, 20–22 October 2023; ACM: New York, NY, USA, 2023; pp. 41–45.
- Zhou, H.; Chen, W.; Cheng, L.; Williams, D.; De Silva, C.W.; Xia, M. Reliable and Intelligent Fault Diagnosis with Evidential VGG Neural Networks. IEEE Trans. Instrum. Meas. 2023, 72, 1–12.
- Chen, S.; Zou, S. Enhancing Bearing Fault Diagnosis with Deep Learning Model Fusion and Semantic Web Technologies. Int. J. Semantic Web Inf. Syst. 2024, 20, 1–20.
- Fan, J.; Ma, C.; Zhong, Y. A Selective Overview of Deep Learning. Stat. Sci. 2021, 36, 264–290.
- Hu, X.; Chu, L.; Pei, J.; Liu, W.; Bian, J. Model complexity of deep learning: A survey. Knowl. Inf. Syst. 2021, 63, 2585–2619.
- Fristiana, A.H.; Alfarozi, S.A.I.; Permanasari, A.E.; Pratama, M.; Wibirama, S. A Survey on Hyperparameters Optimization of Deep Learning for Time Series Classification. IEEE Access 2024, 12, 191162–191198.
- Majidi, F.; Openja, M.; Khomh, F.; Li, H. An Empirical Study on the Usage of Automated Machine Learning Tools. In Proceedings of the 2022 IEEE International Conference on Software Maintenance and Evolution (ICSME), Limassol, Cyprus, 3–7 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 59–70.
- Sun, Y.; Song, Q.; Gui, X.; Ma, F.; Wang, T. AutoML in The Wild: Obstacles, Workarounds, and Expectations. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, Hamburg, Germany, 23–28 April 2023; ACM: New York, NY, USA, 2023; pp. 1–15.
- Hanussek, M.; Blohm, M.; Kintz, M. Can AutoML outperform humans? An evaluation on popular OpenML datasets using AutoML Benchmark. In Proceedings of the 2020 2nd International Conference on Artificial Intelligence, Robotics and Control, Cairo, Egypt, 12–14 December 2020; ACM: New York, NY, USA, 2020; pp. 29–32.
- Kim, D.; Koo, J.; Kim, U.-M. A Survey on Automated Machine Learning: Problems, Methods and Frameworks. In Human-Computer Interaction. Theoretical Approaches and Design Methods; Kurosu, M., Ed.; Lecture Notes in Computer Science; Springer International Publishing: Cham, Switzerland, 2022; Volume 13302, pp. 57–70.
- Feurer, M.; Eggensperger, K.; Falkner, S.; Lindauer, M.; Hutter, F. Auto-Sklearn 2.0: Hands-free AutoML via Meta-Learning. arXiv 2020, arXiv:2007.04074.
- Neutatz, F.; Lindauer, M.; Abedjan, Z. AutoML in heavily constrained applications. VLDB J. 2024, 33, 957–979.
- Pham, H.; Guan, M.Y.; Zoph, B.; Le, Q.V.; Dean, J. Efficient Neural Architecture Search via Parameter Sharing. arXiv 2018, arXiv:1802.03268.
- He, X.; Zhao, K.; Chu, X. AutoML: A survey of the state-of-the-art. Knowl.-Based Syst. 2021, 212, 106622.
- Luo, Y.; Wang, M.; Zhou, H.; Yao, Q.; Tu, W.; Chen, Y.; Dai, W.; Yang, Q. AutoCross: Automatic Feature Crossing for Tabular Data in Real-World Applications. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019; pp. 1936–1945.
- Elsken, T.; Metzen, J.H.; Hutter, F. Neural Architecture Search: A Survey. arXiv 2019, arXiv:1808.05377.
- Lu, J.; Behbood, V.; Hao, P.; Zuo, H.; Xue, S.; Zhang, G. Transfer learning using computational intelligence: A survey. Knowl.-Based Syst. 2015, 80, 14–23.
- Zhuang, F.; Qi, Z.; Duan, K.; Xi, D.; Zhu, Y.; Zhu, H.; Xiong, H.; He, Q. A Comprehensive Survey on Transfer Learning. Proc. IEEE 2021, 109, 43–76.
- Chen, X.; Yang, R.; Xue, Y.; Huang, M.; Ferrero, R.; Wang, Z. Deep Transfer Learning for Bearing Fault Diagnosis: A Systematic Review Since 2016. IEEE Trans. Instrum. Meas. 2023, 72, 1–21.
- Tang, S.; Ma, J.; Yan, Z.; Zhu, Y.; Khoo, B.C. Deep transfer learning strategy in intelligent fault diagnosis of rotating machinery. Eng. Appl. Artif. Intell. 2024, 134, 108678.
- Wu, Z.; Jiang, H.; Zhao, K.; Li, X. An adaptive deep transfer learning method for bearing fault diagnosis. Measurement 2020, 151, 107227.
- Huo, C.; Jiang, Q.; Shen, Y.; Zhu, Q.; Zhang, Q. Enhanced transfer learning method for rolling bearing fault diagnosis based on linear superposition network. Eng. Appl. Artif. Intell. 2023, 121, 105970.
- Thuan, N.D. Robust knowledge transfer for bearing diagnosis in neural network models using multilayer maximum mean discrepancy loss function. Meas. Sci. Technol. 2024, 35, 126129.
- Zhang, Z.; Li, J.; Cai, C.; Ren, J.; Xue, Y. Bearing Fault Diagnosis Based on Image Information Fusion and Vision Transformer Transfer Learning Model. Appl. Sci. 2024, 14, 2706.
- Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; ACM: New York, NY, USA, 2019; pp. 2623–2631.
- Pan, S.J.; Yang, Q. A Survey on Transfer Learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.
- Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; Marchand, M.; Lempitsky, V. Domain-Adversarial Training of Neural Networks. In Domain Adaptation in Computer Vision Applications; Csurka, G., Ed.; Advances in Computer Vision and Pattern Recognition; Springer International Publishing: Cham, Switzerland, 2017; pp. 189–209.
- Gao, H.; Zhang, X.; Gao, X.; Li, F.; Han, H. ICoT-GAN: Integrated Convolutional Transformer GAN for Rolling Bearings Fault Diagnosis Under Limited Data Condition. IEEE Trans. Instrum. Meas. 2023, 72, 1–14.
- Li, Y.; Yang, R.; Wang, H. Unsupervised Method Based on Adversarial Domain Adaptation for Bearing Fault Diagnosis. Appl. Sci. 2023, 13, 7157.
- Jeong, S.; Kim, B.; Cha, S.; Seo, K.; Chang, H.; Lee, J.; Kim, Y.; Noh, J. Real-Time CNN Training and Compression for Neural-Enhanced Adaptive Live Streaming. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 6023–6039.
- Lessmeier, C.; Kimotho, J.K.; Zimmer, D.; Sextro, W. Condition Monitoring of Bearing Damage in Electromechanical Drive Systems by Using Motor Current Signals of Electric Motors: A Benchmark Data Set for Data-Driven Classification. PHM Soc. Eur. Conf. 2016, 3, 1.
- Smith, W.A.; Randall, R.B. Rolling element bearing diagnostics using the Case Western Reserve University data: A benchmark study. Mech. Syst. Signal Process. 2015, 64–65, 100–131.
Convolutional Layer Count | Convolutional Layer Output Channel Number | Convolution Kernel Size | Convolutional Stride | Full Connection Layer Quantity | Fully Connected Layer Neuron Count | Learning Rate | Task Weight | Domain Weight |
---|---|---|---|---|---|---|---|---|
2–10 | 16–256 | 2–64 | 1–16 | 1–3 | 16–256 | 1 × 10−5–1 × 10−2 | 0–2 | 0–1
No. | Method | Learning Rate | Batch Size | Epoch | Trial | Hardware |
---|---|---|---|---|---|---|
1 | Proposed Method | auto | 128 | 20 | 150 | GPU |
2 | AutoGluon | auto | 128 | 20 | 150 | GPU |
3 | Auto-Keras | auto | 128 | 20 | 150 | GPU |
4 | DANN | 0.001 | 128 | 20 | – | GPU |
5 | MMD | 0.001 | 128 | 20 | – | GPU |
6 | DAN | 0.001 | 128 | 20 | – | GPU |
7 | CNN | 0.001 | 128 | 20 | – | GPU |
8 | RNN | 0.001 | 128 | 20 | – | GPU |
No. | Method | Accuracy (%) |
---|---|---|
1 | Proposed Method | 98.53 ± 0.35 |
2 | AutoGluon | 88.75 ± 0.22 |
3 | Auto-Keras | 81.25 ± 0.13 |
4 | DANN | 92.47 ± 5.10 |
5 | MMD | 85.46 ± 4.05 |
6 | DAN | 86.24 ± 1.60 |
7 | CNN | 89.56 ± 1.45 |
8 | RNN | 89.23 ± 1.32 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Zhong, Z.; Xie, H.; Wang, Z.; Zhang, Z. Domain Adversarial Transfer Learning Bearing Fault Diagnosis Model Incorporating Structural Adjustment Modules. Sensors 2025, 25, 1851. https://doi.org/10.3390/s25061851