1. Introduction
Owing to its versatility, rotating machinery is widely used in many kinds of mechanical equipment and often operates in extremely complex environments. Damage to such machinery not only impedes the normal operation of the equipment, but can also cause huge economic losses and threaten personal safety. Consequently, the fault diagnosis of rotating machinery has become an urgent research topic in the field of machine health monitoring.
Generally speaking, fault diagnosis techniques for rotating machinery are classified into three types: model-based, signal-based and machine learning-based [
1,
2]. In model-based methods, a dynamic model should be developed based on the characteristics of vibration responses and the fault generation mechanism of rotating machinery. Cheng et al. [
3] established a torsional vibration model for the planetary gear system, and simulated the signals under the normal state and under different breakage modes of the sun gear. Then, a new metric named the sideband-amplitude ratio was extracted from the frequency spectrum to quantitatively evaluate the degree of damage. Park et al. [
4] constructed a lumped-parameter model of planetary gears based on the relationship between the gear mesh stiffness and the transmission errors, and verified it on the detection of planetary gear failures. Model-based fault diagnosis methods can improve the understanding of the transmission behavior and vibration responses of rotating machinery under different failure modes. However, the model building process involves many assumptions and constraints; thus, the influences of various uncertainties cannot be fully considered. Furthermore, there is still a gap between the simulation and the actual situation, which also affects the fault diagnosis results to some extent. In signal-based methods, two techniques can be used to obtain fault information for the diagnosis of rotating machinery: signal filtering or signal decomposition methods, which separate fault components from normal components, and time-frequency analysis methods, which reveal the time-varying characteristics of the frequency components of vibration signals. Vibration signals are commonly used in the field of rotating machinery diagnosis because they directly reflect the health state of the machine and carry rich information, such as the working condition [
5,
6]. Teng et al. [
7] first extracted the sensitive components of vibration signals using the traditional narrow-band filtering technique; then, cepstrum analysis and the Hilbert-Huang transform were applied to detect the obvious cracking faults; finally, the Gaussian wavelet transform and multi-scale envelope spectral demodulation were employed to detect the cracking faults in the composite fault mode. Feng et al. [
8] combined a Kalman filter with a higher-order energy demodulation operator to effectively diagnose early faults with different signature frequencies. Although fault diagnosis methods based on signal processing are intuitive, practical and physically meaningful, traditional signal decomposition and demodulation methods still have their own limitations. For example, the methods for extracting the envelope spectrum and the transient frequency spectrum components are still complex, and they rely on staff with rich engineering experience to discriminate the types of faults. Recently, the development of machine learning has pushed the fault diagnosis of rotating machinery towards intelligence. In machine learning-based methods, the fault diagnosis process is divided into two stages [
9]: the fault feature extraction stage and the health state recognition stage (i.e., fault classification). Feature extraction aims at extracting information about the health state from the raw signals, removing redundancy and facilitating the fault classification of rotating machinery. Fault classification involves the selection of an appropriate classification method to construct a fault diagnosis model for identifying faults of rotating machinery. Traditional machine learning methods, such as artificial neural networks (ANN) [10], k-nearest neighbor (KNN) [11], sparse representation-based classifier (SRC) [12, 13] and support vector machine (SVM) [14, 15], rely on manually extracted fault features, from which sensitive features are selected for training so that fault types can be identified automatically. Chen et al. [
16] utilized probabilistic neural networks to perform effective fault diagnosis of hydroelectric turbine generators. Pandya et al. [
17] presented an improved KNN algorithm based on an asymmetric proximity function to enhance the diagnosis precision of bearings. However, these methods still rely on manual feature extraction and suffer from problems such as low accuracy and poor generalization performance.
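The two-stage pipeline described above (manual feature extraction followed by a classifier) can be sketched as follows. This is a minimal illustration, not the method of any cited work: the three features (RMS, kurtosis, crest factor), the 1-nearest-neighbor rule and the synthetic sine-plus-impulse signals are all assumptions chosen for the example.

```python
import numpy as np

def time_domain_features(x):
    """Return [RMS, kurtosis, crest factor] of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    centered = x - x.mean()
    kurtosis = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2
    crest = np.max(np.abs(x)) / rms
    return np.array([rms, kurtosis, crest])

def nearest_neighbor_label(train_feats, train_labels, query_feat):
    """Classify a feature vector with a 1-nearest-neighbor rule (Euclidean)."""
    distances = np.linalg.norm(train_feats - query_feat, axis=1)
    return train_labels[int(np.argmin(distances))]

def make_signal(freq, faulty, n=2048):
    """Synthetic stand-in for a vibration record (hypothetical data)."""
    x = np.sin(2 * np.pi * freq * np.arange(n) / n)
    if faulty:
        x[::128] += 5.0  # periodic impulses mimic a localized defect
    return x

# Stage 1: manual feature extraction on a tiny labeled set (0 = healthy, 1 = faulty).
train_specs = [(10, False), (12, False), (10, True), (12, True)]
train_feats = np.array([time_domain_features(make_signal(f, flt)) for f, flt in train_specs])
train_labels = np.array([int(flt) for _, flt in train_specs])

# Stage 2: health state recognition for an unseen record.
query = make_signal(11, True)
predicted = nearest_neighbor_label(train_feats, train_labels, time_domain_features(query))
```

The quality of the result depends entirely on the features chosen by hand, which is the limitation the paragraph above points out.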
With the emergence of deep learning methods, ground-breaking solutions to the fault diagnosis problem of rotating machinery have recently become available. Deep learning-based diagnosis approaches apply hierarchical networks to learn abstract fault features layer by layer, and then place an output layer after the last extraction layer to accomplish the fault identification task. Nowadays, deep network models, such as deep neural networks (DNN), stacked autoencoders (SAE), deep belief networks (DBN) and convolutional neural networks (CNN), are broadly used in the field of fault diagnosis of rotating machinery. Jia et al. [
18] proposed a five-layer SAE model for the fault diagnosis of rotating machinery; the raw data were converted into frequency-domain data and subsequently fed into the model to implement the diagnosis task, and the effectiveness of the method was verified by experiments on rotating bearings and planetary gearboxes. Wang et al. [
19] introduced the batch normalization (BN) layer into SAE, which could address the issue of internal covariate shift while training multilayer networks. The experimental results showed that the proposed method could enhance fault identification accuracy and accelerate the convergence of the training process. Yang et al. [
20] presented a DNN-based automatic classification algorithm, which was evaluated by the sub-signals extracted in the frequency-domain. Shen et al. [
21] adopted a feature robustness-enhanced adaptive fault diagnosis network based on a deep contractive auto-encoder (CAE) and demonstrated its superiority on a gearbox dataset. Yin et al. [
22] optimized the network structure of DBN by a genetic algorithm and utilized it for the fault diagnosis of gear drive chains. Gan et al. [
23] constructed a two-layer fault diagnosis network based on DBN, which adopted a wavelet transform for data preprocessing and implemented fault type and fault degree identification respectively based on the two-layer network. Li et al. [
24] constructed a hybrid diagnosis model based on SAE and DBN, in which DBN was used to discriminate health states from the features learned by SAE. Shang et al. [
25] proposed a DBN-based fault diagnosis model for rolling bearings, which could avoid the complex structure of deep neural networks to some extent. The proposed model has the merits of easy training and good fault diagnosis capability. However, most fault diagnosis models based on deep learning theory are regular fully-connected networks with low model generalization performance, and the number of parameters grows rapidly as layers are added. Therefore, overfitting and degraded diagnosis ability still limit the practical application of the above methods.
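A rough parameter count makes this burden concrete. The sketch below counts the trainable parameters of a small fully-connected network and, for contrast, of a single 1D convolutional layer whose kernel weights are shared across all positions of the input. The layer sizes, channel count and kernel length are illustrative assumptions, not values taken from any of the cited models.

```python
def dense_params(layer_sizes):
    """Weights plus biases of a fully-connected network with the given layer widths."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of one 1D convolutional layer (independent of input length)."""
    return in_channels * out_channels * kernel_size + out_channels

# Hypothetical sizes: a 1024-point input, two hidden layers, 10 fault classes.
dense = dense_params([1024, 512, 256, 10])

# Hypothetical convolutional layer: 16 kernels of length 64 on a single-channel signal.
conv = conv1d_params(in_channels=1, out_channels=16, kernel_size=64)
```

With these assumed sizes the fully-connected network has 658,698 parameters, while the convolutional layer has 1,040, regardless of how long the input signal is.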
Compared with the fully-connected DNN, CNN’s characteristics such as sparse connection, weight sharing and pooling operations can reduce the number of parameters of the training network, and enhance the model’s stability and generalization performance. The original version of CNN is a two-dimensional structure inspired by the visual system, which has achieved remarkable results in the field of image recognition because of its ability to describe the natural 2D spatial correlations in images [
26,
27]. Nowadays, the traditional 2D-CNN algorithm has been extensively applied in the field of fault diagnosis. Wen et al. [
28] modified LeNet-5 to perform fault diagnosis for bearings and centrifugal pumps, wherein the raw time-domain signal was transformed into a two-dimensional grayscale image to train the model. Guo et al. [29] applied a time-frequency transformation to the original signal to obtain a continuous wavelet transform scalogram (CWTS), and then used a CNN to classify the fault signal directly from it. The validity and generality of the proposed method were confirmed on a rotor experiment platform. Zhao et al. [30] developed a planetary gearbox fault diagnosis strategy based on the synchrosqueezing transform (SST) and deep convolutional neural networks (DCNN), which reached a fault identification accuracy of up to 98.3%. Sun et al. [
31] converted the raw signal to a 2D image; then they carried out automatic feature extraction from the image via CNN and completed the fault classification of bearings. Inspired by the successful applications in the fields of natural language processing and speech recognition [
32,
33], the 1D-CNN algorithm was introduced for fault diagnosis only recently. Liu et al. [
34] improved the traditional LeNet-5 network, and their results showed that the improved 1D LeNet-5 network could achieve better fault diagnosis performance. Eren et al. [
35] utilized a compact adaptive 1D-CNN classifier to implement the fault diagnosis of induction motor bearings and proved the feasibility of the algorithm on a real dataset. Zhang et al. [
36] presented an end-to-end fault diagnosis approach based on 1D-CNN, and the experimental results exhibited high accuracy, even in a noisy environment. Several studies have shown that 1D-CNN has advantages over 2D-CNN in processing vibration signals for fault diagnosis. An et al. [
37] compared the performances of 2D-CNN and 1D-CNN in bearing fault diagnosis, which demonstrated that the 1D-CNN model offered better feature extraction capability than the 2D-CNN model. Jing et al. [
38] compared the fault diagnosis performances of different CNN models under three types of input data (i.e., the raw data, the spectrum data and the combined time-frequency data), and the results indicated that the 1D-CNN model is superior to the others when the spectrum data are adopted as its input.
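As an illustration of the spectrum input mentioned above, the following sketch converts a raw signal segment into a one-sided amplitude spectrum that could be fed to a 1D-CNN. The sampling rate, segment length and the two-tone test signal are assumptions made for the example; the function name `amplitude_spectrum` is hypothetical.

```python
import numpy as np

def amplitude_spectrum(signal, fs):
    """One-sided amplitude spectrum of a real 1-D signal sampled at fs Hz."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                           # remove the DC offset
    n = x.size
    spectrum = np.abs(np.fft.rfft(x)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spectrum

# Assumed acquisition settings: 12 kHz sampling, 2048-point segments.
fs = 12000
t = np.arange(2048) / fs

# Synthetic test signal with components at 1500 Hz and 3000 Hz.
x = 1.0 * np.sin(2 * np.pi * 1500 * t) + 0.3 * np.sin(2 * np.pi * 3000 * t)

freqs, spec = amplitude_spectrum(x, fs)
peak_hz = freqs[np.argmax(spec)]               # dominant frequency of the segment
```

Each 2048-point segment yields a 1025-point spectrum, so the network input is roughly half the length of the raw segment.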
It should be noted that most current studies on fault diagnosis methods for rotating machinery have been conducted on ideal data; that is, they assume that the training samples cover all working conditions of the rotating machinery. However, in real engineering situations, it is infeasible to obtain ideal data for all working conditions of rotating machinery because the working load changes constantly. Therefore, it is critical to utilize the data collected from limited conditions to establish fault diagnosis models. In other words, how to enhance the generalization capability of fault diagnosis models has emerged as one of the hot spots in practical industry applications.
To address the shortcomings of existing methods, this paper proposes a fault diagnosis method based on one-dimensional, self-normalizing convolutional neural networks (1D-SCNN) to tackle the low accuracy and poor generalization capability of fault diagnosis for rotating machinery. As shown in Figure 1, the main innovations of this paper are summarized as follows.
A fault diagnosis model based on 1D-SCNN is presented, which has a simple and compact architecture configuration with only a convolutional layer and a pooling layer. Compared with the conventional techniques, it can achieve competitive performance in terms of fault diagnosis accuracy and generalization capability.
The scaled exponential linear units (SeLU) are employed to strengthen the features of the fault signal. With the self-normalizing properties, activations can maintain normalization when propagating through layers of the network. Therefore, SeLU can maintain the stability and convergence of the network, and enhance the generalization capability of the model.
The α-dropout algorithm is introduced into the feature extractor and the classifier simultaneously, which not only restrains overfitting at the initial stage of training, but also accelerates the updating of the network’s parameters and further boosts the generalization capability of the model.
A series of experiments utilizing the Case Western Reserve University bearing dataset are conducted. The results demonstrate that the proposed method possesses good fault diagnosis accuracy and generalization capability, and provides an excellent solution for enhancing the reliability and maintainability of mechanical equipment.
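The self-normalizing behavior attributed to SeLU above can be checked numerically with a short sketch. The constants below are the standard SELU values from the self-normalizing network literature; the layer width, depth, random weight initialization (zero mean, variance 1/width) and random inputs are assumptions for illustration and do not reproduce the architecture proposed in this paper.

```python
import numpy as np

# Standard SELU constants (fixed point at zero mean and unit variance).
ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def selu(x):
    """Scaled exponential linear unit."""
    x = np.asarray(x, dtype=float)
    negative_part = ALPHA * np.expm1(np.minimum(x, 0.0))
    return SCALE * np.where(x > 0, x, negative_part)

# Propagate standardized random inputs through several randomly initialized layers.
rng = np.random.default_rng(0)
width, depth = 512, 8                      # hypothetical sizes
h = rng.standard_normal((1000, width))
for _ in range(depth):
    w = rng.standard_normal((width, width)) / np.sqrt(width)
    h = selu(h @ w)

# Without any explicit normalization layer, the activations should stay
# close to zero mean and unit variance.
final_mean, final_var = h.mean(), h.var()
```

Replacing `selu` with a rectified linear unit in the same loop would let the activation statistics drift from layer to layer, which is the effect that self-normalization is meant to avoid.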
5. Conclusions
To improve the fault identification accuracy of rotating machinery under cross-load level conditions, a 1D-SCNN-based fault diagnosis method is proposed. The method extracts features from the frequency spectrum of the raw vibration signals. By introducing the self-normalizing properties of SNN into 1D-CNN through SeLU, and applying α-dropout twice to regularize the model, the accuracy and generalization capability of the fault diagnosis model are greatly enhanced. Furthermore, the model has a simple and compact architecture with low computational complexity. The experimental results demonstrate that the proposed method not only achieves high fault identification accuracy under the same load level, but also performs well under cross-load level conditions. In actual engineering situations, it is difficult to obtain ideal data under all working conditions of rotating machinery because the speed and load level change constantly. The proposed method can use the data collected from limited working conditions to build a fault diagnosis model that identifies fault types under other working conditions, which gives it significant engineering application value.
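The dropout variant used with SeLU can be sketched as follows. This follows the usual formulation of alpha dropout from the self-normalizing network literature, in which dropped units are set to the negative saturation value of SELU and an affine correction restores zero mean and unit variance. It is an illustrative sketch rather than the exact implementation used in the experiments; the drop rate and the random test input are assumptions.

```python
import numpy as np

ALPHA = 1.6732632423543772
SCALE = 1.0507009873554805

def alpha_dropout(x, drop_rate, rng):
    """Alpha dropout: keeps zero mean and unit variance of standardized inputs."""
    keep = 1.0 - drop_rate
    alpha_p = -SCALE * ALPHA                   # value assigned to dropped units
    # Affine correction (a, b) chosen so that mean and variance are preserved.
    a = (keep + alpha_p ** 2 * keep * drop_rate) ** -0.5
    b = -a * alpha_p * drop_rate
    mask = rng.random(x.shape) < keep          # True where the unit is kept
    return a * np.where(mask, x, alpha_p) + b

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)               # standardized activations (synthetic)
y = alpha_dropout(x, drop_rate=0.2, rng=rng)   # hypothetical drop rate
out_mean, out_var = y.mean(), y.var()
```

Standard dropout sets dropped units to zero, which shifts the activation statistics away from the fixed point that SeLU relies on; the affine correction above is what keeps the two techniques compatible.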
It should be noted that the actual running conditions of rotating machinery are very complex; thus, it is impossible to obtain a training dataset that includes all conditions. Considering the diversity of real engineering situations, future work will take more validation scenarios into account to test the generalization capability of the proposed method, for example, training the model on datasets A and C and then testing it on datasets B and D. Furthermore, most time-domain features and the simple frequency spectrum are insensitive to bearing faults in actual situations. Considering the complexity of real engineering situations, we will try other features, such as the squared envelope or the squared envelope spectrum, in our future work.
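For reference, the squared envelope spectrum mentioned above can be computed along the following lines. This is a minimal sketch under stated assumptions: the analytic signal is obtained with an FFT-based Hilbert transform, and the test signal is a synthetic amplitude-modulated carrier whose modulation frequency stands in for a bearing fault frequency. The sampling rate, carrier and fault frequency are hypothetical values.

```python
import numpy as np

def analytic_signal(x):
    """Analytic signal via the FFT (zero the negative frequencies, double the positive)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(x) * gain)

def squared_envelope_spectrum(x, fs):
    """Spectrum of the squared envelope of a real signal sampled at fs Hz."""
    env_sq = np.abs(analytic_signal(x)) ** 2
    env_sq = env_sq - env_sq.mean()            # drop the DC term
    n = env_sq.size
    ses = np.abs(np.fft.rfft(env_sq)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, ses

# Synthetic example: a 3 kHz resonance modulated at an assumed fault frequency.
fs, n = 12000, 4096
t = np.arange(n) / fs
fault_hz = 36 * fs / n                         # 105.46875 Hz, aligned with an FFT bin
x = (1.0 + 0.8 * np.cos(2 * np.pi * fault_hz * t)) * np.cos(2 * np.pi * 3000.0 * t)

freqs, ses = squared_envelope_spectrum(x, fs)
peak_hz = freqs[np.argmax(ses)]                # recovers the modulation frequency
```

The raw spectrum of this signal is dominated by the 3 kHz carrier and its sidebands, whereas the squared envelope spectrum peaks directly at the modulation frequency, which is why such features are attractive for bearing diagnosis.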