Article

Small-Sample Fault Diagnosis of Axial Piston Pumps across Working Conditions, Based on 1D-SENet Model Migration

1 Hebei Provincial Key Laboratory of Heavy Machinery Fluid Power Transmission and Control, Yanshan University, Qinhuangdao 066004, China
2 Key Laboratory of Advanced Forging & Stamping Technology and Science, Yanshan University, Ministry of Education of China, Qinhuangdao 066004, China
3 School of Electrical Engineering, Yanshan University, Qinhuangdao 066004, China
* Authors to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(8), 1430; https://doi.org/10.3390/jmse12081430
Submission received: 20 July 2024 / Revised: 8 August 2024 / Accepted: 14 August 2024 / Published: 19 August 2024
(This article belongs to the Section Ocean Engineering)

Abstract

Hydraulic pumps are the core components that provide power for hydraulic transmission systems, which are widely used in aerospace, marine engineering, and mechanical engineering; their failure affects the normal operation of the entire system. This paper takes a single axial piston pump as the research object and proposes a small-sample fault diagnosis method based on a model migration strategy for situations in which only a small number of training samples are available for axial piston pump fault diagnosis. To achieve end-to-end fault diagnosis, a 1D Squeeze-and-Excitation Network (1D-SENet) model was constructed by combining a one-dimensional convolutional neural network with a channel-domain attention mechanism. The model was first pre-trained with sufficient labeled fault data from the source conditions; then, following the model migration strategy, some of the lower-layer network parameters were fixed, and a small amount of labeled fault data from the target conditions was used to fine-tune the remaining parameters of the pre-trained model. The proposed method was validated on an axial piston pump fault dataset, and the experimental results show that it effectively alleviates the overfitting problem in small-sample fault diagnosis of axial piston pumps and improves recognition accuracy.

1. Introduction

Hydraulic transmission is widely used in marine engineering machinery, mining machinery, construction machinery, and other fields by virtue of its high power density and fast response [1]. In particular, the harsh working environment of machinery and equipment specific to marine engineering imposes higher requirements on the safety, stability, and reliability of their hydraulic transmission systems. The axial piston pump, as a core component of the hydraulic transmission system, provides power for the system, and its health status directly affects the normal operation of the whole system [2,3]. Therefore, real-time condition monitoring and fault diagnosis of axial piston pumps are of great significance [4,5]. In recent years, against the background of big data, deep learning has gradually become a mainstream method in the field of fault diagnosis by virtue of its powerful feature learning capability [6].
Deep learning can be regarded as a special form of artificial neural network that improves the learning ability and performance of the model by increasing the depth of the network. Since Krizhevsky et al. achieved the best classification result in the ImageNet Large Scale Visual Recognition Challenge using CNNs in 2012 [7], CNNs have been widely used in tasks such as image classification, target detection, and semantic segmentation in computer vision by virtue of properties such as local perception and weight sharing. AlexNet, GoogLeNet, ResNet, and other classical networks with two-dimensional pictures as input have appeared successively [8,9,10]. Because classical deep learning algorithms typically require two-dimensional images as input, many scholars have combined time–frequency transform methods with 2DCNNs [11,12] for the intelligent fault diagnosis of equipment [13,14]. However, the condition signals produced during machine operation are usually one-dimensional vectors, and converting the raw signals into two-dimensional pictures may cause a certain degree of information distortion. Using a 1DCNN to process the original one-dimensional time-series signals directly not only ensures that the fault information contained in the input is not lost, but also simplifies the network structure and the fault diagnosis process, which facilitates real-time diagnosis of the equipment and improves the efficiency of fault diagnosis in practical engineering applications. Yang et al. used a 1DCNN on the pressure signals of an adjustment hydraulic servomotor to classify seven states of the servomotor [15]. Gao et al. proposed a rolling bearing fault diagnosis method based on adaptive modified complementary ensemble empirical mode decomposition (AMCEEMD) and a one-dimensional convolutional neural network (1DCNN) model, which improved fault classification accuracy through effective denoising and feature extraction [16]. Liu et al. improved the prediction accuracy of burr problems in the aluminium alloy wheel manufacturing process under limited sample conditions using a 1D-ResNet model combined with transfer learning techniques [17].
However, since one-dimensional signals may contain a variety of fault features spanning low to high frequencies, a traditional 1DCNN cannot capture this complex information over different frequency bands simultaneously and suffers from weak feature extraction capability [18,19,20]. To solve this problem, the attention mechanism is combined with the 1DCNN; it can automatically extract the information most relevant to the current task from the many features and improve the model's performance while reducing information redundancy [21]. Attention mechanisms in convolutional neural networks can be categorized into mixed-domain [22], spatial-domain [23], and channel-domain [24] attention mechanisms according to the domains of interest. To improve network performance without adding too many learnable parameters, Duan et al. enhanced the feature extraction capability of a one-dimensional, lightweight CNN model by utilizing the Squeeze-and-Excitation (SE) attention mechanism, improving the accuracy and robustness of fault diagnosis for mountain ropeway bearings [25]. Deng et al. proposed a lightweight neural network, a Shuffle-SENet-based axlebox bearing fault diagnosis method for high-speed locomotives, which improved the diagnostic accuracy and operational efficiency for axlebox bearing faults under complex working conditions [26].
In addition, compared with shallow machine learning methods, training a deep learning fault diagnosis model with satisfactory performance usually requires more labeled data, and the training and test data should be independent and identically distributed. In reality, the working environment of axial piston pumps and other mechanical equipment is harsh, the working conditions are complex and variable, and the data distribution varies across working conditions; therefore, traditional deep learning-based fault diagnosis methods are difficult to apply to working conditions in which fault data or labeled data are scarce. In recent years, the development of transfer learning has provided new ideas to solve the above problems, and parameter fine-tuning is a simple and effective model transfer learning method [27]. Deep model-based transfer learning assumes that the source and target tasks share some common knowledge at the model level, such as model parameters, model prior knowledge, and model architecture, and improves the learning efficiency of the target task by reusing the model learned in the source domain. Aiming at a series of practical engineering problems in the field of fault diagnosis, such as small samples, variable operating conditions, and cross-equipment diagnosis, scholars have introduced transfer learning and carried out considerable research [28]. Chen et al. improved the least-squares support vector machine by adding an auxiliary data penalty term to the objective function and combining it with recurrence quantification analysis to achieve bearing fault diagnosis when samples are scarce under variable operating conditions [29]. Qian et al. proposed a fault diagnosis network capable of adapting to changes in operating conditions based on higher-order KL divergence and transfer learning [30]. He et al. proposed a deep-transfer multi-wavelet auto-encoder, which fine-tuned the source model obtained from source-domain training with a small amount of target-domain data to achieve gearbox fault diagnosis with small samples [31]. Li et al. constructed instance graphs by mining the relationships between the structural features of samples and utilized graph convolutional networks and MMD to learn transferable features [32]. However, there are few studies on the application of model-based transfer learning to cross-condition, small-sample fault diagnosis of a single hydraulic pump.
As a core component of hydraulic transmission systems, the axial piston pump is an important target for fault diagnosis. Deep learning has gradually become a mainstream method in the field of fault diagnosis by virtue of its powerful feature learning capability. However, some challenges remain in deep learning-based axial piston pump fault diagnosis:
(1) The traditional 1DCNN deep learning model has insufficient feature extraction capability when capturing multi-band fault features in one-dimensional signals.
(2) Deep learning fault diagnosis models usually require a large amount of labeled data, and the training and testing data need to be independent and identically distributed. In reality, however, the working environment of mechanical equipment is harsh, the working conditions are complex and variable, and the data under different working conditions have distributional differences, which makes traditional deep learning methods difficult to apply to working conditions in which fault data or labeled data are scarce.
(3) Most deep model migration studies on cross-condition and small-sample problems focus on the intelligent fault diagnosis of bearings and gears; studies on hydraulic axial piston pumps are still few and need to be further explored and improved.
Therefore, the main contributions of this paper are as follows:
(1) The vibration signal of the pump body of an axial piston pump was utilized as the data source to achieve non-destructive condition monitoring. Meanwhile, a 1DCNN was used to process the raw sensor data directly, eliminating complex and time-consuming manual feature extraction and signal preprocessing steps.
(2) A 1D-SENet (1D Squeeze-and-Excitation Network) model is proposed, which adapts the SE unit in SENet into a structure suitable for one-dimensional data and adds it to a 1DCNN to explicitly model the inter-dependencies between feature channels; global information is used to selectively enhance useful features and suppress useless ones, ultimately strengthening the feature extraction capability.
(3) For the cross-condition, small-sample problem in axial piston pump fault diagnosis, this study combined the 1D-SENet with model-based transfer learning: the 1D-SENet model was first pre-trained on the source condition with sufficient labeled samples and then fine-tuned on the target condition with only a small number of labeled samples, significantly improving the diagnostic accuracy for the axial piston pump in the small-sample case.
The rest of the paper is organized as follows. Section 2 describes the fundamentals of the 1D-SENet. Section 3 illustrates the model architecture of the 1D-SENet and the methodological flow of this paper. Section 4 describes the experimental data collection in detail and analyzes the experimental results. Finally, the conclusions of this study and future research directions are summarized in Section 5.

2. 1D-SENet Structure Principle

The 1D-SENet model constructed in this paper consists of two parts, the feature extraction module and the classification module, as shown in Figure 1.
The feature extraction module, shown in Figure 1a, consists of a convolutional layer, a batch normalization (BN) layer, an SE unit, a LeakyReLU activation function, and a pooling layer stacked sequentially. It is worth noting that the SE unit ends with a Sigmoid function to weight the importance of the different channels; since the Sigmoid function is prone to gradient vanishing or exploding when the network is very deep, the SE unit is inserted after the BN layer in this paper.
For each layer in the network, the distribution of its input features usually keeps changing during training, so the parameters in the convolutional layer must constantly adapt to the changing distribution, which increases the training difficulty. Batch Normalization (BN) [33] is a feature normalization technique that can be inserted into deep learning architectures as a trainable process (usually between the convolutional layer and the activation function, or between the fully connected layer and the activation function) to reduce internal covariate shift and accelerate model training. BN first normalizes each feature channel individually over a batch; the normalized features are then scaled and shifted to increase the representational capability and flexibility of the network.
Taking a one-dimensional convolutional layer [34] as an example, let $X_0^l \in \mathbb{R}^{N \times W}$ be the input of the $l$-th convolutional layer, with $N$ and $W$ its depth and width, respectively; let $X_c^l$ be the output of the $l$-th convolutional layer; and let $K$ be the width of the convolution kernels of the $l$-th layer. Let $x_c^l(i,j)$ denote the $j$-th feature value of the $i$-th feature map in $X_c^l$; then the convolution operation can be described by Equation (1):

$$x_c^l(i,j) = x_0^l(j) \ast w^l(i) + b^l(i) = \sum_{k=1}^{K} x_0^l(j,k)\, w^l(i,k) + b^l(i) \qquad (1)$$

where $\ast$ is the discrete convolution operator, $x_0^l(j)$ is the $j$-th region of $X_0^l$ to be convolved, $x_0^l(j,k)$ is the $k$-th feature value in $x_0^l(j)$, $w^l(i)$ is the weight of the $i$-th convolution kernel in the $l$-th convolutional layer, $w^l(i,k)$ is the $k$-th element of $w^l(i)$, and $b^l(i)$ is the bias of the convolution operation corresponding to $w^l(i)$.
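As a quick numerical check of Equation (1), the following minimal PyTorch sketch evaluates one channel with a kernel of width $K = 3$; the values are illustrative only (note that deep learning libraries implement cross-correlation, so the kernel is not flipped):

```python
import torch
import torch.nn.functional as F

# Toy instance of Eq. (1): one input channel, one kernel of width K = 3.
x = torch.tensor([[[1., 2., 3., 4., 5.]]])  # X_0^l, shape (batch, N, W)
w = torch.tensor([[[1., 0., -1.]]])         # w^l(i), shape (out, in, K)
b = torch.tensor([0.5])                     # b^l(i)
y = F.conv1d(x, w, b)
# Each output value is sum_k x_0^l(j, k) * w^l(i, k) + b^l(i):
print(y)  # tensor([[[-1.5000, -1.5000, -1.5000]]])
```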
Then the corresponding BN operation [35] can be expressed as follows:
$$\mu_b = \frac{1}{BW} \sum_{n=1}^{B} \sum_{j=1}^{W} x_{cn}^l(i,j) \qquad (2)$$

$$\sigma_b^2 = \frac{1}{BW} \sum_{n=1}^{B} \sum_{j=1}^{W} \left( x_{cn}^l(i,j) - \mu_b \right)^2 \qquad (3)$$

$$\hat{x}_{bn}^l(i,j) = \frac{x_{cn}^l(i,j) - \mu_b}{\sqrt{\sigma_b^2 + \varepsilon}} \qquad (4)$$

$$x_{bn}^l(i,j) = \rho^l(i)\, \hat{x}_{bn}^l(i,j) + \tau^l(i) \qquad (5)$$

where $x_{cn}^l(i,j)$ denotes the $j$-th feature of the $i$-th feature map output by the $l$-th convolutional layer for the $n$-th sample in a batch, $W$ is the width of the feature map, $B$ is the batch size, $\varepsilon$ is a small constant that ensures numerical stability, and $\rho^l(i)$ and $\tau^l(i)$ are the corresponding scaling and shifting factors, respectively.
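In PyTorch, Equations (2)–(5) correspond to nn.BatchNorm1d, which keeps one learnable $(\rho, \tau)$ pair per channel and computes the statistics over the $B \times W$ values of each channel. A minimal sketch (the shapes follow the batch size and sample width used later in this paper):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(num_features=16)  # one learnable (rho, tau) pair per channel
x = torch.randn(50, 16, 120)          # (batch B, channels N, width W)
y = bn(x)
# With the default initialization (rho = 1, tau = 0), each channel of the
# output has approximately zero mean and unit variance, as in Eqs. (2)-(4).
print(y.mean(dim=(0, 2)), y.var(dim=(0, 2), unbiased=False))
```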
Denote by $U \in \mathbb{R}^{N \times W}$, $U = \left[ u_1; \ldots; u_i; \ldots; u_N \right]$, the $N$ feature maps obtained after the convolutional and BN layers, where $u_i$ is the feature map of the $i$-th channel. In the SE unit, the squeeze operation $F_{sq}$ compresses $u_i$ along the spatial dimension into a real number $a_i$ by Global Average Pooling (GAP); $a_i$ is considered to contain all the spatial information of the corresponding channel. The computation is given by Equation (6):

$$a_i = F_{sq}(u_i) = \frac{1}{W} \sum_{j=1}^{W} u_i(j) \qquad (6)$$
The excitation operation is realized by two fully connected layers and two activation functions, which capture the correlation between channels and generate the corresponding channel weights. The first fully connected layer performs dimensionality reduction and is followed by the ReLU activation function. The second fully connected layer raises the dimension back to the number of input channels and is followed by the Sigmoid function. This gating mechanism reduces the model's complexity. The excitation operation can be expressed as follows:
$$s = F_{ex}(a, W_1, W_2) = f_{\text{Sigmoid}}\left( W_2\, f_{\text{ReLU}}(W_1 a) \right) \qquad (7)$$

where $W_1 \in \mathbb{R}^{\frac{N}{r} \times N}$ and $W_2 \in \mathbb{R}^{N \times \frac{N}{r}}$ are the fully connected operations, $r$ is a reduction ratio for the dimensionality reduction ($r$ is taken as 16 in this paper), and $s$ denotes the weight vector of the $N$ feature maps in $U$.
Finally, in the scale operation, the channel weight vector obtained by the excitation operation is multiplied element-wise with the feature maps $U$ to obtain the final output:

$$\tilde{u}_i = F_{\text{scale}}(u_i, s_i) = s_i \cdot u_i \qquad (8)$$
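Putting Equations (6)–(8) together, the following is a minimal PyTorch sketch of a one-dimensional SE unit; the class name and layer choices are our own illustration, not the authors' released code:

```python
import torch
import torch.nn as nn

class SEBlock1d(nn.Module):
    """1D Squeeze-and-Excitation unit implementing Eqs. (6)-(8)."""
    def __init__(self, channels: int, r: int = 16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // r)  # W1: dimensionality reduction
        self.fc2 = nn.Linear(channels // r, channels)  # W2: dimensionality restoration

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u has shape (batch, N channels, W width)
        a = u.mean(dim=2)                                     # squeeze: GAP, Eq. (6)
        s = torch.sigmoid(self.fc2(torch.relu(self.fc1(a))))  # excitation, Eq. (7)
        return u * s.unsqueeze(2)                             # scale, Eq. (8)
```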
The classification module, shown in Figure 1b, consists of a fully connected layer; overfitting is prevented by adding a Dropout layer, and a Softmax activation function is used to perform fault diagnosis classification.
In addition, in condition monitoring and fault diagnosis, the collected sensor signals (e.g., vibration acceleration signals) usually take both positive and negative values. If the ReLU function is used as the activation function, negative values in the input signals or features are set directly to 0, which loses some information and, in severe cases, may cause dead neurons, weakening the performance of the model. Researchers therefore improved on the ReLU function and proposed LeakyReLU [36]. The expression of the LeakyReLU function is given in Equation (9):
$$f_{\text{LeakyReLU}}(x) = \begin{cases} x, & x > 0 \\ a x, & x \le 0 \end{cases} \qquad (9)$$
where a usually takes values in the range 0 to 0.5.
Therefore, LeakyReLU is used as the activation function in this paper (with $a = 0.01$); it is placed between the SE unit and the pooling layer in the feature extraction module, and between the BN layer and the Dropout layer in the classification module.

3. A Transfer Learning Approach Based on the 1D_SENet Model

3.1. Theories Related to Transfer Learning

Given a source domain $D_s$ with a learning task $T_s$ and a target domain $D_t$ with a learning task $T_t$, where $D_s \neq D_t$ or $T_s \neq T_t$, the purpose of transfer learning is to use the knowledge acquired from $D_s$ and $T_s$ to help improve the performance of the prediction function $g_t$ for the target task. The transfer learning process is shown in Figure 2. The left side of the figure shows the traditional machine learning process, and the right side shows the transfer learning process. The inputs to the transfer learning algorithm contain not only the data in the target domain, but also the training data, models, and tasks from the source domain. The figure illustrates one of the key ideas of transfer learning: learning as much knowledge as possible from the source domain to ameliorate the scarcity of training data in the target domain.
As a kind of inductive transfer learning, model-based transfer learning assumes that the source domain has a large amount of labeled data $D_s = \{(x_s^i, y_s^i)\}_{i=1}^{n_s}$, where $x_s^i \in \mathbb{R}^d$ and $y_s^i$ are the data and label of the $i$-th source-domain sample and $n_s$ is the number of source-domain samples, and that the target domain has only a small amount of labeled data $D_t = \{(x_t^j, y_t^j)\}_{j=1}^{n_t}$, where $x_t^j \in \mathbb{R}^d$ and $y_t^j$ are the data and label of the $j$-th target-domain sample, $n_t$ is the number of target-domain samples, and $n_t \ll n_s$. Model migration attempts to transfer the large amount of structural knowledge learned by the model from the source-domain data to the target domain, so as to obtain a more accurate prediction model with only a small amount of labeled target-domain data.

3.2. 1D_SENet Model Parameters

The one-dimensional deep convolutional neural network constructed in this paper consists of seven modules (B1–B7), whose structural parameters are listed in Table 1. Modules B1–B6 each stack a convolutional layer (Conv), a BN layer, an SE unit, and a pooling layer (Pool) in sequence to extract deep features. Since the BN layer and the SE unit contain no convolution or pooling operations and their output sizes are identical to their inputs, these two parts are not listed in Table 1.
Using wide convolution kernels in the first convolutional layer of a convolutional neural network for the sliding-window convolution operations can better extract the detailed features in the signal [37]. In view of this, the kernel size in module B1 is set to 128 × 1 with a stride of 8 and 16 kernels. The kernel sizes in modules B2–B6 are all set to 3 × 1 with a stride of 1, and the numbers of kernels are, in order, 32, 64, 64, 128, and 128. Average pooling is used in each module, with a pooling size of 2. Module B7 is the output module, which includes the fully connected layer (Fc), a BN layer, a Dropout layer, and the Softmax classifier. The fully connected layer has 100 neurons, the dropout probability of random neuron deactivation is 0.5, and the Softmax classifier has 5 neurons, corresponding to the five fault-state classes of the axial piston pump.
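The following PyTorch sketch assembles the backbone according to Table 1 and the layer ordering of Section 2 (Conv → BN → SE → LeakyReLU → AvgPool), reusing the SEBlock1d sketch above; the padding values and the use of LazyLinear to infer the flattened width are our own implementation choices:

```python
import torch.nn as nn

def feature_block(c_in, c_out, k, stride, pad):
    # One feature-extraction module: Conv -> BN -> SE -> LeakyReLU -> AvgPool
    return nn.Sequential(
        nn.Conv1d(c_in, c_out, kernel_size=k, stride=stride, padding=pad),
        nn.BatchNorm1d(c_out),
        SEBlock1d(c_out),
        nn.LeakyReLU(0.01),
        nn.AvgPool1d(2),
    )

class SENet1d(nn.Module):
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            feature_block(1, 16, 128, 8, 0),   # B1: wide first kernel, stride 8
            feature_block(16, 32, 3, 1, 1),    # B2 ("same" padding)
            feature_block(32, 64, 3, 1, 1),    # B3
            feature_block(64, 64, 3, 1, 1),    # B4
            feature_block(64, 128, 3, 1, 1),   # B5
            feature_block(128, 128, 3, 1, 1),  # B6
        )
        self.classifier = nn.Sequential(       # B7
            nn.Flatten(),
            nn.LazyLinear(100),                # Fc with 100 neurons
            nn.BatchNorm1d(100),
            nn.LeakyReLU(0.01),
            nn.Dropout(0.5),
            nn.Linear(100, n_classes),         # Softmax is applied inside the loss
        )

    def forward(self, x):                      # x: (batch, 1, 2048)
        return self.classifier(self.features(x))
```

A dummy forward pass, e.g. SENet1d()(torch.randn(2, 1, 2048)), materializes the LazyLinear weights and reproduces the per-module output widths in Table 1 (120, 60, 30, 15, 7, 3).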

3.3. Methodological Process

A training set that obeys a particular distribution can be viewed as a sample of that distribution, and supervised learning typically uses the empirical risk on the training set to approximate the expected risk under the joint distribution. When the training set has few samples, there is usually a large deviation between the distribution embodied in the training data and the true distribution, which can easily lead to problems such as overfitting and poor generalization of the learned model. This is why traditional data-driven fault diagnosis methods have low recognition accuracy in the case of small samples.
In this paper, based on 1D-SENet, a model migration strategy is introduced to improve the performance of fault diagnosis in the case of small samples, and the specific scheme is shown in Figure 3, which consists of two phases: pre-training and fine-tuning. In the pre-training phase, a large amount of labeled data in the source domain is used to fully train the network to obtain a pre-trained model, which learns a large amount of structural knowledge from the source domain data and has good diagnostic performance in the source domain task. In the fine-tuning phase, the parameters of the pre-trained model are fine-tuned by a small number of labeled samples from the target domain, so that the model can be adapted to the target domain classification task.
In deep learning models, the features extracted by the underlying network are more generic and those extracted by the higher layer network are more specific (i.e., as the number of network layers increases, the features learnt by the model become increasingly dependent on the particular dataset and task). Based on this property, it is possible to fix some of the underlying network parameters in the pre-trained model and fine-tune only the remaining high-level network parameters. The specific steps of model migration-based fault diagnosis are as follows:
(1) Sample the sensor signals of the source and target conditions by interception to construct the datasets. Data augmentation (DA) is a commonly used technique in deep learning to enlarge the training dataset and make it as diverse as possible, thereby improving the generalization ability of the model. In the field of fault diagnosis, overlapping sampling is widely used to expand the data, i.e., the sliding step is set smaller than the window length when intercepting samples with a sliding window, as shown on the left side of Figure 3.
(2) Perform data preprocessing to accelerate model training and improve model performance. First, each sample is normalized so that the value of each of its dimensions lies in the interval (−1, 1); then, the data are standardized by z-score so that the values of the same dimension across different samples have a mean of 0 and a standard deviation of 1.
(3) Construct the 1D-SENet fault diagnosis model, set hyperparameters such as the learning rate, the batch size, and the maximum number of iterations (in each iteration, the network performs one forward computation, backpropagation, and parameter update), and train it with a large number of source-domain samples to obtain the pre-trained model.
(4) Fix some of the lower-layer network parameters of the pre-trained model and fine-tune the remaining network parameters with a small number of target-domain samples (a sketch of steps (2) and (4) follows this list).
(5) Input the test samples of the target domain into the fine-tuned model to obtain the corresponding diagnostic results.
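A minimal sketch of steps (2) and (4), assuming the SENet1d sketch from Section 3.2 and NumPy arrays of shape (samples, width); the function names and the choice to take the z-score statistics from the training set are our own assumptions:

```python
import copy
import numpy as np
import torch

def preprocess(train_x, test_x):
    # Step (2): per-sample scaling to (-1, 1), then per-dimension z-score.
    def minmax(x):
        lo = x.min(axis=1, keepdims=True)
        hi = x.max(axis=1, keepdims=True)
        return 2.0 * (x - lo) / (hi - lo) - 1.0
    train_x, test_x = minmax(train_x), minmax(test_x)
    mu, sd = train_x.mean(axis=0), train_x.std(axis=0)  # statistics from training data
    return (train_x - mu) / sd, (test_x - mu) / sd

def fine_tune_setup(pretrained, n_frozen=3):
    # Step (4): fix modules B1..Bk of the pre-trained model (case S3 when k = 3).
    model = copy.deepcopy(pretrained)
    for block in model.features[:n_frozen]:
        for p in block.parameters():
            p.requires_grad = False
    # Only the parameters that still require gradients are handed to the optimizer.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return model, torch.optim.SGD(trainable, lr=0.1, momentum=0.9)
```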

4. Experimental Verification

4.1. Data Acquisition

When a piston pump fails in operation, increased vibration is the most obvious manifestation, but the excitation frequencies generated by different faults are not the same, so the vibration signals corresponding to the fault states differ in amplitude, frequency, and other aspects. The vibration transmitted through the piston pump also produces sound waves, so the sound signal likewise contains the corresponding fault information. In addition, a piston pump failure has a direct impact on its outlet pressure. This paper uses the axial piston pump fault diagnosis test system shown in Figure 4. The experiment takes an MCY14-1B axial piston pump as the research object, and the main component models and performance parameters of the experimental system are listed in Table 2.
The vibration, sound, and outlet pressure of the axial piston pump are monitored by a vibration acceleration sensor, a sound level meter, and a pressure sensor, respectively; the monitoring signals are collected and stored by a LabVIEW program written for this purpose and an NI data acquisition card. A total of five channels of sensor signals are collected in the test: the sound signal, the pump outlet pressure signal, and three channels of vibration signals. Figure 5 shows the schematic diagram of the sensor installation.
The acceleration sensors are fixed to the end cap of the piston pump in three mutually perpendicular directions with a strong adhesive; the direction perpendicular to the end cap plane is denoted the Z-direction, and the directions parallel to the end cap plane are the X- and Y-directions. The sound level meter was placed at a height of 0.75 m from the ground within 0.5 m of the pump, and the point with the largest sound pressure level was selected as the measurement point. Three working conditions of 5 MPa, 10 MPa, and 15 MPa were set by adjusting the main relief valve pressure. Each condition contained five fault states of the piston pump. During the test, the corresponding fault states were simulated by replacing the various faulty parts in turn, which were obtained from a hydraulic pump maintenance department; the normal and faulty parts are marked in Figure 6. The sampling frequency was 20 kHz, and the sampling time for each fault state was 10 s.
For the vibration signal, since the plungers reciprocate along the direction perpendicular to the end cap, the shock component of the signal collected by the accelerometer in the Z-direction is more obvious than in the X- and Y-directions. Therefore, the Z-direction vibration signal is used for analysis in this paper.
In this paper, the sliding window length for overlapping sampling is set to 2048 and the sliding step to 512, and samples are sequentially intercepted from the corresponding vibration signals with overlap, yielding three datasets corresponding to the different working conditions. Each dataset consists of 1500 samples covering five fault states, with 300 samples per state. The composition of the datasets for the three working conditions is shown in Table 3.
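A minimal sketch of the overlapping sampling, with the window and step as stated above; the function name is illustrative:

```python
import numpy as np

def sliding_window_samples(signal: np.ndarray, win: int = 2048, step: int = 512):
    """Overlapping sampling: a step smaller than the window augments the data."""
    n = (len(signal) - win) // step + 1
    return np.stack([signal[i * step : i * step + win] for i in range(n)])

# 10 s at 20 kHz gives 200,000 points, i.e. (200000 - 2048) // 512 + 1 = 387
# candidate windows per fault state; the datasets here retain 300 per state.
```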
Figure 7 shows the time-domain waveforms of individual samples of the different fault states under the 5 MPa working condition (length 0.1024 s), where the vertical coordinate represents the voltage amplitude in V. The amplitude of the normal state is small, the amplitudes of the slipper loose and center spring wear states are relatively large, and there are certain differences between the waveforms of the various states, which can be effectively distinguished by diagnostic models.

4.2. Model Training Hyperparameter Setting and Single-Case Performance Testing

The training set sample selection and the initialization of the network model parameters are both highly random, so to make the diagnostic results more dependable, the test is repeated 10 times for each working condition. The SGD optimization algorithm is used during training, with the initial learning rate set to 0.1 and the momentum factor set to 0.9, and the learning rate is updated dynamically with a cosine annealing learning rate scheduler. The batch size (i.e., the number of samples in each batch) was set to 50, and the maximum number of iterations was set to 200.
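This setup can be sketched with standard PyTorch components as follows; setting T_max of the cosine annealing scheduler to the 200 training iterations is our assumption:

```python
import torch

model = SENet1d(n_classes=5)  # sketch from Section 3.2
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):  # maximum number of iterations
    # ... forward computation, backpropagation, and parameter updates
    #     over mini-batches of 50 samples ...
    scheduler.step()      # cosine-annealed learning-rate update
```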
The experimental analyses in this paper were performed on the same computer with an Intel Core i5-12400F (2.50 GHz), 32 GB RAM, and an NVIDIA GeForce RTX 3060 Ti (8 GB); the development environment was Python 3.8.18 + torch 2.1.1 + cu118.

4.2.1. Model Performance Test with a Regular Number of Training Samples

For convenience of description, the constructed network is denoted 1D-SENet when no model transfer is performed and MT_1D-SENet (Model Transfer 1D-SENet) when model transfer is performed. The fault diagnosis performance of the 1D-SENet model was first validated for single working conditions with sufficient samples, testing on the three datasets. For each working condition, the dataset was divided into training, validation, and test sets by stratified sampling in the ratio 3:1:1, so each working condition had 900 training samples, 300 validation samples, and 300 test samples, with the number of samples per state kept balanced. After training, the diagnostic results of the 1D-SENet model on the test set for each of the three working conditions were tabulated, as shown in Figure 8.
From Figure 8a, it can be seen that the diagnostic results are most stable at 5 MPa, reaching 100% in all 10 runs. Despite some fluctuations at 10 MPa and 15 MPa, a minimum diagnostic accuracy of 99.33% was still achieved. Figure 8b shows that the average diagnostic accuracies of the 10 tests for the three conditions are all higher than 99.7%: 100%, 99.97%, and 99.73%, respectively. The proposed 1D_SENet model thus possesses good identification accuracy in the fault diagnosis of axial piston pumps.

4.2.2. Model Performance Testing with Small Sample Training Set

To further analyze the diagnostic performance of the model with small samples, a small number of samples were randomly selected from each dataset for training, and the remaining samples were used for validation and testing. Five cases with training sample counts of 60 × 5, 48 × 5, 36 × 5, 24 × 5, and 12 × 5 were evaluated and labeled A1, A2, A3, A4, and A5, in order; the results are shown in Figure 9.
As can be seen from Figure 9, the diagnostic accuracy of the 1D_SENet model gradually decreases as the number of training samples decreases. When there are only 12 training samples per state, the average accuracies for the three working conditions are 84.47%, 93.83%, and 81.83%, and the standard deviation also increases substantially. When training samples are sparse, the network parameters cannot be adequately trained, and the fault diagnosis performance of the 1D_SENet model is poor.

4.2.3. Comparative Model Performance Tests

To validate the performance of the 1D_SENet model, it is compared with five classical algorithms: 1D_AlexNet, 1D_VGG19, 1D_EfficientNetB0, 1D_ShuffleNetV2, and 1D_MobileNetV3_small. Meanwhile, following the principle of ablation experiments, an ordinary 1D_CNN network without SE units is also evaluated to verify the effectiveness of the SE units added in 1D_SENet. Performance experiments were conducted for each model under the 10 MPa operating condition with a training set of 120 samples (consistent with case A4). The parameter size of each model and the mean and standard deviation of the test set accuracy over 10 trials were analyzed; the results are shown in Table 4.
Analyzing Table 4, on the one hand, compared with the traditional networks 1D_AlexNet and 1D_VGG19 and the lightweight networks 1D_ShuffleNetV2 and 1D_MobileNetV3_small, 1D_SENet has the highest diagnostic accuracy for the axial piston pump, reaching 98.5%, as well as the smallest standard deviation (0.36%) and parameter size (0.54 MB). This indicates that it maintains high diagnostic performance while also possessing strong robustness and lightweight characteristics, which is conducive to deployment in practical engineering environments where resources such as computing power and storage space are limited. On the other hand, compared with 1D_CNN, adding the SE channel-domain attention mechanism increases the parameter size of 1D_SENet by only 0.02 MB while significantly improving diagnostic accuracy and model stability, which proves that the performance of a 1D_CNN network can be improved by adding SE units at the cost of only a small increase in learnable parameters.

4.3. Cross-Condition Small-Sample Model Migration Fault Diagnosis Test

The following validates the performance of the model migration scheme proposed in Section 3.3. Transfer learning from the 5 MPa to the 10 MPa condition (i.e., the samples under the 5 MPa condition are used as the source-domain data and the samples under the 10 MPa condition as the target-domain data) is denoted task T1, transfer learning from the 5 MPa to the 15 MPa condition is denoted task T2, and so on; six tasks are obtained. For each task, the numbers of pre-training, validation, and test samples in the source domain are consistent with Section 4.2.1, and the numbers of training, validation, and test samples used for fine-tuning in the target domain are consistent with case A5 in Section 4.2.2. Table 5 shows the dataset division for the six tasks.
The effect of the number of fine-tuned layers on the diagnostic accuracy of model migration is analyzed first. For convenience, fine-tuning all parameters in the network is denoted S0; fixing module B1 and fine-tuning the parameters of the remaining modules is denoted S1; fixing modules B1–B2 and fine-tuning the remaining modules is denoted S2; and so on, until fixing modules B1–B6 and fine-tuning the remaining module is denoted S6, giving a total of seven cases, S0–S6.
The training process in the pre-training phase follows the hyperparameter settings in Section 4.2; in the fine-tuning phase, the batch size is set to 10 due to the small number of fine-tuning samples. Each experiment was repeated 10 times, and the average accuracies for each task are shown in Figure 10.
As the number of fixed modules increases, the accuracy of all six tasks first increases and then decreases. Tasks T1, T3, and T4 all reach their maxima at S2 (98.43%, 97.37%, and 96.43%, respectively), task T2 reaches its maximum at S4 (97.03%), task T5 at S1 (95.43%), and task T6 at S3 (99.3%). Fine-tuning all the parameters is thus not optimal, and better diagnostic performance can be obtained by appropriately fixing some of the lower-layer network parameters. In addition, examining the mean accuracy curves of the six tasks, the average accuracy over the six tasks is highest at S3, giving the best overall performance.
Table 6 and Table 7, respectively, compare the accuracy of the six tasks when fine-tuning all modules (S0) and when fixing modules B1–B3 and fine-tuning the remaining modules (S3). The minimum, maximum, standard deviation, and average of the 10 experimental results were analyzed.
As can be seen from the tables, compared with fine-tuning all the modules (S0), fixing modules B1–B3 and fine-tuning the remaining modules (S3) performs better on the three indicators of minimum, mean, and standard deviation for all six tasks, indicating better diagnostic performance and greater stability. In model migration, fine-tuning all the network parameters is not better than fixing some of the lower-layer parameters. This is attributed to the wide convolution kernel used in module B1, which has significantly more parameters than the other modules and is difficult to train adequately with only a small number of target-domain samples, leading to a decrease in the diagnostic performance of the model.
Furthermore, comparing S3 with the small-sample case A5 shown in Figure 9: without the pre-training and fine-tuning strategy, the average diagnostic accuracies of the model under the 5 MPa, 10 MPa, and 15 MPa conditions were merely 84.47%, 93.83%, and 81.83%, respectively. After adopting the model transfer scheme based on pre-training and fine-tuning, the average diagnostic accuracies in tasks T3 and T5 (target domain: 5 MPa) reached 96.53% and 95.17%; in tasks T1 and T6 (target domain: 10 MPa), 98.07% and 99.3%; and in tasks T2 and T4 (target domain: 15 MPa), 96.6% and 95.67%, respectively. Moreover, the standard deviation in all six tasks was lower than that of A5. Thus, when the sample size for a working condition is small and the model cannot be trained adequately, the pre-training and fine-tuning strategy can exploit a large number of samples from other working conditions to significantly improve the diagnostic performance and robustness of the model for that condition.

4.4. Comparison of Convergence Properties and Feature Visualization before and after Model Migration

In this paper, the convergence characteristics of tasks T1 and T3 under the S3 scheme are compared with those obtained without model migration. In Figure 11a, MT_1D-SENet is pre-trained using samples from the 10 MPa working condition (corresponding to task T3); in Figure 11b, MT_1D-SENet is pre-trained using samples from the 5 MPa working condition (corresponding to task T1).
The comparison shows that the validation loss of 1D-SENet first increases and then decreases over the iterations, fluctuates violently, and ends at a larger final value, whereas the validation loss of MT_1D-SENet remains stable after rapid convergence and its final value is smaller than that of 1D-SENet, indicating that the training process of MT_1D-SENet is more stable. In addition, the validation accuracy of MT_1D-SENet is always higher than that of 1D-SENet during training for both tasks, indicating that the former maintains good generalization performance throughout the iterations.
Using the t-SNE algorithm, the deep features of the test set samples for the two methods in this experiment (the output features of the Dropout layer, with dimension 100) are reduced to two dimensions and visualized; the results are shown in Figure 12.
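A minimal sketch of this visualization; the placeholder arrays stand in for the Dropout-layer outputs (collected, for example, with a forward hook) and the five state labels:

```python
import numpy as np
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

# Placeholders: in the experiment these would be the (n_samples, 100)
# Dropout-layer outputs of the test samples and their five state labels.
feats = np.random.randn(300, 100)
labels = np.repeat(np.arange(5), 60)

emb = TSNE(n_components=2, random_state=0).fit_transform(feats)  # 100-D -> 2-D
plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap="tab10", s=8)
plt.show()
```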
Figure 12a,d,g shows the feature visualizations for the 1D-SENet method trained directly with a small number of samples under the 5 MPa, 10 MPa, and 15 MPa conditions, respectively; there is serious overlap between the various classes of samples, and the discriminability is very poor. Figure 12b,c shows the feature visualizations for model migration from 10 MPa and 15 MPa to 5 MPa, Figure 12e,f from 5 MPa and 15 MPa to 10 MPa, and Figure 12h,i from 5 MPa and 10 MPa to 15 MPa, respectively. In these six plots, all the categories are well separated and samples of the same category cluster tightly, except for the slipper loose and center spring wear fault classes, which are confused to a certain extent. Compared with the 1D-SENet method, the MT_1D-SENet method obtains good knowledge of the discriminative structure using only a small number of samples from the target operating conditions.

5. Conclusions

In this paper, we took transfer learning as the basic method, combined it with deep learning and attention mechanisms, conducted an in-depth exploration of axial piston pump fault diagnosis for small samples and variable working conditions, and validated the approach on axial piston pump fault datasets. The experimental results show the following:
(1) When the 1D-SENet model is trained and evaluated with a regular number of axial piston pump samples for each working condition, it has excellent diagnostic accuracy and generalization performance, indicating the superiority of the proposed 1D-SENet network architecture. In the small-sample case, the diagnostic accuracy of 1D-SENet is not only higher than that of traditional networks, but its model size is also much smaller than that of classical lightweight networks, and this lightweight characteristic is more conducive to deployment in actual engineering.
(2) For the small-sample fault diagnosis task of axial piston pumps, a model migration strategy was adopted, i.e., a pre-trained model learned from a large number of labeled samples of the source condition was fine-tuned with a small number of labeled samples of the target condition. The results show that the proposed method can effectively improve the recognition accuracy of small-sample fault diagnosis for the target operating conditions and accelerate the convergence of the model.
(3) The relationship between post-migration performance and the number of fine-tuned layers was explored. The results show that better axial piston pump fault diagnosis performance is obtained when some of the lower-layer network parameters are appropriately fixed and only the upper-layer network parameters are fine-tuned.
In the future, the method proposed in this paper can be further extended to axial piston pump fault classification based on pressure signals, and an in-depth study of the multi-source information fusion method based on vibration and pressure signals can be conducted to improve the accuracy and efficiency of diagnosis. In addition, for the simultaneous operation of multiple pumps in marine engineering, research related to axial piston pump fault diagnosis across equipment can be conducted. Finally, for the “black box” problem in the deep learning model, the interpretability of the model can be further explored.

Author Contributions

Conceptualization, X.Y.; methodology, X.Y.; software, X.Y. and Y.Y.; validation, L.J. and J.Z.; formal analysis, A.J.; resources, W.J.; data curation, X.Y. and L.J.; writing—original draft preparation, X.Y.; writing—review and editing, W.J.; visualization, A.J.; supervision, W.J.; project administration, W.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 52275067) and the Natural Science Foundation of Hebei Province, China (Grant No. E2023203030).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, F.; Chen, J.; Cheng, M.; Xu, B. A Novel Hydraulic Transmission Solution to Large Offshore Wind Turbine: Design and Control Strategy. Ocean Eng. 2022, 255, 111285. [Google Scholar] [CrossRef]
  2. Banaszek, A.; Petrovic, R. Problem of Non Proportional Flow of Hydraulic Pumps Working with Constant Pressure Regulators in Big Power Multipump Power Pack Unit in Open System. Teh. Vjesn. 2019, 26, 294–301. [Google Scholar] [CrossRef]
  3. Banaszek, A.; Petrović, R.; Andjelković, M.; Radosavljević, M. Efficiency of a Twin-Two-Pump Hydraulic Power Pack with Pumps Equipped in Constant Pressure Regulators with Different Linear Performance Characteristics. Energies 2022, 15, 8100. [Google Scholar] [CrossRef]
  4. Jiang, W.; Liu, S. An experimental study on the sensitivity of dimensionless indicators in the magnitude domain to hydraulic pump failure. J. Yanshan Univ. 2010, 34, 383–389. [Google Scholar]
  5. Tang, S.; Zhu, Y.; Yuan, S. Intelligent Fault Diagnosis of Hydraulic Piston Pump Based on Deep Learning and Bayesian Optimization. ISA Trans. 2022, 129, 555–563. [Google Scholar] [CrossRef] [PubMed]
  6. Wen, C.L.; Lv, F.Y. A review of deep learning-based fault diagnosis methods. J. Electron. Inf. Sci. 2020, 42, 234–248. [Google Scholar]
  7. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Commun. ACM 2017, 60, 84–90. [Google Scholar] [CrossRef]
  8. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
  9. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  10. Zhou, Y.; Chen, S.; Wang, Y.; Huan, W. Review of Research on Lightweight Convolutional Neural Networks. In Proceedings of the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 12–14 June 2020; pp. 1713–1720. [Google Scholar]
  11. Siddique, M.F.; Ahmad, Z.; Ullah, N.; Kim, J. A Hybrid Deep Learning Approach: Integrating Short-Time Fourier Transform and Continuous Wavelet Transform for Improved Pipeline Leak Detection. Sensors 2023, 23, 8079. [Google Scholar] [CrossRef]
  12. Siddique, M.F.; Ahmad, Z.; Ullah, N.; Ullah, S.; Kim, J.-M. Pipeline Leak Detection: A Comprehensive Deep Learning Model Using CWT Image Analysis and an Optimized DBN-GA-LSSVM Framework. Sensors 2024, 24, 4009. [Google Scholar] [CrossRef]
  13. Guo, S.; Yang, T.; Gao, W.; Zhang, C. A Novel Fault Diagnosis Method for Rotating Machinery Based on a Convolutional Neural Network. Sensors 2018, 18, 1429. [Google Scholar] [CrossRef]
  14. Wang, L.-H.; Zhao, X.-P.; Wu, J.-X.; Xie, Y.-Y.; Zhang, Y.-H. Motor Fault Diagnosis Based on Short-Time Fourier Transform and Convolutional Neural Network. Chin. J. Mech. Eng. 2017, 30, 1357–1368. [Google Scholar] [CrossRef]
  15. Yang, X.; Jiang, A.; Jiang, W.; Zhao, Y.; Tang, E.; Chang, S. Abnormal Detection and Fault Diagnosis of Adjustment Hydraulic Servomotor Based on Genetic Algorithm to Optimize Support Vector Data Description with Negative Samples and One-Dimensional Convolutional Neural Network. Machines 2024, 12, 368. [Google Scholar] [CrossRef]
  16. Gao, S.; Li, T.; Pei, Z. Fault Diagnosis Method of Rolling Bearings Based on Adaptive Modified CEEMD and 1DCNN Model. ISA Trans. 2023, 140, 309–330. [Google Scholar] [CrossRef] [PubMed]
  17. Liu, Z.; Guo, B.; Wu, F.; Han, T.; Zhang, L. An Improved Burr Size Prediction Method Based on the 1D-ResNet Model and Transfer Learning. J. Manuf. Process. 2022, 84, 183–197. [Google Scholar] [CrossRef]
  18. Chen, X.; Han, M.; Shu, W.; Zhang, K.; Li, L. A fault diagnosis method based on VMD and multiscale one-dimensional convolutional neural network. Mod. Electron. Technol. 2023, 46, 103–109. [Google Scholar] [CrossRef]
  19. Chen, R.X.; Xu, P.W.; Han, K.L.; Zeng, L.; Wang, S.; Zhu, Y.C. Intelligent Detection of Loose Wind Turbine Foundation Bolts by Multiscale One-Dimensional Convolutional Neural Network. Vib. Shock 2022, 41, 301–307. [Google Scholar] [CrossRef]
  20. Xu, P.-W.; Chen, R.-X.; Hu, X.-L.; Yang, L.-X.; Tang, L.-L.; Lin, L. PSO optimised multiscale one-dimensional convolutional neural network for loose foundation bolt diagnosis of wind turbines. Vib. Shock 2022, 41, 86–92. [Google Scholar] [CrossRef]
  21. Wang, W.; Shen, J.; Jia, Y. A review of visual attention detection. J. Softw. 2019, 30, 416–439. [Google Scholar] [CrossRef]
  22. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the Computer Vision—ECCV 2018; Springer International Publishing: Cham, Switzerland, 2018; pp. 3–19. [Google Scholar]
  23. Zhang, J.; Xie, Y.; Xia, Y.; Shen, C. Attention Residual Learning for Skin Lesion Classification. IEEE Trans. Med. Imag. 2019, 38, 2092–2103. [Google Scholar] [CrossRef]
  24. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141. [Google Scholar]
  25. Duan, J.; Yu, S.W.; Xie, M.K.; He, J.Y.; Wang, H.J.; Li, W.X.; Yang, Z. One-dimensional lightweight CNN-based bearing fault diagnosis for mountain ropeways. J. Agric. Eng. 2023, 39, 70–79. [Google Scholar]
  26. Deng, F.; Lv, H.; Gu, X.; Hao, R. A Fault Diagnosis Method for Axlebox Bearing of High-Speed Rolling Stock Based on Lightweight Neural Network Shuffle-SENet. J. Jilin Univ. (Eng. Ed.) 2022, 52, 474–482. [Google Scholar] [CrossRef]
  27. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How Transferable Are Features in Deep Neural Networks? Adv. Neural Inf. Process. Syst. 2014, 4, 3320–3328. [Google Scholar]
  28. Wang, K.; Li, Y.H. A review of the application of transfer learning in the field of predictive maintenance of mechanical equipment. China Instrum. 2019, 12, 64–68. [Google Scholar]
  29. Chen, C.; Shen, F.; Yan, R. Improvement of LSSVM migration learning method for bearing fault diagnosis. J. Instrum. 2017, 38, 33–40. [Google Scholar] [CrossRef]
  30. Qian, W.; Li, S.; Wang, J. A New Transfer Learning Method and Its Application on Rotating Machine Fault Diagnosis Under Variant Working Conditions. IEEE Access 2018, 6, 69907–69917. [Google Scholar] [CrossRef]
  31. He, Z.; Shao, H.; Wang, P.; Lin, J.; Cheng, J.; Yang, Y. Deep Transfer Multi-Wavelet Auto-Encoder for Intelligent Fault Diagnosis of Gearbox with Few Target Training Samples. Knowl.-Based Syst. 2020, 191, 105313. [Google Scholar] [CrossRef]
  32. Li, T.; Zhao, Z.; Sun, C.; Yan, R.; Chen, X. Domain Adversarial Graph Convolutional Network for Fault Diagnosis Under Variable Working Conditions. IEEE Trans. Instrum. Meas. 2021, 70, 1–10. [Google Scholar] [CrossRef]
  33. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 7–9 July 2015. [Google Scholar]
  34. Yang, X. Research on the Condition Monitoring and Fault Early Warning Diagnosis System of Regulating Oil Motor Based on 1D-CNN and SVDD Algorithm. Master’s Thesis, Yanshan University, Qinhuangdao, China, 2021. [Google Scholar]
  35. Yue, Y. Research on Migration Learning Based Fault Diagnosis Method for Axial Piston Pumps. Master’s Thesis, Yanshan University, Qinhuangdao, China, 2023. [Google Scholar]
  36. Maas, A.L.; Hannun, A.Y.; Ng, A.Y. Rectifier Nonlinearities Improve Neural Network Acoustic Models. Int. Conf. Mach. Learn. 2013, 30, 3. [Google Scholar]
  37. Zhang, W.; Peng, G.; Li, C.; Chen, Y.; Zhang, Z. A New Deep Learning Model for Fault Diagnosis with Good Anti-Noise and Domain Adaptation Ability on Raw Vibration Signals. Sensors 2017, 17, 425. [Google Scholar] [CrossRef]
Figure 1. Feature extraction and classification module: (a) feature extraction module and (b) classification module.
Figure 2. Example of the migration learning process.
Figure 3. Fault diagnosis scheme based on model migration.
Figure 4. Schematic diagram of the axial piston pump fault diagnosis test system. 1—tank; 2, 24—filter; 3—vane pump; 4, 25—globe valve; 5, 13—flow meter; 6, 15—pressure gauge switch; 7, 16—pressure gauge; 8, 18—relief valve; 9—swash plate axial piston pump; 10—vibration acceleration transducer; 11—sound level meter; 12—check valve; 14—pressure transducer; 17, 22—accumulator; 19—electromagnetic; 20—electro-hydraulic servo valve; 21—hydraulic cylinder; and 23—one-way throttle valve.
Figure 5. Sensor arrangement.
Figure 6. Various faulty components of axial piston pumps: (a) swash plate wear; (b) slipper wear; (c) slipper loose; (d) center spring wear.
Figure 7. Time-domain diagram of vibration signals in the Z-direction for diverse types of faults in axial piston pumps: (a) swash plate wear; (b) slipper loose; (c) slipper wear; (d) center spring wear; and (e) normal.
Figure 8. Performance of 1D_SENet model in three working conditions: (a) 10 diagnostic results and (b) accuracy and standard deviation.
Figure 9. Comparison of diagnostic accuracy for different number of training samples under single working condition.
Figure 10. Effect of fine-tuning the number of layers on the diagnostic accuracy of each task.
Figure 11. Comparison of convergence characteristics before and after model migration: (a) comparison of convergence characteristics before and after model migration for 5 MPa conditions and (b) comparison of convergence characteristics before and after model migration for the 10 MPa condition.
Figure 12. Comparison of feature visualization between 1D-SENet and MT_1D-SENet for different working conditions: (a) 5 MPa; (b) 10 MPa⇢5 MPa; (c) 15 MPa⇢5 MPa; (d) 10 MPa; (e) 5 MPa⇢10 MPa; (f) 15 MPa⇢10 MPa; (g) 15 MPa; (h) 5 MPa⇢15 MPa; and (i) 10 MPa⇢15 MPa.
Table 1. 1D_SENet model parameters.

Module | Output Size (Width × Depth) | Layer Type | Kernel Size | Kernel Number | Stride | Padding
B1 | 120 × 16 | Conv | 128 × 1 | 16 | 8 | –
   |          | Pool | 2 × 1   | 16 | 2 | –
B2 | 60 × 32  | Conv | 3 × 1   | 32 | 1 | Same
   |          | Pool | 2 × 1   | 32 | 2 | –
B3–B4 | 15 × 64 | Conv | 3 × 1 | 64 | 1 | Same
      |         | Pool | 2 × 1 | 64 | 2 | –
B5–B6 | 3 × 128 | Conv | 3 × 1 | 128 | 1 | Same
      |         | Pool | 2 × 1 | 128 | 2 | –
B7 | 100 × 1 | Fc      | 100 | 1 | – | –
   | 100 × 1 | Dropout | –   | – | – | –
   | 5       | Softmax | 5   | 1 | – | –
Table 2. Main component types and performance parameters of the test bench.

Serial Number | Component Name | Module Model | Component Performance Parameters
1 | drive motor | Y132M-4 | Rated RPM: 1480 r/min
2 | axial piston pump | MCY14-1B | 7 plungers, displacement: 10 mL/r
3 | data acquisition card | USB-6221 | Maximum sampling rate: 250 kS/s
4 | vibration sensor | YD72D | Frequency range: 1–18,000 Hz
Table 3. Composition of datasets for different operating conditions.

Dataset Information | D1 | D2 | D3
Pump outlet pressure/MPa | 5 | 10 | 15
State types | Normal; Swash plate wear; Slipper wear; Slipper loose; Center spring wear | Normal; Swash plate wear; Slipper wear; Slipper loose; Center spring wear | Normal; Swash plate wear; Slipper wear; Slipper loose; Center spring wear
Number of samples in each state | 300 | 300 | 300
Table 4. Diagnostic accuracy and model size of the six models.

Model Name | Average Value (%) | Standard Deviation (%) | Parameter Size (MB)
1D_SENet | 98.5 | 0.36 | 0.54
1D_CNN | 98.2 | 0.65 | 0.52
1D_AlexNet | 94.23 | 11.07 | 13
1D_VGG19 | 52.8 | 20.76 | 45.54
1D_ShuffleNetV2 | 97.67 | 1.07 | 3.71
1D_MobileNetV3_small | 97.43 | 2.11 | 0.75
Table 5. Division of datasets for the six tasks.

Tasks | Source Domain Condition | Target Domain Condition | Pre-Training: Training Set | Pre-Training: Validation Set | Pre-Training: Test Set | Fine-Tuning: Training Set | Fine-Tuning: Validation Set | Fine-Tuning: Test Set
T1 | 5 MPa | 10 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
T2 | 5 MPa | 15 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
T3 | 10 MPa | 5 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
T4 | 10 MPa | 15 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
T5 | 15 MPa | 5 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
T6 | 15 MPa | 10 MPa | 900 | 300 | 300 | 12 × 5 | 228 × 5 | 60 × 5
Table 6. Diagnostic accuracy for the six tasks when fine-tuning all modules (S0).

Tasks | Minimum Value (%) | Maximum Value (%) | Standard Deviation (%) | Average Value (%)
T1 | 75.33 | 98.67 | 7.10 | 94.53
T2 | 74.0 | 91.67 | 5.27 | 86
T3 | 83.67 | 99.0 | 4.62 | 91.9
T4 | 79.0 | 98.33 | 5.1 | 90.8
T5 | 80.0 | 96.0 | 5.53 | 88.73
T6 | 90.33 | 99.0 | 2.79 | 95.6
Table 7. Diagnostic accuracy for the six tasks when fixing modules B1–B3 and fine-tuning the remaining modules (S3).

Tasks | Minimum Value (%) | Maximum Value (%) | Standard Deviation (%) | Average Value (%)
T1 | 97.0 | 99.0 | 0.6 | 98.07
T2 | 95.67 | 98.0 | 0.77 | 96.6
T3 | 95.67 | 97.67 | 0.67 | 96.53
T4 | 94.67 | 97.67 | 1.11 | 95.67
T5 | 93.67 | 96.67 | 0.88 | 95.17
T6 | 98.67 | 100 | 0.4 | 99.3