1. Introduction
Self-priming centrifugal pumps, as indispensable key equipment in the industrial field, play an important role in energy conversion and drive actuation [1]. Nevertheless, the self-priming centrifugal pump is susceptible to failures or significant accidents caused by component damage due to prolonged operation and harsh working conditions. Therefore, accurate and timely diagnosis of self-priming centrifugal pump faults is essential for improving equipment reliability, ensuring production continuity, and reducing maintenance costs. However, the operating state of self-priming centrifugal pumps is usually characterized by variability and complexity. Traditional fault diagnosis methods include state estimation, time-frequency analysis, and statistical methods; these depend on experience and equipment, are costly in time and manpower, and offer limited applicability.
The continuous advancement of machine learning and artificial intelligence technologies has led to the widespread application of data-driven techniques across various domains, including manufacturing, healthcare, transportation, finance, and energy. These innovations are driving progress and development in diverse industries. For instance, in the realm of robotics, Peng et al. introduced the Funabot-Suit, a biologically inspired garment propelled by McKibben muscles, enabling natural proprioceptive perception [2]. Meanwhile, Mao et al. devised a predictive modeling approach for flexible electro-hydrodynamic pumps, leveraging soft computing techniques [3]. Moreover, data-driven methodologies have found substantial utility in the domain of rotating machinery failure analysis. Zhou et al. put forth a deep convolutional generative adversarial network to achieve precise diagnostics with limited labeled data, exemplifying the capabilities of these techniques [4]. Han et al. introduced an innovative framework tailored for addressing the challenge of transfer diagnosis with sparse target data; this approach not only reduces distribution disparities but also mitigates undesirable transitions [5]. In a similar vein, Wu et al. proposed an adaptive deep transfer learning method for bearing fault diagnosis [6].
Data-driven fault diagnosis methods provide significant advantages over traditional approaches, including increased automation, enhanced accuracy, and multidimensional analysis capabilities. This results in more timely and precise fault diagnosis outcomes, ultimately contributing to improved pump reliability and operational efficiency. Notably, this approach comprises two pivotal steps: feature extraction and pattern recognition [7,8].
To begin, pertinent features must be extracted from the vibration signal of the self-priming centrifugal pump; however, this signal typically demonstrates non-stationarity, nonlinearity, and complexity. Conventional feature extraction methods operating in the time, frequency, or time-frequency domains are susceptible to a range of challenges, including information loss, signal fluctuations, subjective and unstable artificial feature selection, noise interference, and reliance on domain knowledge or experiential input [9]. Consequently, entropy-based methods have been developed in line with advances in nonlinear dynamics. These methods encompass symbolic dynamic entropy, Shannon entropy, multiscale entropy, sample entropy, permutation entropy, multiscale dispersion entropy (MDE), and multiscale fluctuation dispersion entropy (MFDE) [10,11,12,13].
Within the context of the aforementioned entropy theories, MFDE can effectively gauge the regularity of a time series, thereby enabling the detection of subtle variations within vibration signals, and it is advantageous for its rapid computation and robust resistance to noise. Nevertheless, it exhibits certain limitations when applied to the extraction of fault features from faulty signals: as the scale factor increases, the coarse-grained sequence shortens, leading to substantial deviations at larger factors [12]. Wang et al. proposed the refined time-shift multiscale fluctuation dispersion entropy (RTSMFDE), whose powerful feature extraction capability was validated by fault diagnosis experiments on a wind turbine [7]. The time-shift multiscale decomposition replaces the original mean-based multiscale decomposition of the MFDE, preserving important structural information of the signal and making the obtained entropy more accurate and stable. In addition, the entropy calculation is carried out in a refined manner: the relative frequencies of the fluctuation dispersion modes of all time-shifted coarse-grained sequences under each scale factor are first averaged, and then the entropy is calculated, reducing the possibility of invalid entropy values. Given these advantages, this paper applies the RTSMFDE to the fault feature extraction of self-priming centrifugal pumps.
Nonetheless, when the RTSMFDE feature set is directly utilized as the input to the pattern recognition classifier, recognition accuracy may be compromised owing to the high dimensionality and information redundancy of the feature set. Consequently, a dimensionality reduction (DR) method becomes imperative to obtain low-dimensional, discriminating feature sets [14].
Traditional DR methods, such as linear discriminant analysis (LDA) and principal component analysis, are linear methods incapable of handling nonlinear feature sets [15,16]. In contrast, manifold learning, a nonlinear DR approach, is effective for uncovering low-dimensional structures within high-dimensional spaces, offering a more suitable solution for DR applied to data collected from hydraulic pumps [17,18]. Commonly used manifold learning methods include isometric mapping (Isomap), Laplacian eigenmaps (LE), locally linear embedding (LLE), and local tangent space alignment (LTSA) [19,20,21,22]. However, their application in reducing the dimensionality of the self-priming centrifugal pump fault feature set has some limitations. These methods are unsupervised DR approaches that fail to fully leverage the available sample label information; hence, the DR result is easily disturbed by noise points. Moreover, they use the Euclidean distance to construct the neighborhood graph, which is easily affected by dimensionality. In cases where certain outliers are treated as near neighbors, the Euclidean distance metric may fail to establish a meaningful relationship between isolated samples and their other close neighbors, which can disrupt the underlying neighborhood graph structure [23].
The recently proposed cosine pairwise-constrained supervised manifold mapping (CPCSMM) method aims to extract both local and global structural information from signal features, effectively reducing and visualizing high-dimensional data [7]. Owing to its exceptional performance in the DR of fault features, this paper employs the CPCSMM method to reduce the dimensionality of the high-dimensional RTSMFDE feature set, yielding a low-dimensional and discerning fault feature set.
The extracted feature set should be input into the classifier for recognition to achieve the intelligent diagnosis of a self-priming centrifugal pump. Common methods include the K-nearest neighbor, naive Bayes, artificial neural network, and support vector machine (SVM) [24]. However, the K-nearest neighbor requires excessive computation when dealing with high-dimensional and large-scale data, and unbalanced data distributions and inappropriate parameter settings significantly affect its classification accuracy [25,26]. The naive Bayes method computes prior probabilities and operates under the assumption that the target attributes are conditionally independent of one another, which may not hold for certain fault features in rotating machinery [27]. The complexity of artificial neural networks increases with the input data dimension, making them prone to overfitting; moreover, they lack rigorous theoretical support, and their black-box nature makes them difficult to interpret [28,29,30,31].
When juxtaposed with the previously mentioned methods, the SVM classifier delivers superior classification outcomes and holds notable advantages in managing scenarios involving limited sample sizes and nonlinear data. This is achieved by striking an optimal equilibrium between learning model accuracy and complexity. As a result, the SVM finds extensive application in the intelligent fault diagnosis of rotating machinery. However, the efficacy of the SVM is constrained by two core parameters of the classifier, namely the kernel parameter g and the penalty factor C. Specifically, parameter g regulates the complexity of the feature subspace distribution, while parameter C gauges the proportion of misclassified samples and the complexity of the model [10]. To enhance the SVM's generalization ability and accuracy, the adaptive chaotic Aquila optimization support vector machine (ACAO-SVM) classifier was adopted, employing an adaptive optimization strategy [10]. The ACAO-SVM was further implemented in recognizing faults of self-priming centrifugal pumps.
The main contributions of this paper can be summarized as follows:
- (1)
The introduction of a novel intelligent fault diagnosis method tailored for self-priming centrifugal pumps, which integrates the refined time-shift multiscale fluctuation dispersion entropy, cosine pairwise-constrained supervised manifold mapping, and adaptive chaotic Aquila optimization support vector machine;
- (2)
The practical application of the proposed intelligent fault diagnosis method within the context of analyzing a self-priming centrifugal pump case. This endeavor serves the purpose of validating the method’s effectiveness;
- (3)
A comprehensive comparative analysis involving the proposed fault diagnosis method, various feature extraction techniques, feature dimensionality reduction methods, and existing intelligent fault diagnosis approaches. This comparative assessment is aimed at substantiating the method’s superior performance.
The organization of the remaining sections of the paper is as follows: Section 2 presents the theoretical basis and specific process of the proposed intelligent fault diagnosis method for self-priming centrifugal pumps. Section 3 conducts a case study on a self-priming centrifugal pump and compares the proposed method with existing methods of feature extraction, feature dimensionality reduction, and fault diagnosis. Lastly, Section 4 summarizes the paper and draws the key conclusions, emphasizing the significant contributions and implications of the proposed intelligent fault diagnosis method for self-priming centrifugal pumps.
2. Intelligent Fault Diagnosis Model for Self-Priming Centrifugal Pump
2.1. Proposed Intelligent Fault Diagnosis Model
Based on the RTSMFDE, CPCSMM, and ACAO-SVM, an intelligent fault diagnosis method for self-priming centrifugal pumps was devised. The methodology employed in this study involves a sequential process. Initially, the RTSMFDE method was employed for the extraction of fault-related information from the self-priming centrifugal pump. Subsequently, the CPCSMM method was applied to reduce the dimensionality of the RTSMFDE feature set, effectively isolating sensitive feature components. Lastly, the resulting low-dimensional feature set was fed into the ACAO-SVM classifier, facilitating intelligent fault recognition.
Figure 1 illustrates the process, while the specific steps are detailed as follows:
- (1)
Signal acquisition. A single sensor obtains the self-priming centrifugal pump signal under different operating conditions;
- (2)
Feature extraction. The RTSMFDE extracts the entropy features of each group of signal samples and constructs the fault feature vector in the entropy domain;
- (3)
Dimensionality reduction. The dimensionality of the extracted high-dimensional fault feature set from the RTSMFDE is reduced by the CPCSMM method, resulting in an entropy-manifold feature set that exhibits a high degree of fault differentiation;
- (4)
Fault identification. The training set is constructed by randomly selecting entropy-manifold feature vectors from the samples, while the remaining samples' entropy-manifold feature vectors are used as the test set. Both the training and test sets undergo normalization. The normalized training set is used to construct the predictive model. Subsequently, the normalized test set is fed into the predictive model for the intelligent fault diagnosis of self-priming centrifugal pumps.
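Step (4) can be sketched as a minimal, self-contained pipeline, assuming the entropy-manifold feature matrix has already been produced by the earlier steps; a nearest-centroid rule stands in for the ACAO-SVM classifier here, and all function names are illustrative:

```python
import numpy as np

def normalize(train, test):
    # Min-max normalization to [0, 1], fitted on the training set only
    lo, hi = train.min(axis=0), train.max(axis=0)
    scale = np.where(hi > lo, hi - lo, 1.0)
    return (train - lo) / scale, (test - lo) / scale

def diagnose(features, labels, train_ratio=0.7, seed=0):
    # Randomly split the entropy-manifold feature vectors into train/test sets
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(features))
    n_tr = int(train_ratio * len(features))
    tr, te = idx[:n_tr], idx[n_tr:]
    X_tr, X_te = normalize(features[tr], features[te])
    # Placeholder classifier: nearest class centroid (stand-in for ACAO-SVM)
    centroids = {c: X_tr[labels[tr] == c].mean(axis=0)
                 for c in np.unique(labels[tr])}
    preds = np.array([min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))
                      for x in X_te])
    return preds, labels[te]
```

Fitting the normalization on the training set alone mirrors the step above, in which the predictive model only ever sees normalized training data.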
2.2. Refined Time-Shift Multiscale Fluctuation Dispersion Entropy
The signal was set as X = {x(1), x(2), ..., x(N)} of length N, and the RTSMFDE process was as follows.
- (1)
Under the scale factor s, h (= s) time-shift multiscale decomposition sequences are constructed:

X_k^(s) = ( x(k), x(k+s), x(k+2s), ..., x(k + s⌊(N−k)/s⌋) ), k = 1, 2, ..., s

where the time-shift multiscale decomposition sequence reduces to the original signal when s = 1, and ⌊(N−k)/s⌋ is the nearest integer less than (N−k)/s.
- (2)
The normal cumulative distribution function is used to map each element x(j) of the time-shift multiscale subsequences to θ(j) ∈ (0, 1):

θ(j) = (1 / (σ√(2π))) ∫ from −∞ to x(j) of exp(−(t − μ)² / (2σ²)) dt

where σ and μ represent the standard deviation and mean of the subsequence, respectively.
- (3)
θ(j) is mapped to an integer index from 1 to c using a linear transformation:

z(j) = round(c · θ(j) + 0.5)

where round(·) represents the rounding function and c stands for the number of categories. Since step (2) uses the normal cumulative distribution function, the mapping process can still be considered nonlinear.
- (4)
The reconstructed sequence is obtained using phase space reconstruction:

z_i^(m,c) = ( z(i), z(i+t), ..., z(i+(m−1)t) ), i = 1, 2, ..., N_s − (m−1)t

where t represents the delay, m represents the embedding dimension, and N_s is the subsequence length.
- (5)
Fluctuation dispersion analysis is performed on the reconstructed sequence: each embedding vector z_i^(m,c) is assigned the fluctuation dispersion mode

π(v_0, v_1, ..., v_{m−2}), with v_k = z(i+(k+1)t) − z(i+kt), v_k ∈ {−c+1, ..., c−1}

i.e., the mode is defined by the differences between adjacent elements of the vector. The number of potential fluctuation dispersion modes assigned to each sequence is (2c − 1)^(m−1).
- (6)
The relative frequency of each fluctuation dispersion mode is calculated as follows:

p(π) = (number of embedding vectors with mode π) / (N_s − (m − 1)t)
- (7)
The average relative frequency of the h time-shift multiscale decomposition sequences with scale factor s is calculated in a refined way:

p̄(π) = (1/h) Σ from k = 1 to h of p_k(π)
- (8)
The RTSMFDE can be expressed as follows:

RTSMFDE(X, m, c, t, s) = − Σ over π of p̄(π) ln p̄(π)
As referenced in the literature [7], this paper sets the parameters of the RTSMFDE method as follows: N = 3000, m = 2, c = 6, t = 1, and s = 25.
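Steps (1)-(8) can be sketched compactly in Python. This is a simplified illustration rather than the authors' implementation: for brevity, the mean and standard deviation of the normal CDF are taken from the whole signal, and the refined averaging of step (7) is realized by pooling pattern counts over all time-shift subsequences before normalizing:

```python
import math
from collections import Counter
import numpy as np

def fde_pattern_counts(z, m, t):
    # Steps (4)-(5): embed the integer series and count fluctuation
    # dispersion modes (differences of successive mapped classes).
    counts = Counter()
    for i in range(len(z) - (m - 1) * t):
        emb = z[i:i + (m - 1) * t + 1:t]
        counts[tuple(np.diff(emb))] += 1
    return counts

def rtsmfde(x, s, m=2, c=6, t=1):
    # Refined time-shift multiscale fluctuation dispersion entropy at scale s
    x = np.asarray(x, float)
    mu, sigma = x.mean(), x.std()
    total = Counter()
    for k in range(s):                      # step (1): s time-shift subsequences
        sub = x[k::s]
        # Step (2): normal CDF mapping to (0, 1)
        theta = 0.5 * (1 + np.array([math.erf((v - mu) / (sigma * math.sqrt(2)))
                                     for v in sub]))
        # Step (3): linear map to integer classes 1..c
        z = np.clip(np.round(c * theta + 0.5), 1, c).astype(int)
        total += fde_pattern_counts(z, m, t)  # steps (4)-(6)
    # Step (7): refined averaging of mode frequencies over all subsequences
    p = np.array(list(total.values()), float)
    p /= p.sum()
    # Step (8): Shannon entropy of the averaged mode distribution
    return -np.sum(p * np.log(p))
```

For m = 2 and c = 6 there are (2c − 1)^(m−1) = 11 possible modes, so the entropy is bounded above by ln 11.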
2.3. Cosine Pairwise-Constrained Supervised Manifold Mapping
The utilization of the CPCSMM allows for the reduction of dimensionality in fault features. The process, demonstrated in Figure 2, entails the following specific steps:
- (1)
Cosine distance measurement
Traditional manifold learning methods typically use the Euclidean distance to measure sample similarity when constructing a neighborhood graph: the smaller the Euclidean distance between two samples, the greater their similarity, and vice versa.
The Euclidean distance measurement, however, is not without its shortcomings. Firstly, it is susceptible to the influence of dimensions, resulting in an uncertain value range. Secondly, when outliers are treated as near neighbors, the Euclidean distance metric proves inadequate in accurately depicting the connection between these isolated points and their other nearby neighbors, which can undermine the integrity of the neighborhood graph structure.
In comparison, the cosine distance measurement mitigates the impact of dimensions, maintains a fixed value range, and shows better robustness when dealing with outliers. Therefore, this paper uses the cosine distance to measure sample distance in high-dimensional space.
The cosine similarity between any two vectors A and B is:

cos(A, B) = (A · B) / (‖A‖ ‖B‖)

The cosine distance between vectors A and B is defined as:

d_cos(A, B) = 1 − cos(A, B)

Since cos(A, B) ∈ [−1, 1], the cosine distance lies in the range [0, 2].
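The two definitions above translate directly into code; a small check also illustrates the scale invariance that the Euclidean distance lacks:

```python
import numpy as np

def cosine_distance(a, b):
    # d(A, B) = 1 - cos(A, B); bounded in [0, 2], insensitive to magnitude
    a, b = np.asarray(a, float), np.asarray(b, float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Scaling a vector leaves the cosine distance unchanged, so the metric avoids the dimension-dependent magnitude effects discussed above.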
- (2)
Pairwise-constrained neighborhood graph
The data set V = {(x_i, l_i)} is provided, where x_i represents the sample points, l_i indicates the label category, and L represents the total number of categories. First, the pairwise constrained neighborhood graph is constructed, in which nearest-neighbor points fall into two constraint types, namely weak constraints and strong constraints. Specifically, if the sample points x_i and x_j have the same label category, the two points belong to the strong constraint type and are assigned the corresponding strong-constraint weight. If the sample points x_i and x_j have different label categories, the two points belong to the weak constraint type and are assigned the weak-constraint weight. In addition, the constraint set SSL is constructed for the strong-constraint samples, and the constraint set SWL is constructed for the weak-constraint samples, together forming the paired constraint set.
Based on the above definitions, a strongly constrained neighborhood graph is constructed for samples of the same label class, and a weakly constrained neighborhood graph is constructed for samples of different label categories.
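The split into the constraint sets SSL and SWL can be sketched as follows; this keeps only the same-label/different-label partition, while the strong- and weak-constraint edge weights of the original method are omitted:

```python
from itertools import combinations

def pairwise_constraints(labels):
    # Partition all sample pairs into the strong-constraint set SSL
    # (same label) and the weak-constraint set SWL (different labels).
    ssl, swl = [], []
    for i, j in combinations(range(len(labels)), 2):
        (ssl if labels[i] == labels[j] else swl).append((i, j))
    return ssl, swl
```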
- (3)
Supervised discriminant distance matrix
For any two points x_i and x_j on the neighborhood graph, the distance between them is defined according to their constraint type. One adjustment coefficient is used to curb the overgrowth of inter-class distance and is characterized as the average cosine distance between all the samples; a second adjustment factor is applied to the other constraint type. The supervised discriminant distance matrix of the dataset is then constructed based on the above analysis.
- (4)
Sparse global manifold structure
The distance matrix of the sparse global manifold structure is constructed based on the above theory. The detailed process is as follows.
The manifold topology of the original high-dimensional data set is approximated by randomly selecting some sparse points from the data set V, where the number of sparse points should be less than the total number of samples. The global manifold structure matrix between all the sample points and the sparse points is then constructed. Specifically, the manifold distance between a sparse point and a sample point is the corresponding supervised discriminant distance if the sample point is among the k nearest neighbors of the sparse point. Otherwise, the manifold distance between the two points is approximated by the shortest path between them, calculated using the Dijkstra algorithm.
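The shortest-path approximation can be sketched with a k-nearest-neighbor graph and a plain Dijkstra search; the k value and the input distance matrix are illustrative, and an optimized routine such as SciPy's csgraph implementation would normally be used instead:

```python
import heapq
import numpy as np

def knn_graph(D, k):
    # Keep each point's k smallest distances as graph edges (symmetrized);
    # missing edges are encoded as infinity.
    n = len(D)
    G = np.full((n, n), np.inf)
    for i in range(n):
        for j in np.argsort(D[i])[1:k + 1]:
            G[i, j] = G[j, i] = D[i, j]
    return G

def dijkstra(G, src):
    # Shortest-path (geodesic) distances from src over the neighborhood graph
    n = len(G)
    dist = np.full(n, np.inf)
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v in range(n):
            nd = d + G[u, v]
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```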
- (5)
Low-dimensional mapping results
The low-dimensional mapping result of the sparse point set is computed by building a centralized inner product matrix:

B = −(1/2) H S H

where S represents the element-wise square of the sparse-point manifold distance matrix and H represents the centering matrix.
The maximum d eigenvalues of the centralized inner product matrix are calculated, where d represents the intrinsic dimension. The i-th eigenvalue and its corresponding eigenvector are represented by λ_i and v_i, respectively. Then, the DR result of the sparse point set is:

Y = [ √λ_1 v_1, √λ_2 v_2, ..., √λ_d v_d ]
The low-dimensional mapping result of the other sample points (not sparse points) is calculated by projecting their squared manifold distances to the sparse points, centered by the column average matrix of the squared sparse-point distance matrix, onto the sparse-point embedding.
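The double-centering and eigendecomposition of step (5) follow the classical multidimensional-scaling recipe; a compact sketch on a precomputed sparse-point distance matrix (the matrix in the test is illustrative):

```python
import numpy as np

def sparse_point_embedding(Dr, d):
    # Double-center the squared distance matrix to obtain the inner-product
    # matrix B, then embed using the top-d eigenpairs (classical MDS).
    n = len(Dr)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * H @ (Dr ** 2) @ H
    w, V = np.linalg.eigh(B)                   # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:d]              # largest d eigenvalues
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))
```

When the input distances are exactly Euclidean, the embedding reproduces them; for manifold distances it gives the least-squares approximation used by the method.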
2.4. Adaptive Chaotic Aquila Optimization Support Vector Machine
The optimization process of the key parameters of the SVM classifier applies the adaptive chaotic Aquila optimization (ACAO) method, yielding the ACAO-SVM intelligent fault classification method. The process, depicted in Figure 3, is detailed through the following specific steps:
- (1)
Data preprocessing. The training and test sets are created by randomly dividing the input feature set. In addition, min-max normalization v′ = (v − min)/(max − min) is used to scale the training and test sets to [0, 1], where v′ and v represent the normalized and original eigenvalues, respectively, and min and max represent the minimum and maximum eigenvalues;
- (2)
Initializing the ACAO method parameters. The minimum population size Pmin is 5, the maximum population size Pmax is 30, the maximum number of iterations T is 200, the lower limit LB of the optimization problem is [0.001, 0.001], the upper limit UB is [100, 100], and the individual position of the Aquila encodes the SVM parameter pair (g, C);
- (3)
The population location is initialized by the tent chaotic mapping method (Equations (20) and (21)). In the Dim-dimensional space, the tent chaotic sequence Z with different trajectories is generated as:

Z(k+1) = 2Z(k), for 0 ≤ Z(k) ≤ 0.5; Z(k+1) = 2(1 − Z(k)), for 0.5 < Z(k) ≤ 1 (20)

The sequence Z is then carried into the solution space to generate the initial individual positions:

X = LB + Z (UB − LB) (21)
- (4)
Adaptive updating of Aquila's population size. Aquila's population size is adaptively updated using the linear reduction method (Equation (22)). The computational complexity of the Aquila optimizer (AO) is determined by the maximum number of iterations T, the optimization solution dimension Dim, and the population size P. Thus, to enhance the operational efficiency of the AO, the original constant population size strategy is replaced with a linear reduction adaptive population size update method:

P(t) = round( Pmax − (Pmax − Pmin) · t / T ) (22)

where t represents the current iteration, Pmin and Pmax represent the minimum and the maximum population size, respectively, and round(·) represents the integer function.
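Steps (2)-(4) can be sketched as follows, assuming the tent map, the carrier into the solution space, and the linear population reduction as reconstructed above; the initial chaotic value z0 is illustrative:

```python
import numpy as np

def tent_sequence(n, z0=0.37):
    # Tent chaotic map: z' = 2z for z <= 0.5, else 2(1 - z)
    z = np.empty(n)
    z[0] = z0
    for k in range(n - 1):
        z[k + 1] = 2 * z[k] if z[k] <= 0.5 else 2 * (1 - z[k])
    return z

def init_positions(pop, dim, lb, ub, z0=0.37):
    # Carry the chaotic sequence into the solution space: X = LB + Z (UB - LB)
    z = tent_sequence(pop * dim, z0).reshape(pop, dim)
    return lb + z * (ub - lb)

def population_size(t, T, pmin=5, pmax=30):
    # Linear reduction of the population from pmax down to pmin over T iterations
    return int(round(pmax - (pmax - pmin) * t / T))
```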
- (5)
The fitness value of each Aquila individual is calculated in the current iteration. The fitness value is defined as the average error classification rate after conducting a three-fold cross-validation on the training set. To achieve this, the training set is normalized and divided into three groups. One group is randomly selected as the sub-validation set, while the remaining two groups are treated as the training set, resulting in the creation of three models. The fitness value is obtained by calculating the average error classification rate of each model on its corresponding validation set. The current target prey position is identified as the position of the Aquila individual with the lowest fitness value in the current iteration. Consequently, the optimization process of the SVM classifier parameters aims to discover the global minimum fitness value;
- (6)
Updating the individual position of the Aquila. The strategies include soaring at high altitudes with vertical dives, glide attacks at close range, contour flying, slow descent attacks at low altitudes, and stalking and capturing prey, as detailed in Equations (23)–(26):
where rand represents a random number in [0, 1], X_best(t) represents the best individual position at iteration t (i.e., the prey position), and X_M(t) denotes the mean position of all individuals in the current iteration.
The Aquila hovers over its prey, preparing to land and launch an attack. This process is called contour flight for a short glide attack and can be mathematically expressed as follows:
where u and v are random numbers between [0, 1], D1 is an integer from 1 to Dim, and r1 is the number of search cycles between 1 and 20.
When the prey area is roughly determined, the Aquila descends vertically at low altitude and performs a preliminary attack to probe the prey's response. This slow-descent attack can be expressed mathematically as follows:
According to the random target movement, the Aquila walks on the ground to attack and capture the prey. The corresponding mathematical expression is as follows:
where QF(t) represents the quality function at iteration t, G1 is the moving parameter, and G2 is the flight slope.
- (7)
Evaluating the position of individual Aquilas and the prey. In the present iteration, if either the fitness values of the individual or the prey surpass their historical values, the original positions of the individual or the prey should be replaced with the updated positions. Alternatively, if the historical positions of either the individual or the prey are superior in terms of fitness values, these historical positions are retained;
- (8)
Determining whether the iteration is terminated. If the maximum number of iterations is reached, the entire cycle is halted. Otherwise, steps (4)–(7) are iteratively repeated until the specified condition is satisfied;
- (9)
Determining the final prey location. At the termination of the iteration, the final captured prey position is determined by outputting the location of the best individual in the Aquila population;
- (10)
Establishing the SVM prediction model. The SVM prediction model is established according to the parameter optimization result, i.e., the optimal (g, C);
- (11)
Sample classification. The normalized test set is fed into the SVM prediction model for intelligent classification, which then generates the predicted fault type for the test samples.
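The fitness function of step (5) can be sketched with scikit-learn as a stand-in for the paper's SVM implementation; SVC's gamma parameter plays the role of the kernel parameter g:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def fitness(g, C, X_train, y_train):
    # Average misclassification rate under three-fold cross-validation
    # on the (already normalized) training set; ACAO minimizes this value.
    model = SVC(kernel="rbf", gamma=g, C=C)
    acc = cross_val_score(model, X_train, y_train, cv=3)
    return 1.0 - acc.mean()
```

Each Aquila position (g, C) is scored by this function; the lowest value found over all iterations defines the prey position and hence the final SVM parameters.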