1. Introduction
Piston air compressors are widely used as typical volumetric compressors in industrial processes. Owing to their complex motion and diverse fault mechanisms, they exhibit various forms of failure in service, with over 60% of failures occurring in the gas valves. However, the state characteristics of reciprocating air compressors are often obscured by the impact components of their vibration signals. Because these signals are nonlinear, non-stationary, and exhibit multi-component coupling, traditional analysis methods have many limitations, which makes effective fault diagnosis difficult [1,2].
Currently, the mainstream methods for processing and modeling vibration signals still rely on time–frequency domain analysis. However, classical time–frequency domain methods, such as those listed in Table 1 [3,4,5,6,7], do not perform satisfactorily when dealing with complex nonlinear systems. The widely used Empirical Mode Decomposition (EMD) [8] suffers from endpoint effects, mode mixing, over- or under-sifting, and a lack of robustness against noise. In contrast, the Symplectic Geometric Modal Decomposition (SGMD) [9] method, which is based on symplectic geometry spectral analysis [10], efficiently captures nonlinear relationships in the data, reduces dimensionality, and extracts features. This signal processing technique preserves the integrity of the original time series while exhibiting good decomposition performance and adaptability. SGMD has also been applied in bearing fault studies [11], which provide a preliminary demonstration of its feasibility. However, limited by the high computational complexity of differential geometry and the lack of adaptive dimensionality selection, SGMD is less reliable when facing non-stationary signals.
After decomposing the original sequence, introducing entropy algorithms to assess the complexity of each decomposition component and reorganizing the components based on entropy similarity can reduce the modeling complexity [12]. Researchers often employ single-scale entropy algorithms such as sample entropy [13], permutation entropy [14], and fuzzy entropy [15]. By comparison, the refined composite multiscale dispersion entropy (RCMDE) offers unique advantages. It effectively analyzes nonlinear and non-stationary signals across different scales and time transformations, providing richer information [16]. Compared to other algorithms, RCMDE demonstrates better robustness and is applicable to various signal types. Therefore, selecting RCMDE for signal analysis yields more accurate and comprehensive feature extraction results. However, RCMDE alone cannot guarantee a sufficiently high recognition rate. Thus, this paper additionally uses the amplitude variation index and the peak-to-peak value, which best reflect reciprocating compressor fault characteristics, to assist in fault identification [17].
In this study, the SaDE-ELM method is proposed for pattern recognition. Compared to methods such as Support Vector Machines (SVMs) [18], Artificial Neural Networks (ANNs) [19], and Decision Trees, SaDE-ELM not only demonstrates a faster training speed and better generalization ability but also allows the model structure to be simplified, reducing the complexity of hyperparameter tuning. Especially for small- to medium-scale pattern recognition problems, SaDE-ELM may be considered an efficient choice [20]. For large-scale complex datasets and deep learning tasks, however, the appropriate pattern recognition method needs to be selected based on the specific circumstances.
Accordingly, this study investigates the challenges that traditional methods encounter in extracting fault features from reciprocating air compressors. A diagnosis model based on the Artificial Bee Colony algorithm and Symplectic Geometric Modal Decomposition (ABC-SGMD) is proposed, and its effectiveness is validated through numerical simulations and experiments. The structure of the paper is outlined as follows:
In Section 2, the theoretical algorithms are introduced for optimizing Symplectic Geometric Modal Decomposition (SGMD) using the Artificial Bee Colony (ABC) algorithm, alongside the development of a multi-feature fusion model and the pattern recognition method.
Section 3 focuses on Numerical Simulation Verification, where fault mechanisms are analyzed and simulation signals of air compressor valve faults are generated. This section compares the ABC-SGMD method with other approaches in decomposing the simulated signals.
Section 4, Fault Simulation Experiment, demonstrates the effectiveness of the proposed method on a laboratory reciprocating air compressor fault model. The results indicate that the method addresses the difficulties that traditional methods have in handling nonlinear, non-stationary signals and reduces the decomposition time.
2. Theoretical Algorithms
2.1. Symplectic Geometry Modal Decomposition (SGMD)
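The main implementation process of Symplectic Geometry Modal Decomposition (SGMD) is as follows [21]:
Firstly, apply the phase space reconstruction method to any given original signal, such as $x = (x_1, x_2, \ldots, x_n)$, where $n$ is the length of the data. This signal is reconstructed based on Takens’ theorem [22], resulting in a trajectory matrix $X$:
$$X = \begin{bmatrix} x_1 & x_{1+\tau} & \cdots & x_{1+(d-1)\tau} \\ \vdots & \vdots & & \vdots \\ x_m & x_{m+\tau} & \cdots & x_{m+(d-1)\tau} \end{bmatrix}$$
where $d$ is the embedding dimension, $\tau$ is the delay time, and $m = n - (d-1)\tau$.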
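Then, the covariance symmetric matrix $A = X^{T}X$ is formed, and the Hamiltonian matrix $M$ is constructed from it:
$$M = \begin{bmatrix} A & 0 \\ 0 & -A^{T} \end{bmatrix}$$
Let $N = M^{2}$; both $M$ and $N$ are Hamiltonian matrices. A symplectic orthogonal matrix $Q$ is then constructed through the QR-type decomposition such that
$$Q^{T} N Q = \begin{bmatrix} B & R \\ 0 & B^{T} \end{bmatrix}$$
where $B$ is an upper triangular matrix with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_d$. According to the properties of the Hamiltonian matrices, $\sigma_i = \sqrt{\lambda_i}$ $(i = 1, 2, \ldots, d)$ are the eigenvalues of matrix $A$, and the corresponding eigenvectors of matrix $A$ are denoted as $Q_i$. Also, denote $S = Q^{T}X^{T}$ and $Z = QS$, where $S$ represents the transformation coefficient matrix, and $Z$ represents the reconstructed trajectory matrix.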
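Finally, using diagonal averaging, the reconstructed matrix $Z$ is transformed into a set of symplectic geometry components (SGCs) with a length of $n$. Write $Z = Z_1 + Z_2 + \cdots + Z_d$, where $Z_i$ is the part of $Z$ associated with the eigenvector $Q_i$, and denote the elements of any $Z_i$ as $z_{pq}$, with $1 \le p \le d$ and $1 \le q \le m$. Let $d^{*} = \min(m, d)$, $m^{*} = \max(m, d)$, and $n = m + (d-1)\tau$. If $m < d$, then $z^{*}_{pq} = z_{pq}$; otherwise, $z^{*}_{pq} = z_{qp}$. The averaged diagonal transformation matrix is
$$y_k = \begin{cases} \dfrac{1}{k} \displaystyle\sum_{p=1}^{k} z^{*}_{p,\,k-p+1}, & 1 \le k < d^{*} \\[2ex] \dfrac{1}{d^{*}} \displaystyle\sum_{p=1}^{d^{*}} z^{*}_{p,\,k-p+1}, & d^{*} \le k \le m^{*} \\[2ex] \dfrac{1}{n-k+1} \displaystyle\sum_{p=k-m^{*}+1}^{n-m^{*}+1} z^{*}_{p,\,k-p+1}, & m^{*} < k \le n \end{cases} \tag{5}$$
According to Equation (5), the $d$ sets of initial SGC components can be determined as $Y_1, Y_2, \ldots, Y_d$, where $Y_i = (y_1, y_2, \ldots, y_n)$ is the series obtained from $Z_i$.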
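The three steps above (embedding, eigen-decomposition of $A = X^{T}X$, and diagonal averaging) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions rather than the implementation used in this study: the delay is fixed at $\tau = 1$, the eigenvectors of $A$ are obtained with `numpy.linalg.eigh` instead of the symplectic QR route, and the function names (`trajectory_matrix`, `diagonal_average`, `initial_components`) are hypothetical.

```python
import numpy as np

def trajectory_matrix(x, d):
    """Embed a 1-D signal into an m x d trajectory matrix (delay tau = 1 assumed)."""
    x = np.asarray(x, dtype=float)
    m = x.size - (d - 1)
    # Column j holds the signal shifted by j samples.
    return np.stack([x[j:j + m] for j in range(d)], axis=1)

def diagonal_average(Z):
    """Average each anti-diagonal of an m x d matrix into a series of length m + d - 1."""
    m, d = Z.shape
    total = np.zeros(m + d - 1)
    count = np.zeros(m + d - 1)
    for j in range(d):
        # Element Z[i, j] lies on anti-diagonal i + j.
        total[j:j + m] += Z[:, j]
        count[j:j + m] += 1
    return total / count

def initial_components(x, d):
    """Return d initial components, one per eigenvector of A = X^T X."""
    X = trajectory_matrix(x, d)
    A = X.T @ X
    eigvals, Q = np.linalg.eigh(A)          # assumption: plain symmetric eigen-solver
    order = np.argsort(eigvals)[::-1]       # largest eigenvalue first
    comps = []
    for i in order:
        q = Q[:, i]
        Z_i = np.outer(X @ q, q)            # rank-one part of X along eigenvector q
        comps.append(diagonal_average(Z_i))
    return np.array(comps)
```

Because the eigenvectors form an orthonormal basis, the components returned by `initial_components` sum back to the input signal, which is a convenient sanity check before any recombination step.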
The initial SGC components are not completely independent, so it is necessary to recombine them. Thus, the d sets of the original SGC components are filtered and recombined.
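The energy of the symplectic geometry component $Y_i$, whose $k$-th sample is written $y_i(k)$, is given by
$$E_i = \sum_{k=1}^{n} \left| y_i(k) \right|^{2}$$
The energy entropy increment for each component is calculated as
$$\Delta q_i = -p_i \log p_i$$
where $p_i = E_i / E$ represents the proportion of the $i$-th component in the total energy $E$, and $E = \sum_{i=1}^{d} E_i$.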
The specific process is shown in Figure 1.
2.2. ABC-Optimized SGMD
The Artificial Bee Colony (ABC) algorithm is a heuristic optimization algorithm inspired by the foraging behavior of bees [23]. It is a swarm intelligence-based algorithm that simulates the search behavior of bees to find the optimal solution. The basic idea is to view the problem to be optimized as the process of bees searching for food. In the algorithm, bees are divided into three roles: employed bees, onlooker bees, and scout bees. Each bee represents a solution (parameter combination) and finds a better solution through search and information exchange.
In the ABC algorithm, the colony size refers to the total number of employed bees and onlooker bees, which affects the search capability. The maximum number of cycles determines the total running time of the algorithm, and the search range determines the region of the solution space that is explored: if it is too small, the global optimum might be missed, while, if it is too large, the convergence speed might be reduced. The search step size is the extent of the solution update in each iteration: if it is too small, convergence will be slow, while, if it is too large, the optimal solution might be overshot. The specific implementation steps of Symplectic Geometry Mode Decomposition (SGMD) optimized by ABC are as follows:
Initialization: A group of initial solutions (bees) is randomly generated, with each bee representing a parameter combination of SGMD. Simultaneously, each bee is assigned a fitness value to assess its impact on the system’s reconstruction.
SGMD and Reconstruction: According to the parameter combination represented by the current bee, perform SGMD and reconstruction to obtain the reconstruction result of the system.
Fitness Evaluation: Calculate the corresponding fitness value based on the reconstruction result of SGMD. The fitness value can be evaluated based on the difference between the reconstruction result and the original data, the frequency range of the mode, and other indicators to measure the quality of the reconstruction effect.
Employed Bee Stage: Each bee improves the parameters through a local search based on the current parameter combination. The bee will choose a neighborhood solution for the search and calculate the fitness value after the search. If the solution obtained by the search is better, the bee’s parameter combination will be updated to the new solution.
Onlooker Bee Stage: Each bee will observe the parameter combinations of other bees and choose the parameter combination with the best fitness value as a reference.
Scout Bee Stage: If a bee has not found a better solution than the current parameter combination after a certain number of searches, it is regarded as a scout bee, and its parameter combination is randomly regenerated.
Follower Bee Stage: Each bee will follow the parameter combination with the best fitness value observed before and conduct further searches. If a better parameter combination is found, update the current solution.
Selection Bee Stage: According to the fitness value of each bee, select the parameter combination with the best fitness value as the global optimal solution.
Termination Judgment: Determine whether to terminate the algorithm according to the preset termination conditions (such as reaching the maximum number of iterations or reaching the target fitness). If the termination condition is not met, return to step 4 (the Employed Bee Stage) to continue optimization.
By utilizing the ABC algorithm, the SGMD can efficiently navigate the parameter space, finding the optimal or near-optimal parameters that enhance the performance of the decomposition process, even in the presence of noise or other complexities in the data. This results in more accurate and robust signal decomposition, which is crucial for the subsequent simulation and experimental analysis.
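The employed, onlooker, and scout stages described above can be sketched in Python as follows. This is a generic, minimal ABC loop under stated assumptions, not the configuration used in this study: the function name `abc_minimize`, the `limit` parameter, and the fitness transform are illustrative choices, and the `cost` callable stands in for whatever SGMD reconstruction criterion is being minimized.

```python
import numpy as np

def abc_minimize(cost, lower, upper, n_food=10, limit=20, max_cycles=100, seed=0):
    """Minimal Artificial Bee Colony search for the minimum of cost(p) within box bounds."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    foods = rng.uniform(lower, upper, size=(n_food, dim))   # candidate parameter sets
    costs = np.array([cost(f) for f in foods])
    trials = np.zeros(n_food, dtype=int)                    # stagnation counters
    best_i = int(np.argmin(costs))
    best, best_cost = foods[best_i].copy(), float(costs[best_i])

    def try_neighbour(i):
        # Perturb one coordinate relative to a randomly chosen other source.
        k = rng.choice([s for s in range(n_food) if s != i])
        j = rng.integers(dim)
        cand = foods[i].copy()
        cand[j] += rng.uniform(-1.0, 1.0) * (foods[i, j] - foods[k, j])
        cand = np.clip(cand, lower, upper)
        c = cost(cand)
        if c < costs[i]:                                    # greedy selection
            foods[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(max_cycles):
        for i in range(n_food):                             # employed bee stage
            try_neighbour(i)
        fitness = 1.0 / (1.0 + costs - costs.min())         # assumed fitness transform
        probs = fitness / fitness.sum()
        for i in rng.choice(n_food, size=n_food, p=probs):  # onlooker bee stage
            try_neighbour(int(i))
        worn = int(np.argmax(trials))                       # scout bee stage
        if trials[worn] > limit:
            foods[worn] = rng.uniform(lower, upper)
            costs[worn] = cost(foods[worn])
            trials[worn] = 0
        i = int(np.argmin(costs))                           # remember the best solution
        if costs[i] < best_cost:
            best, best_cost = foods[i].copy(), float(costs[i])
    return best, best_cost
```

In an SGMD setting, `cost` would evaluate a decomposition for a given parameter combination; here a simple quadratic such as `lambda p: float(np.sum((p - 1.0) ** 2))` is enough to exercise the loop.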
2.3. Refined Composite Multiscale Dispersion Entropy (RCMDE)
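In the RCMDE algorithm, for each scale factor $\tau$, $\tau$ coarse-grained sequences are generated, corresponding to different starting points of the coarse-graining process. The RCMDE value is then determined from the average of the dispersion pattern probabilities of these coarse-grained sequences [24].
The $k$-th coarse-grained sequence corresponding to the signal $u = \{u_1, u_2, \ldots, u_L\}$ is $x_k^{(\tau)} = \{x_{k,1}^{(\tau)}, x_{k,2}^{(\tau)}, \ldots\}$, with elements as follows:
$$x_{k,j}^{(\tau)} = \frac{1}{\tau} \sum_{b=k+\tau(j-1)}^{k+\tau j-1} u_b, \quad 1 \le j \le \left\lfloor \frac{L}{\tau} \right\rfloor, \quad 1 \le k \le \tau$$
The RCMDE value under the scale $\tau$ is calculated in the following way:
$$\mathrm{RCMDE}(u, m, c, d, \tau) = -\sum_{\pi=1}^{c^{m}} \bar{p}\left(\pi_{v_0 \ldots v_{m-1}}\right) \ln \bar{p}\left(\pi_{v_0 \ldots v_{m-1}}\right)$$
where $\mathrm{RCMDE}(u, m, c, d, \tau)$ represents the RCMDE value under the scale $\tau$; $m$, $c$, and $d$ denote the embedding dimension, number of classes, and time delay of the dispersion entropy, respectively; and $\bar{p}\left(\pi_{v_0 \ldots v_{m-1}}\right)$ is the average of the dispersion pattern probabilities corresponding to the coarse-grained sequences, and it is represented in Equation (11):
$$\bar{p}\left(\pi_{v_0 \ldots v_{m-1}}\right) = \frac{1}{\tau} \sum_{k=1}^{\tau} p_k^{(\tau)} \tag{11}$$
where $p_k^{(\tau)}$ represents the probability of the dispersion pattern corresponding to the $k$-th coarse-grained sequence under the scale $\tau$.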
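The averaging of pattern probabilities over shifted coarse-grained sequences can be made concrete with a short Python sketch. It assumes the common dispersion entropy construction (normal-CDF mapping to $c$ classes, using the mean and standard deviation of the original signal); the function names and the default values `m=2`, `c=6`, `d=1`, and `scale=3` are illustrative assumptions, not the settings used in this study.

```python
import math
import numpy as np

def pattern_probabilities(y, mu, sigma, m, c, d):
    """Relative frequency of each of the c**m dispersion patterns in series y."""
    # Map samples to (0, 1) with the normal CDF, then to integer classes 1..c.
    cdf = np.array([0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0)))) for v in y])
    classes = np.clip(np.round(c * cdf + 0.5), 1, c).astype(int)
    n_patterns = classes.size - (m - 1) * d
    # Encode each length-m pattern as a base-c integer code.
    codes = np.zeros(n_patterns, dtype=int)
    for j in range(m):
        codes = codes * c + (classes[j * d:j * d + n_patterns] - 1)
    return np.bincount(codes, minlength=c ** m) / n_patterns

def rcmde(u, m=2, c=6, d=1, scale=3):
    """Refined composite multiscale dispersion entropy at one scale factor."""
    u = np.asarray(u, dtype=float)
    mu, sigma = u.mean(), u.std()           # statistics of the original signal
    probs = []
    for k in range(scale):                  # one coarse-grained series per start point
        n_seg = (u.size - k) // scale
        coarse = u[k:k + n_seg * scale].reshape(n_seg, scale).mean(axis=1)
        probs.append(pattern_probabilities(coarse, mu, sigma, m, c, d))
    p_bar = np.mean(probs, axis=0)          # average the pattern probabilities first
    p_bar = p_bar[p_bar > 0]
    return float(-np.sum(p_bar * np.log(p_bar)))
```

The value is bounded above by $\ln(c^{m})$, and a regular signal such as a slow sinusoid should give a lower value than white noise of the same length.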
2.4. Adaptive Differential Evolution Extreme Learning Machine
The Extreme Learning Machine (ELM) is a single-hidden-layer feedforward neural network model [25]. Its basic idea is to randomly initialize the weights between the input layer and the hidden layer, together with the hidden-layer biases, and then directly compute the weights between the hidden layer and the output layer by minimizing the mean square error. Because the hidden-layer parameters are fixed after random initialization and need no iterative tuning, training is fast. Under certain conditions, ELM can achieve comparable or even better performance in a shorter training time.
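A basic ELM of this kind can be sketched in a few lines of Python. The sketch below is illustrative only: the `tanh` activation, the number of hidden nodes, and the function names `elm_train` and `elm_predict` are assumptions, and the output weights are obtained with the Moore-Penrose pseudo-inverse as the least-squares solution.

```python
import numpy as np

def elm_train(X, y, n_hidden=20, seed=0):
    """Train a basic ELM classifier; y holds integer class labels 0..K-1."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never tuned
    b = rng.normal(size=n_hidden)                 # random hidden biases, never tuned
    H = np.tanh(X @ W + b)                        # hidden-layer output matrix
    T = np.eye(int(y.max()) + 1)[y]               # one-hot targets
    beta = np.linalg.pinv(H) @ T                  # least-squares output weights
    return W, b, beta

def elm_predict(model, X):
    """Return the predicted class label for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)
```

Only `beta` is learned, and in a single linear-algebra step, which is where the speed advantage comes from; an evolutionary optimizer such as SaDE would instead search over `W` and `b`.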
Self-Adaptive Differential Evolution (SaDE) is an evolutionary algorithm designed for global optimization problems. The SaDE algorithm simulates the evolutionary process in nature, utilizing differential mutation and crossover operations to search for the optimal solution. It is characterized by fast convergence and strong global search capabilities.
The SaDE-ELM method combines the ELM and SaDE algorithms, using SaDE to optimize the parameters of the ELM network, including the weights and biases of the hidden layer nodes, as well as the weights of the output layer. During the optimization process, the network parameters are gradually improved as the population evolves iteratively through selection, crossover, and mutation operations, resulting in a better model fitting ability and generalization ability [
26].
The core of the SaDE-ELM method is the use of the SaDE algorithm to search for better parameter combinations, which improves the performance of the Extreme Learning Machine. Compared to the traditional ELM method, the SaDE-ELM model is more likely to avoid local optimal solutions and has stronger global search capabilities. In addition, the SaDE-ELM method trains quickly, making it suitable for handling large-scale datasets and complex tasks.
2.5. Algorithm Model
The method proposed in this study for diagnosing the valve faults of reciprocating air compressors based on Symplectic Geometry Mode Decomposition (SGMD) and an improved Extreme Learning Machine (ELM) is as follows:
The data collection system is set up to collect vibration signals under various typical faults on the reciprocating air compressor.
The ABC-SGMD method is used to decompose the normal and various fault signals of the reciprocating air compressor, obtaining multiple IMF component signals, and suitable components are selected through the kurtosis value method and correlation coefficient method.
Several feature values, including the refined composite multiscale dispersion entropy (RCMDE), are calculated to construct a multi-feature fusion model.
Subsequently, the multi-feature fusion model is input into the SaDE-ELM to obtain the fault diagnosis results of the reciprocating air compressor.
The main technical route is shown in Figure 2.
3. Numerical Simulation Verification
In order to preliminarily validate the superiority of the ABC-SGMD model, a typical simulated signal is constructed and processed. A simulated impact signal of a reciprocating air compressor under normal operation is generated, and strong white noise is added to simulate the actual signal [27]. The expression of the simulated signal is as follows:
$$s(t) = \sum_{i} A_i\, h(t - iT) + n(t)$$
The signal $s(t)$ represents the total signal of the air compressor faults, where $A_i$ is the amplitude of the $i$-th signal, and $h(t)$ is an exponentially decaying sinusoid representing the characteristic vibration of the air compressor valve. $T$ is the signal period, and $n(t)$ is additive Gaussian white noise representing random noise components in the signal.
$$h(t) = e^{-Ct} \cos(2\pi f_n t)$$
$h(t)$ is a simulated signal of a single compressor valve fault, where $e^{-Ct}$ is the decaying exponential part, and $\cos(2\pi f_n t)$ is a cosine wave with frequency $f_n$ representing the vibration mode of the valve.
$A_i$ represents the variation of amplitude over time, where $A_0$ is the baseline amplitude, and $\cos(2\pi f_r t)$ is a cosine wave with frequency $f_r$ representing the periodic variation of the amplitude, possibly due to changes in the vibration modes caused by the air compressor faults.
The specific data are shown in Table 2.
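A signal of this type can be generated with a short Python sketch. All numerical values below (sampling rate, $T$, $C$, $f_n$, $A_0$, $f_r$, and the noise level) are hypothetical placeholders rather than the values listed in Table 2, and the amplitude law $A_i = 1 + A_0 \cos(2\pi f_r\, iT)$ is an assumed form chosen only for illustration.

```python
import numpy as np

def simulate_valve_signal(fs=2000.0, duration=1.0, T=0.1, C=80.0, f_n=300.0,
                          A0=0.3, f_r=3.0, noise_std=0.2, seed=0):
    """Sum of periodically repeated decaying cosines plus Gaussian white noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(fs * duration)) / fs
    clean = np.zeros_like(t)
    n_impacts = int(round(duration / T))
    for i in range(n_impacts):
        tau = t - i * T                    # time since the i-th impact
        active = tau >= 0                  # the impulse response is causal
        # Assumed amplitude law: slow cosine modulation around 1.
        A_i = 1.0 + A0 * np.cos(2 * np.pi * f_r * i * T)
        clean[active] += A_i * np.exp(-C * tau[active]) * np.cos(2 * np.pi * f_n * tau[active])
    noisy = clean + noise_std * rng.standard_normal(t.size)
    return t, clean, noisy
```

Returning both the clean and the noisy series makes it straightforward to compare a decomposition of the noisy signal against the known noise-free components.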
The components and the synthesized simulation signal are shown in Figure 3a, and the spectrum of the components is shown in Figure 3b. Next, a noise component with an amplitude of 0.2 is added to the simulated signal to mimic the environmental interference under actual working conditions, and a comparison between the noise-added signal and the original signal is shown in Figure 3c,d.
The signal is first decomposed using the SGMD method, and the spectrum of each component is obtained with the Fourier transform, from which the match between the characteristic frequencies of the decomposed signal and the original frequencies is observed. Additionally, the signal is decomposed using the EMD method, with both methods set to a decomposition order of 6. The decomposition results are compared, as shown in Figure 3. It can be observed that both methods successfully decompose the characteristic frequencies of the simulated signal at 6, 26, and 51 Hz. However, EMD does not perform as well as SGMD when decomposing signals with noise: the 6 Hz signal is submerged in noise, as shown in Figure 4a, whereas SGMD provides clearer decomposition components. However, the number of decomposition modes in SGMD is manually set, and it can be observed from Figure 4b that multiple redundant components are produced, which greatly affects the efficiency and results of the decomposition.
Therefore, the Artificial Bee Colony-optimized Symplectic Geometry Mode Decomposition (ABC-SGMD) is proposed. The number of bee individuals is set to 50, the search range is set to 10, the search step size is 0.1, and the maximum number of iterations is 50. The simulated signal is decomposed, and the spectrum is obtained through Fourier transform, as shown in Figure 5.
The decomposed components are clearly distinguishable and are not distorted by noise or excessive redundancy, which indicates the potential of the method for accurate and meaningful decomposition.
4. Fault Simulation Experiment
To further validate the effectiveness of the aforementioned method in processing normal and faulty data signals from reciprocating air compressor valves under actual operating conditions, the reciprocating air compressor valve experimental simulation bench in our laboratory was used, as shown in Figure 6. At a speed of 20 revolutions per second, three operating conditions of the valve were simulated (normal valve, intake valve wear, and exhaust valve spring failure), as illustrated in Figure 7.
The vibration signals from the intake valve under different operating conditions are collected and analyzed in the time domain, resulting in the time domain plot shown in Figure 8a. This plot reveals the changing patterns and characteristics of the vibration signals: compared to signals under normal conditions, the signals under spring failure conditions exhibit more pronounced impact characteristics in each cycle because of the spring’s reduced ability to absorb vibration and impact energy, resulting in larger amplitude fluctuations. In contrast, under intake valve wear conditions, the irregular entry of gas into the cylinder results in denser, lower-amplitude noise signals. At the same time, potential leakage may cause some signal amplitudes to be lower than those of the original signal. To gain a deeper understanding of the frequency domain characteristics of the vibration signals, a Fourier transform is performed on the collected signals, as shown in Figure 8b. Here, the horizontal axis represents the frequency, and the vertical axis represents the signal amplitude. In this paper, the frequency distribution of the signals is analyzed to understand the operational state and vibration characteristics of the gas valve.
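A one-sided amplitude spectrum with frequency on the horizontal axis and amplitude on the vertical axis can be computed as in the following Python sketch. The function name `amplitude_spectrum` and the scaling convention (peak amplitude of each sinusoidal component) are illustrative assumptions; the sampling rate of the experimental data is not restated here and must be supplied by the user.

```python
import numpy as np

def amplitude_spectrum(x, fs):
    """Return (frequencies in Hz, one-sided amplitude spectrum) of a real signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spec = np.abs(np.fft.rfft(x)) / n
    # Double every bin that has a mirrored negative-frequency partner.
    if n % 2 == 0:
        spec[1:-1] *= 2.0      # leave the DC and Nyquist bins unscaled
    else:
        spec[1:] *= 2.0        # leave the DC bin unscaled
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, spec
```

With this scaling, a pure sinusoid whose frequency falls exactly on a bin appears as a single peak whose height equals its amplitude.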
First, the ABC-SGMD method is utilized to decompose the vibration signals under the three working conditions into a series of Intrinsic Mode Function (IMF) components, each of which corresponds to a local vibration component. With the proposed method, the different frequency and amplitude components of the signal can be distinguished, facilitating a more detailed analysis of its frequency domain characteristics. The IMF components obtained with the ABC-SGMD procedure are shown in Figure 9, which illustrates the local vibration characteristics and frequency components of the vibration signals under each working condition. By analyzing these components, a deeper understanding of the complex frequency domain structure of the vibration signals under each working condition can be gained, thereby providing a robust basis for further signal processing and feature extraction.
Here, the signal is decomposed into various components, and the kurtosis value of each component and its correlation coefficient with the original signal are calculated. According to the selection criteria, components with a kurtosis value greater than 3 and a correlation coefficient greater than 0.6 are retained [28], and the signal reconstructed from the retained components is taken as the object of further analysis in this study.
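The selection rule can be written compactly in Python. The sketch below uses the thresholds stated in the text (kurtosis greater than 3, correlation coefficient greater than 0.6); the function names, the use of the non-excess (Pearson) kurtosis, and the choice to reconstruct by summing the retained components are assumptions made for illustration.

```python
import numpy as np

def kurtosis(x):
    """Pearson kurtosis (a Gaussian signal gives about 3)."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    return float(np.mean(centred ** 4) / np.mean(centred ** 2) ** 2)

def select_components(components, original, kurt_min=3.0, corr_min=0.6):
    """Keep components that are both impulsive and well correlated with the original."""
    original = np.asarray(original, dtype=float)
    keep = []
    for i, comp in enumerate(components):
        corr = np.corrcoef(comp, original)[0, 1]
        if kurtosis(comp) > kurt_min and corr > corr_min:
            keep.append(i)
    if keep:
        # Assumed reconstruction: sum of the retained components.
        reconstructed = np.sum([components[i] for i in keep], axis=0)
    else:
        reconstructed = np.zeros_like(original)
    return keep, reconstructed
```

For example, an impulse train embedded in a weak sinusoid passes both tests, while the sinusoid alone fails the kurtosis test because its kurtosis is about 1.5.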
Based on a previous analysis, it was concluded that the impact characteristics resulting from air reduction can be effectively captured by extracting RCMDE, ydz, and the peak-to-peak value Vpp from the decomposition results of the initial vibration signal. To observe the differences in the feature values under varying working conditions, three sets of data were randomly selected, and their feature values are compared in Table 3. This analysis of the signal component characteristics under different working conditions provides a foundation for the subsequent fault diagnosis.
It can be observed that there are significant differences in the feature values among the different working conditions. Accordingly, the fault category can be determined by inputting these feature values into the Extreme Learning Machine (ELM). The ELM output results for the signals processed by SGMD and EMD are shown in Figure 10.
From Figure 10, it is observed that, compared to the traditional EMD method, the SGMD method improves the accuracy by approximately 3%. Furthermore, with the ABC-SGMD method, in which SGMD is improved by the Artificial Bee Colony algorithm, the accuracy increases by an additional 13%.
To eliminate the influence of random factors, training and recognition were repeated multiple times in this study, with 80% of the data selected as training samples and 20% as test samples each time. For both scenarios, the average of the final recognition results is presented in Table 4.
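The repeated 80/20 evaluation protocol can be sketched in Python as follows. The function `repeated_holdout` and the nearest-centroid classifier used as a stand-in are hypothetical; in the setting of this study, the `fit_predict` callable would wrap the actual classifier, and the number of repetitions is an assumed placeholder.

```python
import numpy as np

def repeated_holdout(X, y, fit_predict, n_repeats=10, train_frac=0.8, seed=0):
    """Average test accuracy over repeated random train/test splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_train = int(round(train_frac * n))
    accuracies = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)                    # fresh random split each time
        train, test = idx[:n_train], idx[n_train:]
        predicted = fit_predict(X[train], y[train], X[test])
        accuracies.append(float(np.mean(predicted == y[test])))
    return float(np.mean(accuracies)), accuracies

def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier: assign each test sample to the closest class mean."""
    labels = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    dist = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return labels[np.argmin(dist, axis=1)]
```

Reporting the per-split accuracies alongside the mean also exposes how stable a method is, which matters when features with similar averages differ in spread.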
It can be clearly seen that the test set accuracy rate of ABC-SGMD increased by an average of about 15% compared to the traditional EMD and SGMD methods.
To verify the effectiveness of the multi-feature fusion model, it is assumed that each model contains only one feature vector. Each reconstructed matrix is input into the Self-Adaptive Differential Evolution Extreme Learning Machine (SaDE-ELM) method for recognition to determine its accuracy rate. The decomposition results are shown in Figure 11.
To mitigate the influence of random factors, training and recognition were again repeated in this study, with 80% of the data selected as training samples and 20% as test samples. The training process was repeated several times, and the final recognition results are listed in Table 5.
The comparison reveals that the refined composite multiscale dispersion entropy (RCMDE) achieves the highest average accuracy rate, at 83.4%. However, the accuracy of this feature is not stable. In contrast, the accuracy of the peak-to-peak value is relatively stable, but its average accuracy rate is lower, at only 67.3%. Compared to the multi-feature fusion model, which combines the individual feature values, these single-feature models show certain gaps in accuracy and stability. To achieve better diagnostic results, the feature values of the decomposed IMF components can be extracted, formed into a feature vector matrix, and analyzed through SaDE-ELM. This approach is expected to yield better diagnostic outcomes.
In summary, Table 6 demonstrates that the ABC-SGMD-based method outperforms other methods in diagnostic performance under various noise conditions. The feature fusion model, constructed using feature vectors extracted by this method, more accurately reflects differences among the various fault states at different noise levels. Additionally, the ELM optimized by SaDE effectively classifies these feature vectors, showcasing the proposed method’s noise resistance and generalization ability.