ACMSlE: A Novel Framework for Rolling Bearing Fault Diagnosis

Wu, Shiqian; Zhang, Weiming; Qian, Jiangkun; Yu, Zujue; Li, Wei; Zheng, Lisha

doi:10.3390/pr13041167


by Shiqian Wu ¹, Weiming Zhang ¹, Jiangkun Qian ¹, Zujue Yu ¹, Wei Li ¹ and Lisha Zheng ²,*

¹ College of Ship Engineering, Jiangxi Polytechnic University, Jiujiang 332005, China
² Department of Management Engineering and Equipment Economics, Naval University of Engineering, Wuhan 430033, China
* Author to whom correspondence should be addressed.

Processes 2025, 13(4), 1167; https://doi.org/10.3390/pr13041167

Submission received: 12 March 2025 / Revised: 29 March 2025 / Accepted: 4 April 2025 / Published: 12 April 2025

(This article belongs to the Section Automation Control Systems)


Abstract:

Precision rolling bearings serve as critical components in a range of diverse industrial applications, where their continuous health monitoring is essential for preventing costly downtime and catastrophic failures. Early-stage bearing defects present significant diagnostic challenges, as they manifest as weak, nonlinear, and non-stationary transient features embedded within high-amplitude random noise. While entropy-based methods have evolved substantially since Shannon’s pioneering work—from approximate entropy to multiscale variants—existing approaches continue to face limitations in their computational efficiency and information preservation. This paper introduces the Adaptive Composite Multiscale Slope Entropy (ACMSlE) framework, which overcomes these constraints through two innovative mechanisms: a time-window shifting strategy, generating overlapping coarse-grained sequences that preserve critical signal information traditionally lost in non-overlapping segmentation, and an adaptive scale optimization algorithm that dynamically selects discriminative scales through entropy variation coefficients. In a comparative analysis against recent innovations, our integrated fault diagnosis framework—combining Fast Ensemble Empirical Mode Decomposition (FEEMD) preprocessing with Particle Swarm Optimization-Extreme Learning Machine (PSO-ELM) classification—achieves 98.7% diagnostic accuracy across multiple bearing defect types and operating conditions. Comprehensive validation through a multidimensional stability analysis, complexity discrimination testing, and data sensitivity analysis confirms this framework’s robust fault separation capability.

Keywords:

feature extraction; adaptive composite multiscale slope entropy; time-window shifting strategy; PSO-ELM classification

1. Introduction

Rotating machinery underpins a wide range of industrial applications, from petrochemical processing and marine engineering to aerospace systems. Central to these machines are rolling bearings—precision components that convert sliding friction between rotating shafts and housings into rolling friction. This critical function demands an exceptional durability, load-bearing capacity, and operational stability, making continuous health monitoring essential to prevent catastrophic failures and enhance system reliability. As such, the extraction and analysis of early, subtle fault signatures in rolling bearings have become a pivotal point of focus for research. However, the nonlinear and non-stationary characteristics of rotating machinery vibration signals create significant challenges in fault diagnosis [1,2,3,4]. In recent years, many researchers have conducted extensive studies on how to efficiently and accurately diagnose rolling bearing faults. Additionally, when breakthrough developments have occurred in non-mechanical fields, fault diagnosis technology has also advanced considerably. For instance, fundamental research in mathematics and physics has inspired the underlying logic of fault diagnosis, while artificial intelligence and deep learning have improved its diagnostic accuracy. Various diagnostic methods from the medical field have also provided valuable insights [5].

Entropy, originally a macroscopic quantity in thermodynamics used to characterize a system’s disorder, was expanded upon when Shannon introduced information entropy in 1948 to quantify time series information [6]. Since then, entropy has been widely applied in nonlinear dynamics research. Researchers subsequently developed various entropy measures, including approximate entropy [7], sample entropy (SE) [8], permutation entropy (PE) [9], fuzzy entropy [10], attention entropy [11], distribution entropy [12], symbolic dynamic entropy [13], and conditional entropy [14]. While these methods show effectiveness in analyzing the nonlinear features of rotating machinery vibration signals, traditional entropy measures like sample entropy primarily focus on amplitude similarities and often neglect transitional characteristics crucial for bearing fault diagnosis. Fuzzy entropy offers improved noise resistance but requires complex parameter tuning that introduces additional optimization challenges. Permutation and dispersion entropy, in contrast, focus on ordinal patterns but may lose critical amplitude information relevant to fault severity assessments. Slope entropy [15] addresses these limitations by directly analyzing the rate-of-change between consecutive points, making it inherently sensitive to impulsive patterns characteristic of bearing impacts while maintaining computational stability across varying signal lengths and noise conditions. These advantages make slope entropy particularly suitable for bearing fault diagnostics, where sudden impacts generate distinctive slope patterns in vibration signals. Despite these advantages, single-scale slope entropy, like other single-scale methods, still faces challenges in capturing the multiscale dynamics inherent in bearing vibration signals.

Many researchers have therefore improved existing entropy methods, with modifications to SE and PE being the most widespread [16,17]. Traditional single-scale entropy methods, while useful in mechanical fault diagnostics, are inherently limited in capturing system dynamics, because robust fault diagnosis requires characterizing vibration signatures across multiple temporal scales. This constraint motivated the development of multiscale frameworks, including multiscale variants of approximate entropy [18], SE [19], and PE [20]. These methods employ hierarchical coarse-graining procedures to quantify signal complexity across multiple temporal scales; they have attracted substantial attention and have led to new entropy-based diagnostic methods designed for rotating machinery condition monitoring and fault detection. Yuan introduced an enhanced methodology through multivariate coarse-graining procedures, proposing Composite Multivariate Multiscale Permutation Fuzzy Entropy [21], which mitigated issues of entropy fluctuation and information loss. Rostaghi integrated fuzzy set theory, multidimensional embedding reconstruction theory, and dispersion patterns to develop Refined Composite Multivariate Multiscale Fuzzy Dispersion Entropy [22]. This approach is less sensitive to signal length constraints and yields more stable results. Chen proposed Refined Composite Multiscale Diversity Entropy (RCMDE) by incorporating scale factor characteristics, addressing a limitation of the original diversity entropy, in which the time series becomes increasingly truncated at deeper scales [23]. While these improved methods extract features more accurately than the original approaches, they have yet to solve the problems of long computation times and information loss.

Entropy-based methods have advanced fault feature extraction, and combined with the rapid development of artificial intelligence in recent years, they have substantially improved both the efficiency and accuracy of fault diagnosis. Neural networks, with their powerful pattern recognition capabilities, have become essential tools in diagnostic systems, evolving from basic multilayer perceptrons to architectures such as Convolutional Neural Networks (CNNs) [24] and Deep Belief Networks (DBNs) [25] that can automatically extract hierarchical features from raw vibration signals. While deep learning approaches offer remarkable accuracy, they typically require substantial computational resources and large training datasets, which may limit their applicability in industrial scenarios requiring rapid diagnosis from limited samples. This challenge has prompted researchers to investigate more efficient neural network architectures, alongside optimization techniques, to enhance performance. The Extreme Learning Machine (ELM) [26] offers significant advantages over conventional neural networks owing to its fast training speed and good generalization performance; however, the random initialization of the input weights and hidden biases in the standard ELM often leads to instability and suboptimal solutions. Swarm intelligence optimization algorithms, inspired by collective behaviors in nature, have emerged as effective tools for parameter tuning and model optimization in diagnostic systems. Techniques such as Particle Swarm Optimization (PSO) [27], Ant Colony Optimization (ACO) [28], and Genetic Algorithms (GAs) [29] mimic biological cooperative behaviors to efficiently search complex solution spaces and avoid local optima. Among these, PSO stands out for its implementation simplicity, rapid convergence, and ability to optimize continuous variables without requiring derivative information about the objective function. The synergistic combination of appropriate neural network architectures with efficient optimization methods offers a promising direction for enhancing fault diagnostic accuracy while maintaining computational efficiency.

To overcome these limitations, this paper introduces Adaptive Composite Multiscale Slope Entropy (ACMSlE) by combining function-adaptive patterns with time-shifting and refined sampling hybrid strategies. This method not only preserves the advantages and stability of multiscale slope entropy (MSlE) in analyzing time series complexity but also suppresses noise, enhances feature extraction, and significantly improves the capture of local variation characteristics. This investigation presents a novel fault diagnosis framework that employs Fast Ensemble Empirical Mode Decomposition (FEEMD) [30] for the preprocessing of raw vibration signals from mechanical equipment, with the resultant Intrinsic Mode Function (IMF) components serving as inputs to the proposed ACMSlE for fault feature extraction. Because the effectiveness of feature extraction is difficult to evaluate directly, a PSO-ELM is employed to achieve fault classification. Subsequently, a new rotating machinery fault diagnosis method based on FEEMD-ACMSlE and PSO-ELM is proposed.

The principal contributions of this investigation include the following:

(1): The proposal of ACMSlE for measuring time series complexity. This method optimizes edge processing and enhances trend capture, preserving data points as the scale increases. It fully utilizes the original data and improves noise resistance through combined filtering, addressing the shortcomings of previous methods.
(2): The validation of ACMSlE using different types of signals and sampling point numbers, and its comparison with three other recently improved entropy methods.
(3): The development of a rotating machinery fault diagnosis method based on FEEMD-ACMSlE and PSO-ELM. Comparative analysis with experimental data verifies its feasibility and superiority.

The remainder of this paper is organized as follows: Section 2 elaborates on our proposed feature extraction theory using ACMSlE, including the fundamental concepts of slope entropy, multiscale slope entropy, and the innovative ACMSlE algorithm. Section 3 presents our comprehensive fault diagnosis methodology, detailing the FEEMD preprocessing technique, PSO-ELM classification approach, and the integrated diagnostic framework. Section 4 evaluates the effectiveness of ACMSlE through rigorous testing with various signal types and comparative analyses. Section 5 validates the proposed fault diagnosis method using experimental bearing data obtained under diverse conditions and compares it with alternative approaches. Section 6 discusses the limitations and potential applications of our method, and Section 7 summarizes the contributions of this research and outlines directions for future investigations.

2. Proposed Feature Extraction Theory for Bearings Utilizing ACMSlE

2.1. Slope Entropy

Slope entropy quantifies the complexity of a time series by analyzing the slope variations between consecutive points, effectively preserving the amplitude information of the original signal. For a time series $x = \{x_{0}, x_{1}, \dots, x_{N - 1}\}$, SlopEn is computed as follows:

$SlopEn(x, m, γ, δ) = - \sum_{k = 1}^{n} p_{k} \log p_{k}$

(1)

where m is the embedding dimension ($m > 2$), with a default value of $m = 3$ for optimal pattern recognition based on empirical studies; $γ$ is the slope threshold parameter that controls the sensitivity of the algorithm to slope variations (default $γ = 1$, corresponding to 45°), with higher values of $γ$ requiring greater angular changes to trigger recognition; $δ$ is the zero-region threshold (default $δ = 1 \times 10^{-3}$) that defines the minimum change required to record a slope variation, effectively serving as a filtering threshold that prevents the algorithm from responding to insignificant noise while capturing actual signal features; and $p_{k}$ represents the relative frequency of the k-th of the n distinct slope patterns. These default parameter values were determined through extensive testing across multiple datasets to provide an optimal performance in distinguishing actual signal characteristics from background noise [31,32,33].
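As a minimal illustration of Equation (1), the following Python sketch computes slope entropy for a short series. The text above does not spell out how slopes are mapped to symbols, so the sketch assumes the five-symbol rule commonly used in the slope entropy literature (+2, +1, 0, −1, −2, split by $γ$ and $δ$), the natural logarithm, and relative frequencies normalized by the total number of patterns. The function name `slope_entropy` is hypothetical.

```python
import math
from collections import Counter


def slope_entropy(x, m=3, gamma=1.0, delta=1e-3):
    """Sketch of Eq. (1): Shannon entropy of slope-symbol patterns.

    Assumption (not stated in the text): each difference d between
    consecutive samples is mapped to one of five symbols using the
    thresholds gamma and delta.
    """
    if len(x) < m:
        raise ValueError("series must contain at least m samples")

    def symbol(d):
        if d > gamma:
            return 2       # steep rise
        if d > delta:
            return 1       # gentle rise
        if d >= -delta:
            return 0       # flat (inside the zero region)
        if d >= -gamma:
            return -1      # gentle fall
        return -2          # steep fall

    symbols = [symbol(b - a) for a, b in zip(x, x[1:])]
    width = m - 1  # an m-point subsequence yields m - 1 slopes
    patterns = Counter(
        tuple(symbols[i:i + width]) for i in range(len(symbols) - width + 1)
    )
    total = sum(patterns.values())
    return -sum((c / total) * math.log(c / total) for c in patterns.values())
```

Under these assumptions, a constant series produces a single pattern and therefore zero entropy, while a series alternating between two levels produces two equally likely patterns and an entropy of ln 2.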

2.2. Multiscale Slope Entropy

Multiscale slope entropy (MSlE) extends single-scale slope entropy (SlE) to multiple scales in order to capture complex temporal patterns across different time horizons. The key steps for calculating MSlE are described below.

Given a time series $x = \{x_{1}, x_{2}, \dots, x_{N}\}$, the MSlE calculation involves the following:

(1) Coarse-graining process:

$y_{j}^{(τ)} = \frac{1}{τ} \sum_{i = (j - 1) τ + 1}^{j τ} x_{i}, \quad 1 \leq j \leq ⌊ N / τ ⌋$

(2)

where $τ$ is the scale factor, $y_{j}^{(τ)}$ is the j-th element of the coarse-grained time series, and $⌊ N / τ ⌋$ is the length of each coarse-grained sequence.

(2) For each scale factor $τ$, the slope entropy is calculated using the coarse-grained time series:

$MSlE(x, m, γ, δ, τ) = SlE(y^{(τ)}, m, γ, δ)$

(3)

(3) The final MSlE curve is

$MSlE_{curve} = \{MSlE_{1}, MSlE_{2}, \dots, MSlE_{τ_{\max}}\}$

(4)

where $τ_{\max}$ is the maximum scale factor considered in the analysis, which is typically constrained by the condition $⌊ N / τ_{\max} ⌋ \geq 100$ to ensure statistical reliability.

It should be noted that while MSlE has the capability to provide a valuable multiscale analysis, the traditional coarse-graining process can lead to information loss as the scale factor increases.
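The coarse-graining of Equation (2) and the curve of Equation (4) can be sketched in a few lines of Python. The entropy itself is passed in as a callable (`entropy_fn`, a hypothetical parameter), so the sketch is independent of any particular slope entropy implementation.

```python
def coarse_grain(x, tau):
    """Eq. (2): average consecutive, non-overlapping blocks of length tau."""
    n_blocks = len(x) // tau  # floor(N / tau); trailing samples are dropped
    return [sum(x[j * tau:(j + 1) * tau]) / tau for j in range(n_blocks)]


def multiscale_curve(x, entropy_fn, tau_max):
    """Eq. (4): entropy of the coarse-grained series for tau = 1..tau_max."""
    return [entropy_fn(coarse_grain(x, tau)) for tau in range(1, tau_max + 1)]
```

For example, `coarse_grain([1, 2, 3, 4, 5, 6, 7], 2)` returns `[1.5, 3.5, 5.5]`: the series shrinks from seven points to three and the last sample is discarded, which is the information loss noted above.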

2.3. Adaptive Composite Multiscale Slope Entropy

ACMSlE enhances the traditional MSlE algorithm using adaptive processing strategies and refined sampling techniques. To better illustrate the fundamental differences between traditional MSlE and our proposed ACMSlE, Figure 1 presents a comparative schematic diagram of both approaches. Our method incorporates multiple processing modes optimized for different signal characteristics, while improving noise resistance through composite sampling and adaptive parameter selection. The steps for calculating ACMSlE can be briefly described as follows:

(1) Composite coarse-graining: instead of a single coarse-grained series, ACMSlE employs $τ$ different coarse-grained series for each scale:

$y_{k, j}^{(τ)} = \frac{1}{τ} \sum_{i = (j - 1) τ + k}^{j τ + k - 1} x_{i}, \quad 1 \leq k \leq τ$

(5)

where k indexes the starting point of each coarse-grained series.

(2) Adaptive processing: for each scale $τ$, entropy is calculated as follows:

$ACMSlE(x, m, γ, δ, τ) = \frac{1}{τ} \sum_{k = 1}^{τ} SlE(y_{k}^{(τ)})$

(6)

(3) Enhanced signal modification: Three modes are used for signal processing: standard mode, which uses a Hamming-windowed moving average; local variation mode, which combines a moving standard deviation and median filtering; and trend capture mode, which uses an exponential moving average.

$Y_{modified} = \begin{cases} (Y_{std} + | Y_{med} |) / 2, & Tx = 1 \\ α Y_{\exp}, & Tx = 2 \\ Y_{hamming}, & \text{otherwise} \end{cases}$

(7)

where $Y_{modified}$ represents the modified output value, $Y_{std}$ denotes the moving standard deviation, $Y_{med}$ represents the median-filtered value, $Y_{\exp}$ indicates the exponential moving average, $Y_{hamming}$ represents the Hamming window-processed value, Tx is the processing type selection parameter (Tx = 1 selects local variation mode, Tx = 2 selects trend capture mode, and any other value selects standard mode), and $α$ is an adaptive weighting coefficient that controls the influence of the exponential moving average component during trend capture mode.
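Equations (5) and (6) can be illustrated with a short Python sketch. Equation (5) does not state the upper bound on j, so the sketch assumes that each shifted series keeps every complete window that fits, i.e., $⌊ (N - k + 1) / τ ⌋$ points. The entropy is again supplied as a callable (`entropy_fn`, hypothetical), and the signal modification of step (3) is not included.

```python
def composite_coarse_grain(x, tau):
    """Eq. (5): tau coarse-grained series, one per starting offset k = 1..tau."""
    series = []
    for k in range(1, tau + 1):
        start = k - 1                         # 0-based index of x_k
        n_windows = (len(x) - k + 1) // tau   # assumed upper bound on j
        series.append([
            sum(x[start + j * tau:start + (j + 1) * tau]) / tau
            for j in range(n_windows)
        ])
    return series


def composite_entropy(x, tau, entropy_fn):
    """Eq. (6): mean entropy over the tau shifted coarse-grained series."""
    return sum(entropy_fn(y) for y in composite_coarse_grain(x, tau)) / tau
```

For `x = [1, 2, 3, 4, 5, 6]` and `tau = 2`, the two series are `[1.5, 3.5, 5.5]` (k = 1) and `[2.5, 4.5]` (k = 2), so samples that fall across a window boundary in one series are averaged together in the other.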

The $α$ parameter is not a fixed constant but is adaptively calculated from local signal characteristics using

$α = \frac{local\_complexity}{\max(local\_complexity)}$

(8)

where local_complexity is computed as the coefficient of variation (the ratio of the standard deviation to the mean) within a sliding window of the signal. This adaptive mechanism enables the algorithm to automatically adjust its sensitivity according to the signal’s complexity, providing enhanced feature discrimination for the varying dynamic behaviors typically observed in bearing vibration signals.
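A minimal Python sketch of Equation (8) follows. The window length and step are not given in the text, so a window of four samples moved one sample at a time is assumed purely for illustration. Because vibration signals are often close to zero-mean, the sketch also divides by the absolute mean with a small floor (`eps`); this guard is an assumption of the sketch, not part of the published formula.

```python
import statistics


def local_complexity(x, window=4, eps=1e-12):
    """Coefficient of variation (std / |mean|) in each sliding window."""
    if len(x) < window:
        raise ValueError("series is shorter than the window")
    values = []
    for start in range(len(x) - window + 1):
        segment = x[start:start + window]
        mean = statistics.fmean(segment)
        std = statistics.pstdev(segment)
        values.append(std / max(abs(mean), eps))  # eps guards a zero mean
    return values


def adaptive_alpha(x, window=4):
    """Eq. (8): local complexity normalized by its maximum, so alpha <= 1."""
    lc = local_complexity(x, window)
    peak = max(lc)
    if peak == 0:
        return [0.0] * len(lc)  # constant signal: no variation anywhere
    return [v / peak for v in lc]
```

For `x = [2, 2, 2, 2, 1, 3, 1, 3]`, the first window is constant and gives $α = 0$, while the last window has the largest relative variation and gives $α = 1$.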

In signal segments containing transient fault features or high-frequency components characteristic of incipient bearing defects, the $α$ values approach 1.0, giving greater weight to the exponential moving average component, which better preserves these critical diagnostic features. Conversely, in signal regions dominated by background machinery noise or steady-state vibrations, smaller $α$ values reduce the influence of this component, effectively enhancing noise rejection while maintaining sensitivity to relevant fault signatures. This dynamic adaptation represents a key innovation within our approach, allowing the ACMSlE method to optimize feature extraction characteristics based on local signal properties rather than relying on fixed processing parameters.

The selection of these specific signal processing techniques was guided by their distinctive advantages for bearing fault diagnosis applications.

The Hamming window was chosen for standard mode processing due to its favorable frequency domain characteristics and low spectral leakage (approximately −42 dB side-lobe attenuation), which preserves the spectral integrity of bearing fault impulses while effectively suppressing noise. Compared with rectangular windows and other tapering functions, the Hamming window offers a good trade-off between main-lobe width and side-lobe suppression.

For local variation detection, the combination of moving standard deviation and median filtering creates a complementary system particularly effective for bearing diagnostics. Standard deviation efficiently amplifies transient vibration changes typical of incipient faults, while median filtering provides robust impulse noise rejection without blurring sharp fault transitions—a limitation of Gaussian filters. This hybrid approach consistently outperformed single-filter techniques in our evaluation tests while maintaining O(N) computational complexity.

The exponential moving average employed in the trend capture mode delivers adaptive memory properties that are particularly valuable for tracking progressive fault development patterns. Compared to more computationally intensive techniques like wavelet transforms (O(N log N)) or Savitzky–Golay filters, our selected methods maintain linear time complexity while providing excellent feature discrimination capabilities. This computational efficiency, combined with the adaptive mechanism for switching between processing modes, makes ACMSlE highly suitable for practical diagnostic applications, including, potentially, real-time implementations.

Parameter selection is a critical aspect of the ACMSlE methodology. The selection of the parameters m, $γ$, and $δ$ was informed by both theoretical considerations and extensive empirical testing. The embedding dimension $m = 3$ provides an optimal balance between pattern recognition capability and computational efficiency, capturing essential nonlinear dynamics while avoiding the curse of dimensionality. Similarly, the slope threshold parameter $γ = 1$ (corresponding to a 45° angle) and zero-region threshold $δ = 1 \times 10^{-3}$ were determined through sensitivity testing to effectively distinguish between fault-induced vibration patterns and background noise.

The scale factor $τ$ is crucial for capturing signal characteristics at different temporal scales. The selection of $τ$ follows specific criteria to ensure both computational efficiency and analytical effectiveness:

The scale factor $τ$ starts at 1 and increases in integer steps, with a default configuration of three scales ($τ = 1, 2, 3$).
For a time series of length N, the upper bound of $τ$ must satisfy the data length constraint to maintain statistical reliability: $\frac{N}{τ} \geq 100$. This constraint ensures sufficient data points in each coarse-grained series for reliable entropy estimation.
Based on extensive empirical analysis, we propose the following guidelines for $τ$ selection:

$τ \in \begin{cases} [1, 5] & \text{for short-term variations} \\ [5, 10] & \text{for medium-term patterns} \\ [10, 20] & \text{for long-term trend analysis} \end{cases}$

(9)

The default value of $τ = 3$ is selected as it provides a balanced trade-off between computational complexity and the ability to capture multiscale dynamics in most practical applications. This setting has been empirically validated across various signal types [34], demonstrating a robust performance in capturing relevant temporal patterns while maintaining computational efficiency.
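The data length constraint $N / τ \geq 100$ translates directly into an upper bound on the usable scale factor. The short Python sketch below makes the arithmetic explicit; the function name `max_scale` and the example length of 2048 samples are illustrative assumptions, not values taken from this study.

```python
def max_scale(n_samples, min_points=100):
    """Largest tau for which N / tau >= min_points (0 if no scale qualifies)."""
    return n_samples // min_points


# Example: a 2048-sample record supports scales up to tau = 20,
# because floor(2048 / 20) = 102 >= 100 while floor(2048 / 21) = 97 < 100.
tau_max = max_scale(2048)
```

Under this constraint, the long-term range [10, 20] in Equation (9) is only fully available for records of at least 2000 samples.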

3. The Proposed Fault Diagnosis Method

3.1. Fast Ensemble Empirical Mode Decomposition

FEEMD is an enhanced algorithm within the Empirical Mode Decomposition (EMD) family that retains the advantages of EEMD while improving its computational efficiency. It is primarily used for processing nonlinear and non-stationary signals, as it features strong resistance to mode mixing and minimal boundary effects. In this study, we employ this algorithm as a preprocessing method before ACMSlE feature extraction. The implementation process is as follows:

(1) Add a white noise sequence to the original signal $x(t)$:

$x_{i}(t) = x(t) + w_{i}(t)$

(10)

where $w_{i}(t)$ represents the white noise added in the i-th iteration, with amplitude $η$.

(2) Perform EMD on the noise-added signal:

$x_{i}(t) = \sum_{j = 1}^{n} c_{i, j}(t) + r_{i}(t)$

(11)

where $c_{i, j}(t)$ denotes the j-th Intrinsic Mode Function (IMF) of the i-th decomposition and $r_{i}(t)$ is the residual term.

(3) The core improvement of FEEMD lies in selective reconstruction:

$\bar{c}_{j}(t) = \frac{1}{N} \sum_{i = 1}^{N} c_{i, j}(t)$

(12)

where N here represents the ensemble number (rather than the signal length), which is typically much smaller than that required by EEMD.

(4) Final signal reconstruction:

$x(t) = \sum_{j = 1}^{n} \bar{c}_{j}(t) + \bar{r}(t)$

(13)

where $\bar{r}(t)$ is the ensemble-averaged residual.

The key parameters in the implementation of this process are as follows:

Noise amplitude $η$: typically set to 0.1–0.3 times the standard deviation of the signal.
Ensemble number N: generally set between 20 and 100, significantly less than the hundreds required by EEMD.
IMF screening criterion:

$σ = \sqrt{\frac{1}{T} \sum_{t = 1}^{T} \left[\frac{h_{k - 1}(t) - h_{k}(t)}{h_{k - 1}(t)}\right]^{2}}$

(14)

where $h_{k}(t)$ represents the signal after the k-th screening.
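The screening criterion of Equation (14) is a root-mean-square relative change between two successive screening results. A minimal Python sketch is given below; the function name `screening_sigma` is hypothetical, and the sketch assumes that $h_{k-1}(t)$ is nonzero at every sample, since the published formula divides by it.

```python
import math


def screening_sigma(h_prev, h_curr):
    """Eq. (14): RMS of the relative change between successive screenings.

    Assumes every element of h_prev is nonzero.
    """
    terms = [((p - c) / p) ** 2 for p, c in zip(h_prev, h_curr)]
    return math.sqrt(sum(terms) / len(terms))
```

For instance, if every sample is halved between two screenings, each relative change is 0.5 and so is $σ$; if the two results are identical, $σ = 0$.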

The noise amplitude parameter $η$ in FEEMD significantly influences decomposition quality and computational efficiency. According to research, $η$ values between 0.1 and 0.2 times the signal’s standard deviation provide an optimal balance between noise-assisted decomposition and minimal artificial component introduction. For bearing fault diagnosis applications in particular, studies have demonstrated that setting $η \approx 0.15$ yields effective IMF separation while preserving fault-related transient features. Further research has shown that lower values ($η < 0.1$) result in insufficient mode separation, while higher values ($η > 0.3$) introduce excessive artificial components that could mask subtle fault signatures. Following these established guidelines, we set the ensemble number N to 50, which provides sufficient statistical stability while maintaining computational efficiency compared to traditional EEMD approaches, which require hundreds of ensembles [35,36,37].
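The noise-assisted ensemble averaging of Equations (10)–(12) can be sketched in Python as follows. The EMD step itself is supplied by the caller as `emd_fn`, a hypothetical callable that returns a list of components (IMFs followed by the residual) and is assumed to return the same number of components on every call. The sketch does not attempt to reproduce whatever makes FEEMD faster than EEMD, and the defaults (50 trials, noise at 0.15 times the signal’s standard deviation) simply mirror the values discussed above.

```python
import random


def ensemble_average(x, emd_fn, n_ensembles=50, noise_ratio=0.15, seed=0):
    """Average the components of noise-perturbed decompositions (Eqs. 10-12)."""
    rng = random.Random(seed)
    mean = sum(x) / len(x)
    std = (sum((v - mean) ** 2 for v in x) / len(x)) ** 0.5
    noise_std = noise_ratio * std  # eta expressed relative to the signal std

    trials = []
    for _ in range(n_ensembles):
        noisy = [v + rng.gauss(0.0, noise_std) for v in x]  # Eq. (10)
        trials.append(emd_fn(noisy))                         # Eq. (11)

    n_components = len(trials[0])
    if any(len(t) != n_components for t in trials):
        raise ValueError("emd_fn returned a varying number of components")

    # Eq. (12): average the j-th component across all trials, sample by sample.
    return [
        [sum(t[j][s] for t in trials) / n_ensembles for s in range(len(x))]
        for j in range(n_components)
    ]
```

In practice, different noisy trials of a real EMD routine can yield different numbers of IMFs, so an actual implementation needs a policy for aligning components; the sketch raises an error instead of guessing one.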

3.2. Particle Swarm Optimization

Particle Swarm Optimization (PSO) is a swarm intelligence algorithm that simulates social behavior, where particles (candidate solutions) navigate through a search space guided by both individual and collective experience. Each particle adjusts its trajectory based on its own best-found position and the best position discovered by any particle in the swarm [38].

The movement of particles is governed by

$v_{i}^{k + 1} = w v_{i}^{k} + c_{1} r_{1} (p_{best, i} - x_{i}^{k}) + c_{2} r_{2} (g_{best} - x_{i}^{k})$

(15)

$x_{i}^{k + 1} = x_{i}^{k} + v_{i}^{k + 1}$

(16)

where $v_{i}^{k}$ and $x_{i}^{k}$ are the velocity and position of particle i at iteration k; w is the inertia weight; $c_{1}$ and $c_{2}$ are acceleration coefficients; $r_{1}$ and $r_{2}$ are random values in [0,1]; $p_{best, i}$ is particle i’s best position; and $g_{best}$ is the global best position.

The advantages of PSO in neural network optimization include its gradient-free operation, minimal parameter requirements, and ability to escape local optima, making it ideal for optimizing classifier parameters in fault diagnosis applications.
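The update rules of Equations (15) and (16) can be illustrated with a compact Python sketch that minimizes an arbitrary objective. The parameter values used here (w = 0.7, c1 = c2 = 1.5, zero initial velocities, and the search bounds) are generic illustrative choices and not the settings used in this study; the function name `pso_minimize` is likewise hypothetical.

```python
import random


def pso_minimize(objective, dim, n_particles=30, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0), seed=0):
    """Basic global-best PSO following Eqs. (15) and (16)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Eq. (15): inertia + cognitive pull + social pull
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Eq. (16): move the particle
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

Calling `pso_minimize(lambda p: sum(v * v for v in p), dim=2)` searches for the minimum of a simple quadratic bowl; no gradient of the objective is required, only its value at each particle position.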

3.3. Extreme Learning Machine

The Extreme Learning Machine (ELM) is a single-hidden-layer feedforward neural network characterized by its analytical determination of output weights, in contrast to traditional neural networks that rely on iterative gradient-based training. The key innovation of the ELM lies in randomly assigning input weights and biases while analytically calculating output weights, enabling both universal approximation and an extremely fast training speed. The standard ELM algorithm can be formalized as shown in Table 1.

For our bearing fault diagnosis application, we implemented an ELM with the following configuration:

Input layer with dimensions matching the ACMSlE feature vector size;
Hidden layer containing 20 neurons with the sigmoid activation function $g(x) = 1 / (1 + e^{-x})$;
Output layer containing 9 neurons, one for each bearing health condition.

The sigmoid activation function was selected for its effectiveness in capturing nonlinear relationships in fault patterns. The hidden layer size was determined through cross-validation, balancing the model’s complexity with its capability for generalization.

While the ELM offers a remarkable training speed (typically orders of magnitude faster than backpropagation-based methods), its performance can be sensitive to the random initialization of the input weights and biases. This randomness often leads to inconsistent performance and potentially suboptimal solutions. To address this limitation, we employ Particle Swarm Optimization to enhance the ELM framework, as detailed in the following section.
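The ELM training procedure described in this section can be sketched with the Python standard library alone. To avoid a dependency on a linear algebra package, the sketch replaces the Moore–Penrose pseudoinverse with a ridge-regularized least-squares solve of the normal equations; this substitution, the ridge value, and all function names are assumptions of the sketch rather than details of the implementation used in this study.

```python
import math
import random


def hidden_output(X, W, b):
    """H[j][i] = g(w_i . x_j + b_i) with the sigmoid activation g."""
    return [[1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bi)))
             for w, bi in zip(W, b)] for x in X]


def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[r]) + list(B[r]) for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]


def train_elm(X, T, n_hidden=20, ridge=1e-6, seed=0):
    """Random input weights and biases; output weights by regularized least squares."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_output(X, W, b)
    # Normal equations (H^T H + ridge * I) beta = H^T T, used here in place of H^+ T.
    HtH = [[sum(h[i] * h[k] for h in H) + (ridge if i == k else 0.0)
            for k in range(n_hidden)] for i in range(n_hidden)]
    HtT = [[sum(h[i] * t[c] for h, t in zip(H, T)) for c in range(len(T[0]))]
           for i in range(n_hidden)]
    return W, b, solve(HtH, HtT)


def predict(X, W, b, beta):
    """Network outputs o_j = sum_i beta_i * g(w_i . x_j + b_i)."""
    H = hidden_output(X, W, b)
    return [[sum(h[i] * beta[i][c] for i in range(len(beta)))
             for c in range(len(beta[0]))] for h in H]
```

Here `X` is a list of feature vectors and `T` a list of one-hot target vectors; the predicted class for each sample is the index of the largest output. Changing `seed` changes the random input weights, which is the source of the run-to-run variability discussed above.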

3.4. PSO-ELM

As discussed in the previous section, the standard ELM suffers from performance instability due to its random parameter initialization. To address this limitation, we employ Particle Swarm Optimization to enhance the ELM framework, resulting in the PSO-ELM hybrid algorithm. This approach systematically optimizes the input weights and biases that would otherwise be randomly assigned, significantly improving the algorithm’s generalization performance and prediction accuracy while maintaining its rapid training speed. The implementation process is as follows:

(1) ELM Network Structure Initialization. Building on the ELM architecture described in Section 3.3, we formulate the following network structure:

$\sum_{i = 1}^{L} β_{i} g(w_{i} \cdot x_{j} + b_{i}) = o_{j}, \quad j = 1, 2, \dots, N$

(17)

where L is the number of hidden neurons, $w_{i}$ and $b_{i}$ are the input weight vector and bias of the i-th hidden neuron, $β_{i}$ is its output weight, $g$ is the activation function, and $o_{j}$ is the network output for the j-th of the N input samples $x_{j}$.

(2) PSO Parameter Initialization

$particle_{i} = [w_{i}, b_{i}], \quad i = 1, 2, \dots, M$

(18)

where M represents the particle population size. Each particle is initialized with random values: input weights $w_{i}$ and biases $b_{i}$ are generated using a uniform distribution within [−1, 1], while initial velocities are set to random values in [−0.1, 0.1]. This initialization approach provides sufficient diversity for effective search space exploration from the first iteration.

(3) Fitness Function Definition

$fitness = \frac{1}{N} \sum_{j = 1}^{N} (t_{j} - o_{j})^{2}$

(19)

where $t_{j}$ is the target output for the j-th sample.

(4) Particle Position Update

$v_{i}^{k + 1} = w v_{i}^{k} + c_{1} r_{1} (p_{best, i} - x_{i}^{k}) + c_{2} r_{2} (g_{best} - x_{i}^{k})$

(20)

$x_{i}^{k + 1} = x_{i}^{k} + v_{i}^{k + 1}$

(21)

where w is the inertia weight; $c_{1}, c_{2}$ are learning factors; $r_{1}, r_{2}$ are random numbers; $p_{best, i}$ represents the individual’s best position; and $g_{best}$ represents the global best position.

(5) ELM Parameter Optimization. We update the ELM input weights and biases using optimal particle positions, calculate the hidden layer output matrix H, and solve the output weights using the Moore–Penrose generalized inverse: $β = H^{+} T$. The key optimization parameters and their influence on diagnostic precision include the following:

Particle swarm size M: Set to 30 in our implementation after evaluating populations from 10 to 50. Smaller populations ($M < 20$) often converged prematurely to local optima, particularly for similar fault patterns. Increasing beyond $M = 40$ improved the model’s accuracy by less than 0.5% while doubling its computational time.
Maximum iterations: Set to 200 in our implementation, with convergence typically occurring within 150–180 iterations across various fault conditions. Extended testing showed minimal accuracy improvements (<0.3%) when increasing this to 400 iterations.
Learning factors $c_{1}, c_{2}$: Both set to 2.0, balancing the cognitive and social learning components. For subtle fault detection (especially incipient ball faults), slightly emphasizing cognitive learning ($c_{1} = 2.2$, $c_{2} = 1.8$) improved the classification accuracy by approximately 1.5%.
Inertia weight w: Implemented as linearly decreasing from 0.9 to 0.4 throughout iterations. This strategy consistently outperformed constant weights by 2–3% in terms of accuracy, particularly for complex fault patterns. This dynamic adjustment facilitates broad exploration in early iterations and refined exploitation in later stages.
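Two ingredients of the procedure above are simple enough to state directly in Python: the linearly decreasing inertia weight and the mean squared error fitness of Equation (19). The function names are hypothetical, and the fitness is written for vector-valued targets and outputs (one entry per output neuron), which is an assumption about how $t_{j}$ and $o_{j}$ are represented.

```python
def inertia_weight(iteration, max_iter, w_start=0.9, w_end=0.4):
    """Inertia weight decreasing linearly from w_start to w_end."""
    return w_start - (w_start - w_end) * iteration / max_iter


def mse_fitness(targets, outputs):
    """Eq. (19): mean over samples of the squared error between t_j and o_j."""
    return sum(
        sum((t - o) ** 2 for t, o in zip(t_row, o_row))
        for t_row, o_row in zip(targets, outputs)
    ) / len(targets)
```

With 200 iterations, the weight starts at 0.9, passes 0.65 at iteration 100, and ends at 0.4. A PSO-ELM loop would evaluate `mse_fitness` once per particle per iteration, after building the hidden layer from that particle’s weights and biases and solving for the output weights.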

3.5. The Proposed Fault Diagnosis Method

This study proposes a novel fault diagnosis method for rotating machinery vibration signals. The diagnostic workflow is illustrated in Figure 2, with the specific diagnostic process consisting of the following steps:

(1): Preprocessing of Original Vibration Signals: The raw vibration signals are decomposed by FEEMD into IMF components, which are then screened by computing the correlation coefficient between each IMF component and the original signal, analyzing the signal and envelope spectra, and calculating variance contribution rates and kurtosis values. An appropriate statistical measure, or a weighted combination of measures, is selected as the screening criterion (the correlation coefficient is used by default).
(2): Feature Vector Construction: We apply the ACMSlE method to the screened IMF components to calculate corresponding entropy values, construct feature vectors, and combine all extracted feature vectors to form an initial fault feature set.
(3): Training and Testing Process: The fault feature set is partitioned into training and testing feature datasets at a 1:1 ratio; the training data are input into PSO-ELM for learning, while the testing data are used for detection and recognition validation.
(4): Result Analysis and Maintenance: Based on the diagnostic results, appropriate maintenance procedures are implemented for the identified faults.
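The default screening rule in step (1) can be illustrated with a short sketch. It assumes the IMF components are already available (the FEEMD decomposition itself is not shown), and the function names and the threshold value of 0.3 are hypothetical choices made only for this example; the source does not state a threshold.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def screen_imfs(signal, imfs, threshold=0.3):
    """Keep the IMFs whose |correlation| with the original signal reaches the threshold.

    threshold is a hypothetical value used for illustration only.
    """
    scores = [abs(pearson(imf, signal)) for imf in imfs]
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return keep, scores


if __name__ == "__main__":
    # Synthetic stand-in for a decomposition: three sinusoidal "components".
    n = 1000
    t = [i / n for i in range(n)]
    comps = [
        [math.sin(2 * math.pi * 50 * ti) for ti in t],
        [0.5 * math.sin(2 * math.pi * 120 * ti) for ti in t],
        [0.01 * math.sin(2 * math.pi * 3 * ti) for ti in t],
    ]
    sig = [sum(vals) for vals in zip(*comps)]
    keep, scores = screen_imfs(sig, comps)
    print("kept components:", keep)
    print("scores:", [round(s, 3) for s in scores])
```

The weak third component contributes almost nothing to the signal and falls below the threshold, which is the behavior the screening step is meant to capture.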

4. Evaluation Signals and Results Achieved with ACMSlE

To evaluate the effectiveness of ACMSlE on different types of time series, we conducted tests using the following synthetic signals.

4.1. White Gaussian Noise and 1/f Noise

White Gaussian noise (WGN), characterized by its stable statistical properties and stochastic nature, effectively simulates random vibrations and electromagnetic interference in mechanical systems. In particular, 1/f noise (pink noise) exhibits characteristics analogous to numerous physical processes in nature, including bearing vibrations, and demonstrates scale-invariant statistical properties across different temporal scales, with greater complexity than Gaussian white noise. Therefore, these two types of signals are commonly employed to evaluate the robustness of methods in extracting various signal characteristics.

In this section, both WGN and 1/f noise share consistent parameters: a signal length of $N = 2048$ sampling points, a number of independent trials $M = 100$, and a scale factor $\tau = 20$. Comparative experiments were conducted with CMSE, CMCE [39], MSlE [40], and ACMSlE used as the research objects. The standard deviations of the various entropy measures were calculated and plotted as shown in Figure 3. In the WGN experiments, ACMSlE demonstrates the smallest standard deviation when $\tau < 10$ and $\tau > 12$, indicating its superior stability. While showing slight deficiencies at $\tau = 10$, 11, and 12, these values remain comparable to CMSE and within acceptable ranges. In the 1/f noise experiments, ACMSlE exhibits significant advantages across the entire scale range, effectively addressing the limitations of CMSE and MSlE, whose standard deviations increase rapidly with scale changes, leading to substantially degraded stability.
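The stability test described here (the standard deviation of an entropy measure over repeated noise realizations at each scale) can be sketched as follows. This is a generic composite multiscale slope entropy, not the authors' ACMSlE: the adaptive scale selection is omitted, only WGN is generated, and the parameters `m`, `gamma`, `delta`, the normalization by the total number of subsequences, and all function names are assumptions for illustration.

```python
import math
import random
import statistics
from collections import Counter


def slope_entropy(x, m=3, gamma=1.0, delta=1e-3):
    """Shannon entropy of slope-symbol patterns (symbols in {-2, -1, 0, 1, 2})."""
    def symbol(d):
        if d > gamma:
            return 2
        if d > delta:
            return 1
        if d >= -delta:
            return 0
        if d >= -gamma:
            return -1
        return -2

    syms = [symbol(b - a) for a, b in zip(x, x[1:])]
    n = len(syms) - (m - 1) + 1          # number of length-m subsequences
    counts = Counter(tuple(syms[i:i + m - 1]) for i in range(n))
    return -sum(c / n * math.log(c / n) for c in counts.values())


def coarse_grain(x, tau, k):
    """k-th shifted coarse-grained series (k = 0..tau-1): means of windows of length tau."""
    n = (len(x) - k) // tau
    return [sum(x[k + j * tau:k + (j + 1) * tau]) / tau for j in range(n)]


def composite_multiscale_slope_entropy(x, tau, **kwargs):
    """Average the slope entropy over all tau shifted coarse-grained series."""
    return statistics.fmean(
        slope_entropy(coarse_grain(x, tau, k), **kwargs) for k in range(tau)
    )


def entropy_std_per_scale(n_points=2048, trials=100, max_scale=20, seed=0):
    """Standard deviation of the entropy across independent WGN trials, per scale."""
    rng = random.Random(seed)
    per_scale = [[] for _ in range(max_scale)]
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        for tau in range(1, max_scale + 1):
            per_scale[tau - 1].append(composite_multiscale_slope_entropy(x, tau))
    return [statistics.stdev(v) for v in per_scale]


if __name__ == "__main__":
    # Reduced number of trials to keep the demo quick; the text uses 100.
    for tau, s in enumerate(entropy_std_per_scale(trials=5), start=1):
        print(f"tau={tau:2d}  std={s:.4f}")
```

A smaller standard deviation at a given scale indicates a more repeatable estimate, which is the criterion used in the comparison above.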

4.2. Logistic Map

The logistic map, as a deterministic equation capable of generating chaotic behavior, plays a crucial role in chaos theory. Its fundamental mathematical expression is as follows: $x_{n+1} = r x_{n} (1 - x_{n})$, where $x_{n} \in [0, 1]$ represents the value of its n-th iteration and r is the control parameter, typically ranging from 0 to 4. The system exhibits various dynamical behaviors depending on the value of r.

The logistic map signals were generated using $N = 2048$ sampling points and three different control parameters $r = \{3.7, 3.8, 3.9\}$. For better visualization, the first 1000 points are plotted in Figure 4. The time-domain waveforms clearly demonstrate distinct dynamical behaviors: periodic oscillation at $r = 3.7$, evident chaotic behavior at $r = 3.8$, and intensified chaos at $r = 3.9$.
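The three test signals can be regenerated by iterating the recurrence directly. The sketch below is illustrative only: the initial value `x0 = 0.1` and the optional `burn_in` argument are assumptions, since the text does not state how the sequences were initialized.

```python
def logistic_map(r, n=2048, x0=0.1, burn_in=0):
    """Return n successive values of x_{k+1} = r * x_k * (1 - x_k).

    x0 and burn_in are illustrative assumptions; burn_in discards initial
    iterations so that the returned segment lies further along the orbit.
    """
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(n):
        values.append(x)
        x = r * x * (1.0 - x)
    return values


if __name__ == "__main__":
    for r in (3.7, 3.8, 3.9):
        series = logistic_map(r)
        print(f"r={r}: min={min(series):.4f}, max={max(series):.4f}")
```

Because the map is sensitive to initial conditions, a different `x0` gives a different sequence, although entropy estimates over 2048 points are typically affected far less than the individual sample values.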

The analysis of the three logistic map signals using CMSE, CMCE, MSlE, and ACMSlE is presented in Figure 5. MSlE, without composite multiscale analysis, demonstrates a poor separation performance at low scales ($\tau < 6$). CMCE’s analysis shows its limited ability to distinguish curves with different r values. CMSE, utilizing multiscale analysis, exhibits good separation at low scales; however, it shows confusion when $\tau = 5$. Moreover, when $\tau > 9$, although $r = 3.7$ and $r = 3.8$ can be distinguished, their discrimination is not pronounced. ACMSlE shows some confusion in discrimination when $\tau = 1$, which is a common limitation inherent to the slope entropy calculation formula. However, it demonstrates an excellent separation performance across all subsequent scales, validating its effectiveness in processing nonlinear systems.

4.3. Data Length Verification

By employing 1/f noise with varying data lengths, we can systematically evaluate the method’s performance across different scales, verify whether the feature extraction algorithm exhibits scale dependency, and assess whether its computational stability deteriorates with increasing data length.

The data lengths we selected were $M = \{1024, 2048, 3072, 4096, 5120\}$ sampling points. The results obtained with CMSE, CMCE, MSlE, and ACMSlE are shown in Figure 6. Evidently, both CMSE and MSlE failed to demonstrate consistency across scales as the data length varied. For the short records ($M = 1024$ and 2048), their entropy values exhibited deviations as the scale increased. Notably, the CMSE algorithm encountered situations where entropy values could not be computed at higher scales when the number of points was insufficient ($M = 1024$). The CMCE algorithm demonstrated good consistency in entropy across the different data lengths, with significant deviation occurring only for the shortest record ($M = 1024$). However, the differences among the data lengths inevitably increased with increasing scale. ACMSlE maintained consistency across all scales, with acceptable deviation even for short records, indicating that this algorithm is largely insensitive to data length and can adapt to various practical scenarios.
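The difficulty with short records is consistent with simple arithmetic: coarse-graining at scale $\tau$ leaves roughly $M / \tau$ points, so the shortest record is reduced to only a few dozen points at the largest scales. The sketch below tabulates this; the scales shown, including the maximum of 20 carried over from Section 4.1, are assumptions, since the scale range used for Figure 6 is not stated here.

```python
def coarse_grained_length(n_points, tau):
    """Number of points left after averaging non-overlapping windows of length tau."""
    return n_points // tau


DATA_LENGTHS = [1024, 2048, 3072, 4096, 5120]
SCALES = (1, 5, 10, 20)  # assumption: illustrative scales, up to tau = 20

table = {
    n: [coarse_grained_length(n, tau) for tau in SCALES]
    for n in DATA_LENGTHS
}

if __name__ == "__main__":
    print("length  " + "  ".join(f"tau={tau:<3d}" for tau in SCALES))
    for n, row in table.items():
        print(f"{n:6d}  " + "  ".join(f"{v:7d}" for v in row))
```

With so few points, template-matching measures such as sample entropy may find no matches and become undefined, which is one plausible reading of the CMSE behavior reported above.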

5. Experimental Verification

5.1. Experimental Signals

The proposed fault diagnosis method was evaluated using the publicly available bearing fault dataset from Huazhong University of Science and Technology [41]. Compared to conventional datasets, this dataset incorporates compound faults, thereby increasing the diagnostic challenge. The algorithm was implemented using MATLAB 2024a on a computer with an Intel(R) Core(TM) i5-12600KF processor running at 4.9 GHz and with 32.00 GB of RAM.

The bearing fault experimental data were collected from a Spectra-Quest Machinery Fault Simulator, such as the one shown in Figure 7. For vibration measurements, a TREA331 triaxial accelerometer (CTC, USA) was utilized with a sensitivity of 100 mV/g (±15%) and frequency response range of 30–600,000 CPM (±3 dB). The accelerometer was mounted directly on the bearing housing to ensure optimal vibration transmission, with its triaxial configuration (X, Y, Z channels) enabling the comprehensive capture of bearing dynamic characteristics. Signal acquisition was performed at a sampling rate of 25.6 kHz through a dedicated data acquisition system, ensuring the high-fidelity measurement of fault-induced vibration signatures across the entire frequency range of interest.

Vibration signals were acquired from ER-16K bearings under four different operating conditions (3900 rpm, 4200 rpm, 4500 rpm, and 4800 rpm) and in nine distinct health states. The faulty bearings had a shaft diameter of 38.52 mm, a ball diameter of 7.94 mm, and contained 9 balls. All bearing defects were artificially introduced and had controlled dimensions. For inner and outer race defects, medium faults measured approximately 0.15 mm in width, while severe faults measured 0.3 mm. The ball element defects were slightly larger, with medium faults measuring 0.25 mm and severe faults 0.5 mm. The combination faults included defects on both the inner race and ball elements, creating more complex vibration signatures that challenge traditional diagnostic methods.

The experimental setup utilized a Marathon Motors variable-frequency drive controller for precise speed regulation. This control system allowed for the adjustment of the motor frequency between 20 and 80 Hz, corresponding to a wide range of shaft rotation speeds, with stability maintained at ±0.5% throughout each test. While measurements were collected across this operational range, our comparative analysis focused on four representative speeds. After our preliminary analysis revealed similar feature characteristics across these speeds, we selected 4200 rpm as the representative speed for our detailed comparative analysis. It should be noted that the test bench was designed to investigate fault characteristics under controlled speed conditions without external load variation, allowing us to isolate the effects of bearing faults without the presence of additional variables. All processing and diagnostic analyses were performed offline, with collected vibration data systematically divided into equal training and testing sets.

Figure 7 illustrates the nine health states of the bearings: 1: normal condition (Normal); 2: medium inner race fault (I-1); 3: severe inner race fault (I-2); 4: medium outer race fault (O-1); 5: severe outer race fault (O-2); 6: medium ball fault (B-1); 7: severe ball fault (B-2); 8: medium combination fault (C-1); and 9: severe combination fault (C-2).

The nine fault categories were carefully selected to represent the full spectrum of bearing failure modes commonly encountered in industrial applications. The normal condition serves as the essential baseline for comparison. Inner race faults are among the most common bearing defects in the industry and typically result from improper installation or material fatigue. Outer race faults frequently occur due to their stationary position relative to the load zone. Ball element faults present unique diagnostic challenges due to their rotational movement, creating non-stationary fault signatures. The compound faults (simultaneous inner race and ball defects) have particular industrial relevance, as real-world bearings frequently develop multiple fault types simultaneously, yet these complex cases are rarely addressed in diagnostic studies. The two severity levels (medium/severe) for each fault type reflect the progressive nature of bearing degradation in industrial settings, where early detection enables proactive maintenance planning. These fault categories and their dimensional specifications align with industrial bearing inspection standards, where the defect size directly correlates with the urgency of maintenance and remaining useful life estimations.

This section validates the proposed fault diagnosis method using all nine bearing health conditions at the 4200 rpm operating condition. For each fault type, 120 samples were used, with each sample containing 2048 data points. The complete analysis was performed in a controlled offline environment to enable comprehensive evaluation of the algorithm’s performance across different methodologies. Samples were systematically divided into equal training and testing sets to ensure unbiased performance assessment. The preprocessing methods used for comparison included the Empirical Wavelet Transform (EWT) [42], FEEMD, Singular Spectrum Analysis (SSA) [43], and Variational Mode Decomposition (VMD) [44]. The feature extraction methods compared were CMSE, CMCE, MSlE, and ACMSlE. The classifiers used for the comparison were Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM) architectures [45], PSO-ELM, and the Support Vector Machine (SVM) [46].
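The sample layout described here can be sketched in a few lines. The numbers (25.6 kHz sampling rate, 2048 points per sample, 120 samples per fault type, 4200 rpm, equal split) come from the text; the use of non-overlapping windows, the random shuffling before the split, and the function names are assumptions, because the segmentation details are not given.

```python
import random

SAMPLING_RATE_HZ = 25_600
SAMPLE_LENGTH = 2048
SAMPLES_PER_CLASS = 120
SHAFT_SPEED_RPM = 4200

# Each sample spans 2048 / 25600 = 0.08 s, i.e. 5.6 shaft revolutions at 4200 rpm.
sample_duration_s = SAMPLE_LENGTH / SAMPLING_RATE_HZ
revs_per_sample = sample_duration_s * SHAFT_SPEED_RPM / 60


def segment(signal, length=SAMPLE_LENGTH, count=SAMPLES_PER_CLASS):
    """Cut a record into `count` non-overlapping windows (assumption: no overlap)."""
    if len(signal) < length * count:
        raise ValueError("record too short for the requested number of samples")
    return [signal[i * length:(i + 1) * length] for i in range(count)]


def split_half(samples, seed=0):
    """Shuffle and split the samples of one class 1:1 into training and testing sets."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)
    half = len(idx) // 2
    train = [samples[i] for i in idx[:half]]
    test = [samples[i] for i in idx[half:]]
    return train, test


if __name__ == "__main__":
    record = [0.0] * (SAMPLE_LENGTH * SAMPLES_PER_CLASS)  # placeholder vibration record
    train, test = split_half(segment(record))
    print(len(train), len(test), sample_duration_s, round(revs_per_sample, 2))
```

Applying the split per fault type keeps the training and testing sets balanced across the nine classes, giving 60 training and 60 test samples for each.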

5.2. Comparison of Pretreatment Methods

FEEMD, EWT, VMD, and SSA are distinct signal decomposition methodologies, each exhibiting unique advantages in processing non-stationary and nonlinear signals, particularly for bearing fault diagnosis. Specifically, EWT and VMD emphasize adaptive decomposition, while FEEMD excels in mitigating mode mixing issues, and SSA specializes in denoising and trend extraction. Although significant breakthroughs have recently been achieved in these methodologies, this study focuses on evaluating their compatibility with our proposed entropy-based method, deliberately excluding derivative algorithms. These preprocessing techniques were applied in conjunction with the ACMSlE method, followed by validation using PSO-ELM. The classification accuracy results are presented in the confusion matrix shown in Figure 8.

Based on the experimental data, it is evident that the primary misclassification occurs with Class 1 and Class 4, which represent normal and medium outer race faults, respectively, as well as with Class 6, which corresponds to medium ball faults. All four methodologies exhibit these misclassification issues. However, the results indicate that preprocessing using the FEEMD method significantly minimizes these errors without introducing any unique drawbacks. Consequently, the FEEMD approach achieves a final accuracy of 98.7%, which is at least 1.5% higher than the other methods, demonstrating its superior performance.

To eliminate experimental randomness and assess whether the differences between the various methods are significant, statistical indicators were introduced, with specific metrics and their implications presented in Table 2. Thirty trials were conducted under identical conditions, utilizing 60 training samples and 60 test samples per fault type, while the default parameters were kept constant throughout. The results are shown in Table 3. Evidently, FEEMD demonstrates significant advantages over the other preprocessing approaches.

5.3. Comparison of Feature Extraction Methods

t-SNE visualization was employed to validate the effectiveness of four entropy-based feature extraction methods, as illustrated in Figure 9. To ensure robust visualization, we selected 20 samples per fault category and implemented enhanced visual aids, including centroids, projection lines, and bottom-plane projections, to facilitate the clear spatial interpretation of feature distributions. Notably, mixing phenomena were observed in the CMSE, CMCE, and MSlE visualizations. Specifically, the CMSE method (Figure 9a) exhibited a partial separation of fault categories, but showed significant confusion between the Normal, B-2, and O-2 conditions in the lower front region of the visualization space. More problematically, it failed to adequately separate the features of medium and severe compound faults (classes C-1 and C-2), which is one of the most challenging aspects of our dataset. The CMCE method (Figure 9b) demonstrated the poorest performance among all tested entropy approaches, with substantial overlap and mixing between multiple fault categories. Particularly concerning was its inability to form consistent cluster boundaries, with samples from different fault categories (especially Normal, B-1, O-2, and B-2) scattered throughout the feature space without clear separation. The MSlE method (Figure 9c) demonstrated promising results with well-formed clusters for most fault types, particularly for the B-2, O-2, and C-2 categories, which formed distinct, compact clusters. However, it showed a tendency to confuse normal operating conditions with medium outer race faults (classes Normal and O-1), creating potential diagnostic ambiguity in critical early-fault detection scenarios. Additionally, the I-2 and C-2 samples exhibited a proximity that could lead to misclassification.

In contrast, the proposed ACMSlE method (Figure 9d) exhibited a superior performance, achieving clear separation between all nine fault categories with distinct, compact clustering and minimal overlap. Each fault type formed a well-defined cluster with appropriate spatial separation from other categories. Particularly noteworthy is ACMSlE’s ability to distinguish between similar fault types of different severity levels (e.g., I-1 vs. I-2, C-1 vs. C-2), which is crucial for effective prognostic applications and maintenance planning.

To rule out chance effects, the indicators obtained over 30 repeated tests are listed in Table 3, where method 1 is FEEMD-ACMSlE-PSO_ELM, method 2 is EWT-ACMSlE-PSO_ELM, method 3 is SSA-ACMSlE-PSO_ELM, method 4 is VMD-ACMSlE-PSO_ELM, method 5 is FEEMD-ACMSlE-CNN-LSTM, and method 6 is FEEMD-ACMSlE-SVM.

5.4. Comparison of Classifier Methods

Finally, we selected SVM and CNN-LSTM as baselines for comparison with the PSO-ELM method. CNN-LSTM combines a CNN’s local feature extraction capability (through convolution kernels) with LSTM’s temporal modeling ability, making it suitable for processing signals with spatio-temporal dependencies. SVM, as a classical small-sample classifier used in traditional machine learning, excels at handling linear or nonlinear classification problems in high-dimensional feature spaces (through kernel function mapping). PSO-ELM is an optimized shallow neural network that integrates the rapid training characteristics of the Extreme Learning Machine (ELM) with the global search capability of Particle Swarm Optimization (PSO), addressing the instability issues caused by random parameter initialization in the traditional ELM. The PSO-ELM results are shown in Figure 9a, while the results of the other two methods are presented in Figure 10. The results demonstrate that PSO-ELM achieved the highest accuracy, at 98.7%, followed by CNN-LSTM at 95.6% and SVM at 95.4%. This indicates that the proposed ACMSlE method exhibits superior synergy with optimized shallow networks (in terms of feature–model complexity matching), suggesting that such a classifier effectively leverages the capabilities of the feature extraction approach.

6. Discussion and Limitations

The implementation of our proposed framework requires the consideration of several key factors for its optimal performance and industrial application. While our study demonstrates its strong performance in controlled laboratory settings, several important limitations should be acknowledged.

The computational complexity of ACMSlE, while higher than conventional entropy measures, remains reasonable for offline analysis with standard computing resources. Its current implementation is most suitable for environments where diagnostic accuracy takes precedence over real-time processing constraints. For real-time applications and embedded systems with limited resources, several adaptation strategies could make our approach more feasible:

Implementing a batch processing approach where vibration data are collected in fixed-length windows (1–2 second intervals);
Optimizing the number of scale factors, as even with just three scales, the method maintains strong discrimination capabilities;
Employing hardware acceleration through FPGA-based solutions to leverage the inherently parallel nature of multiscale entropy computation;
Replacing the computationally intensive FEEMD preprocessing with lighter alternatives when processing speed is prioritized over maximum accuracy;
Developing hybrid monitoring systems where simplified algorithms perform continuous monitoring that triggers comprehensive analysis only when potential fault signatures are detected.

Real industrial environments present additional challenges beyond these computational constraints. Non-stationary machinery behavior due to load variations, speed fluctuations, or transient events may affect the stability of entropy calculations. The current methodology assumes relatively stable operating conditions and may require adaptive parameter selection mechanisms to maintain its performance under highly variable conditions. Sensor noise is particularly challenging for early-stage faults, where signal-to-noise ratios are low. Although our multiscale approach inherently provides some noise resistance, extreme noise conditions may require additional preprocessing. Scenarios in which data are missing due to sensor failures or communication interruptions would impact the continuous time series analysis that entropy methods depend on, necessitating robust interpolation techniques, which could be a subject for future research.

The high classification accuracy (98.7%) achieved for compound fault scenarios should be considered alongside the potential risks of overfitting and misclassification. Compound faults present unique challenges, as their vibration signatures combine the characteristics of multiple elemental faults with complex interactions. Several aspects of our approach mitigate overfitting risks: the 30 independent trials with different random initializations demonstrate remarkable consistency (±0.32% standard deviation) and t-SNE visualizations show that ACMSlE creates well-separated feature clusters even between medium and severe compound fault categories. Nevertheless, misclassification risks remain in borderline cases where compound fault signatures might share similarities with individual component faults. The primary confusion that was observed occurred between normal conditions and medium outer race faults when using some preprocessing methods, though our optimal FEEMD-ACMSlE-PSO-ELM combination significantly reduced this confusion.

Despite these limitations, the fundamental components of our methodology have potential applicability, beyond bearing diagnosis, in other rotating machinery systems. The ACMSlE feature extraction method could be adapted for gearboxes, rotors, and motors with appropriate parameter adjustments. The primary requirements for its successful extension would be that the phenomena of interest produce detectable patterns across multiple time scales, and that an acceptable signal-to-noise ratio can be achieved through appropriate preprocessing.

7. Conclusions

This study introduces a novel ACMSlE feature extraction method for nonlinear dynamics analysis, which effectively enhances stability and complexity discrimination capability while reducing data length dependency compared to the conventional MSlE approach. Furthermore, we propose an integrated rolling bearing fault diagnosis framework that combines FEEMD preprocessing, ACMSlE feature extraction, and PSO-ELM classification. The proposed method was validated using an experimental dataset from Huazhong University of Science and Technology. The results demonstrate its superior performance compared to other entropy-based methods (CMSE, CMCE, and MSlE) and various combinations of preprocessing and classification techniques (VMD-X-PSO-ELM, EWT-X-PSO-ELM, SVM, and CNN-LSTM). Notably, our algorithm maintains a high accuracy of 98.7% when dealing with challenging datasets, significantly outperforming alternative approaches.

Furthermore, our research aligns with several emerging trends in the 2022–2024 diagnostic literature. The increasing focus on interpretable AI for industrial applications has highlighted the value of physically meaningful features like those provided by our entropy-based approach. While deep learning methods continue to gain popularity, recent work has emphasized the importance of balancing black-box complexity with explainable decision-making processes that engineers can trust and understand. Our hybrid approach, combining advanced feature extraction with optimized shallow networks, represents a middle ground that maintains both performance and interpretability. The growing interest in adaptive methodologies that can automatically adjust to varying signal conditions is also relevant to our development of the ACMSlE method, with its dynamic parameter selection capabilities. As the field continues to evolve toward more robust and adaptable diagnostic frameworks, our approach contributes to this progression by demonstrating how traditional entropy analysis can be enhanced through strategic modifications that address specific industrial challenges.

Several avenues for future research remain. The classification accuracy of this approach could potentially be further enhanced by exploring alternative optimization algorithms for ELM parameter tuning. Additionally, investigating suitable optimization algorithms for the SVM that better complement ACMSlE features presents another promising direction. Furthermore, the integration of fuzzy clustering techniques with our entropy-based feature extraction method could enhance its handling of the inherent uncertainties in bearing fault diagnosis. Future work should also consider validating the method in industrial environments under variable operating conditions, extending it to other mechanical systems such as gearboxes and rotors, and investigating its hybridization with deep learning approaches for improved feature learning. Optimizing this methodology for real-time implementation and developing models for continuous fault progression tracking represent additional valuable research directions. These aspects warrant further investigation in subsequent studies.

Author Contributions

Conceptualization, S.W.; methodology, S.W.; software, S.W.; validation, W.Z., L.Z., and J.Q.; formal analysis, L.Z.; investigation, J.Q.; resources, W.Z.; data curation, S.W.; writing—original draft preparation, S.W.; writing—review and editing, Z.Y. and W.L.; visualization, S.W.; supervision, J.Q.; project administration, W.Z.; funding acquisition, W.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Jiangxi Provincial Department of Education Science and Technology Research Project under grant number GJJ2404910, China.

Data Availability Statement

The data used are unavailable due to privacy or ethical restrictions.

Acknowledgments

The authors gratefully acknowledge Chao Zhao’s research team at Huazhong University of Science and Technology for providing the bearing fault dataset, which played a crucial role in validating our proposed methodology. The sophisticated design and complexity of their fault dataset presented both challenges and opportunities that significantly contributed to the rigorous evaluation of our algorithm.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Wei, Y.; Li, Y.; Xu, M.; Huang, W. A review of early fault diagnosis approaches and their applications in rotating machinery. Entropy 2019, 21, 409. [Google Scholar] [CrossRef] [PubMed]
Liu, R.; Yang, B.; Zio, E.; Chen, X. Artificial intelligence for fault diagnosis of rotating machinery: A review. Mech. Syst. Signal Process. 2018, 108, 33–47. [Google Scholar] [CrossRef]
Zhu, Z.; Lei, Y.; Qi, G.; Chai, Y.; Mazur, N.; An, Y.; Huang, X. A review of the application of deep learning in intelligent fault diagnosis of rotating machinery. Measurement 2023, 206, 112346. [Google Scholar] [CrossRef]
Liu, D.; Cui, L.; Wang, H. Rotating machinery fault diagnosis under time-varying speeds: A review. IEEE Sensors J. 2023, 23, 29969–29990. [Google Scholar] [CrossRef]
Kizilkaya, A.; Elbi, M.D. A fast approach of implementing the Fourier decomposition method for nonlinear and non-stationary time series analysis. Signal Process. 2023, 206, 108916. [Google Scholar] [CrossRef]
Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 111–123. [Google Scholar]
Delgado-Bonal, A.; Marshak, A. Approximate entropy and sample entropy: A comprehensive tutorial. Entropy 2019, 21, 541. [Google Scholar] [CrossRef]
Richman, J.S.; Lake, D.E.; Moorman, J.R. Sample entropy. In Methods in Enzymology; Elsevier: Amsterdam, The Netherlands, 2004; Volume 384, pp. 172–184. [Google Scholar]
Bandt, C.; Pompe, B. Permutation entropy: A natural complexity measure for time series. Phys. Rev. Lett. 2002, 88, 174102. [Google Scholar] [CrossRef]
Ye, J. Two effective measures of intuitionistic fuzzy entropy. Computing 2010, 87, 55–62. [Google Scholar] [CrossRef]
Yang, J.; Choudhary, G.I.; Rahardja, S.; Fränti, P. Classification of interbeat interval time-series using attention entropy. IEEE Trans. Affect. Comput. 2020, 14, 321–330. [Google Scholar] [CrossRef]
Li, P.; Liu, C.; Li, K.; Zheng, D.; Liu, C.; Hou, Y. Assessing the complexity of short-term heartbeat interval series by distribution entropy. Med. Biol. Eng. Comput. 2015, 53, 77–87. [Google Scholar] [CrossRef] [PubMed]
Wang, X.; Si, S.; Li, Y. Multiscale diversity entropy: A novel dynamical measure for fault diagnosis of rotating machinery. IEEE Trans. Ind. Inform. 2020, 17, 5419–5429. [Google Scholar] [CrossRef]
Fischer, I. The conditional entropy bottleneck. Entropy 2020, 22, 999. [Google Scholar] [CrossRef]
Cuesta-Frau, D. Slope entropy: A new time series complexity estimator based on both symbolic patterns and amplitude information. Entropy 2019, 21, 1167. [Google Scholar] [CrossRef]
Mao, K.; Wang, Y.; Ye, J.; Zhou, W.; Lin, Y.; Fang, B. Belief structure-based Pythagorean fuzzy entropy and its application in multi-source information fusion. Appl. Soft Comput. 2023, 148, 110860. [Google Scholar] [CrossRef]
Cuesta Frau, D. Permutation entropy: Influence of amplitude information on time series classification performance. Math. Biosci. Eng. 2019, 16, 6842–6857. [Google Scholar] [CrossRef]
Shang, H.; Liu, Z.; Wei, Y.; Zhang, S. A novel fault diagnosis method for a power transformer based on multi-scale approximate entropy and optimized convolutional networks. Entropy 2024, 26, 186. [Google Scholar] [CrossRef]
Yin, Y.; Shang, P. Multivariate multiscale sample entropy of traffic time series. Nonlinear Dyn. 2016, 86, 479–488. [Google Scholar] [CrossRef]
Fan, Q.; Liu, Y.; Yang, J.; Zhang, D. Graph multi-scale permutation entropy for bearing fault diagnosis. Sensors 2023, 24, 56. [Google Scholar] [CrossRef]

Figure 1. Comparative flowcharts of MSlE (left) and ACMSlE (right) algorithms.

Figure 2. The workflow of our fault diagnosis method.

Figure 3. SD of CMSE, CMCE, MSlE, and ACMSlE at 20 scales of (a) white noise and (b) 1/f noise.

Figure 4. The waveforms resulting from logistic map changes: r = 3.7 (a), 3.8 (b), and 3.9 (c).

Figure 5. The complexity of logistic map signals with different r values. (a) CMSE, (b) CMCE, (c) MSlE, and (d) ACMSlE.

Figure 6. The performance of the models on different data points: (a) CMSE, (b) CMCE, (c) MSlE, and (d) ACMSlE.

Figure 7. (a) Fault test bench; (b) bearing health status.

Figure 8. Recognition accuracy of four feature extraction methods: (a) number of misclassified samples; (b) average recognition rate.

Figure 9. t-SNE visualization results of four entropy methods: (a) MSE, (b) CMSlE, (c) CMSE, and (d) ACMSlE.

Figure 10. The classification accuracy results: (a) CNN-LSTM and (b) SVM.

Table 1. Extreme Learning Machine training algorithm.

Input: Training samples $\{x_{j}, t_{j}\}_{j=1}^{N}$, hidden neuron count $L$
Output: Trained ELM model parameters
1	Randomly generate input weights $w_{i}$ and biases $b_{i}$ for $i = 1, 2, \dots, L$
2	Construct hidden layer output matrix $H$:
	$H_{ji} = g(w_{i} \cdot x_{j} + b_{i})$, $j = 1, \dots, N$; $i = 1, \dots, L$
3	Calculate output weights: $\beta = H^{\dagger} T$ using the Moore-Penrose generalized inverse
4	Return network parameters $\{w_{i}, b_{i}, \beta\}$
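
The four steps in Table 1 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: the sigmoid activation, the uniform initialization range, the function names `train_elm` and `predict_elm`, and the two-cluster toy data are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Activation g(.); the sigmoid is an assumed choice for this sketch."""
    return 1.0 / (1.0 + np.exp(-z))


def train_elm(X, T, n_hidden, seed=None):
    """Train a single-hidden-layer ELM following steps 1-4 of Table 1.

    X: (N, d) inputs, T: (N, m) targets, n_hidden: hidden neuron count L.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Step 1: random input weights w_i and biases b_i (range is an assumption).
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Step 2: hidden layer output matrix H, with H[j, i] = g(w_i . x_j + b_i).
    H = sigmoid(X @ W + b)
    # Step 3: output weights beta = pinv(H) @ T (Moore-Penrose inverse).
    beta = np.linalg.pinv(H) @ T
    # Step 4: return the network parameters.
    return W, b, beta


def predict_elm(X, W, b, beta):
    """Forward pass: hidden activations times output weights."""
    return sigmoid(X @ W + b) @ beta


# Hypothetical toy data: two well-separated 2-D clusters, one-hot targets.
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(-2.0, 0.5, (50, 2)),
               data_rng.normal(2.0, 0.5, (50, 2))])
labels = np.repeat([0, 1], 50)
T = np.eye(2)[labels]

W, b, beta = train_elm(X, T, n_hidden=20, seed=0)
pred = predict_elm(X, W, b, beta).argmax(axis=1)
train_acc = (pred == labels).mean()
print(f"training accuracy: {train_acc:.3f}")
```

Only `beta` is fitted, and it is obtained in closed form, which is why no iterative training loop appears. In a PSO-ELM setting, the randomly drawn `W` and `b` would instead be the quantities searched by the optimizer; that part is not shown here.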

Table 2. Indicators used to evaluate classification effectiveness.

Index	Equation	Annotation
Accuracy	$\frac{TP + TN}{TP + TN + FP + FN}$	TP: True Positive (cases correctly identified as positive). TN: True Negative (cases correctly identified as negative). Measures overall prediction correctness.
Precision	$\frac{TP}{TP + FP}$	FP: False Positive (cases incorrectly identified as positive). Measures the exactness of positive predictions. Also known as Positive Predictive Value.
Recall	$\frac{TP}{TP + FN}$	FN: False Negative (cases incorrectly identified as negative). Measures the proportion of actual positive cases that are correctly identified. Also known as Sensitivity or True Positive Rate.
F1-Score	$\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$	Harmonic mean of Precision and Recall, balancing the two. Range: [0, 1], with 1 being the best score.
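
As a worked example of the formulas in Table 2, the sketch below computes the four indicators for one class treated as "positive" in a one-vs-rest fashion. The label names and the six sample predictions are hypothetical, chosen only to make the arithmetic easy to follow, and the zero-division fallbacks are a convention assumed for this sketch.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN with `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, Precision, Recall, and F1-Score as defined in Table 2."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Return 0.0 when a denominator is zero (assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


# Hypothetical labels for three bearing states.
y_true = ["normal", "normal", "inner", "inner", "outer", "outer"]
y_pred = ["normal", "inner", "inner", "inner", "outer", "normal"]

tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive="inner")
accuracy, precision, recall, f1 = classification_metrics(tp, tn, fp, fn)
print(tp, tn, fp, fn)  # 2 3 1 0
print(f"{accuracy:.3f} {precision:.3f} {recall:.3f} {f1:.3f}")
```

For the "inner" class this gives TP = 2, TN = 3, FP = 1, and FN = 0, so Precision = 2/3, Recall = 1, and F1 = 0.8. For a multi-class result, one common choice is to repeat the computation per class and average the per-class values.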

Table 3. Performance comparison of different methods.

Method	Accuracy	Precision	Recall	F1-Score
Method 1	98.72 ± 0.32	98.80 ± 0.30	98.72 ± 0.32	98.75 ± 0.31
Method 2	96.76 ± 0.50	96.85 ± 0.48	96.76 ± 0.50	96.80 ± 0.49
Method 3	97.14 ± 0.34	97.24 ± 0.35	97.24 ± 0.34	97.29 ± 0.34
Method 4	93.27 ± 0.46	93.22 ± 0.48	93.27 ± 0.46	93.25 ± 0.47
Method 5	95.19 ± 0.41	95.67 ± 0.84	95.19 ± 0.69	95.29 ± 0.69
Method 6	95.37	95.76 ± 0.88	95.37 ± 0.69	95.47 ± 0.74

Values are presented as mean ± standard deviation, in percent (%).

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
