Low Complexity Non-Linear Spectral Features and Wear State Models for Remaining Useful Life Estimation of Bearings

Chelmiah, Eoghan T.; McLoone, Violeta I.; Kavanagh, Darren F.

doi:10.3390/en16145312


by Eoghan T. Chelmiah *, Violeta I. McLoone and Darren F. Kavanagh *

Faculty of Engineering, South East Technological University, Carlow Campus, Kilkenny Rd, R93 V960 Carlow, Ireland

* Authors to whom correspondence should be addressed.

Energies 2023, 16(14), 5312; https://doi.org/10.3390/en16145312

Submission received: 11 May 2023 / Revised: 27 June 2023 / Accepted: 5 July 2023 / Published: 11 July 2023

(This article belongs to the Special Issue Condition Monitoring and Machine Learning Strategies for Electrical Apparatus 2022–2023)


Abstract:

Improving the reliability and performance of electric and rotating machines is crucial to many industrial applications. This will lead to improved robustness, efficiency, and eco-sustainability, as well as mitigate significant health and safety concerns regarding sudden catastrophic failure modes. Bearing degradation is the most significant cause of machine failure and has been reported to cause up to 75% of low-voltage machine failures. This paper introduces a low complexity machine learning (ML) approach to estimate the remaining useful life (RUL) of rolling element bearings using real vibration signals. This work explores different ML recipes using novel feature engineering coupled with various k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) kernel and weighting functions in order to optimise this RUL approach. Original non-linear wear state models and feature sets are investigated; the latter are derived from the Short-time Fourier Transform (STFT) and the Hilbert Marginal Spectrum (HMS). These feature sets incorporate one-third octave band filtering for low complexity multivariate feature subspace compression. The proposed ML algorithm stage employs two robust supervised ML approaches: weighted k-NN and SVM. Real vibration data were drawn from the Pronostia platform to test and validate this prognostic monitoring approach. The results clearly demonstrate the effectiveness of this approach, with classification accuracy results of up to 82.8% achieved. This work contributes to the field by introducing a robust and computationally inexpensive method for accurate monitoring of machine health using low-cost vibration-based sensing.

Keywords:

condition-based monitoring (CbM); feature extraction; Hilbert-Huang transform (HHT); Hilbert marginal spectrum (HMS); k-Nearest Neighbour (kNN); machine learning (ML); mechanical bearings; prognostics; rotating machine; support vector machine (SVM); remaining useful life (RUL)

1. Introduction

A major driver for developing advanced predictive maintenance systems is the rapid growth of electrification of various sectors such as transportation, energy, and industrial and agricultural processes. It is imperative that unscheduled maintenance and machine downtime are reduced in order to increase the availability and overall efficiency and the adoption of green technologies. Also, preventing abrupt failure modes from occurring significantly reduces health and safety concerns associated with electric and rotating machines. Catastrophic equipment failure for various mission-critical systems, such as aircraft and electrified vehicles can have serious health and safety consequences [1,2,3]. Accurate prediction of the remaining useful life (RUL) of components significantly reduces the potential for sudden failure modes to occur, allowing for timely repairs. As a result, the typical high cost associated with equipment failures can be removed, along with incurred delays and lost operational time [4,5].

Rolling element bearings are among the most critical mechanical components used in all types of rotating machines. Bearing failure has been reported by several studies to be responsible for up to 75% of low-voltage motor/generator breakdowns and up to 41% of all rotating machine failures [6,7,8]. Bellini and Immovilli have documented bearing failure to be the root cause of up to 90% of small machine breakdowns [9]. Furthermore, it has been claimed that approximately 40% of all premature bearing failure modes are caused and accelerated by inadequate lubrication of the component [10]. Bearings typically operate in stressful environments with high levels of vibration from the operating machines, which contributes towards their degradation and fault occurrence. The major failure modes, as defined by the ISO standards, are as follows: fatigue, wear, corrosion, electrical erosion, plastic deformation, fracture, and cracking [11].

Data-driven bearing prognostic systems are constructed by applying signal processing techniques to real, sensor-acquired signals in order to analyse and detect trends that provide valuable evidence of system degradation [12,13]. Sensing modalities to acquire bearing degradation signatures that have been widely explored in recent years include vibration signals [1,3,14,15], acoustic emissions [16,17], stator current measurements [18,19,20], thermal-imaging [21], and multiple sensor fusion [22,23]. Of these, vibration signals acquired from mounted accelerometers are often regarded as the most favourable approach for condition-based monitoring (CbM) in general, due to the non-invasive nature of the measurement, low cost, robustness and ease of implementation in practice [24]. Similarly, a rich literature exists on the topic of feature extraction approaches on vibration signals for prognostic monitoring applications. These include statistical time-domain features [25,26,27] and time-frequency feature extraction methods such as Short-time Fourier Transform (STFT) [28,29,30], signal Envelope Analysis (EA) [31,32,33], Wavelet Transform (WT) [34,35], Feature Mode Decomposition (FMD) [36], and Empirical Mode Decomposition (EMD) methods [37,38,39,40]. In recent years, a variety of Machine Learning (ML) algorithms have been proposed to perform the task of bearing RUL prediction. These algorithms vary in complexity and efficiency and often incorporate supervised ML approaches such as k-Nearest Neighbour (k-NN) and Support Vector Machines (SVM) [3,15,16], regression model approaches [14,39], deep learning models including convolutional neural networks (CNN) and deep belief networks (DBN) [41,42], blind deconvolution methods [43,44], and Bayesian probabilistic prediction models such as Kalman and particle filtering [45,46].

This paper introduces a low complexity prognostic monitoring approach for RUL estimation of mechanical rolling element bearings that builds on previous work by the authors, which achieved an RUL classification accuracy of 74.3% using signal EA combined with novel feature engineering techniques [1]. The proposed methods incorporate novel feature engineering techniques comprising non-linear time-frequency signal processing methods, which reduce the computational complexity by lowering the dimensionality of the bearing vibration signals. The one-third octave band filter is proposed for feature compression of both STFT and Hilbert marginal spectra (HMS). Non-linear temporal class boundaries are also proposed for characterising the lifetime trend of rolling element bearings. Two robust, low complexity algorithms were investigated to perform RUL classification: weighted k-Nearest Neighbour (k-NN) and Support Vector Machines (SVM). Both of these supervised learning algorithms were optimised and refined through applied experimentation with various parameters, such as different kernel functions and distance metrics. This approach has been tested on real-world vibration signals from the Pronostia platform [47], with the results clearly demonstrating and validating its robustness and efficacy for this challenging RUL estimation problem. The results and analysis of two experimental studies are presented: study A, for a full set of specimens, achieved an RUL classification accuracy of 74.1%, and study B, for a reduced set of specimens which experienced a gradual degradation trend under test conditions, achieved an RUL classification accuracy of 82.8%.

This article comprises the following sections. Section 2 illustrates the proposed RUL estimation framework. The experimental approach is presented in Section 3. The results achieved using the proposed condition-based monitoring system on real bearing vibration signal data are presented in Section 4. Section 5 examines the key findings from the results and discusses some shortcomings. To conclude, Section 6 summarises the major contributions of this article and gives some insights for future work.

2. RUL Estimation Framework

The proposed ML methodology and experimental approach for performing bearing RUL estimation is illustrated in Figure 1, which lists the major stages involved. The framework begins with the signal acquisition stage, where vibration signals are collected. The signals are processed during the feature extraction stage to extract time-frequency information. The extracted features are further processed during the feature compression stage, where filters reduce the time-frequency data to the most important and useful elements. The wear state classes stage of the RUL estimation process categorises the training samples into five groups. Each of the five groups represents a sequential portion of the bearing's lifetime, from unaged to failure. The final stage of the framework is the classification stage, where supervised ML algorithms are trained and tested to estimate the RUL of unlabelled test signals.

2.1. Vibration Signal Acquisition

Mechanical bearings in operating rotating machinery degrade naturally in a gradual process. This gradual and slowly evolving degradation process, from new to the state of failure, takes many years to occur in an effectively designed system. Applied experiments have been conducted involving artificially inducing faults to the structure of the bearing or accelerating the ageing process by applying excessive radial loads and rotating speeds, as reported in [48] and [47]. The RUL estimation framework presented in this article has been tested and validated on the Pronostia Dataset [47], which has been used widely in the field of bearing prognostics [2,49,50,51].

The ball bearings under test for this dataset were NSK 6804DD, which can operate at a maximum speed of 13,000 rpm and a load limit of 4000 N [52]. There are a total of 17 bearing vibration signals in the dataset, collected using three different radial load and rotating speed conditions, detailed as 1800 rpm, 1650 rpm, 1500 rpm, and 4000 N, 4200 N, 5000 N, respectively [47]. In this work, the signals recorded under the first of these conditions (1800 rpm and 4000 N) were used. Real bearing degradation signals acquired through applied accelerated ageing experiments are presented in the dataset; however, details regarding the specific type, severity and source of each failure mode are not identified. This dataset prioritises monitoring the overall condition state of the mechanical bearing as opposed to diagnosing and localising the fault.

A pair of Dytran Model 3035B miniature accelerometer sensors were utilised to record bearing vibration signals. The sensors were radially mounted orthogonally to one another, with one on the vertical and the other on the horizontal axis of the bearing's external race. The sensor pair were connected to a data acquisition card (NI DAQCard-9174) to aggregate and transmit the data to the central unit for real-time data visualisation and storage through a USB 2.0 link [47,52]. The NI-9234 sound and vibration input module was used to perform vibration measurements from the accelerometer sensors. The module features a 24-bit analogue-to-digital converter (ADC). The accelerometer signals were sampled by the ADC at a rate of 25.6 kHz for all test cases [47]. Each vibration measurement, with a duration of 0.1 s, consists of 2560 samples. Measurements were taken every 10 s until the vibration measurements surpassed the predetermined threshold amplitude level indicating bearing failure. The vibration signals are measured in units of g, where 1 g is the standard gravitational acceleration of the Earth. Only the signals from the horizontal vibration sensor, which lies in the same plane as the applied load, were used for the proposed methods. Previous experiments have shown these to be superior for prognostic purposes [3].

The bearing failure threshold or limit was defined as the condition when 20 g of vibration amplitude was reached. This value was predefined to prevent damage to the fixture mounts and prolong the usability of the experimental testbed to acquire more data from subsequent test specimens [47]. The duration of each test case varied in length with the longest test case reaching almost 8 h and the shortest lasting only 2 h and 25 min, as depicted in Figure 2a,b. Plots (c) and (d) of Figure 2 illustrate the spectrogram representation of bearing signals S.01 and S.04, respectively. Bearing S.01 exhibits a gradual degradation trend, observed by the incremental increase in power spectral density (PSD) values from unaged to failure. In contrast, bearing S.04 exhibits a very sudden and severe degradation trend, observed as the sudden increase in PSD values just after the 10,000 s timestamp in plot (d).

2.2. Feature Extraction

The bearing vibration signals are transferred from the time domain to the time-frequency domain to observe and determine the spectral energy components of the signal, as illustrated in Figure 3. In this work, two methods of time-frequency feature extraction are presented and investigated: the Short-time Fourier Transform (STFT), and the more modern Hilbert Marginal Spectrum approach (HMS). Advantages of calculating the time-frequency domain spectral components can be attributed to using the spectral energy information for discovering frequency areas of interest which best indicate the health condition of the bearing component. The STFT has found widespread use in the field of bearing condition monitoring due to its robustness and mathematical underpinnings [28,29,30]. The EMD and HHT approach proposed by Huang et al. has found widespread use for processing non-stationary, non-linear signals [54]. The two make good comparisons as one is tried and tested whereas the other can unlock diagnostic feature signatures perhaps masked by the limitations of STFT.

2.2.1. Fourier Spectral Features

Deriving Fourier-based time-frequency features has been a favourable approach for both signal analysis and pre-processing for non-stationary and noisy vibration signals, due to the low computational complexity and definite physical meaning [55,56]. However, extracted Fourier features rely on the instance of a complete oscillation period to define local frequency values, requiring the algorithm parameters to be optimised for the transform process to yield highly accurate high-resolution spectral features [57].

The power spectral density representation of the signal, $X (m, f)$, with parameters as defined in Table 1, was calculated as follows. Each vibration measurement, with a duration of 0.1 s, was divided into a combination of overlapping sampling windows. The sampling windows, with a length of 256 samples, were applied for optimal time resolution. The objective was to have a small enough signal slice, which can be assumed to be stationary for the duration of the sampling window. The sampling windows were overlapping each other by a factor of 99%, to observe the incremental changes in frequency for the duration of the vibration signal. A 1024-point Fast Fourier transform (FFT) was calculated for each overlapping segment, in vector form, and added to a matrix containing the magnitude and phase values of each point in both time and frequency. The resolution of these frequency bins is determined by the chosen sampling frequency and number of spectral points specified. Setting the sampling frequency of the Fourier spectral transformation to equal the sampling rate of the accelerometer data, 25.6 kHz, and choosing the number of spectral points to satisfy Nyquist (1024 points), yields a vector with 512 values. Each of the 512 vector values is spaced at 25 Hz intervals and represents the magnitude and phase values from 0 Hz to Nyquist frequency of 12.8 kHz. The absolute values of these spectral points are calculated to provide the magnitude of each spectral frequency in time.

F (m, f) = \sum_{n = 1}^{N} x (n) g (n - m R) e^{- j 2 π f n}

(1)

where $F (m, f)$ is the sliding FFT of windowed data centred about time $m R$, $g (n)$ is a window function, and $R$ is the hop size between successive FFTs.

Calculating the squared magnitude of the Fourier spectral points acquired in Equation (1) produces the power spectral density (PSD), see Equation (2). This value represents the spectral power of each frequency band from 0 Hz to Nyquist frequency.

X (m, f) = {| F (m, f) |}^{2}

(2)
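As a rough illustration of Equations (1) and (2), the following sketch computes the PSD matrix for a single 0.1 s measurement using the parameters stated above (256-sample windows, 99% overlap, 1024-point FFT). The Hann window, the function name `stft_psd`, and the choice to keep bins 0 to 511 are assumptions made for this example; the window type is not specified here.

```python
import numpy as np

FS = 25_600                    # sampling rate in Hz (from the text)
WIN = 256                      # window length in samples (from the text)
NFFT = 1024                    # FFT length (from the text)
HOP = WIN - int(0.99 * WIN)    # 99% overlap gives a hop R of 3 samples


def stft_psd(x, win=WIN, hop=HOP, nfft=NFFT):
    """Return X(m, f) = |F(m, f)|^2 with one row per window position m."""
    g = np.hanning(win)        # assumed window function g(n)
    n_frames = 1 + (len(x) - win) // hop
    # Index matrix: row m selects samples m*hop ... m*hop + win - 1.
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * g
    # Zero-padded FFT; keep 512 bins spaced FS / NFFT = 25 Hz apart.
    spectrum = np.fft.rfft(frames, n=nfft, axis=1)[:, : nfft // 2]
    return np.abs(spectrum) ** 2


# Synthetic 1 kHz tone standing in for one 2560-sample measurement.
t = np.arange(2560) / FS
psd = stft_psd(np.sin(2 * np.pi * 1000 * t))
```

With these settings the tone's energy concentrates around the bin at 1000 Hz / 25 Hz = 40, which is a quick sanity check on the bin spacing.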

2.2.2. Hilbert Marginal Spectral Features

The second feature extraction technique proposed is the Hilbert marginal spectrum (HMS), calculated using the Hilbert-Huang Transform (HHT). The HHT comprises Empirical Mode Decomposition (EMD) followed by the application of the Hilbert Transform. This combination, proposed by Huang et al., is a specialised, adaptive and highly effective method for analysing non-linear and non-stationary data [54]. It has been claimed that performing EMD is computationally intensive [58]. However, studies have shown that the time complexity of EMD is equivalent to that of the traditional Fourier Transform [59].

In recent years, EMD has been widely used for bearing diagnostic and prognostic applications [38,39,40]. Unlike the previously discussed STFT feature extraction method, EMD is an adaptive, data-driven time-frequency approach which uses only the local characteristic time scales of the signal to decompose the non-stationary signal into a set of Intrinsic Mode Functions (IMFs). Hence, EMD offers the advantage of purely data-driven feature extraction, without relying on complete oscillation periods of sine and cosine functions to calculate spectral energy activity. Since the measured accelerometer vibration signals are non-stationary, the HHT is used to calculate the marginal spectrum as the predicting features for the ML RUL estimation model. The non-stationary vibration signals can be regarded as a combination of several IMFs. Given a vibration signal $x (t)$, EMD is first used to decompose the signal into $n$ IMFs to represent the mean trend of the signal [57].

The initial EMD process was performed as follows:

1.: Construct the upper signal envelope by fitting a cubic spline through all local maxima of the signal $x (t)$ .
2.: Construct the lower signal envelope by fitting a cubic spline through all local minima of the signal $x (t)$ .
3.: Calculate signal $h_{i} (t)$ through subtracting the mean value of the upper and lower signal envelopes from the original signal $x (t)$ .
4.: Check if $h_{i} (t)$ is an IMF by determining its standard deviation. The criterion for $h_{i} (t)$ to be classified as an IMF is that its standard deviation should be greater than 0.2. If true, $h_{i} (t)$ is labelled as the IMF $c_{i} (t)$ and is therefore a component in the EMD process. If the standard deviation of $h_{i} (t)$ is less than or equal to 0.2, $h_{i} (t)$ is not recorded as an IMF, so steps 1 to 4, which calculate the signal envelopes using all local maxima and minima, are repeated on $h_{i} (t)$ for up to two additional attempts in order to reach the criterion if possible. This process is referred to as the sifting process, and is necessary to separate the finest local mode from the vibration signal based only on the characteristic time scale by eliminating riding waves and smoothing uneven amplitudes [54]. Each sifting iteration $k$ subtracts the envelope mean $m_{i (k - 1)} (t)$ from the previous candidate, as in Equation (3).

$h_{i k} (t) = h_{i (k - 1)} (t) - m_{i (k - 1)} (t)$

(3)
5.: Once an IMF has been identified, $c_{i} (t)$ is subtracted from the current residue $r_{i - 1} (t)$ to produce signal $r_{i} (t)$ , i.e.,

$r_{i} (t) = r_{i - 1} (t) - c_{i} (t)$

(4)

with $r_{0} (t) = x (t)$ .
6.: IMF signals $c_{i} (t)$ are extracted by a process of repeating steps one to five, replacing the original signal $x (t)$ with the extracted residue signal $r_{i} (t)$ for each iteration.

The EMD procedure is repeated until no more IMFs can be extracted from the signal residue $r_{i} (t)$, or until a preset threshold number of IMFs have been extracted. This ensures that the extracted IMF components retain sufficient physical attributes including amplitudes and frequency modulations required to effectively perform the succeeding Hilbert Transform steps [54]. For the proposed bearing RUL estimation framework, the threshold number of IMFs to be extracted from each signal sample $x (t)$ was set to seven. The second stage of the HMS feature extraction process involves the application of the Hilbert Transform to each IMF $c_{i} (t)$, described as

H [c_{i} (t)] = \frac{1}{π} \int_{- \infty}^{+ \infty} \frac{c_{i} (τ)}{t - τ} d τ

(5)

When applying the Hilbert Transform in Equation (5), the analytic form of the IMF can be expressed as

C_{i}^{A} (t) = c_{i} (t) + j c_{i}^{H} (t) = a_{i} (t) e^{j θ_{i} (t)}

(6)

where

c_{i}^{H} (t) = \frac{1}{π} \int \frac{c_{i} (s)}{t - s} d s

(7)

a_{i} (t) = \sqrt{c_{i}^{2} + {(c_{i}^{H})}^{2}}

(8)

θ_{i} (t) = \tan^{- 1} \left( \frac{c_{i}^{H}}{c_{i}} \right)

(9)

where $C_{i}^{A}$ is the analytic signal and $c_{i}^{H}$ is the Hilbert transform of the IMF $c_{i}$. $a_{i} (t)$ is the instantaneous amplitude and $θ_{i} (t)$ is the instantaneous phase.

The signal’s instantaneous frequency is determined as

ω = \frac{d θ (t)}{d t}

(10)
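Equations (5) to (10) can be illustrated numerically with a short sketch. It assumes the analytic signal is obtained with the common FFT-based method (zeroing negative frequencies), which is one standard way to evaluate the Hilbert Transform on sampled data and is not necessarily the implementation used in this work. The function names are hypothetical, and the instantaneous frequency is returned in Hz, i.e., $ω / 2 π$.

```python
import numpy as np

FS = 25_600  # sampling rate in Hz (from the text)


def analytic_signal(c):
    """Return c(t) + j * H[c(t)], Equation (6), via the FFT method."""
    n = len(c)
    spectrum = np.fft.fft(c)
    gain = np.zeros(n)
    gain[0] = 1.0                      # keep the DC term
    if n % 2 == 0:
        gain[n // 2] = 1.0             # keep the Nyquist term
        gain[1 : n // 2] = 2.0         # double positive frequencies
    else:
        gain[1 : (n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * gain)  # negative frequencies are zeroed


def instantaneous_attributes(c, fs=FS):
    """Amplitude (Eq. 8), phase (Eq. 9) and frequency in Hz (Eq. 10)."""
    z = analytic_signal(c)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    # Finite-difference approximation of d(theta)/dt, converted to Hz.
    frequency_hz = np.diff(phase) * fs / (2 * np.pi)
    return amplitude, phase, frequency_hz


# A 1 kHz cosine standing in for a single IMF c_i(t).
t = np.arange(2560) / FS
a, theta, f = instantaneous_attributes(np.cos(2 * np.pi * 1000 * t))
```

For this stationary test tone the amplitude stays at 1 and the frequency at 1000 Hz; for a real IMF both vary with time, which is what the Hilbert spectrum captures.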

And the constructed Hilbert spectrum is

H (ω, t) = \sum_{i = 1}^{n} a_{i} (t) e^{j θ_{i} (t)}

(11)

where n denotes the total number of IMFs extracted during the previous EMD stage (seven in this experiment).

Integrating the Hilbert spectrum with respect to time t, we get the Hilbert marginal spectrum (HMS)

H (ω) = \frac{1}{T} \int_{0}^{T} H (ω, t) d t

(12)

where $T$ is the length of the sampled vibration signal $x (t)$.

The HMS $H (ω)$ is a vector showing the total energy contributed by each spectral point [54,60]. These 512 spectral points are spaced at 25 Hz intervals and represent the spectral amplitude content over the frequency range of 0 to 12,800 Hz (Nyquist frequency). The HMS was extracted from each vibration measurement signal, creating a matrix of spectral energy values in the time-frequency domain, which is denoted as $H (m, f)$.
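One possible discretisation of Equation (12) is sketched below: for every time sample of every IMF, the instantaneous amplitude is accumulated into the 25 Hz bin that contains the instantaneous frequency, and the result is divided by the number of samples. This binning scheme and the name `marginal_spectrum` are assumptions for illustration; the exact discretisation used is not described here.

```python
import numpy as np


def marginal_spectrum(amps, freqs_hz, fs=25_600, n_bins=512):
    """Discrete H(w): time-averaged amplitude per frequency bin.

    amps, freqs_hz: arrays of shape (n_imfs, n_samples) holding the
    instantaneous amplitude and frequency (in Hz) of each IMF.
    """
    amps = np.asarray(amps, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    bin_width = (fs / 2) / n_bins               # 25 Hz for the stated setup
    k = np.floor(freqs_hz / bin_width).astype(int)
    valid = (k >= 0) & (k < n_bins)             # drop out-of-range estimates
    hms = np.zeros(n_bins)
    np.add.at(hms, k[valid], amps[valid])       # sum over IMFs and time
    return hms / amps.shape[1]                  # the 1/T normalisation


# One IMF of unit amplitude whose frequency stays at 1010 Hz.
hms = marginal_spectrum(np.ones((1, 100)), np.full((1, 100), 1010.0))
```

Stacking one such 512-point vector per measurement gives a matrix with the same layout as $H (m, f)$.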

2.3. Feature Compression

Dimensionality reduction reduces the complexity of the computations, enabling the development and application of more effective and precise classification or regression methods. Feature compression to reduce the dimensionality helps to avoid the well-known phenomenon often defined as the curse of dimensionality [61,62]. This term describes the inherent problem caused by the exponential increase in volume associated with adding extra dimensions to the Euclidean space [62].

The STFT and HMS extracted features were used for experimentation with the RUL estimation framework, and feature compression was applied to both spectra. Both the STFT and HMS spectra were divided into (1) linear bands and (2) non-linear one-third octave bands. In each case, 25 features were extracted. This was accomplished using the one-third octave band scale up to 25.6 kHz, which produces 25 bands when mapped onto the STFT bins. Accordingly, the number of linear bands was also fixed at 25 so that the dimensionality of the two feature spaces is comparable. Dimensionality is reduced by matrix multiplication between the 512-point time-frequency feature vectors and either the linear or the non-linear filter bank, each composed of strategically placed ones and zeros, as defined in Equation (13). The linear filter bank consists of 25 linearly spaced frequency bands, whereas the non-linear one-third octave band scale places higher priority on the lower frequencies, which give a better indication of the health condition of the bearing.

Y (m, k) = S (m, f) \times B (f, k)

(13)

where the feature vectors $Y (m, k)$ are the result of a matrix multiplication of $S (m, f)$ and $B (f, k)$. $S (m, f)$ is either $X (m, f)$ or $H (m, f)$, depending on the feature set employed, and is an $m \times 512$ matrix. Here $m$ denotes the number of signal measurements for each bearing signal and therefore varies, as the signals from the seven naturally degraded bearings examined in this study differ in duration. $B (f, k)$ is either a linear or one-third octave band filter $512 \times 25$ matrix composed of ones and zeros placed according to the desired filter parameters. Each column of $B (f, k)$ represents one octave or linear scaled band: ones are placed in the rows whose frequencies lie within the corresponding band limits, and zeros in all remaining rows, whose frequencies lie outside the band. The feature vector $Y (m, k)$ is labelled as $L (m, k)$ when linear filtering is used and as $O (m, k)$ when one-third octave band filtering is used in Section 4.
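The compression in Equation (13) can be sketched as follows. The band layout is an assumption: the one-third octave centres are taken as $1000 \cdot 2^{n / 3}$ Hz for $n = -13, \ldots, 11$ (nominally 50 Hz to 12.5 kHz, which happens to give 25 bands), with edges at $2^{\pm 1 / 6}$ times the centre. The exact band limits used in this work are not given here, and the function names are hypothetical.

```python
import numpy as np

N_BINS, N_BANDS, DF = 512, 25, 25.0
FREQS = DF * np.arange(N_BINS)          # bin frequencies: 0, 25, ..., 12775 Hz


def linear_bank():
    """512 x 25 matrix assigning each bin to one of 25 equal-width bands."""
    B = np.zeros((N_BINS, N_BANDS))
    band = np.arange(N_BINS) * N_BANDS // N_BINS
    B[np.arange(N_BINS), band] = 1.0
    return B


def third_octave_bank(n_first=-13):
    """512 x 25 matrix of ones inside each assumed one-third octave band."""
    B = np.zeros((N_BINS, N_BANDS))
    for k in range(N_BANDS):
        fc = 1000.0 * 2.0 ** ((n_first + k) / 3)       # assumed band centre
        lo, hi = fc * 2.0 ** (-1 / 6), fc * 2.0 ** (1 / 6)
        B[(FREQS >= lo) & (FREQS < hi), k] = 1.0
    return B


# S stands in for X(m, f) or H(m, f): here m = 3 measurements of flat spectra.
S = np.ones((3, N_BINS))
L = S @ linear_bank()          # linear-band features, shape (3, 25)
O = S @ third_octave_bank()    # one-third octave features, shape (3, 25)
```

Under this assumed layout, the narrowest low-frequency bands can be narrower than the 25 Hz bin spacing and may capture no bin at all, so a practical implementation would need to decide how to treat such bands.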

2.4. Wear State Classes

The proposed bearing RUL estimation framework classifies unseen target instances from the accelerometer sensors into the closest resembling match of five wear states. Analysing the bearing degradation signals in the time domain (Figure 2) shows little or no variation in energy over the first half of the signal, whereas the latter stages exhibit an almost exponential increase in vibration amplitude. The proposed framework places a higher emphasis on the latter stages by assigning non-linear temporal class boundaries between the five wear state classes. This has been achieved using one minus the negative exponential function, as shown in Equation (14). The proposed bearing lifetime model is non-linear, where class one (healthy) represents the entire first 63.2% of the bearing's lifetime. The second class consists of the portion of signal between 63.2% and 86.5% of the bearing's lifetime. The third class consists of the portion of signal between 86.5% and 95% of the bearing's lifetime. The fourth class consists of the portion of signal between 95% and 98.2% of the bearing's lifetime, and the fifth and final class consists of the final 1.8% of the bearing's lifetime.

β_{i} = 1 - e^{- i}

(14)

where $β_{i}$ defines the non-linear temporal class boundaries and the index $i = \{1, 2, 3, 4\}$ corresponds to the class numbers. Note that $β_{5} = 1$, as the boundaries are normalised with respect to time.
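The class boundaries of Equation (14) and the resulting labelling can be checked with a few lines of code. The function name `wear_state` and the convention that a sample lying exactly on a boundary falls into the later class are assumptions for this sketch.

```python
import math

# beta_i = 1 - exp(-i) for i = 1..4; beta_5 = 1 marks the end of life.
BOUNDARIES = [1 - math.exp(-i) for i in range(1, 5)]


def wear_state(t, lifetime):
    """Return the wear state class (1..5) for a measurement taken at time t."""
    fraction = t / lifetime                      # normalised lifetime in [0, 1]
    return 1 + sum(fraction >= b for b in BOUNDARIES)


# Boundaries rounded to three decimals: 0.632, 0.865, 0.95, 0.982
rounded = [round(b, 3) for b in BOUNDARIES]
```

For a bearing that fails after 100 time units, `wear_state(70, 100)` falls in class 2 and `wear_state(99, 100)` in class 5, matching the percentages quoted above.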

2.5. ML Methods

RUL estimations can be generated using supervised ML algorithms capable of classifying the wear state of the bearing and recognising trends and patterns in vast datasets. Support vector machines (SVM) and k-Nearest Neighbour (k-NN), two of the most popular classification algorithms, were employed in this study.

2.5.1. Support Vector Machines

The support vector machine (SVM) is an elegant and powerful supervised ML method, which combines statistical and predictive models to perform regression or binary and multi-class classification [63,64]. Predictor instances from pre-labelled historical datasets are analysed to calculate an optimal separating criterion between the different classes. Classification of unseen target instances is performed by locating each instance with respect to the separating hyperplanes to identify the wear state to which the signal is assigned. Calculating the optimal separating hyperplane can be performed on both linearly and non-linearly separable patterns and data structures, making it a fundamental, robust pattern recognition tool for diagnostic and prognostic applications. The pre-processed training data, obtained using the feature extraction, feature compression and class labelling methods presented in this work, are assigned as predictor features to train the SVM classification algorithm. Predictor features are mapped to a higher-dimensional feature space through a kernelling process, achieved by applying a suitable kernel function. Kernelling allows trends and patterns to be identified and distinguished to improve the optimal separating hyperplane placement between classes. The choice of kernel function is dependent on the characteristics of the predictor feature set; therefore, it is best to test multiple kernel functions in order to choose the one that suits the data type best and provides optimal classification performance. In this study, the following six kernel functions are tested to determine the optimal option: Linear, Quadratic, Cubic, Fine Gaussian, Medium Gaussian and Coarse Gaussian functions. The training data points that are closest to the separating hyperplanes in the higher dimensional feature space make up the small subset of data points called support vectors. These data points have a direct bearing on the calculation of the optimal hyperplane function, making them critical elements within the training set. Optimal hyperplanes are calculated to maximise the distance between the nearest support vectors of each temporal class. The margin is the distance between the support vector and the hyperplane. Equation (15) shows the hyperplane function, which is used on the unseen target instances to achieve the classification outcome.

w^{T} y + b = 0

(15)

where $w$ represents the weight vector, $y$ is the input vector and $b$ is the bias. The sign of $w^{T} y + b$ determines whether the data point is an instance of the class above or below the hyperplane.
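For the linear, two-class case, the decision implied by Equation (15) reduces to checking the sign of $w^{T} y + b$. The sketch below shows only that step; the weights and bias are made-up values, and the function names are hypothetical. Kernel SVMs replace the inner product with a kernel evaluation over the support vectors, and multi-class problems such as the five wear states combine several binary decisions, neither of which is shown here.

```python
def decision_value(w, y, b):
    """Evaluate w^T y + b for one input vector y."""
    return sum(wi * yi for wi, yi in zip(w, y)) + b


def side_of_hyperplane(w, y, b):
    """Return +1 for points on or above the hyperplane, -1 for points below."""
    return 1 if decision_value(w, y, b) >= 0 else -1


# Illustrative values only; real weights come from training.
w, b = [1.0, -1.0], -0.5
above = side_of_hyperplane(w, [2.0, 0.0], b)   # 2.0 - 0.0 - 0.5 = 1.5
below = side_of_hyperplane(w, [0.0, 2.0], b)   # 0.0 - 2.0 - 0.5 = -2.5
```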

SVM classification models have been shown to perform very effectively in high-dimensional feature spaces, achieving high levels of classification accuracy even in cases where the number of dimensions exceeds the total number of predictor samples [40,63].

2.5.2. Weighted k-Nearest Neighbour

One of the most robust and versatile classification algorithms used for machine learning (ML) and pattern recognition systems is the k-Nearest Neighbour (k-NN) method [65,66]. Bearing RUL estimation is achieved by wear state classification of the target instance, which is performed by determining the most prevalent class among its k nearest neighbours in the historical database of pre-processed predictor instances. The closest neighbouring points are identified by choosing a distance metric and calculating the distance from the target instance to all predictors. A preset value of k controls the number of closest neighbours to compare with when identifying the most prevalent class for the target instance. Unlike the previously presented SVM method, which uses only a small subset of the training data to perform classification, the k-NN method calculates the distance to all training data points, which reduces efficiency and increases the computational complexity of the ML method. The distance metric used and the numerical value of k determine the performance of the classification framework. This work has performed experiments with the following six methods of k-NN classification: Fine, Medium, Coarse, Cosine, Cubic and Weighted k-NN, as summarised in Table 2.
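A minimal sketch of distance-weighted k-NN voting is given below. The Euclidean metric, the inverse squared distance weighting, and the default of k = 10 are assumptions chosen for illustration; the settings actually used for each variant are those summarised in Table 2.

```python
import math
from collections import defaultdict


def weighted_knn(train_X, train_y, target, k=10):
    """Classify target by a distance-weighted vote among its k nearest points."""
    # Distance from the target to every training point (the costly step).
    neighbours = sorted(
        (math.dist(x, target), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for distance, label in neighbours[:k]:
        # Assumed weighting: closer neighbours count more (1 / d^2).
        votes[label] += 1.0 / (distance * distance + 1e-12)
    return max(votes, key=votes.get)


# Toy 2-D features: two points of class 1 near the origin, three of class 2.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
y = [1, 1, 2, 2, 2]
predicted = weighted_knn(X, y, (0.2, 0.2), k=5)
```

In this toy case an unweighted majority vote over all five neighbours would pick class 2, whereas the weighted vote picks class 1 because the two nearby points dominate.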

3. Experimental Approach

3.1. Experimental Studies A and B

Two separate experimental procedures were carried out: study A and study B. Study A involved supervised ML classification incorporating the seven bearing test cases from the Pronostia Dataset under the first set of condition parameters [47]. ML classification models were trained and tested in a round robin approach [3,15]. It can be observed from the accelerometer data presented in Figure 2 that three of the seven bearings tested exhibit an abrupt fault signature as opposed to the gradual, evolving degradation trend expected in a run-to-failure bearing test. These abrupt failure modes are likely a result of the severe accelerated ageing regime employed to gather the bearing vibration signals from unaged to failure. Hence, in study B, bearings S.04, S.05 and S.06 were removed from the round robin testing framework to exclude the sudden fault modes from the overall data.

3.2. Round Robin Testing Framework

In both studies A and B, each round of classification for bearing RUL estimation involved using a subset of the Pronostia Dataset for training the ML algorithm, with the remainder of the dataset assigned as the testing set. In study A, of the seven bearing signals from the dataset, six were used to train the algorithm and the seventh was held out and used only for testing the classification algorithm. Once RUL estimations had been calculated for each of the held-out bearing's signal measurements, it was added back into the training pool and a different bearing signal was removed from the pool and assigned as the test case, as illustrated in Figure 4. The algorithm was retrained for each iteration to ensure that only out-of-sample signals were tested. This testing and validation framework was incorporated to obtain a more realistic indicator of how the algorithm performs than in-sample approaches, e.g., fold or percentage yield validation, and to prevent the occurrence of over-fitting. All model training data comprised signals from completely different bearings to the one used for each test case. This round robin training and testing framework gives a more accurate analysis of the bearing RUL estimation performance and tries to maximise the potential of the available data.

In addition, a simple moving averaging (MA) technique using a window size of 9 signal measurements was incorporated into the method as a post-processing step. The experiments explored MA window sizes of 3, 6, 9 and 12. The classification accuracy results from these experiments showed the window size of 9 to be closest to optimal. In real time, these nine signal sample measurements span a duration of 90 s, which is insignificant when compared to the years that a bearing is normally intended to last in an actual rotating machine.
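The round robin procedure and the moving average post-processing can be sketched as follows. This is an illustrative outline only: the helper names `round_robin_splits` and `smooth_predictions` are hypothetical, the bearing labels other than S.04, S.05 and S.06 are assumed to follow the same naming pattern, and the text does not specify how the MA is applied to class labels, so a trailing average rounded to the nearest class is assumed here.

```python
def round_robin_splits(bearing_ids):
    """Yield (training bearings, held-out test bearing) for each round."""
    for test_id in bearing_ids:
        # Every other bearing forms the training pool for this round
        yield [b for b in bearing_ids if b != test_id], test_id

def smooth_predictions(pred, window=9):
    """Trailing moving average of predicted class labels, rounded to the nearest class."""
    smoothed = []
    for i in range(len(pred)):
        # Use up to `window` most recent predictions (fewer at the start of the record)
        recent = pred[max(0, i - window + 1): i + 1]
        smoothed.append(int(sum(recent) / len(recent) + 0.5))
    return smoothed

# Study A uses all seven bearings; study B drops the three abrupt-failure cases
study_a = [f"S.{n:02d}" for n in range(1, 8)]  # label pattern assumed
study_b = [b for b in study_a if b not in ("S.04", "S.05", "S.06")]
splits_a = list(round_robin_splits(study_a))
splits_b = list(round_robin_splits(study_b))

# An isolated spurious class-2 prediction is removed by the 9-point window
noisy = [1] * 8 + [2] + [1] * 8
smoothed = smooth_predictions(noisy)
```

Each split would be used to retrain a classifier from scratch, so that the held-out bearing never contributes to the model that is tested on it.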

3.3. Experimental Framework

The experimental framework for bearing RUL estimation involved varying the following parameters, described in detail previously:

(i) Feature extraction from the discrete-time vibration signals was performed using the following time-frequency analysis techniques: STFT, yielding the signal X(m, f), and HMS, producing H(m, f). Both of the examined methods perform a time-frequency transformation to achieve a vector representation of 512 spectral amplitudes ranging from 0 to 12,800 Hz (Nyquist frequency) in 25 Hz intervals.

(ii) Feature compression of the extracted feature signals was achieved using linear filter bands and one-third octave filter bands. The extracted feature signal X(m, f) was compressed into L_{X}(m, k) or O_{X}(m, k) with the linear and one-third octave band filters, respectively. The extracted H(m, f) signal was compressed to produce L_{H}(m, k) or O_{H}(m, k) with the linear and one-third octave band filters, respectively. Both feature sets were compressed from 512 down to 25 spectral features using matrix multiplication.

(iii) Wear state classification model: five non-linear temporal class boundaries were selected.

(iv) ML model training and testing used two supervised low complexity ML algorithms: SVM and k-NN. Algorithm optimisation was performed by varying the hyper-parameters, namely the kernel function for the SVM, and the value of k and the distance metric for the k-NN classifiers. Six variants of each classifier type were tested: linear, quadratic, cubic, fine Gaussian, medium Gaussian and coarse Gaussian kernel functions for the SVM, and fine, medium, coarse, cosine, cubic and weighted configurations for the k-NN.
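The compression in step (ii), from 512 spectral amplitudes to 25 features by matrix multiplication, can be illustrated with a simple filter bank. The sketch below is built on assumptions that are not taken from the text: base-2 one-third octave bands referenced to 1 kHz, band indices from -13 to 11 (centre frequencies of roughly 50 Hz to 12.5 kHz), rectangular averaging weights, and a nearest-bin fallback for low bands that contain no 25 Hz bin. The names `third_octave_bank` and `compress` are hypothetical.

```python
def third_octave_bank(n_bins=512, df=25.0, n_bands=25, first_index=-13):
    """Build a 25 x 512 weight matrix mapping spectral bins to one-third octave bands."""
    freqs = [df * i for i in range(n_bins)]  # 0, 25, ..., 12,775 Hz
    bank = []
    for k in range(n_bands):
        # Assumed band definition: centre 1000 * 2^(n/3) Hz, edges at +/- one-sixth octave
        fc = 1000.0 * 2.0 ** ((first_index + k) / 3.0)
        lo, hi = fc * 2.0 ** (-1.0 / 6.0), fc * 2.0 ** (1.0 / 6.0)
        members = [i for i, f in enumerate(freqs) if lo <= f < hi]
        if not members:
            # Assumed fallback: narrow low-frequency bands take the nearest bin
            members = [min(range(n_bins), key=lambda i: abs(freqs[i] - fc))]
        row = [0.0] * n_bins
        for i in members:
            row[i] = 1.0 / len(members)  # average the amplitudes inside the band
        bank.append(row)
    return bank

def compress(spectrum, bank):
    """Matrix-vector product: one compressed feature per band."""
    return [sum(w * a for w, a in zip(row, spectrum)) for row in bank]

bank = third_octave_bank()
flat = [1.0] * 512            # a flat test spectrum
features = compress(flat, bank)
```

Because the band edges grow geometrically, the low bands cover one or two bins while the highest bands average several dozen, which is what makes the frequency mapping non-linear. A linear filter bank would instead assign a roughly equal number of bins to each of the 25 bands.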

4. Results

The outcomes of the suggested ML framework for bearing RUL classification are presented in this section. The Jaccard Index was used to analyse the wear state classification accuracy [67], Equation (16). The ratio was multiplied by 100 in order to display the data as percentages in the following result tables.

J(z, \hat{z}) = \frac{|z \cap \hat{z}|}{|z \cup \hat{z}|}    (16)

where z represents the true class of a time-sample and \hat{z} represents the class prediction from the ML algorithm.
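As a minimal sketch of Equation (16), the snippet below computes the index for a short sequence of labels. It assumes one literal reading of the set notation, in which z and \hat{z} are treated as sets of (time-sample index, class) pairs. The function name `jaccard_percent` and the toy labels are illustrative and are not taken from the experiments.

```python
def jaccard_percent(z_true, z_pred):
    """Jaccard index of Equation (16), scaled to a percentage."""
    # Assumed reading: each label sequence becomes a set of (time index, class) pairs
    true_pairs = set(enumerate(z_true))
    pred_pairs = set(enumerate(z_pred))
    # Intersection: samples classified correctly; union: all distinct pairs
    return 100.0 * len(true_pairs & pred_pairs) / len(true_pairs | pred_pairs)

z_true = [1, 1, 2, 3, 5]   # illustrative true wear state classes
z_pred = [1, 2, 2, 3, 4]   # illustrative predicted classes
score = jaccard_percent(z_true, z_pred)
```

Under a different reading, in which each time-sample is scored separately and the scores are averaged, the index for a single sample is either 1 or 0 and the measure reduces to the fraction of correctly classified samples.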

4.1. Study A

The classification accuracy results achieved using the STFT derived features in experimental study A are presented in Table 3. The lowest performance for the SVM algorithm approach was 51.6%, using the linear band features with the cubic SVM kernel function. The highest classification performance of 74.1% was achieved by the one-third octave band feature set, using the SVM medium Gaussian kernel function. This was also the highest accuracy score achieved overall for study A. With regard to the k-NN experiments, the linear frequency band feature set combined with the Fine k-NN yielded the lowest STFT performance of 61.3%. The highest classification accuracy of 73.0% was achieved using the one-third octave band features with the coarse k-NN classifier.

The classification results for the HMS derived features in experimental study A are presented in Table 4. For the SVM classification framework, the lowest performance recorded was 3.4% for the 512 HMS derived feature set using the quadratic SVM kernel function, whereas the highest classification performance of 69.5% was achieved using the linear filter band feature set with the fine Gaussian SVM kernel function. The lowest k-NN classification performance was 44.0%, for the 512 HMS feature set with the cosine distance metric k-NN classifier. The highest accuracy from the HMS derived features in study A was 70.8%, using the one-third octave band features with the medium k-NN classification algorithm.

4.2. Study B

The classification accuracy results achieved using the STFT features derived from the discrete-time signals in experimental study B are presented in Table 5. In the case of the SVM classification approach, the linear filter band feature set with a cubic SVM kernel function yielded the lowest accuracy, at 39.4%. The highest classification performance achieved was 82.8%, using the one-third octave band feature set with the medium Gaussian SVM kernel function. This was the best performance achieved for study B, and the best overall score across both studies A and B. For the k-NN classification method, the linear band feature set using the fine k-NN classifier produced the lowest classification performance of 66.2%. The weighted k-NN method using one-third octave band features achieved the greatest k-NN classification accuracy of 81.0%.

The experimental results using the HMS derived features in study B are presented in Table 6. In the case of the SVM classification framework, the lowest performance was recorded at 2.8% accuracy for the 512 HMS features using the quadratic SVM kernel function. The highest classification performance, on the other hand, was 74.3%, for the HMS linear filter band features using a fine Gaussian SVM kernel function. For the k-NN classification results, the lowest accuracy was recorded at 32.0% for the 512 unfiltered HMS features with a k-NN classifier using a cosine distance metric. The highest classification accuracy was 75.9%, for the one-third octave band features with a medium k-NN classifier.

5. Discussion

This section interprets the results and discusses the key implications of the findings, as well as highlighting limitations and proposing recommendations for advancing this area and improving future RUL prediction methods.

The one-third octave band feature compression achieved the highest overall results for both STFT and HMS feature extraction methods across both studies A and B. The 512 feature point uncompressed spectra suffer as a result of the high dimensional feature space, which effectively gives rise to Bellman’s “curse of dimensionality” [61]. This phrase describes the underlying issue brought on by the exponential increase in volume that results from including additional dimensions in the Euclidean space [62]. Reducing the dimensionality of the extracted time-frequency spectral feature vectors from 512 features down to 25 features, using linearly scaled bands through matrix multiplication, had undesirable consequences for many of the RUL estimation classification results, for example with the Fourier-based features (see Table 3 and Table 5). However, this does not hold true for the HMS features, where the linear reduction often improved the performance. Accordingly, this article proposes a more unified feature compression approach, which applies one-third octave scale filter banks to map and compress the feature space into 25 salient features. The success of this non-linear feature extraction approach is attributed to its retaining more diagnostically characteristic information for bearing wear-state estimation.

The ML recipes with the highest performance for both study A and study B were Fourier-based, although the HMS features in study A were close at 70.8%, compared to 74.1%. However, in the case of study B, the performance gap was considerably wider, at 82.8% compared to 75.9%. This could be attributed to the Fourier-based features being better suited to characterising the stationary components of the bearing fault frequencies and their higher order harmonics as the bearings degrade. It is known that the extracted IMF features are not strictly orthogonal [54], which can give rise to mode mixing [57,68]. Consequently, this mode mixing might offer an explanation as to why the extracted IMF features suffer from limitations in their current form here, in terms of their ability to characterise and describe degradation across time for ML algorithms. Moreover, this offers scope for experimental avenues and further work employing related feature engineering techniques, such as Variational Mode Decomposition (VMD) [68] and Ensemble Empirical Mode Decomposition (EEMD) [57]. These alternative approaches try to preserve the orthogonality condition within the feature space. This is potentially advantageous from the viewpoints of feature engineering and ML based RUL estimation.

As described in Section 3, in terms of the rationale for study B, the results have shown that abrupt failure modes are extremely problematic for CbM systems [7]. Such a high incidence of abrupt failure is unlikely to occur in practice. The authors of this work hypothesise that the high occurrence of abrupt failure cases, recorded at 42.9% of the dataset for condition 1 (three of the seven bearings), is likely a result of the severe accelerated ageing regime employed. In keeping with this, study B eliminated the bearing specimens that exhibit abrupt failure modes from the testing framework. From the result tables presented in Section 4, a large classification improvement can generally be observed for ML recipes employing both the STFT and HMS features when concentrating on the bearing test cases exhibiting a more gradual degradation trend, which is to be expected. Prior work has commented on the challenges of performing CbM on abrupt failure cases [7].

Figure 5 and Figure 6 illustrate the confusion matrices of the best-performing feature subsets, which were highlighted in bold in each of the tables in Section 4 for study A and B, respectively. At the wear state class level, these confusion matrices enable the classification results to be further examined. A visual representation of where exactly misclassification occurred indicates which classes are most challenging to classify correctly and the degree to which the state was misclassified. From Figure 5, it can be observed that all of the ML recipes and feature sets in study A perform well on wear state class 1, with accuracies ranging from 84.4% to 97.9%, in comparison to class 5, where they range from 23.3% to 64.0%. The ML recipe with the highest overall accuracy in study A, 74.1%, was O_{X}(m, k) with SVM (Medium), shown in matrix (a) of Figure 5. This particular ML recipe also achieved the highest classification accuracy for classes 2, 3 and 5. Similarly, for study B, Figure 6 shows strong classification performance for the first wear state class, ranging from 89.5% to 96.5%. It can be observed that the vast majority of misclassified samples are assigned to a neighbouring class. The best performing ML recipe in study B, with a score of 82.8%, was O_{X}(m, k) with SVM (Medium), shown in matrix (a) of Figure 6. This particular ML recipe also achieved the highest classification accuracy for wear state classes 3 and 5.
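The class-level quantities read from the confusion matrices above, namely the per-class accuracy and the tendency of errors to fall into a neighbouring class, can be computed as in the sketch below. The helper names and the toy label sequences are hypothetical and serve only to illustrate the calculation for five wear state classes labelled 1 to 5.

```python
def confusion_matrix(z_true, z_pred, n_classes=5):
    """Rows are true wear state classes, columns are predicted classes (labels 1..n)."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(z_true, z_pred):
        cm[t - 1][p - 1] += 1
    return cm

def per_class_accuracy(cm):
    """Percentage of each true class predicted correctly (every class must occur at least once)."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(cm)]

def neighbouring_error_share(cm):
    """Share of all misclassified samples that were assigned to an adjacent class."""
    cells = [(i, j, v) for i, row in enumerate(cm) for j, v in enumerate(row)]
    errors = sum(v for i, j, v in cells if i != j)
    adjacent = sum(v for i, j, v in cells if abs(i - j) == 1)
    return adjacent / errors if errors else 0.0

# Toy labels: three errors, two of which land in a neighbouring class
z_true = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
z_pred = [1, 1, 2, 3, 3, 3, 4, 5, 5, 3]
cm = confusion_matrix(z_true, z_pred)
accuracy = per_class_accuracy(cm)
share = neighbouring_error_share(cm)
```

Because the wear state classes are ordered in time, an error into an adjacent class is less severe than one that skips several states, which is why the second quantity is a useful complement to the overall accuracy.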

This research has led to the introduction of a robust framework for industry developers to implement a sophisticated non-invasive predictive maintenance system, with strong classification performances of 74.1% and 82.8% for studies A and B, respectively. The supervised ML algorithms employed here, such as SVM and k-NN, are limited by their sole dependence on the quality of the training data to determine trends and form class boundaries in higher dimensional feature spaces. As seen when comparing the results of studies A and B, the major improvement in average classification accuracy highlights the importance of using only high-quality data for prognostic bearing CbM. In real systems, the onset of early failure typically takes years to arise. Hence, a clear trade-off exists: a large dataset is highly desirable, but acquiring one under ordinary operational conditions is unfeasible in practice, from the perspective of time and cost budgeting for the number of dedicated measurement test rig set-ups required. Thus, accelerated ageing is largely unavoidable in practice in order to facilitate the acquisition of datasets of sufficient size. However, the level of accelerated ageing observed here [47] is thought to be too severe, with minimum and maximum durations of 2.42 h and 7.78 h, see Figure 2. A clear motivation has emerged from this body of work towards designing experimental campaigns for datasets which scale the level of accelerated ageing down so that specimen failure occurs after several thousand hours. This would greatly increase the quality and realism of the ageing trends captured, e.g., fewer abrupt failure modes.

Such datasets would be extremely advantageous and offer more scope and opportunities for the suitable application of advanced deep learning methods, which extract high-level, complex abstractions as data representations through a hierarchical learning process. The architecture of deep learning networks, which typically involve deep neural networks with many hidden layers and complex connections, requires more computational resources and expertise to design and train effectively. Deep learning approaches require large amounts of labelled training data, due to the associated high-dimensional parameter space of deep networks, demanding vast quantities of diverse examples to capture the underlying patterns accurately [69]. On the other hand, supervised ML algorithms, such as SVM and k-NN, can perform well with smaller datasets, making them more accessible for certain applications where data are limited. Deep learning algorithms, especially when dealing with large-scale models, demand significant computational resources, including high-performance graphics processing units (GPUs) or even specialised hardware like tensor processing units (TPUs). In contrast, supervised ML algorithms are generally less resource-intensive, making them more accessible and practical for deployment on standard hardware [69] and embedded devices (e.g., system-on-chips (SoCs) and microcontrollers).

6. Conclusions

This paper has presented a robust and innovative approach to estimate the remaining useful life (RUL) of rotating machines based on novel non-linear features and wear state models. One-third octave band feature compression of the Short-time Fourier Transform (STFT) and Hilbert Marginal Spectrum (HMS) is incorporated to reduce the computational complexity of training a supervised learning classification framework by effectively reducing the dimensionality of the time-frequency spectrum. The results have achieved strong accuracy of up to 82.8% using a Medium Gaussian Support Vector Machine (SVM) classification algorithm. The efficacy of using classification algorithms such as Weighted k-Nearest Neighbour (k-NN) has also been shown for this RUL prognostic application. The one-third octave bands and the non-linear wear state classes have shown superior performance across different feature types, as reported in these studies. As the approach complexity is low, vibration signals from operating motors can be acquired in real time, non-invasively and non-destructively, using strategically mounted accelerometers. In practice, these acquired signals can be processed locally on electronic hardware (edge-AI) or transmitted and ported to a cloud database for large-scale deployments in rotating machine predictive maintenance. Our proposed low-complexity approach, along with the RUL estimation framework, is valuable for predictive maintenance by reducing costly machine downtime and unscheduled replacements/repairs of critical components, and by mitigating the risk of health and safety incidents. Future avenues for further investigation that build on this work could involve experimenting with the feature extraction methods and deep learning approaches. These could be coupled with further testing of their performance for prognostic health estimation (RUL) on large datasets containing high-fidelity real bearing vibration signal data. Ideally, these datasets should encompass various operating conditions, including different applied radial loads and rotating speeds.

Author Contributions

E.T.C.: Formal analysis, Methodology, Investigation, Writing—Original Draft and Writing—Review and Editing; V.I.M.: Supervision, Analysis, Results interpretation, Writing—Review and Editing and Visualisation; D.F.K.: Supervision, Formal analysis, Methodology, Analysis, Results interpretation, Writing—Review and Editing, Plotting and Visualisation. All authors have read and agreed to the published version of the manuscript.

Funding

The research conducted in this publication was funded by the Irish Research Council under award number GOIPG/2021/1744. All authors are with the Faculty of Engineering, South East Technological University, Ireland.

Data Availability Statement

Publicly available datasets were analysed in this study. These data can be found here: [https://ti.arc.nasa.gov/tech/dash/groups/pcoe/prognostic-data-repository/#bearing, accessed on 15 April 2022], also see [47].

Acknowledgments

The authors thank South East Technological University’s Dillon O’Reilly, Liam Dinkelmann and Cian Madigan for their insightful weekly advice and suggestions on technical issues.

Conflicts of Interest

The authors declare no conflict of interest. The South East Technological University have endorsed this manuscript to go forward for peer review and publication. The funders had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:

ADC	Analogue to digital converter
CbM	Condition-based monitoring
CNN	Convolutional neural networks
DAQ	Data acquisition
DBN	Deep belief networks
EA	Envelope analysis
EEMD	Ensemble empirical mode decomposition
EMD	Empirical mode decomposition
FFT	Fast Fourier transform
FMD	Feature mode decomposition
GPU	Graphics processing unit
HHT	Hilbert-Huang transform
HMS	Hilbert marginal spectrum
IMF	Intrinsic mode functions
ISO	International Organization for Standardization
k-NN	k-Nearest neighbour
MA	Moving average
ML	Machine learning
PSD	Power spectral density
RPM	Revolutions per minute
RUL	Remaining useful life
SoC	System-on-chip
STFT	Short-time Fourier transform
SVM	Support vector machine
TPU	Tensor processing unit
USB	Universal serial bus
VMD	Variational mode decomposition
WT	Wavelet transform

References

Chelmiah, E.T.; McLoone, V.I.; Kavanagh, D.F. Remaining Useful Life Estimation of Rotating Machines through Supervised Learning with Non-Linear Approaches. Appl. Sci. 2022, 12, 4136.
Berghout, T.; Mouss, L.H.; Bentrcia, T.; Benbouzid, M. A Semi-supervised Deep Transfer Learning Approach for Rolling-Element Bearing Remaining Useful Life Prediction. IEEE Trans. Energy Convers. 2021, 37, 1200–1210.
Chelmiah, E.T.; McLoone, V.I.; Kavanagh, D.F. Remaining Useful Life Estimation of Rotating Machines using Octave Spectral Features. In Proceedings of the IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, Singapore, 18–21 October 2020; pp. 3031–3036.
Benbouzid, M.E.H. A review of induction motors signature analysis as a medium for faults detection. IEEE Trans. Ind. Electron. 2000, 47, 984–993.
Zhang, P.; Du, Y.; Habetler, T.G.; Lu, B. A Survey of Condition Monitoring and Protection Methods for Medium-Voltage Induction Motors. IEEE Trans. Ind. Appl. 2011, 47, 34–46.
Gyftakis, K.N.; Cardoso, A.J.M. Reliable detection of very low severity level stator inter-turn faults in induction motors. In Proceedings of the IECON 2019-45th Annual Conference of the IEEE Industrial Electronics Society, Lisbon, Portugal, 14–17 October 2019; Volume 1, pp. 1290–1295.
Tavner, P. Review of condition monitoring of rotating electrical machines. IET Electr. Power Appl. 2008, 2, 215–247.
Nandi, S.; Toliyat, H.A.; Li, X. Condition monitoring and fault diagnosis of electrical motors—A review. IEEE Trans. Energy Convers. 2005, 20, 719–729.
Bellini, A.; Immovilli, F.; Rubini, R.; Tassoni, C. Diagnosis of Bearing Faults of Induction Machines by Vibration or Current Signals: A Critical Comparison. In Proceedings of the 2008 IEEE Industry Applications Society Annual Meeting, Edmonton, AB, Canada, 5–9 October 2008; pp. 1–8.
Morinigo-Sotelo, D.; Duque-Perez, O.; Garcia-Escudero, L.A.; Perez-Alonso, M. Bearing lubrication assessment using an statistical analysis of the stator current spectrum. In Proceedings of the XIX International Conference on Electrical Machines-ICEM 2010, Rome, Italy, 6–8 September 2010; pp. 1–6.
ISO 15243:2004; Rolling Bearings—Damage and Failure—Terms, Characteristics and Causes. ISO: Geneva, Switzerland, 2004.
Zhang, S.; Zhang, S.; Wang, B.; Habetler, T.G. Deep Learning Algorithms for Bearing Fault Diagnostics—A Comprehensive Review. IEEE Access 2020, 8, 29857–29881.
Porotsky, S.; Bluvband, Z. Remaining useful life estimation for systems with non-trendability behaviour. In Proceedings of the 2012 IEEE Conference on Prognostics and Health Management, Denver, CO, USA, 18–21 June 2012; pp. 1–6.
Benker, M.; Bliznyuk, A.; Zaeh, M.F. A Gaussian Process Based Method for Data-Efficient Remaining Useful Life Estimation. IEEE Access 2021, 9, 137470–137482.
Chelmiah, E.T.; McLoone, V.I.; Kavanagh, D.F. Wear State Estimation of Rolling Element Bearings using Support Vector Machines. In Proceedings of the 2020 15th IEEE International Conference on Signal Processing (ICSP), Beijing, China, 6–9 December 2020; Volume 1, pp. 306–311.
Scanlon, P.; Kavanagh, D.F.; Boland, F.M. Residual Life Prediction of Rotating Machines Using Acoustic Noise Signals. IEEE Trans. Instrum. Meas. 2013, 62, 95–108.
Kavanagh, D.F.; Scanlon, P.; Boland, F. Envelope analysis and data-driven approaches to acoustic feature extraction for predicting the remaining useful life of rotating machinery. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 31 March–4 April 2008; pp. 1621–1624.
Han, D.; Yu, J.; Gong, M.; Song, Y.; Tian, L. A remaining useful life prediction approach based on low-frequency current data for bearings in spacecraft. IEEE Sens. J. 2021, 21, 18978–18989.
Duque-Perez, O.; Pozo-Gallego, D.; Morinigo-Sotelo, D.; Fontes Godoy, W. Condition monitoring of bearing faults using the stator current and shrinkage methods. Energies 2019, 12, 3392.
Haddad, R.Z.; Lopez, C.A.; Pons-Llinares, J.; Antonino-Daviu, J.; Strangas, E.G. Outer race bearing fault detection in induction machines using stator current signals. In Proceedings of the 2015 IEEE 13th International Conference on Industrial Informatics (INDIN), Cambridge, UK, 22–24 July 2015; pp. 801–808.
Roldan, S.; Sanchez-Londono, D.; Barbieri, G. Thermographic Indicators for the State Assessment of Rolling Bearings. IFAC-PapersOnLine 2021, 54, 1218–1223.
Verstraete, D.; Droguett, E.; Modarres, M. A deep adversarial approach based on multi-sensor fusion for semi-supervised remaining useful life prognostics. Sensors 2020, 20, 176.
Singleton, R.K.; Strangas, E.G.; Aviyente, S. The use of bearing currents and vibrations in lifetime estimation of bearings. IEEE Trans. Ind. Informat. 2016, 13, 1301–1309.
Caesarendra, W.; Tjahjowidodo, T. A Review of Feature Extraction Methods in Vibration-Based Condition Monitoring and Its Application for Degradation Trend Estimation of Low-Speed Slew Bearing. Machines 2017, 5, 21.
Lv, M.; Zhang, C.; Guo, A.; Liu, F. A new performance degradation evaluation method integrating PCA, PSR and KELM. IEEE Access 2020, 9, 6188–6200.
Kong, X.; Yang, J. Remaining Useful Life Prediction of Rolling Bearings Based on RMS-MAVE and Dynamic Exponential Regression Model. IEEE Access 2019, 7, 169705–169714.
Jin, X.; Que, Z.; Sun, Y.; Guo, Y.; Qiao, W. A data-driven approach for bearing fault prognostics. IEEE Trans. Ind Appl. 2019, 55, 3394–3401.
Su, X.; Liu, H.; Tao, L.; Lu, C.; Suo, M. An end-to-end framework for remaining useful life prediction of rolling bearing based on feature pre-extraction mechanism and deep adaptive transformer model. Comput. Ind. Eng. 2021, 161, 107531.
Zhang, Y.; Tang, B.; Han, Y.; Deng, L. Bearing performance degradation assessment based on time-frequency code features and SOM network. Meas. Sci. Technol. 2017, 28, 045601.
Prudhom, A.; Antonino-Daviu, J.; Razik, H.; Climente-Alarcon, V. Time-frequency vibration analysis for the detection of motor damages caused by bearing currents. Mech. Syst. Signal Process. 2017, 84, 747–762.
Hou, B.; Wang, D.; Wang, Y.; Yan, T.; Peng, Z.; Tsui, K.L. Adaptive Weighted Signal Preprocessing Technique for Machine Health Monitoring. IEEE Trans. Instrum. Meas. 2020, 70, 1–11.
Yan, M.; Wang, X.; Wang, B.; Chang, M.; Muhammad, I. Bearing remaining useful life prediction using support vector machine and hybrid degradation tracking model. ISA Trans. 2020, 98, 471–482.
Zhang, X.; Zhao, J.; Kang, J.; Li, H.; Teng, H. Bearing prognostics with non-trendable behavior based on shock pulse method and frequency analysis. J. Vibroeng. 2014, 16, 3963–3976.
Wang, X.; Wanga, T.; Ming, A.; Zhang, W.; Li, A.; Chu, F. Cross-operating-condition Degradation Knowledge Learning for Remaining Useful Life Estimation of Bearings. IEEE Trans. Instrum. Meas. 2021, 70, 3520911.
Yoo, Y.; Baek, J.G. A novel image feature for the remaining useful lifetime prediction of bearings based on continuous wavelet transform and convolutional neural network. Appl. Sci. 2018, 8, 1102.
Miao, Y.; Zhang, B.; Li, C.; Lin, J.; Zhang, D. Feature mode decomposition: New decomposition theory for rotating machinery fault diagnosis. IEEE Trans. Ind. Electron. 2022, 70, 1949–1960.
Chelmiah, E.T.; Kavanagh, D.F. Hilbert Marginal Spectrum for Failure Mode Diagnosis of Rotating Machines. In Proceedings of the IECON 2021–47th Annual Conference of the IEEE Industrial Electronics Society, Toronto, ON, Canada, 13–16 October 2021; pp. 1–6.
Maurya, S.; Singh, V.; Verma, N.K. Condition Monitoring of Machines Using Fused Features from EMD-Based Local Energy with DNN. IEEE Sens. J. 2020, 20, 8316–8327.
Liu, X.; Song, P.; Yang, C.; Hao, C.; Peng, W. Prognostics and health management of bearings based on logarithmic linear recursive least-squares and recursive maximum likelihood estimation. IEEE Trans. Ind. Electron. 2017, 65, 1549–1558.
Soualhi, A.; Medjaher, K.; Zerhouni, N. Bearing health monitoring based on Hilbert–Huang transform, support vector machine, and regression. IEEE Trans. Instrum. Meas. 2014, 64, 52–62.
Chen, Z.; Tu, X.; Hu, Y.; Li, F. Real-time bearing remaining useful life estimation based on the frozen convolutional and activated memory neural network. IEEE Access 2019, 7, 96583–96593.
Ren, L.; Sun, Y.; Wang, H.; Zhang, L. Prediction of bearing remaining useful life with deep convolution neural network. IEEE Access 2018, 6, 13041–13049.
Miao, Y.; Wang, J.; Zhang, B.; Li, H. Practical framework of Gini index in the application of machinery fault feature extraction. Mech. Syst. Signal Process. 2022, 165, 108333.
Miao, Y.; Li, C.; Zhang, B.; Lin, J. Application of a coarse-to-fine minimum entropy deconvolution method for rotating machines fault detection. Mech. Syst. Signal Process. 2023, 198, 110431.
Mishra, M.; Martinsson, J.; Goebel, K.; Rantatalo, M. Bearing Life Prediction with Informed Hyperprior Distribution: A Bayesian Hierarchical and Machine Learning Approach. IEEE Access 2021, 9, 157002–157011.
Singleton, R.K.; Strangas, E.G.; Aviyente, S. Extended Kalman filtering for remaining-useful-life estimation of bearings. IEEE Trans. Ind. Electron. 2014, 62, 1781–1790.
Nectoux, P.; Gouriveau, R.; Medjaher, K.; Ramasso, E.; Chebel-Morello, B.; Zerhouni, N.; Varnier, C. PRONOSTIA: An experimental platform for bearings accelerated degradation tests. In Proceedings of the IEEE International Conference on Prognostics and Health Management, PHM’12, Denver, CO, USA, 18–21 June 2012; pp. 1–8.
Lee, J.; Qiu, H.; Yu, G.; Lin, J.; Rexnord Technical Services. IMS, University of Cincinnati. “Bearing Data Set” 2007. Available online: https://www.kaggle.com/datasets/vinayak123tyagi/bearing-dataset (accessed on 10 May 2023).
Javed, K.; Gouriveau, R.; Zerhouni, N.; Nectoux, P. Enabling health monitoring approach based on vibration data for accurate prognostics. IEEE Trans. Ind. Electron. 2014, 62, 647–656.
Wang, B.; Lei, Y.; Li, N.; Li, N. A Hybrid Prognostics Approach for Estimating Remaining Useful Life of Rolling Element Bearings. IEEE Trans. Rel. 2020, 69, 401–412.
Wang, H.; Dong, G.; Chen, J. Application of improved genetic programming for feature extraction in the evaluation of bearing performance degradation. IEEE Access 2020, 8, 167721–167730.
Benkedjouh, T.; Medjaher, K.; Zerhouni, N.; Rechak, S. Remaining useful life estimation based on nonlinear feature reduction and support vector regression. Eng. Appl. Artif. Intell. 2013, 26, 1751–1760.
ISO 1683:2015; Acoustics—Preferred Reference Values for Acoustical and Vibratory Levels. ISO: Geneva, Switzerland, 2015.
Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
Du, Y.; Wang, A.; Wang, S.; He, B.; Meng, G. Fault diagnosis under variable working conditions based on STFT and transfer deep residual network. Shock Vib. 2020, 2020, 1274380. [Google Scholar] [CrossRef]
Su, H.; Chong, K.T. Induction machine condition monitoring using neural network modeling. IEEE Trans. Ind. Electron. 2007, 54, 241–249. [Google Scholar] [CrossRef]
Lei, Y.; Lin, J.; He, Z.; Zuo, M.J. A review on empirical mode decomposition in fault diagnosis of rotating machinery. Mech. Syst. Signal Process. 2013, 35, 108–126. [Google Scholar] [CrossRef]
Wu, L.C.; Chen, H.H.; Horng, J.T.; Lin, C.; Huang, N.E.; Cheng, Y.C.; Cheng, K.F. A novel preprocessing method using Hilbert Huang transform for MALDI-TOF and SELDI-TOF mass spectrometry data. PLoS ONE 2010, 5, e12493. [Google Scholar] [CrossRef] [PubMed]
Wang, Y.H.; Yeh, C.H.; Young, H.W.V.; Hu, K.; Lo, M.T. On the computational complexity of the empirical mode decomposition algorithm. Phys. A Stat. Mech. Its Appl. 2014, 400, 159–167. [Google Scholar] [CrossRef]
Mao, W.; He, J.; Tang, J.; Li, Y. Predicting remaining useful life of rolling bearings based on deep feature representation and long short-term memory neural network. Adv. Mech. Eng. 2018, 10, 1687814018817184. [Google Scholar] [CrossRef] [Green Version]
Bellman, R. Dynamic programming. Science 1966, 153, 34–37. [Google Scholar] [CrossRef]
Zhou, Z.; Mo, J.; Shi, Y. Data imputation and dimensionality reduction using deep learning in industrial data. In Proceedings of the 2017 3rd IEEE International Conference on Computer and Communications (ICCC), Chengdu, China, 13–16 December 2017; pp. 2329–2333. [Google Scholar]
Bandyopadhyay, I.; Purkait, P.; Koley, C. Performance of a Classifier Based on Time-Domain Features for Incipient Fault Detection in Inverter Drives. IEEE Trans. Ind. Informat. 2019, 15, 3–14. [Google Scholar] [CrossRef]
Elforjani, M.; Shanbr, S. Prognosis of bearing acoustic emission signals using supervised machine learning. IEEE Trans. Ind. Electron. 2017, 65, 5864–5871. [Google Scholar] [CrossRef] [Green Version]
Moshrefzadeh, A. Condition monitoring and intelligent diagnosis of rolling element bearings under constant/variable load and speed conditions. Mech. Syst. Signal Process. 2021, 149, 107153. [Google Scholar] [CrossRef]
Buzzoni, M.; D’Elia, G.; Mucchi, E.; Dalpiaz, G. A vibration-based method for contact pattern assessment in straight bevel gears. Mech. Syst. Signal Process. 2019, 120, 693–707. [Google Scholar] [CrossRef]
Shi, R.; Ngan, K.N.; Li, S. Jaccard index compensation for object segmentation evaluation. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 4457–4461. [Google Scholar]
Habbouche, H.; Amirat, Y.; Benkedjouh, T.; Benbouzid, M. Bearing Fault Event-Triggered Diagnosis using a Variational Mode Decomposition-based Machine Learning Approach. IEEE Trans. Energy Convers. 2021, 37, 466–474. [Google Scholar] [CrossRef]
Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]

Figure 1. A flow diagram outlining the proposed machine learning (ML) methodology and experimental approach using different recipes. This method diagram is illustrated using five main panels, which directly correspond with the remaining useful life (RUL) estimation framework text in the Section 2.1–Section 2.5.

Figure 2. (a) Vibration signal data from the Pronostia platform for seven test bearings under condition 1 (speed of 1800 rpm and load of 4000 N). (b) Moving average (MA) of the seven vibration signals, shown as acceleration level (dB) against time (s), for condition 1 of the bearing dataset. The MA interval is 512 points and a reference level of 1 $\mu$m/s$^{2}$ was used, as per [53]. (c) Spectrogram of the S.01 bearing signal, exhibiting a gradual increase in power spectral density (PSD) as the bearing degrades from healthy to failure. (d) Spectrogram of the S.04 bearing signal, exhibiting a sudden, abrupt increase in PSD, indicating a catastrophic failure mode.
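
The acceleration level plotted in panel (b) of Figure 2 can be illustrated with a short sketch. It assumes the input is already in m/s$^{2}$ and that the 512-point moving average is taken over the squared signal (a running mean-square); the caption does not state which quantity is averaged, so both points are assumptions, and the function name is hypothetical.

```python
import math

A_REF = 1e-6  # reference acceleration of 1 um/s^2, expressed in m/s^2


def acceleration_level_db(accel_ms2, window=512):
    """Moving-average acceleration level in dB re 1 um/s^2.

    Assumption: the moving average is a running mean-square, with one
    output value per full window of `window` samples.
    """
    if len(accel_ms2) < window:
        raise ValueError("signal is shorter than the averaging window")
    squares = [a * a for a in accel_ms2]
    running = sum(squares[:window])
    levels = []
    for end in range(window, len(squares) + 1):
        mean_square = running / window
        # 10*log10 of a mean-square ratio equals 20*log10 of the RMS ratio.
        # The floor guards against log10(0) for an all-zero window.
        levels.append(10.0 * math.log10(max(mean_square, 1e-30) / A_REF**2))
        if end < len(squares):
            # Slide the window forward by one sample.
            running += squares[end] - squares[end - window]
    return levels
```

Under these assumptions a constant 1 m/s$^{2}$ signal maps to 120 dB, since 20 log10(1 / 10^-6) = 120.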

Figure 3. Feature extraction approach using Short-time Fourier Transform (STFT) and Hilbert Marginal Spectrum (HMS) techniques combined with linear and non-linear feature engineering to effectively compress the number of spectral features down to 25.
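
As a rough illustration of the compression step named in Figure 3, the sketch below sums spectral power into 25 one-third octave bands. The band layout (top band centred at 12.5 kHz, centres stepping down by a factor of 2^(1/3)) and the nearest-bin fallback for empty bands are assumptions made for this example; the band edges actually used are not given in this section, and the function name is hypothetical.

```python
import numpy as np


def third_octave_compress(power, freqs, n_bands=25, f_top=12500.0):
    """Sum spectral power into n_bands one-third octave bands.

    power: array of shape (..., n_bins); freqs: array of shape (n_bins,) in Hz.
    Assumption: the top band is centred at f_top and successive centres
    step down by a factor of 2**(1/3).
    """
    power = np.asarray(power, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Ascending band centres ending at f_top.
    centres = f_top * 2.0 ** (-np.arange(n_bands)[::-1] / 3.0)
    lower = centres * 2.0 ** (-1.0 / 6.0)
    upper = centres * 2.0 ** (1.0 / 6.0)
    out = np.empty(power.shape[:-1] + (n_bands,))
    for b in range(n_bands):
        mask = (freqs >= lower[b]) & (freqs < upper[b])
        if not mask.any():
            # Band narrower than the bin spacing: fall back to the nearest bin.
            mask = np.zeros(freqs.shape, dtype=bool)
            mask[np.argmin(np.abs(freqs - centres[b]))] = True
        out[..., b] = power[..., mask].sum(axis=-1)
    return out, centres
```

With a 1024-point FFT at 25,600 Hz the bin spacing is 25 Hz, so the lowest bands in this layout are narrower than one bin and may share a bin through the fallback. A different lower band limit would avoid that.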

Figure 4. The round robin testing framework for the training and testing signal used in experimental studies A (a) and B (b). All of the samples for the bearing being tested are completely unseen and thus not allocated for algorithm training purposes for that particular iteration.
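
The hold-out rule described in Figure 4 can be sketched as a leave-one-bearing-out generator. This is a minimal illustration only: the function name and the dictionary layout are hypothetical, and the differences between Studies A and B are not modelled.

```python
def round_robin_splits(samples_by_bearing):
    """Yield (held_out_id, train, test) with one bearing held out per round.

    samples_by_bearing maps a bearing id to its list of samples. Every
    sample of the held-out bearing goes to the test set, so no sample from
    that bearing is seen during training in that round.
    """
    for held_out in samples_by_bearing:
        train = [
            sample
            for bearing, samples in samples_by_bearing.items()
            if bearing != held_out
            for sample in samples
        ]
        test = list(samples_by_bearing[held_out])
        yield held_out, train, test
```

For example, with bearings keyed as `"S.01"`, `"S.02"`, and so on, each round trains on all other bearings and tests on the one held out.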

Figure 5. Beginning with the top left pane and moving left to right, the confusion matrices of the best performing SVM and k-NN trained feature sets in Study A are presented for STFT and HMS, respectively. Specifically, (a) $O_{X} (m, k)$ features with SVM (Medium G), (b) $O_{X} (m, k)$ features with k-NN (Coarse), (c) $L_{H} (m, k)$ features with SVM (Fine G), and (d) $O_{H} (m, k)$ features with k-NN (Medium).
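
For readers unfamiliar with confusion matrices, the sketch below shows how one is tallied and how an overall accuracy percentage follows from it. It assumes integer class labels, rows for the true class, and columns for the predicted class. Whether the figures use this orientation, and whether the reported percentages are overall accuracy, is an assumption here.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count matrix with rows = true class and columns = predicted class."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


def accuracy_percent(matrix):
    """Overall accuracy: diagonal (correct) counts over all counts, in %."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total
```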

Figure 6. Beginning with the top left pane and moving left to right, the confusion matrices of the best performing SVM and k-NN trained feature sets in Study B are presented for STFT and HMS, respectively. Specifically, (a) $O_{X} (m, k)$ features with SVM (Medium G), (b) $O_{X} (m, k)$ features with k-NN (Weighted), (c) $L_{H} (m, k)$ features with SVM (Fine G), and (d) $O_{H} (m, k)$ features with k-NN (Medium).

Table 1. Short-time Fourier Transform (STFT) Parameters.

Sampling Frequency ($f_{s}$)	25,600 Hz
Window Length ($L_{w}$)	256
Hop Size ($L_{h}$)	$L_{w}$ × 99%
No. of FFT Points (N)	1024
Window Type (w)	Blackman
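
The parameters in Table 1 can be exercised with a small NumPy sketch of a Blackman-windowed STFT. The function name is hypothetical, and the hop is left as an argument because the hop/overlap convention behind "$L_{w}$ × 99%" is not spelled out in this section.

```python
import numpy as np

FS = 25_600   # sampling frequency (Hz), Table 1
L_W = 256     # window length (samples), Table 1
N_FFT = 1024  # number of FFT points, Table 1


def stft_power(signal, hop, l_w=L_W, n_fft=N_FFT, fs=FS):
    """Return (freqs, times, power) for a Blackman-windowed STFT.

    hop is the frame advance in samples (an explicit argument, since the
    table's hop convention is not confirmed here).
    """
    signal = np.asarray(signal, dtype=float)
    if hop < 1 or len(signal) < l_w:
        raise ValueError("need hop >= 1 and at least one full window")
    window = np.blackman(l_w)
    starts = np.arange(0, len(signal) - l_w + 1, hop)
    frames = np.stack([signal[s:s + l_w] * window for s in starts])
    # rfft zero-pads each 256-sample frame to n_fft points.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    power = np.abs(spectrum) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    times = (starts + l_w / 2) / fs  # frame centres in seconds
    return freqs, times, power
```

A literal reading of the table gives `hop=round(0.99 * L_W)`, i.e. 253 samples. If 99% overlap was intended instead, the hop would be about 3 samples. With these settings the output has 513 frequency bins spaced 25 Hz apart, up to 12,800 Hz.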

Table 2. k-NN Method Parameters.

k-NN Method	Distance Weightings	k Value
Fine	Euclidean	1
Medium	Euclidean	10
Coarse	Euclidean	100
Cosine	Cosine	10
Cubic	Cubic	10
Weighted	Weights	10
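
To make the "Weighted" row of Table 2 concrete, here is a minimal sketch of a distance-weighted k-NN vote with k = 10. It assumes Euclidean distance and inverse-squared-distance weights. The table only names the scheme "Weights", so the weighting function actually used is not confirmed here, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_x, train_y, query, k=10, eps=1e-12):
    """Classify query by a distance-weighted vote of its k nearest neighbours.

    Assumptions: Euclidean distance and weights of 1 / distance**2.
    eps avoids division by zero when the query coincides with a sample.
    """
    if k < 1 or len(train_x) != len(train_y) or not train_x:
        raise ValueError("need k >= 1 and matching, non-empty training data")
    neighbours = sorted(
        ((math.dist(x, query), label) for x, label in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance * distance + eps)
    return max(votes, key=votes.get)
```

Unlike the unweighted methods in the table, a single very close neighbour can outvote several distant ones, which is the practical difference this weighting introduces.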

Table 3. Classification Results (%) for STFT Feature Sets in Experimental Study A.

	SVM
Features	Linear	Quadratic	Cubic	Fine G	Medium G	Coarse G
$X (m, f)$	65.0	62.2	60.0	63.2	65.4	65.0
$L_{X} (m, k)$	58.5	58.3	51.6	63.2	60.6	64.9
$O_{X} (m, k)$	72.3	69.1	62.6	62.4	74.1	67.9
	k-NN
Features	Fine	Medium	Coarse	Cosine	Cubic	Weighted
$X (m, f)$	62.4	64.6	68.0	67.3	65.5	64.4
$L_{X} (m, k)$	61.3	62.3	66.8	66.9	64.2	61.9
$O_{X} (m, k)$	68.7	69.7	73.0	69.4	69.4	69.7

Table 4. Classification Results (%) for HMS Feature Sets in Experimental Study A.

	SVM
Features	Linear	Quadratic	Cubic	Fine G	Medium G	Coarse G
$H (m, f)$	66.8	3.4	24.7	69.1	68.3	66.9
$L_{H} (m, k)$	68.2	14.1	26.5	69.5	68.2	68.0
$O_{H} (m, k)$	67.8	7.6	51.5	67.7	69.2	68.0
	k-NN
Features	Fine	Medium	Coarse	Cosine	Cubic	Weighted
$H (m, f)$	66.3	65.6	63.6	44.0	67.0	65.7
$L_{H} (m, k)$	69.3	69.6	69.8	68.1	69.4	69.7
$O_{H} (m, k)$	70.1	70.8	70.6	69.9	70.3	70.7

Table 5. Classification Results (%) for STFT Feature Sets in Experimental Study B.

	SVM
Features	Linear	Quadratic	Cubic	Fine G	Medium G	Coarse G
$X (m, f)$	61.4	54.2	50.2	63.2	61.9	64.5
$L_{X} (m, k)$	52.2	49.2	39.4	63.2	54.4	64.0
$O_{X} (m, k)$	72.4	72.1	65.9	62.6	82.8	66.9
	k-NN
Features	Fine	Medium	Coarse	Cosine	Cubic	Weighted
$X (m, f)$	66.2	68.3	69.7	67.2	69.3	67.9
$L_{X} (m, k)$	66.4	68.9	68.0	69.8	69.6	68.2
$O_{X} (m, k)$	78.7	80.4	79.3	79.9	78.6	81.0

Table 6. Classification Results (%) for HMS Feature Sets in Experimental Study B.

	SVM
Features	Linear	Quadratic	Cubic	Fine G	Medium G	Coarse G
$H (m, f)$	68.8	3.2	31.7	72.7	71.5	69.0
$L_{H} (m, k)$	71.1	2.8	34.9	74.3	71.2	71.0
$O_{H} (m, k)$	70.4	4.7	58.3	74.1	72.5	71.0
	k-NN
Features	Fine	Medium	Coarse	Cosine	Cubic	Weighted
$H (m, f)$	67.9	66.7	63.2	32.0	69.2	66.9
$L_{H} (m, k)$	73.7	73.7	74.2	71.2	73.3	73.9
$O_{H} (m, k)$	75.4	75.9	75.4	74.6	75.5	75.8

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Chelmiah, E.T.; McLoone, V.I.; Kavanagh, D.F. Low Complexity Non-Linear Spectral Features and Wear State Models for Remaining Useful Life Estimation of Bearings. Energies 2023, 16, 5312. https://doi.org/10.3390/en16145312
