Article

Remaining Useful-Life Prediction of the Milling Cutting Tool Using Time–Frequency-Based Features and Deep Learning Models

1
Symbiosis Institute of Technology, Symbiosis International (Deemed University), Pune 412115, India
2
Symbiosis Centre for Applied Artificial Intelligence, Symbiosis International (Deemed University), Pune 412115, India
3
Faculty of Computing and Data Science, Flame University, Lavale, Pune 412115, India
*
Authors to whom correspondence should be addressed.
Sensors 2023, 23(12), 5659; https://doi.org/10.3390/s23125659
Submission received: 10 May 2023 / Revised: 12 June 2023 / Accepted: 15 June 2023 / Published: 17 June 2023
(This article belongs to the Special Issue Sensors and Methods for Diagnostics and Early Fault Detection)

Abstract:
The milling machine serves an important role in manufacturing because of its versatility in machining. The cutting tool is a critical component of machining because it is responsible for machining accuracy and surface finish, impacting industrial productivity. Monitoring the cutting tool's life is essential to avoid machining downtime caused by tool wear. To prevent unplanned machine downtime and to utilize the maximum life of the cutting tool, accurate prediction of the remaining useful life (RUL) of the cutting tool is essential. Different artificial intelligence (AI) techniques can estimate the RUL of cutting tools in milling operations with improved prediction accuracy. The IEEE NUAA Ideahouse dataset has been used in this paper for the RUL estimation of the milling cutter. The accuracy of the prediction is based on the quality of feature engineering performed on the unprocessed data. Feature extraction is a crucial phase in RUL prediction. In this work, the authors consider time–frequency domain (TFD) features such as the short-time Fourier-transform (STFT) and different wavelet transforms (WT), along with deep learning (DL) models such as long short-term memory (LSTM), different variants of LSTM, convolutional neural network (CNN), and hybrid models that combine CNN with LSTM variants, for RUL estimation. The TFD feature extraction with LSTM variants and hybrid models performs well for milling cutting tool RUL estimation.

1. Introduction

Machining is an important process in the manufacturing industry [1]. Machining process monitoring plays a vital role in improving the industry's productivity by reducing unscheduled downtime caused by failure of the cutting tool [2]. A proper predictive maintenance strategy must be defined to estimate the cutting tool's life, which is reduced by tool wear during the machining operation [3]. With evolving artificial intelligence (AI) techniques and advancements in sensor technology, data-driven prediction models are widely used for tool wear and remaining useful life (RUL) prediction [4]. The RUL of a cutting tool is the length of time it can accomplish its function effectively before deterioration prevents it from performing its purpose. Multiple factors, including the material being cut, the cutting speed, the tool geometry, and the cutting fluid, can affect the RUL of a cutting tool. This paper mainly focuses on the RUL reduction of cutting tools caused by changes in tool geometry due to tool wear. There are numerous approaches for estimating the RUL of a cutting tool, including physics-based models and machine learning (ML) or deep learning (DL)-based methods. Physics-based models, also known as mechanistic models, are based on fundamental physics principles and the laws regulating the system. These models seek to describe the behavior of the system using mathematical equations that represent the physical processes involved. ML/DL methods typically involve monitoring the cutting tool's condition and performance during use and utilizing these data to anticipate how long the tool can be used before it must be replaced. Among ML/DL-based methods, the data-driven approach is an effective technique for forecasting the RUL of equipment [5].
To prevent the machine's unplanned downtime and utilize the cutting tool's maximum life, accurate prediction of the cutting tool's RUL plays an important role [6,7]. Sayyad et al. discussed the effect of unplanned downtime on equipment cost and the profit of industries [4]. Accurate RUL prediction has the potential to considerably increase the dependability and operational safety of industrial components or systems, thereby reducing maintenance costs and preventing severe breakdowns. Figure 1 illustrates the generalized concept of the RUL of equipment.
The prediction accuracy of the RUL estimation plays an important role, as accurate prediction helps to utilize the maximum life of the cutting tool. Many researchers aim to predict tool wear instead of the RUL of the cutting tool. In the case of tool wear prediction, frequent measurement of the flank or crater wear values is quite difficult. In manual tool measurement using a tool maker's microscope, the cutting tool must frequently be removed for wear mapping against the sensor data, which disturbs the machining process. In comparison, RUL prediction provides continuous mapping of the sensor data in terms of time. Compared with tool wear as the target output, few research works use the RUL as the target variable for milling cutting tools. Additionally, consolidated comparative studies of different time–frequency-based feature extraction techniques with various decision-making algorithms have received little attention in previous works. The significant contributions of this work are as follows:
  • To estimate the RUL of the milling cutting tool using the NUAA Ideahouse dataset [8];
  • To use time–frequency feature extraction techniques such as STFT, CWT, and WPT to obtain useful insights from the data with reduced data dimensions;
  • To use different machine learning and deep learning decision-making algorithms for cutting tool RUL prediction and to check the performance of each model using different evaluation parameters.

2. Related Work

This comprehensive literature analysis aims to investigate the present state of research, the various approaches used for RUL estimation in the context of cutting tools, and the different feature extraction and selection techniques used to improve RUL estimation accuracy. As physics-based and data-driven models are widely used for tool wear and RUL predictions [9,10,11], this section discusses both approaches. Different feature extraction techniques, covering the time, frequency, and time–frequency domains, and various feature selection techniques are discussed for RUL prediction.

2.1. Physics-Based Model

In physics-based modeling for prediction, fundamental principles and mathematical equations are used to model the system's behavior. For accurate predictions, a number of factors, such as understanding the system, formulating equations based on the system's behavior, making simplifying assumptions, and estimating system parameters, need to be studied deeply.
In the case of RUL prediction of the cutting tool, the tool geometry, cutting tool material, and workpiece material must be considered. Cutting force models, degradation mechanisms, and wear rate estimation equations are required for the estimation; based on these, the health indicators are developed. Model refinement and continuous improvement are required in physics-based modeling to improve prediction accuracy.
Physics-based techniques suffer from a scarcity of accurate analytical models to characterize tool wear processes, owing to the intrinsic complexity of the cutting process and an imperfect understanding of the machining process [9]. The data-driven modeling approach is preferred for tool wear and RUL prediction to avoid the modeling uncertainty of the physics-based model [12].

2.2. Data-Driven Model

The data-driven approach makes use of machine learning and statistical methods to extract patterns and relationships from the available data. In a data-driven model, sensor data are collected from the machine by mounting sensors at appropriate positions to understand the condition of the cutting tool. The dataset should encompass a wide range of tool life spans and operating conditions to capture the variation in tool deterioration patterns. The raw data need to be pre-processed before performing feature extraction. Pre-processing ensures the data's integrity and suitability for modeling; this step entails dealing with missing values and anomalies and normalizing or standardizing the data to create a consistent scale for the variables. The features extracted from the pre-processed data are provided to decision-making algorithms to produce the desired RUL prediction. Figure 2 shows the generalized data-driven model for RUL prediction.
In a data-driven model, the machining signals are gathered using different sensors such as acoustic, dynamometer, current, vibration, etc., using the indirect sensing technique [13,14]. These collected signals are used for RUL prediction by applying feature extraction, feature selection, and prediction algorithms on data.
Many researchers used the data-driven modeling approach for the tool wear and RUL prediction using single or multi-sensors with different feature extraction approaches [15,16,17]. Feature extraction and selection play an essential role in the accuracy of the prediction models in the data-driven models.

2.3. Feature Extraction and Selection

Proper feature extraction and selection are crucial in the training phase of any machine and deep learning model. Generally, features are extracted in the time domain (TD), frequency domain (FD), and time–frequency domain (TFD) [18]. TD features mainly represent the change in signal amplitude with respect to time. In the FD, signals are generally converted from the TD to the FD spectrum using the fast Fourier-transform (FFT) technique. The FFT decomposes the signal into sinusoidal components, describing the spectral signal in terms of amplitude and frequency distribution [19]. However, the FFT does not capture abrupt changes in the signal. The FD carries the time distribution information only in the phase characteristics of the Fourier-transform (FT), and this information is not easy to use in signal processing [20]. The FFT also lacks the ability to provide frequency information over a localized region of the signal in time. Both TD- and FD-based feature extraction techniques are more suitable for stationary signal applications, where the spectral components of the signal do not change with time [21]. However, most signals generated in real time are non-stationary, with spectral components that vary with time [22]. The TFD feature extraction process is therefore preferred for the non-stationary signals generated in the machining process [23]. TFD-based feature extraction techniques mainly include the STFT, WT, empirical mode decomposition (EMD), Hilbert–Huang transform (HHT), etc. The STFT and WT TFD feature extraction techniques provide good results for tool wear and RUL prediction [23,24,25].
Rafezi et al. used vibration and sound signals to monitor the tool condition in CNC lathe drilling operations [26]. The authors used both TD and TFD features for tool condition monitoring and found that the TFD's wavelet packet decomposition approach correlates better with tool conditions. Hong et al. used a dynamometer to gather the torque and forces generated during micro-milling [24]. The WPT method extracts the features from the raw signals to monitor tool wear in micro-milling. Xiang et al. used an accelerometer to capture the vibration signals during milling [27]. To extract features from the input vibration data, the WPT is employed. The extracted WPT features are provided to a backpropagation neural network (BPNN) and an LSTM to predict the tool wear class. The LSTM shows a higher testing accuracy (up to 95.67%) than the BPNN model for estimating the type of tool wear.
From the literature, it was found that time–frequency domain feature extraction provides better prediction results for the non-stationary signals generated during machining. At the same time, deep learning models such as the LSTM show better prediction results in time-series data analysis. From previous work, it was found that limited comparative research has been carried out on RUL prediction using TFD feature extraction techniques with different feature selection methods and ML and DL decision-making models. In this work, the time–frequency domain techniques are used for feature extraction with the PCC and RF feature selection methods. Various ML and DL prediction models, including SVM, RFR, GBR, LSTM, CNN, LSTM variants, and hybrid models such as CNN with LSTM variants, are used to improve the prediction accuracy of the RUL estimation.

3. Time–Frequency Domain Feature Extraction

Non-stationary signals with time-varying frequency characteristics show poor time localization in the spectral domain. The TFD analysis is preferred to overcome the TD and FD limitations. Figure 3 compares the signal windowing approaches of the TD, FD, STFT, and WT [28,29].
In this study, the authors used the STFT and WT methods for feature extraction, as these methods show promising results in RUL prediction during machining.

3.1. Short-Time Fourier-Transform (STFT)

The poor time localization of the spectral domain's non-stationary signals is overcome by dividing the original signal into multiple short-duration windows in the Fourier-transform; this technique is called the windowed Fourier-transform (WFT) or STFT. The FFT does not use a windowing function for signal transformation, as shown in Equation (1). In contrast, for calculating the STFT of the signal, a windowing function is used, as expressed mathematically in Equation (2) [30].
F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-i \omega t}\, dt \qquad (1)
S(\tau, \omega) = \int_{-\infty}^{+\infty} f(t)\, w(t - \tau)\, e^{-i \omega t}\, dt \qquad (2)
where 'f(t)' is the signal to be analyzed, 'w(t − τ)' is the window function, 'τ' is the translation parameter for time localization, and 'ω' is the frequency component of the signal.
For computing the STFT, different equal-length windowing functions, such as Hamming or Gaussian windows, are used. The discrete Fourier-transform (DFT) is performed on each section separately to form the time–frequency (TF) spectral signal. Reducing the window size improves the time resolution, resulting in more accurate TF localization at the cost of increased computation time. Conversely, a wide window results in poor time resolution with good frequency resolution. The windowing function used in the STFT does not vary (it is neither scalable nor movable), as the window size is chosen before the STFT operation.
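A minimal Python sketch of this computation is given below, using SciPy's STFT with a Hamming window; the sampling rate follows the 300 Hz synchronized rate used later in this work, while the window length, overlap, and placeholder signal are illustrative assumptions.

```python
# Hedged sketch: STFT of one sensor channel with an equal-length Hamming window.
# Window length and overlap are illustrative, not the settings of this study.
import numpy as np
from scipy.signal import stft

fs = 300                              # sampling frequency in Hz (synchronized rate)
signal = np.random.randn(fs * 10)     # placeholder for a 10 s sensor channel

# A shorter nperseg improves time resolution at the cost of frequency
# resolution (and vice versa), as discussed above.
f, t, Zxx = stft(signal, fs=fs, window='hamming', nperseg=256, noverlap=128)

magnitude = np.abs(Zxx)               # |STFT| coefficients on the TF grid
print(magnitude.shape)                # (frequency bins, time frames)
```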

3.2. Wavelet Transforms (WT)

The WT is an extension of the FT and a type of TF feature extraction technique. The WT uses a family of 'wavelets' to decompose the signal, with the wavelet acting as the windowing function. Different wavelet families, such as Symlets, Morlets, Daubechies, Haar, etc., provide different windowing functions. The wavelet functions can be shifted and scaled according to the signal requirements. Due to this scaling and shifting property, the WT is adaptable to a wide range of time and frequency resolutions, making it a better alternative to the STFT in non-stationary signal analysis. Equation (3) shows the mother wavelet ψ(t) used to calculate the wavelet transform function [23].
\Psi_{z,\tau}(t) = \frac{1}{\sqrt{z}}\, \Psi\!\left(\frac{t - \tau}{z}\right) \qquad (3)
where ψ(t) is the mother wavelet, τ is the translation parameter, z is the scaling factor, and t is the time stamp of the generated signal. In the original mother wavelet, z = 1 and τ = 0. The WT is mainly divided into the CWT, DWT, and WPT [31].

3.2.1. Continuous Wavelet Transform (CWT)

The CWT is an effective signal transformation technique in stationary and non-stationary signal analysis. The mathematical representation of the CWT of the signal is expressed by Equation (4).
CW(z, \tau) = \frac{1}{\sqrt{z}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t - \tau}{z}\right) dt \qquad (4)
where 'f(t)' is the signal for the wavelet transform, 'ψ*' is the complex conjugate of the mother wavelet Ψ(t), z is the scaling parameter used for zooming the wavelet, and τ is the translation parameter used to define the location of the window. The integral compares the shape of the generated wavelet with the original signal. The equation generates the wavelet coefficients, which show the correlation between the waveform and the wavelet at various scaling and shifting values [32]. However, its computation is slow, and it generates redundant coefficients during the transformation.
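The following minimal sketch computes a CWT with PyWavelets; the Morlet mother wavelet, the range of scales, and the placeholder signal are illustrative assumptions, not the exact configuration of this study.

```python
# Hedged sketch: CWT of one sensor channel using a Morlet mother wavelet.
import numpy as np
import pywt

fs = 300
signal = np.random.randn(fs * 10)      # placeholder sensor channel
scales = np.arange(1, 65)              # 64 scales (illustrative range)

# coeffs has shape (len(scales), len(signal)); freqs maps each scale to Hz
coeffs, freqs = pywt.cwt(signal, scales, 'morl', sampling_period=1 / fs)
print(coeffs.shape, freqs[:3])
```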

3.2.2. Wavelet Packet Transform (WPT)

The WPT is an enhancement of the DWT in which both the detailed and approximate coefficients obtained in the DWT are further decomposed at every stage [33].
Figure 4 shows the WPT with three levels of decomposition. Here, LP and HP are the low-pass and high-pass filters of the signals. The LP and HP outputs are again divided into approximate and detailed coefficients. The WPT uses Equation (3) to decompose the signal and calculate the wavelet transform function. The WPT uses the two-scale difference equation to construct scaling and wavelet functions from a single scaling function. The coefficients related to the scaling function, also known as approximation coefficients, are linked with low-frequency data. In contrast, the wavelet function coefficients, also known as detail coefficients, are associated with high-frequency information. Figure 4 shows that at the first level of decomposition, the signal is decomposed into D (HP) and A (LP). Similarly, at the second level of decomposition, the approximate signal is decomposed into AA (LP) and DA (LP), and the detailed coefficients decompose into DD (HP) and AD (HP).
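A minimal PyWavelets sketch of this decomposition is shown below; the Daubechies-4 wavelet and the placeholder signal are illustrative assumptions, while the node naming ('a', 'd', 'aa', ...) mirrors the labels in Figure 4.

```python
# Hedged sketch: wavelet packet decomposition of one sensor channel.
import numpy as np
import pywt

signal = np.random.randn(3000)        # placeholder sensor channel
wp = pywt.WaveletPacket(data=signal, wavelet='db4',
                        mode='symmetric', maxlevel=3)

approx = wp['a'].data                 # level-1 approximation (LP) coefficients
detail = wp['d'].data                 # level-1 detail (HP) coefficients

# Deeper nodes follow the naming of Figure 4
print([node.path for node in wp.get_level(2)])   # ['aa', 'ad', 'da', 'dd']
```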

4. Proposed Methodology

The overall methodology section is divided into four sub-sections. As an online dataset is used in this work, Section 4.1 discusses the dataset description, Section 4.2 discusses feature extraction and selection, and Section 4.3 discusses the models used for RUL prediction. Finally, Section 4.4 discusses the evaluation parameters used for model comparison. The detailed methodology for RUL prediction is shown in Figure 5.

4.1. Dataset Description

The IEEE NUAA Ideahouse [8] dataset is used to predict the RUL of the cutting tool. In this dataset, a vibration sensor (PCB™ W356B11), a sensory tool holder (Spike™ sensory tool holder), and a PLC are used to collect the vibration, cutting forces, and current/power from the milling machine (DMU 80P duoBLOCK™) during the machining of a titanium alloy (TC4) with solid carbide and high-speed steel endmill cutters (12 mm diameter and 75 mm length). Figure 6 shows the schematic representation of the test rig for the NUAA Ideahouse dataset. The sensory tool holder is connected to the milling machine's spindle to collect the cutting forces. The vibration sensor is mounted near the workpiece to collect the vibration signals.
Figure 7 shows the signal acquisition system for the NUAA Ideahouse milling dataset. Table 1 shows the sampling frequencies for each piece of acquisition equipment. The sampling rates were chosen based on the cutting and spindle speeds. The vibration, cutting forces, and spindle current/power are collected with sampling rates of 400 Hz, 600 Hz, and 300 Hz, respectively. During the collection process, the low-frequency signals gathered by the software were autonomously interpolated, so the volume of data stored for each signal type is the same. Data synchronization software synchronizes the sampling frequencies at 300 Hz for all the signals.
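As a hedged sketch of this synchronization step, the snippet below linearly interpolates the 400 Hz and 600 Hz channels onto a common 300 Hz timeline; the helper function, duration, and placeholder signals are illustrative assumptions, not the acquisition software's actual procedure.

```python
# Hedged sketch: resample channels with different rates onto a 300 Hz timeline.
import numpy as np

def resample_to(signal, fs_in, fs_out, duration):
    """Linearly interpolate `signal` from fs_in onto an fs_out time grid."""
    t_in = np.arange(0.0, duration, 1.0 / fs_in)[: len(signal)]
    t_out = np.arange(0.0, duration, 1.0 / fs_out)
    return np.interp(t_out, t_in, signal)

duration = 10.0                                    # seconds (illustrative)
vibration = np.random.randn(int(400 * duration))   # 400 Hz vibration channel
force = np.random.randn(int(600 * duration))       # 600 Hz force channel

vib_sync = resample_to(vibration, 400, 300, duration)
force_sync = resample_to(force, 600, 300, duration)
print(len(vib_sync), len(force_sync))              # equal lengths at 300 Hz
```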
In the IEEE NUAA Ideahouse [8] dataset, an L9 orthogonal array of experiments is created using design of experiments, as shown in Table 2. Out of the nine cases, the first two, W1 and W2, are considered for the RUL prediction.
A total of thirty runs are taken in case 1 (W1), as shown in Table 3, and the flank wear of the tool is measured after each run. The maximum width of the flank wear is decided based on the ISO 8688 standard. In this dataset, the machining data are collected until the maximum value of the tool wear (maximum flank wear, i.e., VBmax) reaches 0.30 mm. A flank wear of 0.30 mm is considered functional failure of the cutting tool during machining in this dataset. The RUL of the cutting tool is estimated based on the value of flank wear. An additional time (in seconds) column is added to the sensor data based on the sampling rate of the data to generate the RUL column for each run. For the W1 case, the maximum value of flank wear reaches 0.27 mm, so all 30 runs are considered for generating the RUL column based on the sampling frequency.
Figure 8 shows the raw data representation (scaled between 0 and 1) of the individual sensor signals with respect to time. For the raw data representation, all 30 runs of the W1 case are merged. The total time span for the maximum flank tool wear (VBmax) value to go from 0 mm to 0.27 mm is 3004 s. The TFD features are extracted from the raw data, and the selected features are divided for model training and testing. The data are split 70–30% for training and testing. The different ML and DL models are trained on the training data, and each model's performance is evaluated on the test data. Figure 9 shows the training and testing phases of the RUL prediction approach.
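As an illustration of this step, the sketch below constructs the RUL target from the time column (for a run history ending at functional failure, the RUL at time t is the total machining time minus t) and performs the 70–30% split; the 1 s time resolution, random split, and seed are illustrative assumptions.

```python
# Hedged sketch: build the RUL target for case W1 and split the data 70-30%.
# A 1 s time resolution is used for illustration; the dataset is at 300 Hz.
import numpy as np
from sklearn.model_selection import train_test_split

total_time = 3004.0                      # seconds until VBmax = 0.27 mm (case W1)
time = np.arange(0.0, total_time, 1.0)   # illustrative time stamps
rul = total_time - time                  # remaining useful life at each stamp

X = np.random.randn(len(time), 21)       # placeholder for selected features

X_train, X_test, y_train, y_test = train_test_split(
    X, rul, test_size=0.30, random_state=42)
print(X_train.shape, X_test.shape)       # 70% / 30% of the samples
```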

4.2. Feature Extraction and Selection

The raw data in the dataset are normalized and provided for time–frequency feature extraction. The data are transformed into different TFD representations, namely STFT, CWT, and WPT. The statistical features shown in Table 4 are then extracted from the TFD coefficient vectors.
The extracted statistical TFD features are selected using Pearson’s correlation coefficient (PCC) and random forest regressor (RFR) methods. PCC [34] is extensively used in machining for feature selection in tool wear and RUL prediction. Equation (5) determines the linear correlations between signals and output variables.
r = \frac{\sum_{i=1}^{n} (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_{i=1}^{n} (a_i - \bar{a})^2 \, \sum_{i=1}^{n} (b_i - \bar{b})^2}} \qquad (5)
where a_i is the input feature, ā is the average of the input feature, b_i is the target variable, and b̄ is the average of the target variable. The value of 'r' can range from −1 to 1, with −1 denoting a high degree of negative correlation and 1 denoting a high degree of positive correlation [35].
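A minimal sketch of this selection step is shown below; the feature names and data are placeholders, and the use of the absolute correlation against the 0.2 threshold (see Section 5.1) is an assumption about how negative correlations are handled.

```python
# Hedged sketch: PCC-based feature selection against the RUL target.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
features = pd.DataFrame(rng.standard_normal((1000, 4)),
                        columns=['stft_mean_fx', 'stft_rms_fx',
                                 'stft_mean_vib', 'stft_kurtosis_vib'])
rul = pd.Series(rng.standard_normal(1000), name='RUL')

r = features.corrwith(rul)                   # Pearson's r per feature column
selected = r[r.abs() > 0.2].index.tolist()   # keep features passing the cut-off
print(r.round(3).to_dict())
print(selected)
```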
Another method used for feature selection is the RF method. RF is an embedded feature selection method that lowers the danger of overfitting and performs quicker operations, overcoming the limitations of wrapper and filter feature selection methods [36]. An RF is made up of a number of decision trees created by randomly sampling features from the data. The significance of a feature is determined by the decrease in impurity, or the increase in node purity, that results from splitting on that feature. Whenever a split is made during the building of each decision tree, the decrease in impurity is noted, and this decrease is accumulated for every feature across the entire forest. The final step is to normalize the accumulated decrease by dividing it by the total number of trees, providing the feature importance score. The model creates a set containing the necessary features by trimming trees below a given node. The features selected using the PCC and RF methods are provided to the different RUL prediction models.
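The sketch below shows the corresponding embedded selection with a random forest regressor; note that scikit-learn's impurity-based importances sum to 1, so the mean-importance cut-off used here is an illustrative stand-in for the weightage threshold applied in this work.

```python
# Hedged sketch: embedded feature selection via random forest importances.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 10))      # placeholder feature matrix
y = rng.standard_normal(1000)            # placeholder RUL target

rf = RandomForestRegressor(n_estimators=100, random_state=42)
rf.fit(X, y)

importances = rf.feature_importances_    # mean decrease in impurity per feature
keep = np.flatnonzero(importances > importances.mean())   # illustrative cut-off
print(importances.round(3), keep)
```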

4.3. Models for RUL Prediction

The different models used for RUL prediction include ML models such as SVM, RFR, and GBR and DL models such as LSTM, LSTM variants, CNN, and hybrid models that combine CNN with LSTM variants. This section briefly discusses the LSTM, the LSTM variants, and the CNN with the LSTM model.
The LSTM [37] shows promising performance in tool wear and RUL prediction [38,39,40]. LSTM is an advancement of the recurrent neural network (RNN) [41]. The RNN’s gradient vanishing drawback was reduced in the LSTM structure [42]. The architecture of an LSTM unit is depicted in Figure 10. Long-range dependencies are exploited due to the improvements in the LSTM.
The LSTM modifies the memory at each step rather than overwriting it. The LSTM's main component is the cell. To add or change cell memory, the LSTM employs sigmoidal gates: the 'input gate I', 'candidate gate C', 'output gate O', and 'forget gate F'. A(t − 1) and A(t) denote the memory of the previous and current units, respectively. The hidden-state outputs of the previous and current cells are represented by B(t − 1) and B(t). X(t) is the input value, whereas × denotes element-wise multiplication. Y(t) indicates the output generated by the LSTM cell. The next unit cell is updated by the gate parameters by modifying or adjusting the parameters and filtering the information. Le et al. [43] discussed the detailed working of the LSTM model.
Figure 11 depicts several LSTM model variants. The vanilla LSTM comprises a single hidden layer of LSTM units that can only access sequential data in one direction [44]. The stack LSTM model, on the other hand, contains multiple hidden LSTM layers, whereas forward and backward LSTMs are combined to form the bi-directional LSTM. The architectures of the different LSTM variants are also discussed by Kolekar et al., Chandra et al., and Zhao et al. [45,46,47].
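As a hedged illustration, the Keras sketch below defines the three variants; the layer sizes, sequence length, and feature count are assumptions for illustration, not the configurations used in this work.

```python
# Hedged sketch: vanilla, stacked, and bi-directional LSTM regressors in Keras.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Bidirectional, Dense

timesteps, n_features = 30, 21           # illustrative input shape

vanilla = Sequential([LSTM(64, input_shape=(timesteps, n_features)),
                      Dense(1)])         # single hidden LSTM layer

stacked = Sequential([LSTM(64, return_sequences=True,
                           input_shape=(timesteps, n_features)),
                      LSTM(32),          # second hidden LSTM layer
                      Dense(1)])

bidir = Sequential([Bidirectional(LSTM(64),
                                  input_shape=(timesteps, n_features)),
                    Dense(1)])           # forward + backward passes combined

for model in (vanilla, stacked, bidir):
    model.compile(optimizer='adam', loss='mse')
```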
Figure 12 shows the combined CNN-LSTM architecture [48] for the RUL prediction of the cutting tool. Zhang X et al. and Agga A et al. discussed the architecture of the CNN-LSTM in detail [48,49]. In this work, along with the CNN-LSTM, the different variants of the LSTM are combined with the CNN model, namely CNN-vanilla LSTM, CNN-bidirectional LSTM, and CNN-stack LSTM.
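A minimal Keras sketch of the hybrid idea is given below: a 1-D convolution and pooling stage extracts local patterns from the feature sequence before an LSTM variant models the temporal dependencies; all hyperparameters are illustrative assumptions.

```python
# Hedged sketch: hybrid CNN + bi-directional LSTM regressor for RUL.
from tensorflow.keras import Sequential
from tensorflow.keras.layers import (Conv1D, MaxPooling1D, LSTM,
                                     Bidirectional, Dense)

timesteps, n_features = 30, 21

cnn_bilstm = Sequential([
    Conv1D(32, kernel_size=3, activation='relu',
           input_shape=(timesteps, n_features)),   # local pattern extraction
    MaxPooling1D(pool_size=2),
    Bidirectional(LSTM(64)),   # swap in LSTM(64) or stacked LSTMs for the
    Dense(1)                   # other hybrid variants
])
cnn_bilstm.compile(optimizer='adam', loss='mse')
```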

4.4. Performance Evaluation Parameters

Different performance measurement parameters, such as the 'R-squared score (R2)', 'root mean square error (RMSE)', and 'mean absolute percentage error (MAPE)', are used to measure how well these prediction models work. The R2 is a metric that assesses the accuracy of a forecast based on real and predicted data [50].
It evaluates the regression model performance by determining how far the predicted points are from the actual data points, whereas the RMSE is the square root of the average squared difference between the predicted and actual values. Finally, the MAPE is used to calculate the percentage prediction error. The formulae for all the performance parameters are provided in Table 5, where n is the number of data points, âᵢ is the predicted value, and aᵢ is the true or actual value.
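For reference, the sketch below computes the three parameters with scikit-learn on illustrative values; MAPE is scaled by 100 to match the percentages reported in the tables.

```python
# Hedged sketch: R2, RMSE, and MAPE on illustrative RUL values (in seconds).
import numpy as np
from sklearn.metrics import (r2_score, mean_squared_error,
                             mean_absolute_percentage_error)

y_true = np.array([3000.0, 2500.0, 2000.0, 1500.0])
y_pred = np.array([2950.0, 2550.0, 1900.0, 1600.0])

r2 = r2_score(y_true, y_pred)
rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mape = 100 * mean_absolute_percentage_error(y_true, y_pred)
print(f"R2={r2:.3f}, RMSE={rmse:.1f}, MAPE={mape:.2f}%")
```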

5. Results and Discussion

The NUAA Ideahouse dataset is in raw signal format with eight incoming signals, including four cutting forces, two vibrations, one current, and one power signal, as mentioned in Section 4. From the L9 orthogonal array, the first two cases, W1 and W2, are considered in this work. The results for case W1 are thoroughly elaborated in this section, and a summarized results table is provided for the W2 case at the end of the section.
The time column is added to the dataset based on the sampling frequency (300 Hz per signal). The actual values of the RUL are calculated based on the time column. The features are extracted and selected with the sensor data as input and the RUL as the target feature for the RUL prediction models. The data are normalized using the z-score normalization technique before being passed to the models. This results section is organized into three parts:
  • The feature extraction based on different TFD techniques such as CWT, STFT, and WPT and the feature selection using the PCC and RFR methods are discussed;
  • Model performance for each TFD feature set is evaluated using PCC and RF feature selection techniques with different ML (SVM, RFR, and GBR) and DL models (LSTM, LSTM variants, CNN, and hybrid models consisting of CNN with LSTM variants);
  • Finally, the graphs indicating the actual and predicted RUL of the cutting tool versus the actual machining time of milling are plotted for each condition, and a summary of all the obtained results is discussed.

5.1. Feature Extraction and Selection

The features are extracted in the TFD using STFT, CWT, and WPT. The statistical features are extracted from the generated time–frequency coefficient vectors. A total of 64 features are extracted in STFT and CWT each, as the number of input signals is eight (Figure 5) and eight statistical features (Table 4) are generated from each signal. In the WPT, the extracted coefficients are divided into approximate and detailed coefficients, generating a total of 128 features (64 approximate and 64 detailed). The extracted features for each method are shown in Table 6.
After feature extraction, the features are selected using PCC and RF methods. In PCC, features with a correlation greater than 0.2 are chosen, whereas in RF, features with a weightage greater than 0.5 are chosen. The selection of threshold values for PCC and RF is finalized after many iterations. The threshold values are kept constant for all the feature extraction techniques to compare model performance.
Figure 13 shows the change in the mean STFT representation of the individual sensor signals with respect to time. Table 7 shows the features selected using the PCC feature selection technique for the extracted STFT-based features. The feature names are indicated by the type of feature extraction technique, followed by the type of statistical feature considered and the signal considered for feature extraction. A total of twenty-one features that have a correlation coefficient greater than 0.2 are selected. Similarly, Table 8 shows the features selected using RF for STFT feature extraction. Out of 64 features, 31 high-weightage features are selected.
Figure 14 shows the change in the mean CWT representation of the individual sensor signals with respect to time. Table 9 and Table 10 indicate the features selected using PCC and RF from the extracted CWT features. Eleven features with a PCC value higher than 0.2 are selected for prediction model training and testing, and forty-three features are selected based on an RF weightage greater than 0.5.
Figure 15 shows the change in the mean WPT representation of the individual sensor signals with respect to time. Table 11 indicates the 26 features selected using the PCC technique from the 128 WPT features extracted at the first level of decomposition. The 'a' and 'd' indicate the extracted approximate and detailed feature coefficients, followed by the extracted statistical details and signal names.
Table 12 indicates the features selected using the RF feature selection method from the extracted WPT features. A total of 19 features are selected, with a weightage greater than 0.5.

5.2. Machine Learning Models Performance

The extracted and selected features are initially provided to different machine learning (ML) algorithms to check each model’s performance for RUL prediction. Various approaches for selecting features, including PCC and RFR methods, are used to assess the efficacy of each prediction model. In ML models, support vector machine (SVM), random forest regressor (RFR), and gradient boosting regressor (GBR) are used for RUL prediction.
Table 13 shows the performance evaluation for the different ML models using the PCC feature selection technique. The RUL prediction based on the ML models performs poorly compared to the DL algorithms. The maximum R2 value is 0.366 for the PCC-based feature selection method, given by the RFR model with WPT feature extraction. When, for the same extracted features, the features are selected using RF, the performance of the ML models improves slightly, as shown in Table 14. The WPT shows the maximum R2 of 0.496 for the RFR model with the RF-based feature selection method. The different DL models, such as LSTM, LSTM variants, CNN, and combinations of CNN with different LSTM variants, are used to improve the performance of the prediction models.

5.3. Deep Learning Model Performance

In the DL models, the extracted and selected features are initially provided to the different LSTM variants, such as Vanilla, Bi-directional, and Stack LSTM models, to check each model’s performance for RUL prediction. Similarly, the selected features are passed to the CNN model along with the hybrid model of CNN with different LSTM variants.
In this work, call-backs and an early-stopping approach are used to increase the performance and efficiency of the DL models. Call-backs are functions that can be set to execute at certain points during training, such as after each epoch or after a given number of batches have been processed. These capabilities can be utilized to carry out a range of operations, including altering the learning rate, tracking training progress, and preserving model checkpoints. Early stopping, on the other hand, is a technique used to prevent overfitting. A call-back that checks the validation performance at the end of each epoch and stops training if the performance has not improved for a given number of epochs can be used to enable early stopping in deep learning models. These two approaches increase the effectiveness of the training process by preventing overfitting, conserving time, and reducing the amount of computational resources needed.
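A hedged Keras sketch of this setup is shown below; the patience of three epochs matches the value mentioned in Section 5.3.3, while the checkpoint file name and remaining settings are illustrative assumptions.

```python
# Hedged sketch: early stopping and checkpointing call-backs for the DL models.
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    EarlyStopping(monitor='val_loss', patience=3,       # stop after 3 stalled
                  restore_best_weights=True),           # epochs; keep best weights
    ModelCheckpoint('best_rul_model.keras',             # illustrative file name
                    monitor='val_loss', save_best_only=True),
]

# Illustrative usage with any of the models defined above:
# model.fit(X_train, y_train, validation_split=0.2,
#           epochs=100, callbacks=callbacks)
```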
In this work, different performance evaluation parameters, such as R-squared (R2), RMSE, and MAPE, are considered to check the performance of each model. Generally, in regression, models with R2 values above 0.90 and MAPE values below 10% are considered to show good prediction performance.

5.3.1. RUL Prediction Using STFT Feature Extraction Technique

The RUL of the cutting tool is predicted using the STFT feature extraction technique. Table 15 shows the performance evaluation parameters of the different LSTM variant models using the STFT time–frequency-based feature extraction technique for RUL prediction. For the PCC-based feature selection technique, the stack LSTM shows the maximum testing accuracy of 0.802, with 0.125 and 7.372% as RMSE and MAPE values, respectively. With RFR feature selection, the stack LSTM provides a maximum R2 score of 0.782 as testing accuracy, with an RMSE value of 0.131 and a MAPE value of 8.52%.
Figure 16 shows the learning curves for the RUL prediction, indicating the loss vs. the number of epochs for each model using PCC and RFR feature selection in the STFT feature extraction technique. The graphs show that the losses are minimum for the highest R2 and minimum RMSE or MAPE values. The stack LSTM model shows the minimum loss for both the PCC- and RFR-based feature selection.
Figure 17 and Figure 18 show the graphs of the actual and predicted RUL of the cutting tool with respect to the total machining time for PCC-based and RFR-based feature selection. The stack LSTM shows the minimum deviation in RUL prediction for both feature selection techniques.
Similarly, Table 16 shows the performance evaluation parameters of the CNN and CNN-LSTM variant models using the STFT time–frequency-based feature extraction technique for RUL prediction. For the PCC-based feature selection technique, the CNN-LSTM shows the maximum testing accuracy of 0.881, with 0.097 and 6.877% as RMSE and MAPE values, respectively. With RFR feature selection, the CNN-bidirectional LSTM provides a maximum R2 score of 0.951 as testing accuracy, with an RMSE value of 0.062 and a MAPE value of 4.161%.
Figure 19 and Figure 20 show the actual and predicted values of the RUL for the PCC-based and RF-based feature selection techniques, respectively. In PCC-based feature selection, as the CNN-LSTM shows the maximum accuracy, Figure 19b shows the minimum deviation between the actual and predicted RUL values. Similarly, for RFR-based feature selection, in Figure 20d the CNN-stack-LSTM shows the minimum deviation between the actual and predicted RUL values with the maximum accuracy.

5.3.2. RUL Prediction Using CWT Feature Extraction Technique

Table 17 shows the performance evaluation parameters of the different LSTM variant models using the CWT time–frequency-based feature extraction technique for RUL prediction. In CWT with PCC-based feature selection, the vanilla LSTM shows the maximum testing accuracy of 0.851, with 0.104 and 7.359% as RMSE and MAPE values, respectively. With RFR feature selection, the stack LSTM provides a maximum R2 score of 0.927 as testing accuracy, with 0.075 and 5.781% as RMSE and MAPE values, respectively.
Figure 21 shows the learning curves for all six conditions of the models. From the learning curves, it is clear that the model that shows the maximum accuracy provides the minimum training and testing losses.
Figure 22 and Figure 23 show the graphical representations of the actual and predicted RUL of the different LSTM models with respect to machining time. The models with the highest accuracies for both feature selection methods demonstrate the least deviation between the real and predicted RUL values with respect to machining time.
Similarly, Table 18 shows the performance evaluation parameters of the CNN and CNN-LSTM variant models using the CWT time–frequency-based feature extraction technique for RUL prediction. For the PCC-based feature selection technique, the CNN-bidirectional-LSTM shows the maximum testing accuracy of 0.960, with 0.051 and 3.576% as RMSE and MAPE values, respectively. With RFR feature selection, the CNN-bidirectional LSTM provides a maximum R2 score of 0.971 as testing accuracy, with an RMSE value of 0.048 and a MAPE value of 3.428%.
Figure 24 and Figure 25 show the actual and predicted values of the RUL for the PCC-based and RF-based feature selection techniques, respectively, for the features extracted using the CWT method. In PCC-based feature selection, Figure 24c shows the minimum deviation between the actual and predicted RUL values, as the CNN-bidirectional-LSTM shows the maximum accuracy. Similarly, for RFR-based feature selection, in Figure 25c the CNN-bidirectional-LSTM shows the minimum deviation between the actual and predicted RUL values, with a maximum R-squared value of 0.96.

5.3.3. RUL Prediction Using WPT Feature Extraction Technique

The WPT is used to estimate the RUL of the cutting tool. Table 19 shows the performance evaluation parameters of the different LSTM variant models using the WPT time–frequency-based feature extraction technique for RUL prediction. For the PCC-based feature selection technique, the stack LSTM shows the maximum testing accuracy of 0.857, with 0.102 and 7.140% as RMSE and MAPE values, respectively. With RFR feature selection, the stack LSTM provides a maximum R2 score of 0.978 as training accuracy and 0.967 as testing accuracy, with an RMSE value of 0.051 and a MAPE value of 3.676%.
Figure 26 indicates the training and validation loss learning curves for the prediction of RUL employing the various LSTM variants with the PCC- and RFR-based feature selection techniques. In PCC-based feature selection, the vanilla LSTM shows minimum training losses at 51 epochs. The models use early stopping with a patience level of three in the call-back function to avoid overfitting. Meanwhile, in the RFR-based feature selection technique, the stack LSTM shows minimum training losses at 52 epochs and indicates the maximum accuracy for the same epochs.
Figure 27 and Figure 28 show the actual and predicted RUL vs. machining time of the cutting tool using the WPT feature extraction technique for PCC- and RFR-based feature selection, respectively, for the different LSTM variants. From the graphical representation, it is clear that the model with the highest prediction accuracy shows only a slight variation between the real and predicted RUL. In Figure 27, the vanilla LSTM shows the minimum deviation between the actual and predicted RUL, whereas, in Figure 28, the stack LSTM shows the lowest variation between the real and predicted RUL values.
Similarly, Table 20 shows the performance evaluation parameters of the CNN and CNN-LSTM variant models using the WPT time–frequency-based feature extraction technique for RUL prediction. For the PCC-based feature selection technique, the CNN-bidirectional-LSTM shows the maximum testing accuracy of 0.908, with 0.086 and 5.90% as RMSE and MAPE values, respectively. With RFR feature selection, the CNN-bidirectional-LSTM provides a maximum R2 score of 0.955 as testing accuracy, with an RMSE value of 0.056 and a MAPE value of 3.59%.
Figure 29 and Figure 30 show the graphical representation of the actual and predicted values of the RUL for the PCC-based and RF-based feature selection techniques, respectively, for the features extracted using the WPT method. In PCC-based feature selection, Figure 29c shows the minimum deviation between the actual and predicted RUL values, as the CNN-bidirectional-LSTM shows the maximum accuracy. Similarly, as shown in Figure 30d for RFR-based feature selection, the CNN-bidirectional-LSTM shows the minimum deviation between the actual and predicted RUL values.
Table 21 summarizes all the results from the prediction models, including LSTM, LSTM variants, CNN, and CNN with LSTM variants for case W1. In the STFT feature extraction technique, the CNN-stack-LSTM provides the maximum R2 value of 0.951 using the RF feature selection technique. In CWT feature extraction, the CNN-bidirectional LSTM provides a maximum R2 value of 0.971. In the WPT feature extraction technique, stack-LSTM provides the maximum R2 value of 0.967.
Similarly, the model performance is verified on case W2. In the case of W2, 18 runs are required to reach the maximum tool wear value of 0.30 mm. The DL models for RUL predictions provide good results, as summarized in Table 22.
The results show that the RF feature selection technique performs slightly better than the PCC-based feature selection technique. Tool wear, and hence RUL, is a non-linear and complex phenomenon. The PCC feature selection technique provides better results for linear relationships than non-linear ones, whereas the RF feature selection technique gives better results for non-linear relationships and complex models. In the RUL prediction models, the ML models show poor prediction performance, as they struggle to capture the complex, non-linear relationships in the cutting tool RUL data. In comparison, the DL models show fairly good prediction results. Based on the results of this work, it is observed that, compared to the plain CNN and LSTM models, the LSTM variants and hybrid models (CNN with LSTM variants) provide better results, as they capture the temporal aspects of the sequential or time-series signals used for RUL prediction of the cutting tool more easily and accurately.

6. Conclusions

In this work, the IEEE NUAA Ideahouse dataset is used for the cutting tool’s remaining useful life (RUL) prediction. Time–frequency feature extraction techniques such as STFT and WT are used to avoid the limitations of TD and FD feature extraction. The model prediction results are verified using the two cases (W1 and W2) from the dataset. The following conclusions are drawn from the obtained results:
  • The RF feature selection technique performs slightly better than the PCC-based feature selection technique. The RF feature selection technique gives better results for non-linear relationships and complex models;
  • The DL models such as LSTM, LSTM variants, CNN, and CNN with LSTM variants provide better prediction accuracies than ML models, as these models are effective for the time-series and complex non-linear cutting tool data for RUL estimation;
  • In STFT, CWT, and WPT feature extraction techniques, the highest value of R2 score is more than 0.95 for LSTM variants and hybrid (CNN with LSTM variants) prediction models;
  • The result shows that the TFD feature extraction technique is effective for RUL prediction with deep learning models such as LSTM, LSTM variants, CNN, and hybrid model CNN with LSTM variants.

Author Contributions

Conceptualization, S.S. and S.K.; methodology, S.S. and S.K.; validation, A.B., K.K. and A.A.; formal analysis, K.K. and A.A.; investigation, A.B.; resources, S.K. and K.K.; data curation, S.S.; writing—original draft preparation, S.S. and S.K.; writing—review and editing, A.B., K.K. and A.A.; visualization, S.K.; supervision, A.B.; project administration, K.K. and A.A.; funding acquisition, A.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Tong, X.; Liu, Q.; Pi, S.; Xiao, Y. Real-time machining data application and service based on IMT digital twin. J. Intell. Manuf. 2019, 31, 1113–1132. [Google Scholar] [CrossRef]
  2. Javed, K.; Gouriveau, R.; Li, X.; Zerhouni, N. Tool wear monitoring and prognostics challenges: A comparison of connectionist methods toward an adaptive ensemble model. J. Intell. Manuf. 2016, 29, 1873–1890. [Google Scholar] [CrossRef]
  3. Zonta, T.; da Costa, C.A.; Righi, R.D.R.; de Lima, M.J.; da Trindade, E.S.; Li, G.P. Predictive maintenance in the Industry 4.0: A systematic literature review. Comput. Ind. Eng. 2020, 150, 106889. [Google Scholar] [CrossRef]
  4. Sayyad, S.; Kumar, S.; Bongale, A.; Kamat, P.; Patil, S.; Kotecha, K. Data-Driven Remaining Useful Life Estimation for Milling Process: Sensors, Algorithms, Datasets, and Future Directions. IEEE Access 2021, 9, 110255–110286. [Google Scholar] [CrossRef]
  5. Liu, Y.C.; Chang, Y.J.; Liu, S.L.; Chen, S.P. Data-driven prognostics of remaining useful life for milling machine cutting tools. In Proceedings of the 2019 IEEE International Conference on Prognostics and Health Management, ICPHM 2019, San Francisco, CA, USA, 17–20 June 2019; pp. 1–5. [Google Scholar] [CrossRef]
  6. Tian, Z. An artificial neural network method for remaining useful life prediction of equipment subject to condition monitoring. J. Intell. Manuf. 2009, 23, 227–237. [Google Scholar] [CrossRef]
  7. Wang, Y.; Zhao, Y.; Addepalli, S. Remaining Useful Life Prediction using Deep Learning Approaches: A Review. Procedia Manuf. 2020, 49, 81–88. [Google Scholar] [CrossRef]
  8. Li, Y.; Liu, C.; Li, D.; Hua, J.; Wan, P. Documentation of Tool Wear Dataset of NUAA_Ideahouse. IEEE Dataport. 2021. Available online: https://ieee-dataport.org/open-access/tool-wear-dataset-nuaaideahouse (accessed on 6 January 2023).
  9. Hanachi, H.; Yu, W.; Kim, I.Y.; Liu, J.; Mechefske, C.K. Hybrid data-driven physics-based model fusion framework for tool wear prediction. Int. J. Adv. Manuf. Technol. 2018, 101, 2861–2872. [Google Scholar] [CrossRef]
  10. Liang, Y.; Wang, S.; Li, W.; Lu, X. Data-Driven Anomaly Diagnosis for Machining Processes. Engineering 2019, 5, 646–652. [Google Scholar] [CrossRef]
  11. Wu, D.; Jennings, C.; Terpenny, J.; Gao, R.; Kumara, S. Data-Driven Prognostics Using Random Forests: Prediction of Tool Wear. 2017. Available online: http://proceedings.asmedigitalcollection.asme.org/pdfaccess.ashx?url=/data/conferences/asmep/93280/ (accessed on 6 January 2023).
  12. Dimla, D.E. Sensor signals for tool-wear monitoring in metal cutting operations—A review of methods. Int. J. Mach. Tools Manuf. 2000, 40, 1073–1098. [Google Scholar] [CrossRef]
  13. Sick, B. Online and indirect tool wear monitoring in turning with artificial neural networks: A review of more than a decade of research. Mech. Syst. Signal Process. 2002, 16, 487–546. [Google Scholar] [CrossRef]
  14. Sayyad, S.; Kumar, S.; Bongale, A.; Bongale, A.M.; Patil, S. Estimating Remaining Useful Life in Machines Using Artificial Intelligence: A Scoping Review. Libr. Philos. Pract. 2021, 2021, 1–26. [Google Scholar]
  15. Zhou, Y.; Liu, C.; Yu, X.; Liu, B.; Quan, Y. Tool wear mechanism, monitoring and remaining useful life (RUL) technology based on big data: A review. SN Appl. Sci. 2022, 4, 232. [Google Scholar] [CrossRef]
  16. Tran, V.T.; Pham, H.T.; Yang, B.-S.; Nguyen, T.T. Machine performance degradation assessment and remaining useful life prediction using proportional hazard model and support vector machine. Mech. Syst. Signal Process. 2012, 32, 320–330. [Google Scholar] [CrossRef] [Green Version]
  17. Liu, M.; Yao, X.; Zhang, J.; Chen, W.; Jing, X.; Wang, K. Multi-Sensor Data Fusion for Remaining Useful Life Prediction of Machining Tools by iabc-bpnn in Dry Milling Operations. Sensors 2020, 20, 4657. [Google Scholar] [CrossRef]
  18. Zhang, C.; Yao, X.; Zhang, J.; Jin, H. Tool Condition Monitoring and Remaining Useful Life Prognostic Based on a Wireless Sensor in Dry Milling Operations. Sensors 2016, 16, 795. [Google Scholar] [CrossRef] [Green Version]
  19. Zhou, Y.; Xue, W. A Multi-sensor Fusion Method for Tool Condition Monitoring in Milling. Sensors 2018, 18, 3866. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  20. Thirukkumaran, K.; Mukhopadhyay, C.K. Analysis of Acoustic Emission Signal to Characterization the Damage Mechanism During Drilling of Al-5%SiC Metal Matrix Composite. Silicon 2020, 13, 309–325. [Google Scholar] [CrossRef]
  21. da Costa, C.; Kashiwagi, M.; Mathias, M.H. Rotor failure detection of induction motors by wavelet transform and Fourier transform in non-stationary condition. Case Stud. Mech. Syst. Signal Process. 2015, 1, 15–26. [Google Scholar] [CrossRef] [Green Version]
  22. Delsy, T.T.M.; Nandhitha, N.M.; Rani, B.S. RETRACTED ARTICLE: Feasibility of spectral domain techniques for the classification of non-stationary signals. J. Ambient. Intell. Humaniz. Comput. 2020, 12, 6347–6354. [Google Scholar] [CrossRef]
  23. Zhu, K.; Wong, Y.S.; Hong, G.S. Wavelet analysis of sensor signals for tool condition monitoring: A review and some new results. Int. J. Mach. Tools Manuf. 2009, 49, 537–553. [Google Scholar] [CrossRef]
  24. Hong, Y.-S.; Yoon, H.-S.; Moon, J.-S.; Cho, Y.-M.; Ahn, S.-H. Tool-wear monitoring during micro-end milling using wavelet packet transform and Fisher’s linear discriminant. Int. J. Precis. Eng. Manuf. 2016, 17, 845–855. [Google Scholar] [CrossRef]
  25. Segreto, T.; D’addona, D.; Teti, R. Tool wear estimation in turning of Inconel 718 based on wavelet sensor signal analysis and machine learning paradigms. Prod. Eng. 2020, 14, 693–705. [Google Scholar] [CrossRef]
  26. Rafezi, H.; Akbari, J.; Behzad, M. Tool Condition Monitoring based on sound and vibration analysis and wavelet packet decomposition. In Proceedings of the 2012 8th International Symposium on Mechatronics and Its Applications, Sharjah, United Arab Emirates, 10–12 April 2012; pp. 1–4. [Google Scholar] [CrossRef]
  27. Xiang, Z.; Feng, X. Tool Wear State Monitoring Based on Long-Term and Short-Term Memory Neural Network; Springer: Singapore, 2020; Volume 593. [Google Scholar] [CrossRef]
  28. Ganesan, R.; Das, T.K.; Venkataraman, V. Wavelet-based multiscale statistical process monitoring: A literature review. IIE Trans. Inst. Ind. Eng. 2004, 36, 787–806. [Google Scholar] [CrossRef]
  29. Strackeljan, J.; Lahdelma, S. Smart Adaptive Monitoring and Diagnostic Systems. In Proceedings of the 2nd International Seminar on Maintenance, Condition Monitoring and Diagnostics, Oulu, Finland, 28–29 September 2005. [Google Scholar]
  30. Wang, L.; Gao, R. Condition Monitoring and Control for Intelligent Manufacturing; Springer: Berlin/Heidelberg, Germany, 2006; Available online: https://www.springer.com/gp/book/9781846282683 (accessed on 6 January 2023).
  31. Burrus, C.S.; Gopinath, R.A.; Guo, H. Introduction to Wavelets and Wavelet Transforms—A Primer; Prentice Hall: Upper Saddle River, NJ, USA, 1997. [Google Scholar]
  32. Hosameldin, A.; Asoke, N. Condition Monitoring with Vibration Signals; Wiley-IEEE Press: Piscataway, NJ, USA, 2020. [Google Scholar]
  33. Liu, M.-K.; Tseng, Y.-H.; Tran, M.-Q. Tool wear monitoring and prediction based on sound signal. Int. J. Adv. Manuf. Technol. 2019, 103, 3361–3373. [Google Scholar] [CrossRef]
  34. Li, X.; Lim, B.S.; Zhou, J.H.; Huang, S.; Phua, S.J.; Shaw, K.C.; Er, M.J. Fuzzy neural network modelling for tool wear estimation in dry milling operation. In Proceedings of the Annual Conference of the Prognostics and Health Management Society, PHM 2009, San Diego, CA, USA, 27 September–1 October 2009; pp. 1–11. [Google Scholar]
  35. Nettleton, D. Selection of Variables and Factor Derivation. In Commercial Data Mining; Morgan Kaufmann: Boston, MA, USA, 2014; pp. 79–104. [Google Scholar] [CrossRef]
  36. Sayyad, S.; Kumar, S.; Bongale, A.; Kotecha, K.; Selvachandran, G.; Suganthan, P.N. Tool wear prediction using long short-term memory variants and hybrid feature selection techniques. Int. J. Adv. Manuf. Technol. 2022, 121, 6611–6633. [Google Scholar] [CrossRef]
  37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef]
  38. Chan, Y.-W.; Kang, T.-C.; Yang, C.-T.; Chang, C.-H.; Huang, S.-M.; Tsai, Y.-T. Tool wear prediction using convolutional bidirectional LSTM networks. J. Supercomput. 2021, 78, 810–832. [Google Scholar] [CrossRef]
  39. Lindemann, B.; Maschler, B.; Sahlab, N.; Weyrich, M. A survey on anomaly detection for technical systems using LSTM networks. Comput. Ind. 2021, 131, 103498. [Google Scholar] [CrossRef]
  40. An, Q.; Tao, Z.; Xu, X.; El Mansori, M.; Chen, M. A data-driven model for milling tool remaining useful life prediction with convolutional and stacked LSTM network. Measurement 2019, 154, 107461. [Google Scholar] [CrossRef]
  41. Wu, Z.; Christofides, P.D. Economic Machine-Learning-Based Predictive Control of Non-linear Systems. Mathematics 2019, 7, 494. [Google Scholar] [CrossRef] [Green Version]
  42. Chen, C.-W.; Tseng, S.-P.; Kuan, T.-W.; Wang, J.-F. Outpatient Text Classification Using Attention-Based Bidirectional LSTM for Robot-Assisted Servicing in Hospital. Information 2020, 11, 106. [Google Scholar] [CrossRef] [Green Version]
  43. Le, X.H.; Ho, H.V.; Lee, G.; Jung, S. Application of Long Short-Term Memory (LSTM) Neural Network for Flood Forecasting. Water 2019, 11, 1387. [Google Scholar] [CrossRef] [Green Version]
  44. Chatterjee, A.; Gerdes, M.W.; Martinez, S.G. Statistical Explorations and Univariate Timeseries Analysis on COVID-19 Datasets to Understand the Trend of Disease Spreading and Death. Sensors 2020, 20, 3089. [Google Scholar] [CrossRef] [PubMed]
  45. Chandra, R.; Goyal, S.; Gupta, R. Evaluation of Deep Learning Models for Multi-Step Ahead Time Series Prediction. IEEE Access 2021, 9, 83105–83123. [Google Scholar] [CrossRef]
  46. Zhao, R.; Yan, R.; Wang, J.; Mao, K. Learning to Monitor Machine Health with Convolutional Bi-Directional LSTM Networks. Sensors 2017, 17, 273. [Google Scholar] [CrossRef] [Green Version]
  47. Kumar, S.; Kolekar, T.; Kotecha, K.; Patil, S.; Bongale, A. Performance evaluation for tool wear prediction based on Bi-directional, Encoder–Decoder and Hybrid Long Short-Term Memory models. Int. J. Qual. Reliab. Manag. 2022, 39, 1551–1576. [Google Scholar] [CrossRef]
  48. Zhang, X.; Lu, X.; Li, W.; Wang, S. Prediction of the remaining useful life of cutting tool using the Hurst exponent and CNN-LSTM. Int. J. Adv. Manuf. Technol. 2021, 112, 2277–2299. [Google Scholar] [CrossRef]
  49. Agga, A.; Abbou, A.; Labbadi, M.; El Houm, Y.; Ali, I.H.O. CNN-LSTM: An efficient hybrid deep learning architecture for predicting short-term photovoltaic power production. Electr. Power Syst. Res. 2022, 208, 107908. [Google Scholar] [CrossRef]
  50. Lin, Y.-C.; Wu, K.-D.; Shih, W.-C.; Hsu, P.-K.; Hung, J.-P. Prediction of Surface Roughness Based on Cutting Parameters and Machining Vibration in End Milling Using Regression Method and Artificial Neural Network. Appl. Sci. 2020, 10, 3941. [Google Scholar] [CrossRef]
Figure 1. Concept of remaining useful life (RUL) of the equipment.
Figure 2. Generalized data-driven model for RUL prediction.
Figure 3. Comparison of windowing approaches: (a) Time domain; (b) Frequency domain (fast Fourier-transform); (c) Short-time Fourier-transform (STFT); (d) Wavelet analysis.
Figure 4. WPT with three levels of decomposition.
Figure 5. Methodology for tool RUL prediction using different TFD feature extraction methods and LSTM variants.
Figure 5. Methodology for tool RUL prediction using different TFD feature extraction methods and LSTM variants.
Sensors 23 05659 g005
Figure 6. The schematic diagram of the test rig setup for the NUAA Ideahouse dataset.
Figure 6. The schematic diagram of the test rig setup for the NUAA Ideahouse dataset.
Sensors 23 05659 g006
Figure 7. Monitoring signals acquisition for NUAA Ideahouse milling dataset.
Figure 7. Monitoring signals acquisition for NUAA Ideahouse milling dataset.
Sensors 23 05659 g007
Figure 8. Raw data representation of all the sensor signals to time for the W1 case.
Figure 8. Raw data representation of all the sensor signals to time for the W1 case.
Sensors 23 05659 g008
Figure 9. Flowchart of training and testing phases of the RUL prediction approach.
Figure 9. Flowchart of training and testing phases of the RUL prediction approach.
Sensors 23 05659 g009
Figure 10. The architecture of the LSTM unit.
Figure 10. The architecture of the LSTM unit.
Sensors 23 05659 g010
Figure 11. LSTM variants. (a) Normal (Vanilla), (b) Bi-Directional, and (c) Stack.
Figure 11. LSTM variants. (a) Normal (Vanilla), (b) Bi-Directional, and (c) Stack.
Sensors 23 05659 g011
Figure 12. Hybrid CNN-LSTM architecture for RUL prediction.
Figure 12. Hybrid CNN-LSTM architecture for RUL prediction.
Sensors 23 05659 g012
Figure 13. Extracted mean STFT sensor signals representation with time.
Figure 13. Extracted mean STFT sensor signals representation with time.
Sensors 23 05659 g013
Figure 14. Extracted mean CWT sensor signals representation with time.
Figure 14. Extracted mean CWT sensor signals representation with time.
Sensors 23 05659 g014
Figure 15. Extracted mean WPT sensor signals representation with time.
Figure 15. Extracted mean WPT sensor signals representation with time.
Sensors 23 05659 g015
Figure 16. RUL prediction learning curves using STFT-based feature extraction for different LSTM variants.
Figure 16. RUL prediction learning curves using STFT-based feature extraction for different LSTM variants.
Sensors 23 05659 g016
Figure 17. The actual and predicted value of RUL versus machining time for STFT and PCC-based feature selection using different LSTM variants. (a) Vanilla. (b) Bi-directional, and (c) Stack.
Figure 17. The actual and predicted value of RUL versus machining time for STFT and PCC-based feature selection using different LSTM variants. (a) Vanilla. (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g017
Figure 18. The actual and predicted value of RUL versus machining time for STFT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Figure 18. The actual and predicted value of RUL versus machining time for STFT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g018
Figure 19. The actual and predicted values of RUL versus machining time for STFT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 19. The actual and predicted values of RUL versus machining time for STFT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g019
Figure 20. The actual and predicted value of RUL versus machining time for STFT and RF-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 20. The actual and predicted value of RUL versus machining time for STFT and RF-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g020aSensors 23 05659 g020b
Figure 21. RUL prediction learning curves using CWT-based feature extraction for different LSTM variants.
Figure 21. RUL prediction learning curves using CWT-based feature extraction for different LSTM variants.
Sensors 23 05659 g021aSensors 23 05659 g021b
Figure 22. The actual and predicted values of RUL versus machining time for CWT and PCC-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Figure 22. The actual and predicted values of RUL versus machining time for CWT and PCC-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g022aSensors 23 05659 g022b
Figure 23. The actual and predicted value of RUL versus machining time for CWT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Figure 23. The actual and predicted value of RUL versus machining time for CWT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g023aSensors 23 05659 g023b
Figure 24. The actual and predicted value of RUL versus machining time for CWT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 24. The actual and predicted value of RUL versus machining time for CWT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g024
Figure 25. The actual and predicted value of RUL versus machining time for CWT and RFR-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 25. The actual and predicted value of RUL versus machining time for CWT and RFR-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g025
Figure 26. RUL prediction learning curves using WPT-based feature extraction for different LSTM variants.
Figure 26. RUL prediction learning curves using WPT-based feature extraction for different LSTM variants.
Sensors 23 05659 g026aSensors 23 05659 g026b
Figure 27. The actual and predicted values of RUL versus machining time for WPT and PCC-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Figure 27. The actual and predicted values of RUL versus machining time for WPT and PCC-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g027aSensors 23 05659 g027b
Figure 28. The Actual and predicted values of RUL versus machining time for WPT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Figure 28. The Actual and predicted values of RUL versus machining time for WPT and RFR-based feature selection using different LSTM variants. (a) Vanilla, (b) Bi-directional, and (c) Stack.
Sensors 23 05659 g028aSensors 23 05659 g028b
Figure 29. The actual and predicted value of RUL versus machining time for WPT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 29. The actual and predicted value of RUL versus machining time for WPT and PCC-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g029
Figure 30. The actual and predicted value of RUL versus machining time for WPT and RFR-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Figure 30. The actual and predicted value of RUL versus machining time for WPT and RFR-based feature selection using different models. (a) CCN, (b) CNN-LSTM, (c) CNN-bidirectional LSTM, and (d) CNN-Stack LSTM.
Sensors 23 05659 g030
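Figures 3 and 4 contrast the fixed-window STFT with the multi-resolution wavelet transforms. The following is a minimal sketch of the three transforms using SciPy and PyWavelets; the sampling rate, window length, wavelet families ("morl", "db4"), and decomposition level are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np
from scipy import signal
import pywt

fs = 400                                   # sampling rate (Hz), assumed
t = np.arange(0, 2.0, 1 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.random.randn(t.size)  # stand-in sensor signal

# (c) Short-time Fourier transform: fixed window, uniform time-frequency resolution
f, tau, Zxx = signal.stft(x, fs=fs, nperseg=128)

# (d) Continuous wavelet transform: scale-dependent resolution
scales = np.arange(1, 64)
coeffs, freqs = pywt.cwt(x, scales, "morl", sampling_period=1 / fs)

# Figure 4: wavelet packet transform with three decomposition levels
wp = pywt.WaveletPacket(data=x, wavelet="db4", maxlevel=3)
leaves = [node.path for node in wp.get_level(3, "natural")]
print(len(leaves))  # 8 terminal nodes at level 3, as drawn in Figure 4
```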
Table 1. Signal acquisition equipment and sampling frequency.

Signal Category           | Acquisition Equipment       | Sample Frequency (Hz)
Spindle current and power | PLC                         | 300
Vibration                 | PCB™ W356B11                | 400
Cutting force             | Spike™ sensory tool holder  | 600
Table 2. Details of the orthogonal experiment of IEEE NUAA Ideahouse.

Case | Feed per Tooth (mm/r) | Spindle Speed (r/min) | Axial Cutting Depth (mm) | Tool Material | Workpiece Material
W1   | 0.045 | 1750 | 2.5 | Solid carbide | TC4
W2   | 0.045 | 1800 | 3.0 | Solid carbide | TC4
W3   | 0.045 | 1850 | 3.5 | Solid carbide | TC4
W4   | 0.050 | 1750 | 3.0 | Solid carbide | TC4
W5   | 0.050 | 1800 | 3.5 | Solid carbide | TC4
W6   | 0.050 | 1850 | 2.5 | Solid carbide | TC4
W7   | 0.055 | 1750 | 3.5 | Solid carbide | TC4
W8   | 0.055 | 1800 | 2.5 | Solid carbide | TC4
W9   | 0.055 | 1850 | 3.0 | Solid carbide | TC4
Table 3. Tool wear labels for the W1 case. Flank wear is given in mm for each of the four flutes.

Sr. No. | Flute-1 | Flute-2 | Flute-3 | Flute-4 | Sr. No. | Flute-1 | Flute-2 | Flute-3 | Flute-4
1  | 0.05 | 0.12 | 0.10 | 0.05 | 16 | 0.17 | 0.23 | 0.21 | 0.14
2  | 0.10 | 0.14 | 0.10 | 0.05 | 17 | 0.18 | 0.23 | 0.21 | 0.14
3  | 0.12 | 0.14 | 0.11 | 0.09 | 18 | 0.18 | 0.23 | 0.21 | 0.15
4  | 0.12 | 0.15 | 0.13 | 0.10 | 19 | 0.18 | 0.23 | 0.21 | 0.15
5  | 0.13 | 0.16 | 0.15 | 0.10 | 20 | 0.19 | 0.23 | 0.21 | 0.15
6  | 0.13 | 0.18 | 0.16 | 0.10 | 21 | 0.19 | 0.24 | 0.22 | 0.15
7  | 0.14 | 0.18 | 0.16 | 0.10 | 22 | 0.19 | 0.24 | 0.22 | 0.15
8  | 0.15 | 0.18 | 0.16 | 0.12 | 23 | 0.19 | 0.24 | 0.22 | 0.15
9  | 0.16 | 0.19 | 0.17 | 0.12 | 24 | 0.19 | 0.24 | 0.22 | 0.15
10 | 0.16 | 0.20 | 0.18 | 0.12 | 25 | 0.19 | 0.25 | 0.24 | 0.15
11 | 0.16 | 0.21 | 0.18 | 0.12 | 26 | 0.19 | 0.25 | 0.25 | 0.15
12 | 0.17 | 0.21 | 0.19 | 0.13 | 27 | 0.20 | 0.25 | 0.25 | 0.15
13 | 0.17 | 0.22 | 0.20 | 0.13 | 28 | 0.20 | 0.26 | 0.26 | 0.15
14 | 0.17 | 0.22 | 0.21 | 0.13 | 29 | 0.20 | 0.26 | 0.26 | 0.15
15 | 0.17 | 0.22 | 0.21 | 0.14 | 30 | 0.20 | 0.27 | 0.26 | 0.15
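For intuition, wear records such as Table 3 can be converted into the RUL labels of Figure 1 by fixing a wear criterion and counting the cuts that remain until the worst flute reaches it. The sketch below is a hedged illustration; the 0.2 mm threshold and the worst-flute rule are assumptions for the example, not the dataset's failure criterion.

```python
import numpy as np

# Flank wear per cut for the four flutes (mm), e.g., rows of Table 3
wear = np.array([
    [0.05, 0.12, 0.10, 0.05],
    [0.10, 0.14, 0.10, 0.05],
    # ... remaining cuts ...
    [0.20, 0.27, 0.26, 0.15],
])

THRESHOLD = 0.2           # illustrative wear criterion (mm), assumed
worst = wear.max(axis=1)  # tool condition governed by the worst flute
# first cut index at which the criterion is met
# (assumes the criterion is reached within the record; np.argmax returns 0 otherwise)
fail = int(np.argmax(worst >= THRESHOLD))
# RUL label per cut: cuts remaining before failure, clipped at zero
rul = np.clip(fail - np.arange(len(wear)), 0, None)
print(rul)
```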
Table 4. The statistical features and their formulae, where $s_i$ is the $i$-th sample of a signal window of length $N$, $s_m$ is the window mean, and $S_{\sigma}$ is the standard deviation.

Sr. No. | Statistical Feature | Formula
1 | Mean | $S_{mean} = \frac{1}{N}\sum_{i=1}^{N} s_i$
2 | Standard deviation | $S_{std} = \sqrt{\frac{\sum_{i=1}^{N}(s_i - s_m)^2}{N-1}}$
3 | Variance | $S_{var} = \frac{\sum_{i=1}^{N}(s_i - s_m)^2}{N-1}$
4 | Kurtosis | $S_{kur} = \frac{\sum_{i=1}^{N}(s_i - s_m)^4}{(N-1)\,S_{\sigma}^4}$
5 | Skewness | $S_{skew} = \frac{\sum_{i=1}^{N}(s_i - s_m)^3}{(N-1)\,S_{\sigma}^3}$
6 | Root mean square | $S_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} s_i^2}$
7 | Peak to peak | $S_{peak} = \max(s_i) - \min(s_i)$
8 | Peak amplitude (crest factor) | $S_{CF} = \frac{\max\left|s_i\right|}{\frac{1}{N}\sum_{i=1}^{N}\left|s_i\right|}$
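The eight statistics of Table 4 translate directly into NumPy. The sketch below is a minimal transcription, assuming the input is one window of TFD coefficients for a single sensor channel; the peak-amplitude ratio is taken against the mean absolute value, as in the last row of the table.

```python
import numpy as np

def statistical_features(s):
    """Compute the eight statistics of Table 4 for one signal window."""
    s = np.asarray(s, dtype=float)
    n = s.size
    mean = s.mean()
    std = s.std(ddof=1)            # (N - 1) in the denominator, as in Table 4
    var = s.var(ddof=1)
    kurt = np.sum((s - mean) ** 4) / ((n - 1) * std ** 4)
    skew = np.sum((s - mean) ** 3) / ((n - 1) * std ** 3)
    rms = np.sqrt(np.mean(s ** 2))
    p2p = s.max() - s.min()
    crest = np.abs(s).max() / np.mean(np.abs(s))  # peak amplitude (crest factor)
    return [mean, std, var, kurt, skew, rms, p2p, crest]
```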
Table 5. Model performance parameters, where $a_i$ is the actual RUL, $\hat{a}_i$ the predicted RUL, and $\bar{a}$ the mean of the actual values.

Sr. No. | Performance Parameter | Formula
1 | R-squared (R2) | $R^2 = 1 - \frac{\sum_{i=1}^{n}(\hat{a}_i - a_i)^2}{\sum_{i=1}^{n}(a_i - \bar{a})^2}$
2 | RMSE | $RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(\hat{a}_i - a_i)^2}$
3 | MAPE | $MAPE = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{a_i - \hat{a}_i}{a_i}\right| \times 100$
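The three performance parameters of Table 5 take only a few lines to compute. A minimal sketch, assuming `actual` and `pred` are NumPy arrays of actual and predicted RUL; MAPE is reported in percent, matching the scale of the values in Tables 13 to 22.

```python
import numpy as np

def r2(actual, pred):
    ss_res = np.sum((pred - actual) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1 - ss_res / ss_tot

def rmse(actual, pred):
    return np.sqrt(np.mean((pred - actual) ** 2))

def mape(actual, pred):
    return 100 * np.mean(np.abs((actual - pred) / actual))
```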
Table 6. Feature extraction methods and their feature counts.

Feature Extraction Method | Feature Count
STFT | 64
CWT  | 64
WPT  | 128
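The counts in Table 6 follow from applying the eight statistics of Table 4 to each monitored channel. Assuming the eight channels that appear in the feature names of Tables 7 to 12 (two vibration axes, two bending moments, torsion, axial force, spindle current, and spindle power), STFT and CWT each yield 8 × 8 = 64 features, while WPT yields 128 because the statistics are computed on both the approximation (a_) and detail (d_) coefficients. A sketch of the naming scheme:

```python
channels = ["vib_x", "vib_y", "Bending_Moment_X", "Bending_Moment_Y",
            "Torsion_Z", "Axial_Force", "Spindle_current", "Spindle_power"]
stats = ["mean", "std", "var", "kurtosis", "skew", "rms", "p2p", "peak_amp"]

stft_features = [f"stft_{s}_{c}" for c in channels for s in stats]  # 64
cwt_features  = [f"cwt_{s}_{c}"  for c in channels for s in stats]  # 64
wpt_features  = [f"{band}_{s}_{c}" for band in ("a", "d")           # approx + detail
                 for c in channels for s in stats]                  # 128
print(len(stft_features), len(cwt_features), len(wpt_features))
```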
Table 7. Selected features using PCC from the STFT feature extraction technique.

Feature Count | Feature | PCC | Feature Count | Feature | PCC
1  | stft_rms_Bending_Moment_Y      | 0.563 | 12 | stft_p2p_vib_x                  | 0.290
2  | stft_std_Bending_Moment_Y      | 0.500 | 13 | stft_peak_amp_vib_x             | 0.290
3  | stft_mean_Bending_Moment_Y     | 0.499 | 14 | stft_skew_Bending_Moment_Y      | 0.282
4  | stft_var_Bending_Moment_Y      | 0.498 | 15 | stft_rms_vib_x                  | 0.274
5  | stft_p2p_Bending_Moment_Y      | 0.493 | 16 | stft_kurtosis_Bending_Moment_Y  | 0.270
6  | stft_peak_amp_Bending_Moment_Y | 0.493 | 17 | stft_std_Bending_Moment_X       | 0.256
7  | stft_mean_Torsion_Z            | 0.348 | 18 | stft_var_Bending_Moment_X       | 0.255
8  | stft_skew_Torsion_Z            | 0.327 | 19 | stft_peak_amp_Bending_Moment_X  | 0.253
9  | stft_kurtosis_Torsion_Z        | 0.317 | 20 | stft_p2p_Bending_Moment_X       | 0.253
10 | stft_var_vib_x                 | 0.300 | 21 | stft_rms_Bending_Moment_X       | 0.243
11 | stft_std_vib_x                 | 0.297 |    |                                 |
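Rankings such as Table 7 can be reproduced by correlating each extracted feature with the target and keeping the strongest ones. A minimal sketch, assuming `X` is a pandas DataFrame of extracted features and `y` the wear or RUL target; the 0.24 cut-off is an illustrative assumption consistent with the smallest coefficient retained above.

```python
import numpy as np
import pandas as pd

def select_by_pcc(X: pd.DataFrame, y: pd.Series, threshold: float = 0.24):
    """Rank features by |Pearson correlation| with the target and keep
    those above the cut-off, as in Tables 7, 9, and 11."""
    pcc = X.apply(lambda col: abs(np.corrcoef(col, y)[0, 1]))
    ranked = pcc.sort_values(ascending=False)
    return ranked[ranked >= threshold]
```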
Table 8. Selected features using RF from the STFT feature extraction technique.

Feature Count | Feature | Weight | Feature Count | Feature | Weight
1  | stft_rms_Bending_Moment_Y  | 32.87 | 17 | stft_peak_amp_Torsion_Z        | 1.43
2  | stft_mean_Torsion_Z        | 7.13  | 18 | stft_var_vib_x                 | 1.39
3  | stft_skew_Torsion_Z        | 4.37  | 19 | stft_var_vib_y                 | 1.25
4  | stft_peak_amp_vib_x        | 4.15  | 20 | stft_mean_Axial_Force          | 1.15
5  | stft_p2p_vib_x             | 4.01  | 21 | stft_var_Torsion_Z             | 0.80
6  | stft_mean_Bending_Moment_Y | 3.83  | 22 | stft_std_Torsion_Z             | 0.74
7  | stft_std_vib_x             | 3.20  | 23 | stft_var_Bending_Moment_Y      | 0.66
8  | stft_var_Axial_Force       | 2.78  | 24 | stft_kurtosis_Bending_Moment_Y | 0.65
9  | stft_rms_Axial_Force       | 2.76  | 25 | stft_skew_Bending_Moment_Y     | 0.62
10 | stft_std_Axial_Force       | 2.70  | 26 | stft_kurtosis_Torsion_Z        | 0.61
11 | stft_rms_vib_x             | 2.51  | 27 | stft_mean_vib_y                | 0.59
12 | stft_peak_amp_Axial_Force  | 2.39  | 28 | stft_rms_vib_y                 | 0.58
13 | stft_p2p_vib_y             | 2.00  | 29 | stft_std_Bending_Moment_Y      | 0.57
14 | stft_p2p_Torsion_Z         | 1.75  | 30 | stft_std_vib_y                 | 0.55
15 | stft_p2p_Axial_Force       | 1.72  | 31 | stft_rms_Torsion_Z             | 0.54
16 | stft_peak_amp_vib_y        | 1.59  |    |                                |
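The weights in Tables 8, 10, and 12 are consistent with random-forest impurity importances expressed as percentages. A hedged sketch, reusing the `X` and `y` placeholders introduced above; the forest hyperparameters are assumptions, not the paper's settings.

```python
from sklearn.ensemble import RandomForestRegressor

# Rank features by random-forest importance, as in Tables 8, 10, and 12.
rf = RandomForestRegressor(n_estimators=100, random_state=42)  # assumed settings
rf.fit(X, y)
weights = 100 * rf.feature_importances_  # expressed as percentages
order = weights.argsort()[::-1]
for i in order[:10]:                     # ten strongest features
    print(f"{X.columns[i]}: {weights[i]:.2f}")
```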
Table 9. Selected features using PCC from the CWT feature extraction technique.

Feature Count | Feature | PCC | Feature Count | Feature | PCC
1 | cwt_rms_Bending_Moment_Y | 0.372 | 7  | cwt_p2p_Torsion_Z             | 0.326
2 | cwt_std_Bending_Moment_Y | 0.372 | 8  | cwt_var_Torsion_Z             | 0.322
3 | cwt_var_Bending_Moment_Y | 0.370 | 9  | cwt_peak_amp_Bending_Moment_Y | 0.306
4 | cwt_std_Torsion_Z        | 0.333 | 10 | cwt_p2p_Bending_Moment_Y      | 0.306
5 | cwt_rms_Torsion_Z        | 0.333 | 11 | cwt_skew_Torsion_Z            | 0.201
6 | cwt_peak_amp_Torsion_Z   | 0.326 |    |                               |
Table 10. Selected features using RF from the CWT feature extraction technique.

Feature Count | Feature | Weight | Feature Count | Feature | Weight
1  | cwt_mean_Bending_Moment_Y     | 23.21 | 23 | cwt_kurtosis_Spindle_current  | 0.93
2  | cwt_mean_Torsion_Z            | 12.49 | 24 | cwt_skew_Axial_Force          | 0.92
3  | cwt_mean_Bending_Moment_X     | 7.28  | 25 | cwt_kurtosis_Spindle_power    | 0.89
4  | cwt_rms_Torsion_Z             | 5.06  | 26 | cwt_peak_amp_Bending_Moment_X | 0.89
5  | cwt_rms_Bending_Moment_Y      | 3.40  | 27 | cwt_std_Bending_Moment_X      | 0.87
6  | cwt_skew_Bending_Moment_Y     | 2.64  | 28 | cwt_kurtosis_Axial_Force      | 0.82
7  | cwt_rms_Bending_Moment_X      | 2.16  | 29 | cwt_rms_Axial_Force           | 0.82
8  | cwt_skew_Torsion_Z            | 1.95  | 30 | cwt_skew_Spindle_power        | 0.82
9  | cwt_kurtosis_Bending_Moment_X | 1.76  | 31 | cwt_var_Bending_Moment_X      | 0.80
10 | cwt_var_Bending_Moment_Y      | 1.62  | 32 | cwt_skew_Spindle_current      | 0.73
11 | cwt_std_Torsion_Z             | 1.57  | 33 | cwt_p2p_Bending_Moment_X      | 0.73
12 | cwt_std_Bending_Moment_Y      | 1.53  | 34 | cwt_kurtosis_vib_x            | 0.71
13 | cwt_var_Torsion_Z             | 1.47  | 35 | cwt_peak_amp_Axial_Force      | 0.69
14 | cwt_mean_Axial_Force          | 1.37  | 36 | cwt_std_Axial_Force           | 0.69
15 | cwt_kurtosis_Torsion_Z        | 1.30  | 37 | cwt_var_Axial_Force           | 0.69
16 | cwt_kurtosis_Bending_Moment_Y | 1.27  | 38 | cwt_skew_vib_y                | 0.67
17 | cwt_skew_Bending_Moment_X     | 1.16  | 39 | cwt_mean_Spindle_current      | 0.67
18 | cwt_peak_amp_Bending_Moment_Y | 1.05  | 40 | cwt_mean_Spindle_power        | 0.66
19 | cwt_p2p_Torsion_Z             | 1.04  | 41 | cwt_p2p_Axial_Force           | 0.66
20 | cwt_skew_vib_x                | 1.02  | 42 | cwt_mean_vib_y                | 0.59
21 | cwt_peak_amp_Torsion_Z        | 0.94  | 43 | cwt_p2p_Spindle_current       | 0.51
22 | cwt_p2p_Bending_Moment_Y      | 0.93  |    |                               |
Table 11. Selected features using PCC from the WPT feature extraction technique.

Feature Count | Feature | PCC | Feature Count | Feature | PCC
1  | a_rms_Bending_Moment_Y      | 0.66 | 14 | a_peak_amp_Bending_Moment_Y | 0.36
2  | a_skew_Torsion_Z            | 0.60 | 15 | a_p2p_Bending_Moment_Y      | 0.36
3  | a_mean_Bending_Moment_Y     | 0.57 | 16 | d_p2p_Torsion_Z             | 0.34
4  | a_std_Bending_Moment_Y      | 0.50 | 17 | d_peak_amp_Torsion_Z        | 0.34
5  | a_var_Bending_Moment_Y      | 0.50 | 18 | d_rms_Torsion_Z             | 0.34
6  | d_rms_Bending_Moment_Y      | 0.47 | 19 | d_std_Torsion_Z             | 0.34
7  | d_std_Bending_Moment_Y      | 0.47 | 20 | d_var_Torsion_Z             | 0.34
8  | d_var_Bending_Moment_Y      | 0.47 | 21 | a_kurtosis_vib_x            | 0.31
9  | d_peak_amp_Bending_Moment_Y | 0.40 | 22 | a_kurtosis_vib_y            | 0.29
10 | d_p2p_Bending_Moment_Y      | 0.40 | 23 | a_peak_amp_Torsion_Z        | 0.27
11 | a_std_Torsion_Z             | 0.37 | 24 | a_p2p_Torsion_Z             | 0.27
12 | a_var_Torsion_Z             | 0.37 | 25 | a_mean_Bending_Moment_X     | 0.25
13 | a_skew_Bending_Moment_Y     | 0.36 | 26 | a_rms_Bending_Moment_X      | 0.25
Table 12. Selected features using RF from the WPT feature extraction technique.

Feature Count | Feature | Weight | Feature Count | Feature | Weight
1  | a_rms_Bending_Moment_Y      | 32.88 | 11 | a_mean_vib_y                | 1.35
2  | a_skew_Bending_Moment_Y     | 23.38 | 12 | a_p2p_Torsion_Z             | 1.30
3  | a_skew_Torsion_Z            | 4.60  | 13 | a_peak_amp_Torsion_Z        | 1.26
4  | a_rms_vib_x                 | 4.58  | 14 | a_p2p_Bending_Moment_Y      | 0.74
5  | a_mean_vib_x                | 3.36  | 15 | a_mean_Torsion_Z            | 0.72
6  | a_rms_Axial_Force           | 3.22  | 16 | a_var_Bending_Moment_Y      | 0.67
7  | a_kurtosis_Torsion_Z        | 3.18  | 17 | a_peak_amp_Bending_Moment_Y | 0.65
8  | a_mean_Axial_Force          | 2.48  | 18 | a_mean_Bending_Moment_Y     | 0.61
9  | a_kurtosis_Bending_Moment_Y | 2.18  | 19 | a_std_Bending_Moment_Y      | 0.59
10 | a_rms_vib_y                 | 1.36  |    |                             |
Table 13. RUL prediction for PCC-based feature selection techniques using different ML models. Performance is evaluated on the testing data.

Feature Extraction Technique | Prediction Model | R2 | RMSE | MAPE
STFT | SVR | 0.235 | 0.249 | 17.983
STFT | RFR | 0.363 | 0.227 | 16.001
STFT | GBR | 0.316 | 0.235 | 17.871
CWT  | SVR | 0.062 | 0.288 | 22.631
CWT  | RFR | 0.100 | 0.270 | 20.900
CWT  | GBR | 0.084 | 0.273 | 21.922
WPT  | SVR | 0.216 | 0.252 | 18.344
WPT  | RFR | 0.366 | 0.227 | 15.933
WPT  | GBR | 0.320 | 0.234 | 17.510
Table 14. RUL prediction for RFR-based feature selection techniques using different ML models. Performance is evaluated on the testing data.

Feature Extraction Technique | Prediction Model | R2 | RMSE | MAPE
STFT | SVR | 0.241 | 0.248 | 18.447
STFT | RFR | 0.383 | 0.224 | 16.001
STFT | GBR | 0.346 | 0.230 | 17.406
CWT  | SVR | 0.081 | 0.289 | 22.820
CWT  | RFR | 0.102 | 0.270 | 21.111
CWT  | GBR | 0.111 | 0.269 | 21.856
WPT  | SVR | 0.347 | 0.230 | 16.045
WPT  | RFR | 0.496 | 0.2026 | 13.495
WPT  | GBR | 0.452 | 0.211 | 15.194
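The shallow baselines of Tables 13 and 14 can be set up in a few lines of scikit-learn. A minimal sketch, with library defaults standing in for the paper's hyperparameters (which are not restated here); `X_train`/`X_test` are placeholders for the selected feature matrices and `y_train`/`y_test` for the normalized RUL targets.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import r2_score, mean_squared_error

models = {
    "SVR": make_pipeline(StandardScaler(), SVR()),  # SVR benefits from scaling
    "RFR": RandomForestRegressor(random_state=42),
    "GBR": GradientBoostingRegressor(random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, pred))
    print(f"{name}: R2={r2_score(y_test, pred):.3f}, RMSE={rmse:.3f}")
```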
Table 15. STFT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using different LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | Vanilla LSTM        | 0.741 | 0.147 | 10.163 | 0.706 | 0.152 | 10.523
Pearson's Correlation Coefficient (PCC) | Bi-directional LSTM | 0.800 | 0.129 | 8.807  | 0.755 | 0.139 | 9.175
Pearson's Correlation Coefficient (PCC) | Stacked LSTM        | 0.865 | 0.106 | 6.743  | 0.802 | 0.125 | 7.372
Random Forest Regressor (RFR)           | Vanilla LSTM        | 0.780 | 0.135 | 8.998  | 0.737 | 0.144 | 9.510
Random Forest Regressor (RFR)           | Bi-directional LSTM | 0.704 | 0.157 | 10.715 | 0.654 | 0.165 | 11.461
Random Forest Regressor (RFR)           | Stacked LSTM        | 0.809 | 0.126 | 8.138  | 0.782 | 0.131 | 8.520
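The three variants of Table 15 (and Figure 11) differ only in how the recurrent layer is arranged: a single LSTM, a bi-directional wrapper, or two stacked layers. A minimal Keras sketch; the layer widths, the 30-step window, and the 21-feature input (matching the PCC-selected STFT set of Table 7) are illustrative assumptions, not the paper's exact configuration.

```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

def vanilla_lstm(timesteps, n_features):
    return keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        layers.LSTM(64),
        layers.Dense(1),  # RUL regression head
    ])

def bidirectional_lstm(timesteps, n_features):
    return keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        layers.Bidirectional(layers.LSTM(64)),  # past and future context
        layers.Dense(1),
    ])

def stacked_lstm(timesteps, n_features):
    return keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        layers.LSTM(64, return_sequences=True),  # pass full sequence upward
        layers.LSTM(32),
        layers.Dense(1),
    ])

model = stacked_lstm(timesteps=30, n_features=21)
model.compile(optimizer="adam", loss="mse")
```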
Table 16. STFT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using CNN and CNN-LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | CNN            | 0.878 | 0.101 | 7.057 | 0.775 | 0.133 | 8.946
Pearson's Correlation Coefficient (PCC) | CNN-LSTM       | 0.934 | 0.074 | 5.426 | 0.881 | 0.097 | 6.877
Pearson's Correlation Coefficient (PCC) | CNN-Bi-LSTM    | 0.788 | 0.133 | 8.605 | 0.753 | 0.140 | 9.090
Pearson's Correlation Coefficient (PCC) | CNN-Stack-LSTM | 0.870 | 0.104 | 6.842 | 0.833 | 0.115 | 7.421
Random Forest Regressor (RFR)           | CNN            | 0.972 | 0.048 | 3.570 | 0.930 | 0.074 | 5.499
Random Forest Regressor (RFR)           | CNN-LSTM       | 0.829 | 0.119 | 8.067 | 0.774 | 0.134 | 8.728
Random Forest Regressor (RFR)           | CNN-Bi-LSTM    | 0.906 | 0.088 | 5.891 | 0.838 | 0.113 | 7.090
Random Forest Regressor (RFR)           | CNN-Stack-LSTM | 0.964 | 0.054 | 3.681 | 0.951 | 0.062 | 4.161
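The hybrids of Table 16 prepend 1-D convolution and pooling to the recurrent stack, in the spirit of Figure 12: the convolutions extract local patterns from the feature windows, and the LSTM models their temporal evolution. A sketch reusing the imports above; the filter count, kernel size, and layer widths are assumptions.

```python
def cnn_lstm(timesteps, n_features, bidirectional=False, stacked=False):
    """Hybrid CNN-LSTM sketch covering the variants of Table 16."""
    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        layers.Conv1D(32, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),   # downsample before recurrence
    ])
    recurrent = layers.LSTM(64, return_sequences=stacked)
    if bidirectional:
        recurrent = layers.Bidirectional(recurrent)
    model.add(recurrent)
    if stacked:
        model.add(layers.LSTM(32))          # second recurrent layer
    model.add(layers.Dense(1))              # RUL regression head
    return model

# e.g., the CNN-Bi-LSTM row of Table 16:
model = cnn_lstm(timesteps=30, n_features=21, bidirectional=True)
model.compile(optimizer="adam", loss="mse")
```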
Table 17. CWT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using different LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | Vanilla LSTM        | 0.907 | 0.087 | 6.503 | 0.851 | 0.104 | 7.359
Pearson's Correlation Coefficient (PCC) | Bi-directional LSTM | 0.864 | 0.106 | 7.505 | 0.818 | 0.120 | 8.446
Pearson's Correlation Coefficient (PCC) | Stacked LSTM        | 0.861 | 0.107 | 7.506 | 0.793 | 0.128 | 8.577
Random Forest Regressor (RFR)           | Vanilla LSTM        | 0.882 | 0.099 | 7.176 | 0.838 | 0.113 | 8.369
Random Forest Regressor (RFR)           | Bi-directional LSTM | 0.909 | 0.087 | 6.582 | 0.881 | 0.097 | 7.274
Random Forest Regressor (RFR)           | Stacked LSTM        | 0.953 | 0.062 | 4.789 | 0.927 | 0.075 | 5.781
Table 18. CWT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using CNN and CNN-LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | CNN            | 0.981 | 0.039  | 3.157 | 0.934 | 0.072 | 5.301
Pearson's Correlation Coefficient (PCC) | CNN-LSTM       | 0.937 | 0.072  | 5.236 | 0.858 | 0.106 | 6.922
Pearson's Correlation Coefficient (PCC) | CNN-Bi-LSTM    | 0.987 | 0.0317 | 2.447 | 0.960 | 0.051 | 3.576
Pearson's Correlation Coefficient (PCC) | CNN-Stack-LSTM | 0.900 | 0.091  | 6.516 | 0.828 | 0.117 | 7.814
Random Forest Regressor (RFR)           | CNN            | 0.962 | 0.055  | 4.415 | 0.919 | 0.080 | 6.177
Random Forest Regressor (RFR)           | CNN-LSTM       | 0.935 | 0.075  | 5.729 | 0.890 | 0.093 | 6.880
Random Forest Regressor (RFR)           | CNN-Bi-LSTM    | 0.992 | 0.025  | 1.950 | 0.971 | 0.048 | 3.428
Random Forest Regressor (RFR)           | CNN-Stack-LSTM | 0.976 | 0.044  | 3.418 | 0.953 | 0.059 | 4.354
Table 19. WPT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using different LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | Vanilla LSTM        | 0.922 | 0.080 | 6.021 | 0.857 | 0.102 | 7.140
Pearson's Correlation Coefficient (PCC) | Bi-directional LSTM | 0.814 | 0.124 | 8.576 | 0.778 | 0.132 | 9.087
Pearson's Correlation Coefficient (PCC) | Stacked LSTM        | 0.828 | 0.119 | 7.698 | 0.771 | 0.134 | 8.134
Random Forest Regressor (RFR)           | Vanilla LSTM        | 0.937 | 0.072 | 4.914 | 0.901 | 0.088 | 5.721
Random Forest Regressor (RFR)           | Bi-directional LSTM | 0.897 | 0.092 | 5.966 | 0.873 | 0.100 | 6.533
Random Forest Regressor (RFR)           | Stacked LSTM        | 0.978 | 0.042 | 3.043 | 0.964 | 0.051 | 3.676
Table 20. WPT feature extraction-based RUL prediction for PCC and RFR feature selection techniques using CNN and CNN-LSTM variants.

Feature Selection Technique | Prediction Model | R2 (Train) | RMSE (Train) | MAPE (Train) | R2 (Test) | RMSE (Test) | MAPE (Test)
Pearson's Correlation Coefficient (PCC) | CNN            | 0.979 | 0.041 | 3.192  | 0.941 | 0.068 | 4.943
Pearson's Correlation Coefficient (PCC) | CNN-LSTM       | 0.946 | 0.066 | 5.023  | 0.903 | 0.087 | 5.925
Pearson's Correlation Coefficient (PCC) | CNN-Bi-LSTM    | 0.950 | 0.064 | 4.700  | 0.908 | 0.086 | 5.905
Pearson's Correlation Coefficient (PCC) | CNN-Stack-LSTM | 0.878 | 0.100 | 6.915  | 0.827 | 0.117 | 7.630
Random Forest Regressor (RFR)           | CNN            | 0.966 | 0.052 | 3.869  | 0.926 | 0.076 | 5.522
Random Forest Regressor (RFR)           | CNN-LSTM       | 0.630 | 0.175 | 11.452 | 0.946 | 0.051 | 5.522
Random Forest Regressor (RFR)           | CNN-Bi-LSTM    | 0.979 | 0.041 | 3.211  | 0.948 | 0.064 | 4.647
Random Forest Regressor (RFR)           | CNN-Stack-LSTM | 0.977 | 0.043 | 0.030  | 0.955 | 0.058 | 3.590
Table 21. Summarized performance evaluation for different feature extraction techniques and DL models for RUL prediction using PCC and RFR-based feature selection techniques (Case W1).

Feature Extraction Technique | Prediction Model | R2 (PCC) | RMSE (PCC) | MAPE (PCC) | R2 (RFR) | RMSE (RFR) | MAPE (RFR)
STFT | Vanilla LSTM        | 0.706 | 0.152 | 10.523 | 0.737 | 0.144 | 9.510
STFT | Bi-directional LSTM | 0.755 | 0.139 | 9.175  | 0.654 | 0.165 | 11.461
STFT | Stacked LSTM        | 0.802 | 0.125 | 7.372  | 0.782 | 0.131 | 8.520
STFT | CNN                 | 0.775 | 0.133 | 8.946  | 0.930 | 0.074 | 5.499
STFT | CNN-LSTM            | 0.881 | 0.097 | 6.877  | 0.774 | 0.134 | 8.728
STFT | CNN-Bi-LSTM         | 0.753 | 0.140 | 9.090  | 0.838 | 0.113 | 7.090
STFT | CNN-Stack-LSTM      | 0.833 | 0.115 | 7.421  | 0.951 | 0.062 | 4.161
CWT  | Vanilla LSTM        | 0.851 | 0.104 | 7.359  | 0.838 | 0.113 | 8.369
CWT  | Bi-directional LSTM | 0.818 | 0.120 | 8.446  | 0.881 | 0.097 | 7.274
CWT  | Stacked LSTM        | 0.793 | 0.128 | 8.577  | 0.927 | 0.075 | 5.781
CWT  | CNN                 | 0.934 | 0.072 | 5.301  | 0.919 | 0.080 | 6.177
CWT  | CNN-LSTM            | 0.858 | 0.106 | 6.922  | 0.890 | 0.093 | 6.880
CWT  | CNN-Bi-LSTM         | 0.960 | 0.051 | 3.576  | 0.971 | 0.048 | 3.428
CWT  | CNN-Stack-LSTM      | 0.828 | 0.117 | 7.814  | 0.953 | 0.059 | 4.354
WPT  | Vanilla LSTM        | 0.857 | 0.102 | 7.140  | 0.901 | 0.088 | 5.721
WPT  | Bi-directional LSTM | 0.778 | 0.132 | 9.087  | 0.873 | 0.100 | 6.533
WPT  | Stacked LSTM        | 0.771 | 0.134 | 8.134  | 0.964 | 0.051 | 3.676
WPT  | CNN                 | 0.941 | 0.068 | 4.943  | 0.926 | 0.076 | 5.522
WPT  | CNN-LSTM            | 0.903 | 0.087 | 5.925  | 0.946 | 0.065 | 4.448
WPT  | CNN-Bi-LSTM         | 0.908 | 0.086 | 5.905  | 0.948 | 0.064 | 4.647
WPT  | CNN-Stack-LSTM      | 0.827 | 0.117 | 7.630  | 0.955 | 0.058 | 3.590
Table 22. Summarized performance evaluation for different feature extraction techniques and LSTM variants for RUL prediction using PCC and RFR-based feature selection techniques (Case W2).

Feature Extraction Technique | Prediction Model | R2 (PCC) | RMSE (PCC) | MAPE (PCC) | R2 (RFR) | RMSE (RFR) | MAPE (RFR)
STFT | Vanilla LSTM        | 0.949 | 0.064  | 4.647 | 0.946 | 0.066 | 4.780
STFT | Bi-directional LSTM | 0.954 | 0.062  | 4.652 | 0.926 | 0.079 | 5.767
STFT | Stacked LSTM        | 0.832 | 0.120  | 7.369 | 0.940 | 0.071 | 4.866
STFT | CNN                 | 0.962 | 0.057  | 3.972 | 0.956 | 0.061 | 4.367
STFT | CNN-LSTM            | 0.898 | 0.094  | 5.803 | 0.891 | 0.096 | 6.967
STFT | CNN-Bi-LSTM         | 0.904 | 0.087  | 5.975 | 0.967 | 0.044 | 3.256
STFT | CNN-Stack-LSTM      | 0.812 | 0.127  | 8.412 | 0.886 | 0.099 | 6.263
CWT  | Vanilla LSTM        | 0.883 | 0.100  | 6.506 | 0.965 | 0.054 | 3.867
CWT  | Bi-directional LSTM | 0.788 | 0.135  | 9.306 | 0.900 | 0.926 | 5.950
CWT  | Stacked LSTM        | 0.793 | 0.128  | 0.085 | 0.949 | 0.066 | 4.197
CWT  | CNN                 | 0.927 | 0.079  | 5.264 | 0.929 | 0.078 | 5.103
CWT  | CNN-LSTM            | 0.928 | 0.078  | 5.684 | 0.810 | 0.128 | 8.679
CWT  | CNN-Bi-LSTM         | 0.972 | 0.048  | 3.123 | 0.979 | 0.042 | 2.965
CWT  | CNN-Stack-LSTM      | 0.963 | 0.056  | 3.984 | 0.910 | 0.087 | 4.969
WPT  | Vanilla LSTM        | 0.979 | 0.0421 | 3.195 | 0.977 | 0.044 | 3.243
WPT  | Bi-directional LSTM | 0.971 | 0.048  | 3.649 | 0.824 | 0.118 | 8.063
WPT  | Stacked LSTM        | 0.965 | 0.054  | 4.043 | 0.985 | 0.034 | 2.728
WPT  | CNN                 | 0.971 | 0.049  | 3.672 | 0.950 | 0.064 | 4.700
WPT  | CNN-LSTM            | 0.975 | 0.045  | 3.478 | 0.977 | 0.043 | 3.012
WPT  | CNN-Bi-LSTM         | 0.878 | 0.100  | 6.915 | 0.970 | 0.044 | 3.032
WPT  | CNN-Stack-LSTM      | 0.827 | 0.117  | 7.630 | 0.955 | 0.059 | 3.590
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
