1. Introduction
Tool wear monitoring (TWM) is important for guaranteeing the quality and efficiency of the manufacturing process [1]. Tool wear degrades product quality, and excessive wear may damage the tool and shut down the production line, causing substantial economic loss. Therefore, developing an effective tool wear condition monitoring method has important practical significance.
Tool wear monitoring approaches fall mainly into conventional and deep learning (DL) approaches. Conventional methods rely on hand-designed feature extraction algorithms and machine learning models. Commonly used signals include cutting force, sound, and vibration. Relevant features are extracted from the raw signal by a feature extraction algorithm and then passed to a machine learning algorithm for classification or regression. Shi et al. [2] presented a tool wear prediction approach integrating least squares support vector machine (LS-SVM) and principal component analysis (PCA) techniques. Gomes et al. [3] employed the support vector machine (SVM) with vibration and sound signals to monitor tool wear. Chen et al. [4] presented an SVM-based tool wear prediction approach using the Whale Optimization Algorithm (WOA). Gai et al. [5] established a WOA-SVM classification model using fusion features to identify tool wear states. Combining optimization algorithms with SVMs in this way has several shortcomings. First, the performance of SVMs depends heavily on the choice of parameters; improper selection can lead to overfitting or underfitting. Second, SVM models are not very interpretable: although SVMs can handle non-linear problems through non-linear kernel functions, choosing the right kernel is not always intuitive, which makes the decision process of the model difficult for non-specialists to understand. Finally, although researchers have used a variety of optimization algorithms to improve the training efficiency of SVMs, these algorithms can fall into local optima, are sensitive to initial values, and converge slowly, so they struggle with complex and diverse tool wear states. Moreover, capturing the dynamic change of the tool wear state with a traditional method is challenging because of its limited ability to model long-term dependence.
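As an illustration of the hand-designed feature extraction step described above, the following minimal sketch computes a few common time-domain statistics from one window of a sensor signal. The function name, the chosen statistics, and the sample window are assumptions made for illustration; they are not taken from the cited works.

```python
import math

def time_domain_features(window):
    """Return a few common hand-crafted statistics for one signal window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak_to_peak = max(window) - min(window)
    # Kurtosis (non-excess): fourth central moment over squared variance.
    fourth_moment = sum((x - mean) ** 4 for x in window) / n
    kurtosis = fourth_moment / (var ** 2) if var > 0 else 0.0
    return {
        "mean": mean,
        "rms": rms,
        "peak_to_peak": peak_to_peak,
        "kurtosis": kurtosis,
    }

# Hypothetical vibration window (arbitrary units), for illustration only.
signal = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
features = time_domain_features(signal)
print(features)
```

A feature vector of this kind would then be passed to a classifier or regressor such as an SVM, whose kernel and regularization parameters still have to be chosen separately.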
To resolve these issues, DL models have attracted extensive attention in TWM. With powerful nonlinear fitting and automatic feature learning capabilities, DL models can derive high-level features from raw sensor data and capture complex tool wear state patterns [6]. Tool wear condition monitoring is a critical research area in the manufacturing industry, and researchers have proposed various methods to address it [7].
A convolutional neural network (CNN) is a DL model that extracts local features effectively. In TWM, CNNs are often used to extract the spatial characteristics of the tool wear state [8], gradually building high-level features through stacked convolution and pooling operations. Many studies have successfully applied CNNs to classify and forecast tool wear states. For instance, Dai et al. [9] presented a CNN-based TWM approach. Garcia et al. [10] presented a CNN-based in situ TWM approach. Kothuru et al. [11] combined depth visualization and CNN to achieve tool wear state detection. Wu et al. [12] presented an automatic CNN-based tool wear detection approach.
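The convolution and pooling operations mentioned above can be sketched on a one-dimensional signal. This is a minimal NumPy illustration of a single convolution, activation, and pooling stage; the kernel values and the input signal are assumptions chosen for readability, whereas a trained CNN would learn its kernels from data.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Cross-correlate a 1-D signal with a kernel (no padding, stride 1)."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

def max_pool1d(x, size):
    """Non-overlapping max pooling; samples that do not fill a window are dropped."""
    usable = (len(x) // size) * size
    return x[:usable].reshape(-1, size).max(axis=1)

# Hypothetical sensor samples and a hand-picked kernel that responds to rising edges.
x = np.array([0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 0.0, 4.0, 1.0])
edge_kernel = np.array([-1.0, 0.0, 1.0])

feature_map = np.maximum(conv1d_valid(x, edge_kernel), 0.0)  # ReLU activation
pooled = max_pool1d(feature_map, 2)

print(feature_map)  # local responses, one per window position
print(pooled)       # coarser summary that keeps the strongest responses
```

Stacking several such stages is what lets a CNN move from local responses to higher-level features.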
A recurrent neural network (RNN) is a DL model suited to processing sequence data [13]. However, traditional RNNs suffer from vanishing and exploding gradients when dealing with long sequences. To overcome these problems, scholars have proposed improved RNN structures such as long short-term memory (LSTM) and gated recurrent units (GRU). For example, Xu et al. [14] presented a multi-scale convolutional GRU network to predict tool wear. Liu et al. [15] presented a TWM approach that combines DenseNet and GRU. Chen [16] presented a tool wear prediction approach using parallel CNN and bidirectional LSTM (BiLSTM). These improved RNN models capture the temporal pattern in the tool wear state sequence well and have excellent long-term dependence modeling ability.
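The gating idea behind the GRU can be illustrated with a single recurrent step. The following is a minimal NumPy sketch, not the network used in any of the cited works. The parameter names, layer sizes, and random weights are assumptions, and the update rule follows one of the two equivalent conventions found in the literature.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, p):
    """One GRU step: gates decide how much of the previous state to keep."""
    z = sigmoid(p["Wz"] @ x + p["Uz"] @ h_prev + p["bz"])  # update gate
    r = sigmoid(p["Wr"] @ x + p["Ur"] @ h_prev + p["br"])  # reset gate
    h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h_prev) + p["bh"])
    # Convention assumed here: z close to 0 keeps the old state.
    return (1.0 - z) * h_prev + z * h_cand

# Hypothetical sizes and randomly initialised (untrained) parameters.
rng = np.random.default_rng(0)
n_in, n_hidden, n_steps = 3, 4, 50
params = {}
for gate in "zrh":
    params["W" + gate] = rng.normal(scale=0.5, size=(n_hidden, n_in))
    params["U" + gate] = rng.normal(scale=0.5, size=(n_hidden, n_hidden))
    params["b" + gate] = np.zeros(n_hidden)

sequence = rng.normal(size=(n_steps, n_in))  # stand-in for a sensor feature sequence
h = np.zeros(n_hidden)
for x_t in sequence:
    h = gru_step(x_t, h, params)

print(h)  # final hidden state summarising the whole sequence
```

Because each new state is a gated blend of the previous state and a bounded candidate, the hidden state stays bounded, and information can be carried across many steps when the update gate stays small.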
The Transformer is a DL model based on the self-attention mechanism and was initially used for natural language processing tasks. The Transformer encoder models the global context of the input sequence and captures dependencies between different points in the sequence. In recent years, scholars have begun to apply the Transformer to time series data analysis, including tool wear condition monitoring. For example, Liu [17] proposed a new CNN-transformer neural network model for TWM. Liu et al. [18] presented a new transformer-based neural network model for tool wear prediction. However, the cost of standard self-attention grows quadratically with sequence length, which makes long time series expensive to process. The Informer model addresses this problem by using a sparse attention mechanism and a hierarchical structure to deal with long time sequences efficiently. The Informer encoder is the core of the Informer model. Its main task is to capture the patterns and dependencies of the input time series and to encode this information into a fixed-length representation. It introduces the ProbSparse self-attention mechanism, which uses a probabilistic mechanism to select the critical time steps, thus reducing the computational complexity. To further reduce the computational burden, the Informer encoder uses a hierarchical structure that divides the time series data into multiple sub-sequences and applies the self-attention mechanism to each sub-sequence independently. Therefore, in this paper, an Informer encoder is chosen to model long-term dependencies and sequentially capture important features in the time series to improve the accuracy and efficiency of tool wear condition monitoring.
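The query selection at the heart of ProbSparse self-attention can be sketched as follows. This is a simplified NumPy illustration under stated assumptions, not the implementation used in this paper. It scores every query against all keys for clarity, whereas the Informer estimates the score from a random sample of keys to avoid building the full score matrix, so the sketch shows the selection logic rather than the computational savings. The function names, the `factor` parameter, and the random inputs are assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def probsparse_attention(Q, K, V, factor=1.0):
    """Attend with only the most informative queries; others get the mean of V."""
    L_q, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                       # (L_q, L_k)
    # Sparsity measure: a peaked score row (max far above mean) marks an active query.
    sparsity = scores.max(axis=1) - scores.mean(axis=1)
    u = min(L_q, max(1, int(np.ceil(factor * np.log(L_q)))))
    active = np.argsort(-sparsity)[:u]                  # indices of the top-u queries
    out = np.tile(V.mean(axis=0), (L_q, 1))             # default output for lazy queries
    out[active] = softmax(scores[active], axis=1) @ V   # full attention for active ones
    return out, active

# Hypothetical sequence of 16 time steps with 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q = rng.normal(size=(16, 8))
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))

out, active = probsparse_attention(Q, K, V)
print(sorted(active.tolist()))  # time steps whose queries received full attention
```

Only about ln(L) of the L queries receive full attention in this sketch, which is the source of the reduced complexity once the sparsity measure itself is estimated from sampled keys.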
In summary, DL-based tool wear state monitoring methods have better feature learning capability and long-term dependence modeling ability than traditional methods. The current work presents a DL network model, CIEBM, which combines a CNN, an Informer encoder, and BiLSTM. The CIEBM model uses the respective strengths of the CNN, Informer encoder, and BiLSTM in feature extraction, long-term dependence modeling, and time series modeling to accurately monitor and predict the tool wear state. Compared to traditional methods such as SVMs tuned by optimization algorithms, the CIEBM model learns and extracts features automatically from the original data, without manual feature design or selection. Its multi-layer structure also lets it capture complex, non-linear relationships in the data, which makes it better suited to tool wear prediction.
The essential novelties are as follows:
- (1) This study presents a new TWM approach that combines the advantages of the CNN, the Informer encoder, and BiLSTM. This is the first time these three DL techniques have been combined to monitor tool wear conditions.
- (2) This method can extract spatial features from the raw sensor data, capture long-term dependence and temporal patterns, and learn a comprehensive feature representation of the tool wear state, which enhances the precision and reliability of TWM.
- (3) The presented approach has excellent efficiency and good interpretability, which can help to understand the key factors of tool wear and provide a valuable reference for preventing and managing tool wear.
The paper is structured as follows: Section 2 focuses on the theory related to the CIEBM model; Section 3 focuses on the structure of the CIEBM model and the parameters related to the network; Section 4 focuses on the experimental procedure and results; and finally, Section 5 presents the conclusions.
5. Conclusions
A tool wear state monitoring approach using CNN, Informer encoder, and BiLSTM was proposed, and its performance was evaluated on the tool wear state dataset. The experimental results and analysis support the following conclusions:
- (1) Experimental results reveal that the presented TWM approach based on CNN, Informer encoder, and BiLSTM achieves high accuracy. All relevant evaluation indexes exceeded 95%, reflecting the excellent performance of the CIEBM model, which can efficiently classify and forecast the tool wear state.
- (2) In tool wear monitoring, the CNN extracts spatial features from sensor data. The Informer encoder models long-term dependencies and captures global context information with ProbSparse self-attention and a feedforward neural network layer. BiLSTM captures temporal patterns and context information to further improve monitoring accuracy.
- (3) Our model is the first to use CNN, an Informer encoder, and BiLSTM together for tool wear condition monitoring. It is also the first to target global feature modeling based on the non-linearity of the tool wear process, which enables the model to better learn the relationship between the features of different wear stages. This is of great importance for further research.
- (4) Further analysis shows that our method classifies normal tools and tools with different degrees of wear well, and the confusion between normal and heavy wear is slight, indicating that the method can effectively distinguish tool states with different degrees of wear.
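The kind of per-class analysis referred to in conclusions (1) and (4) can be sketched as follows. The class names, labels, and predictions below are hypothetical and serve only to show how a confusion matrix and the derived metrics are computed; they are not the experimental results of this paper.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

def per_class_metrics(cm, labels):
    metrics = {}
    for i, label in enumerate(labels):
        tp = cm[i][i]
        predicted = sum(row[i] for row in cm)  # column sum
        actual = sum(cm[i])                    # row sum
        metrics[label] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return metrics

# Hypothetical wear classes and made-up predictions, for illustration only.
labels = ["normal", "moderate", "heavy"]
y_true = ["normal", "normal", "normal", "moderate", "moderate",
          "moderate", "heavy", "heavy", "heavy", "heavy"]
y_pred = ["normal", "normal", "moderate", "moderate", "moderate",
          "heavy", "heavy", "heavy", "heavy", "moderate"]

cm = confusion_matrix(y_true, y_pred, labels)
metrics = per_class_metrics(cm, labels)
accuracy = sum(cm[i][i] for i in range(len(labels))) / len(y_true)
# Off-diagonal entries linking the first and last class count normal/heavy confusion.
normal_heavy_confusion = cm[0][2] + cm[2][0]

print(cm, accuracy, normal_heavy_confusion)
```

Reading the off-diagonal entries in this way shows which wear stages are mistaken for one another, which a single accuracy figure does not reveal.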
In summary, the tool wear state monitoring approach using CNN, Informer encoder, and BiLSTM performed well in the experiments and has significant application value for TWM in the industrial field. Nevertheless, several aspects still need improvement, such as further optimization of the model architecture, hyperparameter tuning, and expansion of the dataset, to enhance the precision and robustness of the monitoring.
In future work, combining physical models of tool wear with deep learning will be further investigated. Incorporating a physical model should improve the interpretability of the deep learning model and provide a theoretical basis for optimizing the network for tool wear in production scenarios. In the next phase, we will continue to conduct field experiments to study wear under variable working conditions and further improve the generalization ability of the model.