*Article* **Misfire Detection Using Crank Speed and Long Short-Term Memory Recurrent Neural Network**

**Xinwei Wang <sup>1,2</sup>, Pan Zhang <sup>3</sup>, Wenzhi Gao <sup>3,\*</sup>, Yong Li <sup>3</sup>, Yanjun Wang <sup>3</sup> and Haoqian Pang <sup>3</sup>**

<sup>1</sup> State Key Laboratory of Engine Reliability, Weifang 261061, China; wangxinw@weichai.com

<sup>2</sup> Weichai Power Co., Ltd., Weifang 261061, China

<sup>3</sup> State Key Laboratory of Engines, Tianjin University, Tianjin 300350, China; zhangpan@tju.edu.cn (P.Z.); li\_yong@tju.edu.cn (Y.L.); wangyanjunz@tju.edu.cn (Y.W.); panghaoqian199817@gmail.com (H.P.)

**\*** Correspondence: gaowenzhi@tju.edu.cn

**Abstract:** In this work, a new approach was developed for the detection of engine misfire based on the long short-term memory recurrent neural network (LSTM RNN) using the crank speed signal. The datasets were acquired from a six-cylinder-inline, turbo-charged diesel engine. Previous works investigated misfire detection over a limited range of engine running speeds, loads or misfire types. In this work, the misfire patterns consist of the normal condition, six types of one-cylinder misfire faults and fifteen types of two-cylinder misfire faults. All the misfire patterns were tested under a wide range of running conditions of the tested engine. The traditional misfire detection method was tested on the datasets first, and the results show its limitation at high-speed low-load conditions. The LSTM RNN is a type of artificial neural network that considers both the current input information and the previous input information; hence it is helpful in extracting features of the crank speed, in which the misfire-induced speed fluctuation lasts one or a few cycles. In order to select the engine operating conditions for network training properly, five data division strategies were attempted. To achieve high performance of the designed network, four types of network structure were tested. The results show that, on the datasets in this work, the LSTM-RNN-based algorithm can overcome the limitation of the traditional misfire detection method at high-speed low-load conditions. Moreover, the network that takes a fixed segment of the raw speed signal as input and misfire or fault-free labels as output achieves the best performance, with a misfire diagnosis accuracy of not less than 99.90%.

**Keywords:** engine misfire; pattern recognition; fault detection; LSTM; time-frequency analysis

### **1. Introduction**

Engine misfire is a phenomenon of non-combustion in a cylinder, which may be caused by insufficient fuel injection, poor fuel quality, insufficient ignition energy, mechanical failure, etc. Since a misfire fault causes abnormal engine running conditions and air pollution, many researchers have been trying to put forward effective methods to achieve accurate and real-time misfire detection.

The techniques for engine misfire detection can be categorized according to the utilized sensor signals, including the method using the engine body vibration signal [1], the method using the acoustic signal [2], the method analyzing the exhaust gas temperature [3], the method monitoring the in-cylinder ion current [4], and the method using the crank speed [5]. The method using the engine body vibration signal can capture much information, since the vibration signal is sampled with high resolution and is closely related to in-cylinder combustion; however, a large amount of computation is required for processing vibration data. The method using the acoustic signal has not yet solved the problem of noise interference in practical implementations. The method analyzing the exhaust gas temperature is limited by the sensor's response time. The method monitoring the in-cylinder ion current requires modification of the engine body. The method using the crank speed has been adopted by many researchers, since the crank speed can be sampled relatively easily and is not easily contaminated by uncorrelated noise.

**Citation:** Wang, X.; Zhang, P.; Gao, W.; Li, Y.; Wang, Y.; Pang, H. Misfire Detection Using Crank Speed and Long Short-Term Memory Recurrent Neural Network. *Energies* **2022**, *15*, 300. https://doi.org/10.3390/en15010300

Academic Editors: Luis Hernández-Callejo, Sergio Nesmachnow and Sara Gallardo Saavedra

Received: 26 November 2021; Accepted: 29 December 2021; Published: 3 January 2022

**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

The misfire detection methods based on crank speed can be categorized into physical model-based algorithms and data-driven diagnosis algorithms.

The model-based method diagnoses engine misfire by building the relationship between the crank speed and the in-cylinder pressure based on an engine dynamic model. Zheng et al. [5] designed a Luenberger sliding mode observer to estimate engine combustion torque based on the experimental crank speed of a four-cylinder engine. Rizvi et al. [6] proposed a hybrid model for simulating the relationship between engine power and crank speed fluctuations; misfire was detected using a Markov chain. Helm et al. [7] estimated engine torque based on a parametric Kalman filter, and misfire was detected by employing the estimated torque and an interacting multiple model algorithm. Hmida et al. [8] proposed a torsional model of the crankshaft; the Lagrange method and the Newmark algorithm were employed to derive the equations of motion, and the appearance of sidebands around the acyclic frequency was adopted to detect misfire.

The model-based algorithm can lead to very accurate results if properly executed. Nevertheless, it needs precise engine model parameters, which are hard to obtain accurately; the damping, for example, cannot even be measured. Meanwhile, the complexity of a model-based algorithm may not permit its real-time implementation [9]. Therefore, this method has not been widely used in industrial applications.

The data-driven diagnosis algorithms provide another way of misfire detection, in which the misfire-related characteristics are extracted directly from the crank speed instead of deriving the excitation torque or in-cylinder pressure. Misfire is detected by distinguishing the misfire-related characteristics from fault-free ones. The representative data-driven method is the engine roughness method, which was proposed by Plapp et al. [10] and is still used in modern vehicles. However, this method is limited at high-speed low-load conditions when the engine has six or more cylinders.

Another data-driven method is conducted by analyzing the typical frequencies of crank speed. Taraza et al. [11] utilized the lowest three harmonic orders of crank speed as an indicator for one-cylinder misfire detection. Geveci et al. [12] analyzed the first and the second harmonic components of crank speed under normal and cylinder 1# misfire conditions at various speeds and loads. This method is limited as well, since the speed spectrum of two-cylinder misfire fault may be confused with one-cylinder misfire patterns.

Over the past two decades or so, machine learning algorithms have developed rapidly and have been exploited in the misfire detection research field [13]. Compared with algorithms in which one or a few human-designed indicators are calculated for misfire detection, a machine learning algorithm can extract more fault features from one signal or can process many signals at the same time. Not only the crank speed, but also the engine vibration and in-cylinder pressure have been used in machine learning algorithms, as reported in the literature.

Li et al. [14] utilized the crank speed and techniques including empirical mode decomposition, kernel independent component analysis, the Wigner bispectrum and the support vector machine (SVM) for detecting misfire of a marine diesel engine. Chen and Randall [15] designed a misfire detection method consisting of three stages: fault detection, fault localization and fault severity identification. It was achieved by using the lowest four harmonic orders of the crank speed and a fully connected artificial neural network (ANN). Jung et al. [16] distinguished misfire and fault-free conditions using the crank speed and the Kullback–Leibler divergence; the SVM was utilized as the automatic classification tool. Gani and Manzie [17] also employed the SVM technique and the crank speed for classifying the normal condition, intermittent misfire and continuous misfire in cylinder 6# of an engine; the accuracy approached 100% on the test dataset. In the work of Sharma et al. [18], statistical features of vibration signals, such as the standard deviation, kurtosis, median and so on, were selected as fault features for misfire detection, and the decision tree algorithm was employed for fault classification. As reported by Moosavian et al. [19], the wavelet denoising technique, ANN, least square support vector machine, and D–S evidence theory were applied for misfire detection; a final classification accuracy of 98.56% was achieved using acoustic and vibration signals under the idle condition. Gu et al. [20] utilized multivariate empirical mode decomposition and SVM techniques for misfire detection of a twelve-cylinder diesel engine. Qin et al. [1] designed a deep twin convolutional neural network with multi-domain inputs for misfire detection; since the vibration signal was employed, the authors also studied the algorithm's performance under strong environmental noise in the vibration signal. Jafarian et al. [21] employed vibration signals from four sensors placed on the engine for misfire detection. The fast Fourier transform (FFT) was used for feature extraction; the ANN, SVM, and k-nearest neighbor (kNN) algorithms were used for classification. Liu et al. [22] took many signals, including the engine speed, exhaust temperature, and fuel consumption, as inputs to an ANN for misfire detection. Bahri et al. [23] detected misfire of a homogeneous charge compression ignition engine by using the in-cylinder pressure and an ANN model.

It can be seen that the mentioned data-driven diagnosis algorithms mainly focused on classical feature extraction methods, such as human-designed thresholds, FFT, wavelet transform and empirical mode decomposition, and on traditional pattern recognition methods, such as the fully connected ANN and SVM. Thus, the algorithm performance often depends on properly selected features and on domain expertise in engine misfire. In addition, the mentioned machine-learning-based algorithms either considered only a few engine speed and load conditions, or considered only a few one-cylinder misfire types. In practice, the fault features change with the engine running conditions and misfire types, and for a data-driven algorithm, especially a machine learning algorithm, the sample size is an important factor in training. Therefore, further research is still needed before machine-learning-based algorithms can be applied in actual industrial scenarios.

A recurrent neural network (RNN) is a type of neural network that is good at processing sequence data. Sequences may be of finite or countably infinite length, and may be temporal or non-temporal. Examples of time-indexed data include audio recordings sampled at fixed intervals. In fact, RNNs are frequently applied to sequences whose meaning is directly related to the data order but which have no explicit notion of time [24]. The engine crank speed is a type of sequence in which the prior motion affects the later motion. For example, assuming the firing order of a six-cylinder engine is 1-5-3-6-2-4, if misfire occurs in the first cylinder, not only the instantaneous speed variation of the first cylinder changes, but also that of the fourth cylinder. Therefore, the RNN is promising for misfire detection. For earlier RNNs, it was difficult to handle the problems of vanishing and exploding gradients that occurred when training across many steps [25]. Therefore, in this paper, the RNN with long short-term memory (LSTM) [26], which overcomes these training difficulties, is utilized.

Compared with the misfire detection studies in the literature, the main contributions of this paper are as follows.


The rest of the paper is organized as follows. In Section 2, the experiment setup and diesel engine rig tests are introduced. The speed characteristics under misfire and the limitation of traditional misfire detection method are described in Section 3. The scheme of misfire diagnosis and the LSTM algorithm are introduced in Section 4. In Section 5, the experimental results are analyzed and discussed. Finally, conclusions are given in Section 6.

### **2. Experimental Equipment and Data Acquisition**

*2.1. Test Rig Setup and Data Acquisition System*

The test engine was a four-stroke, six-cylinder-inline diesel engine. In order to adjust the fuel injection parameters conveniently, a diesel engine with electronic unit pumps was employed. In addition, with a larger number of cylinders an engine operates more steadily, so the fault features of misfire become relatively weaker [28]. Thus, a six-cylinder engine was selected. The basic technical data of the engine are shown in Table 1. A hydraulic dynamometer was connected to the engine to provide the external load. A flexible shaft coupling was mounted to connect the engine crankshaft and the dynamometer. Figure 1 shows a picture of the whole test rig.

**Table 1.** Engine specifications.


**Figure 1.** The test-rig.

A Kistler high-temperature piezoelectric pressure sensor, Type 6058A, was mounted in cylinder 1# through the glow plug hole to verify misfire occurrence in the cylinder chamber. A magnetic sensor mounted opposite the teeth on the flywheel was used to measure the angular speed of the crankshaft. The sensors' signals were synchronously sampled and preliminarily processed using a Siemens LMS SCM05 system with 24-bit ADC resolution and a maximum sampling rate of 102.4 kHz.

### *2.2. Test Description*

The measurements were conducted over the engine speed range 800–2200 r/min with an interval of 100 r/min, at different load levels, as shown in Figure 2. For each engine speed and load value, measurements were conducted under normal, one-cylinder misfire and two-cylinder misfire conditions. The misfire types are shown in Table 2; including the normal condition, there are 22 fault types in total. The misfire condition was achieved by setting the injection parameter to zero on the programmable electronic control unit.

**Figure 2.** Engine operating conditions.

**Table 2.** All the misfire types.


The tests were conducted at densely spaced engine speeds and loads that varied over a wide range. Part of the data was used for network training and the rest for network testing. The size of the training dataset was reduced from large to small until an optimal size was achieved.

### **3. The Speed Characteristics under Misfire and the Limitation of Traditional Misfire Detection Method**

When misfire occurs, the instantaneous engine crankshaft speed drops and the subsequent speed rises compared with normal conditions, so the variation of the whole speed curve becomes larger. An example is shown in Figure 3a: under the running condition of 1000 r/min and 100 Nm, when a misfire occurs in cylinder 1#, as the dashed curve indicates, the speed becomes different from the normal condition. When the engine speed is high and the load is low, as shown in Figure 3b, the variation pattern of the instantaneous crankshaft speed becomes unclear, and the difference between the normal and misfire conditions also becomes indistinguishable. Moreover, when two-cylinder misfire occurs, the fault features expressed in the engine speed curve are easily confused with those of the one-cylinder misfire condition, especially under high-speed and low-load conditions. Figure 4a,b show the comparison between the speed curves of one-cylinder misfire and two-cylinder misfire.

**Figure 4.** Instantaneous engine speed comparison between cylinder 1# misfire and cylinders 1# and 2# misfire conditions. (**a**) Under 1900 r/min and 100 Nm. (**b**) Under 2000 r/min and no-load condition.

Therefore, for detecting misfire accurately, the fault features that can reduce or eliminate the interference from engine speed and load should be found. One way to eliminate the impact of engine running range is to divide it into small blocks and then find fault features for each block [27]. However, this will increase workload when the engine running range is large. The better way is to find or design an algorithm that can extract useful feature or can learn more features for the whole engine running conditions.

An example of the traditional methods, called the engine roughness method, is introduced below. This method calculates a misfire indicator based on the difference of two consecutive angular accelerations. Equation (1) presents the calculation of the indicator [10].

$$S_i = \frac{T_{i+1} - T_i}{T_i^3} \tag{1}$$

where *S<sub>i</sub>* is the engine roughness of the *i*th cylinder, and *T<sub>i</sub>* is the time period from the ignition of the *i*th cylinder to the ignition of the next cylinder in the firing order.
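As a minimal sketch of Equation (1) (not the authors' implementation; the tooth-period values below are synthetic), the indicator can be computed as follows:

```python
import numpy as np

def engine_roughness(T):
    """Engine roughness per Eq. (1): S_i = (T_{i+1} - T_i) / T_i**3,
    where T[i] is the ignition-to-ignition time period of the i-th
    cylinder in the firing order."""
    T = np.asarray(T, dtype=float)
    return (T[1:] - T[:-1]) / T[:-1] ** 3

# Hypothetical periods for a six-cylinder four-stroke engine: one firing
# event is slowed slightly, mimicking the speed drop after a misfire.
periods = [1/60, 1/60, 1/60, 1/58, 1/60, 1/60]
S = engine_roughness(periods)
```

With these values, *S* spikes positively just before the slowed event and dips just after it, which is the pattern a threshold such as the 15–19 band in Figure 5a is meant to catch.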

Figure 5 shows the results of the indicator S under different engine speeds and misfire patterns. When misfire occurs in cylinder *i*, the corresponding *S<sub>i</sub>* becomes larger than a predefined threshold. In the example shown in Figure 5a, the threshold can be defined in the range 15–19, and when *S<sub>i</sub>* is detected to be larger than the threshold, misfire is considered to have occurred in cylinder *i*. This method is limited at high-speed low-load conditions. As shown in Figure 5b, under 1900 r/min and no-load conditions, it is hard to determine the threshold, and the two-cylinder misfire modes are easily confused with the one-cylinder misfire modes.

The unsatisfactory results at high-speed and low-load conditions are caused by the background noise, which has approximately the same order of magnitude as the value related to the misfire presence. The relatively higher background noise mainly comes from the different burning behaviors caused by systematic nonuniformity. In addition, as the speed increases and the load decreases, the signal-to-noise ratio decreases, since the useful features caused by misfire diminish. Figure 6 presents the standard deviation of the crankshaft speed under 800 r/min and no-load conditions. The standard deviation is calculated once per cycle; the points in Figure 6, which are shown in the form of mean value and standard deviation, are calculated from 200 cycles of data. The results clearly show that when misfire occurs, the amplitude of the speed variation increases, which is helpful for extracting misfire features. However, when the engine speed becomes higher and the load becomes lower, the amplitude of the speed variation decreases, and the amplitude difference between the normal and misfire patterns also decreases, as shown in Figure 7. The signal-to-noise ratio thus decreases. The limitation of the engine roughness method proves that it is hard to achieve a perfect fault detection result with too few features.
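The per-cycle standard deviation statistic behind Figures 6 and 7 can be sketched as below. The speed traces here are synthetic stand-ins for measured data, and the sinusoid amplitudes are hypothetical; only the computation mirrors the description above.

```python
import numpy as np

def cycle_std_stats(speed, points_per_cycle=120):
    """Standard deviation of crank speed within each cycle, then the mean
    and standard deviation of that statistic across cycles, i.e. the
    quantities plotted per misfire pattern in Figure 6."""
    per_cycle_std = speed.reshape(-1, points_per_cycle).std(axis=1)
    return per_cycle_std.mean(), per_cycle_std.std()

rng = np.random.default_rng(0)
t = np.linspace(0, 200 * 2 * np.pi, 200 * 120)   # 200 cycles, 120 points each
normal  = 1000 + 5  * np.sin(t) + rng.normal(0, 1, t.size)
misfire = 1000 + 15 * np.sin(t) + rng.normal(0, 1, t.size)
m_n, s_n = cycle_std_stats(normal)
m_f, s_f = cycle_std_stats(misfire)   # m_f > m_n: larger within-cycle fluctuation
```

The misfire trace yields a clearly larger mean per-cycle standard deviation, which is exactly the separation that shrinks at high speed and low load in the real data.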

Since LSTM RNN is good at learning features of sequences, the LSTM RNN is utilized in this paper to detect misfire and to overcome the limitation of traditional algorithm.

**Figure 5.** Engine roughness under normal and misfire conditions. (**a**) Under 1200 r/min and no-load condition. (**b**) Under 1900 r/min and no-load condition.

**Figure 6.** Mean values and standard deviations of the standard deviation of crankshaft speed under 800 r/min and no-load condition with different misfire patterns. Manner 1 represents misfire of two consecutive cylinders in firing order; manner 2 represents misfire of two cylinders with one-cylinder interval; manner 3 represents misfire of two cylinders with two-cylinder interval.

**Figure 7.** Standard deviation of one-cycle speed under different speeds and loads with different misfire patterns. Filled areas represent error bars. Under each speed, the five points from left to right in a curve represent the five loads, which are no-load, 100 Nm, 150 Nm, 200 Nm and 250 Nm. Moreover, manner 1, manner 2 and manner 3 have the same meanings as those in Figure 6.

### **4. The LSTM RNN**

The classical artificial neural networks are designed to extract features from datasets whose sub-samples are independent of each other. In some application scenarios, such as natural language processing, the meaning of a whole sentence depends on the meaning and order of the preceding and following words. RNNs are designed for this kind of research field. RNNs are connectionist models that capture the meaning of sequences via cycles in the network. The basic architecture of an RNN is shown in Figure 8 in unfolded form.

**Figure 8.** Basic architecture of an RNN module, showing shared parameters.

As presented in Figure 8, the forward pass of an RNN module looks the same as that of a multi-layer perceptron with a single hidden layer. The main difference is that the activations of the hidden layer come from both the current input layer and the hidden layer activations of the previous step, as described in Equation (2) [29]. Equation (3) calculates the output value or vector. Thus, an RNN maps input sequences into outputs.

$$h^{(t)} = f\left(b + W h^{(t-1)} + U x^{(t)}\right) \tag{2}$$

$$y^{(t)} = Vh^{(t)}\tag{3}$$

where *W*, *U* and *V* are the weight matrices; *b*, *x*, *h*, *f*, and *y* denote the bias vector, input vector, hidden layer vector, activation function and output vector, respectively.
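Equations (2) and (3) translate directly into a forward step. The NumPy sketch below uses toy dimensions (a 4-dimensional input, 8 hidden units and 2 outputs are illustrative choices, not the paper's):

```python
import numpy as np

def rnn_step(x_t, h_prev, W, U, V, b, f=np.tanh):
    """One forward step of the basic RNN:
    Eq. (2): h_t = f(b + W h_{t-1} + U x_t);  Eq. (3): y_t = V h_t."""
    h_t = f(b + W @ h_prev + U @ x_t)
    y_t = V @ h_t
    return h_t, y_t

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 8))      # hidden-to-hidden weights (shared over time)
U = rng.normal(size=(8, 4))      # input-to-hidden weights
V = rng.normal(size=(2, 8))      # hidden-to-output weights
b, h = np.zeros(8), np.zeros(8)
for x_t in rng.normal(size=(120, 4)):   # e.g. one cycle of input vectors
    h, y = rnn_step(x_t, h, W, U, V, b)
```

Because *W*, *U*, *V* and *b* are reused at every step, the state *h* carries information from earlier inputs forward, which is the property exploited here for crank speed sequences.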

The classic RNN has the problem of a vanishing gradient [30], and sometimes gradient explosion also occurs. This is because the error surface becomes either very flat or very steep after the weights are updated over many time steps. This problem is also called the long-term dependency problem. One effective way to solve it is to use a gating mechanism to control the information passing path, as in the LSTM.

LSTM RNN has the basic structure of RNN, which is a chain of repeating modules. The main difference of an LSTM RNN from other RNNs is the structure of the module, which is marked with shadow area in Figures 8 and 9. In a module of LSTM RNN, three gates are designed to control the output.

**Figure 9.** LSTM schematic. I, input gate; II, forget gate; III, output gate.

The main line of an LSTM module is the calculation of the input vector, cell state and output, as indicated by the blue dots and arrows in Figure 9. First, the input vector of the LSTM module is acquired by concatenating the output of the previous module and the current inputs. Second, two gates are designed to adjust the cell state. As shown in Figure 9, the input gate decides whether the current inputs will be used to update the cell state. By the same principle, the forget gate adjusts the proportion of the previous cell state in the current one; this gives an LSTM module its memory function. The cell state is then updated and stored for the LSTM module of the next step. Next, an output gate adjusts the output of the updated cell state, which has first been rescaled by a tanh activation function. Finally, the output is transported to the next layer and the next module.

Equations (4) and (5) show the calculation of the new candidate vector *c̃*(*t*) and the input gate vector *i*(*t*); Equation (6) calculates the forget gate vector *f*(*t*); the cell state *c*(*t*) is updated by Equation (7); the output gate vector *o*(*t*) is calculated by Equation (8); and the final output *h*(*t*) is acquired by Equation (9) [24]. The elements of the three gate vectors all lie between 0 and 1, which scales the outputs of the corresponding layers between their original values and 0.

$$\tilde{c}^{(t)} = \tanh\left(W_c \cdot \left[h^{(t-1)}, x^{(t)}\right] + b_c\right) \tag{4}$$

$$i^{(t)} = \mathrm{sigmoid}\left(W_i \cdot \left[h^{(t-1)}, x^{(t)}\right] + b_i\right) \tag{5}$$

$$f^{(t)} = \mathrm{sigmoid}\left(W_f \cdot \left[h^{(t-1)}, x^{(t)}\right] + b_f\right) \tag{6}$$

$$c^{(t)} = f^{(t)} \circ c^{(t-1)} + i^{(t)} \circ \tilde{c}^{(t)} \tag{7}$$

$$o^{(t)} = \mathrm{sigmoid}\left(W_o \cdot \left[h^{(t-1)}, x^{(t)}\right] + b_o\right) \tag{8}$$

$$h^{(t)} = o^{(t)} \circ \tanh(c^{(t)}) \tag{9}$$

where sigmoid and tanh are activation functions. *x*(*t*) is the input vector from the training or testing dataset. *h*(*t*−1) and *h*(*t*) are the previous and current outputs of the LSTM module, respectively. [*h*(*t*−1), *x*(*t*)] means concatenating *h*(*t*−1) and *x*(*t*). *bc*, *bi*, *bf*, and *bo* are biases. *Wc*, *Wi*, *Wf*, and *Wo* are weight matrices. ◦ denotes the Hadamard product. When the network is trained, *bc*, *bi*, *bf*, and *bo* are initialized with ones. Each of the weight matrices *Wc*, *Wi*, *Wf*, and *Wo* is the concatenation of two matrices corresponding to *h*(*t*−1) and *x*(*t*), respectively; accordingly, the two parts of a weight matrix are initialized separately. In this work, both parts of each weight matrix are initialized from the uniform distribution shown in Equation (10) [31].
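Equations (4)–(9) can be collected into a single forward step. The NumPy sketch below uses toy sizes (a 1-dimensional speed input, 5 hidden units and the small weight scale are illustrative assumptions) and mirrors the conventions above: one weight matrix per gate acting on the concatenation [*h*(*t*−1), *x*(*t*)], biases initialized with ones, and the Hadamard product written as elementwise `*`.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell(x_t, h_prev, c_prev, Wc, Wi, Wf, Wo, bc, bi, bf, bo):
    """One LSTM step following Eqs. (4)-(9)."""
    z = np.concatenate([h_prev, x_t])        # [h^(t-1), x^(t)]
    c_cand = np.tanh(Wc @ z + bc)            # Eq. (4): candidate vector
    i_t = sigmoid(Wi @ z + bi)               # Eq. (5): input gate
    f_t = sigmoid(Wf @ z + bf)               # Eq. (6): forget gate
    c_t = f_t * c_prev + i_t * c_cand        # Eq. (7): cell state update
    o_t = sigmoid(Wo @ z + bo)               # Eq. (8): output gate
    h_t = o_t * np.tanh(c_t)                 # Eq. (9): hidden output
    return h_t, c_t

n_in, n_h = 1, 5                             # e.g. one speed sample per step
rng = np.random.default_rng(2)
Ws = [rng.normal(scale=0.1, size=(n_h, n_h + n_in)) for _ in range(4)]
bs = [np.ones(n_h) for _ in range(4)]        # biases initialized with ones
h = c = np.zeros(n_h)
for x_t in rng.normal(size=(120, n_in)):     # one engine cycle of inputs
    h, c = lstm_cell(x_t, h, c, *Ws, *bs)
```

Note how the forget gate in Eq. (7) lets the cell state retain a fraction of its previous value across steps, which is what allows a misfire signature lasting a whole cycle to influence later outputs.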

$$\mathcal{W} \sim \mathcal{U} \left[ -\frac{\sqrt{6}}{\sqrt{n\_j + n\_{j+1}}}, \frac{\sqrt{6}}{\sqrt{n\_j + n\_{j+1}}} \right] \tag{10}$$

where *n<sub>j</sub>* and *n<sub>j+1</sub>* are the numbers of elements in layers *j* and *j* + 1, respectively.
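A sketch of this initialization (the layer sizes below are hypothetical):

```python
import numpy as np

def glorot_uniform(n_j, n_j1, rng=None):
    """Eq. (10): draw weights uniformly from
    [-sqrt(6)/sqrt(n_j + n_j1), +sqrt(6)/sqrt(n_j + n_j1)],
    where n_j and n_j1 are the element counts of layers j and j+1."""
    if rng is None:
        rng = np.random.default_rng(0)
    limit = np.sqrt(6.0 / (n_j + n_j1))
    return rng.uniform(-limit, limit, size=(n_j1, n_j))

# e.g. the input part of an LSTM weight matrix: a 120-point speed segment
# feeding 20 hidden units (sizes chosen for illustration).
W_x = glorot_uniform(120, 20)
```

The bound in Eq. (10) keeps the variance of activations roughly constant across layers, which is why this scheme is a common default for tanh-based networks.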

This is the key mechanism of the LSTM RNN. The network training is also based on the back propagation through time (BPTT) strategy and the gradient descent algorithm. To date, there have been many variants of the LSTM, such as adding peephole connections, using coupled forget and input gates, the gated recurrent unit, depth-gated RNNs and so on. As reported in Greff's work [32], a comparison of these popular LSTM variants showed that there were no significant differences among them. Therefore, the standard LSTM RNN is adopted in this paper.

### **5. Signal Processing and Results Analysis**

### *5.1. Network Training Strategy*

The experiments have been described in Section 2.2. There are 70 different engine speed and load conditions in total, and for each condition, 22 misfire types were conducted. Under each fixed speed, load and misfire type condition, 200 cycles of data were sampled. Thus, the total number of datasets is 308,000 (22 × 70 × 200 = 308,000), and one dataset corresponds to one engine power cycle, which contains 120 speed data points.

The datasets were acquired over a dense speed and load range. However, for industrial application, it would be better to train a well-performing neural network with a small number of datasets. Five division modes of training and testing datasets were attempted, as described in Table 3. The arrangement of mode 1 is attempted first, and if the test result is higher than 90%, the remaining arrangements are tested. The arrangements of modes 2\_a, 2\_b, 2\_c, and 2\_d are shown in Figure 10, in which the training datasets are the conditions in shadow and the remaining datasets are used for testing. The arrangements in Figure 10a,c are attempted first, and if the test results are higher than 90%, the arrangements in Figure 10b,d are tested.
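As an illustration, a speed-interval split of this kind can be generated as below. This is one plausible reading of the arrangements; the exact held-out conditions follow Figure 10, not this sketch, and the function name is a placeholder.

```python
def split_by_speed(all_speeds, step):
    """Keep every `step`-th engine speed for training; hold out the rest.
    With speeds spaced 100 r/min apart, step=2 corresponds to a
    200 r/min training interval."""
    train = all_speeds[::step]
    test = [s for s in all_speeds if s not in train]
    return train, test

speeds = list(range(800, 2300, 100))     # 800-2200 r/min in 100 r/min steps
train, test = split_by_speed(speeds, 2)  # train: 800, 1000, ..., 2200
```

Testing on the held-out speeds then measures whether the network interpolates between training conditions rather than merely memorizing them.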

**Table 3.** Description of the training and testing datasets arrangement.


**Figure 10.** Different arrangements of training and testing datasets. (**a**) Arranging training data in 100 r/min interval. (**b**) Arranging training data in 200 r/min interval. (**c**) Arranging training data in dense speed and load interval. (**d**) Arranging training data in sparse speed and load interval. The engine running conditions in grey shadows are used for network training, the rest for network testing.

In order to achieve a better performance of the LSTM RNN, four different structures of input layer and output layer have been tested.


**Figure 11.** Different types of LSTM RNNs utilized in this paper. (**a**) LSTM with sequence inputs and sequence outputs. (**b**) LSTM with sequence inputs and last outputs. (**c**) LSTM with segment inputs and sequence outputs. (**d**) LSTM with frequency-domain segment inputs and sequence outputs.

The basic parameters for these four types of networks are summarized in Table 4.


**Table 4.** The basic parameters of networks utilized.

### *5.2. Results Analysis*

### 5.2.1. The First Network Structure

For the first structure, which is designed as in Figure 11a, one input group consists of 10 datasets, that is, 1200 (120 × 10) speed data points. One output group has the same length as the corresponding input. The network consists of one input layer, one LSTM layer, and one output layer. The initial learning rate is 0.01 and the learning rate drop period is 3. The adaptive moment estimation method is adopted for network training. The number of hidden layer elements is 20. When the training and testing datasets are arranged as mode 1, the final training and testing accuracies are 17.35% and 15.36%, respectively. Since the accuracies are not high, no more tests were attempted.

### 5.2.2. The Second Network Structure

The second structure is designed as in Figure 11b. One input group consists of one dataset, which is 120 speed data points. Only the last LSTM cell outputs a prediction result. The network consists of one input layer, one LSTM layer, and one output layer. The initial learning rate, the learning rate drop period, and the training algorithm are the same as those in the first structure.

When the training and testing datasets are arranged as mode 1 (described in Table 3), LSTM layers with 3, 5, 10, 20, 40 and 80 elements were tested. The corresponding training and testing accuracies are drawn in Figure 12. It is clear that when the training datasets are arranged as mode 1, 5 elements are enough for the network training, and the corresponding training and testing accuracies are 99.23% and 99.20%. Since the accuracies are very high, the datasets arranged in modes 2\_a, 2\_b, 2\_c and 2\_d were tested, and the results are shown in Figure 13. It can be seen that an acceptable performance can also be acquired when the training datasets are arranged in a sparser manner. It seems that with less training data, it is easier to train a network that achieves an accuracy higher than 95%, such as the networks trained in modes 2\_b and 2\_d.

### 5.2.3. The Third Network Structure

The third structure is designed as in Figure 11c. One input layer consists of 20 elements, which correspond to the working interval of one cylinder. The output of one LSTM cell consists of two categories, normal and misfire, and each LSTM cell has a corresponding output. The output indicates whether the currently powering cylinder is fault-free. The initial learning rate, the learning rate drop period, and the training algorithm are the same as those in the first structure.

The training and testing datasets are arranged as in Table 3. The training strategy is again to vary the number of hidden layer elements among 3, 5, 10, 20, 40 and 80. The training and testing results under mode 1 are summarized in Figure 14, and those under modes 2\_a, 2\_b, 2\_c, and 2\_d in Figure 15. Since the outputs are calculated for each cylinder, the original results cannot be directly compared with those of the other methods. Therefore, when calculating the accuracy, the results for one cylinder are converted to those for one cycle. The concrete method is to group every six consecutive outputs starting from cylinder 1# and tag each group according to the faulty cylinders. If three or more faulty cylinders are detected in one cycle, the result is categorized as a fault prediction. Compared with the second network structure, the third network structure achieves a higher accuracy with the same number of hidden layer elements. For most of the five data division modes, 95% accuracy can be achieved with no more than 5 hidden layer elements.
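One plausible reading of this per-cylinder-to-per-cycle grouping is sketched below (the function name and the 0/1 output encoding are assumptions for illustration):

```python
def per_cycle_predictions(cyl_outputs):
    """Group every six consecutive per-cylinder outputs (0 = fault-free,
    1 = misfire) into one engine-cycle prediction: the tuple of positions
    (in output order, starting from cylinder 1#) flagged as misfiring."""
    assert len(cyl_outputs) % 6 == 0
    return [tuple(i + 1 for i, o in enumerate(cyl_outputs[k:k + 6]) if o)
            for k in range(0, len(cyl_outputs), 6)]

# Two cycles of network outputs: one cylinder flagged in the first cycle,
# two cylinders in the second.
preds = per_cycle_predictions([1, 0, 0, 0, 0, 0,
                               1, 1, 0, 0, 0, 0])
```

Each per-cycle tuple can then be compared against the ground-truth misfire pattern of Table 2 when computing the cycle-level accuracy.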

**Figure 12.** Training and testing accuracies of the second network structure and division mode 1. The cyan circle marks the accuracy that is more than 95% with the lowest hidden element number.

**Figure 13.** Training and testing accuracies of the second network structure. (**a**) Mode 2\_a. (**b**) Mode 2\_b. (**c**) Mode 2\_c. (**d**) Mode 2\_d. The cyan boxes mark the accuracies that are more than 95% with the lowest number of hidden layer elements.

**Figure 14.** Training and testing accuracies of the third network structure and data division mode 1. The dark yellow circle marks the accuracy that is more than 95% with the lowest hidden element number.

### 5.2.4. The Fourth Network Structure

The fourth structure is designed as shown in Figure 11d. The input layer consists of 40 elements: the lowest 20 real and 20 imaginary parts of the frequency-domain representation of the speed signal. For input normalization, the 20 real parts and the 20 imaginary parts are normalized separately. The frequency-domain results are acquired by transforming the raw speed using the Fourier synchrosqueezed transform algorithm [33]. As a time-frequency analysis method, the Fourier synchrosqueezed transform can acquire instantaneous frequency-domain information more precisely than the short-time Fourier transform; this advantage is helpful for extracting fault features from the speed signal. Since the flywheel has 60 teeth, the sample rate is set to 60 samples per revolution. The data length of one cycle is thus 120, and because a data length that is a power of 2 allows high computational speed, the truncated data length is set to 128. A Kaiser window is utilized to reduce spectral leakage. Considering that the output is produced as a sequence, meaning each LSTM cell has an output, the accuracy is calculated from the last output of each cycle. Figures 16 and 17 present the training and testing results using the fourth structure.
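The construction of this 40-element input can be sketched as below. Note the hedges: a plain windowed FFT is used here as a stand-in for the paper's Fourier synchrosqueezed transform, and the Kaiser window shape parameter `beta` is an assumed value; only the 128-point length, the Kaiser window, the lowest-20 real/imaginary selection, and the separate normalization come from the text.

```python
import numpy as np

def frequency_input(speed_segment, n_fft=128, n_keep=20, beta=14.0):
    """Build the 40-element frequency-domain input: lowest 20 real and
    20 imaginary spectral parts, each group normalized separately.
    NOTE: plain FFT stands in for the Fourier synchrosqueezed transform."""
    x = np.asarray(speed_segment, dtype=float)
    # truncate or zero-pad to the power-of-2 length 128
    x = x[:n_fft] if len(x) >= n_fft else np.pad(x, (0, n_fft - len(x)))
    spec = np.fft.fft(x * np.kaiser(n_fft, beta))  # Kaiser window vs. leakage
    re, im = spec.real[:n_keep], spec.imag[:n_keep]
    # normalize the real and imaginary groups separately, as in the paper
    norm = lambda v: (v - v.mean()) / (v.std() + 1e-12)
    return np.concatenate([norm(re), norm(im)])

# One cycle of 120 speed samples yields one 40-element input vector.
out = frequency_input(list(range(120)))
```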

**Figure 16.** Training and testing accuracies of the fourth network structure and data division mode 1. The purple circle marks the accuracy that is more than 95% with the lowest hidden element number.

As with the previous methods, the main variables are the data division mode and the number of hidden layer elements. The results show that accuracies of more than 95% are easily achieved when the training and testing data are arranged in mode 1. However, when the data are arranged as modes 2\_a, 2\_b, 2\_c and 2\_d, 20 or more hidden layer elements are needed to reach 95% accuracy.

### 5.2.5. Results Comparison

- It can be seen that the distribution of misdiagnosed results is scattered. For most conditions, the number of misdiagnosed results does not exceed 10.
- In Figure 19, the main misdiagnoses occur on the normal, cylinder 3# misfire, cylinders 1# and 3# misfire, cylinders 4# and 5# misfire, and cylinders 4# and 6# misfire conditions. In Figure 20, the main misdiagnoses occur on the normal, cylinder 3# misfire, cylinders 4# and 5# misfire, and cylinders 4# and 6# misfire conditions. The misdiagnosed results are related to the true results; for example, when misfire occurs in cylinder 3#, the predicted result is misfire in cylinders 3# and 5#.
- The most common running condition for an engine is the normal condition. Observing the results in Figures 19 and 20, the worst misdiagnosed case is the normal condition presented in Figure 19b. However, the detection accuracy for this normal condition is 98.91% (8902 ÷ 9000 = 98.91%), which is still relatively high. Since the network performs well even in this worst case, it can be concluded that the detection accuracy for each type of fault is acceptable.
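The per-class accuracy quoted above follows directly from one row of the confusion matrix, as the short sketch below shows. The placement of the 98 misdiagnosed cycles within the row is hypothetical; only the 8902/9000 split is from the paper.

```python
def class_accuracy(confusion_row, true_class_index):
    """Per-class detection accuracy from one confusion-matrix row:
    correct predictions for that class divided by all its samples."""
    total = sum(confusion_row)
    return 100.0 * confusion_row[true_class_index] / total

# Worst case reported for the normal condition (Figure 19b):
# 8902 of 9000 normal cycles classified correctly, 22 classes in total
# (normal + 6 one-cylinder + 15 two-cylinder misfire patterns).
row = [8902] + [0] * 21
row[5] = 98  # hypothetical placement of the 98 misdiagnosed cycles
acc = class_accuracy(row, 0)
```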

**Figure 18.** Training and testing results by using the third network and data division mode 2\_d. (**a**) Training result. Total accuracy is 97.40%. (**b**) Testing result. Total accuracy is 98.12%. The red boxes mark the accuracies that increase by more than 4%. The magenta boxes mark the accuracies that decrease by more than 4%.


**Table 5.** The best accuracies for different networks under different data division modes.

For each mode, the left column gives the training accuracy and the right column the testing accuracy.

**Figure 19.** Confusion matrices of the results acquired with the third network and data division mode 2\_b. (**a**) Training result. Total accuracy is 99.95%. (**b**) Testing result. Total accuracy is 99.90%.

**Figure 20.** Confusion matrices of the results acquired with the third network and data division mode 2\_c. (**a**) Training result. Total accuracy is 99.98%. (**b**) Testing result. Total accuracy is 99.96%.

### *5.3. Comparison with Similar Research Efforts in the Literature*

Table 6 provides a comparison between the results of this paper and similar research efforts in the literature. Considering different application demands, the researchers conducted their studies on different types of engines. The main differences among these works lie in the selection of engine running speed, running load, misfire types and fault detection algorithms. Although different research objects may lead to different algorithm performance (for example, a four-cylinder engine exhibits clearer fault features than a six-cylinder engine of the same displacement), the detection accuracy still indicates the effectiveness of an algorithm. On the whole, the accuracies reported in Table 6 are all relatively high, and the comparison confirms the good performance of the algorithm utilized in this paper. In addition, many misfire types have been tested in this paper, which means more classification labels are needed; this also demonstrates the effectiveness of the LSTM RNN algorithm. From the point of view of machine learning, evaluating a network is more informative when the training and testing datasets are sampled under different engine speed or load conditions. For example, if the network is trained at 1000 r/min and 1200 r/min and performs well at 1100 r/min, it is reasonable to infer that it will also perform well at 1150 r/min; however, if both training and testing are conducted at 1000 r/min and 1200 r/min, it is hard to evaluate the performance at 1100 r/min or 1150 r/min. Unlike some of the works in Table 6, the algorithm proposed in this paper is tested under engine running conditions different from those used for network training, which further proves its feasibility.


**Table 6.** Comparison of our results with the similar works in the literature.



### **6. Conclusions**

In this paper, an LSTM RNN based approach for engine misfire detection is proposed. The traditional misfire detection method has limitations at high-speed, low-load engine operating conditions. Hence, the traditional method was first applied to the datasets to verify its feasibility, and the reason for its limitation was identified: a single threshold is insufficient to extract the fault feature when the background noise is high. In order to extract the fault features extensively and effectively, and unlike previous works, the LSTM RNN, a powerful technique for sequence signal processing, is utilized to detect misfire. In addition, to ensure the feasibility of the proposed algorithm, two-cylinder misfire faults are tested besides one-cylinder faults, over a wide range of engine speed and load conditions including the high-speed, low-load conditions.

The LSTM RNNs are designed according to the characteristics of the speed signal. Four kinds of input layer structures are designed; these inputs comprise the instantaneous raw speed signal, a fixed segment of the raw speed signal, and the extracted real and imaginary parts of the speed signal. Moreover, five data division modes are attempted to explore the optimal training data size. These training datasets can be categorized into two groups: training data whose running conditions intersect with those of the testing data, and training data whose running conditions do not. The testing results show that the sequence-input, sequence-output LSTM RNN that utilizes raw speed data could not achieve acceptable detection accuracy, whereas the second, third and fourth LSTM RNNs achieved accuracies of more than 98%. The best performance is achieved by the third LSTM RNN with data division mode 2\_c, with a testing accuracy of 99.96%. Meanwhile, the third LSTM RNN with data division mode 2\_b is also recommended, because it combines a relatively high testing accuracy of 99.90% with a small training data size.

In this study, misfire detection is conducted on complete misfire conditions. It is also important that misfire faults be detected when they are not severe. Therefore, in further research, slight misfire faults, including partial misfire, will be utilized to improve the detection sensitivity of the proposed algorithm. In addition, future work will include developing hardware for misfire detection on this engine; the LSTM RNN models developed in this study will then be deployed on the hardware to provide misfire information.

**Author Contributions:** Conceptualization, W.G. and P.Z.; methodology, X.W. and P.Z.; software, P.Z.; validation, W.G.; formal analysis, P.Z.; investigation, P.Z., Y.L., Y.W. and H.P.; resources, X.W. and W.G.; data curation, P.Z.; writing—original draft preparation, P.Z.; writing—review and editing, W.G.; visualization, P.Z.; supervision, W.G.; project administration, X.W. and W.G.; funding acquisition, X.W. and W.G. All authors contributed to this work by collaboration. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors would like to convey special gratitude to the State Key Laboratory of Engine Reliability for sponsoring this research (No. skler-202010).

**Conflicts of Interest:** The authors declare no conflict of interest.

### **Abbreviations**



### **References**

