Article

Intelligent Fault Diagnosis of Marine Diesel Engines Based on Efficient Channel Attention-Improved Convolutional Neural Networks

1
Marine Engineering College, Dalian Maritime University, Dalian 116026, China
2
Dalian Maritime University Smart Ship Limited Company, Dalian 116026, China
*
Author to whom correspondence should be addressed.
Processes 2023, 11(12), 3360; https://doi.org/10.3390/pr11123360
Submission received: 1 November 2023 / Revised: 29 November 2023 / Accepted: 2 December 2023 / Published: 3 December 2023

Abstract

With the rapid development of smart ships, the ship maintenance model is also changing. To extract the fault characteristics of diesel engine thermal parameters more easily, reduce model complexity, and improve model accuracy, a new approach is proposed. First, the traditional convolutional neural network (CNN) is improved, yielding an improved convolutional neural network (ICNN), by using Meta-ACON as the activation function, improved AdamP as the optimizer, and label smoothing regularization (LSR) as the loss function, which enhances the stability of the model. Second, efficient channel attention (ECA) is added to capture global feature information, reduce the complexity of the traditional self-attention module, and enhance the model’s feature extraction ability. Lastly, the accuracy and reliability of the model are verified through ablation and comparison experiments. The accuracy reaches 97.6%, an improvement of 32.1% over the original model, and the robustness of the model is verified by introducing noise. The experimental results demonstrate the applicability of the model in the field of diesel engine fault diagnosis.

1. Introduction

Diesel engines are favored for their excellent fuel economy and high thermal efficiency, which is why most vessels employ diesel engines as their primary propulsion systems. Consequently, the stability and safety of diesel engines become of paramount importance. For marine diesel engines, during prolonged operation, mechanical components are susceptible to degradation due to factors such as material wear, varying loads, temperature fluctuations, and humidity. These influences make key components highly prone to failures, potentially escalating from minor malfunctions to severe breakdowns or even accidents. This can result in reduced efficiency, personnel casualties, and environmental pollution, underscoring the significance of proactive maintenance and fault prevention [1]. From traditional perspectives, ship maintenance has long been considered an area of unnecessary high expenditure, with advanced monitoring methods still awaiting widespread adoption [2]. Whether it is the development of unmanned vessels or the prevailing maritime operational modes of today, the early detection of faults is of paramount importance, whilst the choice of solely relying on expert knowledge falls far short of meeting this demand. In recent years, fault diagnosis methods based on signal analysis, collective intelligence evolution, and machine learning have continued to emerge [3,4,5]. Similarly, some directions have been offered to address the issues encountered in the aforementioned fault diagnosis [6].
Zhong [7] introduced a semi-supervised principal component analysis method (SSPCA) as an alternative to unsupervised learning, leveraging both labeled and unlabeled samples. This approach was applied to diesel engine fault diagnosis, enhancing the diagnostic performance for marine diesel engines. Shyamal Krishna Agrawal [8] proposed a data-driven model for classifying the types and severity of faults occurring in diesel engines in real-world scenarios. This model demonstrated a high level of accuracy. Nipuna Rajapaksha [9] collected acoustic signals and applied fast Fourier transform (FFT). A second-order polynomial kernel function was also utilized to transform input feature vectors into a high-dimensional feature space, consequently enhancing the accuracy of SVM in diesel engine fault diagnosis while reducing the time required. Bedir Ünver [10] and his colleagues opted for the fault tree analysis (FTA) method in a fuzzy environment to conduct a systematic and reliable analysis of crankcase explosion incidents in marine two-stroke diesel engines. The results closely aligned with real-world scenarios. He et al. [11] employed a genetic algorithm to optimize the backpropagation neural network (BP), achieving higher accuracy in diagnosing diesel engine fuel supply system faults compared to a standalone backpropagation neural network. This approach also resulted in shorter training times. Hamed Rezaniaiee Aqdam [12] conducted health monitoring of mooring cables and introduced a damage diagnosis method based on radial basis function (RBF) neural networks. This method, grounded in Rod theory and finite element analysis, employed the sub-structure perturbation (SSP) approach to address uncertainties in boundary conditions. The results indicate that this approach exhibits superior performance.
José Carlos M. Oliveira [13] and his colleagues investigated the application of fault detection and diagnosis (FDD) based on unweighted neural networks in single-variable dynamic systems. The results indicate that FDD exhibits robust generalization. Wang [14] proposed a novel intelligent fault diagnosis framework that combines variational mode decomposition (VMD) with the Rihaczek distribution to obtain time-frequency representations (TFRs) for diesel engines with high concentration properties. Subsequently, a new algorithm for graph-regularized bi-directional non-negative matrix factorization was employed to extract fault features from the TFRs corresponding to different fault models. Chen [15] and colleagues proposed a novel fault diagnosis approach based on the distance and probability graph convolutional network (DPGCN) model, which enhances classification accuracy and stability, particularly in the presence of imbalanced datasets. Yu [16] addressed the issues of cross-term interference in the Wigner–Ville distribution (WVD) and redundancy control in fast correlation-based filter (FCBF) for diesel engine fault recognition. They proposed a novel approach for diesel engine fault identification based on adaptive WVD, improved FCBF, and relevance vector machine (RVM). Cao [17] introduced an optimization-driven approach for improving the training speed and testing accuracy of diesel engine fault state models. This approach is based on an enhanced artificial bee colony (IABC) optimization technique, aimed at addressing the global parameter optimization problem in support vector machine (SVM). Guan [18] and their team designed an innovative deep learning network structure, the random convolutional neural network (RCNN), for the intelligent health monitoring of diesel engines. They validated the effectiveness and superiority of the model. Han [19] aimed for CNNs to generate task-specific features from raw time-series sensor data, followed by classification.
This approach has been validated to effectively detect and isolate propulsion system faults. Zhan [20] introduced a fault analysis approach that combines optimized variational mode decomposition (VMD) with improved convolutional neural networks (CNN) to address the essential requirement of preventive maintenance for diesel engines. Zhou [21] employed recurrence plots (RP) to characterize the nonlinear features of vibration signals. By integrating this approach with convolutional neural networks (CNN), they introduced the RP-CNN method. The results demonstrated that the RP-CNN method effectively identifies four mechanisms: normal state, fatigue wear, abrasive wear, and adhesive wear. Qin [22] introduced a novel multi-scale CNN-LSTM neural network (MSCNN-LSTM-Net) along with its residual CNN denoising module for robust diesel engine misfire diagnosis in noisy environments. Experimental validation of single-cylinder and multi-cylinder diesel engine misfire diagnosis under various noise levels and operational conditions confirmed the effectiveness of this approach. Zhou [23] and colleagues extended an autoencoder generative adversarial network (AE-GAN) for handling imbalanced data. They introduced a data generation and filtering strategy within the AE-GAN framework, leveraging autoencoders to learn features of imbalanced samples while having the discriminator filter out generated samples that do not meet the criteria. Wang [24] and colleagues proposed a comparative diagnostic model that employed fully connected layers as a measure of similarity between pairs of features to determine their classification into specific types. Additionally, regularization techniques were incorporated to enhance performance. Saufi et al. [25] introduced a small-sample fault diagnosis approach based on spectral kurtosis filtering and particle swarm-optimized stacked sparse autoencoders. 
This method achieved high diagnostic accuracy even when the number of training samples for each fault is as low as 100.
Although previous fault diagnosis methods based on diesel engine vibration signals have achieved satisfactory results, the operating conditions of a diesel engine during actual offshore operation are complex and the environment is harsh, so the extraction of vibration fault signals is subject to great interference [26]. Therefore, this paper identifies diesel engine faults based on the engine’s thermal parameters. Xu [27] built a one-dimensional simulation model of a diesel engine based on a real ship, reproducing six typical single-failure and a variety of typical double-failure concomitant phenomena of a diesel engine. Knezevic [28] analyzed a turbocharger by establishing a fault tree analysis method for turbochargers based on the Wärtsilä-Transas model. Zhang [29] conducted research into the fault diagnosis of marine diesel engines. The simulation software GT-Suite was utilized to establish the whole model of diesel engines, a fault diagnosis framework of marine diesel engines based on an adaptive genetic algorithm was constructed, and good results were obtained. Tsitsilonis [30] used the instantaneous engine crankshaft torque as a thermal parameter feature for diesel engine fault diagnosis. He et al. [31] verified the effectiveness of this scheme by specifically identifying the fault states of the turbocharger through the system criticality metrics. Hou [32] achieved an accuracy of 95% for diesel engine cylinder faults using the improved genetic algorithm (GA) and the multi-layer perceptron (MLP). It can be seen that although previous authors have applied thermal parameters to diesel engines, most of them have focused on other modules of the diesel engine rather than the diesel engine itself, and the use of optimization algorithms for model training can greatly increase the time cost. Therefore, this paper proposes a faster and more accurate model for fault state identification in diesel engines.
The summary of the work in this paper is as follows:
  • A new model with more efficient and stronger classification ability is proposed, which adopts 1D-Meta-ACON (activated or inactivated) as the activation function on the basis of traditional CNN, significantly improving the classification ability of the model without increasing the computational complexity of the model. AdamP is adopted as the optimizer of the model, which accelerates the speed of convergence and ensures the convergence effect of the model; in addition, LSR is also introduced as a regularization constraint, thus avoiding the problem of overconfidence of the model.
  • In order to solve the problem of the long-distance dependence of CNN limited by the size of the convolutional kernel, the ECA module algorithm is incorporated, which involves only a small number of parameters and at the same time can bring significant performance gains.
  • Taking a Wärtsilä dual-fuel main engine of model 9L34DF as an example, the model’s high efficiency and classification ability are verified by setting up and extracting fault data. After adding impulse noise and quantization noise, the model can still maintain high accuracy, which verifies the robustness of the model.

2. Theory and Methods

2.1. Convolutional Neural Network (CNN)

Convolutional neural networks (CNNs), one of the numerous types of neural networks, stand out as one of the most effective feature extractors. They play a crucial role in various fields, including image processing, semantic segmentation, object detection, and fault diagnosis [33]. The core concept of convolutional neural networks (CNNs) revolves around feature extraction from input vectors through multiple layers of convolution and pooling operations. Moreover, the classification results are obtained through fully connected layers and the application of the SoftMax function. Let the input be a feature vector as in Equation (1), where $x_i$ represents the $i$th sample, $y_i$ denotes the corresponding label for $x_i$, and $N$ signifies the total number of samples:
$$\{X, Y\} = \{x_i, y_i\}, \quad i = 1, 2, \ldots, N, \tag{1}$$
As in Figure 1, the core part of a CNN is the convolutional layer, and in mainstream CNN-based network architectures, the design of the convolutional parameters is one of the main factors influencing the final performance of the model. The role of the convolutional layer is to extract vector features using a convolution kernel. The convolution kernel slides over the input vectors at a designed step size while performing the operation in Equation (2):
$$y_k^l = f\left(b_k^l + \operatorname{conv1D}\left(W^l, S_i^{l-1}\right)\right), \quad k = 1, 2, \ldots, M^l, \tag{2}$$
$N^{l-1}$ is the number of output neurons of the layer $l-1$ convolution and pooling module; $S_i^{l-1}$ is the $i$th to $(i+n-1)$th input neuron in the layer $l-1$ convolution-pooling; $b_k^l$ and $W^l$ denote the $k$th bias and the weight of the layer $l$ convolution, respectively. $f(x)$ is the activation function. If the convolution padding is zero, $n$ is the convolution kernel size, and the step size is $s_d$, then the output size $M^l$ of the convolution with respect to the input is:
$$M^l = \operatorname{ceil}\left\{\frac{N^{l-1} - n}{s_d}\right\} + 1, \tag{3}$$
where $\operatorname{ceil}(x)$ means upward rounding, and when $s_d = 1$, the pooling layer output of the $l$th convolution-pooling layer is Equation (4), where $ds$ is downward sampling.
$$S^l = ds\left(Y^l\right), \tag{4}$$
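Equations (2) to (4) can be illustrated with a minimal pure-Python sketch. The helper names (`conv1d_valid`, `output_length`, `downsample`), the ReLU activation, and max pooling with a window of 2 are assumptions made for illustration only; this section does not fix the pooling operator.

```python
import math


def relu(x):
    """Stand-in for the activation function f(x) in Eq. (2)."""
    return max(0.0, x)


def conv1d_valid(signal, kernel, bias=0.0, stride=1, activation=relu):
    """Eq. (2): slide `kernel` over `signal` without padding, then activate."""
    n = len(kernel)
    out = []
    for start in range(0, len(signal) - n + 1, stride):
        window = signal[start:start + n]
        pre_activation = bias + sum(w * s for w, s in zip(kernel, window))
        out.append(activation(pre_activation))
    return out


def output_length(n_in, n, stride):
    """Eq. (3) as written in the text: ceil((N - n) / s_d) + 1."""
    return math.ceil((n_in - n) / stride) + 1


def downsample(features, window=2):
    """Eq. (4): non-overlapping max pooling (assumed pooling operator)."""
    return [max(features[i:i + window])
            for i in range(0, len(features) - window + 1, window)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = conv1d_valid(signal, kernel=[1.0, 0.0, -1.0], bias=3.0)
s = downsample(y)
print(y, s)
```

For a stride of 1, as in this example, the number of outputs agrees with Equation (3). For larger strides the loop above yields floor((N - n)/s_d) + 1 complete windows, so the two counts can differ by one.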

2.2. Efficient Channel Attention (ECA)

ECA is an improved version of SE-Net (squeeze-and-excitation network), as shown in Figure 2, which removes the original fully connected layers and replaces them with a $1 \times 1 \times c$ convolutional kernel, as convolution has the recognized ability of capturing information across channels [34]. The reduced parameter count of ECA makes the model lighter and the training time shorter. The convolutional features of size $[H, W, C]$ are first aggregated by global average pooling without dimensionality reduction; a one-dimensional convolution with adaptive kernel size $k$, which is the coverage of cross-channel interactions, is then performed, followed by a Sigmoid function that learns the channel weights. The adaptive one-dimensional convolution kernel size is computed as in Equation (5), where $|\cdot|_{odd}$ denotes the nearest odd number, so that $k$ is odd:
$$k = \psi(C) = \left| \frac{\log_2 C}{\gamma} + \frac{b}{\gamma} \right|_{odd}, \tag{5}$$
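A minimal sketch of the ECA computation follows, written in pure Python for a 1D feature map stored as a list of channels. The defaults `gamma=2` and `b=1`, the rounding rule (truncate, then move even values up to the next odd number), and the hand-supplied `conv_weights` are assumptions for illustration; in a trained model the convolution weights are learned.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Eq. (5): k = |log2(C)/gamma + b/gamma|_odd (assumed rounding rule)."""
    t = int(abs(math.log2(channels) / gamma + b / gamma))
    return t if t % 2 == 1 else t + 1


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def eca(feature_map, conv_weights):
    """Apply channel attention to `feature_map` (list of channels, each a list)."""
    k = len(conv_weights)
    pad = k // 2
    # 1) Global average pooling: one descriptor per channel.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    # 2) 1D convolution across neighbouring channels, zero-padded at both ends.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [sum(w * padded[c + j] for j, w in enumerate(conv_weights))
             for c in range(len(pooled))]
    # 3) Sigmoid gate per channel, 4) rescale the original features.
    gates = [sigmoid(m) for m in mixed]
    return [[g * value for value in channel]
            for g, channel in zip(gates, feature_map)]


k = eca_kernel_size(20)
out = eca([[1.0, 1.0], [2.0, 2.0]], conv_weights=[0.0, 0.0, 0.0])
print(k, out)
```

With all-zero convolution weights every gate equals 0.5, so each channel is simply halved; this makes the gating step easy to check by hand.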

2.3. Model Framework

Artificial intelligence has been introduced into diesel engine fault diagnosis methods and has obtained improved results [35], but the exploration of CNN has not been explored on a deeper level under the condition of ship operation state data and small sample data. For this reason, this paper proposes an intelligent diagnosis method of ECA-ICNN, as shown in Figure 3.
The data with a batch size of 32 and a dimension of $1 \times 1 \times 23$ are input into the model, and after two convolution-pooling stages, in which the activation function is changed to Meta-ACON, a tensor with a feature number of 20 and a sequence length of 4 is obtained. Then, after the learning of the ECA module and global flat pooling, the full connection is carried out and the final classification result is output.
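As a rough consistency check, the sequence lengths implied by Equation (3) can be traced through two convolution-pooling stages. The kernel size of 3, stride of 1, and pooling window of 2 used below are assumptions, chosen only because they reproduce the stated reduction from 23 input parameters to a sequence length of 4; they are not taken from the model configuration.

```python
import math


def conv_length(n_in, kernel, stride=1):
    """Output length of an unpadded convolution, following Eq. (3)."""
    return math.ceil((n_in - kernel) / stride) + 1


def pool_length(n_in, window=2):
    """Output length of non-overlapping pooling (assumed operator)."""
    return n_in // window


def trace_lengths(n_in, stages):
    """Return the sequence length after every convolution and pooling step."""
    lengths = [n_in]
    for kernel, window in stages:
        n_in = conv_length(n_in, kernel)
        lengths.append(n_in)
        n_in = pool_length(n_in, window)
        lengths.append(n_in)
    return lengths


# Hypothetical settings: (kernel size, pooling window) for each stage.
lengths = trace_lengths(23, [(3, 2), (3, 2)])
print(lengths)  # [23, 21, 10, 8, 4]
```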

2.4. Label Smoothing Regularization (LSR)

Using cross entropy to calculate the loss during small-sample fault diagnosis training only takes into account the loss at the correct label position of each training sample and ignores the loss at the incorrect label positions, which leads to overfitting. To avoid this, LSR is used in this paper to improve the diagnostic ability of the model. LSR is advantageous in small-sample fault diagnosis because it replaces the real labels with smoothed labels, which alleviates model over-confidence. The relationship between the cross-entropy loss ($CE$, $l_0$) and the LSR loss ($l$) can be expressed as Equation (6):
$$l = -\sum_{k=1}^{K} \log(p(k))\, q'(k) = -\sum_{k=1}^{K} \log(p(k)) \left[ (1-\varepsilon)\, q(k) + \frac{\varepsilon}{K} \right] = (1-\varepsilon) \left[ -\sum_{k=1}^{K} \log(p(k))\, q(k) \right] + \varepsilon \left[ -\sum_{k=1}^{K} \frac{\log(p(k))}{K} \right] = (1-\varepsilon)\, l_0 + \varepsilon \left[ -\sum_{k=1}^{K} \frac{\log(p(k))}{K} \right], \tag{6}$$
$p(k)$ is the predictive distribution, $q(k)$ is the true distribution, $q'(k)$ is the true distribution after label smoothing, $\varepsilon$ is the smoothing coefficient, $K$ is the number of categories, and the label prior is taken to be the uniform distribution $\mu(k) = 1/K$.
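The decomposition in Equation (6) can be checked numerically with a short sketch. The predicted distribution `p`, the one-hot label `q`, and the value `eps = 0.1` are made-up illustrative inputs, not values from the experiments.

```python
import math


def cross_entropy(p, q):
    """-sum_k q(k) * log p(k) for a predicted distribution p and target q."""
    return -sum(qk * math.log(pk) for pk, qk in zip(p, q))


def smooth_labels(q, eps):
    """q'(k) = (1 - eps) * q(k) + eps / K, using a uniform prior 1/K."""
    K = len(q)
    return [(1 - eps) * qk + eps / K for qk in q]


def lsr_loss(p, q, eps):
    """Cross entropy against the smoothed labels (left-hand side of Eq. (6))."""
    return cross_entropy(p, smooth_labels(q, eps))


# Illustrative five-class example (hypothetical numbers).
p = [0.7, 0.1, 0.1, 0.05, 0.05]
q = [1, 0, 0, 0, 0]
eps = 0.1

l0 = cross_entropy(p, q)
uniform = [1 / len(p)] * len(p)
l = lsr_loss(p, q, eps)
decomposed = (1 - eps) * l0 + eps * cross_entropy(p, uniform)
print(l0, l, decomposed)
```

Because the second term penalizes low probabilities on every class, the smoothed loss is larger than the plain cross entropy for a confident prediction, which is the mechanism that discourages over-confidence.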

2.5. META-ACON

ACON (activate or not), as shown in Figure 4, is a method that achieves a form of control over activation, whether linear or non-linear, by simply maintaining a switch factor. In ACON, there are various variants, including ACON-A, ACON-B, and ACON-C. The Meta-ACON mentioned in this paper is an effective activation function based on ACON-C [36].
Initially, two learnable parameters, $p_1$ and $p_2$, are set. Then, based on an input feature vector $x$ averaged over its $D$ dimensions, two $1 \times 1$ convolutions ($C_1$ and $C_2$) are applied. Finally, a sigmoid function is used to produce a value between 0 and 1, determining whether activation should occur, as in Equations (7) and (8):
$$\beta = \sigma\left( C_1 C_2 \frac{1}{D} \sum_{d=1}^{D} x_d \right), \tag{7}$$
$$f_{ACON}(x) = (p_1 - p_2)\, x \cdot \sigma\left( \beta (p_1 - p_2)\, x \right) + p_2 x, \tag{8}$$
Ultimately, when $\beta \to \infty$, $f_{ACON}(x) \to \max(p_1 x, p_2 x)$.
When $\beta \to 0$, $f_{ACON}(x) \to \operatorname{mean}(p_1 x, p_2 x)$.
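The two limiting cases can be verified with a small scalar sketch of Equations (7) and (8). In `meta_beta` the two $1 \times 1$ convolutions are replaced by hypothetical scalar weights `c1` and `c2`, which is a simplification for illustration; the parameter values used below are arbitrary.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x, p1, p2, beta):
    """Eq. (8): (p1 - p2) x * sigmoid(beta (p1 - p2) x) + p2 x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def meta_beta(x_vec, c1=1.0, c2=1.0):
    """Eq. (7) with scalar stand-ins c1, c2 for the two 1x1 convolutions."""
    return sigmoid(c1 * c2 * sum(x_vec) / len(x_vec))


p1, p2 = 1.0, 0.25
large = acon_c(2.0, p1, p2, beta=1e3)       # approaches max(p1*x, p2*x)
large_neg = acon_c(-2.0, p1, p2, beta=1e3)  # also the max, for negative x
zero = acon_c(2.0, p1, p2, beta=0.0)        # equals mean(p1*x, p2*x)
print(large, large_neg, zero)
```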

2.6. Improved AdamP

There are a large number of weights and parameters that need to be adjusted in a CNN to minimize the loss function. Therefore, the choice of optimizer is crucial to the model’s convergence speed and is key to improving the model. In this paper, an optimized version of AdamP [37] is chosen. Traditional Adam employs a fixed L2 regularization term (parameter decay) to control the parameter norm, which may lead to excessive weight decay in certain situations. AdamP instead introduces an adaptive weight decay mechanism that dynamically adjusts the weight decay magnitude. Furthermore, when the gradients in Adam become excessively large, they can cause the model parameters to deviate significantly from the original parameter space, thus affecting the model’s performance. AdamP, on the other hand, uses cosine similarity to determine whether the gradient needs to be projected. Lastly, instability in learning rates can also arise in certain non-convex optimization problems, affecting the model’s convergence speed. Therefore, AMSGrad has been introduced to address this issue. This improvement maintains a historical maximum second-order estimate (maximum variance) from all past iterations, ensuring that the model’s learning rate does not increase indefinitely and enhancing the stability of the model, as in Figure 5. Let the objective function to be optimized be $f(w)$; the gradient at time step $i$ can then be expressed as Equation (9), and $g_i$ is substituted into the first-order moment estimation vector $m_i$ and the second-order moment estimation vector $v_i$ maintained by AdamP, where $m_i$ is the exponential moving average (EMA) of the gradient, which is used to estimate the mean of the gradient, while $v_i$ is the EMA of the square of the gradient, which is used to estimate the variance of the gradient. They are updated as in Equations (10) and (11):
$$g_i \leftarrow \nabla_w f_i(w_i), \tag{9}$$
$$m_i \leftarrow \beta_1 m_{i-1} + (1 - \beta_1)\, g_i, \tag{10}$$
$$v_i \leftarrow \beta_2 v_{i-1} + (1 - \beta_2)\, g_i^2, \tag{11}$$
where $\beta_1$ and $\beta_2$ are the decay rates, which are given input parameters. According to AMSGrad, it is necessary to keep the maximum variance of all iterations in the history, as in Equation (12). The update result for the momentum $p$ then follows from Equation (13):
$$\hat{v}_i \leftarrow \max(\hat{v}_{i-1}, v_i), \tag{12}$$
$$p_i \leftarrow m_i / \left(\sqrt{\hat{v}_i} + \varepsilon\right), \tag{13}$$
$\varepsilon$ is an extremely small positive value introduced to prevent division by zero. For each iteration, the cosine similarity between the gradient and the weight is computed. If the cosine similarity is less than the specified threshold $\delta$ divided by the square root of the weight dimension, as shown in Equation (14), the update step $q_i$ is equal to the projection of the momentum vector $p_i$; otherwise, it is equal to the momentum $p_i$ itself. The final updated weight is defined in Equation (15), where $\eta$ is the learning rate:
$$\cos(w_i, g_i) < \delta / \sqrt{\dim(w)}, \tag{14}$$
$$w_i \leftarrow w_{i-1} - \eta\, q_i \tag{15}$$
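One update step following Equations (10) to (15) might be sketched as below for a single weight vector. This is an illustrative reading of the equations as written, not the authors’ implementation: the function name `iadamp_step`, the default hyperparameters, the use of the absolute value in the cosine similarity, and the choice to project onto the plane orthogonal to the weight vector are assumptions; bias correction and weight decay are omitted because they do not appear in the equations above.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def iadamp_step(w, g, m, v, v_hat, lr=0.1, beta1=0.9, beta2=0.999,
                eps=1e-8, delta=0.1):
    """One AdamP-style step with an AMSGrad maximum on the second moment."""
    # Eqs. (10)-(11): exponential moving averages of g and g^2.
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Eq. (12): keep the historical maximum of the second moment.
    v_hat = [max(a, b) for a, b in zip(v_hat, v)]
    # Eq. (13): momentum-style update direction.
    p = [mi / (math.sqrt(vh) + eps) for mi, vh in zip(m, v_hat)]
    # Eq. (14): project only when w and g are nearly orthogonal.
    cos = abs(dot(w, g)) / (norm(w) * norm(g) + eps)
    if cos < delta / math.sqrt(len(w)):
        w_norm = norm(w) + eps
        w_unit = [wi / w_norm for wi in w]
        radial = dot(w_unit, p)
        # Remove the component of p that points along w.
        q = [pi - radial * ui for pi, ui in zip(p, w_unit)]
    else:
        q = p
    # Eq. (15): apply the step.
    w = [wi - lr * qi for wi, qi in zip(w, q)]
    return w, m, v, v_hat


# Gradient orthogonal to the weight: the projection branch is taken.
w_proj, _, _, vh_proj = iadamp_step([1.0, 0.0], [0.0, 1.0],
                                    m=[1.0, 0.0], v=[1.0, 1.0], v_hat=[1.0, 1.0])
# Gradient parallel to the weight: the plain update is used.
w_plain, _, _, _ = iadamp_step([1.0, 0.0], [1.0, 0.0],
                               m=[0.0, 0.0], v=[1.0, 1.0], v_hat=[1.0, 1.0])
print(w_proj, w_plain)
```

In the first call the accumulated momentum has a large component along the weight vector, and the projection removes it, so only the component orthogonal to the weight is applied.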

2.7. Diesel Engine Fault Diagnosis Process

The flow of the diesel engine fault diagnosis algorithm for ECA-ICNN is shown in Figure 6 with the following steps:
  • Data Preprocessing: Normalize and preprocess the relevant thermodynamic parameter data obtained from diesel engine model simulations.
  • Furthermore, the entire dataset is randomly divided into three parts at a ratio of 7:1:2, for the purposes of model training, validation, and testing.
  • Model Parameter Optimization: Hyperparameters are manually adjusted with the aim of reducing diagnostic time and accelerating model training speed, while preventing issues such as gradient explosions or vanishing.
  • Input model: the divided training set samples and labels are fed into the ECA-ICNN model for training.
  • Save the model: the trained model is saved, validated with a test set of samples, and the diagnostic results are output.
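The preprocessing and splitting steps above can be sketched as follows. The text does not specify the normalization scheme, so per-feature min-max scaling is an assumption, as are the helper names and the fixed random seed.

```python
import random


def min_max_normalize(rows):
    """Scale every feature (column) to [0, 1]; constant columns map to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(x - lo) / (hi - lo) if hi > lo else 0.0
             for x, lo, hi in zip(row, lows, highs)]
            for row in rows]


def split_dataset(samples, ratios=(7, 1, 2), seed=0):
    """Shuffle and split into training, validation, and test subsets."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_val = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


normalized = min_max_normalize([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
train, val, test = split_dataset(list(range(10)))
print(normalized, len(train), len(val), len(test))
```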

3. Experiment and Results

3.1. Model of Diesel Engine and Fault Simulation

Model diagnostics were applied to a dual-fuel mother ship with a Wärtsilä 9L34DF main engine. Because inducing hazardous and destructive faults on a real ship is costly [38], a specialized diesel engine simulation software, AVL-BOOST R2020.1, was employed in this paper for the construction of the diesel engine model and the extraction of fault data. The main performance parameters of the diesel engine are shown in Table 1. MCR stands for maximum continuous rating.
The diesel engine simulation model is shown in Figure 7. In Figure 7, E1 represents the constructed 9L34DF diesel engine system, which includes SB1 and SB2 as the boundaries of the entire system; CL1 is the air filter; TC1 is the turbocharger, where C stands for the compressor, and T for the turbine; CO1 is the air cooler; PL1 is the intake manifold; C1–C9 are the nine cylinders of the diesel engine; PL2 is the exhaust manifold; MP1-MP24 are the corresponding checkpoints arranged, and the black lines represent the corresponding pipelines.
Based on the model construction, under diesel engine calibration conditions, simulated operational parameters obtained using AVL-BOOST R2020.1 software were compared with the data from the actual ship’s dynamometer report to validate the model’s reliability. The results are shown in Table 2. The measured values are the reliable data obtained from the bench test report presented by the diesel engine factory, and the role here is only to provide a theoretical basis for the reliability of the simulation model. The simulation value is the result calculated by the software through the set parameters of the diesel engine.

3.2. Faulty Dataset Extraction

The simulation results show that the errors between the main simulated parameters and the measured data are within 3%, which meets the requirements of fault simulation. Therefore, the following diesel engine operating states were simulated in this paper: valve clearance increase (F1), injection timing delay (F2), supercharger inefficiency (F3), air cooler efficiency decrease (F4), and normal state (F0). Each individual fault is further divided into four classes (LV1–LV4) according to its severity. The specific simulation scheme is shown in Table 3:
After conducting simulated runs under the aforementioned fault parameter configurations, it can be observed that the thermal parameters within the diesel engine exhibit varying fluctuations under different fault conditions. Therefore, in order to efficiently analyze the relationship between the internal thermal parameters of the diesel engine and its faults, and to achieve instant monitoring and diagnosis of diesel engine faults, a total of 23 elements of monitoring data were selected, and the specific detection indexes were collected, as shown in Table 4. A total of 400 sets of data are simulated, of which 280 are in the training set, 25 are in the validation set and 95 are in the test set.

3.3. Result

The above diesel engine fault diagnosis model was implemented in the PyTorch framework. The experiments were performed on a computer configured with an AMD 5800H processor (Advanced Micro Devices, Inc., Sunnyvale, CA, USA) at 3.2 GHz, 16 GB of RAM and an NVIDIA GeForce RTX 3060 (NVIDIA Corporation, Santa Clara, CA, USA) graphics processor. The specific parameters of the models are provided in Table 5 below.

3.3.1. Evaluation of Models

The shuffled test set was fed into the model, and 10 independent repetitions of the experiment were used to assess the diagnostic capability of the model; selected results are shown in Figure 8. Here, the x-axis is the index of the test samples in the test set, with 19 samples per class, totaling 95 samples, and the y-axis is the five class labels predicted by the model. More details are shown in Table 6. It can be concluded that the delayed injection timing and the reduced efficiency of the supercharger were not judged accurately enough. The reason is that delayed fuel injection timing causes severe afterburning, which results in lower peak pressure and peak temperature and leads to an increase in exhaust gas temperature. Other faults can show the same characteristics, which interferes with the diagnostic accuracy: a decrease in the efficiency of the supercharger can also lead to a decrease in the maximum burst pressure, an increase in fuel consumption, and a decrease in power, among other effects, which likewise hampers model classification. However, the final classification results still maintain a fairly high accuracy, and it can be concluded that the model is reliable.

3.3.2. Impact of Hyperparameters

In order to exploit the performance of the model and optimize it, the learning rate and batch size are fine-tuned in this paper, as shown in Table 7. The number of training rounds was set to 50, the learning rate was adjusted from 0.01 to 0.1, and the training batch was set to 8, 16, 32, 64, and 128 for 10 repetitions of independent experiments, respectively. The experimental results show that when the training batch is 128 and the learning rate is 0.01 or 0.1, the model will start to converge at rounds 23 and 19, respectively, and the rest of the model results do not converge. The rest of the models started converging at around rounds 2, 4, 7 and 12 with training times around 80, 60, 25 and 20 s for the corresponding batch sizes. Finally, the highest accuracy of 97.58% was obtained after model convergence at a learning rate of 0.003 and a batch size of 32, as shown in Figure 9.

3.3.3. Ablation Experiment

To verify the diagnostic performance, the dataset was fed into CNN, ACON-CNN and ECA-CNN for comparison using ablation experiments.
CNN, ECA-CNN, and ACON-CNN all share the same backbone network parameters. Except for ACON-CNN and ECA-ICNN, all models use the non-saturating activation function ReLU, as it is linear for positive inputs, leading to faster convergence and computation speed. The fault diagnosis accuracy and loss of the test set under different model structures are shown in Figure 10. It can be seen that the original CNN model is very insensitive to the fault data, with less than 70% accuracy, whereas replacing the activation function with 1D-META-ACON or adding the ECA module to the CNN model yields a significant increase in accuracy. This is because 1D-META-ACON is an activation function that adapts dynamically, and this adaptivity deals with different features more accurately, which in turn improves learning and performance. In addition, the introduction of the attention mechanism can solve the long-distance dependence problem of CNNs: the convolution operation of traditional CNNs can only focus on a local region, and the ECA module addresses this limitation effectively.
As shown in Figure 11, in this paper, a comparison is made with other more mainstream activation functions, and the effect of 1D-META-ACON enhancement is very obvious. This is because although the other functions have their own advantages, such as the computational simplicity of ReLU, and the non-monotonicity provided by Mish and Swish, they are static and do not adjust dynamically according to the input data. Thanks to this property, 1D-META-ACON manages the gradient flow more efficiently and reduces the problem of vanishing or exploding gradients. ReLU, on the other hand, is more likely to cause the problem of disappearing gradients because of the characteristic that the gradient is zero when it is negative. In addition, 1D-META-ACON has low computational complexity; it simply generates beta parameters through a small network.
After adding the ECA module, the model is able to effectively extract features from the data. ECA was compared with two other methods: the self-attention mechanism (SE-Net) and its improved version called coordinate attention, as shown in Table 8. Although there is no significant difference in accuracy, it is clear that the training time is greatly reduced with the addition of the ECA module, resulting in a significant improvement in efficiency. This is because avoiding dimensionality reduction is important for learning channel attention, and proper cross-channel interaction can maintain performance while significantly reducing model complexity. The ECA module’s local cross-channel interaction strategy, which does not require dimensionality reduction, can be efficiently implemented with 1D convolution. In addition, its adaptive selection of the 1D convolution kernel size is used to determine the coverage of local cross-channel interactions.

3.3.4. Robustness of the Model

Regarding the model’s robustness, when collecting electromechanical data on ships, there is a high probability that the diagnosis results will be affected to some extent by factors such as machine vibrations, aging of instrumentation equipment, and faults in the sensors themselves. In response to the above factors, this paper added noise to the input data and decided to carry out the experiments in two groups according to the characteristics of the diesel engine’s working environment. Firstly, simulated instrument data were validated by introducing pulse noise. In this process, all data were selected, and irregular pulse signals were randomly added based on their own percentage to simulate oscillations in the instrument data. According to expert experience, the pointer swing amplitude during actual ship navigation is approximately around twenty percent. In this study, the pulse intensity was increased to fifty percent to simulate and explore the model’s potential, as shown in Figure 12a. The second group simulates the characteristics of data displayed on the engine control room console and introduces quantization noise. Since it involves discretizing a continuous signal into a finite number of values, it is simulated by rounding the continuous signal to the nearest discrete value. Due to the small error associated with these types of data, in this paper, the noise intensity is set to 1–5%. The results are shown in Figure 12b. According to the experimental results, the ECA-ICNN model can reach more than ninety percent accuracy regardless of the interference of impulse noise or quantization noise, maintains a high diagnostic accuracy, and shows convincing robustness.
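The two noise models described above could be simulated along the following lines. This is only one plausible reading of the description: perturbing every reading by a random fraction of its own value (up to the chosen intensity) for impulse noise, and rounding to a grid whose step is the intensity times the data range for quantization noise, are both assumptions, as are the function names.

```python
import random


def add_impulse_noise(values, intensity, rng):
    """Perturb each reading by a random fraction of itself, up to +/- intensity."""
    return [v * (1.0 + rng.uniform(-intensity, intensity)) for v in values]


def add_quantization_noise(values, intensity):
    """Round each reading to the nearest level of a grid with step
    intensity * (max - min), mimicking a coarse console display."""
    lo, hi = min(values), max(values)
    step = intensity * (hi - lo)
    if step == 0:
        return list(values)
    return [lo + round((v - lo) / step) * step for v in values]


rng = random.Random(0)
readings = [100.0, 200.0]
noisy = add_impulse_noise(readings, intensity=0.5, rng=rng)

signal = [0.0, 0.26, 0.5, 1.0]
quantized = add_quantization_noise(signal, intensity=0.05)
print(noisy, quantized)
```

Under this reading, the quantization error of any sample is at most half a grid step, which is consistent with the small errors the text associates with console data.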

3.3.5. Optimizer Comparison Experiments

For optimizer selection, IAdamP was compared with six conventional optimizers in terms of the number of training epochs (with early stopping), the total training time, the average time per epoch, the accuracy, and the training loss, as shown in Table 9. In these experiments, AdamP, AdamW, NAdam, Adadelta, and SGD all stopped training early because the gradient vanished prematurely, and the traditional Adam was also slightly less effective than IAdamP. This is because IAdamP introduces a projection term that projects the parameters back to their initial weight decay state, thus improving the model's performance. At the same time, it retains the adaptive learning rate adjustment based on weight decay from the Adam optimizer, making the model easier to train. Finally, it uses cosine similarity to measure the similarity between gradients and weights, which provides reasonable control of the weight update direction and helps the model avoid becoming stuck in local minima and saddle points.
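The cosine-similarity test and the projection can be illustrated with the corresponding step of the original AdamP optimizer (Heo et al.). This is a sketch of that published step only, not of the authors' IAdamP, whose modifications are not reproduced here. The threshold δ = 0.1 is the AdamP default; the function names and the example vectors are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b, eps=1e-8):
    """Absolute cosine similarity between two vectors."""
    return abs(dot(a, b)) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + eps)


def project_update(weights, grad, update, delta=0.1, eps=1e-8):
    """Remove the radial part of `update` when weights and gradient are near-orthogonal."""
    # Near-orthogonality of weights and gradient signals a scale-invariant weight.
    if cosine_similarity(weights, grad) < delta / math.sqrt(len(weights)):
        scale = dot(weights, update) / (dot(weights, weights) + eps)
        # Keep only the component of the update that is tangent to the weight vector.
        return [u - scale * w for u, w in zip(update, weights)]
    return list(update)


w = [3.0, 4.0]
step = [1.0, 1.0]
projected = project_update(w, grad=[-4.0, 3.0], update=step)  # gradient orthogonal to w
unchanged = project_update(w, grad=[3.0, 4.0], update=step)   # gradient parallel to w
```

In the first call the update loses its component along `w`, so the step no longer inflates the weight norm; in the second call the cosine similarity is close to 1 and the update passes through unmodified.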

4. Discussion

Taking a diesel engine simulation model as an example, an improved CNN model is proposed for fault diagnosis. It achieves an accuracy of 97.6%, which is 32.1% higher than that of the original model. The 1D-Meta-ACON activation function effectively improves the model's performance: its self-adaptive mechanism gives the CNN model greater flexibility, increases the accuracy by 20% compared with the original model, and outperforms the other activation functions. The ECA module addresses the long-range dependency problem that plagues CNN models, effectively captures the interconnections between different parts of the input, and improves the accuracy by 11.2% compared with the original model. Because the ECA module generates channel attention through a fast one-dimensional convolution whose kernel size is adaptively determined by a nonlinear mapping of the channel dimension, its training time is reduced to 24.87 s, an improvement of 35.6% and 26.6% over the original self-attention module and the coordinate self-attention module, respectively. Finally, when noise of different types and intensities is added to the data, the accuracy still exceeds 90%, which verifies the robustness of the model.
In summary, this work offers two advantages for diesel engine fault diagnosis applications. Firstly, the proposed ECA-ICNN model achieves good fault diagnosis accuracy for diesel engines. Secondly, multiple ship engine room systems can be combined for large-scale system condition assessment.
However, some shortcomings and challenges remain. For example, it is still open whether a more lightweight convolutional neural network should be used as the feature extractor, which would further shorten the training time, and how to learn from imbalanced data, which are the only data that can be obtained for the thermal parameters of a diesel engine in real ship operation.

Author Contributions

Conceptualization, J.W. and H.C.; methodology, J.W.; software, K.J.; validation, J.W., H.C. and Z.C.; formal analysis, H.C.; investigation, Z.C.; resources, K.J.; data curation, Z.A.; writing—original draft preparation, J.W.; writing—review and editing, J.W., H.C. and Z.A.; visualization, K.J.; supervision, H.C.; project administration, H.C.; funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Development of Ship Operation Condition Monitoring and Simulation Platform, grant number 1638882993269; Research and Application of Smart Ship Digital Twin Information Platform, grant number 2022JH1/10800097.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to the parameters of the engine model being private.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Zhang, X.; He, C.; Lu, Y.; Chen, B.; Zhu, L.; Zhang, L. Fault diagnosis for small samples based on attention mechanism. Measurement 2022, 187, 110242. [Google Scholar] [CrossRef]
  2. Lazakis, I.; Turan, O.; Aksu, S. Increasing ship operational reliability through the implementation of a holistic maintenance management strategy. Ships Offshore Struct. 2010, 5, 337–357. [Google Scholar] [CrossRef]
  3. Zhang, S.; Zhang, S.B.; Wang, B.N.; Habetler, T.G. Deep Learning Algorithms for Bearing Fault Diagnosticsx—A Comprehensive Review. IEEE Access 2020, 8, 29857–29881. [Google Scholar] [CrossRef]
  4. Luo, H.; He, C.; Zhou, J.; Zhang, L. Rolling bearing sub-health recognition via extreme learning machine based on deep belief network optimized by improved fireworks. IEEE Access 2021, 9, 42013–42026. [Google Scholar] [CrossRef]
  5. Ke, Y.; Yao, C.; Song, E.Z.; Dong, Q.; Yang, L.P. An early fault diagnosis method of common-rail injector based on improved CYCBD and hierarchical fluctuation dispersion entropy. Digit. Signal Process 2021, 114, 103049. [Google Scholar] [CrossRef]
  6. Zhao, Z.B.; Li, T.F.; Wu, J.Y.; Sun, C.; Wang, S.B.; Yan, R.Q.; Chen, X.F. Deep learning algorithms for rotating machinery intelligent diagnosis: An open source benchmark study. ISA Trans. 2020, 107, 224–255. [Google Scholar] [CrossRef] [PubMed]
  7. Zhong, K.; Li, J.B.; Wang, J.; Han, M. Fault Detection for Marine Diesel Engine Using Semi-supervised Principal Component Analysis. In Proceedings of the 2019 9th International Conference on Information Science and Technology (ICIST), Hulunbuir, China, 2–5 August 2019; pp. 146–151. [Google Scholar]
  8. Agrawal, S.K.; Banerjee, S.; Sinha, A.; Das, D. SafeEngine: Fault Detection with Severity Prediction for Diesel Engine. In Proceedings of the 2022 IEEE 10th Region 10 Humanitarian Technology Conference (R10-HTC), Hyderabad, India, 16–18 September 2022; pp. 216–220. [Google Scholar]
  9. Rajapaksha, N.; Jayasinghe, S.; Enshaei, H.; Jayarathne, N. Sensitivity analysis of SVM kernel functions in machinery condition classification. In Proceedings of the 2021 IEEE Southern Power Electronics Conference (SPEC), Kigali, Rwanda, 6–9 December 2021; pp. 1–10. [Google Scholar]
  10. Unver, B.; Gurgen, S.; Sahin, B.; Altin, I. Crankcase explosion for two-stroke marine diesel engine by using fault tree analysis method in fuzzy environment. Eng. Fail. Anal. 2019, 97, 288–299. [Google Scholar] [CrossRef]
  11. He, J.; Li, X.; Zhao, Y. The fault diagnosis of diesel fuel supply system based on BP neural network optimized by genetic algorithm. J. Phys. Conf. Ser. 2021, 1732, 12065. [Google Scholar] [CrossRef]
  12. Aqdam, H.R.; Ettefagh, M.M.; Hassannejad, R. Health monitoring of mooring lines in floating structures using artificial neural networks. Ocean. Eng. 2018, 164, 284–297. [Google Scholar] [CrossRef]
  13. Oliveira, J.C.M.; Pontes, K.V.; Sartori, I.; Embiruçu, M. Fault detection and diagnosis in dynamic systems using weightless neural networks. Expert Syst. Appl. 2017, 84, 200–219. [Google Scholar] [CrossRef]
  14. Wang, X.; Cai, Y.P.; Li, A.H.; Zhang, W.; Yue, Y.J.; Ming, A.B. Intelligent fault diagnosis of diesel engine via adaptive VMD-Rihaczek distribution and graph regularized bi-directional NMF. Measurement 2021, 172, 108823. [Google Scholar] [CrossRef]
  15. Wang, R.H.; Chen, H.; Guan, C. DPGCN Model: A Novel Fault Diagnosis Method for Marine Diesel Engines Based on Imbalanced Datasets. IEEE Trans. Instrum. Meas. 2023, 72, 228002. [Google Scholar] [CrossRef]
  16. Liu, Y.; Zhang, J.H.; Ma, L. A fault diagnosis approach for diesel engines based on self-adaptive WVD, improved FCBF and PECOC-RVM. Neurocomputing 2016, 177, 600–611. [Google Scholar] [CrossRef]
  17. Cao, H.; Zhang, J.D.; Cao, X.; Li, R.; Wang, Y.R. Optimized SVM-Driven Multi-Class Approach by Improved ABC to Estimating Ship Systems State. IEEE Access 2020, 8, 206719–206733. [Google Scholar] [CrossRef]
  18. Wang, R.H.; Chen, H.; Guan, C. Random convolutional neural network structure: An intelligent health monitoring scheme for diesel engines. Measurement 2021, 171, 108786. [Google Scholar] [CrossRef]
  19. Han, P.; Li, G.; Skulstad, R.; Skjong, S.; Zhang, H. A deep learning approach to detect and isolate thruster failures for dynamically positioned vessels using motion data. IEEE Trans. Instrum. Meas. 2020, 70, 1–11. [Google Scholar] [CrossRef]
  20. Zhan, X.B.A.; Bai, H.J.; Yan, H.; Wang, R.C.; Guo, C.M.; Jia, X.S. Diesel Engine Fault Diagnosis Method Based on Optimized VMD and Improved CNN. Processes 2022, 10, 2162. [Google Scholar] [CrossRef]
  21. Zhou, Y.K.; Wang, Z.Y.; Zuo, X.; Zhao, H. Identification of wear mechanisms of main bearings of marine diesel engine using recurrence plot based on CNN model. Wear 2023, 520, 204656. [Google Scholar] [CrossRef]
  22. Qin, C.J.; Jin, Y.R.; Zhang, Z.N.; Yu, H.G.; Tao, J.F.; Sun, H.; Liu, C.L. Anti-noise diesel engine misfire diagnosis using a multi-scale CNN-LSTM neural network with denoising module. CAAI Trans. Intell. Technol. 2023, 8, 963–986. [Google Scholar] [CrossRef]
  23. Zhou, F.N.; Yang, S.; Fujita, H.; Chen, D.M.; Wen, C.L. Deep learning fault diagnosis method based on global optimization GAN for unbalanced data. Knowl.-Based Syst. 2020, 187, 104837. [Google Scholar] [CrossRef]
  24. Wang, C.J.; Xu, Z.L. An intelligent fault diagnosis model based on deep neural network for few-shot fault diagnosis. Neurocomputing 2021, 456, 550–562. [Google Scholar] [CrossRef]
  25. Saufi, S.R.; Bin Ahmad, Z.A.; Leong, M.S.; Lim, M.H. Gearbox Fault Diagnosis Using a Deep Learning Model With Limited Data Sample. IEEE Trans. Ind. Inform. 2020, 16, 6263–6271. [Google Scholar] [CrossRef]
  26. Li, S.T.; Zhang, Y.; Wang, L.B.; Xue, J.Y.; Jin, J.F.; Yu, D.L. A CEEMD Method for Diesel Engine Misfire Fault Diagnosis based on Vibration Signals. In Proceedings of the 2020 39th Chinese Control Conference (CCC), Shenyang, China, 27–29 July 2020; pp. 6572–6577. [Google Scholar]
  27. Xu, N.; Zhang, G.L.; Yang, L.B.; Shen, Z.Y.; Xu, M.; Chang, L. Research on thermoeconomic fault diagnosis for marine low speed two stroke diesel engine. Math. Biosci. Eng. 2022, 19, 5393–5408. [Google Scholar] [CrossRef] [PubMed]
  28. Knezevic, V.; Orovic, J.; Stazic, L.; Culin, J. Fault Tree Analysis and Failure Diagnosis of Marine Diesel Engine Turbocharger System. J. Mar. Sci. Eng. 2020, 8, 1004. [Google Scholar] [CrossRef]
  29. Zhang, D.F.; Tong, P.X.; Zhu, W.; Zheng, J. Research on Reciprocating diesel engines fault diagnosis based on adaptive genetic algorithm optimization. In Proceedings of the 2022 IEEE 6th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Beijing, China, 3–5 October 2022; pp. 872–876. [Google Scholar] [CrossRef]
  30. Tsitsilonis, K.M.; Theotokatos, G. Engine Malfunctioning Conditions Identification through Instantaneous Crankshaft Torque Measurement Analysis. Appl. Sci. 2021, 11, 3522. [Google Scholar] [CrossRef]
  31. He, Z.C.; Yang, Y.; Han, H.Y.; Wang, J.; Zhang, Y.H.; Li, H. A key performance indicator-based fault detection scheme for marine diesel turbocharging systems. J. Franklin Inst. 2021, 358, 9346–9363. [Google Scholar] [CrossRef]
  32. Hou, L.S.; Zou, J.Q.; Du, C.J.; Zhang, J.D. A fault diagnosis model of marine diesel engine cylinder based on modified genetic algorithm and multilayer perceptron. Soft Comput. 2020, 24, 7603–7613. [Google Scholar] [CrossRef]
  33. Bhatt, D.; Patel, C.; Talsania, H.; Patel, J.; Vaghela, R.; Pandya, S.; Modi, K.; Ghayvat, H. CNN Variants for Computer Vision: History, Architecture, Application, Challenges and Future Scope. Electronics 2021, 10, 2470. [Google Scholar] [CrossRef]
  34. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542. [Google Scholar]
  35. Xu, X.J.; Zhao, Z.Z.; Xu, X.B.; Yang, J.B.; Chang, L.L.; Yan, X.P.; Wang, G.D. Machine learning-based wear fault diagnosis for marine diesel engine by fusing multiple data-driven models. Knowl.-Based Syst. 2020, 190, 105324. [Google Scholar] [CrossRef]
  36. Ma, N.; Zhang, X.; Liu, M.; Sun, J. Activate or not: Learning customized activation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8032–8042. [Google Scholar]
  37. Heo, B.; Chun, S.; Oh, S.J.; Han, D.; Yun, S.; Kim, G.; Uh, Y.; Ha, J.-W. Adamp: Slowing down the slowdown for momentum optimizers on scale-invariant weights. arXiv 2020, arXiv:2006.08217. [Google Scholar]
  38. Yang, K.; Fan, H.Y. Research on Fault Diagnosis Method of Diesel Engine Thermal Power Conversion Process. Adv. Eng. Res. 2017, 120, 1445–1450. [Google Scholar]
Figure 1. Con-pooling layer.
Figure 2. Efficient channel attention module.
Figure 3. Overall architecture for the proposed network of ECA-ICNN.
Figure 4. An ACON network that learns to activate (green) or not (white).
Figure 5. Vector direction of the gradient and momentum.
Figure 6. Fault diagnosis process of ECA-ICNN.
Figure 7. Diesel engine simulation model.
Figure 8. Partial testing results: (a) first round of experiments; (b) third round of experiments; (c) sixth round of experiments; (d) ninth round of experiments.
Figure 9. Hyper-parameter experiments.
Figure 10. Ablation comparative experiment and losses.
Figure 11. Accuracy of activation functions.
Figure 12. Different performance of the model with the addition of two types of noise: (a) accuracy of amplitude noise; (b) accuracy of quantization noise.
Table 1. Main performance parameters of the Wärtsilä 9L34DF dual fuel marine engine.
| Parameters | Value | Parameters | Value |
| --- | --- | --- | --- |
| Stroke | 5 | Piston stroke (mm) | 400 |
| Number of cylinders | Inline 9 cylinders | Power (kW) | 4060 |
| Cylinder diameter (mm) | 340 | Compression ratio | 12.6 |
| Rated speed (r/min) | 750 | Effective fuel consumption (g/kW·h) | 191 |
| Average piston speed (m/s) | 10 (at MCR) | Firing order of cylinders | 1-7-4-2-8-6-3-9-5 |
Table 2. Comparison results of measured and simulated data of main parameters.
| Parameters | Measured Value | Simulated Value | Relative Error (absolute value) |
| --- | --- | --- | --- |
| Power/kW | 4060 | 4123.79 | 0.015 |
| Mean Effective Pressure/bar | 19.96 | 20.02 | 0.003 |
| Maximum Explosive Pressure/bar | 175 | 171.16 | 0.021 |
| Boost Pressure/bar | 4.52 | 4.57 | 0.011 |
| Exhaust Gas Temperature/°C | 520 | 522.07 | 0.003 |
| Oil Consumption/(g/cycle) | 4.12 | 4.14 | 0.004 |
Table 3. Diesel engine fault parameter configuration.
| Code | Fault Characteristics | Normal | Severity LV1 | Severity LV2 | Severity LV3 | Severity LV4 |
| --- | --- | --- | --- | --- | --- | --- |
| F1 | Valve clearance increase | 0.4 mm | 0.42 | 0.44 | 0.46 | 0.48 |
| F2 | Injection timing delay | −13.1 deg | −12.1 | −11.1 | −10.1 | −9.1 |
| F3 | Supercharger inefficiency | 100% | 95% | 90% | 85% | 80% |
| F4 | Air cooler efficiency decrease | 100% | 95% | 90% | 85% | 80% |
Table 4. Thermal parameters.
| Symbol | Name | Unit | Symbol | Name | Unit |
| --- | --- | --- | --- | --- | --- |
| Pmp3 | Turbocharger Outlet Pressure | bar | Tmp23 | Exhaust Manifold Temperature | °C |
| Vmp3 | Turbocharger Outlet Flow Rate | m/s | Pmp24 | Turbine Outlet Pressure | bar |
| Tmp3 | Turbocharger Outlet Temperature | °C | Vmp24 | Turbine Outlet Flow Rate | m/s |
| Pmp4 | Air Cooler Outlet Pressure | bar | Tmp24 | Turbine Temperature | °C |
| Vmp4 | Air Cooler Outlet Flow Rate | m/s | F | Intake Volume | kg/s |
| Tmp4 | Air Cooler Outlet Temperature | °C | g | Fuel Consumption Rate | g/kW·h |
| Pmp5 | Cylinder Inlet Pressure | bar | Pz | Maximum Explosive Pressure | bar |
| Vmp5 | Cylinder Inlet Flow Rate | m/s | λ | Highest Boost Pressure | bar/deg |
| Tmp5 | Cylinder Inlet Temperature | °C | P | Power | kW |
| Tmp14 | Exhaust Temperature | °C | PI | IMEP | bar |
| Pmp23 | Exhaust Manifold Pressure | bar | PB | BMEP | bar |
| Vmp23 | Exhaust Manifold Flow Rate | m/s | | | |
Table 5. Structures of ECA-ICNN.
| Layer | Kernel | Stride | Activation | Output |
| --- | --- | --- | --- | --- |
| Conv_1d_1 | 5 | 1 | 1D-Meta-ACON | [32, 40, 22] |
| Conv_1d_2 | 3 | 1 | 1D-Meta-ACON | [32, 50, 18] |
| Max Pool | 2/2 | | | [32, 50, 9] |
| Conv_1d_3 | 2 | 1 | 1D-Meta-ACON | [32, 30, 8] |
| Conv_1d_4 | 2 | 1 | 1D-Meta-ACON | [32, 20, 4] |
| Max Pool | 2/2 | | | [32, 20, 2] |
| ECA Block | 1 | 1 | ReLU | [32, 20, 2] |
| Adaptive Avg Pool | | | | [32, 20, 1] |
| FC | | | SoftMax | [32, 5] |
Table 6. Performance of testing set.
| Fault | Predict/Target |
| --- | --- |
| Normal (F0) | 75/76 |
| Valve Clearance Increase (F1) | 76/76 |
| Injection Delay (F2) | 72/76 |
| Supercharger Efficiency Decrease (F3) | 72/76 |
| Air Cooler Efficiency Decrease (F4) | 76/76 |
| Accuracy | 97.6% |
Table 7. Results of hyper-parameter experiment.
| Batch Size | 8 | 16 | 32 | 64 | 128 |
| --- | --- | --- | --- | --- | --- |
| Epoch of Convergence | 2–5 | 3–7 | 8–13 | 11–16 | 19/23/∞ |
| Training Time (s) | 79.19–11.87 | 43.08–59.17 | 23.68–31.2 | 15–23.97 | 11.18/13.36/∞ |
Table 8. Accuracy and training time of self-attention.
| Self-Attention | Accuracy (%) | Training Time (s) |
| --- | --- | --- |
| SE-Net + CNN | 95.9 | 38.62 |
| CoA + CNN | 94.8 | 31.48 |
| ECA + CNN | 97.6 | 24.87 |
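Table 9. Optimizer.
| Optimizer | Epoch | Training Time (s) | Average Time per Epoch (s) | Accuracy (%) | Loss |
| --- | --- | --- | --- | --- | --- |
| AdamP | 46 | 27.15 | 0.59 | 96.6 | 0.54 |
| Adam | 50 | 15.52 | 0.31 | 95.3 | 0.55 |
| AdamW | 12 | 4.3 | 0.35 | 19.2 | 1.90 |
| NAdam | 36 | 14.0 | 0.39 | 94.2 | 0.56 |
| Adadelta | 43 | 16.8 | 0.39 | 96.9 | 0.62 |
| SGD | 27 | 7.5 | 0.27 | 92.9 | 0.56 |
| IAdamP | 50 | 24.87 | 0.49 | 97.6 | 0.53 |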

MDPI and ACS Style

Wang, J.; Cao, H.; Cui, Z.; Ai, Z.; Jiang, K. Intelligent Fault Diagnosis of Marine Diesel Engines Based on Efficient Channel Attention-Improved Convolutional Neural Networks. Processes 2023, 11, 3360. https://doi.org/10.3390/pr11123360
