Article

A Fault Diagnosis Method for Marine Engine Cross Working Conditions Based on Transfer Learning

1
Marine Engineering College, Dalian Maritime University, Dalian 116026, China
2
Dalian Maritime University Smart Ship Limited Company, Dalian 116026, China
*
Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2024, 12(2), 270; https://doi.org/10.3390/jmse12020270
Submission received: 9 January 2024 / Revised: 30 January 2024 / Accepted: 31 January 2024 / Published: 1 February 2024
(This article belongs to the Section Ocean Engineering)

Abstract:
Marine engines confront challenges of varying working conditions and intricate failures. Existing studies have primarily concentrated on fault diagnosis under a single condition, overlooking the adaptability of these methods under diverse working conditions. To address the aforementioned issues, we propose a cross working condition fault diagnosis method named the Balanced Adaptation Domain Weighted Adversarial Network (BADWAN). This method combines transfer learning to tackle the challenges of cross working condition diagnosis with limited labels. Specifically tailored for scenarios with incomplete labeling in the target working conditions, we designed an Enhanced Centroid Balance scheme to balance the label space, thereby enhancing the model’s transfer capabilities. Additionally, we designed an Instance Affinity Weighting scheme on the foundation of Class-level Weighting, refining the model to the instance level for effective information interaction. Furthermore, we incorporated the Adaptive Uncertainty Suppression strategy to further boost the model’s classification prowess. Two experimental scenarios were designed to demonstrate the effectiveness of the proposed model using a Wärtsilä 9L34DF dual-fuel engine as the experimental subject. The results demonstrate over 90% diagnostic accuracy in scenarios with complete target working condition labels and 86% accuracy in scenarios with incomplete labels, outperforming other transfer learning models. The BADWAN model excels in cross-condition fault diagnosis tasks for marine engines with incomplete target working condition labels, offering a novel solution to this field.

1. Introduction

Ensuring the safety and reliability of marine engines, the primary power sources for maritime vessels, is a crucial prerequisite for the seamless operation of ships [1]. As science and technology advance, marine engine performance continues to improve. However, this progress also makes the internal structure increasingly complex. In addition, the engine’s working environment is extremely harsh. This complexity and harsh environment make component wear, undercooling and other faults more likely, leading to suboptimal combustion in the cylinder, surges in emissions, excessive heat loads, decreased efficiency and, in severe cases, the potential paralysis of the power system [2,3]. Such conditions not only jeopardize the reliability and safety of normal voyages, but also incur unnecessary economic losses and contribute to environmental pollution. Consequently, swiftly and accurately diagnosing faults in marine engines poses a significant challenge within the realm of vessel safety.
Fortunately, with the rapid development of sensor monitoring technology, big data analytics, artificial intelligence and other technologies, new vitality has been injected into the data-driven fault diagnosis methodology [4]. Deep learning, as a pivotal manifestation of data-driven approaches, exhibits heightened diagnostic prowess when confronted with data characterized by intricate features and high variability. It has garnered outstanding outcomes in the realm of fault diagnosis research [5,6,7,8,9]. However, the efficacy of deep diagnostic models is contingent upon access to an extensive dataset comprising a substantial number of accurately labeled training samples.
Regrettably, in practice, acquiring an ample quantity of labeled samples proves to be a time-consuming and challenging endeavor. Frequently, the predominant portion of collectible samples consists of raw, unlabeled data [10,11]. Moreover, owing to the intricate and fluctuating nature of marine engine operating conditions, the gathered monitoring samples frequently exhibit diverse probability distributions. In the context of a classification task involving samples with different probability distributions, existing research employs Unsupervised Domain Adaptation (UDA) [12] to transfer valuable knowledge learned from labeled source domain samples to unlabeled target domain samples. This process aims to achieve classification in unknown target domains. Li et al. [13] achieved cross-domain fault diagnosis by minimizing the maximum mean discrepancy (MMD) between domains. Liu et al. [14] employed a Domain Adversarial Neural Network (DANN) model, which integrates an adversarial mechanism to extract domain-invariant features, for the execution of the bearing fault diagnosis task under varying loads. Chen et al. [15] employed Dynamic Adversarial Adaptation Networks (DAANs) to achieve the joint alignment of marginal and conditional distributions between domains. They complemented this approach with manifold regularization to further uncover latent internal information. This effort was aimed at enhancing the model’s adaptability for fault diagnosis in rotating machinery under diverse operating conditions. The majority of domain adaptive transfer learning methods discussed in the preceding studies accomplish the transfer of domain knowledge by narrowing the gap between domains or extracting domain-invariant features through adversarial techniques. This approach holds significant utility in diagnostic tasks characterized by variations in sample distributions [16]. However, it is crucial to note that the prerequisite for the UDA task is the consistency between the label spaces of the source and target domains.
Nevertheless, in practical scenarios, the label space of the target domain often constitutes only a subset of the label space of the source domain. In inter-domain distance approximations or domain alignments, irrelevant class samples from the source domain that are not present in the target domain can impact the effectiveness of domain adaptive methods. This results in negative transfer, as illustrated in Figure 1. In this context, certain scholars have introduced the concept of the Partial Domain Adaptation (PDA) task [17]. Cao et al. [18] designed the Partial Adversarial Domain Adaptation (PADA) model for the above problem, which uses the entropy-based confidence of the classification results to assign weights to each category of the source domain so as to alleviate the negative transfer effect brought about by the abnormal source category samples. Further, Li et al. [19] built upon the PADA method and addressed scenarios involving outliers in samples within specific domain transfer tasks related to mechanical failures. They employed the results of both the classifier and the discriminator to collaboratively calculate probability weights, intending to weight the importance of categories in the source domain. This approach aimed to alleviate the adverse effects of samples from irrelevant source domain classes. However, it is essential to note that the entropy-based confidence weighting method necessitates ensuring the authenticity and reliability of the classification. Building upon the above issue, Liang et al. [20] developed a balanced adversarial alignment method atop the weighting strategy. This method enhances the target domain by selecting specific samples from the source domain to ensure the consistency of the label space, yielding exceptional results. However, because the samples are selected at random, unrepresentative samples can enter the target domain enhancement pool, and these inappropriate samples can instead degrade the alignment between the domains.
For our proposed model, the Balanced Adaptation Domain Weighted Adversarial Network (BADWAN), we devised a scheme involving Class-level Weighting and Instance Affinity Weighting. This scheme assesses the importance of categories and the affinity of instance samples, allowing the model training to selectively prioritize more significant categories and samples for learning. Moreover, the strategy of using cross-entropy as the objective function for the label classifier neglects the effective suppression of error classes; that is, the predicted probability of error classes for source domain samples may remain high. This could lead the model to assign the maximum classification probability to these error classes when applied to the surrounding aligned target data. To address this issue, we introduce a scheme known as Adaptive Uncertainty Suppression. This approach incorporates a complementary entropy loss to bolster the model’s output probability for the correct category while suppressing the elevated prediction probabilities associated with erroneous categories. Moreover, relying solely on classification results to exclude potential outlier class samples poses risks. To mitigate this, we incorporated the Enhanced Centroid Balance scheme. This approach augments the label space of the target domain to resemble that of the source domain more closely during domain alignment. Specifically, it borrows samples closer to the class centroid, i.e., more representative samples, from the source domain. This transformation helps convert the Partial Domain Adaptation (PDA) task into a conventional Unsupervised Domain Adaptation (UDA) task.
In summary, our research contributes in the following ways:
  • A Self-Calibrating Convolutional Neural Network (SCNet) is adopted as the feature extractor instead of a traditional CNN to expand the receptive field of the model and enhance the model’s ability to learn long-range dependencies between data, thus improving the model’s ability to extract domain-invariant features.
  • The Enhanced Centroid Balance method is designed to enhance the target domain label space by introducing pseudo-target domain samples, so that the enhanced target domain is more similar to the label space of the source domain, thus alleviating the negative transfer problem existing in PDA.
  • Building upon traditional Class-level Weighting, we introduce Instance Affinity Weighting to assess sample affinity. This innovation allows the model to dynamically select instance samples with greater affinity for learning, taking into account class importance discrimination.
  • To refine the accuracy of weighting, we implement an uncertainty suppression scheme to bolster the classifier’s classification ability. This scheme further reduces the model’s output probability for incorrect results through complementary entropy loss.
The remainder of the paper is structured as follows: Section 2 provides background knowledge, Section 3 outlines the proposed method in detail, Section 4 assesses the method’s performance through a case study, and finally, Section 5 summarizes the research findings and discusses future perspectives.

2. Preliminaries

2.1. Problem Definition

In this section, certain notations and definitions are presented to clearly elucidate the cross working condition partial transfer fault diagnostic tasks involved. Our research concept involves integrating transfer learning methods to extrapolate knowledge acquired from the labeled working conditions of marine engines to unlabeled unknown working conditions. This approach facilitates fault diagnosis and the recognition of samples in unknown working conditions. We adhere to the definitions of transfer learning, designating the labeled known working condition as the source domain and the unlabeled unknown working condition as the target domain, and denote $X_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$, $x_i^s \in \mathbb{R}^d$ for labeled source domain samples and $X_t = \{x_i^t\}_{i=1}^{n_t}$, $x_i^t \in \mathbb{R}^d$ for unlabeled target domain samples. The labeled source domain samples $X_s \in \mathbb{R}^{d \times n_s}$ with d-dimensional features obey the distribution $P_s(X_s)$. Similarly, the target domain samples $X_t \in \mathbb{R}^{d \times n_t}$ conform to the distribution $P_t(X_t)$, where $n_s$ and $n_t$ represent the number of samples in the source domain and the target domain, respectively. The traditional domain adaptive transfer learning method [21] addresses, through unsupervised learning, the issue of domain variability within the same label space but with different marginal and conditional distributions, which can be denoted as $P_s(X_s) \neq P_t(X_t)$ and $P_s(Y_s|X_s) \neq P_t(Y_t|X_t)$. However, in practical applications, owing to the difficulty of collecting comprehensive fault categories, some categories present in the source domain may never be observed in the target domain; that is, the known working condition contains more categories than the unknown working condition. In such cases, the source domain label space is a superset of the target domain label space, denoted as $y_t \subseteq y_s$. In this regard, certain scholars have designated categories present in both the source and target domains as shared categories, while those exclusive to the source domain and absent in the target domain are referred to as outlier categories [18]. The ultimate goal of this study is to build a diagnostic network model that, in the case of an asymmetric label space between known and unknown conditions, can effectively learn a domain-invariant feature extractor G and a domain discriminator D through an adversarial network, focus on shared class samples while filtering outlier class samples, and thereby minimize the target risk $\Pr_{(x,y) \sim P_t}[C(G(x)) \neq y]$ of the cross-domain classifier C, that is, realize the fault diagnosis of marine engines across working conditions.

2.2. Domain Adaptation of Adversarial Networks

For domain adaptation, the essence is to find an adaptive metric to measure the distributional differences between domains, and to adapt the distribution by reducing the differences in the metrics, thus realizing the transfer task. Taking inspiration from Generative Adversarial Networks (GANs), Ganin et al. [22] devised a Domain Adversarial Neural Network (DANN) that integrates adversarial mechanisms with classification networks. A DANN comprises three components: a feature extractor G, a domain discriminator D, and a label classifier C, with internal parameters θG, θD and θC, respectively. During the training of the model, the feature extractor aims to extract features that confuse the discriminator to the greatest extent possible. This is accomplished so that the discriminator cannot discern whether a sample belongs to the source or target domain. Simultaneously, the discriminator endeavors to accurately attribute samples, resulting in a conflict where one component is dedicated to confusion, and the other is dedicated to discrimination. This setup creates a form of confrontation. Additionally, a classifier C is incorporated to learn the features extracted from the source domain samples, facilitating the classification of these samples. This addition prevents G from exploiting the extraction of irrelevant features to confuse D. Consequently, the objective of the DANN network can be defined as follows:
$$L(\theta_G, \theta_C, \theta_D) = L_{cls}(\theta_G, \theta_C) + \lambda L_{adv}(\theta_G, \theta_D) \tag{1}$$
In the aforementioned equation, λ represents the trade-off factor, quantifying the weight relationship between the classification loss $L_{cls}$ and the adversarial loss $L_{adv}$.
$L_{cls}$ represents the cross-entropy loss of the classifier, computed as the negative log probability $L_{CE}$ of the correct label in the multi-class output, as shown in Equation (2):
$$L_{cls}(\theta_G, \theta_C) = L_{CE}(C(G(x_i)), y_i) = \log \frac{1}{C(G(x_i))_{y_i}} \tag{2}$$
For Ladv, it is estimated through the approximation of H-divergence between domains [22]. The specific calculation method is as in Equation (3):
$$L_{adv} = \frac{1}{n_s} \sum_{x_i \in X_s} L_{BCE}(D(G(x_i^s)), d_i) + \frac{1}{n_t} \sum_{x_j \in X_t} L_{BCE}(D(G(x_j^t)), d_j) \tag{3}$$
As the domain discriminator performs binary classification, the binary cross-entropy loss $L_{BCE}$ is expressed by Equation (4):
$$L_{BCE}(D(G(x_i)), d_i) = d_i \log \frac{1}{D(G(x_i))} + (1 - d_i) \log \frac{1}{1 - D(G(x_i))} \tag{4}$$
where di is the domain label of the sample, with the source domain labeled as 1 and the target domain labeled as 0.
Regarding the label classifier C, the objective is to achieve accurate label classifications. Therefore, the updates should aim to minimize the classification loss. For the domain discriminator D, the objective is to effectively discern the origin of the samples. To enhance this discrimination, it is essential to maximize the distinction between the two domain samples, emphasizing the need to maximize the divergence between the two domains. Concerning the feature extractor G, the objective is to produce features that bewilder the discriminator D while enabling the classifier C to make accurate classifications. Therefore, G undergoes parameter updates focused on minimizing the losses of both the classifier C and the discriminator D. The optimization of parameters during model training is executed through the following formulation:
$$(\hat{\theta}_G, \hat{\theta}_C) = \arg\min_{\theta_G, \theta_C} L(\theta_G, \theta_C, \hat{\theta}_D), \qquad \hat{\theta}_D = \arg\max_{\theta_D} L(\hat{\theta}_G, \hat{\theta}_C, \theta_D) \tag{5}$$
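To make the adversarial objective concrete, the following is a minimal sketch of Equations (1)–(5) in PyTorch (an assumption; the paper does not publish code). The gradient reversal layer is the standard DANN device that lets one backward pass realize the min-max update of Equation (5); `G`, `C` and `D` stand for the feature extractor, label classifier and domain discriminator defined above, and all shapes are illustrative.

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; reverses (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None  # reversed gradient flows into G

def dann_loss(G, C, D, xs, ys, xt, lam=1.0):
    """L = L_cls + lambda * L_adv (Eq. (1)); lambda acts through the reversed gradient."""
    bce = nn.BCEWithLogitsLoss()
    fs, ft = G(xs), G(xt)                            # shared feature extractor
    l_cls = nn.functional.cross_entropy(C(fs), ys)   # Eq. (2) on labeled source data
    # Domain labels per Eq. (4): source = 1, target = 0; D outputs a logit.
    d_s = D(GradientReversal.apply(fs, lam)).squeeze(-1)
    d_t = D(GradientReversal.apply(ft, lam)).squeeze(-1)
    l_adv = bce(d_s, torch.ones_like(d_s)) + bce(d_t, torch.zeros_like(d_t))  # Eq. (3)
    return l_cls + l_adv
```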

3. Proposed Method

This section introduces the BADWAN model, designed to facilitate fault diagnosis with partial knowledge transfer across diverse operating conditions for marine engines. The structural framework of the model is illustrated in Figure 2. On the one hand, the BADWAN model adopts the Self-Calibrating Convolutional Network (SCNet) as the feature extractor, replacing the 1D Convolutional Neural Network (1DCNN), to enhance the model’s capability to extract domain-invariant features. On the other hand, the BADWAN model adds four complementary modules to the DANN approach to facilitate the PDA task. The first module is Enhanced Centroid Balance (ECB), designed to improve the balance of distributional alignment. The second module is Class-level Weighting (CW), which emphasizes learning shared class samples through class-level weights. The third module is Instance Affinity Weighting (IAW), which aims to enhance the model’s domain fitness and prevent overfitting by focusing on learning samples with greater affinity. The fourth module is the Adaptive Uncertainty Suppression (AUS) module, which enables the model to more accurately output the correct sample labels by suppressing the occurrence of false categories. The methods and modules used by the BADWAN are described in detail below.

3.1. Architecture Details

For the feature generator G in our proposed BADWAN model, we opted for a one-dimensional Self-Calibrating Convolutional Neural Network (1D-SCNet) [23]. SCNet enhances the model’s discriminative capabilities by integrating the original space with the latent space. This integration enables the adaptive adjustment of contextual information at each location, allowing the surrounding information to be considered adaptively and expanding the receptive field of each location. Simultaneously, SCNet focuses exclusively on the contextual information of each spatial location, minimizing interference from irrelevant regions and ensuring robustness. Moreover, the modules in the network are direct counterparts of traditional CNN layers, augmenting neither the model’s computational complexity nor the number of hyperparameters. The SCNet network structure is depicted in Figure 3.
The network initiates by convolving the input $X = [x_1, x_2, \ldots, x_C] \in \mathbb{R}^{C \times H \times W}$ using a CNN and subsequently divides the result into two groups $\{X_1, X_2\}$ along the channel direction. These two segments are then directed to separate paths to acquire contextual information of different types.
In the first path of Figure 3, the network employs three convolutional filters {K1, K2, K3} to execute a self-calibration process, yielding the output Y1. The internal operations are conducted as follows.
For this path, the input X1 first passes through an average pooling layer with stride r and filter size r × r:
$$T_1 = \mathrm{AvgPool}_r(X_1) \tag{6}$$
Subsequently, feature transformation is applied to $T_1$ using the filter $K_1$, as shown in Equation (7):

$$X_1' = \mathrm{UP}(T_1 * K_1) \tag{7}$$

where $\mathrm{UP}(\cdot)$ represents the bilinear interpolation operator, and “$*$” denotes the convolution operation. Subsequently, as the filter modules are stackable in multiple layers, a skip connection is employed to retain the original features and prevent gradient explosion. This can be expressed as Equation (8):

$$Y_1' = (X_1 * K_2) \odot \mathrm{Sigmoid}(X_1 + X_1') \tag{8}$$

Finally, the calibrated output of the module is obtained by applying the filter $K_3$ for feature transformation:

$$Y_1 = Y_1' * K_3 \tag{9}$$

The subsequent path involves a standard convolution operation: $Y_2 = X_2 * K_4$. This segment primarily aims to retain the information of the original space. The two outputs $\{Y_1, Y_2\}$ are subsequently merged to form the output Y. The above is an introduction to the internal structure of the feature extractor.
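As a concrete illustration, the following is a minimal 1D sketch of the self-calibration branch in Equations (6)–(9), assuming PyTorch; `k1` through `k4` stand in for the filters $K_1$ to $K_4$ above, the channel count is assumed even, and the layer sizes are illustrative rather than the configuration in Table 1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SCConv1d(nn.Module):
    """Self-calibrated 1D convolution: split channels, calibrate one half (Eqs. (6)-(9))."""
    def __init__(self, channels, kernel_size=3, pool_r=4):
        super().__init__()
        half, pad = channels // 2, kernel_size // 2
        self.pool = nn.AvgPool1d(pool_r, stride=pool_r)            # Eq. (6)
        self.k1 = nn.Conv1d(half, half, kernel_size, padding=pad)  # Eq. (7)
        self.k2 = nn.Conv1d(half, half, kernel_size, padding=pad)  # Eq. (8)
        self.k3 = nn.Conv1d(half, half, kernel_size, padding=pad)  # Eq. (9)
        self.k4 = nn.Conv1d(half, half, kernel_size, padding=pad)  # plain second path

    def forward(self, x):
        x1, x2 = torch.chunk(x, 2, dim=1)               # split along channels
        t1 = self.pool(x1)                              # Eq. (6): downsample context
        x1p = F.interpolate(self.k1(t1), size=x1.shape[-1], mode='linear',
                            align_corners=False)        # Eq. (7): UP(T1 * K1)
        y1 = self.k2(x1) * torch.sigmoid(x1 + x1p)      # Eq. (8): calibration with skip
        y1 = self.k3(y1)                                # Eq. (9): output transform
        y2 = self.k4(x2)                                # standard convolution path
        return torch.cat([y1, y2], dim=1)               # merge {Y1, Y2}
```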
The label classifier C uses a fully connected layer, focusing on discerning sample categories after G extracts features from the source domain samples, enabling decision making. In addition, the domain discriminator D discerns the attribution of the samples and engages in an adversarial relationship with the feature extractor G, enhancing the capability of G to extract domain-invariant features. The internal structure of D also includes fully connected layers. The internal structure and detailed parameters of the model are shown in Table 1.

3.2. Enhanced Centroid Balance Module

Traditional domain adaptation methods often assume a scenario in which the source and target domains share the same label space. However, in practical applications, obtaining target domain samples with a complete label space can be challenging. This leads to an asymmetric and unbalanced label space between the two domains. In such cases, the existence of outlier source domain samples within the labeling space hampers the model’s domain adaptation capability, resulting in a negative transfer effect [18]. Consequently, the crucial challenge lies in minimizing or eliminating the impact of outlier samples to address the negative transfer problem.
For our proposed enhanced centroid balance module, the core concept is to symmetrize the label space by reinforcing the label space of the target domain. This involves creating pseudo-target domain samples by extracting a specific number of samples from the source domain for each category and incorporating them into the target domain. To ensure the representativeness of the added samples, we select those closer to the class centroid. This augmentation operation aligns the label space of the enhanced target domain with that of the source domain, effectively transforming the Partial Domain Adaptation (PDA) problem into the well-studied Unsupervised Domain Adaptation (UDA) problem.
For the enhanced centroid balance strategy, initially, we need to calculate the class centroid for each class in the source domain, and this is computed as follows:
$$c(y_{s,i}) = \frac{1}{|n_s^i|} \sum_{x_j \in X_{s, y_{s,i}}} G(x_j) \tag{10}$$
where $c(y_{s,i})$ is the class centroid of the i-th class of samples in the source domain; $n_s^i$ is the number of samples in the i-th class of the source domain; $X_{s, y_{s,i}}$ is the space of samples with class $y_{s,i}$ in the source domain; and $G(x_j)$ denotes the features of a sample after feature extraction. After obtaining the class centroid, the distance of each sample from the class centroid is calculated using the Euclidean distance, and some of the closer samples are used as pseudo-target domain samples.
After applying the enhanced centroid balance operation, the domain enhancement adversarial loss $L_{adv}^{ecb}$ can be expressed as follows:

$$L_{adv}^{ecb}(\theta_G, \theta_D) = \frac{1}{n_s} \sum_{i=1}^{n_s} \log D(G(x_i^s)) + \frac{1}{n_t} \sum_{j=1}^{n_t} \log\left(1 - D(G(x_j^t))\right) + \frac{\rho}{n_s} \sum_{i=1}^{n_s} \log\left(1 - D(G(x_i^s))\right) \tag{11}$$
In the above formula, because pseudo-target domain samples are introduced, they are separated out and assigned a proportional weight ρ in the calculation of the adversarial loss, where ρ is a variable that gradually decreases with the number of training rounds and eventually diminishes to zero. This is because the transfer ability of the model is weak in the early stage, so the pseudo-target domain samples are assigned more weight early on to alleviate the problem of label space imbalance. As the number of training rounds increases, the model’s abilities of domain-invariant feature extraction and domain adaptation are gradually enhanced, and the model is able to autonomously categorize classes and align toward shared classes during domain adaptation. At this point, the weights of the pseudo-target domain samples are continuously reduced, and domain adaptation between the source domain and the real target domain is finally realized.
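A minimal sketch of the pseudo-target selection described above is given below, assuming PyTorch tensors; `feats` are source features G(x), `labels` the source labels, and `n_per_class` follows the decay schedule for $n_{ecb}$ described in Section 4.2.1. The function itself is illustrative, not the authors’ implementation.

```python
import torch

def select_pseudo_targets(feats, labels, n_per_class):
    """Pick the n_per_class source samples closest to each class centroid (Eq. (10))."""
    chosen = []
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        centroid = feats[idx].mean(dim=0)                 # Eq. (10): class centroid
        dist = torch.norm(feats[idx] - centroid, dim=1)   # Euclidean distance to centroid
        k = min(n_per_class, idx.numel())
        chosen.append(idx[dist.topk(k, largest=False).indices])  # most representative
    return torch.cat(chosen)  # indices of pseudo-target domain samples
```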

3.3. Class-Level Weighting Module

Diverging from the strategy outlined in Section 3.2, which focuses on augmenting the label space of the target domain, we adopted an alternative perspective. Specifically, we aimed to diminish the impact of outlier class samples from the source domain on the model, thereby mitigating the issue of negative transfer. On the one hand, through training on samples from the source domain, the label classifier can generate a probability distribution for the samples. On the other hand, for the outlier classes in the source domain, their label space is nearly independent of the label space of the target domain. Consequently, we anticipate that the label classifier will assign a significantly lower probability to the outlier class samples than to the shared classes when classifying samples in the target domain. At this juncture, we aggregate the probabilities for each category across the entire target domain sample set. Categories with higher cumulative probabilities are considered more likely to be shared categories. Therefore, we assign greater weight to the categories with higher probabilities and increase the model’s focus on shared class samples, so that during domain alignment the model avoids focusing on outlier samples, thus mitigating their detrimental effects.
The specific Class-level Weighting strategy is calculated as shown below:
$$m(y_{s,i}) = \frac{1}{n_t} \sum_{j=1}^{n_t} p(y_{s,i} \mid x_j^t) \tag{12}$$
where $m(y_{s,i})$ is the class-level weight, representing the weight of the i-th class in the source domain, and $p(y_{s,i} \mid x_j^t)$ is the probability, output by the label classifier, that the j-th target domain sample belongs to the i-th class of the source domain.
After obtaining the class-level weights, they are incorporated into the training of the label classifier and domain adversarial components. The weighted label classifier loss and weighted domain-enhanced adversarial loss can be expressed as follows:
$$L_{cls}^{w}(\theta_G, \theta_C) = \frac{1}{n_s} \sum_{i=1}^{n_s} m(y_i^s)\, L_{CE}(C(G(x_i^s)), y_i^s) \tag{13}$$

$$L_{adv}^{w}(\theta_G, \theta_D) = \frac{1}{n_s} \sum_{i=1}^{n_s} m(y_{s,i}) \log D(G(x_i^s)) + \frac{1}{n_t} \sum_{j=1}^{n_t} \log\left(1 - D(G(x_j^t))\right) + \frac{\rho}{n_s} \sum_{i=1}^{n_s} m(y_{s,i}) \log\left(1 - D(G(x_i^s))\right) \tag{14}$$
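The class-level weights of Equation (12) reduce to an average of the classifier’s target-domain outputs. A minimal sketch follows, assuming `probs_t` holds SoftMax outputs of shape [n_t, number of source classes]; normalizing by the maximum weight follows common PADA practice [18] and is an assumption here rather than a step stated above.

```python
import torch

def class_level_weights(probs_t):
    """Eq. (12): average target-domain probability per source class."""
    m = probs_t.mean(dim=0)   # higher cumulative mass -> more likely a shared class
    return m / m.max()        # assumed normalization so the top class has weight 1
```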

3.4. Instance Affinity Weighting Module

In Section 3.3, we employed Class-level Weighting for the source domain classes, aiming to alleviate the negative impact of outlier class samples. However, each instance sample is equally weighted in the computation of the adversarial loss, which is not entirely reasonable. Consider a sample lying at the boundary of multiple categories, whose own category is ambiguous. During domain alignment, aligning such a sample to a specific category becomes challenging. In this case, if we over-pursue domain alignment for this sample, the overall domain adaptation effect may be compromised, and overfitting may even occur. Hence, during domain alignment, we want the model to prioritize the “easier” and more “affined” instance samples to mitigate overfitting, thereby achieving more effective domain alignment. Specifically, for an “affined” sample, which is easy to classify, the output probability distribution will be more concentrated and the entropy value lower, while for a “non-affined” sample, the probability distribution will be more even and the entropy value relatively high. Hence, following this principle, we use the entropy $H(h) = -\sum_{c=1}^{C} h_c \log(h_c)$ to measure the affinity of each instance sample and allocate the attention weight $w(x) = 1 + e^{-H(h(x))}$ to every instance sample. At this juncture, the objective of the domain adversarial loss is articulated as Equation (15):
$$L_{adv}^{W}(\theta_G, \theta_D) = \frac{1}{n_s} \sum_{i=1}^{n_s} w(x_i^s)\, m(y_{s,i}) \log D(G(x_i^s)) + \frac{1}{n_t} \sum_{j=1}^{n_t} w(x_j^t) \log\left(1 - D(G(x_j^t))\right) + \frac{\rho}{n_s} \sum_{i=1}^{n_s} w(x_i^s)\, m(y_{s,i}) \log\left(1 - D(G(x_i^s))\right) \tag{15}$$
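A minimal sketch of the instance weights $w(x) = 1 + e^{-H(h(x))}$ used in Equation (15) is shown below, assuming `probs` are per-sample SoftMax outputs; the small `eps` guards the logarithm and is an implementation assumption.

```python
import torch

def instance_affinity_weights(probs, eps=1e-8):
    """Low-entropy ("affined") predictions receive weights near 2, diffuse ones near 1."""
    H = -(probs * (probs + eps).log()).sum(dim=1)  # entropy of each prediction
    return 1.0 + torch.exp(-H)
```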

3.5. Adaptive Uncertainty Suppression Module

Regarding the weighting methods discussed in Section 3.3 and Section 3.4, while they can help alleviate the impact of outlier class samples and non-affined samples, their effectiveness heavily relies on the classification ability of the label classifier. Hence, to effectively enhance the classification ability of the label classifier, we introduced the Adaptive Uncertainty Suppression module into the model. This module further improves the classifier through complementary entropy. Because the loss of the label classifier is computed using cross-entropy, the calculation focuses only on the probability that the model assigns, through the SoftMax function, to the correct category, and it neglects how the remaining probability mass is distributed. For instance, consider two samples from a three-class task whose SoftMax outputs are [0.5, 0.25, 0.25] and [0.5, 0.4, 0.1], with the first class correct in both cases. Normalizing the probabilities of the incorrect classes yields complement distributions of [0.5, 0.5] and [0.8, 0.2], whose entropies are approximately 0.693 and 0.500, respectively. The second sample concentrates most of its residual probability on a single incorrect class, making it far more likely to be misclassified into that class; we therefore prefer the model to produce classification results similar to those of the first sample, whose residual probability is spread evenly. However, during model training, both samples incur the same cross-entropy loss, which is unreasonable. Therefore, we utilize the complementary entropy calculated from the remaining probability distribution to further suppress the uncertainty in the model output [24]. This ensures that the model, while maintaining classification accuracy, produces more confident and less misclassified results. The specific formulations of the weighted complementary entropy losses are as follows:
$$L_{aus}^{w}(\theta_G, \theta_C) = \frac{1}{n_s \log(N-1)} \sum_{i=1}^{n_s} m(y_i^s)\, L_{aus}(C(G(x_i^s)), y_i^s) \tag{16}$$

$$L_{aus}(\hat{y}, y) = (1 - \hat{y}_a) \sum_{j \neq a} \frac{\hat{y}_j}{1 - \hat{y}_a} \log \frac{\hat{y}_j}{1 - \hat{y}_a} \tag{17}$$
where N is the number of source domain categories, a is the index of the ground-truth class of the corresponding source domain sample, and $\hat{y}$ is the SoftMax output of the classifier.
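A minimal sketch of the complementary entropy term of Equation (17) follows, assuming `probs` are SoftMax outputs of shape [batch, N] and `target` holds the true class indices a; the $1/(n_s \log(N-1))$ scaling and the class weights m(·) of Equation (16) are left to the caller.

```python
import torch

def complement_entropy(probs, target, eps=1e-8):
    """Eq. (17): minimizing this term flattens the probabilities of the wrong classes."""
    p_a = probs.gather(1, target.unsqueeze(1))                           # true-class probability
    mask = torch.ones_like(probs).scatter_(1, target.unsqueeze(1), 0.0)  # zero out class a
    q = probs / (1.0 - p_a + eps)                                        # renormalized wrong-class probs
    h = (q * (q + eps).log()) * mask                                     # terms over j != a
    return ((1.0 - p_a.squeeze(1)) * h.sum(dim=1)).mean()                # batch-averaged
```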

3.6. Objective Function Integration and Optimization Strategies

In this subsection, we integrate the components from previous Section 3.2, Section 3.3, Section 3.4 and Section 3.5. The overall objective of the BADWAN model can be articulated as follows:
$$L_{BADWAN}(\theta_G, \theta_C, \theta_D) = L_{cls}^{w}(\theta_G, \theta_C) + \alpha L_{aus}^{w}(\theta_G, \theta_C) + \beta L_{adv}^{W}(\theta_G, \theta_D) \tag{18}$$
where α and β are trade-off coefficients.
A min-max optimization strategy is employed for model training [15]. This strategy first keeps the internal parameters $\theta_D$ of the domain discriminator D unchanged and adjusts the feature extractor G and label classifier C based on the loss until the loss is minimized or the maximum number of iterations is reached. Subsequently, the parameters $\theta_G$ and $\theta_C$ of the feature extractor G and label classifier C are kept constant, and the loss is maximized to adjust the internal parameters $\theta_D$ of the domain discriminator D. Finally, the fault diagnostic model with partial knowledge transfer across operating conditions is obtained. The specific parameter updates are as follows:
$$(\hat{\theta}_G, \hat{\theta}_C) = \arg\min_{\theta_G, \theta_C} L_{BADWAN}(\theta_G, \theta_C, \hat{\theta}_D), \qquad \hat{\theta}_D = \arg\max_{\theta_D} L_{BADWAN}(\hat{\theta}_G, \hat{\theta}_C, \theta_D) \tag{19}$$
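A minimal sketch of this alternating update is given below, assuming PyTorch optimizers `opt_GC` (over θG, θC) and `opt_D` (over θD); `badwan_loss` is a hypothetical function assembling Equation (18) from the modules above, not an API provided by the paper.

```python
def train_step(batch, G, C, D, opt_GC, opt_D):
    # Step 1: fix D, update G and C to minimize the total loss (Eq. (19), left).
    loss = badwan_loss(batch, G, C, D)  # hypothetical assembly of Eq. (18)
    opt_GC.zero_grad()
    loss.backward()
    opt_GC.step()

    # Step 2: fix G and C, update D to maximize the loss, i.e. minimize its
    # negation (Eq. (19), right).
    loss_d = -badwan_loss(batch, G, C, D)
    opt_D.zero_grad()
    loss_d.backward()
    opt_D.step()
```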

4. Experimental Study

In this section, we employ a simulation dataset from a dual-fuel engine aboard an LNG ship to validate the effectiveness of the BADWAN model through case studies. The experiments are categorized into two main scenarios, gradually broadening the application scope. Case 1 involves an experiment in a traditional transfer scenario where the model shares the same label space but operates under different conditions. This aims to verify the feasibility and effectiveness of the transfer learning method in marine engine fault diagnosis research. Case 2 extends the experiment to a more common scenario with differing label spaces and working conditions. This study is intended to experimentally demonstrate the superior performance of the proposed model in scenarios where the labeling space for the target working conditions is incomplete.

4.1. Dataset Description

In this section of the case study, we utilize the simulation dataset from the dual-fuel engine of an LNG vessel, specifically type 9L34DF. The dataset encompasses information related to five fault states and normal states under three distinct operating conditions. Each dataset comprises 13 thermodynamic monitoring indicators, and comprehensive details are presented in Table 2 and Table 3.

4.2. Experimental Case Design and Result Analysis

4.2.1. Case 1

The specific details of the transfer task designed for this case are presented in Table 4, and 1200 groups each of labeled source domain and unlabeled target domain samples were used to train the model, i.e., 200 groups for each type of fault. To mitigate the influence of random factors, the average value was derived as the final result by independently repeating the experiment five times in each transfer scenario. Accuracy and F1-Score were selected as the evaluation metrics to assess the model’s performance in this case.
The experiments were conducted in a Windows 10 system environment using an Intel(R) Core(TM) i7-8799K CPU @ 3.70 GHz. For the proposed BADWAN model, the optimizer employed Stochastic Gradient Descent (SGD) with a momentum of 0.9. The initial learning rate $lr_0$ was set to 0.01, and the learning rate was dynamically adjusted based on the training progress e (e ∈ [0, 1]) using the function $lr_e = lr_0 / (1 + \lambda e)^{\xi}$ [25]. The parameter λ was set to 0.001 to regulate the growth rate of the exponential term, and ξ was set to 0.75 to represent the decay of the learning rate. The number of pseudo-target domain samples $n_{ecb}$ in the ECB module was adjusted using $n_{ecb} = (\text{batch size})/\omega \times (1 - e)$, where ω was set to 4 to control the rate at which the number of samples decays, the batch size was configured as 128, and the number of epochs was set to 150. Several classical transfer learning methods, DANN [14], MMD [13], Correlation Alignment (CORAL) [26], DAAN [15] and PADA [18], were chosen for comparison experiments.
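The two schedules above reduce to simple functions of the training progress e. A minimal sketch follows, under the reconstructed forms $lr_e = lr_0/(1+\lambda e)^{\xi}$ and $n_{ecb} = (\text{batch size})/\omega \times (1-e)$; the negative exponent in the learning rate is an assumption consistent with the annealing schedule of [25].

```python
def lr_schedule(e, lr0=0.01, lam=0.001, xi=0.75):
    """Learning rate decays with the training progress e in [0, 1]."""
    return lr0 / (1.0 + lam * e) ** xi

def n_ecb_schedule(e, batch_size=128, omega=4):
    """Pseudo-target samples per batch shrink linearly to zero as training progresses."""
    return int(batch_size / omega * (1.0 - e))
```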
Figure 4 illustrates the comparative results of the diagnostic accuracy for the various methods obtained through experiments in four transfer scenarios. In this context, “None” represents the outcome of cross-condition diagnosis without employing the transfer learning method, utilizing only 1DCNN or SCNet for diagnosis across different operating conditions. 1DCNN obtained 85.83%, 87.76%, 78.5% and 80.75% for the four tasks, while SCNet achieved 89.76%, 88.18%, 82.51% and 84.76%. Notably, SCNet consistently demonstrated a higher diagnostic accuracy than 1DCNN across all four transfer task scenarios. This superiority can be attributed to SCNet’s structure, which features a more expansive receptive field, enabling it to effectively extract diverse sample features and consequently enhance the diagnostic accuracy. Moreover, regardless of whether 1DCNN or SCNet is utilized as the baseline model, the integration with the transfer learning method significantly enhances the diagnostic accuracy. This observation supports the viability of applying the transfer learning method to the cross-condition fault diagnosis of marine engines. In particular, the diagnostic accuracies achieved by the BADWAN surpass 90% across the four tasks, yielding results of 94.17%, 92.33%, 90.01% and 91.51%, respectively. These figures demonstrate a substantial improvement, surpassing those of the 1DCNN network model by 8.34%, 4.57%, 11.51% and 10.76%. This robust performance underscores the viability of implementing the proposed method in cross working condition fault diagnosis for marine engines, showcasing its effectiveness in enhancing the diagnostic model’s performance under varying operating conditions. Additionally, Figure 4 reveals that the diagnostic accuracy of SCNet, serving as the baseline model, surpasses that of 1DCNN, irrespective of the adoption of the transfer learning method. This finding validates the effectiveness of incorporating SCNet.
To provide a more quantitative assessment of the diagnostic models, we also calculated the F1-Score for each method (with SCNet as the baseline model), and the experimental results obtained are shown in Table 5. In the absence of the transfer learning method, SCNet exhibited improvements of 3.98%, 0.42%, 3.14% and 3.55% over 1DCNN in the four transfer scenarios, respectively. Upon introducing the transfer learning method, the F1-Score of the model improved by 4% to 8%, demonstrating that the transfer learning method enhances the diagnostic performance by narrowing the gap between different working conditions (domains) or identifying domain-invariant features. In Table 5, it is noteworthy that the diagnostic performance of the model experiences a more significant improvement with the introduction of the transfer learning approach, especially as the disparity between working conditions widens. The proposed BADWAN method demonstrates notable enhancements of 4.42%, 4.26%, 7.76% and 6.69% in the four transfer scenarios when compared with those of the benchmark model SCNet. This improvement is more pronounced than that achieved by other transfer learning methods, affirming the viability of the BADWAN in the realm of cross-condition fault diagnoses for marine engines.
To visually discern the advantages of the BADWAN model in cross working condition fault diagnosis, the features generated from the untrained model, the model trained solely with SCNet, and the model trained with the BADWAN were reduced to two dimensions using the t-SNE method. The t-SNE method achieves the mapping of data from a high-dimensional space to a low-dimensional space by minimizing the KL divergence between the two similarity distributions in the high- and low-dimensional spaces and preserves the local and global relationships between data points at the same time. Utilizing a t-SNE plot enables a more intuitive observation of clustering and structural relationships within the same class of faults under different working conditions. The resulting visualization outcomes are presented in Figure 5. In Figure 5, the untrained original samples have varying working conditions, resulting in an extremely chaotic distribution of features that makes direct differentiation challenging. In the case of training with SCNet alone, the resulting features initially outline the sample boundaries, yet there still exists a significant overlap of samples from different classes. Conversely, the BADWAN model proposed in this paper demonstrates a superior ability to delineate boundaries between various sample types. It achieves effective feature clustering, mitigating the issue of distributional differences arising from distinct working conditions. This highlights the BADWAN model’s strong classification and generalization performance.
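For reference, the visualization step amounts to projecting the extracted features with t-SNE. A minimal sketch, assuming scikit-learn and matplotlib, with `feats` as a NumPy array of features from G and `labels` used only for coloring:

```python
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_tsne(feats, labels):
    emb = TSNE(n_components=2, perplexity=30).fit_transform(feats)  # 2-D embedding
    plt.scatter(emb[:, 0], emb[:, 1], c=labels, cmap='tab10', s=5)
    plt.show()
```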

4.2.2. Case 2

Case 2 further generalizes the application scenarios of the diagnostic model in Case 1, moving from scenarios with different working conditions and the same label space to scenarios with different working conditions and different label spaces (target domain label space ⊆ source domain label space). The specific experimental tasks for Case 2 are outlined in Table 6. The training samples for the model in Case 2 are the same sample set as in Case 1. However, for subsequent experiments, the target domain sample set was reclassified, consistently including the normal state (F0) in the target domain category setting. This adjustment better aligns the experiments with real-world application scenarios. All potential scenarios were divided based on the number of target domain categories during the segmentation of the target domain sample set. For each scenario, the experiment was independently repeated five times, and the classification outcomes of the target domain samples are presented as the experimental results.
The experiment in Case 2 also included a comparison of the transfer learning methods explored in Case 1, with SCNet serving as the baseline model. Table 7 displays the experimental results obtained from the three transfer tasks. We observed that methods such as DANN, MMD and CORAL achieve transfer by reducing the distribution differences between samples internally, without considering label information. However, when the number of categories in the target domain decreases, the presence of outlier classes in the source domain interferes with the measurement of distribution differences. This interference makes it more challenging for these methods to accurately perform domain alignment when the target domain label set is incomplete, leading to a decline in the classification performance of the diagnostic model in the target domain. From Table 7, it is evident that the diagnostic models exhibit a decrease in accuracy as the number of categories in the target domain decreases. For example, in the transfer scenario of C→A, the average accuracies for DANN corresponding to 5, 4 and 3 target domain categories were 81.52%, 78.08% and 75.81%, respectively. The PADA and DAAN methods, on the other hand, assign weights to each class based on the classification results of the internal classifiers, so that the models can pay more attention to the sample information of specific classes during training, thus adapting to different data distributions and label spaces. Similarly, in the C→A transfer scenario, the accuracies of PADA were 85.14%, 81.93% and 79.38%, respectively, which, compared with the aforementioned DANN method, shows that the partial domain adaptation task can be handled to a certain extent by class weighting. For our proposed BADWAN method, the internal incorporation of the AUS module further strengthens the classifier’s performance, making the class weighting approach more reasonable. Additionally, the ECB module was employed to strengthen the target domain’s label space, facilitating a better alignment of the distributions between the source and target domains and enhancing the model’s generalization capabilities. In the C→A transfer scenario, the BADWAN achieved results of 85.99%, 84.12% and 84.13%. Compared to the baseline DANN model, this represents improvements of 4.47%, 6.04% and 8.32%, respectively. It is evident that the proposed method exhibits strong diagnostic performance in general scenarios with varying working conditions and incomplete target domain label spaces.
To visually demonstrate the effectiveness of the method, we present all experimental results from the three tasks in Case 2 using box plots, as depicted in Figure 6, Figure 7 and Figure 8. Upon comparing these figures, it becomes evident that classical transfer learning methods exhibit performance degradation and unstable results as the number of target domain categories decreases. In contrast, the BADWAN model not only achieves higher diagnostic accuracies compared to traditional domain adaptive models but also demonstrates strong stability and robustness.
Similar to Case 1, we chose three tasks in the A→B transfer scenario for a t-SNE visualization. In this case, the base model DANN was selected as the comparative model, and the specific results are presented in Figure 9, Figure 10 and Figure 11. Through Figure 9a, Figure 10a and Figure 11a, it can be observed that the distributional differences between working conditions lead to confusion within the original samples. From Figure 9b, Figure 10b and Figure 11b, it can be observed that after applying the DANN for domain adaptation, the obtained features initially achieved clustering between similar samples. However, due to the difference in label space between the source and target domains, and the fact that DANN relies on the edge distribution of the samples for domain alignment, target domain samples may align with incorrect labels that are similar to the actual labels. This leads to distributional aliasing in the t-SNE graph. Through Figure 9c, Figure 10c and Figure 11c, it can be seen that the BADWAN method has been able to better delineate the label space between the source domain and the target domain and realize the alignment of the corresponding label space. This is attributed to the internal enhancement of the label space of the target domain facilitated by the ECB module. This enhancement ensures that the target domain samples align more effortlessly with the corresponding categories of the source domain. Moreover, the weighting strategy (CW & IAW) allows the model to selectively learn more crucial category information and sample details. This adaptability proves advantageous for the model’s application in diverse Partial Domain Adaptation scenarios, enabling it to better accommodate varying data distributions and label spaces.

5. Conclusions

In this paper, we addressed the real-world challenges posed by evolving working conditions and incomplete labeling in the context of marine engine fault diagnosis. By introducing deep transfer learning to address these practical engineering challenges, we proposed the BADWAN model, a cross working condition fault diagnosis method for marine engines.
  • Given the intricate nature of the internal mechanisms of marine engines and the intricate relationships among the various monitoring parameters, the BADWAN model strategically incorporates SCNet as the backbone network. This aims to extract the inherent connections between each monitoring parameter and fault.
  • To mitigate the potential negative transfer issues arising from the incomplete labeling space of the target working conditions, the BADWAN model incorporates two distinct research ideas. On the one hand, the Enhanced Centroid Balance module was devised to harmonize the labeling space of target conditions, enhancing transfer performances across diverse working conditions. On the other hand, the weighting strategies “Class-level Weighting” and “Instance Affinity Weighting” were formulated to facilitate class-level and instance-level information interactions. This aids in mitigating the impact of outlier classes and non-affined samples. Additionally, to bolster the reliability of the weighting strategy, the Adaptive Uncertainty Suppression strategy was introduced. This strategy suppresses the likelihood of incorrect class output, thereby further refining the diagnostic performance of the model.
  • To substantiate the efficacy of the BADWAN, two distinct experimental cases were meticulously designed and gradually extended to more general scenarios. In Case 1, four cross working condition fault diagnosis tasks with a complete label space were considered. The BADWAN demonstrated notable diagnostic accuracies of 94.17%, 92.33%, 90.01% and 91.51%, coupled with F1-Scores of 94.15%, 92.32%, 89.99% and 91.46%, for the respective transfer tasks within this case. These results showcase a significant enhancement over the base model, affirming the effectiveness and viability of the proposed method in cross working condition fault diagnosis. Going a step further, in the cross working condition fault diagnosis task with an incomplete label space for the target working conditions (Case 2), the BADWAN attains an average accuracy of 86.32% across 12 transfer tasks. This outperforms other transfer learning methods, validating the method’s superiority.
In conclusion, the approach presented in this paper exhibits specific strengths in addressing the challenge of cross condition fault diagnosis in marine engines through partial knowledge transfer. This offers a valuable reference for advancing the field of marine engine fault diagnosis research.

Author Contributions

Conceptualization, L.W. and H.C.; methodology, L.W.; software, L.W. and Z.C.; validation, L.W., Z.C. and Z.A.; formal analysis, L.W. and Z.C.; investigation, L.W. and Z.A.; resources, H.C.; data curation, L.W. and Z.C.; writing—original draft preparation, L.W.; writing—review and editing, L.W. and H.C.; visualization, L.W., Z.C. and Z.A.; supervision, H.C.; project administration, H.C.; funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the project Research and application of Smart Ship Digital Twin Information Platform (funding number: 2022JH1/10800097), National Key R&D Program of China (funding number: 2022YFB4301403) and Development of liquid cargo and electromechanical simulation operation system for LNG ship (funding number: CBG3N21-3-3).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author.

Conflicts of Interest

Author Hui Cao was employed by Dalian Maritime University Smart Ship Limited Company. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Nomenclature

AUS: Adaptive Uncertainty Suppression
AvgPool: the average pooling layer
BADWAN: Balanced Adaptation Domain Weighted Adversarial Network
BSFC: Brake-Specific Fuel Consumption
C: label classifier
CORAL: Correlation Alignment
CW: Class-level Weighting
C(x): output of x after the label classifier
$c(y_{s,i})$: class centroid of the i-th class of samples in the source domain
D: domain discriminator
d: domain label of a sample, with the source domain labeled as 1 and the target domain labeled as 0
DAAN: Dynamic Adversarial Adaptation Network
DANN: Domain Adversarial Neural Network
D(x): output of x after the domain discriminator
e: current epoch as a fraction of the total epochs
ECB: Enhanced Centroid Balance
G: feature extractor
GAN: Generative Adversarial Network
G(x): output of x after the feature extractor
IAW: Instance Affinity Weighting
IMEP: Indicated Mean Effective Pressure
K: convolutional filter; the subscript indicates its index
$L_{adv}$: adversarial loss
$L_{BCE}$: binary cross-entropy loss
$L_{CE}$: cross-entropy loss
$L_{cls}$: classification loss
$lr_0$: the initial learning rate
$L_{adv}^{ecb}$: adversarial loss after the ECB module
$L_{adv}^{w}$: adversarial loss after the ECB and CW modules
$L_{adv}^{W}$: adversarial loss after the ECB, CW and IAW modules
$L_{aus}^{w}$: class-weighted complementary entropy loss
$L_{cls}^{w}$: classification loss after the CW module
$L(\theta_G, \theta_C, \theta_D)$: total loss
MMD: Maximum Mean Discrepancy
$m(y_{s,i})$: class-level weight of the i-th class of the source domain
$n_{ecb}$: number of samples in the pseudo-target domain
$n_s^i$: number of samples in the i-th class of the source domain
$n_s$: number of samples in the source domain
$n_t$: number of samples in the target domain
PADA: Partial Adversarial Domain Adaptation
PDA: Partial Domain Adaptation
$\Pr_{(x,y) \sim P_t}[C(G(x)) \neq y]$: target risk of a target domain sample after feature extraction and classification
$P_s(X_s)$: marginal distribution of the source domain samples
$P_s(Y_s|X_s)$: conditional distribution of the source domain samples
$P_t(X_t)$: marginal distribution of the target domain samples
$P_t(Y_t|X_t)$: conditional distribution of the target domain samples
$p(y_{s,i}|x_j^t)$: probability, output by the label classifier, that the j-th target domain sample belongs to the i-th class of the source domain
SCNet: Self-Calibrating Convolutional Neural Network
SGD: Stochastic Gradient Descent
UDA: Unsupervised Domain Adaptation
UP(·): the bilinear interpolation operator
$w(x)$: instance-level weight of sample x
X: input sample
$x_i$: the i-th input sample
$X_s$: labeled source domain sample set
$x_i^s$: the i-th source domain sample
$X_{s,y_{s,i}}$: space of samples with class $y_{s,i}$ in the source domain
$X_t$: unlabeled target domain sample set
$x_i^t$: the i-th target domain sample
Y: output feature
$y_i$: label of the i-th input sample
$y_s$: label space of the source domain samples
$y_i^s$: label of the i-th source domain sample
$y_{s,i}$: the i-th label of the samples in the source domain
$y_t$: label space of the target domain samples
1DCNN: 1D Convolutional Neural Network
α: trade-off coefficient
β: trade-off coefficient
$\theta_G$: parameters inside the feature extractor
$\theta_C$: parameters inside the label classifier
$\theta_D$: parameters inside the domain discriminator
λ: parameter regulating the growth rate of the exponential term
ξ: decay of the learning rate
ρ: pseudo-target domain sample weight
σ: Sigmoid activation function
ω: parameter controlling the decay rate of the pseudo-target domain samples
*: convolution operation

References

  1. Xu, X.; Zhao, Z.; Xu, X.; Yang, J.; Chang, L.; Yan, X.; Wang, G. Machine learning-based wear fault diagnosis for marine diesel engine by fusing multiple data-driven models. Knowl.-Based Syst. 2020, 190, 105324. [Google Scholar] [CrossRef]
  2. Karatuğ, Ç.; Arslanoğlu, Y. Importance of early fault diagnosis for marine diesel engines: A case study on efficiency management and environment. Ships Offshore Struct. 2022, 17, 472–480. [Google Scholar] [CrossRef]
  3. Knežević, V.; Orović, J.; Stazić, L.; Čulin, J. Fault Tree Analysis and Failure Diagnosis of Marine Diesel Engine Turbocharger System. J. Mar. Sci. Eng. 2020, 8, 1004. [Google Scholar] [CrossRef]
  4. Velasco-Gallego, C.; Lazakis, I. RADIS: A real-time anomaly detection intelligent system for fault diagnosis of marine machinery. Expert Syst. Appl. 2022, 204, 117634. [Google Scholar] [CrossRef]
  5. Wang, R.; Chen, H.; Guan, C. DPGCN Model: A Novel Fault Diagnosis Method for Marine Diesel Engines Based on Imbalanced Datasets. IEEE Trans. Instrum. Meas. 2023, 72, 3504011. [Google Scholar] [CrossRef]
  6. Li, Y.; Guo, Z.; Li, Z.; Deng, Z.; Noman, K. Instantaneous Angular Speed-Based Fault Diagnosis of Multicylinder Marine Diesel Engine Using Intrinsic Multiscale Dispersion Entropy. IEEE Sens. J. 2023, 23, 9523–9535. [Google Scholar] [CrossRef]
  7. Kim, J.-y.; Lee, T.-h.; Lee, S.-h.; Lee, J.-j.; Lee, W.-k.; Kim, Y.-j.; Park, J.-w. A Study on Deep Learning-Based Fault Diagnosis and Classification for Marine Engine System Auxiliary Equipment. Processes 2022, 10, 1345. [Google Scholar] [CrossRef]
  8. Xiong, G.; Ma, W.; Zhao, N.; Zhang, J.; Jiang, Z.; Mao, Z. Multi-Type Diesel Engines Operating Condition Recognition Method Based on Stacked Auto-Encoder and Feature Transfer Learning. IEEE Access 2021, 9, 31043–31052. [Google Scholar] [CrossRef]
  9. Zheng, H.; Zhou, H.; Kang, C.; Liu, Z.; Dou, Z.; Liu, J.; Li, B.; Chen, Y. Modeling and Prediction for Diesel Performance Based on Deep Neural Network Combined with Virtual Sample. Sci. Rep. 2021, 11, 16709. [Google Scholar] [CrossRef]
  10. Xu, N.; Zhang, G.; Yang, L.; Shen, Z.; Xu, M.; Chang, L. Research on thermoeconomic fault diagnosis for marine low speed two stroke diesel engine. Math. Biosci. Eng. MBE 2022, 19, 5393–5408. [Google Scholar] [CrossRef]
  11. Wang, M.; Deng, W. Deep visual domain adaptation. Neurocomputing 2018, 312, 135–153. [Google Scholar] [CrossRef]
  12. Peng, X.; Bai, Q.; Xia, X.; Huang, Z.; Saenko, K.; Wang, B. Moment Matching for Multi-Source Domain Adaptation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; Volume 2019, pp. 1406–1415. [Google Scholar]
  13. Li, X.; Zhang, W.; Ding, Q.; Sun, J.-Q. Multi-Layer domain adaptation method for rolling bearing fault diagnosis. Signal Process. 2019, 157, 180–197. [Google Scholar] [CrossRef]
  14. Liu, C.; Gryllias, K. Simulation-Driven Domain Adaptation for Rolling Element Bearing Fault Diagnosis. IEEE Trans. Ind. Inform. 2022, 18, 5760–5770. [Google Scholar] [CrossRef]
  15. Chen, R.; Zhu, Y.; Yang, L.; Hu, X.; Chen, G. Adaptation Regularization Based on Transfer Learning for Fault Diagnosis of Rotating Machinery Under Multiple Operating Conditions. IEEE Sens. J. 2022, 22, 10655–10662. [Google Scholar] [CrossRef]
  16. Kang, G.; Jiang, L.; Yang, Y.; Hauptmann, A.G. Contrastive Adaptation Network for Unsupervised Domain Adaptation. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019. [Google Scholar]
  17. Zhao, C.; Liu, G.; Shen, W. A balanced and weighted alignment network for partial transfer fault diagnosis. ISA Trans. 2022, 130, 449–462. [Google Scholar] [CrossRef] [PubMed]
  18. Cao, Z.; Ma, L.; Long, M.; Wang, J. Partial Adversarial Domain Adaptation. In Proceedings of the Computer Vision—ECCV 2018: 15th European Conference, Munich, Germany, 8–14 September 2018; Part VIII. Springer: Cham, Switzerland, 2018. [Google Scholar]
  19. Li, W.; Chen, Z.; He, G. A Novel Weighted Adversarial Transfer Network for Partial Domain Fault Diagnosis of Machinery. IEEE Trans. Ind. Inform. 2021, 17, 1753–1762. [Google Scholar] [CrossRef]
  20. Jian, L.; Yunbo, W.; Dapeng, H.; Ran, H.; Jiashi, F. A Balanced and Uncertainty-Aware Approach for Partial Domain Adaptation. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 123–140. [Google Scholar]
  21. Zhang, W.; Xu, D.; Ouyang, W.; Li, W. Self-Paced Collaborative and Adversarial Network for Unsupervised Domain Adaptation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 2047–2061. [Google Scholar] [CrossRef]
  22. Ganin, Y.; Ustinova, E.; Ajakan, H.; Germain, P.; Larochelle, H.; Laviolette, F.; March, M.; Lempitsky, V. Domain-Adversarial Training of Neural Networks. J. Mach. Learn. Res. 2017, 17, 1–35. [Google Scholar] [CrossRef]
  23. Liu, J.-J.; Hou, Q.; Cheng, M.-M.; Wang, C.; Feng, J. Improving Convolutional Networks With Self-Calibrated Convolutions. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10093–10102. [Google Scholar]
  24. Chen, H.-Y.; Wang, P.-H.; Liu, C.-H.; Chang, S.-C.; Pan, J.-Y.; Chen, Y.-T.; Wei, W.; Juan, D.-C. Complement Objective Training. arXiv 2019, arXiv:1903.01182. [Google Scholar]
  25. Long, M.; Cao, Z.; Wang, J.; Jordan, M.I. Conditional Adversarial Domain Adaptation. Adv. Neural Inf. Process. Syst. 2018, 31, 1647–1657. [Google Scholar]
  26. Li, X.; Jiang, H.; Liu, S.; Zhang, J.; Xu, J. A unified framework incorporating predictive generative denoising autoencoder and deep Coral network for rolling bearing fault diagnosis with unbalanced data. Meas. J. Int. Meas. Confed. 2021, 178, 109345. [Google Scholar] [CrossRef]
Figure 1. Different transfer scenarios: (a) Domain adaptation with balanced classes; (b) Partial domain adaptation with unbalanced classes.
Figure 2. Framework of the BADWAN model.
Figure 3. SCNet network structure diagram.
Figure 4. Accuracy comparison of different methods for the transfer tasks.
Figure 5. Task 1 feature distribution visualization results of different scenarios: (a) Untrained; (b) SCNet; (c) BADWAN.
Figure 6. Task 1 experimental results: (a) A→B; (b) B→A; (c) A→C; (d) C→A.
Figure 7. Task 2 experimental results: (a) A→B; (b) B→A; (c) A→C; (d) C→A.
Figure 8. Task 3 experimental results: (a) A→B; (b) B→A; (c) A→C; (d) C→A.
Figure 9. Task 1 A(6)→B(5) feature distribution visualization results: (a) Untrained; (b) DANN; (c) BADWAN.
Figure 10. Task 1 A(6)→B(4) feature distribution visualization results: (a) Untrained; (b) DANN; (c) BADWAN.
Figure 11. Task 1 A(6)→B(3) feature distribution visualization results: (a) Untrained; (b) DANN; (c) BADWAN.
Table 1. BADWAN model architecture.

Networks | Layer Types | Parameters and Operations
Generator G | Convolution1D | 64, kernel:4, stride:1, padding:3, BN, ReLU
 | MaxPooling | kernel:2, stride:1, padding:1, dilation:1
 | SClayer1 | SCBottleneck1D block:1, 256, kernel:4, stride:2
 | SClayer2 | SCBottleneck1D block:2, 64, kernel:2, stride:2, padding:1
 | SClayer3 | SCBottleneck1D block:1, 256, kernel:4, stride:2
 | AveragePooling | kernel:3, stride:1, padding:1
 | Fully Connection1 | 128
 | Bottleneck | 256, ReLU
Classifier C | Fully Connection2 | class number
Discriminator D | Fully Connection3 | 256, ReLU
 | Fully Connection4 | 2, ReLU
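
For readers who find code easier to follow than a layer table, the following is a minimal PyTorch-style sketch of the three sub-networks in Table 1. It is illustrative, not the authors' implementation: `SCBottleneck1D` is a plain Conv1d stand-in for SCNet's self-calibrated bottleneck [23] so the sketch runs end to end, `in_channels=13` assumes one channel per monitoring index of Table 3, and `num_classes=6` matches the six states of Table 2.

```python
# Minimal sketch of the Table 1 architecture (illustrative assumptions noted above).
import torch
import torch.nn as nn

class SCBottleneck1D(nn.Module):
    """Stand-in for SCNet's self-calibrated bottleneck [23]; internals simplified."""
    def __init__(self, in_ch, out_ch, blocks=1, kernel_size=4, stride=2, padding=0):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(blocks):
            layers += [nn.Conv1d(ch, out_ch, kernel_size, stride, padding),
                       nn.BatchNorm1d(out_ch), nn.ReLU(inplace=True)]
            ch = out_ch
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class Generator(nn.Module):
    def __init__(self, in_channels=13):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=4, stride=1, padding=3),  # Convolution1D
            nn.BatchNorm1d(64), nn.ReLU(inplace=True),
            nn.MaxPool1d(kernel_size=2, stride=1, padding=1, dilation=1),    # MaxPooling
            SCBottleneck1D(64, 256, blocks=1, kernel_size=4, stride=2),      # SClayer1
            SCBottleneck1D(256, 64, blocks=2, kernel_size=2, stride=2, padding=1),  # SClayer2
            SCBottleneck1D(64, 256, blocks=1, kernel_size=4, stride=2),      # SClayer3
            nn.AvgPool1d(kernel_size=3, stride=1, padding=1),                # AveragePooling
        )
        self.fc1 = nn.LazyLinear(128)                                        # Fully Connection1
        self.bottleneck = nn.Sequential(nn.Linear(128, 256), nn.ReLU())      # Bottleneck

    def forward(self, x):                      # x: (batch, channels, length)
        f = self.features(x).flatten(1)
        return self.bottleneck(self.fc1(f))    # 256-d transferable feature

class Classifier(nn.Module):                   # Fully Connection2
    def __init__(self, num_classes=6):
        super().__init__()
        self.fc2 = nn.Linear(256, num_classes)

    def forward(self, f):
        return self.fc2(f)

class Discriminator(nn.Module):                # Fully Connection3 + Fully Connection4
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(256, 256), nn.ReLU(),
                                 nn.Linear(256, 2), nn.ReLU())  # 2-way domain output

    def forward(self, f):
        return self.net(f)

# Smoke test with a random batch:
G, C, D = Generator(), Classifier(), Discriminator()
feat = G(torch.randn(8, 13, 64))
print(C(feat).shape, D(feat).shape)            # torch.Size([8, 6]) torch.Size([8, 2])
```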
Table 2. Details of marine engine datasets.

Working Condition | Marine Engine Type | States | Labels
100% Load (A) | Wärtsilä9L34DF Dual-Fuel Engine | F0: Normal | 0a
 | | F1: Efficiency of Air Cooler Decreases | 1a
 | | F2: Efficiency of Supercharger Decreases | 2a
 | | F3: Turbine Failure | 3a
 | | F4: Intercooler’s Air Side Blocked | 4a
 | | F5: Fuel Injection Timing Failure | 5a
75% Load (B) | | F0: Normal | 0b
 | | F1: Efficiency of Air Cooler Decreases | 1b
 | | F2: Efficiency of Supercharger Decreases | 2b
 | | F3: Turbine Failure | 3b
 | | F4: Intercooler’s Air Side Blocked | 4b
 | | F5: Fuel Injection Timing Failure | 5b
50% Load (C) | | F0: Normal | 0c
 | | F1: Efficiency of Air Cooler Decreases | 1c
 | | F2: Efficiency of Supercharger Decreases | 2c
 | | F3: Turbine Failure | 3c
 | | F4: Intercooler’s Air Side Blocked | 4c
 | | F5: Fuel Injection Timing Failure | 5c
Table 3. Monitoring index parameters.

Monitoring Index | Unit | Monitoring Index | Unit
PeakFiringPressure | bar | Exhaust Temperature | K
IMEP | bar | Turbine Inlet Temperature | K
BoostPressure TurboCharger | bar | Turbine Outlet Temperature | K
Turbine Inlet Pressure | bar | CompMassFlowPerS TurboCharger | kg/s
Turbine Outlet Pressure | bar | BSFC | kg/(kW·h)
Supercharger Inlet Temperature | K | Power | kW
Air Cooler Outlet Temperature | K | / | /
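
As an illustration of how the 13 monitored quantities of Table 3 can be assembled into a model input, the following is a minimal sketch; the channel names, their ordering, and the per-channel z-scoring are assumptions made for this example, not taken from the paper.

```python
# Illustrative only: stack the 13 monitoring indices of Table 3 into a
# (channels, time) array for a 1DCNN. Ordering and normalization are assumptions.
import numpy as np

CHANNELS = [
    "PeakFiringPressure", "IMEP", "BoostPressure_TurboCharger",
    "Turbine_Inlet_Pressure", "Turbine_Outlet_Pressure",
    "Supercharger_Inlet_Temperature", "Air_Cooler_Outlet_Temperature",
    "Exhaust_Temperature", "Turbine_Inlet_Temperature",
    "Turbine_Outlet_Temperature", "CompMassFlowPerS_TurboCharger",
    "BSFC", "Power",
]

def make_sample(records: dict) -> np.ndarray:
    """Stack per-channel time series into a (13, T) sample, z-scored per channel."""
    x = np.stack([records[name] for name in CHANNELS])               # (13, T)
    return (x - x.mean(axis=1, keepdims=True)) / (x.std(axis=1, keepdims=True) + 1e-8)

# Usage with random stand-in data:
demo = {name: np.random.rand(64) for name in CHANNELS}
print(make_sample(demo).shape)   # (13, 64), ready for a 1DCNN feature extractor
```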
Table 4. Transfer tasks for Case 1.

Task | Transfer Scenario | Source Classes | Target Classes
1 | A→B | 0a, 1a, 2a, 3a, 4a, 5a | 0b, 1b, 2b, 3b, 4b, 5b
2 | B→A | 0b, 1b, 2b, 3b, 4b, 5b | 0a, 1a, 2a, 3a, 4a, 5a
3 | A→C | 0a, 1a, 2a, 3a, 4a, 5a | 0c, 1c, 2c, 3c, 4c, 5c
4 | C→A | 0c, 1c, 2c, 3c, 4c, 5c | 0a, 1a, 2a, 3a, 4a, 5a
Table 5. F1-score (%) comparison of different methods under the transfer tasks.

Methods | Task 1 | Task 2 | Task 3 | Task 4 | Mean
1DCNN | 85.75 | 87.64 | 79.09 | 81.22 | 83.42
SCNet | 89.73 | 88.06 | 82.23 | 84.77 | 86.20
DANN | 93.58 | 91.73 | 85.55 | 90.77 | 90.41
MMD | 93.59 | 91.73 | 85.55 | 89.73 | 90.15
CORAL | 91.54 | 91.46 | 84.23 | 84.86 | 88.02
DAAN | 91.63 | 91.46 | 84.39 | 84.56 | 88.01
PADA | 93.51 | 91.61 | 85.50 | 89.76 | 90.10
BADWAN | 94.15 | 92.32 | 89.99 | 91.46 | 91.98
Table 6. Transfer tasks for Case 2.

Task | Transfer Scenarios | Source Classes Num | Target Classes Num
1 | A→B, B→A, A→C, C→A | 6 | 5
2 | A→B, B→A, A→C, C→A | 6 | 4
3 | A→B, B→A, A→C, C→A | 6 | 3
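
To make the Case 2 partial-transfer setup concrete, the following is a minimal sketch of how the twelve scenarios of Table 6 can be enumerated: the source domain keeps all six states of Table 2 while the target label space is truncated to 5, 4, or 3 classes. Which specific target states are dropped is an assumption made for this sketch; the table fixes only the class counts.

```python
# Illustrative enumeration of the Case 2 tasks (Table 6).
# Dropping the highest-numbered states is an assumption for this sketch.
from itertools import product

STATES = ["F0", "F1", "F2", "F3", "F4", "F5"]   # six states per Table 2
SCENARIOS = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A")]

def case2_tasks():
    """Yield (source_domain, target_domain, source_classes, target_classes)."""
    for n_target, (src, tgt) in product((5, 4, 3), SCENARIOS):
        yield src, tgt, STATES, STATES[:n_target]

for src, tgt, s_cls, t_cls in case2_tasks():
    print(f"{src}({len(s_cls)})->{tgt}({len(t_cls)}): target keeps {t_cls}")
```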
Table 7. Average accuracy (%) of each transfer method in different transfer scenarios.

Transfer Scenario | DANN | MMD | CORAL | DAAN | PADA | BADWAN
A(6)→B(5) | 85.60 | 83.48 | 83.56 | 83.70 | 85.89 | 92.23
B(6)→A(5) | 88.35 | 89.21 | 90.09 | 90.61 | 90.38 | 91.12
A(6)→C(5) | 83.18 | 81.69 | 81.47 | 81.73 | 83.46 | 84.72
C(6)→A(5) | 81.52 | 83.25 | 84.11 | 84.34 | 85.14 | 85.99
A(6)→B(4) | 83.35 | 80.06 | 85.86 | 86.15 | 85.43 | 89.82
B(6)→A(4) | 86.39 | 81.75 | 88.84 | 88.85 | 88.12 | 89.51
A(6)→C(4) | 79.60 | 77.27 | 79.27 | 79.52 | 80.01 | 82.22
C(6)→A(4) | 78.08 | 73.70 | 79.49 | 82.88 | 81.94 | 84.12
A(6)→B(3) | 80.20 | 69.60 | 79.71 | 81.78 | 80.89 | 84.91
B(6)→A(3) | 84.84 | 78.59 | 86.29 | 86.66 | 85.36 | 88.97
A(6)→C(3) | 72.37 | 66.29 | 74.19 | 74.43 | 74.06 | 78.14
C(6)→A(3) | 75.81 | 69.68 | 79.40 | 80.37 | 79.38 | 84.13
Mean | 81.61 | 77.88 | 82.69 | 83.42 | 83.34 | 86.32
