Article

Improved RAkEL’s Fault Diagnosis Method for High-Speed Train Traction Transformer

1 State Key Laboratory of Advanced Rail Autonomous Operation, Beijing Jiaotong University, Beijing 100044, China
2 School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044, China
3 Beijing Research Center of Urban Traffic Information Sensing and Service Technology, Beijing Jiaotong University, Beijing 100044, China
* Author to whom correspondence should be addressed.
Sensors 2023, 23(19), 8067; https://doi.org/10.3390/s23198067
Submission received: 17 August 2023 / Revised: 9 September 2023 / Accepted: 19 September 2023 / Published: 25 September 2023

Abstract: The traction system is critical to the safe operation of high-speed trains, and the traction transformer is the component of the traction system most prone to failure. In current practice, fault diagnosis relies largely on manual experience. This paper proposes an improved RAkEL (Random k-Labelsets) algorithm for the fault diagnosis of high-speed train traction transformers. First, starting from the large amount of "sleeping" fault maintenance data accumulated by the railway department, each maintenance record is treated as an instance, its specific monitoring values are used to construct the instance vector, and the fault phenomena corresponding to the monitoring indicators are used as labels. Then, the step of selecting k-labelsets in RAkEL is improved by extracting associated faults with the Relief algorithm. Finally, the association rules between data and faults are mined and used to identify traction transformer faults. The results show that the improved RAkEL diagnostic method yields a marked improvement across the evaluation indicators. Compared with other multi-label classification algorithms, including BR (Binary Relevance) and CLR (Calibrated Label Ranking), the method performs well on multiple evaluation indicators and can help engineers carry out timely maintenance work.

1. Introduction

In recent years, high-speed railway trains have developed rapidly in terms of speed and capacity. Their system structure is complex, the equipment is closely interconnected, and many types of faults can occur. An efficient and accurate fault diagnosis mechanism is therefore of great significance for reducing or eliminating accidents and ensuring the safety of high-speed train operation. The traction system is a key subsystem of high-speed trains, and the traction transformer is one of its main pieces of equipment. It is composed of an iron core, coil windings, an oil tank, an oil protection device, and other components, and its operation is affected by various electrical, thermal, and magnetic factors. With increasing service time, changeable operating environments, manufacturing defects, or irregular maintenance work [1], various failures of traction transformers inevitably occur, creating potential safety hazards.
The large amount of maintenance data accumulated by the railway department can provide data support for fault diagnosis. Data-driven fault diagnosis methods have the advantages of a wide application range, little need for modeling and prior knowledge, and easy implementation [1]. In data-driven fault diagnosis of complex electromechanical equipment, machine learning methods are widely used because of their high reliability and ease of implementation [2,3,4,5,6]. For example, the support vector machine (SVM), a machine learning method based on statistical learning theory, is a powerful tool for small-sample, nonlinear, and high-dimensional problems and has inspired several works in recent years [7,8]. Many scholars optimize the penalty factor and kernel function parameters with intelligent optimization algorithms, such as particle swarm optimization (PSO) [9] and an improved seagull optimization algorithm [10], to improve diagnostic accuracy. Bayesian networks are also widely used in this field [11,12].
The key to fault diagnosis is feature extraction. The principal component analysis (PCA) method is another important research topic in system fault analysis [13,14]. Kernel principal component analysis (KPCA) is a commonly used variant; for example, it underlies the fuzzy clustering-based operating state diagnosis algorithm proposed in [15] and has been combined with SVM [10] to extract further fault information. Deep PCA combines PCA theory with deep learning to achieve early fault diagnosis effectively [16,17].
In the diagnosis of faults in complex electromechanical equipment, existing research predominantly relies on Dissolved Gas Analysis (DGA), processing its characteristic values and identifying a single corresponding fault type. However, these studies often overlook the fact that, in practical operation, an equipment instance is frequently associated with one or more fault categories, which naturally gives rise to a multi-label classification problem. Therefore, multi-label classification methods have become increasingly popular in fault diagnosis in recent years, as they can handle high-dimensional features and compound fault types well. A large number of mature methods exist for multi-label learning problems [18]. The first family is problem transformation, such as BR (Binary Relevance), CC (Classifier Chains), and CLR (Calibrated Label Ranking) [19,20,21]. The second adapts existing single-label classifiers, such as transforming the k-nearest neighbor classifier (KNN) into the multi-label nearest neighbor classifier (ML-KNN), or the support vector machine (SVM) into the ranking support vector machine (RankSVM) [22,23]. The classic multi-label classification algorithm RAkEL has been widely used in image classification [24], biomedicine [25,26], information security, and other fields [27]. By dividing the label space, RAkEL mitigates the problems of computational cost and label powerset explosion. This paper builds on the ordinary RAkEL method: it treats labels as features, uses the Relief algorithm to improve the step of randomly selecting labelsets, and diagnoses high-speed train traction transformer faults from existing fault and maintenance data. The results show that, compared with ordinary RAkEL and other common multi-label classification algorithms, the improved RAkEL performs better.
The main contributions of this paper are as follows:
1. The improved RAkEL method proposed in this paper is used to mine the correlations between fault phenomena in an actual high-speed train traction transformer fault dataset, and the accuracy of both fault phenomenon identification and the final maintenance diagnosis is high. Relevant parameters are set on the basis of this dataset and good diagnostic results are achieved, indicating that the method is suitable for the actual fault diagnosis process.
2. Based on the ordinary RAkEL algorithm, this paper considers the correlation between fault manifestations when selecting k-labelsets and adds the Relief algorithm to reduce the randomness of labelset selection, improving computational efficiency and diagnostic accuracy. On the actual traction transformer fault dataset, the improved RAkEL outperforms the ordinary RAkEL on all evaluation indicators: with the optimal parameters obtained through experiments, AP increased by 7.4%, while Coverage, Hamming Loss, One Error, and Ranking Loss decreased by 51.2%, 9.8%, 13.3%, and 51.6%, respectively.
3. After adding the Relief algorithm to mine label correlation, the improved RAkEL performs best overall in comparison with other algorithms. On the actual traction transformer fault dataset, with the chosen parameters, the improved RAkEL has the best comprehensive performance compared with BR, CLR, and LP, and is only slightly lower than BR in AP. This indicates that, compared with BR, the improved RAkEL leaves a slightly larger number of relevant instances undiagnosed.
The rest of the paper is organized as follows: Section 2 introduces the working content and diagnostic difficulties of the traction transformer and the RAkEL method. Section 3 presents the model construction, including the data processing procedure and the evaluation indicator system. Section 4 validates the method on known data and discusses the related parameters. Section 5 summarizes the paper and points out its limitations and directions for future work.

2. Background and Related Work

2.1. Work Content and Diagnostic Difficulties of Traction Transformers

Taking the CRH5 model as an example, the multiple unit consists of an eight-car formation with two power units. The first power unit consists of three motor cars and one trailer (M-M-T-M); the second power unit consists of two motor cars and two trailers (T-T-M-M). Each power unit is equipped with a main transformer (TT) and a pantograph, so the entire train carries two pantographs.
The traction structure is designed so that the third traction converter of the first traction unit can be powered by the second traction unit, i.e., it can be switched from the first traction unit to the other one. This feature keeps the load of the two main transformers balanced when a traction converter fails and ensures that at least three traction converters remain in operation when one main transformer fails. When the electrical equipment in a traction power unit malfunctions, the motive power of that unit can be cut off completely or partially (cutting off one or two traction converters) without affecting the operation of the other power unit.
The main transformer (TT) is a single-phase transformer equipped with six secondary windings, which allows the train to operate on a line supplied with a nominal AC voltage of 25 kV–50 Hz by reducing the voltage to values suitable for driving the components. It is cooled by forced oil circulation; a dedicated oil–gas heat exchanger (cooler) is used to cool the oil, and an oil evaporator is also integrated into the transformer to ensure the space required for evaporation and oil storage. The main transformer (TT) supplies the traction unit; each traction/auxiliary converter is powered by two secondary windings. In the component called the "HV Controller Box" (COMB), adjacent to the transformer, remote isolation switches SAZ1 and SAZ (for vehicles one and eight, respectively), SAZ21 and SAZ122 (for vehicles two and seven, respectively), and SAZ31 and SAZ32 (for vehicle four) are installed to disconnect the traction converter in the event of a fault affecting it.
Figure 1 shows the measured phase points of the traction transformer, such as breakdown voltage (BV), moisture content, acid value, and dielectric loss factor. Because the fault types of traction transformers are complex, there is currently little research analyzing the relationships between their faults, and maintenance work still relies mainly on manual testing and judgment. Workers often misjudge or miss faults owing to a lack of experience, which seriously affects the safe operation of the equipment. Therefore, analyzing the relationships between closely related faults and establishing an efficient and accurate fault diagnosis model is crucial for improving the reliability of traction transformer equipment.

2.2. RAkEL

The RAkEL method proposed by Tsoumakas et al. randomly divides the total labelset of the data into multiple small labelsets containing k labels (k-labelsets), trains a multi-class classifier for each k-labelset with the LP method, and then collects and combines the decisions of all the LP classifiers to produce a multi-label classification of unlabeled instances [28]. Compared to LP, RAkEL has the advantage of generating computationally simpler single-label classification tasks with a more balanced distribution of class values. In the case of overlapping labelsets, RAkEL can collect multiple diagnostic results for the same label from different LP models, which are combined by voting to obtain the final output. This gives the diagnostic results the opportunity to correct occasional errors and improve performance. At the same time, RAkEL can generalize to labelsets beyond those observed in the training data, which LP cannot do. The pseudocode of the algorithm is shown in Algorithm 1.
Algorithm 1. Pseudocode of RAkEL.
$Y = \text{RAkEL}(D, M, k, m, x)$
1.  for r = 1 to m do:
2.    Randomly select a k-labelset $L_k(l_r) \subseteq L$ with $|L_k(l_r)| = k$;
3.    Construct the multi-class training set $D^+_{L_k(l_r)}$ according to Equation (8);
4.    $g^+_{L_k(l_r)} \leftarrow M(D^+_{L_k(l_r)})$;
5.  end for
6.  Return the diagnostic labelset $Y$ according to Equation (15).
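For illustration, a minimal Python sketch of the training loop in Algorithm 1 is given below. It assumes scikit-learn is available and instantiates the base multi-class learner M as a decision tree; the function name, the bit-wise powerset encoding used for σ, and the absence of a check for repeated labelsets are illustrative simplifications rather than details of the original implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_rakel(X, Y, k=3, m=34, rng=np.random.default_rng(0)):
    """Train an ensemble of Label-Powerset classifiers on random k-labelsets.

    X: (N, d) feature matrix; Y: (N, M) binary label matrix.
    Returns a list of (labelset, classifier) pairs, one per iteration.
    """
    n_labels = Y.shape[1]
    ensemble = []
    for _ in range(m):
        # Step 2: randomly select a k-labelset L_k(l_r) from the label space L
        labelset = np.sort(rng.choice(n_labels, size=k, replace=False))
        # Step 3: build the multi-class training set D+_{L_k(l_r)};
        # sigma maps each subset of the k-labelset to one integer class id
        powerset_class = np.zeros(Y.shape[0], dtype=int)
        for pos, j in enumerate(labelset):
            powerset_class += Y[:, j].astype(int) << pos
        # Step 4: induce the multi-class classifier g+ with the base learner
        clf = DecisionTreeClassifier().fit(X, powerset_class)
        ensemble.append((labelset, clf))
    return ensemble
```

The voting step that combines these classifiers is sketched in Section 3.4 below.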

3. Model Construction

At this stage, the railway department has accumulated a large amount of high-speed train traction transformer operation and maintenance data, including condition monitoring data, fault types and manifestations, and repair and maintenance records. Analyzing and mining these dormant data is of great significance for identifying traction transformer faults and diagnosing the appropriate repair method.
The ordinary RAkEL method randomly decomposes the initial labelset into multiple small labelsets, which reduces the complexity of the generated single-label classification tasks but also introduces redundancy and low computational efficiency. Therefore, this paper adds the Relief algorithm as an improvement, treating labels as features and preferentially selecting relevant labels to form each labelset. Finally, the improved RAkEL algorithm is used to mine the association rules between data and faults, and among the faults themselves, to point out the fault types present in newly generated data and to provide preventive maintenance recommendations.

3.1. Constructing the Training Set Containing Instances of Maintenance Records and Corresponding Fault Phenomenon Labels

The historical condition-monitoring data in the existing records contain monitoring values for specific monitoring indicators, including BV, moisture content, dielectric loss factor, and so on. In this paper, we take a single maintenance record as an instance and use its corresponding specific monitoring values to construct the instance space in the multi-label classification algorithm:
$$X = \{x_1, x_2, \ldots, x_i, \ldots, x_N\},$$
where $X$ denotes the instance space and $x_i$ denotes the $i$th maintenance record taken as an instance. $x_i$ is a $d$-dimensional feature vector, with $d$ corresponding to the total number of monitoring indicators.
In this paper, the fault phenomena corresponding to the monitoring indicators are used as labels to construct the label space in the multi-label classification algorithm:
$$L = \{y_1, y_2, \ldots, y_M\},$$
where $L$ denotes the label space, $M$ is the total number of possible fault phenomena, and $y_M$ denotes the $M$th label in the label space. Based on the fault phenomena in each maintenance record, the 0/1 label vector corresponding to the instance is constructed:
$$Y_i = (a_1, a_2, \ldots, a_j, \ldots, a_M),$$
where $Y_i$ denotes the labelset associated with the $i$th instance $x_i$ ($i = 1, \ldots, N$), and $a_j$ indicates whether the $j$th fault phenomenon occurs: $a_j = 1$ if the fault occurs and $a_j = 0$ otherwise ($j = 1, \ldots, M$).
The training set containing maintenance record instances and their associated labelsets is denoted as:
$$D = \{(x_i, Y_i) \mid 1 \le i \le N\},$$
where $D$ denotes the training set, $x_i \in X$ denotes the $i$th maintenance record instance, $Y_i \subseteq L$ is the set of labels associated with $x_i$, and $N$ is the total number of instances.
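As a minimal sketch, the instance matrix X and the 0/1 label matrix Y might be assembled from the maintenance records as follows, assuming pandas and scikit-learn are available; the file name records.csv, the column "faults", and the semicolon-separated fault list are hypothetical stand-ins for the actual record layout.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Hypothetical layout: one row per maintenance record, monitoring values in
# numeric columns, and a semicolon-separated list of observed fault phenomena.
records = pd.read_csv("records.csv")
indicator_cols = [c for c in records.columns if c != "faults"]

X = records[indicator_cols].to_numpy()       # instance space X, shape (N, d)
mlb = MultiLabelBinarizer()                   # builds the 0/1 label matrix
Y = mlb.fit_transform(records["faults"].str.split(";"))   # shape (N, M)
```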

3.2. Constructing Multi-Classification Training Sets

RAkEL is an algorithm that transforms a multi-label problem into multiple multi-class problems. It divides the label space into k-labelsets, calls the LP method separately on each k-labelset to train a multi-class classifier, and finally votes over the classifiers' outputs to obtain the result and improve accuracy.
Given the labelset size k, a set of k labels drawn from the label space is called a k-labelset. The set of all possible k-labelsets, denoted $L^k$, satisfies:
$$|L^k| = \binom{M}{k},$$
where $L^k$ is the set of all possible k-labelsets, $|L^k|$ denotes its size, $M$ is the number of labels, and $\binom{M}{k}$ is the number of ways of choosing $k$ labels from the $M$ labels.
For each labelset in $L^k$:
$$|L_k(l)| = k, \quad 1 \le l \le \binom{M}{k},$$
where $L_k(l)$ denotes the $l$th labelset in $L^k$ and $|L_k(l)|$ denotes its size.
Given the desired number of classifiers $m$:
$$m \le |L^k|.$$
In the training phase, for the original multi-label training set, we reduce the original label space $L$ to $L_k(l)$ by converting it into the following multi-class single-label training set:
$$D^+_{L_k(l)} = \{(x_i, \sigma_{L_k(l)}(Y_i \cap L_k(l))) \mid 1 \le i \le N\}.$$
$D^+_{L_k(l)}$ contains the new classes:
$$\Gamma(D^+_{L_k(l)}) = \{\sigma_{L_k(l)}(Y_i \cap L_k(l)) \mid 1 \le i \le N\},$$
where $D^+_{L_k(l)}$ denotes the training set with label space $L_k(l)$; $\sigma_{L_k(l)}: 2^{L_k(l)} \to \mathbb{N}$ is an injective mapping from the power set of $L_k(l)$ to the natural numbers (its inverse is used in the diagnosis phase); and $\Gamma(D^+_{L_k(l)})$ denotes the set of new classes in $D^+_{L_k(l)}$.
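A brief sketch of how σ and the resulting class set Γ might be realized for one k-labelset (Equations (8) and (9)); assigning class ids in order of first appearance is an illustrative choice, and any injective mapping would do.

```python
def label_powerset_transform(Y, labelset):
    """Map each instance's restriction of Y_i to the k-labelset onto an integer
    class id: sigma assigns one id per distinct label subset seen in the data."""
    sigma = {}        # subset (as tuple of 0/1 values) -> class id
    classes = []
    for row in Y[:, labelset]:
        key = tuple(row)
        if key not in sigma:
            sigma[key] = len(sigma)   # a new class in Gamma(D+)
        classes.append(sigma[key])
    return classes, sigma
```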

3.3. Constructing a Collection of Multi-Class Classifiers

A multi-class learning algorithm is utilized to induce a multi-class classifier:
$$g^+_{L_k(l)}: X \to \Gamma(D^+_{L_k(l)}),$$
i.e.,
$$g^+_{L_k(l)} \leftarrow M(D^+_{L_k(l)}).$$
To create an ensemble with $m$ component classifiers, a collection of multi-class classifiers is constructed by calling LP on a set of $m$ randomly chosen k-labelsets $L_k(l_r)$ ($1 \le r \le m$):
$$G = \{g^+_{L_k(l_1)}, g^+_{L_k(l_2)}, \ldots, g^+_{L_k(l_m)}\},$$
where $g^+_{L_k(l_m)}$ denotes the multi-class classifier constructed for the $m$th k-labelset.
The ordinary RAkEL divides the labelset by selecting labels at random, which introduces strong randomness and ignores the correlations between labels. Random selection easily produces a large number of redundant or irrelevant label combinations, which reduces computational efficiency and diagnostic accuracy. For this reason, this paper applies the Relief algorithm, which treats labels as features and constructs closely related labelsets to reduce the randomness [29].
Finally, we construct the related label matrix:
$$A_j = \{(a_i, \Phi(Y_i, y_j)) \mid 1 \le i \le N\},$$
$$a_i = [\Phi(Y_i, y_1^c), \Phi(Y_i, y_2^c), \ldots, \Phi(Y_i, y_{M-1}^c)],$$
where $A_j$ denotes the related label matrix of the $j$th label; $a_i$ represents the values of $x_i$ with respect to the labels in $Y_j^C$; $N$ denotes the number of instances in the training set; $Y_j^C$ denotes the label matrix consisting of the labels other than the $j$th label; $\Phi(Y_i, y_j)$ denotes the value of $x_i$ with respect to the label $y_j$, with $\Phi(Y_i, y_j) = 1$ if $y_j \in Y_i$ and 0 otherwise; $Y_i$ denotes the set of related labels associated with $x_i$; $y_1^c$ denotes the first label in $Y_j^C$; and $y_{M-1}^c$ denotes the last label in $Y_j^C$.
We apply the Relief algorithm to $A_j$ to obtain the weight of each remaining label with respect to $y_j$, and then take the $k-1$ labels with the largest weight values as the closely related labels of the $j$th label, obtaining the set of closely related labels $R_j$ for each label:
$$R_j = \{y_b \mid rank(y_b) \le k - 1,\ y_b \in Y_j^C\},$$
where $R_j$ denotes the set of closely related labels corresponding to the $j$th label, and $rank(y_b)$ denotes the rank of the weight value corresponding to the $b$th label in $Y_j^C$.
By forming a k-labelset with the jth label and its closely related labelset, we obtain a total of M closely related k-labelsets, and then randomly select m-M non-repeated k-labelsets from them to jointly construct a multi-classifier ensemble, which reduces the randomness in the selection of k-labelsets.
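The selection of closely related labelsets could be sketched as follows: for each label y_j the remaining labels are treated as features, weighted with a simple Relief-style score (nearest hit/miss differences), and the k − 1 highest-weighted labels are kept. The sampling scheme and distance metric are illustrative assumptions; a library implementation such as skrebate's ReliefF could be substituted.

```python
import numpy as np

def relief_weights(A, target, n_samples=100, rng=np.random.default_rng(0)):
    """Simple Relief weights: reward features that agree with the nearest
    same-class instance (hit) and differ from the nearest different-class
    instance (miss), over a random sample of instances."""
    n, d = A.shape
    w = np.zeros(d)
    for i in rng.choice(n, size=min(n_samples, n), replace=False):
        dist = np.abs(A - A[i]).sum(axis=1).astype(float)
        dist[i] = np.inf
        same = np.where(target == target[i])[0]
        same = same[same != i]
        diff = np.where(target != target[i])[0]
        if len(same) == 0 or len(diff) == 0:
            continue
        hit, miss = same[np.argmin(dist[same])], diff[np.argmin(dist[diff])]
        w += np.abs(A[i] - A[miss]) - np.abs(A[i] - A[hit])
    return w

def related_labelset(Y, j, k):
    """Form the k-labelset for label j: j plus its k-1 most related labels."""
    others = [c for c in range(Y.shape[1]) if c != j]
    A_j = Y[:, others]                    # related label matrix A_j
    w = relief_weights(A_j, Y[:, j])      # weight of each remaining label on y_j
    top = np.argsort(w)[::-1][: k - 1]
    return sorted([j] + [others[t] for t in top])
```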

3.4. Building Diagnostic Instances Labelsets

For an undiagnosed record instance $x$, the following two quantities are calculated for each label:
$$\tau(x, y_j) = \sum_{r=1}^{m} \left[\!\left[ y_j \in L_k(l_r) \right]\!\right], \quad 1 \le j \le M,$$
$$\mu(x, y_j) = \sum_{r=1}^{m} \left[\!\left[ y_j \in \sigma^{-1}_{L_k(l_r)}\!\left(g^+_{L_k(l_r)}(x)\right) \right]\!\right],$$
where $\tau(x, y_j)$ counts the maximum possible number of votes of the diagnostic ensemble on label $y_j$, and $\mu(x, y_j)$ counts the actual number of positive votes on label $y_j$. $[\![\pi]\!]$ denotes the indicator of the predicate $\pi$: it equals 1 if $\pi$ is true and 0 otherwise. $\sigma_{L_k(l_r)}: 2^{L_k(l_r)} \to \mathbb{N}$ is the mapping from the power set of $L_k(l_r)$ to the natural numbers, $\sigma^{-1}_{L_k(l_r)}$ is its inverse, and $g^+_{L_k(l_r)}$ denotes the multi-class classifier constructed for the $r$th k-labelset.
Then, the set of diagnostic labels for an undiagnosed instance is represented as follows:
$$Y = \{y_j \mid \mu(x, y_j)/\tau(x, y_j) > 0.5,\ 1 \le j \le M\}.$$
In other words, $y_j$ is considered to be related to $x$ when the actual number of votes exceeds half of the maximum number of votes. For an ensemble created from k-labelsets, the maximum number of votes on each label is $mk/M$ on average.
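Continuing the earlier training sketch, the voting step (τ, μ, and the 0.5 threshold) for an undiagnosed instance might look as follows; it assumes the hypothetical train_rakel output and its bit-wise powerset encoding from the Section 3.2 sketch.

```python
import numpy as np

def rakel_predict(ensemble, x, n_labels, threshold=0.5):
    """Vote over all LP classifiers: a label is diagnosed as relevant when the
    actual votes mu exceed `threshold` times the maximum possible votes tau."""
    tau = np.zeros(n_labels)   # maximum possible votes per label
    mu = np.zeros(n_labels)    # actual positive votes per label
    for labelset, clf in ensemble:
        class_id = int(clf.predict(x.reshape(1, -1))[0])
        for pos, j in enumerate(labelset):
            tau[j] += 1
            mu[j] += (class_id >> pos) & 1   # invert the powerset encoding
    return (mu / np.maximum(tau, 1) > threshold).astype(int)
```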
Table 1 shows an example of the diagnostic voting process of RAkEL for a multi-label training set with M = 6 labels, run with labelset size k = 3 and m = 7 classifiers.

3.5. Evaluation Indicators System

3.5.1. Hamming Loss

Hamming Loss evaluates how often labels are misdiagnosed, covering both cases: relevant labels misdiagnosed as irrelevant and irrelevant labels misdiagnosed as relevant.
$$\text{Hamming Loss} = \frac{1}{p}\sum_{i=1}^{p} \frac{1}{q}\,|h(x_i)|,$$
where $p$ denotes the number of instances, $q$ is the number of labels, and $|h(x_i)|$ denotes the number of misdiagnosed labels for instance $x_i$.

3.5.2. Ranking Loss

Ranking Loss evaluates the proportion of reversely ordered label pairs, i.e., the proportion of cases in which an irrelevant label is ranked at least as high as a relevant label.
$$\text{Ranking Loss} = \frac{1}{p}\sum_{i=1}^{p} \frac{1}{|Y_i||\bar{Y}_i|}\,\bigl|\{(l', l'') \mid f(x_i, l') \le f(x_i, l''),\ (l', l'') \in Y_i \times \bar{Y}_i\}\bigr|,$$
where $l'$ denotes an actual relevant label of the $i$th instance and $l''$ an actual irrelevant label of that instance; $\bar{Y}_i$ is the set of irrelevant labels of the instance; and $f(x, y)$ is a real-valued function, obtained from the classifier system, indicating the confidence that $y$ is a relevant label of $x$.

3.5.3. One Error

One Error evaluates the proportion of instances whose top-ranked label is not a relevant label of the instance.
$$\text{One Error} = \frac{1}{p}\sum_{i=1}^{p} \left[\!\left[ \arg\max_{l \in L} f(x_i, l) \notin Y_i \right]\!\right],$$
where $\arg\max_{l \in L} f(x_i, l)$ denotes the label ranked highest by the classifier for instance $x_i$.

3.5.4. Coverage

Coverage evaluates how many steps are needed, on average, to move down the ranked label list to cover all the relevant labels of an instance.
$$\text{Coverage} = \frac{1}{p}\sum_{i=1}^{p} \max_{l \in Y_i} rank_f(x_i, l) - 1,$$
where $rank_f$ is the rank function corresponding to the real-valued function $f$.

3.5.5. Average Precision

Average Precision (AP) evaluates the average fraction of relevant labels that are ranked above a given relevant label.
$$\text{AP} = \frac{1}{p}\sum_{i=1}^{p} \frac{1}{|Y_i|} \sum_{l \in Y_i} \frac{\bigl|\{l' \in Y_i \mid rank_f(x_i, l') \le rank_f(x_i, l)\}\bigr|}{rank_f(x_i, l)},$$
where $rank_f(x_i, l)$ is the rank of label $l$ in the diagnostic result for instance $x_i$, derived from the real-valued function $f$.
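Most of these indicators have ready-made counterparts in scikit-learn, so the evaluation might be sketched as below for a binary ground-truth matrix Y_true, a binary diagnosis matrix Y_pred, and a real-valued score matrix F; One Error has no scikit-learn counterpart and is computed directly. The variable names are illustrative.

```python
import numpy as np
from sklearn.metrics import (hamming_loss, coverage_error, label_ranking_loss,
                             label_ranking_average_precision_score)

def evaluate(Y_true, Y_pred, F):
    """Y_true, Y_pred: (p, q) binary matrices; F: (p, q) confidence scores."""
    one_error = np.mean([Y_true[i, np.argmax(F[i])] == 0 for i in range(len(F))])
    return {
        "Hamming Loss": hamming_loss(Y_true, Y_pred),
        "One Error": one_error,
        "Coverage": coverage_error(Y_true, F) - 1,   # sklearn counts from 1
        "Ranking Loss": label_ranking_loss(Y_true, F),
        "AP": label_ranking_average_precision_score(Y_true, F),
    }
```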

4. Algorithm Validation

The dataset used in this paper is the historical maintenance record dataset of traction transformer for high-speed trains. The maintenance record contains condition monitoring data under different monitoring indicators several times, and the monitoring indicators include BV, moisture content, acid value, etc. Specific monitoring values can reflect certain fault phenomena, and different combinations of fault manifestations can provide recommendations for high-speed train traction transformer maintenance methods.

4.1. Parameter Selection

In the RAkEL algorithm, the settings of the labelset size k and the number of classifiers m have a large impact on the diagnostic results. In this paper, 70% of the dataset is randomly selected as the training set and 30% as the test set, and the values of k and m are discussed.
With different values of k (m = 2M), the change in each evaluation indicator is shown in Figure 2, where Coverage has been normalized.
With different values of m (k = 2), the changes in each evaluation indicator are shown in Figure 3, where Coverage has been normalized.
With different values of m (k = 3), the changes in each evaluation indicator are shown in Figure 4, where Coverage has been normalized.
With different values of m (k = 4), the changes in each evaluation indicator are shown in Figure 5, where Coverage has been normalized.
From Figure 3, Figure 4 and Figure 5, it can be seen that the comprehensive performance is good when k is set to 3 and m is set to 34, and the results of the indicators near the optimal value show a smooth trend. Considering the test results and suggestions [28], the labelset size k is determined as 3 and the number of classifiers m is determined as 34.
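The parameter study above could be reproduced with a simple sweep over k and m on a 70/30 split, reusing the hypothetical train_rakel, rakel_predict, and evaluate helpers sketched earlier; the grid of m values and the reuse of hard votes as ranking scores are illustrative simplifications.

```python
import numpy as np
from sklearn.model_selection import train_test_split

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.3, random_state=42)

results = {}
for k in (2, 3, 4):
    for m in range(10, 61, 4):
        ens = train_rakel(X_tr, Y_tr, k=k, m=m)
        Y_hat = np.vstack([rakel_predict(ens, x, Y_tr.shape[1]) for x in X_te])
        # Hard 0/1 votes are reused as scores here; a soft voting ratio would
        # give better-behaved ranking-based indicators.
        results[(k, m)] = evaluate(Y_te, Y_hat, Y_hat.astype(float))
```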

4.2. Comparison of Evaluation Indicators before and after Improvement

To reduce the impact of randomness in RAkEL, this paper adds the Relief algorithm, which mines label correlations, so that labels with stronger correlations are selected to form each labelset, reducing the redundant and irrelevant combinations produced by random labelset selection. In this paper, 70% of the dataset is randomly selected as the training set and 30% as the test set. Based on the parameters set in the previous section, the indicators before and after the RAkEL improvement are shown in Table 2. For Hamming Loss, One Error, Coverage, and Ranking Loss, smaller values indicate better performance; for AP, larger values indicate better performance.
As shown in Figure 6, the improved RAkEL outperforms the ordinary RAkEL on all indicators, with an AP improvement of 7.4%. This indicates that the improved RAkEL algorithm proposed in this paper, which treats labels as features and calculates the correlations between labels so that related labels are preferentially selected to compose k-labelsets, effectively captures the associations between faults and their concurrent failures. It significantly reduces the impact of the random labelset selection in ordinary RAkEL, validating the effectiveness of the improvement.

4.3. Comparison with Other Multi-Label Classification Algorithms

Using the dataset and parameters set in the previous section, Table 3 compares the indicators of this experiment with those of other multi-label classification algorithms, including BR and CLR. For Hamming Loss, One Error, Coverage, and Ranking Loss, smaller values indicate better performance; for AP, larger values indicate better performance.
From Table 3, the improved RAkEL has the best overall performance across the indicators, except for a slightly lower AP compared to BR.
The comparison of the improved RAkEL indicators with other methods is shown in Figure 7.
As shown in the figure, the improved RAkEL has the best comprehensive performance compared to other classification algorithms in the application of a high-speed train traction system’s fault dataset. This indicates that the improved RAkEL has a better performance in mining the correlation between fault phenomenon labels, and also shows the effectiveness and applicability of mining the correlation of fault phenomena for fault diagnosis and taking effective maintenance measures. The actual fault diagnosis results are shown in Table 4.
In this case, if a breakdown voltage > 50 kV is detected, the probability of diagnosing a C3-level repair is 34.8%; a C4-level repair, 52.2%; and a C6-level repair, 13.0%. The diagnostic accuracy of the final maintenance method reached 93.62%, which is 8.62% higher than the commonly used diagnostic method based on the BPLN model in current engineering, indicating that the method proposed in this paper has practical application value [30].

5. Conclusions

It is very important to carry out efficient and accurate fault prediction for the main transformer, as it is the component of the high-speed train traction system with the most fault types and the most frequently monitored phase points. The improvement in AP achieved by the proposed algorithm over the original version is relatively small, and there is still room for improvement in evaluation indicators such as AP in comparison with other algorithms. The next step is to study how to achieve higher-accuracy fault diagnosis while maintaining the current performance on the other indicators.

Author Contributions

Conceptualization, M.L., X.Z. and Y.W.; Data curation, X.Z., S.Q. and Z.B.; Funding acquisition, M.L. and Y.W.; Investigation, X.Z., S.Q. and Z.B.; Methodology, M.L. and Y.W.; Resources, M.L.; Validation, X.Z.; Writing—original draft, S.Q. and Z.B. All authors have read and agreed to the published version of the manuscript.

Funding

This research is partially supported by the State Key Laboratory of Rail Traffic Control and Safety (contract No. RCS2022ZT006) and the Youth Program of the National Natural Science Foundation of China (Award number 52002019).

Data Availability Statement

The data will be made available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Y. Exploring Real-time Fault Detection of High-speed Train Traction Motor Based on Machine Learning and Wavelet Analysis. Neural Comput. Appl. 2022, 34, 9301–9314. [Google Scholar] [CrossRef]
  2. Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S.; Yin, K. A Review of Process Fault Detection and Diagnosis Part III: Process History Based Methods. Comput. Chem. Eng. 2003, 27, 327–346. [Google Scholar] [CrossRef]
  3. Chen, H.; Jiang, B. A Review of Fault Detection and Diagnosis for the Traction System in High-Speed Trains. IEEE Trans. Intell. Transp. Syst. 2020, 21, 450–465. [Google Scholar] [CrossRef]
  4. Gonzalez-Jimenez, D.; del-Olmo, J.; Poza, J.; Garramiola, F.; Madina, P. Data-Driven Fault Diagnosis for Electric Drives: A Review. Sensors 2021, 21, 4024. [Google Scholar] [CrossRef]
  5. Song, Y.; Zhao, G.; Zhang, B.; Chen, H.; Deng, W.; Deng, W. An enhanced distributed differential evolution algorithm for portfolio optimization problems. Eng. Appl. Artif. Intell. 2023, 121, 106004. [Google Scholar] [CrossRef]
  6. Sun, Q.; Zhang, M.; Zhou, L.; Garme, K.; Burman, M. A machine learning-based method for prediction of ship performance in ice: Part I. ice resistance. Mar. Struct. 2022, 83, 103181. [Google Scholar] [CrossRef]
  7. Yan, W.; Shao, H. Application of support vector machine nonlinear classifier to fault diagnoses. In Proceedings of the 4th World Congress on Intelligent Control and Automation (Cat. No.02EX527), Shanghai, China, 10–14 June 2002; pp. 2697–2700. [Google Scholar]
  8. Qie, X.; Zhang, J.; Zhang, J. Research of the Machinery Fault Diagnosis and Prediction Based on Support Vector Machine. In Proceedings of the 2015 3rd International Conference on Machinery, Materials and Information Technology Applications, Qingdao, China, 28–29 November 2015; pp. 635–639. [Google Scholar]
  9. Zhang, Z.; Guo, H. Research on Fault Diagnosis of Diesel Engine Based on PSO-SVM. In Proceedings of the 6th International Asia Conference on Industrial Engineering and Management Innovation: Innovation and Practice of Industrial Engineering and Management; Qi, E., Ed.; Atlantis Press: Paris, France, 2016; pp. 509–517. [Google Scholar]
  10. Zhu, J.; Li, S.; Liu, Y.; Dong, H. A Hybrid Method for the Fault Diagnosis of Onboard Traction Transformers. Electronics 2022, 11, 762. [Google Scholar] [CrossRef]
  11. Cai, B.; Liu, Y.; Hu, J.; Liu, Z.; Wu, S.; Ji, R. Bayesian Networks in Fault Diagnosis: Practice and Application; World Scientific: Singapore, 2018; pp. 1–420. [Google Scholar]
  12. Xiao, Y.; Pan, W.; Guo, X.; Bi, S.; Feng, D.; Lin, S. Fault Diagnosis of Traction Transformer Based on Bayesian Network. Energies 2020, 13, 4966. [Google Scholar] [CrossRef]
  13. Li, M. The Application of PCA and SVM in Rolling Bearing Fault Diagnosis. Adv. Mater. Res. 2012, 430–432, 1163–1166. [Google Scholar] [CrossRef]
  14. Li, M.; Zhang, J.; Song, J.; Li, Z.; Lu, S. A Clinical-Oriented Non-Severe Depression Diagnosis Method Based on Cognitive Behavior of Emotional Conflict. IEEE Trans. Comput. Soc. Syst. 2023, 10, 131–141. [Google Scholar] [CrossRef]
  15. Zhu, J.; Li, S.; Dong, H. Running Status Diagnosis of Onboard Traction Transformers Based on Kernel Principal Component Analysis and Fuzzy Clustering. IEEE Access 2021, 9, 121835–121844. [Google Scholar] [CrossRef]
  16. Wu, Y.; Liu, X.; Zhou, Y. Deep PCA-Based Incipient Fault Diagnosis and Diagnosability Analysis of High-Speed Railway Traction System via FNR Enhancement. Machines 2023, 11, 475. [Google Scholar] [CrossRef]
  17. Chen, H.; Jiang, B.; Lu, N.; Mao, Z. Deep PCA Based Real-Time Incipient Fault Detection and Diagnosis Methodology for Electrical Drive in High-Speed Trains. IEEE Trans. Veh. Technol. 2018, 67, 4819–4830. [Google Scholar] [CrossRef]
  18. Zhang, M.; Zhou, Z. A Review on Multi-Label Learning Algorithms. IEEE Trans. Knowl. Data Eng. 2014, 26, 1819–1837. [Google Scholar] [CrossRef]
  19. Boutell, M.R.; Luo, J.; Shen, X.; Brown, C.M. Learning Multi-label Scene Classification. Pattern Recognit. 2004, 37, 1757–1771. [Google Scholar] [CrossRef]
  20. Read, J.; Martino, L.; Luengo, D. Efficient Monte Carlo Methods for Multi-dimensional Learning with Classifier Chains. Pattern Recognit. 2014, 47, 1535–1546. [Google Scholar] [CrossRef]
  21. Fürnkranz, J.; Hüllermeier, E.; Mencía, E.L.; Klaus, B. Multilabel Classification via Calibrated Label Ranking. Mach. Learn. 2008, 73, 133–153. [Google Scholar] [CrossRef]
  22. Zhang, M.; Zhou, Z. ML-KNN: A Lazy Learning Approach to Multi-label Learning. Pattern Recognit. 2007, 40, 2038–2048. [Google Scholar] [CrossRef]
  23. Bian, J.; Li, X.; Li, F.; Zheng, Z.; Zha, H. Ranking Specialization for Web Search: A Divide-and-conquer Approach by Using Topical RankSVM. In Proceedings of the 19th International Conference on World Wide Web, Raleigh, NC, USA, 26–30 April 2010; pp. 165–181. [Google Scholar]
  24. Ivasic-Kos, M.; Pobar, M. Multi-label Classification of Movie Posters into Genres with Rakel Ensemble Method. In Artificial Intelligence XXXIV; Bramer, M., Petridis, M., Eds.; Springer: Cham, Switzerland, 2017; Volume 10630, pp. 370–383. [Google Scholar]
  25. Li, X.; Lin, L.; Chen, L. Identification of Protein Functions in Mouse with a Label Space Partition Method. Math. Biosci. Eng. 2022, 19, 3820–3842. [Google Scholar] [CrossRef]
  26. Chou, K. Some Remarks on Predicting Multi-label Attributes in Molecular Biosystems. Mol. Biosyst. 2013, 9, 1096–1100. [Google Scholar] [CrossRef]
  27. Aivatoglou, G.; Anastasiadis, M.; Spanos, G.; Voulgaridis, A.; Votis, K.; Tzovaras, D.; Angelis, L. A RAkEL-based Methodology to Estimate Software Vulnerability Characteristics & Score-an Application to EU Project ECHO. Multimed. Tools Appl. 2011, 81, 9459–9479. [Google Scholar]
  28. Tsoumakas, G.; Katakis, I.; Vlahavas, I. Random k-labelsets for Multilabel Classification. IEEE Trans. Knowl. Data Eng. 2011, 23, 1079–1089. [Google Scholar] [CrossRef]
  29. Zhang, C.; Li, Z. Multi-label Learning with Label-specific Features via Weighting and Label Entropy Guided Clustering Ensemble. Neurocomputing 2021, 419, 59–69. [Google Scholar] [CrossRef]
  30. Guo, L.; Tang, J.; Tang, L.; Zhan, Y.; Li, F. A Method of Transformer Fault Diagnosis Based on Improved BP Neural Network. Meas. Control. Inf. Technol. 2021, 71–77. [Google Scholar] [CrossRef]
Figure 1. Actual traction transformer and measured phase points.
Figure 2. Impact of k setting on evaluation indicators (m = 34).
Figure 3. Impact of m setting on evaluation indicators (k = 2).
Figure 4. Impact of m setting on evaluation indicators (k = 3).
Figure 5. Impact of m setting on evaluation indicators (k = 4).
Figure 6. Changes in evaluation indicators before and after RAkEL improvement.
Figure 7. Comparison of improved RAkEL with other algorithms.
Table 1. Example of voting process for RAkEL.

Classifier | k-Labelset | y1 | y2 | y3 | y4 | y5 | y6
$g^+_{L_k(l_1)}$ | {y1, y2, y3} | 1 | 0 | 1 | - | - | -
$g^+_{L_k(l_2)}$ | {y2, y3, y5} | - | 1 | 0 | - | 1 | -
$g^+_{L_k(l_3)}$ | {y3, y4, y6} | - | - | 1 | 0 | - | 0
$g^+_{L_k(l_4)}$ | {y2, y4, y6} | - | 0 | - | 1 | - | 1
$g^+_{L_k(l_5)}$ | {y1, y2, y5} | 1 | 0 | - | - | 0 | -
$g^+_{L_k(l_6)}$ | {y1, y2, y4} | 1 | 1 | - | 0 | - | -
$g^+_{L_k(l_7)}$ | {y1, y2, y5} | 0 | 0 | - | - | 1 | -
τ(x) | - | 4 | 6 | 3 | 3 | 3 | 2
μ(x) | - | 3 | 2 | 2 | 1 | 2 | 1
Voting value | - | 3/4 | 2/6 | 2/3 | 1/3 | 2/3 | 1/2
Final result | - | 1 | 0 | 1 | 0 | 1 | 0

(The y1–y6 columns give the diagnostic votes of each classifier on the corresponding label.)
Table 2. RAkEL evaluation indicators before and after improvement (k = 3, m = 34).

Indicator | Improved RAkEL | RAkEL
AP ↑ | 0.304 ± 0.0045 | 0.283 ± 0.003
Coverage ↓ | 0.199 ± 0.005 | 0.408 ± 0.025
Hamming Loss ↓ | 0.037 ± 0.008 | 0.041 ± 0.002
One Error ↓ | 0.026 ± 0.002 | 0.030 ± 0.003
Ranking Loss ↓ | 0.015 ± 0.005 | 0.031 ± 0.002
Table 3. Evaluation indicators of improved RAkEL and other methods.

Indicator | Improved RAkEL | CLR | BR | LP
AP ↑ | 0.304 ± 0.0045 | 0.27 ± 0.003 | 0.305 ± 0.0003 | 0.217 ± 0.012
Coverage ↓ | 0.199 ± 0.005 | 0.751 ± 0.01 | 0.715 ± 0.0382 | 0.441 ± 0.03
Hamming Loss ↓ | 0.037 ± 0.008 | 0.092 ± 0.002 | 0.086 ± 0.001 | 0.085 ± 0.016
One Error ↓ | 0.026 ± 0.002 | 0.061 ± 0.001 | 0.044 ± 0.0002 | 0.057 ± 0.009
Ranking Loss ↓ | 0.015 ± 0.005 | 0.053 ± 0.001 | 0.050 ± 0.003 | 0.029 ± 0.006
Comprehensive Ranking | 1.2 | 3.8 | 2.4 | 2.4
Table 4. Actual fault data maintenance diagnosis results.

Monitoring Indicator | Criterion | C3 | C4 | C6 | Diagnostic Accuracy
BV (kV) | >50 | 34.8% | 52.2% | 13.0% | 100.0%
Moisture content (mg/L) | >10 | 20.7% | 55.2% | 24.1% | 93.1%
Acid value (calculated in KOH) (mg/g) | >0.01 | 38.5% | 50.0% | 11.5% | 88.5%
Dielectric loss factor (90 °C) | >0.005 | 18.2% | 50.0% | 31.8% | 100.0%
H2 (μL/L) | >10 | 21.1% | 78.9% | 0.0% | 94.7%
C2H2 (μL/L) | >0.1 | 33.3% | 40.7% | 25.9% | 88.9%
Total hydrocarbon (μL/L) | >10 | 20.0% | 72.0% | 8.0% | 96.0%

(The C3, C4, and C6 columns give the repair method diagnosis probability for each repair level.)

Citation: Li, M.; Zhou, X.; Qin, S.; Bin, Z.; Wang, Y. Improved RAkEL’s Fault Diagnosis Method for High-Speed Train Traction Transformer. Sensors 2023, 23, 8067. https://doi.org/10.3390/s23198067
