A Subway Sliding Plug Door System Health State Adaptive Assessment Method Based on Interval Intelligent Recognition of Rotational Speed Operation Data Curve

Qi, Hui; Chen, Gaige; Ma, Hongbo; Wang, Xianzhi; Yang, Yudong

doi:10.3390/machines10111075

by Hui Qi ^1,2, Gaige Chen ^1,2,*, Hongbo Ma ^3, Xianzhi Wang ^2,4 and Yudong Yang ^1,2

^1 School of Communications and Information Engineering & School of Artificial Intelligence, Xi’an University of Posts and Telecommunications, Xi’an 710121, China

^2 Shaanxi Union Research Center of University and Enterprise for 5G+ Industrial Internet Communication Terminal Technology, Xi’an University of Posts and Telecommunications, Xi’an 710068, China

^3 School of Mechano-Electronic Engineering, Xidian University, Xi’an 710121, China

^4 School of Automation, Xi’an University of Posts and Telecommunications, Xi’an 710121, China

^* Author to whom correspondence should be addressed.

Machines 2022, 10(11), 1075; https://doi.org/10.3390/machines10111075

Submission received: 9 October 2022 / Revised: 4 November 2022 / Accepted: 8 November 2022 / Published: 15 November 2022

(This article belongs to the Special Issue Feature Extraction and Condition Monitoring in Physics and Mechanics)

Abstract:

The subway sliding plug door system is crucial for ensuring the normal operation of subway vehicles. Because sliding plug door systems differ in structure and motor control procedures, their rotational speed monitoring data curves differ greatly. Recognizing the intervals of such complex data curves is a challenging problem, and it fundamentally affects the sensitivity of feature extraction and the prediction of an assessment model. To address this problem, a subway sliding plug door system health state adaptive assessment method is proposed based on interval intelligent recognition of the rotational speed operation data curve. In the proposed method, firstly, the rotational speed operation data curve is adaptively divided by a long short-term memory (LSTM) neural network into four intervals, according to the motion characteristics of the door system. Secondly, the sensitive features of the door system are screened out by the random forest (RF) algorithm. Finally, the health state of the door system is assessed using the adaptive boosting (AdaBoost) classifier. The proposed method is comprehensively verified on the benchmark experiment data set. The results show that the average diagnostic accuracy of the method on multiple bench doors can reach 98.15%. The wider application scope and the higher state classification accuracy indicate that the proposed method has important engineering value and theoretical significance for the health management of subway sliding plug door systems.

Keywords:

state assessment; interval recognition; long short-term memory; random forest algorithm; adaptive boosting algorithm

1. Introduction

The sliding plug door system is one of the most important subsystems in metro vehicles. As it has been developed and improved, the system has become more and more complex, and its degree of electromechanical integration keeps increasing. In addition, due to many human factors, such as the frequent opening and closing of doors and the large passenger flow in the morning and evening peak periods, the system inevitably encounters various sudden failures, such as mechanical failures, electrical failures, and sensor failures [1], of which mechanical failures occur most frequently. Common causes of mechanical failures include mechanical wear, metal fatigue, and mechanical dimension changes caused by human impact. These factors cause interference between the sliding plug door system and the car body, resulting in local abnormal resistance when the door moves, affecting the motor driving state and the sliding state of the door leaf, and ultimately affecting the normal opening and closing actions. Common mechanical failures include wear of the screw, rupture of the guidepost, etc. These failures seriously affect the normal operation of metro vehicles and even threaten the personal safety of passengers. To avoid this kind of problem, the health assessment of sliding plug door systems has received more and more attention in recent years [2].

In general, there are three approaches for constructing state assessment models [3]: physics-based approaches [4], data-driven approaches [5], and hybrid approaches [6]. Compared with physics-based and hybrid approaches, data-driven approaches have been more widely used in sliding plug door state assessment. Data-driven approaches generally apply statistical analysis, machine learning, or other technical means [7] to the door’s historical monitoring parameter data set in order to find the state trend and degradation law of the key parts of the sliding plug door. Shi et al. [8] proposed a fault prediction model of sliding plug doors based on the random forest (RF) algorithm and completed the fault diagnosis of various parts by extracting the time domain features of the monitoring signal. Cao et al. [9] presented a novel preprocessing method based on empirical mode decomposition (EMD) and a hybrid intrinsic mode function (IMF) selection criterion to reconstruct the acoustic signals of sliding plug doors and completed a state assessment by using a multi-class support vector machine (SVM). Ren et al. [10] established a combination method of fault tree analysis (FTA) and Bayesian network (Bayes) for the fault diagnosis of a subway door braking system. Ma et al. [11] used fuzzy interval theory and the technique for order preference by similarity to ideal solution (TOPSIS) to calculate and rank the risk failure modes of sliding plug doors and completed a reliability analysis of seven typical faults. Xue et al. [12] proposed a feature extraction method for motor rotational speed and current signals, based on a combination of a multi-scale sliding window and extended symbol aggregation approximation (ESAX), and used hierarchical pattern recognition to complete the state assessment of the sliding plug doors.

Although the above methods have achieved good diagnostic results, they hardly consider the operation process of the sliding plug door in their analysis. We found that the sliding plug door has obvious stage characteristics in the operation process, and the performance degradation of different parts may be reflected in different operation stages. Operation interval recognition is therefore very important: it allows each part of the curve to be evaluated with reference to its own characteristics, which can effectively improve the accuracy of the subsequent diagnosis. However, due to the different structures and motor control procedures of different sliding plug door systems, the rotational speed data curves show different degrees of advance and hysteresis, and the curves of different doors also differ in form. Recognizing the intervals of such complex data curves is a challenging problem. Therefore, an intelligent recognition method for data curve intervals is needed, which can adaptively divide the sample intervals according to the curve characteristics. The LSTM neural network has a strong learning ability for the long-term correlations hidden in sequence data and has been widely used in time series prediction. Wang et al. [13] used an LSTM neural network to predict the fault time series of complex systems and used multi-layer grid search to optimize the parameters, further reducing the prediction error. Guo et al. [14] proposed a method that combines the error fusion of multiple sparse autoencoders with the LSTM neural network and sets multiple threshold control lines, according to different machine fault change trends, to achieve more accurate prediction. Because the motor rotational speed curve is a typical time series and the data of different sliding plug doors are complex and highly nonlinear, we chose LSTM to predict the operation intervals of the sliding plug door rotational speed curve.

In addition, extracting key features to represent the dynamic changes of sliding plug door monitoring data is one of the important steps in fault diagnosis. The random forest (RF) algorithm has been widely used because of its low computational complexity and high accuracy of feature importance evaluation. Guo et al. [15] selected the time domain features of the motor rotational speed, current, and angle signals of the sliding plug door system through the RF algorithm, constructed the optimal feature subset, and completed the fault diagnosis of the sliding plug door system after dimension reduction with the t-distributed stochastic neighbor embedding (t-SNE) algorithm. Peng et al. [16] used an RF algorithm to select the features of partial discharge (PD) signals in high voltage cables and used the best features for PD pattern recognition to improve the diagnostic accuracy. Zhou et al. [17] filtered the time domain and frequency domain features of loudspeaker sound response signals through an RF algorithm, constructed the optimal feature subset, and improved the accuracy of anomaly recognition. The above experiments all show the advantages of random forest algorithms in feature selection and signal processing. On the other hand, the fault symptoms of the sliding plug door are weak and its fault modes are complex, which makes it difficult to achieve the desired effect by using a single classifier, such as SVM or a Bayesian network, for fault diagnosis. Previous studies have shown that classifier groups are more suitable for complex fault diagnosis than a single classifier. Chen et al. [18] proposed a rolling bearing life stage identification method based on multi-classifier integration and weighted balanced distribution adaptation, which effectively solved the problem that a small number of samples could not be effectively identified in the rolling bearing life cycle, due to the imbalance of samples under different working conditions. Ji et al. [19] constructed a three-layer classifier group based on Dezert–Smarandache theory (DSmT), selected classification methods to classify different faults of hydraulic valves in the first two layers, and then used DSmT to identify fault types in the third layer by fusing the results. Xu et al. [20] built an AdaBoost integrated classifier for multiple fault identification and used the AdaBoost algorithm to improve the performance of the decision tree (DT) to effectively identify the coupling faults in complex industrial processes.

To address these problems, a subway sliding plug door system health state adaptive assessment method is proposed, based on interval intelligent recognition of the rotational speed operation data curve. The contributions of this paper are summarized as follows: (1) An intelligent interval recognition method is proposed for the rotational speed data of subway sliding plug door systems. (2) An effective health state assessment procedure is proposed, which yields more sensitive features and a higher fault diagnosis accuracy for the door system.

The structure of this paper is organized as follows: Section 2 introduces basic rotational speed operation curves and analyzes their relevant characteristics. Section 3 describes the specific steps of the subway sliding plug door system health state adaptive assessment method. Section 4 presents the implementation of the approaches in an actual case and discusses the results. Finally, the conclusions are given in Section 5.

2. Analysis of Rotational Speed Operation Data Curves

2.1. Basic Analysis of Rotational Speed Data Curves

Although the data collection device of the sliding plug door can obtain the curves of rotational speed, current, and power, only the rotational speed curves are widely used for state assessment, because the rotational speed value provides a lot of information about the sliding plug door, such as its mechanical characteristics. Figure 1 shows the motor rotational speed curve of the subway sliding plug door in normal operation. The x-axis represents the number of sampling points. The y-axis represents the rotational speed value in units of rpm. The curve can be roughly divided into four intervals, each of which has its own characteristics. The first interval is the acceleration stage of the sliding plug door, in which the rotational speed curve appears mainly as a steep peak. The second interval is the high-speed stage, in which the curve is relatively long and flat. The third interval is the deceleration stage, in which the curve shows a sharp downward trend. The fourth interval is the amble stage, in which the curve falls slowly to zero.

2.2. Characteristic Analysis of Rotational Speed Data Curves

The subway sliding plug door has obvious stage characteristics during operation, and the performance degradation of the different parts may be reflected in the different operating stages of the door. When the sliding plug door system degrades from a healthy state to an abnormal state, the change is directly reflected in the rotational speed curve. Table 1 summarizes three different fault types and their corresponding rotational speed curve characteristics; different faults appear in different operating stages of the sliding plug door. Interval recognition is therefore very important: it allows each part of the curve to be assessed with reference to its own characteristics, which effectively improves the fault diagnosis accuracy of the subsequent sliding plug door system assessment.

The current interval recognition method mainly depends on three fixed points to divide curves into four stages. However, the three fixed points may not apply to all doors. Figure 2 shows the motor rotational speed curves of Bench 1 and Bench 2. Bench 1 represents the sliding plug door with the first structure, and Bench 2 represents the sliding plug door with the second structure. It can be clearly seen from the figure that the high-speed stage of Bench 2 spanned sampling points 63–230, which was shorter than that of Bench 1 (32–282). The amble stage of Bench 2 spanned sampling points 295–385, which was longer than that of Bench 1 (355–390). The rotational speed curve of Bench 1 was lower than that of Bench 2 as a whole, and both curves were highly nonlinear. Therefore, a data curve interval intelligent recognition method is needed to divide the sample intervals adaptively, according to the curve characteristics.

3. Subway Sliding Plug Door System Health State Assessment Method

In this paper, an adaptive assessment method for the health state of the subway sliding plug door system, based on interval intelligent recognition of the rotational speed data curve, is proposed. The LSTM model is used to recognize the intervals of the rotational speed curves of the different bench doors, and the multi-interval time domain features are extracted. The RF algorithm is then used to evaluate the importance of the features and to select the features with high importance to form a feature set. Finally, the feature set is input into the AdaBoost ensemble classifier to obtain the classification results.

3.1. Interval Recognition Model Based on LSTM

The subway sliding plug door has obvious stage characteristics in the operation process. The faults of different parts may be reflected in the different operation stages of the door. In practice, due to the different structures and motor control procedures of different sliding plug doors, the rotational speed monitoring data curves differ greatly in form. However, the traditional interval recognition method has poor flexibility and easily misclassifies the intervals, which makes it difficult to extract sensitive features accurately, lowers the classification accuracy, and makes the reliability difficult to guarantee. On the one hand, because the motor rotational speed signal is a time series, LSTM has a strong ability to learn the long-term correlation hidden in the sequence data and has been widely used in time series prediction and fault diagnosis with good results [21]. On the other hand, compared with CNN, DBN, and SAE, LSTM has memory ability, due to the introduction of memory cells and three gates in its network structure [22], which makes it possible to capture the data characteristics of different operating sections of the sliding plug door by making full use of the spatial and time dependence. Therefore, the LSTM model is selected to recognize the operation intervals of different doors.
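For reference, the memory cell and three gates mentioned above are commonly written as follows. This is the standard LSTM cell formulation, given here as an assumption; the paper does not state which variant it uses. The symbols are introduced only for this illustration: $x_t$ is the input at sampling point $t$, $h_t$ the hidden state, $c_t$ the memory cell, $\sigma$ the logistic sigmoid, $\odot$ the element-wise product, and $W$, $U$, $b$ the trainable parameters.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)} \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)} \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)} \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{(candidate memory)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(memory cell update)} \\
h_t &= o_t \odot \tanh(c_t) && \text{(hidden state)}
\end{aligned}
```

The additive update of $c_t$ is what lets information from earlier sampling points persist across a long curve.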

The interval recognition model constructed in this paper is shown in Figure 3. The model can complete the interval recognition of unknown curves by learning the data features of known interval labels. There are two main processes: model training process and model testing process. The first step is the model training process. The purpose of this step is to obtain an LSTM model that can be used for interval recognition of different sliding plug doors through training data. Here, we input the rotational speed curve data of the labeled operation stages of four sliding plug doors with different structures into the model for training. For each type of sliding plug door, we extracted data in four states: normal state (i data), guidepost fault (i data), roller fault (i data), and screw fault (i data). For each type of sliding plug door, we extracted 4 × i pieces of data, all of which had interval range labels. Then, we input these data with interval range labels into the LSTM model for training and output the LSTM model based on interval recognition.

The second step is the model testing process. After the model was obtained, we set the activation function to be the SoftMax function and assigned it to 4 classes to achieve the recognition of 4 operating intervals of test samples.

$$\mathrm{Softmax}(i) = \frac{e^{i}}{\sum_{j} e^{j}} \qquad (1)$$

where $i$ is the network output value for the interval class being scored, and $j$ runs over the output values of all prediction categories (four in this paper).
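As a minimal sketch of how Equation (1) can turn network outputs into interval ranges, the Python below applies SoftMax to a row of four scores per sampling point, picks the most probable interval, and reads off the points where the predicted interval changes. The per-point scoring scheme, the function names, and the score values are all assumptions made for illustration; the paper does not specify the output layout of its LSTM model.

```python
import math

INTERVALS = ["acceleration", "high-speed", "deceleration", "amble"]

def softmax(scores):
    """Numerically stable SoftMax over a list of raw scores (Equation (1))."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def label_points(score_rows):
    """Pick the most probable interval index for every sampling point."""
    labels = []
    for row in score_rows:
        probs = softmax(row)
        labels.append(probs.index(max(probs)))
    return labels

def segmentation_points(labels):
    """Sampling-point indices at which the predicted interval changes."""
    return [k for k in range(1, len(labels)) if labels[k] != labels[k - 1]]

# Hypothetical scores for six sampling points (not data from the paper).
scores = [
    [4.0, 1.0, 0.0, 0.0],
    [3.0, 2.0, 0.0, 0.0],
    [0.5, 3.5, 0.0, 0.0],
    [0.0, 1.0, 3.0, 0.5],
    [0.0, 0.0, 1.0, 2.5],
    [0.0, 0.0, 0.5, 3.0],
]
labels = label_points(scores)
print([INTERVALS[k] for k in labels])
print(segmentation_points(labels))
```

Three segmentation points are enough to describe four intervals, which matches the three division points discussed in Section 2.2.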

In the testing process, we input a rotational speed curve with an unknown range to the LSTM model that had completed the training, and the output of the model was the four prediction interval ranges of this curve. We verified the effectiveness of the method through experiments in Section 4.2. The difference between the curve interval identified by the LSTM model and the actual interval was small, and the operating interval of different doors could be accurately identified. Additionally, in Section 4.3, the improvement of the LSTM interval recognition model for the recognition accuracy of the state of the sliding plug door was verified.

3.2. Feature Screening of Subway Sliding Plug Door System

Considering the lack of frequency domain information of subway sliding plug door monitoring, this paper mainly extracts the time domain statistical features. These features can be divided into dimensional features and dimensionless features [23]. Among them, skewness factor, kurtosis factor, peak factor, waveform factor, and pulse factor belong to dimensionless features, which are not affected by the various working conditions of the sliding plug door system and have a good diagnosis effect on weak faults. Peak value, peak-to-peak value, average value, root mean square (RMS), and variance value belong to dimensional features, which can reflect the size of rotational speed signal and signal fluctuation. The specific formulae and feature definitions are shown in Table 2.

Then, we constructed the subway sliding plug door state feature vector set $M \in R^{N \times P}$ by extracting the time domain feature indices of the rotational speed signal in each stage of the subway sliding plug door movement.

$$M = \begin{bmatrix} m_{1} \\ m_{2} \\ \vdots \\ m_{N} \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & \dots & m_{1P} \\ m_{21} & m_{22} & \dots & m_{2P} \\ \vdots & \vdots & \ddots & \vdots \\ m_{N1} & m_{N2} & \dots & m_{NP} \end{bmatrix} \qquad (2)$$

where $N$ represents the number of rotational speed data, and each row of the matrix represents the features extracted from one piece of data. From each rotational speed datum, an initial feature vector $m_{i} = [m_{i1}, m_{i2}, \dots, m_{iP}]$ can be extracted. $P = 4 \times 10 = 40$ represents the number of original features, where 4 represents the four sliding plug door operation stages recognized by the LSTM model and 10 represents the 10 time domain statistical features.
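The construction of one row $m_i$ can be sketched as follows. The formulae of Table 2 are not reproduced in this text, so the definitions below are the commonly used ones and should be read as assumptions, as should the feature order (dimensional features first), the function names, the synthetic curve, and the segmentation points.

```python
import math

def time_domain_features(x):
    """Ten time domain statistics of one interval of the speed signal."""
    n = len(x)
    mean = sum(x) / n
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    peak_to_peak = max(x) - min(x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    variance = sum((v - mean) ** 2 for v in x) / n
    std = math.sqrt(variance)
    # Guard against constant segments, where the normalised moments are undefined.
    skewness = sum((v - mean) ** 3 for v in x) / (n * std ** 3) if std else 0.0
    kurtosis = sum((v - mean) ** 4 for v in x) / (n * std ** 4) if std else 0.0
    peak_factor = peak / rms if rms else 0.0
    waveform_factor = rms / abs_mean if abs_mean else 0.0
    pulse_factor = peak / abs_mean if abs_mean else 0.0
    return [peak, peak_to_peak, mean, rms, variance,
            skewness, kurtosis, peak_factor, waveform_factor, pulse_factor]

def feature_vector(curve, boundaries):
    """Concatenate the features of the four intervals into one row m_i."""
    edges = [0] + list(boundaries) + [len(curve)]
    row = []
    for start, stop in zip(edges[:-1], edges[1:]):
        row.extend(time_domain_features(curve[start:stop]))
    return row

# Synthetic 400-point curve and made-up segmentation points, for illustration only.
curve = ([30.0 * k for k in range(32)]
         + [1000.0 + (k % 5) for k in range(250)]
         + [1000.0 - 12.0 * k for k in range(73)]
         + [120.0 - 2.5 * k for k in range(45)])
m_i = feature_vector(curve, boundaries=[32, 282, 355])
print(len(m_i))  # 4 intervals x 10 features = 40
```

Stacking one such row per recorded curve gives the $N \times P$ matrix $M$ of Equation (2).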

In this paper, we chose the random forest (RF) algorithm for feature screening. The random forest has been applied in many fields, such as the development of medical prediction models [24] and the fault diagnosis of rolling bearings [25]. The outstanding advantages of the RF algorithm are as follows: (1) its feature importance evaluation is carried out automatically in the process of random forest training, with low computational complexity and easy implementation. (2) The RF algorithm randomly selects samples and features and has two-dimensional randomness, so it has strong generalization ability and high accuracy in feature importance evaluation.

In the RF algorithm, the classification accuracy changes greatly when noise is added to an important feature. In this paper, feature importance is therefore taken as the index for random forest feature selection. It is calculated by adding noise interference to each feature in the sample in turn and computing the difference in prediction accuracy on the out-of-bag data before and after the change. The feature importance $D_{j}$ is defined as follows:

$$D_{j} = \frac{1}{B} \sum_{b = 1}^{B} (R_{b}^{oob} - R_{bj}^{oob}) \qquad (3)$$

where $B$ represents the total number of decision trees in the forest, $R_{b}^{oob}$ represents the number of out-of-bag data correctly classified by the $b$-th decision tree, and $R_{bj}^{oob}$ represents the number of out-of-bag data correctly classified by the $b$-th decision tree after noise is added to feature $j$.

After calculating all feature importance values using the RF algorithm, we arrange them in descending order, $D = [D_{1}, D_{2}, D_{3}, \dots, D_{40}]$. After that, we extract the top 10 features with the highest importance as a feature set $F = [F_{1}, F_{2}, \dots, F_{10}]$ and input them into the subsequent classifier to obtain the state prediction result of the sliding plug door. In Section 4.3, we demonstrate the effectiveness of this method on various bench doors.
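The ranking step can be illustrated with a short sketch that evaluates Equation (3) from per-tree out-of-bag counts and keeps the highest-ranked features. It assumes the counts have already been obtained from a trained forest; the function names and the numbers are hypothetical and are not results from the paper.

```python
def feature_importance(oob_correct, oob_correct_noisy):
    """Equation (3): average drop in correct out-of-bag classifications.

    oob_correct[b]          -> R_b^oob for decision tree b
    oob_correct_noisy[b][j] -> R_bj^oob for tree b after adding noise to feature j
    """
    n_trees = len(oob_correct)
    n_features = len(oob_correct_noisy[0])
    return [
        sum(oob_correct[b] - oob_correct_noisy[b][j] for b in range(n_trees)) / n_trees
        for j in range(n_features)
    ]

def top_k_features(importance, k):
    """Indices of the k most important features, in descending order of D_j."""
    order = sorted(range(len(importance)), key=lambda j: importance[j], reverse=True)
    return order[:k]

# Hypothetical counts for 3 trees and 4 features.
R = [50, 48, 52]
R_noisy = [
    [50, 40, 47, 49],
    [47, 39, 45, 48],
    [52, 41, 49, 51],
]
D = feature_importance(R, R_noisy)
print(D)
print(top_k_features(D, k=2))
```

In the paper's setting the same selection would be applied with 40 features and k = 10.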

3.3. AdaBoost Integrated Learning Algorithm

The sliding plug door system is a complex mechatronics system with many components. Electromechanical coupling and multi-component coupling are present, and the system has weak fault symptoms and complex fault modes, which makes it difficult to establish an accurate multi-fault diagnosis model. The feature matrix formed by extracting features from each interval of the rotational speed curve is complex in structure and high in feature correlation and redundancy, which further increases the complexity of fault diagnosis. Traditional fault diagnosis methods, such as SVM, Bayes, and other machine learning methods, are not satisfactory when dealing with sliding plug door problems. Although SVM has been widely used in the field of fault diagnosis [26], it has some defects: its ability to process complex feature sets is insufficient, it easily falls into overfitting, its generalization ability is poor, and it is sensitive to the choice of parameters and kernel functions. Although the Bayes algorithm [27] overcomes the parameter selection and overfitting difficulties of SVM, the high correlation of features in the sample feature matrix we extracted makes it difficult to meet the assumption of sample attribute independence, so its accuracy is difficult to guarantee. The PCA fault diagnosis method uses a fault threshold index to distinguish between normal and abnormal states. Although it can judge whether a fault has occurred, it cannot identify the fault type. Since ours is a multi-classification problem that requires multiple fault states to be identified accurately, the PCA algorithm is not appropriate here.

The AdaBoost integrated learning algorithm is an iterative algorithm based on the framework of the boosting algorithm [28]. Its core idea is to train multiple weak learners iteratively and adjust their weights according to their errors, reducing the weights of learners with large errors and increasing the weights of learners with small errors. Finally, the weighted weak learners are combined into a strong learner. Because the AdaBoost algorithm cascades multiple weak classifiers and the deviation of the test results is continually reduced during iteration, it does not easily fall into overfitting. Therefore, compared with single classifiers, such as SVM and Bayes, the AdaBoost algorithm can often obtain higher accuracy, and the volatility of its classification results is small. More importantly, AdaBoost can effectively integrate the multi-stage time domain sensitive features of sliding plug door operation, solving the problem that Bayes cannot handle highly correlated features, and greatly improving the accuracy. At present, the AdaBoost algorithm has been successfully applied to many kinds of fault diagnosis [29] and has achieved good diagnosis results. Therefore, the AdaBoost algorithm was selected to evaluate the state of the sliding plug door system. Figure 4 shows the framework of the AdaBoost ensemble model.
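The weighting idea can be made concrete with a deliberately simplified sketch. It is the classic two-class AdaBoost with decision stumps as weak learners, not the four-class classifier used in the paper, and the single feature, the labels (+1 for normal, -1 for fault), and all function names are assumptions chosen for illustration.

```python
import math

def fit_stump(X, y, w):
    """Best weighted decision stump: (error, feature, threshold, polarity)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for polarity in (1, -1):
                # Predict `polarity` when the feature is <= t, else the opposite.
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if (polarity if row[f] <= t else -polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    return best

def stump_predict(stump, row):
    _, f, t, polarity = stump
    return polarity if row[f] <= t else -polarity

def adaboost_fit(X, y, n_rounds):
    """Train n_rounds stumps; returns a list of (learner weight, stump)."""
    w = [1.0 / len(X)] * len(X)          # sample weights, uniform at the start
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        # Learner weight: the smaller the weighted error, the larger alpha.
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Misclassified samples gain weight, correctly classified ones lose it.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate (hypothetical, one feature).
X = [[float(v)] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = adaboost_fit(X, y, n_rounds=3)
print([adaboost_predict(model, row) for row in X] == y)
```

Each stump alone misclassifies at least three of the ten samples, while the weighted vote of three stumps fits all of them, which is the sense in which weak learners are integrated into a strong learner.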

3.4. Health State Assessment Model of Subway Sliding Plug Door System

In this paper, an adaptive assessment method is proposed for the health state of subway sliding plug door system based on interval intelligent recognition of rotational speed operation data curve. The framework of the proposed method is shown in Figure 5, and general procedures and relevant processing effort are summarized as follows:

Step 1: Data acquisition and preprocessing. Collect the motor rotational speed data of four different sliding plug doors in normal and three abnormal states (guidepost fault, screw fault, roller fault) and align all data to 400 points.
Step 2: Constructing the LSTM model based on interval recognition. We used the curves of the labeled intervals under the four different bench doors as the training set to train the model parameters of the LSTM to obtain the LSTM interval recognition model, and finally input the test curves into the model to obtain the interval recognition results. Additionally, we compared the LSTM model with other deep learning algorithms CNN, DBN, and SAE.
Step 3: Feature screening based on RF algorithm. We extracted 40 time domain features according to the operating range of the door, used the random forest algorithm to calculate the feature importance and arrange them in descending order, and extracted the top 10 features as a sensitive feature set and input them into the subsequent classifier. The top 10 features, the top 20 features, the top 30 features, and the top 40 features were used as feature sets for comparison and verification.
Step 4: Input the sensitive feature set into the AdaBoost classifier for training and testing to obtain the prediction result. Additionally, compare it with SVM, KNN, and Bayes methods.
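Step 1 aligns every curve to 400 points, but the alignment procedure is not described. One plausible option, shown below purely as an assumption, is linear interpolation onto a fixed number of evenly spaced samples; padding or truncation would be alternatives. The function name `resample` is hypothetical.

```python
def resample(curve, n_points=400):
    """Linearly interpolate a speed curve onto n_points evenly spaced samples."""
    if len(curve) == 1:
        return [float(curve[0])] * n_points
    step = (len(curve) - 1) / (n_points - 1)
    out = []
    for k in range(n_points):
        pos = k * step
        # Clamp so that the last sample interpolates inside the final segment.
        lo = min(int(pos), len(curve) - 2)
        frac = pos - lo
        out.append(curve[lo] * (1 - frac) + curve[lo + 1] * frac)
    return out

# A made-up 385-point recording stretched to the common length of 400.
raw = [float(v) for v in range(385)]
aligned = resample(raw)
print(len(aligned))
print(resample([0, 10], n_points=5))
```

Resampling changes the sampling-point axis, so any interval labels would have to be rescaled by the same factor.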

4. Experiment Verification and Results Discussion

4.1. Case Introduction

To verify the effectiveness of the proposed method and the feasibility of its engineering application, this paper uses the bench door as an application test platform. Figure 6 and Figure 7 show the physical drawing and the structural drawing of the sliding plug door transmission system, respectively, including the specific positions of the guidepost, roller, and screw in the sliding plug door system. First, we checked whether the relevant dimensions and functions of the bench door met the inspection standards and confirmed that the door had no fault in any form. Then, to create abnormal conditions, we manually installed the guidepost fault parts, roller fault parts, and screw fault parts, simulated the working state under the influence of single factors, and obtained the motor rotational speed experimental data. The motor used in this paper was the K65ZW-110-150 permanent magnet brushless DC motor. The specific parameters of the motor are as follows: the rated voltage is 110 V DC, the rated current is 2.1 A, the rated power is 150 W, and the rated speed is 3200 rpm. When collecting the data, we used a sensor with a sampling rate of 10 Hz. The motor rotational speed was calculated from the position signal collected by a Hall sensor (A1220LUA-T, ALLEGRO).

4.2. Performance Analysis of Interval Recognition Model

To accurately identify the operating intervals of different bench doors for the subsequent extraction of sensitive features and state assessment, this section verifies the effectiveness of the proposed LSTM interval recognition model. We selected four different bench doors and, for each bench door, four states, namely the health state, guidepost fault, roller fault, and screw fault. For each state, five interval-labeled curves were extracted as the training set, giving 80 training curves in total (4 × 4 × 5). Then, 25 unlabeled curves in each state were extracted as the test set for prediction, giving 400 test curves in total (4 × 4 × 25). The specific interval range and data information are shown in Table 3.

Figure 8a–d shows the recognition effect of the LSTM interval recognition model on the four kinds of bench doors. To make the prediction effect easier to observe, the curve graph is converted into an interval graph in Figure 8, which carries the same information as the curve graph in Figure 2. In Figure 8, the ordinate gives the names of the four intervals, and the abscissa is the number of sampling points, so the interval ranges predicted by the model can be read directly. The figure shows that the predicted intervals were highly consistent with the test intervals, and the interval recognition of different door data curves was accurately realized.

To further evaluate the performance of the model, we compared it with other deep learning models, namely CNN, DBN, and SAE, using two measures of the difference between the actual and predicted segmentation points: the mean absolute error (MAE) and the root mean square error (RMSE), which are defined as:

$$MAE = \frac{1}{N} \sum_{i = 1}^{N} | y_{i}^{real} - y_{i}^{pre} | \qquad (4)$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (y_{i}^{real} - y_{i}^{pre})^{2}} \qquad (5)$$

where $y_{i}^{real}$ is the $i$-th actual segment point, $y_{i}^{pre}$ is the corresponding predicted segment point, and $N$ represents the total number of segment points.
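A short worked example of Equations (4) and (5) is given below. The actual segmentation points are loosely based on the Bench 1 ranges quoted in Section 2.2, and the predicted points are made up, so the resulting numbers are illustrative only and are not values from Table 4.

```python
import math

def mae(real, pred):
    """Mean absolute error between actual and predicted segmentation points."""
    return sum(abs(r - p) for r, p in zip(real, pred)) / len(real)

def rmse(real, pred):
    """Root mean square error between actual and predicted segmentation points."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, pred)) / len(real))

real = [32, 282, 355]   # hypothetical labeled segmentation points
pred = [30, 285, 356]   # hypothetical predicted segmentation points
print(mae(real, pred))   # (2 + 3 + 1) / 3 = 2.0
print(rmse(real, pred))  # sqrt((4 + 9 + 1) / 3), about 2.16
```

RMSE is never smaller than MAE and grows faster when a few segmentation points are badly misplaced, which is why reporting both is informative.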

Table 4 shows that the LSTM model performed best for interval recognition. It obtained the minimum MAE and RMSE on all four bench doors, with an average MAE of 2.77 and an average RMSE of 3.45. The other deep models recognized the intervals poorly. The average MAE and RMSE of the DBN model were 6.49 and 8.25, respectively, those of the SAE model were 7.73 and 9.45, respectively, and those of the CNN model were 10.42 and 12.43, respectively. The MAE and RMSE of these three models were both large and cannot meet the needs of actual projects. Figure 9 and Figure 10 show more intuitively that the MAE and RMSE of the four bench doors under the LSTM model were far lower than those of the other methods. The experimental results show that the LSTM model can effectively learn the long-term correlation hidden in the sequence data and is more suitable for interval recognition across different sliding plug doors.

4.3. Performance Analysis of the Health State Assessment Model

In this experiment, we selected four states for each of the four bench doors: the normal state, guidepost fault, screw fault, and roller fault. Each state had 100 samples, so each door had 400 samples in total. The data set was divided into a training set and a test set in a ratio of 7:3, giving 280 training samples and 120 test samples for each bench. The labels were set to normal state (label-1), guidepost fault (label-2), screw fault (label-3), and roller fault (label-4), and the specific data details are shown in Table 5.
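The 7:3 division can be sketched as follows. This is a minimal illustration that assumes the split is stratified by state (70 training and 30 test samples per label); the source states only the overall ratio and the resulting set sizes, and the function name `stratified_split` and the random seed are hypothetical.

```python
import random


def stratified_split(labels, train_ratio=0.7, seed=0):
    """Split sample indices so that each label keeps the same train/test ratio."""
    rng = random.Random(seed)
    by_label = {}
    for index, label in enumerate(labels):
        by_label.setdefault(label, []).append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label][:]
        rng.shuffle(indices)  # shuffle within the class before cutting
        cut = round(len(indices) * train_ratio)
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return train_idx, test_idx


# One bench door: 100 samples for each of the four labels (1 = normal,
# 2 = guidepost fault, 3 = screw fault, 4 = roller fault).
labels = [label for label in (1, 2, 3, 4) for _ in range(100)]
train_idx, test_idx = stratified_split(labels)

print(len(train_idx), len(test_idx))
```

Splitting within each label keeps the four states equally represented in both sets, which matters when accuracy is reported as a single figure per bench.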

Experiment 1 verified the effectiveness of random forest feature importance screening. We first used the LSTM interval intelligent recognition model to divide the rotational speed curves of the different bench doors into four intervals and extracted the time domain features of each interval. The features were numbered by the operating stage of the sliding plug door and then by the order of the features in Table 2. For example, feature 13 is the average value of the rotational speed signal in the high-speed stage. The importance of each feature was then calculated by the RF algorithm, and the features were arranged in descending order of importance. After that, the top 10, top 20, top 30, and top 40 features were each used as the feature set of the AdaBoost classifier for comparison. A total of 10 experiments were performed for each feature set, and the average accuracy of each feature set is shown in Table 6 and Figure 11.
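The feature numbering can be made concrete with a short sketch. It assumes that the four stages are numbered in the order of the columns of Table 3 (acceleration, high-speed, deceleration, amble), with ten features per stage in the order of Table 2; the source confirms this only for the first two stages (features 1–10 and 11–20), and the helper name `decode_feature` is hypothetical.

```python
# Stage order follows Table 3; the order of the last two stages is an assumption.
STAGES = ["acceleration", "high-speed", "deceleration", "amble"]

# Feature order follows Table 2.
FEATURES = [
    "peak value", "peak-to-peak value", "average value", "RMS value",
    "variance value", "skewness", "kurtosis", "peak factor",
    "waveform factor", "pulse factor",
]


def decode_feature(number):
    """Map a feature number in 1..40 to its (stage, feature name) pair."""
    if not 1 <= number <= len(STAGES) * len(FEATURES):
        raise ValueError("feature number out of range")
    stage_index, feature_index = divmod(number - 1, len(FEATURES))
    return STAGES[stage_index], FEATURES[feature_index]


print(decode_feature(13))  # ('high-speed', 'average value')
```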

Table 6 shows that the accuracy on the four bench doors was highest when the top 10 features were used as the feature set. The diagnostic accuracy on each bench door was more than 97%, and the average accuracy was 98.15%. As features of lower importance were added to the input, the diagnostic accuracy gradually declined: the average diagnostic accuracy over the four bench doors was 95.87% for the top 20 features, 92.32% for the top 30 features, and 87.90% for the top 40 features. Figure 11 shows that the overall diagnostic accuracy trended downward as the number of features increased. Therefore, removing insensitive features that are not conducive to classification, or that have no effect on its accuracy, can effectively improve the diagnostic accuracy of the subsequent classifier. Figure 12 shows the confusion matrices of the four kinds of benches under the proposed method.
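The screening step itself reduces to ranking the features by importance and keeping the first k columns. The sketch below shows only that selection logic; the importance scores and sample rows are hypothetical placeholders, since in the study the scores come from the RF algorithm and the rows from the extracted time domain features.

```python
def top_k_features(importances, k):
    """Return the 1-based numbers of the k features with the largest importance."""
    numbers = range(1, len(importances) + 1)
    ranked = sorted(numbers, key=lambda n: importances[n - 1], reverse=True)
    return ranked[:k]


def select_columns(samples, feature_numbers):
    """Keep only the selected feature columns of every sample row."""
    return [[row[n - 1] for n in feature_numbers] for row in samples]


# Hypothetical importance scores for six features (not values from this study).
importances = [0.05, 0.30, 0.10, 0.25, 0.20, 0.10]
selected = top_k_features(importances, k=3)

# Hypothetical feature matrix: two samples, six features each.
samples = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12]]
reduced = select_columns(samples, selected)

print(selected)  # [2, 4, 5]
print(reduced)   # [[2, 4, 5], [8, 10, 11]]
```

Repeating the selection with k = 10, 20, 30, and 40 and training the same classifier on each reduced matrix reproduces the structure of the comparison reported in Table 6.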

Figure 13a–d shows the ten most important features of each of the four bench doors after random forest screening. Most of these features are concentrated in the acceleration stage (features 1–10) and the high-speed stage (features 11–20) of the sliding plug door, which are also the stages where the three types of faults in Table 1 appear. The experimental results demonstrate the importance of dividing the rotational speed curve into intervals before extracting features.

Experiment 2 verified the effectiveness of the AdaBoost ensemble classifier in the proposed method. In this round of experiments, we again used the top 10 features of each bench door as the feature set and compared AdaBoost with SVM, KNN, and Bayes. The accuracies under the different methods are shown in Table 7 and Table 8. Table 7 lists the accuracies of the classifiers with LSTM interval recognition, mainly to show the advantage of the AdaBoost ensemble classifier. Table 8 lists the accuracies of the classifiers without LSTM interval recognition, mainly to show the effectiveness of LSTM interval recognition. Figure 14 summarizes the diagnostic accuracies under the different methods more intuitively.

Table 7 shows that the proposed LSTM + AdaBoost method obtained the highest accuracy. Its average accuracy over the four kinds of benches reached 98.15%, whereas the single classifiers were less accurate: the average accuracy was 94.46% for LSTM + SVM, 92.32% for LSTM + KNN, and 87.90% for LSTM + Bayes. This suggests that the AdaBoost ensemble classifier can effectively deal with multiple features and thus overcomes the low diagnostic accuracy of a single classifier.

A comparison of Table 7 and Table 8 shows that the accuracy of every classifier decreased when features were extracted without LSTM interval recognition. The average accuracy of AdaBoost on the four benches was 92.28%, a decrease of about 6 percentage points. The average accuracy of SVM was 84.35%, a decrease of about 10 percentage points. The average accuracy of KNN was 81.61%, a decrease of about 11 percentage points. The average accuracy of Bayes was 79.61%, a decrease of about 8 percentage points. Figure 14 also shows that the accuracy of these methods trended downward on all four benches, which supports the effectiveness of LSTM interval recognition.
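The size of each decrease can be checked directly from the "Average" rows of Table 7 and Table 8. The following sketch only repeats that subtraction; the dictionary names are chosen for illustration.

```python
# Average accuracies (%) with LSTM interval recognition, from Table 7.
with_lstm = {"AdaBoost": 98.15, "SVM": 94.46, "KNN": 92.32, "Bayes": 87.90}

# Average accuracies (%) without LSTM interval recognition, from Table 8.
without_lstm = {"AdaBoost": 92.28, "SVM": 84.35, "KNN": 81.61, "Bayes": 79.61}

# Decrease in percentage points for each classifier.
drops = {
    name: round(with_lstm[name] - without_lstm[name], 2)
    for name in with_lstm
}

for name, drop in drops.items():
    print(f"{name}: {drop:.2f} percentage points")
```

The differences are 5.87, 10.11, 10.71, and 8.29 percentage points for AdaBoost, SVM, KNN, and Bayes, respectively, so the ensemble classifier loses the least when interval recognition is removed.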

5. Conclusions

A subway sliding plug door system health state adaptive assessment method is proposed based on interval intelligent recognition of the rotational speed operation data curve. In the method, the LSTM model was first used to solve the difficult problem of interval recognition of complex rotational speed curves. Secondly, the RF algorithm was used to screen out the sensitive feature set. Finally, the AdaBoost ensemble classifier was used to identify the multiple faults of the sliding plug door. The effectiveness of the proposed method was verified on the benchmark experiment test data set. The following conclusions can be drawn: (1) The LSTM model can accurately recognize the operating intervals of the rotational speed curves of different bench doors, which facilitates the subsequent extraction of sensitive features. (2) The average accuracy of the method on multiple benches is 98.15%, and it obtains more sensitive features and more robust fault diagnosis performance than the other methods compared. In future work, the proposed method will be applied to the state assessment of more components of subway sliding plug doors to improve its applicability and operability.

Author Contributions

Conceptualization, G.C.; Data curation, H.Q. and Y.Y.; Formal analysis, X.W.; Funding acquisition, G.C.; Methodology, H.M.; Project administration, X.W.; Resources, G.C.; Software, H.Q.; Validation, H.Q.; Visualization, H.Q. and Y.Y.; Writing—original draft, H.Q.; Writing—review & editing, H.Q. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China grant No. [51905399] and National Natural Science Foundation of China grant No. [62271390].

Data Availability Statement

The data generated and/or analyzed during the current study are not publicly available for legal/ethical reasons but are available from the corresponding author on reasonable request.

Acknowledgments

This research was supported by the Project of National Natural Science Foundation of China (grant No. 51905399, 62271390). Comments and suggestions from the editor and reviewers are very much appreciated.

Conflicts of Interest

The authors declare no conflict of interest.


Figure 1. Normal rotational speed-time curve.

Figure 2. Comparison of rotational speed curves between different benches.

Figure 3. The structure of the LSTM model based on interval recognition.

Figure 4. Framework of AdaBoost ensemble model.

Figure 5. The framework of the proposed method.

Figure 6. Physical drawing of sliding door transmission system.

Figure 7. Structural drawing of sliding door transmission system.

Figure 8. The results of one prediction for different benches.

Figure 9. MAE comparison under different methods.

Figure 10. RMSE comparison under different methods.

Figure 11. Diagnostic results under different feature datasets.

Figure 12. Confusion matrix of 4 kinds of benches under the proposed method.

Figure 13. Random Forest feature importance ranking.

Figure 14. Diagnosis accuracy of all methods.

Table 1. Fault types and corresponding attributes.

Fault Types	Corresponding Curve Characteristics	Abnormal Stages
Guidepost fault	Overall upward trend	Acceleration
Screw fault	Large range abnormal fluctuation	High-speed
Roller fault	Overall upward trend	Acceleration

Table 2. Time domain feature name and formula.

Name	Formula	Name	Formula
1. Peak value	$U_{1} = \max (x_{i})$	6. Skewness	$U_{6} = \frac{1}{n} \sum_{i = 1}^{n} {(x_{i} - \frac{1}{n} \sum_{i = 1}^{n} x_{i})}^{3} / U_{4}$
2. Peak-to-peak value	$U_{2} = \max x_{i} - \min x_{i}$	7. Kurtosis	$U_{7} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}^{4} / U_{4}$
3. Average value	$U_{3} = \frac{1}{n} \sum_{i = 1}^{n} x_{i}$	8. Peak factor	$U_{8} = U_{1} / U_{4}$
4. RMS value	$U_{4} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} x_{i}^{2}}$	9. Waveform factor	$U_{9} = U_{4} / U_{3}$
5. Variance value	$U_{5} = \frac{1}{n} \sum_{i = 1}^{n} {(x_{i} - \frac{1}{n} \sum_{i = 1}^{n} x_{i})}^{2}$	10. Pulse factor	$U_{10} = U_{1} / U_{3}$
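The features in Table 2 can be computed for one interval of the rotational speed curve as in the sketch below. It covers $U_{1}$ to $U_{5}$ and $U_{8}$ to $U_{10}$, whose formulas are unambiguous; skewness and kurtosis are left out because their normalization in the table is not clear enough to implement with confidence. The function name and the sample segment are hypothetical.

```python
import math


def time_domain_features(x):
    """Compute U1-U5 and U8-U10 of Table 2 for one interval of the curve."""
    n = len(x)
    if n == 0:
        raise ValueError("segment must not be empty")

    peak = max(x)                                      # U1
    peak_to_peak = max(x) - min(x)                     # U2
    mean = sum(x) / n                                  # U3
    rms = math.sqrt(sum(v * v for v in x) / n)         # U4
    variance = sum((v - mean) ** 2 for v in x) / n     # U5

    # The ratios assume a non-zero RMS and mean, which holds for a
    # segment recorded while the motor is rotating in one direction.
    return {
        "peak": peak,
        "peak_to_peak": peak_to_peak,
        "mean": mean,
        "rms": rms,
        "variance": variance,
        "peak_factor": peak / rms,        # U8 = U1 / U4
        "waveform_factor": rms / mean,    # U9 = U4 / U3
        "pulse_factor": peak / mean,      # U10 = U1 / U3
    }


# Hypothetical segment of rotational speed samples.
segment = [1.0, 2.0, 3.0, 4.0]
features = time_domain_features(segment)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```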

Table 3. Specific interval range and data information of four types of benches.

Bench Number	Acceleration	High-Speed	Deceleration	Amble	Training Set	Testing Set
Bench 1	(0,32)	(32,282)	(282,355)	(355,400)	4 × 5	4 × 25
Bench 2	(0,63)	(63,229)	(229,295)	(295,400)	4 × 5	4 × 25
Bench 3	(0,62)	(62,161)	(161,257)	(257,400)	4 × 5	4 × 25
Bench 4	(0,51)	(51,240)	(240,292)	(292,400)	4 × 5	4 × 25
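Given the interval ranges in Table 3, a 400-point rotational speed curve can be cut into its four stages as sketched below. The sketch assumes that each range is half-open, so that a boundary point belongs to the later stage; the source does not say how boundary points are assigned. The curve used here is a placeholder, not measured data.

```python
STAGES = ["acceleration", "high-speed", "deceleration", "amble"]

# Interval boundaries in sampling points, taken from Table 3.
BENCH_BOUNDARIES = {
    "Bench 1": [0, 32, 282, 355, 400],
    "Bench 2": [0, 63, 229, 295, 400],
    "Bench 3": [0, 62, 161, 257, 400],
    "Bench 4": [0, 51, 240, 292, 400],
}


def split_curve(curve, boundaries):
    """Cut a curve into named stages using half-open ranges [start, end)."""
    return {
        stage: curve[start:end]
        for stage, start, end in zip(STAGES, boundaries, boundaries[1:])
    }


# Placeholder curve with 400 sampling points.
curve = list(range(400))
segments = split_curve(curve, BENCH_BOUNDARIES["Bench 1"])

for stage, values in segments.items():
    print(stage, len(values))
```

For Bench 1 this gives stages of 32, 250, 73, and 45 sampling points. The same function applied to the other benches yields different lengths, which is why a fixed segmentation cannot be shared across doors.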

Table 4. MAE and RMSE results under different methods.

Method	Metric	Bench 1	Bench 2	Bench 3	Bench 4	Average
LSTM	MAE	2.33 ± 0.33	3.67 ± 0.67	2.46 ± 0.33	2.63 ± 0.33	2.77 ± 0.42
LSTM	RMSE	2.88 ± 0.42	4.51 ± 0.75	3.11 ± 0.45	3.36 ± 0.53	3.45 ± 0.53
DBN	MAE	5.67 ± 0.33	7.67 ± 0.67	6.67 ± 0.67	5.94 ± 0.33	6.49 ± 0.50
DBN	RMSE	7.05 ± 0.53	9.53 ± 0.73	8.17 ± 0.71	8.23 ± 0.54	8.25 ± 0.63
SAE	MAE	7.26 ± 0.33	8.33 ± 0.67	7.49 ± 0.33	7.85 ± 0.33	7.73 ± 0.42
SAE	RMSE	8.94 ± 0.62	10.23 ± 0.83	9.12 ± 0.64	9.52 ± 0.71	9.45 ± 0.71
CNN	MAE	9.33 ± 0.33	11.33 ± 0.67	10.33 ± 0.67	10.67 ± 0.67	10.42 ± 0.56
CNN	RMSE	11.40 ± 0.77	13.55 ± 1.38	12.17 ± 0.93	12.58 ± 1.40	12.43 ± 1.12

Table 5. Details of the four bench datasets.

Bench Number	Fault Categories	Labels	Number of Samples	Training Set	Testing Set
Bench 1	Normal	1	4 × 100	280	120
	Guidepost fault	2
	Screw fault	3
	Roller fault	4
Bench 2	Normal	1	4 × 100	280	120
	Guidepost fault	2
	Screw fault	3
	Roller fault	4
Bench 3	Normal	1	4 × 100	280	120
	Guidepost fault	2
	Screw fault	3
	Roller fault	4
Bench 4	Normal	1	4 × 100	280	120
	Guidepost fault	2
	Screw fault	3
	Roller fault	4

Table 6. Accuracy of different feature sets.

Bench Number	Top 10 Features	Top 20 Features	Top 30 Features	Top 40 Features
1	98.36% (118/120)	96.70% (116/120)	93.44% (112/120)	90.98% (109/120)
2	97.54% (117/120)	95.83% (115/120)	91.80% (110/120)	87.70% (105/120)
3	98.36% (118/120)	96.70% (116/120)	91.80% (110/120)	85.23% (102/120)
4	98.36% (118/120)	94.26% (113/120)	92.26% (111/120)	90.98% (109/120)
Average	98.15% (118/120)	95.87% (115/120)	92.32% (111/120)	87.90% (105/120)

Table 7. Accuracies of different classifiers under LSTM interval recognition.

Bench Number	Proposed Method	LSTM + SVM	LSTM + KNN	LSTM + Bayes
1	98.36% (118/120)	95.90% (115/120)	93.44% (112/120)	90.98% (109/120)
2	97.54% (117/120)	93.44% (112/120)	91.80% (110/120)	87.70% (105/120)
3	98.36% (118/120)	94.26% (113/120)	91.80% (110/120)	85.25% (102/120)
4	98.36% (118/120)	94.26% (113/120)	92.26% (111/120)	87.70% (105/120)
Average	98.15% (118/120)	94.46% (113/120)	92.32% (111/120)	87.90% (105/120)

Table 8. Accuracies of different classifiers without LSTM interval recognition.

Bench Number	AdaBoost	SVM	KNN	Bayes
1	92.54% (111/120)	87.70% (105/120)	82.23% (98/120)	80.70% (96/120)
2	91.80% (110/120)	85.23% (102/120)	82.23% (98/120)	78.52% (94/120)
3	92.54% (111/120)	82.23% (98/120)	81.26% (97/120)	78.52% (94/120)
4	92.26% (111/120)	82.23% (98/120)	80.70% (96/120)	80.70% (96/120)
Average	92.28% (111/120)	84.35% (101/120)	81.61% (97/120)	79.61% (95/120)

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Qi, H.; Chen, G.; Ma, H.; Wang, X.; Yang, Y. A Subway Sliding Plug Door System Health State Adaptive Assessment Method Based on Interval Intelligent Recognition of Rotational Speed Operation Data Curve. Machines 2022, 10, 1075. https://doi.org/10.3390/machines10111075
