Article

A Novel Fault Diagnosis Method for TE Process Based on Optimal Extreme Learning Machine

1 Qianhu College, Nanchang University, Nanchang 330100, China
2 Information Engineering College, Nanchang University, Nanchang 330100, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(7), 3388; https://doi.org/10.3390/app12073388
Submission received: 1 March 2022 / Revised: 23 March 2022 / Accepted: 24 March 2022 / Published: 26 March 2022

Abstract:
Chemical processes usually exhibit complex, high-dimensional and non-Gaussian characteristics, which makes the diagnosis of faults in chemical processes particularly important. To address this problem, this paper proposes a novel fault diagnosis method based on the Bernoulli shift coyote optimization algorithm (BCOA) to optimize the kernel extreme learning machine (KELM) classifier. Firstly, the random forest treebagger (RFtb) is used to select features and optimize the data set. Secondly, a new optimization algorithm, BCOA, is proposed to automatically adjust the network hyperparameters of KELM and improve the classifier performance. Finally, the optimized feature sequence is input into the proposed classifier to obtain the final diagnosis results. Data from the Tennessee Eastman (TE) chemical process are used to verify the effectiveness of the proposed method, and a comprehensive comparison and analysis with widely used algorithms is performed. The results demonstrate that the proposed method outperforms the other methods in terms of classification accuracy, with an average diagnosis rate of 89.32% over 21 faults.

1. Introduction

The chemical industry’s equipment has grown increasingly automated, complicated, scaled-up, and intelligent as the industry has progressed [1,2,3,4]. Furthermore, the industry’s intermediate products are frequently toxic and corrosive, so seemingly minor errors can have notable safety consequences; this makes safety management challenging but increasingly critical, and no mistake should be overlooked [5,6,7,8,9,10]. As a result, several novel ways to minimize maintenance costs and enhance the utilization of the production process have been investigated in recent years, one of which is timely fault detection and diagnosis.
There are three types of fault detection and diagnosis methods for chemical processes: quantitative, qualitative, and historical process data-based methods [11,12,13]. The third type requires only historical data; it does not require complicated and precise mathematical modeling of the process based on considerable a priori knowledge and time-consuming, difficult mathematical derivations, and a model can be built by directly processing historical data [14]. The amount of data available to researchers has expanded considerably with the arrival of the information and big data era, making it challenging for quantitative and qualitative methodologies to meet the requirements for diagnostic accuracy and time. As a result, data-based fault diagnosis methods are frequently used in the chemical industry.
Data-based diagnostic methods can be roughly categorized into statistics-based methods and machine-learning-based methods [15]. Initially, this class of methods was mostly based on statistical learning, and a number of scholars have suggested a variety of statistical-learning-based methods for fault diagnosis of chemical processes. Principal component analysis [16], partial least squares [17], independent component analysis [18], and Fisher’s discriminant analysis [19] are the most common statistics-based methodologies. All of these methods cope with the highly non-linear correlations among chemical variables and the complexity of the process by reducing the dimensionality of the data: they describe the main features of the original data while reducing noise and lowering the computational burden of high-dimensional data. However, as the number of data dimensions grows, the complexity of these statistics-based methodologies increases exponentially, resulting in the curse of dimensionality [20,21].
Support vector machines (SVM) [22], artificial neural networks [23], and Bayesian networks are examples of machine-learning-based diagnostic approaches. SVM has been used to diagnose faults in gearboxes [24], rotating equipment [25], and high-pressure water supply heaters in thermal power units [26], among other applications. In addition, least squares SVM has been applied to reference evapotranspiration prediction [27] and the assessment of water quality parameters [28]. For complex chemical process fault identification, Yang and Gu suggested an upgraded naive Bayes classifier [29]. However, accurate classification of failure modes under complex chemical conditions is still not easy, especially when the test data come from different operating conditions and different domain distributions. Therefore, classifiers with better generalization ability have been developed to cope with the degradation of diagnostic performance due to fluctuations in working conditions.
Extreme learning machines (ELM) and their derivatives have been employed in a variety of pattern recognition applications in recent years [30,31,32], and have shown accuracy and generalization performance comparable to back propagation neural networks (BPNN) and SVM. In a recent study [34], ELM and kernel ELM (KELM) [33] were used to classify objects with a domain bias: based on the deep features extracted by a dynamic convolutional neural network (DCNN) model, they served as the top classifiers. According to comparative tests, KELM beat ELM and SVM in all cross-domain recognition tasks. However, KELM still has some drawbacks, such as the time-consuming and laborious manual design of its parameters, its heavy reliance on diagnostic experience, the limited ability of the shallow model to represent features, and its limited applicability under complex operating conditions. To further improve the situation, we propose a new optimized KELM and apply it to counter the deterioration of diagnostic performance under complex conditions.
Although research on machine learning methods for chemical process fault diagnosis has progressed, some problems remain to be solved. In chemical processes, some factors have negligible effects on the results, and too many variables consume computational resources. After obtaining process-monitoring data, the validity of the classification model must be determined. Long-term studies have shown that many features are not relevant to the classification goal. John et al. classified data features into three categories: strongly correlated, weakly correlated, and irrelevant features [35]. Feature selection involves finding a subset of features that is optimal with respect to a specified evaluation criterion, and it is applied in many fault diagnosis methods. Malhi and Gao proposed a PCA-based feature selection scheme [36]. Tian et al. used Spearman’s rank correlation coefficient to select variables from high-dimensional data, eliminating noise and redundant variables and thus reducing the dimensionality of chemical process data [37].
A random forest treebagger (RFtb) has recently been employed for feature selection in diagnosis across a variety of engineering fields [38,39]. RFtb is one of the most accurate machine learning techniques and performs well even when there are many input features and a limited number of samples [40,41]. Additionally, RFtb has been used to diagnose faults in gearboxes [42] and bearings [43]. Therefore, RFtb is also highly likely to achieve good feature extraction results in chemical process diagnosis.
Inspired by previous works, we propose a new fault diagnosis method for chemical processes using Tennessee Eastman data. The method comprises three successive procedures: feature extraction using RFtb, an improved optimization algorithm (BCOA), and fault detection with BCOA-KELM based on the fused deep features. The methodological flow is as follows:
(1)
Facing a time-varying, strongly coupled and nonlinear chemical dataset, we used RFtb to remove redundant information while retaining most of the intrinsic and discriminative information, preventing features of different classes from overlapping in some regions of the feature space.
(2)
We introduced a chaotic mechanism, the Bernoulli shift, into COA and proposed a new algorithm, the Bernoulli shift coyote optimization algorithm (BCOA). The algorithm can perform more accurate local exploitation in late iterations, speed up convergence, and better maintain the population diversity of coyotes during individual updates. We combined BCOA with the KELM classifier to alleviate KELM’s tendency to fall into local extremes during iteration.
(3)
BCOA-KELM was proposed and used as the top classifier for fault diagnosis based on fused multidomain features, taking advantage of both ensemble learning and multikernel learning. Our technique has better diagnostic ability and a faster diagnostic speed due to its high generalization performance.
The remainder of this paper is organized as follows. Section 2 presents the algorithm structure, the main components of the algorithm and the experimental methods. Section 3 compares the proposed method with other methods in terms of multidimensional indicators to demonstrate its superiority. Section 4 provides a summary of this study and directions for future work.

2. Materials and Methods

2.1. The Proposed BCOA-KELM Model

Figure 1 shows the variation of the average diagnostic accuracy with the number of training epochs for BCOA and COA, for both the training and test datasets. Both curves converge steadily. The average diagnostic accuracy of BCOA and COA on the training dataset is higher than on the test dataset, which indicates a reasonable selection of data. At the same time, the graph clearly shows that BCOA reaches its accuracy maximum at 10 epochs, while COA reaches its accuracy maximum only after 20 epochs, indicating that BCOA achieves a major improvement in convergence speed.
To increase the model’s efficiency and accuracy, individual importance values are obtained using the random forest treebagger [44,45]. Features are then chosen based on these results, with redundant attributes removed to limit the number of network element nodes. Due to the network structure of KELM, the settings of the regularization coefficient c and the kernel function parameter S have an impact on KELM’s classification performance. The BCOA algorithm proposed here is an improvement on an emerging intelligent bionic optimization algorithm. Compared with other metaheuristics, BCOA has a unique algorithmic structure that provides a new mechanism for balancing exploration and exploitation during optimization, and it can maintain high population diversity while improving convergence efficiency. Therefore, the BCOA algorithm can be used to find the most suitable c and S to improve the performance of the network.
The process of the model is presented in Figure 2 and described in the steps below (a minimal code sketch follows the list):
Step 1: Input of simulation data from the TE process into RFtb for training and prediction.
Step 2: RFtb’s feature importance values are ranked.
Step 3: Based on the ranking results, select features and retrieve the dataset for the input network.
Step 4: Initialize the kernel based extreme learning machine with a random regularization factor c and kernel function parameter S.
Step 5: Initialize the number of coyotes (NumCoy), the number of packs (NumPac), the maximum number of coyotes (MaxNumCoy), and the fitness function (FitFunc).
Step 6: Calculate the alpha coyote (Equation (5)) and the cultural trend cult of the pack (Equation (6)), and then calculate the effect of the alpha coyote and the pack on each individual (Equation (9)).
Step 7: Based on the fitness function, update the current coyote (Equation (10)), compare the adaptive capacity of the coyote before and after the update, and retain the better coyote (Equation (11)).
Step 8: If the iteration loop condition is satisfied, proceed to the next step; otherwise, return to Step 7.
Step 9: Find the worst-adapted 1/N_c of the current coyotes and, if the threshold is met, perform a chaotic operation on them to generate replacement coyotes; otherwise, proceed to the next step.
Step 10: Proceed to the next step if the maximum iteration condition or preset condition is met; otherwise, return to Step 6.
Step 11: The optimal KELM diagnostic model is obtained by substituting the optimized regularization coefficient c and kernel function parameter S into the KELM for training.
Step 12: The test samples are fed into the trained network to obtain the predicted output.
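For illustration, the twelve steps can be condensed into a short Python sketch. This is a minimal outline, not the authors’ implementation: the helpers select_top_features, bcoa_optimize and KELM are hypothetical stand-ins sketched in the following subsections, and the search bounds [0, 10] follow Table 6.

```python
import numpy as np

def bcoa_kelm_pipeline(X_train, y_train, X_test, n_features=5):
    """Illustrative sketch of Steps 1-12; all helpers are hypothetical."""
    # Steps 1-3: rank features by importance and keep the top five (Section 2.4.1)
    selected = select_top_features(X_train, y_train, n_keep=n_features)
    X_tr, X_te = X_train[:, selected], X_test[:, selected]

    # Steps 4-10: BCOA searches for the regularization factor c and the
    # kernel parameter S that minimize the training error of a candidate KELM
    def fitness(params):
        c, S = params
        model = KELM(c=c, S=S).fit(X_tr, y_train)
        return np.mean(model.predict(X_tr) != y_train)

    c_opt, S_opt = bcoa_optimize(fitness, lb=np.zeros(2), ub=10 * np.ones(2))

    # Steps 11-12: retrain KELM with the optimized (c, S) and predict
    model = KELM(c=c_opt, S=S_opt).fit(X_tr, y_train)
    return model.predict(X_te)
```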

2.2. Coyote Optimization Algorithm

The coyote optimization algorithm is an intelligent bionic optimization algorithm proposed by Pierezan et al. [46]. Compared to other meta-heuristics, the COA (coyote optimization algorithm) has a unique algorithmic structure that provides a new mechanism for balancing exploration and exploitation in the optimization process [47,48,49].

2.2.1. Algorithm Flow

The following steps describe how COA replicates the birth, growth, death, and movement of coyote packs; a compact sketch of the core loop follows the steps.
Step 1: Set the number of coyote packs $N_p$, the number of coyotes per pack $N_c$, the problem dimension $D$, the termination condition $nfeval_{max}$, and other parameters.
Step 2: Initialize the coyote packs at random; the $c$th individual inside the $p$th pack at time $t$ is defined as

$$SoC_{c,j}^{p,t} = lb_j + r_j\,(ub_j - lb_j) \tag{1}$$

$$SoC_c^{p,t} = \left( SoC_{c,1}^{p,t},\, SoC_{c,2}^{p,t},\, \ldots,\, SoC_{c,D}^{p,t} \right) \tag{2}$$

where $ub_j$ and $lb_j$ are the $j$th dimension’s upper and lower bounds, respectively, and $r_j$ is a randomly generated real number in the range [0, 1].
Step 3: Assess the fitness of each coyote:

$$Fit_c^{p,t} = FitFunc\left( SoC_c^{p,t} \right) \tag{3}$$
Step 4: Coyotes may split away from their original pack or be banished, resulting in a splinter group. The probability of a coyote leaving its pack is defined as

$$P_e = 0.005\,N_c^2, \qquad N_c \le 14 \tag{4}$$
Step 5: Find the alpha coyote of the current pack, $alpha^{p,t}$, and calculate the current cultural trend of the pack, $cult^{p,t}$:

$$alpha^{p,t} = \left\{ SoC_c^{p,t} \;\middle|\; \arg\min_{c = 1, 2, \ldots, N_c} FitFunc\left( SoC_c^{p,t} \right) \right\} \tag{5}$$

$$cult_j^{p,t} = \begin{cases} O_{\frac{N_c+1}{2},\,j}^{p,t}, & N_c \text{ is odd} \\[6pt] \dfrac{O_{\frac{N_c}{2},\,j}^{p,t} + O_{\frac{N_c}{2}+1,\,j}^{p,t}}{2}, & \text{otherwise} \end{cases} \tag{6}$$

where $O^{p,t}$ denotes the coyotes of pack $p$ at time $t$ ranked by the $j$th dimensional variable; that is, $cult_j^{p,t}$ is the median of the $j$th dimension over the pack.
Step 6: Birth and death events are modeled genetically. The birth of a pup ($pup^{p,t}$) is written as a mixture of the social conditions of two randomly picked parents plus environmental factors, and the age of a coyote (in years) is written as $age_c^{p,t}$:

$$pup_j^{p,t} = \begin{cases} SoC_{m_1,\,j}^{p,t}, & rand_j < P_s \text{ or } j = j_1 \\ SoC_{m_2,\,j}^{p,t}, & rand_j \ge P_s + P_a \text{ or } j = j_2 \\ R_j, & \text{otherwise} \end{cases} \tag{7}$$

where $m_1$, $m_2$ are random coyotes from within the $p$th pack, $j_1$, $j_2$ are two random dimensions of the problem, and $R_j$, $rand_j$ are random numbers within [0, 1] generated with uniform probability. The discrete probability ($P_s$) and the association probability ($P_a$) affect the cultural diversity of individuals in a coyote pack and are defined as

$$P_s = \frac{1}{D}, \qquad P_a = \frac{1 - P_s}{2} \tag{8}$$
Let $\omega$ denote the set of coyotes in the pack that are less well adapted than the pup, and let $\varphi$ be the number of coyotes in this set. If $\varphi = 1$, i.e., exactly one coyote is less fit than the pup, the pup survives and that coyote dies. If $\varphi > 1$, the pup survives and the oldest coyote in $\omega$ dies. In all other cases ($\varphi = 0$), the pup dies.
Step 7: Calculate the influence of the alpha coyote and of the pack’s cultural trend on the renewal of individuals within the pack at the current moment, $\delta_1$ and $\delta_2$:

$$\delta_1 = alpha^{p,t} - SoC_{cr_1}^{p,t}, \qquad \delta_2 = cult^{p,t} - SoC_{cr_2}^{p,t} \tag{9}$$

where $cr_1$ and $cr_2$ are random coyotes of the current pack.
Step 8: Update all coyotes in the pack in turn. A new coyote $newSoC_c^{p,t}$ is obtained, its fitness is compared with that of the original coyote, and the better coyote $SoC_c^{p,t+1}$ is retained:

$$newSoC_c^{p,t} = SoC_c^{p,t} + r_1\,\delta_1 + r_2\,\delta_2 \tag{10}$$

$$SoC_c^{p,t+1} = \begin{cases} newSoC_c^{p,t}, & FitFunc\left( SoC_c^{p,t} \right) > FitFunc\left( newSoC_c^{p,t} \right) \\ SoC_c^{p,t}, & \text{otherwise} \end{cases} \tag{11}$$

where $r_1$ and $r_2$ are real numbers in the range [0, 1] generated with uniform probability, representing the weights of the alpha coyote’s and the pack’s cultural influence on the individual.
Step 9: Simulate the growth process of individuals over time, and update the age of coyotes.
Step 10: Judge the termination condition; if it is reached, output the social condition of the coyote with the best adaptation ability; otherwise, return to Step 3.
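To make the steps above concrete, the following Python sketch implements the core COA loop (Equations (1), (3), (5), (6) and (9)–(11)). It deliberately omits the pack-exchange, birth/death and aging steps (Steps 4, 6 and 9) for brevity and assumes a minimization problem; it is an illustration, not the reference implementation of [46].

```python
import numpy as np

def coa(fitfunc, dim, lb, ub, n_packs=10, n_coy=5, max_iter=100):
    """Minimal COA core loop (pack exchange and births/deaths omitted)."""
    coyotes = lb + np.random.rand(n_packs, n_coy, dim) * (ub - lb)    # Eq. (1)
    fit = np.array([[fitfunc(c) for c in pack] for pack in coyotes])  # Eq. (3)
    for t in range(max_iter):
        for p in range(n_packs):
            alpha = coyotes[p, np.argmin(fit[p])].copy()              # Eq. (5)
            cult = np.median(coyotes[p], axis=0)                      # Eq. (6)
            for c in range(n_coy):
                cr1, cr2 = np.random.randint(n_coy, size=2)
                delta1 = alpha - coyotes[p, cr1]                      # Eq. (9)
                delta2 = cult - coyotes[p, cr2]
                r1, r2 = np.random.rand(2)
                new = coyotes[p, c] + r1 * delta1 + r2 * delta2       # Eq. (10)
                new = np.clip(new, lb, ub)
                new_fit = fitfunc(new)
                if new_fit < fit[p, c]:                               # Eq. (11)
                    coyotes[p, c], fit[p, c] = new, new_fit
    p, c = np.unravel_index(np.argmin(fit), fit.shape)
    return coyotes[p, c]
```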

2.2.2. Bernoulli Shift Coyote Optimization Algorithm

This paper improves the COA algorithm by means of chaotic sequences: Bernoulli shift chaotic interference is added, shifting some individuals in the population to generate new ones. This increases the population’s diversity, improves the algorithm’s convergence speed, and helps the algorithm avoid falling into local optima. The resulting Bernoulli shift coyote optimization algorithm (BCOA) is described below, with its specific implementation detailed in stages.
In the COA algorithm, the birth of a coyote is the single event that affects the population’s diversity, and individual coyote genes may be inherited from one parent or created at random. As a result, the search for optimal solutions is limited and population diversity is not guaranteed. In BCOA, before the population update operation, a Bernoulli shift chaotic map is established, and the $1/N_c$ individuals with the lowest fitness at the time are fed as initial values into the Bernoulli shift to produce new individuals that replace them. At the same time, taking into account the balance between the global and local performance of the algorithm, the execution probability $r$ of the chaotic interference mechanism is defined as a linearly decreasing function, following the standard weight-lowering approach. According to the literature [50,51], the Bernoulli shift has improved traversal uniformity and optimization search efficiency; its formulation is
$$x_{n+1} = (2 x_n) \bmod 1 \tag{12}$$
Set the threshold $R$ and calculate the chaotic perturbation execution probability $r$:

$$r = r_{max} - (r_{max} - r_{min})\,\frac{k}{T_{max}} \tag{13}$$

where $k$ is the current iteration and $T_{max}$ is the maximum number of iterations. The detection mechanism finds the $1/N_c$ most poorly adapted individuals; these are substituted into the Bernoulli shift as initial values, and an equal number of new individuals are generated to replace the original ones.
Since the range of $\overline{SoC}_{c,j}^{t}$ in the mapping is [0, 1], which differs from the range of the individual $SoC_{c,j}^{p,t}$ in the COA algorithm, a variable conversion is needed:

$$\overline{SoC}_{c,j}^{t} = \frac{SoC_{c,j}^{p,t} - lb_j}{ub_j - lb_j}, \qquad j = 1, 2, \ldots, D \tag{14}$$

where $ub_j$ and $lb_j$ are the upper and lower bounds of the $j$th dimensional variable, $SoC_{c,j}^{p,t}$ is the $j$th dimensional variable of the individual at time $t$, and $\overline{SoC}_{c,j}^{t}$ is the corresponding normalized variable after the transformation.
Using the Bernoulli shift expression, Equation (14) is iterated into a chaotic sequence of variables ($i = 1, 2, \ldots, N_c$; $m = 1, 2, \ldots, C_{max}$), where $C_{max}$ is the maximum number of iterations of the chaotic search.
The chaotic value $\bar{x}_{i,j}^{(m)}$ is then mapped back to the original solution space, generating a new individual $newSoC_c$:

$$newSoC_{c,j}^{p,t} = SoC_{c,j}^{p,t} + \frac{lb_j - ub_j}{2}\left( 2\bar{x}_{i,j}^{(m)} - 1 \right) \tag{15}$$
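A minimal sketch of this chaotic replacement step is given below, operating on one pack of coyotes. The replaced share (one coyote for the $N_c = 5$ of Table 6), the $r_{max}$/$r_{min}$ values and the minimization convention are our assumptions; the paper does not give this step as code.

```python
import numpy as np

def bernoulli_chaotic_replace(coyotes, fit, fitfunc, lb, ub, k, t_max,
                              r_max=0.9, r_min=0.1, c_max=10):
    """Replace the worst-adapted coyotes of one pack with chaotic
    individuals (Eqs. (12)-(15)); a minimization problem is assumed."""
    n_coy = len(coyotes)
    r = r_max - (r_max - r_min) * k / t_max                  # Eq. (13)
    n_worst = max(1, n_coy // 5)                             # ~1/Nc of the pack
    for i in np.argsort(fit)[-n_worst:]:                     # lowest fitness
        if np.random.rand() >= r:                            # perturbation gate
            continue
        x_bar = (coyotes[i] - lb) / (ub - lb)                # Eq. (14)
        for _ in range(c_max):                               # chaotic search
            x_bar = (2.0 * x_bar) % 1.0                      # Eq. (12)
        new = coyotes[i] + (lb - ub) / 2.0 * (2.0 * x_bar - 1.0)  # Eq. (15)
        new = np.clip(new, lb, ub)
        new_fit = fitfunc(new)
        if new_fit < fit[i]:                                 # keep the better one
            coyotes[i], fit[i] = new, new_fit
    return coyotes, fit
```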

2.3. The Kernel Based Extreme Learning Machine

2.3.1. Extreme Learning Machine Overview

A typical single-hidden-layer feedforward neural network structure is shown in Figure 3 [52]. It consists of an input layer, a hidden layer and an output layer, with the input layer fully connected to the hidden layer and the hidden layer fully connected to the output layer. The input layer has n neurons corresponding to n input variables, the hidden layer has l neurons, and the output layer has m neurons corresponding to m output variables. Without loss of generality, let the connection weights $\omega$ between the input and hidden layers be:
$$\omega = \begin{bmatrix} \omega_{11} & \omega_{12} & \cdots & \omega_{1n} \\ \omega_{21} & \omega_{22} & \cdots & \omega_{2n} \\ \vdots & \vdots & & \vdots \\ \omega_{l1} & \omega_{l2} & \cdots & \omega_{ln} \end{bmatrix} \tag{16}$$

where $\omega_{ji}$ denotes the connection weight between the $i$th neuron in the input layer and the $j$th neuron in the hidden layer.
Let the connection weight between the hidden layer and the output layer be $\beta$:

$$\beta = \begin{bmatrix} \beta_{11} & \beta_{12} & \cdots & \beta_{1m} \\ \beta_{21} & \beta_{22} & \cdots & \beta_{2m} \\ \vdots & \vdots & & \vdots \\ \beta_{l1} & \beta_{l2} & \cdots & \beta_{lm} \end{bmatrix} \tag{17}$$

where $\beta_{jk}$ denotes the connection weight between the $j$th neuron in the hidden layer and the $k$th neuron in the output layer.
Let the threshold vector $b$ of the hidden-layer neurons be

$$b = \left( b_1, b_2, \ldots, b_l \right)^{T} \tag{18}$$
Let the input matrix X and output matrix Y of the training set with Q samples be

$$X = \begin{bmatrix} x_{11} & x_{12} & \cdots & x_{1Q} \\ x_{21} & x_{22} & \cdots & x_{2Q} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nQ} \end{bmatrix} \tag{19}$$

$$Y = \begin{bmatrix} y_{11} & y_{12} & \cdots & y_{1Q} \\ y_{21} & y_{22} & \cdots & y_{2Q} \\ \vdots & \vdots & & \vdots \\ y_{m1} & y_{m2} & \cdots & y_{mQ} \end{bmatrix} \tag{20}$$
Let the activation function of the hidden-layer neurons be $g(x)$; then, from Figure 3, the output T of the network is:

$$T = \left( t_1, t_2, \ldots, t_Q \right)_{m \times Q} \tag{21}$$

$$t_j = \begin{bmatrix} t_{1j} \\ t_{2j} \\ \vdots \\ t_{mj} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{l} \beta_{i1}\, g(\omega_i x_j + b_i) \\ \sum_{i=1}^{l} \beta_{i2}\, g(\omega_i x_j + b_i) \\ \vdots \\ \sum_{i=1}^{l} \beta_{im}\, g(\omega_i x_j + b_i) \end{bmatrix}_{m \times 1} \tag{22}$$

where $\omega_i$ is the $i$th row of $\omega$ and $b_i$ is the threshold of the $i$th hidden neuron.

2.3.2. Kernel Based Extreme Learning Machine

The Kernel Based Extreme Learning Machine (KELM) [53,54] is an improved algorithm based on the Extreme Learning Machine (ELM) combined with a kernel function.
ELM is a single-hidden-layer feedforward neural network whose learning objective function $F(x)$ can be represented in matrix form:

$$F(x) = h(x) \times \beta = H \times \beta = L \tag{23}$$

where $x$ is the input vector, $h(x)$ (or $H$) is the output of the hidden layer nodes, $\beta$ is the output weight and $L$ is the desired output.
The network training is thereby turned into a linear system, and $\beta$ is determined as $\beta = H^{*} \cdot L$, where $H^{*}$ is the generalized inverse matrix of $H$. To enhance the stability of the neural network, the regularization factor $c$ and the identity matrix $I$ are introduced, so that the least squares solution for the output weights becomes

$$\beta = H^{T} \left( H H^{T} + \frac{I}{c} \right)^{-1} L \tag{24}$$
Introducing the kernel function into the ELM, the kernel matrix is

$$\Omega_{ELM} = H H^{T} = h(x_i) \cdot h(x_j) = K(x_i, x_j) \tag{25}$$
where $x_i$, $x_j$ are input vectors. Equation (23) can then be expressed as

$$F(x) = \left[ K(x, x_1); \ldots; K(x, x_n) \right] \left( \frac{I}{c} + \Omega_{ELM} \right)^{-1} L \tag{26}$$

where $(x_1, x_2, \ldots, x_n)$ are the given training samples, $n$ is the number of samples, and $K(\cdot)$ is the kernel function.
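For concreteness, a compact Python sketch of a KELM classifier following Equations (23)–(26) is shown below, with an RBF kernel $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / S)$. The one-hot output coding and the exact kernel form are our assumptions; the paper fixes only the roles of c and S.

```python
import numpy as np

class KELM:
    """Minimal kernel extreme learning machine (sketch of Eqs. (23)-(26))."""
    def __init__(self, c=1.0, S=1.0):
        self.c, self.S = c, S          # regularization factor, kernel width

    def _kernel(self, A, B):
        # RBF kernel matrix K(x_i, x_j) = exp(-||x_i - x_j||^2 / S)
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / self.S)

    def fit(self, X, y):
        # y: integer class labels 0..K-1; L: one-hot target matrix
        self.X = X
        L = np.eye(int(y.max()) + 1)[y]
        omega = self._kernel(X, X)                                  # Eq. (25)
        n = X.shape[0]
        self.beta = np.linalg.solve(omega + np.eye(n) / self.c, L)  # (I/c + Ω)^-1 L
        return self

    def predict(self, X):
        # Eq. (26): F(x) = [K(x, x_1); ...; K(x, x_n)] (I/c + Ω)^-1 L
        return (self._kernel(X, self.X) @ self.beta).argmax(axis=1)
```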

2.4. Experiment

2.4.1. Model Establishment

To detect and diagnose faults in the TE process database, the BCOA-KELM model was used. Table 1 shows the faults in the TE process database. These faults are linked to step changes in process variables, increased variability in process variables, and actuator faults (e.g., sticky valves). The data obtained from the TE process simulation are therefore used to diagnose and detect faults in the samples using the model. The TE process simulation outputs 52 variables, consisting of 41 measured variables and 11 manipulated variables of the form [XMEAS (1), XMEAS (2), …, XMEAS (41), XMV (1), …, XMV (11)]. First, consider the computational cost caused by the number of features and the impact of redundant features on the performance of the diagnostic network. In this paper, we input the dataset containing 52 features into the RFtb method, and the importance value of each feature is ranked by OOBPermutedVarDeltaError. The features selected by RFtb for the various faults are shown in Table 2. Then, the top five features are extracted as indicators for the classification model’s fault diagnosis. Finally, based on the feature selection results in the table, the training and test sets for the various fault diagnoses are built. The BCOA algorithm was then combined with KELM, and the diagnostic model was obtained by training on the optimized input data. We fed a test dataset into the trained diagnostic model to acquire classification results and confirm the model’s reliability.
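Since the paper’s RFtb ranking relies on MATLAB’s OOBPermutedVarDeltaError, a rough scikit-learn analogue is sketched below using permutation importance; the hyperparameter values are our choices, not those of the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

def select_top_features(X, y, n_keep=5, n_trees=100, seed=0):
    """Rank the 52 TE variables by permutation importance and keep the
    top n_keep (a stand-in for MATLAB's OOBPermutedVarDeltaError)."""
    rf = RandomForestClassifier(n_estimators=n_trees, oob_score=True,
                                random_state=seed).fit(X, y)
    imp = permutation_importance(rf, X, y, n_repeats=10, random_state=seed)
    return np.argsort(imp.importances_mean)[::-1][:n_keep]
```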

2.4.2. Tennessee Eastman Process

The Tennessee Eastman (TE) process is a platform for chemical simulation experiments based on an actual chemical reaction process. Downs and Vogel [55] proposed it for evaluating process control and monitoring methodologies. The TE process is a classic benchmark in chemical process research; numerous academics have studied it and used it to drive work on process monitoring and fault identification. Figure 4 depicts the approximate schematic of the Tennessee Eastman process, which includes five primary units: reactor, condenser, compressor, stripper, and separator. The TE process uses four reactants, A, C, D, and E, and yields two products, G and H. An inert component B and a by-product F are also present.
There are 12 manipulated variables and 41 measured variables in the TE process. The agitator speed, however, is held constant and is omitted, and the remaining 52 variables are used to represent the process in its entirety: the first 41 variables are measured variables, followed by the 11 manipulated variables. There are 16 known faults and 5 unknown faults in the TE process. Each fault has a training set and a test set, for a total of 22 training sets including normal conditions. The fault training datasets were collected during a 24 h fault simulation. The test datasets were generated using a 48 h running simulation, with the fault introduced at the 8 h mark. The sampling interval was three minutes.
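As an illustration, the commonly distributed TE simulation files can be loaded as plain numeric matrices; the file names and directory below are assumptions about the standard distribution, not something specified in the paper.

```python
import numpy as np

def load_te_fault(fault_id, data_dir="TE_process"):
    """Load one TE fault (1-21), assuming the usual d??.dat / d??_te.dat
    layout: a 24 h training run and a 48 h test run with the fault at 8 h."""
    train = np.loadtxt(f"{data_dir}/d{fault_id:02d}.dat")
    test = np.loadtxt(f"{data_dir}/d{fault_id:02d}_te.dat")
    return train, test
```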

3. Results

3.1. The Performance of the Proposed Method

3.1.1. Fault Diagnosis Rate (FDR) and False Positive Rate (FPR)

In the training sample set, the diagnostic model exhibited good diagnosis rates (over 90%) for all faults. As shown in Table 3, on the test sample set the average FDR was 0.8932 (89.32%) and the corresponding average FPR was 0.1157, showing that the model is effective; the model demonstrated good diagnosis rates (above 90%) for all faults except faults 3, 9 and 15. As shown in Figure 5, faults 3, 9, and 15 have a low FDR and a prominent FPR (a high FPR indicates poor performance). It is well recognized that faults 3, 9, 15 and 16 are a long-standing challenge in chemical fault diagnostics and a problem that must be overcome.
$$\mathrm{FDR} = \frac{TP + TN}{TP + TN + FN + FP} \tag{27}$$

$$\mathrm{FPR} = \frac{FP}{FP + TN} \tag{28}$$
The F1-score is a measure for classification problems. In some machine-learning competitions for multi-classification problems, the F1-score is often used as the final evaluation metric. It is the harmonic mean of precision and recall, with recall and precision weighted equally; its maximum is 1 and its minimum is 0. Table 4 shows the four values of the confusion matrix. True positives (TP) and true negatives (TN) count the correctly classified observations, while false positives (FP) and false negatives (FN) count the misclassifications.
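The two indices can be computed directly from the confusion-matrix counts; the small helper below is only a restatement of Equations (27) and (28) in code.

```python
def fdr_fpr(tp, tn, fp, fn):
    """Fault diagnosis rate (Eq. (27)) and false positive rate (Eq. (28))."""
    fdr = (tp + tn) / (tp + tn + fn + fp)
    fpr = fp / (fp + tn)
    return fdr, fpr
```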

3.1.2. F1-Score

Precision is the ratio of predicted true positive observations to the total number of predicted positive outcomes, given as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{29}$$
Recall is the ratio of predicted true positive observations to the total number of actual positive values, given as follows:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{30}$$
The F1-score is calculated as follows:

$$F_1\text{-score} = \frac{2\,\mathrm{Recall} \times \mathrm{Precision}}{\mathrm{Recall} + \mathrm{Precision}} \tag{31}$$
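Restated in code, the three quantities follow directly from the confusion-matrix counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision (Eq. (29)), recall (Eq. (30)) and F1-score (Eq. (31))."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * recall * precision / (recall + precision)
    return precision, recall, f1
```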
The F1-scores of BCOA-KELM are shown in Table 5, which reflects the diagnostic ability of the model. The recall and precision of faults 1, 4, 6, 7 and 14 almost reach 100%, demonstrating a high true positive rate and few false positives. Moreover, the trends of the F1-scores in the table are almost identical to those of the FDR, which suggests that the method is not severely overfitted for any fault. Finally, Figure 6 shows the recall and precision of the proposed method, indicating a good performance.

3.1.3. TSNE

To show directly the extent to which the fault states are identified by the proposed method, its final output t-distributed stochastic neighbor embedding (TSNE) plots are shown in Figure 7, where points of different colors indicate different fault states. The plotted data consist of 480 samples from each fault training set. Faults 1 and 2, which have fault diagnosis rates of 95% or more, are both well separated in the current sample. Fault 3, which has a fault diagnosis rate of 0.7354, contains many points from both fault 1 and fault 2; it contains fewer points from fault 1 than from fault 2, reflecting the higher fault diagnosis rate of fault 1 relative to fault 2. As shown in Figure 8, fault 15, with a fault diagnosis rate of 58.85%, is added to the TSNE plot of faults 1, 2 and 3. Fault 15 mixes with fault 3 in a way that is relatively similar and difficult to identify, which directly reflects the low fault diagnosis rate of fault 15; indeed, in essentially all of the TSNE plots, fault 15 forms a similar, hard-to-identify mixture with the other faults.
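A sketch of the visualization behind Figures 7 and 8 is given below using scikit-learn’s t-SNE; the perplexity and other settings are our choices, not those of the paper.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(fault_features, fault_labels):
    """Project per-fault feature matrices (e.g., 480 samples each) to 2-D."""
    X = np.vstack(fault_features)
    y = np.concatenate([[l] * len(f) for l, f in zip(fault_labels, fault_features)])
    emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    for l in fault_labels:
        pts = emb[y == l]
        plt.scatter(pts[:, 0], pts[:, 1], s=5, label=f"fault {l}")
    plt.legend()
    plt.show()
```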

3.2. Performance Comparison

Table 6 records the parameters used by the BCOA-KELM algorithm in this experiment. To demonstrate the performance of the proposed method, Table 7 displays the performance of this method compared with other fault diagnostic methods on the TE process; the other methods are WCForest [56], DBN [57], GAN-PCC-DBN [37], LSTM-CNN [58] and RF-GA-CNN [59].
In comparison with the other fault diagnosis methods, the model in this paper performed the best, as shown in Table 7. The average FDR of the BCOA-KELM algorithm is 0.8932, while WCForest reaches 0.8413, DBN 0.8238, GAN-PCC-DBN 0.8916, LSTM-CNN 0.8822 and RF-GA-CNN 0.8804. Moreover, the accuracy for all faults was above 50%, and the diagnosis rate was above 70% for all faults except fault 15.
It is commonly recognized that the problematic faults in chemical process diagnosis are 3, 9, 15 and 16. As can be seen from Table 7, the fault diagnosis rates of the other methods for these four faults are often low, frequently below 50%, while the fault diagnosis rates of the present method for faults 3, 9, 15 and 16 reach 0.7354, 0.7125, 0.5885 and 0.9146, respectively; the present method therefore effectively improves the fault diagnosis rates for faults 3, 9 and 16. Furthermore, the fault diagnosis rate of 0.9146 for fault 16 is significantly higher than that of the other methods, demonstrating that BCOA-KELM makes a significant advance on fault 16. The proposed method has the best average FDR among the diagnostic methods for the TE process, demonstrating its superiority.

3.3. Ablation Experiment

To verify the relative importance of both the feature-selection strategy and the BCOA algorithm for the BCOA-KELM algorithm, we divided the experiment into four groups: KELM, BCOA-KELM, KELM after feature selection, and BCOA-KELM after feature selection. We compared their FDR to verify the importance of both in the algorithm.
The results of the ablation experiments are shown in Table 8. Among the four groups involved in the comparison, the lowest FDR was that of the KELM algorithm alone, whose FDR was only 71.18%; the highest FDR was the feature-selected BCOA-KELM algorithm, whose FDR was 89.32%.
When using BCOA-KELM, the FDR of BCOA-KELM improved by 5.90% over that of KELM for the two sets of experiments without feature selection, and by 16.62% for the two sets of experiments with feature selection. This proves that the BCOA algorithm can optimize the KELM very well.
When using feature selection, the FDR of the KELM algorithm improved by 1.52% over the KELM without feature selection, while the FDR of the BCOA-KELM algorithm improved by 12.23%. Feature selection thus yielded only a small FDR improvement for the KELM algorithm but a large one for the BCOA-KELM algorithm, which proves the effectiveness of feature selection and also shows that a good combination of the BCOA search and feature selection can better improve the fault diagnosis rate of the algorithm.
To further validate the relative importance of both components for the BCOA-KELM algorithm, we also used the FPR metric.
The results of the ablation experiments are shown in Table 9. Among the four groups involved in the comparison, the highest FPR was that of the KELM algorithm alone, at 0.3032; the lowest FPR was that of the BCOA-KELM algorithm after feature selection, at 0.1157.
When BCOA-KELM was used, the FPR of BCOA-KELM was reduced by 0.0748 compared with that of KELM for the two sets of experiments without feature selection, and by 0.1739 for the two sets of experiments with feature selection. Overall, the combination of the BCOA algorithm and KELM effectively reduced the FPR, which proves the optimization effect of BCOA on KELM.
When using feature selection, the FPR of the KELM algorithm was reduced by 0.0136 compared to the KELM without feature selection, and the FPR of the BCOA-KELM algorithm was reduced by 0.1127 compared to the BCOA-KELM without feature selection. This also proves the importance of feature selection in this algorithm.

4. Discussion and Conclusions

On the basis of ELM, the BCOA-KELM method was presented for TE process fault diagnosis. KELM is used to diagnose and classify faults. Its internal parameters c and S, however, have an impact on KELM’s performance, and BCOA is employed to optimize these parameters. The proposal of BCOA and the combination of the KELM and BCOA algorithms are the paper’s most significant contributions: the algorithm’s overall performance and accuracy are improved, while its training time is lowered. The F1-score and the model’s accuracy are compared to determine the model’s efficacy. Classification accuracy is one of the key criteria when evaluating fault diagnosis systems; however, overfitting can lead to a mismatch between accuracy results and actual fault diagnosis ability, which the F1-score can reveal. Combining the two metrics therefore reflects the model’s diagnostic results more properly. The experiments reveal that BCOA-KELM has a fast training time, a higher classification accuracy in fault diagnosis than the other algorithms, and a significant improvement in diagnostic accuracy for fault 16. The model outperforms the commonly used diagnostic models in terms of diagnostic findings. As a result, BCOA-KELM can be used to diagnose Tennessee Eastman process faults as well as other classification and prediction problems.
Although the method has achieved good results, some limitations remain to be addressed in future work. First, the quality of the raw data has a significant impact on the performance of the method, which is an important reason for the low accuracy of some fault diagnoses; a dedicated data-cleaning process for such fault datasets is crucial in practical applications. Second, although our optimization of KELM’s network hyperparameters leads to a significant improvement in its diagnostic performance, there is an upper limit to the accuracy of the KELM classifier, so the structure of the classifier itself should be optimized next. Third, for the extracted features, we can combine the characteristics of the chemical process itself and consider the TE process itself, rather than just the data-driven diagnostic aspects, to explore the chemical connections between the feature variables; this will be a new cross-optimization direction. Finally, the time complexity of the proposed approach needs to be considered during the design and validation of deep learning models. For example, when low-level features are sufficient for high-precision fault diagnosis, there is no need to extract high-level features of the chemical process, which may help to improve efficiency.

Author Contributions

Conceptualization, X.H. and M.H.; methodology, X.H.; software, X.H.; validation, X.H. and M.H.; formal analysis, X.H. and M.H.; investigation, X.H.; resources, X.H., M.H. and X.Y.; data curation, X.H. and M.H.; writing—original draft preparation, X.H. and M.H.; writing—review and editing, X.H. and M.H.; visualization, X.H. and M.H.; supervision, X.Y.; project administration, X.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Deng, X.; Tian, X.; Chen, S.; Harris, C.J. Deep principal component analysis based on layerwise feature extraction and its application to nonlinear process monitoring. IEEE Trans. Control Syst. Technol. 2018, 27, 2526–2540. [Google Scholar] [CrossRef]
  2. Pagliaro, M. An Industry in Transition: The Chemical Industry and the Megatrends Driving Its Forthcoming Transformation. Angew. Chem. Int. Ed. 2019, 58, 11154–11159. [Google Scholar] [CrossRef] [PubMed]
  3. He, B.; Bai, K.J. Digital twin-based sustainable intelligent manufacturing: A review. Adv. Manuf. 2021, 9, 1–21. [Google Scholar] [CrossRef]
  4. Maddikunta, P.K.R.; Pham, Q.V.; B, P.; Deepa, N.; Dev, K.; Gadekallu, T.R.; Ruby, R.; Liyanage, M. Industry 5.0: A survey on enabling technologies and potential applications. J. Ind. Inf. Integr. 2021, 26, 100257. [Google Scholar] [CrossRef]
  5. Gao, X.; Hou, J. An improved SVM integrated GS-PCA fault diagnosis approach of Tennessee Eastman process. Neurocomputing 2016, 174, 906–911. [Google Scholar] [CrossRef]
  6. Dong, J.; Zhang, K.; Huang, Y.; Li, G.; Peng, K. Adaptive total PLS based quality-relevant process monitoring with application to the Tennessee Eastman process. Neurocomputing 2015, 154, 77–85. [Google Scholar] [CrossRef]
  7. An, J.; Wang, Z.; Jiang, T.; Chen, P.; Liang, X.; Shao, J.; Nie, J.; Xu, M.; Wang, Z.L. Reliable mechatronic indicator for self-powered liquid sensing toward smart manufacture and safe transportation. Mater. Today 2020, 41, 10–20. [Google Scholar] [CrossRef]
  8. Tauseef, S.; Abbasi, T.; Pompapathi, V.; Abbasi, S. Case studies of 28 major accidents of fires/explosions in storage tank farms in the backdrop of available codes/standards/models for safely configuring such tank farms. Process Saf. Environ. Prot. 2018, 120, 331–338. [Google Scholar] [CrossRef]
  9. Chen, C.; Khakzad, N.; Reniers, G. Dynamic vulnerability assessment of process plants with respect to vapor cloud explosions. Reliab. Eng. Syst. Saf. 2020, 200, 106934. [Google Scholar] [CrossRef]
  10. Yang, Y.; Chen, G.; Reniers, G.; Goerlandt, F. A bibliometric analysis of process safety research in China: Understanding safety research progress as a basis for making China’s chemical industry more sustainable. J. Clean. Prod. 2020, 263, 121433. [Google Scholar] [CrossRef]
  11. Venkatasubramanian, V.; Rengaswamy, R.; Yin, K.; Kavuri, S.N. A review of process fault detection and diagnosis: Part I: Quantitative model-based methods. Comput. Chem. Eng. 2003, 27, 293–311. [Google Scholar] [CrossRef]
  12. Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S.N. A review of process fault detection and diagnosis: Part II: Qualitative models and search strategies. Comput. Chem. Eng. 2003, 27, 313–326. [Google Scholar] [CrossRef]
  13. Venkatasubramanian, V.; Rengaswamy, R.; Kavuri, S.N.; Yin, K. A review of process fault detection and diagnosis: Part III: Process history based methods. Comput. Chem. Eng. 2003, 27, 327–346. [Google Scholar] [CrossRef]
  14. Xinyi, C.; Xuefeng, Y. Fault diagnosis in chemical process based on self-organizing map integrated with fisher discriminant analysis. Chin. J. Chem. Eng. 2013, 21, 382–387. [Google Scholar]
  15. Nor, N.M.; Hassan, C.R.C.; Hussain, M.A. A review of data-driven fault detection and diagnosis methods: Applications in chemical process systems. Rev. Chem. Eng. 2020, 36, 513–553. [Google Scholar] [CrossRef]
  16. Dunia, R.; Qin, S.J.; Edgar, T.F.; McAvoy, T.J. Identification of faulty sensors using principal component analysis. AIChE J. 1996, 42, 2797–2812. [Google Scholar] [CrossRef]
  17. Botre, C.; Mansouri, M.; Nounou, M.; Nounou, H.; Karim, M.N. Kernel PLS-based GLRT method for fault detection of chemical processes. J. Loss Prev. Process Ind. 2016, 43, 212–224. [Google Scholar] [CrossRef]
  18. Xu, Y.; Deng, X. Fault detection of multimode non-Gaussian dynamic process using dynamic Bayesian independent component analysis. Neurocomputing 2016, 200, 70–79. [Google Scholar] [CrossRef]
  19. Yu, J. Localized Fisher discriminant analysis based complex chemical process monitoring. AIChE J. 2011, 57, 1817–1828. [Google Scholar] [CrossRef]
  20. Wang, C.; Li, H.; Zhang, K.; Hu, S.; Sun, B. Intelligent fault diagnosis of planetary gearbox based on adaptive normalized CNN under complex variable working conditions and data imbalance. Measurement 2021, 180, 109565. [Google Scholar] [CrossRef]
  21. Wang, N.; Li, H.; Wu, F.; Zhang, R.; Gao, F. Fault Diagnosis of Complex Chemical Processes Using Feature Fusion of a Convolutional Network. Ind. Eng. Chem. Res. 2021, 60, 2232–2248. [Google Scholar] [CrossRef]
  22. Mahadevan, S.; Shah, S.L. Fault detection and diagnosis in process data using one-class support vector machines. J. Process Control 2009, 19, 1627–1639. [Google Scholar] [CrossRef]
  23. Pirdashti, M.; Curteanu, S.; Kamangar, M.H.; Hassim, M.H.; Khatami, M.A. Artificial neural networks: Applications in chemical engineering. Rev. Chem. Eng. 2013, 29, 205–239. [Google Scholar] [CrossRef]
  24. Li, C.; Sanchez, R.V.; Zurita, G.; Cerrada, M.; Cabrera, D.; Vásquez, R.E. Multimodal deep support vector classification with homologous features and its application to gearbox fault diagnosis. Neurocomputing 2015, 168, 119–127. [Google Scholar] [CrossRef]
  25. Saimurugan, M.; Ramachandran, K.; Sugumaran, V.; Sakthivel, N. Multi component fault diagnosis of rotational mechanical system based on decision tree and support vector machine. Expert Syst. Appl. 2011, 38, 3819–3826. [Google Scholar] [CrossRef]
  26. Demetgul, M. Fault diagnosis on production systems with support vector machine and decision trees algorithms. Int. J. Adv. Manuf. Technol. 2013, 67, 2183–2194. [Google Scholar] [CrossRef]
  27. Kadkhodazadeh, M.; Valikhan Anaraki, M.; Morshed-Bozorgdel, A.; Farzin, S. A New Methodology for Reference Evapotranspiration Prediction and Uncertainty Analysis under Climate Change Conditions Based on Machine Learning, Multi Criteria Decision Making and Monte Carlo Methods. Sustainability 2022, 14, 2601. [Google Scholar] [CrossRef]
  28. Kadkhodazadeh, M.; Farzin, S. A novel LSSVM model integrated with GBO algorithm to assessment of water quality parameters. Water Resour. Manag. 2021, 35, 3939–3968. [Google Scholar] [CrossRef]
  29. Yang, G.; Gu, X. Fault diagnosis of complex chemical processes based on enhanced naive Bayesian method. IEEE Trans. Instrum. Meas. 2019, 69, 4649–4658. [Google Scholar] [CrossRef]
  30. Zhou, Y.; Peng, J.; Chen, C.P. Extreme learning machine with composite kernels for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 8, 2351–2360. [Google Scholar] [CrossRef]
  31. You, C.X.; Huang, J.Q.; Lu, F. Recursive reduced kernel based extreme learning machine for aero-engine fault pattern recognition. Neurocomputing 2016, 214, 1038–1045. [Google Scholar] [CrossRef]
  32. Pang, S.; Yang, X.; Zhang, X. Aero engine component fault diagnosis using multi-hidden-layer extreme learning machine with optimized structure. Int. J. Aerosp. Eng. 2016, 2016. [Google Scholar] [CrossRef]
  33. Huang, G.B.; Zhou, H.; Ding, X.; Zhang, R. Extreme learning machine for regression and multiclass classification. IEEE Trans. Syst. Man, Cybern. Part B Cybern. 2011, 42, 513–529. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  34. Zhang, L.; Zhang, D.; Tian, F. SVM and ELM: Who Wins? Object recognition with deep convolutional features from ImageNet. In ELM-2015 Volume 1, Proceedings of International Conference on Extreme Learning Machines (ELM-2015), Hangzhou, China, 15–17 December 2015; Springer: Berlin/Heidelberg, Germany, 2016; pp. 249–263. [Google Scholar]
  35. John, G.H.; Kohavi, R.; Pfleger, K. Irrelevant features and the subset selection problem. In Machine Learning Proceedings 1994, Proceedings of the Eleventh International Conference, New Brunswick, NJ, USA, 10–13 July 1994; Elsevier: Amsterdam, The Netherlands, 1994; pp. 121–129. [Google Scholar]
  36. Malhi, A.; Gao, R.X. PCA-based feature selection scheme for machine defect classification. IEEE Trans. Instrum. Meas. 2004, 53, 1517–1525. [Google Scholar] [CrossRef]
  37. Tian, W.; Liu, Z.; Li, L.; Zhang, S.; Li, C. Identification of abnormal conditions in high-dimensional chemical process based on feature selection and deep learning. Chin. J. Chem. Eng. 2020, 28, 1875–1883. [Google Scholar] [CrossRef]
  38. Menze, B.H.; Kelm, B.M.; Masuch, R.; Himmelreich, U.; Bachert, P.; Petrich, W.; Hamprecht, F.A. A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data. BMC Bioinform. 2009, 10, 213. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  39. Liu, M.; Wang, M.; Wang, J.; Li, D. Comparison of random forest, support vector machine and back propagation neural network for electronic tongue data classification: Application to the recognition of orange beverage and Chinese vinegar. Sens. Actuators B Chem. 2013, 177, 970–980. [Google Scholar] [CrossRef]
  40. Caruana, R.; Karampatziakis, N.; Yessenalina, A. An empirical evaluation of supervised learning in high dimensions. In Proceedings of the 25th International Conference on Machine Learning, Helsinki, Finland, 5–9 July 2008; pp. 96–103. [Google Scholar]
  41. Cerrada, M.; Zurita, G.; Cabrera, D.; Sánchez, R.V.; Artés, M.; Li, C. Fault diagnosis in spur gears based on genetic algorithm and random forest. Mech. Syst. Signal Process. 2016, 70, 87–103. [Google Scholar] [CrossRef]
  42. Cabrera, D.; Sancho, F.; Sánchez, R.V.; Zurita, G.; Cerrada, M.; Li, C.; Vásquez, R.E. Fault diagnosis of spur gearbox based on random forest and wavelet packet decomposition. Front. Mech. Eng. 2015, 10, 277–286. [Google Scholar] [CrossRef]
  43. Xu, G.; Liu, M.; Jiang, Z.; Söffker, D.; Shen, W. Bearing fault diagnosis method based on deep convolutional neural network and random forest ensemble learning. Sensors 2019, 19, 1088. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  44. Xie, Z.; Yang, X.; Li, A.; Ji, Z. Fault diagnosis in industrial chemical processes using optimal probabilistic neural network. Can. J. Chem. Eng. 2019, 97, 2453–2464. [Google Scholar] [CrossRef]
  45. Abraham, S.; Huynh, C.; Vu, H. Classification of soils into hydrologic groups using machine learning. Data 2020, 5, 2. [Google Scholar] [CrossRef] [Green Version]
  46. Pierezan, J.; Coelho, L.D.S. Coyote optimization algorithm: A new metaheuristic for global optimization problems. In Proceedings of the 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–8. [Google Scholar]
  47. Chin, V.J.; Salam, Z. Coyote optimization algorithm for the parameter extraction of photovoltaic cells. Sol. Energy 2019, 194, 656–670. [Google Scholar] [CrossRef]
  48. Qais, M.H.; Hasanien, H.M.; Alghuwainem, S.; Nouh, A.S. Coyote optimization algorithm for parameters extraction of three-diode photovoltaic models of photovoltaic modules. Energy 2019, 187, 116001. [Google Scholar] [CrossRef]
  49. Pham, T.D.; Nguyen, T.T.; Dinh, B.H. Find optimal capacity and location of distributed generation units in radial distribution networks by using enhanced coyote optimization algorithm. Neural Comput. Appl. 2021, 33, 4343–4371. [Google Scholar] [CrossRef]
  50. Zhang, C.; Ding, S. A stochastic configuration network based on chaotic sparrow search algorithm. Knowl.-Based Syst. 2021, 220, 106924. [Google Scholar] [CrossRef]
  51. Pierezan, J.; dos Santos Coelho, L.; Mariani, V.C.; de Vasconcelos Segundo, E.H.; Prayogo, D. Chaotic coyote algorithm applied to truss optimization problems. Comput. Struct. 2021, 242, 106353. [Google Scholar] [CrossRef]
  52. Huang, G.; Huang, G.B.; Song, S.; You, K. Trends in extreme learning machines: A review. Neural Netw. 2015, 61, 32–48. [Google Scholar] [CrossRef] [PubMed]
  53. Yang, Z.; Ce, L.; Lian, L. Electricity price forecasting by a hybrid model, combining wavelet transform, ARMA and kernel-based extreme learning machine methods. Appl. Energy 2017, 190, 291–305. [Google Scholar] [CrossRef]
  54. Yang, W.; Tian, Z.; Hao, Y. A novel ensemble model based on artificial intelligence and mixed-frequency techniques for wind speed forecasting. Energy Convers. Manag. 2022, 252, 115086. [Google Scholar] [CrossRef]
  55. Downs, J.J.; Vogel, E.F. A plant-wide industrial process control problem. Comput. Chem. Eng. 1993, 17, 245–255. [Google Scholar] [CrossRef]
  56. Ding, J.; Luo, Q.; Jia, L.; You, J. Deep Forest-Based Fault Diagnosis Method for Chemical Process. Math. Probl. Eng. 2020, 2020, 5281512. [Google Scholar] [CrossRef]
  57. Zhang, Z.; Zhao, J. A deep belief network based fault diagnosis model for complex chemical processes. Comput. Chem. Eng. 2017, 107, 395–407. [Google Scholar] [CrossRef]
  58. Han, Y.; Ding, N.; Geng, Z.; Wang, Z.; Chu, C. An optimized long short-term memory network based fault diagnosis model for chemical processes. J. Process Control 2020, 92, 161–168. [Google Scholar] [CrossRef]
  59. Deng, L.; Zhang, Y.; Dai, Y.; Ji, X.; Zhou, L.; Dang, Y. Integrating feature optimization using a dynamic convolutional neural network for chemical process supervised fault classification. Process Saf. Environ. Prot. 2021, 155, 473–485. [Google Scholar] [CrossRef]
Figure 1. Change of testing ACR with the number of training epochs.
Figure 2. Framework of the proposed fault diagnosis method.
Figure 3. Extreme Learning Machine Network Structure.
Figure 4. The TE process diagram.
Figure 5. FDR and FPR of BCOA-KELM.
Figure 6. Stacked graph of precision and recall.
Figure 7. TSNE visual output of faults 1, 2, 3.
Figure 8. TSNE visual output of faults 1, 2, 3, 15.
Table 1. Fault descriptions of the TE process.

Variable Number | Process Variable | Type
IDV (1) | A/C feed ratio, B composition constant (stream 4) | Step
IDV (2) | B composition, A/C ratio constant (stream 4) | Step
IDV (3) | D feed temperature (stream 2) | Step
IDV (4) | Reactor cooling water inlet temperature | Step
IDV (5) | Condenser cooling water inlet temperature | Step
IDV (6) | A feed loss (stream 1) | Step
IDV (7) | C header pressure loss-reduced availability (stream 4) | Step
IDV (8) | A, B, C feed composition (stream 4) | Random variation
IDV (9) | D feed temperature (stream 2) | Random variation
IDV (10) | C feed temperature (stream 4) | Random variation
IDV (11) | Reactor cooling water inlet temperature | Random variation
IDV (12) | Condenser cooling water inlet temperature | Random variation
IDV (13) | Reaction kinetics | Slow drift
IDV (14) | Reactor cooling water valve | Sticking
IDV (15) | Condenser cooling water valve | Sticking
IDV (16)–(20) | Unknown | Unknown
IDV (21) | Valve (stream 4) | Constant position
Table 2. Selected features for each condition.

Fault | Feature | Fault | Feature | Fault | Feature
1 | 1, 20, 22, 44, 46 | 8 | 16, 29, 38, 40, 41 | 15 | 16, 19, 20, 39, 40
2 | 10, 34, 39, 46, 47 | 9 | 19, 25, 31, 38, 50 | 16 | 18, 19, 38, 46, 50
3 | 18, 20, 37, 40, 41 | 10 | 18, 19, 31, 38, 50 | 17 | 21, 38, 46, 50, 51
4 | 19, 38, 47, 50, 51 | 11 | 7, 9, 13, 38, 51 | 18 | 16, 19, 22, 41, 50
5 | 17, 18, 38, 50, 52 | 12 | 4, 11, 18, 19, 35 | 19 | 5, 13, 20, 46, 50
6 | 1, 20, 37, 44, 46 | 13 | 7, 18, 19, 39, 50 | 20 | 19, 39, 41, 46, 50
7 | 19, 38, 45, 46, 50 | 14 | 9, 11, 21, 38, 50 | 21 | 7, 16, 19, 45, 50
Table 3. Diagnosis results of BCOA-KELM in the TE process.

Fault | FDR | FPR
1 | 0.9969 | 0.0038
2 | 0.9750 | 0.0300
3 | 0.7354 | 0.2963
4 | 1.0000 | 0.0000
5 | 0.9844 | 0.0188
6 | 1.0000 | 0.0000
7 | 1.0000 | 0.0000
8 | 0.9271 | 0.0875
9 | 0.7125 | 0.3325
10 | 0.9333 | 0.0775
11 | 0.8615 | 0.1550
12 | 0.8969 | 0.1238
13 | 0.8552 | 0.1463
14 | 0.9990 | 0.0013
15 | 0.5885 | 0.4338
16 | 0.9146 | 0.0988
17 | 0.9563 | 0.0525
18 | 0.8865 | 0.1300
19 | 0.8688 | 0.1188
20 | 0.8469 | 0.1538
21 | 0.8177 | 0.1700
Average | 0.8932 | 0.1157
Table 4. Confusion matrix.

Confusion Matrix | Prediction: Positive (P) | Prediction: Negative (N)
Real: True (T) | TP | FN
Real: False (F) | FP | TN
Table 5. F1-score of BCOA-KELM in the TE process.

Fault | Precision | Recall | F1
1 | 0.9816 | 1.0000 | 0.9907
2 | 0.8696 | 1.0000 | 0.9302
3 | 0.3763 | 0.8938 | 0.5296
4 | 1.0000 | 1.0000 | 1.0000
5 | 0.9143 | 1.0000 | 0.9552
6 | 1.0000 | 1.0000 | 1.0000
7 | 1.0000 | 1.0000 | 1.0000
8 | 0.6957 | 1.0000 | 0.8205
9 | 0.3606 | 0.9375 | 0.5208
10 | 0.7182 | 0.9875 | 0.8316
11 | 0.5490 | 0.9438 | 0.6943
12 | 0.6178 | 1.0000 | 0.7637
13 | 0.5412 | 0.8625 | 0.6651
14 | 0.9938 | 1.0000 | 0.9969
15 | 0.2440 | 0.7000 | 0.3619
16 | 0.6653 | 0.9813 | 0.7929
17 | 0.7921 | 1.0000 | 0.8840
18 | 0.5985 | 0.9688 | 0.7399
19 | 0.5759 | 0.8063 | 0.6719
20 | 0.5251 | 0.8500 | 0.6491
21 | 0.4708 | 0.7563 | 0.5803
Average | 0.6900 | 0.9375 | 0.7799
Table 6. Parameters of BCOA in optimization of KELM.

Category | Value
Number of optimization parameters | 2
Varmin | 0
Varmax | 10
Number of packs | 10
Number of coyotes | 5
Probability of leaving a pack | 0.005
Chaotic perturbation | Bernoulli shift
Table 7. Comparison of FDR (%) of different methods.

Fault | WCForest | DBN | GAN-PCC-DBN | LSTM-CNN | RF-GA-CNN | Proposed Method
1 | 99.17 | 100.00 | 99.60 | 99.60 | 99.56 | 99.69
2 | 100.00 | 99.00 | 98.40 | 98.90 | 99.06 | 97.50
3 | 42.33 | 95.00 | 40.20 | 83.00 | 95.81 | 73.54
4 | 98.83 | 98.00 | 98.80 | 98.40 | 99.81 | 100.00
5 | 90.97 | 86.00 | 97.40 | 82.90 | 62.06 | 98.44
6 | 100.00 | 100.00 | 99.70 | 99.30 | 98.93 | 100.00
7 | 100.00 | 100.00 | 98.10 | 100.00 | 100.00 | 100.00
8 | 96.00 | 78.00 | 95.40 | 94.90 | 92.69 | 92.71
9 | 51.50 | 57.00 | 41.00 | 62.20 | 77.62 | 71.25
10 | 75.83 | 98.00 | 97.40 | 97.60 | 97.06 | 93.34
11 | 82.00 | 87.00 | 95.50 | 97.90 | 98.75 | 86.15
12 | 93.50 | 85.00 | 99.00 | 97.40 | 97.88 | 89.69
13 | 94.83 | 88.00 | 94.00 | 94.40 | 94.56 | 85.52
14 | 96.83 | 87.00 | 94.80 | 99.10 | 99.50 | 99.90
15 | 7.17 | 0.00 | 55.20 | 40.90 | 43.50 | 58.85
16 | 78.00 | 0.00 | 91.90 | 39.70 | 20.69 | 91.45
17 | 94.50 | 100.00 | 90.10 | 95.20 | 94.88 | 95.63
18 | 99.67 | 98.00 | 96.10 | 92.00 | 94.56 | 88.65
19 | 89.50 | 93.00 | 97.00 | 98.80 | 99.00 | 86.88
20 | 76.00 | 93.00 | 93.20 | 93.20 | 93.56 | 84.69
21 | 100.00 | 88.00 | 99.50 | 87.30 | 89.30 | 81.77
Average | 84.13 | 82.38 | 89.16 | 88.22 | 88.04 | 89.32
Table 8. FDR (%) of the base algorithm and related improved algorithms.

Fault | KELM (Without Feature Selection) | BCOA-KELM (Without Feature Selection) | KELM (Feature Selection) | BCOA-KELM (Feature Selection)
1 | 99.38 | 99.90 | 98.96 | 99.69
2 | 97.81 | 97.71 | 97.29 | 97.50
3 | 59.38 | 63.33 | 73.54 | 73.54
4 | 100.00 | 100.00 | 100.00 | 100.00
5 | 48.13 | 62.40 | 60.00 | 98.44
6 | 99.58 | 99.58 | 99.79 | 100.00
7 | 100.00 | 100.00 | 99.69 | 100.00
8 | 64.17 | 86.25 | 53.85 | 92.71
9 | 72.29 | 71.25 | 59.27 | 71.25
10 | 60.42 | 65.63 | 41.15 | 93.33
11 | 55.21 | 61.56 | 58.65 | 86.15
12 | 47.81 | 73.96 | 43.44 | 89.69
13 | 60.00 | 70.31 | 54.17 | 85.52
14 | 81.35 | 95.52 | 99.48 | 99.90
15 | 43.96 | 48.65 | 58.85 | 58.85
16 | 61.77 | 65.73 | 67.81 | 91.46
17 | 68.33 | 76.25 | 70.52 | 95.63
18 | 90.52 | 90.00 | 86.77 | 88.65
19 | 60.31 | 64.79 | 73.75 | 86.88
20 | 73.02 | 72.71 | 74.27 | 84.69
21 | 51.35 | 53.23 | 55.42 | 81.77
Average | 71.18 | 77.08 | 72.70 | 89.32
Table 9. FPR of the base algorithm and related improved algorithms.

Fault | KELM (Without Feature Selection) | BCOA-KELM (Without Feature Selection) | KELM (Feature Selection) | BCOA-KELM (Feature Selection)
1 | 0.0075 | 0.0013 | 0.0125 | 0.0038
2 | 0.0263 | 0.0275 | 0.0325 | 0.0300
3 | 0.3300 | 0.2788 | 0.1350 | 0.2963
4 | 0.0000 | 0.0000 | 0.0000 | 0.0000
5 | 0.6225 | 0.4513 | 0.4800 | 0.0188
6 | 0.0050 | 0.0050 | 0.0025 | 0.0000
7 | 0.0000 | 0.0000 | 0.0038 | 0.0000
8 | 0.4300 | 0.1650 | 0.5538 | 0.0875
9 | 0.2000 | 0.2200 | 0.4050 | 0.3325
10 | 0.4613 | 0.4013 | 0.7063 | 0.0775
11 | 0.5050 | 0.4163 | 0.4763 | 0.1550
12 | 0.6263 | 0.3125 | 0.6788 | 0.1238
13 | 0.2888 | 0.1563 | 0.3575 | 0.1463
14 | 0.2088 | 0.0525 | 0.0063 | 0.0013
15 | 0.6588 | 0.6013 | 0.4338 | 0.4338
16 | 0.4288 | 0.3788 | 0.3575 | 0.0988
17 | 0.3800 | 0.2850 | 0.3538 | 0.0525
18 | 0.1138 | 0.0963 | 0.1588 | 0.1300
19 | 0.3688 | 0.2538 | 0.3038 | 0.1188
20 | 0.3113 | 0.3200 | 0.2888 | 0.1538
21 | 0.3950 | 0.3738 | 0.3363 | 0.1700
Average | 0.3032 | 0.2284 | 0.2896 | 0.1157
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
