*Article* **Causality-Network-Based Critical Hazard Identification for Railway Accident Prevention: Complex Network-Based Model Development and Comparison**

**Qian Li <sup>1</sup>, Zhe Zhang <sup>2,</sup>\* and Fei Peng <sup>3</sup>**


**Abstract:** This study investigates a critical hazard identification method for railway accident prevention. A new accident causation network is proposed to model the interactions between hazards and accidents. To make the most likely causation path from hazards to accidents consistent with the shortest causation path, a method for measuring the length between adjacent nodes is proposed, and the most likely causation path problem is thus transformed into a shortest causation path problem. To identify the critical hazard factors that should be alleviated for accident prevention, a novel critical hazard identification model is proposed based on a controllability analysis of hazards, and five critical hazard identification methods are formulated to select critical hazard nodes in the accident causality network. A comparison of the results shows that the combination of the integer programming-based critical hazard identification method and the proposed weighted directed accident causality network considering length has the best accident prevention performance.

**Keywords:** railway accident prevention; critical hazard identification; accident causality network; integer programming

#### **1. Introduction**

#### *1.1. Background*

Railway transportation has become a main transportation mode owing to its advantages of high speed and low cost. However, railway accidents often interrupt railway transportation processes. Therefore, railway companies emphasize hazard control and emergency management to improve the safety and efficiency of railway transportation. Hazard control is used to alleviate critical or frequent hazard factors and can be considered an accident prevention measure. Emergency management addresses accidents and reduces their negative effects after they occur.

This study focuses on identifying critical hazards, which is the core aspect of hazard control. Using accident analysis methods, railway safety managers can investigate the causes of accidents from the aspects of humans, organizations, the environment, and technology. With the increasing number of accident investigation reports, more railway accident causes or hazards can be determined. Typically, some hazards contribute more significantly to accidents than others. These hazards, termed critical hazards, should be identified to support the risk management of railway systems. In the following section, we review the related literature and discuss the contributions of this study.

#### *1.2. Related Studies*

Experience- or data-based risk analysis methods are often used to determine the causes of accidents. The data are typically obtained from accident reports or railway experts.
Based on accident data or experience, many accident causation models have been developed to determine the hazards that should be controlled or alleviated. Some researchers have classified accident causation models into various groups based on their attention preference toward accidents [1–3]. The first group comprises domino theory-based accident causation models. The domino model, developed by Heinrich, considers an accident as the result of several sequential hazard events [4]. Because the domino model simplifies the role of human behavior in accident causation, researchers improved the model by focusing more on management failures, and the modified domino models are defined as management-based accident causation models [5,6]. The third group comprises human error models, in which the occurrence of accidents is attributed to human hazards or errors such as incorrect responses and improper activities [7]. For example, a human performance railway operational index has been proposed to predict the probability of human failure in railway operations. Human error analysis, human error reduction, and human factor analysis and classification system (HFACS) methods have been developed to analyze human errors in railway operations [8–11]. A Poisson regression method was used to determine the relationship between driver personality and driving safety [12]. Many factors affect the safe operation of railway systems in addition to operators such as train drivers, signalers, and controllers. Therefore, the systemic accident models proposed by Hollnagel may be more suitable for railway accident causation analysis [13,14]. An accident causation model for the railway industry has been proposed to investigate the contributions of human failure, technical failure, and external intrusion to final accidents [15]. A systems-theoretic accident model and processes-based model has been proposed to investigate the various causes of railway accidents, including human and management failures [16,17]. Furthermore, the HFACS-RA (Human Factors Analysis and Classification System-Railway Accidents) method was proposed to identify and analyze the human and organizational factors that affect railway accidents [18]. An integrated evolutionary model called the scenario-risk-accident chain ontology (SRAC) was proposed to determine the risk relevance of railway systems from accident reports [19]. To support the safety management of railway systems, a quantitative causal analysis was proposed to identify the most important factors contributing to the risk of passenger train accidents [20]. Association rules have also been derived by mining train accident data [21].

Various equipment, machines, and human resources are interrelated in railway systems, which increases the complexity of railway accidents [22,23]. Therefore, network-based accident causation models have been developed in recent years to analyze the causes of accidents and model interdependent hazards [24]. The nodes in such a network represent hazard events or accidents, and the links indicate the relationships between hazards and accidents. A directed network has been formulated to analyze the causes of railway operational accidents [25,26]. The increasing number of accident reports has enabled the weights between hazards to be measured; hence, a directed weighted accident causation network has been proposed to investigate the complexity of accident causation [27]. Three hazard control strategies have been compared using a tailored accident causality network (ACN) [28]; however, a hazard control model was not formulated to identify the optimal hazards to be removed. Some entropy-based methods have been proposed to determine the critical nodes in real-world complex networks [29–31] such as power networks [32], biological networks [33], and transportation networks [34,35]; however, there is no method for identifying critical hazard nodes in ACNs from the perspective of both global optimization and accident prevention. Therefore, a hazard identification model was developed in this study to select the optimal hazards to be removed for railway accident prevention.

The selected critical hazards should block the paths from hazards to accidents or lengthen the distance from hazards to accidents to prevent accidents. However, although the previously proposed ACN models facilitate the understanding of railway accidents, the lengths of the edges could not be appropriately obtained; hence, the distance from hazards to accidents could not be measured. Herein, we propose a novel accident causation network model inspired by a Bayesian network model [36]; the model enables us to easily measure the weights and lengths of the links. Consequently, the shortest path from hazards to accidents can be obtained, and a critical hazard identification (CHI) model can be formulated. The best CHI model was selected by comparing the performances of different models. The proposed integer programming method performs best because it solves the CHI problem from a global perspective.

#### *1.3. Contributions and Organization*

The proposed model contributes to research pertaining to complex network-based accident analysis methods in three aspects:


The remainder of this paper is organized as follows. Section 2 introduces the ACN construction method, analyzes the objective of hazard control, and formulates the five CHI models. Section 3 presents a real-world case study to verify the effectiveness of the proposed ACN and CHI models. Finally, the conclusions and directions for future research are presented in Section 4.

#### **2. Problem Description and Formulation**

#### *2.1. ACN Model*

The first step in our study was to construct a causality network from accident investigation reports, which record many events. We focus on the description of the accident process because it contains the immediate cause and causal factors of accidents. The causal factors of accidents are typically defined as hazards. Several hazards can be extracted, and the causal relationships between hazards can be identified using various causality connectors [37]. For example, we can identify two textual causality connectors, "because" and "due to," in the sentence "the incident occurred because the driver of the first tram did not stop at the platform or stop signal due to a temporary loss of awareness." Therefore, the hazard event "loss of awareness" causes the hazard event "tram did not stop at the platform." We obtain one hazard pair ⟨*i*, *j*⟩ if hazard event *i* causes hazard event *j*. If hazards *i* and *j* are denoted by two nodes, then an arrow from hazard node *i* to node *j* can be added to represent their causal relationship. After manually extracting all hazard events and accidents from the set of accident investigation reports, we can construct an ACN for railway systems. Each node in the network corresponds to a hazard event or an accident. Each edge is associated with a cause-effect pair and is directed from the cause (hazard event) to the effect (hazard or accident). Table 1 defines the notations used in this paper to formulate the proposed model.

#### 2.1.1. Edge Weight Metrics

We can obtain the same hazard pairs from different accident reports. Therefore, the ACN is a weighted directed accident causality network (WDACN), as shown in Figure 1a. Let *S* denote the set of nodes in the WDACN. In the WDACN, the weight *w*(*i*, *j*) of the edge from node *i* to node *j* is equal to the number of occurrences of hazard pair ⟨*i*, *j*⟩ in the reports, *n<sub>ij</sub>*, i.e.,

$$w(i,j) = n\_{ij} \tag{1}$$
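As an illustration of this construction, the following minimal sketch builds a WDACN from a list of extracted hazard pairs with the networkx library; the pair list is hypothetical toy data loosely mirroring Figure 1a, not the paper's dataset.

```python
# Minimal WDACN construction sketch; hazard pairs below are hypothetical.
import networkx as nx

# Each tuple (i, j) records one extracted cause-effect pair <i, j> from a report.
hazard_pairs = [(1, 2), (1, 2), (2, 4), (2, 4), (2, 5), (4, 5), (4, 5), (3, 4)]

G = nx.DiGraph()
for i, j in hazard_pairs:
    if G.has_edge(i, j):
        G[i][j]["weight"] += 1       # Equation (1): w(i, j) = n_ij
    else:
        G.add_edge(i, j, weight=1)

print(G[1][2]["weight"])             # 2: the pair <1, 2> appears twice
```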

**Table 1.** Notations and abbreviations.


**Figure 1.** Accident causation network. (**a**) WDACN, (**b**) DACN, (**c**) ACN after reversing the weights of the edges, (**d**) normalized WDACN, (**e**) WLDACN.

The causation routes are defined as the sets of links from a source hazard node to an accident node. There may be more than one route from source hazard node *h* to accident node *a*, and the same route may appear several times. The frequency of causation route *k* from hazard node *h* to accident node *a*, $F_k(h, a)$, can be defined as $F_k(h, a) = \min_{(i,j) \in S_k^{h,a}} w(i, j)$, where $S_k^{h,a}$ is the set of nodes on causation route *k*.

As shown in Figure 1a, the causation route (1 → 2 → 5) has a lower frequency than the causation route (1 → 2 → 4 → 5). Therefore, the causation path (1 → 2 → 5) has a lower active probability than the causation path (1 → 2 → 4 → 5). In this study, the causation route with the highest active probability is defined as the most likely causation path (MPCP).

The shortest causation path (SCP) is often used to represent the MPCP from one node to another in a complex network. To analyze the interaction between hazards from the perspective of complex networks, the WDACN was previously simplified into a directed accident causation network (DACN) [28], as shown in Figure 1b. However, the frequency differences between hazards cannot be observed in the DACN. As shown in Figure 1b, the resulting MPCP from node 1 to node 5 is (1 → 2 → 5), which is contrary to the evidence from the WDACN; hence, this method cannot be used to model the interaction between hazard factors and accidents.

Another method is to preserve the frequency information by reversing the weight of the edge [27], as shown in Figure 1c. However, the SCP still cannot indicate the MPCP precisely because paths (1 → 2 → 5) and (1 → 2 → 4 → 5) have the same distance (i.e., 1/3), as shown in Figure 1c. Therefore, the SCP resulting from these two methods is not consistent with the MPCP observed from the WDACN.

#### 2.1.2. Edge Length Metrics

To realize the consistency between SCP and MPCP, we propose a new method to measure the length of the edges. First, the weight of the edge (*i*, *j*) in the WDACN can be normalized as follows:

$$p(i,j) = \frac{w(i,j)}{D\_i^{out}}, \tag{2}$$

where $D_i^{out}$ is the out-degree of node *i*. As shown in Figure 1d, the normalized weight *p*(*i*, *j*) can be interpreted as the probability that hazard node *j* becomes active given that node *i* is active. Let *H* and *A* denote the sets of hazard and accident nodes, respectively, and let $S_k^{h,a}$ denote the set of nodes on causation route *k* from hazard node *h* to accident node *a*. The active probability of causation route *k* can then be expressed as:

$$P\_k(h, a) = \prod\_{(i, j) \in S\_k^{h, a}} p(i, j) \tag{3}$$

*Pk*(*h*, *a*) reflects the conditional probability of causation path *k* given the occurrence of source hazard *h*. The MPCP from hazard node *h* to accident node *a* can be obtained by solving the following equation:

$$MPCP(h, a) \;= \text{argmax}(P\_k(h, a)), k = 1, 2, 3 \dots K \tag{4}$$

In order to transform the MPCP problem into an SCP problem, the transformation function should satisfy two conditions: (1) it is a monotonically decreasing function of the probability *p*(*i*, *j*); and (2) the uncertainty of route *k* can be represented as the sum of the uncertainties of its edges. Therefore, the logarithmic function is selected as the transformation function. Taking the negative natural logarithm of Equation (3) yields:

$$-\ln P\_k(h, a) = -\sum\_{i, j \in S\_k^{h, a}} \ln(p(i, j))\tag{5}$$

Equation (5) expresses the distance of route *k* as the sum of the lengths of its edges. Therefore, the length *l*(*i*, *j*) of edge (*i*, *j*) can be defined as follows:

$$l(i,j) = -\ln(p(i,j))\tag{6}$$

Figure 1e shows the lengths of the edges. The MPCP problem can thus be transformed into an SCP problem. The length of causation path *k* from hazard node *h* to accident node *a* can be expressed as:

$$L\_k(h, a) = -\ln P\_k(h, a) = l\_k(h, a), \tag{7}$$

where $l_k(h, a) = \sum_{(i,j) \in S_k^{h,a}} l(i, j)$. Therefore, the SCP can be obtained by solving the following problem:

$$\text{SCP}(h, a) \;= \text{argmin}(l\_k(h, a)), k = 1, 2, 3 \dots K \tag{8}$$

The SCP can be obtained using Dijkstra's algorithm and used to measure the difficulty with which a hazard factor causes an accident. The probability of hazard node *h* causing accident *a* is greater if *SCP*(*h*, *a*) is shorter. Finally, each edge (*i*, *j*) in the ACN exhibits three attributes: weight *w*(*i*, *j*), active probability *p*(*i*, *j*), and length *l*(*i*, *j*). Because the proposed ACN contains weight and length metrics and is a directed network, it is named the weighted directed accident causality network considering length (WLDACN).
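As a sketch of Equations (2), (6), and (8), the snippet below (continuing the hypothetical toy network above) assigns each edge its active probability and length and then runs Dijkstra's algorithm on the length attribute; the out-degree is taken here as the weighted out-degree so that the probabilities out of each node sum to one, which is an interpretation on our part.

```python
# Length metric and SCP sketch; continues the toy WDACN from the previous snippet.
import math
import networkx as nx

hazard_pairs = [(1, 2), (1, 2), (2, 4), (2, 4), (2, 5), (4, 5), (4, 5), (3, 4)]
G = nx.DiGraph()
for i, j in hazard_pairs:
    w = G.get_edge_data(i, j, {"weight": 0})["weight"]
    G.add_edge(i, j, weight=w + 1)

for i, j in G.edges:
    d_out = G.out_degree(i, weight="weight")  # weighted out-degree of node i
    p = G[i][j]["weight"] / d_out             # Equation (2): active probability
    G[i][j]["p"] = p
    G[i][j]["length"] = -math.log(p)          # Equation (6): l(i, j) = -ln p(i, j)

# Equation (8): SCP from hazard node 1 to accident node 5
print(nx.dijkstra_path(G, 1, 5, weight="length"))         # [1, 2, 4, 5]
print(nx.dijkstra_path_length(G, 1, 5, weight="length"))  # its distance
```

With these lengths, the SCP coincides with the MPCP: the more frequent route (1 → 2 → 4 → 5) is returned rather than the shorter but rarer (1 → 2 → 5).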

#### *2.2. CHI Method Development*

#### 2.2.1. Objective of CHI

The objective of CHI is to prevent accidents by reducing or eliminating hazards. The network efficiency of the ACN is used as an indicator of the difficulty of hazards causing accidents:

$$E = \sum\_{h \in H, a \in A} \frac{1}{\text{SCP}(h, a)}\tag{9}$$

A lower network efficiency corresponds to a longer SCP from hazards to accidents. Therefore, we should eliminate hazard–accident interactions by removing the nodes in the ACN. We assume that all hazards can be controlled or removed via technological development and management improvement. For example, equipment failures can be eliminated by improving the maintenance strategy, human factors can be controlled via safety training, and environmental hazards can be prevented by monitoring the operation surroundings. Additionally, the cost increases with the number of removed hazard nodes. Therefore, the maximum number of removed hazard nodes in the network should be defined.

Some hazard factors such as wind, snow, and rain cannot be controlled because they originate from the natural environment instead of the railway system. Therefore, we should discuss the controllability of hazard nodes. Hazard factors can be controlled if their causes can be determined, so only hazard nodes with parent nodes can be alleviated.
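A minimal sketch of the efficiency objective of Equation (9) and of this controllability filter follows, reusing the toy WLDACN from the snippets above; the function names are ours.

```python
# Network efficiency (Equation (9)) and controllable-node filter sketches.
import networkx as nx

def network_efficiency(G, hazards, accidents):
    """E = sum of 1 / SCP(h, a) over reachable hazard-accident pairs."""
    E = 0.0
    for h in hazards:
        if h not in G:
            continue
        dist = nx.single_source_dijkstra_path_length(G, h, weight="length")
        for a in accidents:
            if a in dist and dist[a] > 0:
                E += 1.0 / dist[a]
    return E

def controllable(G, hazards):
    """Only hazard nodes with at least one parent (a known cause) can be alleviated."""
    return [h for h in hazards if h in G and G.in_degree(h) > 0]
```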

#### 2.2.2. High Centrality Adaptive Methods

The high centrality adaptive method reduces the network efficiency by removing the hazard nodes with the highest centrality. Four metrics can be used to measure the centrality of nodes in a complex network: node degree, node betweenness, node closeness, and PageRank [38]. Accordingly, the high degree adaptive (HDA), high betweenness adaptive (HBA), high closeness adaptive (HCA), and high PageRank adaptive (HPA) methods are used to remove critical hazard nodes from the WLDACN.

The HDA method removes the node with the highest degree centrality (DC) in each iteration and recomputes the DCs of the remaining nodes after each removal. The DC of hazard node *h* can be expressed as:

$$D\_h = D\_h^{in} + D\_h^{out} = \sum\_{i \neq h} w(i, h) \ + \sum\_{j \neq h} w(h, j) \tag{10}$$

Based on Equation (10), a greater DC of hazard node *h* results in more causal relationships between hazard node *h* and other nodes. If a node with a higher DC is eliminated, then more causal relationships can be eliminated.

HBA sequentially removes the node with the highest betweenness centrality (BC) and recomputes the BC for the remaining nodes in each iteration. The BC of hazard node *h* can be expressed as:

$$B\_h = \sum\_{\substack{i \in H, j \in A \\ i \neq h, j \neq h}} \frac{\rho\_{ij}(h)}{\rho\_{ij}}, \tag{11}$$

where *ρ<sub>ij</sub>* is the number of shortest paths from hazard node *i* to accident node *j*, and *ρ<sub>ij</sub>*(*h*) is the number of those paths that pass through hazard node *h*.

Based on the definition of BC, a greater BC of hazard node *h* means that more SCPs from hazards to accidents pass through hazard *h*. If a node with a higher BC is eliminated, then more SCPs from hazards to accidents are removed.

The HCA method removes the node with the highest closeness centrality (CC) and updates the CC for the remaining nodes in each iteration. CC describes the proximity of a hazard node to all other nodes in the ACN. It is calculated as the reciprocal of the average distances from one hazard node to all accident nodes as follows:

$$C\_h = \frac{N\_A}{\sum\_{a \in A} \text{SCP}(h, a)}, \tag{12}$$

where *N<sub>A</sub>* is the number of accident nodes. Based on Equation (12), a greater *C<sub>h</sub>* results in fewer steps from hazard node *h* to accidents. If a node with a higher CC is eliminated, then the average SCP from hazards to accidents increases.

The HPA method deletes the node with the highest PageRank centrality and subsequently recomputes PageRank for the remaining nodes in each iteration. The PageRank of hazard node *h* can be expressed as:

$$R\_h = \sum\_{j \neq h} p(j, h)p(j) \tag{13}$$

Based on Equation (13), a greater value of *R<sub>h</sub>* results in a greater possibility of hazard node *h* being activated by other hazards. If nodes with higher PageRank values are eliminated, then the propagation probability from hazards to accidents decreases.

In fact, any hazard link including hazard *h* in the source dataset should be changed if hazard *h* is deleted. Assuming that we extract one hazard link ⟨*i*, *j*, *h*, *a*⟩ from one accident report, if hazard *h* has been removed by our CHI methods, then accident *a* cannot occur; hence, the hazard link should be deleted.
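One possible implementation of the four adaptive strategies is sketched below, recomputing the chosen centrality after every removal. networkx's standard centrality routines stand in for whatever implementation the authors used; in particular, the closeness variant here is networkx's default rather than the hazard-to-accident form of Equation (12). The helper functions come from the earlier snippet.

```python
# High centrality adaptive removal sketch (HDA/HBA/HCA/HPA).
import networkx as nx

def adaptive_removal(G, hazards, accidents, M, kind="betweenness"):
    G = G.copy()
    removed = []
    for _ in range(M):
        candidates = controllable(G, hazards)
        if not candidates:
            break
        if kind == "degree":           # HDA, weighted degree as in Equation (10)
            c = dict(G.degree(weight="weight"))
        elif kind == "betweenness":    # HBA, shortest paths on the length metric
            c = nx.betweenness_centrality(G, weight="length")
        elif kind == "closeness":      # HCA (networkx default variant)
            c = nx.closeness_centrality(G, distance="length")
        else:                          # HPA
            c = nx.pagerank(G, weight="weight")
        target = max(candidates, key=lambda n: c.get(n, 0.0))
        G.remove_node(target)          # centralities are recomputed next pass
        removed.append(target)
    return removed, network_efficiency(G, hazards, accidents)
```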

#### 2.2.3. Integer Programming Method

High centrality adaptive methods delete the node with the highest centrality at each iteration, which yields a locally optimal solution. Therefore, a model should be developed to obtain the globally optimal solution. From the economic and safety perspectives, the CHI problem is to identify the nodes whose removal minimizes the network efficiency under a budget on the number of removed nodes. An integer programming model for the CHI problem is formulated as follows:

$$\begin{array}{ll} \text{Min} & E \\ \text{s.t.} & \sum\_{i} x\_i \leq M \qquad (a) \\ & x\_i \in \{0, 1\}, \;\forall i \qquad (b) \end{array} \tag{14}$$

where *x<sub>i</sub>* is a decision variable for removing hazard node *i*; *x<sub>i</sub>* = 1 indicates that hazard node *i* is removed. Constraint (14a) states that a maximum of *M* nodes can be removed. Because the objective function of the CHI model cannot be transformed into a linear form, a heuristic algorithm is proposed to solve the integer programming model. The procedure of the algorithm is as follows:

Step 1: Based on the controllability of hazards, randomly select *M* controllable hazard nodes to be deleted from the accident causation network.

Step 2: For each hazard node in circular order, keep the other *M* − 1 hazard nodes fixed and identify the optimal replacement for the selected hazard node, i.e., update the selected position with the controllable hazard node in the WLDACN that minimizes the network efficiency.

Step 3: Terminate the algorithm when *M* consecutive position updates fail to reduce the network efficiency; otherwise, return to Step 2.
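Steps 1-3 admit a coordinate-descent (local search) reading, sketched below under that assumption; this reuses the helpers from the earlier snippets and is not the authors' exact code.

```python
# Coordinate-descent sketch of the heuristic for the integer program (14).
import random

def ipm_heuristic(G, hazards, accidents, M, seed=0):
    rng = random.Random(seed)
    candidates = controllable(G, hazards)
    solution = rng.sample(candidates, M)        # Step 1: random initial set

    def efficiency_without(nodes):
        H = G.copy()
        H.remove_nodes_from(nodes)
        return network_efficiency(H, hazards, accidents)

    best_E = efficiency_without(solution)
    stalled, k = 0, 0
    while stalled < M:                          # Step 3: stop after M stalls
        k = (k + 1) % M                         # Step 2: cycle over positions
        improved = False
        for cand in candidates:
            if cand in solution:
                continue
            trial = solution[:k] + [cand] + solution[k + 1:]
            E = efficiency_without(trial)
            if E < best_E:                      # keep the best swap at position k
                best_E, solution, improved = E, trial, True
        stalled = 0 if improved else stalled + 1
    return solution, best_E
```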

#### *2.3. Model Performance Comparison*

The objective of the CHI is to increase the overall average difficulty of hazards causing accidents. To compare the performances of the CHI methods from a local perspective, the average SCP (ASCP) from all hazards to each accident type, *ASCP*(*a*), and the ASCP from each hazard type to all accidents, *ASCP*(*H<sub>t</sub>*), were used as indicators to evaluate the proposed CHI methods. The ASCP from all hazards to accident type *a* can be formulated as follows:

$$ASCP(a) = \frac{\sum\_{h \in H} SCP(h, a)}{N\_H}, \tag{15}$$

where *N<sub>H</sub>* is the number of hazard nodes. If *ASCP*(*a*) decreases, then the overall probability of all hazards causing accident *a* increases; if it increases, that probability decreases.

Hazards can be classified into different types, and each hazard type comprises various hazard factors. The ASCP from hazard type *H<sub>t</sub>* to accident *a* can be formulated as follows:

$$ASCP(H\_t, a) = \frac{\sum\_{h \in H\_t} SCP(h, a)}{N\_{H\_t}}, \tag{16}$$

where $N_{H_t}$ is the number of hazard nodes belonging to hazard type *H<sub>t</sub>*. Similarly, the ASCP from hazard type *H<sub>t</sub>* to all accidents, *ASCP*(*H<sub>t</sub>*), can be formulated as:

$$ASCP(H\_t) = \frac{\sum\_{a \in A} ASCP(H\_t, a)}{N\_A} \tag{17}$$

If *ASCP*(*H<sub>t</sub>*) decreases, then the overall probability of hazard type *H<sub>t</sub>* causing accidents increases; if it increases, that probability decreases.
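A sketch of these indicators follows; the cap of 20 for hazard-accident pairs that become unreachable matches the assumption stated in Section 3.3.

```python
# ASCP indicator sketches (Equations (15)-(17)).
import networkx as nx

CAP = 20.0   # assumed ASCP when an accident can no longer be caused (Section 3.3)

def scp(G, h, a):
    try:
        return nx.dijkstra_path_length(G, h, a, weight="length")
    except (nx.NetworkXNoPath, nx.NodeNotFound):
        return CAP

def ascp_accident(G, hazards, a):                # Equation (15)
    return sum(scp(G, h, a) for h in hazards) / len(hazards)

def ascp_hazard_type(G, type_nodes, accidents):  # Equations (16) and (17)
    per_accident = [sum(scp(G, h, a) for h in type_nodes) / len(type_nodes)
                    for a in accidents]
    return sum(per_accident) / len(accidents)
```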

#### **3. Case Studies**

The proposed model was applied to the hazard management of railway systems. All computational experiments were conducted on a PC with a 2.8 GHz CPU running the Windows 10 operating system.

#### *3.1. Data Description*

The data in this study were obtained from RAIB (https://www.gov.uk/government/publications/raib-investigation-reports-and-bulletins (accessed on 1 September 2020)). We obtained 240 accident investigation reports from 2012 to 2020. The hazards were classified into four types: human (H), equipment and machine (EM), environment (E), and management (M). We extracted 20 H-type hazards (H01–H20), 53 EM-type hazards (EM01–EM53), 12 E-type hazards (E01–E12), and six M-type hazards (M01–M06). Eighteen types of accidents were obtained, as illustrated in Figure 2.

**Figure 2.** Collection of accidents from RAIB.

#### *3.2. ACN Construction and Analysis*

Figures 3 and 4 show the WLDACN based on the interaction between hazard factors and accidents. The weight was computed using Equation (1) and depicted near the edges in the network, as shown in Figure 3. The length of each edge was computed using Equation (6) and depicted near the edges of the network, as shown in Figure 4.

**Figure 3.** ACN with edge weight metrics.

**Figure 4.** ACN with edge length metrics.

The SCP is an important feature of the WLDACN because it reflects the overall probability of hazards causing accidents. Table 2 shows the distances from the four types of hazards to the 18 types of accidents. As shown in Table 2, the derailment accident (A04) had the shortest ASCP (4.44), which indicates that the accident most likely to result from hazards is derailment. Among the four types of hazards causing derailment, the H-type hazard had the shortest path length, which indicates that human failure can most easily cause derailment. The H-type hazards could cause 15 types of accidents more easily than the other types of hazards. The EM- and E-type hazards were most likely to cause derailment (A04), as indicated by their distance values of 3.54 and 5.62, respectively. M-type hazards were most likely to cause struck-by accidents (A06), as indicated by their distance value of 4.78.

**Table 2.** Distances from 4 types of hazards to 18 types of accidents.


Figure 5 shows the ASCP for each hazard type for accidents. As shown in Figure 5, EM-type hazards had the shortest distance to accidents (5.69) among the four types of hazards, which indicates that the most likely cause of accidents is equipment or machine failure. The second most likely cause of accidents is human-type hazards.

**Figure 5.** ASCP from each hazard type to accidents.

Figure 6 shows the ASCP from each H-type hazard node and E-type hazard node to accidents. As shown in Figure 6, the distances from different hazard nodes to accidents were different. H06 (rail line inspector did not identify problems in a timely manner) and H08 (track worker negligence) had shorter ASCPs to accidents than the other H-type hazard nodes. Therefore, hazards H06 and H08 can more easily cause railway accidents than the other H-type hazard nodes. E02 (water hazard) had a shorter ASCP to accidents than other E-type hazard nodes. Therefore, hazard E02 can more easily cause railway accidents than the other E-type hazard nodes.

**Figure 6.** ASCP from each H-type hazard node and E-type hazard node to accidents.

Railway safety managers should delete as many hazard nodes as possible through hazard control efforts. However, different ASCPs from each hazard node to accidents indicate different contributions of various hazards to railway accidents. Therefore, the CHI model should be used to obtain the critical hazard nodes.

#### *3.3. CHI Model Application and Comparison*

Based on the obtained WLDACN and the hazard controllability analysis, we obtained 64 controllable hazard nodes. The five proposed CHI methods, namely HBA, HCA, HDA, HPA, and the integer programming method (IPM), were compared by removing ten nodes each. The comparison results are shown in Figure 7. Among the four high centrality adaptive strategies, the HBA strategy is much more effective in preventing accidents than the other three methods because it removes more SCPs from hazards to accidents. It is difficult to distinguish the effectiveness of the HDA and HPA methods because their relative network efficiency depends on the number of removed nodes. The HCA strategy performed the worst among the four strategies. Therefore, among the high centrality adaptive strategies, we suggest using the HBA method to identify critical hazards in railway systems. However, the proposed IPM performed better than all high centrality adaptive strategies, as shown in Figure 7. The high centrality adaptive methods iteratively remove the node with the highest centrality; therefore, they only solve for one critical hazard node at each step, whereas the IPM obtains the critical hazard nodes from a global perspective. Consequently, the proposed IPM performed hazard management and accident prevention more effectively.

**Figure 7.** Comparison of different CHI methods.

Additionally, the ASCP was used to compare the performances of the five CHI methods. Figure 8 shows the ASCP from all hazards to each accident type after removing 10 hazard nodes. We assume that the ASCP is 20 if an accident can no longer be caused by hazards. As shown in Figure 8, some accidents cannot be caused by hazards after we remove 10 hazard nodes. Nine types of accidents, including A05, A12, A13, and A14, cannot be caused by hazards when the HBA method is adopted; eight types cannot be caused by hazards when the HCA method is used; nine types cannot be caused by hazards when the HDA method is used; eight types of accidents, including A05, A08, and A14, cannot be caused by hazards when the HPA method is adopted; and nine types of accidents, including A05, A08–A14, A16, and A18, cannot be caused by hazards when the IPM is adopted. Therefore, the HBA and IPM methods prevent accidents better than the other three methods.

**Figure 8.** ASCP from all hazards to each accident type.

Based on Equation (17), Figure 9 shows the ASCP from each hazard type to accidents after removing 10 hazard nodes. As shown in Figure 9, the IPM and HBA methods performed better than the other three methods because they increased the ASCPs of the EM-, E-, and M-type hazards more than the HPA, HCA, and HDA methods. The ASCP of the H-type hazards was lengthened to a higher level by the HDA method. However, the HDA method has a worse overall performance than the IPM and HBA methods, which implies that the proposed IPM and HBA methods significantly reduce the accident-causing probability of each hazard type. Therefore, the IPM and HBA methods can be selected as CHI models to identify the critical hazard nodes. The IPM reduced the probability of H-, EM-, and E-type hazard-related accidents to a lower level than the HBA method, whereas the HBA method reduced the probability of M-type hazard-related accidents to a lower level than the IPM. However, the overall ASCP resulting from the IPM was longer than that from the HBA method. Therefore, the proposed IPM is suggested as the CHI method for accident prevention in railway systems.

**Figure 9.** ASCP from each hazard type to accidents.

The hazard nodes to be alleviated were H02 (driver emergency brake failure), H04 (driver's operation mistake), H06 (rail line inspector did not identify problems in a timely manner), H12 (level crossing watchman's mistake), H13 (train maintainer's inadequate maintenance), EM02 (damaged track), EM04 (false signal displayed), EM24 (equipment failure signal), EM29 (train braking system failure), and M01 (poor safety education for workers). These comprised one M-type, five H-type, and four EM-type hazards. Although the 20 H-type hazards obtained from the accident reports constituted only 22% of all hazard nodes, the critical H-type hazards constituted 50% of the critical hazard nodes, which implies that human failure was the main cause of accidents. Based on the obtained critical hazard nodes and their parent nodes, hazard control measures can be implemented. For example, the top two causes of hazard event H02 were the conductor's mistake (H03) and a false signal displayed (EM04). For hazard node H03, we could not identify the causes from the accident reports; therefore, railway safety managers should employ professional or experienced conductors to reduce these mistakes. Additionally, hazard node EM04 is itself a critical hazard node, and it is mainly caused by EM03 (power supply failure). Therefore, the power supply departments of railway companies should strive to maintain a stable power supply.

The top 10 critical hazard nodes did not include E-type hazard nodes because the railway system is more robust to environmental changes than other transportation modes [39]. If the controllability of hazards were not analyzed, then E03 (wind) would constitute one of the top 10 critical hazard nodes. However, wind cannot be alleviated because it is a natural hazard; as such, railway companies strive to protect railway systems from wind damage through technological improvements. In summary, the controllability of hazard factors must be analyzed to obtain feasible hazard control measures.

#### *3.4. ACN Model Comparison*

The proposed WLDACN model uses a different length metric method from other network-based accident causation models such as DACN and WDACN. To validate the efficiency of the WLDACN, we compared the performances of the five CHI methods under three accident causation models. The comparison results are listed in Table 3.

**Table 3.** Difficulty of hazards causing accidents under different ACN models and CHI methods (bold values represent the optimal values).


All three accident causation models performed the best when they were implemented using the HBA and IPM methods. However, the proposed IPM performed slightly better than the HBA method, although different accident causation models were used. Because WDACN and DACN obtain the edge length by reversing or disregarding the weights of edges, as described in Section 2, the proposed WLDACN performed the best among the three accident causation models. The WDACN and WLDACN demonstrated identical performance when implemented using the HDA and HPA methods because they used the same method to compute the weights of the edges. In general, combining the IPM and WLDACN yielded the best accident prevention results.

#### *3.5. Limitation of the Method*

A railway company may use facilities with various levels of reliability or employ staff with different professional abilities, which may lead to different accident patterns. Therefore, the causality networks obtained from different railway companies may differ, and the obtained results only represent the dataset used in this paper. Although the causality-network-based critical hazard identification method can help safety managers understand the causes of railway accidents and avoid similar occurrences in the future, the number of accident reports determines the efficiency of the method. In fact, the railway industry has a long history, and we believe that the accumulated accident data are sufficiently large for the method to suggest critical hazards.

#### **4. Conclusions**

Accident causation models enable us to effectively understand the causes of railway accidents. In previous ACN models, the effects of the frequency of hazard events or accidents are typically simplified; hence, the MPCP from hazards to accidents is inconsistent with the obtained SCP. Therefore, we proposed a new ACN model, where the MPCP problem is transformed into an SCP problem. The nodes in the network can be classified as hazard or accident nodes. As such, we can use the distance from hazards to accidents to measure the difficulty of various hazards causing accidents; consequently, critical hazards can be identified.

The causes of railway accidents should be understood to prevent accidents through hazard control. Therefore, we proposed CHI models to select critical hazards. The controllability of hazards, which is often disregarded, was discussed to determine the variables in CHI models. In this study, five CHI models were proposed to identify the critical hazard nodes in the ACN. To compare the performances of the models, three indices that can measure the efficiency of accident prevention were proposed. Comparison results from the case study indicated that the proposed IPM and WLDACN models performed better than the other models.

Because the results from the proposed model were based on real-world data, they offer useful insights into the hazard management of railway systems. The proposed model can suggest critical hazard factors that should be controlled. Safety managers can select hazard control options based on the critical causes of accidents.

In future studies, more accident data will be used to validate the proposed CHI model. In our model, the results of each type of accident, such as damage, injury, or death, were not considered due to inadequate data. Therefore, more data regarding accident damage should be obtained to measure the severity of accidents, and the weight of distance from hazards to each type of accident can be measured. Furthermore, the management cost of each hazard can be differentiated in future models.

**Author Contributions:** Conceptualization, Z.Z. and Q.L.; methodology, Z.Z.; software, F.P.; validation, Q.L.; writing—original draft preparation, Z.Z.; project administration, Z.Z. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Key R&D Program of China, Grant/Award Number: 2018YFB1601200, National Natural Science Foundation of China under Grant 71801009, State Key Laboratory of Rail Traffic Control and Safety (Contract No.: RCS2021ZT004), Key Program of Beijing Polytechnic, Grant/Award Number: 2021Z018-KXZ.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** The data presented in this study are openly available at https://www.gov.uk/government/publications/raib-investigation-reports-and-bulletins (accessed on 6 July 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


*Article* **Estimation of Feeding Composition of Industrial Process Based on Data Reconciliation**

**Yusi Luan, Mengxuan Jiang, Zhenxiang Feng and Bei Sun \***

School of Automation, Central South University, Changsha 410083, China; 184611053@csu.edu.cn (Y.L.); jiangmengxuan@csu.edu.cn (M.J.); fengzxcsu@csu.edu.cn (Z.F.)

**\*** Correspondence: sunbei@csu.edu.cn

**Abstract:** For an industrial process, the estimation of the feeding composition is important for analyzing production status and making control decisions. However, actual measurements are inevitably contaminated by random or even gross errors, and the feeding composition is conventionally obtained via discrete, low-frequency manual testing. To address these problems, a feeding composition estimation approach based on a data reconciliation procedure is developed. To improve the accuracy of the variables, a novel robust M-estimator is first proposed. Then, an iterative robust hierarchical data reconciliation and estimation strategy is applied to estimate the feeding composition. The feasibility and effectiveness of the estimation approach are verified on a fluidized bed roaster. The proposed M-estimator showed better overall performance.

**Keywords:** data reconciliation; robust estimator; gross error detection; feeding composition

#### **1. Introduction**

Complete, accurate and reliable data measurements are important for process modeling, model identification, real-time online optimization and process control. However, the measured values of variables in the actual measurement process are inevitably contaminated by random errors or even gross errors. Therefore, the process measurements will deviate from the real values. In addition, the feeding composition, which is important for making control decisions, cannot be measured online due to high economic cost and technical limitations. Therefore, it is necessary to process the measurements to guarantee the reliability and observability of the system, upon which the feeding composition can be better estimated. Data reconciliation is an effective approach for this aim [1]. The principle of data reconciliation is to minimize the sum of squared deviations between the reconciled values of variables and their measurements while satisfying the material balances, heat balances and boundary constraints of the process variables. From a mathematical point of view, data reconciliation is a constrained optimization problem [2], which has been widely used in many important procedures such as in-line process monitoring [3,4], enhanced control strategy performance [5,6], parameter estimation [7], soft sensor applications [8], quality control [9] and industrial optimization [10,11], among others.

Traditional data reconciliation typically assumes that measurements contain only random errors with zero mean and normal distributions. Nonetheless, in practice the measurements may also be contaminated by gross errors. If gross errors are not processed, the accuracy of the data reconciliation results will be reduced; gross error detection is therefore particularly important for data reconciliation. There are currently three main strategies for handling measurements with gross errors. The first is sequential gross error detection and data reconciliation. A variety of statistical tests, including the global test [12], the measurement test [13], the nodal test [14], the generalized likelihood ratio test [15] and the principal component analysis test [16], are used for gross error detection. Following the detection and elimination of gross errors, traditional data reconciliation is carried out.
The second strategy is simultaneous gross error detection and data reconciliation. Yu et al. [17] proposed a support vector regression method for recursive simultaneous data reconciliation and gross error detection. Zhang et al. [18] used a novel particle filter (PF) algorithm based on the measurement test (MT) to solve dynamic simultaneous gross error detection and data reconciliation. Yuan et al. [19] established a new hierarchical Bayesian framework, which can simultaneously estimate the real values of process variables and obtain the magnitudes of gross errors. The last strategy is robust data reconciliation, in which estimates close to the true values are obtained by reducing the influence of gross errors on the data reconciliation results. Tjoa and Biegler [20] constructed an M-estimator based on a mixed distribution, which takes both random errors and gross errors into account, and minimized the M-estimator as an objective function. Johnson and Kramer [21] presented a new robust method for data reconciliation based on a probability bootstrapping theory. Arora and Biegler [22] evaluated the Fair and Hampel estimators in stationary and dynamical constrained problems; Hampel's three-part estimator presented the best performance. Wang and Romagnoli [23] designed a partially adaptive estimator based on a generalized T distribution and a fully adaptive estimator based on density estimation by studying the robustness of a set of existing estimators. Özyurt and Pike [24] analyzed the performance of a number of robust estimators such as Hampel, Cauchy and Fair. Ragot et al. [25] adopted a cost function improved on the basis of the contaminated probability density distribution model for the design of a robust estimator. Prata et al. [26] applied a strategy based on particle swarm optimization (PSO) and the robust Welsch estimator to the real-time online monitoring of the bulk polymerization of propylene. Zhang et al. [27] constructed a robust estimator using quasi-weighted least squares. Through the comparative analysis of various methods such as the Welsch, quasi-weighted least squares and comentropy M-estimators, the feasibility and effectiveness of robust estimators in simultaneous gross error detection and data reconciliation were demonstrated [28]. Alighardashi et al. [29] proposed a maximum likelihood framework for simultaneous data reconciliation and gross error detection for steady-state data. Xie et al. [30] utilized a novel robust estimator to improve the robustness of data reconciliation. Forty-eight different robust M-estimators can be found in a recent review study, including the Fair, Cauchy, Biweight, Jin, Welsch and Xie estimators [31].

The key to estimating the feeding composition based on data reconciliation lies in the robust estimator and the design of the estimation strategy. An excellent robust estimator has two properties: its objective function is bounded with a small bound value, and its influence function converges quickly, ideally to zero. The objective functions of the Cauchy [32] and Fair [33] estimators increase with the standardized measurement residuals, indicating that gross errors will greatly affect the reconciled results; besides, the convergence speed of their influence functions is very slow and their bound values are large, which shows that these two estimators cannot suppress the influence of gross errors on the reconciled results. The objective function of the Welsch [26] estimator is bounded, but the bound value is large, and its influence function does not converge rapidly. Although the influence function of the Xie [30] estimator converges to zero and its convergence speed is improved compared with the estimators above, it is still not fast enough. Based on robust estimation theory, a novel robust estimator is therefore proposed to enhance the performance of robust estimators: its objective function is bounded with a smaller bound value, and its influence function converges toward zero faster. At the same time, aiming at the problem that the change in feeding composition caused by the long testing period of industrial processes is unknown, an iterative robust hierarchical data reconciliation and estimation strategy based on the heat balance is proposed. To further verify the robustness and effectiveness of the proposed robust estimator and the feeding composition estimation strategy, a series of comparative experiments are carried out on two numerical examples and real data from a fluidized bed roaster for zinc smelting.

After these initial remarks on the motivation and applications of data reconciliation, the paper is organized as follows. Section 2 presents preliminaries on traditional data reconciliation and robust estimators. The novel robust estimator is proposed in Section 3, where two numerical examples demonstrate its effectiveness. Section 4 provides an iterative robust hierarchical data reconciliation and estimation strategy for estimating the feeding composition. The feeding composition of a fluidized bed roaster is then estimated in Section 5. Finally, Section 6 provides the conclusions.

#### **2. Preliminaries**

#### *2.1. Data Reconciliation*

The purpose of data reconciliation is to obtain reconciled data by processing the raw measurements. The reconciled data not only satisfy the constraints of the process model but are also closer to the true values. There are three main assumptions for a system using data reconciliation: the system is in a steady state, the measurement errors follow a normal distribution with zero mean, and the measured variables are independent of each other. Taking into account the presence of measurement errors, the measurement model may be expressed as follows:

$$x = x^\* + \varepsilon \tag{1}$$

where *x* denotes the vector of raw process measurements, *x*\* denotes the vector of true values of the process variables, and *ε* is the vector of random measurement errors.

Based on the principle of data reconciliation, the general steady-state data reconciliation problem can be stated as a form of solving the weighted least squares solution satisfying the process model and boundary constraints:

$$\begin{array}{ll}\min & (\boldsymbol{x} - \boldsymbol{\widetilde{x}})^T \boldsymbol{\Sigma}^{-1} (\boldsymbol{x} - \boldsymbol{\widetilde{x}})\\ \text{s.t.} & F(\boldsymbol{\widetilde{x}}, \boldsymbol{u}) = \boldsymbol{0} \\ & G(\boldsymbol{\widetilde{x}}, \boldsymbol{u}) \le \boldsymbol{0} \end{array} \tag{2}$$

where $\widetilde{x}$ is the vector of reconciled data, Σ is the diagonal covariance matrix of the measurement errors, and *u* is the vector of unmeasured variables; *F* represents the process model, which supplies the equality constraints of the optimization problem, and *G* denotes the inequality constraints indicating variable boundaries.
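As a toy illustration of Equation (2), the sketch below reconciles three flow measurements around a single hypothetical mass-balance node (x1 = x2 + x3) with scipy; the measurement values and standard deviations are invented for the example.

```python
# Weighted least squares data reconciliation sketch for x1 = x2 + x3.
import numpy as np
from scipy.optimize import minimize

x_meas = np.array([10.6, 4.1, 6.2])   # raw measurements (hypothetical)
sigma = np.array([0.2, 0.1, 0.15])    # measurement standard deviations

def objective(x_hat):                  # Equation (2) with diagonal covariance
    return np.sum(((x_meas - x_hat) / sigma) ** 2)

constraints = [{"type": "eq", "fun": lambda x: x[0] - x[1] - x[2]}]  # F(x) = 0
bounds = [(0.0, None)] * 3             # G: flows are non-negative

res = minimize(objective, x_meas, bounds=bounds, constraints=constraints)
print(res.x)                           # reconciled data satisfying the balance
```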

#### *2.2. M-Estimator*

In the classical weighted least squares formulation of data reconciliation, Equation (2), it is usually assumed that the process measurements contain only random errors. However, in the actual measurement process, the measured values may contain gross errors. The presence of gross errors has a serious effect on conventional data reconciliation, resulting in the propagation of gross errors. This causes the reconciled data to violate the process model and deviate seriously from the true values of the variables. As a result, a robust estimator is used to account for gross errors in measurements.

Currently, there are numerous approaches for robust data reconciliation, the majority of which are based on the theory of the M-estimator. A function of the measurement residuals is constructed as the objective function of the robust estimator. By reducing the weight of measurements with gross errors, it prevents gross errors from smearing other measurements. Hence, a robust estimator can reconcile data containing gross errors well. The problem of robust data reconciliation can be expressed as follows:

$$\begin{array}{l}\min \sum\_{i=1}^{n} \rho(\xi\_i) = \min \sum\_{i=1}^{n} \rho\left(\frac{x\_i - \widetilde{x}\_i}{\sigma\_i}\right)\\ \text{s.t. } F(\widetilde{x}, u) = 0\\ \widetilde{x}\_{i\min} \le \widetilde{x}\_i \le \widetilde{x}\_{i\max}, \; i = 1, 2, \dots, n\\ u\_{l\min} \le u\_l \le u\_{l\max}, \; l = 1, 2, \dots, N - n \end{array} \tag{3}$$

where *ρ* is the robust estimator and $\xi_i = (x_i - \widetilde{x}_i)/\sigma_i$ is the standardized residual of the *i*th measured variable; $x_i$, $\widetilde{x}_i$ and $u_l$ are, respectively, the measured data and the reconciled data of the *i*th measured variable and the estimated value of the *l*th unmeasured variable; $\widetilde{x}_{i\min}$, $\widetilde{x}_{i\max}$, $u_{l\min}$ and $u_{l\max}$ are, respectively, the lower and upper limits of the *i*th reconciled variable and the lower and upper limits of the *l*th unmeasured variable; *N*, *n* and *N* − *n* denote the total number of variables, the number of measured variables and the number of unmeasured variables, respectively; and $\sigma_i$ represents the standard deviation of the measurement errors of variable *i*.

Most M-estimators are not based on a clearly defined probability distribution known in advance; rather, they are built on a simple mathematical structure. The mathematical structure of an M-estimator, i.e., *ρ*, has the following general characteristics:


As a crucial index for assessing the robustness of an M-estimator, the influence function (IF) [34] is directly proportional to the derivative of the objective function of the estimator. The influence function value reflects the effect of deviations of different sizes on the estimator. In general, it is defined as the first derivative of the objective function with respect to the standardized residual, which can be expressed as follows:

$$\text{IF}(\xi) = \frac{d\rho(\xi)}{d\xi} \tag{4}$$

Some general characteristics of IF [35] are:


On the basis of robust M-estimation theory, the influence function needs to be continuous and bounded. When *ξ* is small, the influence function should be proportional to *ξ*; when *ξ* tends to infinity, the influence function should converge to a constant, which indicates that the robust estimator can suppress the effect of gross errors on the reconciled results. The influence function of the weighted least squares estimator is the standardized residual *ξ* itself, so it grows without bound as the measurement errors increase. Hence, the weighted least squares estimator is not robust: its influence function is unbounded, and the reconciled results are easily affected by gross errors.

#### **3. Data Reconciliation Based on a Novel Robust Estimator**

#### *3.1. A Novel Robust Estimator*

Most robust estimators are constructed according to the following requirements: the objective function of the estimator is bounded, and the bound value is small; the influence function rapidly converges to a constant. Observation of the mathematical structure of the 48 robust estimators in the literature [31] shows that their objective functions may be unbounded or bounded, piecewise or non-piecewise, and may have a single parameter or multiple parameters. Analyzing the M-estimators whose objective functions are bounded, non-piecewise and single-parameter, it is found that those with a relatively rapid convergence rate basically contain a mathematical expression with *e* as the base. The derivative of the exponential function with base *e* is equal to itself. Moreover, *e*<sup>−*ξ*</sup> decreases with increasing *ξ* and tends toward 0; when *ξ* tends toward 0, *e*<sup>−*ξ*</sup> tends toward 1. Therefore, the objective function is bounded if it includes the structure 1 − *e*<sup>−*ξ*</sup>, as in the Welsch estimator, whose objective function is presented in Equation (5). In order to improve the robustness of the M-estimator, Xie divided 1 − *e*<sup>−*ξ*</sup> by 1 + *e*<sup>−*ξ*</sup>. As 1 + *e*<sup>−*ξ*</sup> decreases, the overall rise speed of the objective function increases, and the convergence of the influence function is accelerated. The objective function of the Xie estimator is given in Equation (6).

Welsch:

$$\rho(\xi\_i) = \frac{c\_w^2}{2} \left( 1 - \exp\left(-\left(\frac{\xi\_i}{c\_w}\right)^2\right) \right) \tag{5}$$

Xie:

$$\rho(\xi\_i) = \frac{1 - \exp\left(-\left(\frac{\xi\_i}{c\_x}\right)^2\right)}{1 + \exp\left(-\left(\frac{\xi\_i}{c\_x}\right)^2\right)}\tag{6}$$

where *c<sub>w</sub>* and *c<sub>x</sub>* are, respectively, the tuning parameters of the Welsch and Xie estimators.
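For reference, the two objective functions translate directly into code; the tuning constants below are placeholders (2.9846 is the value commonly quoted for the Welsch estimator at 95% relative efficiency, but treat both as assumptions).

```python
# Welsch (Equation (5)) and Xie (Equation (6)) objective function sketches.
import numpy as np

def rho_welsch(xi, c_w=2.9846):
    # bounded above by c_w**2 / 2 as xi grows
    return (c_w ** 2 / 2.0) * (1.0 - np.exp(-((xi / c_w) ** 2)))

def rho_xie(xi, c_x=1.0):
    # bounded above by 1 as xi grows
    t = np.exp(-((xi / c_x) ** 2))
    return (1.0 - t) / (1.0 + t)
```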

As discussed above, in order to limit the effect of gross errors on the reconciled results, a novel robust estimator is created based on the general characteristics of the M-estimator and influence function. The objective function of the novel robust estimator is defined as follows:

$$\rho(\xi\_i) = \frac{c\_p^2 \left[1 - \exp\left(-\left(\xi\_i/c\_p\right)^2\right)\right]}{4\left[1 + \exp\left(-\left(\xi\_i/c\_p\right)^4\right)\right]} \tag{7}$$

where *c<sub>p</sub>* is the tuning parameter and the definition of *ξ<sub>i</sub>* is the same as above. By increasing the exponent of *ξ*, the rate of decrease of 1 + *e*<sup>−*ξ*</sup> is increased; the objective function then rises faster, and the influence function converges to 0 faster. At the same time, in order to satisfy the characteristic that the influence function is almost linear near the origin, the objective function is multiplied by $c_p^2/4$. The influence function of the novel robust estimator defined in Equation (7) is as follows:

$$\text{IF}(\xi\_i) = \frac{d\rho(\xi\_i)}{d\xi\_i} = \frac{\xi\_i \exp\left(-\left(\frac{\xi\_i}{c\_p}\right)^2\right) \left[1 + \exp\left(-\left(\frac{\xi\_i}{c\_p}\right)^4\right) + \frac{2\xi\_i^2}{c\_p^2} \left(1 - \exp\left(-\left(\frac{\xi\_i}{c\_p}\right)^2\right)\right)\right]}{2\left[1 + \exp\left(-\left(\frac{\xi\_i}{c\_p}\right)^4\right)\right]^2} \tag{8}$$

Equations (7) and (8) show that when the standardized residual is small, the objective function of the novel robust estimator tends gradually to zero and the influence function is proportional to the standardized residual; in this regime the estimator behaves like the weighted least squares method, which is appropriate when the measurements contain only random errors. When the standardized residual is large, the objective function of the novel robust estimator gradually approaches a constant and the influence function quickly converges to zero. The novel robust estimator thus adjusts the weight of each measurement according to the size of its standardized residual when the measurements contain gross errors: the larger the gross error, the lower the weight, and the stronger the suppression of the gross error. The proposed estimator is therefore more robust.
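A quick numerical illustration of these properties is sketched below (a minimal sketch assuming NumPy; the value *c<sub>p</sub>* = 1.5424 anticipates the tuning result of Section 3.2). It evaluates the objective function of Equation (7) and approximates the influence function by central differences:

```python
import numpy as np

def rho(xi, cp=1.5424):
    # Objective function of the novel robust estimator, Equation (7)
    num = cp**2 * (1.0 - np.exp(-(xi / cp)**2))
    den = 4.0 * (1.0 + np.exp(-(xi / cp)**4))
    return num / den

def influence(xi, cp=1.5424, h=1e-6):
    # Influence function IF = d(rho)/d(xi), approximated by central differences
    return (rho(xi + h, cp) - rho(xi - h, cp)) / (2.0 * h)

xi = np.array([0.1, 1.0, 4.0, 10.0, 100.0])
print(rho(xi))        # bounded: approaches cp**2 / 4 as the residual grows
print(influence(xi))  # nearly linear near 0, redescends to 0 for large residuals
```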

#### *3.2. The Tuning Parameter of the Novel Robust Estimator*

The use of robust estimators in data regression typically involves a trade-off between relative efficiency and robustness [36]. Relative efficiency refers to the quality of the estimates produced by an M-estimator when the errors follow a given distribution relative to the reference distribution (usually assumed to be the Normal distribution). Robustness refers to the performance of the M-estimator under various non-normal error distributions; the more robust an M-estimator is, the lower its relative efficiency. The balance between robustness and relative efficiency is set by the tuning parameters. When comparing the performance of different M-estimators, the level of relative efficiency and the reference distribution should be consistent. In general, the tuning parameter of each M-estimator is obtained at a relative efficiency level of 95%, and the robustness of the distinct M-estimators is then judged.

The mathematical definition of relative efficiency is illustrated in Equation (9):

$$E\_{ff}[\text{IF}(\xi), f(\xi)] = \frac{V\_f[\text{IF}\_f(\xi), f(\xi)]}{V[\text{IF}(\xi), f(\xi)]} \tag{9}$$

where *V<sub>f</sub>* is the asymptotic variance of the reference estimator, *V* is the asymptotic variance of the specified estimator, *f*(*ξ*) is the reference probability distribution of the errors, IF<sub>f</sub>(*ξ*) is the influence function of the reference estimator and IF(*ξ*) is the influence function of the specified estimator. The asymptotic variance *V* is defined as follows [37]:

$$V[\text{IF}(\xi), f(\xi)] = \frac{\int\_{-\infty}^{+\infty} \text{IF}^2(\xi) f(\xi) d\xi}{\left[\int\_{-\infty}^{+\infty} \text{IF}'(\xi) f(\xi) d\xi\right]^2} \tag{10}$$

where IF′(*ξ*) is the derivative of the influence function IF(*ξ*). Since IF′(*ξ*) may be discontinuous, Equation (10) can be further expressed, via integration by parts, as Equation (11):

$$V[\text{IF}(\xi), f(\xi)] = \frac{\int\_{-\infty}^{+\infty} \text{IF}^2(\xi) f(\xi) d\xi}{\left[ - \int\_{-\infty}^{+\infty} \text{IF}(\xi) f'(\xi) d\xi \right]^2} \tag{11}$$

where *f*′(*ξ*) is the derivative of *f*(*ξ*).

The tuning parameter of the proposed estimator at the relative efficiency level of 95% with respect to the Normal distribution is calculated as follows:

(1) The Normal distribution is selected as the reference distribution, and its probability density and first derivative are, respectively, expressed as follows:

$$f(\xi) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) \tag{12}$$

$$f'(\xi) = -\xi \frac{1}{\sigma \sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) = -\xi f(\xi) \tag{13}$$

(2) The least squares estimator is selected as the reference estimator, and its influence function is expressed as follows:

$$\text{IF}\_f(\xi) = \frac{d\rho(\xi)}{d\xi} = \frac{d}{d\xi}\left(\frac{\xi^2}{2}\right) = \xi \tag{14}$$

(3) The influence function of the proposed estimator is shown in Equation (8). The expressions of *V<sup>f</sup>* and *V* can be obtained by using Equation (11):

$$V\_f = \frac{2 \cdot \int\_0^{+\infty} \mathrm{IF}\_f^2(\xi) f(\xi) d\xi}{\left(2 \cdot \int\_0^{+\infty} \mathrm{IF}\_f(\xi) \xi f(\xi) d\xi\right)^2} \tag{15}$$

$$V = \frac{2 \cdot \int\_0^{+\infty} \mathrm{IF}^2(\xi) f(\xi) d\xi}{\left(2 \cdot \int\_0^{+\infty} \mathrm{IF}(\xi) \xi f(\xi) d\xi\right)^2} \tag{16}$$

(4) The expression for the relative efficiency can be obtained from Equation (9). Setting this expression equal to 0.95 yields a univariate equation in the tuning parameter *c<sub>p</sub>*, as shown in Equation (17). By numerical calculation, the tuning parameter at the relative efficiency level of 95% with respect to the Normal distribution is 1.5424.

$$\frac{V\_f}{V} = \frac{2\int\_0^{+\infty} \xi^2 \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) d\xi \Big/ \left[2\int\_0^{+\infty} \xi^2 \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) d\xi\right]^2}{2\int\_0^{+\infty} \text{IF}^2(\xi) \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) d\xi \Big/ \left[2\int\_0^{+\infty} \text{IF}(\xi)\,\xi\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{\xi^2}{2}\right) d\xi\right]^2} = 0.95 \tag{17}$$

where IF(*ξ*) is the influence function of the proposed estimator given in Equation (8).
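This calculation can be reproduced numerically. The sketch below (assuming SciPy; the influence function is obtained by numerically differentiating Equation (7), and the root-search bracket is an assumption) solves Equation (17) for *c<sub>p</sub>*:

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import brentq

def phi(x):
    # standard Normal density, the reference distribution f(xi)
    return np.exp(-x**2 / 2.0) / np.sqrt(2.0 * np.pi)

def rho(x, cp):
    # objective function of the proposed estimator, Equation (7)
    return cp**2 * (1.0 - np.exp(-(x / cp)**2)) / (4.0 * (1.0 + np.exp(-(x / cp)**4)))

def influence(x, cp, h=1e-6):
    # influence function as the numerical derivative of Equation (7)
    return (rho(x + h, cp) - rho(x - h, cp)) / (2.0 * h)

def asymptotic_variance(cp):
    # Equation (16); both integrands are even, so integrate over [0, inf)
    num = 2.0 * quad(lambda x: influence(x, cp)**2 * phi(x), 0.0, np.inf)[0]
    den = 2.0 * quad(lambda x: influence(x, cp) * x * phi(x), 0.0, np.inf)[0]
    return num / den**2

# For the least squares reference, IF_f(xi) = xi and V_f = 1 under the standard
# Normal, so the relative efficiency of Equation (9) reduces to 1 / V(cp).
cp_95 = brentq(lambda c: 1.0 / asymptotic_variance(c) - 0.95, 0.5, 5.0)
print(cp_95)  # should land near the reported 1.5424
```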

To illustrate the trade-off between the relative efficiency and robustness of an M-estimator, comparison plots of the objective functions and influence functions of the proposed, Xie and Welsch estimators at relative efficiency levels of 90%, 95%, 98% and 99% are shown in Figure 1. The tuning parameters of the three estimators at the different relative efficiency levels are presented in Table 1. Table 1 and Figure 1 show that the higher the relative efficiency level, the larger the tuning parameters of all three estimators; and the larger the tuning parameter, the slower the influence function converges to 0 and the weaker the robustness. Hence, when comparing the robustness of distinct estimators, the comparative analysis has to be performed at the same level of relative efficiency.


**Figure 1.** Characteristic functions of three estimators at distinct relative efficiency levels.

**Table 1.** Tuning parameters of three estimators at distinct relative efficiency levels.

| M-estimator | *E<sub>ff</sub>* = 90% | *E<sub>ff</sub>* = 95% | *E<sub>ff</sub>* = 98% | *E<sub>ff</sub>* = 99% |
|---|---|---|---|---|
| Welsch (*c<sub>w</sub>*) | 2.3828 | 2.9846 | 3.9077 | 4.7343 |
| Xie (*c<sub>x</sub>*) | 1.6705 | 1.9597 | 2.3409 | 2.6359 |
| Proposed (*c<sub>p</sub>*) | 1.3082 | 1.5424 | 2.0942 | 2.6060 |

To further analyze the effectiveness of the proposed estimator, its objective function and influence function are compared with those of four other robust estimators: Fair, Cauchy, Welsch and Xie. The objective functions of Fair and Cauchy are shown in Equations (18) and (19):

Fair:

$$\rho(\xi\_i) = c\_f^2 \left( \frac{|\xi\_i|}{c\_f} - \ln \left( 1 + \frac{|\xi\_i|}{c\_f} \right) \right) \tag{18}$$

Cauchy:

$$\rho(\xi\_i) = c\_c^2 \ln \left( 1 + \frac{\xi\_i^2}{c\_c^2} \right) \tag{19}$$

where *c<sub>f</sub>* and *c<sub>c</sub>* are the tuning parameters of the Fair and Cauchy estimators, respectively. The tuning parameters of the five M-estimators at the 95% relative efficiency level are *c<sub>f</sub>* = 1.3998, *c<sub>w</sub>* = 2.9846, *c<sub>c</sub>* = 2.3849, *c<sub>x</sub>* = 1.9597 and *c<sub>p</sub>* = 1.5424. The results of the comparison are illustrated in Figures 2 and 3.



**Figure 2.** Objective functions for Xie, Welsch, Cauchy, Fair and Proposed estimator.

**Figure 3.** Influence functions for Xie, Welsch, Cauchy, Fair and Proposed estimator.

As can be seen from Figures 2 and 3, the objective function and influence function of the proposed M-estimator satisfy their general characteristics. When the standardized residual is small, the objective and influence functions of the several robust estimators are all close, which indicates that small measurement errors have little effect on these estimators. When the standardized residual is large, the objective functions of the Fair and Cauchy estimators increase at a higher rate, and their influence functions do not converge rapidly, eventually converging to a nonzero value; both estimators are therefore sensitive to gross errors. Although the objective function of Welsch does not diverge, it tends to a larger constant than those of Xie and the novel robust estimator. The objective functions of the novel robust estimator and the Xie estimator are smaller than those of the other three estimators when the measurements contain gross errors, and the proposed M-estimator tends to a constant lower than that of Xie. Figure 3 further shows that, in the presence of gross errors, the influence function of the proposed M-estimator decreases much faster than that of Xie: it converges to 0 once the standardized residual exceeds 4, whereas that of Xie converges to 0 only when the standardized residual is above 5. The influence function of the novel robust estimator thus converges faster than that of Xie, and its suppression of gross errors is excellent. The proposed method is therefore more robust and efficient than the other estimators.

Next, two numerical examples are used to further verify the effectiveness of the proposed robust estimator, and the proposed robust estimator is compared with several estimators. The data reconciliation procedure can be interpreted mathematically as an optimization problem. In this paper, the state transition algorithm (STA) is applied as the optimization method for data reconciliation problems [38]. The local search operator, the global search operator and the heuristic search operator are adopted as the state transformation operators. This can prevent the solution process from falling into the local opti-

In this part, the measurement network [26] as shown in Figure 4 is adopted. The network is a linear structure consisting of four nodes and seven streams. As can be seen from

mum and reduce the search time for the global optimum.

Figure 4, the seven streams meet the following material balance


#### *3.3. Linear Case*

In this part, the measurement network of [26], shown in Figure 4, is adopted. The network has a linear structure consisting of four nodes and seven streams. As can be seen from Figure 4, the seven streams satisfy the following material balance:

$$\begin{cases} \mathbf{x}\_1 - \mathbf{x}\_2 + \mathbf{x}\_4 + \mathbf{x}\_6 = \mathbf{0} \\ \mathbf{x}\_2 - \mathbf{x}\_3 = \mathbf{0} \\ \mathbf{x}\_3 - \mathbf{x}\_4 - \mathbf{x}\_5 = \mathbf{0} \\ \mathbf{x}\_5 - \mathbf{x}\_6 - \mathbf{x}\_7 = \mathbf{0} \end{cases} \tag{20}$$


where *x* = [*x*<sub>1</sub>, *x*<sub>2</sub>, . . . , *x*<sub>7</sub>]<sup>*T*</sup> is the vector of stream variables. In this linear example, all variables are assumed to be measured, and the true values are X = [5, 15, 15, 5, 10, 5, 5]. The corresponding measurements are obtained by adding random errors to the true value of each variable, with the standard deviation of the random errors set to 2.5% of the true value of each stream. The diagonal matrix of the variances is as follows:

$$\Sigma = 0.01 \times \mathrm{diag}(1.562, 14.062, 14.062, 1.562, 6.250, 1.562, 1.562) \tag{21}$$

**Figure 4.** Diagram of the measurement network.
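A minimal reconciliation sketch for this network is given below, assuming SciPy's SLSQP solver in place of the STA used in the paper; the random seed is arbitrary, and the gross errors on streams 2 and 5 anticipate Section 3.3.1:

```python
import numpy as np
from scipy.optimize import minimize

# Constraint matrix encoding the material balance of Equation (20)
A = np.array([
    [1, -1,  0,  1,  0,  1,  0],
    [0,  1, -1,  0,  0,  0,  0],
    [0,  0,  1, -1, -1,  0,  0],
    [0,  0,  0,  0,  1, -1, -1],
], dtype=float)

x_true = np.array([5.0, 15.0, 15.0, 5.0, 10.0, 5.0, 5.0])
sigma = 0.025 * x_true                       # Equation (21)
rng = np.random.default_rng(seed=0)
x_meas = x_true + rng.normal(0.0, sigma)     # random errors
x_meas[1] += 2.0                             # gross error on stream 2
x_meas[4] += 1.0                             # gross error on stream 5

cp = 1.5424                                  # tuning parameter from Section 3.2
def rho(xi):
    # objective function of the proposed estimator, Equation (7)
    return cp**2 * (1 - np.exp(-(xi / cp)**2)) / (4 * (1 + np.exp(-(xi / cp)**4)))

def objective(x):
    return np.sum(rho((x_meas - x) / sigma))

res = minimize(objective, x0=x_meas, method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda x: A @ x}])
print(res.x)  # reconciled flows, close to x_true despite the gross errors
```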


For a precise analysis of the performance of each robust estimator, the SSE (sum of squares due to error), TER (total error reduction) and RER (relative error reduction) [27] are used as indicators to assess the precision of the reconciled results. They are defined as follows:

$$\text{SSE} = \sum\_{i=1}^{n} \left( \tilde{x}\_{i} - x\_{i}^{\*} \right)^{2} + \sum\_{j=1}^{m} \left( \tilde{u}\_{j} - u\_{j}^{\*} \right)^{2} \tag{22}$$

$$\text{TER} = \frac{\sqrt{\sum\_{i=1}^{n} \left(\frac{x\_i - x\_i^\*}{\sigma\_i}\right)^2} - \sqrt{\sum\_{i=1}^{n} \left(\frac{\tilde{x}\_i - x\_i^\*}{\sigma\_i}\right)^2}}{\sqrt{\sum\_{i=1}^{n} \left(\frac{x\_i - x\_i^\*}{\sigma\_i}\right)^2}} \tag{23}$$

$$\text{RER} = \frac{\sum\_{i=1}^{n} \left( \frac{\left| \mathbf{x}\_{i}^{\*} - \mathbf{x}\_{i} \right|}{\mathbf{x}\_{i}^{\*}} - \frac{\left| \mathbf{x}\_{i}^{\*} - \widetilde{\mathbf{x}}\_{i} \right|}{\mathbf{x}\_{i}^{\*}} \right)}{\sum\_{i=1}^{n} \frac{\left| \mathbf{x}\_{i}^{\*} - \mathbf{x}\_{i} \right|}{\mathbf{x}\_{i}^{\*}}} \tag{24}$$


where *x̃<sub>i</sub>* represents the reconciled data for the *i*th measurable variable; *x*<sub>*i*</sub><sup>∗</sup> represents the true value for the *i*th measurable variable; *x<sub>i</sub>* represents the measured data for the *i*th measurable variable; *ũ<sub>j</sub>* is the estimated value of the *j*th unmeasurable variable; *u*<sub>*j*</sub><sup>∗</sup> is the true value of the *j*th unmeasurable variable; *σ<sub>i</sub>* denotes the standard deviation for the *i*th measurable variable; *n* is the total number of measurable variables; and *m* is the total number of unmeasurable variables. The smaller the SSE, the closer the reconciled results are to the true values. The larger the TER, the more accurate the reconciled results. A larger RER shows that the reconciled results are less affected by gross errors.
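The three indicators translate directly into code; a minimal sketch, assuming NumPy arrays for the measured, reconciled and true values:

```python
import numpy as np

def sse(x_rec, x_true, u_est=np.array([]), u_true=np.array([])):
    # Sum of squares due to error, Equation (22)
    return np.sum((x_rec - x_true)**2) + np.sum((u_est - u_true)**2)

def ter(x_meas, x_rec, x_true, sigma):
    # Total error reduction, Equation (23)
    before = np.sqrt(np.sum(((x_meas - x_true) / sigma)**2))
    after = np.sqrt(np.sum(((x_rec - x_true) / sigma)**2))
    return (before - after) / before

def rer(x_meas, x_rec, x_true):
    # Relative error reduction, Equation (24)
    before = np.sum(np.abs(x_true - x_meas) / x_true)
    after = np.sum(np.abs(x_true - x_rec) / x_true)
    return (before - after) / before
```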

#### 3.3.1. There Are Two Gross Errors in the Measurement Variables

Streams 2 and 5 are selected for the addition of gross errors with magnitudes of 2 and 1, respectively, while the other streams contain only random errors. The results of the data reconciliation and the indicators for the several methods are presented in Table 2. For convenience, "Proposed" is used to represent the proposed estimator in this table and the subsequent tables.


**Table 2.** Reconciled results of different methods with two gross errors.

Table 2 shows that the SSE of the proposed robust estimator (Proposed) is much lower than that of Fair (1.2117) and Cauchy (0.1287), and is also lower than those of the Xie and Welsch estimators. The TER of the novel robust estimator is much larger than that of Fair (0.2953) and Cauchy (0.7606), and higher than those of the Xie and Welsch estimators. In addition, the RER of Proposed is higher than those of the other four robust estimators. The results show that, in the case of two gross errors, the reconciled results of the novel robust estimator are closer to the true values, and the proposed estimator performs better than the other methods.

#### 3.3.2. There Are Three Gross Errors in the Measurement Variables

Streams 2, 5 and 7 are selected for the addition of gross errors with magnitudes of 2, 1 and −1.5, respectively, while the other streams contain only random errors. The results of the data reconciliation and the indicators for the several methods are presented in Table 3.

**Table 3.** Reconciled results of different methods with three gross errors.


Table 3 shows that the SSE of the proposed robust estimator (Proposed) is much lower than that of Fair (1.2866) and Cauchy (0.1071), and is also lower than those of the Xie and Welsch estimators. The TER of the novel robust estimator is much larger than that of Fair (0.3474), and higher than those of the Xie, Welsch and Cauchy estimators. In addition, the RER of Proposed is higher than those of the other four robust estimators. The results show that, in the case of three gross errors, the reconciled results of the novel robust estimator are closer to the true values, and the proposed estimator performs better than the other methods.

To better illustrate the effectiveness of the proposed estimator, the linear numerical example is divided into three groups of comparative experiments, each comprising 100 data samples. In the first group, there is a gross error in stream 2 (*x*<sub>2</sub>). In the second group, there are gross errors in stream 2 (*x*<sub>2</sub>) and stream 5 (*x*<sub>5</sub>). The third group contains gross errors in stream 2 (*x*<sub>2</sub>), stream 5 (*x*<sub>5</sub>) and stream 7 (*x*<sub>7</sub>). In each group of experiments, gross errors ranging from 3 to 9 times the standard deviation are added to the selected variables. The measurement test (MT) is used to detect gross errors. Two criteria are selected to assess the performance of the different methods: the observed power (*OP*) and the average number of Type I errors (*AVTI*) [2]. *OP* and *AVTI* are defined as follows:

$$OP = \frac{\text{The number of gross errors detected correctly}}{\text{The total number of gross errors}} \tag{25}$$

$$AVTI = \frac{\text{The number of gross errors detected mistakenly}}{\text{The total number of simulations}} \tag{26}$$

*OP* measures the fraction of gross errors correctly detected, and *AVTI* measures how often gross errors are detected mistakenly. The larger the *OP*, the stronger the ability of the estimator to identify gross errors; the lower the *AVTI*, the less frequently gross errors are detected incorrectly. The statistical graph of the process in which gross errors are detected by the distinct methods in each group is shown in Figure 5; the abscissa of each graph corresponds to the seven stream variables, and the y-axis is the number of times each variable is detected as containing gross errors. The *OP* and *AVTI* of the proposed estimator, Xie, Welsch, Cauchy and Fair, obtained by statistical analysis of the detection process, are illustrated in Figure 6.

As depicted in Figure 6, when one, two or three variables contain gross errors, the *OP* obtained by the proposed estimator is higher than that obtained by the other estimators, showing that the proposed estimator has a higher probability of identifying gross errors correctly. In addition, the *AVTI* obtained by the proposed estimator is lower than that obtained by the other methods, and Figure 5 likewise shows that the proposed estimator mistakenly flags fewer gross errors than the other estimators. Gross errors therefore have less influence on data reconciliation based on the Proposed estimator.
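A sketch of this evaluation loop follows. It assumes a simplified measurement test that flags variable *i* whenever |*x<sub>i</sub>* − *x̂<sub>i</sub>*|/*σ<sub>i</sub>* exceeds the Normal critical value 1.96, and a user-supplied `reconcile` function such as the SLSQP routine sketched above; both are illustrative assumptions rather than the paper's exact procedure:

```python
import numpy as np

def evaluate(reconcile, x_true, sigma, gross_idx, gross_mult, n_sim=100, z_crit=1.96):
    # Monte Carlo estimate of OP and AVTI, Equations (25) and (26)
    rng = np.random.default_rng(seed=1)
    correct, wrong = 0, 0
    for _ in range(n_sim):
        x_meas = x_true + rng.normal(0.0, sigma)
        x_meas[gross_idx] += gross_mult * sigma[gross_idx]   # add gross errors
        x_rec = reconcile(x_meas)
        flagged = np.abs(x_meas - x_rec) / sigma > z_crit    # simplified MT
        correct += np.sum(flagged[gross_idx])                # true detections
        wrong += np.sum(np.delete(flagged, gross_idx))       # Type I errors
    op = correct / (n_sim * len(gross_idx))
    avti = wrong / n_sim
    return op, avti

# e.g. evaluate(reconcile, x_true, sigma, gross_idx=np.array([1, 4]), gross_mult=5.0)
```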

#### *3.4. Nonlinear Case*

A nonlinear numerical example [22] is used in this part. Within the nonlinear system, there are five measurable variables, three unmeasurable variables and six nonlinear constraint equations, which are described by Equation (27):

$$\begin{cases} 0.5x\_1^2 - 0.7x\_2 + x\_3u\_1 + x\_2^2u\_1u\_2 + 2x\_3u\_3^2 - 255.8 = 0\\ x\_1 - 2x\_2 + 3x\_1x\_3 - 2x\_2u\_1 - x\_2u\_2u\_3 + 111.2 = 0\\ x\_3u\_1 - x\_1 + 3x\_2 + x\_1u\_2 - x\_3\sqrt{u\_3} - 33.57 = 0\\ x\_4 - x\_1 - x\_3^2 + u\_2 + 3u\_3 = 0\\ x\_5 - 2x\_3u\_2u\_3 = 0\\ 2x\_1 + x\_2x\_3u\_1 + u\_2 - u\_3 - 126.6 = 0 \end{cases} \tag{27}$$

where *x<sub>i</sub>* (*i* = 1, 2, . . . , 5) are the measured variables and *u<sub>j</sub>* (*j* = 1, 2, 3) are the unmeasured variables. The measurements are obtained by adding random errors and gross errors to the true values of the measured variables. The true values, the measured values and the standard deviations of the measured variables, together with the true values of the unmeasured variables, are given in Table 4. Among these, *x*<sub>2</sub>, *x*<sub>3</sub> and *x*<sub>5</sub> contain gross errors. The results of the data reconciliation and the indicators for the several methods are presented in Table 4.
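For reference, the constraint set of Equation (27) can be written as a residual function and handed to the equality-constraint interface of an optimizer; a minimal sketch:

```python
import numpy as np

def constraints(z):
    # Residuals of the six nonlinear balance equations, Equation (27);
    # z = [x1, ..., x5, u1, u2, u3], and all residuals vanish at a feasible point.
    x1, x2, x3, x4, x5, u1, u2, u3 = z
    return np.array([
        0.5*x1**2 - 0.7*x2 + x3*u1 + x2**2*u1*u2 + 2*x3*u3**2 - 255.8,
        x1 - 2*x2 + 3*x1*x3 - 2*x2*u1 - x2*u2*u3 + 111.2,
        x3*u1 - x1 + 3*x2 + x1*u2 - x3*np.sqrt(u3) - 33.57,
        x4 - x1 - x3**2 + u2 + 3*u3,
        x5 - 2*x3*u2*u3,
        2*x1 + x2*x3*u1 + u2 - u3 - 126.6,
    ])
```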

**Figure 5.** Statistical diagram of gross errors detection process for distinct methods.


**Figure 6.** Results of *OP* and *AVTI* with different methods.



**Table 4.** Reconciled results of different methods for nonlinear constraints.

Table 4 shows that the SSE calculated by the proposed robust estimator is significantly less than those calculated by the Fair, Cauchy and Welsch estimators, and is also less than that calculated by the Xie estimator. Furthermore, the TER and RER of the proposed estimator are larger than those of the other four estimators. The results show that, for the nonlinear case, the reconciled results of the new robust estimator are also close to the true values. Consequently, the proposed robust estimator can effectively suppress the effect of gross errors on the reconciled results and exhibits great robustness and effectiveness.

#### **4. Feeding Composition Estimation Based on Iterative Data Reconciliation**

In order to address the problem caused by the unknown change in feeding composition, an iterative robust hierarchical data reconciliation and estimation strategy is proposed and applied to a fluidized bed roaster for zinc smelting. Firstly, the industrial process of the fluidized bed roaster is introduced, and the difficulties in reconciling its data are discussed in Section 4.1. Then, on the basis of the mechanism balance model of the fluidized bed roaster, specific solutions to these difficulties are presented.

#### *4.1. Industrial Process Description*

The fluidized bed roaster is a type of thermal equipment that applies fluidization technology to desulphurize materials by oxidation roasting in the metallurgical industry. The roasting process of the fluidized bed roaster is a gas-solid reaction procedure. A large volume of air is blown into the furnace from the bottom, and a very intense exothermic reaction occurs in the material layer of the sulfur ore under the agitation of the air. Oxygen combines with the sulfide to form sulfur dioxide, and the metals are converted into oxides or sulphates. The structure diagram of the fluidized bed roaster is shown in Figure 7.

In the fluidized roasting process, accurate and reliable measurements are the basis of process control, functional analysis and production management. However, in the real process, owing to inaccurate sensors, equipment leaks and measurement bias, real-time measurements are inevitably affected by random errors and gross errors, which seriously degrades their accuracy. In addition, some important variables, such as the smoke volume and calcine quality, cannot be measured because of limitations in measurement technology and the environment.

The ZnS content of the feed during the roasting of the zinc concentrate affects operating indices such as the resistance to roasting, the temperature of the boiling layer and the content of the gaseous effluents. This in turn affects the technical-economic index of the level of soluble zinc in the calcine, from which field staff judge the roasting process. The compositions of the mixed zinc concentrate and the calcine are sampled only twice a day, once in the morning and once at night; the laboratory then determines the percentages of Zn, S and other elements in the mixed zinc concentrate, and of soluble zinc and soluble sulfur in the calcine. Nevertheless, the variables measured in real time, such as the feed volume and blast volume, are sampled every 5 seconds, a much higher frequency than that of the laboratory data. Hence, owing to the long test cycle of the mixed zinc concentrate composition in the fluidized bed roaster, the laboratory data cannot reflect the change in the real-time feeding composition.

**Figure 7.** Structure diagram of fluidized bed roaster.

Concerning the problems of measurements contaminated by gross errors, irredundant process data and the long test cycle of the feeding composition, an iterative robust hierarchical data reconciliation and estimation strategy for the feeding composition is proposed.

#### *4.2. Iterative Robust Hierarchical Data Reconciliation and Composition Estimation Framework*

Figure 8 illustrates the framework of the iterative robust hierarchical data reconciliation and estimation strategy for the feeding composition. The major steps are as follows:

**Figure 8.** The framework of iterative robust hierarchical data reconciliation and composition estimation.


Step 1: Establish the steady-state mechanism model of the fluidized bed roaster. The establishment of a reasonable process model is the basis for data reconciliation. However, the actual industrial process is complex and needs to be simplified accordingly. The following assumptions are put forward to establish a credible and comprehensible mechanism model:

(1) The only exothermal component of the mixed zinc concentrate is ZnS.

(2) Since the specific heat of calcine and soot is close and the mass ratio is close to 1:1, soot can be classified as calcine in the calculation.

Therefore, the material and heat balance at steady-state are expressed by Equations (28) and (29):

Material balance:

$$\frac{16}{97}\alpha(M\_1 + M\_2)1000 + V\_0(\rho\_0 - \rho\_1) = 0\tag{28}$$

Heat balance:

$$\begin{array}{l} 1000(M\_1 + M\_2)T\_1cp\_1 + V\_0T\_0cp\_2 + \frac{443508}{97}\alpha(M\_1 + M\_2)1000\\ \quad = 1000(1 - \frac{16}{97}\alpha)(M\_1 + M\_2)T\_3cp\_3 + V\_0T\_2cp\_0 + Q \end{array} \tag{29}$$

where *α* denotes the percentage of ZnS in the zinc concentrate; *M*<sub>1</sub> and *M*<sub>2</sub> represent the first and second feeding quantities, respectively; *V*<sub>0</sub> is the blast velocity; *T*<sub>0</sub>, *T*<sub>1</sub>, *T*<sub>2</sub> and *T*<sub>3</sub> denote the temperatures of the blast, zinc concentrate, gas and calcine, respectively; *cp*<sub>0</sub>, *cp*<sub>1</sub>, *cp*<sub>2</sub> and *cp*<sub>3</sub> are the specific heats of the gas, zinc concentrate, blast and calcine, respectively; *Q* describes the heat loss; and *ρ*<sub>0</sub> and *ρ*<sub>1</sub> are the densities of the air and gas, respectively. Among these variables, *M*<sub>1</sub>, *M*<sub>2</sub>, *V*<sub>0</sub>, *T*<sub>0</sub>, *T*<sub>2</sub> and *T*<sub>3</sub> are measured variables; *ρ*<sub>1</sub>, *Q* and *α* are unmeasured variables. The other parameters are all fixed values.

Step 2: Preliminary estimation of *α* based on measurements and heat balance. Estimation of *α* can be considered as an optimization problem. Combined with the heat balance and the measurements, the ZnS estimation model before data reconciliation can be expressed as follows:

$$\begin{array}{l}\min Y\_1 = \left(1000(M\_1 + M\_2)T\_1cp\_1 + V\_0T\_0cp\_2 + \frac{443508}{97}\alpha(M\_1 + M\_2)1000 \right.\\ \left.\quad -1000\left(1 - \frac{16}{97}\alpha\right)(M\_1 + M\_2)T\_3cp\_3 - V\_0T\_2cp\_0 - Q\right)^2\end{array} \tag{30}$$

$$\text{s.t.} \quad \alpha\_{\min} \le \alpha \le \alpha\_{\max} \tag{31}$$

where *Y*<sub>1</sub> is the objective of the optimization problem, and *α*<sub>min</sub> and *α*<sub>max</sub> are the lower and upper bounds of *α*, respectively. In this case, *Q* can be roughly treated as a fixed value obtained from expert knowledge. The preliminary estimate of *α* is used to construct the hierarchical model in Step 3.
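A minimal sketch of Step 2 follows. All numerical values (feeding quantities, temperatures, specific heats, heat loss and the bounds on *α*) are hypothetical placeholders for illustration, not plant data from the paper, and `minimize_scalar` stands in for whatever solver is used in practice:

```python
from scipy.optimize import minimize_scalar

# Hypothetical plant values, for illustration only
M1, M2, V0 = 14.0, 11.0, 36000.0               # feeding quantities, blast velocity
T0, T1, T2, T3 = 300.0, 298.0, 1150.0, 1170.0  # blast, concentrate, gas, calcine
cp0, cp1, cp2, cp3 = 1.1, 0.6, 1.0, 0.8        # specific heats
Q = 5.0e6                                      # heat loss, fixed by expert knowledge

def Y1(alpha):
    # Squared heat-balance residual, Equation (30)
    lhs = 1000*(M1 + M2)*T1*cp1 + V0*T0*cp2 + (443508/97)*alpha*(M1 + M2)*1000
    rhs = 1000*(1 - (16/97)*alpha)*(M1 + M2)*T3*cp3 + V0*T2*cp0 + Q
    return (lhs - rhs)**2

# Bounded search over alpha, Equation (31); the bounds are hypothetical
res = minimize_scalar(Y1, bounds=(0.3, 0.7), method="bounded")
alpha_hat = res.x
```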

Step 3: Robust hierarchical data reconciliation. First, the robust reconciliation model of the material balance layer is established to obtain the reconciled values of the first and second feeding quantities and the blast velocity. Then, the reconciled values of the material balance layer are used as precise values to construct the robust reconciliation model of the heat balance layer. Finally, the reconciled results for the temperatures of the blast, gas and calcine are obtained. The hierarchical data reconciliation model based on the novel robust estimator is therefore expressed as follows:

Material balance layer:

$$\min f\_1 = \sum\_{i=1}^{n} \rho(\xi\_i), \qquad \xi\_i = \frac{x\_i - \hat{x}\_i}{\sigma\_i} \tag{32}$$

where the sum is taken over the variables of the material balance layer and *ρ*(·) is the objective function of the novel robust estimator in Equation (7), subject to

$$\begin{cases} \frac{16}{97}\alpha(\hat{M}\_1 + \hat{M}\_2)1000 + \hat{V}\_0(\rho\_0 - \rho\_1) = 0 \\ \hat{M}\_{1\min} \le \hat{M}\_1 \le \hat{M}\_{1\max} \\ \hat{M}\_{2\min} \le \hat{M}\_2 \le \hat{M}\_{2\max} \\ \hat{V}\_{0\min} \le \hat{V}\_0 \le \hat{V}\_{0\max} \\ \rho\_{1\min} \le \rho\_1 \le \rho\_{1\max} \end{cases} \tag{33}$$

where *f*<sub>1</sub> is the objective function of the robust estimator; *x<sub>i</sub>* and *x̂<sub>i</sub>* represent the measurement and the reconciled value of the *i*th measured variable, respectively; *M̂*<sub>1min</sub>, *M̂*<sub>2min</sub>, *V̂*<sub>0min</sub> and *ρ*<sub>1min</sub> denote the lower bounds of the reconciled results for the first and second feeding quantities, the blast velocity and the density of the gas, respectively; and *M̂*<sub>1max</sub>, *M̂*<sub>2max</sub>, *V̂*<sub>0max</sub> and *ρ*<sub>1max</sub> denote the corresponding upper bounds.

Heat balance layer:

$$\min f\_2 = \sum\_{i=1}^{n} \rho(\xi\_i), \qquad \xi\_i = \frac{x\_i - \hat{x}\_i}{\sigma\_i} \tag{34}$$

where the sum is taken over the variables of the heat balance layer and *ρ*(·) is again the objective function in Equation (7), subject to

$$\begin{cases} 1000(\hat{M}\_1 + \hat{M}\_2)T\_1cp\_1 + \hat{V}\_0\hat{T}\_0cp\_2 + \frac{443508}{97}\alpha(\hat{M}\_1 + \hat{M}\_2)1000 \\ \quad = 1000\left(1 - \frac{16}{97}\alpha\right)(\hat{M}\_1 + \hat{M}\_2)\hat{T}\_3cp\_3 + \hat{V}\_0\hat{T}\_2cp\_0 + Q \\ \hat{T}\_{0\min} \le \hat{T}\_0 \le \hat{T}\_{0\max} \\ \hat{T}\_{2\min} \le \hat{T}\_2 \le \hat{T}\_{2\max} \\ \hat{T}\_{3\min} \le \hat{T}\_3 \le \hat{T}\_{3\max} \\ Q\_{\min} \le Q \le Q\_{\max} \end{cases} \tag{35}$$

where *f*<sub>2</sub> is the objective function of the robust estimator; *T̂*<sub>0min</sub>, *T̂*<sub>2min</sub>, *T̂*<sub>3min</sub> and *Q*<sub>min</sub> denote the lower bounds of the reconciled results for the temperatures of the blast, gas and calcine and the heat loss, respectively; and *T̂*<sub>0max</sub>, *T̂*<sub>2max</sub>, *T̂*<sub>3max</sub> and *Q*<sub>max</sub> denote the corresponding upper bounds.
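Continuing the hypothetical values of the Step 2 sketch, the two-layer structure of Step 3 can be sketched as two successive constrained minimizations, again with SciPy's SLSQP in place of the STA and with the bound constraints of Equations (33) and (35) omitted for brevity:

```python
import numpy as np
from scipy.optimize import minimize

cp_tune = 1.5424
def rho(xi):
    # objective of the novel robust estimator, Equation (7)
    return cp_tune**2 * (1 - np.exp(-(xi/cp_tune)**2)) / (4 * (1 + np.exp(-(xi/cp_tune)**4)))

rho0 = 1.29                                   # hypothetical air density
# Material balance layer, Equations (32) and (33): z = [M1_hat, M2_hat, V0_hat, rho1]
meas1 = np.array([M1, M2, V0])
sig1 = 0.02 * meas1                           # hypothetical standard deviations
f1 = lambda z: np.sum(rho((meas1 - z[:3]) / sig1))
mat_con = {"type": "eq", "fun": lambda z:
           (16/97) * alpha_hat * (z[0] + z[1]) * 1000 + z[2] * (rho0 - z[3])}
M1h, M2h, V0h, rho1h = minimize(f1, x0=np.append(meas1, 1.2), method="SLSQP",
                                constraints=[mat_con]).x

# Heat balance layer, Equations (34) and (35), with the material layer fixed:
# z = [T0_hat, T2_hat, T3_hat, Q]
meas2 = np.array([T0, T2, T3])
sig2 = 0.02 * meas2
f2 = lambda z: np.sum(rho((meas2 - z[:3]) / sig2))
heat_con = {"type": "eq", "fun": lambda z:
            1000*(M1h + M2h)*T1*cp1 + V0h*z[0]*cp2
            + (443508/97)*alpha_hat*(M1h + M2h)*1000
            - 1000*(1 - (16/97)*alpha_hat)*(M1h + M2h)*z[2]*cp3
            - V0h*z[1]*cp0 - z[3]}
T0h, T2h, T3h, Qh = minimize(f2, x0=np.append(meas2, Q), method="SLSQP",
                             constraints=[heat_con]).x
```

These reconciled values then feed the Step 4 re-estimation of *α̂*.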

Step 4: Reconciled estimation of *α̂* based on the reconciled results and the heat balance. The reconciled results obtained in Step 3 are substituted into the objective function of the heat-balance optimization problem, and the estimated percentage of ZnS after reconciliation is obtained. The ZnS estimation model after data reconciliation can be represented as follows:

$$\begin{array}{l}\min Y\_2 = \left(1000(\hat{M}\_1 + \hat{M}\_2)T\_1cp\_1 + \hat{V}\_0\hat{T}\_0cp\_2 + \frac{443508}{97}\hat{\alpha}(\hat{M}\_1 + \hat{M}\_2)1000 \right.\\ \left.\quad -1000\left(1 - \frac{16}{97}\hat{\alpha}\right)(\hat{M}\_1 + \hat{M}\_2)\hat{T}\_3cp\_3 - \hat{V}\_0\hat{T}\_2cp\_0 - Q\right)^2\end{array} \tag{36}$$

$$\text{s.t.} \quad \hat{\alpha}\_{\min} \le \hat{\alpha} \le \hat{\alpha}\_{\max} \tag{37}$$

where *Y*<sub>2</sub> is the objective of the optimization problem; *α̂* represents the estimated percentage of ZnS after reconciliation; and *α̂*<sub>min</sub> and *α̂*<sub>max</sub> denote its lower and upper bounds, respectively. Through the iterative application of the above steps, greater accuracy and a smaller range of ZnS fluctuation in the feed can be obtained, so staff can better understand changes in the feeding composition and make adjustments as required.

#### **5. Results from Real Industrial Data**

To further demonstrate the effectiveness of the proposed method, 100 steady-state samples are collected from a real industrial fluidized bed roaster. In addition, the data reconciliation results of the proposed estimator are compared with those of the Xie estimator.

The results of the comparison between the two methods are presented in Figures 9–14, which show the reconciled results for the first feeding quantity, the second feeding quantity, the blast velocity, the blast temperature, the gas temperature and the calcine temperature, respectively. Figures 9–14 show that the reconciled results obtained by the novel robust estimator have a smaller fluctuation range than those obtained by the Xie estimator. Thus, regardless of the fluctuation of the measurements and of the number of gross errors they contain, the data reconciliation approach based on the novel robust estimator is highly robust.


**Figure 9.** Reconciled results for the first feed material quantity.

**Figure 10.** Reconciled results for the second feed material quantity.

**Figure 11.** Reconciled results for the blast velocity.

**Figure 12.** Reconciled results for the temperature of blast.


In addition, the standard deviations of the reconciled results and the measurements for the six variables obtained by the two methods are calculated, and the comparison is presented in Figure 15. Figure 15 shows that, for the first feeding quantity, the second feeding quantity, the blast velocity and the blast temperature, the standard deviation of the reconciled results based on the novel robust estimator is smaller than that of Xie. For the gas and calcine temperatures, the standard deviations of the reconciled results of the two methods are close, because the measurement fluctuation ranges of both variables are very small. The results demonstrate that data reconciliation based on the novel robust estimator provides better reconciled results for variables with a large fluctuation range.


**Figure 13.** Reconciled results for the temperature of gas.

**Figure 14.** Reconciled results for the temperature of calcine.

For the change in feeding composition, the same 100 steady-state samples are used to compare the percentage of ZnS before and after data reconciliation. The experimental results are shown in Figures 16 and 17. It is evident that the ZnS fluctuation range obtained by the proposed method is smaller than that before data reconciliation, which illustrates that the percentage of ZnS estimated from the reconciled results is more accurate and better reflects the change in the feeding composition over a period of time.


**Figure 15.** The standard deviation of reconciled results and measurements.

**Figure 16.** The percentage of ZnS before data reconciliation.

**Figure 17.** The percentage of ZnS after data reconciliation.

#### **6. Conclusions**

The feeding composition in an industrial process can better reflect the state of production. In this paper, estimation of the feeding composition based on robust data reconciliation is proposed to account for the unknown change in feeding composition. According to robust estimation theory, a novel robust estimator is presented to handle measurements with random errors and gross errors. By comparing its objective and influence functions with those of other robust estimators, the proposed estimator is shown to be more robust. In addition, linear and nonlinear numerical examples are used to further verify the effectiveness of the estimator. Then, an iterative robust hierarchical data reconciliation and feeding composition estimation strategy is proposed and applied to a fluidized bed roaster. The reconciled and estimated results show that the proposed strategy is beneficial for the estimation of the feeding composition in an actual industrial process.

**Author Contributions:** Conceptualization, Y.L. and B.S.; methodology, Y.L.; software, Y.L.; validation, Y.L.; formal analysis, Y.L.; investigation, Y.L., Z.F. and M.J.; resources, Y.L.; data curation, Y.L. and Z.F.; writing—original draft preparation, Y.L.; writing—review and editing, Y.L. and B.S.; visualization, Y.L.; supervision, Y.L.; project administration, Y.L.; funding acquisition, B.S. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China (61973321) and the Natural Science Foundation of Hunan Province (Grant No. 2019JJ50823).

**Data Availability Statement:** Publicly available datasets were analyzed in this study. They can be found here: https://open.library.ubc.ca/cIRcle/collections/ubctheses/831/items/1.0078522 (accessed on 1 February 2021).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

1. Kuehn, D.R.; Davidson, H. Computer control II. Mathematics of control. *Chem. Eng. Prog.* **1961**, *57*, 44–47.
2. Narasimhan, S.; Jordache, C. *Data Reconciliation and Gross Error Detection: An Intelligent Use of Process Data*; Elsevier: Amsterdam, The Netherlands, 1999.
3. Prata, D.M.; Lima, E.L.; Pinto, J.C. In-Line Monitoring of Bulk Polypropylene Reactors Based on Data Reconciliation Procedures. *Macromol. Symp.* **2008**, *271*, 26–37.
4. Prata, D.M.; Lima, E.L.; Pinto, J.C. Nonlinear Dynamic Data Reconciliation in Real Time in Actual Processes. *Comput. Aided Chem. Eng.* **2009**, *27*, 47–54.
5. Abu-El-Zeet, Z.H.; Roberts, P.D.; Becerra, V.M. Enhancing model predictive control using dynamic data reconciliation. *AIChE J.* **2002**, *48*, 324–333.
6. Sun, B.; Yang, C.; Wang, Y.; Gui, W.; Craig, I.; Olivier, L. A comprehensive hybrid first principles/machine learning modeling framework for complex industrial processes. *J. Process Control* **2020**, *86*, 30–43.
