### *2.1. Clarifying Questions*

Constructing the FFBRB flywheel system fault diagnosis model requires solving the following two problems:

**Problem 1.** *The first problem is how to use the FFTA mechanism and integrate it into the BRB knowledge base. In the BRB, the relationship between the input and output is described by a series of belief rules, which are built from expert knowledge. However, when the BRB is applied to a practical flywheel system, expert knowledge is difficult to embed into the fault diagnosis model (see Section 3.2).*

To realize the FFTA-to-BRB conversion, it is necessary to describe the correspondence between FFTA logic gates and BRB belief rules, and between FFTA events and BRB inputs and outputs. The function that solves this problem is denoted as CovBridge(∗), and *φ* is the set of parameters in this process. The process can then be described by the following expression:

$$\text{BRB}(\text{BeliefRule}, \text{input}/\text{output}) = \text{CovBridge}\left(\text{FFTA}(\text{LogicGate}, \text{event}), \,\,\, \phi\right) \tag{1}$$

This is a nonlinear mapping. It is not executed in a specific software language. With CovBridge(∗), logic gates in the FFTA are converted into belief rules in the BRB, and events in the FFTA are converted into inputs and outputs of the BRB. The inputs of CovBridge(∗) are the logic gates, events, and parameter set of the FFTA, and its outputs are the belief rules of the BRB together with their inputs and outputs.

**Problem 2.** *The second problem is how to build a reasonable and complete FFBRB model. To diagnose the various faults of an actual flywheel system, the reasoning process and the optimization process of the FFBRB model must be designed so that the resulting model is accurate (see Section 3.3).*

The function that solves this problem is denoted as FFBRB(∗), and *ζ* is the set of parameters in this process. The process can then be described by the following expression:

$$\mathbf{y} = \text{FFBRB}\left(\mathbf{x}, \zeta\right) \tag{2}$$

This is also a nonlinear mapping, where x is the failure probability of the bottom event in the FFTA and y is the output utility value of the BRB, which corresponds to the occurrence probability of the top event.

**Remark 1.** *The small-sample problem is usually addressed in one of two ways. The first is to enlarge the sample set with data whose characteristics are similar to those of the research problem, as in transfer learning [17,18]. The second is to enlarge the information input by analyzing the model mechanism. The BRB belongs to the second type: it expands the information available to the model through expert knowledge, which makes model training possible with small samples.*

### *2.2. Overview of FFBRB Fault Diagnosis Model Principle*

To solve the above problems, this paper proposes the FFBRB flywheel fault diagnosis model. In this model, the existing FFTA is used to construct the initial belief rules of the BRB, and the transformation rules from FFTA to BRB are given. The ER (evidential reasoning) algorithm provides the reasoning process of the model, and the P-CMA-ES (projection covariance matrix adaptation evolution strategy) algorithm optimizes the model parameters to improve accuracy. Figure 1 shows the overall transformation process of the model.

**Figure 1.** Fault diagnosis schematic diagram of FFBRB model.

**Remark 2.** *The similar learning ability of the BRB and neural networks was noted in the literature [19]. Therefore, the fault diagnosis of complex systems could be achieved through constructing deep BRB or hierarchical BRB models [20]*.

### **3. Construction and Inference of the FFBRB Model**

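This section comprises three parts: the basic structure of the FFTA of the flywheel system (Section 3.1), the construction of the BRB model based on the FFTA (Section 3.2), and the establishment of the FFBRB model together with its inference and optimization (Section 3.3).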


### *3.1. Basic Structure of the FFTA Flywheel System*

In a practical flywheel system, FFTA analysis mainly depends on how the probability of each event in a fuzzy fault tree is calculated and expressed, and how to apply them to BRB. The overall fuzzy fault tree analysis structure is shown in Figure 2.

**Figure 2.** FFTA structure diagram of flywheel.

The fuzzy fault tree graph of the flywheel system is mainly composed of logic gates and related events, and its faults include sensor faults and system faults. The complete flywheel system fault tree [21] is shown in Figure 3 below:

### *3.2. The Process of Constructing the BRB Model Based on the FFTA*

### 3.2.1. Analysis of Conversion Mechanism between FFTA and BRB

FFTA and BRB differ in how their inputs and outputs are described. In BRB, the relationship between input and output is described by a series of belief rules, whereas in FFTA it is described by logic gates and events. A bridge is therefore needed to enable the transformation between FFTA and BRB. The fault tree established in FFTA sorts out the relationships between fault events and clarifies the context of different events. A Bayesian network, also known as a belief network, describes the state of a part of the modeled object and associates it with a probability. There is a mapping relationship between the fuzzy fault tree and the Bayesian network, which is expressed as follows [22]:


In order to describe the correspondence between FFTA and Bayesian networks, an example is listed in Figure 4 for reference.

**Figure 4.** The corresponding expression graph between FFTA and Bayesian network graph.

BRB consists of three important parts: a knowledge base, an inference engine, and an optimization method. The knowledge base is composed of a series of belief rules, which represent the relationship between input and output. ER, the inference engine of BRB, is an evidential reasoning method [23]. The literature shows that Bayesian inference can be extended to ER, which handles uncertain information with weights and reliabilities, and it reveals the relationship between the Bayes rule and the ER rule. Its conclusion is that, when the events are independent of each other, conditional probability is equivalent to belief degree. Bayesian inference can therefore be transformed into ER inference and, because ER [24] is the inference engine of BRB, into BRB inference. The corresponding relationship between BRB and the Bayesian network [25–27] is as follows:


From the above analysis, the complete FFTA-to-BRB conversion process can be derived. The schematic conversion diagram from FFTA to BRB is shown in Figure 5, and the conversion works as follows:

• The three numbers in the triangular fuzzy number of the failure probability of an FFTA base event are divided into three groups, which correspond to the root nodes of the Bayesian network and are used as the input of BRB;

• The three numbers in the triangular fuzzy number of the occurrence probability of an FFTA intermediate event are divided into three groups, which correspond to the intermediate nodes of the Bayesian network and serve as both the input and the output of BRB;

• The three numbers in the triangular fuzzy number of the occurrence probability of the FFTA top event are divided into three groups, which correspond to the leaf node of the Bayesian network and are used as the output of BRB.

**Figure 5.** Schematic conversion diagram from FFTA to BRB.

### 3.2.2. Conversion Rules from FFTA to BRB

As shown above, a logic gate in FFTA corresponds to the conditional probability distribution of the corresponding node in the Bayesian network. Different logic gates require different transformation rules, and this section defines the transformation process.

Probability Representation of Transformation Space Condition Corresponding to Different Logic Gates

Let *x<sub>i</sub>* denote the *i*th base event in FFTA. The conditional probability rule in the Bayesian network that corresponds to an "and" logic gate in FFTA is then given by Equation (3), and the rule that corresponds to an "or" logic gate is given by Equation (4).

$$p(Top|\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_n) = \prod\_{i=1}^n \mathbf{x}\_i \tag{3}$$

$$p(Top|\mathbf{x}\_1, \mathbf{x}\_2, \dots, \mathbf{x}\_n) = \sum\_{i=1}^n \mathbf{x}\_i \tag{4}$$
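As a minimal sketch, Equations (3) and (4) can be evaluated directly for a set of base event probabilities. The function names and the sample probabilities below are hypothetical and serve only as an illustration.

```python
from math import prod


def and_gate_probability(base_probs):
    """Equation (3): product of the base event probabilities."""
    return prod(base_probs)


def or_gate_probability(base_probs):
    """Equation (4): sum of the base event probabilities."""
    return sum(base_probs)


# Hypothetical base event probabilities x_1, x_2, x_3.
base_probs = [0.1, 0.2, 0.3]
p_and = and_gate_probability(base_probs)
p_or = or_gate_probability(base_probs)
```

Note that the sum in Equation (4) is only a valid probability as long as it does not exceed 1, so the sketch is meaningful for small base event probabilities.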

The Belief Rule and Rule Activation Weight Representation of the BRB Corresponded to the Logic Gate

In BRB, the importance of an attribute is expressed by its attribute weight, and the importance of a rule is expressed by its rule weight. This section defines different transformation rules for different logic gates, which also correspond to different rule activation weights.

In the following, *A<sub>i</sub>* denotes the set of reference values of the *i*th input in FFTA, that is, the set of reference values of the base event. *Top*<sub>1</sub>, *Top*<sub>2</sub>, ..., *Top<sub>N</sub>* represent the *N* results; under the *k*th belief rule, the belief degree of each result is *β<sub>i</sub>* (*i* = 1, ..., *N*), where *N* is the number of results; *δ<sub>i</sub>* (*i* = 1, ..., *M*) is the attribute weight of each premise attribute, where *M* is the number of attributes; and *θ<sub>k</sub>* is the rule weight of the *k*th belief rule, where *K* is the number of belief rules.

• Under the condition of "and" logic gates, the BRB's belief rules [28] can be described as follows:

$$\begin{array}{l} \text{Belief Rule}\_{k}: \\ \text{If } \mathbf{x}\_{1} \text{ is } A\_{1} \land \mathbf{x}\_{2} \text{ is } A\_{2} \land \dots \land \mathbf{x}\_{n} \text{ is } A\_{n} \\\\ \text{Then result is } \{ (Top\_{1}, \beta\_{1}), (Top\_{2}, \beta\_{2}), \dots, (Top\_{N}, \beta\_{N}) \} \\\\ \text{with rule weight } \theta\_{1}, \theta\_{2}, \dots, \theta\_{K} \\\\ \text{and attribute weight } \delta\_{1}, \delta\_{2}, \dots, \delta\_{M} \end{array} \tag{5}$$

Let *a<sub>i</sub><sup>k</sup>* denote the rule matching degree under rule *k* (the degree to which the input sample matches the belief rule), and let *l* and *l* + 1 index the two adjacent rules that are activated when the input falls between their reference values. The rule activation weight under the "and" gate condition is calculated as follows:

$$\omega\_k = \frac{\theta\_k \prod\_{i=1}^{M} \left(a\_i^k\right)^{\delta\_i}}{\sum\_{l=1}^{K} \theta\_l \prod\_{i=1}^{M} \left(a\_i^l\right)^{\delta\_i}} \tag{6}$$

$$a\_i^k = \begin{cases} \frac{A\_i^{l+1} - x\_i}{A\_i^{l+1} - A\_i^l} & k = l, A\_i^l \le x\_i \le A\_i^{l+1} \\\ 1 - a\_i^l & k = l+1 \\\ 0 & k = 1 \cdots K, k \ne l, l+1 \end{cases} \tag{7}$$

• Under the condition of "or" logic gates, the BRB's belief rules could be described as follows:

$$\begin{aligned} \text{If } & x\_1 \text{ is } A\_1 \lor x\_2 \text{ is } A\_2 \lor \dots \lor x\_n \text{ is } A\_n\\ \text{Then result is } & \{ (Top\_1, \beta\_1), (Top\_2, \beta\_2), \dots, (Top\_N, \beta\_N) \} \\ \text{with rule weight } & \theta\_1, \theta\_2, \dots, \theta\_K \\ \text{and attribute weight } & \delta\_1, \delta\_2, \dots, \delta\_M \end{aligned} \tag{8}$$

where *a<sub>i</sub><sup>k</sup>* represents the rule matching degree (the degree to which the input sample matches the belief rule). The rule activation weight under the "or" gate condition is calculated as follows:

$$
\omega\_k = \frac{\theta\_k \sum\_{i=1}^{M} \left(a\_i^k\right)^{\delta\_i}}{\sum\_{l=1}^{K} \theta\_l \sum\_{i=1}^{M} \left(a\_i^l\right)^{\delta\_i}} \tag{9}
$$

The calculation of the rule matching degree is the same as the above "and" logic gate condition.
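The matching degree of Equation (7) and the activation weights of Equations (6) and (9) can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names, the two-attribute example, the reference values, and the assumption that the rules enumerate all combinations of reference values are hypothetical and not taken from the paper.

```python
from itertools import product

import numpy as np


def matching_degrees(x, refs):
    """Equation (7): matching degree of input x to each reference value.

    refs must be sorted in ascending order and x must lie in [refs[0], refs[-1]].
    """
    refs = np.asarray(refs, dtype=float)
    a = np.zeros(len(refs))
    # Index l of the left neighbour, so that refs[l] <= x <= refs[l + 1].
    l = int(np.clip(np.searchsorted(refs, x, side="right") - 1, 0, len(refs) - 2))
    a[l] = (refs[l + 1] - x) / (refs[l + 1] - refs[l])
    a[l + 1] = 1.0 - a[l]
    return a


def activation_weights(alpha, theta, delta, gate="and"):
    """Equation (6) for gate="and", Equation (9) for gate="or".

    alpha[k, i] is the matching degree of attribute i in rule k,
    theta[k] the rule weights, delta[i] the attribute weights.
    """
    powered = np.asarray(alpha, dtype=float) ** np.asarray(delta, dtype=float)
    combined = powered.prod(axis=1) if gate == "and" else powered.sum(axis=1)
    unnormalised = np.asarray(theta, dtype=float) * combined
    return unnormalised / unnormalised.sum()


# Hypothetical example: two attributes, each with reference values {0, 1},
# and one rule per combination of reference values (four rules in total).
refs = [0.0, 1.0]
a1 = matching_degrees(0.2, refs)
a2 = matching_degrees(0.6, refs)
alpha = np.array([[a1[j], a2[k]] for j, k in product(range(2), repeat=2)])
theta = np.ones(4)
delta = np.ones(2)
w_and = activation_weights(alpha, theta, delta, gate="and")
w_or = activation_weights(alpha, theta, delta, gate="or")
```

Both weight vectors sum to one by construction; only the way the per-attribute matching degrees are combined (product versus sum) differs between the two gates.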

### *3.3. Establishment of the FFBRB Model and Inference Optimization*

The FFBRB flywheel system fault diagnosis model established in this paper is shown in Figure 6.

**Figure 6.** FFBRB flywheel system fault diagnosis model diagram.

### 3.3.1. Analysis of Reasoning Process from FFTA to BRB

The reasoning process of the FFBRB model, which is actually the reasoning process of the BRB, is shown in Figure 7.

**Figure 7.** Diagram of FFBRB model inference process.

In particular, this model represents the event probabilities in the FFTA as triangular fuzzy numbers. The lower bounds, the event probability values, and the upper bounds of these triangular fuzzy numbers are divided into three groups, and each group is processed by the BRB. The BRB then yields the optimized triangular fuzzy number of the top event probability, which is compared with the top event probability obtained from the FFTA analysis to assess the fitting effect.
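The division of triangular fuzzy numbers into three groups can be illustrated with a short sketch. The paper does not state the fuzzy interval operator at this point, so the component-wise product used here for an "and" gate is an assumption, and the function names and sample values are hypothetical.

```python
def and_gate_tfn(events):
    """Combine triangular fuzzy numbers (a, m, b) under an "and" gate.

    Assumption: the fuzzy operator is the component-wise product.
    """
    a = m = b = 1.0
    for ai, mi, bi in events:
        a, m, b = a * ai, m * mi, b * bi
    return (a, m, b)


def split_into_groups(base_events, result):
    """Regroup the fuzzy numbers into lower, middle, and upper groups.

    [(a1, m1, b1), (a2, m2, b2)] and (a, m, b) become
    [(a1, a2, a), (m1, m2, m), (b1, b2, b)].
    """
    return [tuple(column) for column in zip(*base_events, result)]


# Hypothetical base events x1 and x2.
x1 = (0.1, 0.2, 0.3)
x2 = (0.2, 0.4, 0.5)
top = and_gate_tfn([x1, x2])
groups = split_into_groups([x1, x2], top)
```

Each of the three groups can then be passed to the BRB separately, with the first two entries as inputs and the last entry as the reference output.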

The FFBRB model embeds the FFTA knowledge mechanism in the BRB expert knowledge base, which solves the problem that expert knowledge is difficult to embed in the BRB. The FFBRB model then uses BRB to train on a series of sample data, which further improves accuracy and resolves a considerable part of the uncertainty in the flywheel model. This section mainly introduces the reasoning process of FFBRB fault diagnosis, that is, the reasoning process of BRB.

The specific fault diagnosis process of the FFBRB model is as follows:

Step 1: Data preprocessing. This paper first normalized the data samples and limited the data within the range of 0–1 to characterize the probability, so as to better describe the problem.

Step 2: Fuzzy fault tree analysis. First, the logical relationships between events are sorted out and the fault tree of the fault diagnosis model is drawn. Then, triangular fuzzy numbers are used to represent the failure probabilities of the FFTA base events, a fuzzy interval operator is introduced to calculate the triangular fuzzy numbers of the occurrence probabilities of the intermediate events and the top event, and the data are divided into three groups. For example, if triangular fuzzy numbers represent the failure probabilities of base event *x*<sub>1</sub> = (*a*<sub>1</sub>, *m*<sub>1</sub>, *b*<sub>1</sub>) and base event *x*<sub>2</sub> = (*a*<sub>2</sub>, *m*<sub>2</sub>, *b*<sub>2</sub>), the interval fuzzy operator gives the occurrence probability (*a*, *m*, *b*) of an intermediate event or the top event. To facilitate subsequent data processing, these data are divided into three groups: (*a*<sub>1</sub>, *a*<sub>2</sub>, *a*), (*m*<sub>1</sub>, *m*<sub>2</sub>, *m*), and (*b*<sub>1</sub>, *b*<sub>2</sub>, *b*).

Step 3: Taking the Bayesian network as a bridge, FFTA is mapped to BRB. The equivalence of FFTA logic gate input and output and BRB input and output was explained through the bridge of the Bayesian network. According to the mapping rules mentioned above, fault tree graphs are mapped to the Bayesian network graphs and then BRB analysis is carried out, respectively, according to the graphs.

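Step 4: Input the sample data integrating FFTA fault mechanism knowledge into BRB and use BRB for fault diagnosis. The reasoning is carried out in four steps:

• Calculation of the rule matching degree, as in Equation (7);

• Calculation of the rule activation weight, as in Equation (6) or Equation (9), depending on the logic gate;

• ER aggregation of the activated rules, which yields the combined belief degree of each result: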


$$\beta\_n = \frac{\mu \times \left[ \prod\_{l=1}^{L} \left( \omega\_l \beta\_{n,l} + 1 - \omega\_l \sum\_{i=1}^{N} \beta\_{i,l} \right) - \prod\_{l=1}^{L} \left( 1 - \omega\_l \sum\_{i=1}^{N} \beta\_{i,l} \right) \right]}{1 - \mu \times \left[ \prod\_{l=1}^{L} (1 - \omega\_l) \right]} \tag{10}$$

$$\mu = \frac{1}{\sum\_{n=1}^{N} \prod\_{l=1}^{L} \left(\omega\_l \beta\_{n,l} + 1 - \omega\_l \sum\_{i=1}^{N} \beta\_{i,l}\right) - (N-1) \prod\_{l=1}^{L} \left(1 - \omega\_l \sum\_{i=1}^{N} \beta\_{i,l}\right)} \tag{11}$$

• Utility calculation, the final output.

$$y = \sum\_{n=1}^{N} \mu(Top\_n) \beta\_n \tag{12}$$

Step 5: BRB optimization. In this step, the optimization algorithm is used to process the parameters to make the BRB output more accurate.
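The ER aggregation of Equations (10) and (11) and the utility output of Equation (12) from Step 4 can be sketched as follows. The function names, the two-rule example, and the utility values assigned to the results are hypothetical and serve only as an illustration.

```python
import numpy as np


def er_aggregate(omega, beta):
    """Combined belief degrees beta_n of Equation (10).

    omega[l] is the activation weight of rule l (length L) and
    beta[l, n] the belief degree of result n in rule l (shape L x N).
    """
    omega = np.asarray(omega, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n_results = beta.shape[1]
    # 1 - omega_l * sum_i beta_{i,l}, one value per rule.
    rest = 1.0 - omega * beta.sum(axis=1)
    # Product over rules of (omega_l * beta_{n,l} + rest_l), one value per result.
    prod_n = np.prod(omega[:, None] * beta + rest[:, None], axis=0)
    prod_rest = np.prod(rest)
    # Equation (11): normalisation factor.
    mu = 1.0 / (prod_n.sum() - (n_results - 1) * prod_rest)
    return mu * (prod_n - prod_rest) / (1.0 - mu * np.prod(1.0 - omega))


def utility_output(belief, utilities):
    """Equation (12): expected utility of the aggregated belief degrees."""
    return float(np.dot(utilities, belief))


# Hypothetical example: two equally activated rules with opposite conclusions.
belief = er_aggregate([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]])
y = utility_output(belief, [0.0, 1.0])
```

When only one rule is activated (activation weight 1), the aggregation returns that rule's belief degrees unchanged, which is a convenient sanity check for an implementation.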

### 3.3.2. Optimization of the FFBRB Fault Diagnosis Model

This section describes the optimization process of the FFBRB model, as shown in Figure 8 below:

**Figure 8.** Optimization process flow chart.

In this model, the output of the BRB built from the fuzzy fault tree analysis data is still uncertain. To reduce the error between the output obtained with the initial BRB parameters and the real data, an optimization mechanism based on the P-CMA-ES algorithm [29] is introduced. The optimization objective can be described as follows:

$$\begin{aligned} \min & \text{MSE}(\zeta) \\ \text{s.t.} & \sum\_{n=1}^{N} \beta\_{n,k} = 1, k = 1 \cdots K \\ & 0 \le \beta\_{n,k} \le 1 \\ & 0 \le \theta\_k \le 1 \end{aligned} \tag{13}$$

In Equation (13), *MSE*(*ζ*) is the mean square error between the predicted output and the actual output, and *ζ* is the set of parameters being optimized. It is calculated as follows:

$$MSE(\zeta) = \frac{1}{K} \sum\_{k=1}^{K} (y^\* - y)^2 \tag{14}$$

In the above expression, *y* represents the actual output, *y*∗ represents the predicted output, and the number of training samples is expressed by *K*. The realization process of the P-CMA-ES algorithm is described in detail below:

• Set initial parameters. The number of solutions in the population is denoted by *Num*, the number of solutions in the optimal subgroup by *Pn*, the dimension of the problem by *D*, the optimal subgroup by *μ*, and the weight of the *i*th solution in the optimal subgroup by *ω<sub>i</sub>*;

$$\sum\_{i=1}^{\mu} \omega\_i = 1, \quad \omega\_1 \ge \omega\_2 \ge \cdots \ge \omega\_{\mu} \ge 0 \tag{15}$$

• Sampling. The mean of the optimal subgroup solutions is taken as the expectation, and the population is generated from a normal distribution. The calculation process is as follows:

$$\zeta\_{i}^{h+1} = average^{h} + \eta^{h}H(0, To^{h})\tag{16}$$

Here *ζ<sub>i</sub><sup>h+1</sup>* is the *i*th (*i* = 1, ..., *Num*) solution in generation *h* + 1; *average<sup>h</sup>* is the mean of the optimal subgroup solutions in generation *h*; *η<sup>h</sup>* is the evolution step size of generation *h*; *H*(∗) denotes the normal distribution; and *To<sup>h</sup>* is the covariance matrix of generation *h*;

• Projection. The process of performing a projection operation for each equality constraint can be described as follows:

$$\begin{aligned} &\zeta\_i^{h+1}(1+m\times\left(\tau-1\right):m\times\tau) \\ &=\zeta\_i^{h+1}(1+m\times\left(\tau-1\right):m\times\tau)-\mathrm{Q}^T\times\left(\mathrm{Q}\times\mathrm{Q}^T\right)^{-1} \\ &\quad \times \zeta\_i^{h+1}(1+m\times\left(\tau-1\right):m\times\tau)\times\mathrm{Q} \end{aligned} \tag{17}$$

Here *m* = (1, ..., *M*) denotes the number of variables in an equality constraint, where *M* represents the solutions in each equality constraint; *τ* = (1, ..., *M* + 1) denotes the number of equality constraints; and *Q* = [1, 1, ..., 1]<sub>1×*N*</sub> is the parameter vector;

• Select and recombine. The optimal subgroup is selected and its weighted mean is calculated. The weight of the *i*th (*i* = 1, ..., *Pn*) solution in the optimal subgroup is denoted by *h<sub>i</sub>*, and the mean is calculated as follows:

$$average^{h+1} = \sum\_{i=1}^{Pn} h\_i \zeta\_i^{h+1}, \quad \sum\_{i=1}^{Pn} h\_i = 1\tag{18}$$

• Update the covariance matrix. The specific calculation process is as follows:

$$To^{h+1} = \left(1 - \varepsilon\_1 - \varepsilon\_{Pn}\right)To^h + \varepsilon\_1 s\_c^{h+1} \left(s\_c^{h+1}\right)^T + \varepsilon\_{Pn} \sum\_{i=1}^{Pn} h\_i \left(\frac{\zeta\_i^{h+1} - average^h}{\eta^h}\right) \times \left(\frac{\zeta\_i^{h+1} - average^h}{\eta^h}\right)^T \tag{19}$$

$$s\_c^{h+1} = (1 - \varepsilon\_c)s\_c^h + \sqrt{\varepsilon\_c(2 - \varepsilon\_c) \left(\sum\_{i=1}^{Pn} h\_i^2\right)^{-1}} \times \frac{average^{h+1} - average^h}{\eta^h} \tag{20}$$

$$\eta^{h+1} = \eta^h \exp\left(\frac{\varepsilon\_\eta}{o\_\eta} \left(\frac{\left\| s\_\eta^{h+1} \right\|}{E\left\|H(0, J)\right\|} - 1\right)\right)\tag{21}$$

$$s\_{\eta}^{h+1} = (1 - \varepsilon\_{\eta})s\_{\eta}^{h} + \sqrt{\varepsilon\_{\eta}(2 - \varepsilon\_{\eta}) \left(\sum\_{i=1}^{Pn} h\_i^2\right)^{-1}} \times \left(To^{h}\right)^{-\frac{1}{2}} \times \frac{average^{h+1} - average^h}{\eta^h} \tag{22}$$

In the above expressions, *ε*<sub>1</sub>, *ε<sub>Pn</sub>*, *ε<sub>c</sub>*, and *ε<sub>η</sub>* are the learning rates; *s<sub>η</sub><sup>h</sup>* is the evolution path of the step size in generation *h*, with initial value 0; and *s<sub>c</sub><sup>h</sup>* is the evolution path of the covariance matrix in generation *h*, with initial value 0. In addition, *J* is the identity matrix, *o<sub>η</sub>* is the damping coefficient, and *E*‖*H*(0, *J*)‖ is the mathematical expectation of the norm of the normal distribution *H*(0, *J*).

The above steps describe the calculation process of the P-CMA-ES algorithm. This algorithm is an improvement of the CMA-ES (covariance matrix adaptation evolution strategy) algorithm: the added projection operation handles the equality constraints in the BRB, which makes it suitable for the fault diagnosis model proposed in this paper.
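One generation of the sampling, projection, and recombination steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names and the toy target are hypothetical, the fitness is a stand-in MSE against a known vector instead of the BRB output error, the projection subtracts the residual *Q*·*ζ* − 1 so that each block of belief degrees sums to one, and the bound constraints of Equation (13) as well as the covariance and step size updates of Equations (19) to (22) are omitted.

```python
import numpy as np


def mse(predicted, actual):
    """Equation (14): mean squared error between predicted and actual outputs."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean((predicted - actual) ** 2))


def project_to_sum_one(segment):
    """Project one block of belief degrees onto the plane sum(segment) = 1.

    Q = [1, ..., 1] as in Equation (17). Subtracting the residual Q.x - 1
    (rather than Q.x) is an assumption made so that the result sums to one.
    """
    q = np.ones((1, segment.size))
    residual = q @ segment - 1.0
    return segment - q.T @ np.linalg.inv(q @ q.T) @ residual


def sample_population(mean, step, cov, num, rng):
    """Equation (16): draw num candidates around the current mean."""
    noise = rng.multivariate_normal(np.zeros(mean.size), cov, size=num)
    return mean + step * noise


def recombine(population, fitness, weights):
    """Equation (18): weighted mean of the best len(weights) candidates."""
    best = np.argsort(fitness)[: len(weights)]
    return weights @ population[best]


rng = np.random.default_rng(0)
target = np.array([0.2, 0.3, 0.5])   # hypothetical "true" belief degrees
mean = np.full(3, 1.0 / 3.0)         # initial mean, already sums to one
weights = np.array([0.5, 0.3, 0.2])  # satisfies Equation (15)

population = sample_population(mean, 0.1, np.eye(3), num=10, rng=rng)
population = np.array([project_to_sum_one(c) for c in population])
fitness = [mse(c, target) for c in population]
new_mean = recombine(population, fitness, weights)
```

Because every projected candidate sums to one and the recombination weights sum to one, the new mean also satisfies the equality constraint without any further correction.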
