1. Introduction
A dynamic uncertain causality graph (DUCG) is a probabilistic graphical model for knowledge representation and probabilistic reasoning [1]. As a graphical and mathematical methodology, the DUCG aims to graphically represent complex uncertain causalities between variables and perform probabilistic reasoning [2]. A DUCG is a directed graph comprising a number of nodes (variables) and directional arrows (causalities). Normally, the causalities and knowledge parameters associated with a DUCG are determined based on the knowledge and experience of field experts [3]. The knowledge reasoning process of a DUCG can be divided into three steps [4,5,6,7]: (1) simplifying the DUCG based on the evidence to reduce its scale; (2) expanding the causality chain expressions of the consequence events into expressions composed of independent events; and (3) calculating the probabilities of the consequence events according to these expressions. In addition, a complicated DUCG can be represented by a set of simpler sub-DUCGs in the construction of a knowledge base [8]. Owing to its outstanding capability in depicting uncertain causalities and performing efficient reasoning, the DUCG has been widely used in many fields, such as medical diagnosis and treatment [9,10], fault diagnosis of nuclear power plants [11], reliability analysis of dynamic reliability block diagrams [12], shale-gas sweet-spot evaluation [13], and probabilistic safety assessment of a boiling water reactor [14].
Although DUCGs have been utilized to solve various issues, the conventional DUCG model has been criticized for its shortcomings in knowledge representation and reasoning (KRR) [15]. For example, the knowledge parameters acquired from historical statistics or experts were restricted to crisp values [16]. However, due to information scarcity, complex causalities between events, and inconsistent knowledge of specialists, the evaluations obtained from experts are usually uncertain and fuzzy [17,18,19]. Hence, to handle vague knowledge, different uncertainty theories such as intuitionistic fuzzy sets [20], cloud model theory [21], and picture fuzzy sets [22] have been combined with DUCGs to improve their KRR ability. In practice, experts tend to use linguistic terms to describe their qualitative evaluations when quantitative information is not available or the cost of obtaining it is too high [23,24,25]. To describe linguistic assessment information more accurately and objectively, Pythagorean uncertain linguistic sets (PULSs) [26] were proposed recently. The main advantage of PULSs is that they describe two aspects of an object: an uncertain linguistic variable and a Pythagorean fuzzy number (PFN). Since its introduction, the PULS has received widespread attention in academia and has been applied in many practical fields [27,28]. Therefore, it is desirable to apply PULSs to characterize the fuzziness and uncertainty of expert knowledge in DUCGs.
In addition, obtaining accurate knowledge parameters plays an important role in the KRR of DUCGs. However, previous research often directly gave the values of knowledge parameters or the representation of events [20,21,22]. Few studies have considered obtaining the knowledge parameters of a DUCG from experts' evaluations [22]. Moreover, because domain experts have different backgrounds, they often hold diverse opinions about the knowledge parameters of a DUCG, and existing studies fail to handle such conflicting opinions. When experts have different opinions on an evaluation index, Yao et al. [13] adopted the weighting method to compute probabilities in the DUCG, but they did not consider conflicting and inconsistent opinions among experts. Hence, developing effective methods that handle the conflicting opinions of experts and thereby obtain knowledge parameters more accurately is vital for knowledge acquisition using DUCGs.
Motivated by the above analyses, this article aims to develop a new type of DUCG, called PUL-DUCG, to enhance the ability of DUCGs in KRR. Firstly, the PULSs are used to express the vague and ambiguous evaluations obtained from domain experts on knowledge parameters. Secondly, to obtain more accurate knowledge parameters, a correction algorithm for group conflict is employed to deal with cognitive nonconformity in the knowledge acquisition process. Moreover, a causal inference algorithm based on the improved evaluation based on distance from average solution (EDAS) method is introduced to find the most probable cause of an abnormal event. Lastly, a practical example is provided to illustrate the application and effectiveness of the established PUL-DUCG model.
The rest of the paper is structured as follows. In the next section, we review the relevant work on DUCGs. Section 3 introduces the basic concepts of PULSs and DUCGs. The proposed PUL-DUCG model based on PULSs and the EDAS method is detailed in Section 4. A real case is presented in Section 5 to show and validate the feasibility of the new PUL-DUCG model. Finally, we present conclusions and provide recommendations for further study in Section 6.
2. Literature Review
In recent years, some extended DUCGs have been proposed to compensate for the shortcomings of existing methods. For example, Zhang et al. [9] extended the DUCG to include the representation and inference algorithm for non-causal classification relationships. Zhang et al. [29] proposed an extended DUCG methodology to intuitively represent complex medical knowledge and perform effective clinical diagnosis. Yao et al. [13] provided a DUCG model based on multiple conditional events and weighted graphs for shale-gas sweet-spot evaluation. Qiu and Zhang [30] presented a multi-valued DUCG method to calculate the joint probability distribution of the directed cycle graph with local data and domain causal knowledge. Jiao et al. [3] reported an artificial intelligence diagnostic model based on DUCG for improving the efficiency of differential dyspnea diagnosis. Dong and Zhang [11] proposed a cubic DUCG methodology characterized by causal connections and dynamic negative feedback loops for temporal process modeling and diagnostic logic inference. Bu et al. [31] presented a hybrid DUCG to reduce the misdiagnosis caused by outpatient triage error and help triage nurses improve their triage accuracy.
In addition, many uncertainty theories have been incorporated into DUCGs to represent the experience and knowledge of experts. For instance, Li et al. [22] proposed a DUCG based on picture fuzzy sets for uncertain knowledge representation and reasoning in root cause diagnosis. Li and Yue [20] introduced a DUCG model for root cause analysis, in which intuitionistic fuzzy sets were used to describe uncertain events. Li et al. [21] developed a cloud reasoning DUCG model, in which the cloud model theory was employed to handle the fuzziness and randomness of uncertain information simultaneously. Combining a fuzzy decision tree with the DUCG, Zhao et al. [32] proposed a simplified DUCG model for fault diagnosis in nuclear power plants.
Furthermore, many researchers have introduced different knowledge inference algorithms for DUCGs. Nie and Zhang [33] suggested an inference algorithm of DUCG based on conditional stochastic simulation for complex cases with many state-unknown intermediate variables. Zhou and Zhang [14] integrated DUCG with event trees to perform probabilistic safety assessments by considering the problems of dependencies and circular loops. In [20,21], the technique for order preference by similarity to an ideal solution (TOPSIS) was utilized to implement fuzzy knowledge inference in DUCG. In [22], an enhanced knowledge reasoning algorithm based on the picture fuzzy operators was developed to resolve causal inference problems. Dong et al. [34] provided a methodology for modeling and reasoning about complex faults with negative feedback using the cubic DUCG. Hao et al. [35] proposed a diagnostic modeling and reasoning system using DUCG for the intelligent diagnosis of jaundice, considering the causal interactions among diseases and symptoms.
Based on the above literature review, it can be seen that many upgraded DUCGs have been developed for KRR. Nevertheless, no research has combined PULSs with the EDAS method in DUCG. Additionally, most studies lack the capacity to deal with the conflicting judgements of domain experts on knowledge parameters. Hence, this article aims to present a new type of DUCG that combines PULSs and the EDAS method for KRR. In addition, a preference modifying-based method is used to detect and eliminate the conflict among experts so that accurate knowledge parameters are obtained in the knowledge acquisition process, an important issue that has received little attention in the literature.
4. The Proposed PUL-DUCG Model
4.1. Definition of the PUL-DUCG
In this section, a new type of DUCG based on PULSs and the EDAS method [47] is presented for KRR.
In the classical DUCG, events are represented by graphical symbols. A square represents the B-type variable, which can only be an independent cause. A circle represents the X-type variable, which is divided into two subtypes: one that can represent only a consequence, and one that can represent both a consequence and a cause. A certain state j of variable Vi (Vi can be a B-type variable or an X-type variable of the second subtype) represents a parent event. A certain state k of a child variable, which can be an X-type variable of either subtype, represents a child event. The connection event variable, drawn as an arrow from the parent event to the child event, represents the cause-and-effect relationship between the two. The causal mechanism by which the parent event independently causes the child event is quantified using a virtual event, with all knowledge parameters given in a virtual event matrix for p child events and q states of child events. A typical PUL-DUCG is shown in Figure 2. Similar to the normal DUCG model, the variables of the PUL-DUCG model can be expressed as
When there are multiple states for a variable, the model can be expressed as
4.2. Knowledge Acquisition and Representation
In this stage, the PULSs are used to acquire the knowledge parameters of the PUL-DUCG (e.g., rn;i and bij). Here, we use the knowledge parameter bij as an example to illustrate the knowledge acquisition process. Suppose that the parameter bij is evaluated by l experts, each of whom is assigned a weight that reflects their background and experience. The knowledge assessment matrix of the kth expert collects that expert's linguistic evaluation of the jth state of each root event Bi. Next, the proposed knowledge acquisition is explained in detail.
By using the PULWA operator, all individual knowledge assessment matrices can be aggregated to obtain the collective knowledge assessment matrix.
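The aggregation step can be illustrated with a minimal Python sketch. It assumes the operational laws commonly used for Pythagorean uncertain linguistic values: linguistic subscripts are averaged with the expert weights, and the membership and non-membership degrees follow the Pythagorean fuzzy weighted average. The class `PULV`, the function `pulwa`, and all numbers are hypothetical and are not taken from the paper's Equation (12), which may differ in detail.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PULV:
    """Assumed form of one evaluation: <[s_lo, s_hi], (mu, nu)>."""
    lo: float  # subscript of the lower linguistic term
    hi: float  # subscript of the upper linguistic term
    mu: float  # membership degree
    nu: float  # non-membership degree

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lower linguistic term exceeds upper term")
        # Pythagorean condition: mu^2 + nu^2 <= 1 (small tolerance for rounding)
        if self.mu ** 2 + self.nu ** 2 > 1.0 + 1e-9:
            raise ValueError("mu^2 + nu^2 must not exceed 1")


def pulwa(values, weights):
    """Weighted average of expert evaluations under the assumed laws."""
    if not values or len(values) != len(weights):
        raise ValueError("need one weight per evaluation")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("expert weights must sum to 1")
    pairs = list(zip(values, weights))
    lo = sum(w * v.lo for v, w in pairs)
    hi = sum(w * v.hi for v, w in pairs)
    mu = math.sqrt(1.0 - math.prod((1.0 - v.mu ** 2) ** w for v, w in pairs))
    nu = math.prod(v.nu ** w for v, w in pairs)
    return PULV(lo, hi, mu, nu)


# Hypothetical evaluations of one parameter by three experts.
evaluations = [PULV(2, 3, 0.6, 0.7), PULV(4, 5, 0.9, 0.3), PULV(3, 4, 0.8, 0.5)]
collective = pulwa(evaluations, [0.3, 0.3, 0.4])
```

Applying `pulwa` entry by entry to the individual matrices would yield a collective matrix. Aggregating identical evaluations returns the same evaluation, which is a quick sanity check on the sketch.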
According to Equation (8), any child event can be expressed as a combination of its parent events and connection events. The expression can then be simplified by removing the events that the DUCG shows to be unrelated.
For example, as shown in Figure 2, the child events X2 and X3 are expressed as:
Using Equations (13) and (14), the expression of X2 is expanded as:
A duplicated variable X2 appears on the right-hand side of the equation, so X2 can be removed. When X2 is deleted, it disappears as a parent event of X3, so X3 has only one parent event, B4. The affected term is then rewritten with the superscript index {2}, which indicates event X2. Finally, the first-order cut-set expression of the target event X2 can be expressed as:
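The loop-breaking expansion described above can be sketched in Python as a recursive walk over parent events. This is only an illustration under stated assumptions: the function `causal_chains` is hypothetical, and so is the parent map, in which X2 and X3 cause each other, B4 is a root cause of X3, and B1 is an assumed root cause of X2 (the actual structure of Figure 2 is not reproduced here). Connection events and their parameters are omitted.

```python
def causal_chains(target, parents, visited=()):
    """Return every causal chain ending at `target`, root cause first.

    A parent that already lies on the current chain is skipped, which
    mirrors removing a duplicated variable to break a directed cycle.
    """
    visited = visited + (target,)
    direct = parents.get(target, [])
    if not direct:  # no parents: a root (B-type) event starts the chain
        return [(target,)]
    chains = []
    for parent in direct:
        if parent in visited:  # duplicated variable, drop this branch
            continue
        for chain in causal_chains(parent, parents, visited):
            chains.append(chain + (target,))
    return chains


# Hypothetical graph: X2 <- {B1, X3}, X3 <- {X2, B4}.
parents = {"X2": ["B1", "X3"], "X3": ["X2", "B4"]}
chains_to_x2 = causal_chains("X2", parents)
```

With this map, expanding X2 through X3 drops X2 as a parent of X3, so only B4 remains on that branch, in line with the simplification described in the text.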
4.3. Reasoning Algorithm
In this section, we use the EDAS method to conduct knowledge reasoning in the proposed PUL-DUCG model, which can locate the event with the maximal probability. The detailed reasoning steps of the PUL-DUCG are as follows:
In this study, the occurrence probability of a target child event is computed by
where Hkj and E represent B-type events and X-type events, respectively.
Suppose that we are interested in two candidate cause events in Figure 3. Namely, we want to know which of the two is more likely to have caused an observed event. The probabilities of the two candidates can then be computed by
The average probability for each target child event can be derived by
As a result, the average probability vector is established.
The positive distance matrix can be obtained by
The negative distance matrix can be derived by
The weighted sum of the probability assessment positive distance (SPi) is calculated by
In a similar way, the weighted sum of the probability assessment negative distance (SNi) is obtained by
The normalized values of SPi and SNi for each event can be obtained by
The appraisal score for the ith event can be determined by
The greater the appraisal score, the higher the occurrence probability of the target child event. Therefore, all the child events can be ranked in descending order of their appraisal scores.
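The sequence of steps above (average, positive and negative distances, weighted sums, normalization, appraisal score) can be sketched in Python. The sketch uses the standard EDAS formulas on crisp positive numbers where larger values are better. It does not reproduce the improved, PUL-valued variant used in this paper. The function names, the weights, and the sample matrix are hypothetical.

```python
def edas_scores(matrix, weights):
    """Standard crisp EDAS appraisal scores.

    matrix[i][j] is the crisp value of candidate event i in column j;
    weights[j] is the weight of column j.
    """
    n_rows, n_cols = len(matrix), len(weights)
    if n_rows == 0 or any(len(row) != n_cols for row in matrix):
        raise ValueError("matrix rows must match the number of weights")
    # Average solution per column
    avg = [sum(row[j] for row in matrix) / n_rows for j in range(n_cols)]
    if any(a <= 0 for a in avg):
        raise ValueError("column averages must be positive")
    # Positive and negative distances from the average
    pda = [[max(0.0, row[j] - avg[j]) / avg[j] for j in range(n_cols)]
           for row in matrix]
    nda = [[max(0.0, avg[j] - row[j]) / avg[j] for j in range(n_cols)]
           for row in matrix]
    # Weighted sums SP_i and SN_i
    sp = [sum(w * d for w, d in zip(weights, row)) for row in pda]
    sn = [sum(w * d for w, d in zip(weights, row)) for row in nda]
    # Normalized values, guarding against all-zero distances
    max_sp, max_sn = max(sp), max(sn)
    nsp = [s / max_sp if max_sp > 0 else 0.0 for s in sp]
    nsn = [1.0 - s / max_sn if max_sn > 0 else 1.0 for s in sn]
    # Appraisal score: mean of the two normalized values
    return [(p + q) / 2.0 for p, q in zip(nsp, nsn)]


def rank_events(names, scores):
    """Order event names by descending appraisal score."""
    ordered = sorted(zip(names, scores), key=lambda t: t[1], reverse=True)
    return [name for name, _ in ordered]


# Hypothetical crisp probabilities for three candidate events.
scores = edas_scores([[0.2, 0.3], [0.5, 0.6], [0.8, 0.9]], [0.5, 0.5])
ranking = rank_events(["e1", "e2", "e3"], scores)
```

In this crisp form every appraisal score lies between 0 and 1, and the event furthest above the average in the weighted sense is ranked first.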
5. Case Study
In this section, an example of the root cause analysis of an abnormal aluminum electrolysis cell condition [48] is used to demonstrate the feasibility and practicability of the proposed PUL-DUCG model.
5.1. Problem Description
The aluminum electrolysis cell is the most important equipment in aluminum electrolysis. Due to the interference of complex physical and chemical reactions under high temperatures and corrosive conditions, the cell conditions in aluminum electrolysis are quite complicated. These conditions can be judged based on the superheat degree [49]. Therefore, the root cause analysis of an abnormal aluminum electrolysis cell condition can be transformed into the root cause analysis of an abnormal superheat degree. The superheat degree is mainly determined by technical parameters; if these parameters are abnormal, the superheat degree becomes abnormal as well. Thus, it is of vital significance to identify the abnormal technical parameter that leads to an abnormal superheat degree. By consulting experienced technicians and experts, the technical parameters aluminum height (AH), electrolyte level (EL), molecular ratio (MR), heat insulation capacity (IP), electrolyte temperature (ET), voltage fluctuation (VF) and crystallization temperature (CT) are chosen for assessing the abnormal superheat degree. These technical parameters, together with their corresponding roles, are shown in Table 1 [48]. Subsequently, the proposed PUL-DUCG is established for root cause analysis of the abnormal superheat degree, as shown in Figure 3.
Five experts were invited to evaluate the knowledge parameters of the PUL-DUCG model using the linguistic term set S = {s0 = Very low, s1 = Low, s2 = Slightly low, s3 = Normal, s4 = Slightly high, s5 = High, s6 = Very high}. These experts, who come from different departments or organizations, include an electrical engineer, a chemical production engineer, a nonferrous metal engineer and two professors of aluminum electrolysis. The experts are assigned weights in line with their different background knowledge and experience. For example, the experts' evaluations of parameter b are listed in Table 2.
5.2. Implementation and Results
First, the knowledge parameters of the PUL-DUCG are determined according to the proposed knowledge acquisition method. Then, the reasoning algorithm of the proposed PUL-DUCG model is implemented to find out the root cause event with the maximum probability. We use the knowledge parameter a as an example to explain the knowledge acquisition process.
Step 1: By applying Equation (12), the collective knowledge assessment matrix is derived as shown in Table 3.
Step 2: The expressions of the target child events Xnk under different states can be simplified as
Step 3: Via Equation (17), the occurrence probabilities of the target child events Xnk under different states are calculated as listed in Table 4.
For example, when events X42, X52, X61 and X73 occur, we need to evaluate the different states of the target variables, namely the fuzzy representation of the posterior probability of each state k, where k = 1, 2, 3.
Step 4: Via Equation (19), the average probability vector is established as
Step 5: With Equations (20) and (21), the positive distance matrix and the negative distance matrix of the target child events to the average probability vector are determined as
Step 6: By using Equations (22) and (23), the weighted sums SPi and SNi (i = 1, 2, ..., 9) for each target child event are determined as expressed in Table 5.
Step 7: Via Equations (24) and (25), the normalized values of SPi and SNi (i = 1, 2, ..., 9) for each target child event are computed and shown in Table 5.
Step 8: Applying Equation (26), the appraisal scores for the nine target child events are displayed in Table 5. According to the descending order of the appraisal scores, the ranking of the nine target child events is determined as shown in Table 5.
Accordingly, the event ranked first in Table 5 has the highest occurrence probability, indicating that it is the most likely cause event under the observed evidence. This result is consistent with the statistical data obtained from the actual production.
5.3. Comparisons and Discussions
To verify the effectiveness and advantages of the proposed PUL-DUCG model, a comparative analysis with the fuzzy-Bayesian network (FBN) [48], the DUCG [1], the intuitionistic fuzzy DUCG (IFDUCG) [20], and the cloud model DUCG (CDUCG) [21] is conducted in this section. The reasoning results of the listed methods are presented in Table 6.
First, the accuracy of the PUL-DUCG was the same as that of the CDUCG, namely 95.6%. This confirms that the proposed PUL-DUCG model is effective for root cause analysis. Furthermore, the proposed method has advantages over the compared methods. Since the DUCG model ignores uncertain information, it is likely to produce inaccurate reasoning results in complex situations. Compared with the IFDUCG, the proposed model integrates the advantages of PULSs and linguistic variables. Thus, the PUL-DUCG has a more powerful representation ability than the IFDUCG model for handling fuzzy and uncertain information in practical situations. In addition, the TOPSIS method in [19] is used to find the root event with the maximum posterior probability, whereas the proposed model determines the probability ranking of events by the EDAS method. The TOPSIS method did not consider the weights of events, which is not in line with real situations. Moreover, the CDUCG and the IFDUCG have limitations in handling the inconsistent cognition of experts. Therefore, the knowledge parameters acquired by the proposed PUL-DUCG model are more credible because they are based on a preference modifying-based method.
From Table 6, it can be seen that the accuracy of the FBN method is lower than that of the DUCG. This is mainly because the FBN can only depict quantitative evaluation information; when the available information is qualitative, the FBN is unable to describe its fuzziness. In contrast, the PUL-DUCG model is built on PULSs, which can effectively describe both qualitative and quantitative information. Moreover, the FBN requires a large amount of statistical and empirical information to express the precise conditional probabilities between events. In comparison, the PUL-DUCG permits incomplete information to be represented, which could significantly reduce the workload and difficulty of building a knowledge base.
From the above comparison analysis, it can be found that more accurate and reasonable knowledge acquisition and reasoning results can be determined by the proposed PUL-DUCG model. Compared with the existing DUCG methods, the advantages of the developed PUL-DUCG model are summarized as follows:
1. The model uses the membership degree and non-membership degree of PULSs to express the fuzzy knowledge of experts. The fuzziness of the initial information is retained, which avoids information loss and distortion in the process of causal analysis. Hence, the proposed model is more capable of representing and reasoning with uncertain information.
2. Based on a correction algorithm for resolving the contradictory opinions of specialists, the proposed model can manage the conflicts and inconsistencies among expert evaluations of knowledge parameters. Thus, it can determine the knowledge parameters more precisely when acquiring knowledge.
3. By means of the modified EDAS method, the proposed model can obtain rational and distinguishable occurrence probabilities and select the event with the maximal probability. Therefore, it is more accurate and effective in practical causal analysis problems.
6. Conclusions
In this paper, a new PUL-DUCG model based on PULSs and the EDAS method was developed to enhance the KRR ability of conventional DUCGs. The PULSs are adopted to characterize the fuzziness and uncertainty of expert knowledge in knowledge representation. A modified EDAS method is utilized to determine the root cause event with the maximal probability. Moreover, the method takes the conflicting opinions among experts into account, which avoids the limitations of individual expert evaluations and yields more accurate and reliable knowledge parameters. Finally, a practical aluminum electrolysis case was presented to illustrate the application and advantages of the proposed PUL-DUCG model. The results indicate that the PUL-DUCG model is a promising and effective modeling technique for knowledge representation, acquisition and reasoning.
However, the proposed PUL-DUCG model has some limitations that can be addressed in further studies. First, the complexity of the proposed algorithm increases when a larger number of experts are involved. Thus, further work is needed to handle conflicting opinions in a large group environment. Second, other knowledge parameters, such as the time factor, can be considered to improve the online analysis ability of the DUCG. Another research direction is developing a computer-based application system for the PUL-DUCG, which can perform automatic diagnosis once the knowledge parameters obtained from experts are entered.