Article

Efficient Consistency Check Based on Perceived Initial Deviation

1
School of Mechanical and Electrical Engineering, Huainan Normal University, Huainan 232038, China
2
State Key Laboratory of Industrial Control Technology, School of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China
3
School of Biological Engineering, Huainan Normal University, Huainan 232038, China
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(23), 4669; https://doi.org/10.3390/electronics13234669
Submission received: 6 September 2024 / Revised: 19 November 2024 / Accepted: 21 November 2024 / Published: 26 November 2024
(This article belongs to the Section Computer Science & Engineering)

Abstract

The behavior recorded by an information system often differs from the behavior in the initial model, since the business process changes constantly in actual operation. To enable event logs to be replayed well on the process model, the set of optimal alignments first needs to be determined. Existing methods compare all the corresponding activities in each alignment, which causes a lot of unnecessary work. Thus, we propose that a non-optimal alignment be perceived beforehand according to the relationship between the location of the initial deviation and the number of deviations. The perceptible regions in the process model are divided based on the behavioral characteristics of various substructures. The comparison of an alignment is terminated if the location of its initial deviation is less than the previously recorded value in the perceptible region, and this alignment is judged to be non-optimal. Otherwise, the alignment, which may be optimal, is compared completely. The OPS plug-in was executed on data sets from various networks and BPIC2020, and the results showed that the search efficiency could be improved under the premise of guaranteeing optimality.

1. Introduction

1.1. Research Background

In recent years, process mining has become an important technique for extracting valuable information from event logs [1]. Process mining mainly includes the following three steps: process discovery, conformance check and model repair [2,3,4,5]. Process discovery refers to extracting useful information from event logs to construct the corresponding process model [6,7]. Differences usually exist between the currently recorded behavior and the initial model because the event log varies with the actual situation [8]. Thus, it is necessary to complete the subsequent model repair according to these differences [9,10]. Trace alignment detects deviations by correlating each trace with all firing sequences, while behavior alignment identifies all inconsistent behavior between the event log and the process model based on the division of the event structure [11,12]. Two kinds of deviations between the event log and the initial model can be detected by applying a robust analysis technique, namely insert and skip deviations [13]. An insert deviation refers to an activity that is only present in the event log. A skip deviation, in contrast, refers to an activity that is specific to the initial model, which affects the replay of the event log in the process model [14]. The inconsistent behavior detected by the behavior-checking technology mainly includes the following two types: (i) unfitted behavior in the event log that cannot be replayed in the process model and (ii) additional behavior that can only be captured in the initial model and that does not affect the normal replay of the event log. The additional behavior in the model needs to be retained as much as possible so that the similarity between the repaired and initial models is ensured [15,16]. Trace alignment involves the detection of deviation elements, and its ultimate goal is to measure the optimal performance of the event log replay in the process model. Thus, there is no need to consider the influence of additional behavior in the model [17].
The set of optimal alignments yields the minimum deviation cost between the event log and the process model. The deviation cost is equal to the number of deviations when the unit cost of all deviations is the same [18]. The brute-force search for the optimal alignment compares each trace in the event log with all firing sequences in the process model, which can cause a lot of unnecessary computation [19]. Therefore, a huge amount of work must be devoted to ensuring the optimality of the conformance check when the data set is large or has a complex network structure. In this study, we propose that the non-optimality of some alignments can be perceived beforehand based on the location of the initial deviation. This method focuses on the control-flow perspective. Under the premise of ensuring the optimal cost of trace alignment, the efficiency of the conformance check is improved mainly in the following ways: (i) it is determined whether the initial deviation of an alignment occurs within the perceptible region; (ii) when the initial deviation occurs in the perceptible region, a smaller location of the initial deviation corresponds to a larger number of deviations; and (iii) the location of the effective initial deviation is continuously updated while each trace in the event log is aligned with the process model. Therefore, an alignment is terminated when the location of its initial deviation is smaller than the previously recorded one. In this case, the non-optimal alignment is identified without a complete comparison. It is worth noting that the perception processes of the individual traces in the event log are independent of each other.

1.2. Related Work

Alignment associates every trace in the event log with all firing sequences in the process model, and their deviations are detected accordingly [20]. This technique mainly involves the following three analyses: conformance check, cost accounting and optimal alignment search.
The conformance between the event log and the initial model is measured using four different metrics: fitness, precision, simplicity and generalization. Particularly, the fitness is the most important [21]. Conformance checks can be divided into the following two types according to different detection objects: (i) the deviation element between the event log and the initial model, which is detected based on trace alignment [22,23]; (ii) the behavior that does not match the event log and the initial model, which is discovered based on behavior mining [24]. Conformance checks can also be divided into the following two types from the different perspectives of different workflows: (i) conformance checks based on the control flow [25,26]; (ii) conformance checks from multiple perspectives, including role, data flow and stochastic language, among others [27,28,29]. Conformance checks can be further divided into the following two types, according to the integrity of the check: (i) conformance of the business process in the current state, which is only checked [30]; (ii) conformance, which is checked in real-time with changes in the business process [31,32].
Cost accounting refers to the total cost of asynchronous movement in an alignment. The unit cost of synchronous movement is set to 0, while the unit cost of asynchronous movement is set to 1 in the unweighted generic business process (the unit cost is 0 when an invisible transition is included in an asynchronous movement) [33]. The process model is repaired by processing the deviation information according to the minimum cost, and the event log can be replayed in the repaired process model.
The optimal alignment search is a key technique in conformance checking, which allows the optimal fitness of the control flow to be analyzed. The performance of an optimal alignment search is mainly affected by two criteria: (i) the minimal deviation cost and (ii) the search workload. In the simplest approach, the event log is directly compared with the process model, and the fitness is analyzed by calculating the total cost of deviations. This kind of conformance check is limited to simple models or single operations, and it cannot guarantee the optimal fitness [34,35]. With the A* algorithm, each trace in the event log is aligned with all firing sequences of the initial model, and the set of optimal alignments is selected through cost evaluation [36]. The location and severity of deviations can be pinpointed by adjusting the differences between the event log and the process model [37]. For this purpose, the process model and the event log are each transformed into an automaton, and the two automata are then compared according to their synchronous product. The synchronous product is computed using the A* heuristic and admissible heuristic functions. All differences are captured by this synchronous product, allowing the minimum cost to be obtained [38]. Although the optimality of the alignment set is guaranteed by the A* algorithm and the synchronous product, both involve a huge amount of work on a complicated data set. The fitness can be obtained accurately using the iterative decomposition approach. The best fitness can be estimated within 10 min, and the unnecessary time spent calculating the accurate value can be decreased when the fitness interval is narrow enough [39]. Based on the structural and behavioral characteristics of the process model, the overall search space for the optimal alignment can be reduced through the use of effective heuristics and trace replay [40].
In this study, the perceptible regions of the process model are first determined based on the behavior characteristics of various substructures. Then, some non-optimal alignments can be terminated early when the initial deviation is located in the perceptible region. In this way, the search efficiency of optimal alignment is efficiently improved.

2. Basic Definition

Definition 1
(Event log). The tuple $L = (\partial, o, e, l, \vartheta, \prec)$ is recorded as the event log. $\partial$ is a trace in the event log, and $o$ is the unified case number of traces that occur repeatedly. Since there may be different traces in the event log, all the case numbers belong to the case set $O$, i.e., $o \in O$ and $\bigcup_{o=1}^{|O|} \partial_o \subseteq L$. $e$ is an event element in the log, denoted as $e \in L$. $l$ is a function that assigns each event element to its corresponding label, denoted as $l(e) = \vartheta$, where the label $\vartheta$ belongs to the label set of all events. $\prec$ is the flow relationship between adjacent events, denoted as $\prec = \bigcup_{i=1}^{I} e_{i-1} \times e_i$.
Definition 2
(Labeled process model). The tuple $M = (p, t, \lambda, \sigma, \tau, \delta, j, p_{ini}, p_{final}, \prec)$ is recorded as the labeled process model. $p$ and $t$ represent the place and transition, respectively. $\lambda$ is the function that assigns each transition to the corresponding label, that is, $\lambda(t) = \sigma \lor \lambda(t) = \tau$. $\tau$ is the label of an invisible transition, which has no real meaning. $\sigma$ is a label from the finite label set of the process model, so $\lambda(t)$ is either such a label or $\tau$. $\delta$ is a complete sequence from the initial to the terminated node, and $j$ is the case number of each sequence $\delta$, denoted as $\bigcup_{j=1}^{|J|} \delta_j \subseteq M$. $p_{ini} \in P$ and $p_{final} \in P$ are the initial and the terminated places, respectively. The process model $M$ is a workflow net if and only if $|p_{ini}| = |p_{final}| = 1$. $\prec$ is the flow relationship between adjacent transitions, denoted as $\prec = \bigcup_{k=1}^{K} t_{k-1} \times t_k$.
Definition 3
(The unit cost setting of alignment). The corresponding activities of a trace $\partial_i$ and a firing sequence $\delta_j$ are compared by alignment, denoted as $Align(\partial_i, \delta_j) = (Move_{LM}, \widetilde{Move}_{L}, \widetilde{Move}_{M}, Esert(l(e)), Skip(\lambda(t)))$, $Align(\partial_i, \delta_j) \in Align$ ($Align$ represents the set of all alignments between the event log and the process model). The set of synchronous movements is denoted as $Move_{LM}$, $move_{LM} \in Move_{LM}$. The set of asynchronous movements belonging to the event log is denoted as $\widetilde{Move}_{L}$, $\widetilde{move}_{L} \in \widetilde{Move}_{L}$. The set of asynchronous movements belonging to the process model is denoted as $\widetilde{Move}_{M}$, $\widetilde{move}_{M} \in \widetilde{Move}_{M}$. The unit cost of an asynchronous movement is 1 in the unweighted business process, denoted as $cost(\widetilde{move}_{L} / \widetilde{move}_{M}) = 1$. The unit cost of a synchronous movement is 0, denoted as $cost(move_{LM}) = 0$. The unit cost is 0 when an invisible transition is included in an asynchronous movement. The insert deviation $Esert(l(e))$ and the skip deviation $Skip(\lambda(t))$ are produced by $\widetilde{Move}_{L}$ and $\widetilde{Move}_{M}$, respectively ($e \in L$, $t \in M$). All movements in an alignment are recorded in Table 1, and the costs of different movements are accounted for.
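The unit cost setting of Definition 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple representation of a move, the use of `None` for "no move on this side", the label name `TAU` and both function names are hypothetical and do not come from the paper.

```python
from typing import List, Optional, Tuple

# A move pairs a log label with a model label.
# None means that this side does not move (assumed representation).
Move = Tuple[Optional[str], Optional[str]]

TAU = "tau"  # assumed name for the label of an invisible transition


def move_cost(move: Move) -> int:
    """Unit cost of one move in the unweighted setting."""
    log_label, model_label = move
    if log_label is not None and model_label is not None:
        if log_label != model_label:
            raise ValueError("a synchronous move needs equal labels")
        return 0  # synchronous move
    if log_label is None and model_label == TAU:
        return 0  # asynchronous move on an invisible transition
    return 1  # insert deviation (log only) or skip deviation (model only)


def alignment_cost(moves: List[Move]) -> int:
    """Total cost of an alignment, i.e. its number of deviations."""
    return sum(move_cost(m) for m in moves)


# Hypothetical alignment: C is an insert deviation, D is a skip deviation.
example = [("A", "A"), ("B", "B"), ("C", None), (None, TAU), (None, "D")]
print(alignment_cost(example))  # 2
```

With unit costs, the returned value equals the number of deviations, which is the quantity compared when optimal alignments are selected.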
Definition 4
(Search for the optimal alignment). The alignment set is obtained by comparing each trace of the event log with all firing sequences in the process model, i.e., $Align = \bigcup_{o=1}^{|O|} \bigcup_{j=1}^{|J|} Align(\partial_o, \delta_j) = \bigcup_{o=1}^{|O|} \bigcup_{j=1}^{|J|} compare(\partial_o, \delta_j)$, $\partial_o \in L$, $\delta_j \in M$. The optimal alignment $Align_{op}$ is the minimum-cost alignment set associated with all traces, that is, $cost(Align(\partial_o, \delta_{op})) < cost(Align(\partial_o, \delta_j))$, $Align_{op} = \bigcup_{o=1}^{|O|} \bigcup_{op=1}^{|OP|} Align_{op}(\partial_o, \delta_{op})$, $\delta_j \in M \setminus \delta_{op}$, $\delta_{op} \in M \setminus \delta_j$. In other words, all alignments between the event log and the process model need to be obtained before the optimal alignment can be judged accurately. As shown in Figure 1, the search workload for the optimal alignment between event log $L$ and process model $M$ is $value(\bigcup_{o=1}^{|O|} \bigcup_{j=1}^{|J|} Align(\partial_o, \delta_j)) = value(\bigcup_{j=1}^{4} compare(\partial_1, \delta_j)) = 40$ ($value$ represents the number of comparisons in the optimal alignment evaluation). The optimal alignment between $L$ and $M$ is $Align_{op} = (\partial_1, \delta_1)$.
Definition 5
(Location of the initial deviation). The initial deviation is the first deviation in an alignment, and its location refers to its order of occurrence in the alignment. Figure 2 depicts the discovery process of the initial deviation. In Figure 2, a trace $\partial_0$ of $L$ and a sequence $\delta_2$ of $M$ are extracted, and the first difference $Skip(\lambda(t)) = e$ produced by the alignment of $\partial_0$ and $\delta_2$ is marked with a dashed blue line. It is worth noting that the invisible transition $\tau$ in the model is ignored in order to better observe the deviation.
The initial deviation is detected in an alignment, and the location of the initial deviation is determined according to the unit order of the alignment. The location of the initial deviation is denoted as $v(loc_{id}(Align(\partial_o, \delta_j)))$ ($v$ is the evaluation function, that is, the location of a moving unit in the alignment is converted into numerical form; $loc_{id}$ represents the occurrence location of the initial deviation in the alignment). $Align(\partial_1, \delta_1)$ and $Align(\partial_1, \delta_2)$ in Table 2 are used as examples. The asynchronous movement that produces the initial deviation $Esert(l(e_3)) = C$ is the third moving unit in $Align(\partial_1, \delta_1)$. $v(loc_{id}(Align(\partial_1, \delta_1))) = 3$ is obtained by the conversion of the evaluation function $v$, that is, the location of the initial deviation in $Align(\partial_1, \delta_1)$ is 3. The asynchronous movement in $Align(\partial_1, \delta_2)$ that produces the initial deviation $Skip(\lambda(t_2)) = C$ is the second unit, which is transformed by the evaluation function $v$ to obtain $v(loc_{id}(Align(\partial_1, \delta_2))) = 2$, that is, the initial deviation in $Align(\partial_1, \delta_2)$ occurs at 2. Therefore, the location of the initial deviation in $Align(\partial_1, \delta_1)$ is greater than that in $Align(\partial_1, \delta_2)$, denoted as $v(loc_{id}(Align(\partial_1, \delta_1))) > v(loc_{id}(Align(\partial_1, \delta_2)))$.
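As a small illustration of Definition 5, the location of the initial deviation can be read off an alignment by scanning its moving units in order. The sketch below assumes a hypothetical list-of-pairs representation of an alignment (with `None` marking the side that does not move) and assumes that moves on an invisible transition are skipped without being counted, following the remark that the invisible transition is ignored. The function name and the example alignment are illustrative and are not taken from Table 2.

```python
from typing import List, Optional, Tuple

# (log label, model label); None marks the side that does not move.
Move = Tuple[Optional[str], Optional[str]]


def first_deviation_location(moves: List[Move],
                             invisible: str = "tau") -> Optional[int]:
    """Return the 1-based position of the first deviation, or None."""
    position = 0
    for log_label, model_label in moves:
        if log_label is None and model_label == invisible:
            continue  # invisible transitions are ignored (assumption)
        position += 1
        if log_label is None or model_label is None:
            return position  # first asynchronous moving unit
    return None  # the trace is replayed without any deviation


# Hypothetical alignment whose third counted unit is an insert deviation.
moves = [("A", "A"), (None, "tau"), ("B", "B"), ("C", None), ("D", "D")]
print(first_deviation_location(moves))  # 3
```

Returning `None` for a deviation-free alignment is a design choice of this sketch; the text does not specify a value for that case.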
Definition 6
(Path partition of various substructures). Any path between the initial and the terminated nodes in a business process is called a complete path. Two types of sub-paths may be included in a complete path: a mandatory sub-path, which must occur, and a selective sub-path, which may occur, denoted as $p_{man}$ and $p_{sel}$, respectively. The unique behavior on each complete path comes from $p_{sel}$, while the common behavior belongs to $p_{man}$. The substructure corresponding to $p_{man}$ is the mandatory substructure, that is, $\{N_{cau} \cup N_{conc}\} = N_{man}$. $N_{cau}$ and $N_{conc}$ satisfy causal and concurrent behavior, respectively. Both belong to the mandatory substructures, that is, their activities occur on each complete path. $p_{sel}$ corresponds to the selective substructure, i.e., $N_{conf} = N_{sel}$ ($N_{conf}$ represents a substructure satisfying the conflict behavior relationship). According to the above rules, the various substructures are divided, and their corresponding sub-paths are given in Figure 3.
The level division of the selective substructure depends on the depth of selection. The selective substructure of the first level is regarded as the root of the tree structure. Then, the next level of the selective substructure is contained in the upper substructure, denoted as $N_{sel(leaf_1)} \subseteq N_{sel(root)}$, $N_{sel(n)} \subseteq N_{sel(n-1)}$ ($N_{sel(leaf_1)}$ represents the first leaf of the root structure, and $n$ indicates the level number). According to the order of occurrence, the selective substructures of the same level can be denoted as $N_{sel(n)}^{m}$ ($m$ represents the occurrence order). The mandatory substructure can be divided into prepositive and postpositive substructures, which are denoted as $N_{\dot{m}an(z)}$ and $N_{ma\dot{n}(d)}$. $N_{\dot{m}an(z)}$ and $N_{ma\dot{n}(d)}$ are mandatory substructures that occur before and after the selective structure, respectively ($z$ and $d$ represent the occurrence orders of $N_{\dot{m}an(z)}$ and $N_{ma\dot{n}(d)}$, respectively). It is worth noting that the type of every sub-path in a substructure corresponds to that of its substructure.
Definition 7
(Perceptible Region). The perceptible region is a set of special substructures in the process model, denoted as $\varphi_{p}^{M} \subseteq \{N_{sel(n)}^{m} \cup N_{\dot{m}an(z)} \cup N_{ma\dot{n}(d)}\}$. A subalignment between a subsequence $\dot{\delta}_j$ in $\varphi_{p}^{M}$ and any trace $\partial_o$ in the event log is denoted as $Align_{sub}(\partial_o, \dot{\delta}_j)$. The subalignment set between $\varphi_{p}^{M}$ and $\partial_o$ is $\bigcup_{j=1}^{|J|} Align(\partial_o, \dot{\delta}_j) = Align_{sub}(\partial_o, \dot{\delta}_j)$. The smaller the initial deviation location, the larger the number of deviations in $Align_{sub}(\partial_o, \dot{\delta}_j)$. A substructure $N_{conc}$ in $\varphi_{p}^{M}$ is described in Figure 4. $\dot{\delta}_1 = (BCDE)$ and $\dot{\delta}_2 = (BDCE)$ are included in $N_{conc}$. From Table 3, $Align_{sub}(\partial_o, \dot{\delta}_1)$ is produced by $\partial_o = (ABCDFG)$ and $\dot{\delta}_1 = (BCDE)$, and $Align_{sub}(\partial_o, \dot{\delta}_2)$ is produced by $\partial_o = (ABCDFG)$ and $\dot{\delta}_2 = (BDCE)$. According to Definition 5, the locations of the initial deviation in the two subalignments satisfy $v(loc_{id}(Align_{sub}(\dot{\partial}_1, \dot{\delta}_1))) = 4 > v(loc_{id}(Align_{sub}(\dot{\partial}_1, \dot{\delta}_2))) = 2$. The costs of $Align_{sub}(\dot{\partial}_1, \dot{\delta}_1)$ and $Align_{sub}(\dot{\partial}_1, \dot{\delta}_2)$ satisfy $cost(Align_{sub}(\dot{\partial}_1, \dot{\delta}_1)) = 1 < cost(Align_{sub}(\dot{\partial}_1, \dot{\delta}_2)) = 3$.
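The numbers in the example of Definition 7 can be reproduced with a short sketch. It rests on three assumptions that are not stated in this form in the text: the trace is first projected onto the activities of the substructure, the initial deviation is located right after the longest common prefix, and, with unit costs and only insert and skip deviations, the minimum number of deviations between two sequences equals the sum of their lengths minus twice the length of their longest common subsequence. All function names are hypothetical.

```python
from typing import Set


def project(trace: str, activities: Set[str]) -> str:
    """Keep only the events whose label belongs to the substructure."""
    return "".join(a for a in trace if a in activities)


def prefix_deviation_location(sub_trace: str, sub_sequence: str) -> int:
    """1-based position right after the common prefix (0 if identical)."""
    if sub_trace == sub_sequence:
        return 0
    k = 0
    limit = min(len(sub_trace), len(sub_sequence))
    while k < limit and sub_trace[k] == sub_sequence[k]:
        k += 1
    return k + 1


def deviation_count(a: str, b: str) -> int:
    """len(a) + len(b) - 2 * LCS(a, b): minimum insert/skip deviations."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for k, y in enumerate(b, start=1):
            cur.append(prev[k - 1] + 1 if x == y else max(prev[k], cur[k - 1]))
        prev = cur
    return len(a) + len(b) - 2 * prev[-1]


sub_trace = project("ABCDFG", set("BCDE"))  # "BCD"
for name, seq in [("delta_1", "BCDE"), ("delta_2", "BDCE")]:
    print(name, prefix_deviation_location(sub_trace, seq),
          deviation_count(sub_trace, seq))
# delta_1 4 1
# delta_2 2 3
```

Under these assumptions the later initial deviation (location 4) goes with the lower cost (1), and the earlier one (location 2) with the higher cost (3), which matches the values given in the definition.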

3. Recognition of the Perceptibility of Business Process

The search for the optimal alignment aligns each trace with all firing sequences, and the optimal alignment set is the one that produces the minimum cost. In order to improve the search efficiency of the optimal alignment, the non-optimality of some alignments can be determined beforehand according to the specified condition in the perceptible region. The different substructures in the process model are divided, and the perceptible region is determined by the behavioral characteristics of the various substructures.

3.1. Establishment of Perceptibility Condition

The costs of all alignments corresponding to each trace are obtained, and the alignment with the minimum cost is evaluated (the number of deviations is equivalent to the cost of deviations if and only if the cost of each deviation is “1”). The optimal alignment between the event log and process model consists of all optimal trace alignments. Some non-optimal alignments can be predicted by the location of the initial deviation.
The location of the initial deviation in any alignment can be denoted as $v(loc_{id}(Align(\partial_o, \delta_j)))$ according to Definition 5. If the location of the initial deviation in an alignment is not the largest among the alignments of a trace with the process model, then this alignment must be non-optimal. This is denoted as the perceptibility condition $C_p$: $v(loc_{id}(Align(\partial_o, \delta_j))) < v(loc_{id}(Align(\partial_o, \delta_{j'}))) \Rightarrow cost(Align(\partial_o, \delta_j)) > cost(Align(\partial_o, \delta_{j'})) \Rightarrow Align(\partial_o, \delta_j) \notin Align_{op}$, where $\delta_{j'}$ is another firing sequence of the model and $cost$ is the cost function. The comparisons of some non-optimal alignments are terminated at the location of the initial deviation when the perceptibility condition $C_p$ is satisfied, so unnecessary comparisons are saved. In order to take advantage of the condition $C_p$, it is necessary to determine the perceptible region in the process model. For example, a trace in the event log, $\partial_1 = (ABDEK)$, is aligned with all firing sequences of the process model shown in Figure 5. The location of the initial deviation in $Align(\partial_1 = (ABDEK), \delta_1 = (ABGEFK))$ is $v(loc_{id}(Align(\partial_1, \delta_1))) = 3$, and the number of deviations is also 3. The location of the initial deviation in $Align(\partial_1 = (ABDEK), \delta_2 = (ABDHIJK))$ is $v(loc_{id}(Align(\partial_1, \delta_2))) = 4$, and the number of deviations is also 4. Since $v(loc_{id}(Align(\partial_1, \delta_1))) < v(loc_{id}(Align(\partial_1, \delta_2)))$ and $cost(Align(\partial_1, \delta_1)) < cost(Align(\partial_1, \delta_2))$, $Align(\partial_1, \delta_1)$ may be optimal even though its initial deviation occurs earlier. This situation violates $C_p$.
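The effect of applying the early-termination rule outside the perceptible region can be illustrated with the two alignments of the Figure 5 example. The sketch below only uses the locations and deviation counts quoted in the text (3 and 3 for the first sequence, 4 and 4 for the second); the `Candidate` type, the function names and the processing order are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str
    first_deviation: int  # location of the initial deviation
    cost: int             # number of deviations


def condition_holds(a: Candidate, b: Candidate) -> bool:
    """C_p for the ordered pair (a, b): an earlier initial deviation
    in a must come with a strictly higher cost than b."""
    if a.first_deviation < b.first_deviation:
        return a.cost > b.cost
    return True


def terminated_early(candidates: List[Candidate]) -> List[str]:
    """Names of alignments the rule would stop at their first deviation."""
    best_location = 0
    stopped = []
    for cand in candidates:
        if cand.first_deviation < best_location:
            stopped.append(cand.name)  # judged non-optimal without full comparison
        else:
            best_location = cand.first_deviation  # update the recorded location
    return stopped


d1 = Candidate("delta_1", first_deviation=3, cost=3)
d2 = Candidate("delta_2", first_deviation=4, cost=4)

print(condition_holds(d1, d2))     # False
print(terminated_early([d2, d1]))  # ['delta_1']
```

Here the rule would discard the alignment with the lower cost, which is why the condition may only be applied once the initial deviation is known to lie in the perceptible region.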

3.2. Determination of Perceptible Region

According to the example in Section 3.1, the perceptibility condition is not always satisfied. The perceptibility condition is valid if and only if the initial deviation occurs in a specific range. Therefore, it is necessary to determine the perceptible region based on the behavioral characteristics of the various substructures in the process model. There are four steps to determine the perceptible region in the process model: (i) according to Definition 6, the various types of mandatory and selective substructures in the process model are divided; (ii) the initial node $a_i$ and the terminated node $a_f$ in the process model are determined and included in the perceptible region; (iii) according to Definition 6, the selective substructure with a fixed number of enabled activities is denoted as $N_{sel(n)}^{m}(num(ea)) = f(v)$ ($ea$ is the set of enabled activities that can occur; $f(v)$ is a fixed value). The condition $C_p$ is satisfied by an alignment when the initial deviation occurs in $N_{\dot{m}an(z)}$, $N_{ma\dot{n}(d)}$ or $N_{sel(n)}^{m}(num(ea)) = f(v)$; (iv) the selective substructure with an uncertain number of enabled activities, denoted as $N_{sel(n)}^{m}(num(ea)) = u(v)$ ($u(v)$ is an uncertain value), does not necessarily satisfy $C_p$. $C_p$ is satisfied when the number of enabled activities of $N_{sel(n)}^{m}(num(ea)) = u(v)$ does not exceed 2 in any case, denoted as $N_{sel(n)}^{m}(num(ea)) = u(v) \le 2$. Therefore, the whole perceptible region is denoted as $\varphi_{p}^{M} = \{a_i \cup N_{\dot{m}an(z)} \cup (N_{sel(n)}^{m}(num(ea)) = f(v)) \cup (N_{sel(n)}^{m}(num(ea)) = u(v) \le 2) \cup N_{ma\dot{n}(d)} \cup a_f\}$. The process model is denoted as $M_{\varphi_{p}^{M}}$ when it contains only substructures of the perceptible region.
This is the reason why Figure 5 does not satisfy $C_p$, namely, $N_{sel(1)}^{2}(num(ea)) = u(v) > 2 \Rightarrow N_{sel(1)}^{2} \in \varphi_{\neg p}^{M}$ ($\varphi_{\neg p}^{M}$ is the whole non-perceptible region, which is the set of all substructures that do not belong to the perceptible region $\varphi_{p}^{M}$). Figure 6a–c describes the three structures belonging to the perceptible region in the process model, i.e., $N_{sel(n)}^{m}(num(ea)) = f(v)$, $N_{sel(n)}^{m}(num(ea)) = u(v) \le 2$ and $N_{man}$ ($N_{\dot{m}an(z)}$, $N_{ma\dot{n}(d)}$), respectively.
The process of determining the perceptible region is described in Algorithm 1. Firstly, the initial and the terminated activities are assigned to the perceptible region (lines 1–3). Secondly, the two kinds of mandatory substructures are located and assigned to the perceptible region (lines 4–17). Thirdly, each selective substructure in which the number of enabled activities is a fixed value is located and assigned to the perceptible region (lines 22–24). A selective substructure is also assigned to the perceptible region when the number of enabled activities is uncertain but the total number of enabled activities does not exceed 2 (lines 25–27). It does not belong to the perceptible region when the number of enabled activities is uncertain and the total number of enabled activities exceeds 2 (lines 28–30). Finally, the output of Algorithm 1 is the perceptible range of the process model, denoted as $\varphi_{p}^{M}$.
Algorithm 1 Determine perceptible range
Input: process model $M$; initial activity $a_i$; terminated activity $a_f$; process model substructure $N$; pre-function $pre()$; post-function $post()$; numerical function $num()$; function of fixed value $fv()$; function of uncertain value $uv()$; set of enabled activities $ea$.
Output: the perceptible area $\varphi_{p}^{M}$.
1: If $a_i, a_f \in M$ and $a_i = pre(a^{*} \setminus a_i)$ and $a_f = post(a^{*} \setminus a_f)$ then
2:     $a_i, a_f \in \varphi_{p}^{M}$
3: end
4: $z = 1$
5: For each $z \in Z$ do
6:     If $N_{\dot{m}an(z)} \in M$ and $N_{\dot{m}an(z)} = pre(N_{sel(n)}^{m})$ then
7:         $N_{\dot{m}an(z)} \in \varphi_{p}^{M}$
8:     end
9:     $z = z + 1$
10: end
11: $d = 1$
12: For each $d \in D$ do
13:     If $N_{ma\dot{n}(d)} \in M$ and $N_{ma\dot{n}(d)} = post(N_{sel(n)}^{m})$ then
14:         $N_{ma\dot{n}(d)} \in \varphi_{p}^{M}$
15:     end
16:     $d = d + 1$
17: end
18: $n = 1$
19: For $n \le num(n)$ do
20:     $m = 1$
21:     For $m \le num(m)$ do
22:         If $N_{sel(n)}^{m} \in M$ and $num(ea) = f(v)$ then
23:             $N_{sel(n)}^{m} \in \varphi_{p}^{M}$
24:         end
25:         If $N_{sel(n)}^{m} \in M$ and $num(ea) = u(v)$ then
26:             If $num(ea) \le 2$ then
27:                 $N_{sel(n)}^{m} \in \varphi_{p}^{M}$
28:             else
29:                 $N_{sel(n)}^{m} \notin \varphi_{p}^{M}$
30:             end
31:         end
32:         $m = m + 1$
33:     end
34:     $n = n + 1$
35: end
36: return $\varphi_{p}^{M}$
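The classification rules of Algorithm 1 can be sketched for a flat (non-nested) model as follows. This is a simplified illustration, not the authors' implementation. It assumes that each substructure is described by the number of enabled activities on each of its branches, that a "fixed" number means all branches have the same count, and that a branch holding only an invisible transition counts as 0. The `Substructure` type, the function names and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Substructure:
    name: str
    kind: str                  # "mandatory" or "selective"
    enabled_counts: List[int]  # enabled activities per branch (assumed encoding)


def is_perceptible(sub: Substructure) -> bool:
    if sub.kind == "mandatory":
        return True            # prepositive or postpositive mandatory part
    counts = set(sub.enabled_counts)
    if len(counts) == 1:
        return True            # fixed number of enabled activities
    return max(counts) <= 2    # uncertain number, but never more than 2


def perceptible_region(initial: str, final: str,
                       subs: List[Substructure]) -> List[str]:
    """Names of the elements that belong to the perceptible region."""
    region = [initial]
    region += [s.name for s in subs if is_perceptible(s)]
    region.append(final)
    return region


subs = [
    Substructure("N_pre", "mandatory", [4]),
    Substructure("N_sel_a", "selective", [2, 2]),  # fixed value 2
    Substructure("N_sel_b", "selective", [0, 3]),  # uncertain, maximum 3
    Substructure("N_sel_c", "selective", [1, 2]),  # uncertain, maximum 2
    Substructure("N_post", "mandatory", [1]),
]
print(perceptible_region("a_i", "a_f", subs))
# ['a_i', 'N_pre', 'N_sel_a', 'N_sel_c', 'N_post', 'a_f']
```

Nested substructures are outside the scope of this sketch and need the additional rules discussed for Figure 7.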
The substructure of a real-life business process can contain various nested types [41]. Four common nested substructures are described in Figure 7. The perceptible nested substructure needs to be analyzed according to the specific situation.
The selective substructure $N_{sel(2)}^{1}$ in Figure 7a is contained in the selective substructure $N_{sel(1)}^{1}$, and the number of enabled activities of $N_{sel(1)}^{1}$ is 1. Thus, $N_{sel(1)}^{1} \in \varphi_{p}^{M}$. The selective substructure $N_{sel(1)}^{1}$ is contained in the mandatory substructure $N_{man}$, as shown in Figure 7b. The mandatory substructure $N_{\dot{m}an}$ is regarded as a special selective substructure $N_{man(sel(1))}^{1}$. $N_{man(sel(1))}^{1}(num(ea)) \ne f(v)$ and $\max(N_{man(sel(1))}^{1}(num(ea))) = 4 > 2$, so $N_{man(sel(1))}^{1} \in \varphi_{\neg p}^{M}$ ($\max$ is the maximum function). As shown in Figure 7c, two mandatory substructures $N_{man}$ are contained in the selective substructure $N_{sel(1)}^{1}$. The selective substructure $N_{sel(2)}^{1}$ is contained in one of the mandatory substructures $N_{man}$. $N_{sel(1)}^{1}(num(ea)) \ne f(v)$ and $\max(N_{sel(1)}^{1}(num(ea))) = 4 > 2$, so $N_{sel(1)}^{1} \in \varphi_{\neg p}^{M}$. Two selective substructures $N_{sel(1)}^{1/2}$ and $N_{sel(1)}^{2/1}$ are contained in the mandatory substructure $N_{man}$, as shown in Figure 7d, denoted as $N_{man(sel(1))}^{1/2}$ and $N_{man(sel(1))}^{2/1}$. Since $m = 1/2$ and $m = 2/1$, the occurrence order of $N_{sel(1)}^{1/2}$ and $N_{sel(1)}^{2/1}$ is uncertain according to Definition 6. $N_{sel(2)}^{1/2}$ and $N_{sel(2)}^{2/1}$ are contained in $N_{man(sel(1))}^{1/2}$ and $N_{man(sel(1))}^{2/1}$, respectively. The structural property of $N_{man}$ is regarded as that of the selective substructure $N_{man(sel)}$. $N_{man(sel)}(num(ea)) \ne f(v)$ and $\max(N_{man(sel)}(num(ea))) = 5 > 2$ in $N_{man(sel)}$, so $N_{man(sel)} \in \varphi_{\neg p}^{M}$.
To sum up, the perceptible nested substructure needs to obey the following two rules: (i) the perceptibility condition of the selective substructure is directly considered when the outermost structure is the selective substructure; (ii) the mandatory substructure is judged to be a perceptible substructure if it does not contain a selective substructure. Otherwise, the mandatory substructure needs to be judged as the type of selective substructure.
The determination of perceptible nested substructure is described in Algorithm 2. The perceptible region is determined according to the normal rule of selective substructure when a mandatory substructure is contained in a selective substructure (lines 2–14). The mandatory substructure is regarded as a selective substructure if a selective substructure is contained in this mandatory substructure (lines 15–17).
Algorithm 2 Determine the perceptibility of nested substructures
Input: process model $M$; mandatory concurrency substructure $N_{man}$; selective substructure $N_{sel}$; set of enabled activities $ea$; function of fixed value $fv()$; function of maximum value $\max()$; function $jN()$ that judges whether substructure $N$ is a perceptible region.
Output: judgment result $yes$ or $no$.
1: $N_{man} \in M$, $N_{sel} \in M$
2: If $N_{man} \subseteq N_{sel}$ and $N_{sel(man)} \in M$ then
3:     If $ea \in N_{sel(man)}$ then
4:         If $fv(ea) = yes$ then
5:             $jN(N_{sel(man)}) = yes$
6:         else
7:             If $\max(ea) \le 2$ then
8:                 $jN(N_{sel(man)}) = yes$
9:             else
10:                 $jN(N_{sel(man)}) = no$
11:             end
12:         end
13:     end
14: end
15: If $N_{sel} \subseteq N_{man}$ and $N_{man(sel)} \in M$ then
16:     $N_{man(sel)} \rightarrow N_{sel}$
17:     $jN(N_{man(sel)}) = jN(N_{sel})$
18: end
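The two rules for nested substructures can be sketched with a small tree model. This is an interpretation rather than the authors' code. It assumes that the "number of enabled activities" of a nested substructure is the set of possible activity counts over its complete passes: the counts of the children are summed for a mandatory node and united for a selective node, and a silent (invisible) transition counts as 0. The `Node` type and all function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    kind: str  # "activity", "silent", "mandatory" or "selective"
    children: List["Node"] = field(default_factory=list)


def contains_selective(node: Node) -> bool:
    return any(c.kind == "selective" or contains_selective(c)
               for c in node.children)


def possible_counts(node: Node) -> Set[int]:
    """All possible numbers of activities executed by one pass."""
    if node.kind == "activity":
        return {1}
    if node.kind == "silent":
        return {0}
    child_sets = [possible_counts(c) for c in node.children]
    if node.kind == "selective":
        return set().union(*child_sets)  # exactly one branch is taken
    totals = {0}
    for counts in child_sets:            # mandatory: every child occurs
        totals = {t + c for t in totals for c in counts}
    return totals


def is_perceptible(node: Node) -> bool:
    # Rule (ii): a mandatory substructure without a selective part is perceptible.
    if node.kind == "mandatory" and not contains_selective(node):
        return True
    # Otherwise the node is judged like a selective substructure.
    counts = possible_counts(node)
    return len(counts) <= 1 or max(counts) <= 2


act = lambda: Node("activity")
plain = Node("mandatory", [act(), act(), act()])
nested = Node("mandatory", [act(), Node("selective", [
    act(), Node("mandatory", [act(), act(), act()])])])
optional = Node("selective", [Node("silent"), Node("mandatory", [act(), act()])])

print(is_perceptible(plain), is_perceptible(nested), is_perceptible(optional))
# True False True
```

In the `nested` example the possible counts are 2 and 4, so the count is not fixed and its maximum exceeds 2, which is the same pattern as the non-perceptible cases described for Figure 7.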

3.3. Reverse Search for Perceived Range

There are many types of substructures in process models, which are divided according to the occurrence possibility of each path in the substructure. Therefore, the type of each substructure between the initial and end nodes is determined in order to accurately detect the perceptible and non-perceptible regions in the process model. The perceptible region in the process model is searched back-to-front according to Definition 6, which is called a reverse search. Figure 8 describes the process of a reverse search. Since $N_{sel(n)}^{m} \subseteq N_{sel(n-1)}^{m}$, $N_{sel(n)}^{m}$ and $N_{sel(n-1)}^{m}$ are regarded as a whole. $N_{sel(n-1)}^{m}(num(ea)) = 2$ indicates that the number of enabled activities in $N_{sel(n-1)}^{m}$ is a constant of 2, that is, $N_{sel(n-1)}^{m} \in \varphi_{p}^{M}$. $\max(N_{sel(n-1)}^{m-1}(num(ea))) = 3$ indicates that one branch of $N_{sel(n-1)}^{m-1}$ contains only a hidden transition, while the number of enabled activities on the other branch is 3, i.e., $N_{sel(n-1)}^{m-1} \in \varphi_{\neg p}^{M}$. $\max(N_{sel(n-1)}^{m-2}(num(ea))) = 2$ indicates that the maximum number of enabled activities of $N_{sel(n-1)}^{m-2}$ is 2, that is, $N_{sel(n-1)}^{m-2} \in \varphi_{p}^{M}$. $N_{sel(n-1)}^{m-3}(num(ea)) = 1$ indicates that the number of enabled activities in $N_{sel(n-1)}^{m-3}$ is a constant of 1, that is, $N_{sel(n-1)}^{m-3} \in \varphi_{p}^{M}$. $N_{\dot{m}an(z)}(num(ea)) = 4$ is the number of enabled activities of $N_{\dot{m}an(z)}$, which is a constant, that is, $N_{\dot{m}an(z)} \in \varphi_{p}^{M}$. Thus, the perceptible range in Figure 8 is $\varphi_{p}^{M} = M \setminus (N_{sel(n-1)}^{m-2} \cup N_{sel(n-1)}^{m-1})$. The perceptibility condition $C_p$ can be used to determine the optimal alignment when the initial deviation occurs in the perceptible region.
The reverse search for the perceptible region is described in Algorithm 3. Firstly, it is determined that the various substructures belong to the process model (line 1). If $N_{sel(n)}^{m}$ is not contained in $N_{man\cdot(d)}$, then $M_{\varphi_{pM}} \leftarrow N_{man\cdot(d)}$ (lines 3–6). Otherwise, $N_{man\cdot(d)}$ is regarded as $N_{sel(n)}^{m}$, and the algorithm jumps to line 10 (lines 7–9). If the number of enabled activities of $N_{sel(n)}^{m}$ is determined to be a fixed value from the back forward, then $M_{\varphi_{pM}} \leftarrow N_{sel(n)}^{m}$ (lines 13–17). If the number of enabled activities of $N_{sel(n)}^{m}$ is not a fixed value but the maximum number of enabled activities does not exceed 2, then $M_{\varphi_{pM}} \leftarrow N_{sel(n)}^{m}$ as well (lines 18–21). The level $n$ is reduced by 1 if the next substructure satisfies $N_{sel(n-1)}^{m} \supset N_{sel(n)}^{m}$ (lines 22–24). The occurrence order $m$ is reduced by 1 if the substructure $N_{sel(n)}^{m-1}$ is the preceding substructure of $N_{sel(n)}^{m}$ (lines 25–27). $M_{\varphi_{pM}} \leftarrow N_{man\cdot(z)}$ if $N_{sel(n)}^{m}$ is not contained in $N_{man\cdot(z)}$ (lines 29–32). Otherwise, $N_{man\cdot(z)}$ is regarded as $N_{sel(n)}^{m}$, and the algorithm returns to line 10 (lines 33–35). Finally, the output is the perceptible region of the process model $M_{\varphi_{pM}}$ (line 37).
Algorithm 3 Non-perceivable areas are eliminated by reverse search
Input:
process model $M$, mandatory substructure $N_{man}$, selective substructure $N_{sel}$, perceivable range of the process model $M_{\varphi_{pM}}$, initial activity $a_i$, final activity $a_f$, perceptible condition $C_p$, fixed value $fv$, function $j()$ that judges whether a substructure $N$ is a perceivable region.
Output: perceivable range of the process model $M_{\varphi_{pM}}$.
1: $N_{man} \subseteq M$, $N_{sel} \subseteq M$
2: $d = |d|$
3: For each $d \geq 1$ do
4:     If $N_{sel(n)}^{m} \not\subset N_{man\cdot(d)}$ then
5:         $j(N_{man\cdot(d)}) = yes$
6:         $M_{\varphi_{pM}} \leftarrow N_{man\cdot(d)} \wedge C_p(N_{man\cdot(d)}) = true$
7:     else
8:         $N_{man\cdot(d)} \rightarrow N_{sel(n)}^{m}$ and jump to line 10
9:     end
10:     $d = d - 1$
11: end
12: $n = num(n) \wedge m = num(m)$
13: For each $n \geq 1 \wedge m \geq 1$ do
14:     If $N_{sel(n)}^{m} \subseteq M \wedge num(ea) = fv \wedge ea \in N_{sel(n)}^{m}$ then
15:         $j(N_{sel(n)}^{m}) = yes$
16:         $M_{\varphi_{pM}} \leftarrow N_{sel(n)}^{m} \wedge C_p(N_{sel(n)}^{m}) = true$
17:     end
18:     If $N_{sel(n)}^{m} \subseteq M \wedge num(ea) \neq fv \wedge 0 \leq num(ea) \leq 2 \wedge ea \in N_{sel(n)}^{m}$ then
19:         $j(N_{sel(n)}^{m}) = yes$
20:         $M_{\varphi_{pM}} \leftarrow N_{sel(n)}^{m} \wedge C_p(N_{sel(n)}^{m}) = true$
21:     end
22:     If $N_{sel(n-1)}^{m} \supset N_{sel(n)}^{m}$ then
23:         $n = n - 1$
24:     end
25:     If $N_{sel(n)}^{m-1} = pre(N_{sel(n)}^{m})$ then
26:         $m = m - 1$
27:     end
28: end
29: For each $z \geq 1$ do
30:     If $N_{sel(n)}^{m} \not\subset N_{man\cdot(z)}$ then
31:         $j(N_{man\cdot(z)}) = yes$
32:         $M_{\varphi_{pM}} \leftarrow N_{man\cdot(z)} \wedge C_p(N_{man\cdot(z)}) = true$
33:     else
34:         $N_{man\cdot(z)} \rightarrow N_{sel(n)}^{m}$ and return to line 10
35:     end
36: end
37: Return $M_{\varphi_{pM}}$
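The classification rule at the core of the reverse search can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: it assumes that every substructure has already been summarized by the number of enabled activities on each of its branches, and it leaves out the nesting levels and the jump logic of Algorithm 3. The `Substructure` record, the function names and the branch counts are hypothetical; the counts loosely mirror the values discussed for Figure 8, with 0 standing for a branch that contains only a hidden transition.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Substructure:
    """Hypothetical summary of one substructure of the process model."""
    name: str
    branch_counts: List[int]  # number of enabled activities on each branch


def is_perceptible(sub: Substructure, limit: int = 2) -> bool:
    """Rule assumed from the text: the number of enabled activities is a
    fixed value, or it varies but its maximum does not exceed `limit`."""
    fixed_value = len(set(sub.branch_counts)) == 1
    return fixed_value or max(sub.branch_counts) <= limit


def reverse_search(model: List[Substructure]) -> List[str]:
    """Scan the substructures back to front and keep the perceptible ones."""
    return [sub.name for sub in reversed(model) if is_perceptible(sub)]


# Substructures listed from the initial node to the end node (illustrative).
model = [
    Substructure("man_z", [4]),       # constant 4
    Substructure("sel_m-3", [1, 1]),  # constant 1
    Substructure("sel_m-2", [1, 2]),  # varies, maximum 2
    Substructure("sel_m-1", [0, 3]),  # hidden-transition branch vs. 3 activities
    Substructure("sel_m", [2, 2]),    # constant 2
]
perceptible = reverse_search(model)
print(perceptible)
```

Only the substructure whose count varies and reaches 3 is rejected by this rule; how nested substructures are merged before the rule is applied is left to Algorithm 3.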

4. Perceptible Search for Optimal Alignment

Each trace is aligned with all firing sequences by the optimal alignment, and the alignment that satisfies the least deviation cost is selected. The brute force search will produce a huge amount of computation for the complex business process. It is proposed that the non-optimality of alignment is perceived based on the location of the initial deviation, thereby reducing some unnecessary comparisons.

4.1. Perceptible Search Algorithm

The optimal alignments of the event log can be regarded as a sum of the optimal alignment of each trace since the event log contains different traces. The search for the optimal alignment is denoted as $Search(Align_{op}(L, M)) = Search\left(\sum_{o=1}^{|O|}\sum_{op=1}^{|OP|} Align(o, \delta_{op})\right)$. The default location of the initial deviation of each trace is recorded as 0, denoted as $v(loc_{id}(Align(o, M))) = 0$. According to the perceptibility condition $C_p$, an alignment cannot be judged to be non-optimal when $v(loc_{id}(Align(o, \delta_j))) \geq v(loc_{id}(Align(o, M)))$. Therefore, the final cost of $Align(o, \delta_j)$ needs to be obtained and evaluated together with the other alignments of $o$, namely, $(Align(o, M) \leftarrow Align(o, \delta_j)) \rightarrow h_{op}(Align(o, M))$ ($h_{op}$ is the evaluation function of the minimum cost). In this case, $v(loc_{id}(Align(o, \delta_j)))$ is assigned to the recorded value $v(loc_{id}(Align(o, M)))$, and acts as the perceptible standard for the next alignment $Align(o, \delta_{j+1})$. The alignment cannot be the optimal alignment of $o$ when $v(loc_{id}(Align(o, \delta_j))) < v(loc_{id}(Align(o, M)))$. Therefore, a complete comparison of $Align(o, \delta_j)$ is not needed, and $v(loc_{id}(Align(o, M)))$ remains unchanged. It is worth noting that the perceptibility condition is feasible only when the initial deviation occurs in the perceptible region. The overall process of the perceptible algorithm is described in Figure S1.
The above search for the optimal alignment can be denoted as $Align_{op}(L, M) \leftarrow PS\left(\sum_{o=1}^{|O|}\sum_{j=1}^{|J|} Align(o, \delta_j)\right)$. The search can jump directly to the next trace $o+1$ when $o$ is consistent with $\delta_j$, and this alignment is regarded as the optimal alignment of $o$. The concrete procedures and the proof of the perceptible search ($PS$) are described in the Supplementary Materials.
The process model $M$ with the perceptible location is described in Figure 9, and the non-perceptible region is not contained in $M$, namely, $Ide(\varphi_{\neg pM}) = \emptyset$. All the firing sequences of $M$ are recorded in Table 4.
The event log is set to $L = \{(a,b,c,d,e,f,h,i,l), (a,b,c,d,e,f,j,k,l)\}$. $L$ is aligned with $M$, and the search process of the optimal alignment is depicted in Figure 10. Figure 10a,b shows the search process for traces 1 and 2, respectively. From Figure 10a, the location of the initial deviation in the first and the third alignment is greater than or equal to the previously recorded value. These two alignments need to be evaluated, that is, $v(loc_{id}(Align(1, \delta_{1,3}))) \geq v(loc_{id}(Align(1, M))) \rightarrow W_{compare}(Align(1, \delta_{1,3}))$. The location of the initial deviation in the other alignments is smaller than the previously recorded value. These alignments are directly judged to be non-optimal, denoted as $v(loc_{id}(Align(1, \delta \setminus \delta_{1,3}))) < v(loc_{id}(Align(1, M))) \rightarrow P_{compare}(Align(1, \delta \setminus \delta_{1,3})) \wedge Align(1, \delta \setminus \delta_{1,3}) \notin Align_{op}$ ($P_{compare}$ is the function of partial comparison). For example, $v(loc_{id}(Align(1, \delta_2))) = 6 < 7$ and $Align(1, \delta_2) = P_{compare}(Align(1, \delta_2))$. The comparison of $Align(1, \delta_2)$ is terminated when the initial deviation occurs. From Figure 10b, the first alignment has no deviation, so it is directly judged as the optimal alignment of trace 2.
Algorithm 4 describes the perceptible search for the optimal alignment. The set of optimal alignments is initialized to the empty set (line 1). The recorded location of the initial deviation is set to 0 (line 4). If the location of the initial deviation is greater than or equal to the previously recorded value in the perceptible region, the alignment needs to be completely compared and the location of the initial deviation is assigned to the recorded value (lines 7–9). The recorded value is maintained and the comparison of the alignment is terminated when the location of the initial deviation is smaller than the previously recorded value (lines 10–13). The recorded value is maintained and the alignment is completely compared if the initial deviation of the alignment occurs in a non-perceptible region (lines 15–17). The alignment is regarded as the optimal alignment of a trace if no deviation is found between the trace and the firing sequence (lines 18–21). All alignments that need complete comparison are assigned to the alignment set $Align$ (line 22). The optimal alignment set $Align_{op}$ is evaluated by $h_{op}$ (line 25). The optimal alignment set between the event log and the process model is output as the result (line 28).
Algorithm 4 Search the optimal alignment based on the location of initial deviation
Input:
event log $L$, process model $M$, trace $o$, firing sequence $\delta_j$, the value of the initial deviation location $v(loc_{id}(o, M))$, the perceptible range $\varphi_{pM}$, the non-perceptible range $\varphi_{\neg pM}$, the evaluation function of the optimal alignment $h_{op}$, the whole-comparison function of an alignment $W_{compare}$, the partial-comparison function of an alignment $P_{compare}$, the set of optimal alignments $Align_{op}$.
Output: the set of optimal alignments $Align_{op}$.
1: $Align_{op} = \emptyset$
2: $o = 1$
3: For each $o \in O$ do
4:     $v(loc_{id}(o, M)) = 0$
5:     $j = 1$
6:     For each $j \in J$ do
7:         If $cost(Align(o, \delta_j)) > 0 \wedge loc_{id}(Align(o, \delta_j)) \in N_{man/sel} \wedge N_{man/sel} \subseteq \varphi_{pM}$ then
8:             If $v(loc_{id}(Align(o, \delta_j))) \geq v(loc_{id}(Align(o, M))) = 0$ then
9:                 $W_{compare}(Align(o, \delta_j)) \wedge v(loc_{id}(Align(o, M))) \leftarrow v(loc_{id}(Align(o, \delta_j)))$
10:             else
11:                 $|P_{compare}(Align(o, \delta_j))| = v(loc_{id}(Align(o, \delta_j)))$
12:                 $P_{compare}(Align(o, \delta_j)) \wedge v(loc_{id}(Align(o, M))) \leftarrow v(loc_{id}(Align(o, M)))$, jump to line 24
13:             end
14:         end
15:         If $cost(Align(o, \delta_j)) > 0 \wedge loc_{id}(Align(o, \delta_j)) \in N_{man/sel} \wedge N_{man/sel} \subseteq \varphi_{\neg pM}$ then
16:             $W_{compare}(Align(o, \delta_j)) \wedge v(loc_{id}(Align(o, M))) \leftarrow v(loc_{id}(Align(o, M)))$
17:         end
18:         If $cost(Align(o, \delta_j)) = 0$ then
19:             $Align_{op} \leftarrow Align(o, \delta_j)$
20:             jump to line 26
21:         end
22:         $Align \leftarrow W_{compare}(Align(o, \delta_j))$
23:         $j = j + 1$
24:     end
25:     $Align_{op} \leftarrow h_{op}(Align)$
26:     $o = o + 1$
27: end
28: return $Align_{op}$
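The control flow of Algorithm 4 for a single trace can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the OPS-Align plug-in: the whole model is assumed to lie in the perceptible region (so the branch for non-perceptible regions is omitted), traces and firing sequences are plain strings with one character per activity, and the complete comparison is replaced by a unit-cost count of log-only and model-only moves derived from the longest common subsequence. All function names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def first_deviation(trace: Sequence[str], seq: Sequence[str]) -> Optional[int]:
    """1-based position of the first mismatch, or None if both are identical."""
    for pos, (a, b) in enumerate(zip(trace, seq), start=1):
        if a != b:
            return pos
    if len(trace) != len(seq):
        return min(len(trace), len(seq)) + 1
    return None


def unit_cost(trace: Sequence[str], seq: Sequence[str]) -> int:
    """Assumed stand-in for the complete comparison: number of unmatched
    activities on both sides, computed from the LCS length."""
    prev = [0] * (len(seq) + 1)
    for a in trace:
        cur = [0]
        for j, b in enumerate(seq, start=1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return len(trace) + len(seq) - 2 * prev[-1]


def perceptible_search(trace: str, firing_sequences: List[str]) -> Tuple[str, int, int]:
    """Return (best firing sequence, its cost, number of terminated alignments)."""
    if not firing_sequences:
        raise ValueError("at least one firing sequence is required")
    recorded = 0                        # recorded location of the initial deviation
    candidates: List[Tuple[int, str]] = []
    skipped = 0
    for seq in firing_sequences:
        loc = first_deviation(trace, seq)
        if loc is None:                 # perfect fit: optimal, stop searching
            return seq, 0, skipped
        if loc >= recorded:             # complete comparison, update the record
            candidates.append((unit_cost(trace, seq), seq))
            recorded = loc
        else:                           # terminated at the initial deviation
            skipped += 1
    cost, best = min(candidates)
    return best, cost, skipped


result = perceptible_search("abcxef", ["abdef", "abcdef", "acdef"])
print(result)
```

In this toy run the third firing sequence deviates at position 2, earlier than the recorded value 4, so it is discarded without a complete comparison; whether such pruning preserves optimality depends on the perceptibility condition holding for the model at hand.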

4.2. Example of Industrial Application

The behavior recorded in the initial process is often different from that described in the current information system due to the larger flow and complicated properties of coal slime water. The degradation performance of anionic polyacrylamide (HPAM) may be affected by different operational conditions. Therefore, the biological coal-washing process should be regarded as a business process that executes a specific goal. The deviation elements are detected in the actual biological coal-washing process, providing accurate technical support for the subsequent optimization. Figure 11 and Figure 12 describe the current process and initial model in the biological coal-washing using Petri net language, respectively. Table 5 lists annotations of the letter labels in Figure 11 and Figure 12.
As shown in Figure 11, let trace $o = (ABCDEGKNPQSTUVPQRW)$ be recorded in $L$; $o$ is then compared with $M$ in Figure 12. The two firing sequences $\delta_j = (ABCDEPQUTW)$ and $\delta_j' = (ABCDFPQTUW)$ are extracted for constructing the two alignments $Align(o, \delta_j)$ and $Align(o, \delta_j')$, respectively. According to the judgment of the perceptible region in Section 3.2, $M \subseteq \varphi_{pM}$ is known. $Align(o, \delta_j')$ can be determined to be non-optimal as soon as its first deviation $Esert(l(e_5)) = E$ occurs, because $v(loc_{id}(Align(o, \delta_j))) = 6 > v(loc_{id}(Align(o, \delta_j'))) = 5$. According to Theorem S1 in the Supplementary Materials, there exists $\delta_j'' = (ABCDEPQTUW)$ that makes $Align(o, \delta_j'')$ superior to $Align(o, \delta_j)$ and $Align(o, \delta_j')$. Therefore, the efficiency of deviation detection in actual operation is effectively improved.
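The two locations quoted in this example can be checked with a few lines of Python. The sketch assumes that the location of the initial deviation is the 1-based position of the first activity at which the trace and the firing sequence differ; the function name and the variable names `delta_first` and `delta_second` (the two extracted firing sequences, in the order given in the text) are hypothetical.

```python
def initial_deviation_location(trace, firing_sequence):
    """1-based position of the first differing activity, or None if identical."""
    for pos, (a, b) in enumerate(zip(trace, firing_sequence), start=1):
        if a != b:
            return pos
    if len(trace) != len(firing_sequence):
        return min(len(trace), len(firing_sequence)) + 1
    return None


trace = "ABCDEGKNPQSTUVPQRW"
delta_first = "ABCDEPQUTW"
delta_second = "ABCDFPQTUW"

loc_first = initial_deviation_location(trace, delta_first)    # G vs. P -> 6
loc_second = initial_deviation_location(trace, delta_second)  # E vs. F -> 5
print(loc_first, loc_second)
```

Because the second location (5) is smaller than the first (6), the second alignment is the one whose comparison can be terminated early once the first has been recorded.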

5. Evaluation

The experiments were performed on a 64-bit Windows 10 computer with an Intel(R) Core(TM) i5 CPU at 2.11 GHz, 8 GB of memory and Python 3.7. The A* algorithm needed to make a complete comparison of all alignments between the event log and the process model, which was denoted as A*-Align. OIA-Align jumped directly to the alignment of the next trace if and only if it found an alignment that fit perfectly. The deviation perception proposed in this paper was denoted as IDP-Align, and its scalability and effectiveness were verified by comparing the performance of these three methods.

5.1. The Operation Interface of OPS-Align Plug-In

A*-Align and OIA-Align could be realized through the source code in the ProM framework, but IDP-Align could not. To ensure the fairness of the experimental results, we wrote an OPS-Align plug-in in Python to evaluate the search workload of the optimal alignment.
Figure 13 describes the OPS-Align plug-in interface. Input and output information was located on the left of the interface, and parts of the program code were located on the right. As shown in Figure 13, $value(O)$ was the total number of trace cases, $value(J)$ was the total number of firing sequences, $PR$ was the perceptible range, $SMethod$ was the current method, $ELEx$ was the event log used in the experiment, $MEx$ was the process model used in the experiment and $STime$ was the search time.

5.2. Scalability

In order to verify the improvement across various business processes, A*-Align and IDP-Align were evaluated on data sets with different network structures.

5.2.1. Data Sets of Artificial and Real-Life Business Process

Four artificial and two real-life data sets were used in the experiment. Each data set consisted of ten different event logs and a corresponding initial model. Based on the initial model, these event logs were generated by introducing redundant activities, missing activities, dislocation between activities or cyclic substructures.
  • Artificial data set.
Four artificial data sets were generated from two different artificial business processes. These two business processes were considered the initial models $M_{a1}$ and $M_{a2}$, respectively. The non-perceptible range was not set in $M_{a1}$ and $M_{a2}$. Each artificial data set contained the selective substructure or the cyclic substructure.
  • Real-life data set.
As shown in Figure 14, the auto-claim process mined by the ProM framework was regarded as the initial model $M_r$ in the two real-life data sets. The activities M, O, R and T were removed so that $M_r$ contained no non-perceptible region. The selective substructure and the cyclic substructure were contained in each real-life data set, so these data sets had a universal network structure.

5.2.2. Experimental Results on Different Network Structures

  • Artificial data sets with selective substructures.
Figure 15 describes the performance of A*-Align and IDP-Align on two data sets containing a selective substructure. The X and Y axes represent the number of aligned traces and the search time of the optimal alignment, respectively. Each trace had to be compared with all firing sequences, so the number of aligned traces refers to the total number of traces involved in the alignment process, denoted as $|Trace_{align}| = \sum_{o=1}^{|O|} o \times n \times num(J)$ ($n$ represents the number of repetitions of $o$, and $num(J)$ is the total number of firing sequences). Each event log had 200 aligned traces, and ten data sets were analyzed, as shown in Figure 15a,b. Each data set was constructed of ten event logs and a corresponding initial model, that is, $D_{a1}(L_{a(1,\ldots,10)}, M_{a1})$ and $D_{a2}(L_{a(1,\ldots,10)}, M_{a1})$. The average search time of ten data sets was taken to ensure the authenticity of the experimental result. The business processes in Figure 15a,b contained 11 and 23 activities, respectively. The length intervals of traces in the two data sets were 6 to 8 and 14 to 20, respectively. The lengths of firing sequences in the two data sets were 7 and 16, respectively. Compared to A*-Align, the search time using IDP-Align was reduced by 207 ms (in Figure 15a) and by 1302 ms (in Figure 15b). From Figure 15, we found that the larger the data set, the more activities could be automatically eliminated in perception, and thus the greater the difference in search time.
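The count of aligned traces can be illustrated with a short Python sketch. It assumes one reading of the formula above: each distinct trace contributes its number of repetitions multiplied by the number of firing sequences. The function name and the sample numbers are hypothetical and are not taken from the experimental logs.

```python
def aligned_trace_count(repetitions, num_firing_sequences):
    """Sum, over the distinct traces, of repetitions x number of firing sequences."""
    return sum(n * num_firing_sequences for n in repetitions)


# Hypothetical log: two distinct traces repeated 10 and 15 times,
# aligned against a model with 8 firing sequences.
count = aligned_trace_count([10, 15], 8)
print(count)
```

Under this reading the workload of a complete comparison grows linearly with both the log size and the number of firing sequences, which is why early termination matters more for larger data sets.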
  • Artificial data sets with different cyclic substructures.
Each event log had 300 aligned traces, and ten data sets were analyzed, as shown in Figure 16a,b. Each data set was constructed of ten event logs and a corresponding initial model, that is, $D_{a3}(L_{a(1,\ldots,10)}, M_{a2})$ and $D_{a4}(L_{a(1,\ldots,10)}, M_{a2})$. The average search time of ten data sets was taken to ensure the authenticity of the experimental result. The business processes in Figure 16a,b contained 10 and 11 activities, respectively. The maximum length of the trace and firing sequence could not be determined due to the cyclic substructure in each data set. The cyclic substructures in Figure 16a,b were the self-loop and the loop of three activities, respectively. Compared to A*-Align, the search time using IDP-Align was reduced by 2700 ms (in Figure 16a) and by 7908 ms (in Figure 16b). The size of the alignment comparison was enlarged by the cyclic substructure, so more elements were automatically eliminated in perception (in Figure 16b). Thus, the degree of improvement in search time over A*-Align depended on the size of the data set.
  • Real-life data sets with universal network structures.
The real-life data sets containing both selective substructures and cyclic substructures were evaluated. The cyclic substructures contained in Figure 17a,b were the self-loop and normal loop, respectively. Each event log had 1020 and 780 aligned traces, and ten data sets were analyzed as shown in Figure 17a,b, respectively. Each data set was constructed of ten event logs and a corresponding initial model $M_r$, that is, $D_{r1}(L_{r(1,\ldots,10)}, M_r)$ and $D_{r2}(L_{r(1,\ldots,10)}, M_r)$. The average search time of ten data sets was taken to ensure the authenticity of the experimental result. Compared to A*-Align, the search time using IDP-Align was reduced by 37,752 ms (in Figure 17a) and by 24,600 ms (in Figure 17b). This was due to the size of the data set and the cyclic substructure.
To sum up, the search time was significantly shortened by IDP-Align when applied to large data sets with various network structures. Figure 18 shows the experimental results of eight different data sets. Their parameters were search time, alignment proportion of partial comparisons and the number of aligned traces. The alignment proportion of partial comparisons was the ratio of automatically eliminated non-optimal alignments. Within the same data set, the larger the alignment proportion of partial comparisons, the more obvious the reduction in search time relative to A*-Align. As shown in Figure 18, the search time for data sets with various network structures could be effectively reduced using IDP-Align compared with A*-Align.

5.2.3. Experimental Results on Biological Coal-Washing Data Set

The biological coal-washing process was simulated as event logs, that is, $L_{Bio}$ without loop structure and $L_{Bio}$ with loop structure, respectively. The initial model is constructed from a simple process, $M_{Bio}$ ($M_{Bio} \subseteq \varphi_{pM}$) and $M_{Bio}$ ($M_{Bio} \subseteq \varphi_{pM}$), consisting of basic steps, respectively. The points of each line represent the search time and the proportion of time difference in data sets of different sizes (see Figure 19). As can be seen from Figure 19, both the search time and the proportion of time difference increase with the size of the data set.
From Figure 19a, $D_{Bio1}$ is composed of $L_{Bio}$ and $M_{Bio}$, while $D_{Bio2}$ is composed of $L_{Bio}$ and $M_{Bio}$. A*-Align and IDP-Align are executed on these two data sets. The substructures in $D_{Bio1}$ were causal, concurrent and selective, that is, $N_{cau}$, $N_{conc}$ and $N_{sel(n)}^{m}$. However, $D_{Bio2}$ has more cyclic substructures than $D_{Bio1}$. As shown in Figure 19a, the search time of IDP-Align in $D_{Bio1}$ and $D_{Bio2}$ was reduced by 47,648 ms and 65,768 ms compared with that of A*-Align, respectively. From Figure 19b, $D_{Bio1}$ is composed of $L_{Bio}$ and $M_{Bio}$, while $D_{Bio2}$ is composed of $L_{Bio}$ and $M_{Bio}$. A*-Align and IDP-Align are executed on these two data sets. As shown in Figure 19b, the search time of IDP-Align in $D_{Bio1}$ and $D_{Bio2}$ was reduced by 55,226 ms and 40,232 ms compared with that of A*-Align, respectively. The substructures in $D_{Bio1}$ were causal, concurrent and selective, that is, $N_{cau}$, $N_{conc}$ and $N_{sel(n)}^{m}$. However, $D_{Bio2}$ has more cyclic substructures than $D_{Bio1}$.

5.3. Effectiveness

A*-Align, OIA-Align and IDP-Align were performed on two BPIC2020 data sets to verify the improvement of search efficiency on the basis of guaranteeing optimality. At the same time, the experiment took different noise levels into account.

5.3.1. BPIC2020 Data Sets

The data sets used came from BPIC2020. Table 6 lists the specific information of the two event logs in BPIC2020. Their cases and event types represent the number of different traces and event elements, respectively. Domestic Declarations in BPIC2020 were used as the event log of the experiment. Figure 20a,b shows two process models $M_{B1}$ and $M_{B2}$ with inconsistent behavior from Domestic Declarations, respectively. The non-perceivable region was not identified in $M_{B1}$ and $M_{B2}$.

5.3.2. Experimental Result

The data set $D_{B1}$ was constructed from Domestic Declarations and the given process model, as shown in Figure 20a. Then, 20% and 40% noise were added to the initial event log to produce two new data sets, denoted as $D_{B1}+20\%$ and $D_{B1}+40\%$. A*-Align, OIA-Align and IDP-Align were used to search the optimal alignment set (in Figure 21). Cost and search time were evaluated, and the proportion of fitted alignments in $D_{B1}$, $D_{B1}+20\%$ and $D_{B1}+40\%$ was extremely low. Thus, the search time was shortened only by 0.34%, 0.28% and 0.25% using OIA-Align compared to A*-Align. A certain proportion of alignments in $D_{B1}$, $D_{B1}+20\%$ and $D_{B1}+40\%$ were partially compared. Thus, the search time was reduced using IDP-Align by 40%, 34% and 29% compared with A*-Align. Within the same data set, the greater the noise, the smaller the reduction in search time. Fewer elements were automatically eliminated because the initial deviation of some noisy alignments occurred at the end. The difference in search time between OIA-Align and A*-Align in $D_{B1}$, $D_{B1}+20\%$ and $D_{B1}+40\%$ is depicted in Figure 21 (partially enlarged).
The data set $D_{B2}$ was constructed from Domestic Declarations and the given process model, as shown in Figure 20b. Then, 20% and 40% noise were added to the initial event log to produce two new data sets, denoted as $D_{B2}+20\%$ and $D_{B2}+40\%$. The search time was shortened by 2.82%, 2.34% and 2% using OIA-Align compared to A*-Align. It could be seen that the ratio of fitted alignments in Figure 22 was larger than that in Figure 21. The search time was reduced using IDP-Align by 20%, 18% and 14% compared with A*-Align. The result showed that the improvement in Figure 22 was lower than that in Figure 21 due to the size of the data sets.
Table 7 records the time and cost obtained using A*-Align, OIA-Align and IDP-Align. From Table 7, IDP-Align had the shortest search time of the three methods on each data set.
Table 8 lists all the numerical time differences in different data sets. From Table 8, we can find that the difference between A*-Align and OIA-Align was much smaller than the others. At the same time, the time differences lead to the following two observations: (1) the time differences were basically proportional to the size of the data set; (2) noise addition has little effect on the time savings of IDP-Align.

5.4. Performance Comparison

The efficiency improvement of IDP-Align over the existing A*-Align and OIA-Align was verified by the above experiments. Moreover, the search time was effectively shortened on the premise of ensuring the minimum cost. However, the overall performance of conformance checking should be evaluated by taking cost, time, perspective, form and application range into account. Therefore, these features are discussed in Table 9.

6. Conclusions

The effective search for the optimal alignment is introduced in this paper, aiming at unweighted business processes with uniform unit cost. The search for the optimal alignment is efficiently improved compared with the traditional brute force search when dealing with a complex business process. The perceptible search mainly includes the following three steps: (i) According to the behavior characteristics of different substructures in the process model, the perceptible region is determined by reverse search. The location of the initial deviation is inversely proportional to the number of deviations when the initial deviation occurs in the perceptible region. (ii) The recorded location of the initial deviation is firstly set to 0. The recorded location is updated when the location of the current initial deviation is greater than the previously recorded value. At the same time, its recorded location is used as the optimal metric of the same trace. (iii) The recorded location is unchanged and the comparison of alignment is automatically terminated as non-optimal when the location of the current initial deviation is smaller than the previously recorded value. In summary, the search workload of the optimal alignment can be effectively reduced now that some alignments are only partially compared based on the perception of the initial deviation.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/electronics13234669/s1.

Author Contributions

L.Z.: Writing—original draft, Validation, Methodology, Investigation, Formal analysis, Data curation, Project administration. F.W.: Conceptualization, Methodology, Validation, Investigation, Data curation, Formal analysis, Writing—original draft, Writing—review and editing. Z.S.: Investigation, Supervision, Project administration, Methodology. K.H.: Investigation, Validation. Y.H.: Validation, Data curation, Formal analysis. G.Z.: Data curation, Formal analysis. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (no. ICT2024B58), the Natural Science Research Project of Anhui Higher Education Institution (no. 2024AH051732 and 2024AH051735) and the High-Level Talent Fund Project of Huainan Normal University (no. 621222-BSKYQDJ).

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

  1. Li, C.; Ge, J.; Huang, L.; Hu, H.; Wu, B.; Yang, H.; Hu, H.; Luo, B. Process mining with token carried data. Inf. Sci. 2016, 328, 558–576. [Google Scholar] [CrossRef]
  2. Caldeira, J.; Abreu, F.B.e. Software development process mining: Discovery, conformance checking and enhancement. In Proceedings of the 2016 10th International Conference on the Quality of Information and Communications Technology (QUATIC), Lisbon, Portugal, 6–9 September 2016; pp. 254–259. [Google Scholar]
  3. Van der Aalst, W.; Weijters, T.; Maruster, L. Workflow mining: Discovering process models from event logs. IEEE Trans. Knowl. Data Eng. 2004, 16, 1128–1142. [Google Scholar] [CrossRef]
  4. Weidlich, M.; Mendling, J. Perceived consistency between process models. Inf. Syst. 2012, 37, 80–98. [Google Scholar] [CrossRef]
  5. Alizadeh, M.; Lu, X.; Fahland, D.; Zannone, N.; van der Aalst, W.M. Linking data and process perspectives for conformance analysis. Comput. Secur. 2018, 73, 172–193. [Google Scholar] [CrossRef]
  6. van Zelst, S.J.; van Dongen, B.F.; van der Aalst, W.M. Event stream-based process discovery using abstract representations. Knowl. Inf. Syst. 2018, 54, 407–435.
  7. Pourmasoumi, A.; Kahani, M.; Bagheri, E. Mining variable fragments from process event logs. Inf. Syst. Front. 2017, 19, 1423–1443.
  8. Buijs, J.C.; La Rosa, M.; Reijers, H.A.; van Dongen, B.F.; van der Aalst, W.M. Improving business process models using observed behavior. In Proceedings of the 2nd International Symposium on Data-Driven Process Discovery and Analysis (SIMPDA), Campione d’Italia, Italy, 18–20 June 2012; pp. 44–59.
  9. Qi, H.; Du, Y.; Qi, L.; Wang, L. An approach to repair Petri net-based process models with choice structures. Enterp. Inf. Syst. 2018, 12, 1149–1179.
  10. Zhang, L.; Fang, X.; Shao, C.; Wang, L. Real-time repair of business processes based on alternative operations in case of uncertainty. IEEE Access 2021, 9, 23672–23690.
  11. Kleiner, N. Delta analysis with workflow logs: Aligning business process prescriptions and their reality. Requir. Eng. 2005, 10, 212–222.
  12. García-Bañuelos, L.; van Beest, N.R.; Dumas, M.; La Rosa, M.; Mertens, W. Complete and interpretable conformance checking of business processes. IEEE Trans. Softw. Eng. 2017, 44, 262–290.
  13. van der Aalst, W.; Adriansyah, A.; van Dongen, B. Replaying history on process models for conformance checking and performance analysis. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2012, 2, 182–192.
  14. Fang, X.; Cao, R.; Liu, X.; Wang, L. A method of mining hidden transition of business process based on region. IEEE Access 2018, 6, 25543–25550.
  15. Armas-Cervantes, A.; Baldan, P.; Dumas, M.; García-Bañuelos, L. Diagnosing behavioral differences between business process models: An approach based on event structures. Inf. Syst. 2016, 56, 304–325.
  16. Fahland, D.; van der Aalst, W.M. Model repair-aligning process models to reality. Inf. Syst. 2015, 47, 220–243.
  17. Bauer, M.; van der Aa, H.; Weidlich, M. Estimating process conformance by trace sampling and result approximation. In Proceedings of the Business Process Management: 17th International Conference, BPM 2019, Proceedings 17, Vienna, Austria, 1–6 September 2019; Springer International Publishing: Cham, Switzerland, 2019.
  18. De Leoni, M.; Maggi, F.M.; van der Aalst, W.M. Aligning event logs and declarative process models for conformance checking. In Proceedings of the Business Process Management: 10th International Conference, BPM 2012, Proceedings 10, Tallinn, Estonia, 3–6 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 82–97.
  19. Adriansyah, A.; van Dongen, B.F.; van der Aalst, W.M. Conformance checking using cost-based fitness analysis. In Proceedings of the 2011 IEEE 15th International Enterprise Distributed Object Computing Conference, Helsinki, Finland, 29 August–2 September 2011; pp. 55–64.
  20. Leemans, S.J.; Fahland, D.; van der Aalst, W.M. Discovering block-structured process models from event logs-a constructive approach. In Proceedings of the Application and Theory of Petri Nets and Concurrency: 34th International Conference, PETRI NETS 2013, Proceedings 34, Milan, Italy, 24–28 June 2013; Springer: Berlin/Heidelberg, Germany, 2013; pp. 311–329.
  21. Buijs, J.C.; van Dongen, B.F.; van der Aalst, W.M. On the role of fitness, precision, generalization and simplicity in process discovery. In Proceedings of the On the Move to Meaningful Internet Systems: OTM 2012: Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2012, Proceedings, Part I, Rome, Italy, 10–14 September 2012; Springer: Berlin/Heidelberg, Germany, 2012; pp. 305–322.
  22. de Leoni, M.; Maggi, F.M.; van der Aalst, W.M. An alignment-based framework to check the conformance of declarative process models and to preprocess event-log data. Inf. Syst. 2015, 47, 258–277.
  23. Jagadeesh Chandra Bose, R.P.; van der Aalst, W. Trace alignment in process mining: Opportunities for process diagnostics. In Proceedings of the International Conference on Business Process Management, Hoboken, NJ, USA, 13–16 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 227–242.
  24. Weidlich, M.; Polyvyanyy, A.; Desai, N.; Mendling, J. Process compliance measurement based on behavioural profiles. In Proceedings of the Advanced Information Systems Engineering: 22nd International Conference, CAiSE 2010, Proceedings 22, Hammamet, Tunisia, 7–9 June 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 499–514.
  25. Bogdanov, E.; Cohen, I.; Gal, A. Conformance checking over stochastically known logs. In Proceedings of the International Conference on Business Process Management, Münster, Germany, 11–16 September 2022; Springer International Publishing: Cham, Switzerland, 2022; pp. 105–119.
  26. Pegoraro, M.; Uysal, M.S.; van der Aalst, W.M. Conformance checking over uncertain event data. Inf. Syst. 2021, 102, 101810.
  27. Calheno, R.; Carvalho, P.; Lima, S.R.; Henriques, P.R.; Merino, M.R. Improving conformance checking in process modelling: A multiperspective algorithm. J. Supercomput. 2023, 79, 18256–18292.
  28. Felli, P.; Gianola, A.; Montali, M.; Rivkin, A.; Winkler, S. Multi-perspective conformance checking of uncertain process traces: An SMT-based approach. Eng. Appl. Artif. Intell. 2023, 126, 106895.
  29. Leemans, S.J.; van der Aalst, W.M.; Brockhoff, T.; Polyvyanyy, A. Stochastic process mining: Earth movers’ stochastic conformance. Inf. Syst. 2021, 102, 101724.
  30. Wang, L.; Du, Y.; Qi, M.; Qi, H.; He, Z. Petri net-based deviation detection between a process model with loop semantics and event logs. Concurr. Comput. Pract. Exp. 2018, 30, e4419.
  31. van Zelst, S.J.; Bolt, A.; Hassani, M.; van Dongen, B.F.; van der Aalst, W.M.P. Online conformance checking: Relating event streams to process models using prefix-alignments. Int. J. Data Sci. Anal. 2019, 8, 269–284.
  32. Lee, W.L.J.; Burattin, A.; Munoz-Gama, J.; Sepúlveda, M. Orientation and conformance: A HMM-based approach to online conformance checking. Inf. Syst. 2021, 102, 101674.
  33. Adriansyah, A.; van Dongen, B.F.; van der Aalst, W.M. Cost-based conformance checking using the A* Algorithm. BPM Cent. Rep. BPM-11-11 BPMcenter.Org 2011, 1111, 1–14.
  34. Adriansyah, A.; van Dongen, B.F.; van der Aalst, W.M. Towards robust conformance checking. In Proceedings of the Business Process Management Workshops: BPM 2010 International Workshops and Education Track, Revised Selected Papers 8, Hoboken, NJ, USA, 13–15 September 2010; Springer: Berlin/Heidelberg, Germany, 2010; pp. 122–133.
  35. De Weerdt, J.; De Backer, M.; Vanthienen, J.; Baesens, B. A robust F-measure for evaluating discovered process models. In Proceedings of the 2011 IEEE Symposium on Computational Intelligence and Data Mining (CIDM), Paris, France, 11–15 April 2011; pp. 148–155.
  36. Bloemen, V.; van Zelst, S.J.; van der Aalst, W.M.; van Dongen, B.F.; van de Pol, J. Maximizing synchronization for aligning observed and modelled behaviour. In Proceedings of the Business Process Management: 16th International Conference, BPM 2018, Proceedings 16, Sydney, NSW, Australia, 9–14 September 2018; Springer International Publishing: Cham, Switzerland, 2018; pp. 233–249.
  37. Reißner, D.; Armas-Cervantes, A.; Conforti, R.; Dumas, M.; Fahland, D.; La Rosa, M. Scalable alignment of process models and event logs: An approach based on automata and s-components. Inf. Syst. 2020, 94, 101561.
  38. Lee, W.L.; Verbeek, H.M.; Munoz-Gama, J.; van der Aalst, W.M.; Sepúlveda, M. Recomposing conformance: Closing the circle on decomposed alignment-based conformance checking in process mining. Inf. Sci. 2018, 466, 55–91.
  39. Song, W.; Xia, X.; Jacobsen, H.-A.; Zhang, P.; Hu, H. Efficient alignment between event logs and process models. IEEE Trans. Serv. Comput. 2016, 10, 136–149.
  40. Dumas, M.; García-Bañuelos, L. Process mining reloaded: Event structures as a unified representation of process models and event logs. In Proceedings of the Application and Theory of Petri Nets and Concurrency: 36th International Conference, PETRI NETS 2015, Proceedings 36, Brussels, Belgium, 21–26 June 2015; Springer International Publishing: Cham, Switzerland, 2015; pp. 33–48.
  41. Zhong, C.; Zhang, H.; Huang, H.; Chen, Z.; Li, C.; Liu, X.; Li, S. DOMICO: Checking conformance between domain models and implementations. Softw. Pract. Exp. 2024, 54, 595–616.
  42. Zhang, L.; Fang, X. Business process fitness analysis based on alignment processing and deviation detection. Comput. Integr. Manuf. Syst. 2020, 26, 1573–1581.
Figure 1. Search workload for the optimal alignment. Note: the event log is $L = (ACBDERST)$, and the process model is $M = (ACBDERT, ABCDERT, ABCDEFIRKST, ACBDEFIRKST)$.
Figure 2. Discovery of initial deviation.
Figure 3. Division of paths in various substructures.
Figure 4. Substructure $N_{conc}$ of process model $M$.
Figure 5. Example of process model. Note: $N_{sel(1)}^{1}$ is represented by a green circle, and $N_{sel(1)}^{2}$ is marked by a red circle.
Figure 6. Various substructures belonging to the perceptible region in the process model. Selective substructure with the same number of enabled activities (a), selective substructure with the number of enabled activities ≤ 2 (b), and mandatory substructure (c).
Figure 7. The perceptible range of nested substructures. Nested substructure $N_{sel(2)}^{1}$ $N_{sel(1)}^{1}$ (a); nested substructure $N_{sel(1)}^{1}$ $N_{man(sel(1))}^{1}$ (b); nested substructure $N_{man}$ $N_{sel(1)}^{1}$ (c); nested substructure $N_{sel(1)}^{(1/2)/(2/1)}$ $N_{man}$ (d).
Figure 8. The reverse search for perceptible range. Note: numbers are marked at the locations of all activities in the process model, and the activities of the selective substructure that are in the same location can be recorded as the same number.
Figure 9. The process model M without non-perceivable region.
Figure 10. Search for the set of optimal alignments. Note: $v(loc_{id}(1, \delta_1))$ is a simplification of $v(loc_{id}(Align(1, \delta_1)))$; the pink block represents the alignments of complete comparison, the yellow block shows the alignments of partial comparison, and the green block represents the fitting alignment. The search process of optimal alignment between 1 and $M$ (a), and the search process of optimal alignment between 2 and $M$ (b).
Figure 11. Current business process of biological coal-washing process. Note: labels for the actual steps in the transition are replaced by letters.
Figure 12. Initial model of biological coal-washing process.
Figure 13. OPS-Align plug-in interface.
Figure 14. Real-life business process mined by the ProM framework.
Figure 15. The running results on the data sets with selective substructure. Run result of data set with 11 activities (a), and run result of data set with 23 activities (b).
Figure 16. The running results of data sets with cyclic substructures. Run result of data set with 10 activities (a), and run result of data set with 11 activities (b).
Figure 17. Running results on generic data sets. Run result of data set with 1020 aligned traces (a), and run result of data set with 780 aligned traces (b).
Figure 18. Merging results of different data sets.
Figure 19. Evaluations of biological coal-washing data set. The results of $D_{Bio1}$ and $D_{Bio2}$ (a), and the results of $D_{Bio1}$ and $D_{Bio2}$ (b). Note: time indicates the search time for the optimal alignment; the number of traces indicates the number of traces in the event log; the proportion of variance represents the proportion of the current time difference in the total time difference.
Figure 20. The initial process models $M_{B1}$ and $M_{B2}$. Process model $M_{B1}$ (a), and process model $M_{B2}$ (b).
Figure 21. Running results of $D_{B1}$, $D_{B1}$+20% and $D_{B1}$+40%.
Figure 22. Running results of $D_{B2}$, $D_{B2}$+20% and $D_{B2}$+40%.
Table 1. The cost setting of unit movement.

| Cost | Type of Movement | Expression of Alignment |
|---|---|---|
| 0 | $move_{LM}$ | $Align(l(e_1), \lambda(t_1)) = Align(A, A)$ |
| 1 | $\widetilde{move}_{L}$ | $Align(l(e_2), \lambda(t_2)) = Align(B, \gg)$ |
| 1 | $\widetilde{move}_{M}$ | $Align(l(e_3), \lambda(t_3)) = Align(\gg, C)$ |
| 0 | $\widetilde{move}_{L}$ / $\widetilde{move}_{M}$ | $Align(l(e_3), \lambda(t_3)) = Align(\tau, \gg)$ / $Align(\gg, \tau)$ |
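
The unit costs in Table 1 can be illustrated with a minimal Python sketch. It is not the OPS plug-in's implementation: the names `SKIP`, `TAU`, `move_cost`, and `alignment_cost` are hypothetical, and the string `">>"` is only an assumed placeholder for the "no move" side of a pair. A synchronous move and a silent move cost 0, while a move on the log only or on the model only costs 1.

```python
SKIP = ">>"   # assumed placeholder for the "no move" side of a pair
TAU = "tau"   # assumed label for a silent (invisible) step


def move_cost(log_step: str, model_step: str) -> int:
    """Unit cost of one alignment move, following the rows of Table 1."""
    if log_step == SKIP and model_step == SKIP:
        raise ValueError("a move must advance the log, the model, or both")
    if log_step == SKIP or model_step == SKIP:
        # Move on one side only: free when the step is silent, else cost 1.
        step = model_step if log_step == SKIP else log_step
        return 0 if step == TAU else 1
    if log_step != model_step:
        raise ValueError("a synchronous move needs matching labels")
    return 0  # synchronous move on log and model


def alignment_cost(alignment) -> int:
    """Total cost of an alignment given as (log_step, model_step) pairs."""
    return sum(move_cost(log_step, model_step) for log_step, model_step in alignment)


# One move of each kind from Table 1: costs 0 + 1 + 1 + 0.
example = [("A", "A"), ("B", SKIP), (SKIP, "C"), (SKIP, TAU)]
print(alignment_cost(example))  # prints 2
```

Under this cost function, an alignment is optimal when no other alignment of the same trace against the model has a lower total cost.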
Table 2. Initial deviation locations of $\xi_1$ and $\xi_2$. Note: yellow marks the initial deviation; $\xi$ is the identifier of $Align(o, \delta_j)$.
ξ1ABCDEG
ABEFG
ξ2ABCDF
ACBEF
Table 3. The subalignments $Align_{sub}(o, \delta_1\cdot)$ and $Align_{sub}(o, \delta_2\cdot)$. Note: yellow marks the initial deviation, and gray has no practical meaning.
$Align_{sub}(o, \delta_1\cdot)$ BCD
BCDE
$Align_{sub}(o, \delta_2\cdot)$ BCD
BDCE
Table 4. A set of firing sequences of $M$.

| Serial No. | Occurrence Sequence |
|---|---|
| δ1 | (a,b,c,d,e,f,j,k,l) |
| δ2 | (a,b,c,d,e,g,l) |
| δ3 | (a,b,c,d,e,f,l) |
| δ4 | (a,b,c,d,e,g,j,k,l) |
| δ5 | (a,b,d,c,e,f,l) |
| δ6 | (a,b,d,c,e,g,l) |
| δ7 | (a,b,d,c,e,f,j,k,l) |
| δ8 | (a,b,d,c,e,g,j,k,l) |
Table 5. Professional notes for letter labels.

| Letter | Professional Note | Remark |
|---|---|---|
| A | Treatment of coal slime water | |
| B | Pre-sedimentation | |
| C | Enter the reaction pool | |
| D | Add HPAM | |
| E | Flocculation | |
| F | Biodegradation | |
| G | Monitoring concentration | |
| H | Fungus | |
| I | Bacteria | |
| J | Bio-enzyme | |
| K | < 10 mg/L | This concentration range meets environmental emission standards. |
| L | 10–100 mg/L | This concentration range has no significant effect on coal flotation. |
| M | > 100 mg/L | The opposite of the above two cases (< 10 mg/L and 10–100 mg/L). |
| N | Emission | |
| O | Reuse | |
| P | Real environment survey | The real-life environment contains many possibilities. This section has many selective behaviors. |
| Q | Optimization of reaction conditions based on machine learning | This section contains a number of selective operations based on the actual situation. |
| R | Yes | |
| S | No | |
| T | Computer molecular simulation | |
| U | Research on the mechanism of enzymatic transformation | |
| V | Enzymatic engineering design | |
| W | Terminate | |
Table 6. The information of BPIC2020 event log.

| Event Log | Cases | Lines | Event Types | Events |
|---|---|---|---|---|
| Domestic Declarations | 582 | 10,000 | 25 | 122,762 |
| Request For Payment | 719 | 10,000 | 29 | 134,521 |
Table 7. The experiment results of BPIC2020.

| Data set | A*-Align time (ms) | A*-Align cost | IA*-Align time (ms) | IA*-Align cost | IDP-Align time (ms) | IDP-Align cost |
|---|---|---|---|---|---|---|
| D1 | 7632 | 181,798 | 7606 | 181,798 | 4582 | 181,798 |
| D1 (+20%) | 9238 | 221,042 | 9212 | 221,042 | 6105 | 221,042 |
| D1 (+40%) | 10,661 | 255,518 | 10,635 | 255,518 | 7611 | 255,518 |
| D2 | 127,340 | 179,125 | 123,753 | 179,125 | 101,475 | 179,125 |
| D2 (+20%) | 153,426 | 214,887 | 149,839 | 214,887 | 126,275 | 214,887 |
| D2 (+40%) | 179,863 | 252,393 | 176,276 | 252,393 | 153,998 | 252,393 |
Table 8. The numerical differences of experiment results. Each column gives a time difference in ms.

| Data set | A*-Align minus IA*-Align (ms) | A*-Align minus IDP-Align (ms) | IA*-Align minus IDP-Align (ms) |
|---|---|---|---|
| D1 | 26 | 3050 | 3024 |
| D1 (+20%) | 26 | 3133 | 3107 |
| D1 (+40%) | 26 | 3050 | 3024 |
| D2 | 3587 | 25,865 | 22,278 |
| D2 (+20%) | 3587 | 27,151 | 23,564 |
| D2 (+40%) | 3587 | 25,865 | 22,278 |
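
The differences in Table 8 follow directly from the search times in Table 7. The short Python sketch below recomputes them as a check; the times are copied from Table 7, while the names `times_ms` and `time_differences` are hypothetical and chosen only for this illustration.

```python
# Search times in ms, copied from Table 7: (A*-Align, IA*-Align, IDP-Align).
times_ms = {
    "D1": (7632, 7606, 4582),
    "D1 (+20%)": (9238, 9212, 6105),
    "D1 (+40%)": (10661, 10635, 7611),
    "D2": (127340, 123753, 101475),
    "D2 (+20%)": (153426, 149839, 126275),
    "D2 (+40%)": (179863, 176276, 153998),
}


def time_differences(a_star, ia_star, idp):
    """Return the three pairwise time savings reported in Table 8 (ms)."""
    return (a_star - ia_star, a_star - idp, ia_star - idp)


differences = {name: time_differences(*t) for name, t in times_ms.items()}
for name, diff in differences.items():
    print(name, diff)
```

The recomputed values match Table 8 for every data set, for example (26, 3050, 3024) for D1 and (3587, 27151, 23564) for D2 (+20%).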
Table 9. Check performances of various methods.

| References | Cost | Time | Check Perspective | Check Form | Application Region |
|---|---|---|---|---|---|
| [13,19,33] | Minimum | | Control flow | Static state | { L M } = D |
| [42] | Minimum | | Control flow | Static state | { L M } = D |
| [28] | Minimum | | Multi-perspective | Static state | { L M } = D |
| [29] | Minimum | | Stochastic perspective | Static state | { L M } = D |
| [31,32] | Minimum | | Control flow | Dynamic state | { L M } = D |
| [39] | Minimum | > [42] | Control flow | Static state | { L M } = D |
| This work | Minimum | > [42] | Control flow | Static state | { L ( M \ φ p M ) } = D |

Note: and represent the invariability and decline in search time, respectively. { L M } = D represents an arbitrary data set consisting of the event log and the initial model. { L ( M \ φ p M ) } = D indicates that the initial model in the data set contains no non-perceivable region.
Zhang, L.; Wang, F.; Song, Z.; Huang, K.; Hu, Y.; Zhuo, G. Efficient Consistency Check Based on Perceived Initial Deviation. Electronics 2024, 13, 4669. https://doi.org/10.3390/electronics13234669
