1. Introduction
A workflow (or business process) management system supports two fundamental functionalities: modeling and enacting. The modeling functionality allows modelers to define, analyze, and maintain workflow processes by attaching all essential workflow entities, such as activities, roles, performers, relevant data, and invoked applications, to the corresponding procedures. By contrast, the enacting functionality enables performers to invoke, execute, and monitor all instances of the workflow processes. The logical foundation of such a workflow management system is its underlying workflow model, which implies that the system automates the definition, creation, execution, and management of workflow processes according to the internal principles and structure of that model. To date, several workflow models [1,2] have been proposed, almost all of which employ five essential entity types, namely, the activity, role, performer, data repository, and application, to represent organizational work and its procedural collaborations. In this study, we focus on the performer entity type.
In recent years, workflow studies have started to focus on the people working in workflow-supported organizations, because a workflow system is widely accepted to be a people system. By analyzing the interactive and collaborative behaviors among people involved in performing workflow processes, we can measure and estimate their overall performance in real businesses, as well as their work productivity. For more than a decade, workflow mining [3,4] has received significant attention as a key enabling technology for acquiring human-centered knowledge regarding workflow processes. In this regard, the authors’ research group has raised research and development issues in applying social network concepts and analysis methods to human-centered workflow knowledge discovery and analysis.
In this context, we are particularly interested in the work transference network [5] among performers in a workflow-supported organization. More specifically, this network is established through work transference (or handover) relationships between two performers in charge of the preceding and following activities within the workflow process. As shown in Figure 1, two performers are in charge of the consecutive activities A and B, respectively. As the predecessor in this case, the performer of A transfers the results of executing A to the successor, the performer of B. This type of relationship therefore reflects the relevancy and intensity between performers in terms of their work and can accordingly serve as an important analytical property for acquiring human-centered workflow knowledge.
Research on the work transference network has two main branches: discovery and rediscovery. The former discovers a work transference network by analyzing a specific workflow model, whereas the latter mines a work transference network from the workflow event logs of that model. In other words, discovery explores a planned work transference network [5], whereas rediscovery explores an enacted work transference network. This paper is directly related to work transference network rediscovery. Ultimately, through these discovery and rediscovery concepts, it is possible to assess workflow fidelity, which indicates how faithfully the observed work transference network (rediscovery) reflects the planned network (discovery). By generalizing workflow fidelity, we can answer the following managerial questions.
Who is the performer most closely working with a particular performer?
Based on work transferences, what is the highly recommended performer group in specific procedural working steps within the workflow process?
How well are the designed resource allocation plan and its resulting collaboration patterns accomplished?
To actualize the workflow fidelity assessment, we must address the fact that a common implementation is still lacking that provides system support for correctly rediscovering work transference networks and for analyzing discrepancies between the discovery and rediscovery results. As a step toward closing this gap, this paper describes a framework for rediscovering work transference networks hidden in event logs. In addition, the framework is designed to handle heterogeneous event logs of different timestamp origins, such as the assessed, started, and completed timestamp origins. To verify the framework, we conduct experimental analytics on the complete event log dataset of the BPI 2018 Challenge, which was recorded from a workflow model that handles applications for direct EU payments to German farmers from the European Agricultural Guarantee Fund. In conclusion, the purpose of this paper is to establish a fundamental principle for rediscovering a work transference network from a specific workflow’s event logs by carrying out the experimental analytics and delivering the analytical results.
The remainder of this paper is organized as follows. Section 2 describes related studies concerning workflow knowledge discovery. Section 3 describes a series of conceptual definitions and a discovery procedure of the framework. Section 4 describes the experimental analytics of mining work transference networks based on this framework. Finally, some concluding remarks are given in Section 5.
2. Related Works
Numerous emerging technologies and research issues can be found in the literature on workflow management. Among the most recent are workflow mining and knowledge discovery, which concern collecting runtime event log data into workflow logs, filtering the logs to form workflow warehouses, and discovering knowledge from such warehouses. Almost all recent workflow management systems provide their own logging mechanisms [6] for organizing such workflow logs and warehouses. In terms of collecting, filtering, and discovering activities with workflow logs and warehouses, related studies to date have mainly focused on two specialties, namely, workflow process discovery [3,7,8,9,10,11,12] and workflow knowledge discovery [13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29]. Workflow process discovery is directly concerned with redesigning and reengineering the control-flow aspect of workflow processes by discovering workflow models from event logs. To obtain desirable process models from event logs, diverse discovery techniques have been proposed in the literature, including alpha [3], amalgamation [7], heuristics-based [8], and inductive [9] mining algorithms.
By contrast, workflow knowledge discovery is closely connected with replanning and realigning the resource aspect (e.g., the data-flow, performer, role, and program) of workflow models by rediscovering enacted and binding histories (e.g., temporal work transferences) from the workflow logs. The eventual goal of this research area is to measure, evaluate, control, and predict the degree of workflow fidelity in a workflow-supported organization. Through the concept of workflow fidelity, we can pursue the managerial decision-making goal of minimizing the discrepancies between the planned workflow models as estimated at build time and the enacted workflow models as actually performed at runtime. In this study, we focus experimentally on workflow knowledge discovery and analytics, with particular attention paid to work transference networking knowledge.
Workflow knowledge discovery stems from the concepts of business process intelligence [13] and workflow intelligence [14]. In [13], the authors described business process intelligence as a suite of integrated software tools for managing workflow execution quality through features such as analysis, prediction, monitoring, control, and optimization. Through a business process intelligence suite, we can accomplish a higher level of enhancement in managing the quality of workflow execution, such as the detection and prevention of nonconformances through auditing [15], refining workflow data preparations, and integrating other data mining techniques. The work in [13] is much closer to the conceptual contribution described herein, whereas the approach in [14] is much more concrete. In [14], the authors proposed a framework of control-path-oriented workflow intelligence and quality improvement to achieve a higher degree of workflow traceability and discoverability, and devised an efficient control-path analysis approach through the concept of a minimal workflow model. In particular, the knowledge discovered in that framework is a quantitative measure of runtime enactments for each control path generated from a workflow model. The control paths and reachable paths, together with the frequencies of their runtime enactments, become valuable knowledge for redesigning and reengineering the corresponding workflow model.
From the viewpoint of workflow intelligence, workflow knowledge discovery needs to connect all perspectives, such as the behavioral, temporal, organizational, and performance perspectives. From an organizational viewpoint, Kim [16] first raised the issue of human-centered knowledge discovery on workflows by observing the collaborative behaviors among workflow performers. That work proposed a formal approach using an algorithm that can discover a procedural collaboration network among the workflow performers. Following this approach, the work in [17] became a major turning point in human-centered workflow knowledge discovery. In that study, the authors understood the workflow performers’ network as the relationships of work transferences (which they call handovers) among the workflow performers. In other words, they interpreted the workflow performer network as a social network and analyzed it using social network analysis (SNA) techniques. Furthermore, Song et al. [18] attempted to discover organizational models as well as workflow performer networks, and measured their performance. They referred to such analytics based on event logs by the more comprehensive term organizational mining. In addition, Park et al. [19] developed another approach and system for analyzing the social networks of workflow performers using process models. As such, SNA techniques facilitate the discovery of organizational structures among workflow performers and their application to acquiring human-centered knowledge [17,18,19], including the measurement of employee contribution [19], resource community detection [20,21], and recommendation [22,23].
In addition to these studies, results have been reported on the integrated concept of workflows and social networks for discovering temporal workflow patterns [24], and on mining techniques for workflow resource allocation [25]. Beyond the discovery of human-centered knowledge, Hong et al. [26] presented a methodology for redesigning an organizational structure based on social networks analyzed from workflow enactment event logs. In addition, to support sustained analysis, Appice et al. [27] proposed a technique for continually updating a social network discovered from event logs. Through their approach, it is possible to track changes to a social network and gain knowledge from its history in terms of its dynamics. As an industrial case study, Aloini et al. [28] analyzed a social network among port logistics workers using workflow mining and SNA techniques. The results of that study suggest that the handover relationships among such workers affect the overall efficiency of the export process.
In the present paper, we focus on a new form of human-centered networking knowledge hidden inside workflow processes and their enactment event histories, which is called work transference networking knowledge and is formally represented as a work transference network model. Our research group has developed a mining system that can discover work transference networks from workflow enactment event log datasets formatted in the extensible event stream (XES) standard [30]. Through the discovery of work transference networks, we can not only quantitatively measure the degree of work sharing and relevancy among performers, but also qualitatively estimate the levels of work intensity among workflow performers in a workflow-supported organization.
3. Conceptual Discovery Framework
In this section, we formally describe a conceptual framework that includes a series of conceptual definitions and a procedure for discovering a work transference network from event logs. Figure 2 illustrates the framework and the functional relationships of its core concepts. To formalize the framework, we define a series of formal concepts ranging from an event trace to a work transference network model. An event trace corresponds to one execution of a workflow instance, and it therefore aggregates all completed events that occur over time during that execution. We introduced a formal definition of this concept (called a temporal trace) in [7]. In addition, each event trace has its own performer trace, which is formally called a temporal work transference model. Together with these concepts, we provide a formal definition of the work transference network model.
3.1. Workflow Enactment Event Logs
As a workflow instance executes, the logging component of the workflow execution engine records its execution events in a log repository, arranged as a temporal sequence of events. The sequence corresponding to one workflow instance is called an event trace, from which we can extract an activity trace, formally represented as a temporal trace model. An event trace also involves the sequence of performers (the performer trace) who participate in executing the work items of the corresponding workflow instance. We can likewise extract a performer trace from an event trace, formally represented as a temporal work transference model. Here, we begin the description of the discovery framework by formally defining an event, which is stored as a single log record, as follows.
Definition 1 (Event). Let $we = (\alpha, pc, wf, wc, ac, p, t, s)$ be an event stored in the log, where:
α is a work item (activity instance) identifier,
pc is a package identifier,
wf is a workflow process identifier,
wc is a workflow instance identifier,
ac is an activity identifier,
p is a performer identifier,
t is a timestamp, and
s is the current state of the work item, which is one of ready, assigned, reserved, running, completed, and cancelled.
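As an illustration only, the 8-tuple of Definition 1 could be held in a small immutable record. The class name `Event`, the field types, and the sample values below are assumptions made for this sketch and are not taken from the implemented system.

```python
from dataclasses import dataclass
from datetime import datetime

# The six work-item states listed in Definition 1.
STATES = ("ready", "assigned", "reserved", "running", "completed", "cancelled")


@dataclass(frozen=True)
class Event:
    """One log record we = (alpha, pc, wf, wc, ac, p, t, s)."""
    alpha: str   # work item (activity instance) identifier
    pc: str      # package identifier
    wf: str      # workflow process identifier
    wc: str      # workflow instance identifier
    ac: str      # activity identifier
    p: str       # performer identifier
    t: datetime  # timestamp
    s: str       # current state of the work item

    def __post_init__(self):
        # Reject states outside the list given in Definition 1.
        if self.s not in STATES:
            raise ValueError(f"unknown work item state: {self.s!r}")


# Hypothetical sample record; every value is invented for illustration.
we = Event("wi-1", "pkg-1", "wf-1", "case-1", "A", "p1",
           datetime(2018, 1, 1, 9, 0), "completed")
```

Freezing the record keeps logged events read-only, which matches their role as historical facts.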
In terms of log formats, we assume that event logs are stored in a tag-based language. An XML-based log format, XWELL [6], was proposed for workflow mining at the academic level, and the WfMC has released a standardized audit and log specification, namely, BPAF [31]. The IEEE has more recently released a standard event log format, XES [30], which aims to provide designers of information systems with a unified and extensible methodology for capturing system behaviors by means of event logs. As the format of the event log structure, we can use the XES schema, which describes the structure of an XES event log or stream. Based on the XES format, we summarize the essential attributes of an event log as follows:
The event attribute is used to specify an event identifier, which is assigned by the workflow execution engine.
The work item attribute of an event represents a work item identifier that is uniquely formed by combining identifiers such as the package id, workflow id, activity id, and instance id.
The performer attribute is used to specify a human-resource in charge of enacting the work item.
The timestamp attribute specifies the time information of an occurred event.
Finally, the state attribute represents the current state of the work item maintained by the engine. Whenever the state changes, the resulting event is logged. The state must be one of ready, assigned, reserved, running, completed, and cancelled.
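The attributes above can be read from an XES-style document with the Python standard library. The following is a minimal sketch under stated assumptions: the snippet is hand-written, it omits the namespace and extension declarations of a real XES file, and the mapping of the performer, timestamp, and state attributes to the standard XES keys `org:resource`, `time:timestamp`, and `lifecycle:transition` is our reading rather than something the framework prescribes.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hand-written sample; all values are invented for illustration.
XES_SNIPPET = """
<log>
  <trace>
    <string key="concept:name" value="case-1"/>
    <event>
      <string key="concept:name" value="A"/>
      <string key="org:resource" value="p1"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2018-01-01T09:00:00+00:00"/>
    </event>
    <event>
      <string key="concept:name" value="B"/>
      <string key="org:resource" value="p2"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2018-01-01T10:30:00+00:00"/>
    </event>
  </trace>
</log>
"""


def read_events(xml_text):
    """Flatten an XES-like document into one dict per event."""
    rows = []
    for trace in ET.fromstring(xml_text.strip()).iter("trace"):
        # The trace-level concept:name is used here as the workflow instance id.
        trace_attrs = {el.get("key"): el.get("value")
                       for el in trace if el.tag != "event"}
        for ev in trace.iter("event"):
            attrs = {el.get("key"): el.get("value") for el in ev}
            rows.append({
                "instance": trace_attrs.get("concept:name"),
                "activity": attrs.get("concept:name"),
                "performer": attrs.get("org:resource"),
                "state": attrs.get("lifecycle:transition"),
                "timestamp": datetime.fromisoformat(attrs["time:timestamp"]),
            })
    return rows


events = read_events(XES_SNIPPET)
```

The raw `lifecycle:transition` value is kept as-is; translating it into the six states used in this paper would require a mapping that is not specified here.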
3.2. Temporal Workcases and Work Transferences
Definition 2 (Event Trace). Let WT(c) be an event trace of a workflow instance, c, where $WT(c) = (we_1, \cdots, we_n)$ such that every event satisfies $we_i.wc = c$ and consecutive events satisfy $we_i.t \le we_{i+1}.t$. It formally represents a temporally ordered event sequence of a specific workflow instance, which can be extracted from the event logs by considering the timestamp and the state attributes.
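Read operationally, Definition 2 amounts to selecting the events of one workflow instance and ordering them by timestamp. The sketch below assumes a reduced event record with only five fields; the name `event_trace` and the sample log are hypothetical.

```python
from collections import namedtuple
from datetime import datetime

# Reduced event record: only the fields this sketch needs.
Event = namedtuple("Event", "wc ac p t s")


def event_trace(log, c):
    """Return WT(c): the events of instance c in non-decreasing timestamp order."""
    return sorted((we for we in log if we.wc == c), key=lambda we: we.t)


# Hypothetical, deliberately unordered log that mixes two instances.
log = [
    Event("c1", "B", "p2", datetime(2018, 1, 1, 10), "completed"),
    Event("c2", "A", "p3", datetime(2018, 1, 1, 8), "completed"),
    Event("c1", "A", "p1", datetime(2018, 1, 1, 9), "completed"),
]
trace = event_trace(log, "c1")
```

Because `sorted` is stable, events with equal timestamps keep their order of appearance in the log.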
From the formal definition of the event trace, we build a temporal workcase model, TWC(c), along with a temporal work transference model, TWT(c), as shown in Figure 3. Note that the meaningful temporal order in managing workflow instances must follow one of the following instantaneous points in time, each of which is called a timestamp origin. Accordingly, we discover a series of event traces from numerous event logs, and from each discovered trace we build up to four types of temporal workcases and temporal work transferences, one per timestamp origin.
The scheduled time: the event’s timestamp is taken when the state of a work item changes from ready (the work item is ready to be processed but has not been assigned to a particular participant) to assigned (the work item has been assigned to a group of participants, but work has not started yet). An event with a scheduled timestamp thus records the state assigned.
The assessed time: the event’s timestamp is taken when the state of a work item changes from assigned to reserved (the work item has been assigned to a single participant, but work has not started yet). An event with an assessed timestamp thus records the state reserved.
The started time: the event’s timestamp is taken when the state of a work item changes from reserved to running (the work item is actively being worked on, and time spent in this state is recorded as processing time or work time). An event with a started timestamp thus records the state running.
The completed time: the event’s timestamp is taken when the state of a work item changes from running to completed (the work item has been fully executed and completed with either success or failure). An event with a completed timestamp thus records the state completed.
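The four timestamp origins can be summarized as a lookup from the state a work item has just entered to the origin that the transition defines. The sketch below assumes that each logged event carries the new state after the transition; the names `ORIGIN_OF_STATE` and `timestamps_by_origin` and the sample life cycle are hypothetical.

```python
# State entered by the transition that defines each timestamp origin.
ORIGIN_OF_STATE = {
    "assigned": "scheduled",   # ready -> assigned
    "reserved": "assessed",    # assigned -> reserved
    "running": "started",      # reserved -> running
    "completed": "completed",  # running -> completed
}


def timestamps_by_origin(work_item_events):
    """Map each timestamp origin to the time of the matching state change.

    work_item_events: iterable of (state, timestamp) pairs for ONE work item.
    """
    origins = {}
    for state, t in work_item_events:
        origin = ORIGIN_OF_STATE.get(state)
        if origin is not None:
            origins.setdefault(origin, t)  # keep the first occurrence only
    return origins


# Hypothetical life cycle of one work item; times are minutes since case start.
life_cycle = [("ready", 0), ("assigned", 2), ("reserved", 5),
              ("running", 9), ("completed", 30)]
origins = timestamps_by_origin(life_cycle)
```

States that define no origin, such as ready and cancelled, are simply skipped.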
Definition 3 (Temporal workcase). Let TWC(c) be a temporal workcase of a workflow instance, c:
$TWC(c) = (a_1, \ldots, a_n)$,
where each $a_i = we_i.ac$ is the activity identifier of an event $we_i \in WT(c)$ with $we_i.wc = c$, and $a_i \prec a_{i+1}$ holds under the chosen timestamp origin,
which is a temporally ordered activity sequence along with the specific timestamp origin. It is assumed that all the work items in c are successfully completed and that their executions run without being suspended.
Definition 3 can be interpreted as follows: all activities of the event trace of c, which share the same instance id, are lined up in an activity sequence according to the timestamp origin. Consequently, an activity sequence, which we call a temporal workcase, is constructed from an event trace by extracting the activity identifiers and their timestamps. Note that, according to the different types of timestamp origins, four types of temporal workcases potentially exist, in each of which all activities are aligned with one specific timestamp origin: scheduled, assessed, started, or completed.
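A temporal workcase can be sketched as a filter on the instance id and on the state that marks the chosen timestamp origin, followed by a sort on the timestamp. The record layout, the function name `temporal_workcase`, and the integer timestamps below are assumptions made for illustration.

```python
from collections import namedtuple

# Reduced event record: instance, activity, performer, timestamp, state.
Event = namedtuple("Event", "wc ac p t s")

# State whose logged event carries the timestamp of each origin (assumed).
STATE_OF_ORIGIN = {"scheduled": "assigned", "assessed": "reserved",
                   "started": "running", "completed": "completed"}


def temporal_workcase(log, c, origin):
    """Return TWC(c) as [(activity, timestamp), ...] for one timestamp origin."""
    state = STATE_OF_ORIGIN[origin]
    picked = [we for we in log if we.wc == c and we.s == state]
    picked.sort(key=lambda we: we.t)
    return [(we.ac, we.t) for we in picked]


# Hypothetical log; integer timestamps keep the example short.
log = [
    Event("c1", "A", "p1", 1, "running"), Event("c1", "A", "p1", 4, "completed"),
    Event("c1", "B", "p2", 5, "running"), Event("c1", "B", "p2", 9, "completed"),
    Event("c2", "A", "p3", 2, "completed"),
]
started_case = temporal_workcase(log, "c1", "started")
completed_case = temporal_workcase(log, "c1", "completed")
```

The same instance thus yields one activity sequence per origin, and the sequences may differ in order when activities overlap in time.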
Definition 4 (Temporal Work Transference). Let TWT(c) be a temporal work transference of a workflow instance, c:
$TWT(c) = (p_1, \ldots, p_n)$,
where each $p_i = we_i.p$ is the performer identifier of an event $we_i \in WT(c)$ with $we_i.wc = c$, and $p_i \prec p_{i+1}$ holds under the chosen timestamp origin;
which is a temporally ordered performer sequence along with the specific timestamp origin. Note that the scheduled timestamp origin is not available in performer sequences. A temporal work transference is formally defined by a temporal work transference model, and it is assumed that all the work items in c are successfully completed and that their executions run without being suspended.
Definition 4 can be interpreted as follows: all performers of the event trace of c, which share the same instance id, are lined up in a performer sequence according to the timestamp origin. Consequently, a performer sequence, which we call a temporal work transference, is constructed from an event trace by extracting the performer identifiers and their timestamps. Because the scheduled timestamp origin is inapplicable, three types of temporal work transference may exist, in each of which all performers are aligned with one specific timestamp origin: assessed, started, or completed.
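The performer sequence can be sketched in the same way as the activity sequence, except that the scheduled origin is rejected, since a work item in the assigned state belongs to a group of participants rather than to a single performer. The record layout and the name `temporal_work_transference` are assumptions made for illustration.

```python
from collections import namedtuple

# Reduced event record: instance, activity, performer, timestamp, state.
Event = namedtuple("Event", "wc ac p t s")

# Only three origins apply to performer sequences (no "scheduled").
STATE_OF_ORIGIN = {"assessed": "reserved", "started": "running",
                   "completed": "completed"}


def temporal_work_transference(log, c, origin):
    """Return TWT(c) as [(performer, timestamp), ...] for one timestamp origin."""
    if origin not in STATE_OF_ORIGIN:
        raise ValueError(f"origin not applicable to performers: {origin!r}")
    state = STATE_OF_ORIGIN[origin]
    picked = sorted((we for we in log if we.wc == c and we.s == state),
                    key=lambda we: we.t)
    return [(we.p, we.t) for we in picked]


# Hypothetical log; integer timestamps keep the example short.
log = [
    Event("c1", "A", "p1", 4, "completed"),
    Event("c1", "C", "p1", 12, "completed"),
    Event("c1", "B", "p2", 9, "completed"),
]
twt = temporal_work_transference(log, "c1", "completed")
```

A performer may appear several times in the sequence, once for each work item handled in the instance.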
Definition 5 (Temporal Work Transference Model). A temporal work transference model is formally defined as a 3-tuple whose first component is χ, over a set P of performer nodes on a temporal work transference, TWT(c), of a workflow instance, c, and a set K of the timestamp origins, where:
the second component is a coordinator or a coordinator-group linked from an external temporal work transference model;
the third component is a coordinator or a coordinator-group linked to an external temporal work transference model;
χ, defined on P, comprises two mapping functions:
- a single-valued mapping function of a performer node, under a timestamp origin in K, to its immediate successor in a temporal work transference;
- a single-valued mapping function of a performer node, under a timestamp origin in K, to its immediate predecessor in a temporal work transference.
According to the different timestamp origins, temporal work transference models are defined as follows:
the temporal work transference model with the assessed time;
the temporal work transference model with the started time;
the temporal work transference model with the completed time.
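The two single-valued mappings of Definition 5 can be illustrated on one performer sequence. In the sketch below, each node is a (position, performer) pair, which is our assumption for keeping the mappings single-valued when a performer occurs more than once; the coordinator components are omitted, and the name `transference_model` is hypothetical.

```python
def transference_model(performers):
    """Build successor and predecessor maps for one performer sequence.

    Nodes are (position, performer) pairs, so a performer who appears twice
    yields two distinct nodes and each mapping stays single-valued.
    """
    nodes = list(enumerate(performers))
    successor = {nodes[i]: nodes[i + 1] for i in range(len(nodes) - 1)}
    predecessor = {nxt: cur for cur, nxt in successor.items()}
    return successor, predecessor


# Hypothetical performer sequence of one workflow instance.
succ, pred = transference_model(["p1", "p2", "p1", "p3"])
```

The first node has no predecessor and the last node has no successor; in the full model these positions are where the external coordinators would attach.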
3.3. Work Transference Network Model
As described above, a temporal work transference model (TWTM) can be constructed for each workflow instance by combining the corresponding TWC and TWT. At this stage, one step remains: amalgamating all TWTMs into the work transference network model (WTNM) that we ultimately aim to discover. A WTNM reveals the actual form of the work transference relationships among the performers after a certain time has elapsed since the corresponding workflow model was deployed, and it is discovered from the event logs shaped in the form of a workflow warehouse. As shown on the right side of Figure 2, as the final outcome of the framework, a WTNM has the formal and graphical structure of a directed graph (digraph) model that represents the performers and the work transferences among them. Each node represents a performer, and each ordered pair of nodes (or a directed edge) represents a work transference relationship. Edges contain the intermediate activities and their occurrences; for example, an edge labeled with activity C and an occurrence count of 20 indicates that the predecessor performer has completed work items of activity C 20 times and has transferred their outcomes to the successor performer.
As a formal representation of such knowledge regarding work transferences, we define the WTNM in Definition 6. Using the formal notation σ, we define the work transference relationships, in which the two nodes of a directed edge represent the predecessor and successor of the work items. In addition, we define the formal notation ψ for the work association relationships by labeling each directed edge with the activity names and occurrences of the work items that are transferred to the successor and received from the predecessor of the corresponding work transference relationship.
Definition 6 (Work Transference Network Model). A work transference network model is formally defined as a 4-tuple whose first two components are σ and ψ, over a set P of performer nodes and a set A of activity nodes in a set of event traces logged from enacting the underlying workflow model, where:
the third component is a coordinator or a coordinator-group linked from some external work transference networks;
the fourth component is a coordinator or a coordinator-group linked to some external work transference networks;
/* Work Transferences */
σ comprises two functions:
- a multi-valued function mapping a performer to its set of immediate predecessors;
- a multi-valued function mapping a performer to its set of immediate successors;
/* Work Associations */
ψ comprises two functions:
- a multi-valued function returning a paired list of receiving work items and their occurrences on ordered pairs of performers, from a predecessor to the performer;
- a multi-valued function returning a paired list of transferring work items and their occurrences on ordered pairs of performers, from the performer to a successor.
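The amalgamation step can be sketched as a single pass over consecutive pairs in every performer sequence. The input layout (one list of (performer, activity) pairs per instance), the function name `build_wtn`, and the sample traces are assumptions made for this sketch; the coordinator components are omitted.

```python
from collections import Counter, defaultdict


def build_wtn(traces):
    """Aggregate per-instance sequences into a work transference network.

    traces: iterable of [(performer, activity), ...] lists in temporal order.
    Returns successor sets, predecessor sets, and two edge labelings that
    count activities per ordered performer pair.
    """
    successors = defaultdict(set)
    predecessors = defaultdict(set)
    transferring = defaultdict(Counter)  # activity completed by the predecessor
    receiving = defaultdict(Counter)     # activity taken over by the successor
    for trace in traces:
        for (p_from, a_from), (p_to, a_to) in zip(trace, trace[1:]):
            successors[p_from].add(p_to)
            predecessors[p_to].add(p_from)
            transferring[(p_from, p_to)][a_from] += 1
            receiving[(p_from, p_to)][a_to] += 1
    return successors, predecessors, transferring, receiving


# Two hypothetical instances of the same three-activity process.
traces = [
    [("p1", "A"), ("p2", "B"), ("p3", "C")],
    [("p1", "A"), ("p2", "B"), ("p2", "C")],
]
successors, predecessors, transferring, receiving = build_wtn(traces)
```

In this sample, the edge from p1 to p2 carries activity A with an occurrence count of 2, and the second instance produces a self-transference from p2 to p2.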
5. Conclusions
This study focuses on the work transference network (WTN) for workflow knowledge discovery, and its contributions include the following:
Formal definitions and a procedure of the framework for discovering a WTN from event logs.
A brief description of the implemented system of the framework, which can handle event logs of different timestamp origins.
Experimental analytics of the discovery of a WTN using a real-life event log dataset.
Based on the contributions above, we confirmed that the proposed framework and its implemented system are valid in discovering WTNs and acquiring primitive knowledge. To recap, the information and knowledge that the proposed framework can provide are listed as follows:
A structure of the WTN observed during a long-term period (from accumulated event logs).
A structure of the WTN for a single workflow instance (from an event trace).
Total numbers of performers and activities associated with certain workflow processes and their occurrences.
Patterns showing how work transference relationships are made (e.g., self-transferring).
Degree of strength in terms of the work transference between two performers.
In conclusion, we consider this study a meaningful step toward workflow fidelity assessment from the organizational aspect, which has not yet been supported in a systematic way. Despite the feasibility of the framework, our work still lacks suggestions of possible and attractive applications for business analysts. Therefore, as future work, we need to improve our system to provide more sophisticated analytical capabilities for effectively discovering and delivering more valuable knowledge. In addition, we plan to conduct a case study in which we apply our system to massive real-life event logs to verify how beneficial the discovery framework is to workflow-supported organizations.