1. Introduction
In today’s fast-paced and competitive manufacturing landscape, ensuring consistent product quality while optimizing operational efficiency has become a paramount concern for organizations worldwide [1]. In modern manufacturing, employing advanced technology is essential to enhance both the efficiency of processes and the quality of products. Additionally, the contemporary manufacturing setting produces vast amounts of data from various sources such as sensors and machinery [2]. In order to effectively monitor and enhance industrial processes in this data-rich environment, it is vital to use appropriate controls such as statistical process control (SPC). SPC comprises a set of methods designed to measure and manage quality during production [3]. It offers quantifiable control and insight into processes by monitoring their performance over time and reducing quality issues. Nonetheless, the inherent unpredictability of manufacturing processes often results in variations that can affect product quality, production efficiency, and overall operational performance [4,5]. Managing these variations is crucial to maintaining consistency, achieving quality standards, and optimizing manufacturing processes [6].
By enhancing adaptability and flexibility during operations, production processes can be accelerated and costs can be reduced [7]. In industrial systems, flexibility is closely linked to the system’s ability to adapt to changes and disruptions, thereby improving operational performance and enabling evolving demands to be better met [8]. Manufacturing flexibility acts as a mechanism to regulate workflows, providing companies with a range of response strategies to effectively handle fluctuations in demand and maintain operational efficiency [9].
Variability in manufacturing processes due to flexibility often leads to deviations that can compromise product quality and consistency. Maintaining stringent quality standards is imperative for effective control of these variations. SPC is a pivotal method designed to keep manufacturing processes under meticulous control [10]. By employing SPC, industries can attain measurable control and systematically assess their processes’ capability. However, when SPC processes exhibit abnormalities or deviate from expected norms, this can pose significant challenges [11]. These abnormal SPC processes indicate potential inefficiencies and signal the need for immediate corrective actions to rectify the underlying issues and ensure consistent product quality.
Process mining (PM) is a cutting-edge data-driven discipline within the realm of process management. Leveraging the vast amounts of digital data generated during business operations, process mining aims to analyze, visualize, and improve organizational processes in a transparent and objective manner [12]. Unlike traditional process management approaches, which rely heavily on subjective assessments and manual observations, process mining offers an empirical and comprehensive view of how processes truly operate. By extracting insights from event logs, process mining techniques can identify bottlenecks, inefficiencies, and deviations in workflows, enabling organizations to make data-driven decisions for optimization and enhancement [13]. This data-centric approach enhances process transparency while facilitating continuous improvement, thereby driving operational excellence and fostering innovation within organizations [14].
Therefore, the consolidation of PM and SPC constitutes a powerful methodology for empowering the manufacturing environment by providing real-time insights into process performance and quality control. The synergy of PM and SPC derives from their complementary roles in process management. While PM provides a data-driven approach to understand process behavior and identify opportunities for improvement, SPC offers a structured framework to monitor process performance, spot deviations, and sustain process stability. By combining the insights of PM with the methodologies of SPC, organizations can boost their process monitoring abilities, proactively recognize deviations, and apply corrective actions to guarantee consistent quality and operational efficiency [1,15].
In addition, PM techniques can support SPC by examining abnormal SPC processes. When manufacturing processes depart considerably from the defined control limits, these abnormal SPC processes signal quality problems or process inefficiencies [16]. Maintaining operational excellence, cutting waste, and preventing defects all depend on the detection of anomalous SPC processes. By using PM approaches, organizations can identify the underlying reasons for deviations, gain a deeper understanding of process behavior, and implement targeted corrective measures to enhance process performance [17].
In this respect, the present study examines process mining applied to statistical process control in the manufacturing environment. The intersection of PM and SPC is illustrated as part of a journey to show how this synergy can revolutionize manufacturing, improve product quality, and drive operational excellence in Industry 4.0. The paper proposes a methodology combining the power of PM for SPC to control manufacturing processes by uncovering deviations in abnormal processes.
The structure of this paper is organized as follows. Section 2 provides background on SPC and PM and reviews related work in the manufacturing domain. Section 3 presents the formal preliminaries used throughout the paper. Section 4 introduces the proposed methodology, namely, process mining-based statistical process control (PM–SPC). Section 5 presents the implementation of the proposed method in order to verify its validity. Section 6 discusses the results and limitations of the methodology. Finally, Section 7 concludes the study.
2. Background
2.1. Statistical Process Control (SPC)
In the quest for continual enhancement, statistical quality control relies on two key practical methodologies: acceptance sampling and statistical process control (SPC). Acceptance sampling involves the application of predetermined sampling plans to assess whether a given lot or series of lots should be accepted or rejected based on examining a sample [18]. On the other hand, SPC stands out as a primary method in quality control, encompassing a range of effective problem-solving techniques aimed at ensuring process stability and enhancing process capability by minimizing variability [19]. It is widely recognized as a traditional strategy in the manufacturing sector for reducing unpredictability and fostering continuous improvement [20].
SPC is a methodology for monitoring and controlling processes to ensure that they operate efficiently and produce products or services of consistent quality. It involves collecting and analyzing data from the process in order to understand its variability and make informed decisions about process adjustments and improvements [21].
Upper and lower control limits are key components of SPC, and are used to monitor the stability and performance of a process over time. The upper control limit (UCL) is the highest value that a process can achieve while still being considered in control or stable. It is calculated statistically from the process data, and is typically set at three standard deviations above the process mean, as shown in Equation (1):

UCL = μ + 3σ    (1)

The UCL serves as a threshold; any data points falling above it may indicate special causes of variation in the process [22].
Conversely, the lower control limit (LCL) is the lowest value that a process can achieve while remaining in control. Similar to the UCL, it is calculated from a statistical analysis of the process data, and is usually set at three standard deviations below the mean value of the cycle time, as shown in Equation (2):

LCL = μ − 3σ    (2)

Data points falling below the LCL may also suggest special causes of variation in the process [23].
In the above equations, μ and σ respectively indicate the average and standard deviation values of the cycle time.
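As a concrete illustration, the following minimal Python sketch computes the three-sigma control limits of Equations (1) and (2) from a set of cycle-time measurements; the data values and variable names are purely illustrative assumptions:

```python
import numpy as np

# Illustrative cycle-time measurements in seconds (hypothetical values)
cycle_times = np.array([175.0, 190.5, 210.2, 168.9, 201.4, 185.3, 195.7])

mu = cycle_times.mean()          # process mean
sigma = cycle_times.std(ddof=1)  # sample standard deviation

ucl = mu + 3 * sigma  # Equation (1): upper control limit
lcl = mu - 3 * sigma  # Equation (2): lower control limit

print(f"UCL = {ucl:.2f} s, LCL = {lcl:.2f} s")
```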
2.1.1. Process Capability
In the context of process capability analysis, the selection of appropriate indices is crucial for accurately assessing and reporting the performance of a process. Among the commonly used indices are Cp, Cpk, and Cpm. Each of these indices serves a specific purpose and provides different insights into the process capability.
In this study, Cpk is preferred over Cp and Cpm for the following reasons:
Unlike Cp, Cpk considers the centering of the process. This is important in our analysis, as it provides a more accurate representation of the process capability by considering both the spread and the average of the process [24].
The processes under investigation may not always be perfectly centered; Cpk provides a realistic measure of capability under such conditions, ensuring that any deviations from the target mean are factored into the capability assessment [25].
Cpk is widely used and accepted in industry for process capability analysis. Using Cpk ensures that our results are easily interpretable and comparable to other studies and industry benchmarks [26].
While Cpm offers a comprehensive measure by including deviation from a target, it can be more complex to apply and interpret [27]; Cpk strikes a balance by providing a robust measure that is both effective and easier to communicate.
The Cpk measure specifies how well a process meets specifications or requirements. It assesses the capability of a process to produce output within defined tolerance limits relative to the process variability [28]. It is calculated using Equation (3):

Cpk = min( (USL − μ) / (3σ), (μ − LSL) / (3σ) )    (3)

where the Upper Specification Limit (USL) and Lower Specification Limit (LSL) are the upper and lower limits, respectively, of acceptable performance or product characteristics, μ is the average value of the process, and σ represents the standard deviation of the process attribute. A Cpk value greater than 1 indicates that the process can meet specifications, with a higher value indicating better capability, while a value less than 1 suggests that the process may not consistently meet specifications and that improvements may be necessary [29].
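To make Equation (3) concrete, the short sketch below evaluates Cpk in Python; the specification limits are hypothetical placeholders rather than values taken from any real process:

```python
def cpk(mu: float, sigma: float, usl: float, lsl: float) -> float:
    """Process capability index per Equation (3)."""
    return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))

# Hypothetical process statistics and specification limits (illustrative)
mu, sigma = 191.06, 122.12
usl, lsl = 380.0, 0.0

print(f"Cpk = {cpk(mu, sigma, usl, lsl):.2f}")  # a value below 1 signals a
                                                # process that may not meet
                                                # specifications consistently
```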
It is important to note that the specification limits represent the acceptable range of product or service characteristics set by the customer or organization, and are not the same as control limits, which are based on the inherent variability of the process. The specification limits are used to ensure that the final output meets customer requirements, while the control limits are used to monitor and manage the stability and performance of the process itself [30].
2.1.2. Control Charts
SPC utilizes Cpk, along with other statistical tools such as control charts, to monitor process performance over time. Control charts display process data in a graphical format, allowing operators to identify trends, shifts, or abnormalities in the process. By continuously monitoring process performance and making data-driven decisions, SPC helps organizations to maintain consistent quality, reduce waste, and improve overall efficiency [31].
X-bar (X̄) control charts are commonly used in SPC to monitor the central tendency or average of a process over time. These charts plot the sample means of subgroups taken from the process on the y-axis against the subgroup number or time on the x-axis. The central line on the chart represents the overall process mean, while the UCL and LCL represent the expected variability around the process mean [32,33].
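A minimal X̄-chart can be produced with matplotlib as sketched below; the data are simulated, and the control limits are computed directly from the subgroup means as a simplification (standard practice often uses tabulated constants such as A2 together with the average range):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=1)
data = rng.normal(loc=191, scale=20, size=100)  # hypothetical KPI samples
subgroups = data.reshape(20, 5)                 # 20 subgroups of size 5
xbar = subgroups.mean(axis=1)                   # sample mean per subgroup

center = xbar.mean()
sigma_xbar = xbar.std(ddof=1)
ucl, lcl = center + 3 * sigma_xbar, center - 3 * sigma_xbar

plt.plot(range(1, 21), xbar, marker="o")
plt.axhline(center, color="green", label="center line")
plt.axhline(ucl, color="red", linestyle="--", label="UCL")
plt.axhline(lcl, color="red", linestyle="--", label="LCL")
plt.xlabel("Subgroup number")
plt.ylabel("Subgroup mean")
plt.legend()
plt.show()
```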
2.2. Process Mining
Process mining stands at the crossroads of process and data science, aiming to derive valuable insights from event logs stored in information technology (IT) systems [34]. Organizations routinely gather data to streamline and enhance their business processes, offering a rich resource for understanding and optimizing operations [35]. However, interpreting this data and extracting actionable insights can be challenging, which is where process mining comes into play. The objective of PM is to govern underlying processes by leveraging various technologies and methodologies to achieve diverse goals, including identifying actual process flows, analyzing social networks, and comparing actual and desired processes using event logs [12]. The versatility and benefits of process mining have been recognized across industries, with applications ranging from healthcare and business process management to manufacturing [2,36,37,38].
Process mining encompasses three primary types: process discovery, conformance checking, and enhancement [34]. In process discovery, the objective is to construct a visual representation of the as-is process model using event log data [39]. Using sophisticated algorithms, process mining tools analyze event logs to construct graphical representations of process flows. Techniques include the alpha algorithm, which derives a Petri net from ordering relations between activities; heuristics mining, which infers a process model by identifying frequent patterns; inductive mining, which recursively decomposes the event log to discover a process tree; and fuzzy mining, which uses fuzzy logic to handle uncertainty in the event log data. These approaches help to visualize the sequence of activities, decision points, and the paths taken by different cases or instances. This visual representation allows stakeholders to grasp the actual flow of processes, revealing both the expected paths and the deviations from them.
Among process discovery algorithms, the Inductive Miner starts by constructing a Directly-Follows Graph (DFG) from the event log. This graph captures the sequential relationships between activities based on their occurrence in the log [40]. Nodes in the graph represent activities, and directed edges (arrows) indicate that one activity directly follows another within at least one observed case. From the DFG, the Inductive Miner algorithm builds a process tree, a hierarchical representation that captures both the sequential and parallel behavior of processes observed in the event log. The constructed process tree may undergo simplification steps to improve readability and reduce complexity without losing essential process behavior. This simplification can involve merging nodes or branches that represent similar sequences or behaviors observed in the event log.
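In practice, this discovery step can be sketched with the open-source pm4py library as below; the log file name is a placeholder, and the exact function names may vary between pm4py versions:

```python
import pm4py

# Load an event log (placeholder file name)
log = pm4py.read_xes("manufacturing_log.xes")

# Directly-Follows Graph: activity pairs with their observed frequencies
dfg, start_acts, end_acts = pm4py.discover_dfg(log)

# Inductive Miner: a process tree, and an equivalent Petri net
tree = pm4py.discover_process_tree_inductive(log)
net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)

pm4py.view_petri_net(net, initial_marking, final_marking)
```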
Conformance checking involves comparing the as-is process model with the to-be model or a predefined reference model in order to identify deviations and compliance issues [41]. Methods such as token-based replay compare the event log with the process model by replaying events and tokens, while alignment analysis quantifies the alignment between the event log and process model to identify deviations.
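A hedged pm4py sketch of token-based replay against a discovered model might look as follows; the file name is a placeholder, and the structure of the returned diagnostics may differ between library versions:

```python
import pm4py

log = pm4py.read_xes("manufacturing_log.xes")  # placeholder file name
net, im, fm = pm4py.discover_petri_net_inductive(log)

# Token-based replay: per-trace diagnostics (produced, consumed,
# missing, and remaining tokens for each replayed case)
diagnostics = pm4py.conformance_diagnostics_token_based_replay(log, net, im, fm)

# Log-level fitness summary, including an overall log fitness value
fitness = pm4py.fitness_token_based_replay(log, net, im, fm)
print(fitness)
```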
Lastly, enhancement aims to improve the existing process model by extending it with additional information or enhancing it based on identified patterns and insights [42]. Enhancement mining extends the discovered process model by incorporating additional paths and activities, while performance analysis identifies bottlenecks and inefficiencies in the process. Recommendations and predictions are provided to optimize the process and achieve better performance and outcomes.
Additionally, root cause analysis plays a crucial role in enhancing process models within process mining frameworks. This analysis leverages techniques such as decision trees to identify underlying factors contributing to process inefficiencies or deviations. Decision trees analyze event data to pinpoint specific conditions or combinations of events that lead to undesired outcomes or deviations from the expected process flow. By tracing back through the process model and event logs, decision trees highlight critical decision points or activities that significantly impact process performance, thereby incorporating insights from root cause analysis into the process model enhancement phase. This integration extends the discovered process model by addressing identified root causes through additional paths, activities, or decision points that mitigate process inefficiencies and improve overall performance.
2.3. Process Mining-Based Statistical Process Control
Control charts and process capability indices are valuable tools for maintaining high quality standards. They help to promptly detect any deviations from desired performance levels, ensuring that quality remains consistently high.
When integrated with SPC techniques, process mining analysis offers a structured approach to monitoring and managing the operational processes. First, it is imperative to identify pivotal key performance indicators (KPIs) relevant to the processes. These indicators, encompassing metrics such as cycle time, throughput, and defect rates, serve as pivotal benchmarks for assessing process performance, quality, and capability [43]. Second, various types of control charts, including X-bar and R charts, can be employed to monitor the evolution of the selected KPIs over time. These charts furnish a visual representation of process variation, aiding in detecting trends, shifts, or irregularities that may signal process instability or quality deficiencies [44]. Then, PM can be used for processes that should be investigated in greater detail.
2.4. Related Work
Several studies have combined the domains of SPC and PM, as shown in Table 1.
These articles illustrate potential research gaps that should be filled in order to advance our knowledge. First, one major gap is the absence of real-world applications demonstrating the efficiency of the proposed techniques and frameworks. Moreover, there is a need to integrate methods and tools from both traditional and emerging approaches.
Another gap concerns the scalability and scope of process mining approaches to complex systems. Previous authors have pointed out that more research is needed to investigate whether these methodologies could be used for large-scale industrial control systems with various architectures and communication protocols. Likewise, handling real-time event streams is a challenge.
Another area that needs to be addressed is the adaptability of emerging technologies, such as the integration of blockchains into business process management systems as a predictive monitoring tool. The lack of in-depth comparisons and benchmark tests is identified as a limitation, in addition to the absence of software for testing the reliability of predictive monitoring techniques using different types of datasets. Consequently, the development of software tools for the intended techniques is a vital research issue that must be considered in order to bridge this gap.
While this background section has provided an overview of SPC and PM techniques, as well as related work, the true potential of these approaches lies in their integration, which is the focus of the proposed methodology outlined in the next section.
3. Preliminaries
Process mining leverages event log data to gain valuable insights into organizational processes. In order to effectively model and analyze these processes, it is essential to establish a formal understanding of the key concepts involved. The following definitions lay the foundation for the subsequent discussion on Petri nets and their application in the proposed methodology.
An event log consists of a compilation of traces, each representing a process instance (referred to as a case). These traces include activities undertaken and the process attributes affected during the instance.
Definition 1 (Events [34]). An event is defined as a triple (a, t, i), where a belongs to the set of activity names A, t belongs to the universe of timestamps T, and i belongs to the universe of event identifiers I. This triple signifies that the event identified by i denotes the execution of an activity instance a at timestamp t. Additional attributes of events may be defined; however, they are not formalized here for the sake of simplicity. A trace is defined as a sequence of events.
Definition 2 (Traces and Event Logs [34]). A trace σ is described as a sequence of events, denoted as σ = ⟨e1, e2, …, en⟩. An event log L is characterized as a collection of traces, represented as L = {σ1, σ2, …, σm}.
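Definitions 1 and 2 translate directly into simple data structures; the following minimal Python sketch uses illustrative names and values only:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Event:
    activity: str        # a, an activity name from A
    timestamp: datetime  # t, a timestamp from T
    event_id: int        # i, an event identifier from I

# A trace is a sequence of events; an event log is a collection of traces.
trace = [
    Event("Material Retrieval", datetime(2024, 1, 5, 8, 0), 1),
    Event("Calling the Machine", datetime(2024, 1, 5, 8, 7), 2),
    Event("Quality Control", datetime(2024, 1, 5, 8, 30), 3),
]
event_log = [trace]  # L = {σ1, ...}
```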
Petri nets, regarded as one of the earliest and most extensively researched process modeling languages, facilitate the depiction of concurrency. A Petri net is defined as a bipartite graph consisting of two types of nodes: places and transitions.
Definition 3 (Petri Net [34]). A Petri net is a triple N = (P, T, F), where P is a finite set of places, T is a finite set of transitions such that P ∩ T = ∅, and F ⊆ (P × T) ∪ (T × P) is a set of directed arcs, named the flow relation. A marked Petri net is a pair (N, M), where N = (P, T, F) is a Petri net and M is a multi-set over P denoting the marking of the net.
Assessing the quality of a process mining result is challenging, and involves multiple dimensions. One of these dimensions is that the discovered Petri net should accurately reflect the behavior observed in the event log. A model with high fitness can effectively replicate the majority of the traces found in the log. The fitness of a case with trace σ on Petri net N is defined in Equation (4) [59]:

fitness(σ, N) = (1/2)(1 − m/c) + (1/2)(1 − r/p)    (4)

where p represents the produced tokens, c the consumed tokens, m the missing tokens, and r the remaining tokens. The first part calculates the ratio of missing tokens to consumed tokens; 1 − m/c equals 1 if there are no missing tokens (m = 0) and 0 if all tokens intended for consumption are missing (m = c). Likewise, 1 − r/p equals 1 if there are no remaining tokens and 0 if none of the produced tokens are actually consumed. An equal penalty is assigned for missing and remaining tokens.
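Equation (4) can be transcribed directly into code; the token counts in the worked example below are illustrative:

```python
def fitness(p: int, c: int, m: int, r: int) -> float:
    """Token-replay fitness per Equation (4).

    p: produced tokens, c: consumed tokens,
    m: missing tokens,  r: remaining tokens.
    """
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

# Illustrative counts: a replay with few missing and remaining tokens
print(fitness(p=100, c=100, m=5, r=3))  # 0.96: the model explains
                                        # most of the observed behavior
```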
In SPC, it is imperative to consider certain metrics when creating control charts and calculations. These process metrics are referred to as key performance indicators (KPIs).
Definition 4 (KPI Function). Let E denote the universe of events. A KPI is defined as a function K : E* × ℕ → V, where, for a (prefix of a) trace σ ∈ E* and an integer i ≤ |σ|, K(σ, i) yields the KPI value of σ after the occurrence of the initial i events. Here, |σ| represents the number of events in σ.
It is necessary to consider KPI values as numerical for the calculations involved in statistical process control; thus, V denotes the collection of all conceivable KPI values. With a slight abuse of notation, we denote K(σ, |σ|) as K(σ), signifying the KPI value following the occurrence of all events in trace σ. It should be noted that KPI values may be timestamps, which can be readily mapped to numerical values.
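For instance, cycle time in the sense of Definition 4 can be sketched as a KPI function over a trace; the trace encoding and values below are illustrative assumptions:

```python
from datetime import datetime

# A trace as a list of (activity, timestamp) pairs -- illustrative encoding
sigma = [
    ("Material Retrieval", datetime(2024, 1, 5, 8, 0)),
    ("Calling the Machine", datetime(2024, 1, 5, 8, 7)),
    ("Packing and Shipment", datetime(2024, 1, 5, 8, 45)),
]

def cycle_time_kpi(trace, i: int) -> float:
    """K(σ, i): elapsed seconds from the first event to the i-th event."""
    return (trace[i - 1][1] - trace[0][1]).total_seconds()

print(cycle_time_kpi(sigma, len(sigma)))  # K(σ) = K(σ, |σ|) = 2700.0 s
```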
These formal definitions of events, traces, event logs, and Petri nets provide the necessary theoretical foundation for the proposed methodology, which aims to leverage the synergy between statistical process control and process mining techniques. By combining the structured monitoring capabilities of SPC with the data-driven process discovery and enhancement techniques of PM, our proposed methodology offers a comprehensive approach to empowering the manufacturing environment. In the next section, we provide the details of this methodology, outlining its key steps and highlighting its potential impact.
4. Proposed Methodology
The proposed methodology is illustrated in Figure 1. This section provides a detailed explanation of each step in the methodology in order to enhance understanding of how the process is implemented and the rationale behind each phase.
Step 1. Event Log Collection: The first step in the proposed methodology is to gather and input the event log. An event log is a comprehensive record of all the events that have occurred within a process. These logs typically include timestamps, user activities, system events, and other relevant data points. The collection of the event log is crucial, as it forms the foundation for understanding the current state and performance of the process.
Step 2. Statistical Process Control (SPC) Diagrams: With the event log collected, the next step is to create SPC diagrams for the defined Key Performance Indicators (KPIs). SPC diagrams, or control charts, are graphical representations of the process performance over time. They are used to monitor process behavior and identify any variations that may indicate problems.
The creation of SPC diagrams involves several key activities, including data calculation, parameter definition, and visualization. First, statistical measures such as the mean and standard deviation of the event log are calculated. Second, control limits (upper and lower control limits) and specification limits (upper and lower specification limits) are defined. Control limits are derived from the event log, and indicate the expected range of variation, while specification limits are defined by customer or business requirements. Lastly, the KPI values derived from the event log, along with the control and specification limits, are plotted on the control chart. This visual representation helps to quickly identify whether the process is operating within acceptable boundaries.
Step 3. Process Categorization: Based on the SPC diagrams, processes are categorized as either “under control” or “out-of-control”. This categorization is based on whether the process data fall within the established control and specification limits. Processes under control consistently operate within the control and specification limits with respect to the defined KPIs. These processes show stable and predictable performance. On the other hand, the KPI values in out-of-control processes exceed the control and specification limits or show signs of instability. These processes require further investigation in order to identify and address the causes of variation.
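Steps 2 and 3 can be sketched together in a few lines of pandas: compute per-case cycle times from the event log, derive the control limits, and flag out-of-control cases. The file name, column names, and specification limit below are assumptions following common pm4py-style conventions:

```python
import pandas as pd

df = pd.read_csv("event_log.csv")  # placeholder file; one row per event
df["time:timestamp"] = pd.to_datetime(df["time:timestamp"])

# Cycle time per case: last event timestamp minus first event timestamp
grouped = df.groupby("case:concept:name")["time:timestamp"]
cycle_time = (grouped.max() - grouped.min()).dt.total_seconds()

mu, sigma = cycle_time.mean(), cycle_time.std(ddof=1)
ucl, lcl = mu + 3 * sigma, mu - 3 * sigma
usl = 380.0  # assumed customer specification limit (illustrative)

# Step 3: flag cases exceeding the tighter of the control/specification limits
out_of_control = cycle_time[(cycle_time > min(ucl, usl)) | (cycle_time < lcl)]
print(f"{len(out_of_control)} of {len(cycle_time)} cases are out of control")
```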
Step 4. Process Discovery for Out-of-Control Processes: Detailed analysis is performed using process mining discovery techniques for the processes categorized as out-of-control. The proposed methodology utilizes the Inductive Miner algorithm to discover out-of-control process models. The Inductive Miner was selected for its ability to handle noisy and incomplete data, making it suitable for our real-world manufacturing dataset. This algorithm constructs a process tree by recursively decomposing the event log into smaller and more manageable parts. It iteratively identifies patterns and constructs a model that accurately represents the observed behaviors.
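A hedged pm4py sketch of this step, filtering the event log to the flagged case identifiers before discovery (file name and identifiers are placeholders):

```python
import pm4py

# Recent pm4py versions load XES logs as a pandas DataFrame
df = pm4py.read_xes("manufacturing_log.xes")  # placeholder file name

# Hypothetical case identifiers flagged as out-of-control in Step 3
out_of_control_ids = {"case_17", "case_42", "case_105"}
ooc_df = df[df["case:concept:name"].isin(out_of_control_ids)]

# Discover a Petri net for the out-of-control behavior only
net, im, fm = pm4py.discover_petri_net_inductive(ooc_df)
pm4py.view_petri_net(net, im, fm)
```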
Petri nets are constructed to model and analyze the discovered process flow. Petri nets provide a graphical representation of the process, showing transitions (activities), places (states), and the flow of activities. This helps in understanding the detailed sequence of events and identifying deviations or bottlenecks.
Step 5. Process Enhancement Based on KPIs: After identifying the out-of-control processes with Petri nets, various enhancement methods are applied to bring the processes back under control. One key technique is root cause analysis, which aims to identify and address the underlying causes of process deviations.
Decision trees are used for root cause analysis in order to systematically explore and identify the root causes of process deviations. A decision tree helps to break down complex issues into smaller manageable factors and pinpoint the specific areas that need improvement. Based on the findings from the root cause analysis, targeted improvements are suggested to eliminate or minimize the impact of the identified root causes. This may involve changes to process parameters, adjustments in workflows, or other corrective actions to enhance process performance.
A decision tree consists of nodes representing decisions, branches representing possible actions, and leaf nodes representing outcomes. The tree begins with a root node and branches out to show all possible outcomes of a decision. The Gini index is used to measure the purity of the nodes. A lower Gini index indicates a more homogeneous node, meaning that the data points within the node are more similar to each other. This helps identify the most influential variables affecting process performance. The decision tree splits nodes based on criteria that maximize the reduction in impurity. The decision to split is made to create the most homogeneous sub-nodes, which helps pinpoint critical decision points in the process. Each path from the root to a leaf represents a decision rule. Following these paths allows the contribution of different factors to the process outcomes to be better understood.
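A minimal scikit-learn sketch of such a decision tree-based root cause analysis is shown below; the features encode whether particular activity transitions occurred and whether the case ran in a normal shift, and all feature names and data values are illustrative assumptions:

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# One row per case: binary process features plus the cycle time (illustrative)
cases = pd.DataFrame({
    "Return from Waiting -> Packing and Shipment": [1, 1, 0, 1, 0, 1],
    "Calling the Machine -> Calling the Machine":  [0, 1, 0, 0, 1, 0],
    "normal_shift":                                [1, 0, 1, 1, 0, 1],
    "cycle_time":                                  [150, 320, 170, 140, 410, 180],
})

threshold = 191  # seconds; e.g., the average case duration
y = (cases["cycle_time"] <= threshold).map({True: "LESSEQUAL", False: "GREATER"})
X = cases.drop(columns="cycle_time")

# Gini impurity drives the splits, mirroring the analysis in the text
tree = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
tree.fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))
```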
Integrating decision tree-based root cause analysis with SPC enhances the holistic management of organizational processes. Decision trees help to identify specific conditions or combinations of factors that contribute to process inefficiencies or deviations, providing predictive insights into critical root causes. These insights guide the selection of the variables monitored in SPC, such as through control charts or process capability analysis, to detect variations and maintain process stability. In turn, SPC provides real-time monitoring capabilities to promptly identify deviations from expected performance and trigger corrective actions based on insights gained from root cause analysis. This integration creates a feedback loop wherein SPC data can refine and validate decision tree models, fostering a continuous improvement cycle that strengthens process reliability, efficiency, and quality management within organizations.
With the methodology established, the next step is to validate its effectiveness through a real-world case study in the manufacturing domain. This will showcase the proposed approach’s practical application and ability to drive process improvements and operational excellence.
5. Case Study in Production Processes
This case study demonstrates the proposed methodology’s practical application in a real-world manufacturing environment. It shows that the integrated SPC and PM approach enhances process efficiency and quality.
5.1. Data Collection and Preprocessing
The data collection procedure entailed acquiring information from diverse machinery within an industrial manufacturing environment. The data underwent a rigorous cleaning process to eliminate noise and irrelevant information. This entailed the systematic filtration of incomplete records and the identification of outliers. Data preprocessing involved preparing the data for analysis, which included aggregating event logs and sensor readings into meaningful metrics.
Table 2 shows examples of the raw data collected from different sources.
Finally, 11,784 production events from the company’s IT system were obtained for SPC and PM.
5.2. Statistical Process Control Charts for Defined KPIs
The cycle time (case duration) KPI was selected based on its significance for process performance and quality control according to the process owner. Statistical methods were used to set control limits for each chart. This involved calculating the mean and standard deviation of the KPI data and establishing upper and lower control limits based on these values. In Figure 2, under-control processes are represented by green points, whereas out-of-control processes are shown by black crosses. The figure indicates that more than 6% of the processes take longer than the specification limits. The average and standard deviation of the cycle times are 191.06 and 122.12 s, respectively. The Cpk value of 0.52 indicates that the process does not consistently meet specifications and that improvements are necessary.
5.3. Process Discovery
To improve the process, the proposed methodology suggests applying process mining techniques. Using the Inductive Miner algorithm, the methodology was able to generate a comprehensive process model incorporating all observed behaviors. This model provided a detailed representation of the manufacturing workflow, highlighting both common and exceptional paths. The discovered process model revealed insights into the sequence of operations, the frequency of different activities, and the points where deviations occurred, indicating potential inefficiencies or bottlenecks in the process.
Figure 3 shows the discovered process map with Petri net notation; ⨀ indicates the starting place, whereas ⊚ refers to the final place of the model. Due to the nature of the metalworking processes, the starting place can hold a token that circulates in the Petri net more than once. For this reason, after some transitions (activities) there are arrows returning to the starting place. The fitness value computed by Equation (4) is 0.92, meaning that the discovered process model explains 92% of the behavior observed in the event log; in other words, 92% of the traces in the event log are consistent with the behavior described by the process model.
The meanings of the activities in the process are as follows:
Material Retrieval: The process begins with the withdrawal of the necessary materials from the inventory.
Waiting: The system places the process on hold and waits for the next available resource or instruction.
Return from Waiting: After waiting, the process is moved forward to the next stage.
Calling the Machine: The machinery or equipment required for the process is activated or called into operation.
Product Retrieval: The product is taken or retrieved from the machinery or equipment.
Production Restart: The process is restarted due to a lack of response or action from the operator.
Quality Control: The product is checked for quality and compliance with standards.
Packing and Shipment: The product is packaged and prepared for shipment as the final step in the production process.
Storage: The product is stored as the final step in the production process.
Operation Cancellation: The process is canceled due to a decision made by the agent.
5.4. Process Enhancement Based on Defined KPIs
Figure 4 compares two sets of event logs: one representing processes whose outcomes exceed the upper specification limit (USL) (Group A), and the other representing all processes (Group B). The differences between these two groups are visualized in order to highlight variations between all processes and out-of-control processes.
Nodes (states) represent different stages (activities) in the process, such as Calling the Machine and Product Retrieval. The thickness of each node correlates with the sojourn time, indicating how long the process typically remains in that state. Arrows (transitions) indicate the transition from one stage to another. The thickness of each arrow shows the duration of the transition between stages. The color gradient encodes the difference in duration between the two groups, from the red to the blue end of the spectrum; red shading indicates that Group A has shorter durations, whereas blue shading indicates that Group A has longer durations.
Elements (nodes and arrows) with a frequency below 5% are excluded, focusing only on the most frequent activities in the process. The Quality Control node is highlighted, suggesting that it is significant in the comparison. The bold outline indicates a substantial difference in sojourn time or transition duration for this state between the two groups. The arrow from Material Retrieval to Quality Control is thick and colored blue, implying that the average transition duration for Group A (out-of-control processes) is significantly longer compared to Group B (all processes). The fact that there are no red arrows in the transitions indicates that the transitions in the two groups do not differ significantly. The visual representation in Figure 4 aids in identifying specific stages and transitions that differ between standard and problematic processes, providing targets for process improvement and optimization efforts. The figure clearly shows that the deviations between the two logs (Group A and Group B) should be investigated. Thus, the next step is to discover the out-of-control processes.
Figure 5 shows the out-of-control processes (those above the USL), with a 0.94 fitness value. When compared to the discovered processes in Figure 3, it is clear that there are differences in the flow which cause high cycle times. For example, in contrast to the discovered model in Figure 3, the token in the place following the Material Retrieval and Product Retrieval activities can be consumed by the Operation Cancellation and Production Restart activities. These two additional transitions normally extend the cycle time, as a loop occurs. Moreover, Figure 6 compares the cycle times of the discovered process model and the model of the processes above the USL. It is evident that processes above the USL have longer durations, resulting in their being out-of-control.
A decision tree analysis of cycle time can provide insights into the factors contributing to case durations exceeding a set threshold (USL). This can be achieved through automated selection of attributes. Subsequently, the target class (LESSEQUAL), which consists of traces with cycle times below the defined threshold, is defined; in this case, the threshold is the average case duration of 191 s.
Figure 7 shows the root cause analysis through a decision tree indicating the out-of-control processes in order to reduce the cycle times that are above the USL. The analysis starts by checking whether there is a transition from the Return from Waiting activity to the Packing and Shipment activity. If this transition happens, its value becomes 1; otherwise, it is 0. Because the Calling the Machine activity is performed after the Return from Waiting activity, the decision tree initially follows the True (left) split in the first node, which is evident from the higher number of samples (743) compared to the False (right) split (only four).
Following the True split, the second node examines the type of shift. The company has two shifts: normal and night. If production starts during a normal shift, its value becomes 1 and a False (right) split is followed, significantly reducing cycle time. Subsequent nodes delve into detailed activity transitions. For instance, the node “Calling the Machine → Calling the Machine ≤ 0.5” checks for repetitive machine calls. The left split here suggests fewer repetitions, with a Gini index of 0.037 and 53 samples. Analyzing the True split path, the decision tree reveals that normal shift activities and specific key transitions, such as from “Product Retrieval” to “Quality Control”, significantly influence cycle times. Each branch of the tree, such as “Activity: Storage ≤ 0.5”, further breaks down the process steps, allowing for a granular analysis of potential delays.
Following the False split, the second node examines the Calling the Machine activity. Although fewer samples follow this split path, the insights are equally valuable, as “Activity: Calling the Machine” highlights specific instances where the Calling the Machine activity causes longer cycle times. This path also explores less frequent but critical transitions that might be causing delays.
The above case study demonstrates the practical implementation of the proposed methodology, highlighting its capacity to uncover process deviations, identify root causes, and recommend targeted corrective actions. While the results are promising, there is room for further exploration and refinement, as discussed in the concluding sections.
6. Discussion
The intersection of SPC and PM presents a powerful methodology for augmenting manufacturing operations by providing real-time insights into process performance and quality control. This study has demonstrated the potential of this synergistic approach for pinpointing and reducing deviations within manufacturing processes.
The proposed methodology begins with the collection of event logs and incorporates SPC to establish control limits and evaluate process performance. Process mining discovery techniques are employed for processes categorized as out-of-control in order to clarify the actual process flow and determine bottlenecks and inefficiencies. Subsequent application of log comparison and root cause analysis by decision tree techniques facilitates the identification of the underlying causative factors of deviations and the implementation of targeted corrective measures. In our case study, the analysis revealed that more than 6% of the processes surpassed the specification limits, thereby underscoring the imperative for enhancements. By deploying the proposed methodology, our study identified specific deviations causing increased cycle times and was able to propose targeted interventions to bring the processes back within control parameters.
The root cause analysis, enabled by the decision tree algorithm, provided a structured and systematic approach to identifying the key factors contributing to extended cycle times in the out-of-control processes. By examining critical attributes such as the absence of the transition from the ‘Return from Waiting’ activity to the ‘Packing and Shipment’ activity, along with the shift information, the resulting decision tree effectively highlighted the primary sources of inefficiencies within the manufacturing process. This analytical approach allowed for the prioritization of corrective actions while enhancing the precision and effectiveness of interventions aimed at mitigating the identified deviations and restoring process control. Thus, using decision tree analysis in conjunction with root cause analysis proved instrumental in facilitating a data-driven and targeted approach to process improvement, thereby contributing to the overarching objective of achieving operational excellence and consistent product quality within the manufacturing environment.
The proposed methodology has numerous advantages over existing procedures in quality control. When implemented together with PM, SPC can offer organizations a holistic view of processes and allow them to determine what is actually causing these processes to go out of control, rather than merely recognizing that this is the case. This degree of detail enables precise corrective actions and sustained process improvement initiatives to be carried out.
Additionally, the proposed methodology involves better use of data analysis, which minimizes subjective judgment and reliance on observations. PM techniques such as process discovery and root cause analysis can help organizations by revealing issues that might otherwise be overlooked. This empirical approach can additionally improve the efficiency of process improvement initiatives.
However, it is important to note the following limitations of the proposed methodology. Despite its successful implementation, it depends on the availability and quality of event log data. The success of PM techniques relies on the effectiveness of organizations’ data collection and management processes. Furthermore, because the methodology requires specialized skills and tools for PM analysis, it may require additional investments in training and technology.
Notably, while the case study in this paper focused on production processes, the proposed methodology can be generalized to various industries and domains where process optimization and quality control are crucial. The integration of SPC and PM techniques is applicable to any process-oriented environment that generates event log data. However, the specific KPIs, control limits, and PM algorithms may need to be adapted based on the unique characteristics and requirements of each domain.
7. Conclusions and Future Directions
This research focused on the combined application of SPC and PM to improve manufacturing process performance and quality. The case study was very useful in illustrating how this combined approach can pinpoint process variances. Using the data gathered from the event logs and SPC, this study was able to set up control limits and observe the performance of different processes. Processes classified as out-of-control were investigated using PM discovery methods and decision trees with the aim of identifying the root causes of inefficiencies.
The aforementioned conclusions highlight the significance of such integrated approaches in enhancing process control and operational productivity. Although there are challenges and constraints, including concerns around complexity and scalability, this strategy has significant advantages in delivering real-time information and facilitating focused solutions. Future research studies should aim to address these gaps and examine the versatility of this approach in other industries and broader contexts.
By linking the structured monitoring abilities of SPC with the data-driven process discovery and enhancement capabilities of PM, the proposed methodology paves the way for a comprehensive and proactive approach to process optimization in the manufacturing environment, aligning with the overarching pursuit of operational excellence and consistent product quality.
Future research initiatives might investigate the application of advanced machine learning and artificial intelligence paradigms to extend the predictive capabilities of the proposed methodology and automate the identification and mitigation of process deviations in real time. Additionally, the extension of the proposed methodology across diverse industrial sectors together with an evaluation of its scalability and adaptability could provide valuable insights into its broader applicability and potential, potentially leading to a transformative impact on manufacturing operations within the ambit of digital transformation.