1. Introduction
Industry 4.0 uses IoT, digital twin systems, and predictive maintenance technologies to improve business processes and gain a competitive advantage. In that scenario, Real-Time Fault Detection and Diagnosis (RT-FDD) is fundamental for increasing the reliability of production systems by preventing breakdowns [1]. The real-time aspect enables early interventions when abnormalities are detected, while the diagnosis feature supports precise maintenance actions.
Industrial processes may be continuous or discrete, and each requires a different RT-FDD approach because their behaviors are distinct. Continuous processes, such as oil refineries and distillation systems, deliver products at a specific rate (e.g., ton/h, L/min) with uninterrupted operation, whereas discrete processes, such as Discrete Manufacturing Machines (DMMs), are executed in well-defined sequences of steps with specific durations, delivering unitary items (e.g., bottles, boxes) [2]. Therefore, performing RT-FDD in continuous processes requires understanding the normal behavior of variables over time, in steady states and transitions, to differentiate it from anomalous or faulty behavior. In contrast, in discrete systems, RT-FDD requires understanding the sequential operations, their durations, and how continuous variables behave in each operation.
In this context, this study focuses on performing RT-FDD in DMMs, which are widely present in the manufacturing industry. Their automation systems generally include knowledge-based (KB) fault diagnostic features that monitor sensors and parameters and trigger alarms [3]. Furthermore, their behavior is typically described by a sequence of events, represented by inputs and outputs (IOs) [4,5], and may also involve continuous variables such as pressures, flows, temperatures, levels, positions, energy quality, power, or consumption [3]. Because of the complexity and variety of the data types and the number of devices to be monitored in a DMM, KB fault diagnosis is limited to the few known situations that can feasibly be implemented by hand. Manufacturing performance may therefore be improved by overcoming the limitations of KB solutions and delivering more and better diagnostics to support maintenance interventions.
Alternatives to KB systems are Physical Models (PM) and data-driven techniques such as Machine Learning (ML). However, while KB and PM approaches demand high engineering effort and deep knowledge of the machine’s behavior [6,7], ML does not, since it learns from the machine’s observed behavior and can deal with high-dimensional data [8]. Moreover, ML approaches can model nonlinear relations in data and are somewhat robust to outliers.
Several studies have focused on developing accurate ML models to reduce downtime in production and improve process quality; most deal with continuous processes. Recently, Kojuk et al. [9,10] developed an approach for building a decision support tool that combines supervised and semi-supervised techniques to detect and diagnose faults in data from continuous processes. Ren et al. [11] developed a methodology based on deep belief networks and multiple models to accomplish fault detection for complex systems. Chiu et al. [12] proposed a method using random forest and a time-series deep-learning model based on long short-term memory (LSTM) networks to achieve real-time monitoring and faster corrective adjustment of machines. Furukawa et al. [13] used the change score generated by ChangeFinder as a new feature for an SVM that classifies normal and anomalous conditions, improving detection speed and accuracy compared to the original SVM. Finally, Makridis et al. [14] proposed an ensemble-based method to predict faults in maritime vessels.
Regarding FDD tasks in discrete event systems, such as DMMs, most studies represent the system’s behavior with Petri Nets or Finite State Machines to implement the diagnostic approaches. For instance, Cohen et al. [4] developed a hybrid approach that uses Petri Nets to guide data-driven fault diagnosis of PLC (Programmable Logic Controller)-timed cyclic event systems, reaching a validation accuracy of 97.2%. Furthermore, Lee and Chuang [15] developed a Petri Net-based fault diagnostic system for industrial processes. Their solution involves learning the machine’s normal behavior, designing a Petri Net from it, and implementing PLC routines based on logical combinations that detect whether the current machine behavior is normal or anomalous. Finally, Ghosh et al. [5] developed an automated fault detection tool for PLC-controlled manufacturing systems. Their approach is centered on learning the states of a sequential machine over time to detect when a sensor or actuator state change occurs at an unexpected moment.
Some significant findings may be highlighted from the studies mentioned above. First, research on ML for RT-FDD in DMMs remains scarce: most studies that apply ML to FDD focus on continuous processes, while those addressing discrete-event systems, such as DMMs, mainly deal with timed events and use Petri Nets or State Machines. In addition, no prior study on RT-FDD for DMMs was found to combine continuous variables with discrete events, even though continuous variables may indicate an imminent failure and thus improve the FDD task. For instance, an unusual temperature at a specific device may precede its damage. Moreover, the typical sequential cyclic behavior of DMMs has not been considered in any prior study that uses ML for RT-FDD.
Thus, considering that ML may significantly contribute to RT-FDD in DMMs, another challenge must be faced: the current industrial workforce includes few professionals prepared to use ML, such as data scientists [16]. With that in mind, Automated Machine Learning (AutoML) has been employed by researchers to address this gap, enabling non-ML experts to explore ML technologies [17,18,19,20].
AutoML is essentially a paradigm for automating all or part of the ML process to reduce human effort in model development and empower domain experts to use machine learning [17]. In this sense, Larocque-Villiers et al. [21] and Li et al. [22] developed AutoML solutions for intelligent fault detection on bearings and gearboxes, respectively. Kefalas et al. [23] investigated the use of AutoML for remaining useful life estimation of aircraft engines, and Nascimento et al. [22] studied the diagnosis of operating conditions and sensor faults using AutoML. In all cases, AutoML was reported to be efficient and to save significant time.
With the aim of improving manufacturing performance by reducing downtime through better fault diagnosis, this work proposes a novel, domain-specific AutoML approach for RT-FDD in DMMs. It exploits the cyclic sequential behavior of DMMs, accounts for the scarcity of ML professionals in the industry, and uses only data commonly available in industrial SCADA (Supervisory Control and Data Acquisition) systems: time series of digital and analog IOs. The main contributions of this research are:
an AutoML approach that enables non-ML experts to implement data-driven RT-FDD in the industry, since it requires human contributions only in the automation and maintenance domains;
a method to combine discrete events and continuous variables into the features for RT-FDD in DMMs, which considers their cyclic sequential behavior;
the evaluation of how combining discrete timed events and continuous variables as features enhances the models’ performance;
the evaluation of the generated models’ capacity to correctly diagnose faults, even when only a few samples of the faulty conditions are available.
The remainder of this document is structured as follows:
Section 2.1 (AUTO-ML APPROACH FOR RT-FDD) details the structure of the proposed AutoML approach, describing the feature and dataset preparation, the classifiers explored by the model selection mechanism, and the model execution routine;
Section 2.4 (3D REAL-TIME MACHINE AND FAULT SIMULATION) describes two simulated machines and nine faulty conditions that are used in the experiments;
Section 3 (RESULTS AND DISCUSSION) evaluates the automatically generated models’ capacity to detect and diagnose the faults, and their performance in different scenarios that reflect real-world situations.
3. Results and Discussion
In this section, the results obtained from applying the AutoML approach are presented and discussed. Experiments were performed with two datasets for each simulated machine: one generated as detailed in Process II of the proposed approach, named the PA dataset, and the other containing only the discrete timed events, named the ODE dataset. The investigations verified:
the overall performance of the selected models and the influence of combining discrete and continuous variables (Section 3.1);
the performance of all 16 models implemented with different classifiers in the model selection process, and their sensitivity to the dataset split and the initialization (Section 3.2);
the performance by class of the selected models using a confusion matrix (Section 3.3);
the relevance of timed-event and continuous-variable features from a feature importance analysis (Section 3.4);
the impact of the dataset size on the performance of the models using the F1 Score (Section 3.5).
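To make the distinction between the two datasets concrete, the following minimal Python sketch builds one feature row per machine cycle, either with step durations only (ODE-like) or with step durations plus a summary of the continuous signals (PA-like). It is an illustration under stated assumptions, not the feature preparation of Process II: the step names, the `temperature` signal, the use of the mean as the summary statistic, and the helper `build_row` are all hypothetical.

```python
from statistics import mean


def build_row(cycle, include_continuous=True):
    """Build one feature row for a machine cycle.

    `cycle` is a list of steps with a hypothetical layout:
    {"name": str, "duration_s": float, "signals": {signal_name: [samples]}}.
    """
    row = {}
    for step in cycle:
        # Discrete timed-event feature: how long the step lasted.
        row[f"{step['name']}_duration_s"] = step["duration_s"]
        if include_continuous:
            # Continuous features: one summary value per signal and step.
            for signal, samples in step["signals"].items():
                row[f"{step['name']}_{signal}_mean"] = mean(samples)
    return row


# Illustrative cycle with two steps (made-up values).
cycle = [
    {"name": "heat", "duration_s": 12.0,
     "signals": {"temperature": [180.0, 200.0, 220.0]}},
    {"name": "cool", "duration_s": 8.0,
     "signals": {"temperature": [120.0, 80.0]}},
]

ode_row = build_row(cycle, include_continuous=False)  # discrete events only
pa_row = build_row(cycle, include_continuous=True)    # discrete + continuous
```

In this sketch the ODE-like row has two features (the step durations), while the PA-like row has four, since each step also contributes the mean of its continuous signal.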
3.1. Overall Performance Evaluation
The approach was capable of generating models to diagnose faults with F1 Scores of 100% and 85% for the Furnace and Pick And Place simulated machines, respectively. These results were obtained on the reserved unseen fraction of the PA dataset and can be observed in Figure 5c and Figure 6c. For the Furnace, the selected model was the Extra Trees Classifier, and for the Pick And Place, it was the Random Forest Classifier.
Figure 5 presents boxplots of all 16 automatically generated classifiers in the model selection process for the Furnace machine experiment, ranked by F1 Score. Figure 5a,b enable a direct comparison between models implemented with PA and ODE test data, respectively, and Figure 5c,d show the performance of the implemented models over reserved unseen data. Following the same organization, Figure 6 presents the same performance analysis for the Pick And Place machine. The red dashed line shows that models implemented with the PA dataset present a higher mean F1 Score than those implemented with the ODE dataset.
Regarding the contribution of continuous variables when combined with discrete events as features, the selected models implemented with the PA dataset present an F1 Score 6 percentage points higher than the best models implemented with the ODE dataset. The mean F1 Scores with the ODE and PA datasets were 94% and 100% on the Furnace data, and 79% and 85% on the Pick And Place data. The difference between the distributions was confirmed by a t-test at a 95% confidence level.
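As an illustration of how such a comparison between two F1 Score distributions can be carried out, the sketch below computes a two-sample t statistic using only the Python standard library. The unequal-variance (Welch) variant is an assumption, since the text only states that a t-test was applied, and the score lists are made-up values, not the results reported here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical F1 Scores from repeated runs (illustrative only).
f1_pa = [0.86, 0.84, 0.85, 0.87, 0.83]
f1_ode = [0.80, 0.78, 0.79, 0.81, 0.77]

t_stat, dof = welch_t(f1_pa, f1_ode)
```

With these illustrative values, the statistic is approximately 6.0 with about 8 degrees of freedom, which exceeds the two-sided 5% critical value of the t distribution (about 2.31 for 8 degrees of freedom), so the hypothesis of equal means would be rejected.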
In practical terms, this 6-percentage-point increase in the F1 Score should result in higher machine availability due to faster and more precise interventions based on correct diagnostics. Because the F1 Score penalizes both False Positives and False Negatives, a model with a higher score is less likely to diagnose a fault when the machine is working in normal condition, or vice versa.
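The dependence of the F1 Score on False Positives and False Negatives can be made explicit with a short worked example. The sketch below computes the per-class F1 Score as 2TP / (2TP + FP + FN) and averages it over classes. Macro averaging is an assumption, since the averaging scheme is not stated in this section, and the labels are illustrative.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 Scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)  # false alarm
        fn = sum(t == label and p != label for t, p in pairs)  # missed case
        scores.append(f1_from_counts(tp, fp, fn))
    return sum(scores) / len(scores)


# Illustrative labels: one normal cycle is wrongly diagnosed as a fault.
y_true = ["normal", "normal", "fault_a", "fault_a"]
y_pred = ["normal", "fault_a", "fault_a", "fault_a"]

score = macro_f1(y_true, y_pred)
```

Here the single misdiagnosis counts as a False Negative for the normal class (F1 = 2/3) and as a False Positive for the fault class (F1 = 4/5), so both per-class scores drop and the macro average is 11/15, roughly 0.73.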
No comparison with other studies was performed because, to the best of the authors’ knowledge, there is no prior study or public benchmark dataset that combines discrete events and continuous variables from a DMM for RT-FDD. Some of the few publicly available datasets on industrial systems relate to continuous processes [31,32,33].
3.2. Considered Classifiers Performance
As shown in Figure 5a,c and Figure 6a,c, the two best models, ET and RF for the Furnace and RF and LightGBM for the Pick And Place, were the same in the evaluations with test and unseen data. In addition, these models presented the highest mean F1 Scores, with the highest minimum values and lowest standard deviations, when evaluated with test and unseen data. This low variance suggests that these models are less sensitive to the train/test split and to the initialization.
Regarding the performance of the other 14 classifiers, 3 presented an F1 Score below 80% on the Furnace dataset, and 8 presented performance below 70% on the Pick And Place dataset. This shows that not all classifiers learn adequately from the data. Moreover, the standard deviation of each distribution shows that some classifiers are significantly less sensitive than others to the initialization and the dataset split. This observation is relevant since a classifier that is highly sensitive to the initialization or the dataset split may overfit the data and perform unsatisfactorily on unseen data.
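The role of the mean and the standard deviation in comparing classifiers can be sketched as follows. The example ranks candidate models by their mean F1 Score over repeated runs and uses the standard deviation to break ties. The tie-breaking rule, the function `rank_models`, and the score values are assumptions made for illustration and do not reproduce the model selection mechanism of the approach.

```python
from statistics import mean, stdev


def rank_models(scores_by_model):
    """Rank models by mean F1 (descending), then by lower standard deviation."""
    summary = {
        name: (mean(scores), stdev(scores))
        for name, scores in scores_by_model.items()
    }
    return sorted(summary.items(), key=lambda item: (-item[1][0], item[1][1]))


# Hypothetical F1 Scores over repeated splits and seeds (illustrative only).
runs = {
    "ET": [0.99, 1.00, 1.00, 0.99],
    "RF": [0.98, 0.99, 0.97, 0.98],
    "KNN": [0.70, 0.90, 0.60, 0.80],
}

ranking = rank_models(runs)
best_name = ranking[0][0]
spread = {name: std for name, (_, std) in ranking}
```

In this toy data the third model has both the lowest mean and by far the largest spread, which is the pattern described above for classifiers that are sensitive to the split and the initialization.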
3.3. Performance Evaluation by Class
Table 2, Table 3, Table 4 and Table 5 show how unseen instances of each class are labeled by the models implemented with the ODE and PA datasets. As can be observed, the classifiers implemented with the PA dataset presented improved performance in all classes. Regarding the improvements on the Furnace RT-FDD model, in Table 2, faults F1 and F3 were misdiagnosed. These misclassifications are entirely eliminated when the model is implemented with the PA dataset, as observed in Table 3.
As summarized in Table 4 and Table 5, for the Pick And Place RT-FDD models, correct classifications in all classes improve when the PA dataset is used. In addition, significant improvements are observed for faults F1 and F4, whose correct classifications increase by more than 50%. Therefore, despite remaining weaknesses in some classes, the models implemented with the PA dataset still present a significantly superior capacity to provide a correct diagnosis.
3.4. Feature Importance Analysis
An ablation study was performed to analyze the importance of each feature by removing it from the dataset and verifying the model’s performance in its absence. As can be observed in Figure 7 and Figure 8, of the 10 most relevant features, 6 (>50%) are continuous variables, marked by the red dashed rectangles. This result corroborates the positive contribution of combining continuous variables and discrete events in the dataset, since continuous-variable features are as relevant as the most relevant discrete-event features. This analysis was performed using PyCaret’s model evaluation module.
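The ablation idea described above, removing one feature at a time and measuring the resulting change in performance, can be sketched generically as follows. The function `ablation_importance`, the feature names, and the toy scoring function are hypothetical; in practice `evaluate` would retrain the model on the reduced feature set and return its F1 Score, and this sketch does not reproduce PyCaret's internal computation.

```python
def ablation_importance(features, evaluate):
    """Return the score drop caused by removing each feature.

    `evaluate` is any callable that takes a list of feature names and
    returns a score; a larger drop means a more important feature.
    """
    baseline = evaluate(features)
    drops = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        drops[feature] = baseline - evaluate(reduced)
    # Most important feature first.
    return dict(sorted(drops.items(), key=lambda kv: kv[1], reverse=True))


# Toy stand-in for "retrain and score" (hypothetical): every feature
# contributes a fixed amount to the score.
CONTRIBUTION = {
    "heat_duration_s": 0.10,
    "heat_temperature_mean": 0.25,
    "door_open_duration_s": 0.02,
}


def toy_evaluate(features):
    return 0.5 + sum(CONTRIBUTION[f] for f in features)


importance = ablation_importance(list(CONTRIBUTION), toy_evaluate)
```

In this toy setting the continuous feature `heat_temperature_mean` causes the largest drop when removed and is therefore ranked first.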
3.5. Sensitivity to the Dataset Size and Improvement Capacity
Since faults are rare events, it is reasonable to expect that the initial dataset in a new deployment of the proposed AutoML approach will contain only a small number of samples of faulty cycles. Therefore, the models’ sensitivity to the number of faulty cycle samples and their capacity to improve as new known faults occur were investigated. The investigation involved training and evaluating the selected models with datasets containing 20 to 70 instances of each class, in steps of 5 (i.e., [20, 25, …, 70]).
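A minimal sketch of how such a series of datasets could be assembled is given below. It draws the same number of instances from every class for each size in [20, 25, …, 70]. The sampling scheme, the helper `subsample_per_class`, the fixed seed, and the placeholder data are assumptions made for illustration; the training and evaluation steps are omitted.

```python
import random
from collections import defaultdict

SIZES = list(range(20, 75, 5))  # [20, 25, ..., 70] instances per class


def subsample_per_class(rows, labels, n_per_class, seed=0):
    """Pick n_per_class rows of every class, reproducibly."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    rng = random.Random(seed)
    subset = []
    for label in sorted(by_class):
        # random.Random.sample raises ValueError if a class is too small.
        for row in rng.sample(by_class[label], n_per_class):
            subset.append((row, label))
    return subset


# Hypothetical dataset: 70 cycles for each of three classes; the rows are
# placeholders standing in for feature vectors.
labels = [c for c in ("normal", "fault_a", "fault_b") for _ in range(70)]
rows = list(range(len(labels)))

subsets = {n: subsample_per_class(rows, labels, n) for n in SIZES}
```

Each entry of `subsets` could then be split, used to retrain the selected model, and scored with the F1 Score to obtain one point of a learning curve.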
As can be seen in Figure 9 and Figure 10, the models implemented with the PA dataset present superior performance compared to those implemented with the ODE dataset. Furthermore, models implemented with the PA dataset and evaluated over unseen data show a more evident performance improvement as the number of instances increases than those implemented with the ODE dataset.
As can be seen in Figure 9c,d, in the evaluation with unseen data, the F1 Score of the Furnace classifiers implemented with the PA dataset rises from 82% to 94% when the number of faulty samples increases from 20 to 70. With the ODE dataset, the F1 Score reaches only 85%. Similar behavior is observed for the best classifier of the Pick And Place example: increasing the samples from 20 to 70 improved the F1 Score from 80% to 88% with the PA dataset and from 74% to 82% with the ODE dataset.
Therefore, the models implemented with the PA and ODE datasets are comparable in diagnosing faults correctly when only a small number of fault samples is available. However, models implemented with the PA dataset are significantly more capable of improving as they are retrained with new fault samples.
Finally, Figure 11 summarizes the distribution of the F1 Score considering all the models implemented over the PA and ODE datasets and the dataset sizes from 20 to 70 instances for each machine. As observed, models implemented following the proposed approach (PA) present higher performance than the models implemented only with discrete events (ODE).
4. Conclusions
A new automated ML approach for real-time fault detection and diagnosis (RT-FDD) in discrete manufacturing machines (DMMs) is presented and validated with two case studies: simulated Furnace and Pick And Place machines. The models generated by the approach presented the highest mean F1 Scores and the lowest variances among all 16 classifiers considered in the model selection process. Extra Trees and Random Forest are the classifiers selected for the Furnace and Pick And Place machines, with average F1 Scores of 100% and 85%, respectively.
An improvement of 6 percentage points in the mean F1 Score is observed when continuous and discrete variables are combined following the proposed approach, compared with a dataset built with only discrete timed events. The statistical difference between the distributions is confirmed by a hypothesis test at a 95% confidence level. Moreover, in the feature importance analysis, six continuous variables are listed among the ten most relevant features, corroborating the positive contribution of combining continuous variables and discrete events in the dataset.
In both case studies, the classifiers implemented with the PA dataset show an F1 Score higher than 80% when only 20 samples of each fault class are available. The F1 Score is enhanced to over 90% when 70 samples of each faulty condition are available. These results show the approach’s capacity to diagnose faults when it is first deployed as well as its capacity to improve each time a new fault occurrence is detected.
Future work should consider introducing automatic anomaly detection to identify novel faulty conditions, along with clustering and explainability resources to support understanding and labeling the new cases. Furthermore, new studies on the deployment of AutoML in industry and its impacts may highlight new gaps and research directions to remove barriers that prevent ML from being widely used on the shop floor.