Predictive Data Mining Techniques for Fault Diagnosis of Electric Equipment: A Review

Contreras-Valdes, Arantxa; Amezquita-Sanchez, Juan P.; Granados-Lieberman, David; Valtierra-Rodriguez, Martin

doi:10.3390/app10030950

¹ ENAP-Research Group, CA-Sistemas Dinámicos, Facultad de Ingeniería, Universidad Autónoma de Querétaro (UAQ), Campus San Juan del Río, Río Moctezuma 249, Col. San Cayetano, San Juan del Río, Qro. C. P. 76807, Mexico

² ENAP-Research Group, CA-Fuentes Alternas y Calidad de la Energía Eléctrica, Departamento de Ingeniería Electromecánica, Tecnológico Nacional de México, Instituto Tecnológico Superior de Irapuato (ITESI), Carr. Irapuato-Silao km 12.5, Colonia El Copal, Irapuato, Guanajuato C. P. 36821, Mexico

* Author to whom correspondence should be addressed.

Appl. Sci. 2020, 10(3), 950; https://doi.org/10.3390/app10030950

Submission received: 29 December 2019 / Revised: 23 January 2020 / Accepted: 27 January 2020 / Published: 1 February 2020

(This article belongs to the Special Issue Machine Fault Diagnostics and Prognostics)

Abstract

Data mining is a technological and scientific field that has gained importance in many areas over the years, attracting scientists, developers, and researchers around the world. This enthusiasm derives from its remarkable benefits, such as the exploitation of large databases and the intelligent use of the information extracted from them through the analysis and discovery of knowledge. This document provides a review of the predictive data mining techniques used for the diagnosis and detection of faults in electric equipment, which constitutes a pillar of any industrialized country. The methods used in classification and regression tasks for the diagnosis of electric equipment, from the year 2000 to the present, are reviewed. Current research on data mining techniques is also listed and discussed according to the results obtained by different authors.

Keywords: data classification; data mining; data regression; electric equipment; fault diagnosis

1. Introduction

Over the past few years, the number and diversity of electrical equipment, such as motors, transformers, generators, electric vehicles, and energy transmission and distribution systems, among many others, have been growing [1,2,3,4,5,6]. Their exponential growth is due to people's need to perform many different activities, ranging from industrial processes to everyday activities such as charging a cell phone battery or starting a car to go to work. Because of their paramount importance in every facet of society, their safe and correct operation is vital, even more so considering that a failure in one of their components can produce (1) high economic losses derived from partial or total repair, (2) degraded performance, (3) outages in the production process, (4) damage to other equipment, and (5) conditions that put the physical integrity of people at risk, among others.

In this regard, the application and development of new techniques and methods to monitor the condition of electric machines and systems are important topics of research. In general, a condition monitoring strategy consists of the following steps (see Figure 1): Data collection through different types of sensors, data processing and feature extraction, and data analysis for condition assessment. The last step can be seen as the process of exploring, finding, selecting, and using specific data to solve the given problem, e.g., a diagnosis problem. However, it is not an easy or straightforward process, since the data analyst has to deal with different volumes and varieties of data, as well as redundant and unneeded data, which can compromise and complicate the solution of the assigned task. In fact, in many cases only a small part of the dataset is used because its volume is simply too large to be processed effectively. One solution to this problem has been the use of data mining (DM) techniques. DM is one of the fastest growing fields at both the computational and industrial levels. Its main characteristic is the search for patterns across different sets of data to discover the available knowledge. Kantardzic [7] defines DM as the process of applying a computer-based methodology for discovering knowledge from data. Although DM is based on computational algorithms, the best results can be obtained by balancing the knowledge of human experts about the problem under study with the advantages and operating modes of different algorithms [4].

In general, DM functionalities can be divided into two categories: Predictive and descriptive. The former is used to construct models that allow the prediction of unknown or future values, whereas the latter is in charge of finding new information that allows the description of the dataset. In this regard, the predictive functionalities are the most suitable option for condition monitoring, since a new and unknown equipment condition can be determined or predicted from specific input information. Therefore, this manuscript is aimed at reviewing the classification and regression tasks that fall within the predictive category of DM, as well as hybrid techniques that combine more than one prediction method. Specifically, classification techniques attempt to find a function or model that distinguishes or predicts the class of unknown data by analyzing a training dataset [8]. Regression analysis is used for numerical prediction, i.e., to predict missing or new numerical data values [8].

In the literature, two main groups of research works related to DM and electric equipment are found. On the one hand, there are different reviews about DM applications, e.g., diagnosis in health [9,10,11], marketing [12], industrial [13,14], climatological [15], and financial [16] issues, among others. On the other hand, there are also reviews related to diagnosis methods for specific machines such as transformers [17,18,19], estimation strategies in electric vehicles [20], mathematical models used to study induction motors in defective conditions [21], or, in a more general sense, methodologies of fault classification in transmission systems [22,23] and distribution of energy [24]. There is also the work of Hare et al. [25], who present a study of modern diagnosis methods in smart micro grids. Although there are specialized reviews on either DM or electric equipment and systems, none of these works has specifically focused on reviewing the research on applications of DM techniques for condition monitoring of electric equipment and systems. Such a review is important in order to highlight the algorithms that have been used in specific equipment but can be applied to other machines, since the application core is similar. In this regard, this manuscript provides a review of DM techniques focused particularly on the tasks of classification and regression within the category of predictive analysis applied to various electric machines and systems, such as transformers; electric vehicles; heating, ventilation, and air conditioning (HVAC) systems; aircraft and automotive systems; three-phase and multi-phase induction motors; centrifugal pumps; generators; distribution systems; and transmission lines, among others.

The rest of this manuscript is organized as follows. Section 2 deals with the classification, regression, and hybrid techniques used for the detection of faults and the diagnosis of electrical systems. Section 3 presents recent research works on these topics and the latest contributions on DM techniques that can be explored in fault diagnosis methodologies. Finally, Section 4 presents the conclusions of this work.

2. Predictive Model

DM tasks can be conducted through prediction and description models [7,8] (see Figure 2a). In general, prediction models are constructed by learning from data with known classes, whereas description models arise from the findings obtained in a dataset [8]. In this regard, predictive DM techniques are the most direct option for diagnosing equipment and systems, since their different operating conditions can be learned and determined by a prediction model. Such a model (see Figure 2b) is learned by analyzing a training dataset (input data with known outputs) and is then used to predict the unknown output (class or value) of new input data. According to the nature of the data (discrete or continuous), the prediction model can be used to perform either classification or regression functionalities [7] (see Figure 2a). A classification procedure assigns an object of unknown class to one of several predefined classes (the predicted class) according to its properties or features. In contrast, a regression procedure models continuous functions to determine new numerical values (predicted values) according to specific inputs. Classification and regression techniques used for fault detection in electric equipment and systems are presented in the following sections.
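To make the distinction between the two predictive tasks concrete, the following minimal sketch learns from a small training set with known outputs and then predicts the output of a new input. It is only an illustration under stated assumptions: the feature values, the class labels, and the function names (`classify_1nn`, `fit_line`) are hypothetical, and the 1-nearest-neighbour rule and the least-squares line are simple stand-ins rather than methods taken from the reviewed works.

```python
from math import dist

# Hypothetical training set for classification: (feature vector, known class).
# The two features could be, e.g., an RMS current and a vibration level.
train_cls = [
    ((1.0, 0.2), "healthy"),
    ((1.1, 0.3), "healthy"),
    ((2.0, 1.5), "faulty"),
    ((2.2, 1.4), "faulty"),
]

def classify_1nn(x, training_set):
    """Return the class of the training sample closest to x (1-nearest neighbour)."""
    _, label = min(training_set, key=lambda pair: dist(pair[0], x))
    return label

# Hypothetical training set for regression: (input, known numerical output).
train_reg = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

def fit_line(samples):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Classification: the predicted output is a discrete class.
predicted_class = classify_1nn((2.1, 1.3), train_cls)

# Regression: the predicted output is a continuous value.
a, b = fit_line(train_reg)
predicted_value = a * 4.0 + b
```

In both cases the model is built only from inputs with known outputs, which mirrors the learning and prediction stages described for Figure 2b.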

2.1. Classification-Based Methods

In a condition monitoring context, a classification model can be constructed from a given system and used to provide warnings and predict certain failures in early stages. In this regard, researchers around the world have proposed and used different classification methods from machine learning, pattern recognition, and statistics to perform fault diagnosis.

Recent research on DM has focused on developing classification techniques capable of handling datasets with different characteristics, e.g., imbalanced class proportions and large amounts of data. For the latter, this capacity is strongly required because, on the one hand, the availability of data is growing and, on the other hand, the performance of the techniques can be compromised if limited datasets are analyzed. In fact, the amount of data available during the training of a neural network (NN) plays an important role in its performance. For instance, Taylor et al. [26] contrasted three different techniques: Neural networks trained by using a hybrid of evolutionary search and backpropagation, neural networks trained by straightforward backpropagation, and simple predictive rulesets trained by evolutionary algorithms. Results indicate that evolved NNs outperform backpropagation-trained NNs. However, the results are slightly unsatisfactory from a business viewpoint, with a maximum accuracy of 77.9%, which can be somewhat expected given the small amount of training data and highlights the need for additional data to establish a better reference during the pattern recognition task. Fortunately, there are many works in which the authors also use NNs as the basis of their investigations and obtain promising results. In an energy consumption context, Magoulès et al. [27] diagnose different electrical equipment of an office building, including fans, pumps, cooling equipment, and chillers. They use a recursive deterministic perceptron NN to distinguish between normal and defective datasets, obtaining an effectiveness higher than 97%. Similarly, the use of NNs for fault detection in induction motors is presented in [28,29,30,31,32,33,34,35]. Tallam et al. [28] presented an on-line diagnostic scheme to alert the motor protection system of an incipient failure. 
This scheme consists of a feed-forward NN with a self-organized feature map to display the operating conditions of the in-test machine. An interesting feature of the results is that the method is not sensitive to unbalanced supply voltages or asymmetries in the machine. Martins et al. [29] use the alpha-beta stator currents of a three-phase induction motor as input variables to diagnose stator faults. In their proposal, an unsupervised Hebbian-based NN is used to extract the main components of the stator current data. Other proposals combine NNs with fuzzy logic systems (FLSs) to detect inter-turn faults [30,31]. In particular, Ballal et al. [31] developed an adaptive neuro-fuzzy inference system (ANFIS) for the detection of stator inter-turn insulation and bearing wear faults, where five input parameters, i.e., current, bearing temperature, winding temperature, speed, and the noise of the machine, are used to construct the model. For the inter-turn insulation fault, they obtain an effectiveness of 94.03% using two inputs and 96.67% using five inputs. For the bearing wear fault, the accuracy rate is 90.5% with two inputs and 98.7% with five inputs. These results demonstrate the importance of an information-rich dataset. In [32,33,34], several NNs are implemented in field programmable gate arrays (FPGAs) to diagnose different faults in induction motors. The diagnosis of broken rotor bars is presented by Zolfaghari et al. [35], where the multi-layer perceptron NN used is able to detect the faults in the rotor with a classification effectiveness of 98.80%. Furthermore, modular NNs are used to diagnose transmission lines from the voltage and current signals of their elements (buses, transmission lines, and transformers). Given their modular nature, the diagnosis can be carried out by element, by area, or for the entire electrical system [36]. 
In [37], adaptive linear neural networks and feed-forward neural networks are combined to classify electrical disturbances that affect electric equipment. The best classification results are obtained when only a single disturbance appears; when more disturbances are combined, the effectiveness is reduced, but it is worth noting that the effectiveness obtained exceeds 90% for a noiseless condition and exceeds 77% for a noisy condition in the presence of six combined disturbances. In addition, the overall methodology takes 46.5 milliseconds per half cycle analyzed. Hare et al. [25] present a survey of fault diagnostics in smart micro grids, in which they discuss the faults within various components of a micro network, e.g., photovoltaic panels, wind turbines, conventional generation systems, and cables and transmission lines, and present several classification algorithms such as NNs, decision trees, and FLSs, among others.
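As a rough illustration of how a neural classifier separates normal from defective data, the sketch below trains a single-layer (Rosenblatt) perceptron. It is a deliberately simplified stand-in: it is not the recursive deterministic perceptron of [27] nor any of the multi-layer networks cited above, and the feature values, labels, and function names are hypothetical.

```python
def train_perceptron(samples, epochs=200, lr=1.0):
    """Train a single-layer perceptron; labels are +1 (faulty) or -1 (normal)."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in samples:
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified or exactly on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:  # a full pass without mistakes: training has converged
            break
    return w, b

def predict(w, b, x):
    """Return +1 (faulty) or -1 (normal) for the feature vector x."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical, linearly separable training data: (features, label).
samples = [
    ((1.0, 0.5), -1), ((1.2, 0.4), -1), ((0.9, 0.6), -1),
    ((2.5, 2.0), 1), ((2.8, 1.8), 1), ((3.0, 2.2), 1),
]

w, b = train_perceptron(samples)
training_predictions = [predict(w, b, x) for x, _ in samples]
```

Because the toy data are linearly separable, the perceptron update rule is guaranteed to reach a separating boundary; real fault data usually are not, which is one reason the reviewed works rely on richer architectures and larger training sets.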

Regarding transformers, Rigatos and Siano [38] propose neural-fuzzy network modeling and the local statistical approach for the detection of incipient faults in power transformers. Another technique commonly used for the diagnosis of transformers is the decision tree [39,40,41]. Menezes et al. [39] and Han et al. [40] used experimental data from a dissolved gas analysis (DGA) to illustrate the performance of their decision tree-based models. In [39], a comparison between the method based on the C4.5 algorithm and other methods used in DGA is presented. Only 162 samples are used for the analysis, obtaining the following accuracy: 99.38% for the proposed method, 98.15% for the rules extracted, 88.03% for Duval Triangle, 63.25% for Dornenburg IEC C57.104, and 56.41% for Rogers IEC C57.104. In [40], a C4.5 decision-tree algorithm obtained an effectiveness of 86% for a thermoelectric fault in an oil-immersed transformer. Samantaray and Dash [41] analyze the current of a power transformer to discriminate between the current signals generated by the inrush effect and the ones generated by its internal faults. The processing time of the proposed approach is 0.12 s, and it provides an accuracy greater than 96%, exceeding the accuracy of the support vector machine (93.33%). As can be noted, the type of variable to be analyzed by decision tree-based methods is not restricted; in fact, the use of vibration signals for the diagnosis of faults in monoblock centrifugal pumps [42] and internal combustion engines [43] is also presented. The latter also compares the classification accuracy obtained by the J48 algorithm, random forest tree algorithm, linear model tree algorithm, best first tree algorithm, and functional tree algorithm, where the linear model tree algorithm provides the best results, offering a classification accuracy of 100% using statistical features. In general, it can be noted that decision tree algorithms are a practical, economical, and very effective approach. 
In addition to these different types of decision trees, the fault tree is another alternative used for the diagnosis of systems. For example, Volkanovski et al. [44] evaluate the reliability of a power system for energy delivery by constructing a fault tree structure, which represents the system configuration and includes all the possible flow routes of interruption of the power supply from the generators to the loads, including energy transfer limitations, common cause failure of power lines, energy flows and the capacity of generators, and loads in the power system. Duan and Zhou [45] also use fault tree analysis and Bayesian networks for fault detection in a system for oil pressure warning instructions in an aircraft engine, obtaining a diagnostic decision tree that guides maintenance personnel toward more efficient decisions when attempting to repair the system. An advanced Bayesian non-linear state estimation technique called Unscented Kalman Filtering is presented by Bonvini et al. [46] to detect faults in HVAC (heating, ventilation, and air conditioning) components. This algorithm can detect common faults in a chiller plant and functional failures caused by problems in the compressor and occlusions in the valves, with a computational performance of 0.25 s using Intel Xeon (R) 2.67 GHz–19 Cores and 0.52 s using Intel Core i7 2.8 GHz–1 Core. Another tree-based method is the tree-structured fault dependence kernel developed by Li et al. [47]. It implements a structured labeling to include dependency information and describe severity levels in a high-margin learning framework for fault detection of building cooling systems. It is important to highlight that the testing accuracy increases or decreases according to the number of training samples. For instance, in [47], the testing accuracy of the proposed strategy rose from 69.64% (six training samples) to 99.12% (180 training samples). That is, accumulating more training data is beneficial for fault detection and diagnosis.
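The C4.5 algorithm mentioned above for DGA-based diagnosis chooses its splits with the gain ratio criterion. The following sketch evaluates that criterion for a threshold split on a single continuous attribute. It is a simplified illustration only: the gas concentrations and labels are hypothetical, candidate thresholds are taken as midpoints between consecutive values, and the pruning and other refinements of the full algorithm are omitted.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split value <= threshold versus value > threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    p_left, p_right = len(left) / n, len(right) / n
    info_gain = entropy(labels) - p_left * entropy(left) - p_right * entropy(right)
    split_info = -(p_left * log2(p_left) + p_right * log2(p_right))
    return info_gain / split_info

def best_threshold(values, labels):
    """Pick the midpoint threshold with the highest gain ratio."""
    candidates = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(candidates, candidates[1:])]
    return max(midpoints, key=lambda t: gain_ratio(values, labels, t))

# Hypothetical concentrations (ppm) of one dissolved gas and the known condition.
gas_ppm = [10, 15, 20, 80, 95, 120]
condition = ["normal", "normal", "normal", "fault", "fault", "fault"]

threshold = best_threshold(gas_ppm, condition)
ratio = gain_ratio(gas_ppm, condition, threshold)
```

A full tree would apply this selection recursively to each resulting subset and to every available attribute, which is what makes the resulting rules easy to read.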

Another classification method that has been widely used is the FLS. In general, it uses knowledge-based reasoning to construct logical rules and, thus, diagnose faults. In this type of algorithm, the designer's knowledge about the in-test equipment, e.g., operating conditions, nominal parameters, overall performance, etc., plays a fundamental role. In [48], an FLS is designed to diagnose stator winding faults in induction motors. Similar results are obtained under noisy and noiseless conditions. Therefore, an FLS is a good option because there is no general and accurate analytical model that completely describes the induction motor under fault conditions, leaving the door open to uncertainties or noisy conditions. Amezquita-Sanchez et al. [49] present two FLSs to detect broken rotor bars (BRB) in both regimes of operation, i.e., transient and stationary. The combination of fractal dimension analysis and an FLS proved to be highly effective at identifying half-BRB, one BRB, and two BRB, as well as the healthy condition, since an effectiveness of 95% and 100% is obtained for the start-up transient and the steady state, respectively. For transformers, Islam et al. [50] present the diagnosis of several transformer faults using dissolved gas in oil analysis (DGA) and an FLS for its interpretation. An overview of different FLSs for DGA is presented in [51], where it is indicated that no single technique can detect the full range of faults; therefore, the combination of different methods has to be explored as a promising solution. Although promising results have been obtained using FLSs, the superiority of an adaptive neuro-fuzzy inference system for DGA is reported in [52], which obtains an accuracy of 98% over the 100 fault cases under study, whereas FL obtains 95%. Regarding other electric systems and equipment, the fault diagnosis of the power system using fuzzy logic is presented in [53]. 
An online monitoring system of voltage variations in electric systems is presented in [54], where an FLS is used to diagnose and classify instantaneously, i.e., sample to sample, the severity of the electric variation. The proposal is a suitable tool for analyzing stored data; furthermore, unlike the conventional root mean square technique, it provides phase information, and it gives results sample to sample, which is better for nonstationary signals. Lauro et al. [55] diagnose an electric fan coil, and Zio et al. [56] classify the faults of a steam generator of a pressurized water reactor. In the latter, a fuzzy clustering-based classification model is transformed into a fuzzy logic inference model, allowing its direct interpretation and inspection; improvements in obtaining the model are also presented to allow the treatment of more complicated scenarios.
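The rule-based reasoning of an FLS can be sketched with a few membership functions and a weighted-average output. The example below grades the severity of a voltage deviation sample by sample; it is only a minimal illustration, and the fuzzy sets, their breakpoints, the severity scores, and the zero-order Sugeno style of inference are all assumptions made here, not the design used in [54] or in any other cited work.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical fuzzy sets for the voltage deviation from nominal, in percent.
SETS = {
    "small": (-10.0, 0.0, 10.0),
    "medium": (5.0, 15.0, 25.0),
    "large": (20.0, 40.0, 60.0),
}

# Hypothetical rule consequents: one crisp severity score per fuzzy set,
# i.e., rules of the form "IF deviation is medium THEN severity is 0.5".
SEVERITY = {"small": 0.0, "medium": 0.5, "large": 1.0}

def severity(deviation_pct):
    """Weighted average of the rule outputs (zero-order Sugeno inference)."""
    weights = {name: triangular(deviation_pct, *abc) for name, abc in SETS.items()}
    total = sum(weights.values())
    if total == 0.0:
        return None  # the input lies outside every fuzzy set defined above
    return sum(weights[name] * SEVERITY[name] for name in weights) / total

# Each incoming sample can be graded independently of the others.
graded = [severity(d) for d in (0.0, 7.5, 15.0)]
```

The designer's knowledge enters through the choice of sets and rule outputs, which is why the text stresses the role of expertise about the equipment under test.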

Table 1 shows a summary of the above-reviewed works, presenting the techniques used and their conventional applications, along with the physical variables that have been analyzed. As can be observed, NNs, decision trees, and FLSs are the most commonly used methods for fault detection. Although NNs can be more suitable for fault detection from a generalization viewpoint, decision trees have been preferred in many cases because of the clarity of their interpretation (human friendly) and their low computational burden, which are desirable features in online condition monitoring systems. Also, if the amount of data is limited, a simple decision tree can be used; yet, other aspects of such a small dataset have to be taken into account, for instance: redundancy of data, data imbalance, information contained, data type (continuous or discrete), range, time dependency, etc. Regarding the physical variable measured from the in-test equipment, current signals prove to be a powerful and representative source of information for fault detection. Although promising results are obtained, the combination of multiple physical variables, e.g., current and vibration signals, should be explored in order to improve the reliability of new classification schemes and expand the number of fault conditions that can be determined by a single classification algorithm, exploiting the information that each signal can provide, e.g., current signals can provide information to diagnose electrical faults and vibration signals can provide information to diagnose mechanical faults.

2.2. Regression-Based Methods

In general, regression techniques are used for numerical prediction, i.e., to generate a mathematical function or model that predicts missing or new numerical data values; they also cover the identification of distribution trends based on the available data. For the latter, the support vector machine (SVM) has been widely used, since a regression function is found from the training dataset.

Among the available research works, SVMs have been presented in the literature as one of the most promising methods to diagnose faults in power transformers [57]. Lv et al. [58] and Bacha et al. [59] implement SVM-based strategies to classify faults in power transformers by using the gases available from the DGA. Both works present an interesting performance comparison among different methods. In [58], five artificial intelligence methods are presented. It is found that the SVM is the most effective and fastest method, obtaining an accuracy of 100% and a training time of less than 1 s, compared with the NN (92.76% accuracy and 81 s training time), the expert system (89.34% accuracy; training time not reported), FL (92.32% accuracy and 82 s training time), and the combined NN and expert system (93.54% accuracy and 44 s training time). In [59], the classification accuracy of FL (86.7%), multi-layer perceptron (80%), radial basis function (86.7%), and SVM (90%) is presented. In a similar vein, SVMs are also explored for the detection and localization of faults in transmission lines, where Johnson and Yadav [60] and Parikh et al. [61] conclude that SVMs are a highly accurate method for these tasks. Zhang et al. [62] present an SVM-based methodology for data-based line trip fault prediction in power systems, where long short-term memory (LSTM) networks are used to capture time series characteristics from multiple sources in large systems. The accuracy of the line trip fault prediction can reach about 97%. SVMs have also been employed in the diagnosis of induction motors, e.g., Gangsar and Tiwari [63] carry out a comparative investigation to predict mechanical and electrical faults in induction motors from the analysis of vibration and current signals and the use of multiclass SVM methods. Zhang et al. 
[64] propose a method based on the robust local linear embedding algorithm and an SVM for the diagnosis of gear faults in an experimental setup composed of a motor, a torque transducer/encoder, and a dynamometer. The diagnosis of fault severity in the stator winding of induction motors using an SVM in regression mode is presented by Das et al. [65]. In their research, they analyze the current signals for different levels of short circuit fault, different unbalance conditions in the voltage supply, and different load levels. In the methodology, they use recursive feature elimination to select the optimum number of features and an SVM as a load-immune classifier, demonstrating the high capabilities of SVMs. Other electric machines where the SVM has been applied include heating, ventilation, and air conditioning (HVAC) systems [66,67] and the steam generator and pressure boundary of the Chinese CNP300 PWR (Qinshan I NPP) reactor coolant system [68]. For the latter, a specialized SVM module monitors the subunits of the reactor coolant system and is capable of making fault diagnoses at the component level. Finally, Lai et al. [69] investigate partial discharge activities for online monitoring of power equipment. They use a back-propagation NN, a self-organizing map, and an SVM for classification and comparison, concluding that the SVM is the best method in terms of classification accuracy and processing speed.
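The margin-based learning behind the SVMs discussed above can be sketched with a linear soft-margin classifier trained by sub-gradient descent on the regularized hinge loss. This is a minimal sketch under stated assumptions: the cited works use kernel SVMs and dedicated solvers, whereas the training routine, the hyperparameters (`lam`, `lr`, `epochs`), and the standardized feature values below are hypothetical and chosen only for illustration.

```python
def train_linear_svm(samples, lam=0.01, lr=0.1, epochs=200):
    """Minimize lam/2 * ||w||^2 + mean hinge loss; labels are +1 or -1."""
    n_features = len(samples[0][0])
    w = [0.0] * n_features
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]  # gradient of the regularization term
        grad_b = 0.0
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1.0:  # sample inside the margin or misclassified
                for j, xj in enumerate(x):
                    grad_w[j] -= y * xj / n
                grad_b -= y / n
        w = [wi - lr * gi for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Return +1 (faulty) or -1 (healthy) from the sign of the decision function."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

# Hypothetical standardized (zero-mean) features: (features, label).
samples = [
    ((2.0, 2.0), 1), ((3.0, 3.0), 1),
    ((-2.0, -2.0), -1), ((-3.0, -3.0), -1),
]

w, b = train_linear_svm(samples)
training_predictions = [predict(w, b, x) for x, _ in samples]
new_prediction = predict(w, b, (1.0, 0.5))
```

Only the samples that fall inside the margin contribute to each update, which reflects the reliance of SVMs on a small set of support vectors; once trained, a prediction costs a single dot product, consistent with the short diagnosis times reported in the reviewed works.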

Some other approaches related to regression models include Poisson regression, least-squares regression, and logistic regression [70]. Jena and Bhalja [71] use a logistic regression binary classifier to develop a new fault zone identification scheme for busbars, verified by modeling an existing power generation station in a design software package. The proposed scheme is able to identify the fault zone with an accuracy of 99% when it is tested on a large dataset (28,800 cases) after training on a small dataset (9600 cases). In the diagnosis of power systems, Xu and Chow [72] report the results obtained with two different techniques, i.e., logistic regression and artificial NN, for the identification of the cause of faults in power distribution systems. Logistic regression is a parametric model that is rarely used in power system fault diagnosis, while the artificial NN is a nonparametric method that has been extensively used in this field. Logistic regression, as a conventional statistical method, has formalized models to exhibit the nonlinear relationship between the independent and dependent variables, while the artificial NN can increase its flexibility by including hidden layers, which is often regarded as a substantial advantage. They conclude that both can be easily implemented. As seen from the results, the artificial NN can achieve higher balanced accuracy than logistic regression; however, logistic regression is much faster because the artificial NN requires a relatively long training time and cross-validation requires an even longer computation time. Regarding linear regression-based methods, the work of Cha et al. [73] presents the diagnosis and detection of faults in the main engine of a space shuttle during a stable state. 
Within the automotive industry, Jiang and Yin [74] present a new design and implementation approach based on recursive total principal component regression for efficient data-driven fault detection in automobile cyber-physical systems. Meanwhile, Bolovinou et al. [75] address the problem of predicting the distance that an electric vehicle can be driven before an energy recharge is required. Because the model is online, the prediction can be made at any distance traveled from the beginning of the trip, which is achieved through a regression analysis. Using least-squares linear regressions, Cappiello et al. [76] present a statistical model to predict the instantaneous emissions and fuel consumption of light-duty vehicles. Yu et al. [77] provide theoretical support for the prediction of faults in highway electromechanical equipment through a panel data model-based multi-factor predictive model. This model is characterized by a two-dimensional multivariate regression analysis based on individuals and time. Emphasizing the intelligent diagnosis of faults, the classification and regression tree (CART) is used by Gopinath et al. [78] as a back-end classifier to diagnose synchronous generators. The statistical characteristics of the frequency domain are extracted from the current signals of the in-test generators. According to the work presented by Bangura et al. [79], the hidden patterns and nuances of differences between healthy performance signatures and several fault signatures can be identified using time-series DM for the diagnosis of eccentricities and bar/end-ring connector breakages in polyphase induction motors. In a more general scenario, Wang and Jiao [80] propose a method of quality-related failure prediction by constructing a total principal component regression model, which divides the space of the variables into two subspaces, only one of which is related to the quality fault.
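Logistic regression used as a binary classifier, together with the balanced accuracy metric mentioned in the comparison of Xu and Chow [72], can be sketched as follows. This is a minimal illustration under stated assumptions: the one-dimensional feature, the labels, the learning rate, and the function names are hypothetical, and plain gradient descent is used here instead of the fitting procedures of the cited works.

```python
from math import exp

def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + exp(-z))

def train_logistic(xs, ys, lr=0.5, epochs=500):
    """Fit p(y = 1 | x) = sigmoid(w * x + b) by gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b

def balanced_accuracy(y_true, y_pred):
    """Mean of the true positive rate and the true negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    positives = sum(y_true)
    negatives = len(y_true) - positives
    return 0.5 * (tp / positives + tn / negatives)

# Hypothetical standardized feature and binary fault label (1 = fault).
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b = train_logistic(xs, ys)
predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x in xs]

# Balanced accuracy on a hypothetical imbalanced test set:
# 8 faults all detected, 2 normal cases of which one is misclassified.
y_true = [1] * 8 + [0, 0]
y_pred = [1] * 8 + [1, 0]
plain_acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
```

In the second example the plain accuracy is 0.9 while the balanced accuracy is 0.75, which shows why a balanced metric is the more informative choice when one class dominates the dataset.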

Table 2 summarizes the above-reviewed information, including the effectiveness percentage of each method; from this information, it is evident that SVMs are among the most used methods for fault detection. Many authors agree that SVMs are more robust than other algorithms and satisfy the minimization of structural risk; yet, their effectiveness relies on the features and preparation of the data. In addition, they have a high correct identification rate according to the reported effectiveness percentages. In several works, SVMs have presented a better performance than NNs. These works highlight that SVMs reach the global optimum in a more direct way, are less prone to overfitting, yield a smaller computational model, etc. As with other algorithms such as NNs, once the training stage has been carried out, the computational time to perform an SVM-based diagnosis is relatively short, making it suitable for online and continuous diagnosis of electric equipment. Although SVMs have presented a low computation cost/time in several works, this cost cannot be fairly compared with that of other methods unless aspects such as the effectiveness reached, overfitting issues, robustness, number of hidden layers and neurons per layer, number of nodes, model complexity, activation and kernel functions used, and training algorithms, among others, are taken into account.

2.3. Hybrid Techniques

It is common to find research in which the authors not only use predictive techniques but also combine different algorithms to obtain models or methods that offer better results, including greater precision and efficiency, as well as better handling of data. This section deals with works whose authors use more than one method, combining classification and regression techniques, as well as other methods that do not belong to the predictive modeling of DM. The use of hybrid techniques, i.e., techniques that combine different methods, is frequently observed in the diagnosis of equipment, mainly motors, transformers, and electric vehicles, as described below.

In the extensive field of motors, Seera et al. [81] use a hybrid of the fuzzy min–max (FMM) neural network and the classification and regression tree (CART), known as FMM–CART, to perform rule extraction and data classification in order to detect and classify faults under different motor conditions. They report the overall accuracy rates for five motor conditions (healthy, broken rotor bars, unbalanced voltages, eccentricity, and stator winding faults). FMM presented the lowest accuracy, 93.62%, while CART and FMM–CART achieved 98.11% and 98.25%, respectively, for multiple motor conditions, with computation times of 0.21 s, 0.92 s, and 0.96 s, respectively. Two years later, in 2014, Seera and Lim [82] implement this hybrid model and conclude that it can produce accurate predictions of motor failures in an online learning environment. In addition, the reported results of the model are better than those of CART, FMM, and the multilayer perceptron. In the noisy test, the multilayer perceptron and FMM presented 78.39% and 94.88% accuracy, whereas FMM–CART and CART achieved stable results with 96.54% and 97.82% accuracy. The multilayer perceptron structure was the most complex, with 30 hidden nodes, whereas FMM produced 12 nodes (hyperboxes). FMM–CART and CART created eight and six leaves, respectively. The computational time of FMM was only 0.13 s. The multilayer perceptron consumed the longest time (2.08 s), whereas FMM–CART and CART used almost 1 s. The CART method combined with an adaptive neuro-fuzzy inference system is presented by Tran et al. [83]. They use current and vibration signals from the induction motor for fault diagnosis; additionally, a hybrid of the back-propagation and least-squares algorithms is used to adjust the parameters of the membership functions. The total classification accuracy was 91.11% and 76.67% for vibration and current signals, respectively. Other works, such as the one presented by Pramesti et al. [84], involve the identification of stator failures in induction motors using multinomial logistic regression analysis and the Wavelet Transform (WT). Júnior et al. [85] use a multiple linear regression modeling technique along with analysis of variance and genetic algorithm optimization to obtain classification models to diagnose three-phase induction motors under normal and short-circuit conditions. The method presents hit rates greater than 95% in the diagnosis of the normal and incipient short-circuit fault conditions, even at different motor load levels. In addition to its low cost and simplicity, this method does not require physical access to the machine because the current and voltage can be measured from the motor control board. Thus, the probability of accidents involving personnel is significantly reduced. Unlike several other reported methods for fault diagnosis, the proposed approach requires few data and only uses simulation data to construct the expert system. Seshadrinath et al. [86] propose an algorithm consisting of two parts: in the first one, the optimal size of the structure of the Probabilistic Neural Network (PNN) is determined using an orthogonal least-squares regression algorithm; in the second part, the fusion of a Bayesian classifier is recommended as an effective solution to diagnose incipient interturn faults in the machine. To track the health status of a degraded system and predict the remaining service life of a turbofan engine, Zhou et al. [87] propose a method that combines the echo state kernel recursive least-squares algorithm and a Bayesian technique, which demonstrates excellent performance in long-term prediction.
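
To illustrate how a CART-type tree separates motor conditions, the following minimal sketch searches for the threshold on a single feature that minimizes the weighted Gini impurity, which is the splitting criterion commonly associated with CART. It is not the FMM–CART implementation of the cited works; the feature values, the labels, and the function names `gini` and `best_split` are assumptions introduced only for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) of the best binary split on one feature."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split: impurity of the whole node
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Hypothetical feature: amplitude of a fault-related spectral component
amplitude = [0.10, 0.12, 0.15, 0.40, 0.45, 0.50]
condition = ["healthy", "healthy", "healthy", "faulty", "faulty", "faulty"]
threshold, impurity = best_split(amplitude, condition)
```

A full tree would apply `best_split` recursively to every feature and to each resulting subset until the leaves are sufficiently pure.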

To obtain an effective diagnosis of faults in automotive systems, intelligent monitoring schemes of the vehicle's condition are needed. In this regard, Choi et al. [88] develop three new approaches for the fusion of classifiers in order to reduce the error rate. These approaches are joint optimization of the fusion center and individual classifiers, class-specific Bayesian fusion, and dynamic fusion; the authors demonstrate that the proposed techniques surpass individual classifiers such as PNN, k-Nearest Neighbor (kNN), or principal components analysis. A fault detection scheme for applications in the automotive industry is presented by Jakubek and Strasser [89]. They achieve the detection of faults by using kernel regression techniques and a NN. The resulting network uses significantly fewer basis functions than a radial basis function network with the same accuracy. Oliva et al. [90] present a model-based approach to predict the remaining driving range by combining particle filtering and Markov chains and by implementing detailed models of the battery, electric motor, and vehicle dynamics. Tseng and Chau [91] and Grubwinkler and Lienkamp [92] study different methodologies for the prediction of electric vehicle energy consumption. In particular, Tseng and Chau compare three approaches: (1) approaches based on driver/vehicle/environment-dependent factors using speed profile matching and driving habit matching, (2) an approach based on comparison with the average using personalized adjustment, and (3) a collaborative filtering approach that uses matrix factorization. Grubwinkler and Lienkamp, in turn, use the least-mean-square algorithm for the prediction of the mean energy consumption. For a broad overview of the methodologies used in estimation strategies related to the battery, control, and energy management of both hybrid and electric vehicles, the work of Cuma and Koroglu [93] is recommended.
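
As a rough illustration of the least-mean-square (LMS) idea mentioned above, the sketch below adapts the weights of a linear predictor with the standard LMS update w ← w + μ·e·x. It is a generic textbook version, not the method of the cited work; the trip features, the synthetic consumption values, the step size, and the function name `lms_fit` are assumptions made for the example.

```python
def lms_fit(samples, targets, step=0.1, epochs=300):
    """Least-mean-square (LMS) adaptation of a linear predictor y ~ w . x."""
    weights = [0.0] * len(samples[0])
    for _ in range(epochs):
        for x, y in zip(samples, targets):
            prediction = sum(w * xi for w, xi in zip(weights, x))
            error = y - prediction
            # Stochastic-gradient step on the squared prediction error
            weights = [w + step * error * xi for w, xi in zip(weights, x)]
    return weights

# Hypothetical trip features: [distance in km, elevation gain in units of 100 m]
trips = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Synthetic, noise-free consumption (kWh) built from 0.2 kWh/km and 0.5 kWh per 100 m
energy = [0.2, 0.5, 0.7]
weights = lms_fit(trips, energy)
```

With noise-free data and a sufficiently small step size, the weights approach the coefficients used to generate the data; with real measurements, the step size trades convergence speed against steady-state error.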

For transformers, preventive maintenance is very often emphasized. Liao et al. [94] use the least-squares SVM (LS-SVM) and particle swarm optimization to optimize the regression parameters for the diagnosis of oil-immersed transformers by using dissolved gases. A comparison with the back-propagation neural network, radial basis function neural network, generalized regression neural network, and support vector regression is carried out. The advantages of the regression model include those inherited from support vector regression, i.e., a unique solution and the support of statistical learning theory. In subsequent years, the wavelet technique is fused with the LS-SVM by Zheng et al. [95] and Zhang et al. [96], also to diagnose transformers. From the analysis and interpretation of the data generated by the concentration of dissolved gases, Yang and Hu [97] propose a fault diagnosis system that combines a back-propagation NN and a multinomial logistic regression model. Al-Janabi et al. [98] also propose a hybrid system to diagnose transformers. The proposal is based on genetic algorithms and neural networks; in general, it provides information to identify the exact fault in the transformer and its fault state. Fei and Zhang [99] also make use of genetic algorithms along with SVM for fault detection in transformers. Unlike the abovementioned works, Koley et al. [100] use the WT to extract features of the impulse test response of a transformer in the time and frequency domains and the SVM in regression mode to classify transformer faults. It should be pointed out that the SVM tool trained with only simulated data was capable of predicting fault classes accurately when the analog data were presented to the trained SVM for fault prediction.
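
The unique solution mentioned above follows from the fact that LS-SVM training reduces to one linear system, [[0, 1ᵀ], [1, K + I/γ]]·[b; α] = [0; y], in the standard formulation. The following minimal sketch solves that system for a one-dimensional regression problem with a Gaussian kernel. The data, the fixed values of γ and σ, and all function names are assumptions for illustration; in the cited work, such parameters are tuned by particle swarm optimization rather than fixed by hand.

```python
import math

def rbf(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))

def solve(matrix, rhs):
    """Solve a dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def lssvm_fit(xs, ys, gamma=100.0, sigma=1.0):
    """Train LS-SVM regression: solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(xs)
    system = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf(xs[i], xs[j], sigma) for j in range(n)]
        row[i + 1] += 1.0 / gamma  # regularization on the diagonal
        system.append(row)
    solution = solve(system, [0.0] + list(ys))
    return solution[0], solution[1:]  # bias b, multipliers alpha

def lssvm_predict(x, xs, alpha, bias, sigma=1.0):
    """Evaluate f(x) = b + sum_i alpha_i * k(x, x_i)."""
    return bias + sum(a * rbf(x, xi, sigma) for a, xi in zip(alpha, xs))

# Hypothetical data: a gas concentration (ppm) at successive sampling instants
times = [0.0, 1.0, 2.0, 3.0, 4.0]
ppm = [10.0, 12.0, 15.0, 19.0, 24.0]
bias, alpha = lssvm_fit(times, ppm)
fitted = [lssvm_predict(t, times, alpha, bias) for t in times]
```

Because every training sample receives a multiplier, the LS-SVM model is not sparse, which is the usual price paid for replacing the quadratic program of the standard SVM with a linear system.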

For the diagnosis of faults in centrifugal pumps, Yunlong and Peng [101] present a new method based on LS-SVM and the empirical mode decomposition. In the case of monoblock centrifugal pumps, Sakthivel et al. [102] use a decision tree-fuzzy hybrid system. On the test dataset, the classification accuracy was 99.3% for the decision tree-fuzzy method, 97.50% for the rough set-fuzzy method, and 96.67% for the PCA-based decision tree-fuzzy method. For the same task, in [103], Muralidharan and Sugumaran use wavelet analysis, the Naïve Bayes (NB) algorithm, and the Bayes network algorithm. In [104], they apply the J48 algorithm and the continuous wavelet transform (CWT). The sym3, rbio2.6, and coif1 mother wavelets are the most suitable for fault diagnosis of centrifugal pumps, reaching a classification accuracy of 100%. Finally, in [105], they use the SVM and the CWT. In this case, bior3.7_17 is the wavelet that gives the maximum classification accuracy (99.76%). Hence, it can be considered the best wavelet, as it has the maximum fault-discriminating capability for the system under study. The WT has also been used in electric power distribution systems. Jamil et al. [106] implement an algorithm based on fuzzy logic that uses the DWT to identify 10 different types of faults in an electrical power distribution system. For high-impedance fault detection in electrical distribution networks, the WT extracts dynamic characteristics to feed a decision-making system based on SVM [107]. The SVM is also used along with the Hilbert-Huang transform, which decomposes the voltages of transmission lines into intrinsic mode functions [108], for fault classification in power systems. The main contribution of the proposed algorithm is the possibility of its application to any transmission line, regardless of the line configuration, with no need for re-training at different load values, voltage levels, and fault resistances.
In 2018, Singh and Vishwakarma [109] present a methodology to classify cross-country faults in series-compensated double-circuit transmission lines. This method is based on EMD and three different classifiers: SVM, NB, and PNN. The effectiveness is 95% for SVM, 91.66% for NB, and 96.7% for PNN, with response times of 0.03 s, 0.012 s, and 0.016 s, respectively. Da Silva et al. [110] apply qualitative trend analysis and NB for the diagnosis of multiple failures in transmission lines. This hybrid diagnosis system can be generalized to deal with other types of faults along the transmission line.

Regarding other machines, Lin and Horng [111] use a scheme for the classification and detection of faults in an ion implanter, proposing a hybrid classification tree, i.e., they combine a grouping algorithm with CART. They indicate that their methodology is general and can be applied to other machines by simply modifying the warning generation criteria. For fault detection in components of nuclear power plants, statistical methods have been used. Di Maio et al. [112] use a set of auto-associative kernel regression models, a hybrid approach based on correlation analysis, a genetic algorithm, and a sequential probability ratio test to detect faults, taking as a case study a coolant pump of a typical pressurized water reactor. Liangyu et al. [113] propose an artificial NN combined with optimal zoom search to recognize various degrees of failure in a high-pressure feedwater heater system. The classification of the healthy and defective conditions of a face milling tool is done through the acquisition of sound signals using the discrete WT (DWT) and the J48 algorithm, which is a decision tree technique [114]. On the other hand, to detect and diagnose faults in HVAC systems, Du et al. [115] combine NNs and clustering analysis.

Table 3 lists the works that have presented hybrid techniques for the detection of faults and diagnosis of the abovementioned electric equipment and systems. According to the information shown in Table 3, two different techniques are combined on average to perform the diagnosis, where not only DM techniques are implemented, but other signal processing algorithms are also used to extract or highlight features contained in the analyzed signals in order to simplify the fault classification task. The main techniques found are the WT and the EMD. While the works that use the WT exploit its capability for time-frequency decomposition in a symmetric way, the works that use EMD exploit its capability to decompose a signal in an adaptive way. In this regard, EMD has been preferred in many works since a priori information about the analyzed signal is not needed. From this point of view, other recent schemes based on EMD, such as down-sampling EMD [116], which is a method that provides specific advantages over EMD, should be explored in the field of fault detection in electric equipment. The WT has also been widely used to remove unwanted noise from electrical signals. This noise is generated by acquisition systems, sensors, or any electronic device. Regarding DM techniques, FLSs have presented suitable results under noisy conditions in the input signals, since this noise can be treated in a similar way to the uncertainties of the input data, whose handling is an inherent ability of FLSs.
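
The noise-removal role of the WT can be sketched with the simplest wavelet, the Haar wavelet: the signal is split into approximation and detail coefficients, small detail coefficients are shrunk toward zero, and the signal is reconstructed. The code below is a minimal one-level illustration, not the scheme of any cited work; the sample values, the threshold, and the function names are assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(signal):
    """One-level Haar DWT of an even-length signal: (approximation, detail)."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward and return the reconstructed signal."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return signal

def soft_threshold(coeffs, limit):
    """Shrink coefficients toward zero; those below the limit vanish."""
    return [math.copysign(max(abs(c) - limit, 0.0), c) for c in coeffs]

def denoise(signal, limit):
    """Remove small detail coefficients and reconstruct the signal."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, limit))

# Hypothetical noisy samples of a slowly varying current envelope
noisy = [1.02, 0.98, 2.01, 1.99, 3.03, 2.97, 4.00, 4.00]
clean = denoise(noisy, limit=0.1)
```

In practice, several decomposition levels and smoother mother wavelets are normally used, and the threshold is chosen from an estimate of the noise level.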

3. Recent Methods for General Applications

In the literature, various articles on the most recent research into classification and regression algorithms, which can be used in different areas of application, have been presented. Djeffal et al. [117] present a method based on filtering and revision stages to delete samples that have little influence on the learning results of an SVM, where the goal is to reduce training time without losing accuracy. This strategy could be used for handling and reducing huge databases before the application of any other algorithm. Zhao et al. [118] tackle the challenging problem of classification in the presence of label noise. In this regard, they propose a Markov chain sampling framework that robustly learns effective classifiers and accurately identifies mislabeled instances. Hwang and Son [119] propose a prototype-based classification that selects some data from a dataset for the development of learning rules and prediction, demonstrating that the proposed approach outperforms other classifiers such as the Bayes classifier and the nearest neighbor. Regarding Bayesian approaches, Zhang et al. [120] present a probability density estimation approach based on the nonparametric kernel mixing model in order to estimate reliable class-conditional probability functions; in general, the proposed Bayesian classifier consists of three steps: partitioning, structure learning, and estimation of probability density functions. Zhang et al. [121] propose a learning scheme that offers a recursive algorithm to explore the distribution of class density for Bayesian estimation and an automated approach to select powerful discriminant functions for the classification of high-dimensional data, while Celotto [122] proposes a unified visual approach to compare and classify a large subset of Bayesian confirmation measures. In the work of Becker et al. [123], analytical and approximate inference methods are discussed to calculate the marginal probabilities of Bayes factors, providing guidance on the interpretation of results and offering new types of analysis to study sequential data in many application areas.

Regarding regression analysis, Le et al. [124] present the geometric-based online Gaussian process, which can scale to massive datasets, guaranteeing that the proposed algorithm produces a good enough solution (close to the optimal one) and a fast online regression. Marx and Vreeken [125] present an information theory-based approach using the Kolmogorov complexity and the principle of minimum description length to provide a practical solution to the problem of inferring the direction of causal dependence from observational data. Rudaś and Jaroszewicz [126] analyze two uplift modeling approaches for linear regression and identify the situations in which each model works best; in fact, they propose a third model that combines the benefits of both approaches. Liang et al. [70] propose a model called heterogeneous-target robust mixture regression, which addresses the challenges and practical concerns of joint learning for multiple objectives (multi-task learning) by managing mixed types of objectives simultaneously, imposing structural constraints on each component of the mixture, and adopting robustness strategies.

On the other hand, Chen and Guestrin [127] describe a scalable end-to-end tree boosting system (XGBoost) and propose a sparsity-aware algorithm for sparse data, also providing insights on cache access patterns, data compression, and sharding to construct a scalable tree boosting system. They claim that XGBoost is widely used by data scientists to achieve cutting-edge results in many machine learning challenges. Teinemaa et al. [128] evaluate the temporal stability and prediction accuracy of different existing predictive process monitoring methods, finding that the methods based on XGBoost and LSTM exhibit the highest temporal stability. In relation to NNs, Baldi [129] studies the internal and external approaches for the design of recursive neural architectures. Zhang et al. [130] address the problems of intelligent fault diagnosis when the training and testing data do not come from the same distribution by using domain adaptive convolutional NNs. Bouguelia et al. [131] propose an adaptive algorithm to continuously update a system of neurons through the extension of the growing neural gas algorithm with three complementary mechanisms, which allows one to closely monitor gradual and sudden changes in the distribution of data. The imbalanced data problem is addressed by Xi et al. [132]. They propose the least-squares support vector machine for class imbalance learning by evaluating two parameters of misclassification costs; also, the Cholesky factorization is used to enhance computational stability. In order to reduce the estimation error in online sequential extreme learning machine systems, Lu et al. [133] present a new training approach based on the Kalman filter. Although the last two works have been applied to fault detection in aircraft engines, they can be used in other machines.
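
To show why the Cholesky factorization is attractive for the symmetric positive definite systems that arise in least-squares kernel methods, the following minimal sketch factors a matrix as L·Lᵀ and solves the system by forward and backward substitution. The 2 × 2 matrix, the right-hand side, and the function names are assumptions for illustration and are not taken from the cited work.

```python
import math

def cholesky(matrix):
    """Lower-triangular L with L * L^T == matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower

def cholesky_solve(matrix, rhs):
    """Solve matrix * x = rhs through forward and backward substitution."""
    lower = cholesky(matrix)
    n = len(rhs)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = rhs
        y[i] = (rhs[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # backward substitution: L^T x = y
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x

# Hypothetical regularized kernel matrix and right-hand side
A = [[4.0, 2.0], [2.0, 3.0]]
b = [10.0, 8.0]
x = cholesky_solve(A, b)
```

The factorization needs no pivoting for positive definite matrices, which is one reason it is a common choice in this setting.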

Table 4 shows a compendium of the abovementioned methods. They are grouped by year in order to show their chronological appearance and highlight the latest algorithms or strategies proposed in the literature to solve DM issues or improve DM tasks. As these methods can address general applications, their investigation and integration into fault detection methodologies for electric equipment and systems are recommended. For instance, the least-squares SVM is useful for imbalanced data, i.e., when there is a disproportionate ratio of observations in each class, and Markov chain sampling is a useful tool for mislabeled data.
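
The effect of class imbalance can be made concrete with a small sketch. It uses a common inverse-frequency weighting heuristic, weight = N / (K · n_c) for N samples, K classes, and n_c samples in class c, which is an assumption chosen for illustration and not the cost parameters of the cited least-squares SVM. The dataset and the function names are hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}

def weighted_error(y_true, y_pred, weights):
    """Misclassification rate where each sample counts with its class weight."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total

# Hypothetical dataset: 90 healthy records and 10 faulty ones
labels = ["healthy"] * 90 + ["faulty"] * 10
weights = balanced_class_weights(labels)
# A trivial classifier that always answers "healthy"
always_healthy = ["healthy"] * len(labels)
plain_error = sum(t != p for t, p in zip(labels, always_healthy)) / len(labels)
cost_error = weighted_error(labels, always_healthy, weights)
```

Here the trivial classifier looks acceptable under the plain error rate but poor under the weighted one, which is the behavior that cost-sensitive training aims to penalize.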

4. Conclusions

The development of efficient and reliable methodologies represents an extremely important task for researchers and developers of diagnosis systems; in order to contribute to the solution of this task, DM techniques have been widely used. To offer the reader an overview of the DM techniques used in recent years for the detection of faults and diagnosis of electrical equipment and systems, this paper provides a general review that can facilitate informed decision-making for specific applications. All the details and results obtained by the authors cited here can be consulted directly in the corresponding references.

Although certain techniques have been consistently used for specific applications, e.g., SVMs in transformers, the selection of an appropriate DM technique for either classification or regression will depend on many factors, e.g., monitoring technology, features of the data, and knowledge about the operation of the system under test. However, it is important to take into account that the more complex and robust the systems, the greater the amount and variety of data produced, and the more difficult the detection of faults and the diagnosis. Additionally, the researcher has to be informed about the features of specific DM algorithms so that, through their implementation, the information contained in the acquired data can be exploited.

It is extremely difficult for a single technique to detect the full range of faults of a system in a 100% reliable way. Each method has its own strengths and weaknesses. Outputs from various diagnostic methods must be aggregated into an overall evaluation system; thus, instead of using one diagnostic method, intelligent hybrid methods that combine the strengths of each method can be developed. In the literature, there are many articles using hybrid techniques in order to increase the efficiency, accuracy, reliability, and speed of their models. On the one hand, different signal processing techniques have been used for the pre-processing of data. This pre-processing allows highlighting and extracting information from raw data. Typical operations are denoising, frequency or mode decomposition, and space transformation, where the WT- and EMD-based methods have demonstrated promising results. On the other hand, the combination of different DT algorithms has also been explored in order to take advantage of their individual benefits. Wu et al. [134] present an important analysis of the 10 most influential DM algorithms in the research community, namely C4.5, CART, PageRank, k-Means, kNN, Apriori, AdaBoost, Expectation-Maximization, NB, and SVM. It should be noted that these algorithms cover statistical learning, classification, clustering, link mining, and association analysis. Despite the promising results obtained in many works and the available knowledge about both the system under test and the analyzed data, it is difficult or impossible to conceive a perfect algorithm in terms of accuracy, speed, or complexity for specific applications, mainly considering that even similar applications can have many different requirements; in this regard, the design and development of new algorithms and methods are still of paramount importance.

Also, special attention has to be given to equipment related to renewable energy sources, such as wind turbines, photovoltaic systems, power converters, and energy storage systems, due to their rapid development and growth [135,136]. For instance, the void defects evolving into damage in wind turbine blades are investigated in [137]. The improvement of photovoltaic and wind power storage systems based on the prediction of battery life and its faults using SVMs is presented in [138]; in these systems, the correct operation of batteries is fundamental. It is clear that all the elements of a system are important and that research on specialized fault detection methodologies for the individual elements and for the system as a whole is critical for the maintenance and repair of the system.

Some recommended directions for future research are: (i) fusion and analysis of multiple physical variables as sources of information about a specific piece of equipment, (ii) exploration and integration of recent algorithms to improve the quality of data before the application of a DT-based algorithm, (iii) development of practical hardware solutions for online and real-time fault diagnosis, and (iv) detection of incipient faults.

Author Contributions

Conceptualization, A.C.-V. and M.V.-R.; investigation, resources and visualization, A.C.-V., J.P.A.-S. and D.G.-L.; funding acquisition, J.P.A.-S. and M.V.-R.; Writing—original draft, review & editing, all the Authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the “Consejo Nacional de Ciencia y Tecnología (CONACYT)” under the scholarship 892305.

Conflicts of Interest

The authors declare no conflict of interest.

References

Adil, A.M.; Ko, Y. Socio-technical evolution of Decentralized Energy Systems: A critical review and implications for urban planning and policy. Renew. Sustain. Energy Rev. 2016, 57, 1025–1037.
García-Villalobos, J.; Zamora, I.; San Martín, J.I.; Asensio, F.J.; Aperribay, V. Plug-in electric vehicles in electric distribution networks: A review of smart charging approaches. Renew. Sustain. Energy Rev. 2014, 38, 717–731.
Valtierra-Rodriguez, M. Fractal dimension and data mining for detection of short-circuited turns in transformers from vibration signals. Meas. Sci. Technol. 2019, 31, 025902.
Mejia-Barron, A.; Valtierra-Rodriguez, M.; Granados-Lieberman, D.; Olivares-Galvan, J.C.; Escarela-Perez, R. Experimental data-based transient-stationary current model for inter-turn fault diagnostics in a transformer. Electr. Power Syst. Res. 2017, 152, 306–315.
Mejia-Barron, A.; de Santiago-Perez, J.J.; Granados-Lieberman, D.; Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M. Shannon Entropy Index and a Fuzzy Logic System for the Assessment of Stator Winding Short-Circuit Faults in Induction Motors. Electronics 2019, 8, 90.
Lee, J.H.; Pack, J.H.; Lee, I.S. Fault Diagnosis of Induction Motor Using Convolutional Neural Network. Appl. Sci. 2019, 9, 2950.
Kantardzic, M. Data Mining: Concepts, Models, Methods, and Algorithms, 2nd ed.; Wiley-IEEE Press: Piscataway, NJ, USA, 2011.
Han, J.; Pei, J.; Kamber, M. Data Mining: Concepts and Techniques, 3rd ed.; The Morgan Kaufmann Series in Data Management Systems; Elsevier Science: Waltham, MA, USA, 2011.
Jothi, N.; Rashid, N.A.; Husain, W. Data Mining in Healthcare – A Review. Procedia Comput. Sci. 2015, 72, 306–313.
Paramasivam, V.; Yee, T.S.; Dhillon, S.K.; Sidhu, A.S. A methodological review of data mining techniques in predictive medicine: An application in hemodynamic prediction for abdominal aortic aneurysm disease. Biocybern. Biomed. Eng. 2014, 34, 139–145.
Bellazzi, R.; Zupan, B. Predictive data mining in clinical medicine: Current issues and guidelines. Int. J. Med. Inf. 2008, 77, 81–97.
Ngai, E.W.T.; Xiu, L.; Chau, D.C.K. Application of data mining techniques in customer relationship management: A literature review and classification. Expert Syst. Appl. 2009, 36, 2592–2602.
Ge, Z.; Song, Z.; Ding, S.X.; Huang, B. Data Mining and Analytics in the Process Industry: The Role of Machine Learning. IEEE Access 2017, 5, 20590–20616.
Xu, Y.; Sun, Y.; Wan, J.; Liu, X.; Song, Z. Industrial Big Data for Fault Diagnosis: Taxonomy, Review, and Applications. IEEE Access 2017, 5, 17368–17380.
Odell, S.D.; Bebbington, A.; Frey, K.E. Mining and climate change: A review and framework for analysis. Extr. Ind. Soc. 2018, 5, 201–214.
Ngai, E.W.T.; Hu, Y.; Wong, Y.H.; Chen, Y.; Sun, X. The application of data mining techniques in financial fraud detection: A classification framework and an academic review of literature. Decis. Support Syst. 2011, 50, 559–569.
Christina, A.J.; Salam, M.A.; Rahman, Q.M.; Wen, F.; Ang, S.P.; Voon, W. Causes of transformer failures and diagnostic methods – A review. Renew. Sustain. Energy Rev. 2018, 82, 1442–1456.
De Faria, H.; Costa, J.G.S.; Olivas, J.L.M. A review of monitoring methods for predictive maintenance of electric power transformers based on dissolved gas analysis. Renew. Sustain. Energy Rev. 2015, 46, 201–209.
Sun, H.C.; Huang, Y.C.; Huang, C.M. Fault Diagnosis of Power Transformers Using Computational Intelligence: A Review. Energy Procedia 2012, 14, 1226–1231.
Zheng, Y.; Ouyang, M.; Han, X.; Lu, L.; Li, J. Investigating the error sources of the online state of charge estimation methods for lithium-ion batteries in electric vehicles. J. Power Sources 2018, 377, 161–188.
Liu, Y.; Bazzi, A.M. A review and comparison of fault detection and diagnosis methods for squirrel-cage induction motors: State of the art. ISA Trans. 2017, 70, 400–409.
Prasad, A.; Belwin, E.J.; Ravi, K. A review on fault classification methodologies in power transmission systems: Part-I. J. Electr. Syst. Inf. Technol. 2018, 5, 48–60.
Prasad, A.; Belwin, E.J.; Ravi, K. A review on fault classification methodologies in power transmission systems: Part-II. J. Electr. Syst. Inf. Technol. 2018, 5, 61–67.
Gururajapathy, S.S.; Mokhlis, H.; Illias, H.A. Fault location and detection techniques in power distribution systems with distributed generation: A review. Renew. Sustain. Energy Rev. 2017, 74, 949–958.
Hare, J.; Shi, X.; Gupta, S.; Bazzi, A. Fault diagnostics in smart micro-grids: A survey. Renew. Sustain. Energy Rev. 2016, 60, 1114–1124.
Taylor, D.W.; Corne, D.W.; Taylor, D.L.; Harkness, J. Predicting alarms in supermarket refrigeration systems using evolved neural networks and evolved rulesets. In Proceedings of the 2002 Congress on Evolutionary Computation, Honolulu, HI, USA, 12–17 May 2002.
Magoulès, F.; Zhao, H.; Elizondo, D. Development of an RDP neural network for building energy consumption fault detection and diagnosis. Energy Build. 2013, 62, 133–138.
Tallam, R.M.; Habetler, T.G.; Harley, R.G.; Gritter, D.J.; Burton, B.H. Neural network based on-line stator winding turn fault detection for induction motors. In Proceedings of the Conference Record of the 2000 IEEE Industry Applications Conference, Thirty-Fifth IAS Annual Meeting and World Conference on Industrial Applications of Electrical Energy, Rome, Italy, 8–12 October 2000.
Martins, J.F.; Pires, V.F.; Pires, A.J. Unsupervised Neural-Network-Based Algorithm for an On-Line Diagnosis of Three-Phase Induction Motor Stator Fault. IEEE Trans. Ind. Electron. 2007, 54, 259–264.
Xu-hong, W.; Yi-gang, H. Fuzzy Neural Network based On-line Stator Winding Turn Fault Detection for Induction Motors. In Proceedings of the 2nd IEEE Conference on Industrial Electronics and Applications, Harbin, China, 23–25 May 2007.
Ballal, M.S.; Khan, Z.J.; Suryawanshi, H.M.; Sonolikar, R.L. Adaptive Neural Fuzzy Inference System for the Detection of Inter-Turn Insulation and Bearing Wear Faults in Induction Motor. IEEE Trans. Ind. Electron. 2007, 54, 250–258.
Cabal-Yepez, E.; Valtierra-Rodriguez, M.; Romero-Troncoso, R.J.; Garcia-Perez, A.; Osornio-Rios, R.A.; Miranda-Vidales, H.; Alvarez-Salas, R. FPGA-based entropy neural processor for online detection of multiple combined faults on induction motors. Mech. Syst. Signal Process. 2012, 30, 123–130.
Camarena-Martinez, D.; Valtierra-Rodriguez, M.; Garcia-Perez, A.; Osornio-Rios, R.A.; Romero-Troncoso, R.J. Empirical mode decomposition and neural networks on FPGA for fault diagnosis in induction motors. Sci. World J. 2014, 2014, 908140.
Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M.; Camarena-Martinez, D.; Granados-Lieberman, D.; Romero-Troncoso, R.J.; Dominguez-Gonzalez, A. Fractal dimension-based approach for detection of multiple combined faults on induction motors. J. Vib. Control 2016, 22, 3638–3648.
Zolfaghari, S.; Noor, S.B.M.; Rezazadeh, M.M.; Marhaban, M.H.; Mariun, N. Broken rotor bar fault detection and classification using wavelet packet signature analysis based on fourier transform and multi-layer perceptron neural network. Appl. Sci. 2018, 8, 25.
Flores, A.; Quiles, E.; Garcia, E.; Morant, F. Fault Diagnosis of Electric Transmission Lines using Modular Neural Networks. IEEE Lat. Am. Trans. 2016, 14, 3663–3668.
Valtierra-Rodriguez, M.; Romero-Troncoso, R.J.; Osornio-Rios, R.A.; Garcia-Perez, A. Detection and classification of single and combined power quality disturbances using neural networks. IEEE Trans. Ind. Electron. 2013, 61, 2473–2482.
Rigatos, G.; Siano, P. Power transformers’ condition monitoring using neural modeling and the local statistical approach to fault diagnosis. Int. J. Electr. Power Energy Syst. 2016, 80, 150–159.
Menezes, A.G.C.; Almeida, O.M.; Barbosa, F.R. Use of decision tree algorithms to diagnose incipient faults in power transformers. In Proceedings of the Simposio Brasileiro de Sistemas Eletricos (SBSE), Niteroi, Brazil, 12–16 May 2018.
Han, Y.; Zhao, D.; Hou, H. Oil-immersed Transformer Internal Thermoelectric Potential Fault Diagnosis Based on Decision-tree of KNIME Platform. Procedia Comput. Sci. 2016, 83, 1321–1326.
Samantaray, S.R.; Dash, P.K. Decision Tree based discrimination between inrush currents and internal faults in power transformer. Int. J. Electr. Power Energy Syst. 2011, 33, 1043–1048.
Sakthivel, N.R.; Sugumaran, V.; Babudevasenapati, S. Vibration based fault diagnosis of monoblock centrifugal pump using decision tree. Expert Syst. Appl. 2010, 37, 4040–4049.
Sharma, A.; Sugumaran, V.; Babu, S. Misfire detection in an IC engine using vibration signal and decision tree algorithms. Measurement 2014, 50, 370–380.
Volkanovski, A.; Čepin, M.; Mavko, B. Application of the fault tree analysis for assessment of power system reliability. Reliab. Eng. Syst. Saf. 2009, 94, 1116–1127.
Duan, R.; Zhou, H. A New Fault Diagnosis Method Based on Fault Tree and Bayesian Networks. Energy Procedia 2012, 17, 1376–1382.
Bonvini, M.; Sohn, M.D.; Granderson, J.; Wetter, M.; Piette, M.A. Robust on-line fault detection diagnosis for HVAC components based on nonlinear state estimation techniques. Appl. Energy 2014, 124, 156–166.
Li, D.; Zhou, Y.; Hu, G.; Spanos, C.J. Fault detection and diagnosis for building cooling system with a tree-structured learning method. Energy Build. 2016, 127, 540–551.
Rodríguez, P.V.J.; Arkkio, A. Detection of stator winding fault in induction motor using fuzzy logic. Appl. Soft Comput. 2008, 8, 1112–1120.
Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M.; Perez-Ramirez, C.A.; Camarena-Martinez, D.; Garcia-Perez, A.; Romero-Troncoso, R.J. Fractal dimension and fuzzy logic systems for broken rotor bar detection in induction motors at start-up and steady-state regimes. Meas. Sci. Technol. 2017, 28, 075001. [Google Scholar] [CrossRef]
Islam, S.M.; Wu, T.; Ledwich, G. A novel fuzzy logic approach to transformer fault diagnosis. IEEE Trans. Dielectr. Electr. Insul. 2000, 7, 177–186. [Google Scholar] [CrossRef]
Huang, Y.C.; Sun, H.C. Dissolved gas analysis of mineral oil for power transformer fault diagnosis using fuzzy logic. IEEE Trans. Dielectr. Electr. Insul. 2013, 20, 974–981. [Google Scholar] [CrossRef]
Khan, S.A.; Equbal, M.D.; Islam, T. A comprehensive comparative study of DGA based transformer fault diagnosis using fuzzy logic and ANFIS models. IEEE Trans. Dielectr. Electr. Insul. 2015, 22, 590–596. [Google Scholar] [CrossRef]
Chin, H.C. Fault section diagnosis of power system using fuzzy logic. IEEE Trans. Power Syst. 2003, 18, 245–250. [Google Scholar] [CrossRef]
Valtierra-Rodriguez, M.; Granados-Lieberman, D.; Torres-Fernandez, J.E.; Rodríguez-Rodríguez, J.R.; Gómez-Aguilar, J.F. A new methodology for tracking and instantaneous characterization of voltage variations. IEEE Trans. Instrum. Meas. 2016, 65, 1596–1604. [Google Scholar] [CrossRef]
Lauro, F.; Moretti, F.; Capozzoli, A.; Khan, I.; Pizzuti, S.; Macas, M.; Panzieri, S. Building Fan Coil Electric Consumption Analysis with Fuzzy Approaches for Fault Detection and Diagnosis. Energy Procedia 2014, 62, 411–420. [Google Scholar] [CrossRef]
Zio, E.; Baraldi, P.; Popescu, I.C. A fuzzy decision tree method for fault classification in the steam generator of a pressurized water reactor. Ann. Nucl. Energy 2009, 36, 1159–1169. [Google Scholar] [CrossRef]
Mittal, M.; Bhushan, M.; Patil, S.; Chaudhari, S. Optimal Feature Selection for SVM Based Fault Diagnosis in Power Transformers. IFAC Proc. Vol. 2013, 46, 809–814.
Lv, G.; Cheng, H.; Zhai, H.; Dong, L. Fault diagnosis of power transformer based on multi-layer SVM classifier. Electr. Power Syst. Res. 2005, 75, 9–15.
Bacha, K.; Souahlia, S.; Gossa, M. Power transformer fault diagnosis based on dissolved gas analysis by support vector machine. Electr. Power Syst. Res. 2012, 83, 73–79.
Johnson, J.M.; Yadav, A. Complete protection scheme for fault detection, classification and location estimation in HVDC transmission lines using support vector machines. IET Sci. Meas. Technol. 2017, 11, 279–287.
Parikh, U.B.; Das, B.; Maheshwari, R. Fault classification technique for series compensated transmission line using support vector machine. Int. J. Electr. Power Energy Syst. 2010, 32, 629–636.
Zhang, S.; Wang, Y.; Liu, M.; Bao, Z. Data-Based Line Trip Fault Prediction in Power Systems Using LSTM Networks and SVM. IEEE Access 2018, 6, 7675–7686.
Gangsar, P.; Tiwari, R. Comparative investigation of vibration and current monitoring for prediction of mechanical and electrical faults in induction motor based on multiclass-support vector machine algorithms. Mech. Syst. Signal Process. 2017, 94, 464–481.
Zhang, Y.; Ye, D.; Liu, Y. Robust locally linear embedding algorithm for machinery fault diagnosis. Neurocomputing 2018, 273, 323–332.
Das, S.; Purkait, P.; Koley, C.; Chakravorti, S. Performance of a load-immune classifier for robust identification of minor faults in induction motor stator winding. IEEE Trans. Dielectr. Electr. Insul. 2014, 21, 33–44.
Mulumba, T.; Afshari, A.; Yan, K.; Shen, W.; Norford, L.K. Robust model-based fault diagnosis for air handling units. Energy Build. 2015, 86, 698–707.
Yan, K.; Shen, W.; Mulumba, T.; Afshari, A. ARX model based fault detection and diagnosis for chillers using support vector machines. Energy Build. 2014, 81, 287–295.
Ayodeji, A.; Liu, Y. Support vector ensemble for incipient fault diagnosis in nuclear plant components. Nucl. Eng. Technol. 2018, 50, 1306–1313.
Lai, K.; Phung, B.; Blackburn, T. Application of data mining on partial discharge part I: Predictive modelling classification. IEEE Trans. Dielectr. Electr. Insul. 2010, 17, 846–854.
Liang, J.; Chen, K.; Lin, M.; Zhang, C.; Wang, F. Robust finite mixture regression for heterogeneous targets. Data Min. Knowl. Discov. 2018, 32, 1509–1560.
Jena, S.; Bhalja, B.R. Development of a new fault zone identification scheme for busbar using logistic regression classifier. IET Gener. Transm. Distrib. 2017, 11, 174–184.
Xu, L.; Chow, M.Y. A Classification Approach for Power Distribution Systems Fault Cause Identification. IEEE Trans. Power Syst. 2006, 21, 53–60.
Cha, J.; Ha, C.; Ko, S.; Koo, J. Application of fault factor method to fault detection and diagnosis for space shuttle main engine. Acta Astronaut. 2016, 126, 517–527.
Jiang, Y.; Yin, S. Recursive Total Principle Component Regression Based Fault Detection and Its Application to Vehicular Cyber-Physical Systems. IEEE Trans. Ind. Inf. 2018, 14, 1415–1423.
Bolovinou, A.; Bakas, I.; Amditis, A.; Mastrandrea, F.; Vinciotti, W. Online prediction of an electric vehicle remaining range based on regression analysis. In Proceedings of the IEEE International Electric Vehicle Conference (IEVC), Florence, Italy, 17–19 December 2014.
Cappiello, A.; Chabini, I.; Nam, E.K.; Lue, A.; Abou, Z.M. A statistical model of vehicle emissions and fuel consumption. In Proceedings of the IEEE 5th International Conference on Intelligent Transportation Systems, Singapore, 3–6 September 2002.
Yu, Q.; Qin, Y.; Liu, P.; Ren, G. A Panel Data Model-Based Multi-Factor Predictive Model of Highway Electromechanical Equipment Faults. IEEE Trans. Intell. Transp. Syst. 2018, 19, 1–7.
Gopinath, R.; Santhosh, K.C.; Ramachandran, K.I.; Upendranath, V.; Sai Kiran, P.V.R. Intelligent fault diagnosis of synchronous generators. Expert Syst. Appl. 2016, 45, 142–149.
Bangura, J.F.; Povinelli, R.J.; Demerdash, N.A.O.; Brown, R.H. Diagnostics of eccentricities and bar/end-ring connector breakages in polyphase induction motors through a combination of time-series data mining and time-stepping coupled FE-state space techniques. In Proceedings of the Conference Record of the 2001 IEEE Industry Applications Conference, 36th IAS Annual Meeting, Chicago, IL, USA, 30 September–4 October 2001.
Wang, G.; Jiao, J. Quality-Related Fault Detection and Diagnosis Based on Total Principal Component Regression Model. IEEE Access 2018, 6, 10341–10347.
Seera, M.; Chee, P.L.; Ishak, D.; Singh, H. Fault Detection and Diagnosis of Induction Motors Using Motor Current Signature Analysis and a Hybrid FMM–CART Model. IEEE Trans. Neural Netw. Learn. Syst. 2012, 23, 97–108.
Seera, M.; Chee, P.L. Online Motor Fault Detection and Diagnosis Using a Hybrid FMM-CART Model. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 806–812.
Tran, V.T.; Yang, B.S.; Oh, M.S.; Tan, A.C.C. Fault diagnosis of induction motor based on decision trees and adaptive neuro-fuzzy inference. Expert Syst. Appl. 2009, 36, 1840–1849.
Pramesti, W.; Damayanti, I.; Asfani, D.A. Stator fault identification analysis in induction motor using multinomial logistic regression. In Proceedings of the International Seminar on Intelligent Technology and Its Applications (ISITIA), Lombok, Indonesia, 28–30 June 2016.
Júnior, A.M.G.; Silva, V.V.R.; Baccarini, L.M.R.; Mendes, L.F.S. The design of multiple linear regression models using a genetic algorithm to diagnose initial short-circuit faults in 3-phase induction motors. Appl. Soft Comput. 2018, 63, 50–58.
Seshadrinath, J.; Singh, B.; Panigrahi, B.K. Incipient Interturn Fault Diagnosis in Induction Machines Using an Analytic Wavelet-Based Optimized Bayesian Inference. IEEE Trans. Neural Netw. Learn. Syst. 2014, 25, 990–1001.
Zhou, H.; Huang, J.; Lu, F.; Thiyagalingam, J.; Kirubarajan, T. Echo state kernel recursive least squares algorithm for machine condition prediction. Mech. Syst. Signal Process. 2018, 111, 68–86.
Choi, K.; Singh, S.; Kodali, A.; Pattipati, K.R.; Sheppard, J.W.; Namburu, S.M.; Chigusa, S.; Prokhorov, D.V.; Qiao, L. Novel Classifier Fusion Approaches for Fault Diagnosis in Automotive Systems. IEEE Trans. Instrum. Meas. 2009, 58, 602–611.
Jakubek, S.; Strasser, T. Fault-diagnosis using neural networks with ellipsoidal basis functions. In Proceedings of the American Control Conference, Anchorage, AK, USA, 8–10 May 2002.
Oliva, J.A.; Weihrauch, C.; Bertram, T. Model-Based Remaining Driving Range Prediction in Electric Vehicles by using Particle Filtering and Markov Chains. World Electr. Veh. J. 2013, 6, 204–213.
Tseng, C.M.; Chau, C.K. Personalized Prediction of Vehicle Energy Consumption Based on Participatory Sensing. IEEE Trans. Intell. Transp. Syst. 2017, 18, 3103–3113.
Grubwinkler, S.; Lienkamp, M. A modular and dynamic approach to predict the energy consumption of electric vehicles. In Proceedings of the Conference on Future Automotive Technology, Munich, Germany, 18–19 March 2013.
Cuma, M.U.; Koroglu, T. A comprehensive review on estimation strategies used in hybrid and battery electric vehicles. Renew. Sustain. Energy Rev. 2015, 42, 517–531.
Liao, R.; Zheng, H.; Grzybowski, S.; Yang, L. Particle swarm optimization-least squares support vector regression based forecasting model on dissolved gases in oil-filled power transformers. Electr. Power Syst. Res. 2011, 81, 2074–2080.
Zheng, H.; Zhang, Y.; Liu, J.; Wei, H.; Zhao, J.; Liao, R. A novel model based on wavelet LS-SVM integrated improved PSO algorithm for forecasting of dissolved gas contents in power transformers. Electr. Power Syst. Res. 2018, 155, 196–205.
Zhang, Y.Y.; Wei, H.; Yang, Y.D.; Zheng, H.B.; Zhou, T.; Jiao, J. Forecasting of Dissolved Gases in Oil-immersed Transformers Based upon Wavelet LS-SVM Regression and PSO with Mutation. Energy Procedia 2016, 104, 38–43.
Yang, M.T.; Hu, L.S. Intelligent fault types diagnostic system for dissolved gas analysis of oil-immersed power transformer. IEEE Trans. Dielectr. Electr. Insul. 2013, 20, 2317–2324.
Al-Janabi, S.; Rawat, S.; Patel, A.; Al-Shourbaji, I. Design and evaluation of a hybrid system for detection and prediction of faults in electrical transformers. Int. J. Electr. Power Energy Syst. 2015, 67, 324–335.
Fei, S.; Zhang, X. Fault diagnosis of power transformer based on support vector machine with genetic algorithm. Expert Syst. Appl. 2009, 36, 11352–11357.
Koley, C.; Purkait, P.; Chakravorti, S. Wavelet-Aided SVM Tool for Impulse Fault Identification in Transformers. IEEE Trans. Power Deliv. 2006, 21, 1283–1290.
Yunlong, Z.; Peng, Z. Vibration Fault Diagnosis Method of Centrifugal Pump Based on EMD Complexity Feature and Least Square Support Vector Machine. Energy Procedia 2012, 17, 939–945.
Sakthivel, N.R.; Sugumaran, V.; Nair, B.B. Comparison of decision tree-fuzzy and rough set-fuzzy methods for fault categorization of mono-block centrifugal pump. Mech. Syst. Signal Process. 2010, 24, 1887–1906.
Muralidharan, V.; Sugumaran, V.; Indira, V. Fault diagnosis of monoblock centrifugal pump using SVM. Eng. Sci. Technol. Int. J. 2014, 17, 152–157.
Muralidharan, V.; Sugumaran, V. A comparative study of Naïve Bayes classifier and Bayes net classifier for fault diagnosis of monoblock centrifugal pump using wavelet analysis. Appl. Soft Comput. 2012, 12, 2023–2029.
Muralidharan, V.; Sugumaran, V. Feature extraction using wavelets and classification through decision tree algorithm for fault diagnosis of mono-block centrifugal pump. Measurement 2013, 46, 353–359.
Jamil, M.; Singh, R.; Sharma, S.K. Fault identification in electrical power distribution system using combined discrete wavelet transform and fuzzy logic. J. Electr. Syst. Inf. Technol. 2015, 2, 257–267.
Mortazavi, S.H.; Moravej, Z.; Shahrtash, S.M. A hybrid method for arcing faults detection in large distribution networks. Int. J. Electr. Power Energy Syst. 2018, 94, 141–150.
Ramesh, B.N.; Jagan, M.B. Fault classification in power systems using EMD and SVM. Ain Shams Eng. J. 2017, 8, 103–111.
Singh, S.; Vishwakarma, D.N. A Novel Methodology for Identifying Cross-Country Faults in Series-Compensated Double Circuit Transmission Lines. Procedia Comput. Sci. 2018, 125, 427–433.
Da Silva, P.R.N.; Gabbar, H.A.; Vieira Junior, P.; da Costa Junior, C.T. A new methodology for multiple incipient fault diagnosis in transmission lines using QTA and Naïve Bayes classifier. Int. J. Electr. Power Energy Syst. 2018, 103, 326–346.
Lin, S.; Horng, S. A Classification-Based Fault Detection and Isolation Scheme for the Ion Implanter. IEEE Trans. Semicond. Manuf. 2006, 19, 411–424.
Di Maio, F.; Baraldi, P.; Zio, E.; Seraoui, R. Fault Detection in Nuclear Power Plants Components by a Combination of Statistical Methods. IEEE Trans. Reliab. 2013, 62, 833–845.
Liangyu, M.; Yongguang, M.; Lee, K.Y. An Intelligent Power Plant Fault Diagnostics for Varying Degree of Severity and Loading Conditions. IEEE Trans. Energy Convers. 2010, 25, 546–554.
Madhusudana, C.K.; Kumar, H.; Narendranath, S. Fault diagnosis of face milling tool using decision tree and sound signal. Mater. Today Proc. 2018, 5, 12035–12044.
Du, Z.; Fan, B.; Jin, X.; Chi, J. Fault detection and diagnosis for buildings and HVAC systems using combined neural networks and subtractive clustering analysis. Build. Environ. 2014, 73, 1–11.
Camarena-Martinez, D.; Valtierra-Rodriguez, M.; Perez-Ramirez, C.A.; Amezquita-Sanchez, J.P.; Romero-Troncoso, R.J.; Garcia-Perez, A. Novel downsampling empirical mode decomposition approach for power quality analysis. IEEE Trans. Ind. Electron. 2015, 63, 2369–2378.
Djeffal, A.; Babahenini, M.C.; Ahmed, A.T. Fast binary support vector machine learning method by samples reduction. Int. J. Data Min. Model. Manag. 2017, 9, 1.
Zhao, Z.; Chu, L.; Tao, D.; Pei, J. Classification with label noise: A Markov chain sampling framework. Data Min. Knowl. Discov. 2018, 33, 1468–1504.
Hwang, D.; Son, Y. Prototype-based classification and error analysis under bootstrapping strategy. Int. J. Data Min. Model. Manag. 2018, 10, 293.
Zhang, W.; Zhang, Z.; Chao, H.C.; Tseng, F.H. Kernel mixture model for probability density estimation in Bayesian classifiers. Data Min. Knowl. Discov. 2018, 32, 675–707.
Zhang, J.; Wang, S.; Chen, L.; Gallinari, P. Multiple Bayesian discriminant functions for high-dimensional massive data classification. Data Min. Knowl. Discov. 2016, 31, 465–501.
Celotto, E. Visualizing the behavior and some symmetry properties of Bayesian confirmation measures. Data Min. Knowl. Discov. 2016, 31, 739–773.
Becker, M.; Lemmerich, F.; Singer, P.; Strohmaier, M.; Hotho, A. MixedTrails: Bayesian hypothesis comparison on heterogeneous sequential data. Data Min. Knowl. Discov. 2017, 31, 1359–1390.
Le, T.; Nguyen, K.; Nguyen, V.; Nguyen, T.D.; Phung, D. GoGP: Fast Online Regression with Gaussian Processes. In Proceedings of the IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017.
Marx, A.; Vreeken, J. Telling Cause from Effect Using MDL-Based Local and Global Regression. In Proceedings of the IEEE International Conference on Data Mining (ICDM), New Orleans, LA, USA, 18–21 November 2017.
Rudaś, K.; Jaroszewicz, S. Linear regression for uplift modeling. Data Min. Knowl. Discov. 2018, 32, 1275–1305.
Chen, T.; Guestrin, C. XGBoost. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
Teinemaa, I.; Dumas, M.; Leontjeva, A.; Maggi, F.M. Temporal stability in predictive process monitoring. Data Min. Knowl. Discov. 2018, 32, 1306–1338.
Baldi, P. The inner and outer approaches to the design of recursive neural architectures. Data Min. Knowl. Discov. 2017, 32, 218–230.
Zhang, B.; Li, W.; Li, X.L.; Ng, S.K. Intelligent fault diagnosis under varying working conditions based on domain adaptive convolutional neural networks. IEEE Access 2018, 6, 66367–66384.
Bouguelia, M.R.; Nowaczyk, S.; Payberah, A.H. An adaptive algorithm for anomaly and novelty detection in evolving data streams. Data Min. Knowl. Discov. 2018, 32, 1597–1633.
Xi, P.P.; Zhao, Y.P.; Wang, P.X.; Li, Z.Q.; Pan, Y.T.; Song, F.Q. Least squares support vector machine for class imbalance learning and their applications to fault detection of aircraft engine. Aerosp. Sci. Technol. 2018, 84, 56–74.
Lu, F.; Wu, J.; Huang, J.; Qiu, X. Aircraft engine degradation prognostics based on logistic regression and novel OS-ELM algorithm. Aerosp. Sci. Technol. 2018, 84, 661–671.
Wu, X.; Kumar, V.; Quinlan, J.R.; Ghosh, J.; Yang, Q.; Motoda, H.; McLachlan, G.J.; Ng, A.F.M.; Liu, B.; Yu, P.S.; et al. Top 10 Algorithms in Data Mining. Knowl. Inf. Syst. 2008, 14, 1–37.
Zhao, H.; Wu, Q.; Hu, S.; Xu, H.; Rasmussen, C.N. Review of energy storage system for wind power integration support. Appl. Energy 2015, 137, 545–553.
Saponara, S.; Saletti, R.; Mihet-Popa, L. Hybrid micro-grids exploiting renewables sources, battery energy storages, and bi-directional converters. Appl. Sci. 2019, 9, 4973.
Zhou, B.; Yu, F.; Li, H.; Xin, W. A Quantitative Study on the Void Defects Evolving into Damage in Wind Turbine Blade Based on Internal Energy Storage. Appl. Sci. 2020, 10, 491.
Liu, Z.F.; Li, L.L.; Tseng, M.L.; Tan, R.R.; Aviso, K.B. Improving the reliability of photovoltaic and wind power storage systems using least squares support vector machine optimized by improved chicken swarm algorithm. Appl. Sci. 2019, 9, 3788.

Figure 1. General elements of a condition monitoring strategy.

Figure 2. (a) Schematic diagram for tasks and functionalities of data mining (DM) and (b) prediction model.

Table 1. Classification methods and their applications.

Classification Methods	Equipment Under Test	Physical Variable Used as Information Source
▪ NNs and rule sets [26] ▪ Recursive deterministic perceptron NN [27] ▪ Feedforward NN [28] ▪ Hebbian NN [29] ▪ B-spline membership fuzzy NN [30] ▪ ANFIS [31,52] ▪ Feedforward NN [32,33,34] ▪ Multi-layer NN [35] ▪ Modular NNs [36] ▪ Adaptive linear NN and feedforward NN [37] ▪ Neural-fuzzy network and statistical analysis [38]	▪ Refrigeration systems ▪ Electric equipment (fan, coil, pump, and chiller) ▪ Stator winding of an induction motor ▪ Bearings in induction motors ▪ Broken rotor bars ▪ Electric transmission lines ▪ Transformer	▪ Temperature ▪ Energy consumption ▪ Voltage ▪ Data from DGA ▪ Current ▪ Vibrations ▪ Speed
▪ Decision tree: C4.5 algorithm [39,40,42] ▪ Decision tree: CART algorithm [41] ▪ Decision trees: J48 algorithm, best first algorithm, random forest algorithm, functional algorithm, and linear model algorithm (a comparison) [43] ▪ Fault tree analysis [44] ▪ Fault tree analysis and Bayesian networks [45]	▪ Transformer ▪ Monoblock centrifugal pump ▪ Internal combustion engine ▪ Power system ▪ System of oil pressure warning instructions in aircraft engines	▪ Data from DGA ▪ Current ▪ Vibration ▪ Power flows
▪ Bayesian non-linear state estimation technique (detection based on a threshold value) [46] ▪ Tree-structured fault dependence kernel [47]	▪ Chiller plant ▪ Building cooling system	▪ Pressure ▪ Temperature
▪ Fuzzy logic system [48,49,50,51,53,54] ▪ Fuzzy sets and fuzzy logic [55] ▪ Fuzzy decision tree [56]	▪ Induction motor ▪ Transformer ▪ Power systems ▪ Fan coil electric consumption ▪ Steam generator of a pressurized water reactor	▪ Current ▪ Voltage ▪ Electric power ▪ Temperature ▪ Pressure ▪ Flow

Table 2. Regression-based methods and their applications.

Regression Methods	Equipment Under Test	Type of Fault	Effectiveness (%)
▪ Multilayer SVM [57]	▪ Transformer	▪ Partial discharge and arcing	100
▪ Multilayer SVM [58]	▪ Transformer	▪ Discharge and thermal faults	100
▪ Multilayer SVM [59]	▪ Transformer	▪ Discharge and thermal faults	90
▪ SVM [60]	▪ High-voltage direct current (HVDC) transmission lines	▪ Pole-1 to ground, pole-2 to ground, and pole-1 to pole-2	100
▪ SVM [61]	▪ Series compensated transmission line	▪ Line-to-ground, line-to-line fault involving ground and line-to-line	98.703
▪ LSTM and SVM [62]	▪ Power systems	▪ Data-based line trip fault	97.7
▪ Multiclass SVM [63]	▪ Induction motor	▪ Mechanical fault	97.48
▪ Robust locally linear embedding algorithm and SVM [64]	▪ Motor, torque transducer/encoder and dynamometer	▪ Gear fault	90–100
▪ SVM in regression mode [65]	▪ Stator winding of an induction motor	▪ Stator winding short circuit faults	95.1, 80.8, and 92.7
▪ Autoregressive time series model with exogenous variables and SVM [66]	▪ HVAC systems	▪ Air handling unit faults	92.3
▪ Autoregressive time series model with exogenous variables and SVM [67]	▪ Chillers	▪ Reduced condenser and evaporator water flow, condenser fouling, non-condensables in refrigerant, and refrigerant leak	90.31
▪ Multiclass SVM [68]	▪ Steam generator and pressure boundary of a reactor coolant system	▪ Incipient faults	100
▪ Logistic regression [71]	▪ Busbar	▪ Internal and external faults	99.69
▪ CART [78]	▪ Synchronous generators	▪ Inter-turn fault	95.58–98.15
▪ Time-series data mining [79]	▪ Polyphase induction motors	▪ Eccentricities and bar/end-ring connector breakages	100

Table 3. Summary of the hybrid techniques used.

Data Mining Techniques	Other Techniques	Application
▪ Fuzzy min-max NN and CART [81,82]	-	▪ Induction motors
▪ ANFIS and CART [83]	-	▪ Induction motors
▪ Logistic regression [84]	WT	▪ Stator winding of an induction motor
▪ Multiple linear regression and genetic algorithms [85]	RMS	▪ Three-phase induction motors
▪ PNN and orthogonal least squares regression algorithm [86]	DWT	▪ Induction machines
▪ Kernel recursive least squares algorithm and Bayesian technique [87]	-	▪ Turbofan engine
▪ SVM, PNN, kNN, and principal component analysis [88]	-	▪ Engine system
▪ Kernel regression techniques and NN [89]	-	▪ Automotive industry
▪ Particle filtering and Markov chains [90]	-	▪ Electric vehicles
▪ Average and collaborative filtering [91]	Similarity matching	▪ Vehicle energy consumption
▪ Mean and least-mean square algorithm [92]	-	▪ Vehicle energy consumption
▪ LS-SVM regression [94]	Particle swarm optimization algorithm	▪ Transformer
▪ LS-SVM [95,96]	WT	▪ Transformer
▪ Multinomial logistic regression and NN [97]	-	▪ Transformer
▪ NN and genetic algorithm [98]	-	▪ Transformer
▪ SVM and genetic algorithm [99]	-	▪ Transformer
▪ SVM in regression mode [100]	WT	▪ Transformer
▪ LS-SVM [101]	EMD	▪ Centrifugal pump
▪ Decision tree-fuzzy and rough set-fuzzy methods [102]	-	▪ Monoblock centrifugal pump
▪ NB classifier and Bayes net classifier [103]	Wavelet analysis	▪ Monoblock centrifugal pump
▪ Decision tree [104]	Wavelet analysis	▪ Monoblock centrifugal pump
▪ SVM [105]	CWT	▪ Monoblock centrifugal pump
▪ Fuzzy logic [106]	WT	▪ Electrical power distribution system
▪ SVM [107]	WT	▪ Distribution networks
▪ SVM [108]	EMD	▪ Power systems
▪ SVM, NB, and PNN [109]	EMD	▪ Transmission lines
▪ Naïve Bayes [110]	Qualitative trend analysis	▪ Transmission lines
▪ CART and clustering algorithm [111]	-	▪ Ion implanter
▪ Auto-associative kernel regression, correlation analysis, genetic algorithm, and probability ratio test [112]	-	▪ Reactor coolant pump of a typical pressurized water reactor
▪ NN [113]	Optimal zoom search	▪ High-pressure feedwater heater system
▪ Decision tree [114]	DWT	▪ Face milling tool
▪ NN and subtractive clustering analysis [115]	-	▪ HVAC systems

Table 4. Recent methods for general applications.

Year	Methods	Usage
2016	Naïve Bayes and feature weighting approaches [121]	▪ High-dimensional massive data classification
	Visual approach to represent Bayesian confirmation measures (BCMs) [122]	▪ Visualize the behavior and symmetry properties of BCMs
	XGBoost [127]	▪ Scalable machine learning system for tree boosting
2017	Covering-based samples reduction [117]	▪ Fast binary support vector machine learning method
	MixedTrails, a Bayesian approach [123]	▪ Types of analysis to study sequential data
	Geometric-based Online Gaussian Process for fast regression [124]	▪ Handling of large-scale datasets
	Kolmogorov complexity and the Minimum Description Length principle [125]	▪ Solution to the problem of inferring the direction of causal dependence from observational data
	Recursive neural architectures [129]	▪ Design of recursive architectures for numerical data of variable size
2018	Heterogeneous-target robust mixture regression (HERMIT) [70]	▪ Handling of heterogeneous data
	Markov chain sampling [118]	▪ Classification of data in presence of label noise
	Prototype-based classification [119]	▪ Learning and prediction based on the selection of handfuls of class data
	Kernel mixture model [120]	▪ Probability density estimation in Bayesian classifiers
	Linear regression [126]	▪ Uplift modeling
	Random forest, XGBoost, and LSTM [128]	▪ Analysis of temporal stability and accuracy for binary classification
	Convolutional NN [130]	▪ Intelligent fault diagnosis when the data at training and testing time does not come from the same distribution
	Growing Neural Gas Algorithm [131]	▪ Adaptive algorithm for evolving data streams
2019	Least squares support vector machine [132]	▪ Class imbalance learning
	Logistic regression and Kalman filter [133]	▪ Estimation error reduction in online sequential extreme learning machine systems

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

