1. Introduction
Induction motors power industrial systems across the manufacturing, energy, and transportation sectors [1]. Their reliability directly affects operational efficiency, since unexpected failures can cause significant economic losses, production delays, and safety hazards. Predictive maintenance has become a crucial strategy for mitigating these risks, but traditional diagnostic methods that rely on manual inspection and periodic maintenance have clear limitations. Recent advances in machine learning and signal processing have transformed this landscape and enabled more accurate, proactive approaches to equipment monitoring [2].
Faults in induction motors arise from complex interactions between mechanical and electrical components and can lead to catastrophic failures if left undetected [3,4]. Vibration and acoustic emission signals are essential indicators for fault detection, and statistical techniques and probabilistic models for analyzing them have been widely adopted over recent decades [5,6]. Manufacturers have implemented intelligent predictive maintenance solutions for rotating machinery, particularly for remote monitoring applications [7]. Vibration-based condition monitoring is a cornerstone of these solutions, identifying potential faults by tracking variations in vibration patterns [8].
Data-driven methodologies that incorporate artificial intelligence and advanced signal processing have fundamentally reshaped fault detection strategies [9]. Among the failure modes that can compromise motor performance, imbalance is a particularly complex challenge that requires sophisticated diagnostic approaches. Unlike simpler mechanical failures, imbalance can manifest in multiple configurations (mass imbalance, geometric imbalance, and couple imbalance), each of which presents distinct detection challenges and significantly affects motor reliability and operational efficiency.
Several obstacles challenge reliable fault detection: high-frequency signal noise, dynamic operating conditions, and complex signal propagation mechanisms. Conventional approaches struggle in particular in variable industrial environments, which introduce significant signal interference and operational variability [10]. Computational constraints further complicate imbalance fault diagnosis, especially in real-time scenarios where results must be delivered immediately without compromising system performance. Recent research has developed advanced signal-processing techniques that extract precise fault indicators under diverse operating conditions [11], and combining the fast Fourier transform (FFT), wavelet transforms, and machine learning algorithms has helped overcome the limitations of traditional methods.
Feature extraction is a critical challenge in imbalance fault analysis. Researchers have developed hybrid methodologies that integrate time-domain and frequency-domain transformations to capture subtle mechanical irregularities [12]. These methods emphasize generalizable features that remain consistent across different motor configurations and operating conditions. Supervised learning algorithms, ensemble classification methods, and adaptive neural network architectures have further improved detection precision [13].
Machine learning applications continue to advance the field of imbalance fault detection. Deep learning and ensemble methods have shown particular promise in handling complex motor imbalance characteristics [14]. These approaches address fundamental limitations of existing diagnostic methods and yield more robust, adaptable detection models that generalize across varied motor configurations and operating conditions.
Recent studies have highlighted the potential of artificial intelligence in motor fault diagnosis [15,16,17]. In our research, we explore deep neural networks (DNNs), support vector machines (SVMs), and K-nearest neighbors (KNN) for the binary classification of “normal” versus “imbalance” conditions. Previous work combined SVM with long short-term memory (LSTM) networks [18], although that approach required a deep understanding of fault-specific patterns. Other studies used SVM with statistical features from vibration and current signals [19], and KNN has proven effective in diagnosing faults in rotating machinery [20,21].
In this work, we develop a novel approach that combines traditional and advanced machine learning models for imbalance fault detection. SVM with time-domain features is used for binary classification, leveraging its effectiveness with simpler feature sets, while a DNN with FFT-based autocorrelation features captures complex fault signatures in the frequency domain. The frequency-domain analysis uses autocorrelation computed through FFT-based convolution, in which the signal is convolved with its time-reversed version. This exploits the computational efficiency of the FFT and captures periodic patterns and temporal dependencies in the vibration signals. Using the MAFAULDA dataset [22] and random oversampling [23], we obtained the following results: SVM accuracy increased from 85.9% to 95.4%, KNN achieved 92.8% accuracy, and our DNN implementation with FFT-based features reached 99.7% accuracy. These results are consistent with research highlighting the advantages of frequency-domain analysis in fault detection [24,25,26].
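The FFT-based autocorrelation described above can be sketched as follows. This is a minimal illustration, not the implementation used in this work: the function names `fft_convolve` and `autocorrelation_fft`, the mean removal, the normalization to unit value at lag 0, and the synthetic 50 Hz test signal are all assumptions introduced for the example.

```python
import numpy as np


def fft_convolve(a, b):
    """Linear convolution of two 1-D arrays computed with the FFT."""
    n = len(a) + len(b) - 1  # length of the full linear convolution
    # rfft zero-pads both inputs to length n before transforming.
    spectrum = np.fft.rfft(a, n) * np.fft.rfft(b, n)
    return np.fft.irfft(spectrum, n)


def autocorrelation_fft(signal):
    """Autocorrelation for non-negative lags via FFT-based convolution."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # assumption: remove the DC offset first
    # Convolving x with its time-reversed copy gives the full
    # autocorrelation; lag 0 sits at index len(x) - 1 of the result.
    full = fft_convolve(x, x[::-1])
    acf = full[len(x) - 1:]
    # Assumption: normalize so that lag 0 equals 1
    # (undefined for a constant signal, where acf[0] is 0).
    return acf / acf[0]


# Synthetic stand-in for a vibration signal (not MAFAULDA data):
# a 50 Hz sine sampled at 50 kHz for 0.1 s.
fs = 50_000
t = np.arange(5_000) / fs
x = np.sin(2 * np.pi * 50 * t)
acf = autocorrelation_fft(x)
period_lag = fs // 50  # 1000 samples per cycle of the sine
```

For a periodic signal such as this one, the autocorrelation is strongly positive at a lag of one period (`acf[period_lag]`) and strongly negative at half a period, which is the kind of recurring structure such features are meant to expose.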
This paper extends existing research through the practical application of AI in predictive maintenance. It focuses on vibration-based datasets, the preprocessing of data and features, and specific fault conditions. Because the MAFAULDA dataset matches the fault conditions studied here, it supports both the relevance and the accuracy of our findings.
Figure 1 illustrates the flowchart of the proposed methods.
The remainder of this paper is organized as follows. Section 2 presents the experimental setup, dataset characteristics, and methods, including details of the machinery fault simulator and data acquisition system. Section 3 describes the feature extraction process and the statistical parameters used for fault detection. Section 4 details the feature signal analysis approach and its implementation. Section 5 presents the classification methodology, including the implementation of the SVM, KNN, and DNN algorithms, along with a discussion of practical considerations and performance evaluation. Section 6 presents the experimental results and a detailed analysis of each classifier’s performance. Section 7 concludes the paper with key findings and suggestions for future research.
3. Feature Extraction
This section presents our approach to statistical feature extraction for fault diagnosis. We employed eleven statistical features that characterize the distribution and patterns of the vibration data: mean, standard deviation, the three quartiles (Q1, Q2 (the median), and Q3), minimum and maximum (which together give the peak-to-peak range), kurtosis, skewness, root mean square (RMS), and energy. For parameter optimization, we used kurtosis, which measures signal sharpness, as an advanced input feature to produce optimized features; the detailed optimization process is discussed in [34]. These techniques enable a comprehensive assessment of the data structure, particularly in identifying peaked, flat, or directionally biased distributions. Skewness quantifies the asymmetry of a distribution, with zero indicating a symmetric distribution such as the normal distribution, while kurtosis measures the heaviness of the distribution tails relative to a normal distribution. Both metrics provide crucial statistical characteristics for fault detection. The mean and standard deviation describe the central tendency and spread of the data, respectively, with variance and standard deviation being particularly useful for measuring data distribution during feature extraction [35]. Our methodology applies these statistical features to differentiate between expected (normal) and unusual (imbalance) patterns in time-domain data. The dataset comprises 2550 feature windows, with 70% allocated for training and 30% for testing. Each window spans 132 samples, and consecutive windows overlap by 25%, so that each window yields a set of statistical features characterizing the vibration patterns under diverse operating conditions. The mathematical formulations of all statistical features are presented in Table 4.
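The windowed extraction of the eleven features can be sketched as below. This is an illustrative sketch rather than the code used in this work: the function names, the population (biased) moment estimates, the Pearson (non-excess) kurtosis, and the synthetic input signal are assumptions, and the exact formulations in Table 4 may differ.

```python
import numpy as np

WINDOW = 132  # samples per window (from the text)
STEP = 99     # 25% overlap, i.e. advance by 75% of the window


def window_features(w):
    """Return the eleven statistical features of one window as a dict."""
    mean = w.mean()
    std = w.std()  # assumption: population standard deviation
    centred = w - mean
    q1, q2, q3 = np.percentile(w, [25, 50, 75])
    return {
        "mean": mean,
        "std": std,
        "q1": q1,
        "median": q2,
        "q3": q3,
        "min": w.min(),
        "max": w.max(),
        # Assumption: Pearson kurtosis (3 for a normal distribution);
        # both moment ratios are undefined for a constant window.
        "kurtosis": np.mean(centred**4) / std**4,
        "skewness": np.mean(centred**3) / std**3,
        "rms": np.sqrt(np.mean(w**2)),
        "energy": np.sum(w**2),  # sum of squared amplitudes
    }


def extract_features(signal, window=WINDOW, step=STEP):
    """Slide a window over the signal and stack one feature row per window."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - window + 1, step)
    rows = [list(window_features(signal[s:s + window]).values())
            for s in starts]
    return np.array(rows)


# Synthetic stand-in signal (not MAFAULDA data).
rng = np.random.default_rng(0)
demo_signal = rng.normal(size=1_000)
features = extract_features(demo_signal)
```

With the 1000-sample stand-in signal, `features` has one row per window and eleven columns, ordered as in the dictionary above. The resulting matrix is the kind of input that a 70%/30% train/test split and a classifier such as SVM or KNN would operate on.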
4. Feature Signal Analysis for Fault Detection
The MAFAULDA dataset contains vibration data collected with industrial IMI sensors (601A01 and 604B31 accelerometers) positioned on the machinery fault simulator (MFS) to capture vibrations in the radial, axial, and tangential directions. The data acquisition system included a Monarch Instrument MT-190 tachometer, a Shure SM81 microphone, and National Instruments NI-9234 data acquisition modules rated for a 51.2 kHz sampling rate. The analysis extracted eleven fundamental features from the raw vibration data: mean, standard deviation, minimum, maximum, kurtosis, skewness, root mean square (RMS), energy, and the 25%, 50%, and 75% quantiles (Q1, median, and Q3). The SVM and KNN methods applied these features to both raw and processed data, allowing the algorithms to learn and adapt to the underlying patterns for effective classification. The DNN implementation additionally incorporated autocorrelation analysis of the accelerometer data, computed through FFT-based convolution, which captures temporal patterns and recurring signal characteristics crucial for fault detection.
The dataset comprises vibration sequences recorded at fixed rotation speeds from 254 to 3686 rpm, in increments of approximately 60 rpm. These sequences were sampled at 50 kHz and analyzed using a sliding window approach: windows of 132 samples were advanced with 25% overlap between consecutive windows, corresponding to a step size of 99 samples, which resulted in 2550 feature windows spanning approximately five seconds of data. The 60 rpm increments provide comprehensive coverage of operational speeds while keeping the data volume manageable. The imbalance faults were simulated using loads ranging from 6 g to 35 g. For each load value below 30 g, the rotational frequencies used under normal operation were consistently maintained. The speed was held constant within each sequence, allowing reliable fault signature identification.
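As a quick consistency check of the windowing figures above, the step size and the time span covered by the windows can be worked out directly. The symbols W (window length), r (overlap fraction), S (step), N (number of windows), L (samples covered), and f_s (sampling rate) are introduced here for illustration only, and the calculation assumes that all 2550 windows are taken consecutively from a single sequence.

```latex
\begin{align*}
S &= W(1 - r) = 132 \times 0.75 = 99 \ \text{samples},\\
L &= (N - 1)\,S + W = 2549 \times 99 + 132 = 252{,}483 \ \text{samples},\\
T &= \frac{L}{f_s} = \frac{252{,}483}{50{,}000\ \text{Hz}} \approx 5.05\ \text{s}.
\end{align*}
```

Under that assumption, the 2550 windows cover about 5.05 s at 50 kHz, which agrees with the stated span of approximately five seconds.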
Figure 6 shows the extracted features under normal operation. Loads of 30 g or more prevented the system from reaching rotational frequencies above 3300 rpm, a constraint that is itself relevant to fault detection. The progressive load conditions (6 g to 35 g) reveal system dynamics that are crucial for practical fault detection implementations.
Different imbalance conditions show distinct behavioral patterns. At the 6 g imbalance (Figure 7), the mean, standard deviation, and kurtosis remained stable despite slight perturbations. The 10 g imbalance analysis (Figure 8) shows sustained rotational properties, while the skewness and RMS features changed characteristically under the increased imbalance load.
At the 15 g imbalance (Figure 9), changes in the maximum, energy, and median show how the system responds to the increased load. The maximum directly indicates the peak vibration amplitude, which typically increases under severe imbalance, and therefore reveals potential threshold violations that could signal severe mechanical stress on the system. The energy, calculated as the sum of squared signal amplitudes, measures the overall vibrational intensity; under extreme imbalance it rises significantly, reflecting higher vibrational content across the entire signal duration, which makes it a sensitive indicator of severe imbalance. The 20 g imbalance (Figure 10) produces dynamic variations across all features, especially skewness and kurtosis, whose pronounced fluctuations suggest that the system is sensitive to increasing imbalance levels. At the 25 g imbalance (Figure 11), all extracted features vary, and the trends in standard deviation and minimum give insight into the system’s ability to handle substantial imbalance.
Figure 12 illustrates the limits of feature performance under higher load imbalance. The 30 g analysis shows constrained rotational frequencies (none above 3300 rpm) and variations in the maximum and energy features, indicating system stability issues under high loads. Finally, Figure 13 examines the dynamics of all extracted features under a severe 35 g imbalance, where significant changes appear across all features, particularly in the mean, skewness, and RMS values, showing system behavior at its operational limits. Each sequence spans five seconds at a 50 kHz sampling rate, providing detailed data for fault detection. In Figures 6 to 13, time is given in microseconds.
7. Conclusions
This paper presents methods for predicting induction motor faults, with the aim of minimizing losses and preventing disasters through advanced fault detection techniques. Using SVM, KNN, and DNN models, we classify motor states as either “normal” or “imbalance”. Our findings show improvements over previous studies: the optimized, oversampled SVM reached 95.4% accuracy and the optimized, oversampled KNN reached 92.8%. Both methods benefited from the optimization and oversampling strategies. The best-performing model was the DNN with FFT-based autocorrelation features, which achieved 99.7% accuracy. These results align with trends in technology-driven fault identification and offer benefits for industrial processes such as increased efficiency, reduced downtime, and enhanced safety.
In future work, the model’s robustness will be enhanced by testing it on the testbed under varying load conditions and real-world noise. The training dataset will also be expanded with diverse fault scenarios, and adaptive learning techniques will be incorporated to improve the generalization and reliability of the model.