An Audio-Based Motor-Fault Diagnosis System with SOM-LSTM

Tu, Chia-Sheng; Chiu, Chieh-Kai; Tsai, Ming-Tang

doi:10.3390/app14188229


by Chia-Sheng Tu ¹, Chieh-Kai Chiu ² and Ming-Tang Tsai ²,*

¹ School of Mechanical and Electrical Engineering, Tan Kah Kee College, Xiamen University, Zhangzhou 363105, China

² Department of Electrical Engineering, Cheng-Shiu University, Kaohsiung 833, Taiwan

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(18), 8229; https://doi.org/10.3390/app14188229

Submission received: 8 August 2024 / Revised: 5 September 2024 / Accepted: 10 September 2024 / Published: 12 September 2024

(This article belongs to the Collection Modeling, Design and Control of Electric Machines: Volume II)


Abstract:

This paper combines self-organizing mapping (SOM) with a long short-term memory (LSTM) network, referred to as SOM-LSTM, to construct an audio-based motor-fault diagnosis system that identifies the operating states of a rotary motor. First, an audio signal collector measures the motor sound, the fast Fourier transform (FFT) converts the measured time-domain sound signal into a frequency-domain signal, and the frequency-domain signal is normalized and calibrated to ensure its consistency and accuracy. Second, the SOM analyzes the characterized frequency-domain waveforms to reveal the intrinsic structure and patterns of the data. The LSTM network then processes the secondary data generated via SOM; by aggregating the dimensional data and modeling the long-term dependencies in the sequence data, it identifies the different operating states and possible abnormal patterns. This paper also uses the experimental design of the Taguchi method to optimize the parameters of SOM-LSTM in order to increase the execution efficiency of fault diagnosis. Finally, the fault diagnosis system is applied to the real-time monitoring of motor operation to identify the motor-fault type, and it is tested under different loads and environments to evaluate its feasibility. The result is a diagnostic strategy that can be followed for motor faults. With this fault diagnosis system, abnormal conditions in motor equipment can be detected, which supports preventive maintenance, makes work more efficient, saves time and costs, and improves the industry’s ability to monitor motor operation information.

Keywords:

motor; operation audio; self-organizing mapping; long short-term memory network; fast Fourier transform

1. Introduction

Motors are indispensable equipment in daily life and in various industries, supporting the development of modern science and technology and industrial progress. A motor is a device that converts electrical energy into mechanical energy. Its main structure is formed of a stator and a rotor, each of which is a complex mechanical structure [1]. In order to increase industrial productivity, most motors run for long periods. Therefore, after years of operation, the probability of motor failure is very high. When a fault occurs, detecting it, diagnosing and determining its type, and then performing maintenance are crucial steps. The structure of early machinery and equipment was relatively simple. Diagnostic experts or maintenance personnel used simple measuring instruments to extract information from machines and equipment, and they then carried out fault diagnosis and maintenance based on personal experience. This diagnostic method, which is based on instrument signals and human experience, has been gradually replaced by more efficient artificial intelligence in recent years [2,3,4,5]. Artificial intelligence is also one of the important topics in motor-fault diagnosis and maintenance at present.

In the early days, most studies on motor faults used vibration analysis methods [6]. The fast Fourier transform (FFT) is used to process the vibration signal and analyze the spectrum of the signal to identify various types of faults [7]. However, when FFT is used to convert time-domain signals into frequency-domain signals, signal distortion sometimes occurs, so the method is not yet perfect. In some diagnoses, obscure feature parameters cause the detection system to misjudge. Additionally, the fixed size of the chosen window, the difficulty of quantifying the extent of faults, and the high computational cost required to obtain a good resolution remain the major drawbacks of this technique [8,9]. In recent years, artificial neural networks (ANNs) have been successfully used in fault prediction, and they solve motor-fault problems more simply, effectively, and flexibly [10,11,12]. The usual approach to applying ANNs is to extract features and to learn a mapping between machine conditions and the extracted features in order to identify faults. However, ANNs have an essential weakness: it is not easy to determine the size of the network architecture and related parameters in a dynamic environment [13]. Therefore, building a sound-based motor-fault diagnosis system to improve power maintenance personnel’s capability for motor-fault diagnosis is a direction of future work.

In research on motor-fault detection, the ANN is the most commonly used intelligent tool. Especially when monitoring the condition of motors, using appropriate ANN technology and reliable algorithms can avoid unexpected failures in industrial processes. Ref. [14] used common fault types for induction motors, such as compound faults for rotor imbalance, stator short circuit, and rotor, as the objectives of machine learning diagnoses and judgments. Ref. [15] used the correlation vector formed by the single fault type and the corresponding spectral characteristic relation to decompose the entire correlation matrix into independent correlation vectors and combined the statistical parameters of time-domain vibration signals with a Bayesian network for gear fault diagnosis. Ref. [16] employed frequency-axis proportional correction, load segmentation, and feature extraction methods to process the motor sound and used the generalized regression neural network to train and identify induction motor faults. Ref. [17] applied an artificial intelligence algorithm to solve the problem of fault diagnosis in a permanent magnet synchronous machine. Refs. [18,19] proposed a deep transfer learning strategy for the intelligent fault diagnosis of rotating machinery. Ref. [20] investigated a novel diagnosis algorithm for incipient electrical faults in an induction machine of wind power systems under time-varying conditions. Ref. [21] used the motor current normalized residual harmonic analysis method to diagnose induction motor faults. Multilayer artificial neural networks (MANNs) have been used for the diagnosis and classification of different faults in the stator windings of a permanent magnet synchronous motor [22]. In order to solve the problem of the compound fault diagnosis of diesel engine fuel, a diagnosis method based on generative adversarial networks and transfer learning was proposed in [23].
According to the literature reviewed above, the detection of motor faults depends largely on their severity and location. In addition, when different loads are applied to the motor, the sound signals also differ, which complicates fault detection. The challenge of this work is to detect the various types of motor faults and estimate their severity under different loads.

This study combined self-organizing maps (SOMs) [24,25] with long short-term memory (LSTM) [26,27] to build a sound-based motor-fault diagnosis system in order to solve the problem of motor-abnormality detection. The spectrogram data of FFT may be highly complex and multi-dimensional, and the correlation thereof may still be very complex. SOMs, which are thus used for data cluster analysis, are unsupervised machine learning algorithms. By importing the spectrogram data generated via FFT into SOMs, the features and correlation of the spectral data are mapped into the two-dimensional space, assisting in identifying and classifying various states in motor operation, from normal operation to different types of mechanical failures. As a powerful sequence model, LSTM has a memory and forgetting mechanism; it can effectively capture long-term dependencies in sequence data. The operational sound of the rotary motor is classified using the LSTM, and the effect of these data on the motor running state and failure mode is analyzed directly. This study also selected the orthogonal experiment design (OED) [28] to improve SOM-LSTM execution efficiency so that it has fast learning characteristics and robust analysis. Motor-fault diagnosis is an uncertain dynamic environment problem that involves many variables and restrictions. When the amount of prediction data is very large, problem solving will be more difficult. This paper proposes an SOM-LSTM calculation method and a sound conversion and OED mechanism to enhance the algorithm execution efficiency and increase the probability of finding the optimum solution so as to enhance the ability of the industrial circles to master motor operation information.

2. Sound Frequency Analysis Technology

A motor generates sound when it is running, and the sound data can be analyzed in depth through signal-processing technology and machine learning algorithms so as to implement the accurate monitoring of the motor status and fault warning. This study intended to identify the modes of normal operation of a motor and three fault types: out-of-phase fault, unbalanced voltage fault, and misalignment fault, as shown in Figure 1.

FFT is a key step in converting the motor operating sound from the time domain to the frequency domain. The original sound signal is decomposed via FFT into a series of components of different frequencies, and a spectrogram is generated to display the amplitude of each frequency component in order to understand the characteristic frequency and energy distribution of operational sound. The steps of the FFT algorithm are as follows [29]:

Prepare input signals: The signal data requiring frequency-domain analysis are obtained. They are usually discrete signals in the time domain.
Choose the FFT length: The length of FFT is determined; a power of 2 is usually chosen to accelerate the calculation using the FFT algorithm. If the length of the input signals is not an integer multiple of the FFT length, the required length can be achieved by filling zeros at the end of the signals.
Calculate FFT: The FFT algorithm is used to perform a Fourier transform on the zero-filled signals. The FFT algorithm converts the time-domain signals into frequency-domain signals, and the spectral information is generated.
Calculate the spectral amplitude: The spectral amplitude information is extracted from the FFT results; the energy distribution of the spectrum is usually represented using the amplitude spectrum.
Draw a spectrogram: The spectral amplitude or spectral phase information is visualized so as to analyze and understand the frequency-domain characteristics of signals.
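The FFT steps listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' Matlab implementation: the function names, the recursive radix-2 algorithm, the amplitude scaling by the FFT length, and the 1 kHz test tone sampled at 8 kHz are all assumptions chosen for the example.

```python
import cmath
import math


def next_power_of_two(n):
    """Smallest power of 2 that is >= n (and at least 1)."""
    p = 1
    while p < n:
        p *= 2
    return p


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def amplitude_spectrum(signal, sample_rate):
    """Zero-pad to a power of 2, transform, and return (freqs, amps) below Nyquist."""
    n = next_power_of_two(len(signal))
    padded = list(signal) + [0.0] * (n - len(signal))  # fill zeros at the end
    spectrum = fft(padded)
    half = n // 2
    freqs = [k * sample_rate / n for k in range(half)]
    amps = [abs(c) / n for c in spectrum[:half]]  # assumed scaling by FFT length
    return freqs, amps


# Hypothetical test tone (not from the paper): 1 kHz sine, 8 kHz sampling, 200 samples.
fs = 8000
tone = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(200)]
freqs, amps = amplitude_spectrum(tone, fs)
peak_freq = freqs[amps.index(max(amps))]
```

The 200 samples are padded to 256, so each bin spans 8000/256 = 31.25 Hz, and the largest amplitude falls in the bin centered on the tone frequency. Plotting `amps` against `freqs` would give the spectrogram of the last step.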

The time-domain signals of the sound are converted into frequency-domain signals, and the feature information of the signals in the frequency domain is obtained, such as the spectral amplitude and phase. The audio in this paper is sampled at 21,000 samples per second, and the same audio information is processed with different FFT lengths. The operating sound includes a normal operating mode and three failure modes. Each mode is divided into 16 operating sound samples with different loads, from light load to full load, so the total number of sound samples is 64. Figure 2 shows the no-load, half-load, and full-load frequency-domain signals of FFT with 128 Hz, 256 Hz, 512 Hz, and 1024 Hz.

3. Methodology

This paper divides the motor-operating sound-processing flow into three major items: FFT, SOM, and LSTM. First, the spectral data extraction of FFT is used to convert the motor operating sound into spectral data. The second step is the feature mapping of SOM; the features and correlation of spectral data are mapped into two-dimensional space. Finally, the sound of the rotary motor is classified using LSTM, and different types of operational sounds are identified automatically, thereby implementing the effective monitoring and classification of the operating status of the rotary motor. By analyzing the spectral data, different types of operational sounds are classified into four types, namely normal, out of phase, unbalanced voltage, and misalignment.

3.1. SOM-LSTM

In the SOM neural network, each neuron is located on a node of the map and is responsible for responding to a specific type of input pattern. Through the learning process, the neurons self-organize to reflect the topology of the input data so that similar data points are close to each other on the map and dissimilar data points are far apart. The steps of the SOM algorithm are as follows [30].

Step 1. Map unit initialization: The weight, $W_{i}$ , of each neuron $i$ is generated randomly.
Step 2. Find the best matching unit (BMU) for each input sample, $X$. The SOM looks for the BMU by calculating the similarity (or distance) between $X$ and the weight, $W_{i}$, of every neuron. The similarity measure is calculated as per Equation (1):

$d(X, W_{i}) = \sqrt{\sum_{j = 1}^{m} (x_{j} - w_{i j})^{2}}$ (1)

where $m$ is the dimension of the vector and $d(X, W_{i})$ is the distance between the input sample $X$ and the weight of neuron $i$; the neuron with the smallest distance is the BMU.

Step 3. Update neighborhood weights: Move the weights of the BMU and of the neurons in its neighborhood closer to the input sample $X$ in order to perform learning. The adjustment of the weights is expressed as Equation (2):

$W_{i}(t + 1) = W_{i}(t) + \theta(i, \mathrm{BMU}, t) \times \alpha(t) \times [X - W_{i}(t)]$ (2)

where $W_{i}(t + 1)$ is the weight of neuron $i$ at time $(t + 1)$. $\theta(i, \mathrm{BMU}, t)$ is the neighborhood function, which determines how much neuron $i$ is influenced according to its distance from the BMU. The neighborhood function usually decreases as the distance increases, and it decreases with time. $\alpha(t)$ is the learning rate, which decreases with time to ensure that the learning process is stabilized gradually.

Step 4. Reduce the learning rate: The neighborhood function $\theta(i, \mathrm{BMU}, t)$ is usually a distance- and time-varying function, ensuring that the network has a greater influence in the early stages of learning. As time goes by, the affected neighborhood will gradually shrink, and the learning process is refined gradually. The neighborhood function used in this paper is a Gaussian function, expressed as Equation (3):

$\theta(i, \mathrm{BMU}, t) = \exp\left(-\frac{d^{2}(i, \mathrm{BMU})}{2\sigma^{2}(t)}\right)$ (3)

where $d(i, \mathrm{BMU})$ is the distance between the neuron $i$ and the BMU on the network map. This distance usually refers to the grid distance on the network map, e.g., the Euclidean distance. $\sigma(t)$ is the neighborhood width that decreases with time, controlling the shape of the neighborhood function. In the early stage of learning, $\sigma(t)$ is relatively large, allowing more neurons to be affected by the BMU; as learning progresses, $\sigma(t)$ decreases gradually so that the learning process further focuses on the immediate neighborhood of the BMU.

Step 5. Repeat iteration: Repeat Steps 2 to 4 until the termination condition is met. The termination condition in this paper is 3000 iterations.
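The SOM steps above can be illustrated with a short pure-Python sketch. It follows Equations (1) to (3) with an m × m grid, but several details are assumptions not stated in the text: the exponential decay schedule for $α(t)$ and $σ(t)$, the starting values `alpha0` and `sigma0`, drawing one random sample per iteration, and the toy two-cluster data.

```python
import math
import random


def distance(x, w):
    """Euclidean distance of Equation (1)."""
    return math.sqrt(sum((xj - wj) ** 2 for xj, wj in zip(x, w)))


def find_bmu(weights, x):
    """Step 2: index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: distance(x, weights[i]))


def train_som(samples, m, iterations=3000, alpha0=0.5, sigma0=None, seed=0):
    """Train an m x m SOM; returns a list of m*m weight vectors (row-major grid)."""
    rng = random.Random(seed)
    dim = len(samples[0])
    if sigma0 is None:
        sigma0 = m / 2.0  # assumed initial neighborhood width
    # Step 1: random initial weights.
    weights = [[rng.random() for _ in range(dim)] for _ in range(m * m)]
    coords = [(i // m, i % m) for i in range(m * m)]
    for t in range(iterations):
        decay = math.exp(-t / iterations)  # assumed decay schedule
        alpha = alpha0 * decay             # learning rate alpha(t)
        sigma = sigma0 * decay             # neighborhood width sigma(t)
        x = samples[rng.randrange(len(samples))]
        bmu = find_bmu(weights, x)
        for i, w in enumerate(weights):
            # Squared grid distance between neuron i and the BMU.
            d2 = ((coords[i][0] - coords[bmu][0]) ** 2
                  + (coords[i][1] - coords[bmu][1]) ** 2)
            theta = math.exp(-d2 / (2.0 * sigma ** 2))  # Equation (3)
            for j in range(dim):
                w[j] += theta * alpha * (x[j] - w[j])   # Equation (2)
    return weights


# Toy data (hypothetical): two small clusters in a 2-dimensional feature space.
clusters = [[0.0, 0.1], [0.1, 0.0], [0.9, 1.0], [1.0, 0.9]]
som = train_som(clusters, m=3, iterations=500)
bmu_low = find_bmu(som, [0.0, 0.0])
bmu_high = find_bmu(som, [1.0, 1.0])
```

In the paper's setting, each sample would be a spectral feature vector rather than a 2-dimensional point, and `iterations` would be 3000.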

The LSTM is a special recurrent neural network for processing and predicting the time series events in sequence data. Each unit structure contains the following: (1) a forget gate, (2) an input gate, (3) an output gate, and (4) a cell state. The LSTM calculation process allows the cell to decide what information to save from the input data, what outdated information to forget, and when to transfer information from the cell state to the output. LSTM has some unique advantages compared to GRU [31] and convolutional neural networks (CNNs) [32]. LSTM has a more complex structure, which makes it better at memorizing long-sequence data and retaining information. LSTM can effectively handle long-term dependency problems, while GRU is relatively simplified and may be less effective in dealing with long-sequence dependencies. CNN is good at processing images and local features, but its performance with sequence data is usually not as good as that of LSTM. Therefore, LSTM will have more advantages than GRU and CNNs in model construction that requires memorizing long-term information. The main calculation steps are as follows [33,34]:

Step 1. Setting the forget gate: The forget gate decides what information to forget from the cell state. It calculates the output, $f_{t}$ , of the forget gate through the current input, $x_{t}$ , and the hidden state, $h_{t - 1}$ , at the previous time point, expressed as Equation (4):

$f_{t} = \sigma(W_{f} \cdot [h_{t - 1}, x_{t}] + b_{f})$ (4)

where $W_{f}$ is the weight of the forget gate, $b_{f}$ is the bias term of the forget gate, and $\sigma$ is the sigmoid (S-shaped) function, expressed as Equation (5), whose output value lies between 0 and 1 and determines the proportion of forgetting.

$\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}$ (5)

Step 2. Setting the input gate: The new information of the updated cell state is determined through the vector $\tilde{C_{t}}$ with the candidate value range of [−1, +1] created with a $t a n h$ layer, expressed as Equation (6):

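$\tilde{C_{t}} = \tanh(W_{C} \cdot [h_{t - 1}, x_{t}] + b_{C})$ (6)

where $W_{C}$ is the weight of the input gate and $b_{C}$ is the bias term of the input gate. The input gate activation $i_{t}$ used in Equation (7) takes the same sigmoid form as Equation (4), with its own weight and bias term.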

Step 3. Cell-state update: The cell state, $C_{t}$ , is combined with the information of the forget gate and the input gate into the updated state, expressed as Equation (7):

$C_{t} = f_{t} \times C_{t - 1} + i_{t} \times \tilde{C_{t}}$ (7)

where $f_{t}$ determines how much of the previous cell state to retain, while $i_{t} \times \tilde{C_{t}}$ adds new candidate values.

Step 4. Setting the output gate: The output gate will determine which data based on the cell state will be exported, expressed as Equation (8). This output value, $h_{t}$ , will be used as the output at the current time point and as the hidden state at the next time point, expressed as Equation (9):

$o_{t} = \sigma(W_{o} \cdot [h_{t - 1}, x_{t}] + b_{o})$ (8)

$h_{t} = o_{t} \times \tanh(C_{t})$ (9)

where $o_{t}$ is the activation of the output gate, which determines which parts to export. The final output, $h_{t}$, is the product of this activation value and the $\tanh$ value of the cell state, which ensures that the output value lies between −1 and 1.
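One time step of the LSTM calculation in Equations (4) and (6) to (9) can be sketched as follows. The section does not write out the input-gate equation, so the sketch assumes the standard form $i_{t} = σ(W_{i} \cdot [h_{t-1}, x_{t}] + b_{i})$. The dictionary keys, the helper names, and the toy dimensions are hypothetical and are not taken from the authors' Matlab code.

```python
import math


def sigmoid(x):
    """Logistic function: 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, b, v):
    """Matrix-vector product W.v plus bias b, using plain lists."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, p):
    """Advance the cell by one time step and return (h_t, C_t)."""
    z = h_prev + x_t  # list concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(p["W_f"], p["b_f"], z)]          # Eq. (4)
    i = [sigmoid(a) for a in affine(p["W_i"], p["b_i"], z)]          # assumed input gate
    c_tilde = [math.tanh(a) for a in affine(p["W_C"], p["b_C"], z)]  # Eq. (6)
    c = [ft * cp + it * ct
         for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]           # Eq. (7)
    o = [sigmoid(a) for a in affine(p["W_o"], p["b_o"], z)]          # Eq. (8)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                 # Eq. (9)
    return h, c


def constant_params(hidden, inputs, value):
    """Hypothetical helper: every weight set to `value`, every bias set to 0."""
    def mat():
        return [[value] * (hidden + inputs) for _ in range(hidden)]
    params = {}
    for gate in ("f", "i", "C", "o"):
        params["W_" + gate] = mat()
        params["b_" + gate] = [0.0] * hidden
    return params


# Toy run: 2 hidden units, 3 input features, a sequence of 2 time steps.
params = constant_params(hidden=2, inputs=3, value=0.1)
h, c = [0.0, 0.0], [0.0, 0.0]
for x in ([1.0, 0.5, -0.5], [0.2, 0.1, 0.0]):
    h, c = lstm_step(x, h, c, params)
```

With all weights and biases set to zero, every gate outputs 0.5 and the candidate is 0, so the step reduces to $C_{t} = 0.5\,C_{t-1}$, which is a convenient sanity check on the update.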

The SOM obtains feature-window signals of the multi-frequency domain from FFT for data-dimension reduction and feature mapping, and it extracts the operating characteristics of the motor at more frequencies and in different dimensions. The multi-dimensional feature data from the multi-frequency SOM are used as the LSTM input data. Figure 3 shows the network structure of SOM-LSTM. The LSTM input layer receives the feature maps of the SOM for the frequencies 128, 256, 512, and 1024 Hz. The LSTM output is a one-dimensional number denoting the normal state, out-of-phase fault, unbalanced voltage fault, or misalignment fault; the parameter settings are the cell state ($C_{t}$) and memory capacity ($f_{t}$).

The input data for the system consists of motor sound signals, which are captured using a high-quality microphone placed near the motor during its operation. The microphone records the sound at a sampling rate of 21,000 samples per second to ensure that a broad range of frequencies is captured, allowing for an accurate analysis of the motor’s operational status. The SOM-LSTM involves several steps:

1. The preparation of input signals:

The recorded sound signals are first digitized, resulting in discrete time-domain signals. These signals represent the amplitude of the sound wave at each sampled point in time. The FFT then decomposes the sound signal into its constituent frequencies, providing a detailed view of the motor’s acoustic profile. For example, if the 128 Hz data of FFT are input, the input terminal receives 128 data points at a time. Similarly, if the 1024 Hz data of FFT are input, the input terminal receives 1024 data points at a time. This paper passes the FFT data through the 2-dimensional topology structure of SOM, m = [3, 4, 5], so that only a minimum of 9 (3 × 3) and a maximum of 25 (5 × 5) data entries are needed as input.

2. Calculating the spectral amplitude:

The output of the FFT is a complex-valued array that contains both amplitude and phase information for each frequency component. The spectral amplitude, which indicates the strength or energy of each frequency, is extracted from this array. The amplitude spectrum is then analyzed to identify the dominant frequencies and their corresponding amplitudes, which are characteristic of the motor’s operating condition. Through this FFT-based preprocessing, the raw motor sound signals are transformed into a format that is suitable for further analysis, allowing for more precise monitoring and fault detection.

3. Feature extraction using SOM:

After preprocessing the motor sound signals with FFT to obtain frequency-domain representations, these spectral data are fed into the self-organizing map (SOM). The input to the SOM consists of the spectral data obtained from FFT analysis. This data includes the amplitude values of various frequencies for each sound sample. As a result of training, the SOM clusters similar spectral features close to each other on the map, while dissimilar features are positioned further apart. This spatial organization helps visually distinguish different types of motor sounds based on their frequency content.

4. Temporal analysis and classification with LSTM:

Once the SOM has processed and mapped the spectral features, these mapped features are then used as input for the long short-term memory (LSTM) network. Based on the learned patterns, the LSTM classifies the motor sound into one of four categories: normal operation, out-of-phase fault, unbalanced voltage fault, or misalignment fault. This classification is achieved by evaluating the output state of the LSTM after processing the entire sequence of input features [35].

5. Evaluation of the output:

The output of the LSTM is a single-dimensional prediction that indicates the operational state of the motor. By using SOM to extract and map features, followed by LSTM to learn temporal patterns and classify the sounds, this combined approach effectively evaluates the motor’s operational state and provides a robust method for monitoring and diagnosing potential faults.

3.2. Same-Frequency-Ratio Axis Correction

Figure 2 shows that, in the FFT spectral data of the motor running sound, the maximum amplitude does not occur at the same frequency under different loads, for example, under the no-load, half-load, and full-load conditions at 128 Hz. This paper uses frequency-ratio axis correction, namely Equation (10), to calculate the position of the maximum amplitude of different loads at the same frequency in the spectral data.

$Fm_{A} = \frac{\sum_{m = 1}^{16} \max F_{A}}{M}$ (10)

where $\max F_{A}$ is the spectral position of the maximum amplitude of different loads at the same frequency, $m = 1$ corresponds to the spectral position of the maximum amplitude at no load, $M = 16$ corresponds to the spectral position of the maximum amplitude at full load, and $Fm_{A}$ is the average value of the maximum amplitude of the same frequency. The maximum amplitude of different loads at the same frequency is moved to the spectral position of $Fm_{A}$ in order to achieve the same frequency-ratio axis correction.
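A minimal sketch of this correction, under one reading of Equation (10), is given below: the peak position of each load's spectrum is located, the positions are averaged, and every spectrum is shifted so that its peak lands on the averaged position. Rounding the average to a whole bin and zero-filling the bins vacated by the shift are assumptions, as are the function names and the toy spectra.

```python
def peak_index(spectrum):
    """Bin index of the maximum amplitude (first one if there are ties)."""
    return max(range(len(spectrum)), key=lambda k: spectrum[k])


def shift_spectrum(spectrum, offset):
    """Shift bins by `offset` (positive = toward higher bins), zero-filling gaps."""
    n = len(spectrum)
    out = [0.0] * n
    for k, amplitude in enumerate(spectrum):
        j = k + offset
        if 0 <= j < n:
            out[j] = amplitude
    return out


def align_peaks(spectra):
    """Move every spectrum's peak to the average peak position, as in Eq. (10)."""
    peaks = [peak_index(s) for s in spectra]
    target = round(sum(peaks) / len(peaks))  # averaged position, rounded to a bin
    aligned = [shift_spectrum(s, target - p) for s, p in zip(spectra, peaks)]
    return target, aligned


# Toy spectra (hypothetical): three loads whose peaks sit in bins 2, 3, and 4.
loads = [
    [0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 5.0, 1.0, 0.0, 0.0],
]
target, aligned = align_peaks(loads)
```

For the paper's data, `spectra` would hold the 16 load spectra of one operating mode at one FFT length.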

The frequency-domain data volume of the operational sound directly increases the complexity of the SOM calculation. Because the distance between each input vector and the weight vectors of all neurons on the map needs to be calculated in each iteration, a larger data volume makes the calculation take more time. In a high-dimensional space, it also becomes more difficult to find meaningful local structures, which may affect the mapping quality learned through SOM. Therefore, this paper proposes the selection of a frequency-feature window of the operational sound, that is, a feature-window selection for the FFT spectral data. The feature window reduces the spectral data volume, and it enhances or highlights the ratio-frequency components in the signals to improve the accuracy and performance of spectral analysis. When a window method is used to obtain the signal characteristics, the requirements and the performance characteristics of the window function are analyzed, as well as the possible computational costs. In general, it is most effective to conduct experiments and comparisons based on specific requirements and select the most suitable window method. The selection of the frequency-feature window after the same-frequency-ratio axis correction is shown in Figure 4.

3.3. Taguchi Method

The input layer of LSTM is a cluster analysis of the topology of input frequencies (128, 256, 512, and 1024 Hz) of SOM. There are many parameters in the LSTM model, such as the number of units in the hidden layer, the maximum number of iterations, and memory capacity, which should be adjusted to obtain the best performance. The number of units, $C_{t}$, in the LSTM layer determines the complexity of the model. The value of $f_{t}$ is how much of the memory capacity of the previous unit is retained. The parameter settings to be determined in this paper are the cell state ($C_{t}$) and memory capacity ($f_{t}$). The output layer is represented by the number of the type: Type 1 = normal, Type 2 = out-of-phase voltage, Type 3 = unbalanced voltage, and Type 4 = axis misalignment. The OED of the Taguchi method looks for the optimal combination of the SOM dimension parameter $m$ and the two LSTM parameters $C_{t}$ and $f_{t}$ within a limited number of experiments. The SOM dimension parameter $m = [3, 4, 5]$ and the two LSTM parameters $C_{t} = [100, 125, 150]$ and $f_{t} = [0.4, 0.5, 0.6]$ each take 3 levels, so the 3-level, 3-factor OED is used to construct the experiment matrix shown in Table 1.

The 3-level, 3-factor experimental design of the OED is used to analyze the performance of SOM and LSTM in different parameter combinations. The optimal parameter combination is selected as the best parameter combination from SOM-LSTM to improve the processing capacity and resolution for the operational sound of the rotary motor.
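To illustrate how a 3-level, 3-factor orthogonal design reduces the 27 possible parameter combinations to a handful of runs, the sketch below builds the first three columns of the standard Taguchi L9 array. Table 1 is not reproduced in this text, so assuming that the paper's matrix is the standard L9 array is a guess; the function names and the dictionary layout are hypothetical.

```python
from itertools import product

# Factor levels as given in the text.
LEVELS = {
    "m": [3, 4, 5],
    "C_t": [100, 125, 150],
    "f_t": [0.4, 0.5, 0.6],
}


def l9_rows():
    """First three columns of the standard L9 array, as 0-based level indices."""
    # The third column is (first + second) mod 3, which keeps every pair of
    # columns balanced: each pair of levels appears exactly once.
    return [(a, b, (a + b) % 3) for a, b in product(range(3), repeat=2)]


def l9_experiments(levels):
    """Map the level indices of each row to concrete parameter values."""
    names = list(levels)
    return [
        {name: levels[name][idx] for name, idx in zip(names, row)}
        for row in l9_rows()
    ]


experiments = l9_experiments(LEVELS)
for number, setting in enumerate(experiments, start=1):
    print("Exp.", number, setting)
```

Under this assumption, the eighth row is m = 5, $C_{t}$ = 125, $f_{t}$ = 0.4, which agrees with the parameters the paper selects for Exp. 8; this is consistent with, but does not confirm, the standard L9 layout.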

4. Case Test

This paper uses a microphone as the sound pickup device, and it converts the analog audio signals into digital signals through an analog/digital (A/D) adapter card. The sampling period is 16 ms, there are 256 sampling points, and the sampling rate is 22,050 (data/sec). The experimental motor is a four-pole, three-phase, 2.4 kW, 220 V, 60 Hz, 1800 rpm induction motor. Figure 5 shows how the sampled sound is stored. The sampled sound is digitally stored in the memory card according to the sample period, sample points, and sample rate through the A/D adapter card.

This paper plans four operating conditions for simulation, which are described below.

Type 1 = normal: The normal condition is important data in the database, and the monitoring results of the trouble-free operation of the motor can be used as a reference for other fault conditions.
Type 2 = out-of-phase fault: The out-of-phase fault is simulated under the original operating condition of the three-phase voltage power supply; one phase of the power supply is cut off, and the remaining two phases supply power, resulting in the motor voltage out of phase.
Type 3 = unbalanced fault: The unbalanced fault is simulated, and the single-phase autotransformer will be adjusted so that the voltage of phase 1 is less than the voltage of the other 2 phases, resulting in the unbalanced voltage of the motor.
Type 4 = axis-misalignment fault: half of the screws of the coupling are loosened, and the other half are tightened, leading to a coupling misalignment in the operation of the motor.

Under the above four conditions, this paper measures the operational sound of the motor. The load ranges from no load to full load and is divided into 16 segments of operational sound; No. 1 is no load, No. 8 is half-load, and No. 16 is full load. The operational sound of the rotary motor is recorded under each load. Four states, Type 1 = normal, Type 2 = out-of-phase fault, Type 3 = unbalanced fault, and Type 4 = axis-misalignment fault, are recorded. The data set is divided into two parts, the training data and the testing data. The training data are used for training and updating the biases and learning rates. The testing data are used to test the proposed methods after training. The simulation is implemented with Matlab on an Intel Core i5-7300HQ 2.5 GHz computer with 16 GB of RAM.

4.1. Taguchi Parameter Design

The parameter combination for the three levels and three factors of OED is shown in Table 2.

This paper uses the 128, 256, 512, and 1024 Hz frequencies of the audio data. It also adopts the Taguchi orthogonal array for the experimental design and uses the data of the SOM cluster feature analysis as the input layer of the LSTM. The simulation results are shown in Table 3, Table 4, Table 5 and Table 6, which show the parameter analysis of 128 Hz, 256 Hz, 512 Hz, and 1024 Hz, respectively. The identification rate (%) is the operating-status decision value of the SOM-LSTM output value. Table 3 shows the best identification rate of SOM-LSTM (128 Hz), which is 79.69% for Exp. 8. The best parameters are $m$ = 5, $C_{t}$ = 125, and $f_{t}$ = 0.5. Table 4 shows the best identification rate for SOM-LSTM (256 Hz), which is 85.94% for Exp. 9. The best parameters are $m$ = 5, $C_{t}$ = 125, and $f_{t}$ = 0.5. Table 5 shows the best identification rate for SOM-LSTM (512 Hz), which is 90.63% for Exp. 8. The best parameters are $m$ = 5, $C_{t}$ = 125, and $f_{t}$ = 0.4. Table 6 shows the best identification rate for SOM-LSTM (1024 Hz), which is 85.94% for Exp. 9. The best parameters are $m$ = 5, $C_{t}$ = 125, and $f_{t}$ = 0.5. Table 5 also reports an identification rate of 86.72% for Exp. 8, with the parameters $m$ = 5, $C_{t}$ = 125, and $f_{t}$ = 0.4. The experimental parameters and optimal identification rates are shown in Table 2, Table 3, Table 4 and Table 5; Exp. 8 has the best identification rate. Therefore, this paper chooses the Exp. 8 experimental design of the Taguchi method. The dimension parameter $m$ = 5 of the SOM and the two parameters $C_{t}$ = 125 and $f_{t}$ = 0.4 of the LSTM are the optimal parameter values, and they are used as the optimal parameters of SOM-LSTM.

4.2. Operational Sound Identification of SOM-LSTM

Figure 6 shows the simulated identification curves for the four types: the normal type, the out-of-phase fault type, the unbalanced fault type, and the axis-misalignment fault type. If the fault type is identified correctly, the training and prediction signals overlap. If the fault type is judged incorrectly, the signals are separated. The deviation errors for Type 1 = normal, Type 2 = out-of-phase fault, Type 3 = unbalanced fault, and Type 4 = axis-misalignment fault are 3.44%, 3.98%, 4.56%, and 4.98%, respectively. Although some deviation error is inevitable in the diagnosis process, the identification results of SOM-LSTM closely match the actual fault conditions, and the deviation errors for the four types remain within the acceptable range. This indicates that SOM-LSTM is able to find better solutions.

4.3. Performance Test

In order to verify the accuracy of SOM-LSTM, this paper uses the same data to train a radial basis-function neural network (RBFNN) under the experimental design in Table 7 and compares SOM-LSTM with SOM-RBFNN. The experimental designs of Exp. 1–9 are performed. The SOM-RBFNN parameter levels in this paper are m = [3, 4, 5], σ = [0.01, 0.1, 1], and γ = [0.3, 0.4, 0.5]. According to Table 7, the SOM-RBFNN of Exp. 7 has a training accuracy of 100% under the parameters m = 5, σ = 0.01, and γ = 0.5, but its test recognition rate is lower than expected, at only 54.69%, and the average accuracy rate between training and testing is 77.345%. The SOM-LSTM of Exp. 9 also has a training accuracy of 100% under the parameters m = 5, $C_{t}$ = 150, and $f_{t}$ = 0.5, and its test recognition rate is 92.19%, giving an average accuracy rate between training and testing of 96.095%. According to the experimental analysis, the SOM-LSTM of Exp. 9 outperforms the SOM-RBFNN of Exp. 7, with a markedly higher identification rate.
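
The two average accuracy rates quoted above follow directly from the Table 7 entries. The following worked arithmetic is a small sketch with illustrative variable names, not code from the paper.

```python
def average_accuracy(train_pct, test_pct):
    """Mean of the training and testing identification rates (in percent)."""
    return (train_pct + test_pct) / 2.0


# Values copied from Table 7.
som_lstm_exp9 = average_accuracy(100.00, 92.19)   # SOM-LSTM, Exp. 9
som_rbfnn_exp7 = average_accuracy(100.00, 54.69)  # SOM-RBFNN, Exp. 7

# Difference between the two averages, in percentage points.
gap = som_lstm_exp9 - som_rbfnn_exp7

print(f"SOM-LSTM Exp. 9:  {som_lstm_exp9:.3f}%")
print(f"SOM-RBFNN Exp. 7: {som_rbfnn_exp7:.3f}%")
print(f"Gap: {gap:.2f} percentage points")
```

Because both models reach 100% on the training samples, the whole gap comes from the testing samples.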

This paper repeats Exp. 9 of SOM-LSTM 100 times, using the same parameters for each run, and calculates the best, worst, and average solutions over the 100 runs. Identification rates for the worst, best, and average values are recorded together with the number of generations to converge and the average execution time, as shown in Table 8. The average execution time is about 1.231 s. For the average solution in Table 8, the training sample identification rate is 98.44%, the test sample identification rate is 87.50%, and the total identification rate is 92.97%. The proposed algorithm therefore performs well in terms of both solution quality and average execution time. The RMSE training error of the SOM-LSTM of Exp. 9 is shown in Figure 7. Over iterations No. 701 to No. 750 of SOM-LSTM training, the average RMSE is 0.3248.
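
A minimal sketch of how such robustness statistics and an RMSE value could be computed is given below. The RMSE uses the standard root-mean-square definition, which is assumed here because the paper does not spell out its formula; the values in `example_rates` are hypothetical placeholders, not the 100 measured runs.

```python
import math
import statistics


def rmse(targets, predictions):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if not targets or len(targets) != len(predictions):
        raise ValueError("targets and predictions must be non-empty and equal in length")
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def summarize_runs(rates):
    """Best, worst, and average identification rate over repeated runs."""
    return {
        "best": max(rates),
        "worst": min(rates),
        "average": statistics.mean(rates),
    }


# Hypothetical identification rates (%) from four repeated runs.
example_rates = [92.19, 78.13, 87.50, 90.63]
summary = summarize_runs(example_rates)
print(summary)

# Hypothetical targets and predictions for a quick RMSE check.
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

With the full set of 100 run results in place of `example_rates`, the same summary would correspond to the Best, Worst, and Average rows of Table 8.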

5. Conclusions

This paper has demonstrated the advantages of multi-model combination. FFT, SOM, and LSTM are combined, and the parameters of SOM-LSTM are optimized using the experimental design of the Taguchi method, which allows the model to exhibit its full capability. The execution efficiency of classification and monitoring is improved, providing an effective solution for the identification of motor operating sounds. The method covers data processing, feature extraction, and model building. The preliminary processing and extraction of data are considered, and deep learning technology is used for model building, ensuring a comprehensive analysis and monitoring of motor operating sounds and improving the accuracy and reliability of classification and monitoring. SOM-LSTM is used to construct a sound-based motor-fault diagnosis system that addresses the problem of motor-abnormality detection. This work is expected to help in the preventive maintenance of motor equipment, making this work more efficient, saving time and costs, and enhancing the capability of industry to monitor motor operation information.

Author Contributions

C.-S.T. is the first author. He generalized the novel algorithms and designed system-planning projects. C.-K.C. contributed material tools and the experiments and conducted simulations. M.-T.T. assisted in the performance of the project, modeled the theory, conducted formal analysis, and prepared the manuscript as the corresponding author. All authors were involved in exploring system validation. All authors have read and agreed to the published version of the manuscript.

Funding

The APC was funded by the National Science and Technology Council 2021 research and innovation program, grant number NSC 110-2221-E-230-003.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data are unavailable due to privacy restrictions.

Acknowledgments

We would like to thank the National Science and Technology Council, Taiwan, for its financial support (Grant Number: NSC 110-2221-E-230-003).

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Diagram of fault identification.

Figure 2. The frequency-domain signals of no load, half a load, and a full load for 128 Hz, 256 Hz, 512 Hz, and 1024 Hz.

Figure 3. The network structure of SOM-LSTM.

Figure 4. The selection of the frequency-feature window after the same-frequency-ratio axis correction.

Figure 5. The diagram of the sampled sound stored.

Figure 6. The simulated curves of four fault identifications.

Figure 7. The RMSE training error of SOM-LSTM.

Table 1. The 3 levels and 3 factors of the OED.

Exp.	Factor 1	Factor 2	Factor 3
1	1	1	1
2	1	2	2
3	1	3	3
4	2	1	2
5	2	2	3
6	2	3	1
7	3	1	3
8	3	2	1
9	3	3	2
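
The defining property of an orthogonal array such as Table 1 is that, for every pair of factor columns, each combination of levels appears equally often. The following sketch checks this property for the nine rows of Table 1; the function name and the check itself are illustrative additions, not part of the paper.

```python
from itertools import combinations

# Level assignments for Exp. 1-9, copied from Table 1 (three factors, three levels).
L9_ARRAY = [
    (1, 1, 1),
    (1, 2, 2),
    (1, 3, 3),
    (2, 1, 2),
    (2, 2, 3),
    (2, 3, 1),
    (3, 1, 3),
    (3, 2, 1),
    (3, 3, 2),
]


def is_orthogonal(array, levels=3):
    """Return True if every pair of columns contains each level pair equally often."""
    expected = len(array) / (levels * levels)
    for a, b in combinations(range(len(array[0])), 2):
        pairs = [(row[a], row[b]) for row in array]
        for x in range(1, levels + 1):
            for y in range(1, levels + 1):
                if pairs.count((x, y)) != expected:
                    return False
    return True


print(is_orthogonal(L9_ARRAY))
```

This balance is what lets nine experiments stand in for the 27 runs of a full three-factor, three-level design.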

Table 2. The parameter combination for three levels and three factors of OED.

Exp.	SOM $m$	LSTM $C_{t}$	LSTM $f_{t}$
1	3	100	0.4
2	3	125	0.5
3	3	150	0.6
4	4	100	0.5
5	4	125	0.6
6	4	150	0.4
7	5	100	0.6
8	5	125	0.4
9	5	150	0.5
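
Table 2 can be obtained by substituting the three candidate values of each parameter into the level assignments of Table 1. The sketch below shows that mapping; the dictionary keys `m`, `C_t`, and `f_t` are illustrative names for the SOM dimension and the two LSTM parameters.

```python
# Level assignments for Exp. 1-9, copied from Table 1.
L9_ARRAY = [
    (1, 1, 1), (1, 2, 2), (1, 3, 3),
    (2, 1, 2), (2, 2, 3), (2, 3, 1),
    (3, 1, 3), (3, 2, 1), (3, 3, 2),
]

# Candidate values for levels 1, 2, and 3 of each factor, read off Table 2.
LEVEL_VALUES = {
    "m": [3, 4, 5],
    "C_t": [100, 125, 150],
    "f_t": [0.4, 0.5, 0.6],
}


def build_design(array, level_values):
    """Map each row of level indices to a dictionary of parameter values."""
    names = list(level_values)
    return [
        {name: level_values[name][level - 1] for name, level in zip(names, row)}
        for row in array
    ]


design = build_design(L9_ARRAY, LEVEL_VALUES)
for number, params in enumerate(design, start=1):
    print(number, params)
```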

Table 3. Parameter analysis at 128 Hz.

Exp.	SOM $m$	LSTM $C_{t}$	LSTM $f_{t}$	Training Sample Identification Rate (%)	Testing Sample Identification Rate (%)
1	3	100	0.4	87.50	46.88
2	3	125	0.5	82.81	64.06
3	3	150	0.6	82.81	59.38
4	4	100	0.5	85.94	54.69
5	4	125	0.6	73.44	28.13
6	4	150	0.4	84.38	59.38
7	5	100	0.6	60.94	82.81
8	5	125	0.4	89.06	70.31
9	5	150	0.5	87.50	60.94
Average				81.60	58.51

Table 4. Parameter analysis at 256 Hz.

Exp.	SOM $m$	LSTM $C_{t}$	LSTM $f_{t}$	Training Sample Identification Rate (%)	Testing Sample Identification Rate (%)
1	3	100	0.4	81.25	54.69
2	3	125	0.5	81.25	71.88
3	3	150	0.6	82.81	81.25
4	4	100	0.5	64.06	62.50
5	4	125	0.6	81.25	81.25
6	4	150	0.4	67.19	31.25
7	5	100	0.6	71.88	73.44
8	5	125	0.4	82.81	62.50
9	5	150	0.5	87.50	84.38
Average				77.78	67.01

Table 5. Parameter analysis at 512 Hz.

Exp.	SOM $m$	LSTM $C_{t}$	LSTM $f_{t}$	Training Sample Identification Rate (%)	Testing Sample Identification Rate (%)
1	3	100	0.4	64.06	76.56
2	3	125	0.5	85.94	64.06
3	3	150	0.6	79.69	71.88
4	4	100	0.5	53.13	42.19
5	4	125	0.6	81.25	81.25
6	4	150	0.4	87.50	75.00
7	5	100	0.6	84.38	82.81
8	5	125	0.4	96.88	84.38
9	5	150	0.5	84.38	79.69
Average				79.69	73.09
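
A common way to analyse a Taguchi design is a main-effects calculation: for each factor, the response is averaged over the experiments that share a level, and the level with the highest average is preferred. The sketch below applies this to the 512 Hz results, using the mean of the training and testing rates as the response. Both the main-effects analysis and this choice of response are assumptions made for illustration; the paper itself selects the best single experiment.

```python
from collections import defaultdict

FACTORS = ("m", "C_t", "f_t")

# Parameter values for Exp. 1-9, copied from Table 2.
DESIGN = [
    (3, 100, 0.4), (3, 125, 0.5), (3, 150, 0.6),
    (4, 100, 0.5), (4, 125, 0.6), (4, 150, 0.4),
    (5, 100, 0.6), (5, 125, 0.4), (5, 150, 0.5),
]

# (training %, testing %) for Exp. 1-9, copied from Table 5 (512 Hz).
RATES_512 = [
    (64.06, 76.56), (85.94, 64.06), (79.69, 71.88),
    (53.13, 42.19), (81.25, 81.25), (87.50, 75.00),
    (84.38, 82.81), (96.88, 84.38), (84.38, 79.69),
]


def main_effects(design, responses):
    """Average response for each value of each factor."""
    grouped = {name: defaultdict(list) for name in FACTORS}
    for row, response in zip(design, responses):
        for name, value in zip(FACTORS, row):
            grouped[name][value].append(response)
    return {
        name: {value: sum(ys) / len(ys) for value, ys in by_value.items()}
        for name, by_value in grouped.items()
    }


# Assumed response: mean of the training and testing rates.
responses = [(train + test) / 2.0 for train, test in RATES_512]
effects = main_effects(DESIGN, responses)
best_levels = {name: max(levels, key=levels.get) for name, levels in effects.items()}
print(best_levels)
```

On the Table 5 data, the preferred levels are m = 5, C_t = 125, and f_t = 0.4, which coincide with the Exp. 8 parameters chosen in the paper.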

Table 6. Parameter analysis at 1024 Hz.

Exp.	SOM $m$	LSTM $C_{t}$	LSTM $f_{t}$	Training Sample Identification Rate (%)	Testing Sample Identification Rate (%)
1	3	100	0.4	79.69	84.38
2	3	125	0.5	81.25	90.63
3	3	150	0.6	78.13	84.38
4	4	100	0.5	70.31	81.25
5	4	125	0.6	81.25	62.50
6	4	150	0.4	84.38	67.19
7	5	100	0.6	82.81	79.69
8	5	125	0.4	85.94	87.50
9	5	150	0.5	75.00	73.44
Average				79.86	78.99

Table 7. The performance test of SOM-LSTM and SOM-RBFNN.

Exp.	SOM-LSTM Training Sample (%)	SOM-LSTM Testing Sample (%)	SOM-RBFNN Training Sample (%)	SOM-RBFNN Testing Sample (%)
1	100.00	78.13	75.00	46.88
2	100.00	84.38	87.50	51.56
3	100.00	57.81	76.56	45.31
4	100.00	84.38	79.69	60.94
5	100.00	57.81	64.06	45.31
6	100.00	84.38	73.44	51.56
7	100.00	70.31	100.00	54.69
8	98.44	70.31	90.63	43.75
9	100.00	92.19	92.19	43.75
Average	99.83	75.52	82.12	49.31
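
The Average row of Table 7 can be reproduced from the nine experiment rows, and the same data show how often SOM-LSTM beats SOM-RBFNN on the testing samples. The sketch below is an illustrative check with hypothetical function and variable names.

```python
# Rows for Exp. 1-9, copied from Table 7:
# (SOM-LSTM training, SOM-LSTM testing, SOM-RBFNN training, SOM-RBFNN testing), in %.
TABLE_7 = [
    (100.00, 78.13, 75.00, 46.88),
    (100.00, 84.38, 87.50, 51.56),
    (100.00, 57.81, 76.56, 45.31),
    (100.00, 84.38, 79.69, 60.94),
    (100.00, 57.81, 64.06, 45.31),
    (100.00, 84.38, 73.44, 51.56),
    (100.00, 70.31, 100.00, 54.69),
    (98.44, 70.31, 90.63, 43.75),
    (100.00, 92.19, 92.19, 43.75),
]


def column_means(rows):
    """Mean of each column, rounded to two decimals as in the table."""
    count = len(rows)
    return [round(sum(column) / count, 2) for column in zip(*rows)]


means = column_means(TABLE_7)

# Number of experiments in which SOM-LSTM has the higher testing rate.
lstm_test_wins = sum(1 for row in TABLE_7 if row[1] > row[3])

print(means)
print(lstm_test_wins)
```

SOM-LSTM has the higher testing rate in all nine experiments, which supports the comparison drawn in Section 4.3.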

Table 8. The robustness test of Exp. 9 of SOM-LSTM.

Exp. 9	Training Sample (%)	Test Sample (%)	Average (%)
Best	100.00	92.19	96.09
Worst	92.19	78.13	85.16
Average	98.44	87.50	92.97
Execution time	1.231 s	1.231 s	1.231 s

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
