Abstract
Mechanical failures frequently occur in On-Load Tap Changers (OLTCs) during operation and can compromise the reliability and stability of power systems. This study develops an accurate diagnostic approach for identifying OLTC mechanical faults from non-stationary vibration signals. The proposed hybrid method integrates the Gazelle Optimization Algorithm (GOA), Feature Mode Decomposition (FMD), and a Transformer-based classification model. GOA automatically optimizes the key FMD parameters, namely the number of filters (K), the filter length (L), and the number of decomposition modes (N), so that the signal is decomposed with high resolution. From the resulting intrinsic mode functions (IMFs), four statistical time domain features (peak factor, impulse factor, waveform factor, and margin factor) are extracted to form feature vectors, which a Transformer then uses to classify fault types. Benchmark comparisons with other decomposition and learning approaches show that the proposed framework performs better. The model achieves a classification accuracy of 95.83% on the test set and an average of 96.7% under five-fold cross-validation, indicating good accuracy and generalization. The contribution of this work is the integration of GOA-optimized FMD and a Transformer-based attention mechanism into a single diagnostic framework, which shows promise for real-world fault monitoring of power systems.
1. Introduction
The On-Load Tap Changer (OLTC) plays a fundamental role in maintaining voltage stability by allowing tap transitions under load without power interruptions [1,2,3]. The performance and reliability of OLTCs are essential to the overall stability and voltage control of power systems, directly impacting the quality and continuity of electrical service delivery. According to statistical analyses, OLTC-related failures constitute over 20% of total transformer malfunctions, with mechanical issues comprising more than 70% of these cases [4,5,6]. These statistics indicate that OLTC mechanical faults rank among the most frequent and disruptive failure types in power transformers. Mechanical faults can lead to abnormal switching behavior, prolonged tap transition time, contact overheating, or even permanent failure. These faults often result in voltage fluctuations, power quality deterioration, and in severe cases, forced transformer shutdowns or grid instability [7]. As a result, the efficient and precise extraction of vibration signal features for the accurate assessment of equipment condition and fault diagnosis has emerged as a central concern in recent research.
OLTC vibration signals carry abundant information about mechanical operating states and fault characteristics, and therefore reflect the condition of mechanical components. Xu et al. combined time–frequency analysis with signal decomposition techniques to extract diagnostic features from vibration signals [8]. Empirical Mode Decomposition (EMD) decomposes the signal recorded during an OLTC tap transition into intrinsic mode functions (IMFs) by constructing envelopes from local extrema through interpolation, and time domain energy features are then obtained from the IMFs. However, because it relies on extrema and symmetric envelopes, EMD is prone to mode mixing and boundary artifacts, which can lead to ambiguous decomposition results and the omission of critical information [9]. To address these limitations, Liu et al. introduced Variational Mode Decomposition (VMD) for OLTC fault diagnosis [10]. VMD separates the signal into multiple narrowband components with a primary emphasis on frequency domain characteristics [11,12], but it does not adequately capture the impulsive and periodic features inherent in mechanical faults. A recent study introduced Feature Mode Decomposition (FMD), which uses adaptive finite impulse response (FIR) filters to decompose a signal into modal components that exhibit minimal correlation with the original signal [13]. FMD retains both periodic and transient signal features and is robust to noise interference. However, its performance is highly sensitive to parameter configuration, which has a substantial impact on the quality of decomposition. The Gazelle Optimization Algorithm (GOA), a recently developed swarm intelligence technique, offers strong global search capabilities and high adaptability, making it well suited to complex, multimodal, and high-dimensional optimization tasks [14].
While FMD is capable of decomposing signals into IMFs that capture relevant fault information, its efficacy is heavily dependent on optimal parameter settings [15]. To address this, the present study integrates GOA to automatically optimize FMD parameters, thereby enhancing decomposition accuracy. Precise signal decomposition serves as the foundation for reliable feature extraction and classification, and is essential for realizing the intelligent fault diagnosis of OLTC mechanical systems. To illustrate the strengths and limitations of existing signal decomposition approaches in OLTC vibration analysis, Table 1 summarizes a comparative overview of the methods used in previous studies and the proposed GOA-FMD approach.
Table 1.
Comparison of representative signal decomposition methods for OLTC vibration analysis.
In the domain of OLTC fault diagnosis, achieving high-precision fault classification is essential for enhancing diagnostic accuracy and ensuring model generalization. Several classic machine learning techniques—including Support Vector Machines (SVMs), K-Nearest Neighbors (KNN), and Random Forests (RFs)—offer a certain level of diagnostic capability [16,17,18]. However, these methods typically rely on manually engineered features and exhibit limited adaptability and scalability in complex or dynamic environments. By contrast, deep learning techniques leverage neural networks for automatic feature extraction, thereby improving classification performance [19]. As a type of dense neural network with forward-only signal propagation, the Multilayer Perceptron (MLP) is typically effective for processing data with limited complexity and uniform input dimensions. Nonetheless, its dependence on handcrafted features and its inherent limitations in handling high-dimensional, nonlinear, or unstructured data can compromise its generalization ability [20]. In recent years, the Transformer architecture, underpinned by attention mechanisms, has emerged as a powerful model for feature representation [21,22,23]. Its capacity to capture long-range dependencies and integrate multi-source feature interactions makes it highly effective for complex classification tasks involving multiple feature dimensions. To highlight the advantages of the proposed Transformer-based classifier, Table 2 provides a comparative summary of the common classification methods used in OLTC fault diagnosis, outlining their strengths and limitations in terms of feature extraction, generalization, and adaptability.
Table 2.
Comparison of classification models for OLTC mechanical fault diagnosis.
To address the limitations in diagnostic accuracy caused by incomplete feature extraction from OLTC mechanical vibration signals, this paper presents an adaptive feature extraction scheme that integrates FMD with parameter tuning performed via the GOA. To improve decomposition accuracy, GOA is employed to fine-tune essential FMD parameters, such as filter count (K), filter length (L), and the number of modes (N). The obtained Intrinsic Mode Functions (IMFs) undergo detailed inspection across time, frequency, and joint time–frequency domains to ensure thorough signal characterization. On this basis, a Transformer-based fault classification model is developed to leverage attention mechanisms for effective feature learning. An experimental platform is subsequently established to simulate OLTC mechanical fault conditions. Three representative fault types are designed: (1) gear jamming in drive shaft; (2) loose screw in drive shaft; (3) arc plate looseness and contact wear. Fault identification is conducted using the proposed feature extraction method in conjunction with the Transformer classifier. To validate the proposed approach, comparative experiments are performed against alternative signal decomposition and classification techniques. The results confirm that the proposed approach achieves a high diagnostic precision and strong resilience under varying operating conditions.
2. GOA-FMD-Based Vibration Signal Feature Extraction and Fault Diagnosis Method
2.1. Principles and Characteristics of FMD
When the OLTC operates under mechanical fault scenarios, its vibration responses carry diagnostic characteristics that reflect distinct fault behaviors. FMD, initially designed for diagnosing faults in rotating equipment, is adapted in this study to suit OLTC diagnostic requirements. It iteratively extracts modal components that carry diagnostic features by applying a bank of adaptive finite impulse response (FIR) filters. This approach suppresses irrelevant components while improving the separability and interpretability of fault-relevant features [13]. FMD has exhibited strong adaptability and reliability in the context of OLTC fault diagnosis. The fundamental procedure of FMD signal decomposition is summarized as follows:
- Step 1: Provide the input signal $x$, define the filter length $L$, and initialize the iteration index $i = 1$. Specify the total number of filters $K$, then perform the initial modal decomposition based on the current parameters.
- Step 2: Obtain the decomposition components $u_k^i = x * f_k^i$, $k = 1, 2, \ldots, K$, where '$*$' indicates convolution and $f_k^i$ represents the $k$-th finite impulse response (FIR) filter at iteration $i$.
- Step 3: Use the original signal $x$, the decomposed component $u_k^i$, and the estimated period $T_k^i$ to update the filter coefficients. $T_k^i$ is identified as the time lag at which the autocorrelation function $R_k^i(\tau)$ reaches its peak after its first zero crossing. The autocorrelation is calculated as follows:
$$R_k^i(\tau) = \sum_{t} u_k^i(t)\, u_k^i(t+\tau) \tag{1}$$
where $\tau$ denotes the time delay and $t$ is the time index of the signal. Finally, the iteration index is updated as $i = i + 1$.
- Step 4: Return to Step 2 if the predefined iteration limit has not been met; otherwise, move forward to Step 5.
- Step 5: Compute the correlation coefficient for every pairwise combination of modal components $u_p$ and $u_q$, using the formula below:
$$CC_{pq} = \frac{\sum_{n}\left(u_p(n)-\bar{u}_p\right)\left(u_q(n)-\bar{u}_q\right)}{\sqrt{\sum_{n}\left(u_p(n)-\bar{u}_p\right)^2}\,\sqrt{\sum_{n}\left(u_q(n)-\bar{u}_q\right)^2}} \tag{2}$$
where $\bar{u}_p$ and $\bar{u}_q$ denote the mean values of the corresponding modal components. Form a matrix $CC_{(K \times K)}$ containing the pairwise correlation coefficients, and identify the modal component with the lowest correlation, i.e., the one associated with the smallest maximum value in its corresponding row (or column) of $CC_{(K \times K)}$.
- Step 6: Compute the periodic correlation degree $CK$ of each modal component $u_k$ using the following formula:
$$CK_M(T_s) = \frac{\sum_{n=1}^{N_s}\left(\prod_{m=0}^{M} u_k(n - mT_s)\right)^2}{\left(\sum_{n=1}^{N_s} u_k(n)^2\right)^{M+1}} \tag{3}$$
where $M$ is the correlation order, representing the number of periodic points used in the computation of $CK$, $T_s$ is the estimated period in samples, and $N_s$ is the number of samples in $u_k$. The mode exhibiting the highest degree of periodic correlation is designated as the final extracted component. Then, increment the mode index: $K = K + 1$.
- Step 7: Assess whether the total number of extracted components $K$ has attained the set value $N$. If this requirement is met, the decomposition terminates and the extracted modes are output; otherwise, revert to Step 2 and repeat the decomposition.
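The three quantities used in these steps (the autocorrelation-based period estimate, the pairwise correlation coefficient, and the periodic correlation degree) can be illustrated with a minimal pure-Python sketch. The function names are hypothetical, the truncated-sum autocorrelation and the zero-padding for negative indices are simplifying assumptions, and the filter update itself is not shown, so this is an illustration of the formulas rather than the implementation used in this study.

```python
import math


def autocorrelation(u, tau):
    """R(tau) = sum_t u[t] * u[t + tau], summed over the valid overlap."""
    return sum(u[t] * u[t + tau] for t in range(len(u) - tau))


def estimate_period(u):
    """Lag of the autocorrelation maximum after its first non-positive value."""
    r = [autocorrelation(u, tau) for tau in range(len(u) // 2)]
    zero = next((tau for tau, v in enumerate(r) if v <= 0), None)
    if zero is None:
        return None  # no zero crossing found, so no period is estimated
    return max(range(zero, len(r)), key=lambda tau: r[tau])


def pearson(a, b):
    """Pearson correlation coefficient between two equally long modes."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def correlated_kurtosis(u, period, order=1):
    """CK_M(T) = sum_n (prod_{m=0..M} u[n - m*T])^2 / (sum_n u[n]^2)^(M+1)."""
    num = 0.0
    for n in range(len(u)):
        prod = 1.0
        for m in range(order + 1):
            idx = n - m * period
            prod *= u[idx] if idx >= 0 else 0.0  # samples before the start count as zero
        num += prod * prod
    den = sum(v * v for v in u) ** (order + 1)
    return num / den


# Toy signal: a unit impulse every 10 samples.
impulses = [1.0 if t % 10 == 0 else 0.0 for t in range(100)]
period = estimate_period(impulses)
ck_at_period = correlated_kurtosis(impulses, period)
```

On this toy impulse train the estimated period equals the impulse spacing, and the periodic correlation degree is larger at that period than at a mismatched one, which is the property FMD relies on when ranking modes.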
A schematic representation of the FMD algorithm is provided in Figure 1.
Figure 1.
Flowchart of FMD process.
The performance of FMD in analyzing OLTC vibration signals is highly sensitive to the parameter settings, which significantly affect the decomposition results [24,25,26]. A short filter length L may result in under-decomposition, whereas an excessively long filter may introduce significant noise into the decomposed components. Similarly, setting the number of modes N too low may cause the loss of critical features, while setting it too high may produce redundant information [15,27]. The number of filters K also plays a crucial role: if improperly chosen, it can either reduce the decomposition resolution or increase computational complexity unnecessarily [28]. Since FMD lacks an inherent mechanism for adaptive parameter selection, optimization algorithms are required to perform parameter tuning and ensure the accurate extraction of critical fault features from OLTC vibration signals.
2.2. FMD Parameter Optimization via GOA
GOA draws inspiration from natural predator–prey dynamics and belongs to the family of swarm intelligence techniques [14,29]. The optimization process begins with the random initialization of a gazelle population, whose members serve as candidate solutions in the search process. The population is represented by an $n \times d$ position matrix $X$, where $n$ denotes the number of agents and $d$ is the dimensionality of the problem. Each row of $X$ is the position vector of one candidate gazelle in the search domain, as shown in Equation (4):
$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,d} \\ x_{2,1} & x_{2,2} & \cdots & x_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n,1} & x_{n,2} & \cdots & x_{n,d} \end{bmatrix} \tag{4}$$
where $x_{i,j}$ indicates the location of the $i$-th gazelle in the $j$-th dimension. Each element $x_{i,j}$ is initialized using the following:
$$x_{i,j} = rand \cdot (UB_j - LB_j) + LB_j \tag{5}$$
where $rand$ is a uniformly distributed random number in $[0, 1]$, and $UB_j$ and $LB_j$ represent the maximum and minimum permissible values along the $j$-th search dimension. After each iteration, all individuals are evaluated using the fitness function. The top-performing gazelles are selected as elites, and their positions are stored in the elite matrix $E$ to guide subsequent generations:
$$E = \begin{bmatrix} x'_{1,1} & x'_{1,2} & \cdots & x'_{1,d} \\ x'_{2,1} & x'_{2,2} & \cdots & x'_{2,d} \\ \vdots & \vdots & \ddots & \vdots \\ x'_{n,1} & x'_{n,2} & \cdots & x'_{n,d} \end{bmatrix} \tag{6}$$
where $x'_{i,j}$ denotes the coordinate of the $i$-th elite individual along the $j$-th axis.
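The initialization and elite bookkeeping can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the search bounds, and the toy fitness function are hypothetical placeholders (they are not the values of Table 5 or the envelope entropy), and the elite matrix is built by replicating the single best gazelle, which is one common reading of the GOA formulation.

```python
import random


def init_population(n, lower, upper, rng):
    """x[i][j] = rand * (UB_j - LB_j) + LB_j with rand drawn from U(0, 1)."""
    return [[rng.random() * (ub - lb) + lb for lb, ub in zip(lower, upper)]
            for _ in range(n)]


def elite_matrix(population, fitness):
    """Copy the best (lowest-fitness) gazelle into every row of an n x d matrix."""
    best = min(population, key=fitness)
    return [list(best) for _ in population]


# Hypothetical bounds for the three FMD parameters {L, K, N}.
LOWER = [10.0, 2.0, 1.0]
UPPER = [100.0, 10.0, 6.0]

rng = random.Random(0)
population = init_population(20, LOWER, UPPER, rng)


def toy_fitness(position):
    """Placeholder objective; a real run would use the envelope entropy."""
    return sum(v * v for v in position)


elite = elite_matrix(population, toy_fitness)
```

The population size of 20 matches the setting reported later in this paper; everything else in the example is illustrative.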
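During the development phase, each gazelle updates its position through a stochastic movement governed by the following rule:
$$g_i^{new} = g_i + v\, r_1 \otimes r_2 \otimes \left(E_i - r_2 \otimes g_i\right) \tag{7}$$
where $g_i$ represents the present location of the $i$-th gazelle and $g_i^{new}$ its updated location; $v$ refers to the movement scaling coefficient; $r_1$ and $r_2$ are random vectors drawn from a uniform distribution; $\otimes$ denotes element-wise multiplication; and $E_i$ corresponds to the position vector of the associated elite agent from matrix $E$.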
When a predator is detected, the gazelle initiates an escape response modeled by either Lévy flight or Brownian motion. This process involves two stages: a global exploration phase characterized by large step movements and a local exploitation phase involving fine-tuned adjustments. The corresponding position update equations are defined as follows:
$$g_i^{new} = g_i + S\,\lambda\, r \otimes R_L \otimes \left(E_i - R_L \otimes g_i\right) \tag{8}$$
$$g_i^{new} = g_i + S\,\lambda\, CF\, R_B \otimes \left(E_i - R_L \otimes g_i\right) \tag{9}$$
$$CF = \left(1 - \frac{m}{T}\right)^{2m/T} \tag{10}$$
where $S$ represents the upper bound of the gazelle's movement speed; $\lambda$ is the directional control factor; $R_L$ and $R_B$ are Lévy-distributed and Brownian-distributed random vectors, respectively; $r$ is a uniformly distributed random number; $E_i$ is the position of the $i$-th elite gazelle; $g_i$ is the current position of the $i$-th gazelle; $m$ and $T$ denote the current and the maximum allowed iterations; and $CF$ decreases in a nonlinear manner as iterations proceed, aiming to achieve a trade-off between exploration and exploitation.
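The nonlinear decay of the control factor CF can be made concrete with a short sketch. The closed form used here, CF = (1 - m/T)^(2m/T), is the schedule from the original GOA formulation and is assumed rather than confirmed for this study; the function name is hypothetical.

```python
def control_factor(m, T):
    """Assumed GOA schedule: CF = (1 - m/T) ** (2*m/T), with 0 <= m <= T."""
    if not 0 <= m <= T:
        raise ValueError("current iteration m must lie in [0, T]")
    return (1.0 - m / T) ** (2.0 * m / T)


# Schedule over T = 80 iterations, the GOA iteration budget used in this study.
T_MAX = 80
schedule = [control_factor(m, T_MAX) for m in range(T_MAX + 1)]
```

Under this schedule CF starts at 1, passes through 0.5 at the midpoint, and reaches 0 at the final iteration, so early iterations allow large moves and late iterations confine the search near the elite position.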
To improve the algorithm's ability to escape from local optima, a predator–prey strategy is employed. This strategy simulates the natural behavioral shift between escape and social movement observed in prey animals. A decision is made by comparing a random variable $r$, uniformly sampled from the interval $[0, 1]$, with the predation success rate (PSRs). If $r \le PSRs$, the gazelle performs a random relocation within the search space boundary using a controlled scaling factor. Otherwise, a differential movement is executed based on randomly selected individuals from the population. The position update rule, which mathematically models this predator–prey strategy, is defined in Equation (11) below.
$$g_i^{new} = \begin{cases} g_i + CF\left[LB + r \otimes (UB - LB)\right] \otimes U, & r \le PSRs \\ g_i + \left[PSRs\,(1 - r) + r\right]\left(g_{r1} - g_{r2}\right), & \text{otherwise} \end{cases} \tag{11}$$
where $CF$ is the control factor from Equation (10), $r$ is a uniformly distributed random number, and $g_{r1}$ and $g_{r2}$ are two randomly selected position vectors from the current population. The binary control variable $U$ is defined as follows:
$$U = \begin{cases} 0, & r < 0.34 \\ 1, & \text{otherwise} \end{cases} \tag{12}$$
The variable $U$ is used to probabilistically suppress position updates in certain dimensions, thereby introducing adaptive stochastic behavior during the escape process.
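A minimal sketch of this predator–prey step is given below. It assumes the update form of the original GOA formulation (random relocation masked by U when r is at most PSRs, differential movement otherwise), a mask threshold of 0.34, and clipping to the search bounds; the function name, bounds, and toy population are hypothetical.

```python
import random


def psr_update(g, population, lower, upper, cf, psr, rng):
    """One predator-prey update of gazelle g (assumed GOA form), clipped to bounds."""
    r = rng.random()
    if r <= psr:
        # Random relocation; U zeroes the move in some dimensions.
        moved = []
        for x, lb, ub in zip(g, lower, upper):
            u = 0.0 if rng.random() < 0.34 else 1.0
            moved.append(x + cf * (lb + rng.random() * (ub - lb)) * u)
    else:
        # Differential movement between two randomly chosen gazelles.
        g1, g2 = rng.sample(population, 2)
        scale = psr * (1.0 - r) + r
        moved = [x + scale * (a - b) for x, a, b in zip(g, g1, g2)]
    return [min(max(x, lb), ub) for x, lb, ub in zip(moved, lower, upper)]


# Hypothetical bounds and a small toy population.
LOWER = [10.0, 2.0, 1.0]
UPPER = [100.0, 10.0, 6.0]
rng = random.Random(1)
population = [[rng.uniform(lb, ub) for lb, ub in zip(LOWER, UPPER)] for _ in range(5)]
updated = [psr_update(g, population, LOWER, UPPER, cf=0.5, psr=0.34, rng=rng)
           for g in population]
```

Clipping is a design choice added here so that every candidate remains a valid parameter combination; other boundary-handling schemes are possible.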
This work uses GOA to optimize three parameters of the FMD process: the number of filters K, the filter length L, and the mode number N. The algorithmic flow for GOA-based parameter optimization is depicted in Figure 2. In the diagram, r denotes a uniformly distributed random number, T represents the predefined maximum number of iterations, and m indicates the index of the current iteration.
Figure 2.
Flowchart of the GOA-optimized FMD.
Each gazelle represents a parameter combination {L, K, N} corresponding to the filter length, the number of filters, and the mode number, respectively. FMD is applied to the original signal using these parameters, and the envelope entropy of the decomposed signal is used as the fitness function, which is defined as follows:
$$E_p = -\sum_{k} p_k \log p_k \tag{13}$$
$$p_k = \frac{a_k}{\sum_{j} a_j} \tag{14}$$
In these equations, $a_k$ corresponds to the amplitude value of the $k$-th component in the IMF set after envelope extraction, and $p_k$ is the normalized amplitude. A lower envelope entropy implies that the extracted features are more concentrated and that the noise is reduced, which is advantageous for accurate fault identification [30,31].
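The envelope entropy can be illustrated with a small pure-Python sketch. It assumes that the envelope is the magnitude of the analytic signal (a Hilbert-transform envelope, computed here with a naive DFT for self-containment) and that the natural logarithm is used; both choices, like the function names, are assumptions rather than details confirmed by this study.

```python
import cmath
import math


def analytic_envelope(x):
    """Envelope |x + j*H(x)| via a naive O(n^2) DFT-based Hilbert transform."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n))
                for f in range(n)]
    for f in range(n):
        if f == 0 or (n % 2 == 0 and f == n // 2):
            continue  # keep the DC term and, for even n, the Nyquist term
        # Double the positive frequencies and zero the negative ones.
        spectrum[f] *= 2.0 if f < (n + 1) // 2 else 0.0
    analytic = [sum(spectrum[f] * cmath.exp(2j * math.pi * f * t / n)
                    for f in range(n)) / n
                for t in range(n)]
    return [abs(v) for v in analytic]


def envelope_entropy(x):
    """E_p = -sum_k p_k * ln(p_k), with p_k the normalized envelope amplitude."""
    envelope = analytic_envelope(x)
    total = sum(envelope)
    return -sum((a / total) * math.log(a / total) for a in envelope if a > 0)


N = 64
steady = [math.cos(2 * math.pi * 4 * t / N) for t in range(N)]  # flat envelope
impulsive = [1.0 if t == N // 2 else 0.0 for t in range(N)]     # single impact
e_steady = envelope_entropy(steady)
e_impulsive = envelope_entropy(impulsive)
```

The steady tone has a flat envelope and therefore the maximum possible entropy, ln(64), whereas the single impact concentrates the envelope and yields a lower value, which is why minimizing this quantity favors decompositions that expose impulsive fault signatures.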
2.3. Time Domain Feature Extraction from Vibration Signals
The FMD process generates IMFs, each corresponding to specific frequency bands of the OLTC vibration signals, thereby capturing critical information reflective of the system’s operational state. In order to characterize the impulsive and oscillatory behavior of the signal, four typical statistical indices in the time domain—namely peak factor, impulse factor, waveform factor, and margin factor—are computed from the IMFs [32,33,34,35].
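The peak factor (PF), which quantifies the signal's sharpness or impulsiveness, is mathematically described as follows:
$$PF = \frac{\max_t |x(t)|}{x_{rms}}, \qquad x_{rms} = \sqrt{\frac{1}{N_s}\sum_{t=1}^{N_s} x(t)^2} \tag{15}$$
where $x(t)$, $t = 1, \ldots, N_s$, denotes the samples of an IMF and $x_{rms}$ is its root mean square value. A high peak factor typically reveals sudden transient phenomena within OLTC vibration signals, often caused by mechanical shocks or abrupt contacts. The impulse factor (IF) reflects the impulsiveness of the signal and is expressed as follows:
$$IF = \frac{\max_t |x(t)|}{\frac{1}{N_s}\sum_{t=1}^{N_s} |x(t)|} \tag{16}$$
For impact-type faults in OLTC systems, the impulse factor tends to exhibit significantly elevated values. The waveform factor (WF) describes the waveform characteristics and distortion level of the signal. It is computed by the following:
$$WF = \frac{x_{rms}}{\frac{1}{N_s}\sum_{t=1}^{N_s} |x(t)|} \tag{17}$$
This metric is frequently employed to assess the extent of waveform deformation and to distinguish periodic shock patterns from stochastic noise elements in the signal. The margin factor (MF) reflects whether the signal contains isolated high peaks, and is defined as follows:
$$MF = \frac{\max_t |x(t)|}{\left(\frac{1}{N_s}\sum_{t=1}^{N_s} \sqrt{|x(t)|}\right)^2} \tag{18}$$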
This indicator evaluates whether the signal is approaching an extreme operating condition. A relatively large margin factor implies the presence of a prominent, isolated peak in the signal, which often corresponds to a transient event or an abnormal operating state.
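The four indicators can be computed per IMF and concatenated into a feature vector, as in the following sketch. It assumes the conventional definitions (peak over RMS, peak over mean absolute value, RMS over mean absolute value, and peak over squared mean root amplitude); the function names and the ordering of features within the vector are hypothetical.

```python
import math


def time_domain_features(x):
    """Return the peak, impulse, waveform, and margin factors of one IMF."""
    n = len(x)
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / n)
    mean_abs = sum(abs(v) for v in x) / n
    root_amp = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2
    return {
        "peak_factor": peak / rms,
        "impulse_factor": peak / mean_abs,
        "waveform_factor": rms / mean_abs,
        "margin_factor": peak / root_amp,
    }


def feature_vector(imfs):
    """Concatenate the four indicators of every IMF in a fixed order."""
    order = ("peak_factor", "impulse_factor", "waveform_factor", "margin_factor")
    return [time_domain_features(imf)[name] for imf in imfs for name in order]


# Toy IMF with a single isolated peak.
toy = [0.0, 0.0, 0.0, 4.0]
features = time_domain_features(toy)
vector = feature_vector([toy] * 6)
```

For the toy signal the four values are 2, 4, 2, and 16, showing that the margin factor reacts most strongly to an isolated peak; six IMFs yield a 24-element vector.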
2.4. Transformer-Based Classification Model Architecture and Implementation
The multiple intrinsic mode functions (IMFs) derived from the decomposition of the OLTC vibration signals exhibit varying levels of significance and distinct time domain characteristics. In addition, potential correlations may exist among the IMFs, which necessitate that classification models account for inter-feature dependencies. The Transformer, recognized as a cutting-edge deep learning framework, excels in modeling sequences and capturing global feature relationships, rendering it highly effective for handling intricate temporal data [21,22,36].
The proposed Transformer-based classification model comprises an encoder module and a classification head. The model receives a 24-dimensional input vector, derived by calculating four time domain statistical indicators from each of the six extracted IMFs. Positional encoding is initially added to preserve the sequential information. The encoded input is then processed by a multi-head self-attention mechanism, consisting of two encoder layers with four attention heads each and a model dimensionality of 128. The processed output is passed through a dense layer containing 64 neurons, followed by a Dropout operation set at a rate of 0.1 to reduce the risk of overfitting. Finally, a Softmax activation layer is employed to produce the classification probabilities. The schematic diagram of the designed model architecture is shown in Figure 3.
Figure 3.
Transformer-based architecture.
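How a 24-dimensional feature vector might be presented to the encoder can be sketched as follows. Two assumptions are made that this paper does not confirm: the vector is regrouped into six tokens (one per IMF) of four indicators each, and the positional encoding is the standard sinusoidal form. The function names are hypothetical, and the learned projection to the model dimensionality and the attention layers are omitted.

```python
import math

D_MODEL = 128  # model dimensionality stated for the encoder
N_IMFS = 6     # retained IMFs
N_STATS = 4    # time domain indicators per IMF


def to_tokens(features):
    """Regroup a flat 24-element vector into 6 tokens of 4 indicators (assumed layout)."""
    if len(features) != N_IMFS * N_STATS:
        raise ValueError("expected a 24-dimensional feature vector")
    return [features[i * N_STATS:(i + 1) * N_STATS] for i in range(N_IMFS)]


def sinusoidal_encoding(n_positions, d_model):
    """PE[pos][2k] = sin(pos / 10000^(2k/d)); PE[pos][2k+1] = cos(same angle)."""
    table = []
    for pos in range(n_positions):
        row = []
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            row.extend((math.sin(angle), math.cos(angle)))
        table.append(row[:d_model])
    return table


tokens = to_tokens([float(v) for v in range(24)])
encoding = sinusoidal_encoding(N_IMFS, D_MODEL)
```

In a full model each token would first be projected to 128 dimensions and then summed with the matching row of the encoding table before entering the first encoder layer.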
To ensure both robust feature representation and architectural simplicity, the model is tailored for fault diagnosis, particularly in cases with scarce annotated data [37]. The configuration parameters employed in the Transformer model are summarized in Table 3.
Table 3.
Hyperparameter settings of the Transformer model used for OLTC fault classification.
Based on the aforementioned model architecture and training configuration, the proposed OLTC mechanical fault diagnosis framework consists of five key stages: data acquisition, modal decomposition, feature extraction, sample construction, and fault classification. The overall process is illustrated in Figure 4.
Figure 4.
Flowchart of OLTC mechanical fault diagnosis.
3. Simulation of Mechanical Faults in OLTC
3.1. Experimental Platform Setup and Vibration Signal Acquisition
In this study, an experimental configuration for the KM-type OLTC was deployed to simulate vibration responses under both normal operating and representative mechanical fault conditions. A domestically manufactured KM-III800Y/126C-10193W OLTC prototype served as the test subject. Although the platform is based on a KM-type OLTC, the proposed method is not device-specific and is applicable to other OLTC models with similar mechanical structures. Vibration signals were acquired during the tap changing operation, providing empirical data to support the subsequent fault diagnosis research. The platform was instrumented with accelerometers (model YD38D, 100 mV/g sensitivity). The sensors were mounted in a triangular configuration on the top insulating cover of the OLTC unit, with their sensing axes oriented vertically (axial direction) as illustrated in Figure 5. This layout was chosen based on preliminary tests indicating that vibrations in the vertical (axial) direction consistently exhibited the highest amplitude in response to mechanical disturbances. In contrast, lateral directions were more susceptible to structural shaking and installation-related deviations during the tap changing operation, which could introduce measurement bias. Therefore, aligning the sensor axes vertically ensures measurement stability for vibration-based diagnostics. The triangular layout improves the spatial coverage and helps capture the localized vibration responses caused by different fault types.
Figure 5.
Accelerometer placement on the KM-type OLTC for vibration monitoring.
The data acquisition system is capable of performing the multi-channel synchronous capture of vibration signals, offering a peak sampling frequency of 102.4 kHz. Considering that most of the energy in OLTC vibration signals lies below 10 kHz [38,39], a sampling frequency of 25 kHz was adopted in accordance with the Nyquist criterion to maintain signal fidelity and measurement accuracy [40].
3.2. Fault Type Setup and Analysis of Vibration Response Characteristics
To analyze the vibration characteristics of the OLTC under different fault conditions, three types of typical mechanical faults were introduced in the experiment (see Figure 6): (a) gear jamming in the drive shaft, (b) loose screw in the drive shaft, (c) arc plate looseness and contact wear. These fault types were selected based on their high occurrence frequency in engineering practice, ease of simulation on the experimental platform, and their capacity to generate distinguishable vibration features [6]. Specifically, gear jamming and screw looseness affect the torque transmission and stability of the drive mechanism, while arc plate and contact wear degrade electrical continuity and switching performance—both contributing to observable mechanical anomalies.
Figure 6.
Three typical mechanical fault types simulated on the KM-type OLTC: (a) gear jamming in the drive shaft; (b) loose screw in the drive shaft; (c) arc plate looseness and contact wear.
Under normal operating conditions, the structural components of the OLTC remain intact, and the contact mechanism functions without obstruction or abnormal friction. To simulate the gear jamming fault of the drive shaft, metal shims were inserted into the gear gaps to artificially introduce abnormal frictional torque between the gear teeth. The screw loosening fault was replicated by partially loosening the locking screw mounted on the drive shaft. The arc plate looseness fault was simulated by loosening the mounting screw of the plate, resulting in relative displacement between the plate and adjacent mechanical components. In addition, contact wear was simulated under the same setup to reflect the degradation that typically co-occurs with arc plate looseness.
The vibration signal was recorded beginning at the instant the tap changer was actuated, with a total acquisition duration of 0.2 s and a sampling rate of 25 kHz, thereby capturing the complete switching process. To obtain a unified and noise-reduced signal representation, the raw signals from the three accelerometers were first synchronized and then averaged point-by-point across all sampling instances. This signal-level fusion approach helps suppress sensor-specific disturbances while preserving the global vibration characteristics, thereby enhancing the quality of the subsequent decomposition and feature extraction process. The corresponding waveform plots for this process are presented in Figure 7a–d, which clearly illustrate the variations in vibration response under different mechanical fault conditions.
Figure 7.
Time domain waveforms of OLTC contact operation under different conditions: (a) normal condition; (b) gear jamming in drive shaft; (c) loose screw in drive shaft; (d) arc plate looseness and contact wear.
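The signal-level fusion described above amounts to a point-by-point mean over the synchronized channels. The sketch below illustrates it together with the record length implied by the stated acquisition settings; the function name and the toy channel values are hypothetical.

```python
SAMPLING_RATE_HZ = 25_000
DURATION_S = 0.2
N_SAMPLES = round(SAMPLING_RATE_HZ * DURATION_S)  # samples per recorded switching event


def fuse_channels(channels):
    """Point-by-point mean of equally long, already synchronized channels."""
    length = len(channels[0])
    if any(len(channel) != length for channel in channels):
        raise ValueError("all channels must have the same number of samples")
    return [sum(samples) / len(channels) for samples in zip(*channels)]


# Toy example with three accelerometer channels of two samples each.
fused = fuse_channels([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

With a 0.2 s window at 25 kHz, each fused record contains 5000 samples. Averaging attenuates disturbances that are uncorrelated between sensors, while components common to all three sensors are preserved.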
Under normal operating conditions, as depicted in Figure 7a, the vibration signal displays a smooth, low-amplitude waveform, suggesting that the system operates in a steady and well-balanced state. In contrast, Figure 7b shows that gear jamming in the drive shaft induces sharp transient spikes and significant amplitude fluctuations, indicative of high-frequency impulse components caused by mechanical impact. In Figure 7c, the loosening of the drive shaft screw results in an overall increase in vibration amplitude; however, the observed peaks are less pronounced than those in the gear jamming scenario. Figure 7d reveals that arc plate looseness combined with contact wear leads to waveform irregularities, sharp peaks, and substantial amplitude variations, reflecting fault characteristics that persist throughout the switching process. The comparative waveform analysis indicates that both the amplitude and duration of vibration signals are considerably greater under fault conditions than in the normal state. To further quantify these distinctions, Table 4 presents the peak value, root mean square (RMS), and dominant frequency band associated with each operating condition [41].
Table 4.
Comparison of temporal and spectral attributes of vibration signals.
As observed from Table 4, all fault conditions exhibit significantly higher root mean square (RMS) and peak values than the normal operating condition. In addition, the dominant frequency band shifts upward under fault scenarios. These results suggest that the evolution of mechanical faults leads to increased vibration intensity and the presence of higher-frequency components, thereby highlighting the distinct dynamic features corresponding to the various fault categories.
4. Experimental Data Analysis
4.1. GOA-FMD-Based Feature Extraction and Results Analysis
In this study, the maximum number of iterations for FMD was set to 10. The parameters of the GOA were configured as follows: the velocity factor S was set to 0.88, the predator success rate PSRs was 0.34, the initial population size was 20, and the total number of GOA iterations was set to 80. The parameter search range and step size for FMD are summarized in Table 5, where the constraint $N \le K$ is satisfied to ensure decomposition feasibility.
Table 5.
Parameter search space, step size, and optimal values for GOA-optimized FMD.
During the GOA optimization process, envelope entropy served as the fitness function to be minimized. The convergence curve in Figure 8 shows that the fitness value declines gradually over successive iterations and eventually settles at a steady value, indicating that the algorithm converges toward a stable solution. This behavior supports both the convergence and the effectiveness of the proposed optimization strategy.
Figure 8.
Convergence curve of the fitness function value (envelope entropy) during GOA-based optimization of FMD parameters.
The vibration signal was decomposed using FMD, and multiple IMFs were extracted. Taking the signal under normal operating conditions as an example, its time domain and frequency domain representations can be seen in Figure 9. It is evident that the frequency bands of the individual IMFs have been effectively separated, with clearly defined boundaries and no mode aliasing. These observations confirm that the decomposition achieved a satisfactory performance under the current parameter configuration.
Figure 9.
Time and frequency domain representations of IMF components extracted by FMD under normal operating conditions.
To ensure both computational efficiency and diagnostic accuracy, only the first six IMFs were retained for feature extraction. As shown in Figure 10, the cumulative energy contribution rate of the first six IMFs exceeds 99%, which means that these components preserve almost all the meaningful vibration information while eliminating redundant or noise-dominated higher-order IMFs.
Figure 10.
Energy contribution rate and cumulative contribution rate of each IMF component obtained via FMD.
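The energy contribution rate and its cumulative value can be computed as in the following sketch, which assumes that the energy of an IMF is the sum of its squared samples and that the IMFs are taken in the order produced by the decomposition. The function names and the toy data are hypothetical.

```python
def energy_contributions(imfs):
    """Return per-IMF energy rates and their running (cumulative) sum."""
    energies = [sum(v * v for v in imf) for imf in imfs]
    total = sum(energies)
    rates = [e / total for e in energies]
    cumulative, running = [], 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def modes_to_keep(cumulative, threshold=0.99):
    """Smallest number of leading IMFs whose cumulative rate reaches the threshold."""
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(cumulative)


# Toy IMFs with energies 4, 1, and 1.
rates, cumulative = energy_contributions([[2.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
kept = modes_to_keep(cumulative, threshold=0.8)
```

Applied with the 99% threshold used in this study, such a rule would return the number of leading IMFs to retain for feature extraction.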
For each decomposed IMF, four representative time domain statistical indicators—peak factor, impulse factor, waveform factor, and margin factor—were calculated to characterize vibration features. Table 6 presents the results using IMF3 as a representative case.
Table 6.
Comparative analysis of statistical time domain indicators extracted from vibration signals under various OLTC operating conditions.
Under fault conditions, all four time domain indicators generally exhibit higher values compared with the normal operating condition. Notably, the peak factor values for the “Loose Screw in Drive Shaft” and “Arc Plate Looseness and Contact Wear” cases both exceed 23, indicating the presence of strong impact components in the vibration signal. The margin factor also shows a significant increase, reflecting greater deviation from steady-state operation and a higher likelihood of transient disturbances. These four time domain features, extracted from each of the six IMFs—resulting in a 24-dimensional feature vector—are subsequently used as input for the Transformer-based fault classification model.
4.2. Fault Classification Model Construction and Performance Comparison
In this study, vibration signals associated with OLTC contact operation were collected under four distinct conditions: (1) normal operation, (2) fault 1, gear jamming in the drive shaft, (3) fault 2, loose screw in the drive shaft, (4) fault 3, arc plate looseness and contact wear. For each condition, 150 signal samples were acquired, leading to a cumulative total of 600 samples. Among them, 120 samples per class were assigned to the training dataset, while the remaining 30 were used for testing, resulting in 480 training instances and 120 testing instances overall.
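The per-class split can be reproduced with a simple stratified partition, sketched below. The random shuffling, the seed, and the label names are assumptions, since the paper specifies only the class-wise counts.

```python
import random


def stratified_split(labels, n_train_per_class, rng):
    """Split sample indices so that each class contributes a fixed training count."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_class.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        train.extend(shuffled[:n_train_per_class])
        test.extend(shuffled[n_train_per_class:])
    return train, test


# Four conditions with 150 samples each (hypothetical label names).
labels = [name for name in ("normal", "fault1", "fault2", "fault3") for _ in range(150)]
train_idx, test_idx = stratified_split(labels, 120, random.Random(42))
```

This yields 480 training and 120 test indices with exactly 30 test samples per class, matching the counts stated above.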
4.2.1. Comparison of Feature Extraction Performance Across Signal Decomposition Methods
To validate the effectiveness of FMD, a comparative analysis was conducted using three benchmark signal decomposition methods: Empirical Mode Decomposition (EMD), Ensemble Empirical Mode Decomposition (EEMD), and Variational Mode Decomposition (VMD). GOA was employed to optimize the key parameters of each decomposition technique, and, to ensure a fair comparison, the GOA settings themselves were held constant across all methods. The optimized parameter configurations are summarized in Table 7. The baseline decomposition methods (EMD, EEMD, and VMD) were re-implemented by the authors based on the algorithmic descriptions and parameter settings reported in the relevant literature [9,11,42].
Table 7.
Parameter names, search ranges, and GOA-optimized values for each decomposition method (VMD, EMD, and EEMD).
The OLTC vibration signals were decomposed using different signal decomposition methods (FMD, VMD, EMD, and EEMD). For each method, the same set of time domain features was extracted and used as the input to an identical Transformer-based classification model. To evaluate the fault identification performance of each decomposition technique, classification accuracy and confusion matrices were analyzed under consistent model configurations. The comparative classification results are presented in Figure 11 and summarized numerically in Table 8.
Figure 11.
Confusion matrices comparing diagnostic results using different GOA-optimized signal decomposition methods: (a) GOA-FMD; (b) GOA-VMD; (c) GOA-EMD; and (d) GOA-EEMD. The classification results are based on the same Transformer model.
Table 8.
Classification accuracy of OLTC fault diagnosis using different GOA-optimized signal decomposition methods with Transformer-based classification.
Among the tested decomposition methods, the GOA-FMD achieved the highest classification performance, with an overall accuracy of 95.83%. Notably, it reached near-perfect accuracy under both the normal operating condition and fault 3 (arc plate looseness and contact wear). In comparison, the GOA-VMD approach attained an overall accuracy of 86.67%, but exhibited noticeable performance degradation in identifying fault 1 (gear jamming). The GOA-EMD and GOA-EEMD methods yielded lower overall accuracies of 74.17% and 80.00%, respectively. These results suggest that GOA-FMD offers superior classification accuracy and robustness for OLTC vibration signal analysis, making it more suitable for practical fault identification applications. A five-fold cross-validation was performed for all methods to ensure a consistent and unbiased performance evaluation, as shown in Figure 12.
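Because the test set contains 120 samples, each reported overall accuracy corresponds to a whole number of correctly classified samples. The short check below recovers those counts from the percentages quoted above; the helper name is hypothetical, and the counts are derived by rounding rather than read from the confusion matrices.

```python
def correct_count(accuracy_percent, n_samples):
    """Nearest whole number of correct predictions for a reported accuracy."""
    return round(accuracy_percent / 100 * n_samples)


N_TEST = 120
reported = {
    "GOA-FMD": 95.83,
    "GOA-VMD": 86.67,
    "GOA-EMD": 74.17,
    "GOA-EEMD": 80.00,
}
counts = {name: correct_count(acc, N_TEST) for name, acc in reported.items()}

for name, k in counts.items():
    print(f"{name}: {k}/{N_TEST} = {100 * k / N_TEST:.2f}%")
```

Under this reading, the gap between GOA-FMD and GOA-VMD amounts to 11 test samples out of 120, which is a useful scale to keep in mind when comparing methods on a test set of this size.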
Figure 12.
Five-fold cross-validation accuracy of Transformer-based fault classification using different GOA-optimized decomposition methods.
As shown in Figure 12, the GOA-FMD method achieved the best performance under five-fold cross-validation, with an average classification accuracy of 96.7%. This result highlights its effectiveness in suppressing mode mixing and enhancing both feature extraction and fault recognition accuracy. The GOA-VMD method attained an average accuracy of 87.3%, outperforming both GOA-EMD (73.5%) and GOA-EEMD (79.7%). However, it still exhibited noticeable fluctuations across folds, indicating limited adaptability to varying signal characteristics. In contrast, the EMD-based methods suffered from considerable drops in accuracy in certain folds, suggesting a relatively weaker capability in handling signal interference and preserving fault-relevant features. In summary, the GOA-FMD framework demonstrates a superior classification performance, enhanced stability, and strong generalization ability, rendering it particularly effective for diagnosing OLTC faults when only limited data is available.
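The five-fold protocol can be sketched as follows. This is a minimal illustration under stated assumptions: folds are stratified by class with a fixed seed, and a nearest-class-mean rule on toy one-dimensional data stands in for the Transformer, which is not reproduced here. All names and data are hypothetical.

```python
import random
from statistics import mean, pstdev


def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs with classes spread evenly over folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


def cross_val_accuracy(X, y, fit_predict, k=5, seed=0):
    """Return the accuracy of each fold for a fit-and-predict callable."""
    scores = []
    for train, test in stratified_kfold(y, k, seed):
        pred = fit_predict([X[i] for i in train], [y[i] for i in train],
                           [X[i] for i in test])
        hits = sum(p == y[i] for p, i in zip(pred, test))
        scores.append(hits / len(test))
    return scores


def nearest_mean(X_train, y_train, X_test):
    """Stand-in classifier: assign each sample to the closest class mean."""
    centers = {c: mean(x for x, t in zip(X_train, y_train) if t == c)
               for c in set(y_train)}
    return [min(centers, key=lambda c: abs(x - centers[c])) for x in X_test]


# Toy, well-separated data: 4 classes x 150 samples (not the OLTC data set).
y = [c for c in range(4) for _ in range(150)]
X = [10.0 * c + (i % 7) * 0.1 for i, c in enumerate(y)]
scores = cross_val_accuracy(X, y, nearest_mean)
print(f"mean={mean(scores):.3f}, std={pstdev(scores):.3f}")
```

Reporting the spread across folds alongside the mean makes statements about fold-to-fold fluctuation, such as those made for GOA-VMD above, directly quantifiable.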
4.2.2. Performance Comparison of Classification Models in Fault Identification
To gain deeper insight into the effectiveness of the Transformer model for diagnosing OLTC faults, comparative experiments were conducted with several classic classification models: SVM, RF, and MLP. The goal was to assess the classification accuracy and robustness of each model when trained and tested with the same feature inputs and experimental settings. These baseline models were reproduced by the authors based on established methods reported in prior studies [18,20,43]. Their hyperparameter settings, detailed in Table 9, were selected based on commonly accepted practices in the literature and fine-tuned through preliminary experiments so that each baseline performed as well as possible.
Table 9.
Hyperparameter configurations of comparative classification models (SVM, RF, and MLP).
For consistency and a fair comparison, all classification models were trained using the same input features—namely, the 24-dimensional time domain feature vectors (comprising four statistical indicators extracted from six IMFs) obtained via GOA-FMD signal decomposition. The corresponding classification performance of each model is summarized in Table 10 and visualized in Figure 13.
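To make the attention step concrete, the sketch below applies scaled dot-product self-attention to a 24-dimensional feature vector. It rests on two assumptions that are not stated in this section: the vector is arranged as six tokens (one per IMF) of four features each, and identity matrices replace the learned query, key, and value projections of a trained Transformer. It illustrates the mechanism only and is not the classifier used in the study.

```python
import math


def to_tokens(feature_vector, n_imfs=6, n_feats=4):
    """Assumed layout: one token per IMF, four indicators per token."""
    return [feature_vector[i * n_feats:(i + 1) * n_feats] for i in range(n_imfs)]


def softmax(row):
    """Numerically stable softmax of one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections."""
    d = len(tokens[0])
    # Similarity of every token with every other token, scaled by sqrt(d).
    scores = [[sum(q * k for q, k in zip(qi, kj)) / math.sqrt(d) for kj in tokens]
              for qi in tokens]
    weights = [softmax(row) for row in scores]
    # Each output token is a weighted average of all value tokens.
    output = [[sum(w * v[c] for w, v in zip(w_row, tokens)) for c in range(d)]
              for w_row in weights]
    return weights, output


# Placeholder feature values, not measured indicators.
tokens = to_tokens([0.1 * i for i in range(24)])
weights, output = self_attention(tokens)
```

Each row of `weights` sums to one and describes how strongly one IMF token attends to the others, which is the sense in which attention can combine information across all six IMFs at once.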
Table 10.
Classification accuracy of different models on the test set using GOA-FMD features.
Figure 13.
Confusion matrices of fault classification results using different models based on GOA-FMD features: (a) Transformer classifier; (b) MLP classifier; (c) RF classifier; (d) SVM classifier.
According to Table 10, the Transformer model achieved the best performance in OLTC fault classification, with an overall accuracy of 95.83%. It reached 100% accuracy in identifying both the normal condition and fault 3 (arc plate looseness and contact wear). In comparison, the MLP model attained an overall accuracy of 87.50%, but its accuracy for fault 2 (loose screw in drive shaft) dropped to 80.00%. The Random Forest (RF) model yielded an overall accuracy of 80.83%, with relatively poor classification results for fault 1 and fault 2 (both at 76.67%). The Support Vector Machine (SVM) model achieved an overall accuracy of 83.33%, indicating stable but consistently lower performance than the Transformer model. These results suggest that the Transformer model is better able to capture global and discriminative features from complex vibration signals. Its enhanced representation power makes it more suitable for high-precision fault classification tasks, particularly in the context of OLTC diagnostics.
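The per-class and overall accuracies discussed above are both read from a confusion matrix. The sketch below shows that computation, assuming rows index the true class and columns the predicted class. The 3 x 3 matrix is a small toy example chosen for illustration; it is not taken from Figure 13.

```python
def per_class_accuracy(cm):
    """Fraction of each true class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def overall_accuracy(cm):
    """Correct predictions (diagonal) divided by all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Toy confusion matrix, rows = true class, columns = predicted class.
toy_cm = [
    [9, 1, 0],
    [2, 7, 1],
    [0, 0, 10],
]
class_acc = per_class_accuracy(toy_cm)
overall = overall_accuracy(toy_cm)
```

Because the OLTC test set is balanced at 30 samples per class, the overall accuracy there equals the plain average of the four per-class accuracies.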
As shown in Figure 14, the Transformer model obtained the highest average classification accuracy, 96.8%, across the five cross-validation folds, outperforming all other benchmark models. This result highlights its strong ability to capture both the temporal dynamics and global representations inherent in OLTC vibration signals, thereby enhancing diagnostic precision. In contrast, the MLP model achieved an average accuracy of 89.5%, while the Random Forest (RF) and Support Vector Machine (SVM) models obtained 81.1% and 86.6%, respectively. All three models fell short of the Transformer’s performance in terms of both accuracy and stability. These findings confirm the superior representational capacity and generalization ability of the Transformer model and support its use as a high-precision diagnostic framework for complex OLTC fault classification tasks.
Figure 14.
Five-fold cross-validation results of different classification models based on GOA-FMD features.
5. Conclusions
This study investigates the mechanical fault diagnosis problem of the KM-type OLTC and proposes a novel classification framework that integrates GOA-FMD with a Transformer-based deep learning model. By incorporating vibration signal analysis, robust feature extraction, and advanced classification techniques, the proposed approach aims to enhance diagnostic accuracy and equipment reliability. The core conclusions drawn from the study are listed below:
1. A comprehensive experimental setup was established to simulate both normal and faulty OLTC operating conditions. Three representative mechanical fault types were introduced: fault 1—gear jamming in the drive shaft; fault 2—loose screw in the drive shaft; and fault 3—arc plate looseness and contact wear. A total of 600 vibration signal samples were acquired, providing a reliable data foundation for subsequent signal characterization and model training.
2. The vibration signals were decomposed using the GOA-optimized FMD method, in which the key parameters (the number of filters K, the filter length L, and the mode number N) were adaptively optimized. This approach effectively suppressed mode mixing and improved the separability of signal components. Subsequently, four time domain features (peak factor, impulse factor, waveform factor, and margin factor) were extracted from each IMF to construct a discriminative feature vector.
3. Comparative experiments among multiple decomposition methods (GOA-VMD, GOA-EMD, and GOA-EEMD) demonstrated that GOA-FMD achieved the best signal decomposition and classification performance. It obtained the highest average classification accuracy of 96.7%, significantly outperforming GOA-VMD (87.3%), GOA-EMD (73.5%), and GOA-EEMD (79.7%). These results confirm the superior fault feature preservation and anti-noise capability of GOA-FMD.
4. To perform fault classification, a Transformer model was adopted, which takes advantage of attention-based processing to extract global and long-range features from the input data. A comparative evaluation with classic models (SVM, RF, and MLP) revealed that the Transformer achieved the best diagnostic performance both on the test set (95.83%) and in the five-fold cross-validation (96.8%), demonstrating superior generalization ability and robustness under varying data conditions.
In conclusion, the proposed GOA-FMD + Transformer framework provides a feasible and scalable solution for intelligent condition monitoring and fault diagnosis in complex power equipment. The core innovation of this study lies in employing the GOA to adaptively optimize the key parameters of FMD, thereby achieving superior signal decomposition performance. In addition, a Transformer-based classifier is introduced to enhance fault identification by leveraging its capability to model long-range dependencies in vibration signals. The integrated approach offers both high diagnostic accuracy and practical engineering applicability. However, this study is limited to three predefined mechanical fault types. Further research is required to extend the approach to more diverse and complex fault scenarios, so as to improve its generalizability and support deployment in broader real-world settings.
Author Contributions
Conceptualization, R.W. and Z.C.; Methodology, R.W. and D.L.; Software, Q.W. and F.J.; Validation, Z.C., Y.D., and F.J.; Formal analysis, R.W.; Investigation, Q.W.; Resources, Y.D. and H.W.; Data curation, H.W.; Writing—original draft preparation, D.L.; Writing—review and editing, Z.C. and X.W.; Visualization, F.J.; Supervision, X.W.; Project administration, X.W.; Funding acquisition, R.W. and Z.C. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by the Yunnan Province “Xingdian Talent Support Program” Chief Technician Special Fund and the China Southern Power Grid Science and Technology Project (YNKJXM20230542).
Data Availability Statement
The data that support the findings of this study are not publicly available due to privacy concerns and legal restrictions, but can be provided by the authors upon reasonable request.
Acknowledgments
The authors gratefully acknowledge Kunming Power Supply Bureau of Yunnan Electric Grid Co., Ltd. for its support in supplying the experimental samples and data used in this study. Appreciation is also extended to Shandong University for its valuable technical guidance and assistance throughout the research process.
Conflicts of Interest
The authors Ruifeng Wei, Zhenjiang Chen, Qingbo Wang, Yongsheng Duan, Hui Wang, and Feiming Jiang are affiliated with Kunming Power Supply Bureau, Yunnan Electric Grid Co., Ltd. No other authors report any commercial or financial ties that might be interpreted as potential conflicts of interest. This research received funding from Kunming Power Supply Bureau of Yunnan Electric Grid Co., Ltd. The funding organization was involved in the provision of experimental data and supporting documents, participated in the investigation and formal analysis, and contributed to the preparation and technical validation of the original manuscript draft.
Abbreviations
The following abbreviations are used in this manuscript:
| OLTC | On-Load Tap Changer |
| EMD | Empirical Mode Decomposition |
| VMD | Variational Mode Decomposition |
| FMD | Feature Mode Decomposition |
| RF | Random Forest |
| MLP | Multilayer Perceptron |
| SVM | Support Vector Machine |
| EEMD | Ensemble Empirical Mode Decomposition |
| GOA | Gazelle Optimization Algorithm |
| PF | Peak Factor |
| IF | Impulse Factor |
| WF | Waveform Factor |
| MF | Margin Factor |
| IMF | Intrinsic Mode Function |
References
1. Geng, J.; Zhang, Z.; Wang, X.; Gao, S.; Wang, P. On-load tap-changer fault mode recognition based on the singular value of Hilbert energy spectrum time-frequency matrix and spectrum entropy. IET Gener. Transm. Distrib. 2022, 16, 3256–3266.
2. Ismail, F.B.; Mazwan, M.; Al-Faiz, H.; Marsadek, M.; Hasini, H.; Al-Bazi, A.; Yang Ghazali, Y.Z. An offline and online approach to the OLTC condition monitoring: A review. Energies 2022, 15, 6435.
3. Secic, A.; Krpan, M.; Kuzle, I. Vibro-acoustic methods in the condition assessment of power transformers: A survey. IEEE Access 2019, 7, 83915–83931.
4. Nadolny, Z. Design and Optimization of Power Transformer Diagnostics. Energies 2023, 16, 6466.
5. Yang, R.; Zhang, D.; Li, Z.; Yang, K.; Mo, S.; Li, L. Mechanical fault diagnostics of power transformer on-load tap changers using dynamic time warping. IEEE Trans. Instrum. Meas. 2019, 68, 3119–3127.
6. Wang, S.; Hong, Z.; Min, Q.; Zou, D.; Zhao, Y.; Qi, R.; Zhao, T. Diagnosis of Power Transformer On-Load Tap Changer Mechanical Faults Based on SABO-Optimized TVFEMD and TCN-GRU Hybrid Network. Energies 2025, 18, 2934.
7. Shi, Y.; Ruan, Y.; Li, L.; Zhang, B.; Huang, Y.; Xia, M.; Yuan, K.; Luo, Z.; Lu, S. A Mechanical Fault Identification Method for On-Load Tap Changers Based on Hybrid Time–Frequency Graphs of Vibration Signals and DSCNN-SVM with Small Sample Sizes. Vibration 2024, 7, 970–986.
8. Xu, Y.; Chen, B.; Ma, H.; Xu, H.; Wang, L.; Wang, C. Vibration signal feature extraction method of the onload tap changer based on EMDPSD. J. Electr. Power Sci. Technol. 2021, 35, 3–10.
9. Huang, N.E. New method for nonlinear and nonstationary time series analysis: Empirical mode decomposition and Hilbert spectral analysis. In Wavelet Applications VII; SPIE: Bellingham, WA, USA, 2000; pp. 197–209.
10. Liu, J.; Wang, G.; Zhao, T.; Zhang, L. Fault diagnosis of on-load tap-changer based on variational mode decomposition and relevance vector machine. Energies 2017, 10, 946.
11. Dragomiretskiy, K.; Zosso, D. Variational mode decomposition. IEEE Trans. Signal Process. 2013, 62, 531–544.
12. Lei, Y.; Lin, J.; He, Z.; Zuo, M.J. A review on empirical mode decomposition in fault diagnosis of rotating machinery. Mech. Syst. Signal Process. 2013, 35, 108–126.
13. Miao, Y.; Zhang, B.; Li, C.; Lin, J.; Zhang, D. Feature mode decomposition: New decomposition theory for rotating machinery fault diagnosis. IEEE Trans. Ind. Electron. 2022, 70, 1949–1960.
14. Agushaka, J.O.; Ezugwu, A.E.; Abualigah, L. Gazelle optimization algorithm: A novel nature-inspired metaheuristic optimizer. Neural Comput. Appl. 2023, 35, 4099–4131.
15. Li, Z.; Zhou, Z.; Zhou, X. A sensor-based modified FMD method to identify fault feature for mechanical fault diagnosis of ship-borne antennae. IEEE Access 2023, 11, 40018–40028.
16. Duan, X.; Zhao, T.; Li, T.; Liu, J.; Zou, L.; Zhang, L. Method for diagnosis of on-load tap changer based on wavelet theory and support vector machine. J. Eng. 2017, 2017, 2193–2197.
17. Peterson, L.E. K-nearest neighbor. Scholarpedia 2009, 4, 1883.
18. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
19. Fernandes, M.; Corchado, J.M.; Marreiros, G. Machine learning techniques applied to mechanical fault diagnosis and fault prognosis in the context of real industrial manufacturing use-cases: A systematic literature review. Appl. Intell. 2022, 52, 14246–14280.
20. Tolstikhin, I.O.; Houlsby, N.; Kolesnikov, A.; Beyer, L.; Zhai, X.; Unterthiner, T.; Yung, J.; Steiner, A.; Keysers, D.; Uszkoreit, J. MLP-Mixer: An all-MLP architecture for vision. Adv. Neural Inf. Process. Syst. 2021, 34, 24261–24272.
21. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
22. Lin, T.; Wang, Y.; Liu, X.; Qiu, X. A survey of transformers. AI Open 2022, 3, 111–132.
23. Xu, Z.; Guan, H.; Kang, J.; Lei, X.; Ma, L.; Yu, Y.; Chen, Y.; Li, J. Pavement crack detection from CCD images with a locally enhanced transformer network. Int. J. Appl. Earth Obs. Geoinf. 2022, 110, 102825.
24. He, X.; Zhou, X.; Li, J.; Mechefske, C.K.; Wang, R.; Yao, G.; Liu, Q. Adaptive feature mode decomposition: A fault-oriented vibration signal decomposition method for identification of multiple localized faults in rotating machinery. Nonlinear Dyn. 2023, 111, 16237–16270.
25. Li, S.; Dou, L.; Li, H.; Li, Z.; Kang, Y. An Innovative Electromechanical Joint Approach for Contact Pair Fault Diagnosis of Oil-Immersed On-Load Tap Changer. Electronics 2023, 12, 3573.
26. Meng, T.; Jiang, X.; Song, Q.; Hu, Z.; Guo, J.; Zhu, Z. Modified Feature Mode Decomposition Guided by Spectral Structure Information for Machinery Fault Diagnosis. IEEE Trans. Instrum. Meas. 2024, 73, 1–10.
27. Qin, S.; Zeng, H.; Sun, W.; Wu, J.; Yang, J. Multi-strategy improved particle swarm optimization algorithm and gazelle optimization algorithm and application. Electronics 2024, 13, 1580.
28. Chauhan, S.; Vashishtha, G.; Kumar, R.; Zimroz, R.; Gupta, M.K.; Kundu, P. An adaptive feature mode decomposition based on a novel health indicator for bearing fault diagnosis. Measurement 2024, 226, 114191.
29. Zhou, C.; Xiong, Z.; Bai, H.; Xing, L.; Jia, Y.; Yuan, X. Parameter-adaptive TVF-EMD feature extraction method based on improved GOA. Sensors 2022, 22, 7195.
30. Chen, Z.; Yang, Y.; He, C.; Liu, Y.; Liu, X.; Cao, Z. Feature extraction based on hierarchical improved envelope spectrum entropy for rolling bearing fault diagnosis. IEEE Trans. Instrum. Meas. 2023, 72, 1–12.
31. Yang, Y.; Liu, H.; Han, L.; Gao, P. A feature extraction method using VMD and improved envelope spectrum entropy for rolling bearing fault diagnosis. IEEE Sensors J. 2023, 23, 3848–3858.
32. Yang, Y.; Zhang, Y.; Zeng, Q. Research on coal gangue recognition based on multi-layer time domain feature processing and recognition features cross-optimal fusion. Measurement 2022, 204, 112169.
33. Yan, X.; Jia, M. Bearing fault diagnosis via a parameter-optimized feature mode decomposition. Measurement 2022, 203, 112016.
34. Zhang, X.; Li, L.; Liu, S.; Lei, J. Empirical wavelet transform based on energy peak location and its application in weak bearing fault diagnosis. J. Xi’an Jiaotong Univ. 2021, 55, 1–8.
35. Rivas, E.; Burgos, J.C.; Garcia-Prada, J.C. Vibration analysis using envelope wavelet for detecting faults in the OLTC tap selector. IEEE Trans. Power Deliv. 2010, 25, 1629–1636.
36. Hua, M.; Yan, K.; Li, X. A Transformer-based self-supervised learning model for fault diagnosis of air-conditioning systems with limited labeled data. Eng. Appl. Artif. Intell. 2025, 146, 110331.
37. Trujillo-Guerrero, M.F.; Román-Niemes, S.; Jaén-Vargas, M.; Cadiz, A.; Fonseca, R.; Serrano-Olmedo, J.J. Accuracy comparison of CNN, LSTM, and transformer for activity recognition using IMU and visual markers. IEEE Access 2023, 11, 106650–106669.
38. Cichoń, A.; Włodarz, M. OLTC Fault detection based on acoustic emission and supported by machine learning. Energies 2023, 17, 220.
39. Ezzaidi, H.; Fofana, I.; Picher, P.; Gauvin, M. On the Feasibility of Detecting Faults and Irregularities in On-Load Tap Changers (OLTCs) by Vibroacoustic Signal Analysis. Sensors 2024, 24, 7960.
40. Shannon, C.E. Communication in the presence of noise. Proc. IRE 2006, 37, 10–21.
41. Shang, R.; Peng, C.; Fang, R. A segmented preprocessing method for the vibration signal of an on-load tap changer. Electronics 2021, 10, 131.
42. Wu, Z.; Huang, N.E. Ensemble empirical mode decomposition: A noise-assisted data analysis method. Adv. Adapt. Data Anal. 2009, 1, 1–41.
43. Sun, J.; Xiao, Q.; Wen, J.; Wang, F. Natural gas pipeline small leakage feature extraction and recognition based on LMD envelope spectrum entropy and SVM. Measurement 2014, 55, 434–443.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).













