4.1. Compared Methods
To demonstrate the effectiveness of the proposed method, we first compare IGWO-PNN vertically with GWO-PNN, PNN, and the traditional IEC three-ratio method (IEC); second, we conduct a cross-sectional comparison in which different metaheuristic algorithms are combined with PNN to constitute different fault diagnosis models; finally, we compare IGWO-PNN with other, more classical network models. Together, these comparisons comprehensively demonstrate the validity of the IGWO-PNN model.
For the above experimental objectives, three separate sets of comparison tests were set up.
The first group is a longitudinal comparison of IGWO-PNN, GWO-PNN, PNN, and the traditional IEC method. In the second group, we selected several metaheuristic algorithms proposed in the last five years and combined each with PNN to form a fault diagnosis model: the Whale Optimization Algorithm (WOA)-PNN [30], Multi-Verse Optimizer (MVO)-PNN [31], Salp Swarm Algorithm (SSA)-PNN [32], and Seagull Optimization Algorithm (SOA)-PNN [33]. In the third group, we select several previously proposed and relatively traditional network models for comparison, namely the Bat Algorithm (BA)-BP [34], Cuckoo Search (CS)-BP [35], and backpropagation neural networks optimized by particle swarm optimization (PSO) and the genetic algorithm (GA). We also introduce the Extreme Learning Machine (ELM), a more novel network compared to PNN and BP, as a control.
4.2. Results of Model Comparison
In this paper, the MATLAB 2018a simulation platform is used to conduct the experiments. The computing platform is an Intel i7-9750 CPU @ 2.60 GHz. The divided data set is imported into the input layer of the PNN for training. The feature data processed by the three-ratio method are three-dimensional, so the input layer has three nodes; the output layer is the fault classification layer, and its number of nodes equals the number of fault types, i.e., four nodes.
Simulation experiments are performed according to the method proposed in Section 3.1. The parameters of each model in the simulation experiments are shown in Table 4.
After the completion of the first set of comparison experiments, the results for each fault and the average accuracy are shown in Table 5. It can be seen that the average accuracy of IGWO-PNN on the test set is 99.71%, higher than that of GWO-PNN and PNN and far higher than that of the traditional IEC three-ratio method, which fully reflects the superiority of machine learning methods over the traditional IEC approach. The diagnostic results of the three machine learning models on the fault data samples are shown in Figure 6, where Figure 6a,c,e shows the classification results of the training samples and Figure 6b,d,f shows those of the test samples. From Figure 6, we can clearly see that the IGWO-PNN model performs best among the three machine learning models on both the training and test sets. Furthermore, the gap between the diagnostic results of PNN and those of GWO-PNN or IGWO-PNN, on both the training and test sets, is much larger than the difference between GWO-PNN and IGWO-PNN. Therefore, it can be concluded that optimizing the smoothing factor with a metaheuristic algorithm substantially improves the classification accuracy of PNN when it is used as a classifier.
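The smoothing-factor optimization underlying these results can be sketched with a plain (unimproved) GWO loop in one dimension. The quadratic fitness below is a hypothetical stand-in chosen only to keep the sketch self-contained; in the paper, the fitness is the PNN's diagnostic error on the fault data.

```python
import numpy as np

def gwo_minimize(fitness, lb, ub, n_wolves=10, iters=40, seed=1):
    """Plain grey wolf optimizer: wolves move toward the three best
    solutions (alpha, beta, delta) with a linearly decaying step scale."""
    rng = np.random.default_rng(seed)
    wolves = rng.uniform(lb, ub, n_wolves)
    for t in range(iters):
        fits = np.array([fitness(w) for w in wolves])
        alpha, beta, delta = wolves[np.argsort(fits)[:3]]
        a = 2.0 * (1 - t / iters)               # decays from 2 to 0
        for i in range(n_wolves):
            pos = 0.0
            for leader in (alpha, beta, delta):
                A = 2 * a * rng.random() - a    # exploration/exploitation
                C = 2 * rng.random()
                pos += leader - A * abs(C * leader - wolves[i])
            wolves[i] = np.clip(pos / 3.0, lb, ub)
    fits = np.array([fitness(w) for w in wolves])
    return float(wolves[np.argmin(fits)])

# Stand-in fitness with a hypothetical optimum at sigma = 0.7.
best = gwo_minimize(lambda s: (s - 0.7) ** 2, lb=0.01, ub=3.0)
```

IGWO modifies the update strategy of this standard loop (as described earlier in the paper) to improve convergence; the sketch above shows only the baseline mechanism being improved upon.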
In order to demonstrate the superiority of the proposed method from several perspectives, we introduced the mean square error (MSE) into all three sets of comparison experiments. The MSEs of PNN, GWO-PNN, and IGWO-PNN in the first set of comparison experiments are shown in Figure 7. It can be seen that, compared to PNN and GWO-PNN, IGWO-PNN has lower dispersion and therefore stronger model stability.
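The section does not spell out exactly how the MSE of a classifier is computed; a common convention, assumed here, measures the squared gap between one-hot fault labels and the model's output scores.

```python
import numpy as np

def classifier_mse(y_true, proba):
    """MSE between one-hot targets and class scores; lower values mean
    the outputs are concentrated on the correct fault class."""
    onehot = np.eye(proba.shape[1])[y_true]  # one row per sample
    return float(np.mean((proba - onehot) ** 2))
```

Under this convention, a model can reach the same accuracy as another while having a worse MSE, which is why MSE is used here as a complementary measure of stability.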
The first set of experiments fully demonstrates the feasibility and effectiveness of optimizing the smoothing factor and of the improved GWO theory. The GWO algorithm was introduced in 2014, and many novel metaheuristics have appeared since then. Does the improved GWO still hold an advantage in transformer fault diagnosis over these latecomers? This is the question investigated in the second set of comparison experiments.
Again, based on the same training and test sets, the IGWO-PNN fault diagnosis model is compared in several respects with PNN fault diagnosis models built from other novel intelligent algorithms.
The results for each fault and the average accuracy are shown in Table 6.
As can be seen from Table 6, the novel metaheuristic algorithms are generally better at tuning the smoothing factor and generally achieve higher average accuracy on the test set. However, the average accuracy of IGWO-PNN is at least 2% higher than that of the other PNN models. Hence, on the same fault data samples, IGWO still holds an advantage over newer intelligent algorithms such as MVO and SSA.
Figure 8 shows the diagnostic results, on both the training and test sets, of the PNN fault diagnosis models built from the novel metaheuristic algorithms in Table 6. IGWO-PNN has only one misdiagnosis, the best performance among all the PNN models.
Similarly, we evaluate the five metaheuristic models using the MSE; the results are shown in Figure 9. The models' MSE performance is more indicative of their practical usefulness than a comparison of average diagnostic accuracy alone. As seen in Figure 9, the MSEs of the other four metaheuristic algorithms are greater than that of IGWO-PNN on both the training and test sets. Even MVO-PNN, the model closest to IGWO-PNN in accuracy, has an MSE more than three times that of IGWO-PNN.
In order to demonstrate that the established IGWO-PNN model has strong convergence capability on the fault dataset, in the second set of comparison experiments we introduce fitness curves to show the search process of each of the six metaheuristics above (including GWO); the results are shown in Figure 10. The fitness value of IGWO starts decreasing from the third iteration and converges to the minimum at the ninth, so the whole process takes fewer than 10 iterations, the fastest among the six metaheuristic algorithms. Compared with the other algorithms, IGWO also escapes local optima more easily. This demonstrates that the proposed update strategy significantly improves the convergence ability of GWO.
The first two sets of comparison experiments revolve around PNN; in the third set, we choose other neural network models that have been proposed in the literature or are more classical (BP and ELM) to compare with the IGWO-PNN model. The results for each fault and the average accuracy are shown in Table 7, and Figure 11 shows the corresponding simulation results.
From the simulation results of the third set of comparison experiments, the IGWO-PNN model still has the highest diagnostic accuracy. However, fault diagnosis models proposed in other literature, such as CS-BP and BA-BP, come close to IGWO-PNN, especially CS-BP, whose average accuracy is 99.42%. In terms of average accuracy alone, then, IGWO-PNN does not hold a large advantage.
However, the MSEs of the six models, shown in Figure 12, tell a different story. Although CS-BP is very close to IGWO-PNN in average accuracy, its MSE is far less favorable. In contrast, the stability and practicality of the newly proposed IGWO-PNN model are much better.
Similar to the second set of comparison experiments, we again use fitness curves to observe the optimization search process of IGWO-PNN and the four BP models. Note that in these four BP models, the intelligent algorithms (CS, BA, and PSO) optimize the weights and biases of the BP neural network. The fitness curves are shown in Figure 13, which fully reflect the strong convergence ability of IGWO compared with traditional metaheuristics such as CS, PSO, and BA. In this paper, the fitness function is the MSE. From Figure 13, the fitness value of IGWO-PNN when it reaches its optimum within the maximum number of iterations is much smaller than that of the other four models, which again highlights the global search ability and stability of IGWO.
Considering the imbalance of the collected fault data (low-temperature overheating has far more samples than the other fault types), we also evaluate the classification models using the macro F1-score. Table 8 shows the macro F1-scores of all the methods mentioned in this paper. Accuracy is not a perfect metric for imbalanced datasets, and in this case the F1-score better reflects the true performance of a classification model. Table 8 shows that, between two models, the one with relatively higher accuracy does not necessarily have the higher macro F1-score, as in the comparison between CS-BP and WOA-PNN. Nevertheless, among all the models, IGWO-PNN still tops the list, which shows that the proposed model has high classification utility.
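The macro F1-score averages per-class F1 without weighting by class size, so minority fault classes count as much as the dominant one; a minimal sketch:

```python
import numpy as np

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging)."""
    scores = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0   # precision
        r = tp / (tp + fn) if tp + fn else 0.0   # recall
        scores.append(2 * p * r / (p + r) if p + r else 0.0)
    return float(np.mean(scores))
```

For example, a classifier that predicts only the dominant class on a data set with eight samples of that class and one sample each of two others reaches 80% accuracy but a macro F1 of only about 0.30, illustrating why accuracy alone can flatter a model on imbalanced data.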
The above three sets of comparison experiments show the superiority of the IGWO-PNN fault diagnosis model in terms of average classification accuracy, MSE, and diagnostic efficiency. However, all of these experiments used the same training and test sets. To avoid the chance effects of a single data split, Table 9 shows the cross-validation results of the different machine learning models.
The average accuracies obtained from 5-fold, 10-fold, and 15-fold cross-validation are clearly more convincing than results on a single test set. The accuracy of the CS-BP model falls by as much as 4% compared to the test set, which shows that single-test-set results carry an element of chance. The average accuracy of IGWO-PNN is 97.28%, the highest among all models, showing that the proposed model has good generalization ability. Although this is slightly lower than its test-set result, it better reflects the model's realistic performance.
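The k-fold protocol behind Table 9 can be sketched generically; `train_and_eval` below is a hypothetical callback standing in for any of the compared models (it trains on the given split and returns its fold accuracy).

```python
import numpy as np

def kfold_accuracy(X, y, train_and_eval, k=5, seed=0):
    """Average accuracy over k folds: each fold serves once as the test
    set while the remaining k-1 folds form the training set."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, k)
    accs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        accs.append(train_and_eval(X[train], y[train], X[test], y[test]))
    return float(np.mean(accs))
```

Repeating this with k = 5, 10, and 15 and averaging, as in Table 9, reduces the dependence of the reported accuracy on any single train/test split.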