Article

A Convolutional Neural Network-Based Model for Multi-Source and Single-Source Partial Discharge Pattern Classification Using Only Single-Source Training Set

by Sara Mantach, Ahmed Ashraf, Hamed Janani and Behzad Kordi
1 Department of Electrical & Computer Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada
2 Verint Systems, Vancouver, BC V6E 4E6, Canada
* Author to whom correspondence should be addressed.
Energies 2021, 14(5), 1355; https://doi.org/10.3390/en14051355
Submission received: 5 January 2021 / Revised: 20 February 2021 / Accepted: 25 February 2021 / Published: 2 March 2021

Abstract

Classification of the sources of partial discharges has been a standard procedure for assessing the status of insulation in high voltage systems. One of the challenges in classifying these sources is deciding on the distinct properties of each one, a task that often requires trained human experts. Machine learning offers a solution to this problem by allowing models to be trained on extracted features. The performance of such algorithms depends heavily on the choice of features. This dependence can be overcome by using deep learning, where feature extraction is performed automatically by the algorithm and the input is the raw data. In this work, an enhanced convolutional neural network is proposed that is capable of classifying single sources as well as multiple sources of partial discharges without introducing multiple sources in the training phase. Training is performed using only single-source phase-resolved partial discharge (PRPD) patterns, while testing is performed on both single- and multi-source PRPD patterns. The proposed model is compared with a single-branch CNN architecture. The average classification accuracies of the proposed architecture for single-source PDs and multi-source PDs are 99.6% and 96.7%, respectively, compared to 96.2% and 77.3% for the traditional single-branch CNN architecture.

1. Introduction

Effective insulation degradation diagnosis is a key prerequisite for monitoring the integrity of any electrical system. An accepted diagnostic method that has been used over the years is the measurement of partial discharges (PD) [1]. Different parameters have been employed for PD classification throughout the years, including the maximum discharge magnitude and the number of discharges as a function of time, PD pulses on an elliptic time-base, the phase of the positive half cycle of the PRPD patterns, features extracted using different dimensionality reduction techniques, and the application of mixed Weibull functions and the wavelet transform to discharge patterns. For deep learning approaches, the inputs used are waveform spectrograms, time-domain waveform signals, and PRPD patterns. Okamoto and Tanaka were among the first to work on developing techniques to measure partial discharges in 1986 [2]. Their work demonstrated the existence of a correlation between the distribution profile of the charge against the phase angle and the level of insulation degradation by analysing the skewness of the profile. Another approach for determining partial discharge sources was based on the analysis of different quantities of discharge as a function of time, including the maximum discharge magnitude, the number of discharges, and the inception voltage [3]. By the 1990s it was evident that distinctive characteristic behaviors, such as increases, decreases, and strong or weak fluctuations of these quantities, could be correlated to discharge sources.
With the advancements in the field of pattern recognition, interest increased in automating partial discharge recognition and classification. In 1993, one of the first successful applications of neural networks for automatic recognition of a partial discharge source was reported [4]. The input was extracted from commercial partial discharge detectors that would display PD pulses on an elliptic time-base. The phase position and the spread of the pulses were shown to be correlated with the nature of the PD source, suggesting that PD pulses on an elliptic time-base provided important features for characterization. The rate of correct classification varied between 70% and 90% depending on the number of layers in the neural network and the classes to be classified. The choice of the neural network architecture has been an open question since that time. Poor generalization was recorded on real patterns compared to that on synthesised training patterns [5]. In [6], phase resolved partial discharge (PRPD) patterns were considered as inputs to the neural network, wherein the phase of the positive half cycle was considered. The study proposed a way to separate superimposed charge-phase patterns based on separating contours before passing them to the neural network. The limitation of this method is that it required the patterns to be non-overlapping. More progress was made by Krivda, who used dimensionality reduction techniques to derive low-dimensional representations of different partial discharge patterns [7]. Krivda [7] concluded that in order to decide on the right features, a balance should be set between the number of features and the time needed to compute them; moreover, new types of neural network could yield better results. In [8], the authors discussed automatic recognition of multiple PD sources. A stochastic method based on applying mixed Weibull functions to the pulse-height distribution patterns was investigated. The study concluded that in the case of partially or completely superimposed PD patterns, separation was impossible.
Up to the year 2000, automatic recognition of multiple PD sources was yet to be resolved. The authors in [9] introduced the application of the wavelet transform to PD detection, proposing the use of the Daubechies mother wavelet. Features were extracted from the third-level reconstructed horizontal (H) and vertical (V) component images. A feature vector was composed by averaging the H and V images in the magnitude and phase directions, resulting in 150 elements. The neural network used in this model had one hidden dense layer, and multiple-source patterns were used during training. The overall classification accuracy was 88%. However, the authors concluded that further study of actual multiple-source PD was required for a more accurate assessment of the proposed method. In [10], stochastic procedures and a fuzzy classifier were implemented to identify different PD pulses; however, it was noted that the fuzzy classifier was not efficient when the PD pulses had similar shapes.
Historically, the input data for any machine learning algorithm had to be pre-processed by using the user’s knowledge of the domain and an assessment of which features are important for the specific problem. By 2006, automatic feature extraction became possible through the use of deep artificial neural networks which could accept raw data as input. The first application of deep learning to PD diagnosis was reported in 2015 [11]. In [11], the authors recorded the PRPD patterns for six different PD defects in oil, where the patterns were treated as 50 × 64 dimensional images. The classification accuracy increased as the number of hidden layers increased, reaching 86% for five hidden layers. The authors in [12] were among the first to use a deep learning architecture called a recurrent neural network (RNN) for the classification of PRPD patterns. Trials were performed to decide on the best values for the number of layers and the number of power cycles. They achieved an accuracy of 96.62%, which outperformed simple deep neural networks (with an accuracy of 93.01%) and traditional machine learning techniques using a support vector machine (with an accuracy of 88.63%). Recently, a number of authors have reported the use of deep learning models, such as convolutional neural networks (CNN), for classifying PD sources [13,14,15]. Among these works, various formats of input have been used for PD source identification; these inputs include waveform spectrograms, time-domain waveform signals, and PRPD patterns. For the waveform spectrogram data, the authors in [16] used a CNN to detect PD signals with varying noise and interference signals. The input to the network was an image showing the time-frequency spectrum of sound clips, which were measurements recorded from a switchgear using the transient earth voltage (TEV) method. The CNN showed superior performance in terms of detection accuracy and detection time compared to other methods prevalent in the industry. In [17], Che et al. used a 2D-CNN to classify three PD sources in XLPE cable, namely internal PDs, corona PDs, and surface PDs, in addition to noise. Acoustic signals were generated using an optical fiber distributed acoustic sensing system. The 1D signals were converted to a 2D spectral representation by applying mel-frequency cepstrum coefficient (MFCC) analysis. For the time-domain waveform data, the authors in [18] used signals from an analog transformer model which consisted of impulse fault current waveforms for different fault conditions. Each waveform was represented by a 2500-dimensional vector. The training was performed using the PD data from sources co-occurring simultaneously at two different locations within the winding. This resulted in a total of 20,304 classes corresponding to the different fault conditions at different winding locations. The classification accuracy attained was 99.2%. Wang et al. were interested in UHF signals for partial discharge detection in GIS [19]. They collected time-series data from lab experiments and from simulations using the finite-difference time-domain (FDTD) method. The input to the CNN consisted of 64 × 64 images that were downsampled from a 600 × 438 time-resolved partial discharge (TRPD) image. The classification accuracy was compared with that of conventional methods based on statistical features as input. It was concluded that the CNN outperforms the traditional methods when the number of training examples is greater than 500.
For the PRPD input data, the authors in [20] obtained mixed onsite and experimental PRPD patterns for six different sources of partial discharge. The input data were represented as a 72 × 50 matrix, and an accuracy of 89.7% was achieved. In [21], the authors used a CNN to detect the deterioration of the insulation in high voltage systems using PRPD images. Four classes were distinguished: start, middle, end, and noise. The tested specimens were aged by undergoing high electric stress in a lab setup. Different CNN architectures were investigated by changing hyper-parameters such as the number and the size of the kernels. The results were reported in terms of the confusion matrix and the accuracy percentage. In [22], an algorithm was presented to identify multi-source PDs based on a two-step logistic regression model.
It is noteworthy that all of the prior methods reported above depend on the availability of training data from multi-source PD inputs [23]. There are a number of drawbacks associated with this choice. Such training data are difficult and time consuming to collect in practice and, by their very combinatorial nature, preclude the collection of examples for all possible combinations of concurrently occurring defects. In this paper, to address these drawbacks, we propose a novel convolutional architecture for single-source PD and multi-source PD classification using training data with ground truth available only at the level of single-source PDs. Our proposed architecture consists of a convolutional backbone feeding into multiple fully connected neural networks (FCNs). The input to the convolutional part of the network is the PRPD pattern matrix (Section 2.1). The output of this CNN stage is a common feature representation which is broadcast to the different FCNs, wherein each FCN is trained to output the probability of occurrence of a specific PD. Thus, the proposed hybrid architecture moves from extracting general representations to more fine-tuned representations in a hierarchical fashion. The overall loss of the network is the combination of the individual binary cross entropy losses from each of the FCNs. This loss is jointly optimized with respect to the parameters of the CNN stage and the FCNs. At testing time, our network produces a multi-label output vector signifying the probability of the presence of the respective PDs. We show superior performance compared to models trained independently on single-source PDs, demonstrating the value of the shared convolutional stage and the joint optimization of the FCNs.

2. Material and Methods

2.1. Experimental Setup

Several experiments have been performed by different groups to classify partial discharges by the use of their phase resolved partial discharge patterns. PD classification and identification using laboratory data has been used to establish proof of concept for a number of techniques available in the literature (e.g., [24,25]). Lab experiments were performed by Janani et al. [26,27] to simulate artificial defects. The experimental setup consisted of a high voltage transformer, a capacitive divider to measure the AC voltage, the test cell, and the PD measurement system, as shown in Figure 1. The lab setups simulate common sources of PD in air, oil, and SF6. PD data collection was conducted in accordance with the IEC 60270 standard [28]. The test cells include three sources of partial discharge in SF6 (floating electrode, moving particle, and fixed protrusion; Figure 2), two sources of PD in transformer oil (free particle and needle electrode; Figure 3), and corona in air, which uses the same setup as the floating electrode but filled with air. For the floating electrode, the gap between the two electrodes is 1 mm. For the free particle, a small bearing with a diameter of 3.17 mm was placed on a concave dish ground electrode. For the point plane electrode, the needle has a diameter of 20 μm [27]. More details on the experimental setups are given in [29]. In total, six different single-source PD patterns are generated. In addition, four different combinations of multiple partial discharges were simulated by using two or three test cells simultaneously. A commercial PD measurement system (Omicron MPD 600) was used to acquire the PRPD pattern of each test cell.
The output data from the Omicron software are exported as binary files. These data include information about the partial discharges taking place relative to the applied phase voltage. The discharge magnitude and phase are divided into 400 and 500 bins, respectively. This results in a 400 × 500 matrix $M(x, y)$, where the number in each bin represents the number of discharges occurring at a specific phase angle ($x$) and a specific discharge magnitude ($y$). Figure 4 shows a visual representation of the six single-source PRPD patterns. The 400 × 500 matrix is reduced to 100 × 100 by summing up the counts in each 4 × 5 sub-matrix. In addition, background noise is unavoidable even under ideal measurement conditions; it appears as an offset charge over all the phase windows and, in this work, has been removed from all the PRPD patterns. In addition to the six classes, an additional no-pattern class, corresponding to cases in which no PD is present, is added. In order to encourage the model to learn features related to the shape of the PRPDs, the samples were converted into binary samples, where a zero threshold is used for binarization. A visual representation of the binary matrices is shown beside the Omicron representation of each of the six single PD source classes. Figure 5 shows the PRPD patterns of the four multi-source PD classes.
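As an illustration of this preprocessing chain, the following sketch, written in Python with NumPy, downsamples a 400 × 500 count matrix to 100 × 100 by block summation, subtracts a constant background offset, and binarizes the result with a zero threshold. The function name, the noise_offset parameter, and the dummy input are illustrative assumptions and are not part of the Omicron export format.

```python
import numpy as np

def preprocess_prpd(prpd_counts, noise_offset=0):
    """Downsample a 400 x 500 PRPD count matrix to 100 x 100, remove a
    constant background-noise offset, and binarize with a zero threshold."""
    m = np.asarray(prpd_counts, dtype=np.int64)
    # Sum the counts in each 4 x 5 block (400 / 100 = 4, 500 / 100 = 5).
    reduced = m.reshape(100, 4, 100, 5).sum(axis=(1, 3))
    # Background noise appears as an offset charge over all phase windows.
    reduced = np.clip(reduced - noise_offset, 0, None)
    # Keep only the shape of the pattern.
    return (reduced > 0).astype(np.float32)

# Example with a random dummy histogram standing in for real Omicron data.
dummy = np.random.poisson(0.05, size=(400, 500))
binary_pattern = preprocess_prpd(dummy)
print(binary_pattern.shape)  # (100, 100)
```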
To make the system insensitive to changes in the charge magnitude settings of the Omicron software and to different applied voltages, samples are extracted using multiple magnitude settings, hence introducing variability into the dataset. In particular, the charge magnitude settings employed for each class are summarized in Table 1. The numbering of the six single-source classes is done as follows:
  • Class 1 (corona in air)
  • Class 2 (floating electrode in SF6)
  • Class 3 (free particle in oil)
  • Class 4 (free particle in SF6)
  • Class 5 (point plane electrode in oil)
  • Class 6 (point plane electrode in SF6)
The numbering of the four multiple-source classes is done as follows:
  • Class 14 (corona in air and free particle in SF6)
  • Class 16 (corona in air and point plane electrode in SF6)
  • Class 46 (free particle in SF6 and point plane electrode in SF6)
  • Class 146 (corona in air, free particle in SF6 and point plane electrode in SF6)

2.2. Method

Convolutional neural networks (CNNs) represent a class of deep neural networks that were originally designed for visual images, and have shown state-of-the-art performance for a range of applications [30,31,32,33]. Typically, CNNs consist of a cascade of alternating convolutional and pooling layers as shown in Figure 6. A convolutional layer comprises a bank of linear 2D or 3D filters which are convolved with a multi-channel input image to produce a multi-channel output of feature maps. The output of the convolutions is often passed through a non-linear activation function such as a rectified linear unit (ReLU). A pooling layer subsamples the input in a non-linear fashion (e.g., taking the maximum value in a local window). The successive convolutional and pooling layers, coupled with non-linear activations, confer on CNNs the capability to automatically learn feature representations at different spatial scales of an image in a hierarchical fashion. Common applications include classification, regression, and matrix-to-matrix transformations [34,35].
In classification problems, a data-point can belong to a single class (mutually exclusive membership) or it can belong to multiple categories at the same time. The latter is usually referred to as multi-label classification. Since PRPD patterns from multiple sources can occur concurrently, PD detection is essentially a multi-label classification problem. In the presence of training data with various combinations of co-occurring multi-source PD labels, building a multi-label classification model is tenable. However, as mentioned in Section 1, collection of such a dataset is expensive, time consuming, and may not span all possible combinations of PDs. On the other hand, it is practically more feasible to collect single-source PD data in large quantities. We therefore focus on methods to capitalize on single-source training data for solving the multi-label classification problem.
Let $K$ be the number of PD sources. To enable explicit detection of cases with no PDs, we define a separate category representing the absence of all the PDs. Let the training data consisting of $N$ examples be represented as $\{X_i, y_i\}_{i=1}^{N}$, where $X_i \in \mathbb{R}^{H \times W}$ is the $i$th PRPD pattern image, and $y_i \in \{0, 1\}^{K+1}$ is the corresponding $(K+1)$-dimensional binary label vector signifying the presence or absence of each PD; the label vector is $(K+1)$-dimensional because, as described above, we have defined an additional class for cases with no PDs. Since only single-source examples are considered during training, each $y_i$ is a one-hot vector. At testing time, the label vector for a test case can contain multiple 1s.
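The label convention can be made concrete with a short sketch, assuming the class numbering of Section 2.1 ($K = 6$ sources plus the no-PD class); the helper below is hypothetical and only illustrates how the one-hot training labels differ from the multi-hot label vectors encountered at test time.

```python
import numpy as np

K = 6                    # number of PD sources (classes 1-6)
NUM_CLASSES = K + 1      # plus the explicit no-PD class (class 7)

def one_hot(class_index):
    """Training label: exactly one class present (single-source or no-PD)."""
    y = np.zeros(NUM_CLASSES, dtype=np.int64)
    y[class_index - 1] = 1
    return y

train_label = one_hot(4)                        # single-source class 4
test_label = np.array([1, 0, 0, 1, 0, 1, 0])    # multi-source class 146
```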

2.2.1. Multiple Single-Source Classifiers (Baseline)

To achieve multi-label classification, a traditional way has been to learn multiple ($K+1$) independent binary classifiers, each trained to detect an individual PD defect. The loss for the $k$th model, given the training dataset, is given by
$$\mathcal{L}_k(\theta_k) = -\frac{1}{N}\sum_{i=1}^{N}\Big[\, y_{ik}\log\big(F_{\theta_k}(X_i)\big) + (1 - y_{ik})\log\big(1 - F_{\theta_k}(X_i)\big) \Big] \qquad (1)$$
where $F_{\theta_k}$ is the function encoded by the $k$th model with parameters $\theta_k$, and $y_{ik}$ is the $k$th element of $y_i$. After the training phase, given a test case $X^{(\mathrm{test})}$, one then needs to invoke all $K+1$ models to build a multi-label output, $\hat{y}^{(\mathrm{test})} = \{F_{\theta_k}(X^{(\mathrm{test})})\}_{k=1}^{K+1}$. An example convolutional architecture that accepts a PRPD pattern image and performs a binary classification for the presence of a specific single-source PD is shown in Figure 7a.
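A minimal Keras sketch of one such independent binary classifier is given below, following the layer sizes in Table 2; the builder function, the optimizer choice, and the final usage line are assumptions made for illustration rather than the authors' exact implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_single_source_classifier():
    """One independent binary classifier F_theta_k (Figure 7a / Table 2)."""
    inputs = layers.Input(shape=(100, 100, 1))
    x = layers.BatchNormalization()(inputs)
    # Two conv/pool/batch-norm/ReLU blocks with 36 filters of size 3 x 3.
    for _ in range(2):
        x = layers.Conv2D(36, (3, 3), padding="same")(x)
        x = layers.MaxPooling2D((2, 2))(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    x = layers.Flatten()(x)              # (None, 25 * 25 * 36) = (None, 22500)
    for units in (128, 64):
        x = layers.Dense(units)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)
    model = models.Model(inputs, outputs)
    # Binary cross-entropy corresponds to the per-class loss in Eq. (1).
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

# The baseline uses K + 1 = 7 such models, one per class (incl. the no-PD class).
classifiers = [build_single_source_classifier() for _ in range(7)]
```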

2.2.2. Joint Model with Shared CNN Parameters (Proposed)

While the baseline approach described above may learn excellent single-source classifiers, it is not expected to generalize to the multi-label classification task. This is due primarily to over-tuned class-specific parameters $\{\theta_k\}_{k=1}^{K+1}$ learned independently for each single-source PD. To address this issue, we propose to decompose the network parameters into two sets: a shared set of common parameters, $\rho_{\mathrm{CNN}}$ (for the convolutional part), and class-specific parameters, $\{\phi_{\mathrm{FCN}}^{k}\}_{k=1}^{K+1}$ (for the fully connected networks). In particular, our proposed architecture has a shared convolutional stage for feature extraction. These features are then distributed to multiple FCNs. Our motivation is to encourage the CNN to learn to extract more general feature representations which are useful for all classes. The FCNs accept these general features to learn class-specific models in a joint fashion. Our architecture is shown in Figure 7b. Let the CNN part be represented by the network $G$, and each of the fully connected networks be represented by $H_k$. Our joint loss function is then given by
$$\mathcal{L}\big(\rho_{\mathrm{CNN}}, \{\phi_{\mathrm{FCN}}^{k}\}_{k=1}^{K+1}\big) = -\frac{1}{N(K+1)}\sum_{k=1}^{K+1}\sum_{i=1}^{N}\Big[\, y_{ik}\log H_k\big(G(X_i)\big) + (1 - y_{ik})\log\big(1 - H_k(G(X_i))\big) \Big] \qquad (2)$$

2.2.3. Design Details for Network Layers

The network used in this study consists of two convolutional layers, each with 36 filters of kernel size 3 × 3, followed by two dense layers with 128 and 64 nodes, respectively, and ending with a classification layer of seven nodes. Batch normalization has been used in order to decrease the effect of over-fitting. The schematic for one of the classifiers in Figure 7a is shown in Figure 8. The design details for the implemented classifiers are shown in Table 2. The hyperparameters of the neural network, such as the number of layers, the number of nodes per layer, and the kernel size, were chosen by running experiments for different values of the parameters and plotting the training and validation accuracy curves as a function of epochs.
The design of the layers is kept the same for both the baseline model and the proposed model so that the difference in performance due to the proposed parameter-sharing based architecture can be investigated. The proposed model architecture is shown in Figure 9.
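To make the parameter sharing concrete, the following sketch, again in Keras, builds a single convolutional backbone (the parameters $\rho_{\mathrm{CNN}}$) feeding seven small fully connected heads (the parameters $\phi_{\mathrm{FCN}}^{k}$), with the joint loss of (2) obtained by combining the per-head binary cross-entropies. Layer names, the optimizer, and the loss weighting are illustrative assumptions, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_shared_backbone_model(num_classes=7):
    """Shared CNN backbone G with one fully connected head H_k per class
    (Figure 7b / Figure 9)."""
    inputs = layers.Input(shape=(100, 100, 1))

    # Shared convolutional stage (common parameters rho_CNN).
    x = layers.BatchNormalization()(inputs)
    for _ in range(2):
        x = layers.Conv2D(36, (3, 3), padding="same")(x)
        x = layers.MaxPooling2D((2, 2))(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation("relu")(x)
    features = layers.Flatten()(x)

    # Class-specific fully connected heads (parameters phi_FCN_k).
    outputs = []
    for k in range(num_classes):
        h = layers.Dense(128, activation="relu")(features)
        h = layers.Dense(64, activation="relu")(h)
        outputs.append(
            layers.Dense(1, activation="sigmoid", name=f"class_{k + 1}")(h))

    model = models.Model(inputs, outputs)
    # Keras sums the per-head binary cross-entropies; the loss weights
    # rescale that sum to the 1/(K+1) average in Eq. (2).
    model.compile(optimizer="adam",
                  loss=["binary_crossentropy"] * num_classes,
                  loss_weights=[1.0 / num_classes] * num_classes)
    return model

model = build_shared_backbone_model()
# The training targets are a list of 7 binary vectors (one per head), built
# from the one-hot single-source labels; at test time each head outputs the
# probability that its PD source is present, giving a multi-label prediction.
```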

3. Performance Metrics

Since the model is trained using single-source PRPD patterns only, the generalization of the model is tested by evaluating the performance on a new hybrid dataset that includes PRPD patterns from single as well as from multiple partial discharge sources. In addition, samples with different charge magnitude specifications in the Omicron software are tested. Different standard multi-label classification metrics have been used in the literature to evaluate the performance of trained models. Some of these metrics include mean average precision, 0-1 exact match, macro and micro F1, per-class precision, per-class recall, overall precision, and overall recall [36]. In this paper, the individual recall (Recall(k)) and the individual precision (Precision(k)) are calculated for each of the classes 1 to 7 by taking into account both single-source PDs and multi-source PDs. The recall reflects the proportion of the positive examples that are correctly classified, and the precision reflects the proportion of the examples predicted to be positive that are actually positive. PCR and PCP represent the arithmetic means of the per-class recall and precision, respectively,
$$\mathrm{PCR} = \frac{1}{K}\sum_{k=1}^{K}\mathrm{Recall}(k) \qquad (3)$$

$$\mathrm{PCP} = \frac{1}{K}\sum_{k=1}^{K}\mathrm{Precision}(k) \qquad (4)$$
In addition, the classification accuracy and the false negative accuracy are evaluated for single as well as for multiple classes. The classification accuracy is calculated considering equal weights for all classes, while the false negative accuracy is calculated taking into consideration only the true class or classes to which the sample belongs. The importance of calculating the false negative accuracy metric in this context comes from the fact that it is of high importance to detect the correct source of PD in high voltage systems. Consistent false identification of a PD source will put the high voltage apparatus at risk of failure, in addition to posing a safety risk for employees working near this apparatus. The false negative accuracy reflects on the performance of the model by quantitatively evaluating single classes and multiple classes separately, in comparison to the individual recall and precision. If a PRPD pattern belongs to classes one, four and six, then the ground truth is [1 0 0 1 0 1 0]. The classification accuracy is then calculated by checking how many elements of the seven-element prediction vector match this ground truth. The classification accuracy for a single sample is calculated as
$$P_{\mathrm{classification}} = \frac{\sum_{k=1}^{7} M_k}{7} \times 100 \qquad (5)$$
where $M_k$ is equal to one when element $k$ of the ground truth vector agrees with the prediction of the model for the corresponding class $k$, and zero otherwise. The ideal classification accuracy is 100% and the worst is 0%.
The false negative accuracy for a single sample is calculated as:
$$P_{\mathrm{false\ negative}} = \frac{\sum_{j=1}^{T} N_j}{T} \times 100 \qquad (6)$$
where $N_j$ equals one when element $j$ of the ground truth vector, which is equal to one, does not agree with the prediction of the model for the corresponding class $j$, and zero otherwise. In this metric, checking the matching prediction is performed only on the class or classes to which the sample truly belongs. $T$ is the number of classes to which the sample truly belongs; in our test dataset, $T$ can be 1 for single-source PDs, and 2 or 3 for multi-source PDs. Hence, the ideal false negative accuracy is 0% and the worst is 100%. Calculating the classification and false negative accuracies over a number of samples is done by averaging (5) and (6) over the number of samples.
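For concreteness, a small NumPy sketch of the two per-sample metrics in (5) and (6) is given below, using the class-146 example above; the function names are illustrative.

```python
import numpy as np

def classification_accuracy(y_true, y_pred):
    """Eq. (5): percentage of the 7 label elements predicted correctly."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return 100.0 * np.mean(y_true == y_pred)

def false_negative_accuracy(y_true, y_pred):
    """Eq. (6): percentage of the truly present classes that were missed."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    present = y_true == 1
    return 100.0 * np.mean(y_pred[present] != 1)

# Example: a sample from class 146 (classes 1, 4 and 6 present).
y_true = np.array([1, 0, 0, 1, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 0, 0])   # class 6 is missed
print(classification_accuracy(y_true, y_pred))   # 6/7 * 100 = 85.7
print(false_negative_accuracy(y_true, y_pred))   # 1/3 * 100 = 33.3
```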

4. Results and Discussion

4.1. Independent Classifiers

Figure 10 shows the calculated loss of the trained models, using (1), as a function of epochs (or iterations) for both the training and the validation datasets on a log scale. An epoch is one pass of the entire dataset forward and backward through the neural network. Over-fitting is clearly seen for classes six and seven, where the gap between the training loss and the validation loss increases at epoch 1000. Table 3 shows the classification and the false negative accuracies.
As seen in Table 3, the model does not generalize well to the multiple classes, especially Class 16 and Class 146. The per-class recall and precision, together with their arithmetic means, are shown in Table 4. The precision for classes three and four is low, as is the recall for class six. In Table 5 we show the hybrid confusion matrix, in which the rows and columns represent the input and predicted classes, respectively. The true positives are highlighted for better visibility.

4.2. Proposed Model

As we proceed with training the model, a trade-off takes place between generalization and learning deeper features about single partial discharges. Generalization here refers to the correct classification of multi-source PRPD patterns. The training of the model is terminated when the validation accuracy is observed to start diverging from the training accuracy. During the training phase, a portion of the dataset is held out for validation and is used to calculate the validation loss at each epoch. The decision is made collectively by analyzing the average validation and training losses of the seven classes. For epoch 4000, the percentage difference between the validation and the training loss is 0.8%, compared to 2.7% for epoch 8000, as shown in Figure 11; consequently, the training is stopped at iteration (epoch) 4000.
The calculated loss of the trained model as a function of epochs or iterations for both the training and the validation dataset in log scale, using (1), is shown in Figure 12.
The classification accuracy and false negative accuracy are shown in Table 6. In comparison with Table 3, better performance is recorded: the average classification accuracy for the single classes increased from 96.2% to 99.6%, and the average false negative accuracy for the multiple classes dropped from 23.5% to 8.7%. The per-class recall and precision, together with their arithmetic means, are given in Table 7. Comparing Table 7 with Table 4, ideal recall is now recorded for class 6, and ideal or near-ideal precision is recorded for all classes. This indicates that our proposed model enhanced the prediction of true positives. The hybrid confusion matrix for the proposed model is shown in Table 8. As seen in this table, compared to Table 5, our proposed model has enhanced classification ability not only for single-source PDs, but also for multi-source PDs; this is shown by comparing the last four rows of Table 8, corresponding to the multiple classes, with those of Table 5, and indicates that our proposed model decreased false negative predictions.

5. Conclusions

In this paper, a customized approach based on a deep learning algorithm, particularly a CNN, has been developed in order to identify single-source and multi-source PDs which can occur in high voltage insulation systems. The difficulty of identifying multi-source PDs using a training set of single-source PDs results from the fact that the PRPD patterns are partially overlapping. As a result, traditional machine learning techniques, which are based on the manual extraction of features, are easily confused when multi-source PRPD patterns are to be classified, and additional algorithms must be deployed to decide on the separation criteria between these overlapping PRPDs. A customized CNN model has been shown to be useful for this problem through the proposed enhanced architecture based on sharing the convolutional weights among the different classes. The essence of the proposed model is that the training is done on single sources of PDs only. This is appreciated in industry, where additional financial resources and time are needed to acquire data from simultaneous sources of PDs. The model is robust to electrical interference as well as to the applied phase voltage. The average classification accuracies of the proposed architecture for single-source PDs and multi-source PDs are 99.6% and 96.7%, respectively, compared to 96.2% and 77.3% for the independent classifiers architecture.

Author Contributions

Conceptualization: S.M., A.A., H.J. and B.K.; methodology, S.M., A.A.; software, S.M.; formal analysis, A.A. and H.J. and B.K.; investigation, S.M., A.A., H.J. and B.K.; original draft preparation, S.M.; writing, S.M.; review and editing, A.A., H.J. and B.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by The Natural Sciences and Engineering Research Council of Canada.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Luo, Y.; Li, Z.; Wang, H. A review of online partial discharge measurement of large generators. Energies 2017, 10, 1694. [Google Scholar] [CrossRef] [Green Version]
  2. Okamoto, T.; Tanaka, T. Novel partial discharge measurement computer-aided measurement systems. IEEE Trans. Electr. Insul. 1986, 6, 1015–1019. [Google Scholar] [CrossRef]
  3. Gulski, E.; Kreuger, F. Determination of discharge sources by analysis of discharge quantities as a function of time. In Proceedings of the IEEE International Symposium on Electrical Insulation, Baltimore, MD, USA, 7–10 June 1992; pp. 397–400. [Google Scholar]
  4. Satish, L.; Gururaj, B. Partial discharge pattern classification using multilayer neural networks. IET IEE Proc. A Sci. Meas. Technol. 1993, 140, 323–330. [Google Scholar] [CrossRef] [Green Version]
  5. Satish, L.; Zaengl, W.S. Artificial neural networks for recognition of 3-d partial discharge patterns. IEEE Trans. Dielectr. Electr. Insul. 1994, 1, 265–275. [Google Scholar] [CrossRef]
  6. Cachin, C.; Wiesmann, H.J. PD recognition with knowledge-based preprocessing and neural networks. IEEE Trans. Dielectr. Electr. Insul. 1995, 2, 578–589. [Google Scholar] [CrossRef]
  7. Krivda, A. Automated recognition of partial discharges. IEEE Trans. Dielectr. Electr. Insul. 1995, 2, 796–821. [Google Scholar] [CrossRef]
  8. Cacciari, M.; Contin, A.; Mazzanti, G.; Montanari, G. Identification and separation of two concurrent partial discharge phenomena. In Proceedings of the Conference on Electrical Insulation and Dielectric Phenomena, Millbrae, CA, USA, 23–23 October 1996; Volume 2, pp. 476–479. [Google Scholar]
  9. Lalitha, E.; Satish, L. Wavelet analysis for classification of multi-source PD patterns. IEEE Trans. Dielectr. Electr. Insul. 2000, 7, 40–47. [Google Scholar] [CrossRef] [Green Version]
  10. Contin, A.; Cavallini, A.; Montanari, G.; Pasini, G.; Puletti, F. Digital detection and fuzzy classification of partial discharge signals. IEEE Trans. Dielectr. Electr. Insul. 2002, 9, 335–348. [Google Scholar] [CrossRef]
  11. Catterson, V.; Sheng, B. Deep neural networks for understanding and diagnosing partial discharge data. In Proceedings of the 2015 IEEE Electrical Insulation Conference, Seattle, WA, USA, 7–10 June 2015; pp. 218–221. [Google Scholar]
  12. Nguyen, M.T.; Nguyen, V.H.; Yun, S.J.; Kim, Y.H. Recurrent neural network for partial discharge diagnosis in gas-insulated switchgear. Energies 2018, 11, 1202. [Google Scholar] [CrossRef] [Green Version]
  13. Tuyet-Doan, V.N.; Tran-Thi, N.D.; Youn, Y.W.; Kim, Y.H. One-Shot Learning for Partial Discharge Diagnosis Using Ultra-High-Frequency Sensor in Gas-Insulated Switchgear. Sensors 2020, 20, 5562. [Google Scholar] [CrossRef]
  14. Puspitasari, N.; Khayam, U.; Kakimoto, Y.; Yoshikawa, H.; Kozako, M.; Hikita, M. Partial Discharge Waveform Identification using Image with Convolutional Neural Network. In Proceedings of the 54th International Universities Power Engineering Conference (UPEC), Bucharest, Romania, 3–6 September 2019; pp. 1–4. [Google Scholar]
  15. Barrios, S.; Buldain, D.; Comech, M.P.; Gilbert, I.; Orue, I. Partial discharge classification using deep learning methods—Survey of recent progress. Energies 2019, 12, 2485. [Google Scholar] [CrossRef] [Green Version]
  16. Lu, Y.; Wei, R.; Chen, J.; Yuan, J. Convolutional neural network based transient earth voltage detection. In Proceedings of the 2016 15th International Symposium on Parallel and Distributed Computing (ISPDC), Fuzhou, China, 8–10 July 2016; pp. 386–389. [Google Scholar]
  17. Che, Q.; Wen, H.; Li, X.; Peng, Z.; Chen, K.P. Partial discharge recognition based on optical fiber distributed acoustic sensing and a convolutional neural network. IEEE Access 2019, 7, 101758–101764. [Google Scholar] [CrossRef]
  18. Dey, D.; Chatterjee, B.; Dalai, S.; Munshi, S.; Chakravorti, S. A deep learning framework using convolution neural network for classification of impulse fault patterns in transformers with increased accuracy. IEEE Trans. Dielectr. Electr. Insul. 2017, 24, 3894–3897. [Google Scholar] [CrossRef]
  19. Wang, Y.; Yan, J.; Yang, Z.; Liu, T.; Zhao, Y.; Li, J. Partial Discharge Pattern Recognition of Gas-Insulated Switchgear via a Light-Scale Convolutional Neural Network. Energies 2019, 12, 4674. [Google Scholar] [CrossRef] [Green Version]
  20. Song, H.; Dai, J.; Sheng, G.; Jiang, X. GIS partial discharge pattern recognition via deep convolutional neural network under complex data source. IEEE Trans. Dielectr. Electr. Insul. 2018, 25, 678–685. [Google Scholar] [CrossRef]
  21. Florkowski, M. Classification of partial discharge images using deep convolutional neural networks. Energies 2020, 13, 5496. [Google Scholar] [CrossRef]
  22. Janani, H.; Kordi, B.; Jozani, M.J. Classification of simultaneous multiple partial discharge sources based on probabilistic interpretation using a two-step logistic regression algorithm. IEEE Trans. Dielectr. Electr. Insul. 2017, 24, 54–65. [Google Scholar] [CrossRef]
  23. Ganguly, B.; Chaudhury, S.; Biswas, S.; Dey, D.; Munshi, S.; Chatterjee, B.; Dalai, S.; Chakravorti, S. Wavelet Kernel based Convolutional Neural Network for Localization of Partial Discharge Sources within a Power Apparatus. IEEE Trans. Ind. Inform. 2020, 17, 1831–1841. [Google Scholar]
  24. Gulski, E.; Kreuger, F. Computer-aided recognition of discharge sources. IEEE Trans. Electr. Insul. 1992, 27, 82–92. [Google Scholar] [CrossRef]
  25. Tang, J.; Jin, M.; Zeng, F.; Zhang, X.; Huang, R. Assessment of PD severity in gas-insulated switchgear with an SSAE. IET Sci. Meas. Technol. 2017, 11, 423–430. [Google Scholar] [CrossRef]
  26. Janani, H.; Jacob, N.D.; Kordi, B. Automated recognition of partial discharge in oil-immersed insulation. In Proceedings of the IEEE Electrical Insulation Conference (EIC), Seattle, WA, USA, 7–10 June 2015; pp. 467–470. [Google Scholar]
  27. Janani, H.; Kordi, B. Towards automated statistical partial discharge source classification using pattern recognition techniques. IET High Volt. 2018, 3, 162–169. [Google Scholar] [CrossRef]
  28. IEC 60270 Standard. High-Voltage Test Techniques: Partial Discharge Measurements. Available online: https://webstore.iec.ch/publication/1247 (accessed on 21 January 2021).
  29. Janani, H. Partial Discharge Source Classification Using Pattern Recognition Algorithms. Ph.D. Thesis, University of Manitoba, Winnipeg, MB, Canada, 2016. [Google Scholar]
  30. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Volume 1. [Google Scholar]
  31. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar] [CrossRef] [Green Version]
  32. Duan, L.; Hu, J.; Zhao, G.; Chen, K.; Wang, S.X.; He, J. Method of inter-turn fault detection for next-generation smart transformers based on deep learning algorithm. IET High Volt. 2019, 4, 282–291. [Google Scholar] [CrossRef]
  33. Polisetty, S.; El-Hag, A.; Jayram, S. Classification of common discharges in outdoor insulation using acoustic signals and artificial neural network. IET High Volt. 2019, 4, 333–338. [Google Scholar] [CrossRef]
  34. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  35. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Medical Image Computing and Computer-Assisted Intervention (MICCAI); Springer: Berlin/Heidelberg, Germany, 2015; Volume 9351, pp. 234–241. [Google Scholar]
  36. Durand, T.; Mehrasa, N.; Mori, G. Learning a deep convnet for multi-label classification with partial labels. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 647–657. [Google Scholar]
Figure 1. Experimental setup for PD measurement [27].
Figure 2. SF6 test cells: (a) floating electrode; (b) free particle; (c) point-plane electrode [22] (used with permission).
Figure 3. Oil test cells: (a) free particle; (b) point-plane electrode [27].
Figure 4. PRPD patterns and their binary representation of various single defects: (a) class 1; (b) class 2; (c) class 3; (d) class 4; (e) class 5; (f) class 6.
Figure 5. PRPD patterns and their binary representation of various multiple defects: (a) class 14; (b) class 16; (c) class 46; (d) class 146.
Figure 6. Simple CNN architecture.
Figure 7. (a) baseline model with independent model for each class; (b) proposed model with a common convolutional backbone shared across all classes.
Figure 8. CNN architecture of an independent classifier.
Figure 9. Proposed deep learning model architecture: common convolutional backbone shared across all classes.
Figure 10. Training and validation log losses vs. number of iterations for independent classifiers (a) class 1; (b) class 2; (c) class 3; (d) class 4; (e) class 5; (f) class 6; (g) class 7.
Figure 11. Decision on stopping criteria: the percentage difference between the validation loss and the training loss is minimal in the marked region.
Figure 12. Training and validation log losses vs. number of iterations for the proposed model (a) class 1; (b) class 2; (c) class 3; (d) class 4; (e) class 5; (f) class 6; (g) class 7.
Table 1. Different levels of charge magnitude scale setting on the Omicron software.
Class   | Charge magnitude scale setting
Class 1 | 100, 200, 250, 500, and 1000 pC
Class 2 | 70, 100, and 200 nC
Class 3 | 200, 300, and 350 pC
Class 4 | 70, 150, and 250 pC
Class 5 | 10, 50, and 100 nC
Class 6 | 20, 50, 100, and 200 pC
Table 2. Design specification of an independent classifier.
Layer type           | Output shape
Input Layer          | (None, 100, 100, 1)
Batch Normalization  | (None, 100, 100, 1)
Convolution1 2D      | (None, 100, 100, 36)
Max-pooling1 2D      | (None, 50, 50, 36)
Batch Normalization1 | (None, 50, 50, 36)
Activation1          | (None, 50, 50, 36)
Convolution2 2D      | (None, 50, 50, 36)
Max-pooling2 2D      | (None, 25, 25, 36)
Batch Normalization2 | (None, 25, 25, 36)
Activation2          | (None, 25, 25, 36)
Flatten              | (None, 22500)
Dense1               | (None, 128)
Batch Normalization3 | (None, 128)
Activation3          | (None, 128)
Dense2               | (None, 64)
Batch Normalization4 | (None, 64)
Activation4          | (None, 64)
Dense3               | (None, 1)
Activation5          | (None, 1)
Table 3. Accuracy of single and multiple source PRPD patterns.
Class     | Classification accuracy | False negative accuracy
Class 1   | 87.71%  | 0%
Class 2   | 97.7%   | 0%
Class 3   | 100%    | 0%
Class 4   | 99.82%  | 0%
Class 5   | 100%    | 0%
Class 6   | 88.44%  | 0%
Class 7   | 100%    | 0%
Class 16  | 79.42%  | 44%
Class 46  | 78.28%  | 0%
Class 14  | 85.71%  | 0%
Class 146 | 65.71%  | 50%
Table 4. PCR and PCP for independent classifiers.
Class     | Recall(i) | Precision(i)
Class 1   | 1    | 0.83
Class 2   | 1    | 1
Class 3   | 1    | 0.38
Class 4   | 0.89 | 0.6
Class 5   | 1    | 0.89
Class 6   | 0.65 | 0.99
Class 7   | 1    | 0.95
Arithmetic mean | PCR: 0.93 | PCP: 0.8
Table 5. Hybrid confusion matrix of independent classifiers.
Input class \ Predicted class | 1 | 2 | 3 | 4 | 5 | 6 | 7
1   | 100 | 0   | 49  | 37 | 0   | 0   | 0
2   | 0   | 140 | 10  | 0  | 1   | 0   | 10
3   | 0   | 0   | 130 | 0  | 0   | 0   | 0
4   | 0   | 0   | 10  | 97 | 0   | 1   | 0
5   | 0   | 0   | 0   | 0  | 130 | 0   | 0
6   | 0   | 0   | 5   | 90 | 10  | 120 | 0
7   | 0   | 0   | 0   | 0  | 0   | 0   | 200
14  | 50  | 0   | 50  | 50 | 0   | 0   | 0
16  | 50  | 0   | 14  | 14 | 0   | 6   | 0
46  | 0   | 0   | 26  | 50 | 0   | 50  | 0
146 | 50  | 0   | 45  | 25 | 0   | 0   | 0
Table 6. Accuracy of single and multiple source PRPD patterns.
Class     | Classification accuracy | False negative accuracy
Class 1   | 100%    | 0%
Class 2   | 100%    | 0%
Class 3   | 100%    | 0%
Class 4   | 97.17%  | 0%
Class 5   | 100%    | 0%
Class 6   | 100%    | 0%
Class 7   | 100%    | 0%
Class 16  | 100%    | 0%
Class 46  | 100%    | 0%
Class 14  | 96.28%  | 13%
Class 146 | 90.57%  | 22%
Table 7. PCR and PCP for the proposed model.
Class     | Recall(i) | Precision(i)
Class 1   | 1    | 1
Class 2   | 1    | 1
Class 3   | 1    | 1
Class 4   | 0.86 | 0.89
Class 5   | 1    | 1
Class 6   | 1    | 1
Class 7   | 1    | 1
Arithmetic mean | PCR: 0.98 | PCP: 0.99
Table 8. Hybrid confusion matrix of the proposed model.
Input class \ Predicted class | 1 | 2 | 3 | 4 | 5 | 6 | 7
1   | 100 | 0   | 0   | 0  | 0   | 0   | 0
2   | 0   | 140 | 0   | 0  | 0   | 0   | 0
3   | 0   | 0   | 130 | 0  | 0   | 0   | 0
4   | 0   | 0   | 0   | 97 | 0   | 12  | 0
5   | 0   | 0   | 0   | 0  | 130 | 0   | 0
6   | 0   | 0   | 0   | 0  | 0   | 120 | 0
7   | 0   | 0   | 0   | 0  | 0   | 0   | 200
14  | 50  | 0   | 0   | 37 | 0   | 0   | 0
16  | 50  | 0   | 0   | 0  | 0   | 50  | 0
46  | 0   | 0   | 0   | 50 | 0   | 50  | 0
146 | 50  | 0   | 0   | 17 | 0   | 50  | 0
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
