Article

Influence Analysis and Prediction of ESDD and NSDD Based on Random Forests

1
Department of Electrical Engineering, Shandong University, Jinan 250061, China
2
Shandong Provincial Key Laboratory of Ultra High Voltage Transmission Technology and Equipments, #17923 Jingshi Road, Jinan 250061, China
*
Author to whom correspondence should be addressed.
Energies 2017, 10(7), 878; https://doi.org/10.3390/en10070878
Submission received: 4 April 2017 / Revised: 15 May 2017 / Accepted: 26 June 2017 / Published: 30 June 2017
(This article belongs to the Section F: Electrical Engineering)

Abstract

Equivalent salt deposit density (ESDD) and non-soluble deposit density (NSDD) measurements are a basic requirement of power systems. In order to predict the site pollution severity (SPS) of insulators, a new method based on random forests (RFs) is proposed. Using mutual information (MI) theory and RFs, the weights of factors related to the SPS of insulators are analyzed. The samples of contaminated insulators are taken from high voltage alternating current (HVAC) and high voltage direct current (HVDC) transmission lines. The regression models of RFs and support vector machines (SVM) are constructed and compared, which helps to address the lack of NSDD prediction in previous works. The results are as follows: according to the mean decrease accuracy (MDA), mean decrease Gini (MDG), and MI, the types of the insulators (including surface area, surface orientation, and total length) as well as the hydrophobicity are the main factors affecting both ESDD and NSDD. Compared with NSDD, the electrical parameters have a significant effect on ESDD. For the influence factors of ESDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 52.94%, 6.35%, and 21.88%, respectively. For the influence factors of NSDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 55.37%, 11.04%, and 14.26%, respectively. The influences that voltage level (vl), voltage type (vt), and polarity/phases (pp) exert on ESDD are 1.5 times, 3 times, and 4.5 times those on NSDD, respectively. The influences that distance from the coastline (d), wind velocity (wv), and rainfall (rf) exert on NSDD are 1.5 times, 2 times, and 2.5 times those on ESDD, respectively. Compared with the natural contamination test and the SVM regression model, the RFs regression model can effectively predict the contamination degree of insulators, and the relative errors of the predicted ESDD and NSDD are 8.31% and 9.62%, respectively.

1. Introduction

Research on insulator natural contamination is a basic requirement of external insulation. The contamination degree of insulators is a result of the actual operating environment, which can reflect the pollution resistance characteristics of insulators under natural conditions [1]. Due to the complex working environment of insulators, the natural contamination characteristics of the insulator are difficult to depict with mathematical expressions.
At present, machine learning algorithms are widely used in the study of the natural contamination characteristics of insulators. Jiao et al. combined particle swarm optimization (PSO) and support vector machine (SVM) to build an insulator contamination on-line monitoring system to predict the contamination degree, and the relative error was less than 10% [2]. Ahmad et al. modeled the relationship between equivalent salt deposit density (ESDD) and temperature, humidity, pressure, rainfall, and wind velocity using artificial neural networks (ANN), and the mean absolute error of the model output was found to be 3.6% [3]. Meanwhile, Karamousantas et al. provided a foundation for the routine maintenance of insulators using the same algorithm, and the relative error was less than 17% [4]. Muniraj et al. used the characteristics of leakage current as an input and predicted ESDD with an adaptive neurofuzzy inference system (ANFIS) model, whose coefficient of determination was 0.998 and whose root mean square error was just 0.00323 [5]. In recent years, the random forest algorithm has been widely used in the field of power systems. Hannan et al. reported that the performance of the RFs-based space vector pulse width modulation (SVPWM) technique is superior to both ANN-SVM and ANFIS-SVM techniques in terms of damping capability, settling time, steady-state error, and transient response under different operating speeds and load conditions [6]. Samantaray et al. proposed the identification of the fault zone in flexible alternating current transmission systems (FACTS)-based transmission lines using RFs (random forests) with an accuracy and reliability of more than 99% [7]. Shah et al. proposed an RFs-based fault discrimination technique for power transformers, and the fault discrimination accuracy was more than 98% [8]. Meanwhile, Kannan et al. proposed an RFs classifier whose identification rate lies above 98% under all pollution conditions [9].
Larivière and Van den Poel found that both random forests and regression forests techniques provide a better fit for the estimation and validation sample compared to ordinary linear regression and logistic regression models, and the prediction accuracy was 95% [10]. Therefore, the study of random forest algorithms is essential in the field of external insulation. However, the following shortcomings exist in the studies of SPS (site pollution severity) prediction. Firstly, the influencing factors differ between studies, and there is no weight analysis of the factors related to natural contamination. Secondly, only ESDD is predicted in all of the above algorithms; NSDD is ignored, although according to IEC 60815, NSDD is also an indispensable factor [11]. Lastly, RFs-based natural pollution prediction has not been compared with other algorithms, and there is no validation against an actual natural contamination test.
In this paper, 16 factors that are related to ESDD and NSDD are proposed, and their weights are analyzed using MI (mutual information) and RFs. Based on the most related factors, a new regression method that uses RFs for function estimation is established. Experiments show that the method is greatly superior to the SVM regression model in terms of accuracy. The prediction results of this method have been verified with actual data from the insulators of a transmission line on the Chinese East Coast.

2. Onsite Measurements and Research Methodology

2.1. Insulator Parameters

In order to study the influence of voltage type and voltage level on the natural accumulation of composite insulators, the surrounding insulators of HVAC transmission lines are sampled. The voltage level of the transmission line and the parameters of the insulators are shown in Table 1. As the contamination degree of the insulators is not uniform [12], the parameters of each insulator sample consist of four values: ESDD of the upper surface, NSDD of the upper surface, ESDD of the lower surface, and NSDD of the lower surface. Since the insulator string of the transmission line is composed of a plurality of insulators, the potential of each insulator is different. Therefore, the insulators in each string are numbered from 1 to n according to their potential. The sampling diagrams of suspension insulators and the long rod composite insulator are shown in Figure 1.
The natural contamination data presented in this paper were sampled from transmission lines in East China. In order to reduce the measurement error, the insulators were removed and all of the contamination was wiped off on the ground with non-woven cloth (moistened with a small amount of ethanol solution). The method introduced in [11] was used to measure the ESDD and NSDD of each insulator.

2.2. The Influences of the Natural Contamination

Meteorological factors are evaluated using one-year average values, such as the number of days with a wind velocity of more than 5.5 m/s, the number of days of medium to heavy rain, the annual average temperature, and so on.
The SPS is determined by ESDD and NSDD under various factors. At present, the difficulty in the modelling of natural contamination is the selection criteria for the related factors. According to references [12,13,14,15,16] and the actual working condition, the following factors are chosen to evaluate the contamination of insulators: (1) Contamination factors: deposition time (dt), hydrophobicity (HC), particle size (ps); (2) Meteorological factors: altitude (a), distance from coastline (d), rainfall (rf), temperature (t), and wind velocity (wv); (3) Insulator type: material (m), position factor (pf), surface area (sa), surface orientation (so), total length (n); (4) Electrical factors: voltage type (vt), voltage level (vl), polarity/phases (pp). Considering the factors shown in Table 2 as inputs, RFs was used to calculate the weights.

2.3. Concept of RFs

RFs is a statistical learning algorithm, which adopts the bootstrap resampling method to extract multiple samples from the original sample. Firstly, the decision tree for each bootstrap sample is constructed. Then the predictions of multiple decision trees are combined by voting for the final prediction, and the results are obtained [17]. This method has high prediction accuracy, good tolerance for abnormal values and noise, and it is not easy to over-fit. On the basis of obtaining a variety of influencing factors, the ESDD and NSDD prediction model is carried out by using the random forests regression (RFs-R) model. The modelling process is shown in Figure 2. Firstly, a series of training samples are randomly selected from the original training sample set Sk (k = 1, ..., n) by the Bootstrap method for the insulator group Gk (k = 1, ..., n), then we use the test set to test the decision tree, synthesize the test results of multiple decision trees, and obtain the final ESDD and NSDD forecast model by voting.
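The bootstrap-and-combine procedure described above can be sketched in a few lines. This is a minimal illustration only, not the authors' MATLAB model: the one-split regression "stump", the single input variable, and the function names are assumptions made for brevity, and the tree outputs are averaged here (the usual way of combining regression trees) where the text speaks of voting.

```python
import random
from statistics import mean

def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement (one bootstrap sample)."""
    return [rng.choice(data) for _ in data]

def fit_stump(sample):
    """Fit a one-split regression tree on (x, y) pairs by minimising squared error."""
    xs = sorted({x for x, _ in sample})
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2  # candidate threshold between two distinct x values
        left = [y for x, y in sample if x <= t]
        right = [y for x, y in sample if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, ml, mr)
    if best is None:  # all x values identical: fall back to the sample mean
        m = mean(y for _, y in sample)
        return lambda x: m
    _, t, ml, mr = best
    return lambda x: ml if x <= t else mr

def fit_forest(data, ntree, seed=0):
    """Train ntree stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(data, rng)) for _ in range(ntree)]

def predict(forest, x):
    """Combine the trees by averaging their outputs."""
    return mean(tree(x) for tree in forest)
```

A call such as `predict(fit_forest(data, ntree=45), x)` returns the ensemble estimate for a new input; a full random forest would additionally draw a random subset of mtry variables at each split.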
There are two methods for calculating the weight of RFs variables: one is based on mean decrease accuracy (MDA); the other is based on Gini impurity, called the mean decrease Gini (MDG). The weight of the corresponding factor will increase with the decrement of the above two parameters [17].

2.3.1. Mean Decrease in Gini

Suppose that S is a set of s data samples whose class label attribute has m different values and defines m different classes (Ci, i = 1, ..., m). According to the class label attribute values, S can be divided into m subsets (Si, i = 1, ..., m), where Si is the set of samples belonging to class Ci and si is the number of samples in set Si. The Gini index of set S is:
$$\mathrm{Gini}(S) = 1 - \sum_{i=1}^{m} p_i^2 \tag{1}$$
where pi, estimated by si/s, is the probability that any sample belongs to Ci. When Gini(S) is 0, all the records in the set belong to the same category. When the samples in set S are evenly distributed among the classes, Gini(S) reaches its maximum, indicating that the minimum useful information can be obtained. If the data set S is divided into n subsets (Sj, j = 1, …, n) according to a certain attribute partition, the Gini index after splitting, Ginisplit, is:
$$\mathrm{Gini}_{\mathrm{split}}(S) = \sum_{j=1}^{n} \frac{s_j}{s}\,\mathrm{Gini}(S_j) \tag{2}$$
where n is the number of child nodes, sj is the number of records at child node j, and s is the number of records at node P. When the classification method has traversed all of the attributes, the attribute whose Ginisplit is the smallest will be selected as the split attribute of the node.
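The two Gini expressions above can be checked numerically with a short sketch. The function names and the toy class labels are illustrative assumptions, not part of the original study.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set: 1 minus the sum of squared class proportions."""
    s = len(labels)
    return 1.0 - sum((count / s) ** 2 for count in Counter(labels).values())

def gini_split(subsets):
    """Size-weighted Gini index of the child subsets produced by a split."""
    s = sum(len(subset) for subset in subsets)
    return sum(len(subset) / s * gini(subset) for subset in subsets)
```

For two equally frequent classes the index is 0.5, its maximum for m = 2; a split that separates the classes perfectly drives Ginisplit to 0, so that attribute would be chosen for the node.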

2.3.2. Mean Decrease in Accuracy

The order of the values of each feature is disrupted (randomly permuted), and the impact of this disruption on the accuracy of the model is measured. For unimportant variables, the disruption has little effect on the accuracy of the model, whereas for important variables it reduces the accuracy of the model.
First, the RFs model is trained, and the out of bag (OOB) error of each tree in the model is tested using the sample data outside the bag. Then, the values of the variable v in the sample data outside the bag are randomly disrupted, and the OOB error of each tree is retested. Finally, the weight measure of the variable v is obtained as the mean value, over the n trees, of the difference between the OOB error of the i-th tree before the disruption, errOOBi, and after the disruption, errOOBi′:
$$\mathrm{MDA}(v) = \frac{1}{n}\sum_{i=1}^{n}\left(\mathrm{errOOB}_i - \mathrm{errOOB}_i'\right) \tag{3}$$
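The permutation procedure can be sketched as follows. This is an illustrative outline under stated assumptions: each tree is represented by a hypothetical callable that maps an input row to a prediction, the OOB error is taken to be the mean squared error, and the sign follows the expression above (OOB error before minus OOB error after the disruption), so an important variable yields a negative value. Some implementations report the opposite sign.

```python
import random

def mse(tree, rows, targets):
    """Mean squared error of one tree on the given rows."""
    return sum((tree(row) - t) ** 2 for row, t in zip(rows, targets)) / len(rows)

def mda(trees, oob_indices, X, y, v, seed=0):
    """Mean over trees of (OOB error - OOB error after permuting variable v)."""
    rng = random.Random(seed)
    diffs = []
    for tree, idx in zip(trees, oob_indices):
        rows = [list(X[i]) for i in idx]      # copies of this tree's OOB rows
        targets = [y[i] for i in idx]
        err = mse(tree, rows, targets)
        column = [row[v] for row in rows]
        rng.shuffle(column)                   # disrupt the order of variable v
        for row, value in zip(rows, column):
            row[v] = value
        err_permuted = mse(tree, rows, targets)
        diffs.append(err - err_permuted)
    return sum(diffs) / len(diffs)
```

A variable that no tree uses leaves the OOB error unchanged, so its MDA is exactly zero.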

2.3.3. RFs-R Model

The RFs-R model was implemented in MATLAB [18]. Following the research of Larivière and Van den Poel on profit evolution estimation in the field of economics, RFs can be trained to predict the SPS of insulators. RFs have two major parameters: “ntree”, the number of trees, and “mtry”, the number of variables considered when determining the best split at each node. Leo Breiman [17] explained that the optimal value of mtry usually lies between $\log_2 v$ and $v$, where v is the number of features. Larivière and Van den Poel chose a total of 30 independent variables and set the parameters of RFs-R as ntree = 5000 and mtry = 6. In this paper, there are 16 independent variables, and mtry is set as 4.
The following indicators are used to evaluate the RFs regression model [19]. $R^2$ measures the goodness of fit of the model, and the mean percent standard error (MPSE) is an indicator reflecting the accuracy of the fit. Here, n represents the total number of insulators involved in the prediction, $\hat{y}_i$ is the predicted output of the trees in the forest corresponding to the given input sample $x_i$, $y_i$ is the observed output, and $\bar{y}$ is the mean of the observed outputs. The parameter ntree can be set according to the minimum MPSE [17].
$$R^2 = 1 - \sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \Big/ \sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2 \tag{4}$$
$$\mathrm{MPSE} = \frac{1}{n}\sum_{i=1}^{n}\left|\left(y_i - \hat{y}_i\right)^2 / \hat{y}_i\right| \times 100\% \tag{5}$$
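Both indicators are straightforward to compute. The sketch below transcribes the two expressions as printed above, including the squared difference inside the MPSE; the function names and sample values are illustrative assumptions.

```python
def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares about the mean)."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    ss_tot = sum((y - mean_observed) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot

def mpse(observed, predicted):
    """Mean of |(y - y_hat)^2 / y_hat|, expressed as a percentage."""
    n = len(observed)
    total = sum(abs((y - y_hat) ** 2 / y_hat) for y, y_hat in zip(observed, predicted))
    return total / n * 100.0
```

A perfect prediction gives an $R^2$ of 1 and an MPSE of 0%; evaluating `mpse` over a range of ntree values and keeping the minimum mirrors the selection rule stated in the text.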

2.4. Concept of Mutual Information

There are different correlations between the SPS of the insulators and the influencing factors, and the impacting degree of the same influencing factors on the SPS of different insulator strings is also different. Using the theory of mutual information (MI), the high correlation factors with ESDD and NSDD can be selected [20].
The information about the system X gained once the system Y is known can be represented by the difference between the unconditional entropy and the conditional entropy, which is defined as the mutual information of X and Y. X is the decision attribute and Y is the condition attribute.
$$I(X;Y) = H(X) - H(X|Y) \tag{6}$$
For discrete data,
$$H(X) = -\sum_{i=1}^{n} p(x_i)\log\left(p(x_i)\right) \tag{7}$$
$$H(X|Y) = -\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j)\log\left[\frac{p(x_i, y_j)}{p(y_j)}\right] \tag{8}$$
$$I(X;Y) = -\sum_{i=1}^{n} p(x_i)\log\left[p(x_i)\right] - \left\{-\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j)\log\left[\frac{p(x_i, y_j)}{p(y_j)}\right]\right\} \tag{9}$$
$$\sum_{i=1}^{n}\sum_{j=1}^{m} p(x_i, y_j) = \sum_{j=1}^{m} p(y_j) \tag{10}$$
$$I(X;Y) = -\sum_{i=1}^{n} p(x_i)\log\left[p(x_i)\right] - \left\{-\sum_{j=1}^{m} p(y_j)\sum_{i=1}^{n} p(x_i|y_j)\log\left[p(x_i|y_j)\right]\right\} \tag{11}$$
where $p(x_i)$ is the probability of X = xi, $p(y_j)$ is the probability of Y = yj, and $p(x_i|y_j)$ is the conditional probability of X = xi when Y = yj. In order to calculate the probabilities, the sample data should be discretized. Discretization converts each specific value into an interval that can be assigned a probability; although it loses detail, the result is more statistically significant. Specifically, for each condition attribute and decision attribute, the maximum and minimum values in the attribute domain are found, and the distance between them is divided into w intervals. Each value is placed in the corresponding interval, and the number of values in each interval is obtained.
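The discretization and the entropy difference can be sketched as follows. This is a minimal illustration under stated assumptions: the logarithm is taken to base 2 (the text does not fix the base), probabilities are estimated by relative frequencies, and the function names and default w are hypothetical.

```python
from collections import Counter
from math import log2

def discretize(values, w):
    """Map each value to one of w equal-width intervals between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:                      # constant attribute: a single interval
        return [0] * len(values)
    width = (hi - lo) / w
    return [min(int((v - lo) / width), w - 1) for v in values]

def entropy(labels):
    """H(X) estimated from the relative frequencies of the labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def mutual_information(x, y, w=4):
    """I(X;Y) = H(X) - H(X|Y) after equal-width discretization of x and y."""
    xd, yd = discretize(x, w), discretize(y, w)
    n = len(xd)
    joint = Counter(zip(xd, yd))
    p_y = Counter(yd)
    # p(x, y) / p(y) reduces to joint count / marginal count of y
    h_x_given_y = -sum(c / n * log2(c / p_y[b]) for (_, b), c in joint.items())
    return entropy(xd) - h_x_given_y
```

When the factor determines the target completely, the mutual information equals H(X); when the two are independent it is zero, which is what distinguishes strongly from weakly correlated factors.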

3. Results

3.1. The Weights of the Related Factors—RFs

Figure 3a,b shows the MPSE between the predicted and actual values based on the RFs when ntree takes different values. Overall, the MPSE of the prediction model decreases as ntree increases. However, when ntree is large, the overly fine classification leads to a rapid increase in the amount of computation. Considering the modelling speed and the prediction error, ntreeESDD = 45 and ntreeNSDD = 66 are the best decision tree numbers.
The RFs overcome the shortcomings of traditional variable selection methods (choosing one or two variables in a group of variables that are equally highly correlated). As can be seen from Figure 4, there are gaps in the importance of different factors. The importance of the factors given by the RFs shows that the dominant factor is the insulator type, and the smallest influencing factor is the contamination deposition time. From the perspective of the insulators' maintenance data during spring, although the meteorological factors affect the natural pollution of the insulators, the effect is not obvious.
As can be seen from Figure 4a, the related factors that affect ESDD are: the insulator surface area (sa), the position factor of the insulator (pf), the insulator voltage level (vl), the orientation of the surface (so), the hydrophobicity after contamination (HC), and the polarity/phase (pp) of the line. Compared with ESDD, the factors that affect NSDD are mainly the insulator surface area (sa), the orientation of the surface (so), and the hydrophobicity after contamination (HC), as can be seen from Figure 4b. The electrical factors of the insulators have less of an impact on NSDD.

3.2. The Importance of the Related Factors—Mutual Information

For the insulator string Gk (k = 1, ..., n), it is assumed that the ESDD and NSDD data sequence of the p insulator samples constitute the data set:
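$$X_D = \begin{bmatrix} x_{\mathrm{esdd},1}, & x_{\mathrm{esdd},2}, & \cdots, & x_{\mathrm{esdd},p} \\ x_{\mathrm{nsdd},1}, & x_{\mathrm{nsdd},2}, & \cdots, & x_{\mathrm{nsdd},p} \end{bmatrix} \tag{12}$$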
The data sequence of the l latent correlation factors constitutes the data set YD = {Y1, Y2, ..., Yl}, and the mutual information between the XD and each correlation factor YD can be expressed as:
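$$I_{\mathrm{ESDD}} = \begin{bmatrix} I(x_{\mathrm{esdd},1}, y_1) & \cdots & I(x_{\mathrm{esdd},1}, y_j) & \cdots & I(x_{\mathrm{esdd},1}, y_l) \\ \vdots & & \vdots & & \vdots \\ I(x_{\mathrm{esdd},i}, y_1) & \cdots & I(x_{\mathrm{esdd},i}, y_j) & \cdots & I(x_{\mathrm{esdd},i}, y_l) \\ \vdots & & \vdots & & \vdots \\ I(x_{\mathrm{esdd},p}, y_1) & \cdots & I(x_{\mathrm{esdd},p}, y_j) & \cdots & I(x_{\mathrm{esdd},p}, y_l) \end{bmatrix} \tag{13}$$
$$I_{\mathrm{NSDD}} = \begin{bmatrix} I(x_{\mathrm{nsdd},1}, y_1) & \cdots & I(x_{\mathrm{nsdd},1}, y_j) & \cdots & I(x_{\mathrm{nsdd},1}, y_l) \\ \vdots & & \vdots & & \vdots \\ I(x_{\mathrm{nsdd},i}, y_1) & \cdots & I(x_{\mathrm{nsdd},i}, y_j) & \cdots & I(x_{\mathrm{nsdd},i}, y_l) \\ \vdots & & \vdots & & \vdots \\ I(x_{\mathrm{nsdd},p}, y_1) & \cdots & I(x_{\mathrm{nsdd},p}, y_j) & \cdots & I(x_{\mathrm{nsdd},p}, y_l) \end{bmatrix} \tag{14}$$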
where $x_{\mathrm{esdd},i}, x_{\mathrm{nsdd},i} \in X_D$ and $Y_j \in Y_D$. With p = 50 randomly selected samples and l = 16 factors, heat maps were drawn based on IESDD and INSDD, which are shown in Figure 5.
In Figure 5, the horizontal axis represents the 16 related factors listed in Table 2, and the vertical axis represents the ESDD and NSDD, respectively. Each colour block represents the value of the mutual information for the corresponding factor. The greater the mutual information, the stronger the correlation is.
If only the mutual information between the SPS of a single insulator and the correlation factors is analysed, it can be found that there are individual differences in the colour distribution of each row. However, when the results from many insulators are integrated, the overall colour distribution captures the common characteristics of the correlation between SPS and each factor, and the strong correlation factors that affect the SPS of the insulators can be determined.
It can be seen from Figure 5a that the hydrophobicity (Y3), surface area (Y11), and surface orientation (Y12) have a strong correlation with ESDD. Material (Y9), total length (Y13), polarity/phase (Y14), and voltage level (Y15) have a significant impact on ESDD. It can be seen from Figure 5b that the influencing factors of hydrophobicity (Y3), surface area (Y11), surface orientation (Y12), and total length (Y13) have a strong correlation with NSDD.

3.3. Natural Contamination Tests

The regression models of RFs and SVM have been compared in this paper. Two-thirds (2/3) of the data are randomly selected as the RFs-R model training sample set, and we establish the support vector machines regression (SVM-R) model based on the same training sample set. Two regression models are tested with the 1/3 of the remaining data as the test sample set.
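The random two-thirds/one-third partition can be sketched as below. The function name and the fixed seed are illustrative assumptions; the point is only that both regression models are trained and tested on the same partition.

```python
import random

def split_two_thirds(samples, seed=0):
    """Shuffle the samples and return (training set, test set) in a 2:1 ratio."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = (2 * len(shuffled)) // 3
    return shuffled[:cut], shuffled[cut:]
```

Reusing one call's output for both RFs-R and SVM-R keeps the comparison between the two models like-for-like.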
Because the test sample set is large, we randomly selected the actual ESDD and NSDD values of two strings of insulators and compared them with the values forecasted by RFs-R and SVM-R. It can be seen from Figure 6 that the forecasted ESDD and NSDD trends are consistent with the actual ESDD and NSDD. Therefore, it is possible to use RFs-R and SVM-R for the prediction.
It can be seen from Figure 6a that, using the RFs-R model, the results of the ESDD prediction are $R^2$ = 0.951 and MPSE = 7.98%. Using the SVM-R model, the results of the ESDD prediction are $R^2$ = 0.86 and MPSE = 17.65%. The maximum error of the RFs-R model is 31.7% of the maximum error of the SVM-R model. It can be seen from Figure 6b that, using the RFs-R model, the results of the NSDD prediction are $R^2$ = 0.911 and MPSE = 9.04%. Using the SVM-R model, the results of the NSDD prediction are $R^2$ = 0.88 and MPSE = 20.82%. The maximum error of the RFs-R model is 48.3% of the maximum error of the SVM-R model. The RFs-R model is feasible in ESDD and NSDD prediction through previous data learning and verification.

4. Discussion

In this section, the effects of electric field factors, meteorological factors, and contamination on the SPS of insulators are analysed. At the same time, the RFs model is used to quantitatively analyse the weights of the relevant factors.

4.1. Potential Distribution of the Insulators

Based on the finite element method, the electric field and the potential distribution of the insulators were simulated. Figure 7 shows the results of two-dimensional simulation of the suspension porcelain insulators, where Figure 7a,b is the voltage distribution and the electric field distribution of the insulator, respectively. Since there are 60 insulators on the ±660 kV DC transmission line, the voltage drop of each insulator is about 11 kV. In the simulation, the excitation voltage of the steel pin is set as 11 kV, and the excitation voltage of the steel cap is set as 0 kV.
According to Figure 7, the potential and electric field distributions on the upper and lower surfaces of the insulator are significantly different. This corresponds to the actual contamination of the suspension insulator shown in Figure 8 and reflects the importance of so (surface orientation).
Figure 9 shows the two-dimensional electric field and potential distribution simulation of the suspension insulator string and composite insulators. It can be seen from Figure 9b,d that the electric field strength at both ends of the insulator is high, and the field strength decreases first and then increases along the direction of the insulator string. From Figure 9a,c we can see that the potential difference between the two ends of the insulator is very large, and the potential difference in the middle part of the string is very small. Therefore, the total length (n) and the position factor (pf) of the insulator string will have a large effect on the electric field strength and potential of the insulator, which indirectly affects ESDD.

4.2. Contamination Analysis

From the cumulative frequency point of view, as shown in Figure 10a, the D90 of the composite insulators is 23.26 at the anode and 21.90 at the cathode; the D90 of the porcelain insulators is 41.54 at the anode and 33.33 at the cathode; the D90 of the insulators coated with room temperature vulcanized silicone rubber (RTV) is 32.62 at the anode and 30.99 at the cathode. The particle size below which 90% of the contamination falls follows the order: composite insulators < insulators coated with RTV < porcelain insulators.
The composite insulator has a stronger ability to adsorb small particles than the porcelain insulators, and the insulator material has a great influence on the particle size of the contamination. The insulator string under the cathode adsorbs small particles more easily than the insulator string under the anode. The D90 of the composite insulators at the cathode is lower than that at the anode by 5.85%. The D90 of the porcelain insulators at the cathode is lower than that at the anode by 19.76%. The D90 of the insulators coated with RTV at the cathode is lower than that at the anode by 5.00%, as shown in Figure 10a.
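The three percentages follow directly from the D90 values quoted for Figure 10a, taking the larger value of each pair as the anode value. A short check, with the dictionary layout and function name as illustrative assumptions:

```python
# D90 values quoted in the text: (anode, cathode) for each insulator type
D90 = {
    "composite": (23.26, 21.90),
    "porcelain": (41.54, 33.33),
    "RTV-coated": (32.62, 30.99),
}

def relative_drop(anode, cathode):
    """Percentage by which the cathode D90 is lower than the anode D90."""
    return (anode - cathode) / anode * 100.0

drops = {name: round(relative_drop(a, c), 2) for name, (a, c) in D90.items()}
```

The computed drops reproduce the 5.85%, 19.76%, and 5.00% stated above.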
From the frequency point of view, as shown in Figure 10b, the maximum particle size of the composite insulators is 7.83 μm at the cathode and 8.77 μm at the anode; the maximum particle size of the RTV-coated insulators is 14.2 μm at the cathode and 12.3 μm at the anode; the maximum particle size of the porcelain insulators is 18.1 μm at the cathode and 22.3 μm at the anode. The maximum particle size of the contamination follows the same order as the cumulative frequency: composite insulators < insulators coated with RTV < porcelain insulators. The effect of contamination on the organic materials is greater than that on the inorganic materials, and the particle size of the contamination is smaller.
Table 3 shows the particle size distribution of the contamination, where D10 refers to the particle size corresponding to the cumulative probability distribution number of 10%, and P < 3 refers to the cumulative probability of a particle size less than 3 μm.
It can be seen from Table 3 and Figure 10 that the particle size distribution of the contamination of the composite insulator is concentrated in the range of small particles below 10 μm, while the other two kinds of insulator particles have a frequency distribution of more than 10 μm. The particle size distribution of the contamination is significantly affected by the insulator type and the suspension insulators have a wider range of particle size distribution than the rod insulators. From the above analysis, the insulator polarity, type, and material affect the contamination particle size distribution, but not obviously.

4.3. Meteorological Factors

The following figures show the impact of related factors on ESDD or NSDD. The trend of the SPS of the insulators is qualitatively analysed according to Figure 11. The ranges of the related factors are shown in Table 4.
From Figure 11a we can see that the distance factor (d) has a significant influence on the NSDD and little influence on the ESDD. With the increase of the distance factor (d), the NSDD of the lower surface shows an increasing trend. The NSDD within 5 km of the coastline is significantly lower than the NSDD in the inland area. The maximum values of ESDD and NSDD in all samples appear on the lower surface of the insulator string farthest from the coastline. Comparing Figure 11a–c, since the distance factor (d) has a linear relationship with altitude (a) and temperature (t), the effects of these three factors on SPS are basically the same.
As can be seen from Figure 11d, the larger the annual average rainfall (rf) is, the greater the impact on ESDD and NSDD is. The ESDD and NSDD in regions with less rainfall are generally higher than those in heavy rainfall areas. With the increase of rf, the ESDD value of the upper surface gradually decreases, indicating that the rainfall plays a role in scouring the upper surface of the insulator. However, with the increase of rf, the ESDD of the lower surface shows a tendency to decrease first and then increase. We believe that the samples with rf ≤ 40 days are mainly concentrated in the inland areas, where the effect of scouring is obvious, leading to a downward trend of ESDD. With the increase of rf, the corresponding samples are concentrated in the coastal areas, where the rain increases the residual electrolyte on the surface. Since it is difficult for rain to scour the lower surface of the insulators, the electrolyte gradually increases, resulting in a rise in the salt density of the lower surface.
It can be seen from Figure 12a that the wind has an obvious effect on the ESDD. With the increase in wv, ESDD showed a decreasing trend. The relationship between NSDD of the upper surface with the wind is complicated. We believe that the contamination can be easily carried by wind, causing uneven accumulation of contamination. As can be seen from Figure 12b, the SPS tends to increase first and then decrease with the increase of dt. Based on the statistical results of the insulator, the first saturated time of SPS is in the second year.

4.4. The Hydrophobicity of the Insulators

Choosing one suspension insulator and one composite insulator as samples, the hydrophobicity classification (HC) is measured based on the method of reference [21].
Figure 13 shows the hydrophobicity of the composite insulators. Regardless of the upper surface or the lower surface, the HC results of the composite insulators are mostly HC2 and HC3, as can be seen in Figure 14a,b, indicating that the composite insulators have good hydrophobicity. On the whole, the poorly hydrophobic points appear near the ends of the composite insulator, which corresponds with the distribution of the high electric field intensity in Figure 9d.
It can be seen in Figure 15a that the upper surface has good hydrophobicity, concentrated in HC3–HC4. The lower surface of the suspension insulator shown in Figure 15b substantially lacks hydrophobicity. The HC of the entire string of suspension insulators is shown in Figure 16; the HC of the upper surface is maintained at HC4, and the weak hydrophobicity (HC5–HC7) of some insulators is related to the high field strength region of Figure 9b. In Figure 16b, the hydrophobicity of the lower surface is completely lost; its ESDD is 0.306 mg/cm² and its NSDD is 3.245 mg/cm², which corresponds directly to the high SPS.

4.5. RFs Analysis and Forecasting

In Table 5, p represents the p-th related factor of SPS, α represents the MDA (or MDG) of each relevant factor in ESDD as a percentage of the total MDA (or MDG). β represents the MDA (or MDG) of each relevant factor in NSDD as a percentage of the total MDA (or MDG). The equations to calculate α and β are as follows (Equations (15) and (16)).
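$$\alpha_{a,p} = \frac{\mathrm{MDA}_p}{\sum_{i=1}^{16}\mathrm{MDA}_i}; \quad \beta_{a,p} = \frac{\mathrm{MDA}_p}{\sum_{j=1}^{16}\mathrm{MDA}_j} \tag{15}$$
$$\alpha_{\mathrm{Gini},p} = \frac{\mathrm{MDG}_p}{\sum_{i=1}^{16}\mathrm{MDG}_i}; \quad \beta_{\mathrm{Gini},p} = \frac{\mathrm{MDG}_p}{\sum_{j=1}^{16}\mathrm{MDG}_j} \tag{16}$$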
The parameters used to evaluate the weights are α for ESDD and β for NSDD [17]. For ESDD, the αa (αGini) values of the surface area (sa), position factor (pf), insulator voltage level (vl), surface orientation (so), hydrophobicity (HC), and polarity/phases (pp) are 19.83% (19.77%), 12.26% (14.36%), 11.49% (9.60%), 11.42% (9.60%), 8.02% (9.44%), and 6.35% (6.77%), respectively. The total impacts of the insulator type and the electrical factors are 52.94% (52.91%) and 21.88% (20.47%), respectively.
For NSDD, the βa (βGini) values of the surface area (sa), surface orientation (so), and hydrophobicity (HC) are 29.72% (29.30%), 16.50% (13.60%), and 11.04% (14.77%), respectively. The total impacts of the insulator type and the electrical factors are 55.37% (54.06%) and 7.29% (6.17%), respectively.
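The normalisation behind α and β is a simple share-of-total calculation. The sketch below shows it with a hypothetical function name; any raw importance values passed to it are placeholders chosen by the caller, not data from Table 5.

```python
def weights_percent(importance):
    """Express each factor's MDA (or MDG) as a percentage of the total."""
    total = sum(importance.values())
    return {name: value / total * 100.0 for name, value in importance.items()}
```

Applied to the 16 MDA values of the ESDD model this yields the αa column, and applied to those of the NSDD model it yields βa; the ratio of the two percentages for a given factor gives the relative influence discussed in the text.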
According to the ratio of α to β, the influence of the material (m), position factor (pf), total length (n), polarity/phases (pp), voltage level (vl), and voltage type (vt) on ESDD is greater than that on NSDD, while the other related factors have a larger effect on NSDD. The impacts of the voltage level (vl), voltage type (vt), and polarity/phases (pp) on ESDD are 1.5 times, 3 times, and 4.5 times those on NSDD, respectively. The impacts of the distance from coastline (d), wind velocity (wv), and rainfall (rf) on NSDD are 1.5 times, 2 times, and 2.5 times those on ESDD, respectively. It can be seen that the electrical factors have an obvious impact on ESDD, which corresponds to the results in reference [22].
The smallest influencing factor for ESDD and NSDD is the deposition time (dt). A likely reason is that the RFs learning data are taken from annual sampling and the external influencing factors are basically the same in each fixed period, which leads to the least impact on the SPS. On the basis of the SPS measurements, it is believed that the following inputs have high weights for the SPS forecasting of insulators: the surface area (sa), the position factor (pf), the insulator voltage level (vl), the surface orientation (so), hydrophobicity (HC), and the polarity/phases (pp). When the NSDD is forecasted, particular attention should be given to the surface area (sa), surface orientation (so), and hydrophobicity (HC).
To validate the trained RFs-R model, FC160P/C170DC suspension insulators were hung on a ±660 kV transmission line tower in 2016 for a one-year natural contamination test, as shown in Figure 17. Using the trained RFs-R model, we set the surface factor (sa), position factor (pf), insulator voltage level (vl), surface orientation (so), and polarity/phases (pp) as inputs to predict the ESDD. We set the surface factor (sa), surface orientation (so), and hydrophobicity (HC) as inputs to predict the NSDD. The parameters of the RFs-R model were set as ntreeESDD = 45, ntreeNSDD = 66, and mtry = 4 (consistent with Section 3.1). The relative errors between the predicted and measured values are 8.31% for ESDD and 9.62% for NSDD, as shown in Figure 18.
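For readers who want to repeat this kind of validation, the sketch below shows one common way to compute a relative error between predicted and measured values. It is an assumption that the error is averaged over samples in this way, since the text does not state how it is aggregated, and the sample values are hypothetical placeholders rather than data from the test.

```python
def relative_error(predicted, measured):
    """Relative error of one prediction with respect to the measured value."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(predicted - measured) / abs(measured)


def mean_relative_error(predicted, measured):
    """Average relative error over paired predictions and measurements."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [relative_error(p, m) for p, m in zip(predicted, measured)]
    return sum(errors) / len(errors)


# Hypothetical ESDD values in mg/cm^2, for illustration only (not from the paper).
measured_esdd = [0.050, 0.080, 0.100]
predicted_esdd = [0.055, 0.076, 0.100]

mre = mean_relative_error(predicted_esdd, measured_esdd)
print(f"mean relative error: {mre:.2%}")
```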

5. Conclusions

(1) The R² and MPSE of the trained ESDD RFs-R model are 0.951 and 7.98%, respectively, and the relative error of the predicted ESDD is 8.31%. The R² and MPSE of the trained NSDD RFs-R model are 0.911 and 9.04%, respectively, and the relative error of the predicted NSDD is 9.62%. Comparison with the natural contamination test and with the SVM regression model shows that the RFs-R model can effectively predict the natural contamination of insulators.
(2) According to the MDA (MDG) and MI, the insulator type (including surface area, surface orientation, and total length) and the hydrophobicity are the main factors affecting both ESDD and NSDD. Compared with NSDD, ESDD is significantly more affected by the electrical parameters. For ESDD, the weights of the insulator type, hydrophobicity, and electrical factors are 52.94%, 8.02%, and 21.88%, respectively (Table 5). For NSDD, the weights of the insulator type, hydrophobicity, and meteorological factors are 55.37%, 11.04%, and 14.26%, respectively.
(3) The effect of the electrical parameters on ESDD is greater than that on NSDD, while the non-electrical parameters have a significant impact on NSDD. The influence of the voltage type (vt), polarity/phases (pp), and voltage level (vl) on ESDD is about 1.5 times, 3 times, and 4.5 times that on NSDD, respectively. The influence of the distance from coastline (d), wind velocity (wv), and rainfall (rf) on NSDD is about 1.5 times, 2 times, and 2.5 times that on ESDD, respectively.
(4) For engineering reasons, the SPS has so far been measured only in a fixed yearly period. Although considerable results have been achieved, on-line daily ESDD and NSDD measurements are urgently needed. Data from more locations and with higher accuracy should be collected and analysed to reveal more robust and accurate quantitative rules.

Acknowledgments

This project was supported by the Science and Technology Project of the State Grid Corporation of China (SGTYHT/15-JS-193).

Author Contributions

Ang Ren and Huaishuo Xiao conceived and designed the experiments; Ang Ren wrote the paper and Qingquan Li revised it.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, L.; Li, Y.; Lu, M.; Liu, Z.; Wang, C.; Lv, Z. Quantification and comparison of insulator pollution characteristics based on normality of relative contamination values. IEEE Trans. Dielectr. Electr. Insul. 2016, 23, 965–973.
  2. Jiao, S.B.; Liu, D.; Xie, G.; Yi, D. Assessment of contamination condition of insulator based on PSO-SVM. In Proceedings of the 4th IEEE Conference on Industrial Electronics and Applications, Xi’an, China, 25–27 May 2009.
  3. Ahmad, A.S.; Ghosh, P.S.; Ahmed, S.S.; Ahmad, S.S.; Aljunid, S.A.K. Assessment of ESDD on high-voltage insulators using artificial neural network. Electr. Power Syst. Res. 2004, 72, 131–136.
  4. Karamousantas, D.C.; Chatzarakis, G.E.; Oikonomou, D.S.; Karampelas, P. Effective insulator maintenance scheduling using artificial neural networks. IET Gener. Transm. Distrib. 2010, 4, 479–484.
  5. Muniraj, C.; Chandraseka, S. Adaptive neurofuzzy inference system-based contamination severity prediction of polymeric insulators in power transmission lines. Adv. Artif. Neural Syst. 2011, 2011, 1–9.
  6. Hannan, M.A.; Ali, J.A.; Mohamed, A.; Uddin, M. A random forests regression based space vector PWM inverter controller for the induction motor drive. IEEE Trans. Ind. Electron. 2016, 64, 2689–2699.
  7. Samantaray, S.R. A data-mining model for protection of FACTS-based transmission line. IEEE Trans. Power Deliv. 2013, 28, 612–618.
  8. Shah, A.M.; Bhalja, B.R. Fault discrimination scheme for power transformer using random forests technique. IET Gener. Transm. Distrib. 2016, 10, 1431–1439.
  9. Kannan, K.; Shivakumar, R.; Chandrasekar, S. A random forests model based contamination severity classification scheme of high voltage transmission line insulators. J. Electr. Eng. Technol. 2016, 11, 951–960.
  10. Larivière, B.; Van den Poel, D. Predicting customer retention and profitability by using random forests and regression forests techniques. Expert Syst. Appl. 2005, 29, 472–484.
  11. Selection and Dimensioning of High-Voltage Insulators Intended for Use in Polluted Conditions; IEC: Geneva, Switzerland, 2008.
  12. Lv, Y.; Li, J.; Zhang, X.; Pang, G.; Liu, Q. Simulation study on pollution accumulation characteristics of XP13-160 porcelain suspension disc insulators. IEEE Trans. Dielectr. Electr. Insul. 2016, 23, 2196–2206.
  13. Jiang, X.; Wang, S.; Zhang, Z.; Hu, J.; Hu, Q. Investigation of flashover voltage and non-uniform contamination correction coefficient of short samples of composite insulator intended for ±800 kV UHVDC. IEEE Trans. Dielectr. Electr. Insul. 2010, 17, 71–80.
  14. Gençoğlu, M.T.; Cebeci, M. Review, investigation of contamination flashover on high voltage insulators using artificial neural network. Expert Syst. Appl. 2009, 36, 7338–7345.
  15. Ahmad, A.S.; Ghosh, P.S.; Aljunid, S.A.K.; Ahmad, H. Modeling of various meteorological effects on contamination level for suspension type of high voltage insulators using ANN. In Proceedings of the 1st International Conference and Exhibition on Transmission and Distribution in the Asia Pacific Region, Yokohama, Japan, 6–10 October 2002.
  16. Zhao, L.; Li, C.; Xiong, J.; Wang, C. An artificial contamination test on silicone rubber insulators under long-time wetted conditions. In Proceedings of the 2007 Annual Report - Conference on Electrical Insulation and Dielectric Phenomena, Vancouver, Canada, 14–17 October 2007.
  17. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
  18. Package “randomForest”. Available online: https://cran.r-project.org/web/packages/randomForest/randomForest.pdf (accessed on 8 June 2017).
  19. Janitza, S.; Strobl, C.; Boulesteix, A.L. An AUC-based permutation variable importance measure for random forests. BMC Bioinformat. 2013, 14, 1–11.
  20. Cover, T.M.; Thomas, J.A. Elements of information theory. In Wiley Series in Telecommunications and Signal Processing; Honig, M., Ed.; Wiley & Sons Inc.: New York, NY, USA, 1990; Volume 294, pp. 155–183.
  21. Swedish Transmission Research Institute. Guide 1.92/1 Hydrophobicity Classification Guide. Available online: http://www.stri.se/wwwpublic/STRI_Guide_1_92_1.pdf (accessed on 8 June 2017).
  22. Takasu, K.; Shindo, T.; Arai, N. Natural contamination test of insulators with DC voltage energization at inland areas. IEEE Trans. Power Deliv. 1988, 3, 1847–1853.
Figure 1. Insulator sampling diagram. (a) Suspension insulators; (b) long rod composite insulator.
Figure 2. Modelling process of ESDD and NSDD forecasting based on RFs.
Figure 3. Forecasting the OOB error of RFs with different ntree. (a) The change in MPSE of ESDD when ntree changes; (b) The change in MPSE of NSDD when ntree changes.
Figure 4. Importance analysis of influencing factors. (a) The results of ESDD; (b) The results of NSDD.
Figure 5. The thermal map. (a) ESDD; (b) NSDD.
Figure 6. Contamination particle size. (a) ESDD; (b) NSDD.
Figure 7. Two-dimensional simulation results of the suspension porcelain insulator. (a) The potential distribution of the suspension insulator; (b) The electric field of the suspension insulator.
Figure 8. The contaminated insulator. (a) The upper surface; (b) The lower surface.
Figure 9. Insulator electric field and potential distribution. (a) The potential distribution of the suspension insulators; (b) The electric field of the suspension insulators; (c) The potential distribution of the composite insulators; (d) The electric field of the composite insulators.
Figure 10. Contamination particle size. (a) The cumulative frequency of the particle size; (b) The frequency of the particle size.
Figure 11. Effects of the influencing factors on ESDD and NSDD. (a) Distance from coastline; (b) Altitude; (c) Annual average temperature; (d) Medium to heavy rain.
Figure 12. Effects of influencing factors on ESDD and NSDD. (a) Wind speed ≥5.5 m/s·days; (b) Deposition time.
Figure 13. The hydrophobicity of the composite insulators. (a) The upper surface; (b) the lower surface.
Figure 14. The hydrophobicity of the composite insulator. (a) The upper surface; (b) the lower surface.
Figure 15. The hydrophobicity of the suspension insulator. (a) The upper surface; (b) the lower surface.
Figure 16. The hydrophobicity of the suspension insulators. (a) The upper surface; (b) the lower surface.
Figure 17. The diagram of the artificial natural pollution test.
Figure 18. The results of the artificial natural pollution test.
Table 1. Insulator parameters.
| Insulator Type | Voltage Level (kV) | Leakage Distance (mm) | Upper Surface Areas (cm²) | Lower Surface Areas (cm²) |
|---|---|---|---|---|
| FC160P/C170DC | ±660 | 550 (a piece) | 1800 | 2700 |
| FXBZW ±660/300 | ±660 | 9220 | 224.6/120.2 | 224/120 |
| FXBZW ±660/160 | ±660 | 4680 | 224.6/120.2 | 224/120 |
| LXHY4-100 | 220 | 450 (a piece) | 975 | 1601 |
| XWP2-160 | 500 | 450 (a piece) | 1551 | 1208 |
| FC160P | 500 | 550 (a piece) | 1198 | 2541 |
Table 2. Insulator parameters.
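| Category | Influencing Factor | Note | Symbol | Unit |
|---|---|---|---|---|
| Contamination factors | deposition time | Insulator working time | dt | year |
| Contamination factors | particle size | Contamination particle size | ps | μm |
| Contamination factors | hydrophobicity | The hydrophobicity of the contaminated insulator | HC | 1-7 |
| Meteorological factors | altitude | Altitude of the tower | a | m |
| Meteorological factors | distance from coastline | Distance from the nearest coastline | d | m |
| Meteorological factors | rainfall | Medium to heavy rain days | rf | day |
| Meteorological factors | temperature | Annual average temperature | t | °C |
| Meteorological factors | wind velocity | Days with wind speed ≥5.5 m/s | wv | day |
| Insulator type | material | 1 Coated with RTV; 2 Composite; 3 Porcelain | m | - |
| Insulator type | position factor | pf = i/n (i is the index of the i-th insulator) | pf | - |
| Insulator type | surface area | Insulator surface area | sa | cm² |
| Insulator type | surface orientation | Surface orientation (1 upper; 2 lower) | so | - |
| Insulator type | total length | The number of chains | n | - |
| Electrical factors | polarity/phases | Polarity/Phases (1+; 2−; 3A; 4B; 5C) | pp | - |
| Electrical factors | voltage level | Insulator operating voltage | vl | kV |
| Electrical factors | voltage type | Voltage type (1 DC; 2 AC) | vt | - |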
Table 3. The particle size distribution of the contamination.
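| Species | D10 (μm) | D50 (μm) | D90 (μm) | P < 3 | P < 10 | P < 20 | P < 40 |
|---|---|---|---|---|---|---|---|
| Anode composite | 2.29 | 8.24 | 23.26 | 14.44% | 59.42% | 86.05% | 98.99% |
| Anode porcelain | 7.19 | 17.60 | 41.54 | 0.15% | 21.90% | 57.22% | 88.91% |
| Anode RTV | 5.16 | 15.00 | 32.62 | 5.07% | 30.12% | 68.37% | 95.30% |
| Cathode composite | 3.51 | 8.25 | 21.90 | 7.15% | 59.86% | 87.46% | 99.23% |
| Cathode porcelain | 5.13 | 13.89 | 33.33 | 2.50% | 32.42% | 68.13% | 94.99% |
| Cathode RTV | 4.59 | 12.59 | 30.99 | 4.94% | 38.92% | 71.88% | 96.06% |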
Table 4. The particle size distribution of the contamination.
| Factors | d (m) | a (m) | t (°C) | rf (days) | wv (days) | dt (years) |
|---|---|---|---|---|---|---|
| Range | 40,127–229,735 | 11–34 | 11–14 | 11–14 | 8–45 | 1–5 |
Table 5. The MDA and MDG of related factors.
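| Feature | ESDD Mean Decrease Accuracy | αa | ESDD Mean Decrease Gini Index | αGini | NSDD Mean Decrease Accuracy | βa | NSDD Mean Decrease Gini Index | βGini | αa/βa | αGini/βGini |
|---|---|---|---|---|---|---|---|---|---|---|
| dt | 0.00018 | 1.16% | 0.01671 | 1.29% | 0.00992 | 0.90% | 1.549 | 1.34% | 1.28 | 0.96 |
| ps | 0.00024 | 1.54% | 0.01826 | 1.41% | 0.01901 | 1.73% | 2.272 | 1.97% | 0.89 | 0.72 |
| a | 0.00045 | 2.89% | 0.03626 | 2.80% | 0.03902 | 3.56% | 4.544 | 3.93% | 0.81 | 0.71 |
| d | 0.00043 | 2.76% | 0.03616 | 2.79% | 0.04611 | 4.20% | 4.802 | 4.16% | 0.66 | 0.67 |
| rf | 0.00031 | 1.99% | 0.03014 | 2.33% | 0.05835 | 5.32% | 4.701 | 4.07% | 0.37 | 0.57 |
| t | 0.00042 | 2.70% | 0.03668 | 2.83% | 0.03932 | 3.58% | 3.972 | 3.44% | 0.75 | 0.82 |
| wv | 0.00037 | 2.37% | 0.03029 | 2.34% | 0.05881 | 5.36% | 4.722 | 4.09% | 0.44 | 0.57 |
| HC | 0.00125 | 8.02% | 0.12227 | 9.44% | 0.12112 | 11.04% | 17.054 | 14.77% | 0.73 | 0.64 |
| m | 0.00063 | 4.04% | 0.05317 | 4.10% | 0.02811 | 2.56% | 2.462 | 2.13% | 1.58 | 1.93 |
| pf | 0.00191 | 12.26% | 0.18596 | 14.36% | 0.02132 | 1.94% | 5.719 | 4.95% | 6.31 | 2.90 |
| sa | 0.00309 | 19.83% | 0.25613 | 19.77% | 0.32612 | 29.72% | 33.836 | 29.30% | 0.67 | 0.67 |
| so | 0.00178 | 11.42% | 0.12433 | 9.60% | 0.18111 | 16.50% | 15.703 | 13.60% | 0.69 | 0.71 |
| n | 0.00084 | 5.39% | 0.06578 | 5.08% | 0.05104 | 4.65% | 4.713 | 4.08% | 1.16 | 1.24 |
| pp | 0.00099 | 6.35% | 0.08771 | 6.77% | 0.02312 | 2.11% | 2.211 | 1.91% | 3.02 | 3.54 |
| vl | 0.00179 | 11.49% | 0.12433 | 9.60% | 0.02811 | 2.56% | 2.462 | 2.13% | 4.49 | 4.50 |
| vt | 0.00063 | 4.04% | 0.05317 | 4.10% | 0.02873 | 2.62% | 2.462 | 2.13% | 1.54 | 1.93 |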

Share and Cite

MDPI and ACS Style

Ren, A.; Li, Q.; Xiao, H. Influence Analysis and Prediction of ESDD and NSDD Based on Random Forests. Energies 2017, 10, 878. https://doi.org/10.3390/en10070878
