1. Introduction
High-voltage insulators on overhead power lines are continuously exposed to the outdoors and are susceptible to pollution and salt contamination. Salt contamination and industrial pollution may cause flashovers and line outages, thereby affecting the power supply and reducing the reliability of the power system. To prevent insulator flashovers, maintenance staff need to periodically wash the insulators [1,2].
Currently, various maintenance procedures are available to minimize flashovers caused by contamination and industrial pollution. Among these, periodic washing is the most common method to remove pollutants from the surface of insulators [3]. Furthermore, the equivalent salt deposit density (ESDD) and non-soluble deposit density (NSDD) are regularly used to assess the severity of site pollution [4].
In terms of maintenance, utilities often adopt periodic maintenance procedures when abnormal conditions are observed on the insulators. Although this method can effectively prevent flashovers due to contamination and industrial pollution, it requires a significant amount of manpower and cannot reliably verify the insulation condition of the insulators [5,6]. Therefore, one of the challenges for maintenance is how to assess the condition of insulators and determine the washing schedule. Considering the variability of weather and pollution conditions, periodic washing may not be an efficient strategy.
Generally speaking, the discharge activity on an insulator intensifies as the leakage current increases, which is observable when the surface of the insulator is contaminated and wet. Therefore, by observing the surface discharge and leakage current of the insulator, one can roughly assess its insulation condition and determine which stage of the pollution flashover mechanism it has reached. In addition, the leakage current indicates how close the insulator string is to flashover. However, the leakage current is influenced by weather conditions such as temperature, relative humidity, pressure, wind speed, and ultraviolet exposure, as well as by the type and layering of pollution on the surface of the insulator [7,8].
The relationship between discharge and leakage current can be determined through experiments.
Table 1 illustrates the correlation between leakage current and surface discharge phenomena in the ceramic insulator. The experiment reveals that the leakage current is extremely low when the surface is clean and dry; about 1 mA is the normal leakage current in this state. When the relative humidity reaches a certain level, for example, 80% or more, the leakage current exceeds 1 mA, and more obvious corona or sparks can be observed [9,10,11,12].
Therefore, relative humidity and pollution must act together to produce a larger corona or spark; this discharge typically initiates at a leakage current of 2 to 3 mA. When the leakage current exceeds 10 mA, the heat generated at the discharge roots dries out the pollution layer in their vicinity, forming dry bands. Sparks then occur, causing discharges across small portions of the insulation. If the leakage current exceeds 100 mA, an extended partial arc occurs. This arc discharge is highly unstable; therefore, if an extended partial arc due to an extremely large leakage current is observed, the insulator should be cleaned immediately.
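As an illustration only (not part of the original study), the leakage-current thresholds described above can be sketched as a small Python helper; the function name and stage labels are invented for this example:

```python
def discharge_stage(leakage_ma):
    """Map a leakage-current reading (in mA) to the discharge stage
    described in the text: <=1 mA normal, 2-3 mA corona/spark onset,
    >10 mA dry-band sparking, >100 mA extended partial arc."""
    if leakage_ma <= 1.0:
        return "clean and dry (normal)"
    elif leakage_ma < 10.0:
        return "corona / small sparks"
    elif leakage_ma < 100.0:
        return "dry-band sparking"
    else:
        return "extended partial arc - wash immediately"
```

Such a rule-based mapping only reflects the thresholds in Table 1; the rest of the paper replaces it with learned regression models.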
As mentioned above, the leakage current is a more meaningful parameter as it provides information on all stages of the pollution flashover mechanism and indicates how close the insulator string is to flashover.
From a literature survey, it is evident that insulator contamination is related to weather parameters such as temperature, relative humidity, pressure, wind speed, and ultraviolet exposure [13,14,15,16]. Since the leakage current on the surface of the insulator is affected by the material, surface contamination, and surrounding environment, and exhibits nonlinear characteristics, artificial intelligence algorithms such as machine learning have been applied in related studies to analyze the leakage current or estimate contamination on the surface of insulators [10,11,13,17,18,19,20,21,22]. For instance, artificial neural networks (ANNs) have been used to build leakage current models [18,19], support vector machines (SVMs) to evaluate the contamination degree [20,21], and random forests to predict the equivalent salt deposit density (ESDD) from pollution and weather parameters [22]. Although numerous studies have applied machine learning and other algorithms to the leakage current of insulators, few have used long-term measurement data to build predictive models and evaluate the most effective prediction methods.
The study was conducted at a 161 kV test station located in a severely polluted industrial area on the western coast of Taiwan. In the test station, a data acquisition system has been built to measure the leakage current of insulators and atmospheric parameters (including temperature, relative humidity, pressure, wind speed, and ultraviolet) around the insulator. One insulator string was monitored under real operational conditions for 30 consecutive months.
This paper proposes a novel method to predict the leakage current using artificial intelligence algorithms and establishes a real-time salt contamination monitoring system. First, silicone-grease-coated insulators are installed at the test station, and their leakage current, together with the weather parameters, is collected over an extended period (30 months). Then, artificial intelligence and machine learning techniques (support vector regression, gradient boosting regression, and long short-term memory neural networks) are applied to establish a prediction model for the leakage current of the insulators. The established model can accurately predict the leakage current of insulators based on weather parameters. Finally, the most effective prediction method is evaluated in terms of mean squared error (MSE), mean absolute error (MAE), and explained variance score (EVS).
Additionally, to observe the real-time status of the insulator, this study establishes a monitoring platform that integrates the predicted leakage current, pollution level of the insulator, and weather parameters. It allows users or maintenance personnel to connect to the server through the network to observe the predicted results and weather parameters. The results can establish a real-time salt contamination monitoring system for insulators on transmission lines, enabling operation and maintenance personnel to realize the actual insulation situation of insulators in real time. This system not only helps prevent power outages due to salt contamination or pollution but also reduces the workload for maintenance personnel.
3. Prediction Method of the Leakage Current
Generally, time series analysis involves methods for extracting meaningful statistics and other characteristics from time series data. This encompasses various techniques, such as time series models, exponential smoothing forecasts, or moving averages. This study primarily focuses on the analysis of long-term time series data, including leakage current, temperature, relative humidity, and pressure measured per minute. Regression model algorithms like support vector regression (SVR), gradient boosting regression (GBR), and long short-term memory neural network (LSTM) are applied for predicting the insulator’s leakage current based on on-site measured weather data. Additionally, evaluation metrics (mean squared error (MSE), mean absolute error (MAE), and explained variance score (EVS)) are used to assess the error between the actual and predicted values, providing an explanation of the models’ performance in estimating leakage current.
3.1. Time Series Analysis
Time series analysis is a statistical forecasting method that examines a series of data points indexed in chronological order. The primary objective of time series analysis is to scrutinize the changing trends in data over time and estimate future trends. The advantage of time series analysis is its ability to describe and predict variables as long as historical data are available. However, its limitation becomes apparent when the forecast variables are influenced not only by time but also by other factors, leading to a notable reduction in forecasting accuracy [24].
3.2. Support Vector Regression (SVR)
Support vector machine (SVM) was introduced by Cortes and Vapnik in 1995 [25]. It is a supervised learning model with associated learning algorithms used for data classification and regression analysis. SVM typically functions as a binary linear classifier, using a hyperplane to separate the training data into two categories as effectively as possible. However, the data encountered in big data scenarios are often not linearly separable. Hence, SVM can utilize various kernel functions to perform non-linear classification: when linear separation in the original space is insufficient, the data are mapped into a higher-dimensional space using a kernel function, where the optimal separating hyperplane can be identified. Support vector regression (SVR) is an extension of the support vector machine designed to handle regression problems.
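A minimal sketch of SVR with an RBF kernel, using scikit-learn on invented stand-in data (the feature values and hyperparameters here are illustrative, not those of the study):

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical training data: 5 normalized weather features (e.g. temperature,
# humidity, pressure, wind speed, UV) vs. a synthetic leakage-current target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(200, 5))
y = 2.0 * X[:, 1] + 0.5 * X[:, 0] + rng.normal(0.0, 0.05, 200)

# The RBF kernel implicitly maps the features to a high-dimensional space,
# as described above; C and epsilon control regularization and the
# width of the epsilon-insensitive tube.
model = SVR(kernel="rbf", C=10.0, epsilon=0.01)
model.fit(X, y)
pred = model.predict(X[:5])   # predicted leakage current for 5 samples
```

In practice the kernel and hyperparameters would be tuned (e.g. by cross-validation) on the measured data.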
3.3. Gradient Boosting Regression (GBR)
Gradient boosting is a machine learning technique used for regression, classification, and other tasks. It produces a prediction model in the form of an ensemble of weak prediction models, typically decision trees. Boosting is an algorithm that elevates a weak learner to a strong learner. The process begins by training a base learner from the initial training set and then adjusting the training sample distribution based on the performance of the base learner. Subsequently, the next base learner is trained using the adjusted sample distribution. This cycle repeats until the number of base learners reaches the pre-specified value of N. Finally, the N base learners are weighted and combined.
Gradient boosting, a type of boosting algorithm, differs from traditional boosting in that it does not reweight correctly and incorrectly predicted samples. Instead, it calculates the difference between the predicted result and the sample value (the residual) and fits a new learner to reduce this residual [25,26,27,28].
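The residual-fitting procedure above can be sketched with scikit-learn's gradient boosting regressor; the data and hyperparameter values below are invented for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the weather -> leakage-current data (invented values).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(300, 5))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 0.05, 300)

# n_estimators is the number N of sequentially fitted base trees; each new
# tree is fitted to the residuals of the current ensemble, and learning_rate
# scales each tree's contribution.
gbr = GradientBoostingRegressor(n_estimators=200, learning_rate=0.1, max_depth=3)
gbr.fit(X, y)
pred = gbr.predict(X[:5])
```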
3.4. Long Short-Term Memory Neural Network (LSTM)
Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture commonly employed in deep learning. It is a recurrent network well suited to tasks such as data classification and time series prediction, as depicted in
Figure 9. A typical LSTM unit comprises a cell, an input gate, an output gate, and a forget gate. As data pass through the network, useful information from previous steps can be retained, while unimportant information is discarded to enhance the overall learning effect [29]. Subsequently, through the error backpropagation (BP) process, the weights are adjusted, and the model repeatedly analyzes the data to establish prediction models.
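To make the gate mechanism concrete, a single LSTM time step can be written out in NumPy; this is a generic textbook formulation with toy dimensions, not the network trained in this study:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the four gates row-wise:
    input (i), forget (f), cell candidate (g), output (o)."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])           # input gate: how much new info to write
    f = sigmoid(z[n:2*n])         # forget gate: how much old memory to keep
    g = np.tanh(z[2*n:3*n])       # candidate cell state
    o = sigmoid(z[3*n:4*n])       # output gate
    c = f * c_prev + i * g        # new cell state retains useful history
    h = o * np.tanh(c)            # new hidden state
    return h, c

# Toy dimensions (invented): 3 input features, 4 hidden units.
rng = np.random.default_rng(0)
d, n = 3, 4
W = rng.normal(0.0, 0.1, (4 * n, d))
U = rng.normal(0.0, 0.1, (4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(0.0, 1.0, (10, d)):   # run over a short input sequence
    h, c = lstm_step(x, h, c, W, U, b)
```

In a full model these weights would be learned by backpropagation through time, as described above.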
3.5. Evaluation Metrics
This study adopts three evaluation metrics to assess prediction performance: mean squared error (MSE), mean absolute error (MAE), and explained variance score (EVS). The mean squared error (MSE) measures the average of the squares of the errors and can be used to evaluate the dispersion between individual values in the data. A smaller MSE value indicates better accuracy of the prediction model [30,31].
The mean absolute error (MAE) measures the errors between paired observations expressing the same phenomenon. When the same physical quantity is measured multiple times, the individual measurements and their absolute errors differ; the MAE takes the absolute error of each measurement and averages these values. The MAE is non-negative, and the closer it is to zero, the better the model [11].
The explained variance score (EVS) measures the dispersion of the errors in a given dataset. It is calculated as Formula (3), EVS = 1 − Var(y − ŷ)/Var(y), where Var(y − ŷ) and Var(y) are the variance of the prediction errors and the variance of the actual values, respectively. A result closer to 1 indicates that the independent variables better explain the dependent variable, while a smaller value suggests a less effective model [32].
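The three metrics can be computed directly from their definitions; the sample values below are invented to show the calculation:

```python
import numpy as np

def mse(y_true, y_pred):
    # mean of squared errors
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    # mean of absolute errors
    return np.mean(np.abs(y_true - y_pred))

def evs(y_true, y_pred):
    # 1 - Var(errors) / Var(actual): 1.0 means the errors have no variance
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

# Invented example: actual vs. predicted leakage current (arbitrary units).
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
```

For these values, MSE = 0.025, MAE = 0.15, and EVS = 0.98, so a model with EVS near 1 and small MSE/MAE is preferred.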