Article

Assessment of Line Outage Prediction Using Ensemble Learning and Gaussian Processes During Extreme Meteorological Events

by Altan Unlu 1,* and Malaquias Peña 2

1 Department of Electrical & Computer Engineering, University of Connecticut, Storrs, CT 06268, USA
2 Department of Civil & Environmental Engineering, University of Connecticut, Storrs, CT 06268, USA
* Author to whom correspondence should be addressed.
Wind 2024, 4(4), 342-362; https://doi.org/10.3390/wind4040017
Submission received: 4 August 2024 / Revised: 6 October 2024 / Accepted: 28 October 2024 / Published: 1 November 2024

Abstract

Climate change is increasing the occurrence of extreme weather events, such as intense windstorms, with a trend expected to worsen due to global warming. The growing intensity and frequency of these events are causing a significant number of failures in power distribution grids. However, understanding the nature of extreme wind events and predicting their impact on distribution grids can help prevent these issues, potentially mitigating their adverse effects. This study analyzes a structured method to predict distribution grid disruptions caused by extreme wind events. The method utilizes Machine Learning (ML) models, including K-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), Decision Trees (DTs), Gradient Boosting Machine (GBM), Gaussian Process (GP), Deep Neural Network (DNN), and Ensemble Learning, which combines RF, SVM, and GP, to analyze synthetic failure data and predict power grid outages. The study utilized meteorological information, physical fragility curves, and scenario generation for distribution systems. The approach is validated by using five-fold cross-validation on the dataset, demonstrating its effectiveness in enhancing predictive capabilities against extreme wind events. Experimental results showed that the Ensemble Learning, GP, and SVM models outperformed other predictive models in the binary classification task of identifying failures or non-failures, achieving the highest performance metrics.

1. Introduction

Extreme weather events have increased dramatically in both frequency and severity as a direct consequence of climate change, presenting significant risks to power systems [1,2,3]. Extreme weather conditions such as strong winds, tornadoes, and hurricanes are primary catalysts for power outages, significantly impacting the stability, reliability, and resilience of power systems. For example, Hurricane Sandy in 2012 led to one of the largest power outages in Northeast U.S. history. The hurricane’s powerful winds and extensive flooding caused damage to power lines and substations, and disrupted transmission systems’ normal function. As a result, millions of customers were left without electricity for several days.
Understanding the causes of power outages, such as aging infrastructure, system overloads, and insufficient vegetation management, is vital for enhancing the resilience of power systems. The energy sector actively addresses these issues by upgrading infrastructure with durable materials and improving vegetation management practices [4]. An effective power system restoration strategy should be adaptable, evolving with real-time data on faults, disturbances, and Distributed Energy Resources (DERs) availability [5]. It requires reassessing existing infrastructure and adopting advanced forecasting techniques, grid modernization efforts, and more robust integration of Renewable Energy (RE) sources [5,6].
Predicting the severity of extreme weather events is challenging due to their unpredictability. However, weather variables are essential for improving predictive learning models for the forecasting process because electricity demand, DERs, and line outages are highly dependent on weather conditions [7,8,9]. Predictive statistical methods have proven effective in understanding the complex interactions between weather hazards, infrastructure, and the environment that cause power outages [10,11,12,13]. In this process, the weather model identifies and characterizes extreme events, while the component model evaluates the fragility of the system’s structural components [14,15,16]. To enhance this process, dealing with uncertainties is crucial; strategies such as Monte Carlo (MC) simulations, Markov modeling, Weibull analysis, and binomial distributions over a range of representative scenarios are widely utilized to illustrate extreme conditions and failure analysis [17,18,19,20]. In addition, inevitable uncertainties occur during the estimation of extreme events due to factors such as the choice of probability distributions, methods for parameter estimation, and the record length of wind speed (WS) data. Different probability distributions provide varying results, producing uncertainty in predictions. Parameter estimation methods, whether statistical or empirical, introduce variability based on the data and approach used. The authors of [21] examined the impact of parameter uncertainty on return value estimates using the Generalised Pareto distribution. Their results highlight that while estimators produce identical results when uncertainty is ignored, they differ significantly when accounting for uncertainty in the shape and scale parameters. A previous paper [22] developed an algorithm for quantifying uncertainty in extreme value statistics using the Monte Carlo technique. Fitting multiple widely used probability distributions by the maximum likelihood method provides accurate standard error estimates. The models applied to WS and precipitation data show that standard error increases with return period but decreases with sample size. Furthermore, the length of WS records affects the accuracy of predictions, especially when estimating values for specific return periods.
Predictive models that integrate weather with power system management are critical. Such adaptive models are vital for translating weather data into actionable strategies, enabling power systems to anticipate, prepare for, and mitigate the effects of unpredictable weather conditions. Early research demonstrated the development of hurricane outage prediction for the full U.S. coastline [23]. Another study [10] collected utility-specific infrastructure and land cover data to improve outage predictions, analyzing various storms and seasons using the numerical weather prediction model for Eversource Energy in Connecticut. A later study [24] demonstrated regression tree models with weather predictions, soil and vegetation information, utility assets, and historical outage data to forecast outage numbers and distribution. Additionally, the authors of [25] described a technique for improving power distribution grid outage prediction by incorporating storm severity classification of weather-related outage events. The authors of [26] proposed Machine Learning (ML) models, including naive Bayes classifiers, Support Vector Machines, and tree-based models, to predict contingency analysis outcomes with REs in the Belgian transmission grid. Research by [27] proposed the application of Ensemble methods to forecast the conditions of smart grid devices during extreme weather events to enhance the resilience of energy grids. The authors of [28] analyzed historical weather and outage data, revealing a strong link between heatwaves and power outages, and developed several ML models to predict equipment failures during heatwaves. Another article [29] presented comprehensive methods for predicting storm severity for the power grid using ERA5 reanalysis data combined with forest inventory. The study by [30] proposed a framework to predict power outages using hurricane forecasts, considering the causalities among distribution devices. In another study [31], a data-driven approach was proposed to predict vegetation-related outages in power distribution systems. Moreover, advancing predictive models is useful for anticipating outages and enabling rapid responses. Using data analytics, ML, and weather forecasting, these models help utilities address potential issues, mitigate disruptions, and improve energy delivery efficiency and safety.
The methodology used to derive fragility curves can be divided into four main approaches [32]. In another study, the authors of [33] reviewed fragility curves to model power system vulnerability, classifying them by event type and system component, with a focus on key differences across natural hazards. The first approach to obtain the fragility curve is empirical, which involves collecting historical field data, such as weather conditions or system failures, and using statistical models to quantify damage probabilities and identify how weather conditions affect system components. In [34], the research aimed to establish correlations between extreme weather and network reliability by utilizing observed weather and historical fault data records provided by transmission operators. The authors of [35] developed fragility curves for overhead electrical lines using a large dataset of electrical failures, which were correlated with windstorm hazard data to model failure probabilities. In this empirical methodology, the quality and consistency of the collected data are crucial, as they directly influence the model’s accuracy. The second technique is analytical, where physical models and equations, such as structural or system simulations, are used to predict damage states based on specific input conditions. The authors of [36] described a workflow for generating experimental data using power infrastructure, hurricane, and structural analysis modules with fragility curves based on physical models and simulations of power system responses to hurricane stresses. The paper by [16] analyzed transmission tower fragility under downburst effects, comparing fragility curves, calculating collapse probabilities by wind direction, and optimizing tower layouts using lognormal fragility curves for varying wind angles and damage states. The third approach to obtain the fragility curve is judgmental, where expert experience and qualitative assessments are used to estimate potential damage states [33]. Finally, the fourth technique is hybrid, which integrates the judgmental, empirical, and analytical approaches by combining them to produce fragility curves. The study by [3] integrated lognormal fragility curves for transmission towers to model failure probabilities under severe weather, using a sequential Monte Carlo simulation to assess the impacts on power system resilience with multi-temporal optimal power flow. The dissertation by [37] combined the physical attributes of pole–wire systems with ML models; the ML model adjusted physics-based fragility curves to predict outages in high-impact events.
In this work, we utilized synthetic yet realistic datasets generated from a physical fragility model and a scenario-based predictive model. Our approach differs from the literature in that we used these custom datasets instead of relying on readily available outage data. Our methodology combines physics-based power distribution line fragility analysis with data-driven predictive models for line failure classification, a combination that has not been extensively studied in the literature. This combination leverages the strengths of both approaches to provide better predictions of line failures, especially under extreme conditions. We analyzed our methodologies to predict line outages using binary classification. For model evaluation, we incorporated 5-fold cross-validation and conducted a comparative analysis of several predictive models to determine the most effective approach. This framework determines the accuracy of outage predictions and provides comprehensive evaluation methods for grid resilience and predictive capabilities.
The organization of the rest of this paper is as follows. The structure of the proposed binary classification and data generation are demonstrated in Section 2. The ML models studied and hyperparameter optimization are discussed in Section 3. The results and discussion are presented in Section 4. The final Section 5 provides the conclusion.

2. The Structure of Proposed Binary Classification and Data Generation for Line Failure Prediction

The proposed binary classification for the line failure prediction method is described, and the workflow is shown in Figure 1. This figure illustrates the comprehensive process of the ML framework with cross-validation. The workflow begins with fragility curves, the integration of weather information, and scenario generation for line failures, which collectively contribute to historical data generation.
Fragility curves are developed to quantify the probability of line failure under specific weather conditions. They provide a probabilistic measure of how likely a line is to fail at different intensities of weather events. Weather data, including variables such as WS, are crucial inputs for predicting line failures. Specifically, WS data are generated from a Weibull distribution, which is widely used in modeling WS due to its flexibility. WS data help in understanding the environmental conditions that could affect power lines and potentially lead to failures. Scenario generation involves creating various potential weather-induced line failure scenarios based on a binomial distribution for the predictive models. These scenarios help in simulating different conditions under which line failures might occur. The experimental setup, including the details of the data preparation, model training, and evaluation process, is explained in Section 3.13 of the manuscript. The detailed methodology is described in Section 3 of the manuscript. The classifier section depicts the implementation of multiple ML models, including K-Nearest Neighbors (KNN), Random Forest (RF), Support Vector Machine (SVM), Decision Trees (DTs), Gradient Boosting Machine (GBM), Gaussian Process (GP), Ensemble, and Deep Neural Network (DNN). These classifiers are subjected to hyperparameter tuning using grid search with stratified 5-fold cross-validation, as shown in Section 3.9, Section 3.10 and Section 3.11. The proposed methodology is Ensemble Learning, which combines RF, SVM, and GP using a soft voting mechanism. In this approach, the predictions from each model are aggregated by averaging the predicted probabilities, allowing the Ensemble to leverage the strengths of each model and make a more accurate final prediction. For performance evaluation, we used a range of metrics including accuracy, the Receiver Operating Characteristic (ROC) curve, and the Area Under the Curve (AUC), as well as precision, recall, and F1-score. These metrics provide a comprehensive assessment of the model’s performance by evaluating not only overall accuracy but also the ability to correctly classify and distinguish between classes.

2.1. Fragility Curve Development

Fragility curves play a crucial role in quantifying the vulnerability of distribution systems to extreme weather conditions [3], such as high WS. These curves are statistical representations that predict the probability of system failure at various WS levels, thus helping to identify critical weaknesses in infrastructure design and maintenance [33,35]. Figure 2 demonstrates the generated fragility curve for power line damage states in relation to WS and the probability of line failures. The probability of exceedance $P$ is represented using the Cumulative Distribution Function (CDF) with WS as the intensity measure [38,39]:
$$P = \Phi\left(\frac{\ln(I) - \ln(m)}{\sigma}\right) \quad (1)$$
In Equation (1), $P$ denotes the probability of exceedance, which represents the likelihood that the WS $I$ exceeds a given threshold. The function $\Phi$ represents the CDF of the standard normal distribution [3], providing the probability that a normally distributed random variable falls below a given value. Here, $\ln(I)$ is the natural logarithm of the WS, and $\ln(m)$ is the natural logarithm of the median WS. The standard deviation $\sigma$ of the lognormal distribution normalizes the difference between $\ln(I)$ and $\ln(m)$.
In this study, we constructed fragility curves using a lognormal distribution of WS intensity with a 10% standard deviation for four structural damage states: low (0 to 25%), medium low (25 to 50%), medium (50 to 75%), and extreme (75 to 100%). Median values specific to each category defined the curves, capturing the increasing probability of damage as WS increased.
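To make Equation (1) concrete, the exceedance probability can be evaluated with SciPy’s standard normal CDF. The following is a minimal sketch: the median WS values for the four damage states are hypothetical placeholders (the paper does not list its medians), and sigma = 0.10 reflects one reading of the stated 10% standard deviation.

```python
import numpy as np
from scipy.stats import norm

def exceedance_probability(ws, median_ws, sigma=0.10):
    """Equation (1): P = Phi((ln(I) - ln(m)) / sigma)."""
    return norm.cdf((np.log(ws) - np.log(median_ws)) / sigma)

# Hypothetical median wind speeds (mph) per damage state; illustrative only.
medians = {"low": 60.0, "medium_low": 75.0, "medium": 90.0, "extreme": 105.0}

ws_grid = np.linspace(30.0, 130.0, 200)
fragility_curves = {state: exceedance_probability(ws_grid, m)
                    for state, m in medians.items()}
```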

2.2. Weather Information

This work considers high wind events, whose occurrence can cause substantial disruption to electrical networks. Wind-related events are characterized by abnormally high WS that leads to infrastructure damage, line outages, and increased operational stress on the network. The Weibull distribution is a widely used statistical model for describing the variability in WS [40]. In our study, we applied the Weibull distribution to generate the WS data. The Weibull distribution is characterized by two key parameters: the shape factor k, which influences the spread of WS, and the scale factor c, which adjusts the magnitude of WS.
The probability density function (PDF) for the Weibull distribution is given by the following equation:
$$f(v; k, c) = \begin{cases} \dfrac{k}{c}\left(\dfrac{v}{c}\right)^{k-1} e^{-(v/c)^{k}} & \text{if } v \geq 0 \\ 0 & \text{if } v < 0 \end{cases} \quad (2)$$
Here, $v$ represents WS, $k$ is the shape factor, and $c$ is the scale factor. This function gives the probability density of WS, allowing us to statistically model and analyze wind behavior. Figure 3 shows a histogram of WS generated from the Weibull distribution, illustrating the different WS scenarios used in the predictive model.
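A minimal sketch of this sampling step with NumPy follows; the shape and scale values are illustrative assumptions, since the paper does not report the parameters behind Figure 3.

```python
import numpy as np

rng = np.random.default_rng(42)

k = 2.0   # shape factor (assumed): controls the spread of wind speeds
c = 70.0  # scale factor in mph (assumed): controls the magnitude of wind speeds

# NumPy's weibull() draws from the standard Weibull distribution (scale = 1),
# so the samples are multiplied by the scale factor to obtain wind speeds.
wind_speeds = c * rng.weibull(k, size=1000)
```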

2.3. Scenario Generation

To analyze the impact of wind events on power systems, scenario generation is performed by combining WS Weibull distributions with fragility curves. By generating scenarios that incorporate different WSs and their associated failure probabilities, system operators can simulate potential outcomes during extreme wind events. To model extreme wind scenarios, the following function defines the relationship between WS and the probability of structural failure [41]:
$$P(\text{failure} \mid \text{WS}) = \begin{cases} U(0, 0.25) & \text{if WS} \leq 65 \\ U(0.25, 0.5) & \text{if } 65 < \text{WS} \leq 75 \\ U(0.5, 0.75) & \text{if } 75 < \text{WS} \leq 85 \\ U(0.75, 1.0) & \text{if WS} > 85 \end{cases} \quad (3)$$
where P ( failure WS ) represents the probability of failure given the WS, and U ( a , b ) denotes a uniform distribution between a and b. After mapping WS to failure probabilities through the function, these probabilities are used to simulate actual structural failures using a binomial distribution [42], where each trial results in either a failure or no failure. Each instance of potential failure is modeled as a single trial in a binomial distribution with the success probability defined by the mapped values. Utilizing this mapping function, we generated 1000 simulation cases to model line failures under various wind conditions. This simulation provides a dataset for analyzing the likelihood of structural failures and the probability of failures across a range of WS.
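The two-stage simulation described above can be sketched as follows: each case samples a WS, maps it to a failure probability via Equation (3), and draws one binomial trial. The Weibull parameters are again illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def failure_prob_given_ws(ws):
    """Map wind speed (mph) to a failure probability per Equation (3)."""
    if ws <= 65:
        return rng.uniform(0.00, 0.25)
    elif ws <= 75:
        return rng.uniform(0.25, 0.50)
    elif ws <= 85:
        return rng.uniform(0.50, 0.75)
    return rng.uniform(0.75, 1.00)

# 1000 simulation cases: wind speed -> failure probability -> binary outcome.
wind_speeds = 70.0 * rng.weibull(2.0, size=1000)  # assumed Weibull parameters
probs = np.array([failure_prob_given_ws(ws) for ws in wind_speeds])
failures = rng.binomial(n=1, p=probs)  # 1 = failure, 0 = no failure
```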
Figure 4 shows the relationship between WS and total line failures. The scatter plot indicates clusters of data points at various WS intervals, each corresponding to a range of total failures.
Figure 5 illustrates the relationship between WS and line failure probability. The plot shows various data points of WS intervals, with each cluster corresponding to a range of line failure probabilities. As WS increases from 40 mph to 120 mph, there is a noticeable trend of higher line failure probabilities. The clustering pattern indicates that higher WSs are associated with a greater likelihood of line failures.
Figure 6 illustrates the relationship between WS and line failure states, represented as binary values (0 for no failure, 1 for failure). The plot shows that as WS increases, line failure states become more frequent, particularly around speeds of 90 mph and above.

3. Methodology—Studied Machine Learning Models and Hyperparameters

In this section, we present an evaluation of the predictive capabilities of eight ML models applied to the binary classification of line outages. The models evaluated include KNN, RF, SVM, DTs, GBM, GP, Ensemble models, and DNN. These models were chosen for their ability to handle complex data patterns and classification tasks. Additionally, hyperparameter optimization was performed to enhance the accuracy and generalization of each model.

3.1. K-Nearest Neighbors (KNN)

KNN is a simple and effective algorithm for binary classification and regression, especially suitable for small datasets. It predicts a new data point’s label by finding a predefined number of nearest neighbors using a distance metric [43]. Key hyperparameters include the number of neighbors, the weight function (either uniform or distance to closer neighbors), and the power parameter (p) for distance metrics [44]. KNN is non-parametric, making no assumptions about data distribution, but it can be computationally intensive for large datasets and sensitive to data scaling.

3.2. Random Forest (RF)

RF handles large datasets and prevents overfitting through Ensemble Learning, a technique where multiple models are trained on different subsets of the data and their predictions are combined, which often leads to better performance [45]. RF uses the bootstrap aggregating or bagging method. Each tree in the forest is constructed using an independently sampled random vector for diversity among the trees. Key hyperparameters of RF [46] include the number of estimators, or trees in the forest; more trees generally improve accuracy but also increase computational cost. Maximum depth limits how deep each tree can grow, helping control model complexity and overfitting. Minimum sample split specifies the minimum number of samples required to split a node, while minimum sample leaf sets the minimum number required at a leaf node; both ensure that splits occur only when sufficient data are available, which helps prevent overfitting.

3.3. Support Vector Machine (SVM)

SVM is a powerful supervised learning algorithm used for both binary classification and regression tasks. It finds the optimal hyperplane that maximizes the margin between classes in the feature space, making it effective for high-dimensional data [47]. SVM can handle both linear and non-linear data by using kernel functions, such as the radial basis function (RBF), to map input features into a higher-dimensional space for better separation. Key hyperparameters of SVM [48] include C, which acts as a regularization parameter to control the trade-off between classification accuracy and margin size. Gamma defines the influence of training examples on the decision boundary. The kernel type determines how data are transformed into a higher-dimensional space.

3.4. Decision Trees (DTs)

DTs work by recursively splitting the dataset into subsets based on feature values, creating a tree-based structure where each node represents a decision, making them easy to interpret [49]. They handle both categorical and numerical data well without assuming a specific data distribution. However, they are sensitive to small changes in the dataset. Key hyperparameters of DTs [50] include maximum depth, which limits how deep the tree can grow to control complexity. Minimum samples split specifies the minimum number of samples required to split a node. Minimum sample leaf sets the minimum number of samples required at a leaf node, helping to prevent overfitting.

3.5. Gradient Boosting Machine (GBM)

GBM is an Ensemble Learning technique that combines multiple weak learners to form a strong predictor, where each tree sequentially corrects the errors of the previous ones, improving overall accuracy [51]. GBM uses gradient descent to minimize a loss function, enhancing predictive performance. Key hyperparameters of GBM [52] include the number of trees (estimators), which determines how many trees are added to the Ensemble. The learning rate controls the contribution of each tree to the final model. Maximum depth sets the maximum depth of each tree, which helps to control complexity. While GBM offers high predictive accuracy and flexibility, it can be computationally intensive and prone to overfitting if not carefully tuned.

3.6. Gaussian Process (GP)

GP classification is a statistical approach utilized specifically for binary classification tasks, such as predicting line failures in infrastructure systems. This method leverages a latent variable model coupled with a non-linear link function, typically the logistic function, to transform continuous data into binary outcomes (failure vs. non-failure). GP classification excels in scenarios where the relationships between variables are complex, as it naturally incorporates uncertainty in its predictions [53,54]. The core mathematical model involves placing a GP prior on the latent function $f(x)$ in Equation (4), which captures the underlying input–output relationships:
$$f(x) \sim \mathcal{GP}\left(m(x), k(x, x')\right) \quad (4)$$
The transformation from the latent function to the binary outcome is achieved through the logistic function, demonstrated in Equation (5).
$$p(y = 1 \mid x) = \sigma(f(x)) = \frac{1}{1 + e^{-f(x)}} \quad (5)$$
The integration of the latent function’s posterior, necessary due to the non-linear transformation, typically requires approximation techniques to compute effectively. For further mathematical details on the implementation of GP classification, see [53].
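As an illustration, scikit-learn’s GaussianProcessClassifier implements this formulation with a Laplace approximation of the latent posterior. The RBF kernel and the placeholder data below are assumptions for the sketch, not the study’s configuration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

# Placeholder data standing in for the wind-related features and failure labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

gp = GaussianProcessClassifier(kernel=1.0 * RBF(length_scale=1.0), random_state=0)
gp.fit(X, y)
proba = gp.predict_proba(X[:5])[:, 1]  # p(y = 1 | x) from the squashed latent f(x)
```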

3.7. Ensemble Learning (RF + SVM + GP)

Ensemble models combine the predictions of multiple algorithms to enhance accuracy. By leveraging the strengths of different models, Ensemble methods improve generalization and often achieve better performance than individual models. Common Ensemble approaches include bagging, which reduces variance by training models on various data subsets, and boosting, which sequentially corrects the errors of previous models to reduce bias [55]. Stacking is another method that uses a meta-model to integrate predictions from multiple algorithms, thereby capitalizing on their combined strengths [56]. In our study, we utilized an Ensemble model comprising RF, SVM, and GP. By integrating these diverse models, our Ensemble model aims to enhance predictive accuracy and offer improved performance over individual models.
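A minimal sketch of this soft-voting combination with scikit-learn’s VotingClassifier is shown below; the hyperparameter values are placeholders rather than the tuned settings of Table 1.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.svm import SVC

# Soft voting averages the predicted class probabilities of the three models;
# SVC needs probability=True so it can contribute probability estimates.
ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("svm", SVC(kernel="rbf", C=1.0, probability=True, random_state=0)),
        ("gp", GaussianProcessClassifier(random_state=0)),
    ],
    voting="soft",
)
# Usage: ensemble.fit(X_train, y_train); ensemble.predict_proba(X_test)
```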

3.8. Deep Neural Network (DNN)

The Deep Feedforward Neural Network (DFFNN) consists of an input layer, multiple hidden layers, and an output layer [57,58]. The hidden layers play a crucial role in capturing and modeling the non-linear relationships between the input variables and the output target [59]. In a feedforward structure, each layer passes information forward to the next layer without any connections to previous layers. The DFFNN processes inputs and applies complex transformations through the hidden layers before making predictions at the output layer.
The output of each layer in the DFFNN can be represented mathematically as
$$y^{(l)} = \sigma\left(W^{(l)} \cdot y^{(l-1)} + b^{(l)}\right) \quad (6)$$
  • $y^{(l)}$ is the output of the $l$-th layer.
  • $y^{(l-1)}$ is the output of the previous layer (or the input to the current layer).
  • $W^{(l)}$ is the weight matrix for the $l$-th layer.
  • $b^{(l)}$ is the bias vector for the $l$-th layer.
  • $\sigma$ is the activation function.
In this formulation, the output of each layer $y^{(l)}$ is computed by applying the activation function $\sigma$ to the weighted sum of the outputs from the previous layer $y^{(l-1)}$, plus a bias term $b^{(l)}$. The weight matrix $W^{(l)}$ and bias vector $b^{(l)}$ are parameters learned during the training process. The choice of activation function $\sigma$ introduces non-linearity into the model, enabling it to capture complex patterns in the data.
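Equation (6) can be demonstrated directly in NumPy. This is a toy forward pass with an assumed sigmoid activation and arbitrary layer sizes, not the architecture reported in Table 2.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_forward(y_prev, W, b, activation=sigmoid):
    """One DFFNN layer per Equation (6): y(l) = sigma(W(l) y(l-1) + b(l))."""
    return activation(W @ y_prev + b)

# Toy forward pass: 3 inputs -> 4 hidden units -> 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
h = layer_forward(x, W=rng.normal(size=(4, 3)), b=np.zeros(4))      # hidden layer
y_hat = layer_forward(h, W=rng.normal(size=(1, 4)), b=np.zeros(1))  # output layer
```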

3.9. ML Model Hyperparameter Tuning

Tuning hyperparameters involves selecting the best combination of values to maximize the model’s predictive accuracy. In this study, we used grid search with five-fold stratified cross-validation to ensure robust model evaluation and prevent overfitting. Table 1 demonstrates the hyperparameters used for various models in our study, presenting the specific settings and ranges considered during the tuning process.
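A sketch of this procedure for one of the models is shown below, assuming training arrays X_train and y_train; the RF grid here is illustrative, while Table 1 lists the grids actually used.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Illustrative grid for RF; each model in Table 1 would get its own grid.
param_grid = {
    "n_estimators": [100, 200, 500],
    "max_depth": [None, 10, 20],
    "min_samples_split": [2, 5],
    "min_samples_leaf": [1, 2],
}
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(RandomForestClassifier(random_state=0),
                      param_grid, cv=cv, scoring="roc_auc")
# Usage: search.fit(X_train, y_train); search.best_params_ holds the selection.
```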

3.10. Deep Learning Model

The DFFNN consists of an input layer, multiple hidden layers, and an output layer. The hidden layers play a crucial role in capturing and modeling the non-linear relationships between the input variables and the output target.
Table 2 presents the architecture and training configuration of the DNN model, detailing the layers, activation functions, and hyperparameters used during the training process.

3.11. Cross-Validation and Final Model Evaluation

Cross-validation is an essential process in model evaluation that aims to assess the performance and generalizability of an ML model [43,60]. In this study, we employed stratified k-fold cross-validation with five folds to ensure a balanced assessment of the model’s performance. This technique involves splitting the dataset into five equally sized folds while maintaining the original class distribution within each fold. The model is then trained on $k-1$ folds and validated on the remaining fold, repeating this process five times so that each fold serves as the validation set once [60]. This process reduces the risk of overfitting by providing a more comprehensive view of the model’s expected performance on unseen data. Figure 7 represents the cross-validation process used for model evaluation on the training set.
To further enhance model accuracy, we incorporated hyperparameter tuning using grid search during the cross-validation process [61]. Grid search systematically searches through a predefined grid of hyperparameters to identify the optimal combination that maximizes model performance. This ensures that the selected hyperparameters are well suited to the data, leading to improved generalization on unseen datasets. After identifying the best hyperparameters through cross-validation, the model’s performance was evaluated on a separate validation set to confirm its generalization capability. Finally, we combined the training and validation sets to train the final model using the optimal hyperparameters. This final model was then tested on an unseen test set to provide an unbiased estimate of its predictive performance. By integrating cross-validation and grid search into the model development process, our models were carefully tuned for the task.
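Continuing the grid-search sketch above, this final-evaluation step might be outlined as follows; X_train, X_val, X_test and their labels are assumed to come from the 70/15/15 split described in Section 3.13.

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Refit the best configuration on the combined training and validation sets,
# then score once on the held-out test set for an unbiased estimate.
X_fit = np.vstack([X_train, X_val])
y_fit = np.concatenate([y_train, y_val])
final_model = search.best_estimator_.fit(X_fit, y_fit)
test_accuracy = accuracy_score(y_test, final_model.predict(X_test))
```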

3.12. Performance Evaluation Metrics

In this study, the performance evaluation was conducted using key metrics, namely accuracy, precision, recall, and F1-score, for the classification task [62,63]. These metrics assess the models’ ability to correctly classify line outages, providing a comprehensive view of their performance in terms of both overall correctness and their handling of class imbalances. Accuracy measures the overall correctness of the model as the proportion of correctly classified instances out of the total instances. The formula for accuracy is defined as follows:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (7)$$
TP denotes true positives (correctly predicted line failures), TN denotes true negatives (correctly predicted line non-failures), FP denotes false positives (incorrectly predicted line failures), and FN denotes false negatives (missed line failures). Precision assesses the exactness of the model’s positive predictions by measuring the proportion of true positive instances out of the total predicted positive instances. The formula for precision is given by
$$\text{Precision} = \frac{TP}{TP + FP} \quad (8)$$
Recall evaluates the model’s ability to identify all relevant positive instances by calculating the proportion of true positive instances out of the actual total positive instances. Recall is defined as
$$\text{Recall} = \frac{TP}{TP + FN} \quad (9)$$
The F1-score combines precision and recall to provide a balanced view of the model’s performance, particularly when there is an uneven class distribution. It is the harmonic mean of precision and recall and is calculated using the following formula:
$$\text{F1-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \quad (10)$$
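Equations (7) to (10) map directly onto scikit-learn’s metric helpers, as in the small check below; the label vectors are placeholders for illustration.

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             f1_score, precision_score, recall_score)

y_true = [0, 0, 1, 1, 1, 0, 1, 0]  # placeholder ground-truth labels
y_pred = [0, 1, 1, 1, 0, 0, 1, 0]  # placeholder model predictions

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))  # Equation (8), class 1
print("recall   :", recall_score(y_true, y_pred))     # Equation (9), class 1
print("f1-score :", f1_score(y_true, y_pred))         # Equation (10), class 1
print(classification_report(y_true, y_pred))          # per-class breakdown
```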

3.13. Experiment Setup

The dataset utilized in this study contains a diverse set of variables for analyzing and predicting line failures under varying weather conditions, particularly focusing on the effects of WS. It includes WS, measured in miles per hour, and total failures and total non-failures, which log the number of observed line failures and their complements, respectively. The failure rate is computed as the proportion of total failures to the overall number of line events, providing a direct metric of failure occurrence. In addition, the storm intensity level is derived from WS measurements, which are converted into categorical variables reflecting distinct WS intervals.
Line failure probability estimates the likelihood of a line failure based on Equation (3). Additionally, the dataset includes 1000 binomial simulation cases, providing a detailed framework for statistically validating the predictive models against a variety of simulated outcomes. Figure 8 demonstrates the feature importance in predicting line failures. WS (mph) is the most significant predictor, with the highest importance score, reflecting its critical role in influencing line failures.
In this study, we prepared our dataset for predictive modeling by partitioning it into 70% for training, 15% for validation, and 15% for testing. The dataset contains input variables such as WS in mph, total failures, total non-failures, storm intensity, line failure probability, and simulation binomial runs, which collectively provide information on the factors influencing line failures and their probabilities. Synthetic Minority Over-sampling Technique (SMOTE) [64] is applied to the training set to address class imbalance, thereby enhancing the model’s predictive performance on minority classes. Later, the features are standardized to have zero mean and unit variance, which is crucial for ensuring that all features contribute equally to the model. The standardized value z is calculated using Equation (11), and a grid of hyperparameters is defined for tuning the model in Table 1.
$$z = \frac{x - \mu}{\sigma} \quad (11)$$
where x is the original value, μ is the mean of the feature values, and σ is the standard deviation of the feature values.
The training set builds the model, while the validation set helps tune hyperparameters and prevent overfitting. We applied 5-fold cross-validation on the training set to evaluate each model’s generalization capability. Intermediate performance was evaluated on the validation set to tune the model parameters. After cross-validation, we assessed final model performance on a separate test set, which had not been used during training or validation, ensuring an unbiased evaluation of the model’s performance on unseen data. For the binary classification task, the target variable is the binomial simulation result, representing a binary outcome of line failures.
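The preparation pipeline described in this section can be sketched as follows, assuming a feature matrix X and label vector y; SMOTE comes from the imbalanced-learn package, and only the 70/15/15 proportions follow the text.

```python
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# 70% train, 15% validation, 15% test, stratified on the binary target.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=0)

# Oversample the minority class on the training set only.
X_train, y_train = SMOTE(random_state=0).fit_resample(X_train, y_train)

# Standardize per Equation (11): fit on training data, apply everywhere.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)
```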

4. Results and Discussion

4.1. Binary Classification for Line Outages

This section focuses on predicting line outages through binary classification (outage/no outage). The results and discussions on model performance and predictive accuracy are presented in this section. The model utilizes weather conditions and other relevant features to make these predictions, as discussed in Section 3.13. ML and a deep learning model were employed to perform the binary classification task, as discussed in Section 3. The performance of the models is assessed based on their ability to accurately predict failures. The difference between actual and predicted outcomes is measured using key performance indicators, including accuracy, precision, recall, F1-score, and the Area Under the ROC Curve, to provide a comprehensive evaluation of each model’s predictive capabilities.

4.2. Cross-Validation Results and Discussion

In this study, we evaluated several ML models, including a DNN, for outcome prediction on the training set using 5-fold cross-validation and ROC curve analysis, as illustrated in Figure 9A–H. The ROC curves provide a visual representation of the models’ performance, measured by the AUC. Among the models, RF (Figure 9B) and the Ensemble model (Figure 9G) achieved the highest mean AUC of 0.88. RF and the Ensemble demonstrate the ability to effectively handle the complexity of the dataset and consistently deliver strong classification performance across all validation folds. The KNN model (Figure 9A) closely followed with a mean AUC of 0.87, further validating its reliability in this context.
The GBM (Figure 9E), GP (Figure 9F), and DNN (Figure 9H) models all performed competitively, each achieving a mean AUC of 0.86. Each of these models demonstrates the ability to capture complex non-linear relationships in the data. The SVM model (Figure 9C) achieved an AUC of 0.83, showing moderate performance, though it did not outperform the Ensemble and the other previously presented approaches. In contrast, the DTs model (Figure 9D) obtained a lower mean AUC of 0.79, suggesting that simpler models might struggle to capture the underlying data patterns effectively. In addition, the standard deviation (STD) across all models was low, indicating consistent performance across different folds.

4.3. Final Model Evaluation

To evaluate the final model performance, we considered the test set results, allowing for a thorough assessment of the model’s ability to generalize to unseen data while taking into account the performance across different validation folds presented in Section 4.2. The Ensemble model achieved the highest test accuracy of 0.8200, demonstrating its ability to handle the complexity of the dataset and generalize well to unseen data. The GP and SVM models both followed closely with a test accuracy of 0.8067, demonstrating strong performance and reliable generalization to new data. Table 3 summarizes the cross-validation results on the training set and the test set accuracy for various ML models.
The DNN performed effectively, achieving a test accuracy of 0.7933, emphasizing its ability to capture complex non-linear relationships within the dataset. Note that the test accuracy performance of KNN, RF, and GBM varied across different simulations and cross-validations, reflecting some inconsistency in their ability to generalize to unseen data. These models did not outperform the Ensemble, GP, SVM, and DNN models. The DTs model showed the lowest performance, with a test accuracy of 0.6733, highlighting its difficulties in capturing more complex patterns in the data, as seen from its lower cross-validation and test set performance. To evaluate the test set performance, we also considered the ROC and AUC results to further examine the models’ ability to distinguish between classes of their predictive accuracy. Figure 10 presents a comparison of the ROC curves and AUC values for different ML models on the test set.
The GP model demonstrates the highest AUC of 0.85 on the test set in Figure 10, indicating a superior ability to distinguish between classes compared to the other models. The SVM also performed well on the test set, with an AUC of 0.84, followed closely by the Ensemble model with an AUC of 0.83. On the other hand, the DTs model shows the lowest AUC value of 0.70 on the test set, reflecting relatively poorer performance in class differentiation. Meanwhile, the remaining models, such as KNN, GBM, DNN, and RF, show competitive test set performance with AUCs ranging between 0.79 and 0.81, further demonstrating their potential effectiveness in classification tasks, though they did not outperform the GP, SVM, or Ensemble models.
To further evaluate the test set performance, we considered precision, recall, and F1-score for performance metrics, providing a more detailed assessment of the models’ classification accuracy and their ability to handle binary classes. Table 4 provides a detailed comparison of performance metrics for various ML models, specifically evaluating their effectiveness in binary classification tasks for line outages. We considered the task of predicting whether a particular power line will experience an outage (class 1) or not (class 0). The metrics included in Table 4 are precision*, recall*, and F1-score* for class 0, and precision, recall, and F1-score for class 1.
The Ensemble model demonstrated the strongest overall performance across both classes, achieving a precision of 0.74, recall of 0.80, and F1-score of 0.77 for class 1, and a precision of 0.88, recall of 0.83, and F1-score of 0.85 for class 0 on the test set. The GP and SVM models followed closely with competitive performance, both achieving a precision of 0.71, recall of 0.80, and F1-score of 0.76 for class 1, and a precision of 0.87, recall of 0.81, and F1-score of 0.84 for class 0 on the test set. These models demonstrated strong predictive performance across both classes, though they slightly lagged behind the Ensemble model.
The DNN also performed well, with solid results on the test set for both classes, though it fell behind the Ensemble, GP, and SVM models, particularly for class 1. The GBM and KNN models displayed moderate performance; however, their test set accuracy was lower compared to the top-performing models, particularly for class 1. Lastly, the RF and DTs models presented the weakest performance, struggling to generalize effectively on the test set, especially for class 1, with the DTs model showing the lowest scores on the test set.

5. Conclusions

In this study, we evaluated the performance of various ML models in predicting the binary classification of line failures. Our analysis included KNN, RF, SVM, DTs, GBM, GP, and the Ensemble model, which combined RF, SVM, and GP. Additionally, we examined the DNN as a separate model. According to the results of this study on the test set, the Ensemble model outperformed all other models, demonstrating superior performance across key metrics. In terms of ROC and AUC, GP achieved the highest scores, followed by the Ensemble and SVM models. When considering precision, recall, and F1-score, the Ensemble model performed best, with GP and SVM showing similar performance in these metrics after the Ensemble model. The DNN also delivered strong results, performing well after these top models.
On the other hand, KNN, RF, and GBM demonstrated inconsistent performance, making it challenging to generalize their prediction accuracy. Lastly, the DTs model was the least effective. Additionally, based on cross-validation results, the Ensemble model and RF performed the best on the training set in terms of ROC and AUC, while DTs remained the least effective across training set evaluations.
Future work could explore the integration of additional environmental factors and the application of these models in real-time predictive systems. Investigating the use of transfer learning and time series-based advanced deep learning methods may also provide significant improvements. Additionally, methods to quantify and manage prediction uncertainties could be incorporated to provide more reliable prediction. Furthermore, cluster analysis presents another area of interest for future study, as it could provide additional information to further enhance predictive accuracy. Finally, studies should explore how predictive models can contribute to enhancing the resilience of power systems against extreme weather events and other disruptions.

Author Contributions

Conceptualization, A.U. and M.P.; methodology, A.U. and M.P.; software, A.U.; validation, A.U.; formal analysis, A.U.; investigation, A.U.; resources, A.U.; data curation, A.U.; writing—original draft preparation, A.U.; writing—review and editing, A.U. and M.P.; visualization, A.U.; supervision, M.P.; project administration, M.P. All authors have read and agreed to the published version of the manuscript.

Funding

This project received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are contained within the article.

Acknowledgments

The authors would like to thank the University of Connecticut and the Eversource Energy Center.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
KNN	K-Nearest Neighbors
RF	Random Forest
SVM	Support Vector Machine
DTs	Decision Trees
GBM	Gradient Boosting Machine
GP	Gaussian Process
DNN	Deep Neural Network
DFFNN	Deep Feedforward Neural Network
DERs	Distributed Energy Resources
RE	Renewable Energy
MC	Monte Carlo
ROC	Receiver Operating Characteristic
AUC	Area Under the Curve
SMOTE	Synthetic Minority Over-sampling Technique
ML	Machine Learning
TP	True Positives
TN	True Negatives
FP	False Positives
FN	False Negatives
CDF	Cumulative Distribution Function
WS	Wind Speed
WP	Wind Power

References

  1. Akdemir, K.Z.; Kern, J.D.; Lamontagne, J. Assessing risks for New England’s wholesale electricity market from wind power losses during extreme winter storms. Energy 2022, 251, 123886. [Google Scholar] [CrossRef]
  2. Mujjuni, F.; Betts, T.R.; Blanchard, R.E. Evaluation of Power Systems Resilience to Extreme Weather Events: A Review of Methods and Assumptions. IEEE Access 2023, 11, 87279–87296. [Google Scholar] [CrossRef]
  3. Panteli, M.; Trakas, D.N.; Mancarella, P.; Hatziargyriou, N.D. Power systems resilience assessment: Hardening and smart operational enhancement strategies. Proc. IEEE 2017, 105, 1202–1213. [Google Scholar] [CrossRef]
  4. Wanik, D.W.; Parent, J.; Anagnostou, E.; Hartman, B. Using vegetation management and LiDAR-derived tree height data to improve outage predictions for electric utilities. Electr. Power Syst. Res. 2017, 146, 236–245. [Google Scholar] [CrossRef]
  5. Liu, W.; Ding, F.; Zhao, C. Dynamic restoration strategy for distribution system resilience enhancement. In Proceedings of the 2020 IEEE Power & Energy Society Innovative Smart Grid Technologies Conference (ISGT), Washington, DC, USA, 17–20 February 2020; pp. 1–5. [Google Scholar]
  6. Xu, L.; Feng, K.; Lin, N.; Perera, A.; Poor, H.V.; Xie, L.; Ji, C.; Sun, X.A.; Guo, Q.; O’Malley, M. Resilience of renewable power systems under climate risks. Nat. Rev. Electr. Eng. 2024, 1, 53–66. [Google Scholar] [CrossRef]
  7. Unlu, A.; Dorado-Rojas, S.A.; Pena, M.; Wang, Z. Weather-Informed Forecasting for Time Series Optimal Power Flow of Transmission Systems with Large Renewable Share. IEEE Access 2024, 12, 92652–92662. [Google Scholar] [CrossRef]
  8. Jawad, M.; Nadeem, M.S.A.; Shim, S.O.; Khan, I.R.; Shaheen, A.; Habib, N.; Hussain, L.; Aziz, W. Machine learning based cost effective electricity load forecasting model using correlated meteorological parameters. IEEE Access 2020, 8, 146847–146864. [Google Scholar] [CrossRef]
  9. Tsioumpri, E.; Stephen, B.; McArthur, S.D. Weather related fault prediction in minimally monitored distribution networks. Energies 2021, 14, 2053. [Google Scholar] [CrossRef]
  10. Wanik, D.; Anagnostou, E.; Hartman, B.; Frediani, M.; Astitha, M. Storm outage modeling for an electric distribution network in Northeastern USA. Nat. Hazards 2015, 79, 1359–1384. [Google Scholar] [CrossRef]
  11. Zenkner, G.; Navarro-Martinez, S. A flexible and lightweight deep learning weather forecasting model. Appl. Intell. 2023, 53, 24991–25002. [Google Scholar] [CrossRef]
  12. Xie, J.; Alvarez-Fernandez, I.; Sun, W. A review of machine learning applications in power system resilience. In Proceedings of the 2020 IEEE Power & Energy Society General Meeting (PESGM), Montreal, QC, Canada, 2–6 August 2020; pp. 1–5. [Google Scholar]
  13. Hou, G.; Muraleetharan, K.K. Modeling the Resilience of Power Distribution Systems Subjected to Extreme Winds Considering Tree Failures: An Integrated Framework. Int. J. Disaster Risk Sci. 2023, 14, 194–208. [Google Scholar] [CrossRef]
  14. Barnes, A.; Nagarajan, H.; Yamangil, E.; Bent, R.; Backhaus, S. Resilient design of large-scale distribution feeders with networked microgrids. Electr. Power Syst. Res. 2019, 171, 150–157. [Google Scholar] [CrossRef]
  15. Liu, H.; Wang, C.; Ju, P.; Li, H. A sequentially preventive model enhancing power system resilience against extreme-weather-triggered failures. Renew. Sustain. Energy Rev. 2022, 156, 111945. [Google Scholar] [CrossRef]
  16. Zhu, C.; Yang, Q.; Wang, D.; Huang, G.; Liang, S. Fragility Analysis of Transmission Towers Subjected to Downburst Winds. Appl. Sci. 2023, 13, 9167. [Google Scholar] [CrossRef]
  17. Ma, S.; Chen, B.; Wang, Z. Resilience enhancement strategy for distribution systems under extreme weather events. IEEE Trans. Smart Grid 2016, 9, 1442–1451. [Google Scholar] [CrossRef]
  18. Sang, Y.; Xue, J.; Sahraei-Ardakani, M.; Ou, G. An integrated preventive operation framework for power systems during hurricanes. IEEE Syst. J. 2019, 14, 3245–3255. [Google Scholar] [CrossRef]
  19. Omogoye, O.S.; Folly, K.A.; Awodele, K.O. Enhancing the distribution power system resilience against hurricane events using a bayesian network line outage prediction model. J. Eng. 2021, 2021, 731–744. [Google Scholar] [CrossRef]
  20. Kebede, F.S.; Olivier, J.C.; Bourguet, S.; Machmoum, M. Reliability evaluation of renewable power systems through distribution network power outage modelling. Energies 2021, 14, 3225. [Google Scholar] [CrossRef]
  21. Jonathan, P.; Randell, D.; Wadsworth, J.; Tawn, J. Uncertainties in return values from extreme value analysis of peaks over threshold using the generalised Pareto distribution. Ocean. Eng. 2021, 220, 107725. [Google Scholar] [CrossRef]
  22. Hu, X.; Fang, G.; Yang, J.; Zhao, L.; Ge, Y. Simplified models for uncertainty quantification of extreme events using Monte Carlo technique. Reliab. Eng. Syst. Saf. 2023, 230, 108935. [Google Scholar] [CrossRef]
  23. Guikema, S.D.; Nateghi, R.; Quiring, S.M.; Staid, A.; Reilly, A.C.; Gao, M. Predicting hurricane power outages to support storm response planning. IEEE Access 2014, 2, 1364–1373. [Google Scholar] [CrossRef]
  24. Cerrai, D.; Wanik, D.W.; Bhuiyan, M.A.E.; Zhang, X.; Yang, J.; Frediani, M.E.; Anagnostou, E.N. Predicting storm outages through new representations of weather and vegetation. IEEE Access 2019, 7, 29639–29654. [Google Scholar] [CrossRef]
  25. Yang, F.; Watson, P.; Koukoula, M.; Anagnostou, E.N. Enhancing weather-related power outage prediction by event severity classification. IEEE Access 2020, 8, 60029–60042. [Google Scholar] [CrossRef]
  26. Toubeau, J.F.; Pardoen, L.; Hubert, L.; Marenne, N.; Sprooten, J.; De Grève, Z.; Vallée, F. Machine learning-assisted outage planning for maintenance activities in power systems with renewables. Energy 2022, 238, 121993. [Google Scholar] [CrossRef]
  27. AlHaddad, U.; Basuhail, A.; Khemakhem, M.; Eassa, F.E.; Jambi, K. Towards sustainable energy grids: A machine learning-based ensemble methods approach for outages estimation in extreme weather events. Sustainability 2023, 15, 12622. [Google Scholar] [CrossRef]
  28. Atrigna, M.; Buonanno, A.; Carli, R.; Cavone, G.; Scarabaggio, P.; Valenti, M.; Graditi, G.; Dotoli, M. A machine learning approach to fault prediction of power distribution grids under heatwaves. IEEE Trans. Ind. Appl. 2023, 59, 4835–4845. [Google Scholar] [CrossRef]
  29. Tervo, R.; Láng, I.; Jung, A.; Mäkelä, A. Predicting power outages caused by extratropical storms. Nat. Hazards Earth Syst. Sci. Discuss. 2020, 2020, 1–26. [Google Scholar] [CrossRef]
  30. Li, B.; Chen, Y.; Huang, S.; Guan, H.; Xiong, Y.; Mei, S. A Bayesian network model for predicting outages of distribution system caused by hurricanes. In Proceedings of the 2020 IEEE Power & Energy Society General Meeting (PESGM), Montreal, QC, Canada, 2–6 August 2020; pp. 1–5. [Google Scholar]
  31. Melagoda, A.; Karunarathna, T.; Nisaharan, G.; Amarasinghe, P.; Abeygunawardane, S. Application of machine learning algorithms for predicting vegetation related outages in power distribution systems. In Proceedings of the 2021 3rd International Conference on Electrical Engineering (EECon), Colombo, Sri Lanka, 24 September 2021; pp. 25–30. [Google Scholar]
  32. Jeong, S.H.; Elnashai, A.S. Probabilistic fragility analysis parameterized by fundamental response quantities. Eng. Struct. 2007, 29, 1238–1251. [Google Scholar] [CrossRef]
  33. Serrano-Fontova, A.; Li, H.; Liao, Z.; Jamieson, M.R.; Serrano, R.; Parisio, A.; Panteli, M. A comprehensive review and comparison of the fragility curves used for resilience assessments in power systems. IEEE Access 2023, 11, 108050–108067. [Google Scholar] [CrossRef]
34. Murray, K.; Bell, K. Wind related faults on the GB transmission network. In Proceedings of the Probabilistic Methods Applied to Power Systems (PMAPS), Durham, UK, 7–10 July 2014; pp. 1–6.
35. Dunn, S.; Wilkinson, S.; Alderson, D.; Fowler, H.; Galasso, C. Fragility curves for assessing the resilience of electricity networks constructed from an extensive fault database. Nat. Hazards Rev. 2018, 19, 04017019.
36. Zhu, X.; Ou, G.; Jafarishiadeh, F.; Sahraei-Ardakani, M. A data generation engine and workflow for power network damage and loss estimation under hurricane. In Proceedings of the 2022 North American Power Symposium (NAPS), Salt Lake City, UT, USA, 9–11 October 2022; pp. 1–5.
37. Hughes, W. Integrating Physics-Based and Data-Driven Models for Community Resilience Assessment Under Wind Storms. Ph.D. Thesis, University of Connecticut, Storrs, CT, USA, 2023.
38. Gupta, A.K.; Verma, K. Assessment of infrastructural and operational resilience of transmission lines during dynamic meteorological hazard. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 6–8 July 2023; pp. 1–5.
39. Raj, S.V.; Kumar, M.; Bhatia, U. Fragility curves for power transmission towers in Odisha, India, based on observed damage during 2019 Cyclone Fani. arXiv 2021, arXiv:2107.06072.
40. Bhattacharya, P.; Bhattacharjee, R. A study on Weibull distribution for estimating the parameters. Wind Eng. 2009, 33, 469–476.
41. Panteli, M.; Pickering, C.; Wilkinson, S.; Dawson, R.; Mancarella, P. Power system resilience to extreme weather: Fragility modeling, probabilistic impact assessment, and adaptation measures. IEEE Trans. Power Syst. 2016, 32, 3747–3757.
42. Papoulis, A. Probability and Statistics; Prentice-Hall: Englewood Cliffs, NJ, USA, 1990.
43. Guo, G.; Wang, H.; Bell, D.; Bi, Y.; Greer, K. KNN model-based approach in classification. In Proceedings of On the Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE (OTM Confederated International Conferences), Catania, Sicily, Italy, 3–7 November 2003; Springer: Berlin/Heidelberg, Germany, 2003; pp. 986–996.
44. Wazirali, R. An improved intrusion detection system based on KNN hyperparameter tuning and cross-validation. Arab. J. Sci. Eng. 2020, 45, 10859–10873.
45. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
46. Contreras, P.; Orellana-Alvear, J.; Muñoz, P.; Bendix, J.; Célleri, R. Influence of random forest hyperparameterization on short-term runoff forecasting in an Andean mountain catchment. Atmosphere 2021, 12, 238.
47. Deisenroth, M.P.; Faisal, A.A.; Ong, C.S. Mathematics for Machine Learning; Cambridge University Press: Cambridge, UK, 2020.
48. Sun, J.; Zheng, C.; Li, X.; Zhou, Y. Analysis of the distance between two classes for tuning SVM hyperparameters. IEEE Trans. Neural Netw. 2010, 21, 305–318.
49. Kotsiantis, S.B. Decision trees: A recent overview. Artif. Intell. Rev. 2013, 39, 261–283.
50. Mantovani, R.G.; Horváth, T.; Cerri, R.; Barbon Junior, S.; Vanschoren, J.; de Carvalho, A.C.P.L.F. An empirical study on hyperparameter tuning of decision trees. arXiv 2018, arXiv:1812.02207.
51. Bentéjac, C.; Csörgő, A.; Martínez-Muñoz, G. A comparative analysis of gradient boosting algorithms. Artif. Intell. Rev. 2021, 54, 1937–1967.
52. van Hoof, J.; Vanschoren, J. Hyperboost: Hyperparameter optimization by gradient boosting surrogate models. arXiv 2021, arXiv:2101.02289.
53. Rasmussen, C.E.; Williams, C.K.I. Gaussian Processes for Machine Learning; MIT Press: Cambridge, MA, USA, 2006.
54. Nickisch, H.; Rasmussen, C.E. Approximations for binary Gaussian process classification. J. Mach. Learn. Res. 2008, 9, 2035–2078.
55. Bauer, E.; Kohavi, R. An empirical comparison of voting classification algorithms: Bagging, boosting, and variants. Mach. Learn. 1999, 36, 105–139.
56. Cawood, P.; Van Zyl, T. Evaluating state-of-the-art, forecasting ensembles and meta-learning strategies for model fusion. Forecasting 2022, 4, 732–751.
57. Montavon, G.; Samek, W.; Müller, K.R. Methods for interpreting and understanding deep neural networks. Digit. Signal Process. 2018, 73, 1–15.
58. Sze, V.; Chen, Y.H.; Yang, T.J.; Emer, J.S. Efficient processing of deep neural networks: A tutorial and survey. Proc. IEEE 2017, 105, 2295–2329.
59. Unlu, A.; Peña, M. Combined MIMO deep learning method for ACOPF with high wind power integration. Energies 2024, 17, 796.
60. Berrar, D. Cross-validation. In Encyclopedia of Bioinformatics and Computational Biology, 1st ed.; Ranganathan, S., Gribskov, M., Nakai, K., Schönbach, C., Eds.; Elsevier: Amsterdam, The Netherlands, 2018; pp. 542–545.
61. Yan, T.; Shen, S.L.; Zhou, A.; Chen, X. Prediction of geological characteristics from shield operational parameters by integrating grid search and K-fold cross validation into stacking classification algorithm. J. Rock Mech. Geotech. Eng. 2022, 14, 1292–1303.
62. Goutte, C.; Gaussier, E. A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In Proceedings of the European Conference on Information Retrieval; Springer: Berlin/Heidelberg, Germany, 2005; pp. 345–359.
63. Wardhani, N.W.S.; Rochayani, M.Y.; Iriany, A.; Sulistyono, A.D.; Lestantyo, P. Cross-validation metrics for evaluating classification performance on imbalanced data. In Proceedings of the 2019 International Conference on Computer, Control, Informatics and Its Applications (IC3INA), Tangerang, Indonesia, 23–24 October 2019; pp. 14–18.
64. Zhu, T.; Lin, Y.; Liu, Y. Synthetic minority oversampling technique for multiclass imbalance problems. Pattern Recognit. 2017, 72, 327–340.
Figure 1. A detailed diagram illustrating the end-to-end process from data collection and preprocessing, including historical data generation (fragility curve, weather information, and scenario generation), to model training, validation, and testing, with classifiers and hyperparameter tuning steps defined. Sections 2 and 3 correspond to the respective stages.
Figure 2. Fragility curves for power line damage states under different conditions.
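For readers who want to experiment with fragility curves like those in Figure 2, the sketch below assumes the lognormal CDF form that is common in the resilience literature (e.g., [35,41]); the median wind capacities and the dispersion beta are hypothetical illustration values, not the paper's fitted parameters.

```python
import numpy as np
from scipy.stats import norm

def fragility(wind_speed, median_capacity, beta=0.4):
    """P(reaching a damage state | wind speed), lognormal CDF form (assumed)."""
    return norm.cdf(np.log(wind_speed / median_capacity) / beta)

v = np.linspace(5, 80, 200)                      # wind speed in m/s
p_minor = fragility(v, median_capacity=35.0)     # hypothetical "minor damage" state
p_severe = fragility(v, median_capacity=55.0)    # hypothetical "severe damage" state
print(f"P(minor damage | 50 m/s) = {fragility(50.0, 35.0):.2f}")
```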
Figure 3. Frequency distribution of wind speeds.
Figure 4. Relationship between wind speed and total line failures.
Figure 5. Relationship between wind speed and line failure probabilities.
Figure 6. Relationship between wind speed and line failure states.
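Figures 3–6 together suggest a simple scenario-generation loop: sample wind speeds from a Weibull distribution (cf. [40]), map them to failure probabilities through a fragility curve, and draw binary line failure states. The sketch below follows that reading; the Weibull shape/scale and the fragility parameters are assumptions for illustration only.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Sample wind speeds from a Weibull distribution (shape = 2, scale = 15 m/s assumed).
wind = 15.0 * rng.weibull(2.0, size=10_000)

# Map wind speed to failure probability via an assumed lognormal fragility curve.
p_fail = norm.cdf(np.log(np.clip(wind, 1e-6, None) / 45.0) / 0.4)

# Draw binary failure states, mirroring the failure/non-failure labels of the study.
failed = rng.random(wind.size) < p_fail
print(f"Simulated line failure rate: {failed.mean():.2%}")
```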
Figure 7. Cross-validation process for evaluating model performance.
Figure 8. Feature importance for the ML models.
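As a companion to Figure 8, the snippet below shows one common way to obtain such importances, assuming a scikit-learn Random Forest; the dataset and the feature names are hypothetical stand-ins, not the paper's inputs.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the outage dataset; feature names are hypothetical.
X, y = make_classification(n_samples=500, n_features=4, random_state=0)
rf = RandomForestClassifier(random_state=0).fit(X, y)

for name, imp in zip(["wind_speed", "gust_duration", "line_age", "span_length"],
                     rf.feature_importances_):
    print(f"{name}: {imp:.3f}")   # impurity-based importance per feature
```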
Figure 9. Comparison of ROC curves for various ML models based on 5-fold cross-validation. Each subfigure (A–H) represents the model’s performance in terms of true positive rate vs. false positive rate across multiple validation folds, with the mean Area Under the Curve (AUC) and standard deviation (STD) values indicated.
Figure 10. Comparison of ROC curves and AUC across different ML models on the test set.
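A minimal way to reproduce the style of Figures 9 and 10, assuming scikit-learn classifiers with probability outputs; the synthetic data below stands in for the paper's held-out test set, and only two of the eight models are shown for brevity.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the failure/non-failure dataset.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"RF": RandomForestClassifier(random_state=0),
          "SVM": SVC(probability=True, random_state=0)}

for name, clf in models.items():
    clf.fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]        # probability of the failure class
    fpr, tpr, _ = roc_curve(y_te, scores)
    plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.2f})")

plt.plot([0, 1], [0, 1], "k--")                   # chance diagonal
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```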
Table 1. Detailed hyperparameter configurations for optimizing ML models.

Model    | Parameter             | Values
KNN      | number of neighbors   | 5, 10, 15
         | weights               | uniform, distance
         | p                     | 1, 2
RF       | number of estimators  | 100, 200
         | max. depth            | 10, 20
         | min. samples split    | 2, 5
         | min. samples leaf     | 1, 2
SVM      | C                     | 0.1, 1, 10, 100
         | gamma                 | 1, 0.1, 0.01, 0.001
         | kernel                | rbf
DT       | max. depth            | 10, 20, 30, 40, 50
         | min. samples split    | 2, 5, 10
         | min. samples leaf     | 1, 2, 4
GBM      | number of estimators  | 100, 200, 300
         | learning rate         | 0.01, 0.1, 0.2
         | max. depth            | 3, 4, 5
GP       | kernel constant value | 0.1, 1, 10
         | kernel length scale   | 0.1, 1, 10
Ensemble | RF hyperparameters    | best hyperparameters found
         | SVM hyperparameters   | best hyperparameters found
         | GP hyperparameters    | best hyperparameters found
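To make Table 1 concrete, the sketch below runs the SVM grid from the table through scikit-learn's GridSearchCV and then combines tuned base learners into the RF + SVM + GP ensemble. The paper does not state the fusion rule, so soft voting is an assumption here, and the synthetic X, y stand in for the training data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in for the training features and failure labels.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# SVM grid taken directly from Table 1, searched with 5-fold CV.
svm_grid = {"C": [0.1, 1, 10, 100],
            "gamma": [1, 0.1, 0.01, 0.001],
            "kernel": ["rbf"]}
svm_search = GridSearchCV(SVC(probability=True), svm_grid, cv=5, scoring="accuracy")
svm_search.fit(X, y)

# RF + SVM + GP ensemble; soft voting (probability averaging) is assumed,
# with RF and GP shown at default settings for brevity.
ensemble = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("svm", svm_search.best_estimator_),
                ("gp", GaussianProcessClassifier(random_state=0))],
    voting="soft").fit(X, y)
print(svm_search.best_params_)
```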
Table 2. DNN model architecture and training configuration.

Type           | Description
Dense Layer    | 256 neurons, ReLU activation
Dropout Layer  | Dropout rate: 0.2
Dense Layer    | 128 neurons, ReLU activation
Dropout Layer  | Dropout rate: 0.2
Dense Layer    | 64 neurons, ReLU activation
Dropout Layer  | Dropout rate: 0.2
Dense Layer    | 32 neurons, ReLU activation
Output Layer   | 1 neuron, sigmoid activation
Optimizer      | Adam optimizer, learning rate = 0.001
Loss Function  | Binary cross-entropy
Epochs         | 100
Early Stopping | Applied
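The architecture in Table 2 maps almost line by line onto a Keras model. The sketch below assumes the Keras API, an input dimension of 8 (a placeholder for the actual feature count), and an early-stopping patience of 10 (not specified in the table).

```python
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential([
    layers.Input(shape=(8,)),                    # placeholder feature dimension
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),       # binary failure / non-failure output
])

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss="binary_crossentropy",
              metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True)
# Training call, with X_train / y_train standing in for the prepared dataset:
# model.fit(X_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stop])
```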
Table 3. Cross-validation (CV) and test set accuracy for various ML models.

Model    | Mean Accuracy (CV) | Std (CV) | Test Accuracy
KNN      | 0.8128             | 0.0329   | 0.7600
RF       | 0.8204             | 0.0266   | 0.7300
SVM      | 0.8096             | 0.0260   | 0.8067
DT       | 0.7792             | 0.0179   | 0.6733
GBM      | 0.7998             | 0.0125   | 0.7533
GP       | 0.8063             | 0.0299   | 0.8067
Ensemble | 0.8106             | 0.0232   | 0.8200
DNN      | 0.8020             | 0.0297   | 0.7933
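The CV columns of Table 3 can be reproduced for any scikit-learn estimator with cross_val_score; the sketch below is a minimal illustration on synthetic data, not the paper's dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the outage dataset.
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0)
scores = cross_val_score(clf, X_tr, y_tr, cv=5, scoring="accuracy")  # 5-fold CV
print(f"Mean accuracy (CV): {scores.mean():.4f}  Std (CV): {scores.std():.4f}")
print(f"Test accuracy: {clf.fit(X_tr, y_tr).score(X_te, y_te):.4f}")
```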
Table 4. Performance metrics for various ML models.

Model    | Precision * | Recall * | F1-Score * | Precision | Recall | F1-Score
KNN      | 0.80        | 0.82     | 0.81       | 0.69      | 0.66   | 0.67
RF       | 0.78        | 0.80     | 0.79       | 0.65      | 0.62   | 0.64
SVM      | 0.87        | 0.81     | 0.84       | 0.71      | 0.80   | 0.76
DT       | 0.74        | 0.74     | 0.74       | 0.56      | 0.55   | 0.56
GBM      | 0.82        | 0.78     | 0.80       | 0.66      | 0.71   | 0.68
GP       | 0.87        | 0.81     | 0.84       | 0.71      | 0.80   | 0.76
Ensemble | 0.88        | 0.83     | 0.85       | 0.74      | 0.80   | 0.77
DNN      | 0.83        | 0.84     | 0.84       | 0.73      | 0.71   | 0.72
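Per-class precision, recall, and F1 values like those in Table 4 come directly out of scikit-learn's classification_report; the labels below are hypothetical placeholders for the true and predicted failure states.

```python
from sklearn.metrics import classification_report

# Hypothetical ground truth and predictions (1 = line failure, 0 = non-failure).
y_true = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]

print(classification_report(y_true, y_pred,
                            target_names=["non-failure", "failure"]))
```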