1. Introduction
Maintaining a continuous and reliable power supply to consumer loads hinges on ensuring the operational stability of the power grid. Modern power systems, as shown in
Figure 1, are intricate networks vulnerable to a variety of factors, including time-varying loads, Renewable Energy Sources (RESs), equipment failures, and unexpected events such as natural disasters. These factors can cause significant variations in power generation and consumption, leading to instability in power systems [
1]. Early investigations developed several methodologies for evaluating the voltage stability (VS) of the power system. The analytical method [
2,
3,
4], the honey badger algorithm [
5,
6] for distributed generation (DG) allocation and sizing, the continuation power flow method [
7], the singular value decomposition method [
8], and predictive control [
9] are some of these methods. In recent years, deep learning (DL) and machine learning (ML) methods have shown great potential in addressing real-world applications such as natural language processing, speech recognition, autonomous vehicles, and image and vision processing, as well as electric load and energy prediction [
These techniques have also been applied in electric power systems for Volt/VAR control (VVC) in microgrid integration, smart inverters, microgrid energy management, solar photovoltaic power forecasting, fault diagnosis for the active distribution grid, VS forecasting of the power system [
11], and overall stability and security of power systems [
10]. In [
12,
13], we used deep reinforcement learning (DRL) techniques to optimize VVC in active distribution networks (DN), addressing issues such as power loss and voltage changes on the bus. In [
14], we created a DRL model using a soft actor–critic algorithm to enhance the VVC of smart solar photovoltaic inverters based on droop operation. The proposed multiagent DRL method coordinates optimal VVC for solar photovoltaic inverters and static var compensators, maximizing nodal voltage in distributed networks. In [
15], we used a deep Q network and a deep deterministic policy gradient method to stabilize the voltage of a 200-bus power system across various operating conditions, aiming to improve both the efficiency of training and the precision of the proposed algorithm.
The problem of the frequency stability of electrical systems, addressed with ML and DL algorithms, has garnered considerable interest [
16,
17,
18,
19]. In [
19], a hybrid technique was introduced that combines fuzzy logic, deep learning, and a grey wolf optimizer based on ordered position to enhance microgrid frequency stability. In [
20], a K-vector nearest-neighbor algorithm was described for dynamic frequency control, accounting for changes in wind turbine speed and varying loads over time. In [
21,
22,
23,
24,
25], we employed a spiking neural network to tune the coefficients of a nonlinear integral backstepping controller and address frequency deviations in an autonomous microgrid caused by the intermittent nature of RESs, such as solar and wind energy. In the present work, the long short-term memory (LSTM) network is adopted. A primary aim of this network is to eliminate vanishing gradients, a common issue in regular recurrent neural networks (RNNs) that can hinder the capture of long-term dependencies within datasets. With its gating mechanism, the LSTM allows for the selective retention or discarding of information from previous time steps, making it well suited for modeling sequences with persistent dependencies. The remarkable precision of the LSTM classifier suggests that the sequential structure of the data in this dataset plays a significant role in the prediction of the target variable. We propose using LSTM to forecast the VS of the IEEE 33-bus system. The LSTM model can determine the stability or instability of the power system based on historical inputs, such as load demand, generation capacity, and line parameters. Over the years, LSTM has become a popular choice for addressing various real-world problems due to its efficiency in handling both linear and non-linear data patterns.
This article proposes an LSTM network to forecast the VS of a 33-bus power network. The LSTM technique was evaluated by comparing its performance in forecasting VS with that of other methods, including the support vector machine (SVM), Naive Bayes (NB), and convolutional neural network (CNN). The major contributions of this paper are as follows:
Designing a model that can predict electrical grid voltage stability by combining deep learning and machine learning techniques. The proposed LSTM technique for VS assessment outperforms conventional models that rely on shallow machine learning methods, such as SVM, NB, and CNN, in terms of accuracy and response time in simulations run on the IEEE 33-bus scheme.
Validating that the LSTM-based voltage stability evaluation method is more accurate and has a faster response time compared to conventional techniques based on traditional machine learning, such as artificial neural networks (ANNs), CNN, SVM, and decision trees (DTs), as demonstrated by the simulation outcomes of the IEEE 33-bus scheme.
Enhancing the performance of each model through hyper-tuning of the parameters.
The remaining parts of this research are organized as follows:
Section 2 describes the methods used to resolve the VS assessment issue in the electrical power system.
Section 3 presents the analysis and discussion of the findings.
Section 4 concludes with a summary of the findings and proposals for future research.
2. The Proposed System Methodology
In this system, a constant voltage source represents an infinite bus, and the line and load impedances are denoted by the symbols Z and Z_L, respectively. Until the maximum power transmission is reached, more power can be delivered to the load by lowering the load impedance Z_L. The voltage drop then grows more noticeable as Z_L is decreased, which, in turn, increases the current drawn by the load. Beyond the maximum power point, further decreases in Z_L result in less power being supplied to the load.
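To make the impedance argument above concrete, the following minimal sketch computes the active power delivered to the load as its impedance magnitude is lowered. The per-unit values E and Z and the load angle phi are illustrative assumptions, not parameters from the paper; the point is only that delivered power peaks when |Z_L| equals |Z| and falls thereafter:

```python
import numpy as np

# Two-bus system: constant source E behind line impedance Z feeding load Z_L.
# Illustrative per-unit values (not from the paper).
E = 1.0
Z = 0.1 + 0.4j

def load_power(ZL_mag, phi=0.3):
    """Active power delivered to a load of magnitude ZL_mag and angle phi (rad)."""
    ZL = ZL_mag * np.exp(1j * phi)
    I = E / (Z + ZL)               # line current
    V = I * ZL                     # load voltage
    return (V * np.conj(I)).real   # active power into the load

mags = np.linspace(0.05, 2.0, 400)
P = np.array([load_power(m) for m in mags])
best = mags[P.argmax()]
print(f"P peaks at |Z_L| = {best:.3f} (|Z| = {abs(Z):.3f})")
```

For a load of fixed angle, maximum power transfer occurs at |Z_L| = |Z|; pushing |Z_L| below that point reduces the delivered power, which is exactly the behavior traced by the PV curve discussed next.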
The PV curve represents the process illustrated in
Figure 2. At this moment, the load is receiving active power, denoted as P_r. In the following, V_r and V_s are the voltages at the receiving and sending ends; P_r and P_s are the active power values at the receiving and sending ends; S_r and S_s are the apparent powers at the receiving and sending ends; Q_r and Q_s are the reactive powers at the receiving and sending ends of the system, respectively; Y is the admittance; R is the resistance; X is the reactance; Z is the impedance; θ is the impedance angle of the lines; and I is the current passing through the line.
If the line admittance is written as Y = 1/Z, the real and reactive power at the receiving end can be expressed as
P_r = (V_s V_r / Z) cos(θ − δ) − (V_r² / Z) cos θ,  Q_r = (V_s V_r / Z) sin(θ − δ) − (V_r² / Z) sin θ, (1)
where δ is the voltage angle difference between the sending and receiving ends.
A classical load model, Equation (2), is included in the system model of Equation (3) with the continuation power flow (CPF) approach [
16]:
P_L = P_L0 (1 + λ K_P),  Q_L = Q_L0 (1 + λ K_Q), (2)
where P_L0 and Q_L0 represent the base powers provided to the load bus; the parameter λ is a scalar that defines the system's loading level; and K_P and K_Q represent the rates at which the load bus's active and reactive powers change, respectively. The power flow equations at bus i are
P_i = V_i Σ_j V_j (G_ij cos δ_ij + B_ij sin δ_ij),  Q_i = V_i Σ_j V_j (G_ij sin δ_ij − B_ij cos δ_ij), (3)
where the voltage at bus i is represented as V_i; the voltage angle difference between nodes i and j is denoted as δ_ij; and the real and imaginary parts of the (i, j) element of the admittance matrix of the system are represented by G_ij and B_ij, respectively. Variations in active and reactive power as the parameter λ changes establish the direction of change in active and reactive power. The CPF approach raises the λ value gradually until the maximum load limit is reached, increasing the system load. P_L and Q_L are equal at the bifurcation node to their maximum values,
P_max = P_L0 (1 + λ_max K_P),  Q_max = Q_L0 (1 + λ_max K_Q), (4)
where φ is the power factor angle and tan φ = Q_L0/P_L0 represents the constant power factor of the PV curves.
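The CPF idea of ramping the loading parameter λ until the power flow ceases to have a solution can be mimicked on a simple two-bus system, for which the load voltage has a closed form. The source voltage, line parameters, and base load below are illustrative assumptions, not parameters of the IEEE 33-bus case:

```python
import numpy as np

# Two-bus PV curve: source Vs feeding a load P + jQ through a line R + jX.
# Illustrative per-unit values (not from the paper).
Vs, R, X = 1.0, 0.1, 0.4

def receiving_voltage(P, Q):
    """Upper-branch load voltage magnitude, or None past the nose of the PV curve."""
    # Exact two-bus relation: V^4 + (2(PR + QX) - Vs^2) V^2 + (P^2 + Q^2)(R^2 + X^2) = 0
    b = 2 * (P * R + Q * X) - Vs**2
    c = (P**2 + Q**2) * (R**2 + X**2)
    disc = b * b - 4 * c
    if disc < 0:
        return None                      # no real solution: voltage collapse
    return np.sqrt((-b + np.sqrt(disc)) / 2)

# CPF-style loading at constant power factor: P = lam * P0, Q = lam * Q0.
P0, Q0 = 0.2, 0.1
lam, step = 0.0, 0.01
while receiving_voltage((lam + step) * P0, (lam + step) * Q0) is not None:
    lam += step
print(f"maximum loading lambda = {lam:.2f}, "
      f"nose voltage = {receiving_voltage(lam * P0, lam * Q0):.3f}")
```

Stepping λ until the discriminant turns negative locates the nose (bifurcation) point of the PV curve, which is the maximum loading limit the CPF method identifies.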
The importance of implementing artificial intelligence strategies in the electric power system is growing, driven by their effectiveness across various domains, including load prediction, power system assessment, fault detection, and more. To implement the required processing steps and create visual representations, three distinct machine learning frameworks are used: Scikit-learn, TensorFlow, and PyTorch.
The proposed model for assessing VS using LSTM consists of five primary components: an input layer, an LSTM layer, a fully connected (FC) layer, a softmax layer, and an output layer. The tools utilized in this model include the standard scaler for statistical normalization, the confusion matrix for evaluating performance, and K-fold cross-validation.
Figure 3 illustrates the typical structure of the deep learning models used. The LSTM approach was used to forecast the PV curves based on VS assessments of the electrical system, as shown in
Figure 4. In addition, a comparison was made with other machine learning (ML) and deep learning (DL) models, including support vector machines (SVMs), Naive Bayes, K-nearest neighbors (KNN), and convolutional neural networks (CNNs), to evaluate the efficacy of the long short-term memory (LSTM) algorithm. The power–voltage (P-V) curve analysis is a widely used method for assessing voltage stability in power systems. This curve illustrates the relationship between power transfer in lines and voltage magnitudes at buses. A flat slope indicates an operating point far from the stability limit, while an increasingly steep slope reveals growing voltage sensitivity as the operating point approaches instability at the nose of the curve. During the data study procedure, the scikit-learn library was employed with a labeled input dataset. In this specific case, a dataset consisting of 30,000 observations was used. In this framework, a numerical value of 1 represents a state of stability (stable), whereas a value of 0 indicates a state of instability (unstable).
The assessment of features involved generating graphical representations for the 12 features, providing visual insights into their distribution and their correlation with the dependent variable 'Control Action'. This section provides a visual representation of how these 12 numerical values relate to the dependent variable. The dataset was divided into two distinct subsets for the DL algorithm, one for training and the other for evaluation. The algorithm was trained on the first subset of 24,000 data points and then tested on the remaining 6000 data points. The analytical variables in the dataset included the nominal power that each participant in the network generated (positive) or consumed (negative), ranging from −3.0 to −0.5 for the consuming nodes, which was used to label the stability or instability of the IEEE 33-bus system; the reaction time of each participant, ranging from 0.5 to 1.0; and the price elasticity of demand for each node in the network, a continuous variable ranging from 0.05 to 1.0.
The principle of energy conservation means that the overall load demand is identical to the total generation capacity. A non-linear activation function was used to help the system adapt its input to its environment. The dataset underwent an initial filtering process to eliminate missing values, such as blank or zero values within its features, thereby ensuring the accuracy and validity of the results. After missing values were removed, a thorough search was conducted to identify artifacts such as outliers, duplicates, or atypical patterns that could affect the analysis. The data were divided into training and testing sets with a 70:30 ratio. We constructed a system model that evaluates the effectiveness of the training and testing data. Assigning 70% of the dataset for training and reserving 30% for testing helped to prevent overfitting of the model to the training data and improved its capacity to generalize to new, unseen data. In an ML technique, it is critical to select the appropriate number of layers, type of layers, and activation functions. The settings and parameters of these techniques are discussed in further detail below to clarify their impact on model effectiveness.
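The 70:30 split and standardization steps described above can be sketched in a few lines of NumPy on synthetic stand-in data (the real dataset is not reproduced here); the manual scaling mirrors what scikit-learn's StandardScaler and train_test_split perform:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the 30,000-sample dataset described above:
# 12 features plus a binary stability label (1 = stable, 0 = unstable).
X = rng.normal(size=(30_000, 12))
y = rng.integers(0, 2, size=30_000)

# Shuffle, then split 70% / 30% (mirrors sklearn's train_test_split).
idx = rng.permutation(len(X))
cut = int(0.7 * len(X))
train, test = idx[:cut], idx[cut:]

# Standardize with statistics from the training set only
# (mirrors sklearn's StandardScaler: fit on train, transform on test).
mu, sigma = X[train].mean(axis=0), X[train].std(axis=0)
X_train = (X[train] - mu) / sigma
X_test = (X[test] - mu) / sigma

print(X_train.shape, X_test.shape)  # (21000, 12) (9000, 12)
```

Fitting the scaler on the training subset only, as above, avoids leaking test-set statistics into training, which would otherwise bias the reported generalization accuracy.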
The LSTM concepts and the suggested LSTM algorithm for predicting voltage stability are explained in the next sections.
An LSTM cell comprises an input gate, a forget gate, and an output gate. The memory cell in an LSTM can determine how information is added to or removed from the cell state at each time step. The components are as follows: f_i is the forget gate, g_i is the input gate, and o_i is the output gate; W_f, W_g, W_o, W_c and b_f, b_g, b_o, b_c are the associated weight matrices and bias vectors, respectively; σ(·) is the sigmoid function. In summary, the LSTM system learns how to effectively manage and update the cell state through these gates and their associated weights and biases. The sigmoid function is integral in controlling the amount of information retained or discarded, which is a crucial part of the learning process.
The overall update of the cell state is expressed as
C_i = f_i ∗ C_{i−1} + g_i ∗ C̃_i,
where ∗ denotes element-wise multiplication, C̃_i is the candidate cell state, and C_i and C_{i−1} denote the cell state at time steps i and i − 1, respectively.
The hidden state h_i is evaluated as follows:
h_i = o_i ∗ tanh(C_i).
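As a minimal illustration of the gate equations above, the following NumPy sketch performs one LSTM forward pass over a 13-step, single-feature sequence with 32 units (matching the first layer of the proposed model); the random weights are purely illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step. W has shape (4*n, n_in + n); b has shape (4*n,)."""
    n = h_prev.size
    z = W @ np.concatenate([x, h_prev]) + b
    f = sigmoid(z[:n])            # forget gate
    g = sigmoid(z[n:2*n])         # input gate
    o = sigmoid(z[2*n:3*n])       # output gate
    c_tilde = np.tanh(z[3*n:])    # candidate cell state
    c = f * c_prev + g * c_tilde  # element-wise cell-state update
    h = o * np.tanh(c)            # hidden state
    return h, c

rng = np.random.default_rng(1)
n_in, n = 1, 32                   # one feature per step, 32 units
W = rng.normal(scale=0.1, size=(4 * n, n_in + n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x_t in rng.normal(size=(13, n_in)):   # a 13-step input sequence
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape)  # (32,)
```

Because the hidden state is the product of a sigmoid gate and tanh of the cell state, every component of h stays strictly inside (−1, 1), which keeps the recurrence numerically well behaved over long sequences.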
The non-linear activation function is represented by tanh(·). The extended time-domain correlation feature extraction in the LSTM network is made possible by its special architecture, which substitutes memory cells for hidden-layer nodes. The dynamics of the post-contingency system serve as the input in this evaluation model. The VS of the IEEE 33-bus system was predicted employing the LSTM algorithm. The network used the rectified linear unit (ReLU) activation function from the first to the fifth layer, improving its ability to represent non-linear relationships and capture intrinsic patterns.
The output of the final layer is
y = σ(W_y h + b_y),
where h is the hidden feature and W_y and b_y are the weight and bias that are tuned on the training data throughout the procedure. The final outcome y of our structure gives the voltage stability status of the electrical power system. The sigmoid activation function in the last layer allows the network to produce an output within a limited range, usually between 0 and 1. This output is well suited for predicting voltage stability, as shown in
Table 1. The initial layer of the LSTM model was constructed using an LSTM architecture with 32 units, employing the ReLU activation function, a return-sequence parameter set to true, and an input shape of (13, 1). The second layer incorporated an additional LSTM layer with 24 units, using the ReLU function and a true setting for the return-sequence parameter. The third LSTM layer consisted of 16 units with the ReLU activation function. The fourth layer was a densely connected layer with 24 units, using the ReLU activation function. The fifth layer was a dense layer comprising 48 units. Finally, the last layer, serving as the output layer, contained a single unit. The sigmoid activation function was applied to this layer to produce the desired result. The proposed LSTM model consisted of a total of 14,105 trainable parameters, as illustrated in
Table 1, indicating the number of adjustable weights and biases within the model. This model used the Adam optimization algorithm with a learning rate of 0.01. The binary cross-entropy loss function was applied to measure the difference between the predicted and true labels, and accuracy served as the evaluation metric.
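The stated total of 14,105 trainable parameters can be checked analytically from the layer sizes, using the standard counts of 4·((n_in + n_units)·n_units + n_units) weights and biases per LSTM layer and n_in·n_units + n_units per dense layer:

```python
def lstm_params(n_in, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * ((n_in + units) * units + units)

def dense_params(n_in, units):
    return n_in * units + units

total = (
    lstm_params(1, 32)      # LSTM(32), one feature per time step
    + lstm_params(32, 24)   # LSTM(24)
    + lstm_params(24, 16)   # LSTM(16)
    + dense_params(16, 24)  # Dense(24)
    + dense_params(24, 48)  # Dense(48)
    + dense_params(48, 1)   # Dense(1), sigmoid output
)
print(total)  # 14105
```

That the total matches Table 1 also confirms that the (13, 1) input shape corresponds to 13 time steps of a single feature each.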
The ML techniques were trained with hyperparameter tuning to find the optimal values for the corresponding parameters. The performance of the SVM, Naive Bayes, and CNN models after tuning their hyperparameters is demonstrated in
Table 2.
The DL models mentioned above were trained for 30 epochs on the training set. After training, they were verified on the test dataset, and accuracy was assessed using the accuracy score from sklearn.metrics and then displayed on the console. Using pre-existing models saves time and computation, as they have prior exposure to extensive data, eliminating the need for training from scratch, which is time-consuming and computationally demanding.
b. The Voltage Stability Assessment Indicators
In this paper, the proposed approach is thoroughly evaluated using statistical indicators, such as the AUC and the F1 score, in addition to accuracy [
20]. A stable sample classified as stable is counted as a true positive (TP), while a stable sample classified as unstable is a false negative (FN). True negatives (TN) are produced when an unstable sample is determined to be unstable; false positives (FP) are produced when the opposite is true. These counts underpin the metrics used to determine how well the ML models work [
17]. The accuracy indicator measures the alignment between the predictions of the model and the actual results, representing the proportion of correct predictions among the total number of predictions, as shown in (11):
Accuracy = (TP + TN)/(TP + TN + FP + FN). (11)
Precision, given in (12), is the proportion of accurate positive predictions to the total number of positive predictions:
Precision = TP/(TP + FP). (12)
Recall, shown in Formula (13), calculates the actual positive rate (APR), i.e., the percentage of actual positives correctly identified by the model; it is the ratio of true positives to the sum of true positives and false negatives:
Recall = TP/(TP + FN). (13)
The F1 score is a statistical measure that calculates the harmonic mean of precision and recall, with a weight assigned to each. It serves as a metric to assess the trade-off between precision and recall, especially in situations where there is a notable class imbalance. Mathematically, it can be defined as follows:
F1 = 2 · (Precision · Recall)/(Precision + Recall). (14)
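Equations (11)–(14) can be computed directly from the four confusion-matrix counts. A short sketch, with hypothetical counts used for illustration only (not results from this paper):

```python
def metrics(tp, tn, fp, fn):
    """Classification metrics from confusion-matrix counts (stable = positive)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # true positive rate
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for a 6000-sample test set.
acc, prec, rec, f1 = metrics(tp=4700, tn=1200, fp=50, fn=50)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

When precision and recall coincide, as in this symmetric example, the F1 score equals both, since the harmonic mean of two equal values is that value.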
3. Results and Discussion of the Machine Learning and Deep Learning Algorithms
The proposed scheme was experimentally implemented using the Python programming language in a Jupyter Notebook environment. The LSTM was run under the Microsoft Windows 11 operating system on a personal computer (PC) with an Intel Core i7 processor at a frequency of 2.2 GHz and 16 GB of memory. The cross-correlation of all features present in the dataset was evaluated.
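The feature cross-correlation check mentioned above can be reproduced with NumPy's corrcoef; the data below are a synthetic stand-in with the same number of features, since the actual dataset values are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic stand-in: 1000 samples of the 12 features described earlier.
X = rng.normal(size=(1000, 12))
corr = np.corrcoef(X, rowvar=False)   # 12 x 12 feature correlation matrix
print(corr.shape)  # (12, 12)
```

Off-diagonal entries close to ±1 would flag redundant feature pairs, while a diagonal of ones is a quick sanity check that the matrix was computed over features rather than samples.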
The figures below show significant correlations among the features. In machine learning, classification is a type of supervised learning; the aim is to predict the class labels of the test data using patterns obtained from the training data. There are several classification algorithms, including support vector machines (SVMs), Naive Bayes, K-nearest neighbors (KNN), long short-term memory (LSTM), convolutional neural networks (CNNs), and TabNet. These algorithms can be evaluated using the accuracy metric, which quantifies the ratio of accurately classified instances to the total number of instances, and the efficiency of these algorithms was measured in terms of this metric. The long short-term memory (LSTM) model was trained and validated to predict the VS of the system, as shown in
Figure 5.
In this case, the figure shows the accuracy of the LSTM classifier, which reached a remarkable 0.9995 (99.95%). Over time, clear patterns and connections emerged between the input factors and the target outcome, which explains this extraordinary result.
The LSTM algorithm excels at integrating data from previous time steps to improve its predictive capabilities. Due to this feature, the model can capture long-term temporal correlations and make more accurate predictions. However, an accuracy of 0.9995 suggests that the model may have memorized the training data rather than properly applying its knowledge to unknown data, which would be evidence of overfitting. The generalizability of a model can be established by testing it on data that were not used to create it. Overfitting-related problems can be remedied by experimenting with regularization strategies such as dropout or weight decay.
Figure 6 shows the loss curves of the LSTM model during both the training and validation on the VS dataset. The voltage stability dataset was analyzed using the proposed LSTM algorithm.
Figure 7 displays the resulting confusion matrix and
Figure 8 visually depicts the remarkable accuracy of the CNN algorithm, 0.9610, in predicting the VS variable.
This study demonstrates the successful application of CNN models, typically used for image recognition and processing tasks, to tabular data classification. Given the circumstances, it is likely that the CNN model exploits the fundamental spatial relationships in the input features. This enabled the model to identify significant patterns and features, resulting in a significant improvement in accuracy. Non-linear activation functions, such as ReLU, introduce non-linearity to augment the capabilities of the model. Furthermore, the pooling layers decreased the size of the output, thereby reducing the number of parameters and preventing overfitting. This study used a one-dimensional convolutional neural network (1D-CNN) to accurately forecast VS using a particular dataset.
Figure 9 displays the loss curves for the CNN model. The results show that the system model performed admirably, with a total loss of 0.0962 and an accuracy of 0.9604 on the training set. On the validation set, the model also performed well, with a loss of 0.1078 and an accuracy of 0.9570.
Figure 10 shows a visual description of the confusion matrix. The results of this study highlight the impressive ability of the CNN algorithm to predict VS on a specific dataset.
The reliability and ability to generalize to new data are shown by the remarkable accuracy attained in the validation set. With an overall accuracy of 0.9610 in the voltage stability dataset, the CNN model proved that it could learn and extract significant patterns from input features, leading to very accurate predictions. The ML and DL models mentioned above are compared in
Figure 11 to forecast the VS of the IEEE 33-bus system.