1. Introduction
With the emergence of large-scale interconnected power grids and the high penetration of distributed generation, modern power systems face severe challenges to stable operation. Transient stability is a crucial and complex issue in modern power systems [1]. Rapid and accurate recognition of the transient stability status is important for awareness of imminent instability risk, so that enough time is available to apply appropriate control strategies and prevent catastrophic outages [2].
Classical transient stability analysis methods can be categorized into two branches: time-domain simulation (TDS) and the transient energy function (TEF) [3,4]. Although the TDS method is straightforward and reliable, its high time complexity hinders real-time decision-making applications. The TEF method has low computational cost and provides a transient stability margin, but it is normally difficult to construct a usable energy function when detailed system models are considered.
With the wide deployment of phasor measurement units, tremendous amounts of synchrophasor data are accessible for monitoring the stability of power systems [5]. As a key tool for power system data analysis, machine learning shows great potential for real-time transient stability status prediction (TSSP). Generally, TSSP can be regarded as a binary classification problem [6], i.e., stable/unstable status. In the offline stage, the nonlinear relationship between the selected features and the corresponding stability status is established via machine learning methods. In the online stage, the transient stability status can be predicted immediately after the collected features are fed to the classification model. To date, a variety of machine learning methods have been applied to TSSP, e.g., neural networks [7,8], support vector machines [9,10], decision trees [11], and ensemble learning [12,13].
In general, practical power systems remain transiently stable when subjected to most disturbances. That is, an unstable status is detected in only a few situations, which results in an imbalanced data problem in the training database, i.e., stable samples significantly outnumber unstable samples [14,15]. Faced with this issue, conventional machine learning methods that minimize the overall error rate tend to classify samples into the stable class and are ineffective at identifying unstable samples [14]. The unstable class is more important than the stable class to power system operators, and poor recognition of unstable samples dramatically deteriorates the practical utility of a TSSP classification model.
In the machine learning community, the imbalanced data problem in classification tasks is a hot research topic, and effective solutions can mainly be divided into data-level and algorithm-level approaches [16]. The former rebalances the data by adding samples of the minority class (oversampling) or removing samples of the majority class (undersampling). The latter handles the problem via enhanced classification algorithms, e.g., cost-sensitive learning [17,18].
However, few studies have attempted to counteract the negative effects of imbalanced data on TSSP. In Reference [14], unstable samples are duplicated to balance the sample numbers of the two classes. Although this method is simple and direct, it is prone to overfitting [18]. An adaptive synthetic sampling (ADASYN) algorithm is adopted in Reference [15] to generate more unstable samples, but the samples generated by linear interpolation are hardly related to actual operating conditions of power systems, which may affect the rationality of the classification model.
In this paper, a novel data segmentation-based ensemble classification (DSEC) method is proposed to better handle the imbalanced data problem of TSSP. The DSEC method consists of three steps. In the first step, a data segmentation strategy divides the stable samples into multiple non-overlapping stable subsets, ensuring that each stable subset contains no more samples than the total number of unstable ones; each stable subset is then combined with the unstable set to form a training subset. In the second step, an AdaBoost classifier is constructed for each training subset. In the final step, the decision values from all AdaBoost classifiers are aggregated to determine the transient stability status.
The rest of this paper is organized as follows: Section 2 investigates the effects of the imbalanced data problem on TSSP. The DSEC method is proposed in Section 3. TSSP based on the DSEC method is introduced in Section 4. Case studies are presented in Section 5. Finally, conclusions are drawn in Section 6.
3. The Proposed DSEC Method
As the analysis in Section 2 shows, the imbalanced data problem in the training set deteriorates the classification performance of the TSSP model. To address this challenge, the DSEC method is proposed for TSSP and described in detail below.
3.1. Data Segmentation Strategy
The training set in the TSSP problem can be divided into the stable set and the unstable set, and the stable samples usually outnumber the unstable samples. To obtain a relatively balanced training set, the data segmentation strategy is proposed. Its basic idea is to divide the stable set into multiple non-overlapping stable subsets, ensuring that each stable subset contains no more samples than the number of unstable samples.
The specific steps of the data segmentation strategy are as follows.

Step 1: Given the stable set S with $N_S$ samples and the IR value, determine the number of stable subsets T by

$$T = \mathrm{ceil}(\mathrm{IR}) = \mathrm{ceil}(N_S / N_U),$$

where ceil represents the ceiling function and $N_U$ is the number of unstable samples. Calculate the remainder P by

$$P = N_S \bmod T.$$

Set k and $P_0$ both equal to 1.

Step 2: If $P_0 \le P$, go to step 3; otherwise, go to step 4.

Step 3: Determine the number of samples in stable subset $S_k$ by

$$|S_k| = \mathrm{floor}(N_S / T) + 1.$$

Create the stable subset $S_k$ by randomly selecting $|S_k|$ samples from stable set S without replacement. Set $P_0 = P_0 + 1$ and $k = k + 1$, then go to step 5.

Step 4: Determine the number of samples in stable subset $S_k$ by

$$|S_k| = \mathrm{floor}(N_S / T).$$

Create the stable subset $S_k$ by randomly selecting $|S_k|$ samples from stable set S without replacement. Set $k = k + 1$.

Step 5: If $k \le T$, return to step 2; otherwise, go to step 6.

Step 6: Output the T stable subsets $\{S_1, S_2, \dots, S_T\}$.
From the above strategy, if IR is an integer, the number of samples in each stable subset equals the number of unstable samples; otherwise, each stable subset contains fewer samples than the unstable set.
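For concreteness, the following Python sketch illustrates one possible implementation of the strategy, assuming the reconstruction above (T = ceil(IR), with the remainder spread one sample per subset); the function name and interface are illustrative, not from the paper.

```python
import math
import random

def segment_stable_set(stable_set, n_unstable, seed=0):
    """Divide the stable set into T non-overlapping subsets, each no larger
    than the unstable set (steps 1-6 above). Illustrative sketch only."""
    rng = random.Random(seed)
    n_stable = len(stable_set)
    t = math.ceil(n_stable / n_unstable)   # step 1: T = ceil(IR)
    base, p = divmod(n_stable, t)          # remainder P = N_S mod T
    pool = list(stable_set)
    rng.shuffle(pool)                      # shuffling then slicing = sampling without replacement
    subsets, start = [], 0
    for k in range(t):
        size = base + 1 if k < p else base # steps 3-4: first P subsets get one extra sample
        subsets.append(pool[start:start + size])
        start += size
    return subsets

# With N_S about 4.8 times N_U, as in the case study below,
# T = ceil(4.8) = 5 stable subsets are produced.
```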
3.2. AdaBoost Algorithm
As a widely used machine learning method, AdaBoost is employed as the classifier in this paper. With the advantages of a sound theoretical foundation and simple implementation [21], it has been applied to many classification problems in practice [22,23].
Since AdaBoost is itself an ensemble learning method, the classification and regression tree (CART) is adopted as its base learner. The basic steps of the AdaBoost algorithm are as follows:
Step 1: Given the training set $D = \{(x_1, y_1), (x_2, y_2), \dots, (x_N, y_N)\}$, $y_i \in \{-1, +1\}$, set the number of iteration rounds to R.

Step 2: Set $r = 1$ and initialize the weight of each sample as

$$\omega_1(i) = 1/N, \quad i = 1, \dots, N.$$

Step 3: Create $D_r$ by randomly selecting N samples from D with probabilities $\omega_r(i)$.

Step 4: Train the CART $h_r$ using $D_r$.

Step 5: Calculate the classification error of $h_r$ on D:

$$\varepsilon_r = \sum_{i=1}^{N} \omega_r(i)\, I\big(h_r(x_i) \neq y_i\big),$$

where $I(\cdot)$ is the indicator function.

Step 6: Calculate the weight $\alpha_r$ of CART $h_r$:

$$\alpha_r = \frac{1}{2} \ln\!\left(\frac{1 - \varepsilon_r}{\varepsilon_r}\right).$$

Step 7: Update the weight distribution:

$$\omega_{r+1}(i) = \frac{\omega_r(i) \exp\big(-\alpha_r y_i h_r(x_i)\big)}{Z_r},$$

where $Z_r$ is a normalization factor. Set $r = r + 1$; if $r \le R$, return to step 3; otherwise, go to step 8.

Step 8: Output the decision value

$$f(x) = \sum_{r=1}^{R} \alpha_r h_r(x).$$
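A minimal Python sketch of this resampling-based AdaBoost variant is given below, using scikit-learn's CART implementation (DecisionTreeClassifier) as the base learner; the function names and the tree's default settings are assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, y, R=40, seed=0):
    """Resampling-based AdaBoost with CART base learners (steps 1-8 above).
    Labels y must take values in {-1, +1}. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    N = len(y)
    w = np.full(N, 1.0 / N)                        # step 2: uniform initial weights
    trees, alphas = [], []
    for _ in range(R):
        idx = rng.choice(N, size=N, p=w)           # step 3: draw D_r according to w_r
        tree = DecisionTreeClassifier(random_state=0).fit(X[idx], y[idx])  # step 4
        pred = tree.predict(X)
        eps = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)  # step 5: weighted error on D
        alpha = 0.5 * np.log((1 - eps) / eps)      # step 6: learner weight
        w *= np.exp(-alpha * y * pred)             # step 7: reweight samples
        w /= w.sum()                               # normalize (Z_r)
        trees.append(tree)
        alphas.append(alpha)
    return trees, alphas

def adaboost_decision(trees, alphas, X):
    """Step 8: decision value f(x) = sum_r alpha_r * h_r(x)."""
    return sum(a * t.predict(X) for t, a in zip(trees, alphas))
```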
3.3. The DSEC Method
To solve the imbalanced data problem of TSSP, the three-step DSEC method is proposed. Each step is summarized below.
Step 1: Divide the training set D into the stable set S and the unstable set U. Next, apply the data segmentation strategy to split the stable set into T stable subsets $\{S_1, S_2, \dots, S_T\}$. Then, combine each stable subset with the unstable set U to form T training subsets $\{D_1, D_2, \dots, D_T\}$. If IR is an integer, each training subset contains as many stable samples as unstable samples; otherwise, the unstable samples outnumber the stable samples in each training subset.
Step 2: Train T AdaBoost classifiers with T training subsets independently.
Step 3: Ensemble the decision values from the T AdaBoost classifiers using the summation rule:

$$F(x) = \sum_{t=1}^{T} f_t(x),$$

where $f_t(x)$ is the decision value of the t-th AdaBoost classifier. Then, determine the transient stability status by

$$\hat{y} = \mathrm{sign}\big(F(x)\big).$$
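Reusing adaboost_decision from the previous sketch, the ensemble step can be written as follows; mapping the positive sign to the stable class is an assumption about the label encoding, which the source does not state explicitly.

```python
import numpy as np

def dsec_predict(classifiers, X):
    """Sum the decision values of the T AdaBoost classifiers and take the
    sign as the predicted status (+1 assumed stable, -1 unstable).
    `classifiers` is a list of (trees, alphas) pairs from adaboost_train."""
    total = sum(adaboost_decision(trees, alphas, X)
                for trees, alphas in classifiers)
    return np.where(total >= 0, 1, -1)
```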
The schematic diagram of the DSEC method is shown in Figure 3.
5. Case Studies
The Northeast Power Coordinating Council (NPCC) 140-bus system, representing the equivalent power grid of the northeastern United States, is utilized as the test system [24,25]. The simulations are carried out in the MATLAB environment on a computer with an Intel Core i5 3.3 GHz processor and 8 GB of RAM.
For database generation, the active power outputs and terminal voltages of generators vary within ±20% and ±2% of the base operating condition, respectively, and the active and reactive powers of loads both vary within ±20% of the base operating condition. A transmission line with a permanent three-phase short circuit is randomly selected as the fault condition, and the fault duration is set to 0.12 s.
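As an illustration of this sampling scheme, the sketch below draws one randomized operating condition with NumPy; the array names (pg_base, etc.) are hypothetical, and the paper's actual simulations are run in MATLAB.

```python
import numpy as np

rng = np.random.default_rng(42)

def perturb_operating_condition(pg_base, vg_base, pl_base, ql_base):
    """Draw one randomized operating condition around the base case:
    generator active power within +/-20%, terminal voltage within +/-2%,
    and load active/reactive power within +/-20%. Illustrative only."""
    pg = pg_base * rng.uniform(0.8, 1.2, size=pg_base.shape)
    vg = vg_base * rng.uniform(0.98, 1.02, size=vg_base.shape)
    pl = pl_base * rng.uniform(0.8, 1.2, size=pl_base.shape)
    ql = ql_base * rng.uniform(0.8, 1.2, size=ql_base.shape)
    return pg, vg, pl, ql
```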
A total of 16,000 samples with 270 features are generated to form the original TSSP database. After two-stage feature selection preprocessing, 87 features are retained to form the classification database. A total of 60% of the classification database is randomly selected as the training set, another 20% as the validation set, and the remaining 20% as the testing set. The sample distribution is shown in Table 3.
As Table 3 shows, there is an obvious imbalanced data problem in the classification database: the number of stable samples is about 4.8 times the number of unstable samples. After applying the data segmentation strategy, the sample distribution in each training subset is tabulated in Table 4.
5.1. Parameter Selection
The main parameter of the DSEC model is the iteration round R. The GM performance of the DSEC model on the validation set is analyzed and compared for different values of R over the range [10, 20, …, 100]. The results of the parameter analysis are shown in Figure 5. To account for the randomness of the AdaBoost classifier, the average results of 10 repeated experiments are used for comparison.
As Figure 5 shows, with increasing R, the GM value first increases rapidly and then gradually levels off. Considering both classification performance and model complexity, R is set to 40.
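Assuming GM is the geometric mean of the true stable rate (TSR) and true unstable rate (TUR), as is standard for imbalanced classification (the formal definition appears in Section 2, not reproduced here), the validation metric can be computed as in this sketch.

```python
import numpy as np

def g_mean(y_true, y_pred, stable=1, unstable=-1):
    """GM = sqrt(TSR * TUR); assumes the standard geometric-mean definition
    and the +1 stable / -1 unstable label encoding."""
    tsr = np.mean(y_pred[y_true == stable] == stable)       # recall on stable class
    tur = np.mean(y_pred[y_true == unstable] == unstable)   # recall on unstable class
    return np.sqrt(tsr * tur)

# Parameter selection: evaluate g_mean on the validation set for
# R in [10, 20, ..., 100], averaging 10 runs per value, and pick the
# knee of the curve (R = 40 in the paper).
```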
5.2. Comparison with Traditional Machine Learning Methods
In this section, the DSEC model is compared with traditional machine learning methods that do not consider the imbalanced data problem, including SVM, ELM, CART, and AdaBoost. The Gaussian function is chosen as the kernel function of the SVM, whose parameters include the penalty coefficient c and the kernel parameter γ. The grid search method is utilized for selecting the optimal SVM parameters, and the value range of both parameters is $[2^{-8}, 2^{-7}, \dots, 2^{8}]$. The main parameter of the ELM is the number of hidden layer nodes L, with range [50, 100, …, 1500]. The default parameters are adopted for the CART algorithm, and the iteration round of the AdaBoost classifier is set to 40.
The results of these methods for TSSP are compared in Table 5. As Table 5 shows, when dealing with the TSSP problem with imbalanced data, the traditional machine learning methods achieve a high TSR but quite low TUR and GM. In contrast, the proposed DSEC model significantly improves the TUR and GM, which reach 97.03% and 94.62%, respectively, while the TSR is still maintained at 92.28%.
5.3. Comparison with Other Data-Level Methods
In this section, the DSEC method is compared with several state-of-the-art data-level methods for the imbalanced TSSP problem, including random oversampling (ROS) [14], random undersampling (RUS) [20], the synthetic minority oversampling technique (SMOTE) [26], ADASYN, cluster-based undersampling (CUS) [27], and EasyEnsemble [20]. The detailed procedures for applying these methods to the imbalanced data problem of TSSP are described as follows (a resampling sketch follows the list):
- (1) ROS: A new unstable set $U_{ROS}$ is sampled with replacement from the original unstable set U, so that $|U_{ROS}| = N_S$. The unstable set $U_{ROS}$ is then combined with the stable set S to form a new training set.
- (2) RUS: A new stable set $S_{RUS}$ is sampled without replacement from the original stable set S, so that $|S_{RUS}| = N_U$. The stable set $S_{RUS}$ is then combined with the unstable set U to form a new training set.
- (3) SMOTE: $N_S - N_U$ new unstable samples are generated using SMOTE. These unstable samples are then added to the original training set, so that $|U_{SMOTE}| = N_S$ in the new training set.
- (4) ADASYN: $N_S - N_U$ new unstable samples are generated using ADASYN. These unstable samples are then added to the original training set, so that $|U_{ADASYN}| = N_S$ in the new training set.
- (5) CUS: The k-medoids algorithm is used to cluster the stable samples into $N_U$ clusters. A new stable set $S_{CUS}$ is constructed from the $N_U$ cluster-center samples, so that $|S_{CUS}| = N_U$. The stable set $S_{CUS}$ is then combined with the unstable set U to form a new training set.
- (6) EasyEnsemble: Randomly sample a stable subset $S_{Easy}$ from the original stable set S, so that $|S_{Easy}| = N_U$. The stable subset $S_{Easy}$ is then combined with the unstable set U to form a new training subset. This process is repeated $T_{Easy}$ times to obtain $T_{Easy}$ training subsets; here, $T_{Easy}$ is set to 5.
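For reference, most of these baselines are available in the imbalanced-learn library; a minimal sketch on synthetic stand-in data is shown below (CUS has no direct imblearn counterpart, and EasyEnsemble is provided as imblearn.ensemble.EasyEnsembleClassifier). The dataset parameters merely mimic the imbalance in Table 3.

```python
from sklearn.datasets import make_classification
from imblearn.over_sampling import RandomOverSampler, SMOTE, ADASYN
from imblearn.under_sampling import RandomUnderSampler

# Synthetic stand-in for the TSSP training set: roughly 4.8 stable
# samples per unstable sample, mirroring Table 3.
X, y = make_classification(n_samples=9600, n_features=87,
                           weights=[0.828, 0.172], random_state=0)

samplers = {
    "ROS": RandomOverSampler(random_state=0),   # duplicate unstable samples
    "RUS": RandomUnderSampler(random_state=0),  # drop stable samples
    "SMOTE": SMOTE(random_state=0),             # interpolate new unstable samples
    "ADASYN": ADASYN(random_state=0),           # density-adaptive interpolation
}

# Each rebalanced set would then be used to train the AdaBoost classifier.
resampled = {name: s.fit_resample(X, y) for name, s in samplers.items()}
```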
The AdaBoost classifier is employed for all the above data-level methods and, considering the randomness of these methods, the average results of 10 repeated experiments are taken for comparison. The training times and classification results of these methods are compared in Table 6.
As Table 6 shows, the DSEC method achieves higher TUR and GM values than the other imbalanced-data processing methods, which means that the proposed method has better classification performance on both unstable samples and overall samples; it also costs relatively less training time than the other methods, except RUS. Therefore, the DSEC method is superior for TSSP with imbalanced data. Furthermore, the total time cost of the DSEC method on the testing data is 0.20 s, i.e., the computation time for one sample is about 0.06 ms, which demonstrates the feasibility of the method for online application.
5.4. The Performance of the DSEC Method Under Different IRs
To study the classification performance of the DSEC method under different IRs, the value of IR is varied over [2, 4, …, 10], and new training and testing sets are constructed for each IR value. The GM performance of the DSEC method is evaluated under the different IRs. In addition, the AdaBoost classifier, as a traditional machine learning method, is compared with the DSEC method on the same sample sets. The results of the two methods are illustrated and compared in Figure 6.
As the IR increases, the GM performance of the AdaBoost classifier decreases continuously; at IR = 10, the GM value drops to about 87%. The performance of the DSEC method is almost unaffected by the change in IR, and the GM values remain above 93% under all IRs. These results further demonstrate the effectiveness of the DSEC method in dealing with the imbalanced data problem of TSSP.
Under different IRs, the increment of GM (IGM) of the DSEC method over the AdaBoost classifier is shown in Table 7.
An approximate linear function between the IR and the IGM value is fitted to these data points.
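As an illustration of how such a fit is obtained (the fitted coefficients themselves are not reproduced here), a least-squares line can be computed with numpy.polyfit; the IGM values below are hypothetical placeholders standing in for the Table 7 column.

```python
import numpy as np

ir = np.array([2, 4, 6, 8, 10])             # IR grid used in Section 5.4
igm = np.array([1.0, 2.5, 4.1, 5.8, 7.2])   # hypothetical placeholders for Table 7's IGM values
slope, intercept = np.polyfit(ir, igm, 1)   # degree-1 least-squares fit
print(f"IGM ~= {slope:.3f} * IR + {intercept:+.3f}")
```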
The discrete data points and the fitted linear function are shown in Figure 7.
As shown in Figure 7, the IGM value increases almost linearly with the IR, which means that the more severe the imbalanced data problem of TSSP, the greater the improvement of the DSEC method over the AdaBoost classifier.