This paper uses the IEEE 30-bus system and the IEEE 118-bus system to test and validate the proposed Hybrid Stacked Ensemble Method (HSEM). The IEEE 30-bus system consists of 6 generator buses, while the IEEE 118-bus system contains 54 generator buses. By leveraging both small- and large-scale test systems, the performance of the algorithm in providing an explainable warm-start point for AC-OPF problems is thoroughly evaluated.
4.2. IEEE 30-Bus System
The IEEE 30-bus system is a classical IEEE test case that is frequently used as a small-scale benchmark. It consists of 41 transmission lines, six generator buses, and 24 load buses [39]. Therefore, in the proposed HSEM model, the input dimension is 48, comprising the active and reactive demand of the 24 load buses. The output dimension of the model is 12, consisting of the active power outputs and voltages of the six generator buses.
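The dimension bookkeeping above can be written as a short helper. This is only an illustrative sketch: the function name `hsem_dimensions` is hypothetical, and it assumes exactly two features per load bus and two targets per generator bus, as described in the text.

```python
from typing import Tuple


def hsem_dimensions(n_load_buses: int, n_gen_buses: int) -> Tuple[int, int]:
    """Return (input_dim, output_dim) for a load-to-setpoint regression model."""
    # Two inputs per load bus: active demand and reactive demand.
    input_dim = 2 * n_load_buses
    # Two outputs per generator bus: active power output and voltage.
    output_dim = 2 * n_gen_buses
    return input_dim, output_dim


# Bus counts quoted in the text for the IEEE 30-bus system.
print(hsem_dimensions(24, 6))
```

With 24 load buses and six generator buses, the helper reproduces the input dimension of 48 and the output dimension of 12 stated above.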
Table 2 presents the objective function values obtained by HSEM and the comparative methods on the IEEE 30-bus system. In the table, the feasibility rate is the proportion of predicted solutions that satisfy the control variable limits and the OPF security constraints specified in Equation (1). The maximum, minimum, and average relative errors of the objective function value are used to evaluate the different predictive models: the maximum reflects the worst-case prediction error, the minimum the best case, and the average the overall prediction accuracy. Together, these metrics characterize each model's performance in both extreme and ordinary situations. The Mean Squared Error (MSE) and Mean Absolute Error (MAE) of the predicted objective function value are also reported. Note that the relative error, MSE, and MAE are calculated only over the feasible solutions of each algorithm, so that infeasible predictions do not distort the accuracy figures.
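As a minimal sketch of how such a summary could be computed, the function below scores predicted generation costs against reference costs using only the feasible samples. The function name, the dictionary keys, and the relative error definition |predicted - reference| / |reference| in percent are assumptions made for illustration; the feasibility flags are taken as given.

```python
from statistics import mean


def evaluate_cost_predictions(pred_cost, true_cost, feasible):
    """Summarise predicted objective values against reference values.

    pred_cost, true_cost: sequences of generation costs of equal length.
    feasible: sequence of booleans, True where the predicted operating
              point satisfied all limits (assumed to be computed elsewhere).
    """
    n_total = len(pred_cost)
    # Keep only the feasible samples for the error metrics.
    kept = [(p, t) for p, t, ok in zip(pred_cost, true_cost, feasible) if ok]
    if not kept:
        raise ValueError("no feasible samples to score")
    # Assumed definition: relative error in percent of the reference cost.
    rel = [abs(p - t) / abs(t) * 100.0 for p, t in kept]
    return {
        "feasibility_rate_pct": 100.0 * len(kept) / n_total,
        "rel_err_max_pct": max(rel),
        "rel_err_min_pct": min(rel),
        "rel_err_avg_pct": mean(rel),
        "mse": mean((p - t) ** 2 for p, t in kept),
        "mae": mean(abs(p - t) for p, t in kept),
    }


# Toy data: the last sample is infeasible and is excluded from the errors.
report = evaluate_cost_predictions(
    [101.0, 198.0, 300.0, 500.0],
    [100.0, 200.0, 300.0, 400.0],
    [True, True, True, False],
)
print(report)
```

In this toy example the infeasible fourth sample lowers the feasibility rate but does not enter the relative error, MSE, or MAE.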
In terms of feasibility rate, the base learners, with the exception of CatBoost, satisfy the constraints of the IEEE 30-bus system well: Regression Trees, Random Forest, LightGBM, and XGBoost all achieve a 100% feasibility rate. In contrast, although CatBoost has a feasibility rate of only 46.61%, it performs exceptionally well in terms of the relative error of the objective function value, outperforming the other base learners by 1–2 orders of magnitude.
The proposed HSEM algorithm accurately predicts the optimal values for the IEEE 30-bus case. The total generation cost resulting from the HSEM output is very close to the ground truth obtained from Matpower, with an average relative error as low as 0.000179%. The algorithm also maintains a 99.96% feasibility rate across the 10,000 test samples, which meets the high accuracy requirements of practical applications. Furthermore, the MSE and MAE values indicate a low prediction error for the generation cost, demonstrating good consistency and stability.
Compared with the more widely used DNN approach, the DNN may achieve a lower average relative error than the base learners (except CatBoost), but the HSEM algorithm outperforms the DNN in both relative error and feasibility rate.
Although Table 2 quantifies the prediction performance for the objective function value, even the most indicative metric, the average relative error, can still be affected by extreme points. A scatter plot of the relative error better reflects how the errors in the objective function value are distributed. Figure 3 shows these scatter plots for the six comparative algorithms, while Figure 4 presents the scatter plot for the HSEM algorithm alongside the corresponding frequency of occurrence.
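A frequency view of this kind can be produced by counting errors in equal-width bins. The sketch below is illustrative only: the function name `frequency_counts`, the bin count, and the sample values are hypothetical and are not taken from the reported experiments.

```python
def frequency_counts(values, n_bins=10):
    """Count how many values fall into each of n_bins equal-width bins.

    Returns a list of (bin_start, bin_end, count) tuples.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: a single degenerate bin.
        return [(lo, hi, len(values))]
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value is assigned to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


# Hypothetical relative errors (%): most are small, one is an extreme point.
errors = [0.0, 0.1, 0.2, 0.1, 0.0, 1.0]
for start, end, count in frequency_counts(errors, n_bins=2):
    print(f"[{start:.2f}, {end:.2f}]: {count}")
```

A single extreme point raises the average noticeably, whereas the bin counts still show that most errors sit near zero.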
From the aforementioned figures, it can be observed that the value of
for RT among the weak learners is mostly within
, for RF it is within
, and for LightGBM and XGBoost it falls primarily within
. Notably, CatBoost performs the best, with the value of
mainly distributed within
, while DNN falls within
. In contrast, as shown in the right panel of
Figure 4, the value of
for the HSEM algorithm is mainly within the range of
. If a few extreme points are excluded, the HSEM algorithm demonstrates the best stability and error performance. The results in Figure 3 and Figure 4 also confirm the accuracy of the evaluations presented in Table 2.
Predicting the OPF solution is essentially a multi-output regression problem, in which the predicted control variables determine the predicted value of the OPF objective function. Therefore, Table 3 analyzes the performance of the different algorithms in predicting the control variables. Since the two types of control variables (active power output and voltage at generator buses) differ in units and magnitude, Table 3 reports the MSE and MAE of each algorithm separately for the two types.
As shown in the table, the MSE and MAE achieved by the proposed HSEM algorithm for the generator voltages are small relative to the voltage output range of the IEEE 30-bus system, so HSEM fits the voltages well. For the other control variable, the active power output, HSEM attains lower MSE and MAE than the other algorithms, which ensures the accuracy of the regression for the target variables. The numerical values are listed in Table 3.
Additionally, Figure 5 and Figure 6 analyze the Absolute Percentage Error (APE) between the predicted values and the ground truth for the two control variables, showing scatter plots and frequency distribution plots of the APE for each. From
Figure 5, it can be observed that in most samples, the APE scatter points for
fall within the range of
, and the frequency distribution plot shows that the errors are concentrated near 0. Considering the magnitude of the control variable
, the HSEM algorithm provides good predictions for
.
Furthermore, Figure 6 shows that for most test samples, the APE scatter points for
are within
, with the frequency of errors peaking near 0. Therefore, the APE metric also indicates that the HSEM algorithm performs well in predicting control variables.
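For reference, a per-sample APE can be computed as in the sketch below. It assumes the usual definition |predicted - reference| / |reference| in percent; the function name and the voltage values are hypothetical. Because the APE is undefined when the reference value is zero (for example, a generator dispatched at zero output), the sketch raises an error in that case rather than guessing a convention.

```python
def absolute_percentage_errors(predicted, reference):
    """Per-sample APE in percent: |pred - ref| / |ref| * 100."""
    errors = []
    for p, r in zip(predicted, reference):
        if r == 0:
            # APE is undefined for a zero reference value.
            raise ValueError("APE is undefined for a zero reference value")
        errors.append(abs(p - r) / abs(r) * 100.0)
    return errors


# Hypothetical per-unit voltages at two generator buses.
print(absolute_percentage_errors([1.05, 1.00], [1.06, 1.00]))
```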
In terms of computation time, the proposed HSEM algorithm requires only 4.63 ms per sample, compared with 87.3 ms per sample for Matpower. This corresponds to a speedup of approximately 18.9 times, showing that HSEM can provide a warm-start point for AC-OPF problems much faster while maintaining its explainability and accuracy.
4.3. IEEE 118-Bus System
The IEEE 118-bus system is one of the IEEE power system test cases and represents a portion of the American Electric Power system in the Midwestern United States. It is widely used for large-scale power system research and for verifying optimization algorithms. This system consists of 186 transmission lines, 54 generator buses, and 64 load buses [1]. Therefore, for the IEEE 118-bus system, the input dimension of HSEM is 128, comprising the active and reactive demand of the 64 load buses. The output dimension is 108, comprising the active power outputs and voltages of the 54 generator buses. Compared with the IEEE 30-bus system, the output dimension is nine times higher, and the range of the control variables is also larger, which makes this system a good testbed for evaluating algorithms on larger-scale test systems.
Table 4 describes the performance of the HSEM algorithm and other comparative algorithms in terms of objective function values when handling the IEEE 118-bus system.
From the perspective of feasibility rate, it is evident that due to the higher output dimension and range of the 118-bus system, the performance of various comparative methods in terms of feasibility rate is not as good as in the smaller bus system. In particular, CatBoost achieves a feasibility rate of only 6.35%. Among the comparative algorithms, even the highest feasibility rate achieved by LightGBM is only 95.57%, which still does not meet the requirements for large-scale or real-time prediction of feasible solutions. However, the HSEM algorithm performs well even when faced with the more complex, high-dimensional test system, achieving a feasibility rate of 99.22%. In contrast, the DNN algorithm only achieves a feasibility rate of 45.46%. When dealing with higher-dimensional nonlinear problems, the DNN algorithm often faces significant issues of exceeding safety constraints. This highlights one of the advantages of the proposed HSEM algorithm.
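Part of such a feasibility check can be sketched as a simple limit test on the predicted control variables. The sketch below covers only box limits on the control variables; a complete check would also require a power flow solution to verify the remaining security constraints, which is not shown. The function names, the tolerance, and the limit values are hypothetical.

```python
def within_limits(values, lower, upper, tol=1e-6):
    """True if every value lies inside its [lower, upper] box (with tolerance)."""
    return all(lo - tol <= v <= hi + tol for v, lo, hi in zip(values, lower, upper))


def feasibility_rate(samples, lower, upper):
    """Percentage of samples whose control variables respect all box limits."""
    n_ok = sum(1 for s in samples if within_limits(s, lower, upper))
    return 100.0 * n_ok / len(samples)


# Hypothetical limits for one generator: active power and voltage magnitude.
lower = [0.0, 0.95]
upper = [80.0, 1.10]
samples = [
    [40.0, 1.00],  # inside both limits
    [90.0, 1.00],  # active power above its upper limit
    [10.0, 1.20],  # voltage above its upper limit
    [80.0, 1.10],  # exactly on the limits, counted as feasible
]
print(feasibility_rate(samples, lower, upper))
```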
Regarding the objective function values, CatBoost shows the best performance among the comparative algorithms in terms of average relative error and, even considering marginal cases, also has the smallest maximum relative error. Although the maximum relative error of HSEM is larger than that of CatBoost, its average relative error over the test set is lower than that of all other algorithms (see Table 4). HSEM also shows the best performance on the other two evaluation metrics, MSE and MAE. Note that CatBoost performs well on many evaluation measures, but its low feasibility rate shows that it struggles with the OPF problem. In contrast, HSEM uses a stacking method that combines the solutions of the RT and RF algorithms and refines them through a meta-model, which allows the overall output to satisfy the security constraints while also improving accuracy.
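The stacking idea can be illustrated with a deliberately small sketch: the predictions of two base learners are combined by a second-layer model fitted on those predictions. Everything here is an assumption made for illustration. The two prediction lists merely stand in for RT and RF outputs, the meta-model is a two-weight least-squares combination, and the function names are hypothetical; the meta-model actually used by HSEM is not specified in this section. In practice the meta-model would be fitted on held-out (out-of-fold) base predictions to avoid leakage.

```python
def fit_meta_weights(base_a, base_b, targets):
    """Least-squares weights (w_a, w_b) minimising sum (w_a*a + w_b*b - y)^2."""
    s_aa = sum(a * a for a in base_a)
    s_bb = sum(b * b for b in base_b)
    s_ab = sum(a * b for a, b in zip(base_a, base_b))
    s_ay = sum(a * y for a, y in zip(base_a, targets))
    s_by = sum(b * y for b, y in zip(base_b, targets))
    # Solve the 2x2 normal equations with Cramer's rule.
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; weights are not unique")
    w_a = (s_ay * s_bb - s_by * s_ab) / det
    w_b = (s_by * s_aa - s_ay * s_ab) / det
    return w_a, w_b


def stacked_predict(a, b, weights):
    """Second-layer prediction from the two base predictions."""
    w_a, w_b = weights
    return w_a * a + w_b * b


# Stand-ins for base learner outputs whose errors have opposite signs.
targets = [10, 20, 30, 40]
pred_a = [11, 19, 31, 39]
pred_b = [9, 21, 29, 41]

weights = fit_meta_weights(pred_a, pred_b, targets)
print(weights)
print([stacked_predict(a, b, weights) for a, b in zip(pred_a, pred_b)])
```

In this toy case each base prediction is off by one unit, but the fitted combination recovers the targets, which illustrates how a second layer can correct complementary base learner errors.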
Figure 7 shows the scatter plots of the relative error of the objective function value for the comparative algorithms on the IEEE 118-bus system, while Figure 8 presents the corresponding scatter plot and frequency distribution for the HSEM algorithm.
Analyzing the above figures, it can be seen that the
of the HSEM algorithm mainly falls within the range of
, whereas for CatBoost it is primarily within
. These two algorithms exhibit much smaller objective function errors than the other comparative algorithms. The similarity between the scatter plots of CatBoost and HSEM is consistent with their close average relative errors in Table 4. However, as mentioned earlier, CatBoost’s feasibility rate is significantly lower than that of the other comparative algorithms. In essence, the HSEM algorithm, through its two-layer stacking approach, corrects the base models’ prediction outputs so that they meet the constraints.
Table 5 presents the regression errors of the control variables for each algorithm on the IEEE 118-bus system. As shown in the table, the HSEM algorithm has the smallest regression errors. For the control variable
, which has a larger magnitude, the HSEM algorithm achieves an MSE of
and an MAE of
. For the control variable
, the HSEM algorithm achieves an MSE of
and an MAE of
. Additionally, in terms of both MAE and MSE, CatBoost’s metrics are closest to those of HSEM, making it the second-best performing algorithm. The errors in the control variables are therefore reflected in the errors of the objective function values, consistent with the results in Table 4.
Figure 9 and Figure 10 depict the Absolute Percentage Error (APE) of the HSEM algorithm for the two control variables. From Figure 9, it can be seen that the sum of the APE for the generator active power outputs mostly falls within 0.1%. Considering that the widest range of generator active power output in the 118-bus system is 0 to 707 MW (excluding the slack bus) and the narrowest is 0 to 100 MW, the HSEM algorithm fits the generator output control variables very well. For the generator bus voltages, as shown in
Figure 10, the APE of the HSEM algorithm mostly falls within
. Therefore, the HSEM algorithm also performs well in fitting the control variables.
For the IEEE 118-bus system, the proposed HSEM algorithm requires only 12.7 ms per sample, whereas Matpower takes 297.6 ms per sample. This results in a speedup of approximately 23.4 times, demonstrating that HSEM significantly outperforms Matpower in terms of computational efficiency, while still providing an explainable and accurate warm-start point for AC-OPF problems.
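The speedup figures follow directly from the per-sample times quoted for the two systems. The short check below only reproduces that arithmetic; the function name `speedup` is hypothetical.

```python
def speedup(baseline_ms: float, model_ms: float) -> float:
    """Ratio of baseline solve time to model inference time."""
    return baseline_ms / model_ms


# Per-sample times quoted in the text (Matpower vs. HSEM).
print(round(speedup(87.3, 4.63), 2))   # IEEE 30-bus system
print(round(speedup(297.6, 12.7), 2))  # IEEE 118-bus system
```

Direct division gives about 18.86 for the 30-bus system and about 23.43 for the 118-bus system.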