4.3.1. Correlation Analysis between Weather Conditions and Solar Power Production
Pearson correlation is used to quantify the relationship between the meteorological factors and PV power, and the weather factors that correlate strongly with PV power are selected as inputs to the forecasting model. For Dataset I, the relationship between the solar power and both the measured weather data and the numerical weather forecast data is examined, with the results shown in Table 3. For Dataset II, the relationship between the solar power output and the weather data obtained by inverse distance weighting (IDW) meteorological interpolation is evaluated, with the results shown in Table 4.
Based on this analysis, four features are selected as inputs to the prediction model for Dataset I: measured global horizontal irradiance (GHI), numerical weather prediction (NWP) GHI, NWP direct normal irradiance (DNI), and historical PV power. For Dataset II, the selected features are IDW-interpolated GHI, IDW-interpolated diffuse horizontal irradiance (DHI), IDW-interpolated temperature, and historical PV power.
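The correlation-based screening described above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's code: the function names, the feature names, and the 0.5 selection threshold are hypothetical, since the text does not state the cutoff that was used.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    # A constant series has no defined correlation; report 0.0 in that case.
    return float((xc * yc).sum() / denom) if denom > 0 else 0.0


def select_features(features, target, threshold=0.5):
    """Score each candidate feature against the target and keep the
    ones whose absolute correlation reaches the (assumed) threshold."""
    scores = {name: pearson_r(values, target) for name, values in features.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return scores, selected


# Synthetic example: power follows irradiance, wind speed is unrelated.
rng = np.random.default_rng(0)
ghi = rng.uniform(0.0, 1000.0, size=500)            # W/m^2, synthetic
wind_speed = rng.normal(3.0, 1.0, size=500)         # m/s, synthetic
pv_power = 0.02 * ghi + rng.normal(0.0, 1.0, 500)   # MW, synthetic

scores, selected = select_features(
    {"ghi": ghi, "wind_speed": wind_speed}, pv_power
)
print(scores, selected)
```

In practice the same scoring would be run once per dataset and per candidate weather variable, with historical PV power added to the selected inputs.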
To assess how meteorological inputs affect the prediction performance of the proposed model, the model was evaluated under two input conditions: without the meteorological features and with them. The experimental results are shown in Table 5 and Table 6.
As shown in
Table 5, for Dataset I, when the meteorological features are used as the input, the proposed model exhibits superior prediction performance in forecasting power output. For example, in the spring experiment, using the meteorological features as the input improved the
of the proposed model by 0.965%, reduced the
by 0.083 MW, and decreased the
by 0.074 MW. Similar improvements were observed in autumn and winter.
As shown in Table 6, for Dataset II, using the meteorological features generated by the inverse distance weighting meteorological densification technique as the input significantly enhanced the prediction performance of the proposed model. For example, in the spring experiment, the
of the proposed model increased by 0.950%, the
decreased by 0.094 MW, and the
decreased by 0.121 MW. Additionally, the prediction performance of the proposed model was improved in the other seasons as well.
Therefore, including meteorological features as the input effectively improves the prediction performance of the proposed model. The results for Dataset II further indicate that the inverse distance weighting meteorological densification technique can provide comprehensive coverage of the meteorological resources within the region at a lower economic cost, yielding differentiated meteorological data and improving the model’s prediction performance. This confirms that meteorological features are a critical factor in enhancing the prediction performance of the proposed model.
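As a rough illustration of how inverse distance weighting can estimate weather values at PV stations that have no local sensor, consider the sketch below. It assumes planar coordinates with Euclidean distance and a distance exponent of 2; the function name, the coordinates, and these parameter choices are hypothetical and are not taken from the study.

```python
import numpy as np


def idw_interpolate(station_xy, station_values, target_xy, power=2.0, eps=1e-12):
    """Estimate values at target points as a weighted mean of the
    observations, with weights proportional to 1 / distance**power."""
    station_xy = np.asarray(station_xy, dtype=float)          # (n, 2)
    station_values = np.asarray(station_values, dtype=float)  # (n,)
    target_xy = np.asarray(target_xy, dtype=float)            # (m, 2)

    # Pairwise distances between every target and every weather station.
    diff = target_xy[:, None, :] - station_xy[None, :, :]
    dist = np.linalg.norm(diff, axis=2)                       # (m, n)

    # Clamp tiny distances so a target located on a station does not
    # divide by zero; that station then dominates the weighted mean.
    weights = 1.0 / np.maximum(dist, eps) ** power
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ station_values


# Two weather stations reporting GHI (W/m^2) and two PV sites to fill in.
weather_xy = [(0.0, 0.0), (2.0, 0.0)]
weather_ghi = [100.0, 300.0]
pv_sites_xy = [(1.0, 0.0), (0.0, 0.0)]

estimated_ghi = idw_interpolate(weather_xy, weather_ghi, pv_sites_xy)
print(estimated_ghi)
```

A site midway between the two stations receives their average, while a site located at a station reproduces that station's reading. Real deployments would typically replace the planar distance with a geodesic distance computed from latitude and longitude.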
4.3.2. Ablation Analysis of Sub-Module Additions for PV Power Prediction Performance
The overall power prediction performance of the proposed model and models 1 to 4 in the ablation experiments across various regions and seasons is presented in Table 7 and Table 8. The results indicate that the proposed model outperforms models 1 to 4 in overall prediction performance. In the seasonal experiments on Datasets I and II, model 2 improves on the prediction performance of model 1. Using the spring experiment results as an example, in Dataset I, compared to model 1, model 2 with the added ST-GCN network showed an improvement of 0.650% in
, a decrease of 0.03 MW in
, and a reduction of 0.047 MW in
. In Dataset II, compared to model 1, model 2 showed an improvement of 1.899% in
, a decrease of 0.128 MW in
, and a reduction of 0.16 MW in
. Additionally, similar results were obtained in the experiments conducted in other seasons. This improvement is attributed to the ST-GCN network, which aggregates the feature information from highly correlated neighboring nodes, thus integrating features from multiple PV stations to obtain richer and more accurate feature representations. In contrast, model 1 predicts each PV station in the region individually, failing to capture the global information, which limits its prediction performance.
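The neighbor aggregation credited to the ST-GCN above can be illustrated with a single graph-convolution step. The sketch below assumes an adjacency matrix built by thresholding the Pearson correlation between station power histories and uses the standard symmetric normalization with a ReLU activation; the 0.8 threshold, the function names, and the random weights are hypothetical, and the layer actually used in the study may differ.

```python
import numpy as np


def correlation_adjacency(power_history, threshold=0.8):
    """Connect stations whose power histories are strongly correlated.
    power_history has shape (time_steps, n_stations)."""
    corr = np.corrcoef(power_history, rowvar=False)   # (n, n)
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 1.0)                        # keep self-loops
    return adj


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)                             # >= 1 due to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def graph_conv(node_features, adj_norm, weight):
    """One aggregation step: mix each node with its neighbors,
    project the result, and apply a ReLU."""
    return np.maximum(adj_norm @ node_features @ weight, 0.0)


# Synthetic example: stations 0 and 1 share a weather signal, station 2 does not.
rng = np.random.default_rng(0)
shared = rng.normal(size=200)
history = np.stack(
    [
        shared + 0.1 * rng.normal(size=200),
        shared + 0.1 * rng.normal(size=200),
        rng.normal(size=200),
    ],
    axis=1,
)                                                     # (200, 3)

adj = correlation_adjacency(history)
adj_norm = normalize_adjacency(adj)
node_features = rng.normal(size=(3, 4))               # 4 features per station
weight = rng.normal(size=(4, 8))                      # untrained, for shape only
hidden = graph_conv(node_features, adj_norm, weight)
print(adj, hidden.shape)
```

Stations 0 and 1 end up connected and therefore share information in `hidden`, whereas station 2 only sees its own features, which mirrors the contrast drawn between model 2 and the station-by-station model 1.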
The experimental results of models 3 and 4 indicate that the addition of IAFF or CBAM attention layers improves the prediction performance of model 2. Specifically, after extracting the spatio-temporal feature information with ST-GCN, adding the CBAM attention layer significantly enhances model 2’s prediction performance by emphasizing the important spatio-temporal feature information. For example, in the spring experiments, for Dataset I, model 3 and model 4 improve the by 0.87% and 1.73%, respectively, compared to model 2. The decreased by 0.09 MW and 0.103 MW, and the decreased by 0.058 MW and 0.137 MW, respectively. For Dataset II, model 3 and model 4 improve the by 2.41% and 3.8%, respectively, compared to model 2. The decreased by 0.191 MW and 0.354 MW, and the decreased by 0.233 MW and 0.395 MW, respectively.
Moreover, the experimental results indicate that the proposed model outperforms model 4. Although model 4 focuses on the spatio-temporal features extracted by ST-GCN, which enhances model 3’s prediction performance, it does not process the initial node feature matrix with the IAFF attention layer, thus failing to extract beneficial features for prediction. This limitation restricts model 4’s prediction performance. For example, in the spring experiments, for Dataset I, compared to model 4, the proposed model improves the by 0.74%, reduces by 0.136 MW, and reduces by 0.057 MW. For Dataset II, the proposed model improves the by 1.05%, reduces by 0.155 MW, and reduces by 0.141 MW compared to model 4.
Therefore, the proposed model builds on the ST-GCN network with a dual-layer attention mechanism. The first attention layer learns and integrates both the local and global features of the initial node feature matrix, producing a more comprehensive node feature matrix. The second layer, applied after the ST-GCN extracts the spatio-temporal features, enhances the useful spatio-temporal features and suppresses the less important ones. Together, the two layers significantly improve the model’s prediction accuracy and stability.
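To make the idea of enhancing useful features and suppressing less important ones concrete, the sketch below implements only the channel-attention branch of a CBAM-style module in NumPy. It is an assumption-laden illustration: the tensor layout (channels, stations, time steps), the reduction to two hidden units, and the random untrained weights are all hypothetical, and the spatial-attention branch of CBAM and the IAFF layer are omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(features, w1, w2):
    """CBAM-style channel attention.

    features: array of shape (channels, stations, time_steps)
    w1:       (hidden, channels) weights of the shared MLP's first layer
    w2:       (channels, hidden) weights of the shared MLP's second layer
    Returns the re-weighted features and the per-channel weights.
    """
    # Squeeze each channel to one number by average and by max pooling.
    avg_pool = features.mean(axis=(1, 2))   # (channels,)
    max_pool = features.max(axis=(1, 2))    # (channels,)

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)

    # Both descriptors pass through the same MLP and are summed.
    weights = sigmoid(shared_mlp(avg_pool) + shared_mlp(max_pool))
    # Channels with weights near 1 are kept, those near 0 are suppressed.
    return features * weights[:, None, None], weights


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 6))       # 4 channels, 3 stations, 6 steps
w1 = rng.normal(size=(2, 4))                # untrained, illustrative weights
w2 = rng.normal(size=(4, 2))

refined, weights = channel_attention(features, w1, w2)
print(weights)
```

In a trained model the MLP weights are learned jointly with the rest of the network, so the per-channel weights reflect which spatio-temporal features help the forecast.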
To analyze the residual distributions in the ablation experiments, the residual distributions of the proposed model and models 1 to 4 are presented in Figure 3, using the spring experiment results as an example. For Dataset I, the residuals of the proposed model are primarily concentrated between −0.428 MW and 0.353 MW, with a median of approximately −0.013 MW. For Dataset II, the residuals are mainly concentrated between −0.635 MW and 0.244 MW, with a median of around −0.186 MW. Compared to models 1 to 4, the proposed model’s residual distribution in both datasets is notably narrow and peaked, indicating that the residuals are closely clustered around the median. This reflects the stability and precision of the proposed model’s predictions. Furthermore, these findings demonstrate that the improved spatio-temporal graph prediction model with a dual-attention mechanism substantially enhances the prediction performance for regional distributed PV power stations.
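A residual summary of the kind quoted above could be computed along the following lines. The sketch makes two assumptions that the text does not confirm: residuals are taken as actual minus predicted power, and the range in which residuals are "concentrated" is taken to be the interquartile range. The function name and the toy data are hypothetical.

```python
import numpy as np


def residual_summary(actual, predicted, lower_q=25.0, upper_q=75.0):
    """Median and central range of the residuals (actual - predicted)."""
    residuals = np.asarray(actual, dtype=float) - np.asarray(predicted, dtype=float)
    lower, median, upper = np.percentile(residuals, [lower_q, 50.0, upper_q])
    return {
        "lower": float(lower),
        "median": float(median),
        "upper": float(upper),
        "spread": float(upper - lower),   # narrower means tighter clustering
    }


# Toy power values in MW.
actual_mw = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_mw = [1.5, 2.0, 2.0, 4.5, 4.0]

summary = residual_summary(actual_mw, predicted_mw)
print(summary)
```

Comparing the `spread` value across models gives a single number for how narrow each residual distribution is, which complements the visual comparison in the figure.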
To illustrate the actual forecasting performance of the proposed model and models 1 to 4 without selection bias, PV power stations S0 and S8 from Dataset I and stations no. 2 and no. 42 from Dataset II were selected at random for the analysis of the actual prediction results.
Using the spring results of the ablation experiment as an example, the actual power forecasts for the PV power stations are shown in Figure 4, Figure 5 and Figure 6.
Figure 4 shows that the proposed model maintains lower
and
indices compared to models 1 to 4 when predicting multiple stations simultaneously.
Figure 5 and Figure 6 illustrate that the proposed model achieves a closer fit to the actual values for stations S0, S8, no. 2, and no. 42, with smoother curves and better prediction performance than models 1 to 4.
Additionally, the power prediction results of the proposed model and ablation models 1 to 4 are analyzed further under complex weather conditions. Figure 5a shows that both the proposed model and models 1 to 4 capture the actual power trends under complex weather conditions. However, on the second prediction day of PV station S0, although the proposed model’s predictions still deviate somewhat from the actual values, they are closer to the actual values than those of models 1 to 4. Furthermore, as illustrated in Figure 6, during the fourth and fifth prediction days for PV stations no. 2 and no. 42, the proposed model fits the actual trend better, especially during the periods of local fluctuations, and its predictions are closer to the actual values in the trough periods. Taken together, Figure 5 and Figure 6 show that the proposed model maintains better prediction performance under complex weather conditions than models 1 to 4.
Therefore, the proposed model, employing an enhanced spatio-temporal graph prediction method with a dual-layer attention mechanism, demonstrates excellent prediction performance in regional PV power station group forecasting.
4.3.3. Evaluation of Ensemble and Proposed Models for Power Forecasting Performance
To further evaluate the forecasting effectiveness of the proposed model against various ensemble models for power forecasting in PV power plants, three ensemble models were selected for comparison: the CNN-Transformer prediction model (model 5), the ST-GCN-BITCN prediction model (model 6), and the ST-GCN-CNN-Transformer prediction model (model 7). Under the initial conditions specified in Section 4.2, models 5 to 7 were retrained using data from various regions and seasons. The prediction results for these ensemble models are presented in Table 9 and Table 10.
The experimental findings for the proposed model and model 5 reveal that the proposed model markedly outperforms model 5. Using the spring experimental results of Dataset I as an example, the proposed model demonstrates an improvement in of 2.9% compared to model 5. Additionally, the is reduced by 0.203 MW and the is reduced by 0.217 MW. This is because model 5 predicts each station in the regional distributed PV power stations individually, without considering the complex spatio-temporal relationships between the plants. Instead, it relies solely on the CNN model to independently extract the features from each node, which limits its predictive performance.
The experimental findings for the proposed model and model 6 indicate that, despite model 6’s use of the ST-GCN network to account for the temporal and spatial characteristics between PV power plants, its predictive performance is limited by the BITCN’s constrained receptive field of the convolutional kernel. This limitation hampers its ability to capture long-term dependencies. Using the spring experimental results of Dataset I as an example, the proposed model demonstrates improvements in of 1.9% compared to model 6. Additionally, the is reduced by 0.202 MW and the is reduced by 0.162 MW.
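The receptive-field limitation mentioned above can be quantified with the usual formula for a stack of dilated one-dimensional convolutions: each layer with kernel size k and dilation d extends the receptive field by (k − 1)·d time steps. The sketch below applies that formula; the kernel size and dilation schedule are hypothetical, because the BITCN configuration is not given in this section, and for a bidirectional network the result applies to each direction separately.

```python
def tcn_receptive_field(kernel_size, dilations, convs_per_block=1):
    """Receptive field, in time steps, of stacked dilated convolutions.

    Every convolution with dilation d adds (kernel_size - 1) * d steps.
    convs_per_block covers residual blocks that repeat the convolution
    at the same dilation (an assumed design, commonly 2)."""
    return 1 + convs_per_block * (kernel_size - 1) * sum(dilations)


# Hypothetical configuration: kernel size 3, dilations doubling per layer.
single_conv = tcn_receptive_field(3, [1, 2, 4, 8])
double_conv = tcn_receptive_field(3, [1, 2, 4, 8], convs_per_block=2)
print(single_conv, double_conv)
```

Any dependency longer than this number of time steps is invisible to the convolutional stack unless more layers or larger dilations are added, which is the constraint attributed to model 6.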
The experimental findings for the proposed model and model 7 indicate that, while model 7 improves upon model 5 by incorporating the ST-GCN network and thereby accounting for the spatio-temporal characteristics between PV power plants, its predictive performance still requires enhancement. There are two reasons: (1) the initial node feature matrix is not processed, which prevents the extraction of useful feature information for the prediction model; and (2) although the CNN network further exploits the spatial characteristics captured by the ST-GCN network, the temporal features extracted by the ST-GCN are not enhanced, which limits the predictive performance of model 7. Using the spring experimental results of Dataset I as an example, the proposed model demonstrates improvements in of 0.9% compared to model 7. Additionally, the is reduced by 0.113 MW and the is reduced by 0.089 MW.
To analyze the residual distributions in the ensemble experiments, the residual distributions of the proposed model and models 5 to 7 in the different combination experiments are shown in Figure 7, using the spring experiment results as an illustration. Compared to models 5 to 7, the proposed model exhibits narrower and higher-peaked residual distributions in both Dataset I and Dataset II, indicating that the residuals of the proposed model are closer to the median. Moreover, the residual distributions of the proposed model across different datasets are highly similar, further demonstrating the stability and accuracy of its predictive performance. This suggests that the proposed model has broad applicability and strong generalization capability.
To visually illustrate the forecast precision of the proposed model compared to models 5 to 7, and to ensure a fair comparison, PV power stations S0, S8, no. 2, and no. 42 were again chosen for detailed analysis, with the spring results used as examples. The actual power prediction results for these PV power stations are shown in Figure 8, Figure 9 and Figure 10.
Figure 8 shows that the proposed model surpasses models 5 to 7 in terms of the
and
forecasting metrics.
Figure 9 and Figure 10 illustrate that the proposed model fits the actual power better for PV power stations S0, S8, no. 2, and no. 42. Compared to models 5 to 7, the proposed model’s predictions are more closely aligned with the actual values, with smoother curves and higher overall forecasting accuracy.
Additionally, the performance of the proposed model and ensemble models 5 to 7 in predicting the PV station power is analyzed further under complex weather conditions. Figure 9a shows that, under complex weather conditions, the actual power values of PV station S0 on the second prediction day fluctuate significantly during several periods. The proposed model fits the actual values closely during these periods, demonstrating strong predictive performance. In contrast, models 5 to 7 show considerable deviations from the actual values in some periods, with overall smoother fluctuations in their curves. Similarly, as illustrated in Figure 10, during the fourth and fifth prediction days for PV stations no. 2 and no. 42, the proposed model’s predictions closely match the actual values, especially during the peaks and troughs in fluctuating periods. Conversely, models 5 to 7 show larger deviations during these periods. The analysis of Figure 9 and Figure 10 indicates that the proposed model not only captures the actual power trends more accurately but also maintains better predictive performance across different fluctuating periods compared to models 5 to 7.
Therefore, in the ensemble model experiments, the proposed model, utilizing an improved spatio-temporal graph prediction method with a dual-layer attention mechanism, demonstrates superior predictive performance. When evaluated across different datasets, it also exhibits broad applicability and strong generalization capabilities.