4.2.2. Wind Generation

The performance of the different models on the wind power generation dataset is presented in Table 13. The models clearly performed poorly in this use case, each exhibiting very high error rates and low CC values. Although low, the CC values measured against the target data are, on average, similar across the models. In contrast, high positive CC values are obtained when the models are compared with one another, as shown in the correlation matrix of Figure 7. This confirms that the models behaved similarly, with almost perfect correlation between their predicted values.


**Table 13.** Performance of the different methods for wind power generation.

The error rates measured via the RAE and RRSE indicate that the ANN generated the highest error and is thus the poorest performer. The RRSE value of each model is higher than its corresponding RAE value, which indicates the presence of outliers (large individual errors) in the predictions of all models under the wind prediction use case. On the other hand, the SVM model emerged as the best performer, achieving a 21.18% reduction in the RAE compared with the ANN model.
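To make the RAE/RRSE comparison concrete, the following minimal sketch computes both metrics under their standard definitions, which this section does not restate and which are therefore an assumption here. The data values and the names `uniform` and `spiky` are purely illustrative and are not taken from Table 13.

```python
from math import sqrt


def rae(actual, predicted):
    """Relative absolute error (%), relative to always predicting the mean."""
    mean_a = sum(actual) / len(actual)
    num = sum(abs(p - a) for a, p in zip(actual, predicted))
    den = sum(abs(a - mean_a) for a in actual)
    return 100.0 * num / den


def rrse(actual, predicted):
    """Root relative squared error (%), relative to always predicting the mean."""
    mean_a = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean_a) ** 2 for a in actual)
    return 100.0 * sqrt(num / den)


# Illustrative (made-up) data: both predictors have the same total absolute
# error, but `spiky` concentrates it in a single large miss.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
uniform = [1.5, 2.5, 3.5, 4.5, 5.5]  # error of 0.5 on every sample
spiky = [1.0, 2.0, 3.0, 4.0, 7.5]    # one error of 2.5, others exact

for name, pred in (("uniform", uniform), ("spiky", spiky)):
    print(f"{name}: RAE={rae(actual, pred):.2f}%  RRSE={rrse(actual, pred):.2f}%")
```

Because the RRSE squares each error before summing, a few large misses raise it while leaving the RAE unchanged, which is the behaviour the outlier interpretation above relies on.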

**Figure 7.** The correlation matrix of the different methods for the wind power generation dataset.
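A correlation matrix of the kind shown in Figure 7 can be sketched as follows, assuming the CC is the Pearson correlation coefficient. The series names (`target`, `model_a`, `model_b`) and all values are hypothetical placeholders, not the study's data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Return a nested dict with the pairwise correlation of every series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical target values and predictions from two models.
series = {
    "target": [5.0, 1.0, 8.0, 0.0, 6.0, 2.0],
    "model_a": [4.0, 3.5, 5.0, 3.6, 4.6, 3.9],
    "model_b": [4.2, 3.6, 5.1, 3.8, 4.7, 4.0],
}

matrix = correlation_matrix(series)
for row_name, row in matrix.items():
    print(row_name, {col: round(value, 3) for col, value in row.items()})
```

Reading the off-diagonal entries between models separately from the entries in the target row is what distinguishes "the models agree with each other" from "the models agree with the target".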

A Tukey test was conducted to examine the differences in the mean values of the models, and the results obtained are presented in Table 14. We observed that, unlike in the PV power prediction and the system hourly demand datasets, there was a significant difference between the mean performance of the different models and the target data. This can be seen in column 4 of Table 14, where the associated *p*-values are very low and where the performance of the ANN model is also indicated to be significantly different from that of all other models.


**Table 14.** Wind power generation: Tukey test comparison of the performance of the different models.
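As a rough illustration of what a Tukey comparison computes, the sketch below derives the pairwise studentized range statistic q from the one-way ANOVA within-group mean square, using the Tukey-Kramer form of the standard error. It is not the procedure used to produce Table 14; the group names and values are hypothetical, and converting q to a *p*-value requires the studentized range distribution (available, for example, as `scipy.stats.studentized_range` in recent SciPy releases), which is deliberately left out here.

```python
from itertools import combinations
from math import sqrt


def tukey_q_statistics(groups):
    """Return ({(a, b): q}, within-group degrees of freedom) for all pairs.

    `groups` maps a group name to a list of observations.
    """
    k = len(groups)
    n_total = sum(len(values) for values in groups.values())
    df_within = n_total - k
    if df_within <= 0:
        raise ValueError("need more observations than groups")

    means = {name: sum(v) / len(v) for name, v in groups.items()}
    ss_within = sum(
        (x - means[name]) ** 2 for name, v in groups.items() for x in v
    )
    ms_within = ss_within / df_within
    if ms_within == 0:
        raise ValueError("zero within-group variance")

    q = {}
    for a, b in combinations(groups, 2):
        # Tukey-Kramer standard error; reduces to sqrt(MSW / n) for equal n.
        se = sqrt(ms_within / 2.0 * (1.0 / len(groups[a]) + 1.0 / len(groups[b])))
        q[(a, b)] = abs(means[a] - means[b]) / se
    return q, df_within


# Hypothetical samples: two similar predictors and one that deviates strongly.
groups = {
    "model_a": [1.0, 2.0, 3.0],
    "model_b": [2.0, 3.0, 4.0],
    "model_c": [10.0, 11.0, 12.0],
}

q_values, df = tukey_q_statistics(groups)
for pair, q in q_values.items():
    print(pair, round(q, 3), "df =", df)
```

A pair is declared significantly different when its q exceeds the critical value of the studentized range distribution for the given number of groups and degrees of freedom.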

Finally, a visual assessment of the predicted values of the different models can be made based on Figure 8. We observed that the different models only matched the rising patterns of the target data while failing to track periods of low wind power generation. This implies that the inherent irregularities in the wind power generation pattern limited the output performance of the different models. We also observed that the predicted values of the ANN model deviated considerably from the target as well as from the other models, which explains its poor performance as noted in Tables 13 and 14. Consequently, because of the highly stochastic nature of wind, it may be difficult to apply ML models to wind power generation prediction, which warrants improved methods in this regard.

**Figure 8.** Wind generation: Target and predicted demand generated by the different models.

4.2.3. Runtime Performance of the Different Algorithms

We performed a runtime evaluation of the various algorithms on both the PV and wind datasets, and the results are shown in Table 15. To begin, it is important to note that the following conditions were met prior to conducting these experiments:



**Table 15.** Timing performance of the different algorithms under both the PV and wind datasets.

Table 15 shows the empirical runtime results of each algorithm. Note that because k-NN is a lazy (instance-based) learner that simply stores the training data, it requires no training process, and thus no training-time results are provided for it. According to these results, the LR achieved the shortest training time on both datasets, while the SVM algorithm achieved the quickest testing time on the PV dataset and the same testing time as the ANN and LR on the wind dataset. Because testing time is often what matters most to the user during real-time operation, we note that the SVM performed best; however, the statistical significance analysis of these timing results in Table 16 shows otherwise.
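Training and testing times of the kind reported in Table 15 can be measured separately along the following lines. This is only a sketch under stated assumptions: the helper `time_call` and the toy one-variable least-squares model (`fit_line`, `predict_line`) are hypothetical stand-ins, not the study's algorithms or measurement setup.

```python
import time


def time_call(fn, *args, repeats=5):
    """Run fn(*args) `repeats` times; return (best elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def fit_line(xs, ys):
    """Toy 'training' step: ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_line(model, xs):
    """Toy 'testing' step: apply the fitted line to unseen inputs."""
    slope, intercept = model
    return [intercept + slope * x for x in xs]


# Made-up data lying exactly on y = 1 + 2x.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]
test_x = [4.0, 5.0]

train_time, model = time_call(fit_line, train_x, train_y)
test_time, predictions = time_call(predict_line, model, test_x)
print(f"train: {train_time:.6f} s, test: {test_time:.6f} s")
```

Taking the best of several repeats reduces the influence of background load on the measurement; reporting the mean over repeats instead is an equally defensible choice, particularly when the timings are later fed into a significance test.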

**Table 16.** Statistical significance test (Tukey's comparison test) of the test time of the different algorithms.


It should be noted that only the test time results of Table 15 for both the PV and wind datasets were subjected to the Tukey statistical test. The Tukey test results in Table 16 reveal that there were no statistically significant (ns) differences in the test time of the different algorithms, except for the GR algorithm, which yielded the longest test time compared with the other methods. The GR algorithm's relatively slower performance may be attributed to the Gaussian kernel, which is known to impose additional processing requirements on the method. However, because the difference in testing time was less than 0.195 s across all methods (see column 2 of Table 16), it is possible to conclude that any of these algorithms can be used for real-time power demand/supply prediction use cases in smart grid systems.
