**2. Related Work**

This section reviews related work, particularly studies concerned with prediction analysis in smart grids using various ML algorithms. Prediction models for microgrids are also discussed, many of which focus on power generation and consumption. These models typically apply ML techniques to forecast short-term and day-ahead electricity demand.

First, prediction errors can lead to an imbalance between power supply and demand; load forecasting is therefore essential for transmission system operators, because prediction accuracy affects power system operations. Improving energy demand prediction methods could thus make a power grid more stable. A comparison of different ML techniques for short-term demand prediction on microgrids was conducted in [11] to improve prediction accuracy. The ensemble prediction network (EPN) was compared with long short-term memory (LSTM), neural network, and multilayer perceptron models. The EPN technique outperformed the other forecasting methods when error was evaluated on a wide range of data. It was also shown that prediction accuracy influences the operational cost of energy. In [12], the kernel-based extreme learning machine (KELM) algorithm was compared to the extreme learning machine (ELM) and the Gaussian kernel for predicting short-term electricity prices on a yearly dataset from the Australian market. The KELM technique was shown to outperform other kernel methods, but the Gaussian kernel-based ELM was more efficient at dealing with complexities in electricity pricing data and accurately predicting the price profile pattern. An automated reinforcement learning-based multi-period method for forecasting renewable power generation and load was proposed in [13]. Compared to traditional scheduling methods, the proposed method, along with its forecasting capability, significantly reduced operating costs while computing at a faster rate. In a separate work [14], a least squares SVM (LS-SVM) coupled with the bacterial colony chemotaxis (BCC) optimization algorithm was proposed to improve the accuracy and speed of short-term load forecasting. The method achieved better accuracy and faster processing speed than the ANN and the LS-SVM based on grid search.

Various prediction techniques have been proposed to ensure that the amount of energy generated meets the demands of consumers, and some methods have been developed to enhance existing ones. For example, in [15], the ANN was compared with multi-variable regression (MVR) and the support vector machine (SVM) for improving energy dispatch in a residential microgrid based on solar power forecasting. The ANN model was the most efficient in this case, providing an accurate model for forecasting hourly irradiance and generated power. In [16], K-means was presented as a new algorithm for predicting irradiance intervals for stochastic generation in microgrids. In contrast, and reflecting the improvements that come as technology advances, the authors of [17] favored the regression technique. They demonstrated that it yields improved performance, with longer reliability and less processing time, for the prediction of power generated in microgrids. Power forecasting will also be vital to the success of future energy management schemes, such as transactive energy models [18]. In addition, IoT devices in smart grids will require efficient communication protocols for transmitting forecast data to a remote or cloud server. An efficient interface between CoAP and OSGP was proposed for this purpose in [19]; it can ensure that data are exchanged effectively between IoT devices used in home and industrial applications and an SG infrastructure. Other device development and prediction concepts that can inform systems for smart grid prediction use cases are presented in [20].

Furthermore, many methods have been used to forecast energy consumption, from an hour ahead to a day ahead, depending on various weather conditions. For comparison purposes, Ref. [21] stated that the SVM outperformed other algorithms for hourly prediction of load power consumption in a building. Day-ahead power consumption prediction algorithms are based on either ML or AI. In [22], a hybrid AI model (a combination of feed-forward artificial neural network (FFANN), wavelet transform (WT), and simulated annealing (SA)) was used to predict power demand for a day ahead. The hybrid model was shown to be more efficient than a single method, such as the neural network technique implemented in [23] for similar day-ahead prediction conditions. Ref. [24] focused on the use of ensemble learning techniques to predict the power consumption of a building given weather forecast information. The authors noted that gradient boosted trees yielded the best performance among the ensemble methods used. Ref. [25] evaluated different AI algorithms (ARIMA, SARIMA, SVM, XGBoost, RNN, LSTM, and LSTM-RNN) at a university campus microgrid to predict power consumption. The authors suggested that RNN, LSTM, and LSTM-RNN provided the best MSR, MAE, MAPE, and R-squared when compared to the other techniques used.
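For concreteness, the kind of error metrics on which such comparisons rest can be sketched as follows. This is a minimal illustration using the standard textbook definitions of MAE, MAPE, and R-squared, together with the mean squared error (MSE) as a common squared-error measure; it is not the evaluation code of any cited study, and the function names are hypothetical.

```python
# Illustrative sketch only: textbook definitions of common forecast error metrics.
# Function names are hypothetical and not taken from any cited study.

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.
    Undefined if any actual value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

if __name__ == "__main__":
    # Made-up hourly load values (kW), purely for demonstration.
    actual = [100.0, 200.0, 300.0, 400.0]
    forecast = [110.0, 190.0, 310.0, 390.0]
    print(mae(actual, forecast), mse(actual, forecast))
    print(mape(actual, forecast), r_squared(actual, forecast))
```

Because these metrics weight errors differently (MAPE, for instance, penalizes errors at low load more heavily than MAE does), a ranking of algorithms under one metric need not hold under another.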

Table 1 summarizes these related comparative studies. Consistent with previous observations in the literature, many of these studies compare only a few ML algorithms, and frequently only within the same class.

**Table 1.** Summary of the related studies, with key characteristics from each study compared to the others.


Furthermore, only a few metrics are used to compare these methods, which tends to bias the conclusions that can be drawn from the comparison exercise. Most importantly, none of the recently published studies performed a statistical significance test on their results. Their conclusions may therefore be biased, making it difficult to determine which algorithm truly performs best. Also notable is the absence of timing performance results, which limits an ML designer's ability to make appropriate choices. Finally, the findings of these studies demonstrate that no single ML algorithm performs best across all studies. In the absence of thorough statistical significance tests, many of these conclusions may not be truly reliable. Thus, in this article, we conduct an independent study of these well-known ML algorithms in order to determine, based on thorough significance tests, whether there is any significant difference in their performance. Our findings will help to inform the research community in this area, as well as assist designers in making sound decisions when developing smart grid systems.
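To illustrate what a significance test adds to a raw metric comparison, the following minimal sketch applies a paired sign-flip permutation test to the per-sample absolute errors of two forecasters evaluated on the same test points. This is one common choice and an assumption made here for illustration; it is not necessarily the test used in this article. The function name, parameters, and data are hypothetical.

```python
# Illustrative sketch only: a paired sign-flip permutation test on the
# per-sample errors of two forecasting models. Names are hypothetical.
import random

def paired_permutation_test(errors_a, errors_b, n_resamples=10000, seed=0):
    """Two-sided p-value for the null hypothesis that models A and B have
    the same mean error, given errors measured on the same test points."""
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null, the sign of each paired difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / n) >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)

if __name__ == "__main__":
    # Made-up absolute errors for two models on the same 30 test points.
    model_a = [5.0 + 0.1 * i for i in range(30)]
    model_b = [1.0 + 0.1 * i for i in range(30)]
    print(paired_permutation_test(model_a, model_b))
```

A small p-value suggests that the observed difference in mean error is unlikely to arise by chance; a large one indicates that the apparent ranking of the two models may not be reliable, even when one has the lower average error.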
