**5. Conclusions**

This paper aims to design a machine-learning-based electricity load forecasting system. We conducted two primary studies: an exploratory data analysis, to investigate the correlation between weather parameters and electricity load data, and a feature selection optimization for the machine learning forecasting model. We use a statistical method, the correlation coefficient (CC), to select the weather parameters that are highly correlated with the electricity load data. The selected parameters are then used as input for the machine-learning-based electricity forecasting model, which is not itself a statistical method. Nevertheless, our results show that this statistically based feature selection step significantly affects the prediction accuracy of the machine learning model and improves it.
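The CC-based selection step can be illustrated with a minimal sketch. It assumes the Pearson correlation coefficient and uses synthetic data; the function name `rank_features_by_cc` and the variable names are hypothetical and are not taken from the system described in this paper.

```python
import numpy as np

def rank_features_by_cc(features, load):
    """Return (name, cc) pairs sorted by absolute Pearson correlation with the load."""
    scores = {
        name: float(np.corrcoef(values, load)[0, 1])
        for name, values in features.items()
    }
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Synthetic illustration only: the load depends on temperature but not on rainfall.
rng = np.random.default_rng(0)
temperature = rng.normal(27.0, 2.0, 500)
rainfall = rng.normal(5.0, 1.0, 500)
load = 30.0 * temperature + rng.normal(0.0, 5.0, 500)

ranking = rank_features_by_cc(
    {"temperature": temperature, "rainfall": rainfall}, load
)
for name, cc in ranking:
    print(f"{name}: CC = {cc:.3f}")
```

Features at the top of such a ranking would be the first candidates to feed into the forecasting model.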

Results from the exploratory data analysis show that three weather parameters are highly correlated with the electricity load in Bali: 2 m temperature, net solar radiation, and wind speed. Other weather parameters, such as rainfall rate, pressure, and relative humidity, are less correlated. To investigate the effect of weather parameters as input features for the machine learning model, we ran scenarios in which weather parameters were added one by one, from the most to the least correlated. For the GRNN model, the best performance is achieved in the scenario that uses only 2 m temperature, with a CC value of 0.937 and an RMSE value of 41.72. For the SVR model, the best performance is achieved in the scenario that uses 2 m temperature and net solar radiation, with a CC value of 0.934 and an RMSE value of 48.88. The GRNN predicts better than the SVR in terms of both CC and RMSE, as shown in scenario-2 in Section 4.1. This result may be related to the fact that the GRNN has only one parameter to optimize, the spread parameter, whereas the SVR has several, e.g., the type of kernel function, the regularization parameter, the kernel coefficient, and the polynomial degree. Therefore, optimizing the GRNN is more straightforward than optimizing the SVR. Moreover, the GRNN is a model with strong nonlinear mapping capabilities, which makes it suitable for the electricity load forecasting problem with weather parameter features considered in this paper.
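To make the role of the single spread parameter concrete, the following is a minimal sketch of a GRNN prediction written as a Gaussian-kernel weighted average of the training targets. It is an illustration under stated assumptions, not the implementation used in this paper; the function name `grnn_predict` and the toy data are hypothetical.

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, spread):
    """GRNN output: Gaussian-weighted average of training targets.

    x_train: (n_train, n_features), y_train: (n_train,),
    x_query: (n_query, n_features), spread: kernel width (the only tunable parameter).
    """
    # Squared Euclidean distance between every query and every training pattern.
    diff = x_query[:, None, :] - x_train[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Pattern-layer activations.
    weights = np.exp(-sq_dist / (2.0 * spread ** 2))
    # Summation and output layers: normalized weighted sum of targets.
    return weights @ y_train / np.maximum(weights.sum(axis=1), 1e-300)

# Toy data for illustration only.
x_train = np.array([[0.0], [1.0], [2.0]])
y_train = np.array([10.0, 20.0, 30.0])

# A narrow spread reproduces the nearest training target.
narrow = grnn_predict(x_train, y_train, np.array([[1.0]]), spread=0.1)
# A very wide spread approaches the mean of all training targets.
wide = grnn_predict(x_train, y_train, np.array([[0.0]]), spread=100.0)
print(narrow, wide)
```

Because the spread is the only quantity to tune, a one-dimensional search over candidate values on a validation set is sufficient, in contrast to the multi-parameter search that the SVR requires.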

To improve the prediction performance, we also investigated adding another feature to the machine learning forecasting model: the moving average (MA) of the historical electricity load data itself. We investigated three moving average scenarios, i.e., monthly, weekly, and daily moving averages. The scenario with the MA-monthly feature performs worse than the scenario without it. The other two scenarios, MA-weekly and MA-daily, perform better than the scenario without MA data. The best performance is achieved with MA-daily data: the GRNN model gives a CC value of 0.956 and an RMSE value of 28.82, and the SVR model gives a CC value of 0.965 and an RMSE value of 44.40. In conclusion, the GRNN model performs better than the SVR model in terms of RMSE, although the SVR attains a slightly higher CC. Including moving average electricity load data is feasible when the forecasting system can obtain near real-time realization (observation) data of the electricity load.
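A moving average feature of this kind can be sketched as follows. The sketch assumes a trailing window that excludes the value being predicted, so that only past observations enter the feature; whether the MA in this paper is defined in exactly this way is an assumption, and the function name `trailing_moving_average` is hypothetical.

```python
import numpy as np

def trailing_moving_average(load, window):
    """Mean of the `window` observations preceding each time step.

    Entries without enough history are NaN. The current value is excluded,
    so the feature uses only data available before the prediction time.
    """
    load = np.asarray(load, dtype=float)
    ma = np.full(load.shape, np.nan)
    csum = np.concatenate(([0.0], np.cumsum(load)))
    # ma[t] = mean(load[t - window : t]) for t >= window
    ma[window:] = (csum[window:-1] - csum[:-window - 1]) / window
    return ma

# Toy series for illustration only.
toy_load = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ma2 = trailing_moving_average(toy_load, window=2)
print(ma2)
```

The window length would be chosen to match the daily, weekly, or monthly horizon at the sampling resolution of the load data, for example 24 samples for a daily average if the data were hourly (an assumption for illustration).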

Several points can be investigated further in future research. Firstly, to further improve the accuracy of the electricity load prediction, more advanced machine learning models, e.g., deep learning models, can be investigated. Secondly, in an area that is connected to multiple electricity grid systems, the correlation between weather parameters and electricity load can be low. Therefore, a new feature selection technique is needed to design electricity load forecasting for this type of area.

**Author Contributions:** Conceptualization, S.A., D.A. and A.A.; methodology, S.A. and A.A.S.; software, I.A.A.; validation, D.A. and A.A.; formal analysis, S.A. and D.A.; investigation, I.A.A.; resources, A.A.S.; data curation, I.A.A.; writing—original draft preparation, S.A.; writing—review and editing, D.A. and A.A.; visualization, I.A.A.; supervision, D.A.; project administration, S.A.; funding acquisition, I.A.A. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by PT PLN (Persero) Puslitbang Ketenagalistrikan with contract number: 0020.Pj/LIT.00.02/C0000000/2021.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
