**3. Exploratory Data Analysis**

This section investigates the relationship between the weather parameters and the electricity load in Bali by calculating how strongly they correlate with each other. To quantify the correlation between two variables, we employ the correlation coefficient (CC), which measures how closely the two variables are related, in particular how closely their trends follow one another. The correlation coefficient is defined as follows:

$$\text{CC} = \frac{\operatorname{cov}(X, Y)}{\sigma_x \sigma_y} \tag{9}$$

where *X* and *Y* are the variables being compared, *cov*(*X*, *Y*) denotes the covariance between the two variables, and *σ<sub>x</sub>* and *σ<sub>y</sub>* denote the standard deviations of *X* and *Y*, respectively. In this paper, we use Formula (9) to calculate the correlation between the electricity load and the weather parameters, namely 2 m temperature, net solar radiation, wind speed, rainfall rate, pressure, and relative humidity.
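As an illustration of Formula (9), the following minimal sketch computes the CC for two equally long series in plain Python. The function name `correlation_coefficient` and the toy values are hypothetical and are not taken from the Bali dataset; the sketch also assumes the population form (division by *n*) for both the covariance and the standard deviations, which the text does not specify. Because the same normalisation appears in the numerator and denominator, the result does not depend on that choice.

```python
import math


def correlation_coefficient(x, y):
    """Return CC = cov(X, Y) / (sigma_x * sigma_y), following Formula (9)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations (assumed normalisation).
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    # A constant series has zero standard deviation, so CC is undefined there
    # and this division raises ZeroDivisionError.
    return cov_xy / (sigma_x * sigma_y)


# Hypothetical toy values for illustration only (not the Bali data).
temperature = [26.1, 27.4, 28.0, 26.8, 29.2, 27.9]
load = [812.0, 845.0, 870.0, 830.0, 905.0, 858.0]

cc = correlation_coefficient(temperature, load)
print(f"CC = {cc:.2f}")
```

A value close to 1 indicates that the two series rise and fall together, a value close to −1 indicates opposite trends, and a value near 0 indicates no clear linear relationship.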

Figure 6 compares the electricity load in Bali during 2019 with temperature, solar radiation, and wind speed, whereas Figure 7 shows the same comparison for rainfall rate, pressure, and relative humidity. In Figures 6 and 7, the electricity load is plotted as a blue line against the left-hand *y*-axis, and the weather parameter as a red line against the right-hand *y*-axis. Figure 6 shows that temperature and solar radiation follow a trend very similar to that of the electricity load, which indicates that these two weather parameters have a high positive correlation with the electricity load in Bali. For the wind speed, shown in the lower part of Figure 6, the trend runs in the opposite direction to that of the electricity load, indicating that wind speed and electricity load are negatively correlated.

**Figure 6.** Plots of electricity load in Bali during 2019 in comparison with weather parameters: (**a**) temperature; (**b**) solar radiation; (**c**) wind speed. The electricity load is read on the left *y*-axis, whereas the weather parameter is read on the right *y*-axis.

In Figure 7, the rainfall rate shows a lower correlation with the electricity load. The pressure shows a negative correlation with the electricity load, as the wind speed does. The trend of the relative humidity in relation to the electricity load is not very clear, which indicates a low correlation. Table 1 lists the correlation coefficient (CC) values between the electricity load in Bali and each weather parameter in Figures 6 and 7. Consistent with the qualitative picture in Figure 6, the weather parameter most strongly correlated with the electricity load is the 2 m temperature, followed by the net solar radiation, with CC values of 0.63 and 0.43, respectively. As also seen in Figure 6, the wind speed correlates negatively with the electricity load, with a CC value of −0.40, which is relatively high in magnitude. The remaining weather parameters, i.e., rainfall rate, pressure, and relative humidity, have lower CC magnitudes, with values of −0.18, −0.22, and 0.14, respectively. Based on this exploratory analysis, we conclude that three weather parameters have the highest correlation with the electricity load in Bali, i.e., 2 m temperature, net solar radiation, and wind speed. These parameters are used as features for the machine learning models discussed in the next section.
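The feature selection described above can be sketched as a ranking by absolute CC. The CC values below are those quoted in the text; the helper `top_k_by_abs_cc` and the choice of keeping the top three are assumptions made for illustration, since the text states which parameters were kept but not an explicit selection rule.

```python
# CC values between each weather parameter and the electricity load,
# as quoted in the text.
cc_values = {
    "2 m temperature": 0.63,
    "net solar radiation": 0.43,
    "wind speed": -0.40,
    "rainfall rate": -0.18,
    "pressure": -0.22,
    "relative humidity": 0.14,
}


def top_k_by_abs_cc(cc_by_feature, k):
    """Return the k feature names with the largest |CC| (hypothetical helper)."""
    ranked = sorted(
        cc_by_feature,
        key=lambda name: abs(cc_by_feature[name]),
        reverse=True,
    )
    return ranked[:k]


# k = 3 is an assumed cut-off chosen to match the three parameters named in the text.
selected = top_k_by_abs_cc(cc_values, k=3)
print(selected)
```

Ranking by the absolute value keeps strongly negative correlations such as that of the wind speed, which would be discarded if the signed values were compared directly.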

**Figure 7.** As in Figure 6, for other weather parameters: (**a**) rainfall rate; (**b**) pressure; (**c**) relative humidity.

**Table 1.** Correlation Coefficient (CC) between electricity load and various weather parameters.

