*4.4. Clustering Method*

Clustering load curves into groups is essential for identifying load patterns [28]. For comparison purposes, we implemented three different clustering methods: k-means, hierarchical, and fuzzy c-means clustering.

### 4.4.1. K-Means Clustering

K-means is the most popular hard clustering algorithm; its goal is to partition *n* data points into *k* clusters such that each point lies close to the centroid of its cluster [29]. The k-means method is implemented as follows: First, choose the number of clusters *K* and initialize *K* centroids. Second, assign each sample to the nearest centroid according to the distance. Third, compute *K* new centroids by taking the mean of the points in each cluster. Then, repeat the second and third steps until the centroids no longer change.
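
The iterative procedure above can be sketched with scikit-learn's `KMeans`; this is a minimal illustration rather than our exact experimental configuration, and the load-curve matrix `X` is a placeholder:

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder data: 100 load curves, each a 24-point daily profile.
X = np.random.rand(100, 24)

# K-means with K = 3; fitting alternates the assignment and centroid-update
# steps described above until the centroids stop changing.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)        # cluster index of each load curve
centroids = kmeans.cluster_centers_   # the final K centroids
```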

Determining the number of clusters is one of the major challenges in clustering. The elbow method finds an appropriate number of clusters by computing a score over a range of values of *K* [30]. In our study, we determined this parameter by analyzing the following two metrics: the distortion score and the Calinski–Harabasz score. The distortion score computes the within-cluster sum of squares (WCSS) to select the cluster number *K* [31], and it decreases as *K* increases. It is computed using Equation (4), as follows:

$$WCSS(K) = \sum_{h=1}^{K} \sum_{x_i \in c_h} \left\| x_i - \mu_h \right\|^2 \tag{4}$$

where *K* is the number of clusters, $c_h$ is the *h*th cluster, $\mu_h$ is the *h*th cluster center, and $\|x_i - \mu_h\|^2$ is the squared Euclidean distance between a data point $x_i$ and the centroid $\mu_h$ of the cluster it belongs to.
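
As an illustrative sketch, Equation (4) can be evaluated directly from the labels and centroids of a fitted model (using the placeholder `X`, `labels`, and `centroids` from the example above):

```python
import numpy as np

def wcss(X, labels, centroids):
    """Within-cluster sum of squares, Equation (4): the squared distances
    of all points x_i in cluster h to the cluster center mu_h, summed
    over all K clusters."""
    return sum(
        np.sum((X[labels == h] - centroids[h]) ** 2)
        for h in range(len(centroids))
    )
```

For a fitted scikit-learn `KMeans` model, the same quantity is also exposed as the `inertia_` attribute.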

We applied the Yellowbrick package to visualize the elbow method [32]. Figure 8 illustrates the WCSS values for different *K*. Applying the elbow method for 1 ≤ *K* ≤ 10, the distortion score decreases rapidly as *K* increases until *K* = 3 and then decreases gradually. We also employed the Calinski–Harabasz method in our study. It calculates the ratio of the between-cluster dispersion to the within-cluster dispersion over all clusters, as follows:

$$CH(K) = \frac{\sum_{h=1}^{K} n_h \left\| c_h - c \right\|^2}{\sum_{h=1}^{K} \sum_{x_i \in c_h} \left\| x_i - c_h \right\|^2} \cdot \frac{N - K}{K - 1} \tag{5}$$

where *N* is the total number of data points, *K* is the number of clusters, $n_h$ and $c_h$ are the number of points and the centroid of the *h*th cluster, respectively, and *c* is the centroid of all data points. A higher value of $\sum_{h=1}^{K} n_h \left\| c_h - c \right\|^2$ means the cluster centroids are well separated, while a lower value of $\sum_{h=1}^{K} \sum_{x_i \in c_h} \left\| x_i - c_h \right\|^2$ indicates that the points of each cluster are tightly grouped around their centroid. Therefore, the larger the value of the CH index, the more distinct the clusters.
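
Equation (5) is implemented in scikit-learn, so the CH index of a clustering can be sketched as follows (again using the placeholder `X` and `labels` from above):

```python
from sklearn.metrics import calinski_harabasz_score

# Higher CH score => better-separated, more compact clusters (Equation (5)).
ch = calinski_harabasz_score(X, labels)
print(f"Calinski-Harabasz score for K = 3: {ch:.2f}")
```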

Figure 9 shows the Calinski–Harabasz scores as the value of *K* changes; the score reaches its maximum at *K* = 3. Combined with the distortion score, this confirms that *K* = 3 is the optimal number of clusters.
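
As a sketch of how analyses like those in Figures 8 and 9 can be produced (not our exact plotting code), Yellowbrick's `KElbowVisualizer` supports both metrics; note that the upper bound of the `k` tuple is exclusive, and the CH score is undefined for *K* = 1, so that range starts at 2:

```python
from sklearn.cluster import KMeans
from yellowbrick.cluster import KElbowVisualizer

# Distortion (WCSS) elbow over 1 <= K <= 10, as in Figure 8.
viz = KElbowVisualizer(KMeans(n_init=10), k=(1, 11), metric="distortion")
viz.fit(X)
viz.show()

# Calinski-Harabasz scores over 2 <= K <= 10, as in Figure 9.
viz = KElbowVisualizer(KMeans(n_init=10), k=(2, 11), metric="calinski_harabasz")
viz.fit(X)
viz.show()
```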
