**2. Related Works**

Clustering is an unsupervised learning technique that groups similar data without requiring labels [10]. Clustering algorithms can be classified into partitioning, hierarchical, density-based, and grid-based algorithms [11]. The authors of [12] applied an improved K-means clustering method to load curves and verified that it performed better than the original K-means algorithm. The authors of [13] used a modified fuzzy c-means (FCM) algorithm to extract representative load profiles of customers. Ordering points to identify the clustering structure (OPTICS), a density-based clustering model, was used to analyze consumer bid-offers in [14]. Gaussian mixture model (GMM) clustering is widely used to segment households' load profiles for demand response [15].

However, most clustering algorithms cannot properly process high-dimensional data [16]. Most of the aforementioned works extracted consumption load patterns from hourly, 30-min, or 15-min load data. Advanced high-frequency smart meters, by contrast, can record load data at intervals of 1 min, 30 s, or even 1 s, producing large-scale consumption data that increases computational complexity. Most clustering algorithms assign each sample to a cluster based on distance, so high-dimensional data raises the cost of the distance calculations in every iteration and lengthens the running time. Hence, numerous studies have applied dimensionality reduction to load curve clustering, using feature extraction, feature construction, and feature selection. In [17], the authors developed electricity price schemes based on demand patterns, using K-means combined with principal component analysis (PCA). In [18], the authors used singular value decomposition to extract features before K-means clustering and evaluated the sum of squared errors (SSE) index to compare the result with direct clustering. In [19], the authors used a fused load curve K-means algorithm, with dimensionality reduction based on the Haar discrete wavelet transform, to obtain the load patterns of consumers from China and the United States, and evaluated clustering performance with four cluster validity indices (CVIs) [20]. Xiao et al. [21] proposed a fusion clustering algorithm, load curve clustering based on discrete wavelet transform (CC-DWT), to obtain consumption characteristics.
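
To make the dimensionality argument concrete, the following minimal sketch counts the readings in a daily profile at a given sampling interval and shortens a profile with a multi-level Haar transform before a Euclidean distance is computed. It is an illustration under stated assumptions, not the implementation of any cited work: the function names, the choice of three decomposition levels, and the two synthetic profiles are all hypothetical.

```python
import math

def samples_per_day(interval_seconds):
    """Number of readings in one daily load profile at a given sampling interval."""
    return 86_400 // interval_seconds

def haar_approximation(profile, levels):
    """Return the Haar approximation coefficients after `levels` decompositions.

    Each level replaces adjacent pairs (a, b) with (a + b) / sqrt(2), halving
    the length. The detail coefficients are discarded, so information is lost
    in general. Assumes the length is divisible by 2**levels.
    """
    coeffs = list(profile)
    for _ in range(levels):
        if len(coeffs) % 2:
            raise ValueError("profile length must be divisible by 2**levels")
        coeffs = [(coeffs[i] + coeffs[i + 1]) / math.sqrt(2)
                  for i in range(0, len(coeffs), 2)]
    return coeffs

def euclidean(u, v):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical 10-s daily profiles (kW): a flat base load and one with an evening peak.
n = samples_per_day(10)            # 8640 readings per day
flat = [0.5] * n
peaky = [0.5] * n
for i in range(6480, 7200):        # 18:00 to 20:00 at 10-s resolution
    peaky[i] = 2.0

# Three levels (an assumed value) shrink each profile from 8640 to 1080 values.
reduced_flat = haar_approximation(flat, 3)
reduced_peaky = haar_approximation(peaky, 3)

full_distance = euclidean(flat, peaky)
reduced_distance = euclidean(reduced_flat, reduced_peaky)
```

In this synthetic case the peak is aligned with the 8-sample blocks that three Haar levels average over, so the distance between the reduced profiles matches the full-resolution distance while each distance calculation touches one eighth of the values. For real profiles the discarded detail coefficients carry some information, so the reduced distance is only an approximation.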

In this study, we cluster daily electricity consumption data sampled at 10-s intervals, using a multi-level discrete wavelet transform, the Pearson correlation coefficient, and PCA to preprocess the daily load profiles. The clustering evaluation results show that the proposed method outperforms conventional methods that do not reduce dimensionality.
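
For readers unfamiliar with the Pearson correlation coefficient, the sketch below computes it for short load profiles. The toy profiles and the function name are hypothetical, and comparing the shapes of two daily profiles is assumed here only for illustration; it is not a description of how the coefficient is applied in the proposed preprocessing.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy profiles (kW)
day_a = [0.2, 0.3, 1.0, 0.4]
day_b = [0.4, 0.6, 2.0, 0.8]   # day_a scaled by 2: same shape, larger magnitude
day_c = [1.0, 0.4, 0.2, 0.3]   # peak moved to the start of the profile

r_same_shape = pearson(day_a, day_b)
r_shifted = pearson(day_a, day_c)
```

Because the coefficient is invariant to scaling, `r_same_shape` is 1 up to floating-point error even though the two days differ in magnitude, whereas `r_shifted` is negative because the peaks occur at different times.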
