Energy security is one of the critical factors for the sustainability and integrity of society [1]. The balance between energy supply and demand is vital for energy security. To achieve this balance, monitoring, accounting, and management of energy consumption on the demand side are necessary [2]. Non-intrusive load monitoring (NILM) has been established as a good substitute for intrusive submetering [3], making it a promising tool for future energy monitoring.
Two critical applications of NILM are home energy management systems (HEMS) and ambient assisted living (AAL) [4], where it offers solutions to existing problems and opens avenues for future research. The energy profile of each device is of great importance in the overall operation of smart grids with renewable energy resources. NILM plays a vital role in efficiently extracting energy consumption data down to the appliance level, as demonstrated in Figure 1, which helps demand prediction. This energy-demand information may be used to manage and conserve energy at both the consumer and grid levels.
1.1. Motivation
Energy shortage is a challenge of the current time. Increasing energy production requires investment and involves many constraints. Nevertheless, there is much room for improving the efficiency of energy utilization while minimizing overall losses. NILM forms a basis for efficient energy utilization. NILM research focuses on disaggregating the energy usage of individual appliances attached to a central energy meter. This research area falls under the umbrella of cyber-physical systems and Industry 4.0 [5]. The objective of NILM is to save on hardware while providing sufficiently accurate energy usage patterns. The rise of NILM is linked with other recent areas of research such as the Internet of Things (IoT) for buildings [6], smart grids, and demand-response management systems [7], where the information from NILM is further utilized for buildings' energy management and decision making. The primary purpose of all these areas is to manage and conserve energy by enabling stakeholders to make informed decisions.
The whole process of NILM can be divided into four significant steps, as illustrated in Figure 2. The first step is to acquire the load signature (LS) using a suitable physical sensor; in this paper, the aggregate real power consumption is taken as the LS, acquired in buildings using a commercial smart digital energy meter. The current study takes an event-based approach, so the second step detects individual events. An event is defined as a significant change or perturbation in the aggregate power, and each event is considered to correspond to a state change of a device. After event detection, the next task is to identify or classify the event and the respective device with the help of specific features of the LS, including statistical quantities such as the mean, peak, slope, median, mode, percentiles, range, variance, and standard deviation. Supervised (classification) and unsupervised (clustering) approaches are used at this point. Repeating this process for each data sample leads to complete load identification, or disaggregation, at the individual device level.
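As a concrete illustration of the event-detection step described above, the sketch below flags samples where the aggregate real power changes by more than a fixed threshold between consecutive readings. The 30 W threshold and the toy power trace are illustrative assumptions, not values used in this study.

```python
import numpy as np

def detect_events(power, threshold=30.0):
    """Flag indices where aggregate real power jumps by more than
    `threshold` watts between consecutive samples (illustrative rule)."""
    diffs = np.diff(power)
    events = np.flatnonzero(np.abs(diffs) > threshold) + 1
    return events, diffs[events - 1]

# Toy aggregate trace: a ~100 W appliance switches on, then off again.
trace = np.array([50.0, 52.0, 51.0, 151.0, 150.0, 149.0, 49.0, 50.0])
idx, deltas = detect_events(trace)
print(idx)      # sample indices of detected state changes
print(deltas)   # signed power change at each event
```

Each detected power delta would then feed the feature-extraction and classification steps that follow.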
Hart [8] first introduced the concept of disaggregating the total energy and demonstrated that each appliance or device could be recognized using an appropriate LS feature, as shown in Figure 1. He also defined the following three types of device models:
Machine learning approaches can be divided into supervised and unsupervised methods. Supervised algorithms are trained on a labeled dataset to classify test data and allocate them to a suitable class, while unsupervised techniques require no labeled training data. Although this paper proposes unsupervised methods, a brief overview of related supervised methods is presented below for comparison.
Considerable research has been conducted on event-based and non-event-based techniques, covering both supervised and unsupervised machine learning methods. However, unsupervised and semi-supervised methods remain relatively underexplored [9,10]. Unsupervised methods have the advantage of requiring little or no labeled training data for classification; their disadvantage is a relatively low classification accuracy.
Supervised learning approaches that improve accuracy and reduce computational time include artificial neural networks [11,12], one of the crucial techniques in energy disaggregation. One variant is the concatenated convolutional neural network (CNN) [13]: CNN-based algorithms achieve good generalization and energy disaggregation even with a short sample time. Deep neural networks have also been applied to energy disaggregation and give promising results [14,15], increasing the accuracy in specific scenarios. Another variant, the auto-associative neural network, was applied to transient-based LSs and evaluated on the REDD and UK-DALE datasets [16]. A further approach studied energy transients using artificial neural networks, improving accuracy and reducing computational complexity [17].
Support vector machines [18,19], k-nearest neighbors [20,21], naïve Bayes classifiers [22,23], and linear-chain conditional random fields [24] are also well-known methods. Probabilistic approaches for NILM have also been studied: one versatile study explored the Viterbi algorithm with sparse transitions and the Markov chain, showing improved performance compared with Bayesian classifiers. In [25], noise was used as the LS to detect eight appliances. The wavelet transform has also been applied to transient signatures for NILM [26]. In [27], the short-time Fourier transform was used to identify different types of devices from transient power, where the shape of the transient data served as the identifier. A novel LS called the frequency-invariant transformation of periodic signals was employed in a steady-state approach; the idea was to use the original electric current waveform with respect to the reference voltage as a signature for NILM. A neural network was employed and an accuracy of 90% was achieved with 18 different devices [28]. In [29], particle swarm optimization was used to optimize the training parameters of the neural network. One important aspect of NILM research is the reduction of computational complexity. In this direction, the authors of [30,31] adopted a lightweight approach that can run on the edge, employing a combination of CNN and k-NN to achieve good results.
Unsupervised approaches do not require huge labeled datasets for training [32,33]; instead, they treat the electrical system as a stochastic system and work with unlabeled data. Although unsupervised algorithms are less precise and computationally complex, they can disaggregate devices without training or labeled data. This characteristic makes them suitable for existing systems and, on a commercial scale, for ready-to-use NILM systems. One of the unsupervised techniques used for NILM is the hidden Markov model (HMM) and its variants [34,35]. In [35], variants including the factorial hidden Markov model (FHMM), the factorial hidden semi-Markov model (FHSMM), and the conditional FHMM (CFHMM) were proposed. The HMM is a probabilistic technique built on a random model; it assumes that the system has some unobservable states. In [36], an additive and difference FHMM was introduced. A similar but separate study [37] achieved an accuracy higher than 90% for a specific scenario; the f-measure (F1) and normalized disaggregation error (NDE) were compared and discussed, showing improved efficiency. That study used a combination of features, i.e., real power, reactive power, and voltage waveforms, for five appliances. In [38], a combination of the difference HMM and extended Viterbi algorithms was tested, with normalized error and root mean square error used as performance metrics to compare two variants of the approach. These techniques are generally helpful; in particular, HMMs perform well for constantly ON devices such as the fridge and freezer. However, they are computationally complex, and their long processing time makes them inappropriate for real-time implementation without an appreciable increase in processing cost.
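To make the HMM idea concrete, the following is a minimal Viterbi decoder for a single two-state (OFF/ON) appliance observed through noisy power readings. The state means, transition probabilities, and Gaussian emission model are illustrative assumptions and do not reproduce the specific variants cited above.

```python
import numpy as np

# Two hidden states for one appliance: 0 = OFF (~0 W), 1 = ON (~100 W).
means = np.array([0.0, 100.0])          # assumed state power levels
trans = np.array([[0.95, 0.05],
                  [0.10, 0.90]])        # assumed transition probabilities
start = np.array([0.5, 0.5])
obs = np.array([2.0, 1.0, 98.0, 103.0, 99.0, 3.0])  # noisy readings

def log_emission(y, sigma=10.0):
    """Gaussian log-likelihood of reading y under each state's mean."""
    return -0.5 * ((y - means) / sigma) ** 2

# Standard Viterbi recursion in log space.
T, S = len(obs), len(means)
delta = np.log(start) + log_emission(obs[0])
back = np.zeros((T, S), dtype=int)
for t in range(1, T):
    scores = delta[:, None] + np.log(trans)   # S x S candidate scores
    back[t] = scores.argmax(axis=0)
    delta = scores.max(axis=0) + log_emission(obs[t])

# Backtrack the most likely OFF/ON state sequence.
path = [int(delta.argmax())]
for t in range(T - 1, 0, -1):
    path.append(int(back[t][path[-1]]))
path.reverse()
print(path)
```

Factorial variants such as the FHMM extend this by running one such state chain per appliance over the shared aggregate signal, which is where the computational cost grows quickly.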
In studies [39,40], six categories of sampling rate are established:
Very low: slower than one sample per minute;
Low: between one sample per minute and 1 Hz;
Medium: faster than 1 Hz up to the fundamental frequency (the lowest frequency in the signal);
High: from the fundamental frequency up to 2 kHz;
Very high: between 2 and 40 kHz; and
Extremely high: faster than 40 kHz.
This paper uses low-sampling-rate data.
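The six sampling-rate categories above can be summarized as a small helper that maps a sampling frequency in hertz to its category; the 50 Hz mains fundamental used as the reference is an assumption (a 60 Hz grid would shift the medium/high boundary).

```python
def sampling_category(fs_hz, fundamental_hz=50.0):
    """Map a sampling frequency (Hz) to the six-category scheme of [39,40].
    The mains fundamental (50 Hz here) is an assumed reference."""
    if fs_hz < 1.0 / 60.0:
        return "very low"        # slower than one sample per minute
    if fs_hz <= 1.0:
        return "low"             # one sample per minute up to 1 Hz
    if fs_hz <= fundamental_hz:
        return "medium"          # above 1 Hz up to the fundamental
    if fs_hz <= 2000.0:
        return "high"            # fundamental up to 2 kHz
    if fs_hz <= 40000.0:
        return "very high"       # 2 kHz to 40 kHz
    return "extremely high"      # faster than 40 kHz

print(sampling_category(1.0 / 300))   # one sample every 5 min
print(sampling_category(0.5))         # one sample every 2 s
print(sampling_category(44100.0))     # audio-rate sampling
```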
1.2. Graph Signal Processing
A relatively new semi-supervised classification technique for NILM, based on graph signal processing (GSP), has been presented in [41,42]. GSP is a field of study that deals with data that are irregular in time and space, such as a random network of sensors, the internet, and social networks. Data points in the graph are represented as nodes, also called vertices (singular: vertex). A vertex is one of the points/nodes on which a graph is defined. The vertices are connected through edges, which represent relationships or interconnections between the vertices. The edges may be directed or undirected and can even have weights associated with them. Figure 3 shows a visual representation of a graph with four vertices at data points; these vertices are connected using edges. The mathematical model of the graph is $G = \{V, A\}$, where $V$ is the set of vertices $i$ and the edges are represented using the adjacency matrix $A$:

$$A_{ij} = \exp\left(-\left(\frac{x_i - x_j}{\rho}\right)^2\right) \quad (1)$$

where $x_i$, $x_j$ are two consecutive data samples and $\rho$ is a scaling factor. The number of edges connected to a node represents the degree of that node. The degree of each node is obtained from the degree matrix $D$, computed by adding the rows of the adjacency matrix:

$$D_{ii} = \sum_{j} A_{ij} \quad (2)$$

In Equation (2), $D$ is an $N \times N$ diagonal matrix [42]. Another representation is the Laplacian, which has excellent properties and is very useful in spectral clustering. The Laplacian is mathematically represented as:

$$L = D - A \quad (3)$$
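A minimal numpy sketch of the graph construction described above: Gaussian-kernel adjacency weights, the degree matrix obtained by summing the rows of the adjacency matrix, and the Laplacian L = D - A. The toy power samples and the scaling factor are illustrative assumptions.

```python
import numpy as np

def graph_matrices(x, rho=50.0):
    """Adjacency A with Gaussian-kernel weights, degree matrix D
    (row sums of A on the diagonal), and Laplacian L = D - A."""
    x = np.asarray(x, dtype=float)
    diff = x[:, None] - x[None, :]
    A = np.exp(-((diff / rho) ** 2))
    np.fill_diagonal(A, 0.0)       # no self-loops
    D = np.diag(A.sum(axis=1))
    L = D - A
    return A, D, L

# Four power samples at two distinct levels (illustrative values).
A, D, L = graph_matrices([100.0, 102.0, 500.0, 498.0])
print(np.round(L, 3))
```

Samples with similar power levels get edge weights near one, while samples far apart get weights near zero, so clusters in the signal become densely connected subgraphs.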
Graph signal processing is a relatively new and valuable unsupervised technique for NILM, with many advantages, as discussed earlier. One comprehensive and detailed study by Stankovic [41] explored the application of GSP and demonstrated its applicability. NILM accuracy was further improved in a later study [43] using additional preprocessing and post-processing steps. In another study, the post-processing techniques were further enhanced using optimization and a genetic algorithm (GA) [44]. In a recent study [45], the authors adopted GSP together with the concept of clustering and obtained favorable results with improved computation time. The current work is inspired by these studies and seeks to explore GSP further and improve it from the NILM point of view.
1.3. Spectral Clustering
Less research has been conducted in the unsupervised domain than in the supervised domain [46]; in particular, spectral clustering is rarely explored for NILM applications and leaves room for further research. This study examines the feasibility of applying spectral clustering in a simple yet efficient manner in order to enhance its performance in NILM applications.
Spectral clustering [47,48,49] finds its roots in graph theory; its ultimate task is to cluster the data based on their edge connectivity. The method can also deal with non-graphical data. Spectral clustering classifies the data using the eigenvalues (spectrum) of the Laplacian matrix, so the concepts of eigenvalues and eigenvectors are of central importance here. For a matrix $A$, if there exists a vector $x$ that is not all zeros and a scalar $\lambda$ such that:

$$Ax = \lambda x$$

then $x$ is said to be an eigenvector of $A$ with corresponding eigenvalue $\lambda$. Careful examination of the eigenvalues shows that some are equal or close to zero; these represent connected components within the graph, and the corresponding eigenvectors are constant. The first non-zero eigenvalue is called the spectral gap, which gives an approximate idea of the sparsity of the graph. The second eigenvalue is called the Fiedler value, and the corresponding vector is the Fiedler vector. The Fiedler value approximates the minimum graph cut needed to separate the graph into two connected components. However, the number of eigenvalues and eigenvectors considered depends on the specified upper limit on the number of clusters.
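These spectral properties can be checked numerically. For a toy signal with two well-separated power levels (the values and the scaling factor are assumptions), the smallest Laplacian eigenvalue is essentially zero, and thresholding the Fiedler vector at zero splits the samples into the two natural clusters.

```python
import numpy as np

# Toy signal with two distinct power levels, i.e., two natural clusters.
x = np.array([100.0, 102.0, 101.0, 500.0, 498.0, 501.0])
diff = x[:, None] - x[None, :]
A = np.exp(-((diff / 200.0) ** 2))    # Gaussian-kernel adjacency
np.fill_diagonal(A, 0.0)
L = np.diag(A.sum(axis=1)) - A        # graph Laplacian

# L is symmetric, so eigh returns real eigenvalues in ascending order.
vals, vecs = np.linalg.eigh(L)
fiedler = vecs[:, 1]                  # eigenvector of the Fiedler value
labels = (fiedler > 0).astype(int)    # sign-based bipartition
print(np.round(vals, 4))
print(labels)
```

The sign of the Fiedler vector is arbitrary, so which cluster is labeled 1 may vary, but the partition itself is stable for well-separated data.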
In one recent study [50], automated spectral clustering was applied to multiscale data; the approach is iterative, obviates the need to predefine parameters, and was tested for NILM applications. In the current study, although various parameters are predefined, the approach is simpler and less computationally complex. In another preliminary study [51], spectral clustering was used for NILM applications; however, the approaches proposed in the current study are different and novel. Moreover, a comprehensive and detailed analysis centered on NILM is presented in the current study.