Accurate Prediction of Hourly Energy Consumption in a Residential Building Based on the Occupancy Rate Using Machine Learning Approaches

Truong, Le Hoai My; Chow, Ka Ho Karl; Luevisadpaibul, Rungsimun; Thirunavukkarasu, Gokul Sidarth; Seyedmahmoudian, Mehdi; Horan, Ben; Mekhilef, Saad; Stojcevski, Alex

doi:10.3390/app11052229

by Le Hoai My Truong ¹, Ka Ho Karl Chow ¹, Rungsimun Luevisadpaibul ¹, Gokul Sidarth Thirunavukkarasu ¹,*, Mehdi Seyedmahmoudian ¹,*, Ben Horan ², Saad Mekhilef ¹,³ and Alex Stojcevski ¹

¹ School of Software and Electrical Engineering, Swinburne University of Technology, Hawthorn, VIC 3122, Australia

² School of Engineering, Deakin University, Waurn Ponds, VIC 3216, Australia

³ Department of Electrical Engineering, University of Malaya, Kuala Lumpur 50603, Malaysia

* Authors to whom correspondence should be addressed.

Appl. Sci. 2021, 11(5), 2229; https://doi.org/10.3390/app11052229

Submission received: 2 February 2021 / Revised: 20 February 2021 / Accepted: 25 February 2021 / Published: 3 March 2021

(This article belongs to the Special Issue Sustainable Built Environments in 21st Century)

Abstract:

In this paper, a novel deep neural network-based energy prediction algorithm for accurately forecasting the day-ahead hourly energy consumption profile of a residential building considering occupancy rate is proposed. Accurate estimation of residential load profiles helps energy providers and utility companies develop an optimal generation schedule to address the demand. Initially, a comprehensive multi-criteria analysis of different machine learning approaches used in energy consumption predictions was carried out. Later, a predictive micro-grid model was formulated to synthetically generate the stochastic load profiles considering occupancy rate as the critical input. Finally, the synthetically generated data were used to train the proposed eight-layer deep neural network-based model and evaluated using root mean square error and coefficient of determination as metrics. Observations from the results indicated that the proposed energy prediction algorithm yielded a coefficient of determination of 97.5% and a significantly low root mean square error of 111 Watts, thereby outperforming the other baseline approaches, such as extreme gradient boost, multiple linear regression, and simple/shallow artificial neural network.

Keywords:

deep learning; energy management systems; load forecasting; machine learning; microgrids

1. Introduction

The Australian energy market has for many years operated on a centralised generation model, with state-owned power plants situated close to energy resources such as coal, hydro, wind, and natural gas. The centralised electricity generation model has several drawbacks for the environment and end-users because of the reduced efficiency caused by large transmission losses. The electricity price keeps rising due to the increased investment in distribution infrastructure required to connect households and businesses to a stabilised power supply [1]. Therefore, the government has spent the last couple of decades shifting its electricity generation model to a more decentralised approach by incorporating more renewable resources. Despite the tremendous amount of investment, there are still many transitional challenges to be solved, in terms of both political and technological readiness. It is predicted that the role of grid-supplied power will be inverted from being the primary source of energy to a backup source, with distributed renewable generation as the primary source [1]. This paper investigates a similar distributed generation model, commonly known as the microgrid, and addresses the system’s technical barrier of accurate load demand forecasting, aiming to enable broader adoption of the decentralised generation model.

The microgrid consists of a control unit that uses a robust energy management system equipped with advanced load demand/generation capacity forecasting algorithms that aid in achieving reliable power flow control, load sharing, grid protection, stability, and smooth operation [2]. If successfully implemented, micro-grid systems promise to offer distribution system congestion relief, the postponement of new generation or delivery capacity, response to load changes, and local voltage support [2]. Accurate forecasting of load demand is crucial to a microgrid’s energy management system because it ensures energy savings and improves the operational or sizing efficiency of its supply and storage. However, load forecasting can be a difficult task since the energy performance of a building is influenced by various factors, such as occupancy rate, resident behaviour, household income, number and type of appliances, and weather conditions.

This study aims to develop a novel deep neural network-based energy prediction algorithm to synthesise hourly load profiles based on the occupancy rate. To achieve this, a comprehensive multi-criteria analysis (MCA) was first carried out on the existing literature, and a novel synthetic load profile generator was formulated. The MCA aims to identify the most suitable prediction algorithm by evaluating the accuracy, usability, adaptability, computation time, randomisation, and implementation feasibility. The synthetic load profile generator developed in this study is later used to generate the stochastic load profiles. A state-of-the-art predictive microgrid model was created to generate the load profile accurately using average load consumption data, occupancy rate, and seasonality as the input. The microgrid’s mathematical model consists of realistic appliances with predefined constraints that help synthetically generate the accurate load profile. The proposed state-of-the-art eight-layered deep neural network model accurately predicts the hourly energy consumption patterns, considering the occupancy rate and seasonality as the critical inputs. Baseline models such as extreme gradient boost, multiple linear regression, and simple/shallow artificial neural network are used to verify the proposed prediction algorithm’s performance. Root mean square error (RMSE) and coefficient of determination ($R^{2}$) are the metrics on which the models are compared. The results indicated that the proposed deep neural network model outperformed the other models.

The manuscript is structured as follows: A detailed background study on micro-grids and their energy management systems (EMS), followed by the foundation of different machine learning approaches used in load profile forecasting, is presented in Section 2. Section 3 describes the construction and evaluation of the MCA table. Section 4 discusses the study’s methodology in detail, covering the synthetic load profile generator, the assumptions considered, the proposed model, and the baseline approaches considered for evaluation. Section 5 presents the results and discussion, and the paper closes with concluding remarks on the significance of the work.

2. Background

2.1. Microgrid and Energy Management System

The decentralised electricity generation model has many advantages over conventional centralised systems that consist of remote generation units. In a centralised generation model, power flows in one direction, from a small number of large generators to many consumers over a long distance. Therefore, the centralised model requires large power plants to meet the demand and significant transmission lines to connect households and businesses with their power source, resulting in colossal air pollutant emissions, wastage of generation, and land use. The fact that it requires a high integration level also means that the system is extremely vulnerable to disturbances in the supply chain. Therefore, its attractiveness is declining, while the penetration of small-scale decentralised systems, or microgrids, is growing and attracting increasing investment.

The microgrid is essentially a local energy grid with control capability and autonomous operation, which can be disconnected from the primary grid when required [3]. In contrast to centralised systems, power in a microgrid flows in both directions. As it is built locally using renewable sources, it is more efficient, more reliable, lower cost, and cleaner than the centralised model. The three main types of microgrid are remote, grid-connected, and networked microgrids. This study focuses on the grid-connected microgrid, which is the most suitable system for commercial and residential buildings. A grid-connected microgrid can operate in either grid-connected mode or stand-alone (islanding) mode based on the requirement. In grid-connected mode, the microgrid imports power from or exports power to the grid depending on generation and load conditions and on market policies based on contractual obligations. On the other hand, in islanding mode, the microgrid disconnects from the utility when an abnormal condition occurs in the grid, and the microgrid has to satisfy the load with the required level of power quality by utilising the local storage and renewable resources [4]. Therefore, the energy management strategies used in the two modes are different, and a coordinated control approach for microgrid energy management is required to minimise the errors between the forecast and real-time data in the schedule and dispatch layers of the system [5].

The setup for monitoring and optimising energy consumption to regulate the energy flow is commonly known as an energy management system (EMS). A microgrid usually requires an energy management system to assign active and reactive power references, ensuring cooperation between the controllable units to achieve stable and economic operation [5]. The latest research shows that the objective of energy management of a microgrid is to minimise the microgrid’s operating costs, such as fuel costs, operating maintenance costs, and the purchase cost of electricity from the conventional power grids [5]. One of the key features that allows a microgrid energy management system to achieve its objective is a robust and accurate technique for forecasting the load demand/generation profile. In a residential setting, energy management works on optimising energy consumption and equipment efficiency, detecting faults in a deteriorating system, implementing ways to reduce energy wastage, and recovering wasted energy for other purposes. The EMS inferences provide data visualisation that helps household owners better understand their usage patterns, together with recommendations on smart energy usage to drive better behaviour in their daily use. As a result, maintenance cost is reduced because equipment usage is optimised [6]. Beyond capturing and presenting historical data, an EMS can also help forecast the energy consumption of a household or a building by using an intelligent machine learning algorithm. It also allows developers to accurately identify the sizing of the power resources required to meet the demand of an existing or a new unit. However, one of the main drawbacks of using high-end machine learning algorithms is the lack of valid data sets for effective training. That is why our study focuses not only on developing an optimised machine learning algorithm but also on establishing a novel synthetic load generator to evaluate our proposed model.

For the synthetic load generator to be built, an investigation into standard controllable loads, uncontrollable loads, and critical loads, with their usage patterns in a typical household, was carried out. The power consumption of any device or appliance can be controlled by adjusting the duration of its on and off times [7]. Controllable loads can be differentiated from uncontrollable loads based on their ability to be turned off without sacrificing the user’s comfort [8]. For example, air conditioners and refrigerators can be turned off for a certain period to save power without significantly affecting household comfort. However, microwave ovens or toasters, which are examples of uncontrollable loads, should not be turned off because this directly impacts the occupant’s comfort. Critical loads are loads that will result in a significant loss or damage when powered off [9]. Examples of critical loads are a smoke alarm, a water pump, and critical load panels in an energy storage system [10]. Following these definitions, popular controllable loads in a distributed power system, such as refrigerators, HVAC systems, entertainment devices, and fans, are considered in the mathematical model formulated in this study. The most commonly used loads in residential settings are used to generate an extensive data set of load profiles with artificially induced non-linearity. The list of loads considered is explained in the methodology section below.

Another fact about microgrids is that the majority of them are built in integration with renewable energy sources such as wind turbines, solar photovoltaic (PV), and fuel cells due to the global transition to green technologies. In addition to the eco-friendly advantages of such systems, the renewable resources add more non-linearity into the system due to their stochastic nature, which calls for an advanced and robust energy management algorithm. In 2009, for the EU, nearly 55% of the newly installed capacity based on renewable sources corresponded to intermittent wind and PV generation (39% and 16%, respectively) [11]. According to the Australian Department of Industry, Science, Energy, and Resources, in 2019, 21% of Australia’s total electricity generation was based on renewable energy resources, which consisted of 7% wind, 7% solar, and 5% hydro, making the share of renewables the highest recorded since the 1970s [12]. On this note, the study of load forecasting methodology is critical because it will enable accurate sizing and scheduling of renewable sources for a microgrid system, preventing resource wastage or shortage and many other system failures.

2.2. Machine Learning

The complexity of residential load forecasting lies in the significant volatility and stochastic nature of the load profiles. Many researchers worldwide are working towards developing an accurate forecasting technique that addresses this uncertainty. The statistical learning approaches are based on a predefined relationship between variables and require a smaller data set, whereas the more accurate and advanced machine learning algorithms require a big data set. In recent years, the rise of big data has made machine learning a potential solution for load forecasting in a residential energy management system. Traditional methods tend to avoid such uncertainty by load aggregation to offset uncertainties, customer classification to cluster uncertainties, and spectral analysis to filter out uncertainties [13]. Therefore, many studies are being carried out to evolve the current machine learning techniques so that they directly learn the uncertainty at the building level, which arises from the many influencing factors.

According to Lars Hulstaert, a data scientist at Johnson and Johnson, most machine learning systems require the ability to explain to stakeholders why specific predictions are made [14]. The accuracy and interpretability trade-off is typically considered when choosing a suitable machine learning model. Generally, there are two types of machine learning models, namely, black-box and white-box. Black-box models such as neural networks and gradient boosting models yield highly accurate predictions. However, their computational operation is difficult to understand. On the other hand, white-box models such as linear regression and decision trees, despite being much easier to interpret, produce less predictive capacity. In this research, an initial comprehensive multi-criteria analysis of the most common machine learning techniques in models such as linear regression, gradient boosting, decision tree, and neural network was performed to determine the best potential method optimised for load forecasting application.

Machine learning approaches are generally used to address supervised and unsupervised learning problems. Since the proposed methodology aims to predict energy consumption, supervised learning is a more suitable option as its primary function is to model the value of the target variable based on the predictor variables. Machine learning and artificial intelligence techniques are used in a wide variety of applications, such as load forecasting [15], determining product quality [16], and fault quality [17]. Linear regression, decision trees, and neural networks are all examples of supervised learning. Despite the similarities, their computation principles are different. Regression analysis is a methodology that allows finding a functional relationship among dependent variables and independent variables [18]. For complex systems, such as the energy consumption in buildings, the regression analysis is considered as an iterative process, in which the outputs are used to diagnose, validate, criticise, and possibly modify the inputs [18]. In the decision tree approaches, an empirical tree represents a segmentation of the created data by applying a series of simple rules. Through the repetitive process of splitting, predictions are made, and the logic is usually comprehensible [19].

On the contrary, the neural network is a class of algorithms loosely modelled on the connections between neurons in the human brain, designed to imitate how the natural nervous system processes information and makes decisions [20]. There are many choices of neural network architecture that significantly influence the performance of the model. This study proposes a novel deep neural network model by optimising the hyper-parameters to enhance the neural network’s performance. A comprehensive literature review in the form of a multi-criteria analysis (MCA) was carried out. The next section of the paper highlights how the MCA is performed and evaluated.

3. Multi-Criteria Analysis (MCA)

3.1. MCA Development

An MCA for choosing the most suitable machine learning technique for estimating energy consumption in a residential building is an evaluation process that considers different measurable criteria to rank, compare, and select the best performing models considered in the literature. A list of benchmarks is identified to evaluate the performance of the identified techniques, measured either qualitatively or quantitatively. The MCA was established by following the procedure shown in Figure 1, and the set of chosen criteria is listed in Table 1.

There are six main analysis criteria considered in this study, each with different weighting depending on their relative importance to the study’s objectives. In this study, accuracy is assumed to have the highest weightage because it aims to identify the most accurate predictive method for estimating residential buildings’ energy consumption. A rating from 1 to 10 is applied to each criterion, with a higher value representing a favourable outcome. Each technique will be ranked according to its MCA score, where the higher the score, the more suitable the approach fits for purpose. Furthermore, the MCA was done with three separate sets of scoring systems for different perspectives of business managers, electrical engineers, and data scientists to ensure the final result is not biased to one specific area. The three scoring systems used the same papers from the literature review, with the final scores being the average of the individual scores.
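The scoring procedure described above can be sketched in a few lines of Python. This is only an illustration: the weights below are hypothetical placeholders (the actual weights belong to Table 1), and combining ratings and weights through a weighted sum is an assumption, since the text states only that each criterion has a weighting, that ratings run from 1 to 10, and that the final score is the average over the three perspectives.

```python
from statistics import mean

# Hypothetical criterion weights (assumed for illustration only);
# accuracy is given the highest weight, as stated in the text.
WEIGHTS = {
    "accuracy": 5,
    "usability": 3,
    "adaptability": 3,
    "computation_time": 2,
    "randomisation": 2,
    "implementation_feasibility": 3,
}


def mca_score(ratings, weights=WEIGHTS):
    """Weighted sum of 1-10 ratings for one technique from one perspective.

    The weighted-sum form is an assumption, not the paper's stated formula.
    """
    for criterion in weights:
        if not 1 <= ratings[criterion] <= 10:
            raise ValueError(f"rating for {criterion} must be between 1 and 10")
    return sum(weights[c] * ratings[c] for c in weights)


def combined_score(ratings_per_perspective, weights=WEIGHTS):
    """Average the scores given by the different perspectives
    (e.g. business manager, electrical engineer, data scientist)."""
    return mean(mca_score(r, weights) for r in ratings_per_perspective)


# Made-up ratings for one technique from two perspectives.
cautious = {c: 5 for c in WEIGHTS}
optimistic = {c: 10 for c in WEIGHTS}
print(combined_score([cautious, optimistic]))
```

Techniques would then be ranked by their combined score, with a higher score indicating a better fit for purpose.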

3.2. MCA Results Evaluation

The MCA matrices and final scoring table obtained from the study are indicated in Figure 2 and Table 2. The MCA consisted of eight different approaches used in load forecasting from the existing literature that were critically analysed. By considering different perspectives and analysing different performance criteria, the framework yielded an accurate shortlist of the most effective ML techniques in estimating energy consumption in a residential building.

The results demonstrated that ML techniques based on neural networks, such as ANN (MCA score = 189), ANFIS (MCA score = 187), and WNN (MCA score = 181), generally exhibited better performance in estimating energy consumption than those based on decision trees, such as XGBoost (MCA score = 185), and regression analysis, such as Gaussian process regression (MCA score = 144) and ARIMA (MCA score = 140). ANN produced the most accurate predictions, as it is well known for its ability to handle noise and perform non-linear analysis of data sets like the investigated load profiles [21]. Furthermore, ANN also tends to ignore excess input data that are of minimal significance and concentrate on the more critical input [21]. On the other hand, despite performing better than MLR, XGBoost or the decision tree method generally does not outperform neural networks for non-linear data and is susceptible to noisy data [19]. The technique is more suitable for predicting categorical outcomes, and unless visible trends and sequential patterns are present, decision trees are less appropriate for application to time series data [19]. Regarding MLR, despite being the simplest and most intuitive prediction approach, it is the least appropriate for predicting energy usage due to its weakness in working with data with no apparent pattern.

Additionally, more criteria can be included in the analysis to cover all the aspects of each ML technique. As a next step of the MCA, we model the performance of ANN, XGBoost, and MLR using Python to validate the literature review on the performance of neural networks, decision trees, and regression analysis. A shallow (simple) ANN model is also used as a benchmark for testing the proposed prediction model’s performance. Furthermore, a new hyperparameter-tuned deep neural network model will be developed and evaluated based on the prediction of energy consumption load profiles.

4. Methodology

The methodology adopted to rationalise the approach used in the study to maintain focus on critical research aspects is clearly illustrated in Figure 3.

The detailed procedures and assumptions of the load profiles synthesis and machine learning modelling are critically explained in the subsections below.

4.1. Synthetic Load Profile Generation

In order to obtain a wide variety of load profiles based on the occupancy rate and seasonality, we developed a synthetic load profile generator that uses basic mathematical models of individual appliances to generate a set of load profiles programmatically. Initially, the model includes load profiles of different households with varying factors, such as the type of residence and number of people, as shown in Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8. Figure 4 provides a visual representation of the individual load profiles considered in the study. The synthetically generated household load profile models consist of a randomised algorithm that creates different appliance loads and usage times. The load profiles are constrained so that the randomly scheduled total power consumption in kWh and the electricity cost per day closely resemble real-world usage. The synthetic load profile generator is later used to generate a data sample of 100,000 data points (load profiles) for each type of occupancy rate, and these data are used by the proposed forecasting model to train and make predictions based on the limited number of inputs. The mathematical models are provided with a pre-defined set of constraints to generate the different non-linear load profiles replicating the real-time load profiles. Inferences from power consumption studies on households are used as the basic input for the models to systematically generate numerous random samples of load profiles. These generated load profiles are then used to train a novel prediction algorithm to forecast the load profile at higher accuracy. The following factors are taken into consideration:

The hourly electricity usage data used in the study were obtained from residential buildings with a different number of occupants in Victoria, Australia.
Electricity bills and statistics provided by energy providers and distributors were used to identify the average daily usage, which was then considered as a reference point to fine-tune the load profiles to represent the daily usage in Victoria, Australia in 2020, and it was programmatically generated using the synthetic load profile generator.
The power ratings for all of the appliances present in the load profile table were taken from the existing appliances and from the Daftlogic website, which provides the typical power consumption list of households [22].

However, the proposed deep neural network-based model aims at predicting the hourly load profile of a residential building with only the occupancy rate as the input to the prediction model. In general, machine learning models require a feature (input) and a label (output) to learn and to estimate successfully. The randomly generated load profiles are appropriately tagged so that they can be used to train the prediction algorithms considered in the study. The energy consumption values from the corresponding hour were then added together using Equation (1) to obtain a new version of the load profile for the different households. Hence, the hourly daily usage load profile is developed and shown in Table 9.

$$E_{H} = \sum_{i = 0}^{n} P_{i} \tag{1}$$

where $E_{H}$ is the energy amount in a particular hour and $P_{i}$ is the power used in that hour by appliance $i$.
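As a minimal sketch of Equation (1), the per-appliance values for each hour can be summed into a single household profile. The appliance names and wattages below are made-up examples, and the function name `hourly_energy` is hypothetical; the sketch also assumes each appliance is described by 24 hourly values.

```python
def hourly_energy(appliance_power):
    """Apply Equation (1) for every hour of the day.

    appliance_power maps an appliance name to a list of 24 values, one per
    hour (assumed layout). Returns the 24 hourly totals E_H.
    """
    profiles = list(appliance_power.values())
    if any(len(profile) != 24 for profile in profiles):
        raise ValueError("each appliance needs exactly 24 hourly values")
    return [sum(profile[hour] for profile in profiles) for hour in range(24)]


# Illustrative (made-up) appliances: a fridge running all day and a light
# switched on from 18:00 to 22:59.
example = {
    "fridge": [150] * 24,
    "light": [0] * 18 + [60] * 5 + [0],
}
totals = hourly_energy(example)
print(totals[0], totals[18])
```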

The background research indicated that load forecasting techniques based on the occupancy rate and seasonality are a niche and largely unexplored area, which is considered one of the key contributions of the proposed work. The randomised load profile generator was modelled on this basis. The load profile generator randomly generates several combinations of occupancy rate, season, and time, resulting in a dataset that replicates the hourly consumption pattern associated with the input features, as illustrated in Table 9. The equation used for generating the energy consumption patterns is expressed in Equation (2):

$$E_{H} = \mathrm{random}(0.8, 1.2) \cdot E_{O} + \mathrm{random}(0, 50) \tag{2}$$

where $E_{H}$ is the energy amount in a particular hour, $E_{O}$ is the base energy value, and $\mathrm{random}(number_{min}, number_{max})$ generates a random number between the minimum and maximum values. The final load profiles can be seen in Table 10, which includes 100,000 data sets. The advantage of using the synthetic load profile generator for building more custom load profiles is that the system’s randomness can be fine-tuned based on the requirement, making it a more robust and adaptable solution. This approach decreases the latency that can occur with a dataset focused on a particular context.
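Equation (2) translates almost directly into Python. The sketch below assumes that random(min, max) denotes a uniform draw, which the text does not state explicitly, and the function names and the seed parameter are hypothetical conveniences for reproducibility.

```python
import random


def synthetic_hourly_energy(base_energy, rng):
    """Equation (2): scale the base value by a factor in [0.8, 1.2] and add
    an offset in [0, 50]. Uniform sampling is an assumption."""
    return rng.uniform(0.8, 1.2) * base_energy + rng.uniform(0, 50)


def synthetic_profile(base_profile, seed=None):
    """Generate one randomised load profile from a base profile
    (a list of base energy values E_O, one per hour)."""
    rng = random.Random(seed)
    return [synthetic_hourly_energy(e, rng) for e in base_profile]


# Made-up base profile: a flat 500 units per hour over one day.
profile = synthetic_profile([500.0] * 24, seed=1)
print(min(profile), max(profile))
```

Calling `synthetic_profile` repeatedly with different seeds would produce the many randomised variants of a base profile that the generator relies on.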

4.2. Machine Learning Modelling

In general, machine learning models follow a dataflow of taking in input features, extracting a relationship between the input features and the output label, and predicting future values. In our proposed model, occupancy rate, seasonality, and datetime are given to the deep neural network model as input features. The deep learning model’s output is the estimated value of energy consumption, as illustrated in Figure 5. As indicated in the previous section, the synthetic load generator was used to generate the dataset in this pattern, and the full set of data was then fed into the proposed deep learning model. The learning requires two stages: the training stage to create the prediction model and the testing stage to verify its prediction performance.

Furthermore, the Python programming language with the TensorFlow and Keras libraries was used to develop the MLR, XGB, and shallow/basic ANN models and the proposed deep neural network model. Several hyper-parameter tuning approaches were applied to the shallow/simple (conventional) ANN model to obtain the proposed deep neural network model. Results indicated that after the hyper-parameter tuning, the prediction accuracy of the model improved significantly.

4.2.1. Proposed Deep Neural Network Model

A machine learning model’s performance is heavily dependent on its hyper-parameters. In general, the hyper-parameters are tunable, and finding an optimised value for these parameters can directly influence the performance of the model [23]. It is essential to understand that this study focuses on optimising a shallow ANN model’s hyper-parameters to obtain a more accurate and useful deep learning model. Hyper-parameters are external parameters that are set by the operator of the network [24]. There are two types of hyper-parameters: hyper-parameters related to the neural network structure (number of hidden layers, dropout, activation function, etc.) and hyper-parameters related to the training algorithm (learning rate, epoch, iterations, batch size, etc.) [24]. In this study, an iterative process of fine-tuning the shallow ANN model’s hyper-parameters is performed by optimising the number of hidden layers, activation function, and dropout layers to result in the proposed deep ANN model. A deep neural network, also known as a multi-layer neural network, has more hidden layers than a shallow one, which enables deep neural network models to learn more abstract relationships within the input data and how the features interact with each other on a non-linear level [25].

Hidden layers are the layers of neurons in between the input layer and the output layer. Increasing the number of hidden layers increases the network model’s ability. However, there is a limit to the number of hidden layers added before its effectiveness declines. Optimising this value is a challenging task in creating deep neural network models, and in this proposed model, the optimised number of hidden layers was roughly around six. Besides that, two dropout layers were also added. Dropout layers are the layers that randomly “kill” a certain percentage of neurons in every training iteration to ensure some information learned is randomly removed, reducing the risk of over-fitting the data during the training phase [24]. Having the right combination of hidden and dropout layers in ANN makes it a useful prediction model, and in this case, we are calling this developed model a deep ANN.
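The effect of a dropout layer can be illustrated without any deep learning library. The sketch below uses the common "inverted dropout" formulation, in which surviving activations are rescaled by 1 / (1 - rate) during training; this formulation, the function name, and the 50% rate in the example are assumptions for illustration, since the text does not specify the dropout rate or variant used in the study.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Randomly zero a fraction `rate` of activations during training.

    Survivors are scaled by 1 / (1 - rate) so the expected sum is unchanged
    (inverted dropout, assumed here). At inference time the layer is a no-op.
    """
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0:
        return list(activations)
    keep = 1 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(42)
hidden = [0.2, 0.9, 0.5, 0.7]
print(dropout(hidden, rate=0.5, rng=rng))                  # training pass
print(dropout(hidden, rate=0.5, rng=rng, training=False))  # inference pass
```

Because a different subset of neurons is removed in every training iteration, the network cannot rely on any single neuron, which is the mechanism behind the reduced risk of over-fitting described above.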

Additionally, the activation function is a rule that determines whether a neuron should “fire” or not [23]. There are many types of activation functions, and each one is suited to different situations. For example, a sigmoid function maps any real-valued input smoothly to an output between 0 and 1: the output approaches 1 for large positive inputs, approaches 0 for large negative inputs, and equals 0.5 when the input is 0. The rectified linear function is another activation function. It outputs 0 when the input is negative, while the output matches the input when the input is positive. In this study, the sigmoid function was used for the network’s hidden layers, while the rectified linear function was used for the output layer.
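For reference, the two activation functions named above can be written in plain Python. The sigmoid here is the standard logistic function, which maps any real input into the interval between 0 and 1, and the rectified linear function returns 0 for negative inputs and the input itself otherwise. The function names are illustrative and are not taken from the study's code.

```python
import math


def sigmoid(x):
    """Standard logistic sigmoid, 1 / (1 + e^(-x)), written so that
    math.exp never receives a large positive argument."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relu(x):
    """Rectified linear function: 0 for negative input, identity otherwise."""
    return max(0.0, x)


for value in (-5.0, 0.0, 5.0):
    print(value, sigmoid(value), relu(value))
```

A rectified linear output can never be negative, which matches the fact that the predicted energy consumption is a non-negative quantity.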

Apart from the modification mentioned above, the number of neurons was also varied and tested to find an optimum neuron number for the effective deep ANN model. Ultimately, the study is aimed to introduce a new optimised and finely tuned deep ANN model. Figure 6 below depicts the improvements being made on the neural network to transform it from a shallow ANN model to a deep ANN model, while Figure 7 demonstrates the architecture of the optimised deep neural network.

Finally, we use the root mean squared error (RMSE) and the coefficient of determination (R^{2}) to evaluate the models created. The RMSE is the square root of the mean squared difference between the estimated and actual values, so it expresses the typical prediction error in the same unit as the target; it should be as small as possible. The R^{2}, on the other hand, indicates the proportion of the variation in the actual energy consumption values that the model reproduces, expressed here as a percentage; it should be as close to 100% as possible. The proposed deep ANN model is compared with the baseline models of XGBoost, MLR, and shallow ANN. The inferences from this evaluation are discussed in detail in the Results and Discussion section below.
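
Both metrics can be computed in a few lines. The sketch below uses the standard definitions; the `actual` and `predicted` values are made-up numbers chosen only to show the calculation, and they are not results from this study.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same unit as the inputs."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly consumption values (Wh) and model estimates.
actual = [230.0, 360.0, 480.0, 580.0]
predicted = [250.0, 340.0, 500.0, 560.0]

print(f"RMSE = {rmse(actual, predicted):.1f} Wh")
print(f"R^2  = {100 * r_squared(actual, predicted):.1f} %")
```

A model that always predicts the mean of the actual values scores an R^{2} of 0, which gives a sense of how weak a value of around 17% is.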

5. Results and Discussion

Figure 8 illustrates the hourly prediction error for each technique. The final results of the study, comparing the performance of the proposed deep ANN with the baseline approaches, are shown in Table 11. The results clearly show that the proposed deep ANN and XGB were more accurate than the shallow ANN and MLR models. These observations reinforce the findings of the MCA carried out on the literature and support the choice of the proposed model. However, Table 11 also shows that a simple shallow ANN performs as weakly as an MLR model unless it is adequately tuned. Adding hidden layers and tuning the hyperparameters allowed the ANN to achieve much higher accuracy, with the resulting deep ANN reaching a coefficient of determination of 97.5%.

The MLR prediction graph showed that the estimated values followed a linear pattern and did not adequately represent the actual values. The MLR method achieved an average RMSE of 635 W and a coefficient of determination of 17.2%. Similarly, the shallow ANN achieved an average RMSE of 637 W and a coefficient of determination of 16.6%. It is therefore clear that a lack of hidden layers and hyperparameter tuning significantly reduces the ANN model’s predictive performance. On the other hand, the XGB model’s estimated values were much closer to the expected values: XGBoost achieved an average RMSE of 271 W and a coefficient of determination of 84.9%. Better still, the deep ANN model’s estimated values resembled the original data set to a greater extent, with an average RMSE of 111 W and a coefficient of determination of 97.5%. Overall, this trend in the results is consistent with the inferences obtained from the MCA in Section 3.2 and the machine learning model’s design objectives highlighted in Section 4.2.1.

The line graph in Figure 9 gives a clearer visual comparison of all models’ performance. The deep ANN model achieved the highest accuracy of all the techniques examined in this study, as indicated by its RMSE curve being the lowest of the four. The increased accuracy is due to its ability to model the randomness in the data and to deal with input noise, unlike linear regression methods, which are only suited to linear modelling. The XGB model is more susceptible to noisy data than the deep ANN, which is reflected in its higher RMSE, although its predictions remain reasonably accurate. However, despite performing well in predicting energy consumption values, the deep ANN took a significantly longer computation time to build the model (2738 s, or 45 min and 38 s) compared with 4 s for MLR and 28 s for XGB. The type of prediction model should therefore be chosen based on the accuracy required and the computational resources available.
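
The trade-off between accuracy and build time can be made concrete with a small selection helper. The figures are those reported in Table 11 and in the paragraph above; the helper `choose_model` and the idea of a fixed time budget are illustrative assumptions, and the shallow ANN is left out because its build time is not reported.

```python
# name: (RMSE in W, coefficient of determination in %, build time in s)
RESULTS = {
    "MLR": (634.65, 17.2, 4),
    "XGB": (270.85, 84.9, 28),
    "Deep ANN": (111.20, 97.5, 2738),
}


def choose_model(max_build_seconds):
    """Return the lowest-RMSE model whose build time fits the budget,
    or None if no model fits (hypothetical selection rule)."""
    candidates = {
        name: values
        for name, values in RESULTS.items()
        if values[2] <= max_build_seconds
    }
    if not candidates:
        return None
    return min(candidates, key=lambda name: candidates[name][0])


# Convert the deep ANN build time into minutes and seconds.
minutes, seconds = divmod(RESULTS["Deep ANN"][2], 60)

print(choose_model(60))    # budget of one minute
print(choose_model(3600))  # budget of one hour
print(minutes, seconds)
```

With a one-minute budget the helper falls back to XGB, while a one-hour budget admits the deep ANN.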

6. Conclusions

In this study, an algorithm for predicting the energy consumption of a residential building based on the occupancy rate was investigated. The synthetic load profile generator proposed in this study, which is close to a realistic model, was used to generate the random load profiles that were then used to train the proposed state-of-the-art deep ANN model. The computation time and the accuracy of the different machine learning models were then compared, and the results indicated that the proposed deep ANN model was the most appropriate for energy consumption prediction.

This study’s main contribution was the novel predictive algorithm for load forecasting based on occupancy rate and the establishment of the finely tuned deep ANN model. Other findings from the research include:

MLR was among the least accurate in prediction (17.2%), but it was the fastest in computation (4 s). Since the energy consumption values do not have a close linear relationship with time but instead show significant randomness in peak consumption, it was difficult for MLR to find a best-fit function and hence to predict values accurately.
XGBoost performs better than MLR in predicting energy consumption (84.9%) but does not handle noise well and is not suitable for time series data sets. Therefore, it falls in the middle of the ranking.
Deep ANN performs better than shallow ANN, but training the model can take hours or days (about 45 min in this study). However, the model can be very accurate in prediction (97.5%) since it works well with random data sets and can handle noise. It is at the top of the ranking for its ability to accurately predict energy consumption. In cases where computation time is not a significant concern, deep ANN is highly recommended.

Further research can be performed to optimise the hyper-parameters of the ANN model, such as the learning rate, momentum, number of epochs, number of iterations, and batch size. Additionally, the proposed deep neural network-based forecasting algorithm can be evaluated with real historical load profile data in the future. Overall, the authors believe that the novel synthetic load profile generator and the finely tuned deep ANN model developed in this study will enhance the performance of load profile forecasting and can be used in future with a wide variety of data sets. The synthetic profile generator is particularly useful when historical data are not available, since the novel MG model can generate the load profiles needed to forecast the hourly energy consumption profile. The prediction algorithms also give a market operator the opportunity to predict customers’ energy consumption with limited inputs and thereby identify the most suitable energy generation schedule.

Author Contributions

Conceptualisation: L.H.M.T., K.H.K.C., R.L., G.S.T., M.S. and A.S.; data curation: L.H.M.T., K.H.K.C. and R.L.; formal analysis: L.H.M.T., K.H.K.C. and R.L.; investigation: G.S.T., M.S., B.H., S.M. and A.S.; methodology: L.H.M.T., K.H.K.C., R.L., G.S.T., M.S., B.H., S.M. and A.S.; project administration: L.H.M.T. and G.S.T.; resources: B.H., S.M. and A.S.; software: K.H.K.C. and G.S.T.; writing—original draft: L.H.M.T., K.H.K.C., G.S.T. and R.L.; writing—review and editing: M.S., B.H., S.M. and A.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

The authors thank Daftlogic for the typical household power consumption data published on their website, which formed the basis for the synthetic data generation in this study.

Conflicts of Interest

The authors declare no conflict of interest.

Nomenclature

MLR	Multiple linear regression
XGB	Extreme gradient boost
ANN	Artificial neural network
ML	Machine learning
ANFIS	Adaptive neural fuzzy inference system
WNN	Wavelet neural network
SVM	Support vector machine
ARIMA	Autoregressive integrated moving average
MCA	Multi-criteria analysis
EMS	Energy management system

References

1. Green, D.; Sonnreich, T. Centralised to De-centralised Energy—What Does it Mean for Australia. In Infrastructure for 21st Century Australian Cities; Australian Davos Connection, Limited: Melbourne, Australia, 2015; Volume 177. Available online: https://reneweconomy.com.au/centralised-decentralised-energy-mean-34072/ (accessed on 27 January 2021).
2. Lasseter, R.; Akhil, A.; Marnay, C.; Stephens, J.; Dagle, J.; Guttromsom, R.; Meliopoulous, A.S.; Yinger, R.; Eto, J. Integration of Distributed Energy Resources. The CERTS Microgrid Concept; Technical Report; Lawrence Berkeley National Lab. (LBNL): Berkeley, CA, USA, 2002.
3. Lantero, A. How Microgrids work. Renew. Sustain. Energy Rev. 2014, 1, 1.
4. Al-Saedi, W.; Lachowicz, S.W.; Habibi, D.; Bass, O. Power flow control in grid-connected microgrid operation using Particle Swarm Optimization under variable load conditions. Int. J. Electr. Power Energy Syst. 2013, 49, 76–85.
5. Jiang, Q.; Xue, M.; Geng, G. Energy management of microgrid in grid-connected and stand-alone modes. IEEE Trans. Power Syst. 2013, 28, 3380–3389.
6. Kinn, M.C. Proposed components for the design of a smart nano-grid for a domestic electrical system that operates at below 50V DC. In Proceedings of the 2011 2nd IEEE PES International Conference and Exhibition on Innovative Smart Grid Technologies, Manchester, UK, 5–7 December 2011; pp. 1–7.
7. Mateska, A.K.; Borozan, V.; Krstevski, P.; Taleski, R. Controllable load operation in microgrids using control scheme based on gossip algorithm. Appl. Energy 2018, 210, 1336–1346.
8. Morsali, R.; Thirunavukkarasu, G.S.; Seyedmahmoudian, M.; Stojcevski, A.; Kowalczyk, R. A relaxed constrained decentralised demand side management system of a community-based residential microgrid with realistic appliance models. Appl. Energy 2020, 277, 115626.
9. Zhang, L.; Sun, M. Research on the type of load of accessing to microgrid based on reliability. In Advances in Computer Science Research, Proceedings of the 2015 2nd International Conference on Electrical, Computer Engineering and Electronics, Jinan, China, 29–31 May 2015; Atlantis Press: Amsterdam, The Netherlands, 2015.
10. Fields, S. What are critical load panels? Energysage 2020, 1, 1.
11. Olivares, D.E.; Mehrizi-Sani, A.; Etemadi, A.H.; Cañizares, C.A.; Iravani, R.; Kazerani, M.; Hajimiragha, A.H.; Gomis-Bellmunt, O.; Saeedifard, M.; Palma-Behnke, R.; et al. Trends in Microgrid Control. IEEE Trans. Smart Grid 2014, 5, 2.
12. Australian Government; Department of Industry, Science, Energy and Resources. Renewables. 2019. Available online: https://www.energy.gov.au/data/renewables (accessed on 18 October 2020).
13. Shi, H.; Xu, M.; Li, R. Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Trans. Smart Grid 2017, 9, 5271–5280.
14. Hulstaert, L. Black-box vs. white-box models. Towards Data Sci. 2019, 1, 1.
15. Sajjad, M.; Khan, Z.A.; Ullah, A.; Hussain, T.; Ullah, W.; Lee, M.Y.; Baik, S.W. A novel CNN-GRU-based hybrid approach for short-term residential load forecasting. IEEE Access 2020, 8, 143759–143768.
16. Park, H.S.; Phuong, D.X.; Kumar, S. AI based injection molding process for consistent product quality. Procedia Manuf. 2019, 28, 102–106.
17. Khosravani, M.R.; Nasiri, S.; Weinberg, K. Application of case-based reasoning in a fault detection system on production of drippers. Appl. Soft Comput. 2019, 75, 227–232.
18. Fumo, N.; Biswas, M.R. Regression analysis for prediction of residential energy consumption. Renew. Sustain. Energy Rev. 2015, 47, 332–343.
19. Tso, G.K.; Yau, K.K. Predicting electricity energy consumption: A comparison of regression analysis, decision tree and neural networks. Energy 2007, 32, 1761–1768.
20. Rosenblatt, F. The perceptron: A probabilistic model for information storage and organization in the brain. Psychol. Rev. 1958, 65, 386.
21. Kalogirou, S.A.; Bojic, M. Artificial neural networks for the prediction of the energy consumption of a passive solar building. Energy 2000, 25, 479–491.
22. Fields, S. List of the Power Consumption of Typical Household Appliances. 2020. Available online: https://www.daftlogic.com/information-appliance-power-consumption.htm (accessed on 13 October 2020).
23. DeepAI. Hyperparameter. 2020. Available online: https://deepai.org/machine-learning-glossary-and-terms/hyperparameter (accessed on 27 January 2021).
24. MissingLink. The Complete Guide to Artificial Neural Networks: Concepts and Models. 2020. Available online: https://missinglink.ai/guides/neural-network-concepts/complete-guide-artificial-neural-networks/ (accessed on 27 January 2021).
25. Wagenaar, T. Deep Networks vs. Shallow Networks: Why Do We Need Depth? 2020. Available online: https://stats.stackexchange.com/q/274571 (accessed on 28 January 2021).

Figure 1. Procedures for establishing multi-criteria analysis (MCA).

Figure 2. MCA matrices and scoring table.

Figure 3. Research methodology.

Figure 4. Visual representation of all the load profiles.

Figure 5. Data-flow of machine learning with the input and output choices.

Figure 6. Layout summary of our ANN designs, shallow ANN (left), and deep ANN (right).

Figure 7. The proposed deep ANN layout for this study. * Layer 3 and layer 6 are the dropout layers.

Figure 8. The graph shows the prediction from each technique: MLR (top left), XGB (top right), shallow ANN (bottom left), and deep ANN (bottom right).

Figure 9. The graph of RMSE of estimated energy consumption value using MLR (cyan), XGBoost (blue), shallow ANN (red), and deep ANN (green).

Table 1. The set of chosen criteria for the MCA.

Criterion	Description
Implementation Feasibility	The ease of implementing the technique within the restricted amount of time and resources.
Usability	The capacity to provide a condition for its users to perform the tasks safely, effectively, efficiently and satisfactorily.
Computational time	The amount of time it takes for the technique to converge to an outcome.
Accuracy	The size of the discrepancy between the technique’s outcome and the real value.
Randomisation	The ability of the technique to draw a pattern from a random dataset.
Adaptability	The ability of the technique to be combined with other optimisation techniques or to be used in a different environment.

Table 2. MCA total scoring table.

Techniques	Total Score
ANFIS (Adaptive Neural Fuzzy Inference System)	187
ANN (Artificial Neural Network)	189
MLR (Multiple Linear Regression)	170
XGBoost (eXtreme Gradient Boosting)	185
WNN (Wavelet Neural Network)	181
SVM (Support Vector Machine)	169
ARIMA (Auto Regressive Integrated Moving Average)	140
GPR (Gaussian Process Regression)	144

Table 3. Initial load profile 1 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 1:	1	Fridge	80	24	00:00–23:59	1920
University student		Washing machine	800	0.1	10:00–11:00	80
		Fan	200	10	13:00–23:00	2000
		Microwave	800	0.1	12:00–13:00	80
		Heater	2000	0	-	-
		Chargeable devices	150	8	22:00–6:00	1200
		Rice cooker	830	0.5	11:00–12:00	415
		Toaster	850	0.05	7:00–8:00	42.5
		TV	150	4	18:00–22:00	600
		Gaming console	150	4	18:00–22:00	600
		Other chargeable devices	200	6	10:00–16:00	1200
		Total		Average S = 7.5, W = 10.4 kWh		8137.5	3.2924425
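
The Energy Consumption column is the product of each load's wattage and its daily usage time. The following sketch recomputes the daily total for household 1 from the Table 3 rows as a consistency check; the names `LOADS` and `daily_energy_wh` are illustrative.

```python
# (load, wattage in W, usage in h per day), copied from Table 3.
LOADS = [
    ("Fridge", 80, 24),
    ("Washing machine", 800, 0.1),
    ("Fan", 200, 10),
    ("Microwave", 800, 0.1),
    ("Heater", 2000, 0),  # not used in summer
    ("Chargeable devices", 150, 8),
    ("Rice cooker", 830, 0.5),
    ("Toaster", 850, 0.05),
    ("TV", 150, 4),
    ("Gaming console", 150, 4),
    ("Other chargeable devices", 200, 6),
]


def daily_energy_wh(wattage_w, hours_per_day):
    """Daily energy in Wh = power in W x usage time in h."""
    return wattage_w * hours_per_day


total_wh = sum(daily_energy_wh(w, h) for _, w, h in LOADS)
print(f"Total: {total_wh:.1f} Wh per day")
```

The result agrees with the 8137.5 Wh total in the last row of Table 3.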

Table 4. Initial load profile 2 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 2:	2	Fridge	120	24	00:00–23:59	2880
20s couple		Washing machine	800	0.3	13:00–14:00	240
		Fan	200	10	17:00–3:00	2000
		Microwave	1000	0.2	12:00–13:00	200
		Heater	2000	0	-	-
		Chargeable devices	400	8	22:00–6:00	3200
		Blender	500	0.1	9:00–10:00	50
		Toaster	850	0.1	6:00–7:00	85
		Iron	1200	0.1	19:00–20:00	120
		Vacuum cleaner	1000	0.2	16:00–17:00	200
		Gaming console	300	2	19:00–21:00	600
		Coffee machine	1000	0.1	6:00–7:00	100
		Total		Average S = 11.5, W = 14.6 kWh		9675	3.669745

Table 5. Initial load profile 3 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 3:	3	Fridge	120	24	00:00–23:59	2880
Young family		Washing machine	800	0.7	9:00–10:00	560
		Fan	200	12	12:00–0:00	2400
		Microwave	1000	0.5	11:00–12:00	500
		Heater	2000	0	-	-
		Chargeable devices	300	12	16:00–4:00	3600
		TV	150	6	13:00–19:00	900
		Rice cooker	850	1	11:00–12:00	850
		Toaster	850	0.3	4:00–5:00	255
		Iron	1200	0.5	20:00–21:00	600
		Vacuum cleaner	1000	0.2	14:00–15:00	200
		Dish washer	1500	1	18:00–19:00	1500
		Gaming console	300	2	17:00–19:00	600
		Coffee machine	1000	0.3	4:00–5:00	300
		Total		Average S = 13.1, W = 17.6 kWh		15,345	5.061163

Table 6. Initial load profile 4 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 4:	2	Fridge	120	24	00:00–23:59	2880
Middle-aged couple		Washing machine	800	0.3	19:00–20:00	240
		Fan	200	4	20:00–0:00	800
		Microwave	1000	0.05	17:00–18:00	50
		Heater	2000	0	-	-
		Chargeable devices	200	12	18:00–6:00	2400
		TV	150	3	20:00–23:00	450
		Rice cooker	830	0.2	17:00–18:00	166
		Toaster	850	0.1	5:00–6:00	85
		Iron	1200	0.2	19:00–20:00	240
		Vacuum cleaner	1000	0.1	16:00–17:00	100
		Gaming console	150	1	20:00–21:00	150
		Coffee machine	1000	0.2	5:00–6:00	200
		Total		Average S = 11.5, W = 14.6 kWh		10,461	3.8626294

Table 7. Initial load profile 5 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 5:	4	Fridge	150	24	00:00–23:59	3600
Family		Washing machine	800	0.7	9:00–10:00	560
		Fan	200	10	8:00–18:00	2000
		Microwave	1000	0.5	10:00–11:00	500
		Heater	2000	0	-	-
		Chargeable devices	300	12	20:00–8:00	3600
		TV	150	3	19:00–22:00	450
		Toaster	850	0.3	6:00–7:00	255
		Iron	1200	0.5	6:00–7:00	600
		Vacuum cleaner	1000	0.2	11:00–12:00	200
		Dish washer	1500	1	18:00–19:00	1500
		Air fryer	1000	0.2	11:00–12:00	200
		Gaming console	150	2	20:00–22:00	300
		Coffee machine	1000	0.3	6:00–7:00	300
		Total		Average S = 14.5, W = 18.9 kWh		15,215	5.029261

Table 8. Initial load profile 6 created for the project (summer).

Type of Household	No. of Occupants	Load	Wattage (W)	Amount of Time per Day (h)	Preferred Usage Time	Energy Consumption (Wh)	Cost per Day (AUD)
Household 6:	5	Fridge	180	24	00:00–23:59	4320
Big family		Washing machine	800	0.7	12:00–13:00	560
		Fan	200	12	12:00–24:00	2400
		Microwave	1000	0.5	11:00–12:00	500
		Heater	2000	0	-	-
		Chargeable devices	300	12	20:00–8:00	3600
		TV	150	6	16:00–22:00	900
		Rice cooker	850	1	11:00–12:00	850
		Toaster	850	0.3	6:00–7:00	255
		Iron	1200	0.5	19:00–20:00	600
		Vacuum cleaner	1000	0.2	14:00–15:00	200
		Dish washer	1500	1	19:00–20:00	1500
		Air fryer	1000	0.2	17:00–18:00	200
		Gaming console	300	3	19:00–22:00	900
		Coffee machine	1000	0.3	6:00–7:00	300
		Total		Average S = 15.8, W = 20.8 kWh		17,085	5.488159

Table 9. Initial load profile created for the project (summer).

Time	Energy Consumption (Wh)	Time	Energy Consumption (Wh)
00:00	230	12:00	360
01:00	230	13:00	480
02:00	230	14:00	480
03:00	230	15:00	480
04:00	230	16:00	480
05:00	230	17:00	280
06:00	230	18:00	580
07:00	123	19:00	380
08:00	80	20:00	380
09:00	80	21:00	380
10:00	360	22:00	230
11:00	695	23:00	230
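
One way to turn an appliance schedule into an hourly profile like the one above is to spread each load's daily energy evenly over its preferred usage window. The sketch below does this for the household 1 loads of Table 3. The even spreading, the half-open windows (the end hour is excluded), and the name `hourly_profile` are assumptions made for illustration; this is not the authors' generator, and it is not claimed to reproduce every entry of Table 9.

```python
# load: (wattage in W, usage in h per day, window start hour, window end hour)
# Values follow Table 3; the unused heater is omitted.
APPLIANCES = {
    "Fridge": (80, 24, 0, 24),
    "Washing machine": (800, 0.1, 10, 11),
    "Fan": (200, 10, 13, 23),
    "Microwave": (800, 0.1, 12, 13),
    "Chargeable devices": (150, 8, 22, 6),  # window wraps past midnight
    "Rice cooker": (830, 0.5, 11, 12),
    "Toaster": (850, 0.05, 7, 8),
    "TV": (150, 4, 18, 22),
    "Gaming console": (150, 4, 18, 22),
    "Other chargeable devices": (200, 6, 10, 16),
}


def hourly_profile(appliances):
    """Return 24 hourly energy values (Wh), spreading each load's daily
    energy evenly over the hours of its usage window (assumed rule)."""
    profile = [0.0] * 24
    for watts, hours, start, end in appliances.values():
        window = (end - start) % 24 or 24  # window length in whole hours
        per_hour_wh = watts * hours / window
        for k in range(window):
            profile[(start + k) % 24] += per_hour_wh
    return profile


profile = hourly_profile(APPLIANCES)
for hour in (0, 11, 18):
    print(f"{hour:02d}:00  {profile[hour]:.1f} Wh")
```

Under these assumptions, hours 00:00, 11:00, and 18:00 come out as 230, 695, and 580 Wh, which coincide with the corresponding entries above, and the 24 values sum to the 8137.5 Wh daily total of Table 3.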

Table 10. The final version of the load profile.

Index	Season	Number of Occupants	Hour	Energy Consumption (Wh)
1	0	1	0	230
2	0	1	1	230
3	0	1	2	230
4	0	1	3	230
5	0	1	4	230
:	:	:	:	:
99,996	0	2	2	303.92
99,997	1	1	22	272.93
99,998	0	3	18	784.27
99,999	1	1	3	177.36
100,000	1	4	7	518.85

Table 11. The result of the simulation.

Technique	RMS Error (Watts)	Coefficient of Determination (%)
Deep ANN	111.20	97.5
XGB	270.85	84.9
MLR	634.65	17.2
Shallow ANN	636.74	16.6

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
