1. Introduction
Energy consumption in buildings is attracting increasing attention because of its major economic and environmental effects [1]. The construction sector, specifically, has been shown to be responsible for a substantial share of global energy consumption and of CO₂ emissions [2], and these figures continue to grow with urbanization [3]. Therefore, changes in the energy consumption and energy efficiency of buildings are likely to have a heavy impact on society, including on major socioeconomic and ecological issues, such as global warming and climate change [4].
Smart metering has been proposed as an alternative to improve building energy management and efficiency [
5,
6,
7]. The concept of smart metering is usually related to intelligent meter devices, which can be remotely and locally accessed, and are able to register, process and provide feedback regarding energy consumption of the household [
8].
By allowing close monitoring of energy consumption on the demand side and greater observability and controllability of the power grid, smart metering can be utilized as a tool to gain efficiency in each step of the customer-provider relationship.
The ready availability of historical consumption data provided by smart meters allows consumers to extract insights about their own behavior and to evaluate their consumption patterns.
On the supply side, energy providers may be able to use smart metering to have a more in-depth and accurate overview of the energy consumption in each region. In addition, it allows for better forecasting of energy demand, which results in better maintenance and network planning [
9].
Finally, the forecasting of energy demand allows the supplier to react quickly to changes in any part of the generation, transmission, and distribution process, and to identify suspicious energy consumption activity and detect fraud; see, for example, References [8, 10].
As a result, the task of forecasting energy consumption has been shown to be of great value to the efficiency gains from smart metering. The ability to accurately forecast the energy consumption of each household allows better fraud detection and maintenance/network planning. For this reason, approaches based on statistical linear and Machine Learning (ML) techniques have been widely employed in this task [11, 12, 13].
However, accurately forecasting energy consumption is a challenging task because the data follow both linear and non-linear patterns [11]. The motivation for utilizing statistical linear models is in part the existence of a well-established methodology for model construction, proposed by Box & Jenkins [14]. Among the linear techniques, Seasonal Autoregressive Integrated Moving Average (SARIMA) is commonly used because it can model the seasonality in time series [11].
Although statistical linear techniques are flexible, they assume that the generating process of the time series is linear and, as a result, they do not perform well with non-linear processes. While linearity is often a useful assumption, it has been shown since the early 1980s to fail in many real-world problems [15].
On the other hand, ML techniques, such as Artificial Neural Networks (ANNs) [12, 16] and Support Vector Regression (SVR) [13, 17], have been employed due to their data-driven approach and ability to model non-linear patterns [18]. These techniques, however, do not possess established methodologies for feature selection and are sensitive to hyperparameter misspecification, which can degrade their performance [19, 20].
In this context, hybrid systems have been developed to combine the strengths of statistical and ML techniques for modeling the linear and non-linear components of a real-world time series [19, 21, 22]. Zhang [19] proposed a hybrid system that assumes the time series is a linear combination of its linear and non-linear patterns, as follows:
$$Z_t = L_t + N_t, \qquad (1)$$
where $Z_t$ is a real-world time series, $L_t$ is the linear component, and $N_t$ is the non-linear component. Based on Equation (1), the hybrid system proposed by Zhang [19] is composed of three phases: modeling of the time series ($Z_t$) employing a statistical linear technique, non-linear modeling of the residual series ($E_t$) using an ML technique, and the combination of the linear ($\hat{L}_t$) and non-linear ($\hat{N}_t$) forecasts using a simple sum. Several works followed Zhang’s assumption in different applications [23, 24, 25, 26]. However, hybrid systems that assume a linear combination can degrade the whole system’s performance, since the relationship may not be additive [27].
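The three phases of the additive scheme can be illustrated with a minimal Python sketch. It is not Zhang's implementation: the linear model is replaced by a persistence forecast and the residual model by a moving average of recent residuals, and the function names (`linear_forecast`, `residual_forecast`, `additive_hybrid_forecast`) are hypothetical. The purpose is only to show how the residual series is formed and how the two forecasts are summed.

```python
def linear_forecast(series, t):
    """Stand-in for the statistical linear model: the previous observation."""
    return series[t - 1]


def residual_forecast(residuals, k=2):
    """Stand-in for the ML residual model: mean of the last k residuals."""
    window = residuals[-k:]
    return sum(window) / len(window)


def additive_hybrid_forecast(series, k=2):
    """One-step-ahead forecast as the sum of linear and residual forecasts."""
    # Phase 1: in-sample linear forecasts for t = 1 .. n-1.
    fitted = [linear_forecast(series, t) for t in range(1, len(series))]
    # Phase 2: residual series, actual value minus linear forecast.
    residuals = [z - l for z, l in zip(series[1:], fitted)]
    # Phase 3: simple sum of the two one-step-ahead forecasts.
    l_next = linear_forecast(series, len(series))
    n_next = residual_forecast(residuals, k)
    return l_next + n_next
```

For a series with a steady upward step, the persistence forecast always lags by one step, and the residual model recovers exactly that missing step, which is the behavior the additive assumption relies on.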
Alternatively, Khashei and Bijari [20] proposed a non-linear combination (Equation (2)) of the linear and non-linear forecasts to overcome this limitation:
$$Z_t = f(L_t, N_t), \qquad (2)$$
where $f$ is a combination function generated by an ML technique. Khashei and Bijari’s hybrid system has two phases: time series modeling using ARIMA, and non-linear modeling of the linear forecast residuals, in which an ML technique also generates the final forecast.
Based on this assumption, Chou and Ngo [7, 28] proposed a hybrid system that combines SARIMA with a Least Squares SVR (LSSVR) trained via a metaheuristic Firefly Algorithm (MetaFA) in a smart metering data forecast scenario. In that work, the hybrid system SARIMA-MetaFA-LSSVR employs SARIMA for linear modeling of the time series; the LSSVR then produces the final forecast from non-linear modeling of the residuals, the linear output, and exogenous data. The results obtained in References [7, 28] show that hybrid systems can be a promising approach in terms of accuracy when compared with single statistical and ML techniques and with other approaches in the literature.
However, hybrid systems are still affected by obstacles that originate from ML modeling, such as the lack of an established methodology for hyperparameter optimization and feature selection. To overcome these limitations, this paper proposes a three-phase hybrid system, named Evolutionary Hybrid System (EvoHyS), that non-linearly combines statistical and ML techniques to forecast smart grid networks’ time series. The proposed hybrid system is composed of three phases: (i) forecast of the linear and seasonal component of the time series using a Seasonal Autoregressive Integrated Moving Average (SARIMA) model, (ii) forecast of the error series using an ML technique, and (iii) combination of the linear and non-linear forecasts from (i) and (ii) using a secondary ML model. A Genetic Algorithm (GA) is employed to find the best set of parameters and input data for phases (ii) and (iii). An experimental evaluation in a 1-day-ahead forecast scenario was performed with a data set from a smart grid network installed in a residential building. EvoHyS attained the best overall performance on five well-known accuracy measures when compared with single, ensemble, and hybrid models from the literature. These results indicate that EvoHyS is a promising strategy to support decisions aimed at improving energy efficiency in buildings that use smart metering. EvoHyS is innovative in the smart meter consumption forecasting area because:
It is the first hybrid system that performs the forecasting employing three phases of modeling;
It employs an ML model in an exclusive phase to find the best combination between linear and non-linear forecasts;
It uses a GA to search for the best set of parameters of the ML models in the residual and combination modeling phases;
It is able to search for the set of linear and non-linear forecasts that maximize the accuracy of the whole system.
The remainder of this paper is organized as follows: Section 2 presents related work on the use of evolutionary algorithms for combining forecasting models. Section 3 describes and discusses our methodology, and Section 4 evaluates our technique on an energy consumption forecasting problem. Finally, Section 4.3 compares our method with other works in the literature and indicates where further research is warranted.
2. Related Work
The development of systems based on ML models has been highlighted in the energy forecasting area [
29]. In this area, electricity load and energy consumption forecasts have received great attention due to their relationship to demand, supply, and environmental issues [
30,
31]. In general, electricity load forecasting has a major impact on the planning, operation, and monitoring of power systems. The accuracy of the forecasts can affect operation costs, since an overestimation can increase the number of generators employed and produce an unnecessary reserve of electricity. An underestimation of the electricity load can put the system’s reliability at risk, because the supply may be insufficient to meet market demand [
32]. In the same way, electricity consumption forecasting models can improve energy efficiency and sustainability in diverse sectors, such as in residential buildings [
33,
34,
35] and in industry [
36,
37].
In order to achieve accurate electricity load forecasts, several machine learning models have been employed in this task [
38,
39,
40]. Models, such as ANNs based on Wavelets [
38], Long Short-Term Memory (LSTM), Random Forests [
39], and ensembles [
40], have been investigated.
Likewise, energy consumption forecasting systems based on ML models have been used in the literature. Culaba et al. [
33] employed a hybrid system based on clustering and forecasting using
K-Means and SVR models, respectively. Deep learning models, such as Convolutional Neural Networks (CNN), were employed by Reference [
34] for energy consumption forecasts in the context of new buildings with little historical data. Pinto et al. [
35] used ensemble models to forecast energy consumption in office buildings. Walther and Weigold [
36] performed a systematic review of the literature on energy consumption forecasting models in the industry.
Considering the literature of energy consumption forecasting on smart metering data, several ML methods have been investigated. In this context, Gajowniczek and Zabkowski [
41] employed MLP and SVR models to forecast the consumption on individual smart meters. For that, their solution extracts features related to the meter’s consumption history (e.g., average, maximum and minimum load) and the temperature inside the house. They argued that they do not perform a traditional time series modeling due to the high volatility of their data.
Zhukov et al. [
42] investigated the effects of concept drift in smart grid analysis. A random forest algorithm for concept drift was employed, and an ensemble using the weighted majority vote rule was used to combine the outputs of individual learners. The proposed method was compared to other algorithms in the concept drift detection context, obtaining promising results.
Electricity pricing and load forecasting are important tasks in smart grid structures due to the improvements of efficiency in the management of electric systems [
31,
43,
44]. In this scenario, Heydari et al. [
43] proposed a hybrid system based on variational mode decomposition (VMD), gravitational search algorithm (GSA), and general regression neural networks (GRNN). The VMD decomposes the series into several intrinsic mode functions (IMFs), while the GSA performs feature selection on the time series. Furthermore, considering the importance of electricity load forecasting in electric systems, this task can also be performed for individual households through the use of smart metering technologies [
45,
46]. In this way, Li et al. [
47] employed a Convolutional Long Short-Term Memory-based neural network with Selected Autoregressive Features to improve forecasting accuracy. Fekri et al. [
46] used deep learning models based on online adaptive recurrent neural networks, considering that energy consumption patterns may change over time. In addition, several load forecasting applications have been addressed, such as peak alert systems [
48], where a modified support vector regression is employed, using smart meter data and weather data as input.
Another work on smart metering forecasting [49] investigated the effects of factors such as seasonality and weather conditions on electricity consumption prediction, using different machine learning algorithms: regression trees, MLP, and SVR. Its findings show that regression trees obtain the lowest Root Mean Squared Error (RMSE) values in almost all evaluated scenarios; that adding weather data does not improve the results; and that a historical window of one year for training the models is enough to achieve low-error forecasts.
Sajjad et al. [
50] propose a deep-learning model for hourly energy consumption forecasts of appliances and houses. The input data is processed using min-max normalization or z-score standardization and is then fed into a Convolutional Neural Network (CNN) followed by a Recurrent Neural Network (RNN), specifically a Gated Recurrent Unit (GRU). Finally, a dense layer on top of the GRU outputs the prediction. They do not provide, however, any details about their strategy for selecting the hyperparameters of the network.
Similarly, Wang et al. [
51] employ a Long Short-Term Memory (LSTM) model that outputs quantile probabilistic forecasts. For training, the network minimizes the average quantile loss over all quantiles. The input of the network is composed of the historical consumption, the day of the week, and the hour of the day of the data point to be predicted. Similar to Reference [
50], the process of selection of nodes and layers of the network is not presented.
In addition, hybrid systems have gained attention due to their ability to increase the accuracy of the single ML models [
30,
52]. These systems are developed aiming to overcome the limitations of single ML models regarding misspecification, overfitting, and underfitting [
27]. In this sense, Somu et al. [
53] employed the
K-means clustering-based convolutional neural networks and long short term memory (
KCNN-LSTM) to forecast energy consumption using data from smart meters. In this work, the
K-means is employed to identify trend and seasonal patterns in the time series, while the CNN-LSTM is used in the forecasting process.
Chou and Truong [
54] proposed a hybrid system composed of four steps: linear time series modeling, non-linear residual modeling, combination, and optimization. The parameter selection process for the models employed in the first three steps is performed through a Jellyfish Search (JS) optimization algorithm [
55]. Bouktif et al. [
56] employed a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) to search for hyperparameters of the LSTM in load forecasting tasks.
The proposed hybrid system differs from the hybrid systems proposed in the literature since it employs a GA to perform the optimization of the residual forecasting model and the combination model. Furthermore, the optimization also selects the most relevant lags to reduce model complexity and enhance forecasting accuracy.
3. Evolutionary Hybrid System
The proposed Evolutionary Hybrid System (EvoHyS) can be better understood by splitting it into two steps: training and testing. Figure 1 and Figure 2 show that each step is divided into three phases. In the training step (Figure 1), the training set of the time series ($Z_t$) is modeled by a linear model ($ML_L$), an ML model ($ML_N$) is used to model the residual series ($E_t$), and the combination of the linear and non-linear estimates is performed with a second ML model ($ML_C$). Figure 2 shows that the linear ($ML_L$) and ML ($ML_N$) models are employed in the test step to forecast the time series and the respective residuals. After that, the $ML_C$ model is used to combine these estimates, generating the final forecast.
For linear modeling, a Seasonal Autoregressive Integrated Moving Average (SARIMA) model was chosen as $ML_L$ due to its ability to model seasonal time series, such as historical energy consumption data. The Box & Jenkins methodology [14], used in the design and parameter adjustment of the SARIMA model, models the linear patterns present in the training data (Equation (3)).
After the $ML_L$ model is estimated, the residual series ($E_t$) is calculated as the difference between the training data ($Z_t$) and the linear forecast ($\hat{L}_t$), according to Equation (4):
$$E_t = Z_t - \hat{L}_t. \qquad (4)$$
Figure 1 shows that the outputs of Stage (I) of the training step are: the linear forecast ($\hat{L}_t$), the residual series ($E_t$), and the estimated SARIMA model ($ML_L$). The SARIMA model is used in the test step.
Figure 1 shows that Stage (II) of the training step receives the residual series ($E_t$) as input data. In this stage, an ML technique ($ML_N$) is trained to model the non-linear patterns that were not modeled in Stage (I). For this purpose, $E_t$ is subdivided into training and validation sets for the estimation of the $ML_N$ parameters. The Support Vector Regression (SVR) model was chosen to perform the non-linear modeling of the residual series. In contrast to ML models such as neural networks, the SVR solves a quadratic optimization problem, which yields a single global minimum [57]. Furthermore, SVR models have shown accurate results in residual modeling [21]. A Genetic Algorithm (GA) is employed to search for the SVR parameters and the best set of temporal lags. The objective is to improve the residual modeling using the GA’s capacity for exploration and exploitation. The forecast of the non-linear patterns ($\hat{N}_t$) from the residual series ($E_t$) is performed after the GA finds the parameters and the set of temporal lags that maximize the accuracy of the SVR (Equation (5)).
The outputs of Stage (II) of the training step are the trained $ML_N$ and the forecast of the residuals ($\hat{N}_t$). The trained $ML_N$ is used in the test step.
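Selecting a set of temporal lags amounts to choosing which past residuals become the input features of the residual model. The sketch below shows one way to build the training patterns from a binary lag mask; the helper name `build_lagged_dataset` and the convention that position i of the mask corresponds to lag i + 1 are assumptions made for illustration, not details taken from the paper.

```python
def build_lagged_dataset(series, lag_mask):
    """Build (inputs, targets) for one-step-ahead modeling.

    lag_mask[i] == 1 means that lag i + 1 is used as an input
    (an assumed convention for this sketch).
    """
    max_lag = len(lag_mask)
    selected = [i + 1 for i, bit in enumerate(lag_mask) if bit == 1]
    inputs, targets = [], []
    # Start at max_lag so every candidate lag is available for each pattern.
    for t in range(max_lag, len(series)):
        inputs.append([series[t - lag] for lag in selected])
        targets.append(series[t])
    return inputs, targets
```

With the mask `[1, 0, 1]`, each input pattern contains the residuals at lags 1 and 3, and the target is the current residual. The resulting lists can be passed to any regression model, including an SVR.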
Stage (III) of the training step employs an ML model ($ML_C$) to search for the most suitable combination of the linear and non-linear forecasts. An MLP neural network was selected to perform the combination due to its robustness against noise and its ability to approximate any non-linear function [58, 59]. The definition of the inputs of the $ML_C$ is an important task [21], since it is closely related to the accuracy of the system. Thus, a GA is used to select these inputs and to define the MLP parameters, such as the activation function, optimization algorithm, and number of hidden neurons. Equation (6) shows the final forecast in the training step of the EvoHyS.
Figure 1 shows that the output of Stage (III) of the training step is the trained $ML_C$ model. In Stages (II) and (III), a validation set is employed to verify the generalization ability of the ML models, avoid overfitting, and select the best hyperparameter setting. This validation set is a subset of the training set used in Stage (I) to estimate the parameters of the SARIMA model.
In the test step, as presented in Figure 2, Stages (I) and (II) receive the test patterns of the time series ($Z_t$) and residual series ($E_t$), respectively. The numbers of time lags used as inputs for the $ML_L$ and $ML_N$ models were defined in Stages (I) and (II) of the training step. In Stage (I), the $ML_L$ model generates the time series forecast ($\hat{L}_t$), and in Stage (II) the $ML_N$ model generates the residual forecast ($\hat{N}_t$). Finally, in Stage (III), the $ML_C$ model performs the final forecast from the combination of the outputs of Stages (I) and (II) according to Equation (7), where $\hat{L}_t$ and $\hat{N}_t$ are the forecasts used in the combination stage, as selected by the GA in Stage (III) of the training step. Algorithms 1 and 2 summarize the training and test steps in algorithmic form.
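Algorithm 1 Training Step
1: Input: $Z$ ▹ training vector of size $n$
2: Output: $ML_L$, $ML_N$, $ML_C$ ▹ trained models
3: $ML_L$ ← modelTraining($Z$) ▹ Stage (I)
4: $\hat{L}$ ← forecast($ML_L$, $Z$)
5: $E$ ← $Z - \hat{L}$
6: $ML_N$ ← modelTraining($E$) ▹ Stage (II)
7: $\hat{N}$ ← forecast($ML_N$, $E$)
8: $ML_C$ ← modelTraining($\hat{L}$, $\hat{N}$) ▹ Stage (III)
9: $\hat{Z}$ ← forecast($ML_C$, $\hat{L}$, $\hat{N}$)

Algorithm 2 Testing Step
1: Input: $Z_{test}$, $E_{test}$, $ML_L$, $ML_N$, $ML_C$
2: Output: $\hat{Z}$
3: $\hat{L}$ ← forecast($ML_L$, $Z_{test}$) ▹ Stage (I)
4: $\hat{N}$ ← forecast($ML_N$, $E_{test}$) ▹ Stage (II)
5: $\hat{Z}$ ← forecast($ML_C$, $\hat{L}$, $\hat{N}$) ▹ Stage (III)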
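The data flow of Algorithms 1 and 2 can be sketched in plain Python. The sketch deliberately replaces every model with a simple stand-in so that it stays self-contained: a seasonal naive forecast instead of SARIMA, a persistence forecast instead of the SVR, and a single least-squares weight instead of the MLP. All function names are hypothetical, the GA search is omitted, and passing the observed series as the training target of the combiner is an assumption, since Algorithm 1 does not list it explicitly.

```python
def fit_seasonal_naive(period):
    """Stand-in for SARIMA: forecast each point by the value one season earlier."""
    def forecast(series):
        return [series[t - period] if t >= period else series[t]
                for t in range(len(series))]
    return forecast


def fit_persistence():
    """Stand-in for the SVR: forecast each residual by the previous residual."""
    def forecast(residuals):
        return [residuals[t - 1] if t >= 1 else 0.0
                for t in range(len(residuals))]
    return forecast


def fit_weighted_combiner(l_hat, n_hat, z):
    """Stand-in for the MLP: least-squares weight w in z ~ l_hat + w * n_hat."""
    num = sum((zi - li) * ni for zi, li, ni in zip(z, l_hat, n_hat))
    den = sum(ni * ni for ni in n_hat)
    w = num / den if den else 0.0

    def forecast(l, n):
        return [li + w * ni for li, ni in zip(l, n)]
    return forecast


def train_step(z, period):
    ml_l = fit_seasonal_naive(period)               # Stage (I)
    l_hat = ml_l(z)
    e = [zi - li for zi, li in zip(z, l_hat)]       # residual series
    ml_n = fit_persistence()                        # Stage (II)
    n_hat = ml_n(e)
    ml_c = fit_weighted_combiner(l_hat, n_hat, z)   # Stage (III)
    return ml_l, ml_n, ml_c


def run_test_step(z_test, e_test, ml_l, ml_n, ml_c):
    l_hat = ml_l(z_test)        # Stage (I)
    n_hat = ml_n(e_test)        # Stage (II)
    return ml_c(l_hat, n_hat)   # Stage (III)
```

In a fuller implementation, each `fit_*` function would wrap a real estimator and the Stage (II) and Stage (III) fits would sit inside the GA loop. The order of operations would stay the same.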
3.1. Genetic Algorithm
In a Genetic Algorithm (GA), a set of solutions (population) undergoes an evolutionary process inspired by Charles Darwin’s theory of evolution [60]. The evolution consists of changes in the characteristics of the population over generations, guided by the fitness function. The genetic characteristics are passed to the next generation through the genetic operators (crossover and mutation). Natural selection (parent and survivor selection) improves the fitness of the population’s individuals over generations. The EvoHyS uses a GA in Stages (II) and (III) of the training step (Figure 1), where it optimizes the $ML_N$ and $ML_C$ models, respectively.
The proposed GA can be described by defining four main components: Chromosome, Fitness Function, Reproduction (Crossover and Mutation), and Survivor Selection. Each one of these components will be described in the following sections.
3.1.1. Chromosome
In this work, the chromosome, or genotype, contains the model parameters to be optimized in Stages (II) and (III) of the training step (Figure 1). Figure 3 shows the generic representation of the chromosomes that encode the $ML_N$ and $ML_C$ models. For both individuals, the chromosome is a fixed-length vector with two distinct blocks. The first block holds information on the lags that can be used as inputs for the model. Its size corresponds to the maximum number of lags that the model can use, and each position contains a binary value (0 or 1) that indicates the absence or presence of a given lag. The second block holds the parameters of the model. A list of possible values is established for each model parameter, where each parameter’s search space is defined through different orders of magnitude or according to predefined choices available in popular machine learning libraries, such as scikit-learn [61]. The population is randomly generated in both Stages (II) and (III).
In Stage (II), the SVR model is used as $ML_N$. Accordingly, the chromosome’s first block is composed of the time lags used for residual modeling. The second block holds the configuration of the SVR parameters, such as the kernel parameter gamma, the regularization factor C, and $\varepsilon$, which defines the $\varepsilon$-insensitive cost function [57].
In Stage (III), the MLP model is employed as the combination model and is represented as $ML_C$. In this stage, the first block consists of the input data employed to perform the final forecast. The second block comprises genes that describe the hidden layers, activation function, and learning algorithm. The maximum number of hidden layers is defined beforehand. The number of neurons per layer, the activation function, and the learning algorithm are chosen from lists of possible values. For each hidden layer, a binary value (0/1) indicates the absence/presence of that layer.
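The two-block chromosome can be illustrated with a short Python sketch for the SVR case. The search space values below are purely illustrative and are not the ones used in the experiments. Storing each parameter gene as an index into its list of candidate values is also an assumption of this sketch, as are the names `SVR_SPACE`, `random_chromosome`, and `decode`.

```python
import random

# Hypothetical search space for the SVR genes; values are illustrative only.
SVR_SPACE = {
    "gamma": [1e-3, 1e-2, 1e-1, 1.0],
    "C": [0.1, 1.0, 10.0, 100.0],
    "epsilon": [1e-3, 1e-2, 1e-1],
}


def random_chromosome(max_lags, space, rng):
    """First block: one bit per candidate lag. Second block: one index per parameter."""
    lag_block = [rng.randint(0, 1) for _ in range(max_lags)]
    param_block = [rng.randrange(len(values)) for values in space.values()]
    return lag_block + param_block


def decode(chromosome, max_lags, space):
    """Translate a chromosome into the selected lags and a parameter dictionary."""
    lag_block = chromosome[:max_lags]
    param_block = chromosome[max_lags:]
    lags = [i + 1 for i, bit in enumerate(lag_block) if bit == 1]
    params = {name: values[idx]
              for (name, values), idx in zip(space.items(), param_block)}
    return lags, params
```

For example, `decode([1, 0, 1, 0, 2, 1, 0], 4, SVR_SPACE)` selects lags 1 and 3 and picks the third `gamma`, second `C`, and first `epsilon` candidates. A chromosome for the MLP stage would follow the same layout, with additional binary genes for the presence of each hidden layer.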
3.1.2. Fitness Function
The fitness function used for the evaluation of each individual is given by Equation (8), which is based on the Mean Squared Error (MSE). The MSE is the mean of the squared differences between the actual values and the forecasts of the model. It is computed on the validation set in order to evaluate the generalization ability of the forecasting model.
For the MLP model (Stage (III)), each individual is evaluated three times with different weight initializations, because MLP training is stochastic. From those three evaluations, the best model is selected.
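The evaluation of an individual can be sketched as follows. The sketch works directly with the validation MSE and does not attempt to reproduce the exact form of Equation (8). The callable `train_and_forecast`, which is assumed to train a model with a given random seed and return its forecasts for the validation set, is hypothetical, as is the use of seeds to obtain the three different weight initializations.

```python
def mse(actual, forecast):
    """Mean of the squared differences between actual and forecast values."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def best_of_restarts(train_and_forecast, validation, seeds=(0, 1, 2)):
    """Evaluate a stochastic model once per seed and keep the lowest validation MSE.

    Returns a (mse, seed) pair for the best run.
    """
    scored = [(mse(validation, train_and_forecast(seed)), seed) for seed in seeds]
    return min(scored)
```

Keeping the seed of the best run makes it possible to retrain or reload exactly the model that produced the reported error.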
3.1.3. Reproduction
The next generation is created from the current population in two ways. The first group consists of copies of individuals from the current population, chosen using roulette wheel selection. The selection is performed with replacement, so an individual that has already been selected can be chosen again. The size of the first group is determined by the crossover rate, previously defined in the interval [0, 1], and the size of the population.
The second group is created by combining the genes of two parents chosen by roulette wheel selection. Each offspring is produced by a single-point crossover between the two selected individuals. The selection is again performed with replacement, until the second group is complete.
The objective of this strategy is to increase the chance that the fittest individuals remain in the population, accelerating the convergence of the algorithm. In addition, it aims to balance the trade-off between exploration and exploitation in the GA.
After crossover, all offspring are subject to mutations in their genes. The objective is to explore the search space and to guarantee variability in the population. The number of mutations is chosen randomly for each individual of the offspring, using a uniform distribution over a range bounded by a maximum number of mutations. The initial value of this maximum is defined before the first iteration and increases by one unit per epoch in later iterations. As a result, while the population converges, the proposed GA raises the maximum number of mutations in order to keep adding variability, aiming to avoid premature convergence.
For binary genes, the mutation inverts the current state (from 0 to 1 or vice versa). For other genes, the mutation replaces the current state with a new option chosen randomly from a list of possible states.
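The reproduction operators described in this section can be sketched in Python as follows. Several details are assumptions made only for illustration: the share of offspring produced by crossover is taken as the crossover rate times the population size, the number of mutations is drawn uniformly from 1 to the current maximum, mutation is applied to both groups, fitness values are strictly positive, and non-binary genes are stored as indices into their lists of options. The function names are hypothetical.

```python
import random


def roulette_select(population, fitness, rng):
    """Pick one individual with probability proportional to its fitness."""
    return rng.choices(population, weights=fitness, k=1)[0]


def single_point_crossover(parent_a, parent_b, rng):
    """Head of parent_a joined to the tail of parent_b at a random cut point."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, n_binary, n_options, max_mutations, rng):
    """Apply between 1 and max_mutations random gene changes (assumed range)."""
    child = list(chromosome)
    for _ in range(rng.randint(1, max_mutations)):
        pos = rng.randrange(len(child))
        if pos < n_binary:
            child[pos] = 1 - child[pos]          # flip a binary gene
        else:
            options = [v for v in range(n_options[pos - n_binary])
                       if v != child[pos]]
            if options:
                child[pos] = rng.choice(options)  # pick a different option
    return child


def next_generation(population, fitness, crossover_rate, max_mutations,
                    n_binary, n_options, rng):
    """Build a full replacement population from copies and crossover offspring."""
    size = len(population)
    n_cross = round(crossover_rate * size)        # assumed split between groups
    offspring = [list(roulette_select(population, fitness, rng))
                 for _ in range(size - n_cross)]
    offspring += [single_point_crossover(roulette_select(population, fitness, rng),
                                         roulette_select(population, fitness, rng),
                                         rng)
                  for _ in range(n_cross)]
    return [mutate(c, n_binary, n_options, max_mutations, rng) for c in offspring]
```

A caller would increase `max_mutations` by one after each epoch to mirror the schedule described above. Passing a seeded `random.Random` instance as `rng` keeps runs reproducible.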
3.1.4. Survivor Selection
The offspring generated in the reproduction phase by the genetic operators (crossover and mutation) contains the same number of individuals as the previous population. Thus, the whole population is replaced without the need for a fitness-based survivor selection operator. This replacement strategy aims to escape local minima that may have been reached in the previous iteration.
5. Conclusions
The use of smart meters has become an important alternative for monitoring energy consumption and efficiency, allowing better management and network planning. Forecasting energy consumption from smart meter data has become an important tool for maintenance planning and fraud detection. However, achieving accurate forecasts can be a challenging task, since energy consumption data is likely to present both linear and non-linear patterns [11].
In this work, an evolutionary hybrid system is proposed to forecast energy consumption in smart meters. In order to improve forecast accuracy, the proposed system (EvoHyS) is composed of three stages. The first stage performs linear modeling with a SARIMA model. In the second stage, an evolutionary optimization based on a genetic algorithm is employed to find the best hyperparameters of the SVR model, as well as to perform input selection. In the third and last stage, the linear and non-linear forecasts are combined using an MLP optimized by a genetic algorithm.
The experiments were conducted on a data set from a smart grid network installed in a residential building. The simulations were carried out considering the consumption per day of the week, using several single and hybrid models proposed in the literature. In general, the EvoHyS achieved the best overall results, demonstrating good generalization performance on different days of the week.
The superior performance attained by EvoHyS compared with single models corroborates previous studies [7, 26, 55] that show the benefits of using hybrid systems that combine statistical and ML models. Modeling the linear and non-linear patterns separately makes it possible to generate specialist models that, when combined, achieve higher accuracy than single models. EvoHyS also outperformed the hybrid systems from the literature (SARIMA-MetaFA-LSSVR and SARIMA-PSO-LSSVR) in most evaluation measures. This result shows that employing an exclusive phase to combine linear and non-linear forecasts can improve hybrid systems’ accuracy in the smart meter consumption forecasting area.
The overall run-time complexity of the EvoHyS at prediction time is the sum of the complexities of the three models used to produce the final forecast: SARIMA for the time series forecast (linear in the number of lags), SVR with RBF kernel for the residual forecast (quadratic: number of lags times number of support vectors), and MLP (linear in the number of lags).
As future work, the influence of external variables, such as temperature and precipitation, on energy consumption will be investigated. The use of deep learning forecasting methods in the proposed hybrid system architecture may also be analyzed. Furthermore, EvoHyS should be evaluated in other energy consumption scenarios that involve smart meter time series.