1. Introduction
Since the industrial revolution, the growth in the number and population of cities has made developing new approaches to meet citizens' energy requirements an important issue for researchers [1,2,3]. According to a previous estimation, the urban population will reach five billion by 2030. The residential sector is responsible for 27% of global energy consumption [4]. Moreover, heating and air conditioning equipment accounts for 48% and 65% of total building energy consumption in the US and the European Union, respectively. Hence, accurately estimating the loads of heating and cooling systems is an essential task in terms of energy conservation and environmental protection [5]. Many prediction methods have been suggested in this regard, but predicting electric energy consumption with earlier methods remains difficult [6]. Machine learning (ML)-based models can learn and reproduce complicated non-linear patterns to map the relationship between many distinct engineering parameters and the related effective factors [7,8].
To improve the accuracy of predictive models in energy systems, many scholars have suggested different ML methods. Bui et al. [9] used an artificial neural network (ANN) expert system improved with an electromagnetism-based firefly algorithm (EFA) to forecast the energy consumption of buildings. They determined that the EFA-ANN approach can be very effective in the early design of energy-efficient buildings. Amber et al. [10] applied multiple regression (MR), genetic programming (GP), ANN, deep neural network (DNN), and support vector machine (SVM) approaches to estimate the electricity consumption of buildings, and found that the ANN achieved a mean absolute percentage error of 6% and performed better than the other techniques. Ma et al. [11] suggested an SVM-based method in China and showed that it estimated building energy consumption with good accuracy, with an R² of more than 0.991. Zhang et al. [12] used the support vector regression (SVR) method for the same prediction task and stated that its accuracy is highly dependent on household behavior variability. Olu-Ajayi et al. [13] applied many different ML techniques, including ANN, gradient boosting (GB), DNN, random forest (RF), stacking, K-nearest neighbor (KNN), SVM, decision tree (DT), and linear regression, and showed that the DT produced the best outcome, with a training time of 1.2 s. Banik et al. [14] used an RF and extreme gradient boosting (XGBoost) ensemble and reported a precision of 15–29% for this method. They also found that machine learning approaches can be very effective for forecasting the energy consumption of buildings.
Previous simple ML methods can suffer from disadvantages such as low precision and high computation time, so researchers have suggested new ML-based approaches to address these limitations [15,16,17]. As an alternative to the aforementioned methods, many scholars have proposed new ML-based predictive approaches to estimate building energy consumption. Fayaz and Kim [18] suggested a new methodology to forecast energy consumption based on the deep extreme learning machine (DELM) and demonstrated that its performance is considerably better than that of the ANN for different time periods. Khan et al. [19] combined long short-term memory (LSTM) with the Kalman filter (KF) to provide a new predictive model and demonstrated its effectiveness compared to previous simple ML approaches. The LSTM is also considered in [20]. In a different study, Khan et al. [21] proposed a new hybrid network model combining a dilated convolutional neural network (DCNN) with bidirectional long short-term memory (BiLSTM) and claimed that this method outperformed other approaches. Its testing time was 0.005 s, dramatically less than the 0.07 s of the CNN-LSTM method. Khan et al. [22] then introduced a new predictive model to forecast short-term electricity consumption. In this approach, two deep learning models, LSTM and the gated recurrent unit (GRU), were used, and the superior performance of this method was demonstrated by the lowest mean absolute percentage error, close to 4%. Chen et al. [23] used a framework integrating building information modeling (BIM) with the least-squares support vector machine (LSSVM) and the non-dominated sorting genetic algorithm-II. They showed that the LSSVM successfully predicted building energy consumption with a root mean square error (RMSE) of 0.0273. Moon et al. [24] suggested a two-stage building-level short-term load forecasting (STLF) model called RABOLA to enable practical learning on unseen data, and appropriate results were obtained. Phyo and Jeenanunta [25] proposed a bagging ensemble method that combines an ML model with LR and SVR methods for forecasting tasks, and they found that this method enhanced accuracy in different forecasting fields.
Another hybridization approach is the use of metaheuristic algorithms together with the ML model of interest [26,27]. This strategy has led to valuable developments of hybrid predictive models in the field of energy-related analysis [28,29]. In fact, the proper use of a particular method, especially a newly developed one, entails continuous enhancement and verification of its competency. In the case of metaheuristic-based predictive models, the family of these algorithms is growing steadily, and it is important to keep the models updated with the latest designs. With this logic in mind, in this work, a powerful optimizer called the water cycle algorithm (WCA) is applied to energy performance analysis through the simultaneous prediction of the annual thermal energy demand (TDA) and the annual weighted average discomfort degree-hours (DDA). For this purpose, the algorithm should be coupled with a framework that supports dual prediction; a double-target multi-layer perceptron (2TMLP) plays the role of this framework. As such, the proposed model is hereafter named WCA-2TMLP. Many studies have previously confirmed the suitable optimization competency of the WCA when incorporated with ANN techniques for various purposes [30,31], especially building energy assessment [32].
Furthermore, three comparative benchmarks are considered to validate the performance of the WCA: shuffled complex evolution (SCE), the heap-based optimizer (HBO), and the salp swarm algorithm (SSA). After evaluating the efficiency parameters, the models are ranked, and the most promising core is formulated as a mathematical prediction equation.
3. Results and Discussion
This work evaluates the proficiency of several metaheuristic algorithms hybridized with a 2TMLP for energy performance prediction in residential buildings. To present the results, this section is divided into the following parts.
3.1. Hybrid Creation
Generally, to create a hybrid of the neural network and a metaheuristic technique, the training algorithm of the ANN is replaced with the metaheuristic algorithm. The outcomes are hybrid models named SCE-2TMLP, HBO-2TMLP, SSA-2TMLP, and WCA-2TMLP. The architecture of the hybrid model (i.e., the number of variables) depends on the 2TMLP. In this work, the structure of this network was (11, 9, 2), representing 11 neurons in the input layer, 9 neurons in the middle layer, and 2 neurons in the output layer. The calculations were as follows: (i) the 11 neurons in the input layer received the values of UM, UT, UP, αM, αT, Pt, ACH, Scw-N, Scw-S, Scw-E, and Glz, (ii) using 11 × 9 = 99 weights and 9 biases, the neurons in the middle layer performed the first level of calculations and sent the results to the output layer, and (iii) using 9 × 2 = 18 weights and 2 biases, the neurons in the output layer calculated the TDA and DDA.
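For illustration, the sketch below (in Python with NumPy; the function and variable names are ours, not taken from the original implementation) shows how such a (11, 9, 2) double-target network maps the 11 inputs to the TDA and DDA, assuming the Tansig hidden-layer and Purelin output-layer activations reported in Section 3.7.

```python
import numpy as np

def tansig(z):
    # Tansig activation: 2 / (1 + exp(-2z)) - 1, mathematically equal to tanh(z)
    return np.tanh(z)

def forward_2tmlp(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass of the (11, 9, 2) double-target MLP.

    x        : 11 input values (UM, UT, UP, alphaM, alphaT, Pt, ACH,
               Scw-N, Scw-S, Scw-E, Glz)
    W_hidden : 9 x 11 matrix holding the 99 hidden-layer weights
    b_hidden : 9 hidden-layer biases
    W_out    : 2 x 9 matrix holding the 18 output-layer weights
    b_out    : 2 output-layer biases
    Returns the two targets, (TDA, DDA).
    """
    hidden = tansig(W_hidden @ x + b_hidden)  # first level of calculations
    return W_out @ hidden + b_out             # Purelin output: f(x) = x

# Shape check with random (untrained) parameters
rng = np.random.default_rng(0)
tda, dda = forward_2tmlp(rng.random(11),
                         rng.standard_normal((9, 11)), rng.standard_normal(9),
                         rng.standard_normal((2, 9)), rng.standard_normal(2))
```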
Next, the equations governing these calculations in the 2TMLP were expressed as the problem function of the SCE, HBO, SSA, and WCA. Thus, each algorithm iteratively optimizes this problem in order to attain an optimal training of the 2TMLP.
3.2. Training 2TMLP Using the Metaheuristic Algorithm
As explained, training using a metaheuristic algorithm is iterative, and the number of iterations is determined based on the behavior of the algorithm. This study considered 1000 iterations for all four models. During each iteration of the optimization, the metaheuristic algorithm explored the training data and tuned the mentioned 128 weights and biases accordingly. The 2TMLP was then reconstructed using the tuned weights and biases and produced a prediction for all training data. To assess the quality of the results, a cost function was required in this step; the RMSE was calculated for each iteration. Since a double-target network was being trained, the average of the RMSEs calculated for the TDA and DDA was considered as the cost function.
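Building on the forward-pass sketch above, the cost function evaluated at each iteration can be expressed as follows; the packing order of the 128 trainable values into a flat solution vector is our assumption for illustration, not a detail given in the paper.

```python
import numpy as np  # forward_2tmlp is the function from the sketch above

def cost_function(params, X_train, y_train):
    """Average RMSE over the two targets for one candidate solution.

    params  : flat vector of the 128 trainable values (hidden weights,
              hidden biases, output weights, output biases, in that order)
    X_train : (n_samples, 11) matrix of input parameters
    y_train : (n_samples, 2) matrix of observed TDA and DDA values
    """
    W_hidden = params[:99].reshape(9, 11)
    b_hidden = params[99:108]
    W_out = params[108:126].reshape(2, 9)
    b_out = params[126:128]

    preds = np.array([forward_2tmlp(x, W_hidden, b_hidden, W_out, b_out)
                      for x in X_train])
    rmse_per_target = np.sqrt(np.mean((preds - y_train) ** 2, axis=0))
    return rmse_per_target.mean()  # average of the TDA and DDA RMSEs
```

Each metaheuristic algorithm then proposes new candidate vectors, keeps those that lower this cost, and repeats the process over the 1000 iterations.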
Figure 2 shows the training progress, wherein the RMSE was generally reduced over the 1000 iterations, meaning that the metaheuristic algorithms were able to improve the initial solution. Nevertheless, the four algorithms reduced the RMSE at quite different rates.
Table 2 lists the initial and final values of the cost function obtained for each model. It can be seen that all algorithms significantly reduced the training error. The SCE, HBO, and SSA started with a value between 50 and 60, while the initial value of the WCA was around 32. Considering the final values, the SCE, SSA, and WCA ended up between 0 and 10, while this value was nearly 28 for the HBO. The lowest optimization error, approximately 3.36, was eventually achieved by the WCA.
3.3. Training Assessment
Figure 3a illustrates the training results for the TDA. This chart shows how closely the predictions of the four models hit the target values. The graphical interpretation indicated a satisfying goodness-of-fit for all models, as the general patterns (i.e., significant ups and downs) were well followed. However, a noticeable distinction could be seen between the HBO-2TMLP and the three other models: some values were over- or underestimated by the HBO-2TMLP that were modeled more accurately by the SCE-2TMLP, SSA-2TMLP, and WCA-2TMLP.
Figure 3b shows the distribution of errors in the form of a scatter chart. This chart also confirmed the lower accuracy of the HBO-2TMLP, as its points were more scattered than those of the other three models.
The RMSE values calculated for the SCE-2TMLP, HBO-2TMLP, SSA-2TMLP, and WCA-2TMLP were 2.66, 9.28, 2.94, and 0.81, respectively. In addition, the mean absolute errors (MAEs) were 3.01, 12.71, 3.68, and 1.17. These error results, along with the CRs of 99.40%, 88.43%, 99.05%, and 99.90%, indicate reliable predictions for all models and show the superiority of the WCA-2TMLP.
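For reference, the accuracy indices reported here and in the following subsections can be computed as in the sketch below. Since the CR is not defined in this section, the sketch assumes it is the Pearson correlation coefficient between observed and predicted values expressed as a percentage; this should be treated as an assumption rather than the paper's definition.

```python
import numpy as np

def rmse(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

def mae(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(np.abs(y_pred - y_true)))

def cr(y_true, y_pred):
    # Assumed definition: Pearson correlation coefficient in percent
    return float(np.corrcoef(y_true, y_pred)[0, 1] * 100.0)
```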
Figure 4a illustrates the training results for the DDA. The prediction results were quite promising because both small and large fluctuations were nicely recognized and followed by all models. However, similar to the TDA results, there were some weaknesses that were more tangible for the HBO-2TMLP than for the other algorithms.
Figure 4b supports the aforementioned claims, as the scattered errors remain within a tolerable range. Likewise, the points corresponding to the HBO-2TMLP were less tightly gathered around the ideal line, i.e., error = 0.
The RMSE values calculated for the SCE-2TMLP, HBO-2TMLP, SSA-2TMLP, and WCA-2TMLP were 14.65, 42.89, 9.77, and 5.54, respectively. In addition, the MAEs were 9.84, 29.69, 6.42, and 3.63. These error results, along with the CRs of 98.92%, 89.45%, 99.45%, and 99.82%, indicate a reliable prediction for all models and show the superiority of the WCA-2TMLP.
3.4. Testing Assessment
The results of the testing phase were similarly evaluated. According to the obtained RMSEs of 9.14, 14.66, 10.51, and 7.39, as well as the MAEs of 6.83, 11.70, 7.17, and 4.12, it can be deduced that all models could predict the TDA with excellent accuracy.
Figure 5 shows the correlation between the real and predicted TDA values. The CR values demonstrated 97.33%, 90.88%, 95.55%, and 98.67% agreement between the results. Moreover, similar to the previous phase, the WCA-2TMLP achieved the most accurate results in this step.
As for the DDA, the RMSE values were 91.52, 68.74, 90.15, and 96.10, associated with MAEs of 44.01, 51.89, 41.25, and 40.07. The correlations of the DDA testing results are shown in Figure 6. Referring to the CR values of 99.60%, 93.82%, 99.47%, and 99.74%, the predictions of all models were in good harmony with the real values. In this phase, the WCA-2TMLP, despite achieving the smallest MAE and the largest CR, obtained the largest RMSE. However, based on its better performance in two out of the three indices, the superiority of the WCA-2TMLP was evident here, too.
3.5. Comparison
Table 3, Table 4, and Table 5 present the values of the RMSE, MAE, and CR, respectively, calculated for both the training and testing of the TDA and DDA. From the overall comparison, it was found that the WCA-2TMLP achieved the highest accuracy in most stages. More specifically, for the TDA analysis (in both phases), the order of the algorithms from strongest to weakest was: (1) WCA-2TMLP, (2) SCE-2TMLP, (3) SSA-2TMLP, and (4) HBO-2TMLP. However, the outcome was different for the DDA analysis. While the training RMSE indicated the lowest error for the WCA-2TMLP and the highest for the HBO-2TMLP, this ranking was reversed in the testing phase. In both phases, the SSA-2TMLP captured the second position, followed by the SCE-2TMLP. In contrast, the MAE consistently suggested the following ranking: (1) WCA-2TMLP, (2) SSA-2TMLP, (3) SCE-2TMLP, and (4) HBO-2TMLP. As for the CR, the WCA-2TMLP and HBO-2TMLP were the strongest and weakest predictors in both phases, respectively, while the SSA-2TMLP and SCE-2TMLP exchanged the second and third positions between the training and testing phases.
In summary, in terms of accuracy, the WCA proved to be the most effective metaheuristic algorithm in this study. The SSA and SCE provided reliable solutions and are recommended for practical applications as well, whereas the results of the HBO do not make it preferable. Considering the optimization time, the implementation of the SCE, HBO, SSA, and WCA required about 782, 79,717, 8507, and 60,281 s, respectively. Hence, using the WCA was more time-consuming than using the SSA and SCE. Given the sufficient accuracy of the SCE, as well as its shortest optimization time, it can be suitable for time-sensitive cases, alongside the WCA.
3.6. Discussion
The study of buildings comprises a wide range of domains, from safety measures [58] and external design [59] to structural [60] and energy performance analysis [61]. Recently, the development of smart cities has affected construction- and energy-related policies in the construction sector [62,63]. Following the previous efforts of energy engineers in forecasting the energy performance of residential buildings using intelligent approaches, this work presented novel ANN-based models for this purpose.
As pointed out in the literature, the combination of the ANN and WCA has provided optimized solutions for various intricate engineering simulations, such as the spatial analysis of environmental phenomena (e.g., groundwater potential [64]). This algorithm has also provided effective solutions for energy simulations (e.g., the electrical power output of power plants [31]). In comparative studies, a reason for the superiority of the WCA has been its multi-directional optimization strategy, as well as its multi-rain process that protects the solution against local minima [37].
In the application of metaheuristic algorithms, population size is one of the most prominent hyperparameters. In this work, the population size of each algorithm was selected by trial-and-error among the values 10, 50, 100, 200, 300, 400, and 500. It was found that 10, 300, 400, and 500 fitted the SCE, HBO, SSA, and WCA best, respectively.
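This trial-and-error selection can be sketched as a simple loop over the candidate sizes; the run_optimizer wrapper below is a hypothetical interface assumed for illustration and does not correspond to any function named in the paper.

```python
CANDIDATE_SIZES = [10, 50, 100, 200, 300, 400, 500]

def select_population_size(run_optimizer, X_train, y_train):
    """Return the population size giving the lowest final training cost.

    run_optimizer is assumed to train the hybrid model for 1000 iterations
    with the given population size and return its final cost (average RMSE).
    """
    best_size, best_cost = None, float("inf")
    for size in CANDIDATE_SIZES:
        cost = run_optimizer(population_size=size, iterations=1000,
                             X=X_train, y=y_train)
        if cost < best_cost:
            best_size, best_cost = size, cost
    return best_size, best_cost
```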
Earlier literature demonstrates that the findings of this research are in harmony with previous studies that have addressed the applicability of metaheuristic techniques in the field of energy performance analysis. In a broad comparative study by Lin and Wang [32], the evident superiority of the WCA was reported for predicting the heating and cooling loads of residential buildings using a public dataset provided by Tsanas and Xifara [65]. The benchmark algorithms were the equilibrium optimizer (EO), the multi-verse optimizer (MVO), the multi-tracker optimization algorithm (MTOA), electromagnetic field optimization (EFO), and the slime mold algorithm (SMA). Concerning the present dataset, the methodology offered in this work outperformed some previous models. For instance, in the adaptive neuro-fuzzy inference system (ANFIS) optimization carried out by Alkhazaleh et al. [66] for TDA prediction using the EO and Harris hawks optimization (HHO), the best model was the ANFIS-400-EO, which achieved a training MAE of 1.87, while in this work the lowest MAE was 0.81. As for testing, the lowest MAE of the cited study was 5.74, obtained by the ANFIS-100-HHO, while this study reduced it to 4.12 using the WCA-2TMLP. Hence, significant improvements can be detected.
The size of the dataset used in this study was relatively small (i.e., 35 samples). Nevertheless, the results showed that the models were able to obtain a very reliable understanding of the building energy behavior from 28 samples and could accurately extrapolate this knowledge to the remaining 7 samples. This dataset has also been reported to be suitable for developing metaheuristic-based hybrids in similar previous studies [66,67]. However, it would be interesting for future studies to investigate the sensitivity of the prediction accuracy to the size of the dataset. Creating similar datasets is highly suggested in order to cross-validate the methodologies presented so far.
The dataset used here was limited to 11 influential parameters. Hence, another noteworthy suggestion is to extend the dataset by considering further building characteristics and design parameters that can directly or indirectly affect the energy performance (e.g., the window-to-wall ratio, the U-value of walls/windows, unplanned air exchange between the building and the environment, etc.). Once provided, the extended dataset could be subjected to a feature selection process in order to retain the most contributive parameters and discard the negligible ones. In doing so, not only would a more realistic assessment be achieved, but the methodology would also be optimized by reducing the dimensionality of the problem.
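As a simple illustration of such a feature selection step, the sketch below applies scikit-learn's SelectKBest with an f_regression score to keep the inputs most related to one target; the scoring function and the choice of k are ours and are not prescribed by the paper.

```python
from sklearn.feature_selection import SelectKBest, f_regression

def select_features(X, y, feature_names, k=8):
    """Keep the k inputs most related to a single target (e.g., TDA).

    X : (n_samples, n_features) input matrix
    y : (n_samples,) target vector
    feature_names : list of column names, e.g., ["UM", "UT", ..., "Glz"]
    """
    selector = SelectKBest(score_func=f_regression, k=k).fit(X, y)
    kept = selector.get_support(indices=True)
    return [feature_names[i] for i in kept], selector.transform(X)
```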
3.7. TDA and DDA Formula
With reference to the suitable results presented by the SCE-2TMLP, the governing equations organized by the SCE algorithm are provided as a predictive formula for the simultaneous estimation of the TDA and DDA. Based on the architecture of the 2TMLP (i.e., 11 input neurons, 9 hidden neurons with the Tansig activation function, and 2 output neurons with the Purelin activation function), the procedure of calculating the TDA and DDA requires producing 9 outputs from the middle layer. As expressed by Equations (12) and (13), as well as Table 6, each of these outputs (represented by O1, O2, …, O9) is a non-linear function of the input parameters (i.e., UM, UT, UP, αM, αT, Pt, ACH, Scw-N, Scw-S, Scw-E, and Glz).
Here, the weights, Wi1, Wi2, …, Wi11, as well as the biases, bi, are those presented in Table 6.
Once O1, O2, …, O9 are calculated, Equations (14) and (15) yield the TDA and DDA, respectively. The reason for the linear calculations in these two equations lies in the Purelin function, which is described as f(x) = x.
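For readers without access to the equations themselves, the general structure implied by the description above can be written as the following sketch. The hidden-layer weights Wij and biases bi are those of Table 6, the output-layer coefficients (denoted here by w and b with TD/DD superscripts) stand for the values given in Equations (14) and (15), and the explicit Tansig expression is the standard form of this activation function rather than a definition quoted from the paper.

```latex
O_i = \mathrm{Tansig}\!\left(\sum_{j=1}^{11} W_{ij}\, x_j + b_i\right),
\quad i = 1,\dots,9,
\qquad
\mathrm{Tansig}(z) = \frac{2}{1 + e^{-2z}} - 1,

TD_A = \sum_{i=1}^{9} w_i^{TD}\, O_i + b^{TD},
\qquad
DD_A = \sum_{i=1}^{9} w_i^{DD}\, O_i + b^{DD},
```

where x1, …, x11 denote the 11 input parameters, UM through Glz, in the order listed above.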