1. Introduction
The famous Brundtland Commission Report of the UN defines sustainability as meeting the needs of the present without compromising the ability of future generations to meet their own needs [1]. Making life on Earth sustainable by consuming less and polluting less is one of humanity's most important responsibilities. Most of the energy used today comes from fossil fuels. In recent years, more environmentally friendly and sustainable methods have been developed to produce green energy or to consume less. New energy sources are sustainable to a certain extent. However, the most important issue is to reduce energy consumption, that is, to achieve energy efficiency by carrying out the same work with less energy. In collective living areas, most of the energy is used for heating and cooling indoor spaces.
Global warming refers to the impact of human activities on the climate, especially the burning of fossil fuels and large-scale deforestation, which cause the emission of greenhouse gases, such as carbon dioxide, into the atmosphere [2]. As energy demand rises with global warming, it has become important to re-evaluate established methods and existing energy-efficiency measures. In this context, concepts such as green buildings, green industry, and passive houses are being implemented more widely, and governments have introduced strict regulations on these issues [3].
Building envelopes have been continually developed in response to outdoor factors such as air temperature, humidity, wind, and solar radiation. These factors are evaluated separately for hot and cold weather conditions. Solar radiation and air temperature in particular are important for indoor thermal comfort in both conditions. In hot climate regions, the hot period lasts longer than the cold period, so the hot period is the one that matters most for the cooling load of buildings. Climatic changes due to environmental pollution in recent years (greenhouse effect, global warming, …) bring energy expenditure and cooling to the forefront [4]. The use of climatic data in house design is very important for ensuring energy efficiency in the building. Designing buildings according to the local climatic characteristics helps to create suitable indoor conditions and to conserve energy. For example, a courtyard with an external part of the building reduces the cooling load by providing abundant air flow [5].
Calculating the cooling load of every building has become necessary because of rising energy costs and the need to limit the climate change caused by fuel consumption in buildings. For buildings located in hot and tropical climate regions, it is especially important to determine and evaluate cooling loads at the design stage. Moreover, the gradual increase in outside temperatures due to global warming increases the cooling load, especially in buildings located in hot-humid climates. This high cooling demand leads to more greenhouse gas emissions and thus reinforces global warming [6]. The result is a vicious circle in which rising cooling demand and rising emissions feed each other. Therefore, the cooling load should be calculated carefully, and energy-efficient design should estimate and limit the energy consumed for cooling buildings.
As indicated in Table 1, the production industry was the sector with the highest energy consumption globally, and buildings were second. This indicator alone suggests that serious energy savings can be achieved by making the design and use of buildings energy efficient. To contribute to energy savings in the building industry, our study focuses on facilitating the design phase of buildings by predicting energy consumption parameters, such as cooling load, with a simple but efficient approach.
The amount of energy used in buildings at the urban scale varies regionally but accounts for approximately 40% of total consumption [8,9]. Buildings are therefore responsible for a share of energy consumption that cannot be ignored. Efficient building design with energy-saving features, together with auxiliary tools that support these practices, can improve the energy efficiency of buildings and help to reduce their energy use.
The main purpose of this study is to determine the cooling load in buildings located in tropical climates. When the cooling load of a building is known, the design can be matched to the required cooling load, the energy consumed by the cooling devices can be reduced, and a sustainable design can be realized. Malaysia is a country located in Southeast Asia. Located between one and four degrees north latitude, Malaysia has an equatorial climate. Tropical forests cover 70% of the country. Because of monsoon winds and continuous rains, especially between January and May, the country's humidity increases during this season. The daily temperature is between 21 °C and 32 °C in the lowlands, while it is lower in the higher regions [10]. Malaysia is known for its hot and humid climate. In response to these climatic conditions, air conditioners (ACs) and AC systems in general are widely used in all regions of the country, which results in high energy consumption in buildings. In countries such as Malaysia, buildings should be designed to minimize the energy consumed for cooling indoor spaces. Establishing the functional link between the architectural properties of a building and its energy consumption is crucial to facilitate energy-efficient building designs. In this study, selected aspects of architectural design are utilized for the prediction framework we have implemented and tested.
Various types of houses are available on the Malaysian market, but terraced houses, which come in single- and multi-story types, accounted for 41% of the total residential property stock in 2018 [11,12]. They are followed by low-cost housing (including low-cost houses, low-cost flats, and flats), with a 30% share, which is purposely constructed for the low-income group. Terraced houses are the most common residential type, are classified as low- to medium-cost housing [13], and are preferred by developers because they can be built quickly [14]. As the design is based on the British terraced house, it did not take into account local climatic conditions and cultures when it was brought to Malaysia [15]. The scope of this study is double-story terrace houses, which are the most common type of low-rise building and make up the largest fraction of both the existing supply and newly planned residential units in Malaysia.
The case study was carried out on an intermediate double-story terrace house, simulated as a representative of conventional low-rise residential buildings in Malaysia. Figure 1 shows the representative building used in this study. The building is located in Skudai, Johor Bahru (latitude 1°32′ N, longitude 103°40′ E), and has a total floor area of 200 m². It has a rectangular floor plan with an aspect ratio (width/length) of 1/2 and a ceiling height of 3.8 m. This model is a typical terrace house with a living-cum-dining area (DL), a kitchen (K), and a guest room with one bathroom (WC) on the ground floor, as well as one master bedroom (MB), two smaller bedrooms, a hall area (corridor), and two bathrooms (WC) on the first floor. The house faces the South–North direction, with large windows in the front (South) façade. The building has a reinforced concrete structure with a brick-infilled frame. The roof is pitched and covered with clay tiles; neither the walls nor the roof are insulated. The walls are built from brick and plastered, and the façade is finished with cement-sand render. The windows are single-glazed with aluminum frames. Table 2 describes the base model materials and thermal properties.
The architectural aspects used for prediction were the total floor area, aspect ratio, ceiling height, window material, external wall material, roof material, north-facing window-to-wall ratio, south-facing window-to-wall ratio, horizontal shading, and orientation of the building. The literature defines the cooling load as the building's energy consumption, or the amount of energy required to keep the environment at a constant temperature [17]. A very large number of variables affect cooling load calculations. This study focuses on the following subset, which emerged as the key variables in our previous research [16] (Figure 2).
Floor area: The floor area is the area of the zone for which the cooling load is calculated [18]. As it increases, the cooling load also increases.
Aspect ratio: The aspect ratio is the ratio of the width to the length of the building's floor plan (width/length, as used for the case-study building). With an optimum aspect ratio, the building is shaded in hot weather, and the energy consumption required for cooling is reduced [19].
Ceiling height: As the ceiling height increases, so does the air volume in the room, which directly affects both the cooling load and the cooling efficiency [20].
Window material: Appropriate thermal comfort conditions can be achieved with glass selection according to the characteristic features of climate zones [
21].
External wall material: One of the major factors that make up the cooling load is the total heat gained from the external walls [
22].
Roof material: The roof material used is an important factor that has an impact on the cooling load. For example, when comparing a traditional roof with a green roof, it is known that the green roof application saves energy and has less negative impact on the environment [
23].
Window wall ratio: The window-to-wall ratio is the ratio of the window area to the entire façade surface area. In regions with climatic conditions where heating energy demand is high, solar energy gain increases as the window/wall ratio increases [
24].
Horizontal shading: With horizontal shading installation, the cooling load is reduced compared to the case without shading [
25]. Elements such as balconies, overhangs, etc., are horizontal shading elements [
26].
Orientation: Energy efficiency can be achieved with the right building orientation. Among the building orientation types, the cooling load increases in the perimeter zones orientated towards the west façade [
27].
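The window-to-wall ratio defined above is a simple area quotient. The following is a minimal sketch of that definition only; the function name and the façade dimensions are hypothetical and do not come from the study's dataset.

```python
def window_to_wall_ratio(window_area_m2: float, facade_area_m2: float) -> float:
    """Return the window area divided by the entire facade area (dimensionless)."""
    if facade_area_m2 <= 0:
        raise ValueError("facade area must be positive")
    if not 0 <= window_area_m2 <= facade_area_m2:
        raise ValueError("window area must lie between 0 and the facade area")
    return window_area_m2 / facade_area_m2


# Hypothetical facade: 10 m wide and 3.8 m high, with 9.5 m2 of glazing.
facade_area = 10.0 * 3.8
wwr_south = window_to_wall_ratio(9.5, facade_area)
print(f"South-facing window-to-wall ratio: {wwr_south:.2f}")
```

In this illustrative case, a quarter of the façade is glazed. In the study, the ratio is tracked separately for the north- and south-facing façades.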
Heat loads have long been calculated manually using the instantaneous calculation method, which assumes that heat gains are converted into cooling loads instantaneously. Although this method is simple and fast, it neglects processes such as heat storage and radiation transfer and, therefore, has limited reliability [28]. There are many methods for cooling load calculations. Figure 3 shows the relationship between the American Society of Heating, Refrigerating, and Air-conditioning Engineers (ASHRAE) cooling load calculation methods in terms of complexity and accuracy.
From Figure 3, it can be seen that ASHRAE's Heat Balance Method has the highest complexity and accuracy. The Heat Balance Method, using the finite differences approach, calculates the inner surface temperatures of each surface in detail, as well as the solar gains, and makes the closest estimation of the heating and cooling load with the inclusion of natural ventilation, shading, HVAC equipment, and thermal mass [29]. Because higher accuracy comes at the cost of higher complexity, there has been a search for methods that are less time-consuming and less complex. The calculation of the cooling load is more complex than that of the heating load due to the presence of dynamic responses and thermal mass [30].
In recent years, studies utilizing machine learning (ML) methods have achieved very accurate results in the estimation of cooling loads. For example, Xuemei et al. [31] developed a Least Square Support Vector Machine (LS-SVM) for cooling load prediction; compared with a Back Propagation Neural Network (BPNN), LS-SVM provided higher accuracy with less error. Similarly, Li et al. [32] utilized a support vector machine (SVM) to predict the hourly building cooling load and achieved effective results. Gao et al. [33] used extreme learning machine (ELM) and random forest (RF) together to predict the cooling load of large commercial buildings. Sha et al. [34] compared the performances of different ML algorithms in predicting cooling load and showed that gradient tree boosting (GTB) achieves the highest accuracy with the fewest errors. Ngo [35] applied an ML method for the prediction of cooling loads of buildings, based on data from 243 buildings, and observed high accuracy in predictions. Rana et al. [36] proposed a data-driven approach that has shown greater accuracy than gradient tree boosting (GTB). Xuan et al. [37] used the Chaos approach and Wavelet Decomposition (WD) with Support Vector Regression separately to predict the cooling load, and the results showed that the hybrid forecasting models perform better than the single ones. Zingre et al. [38] applied long short-term memory (LSTM) to estimate cooling load and demonstrated the predictive potential of this method when the data are in the form of a time series.
In this study, we implemented and tested several foundational machine learning models (Linear Regression, Decision Tree, Elastic Net, K Nearest Neighbor, Support Vector Machines) and ensemble machine learning models (Random Forest, Gradient Boosting, Histogram Gradient Boosting, Voting, Stacking) to determine the model with the best performance in predicting the cooling load from the architectural aspects of a tropical building. The Python programming language [39], v3.6, was used for the cooling load estimation experiments, with Anaconda 3 [40] as the development environment. The Numpy [41] and Pandas [42] libraries were used to prepare the data for training, the scikit-learn library was used to develop the machine learning models, and the Matplotlib [43] library was utilized for data visualization.
3. Results
In this study, predictions were made for cooling load. The inputs are the total floor area, aspect ratio, ceiling height, window material, external wall material, roof material, window wall ratio north faced, window wall ratio south faced, horizontal shading, and orientation of the building. The output is cooling load (Figure 9).
In this study, five foundational regression algorithms and five ensemble algorithms were used for the generation, training, and validation of the models. The Voting algorithm was implemented with different combinations of base learners (lr: linear regression, knr: k-neighbor regression, ent: elastic net regression, dtr: decision tree regression, svr: support vector regression, rfr: random forest regression, gbr: gradient boosting regression, hgbr: histogram-based gradient boosting regression). The Stacking algorithm was implemented with different combinations of base learners and a Gradient Boosting Regressor as the final estimator. The accuracy metric values obtained from the predictions made with each algorithm and each implementation of the Voting and Stacking algorithms are provided in Table 5.
As mentioned previously, an R2 value approaching one indicates that the model has a high success rate. When the results of all models are compared, the lowest R2 value (0.7341) was obtained with the SVR implementation, while the highest R2 score (0.9949) was obtained with (i) the histogram gradient boosting regression algorithm and (ii) two of the stacking implementations: (a) the combination of random forest regression, gradient boosting regression, and histogram-based gradient boosting regression, and (b) the combination of linear regression, decision tree regression, random forest regression, gradient boosting regression, and histogram-based gradient boosting regression, both with gradient boosting regression as the final estimator. Among the foundational models, the decision tree had the highest R2 score (0.9569) and SVR the lowest (0.7341, which was also the lowest R2 score overall).
Among the ensemble methods, training/validation with the histogram gradient boosting algorithm and with stacking resulted in the highest R2 value (0.9949). The stacking combinations that reached this value were those in which multiple ensemble methods were used together as base learners. Because a low error indicates better performance in regression, the success of the histogram gradient boosting and stacking models is also confirmed by the negative MSE/RMSE/MAE values for i, ii.a, and ii.b (NMSE = {−8.93; −8.93; −8.94}, NMAE = {−1.8; −1.78; −1.77}). With negated error metrics, a smaller (more negative) number denotes a higher error. The model with the highest error (NMSE = −471.17, NMAE = −14.16) was SVR, which is consistent with the low R2 score obtained by the same model. Among the foundational models, the decision tree has the lowest error (NMSE = −75.77, NMAE = −5.11).
Among the ensemble methods, the stacking models have the lowest errors. The lowest MSE (NMSE = −8.93) was obtained with the stacking model that combines random forest regression, gradient boosting regression, and histogram-based gradient boosting regression, with gradient boosting regression as the final estimator. The lowest MAE (NMAE = −1.77) was obtained with the combination of linear regression, decision tree regression, random forest regression, gradient boosting regression, and histogram-based gradient boosting regression, again with gradient boosting regression as the final estimator.
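The negated error metrics reported above follow scikit-learn's convention that a higher score is always better, so MSE and MAE are returned with a minus sign. The sketch below shows how R2, NMSE, and NMAE can be collected in one 10-fold cross-validation run. It uses synthetic data and a decision tree purely as a stand-in, and the shuffled `KFold` splitter is an assumption, since the text only states that 10-fold cross-validation was used.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data; the study's dataset is not reproduced here.
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)

# Assumed splitter: 10 shuffled folds with a fixed seed.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# scikit-learn scorer names; the error scorers return negated values.
scoring = {
    "r2": "r2",
    "nmse": "neg_mean_squared_error",
    "nmae": "neg_mean_absolute_error",
}

results = cross_validate(
    DecisionTreeRegressor(random_state=0), X, y, cv=cv, scoring=scoring
)

mean_r2 = results["test_r2"].mean()
mean_nmse = results["test_nmse"].mean()  # always <= 0
mean_nmae = results["test_nmae"].mean()  # always <= 0

# RMSE derived from the mean MSE over folds (not the mean of per-fold RMSEs).
rmse = np.sqrt(-mean_nmse)

print(f"R2 = {mean_r2:.4f}, NMSE = {mean_nmse:.2f}, NMAE = {mean_nmae:.2f}")
print(f"RMSE = {rmse:.2f}")
```

Read this way, an NMSE of −8.93 corresponds to an MSE of 8.93, and a model with NMSE = −471.17 has a much larger error than one with NMSE = −75.77.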
The analyses were performed on a PC equipped with an Intel Core i3-5005U CPU @ 2.00 GHz (4 cores). When the time performance of the models was evaluated, the SVR model had the worst time performance among the foundational models (54 s for the 10-fold cv process), in addition to the lowest overall R2 score. Among the models with the best predictive performance, the Histogram Gradient Boosting model (24 s for the 10-fold cv process) was much faster than the best-performing Stacking model (178 s for the 10-fold cv process). A comparison of the models is shown in Table 6.
4. Discussion and Conclusions
Due to the rapid population growth throughout the world, the energy demand is increasing day by day. The built environment is one of the key consumers of energy. Especially for buildings in tropical climates, the energy load required for cooling is very high. Therefore, it is necessary to know the energy load required for the cooling of the building to develop building designs with a focus on energy efficiency.
In the study, various machine learning algorithms are implemented to predict the cooling load of a tropical building based on its architectural attributes such as floor area, aspect ratio, ceiling height, window material, external wall material, roof material, window wall ratio north faced, window wall ratio south faced, horizontal shading, and orientation. The main findings of this study are as follows:
- (1)
Ensemble learning algorithms/models are superior to foundational algorithms/models in predicting the cooling load of the building through regression. Among the ensemble models, stacking-based models were found to be the most successful. Ensemble models were more successful (higher R2, lower error) than base models because they combine decisions from multiple models to improve overall performance.
- (2)
Support Vector Regression had the lowest prediction accuracy among all foundational and ensemble models, and it also had the worst time performance among the foundational models in the training/validation stages.
- (3)
When only the foundational algorithms were compared, Decision Tree Regression was the model with the best performance. This indicates that Tree Based approaches can be efficient in the prediction of the cooling load of buildings based on their architectural properties.
- (4)
In a similar study, Guo et al. [75] predicted heating and cooling loads based on light gradient boosting machine algorithms. The models common to our study and [75] are Random Forest and SVR. The same R2 values were obtained for Random Forest in both studies, but SVR had a higher R2 value in Guo et al. [75]. This indicates (a) that, depending on the nature of the dataset, SVR can also provide accurate results, so tests with SVR should not be neglected when developing cooling load prediction models, and (b) that tree-based approaches and ensemble models are very promising for cooling load prediction.
- (5)
When the time performance of the models is taken into account, the Histogram Gradient Boosting algorithm appears as the optimal model, as it also provides a good prediction performance.
In summary, the results of the study have demonstrated that ensemble learning algorithms can be successfully used to establish the relationship between the architectural properties of tropical buildings and their cooling load, because ensemble methods reach a conclusion by using more than one predictor for the same prediction task. The outputs of predictors with different metric scores are combined through different strategies (voting, stacking, etc.), which yields better performance than the individual predictors. The cooling load of tropical buildings can therefore be accurately predicted with ensemble learning algorithms. Future research will focus on how hyperparameter optimization would enhance the performance of the provided models. The accuracy of the prediction model provided in this paper can be further enhanced by adding other predictor variables, such as the occupancy status of the rooms, occupancy schedule, space usage conditions, and characteristics.