Joint Forecasting Model for the Hourly Cooling Load and Fluctuation Range of a Large Public Building Based on GA-SVM and IG-SVM

Wang, Meng; Yu, Junqi; Zhou, Meng; Quan, Wei; Cheng, Renyin

doi:10.3390/su152416833


by Meng Wang ¹, Junqi Yu ²,*, Meng Zhou ¹, Wei Quan ² and Renyin Cheng ²

¹ School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China

² School of Building Services Science and Engineering, Xi’an University of Architecture and Technology, Xi’an 710055, China

* Author to whom correspondence should be addressed.

Sustainability 2023, 15(24), 16833; https://doi.org/10.3390/su152416833

Submission received: 21 August 2023 / Revised: 25 November 2023 / Accepted: 11 December 2023 / Published: 14 December 2023


Abstract: Building load prediction is an important means of saving energy and reducing emissions, and accurate cooling load prediction is conducive to the online monitoring and optimal control of building air conditioning systems. Therefore, a joint prediction model is proposed in this paper. Firstly, by combining the Pearson correlation coefficient (PCC) method with sensitivity analysis, the optimal combination of parameters that influence building cooling load (BCL) was obtained. Secondly, the parameters of the support vector machine (SVM) model were optimized using the genetic algorithm (GA), and a GA-SVM prediction model was proposed to predict the hourly building cooling load. Then, when there is a demand for BCL fluctuation prediction or extreme weather conditions are encountered, the information granulation (IG) method is used to fuzzy granulate the data, and the fluctuation range of the BCL is obtained by combining the granulated data with the established GA-SVM model. Finally, the model was validated with the actual operational data of a large public building in Xi’an. The results show that the CV-RMSE and MAPE of the GA-SVM model are reduced by 58.85% and 68.04%, respectively, compared with the SVM for hourly BCL prediction, indicating that optimizing the SVM with the GA can effectively reduce the error of the prediction model. Compared with the other three widely used prediction models, the R² of the GA-SVM model is improved by 4.75~6.35%, the MAPE is reduced by 68.00~72.76%, and the CV-RMSE is reduced by 59.69~64.97%, which shows that the GA-SVM has higher prediction accuracy. In addition, when the joint model was used for BCL fluctuation range prediction, the R² was 97.27~99.68%, the MAPE was 2.59~2.84%, and the CV-RMSE was only 0.0249~0.0319, which demonstrates the effectiveness of the joint prediction model. The results of the study have important guiding significance for building load interval prediction, daily energy management and energy scheduling.

Keywords:

large public building cooling load; fuzzy information granule; genetic algorithm; support vector machine; joint forecasting model

1. Introduction

CO₂, a greenhouse gas, is believed to be the main cause of global warming. To cope with climate change, many countries are implementing CO₂ emission reductions [1]. China is the second largest economy in the world and is also a major energy consumer and CO₂ emitter. The 2022 China Urban and Rural Construction Carbon Emission Series Research Report pointed out that, in 2021, the operation stage of China’s building sector accounted for 21.3% of national energy consumption and 21.7% of national carbon emissions, that is, 1.06 billion tce and 2.16 billion tons, respectively. With the advancement of industrialization and urbanization, the energy consumption and CO₂ emissions of China’s building sector will further increase in the future [2]. In 2021, the Chinese State Council issued the Action Plan for Achieving Carbon Peak before 2030, in which accelerating the improvement of building energy efficiency and optimizing building energy structures is an important part. Therefore, the building sector needs to demonstrate greater energy conservation and emission reduction capabilities [3].

In recent years, large public buildings have attracted the attention of policy makers and researchers because of their high energy consumption, complex energy consumption structure, and ease of being affected by the external environment. In the operation of large public buildings, HVAC systems play an important role in controlling the indoor environment, accounting for more than 40% of the total energy consumption of the building and being the largest single energy consumption system in the building [4,5]. Therefore, in order to improve the energy efficiency of a building, the predictive control of the cooling load of the building is considered to be an effective way to reduce CO₂ emissions [6,7]. A large amount of actual experience and equipment operation data show that the building load demand shows certain periodicity and regularity over time [8]. Accurate prediction of building load helps operators to correctly schedule and plan energy consumption during building use. However, due to the complexity of internal and external parameters [9], it is a challenging task to predict the cooling load of a building without error.

In recent years, physical simulation and data-driven technology methods have been applied to the prediction of BCL [10]. Physical simulation methods often need to rely on detailed information of the building system and simulate the load of the building with the help of software tools, such as EnergyPlus, DeST, and eQUEST [11]. However, it is difficult to obtain the accurate structural parameters of buildings, and it is very complicated to establish accurate physical models.

With the improvement of data operation and maintenance systems, data-driven methods are widely used in BCL forecasting. Data-driven techniques establish load forecasting models based on historical data [12], among which the artificial neural network (ANN) and the support vector machine (SVM) are widely used [13]. Notably, ANN is good at solving nonlinear problems [14]. Through training, it can fit the complex nonlinear relationship between the cooling load and the input parameters well [15]. Wang et al. (2018) proposed an integrated method combining ANN with the dynamic prediction model of BCL [16]. In recent years, time series prediction algorithms based on deep learning, such as long short-term memory (LSTM), have also been widely used for BCL prediction. Muzaffar et al. (2019) proposed using an LSTM network for short-term load forecasting. Many experimental results show that predictions based on LSTM are superior to other methods and have the potential to further improve prediction accuracy [17]. However, training accurate neural network models usually requires a large amount of historical data, which are difficult to obtain in some practical projects [18].

In contrast, SVM proposed by Vapnik in 1995 can effectively solve nonlinear and collinearity problems with less sample data [19]. The research of Amasyali et al. (2018) showed that up to 25% of data-driven models in building energy prediction research are trained by SVM [20]. Fan et al. used SVR to predict BCL and selected important SVR input variables through sensitivity analysis to verify the prediction accuracy [9]. Li et al. proposed four different modeling methods including SVM based on the energy use and meteorological data of 59 residential buildings to establish a building annual energy consumption prediction model [21]. They concluded that SVM has higher accuracy and generalization ability than other methods. In order to improve the performance of prediction models further, hybrid prediction methods are widely used because they can absorb the advantages of different prediction methods. Fan et al. used the k-Nearest neighbor (KNN) classification method to obtain the load pattern for the forecast day [22]. Zhou et al. used two hybrid machine learning methods to predict the cooling load of commercial buildings, and the results show that the two hybrid algorithms are superior to the single BP and SVR [23]. In the SVM prediction model, optimizing parameters can improve its prediction performance effectively. The successful application of the current heuristic algorithm can help the model find the optimal parameters quickly. Commonly used optimization algorithms include the genetic algorithm (GA) [24], particle swarm optimization (PSO) [25] and so on. It is worth noting that GA has the characteristics of strong global search ability, fast search speed and strong robustness.

Previous studies have provided the basis and guidance for BCL prediction, but there are still some problems. Firstly, it is difficult to determine the margin required by SVM, which can easily lead to poor model fit or overfitting, so SVM parameters must be optimized [26,27]. In addition, previous load forecasting mostly constitutes point forecasting, which cannot deal with the uncertainty of large fluctuations in meteorological conditions and sudden changes in BCL demand, resulting in large errors in the prediction results. Therefore, more in-depth research needs to be carried out to predict the fluctuation trend and fluctuation range of BCL.

In this paper, to solve the above problems, the main contributions are as follows:

(1) The key influencing factors of BCL were obtained by the Pearson correlation coefficient (PCC) method. At the same time, considering the delay effect of input parameters on BCL, sensitivity analysis was used to compare the influence of meteorological parameters at different historical moments on the accuracy of the model, and the best parameter combination was obtained.

(2) The SVM model widely used in BCL forecasting still has some problems such as local optimization and difficult parameter determination. In order to improve the accuracy of the prediction model, the GA algorithm was used to optimize the parameters of the SVM model to avoid the unreasonable selection of SVM parameters.

(3) Previous load forecasting mostly constituted point forecasting, which could not deal with uncertainties such as large fluctuations in meteorological conditions and sudden changes in BCL demand. The information granulation (IG) method was introduced to fuzzy process the original load data, and the fluctuation trend and the range of the BCL were obtained by using the joint prediction model.

(4) Based on the model, a practical case was used to compare the proposed model with the mainstream prediction model, and various performance indicators were used to illustrate the results and applicability of the proposed model.

The rest of this paper is arranged as follows: Section 2 introduces the basics of SVM, GA and IG, and details the establishment method of the BCL forecasting model and its evaluation index. Section 3 focuses on data description and analyzes the influencing factors of BCL. Section 4 uses the actual case data to perform sensitivity analysis to determine the BCL influencing factors, GA-SVM for BCL hourly cold load prediction, and IG-SVM for BCL fluctuation range prediction, respectively, and the experimental results are analyzed and discussed. Section 5 concludes the paper.

2. Methodologies

2.1. SVM

SVM is one of the most successful machine learning algorithms of the past ten years [20]. As described by Liu et al. [28], the principle of SVM is to introduce a kernel function, map the input space to a high-dimensional feature space through non-linear mapping, and perform linear regression in the feature space. The algorithm follows the principle of margin maximization: it finds a hyperplane that divides the samples, and the problem is finally transformed into a convex quadratic programming problem [29]. Because SVM adopts the principle of structural risk minimization (SRM) [30], it has good generalization performance [31].

2.1.1. Process Input Parameters

In order to improve the prediction accuracy while removing the influence of individual bad values on the model, the input parameters are normalized. The normalized mapping used is as follows:

f : p_{i} \to P_{i} = \frac{p_{i} - p_{\min}}{p_{\max} - p_{\min}}

(1)

where p_i is each input and output parameter, that is, the building cooling load and its influencing factors; P_i is the corresponding parameter after normalization; and p_min and p_max are the minimum and maximum values of the corresponding parameter.
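As an illustration of Equation (1), the following minimal sketch normalizes one series to the range [0, 1]. The function name, the sample temperatures and the handling of a constant series are assumptions made for this example and do not come from the paper.

```python
def min_max_normalize(values):
    """Map each value p_i to (p_i - p_min) / (p_max - p_min), as in Equation (1)."""
    p_min, p_max = min(values), max(values)
    span = p_max - p_min
    if span == 0:
        # Assumption: a constant series is mapped to zeros to avoid division by zero.
        return [0.0 for _ in values]
    return [(p - p_min) / span for p in values]


# Hypothetical hourly dry bulb temperatures in degrees Celsius.
temperatures = [24.0, 27.0, 30.0, 36.0]
normalized = min_max_normalize(temperatures)
print(normalized)
```

In practice, p_min and p_max would normally be taken from the training data and reused for the test data, so that both sets share the same scale.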

After normalization, the input parameters of the model are denoted as X_i and the output parameters are denoted as Y_i. The input and output parameters of the model can be approximated by Equation (2):

Y = f(X) = W \cdot \varphi(X) + b

(2)

where φ(X) represents the high-dimensional feature space non-linearly mapped from the input space X.

\begin{aligned} \min_{W, b, \xi_{i}, \zeta_{i}} \quad & \frac{1}{2} \|W\|^{2} + C \sum_{i = 1}^{n} (\xi_{i} + \zeta_{i}) \\ \text{s.t.} \quad & \begin{cases} Y_{i} - W \cdot \varphi(X_{i}) - b \leq \varepsilon + \xi_{i} \\ W \cdot \varphi(X_{i}) + b - Y_{i} \leq \varepsilon + \zeta_{i} \\ \xi_{i} \geq 0, \ \zeta_{i} \geq 0 \end{cases} \end{aligned}

(3)

In Formula (3), ‖W‖² is the regularization term and C is the penalty factor. The larger C is, the better the model fits the training data, but an excessive C value may reduce the generalization ability of the model [32]. ξ_i and ζ_i are slack variables, and ε is the width of the insensitive zone of the loss function, which is defined as:

L_{\varepsilon}(Y_{i}, f(X_{i})) = \begin{cases} 0, & |Y_{i} - f(X_{i})| \leq \varepsilon \\ |Y_{i} - f(X_{i})| - \varepsilon, & |Y_{i} - f(X_{i})| > \varepsilon \end{cases}

(4)
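The loss in Equation (4) ignores errors that fall inside a tube of half-width ε and penalizes only the excess beyond it. A minimal sketch follows; the function name and the sample values are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Return 0 inside the epsilon tube, otherwise the absolute error minus epsilon."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# An error of 0.5 lies inside a tube of half-width 1.0, an error of 3.0 does not.
small_error = epsilon_insensitive_loss(10.0, 10.5, epsilon=1.0)
large_error = epsilon_insensitive_loss(10.0, 13.0, epsilon=1.0)
print(small_error, large_error)
```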

Lagrangian multipliers a_i and b_i are introduced to transform the convex optimization problem into a maximization quadratic form.

\begin{aligned} \max_{a_{i}, b_{i}} \quad W(a_{i}, b_{i}) = {} & \sum_{i = 1}^{n} Y_{i} (a_{i} - b_{i}) - \varepsilon \sum_{i = 1}^{n} (a_{i} + b_{i}) - \frac{1}{2} \sum_{i, j = 1}^{n} (a_{i} - b_{i}) (a_{j} - b_{j}) \, \varphi(X_{i}) \cdot \varphi(X_{j}) \\ \text{s.t.} \quad & \begin{cases} \sum_{i = 1}^{n} a_{i} = \sum_{i = 1}^{n} b_{i} \\ 0 \leq a_{i}, b_{i} \leq C \end{cases} \quad i = 1, 2, \dots, n \end{aligned}

(5)

By introducing the kernel function k (X_i, X_j) instead of the inner product calculation, the dual form of the optimization objective can be obtained:

\begin{aligned} \max_{a_{i}, b_{i}} \quad & -\frac{1}{2} \sum_{i = 1}^{n} \sum_{j = 1}^{n} (a_{i} - b_{i}) (a_{j} - b_{j}) \, k(X_{i}, X_{j}) - \varepsilon \sum_{i = 1}^{n} (a_{i} + b_{i}) + \sum_{i = 1}^{n} Y_{i} (a_{i} - b_{i}) \\ \text{s.t.} \quad & \begin{cases} \sum_{i = 1}^{n} a_{i} = \sum_{i = 1}^{n} b_{i} \\ 0 \leq a_{i}, b_{i} \leq C \end{cases} \quad i = 1, 2, \dots, n \end{aligned}

(6)

In Equation (6), i and j represent different samples, and Equation (2) can be written in the following explicit form:

Y = f(X) = \sum_{i = 1}^{n} (a_{i} - b_{i}) \, k(X_{i}, X) + b

(7)

2.1.2. SVM Kernel Function

The kernel function of SVM is essentially a mapping function [33]. There are four typical kernel functions: linear, polynomial, RBF and Sigmoid, and their expressions are shown in Table 1. The linear kernel function is mainly used for linearly separable cases and has the advantages of few parameters and fast running speed. However, the factors affecting the cooling load of the building are complex, so the linear kernel function may not be applicable. The RBF kernel function is mostly used for linearly inseparable cases and can effectively map the sample set from the input space to the high-dimensional feature space. It also requires fewer parameters than the polynomial kernel function: only the width parameter needs to be determined, which helps to improve the calculation efficiency. Therefore, this paper uses the RBF function as the SVM kernel function.
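To make the role of the width parameter concrete, the sketch below evaluates an RBF kernel between two normalized input vectors. It assumes the common parameterization k(x, z) = exp(−g‖x − z‖²); the exact expression used in Table 1 is not reproduced here and may differ, and the function name and sample vectors are illustrative.

```python
import math


def rbf_kernel(x, z, g):
    """RBF kernel under the assumed form exp(-g * squared Euclidean distance)."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-g * squared_distance)


# Identical inputs give the maximum similarity of 1; distant inputs approach 0.
same = rbf_kernel([0.2, 0.5], [0.2, 0.5], g=2.0)
far = rbf_kernel([0.0, 0.0], [1.0, 1.0], g=2.0)
print(same, far)
```

With this form, a larger g makes the kernel decay faster with distance, so each training sample influences a smaller neighborhood.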

2.1.3. SVM Parameter Optimization

When using SVM to solve a prediction problem, the selection of model parameters is vital, since good results depend on appropriate parameter values. If the penalty parameter c is too large, the generalization ability of the model deteriorates and over-fitting becomes more likely during training. Conversely, if c is too small, the model is prone to underfitting, resulting in low accuracy. Additionally, the number of support vectors decreases as the kernel function parameter g increases, which also reduces the accuracy of the model. Therefore, it is crucial to find the optimal SVM parameters for model prediction.

At present, regarding the SVM parameter optimization problem, scholars commonly use various methods including grid search [34], heuristic algorithm [35], etc. Importantly, the grid search method makes c and g change within limits. For a certain set of selected parameters, the K-fold Cross Validation (K-CV) method is adopted to acquire the accuracy of the model under these parameters, and finally the group of c and g with the highest accuracy are chosen as the best parameters. However, it will take a lot of time to find the global optimal solution in a larger range.
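The grid search with K-CV described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: `cv_error` is a hypothetical callback that would train an SVM with the given c and g on the training indices and return its validation error, and `toy_error` is a stand-in surface used only so that the example runs without an SVM library.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (training, validation) index pairs."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        training = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((training, validation))
        start += size
    return folds


def grid_search(c_grid, g_grid, n_samples, k, cv_error):
    """Return the (c, g) pair with the lowest mean validation error over k folds."""
    best_pair, best_error = None, float("inf")
    for c in c_grid:
        for g in g_grid:
            errors = [cv_error(c, g, tr, va) for tr, va in k_fold_indices(n_samples, k)]
            mean_error = sum(errors) / len(errors)
            if mean_error < best_error:
                best_pair, best_error = (c, g), mean_error
    return best_pair, best_error


def toy_error(c, g, training, validation):
    """Stand-in error surface (not a real SVM) with its minimum at c = 8, g = 0.5."""
    return (c - 8) ** 2 + (g - 0.5) ** 2


best_pair, best_error = grid_search([1, 2, 4, 8, 16], [0.125, 0.25, 0.5, 1.0], 10, 5, toy_error)
print(best_pair, best_error)
```

The cost grows with the product of the two grid sizes and the number of folds, which is why a fine grid over a wide range becomes slow.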

By contrast, applying the heuristic algorithm can find the global optimal solution rapidly. In previous research, commonly used heuristic optimization algorithms include GA [24], PSO [25], etc. Since the genetic algorithm was proposed, it has been widely used in automatic control, computer science, intelligent fault diagnosis, load forecasting and many other fields. When solving complex combinatorial optimization problems, better optimization results can be obtained faster than in some traditional optimization algorithms, thus significantly improving the speed of SVM parameter optimization [36]. In addition, the genetic algorithm has the characteristics of a strong global search ability and strong robustness. In this paper, the GA is used to optimize SVM model parameters.

2.2. GA-SVM

The GA was proposed by Professor Holland in 1969. It searches for the global optimal solution of a problem within a limited range by simulating the evolutionary process and the mechanisms of natural organisms [37]. The basic idea is to simulate the biological evolution and genetic mechanisms, which is suitable for solving complex nonlinear and multi-dimensional spatial optimization problems [38]. Therefore, the GA is used to optimize SVM model parameters, so as to build the GA-SVM based building cooling load prediction model.

The GA screens individuals through selection, crossover and mutation according to the selected fitness function, and finally determines the global optimal solution to obtain the optimal parameters c and g. The GA optimization parameters are set as follows: the population size is 20, the maximum number of generations is 200, the search range of c is [0,100], and the search range of g is [0,100]. The steps to optimize SVM parameters using the GA are as follows.

Step 1: Initialize the population. A large population provides sufficient sampling capacity for the search, but it increases the amount of calculation and slows convergence, so the population size is set to 20.

Step 2: Calculate the fitness function. The fitness function reflects the possibility that chromosomes can be inherited to the next generation and can effectively provide a basis for the next generation screening. The mean square error between the actual BCL value and the predicted value of all chromosomes in the population was superimposed, and the reciprocal of the minimum value was selected as the fitness of the GA, which was calculated as follows:

E = \frac{1}{\sum_{k = 1}^{N} (T_{k} - Y_{k})^{2}}

(8)

where N represents the number of chromosomes in the population, Y represents the actual value of BCL, and T represents the predicted value of BCL.

Step 3: Selection. The selection is made in the way of roulette, that is, the more fit the chromosome is, the greater the probability of being selected.

Step 4: Crossover. A pair of individuals is selected from the population according to a certain probability, and their codes are crossed to generate new individuals of the next generation with higher fitness. The crossover probability is:

P_{c} = \begin{cases} P_{c\max}, & E_{\max} < E_{mean} \\ P_{c\max} - \frac{P_{c\max} - P_{c\min}}{iter_{\max}} \cdot iter, & E_{\max} \geq E_{mean} \end{cases}

(9)

where E_max represents the largest fitness distribution function value among the two individuals to be crossed in the last generation of chromosomes, E_mean represents the average value of the fitness distribution function value of all individuals in the last generation of chromosomes, iter is the current number of iteration updates, iter_max is the upper limit of iteration times, and P_cmax and P_cmin are the peak and minimum values of the expected crossover probabilities, respectively.

Step 5: Mutation. It refers to the selection of some individuals from the previous generation of chromosomes for mutation according to a specific probability. In most previous studies, the mutation probability is a fixed value with low flexibility and robustness. In order to improve the local optimization ability of the GA, the mutation probability expression was selected as follows:

P_{m} = \begin{cases} P_{m\max}, & E < E_{mean} \\ P_{m\max} - \frac{P_{m\max} - P_{m\min}}{iter_{\max}} \cdot iter, & E \geq E_{mean} \end{cases}

(10)

where m represents the individual with the mutation.

Step 6: Determine whether to terminate. If iter < iter_max, continue with the next iteration until the optimal solution is found; otherwise, end the iteration and complete the optimization process.

Figure 1 shows the detailed process of using GA to optimize SVM parameters.
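Steps 1 to 6 can be sketched as a small real-coded GA. This is an illustrative sketch under several assumptions rather than the authors' implementation: `error_fn` is a hypothetical callback that would return the SVM validation error for a pair (c, g), the probability bounds `p_c` and `p_m` are invented values (the paper does not state them), and arithmetic crossover, uniform mutation, elitism and the use of the parent fitness when mutating a child are choices made here for brevity.

```python
import random


def adaptive_probability(p_max, p_min, fitness_value, mean_fitness, iteration, max_iterations):
    """Equations (9) and (10): p_max below the mean fitness, linear decay otherwise."""
    if fitness_value < mean_fitness:
        return p_max
    return p_max - (p_max - p_min) / max_iterations * iteration


def run_ga(error_fn, bounds, pop_size=20, max_iterations=200, seed=0,
           p_c=(0.9, 0.4), p_m=(0.1, 0.01)):
    """Search for the (c, g) pair that minimises error_fn; returns (best_pair, best_error)."""
    if pop_size % 2:
        raise ValueError("pop_size must be even so that parents can be paired")
    rng = random.Random(seed)
    # Step 1: random initial population inside the search ranges.
    population = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(pop_size)]
    best = min(population, key=lambda ind: error_fn(*ind))
    for iteration in range(max_iterations):
        # Step 2: fitness as the reciprocal of the error (guarded against division by zero).
        fitness = [1.0 / (error_fn(*ind) + 1e-12) for ind in population]
        mean_fitness = sum(fitness) / pop_size
        # Step 3: roulette selection, probability proportional to fitness.
        chosen = rng.choices(list(range(pop_size)), weights=fitness, k=pop_size)
        children = []
        # Step 4: crossover on consecutive pairs, driven by the larger fitness of the pair.
        for i, j in zip(chosen[0::2], chosen[1::2]):
            first, second = population[i], population[j]
            p_cross = adaptive_probability(p_c[0], p_c[1], max(fitness[i], fitness[j]),
                                           mean_fitness, iteration, max_iterations)
            if rng.random() < p_cross:
                w = rng.random()  # Arithmetic crossover weight (an assumption).
                first, second = (
                    tuple(w * a + (1 - w) * b for a, b in zip(first, second)),
                    tuple(w * b + (1 - w) * a for a, b in zip(first, second)),
                )
            children.append((first, fitness[i]))
            children.append((second, fitness[j]))
        # Step 5: mutation, redrawing a gene uniformly inside its search range.
        population = []
        for individual, parent_fitness in children:
            p_mut = adaptive_probability(p_m[0], p_m[1], parent_fitness,
                                         mean_fitness, iteration, max_iterations)
            population.append(tuple(
                rng.uniform(lo, hi) if rng.random() < p_mut else gene
                for gene, (lo, hi) in zip(individual, bounds)
            ))
        # Elitism (an assumption): carry the best individual found so far forward.
        population[0] = best
        best = min(population, key=lambda ind: error_fn(*ind))
    # Step 6: stop once the iteration limit is reached.
    return best, error_fn(*best)


def toy_surface(c, g):
    """Stand-in error surface with its minimum at c = 30, g = 2 (not a real SVM)."""
    return (c - 30.0) ** 2 + (g - 2.0) ** 2


best_pair, best_error = run_ga(toy_surface, bounds=[(0.0, 100.0), (0.0, 100.0)])
print(best_pair, best_error)
```

Replacing `toy_surface` with a function that trains an SVM and returns its cross-validated error would turn the sketch into a parameter search of the kind described in this section.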

2.3. IG-SVM

Based on the different requirements of BCL prediction and in order to ensure the accuracy of prediction, the information granulation (IG) method was applied to predict the change trend and fluctuation range of BCL [39]. The concept of IG was first proposed by Lotfi A. Zadeh [40]. IG decomposes a whole into several parts for study. For cooling load forecasting, the data of the corresponding time period can be studied as an information granule according to the forecasting requirement, and the effective information of each granule is extracted by IG technology. Whether the boundary of an information granule can be accurately defined determines whether the granule is fuzzy. When the characteristics of an information granule cannot be accurately described, a fuzzy information granule can be used to describe it [40]. Ruan et al. (2013) pointed out that large-scale nonlinear time series containing noisy data can be predicted quickly by using a fuzzy granular SVM [39].

There are currently three main models for the study of information granulation, which are the model based on fuzzy set theory, rough set theory and quotient space theory. Among them, fuzzy information granulation is the information granulation expressed in the form of fuzzy sets, and the fuzzy granulation of time series data by the fuzzy set method is mainly divided into two steps:

Step 1: The division of windows. According to the actual prediction needs, the time series is divided into several sub-sequences as the operation window.

Step 2: Information fuzzification. According to certain rules, each window divided in the previous step is fuzzified to produce a single information granule, which is the fuzzy information granule.

The fuzzification of information is actually a process of effective information extraction. The task of fuzzification is to establish a fuzzy concept that can reasonably describe the sequence in each time with the essence of determining the membership function of the fuzzy concept. When granulating, the basic form of the fuzzy concept is determined first, followed by the specific membership function.

The commonly used fuzzy granules have the following forms: triangular, trapezoidal, Gaussian, etc. Although the membership function differs for each form, two basic principles must be met when a fuzzy granule is established: firstly, the fuzzy granule must be able to reasonably represent the original data, and secondly, the fuzzy granule must have a certain specificity. The triangular fuzzy granulation method proposed by W. Pedrycz meets these requirements [41,42], so this method is adopted. The membership function is as follows:

A(x, a, m, b) = \begin{cases} 0, & x < a \\ \frac{x - a}{m - a}, & a \leq x \leq m \\ \frac{b - x}{b - m}, & m < x \leq b \\ 0, & x > b \end{cases}

(11)

In Formula (11), x is the time series to be granulated, and a, m and b correspond to the three parameters obtained in each window after granulation: Low, R and Up, respectively. For a single fuzzy granule, Low, R and Up represent the minimum, average, and maximum changes in the original data, respectively.
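The triangular membership function of Equation (11) can be evaluated directly. In the sketch below, the function name, the handling of the degenerate case m = a and the sample load values are assumptions made for illustration.

```python
def triangular_membership(x, a, m, b):
    """Membership degree of x in a triangular fuzzy granule with parameters a <= m <= b."""
    if x < a or x > b:
        return 0.0
    if x <= m:
        # Assumption: when m equals a, the left edge is vertical and x = a is a full member.
        return (x - a) / (m - a) if m > a else 1.0
    return (b - x) / (b - m)


# Hypothetical granule with Low = 100, R = 200 and Up = 260 (load units).
degrees = [triangular_membership(x, 100.0, 200.0, 260.0) for x in (50.0, 150.0, 200.0, 230.0, 300.0)]
print(degrees)
```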

In order to predict the variation trend and fluctuation range of BCL, a prediction model of the fluctuation range of cooling load based on fuzzy information granules was established. The specific steps are as follows:

Step 1: Inputting the original dataset of BCL and data preprocessing. The BCL dataset obtained from the database is preprocessed, including outliers processing, missing values processing, etc.

Step 2: Determining the window length of the BCL to be granulated and dividing the information window.

Step 3: Fuzzy granulation processing. Using triangular fuzzy particles to fuzzify the load data of each information window to obtain the parameters (Low, R and Up) of each window.

Step 4: Predicting the parameters (Low, R and Up) of each window separately. The GA-SVM prediction model is used in this paper.

Step 5: The prediction results are analyzed and verified to obtain the trend of load change in a future time window and the prediction performance of the model.

The flow chart of the IG-SVM model is shown in Figure 2.
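Steps 2 and 3 of the procedure above can be sketched as follows. This is a simplified illustration: it takes Low, R and Up as the minimum, mean and maximum of each window, following the description given for these parameters, whereas the published triangular granulation method may determine them differently. The function name, the window length and the load values are hypothetical.

```python
def granulate(series, window_length):
    """Split series into consecutive windows and return one (Low, R, Up) triple per window."""
    granules = []
    # Incomplete trailing windows are dropped (an assumption of this sketch).
    for start in range(0, len(series) - window_length + 1, window_length):
        window = series[start:start + window_length]
        granules.append((min(window), sum(window) / len(window), max(window)))
    return granules


# Hypothetical hourly loads, granulated in windows of three hours.
loads = [300.0, 360.0, 330.0, 420.0, 480.0, 450.0, 500.0]
granules = granulate(loads, window_length=3)
print(granules)
```

Each of the three resulting series (Low, R and Up) would then be forecast separately, which gives the lower bound, the typical level and the upper bound of the load in the next window.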

2.4. Model Evaluation Index

Three indicators are used to evaluate model performance: the mean absolute percentage error (MAPE) and the coefficient of variation of the root mean square error (CV-RMSE), which describe the difference between the predicted values and the observed values, and the squared correlation coefficient (R²). The corresponding formulas are as follows, where n is the number of samples, f(x_i) is the predicted value and y_i is the observed value:

MAPE = \frac{1}{n} \sum_{i = 1}^{n} \frac{|f(x_{i}) - y_{i}|}{y_{i}} \times 100\%

(12)

CV\text{-}RMSE = \frac{\sqrt{\frac{1}{n} \sum_{i = 1}^{n} (f(x_{i}) - y_{i})^{2}}}{\frac{1}{n} \sum_{i = 1}^{n} y_{i}}

(13)

R^{2} = \frac{\left(n \sum_{i = 1}^{n} f(x_{i}) y_{i} - \sum_{i = 1}^{n} f(x_{i}) \sum_{i = 1}^{n} y_{i}\right)^{2}}{\left(n \sum_{i = 1}^{n} f(x_{i})^{2} - \left(\sum_{i = 1}^{n} f(x_{i})\right)^{2}\right) \left(n \sum_{i = 1}^{n} y_{i}^{2} - \left(\sum_{i = 1}^{n} y_{i}\right)^{2}\right)}

(14)
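The three indicators can be computed with a few lines of code. The sketch below is illustrative: the function names and sample values are assumptions, and CV-RMSE is computed in its conventional form, that is, the root of the mean squared error divided by the mean observed value.

```python
import math


def mape(predicted, observed):
    """Mean absolute percentage error in percent."""
    return 100.0 * sum(abs(f - y) / y for f, y in zip(predicted, observed)) / len(observed)


def cv_rmse(predicted, observed):
    """Root mean square error divided by the mean observed value."""
    n = len(observed)
    rmse = math.sqrt(sum((f - y) ** 2 for f, y in zip(predicted, observed)) / n)
    return rmse / (sum(observed) / n)


def r_squared(predicted, observed):
    """Squared correlation coefficient between predictions and observations."""
    n = len(observed)
    sum_f, sum_y = sum(predicted), sum(observed)
    sum_fy = sum(f * y for f, y in zip(predicted, observed))
    sum_ff = sum(f * f for f in predicted)
    sum_yy = sum(y * y for y in observed)
    numerator = (n * sum_fy - sum_f * sum_y) ** 2
    denominator = (n * sum_ff - sum_f ** 2) * (n * sum_yy - sum_y ** 2)
    return numerator / denominator


# Hypothetical observed and predicted cooling loads.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(mape(predicted, observed), cv_rmse(predicted, observed), r_squared(predicted, observed))
```

MAPE is undefined when an observed value is zero, so hours without load would need to be excluded before it is applied.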

2.5. Joint Forecasting Model Framework

The BCL joint forecasting model framework proposed in this paper is shown in Figure 3.

Step 1: Inputting the raw data and preprocessing it to analyze the forecast requirements.

Step 2: The PCC method was used to obtain the key influencing factors of BCL. At the same time, due to the specific hysteresis of the influence of some input parameters on BCL, sensitivity analysis was used to compare the influence of historical values of different meteorological parameters on the accuracy of the model to obtain the best parameters combination.

Step 3: The GA was used to optimize the SVM parameters c and g to enhance SVM convergence speed and prediction accuracy.

Step 4: When there is a demand for BCL fluctuation range prediction or a sudden change in the meteorological environment, the IG-SVM model is used for the fuzzy granulation of data. At the same time, the trained GA-SVM was used to predict the maximum, average and minimum of each window, respectively, and the fluctuation range of BCL was obtained.

Step 5: Model verification was carried out with a practical case, that is, the performance indicators of the model proposed in this paper were compared with the SVM and mainstream prediction models to verify its effectiveness and universality.

3. Case Study

3.1. Data Source and Description

Xi’an is located in the warm temperate subhumid continental monsoon climate zone, with four distinct seasons of cold, warm, dry and wet, and a large demand for cooling in the summer. Therefore, we used a large commercial center in Xi’an as an example. The commercial center has 8 floors with a total height of 36.2 m and a total built area of 250,000 m², among which the air conditioned area accounts for about 190,000 m². The air conditioning equipment runs from 8am to 10pm daily, operating with a degree of regularity.

The experimental data comes from the commercial center, and the hourly cooling load and hourly meteorological data of the air conditioning operation from 1 June to 31 July were collected. Descriptive statistics were performed for variables and the results are shown in Table 2.

In order to clearly show the overall trend of the data, the hourly data for each day were processed to obtain the total daily load value, daily solar radiation, average dry bulb temperature and average relative humidity, respectively. Figure 4 shows the daily meteorological data and the cooling load trend for the two months of June and July. It can be seen that the cooling load changes with a certain regularity, and the cooling load values on weekends are significantly higher than on weekdays. We used the hourly data from 1 June to 24 July (54 days in total) as the training sample, and the hourly data from 25 to 31 July (7 days in total) as the test sample to verify the accuracy of the trained model.

3.2. Selection of Model Parameters

During a building’s operating cycle, the parameters which affect the cooling load are mainly outdoor climate parameters, such as solar radiation, dry bulb temperature, outdoor humidity and wind speed [15]. The daily solar radiation and dry bulb temperature affect the heat gain through the building envelope, and therefore have a greater impact on BCL. In addition, the outdoor humidity has a certain impact on the load, because the air conditioning system needs to cool and dehumidify the air in order to maintain a constant indoor moisture content. The start–stop of indoor electromechanical equipment also affects the cooling load of the building to a certain extent. However, in large public buildings, the start–stop of indoor electromechanical equipment follows a clear schedule, so its impact on the cooling load is small. Therefore, this paper only considers the impact of climatic conditions on the BCL.

The PCC method was adopted to analyze the correlation of input variables. The heat map of correlation coefficients of each parameter with the cooling load is shown in Figure 5. The correlation coefficients of solar radiation, dry bulb temperature and relative humidity are 0.74, 0.60 and −0.23, respectively, which shows that all have an effect on BCL. In addition, the correlation coefficient between wind speed and BCL is only −0.04. Compared with the other three parameters, there is a low correlation between wind speed and BCL. As a result, the three factors of solar radiation, dry bulb temperature and relative humidity were taken as the input parameters of the hourly cooling load forecasting model for buildings.
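For reference, the Pearson correlation coefficient between one candidate input and the cooling load can be computed as in the sketch below. The function name and all sample values are hypothetical and are not taken from the case study data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Hypothetical hourly samples: load rises with radiation and falls with humidity.
radiation = [0.0, 200.0, 400.0, 600.0]
humidity = [80.0, 70.0, 65.0, 50.0]
load = [100.0, 180.0, 240.0, 330.0]
print(pearson(radiation, load), pearson(humidity, load))
```

A coefficient close to zero, as reported for wind speed, indicates that the input carries little linear information about the load.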

4. Result and Discussion

4.1. Sensitivity Analysis of Input Parameters

The thermal inertia in buildings influences the BCL. As mentioned by Zhang and Wen [43], meteorological parameters such as solar radiation and dry bulb temperature have a lagging effect on BCL. Therefore, the BCL at the current moment may be affected by historical moment factors [44]. However, the effect of time-lagged energy loading factors on prediction accuracy decreases with increasing time lag [45]. Therefore, only the meteorological parameters of the previous 1–2 h are selected.

In this paper, the current moment was set as t, so the previous 1 h and the previous 2 h are t − 1 and t − 2. For the selected model input parameters, the SVM prediction model was used for analysis and validation. The specific representation symbols of each input parameter are shown in Table 3.

Sensitivity analysis was used to discuss the accuracy of the SVM prediction model under eight input conditions, as shown in Table 4. Within the scope of the study, the training set consisted of hourly data from 1 June to 24 July, and the test set consisted of data from 25 to 31 July. Through calculation and validation, the optimized parameters and prediction performance of each model are shown in Table 5, from which the best combination of input parameters for the SVM prediction model can be determined.

Table 5 shows that, without any historical data, the prediction model of Case 1 has a MAPE of 12.57% and an R² of 87.15%, indicating poor prediction accuracy. When the dry bulb temperature of the previous hour is taken into consideration (Case 2 and Case 4), the MAPE is 13.41% and 13.32%, respectively, so the prediction accuracy decreases relative to Case 1. Conversely, when only the solar radiation of the previous hour is added (Case 3), the MAPE falls to 10.29%, an improvement over Case 1, Case 2 and Case 4. The prediction performance of the model therefore improves when historical solar radiation values are considered and deteriorates when historical dry bulb temperature values are considered. This is because the correlation between dry bulb temperature and cooling load is weaker than that of solar radiation (0.60 versus 0.74), so increasing the data dimension of dry bulb temperature reduces the accuracy of model training. This paper also designed four further input parameter combinations, Cases 5–8. Table 5 shows that when only the solar radiation of the previous two moments is added (Case 5), the prediction accuracy of the model is the highest (MAPE of 8.48%, R² of 93.07%). In contrast, once the data dimension of the dry bulb temperature is increased (Cases 6–8), the model accuracy decreases, which is in line with the analysis above. The comparison between the predicted and actual values of each model, together with the error curves, is shown in Figure 6.

Figure 6 shows that the combination of the model parameters selected in Case 5 has the highest prediction accuracy, so $T_{t}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$ were chosen as the model input parameters in this study.
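As a minimal sketch of how the Case 5 input vector could be assembled from hourly series, the helper below pairs each time step with the solar radiation of the previous two hours. The function name `build_case5` and the sample values are hypothetical, and the sketch assumes the three series are aligned and contiguous in time.

```python
def build_case5(T, S, H):
    """Return one row [T_t, S_t, S_{t-1}, S_{t-2}, H_t] per time step t >= 2."""
    if not (len(T) == len(S) == len(H)):
        raise ValueError("all series must have the same length")
    rows = []
    for t in range(2, len(S)):
        rows.append([T[t], S[t], S[t - 1], S[t - 2], H[t]])
    return rows

# Illustrative values only, not data from the study.
T = [30.0, 31.0, 32.0, 33.0]      # dry bulb temperature
S = [0.0, 100.0, 200.0, 300.0]    # solar radiation
H = [60.0, 61.0, 62.0, 63.0]      # relative humidity

rows = build_case5(T, S, H)
print(rows)
```

The first two time steps are dropped because they lack a full two-hour history.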

4.2. Analysis of the GA-SVM Prediction Result

In this section, the two optimization strategies of SVM are compared first. Next, the hourly cooling load of a large public building is predicted by using a practical case. Within the scope of this study, hourly data from 1 June to 24 July were used as the training set, and the data from 25 to 31 July were used as the test set. SVM and GA-SVM were used to predict the BCL. The parameters of the baseline SVM were optimized with the grid search algorithm.

4.2.1. SVM Parameter Optimization

Firstly, the grid search method was used for parameter optimization. The method can find the optimal solution of SVM parameters within a certain range. The K-fold cross-validation method is incorporated in the parameter optimization to effectively avoid the occurrence of over-learning as well as under-learning states and to obtain convincing results. In this paper, the input parameters $T_{t}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$ were selected, the model parameters c and g were optimized, and the results are shown in Figure 7.

Figure 7a shows that the rough optimization gives c = 9.1896 and g = 3.0314, with a CV-MSE of 0.0198. On this basis, we implemented precise optimization with K = 5, so the data used for cross-validation were separated into 5 parts. The step size of parameters c and g was 0.5. Under the 5-fold CV, the optimal parameters c and g of the model are 0.5 and 16, and the CV-MSE is 0.015. The precise optimization result is shown in Figure 7b.
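A grid search with 5-fold cross-validation of this kind can be sketched as follows. This is not the authors' implementation: it uses scikit-learn's `SVR` and `GridSearchCV` as stand-ins, maps c and g to the `C` and `gamma` parameters of an RBF kernel, runs on synthetic data, and interprets the 0.5 step as a step in the base-2 exponent. The search range is likewise an assumption.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVR

# Synthetic stand-in for five normalised inputs and a load-like target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(120, 5))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.normal(size=120)

# Hypothetical search range: exponents -2 ... 4 in steps of 0.5.
exponents = np.arange(-2.0, 4.5, 0.5)
param_grid = {"C": 2.0 ** exponents, "gamma": 2.0 ** exponents}

search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid,
    scoring="neg_mean_squared_error",  # CV-MSE is the negated score
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

best_c = search.best_params_["C"]
best_g = search.best_params_["gamma"]
cv_mse = -search.best_score_
print(f"c = {best_c:.4f}, g = {best_g:.4f}, CV-MSE = {cv_mse:.4f}")
```

A coarse pass over a wide range followed by a finer pass around the best point, as described above, can be obtained by calling the same search twice with different `exponents`.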

Secondly, the GA was used to optimize the SVM parameters. The input parameters $T_{t}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$ were selected, the maximum number of generations was set to 200, and the maximum population size was set to 20. The optimal parameters c and g were 2.49 and 75.25, respectively.
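The idea of a GA searching over (c, g) can be sketched with a small real-coded genetic algorithm. The operators used here (tournament selection, arithmetic crossover, Gaussian mutation, elitism), the search bounds, the synthetic data and the use of scikit-learn's `SVR` are all assumptions for illustration; the paper specifies only the population size of 20 and the 200 generations, and the sketch runs 10 generations to keep it fast.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def cv_mse(log2_c, log2_g, X, y):
    """5-fold cross-validated MSE of an RBF SVR for one (c, g) candidate."""
    model = SVR(kernel="rbf", C=2.0 ** log2_c, gamma=2.0 ** log2_g)
    scores = cross_val_score(model, X, y, cv=5, scoring="neg_mean_squared_error")
    return -scores.mean()

def ga_search(X, y, pop_size=20, generations=10, bounds=(-5.0, 8.0), seed=0):
    """Minimise CV-MSE over (log2 c, log2 g); returns (c, g, best CV-MSE)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds  # hypothetical search bounds on the base-2 exponents
    pop = rng.uniform(lo, hi, size=(pop_size, 2))
    best, best_fit = None, np.inf
    for _ in range(generations):
        fit = np.array([cv_mse(c, g, X, y) for c, g in pop])
        i = int(fit.argmin())
        if fit[i] < best_fit:
            best, best_fit = pop[i].copy(), float(fit[i])
        # Tournament selection: each parent is the fitter of two random individuals.
        a, b = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((fit[a] < fit[b])[:, None], pop[a], pop[b])
        # Arithmetic crossover with a shuffled partner, then Gaussian mutation.
        w = rng.uniform(size=(pop_size, 1))
        children = w * parents + (1.0 - w) * parents[rng.permutation(pop_size)]
        mutate = rng.uniform(size=children.shape) < 0.1
        children = children + mutate * rng.normal(scale=0.5, size=children.shape)
        pop = np.clip(children, lo, hi)
        pop[0] = best  # elitism: carry the best individual into the next generation
    return 2.0 ** best[0], 2.0 ** best[1], best_fit

# Synthetic data, used only to exercise the search.
rng = np.random.default_rng(1)
X = rng.uniform(size=(80, 5))
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]
c, g, mse = ga_search(X, y)
print(f"c = {c:.3f}, g = {g:.3f}, CV-MSE = {mse:.5f}")
```

Unlike the grid search, the candidates are not restricted to a fixed lattice, which is one reason a GA can reach parameter values such as those reported above that do not lie on the grid.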

4.2.2. Prediction Results of GA-SVM

The GA-SVM was used to predict BCL hourly, and comparison experiments were set up to compare with the unimproved SVM model. The prediction results and error curves are shown in Figure 8 and Figure 9.

As can be seen from Figure 8, the SVM fits the original data less closely than the GA-SVM. The prediction accuracy of the SVM is better under the low-load mode (days 1–2 and days 4–7), but under the high-load mode (day 3) it is much lower than that of the GA-SVM. Figure 9 also shows that the SVM had very large error fluctuations on the third day. In contrast, the GA-SVM model can predict the hourly cooling load value accurately.

To further demonstrate the performance of the GA-SVM model in the hourly cooling load prediction of buildings, it was compared with the SVM and several mainstream forecasting models in current research.

The detailed evaluation indicators of the prediction results of each model are shown in Table 6. The GA-SVM has the best prediction performance: its CV-RMSE and MAPE are the lowest and its R² is the highest, so it describes the variation of BCL accurately. Compared with the unoptimized SVM model, the CV-RMSE and MAPE of the GA-SVM were reduced by 58.85% and 68.04%, respectively. Using the GA to optimize the parameters of the SVM can therefore greatly improve the accuracy of the prediction model.

To verify the accuracy of the GA-SVM further, the BP neural network, DNN and LSTM prediction models were selected for comparison. These three models are among the most widely used and most accurate models in current load forecasting [14,17,23]. According to Table 6, the CV-RMSE of the SVM model was 2.04~14.86% lower than that of the BP, LSTM and DNN models. In general, the SVM has better comprehensive prediction performance than the BP, LSTM and DNN models, showing a high degree of fitting and a small error on nonlinear time series prediction problems. Thus, the SVM was selected as the base model for optimization. In addition, compared with BP, LSTM and DNN, the R² of the GA-SVM proposed in this paper was increased by 4.75~6.35%, the MAPE was reduced by 68.00~72.76%, and the CV-RMSE was reduced by 59.69~64.97%. Therefore, the GA-SVM proposed in this paper has better prediction performance.
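The percentages quoted above are consistent with relative changes computed from the values in Table 6, as the following check illustrates. The dictionary layout and the helper name `relative_change` are illustrative choices; the numbers are copied from Table 6.

```python
# Values from Table 6: MAPE (%), CV-RMSE, R2 (%).
table6 = {
    "GA-SVM": {"MAPE": 2.71, "CV-RMSE": 1.58, "R2": 97.85},
    "SVM": {"MAPE": 8.48, "CV-RMSE": 3.84, "R2": 93.07},
    "BP": {"MAPE": 8.47, "CV-RMSE": 3.92, "R2": 93.41},
    "LSTM": {"MAPE": 9.95, "CV-RMSE": 4.51, "R2": 92.01},
    "DNN": {"MAPE": 9.01, "CV-RMSE": 4.01, "R2": 92.14},
}

def relative_change(new, old):
    """Percentage change of `new` relative to `old` (negative means a reduction)."""
    return (new - old) / old * 100.0

for baseline in ("SVM", "BP", "LSTM", "DNN"):
    changes = {
        metric: round(relative_change(table6["GA-SVM"][metric], table6[baseline][metric]), 2)
        for metric in ("MAPE", "CV-RMSE", "R2")
    }
    print(f"GA-SVM vs {baseline}: {changes}")

# CV-RMSE of the plain SVM relative to BP, LSTM and DNN.
for baseline in ("BP", "LSTM", "DNN"):
    change = relative_change(table6["SVM"]["CV-RMSE"], table6[baseline]["CV-RMSE"])
    print(f"SVM vs {baseline} (CV-RMSE): {change:.2f}%")
```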

4.3. IG-SVM Predicts the Cooling Load Fluctuation Range

Short-term load forecasting is an important means of ensuring the safe and economic operation of the energy system. However, most forecasting methods target the specific value of the load at the next moment, that is, point forecasting, and ignore the change trend and fluctuation range of the load. In addition, considering the influence of extreme weather on the accuracy of the hourly BCL prediction model, the IG-SVM was used to predict the fluctuation range of the cooling load. The advantage of this model is that its only input variable is the historical cooling load, so it does not depend on input parameters such as weather.

First of all, the original BCL data were analyzed. Hourly cooling load data were available from 8 a.m. to 10 p.m., giving a total of 15 cooling load values per day. Secondly, every three consecutive moments were grouped into an information window, each information window underwent fuzzy processing, and the 15 cooling load values of the day were thus divided into five fuzzy information granules. Finally, the GA-SVM was used to perform regression prediction on the data after information granulation. Similarly, the training set was the hourly cooling load data from 1 June to 24 July, and the test set was the daily cooling load variation range from 25 to 31 July. The result of fuzzy IG is illustrated in Figure 10.

In Figure 10, Low, R, and Up represent the minimum, average, and maximum values of the fuzzy granules corresponding to the changes in the original data, respectively.
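The windowing step can be sketched as follows. As a simplifying assumption, Low, R and Up are taken as the plain minimum, mean and maximum of each three-value window; the fuzzy membership fitting of the actual method is not reproduced here, and the load values are made up for illustration.

```python
def granulate(loads, window=3):
    """Split a load profile into windows and return (Low, R, Up) for each one."""
    if len(loads) % window != 0:
        raise ValueError("profile length must be a multiple of the window size")
    granules = []
    for start in range(0, len(loads), window):
        chunk = loads[start:start + window]
        granules.append((min(chunk), sum(chunk) / window, max(chunk)))
    return granules

# One illustrative day: 15 hourly values from 8 a.m. to 10 p.m.
day = [9000, 11000, 13000, 15000, 17000, 19000, 21000, 22000,
       21500, 20000, 18000, 16000, 14000, 12000, 10000]

granules = granulate(day)
for low, r, up in granules:
    print(f"Low = {low}, R = {r:.1f}, Up = {up}")
```

Each day yields five granules, and the three resulting series (Low, R and Up) are then predicted separately.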

The variation range of BCL was obtained by applying the GA-SVM to Low, R and Up separately. Taking the first day of the test set (25 July) as an example, the predicted and true values of the three parameters are shown in Figure 11. Thus, the IG-SVM model can predict the three granule parameters precisely.

The GA-SVM model was used to carry out the rolling prediction of the seven-day load of the test set, and the variation range of the cooling load was obtained. The comparison of the results with the actual variation range of BCL is shown in Figure 12 and Table 7.

Figure 12 and Table 7 show that the cooling load variation range from 25 to 31 July can be predicted more accurately by the IG-SVM model. The R² of the predicted value and the actual value is 97.27~99.68%, indicating a high degree of fit. In addition, the MAPE of the model is 2.59~2.84%, and the CV-RMSE is only 0.0249~0.0319. The model has high precision and a small relative error. More importantly, in extreme cases, such as large fluctuations in meteorological data, it is also possible to accurately predict the cooling load range of commercial buildings.
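For reference, the three evaluation indices can be sketched with their commonly used definitions. These formulas are an assumption here, since the definitions of Section 2.4 are not repeated in this section; in particular, whether CV-RMSE is reported as a fraction or as a percentage is not settled by this sketch.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def cv_rmse(actual, predicted):
    """Root mean square error divided by the mean of the actual values."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    return rmse / (sum(actual) / n)

def r2(actual, predicted):
    """Coefficient of determination."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Illustrative values only.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(mape(actual, predicted), cv_rmse(actual, predicted), r2(actual, predicted))
```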

In addition, because Section 4.2 showed that the GA-SVM outperforms the other prediction models, the GA-SVM was used directly in this section to predict the BCL data after information granulation, with good results. However, to further demonstrate the performance of this joint prediction model, the BCL data after fuzzy information granulation were also predicted using the commonly used prediction models. The prediction results are shown in Table 8.

As can be seen from Table 8, compared with the first three models, the R² of the IG-SVM improved by 2.19% to 6.08%, indicating a better fit. Overall, the IG-SVM joint prediction model has better prediction performance.

5. Conclusions

In this paper, a joint prediction model for the hourly cooling load and its fluctuation range, based on GA-SVM and IG-SVM, was proposed. Firstly, the PCC method was used to obtain the key factors affecting BCL. At the same time, due to the delay effect of some input parameters on BCL, sensitivity analysis was used to compare the influence of historical values of different meteorological parameters on the accuracy of the model, and the best parameter combination was obtained. Secondly, the GA was used to improve the SVM model, and the resulting GA-SVM model was proposed to predict the hourly BCL. Then, when there is a BCL fluctuation prediction demand or extreme weather conditions are encountered, the IG-SVM model is used to fuzzy granulate the data. At the same time, combined with the established GA-SVM model, the maximum, average and minimum values of each window were predicted, respectively, and the fluctuation range of BCL was obtained. Finally, the model was verified by the actual operational data of a large public building in Xi’an, and compared with mainstream prediction models. The results were as follows:

(1) The PCC method was used to obtain the three key factors affecting the BCL (solar radiation, dry bulb temperature, and relative humidity), and the optimal combination of Case 5 was obtained by comparing the effects of eight sets of input parameters at different historical moments on the accuracy of the model using sensitivity analysis.

(2) The CV-RMSE and MAPE of the GA-SVM prediction model were 58.85% and 68.04% lower than those of the SVM, respectively, which indicates that using GA to optimize SVM can effectively reduce the error of a prediction model. In addition, the CV-RMSE of the SVM prediction model was 2.04~14.86% lower than that of mainstream prediction models such as the BP neural network, DNN and LSTM. It can be seen that the SVM model has a high degree of fitting and a small error on nonlinear time series prediction problems and is more suitable for the accurate prediction of large public building cooling loads.

(3) When the fluctuation range of the building cooling load is predicted or extreme weather conditions are encountered, the IG-SVM model is used for processing, and the R² between the predicted value and the actual value was 97.27~99.68%, indicating a high degree of fitting. In addition, the MAPE of the model was 2.59~2.84%, and the CV-RMSE was only 0.0249~0.0319. The accuracy of the model is high and the relative error is low. It can be seen that the combined forecasting model can effectively improve the accuracy of load forecasting.

Large public buildings have high energy consumption and a large load demand, and the predictive control of their cooling load is considered to be an effective way to save energy and reduce emissions [6,7]. The advantage of this study is that the combined forecasting model can provide accurate forecasting performance when encountering extreme weather conditions or when there is a range of load variation forecasting needs. The improvement of prediction accuracy is conducive to the realization of online monitoring and optimal control of building air-conditioning systems, thus providing effective data support and theoretical references for building energy management and contributing to the decision-making process of the energy management system, such as demand response and peak shifting. In addition, in building energy management, the intelligent control system can adjust the operation strategy of the air-conditioning system according to the changes in the demand of the cold load, so as to make the use of electric power resources more economical and reasonable, and to realize energy-saving operation. From a macro perspective, energy management, the adjustment of the industrial structure and technological transformation can successfully realize energy conservation and emission reduction, thus helping the building sector achieve the “double carbon” goal.

However, the limitation of this study is that due to the availability of data, the model was only verified in large commercial buildings. In fact, the main functions and properties of different buildings (such as office buildings, hotels, etc.) are different. In practical applications, the results of this study may not be applicable to other types of large public buildings. In the future, when more data are available, we will try to verify this method in other types of buildings, fully consider the characteristics of the buildings, such as the use of functions, determine if the cold load prediction model is suitable for all types of large public buildings, and further improve the model according to actual needs to improve its accuracy and generalization ability. In addition, in future research, we should pay attention to the relationship between the model prediction and the actual energy saving effect, so as to put forward reasonable and feasible strategies for energy conservation in the building sector.

Author Contributions

Conceptualization, M.W. and J.Y.; methodology, M.W.; software, M.W. and M.Z.; formal analysis, M.W., W.Q. and R.C.; investigation, M.W. and M.Z.; data curation, R.C.; writing—original draft preparation, M.W.; writing—review and editing, J.Y. and M.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key R&D Program of China (Grant No. 2022YFC3802703-04).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data that has been used is confidential.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Hu, Y.; Huang, H.; Wang, H.; Li, C.; Deng, Y. Exploring cost-effective strategies for emission reduction of public buildings in a life-cycle. Energy Build. 2023, 285, 112927.
2. Yao, Y.; Shen, Y.; Liu, K. Investigation of resource utilization in urbanization development: An analysis based on the current situation of carbon emissions in China. Resour. Policy 2023, 82, 103442.
3. Qiao, R.; Liu, T. Impact of building greening on building energy consumption: A quantitative computational approach. J. Clean. Prod. 2020, 246, 119020.
4. Paulino, J.; Esperanza, G.G.; Beatriz, M.; José, P. Modelling energy performance of residential dwellings by using the MARS technique, SVM-based approach, MLP neural network and M5 model tree. Appl. Energy 2023, 341, 121074.
5. Chen, S.; Zhou, X.; Zhou, G.; Fan, C.; Ding, P.; Chen, Q. An online physical-based multiple linear regression model for building’s hourly cooling load prediction. Energy Build. 2022, 254, 111574.
6. Gao, Z.; Yu, J.; Zhao, A.; Hu, Q.; Yang, S. A hybrid method of cooling load forecasting for large commercial building based on extreme learning machine. Energy 2022, 238, 122073.
7. Yao, Y.; Chen, J. Global optimization of a central air-conditioning system using decomposition-coordination method. Energy Build. 2010, 42, 570–583.
8. Ma, Y.X.; Yu, C. Impact of meteorological factors on high-rise office building energy consumption in Hong Kong: From a spatiotemporal perspective. Energy Build. 2020, 228, 110468.
9. Fan, C.; Liao, Y.; Zhou, G.; Zhou, X.; Ding, Y. Improving cooling load prediction reliability for HVAC system using Monte-Carlo simulation to deal with uncertainties in input variables. Energy Build. 2020, 226, 110372.
10. Ding, Y.; Zhang, Q.; Yuan, T. Research on short-term and ultra-short-term cooling load prediction models for office buildings. Energy Build. 2017, 154, 254–267.
11. Eguía, P.; Granada, E.; Alonso, J.M.; Arce, E.; Saavedra, A. Weather datasets generated using kriging techniques to calibrate building thermal simulations with TRNSYS. J. Build. Eng. 2016, 7, 78–91.
12. Deb, C.; Zhang, F.; Yang, J.; Lee, S.E.; Shah, K.W. A review on time series forecasting techniques for building energy consumption. Renew. Sustain. Energy Rev. 2017, 74, 902–924.
13. Zhang, W.; Yu, J.; Zhao, A.; Zhou, X. Predictive model of cooling load for ice storage air-conditioning system by using GBDT. Energy Rep. 2021, 7, 1588–1597.
14. Bui, D.K.; Nguyen, T.N.; Ngo, T.D.; Nguyen, H. An artificial neural network (ANN) expert system enhanced with the electromagnetism-based firefly algorithm (EFA) for predicting the energy consumption in buildings. Energy 2020, 190, 116370.
15. Elbeltagi, E.; Wefki, H. Predicting energy consumption for residential buildings using ANN through parametric modeling. Energy Rep. 2021, 7, 2534–2545.
16. Wang, L.; Lee, E.W.M.; Yuen, R.K.K. Novel dynamic forecasting model for building cooling loads combining an artificial neural network and an ensemble approach. Appl. Energy 2018, 228, 1740–1753.
17. Muzaffar, S.; Afshari, A. Short-Term Load Forecasts Using LSTM Networks. Energy Procedia 2019, 158, 2922–2927.
18. Khuntia, S.R.; Rueda, J.L.; Meijden, M.A.M.M. Forecasting the load of electrical power systems in mid- and long-term horizons: A review. IET Gener. Transm. Distrib. 2016, 16, 3971–3977.
19. Zhang, G.; Ge, Y.; Pan, X.; Afsharzadeh, M.; Ghalandari, M. Optimization of energy consumption of a green building using PSO-SVM algorithm. Sustain. Energy Technol. Assess. 2022, 53, 102667.
20. Amasyali, K.; El-Gohary, N.M. A review of data-driven building energy consumption prediction studies. Renew. Sustain. Energy Rev. 2018, 81, 1192–1205.
21. Li, Q.; Ren, P.; Meng, Q. Prediction model of annual energy consumption of residential buildings. In Proceedings of the 2010 International Conference on Advances in Energy Engineering, Beijing, China, 19–20 June 2010; pp. 223–226.
22. Fan, C.; Liao, Y.; Ding, Y. Development of a cooling load prediction model for air-conditioning system control of office buildings. Int. J. Low-Carbon Technol. 2019, 14, 70–75.
23. Zhou, X.; Zi, X.; Liang, L.; Fan, Z.; Yan, J.; Pan, D. Forecasting performance comparison of two hybrid machine learning models for cooling load of a large-scale commercial building. J. Build. Eng. 2019, 21, 64–73.
24. Guan, S.; Wang, X.; Hua, L.; Li, L. Quantitative ultrasonic testing for near-surface defects of large ring forgings using feature extraction and GA-SVM. Appl. Acoust. 2021, 173, 107714.
25. Wang, X.; Guan, S.; Hua, L.; Wang, B.; He, X. Classification of spot-welded joint strength using ultrasonic signal time-frequency features and PSO-SVM method. Ultrasonics 2019, 91, 161–169.
26. Wang, Y.; Wang, G.; Yao, G.; Shen, Q.; Yu, X.; He, S. Combining GA-SVM and NSGA-Ⅲ multi-objective optimization to reduce the emission and fuel consumption of high-pressure common-rail diesel engine. Energy 2023, 278, 127965.
27. Li, Y.; Yang, P.; Wang, H. Short-term wind speed forecasting based on improved ant colony algorithm for LSSVM. Clust. Comput. 2019, 22, 11575–11581.
28. Liu, Y.; Chen, H.; Zhang, L.; Wu, X.; Wang, X.J. Energy consumption prediction and diagnosis of public buildings based on support vector machine learning: A case study in China. J. Clean. Prod. 2020, 272, 122542.
29. Zhang, Q.; Tian, Z.; Ding, Y.; Lu, Y.; Niu, J. Development and evaluation of cooling load prediction models for a factory workshop. J. Clean. Prod. 2019, 230, 622–633.
30. Emhamed, A.A.; Shrivastava, J. Electrical load distribution forecasting utilizing support vector model (SVM). Mater. Today Proc. 2021, 47, 41–46.
31. Ma, W.; Zhang, X.; Xin, Y.; Li, S. Study on short-term network forecasting based on SVM-MFA algorithm. J. Vis. Commun. Image Represent. 2019, 65, 102646.
32. Ma, Z.; Ye, C.; Li, H.; Ma, W. Applying support vector machines to predict building energy consumption in China. Energy Procedia 2018, 152, 780–786.
33. Chapelle, O.; Vapnik, V.; Bousquet, O.; Mukherjee, S. Choosing multiple parameters for support vector machines. Mach. Learn. 2002, 46, 131–159.
34. Tan, W.; Sun, L.; Yang, F.; Che, W.; Ye, D.; Zhang, D. Study on bruising degree classification of apples using hyperspectral imaging and GS-SVM. Optik 2018, 154, 581–592.
35. Cai, W.; Wen, X.; Li, C.; Shao, J.; Xu, J. Predicting the energy consumption in buildings using the optimized support vector regression model. Energy 2023, 273, 127188.
36. Pan, X.; Xing, Z.; Tian, C.; Wang, H.; Liu, H. A method based on GA-LSSVM for COP prediction and load regulation in the water chiller system. Energy Build. 2021, 230, 110604.
37. Costa-Carrapiço, I.; Raslan, R.; González, J.N. A systematic review of genetic algorithm-based multi-objective optimisation for building retrofitting strategies towards energy efficiency. Energy Build. 2020, 210, 109690.
38. Bre, F.; Silva, A.S.; Ghisi, E.; Fachinotti, V.D. Residential building design optimisation using sensitivity analysis and genetic algorithm. Energy Build. 2016, 133, 853–866.
39. Ruan, J.; Wang, X.; Shi, Y. Developing fast predictors for large-scale time series using fuzzy granular support vector machines. Appl. Soft Comput. J. 2013, 9, 3981–4000.
40. Zadeh, L.A. Fuzzy sets. Inf. Control 1965, 8, 338–353.
41. Tseng, F.; Tzeng, G.; Yu, H.; Yuan, B. Fuzzy ARIMA model for forecasting the foreign exchange market. Fuzzy Sets Syst. 2001, 118, 9–19.
42. Tseng, F.; Tzeng, G. A fuzzy seasonal ARIMA model for forecasting. Fuzzy Sets Syst. 2002, 126, 367–376.
43. Zhang, L.; Wen, J. A systematic feature selection procedure for short-term data-driven building energy forecasting model development. Energy Build. 2019, 183, 428–442.
44. Zhang, C.; Li, J.; Zhao, Y.; Li, T.; Chen, Q.; Zhang, X.; Qiu, W. Problem of data imbalance in building energy load prediction: Concept, influence, and solution. Appl. Energy 2021, 297, 117139.
45. Zhang, C.; Li, J.; Zhao, Y.; Li, T.; Chen, Q.; Zhang, X. A hybrid deep learning-based method for short-term building energy load prediction combined with an interpretation process. Energy Build. 2020, 225, 110301.

Figure 1. GA to optimize SVM flow chart.

Figure 2. IG-SVM algorithm flow chart.

Figure 3. Flow chart of the joint forecasting model.

Figure 4. Changes in the meteorological parameters and cooling load from June to July.

Figure 5. Heat map of correlation coefficients.

Figure 6. Case 1–8 comparison of predicted values and real values and comparison of prediction error curve. (a) Comparison of the predicted value and actual value of Case 1–4; (b) Case 1–4 relative error curve; (c) Comparison of the predicted value and actual value of Case 5–8; (d) Case 5–8 relative error curve.

Figure 7. Grid search method to optimize the results: (a) Rough optimization result; (b) Accurate optimization.

Figure 8. SVM and GA-SVM predicted value and true value comparison.

Figure 9. SVM and GA-SVM error comparison curve.

Figure 10. Fuzzy information granulation visualization graph.

Figure 11. Comparison of the predicted values and true values of the three parameters of the fuzzy information granule: (a) Low parameter predicted values and original values; (b) R parameter predicted values and original values; (c) Up parameter predicted values and original values.

Figure 12. IG-SVM prediction results.

Table 1. Commonly used kernel functions.

Name	Expression
Linear kernel function	$k (u, v) = (u \cdot v)$
Polynomial kernel function	$k (u, v) = {(r (u \cdot v) + \mathrm{coef0})}^{d}$
RBF kernel function	$k (u, v) = \exp (- r {\|u - v\|}^{2})$
Sigmoid kernel function	$k (u, v) = \tanh (r (u \cdot v) + \mathrm{coef0})$

Table 2. Descriptive statistics of variables.

Variable	Sample Size	Minimum Value	Maximum Value	Average Value	Standard Deviation
Dry bulb temperature	915	20.80	39.00	29.84	3.80
Relative humidity	915	19.00	98.00	63.07	18.61
Solar radiation	915	0	986.44	361.54	302.35
Wind speed	915	6.81	22.70	14.81	2.92
BCL	915	1314.00	28,740	13,938.69	6628.29

Table 3. Input parameter symbol comparison table.

Symbol	Meaning
$T_{t}$	Dry bulb temperature at time $t$
$T_{t - 1}$	Dry bulb temperature at time $t - 1$
$T_{t - 2}$	Dry bulb temperature at time $t - 2$
$S_{t}$	Solar radiation at time $t$
$S_{t - 1}$	Solar radiation at time $t - 1$
$S_{t - 2}$	Solar radiation at time $t - 2$
$H_{t}$	Relative humidity at time $t$

Table 4. Different historical data input parameter combination table.

Case	Model Input Parameters
Case 1	$T_{t}, S_{t}, H_{t}$
Case 2	$T_{t}, T_{t - 1}, S_{t}, H_{t}$
Case 3	$T_{t}, S_{t}, S_{t - 1}, H_{t}$
Case 4	$T_{t}, T_{t - 1}, S_{t}, S_{t - 1}, H_{t}$
Case 5	$T_{t}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$
Case 6	$T_{t}, T_{t - 1}, T_{t - 2}, S_{t}, S_{t - 1}, H_{t}$
Case 7	$T_{t}, T_{t - 1}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$
Case 8	$T_{t}, T_{t - 1}, T_{t - 2}, S_{t}, S_{t - 1}, S_{t - 2}, H_{t}$

Table 5. Forecast results of different models.

Case	SVM Parameters (c, g)	MAPE (%)	CV-RMSE	R² (%)
Case 1	c = 0.871, g = 16	12.57	5.22	87.15
Case 2	c = 0.5, g = 16	13.41	5.65	85.06
Case 3	c = 0.5, g = 16	10.29	4.53	90.34
Case 4	c = 0.5, g = 6.96	13.32	5.58	85.47
Case 5	c = 0.5, g = 16	8.48	3.84	93.07
Case 6	c = 0.871, g = 9.19	9.52	4.32	91.29
Case 7	c = 0.233, g = 16	8.73	3.98	92.57
Case 8	c = 1.149, g = 4.595	11.62	4.96	88.31

Table 6. Comparison of results of different prediction models.

Model	MAPE (%)	CV-RMSE	R² (%)
GA-SVM	2.71	1.58	97.85
SVM	8.48	3.84	93.07
BP	8.47	3.92	93.41
LSTM	9.95	4.51	92.01
DNN	9.01	4.01	92.14

Table 7. Joint prediction model effectiveness evaluation.

Result	MAPE (%)	CV-RMSE	R² (%)
Low	2.73	0.0249	99.68
R	2.59	0.0293	98.15
Up	2.84	0.0319	97.27

Table 8. Comparison of results of joint prediction models.

Model	MAPE (%)	CV-RMSE	R² (%)
IG-BP	3.07~5.62	0.0417~0.0591	95.19~98.84
IG-LSTM	4.51~5.70	0.0627~0.0892	93.97~98.01
IG-DNN	4.99~7.02	0.0881~0.1003	94.45~97.67
IG-SVM	2.59~2.84	0.0249~0.0319	97.27~99.68

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
