1. Introduction
Centralized heating is a widely used system that delivers heat to end users for direct use [1]. Its heat sources include combined heat and power plants, various heat pumps, solar energy, and boilers [2]. As the greenhouse effect becomes increasingly severe, the rational use of heat energy in centralized heating is receiving more and more attention. Because centralized heating is a complex system with time lags and coupling, how to supply heat on demand in a scientific way has become an urgent problem [3]. In recent years, heat load forecasting has provided a scientific and technical basis for addressing this problem [4]. According to the length of the forecast period, heat load forecasting can be divided into long-term, medium-term, short-term, and very short-term forecasting [5]. The corresponding periods are more than one year, several weeks to one year, one day to one week, and less than one day, respectively. Long-term and medium-term forecasts can be used to estimate load trends when long-term solutions are needed in the system design phase [6]. Short-term and very short-term forecasts can be used to control and schedule the system according to the actual load demand [7].
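The four forecast categories above can be expressed as a small lookup. The following is a minimal Python sketch, not part of the original study (which used MATLAB). The function name `classify_forecast_horizon` is hypothetical, and the handling of the boundaries (for example, treating anything between one week and one year as medium-term) is an assumption, since the text only gives approximate ranges.

```python
from datetime import timedelta


def classify_forecast_horizon(horizon: timedelta) -> str:
    """Map a forecast period to one of the four categories in the text.

    Boundary handling is an assumption: the text says "several weeks to
    one year" for medium-term, so periods between one week and one year
    are treated as medium-term here.
    """
    if horizon < timedelta(days=1):
        return "very short-term"
    if horizon <= timedelta(weeks=1):
        return "short-term"
    if horizon <= timedelta(days=365):
        return "medium-term"
    return "long-term"


if __name__ == "__main__":
    for hours in (6, 72, 24 * 90, 24 * 730):
        label = classify_forecast_horizon(timedelta(hours=hours))
        print(f"{hours} h ahead: {label}")
```

Under these assumptions, a 24 h or 48 h forecast of the kind discussed later in this section is classified as short-term.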
Heat load forecasting is the prediction of future heat load levels in a building or area under specific meteorological conditions [8]. Such predictions help architects, designers, and energy managers plan buildings and infrastructure more effectively [9], which can improve energy efficiency and reduce energy costs. Currently, numerical models and machine learning algorithms are commonly used for heat load forecasting [10]. Typical techniques include the following.
- 1. Method based on empirical formulas
This method uses empirical formulas to estimate the heat load of a system or region. The formulas are derived from historical data and the characteristics of specific buildings or places. However, this method is not very accurate [11].
- 2. Method based on physical models
This method uses the physical characteristics of the building or area, meteorological data, and energy transfer theory to build a mathematical model that predicts the heat load. It is highly accurate, but it requires a large amount of input data and is computationally complex.
- 3. Machine learning-based method
This approach uses machine learning algorithms to predict heat loads, with models trained on historical and meteorological data. Commonly used algorithms include linear regression [12], support vector machines [13], and clustering algorithms [14]. Its advantage is high accuracy, but it requires a large amount of training data.
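To make the machine learning-based approach concrete, the sketch below fits the simplest model named above, a linear regression, to predict heat load from outdoor temperature. It is illustrative only: the function names are hypothetical, the numbers are invented for the example and are not data from this study, and ordinary least squares is written out in plain Python so that no external library is assumed.

```python
def fit_linear(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Evaluate the fitted line at x."""
    return slope * x + intercept


if __name__ == "__main__":
    # Made-up training data: outdoor temperature (degC) and heat load
    # (arbitrary units). Colder weather corresponds to a higher load.
    outdoor_temp = [-10.0, -5.0, 0.0, 5.0, 10.0]
    heat_load = [50.0, 45.0, 40.0, 35.0, 30.0]

    slope, intercept = fit_linear(outdoor_temp, heat_load)
    print("slope:", slope, "intercept:", intercept)
    print("predicted load at -8 degC:", predict(slope, intercept, -8.0))
```

A single-input linear model like this ignores the lag and coupling of a real heating system, which is one reason the more expressive models discussed below are preferred in practice.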
The methods mentioned above provide scientific guidance for heat load prediction [15]. Among them, machine learning methods are more popular in heat load forecasting because of their high accuracy and flexibility [16]. Machine learning has been applied to data mining, computer vision, natural language processing, and other fields [17]. In load forecasting, it is mainly used for regression prediction [18]. In terms of prediction methods, backpropagation (BP) neural networks, artificial neural networks (ANNs), recurrent neural networks (RNNs), and related methods are widely used [19]. Xie et al. [20] improved the traditional ground source heat pump by introducing a hybrid hourly prediction model that integrates multiple overlapping extended long short-term memory (LSTM) networks and backpropagation neural networks (BPNNs). Bergsteinsson et al. [21] proposed a framework that combines temporal hierarchies with adaptive estimation, improving the accuracy of heat load forecasting by optimally combining the predictions of multiple aggregation levels through an adjustment process. Liu et al. [22] applied LSTM to heat load forecasting for cogeneration units. Kim et al. [23] used an optimal nonlinear autoregressive exogenous (NARX) neural network model to improve load forecasting accuracy. In general, machine learning has been widely applied in load forecasting.
In terms of model inputs, external factors such as outdoor temperature [24], outdoor wind speed [25], and light intensity are usually considered. Among them, outdoor temperature has a greater influence on the heat load [26]. Some studies also consider internal factors, such as the supply temperature [27], the return water temperature [28], and the supply flow rate of the heating system. The effect of previous heat loads on the system is sometimes considered as well [29]. Incidental factors, such as the behavior of indoor occupants [30] and the number of occupants, can also affect the heat load. Some researchers treat special days separately when predicting heat loads, which effectively prevents the peculiarities of those days from distorting the overall system data [31]. Very short-term heat load prediction that incorporates external factors is widely used to ensure the efficient use of building energy [32]. Usually, historical hourly or three-hourly data are used as model inputs to predict the heat load over the next 24 h or 48 h and to guide the adjustment of actual heating [33]. The main challenges in heat load forecasting are turning historical data into a predictive model and ensuring the accuracy of that model. To address this problem, Huang et al. [34] used a convolutional neural network to extract feature vectors of environmental factors and then applied the K-means clustering algorithm to build a feature clustering model for various energy loads, from which the load predictions for multi-energy systems were obtained. Gu et al. [35] used outdoor temperature and historical heat loads as influencing factors. In summary, because heating systems are characterized by lag and complexity, researchers often take many internal and external factors into account when making predictions.
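Using historical hourly data as inputs to predict the next 24 h or 48 h is usually implemented by slicing the series into sliding windows. The following Python sketch illustrates the idea under stated assumptions: the function name `make_windows`, the window lengths, and the synthetic series are all hypothetical and are not taken from the studies cited above.

```python
def make_windows(series, n_in, n_out):
    """Split a series into (past, future) training pairs.

    Each pair holds n_in consecutive past values as model input and the
    n_out values that immediately follow as the prediction target.
    """
    samples = []
    for start in range(len(series) - n_in - n_out + 1):
        past = series[start:start + n_in]
        future = series[start + n_in:start + n_in + n_out]
        samples.append((past, future))
    return samples


if __name__ == "__main__":
    # Synthetic hourly heat load: 10 days of placeholder values.
    hourly_load = [float(h % 24) for h in range(240)]

    # Assumed setup: 48 h of history in, 24 h of forecast out.
    samples = make_windows(hourly_load, n_in=48, n_out=24)
    print("number of training samples:", len(samples))
    print("input length:", len(samples[0][0]),
          "target length:", len(samples[0][1]))
```

Additional factors such as outdoor temperature or supply temperature can be windowed in the same way and stacked alongside the historical load to form a multivariate input.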
LSTM is also widely used in the field of process control. For example, an LSTM-ANN agent model was created and applied to predict woodchip degradation, cellulose depolymerization, Kappa number, and cellulose aggregation [36]. In this paper, we used MATLAB R2020b to run our experiments and analyzed the effects of prediction methods and model inputs on the experimental results. LSTM was adopted as the main prediction method, and its hyperparameters were optimized using the Bayesian algorithm to improve the prediction accuracy.
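For reference, the LSTM cell is commonly written with the gate equations below. This is the standard textbook formulation, given here as an assumption; the exact variant used by the MATLAB implementation in this paper may differ. Here $x_t$ is the input at time step $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, $\odot$ element-wise multiplication, and $W$, $U$, $b$ the learned weights and biases.

```latex
\begin{align}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) && \text{(input gate)} \\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) && \text{(output gate)} \\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) && \text{(candidate cell state)} \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{(cell state update)} \\
h_t &= o_t \odot \tanh\left(c_t\right) && \text{(hidden state)}
\end{align}
```

The dimension of $h_t$ corresponds to the number of recurrent units in the hidden layer, which is one of the hyperparameters tuned in this work.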
The remainder of this paper is organized as follows. Section 2 describes the source and composition of the data, smooths its outliers, and analyzes the data using the Pearson correlation analysis method. Section 3 describes the forecasting methods used and presents the Bayesian algorithm and the optimization process. In Section 4, the prediction results are analyzed, and error evaluation metrics are used to demonstrate the strengths and weaknesses of the prediction results. Section 5 presents the conclusions of this paper and briefly discusses the issues that need to be addressed in the future.
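As an illustration of the Pearson correlation analysis mentioned for Section 2, the sketch below computes the correlation coefficient between each candidate factor and the heat load and keeps the factors whose absolute correlation exceeds a threshold. The function names, the threshold of 0.5, and all numbers are assumptions made for this example; they are not the values or data used in the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def select_features(candidates, target, threshold=0.5):
    """Return names of candidate factors with |r| >= threshold.

    The threshold is a hypothetical choice for illustration only.
    """
    return [name for name, values in candidates.items()
            if abs(pearson(values, target)) >= threshold]


if __name__ == "__main__":
    # Made-up observations for the example.
    heat_load = [50.0, 45.0, 40.0, 35.0, 30.0]
    candidates = {
        "outdoor_temperature": [-10.0, -5.0, 0.0, 5.0, 10.0],
        "wind_speed": [3.0, 1.0, 4.0, 1.0, 5.0],
    }
    for name, values in candidates.items():
        print(name, round(pearson(values, heat_load), 3))
    print("selected:", select_features(candidates, heat_load))
```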
5. Conclusions
This study analyzed various factors related to the heat load of a real system in long-term operation. Considering the influence of the different factors, those with high correlation were selected as model inputs. For data pre-processing, the 3σ principle was applied to handle outliers so that the model could fit the data well. To address potential noise, the moving average method was used to smooth the data, making them more reliable and easier to analyze.
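The two pre-processing steps described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' MATLAB code: the text does not say how flagged outliers were treated, so replacing them with the series median is an assumption, as are the trailing (rather than centered) moving average window and both function names.

```python
import statistics


def replace_outliers_3sigma(values):
    """Replace points outside mean +/- 3 standard deviations.

    Assumption: flagged points are replaced with the series median.
    The source only states that the 3-sigma principle was applied.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    lower, upper = mean - 3 * std, mean + 3 * std
    median = statistics.median(values)
    return [v if lower <= v <= upper else median for v in values]


def moving_average(values, window):
    """Trailing moving average; early points use a shorter window."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


if __name__ == "__main__":
    # Made-up series: a flat load with one spurious spike.
    raw = [10.0] * 20 + [100.0]
    cleaned = replace_outliers_3sigma(raw)
    print("spike after cleaning:", cleaned[-1])
    print("smoothed:", moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3))
```

Note that the 3σ rule needs a reasonably long series to be useful: in a very short sample, a single extreme value inflates the standard deviation so much that it may not be flagged.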
For the prediction method, an LSTM optimized by the Bayesian algorithm (BO-LSTM) was selected. The Bayesian algorithm was used to optimize the initial learning rate, the ridge regularization coefficient, and the number of recurrent units in the hidden layer of the LSTM. BP, SVM, and LSTM were selected for comparison, and RMSE, R², MAE, and MBE were chosen as evaluation indexes for the prediction results of these methods. The final results show that BO-LSTM achieved the best fit. The RMSE decreased most significantly at a step size of 72 h, with a decrease of 0.15089. At the other step sizes, the RMSE of BO-LSTM also decreased, and the other two evaluation indexes also decreased. These results indicate that the Bayesian-optimized LSTM has strong prediction ability and general applicability.
The object of this study is not dynamic, and real-time forecasting under online, dynamic conditions is the problem that we want to solve next. A further open issue is how to apply the results of hourly forecasts to actual heating adjustments. We believe that a real-time data acquisition and prediction platform can be built that transmits the acquired data to the prediction software via Object Linking and Embedding for Process Control (OPC) and then transmits the predicted data to the actuator to achieve actual control.
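The four evaluation indexes named in this section can be computed as in the following Python sketch. The definitions are the conventional ones and are assumed here rather than quoted from the paper; in particular, the sign convention for MBE (predicted minus actual, so that a positive value means over-prediction) is an assumption. The function name `evaluate` and the sample numbers are hypothetical.

```python
import math


def evaluate(actual, predicted):
    """Return RMSE, R2, MAE, and MBE for a set of predictions.

    Errors are taken as predicted - actual (assumed sign convention).
    """
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1.0 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        "MBE": sum(errors) / n,
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    actual = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.5, 2.5, 3.5, 4.5]
    for name, value in evaluate(actual, predicted).items():
        print(name, value)
```

Lower RMSE and MAE and an MBE closer to zero indicate a better fit, whereas R² improves as it approaches 1.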