1. Introduction
Efficient property management of large organizations is a serious challenge. This problem increases with the number and territorial distribution of properties, with the diversity of their size and functions. One of the important factors related to the operation and management of buildings is electricity consumption. It is needed for almost every device in our work environment: telephone, computer, office equipment, workplace lighting, room heating, and many others.
In addition, power outages affect both private consumers and enterprises. For enterprises, supply shortages prevent the business from running effectively. As a consequence, they may cause late delivery of services or goods and generate large additional costs or financial penalties for failing to meet the terms of trade agreements. External factors and random events can also unexpectedly disrupt production for the domestic and global markets as well as electricity consumption patterns, e.g., the outbreak of the COVID-19 epidemic or a war in a country that supplies energy-producing raw materials. This may negatively affect the management of the enterprise and its planned electricity costs. Such situations mean that electricity consumers, especially enterprises, have to plan their electricity consumption. Planning becomes more complicated for organizations with many different buildings that consume significant amounts of electricity. An important aspect of management is, therefore, predicting energy consumption, which makes it possible to plan energy demand, reduce the number of people in the office when electricity prices are high, or build up alternative forms of energy supply during periods of lower consumption and lower prices.
Data-driven building energy consumption prediction has gained a lot of attention in recent years. The importance of energy consumption forecasting is highlighted in a review that identifies research areas that may require more attention: long-term energy consumption forecasting in buildings, energy consumption forecasting in residential buildings, and forecasting of lighting energy consumption in buildings [
1]. The relatively limited research in these areas may be due to a lack of sufficient data and/or the complexity of occupant energy use behavior in these contexts. Therefore, in this article, we present the results of prediction research based on real data.
The research aims to forecast electricity consumption in public buildings using time series methods, which will facilitate energy management in groups of heterogeneous distributed facilities. The study analyzed real historical data on energy demand for selected company office buildings in 2018–2021. The dataset contains energy consumption readings sampled every 15 min from electricity meters and, therefore, contains over 140,174 readings. A group of five different buildings, located in different parts of southern Poland, was selected for this research. These buildings were built in different periods and are equipped with various automation devices.
In the review [
2], the authors indicate that different models serve different purposes, have different scopes, were trained on different datasets, and use different features for prediction. All the models examined in the literature have their own strengths and weaknesses and perform differently under different circumstances.
The main contribution is an examination of real time series data of electrical energy consumption in a complex of heterogeneous buildings of an organization using time series analysis methods (the Holt–Winters model and the ARIMA/SARIMA model) and neural networks (Deep Neural Network, Recurrent Neural Network, and Long Short-Term Memory) to assess the usefulness of these methods in supporting decision-making by managers of complexes of various types of buildings. Additionally, the article responds to the need of a real enterprise to test the possibilities of forecasting energy consumption in a set of heterogeneous buildings using real data. The research results will be used to develop an easy-to-use decision support system for people without knowledge of prediction methods or programming skills.
The paper is organized as follows. In
Section 2, we present a literature review of data prediction methods, the Holt–Winters model and ARIMA/SARIMA model, and neural networks such as Deep Neural Networks, Recurrent Neural Networks, and Long Short-Term Memory. In
Section 3, we describe the studied building complex and its parameters. In
Section 4, we present data preparation and results for experiments using the Holt–Winters model, ARIMA model, Deep Neural Network, Recurrent Neural Network, and Long Short-Term Memory.
Section 5 contains conclusions and future work.
2. Methods
Various time series data analysis methods can be used to predict electricity consumption. Some of the most popular ones currently include the naive model with seasonality, linear regression, and Facebook Prophet, which we used to examine the impact of external factors on the prediction of electricity consumption in an office building with photovoltaic panels [
3]. In review [
1], the authors report that, in total,
of the studies of building energy consumption prediction, employed ANN and
used SVM for training their models. Only
of the studies made use of decision trees. Meanwhile,
of the studies utilized other statistical algorithms, including MLR, OLS, and ARIMA. Some of the papers present the use of advanced forecasting methods to analyze various aspects of energy consumption and the response of structures to extreme weather conditions. The methods used include deep learning, probabilistic models, and Bayesian techniques, which are applied to improve forecast accuracy and account for uncertainty [
2,
4,
5,
6,
7,
8]. The techniques employed are as follows:
Deep learning (CNN, LSTM, BiLSTM, QRF): used for time series forecasting, such as wind speed and the response of bridges to extreme weather conditions.
Probabilistic and Bayesian models: applied to account for uncertainty in energy consumption forecasts and to predict long-term trends.
Hybrid methods: a combination of physical and statistical models, such as integrating numerical forecasts with on-site measurements to increase the accuracy of long-term forecasts.
In the case of a complex of heterogeneous commercial buildings, energy consumption may repeat over time due to the cycle of activity and the habits of employees and customers (e.g., public holidays and vacation periods). Therefore, methods designed for forecasting data that show both trend and seasonality, such as the Holt–Winters model and the ARIMA/SARIMA model, are suitable.
2.1. Holt–Winters Model
The Holt–Winters method, also known as exponential smoothing for trended and seasonal time series, is a popular and simple time series forecasting technique. The Holt–Winters method is intended to predict future values based on available historical data [
9]. The model has two variants: additive and multiplicative.
This iterative method takes into account the two main components of a time series: trend and seasonality.
Trend: The trend refers to long-term changes in a time series. The method tracks the evolution of the level over time and estimates changes in the level using exponential smoothing. If the trend is linear, it is applied additively. If the trend grows or shrinks exponentially, it is applied multiplicatively.
Seasonality: Seasonality refers to cyclical patterns in data that repeat at regular intervals, such as monthly or quarterly. The Holt–Winters method accounts for seasonality by exponential smoothing over seasons. If the peaks and valleys of the seasonal pattern are constant over time, it is applied additively. If the size of the seasonal fluctuations tends to increase or decrease with the level of the time series, it is applied multiplicatively.
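The additive variant can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this study: the function name, the simple initialization from the first two seasons, and the smoothing coefficients `alpha`, `beta`, and `gamma` are assumptions made for illustration only. The multiplicative variant would replace the additions and subtractions of the seasonal term with multiplications and divisions.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters smoothing (illustrative sketch).

    Returns `horizon` forecasts that follow the end of `series`.
    alpha, beta, gamma are smoothing coefficients in [0, 1] for the
    level, trend, and seasonal components (assumed to be given).
    """
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization (an assumption, other schemes exist):
    # level = mean of the first season,
    # trend = average per-step change between the first two seasons,
    # seasonal indices = deviations of the first season from its mean.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonal = [series[i] - level for i in range(m)]

    # Update the three components for every observation after the first season.
    for t in range(m, len(series)):
        y = series[t]
        s_prev = seasonal[t % m]
        prev_level = level
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s_prev

    # Forecast h steps ahead: level + h * trend + matching seasonal index.
    n = len(series)
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Usage: three identical "years" of four periods each, forecast one more year.
history = [10, 20, 30, 40] * 3
forecast = holt_winters_additive(history, season_length=4,
                                 alpha=0.5, beta=0.3, gamma=0.2, horizon=4)
```

For a series that repeats exactly without trend, as in the usage example, the forecast reproduces the seasonal pattern. Real consumption data would require tuning the three coefficients.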
The paper [
10] describes a study on the use of the Holt–Winters and Prophet models for long-term peak load forecasting in Kuwait. The study showed that the forecasted maximum peak load will increase to approximately 18,550 MW for the Prophet model and 19,588 MW for the Holt–Winters model by 2030 in Kuwait. Additionally, it was determined that the best months for scheduling preventive maintenance for the years 2020 and 2021 are from November 2020 to March 2021 for both models.
The paper [
11] discusses the development of a hybrid model for ultra-short-term forecasting of residential electricity consumption in China. This model combines the Holt–Winters (HW) method with the Extreme Learning Machine (ELM) network. Original data are processed using a moving average filter, and then subjected to a two-step prediction: HW for the linear component and ELM for the nonlinear component. The study compared the model with other methods, demonstrating its higher accuracy, particularly for shorter training sets. Ultimately, the proposed HW-ELM model proved to be more efficient than other established methods for forecasting residential electricity consumption.
The paper [
12] presents a comparison between the ARIMA model and the Holt–Winters model in terms of MAE, RSS, MSE, and RMS criteria for predicting total primary energy consumption in the USA. The data examined span the period from January 1973 to December 2016.
The paper [
13] proposes the application of the Holt–Winters model to improve the forecasting of energy meter demand by optimizing the smoothing coefficient. The detailed operation of the model and the method for selecting the optimal coefficient are presented. This model successfully predicted the demand for energy meters from 2015 to 2017 and proved effective in forecasting the demand for 2018 as well.
This method is also widely used in applications other than energy consumption prediction. In the article [
14], the authors use the Holt–Winters method to forecast economic time series. Seasonality and a strong trend are characteristic of them. The tested method is capable of better capturing the various patterns in the data.
The use of the Holt–Winters method for analyzing commodity prices in the European Union and Western Balkans is another example. According to the authors of the article [
15], the Holt–Winters method produces the second-best result in both cases, regardless of the data.
Macroeconomic variables such as exports of goods are among the most important economic parameters of a country. Forecasting such data makes it possible to estimate, among other things, the revenue that can be expected in the coming year. The authors [
16] used the method to analyze the difficult case (not necessarily stable) of the quantity and quota of exports in Serbia.
Transport is an integral part of human life. Many researchers deal with problems related to transport. An example is forecasting street traffic in the city depending on the time of day [
17,
18]. The Holt–Winters method makes it possible to effectively predict periods of increased traffic. This allows, for example, street lights to be controlled in such a way as to reduce traffic congestion.
Another means of transport is the railway. Important information is the number of people who use local transport. The forecasted flow of people allows for greater safety during rush hours, planning a more frequent network of connections on selected routes, or planning the work of people involved in urban transport. The authors [
19] forecast passenger flow to plan not only the basic features but also to see how a rail transit plan can be prepared among passengers.
2.2. ARIMA/SARIMA Model
ARIMA, or Autoregressive Integrated Moving Average, is a time series analysis method used to forecast future values based on historical data. SARIMA is an extension to ARIMA that supports the direct modeling of the seasonal component of the series.
The effectiveness of the method is demonstrated when the time series contains some levels of trend or seasonality.
The method is divided into three main components:
Autoregression (AR): Refers to modeling the relationship between current and previous values in a time series. In an autoregressive model, this value is determined by a linear combination of previous values along with random errors.
Integration, i.e., differencing (I): Refers to transforming a time series into a stationary series by calculating the differences between successive values. Differencing removes the trend (and, when applied at the seasonal lag, the seasonality) to obtain a stationary time series.
Moving Average (MA): Used to model the relationship between the current value and the model errors from previous time periods, so that the influence of past forecast errors on the current value is taken into account.
ARIMA is defined by three parameters: p, d, and q, where p is the number of lags in the autoregressive model, d denotes the degree of differencing (d = 1 means that the series was differenced once), and q is the number of lags in the moving average model.
When creating the ARIMA forecast model, the above parameters are selected with the help of diagnostic plots: the autocorrelation and partial autocorrelation plots (correlograms) of the time series are analyzed.
After selecting the parameter values, we can use the model to forecast data based on historical data [
20].
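The differencing step and the autocorrelation values that are read off a correlogram can be sketched in plain Python as follows. This is a simplified illustration under our own naming (`difference`, `autocorrelation`), not the code used in the study, and it omits the partial autocorrelation, which a statistics library would normally provide.

```python
def difference(series, lag=1):
    """Difference a series: lag=1 removes a linear trend (d = 1),
    lag=season length performs seasonal differencing."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def autocorrelation(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (the values plotted
    in an autocorrelation correlogram)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


# Usage: a series with a linear trend becomes constant after one differencing.
trended = [3, 5, 7, 9, 11]
once_differenced = difference(trended)            # [2, 2, 2, 2]

# An alternating series shows a strong negative autocorrelation at lag 1.
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
acf = autocorrelation(alternating, max_lag=2)
```

Slowly decaying autocorrelations typically suggest that further differencing is needed, while a spike at the seasonal lag points to a seasonal (SARIMA) term.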
The authors in the article [
15] show that the ARIMA model works best for prices of goods from the European Union. This is related to greater price stability on the market and the well-known price stabilization policy. This makes it easier to forecast price inflation.
The spread of disease is among the phenomena that researchers most often seek to forecast, and 2020 produced a lot of data regarding COVID-19. This allowed researchers to use forecasting methods to determine the spread of the disease in different parts of the world. In the articles [
21,
22,
23,
24], the authors use the ARIMA model to forecast the epidemiological trend in Italy, India and China, based on data collected from the Ministry of Health.
An accurate economic forecast influences government policy. Forecasting helps in making better decisions on infrastructure creation, budget planning, and economic planning. In the studies [
25,
26], the authors project total health care spending, public health spending, and out-of-pocket payments.
Many countries around the world are severely affected by air pollution, which harms health and can lead to death. The authors [
27] used a time series forecasting approach to predict the future levels of various pollutants in this paper. The effectiveness of the proposed method using SARIMA is demonstrated by experimental analysis of air pollution level forecasts in Bhubaneswar.
An important aspect is fire safety. In the article [
28], the authors examine the frequency of fires in China. Their regularity allows the SARIMA model to be used to predict the event of a fire outbreak. The results presented in the article show the effectiveness of the method.
2.3. Neural Networks: DNN, RNN, LSTM
A DNN (Deep Neural Network) is an artificial neural network that comprises interconnected neurons arranged in multiple layers. The term “deep” signifies the presence of numerous hidden layers between the input and output layers.
Neurons serve as the fundamental components of DNNs, functioning as processing units. Each neuron accepts inputs, performs computations on the data, and produces an output. This output is then transmitted as input to subsequent neurons in the following layers, see [
29].
An artificial neuron, which constitutes the basic computational unit in a DNN, consists of two essential elements: a summation unit and an activation function. The summation unit combines the weighted inputs of the neuron, and the resulting value is passed through the activation function, introducing nonlinearity to the model. This amalgamation of the summation unit and activation function enables DNNs to effectively model intricate dependencies and patterns in the data.
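The two elements of the artificial neuron can be written down directly. The sketch below is illustrative only: the sigmoid activation, the bias term, and the function names are our assumptions and are not taken from the networks trained in this study.

```python
import math


def sigmoid(z):
    """A common nonlinear activation function (assumed here for illustration)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Summation unit followed by an activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias  # weighted sum
    return activation(z)                                    # nonlinearity


def dense_layer(inputs, weight_rows, biases, activation=sigmoid):
    """A layer is a list of neurons that all receive the same inputs."""
    return [neuron(inputs, w, b, activation)
            for w, b in zip(weight_rows, biases)]


# Usage: the weighted sum is 1*0.5 + 2*(-0.25) + 0 = 0, and sigmoid(0) = 0.5.
single_output = neuron([1.0, 2.0], [0.5, -0.25], bias=0.0)

# Two neurons in one layer; their outputs would feed the next layer.
layer_output = dense_layer([1.0, 2.0],
                           weight_rows=[[0.5, -0.25], [1.0, 1.0]],
                           biases=[0.0, -3.0])
```

Stacking several such layers between the input and the output gives the "deep" structure described above.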
Within a DNN, information flows from the input layer, where it is initially provided as input data, through the hidden layers until it reaches the output layer where the final results are generated. During the training process of a DNN, the connection weights between neurons are adjusted iteratively to minimize the error between the network’s outputs and the expected values, for more details see [
30].
A Recurrent Neural Network (RNN) is a specialized type of artificial neural network designed to handle sequential data that relies on the input order. Unlike conventional feedforward neural networks, RNNs possess a feedback mechanism that enables them to maintain an internal memory or context of previous inputs. This memory empowers them to capture and process patterns that unfold over time in sequential data [
31].
The distinguishing characteristic of RNNs lies in their recurrent connections, which allow the output of a neuron to be fed back as input to itself or other neurons in the network. This recurrent architecture facilitates the persistence and propagation of information across time steps, enabling the network to consider not only the current input but also the historical context of preceding inputs.
The fundamental component of an RNN is the recurrent neuron, which possesses an internal hidden state that gets updated at each time step based on the present input and the previous hidden state. This hidden state functions as a memory mechanism, enabling the network to learn and retain long-term dependencies within sequential data [
32].
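The hidden-state update can be sketched for the simplest case of a single recurrent neuron with scalar input. The tanh activation, the weight names `w_x` and `w_h`, and the bias `b` are assumptions chosen for illustration; they do not describe the configuration used in the experiments.

```python
import math


def rnn_forward(sequence, w_x, w_h, b, h0=0.0):
    """Run one recurrent neuron over a sequence.

    At each time step the new hidden state depends on the present
    input and on the previous hidden state:
        h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)
    Returns the list of hidden states, one per time step.
    """
    h = h0
    states = []
    for x in sequence:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Usage: the same values in a different order give a different final state,
# because the hidden state carries the context of earlier inputs.
states_a = rnn_forward([1.0, 0.0], w_x=1.0, w_h=0.5, b=0.0)
states_b = rnn_forward([0.0, 1.0], w_x=1.0, w_h=0.5, b=0.0)
```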
Long Short-Term Memory (LSTM) is an advanced type of recurrent neural network (RNN) architecture developed to overcome the limitations of traditional RNNs in capturing and preserving long-term relationships within sequential data. LSTMs excel in tasks that involve processing and comprehending sequences of information [
33].
The core breakthrough of LSTMs lies in their memory cell, which possesses the ability to selectively remember or forget information over extended periods. This memory cell comprises essential components such as input gates, forget gates and output gates. These gates regulate the flow of information into, out of, and within the memory cell, granting the network the capacity to control the retention or dismissal of relevant information.
The input gate determines which portions of the input are stored in the memory cell, the forget gate determines which information should be erased from the memory cell, and the output gate decides which segments of the memory cell should be exposed as the output of the LSTM unit. Through dynamic updates to the gate states based on input and previous states, LSTMs can effectively retain significant information and discard irrelevant information. This mechanism addresses the challenge of vanishing gradients and facilitates the modeling of long-term dependencies.
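The interaction of the three gates can be illustrated with a scalar LSTM cell. This is a sketch of the commonly used formulation, with hypothetical parameter names stored in a dictionary; it is not the network trained in this study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.

    p maps hypothetical parameter names to values: for each of the input
    gate (i), forget gate (f), output gate (o), and candidate value (g)
    there is an input weight w_*, a recurrent weight u_*, and a bias b_*.
    """
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # what to store
    f = sigmoid(p["w_f"] * x + p["u_f"] * h_prev + p["b_f"])    # what to keep
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # what to expose
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate
    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # output of the LSTM unit
    return h, c


def make_params(b_i, b_f, b_o):
    """Helper for the usage example: unit weights, gate biases as given."""
    params = {name: 1.0 for name in
              ("w_i", "u_i", "w_f", "u_f", "w_o", "u_o", "w_g", "u_g")}
    params.update({"b_i": b_i, "b_f": b_f, "b_o": b_o, "b_g": 0.0})
    return params


# Usage: with the forget gate open and the input gate closed, the memory
# cell keeps its previous content; with the forget gate closed, it is erased.
_, c_kept = lstm_step(0.5, 0.0, 2.0, make_params(b_i=-50.0, b_f=50.0, b_o=0.0))
_, c_erased = lstm_step(0.5, 0.0, 2.0, make_params(b_i=-50.0, b_f=-50.0, b_o=0.0))
```

Because the memory cell is updated additively, information can be carried across many time steps, which is the property the gates are designed to protect.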
LSTMs have found wide-ranging applications, including natural language processing, speech recognition, machine translation, sentiment analysis, and time series prediction. Their ability to capture and leverage long-term dependencies in sequential data makes them highly valuable in tasks where context and memory retention are critical, for more details see [
34].
3. Characteristics of the Studied Building Complex
A group of five various buildings located in different parts of southern Poland was chosen for the study. These buildings were built in different periods of time and are equipped with a variety of automation devices.
The functional characteristics of the examined buildings are presented in
Table 1. The technical characteristics of buildings are as follows.
Building No. A has the following technical features:
thermal comfort maintenance system consisting of ventilation, air conditioning, and heating managed by BMS;
a heating system based on system heat sources fed from the city’s 140 kW district heating network. The network heat is used to heat the building with radiators and to power the heaters in the ventilation systems. In addition, the air handling unit is equipped with a four-section electric heater with a power of 45 kW, which is switched on when the heaters supplied by the network heat source are not working;
ventilation unit is equipped with a recuperation system based on rotary exchangers. The selection of the amount of fresh air prepared and supplied to the building is carried out on the basis of measurements of carbon dioxide in rooms;
the building is equipped with a structural network, supplied with guaranteed voltage capable of sustaining operation for at least 24 h, using a battery charging system and a diesel-powered generator;
the facility is supplied from two independent circuits of the power grid. It is equipped with technical security systems for the facility, including CCTV, SKD, and fire protection. In addition, the technical section has charging stations for eight electric vehicles. However, these stations are used to charge one vehicle per day on average;
the building has a 40 kWp photovoltaic system and is equipped with installations to minimize environmental impact, such as a system related to the use of gray water and a system for intelligent water metering in sewage systems.
Building No. B has the following technical features:
system of thermal comfort maintenance consisting of ventilation, air conditioning, and heating. This system is powered by heat pumps with a total heating capacity of 69.6 kW and a cooling capacity of 58.8 kW. The lower heat source is 16 vertical boreholes of 100 m. The upper heat source is a water circuit. The heat pump buffer consists of two tanks with a capacity of 1500 dm3;
the ventilation is equipped with a recuperation system based on cross-flow heat exchangers. The selection of the amount of fresh air prepared and delivered to the building is based on normative requirements for the number of people working in the building;
the building is equipped with a structural network, supplied with guaranteed voltage capable of maintaining operation for at least 24 h with the use of the system of battery charging and a diesel generator;
the facility is powered by two independent power grid circuits. It is equipped with technical security systems for the facility, including CCTV, SKD, and fire protection;
the building has a photovoltaic installation with a capacity of 53 kWp and three wind turbines with a capacity of 6 kW.
Building No. C has the following technical features:
system of thermal comfort maintenance consisting of ventilation, air conditioning, and heating. This system is powered by 2 air-to-water heat pumps with a total capacity of heating power of 67.20 kW. The heat pump buffer is a tank with a capacity of 800 dm3. The garage is heated by the airflow heated by a heat pump;
the ventilation is equipped with a recuperation system based on cross-flow heat exchangers. The selection of the amount of fresh air prepared and delivered to the building is based on normative requirements for the number of people working in the building. Ventilation is equipped with a heater powered by a heat pump;
the building is equipped with a structural network, supplied with guaranteed voltage capable of maintaining operation for at least 24 h with the use of the system of battery charging and a diesel generator;
the facility has one power supply from the power grid. It is equipped with the systems of technical protection of the facility, including CCTV, SKD, fire protection—staircase smoke extraction;
the building has a photovoltaic installation with a capacity of 20 kWp.
Building No. D has the following technical features:
system of thermal comfort maintenance consisting of ventilation, air conditioning, and heating. This system is powered by heat pumps with a total heating capacity of 67.20 kW. The heat pump buffer is a tank with a capacity of 800 dm3;
the ventilation is equipped with a recuperation system based on cross-flow heat exchangers. The selection of the amount of fresh air prepared and delivered to the building is based on normative requirements for the number of people working in the building. Ventilation is equipped with a heater powered by a heat pump;
the facility has one power supply from the power grid. It is equipped with the systems of technical protection of the facility, including CCTV, SKD, fire protection—staircase smoke extraction;
the building has a photovoltaic installation with a capacity of 8.16 kWp.
The complex of buildings No. E includes five buildings; a common photovoltaic installation provides power for the entire property. Technical characteristics of particular buildings of the complex are as follows:
Complex No. E_1: a system of thermal comfort maintenance consisting of ventilation, air conditioning, and central heating powered by electric furnaces with a total power of 48 kW; the facility is powered by two independent power grid circuits. It is equipped with the systems of technical protection of the facility, including CCTV, SKD, and fire protection (staircase smoke extraction); the building draws electricity from a common photovoltaic installation with a capacity of 20.91 kWp, which provides energy for the entire property;
Complex No. E_2: a system of thermal comfort maintenance consisting of ventilation, air conditioning, and heating with accumulation stoves with a total power of 30 kW; selection of the amount of fresh air prepared and supplied to the building is carried out based on normative requirements for the number of people working in the building; the facility is powered by two independent power grid circuits. It is equipped with the systems of technical protection of the facility, including CCTV, SKD, and fire protection (staircase smoke extraction); the building draws electricity from a common photovoltaic installation with a capacity of 20.91 kWp, which provides energy for the entire property;
Complex No. E_3: a system of thermal comfort maintenance consisting of ventilation, air conditioning, and heating with accumulation stoves with a total heating power of 45 kW; the facility is powered by two independent power grid circuits. It is equipped with fire protection (staircase smoke extraction); the building draws electricity from a common photovoltaic installation with a capacity of 20.91 kWp, which provides energy for the entire property;
Complex No. E_4: a system of thermal comfort maintenance consisting of ventilation, air conditioning (for part of the office), and heating with storage heaters and convection heaters. It should be mentioned that the office space occupies 50 m2, and the remaining part is made up of unheated garages. The power of the devices needed to maintain the thermal comfort of the office space is 4 kW; the facility is powered by two independent power grid circuits. It is equipped with fire protection (staircase smoke extraction); the building draws electricity from a common photovoltaic installation with a capacity of 20.91 kWp, which provides energy for the entire property;
Complex No. E_5: an unheated building; selection of the amount of fresh air prepared and supplied to the building is carried out based on normative requirements for the number of people working in the building; the facility is powered by two independent power grid circuits. It is equipped with fire protection (staircase smoke extraction); the building draws electricity from a common photovoltaic installation with a capacity of 20.91 kWp, which provides energy for the entire property.
4. Results
Experimental research was performed on the datasets obtained from energy consumption meters placed in the building complex described in the previous section. The data were collected over a period of four years, from 1 January 2018 to 31 December 2021, in the form of time series. The dataset contains energy consumption readings sampled every 15 min and, therefore, contains over 140,174 readings.
4.1. Data Preprocessing
Before applying forecasting methods, the input data were analyzed, and outliers were found. The outliers occurred at transitions between winter and summer time: energy consumption was recorded twice for the same hour at the transition from summer to winter time, and measurements were missing for one hour at the transition from winter to summer time. This is a known issue with time and calendar standardization in time series analysis. The Matplotlib library was used in Python to visualize the time series (
Figure 1). Conclusions from the analysis indicated that this is a temporary situation and that long-term analysis will not be affected. Therefore, these outlier values were replaced with the average value of readings from neighboring days.
Figure 1 shows an example time series of electricity consumption before the data are processed.
Due to business requirements, the data were aggregated into monthly periods to perform a forecast one year ahead on a monthly basis. This was conducted to indicate the likely level of energy consumption in the upcoming year, enabling the company to determine how much energy needs to be purchased for future periods.
Therefore, the following initial data processing took place:
deletion of additional duplicate data (conversion from summer to winter time—an additional hour),
removal of anomalies, such as bad readings or the occurrence of excessively high values,
averaging data from neighboring days,
grouping the values in a monthly period.
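The processing steps listed above can be sketched in plain Python as follows. This is a simplified illustration rather than the code used in the study: the function names, the plausibility threshold `max_valid`, the choice to keep the first of two duplicated readings, and the use of monthly sums (rather than means) are all assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta

STEP = timedelta(minutes=15)   # sampling interval of the meters
DAY = timedelta(days=1)


def clean_readings(readings, max_valid):
    """readings: list of (local timestamp, consumption) pairs.

    Returns a dict timestamp -> value on a complete 15-min grid.
    """
    data = {}
    for ts, value in readings:
        # Keep only the first plausible reading for each timestamp; this
        # drops the duplicated hour (summer to winter time) and bad or
        # excessively high values.
        if ts not in data and 0 <= value <= max_valid:
            data[ts] = value

    # Fill missing slots (e.g., the skipped hour in spring, or removed
    # anomalies) with the average of the same time on neighboring days.
    filled = dict(data)
    ts, end = min(data), max(data)
    while ts <= end:
        if ts not in data:
            neighbours = [data[n] for n in (ts - DAY, ts + DAY) if n in data]
            if neighbours:
                filled[ts] = sum(neighbours) / len(neighbours)
        ts += STEP
    return filled


def monthly_totals(filled):
    """Group the cleaned readings into monthly periods (summed here)."""
    totals = defaultdict(float)
    for ts, value in filled.items():
        totals[(ts.year, ts.month)] += value
    return dict(sorted(totals.items()))


# Usage: three synthetic days around a spring time change, with one missing
# hour, one duplicated reading, and one implausibly high value.
base = datetime(2021, 3, 27)
readings = []
for k in range(3 * 96):
    ts = base + k * STEP
    if ts.day == 28 and ts.hour == 2:
        continue                                   # missing hour
    value = 1e6 if ts == datetime(2021, 3, 28, 12, 0) else 1.0
    readings.append((ts, value))
readings.append((base + 5 * STEP, 1.0))            # duplicated reading
cleaned = clean_readings(readings, max_valid=100.0)
totals = monthly_totals(cleaned)
```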
Figure 2 shows an example time series of electricity consumption after the data are processed. After processing, the series is free of duplicated and anomalous readings, which makes more accurate forecasts possible.
4.2. Prediction of Energy Consumption in a Building Complex
The energy consumption data were divided into a training set and a test set. The training set included data from 2018 to 2020. The forecast for 2021 was produced from the training data. The following figures show the electricity consumption of selected buildings over the examined period together with the forecast. Data from 2018 to 2020, on which the models were trained, are marked in blue in the charts. Test data are shown in orange. Green represents the forecast for 2021. We selected building B and building complex E, whose time series differ. For building B, the time series is characterized by periodicity, but without seasonality and growing trend, whereas for building complex E, the time series is characterized by seasonality and a downward trend. The hyperparameter tuning for all experiments described in the paper was conducted using the grid search method.
Figure 3,
Figure 4,
Figure 5,
Figure 6 and
Figure 7 show the results obtained by the following methods, respectively: ARIMA/SARIMA, Holt–Winters, DNN, RNN, and LSTM.
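Grid search, used here for hyperparameter tuning, can be sketched generically: every combination of candidate parameter values is evaluated on held-out data, and the combination with the lowest error is kept. The sketch below is an illustration under our own assumptions: the toy `seasonal_naive` model, the candidate values, the synthetic monthly pattern, and the use of a separate validation year are not taken from the experiments.

```python
from itertools import product


def mean_absolute_error(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def seasonal_naive(train, horizon, season_length):
    """Toy model: repeat the last `season_length` observations."""
    last_season = train[-season_length:]
    return [last_season[h % season_length] for h in range(horizon)]


def grid_search(model, param_grid, train, validation):
    """Try every parameter combination and keep the one with the lowest error."""
    names = list(param_grid)
    best_params, best_error = None, float("inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        forecast = model(train, len(validation), **params)
        error = mean_absolute_error(validation, forecast)
        if error < best_error:
            best_params, best_error = params, error
    return best_params, best_error


# Usage: synthetic monthly data with a yearly pattern; two years for fitting
# and one year held out for choosing the season length.
pattern = [5, 4, 3, 2, 1, 1, 1, 2, 3, 4, 5, 6]
train = pattern * 2
validation = pattern
best_params, best_error = grid_search(
    seasonal_naive, {"season_length": [3, 4, 6, 12]}, train, validation
)
```

In this synthetic example only a season length of 12 reproduces the held-out year exactly, so it is the combination selected.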
4.3. Forecast Accuracy
In order to determine the obtained error, three types of error determination methods were used:
MAE (Mean Absolute Error)—it is calculated as the sum of the absolute values of individual values divided by its quantity.
where:
is the actual energy consumption at time i,
is the forecasted value at time i,
n is the number of observations.
RMSE (Root Mean Squared Error)—it is calculated as the square root of the sum of the squares of the errors. The calculation “penalize” small errors with small values, while large errors receive a larger penalty.
where:
is the actual energy consumption at time i,
is the forecasted value at time i,
n is the number of observations.
MAPE (Mean Absolute Percentage Error): the mean of the absolute forecast errors relative to the corresponding actual values. To make the result independent of scale, it is expressed as a percentage.
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right|$$
where:
$y_i$ is the actual energy consumption at time i,
$\hat{y}_i$ is the forecasted value at time i,
n is the number of observations.
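The three measures can be sketched in plain Python as shown below. The sketch assumes the standard textbook definitions of MAE, RMSE and MAPE; the function names and the sample values are illustrative and are not taken from the study.

```python
import math


def mae(actual, forecast):
    """Mean absolute error: average magnitude of the forecast errors."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error: squaring weights large errors more heavily."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error; undefined when an actual value is zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up monthly values for illustration only.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]

print(f"MAE  = {mae(actual, forecast):.2f}")    # MAE  = 10.00
print(f"RMSE = {rmse(actual, forecast):.2f}")   # RMSE = 12.91
print(f"MAPE = {mape(actual, forecast):.2f}%")  # MAPE = 6.67%
```

MAE and RMSE are expressed in the units of the data, so they are not directly comparable between buildings with different consumption levels. MAPE is relative, which makes it the more convenient measure for comparing buildings.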
Table 2, Table 3, Table 4, Table 5 and Table 6 present the results for all the tested buildings. For each building, the MAE, RMSE and MAPE values were determined for every forecasting model.
The results show that ARIMA/SARIMA achieved the lowest MAPE for every building (between 0.44 and 8.22), which indicates the best fit, although LSTM reached a lower MAE and RMSE for buildings B and D. Holt–Winters, another simple method, is also a good forecasting method. None of the artificial intelligence methods produced good forecasts. The worst results were achieved by the RNN method, which flattened and averaged its forecasts and did not follow the real data in any way.
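The comparison can be reproduced directly from the tabulated values. The short sketch below copies the error values from Tables 2 to 6 and reports, for each building, which method has the lowest value of each measure. The dictionary layout and the helper name are illustrative assumptions.

```python
# Error values copied from Tables 2-6 as (MAE, RMSE, MAPE) per method.
results = {
    "A": {"HW": (989.88, 1165.22, 0.78), "ARIMA": (554.56, 700.59, 0.44),
          "DNN": (2764.66, 3261.27, 8.55), "RNN": (1270.44, 1596.91, 47.15),
          "LSTM": (1261.78, 1447.78, 3.90)},
    "B": {"HW": (4037.18, 4233.92, 31.39), "ARIMA": (1057.31, 1287.97, 8.22),
          "DNN": (1403.80, 1565.40, 44.97), "RNN": (1256.56, 1536.99, 444.74),
          "LSTM": (682.89, 1014.07, 15.96)},
    "C": {"HW": (2546.50, 2939.17, 5.71), "ARIMA": (1941.05, 2254.62, 4.35),
          "DNN": (2738.11, 3426.65, 17.33), "RNN": (8490.28, 10318.1, 872.49),
          "LSTM": (3024.02, 3457.38, 27.35)},
    "D": {"HW": (3337.76, 4061.03, 7.07), "ARIMA": (3009.39, 4496.48, 6.38),
          "DNN": (3657.68, 5061.1, 20.90), "RNN": (6482.95, 8346.98, 608.43),
          "LSTM": (2301.19, 3515.51, 17.99)},
    "E": {"HW": (5885.33, 6847.08, 5.02), "ARIMA": (2854.90, 3509.47, 2.44),
          "DNN": (14070.78, 17225.94, 33.49), "RNN": (24295.17, 30226.85, 960.62),
          "LSTM": (5017.83, 6252.26, 16.53)},
}
METRICS = ("MAE", "RMSE", "MAPE")


def best_method(building_results, metric_index):
    """Return the method with the lowest value of the chosen error measure."""
    return min(building_results, key=lambda m: building_results[m][metric_index])


best = {building: {name: best_method(rows, i) for i, name in enumerate(METRICS)}
        for building, rows in results.items()}

for building, winners in best.items():
    print(building, winners)
```

Ranking by each measure separately makes it visible when the measures disagree, which a single overall ranking would hide.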
5. Conclusions
Energy consumption in the studied buildings (office, warehouse) is characterized by seasonality in monthly periods. Classic models are good at predicting energy consumption in this type of building. The ARIMA model gave the best results for the examined data. For buildings characterized by seasonality and trends, the forecast matched the actual values almost perfectly. Neural networks did not work well despite the regular time series, because there was too little data to train them. As a result, they received the worst evaluation scores. In particular, the RNN model produced forecasts that did not match the real data at all. The described methods do not exhaust the discussion of other, more complex prediction models that would take into account, for example, weather conditions, which also affect the overall consumption of electricity from the grid in buildings with photovoltaic panels.
The prediction of electricity consumption is used to plan the purchase of the appropriate amount of energy for the entire building complex in the long term, which allows energy consumption to be planned properly for the following days and months. However, there are no simple, good black-box solutions for decision-makers who lack specialist knowledge of prediction methods or time series analysis. The presented research is part of work on developing and implementing a decision support system that is easy to use for a person without a scientific background and that forecasts the purchase of the appropriate amount of energy for the entire building complex in the long term. The results are intended to facilitate the management of thermal comfort maintenance systems, or the use of other heat sources in some locations to support building heating when the forecast values indicate that the ordered amount of energy will be exceeded. This limits unnecessary contractual penalties and means that people managing energy consumption can actually influence it in advance. The results of the research will be implemented in the enterprise to manage non-network properties.
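The check against the ordered energy mentioned above could look roughly like the following sketch. It is purely illustrative: the function name, the month labels, the forecast values and the ordered amount are all hypothetical, and no units are implied.

```python
def months_over_limit(forecast_by_month, ordered_energy):
    """Return (month, excess) pairs where the forecast exceeds the ordered energy."""
    return [(month, value - ordered_energy)
            for month, value in forecast_by_month.items()
            if value > ordered_energy]


# Hypothetical forecast values and ordered amount, for illustration only.
forecast_by_month = {"2021-01": 128000.0, "2021-02": 119500.0, "2021-03": 131250.0}
ordered_energy = 125000.0

alerts = months_over_limit(forecast_by_month, ordered_energy)
for month, excess in alerts:
    print(f"{month}: forecast exceeds the ordered energy by {excess:.0f}")
```

A decision support system could surface such alerts to a building manager early enough to reduce consumption or switch to an alternative heat source.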
Further work is related to the use of daily or weekly data to train neural networks. Such data were too fragmented to be used for simple forecasting methods. In addition, we plan to examine probabilistic and Bayesian models to predict long-term trends.
Another research area is the exploration of other prediction methods. This will allow a broader comparison and the matching of a specific forecasting method to a specific building type.
Author Contributions
Conceptualization, K.K.; Methodology, W.B. and K.G.-D.; Software, R.M. and W.B.; Validation, R.M., W.B., K.G.-D. and E.K.; Formal analysis, K.K., R.M., W.B. and K.G.-D.; Data curation, K.K.; Writing—original draft, E.K., R.M., W.B., K.G.-D. and E.K.; Visualization, R.M.; Supervision, E.K.; Funding acquisition, K.K. All authors have read and agreed to the published version of the manuscript.
Funding
Research project partly supported by AGH University of Krakow subvention for scientific activity no. 16.16.120.773 and program “Excellence initiative—research university” for the AGH University of Krakow. The APC was funded by Kazimierz Kawa.
Data Availability Statement
The datasets presented in this article are not readily available because the data are part of an ongoing study.
Conflicts of Interest
Author Kazimierz Kawa was employed by Tauron Dystrybucja S.A. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Figure 1.
Data read by the sensors and saved in the system. The data were sampled at a high frequency: a reading was recorded every fifteen minutes.
Figure 2.
Data after preprocessing. The anomalies have been removed, and grouping the data has eliminated the fragmentation. This makes it possible for the tested models to provide more accurate forecasts.
Figure 3.
Results for the ARIMA/SARIMA method for building B and building complex E. (a) The time series is periodic but has no seasonality and a variable, unclear trend. The ARIMA/SARIMA method performed very well: it reproduced the time series with a small time shift. (b) The time series shows seasonality and a downward trend. The ARIMA/SARIMA forecast matched the test data almost perfectly.
Figure 4.
Results for the Holt–Winters method for building B and building complex E. (a) The time series is periodic but has no seasonality and a variable, unclear trend. Despite attempts to change the parameters, the simpler Holt–Winters model failed to predict the data. (b) The time series shows seasonality and a decreasing trend. For this reason, the simpler Holt–Winters model is more accurate than other models.
Figure 5.
Results for the DNN method for building B and building complex E. (a) The time series is periodic but has no seasonality and a variable, unclear trend. The DNN method completely failed to predict the data: the forecast was averaged and looks sinusoidal. (b) The time series shows seasonality and a downward trend. The DNN model did not cope at all with forecasting this type of data: the forecast was flattened and heavily averaged.
Figure 6.
Results for the RNN method for building B and building complex E. (a) The time series is periodic but has no seasonality and a variable, unclear trend. The RNN method completely failed to predict the data: the forecast was shifted in time and did not follow the large values. (b) The time series shows seasonality and a downward trend. The RNN model gave the worst results of all the methods tested: the forecast was flat and lay below the lowest actual value.
Figure 7.
Results for the LSTM method for building B and building complex E. (a) The time series is periodic but has no seasonality and a variable, unclear trend. The LSTM method gave the best forecast, and the result is consistent with the test data. (b) The time series shows seasonality and a downward trend. The LSTM model gave the best results of the three artificial intelligence methods. The results were similar to those of the Holt–Winters model.
Table 1.
Functional characteristics of the examined buildings.
| Building Designation | Building Type | Volume | Useful Floor Area | Building Age and Location | Usage Pattern |
|---|---|---|---|---|---|
| Building No. A | office building with 4 storeys and a basement | 14,040 m³ | 3230 m² | commissioned in 2017, located in the city center near the river | mostly used 24 h/day, or at least 12 h/day |
| Building No. B | office building with 2 storeys | 7778 m³ | 1962 m² | commissioned in 2015, located on the outskirts of the city near the river | mostly used 24 h/day, or at least 12 h/day |
| Building No. C | office/garage building with 2 storeys | 7503 m³ | 1178 m² | commissioned in 2017, located on the outskirts of the city | used 11 h/day |
| Building No. D | office building with 3 storeys | 4308 m³ | 1099 m² | commissioned in 2017, located in the city center | used 11 h/day |
| Complex No. E_1 | office building with 4 storeys | 8399.8 m³ | 742.0 m² | commissioned in 1972, located on the outskirts of the city | mostly used 8 h/day |
| Complex No. E_2 | office/warehouse building, one storey | 2505.2 m³ | 556.7 m² | commissioned in 1972, located on the outskirts of the city | used 12 h/day |
| Complex No. E_3 | office (social)/warehouse building with 2 storeys | 5707.4 m³ | 713.8 m² | commissioned in 2017, located on the outskirts of the city | used 12 h/day |
| Complex No. E_4 | office/garage building, one storey | 3837.7 m³ | 645.6 m² | commissioned in 1972, located on the outskirts of the city | used 12 h/day |
| Complex No. E_5 | garage/warehouse building, one storey | 2198.1 m³ | 399.6 m² | commissioned in 1972, located on the outskirts of the city | used 12 h/day |
Table 2.
Forecast error values for building No. A.
| Method | MAE | RMSE | MAPE (%) |
|---|---|---|---|
| HW | 989.88 | 1165.22 | 0.78 |
| ARIMA | 554.56 | 700.59 | 0.44 |
| DNN | 2764.66 | 3261.27 | 8.55 |
| RNN | 1270.44 | 1596.91 | 47.15 |
| LSTM | 1261.78 | 1447.78 | 3.90 |
Table 3.
Forecast error values for building No. B.
| Method | MAE | RMSE | MAPE (%) |
|---|---|---|---|
| HW | 4037.18 | 4233.92 | 31.39 |
| ARIMA | 1057.31 | 1287.97 | 8.22 |
| DNN | 1403.80 | 1565.40 | 44.97 |
| RNN | 1256.56 | 1536.99 | 444.74 |
| LSTM | 682.89 | 1014.07 | 15.96 |
Table 4.
Forecast error values for building No. C.
| Method | MAE | RMSE | MAPE (%) |
|---|---|---|---|
| HW | 2546.50 | 2939.17 | 5.71 |
| ARIMA | 1941.05 | 2254.62 | 4.35 |
| DNN | 2738.11 | 3426.65 | 17.33 |
| RNN | 8490.28 | 10,318.1 | 872.49 |
| LSTM | 3024.02 | 3457.38 | 27.35 |
Table 5.
Forecast error values for building No. D.
| Method | MAE | RMSE | MAPE (%) |
|---|---|---|---|
| HW | 3337.76 | 4061.03 | 7.07 |
| ARIMA | 3009.39 | 4496.48 | 6.38 |
| DNN | 3657.68 | 5061.1 | 20.90 |
| RNN | 6482.95 | 8346.98 | 608.43 |
| LSTM | 2301.19 | 3515.51 | 17.99 |
Table 6.
Forecast error values for building complex No. E.
| Method | MAE | RMSE | MAPE (%) |
|---|---|---|---|
| HW | 5885.33 | 6847.08 | 5.02 |
| ARIMA | 2854.90 | 3509.47 | 2.44 |
| DNN | 14,070.78 | 17,225.94 | 33.49 |
| RNN | 24,295.17 | 30,226.85 | 960.62 |
| LSTM | 5017.83 | 6252.26 | 16.53 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).