Article

Comparative Analysis of Load Profile Forecasting: LSTM, SVR, and Ensemble Approaches for Singular and Cumulative Load Categories

by Ahmad Fayyazbakhsh *, Thomas Kienberger and Julia Vopava-Wrienz
Chair of Energy Network Technology, Montanuniversitaet Leoben, Franz-Josef Straße 18, A-8700 Leoben, Austria
* Author to whom correspondence should be addressed.
Smart Cities 2025, 8(2), 65; https://doi.org/10.3390/smartcities8020065
Submission received: 19 January 2025 / Revised: 21 March 2025 / Accepted: 7 April 2025 / Published: 10 April 2025

Highlights

What are the main findings?
  • A comparative study is performed to identify the best-performing model for load profile forecasting.
  • A model is developed to forecast single and cumulative load profiles comprising EV, HH, and HP loads.
What are the implications of the main findings?
  • The model is validated on both synthetic and measured data for several grids in Austria.
  • The selected model demonstrates high robustness in forecasting all profile categories, based on a ROC-like curve for peak capture and other validation metrics such as MAPE, MAE, and SMAPE.
  • Energy forecasting and correction are applied to control the peaks and make the forecast more accurate and realistic.

Abstract

Accurately forecasting load profiles, and especially capturing peaks, is challenging due to the stochastic nature of consumption. In this paper, we applied three models to forecast 24 h-ahead load profiles: Long Short-Term Memory (LSTM); Support Vector Regression (SVR); and a combined model that blends SVR, Gated Recurrent Units (GRU), and Linear Regression (LR). Household (HH), heat pump (HP), and electric vehicle (EV) loads were forecast both individually (singular) and collectively, using one-year load profiles. A novel forecast correction mechanism is introduced, adjusting forecasts every eight hours to increase reliability. The findings highlight the potential of deep learning in enhancing energy demand forecasting, especially in identifying peak loads, which contributes to more stable and efficient grid operations. Visual and validation data were investigated, along with the models’ performances at different levels: off-peak, on-peak, and total. Among all models, LSTM performed slightly better on most metrics, particularly in peak capturing. However, the blended model showed slightly better performance than LSTM for EV power load forecasting, with an on-peak mean absolute percentage error (MAPE) of 21.45%, compared to 29.24% and 22.02% for SVR and LSTM, respectively. Nevertheless, visual analysis clearly showed the strong ability of LSTM to capture peaks. This was also reflected in the MAPE and symmetric mean absolute percentage error (SMAPE) during the on-peak period, with around 3–5% improvement compared to SVR and the blended model.
Finally, LSTM was employed in predicting day-ahead load profiles using measured data from four grids and showed high potential in capturing peaks with MAPE values less than 10% for most of the grids.

1. Introduction

The increasing use of renewable energy sources (RES) and modern consumers, such as electric vehicles (EVs) and heat pumps (HPs), leads to grid stress. To reduce costly grid expansion, demand and generation management is crucial, and this in turn depends on reliable forecasting. Forecasting energy demand is described as the utilization of historical demand data to anticipate future consumption levels through appropriate statistical methods [1]. Energy demand forecasts (EDFs) play a vital role in planning for the future, addressing essential issues related to the expansion of the current infrastructure, planning the operation of current power plants, and establishing the framework for energy tariffs [2].
Conventional load prediction primarily concentrates on the loads at the system or bus level. Nonetheless, the deployment of smart meters allows for the gathering of extensive sets of detailed electricity consumption data. This opens the opportunity to conduct load forecasting specifically for individual consumers [3,4]. In contrast to the loads of a building with several households (HHs), the loads of individual HHs exhibit higher levels of volatility.
Engineering models, statistical models, and artificial intelligence models are the three main categories of models for energy demand forecasts [5,6]. Engineering models utilize physical and thermodynamic principles, necessitating intricate building and environmental parameters. However, these are challenging and time-consuming to implement. Statistical and artificial intelligence models, on the other hand, forecast based on relevant factors such as temperature and humidity, climate data, and historical energy consumption [7].
Researchers have been trying to forecast the load by recurrent neural networks (RNN), especially by long-short term memory (LSTM) models [8,9] rather than traditional forecasting models, such as autoregressive integrated moving average models (ARIMA) and support vector machines (SVMs) [10,11]. The advantages of models based on recurrent neural networks compared to traditional modeling for forecasting distinguish them and result in their preferential use in a wide range of applications.
Feature generation from the timestamp, clustering algorithms, multi-input and multi-output sliding windows, the ability to handle static or live data, and reduced susceptibility to the vanishing gradient problem collectively make the LSTM model not only more functional than conventional models but also more advanced than other RNN models [12]. Moreover, it is an appropriate approach for managing time series data of varying lengths [11]. Several attempts have been made in the literature to compare LSTMs, or RNN models in general, with traditional models. It is worth mentioning that LSTM is most often the RNN model of choice, owing to its accuracy and complexity [13,14]. This research reviews studies that compare RNN models with alternative models, aiming to identify the best approach for power load forecasting. Table 1 summarizes some of the latest studies in forecasting load profiles in different countries. It is evident that most of these studies used machine learning and deep learning methods. The factor that deserves the most attention is the model architecture. Depending on the nature and length of the dataset, the model structure can vary, and finding the optimal architecture is important. In recent years, there have been several attempts to build hybrid models that can handle different behaviors within a pattern. The selection of models should therefore be studied before implementation.
Until now, forecasting efforts have primarily focused on real or synthesized data, with a specific emphasis on the load profile variable. However, there has been a noticeable lack of comprehensive research aimed at modeling various loads independently and collectively, meaning aggregating different demands (HH, EV, and HP) as total demand. Moreover, as shown in Table 1 and other research studies, the blending model for forecasting load profiles and comparing them (separately and collectively) with different types is missing. This study proposed a blended model due to the nature of load profiles in some categories, which can influence the final demand pattern. Therefore, this research study aims to explore the potential of the LSTM model in forecasting load profiles for power demands, both individually and collectively. In this context, this paper also compares different models, such as SVR and a blended model (SVR + linear regressor (LR) + GRU) with LSTM, to choose the best and most efficient approach for the forecast of both synthetic and measured data. The blended model was proposed for the first time based on the nature of the datasets, making this model a valuable model to be compared with LSTM. Finally, as another novelty of this research, the forecast correction for every 8 h is used to improve the robustness of the model for future studies in the area of optimization and controlling the peaks. Although the receiver operating characteristic (ROC)-like curve is not usually used for load forecasting, in this study, this validation metric was used based on the classification of studying on-peak, off-peak, and total patterns to show the performance of the models at different thresholds and to evaluate the model in terms of peak catch.

2. Materials and Methods

2.1. Model Processing

Based on the results and the model selection, the LSTM model is described in detail in this section. The blended model is also presented due to its innovative structure. Moreover, a description of the SVR model is provided for further clarification.
In the phase of training the model, variables are allocated to the connections that move both into and out of the LSTM gates, with some of these connections being recurrent. These variables undergo continuous updates across numerous training cycles, which are known as epochs in the RNN, to enhance the accuracy of forecasts. The incorporation of a forget gate, along with the additive characteristic of cell state gradients, enables the LSTM to adjust connection weights in a manner that substantially mitigates the risk of encountering a vanishing gradient problem [20,21]. LSTMs use gates to manage the flow of information into and out of the memory cell and monitor the process of forgetting (Figure 1) to obtain precise predictions for sequential data [22].
In LSTM (Figure 1), the first gate determines which information to forget (Equation (1)), and the subsequent one decides which information from the current input vector should be stored in the cell (Equation (2)). In the input gate, which contains the decision $i_t$, the value of the cell is updated: tanh is applied to generate a temporary value (Equation (3)), from which the new state value, known as the current cell state, is obtained (Equation (4)). The output gate (Equation (5)) contains information about the output at the current moment in time. Finally, the output value of the cell at time t, $h_t$, is given by Equation (6) [24,25,26].
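$$ f_t = \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) \tag{1} $$
$$ i_t = \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) \tag{2} $$
$$ \hat{C}_t = \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) \tag{3} $$
$$ C_t = f_t \odot C_{t-1} + i_t \odot \hat{C}_t \tag{4} $$
$$ O_t = \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) \tag{5} $$
$$ h_t = O_t \odot \tanh\left(C_t\right) \tag{6} $$
$$ \sigma(x) = \frac{1}{1 + e^{-x}} \tag{7} $$
$$ \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{8} $$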
$f_t$, $i_t$, $\hat{C}_t$, $C_t$, $O_t$, and $h_t$: forgetting threshold at time t, input threshold, temporary unit state of $C_t$, current cell state, output threshold at time t, and output value of the output gate at time t, respectively.
σ: sigmoid activation function.
$h_{t-1}$: output at time t − 1; $x_t$: input at time t.
$W_f$, $U_f$, $W_i$, $U_i$, $W_c$, $U_c$, $W_o$, and $U_o$: weights.
$b_f$, $b_i$, $b_c$, and $b_o$: bias terms.
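As a minimal sketch of Equations (1)–(6), the following pure-Python function performs one LSTM step for scalar inputs. The parameter dictionary `p`, its key names, and the numeric values are illustrative assumptions and not the trained weights of this study; the candidate state uses tanh, as described in the text and in the standard LSTM formulation.

```python
import math


def sigmoid(x):
    """Logistic function, Equation (7)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step; `p` is a hypothetical dict of weights and biases."""
    f_t = sigmoid(p["W_f"] * x_t + p["U_f"] * h_prev + p["b_f"])      # forget gate, Eq. (1)
    i_t = sigmoid(p["W_i"] * x_t + p["U_i"] * h_prev + p["b_i"])      # input gate, Eq. (2)
    c_hat = math.tanh(p["W_c"] * x_t + p["U_c"] * h_prev + p["b_c"])  # temporary state, Eq. (3)
    c_t = f_t * c_prev + i_t * c_hat                                  # cell state, Eq. (4)
    o_t = sigmoid(p["W_o"] * x_t + p["U_o"] * h_prev + p["b_o"])      # output gate, Eq. (5)
    h_t = o_t * math.tanh(c_t)                                        # cell output, Eq. (6)
    return h_t, c_t


# Illustrative parameters only (all weights 0.5, all biases 0.0).
params = {k: 0.5 for k in ("W_f", "U_f", "W_i", "U_i", "W_c", "U_c", "W_o", "U_o")}
params.update({k: 0.0 for k in ("b_f", "b_i", "b_c", "b_o")})

h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.9]:  # a short, made-up input sequence
    h, c = lstm_step(x, h, c, params)
```

In a real layer the weights are matrices and the products are matrix-vector products; the scalar form is used here only to keep the gate logic visible.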
The inputs can be defined before training; however, if the weights of the inputs are not finalized, LSTM can find out and update the weight of each input during training using backpropagation through time (BPTT). Inputs (xt) of this forecasting were the temperatures, the sine and cosine of days of the week and hours of the day, and the load in the previous day.
The input layer contains 100 neurons and is followed by a hidden layer with the same number of neurons and a rectified linear unit (ReLU) activation function. The output layer is a dense layer with one neuron. The number of neurons for the measured data was chosen after testing different quantities (10, 20, 30, 50, 100, and 150) and comparing the average error of each configuration. Based on the average error, especially for the last day of the testing data, 100 neurons (with an average error below 10%) gave the best trade-off between over- and under-fitting. The model is trained using the adaptive moment estimation (Adam) optimizer, a robust and well-known technique, and the mini-batch size varies according to each dataset. This optimizer has been reported to outperform other stochastic optimization techniques and to be quite efficient for sparse and noisy problems [19].
The model is trained and evaluated using various metrics, such as the Mean Absolute Error (MAE), Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE) (Equations (9)–(12)), to find the best model among the ones used. According to Lewis’s benchmark, which various researchers use when forecasting energy consumption, a MAPE below 10% indicates high accuracy, while MAPE values of 10–20% and 20–50% indicate good and reasonable accuracy, respectively [27]. It is essential to emphasize that error metrics must be considered together with visual inspection of the data for conclusive decision making. This is crucial because points with low, possibly zero, actual values can produce very large percentage errors. Examining other data points in the context of actual vs. forecasted values offers invaluable insights for the analysis [28,29]. Moreover, to alleviate one of the drawbacks of the MAPE (due to its denominator), the Symmetric Mean Absolute Percentage Error (SMAPE) (Equation (13)) is used. When actual values are near zero, the SMAPE mitigates the division-by-zero problem and offers a more reliable measure of precision.
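$$ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i^{*} - y_i \right| \tag{9} $$
$$ \mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i^{*} - y_i \right)^{2} \tag{10} $$
$$ \mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i^{*} - y_i}{y_i^{*}} \right| \times 100\% \tag{11} $$
$$ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( y_i^{*} - y_i \right)^{2}} \tag{12} $$
$$ \mathrm{SMAPE} = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{\left| y_i^{*} - y_i \right|}{\left( y_i^{*} + y_i \right)/2} \right) \times 100\% \tag{13} $$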
N: number of observations.
yi: predicted value; yi*: actual value.
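The error metrics of Equations (9)–(13) can be sketched in a few lines of pure Python. The function names are illustrative, and two conventions are assumptions made here rather than choices stated in the paper: the MAPE helper skips points whose actual value is zero, and the SMAPE helper counts a term as zero error when actual and predicted values sum to zero.

```python
import math


def mae(actual, predicted):
    """Mean absolute error, Equation (9)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error, Equation (10)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, Equation (12)."""
    return math.sqrt(mse(actual, predicted))


def mape(actual, predicted):
    """MAPE in percent, Equation (11); zero actual values are skipped (assumption)."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(terms) / len(terms) if terms else float("nan")


def smape(actual, predicted):
    """SMAPE in percent, Equation (13); 0/0 terms count as zero error (assumption)."""
    terms = [
        abs(a - p) / ((a + p) / 2.0) if (a + p) != 0 else 0.0
        for a, p in zip(actual, predicted)
    ]
    return 100.0 * sum(terms) / len(terms)


# Made-up example values (not data from the study).
actual = [100.0, 200.0]
forecast = [110.0, 180.0]
print(mae(actual, forecast), rmse(actual, forecast), mape(actual, forecast))
```

With these made-up values the MAPE is 10%, which sits on the boundary between the high and good accuracy ranges quoted from Lewis’s benchmark.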
To avoid unnecessary training iterations, we incorporated the early stopping technique to determine the optimal number of training epochs. Specifically, the dataset is divided into training, validation, and test sets. The model was implemented in Keras (TensorFlow backend). Early stopping monitors the validation loss (MSE) with a patience of three, meaning that training is stopped if the validation loss does not improve for three consecutive epochs. The model then reverts to the weights of the epoch with the lowest validation loss by activating the “Restore Best Weights” feature.
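The early-stopping rule can be illustrated without any deep learning library. The sketch below takes a list of per-epoch validation losses and reports the epoch at which training would stop and the epoch whose weights would be restored; the function name and the loss values are hypothetical. In Keras, this behaviour corresponds to the `EarlyStopping` callback with `monitor="val_loss"`, `patience=3`, and `restore_best_weights=True`.

```python
def early_stopping(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; `best_epoch` marks the weights to restore.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience was never exhausted: training ran through every epoch.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: the best value (0.40) occurs at epoch 2.
losses = [0.90, 0.50, 0.40, 0.41, 0.45, 0.42, 0.30]
stop_epoch, best_epoch = early_stopping(losses, patience=3)
print(stop_epoch, best_epoch)
```

In this example training halts at epoch 5, so the later improvement at epoch 6 is never reached; this is the trade-off of a short patience.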
The number of neurons is another parameter that was studied to find the best setting, aiming to avoid both overfitting and insufficient capacity to capture complexity. After running the code several times for each neuron number, the results showed that the average error reached its lowest point when 100 neurons were used (Figure 2). However, when the number of neurons increased to 150, the average error increased. This is likely due to vanishing and exploding gradient problems, where training becomes unstable due to large weight updates, and to overfitting, where the model starts to memorize the training data rather than generalize to the testing data [30].
In the architecture of LSTM and sequential models, the lookback number is a crucial factor. Several studies have focused on day-ahead forecasting with a one-hour time step and found that a lookback of 24 resulted in the lowest error [31,32]. This is because the load profile follows a repeated daily pattern, and by considering the previous 24 h, the model can better capture this pattern.
To ensure the effectiveness of 24 lookbacks, various alternative lookback values were studied, and different errors were investigated (see Figure 3). It is evident that all types of errors were reduced with 24 lookbacks, whereas they began to rise with an increase in this number. The main hypothesis behind these results is that lookback numbers and input data assist the model in capturing patterns and identifying peaks (indicating how much the trend at each point should increase or decrease), respectively.
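A lookback of 24 means that each training sample consists of the previous 24 hourly values and the value (or values) to be predicted next. The following sketch builds such samples from a plain list; the function name and the `horizon` parameter are illustrative assumptions, and the multivariate inputs used in the study (temperature, cyclic time features, previous-day load) are omitted for brevity.

```python
def make_windows(series, lookback=24, horizon=1):
    """Split a series into (input window, target) pairs.

    Each input holds `lookback` consecutive values; the target holds the
    `horizon` values that immediately follow it.
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        inputs.append(series[start:start + lookback])
        targets.append(series[start + lookback:start + lookback + horizon])
    return inputs, targets


# Made-up hourly series of 30 points: yields 6 samples with lookback 24.
hourly_load = list(range(30))
X, y = make_windows(hourly_load, lookback=24, horizon=1)
print(len(X), X[0][:3], y[0])
```

Changing `lookback` to 8 or 168 reproduces the alternative settings (last 8 h, last week) that were compared against 24 h.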
Figure 4 describes the different steps that were implemented to forecast the power load one year ahead. Data preparation was the first part and consists of several steps. One of them is data cleaning, which removes inconsistencies introduced by synthetic data as well as irrelevant information. The models used, particularly LSTM, are robust in capturing peaks but can still be influenced by extreme anomalies [33]. The SMAPE is used as an alternative to the MAPE to reduce the effects of zero values. For measured data, anomalies such as not-a-number (NaN) entries were cleaned and filled with linear interpolation, which combines backward and forward fills. Before this step, the Masking layer in Keras was tested, but it did not improve the LSTM results. The main reason is that LSTM works better with continuous sequences, and masking might break the temporal dependencies. Furthermore, data with a negative load were set to zero. Moreover, dummy variables were used to increase the performance of the model. In this regard, on-peak time was the dummy variable that shows which hours of the year are more likely to have an on-peak load.
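The cleaning steps for measured data can be sketched as follows: NaN gaps are filled by linear interpolation between the nearest valid neighbours, leading and trailing gaps fall back to backward and forward fill, and negative loads are set to zero. The function name and the handling of edge gaps are assumptions made for this illustration, not the exact procedure of the study.

```python
import bisect
import math


def clean_load(values):
    """Fill NaN gaps by linear interpolation, then set negative loads to zero."""
    known = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not known:
        raise ValueError("series contains no valid observations")
    cleaned = []
    for i, v in enumerate(values):
        if math.isnan(v):
            pos = bisect.bisect_left(known, i)
            if pos == 0:                 # leading gap: backward fill
                v = values[known[0]]
            elif pos == len(known):      # trailing gap: forward fill
                v = values[known[-1]]
            else:                        # interior gap: linear interpolation
                lo, hi = known[pos - 1], known[pos]
                weight = (i - lo) / (hi - lo)
                v = values[lo] + weight * (values[hi] - values[lo])
        cleaned.append(max(v, 0.0))      # negative load -> 0
    return cleaned


nan = float("nan")
raw = [nan, 2.0, nan, nan, 8.0, -1.0, nan]  # made-up measurements
print(clean_load(raw))
```

With pandas, the same steps are commonly written with `Series.interpolate`, `bfill`, `ffill`, and `clip(lower=0)`; the pure-Python version is shown only to make each step explicit.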
In the feature engineering and selection process, we focused on time-based features, i.e., we employed dummy variables to detect time-based patterns. In the first dataset, we established the peak demand hours and constructed binary dummy variables for patterns of such specific hours by marking them as “1” during peak hours and “0” otherwise. This assisted the model in adjusting for the variation in the data of time-of-day impacts. For the EV dataset, the presence or absence of charging activity at different times was considered to be the feature responsible for creating dummy variables. In the morning hours, when there was no charging, we coded a dummy variable with a value of “0” to represent the absence of charging. This coding distinguished times of inactivity, which might influence the behavior and performance of the model.
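A possible construction of the peak-hour dummy variables is sketched below: the average load per hour of day is computed, the hours with the highest averages are flagged with “1”, and a second dummy marks the single hour with the maximum average. The function name, the return layout, and the default `top_k=10` (taken from the top-10-hours rule mentioned in the data processing description) are illustrative assumptions.

```python
def peak_hour_dummies(hours, loads, top_k=10):
    """Build binary peak-hour features from hour-of-day labels (0-23) and loads.

    Returns (is_peak, is_max_hour, peak_hours, max_hour).
    """
    sums, counts = [0.0] * 24, [0] * 24
    for hour, load in zip(hours, loads):
        sums[hour] += load
        counts[hour] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]

    # Rank hours of the day by their average load, highest first.
    ranked = sorted(range(24), key=lambda h: means[h], reverse=True)
    peak_hours = set(ranked[:top_k])
    max_hour = ranked[0]

    is_peak = [1 if h in peak_hours else 0 for h in hours]
    is_max_hour = [1 if h == max_hour else 0 for h in hours]
    return is_peak, is_max_hour, peak_hours, max_hour


# Made-up data: two days in which the load simply equals the hour index.
hours = list(range(24)) * 2
loads = [float(h) for h in hours]
is_peak, is_max_hour, peak_hours, max_hour = peak_hour_dummies(hours, loads)
print(sorted(peak_hours), max_hour)
```

To avoid leaking information from unseen data, the hourly averages should be computed on the training period only; computing them per month or season, as discussed in the text, only requires calling the function on each subset.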
The other steps are trigonometric data for capturing cyclic and seasonal effects and set losses to show the functionality of the model and make the model capable of using the early-stopping feature. In model training, the main characteristics of the model architectures were chosen, such as the Adam optimizer, an optimization algorithm for training the model, and the numbers of neurons, lookbacks, and epochs. For each parameter, a distinct study was performed to find the most optimum condition, such as the number of neurons. For instance, the number of lookbacks is crucial to catch the pattern accurately. In this study, although the first decision was to have a look at 24 h, this parameter was changed to several conditions, such as the last 8 h or the last week; 24 h showed the lowest error in the three applied models.
The last part of forecasting is the evaluation of selected models by calculating average errors and data visualization.
The SVR, which is derived from the SVM algorithm, serves as an additional alternative for comparison. It uses an ε-insensitive (hinge-like) loss function to minimize prediction errors and aims to establish either a rigid or flexible boundary, with the goal of encompassing as many samples as possible to enhance reliability (Figure 5). In contrast to ANN models, SVR is less prone to problems related to local minima and overfitting, which can result in better performance in forecasting energy consumption. The presence of false neighbors in the reconstructed state space was identified as a factor influencing the forecasting accuracy of SVR. Consequently, the SVR model was enhanced by incorporating the reconstruction features of a time series and optimizing the initial local predictor through the removal of false neighbors [28]. SVR uses sequential minimal optimization (SMO) as an optimization algorithm; however, the performance can be further improved by tuning techniques such as Bayesian optimization, grid search, or evolutionary algorithms [34]. The Radial Basis Function (RBF) kernel, which allows SVR to detect complex non-linear patterns, was used in SVR.
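Two building blocks of SVR can be written down directly: the RBF kernel, which measures the similarity of two feature vectors, and the ε-insensitive loss commonly used in SVR, which ignores errors smaller than ε. The sketch below is illustrative only; the values of `gamma` and `epsilon` are assumptions, since the paper does not report them.

```python
import math


def rbf_kernel(x, z, gamma=0.1):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and z)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss is zero inside the epsilon tube and grows linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Identical points have similarity 1; the similarity decays with distance.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]), rbf_kernel([0.0], [3.0]))

# A 0.05 error lies inside the tube (no penalty); a 0.5 error does not.
print(epsilon_insensitive_loss(1.0, 1.05), epsilon_insensitive_loss(1.0, 1.5))
```

The tube width ε corresponds to the flexible boundary mentioned above: samples falling inside it do not contribute to the loss.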
Blending models is a technique used by researchers to improve robustness and the ability to capture complex patterns [36,37]. Several models can be chosen to create a blended model. SVR, which has a strong ability to handle non-linear relationships, and LR, which offers simplicity and interpretability, are frequently chosen as high-potential components of blended models. Moreover, temporal dependencies are another factor that should be considered in blended models for power load forecasting. GRU, a type of recurrent neural network, effectively captures temporal dependencies while being less complex and more computationally efficient than LSTM. LR was chosen for its interpretability and capacity to model straightforward linear trends, as the datasets of EVs and HPs contain simple patterns in some periods. By combining these three models, the approach balances complexity, computational efficiency, and robustness, leading to enhanced performance across various load profile categories. The hyperparameters of the blended model are as follows: for SVR, a radial basis function (RBF) kernel was used; the GRU had 50 neurons with ReLU as the activation function and was trained with the Adam optimizer for 30 epochs with a batch size of 32.
A standard GRU cell comprises two gates, namely the reset gate (R) and the update gate (Z) (Figure 6 and Equations (14)–(17)). Like the LSTM cell, the computation of the hidden state output at time t uses the hidden state at time t − 1 and the input time series value at time t. The reset gates in a GRU and the forget gates in an LSTM share a similar function [38]. Many studies using the RNN framework commonly opt for the LSTM network, which is characterized by higher complexity, a larger number of parameters, and a longer generation time compared to the GRU network [39]. It has also been shown that LSTM addresses the vanishing gradient problem encountered in RNNs and enables the retention of information across extended time intervals [24]. The LSTM model forecasts future events by considering past occurrences and current inputs. Due to its reliance on long-term memory, it is frequently used to detect anomalous behavior [40,41]. Table 2 summarizes the main features of the models used in this research.
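$$ r_t = \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \tag{14} $$
$$ z_t = \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \tag{15} $$
$$ \bar{h}_t = \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \tag{16} $$
$$ h_t = \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \bar{h}_t \tag{17} $$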
$x_t$: input at time t.
$h_{t-1}$: previous hidden state.
$W_r$, $W_z$, $W_h$, $U_r$, $U_z$, and $U_h$: weight matrices for the current input and the hidden state.
$b_r$, $b_z$, and $b_h$: bias terms.
σ: sigmoid activation function.
⊙: element-wise multiplication.
$r_t$: reset gate.
$z_t$: update gate.
$h_t$: hidden state.
$\bar{h}_t$: candidate hidden state.
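For comparison with the LSTM cell, one GRU step following Equations (14)–(17) can be sketched for scalar inputs. The parameter dictionary, its key names, and the numeric values are illustrative assumptions; the candidate state uses its own bias term `b_h`, in line with the list of bias terms above.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(x_t, h_prev, p):
    """One scalar GRU step; `p` is a hypothetical dict of weights and biases."""
    r_t = sigmoid(p["W_r"] * x_t + p["U_r"] * h_prev + p["b_r"])              # reset gate, Eq. (14)
    z_t = sigmoid(p["W_z"] * x_t + p["U_z"] * h_prev + p["b_z"])              # update gate, Eq. (15)
    h_bar = math.tanh(p["W_h"] * x_t + p["U_h"] * (r_t * h_prev) + p["b_h"])  # candidate, Eq. (16)
    return (1.0 - z_t) * h_prev + z_t * h_bar                                 # hidden state, Eq. (17)


# Illustrative parameters only (all weights 0.5, all biases 0.0).
params = {k: 0.5 for k in ("W_r", "U_r", "W_z", "U_z", "W_h", "U_h")}
params.update({k: 0.0 for k in ("b_r", "b_z", "b_h")})

h = 0.0
for x in [0.2, 0.4, 0.9]:  # a short, made-up input sequence
    h = gru_step(x, h, params)
```

The GRU keeps a single state `h` instead of the separate cell state and output of the LSTM, which is where its smaller parameter count comes from.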
For the blended model, different methods are selected based on the datasets and the nature of the pattern. To be more specific, the pattern of different load categories requires a more complex method, as there are several patterns in each category. For instance, for EV and HP, there are several times with zero or constant values. Researchers concluded that sometimes, deep learning methods, particularly LSTM, could result in a saturated forget gate, resulting in the oversimplification of the pattern and neglecting more complex and dynamic information. In this context, a blended model was proposed to investigate whether it can perform better than LSTM. In the proposed methods, LR deals with fundamental linear trends, SVR is effective in handling non-linear patterns with its kernel-based approach, and GRU is used to deal with temporal dependencies and long-term trends. The weighting method is one of the most important factors in the blended model. In this proposed method, a simple averaging weighting was performed (Equation (18)), meaning that every model has an equal contribution to the final forecast.
$$ y_{\mathrm{Blended}} = \frac{1}{3}\left(y_{\mathrm{SVR}} + y_{\mathrm{LR}} + y_{\mathrm{GRU}}\right) \tag{18} $$
yBlended: blended prediction
ySVR, yLR, and yGRU: SVR, LR, and GRU prediction outcomes
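The equal-weight blending of Equation (18) amounts to an element-wise average of the three forecasts. In the sketch below, the function name and the forecast values are made up for illustration; the three input lists stand for the outputs of already trained SVR, LR, and GRU models.

```python
def blend_forecasts(svr_pred, lr_pred, gru_pred):
    """Average three forecasts of equal length with equal weights, Equation (18)."""
    if not (len(svr_pred) == len(lr_pred) == len(gru_pred)):
        raise ValueError("all forecasts must cover the same time steps")
    return [(s + l + g) / 3.0 for s, l, g in zip(svr_pred, lr_pred, gru_pred)]


# Made-up forecasts for three consecutive hours (arbitrary load units).
svr = [3.0, 0.0, 12.0]
lr = [6.0, 0.0, 9.0]
gru = [9.0, 3.0, 15.0]
print(blend_forecasts(svr, lr, gru))
```

Unequal weights, for example weights derived from each model's validation error, would be a straightforward extension, but the study uses the simple average shown here.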

2.2. Data Processing

The selection of the time step greatly affects the performance of the forecast models. This study selected a time step of one hour with a dataset of one year. This selection enables the model to capture short-term fluctuations in the load profiles. Moreover, this interval is common in the energy market and in load forecasting. To assess the effectiveness of the proposed approach, an empirical study was conducted using synthesized datasets and four real datasets. Each dataset comprises electricity consumption data gathered over a year. The synthesized dataset contains energy consumption data from 370 households with 133 HPs and 59 EVs, representing a setting in a future low-voltage grid in central Europe.
  • Due to variations in the numerical magnitude among resource monitoring characteristics, data standardization is crucial for mitigating differences in feature scales. The Sklearn preprocessing module provides min–max standardization [16], a linear transformation of the original data that maps all feature values to the range [0, 1] (Equation (19)) [43].
$$ y_i = \frac{x_i - \min_{0 \le j \le n} x_j}{\max_{0 \le j \le n} x_j - \min_{0 \le j \le n} x_j} \tag{19} $$
  • To improve the accuracy of the models, dummy variables were categorized and a factor was provided to show the seasonal effects. Dummy variables, especially categorical dummy variables, have been widely used to enhance the accuracy of modeling and forecasting in different categories [44,45]. One of the factors is the identification of peak consumption hours and the corresponding maximum hour to capture peaks and minimize deviations effectively. To determine these variables, the average consumption for each hour of the day was calculated, and two variables were established as follows: one representing the top 10 h with the highest consumption, and another indicating the single maximum consumption hour for this purpose. It is worth mentioning that peak consumption hours can be determined for each month or season separately. The benefit of this type of variable is that it can be examined individually for each month, as the average consumption hours for each month differ. Making an average for the whole year can negatively impact the model, as it may alter the final average peak without considering the peaks for each month.
  • To address issues related to the periodic nature of the hour of the day, day of the week, month of the year, and seasons of the year along with their discontinuity, trigonometric functions, such as sine and cosine, were employed (Equations (20) and (21)). Through the use of these functions, a smooth wrap-around of values, preserving the cyclic pattern, is guaranteed. As a result, by incorporating trigonometric factors, models like LSTM or other models used for sequential data can effectively capture the cyclic nature of temporal data and learn meaningful relationships between various points in the temporal cycle. It is worth noting that sine and cosine can also enhance the models’ complexity [46,47].
$$ x_{\sin} = \sin\left(\frac{2 \pi x}{\max(x)}\right) \tag{20} $$
$$ x_{\cos} = \cos\left(\frac{2 \pi x}{\max(x)}\right) \tag{21} $$
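Two of the preprocessing steps above, min–max scaling (Equation (19)) and the sine/cosine encoding of cyclic time features (Equations (20) and (21)), can be sketched in pure Python. The function names are illustrative, and the handling of a constant feature (mapped to zeros) is an assumption made here to avoid division by zero.

```python
import math


def min_max_scale(values):
    """Map values linearly to the range [0, 1], Equation (19)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Constant feature: return zeros (assumption for this sketch).
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def cyclic_encode(x, x_max):
    """Return (sin, cos) of 2*pi*x / x_max, Equations (20) and (21)."""
    angle = 2.0 * math.pi * x / x_max
    return math.sin(angle), math.cos(angle)


# Made-up loads scaled to [0, 1].
print(min_max_scale([2.0, 4.0, 6.0]))

# Hours numbered 1-24, so max(x) = 24: hour 24 wraps around next to hour 1.
print(cyclic_encode(6, 24), cyclic_encode(24, 24))
```

If hours are numbered 0–23 instead, dividing by max(x) = 23 maps hour 23 onto the same point as hour 0; in that case the cycle length (24) is often used as the divisor so that the two hours remain distinct.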

3. Results and Discussion

This research focuses on evaluating the effectiveness of the chosen models in predicting the load profiles for the next day by capturing their patterns from historical data. The study examines visual and validation data to identify the best model. Various dummy variables, including peak time, were introduced to enhance the accuracy of the models. Additionally, the models’ accuracy was improved by implementing a lookback of 24 h and using the load from the previous day as input. Other input data were also incorporated to enhance the performance of the models. Because the main purpose of this study is to reduce the risk of grid bottlenecks through forecasting, capturing the peaks was its first priority. Synthetic data and data measured from two grids and a shopping center were used in this study to test the model’s robustness.

3.1. Synthetic Data

In this part, data synthesized with the load profile generator, HP load generator, and EV load generator [48] are used to evaluate the models.

3.1.1. EV Load Forecast

In terms of validation results, EVs were the least predictable category because their patterns are influenced by several factors, such as driving and charging patterns, which can be variable, resulting in less predictable behavior. Moreover, because the EV load is zero for the majority of the time, a small deviation leads to a high error. As shown in Figure 7, for synthetic data, the day-ahead EV load was predictable at an acceptable level, especially with the blended model. One of the reasons for this result may be the unnecessary complexity of LSTM and the simplicity of SVR. The data showed that people start charging at almost the same time and end at almost the same time, but with varying levels of peaks on different days. Consequently, a blended model that contains linear and support vector regressions, along with GRU, can capture the peaks more effectively. On the other hand, more complex models applied to simple data could cause overfitting because a high number of layers are needed to produce precise results [49]. The validation data revealed a slightly higher performance for the blended model, which may be considered negligible due to its weakness in capturing peaks in the total load. The MAPE and SMAPE for different ranges of EV load forecasting are shown in Table 3. As can be seen, on-peak data showed a better SMAPE than the total and off-peak loads, especially for the blended model. The main reason behind this observation could be that each model used in the blended model has a different sensitivity to noise in the data, helping the blended model to capture the pattern more effectively.
Additionally, the results themselves indicated only a slight improvement in error percentage. Furthermore, the presence of zeros in the actual values can sometimes influence the error unpredictably. The reason could be attributed to capturing the two most important patterns: peaks and periods of no consumption. The LSTM and blended models, as observed on the last day, could capture both of these patterns effectively, whereas SVR did not capture them as effectively. Figure 7A also illustrates a more precise model in terms of capturing noise, as observed on the 23rd of December, with a small peak that both SVR and LSTM were unable to capture.

3.1.2. HH Load Prediction

After EVs, household consumption was the second least predictable in terms of validation errors, which could be due to the diverse and unpredictable behavior of consumers in different household types and their varying presence at home.
As evident from the visualized data, the primary source of deviation in the LSTM model for total consumption lies in the minimum consumption values, which leads to a high average error both for the unscaled data and for the last day. Conversely, a strong correlation with the actual data is observed during consumption peaks. For EVs and HHs, however, the LSTM model shows the opposite pattern: there is a good correlation at the minimum values but a significant deviation during peaks. To mitigate these discrepancies, outliers in the data were identified, and a dummy variable was introduced to enhance forecasting performance. It is crucial to handle outliers in a manner that does not affect the unseen data. Specifically, when testing the model, the input data should include relevant features such as season, month, day of the week, hour of the day, and the average maximum consumption representing the peak. Using the maximum value of each individual day can introduce accuracy issues in the input data, potentially leading to artificially optimistic predictions. This occurs because the model is presented with information that should remain unseen during testing.
Although the total MAPE of LSTM for households was around 22% (Table 4), it is still deemed acceptable for several reasons. It should first be noted that this 22% error is only acceptable if the model achieves a low error for total consumption or effectively captures the peaks; otherwise, the model should be rejected or architecturally modified. The first and foremost reason is that one of the primary objectives of the prediction is to identify whether bottlenecks will occur, and this should be studied at the grid level. The second reason is that household patterns are challenging to capture, so it is not surprising that no model matches them well. To study the performance of the models in different power load ranges, the MAPE and SMAPE were measured at the following three levels: on-peak, off-peak, and total (Table 4). During off-peak periods, the household power load was more predictable with SVR than with the other models. This could be because SVR performs better when data patterns are relatively stable or linear, as is often the case during off-peak periods. During on-peak periods, LSTM performed slightly better than SVR and the blended model because the more complex pattern in that range requires a more complex model to capture it.
It is worth noting that when the number of households increases, the impact of individual errors is mitigated to some extent. Finally, apart from the summer and, to some extent, the spring, household consumption has no significant influence at the grid level compared to the other types of consumption, especially HPs. The other evident result (Figure 8) is that, because it blends an RNN with two regression models, the blended model may also capture noise in the data. Figure 8 also indicates slightly better visual performance for LSTM, although the difference is negligible.
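The point about keeping test-day information out of the inputs can be sketched as follows. This is a minimal illustration, not the feature pipeline used in the study: the grouping of the average peak by weekday, the season encoding, and all function and field names are assumptions made for the example.

```python
from datetime import datetime


def daily_max(series):
    """Map each calendar date to the maximum load observed on that date."""
    peaks = {}
    for timestamp, load in series:
        day = timestamp.date()
        peaks[day] = max(load, peaks.get(day, load))
    return peaks


def average_peak_by_weekday(train_series):
    """Average daily maximum per weekday (0 = Monday), from training data only."""
    sums, counts = {}, {}
    for day, peak in daily_max(train_series).items():
        wd = day.weekday()
        sums[wd] = sums.get(wd, 0.0) + peak
        counts[wd] = counts.get(wd, 0) + 1
    return {wd: sums[wd] / counts[wd] for wd in sums}


def build_features(timestamps, weekday_peak):
    """Calendar features plus the training-derived average peak for each weekday."""
    # Weekdays missing from the training data fall back to the overall average.
    fallback = sum(weekday_peak.values()) / len(weekday_peak)
    rows = []
    for ts in timestamps:
        rows.append({
            "month": ts.month,
            "season": (ts.month % 12) // 3,  # 0 = Dec-Feb, 1 = Mar-May, 2 = Jun-Aug, 3 = Sep-Nov
            "day_of_week": ts.weekday(),
            "hour": ts.hour,
            "avg_peak": weekday_peak.get(ts.weekday(), fallback),
        })
    return rows


# Hypothetical training data: two Mondays with daily peaks of 9 and 11.
train = [
    (datetime(2024, 1, 1, 8), 5.0),
    (datetime(2024, 1, 1, 18), 9.0),
    (datetime(2024, 1, 8, 18), 11.0),
]
weekday_peak = average_peak_by_weekday(train)
test_rows = build_features([datetime(2024, 1, 15, 18), datetime(2024, 1, 16, 7)], weekday_peak)
```

The test-day rows receive the average peak learned from the training period only; the actual maximum of the test day never enters the inputs, so the forecast cannot benefit from information that would be unavailable in operation.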

3.1.3. HP Load Prediction

HP consumption was the load category with the highest prediction accuracy (in terms of the validation results) among the three categories, as it depends strongly on temperature and season. However, the pattern is less discernible than in the other categories, as the peaks occur at unpredictable times each day, independently of temperature, time, or other factors (Figure 9).
There is a notable surge in electricity consumption around 7 and 8 AM, followed by a significant decrease between the evening and sleeping time. These two periods are referred to as transient time. This pattern is a result of the specific features of the building, where indoor air temperature needs to reach a comfortable level for occupants, and the residual heat indoors helps maintain an acceptable temperature throughout the day [50]. In addition, due to the thermal inertia of buildings, delayed responses occur in heating and cooling demand, resulting in unexpected variations in energy consumption. This variability can differ from one building to another, leading to peaks occurring at unpredictable times, which may not have a strong correlation with the main inputs.
As mentioned in the previous section, visual and validation factors should be considered together. For HP load prediction, almost all the mentioned errors were caused by peaks. Although the validation results indicated that the blended model outperformed SVR and LSTM, once visual data were also taken into account for model selection, none of the models reached the defined goals. To make this clearer, Table 5 presents the errors for different HP power load ranges. As can be seen, SVR and the ensemble performed better in the off-peak range. This is due to SVR’s lower complexity and to the presence of LR and SVR in the ensemble model, which enables it to capture both linear and non-linear patterns. The MAE, RMSE, and their percentages, as well as the MAPE and SMAPE, showed small differences between the models; the blended model performed best, while SVR was the least effective.

3.1.4. Total Load Forecast

As demonstrated by the results, moving gradually from single to cumulative load at the transmission level increased the predictability of the forecasts. The highest level of prediction accuracy, with the lowest error, was achieved for the combination of energy consumption types. An important aspect to consider is that the forecasting and testing data were collected in the last days of December, including Christmas and Christmas Eve, and it is logical to have some unpredicted deviations. This is also further supported by the difference in the training MAE and the testing MAE. The testing MAE for HH, HP, EV, and total consumption was 0.863, 24.98, 32.12, and 47.19, respectively, while the training MAE was 0.66, 22.61, 29.34, and 39.38, respectively. Nevertheless, these types of deviations can be mitigated by incorporating several years of historical data to train the models.
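The gap between training and testing errors can be put into relative terms with a short calculation. The MAE values are those reported in the text; the relative-gap measure itself is an illustrative choice made here and is not a metric used in the study.

```python
# MAE values as reported in the text for each load category.
train_mae = {"HH": 0.66, "HP": 22.61, "EV": 29.34, "total": 39.38}
test_mae = {"HH": 0.863, "HP": 24.98, "EV": 32.12, "total": 47.19}


def relative_gap(train, test):
    """Increase of the testing MAE over the training MAE, in percent."""
    return 100.0 * (test - train) / train


gaps = {name: round(relative_gap(train_mae[name], test_mae[name]), 1) for name in train_mae}
# gaps == {"HH": 30.8, "HP": 10.5, "EV": 9.5, "total": 19.8}
```

Under this measure, the testing MAE exceeds the training MAE by roughly 10 to 31%, which is in line with the expectation that the holiday period at the end of December introduces deviations not seen during training.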
One of the most important factors to consider is the emphasis on peaks. This is significant because one of the main objectives of load profile forecasting is to determine whether or not a bottleneck (at the feeder level) is likely to occur during peaks [51]. The forecasting errors during peaks and off-peaks are shown in Table 6. As can be seen, the MAPE and SMAPE of the blended model for the whole period were lower than those of LSTM and SVR. This performance was expected because the models in the blend have different sensitivities, which allows the blend to capture small fluctuations relatively more effectively. However, LSTM was more robust during the on-peak period owing to its ability to capture complex, non-linear patterns with long-term dependencies. This makes LSTM relatively better at capturing rapid fluctuations and unpredictable behavior. For SVR (Figure 10B) and the blended model (Figure 10A), most of the errors are caused by the peaks, with only a small share occurring at the lowest load. For LSTM (Figure 10C), however, the lowest load caused a significant share of the error. This is logical: when the actual value is at its lowest, even a slight difference in the forecast results in a high relative error. As shown in Figure 10A–C, the blended model captured noise more effectively than the other models. For instance, on the 23rd and 26th of December, small peaks occurred before the main peaks; LSTM and SVR could not capture them, whereas the blended model attempted to. This is consistent with the general assumption that blending models increases complexity [52], which produces several minor fluctuations and peaks alongside the major ones. The same behavior was observed in EV load forecasting. Sensitivity and specificity were two other factors studied in this part of the research.
As mentioned earlier, the most crucial aspect that the model should capture is the period when the peak load occurs. Investigating whether the captured peaks truly reflect the model’s behavior is crucial in this case. This analysis helps determine whether the peaks were identified accurately by the model or whether they occurred by chance. One factor that should be considered is the threshold, defined as the grid capacity, which indicates the level at which the load is considered either peak or off-peak. As this part used the synthetic data, the threshold could be varied to determine the sensitivity and specificity at different thresholds.
The aforementioned results were further supported by validation, as presented in Table 7. LSTM demonstrated higher overall performance, with notable effectiveness in metrics such as the MAE and RMSE. As expected, SVR exhibited the lowest accuracy, with an MAE exceeding 60.
LSTM is more effective than the other models in managing highly non-linear and time-dependent load patterns, especially for predicting peak loads. However, simpler models like SVR can occasionally yield similar or even better results during more stable periods, such as off-peak times, because they are less sensitive to minor fluctuations. The main reason for studying the effectiveness of the models at three levels was to show their performance in different load ranges and in handling outliers. During off-peak periods, SVR outperformed LSTM across all categories, especially for households, where its performance was superior to that of both LSTM and the blended model.
Another point is that the validation results align with the visual observations. For instance, SVR performed better in the visual data during the off-peak period, which agrees with the validation results. Moreover, for household forecasting, the blended model showed the lowest validation performance during off-peak periods, which the visual data confirmed: it failed to capture off-peak patterns during some hours of the first days (Figure 8).
Figure 11 shows the ROC, along with the specificity results for the LSTM. The model’s sensitivity was in the perfect range, indicating that the peaks were not random and most of them were true positives. Moreover, off-peak periods were caught almost correctly, with almost no random influence.
$$\mathrm{Sensitivity} = \frac{TP}{TP + FP}$$
$$\mathrm{Specificity} = \frac{TN}{TN + FN}$$
TP = the model identifies the peaks correctly.
TN = the model identifies the off-peaks correctly.
FP = the model identifies the peaks incorrectly.
FN = the model identifies the off-peaks incorrectly.
In the study of peak load prediction, a meticulous examination of the behavior of all models was undertaken, with a focus on their ability to identify load peaks accurately. The critical aspect we sought to capture was the time period during which peak loads occur, a fundamental requirement for efficient energy management and grid stability. In this context, an investigation was carried out to determine whether the model’s detected peaks genuinely reflected its behavior or whether they were mere chance occurrences.
Researchers have used several methods to evaluate forecasting performance, such as heat maps, which show the parts with the highest error, and uncertainty bands, which provide confidence intervals. Here, however, a ROC-like curve, as depicted in Figure 11, was used for threshold-based evaluation. The ROC-like curve provides valuable insights into the model’s performance in terms of peak catching. The blue curve represents the model’s performance across varying thresholds. As the threshold is changed, the trade-off between sensitivity (true positive rate) and specificity (true negative rate) becomes evident. Ideally, the curve should hug the top-left corner, indicating high sensitivity and high specificity simultaneously. Toward the bottom right, the results move into the worst range. Between these two ranges, the predictions are no better than random.
The LSTM model is distinguished by its ability to capture true peak values. Sensitivity remains in the “perfect” range, where peaks are identified without chance. True positives dominated, ensuring that actual peak load periods were correctly flagged. The choice of threshold (grid capacity) significantly impacts sensitivity and specificity. Fine-tuning this threshold is crucial for striking the right balance between peak detection accuracy and false alarms. Increasing the threshold reduces the true positive rate, while the false positive rate remains almost unchanged. The best threshold for all three models was 450. However, given the typical peaks during the year and the ROC-like curve results, the recommended threshold would be around 533–550. At these thresholds, SVR performed weakly and the blended model modestly, while LSTM achieved a high true positive rate.
In the end, achieving higher robustness is of great importance. A forecast that can be repeatedly adjusted to keep it flexible would be a considerable step toward a robust model. It was assumed that people would consume a certain amount of energy in the following 24 h. The collective or total power demand was therefore forecast, and the power load was then integrated to determine the energy demand. This energy will be used on the measured day but will be influenced by incentives and price offers. It was assumed that people would charge their EVs during the following day, but at times that can differ from those expected. Energy consumption is therefore more useful than the load for showing the amount of energy demand. Once a lower price is offered, people will try to charge their EVs during that time, which can lead to a further peak or a shift in the peak. A recommendation for controlling the peak is made at the end of this research. As seen in Figure 12, every 8 h, the energy consumption trend was adjusted to compensate for the increase or decrease in energy over the past 8 h. The reduction or increase in each following hour depends on the energy level of that hour and is applied linearly. For instance, at hour 19, when the energy consumption was 553 kWh, the reduction to compensate for the higher energy usage in the first 16 h was almost twice as much as at hour 23, which had 281 kWh of energy consumed during that hour. These dependencies are defined mathematically in Equations (24)–(27). The linear behavior was chosen because no actual data were available after the price was offered. If such offers and actual data become available for a few days, this behavior can be modified to make the forecast more precise. It is worth mentioning that the first and second sets of actual data used for the adjustment were based on assumptions.
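The threshold-based evaluation can be sketched in a few lines. The sketch reads sensitivity as the true positive rate (the share of actual peak hours that the forecast also flags) and specificity as the true negative rate (the share of actual off-peak hours that the forecast leaves unflagged), as stated in the text. The function names and the load values are hypothetical; only the thresholds 450 and 550 are taken from the discussion above.

```python
def peak_detection_rates(actual, forecast, threshold):
    """Return (sensitivity, specificity) for one threshold.

    An hour counts as a peak when its load exceeds the threshold.
    """
    peaks_caught = peaks_missed = offpeaks_caught = false_alarms = 0
    for a, f in zip(actual, forecast):
        if a > threshold:
            if f > threshold:
                peaks_caught += 1
            else:
                peaks_missed += 1
        else:
            if f > threshold:
                false_alarms += 1
            else:
                offpeaks_caught += 1
    actual_peaks = peaks_caught + peaks_missed
    actual_offpeaks = offpeaks_caught + false_alarms
    sensitivity = peaks_caught / actual_peaks if actual_peaks else float("nan")
    specificity = offpeaks_caught / actual_offpeaks if actual_offpeaks else float("nan")
    return sensitivity, specificity


def roc_like_curve(actual, forecast, thresholds):
    """List of (threshold, true positive rate, false positive rate) points."""
    points = []
    for thr in thresholds:
        sens, spec = peak_detection_rates(actual, forecast, thr)
        points.append((thr, sens, 1.0 - spec))
    return points


# Hypothetical hourly loads and forecasts.
actual = [300, 480, 560, 600, 420, 350]
forecast = [310, 470, 540, 610, 460, 340]
curve = roc_like_curve(actual, forecast, thresholds=[450, 550])
```

In this toy series, raising the threshold from 450 to 550 lowers the true positive rate from 1.0 to 0.5 while the false positive rate falls from about 0.33 to 0, which shows how a single threshold choice trades missed peaks against false alarms.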
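$$E_f = \int_0^T P_f(t)\,\mathrm{d}t \tag{24}$$
$$E_t = \alpha\left(E_A(t) - E_f(t)\right) \tag{25}$$
$$\alpha(t) = \frac{E_f(t)}{\sum_{t=1}^{T} E_f(t)} \tag{26}$$
$$E_a(t) = \alpha(t)\left(E_t - E_{cf}(t)\right) \tag{27}$$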
$E_f$ = forecasted energy.
$\sum_{t=1}^{T} E_f(t)$ = the forecasted parts that are not yet revealed. For the first adjustment, it is the sum over the last 16 h, and for the second adjustment, it is the sum of the energy forecasted for the last 8 h.
$E_A$ = actual energy.
$\alpha(t)$ = adjustment factor.
$E_t$ = total energy.
$E_a$ = adjusted energy.
$E_{cf}$ = cumulative sum of forecasted energy up to hour $t$.
It is evident that the adjustment factor is affected by the actual energy consumption, leading to a proportional impact on the adjustment.
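One reading of the correction mechanism described in this section is sketched below: the energy consumed above or below the forecast in the elapsed hours is spread over the remaining hours in proportion to each hour’s share of the remaining forecast, so that the assumed daily energy is preserved. The function name, the data, and the assumption that the daily total stays fixed are illustrative and may differ from the authors’ implementation.

```python
def adjust_remaining_forecast(forecast_energy, actual_energy_so_far):
    """Redistribute the deviation observed so far over the remaining hours.

    forecast_energy: hourly forecast energy for the whole day (kWh).
    actual_energy_so_far: measured hourly energy for the elapsed hours (kWh).
    Returns the adjusted forecast for the remaining hours.
    """
    elapsed = len(actual_energy_so_far)
    remaining = forecast_energy[elapsed:]
    # Energy consumed above (+) or below (-) the forecast so far.
    deviation = sum(actual_energy_so_far) - sum(forecast_energy[:elapsed])
    remaining_total = sum(remaining)
    if remaining_total == 0:
        return list(remaining)
    adjusted = []
    for e_f in remaining:
        alpha = e_f / remaining_total  # share of this hour in the remaining forecast
        adjusted.append(e_f - alpha * deviation)
    return adjusted


# Hypothetical four-step example: 30 kWh too much was used in the first two steps.
forecast = [100.0, 100.0, 200.0, 400.0]
actual_so_far = [120.0, 110.0]
adjusted = adjust_remaining_forecast(forecast, actual_so_far)
```

Here the step with twice the forecast energy absorbs twice the reduction (20 kWh versus 10 kWh), which matches the proportional behavior described for hours 19 and 23. In a 24 h setting, the function would be called after hour 8 and again after hour 16 with the measurements available at that time.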

3.2. Measured Data

In this segment, three grids and a shopping center located in various regions of Austria were utilized to forecast the load profiles one day ahead based on the measured data. Forecasting, in this part of the study, is performed on the level of overall loads, meaning the substation level.
While the synthetic dataset covered historical data for one year, it is important to note that, due to certain constraints, the historical data for these grids was limited to 2–3 months. As a result, the dummy variables in this segment differed slightly from the previous one. Specifically, the impact of the month on the results was not evident due to the limited range of conditions. Despite these limitations, the visual and validation results examined demonstrated acceptable accuracy, especially with LSTM.
After the superior performance of LSTM compared to the other models had been validated, it was employed to forecast the one-day-ahead load profile of measured data from two different grids (A, B), as well as from a shopping center located in Austria. As anticipated, LSTM predicted the one-day-ahead load with an acceptable level of accuracy (Table 8). In some instances, the model’s performance even surpassed that achieved with the synthetic training data.
  • The performances at the substation and feeder levels of grid A show distinct differences. The substation level demonstrates higher accuracy, not only in capturing peaks but also in the validation results. As discussed earlier, the main reason for this phenomenon is that transitioning from a low to a higher consumption level improves forecasting accuracy. Moving from a low to a high consumption level leads to a focus on a higher level of consumption, which fluctuates less than the consumption at the low level.
  • On the other hand, grid B exhibited the lowest accuracy in the validation results. As depicted in Figure 13, peaks were either captured or overestimated, that is, the forecast exceeded the actual value. This deviation would not adversely affect the model selection; however, it did decrease accuracy in the validation results.
One of the main reasons for the lower robustness of the measured data compared to the synthetic data lies in the length of the dataset. In the synthetic data, a year of data was used, while the measured data did not exceed 2 months. In this case, finding the weight of each input could be more difficult for the model, and the seasonal effect is less predictable than in a longer dataset.
  • The shopping center’s consumption exhibited the highest level of accuracy compared to the grids. This was anticipated, given that consumption in shopping centers tends to fluctuate less, with the day type being one of the most significant characteristics. More specifically, the opening and closing times of shopping centers determine the consumption level, and the day type indicates whether or not that consumption will be reached [53]. It is worth noting that working days and hours vary between countries; for example, in the Czech Republic and Slovakia (probably open until 9 p.m.), shopping centers are open on Sundays, whereas in Austria and Germany (probably open until 7 p.m.), shopping centers remain closed on Sundays.

4. Conclusions

This research investigated the performance of different models in forecasting 24 h-ahead load profiles for different load categories, both individually and collectively. The key finding is that the 24 h-ahead load profile can be forecast with good, though varying, performance across the models. Moreover, periods with the highest load are detected correctly, as shown by the ROC-like curve results. The models examined in this study are among the most commonly used for forecasting. The findings of this research are summarized as follows:
The influence of different variables depends on several factors, including the dataset type and the length of the data. Forecasting accuracy tends to be higher for real data because its pattern is easier to capture, especially for datasets with a larger number of houses, grids, or cities.
In this study, LSTM demonstrated its ability to capture complex patterns due to its sophisticated architecture, whereas simpler patterns were more effectively modeled and predicted by less complex approaches such as SVR and GRU. Moreover, blended models that combine different models can sometimes capture small peaks or noise, which are hard to catch with a single model, rather than the main peaks. It is worth noting that an effective blended model needs to be designed carefully beforehand so that its complexity matches the specific characteristics of the dataset under analysis.
Similarly, in most cases, visual data should be considered alongside the validation data. This approach helps balance error minimization with the primary research objectives. This research focused on peak capturing, which is why visual data are more crucial to consider than validation data. A ROC-like curve was also implemented to validate peak capturing. The validation results clearly showed that LSTM captured peaks most effectively, while the blended model was second best and has the potential to be improved in the future. Evaluation metrics further demonstrated the performance of LSTM in capturing overall consumption trends.
Regarding the blended model and its performance, the results were promising, although LSTM outperformed it. The proposed blended model showed that different models may be better suited for different load patterns. For example, a hybrid model that contains SVR and LR along with deep learning could catch the small peaks more accurately than LSTM, as the contribution of each model could enhance its performance. However, for further development, it is highly recommended that researchers apply weighting techniques based on the nature of their dataset.
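The recommendation to apply weighting techniques can be illustrated with a minimal sketch. It assumes inverse-error weighting, in which each model’s weight is inversely proportional to its validation MAE; this is one common option and is not the blending scheme documented in this study. The model names, error values, and predictions are hypothetical.

```python
def inverse_error_weights(validation_mae):
    """Weights inversely proportional to each model's validation MAE, summing to 1."""
    inverse = {name: 1.0 / mae for name, mae in validation_mae.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def blend(predictions, weights):
    """Weighted average of the per-model forecasts, hour by hour."""
    n_hours = len(next(iter(predictions.values())))
    return [
        sum(weights[name] * predictions[name][i] for name in predictions)
        for i in range(n_hours)
    ]


# Hypothetical validation errors and forecasts for two hours.
validation_mae = {"svr": 4.0, "gru": 2.0, "lr": 4.0}
predictions = {"svr": [10.0, 20.0], "gru": [12.0, 24.0], "lr": [8.0, 16.0]}
weights = inverse_error_weights(validation_mae)
blended = blend(predictions, weights)
```

With these values, the GRU forecast receives half of the total weight because its validation error is half that of the other two models. Computing separate weights for on-peak and off-peak hours would be a natural extension when different models suit different load ranges.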
One crucial part of this study showed that the influence of human behavior, which is an important factor in the energy sector based on managing energy market prices, can be more detectable by adjusting forecasting through real-time implementation. This step can be improved further by adjusting the forecast every hour for highly volatile loads such as EV charging. However, this requires real-time data from the smart meter and grid response.
Several factors should be considered to improve LSTM’s performance. One of the most important factors is the number of backward steps that the model should consider, as it can significantly influence the model’s performance in increasing and decreasing patterns. Dummy variables are also important components that can be created for most datasets to help models capture patterns. Although the dummy variables in this research were created to correspond with load peaks (in line with our research purpose), they could be generated from different perspectives.
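The role of the number of backward steps can be made concrete with a small windowing sketch. The function name, the 24-step horizon default, and the sample series are assumptions for illustration; the lookback length used in the study is not restated here.

```python
def make_windows(series, lookback, horizon=24):
    """Split a load series into (input window, target window) pairs.

    lookback: number of past steps the model sees.
    horizon: number of future steps to forecast.
    """
    samples = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + lookback]
        targets = series[start + lookback:start + lookback + horizon]
        samples.append((inputs, targets))
    return samples


# Hypothetical short series: 3 backward steps, 2 steps ahead.
samples = make_windows(list(range(10)), lookback=3, horizon=2)
```

A longer lookback gives the model more context for rising and falling patterns but yields fewer training samples from the same series, so the choice is a trade-off that is worth tuning per dataset.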
As an outlook, it would be useful to ensure that the model operates at its highest performance. In this regard, model architecture should be considered along with data processing. For LSTM, like other neural network models, the number of neurons plays a vital role. Moreover, training and testing set losses should be studied to ensure the model avoids overfitting and underfitting. To improve forecasting performance, hyperparameter optimization methods such as Bayesian optimization, random search, and evolutionary algorithms are of real value, especially for renewable energy forecasting. To improve the transparency of the model’s features, explainable AI (XAI) methods, such as Shapley Additive Explanations (SHAP) or Local Interpretable Model-agnostic Explanations (LIME), can provide insights into feature importance and model behavior. In addition, feature selection techniques such as correlation analysis, principal component analysis, and recursive feature elimination would improve forecasting performance in future research. Using early stopping (with different patience values) is highly recommended to prevent overfitting. Furthermore, researchers should consider the saturation-prone forget gate, which is typical for several types of datasets. From another perspective, a formulation or optimization procedure to eliminate peaks that could exceed the thresholds would be of real value. Given that EVs can sharply increase peaks, implementing a booking system for EV owners to schedule their charging times (with defined power loads) could be an effective solution to control the power load. Moreover, implementing flexible pricing is also a way to keep peaks within safe limits. In this case, pricing will be determined by the threshold, which is the generation capacity. Each generation can have a different price.
For example, if the current level is lower than 50% of the threshold, the price will be different from when the level is between 60 and 70% of the capacity. This would encourage EV owners to check the price repeatedly and charge their vehicles when they find a suitable price. A key element of this strategy is that once a charging session begins, the price remains fixed for that session, ensuring that users are charged at the initially defined rate.
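The pricing idea can be sketched as a tiered tariff whose price is fixed when a charging session starts. All tier boundaries and prices below are hypothetical placeholders chosen for illustration; the text only states that the price should differ between utilization ranges such as below 50% and 60 to 70% of the capacity.

```python
# Hypothetical tiers: (upper bound of load / capacity, price in EUR per kWh).
PRICE_TIERS = [(0.5, 0.20), (0.6, 0.25), (0.7, 0.30), (1.0, 0.40)]
PRICE_ABOVE_CAPACITY = 0.60  # hypothetical price once the threshold is reached


def price_for_load(load_kw, capacity_kw):
    """Return the price of the tier that contains the current utilization."""
    utilization = load_kw / capacity_kw
    for upper_bound, price in PRICE_TIERS:
        if utilization < upper_bound:
            return price
    return PRICE_ABOVE_CAPACITY


class ChargingSession:
    """A session whose price is locked at the moment charging begins."""

    def __init__(self, load_kw, capacity_kw):
        self.price = price_for_load(load_kw, capacity_kw)

    def cost(self, energy_kwh):
        """Cost of the session at the locked price, regardless of later load."""
        return self.price * energy_kwh


# A session started at 40% utilization keeps its price even if the load rises later.
session = ChargingSession(load_kw=200.0, capacity_kw=500.0)
later_price = price_for_load(325.0, 500.0)
```

Locking the price at session start means that a user who plugs in during a low-load period is not penalized if the grid load rises afterwards, which keeps the incentive to start charging when utilization is low.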

5. Declaration of Generative AI and AI-Assisted Technology Use in the Writing Process

Regarding the use of GenAI tools, ChatGPT 4o was used solely to correct grammatical mistakes in some parts of the manuscript to improve its readability. The authors subsequently reviewed these sections to ensure the accuracy of the meaning and the information provided, especially where technical terms were involved. The authors take full responsibility for the content of this publication.

Author Contributions

Conceptualization, A.F. and J.V.-W.; methodology, A.F. and J.V.-W.; software, A.F.; validation, A.F., J.V.-W. and T.K.; investigation, A.F.; resources, J.V.-W.; data curation, A.F.; writing—original draft preparation, A.F.; writing—review and editing, A.F., J.V.-W. and T.K.; visualization, A.F. and J.V.-W.; supervision: T.K.; project administration, J.V.-W.; funding acquisition, J.V.-W. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Austrian Climate and Energy Funds (Klima- und Energiefonds) for the ‘Friendly Charge Project’ (No. 899917).

Data Availability Statement

Data can be made available upon request due to restrictions (project policy).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ANN: Artificial Neural Network
ANFIS: Adaptive Neuro-Fuzzy Inference System
LSTM: Long Short-Term Memory
BPTT: Backpropagation Through Time
DELM: Deep Extreme Learning Machine
EV: Electric Vehicles
ERCOT: Electric Reliability Council of Texas
EDFs: Energy Demand Forecasts
ELM: Extreme Learning Machine
XAI: Explainable AI
FN: False Negative
FP: False Positive
GRU: Gated Recurrent Units
GRNN: Generalized Regression Neural Network
HP: Heat Pump
HHs: Households
LIME: Local Interpretable Model-agnostic Explanations
LR: Linear Regression
MAE: Mean Absolute Error
MAPE: Mean Absolute Percentage Error
MLK: Masking Layer in Keras
MSE: Mean Squared Error
NaN: Not a Number
RBF: Radial Basis Function
ROC: Receiver Operating Characteristic
ReLU: Rectified Linear Unit
RNN: Recurrent Neural Networks
RES: Renewable Energy Sources
RMSE: Root Mean Squared Error
SMO: Sequential Minimal Optimization
SVM: Support Vector Machines
SVR: Support Vector Regression
SMAPE: Symmetric Mean Absolute Percentage Error
SHAP: Shapley Additive Explanations
TN: True Negative
TP: True Positive

References

  1. Yeom, S.; An, J.; Hong, T.; Koo, C.; Jeong, K.; Lee, J. Managing energy consumption and indoor environment quality using augmented reality based on the occupants’ satisfaction and characteristics. Energy Build. 2024, 311, 114165. [Google Scholar] [CrossRef]
  2. Bujalski, M.; Madejski, P.; Fuzowski, K. Day-ahead heat load forecasting during the off-season in the district heating system using Generalized Additive model. Energy Build. 2023, 278, 112630. [Google Scholar] [CrossRef]
  3. Kong, W.; Dong, Z.Y.; Jia, Y.; Hill, D.J.; Xu, Y.; Zhang, Y. Short-Term Residential Load Forecasting Based on LSTM Recurrent Neural Network. IEEE Trans. Smart Grid 2019, 10, 841–851. [Google Scholar] [CrossRef]
  4. Wang, Y.; Gao, N.; Hug, G. Personalized Federated Learning for Individual Consumer Load Forecasting. CSEE J. Power Energy Syst. 2023, 9, 326–330. [Google Scholar] [CrossRef]
  5. Wang, C.; Li, X.; Li, H. nRole of input features in developing data-driven models for building thermal demand forecast. Energy Build. 2022, 277, 112593. [Google Scholar] [CrossRef]
  6. Mendes, N.; Mendes, J.; Mohammadi, J.; Moura, P. Federated learning framework for prediction of net energy demand in transactive energy communities. Sustain. Energy Grids Netw. 2024, 40, 101522. [Google Scholar] [CrossRef]
  7. Bui, D.K.; Nguyen, T.N.; Ngo, T.D.; Nguyen-Xuan, H. An artificial neural network (ANN) expert system enhanced with the electromagnetism-based firefly algorithm (EFA) for predicting the energy consumption in buildings. Energy 2020, 190, 116370. [Google Scholar] [CrossRef]
  8. Kanthila, C.; Boodi, A.; Marszal-Pomianowska, A.; Beddiar, K.; Amirat, Y.; Benbouzid, M. Enhanced multi-horizon occupancy prediction in smart buildings using cascaded Bi-LSTM models with integrated features. Energy Build. 2024, 318, 114442. [Google Scholar] [CrossRef]
  9. Xie, J.; Zhong, Y.; Xiao, T.; Wang, Z.; Zhang, J.; Wang, T.; Schuller, B.W. A multi-information fusion model for short term load forecasting of an architectural complex considering spatio-temporal characteristics. Energy Build. 2022, 277, 112566. [Google Scholar] [CrossRef]
  10. Wang, Y.; Gan, D.; Sun, M.; Zhang, N.; Lu, Z.; Kang, C. Probabilistic individual load forecasting using pinball loss guided LSTM. Appl. Energy 2019, 235, 10–20. [Google Scholar] [CrossRef]
  11. Rick, R.; Berton, L. Energy forecasting model based on CNN-LSTM-AE for many time series with unequal lengths. Eng. Appl. Artif. Intell. 2022, 113, 104998. [Google Scholar] [CrossRef]
  12. Hu, W.; Wang, X.; Tan, K.; Cai, Y. Digital twin-enhanced predictive maintenance for indoor climate: A parallel LSTM-autoencoder failure prediction approach. Energy Build. 2023, 301, 113738. [Google Scholar] [CrossRef]
  13. Liu, X.; Lin, Z.; Feng, Z. Short-term offshore wind speed forecast by seasonal ARIMA—A comparison against GRU and LSTM. Energy 2021, 227, 120492. [Google Scholar] [CrossRef]
  14. Yang, S.; Yu, X.; Zhou, Y. LSTM and GRU Neural Network Performance Comparison Study: Taking Yelp Review Dataset as an Example. In Proceedings of the 2020 International Workshop on Electronic Communication and Artificial Intelligence, IWECAI 2020, Shanghai, China, 12–14 June 2020; Institute of Electrical and Electronics Engineers Inc.: Piscataway, NJ, USA, 2020; pp. 98–101. [Google Scholar] [CrossRef]
  15. Fayaz, M.; Kim, D. A prediction methodology of energy consumption based on deep extreme learning machine and comparative analysis in residential buildings. Electronics 2018, 7, 222. [Google Scholar] [CrossRef]
  16. Wang, W.; Hong, T.; Xu, X.; Chen, J.; Liu, Z.; Xu, N. Forecasting district-scale energy dynamics through integrating building network and long short-term memory learning algorithm. Appl. Energy 2019, 248, 217–230. [Google Scholar] [CrossRef]
  17. Hossain, M.S.; Mahmood, H. Short-Term Load Forecasting Using an LSTM Neural Network. In Proceedings of the 2020 lEEE Power and Energy Conference at llinois (PECl), Champaign, IL, USA, 27–28 February 2020; IEEE: Piscataway, NJ, USA, 2020. [Google Scholar]
  18. Jin, N.; Yang, F.; Mo, Y.; Zeng, Y.; Zhou, X.; Yan, K.; Ma, X. Highly accurate energy consumption forecasting model based on parallel LSTM neural networks. Adv. Eng. Inform. 2022, 51, 101442. [Google Scholar] [CrossRef]
  19. Dubey, A.K.; Kumar, A.; García-Díaz, V.; Sharma, A.K.; Kanhaiya, K. Study and analysis of SARIMA and LSTM in forecasting time series data. Sustain. Energy Technol. Assess. 2021, 47, 101474. [Google Scholar] [CrossRef]
  20. Du, R.; Chen, H.; Yu, M.; Li, W.; Niu, D.; Wang, K.; Zhang, Z. 3DTCN-CBAM-LSTM short-term power multi-step prediction model for offshore wind power based on data space and multi-field cluster spatio-temporal correlation. Appl. Energy 2024, 376, 124169. [Google Scholar] [CrossRef]
  21. Smagulova, K.; James, A.P. A survey on LSTM memristive neural network architectures and applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324. [Google Scholar] [CrossRef]
  22. Landi, F.; Baraldi, L.; Cornia, M.; Cucchiara, R. Working Memory Connections for LSTM. Neural Netw. 2021, 144, 334–341. [Google Scholar] [CrossRef]
  23. Karijadi, I.; Chou, S.Y. A hybrid RF-LSTM based on CEEMDAN for improving the accuracy of building energy consumption prediction. Energy Build. 2022, 259, 111908. [Google Scholar] [CrossRef]
  24. Fan, D.; Sun, H.; Yao, J.; Zhang, K.; Yan, X.; Sun, Z. Well production forecasting based on ARIMA-LSTM model considering manual operations. Energy 2021, 220, 119708. [Google Scholar] [CrossRef]
  25. Cai, C.; Tao, Y.; Zhu, T.; Deng, Z. Short-term load forecasting based on deep learning bidirectional lstm neural network. Appl. Sci. 2021, 11, 8129. [Google Scholar] [CrossRef]
  26. Somu, N.; Raman M R, G.; Ramamritham, K. A deep learning framework for building energy consumption forecast. Renew. Sustain. Energy Rev. 2021, 137, 110591. [Google Scholar] [CrossRef]
  27. Wei, N.; Li, C.; Peng, X.; Zeng, F.; Lu, X. Conventional models and artificial intelligence-based models for energy consumption forecasting: A review. J. Pet. Sci. Eng. 2019, 181, 106187. [Google Scholar] [CrossRef]
  28. Ju, J.; Liu, F.A. Multivariate time series data prediction based on ATT-LSTM network. Appl. Sci. 2021, 11, 9373. [Google Scholar] [CrossRef]
  29. Hewamalage, H.; Ackermann, K.; Bergmeir, C. Forecast evaluation for data scientists: Common pitfalls and best practices. Data Min. Knowl. Discov. 2023, 37, 788–832. [Google Scholar] [CrossRef]
  30. Zucchet, N.; Orvieto, A. Recurrent neural networks: Vanishing and exploding gradients are not the end of the story. Adv. Neural Inf. Process. Syst. 2024, 37, 139402–139443. Available online: http://arxiv.org/abs/2405.21064 (accessed on 9 April 2025).
  31. Zou, L.; Munir, M.S.; Kim, K.; Hong, C.S. Day-ahead Energy Sharing Schedule for the P2P Prosumer Community Using LSTM and Swarm Intelligence. In Proceedings of the 2020 International Conference on Information Networking (ICOIN), Barcelona, Spain, 7–10 January 2020; pp. 396–401. [Google Scholar] [CrossRef]
  32. Chamatidis, I.; Tzanes, G.; Istrati, D.; Lagaros, N.D.; Stamou, A. Short-Term Forecasting of Rainfall Using Sequentially Deep LSTM Networks: A Case Study on a Semi-Arid Region. Environ. Sci. Proc. 2023, 26, 157. [Google Scholar] [CrossRef]
  33. Nguyen, B.N.; Ogliari, E.; Pafumi, E.; Alberti, D.; Leva, S.; Duong, M.Q. Forecasting Generating Power of Sun Tracking PV Plant using Long-Short Term Memory Neural Network Model: A case study in Ninh Thuan—Vietnam, ICCE 2024. In Proceedings of the 2024 IEEE 10th International Conference on Communications and Electronics, Danang, Vietnam, 31 July–2 August 2024; pp. 333–338. [Google Scholar] [CrossRef]
  34. Sultana, N.; Hossain, S.M.Z.; Abusaad, M.; Alanbar, N.; Senan, Y.; Razzak, S.A. Prediction of biodiesel production from microalgal oil using Bayesian optimization algorithm-based machine learning approaches. Fuel 2022, 309, 122184. [Google Scholar] [CrossRef]
  35. Zhang, Y.; Wang, Q.; Chen, X.; Yan, Y.; Yang, R.; Liu, Z.; Fu, J. The Prediction of Spark-Ignition Engine Performance and Emissions Based on the SVR Algorithm. Processes 2022, 10, 312. [Google Scholar] [CrossRef]
  36. Ahmed, N.; Assadi, M.; Zhang, Q. Investigating the impact of borehole field data’s input parameters on the forecasting accuracy of multivariate hybrid deep learning models for heating and cooling. Energy Build. 2023, 301, 113706. [Google Scholar] [CrossRef]
  37. Huang, X.; Han, Y.; Yan, J.; Zhou, X. Hybrid forecasting model of building cooling load based on EMD-LSTM-Markov algorithm. Energy Build. 2024, 321, 114670. [Google Scholar] [CrossRef]
  38. Zhang, S.; Gurusamy, S.; James-Chakraborty, K.; Basu, B. Short-term office temperature forecasting through a data-driven approach integrated with bidirectional gated recurrent neural network. Energy Build. 2024, 314, 114231. [Google Scholar] [CrossRef]
  39. Shi, J.; Teh, J. Load forecasting for regional integrated energy system based on complementary ensemble empirical mode decomposition and multi-model fusion. Appl. Energy 2024, 353, 122146. [Google Scholar] [CrossRef]
  40. ALMahadin, G.; Aoudni, Y.; Shabaz, M.; Agrawal, A.V.; Yasmin, G.; Alomari, E.S.; Al-Khafaji, H.M.R.; Dansana, D.; Maaliw, R.R. VANET Network Traffic Anomaly Detection Using GRU-Based Deep Learning Model. IEEE Trans. Consum. Electron. 2023, 70, 4548–4555. [Google Scholar] [CrossRef]
  41. Li, B.; Wu, Y.; Song, J.; Lu, R.; Li, T.; Zhao, L. DeepFed: Federated Deep Learning for Intrusion Detection in Industrial Cyber-Physical Systems. IEEE Trans. Ind. Inform. 2021, 17, 5615–5624. [Google Scholar] [CrossRef]
  42. Yu, H.; Zhong, F.; Du, Y.; Xie, X.; Wang, Y.; Zhang, X.; Huang, S. Short-term cooling and heating loads forecasting of building district energy system based on data-driven models. Energy Build. 2023, 298, 113513. [Google Scholar] [CrossRef]
  43. Yan, M.; Liang, X.M.; Lu, Z.H.; Wu, J.; Zhang, W. HANSEL: Adaptive horizontal scaling of microservices using Bi-LSTM. Appl. Soft Comput. 2021, 105, 107216. [Google Scholar] [CrossRef]
  44. Wang, Z.X.; He, L.Y.; Zhao, Y.F. Forecasting the seasonal natural gas consumption in the US using a gray model with dummy variables. Appl. Soft Comput. 2021, 113, 108002. [Google Scholar] [CrossRef]
  45. Zhao, D.; Liu, Y.; Chen, H. Are Mini and full-size electric vehicle adopters satisfied? An application of the regression with dummy variables. Travel Behav. Soc. 2024, 35, 100744. [Google Scholar] [CrossRef]
  46. Khaled, A.; Elsir, A.M.T.; Shen, Y. GSTA: Gated spatial–temporal attention approach for travel time prediction. Neural Comput. Appl. 2022, 34, 2307–2322. [Google Scholar] [CrossRef]
  47. Weerakody, P.B.; Wong, K.W.; Wang, G. Cyclic Gate Recurrent Neural Networks for Time Series Data with Missing Values. Neural Process. Lett. 2023, 55, 1527–1554. [Google Scholar] [CrossRef]
  48. Thormann, B.; Kienberger, T. Evaluation of grid capacities for integrating future E-Mobility and heat pumps into low-voltage grids. Energies 2020, 13, 5083. [Google Scholar] [CrossRef]
  49. Al Mamun, A.; Sohel, M.; Mohammad, N.; Sunny, M.S.H.; Dipta, D.R.; Hossain, E. A Comprehensive Review of the Load Forecasting Techniques Using Single and Hybrid Predictive Models. IEEE Access 2020, 8, 134911–134939. [Google Scholar] [CrossRef]
  50. Sun, S.; Chen, H. Data-driven sensitivity analysis and electricity consumption prediction for water source heat pump system using limited information. Build. Simul. 2021, 14, 1005–1016. [Google Scholar] [CrossRef]
  51. Garrido-Valenzuela, F.; Cruz, D.; Dragicevic, M.; Schmidt, A.; Moya, J.; Tamblay, S.; Herrera, J.C.; Muñoz, J.C. Identifying and visualizing operational bottlenecks and Quick win opportunities for improving bus performance in public transport systems. Transp. Res. Part A Policy Pract. 2022, 164, 324–336. [Google Scholar] [CrossRef]
  52. André, M.; Perez, R.; Soubdhan, T.; Schlemmer, J.; Calif, R.; Monjoly, S. Preliminary assessment of two spatio-temporal forecasting technics for hourly satellite-derived irradiance in a complex meteorological context. Sol. Energy 2019, 177, 703–712. [Google Scholar] [CrossRef]
  53. Yuan, Y.; Chen, Z.; Wang, Z.; Sun, Y.; Chen, Y. Attention mechanism-based transfer learning model for day-ahead energy demand forecasting of shopping mall buildings. Energy 2023, 270, 126878. [Google Scholar] [CrossRef]
Figure 1. Structure of LSTM model [23].
Figure 2. Influence of the number of used neurons on the average error.
Figure 3. Influence of the number of lookbacks on the (A) MAE, (B) RMSE, and (C) MAPE and SMAPE.
Figure 4. Flowchart for implementing forecasting for one-day-ahead loads.
Figure 5. Structure of SVR model [35].
Figure 6. Structure of the GRU model [42].
Figure 7. Actual vs. predicted EV load by the (A) blended, (B) SVR, and (C) LSTM models.
Figure 8. Actual vs. predicted HH load (load * 1000) by the (A) blended, (B) SVR, and (C) LSTM models.
Figure 9. Actual vs. predicted HP load by the (A) blended, (B) SVR, and (C) LSTM models.
Figure 10. Actual vs. predicted total load by the (A) blended, (B) SVR, and (C) LSTM models.
Figure 11. ROC curve for the regression with specificity and the false negative rate for the (A) blended, (B) SVR, and (C) LSTM models.
Figure 12. Original vs. adjusted energy consumption of the last testing data.
Figure 13. Actual vs. predicted total load of (A) grid A (feeder level), (B) grid A (substation level), (C) grid B, and (D) LCS shopping center.
Table 1. Comparative research studies.

| Ref. | Model | Dataset | Evaluation Metrics | Evaluation Result | Other Outcomes |
|---|---|---|---|---|---|
| [15] | Deep extreme learning machine (DELM), adaptive neuro-fuzzy inference system (ANFIS), artificial neural networks (ANNs) | 4 residential buildings | RMSE, MAE, MAPE | DELM (RMSE = 2.24, MAPE = 5.7, MAE = 2); ANFIS (RMSE = 2.46, MAPE = 6.38, MAE = 2.26); ANN (RMSE = 2.6, MAPE = 6.7, MAE = 2.39). | Tuning the number of hidden layers, the number of hidden neurons, and the combination of activation functions made DELM perform better than the other models. |
| [16] | LSTM, ANN, Support Vector Regressor (SVR) | Five building groups on two campuses | MAPE, RMSE | For dormitory buildings: LSTM (MAPE = 2.94%, RMSE = 0.27); ANN (MAPE = 9.09%, RMSE = 0.79); SVR (MAPE = 8.48%, RMSE = 1.1). | Not only for dormitory buildings but also for other buildings, such as research and office buildings, LSTM gave the most accurate predictions among the models tried. |
| [17] | Extreme learning machine (ELM), generalized regression neural network (GRNN), LSTM | Electric Reliability Council of Texas (ERCOT) | MAE, MAPE, RMSE | LSTM (MAE = 55.42, RMSE = 63.81, MAPE = 4.79%); GRNN (MAE = 61.62, RMSE = 68.45, MAPE = 5.33%); ELM (MAE = 73.82, RMSE = 80.56, MAPE = 6.86%). | ELM and GRNN served as comparison models to validate LSTM as the superior model. |
| [18] | Decision tree, random forest, SVR, LSTM | Five households in the UK | MAE, MAPE, RMSE | 30 min frequency: LSTM (MAE = 0.0842, RMSE = 0.1926, MAPE = 18.49%); decision tree (MAE = 0.1923, RMSE = 0.2952, MAPE = 62.42%); random forest (MAE = 0.1364, RMSE = 0.2216, MAPE = 40.87%); SVR (MAE = 0.834, RMSE = 0.2062, MAPE = 16.4%). | Although SVR had better results at a 30 min frequency, LSTM was the best of these models at the other frequencies. Moreover, the authors' own model, a combination of singular spectrum analysis and parallel LSTM, outperformed the models mentioned. |
| [19] | SARIMA, ARIMA, LSTM | 5567 London households | MAE | LSTM had the lowest MAE among all models. The MAE of all models decreased over the epochs and was lowest in the last epoch. | The lowest error was in spring and the highest in winter, as energy usage during winter may be less predictable or its pattern harder to capture. |
Table 2. Important features of the models used.

| Model | Architecture | Activation Function | Optimization Algorithm | Hyperparameters |
|---|---|---|---|---|
| LSTM | 1 input layer (100 neurons), 1 hidden layer (100 neurons), 1 output layer | ReLU | Adam | Lookback: 24; batch size: 10 (variable); epochs: 20; early stopping (patience = 3) |
| SVR | SVR with an RBF kernel | The RBF kernel plays the role that an activation function plays in neural network methods | Sequential Minimal Optimization | Kernel: RBF |
| Blended model | SVR (RBF kernel) + GRU (50 neurons and one hidden layer) + Linear Regression | ReLU (for GRU) | Adam (GRU) | Epochs: 30 (GRU); batch size: 32 (GRU); early stopping (patience = 3) |
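The lookback of 24 listed in Table 2 implies that each training sample pairs a window of past load values with the values to be forecast. The following is a minimal sketch of that windowing step, not the authors' code: the function name `make_windows`, the hourly resolution, and the 24-step target horizon are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], lookback: int = 24, horizon: int = 24
) -> Tuple[List[List[float]], List[List[float]]]:
    """Split a load series into (input window, target window) pairs.

    lookback: number of past steps fed to the model (24 in Table 2).
    horizon:  number of future steps to predict (assumed: 24 hourly steps).
    """
    inputs: List[List[float]] = []
    targets: List[List[float]] = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        split = start + lookback
        inputs.append(list(series[start:split]))
        targets.append(list(series[split:split + horizon]))
    return inputs, targets


# Three synthetic days of hourly values (hypothetical data, not from the paper).
hourly_load = [float(h % 24) for h in range(72)]
X, y = make_windows(hourly_load)
print(len(X), len(X[0]), len(y[0]))
```

Each row of `X` would be one model input and the matching row of `y` its day-ahead target; a series shorter than `lookback + horizon` yields no samples.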
Table 3. MAPE and SMAPE for different observation ranges in EV power forecasting.

| Model | Total MAPE | Total SMAPE | On-Peak MAPE | On-Peak SMAPE | Off-Peak MAPE | Off-Peak SMAPE |
|---|---|---|---|---|---|---|
| LSTM | inf | 65.96% | 22.02% | 12.89% | inf | 75.24% |
| SVR | inf | 65.10% | 29.24% | 18.25% | inf | 73.29% |
| Blended | inf | 42.58% | 21.45% | 12.39% | inf | 63.19% |
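The `inf` entries in Table 3 are what MAPE produces when the actual load is zero at some time step, which is typical for EV charging profiles, whereas SMAPE stays finite. The sketch below illustrates this under stated assumptions: it uses one common SMAPE form, 2|a − f| / (|a| + |f|), and a simple convention for zero actuals. The exact definitions used in the paper are not shown in this section, so the numbers here are not comparable to the table.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error in percent.

    Assumed convention: an exact match contributes zero error; a nonzero
    error on a zero actual value makes the whole metric infinite.
    """
    terms = []
    for a, f in zip(actual, forecast):
        if a == f:
            terms.append(0.0)
        elif a == 0:
            return math.inf
        else:
            terms.append(abs(a - f) / abs(a))
    return 100.0 * sum(terms) / len(terms)


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE in percent (assumed 0-200% variant)."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = abs(a) + abs(f)
        # When both values are zero the term is defined as zero error.
        terms.append(0.0 if denom == 0 else 2.0 * abs(a - f) / denom)
    return 100.0 * sum(terms) / len(terms)


# Hypothetical EV-like profile with idle (zero-load) steps.
ev_actual = [0.0, 0.0, 10.0, 8.0]
ev_forecast = [0.5, 0.0, 9.0, 10.0]
print(mape(ev_actual, ev_forecast), smape(ev_actual, ev_forecast))
```

Restricting the evaluation to on-peak steps removes the zero-load samples, which is consistent with the finite on-peak MAPE values in the table.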
Table 4. MAPE and SMAPE for different observation ranges in HH power forecasting.

| Model | Total MAPE | Total SMAPE | On-Peak MAPE | On-Peak SMAPE | Off-Peak MAPE | Off-Peak SMAPE |
|---|---|---|---|---|---|---|
| LSTM | 22.66% | 11.36% | 16.45% | 8.79% | 30.76% | 14.70% |
| SVR | 23.43% | 10.63% | 17.99% | 9.96% | 24.68% | 11.55% |
| Blended | 23.79% | 11.28% | 16.70% | 8.98% | 33.20% | 14.34% |
Table 5. MAPE and SMAPE for different observation ranges in HP power forecasting.

| Model | Total MAPE | Total SMAPE | On-Peak MAPE | On-Peak SMAPE | Off-Peak MAPE | Off-Peak SMAPE |
|---|---|---|---|---|---|---|
| LSTM | 12.81% | 6.89% | 15.56% | 8.95% | 9.92% | 4.73% |
| SVR | 14.74% | 8.61% | 21.52% | 12.95% | 7.63% | 4.06% |
| Blended | 9.83% | 5.42% | 19.79% | 7.44% | 6.84% | 3.36% |
Table 6. MAPE and SMAPE for different observation ranges in cumulative power forecasting.

| Model | Total MAPE | Total SMAPE | On-Peak MAPE | On-Peak SMAPE | Off-Peak MAPE | Off-Peak SMAPE |
|---|---|---|---|---|---|---|
| LSTM | 25.66% | 12.28% | 12.61% | 6.79% | 27.53% | 13.07% |
| SVR | 23.05% | 13.25% | 33.11% | 20.70% | 21.69% | 12.19% |
| Blended | 18.63% | 8.85% | 15.62% | 8.66% | 19.03% | 8.88% |
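Tables 3 to 6 report the same error metrics separately for on-peak and off-peak observations. A minimal sketch of such a split is given below. How the paper delimits the on-peak range is not stated in this section, so the rule used here (actual load at or above 70% of its maximum) and the names `split_by_peak` and `mean_abs_pct_error` are purely hypothetical.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]


def split_by_peak(
    actual: Sequence[float], forecast: Sequence[float], threshold: float
) -> Tuple[List[Pair], List[Pair]]:
    """Separate (actual, forecast) pairs into on-peak and off-peak groups."""
    on_peak: List[Pair] = []
    off_peak: List[Pair] = []
    for a, f in zip(actual, forecast):
        (on_peak if a >= threshold else off_peak).append((a, f))
    return on_peak, off_peak


def mean_abs_pct_error(pairs: Sequence[Pair]) -> float:
    """MAPE in percent over a group; assumes every actual value is nonzero."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Hypothetical values, not taken from the paper's data.
actual = [2.0, 4.0, 10.0, 8.0]
forecast = [3.0, 4.0, 9.0, 10.0]
threshold = 0.7 * max(actual)  # assumed on-peak rule
on_peak, off_peak = split_by_peak(actual, forecast, threshold)
print(mean_abs_pct_error(on_peak), mean_abs_pct_error(off_peak))
```

Reporting the groups separately shows whether a model's overall score is driven by peak capture or by low-load periods, which is the comparison the tables above make across LSTM, SVR, and the blended model.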
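Table 7. Evaluation characteristics of the models for synthetic data.

| Error | Model | HH | HP | EV | Total |
|---|---|---|---|---|---|
| MAE | LSTM | 0.863 | 24.98 | 32.12 | 47.19 |
| MAE | SVR | 1.4 | 25.72 | 39.43 | 63.30 |
| MAE | Blended | 0.88 | 22.89 | 36.84 | 52.98 |
| MSE | LSTM | 1.42 | 1305 | 4673 | 4331 |
| MSE | SVR | 1.5 | 1762 | 3528 | 6592 |
| MSE | Blended | 1.23 | 1221 | 4812 | 6317 |
| RMSE | LSTM | 1.19 | 36.13 | 68.35 | 65.81 |
| RMSE | SVR | 1.253 | 41.97 | 59.39 | 81.19 |
| RMSE | Blended | 1.1 | 34.95 | 69.36 | 79.47 |
| MAPE | LSTM | 22.66% | 12.81% | inf | 25.66% |
| MAPE | SVR | 24.43% | 14.74% | inf | 24.05% |
| MAPE | Blended | 23.79% | 9.83% | inf | 18.63% |
| SMAPE | LSTM | 22.66% | 6.89% | 65.96% | 12.28% |
| SMAPE | SVR | 23.43% | 8.61% | 65.1% | 13.25% |
| SMAPE | Blended | 23.79% | 5.42% | 42.58% | 8.85% |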
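Because RMSE is the square root of MSE, the MSE and RMSE rows of Table 7 can be cross-checked against each other. The short sketch below does this with the values as read from the table; the dictionary layout and the name `rmse_gaps` are illustrative choices, and small gaps are expected from rounding in the published figures.

```python
import math
from typing import Dict, Tuple

Table = Dict[str, Dict[str, float]]

# MSE and RMSE values as read from Table 7 (columns: HH, HP, EV, Total).
mse: Table = {
    "LSTM": {"HH": 1.42, "HP": 1305, "EV": 4673, "Total": 4331},
    "SVR": {"HH": 1.5, "HP": 1762, "EV": 3528, "Total": 6592},
    "Blended": {"HH": 1.23, "HP": 1221, "EV": 4812, "Total": 6317},
}
rmse: Table = {
    "LSTM": {"HH": 1.19, "HP": 36.13, "EV": 68.35, "Total": 65.81},
    "SVR": {"HH": 1.253, "HP": 41.97, "EV": 59.39, "Total": 81.19},
    "Blended": {"HH": 1.1, "HP": 34.95, "EV": 69.36, "Total": 79.47},
}


def rmse_gaps(mse_table: Table, rmse_table: Table) -> Dict[Tuple[str, str], float]:
    """Absolute difference between sqrt(MSE) and the reported RMSE per cell."""
    return {
        (model, category): abs(math.sqrt(value) - rmse_table[model][category])
        for model, row in mse_table.items()
        for category, value in row.items()
    }


gaps = rmse_gaps(mse, rmse)
worst_cell = max(gaps, key=gaps.get)
print(worst_cell, round(gaps[worst_cell], 4))
```

With these readings every cell agrees to within a few hundredths, and the largest gap falls on the SVR household entry.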
Table 8. Evaluation characteristics of the models for the measured data.

| Error | Grid A (Substation) | Grid A (Feeder) | Grid B | LCS |
|---|---|---|---|---|
| MAE | 5.25 | 0.717 | 1.77 | 11.24 |
| MAE% | 5.27% | 11.99% | 17.87% | 5.40% |
| RMSE | 6.88 | 0.9 | 2.604 | 16.67 |
| RMSE% | 6.05% | 20.49% | 25.14% | 8.64% |
| MAPE | 4.92% | 12.56% | 25.03% | 6.57% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
