1. Introduction
Optimising the use of batteries in renewable energy systems is an actively researched challenge. In the field of wind energy, Ref. [1] presents a model of the optimal battery energy storage system (BESS) capacity for wind farm integration and the effects of equivalent life cycle and reserve degree on that capacity.
In the solar energy field, efforts focus on the optimal operating cost and profitability of battery energy storage system portfolios under different electricity market conditions [2]. Other works address the optimal sizing of the BESS and the optimal scheduling of BESS power, as presented in [3]. The goals of these strategies are to maximise profit and minimise penalty costs.
The optimisation of battery energy storage systems (BESSs) is particularly significant within the framework of Local Energy Markets (LEMs). These markets are defined in [4] as decentralised systems that coordinate energy generation, storage, transportation, conversion, and consumption within specific geographic areas. Through automated control and demand-side management strategies, LEMs, especially those integrating local Heating, Ventilation, and Air Conditioning (HVAC) production, have the potential to considerably improve energy efficiency, mitigate greenhouse gas emissions, and cultivate energy independence [5].
In previous works [6,7], the solutions focus mainly on optimisation strategies for determining the BESS reserves, but they do not usually follow allocation strategies based on artificial intelligence (AI) algorithms. Such predictive methods can be used to determine how frequent the “excursions” are; in our case, an excursion is an oscillation between the minute actual consumption and the half-hourly forecast consumption over a certain period, and this information can improve the BESS reserve capacity. Knowing the excursion frequency by season and period may help manage BESS reserves optimally to meet Distribution Network Operator (DNO)/Distribution System Operator (DSO) contract conditions. Not meeting real-time load constraints may burden the Energy Community with contractual penalties from DNOs/DSOs or with operational impacts due to local power system outages.
Figure 1 illustrates a comparison of the minute actual load vs. the half-hourly forecast load in kW. The estimated deviations, named “excursions” and represented in yellow, show evidence that a higher BESS capacity than foreseen is required.
The reserve capacity depends on the number and magnitude of excursions [8], so by focusing on this magnitude it is possible to optimise the BESS reserve capacity with AI inference models that predict the excursions. It is also necessary to define evaluation metrics that allow current and new models to be compared in terms of performance and quality. There are several key performance indicators, but almost all fall into the category of economic and technical benefits for different prosumer types [9]. This work defines two key performance indicators (KPIs) to optimise BESS management: KPI1, named excursion size categories, and KPI2, named forecast/actual energy %. KPI2 is defined as the ratio of the minute actual energy to the half-hourly forecast energy, so it measures the behaviour of the consumption energy deviations from the predictive model.
There are many AI algorithms that can be applied to these types of datasets, such as classic forecasting [10] or support vector machines [11]. Some of them are used in the modelling and prediction of behavioural and production profiles [12]. However, given their limitations, these models cannot detect complex patterns such as the “excursions” in this particular case. These patterns can be detected with machine learning techniques like neural networks (multilayer perceptron) [13] and gradient boosting regressor algorithms [14]. After training many predictive models for consumption with statistical, machine learning, deep learning, and hybrid methods, these two machine learning techniques were selected because of their excellent accuracy and generalisation results. The main features of the chosen algorithms are as follows (minimal code sketches of both methods are given after this list):
- (i)
A multilayer perceptron (MLP) is a fundamental feedforward neural network architecture, building on Frank Rosenblatt’s perceptron, composed of at least three layers: an input layer, one or more hidden layers, and an output layer. As illustrated in Figure 2, MLPs possess a property known as the universal approximation theorem, which establishes that an MLP with a sufficient number of hidden units can approximate any continuous function with arbitrary accuracy. This capability, coupled with its ability to learn from data, has solidified the MLP’s status as a cornerstone in various fields, including pattern recognition, function approximation, and prediction.
- (a)
Neurons in consecutive layers are fully interconnected, with each connection possessing an associated weight. The input layer receives raw data, which are transformed through the hidden layers via non-linear activation functions. These functions introduce non-linearity, allowing the network to learn complex patterns and representations. The output layer produces the final prediction based on the processed information from the hidden layers.
- (b)
Training an MLP involves an iterative weight adjustment process through backpropagation, an algorithm that calculates the gradient of the error function concerning the network weights. This gradient information is then used to update weights to minimise prediction error. The learning capacity of the MLP is influenced by various factors, including the number of hidden layers, the number of neurons per layer, the choice of activation functions, and the optimisation algorithm employed.
- (c)
Backpropagation relies on differentiable functions. To function correctly, it requires that both the combination of inputs and weights (typically a weighted sum) and the activation function (such as the rectified linear unit, ReLU) within a neuron are differentiable. Moreover, these derivatives should be bounded, since gradient descent (or more powerful gradient-based optimisers such as Adam) is commonly employed to train multilayer perceptrons.
- (d)
The backpropagation process iteratively adjusts the weights. In each iteration, forward propagation calculates the output for a given input, and the mean squared error (or another validation metric) is computed between the predicted and actual values. Subsequently, backpropagation calculates the error gradient with respect to the weights, and these gradients are used to update the weights, moving them closer to a minimum error configuration. Equation (1) shows how the weight update of the current iteration (called an epoch in the neural network domain) combines the error derivative with respect to every neuron’s weight, scaled by the learning rate η (an arbitrary rate, not too big and not too small), with a momentum term α times the previous iteration’s update:

\Delta w_{ij}(t) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t-1) \quad (1)

This process continues until the gradient converges, indicating minimal change in the weight updates and convergence of the optimisation.
- (e)
The MLP is a versatile computational model capable of approximating any continuous function with arbitrary accuracy, provided sufficient complexity. This property and its ability to learn from data have established it as a cornerstone in fields like pattern recognition that can be applied to electrical consumption prediction.
- (ii)
Gradient boosting regressor: the gradient boosting regressor used here is the eXtreme gradient boosting algorithm for regression (the acronym XGB will be used to refer to this algorithm in this work), a powerful ensemble method (see Figure 3). It sequentially combines a set of weak learners, typically decision trees, to create a predictive model that continually enhances its predictions. Unlike conventional methods that aim to minimise prediction error directly, the gradient boosting regressor employs the negative gradient of the loss function to guide the learning process. It has the following characteristics:
Ensemble Method: the technique combines multiple weak learners to develop a more robust final model.
Iterative Learning: the model is created one tree at a time, with each subsequent tree focusing on improving the predictions where previous trees made errors.
Gradient-Based Learning: the negative gradient of the loss function is utilised to select new trees, emphasising areas with errors.
Flexibility: the choice of weak learners and hyperparameters can be adjusted to enable the technique to adapt to different data types and regression tasks.
XGB optimises an objective function that measures how well a model trains. This function breaks down into:
- (a)
The objective function is defined (see Equation (2)) as the sum of a loss function and a regularisation term.
- (b)
The loss function measures the predictability of the model.
- (c)
The regularisation term measures the complexity of the model and helps determine an accurate and stable model.
- (d)
XGBoost is a CART decision tree ensemble model, similar to Random Forest but different in how it is trained. XGBoost optimises the objective function shown in Equation (2):

\mathrm{Obj} = -\tfrac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T \quad (2)

Obj represents the overall objective function that XGBoost aims to minimise, and the summation runs over all leaves j of the tree. G_j is the sum of gradients on leaf j; it measures how much the predictions on leaf j need to be adjusted to reduce the loss. H_j is the sum of Hessians on leaf j; it measures the curvature of the loss function. λ and γT form the regularisation term, which helps prevent overfitting by penalising complex models (T is the number of leaves). The remaining constant term does not affect the optimisation process and can be ignored.
- (e)
The ensemble model sums the predictions of several trees. Finally, the efficiency of the tree structure is measured using the regularisation term; this score is like the impurity measure in a decision tree, except that it also considers the complexity of the model.
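To make the two methods concrete, minimal training sketches of both are given below. They assume Python with scikit-learn and the xgboost library; the synthetic data, layer sizes, and hyperparameters are illustrative placeholders, not the configuration used in this work.

```python
# Minimal sketch of an MLP regressor trained on lagged consumption features.
# X holds a synthetic sliding window of 24 half-hourly lags (a 12 h window);
# names and hyperparameters are illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 24))                        # 24 lagged inputs
y = 0.6 * X[:, -1] + rng.normal(scale=0.1, size=1000)  # synthetic target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

mlp = MLPRegressor(hidden_layer_sizes=(64, 32),  # two hidden layers
                   activation="relu",            # differentiable activation
                   solver="adam",                # gradient-based optimiser
                   max_iter=500,
                   random_state=0)
mlp.fit(X_train, y_train)                        # backpropagation happens here
print("MLP MAE:", mean_absolute_error(y_test, mlp.predict(X_test)))
```

A comparable sketch for the gradient boosting regressor, assuming xgboost's XGBRegressor and reusing the synthetic data from the previous sketch; reg_lambda and gamma play the role of the regularisation terms discussed around Equation (2).

```python
# Minimal sketch of gradient boosting regression with XGBoost on the same
# synthetic lagged data; all hyperparameter values are illustrative.
from xgboost import XGBRegressor
from sklearn.metrics import mean_squared_error

xgb = XGBRegressor(n_estimators=300,    # trees added sequentially
                   max_depth=4,         # shallow CART weak learners
                   learning_rate=0.05,  # shrinkage on each tree's contribution
                   reg_lambda=1.0,      # L2 regularisation (lambda)
                   gamma=0.0)           # complexity penalty per extra leaf
xgb.fit(X_train, y_train)
print("XGB MSE:", mean_squared_error(y_test, xgb.predict(X_test)))
```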
Time-series regression methods are supervised learning methods that forecast the consumption target variable of the proposed predictive models. Machine learning methods need a sliding window strategy to analyse time-series data. The sliding window strategy transforms the data into a matrix where each value is associated with the time window that precedes it, as shown in Figure 4. This transformation can also take exogenous variables into account. Once transformed, a regression model can be trained to predict the next value in the series. Each row in the training data represents a separate instance (an example), with the values of the previous time windows used as predictors of the target value at the next time window; in this way, the supervised learning method can learn from examples. The model must consider all time lags of the dependent and exogenous variables to accurately predict the dependent variable at a given time step.
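A minimal sketch of this sliding-window transformation, assuming pandas and a 24-lag (12 h) window as used later in this work; the series and column names are illustrative.

```python
# Minimal sketch: turning a half-hourly consumption series into a supervised
# learning matrix with a sliding window of 24 lags (a 12 h input window).
# Column names are hypothetical; exogenous variables can be lagged the same way.
import numpy as np
import pandas as pd

series = pd.DataFrame({"consumption_kwh": np.random.rand(200)})

n_lags = 24
frame = pd.concat(
    {f"lag_{k}": series["consumption_kwh"].shift(k) for k in range(1, n_lags + 1)},
    axis=1,
)
frame["target"] = series["consumption_kwh"]   # value to predict at time t
frame = frame.dropna()                        # drop rows without a full window

X, y = frame.drop(columns="target"), frame["target"]
print(X.shape, y.shape)
```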
Optimisation methods in the field of electrical energy storage aim to maximise the efficiency, profitability, and overall performance of BESSs [7,18,19,20,21,22,23]. Besides machine learning, the primary techniques used today in renewable energy are mathematical optimisation and metaheuristic algorithms.
Mathematical optimisation methods include:
- (i)
Linear Programming (LP): This is suitable for problems with linear objective functions and constraints. It is used for basic energy arbitrage or peak shaving.
- (ii)
Non-linear Programming (NLP): this handles the non-linear relationships essential for modelling system dynamics.
- (iii)
Mixed-Integer Programming (MIP): this combines continuous and integer variables, useful for problems with discrete decisions like charging/discharging schedules or unit commitment.
- (iv)
Dynamic Programming (DP): this is effective for sequential decision-making problems, such as optimal charging strategies considering future price forecasts.
Metaheuristic evolutionary optimisation algorithms: these reproduce the behaviour of nature in many aspects. A small selection is listed here, although the list of evolutionary optimisers applied to renewable energy nowadays is long.
- (i)
Genetic Algorithms (GAs): these algorithms imitate natural selection to find optimal solutions, making them suitable for complex and non-convex problems.
- (ii)
Particle Swarm Optimisation (PSO): inspired by bird flocking or fish schooling, this method is efficient for solving continuous optimisation problems.
- (iii)
Ant Colony Optimisation (ACO): this technique is used to find optimal paths or configurations based on the behaviour of ants.
- (iv)
Many more evolutionary optimisers are used in renewable energy: grey wolf optimiser, artificial bee colony, bat algorithm, moth–flame optimiser, etc. They are based on identifying the optimal solutions determined by the objective function. These solutions are then combined through a process known as crossover and are further improved by introducing mutations, which are random changes.
The methods selected in this paper (MLP-NN and XGB) belong to the machine learning category, which produces stable models capable of unveiling non-linear relationships among the explanatory variables. This work proposes using the AI-based predictive model results (consumption forecasts) as inputs to conventional optimisation techniques in a two-stage method that improves the optimiser search.
This paper presents how to properly set the BESS reserve capacity guided by highly granular predictive AI models for consumption. This work verifies two main hypotheses:
First hypothesis: the results of the predictive models can be used to manage the BESS more optimally.
Second hypothesis: the results obtained with current optimisation techniques can be improved by using highly granular consumption forecasts as inputs to the optimisation problem.
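As a schematic illustration of this two-stage idea (not the optimiser actually used in the Cornwall LEM), the following sketch feeds a placeholder consumption forecast into a simple linear-programming battery dispatch built with scipy; the tariffs, capacities, and LP formulation are all assumptions for illustration.

```python
# Schematic two-stage sketch: stage 1 produces consumption forecasts (here a
# placeholder array), stage 2 feeds them into a simple LP battery dispatch.
# All values and the formulation are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

forecast_kwh = np.array([0.4, 0.5, 0.7, 0.9, 0.8, 0.6])  # stage 1: forecast per half-hour
H = len(forecast_kwh)
import_tariff = 0.15                                      # £/kWh, placeholder value
capacity_kwh, soc0_kwh = 5.0, 2.5                         # BESS capacity and initial state of charge
p_max_kwh = 1.0                                           # max (dis)charge energy per half-hour

# Decision variables x = [grid_0..grid_{H-1}, discharge_0..discharge_{H-1}]
c = np.concatenate([np.full(H, import_tariff), np.zeros(H)])  # minimise import cost

# Load balance per period: grid_t + discharge_t = forecast_t
A_eq = np.hstack([np.eye(H), np.eye(H)])
b_eq = forecast_kwh

# Keep the state of charge within [0, capacity]: 0 <= soc0 - cumsum(discharge) <= capacity
L = np.tril(np.ones((H, H)))                                  # cumulative-sum operator
A_ub = np.vstack([np.hstack([np.zeros((H, H)), L]),           #  cumsum(d) <= soc0
                  np.hstack([np.zeros((H, H)), -L])])         # -cumsum(d) <= capacity - soc0
b_ub = np.concatenate([np.full(H, soc0_kwh), np.full(H, capacity_kwh - soc0_kwh)])

bounds = [(0, None)] * H + [(-p_max_kwh, p_max_kwh)] * H      # stage 2: solve the dispatch LP
result = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
print("import cost (£):", round(result.fun, 3))
```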
2. Materials and Methods
As mentioned, AI algorithms will create predictive models for the consumption excursions. AI algorithms need datasets for training, and this research uses a particular dataset corresponding to the Cornwall Local Energy Market [24]. The Cornwall Local Energy Market trial produced a comprehensive and fully documented dataset [25], which encompasses detailed energy consumption measurements, battery energy storage system state of charge, and site-specific information. These standardised data points were gathered at a minute frequency and are integral to all analysed LEM case studies. Furthermore, the dataset incorporates consumption and production forecasts, weather data, and detailed BESS specifications.
The Cornwall LEM dataset is accompanied by a comprehensive data dictionary. Trilemma Consulting conducted in-depth analyses of site metadata, fleet self-consumption, and BESS utilisation. Collectively, these resources yield a detailed comprehension of the Energy Community’s composition, consumption patterns, generation capabilities, and energy storage dynamics.
The predictive models for consumption use the list of exogenous variables described in Table 1, which incorporates power flows (energies measured in kWh), state of charge (%), and meteorological variables (precipitation, wind, solar irradiation). The final table used to train the models was produced with data from the Cornwall LEM dataset aggregated at a half-hourly granularity level.
The methodology used to develop the forecasting models follows the standard open-source methodology CRISP-DM [26]. This methodology starts by defining the Energy Community BESS management problem; in our case, it determines the BESS reserve capacity by knowing the deviations of the minute actual consumption from the half-hourly forecast consumption. Once the consumption model is specified, a data description stage follows to understand how consumption behaves by season, at different time scales (12 h time window, week, month), and at different aggregation levels (whole fleet or specific sites). The data preparation stage cleanses and normalises the data. As the consumption is a time series, a sliding window is built to train the predictive model. The model is then trained and assessed using the selected machine learning techniques; model assessment uses the RMSE, MSE, MAE, and R² metrics. Finally, the methodology covers model deployment, meaning the digital transformation of the BESS reserve capacity using this predictive model, the monitoring of the BESS management performance, and the monitoring of the model’s performance.
The tool for designing and running predictive models is the IBM SPSS Modeler. This visual tool produces diagrams containing all phases of model design, running, and validation.
Figure 5 shows the general diagram that builds the whole model, following all stages defined by CRISP-DM methodology.
Following the methodology for developing predictive models, the experimentation started with data acquisition from the Cornwall LEM dataset [25]. Some tables were previously integrated at the site level (the Cornwall LEM Energy Community had 100 dwellings) to obtain the consumption dataset with the identified exogenous variables. The main table contained data scraped from the Cornwall LEM site owners’ microgrid management web applications, with minute-granularity information on power and storage flows (as in Table 1) that was crossed with meteorological data at the same granularity. Finally, the dataset was aggregated at a half-hourly granularity level. The model diagram in Figure 5 follows the CRISP-DM methodology and includes the following data pre-processing, modelling and assessment, and graphical analysis tasks:
- (i)
Data Selection: the data time horizon was reduced to a year, specifically 1 April 2019–31 March 2020.
- (ii)
Data Auditing and Preparation: All missing values were replaced by median values. Outlier values were replaced by the lower and upper bounds obtained with Tukey’s box-and-whiskers method [27] (a minimal sketch of this rule follows this list).
- (iii)
Model training with MLP-NN and XGB methods. This task is represented in the subdiagram in Figure 6 and explained later in detail.
- (iv)
Accuracy and stability validation. Accuracy was measured using the RMSE, MSE, MAE, and R² metrics. Model stability tests are based on ANOVA tests. This task is represented in the subdiagram contained in Figure 6 and explained later.
- (v)
Graphical analysis of half-hourly forecasting model compared against actual minute data by season (summer, winter and shoulder) and 12 hour/week/month lapses, and at different aggregation levels (whole fleet or specific sites).
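A minimal sketch of the data preparation rule referenced in (ii), assuming pandas: values beyond the Tukey bounds (Q1 − 1.5·IQR, Q3 + 1.5·IQR) are clipped to those bounds and missing values are replaced by the median. The series is illustrative.

```python
# Minimal sketch of median imputation plus Tukey box-and-whiskers clipping.
import numpy as np
import pandas as pd

consumption = pd.Series([0.2, 0.3, np.nan, 0.25, 5.0, 0.28, 0.31])  # 5.0 is an outlier

consumption = consumption.fillna(consumption.median())   # missing values -> median
q1, q3 = consumption.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
consumption = consumption.clip(lower=lower, upper=upper)  # outliers -> bounds
print(consumption.tolist())
```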
Figure 6 presents the model training and validation subdiagram with the selected machine learning methods (MLP-NN and XGB). The subdiagram includes the following tasks:
- (i)
Model specification: Both target and input variables are declared in the model. Input variables include all exogenous variables and their corresponding lags to create the sliding window. Twenty-four lags were defined at a half-hourly granularity level.
- (ii)
Feature extraction: The relevant exogenous variables were extracted using a significance test. The null hypothesis is that there is no correlation between consumption and the exogenous variable, measured using the p-value; the hypothesis is rejected when the p-value is ≤ 0.05, in which case the feature (exogenous variable) is considered significant (see the sketch after this list).
- (iii)
Sliding window creation: all target and input variable lags are created to define the sliding window. Twenty-four lags at a half-hourly granularity level were defined, meaning a 12 h input window in the model. The output window is a single step used to predict the next half-hour.
- (iv)
Model training with MLP-NN and XGB methods
- (v)
Accuracy and stability validation: Accuracy was measured against the actual minute consumption data. The analysis compared the models’ half-hourly forecasts against the actual minute data to derive the “excursions”, or residuals. Model stability tests used ANOVA tests, validating the hypothesis of equal error variance across all partitions. If the p-value is less than or equal to 0.05, the equal error variance hypothesis is rejected and the model is considered unstable.
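A minimal sketch of the significance-based feature extraction in (ii), assuming a Pearson correlation test from scipy; the variables and data are illustrative.

```python
# Minimal sketch of the feature-extraction step: keep an exogenous variable
# only if the p-value of its correlation with consumption is <= 0.05.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
data = pd.DataFrame({
    "consumption_kwh": rng.normal(size=500),
    "irradiation": rng.normal(size=500),
    "wind_speed": rng.normal(size=500),
})
data["irradiation"] += 0.5 * data["consumption_kwh"]   # make one variable relevant

selected = []
for col in ["irradiation", "wind_speed"]:
    _, p_value = pearsonr(data[col], data["consumption_kwh"])
    if p_value <= 0.05:                                 # reject the null hypothesis
        selected.append(col)
print("significant exogenous variables:", selected)
```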
Figure 6. Half-hourly consumption modelling and assessment stage subdiagram.
Figure 7 exhibits a graphical analysis of half-hourly forecasts against minute actual consumption kW in a shoulder season week at day-of-week and half-hour granularity level. MLP-NN model forecasts are green, while XGB model forecasts are orange. Actual minute data are in blue. A day-of-the-week (weekly) pattern emerges throughout the day at half-hour granularity level.
3. Experimentation & Results
Several forecasting models were developed to estimate the consumption kW at a half-hourly granularity level with two machine learning methods: multilayer perceptron neural network (MLP-NN) and eXtreme gradient boost (XGB).
Table 2 presents the models’ validation metrics (RMSE and MAE measured in kW, and MSE measured in kW²).
Both models are stable as they passed the stability test. The ANOVA test was performed with the data partitioning: training subset (60%—917,346 observations), testing subset (30%—457,558 observations), and validation subset (10%—153,542 observations). ANOVA tests produced the following p-values: MLP-NN model: 0.315 and XGB model: 0.814. The analysis cannot reject the hypotheses, so both models are stable.
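A minimal sketch of this stability check, assuming a one-way ANOVA (scipy’s f_oneway) on the model errors of the three partitions; the residual arrays are synthetic placeholders.

```python
# Minimal sketch of the stability check: one-way ANOVA on the errors of the
# training / testing / validation partitions. A p-value > 0.05 means the
# hypothesis of equal errors across partitions is not rejected (stable model).
import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
resid_train = rng.normal(scale=0.05, size=1000)
resid_test = rng.normal(scale=0.05, size=500)
resid_valid = rng.normal(scale=0.05, size=170)

_, p_value = f_oneway(resid_train, resid_test, resid_valid)
print("ANOVA p-value:", round(p_value, 3),
      "-> stable" if p_value > 0.05 else "-> unstable")
```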
These accurate and stable models are not theoretical constructs but practical tools used to estimate consumption kW at a half-hourly level. The estimates are then compared against the actual consumption kW at a minute level to analyse behaviour deviations for the whole fleet or an individual site. Once the models are trained, they must be evaluated using the two KPIs mentioned in the introduction and developed further in Section 3.1 and Section 3.2. Both models were stable and accurate, but XGB was slightly more accurate than MLP-NN and is therefore the method recommended in this work. In specific cases, the models’ accuracy difference may imply a relevant variation in the penalties incurred when the DNO/DSO contract conditions are not met.
A graphical analysis is shown in Figure 8, which compares the half-hourly forecast (MLP-NN model in green and XGB model in orange) vs. the minute actual consumption kW (in blue) for an average site of the whole fleet over a shoulder season 12 h period, by minute. Figure 9 also shows the half-hourly forecast (MLP-NN model in green and XGB model in orange) vs. the minute actual consumption kW (in blue), but in this case for site 100 (a heavy consumption site) over a winter season week, by half-hour. This lower granularity analysis reveals the excursions more clearly.
3.1. KPI1 Excursion Size Categories
An excursion (residual), referring to an oscillation, is the absolute value of the difference between the minute consumption kW and the half-hourly forecast consumption kW. This indicator shows how often the actual consumption is over or under the forecast, which may help optimise the BESS reserve capacity.
KPI1 is a nominal variable measuring the excursion/residual size. Excursion size categories are defined according to the distribution analysis of MLP-NN and XGB model residuals performed at whole fleet (100 dwellings) or individual site (site 100 was selected because of its considerable consumption) levels.
The residuals obtained from the comprehensive distribution analysis of the whole fleet are segmented in Table 3. Residual sizes are defined as SMALL, MEDIUM, or BIG using the residual distribution quartiles. Notably, the maximum residuals are 0.914 kW with the MLP-NN model and 0.590 kW with the XGB model. This thorough analysis forms the basis for the discretisation of KPI1, as shown in Table 4.
Based on the residual distribution of Table 3, KPI1 is considered as the residual size, measured in kW and categorised as in Table 4.
Figure 10 exhibits KPI1 for the whole fleet from 1 April 2019 to 31 March 2020. KPI1 is segmented as BIG (in blue) when residuals are greater than or equal to 0.08 kW; MEDIUM (in green) if they fall in the (0.029, 0.08) kW interval; and SMALL (in orange) if they are less than or equal to 0.029 kW.
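As an illustration of how this discretisation can be applied, a minimal sketch using the thresholds above; the residual values are synthetic placeholders.

```python
# Minimal sketch of the KPI1 discretisation: residuals
# (|minute actual - half-hourly forecast| in kW) binned with the
# SMALL / MEDIUM / BIG thresholds reported for the whole fleet.
import numpy as np
import pandas as pd

residuals_kw = pd.Series(np.abs(np.random.default_rng(0).normal(scale=0.05, size=1000)))

kpi1 = pd.cut(residuals_kw,
              bins=[0, 0.029, 0.08, np.inf],
              labels=["SMALL", "MEDIUM", "BIG"],
              include_lowest=True)
print(kpi1.value_counts())
```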
The same distribution analysis was performed for a single dwelling. Site 100 was selected because of its high consumption. The residual metrics are shown in Table 5. It should be noted that the maximum residuals are 3.957 kW with the MLP-NN model and 4.006 kW with the XGB model. Other KPI1 analyses at different granularity levels and seasons may be found in [28] and are available to other researchers.
3.2. KPI2 Forecast vs. Actual Energy %
Consumption power (kW) is the instantaneous rate at which energy is used and is converted to consumption energy (kWh) over a specific period. To calculate consumption energy, we multiply power by the duration of the period. For instance, the half-hourly forecast consumption energy is 0.5 times the half-hourly forecast consumption power, and the minute consumption energy is (1/60) times the minute consumption power. KPI2 defines a ratio to measure the proportion of forecast energy over actual energy and is computed according to:
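The equation referenced here did not survive extraction. As a hedged reconstruction (an assumption, not the authors’ published formula), one form consistent with a “forecast/actual energy %” expressed in a few percent is the relative deviation of the half-hourly forecast energy from the actual energy metered at minute level:

```latex
% Assumed reconstruction of the missing KPI2 formula (not verified against the source).
\mathrm{KPI}_2 = \frac{E_{\mathrm{forecast}} - E_{\mathrm{actual}}}{E_{\mathrm{actual}}} \times 100\%,
\qquad
E_{\mathrm{forecast}} = \tfrac{1}{2}\, P^{30\,\mathrm{min}}_{\mathrm{forecast}},
\qquad
E_{\mathrm{actual}} = \sum_{m=1}^{30} \tfrac{1}{60}\, P^{\mathrm{min}}_{\mathrm{actual},\,m}
```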
The descriptive statistics of KPI2 are presented in Table 6.
Figure 11 exhibits the KPI2 graphical analysis for the whole fleet during 12 h in the summer season. The KPI2 smoothed average increases from approximately 5% between 0:00 h and 6:00 h to approximately 6% between 6:00 h and noon.
Figure 12 changes the KPI2 scale to a whole summer month, showing the forecast/actual energy % for the average site of the entire fleet. The KPI2 smoothed average of around 4% means the forecast energy is slightly greater than the actual energy for most of the period (the atypical observation with a high residual is noticeable).
4. Discussion
The results of the proposed predictive models are very accurate when comparing actual minute data against half-hourly forecasting data.
Two key performance indicators were defined to manage the BESSs better. KPI1 shows the “excursions”, or deviations of the half-hourly forecast from the minute actual data. This KPI1 was discretised, showing that the residuals were mostly minimal. KPI2 was defined as a ratio between the half-hourly forecast and the minute actual consumption energies. KPI2 has been represented at a half-hourly granularity level, centred at the small 0.05% value, indicating potential cost savings from accurate consumption forecasting. These two KPIs will help the Energy Community set their BESS reserves accurately, potentially saving them from fines with their DNO/DSO.
As for the hypotheses set, they have been validated with the presented results:
- (i)
The results of the predictive models can be used to manage the BESS more optimally with the defined KPIs. The trained predictive consumption models have produced accurate and stable results that allow the Energy Community to set proper BESS reserves aligned with actual consumption. Evidence of these results has been detailed in Table 2, Table 3, Table 4 and Table 5, and Figure 10, Figure 11 and Figure 12.
- (ii)
Optimisation models used to determine BESS reserves are based on mathematical models, primarily linear approximations. The machine learning methods used in this work (MLP-NN and XGB) are non-linear approximations that may capture the consumption dynamics more precisely and, as inputs to the optimisation problem, help the optimisers in their search process.
Finally, future research may consist of improving the predictive model’s accuracy and stability. Two directions are given:
- (i)
Increase the granularity to high-frequency models based on minute data. Half-hourly granularity forecasting models may not capture extremely variable weather conditions.
- (ii)
Try other predictive methods like hybrid methods of deep learning techniques. Combining convolutional neural networks (CNNs) and recurrent neural networks like LSTM, GRU, Bi-directional LSTM, or Bi-directional GRU may improve model accuracy and stability by capturing highly non-linear relations among the variables used in the model.
5. Conclusions
Maximising the value of battery energy storage involves various procedures, the most widespread and conventional of which rely on optimisation strategies. However, optimisation processes can be further enhanced by integrating predictive tools powered by artificial intelligence algorithms. Combining prediction and optimisation techniques in a two-stage model produces a better-performing scenario: better demand forecasts help the optimisers reach a better objective function configuration that meets all site constraints (load balance, minimum operational and reserve capacity, BESS longevity, etc.).
This work highlights how the consumption estimated with artificial intelligence methods can optimise the management of BESS reserves. The optimal scenario is that BESS reserves meet the LEM Energy Community’s needs and avoid supply contract fines from the DNO/DSO when over-demanding. With this information, the Energy Community can optimally set its minimum operational and reserve capacity levels: the first serves site owners’ needs, while the second graduates the potential reserve capacity to be sold in the market while meeting DNO/DSO contract conditions and avoiding fines.
As for the potential economic impact of optimising BESS management, Trilemma Consulting made assumptions in [29] for the import and export tariffs that help give an idea of the maximum absolute performance per MWh. The export tariff was assumed to be 0.15 £/MWh and the import tariff 0.055 £/MWh in the Cornwall LEM, so the maximum absolute performance is 0.095 £/MWh. The relative maximum absolute performance per MWh can be computed as:
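The expression itself is missing here; read as the tariff spread relative to the export tariff (an assumption about the intended ratio), it would be:

```latex
% Assumed reading of the relative maximum absolute performance per MWh:
% the export/import tariff spread as a fraction of the export tariff.
\frac{0.15 - 0.055}{0.15} \times 100\% \approx 63\%
```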
Trilemma Consulting also computed the annual earnings potential in the “A Whistle-stop Tour” presentation on the Cornwall LEM Residential Project [29]. The maximum annual earnings potential from managing the BESS headroom % (the BESS capacity available to be charged) was estimated at 125 £ for a 5 kWh battery; 187.50 £ for a 7.5 kWh battery; and 250 £ for a 10 kWh battery.
This work experimented with consumption data from the Cornwall LEM dataset [25] at different aggregation levels: whole fleet or specific site. The accuracy of the AI-based consumption predictive models was measured as high, and the models’ stability was not rejected. Applying these predictive models as inputs to conventional optimisation methods can improve the search for the optimal scenario.
There are two primary analyses in this research. The first, the Excursion Size Categories analysis, explores the number and size of the excursions or residuals and measures the models’ accuracy. The second, the Actual vs. Forecast Energies Comparison, presents the performance of the half-hourly forecast against the minute actual data by season, day of the week, and half-hour lapse.
Although the consumption forecasting models proposed in this work showed good performance, they have limitations. The consumption forecasting models may be improved in two directions of future research:
Finer forecasting granularity. Minute-level forecasting models will better adjust the BESS management. These models can better capture complex consumption pattern variations at different time (12 h, week, month, day of week) or site aggregation (whole fleet or specific site) granularity levels.
MLP-NN and XGB methods belong to the shallow machine learning category. They can be surpassed by deep learning and hybrid methods that capture non-linearities better and have better memory or attention mechanisms.
- -
Most Deep Learning methods used today in consumption forecasting include:
- *
Convolutional Neural Networks (CNNs) to extract features [30]
- *
Recurrent Neural Networks (RNNs) to forecast time series like the consumption. The RNNs most used for this problem are Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), Bi-Directional LSTM, and Bi-Directional GRU methods. All these methods develop a memory of important events over time and forget the rest [31]
- *
Transformers are deep learning methods that simultaneously pay attention to important events at different positions in the sequence. There are recent algorithms applied to regression problems, such as N-BEATS, co-authored by Yoshua Bengio [32], or the Temporal Fusion Transformer (TFT) [33]
- -
There are interesting hybrid methods like the TSFEDL library [34], designed for time-series analysis. The library incorporates 22 advanced models that combine convolutional and recurrent neural networks to extract meaningful spatio-temporal patterns from time-series data.
Finally, the authors want to point out other applications that may be attractive to other researchers. Estimating the After Diversity Maximum Demand (ADMD) [35] helps to optimise the local distribution network size. These predictive models minimise the Energy Community’s infrastructure investment while still meeting site owners’ needs.