2.4. Algorithm Used in This Research
The algorithm provided in the Supplementary Data and used in this research is depicted in Figure 3. First, the energy consumption models available in the Here® API are tuned [36]. Then, the driver sets the destination through a web interface, and the algorithm determines the optimal route. To do this, the Python code calls the Here® API with the Routingmode parameter [35]. This parameter has an attribute named Type that selects one of three kinds of route: the fastest route, which requires the least travel time; the shortest route, which minimizes the distance covered; and the balanced route, which seeks a compromise between distance and time (only for trucks). How the API determines these routes is part of the Here® proprietary know-how. The Python code receives from the Here® API the candidate routes to the destination (the shortest, the fastest and the balanced one) together with the energy consumption of each. The application chooses the route with the lowest energy consumption, as explained later. Appendix A provides further information on how to set up Here® so that the reader can reproduce the experiment. Finally, the algorithm runs a block called eco-charging (EC), which estimates the RE contribution and the generation structure (wind power, photovoltaic, etc.) by using neural networks. The driver is thus informed of when the charging process is greener.
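The route selection step can be illustrated with a minimal Python sketch. It does not call the Here® API: the `RouteOption` record, its field names and the numeric values are hypothetical and only stand in for whatever the routing service returns. Breaking ties by travel time is also an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteOption:
    """One candidate route (hypothetical shape, not the Here API response)."""
    kind: str            # "fastest", "shortest" or "balanced"
    distance_km: float
    duration_min: float
    energy_kwh: float    # estimated consumption for the whole route


def pick_greenest_route(options):
    """Return the candidate with the lowest estimated energy consumption.

    Ties on energy are broken by travel time (an assumption of this sketch).
    """
    if not options:
        raise ValueError("no candidate routes")
    return min(options, key=lambda r: (r.energy_kwh, r.duration_min))


# Illustrative values only.
candidates = [
    RouteOption("fastest", 42.0, 31.0, 7.9),
    RouteOption("shortest", 36.5, 44.0, 6.8),
    RouteOption("balanced", 39.0, 36.0, 7.2),
]
best = pick_greenest_route(candidates)
print(best.kind, best.energy_kwh)
```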
The Here® API provides energy consumption models that estimate energy consumption from several parameters, such as speed, accelerations, decelerations and auxiliary energy consumption (radio, cooling/heating, etc.). Tuning these models means providing the value of each parameter in kWh as a function of speed where applicable; some parameters, such as the auxiliary systems, are not linked to speed. In this research, these values were established from data acquired while the participating drivers made each trip 50 times in different periods and traffic conditions (Section 2.2). As shown in Figure 4, the Inca® software installed on a laptop and input/output modules supplied by ETAS® were used to perform the data acquisition. The tuning engineers of the company that collaborated in this study then derived the factor values by analyzing the acquired data with the MDA® software provided by ETAS® (Stuttgart, Germany) and internal procedures. Entering this information through the Here® interface is straightforward. First, the reader must indicate to Here® that the standard energy consumption model will be used. Figure 5 shows an example that helps the reader to reproduce this study. Once these factors are tuned and entered in the Python code, Here® returns the energy consumption estimate for each type of route (the fastest, the shortest and the balanced one), and the route with the lowest energy consumption is chosen. Taking into account the initial battery capacity before the trip, the algorithm can then determine whether a charge is needed during the trip.
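To make the idea of a speed-dependent consumption model concrete, the following sketch interpolates a consumption table over speed, adds a speed-independent auxiliary load and checks whether the battery covers the trip. It is not the Here® model: the table format (kWh per km), the function names, the constant auxiliary power and all numbers are assumptions chosen for illustration.

```python
from bisect import bisect_right


def interpolate(table, speed_kmh):
    """Piecewise-linear lookup in a sorted [(speed_kmh, kwh_per_km), ...] table."""
    speeds = [s for s, _ in table]
    if speed_kmh <= speeds[0]:
        return table[0][1]
    if speed_kmh >= speeds[-1]:
        return table[-1][1]
    i = bisect_right(speeds, speed_kmh)
    (s0, c0), (s1, c1) = table[i - 1], table[i]
    return c0 + (c1 - c0) * (speed_kmh - s0) / (s1 - s0)


def route_energy_kwh(segments, speed_table, auxiliary_kw):
    """Sum traction and auxiliary energy over (distance_km, speed_kmh) segments."""
    total = 0.0
    for distance_km, speed_kmh in segments:
        hours = distance_km / speed_kmh
        total += distance_km * interpolate(speed_table, speed_kmh)
        total += auxiliary_kw * hours  # auxiliaries do not depend on speed
    return total


def charge_needed(energy_kwh, battery_kwh, reserve_kwh=0.0):
    """True if the trip needs more energy than the battery can supply."""
    return energy_kwh > battery_kwh - reserve_kwh


# Illustrative values only (not measured data from the study).
SPEED_TABLE = [(30, 0.14), (60, 0.15), (90, 0.18), (120, 0.24)]
trip = [(10.0, 45.0), (20.0, 90.0)]  # (distance in km, mean speed in km/h)
energy = route_energy_kwh(trip, SPEED_TABLE, auxiliary_kw=0.5)
print(round(energy, 3), charge_needed(energy, battery_kwh=40.0))
```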
Finally, the EC block is run, and the eco-score (how well REs are integrated into the charging process) is assessed. The aim of this block is to determine the RE contribution at the time when charging may take place, considering the battery capacity. In addition, an estimate of the energy structure (wind power, fuel, etc.) is made. The block is depicted in Figure 6. In phase 1, several factors are analyzed, such as the battery capacity and the energy consumption for a specific journey. Recall that the energy consumption was estimated earlier by using the energy consumption model. In phase 2, the most likely time at which the charging process takes place is assessed. In phase 3, the RE contribution and the most likely energy source mix (coal, solar energy, gas, etc.) are obtained, as detailed later, by using gated recurrent unit (GRU) networks and nonlinear autoregressive (NAR) neural networks [41,42,43,44,45]. Finally, the EC score is assessed considering the RE contribution. In addition, the algorithm proposed in this paper provides information about other parameters, such as chargers, through the Open Charge Map API [35].
The EC score measures how green the charging process is considering the RE contribution. It is given by Equation (1), where REc,t is the RE contribution to the total electricity demand at time t (in MW) and REmax,d is the maximal RE contribution (in MW) during the day when the charging process takes place. Both parameters are calculated by using neural networks. The RE contribution is measured by using Equation (2), where REc is the RE contribution (in %), RE is the total electricity generated by RE sources (in MW) and NRE is the total electricity generated by non-RE sources such as coal (in MW).
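Written out from these definitions, the two expressions plausibly take the following form. Equation (2) follows directly from the definitions of RE and NRE. The form of Equation (1), in particular the absence of any scaling factor, is an assumption based on the description of the EC score as the RE contribution relative to its daily maximum.

```latex
\begin{align}
  \mathrm{EC} &= \frac{RE_{c,t}}{RE_{\max,d}} \tag{1} \\
  RE_{c} &= \frac{RE}{RE + NRE} \times 100 \tag{2}
\end{align}
```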
REc,t and REmax,d are estimated as follows. The French system operator publishes daily files containing the CO2 generation structure and the total electricity demand of the day [46]. Electricity demand and total RE contribution are series with a recurring pattern: the daily profile repeats, with variations for aspects such as weekends and seasons, and two electricity consumption peaks can be found every day. Consequently, NAR networks are used to predict the electricity demand for a specific day from a desired time (for example, a departure planned at 7 p.m.) to midnight. The Python code analyzes the results returned by the neural network and determines the maximum RE contribution of the day. Finally, Equations (1) and (2) are evaluated.
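The final evaluation step can be sketched as follows. The sketch assumes that the EC score is the ratio of the RE contribution at the charging time to the maximum RE contribution in the forecast window, and that the RE share is RE / (RE + NRE) expressed in percent. The function names and the hourly forecast values are hypothetical; in the study the forecasts come from the neural networks.

```python
def re_contribution_pct(re_mw, nre_mw):
    """Share of RE generation in total generation, in percent."""
    total = re_mw + nre_mw
    if total <= 0:
        raise ValueError("total generation must be positive")
    return 100.0 * re_mw / total


def ec_score(re_forecast_mw, charge_index):
    """RE contribution at the charging slot relative to the window maximum."""
    re_max = max(re_forecast_mw)
    if re_max <= 0:
        return 0.0
    return re_forecast_mw[charge_index] / re_max


def greenest_slot(re_forecast_mw):
    """Index of the slot with the highest forecast RE contribution."""
    return max(range(len(re_forecast_mw)), key=re_forecast_mw.__getitem__)


# Illustrative hourly RE forecast in MW, from 7 p.m. to 11 p.m.
forecast = [9000.0, 11000.0, 12500.0, 10000.0, 8000.0]
print(ec_score(forecast, 0), greenest_slot(forecast))
print(re_contribution_pct(12500.0, 37500.0))
```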
Typical recurrent networks struggle with long-term predictions because of the vanishing gradient problem, which arises when recurrent neural networks are trained with gradient-based learning methods and backpropagation. In each training iteration, each of the network's weights receives an update proportional to the partial derivative of the error function with respect to that weight. In some cases, the gradient becomes vanishingly small; the weight then barely changes, which may effectively stop the training. To improve long-term predictions, long short-term memory (LSTM) or GRU networks can be used. In this research, GRUs were chosen because they are more efficient (they require less memory). A GRU is a recurrent neural network architecture that uses update and reset gates (Figure 7).
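The vanishing gradient effect can be observed on a deliberately small example. The sketch below uses a scalar recurrence h_t = tanh(w · h_{t−1}), which is a toy model chosen for illustration and not the network used in this research, and multiplies the per-step derivatives as backpropagation through time would.

```python
import math


def gradient_through_time(w, h0, steps):
    """Return |d h_T / d h_0| for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h * h)
    return abs(grad)


short_range = gradient_through_time(w=0.5, h0=0.5, steps=5)
long_range = gradient_through_time(w=0.5, h0=0.5, steps=50)
print(short_range, long_range)
```

Each factor is at most 0.5 here, so the product shrinks geometrically with the number of steps, and an input 50 steps in the past has almost no influence on the weight update.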
Mathematically, the process is as follows:
(a) Update gate for time step t
The update gate zt is calculated by following Equation (3), where xt is the input presented to the network, W(z) is its weight matrix, ht−1 holds the information of the previous step t−1 and U(z) is its weight matrix. Both products are added, and a sigmoid activation is applied to squash the result between 0 and 1. The update gate determines how much of the past information is passed along to the future.
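With the symbols just defined, the update gate in its standard GRU form reads as follows. That the authors use exactly this bias-free form is an assumption, although it matches the description above (two products added, then a sigmoid σ).

```latex
\begin{equation}
  z_t = \sigma\!\left( W^{(z)} x_t + U^{(z)} h_{t-1} \right) \tag{3}
\end{equation}
```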
(b) Reset gate for time step t
The reset gate rt is given by Equation (4). The symbols have the same meaning as in Equation (3), except that rt denotes the reset gate. The reset gate determines how much of the past information is forgotten.
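In the standard GRU formulation, the reset gate has the same structure as the update gate but its own weights. The names W^{(r)} and U^{(r)} are chosen here by analogy with W^{(z)} and U^{(z)} and are an assumption.

```latex
\begin{equation}
  r_t = \sigma\!\left( W^{(r)} x_t + U^{(r)} h_{t-1} \right) \tag{4}
\end{equation}
```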
(c) Current memory content
The new memory content (the candidate state, denoted here h′t) uses the reset gate to store relevant information from the past, as given by Equation (5). The symbols have the same meaning as in Equations (3) and (4), and ⊙ represents the Hadamard product.
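A common way of writing the current memory content is shown below, with h'_t denoting the candidate state and W, U its weight matrices (these names are assumptions). Some implementations apply the reset gate to h_{t-1} before the multiplication by U instead of after it; which variant the authors used is not stated.

```latex
\begin{equation}
  h'_t = \tanh\!\left( W x_t + r_t \odot U h_{t-1} \right) \tag{5}
\end{equation}
```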
(d) Final memory at a current step
In this step, the vector ht is calculated by using Equation (6). This vector holds the information for the current unit and passes it down the network. To do this, the update gate is needed.
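Steps (a) to (d) can be gathered into a single function. The sketch below is plain Python written for readability, not the authors' implementation. It assumes the standard bias-free GRU form, in which the final memory is h_t = z_t ⊙ h_{t−1} + (1 − z_t) ⊙ h′_t, and the parameter names in the dictionary are hypothetical.

```python
import math


def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def hadamard(u, v):
    """Element-wise (Hadamard) product."""
    return [a * b for a, b in zip(u, v)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def gru_step(x_t, h_prev, p):
    """One GRU time step; p maps names to weight matrices (no biases)."""
    # (a) update gate and (b) reset gate
    z = sigmoid(add(matvec(p["Wz"], x_t), matvec(p["Uz"], h_prev)))
    r = sigmoid(add(matvec(p["Wr"], x_t), matvec(p["Ur"], h_prev)))
    # (c) current memory content (candidate state)
    pre = add(matvec(p["W"], x_t), hadamard(r, matvec(p["U"], h_prev)))
    candidate = [math.tanh(a) for a in pre]
    # (d) final memory: blend of previous state and candidate
    keep = hadamard(z, h_prev)
    new = hadamard([1.0 - a for a in z], candidate)
    return add(keep, new)


# With all-zero weights both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous one.
zeros_in = [[0.0], [0.0]]
zeros_hh = [[0.0, 0.0], [0.0, 0.0]]
params = {"Wz": zeros_in, "Uz": zeros_hh, "Wr": zeros_in,
          "Ur": zeros_hh, "W": zeros_in, "U": zeros_hh}
h_next = gru_step([1.0], [0.4, -0.2], params)
print(h_next)
```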
The GRU network was coded in Python, and Figure 8 shows the pseudocode. To reproduce the results, the reader needs the data published by the French system operator for the last four years. The data of the first three years are used as inputs of the network, and the data of the last year are used as targets to train it. It is of paramount importance to rescale all data to the range between 0 and 1 to ensure good network performance. The network parameters are set up by using the Keras package. First, the Sequential class specifies that the model is sequential, i.e., the output of each layer is the input of the next layer. In this study, the authors used Dropout, a technique in which randomly selected neurons are ignored during training. Their contribution to the activation of downstream neurons is temporarily removed on the forward pass, and no weight updates are applied to them on the backward pass. The main advantage of this technique is that the network becomes less sensitive to the specific weights of individual neurons, which reduces overfitting. The loss is the mean squared error, which is widely recommended for regression problems. The optimizer is Adam, an algorithm that can be used instead of classical stochastic gradient descent to update the network weights iteratively based on the training data. Its advantages include straightforward implementation and computational efficiency. Further comments can be found in the pseudocode (Figure 8).
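The configuration described above (rescaling to [0, 1], a sequential model, a GRU layer, Dropout, mean squared error and Adam) can be sketched with Keras as follows. This is not the code of Figure 8: the layer sizes, the dropout rate, the window length and the helper names are assumptions chosen for illustration. TensorFlow is imported inside the function so that the scaling helper can be used on its own.

```python
def min_max_scale(values):
    """Rescale a sequence linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_gru_model(timesteps, n_features, units=64, dropout_rate=0.2):
    """Build a small GRU regressor (hyperparameters are illustrative)."""
    from tensorflow import keras  # requires TensorFlow to be installed

    model = keras.Sequential([
        keras.Input(shape=(timesteps, n_features)),
        keras.layers.GRU(units),             # recurrent layer with update/reset gates
        keras.layers.Dropout(dropout_rate),  # randomly ignores units during training
        keras.layers.Dense(1),               # one predicted value per window
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


scaled = min_max_scale([10.0, 20.0, 30.0])
print(scaled)
# Example use, assuming inputs shaped (samples, 24, 1) and scaled targets:
# model = build_gru_model(timesteps=24, n_features=1)
# model.fit(x_train, y_train, epochs=20, batch_size=32)
```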
The algorithm estimates the generation structure for the next two hours (Figure 9) by using the data published by the French system operator (CO2 generation structure and the total electricity demand of the day) and NAR networks, which are well suited to time series prediction. These networks were created and trained in open loop, in which the targets are used as feedback, and then verified in closed loop [41,42,43,47]. Mathematically, NAR networks can be expressed by
y(t) = f(y(t − 1), y(t − 2), …, y(t − d)) + ε(t)
where f represents the network response taking into account the previous input data, and ε(t) is the difference between the predicted value ŷ and the actual value y. The number of delays d establishes how many past values are considered for the prediction. The number of hidden layers and of neurons per layer can be adjusted to achieve the best performance, but it must be chosen carefully to avoid unnecessary network complexity. The effect of the delay parameter is shown in Figure 10. A high d makes the predicted series change more slowly. With a lower d, the predicted series follows the actual wind power more closely. However, if d is very low, the predicted series no longer follows the actual wind power. The main explanation is that d determines the weight given to past values. Consequently, significant changes in trend, which may occur because of weather conditions, are not detected. That is why NAR networks are used in this research only as an estimate, while accuracy relies on the GRU networks. This is not an issue in practice, as Matlab® allows predictions to be corrected once the actual values are known. This is the case in this application: at a given moment t, it can predict t + 1, t + 2, t + 3, and so on. When the moment t + 1 arrives, the neural network can be updated, because both the predicted and the actual value at t + 1 are known in real time (the French system operator publishes the needed data in real time). For readers reproducing this study, the authors obtained good predictions for the next 2 h with d = 3 when using the 2019 data published by the French system operator. The pseudocode of the NAR network, coded in Matlab® (Natick, MA, USA), is shown in Figure 11. The NAR networks were trained with the trainlm function, which updates biases and weights according to Levenberg–Marquardt optimization. It is often the fastest backpropagation algorithm, although it may require more memory than other methods.
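The difference between open-loop training and closed-loop prediction with d delays can be illustrated in a few lines. The sketch is written in Python rather than Matlab® and is not the code of Figure 11. The predictor passed as f is a stand-in (the mean of the window) for a trained network, and all names and values are hypothetical.

```python
def make_lagged(series, d):
    """Open-loop training pairs: each input holds the d previous measured values."""
    if d < 1 or len(series) <= d:
        raise ValueError("need d >= 1 and more than d samples")
    inputs = [series[t - d:t] for t in range(d, len(series))]
    targets = [series[t] for t in range(d, len(series))]
    return inputs, targets


def closed_loop_forecast(f, history, d, steps):
    """Multi-step forecast: each prediction is fed back as an input."""
    if len(history) < d:
        raise ValueError("history shorter than the number of delays")
    window = list(history[-d:])
    forecast = []
    for _ in range(steps):
        y_hat = f(window)
        forecast.append(y_hat)
        window = window[1:] + [y_hat]  # drop the oldest value, append the prediction
    return forecast


def mean_of_window(window):
    """Stand-in for the trained network response f."""
    return sum(window) / len(window)


inputs, targets = make_lagged([1.0, 2.0, 3.0, 4.0, 5.0], d=3)
print(inputs, targets)
print(closed_loop_forecast(mean_of_window, [3.0, 6.0, 9.0], d=3, steps=2))
```

When a new measurement becomes available, it can replace the corresponding prediction in the window before the next forecast, which corresponds to the real-time correction described above.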
2.5. Data Analysis
As detailed in the Results section, the data obtained in this research seem to be close to a normal distribution. Consequently, a method must be set to confirm this assumption. To do this, the PASWR package of the R software was used. This package includes commands such as EDA, which provide a lot of information for exploratory data analysis, such as kurtosis, skewness and p-value. Kurtosis is a statistical measure that describes how heavily the tails of a distribution differ from the tails of a normal distribution. Therefore, kurtosis identifies whether the tails of a given distribution contain extreme values. For a normal distribution, its value is 3. There are three types of kurtosis: mesokurtic when kurtosis is close to 3; leptokurtic when it is considerably higher than 3; and platykurtic when it is lower than 3, meaning that the tails contain fewer extreme values than those of a normal distribution. Skewness essentially measures the symmetry of the distribution. For a normal distribution, its value should be close to 0. At this point, it is important to highlight that symmetry does not imply that the data correspond to a normal distribution. Thus, these two parameters must be analyzed carefully. Finally, the p-value or probability value is the probability of obtaining test results at least as extreme as the results actually observed during the test, assuming that the null hypothesis is correct.
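As a worked illustration of these two measures, the sketch below computes moment-based skewness and kurtosis in Python (the study itself used R). It uses the population formulas m3 / m2^1.5 and m4 / m2^2, for which a normal distribution gives 0 and 3; the estimator used by the EDA command may differ, and the tolerance used to label the result is an arbitrary assumption.

```python
def central_moment(data, k):
    """k-th central moment (population form, divides by n)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)


def skewness(data):
    """Moment-based skewness; 0 for a perfectly symmetric sample."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5


def kurtosis(data):
    """Moment-based (non-excess) kurtosis; 3 for a normal distribution."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2


def classify_kurtosis(value, tolerance=0.5):
    """Label the kurtosis relative to 3 (the tolerance is an arbitrary choice)."""
    if abs(value - 3.0) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if value > 3.0 else "platykurtic"


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(skewness(sample), kurtosis(sample), classify_kurtosis(kurtosis(sample)))
```

The sample is symmetric, so its skewness is 0, yet its kurtosis is well below 3, which illustrates why symmetry alone does not indicate normality.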
Plots are also of paramount importance when analyzing the data. In this research, three plots were used: histograms, Q–Q plots and box plots. A histogram is a graphical representation that organizes a group of data points into user-specified ranges. The Q–Q plot, or quantile–quantile plot, is a graphical tool used to assess whether a set of data plausibly came from some theoretical distribution, such as a normal one. Finally, a box plot is a graphical rendition of statistical data based on the minimum, first quartile, median, third quartile and maximum. In this graph, the top of the rectangle indicates the third quartile, a horizontal line near the middle of the rectangle indicates the median and the bottom of the rectangle indicates the first quartile. Figure 12 shows an example of how PASWR is used to check whether a dataset corresponds to a normal distribution.
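The five values on which a box plot is based can be computed with the Python standard library, as in the short sketch below. The quartile convention (the "inclusive" method of `statistics.quantiles`) is a choice made for this example; R and other tools may use a different convention and return slightly different quartiles.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


summary = five_number_summary([1, 2, 3, 4, 5])
print(summary)
```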