*4.1. Training*

We used the training data of 2015, as described in Section 3.1, for training and evaluating our models. Aggregating EV sessions on a monthly basis reveals a higher number of sessions during the winter months, compared to the summer months. In the case of daily EV sessions, it has been noticed that all weekdays have similar profiles, which are different from weekends [12]. Thus, we could assume that days belonging to the same month (*m*) and daytype (*dt*, weekday vs. weekend) have similar profiles, and defined a set of dates (**S**) as pairs (*<sup>m</sup>*, *dt*) (e.g., for *m* = January and *dt* = weekday, **S** will have all dates that are weekdays from January). Training data for the model for a (*<sup>m</sup>*, *dt*) combination comprises the session data for dates present in the respective **S**. We trained 24 individual models, one for each of the (*<sup>m</sup>*, *dt*) combination.

#### 4.1.1. Arrival Model

**Inter-arrival time models**: For IAT models, we modeled the daily profiles of *λ*, which can be fitted using (i) a mean model, (ii) a polynomial regression model or (iii) a localized regression model (outlined in Section 2.1.1). We can rewrite Equation (6) in terms of (*<sup>m</sup>*, *dt*) as in Equation (11). We calculated the EV arrival rates (*λ*) for each day and *ts* (24 timeslots of 60 min each), by fitting the inter-arrival time to an exponential distribution.

$$
\lambda\_{m,d\_l} = f\_{m,d\_l}(t) \tag{11}
$$

For the mean model, the fitted value for each *ts* is the average *λ* (across all days). Accordingly, each *ts* has a single value of *λ*. This results in a discontinuous mapping between *λ* and *t*, for which the function in Equation (11) becomes discontinuous at the boundaries of *ts* and we see a sudden change in *λ*.

For regression methods, we transform *λ* by taking the logarithm and applying min-max normalization for each day. This transformation is necessary to correctly fit the peak hours, during which inter-arrival times are very low (high *λ*). We take the logarithm in order to more accurately model the values of *λ* during the night hours (00:00–06:00), which have few arrivals (low *λ*). Normalization is used to scale the arrival rates of all days in **S** to the same levels. We represent this transformed *λ* using *λ<sup>t</sup>*, and we use *s* to represent the re-scaling parameter for predicted values. Equation (13) can be used to ge<sup>t</sup> the fitted *λ* from the regression models *fm*,*dt*(*t*).

$$f\_{\lambda}(\lambda\_l)\_{m,d\_l} = f\_{m,d\_l}(t) \; , \quad 0 < \lambda\_l \le 1 \tag{12}$$

$$
\lambda\_{m,d\_t} = \mathfrak{e}^{sf\_{m,d\_t}(t)} \tag{13}
$$

For the polynomial regression model we modeled the relationship in Equation (12) using a grid search for the best polynomial degree (∈ {1, ... , 50}). Mean squared error (MSE) was used as the error metric during grid searching. It provides a strong penalty for large errors, which was necessary to fit the sharp morning peaks during weekdays. This resulted in a continuous and differentiable function of *λ* in terms of *t*.

For the localized regression model, polynomials of degree 1 and 2 with *α* ∈ {0.125, 0.25, 0.5} were tested. We noticed that the best results sdfd generated for degree 1 and *α* = 0.125. This resulted in a piecewise, continuous and differentiable profile of *λ* throughout the day.

Scale treatment and randomization: For regression based methods, for which we model *λ<sup>t</sup>*, we may encounter a situation wherein *f*(*t*) < 0. In this case, *λ* becomes very low, which can cause the sampled inter-arrival time (Δ*t*) of EVs to be very large. Since the next EV arrival is calculated relative to the past arrival Equation (4), such high Δ*t* may cause the next arrival to be very late, thereby skipping a large period of time. This becomes problematic when this period covers times with high

values of *λ* (hence a high number of EV arrivals, which however will not be generated). For practical purposes, we impose a lower limit of 1 on *λ* (meaning we have at least 1 arrival in each *ts*).

When we transformed *λ*, we appiedy a min-max normalization on ln(*λ*) for each day (where the minimum value of ln(*λ*) is 0, because *λ* ≥ 1). Each day in the training data has its own maximum value of ln(*λ*). These values can be saved as an array of re-scaling parameters (represented by *s* in Equation (13)). When generating session arrivals, we randomly selected a value from this array to re-scale the predicted values. This helped in introducing variance in to the otherwise smooth profiles of the predicted *λ*.

**Arrival count models**: We mapped each *ts* to the parameters **P** that characterize the discrete distribution of the number of arrivals in that timeslot. Similarly to the IAT models, we have a model for each (*<sup>m</sup>*, *dt*) combination. Our training data for each model are the numbers of EV arrivals at *ts* for each of the days of the respective combination (*<sup>m</sup>*, *dt*).

$$\begin{array}{ccccc} f\_{m,d\_l} : & t\_s & \rightarrow & \mathbf{P}\_{m,d\_l} \\ & \{1, \ldots, 24\} \rightarrow \{\mathbf{P}\_{m,d\_l,1}, \ldots, \mathbf{P}\_{m,d\_l,24}\} \end{array} \tag{14}$$

For the Poisson model, the average number of EV arrivals *λ* was calculated per *ts*. In the Poisson distribution, the mean is equal to its variance, a restriction that is not present in the negative binomial distribution.

In case of the negative binomial model (with parameters **P** = {*μ*, *α*}), *μ* is the average number of EV arrivals per timeslot (=*λ*), and *α* is the dispersion parameter, which can be used to define the variance of the distribution (var = *μ* + *αμ*2). A negative binomial distribution model thus allows one to introduce more variability in the generated number of EV arrivals, compared to a Poisson model. Both *α* and *μ* were fitted for each individual (*<sup>m</sup>*, *dt*) combination. It is possible that the estimated *α* for an (*<sup>m</sup>*, *dt*) combination is extremely low, in which case the underlying distribution is more likely to be Poisson. It is also possible that during night hours (low EV arrivals), the estimation process of *α* might result in impractical values (less than 0). To adjust for this, we can use a Poisson distribution wherein the estimated values of *α* are negative, or set a lower limit on *α* (e.g., *α* ≥ 0.1).

In IAT models, the time of the next EV arrival is the sum of the time of previous EV arrival and randomly sampled Δ*t*. As previously stated, this dependency becomes troublesome if Δ*t* is very large (due to the low *λ*, the next EV arrival may be very late, skipping a large time interval) or very low (high *λ*, large number of EV arrivals in a small amount of time). Due to that, fitting *λ* as a function of *t* requires caution in IAT models. However, we do not face this problem when using the AC modeling approach, wherein the number of arrivals are generated separately for each *ts*. Indeed, a low/high *λ* in the previous *ts* will not affect the number of arrivals in next *ts*. For practical uses, we can also assume that night hours with low numbers of EV arrivals are similar, and combine *ts* = 1–6 into a single timeslot. The fitted value of *λ* is then associated with the time from 00:00 to 06:00.

#### 4.1.2. Mixture Models (*MMc*, *MMe*)

Similarly to the arrival models, we verified that days belonging to a (month, daytype) combination have similar distributions in terms of departure times and charging loads, and thus fit models for each (*<sup>m</sup>*, *dt*) combination. For each *ts* we fit a Gaussian mixture model to the real-world data, and modified Equation (9) as follows.

$$\begin{array}{rcl}P\_{t\_4=t\_s,m,d\_t}(t\_c) & = \text{GMM}\_{m,d\_t,t\_s} \\ & = \left(\sum\_{k=1}^K \phi\_k \mathcal{N}(\mu\_{k'} \,\sigma\_k^2)\right)\_{m,d\_t,t\_s} \end{array} \tag{15}$$

This resulted in a GMM fitted for each (*<sup>m</sup>*, *dt*, *ts*) combination, with trained values for (i) mixing probabilities (*φk*), (ii) mixture means (*μk*) and (iii) mixture variances (*σ*2*k* ), for each mixture. We used expectation minimization to fit the GMM.

Expectation maximization for fitting GMM requires initialization of mixtures (*μk*, *σ*<sup>2</sup> *k* ). The number of mixtures also needs to be chosen for each model (representing a *m*, *dt*, *ts* combination). We initialized each GMM based on session clusters (see Section 3.3). We grouped the sessions observed in *m*, *dt*, *ts*, into their respective session clusters. We used the number of clusters obtained to initialize the *K* for the GMM, and calculated the *μk*, *σ*<sup>2</sup> *k*from the EV sessions in the respective groups.
