*3.3. Clustering*

We used the expectation-maximization (EM) algorithm to train the GMM used in *MMc* (defined in Section 2.2). For EM to converge to a practical solution, it should be initialized with a realistic number of mixture components, along with a mean and variance for each component. We therefore initialized each GMM from session clusters. EV sessions were clustered by arrival and connection times; below we outline the types of sessions observed in the real-world data.
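As an illustration of this initialization, the following minimal sketch derives starting weights, means, and variances for a GMM from already-labeled sessions. It is not the implementation used in this work: the function name `gmm_init_from_clusters`, the two features (arrival hour and connection duration in hours), the diagonal-variance form, and the `min_var` floor are all assumptions made for the example.

```python
from collections import defaultdict


def gmm_init_from_clusters(points, labels, min_var=1e-6):
    """Derive initial GMM parameters from cluster labels.

    points:  list of feature tuples, assumed here to be
             (arrival_hour, connection_hours).
    labels:  one cluster label per point; -1 marks noise and is skipped.
    min_var: hypothetical floor that keeps variances strictly positive.
    Returns {label: {"weight", "mean", "var"}} with per-dimension variances.
    """
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label != -1:
            groups[label].append(point)

    total = sum(len(members) for members in groups.values())
    params = {}
    for label, members in sorted(groups.items()):
        n = len(members)
        dims = range(len(members[0]))
        mean = tuple(sum(p[d] for p in members) / n for d in dims)
        var = tuple(
            max(sum((p[d] - mean[d]) ** 2 for p in members) / n, min_var)
            for d in dims
        )
        # The mixture weight is the share of non-noise sessions in this cluster.
        params[label] = {"weight": n / total, "mean": mean, "var": var}
    return params


# Toy data: two small clusters and one noise point (illustrative values only).
sessions = [(8.0, 9.0), (9.0, 8.0), (19.0, 12.0), (20.0, 13.0), (13.0, 1.0)]
cluster_labels = [0, 0, 1, 1, -1]
init = gmm_init_from_clusters(sessions, cluster_labels)
```

The number of clusters fixes the number of mixture components, and the per-cluster statistics give EM a starting point close to the structure already visible in the data.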

**Session clusters:** In our previous work [2], on the same data, we identified three types of sessions: (i) **Park to charge**: arrivals throughout the day; (ii) **Charge near home**: arrivals in the evening, staying until late at night; (iii) **Charge near work**: arrivals in the early morning, staying until the evening. Park to charge was the largest cluster (60% of sessions), followed by charge near home (29% of sessions) and charge near work (11% of sessions). These clusters were determined with DBSCAN, a density-based clustering algorithm. After clustering the sessions in the 2015 dataset, we observed a similar distribution. The resulting session clusters are shown in Figure 3. In contrast to the previous work, which combined the full data for 2012–2016, these clusters are based only on 2015 data. Please refer to [2] for further details.

**Figure 3.** Session clusters for 2015. We used DBSCAN to cluster EV sessions on a monthly basis and combined the data for all months.
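For readers unfamiliar with density-based clustering, the sketch below shows a minimal, pure-Python DBSCAN applied to toy sessions. It is only an illustration of the technique, not the code used for Figure 3: the Euclidean distance on (arrival hour, connection hours), the values of `eps` and `min_pts`, and the sample points are all assumptions, and a real pipeline would likely scale the features and use an optimized library implementation.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


# Toy sessions as (arrival_hour, connection_hours); values are illustrative.
sessions = [
    (8.0, 9.0), (8.5, 9.5), (9.0, 9.0),        # morning arrivals, long stays
    (19.0, 12.0), (19.5, 12.5), (20.0, 12.0),  # evening arrivals, overnight
    (13.0, 1.0),                               # isolated short session
]
labels = dbscan(sessions, eps=1.0, min_pts=3)
```

Unlike k-means, DBSCAN does not require the number of clusters in advance and leaves sparse points unassigned, which is why the cluster count it returns can serve as a data-driven choice for the number of mixture components.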

#### **4. Training and Evaluation**
