*2.1. Dataset Processing*

The University of Nottingham operates a fleet of 121 vehicles across 4 UK campuses, which fulfil a wide variety of roles, including catering services, estates management and security. A total of 48 of these vehicles from 6 different departments were actively tracked using the Trakm8 service, which provided detailed information on vehicle condition, driving patterns and individual journeys. The journey details included the time and GPS location at the start and end of each journey, from which latitude and longitude could be derived, as shown in Table 1. Analysis of this data allowed a dataset to be constructed of when, where and for how long each vehicle was stationary.


**Table 1.** Example data received for each vehicle journey.

At the time of the study, the fleet was not equipped with V2G technology, and compatible charge points were not available. However, the best potential locations for V2G charge points were determined through a combination of (i) interviews with fleet managers to understand the patterns of use of the vehicles and the overnight parking locations of the different fleets, (ii) analysis of Trakm8 data to identify typical parking locations of the tracked vehicles, and (iii) an assessment of the infrastructure feasibility of installing V2G chargers (e.g., energy supply availability to connect 3-phase V2G chargers) [18]. This analysis identified 6 proposed locations spread across 3 campuses in the city of Nottingham, UK.

Cross-referencing parked locations with each of these charge point locations made it possible to determine how many vehicles could potentially be available if the necessary hardware was in place. This was achieved by calculating the great-circle distance using the haversine formula, as shown in Equation (1), where *r* is the radius of the earth (6371 km), and *dist<sub>i</sub>* is the distance in km between the location of a parked vehicle *v* (*end\_lat<sub>v</sub>* and *end\_lng<sub>v</sub>*) and charger location *i* (*lat<sub>i</sub>* and *lng<sub>i</sub>*).

$$dist_i = 2r \arcsin\left(\sqrt{\sin^2\left(\frac{lat_i - \mathit{end\_lat}_v}{2}\right) + \cos\left(\mathit{end\_lat}_v\right)\cos\left(lat_i\right)\sin^2\left(\frac{lng_i - \mathit{end\_lng}_v}{2}\right)}\right) \tag{1}$$
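Equation (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name `haversine_km` is hypothetical, and the inputs are assumed to be in decimal degrees, so they are converted to radians before the trigonometric functions are applied.

```python
import math

EARTH_RADIUS_KM = 6371.0  # value of r given in the text


def haversine_km(lat_i, lng_i, end_lat_v, end_lng_v):
    """Great-circle distance in km between charger i and parked vehicle v.

    All four arguments are assumed to be decimal degrees.
    """
    # math.sin / math.cos expect radians, so convert first.
    phi_i = math.radians(lat_i)
    phi_v = math.radians(end_lat_v)
    d_phi = phi_i - phi_v
    d_lambda = math.radians(lng_i - end_lng_v)

    # Term under the square root in Equation (1).
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_v) * math.cos(phi_i) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))
```

As a sanity check, one degree of latitude along a meridian comes out at roughly 111.2 km with this value of *r*.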

When the shortest distance to a charge point was below 100 m (0.1 km), the vehicle was considered to be parked within a suitable radius and hence potentially available to a V2G aggregation service, i.e., *a<sub>v</sub>* = 1, as shown in Equation (2). This radius was chosen to allow for the inevitable variance in GPS locations while requiring only minor changes in behaviour for the vehicle to be plugged in, e.g., choosing a different parking space within the same car park.

$$a_v = \begin{cases} 1, & \min\{dist_i\}_{i=1}^{6} < 0.1 \\ 0, & \min\{dist_i\}_{i=1}^{6} \ge 0.1 \end{cases} \tag{2}$$
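The thresholding rule in Equation (2) could be implemented along the following lines. The function names and the charger coordinates in the usage example are made up for illustration (they are not the study's charge point locations), and the haversine helper is repeated so that the sketch is self-contained.

```python
import math


def haversine_km(lat_a, lng_a, lat_b, lng_b, r=6371.0):
    """Great-circle distance in km; inputs assumed to be decimal degrees."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_lambda = math.radians(lng_b - lng_a)
    h = (math.sin((phi_b - phi_a) / 2) ** 2
         + math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def availability(end_lat_v, end_lng_v, chargers, radius_km=0.1):
    """Return a_v: 1 if the nearest charger is closer than radius_km, else 0.

    `chargers` is a list of (lat_i, lng_i) tuples, one per proposed location.
    """
    nearest = min(haversine_km(end_lat_v, end_lng_v, lat_i, lng_i)
                  for lat_i, lng_i in chargers)
    return 1 if nearest < radius_km else 0


# Hypothetical coordinates, for illustration only.
example_chargers = [(52.0, -1.0), (52.1, -1.1)]
near = availability(52.0005, -1.0, example_chargers)  # about 56 m from the first charger
far = availability(52.002, -1.0, example_chargers)    # about 222 m from the first charger
```

Taking the minimum over all charger locations before thresholding mirrors the structure of Equation (2), so a vehicle counts as available if it is within range of any one of the proposed charge points.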

Forty-two weeks of data were collected, and each of the 294 days, *d*, represented in the dataset was divided into 48 contiguous half-hour periods, *hh<sup>d</sup><sub>i</sub>*, 1 ≤ *i* ≤ 48, 1 ≤ *d* ≤ 294. The dataset was then processed to determine vehicle availability as follows:

For each pair of consecutive journeys, *J<sup>v</sup><sub>n</sub>* and *J<sup>v</sup><sub>n+1</sub>*, in the dataset for each vehicle, *v*:


• Where *a<sub>v</sub>* = 1, the vehicle was deemed to be available for each period within *hh<sup>v</sup><sub>p</sub>*.
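The mapping from a parked interval (the end of one journey to the start of the next) onto half-hour periods can be sketched as below. This is an illustration under stated assumptions, not the study's procedure: the function name `parked_periods` is hypothetical, periods are numbered 1 to 48 from midnight, and only periods lying entirely within the parked interval are counted, since the handling of partially covered periods is not specified here.

```python
from datetime import datetime, timedelta

HALF_HOUR = timedelta(minutes=30)


def parked_periods(parked_from, parked_until):
    """Return (date, i) pairs, 1 <= i <= 48, for each half-hour period that
    lies entirely between the end of one journey and the start of the next."""
    midnight = parked_from.replace(hour=0, minute=0, second=0, microsecond=0)
    # Round the start of the parked interval up to the next half-hour boundary
    # (ceiling division on timedeltas).
    slots_elapsed = -(-(parked_from - midnight) // HALF_HOUR)
    start = midnight + slots_elapsed * HALF_HOUR

    periods = []
    while start + HALF_HOUR <= parked_until:
        day_start = start.replace(hour=0, minute=0, second=0, microsecond=0)
        i = (start - day_start) // HALF_HOUR + 1  # period index within the day
        periods.append((start.date(), i))
        start += HALF_HOUR
    return periods


# Hypothetical overnight stop: journey n ends at 17:10, journey n+1 starts at 08:45.
example = parked_periods(datetime(2020, 1, 6, 17, 10), datetime(2020, 1, 7, 8, 45))
```

Each returned pair would then receive the availability flag of the parked location, so that a vehicle parked within range of a charge point is marked available for every period in the list.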

The resulting dataset contained 677,280 rows, 57% of which represented half-hour periods in which a vehicle was available, i.e., *a<sub>v</sub>* = 1. In addition to vehicle availability, several other features were added to the data that had the potential to impact vehicle usage and hence availability:

• The day number (d), from 0 to 6, i.e., Sunday to Saturday
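Deriving a day number with Sunday as 0 needs a small adjustment in Python, because `date.weekday()` numbers Monday as 0 and Sunday as 6. The helper below is a hypothetical sketch of that conversion, not code from the study.

```python
from datetime import date


def day_number(day):
    """Map a date to 0..6 with Sunday = 0 and Saturday = 6."""
    # date.weekday() uses Monday = 0 ... Sunday = 6, so shift by one and wrap.
    return (day.weekday() + 1) % 7
```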

Example entries in the dataset are shown in Table 2.



**Table 2.** Sample data from the processed dataset.

Where d = day; hh = half-hour period; ph = public holiday; uh = university holiday; hol = holidays; term = university term; and *a<sub>v</sub>* = vehicle availability.

The data was split into training and test datasets containing 237 days (81%) and 57 days (19%) of the total dataset, respectively, with the composition shown in Table 3.
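A day-level split of this size can be sketched as follows. The text does not state how days were assigned to each set, so the random assignment, the seed and the function name `split_days` are assumptions made purely for illustration; a chronological split would be equally consistent with the figures given.

```python
import random


def split_days(n_days=294, n_test=57, seed=0):
    """Split day indices 1..n_days into training and test days.

    Random assignment is an assumption; the source does not specify the method.
    """
    days = list(range(1, n_days + 1))
    rng = random.Random(seed)
    test_days = set(rng.sample(days, n_test))
    train_days = [d for d in days if d not in test_days]
    return train_days, sorted(test_days)


train_days, test_days = split_days()
train_share = round(100 * len(train_days) / 294)  # percentage of days used for training
test_share = round(100 * len(test_days) / 294)    # percentage of days used for testing
```

Splitting by whole days rather than by individual rows keeps all 48 half-hour periods of a given day in the same set.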


**Table 3.** Composition of the training and test datasets.
