4.1. Datasets
In this case study, we use smartphone GPS data from ‘the City of Shenzhen Mapping’ database (sourced from the Shenzhen Urban Transport Planning Center, Shenzhen, China) to understand local driving behaviour in Shenzhen, China. The municipality of Shenzhen covers an area of 1991 square kilometers, including urban and rural areas, with a total population of approximately 12 million. The city has an elongated shape measuring 81.4 km from east to west, while the shortest section from north to south is only 10.8 km (Figure 2). Shenzhen was established as a Special Economic Zone in 1980, so the road system is relatively modern and well planned.
As shown in Table 1, each subset corresponds to a unique and continuous period between January and June 2017. The data in each set were collected over seven consecutive days. GPS positions are recorded every 1 s, resulting in a 3 TB database. Given the amount of data, we have a large number of samples for almost every major road and expressway in Shenzhen. Taking one subset in Table 1 as an example, approximately 2 billion points (each with a unique latitude and longitude) were collected in Set 5 from 1 May 2017 to 7 May 2017, Monday to Sunday. The raw data may contain both user and system errors. The system errors are mainly due to technical issues such as signal reflection, phone battery, canyon effects and network connection problems that disturb data transfer between the user and the server. Three factors mitigate the resulting noise and outliers. First, most GPS receivers employ proprietary filtering algorithms to compensate for data points that deviate beyond the expected variance, so the software embedded within the receiver automatically provides a certain level of data correction. Second, additional measures of reliability can help identify questionable data, and numerous techniques can filter the data based on these measures (e.g., the Pauta criterion). Third, one advantage of the proposed deep learning approach is that it reduces and manages the negative effects of defects during feature extraction.
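As an illustration of the Pauta criterion mentioned above (the three-sigma rule), the following minimal sketch drops samples that lie more than three standard deviations from the mean. The function name `pauta_filter`, the parameter `k` and the toy speed values are assumptions made for this example; the section does not describe the filtering actually applied to the Shenzhen data.

```python
from statistics import mean, pstdev


def pauta_filter(values, k=3.0):
    """Keep values within k population standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be flagged as an outlier.
        return list(values)
    return [v for v in values if abs(v - mu) <= k * sigma]


# Hypothetical speed samples in km/h; 400.0 stands in for a GPS glitch.
speeds = [40.0 + 0.5 * (i % 5) for i in range(30)] + [400.0]
cleaned = pauta_filter(speeds)
print(len(speeds), len(cleaned))
```

The rule needs a reasonably large sample: with ten or fewer points, a single outlier can never lie more than three population standard deviations from the mean, so nothing would be removed.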
4.2. Autoencoder-Self Organizing Mapping Network Application—Test One
To obtain samples that are most representative of the entire population, we followed a two-stage sampling process. First, the whole dataset was divided into 42 subgroups by date (7 days in a month for 6 months). Then, for each day, we selected the 96 ids with the most valid GPS points. In total, we obtained 4032 ids (42 × 96) as the driving behaviour training set for the AESOM neural networks (Table 2). To verify both the feasibility and accuracy of the autoencoder, we performed some experiments and decided to employ a structure consisting of an encoder with one input layer and two hidden layers, with 8, 6 and 3 neurons respectively (Figure 1). This also determines the structure of the feature extraction matrix and yields a 3-D vector that passes the significant components to the input neurons of the SOM for clustering. The lattice of the competitive layer in the SOM was set as a 5 × 5 grid, in consideration of both computational cost and classification performance.
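The two-stage sampling described above can be sketched as follows. This is only an illustration: the record layout (one `(date, device_id)` pair per valid GPS point), the function name `two_stage_sample` and the toy data are assumptions, not the authors' pipeline.

```python
from collections import Counter


def two_stage_sample(records, top_n=96):
    """Stage 1: group valid GPS points by date.
    Stage 2: keep the top_n ids with the most points on each date."""
    counts_by_date = {}
    for date, device_id in records:
        counts_by_date.setdefault(date, Counter())[device_id] += 1
    selected = []
    for date in sorted(counts_by_date):
        for device_id, _count in counts_by_date[date].most_common(top_n):
            selected.append((date, device_id))
    return selected


# Toy data: each tuple is one valid GPS point (hypothetical ids "a", "b", "c").
records = (
    [("2017-05-01", "a")] * 5
    + [("2017-05-01", "b")] * 3
    + [("2017-05-01", "c")] * 1
    + [("2017-05-02", "b")] * 4
    + [("2017-05-02", "c")] * 2
)
picked = two_stage_sample(records, top_n=2)

# Size of the training set reported in the text: 7 days x 6 months x 96 ids.
n_days = 7 * 6
training_ids = n_days * 96
print(picked, training_ids)
```

With the values from the text, 42 daily subgroups of 96 ids each give the 4032 training ids.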
In this AESOM framework, the main objective of the autoencoder is to detect the structure of a large multivariate dataset (data patterns and relations) and to implement a compression scheme. In addition, it learns to what extent each component is associated with each input variable and how much of the variability of the original dataset the set of components explains. After obtaining the component vector (the output of layer 3), we can interpret and name each factor by observing the contribution of all the variables. Table 3 shows the output, where the loss value is only 0.08, indicating good network performance. The relationship between the components and the input variables is displayed in Table 4.
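To make the 8-6-3 encoder concrete, the sketch below runs a forward pass through an untrained autoencoder and evaluates a reconstruction loss. Only the 8-6-3 layer sizes come from the text; the mirrored 3-6-8 decoder, the tanh activations, the mean-squared-error loss and the random inputs are assumptions for illustration, so the printed loss is unrelated to the 0.08 reported above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Encoder sizes 8-6-3 follow the text; the mirrored decoder is an assumption.
sizes = [8, 6, 3, 6, 8]
weights = [rng.normal(0.0, 0.1, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]


def forward(x):
    """Return (code, reconstruction) for a batch x of shape (batch, 8)."""
    h = x
    code = None
    last = len(weights) - 1
    for i, (w, b) in enumerate(zip(weights, biases)):
        z = h @ w + b
        # tanh on hidden layers, linear output layer (both assumed).
        h = z if i == last else np.tanh(z)
        if i == 1:
            code = h  # 3-D component vector (output of layer 3)
    return code, h


def mse(x, x_hat):
    """Mean squared reconstruction error."""
    return float(np.mean((x - x_hat) ** 2))


x = rng.normal(size=(4, 8))  # four hypothetical 8-feature driving samples
code, x_hat = forward(x)
loss = mse(x, x_hat)
print(code.shape, x_hat.shape, loss)
```

In a trained network, the rows of `code` would play the role of the 3-D component vectors that are passed on to the SOM.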
According to the feature extraction matrix in Table 4, two of the three components display strong relationships with the acceleration/deceleration driving features. Specifically, one reflects acceleration (with “+” sign) while the other represents deceleration of vehicles (with “−” sign). In contrast, the remaining component reflects speeding behaviour.
There is a growing body of evidence suggesting that several road safety benefits are associated with reduced speed variability between vehicles. Specifically, increased speed variation may disturb homogenised traffic flow and increase the likelihood of conflict situations caused by human behaviour [49]. Considering the combination of the acceleration and deceleration components, we set the coefficient β to (1, 0, 1) and correspondingly designed the SOM networks to cluster the 2-D inputs into 4 classes, as shown in Table 5. Encouragingly, almost half of the drivers in the Shenzhen metro area drive at a consistent speed. Only 3.2% of drivers show heavily variable speed; we call this small group “Neurotic” drivers.
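The SOM clustering step can be sketched with a small NumPy implementation of a 5 × 5 lattice trained on 2-D inputs. The learning-rate and neighbourhood schedules, the number of epochs and the synthetic two-blob data are assumptions; the section does not state these settings, nor how the 25 lattice nodes are grouped into the final classes.

```python
import numpy as np


def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols self-organizing map with a Gaussian neighbourhood."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Lattice coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # shrinking neighbourhood radius
            dist = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)  # best matching unit
            grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def assign(data, weights):
    """Return the flat index (0..rows*cols-1) of the closest node per sample."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)


# Synthetic 2-D inputs standing in for the two selected components.
rng = np.random.default_rng(1)
data = np.vstack([
    rng.normal(0.2, 0.05, size=(50, 2)),
    rng.normal(0.8, 0.05, size=(50, 2)),
])
weights = train_som(data)
nodes = assign(data, weights)
print(np.bincount(nodes, minlength=25).reshape(5, 5))
```

The printed 5 × 5 matrix counts how many samples land on each lattice node; class shares such as those in Table 5 would then come from grouping nodes into classes.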
Based on the speeding component, the SOM networks produced four distinct clusters, as displayed in Table 6. Of the sampled drivers, 3.69% would be classified as consistently speeding. This smallest group can be labelled as the “Aggressive” class. Although this percentage is small, there were over 3 million vehicles in Shenzhen and around 1.7 million vehicles on the road each day in 2017. Thus, the actual number of aggressive drivers on the road daily in Shenzhen can be up to 70,000. A further 53.08% (C2 + C3) show light to moderate risky speeding profiles.
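The order of magnitude of this estimate can be checked with a short calculation. It assumes that the 3.69% share observed in the sample carries over to all vehicles on the road, which the text implies but does not state; the 3,000,000 figure is used as a round lower bound for "over 3 million".

```python
share_aggressive = 0.0369        # "Aggressive" share quoted from Table 6
vehicles_on_road = 1_700_000     # daily on-road vehicles in 2017, from the text
registered_vehicles = 3_000_000  # "over 3 million", taken as a round lower bound

daily_estimate = share_aggressive * vehicles_on_road
registered_estimate = share_aggressive * registered_vehicles

print(f"{daily_estimate:,.0f} aggressive drivers on the road per day")
print(f"{registered_estimate:,.0f} if applied to all registered vehicles")
```

Applied to the daily on-road volume, the share gives roughly 63,000 drivers, which is in line with the "up to 70,000" figure quoted above.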
Note that the SOM networks based on either the acceleration component or the deceleration component alone yield only three distinct clusters. As presented in Table 7, 66.34% of drivers (in C1) prefer to decelerate in a relatively smooth style. However, 6.48% of drivers exhibit inconsistent or excessive acceleration (harsh take-off) and are labelled “Inattentive” drivers. As expected, clustering based on the deceleration component indicates a distribution similar to that for the acceleration component: about 93.99% of drivers (C1 + C2) constitute the norm, while 6.01% decelerate frequently, as shown in Table 8. The latter are more likely to tailgate closely and brake suddenly.
In conclusion, drivers in Shenzhen by and large conformed to road rules, staying within the speed limit, with no harsh braking or sharp acceleration. A small group of drivers was prone to acceleration as well as deceleration. This kind of motion can create a high risk of accidents. Driving behaviour patterns have various physical, psychological and incidental aspects that are measurable. Driver behaviour is related not only to the driver’s character and socio-economic background, but also to education, training, police enforcement, etc.
4.3. Autoencoder-Self Organizing Mapping Network Application—Test Two
In the second experiment, to investigate improper vehicle position maintenance, we added two vehicle lateral orientation features derived from the raw sequences of GPS data: the instantaneous angular velocity of the vehicle and its angular acceleration. Thus, a ten-dimensional X vector was formulated.
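How the two angular features are computed from the GPS sequence is not detailed in this section. One plausible sketch, under the assumption that the vehicle heading is approximated by the bearing between consecutive fixes sampled every 1 s, uses finite differences of the heading. The function names and the toy track below are hypothetical.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0


def wrap_deg(delta):
    """Wrap an angle difference into [-180, 180)."""
    return (delta + 180.0) % 360.0 - 180.0


def angular_features(track, dt=1.0):
    """track: list of (lat, lon) fixes sampled every dt seconds.
    Returns angular velocity (deg/s) and angular acceleration (deg/s^2)."""
    headings = [bearing_deg(*p, *q) for p, q in zip(track, track[1:])]
    omega = [wrap_deg(b - a) / dt for a, b in zip(headings, headings[1:])]
    alpha = [(b - a) / dt for a, b in zip(omega, omega[1:])]
    return omega, alpha


# Toy track: two steps due north, then two steps due east (a right-angle turn).
track = [
    (22.500, 114.000),
    (22.501, 114.000),
    (22.502, 114.000),
    (22.502, 114.001),
    (22.502, 114.002),
]
omega, alpha = angular_features(track)
print(omega, alpha)
```

Wrapping the heading difference keeps a turn across north (for example from 350° to 10°) from being read as a 340° rotation.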
We kept the same autoencoder structure as in test one, consisting of an encoder with one input layer and two hidden layers, with 8, 6 and 3 neurons respectively. The new loss value is 0.03, indicating better network performance than in test one. The results are presented in Table 9, Table 10, Table 11 and Table 12.
Test two considers variations in both the lateral and longitudinal position of the vehicle. According to the extraction matrix in Table 9, one component presents a strong correlation between angular velocity and vehicle deceleration features; a second reflects an association with speed and acceleration (with “+” sign); and the remaining component displays a strong relationship with the combination of lateral and longitudinal speed features.
In contrast to the clustering results in test one, which showed a small class defying driving norms, the SOM networks in test two produced three distinct clusters based on the component linking angular velocity and deceleration, with 23.52% of drivers in C3 making sharp turns while decelerating (Table 10). A typical scenario is turning at intersections, where these drivers tend to turn the steering wheel suddenly while braking harshly. Another differentiating factor in the way drivers turn is the more extreme lateral acceleration at high speed seen in a small class C4 (1.29%) in Table 12, based on the component combining lateral and longitudinal speed features, which indicates a higher risk of accidents.
Improper vehicle lateral position maintenance and inconsistent or excessive angular acceleration/deceleration have been identified as major behavioural characteristics that influence road safety. The proposed AESOM approach combines feature learning and classification in an integrated deep learning framework to discover latent patterns and values in large-scale sensor data. The clustering results display heterogeneous driving style profiles across the population. With the vehicle lateral orientation parameters added to the neural networks, the experiments verify the advantages of AESOM when dealing with high dimensionality.