1. Introduction
With the development of technology and the increase in transportation demand, many researchers have been drawn to traffic problems, which are both highly complex and practically significant. Many models have been proposed to explain the characteristics and mechanisms of traffic flow evolution; they can be broadly classified into microscopic, macroscopic, and mesoscopic models [1,2,3,4,5].
The cellular automata model (CA model) is a type of microscopic model. The basic idea of the CA model is to use a large number of simple structures, simple links, and simple rules running in parallel in time and space to simulate complex and rich phenomena. It has the following advantages.
(1) It portrays the collective phenomena and evolutionary dynamics of complex systems well.
(2) Its computation is simple and efficient.
(3) The update rule is flexible and intuitive.
The first traffic flow CA model can be traced back to the Wolfram 184 model [6]. In each time step, a vehicle either remains motionless (if the cell ahead is occupied) or moves forward one cell (if the cell ahead is empty). Another CA model was proposed by Cremer and Ludwig in 1986 [7]. However, these early models attracted little attention until the NaSch model was proposed [8]. In the NaSch model, the transition from free flow to congested flow is not of the first order. To describe the first-order transition in traffic flow, several scholars introduced the slow-start rule [9,10,11,12].
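The update rule of the Wolfram 184 model described above can be illustrated with a minimal Python sketch. The ring road (periodic boundary), the list encoding of cells, and the function name `rule184_step` are assumptions made for illustration and are not taken from the cited work.

```python
def rule184_step(cells):
    """One parallel update of rule 184 on a ring road.

    cells is a list of 0/1 values (1 = occupied); vehicles drive toward
    higher indices and wrap around at the end (assumed periodic boundary).
    """
    n = len(cells)
    new_cells = [0] * n
    for i, occupied in enumerate(cells):
        if not occupied:
            continue
        ahead = (i + 1) % n
        if cells[ahead]:
            new_cells[i] = 1      # cell ahead occupied: the vehicle stays
        else:
            new_cells[ahead] = 1  # cell ahead empty: move forward one cell
    return new_cells


road = [1, 1, 0, 1, 0, 0]
print(rule184_step(road))  # [1, 0, 1, 0, 1, 0]
```

All vehicles read the old configuration and write into a new list, which is what makes the update parallel rather than sequential.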
Based on long-term empirical data analysis, Kerner proposed a three-phase traffic theory [13,14,15], which divides congested traffic into a synchronized flow phase (S) and a wide moving jam phase (J), in addition to a free flow phase (F). In the synchronized flow phase, when the speed is not low (or the density is not high), congestion does not appear spontaneously and the synchronized flow is stable. As the speed decreases (or the density increases), the synchronized flow becomes unstable, and congestion appears spontaneously.
To reproduce synchronized flow, many CA models have been proposed. Knospe et al. proposed the comfortable driving (CD) model in 2000, which considers the driver’s desire for smooth and comfortable driving [16]. The CD model shows a coexistence of free flow and wide moving jams, but it cannot reproduce synchronized flow correctly. Jiang and Wu proposed a modified comfortable driving (MCD) model in 2003 [17]. The MCD model can reproduce synchronized flow, wide moving jams, and the S→J transition, but it fails to reproduce the F→S transition.
The Kerner–Klenov–Wolf (KKW) model is one of the first CA models in the framework of three-phase traffic theory; it explicitly considers the two-dimensional region of steady states and the speed adaptation effect [18]. Its update rules include deterministic and stochastic rules.
Gao proposed a model in 2007 that combines the generalized NaSch model with the slow-start rule [19]. The fundamental diagram obtained is similar to that of the KKW model. This model can reproduce synchronized flow, wide moving jams, and the S→J transition, but it fails to reproduce the F→S transition. To overcome this deficiency, Gao et al. later proposed an improved model [20]. However, vehicle velocities fluctuate too strongly in its synchronized flow, which seems unrealistic.
Tian proposed a two-state (TS) model in 2015 that considers two driver states, a defensive state and a normal state [21]. The model can reproduce free flow, synchronized flow, wide moving jams, and the F→S and S→J transitions. To make the TS model more realistic, Tian introduced a logistic function of safe speed and random probability into the two-state model, yielding the two-state model with safe speed (TSS model) [22]. The TSS model reproduces the three phases of traffic flow well, and synchronized flow can coexist with free flow.
In the models above, vehicles have unlimited deceleration capability and can stop within a single time step. Some models instead consider a finite deceleration capability [23,24,25,26,27,28].
Driving style is closely related to driving behavior and can be used as a representation to predict and explain driving behavior. As a central part of the traffic system, the driver’s behavior plays an important role in the traffic flow.
Kaur et al. introduced driver behavior characteristics into the lattice model and studied driver behavior on curves using a lattice hydrodynamic approach [29]. Wang et al. used factor analysis to extract the main factors affecting driving emotion and established a driving emotion recognition model based on fuzzy integrated judgment and the AD emotion model; the model was validated by actual driving, virtual driving, and interactive simulation experiments [30]. Zheng et al. considered the effect of driver memory and proposed an extended car-following model that accounts for the control signals of vehicles and the speed contrast of the following vehicle; numerical simulations showed that the model is effective in suppressing traffic congestion [31]. Thompson et al. investigated the effects of habits and expectations on driver behavior and attention allocation in familiar and unfamiliar environments [32]. Peng et al. analyzed the generation and development of overtaking-induced road rage using 32 overtaking accidents as examples [33]. Based on the diversity of driving tendencies, Yang et al. proposed a two-lane CA model to simulate the average speed, flow rate, and lane-change frequency under different lane-change and deceleration rules [34]. Shi et al. investigated the effects of distracting behaviors, such as cell phone use while driving, on traffic safety [35]. Sharma et al. used a lattice hydrodynamic traffic flow model to study the influence of driver aggressiveness and conservatism on driving behavior [36]. Li et al. proposed a new lattice model that accounts for the driver aggression effect and analyzed the effect of aggressive driving behavior on traffic flow stability [37].
Currently, the impact of individual driver behavior on traffic flow is mainly analyzed qualitatively, and relatively few quantitative studies have been conducted. Most previous studies on driving heterogeneity were based on experiments and questionnaires, such as [33,34,35,36,37].
Questionnaires offer only fixed response options and cannot reconstruct the actual driving condition of the vehicle. Because current research on driving style is not comprehensive, this paper analyzes driving styles from a modeling perspective based on the real driving condition of the vehicle.
The driver’s personality characteristics also affect driving outcomes. In the same driving scenario, different drivers experience different physiological and psychological conditions and thus produce different fuel consumption and emissions, which affects the sustainable development of transportation.
Since drivers’ daily driving habits affect vehicle emissions, some scholars have trained drivers or made operational recommendations to encourage ecological driving behaviors, such as avoiding sudden stops and excessive idling and accelerating gently, which can significantly reduce vehicle energy consumption and emissions [38].
Miotti et al. stated that regulating driving styles can help reduce the energy consumption and emissions of driving without requiring changes to infrastructure or vehicle technology [39].
Meseguer et al. experimentally verified that an aggressive driving style always leads to more energy consumption and CO2 emissions [40].
Gonder et al. experimented on light vehicles and found that changing the driving style can change fuel consumption by 20% for aggressive drivers, and that the change for milder drivers can still reach 5% to 10% [41].
Rafael et al. evaluated the impact of three driving styles on fuel efficiency and emissions on a chassis dynamometer. The results showed that the aggressive driving style had low fuel efficiency and high emissions [42].
Mansfield et al. studied the pre- and post-intervention bias changes in driving behavior and showed that the effect of intervention on driving behavior depends on individual driver factors and driving motivation [43].
Barth et al. proposed providing dynamic driving advice during the trip. Such advice can reduce fuel consumption and CO2 emissions by 10–20% without significantly affecting the overall trip time. The percentage savings also depend on the congestion level, with little effect in free flow and significant savings in congested conditions [44].
Zhai et al. proposed an enhanced continuum traffic flow model that considers the effects of driver characteristics and traffic fluctuations, obtained the model’s stability conditions under linear and nonlinear analysis, and investigated the effects of driver characteristics and traffic fluctuations on traffic flow and emissions [45].
Jiao et al. analyzed the relationship between driver characteristics and the optimal following speed using the grey correlation method, proposed a new optimal velocity function (OVF) and car-following model, and numerically simulated the influence of different driver characteristics on car-following behavior and fuel economy [46].
Frequent start-stops and idling of vehicles increase fuel consumption and emissions. Studying the formation and dissipation mechanisms of congestion can therefore help implement reasonable traffic control, which reduces waiting time in congested areas and thus energy consumption and emissions, supporting the sustainable development of transportation.
Pan et al. investigated the effect of traffic congestion on particulate matter emissions and the energy consumption of single-lane traffic streams using the NaSch model with periodic and open boundary conditions [47].
Shankar et al. found that a significant reduction in traffic congestion could yield a significant reduction in energy consumption [48].
The rest of this article is arranged as follows. The second part introduces the data processing and method used for driving style classification and presents a statistical analysis of the differences in vehicle speed, acceleration, deceleration, etc., among styles. The third part describes the classical CA models and the main rules of the improved model. The fourth part presents the simulation analysis in terms of the fundamental diagram and spatio-temporal characteristics. Finally, the fifth part summarizes the paper’s findings.
2. Driving Style Analysis
2.1. Data Preparation
In this study, we used NGSIM data to classify driving styles, extracted driving data for following vehicles, and used the data to validate an improved cellular automata model.
The NGSIM data were obtained from the Next Generation Simulation program [49], which collected vehicle trajectory data on US-101 and Lankershim Boulevard in Los Angeles, California, I-80 in Emeryville, California, and Peachtree Street in Atlanta, Georgia. The data provide the location of each vehicle recorded at 10 Hz, giving precise lane positions and positions relative to other vehicles. The data from US-101 were selected for analysis in this study.
The US-101 data contain 25 attributes, including vehicle ID, frame ID, global time, local X, and local Y [49]. The study area is about 640 m long and has eight lanes: five driving lanes, two ramp lanes, and one merging lane, as shown in Figure 1. The dataset contains 45 min of US-101 vehicle trajectories divided into three 15-min periods: 7:50 a.m.–8:05 a.m., 8:05 a.m.–8:20 a.m., and 8:20 a.m.–8:35 a.m. The trajectory data from 7:50 a.m.–8:05 a.m. were used in this study.
To ensure the accuracy of the data, we selected the vehicles on the main lanes, which accounted for about 97% of the total, restricted the study to cars, and converted the imperial units to SI units.
The contents of the processed data are shown in Table 1.
2.2. Data Preprocessing
The NGSIM raw data were obtained from video analysis and contain considerable errors and noise [50]. Using the raw data directly would bias the analysis results, affecting the calibration of the microscopic traffic flow model and reducing the accuracy of the subsequent analysis. Therefore, a Savitzky–Golay filter with a third-order polynomial and a window length of 21 was used to smooth the velocity and acceleration data [51].
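A minimal sketch of this smoothing step is given here, assuming SciPy’s `savgol_filter` as the implementation (the paper does not name a library). The synthetic speed trace, the noise level, and all variable names are illustrative assumptions; only the window length of 21 and the polynomial order of 3 come from the text.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic 10 Hz speed trace standing in for one NGSIM vehicle (assumption).
rng = np.random.default_rng(0)
t = np.arange(0.0, 20.0, 0.1)
true_speed = 15.0 + 2.0 * np.sin(0.3 * t)
noisy_speed = true_speed + rng.normal(0.0, 0.5, t.size)

# Third-order polynomial, window of 21 samples (2.1 s at 10 Hz).
smooth_speed = savgol_filter(noisy_speed, window_length=21, polyorder=3)

noisy_rmse = np.sqrt(np.mean((noisy_speed - true_speed) ** 2))
smooth_rmse = np.sqrt(np.mean((smooth_speed - true_speed) ** 2))
print(f"RMSE before: {noisy_rmse:.3f} m/s, after: {smooth_rmse:.3f} m/s")
```

The same call can be applied to the acceleration series. The window length must be odd and larger than the polynomial order.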
2.3. Car-Following Process Extraction
The car-following process describes how the following vehicle (FV) responds to the driving state of the leading vehicle (LV).
Figure 2 shows the car-following scenario. In the same lane, the driver of the following vehicle adjusts the vehicle’s speed in real time according to the driving behavior of the leading vehicle to maintain the desired distance; d is the distance between the two vehicles.
Since car-following is driving behavior in a single lane, the data need to be pre-processed. To study the heterogeneity of following behavior, the car-following data that meet the requirements must be extracted from the US-101 dataset according to the following screening conditions.
(1) Two vehicles driving consecutively in the same lane are extracted as one pair.
(2) Neither the following vehicle nor the vehicle in front changes lanes within a certain time.
(3) Only vehicles in lanes 1–5 are selected, because vehicles driving onto or off the ramps may affect the following behavior.
(4) Only vehicles of type “car” are considered, because different vehicle types follow differently and cars account for a large proportion of the traffic.
(5) For each data group, the following time must be above 20 s to ensure a relatively stable following state.
(6) For each data group, the time headway to the preceding vehicle must stay within 5 s, so that the spacing is not so large that the following effect becomes weak.
According to the above rules, a total of 1104 data sets (653,568 vehicle trajectory samples) were collected from the processed NGSIM data. For the extracted vehicle following data, 70% were used for driving style analysis and determining the model design parameters, and 30% were used to test the model’s accuracy.
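The screening conditions and the 70/30 split can be sketched as follows. This is only an illustration: the record fields, the function names, and the use of a seeded random shuffle for the split are hypothetical, since the paper does not describe its data structures or how the split was drawn.

```python
import random
from dataclasses import dataclass


@dataclass
class FollowingEpisode:
    # Hypothetical summary of one leader/follower pair.
    lane: int
    vehicle_class: str
    lane_change: bool
    duration_s: float
    max_time_headway_s: float


def keep_episode(ep, min_duration_s=20.0, max_headway_s=5.0):
    """Apply the screening conditions described in the text."""
    return (
        1 <= ep.lane <= 5                          # main lanes only
        and ep.vehicle_class == "car"              # cars only
        and not ep.lane_change                     # no lane change while following
        and ep.duration_s > min_duration_s         # following time above 20 s
        and ep.max_time_headway_s <= max_headway_s # headway within 5 s
    )


def split_episodes(episodes, train_fraction=0.7, seed=0):
    """Shuffle and split into analysis and test sets (random split assumed)."""
    shuffled = list(episodes)
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


episodes = [
    FollowingEpisode(2, "car", False, 35.0, 3.1),
    FollowingEpisode(7, "car", False, 40.0, 2.0),    # ramp lane
    FollowingEpisode(3, "truck", False, 50.0, 2.5),  # not a car
    FollowingEpisode(1, "car", False, 12.0, 1.8),    # too short
    FollowingEpisode(4, "car", False, 28.0, 6.4),    # headway too long
]
kept = [ep for ep in episodes if keep_episode(ep)]
train, test = split_episodes(kept)
print(len(kept), len(train), len(test))
```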
2.4. Driving Style Division
Driving styles are closely related to driving safety, and aggressive driving styles are usually more likely to lead to traffic accidents. For example, aggressive drivers are easily affected by the road environment (e.g., rain or snow) and by other road users (e.g., those disobeying traffic rules) while driving, and they may show driving anger (e.g., frustration, mild irritation) and aggressive driving behavior (e.g., cursing, honking). To reflect actual driving conditions and reduce the influence of subjective questionnaires on the results, vehicle kinematic data were used in this paper.
Velocity and acceleration can show driving habits, and the frequency of changes in velocity and following distance can reflect driving personality. In this paper, 10 evaluation indicators of driving style were selected, as shown in Table 2.
When classifying driving styles, using all the parameters of the car-following model retains all the information about driving behavior. However, classifying driving styles directly on these parameters can impair the accuracy and convergence of the analysis. Principal component analysis (PCA) is a statistical algorithm that converts correlated variables into linearly uncorrelated variables using orthogonal transformations. The transformed variables are called principal components (PCs).
We calculated the contribution rate and cumulative contribution rate of each PC, as shown in Figure 3. The first five PCs were selected based on the 85% cumulative contribution principle so that they adequately reflect the information in the original indicators. The PC coefficient matrix is shown in Table 3.
The scores of each PC were calculated based on the PC score coefficient matrix and used as the input for the subsequent classification and driving style recognition models. Among them, the main influencing factors of the first three PCs are average acceleration, maximum velocity, and average deceleration.
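The PC selection and scoring can be sketched with NumPy as follows. Standardizing the indicators before the decomposition, the function name `pca_scores`, and the random stand-in data are assumptions; the 85% cumulative contribution threshold is the only value taken from the text.

```python
import numpy as np


def pca_scores(X, cum_threshold=0.85):
    """Return PC scores, contribution rates, and the number of PCs kept."""
    # Standardize each indicator (assumed preprocessing step).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.cov(Z, rowvar=False)            # correlation matrix of X
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]         # sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()           # contribution rate of each PC
    cum = np.cumsum(ratio)                    # cumulative contribution rate
    # Smallest number of PCs whose cumulative contribution reaches the threshold.
    n_keep = int(np.searchsorted(cum, cum_threshold) + 1)
    return Z @ eigvecs[:, :n_keep], ratio, n_keep


# Random stand-in for a table of 10 driving style indicators (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
scores, ratio, n_keep = pca_scores(X)
print(n_keep, scores.shape)
```

The returned scores play the role of the inputs to the clustering step.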
Then the number of driving style classifications was determined using cluster analysis methods. In this paper, the k-means algorithm was used to classify driving styles. The essence of the algorithm is to determine new cluster centers through iterative operations, and the calculation converges when the cluster centers do not change.
However, a disadvantage of k-means is that the number of clusters k is difficult to determine. The main methods for determining the value of k are the silhouette measure and the elbow method. In the elbow method, the number of clusters k is varied from 1 to 8 in steps of 1, and the sum of squared errors (SSE) is calculated for each value of k, as shown in Equation (1):

$$\mathrm{SSE} = \sum_{i=1}^{k} \sum_{x \in C_i} \left\| x - m_i \right\|^2 \tag{1}$$

where $m_i$ is the cluster midpoint of cluster $C_i$, $x$ is a sample in cluster $C_i$, and $k$ is the total number of clusters in the dataset.
As the number of clusters increases, each cluster becomes more tightly aggregated and the SSE gradually decreases. While k is less than the correct number of clusters, increasing k markedly tightens each cluster, so the SSE drops sharply. Once k reaches the optimal number of clusters, further increases reduce the SSE only slowly, and the curve eventually levels off. The relationship between SSE and k therefore has the shape of an elbow, and the value of k at the elbow is the optimal number of clusters. In our calculation, the change in SSE levels off when k is greater than 3. Therefore, three driving styles were distinguished in this study and named calm, moderate, and aggressive.
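The elbow procedure can be sketched with a plain NumPy implementation of k-means. The random initialization, the helper names, and the synthetic three-cluster data are assumptions for illustration; the paper’s actual implementation and data are not shown in the text.

```python
import numpy as np


def assign(X, centers):
    """Label each sample with the index of its nearest cluster center."""
    dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return dist.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Basic k-means: iterate until the cluster centers stop changing."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centers)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(X, centers), centers


def sse(X, labels, centers):
    """Sum of squared distances from each sample to its cluster center."""
    return float(((X - centers[labels]) ** 2).sum())


def elbow_curve(X, k_max=8):
    """SSE for k = 1 .. k_max."""
    return [sse(X, *kmeans(X, k)) for k in range(1, k_max + 1)]


# Synthetic data with three well-separated groups (assumption).
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=c, scale=1.0, size=(50, 2))
    for c in ([0, 0], [10, 0], [0, 10])
])
curve = elbow_curve(X)
for k, value in enumerate(curve, start=1):
    print(k, round(value, 1))
```

Because a single random initialization can end in a poor local optimum, practical implementations usually repeat each k with several seeds and keep the lowest SSE.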
The driving style identification results are then shown in terms of the first three principal components in Figure 4. The clustering results show that 24% of the vehicles have an aggressive driving style, 54% a moderate driving style, and 22% a calm driving style.
To characterize the following behavior under different driving styles, the center of each cluster is taken to represent the following behavior of that class of drivers, as shown in Table 4. According to the clustering results, aggressive drivers stay close to the preceding vehicle, drive fast, and tend to act boldly while following; their acceleration and deceleration are greater than in the other two styles. Calm drivers keep well away from the vehicle in front and drive cautiously and slowly. The surroundings have less influence on moderate drivers.
3. Model Description
3.1. NaSch Model
The NaSch model is a classical single-lane CA model whose evolution rules are acceleration, deceleration, random slowing, and position update. The random slowing rule ensures that vehicles do not travel at a fixed speed: each vehicle slows down randomly with probability p. The evolution rules of the NaSch model in one time step are as follows.

Acceleration: $v_n \rightarrow \min(v_n + 1, v_{\max})$
Deceleration: $v_n \rightarrow \min(v_n, d_n)$
Random slowing: $v_n \rightarrow \max(v_n - 1, 0)$ with probability $p$
Position update: $x_n \rightarrow x_n + v_n$

where $v_{\max}$ is the maximum velocity, $d_n$ is the distance between two vehicles, and $x_n$ and $v_n$ are the position and velocity of the vehicle at the moment n, respectively.
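One NaSch time step can be sketched in Python as follows. The ring road, the interpretation of the distance between two vehicles as the number of empty cells ahead, the default parameter values, and the function name `nasch_step` are assumptions for illustration.

```python
import random


def nasch_step(positions, velocities, road_length, v_max=5, p=0.3, rng=random):
    """One parallel NaSch update on a ring road.

    positions lists vehicle cells in driving order, so vehicle i + 1
    (cyclically) is the leader of vehicle i.
    """
    n = len(positions)
    new_velocities = []
    for i in range(n):
        leader = positions[(i + 1) % n]
        gap = (leader - positions[i] - 1) % road_length  # empty cells ahead
        v = min(velocities[i] + 1, v_max)  # acceleration
        v = min(v, gap)                    # deceleration to avoid collision
        if v > 0 and rng.random() < p:     # random slowing with probability p
            v -= 1
        new_velocities.append(v)
    new_positions = [(x + v) % road_length
                     for x, v in zip(positions, new_velocities)]
    return new_positions, new_velocities


# Deterministic example (p = 0) on a 10-cell ring.
x, v = nasch_step([0, 3, 4], [0, 0, 0], road_length=10, p=0.0)
print(x, v)  # [1, 3, 5] [1, 0, 1]
```

All velocities are computed from the old positions before any vehicle moves, which keeps the update parallel.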
3.2. KKW Model
This subsection briefly introduces the core part of the KKW model, the speed adaptation rules [52]. The update rules include dynamical rules and stochastic rules.
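Step 1. Dynamical part of the KKW model:

$$\tilde{v}_{n+1} = \max\left(0, \min\left(v_{\max}, v_{s,n}, v_{c,n}\right)\right),$$

in which $v_{c,n}$ is calculated by the following equation:

$$v_{c,n} = \begin{cases} v_n + a\tau, & d_n > D_n, \\ v_n + a\tau\,\mathrm{sign}(v_{\ell,n} - v_n), & d_n \le D_n. \end{cases}$$

In the above equations, $v_{\max}$ is the maximum speed of the vehicles, $v_{s,n}$ is the safe speed that the vehicle cannot exceed to avoid a collision, and $v_{c,n}$ is the adaptation speed of the vehicle, which depends on the synchronized flow distance and the speed $v_{\ell,n}$ of the preceding vehicle. $\tau$ is the time step, set to 1 s, and $a$ is the acceleration or deceleration. sign(x) is the sign function: if x > 0, sign(x) = 1; if x = 0, sign(x) = 0; otherwise sign(x) = −1. $D_n$ is the synchronized distance, and $d_n$ is the distance between the following vehicle and the preceding vehicle.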
The update rule for the adaptation speed $v_{c,n}$ reflects the “speed adaptation” effect: when the preceding vehicle is within the synchronization distance $D_n$, the following vehicle decides whether to accelerate according to the preceding vehicle’s speed. When the distance is greater than $D_n$, the influence of the preceding car on the following car is weak, and the acceleration of the following car is almost unaffected by the speed of the preceding car. When the distance is less than $D_n$, the preceding car acts strongly on the following car, and the following car adjusts its speed toward the speed of the preceding car. Speed adaptation ensures that the speed of the following car stays very close to the speed of the preceding car within a certain distance range, which produces a synchronized flow state.
The synchronized flow distance is a function of the speed. In [15], Kerner gives the following linear and nonlinear relations for $D_n$:

$$D_n = d + k v_n \tau,$$

$$D_n = d + v_n \tau + \frac{\beta v_n^2}{2a},$$

where $d$, $k$, and $\beta$ are constants. In this paper, the linear relationship for $D_n$ is used in the parameter design of the improved model.
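The speed adaptation rule can be sketched in Python, assuming the linear relation takes the common form D = d + k·v·τ. The function names and the numerical values in the example are illustrative assumptions, not the calibrated parameters of the paper.

```python
def sign(x):
    """Sign function: 1 for x > 0, 0 for x = 0, -1 for x < 0."""
    return (x > 0) - (x < 0)


def sync_distance(v, d, k, tau=1.0):
    """Linear synchronization distance, assumed form D = d + k * v * tau."""
    return d + k * v * tau


def adaptation_speed(v, v_leader, distance, a, d, k, tau=1.0):
    """Candidate speed from the speed adaptation rule."""
    if distance > sync_distance(v, d, k, tau):
        # Leader is far away: accelerate regardless of its speed.
        return v + a * tau
    # Leader is within the synchronization distance: move toward its speed.
    return v + a * tau * sign(v_leader - v)


# Illustrative values only (assumptions): d = 7.5 m, k = 2.5, a = 0.5 m/s^2.
params = dict(a=0.5, d=7.5, k=2.5)
print(adaptation_speed(10.0, 8.0, distance=100.0, **params))  # 10.5
print(adaptation_speed(10.0, 8.0, distance=20.0, **params))   # 9.5
print(adaptation_speed(10.0, 10.0, distance=20.0, **params))  # 10.0
```

The result is only a candidate speed; in the KKW model it is still limited by the maximum speed and the safe speed.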
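Step 2. Stochastic part of the KKW model:

$$v_{n+1} = \max\left(0, \min\left(\tilde{v}_{n+1} + a\tau\eta_n,\; v_n + a\tau,\; v_{\max},\; v_{s,n}\right)\right),$$

in which $\tilde{v}_{n+1}$ is the speed obtained in Step 1, $v_{s,n}$ is the safe speed, $a$ is the vehicle’s acceleration, and $\eta_n$ is the parameter that controls the speed disturbance. Its value is set by drawing a random number and represents the slow-start and “pinch” phenomena, depending on the corresponding conditions, as in Equation (11):

$$\eta_n = \begin{cases} -1, & r < p_b, \\ 1, & p_b \le r < p_b + p_a, \\ 0, & \text{otherwise}, \end{cases} \tag{11}$$

where r is a random number between 0 and 1, and $p_a$ and $p_b$ are functions used to control the rates of acceleration and deceleration:

$$p_b(v_n) = \begin{cases} p_0, & v_n = 0, \\ p, & v_n > 0, \end{cases} \qquad p_a(v_n) = \begin{cases} p_{a1}, & v_n < v_p, \\ p_{a2}, & v_n \ge v_p. \end{cases}$$

Here $p_{a1}$ and $p_{a2}$ are constants between 0 and 1, and $p_0$, $p$, and $v_p$ are constants.
Step 3. Vehicle movement:

$$x_{n+1} = x_n + v_{n+1}\tau.$$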
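The stochastic step and the vehicle movement can be sketched as follows, based on the KKW formulation as it is commonly stated. The function names, the argument order, and the numerical values in the example are assumptions for illustration and are not the parameters used in this paper.

```python
def kkw_noise(v, r, p0, p, pa1, pa2, vp):
    """Speed disturbance: -1 (random slowing), +1 (random acceleration), or 0."""
    p_b = p0 if v == 0 else p     # slowing probability, larger for a stopped car
    p_a = pa1 if v < vp else pa2  # acceleration probability, speed dependent
    if r < p_b:
        return -1
    if r < p_b + p_a:
        return 1
    return 0


def kkw_update(x, v, v_det, v_safe, v_max, a, eta, tau=1.0):
    """Apply the disturbance to the Step 1 speed v_det, then move the vehicle."""
    v_new = max(0.0, min(v_det + a * tau * eta, v + a * tau, v_max, v_safe))
    return x + v_new * tau, v_new


# Illustrative probabilities and speeds (assumptions).
probs = dict(p0=0.4, p=0.05, pa1=0.2, pa2=0.05, vp=14.0)
print(kkw_noise(0.0, 0.1, **probs))   # -1: a stopped car hesitates (slow start)
print(kkw_noise(10.0, 0.1, **probs))  # 1: random acceleration
print(kkw_noise(20.0, 0.5, **probs))  # 0: no disturbance
print(kkw_update(0.0, 10.0, v_det=10.5, v_safe=30.0, v_max=30.0, a=0.5, eta=-1))
```

In a full simulation, r would be drawn independently for every vehicle and time step, for example with `random.random()`.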
3.3. The Improved CA Model
Based on the models above, a new CA model is proposed that considers driving styles and introduces two principles, the speed adaptation principle and the over-acceleration principle, which are modified and combined with the NaSch model.
Over-acceleration rule on a single lane [53]: in the current driving lane, when the distance between a car and the preceding car is within the synchronized flow distance and the speed of the preceding car is less than or equal to the car’s own speed, the following car accelerates with a certain probability in addition to adapting to the speed of the preceding car. This probability is related to the current speed and the difference from the synchronized flow speed. The over-acceleration rule was modified to suit the model proposed in this paper.
The over acceleration occurs in the lane when
where
r is a random number between 0 and 1,
is the preceding vehicle’s velocity,
is the same as defined in the previous section.
To describe the diversity of driving styles, the vehicle’s maximum speed and acceleration are subdivided according to driving type, with subscripts distinguishing the aggressive, moderate, and calm styles. Each style therefore has its own set of three vehicle parameters.
The evolution rules for the improved CA model are given here.
Rule (a): Classification of driving style characteristics
Rule (b): Determine whether the distance is within the synchronization distance.
If it is, follow rules (c) and (d) and skip rule (e).
Otherwise, skip rules (c) and (d) and follow rule (e).
Where, is the distance between the following vehicle and the preceding vehicle.
Rule (c): the improved speed adaptation rules
where,
is the slow start probability when the vehicle’s speed is 0,
is the random slowing probability, and
r is a random number between 0 and 1.
Rule (d): The over-acceleration rule within the synchronized flow distance is given in Formulas (14) and (15).
Rule (e): The process of acceleration
Rule (f): The process of deceleration
Rule (g): Improved random slowing rule
Rule (h): Update location
where,
is the vehicle position when the timestep is
n.