#### *2.1. Macroscopic Traffic Flow Model and Vehicle Velocity Prediction*

Macroscopic traffic flow models describe traffic dynamics mathematically. Given real-time input data, the model estimates the traffic state in areas that are not directly measured. The model is usually based on an empirical relationship whose parameters are either calibrated externally from historical data or estimated internally by the algorithm. The macroscopic traffic flow model method has been widely used in traffic state assessment.

References [16,17] describe in detail the current developments and trends of the macroscopic traffic flow models above, as well as traffic modeling, evaluation, and control methods based on intelligent networked vehicles. The macroscopic traffic flow model method has the following advantages. First, the model explains the mechanism of traffic, which supplements the observed data with additional information, so traffic conditions can be predicted accurately from less data. Second, it has higher interpretability: even if a prediction is not accurate, the reason may be found within a certain confidence interval. Third, it can be directly integrated with traffic control practices, for example through model predictive control.

The method also has the following shortcomings. First, inaccurate or uncalibrated models lead to poor traffic state assessment results. In practical applications, the model must therefore be carefully selected and calibrated, and checking the validity of a model or calibrating it requires a large data set [18]. Second, since the macroscopic traffic flow model cannot be adjusted adaptively, it cannot by itself reflect the changes in traffic flow caused by random traffic disturbances. Therefore, in the absence of real-time traffic information, it is more suitable for predicting traffic flow that does not change rapidly, such as traffic parameters on expressways and urban ring roads.

Traffic flow (simulation) models are used to characterize complex traffic flow systems in order to understand, describe, and predict traffic flow [19]. They are a basic tool for the analysis and experimental study of transportation systems. Traffic flow models are no longer used only in the traditional fields of traffic system design, testing, management, and personnel training. As intelligent vehicles and intelligent transportation systems have become an active research direction, these models have also been used to evaluate and predict the state of the transportation system [18]. Traffic flow models can be classified according to the level of detail with which they represent the traffic system, that is, according to the level of the traffic entities that each model considers. Reference [20] classifies traffic flow models as follows:


The microscopic traffic flow model describes not only the behavior of system entities (vehicles, drivers, etc.) in time and space, but also the interactions between them in detail. For example, the lane-change behavior of each vehicle in the flow is described as a series of driver decisions.

Similar to the microscopic model, the submicroscopic model describes the characteristics of a single vehicle in the traffic flow. However, in addition to the detailed description of the driving behavior, the control behaviors (transmission shifting, ESP, etc.) of the vehicle in response to the surrounding conditions are also modeled. Moreover, the sub-modules of the vehicle are also modeled by mathematical equations.

The mesoscopic traffic flow model neither identifies nor tracks individual vehicles, but it specifies the behavior of individual vehicles, for example in probabilistic terms. To this end, traffic is represented by small groups of traffic entities, and the model does not describe the behavior and interaction of these groups in detail. For example, the lane change of a single vehicle is described as an instantaneous event, and the decision to change lanes is based on parameters such as the relative lane density and the speed difference. Some mesoscopic models analogous to the kinetic theory of gases have been proposed. These gas-kinetic models describe the dynamics of velocity distributions.

The macroscopic traffic flow model describes the traffic flow at an aggregate level without identifying its constituent parts. For example, the traffic stream is represented by aggregate quantities such as flow, density, and speed. The behavior of individual vehicles (lane changes, etc.) is usually not represented explicitly. A macroscopic traffic flow model may assume that the traffic stream is appropriately allocated to the road lanes and use an approximation to that end. Macroscopic traffic flow models are usually classified by the number of partial differential equations that underlie the model on the one hand, and by the order of these equations on the other. Some typical traffic flow models and their extensions are described below [18].

#### 2.1.1. One-Dimensional Flow Model

The Lighthill-Whitham-Richards (LWR) model [21] is a well-known milestone among macroscopic traffic flow models. It applies a conservation law to the vehicles in the traffic flow and assumes that the traffic parameters (traffic speed and density) follow a fundamental diagram (FD) in equilibrium. This model can identify traffic congestion and distinguish between congested and free-flow traffic. Moreover, because its equations are relatively simple, it can be solved quickly. However, due to its limitations, the model cannot reproduce some more complicated phenomena, for example unstable flow and stop-and-go traffic.
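As an illustration of how the LWR conservation law can be solved numerically, the following minimal sketch advances cell densities with a Godunov-type demand/supply scheme. The Greenshields fundamental diagram, the function names, the boundary treatment, and all parameter values (`v_free`, `rho_jam`, `dx`, `dt`) are assumptions chosen for illustration; they are not taken from [21].

```python
import numpy as np

def greenshields_flow(rho, v_free, rho_jam):
    """Equilibrium flow q(rho) = rho * v_free * (1 - rho / rho_jam) (assumed FD)."""
    return rho * v_free * (1.0 - rho / rho_jam)

def godunov_step(rho, dt, dx, v_free, rho_jam):
    """Advance all cell densities by one time step of the conservation law."""
    rho_crit = 0.5 * rho_jam                       # density at maximum flow
    q_max = greenshields_flow(rho_crit, v_free, rho_jam)
    q = greenshields_flow(rho, v_free, rho_jam)
    # Demand: flow a cell can send downstream; supply: flow it can receive.
    demand = np.where(rho < rho_crit, q, q_max)
    supply = np.where(rho > rho_crit, q, q_max)
    # Flux across each interior cell interface.
    interior = np.minimum(demand[:-1], supply[1:])
    # Boundary fluxes equal the equilibrium flow of the boundary cells,
    # so a spatially uniform density stays unchanged.
    inflow = min(demand[0], supply[0])
    outflow = min(demand[-1], supply[-1])
    flux = np.concatenate(([inflow], interior, [outflow]))
    # Conservation of vehicles: density changes by the net flux into the cell.
    return rho - dt / dx * (flux[1:] - flux[:-1])

def simulate_lwr(rho0, steps, dt=1.0, dx=50.0, v_free=30.0, rho_jam=0.15):
    """Run the scheme for a number of steps (SI units: m, s, veh/m)."""
    assert v_free * dt <= dx, "CFL condition violated"
    rho = np.asarray(rho0, dtype=float).copy()
    for _ in range(steps):
        rho = godunov_step(rho, dt, dx, v_free, rho_jam)
    return rho
```

For example, starting from `np.concatenate([np.full(50, 0.03), np.full(50, 0.14)])` (light traffic upstream of a queue), the congested region should grow upstream over time, which is the congestion-identification behavior described above. Because the sketch uses a single equilibrium relation, it inherits the limitation noted in the text and does not produce stop-and-go waves.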

After the LWR model, higher-order models were developed in order to represent phenomena that deviate from the equilibrium traffic state (usually referred to as micro disturbances and non-equilibrium states). Instead of the FD, higher-order models usually use an additional momentum equation to describe the evolution of traffic speed. The Payne-Whitham (PW) model [22] is the first well-known higher-order modeling attempt. The PW model and its extension [23] successfully reproduce some well-known traffic phenomena, such as hysteresis and reduced capacity. However, it still has some shortcomings, such as the possibility of negative speeds [17].

The Aw-Rascle-Zhang (ARZ) model [24] is another higher-order model. It allows the equilibrium state of traffic to be transformed into other states, and it overcomes the limitations of the PW model. It can be derived from the LWR model. The ARZ model has been further developed and extended into the following models: the general second-order model [25], the phase change model [26], and the generalized ARZ (GARZ) model [27].
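For orientation, the two higher-order models can be written in the following commonly quoted forms. The notation is an assumption made here and is not taken from [22] or [24]: $\rho$ is density, $v$ is speed, $V_e(\rho)$ is the equilibrium speed from the FD, $\tau$ is a relaxation time, $c_0$ is a constant of the PW model, and $p(\rho)$ is the increasing "pressure" function of the ARZ model.

```latex
% Vehicle conservation, shared by the PW and ARZ models
\begin{align}
  \partial_t \rho + \partial_x(\rho v) &= 0 \\
% PW momentum equation: relaxation toward V_e plus an anticipation term
  \partial_t v + v\,\partial_x v &= \frac{V_e(\rho) - v}{\tau}
      - \frac{c_0^2}{\rho}\,\partial_x \rho \\
% ARZ momentum equation: the quantity v + p(rho) is transported with the vehicles
  \partial_t\bigl(v + p(\rho)\bigr) + v\,\partial_x\bigl(v + p(\rho)\bigr)
      &= \frac{V_e(\rho) - v}{\tau}
\end{align}
```

In the PW form, the density-gradient term acts in both directions, which is how information can travel faster than the vehicles and how negative speeds can arise. In the ARZ form, the quantity $v + p(\rho)$ moves with the vehicles at speed $v$, which avoids that behavior.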

Recently, the theory underlying traffic flow models has made new progress. Some models can be represented using Hamilton-Jacobi partial differential equations (HJ-PDE). HJ-PDE has been studied in detail in the fields of partial differential equations and physics [28], and its theory can be used to solve this kind of model effectively [29]. For example, the traffic density state variable ρ(t, x) of the traditional LWR model can be transformed into the cumulative flow state variable N(t, x). In addition, due to the nature of the cumulative flow, this model can implicitly express vehicle trajectories and travel times, so the relationship between the macroscopic and microscopic descriptions can be shown. Subsequently, a Lagrangian coordinate system was proposed that uses cumulative flow and vehicle trajectories in the macroscopic traffic flow model [30]. Similar topics have also been discussed for other traffic flow models [31].
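The change of variable from ρ(t, x) to N(t, x) can be sketched as follows. The symbols $q$ for flow and $Q(\rho)$ for the fundamental diagram are notation assumed here, and the sign convention (N counted as the number of vehicles that have passed position x by time t) is one common choice.

```latex
% Cumulative flow N(t,x): flow and density are its partial derivatives
\begin{align}
  q(t,x) &= \partial_t N(t,x), \qquad \rho(t,x) = -\,\partial_x N(t,x) \\
% Conservation holds automatically when the mixed derivatives of N commute
  \partial_t \rho + \partial_x q
      &= -\,\partial_t\partial_x N + \partial_x\partial_t N = 0 \\
% Substituting the FD q = Q(rho) gives a Hamilton-Jacobi equation for N
  \partial_t N - Q\bigl(-\,\partial_x N\bigr) &= 0
\end{align}
```

Under this convention, a level curve $N(t,x) = n$ is the trajectory of the $n$-th vehicle, which is why trajectories and travel times are expressed implicitly by the cumulative flow.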

#### 2.1.2. Multi-Lane Models, Multi-Class Models, and Random Models

Unlike the models introduced in Section 2.1.1, actual traffic often involves multiple lanes and multiple vehicle classes. The studies [32,33] proposed models that consider multi-lane and multi-class traffic behavior. For these extended models, modeling lane-change behavior is one of the main challenges, and human behavior plays an important role in lane changes.

To account for the randomness of traffic and the uncertainty of input/output data, a stochastic model [34] based on the LWR model has been proposed. In [35,36], the source of randomness in the dynamics is attributed to the heterogeneity of vehicles, which links multi-class models and random models. Moreover, the work [37] developed a stochastic LWR model based on HJ-PDE.

#### 2.1.3. Development Trend

The traffic flow model is no longer used only in the traditional fields of traffic system design, testing, management, and personnel training [20]. As intelligent vehicles and intelligent transportation systems have become an active research area, it is also used to evaluate and predict the state of the transportation system [18]. Its scope extends from ensuring the stability and optimizing the energy efficiency of a single vehicle to the safety and efficiency of the entire system.

The main principle of the prediction algorithm is to combine numerical simulation, a traffic model, real-time data, and historical data to predict the evolution of future traffic conditions. Designing a fast, scalable, and accurate road traffic forecasting tool is the key to overcoming the limited forecasting ability of existing traffic management information systems [18], and such a tool can be applied to the prediction and planning of vehicle paths and speeds. At present, there is no hybrid traffic flow model that combines the advantages of the macroscopic traffic flow model and the data-based traffic flow model. Such a hybrid model is of great significance for improving the accuracy, robustness, and real-time performance of prediction. In the future, traffic flow models will not be used in isolation but as part of multi-layered hybrid models, in which model-based traffic flow models and data-based learning models are used in a complementary, combined way.

#### *2.2. Data-Based Traffic Flow Model and Vehicle Velocity Prediction*

Traffic parameter prediction methods based on big data and machine learning have attracted the research interest of many scholars in recent years [38,39]. Traffic management departments predict traffic flow by collecting and analyzing current and historical traffic data. Big data analysis can effectively predict the occurrence of traffic accidents. It mainly addresses the following three problems: data storage, data analysis, and data management [40]. The collected traffic big data are used to train a prediction model with machine learning methods in order to analyze the evolution trend of traffic. Machine learning models can be divided into supervised learning, unsupervised learning, reinforcement learning, deep learning, and entity-based algorithms [40]. Supervised learning algorithms use labeled training data. Linear regression, decision trees, neural networks, and support vector machines are typical supervised learning methods.

Data-based traffic flow models can be divided into models based on historical data and models based on real-time big data. The traffic condition assessment methods that rely heavily on historical data use statistical or machine learning methods to find relationships within the historical data. Traffic conditions are then evaluated based on these relationships and real-time data, which means that the explicit prior modeling knowledge required by the macroscopic traffic flow model is not needed. This method usually requires a large amount of historical data.
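As a minimal illustration of the idea of learning a relationship from historical data and then applying it to recent measurements, the sketch below fits a linear autoregressive model by least squares. The choice of an autoregressive model, the function names `fit_ar` and `predict_next`, and the `order` parameter are assumptions made for this example; the surveyed methods are generally far more elaborate.

```python
import numpy as np

def fit_ar(history, order):
    """Fit x[t] = sum_k a[k] * x[t-order+k] + b by ordinary least squares.

    history: 1-D sequence of past traffic measurements (e.g. flow per interval).
    Returns the coefficient vector [a[0], ..., a[order-1], b].
    """
    history = np.asarray(history, dtype=float)
    # Each row holds `order` consecutive past values; the target is the next one.
    X = np.array([history[i:i + order] for i in range(len(history) - order)])
    y = history[order:]
    X = np.column_stack([X, np.ones(len(X))])      # intercept column
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_next(recent, coef):
    """Predict the next value from the `order` most recent measurements."""
    recent = np.asarray(recent, dtype=float)
    return float(recent @ coef[:-1] + coef[-1])
```

A typical use would be `coef = fit_ar(historical_flow, order=4)` followed by `predict_next(latest_four_values, coef)`. Because the coefficients only encode patterns present in `historical_flow`, the sketch also shows why such models may fail when traffic departs strongly from the training data.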

The traffic flow model based on historical data has the following advantage: less time is needed for model selection and calibration. The disadvantages are as follows. First, reliance on historical data means that the model may fail when unexpected events occur or when a relatively long-term trend is predicted. Second, the computational cost of training and learning is very high. Third, the method can be regarded as a "black box", which means that it cannot properly explain the model decisions [18]. Fourth, if the information from the real traffic flow differs greatly from that contained in the data used to train the model, the prediction accuracy cannot be guaranteed.

In contrast to the macroscopic traffic flow model and the traffic flow model based on historical data, the method based on real-time big data is defined as a method that does not rely on the empirical relationships that appear in the macroscopic traffic flow model but depends on real-time data streams. This means that the method relies less on prior knowledge of transportation. Its advantage is that it is robust to uncertain phenomena and unpredictable accidents. In an era of ubiquitous sensors (smartphones) and the emergence of a large number of intelligent networked vehicles, methods based on real-time big data streams may become popular in the future [41,42]. This method is more suitable for predicting traffic flow in urban road networks.

#### 2.2.1. Research Status of Traffic Flow Evolution Using Historical Traffic Data and Artificial Intelligence Methods

Using deep learning methods to predict changes in traffic parameters has become an active research direction, for example in [43–47]. Unsupervised incremental machine learning, deep learning, and deep reinforcement learning were adopted by Dinithi Nallaperuma et al. to build an expansive smart traffic management platform [43]. The platform can successfully model fluctuating traffic flow; however, the proposed method is not effective for predicting traffic flow with high-frequency fluctuations. One way to mitigate this problem is to increase the amount of data used to train the deep learning model. The restricted Boltzmann machine method was used to predict traffic in [44]. This method has a good nonlinear fitting ability and high prediction accuracy for typical chaotic time series. Di Zang et al. [45] addressed long-term traffic speed prediction for elevated highways by coupling convolutional long short-term memory and a convolutional neural network (CNN) in a single framework.

For the problem of uncertainty in the data used to train the model, a common solution is to combine fuzzy rules with deep learning [48] or neural networks [49]. In [48], a fuzzy representation was introduced into the deep learning model to lessen the impact of data uncertainty, and a deep convolutional network model was established to explore the spatiotemporal correlation of traffic flow and thereby improve traffic flow prediction. The experimental results show that combining deep learning with fuzzy theory can improve the prediction accuracy compared with other methods, such as autoregressive integrated moving average (ARIMA), a deep learning-based prediction model for spatial-temporal data, CNN, a fully convolutional neural network, and a fatigue detection convolutional network. In [49], the Takagi-Sugeno system was used for fuzzy reasoning, and two learning processes were proposed to update the membership functions of the fuzzy system. The proposed method has advantages over six traditional models, including the artificial neural network, the support vector machine, the ARIMA model, and the vector autoregressive model.

In addition to representing uncertainty with fuzzy rules, another approach is to identify exactly what the uncertainties are and then label them with contextual factors. In [50], the relationship between traffic flow values within a time interval is investigated based on a combination of contextual factors from historical data. According to the analysis results, the proposed method can improve forecasting accuracy. On the other hand, the design is slightly inferior to the conventional method due to inconsistent points. This can be attributed to the high degree of volatility associated with low-traffic-flow periods.

There are several ways to improve the prediction accuracy of deep learning methods: the training data can be screened [51,52] or the training parameters can be optimized [53]. In [52], a deep belief network (DBN) model and a kernel extreme learning machine classifier are combined into a prediction model, in which the important features of the traffic flow data are extracted by the DBN at the bottom of the network. To predict the traffic flow, the extracted features are then fed into the kernel extreme learning machine classifier. In [51], highly correlated spatiotemporal points are automatically selected to train the deep learning network, and the use of less correlated data is reduced. This explains the interaction between past and future data to some extent. Combining traffic flow theory and its application to urban transportation networks with more efficient deep learning architectures is a promising field of study [51]. This is a valuable research direction recommended by this paper.

#### 2.2.2. Research Status of Velocity Prediction Using Real-Time Traffic Data

A real-time prediction model of vehicle travel speed helps to improve vehicle safety, maneuverability, and fuel economy. To achieve these effects, an accurate velocity prediction model needs to be established and successfully implemented in the real system. There are two main types of vehicle speed prediction models: prediction models based on Markov chains and prediction models based on recurrent neural networks. Markov chains are stochastic data-driven models that predict future states from state transition matrices and current states. Transition probabilities are aggregated into state transition matrices. This structure, built from measured state transitions, is intuitive and easy to implement. There are three kinds of Markov chain models: interval coding, fuzzy coding, and velocity constraint models [54]. The speed prediction models based on the recurrent neural network (RNN) include the standard RNN, long short-term memory, and gated recurrent unit (GRU) models [54]. Among the three models based on Markov chains, the model combining the fuzzy coding method and the constraint model has the highest prediction accuracy. Of the three RNN-based models, GRU has the highest prediction accuracy because its structure is well suited to learning long-term dependencies by controlling how much previously determined state information is combined [54].
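The Markov chain idea with interval coding can be sketched as follows: speeds are mapped to discrete intervals, observed transitions between intervals are counted into a state transition matrix, and the matrix is applied to the current state to obtain an expected future speed. The function names, the bin edges, and the fallback for unvisited states are assumptions made for illustration and do not reproduce the specific models of [54].

```python
import numpy as np

def fit_transition_matrix(speeds, bin_edges):
    """Estimate the state transition matrix from a recorded speed trace.

    bin_edges: increasing edges defining len(bin_edges) - 1 speed intervals.
    """
    bin_edges = np.asarray(bin_edges, dtype=float)
    n = len(bin_edges) - 1
    # Interval coding: map each speed to the index of its interval (0 .. n-1).
    states = np.digitize(speeds, bin_edges[1:-1])
    counts = np.zeros((n, n))
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Normalize rows to probabilities; states never visited keep their state.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1), np.eye(n))

def predict_speed(current_speed, P, bin_edges, horizon=1):
    """Return the expected speed for each of the next `horizon` steps."""
    bin_edges = np.asarray(bin_edges, dtype=float)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    dist = np.zeros(len(centers))
    dist[np.digitize(current_speed, bin_edges[1:-1])] = 1.0
    predictions = []
    for _ in range(horizon):
        dist = dist @ P                      # propagate the state distribution
        predictions.append(float(dist @ centers))
    return predictions
```

The prediction here is the probability-weighted mean of the interval centers, which is one simple way to turn a state distribution into a speed value. Fuzzy coding would replace the hard interval assignment with membership degrees, at the cost of a slightly more involved counting step.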

#### 2.2.3. Development Trend

Further research is needed in the following aspects of future data-based vehicle speed prediction methods [15]:


#### *2.3. Influence of Vehicle Lateral Dynamic on Speed Prediction*

The influence of vehicle lateral dynamics on vehicle speed prediction is based on the traffic risk assessment made by drivers or intelligent vehicles. Traffic risk assessment models can be divided into two types: longitudinal and lateral models. The longitudinal traffic risk assessment model mainly evaluates vehicle collisions caused by untimely braking or insufficient braking force. It includes risk assessment models based on traffic accident data or simulation data [55,56] and model-based risk assessment models [57–59]. Model-based work often requires parameters and empirical assumptions, while data-based methods only focus on extracting the relationship between images and road safety without considering other influencing factors such as drivers; thus, they have certain limitations.

Lateral traffic risk assessment models often build lane change risk models based on road traffic environment and lane change path factors, such as building a highway exit lane change risk model based on a proportional advantage model [60]. The existing literature rarely considers the impact of the traffic environment and vehicle lateral handling characteristics [61]. How to dynamically and accurately analyze the impact of traffic conditions on the prediction of vehicle velocity change and intermittent road safety (vehicle lateral stability) is a huge challenge for road safety analysis in practical application [60].

In order to study the stability characteristics of different vehicles and generate their stability criteria (lateral traffic risk assessment model), it is necessary to study the structural characteristics of vehicles and their handling stability control methods. For example, distributed drive control (also known as torque vector control) enhances the vehicle's dynamic performance [62], so its traffic risk assessment should differ from that of ordinary front-wheel steering vehicles. In a related work, Kun Jiang et al. [63] presented a method to estimate and predict individual tire forces, which are closely related to the speed of the vehicle, based on a vehicle dynamics model and an observer with low-cost sensors and a driver assistance map. Yu-Chen Lin et al. [64] developed an ecological cruise control based on an adaptive prediction-based control strategy. The design simultaneously guarantees driving safety, riding comfort, and fuel efficiency on roads with curves and slopes. However, these two studies did not involve a personalized assessment of vehicle traffic risk, nor did they address the relationship between traffic risk and predicted speed.

Some studies combine vehicle lateral safety with longitudinal speed prediction. In [65], a three-degree-of-freedom vehicle lateral dynamics model with a lateral load transfer ratio index is derived for a rollover speed prediction model; similar approaches appear in [66,67]. Lin Li et al. [68] combined the energy management of HEVs with vehicle speed prediction and planning before entering a corner. Pan Song et al. [69] demonstrated an improved optimal speed adjustment method based on vehicle handling stability and path-following performance. Med Krid et al. [70] presented a model predictive control (MPC) strategy for an active anti-roll system, which aims to minimize the load transfer during cornering and the energy consumed by the actuators. Hongbin Ren et al. [71] formulated a quadratic optimization problem for the integrated control of longitudinal speed and lateral motion, based on maximizing longitudinal progression and minimizing lateral path tracking error.
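To make the link between a lateral load transfer ratio (LTR) limit and an admissible cornering speed concrete, the sketch below uses a quasi-static rigid-vehicle approximation. This is a deliberate simplification and not the three-degree-of-freedom model of [65]: it assumes steady-state cornering on a flat road, ignores suspension roll and tire compliance, and the names `cog_height`, `track_width`, and `ltr_max` as well as the numeric values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2

def load_transfer_ratio(speed, radius, cog_height, track_width):
    """Quasi-static LTR = 2 * h * a_y / (T * g) with a_y = v^2 / R.

    LTR = 0 means equal left/right wheel loads; LTR = 1 means wheel lift-off.
    """
    lateral_acc = speed ** 2 / radius
    return 2.0 * cog_height * lateral_acc / (track_width * G)

def max_cornering_speed(radius, cog_height, track_width, ltr_max=0.8):
    """Speed (m/s) at which the quasi-static LTR reaches ltr_max."""
    return math.sqrt(ltr_max * track_width * G * radius / (2.0 * cog_height))

if __name__ == "__main__":
    # Hypothetical vehicle: 0.7 m center-of-gravity height, 1.6 m track width.
    v_limit = max_cornering_speed(radius=100.0, cog_height=0.7, track_width=1.6)
    print(f"speed limit on a 100 m curve: {v_limit:.1f} m/s")
```

A speed planner could use such a bound as an upper limit on the predicted longitudinal speed before a curve. A dynamic model that includes roll motion, as in the works cited above, would generally yield a lower and more realistic limit than this static estimate.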
