Machine Learning-Based Lane-Changing Behavior Recognition and Information Credibility Discrimination

Chen, Xing; Yan, Song; Wang, Jingsheng; Zhang, Yi

doi:10.3390/sym16010058


¹ School of Traffic Management, People’s Public Security University of China, Beijing 100038, China

² School of Information Science and Technology, Tsinghua University, Beijing 100084, China

* Author to whom correspondence should be addressed.

Symmetry 2024, 16(1), 58; https://doi.org/10.3390/sym16010058

Submission received: 7 November 2023 / Revised: 11 December 2023 / Accepted: 27 December 2023 / Published: 1 January 2024

(This article belongs to the Section Engineering and Materials)


Abstract: Intelligent Vehicle–Infrastructure Collaboration Systems (i-VICS) place higher requirements on the real-time security of dynamic traffic information interaction, which is difficult to ensure by means of traditional static information security. In this study, a method of machine learning-based lane-changing (LC) behavior recognition and information credibility discrimination is proposed that exploits traffic business characteristics. The method consists of three stages: LC behavior recognition based on Support Vector Machine (SVM), LC speed prediction based on Recurrent Neural Network (RNN), and credibility discrimination of speed information under LC states. First, the labeling rules of vehicle LC behavior and the input/output of each stage model were determined, and the raw NGSIM data were processed to obtain data sets for LC behavior identification and LC speed prediction. The SVM classification model and the RNN prediction model were then trained and tested. Afterwards, a credibility discrimination model for speed information under the LC state was constructed, and real vehicle speed data were processed for model verification. The results showed that the overall accuracy of vehicle status recognition by the SVM model was 99.18%, and the precision of the RNN model was on the order of cm/s. Considering abnormal transverse and longitudinal velocities, the accuracy of credibility discrimination of LC velocity was more than 97% in most experimental groups. The model can effectively identify abnormal speed data of LC vehicles and provide support for the real-time identification of LC vehicle speed information under i-VICS.

Keywords: intelligent transportation; traffic business characteristics; lane-changing behavior recognition; speed prediction; information credibility discrimination

1. Introduction

With the development of science and technology, the Vehicular Ad Hoc Network (VANET) and the intelligent vehicle-road collaborative system [1] have greatly affected intelligent transportation systems. The former connects vehicles to one another through wireless communication devices such as on-board equipment and shares real-time information across the network, while the latter realizes real-time information sharing and collaborative decision making through communication and data interaction between vehicles and road infrastructure. VANET involves massive and important data transmission and information exchange between vehicles and roadside infrastructure, and among vehicles. Compared with non-connected vehicles, connected vehicles (CVs) require more frequent information interaction and are highly dependent on external information. Connected vehicles therefore face more potential threats from malicious attackers who attack the network and steal, tamper with, or falsify data [2]. According to Upstream’s Global Automotive Cybersecurity 2023 report, cyberattacks have caused more than USD 500 billion in losses to the global automotive industry over the past five years, with remote cyberattacks accounting for about 70% of vehicle security threats. Kim et al. [3] pointed out that network attacks on autonomous vehicles can be divided into three categories, targeting the automatic control system, the components of the autonomous driving system, and vehicle-connected communication, and that anomaly detection is the defense against vehicle-connected communication attacks. Highly trusted data interaction is a key guarantee for intelligent transportation systems, and abnormal data detection and credibility discrimination are important means of ensuring data reliability.

There have been many studies on how to solve these problems by means of communication and computer system security technology. Yao et al. [4] proposed a weight-based dynamic entity-centric trust model that is simple enough to realize fast trust evaluation for the data in VANETs and helps vehicles to detect false or forged data. Azees et al. [5] proposed an efficient anonymous authentication scheme to prevent malicious vehicles from entering the VANET, and designed a condition tracking mechanism applicable to vehicles and roadside units (RSUs) to improve the efficiency of the VANET system while maintaining privacy. El-Rewini et al. [6] proposed a layered framework for traditional vehicle information security threats, consisting of sensing, communication, and control layers, to investigate attacks and threats related to the communication layer and propose corresponding countermeasures. To avoid the privacy disclosure of requesting users due to the cracking of anonymous servers, Zhou et al. [7] proposed a group signature location privacy protection scheme with backward irrelevance, which has higher security and lower computing cost. Zheng et al. [8] proposed a vehicle identity authentication protocol based on a lightweight group signature, which can authenticate vehicles anonymously in a fast and efficient way, aiming to solve the problem of tracking attacks by illegal members. Yang et al. [9] proposed an identity authentication scheme based on vehicle behavior prediction for software-defined Internet of Vehicles (IoV) within the Mobile Edge Computing (MEC) framework, to address the problems of cryptography-based authentication schemes in IoV. Yang et al. [10] proposed introducing the idea of edge computing (EC) into VANETs and using idle nodes’ resources to assist RSUs in quickly authenticating messages.

However, the above information security means still have some limitations, and there is a lack of effective protection mechanisms in terms of continuous trusted identity authentication and traffic business characteristics verification of data. Therefore, some scholars have proposed a credibility discrimination method based on traffic business characteristics. This method is mainly divided into two stages: one is the extraction of traffic business features (including traffic physical boundary, vehicle motion state and driver’s driving behavior [11]) and the other is the credibility discrimination of the extracted features.

The excellent performance of rapidly developing machine learning technology in intelligent transportation systems has attracted wide attention from researchers [12]. To solve the problems of recognition and prediction, domestic and foreign scholars have applied machine learning to the extraction of traffic business characteristics [13,14] and carried out the following studies. Using speed, acceleration, lateral offset, space headway, speed difference and time headway as indexes, Ji et al. [15] divided the driving behaviors of minibuses into car-following, LC and overtaking. Chen et al. [16] divided the LC process of vehicles into the car-following (CF) stage, LC preparatory stage and LC execution stage based on a multi-classification support vector machine. Xie et al. [17] modeled the LC process, composed of LC decisions (LCD) and LC implementation (LCI), based on a deep belief network (DBN) and LSTM. Huang et al. [18] proposed an LSTM neural network (NN)-based CF model that considers asymmetric driving behavior to predict vehicle speed. Considering the effects of LC by vehicles in the adjacent lane, Zhao et al. [19] proposed a two-lane multi-speed difference following (FS-MAVD) model, and constructed a convolutional bidirectional LSTM network combined with a temporal attention mechanism to predict acceleration. Cai et al. [20] proposed the SLSTMAT (Social-LSTM-attention) algorithm, which introduced social characteristics of target vehicles and extracted them through convolutional neural networks to establish a vehicle behavior recognition model based on deep learning. Zhao et al. [21] designed a driving intention recognition and vehicle trajectory prediction model based on a graph neural network and Gated Recurrent Unit. The results showed that the proposed model can better identify the driving intention of vehicles. Huang et al. [22] proposed an LC intention recognition method based on an Attention-BiLSTM network. Compared with the LSTM model, the accuracy and F1 score of the proposed Attention-BiLSTM model increased by 13.2% and 10.5%, respectively.

Domestic and foreign scholars have carried out the following studies on characteristics-based credibility discrimination. Feng et al. [23] systematically studied the network security of traffic signal control systems in the CV environment, analyzed the potential threats to such systems, proposed a network security analysis framework, and completed network attack and defense experiments on a security test platform. Iqbal et al. [24] provided a data set based on machine learning methods for training and evaluating malicious threat detection against Connected and Autonomous Vehicles (CAVs), and proposed a simulation-based method of modeling network security attacks against VANET to improve the network security of CAVs. Steven et al. [25] proposed a system framework to detect and classify misbehavior in VANET, using plausibility checks as the feature vectors of the machine learning model, and the K-nearest neighbor algorithm and SVM to improve the overall detection accuracy. Huang et al. [26] proposed a data-driven method to identify falsified trajectories generated by compromised CVs: a trajectory embedding model produces vector representations, the similarity distance between trajectories is computed from these representations, and hierarchical clustering is used to identify anomalous trajectories. Shangguan et al. [27] designed a vehicle infrastructure cooperative credible interaction framework, and constructed a model of vehicle behavior state deduction and one path perturbation factor quantification. Wu et al. [28] proposed an extended LC model (ELC) which can model CAVs’ LC behaviors under cyberattacks; simulations were conducted to illustrate the impact of different malicious attacks on vehicles’ LC movements. Shi et al. [11] built a credibility discrimination model for CF behavior based on SVM and an LSTM neural network, which was trained and verified with NGSIM data sets, and the correct discrimination rate on normal and abnormal data sets reached more than 97%.

At present, many studies have addressed traffic information credibility discrimination based on traffic business characteristics, but they do not fully cover the traffic scenarios that arise in practical applications, especially micro scenarios. Scholars have focused more on discrimination methods for autonomous driving fleets, traffic control systems, and similar objects, and less on discriminating the driving information of a single vehicle. Shi et al. [11] proposed a credibility discrimination method for traffic information in the CF scenario. Compared with the CF scenario, the LC scenario targeted in this paper involves more complex traffic factors, is more dangerous [29], and has features that are more difficult to extract. The CF state studied in [11] only needs to consider the longitudinal speed of the vehicle, while the LC state also needs to consider the transverse speed, so our model is more complex. In addition, because the hyperparameters are optimized, the accuracy of our velocity model is higher.

Based on the above analysis, several research gaps can be identified as follows:

(1): There are many parameters and large dimensions in the process of lane change, and the data are difficult to process;
(2): Credibility discrimination models and methods based on traffic business characteristics under LC scenarios have not been studied;
(3): The integrality of steps such as vehicle state recognition and speed prediction is insufficient, and the input and output of each stage model are inconsistent.

To fill up the above research gaps, in this paper, a method of information recognition and credibility discrimination of LC behavior based on machine learning is proposed. The main contributions of this paper are as follows:

(1): To recognize LC behavior state and solve the problem of multi-dimensional parameter continuous time series sampling, an eigenvector dimensionality reduction method is proposed;
(2): A credibility discrimination method is proposed based on traffic business characteristics, including identification, prediction, and discrimination under LC scenarios. Both the credibility discrimination process and evaluation rules are designed;
(3): In the prediction model, a parameter consistency processing method is proposed to deal with the complex input parameters of different models. The input matrix is formed by the feature vector in the state recognition according to the time dimension.

The structure of the full text is as follows. Section 2 describes the vehicle LC scenario and related parameters and analyzes the key problems in the process of implementing the model. Section 3 introduces the input and output of each algorithm of the model. In Section 4, the model training, testing, and verification processes are presented. In Section 5, the research conclusion is summarized.

2. Scenario Description and Problem Analysis

2.1. Scenario and Parameter Description

The state of the vehicle at any time in the process of driving on the road can be expressed by its own transverse and longitudinal speed, the relative longitudinal displacement and longitudinal speed difference between the vehicle and the front car, the left front car, the left rear car, the right front car, the right rear car, and other traffic business characteristics. As shown in Figure 1, in the process of LC, parameters, including the distance between the vehicle and the vehicle in front, as well as the relative distance and relative speed between the vehicle and the vehicle before and after the target lane of LC, are mainly considered.

2.2. Problem Analysis

2.2.1. The Architectural Design of the Model Computation Process

The model computation combines a recognition algorithm, a prediction algorithm, and a discrimination calculation through a fixed logic and set of rules. The model first determines whether the vehicle is in the LC state using the recognition algorithm. If the identification result is LC, the prediction algorithm predicts the speed from the historical vehicle operating parameters. A credibility discrimination rule then calculates the error between the discriminative speed data (the speed data to be checked) and the predicted data, and judges whether the discriminative speed data are reliable.

2.2.2. Vehicle LC Behavior Recognition Algorithm Based on SVM

In this paper, the non-lane-changing state of the vehicle is marked as 0 and the LC state is marked as 1. Appropriate feature parameters are selected as input vectors, and the SVM binary classifier is trained to determine whether the vehicle is in the state of LC. The behavior recognition algorithm can be expressed as follows:

$$k = f_{1}(S(t)) \tag{1}$$

where $k \in \{0, 1\}$ represents the vehicle state; $S(t)$ represents the input vector of the vehicle state parameters at time $t$; and $f_{1}(\cdot)$ represents the mapping relationship between the input and output variables of the recognition algorithm.

2.2.3. Vehicle LC Speed Prediction Algorithm Based on RNN

After the LC vehicle is identified by the SVM, the vectors $S(t)$ representing the vehicle parameters over a certain time series are selected and assembled into a matrix, $X$, which is input into the RNN. The longitudinal speed, $v_{p,x}$, and transverse speed, $v_{p,y}$, of the vehicle at the next moment are predicted:

$$v_{p,x}(t),\ v_{p,y}(t) = f_{2}(X) \tag{2}$$

where $f_{2}(\cdot)$ represents the mapping relationship between the input and output variables of the prediction algorithm; and $v_{p,x}(t)$ and $v_{p,y}(t)$ represent the predicted longitudinal and transverse velocities output for the same time step. The subscripts follow Equation (9), in which $x$ denotes the longitudinal direction and $y$ the transverse direction.

2.2.4. Credibility Discrimination Calculation

In this section, we calculate the error between the predicted speeds, $v_{p,x}(t)$ and $v_{p,y}(t)$, and the discriminative speeds (i.e., the speed values received by the system through communication, not the true speed values of the vehicle), $v_{s,x}(t)$ and $v_{s,y}(t)$, at time $t$, and compare the prediction error with the preset error threshold. If the threshold is exceeded, the discriminative speed is judged not credible. Therefore, it is necessary to establish a credibility discrimination algorithm, $f_{3}(\cdot)$:

$$e_{1}(t),\ e_{2}(t) = f_{3}(v_{p,x}(t),\ v_{s,x}(t),\ v_{p,y}(t),\ v_{s,y}(t)) \tag{3}$$

where $e_{1}(t)$ and $e_{2}(t)$ represent the errors of the longitudinal ($x$) and transverse ($y$) velocities, respectively.

3. Traffic Information Credibility Discrimination Based on Business Characteristics

3.1. Model Calculation Flow

The credibility discrimination model of vehicle LC is shown in Figure 2. The model process can be divided into a preparatory stage and a discriminative stage. In the preparatory stage, the model identifies whether the vehicle is in the LC state; in the discriminative stage, the vehicle speed is predicted and the credibility of the received speed is judged. In the preparatory stage, the vehicle status is identified once every 0.5 s by the vehicle LC behavior recognition algorithm. To eliminate the interference of accidental factors, the vehicle is considered to be in the LC state only when its status is identified as LC three consecutive times; otherwise, it is considered to be in some other driving state.

After recognizing that the vehicle is in the LC state, the model enters the discriminative stage. With a time step of 0.1 s, the transverse and longitudinal speeds of the vehicle over 5 consecutive time steps are predicted by the vehicle LC speed prediction algorithm. After the prediction is completed, the recognition algorithm is applied again to confirm that the vehicle remained in the LC state throughout the 5 predicted time steps. If this is confirmed, the model proceeds to the credibility discrimination calculation.
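The three-consecutive-identification rule of the preparatory stage can be sketched as a small state holder. This is a minimal illustration, not the authors' implementation: the class name `LaneChangeConfirmer` and its interface are hypothetical, and only the "three consecutive LC labels, one label every 0.5 s" rule comes from the text.

```python
from collections import deque


class LaneChangeConfirmer:
    """Confirm the LC state only after several consecutive LC labels.

    Hypothetical helper: `required=3` follows the rule in the text,
    everything else is an illustrative assumption.
    """

    def __init__(self, required: int = 3):
        self.required = required
        # Only the most recent `required` labels are kept.
        self.recent = deque(maxlen=required)

    def update(self, label: int) -> bool:
        """Feed one classifier label (1 = LC, 0 = non-LC), issued every 0.5 s."""
        self.recent.append(label)
        return len(self.recent) == self.required and all(x == 1 for x in self.recent)


confirmer = LaneChangeConfirmer()
# A single 0 in the stream resets the count of consecutive LC labels.
states = [confirmer.update(k) for k in [1, 1, 0, 1, 1, 1]]
```

In this toy stream the LC state is confirmed only at the last label, because the isolated 0 interrupts the first run of LC identifications.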

In the credibility discrimination calculation model, $F(t)$, several indicators are calculated and compared with the preset error thresholds for model verification: the average cumulative relative error, $E_{r}(t)$, between the $v_{p,x}(t)$ sequence and the $v_{s,x}(t)$ sequence, and the average cumulative absolute error, $E_{a}(t)$, between the $v_{p,y}(t)$ sequence and the $v_{s,y}(t)$ sequence within a certain period. When the errors are within the preset thresholds, the vehicle speed is judged reliable and $F(t)$ outputs 1. Otherwise, $F(t)$ is 0.

3.2. LC Behavior State Recognition Based on SVM

Support Vector Machines (SVMs), first proposed by Cortes and Vapnik in 1995 [30,31], show unique advantages in solving linear, non-linear and high-dimensional pattern recognition problems. The core idea of SVM classification is to construct a hyperplane that separates samples of different categories while maximizing the margin, that is, the distance from the hyperplane to the nearest samples of each class, with the hyperplane equidistant from the nearest samples of the two classes. This yields the optimal classification of linearly separable samples.

For the binary classification problem, it is assumed that the training sample set can be expressed as follows:

$$Q = \{(x_{1}, y_{1}), (x_{2}, y_{2}), \dots, (x_{n}, y_{n})\},\quad x_{i} \in \mathbb{R}^{n},\ y_{i} \in \{-1, 1\} \tag{4}$$

where $i = 1, 2, 3, \dots, n$, $x_{i}$ represents the feature vector and $y_{i}$ represents the class to which the feature vector belongs. The hyperplane equation that distinguishes the two types of samples is as follows:

$$f(x) = w^{T} x + b \tag{5}$$

where $w$ represents the normal vector of the hyperplane and $b$ stands for the intercept. The geometric diagram is shown in Figure 3.

A slack variable, $\xi_{i}$, is introduced to turn the classification problem into an optimization problem and find the optimal hyperplane:

$$\begin{cases} \min\ \frac{1}{2} w^{T} w + C \sum_{i=1}^{n} \xi_{i} \\ \text{s.t.}\ \ y_{i}(w^{T} x_{i} + b) \geq 1 - \xi_{i}, \\ \xi_{i} \geq 0,\ i = 1, 2, 3, \dots, n \end{cases} \tag{6}$$

where $C > 0$ is the penalty parameter. The above formula is converted to the dual problem by the Lagrange function, and the kernel function is introduced:

$$\begin{cases} \max\ -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_{i} \alpha_{j} y_{i} y_{j} K(x_{i}, x_{j}) + \sum_{i=1}^{n} \alpha_{i} \\ \text{s.t.}\ \ \sum_{i=1}^{n} \alpha_{i} y_{i} = 0,\ 0 \leq \alpha_{i} \leq C \end{cases} \tag{7}$$

where $\alpha_{i}$ and $\alpha_{j}$ are Lagrange multipliers and $K(x_{i}, x_{j})$ is the kernel function. Common kernel functions of SVM include the linear kernel, polynomial kernel, Gaussian kernel and sigmoid kernel. Because the Gaussian kernel function performs a nonlinear mapping from the original feature space to a higher dimensional feature space, it deals better with problems that are not linearly separable. Therefore, the Gaussian kernel function is chosen as the kernel function of the SVM:

$$K(x_{i}, x_{j}) = \exp(-\gamma \| x_{i} - x_{j} \|^{2}) \tag{8}$$

where $\gamma$ is the hyperparameter of the Gaussian kernel function.
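As a small numerical illustration of the Gaussian kernel in Equation (8), the sketch below evaluates it for a few toy vectors. The function name `gaussian_kernel` and the sample values are assumptions made for illustration; only the formula itself comes from the text.

```python
import math


def gaussian_kernel(x_i, x_j, gamma):
    """Gaussian (RBF) kernel of Eq. (8): exp(-gamma * ||x_i - x_j||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


# Toy 2-dimensional vectors (illustrative values, not NGSIM data).
same = gaussian_kernel([0.2, 0.5], [0.2, 0.5], gamma=10.0)
near = gaussian_kernel([0.2, 0.5], [0.3, 0.5], gamma=10.0)
far = gaussian_kernel([0.2, 0.5], [0.9, 0.1], gamma=10.0)
```

The kernel equals 1 for identical vectors and decays toward 0 as the vectors move apart; a larger $\gamma$ makes this decay faster, which is why $\gamma$ has to be tuned together with the penalty parameter.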

SVM has been widely used in intelligent transportation. Currently, researchers have applied SVM to LC behavior recognition [32,33], overtaking intention recognition [34], LC behavior prediction [35,36], traffic accident severity analysis [37,38], and distracted driving state recognition [39,40]. Zhang et al. [33] processed and analyzed the LC data of drivers under natural driving conditions, defined as the LC behavior, and smoothed the data in the execution phase of LC. Wen et al. [39] proposed a distracted driving state discrimination model, which used a genetic algorithm and SVM to study the recognition effect of vehicle transverse control indicators on distracted driving state discrimination.

In this paper, the vehicle driving state was identified based on the SVM model. According to Section 2.1, the state of the vehicle at time $t$ could be represented by the feature vector, $s_{1}(t)$:

$$s_{1}(t) = \begin{bmatrix} v_{x}(t), & v_{y}(t), & \Delta v_{x}(t), & \Delta x(t), \\ \Delta v_{x}^{L1}(t), & \Delta x^{L1}(t), & \Delta v_{x}^{L2}(t), & \Delta x^{L2}(t), \\ \Delta v_{x}^{R1}(t), & \Delta x^{R1}(t), & \Delta v_{x}^{R2}(t), & \Delta x^{R2}(t) \end{bmatrix} \tag{9}$$

where $v_{x}(t)$ and $v_{y}(t)$ represent the longitudinal speed and transverse speed of the vehicle at time $t$, respectively; $\Delta v_{x}(t)$ and $\Delta x(t)$ represent the longitudinal speed difference and longitudinal relative displacement between the vehicle and the vehicle in front in the same lane at time $t$, respectively; $\Delta v_{x}^{L1}(t)$, $\Delta v_{x}^{L2}(t)$, $\Delta v_{x}^{R1}(t)$, and $\Delta v_{x}^{R2}(t)$ represent the longitudinal speed differences between the vehicle and the left front car, the left rear car, the right front car, and the right rear car at time $t$, respectively; and $\Delta x^{L1}(t)$, $\Delta x^{L2}(t)$, $\Delta x^{R1}(t)$, and $\Delta x^{R2}(t)$ represent the longitudinal relative displacements between the vehicle and the left front car, the left rear car, the right front car, and the right rear car at time $t$, respectively.

To make the sample input reflect the change law within a certain time range, and to reduce the chance that interference acting on the vehicle at time $t$ causes a classifier identification error, the three 12-dimensional feature vectors at times $t - 0.4$, $t - 0.2$ and $t$ were spliced horizontally into a 36-dimensional feature vector, $S(t)$:

$$S(t) = [s_{1}(t - 0.4),\ s_{1}(t - 0.2),\ s_{1}(t)] \tag{10}$$
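The splicing in Equation (10) can be sketched as follows. This is a minimal sketch assuming the 12-dimensional vectors are stored in a list sampled every 0.1 s, so that 0.2 s corresponds to a stride of two frames; the function name `build_svm_input` and its parameters are hypothetical.

```python
def build_svm_input(history, t_index, stride=2, n_frames=3):
    """Concatenate s1 at t-0.4 s, t-0.2 s and t into one 36-element vector.

    history : list of 12-element feature vectors, one per 0.1 s frame (assumed)
    t_index : index of the current frame in `history`
    stride  : frames between stacked vectors (2 frames = 0.2 s at 10 Hz)
    """
    first = t_index - stride * (n_frames - 1)
    if first < 0:
        raise ValueError("not enough history before t_index")
    stacked = []
    for k in range(n_frames):
        # Oldest frame first, current frame last, as in Eq. (10).
        stacked.extend(history[first + k * stride])
    return stacked


# Toy history: frame i holds twelve copies of the value i.
history = [[float(i)] * 12 for i in range(6)]
S_t = build_svm_input(history, t_index=5)
```

With `t_index=5` the sketch stacks frames 1, 3 and 5, which correspond to $t - 0.4$, $t - 0.2$ and $t$ under the assumed 10 Hz sampling.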

(10)

3.3. LC Speed Prediction Based on RNN

Recurrent Neural Network (RNN) is a type of neural network that has the function of memory. Different from the traditional feedforward neural network, when processing sequence data, RNN will take the input of the current moment and the state of the previous moment as the input of the current moment, so that the network has the memory function. The structure of RNN can be divided into three parts: input layer, hidden layer and output layer. The input layer is responsible for receiving sequence data, the hidden layer is responsible for the nonlinear transformation of the data, and the output layer is responsible for output results.

The RNN obtains the hidden state of the current time step by applying a linear transformation to the input data of the current time step and the hidden state of the previous time step, followed by a nonlinear mapping through the activation function. The principle can be expressed by the following formula:

$$h_{t} = f(W_{xx} x_{t} + W_{hh} h_{t-1}) \tag{11}$$

where $h_{t}$ represents the hidden state of the current time step, $x_{t}$ represents the input data of the current time step, $h_{t-1}$ represents the hidden state of the previous time step, $W_{xx}$ and $W_{hh}$ are the weight matrices of the input data and hidden state, respectively, and $f(\cdot)$ represents the activation function. Next, the output data are obtained by multiplying the hidden state by the weight matrix, $W_{hy}$, adding the bias term, $b_{y}$, and then applying the activation function $g(\cdot)$. This process can be expressed as follows:

$$y_{t} = g(W_{hy} h_{t} + b_{y}) \tag{12}$$

where $y_{t}$ represents the output data of the current time step, $W_{hy}$ is the weight matrix between the hidden state and the output data, and $b_{y}$ is the bias term.
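Equations (11) and (12) can be traced by hand with a tiny forward pass. The sketch below is only illustrative: it assumes $f = \tanh$ and an identity $g$, which the text does not specify, and the weights are toy values rather than trained parameters.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rnn_step(x_t, h_prev, W_xx, W_hh):
    """Eq. (11) with f assumed to be tanh: h_t = tanh(W_xx x_t + W_hh h_{t-1})."""
    pre = [a + b for a, b in zip(matvec(W_xx, x_t), matvec(W_hh, h_prev))]
    return [math.tanh(p) for p in pre]


def rnn_output(h_t, W_hy, b_y):
    """Eq. (12) with g assumed to be the identity: y_t = W_hy h_t + b_y."""
    return [a + b for a, b in zip(matvec(W_hy, h_t), b_y)]


# Toy weights: 2 inputs, 2 hidden units, 1 output (illustrative values).
W_xx = [[0.5, 0.0], [0.0, 0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
W_hy = [[1.0, -1.0]]
b_y = [0.0]

h = [0.0, 0.0]  # initial hidden state
for x in [[1.0, 0.0], [0.0, 1.0]]:
    # The hidden state carries information from earlier inputs forward.
    h = rnn_step(x, h, W_xx, W_hh)
y = rnn_output(h, W_hy, b_y)
```

After the second step, the first hidden unit is non-zero only because of the recurrent term $W_{hh} h_{t-1}$, which is the memory effect described above.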

RNN has been used in intelligent transportation for abnormal trajectory recognition [41,42], short-term traffic flow prediction [43,44], and traffic event prediction [45]. Li et al. [42] proposed an unsupervised anomaly detection method based on trajectory reconstruction errors. By minimizing the difference between the reconstructed output and the original input, the model learns the motion characteristics of the normal trajectory and detects the abnormal traffic trajectory. An et al. [44] used real data such as online car booking orders as data sources to predict the order demand of online car booking at a certain time and place in the future by using recurrent neural networks.

In this paper, a prediction model of vehicle LC speed is established based on RNN. During LC, in addition to the distance and speed relative to the vehicle in front, the driver should consider the relative distance and speed difference with respect to the vehicles in the target lane. The feature vector composed of the parameters affecting the LC speed at time $t$ is $s_{2}(t)$. The $s_{2}(t)$ sequence over a preceding period forms the input matrix, $X$, of the RNN, which predicts the transverse and longitudinal speeds of the vehicle at the next time step:

$$s_{2}(t) = [v_{x}(t),\ v_{y}(t),\ \Delta v_{x}(t),\ \Delta v_{x1}(t),\ \Delta v_{x2}(t),\ \Delta x(t),\ \Delta x_{1}(t),\ \Delta x_{2}(t)] \tag{13}$$

$$v_{p,x}(t),\ v_{p,y}(t) = f_{2}([s_{2}(t - T),\ s_{2}(t - T + \Delta t),\ \dots,\ s_{2}(t - \Delta t),\ s_{2}(t)]) \tag{14}$$

where $T$ is the time length of the selected sequence; $\Delta t$ is the sampling interval; $\Delta v_{x1}(t)$ and $\Delta v_{x2}(t)$ represent the longitudinal speed differences between the vehicle and the front and rear vehicles in the target lane at time $t$, respectively; and $\Delta x_{1}(t)$ and $\Delta x_{2}(t)$ represent the relative longitudinal displacements between the vehicle and the front and rear vehicles in the target lane at time $t$, respectively. Because the LC time is short, $T = 1.5$ s was selected to keep the speed prediction timely. The NGSIM data set was used for verification, and the sampling interval was 0.1 s, that is, $\Delta t = 0.1$ s. The three feature vectors were taken as inputs, and the transverse and longitudinal vehicle speeds predicted by the RNN were obtained at the output layer. The structure is shown in Figure 4.
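One way to read Equation (14) is as a sliding window over the stored $s_{2}$ vectors. The sketch below takes the equation literally, so that $T = 1.5$ s and $\Delta t = 0.1$ s give 16 rows from $s_{2}(t - T)$ to $s_{2}(t)$; this is an interpretation, the authors' exact input layout (Figure 4) is not reproduced here, and the function name `build_rnn_input` is hypothetical.

```python
def build_rnn_input(s2_history, t_index, T=1.5, dt=0.1):
    """Return the rows s2(t-T), ..., s2(t-dt), s2(t) of Eq. (14).

    s2_history : list of 8-element feature vectors, one per `dt` seconds (assumed)
    t_index    : index of the current frame in `s2_history`
    """
    n_steps = round(T / dt) + 1  # both end points of the window are included
    first = t_index - (n_steps - 1)
    if first < 0:
        raise ValueError("not enough history before t_index")
    return [list(row) for row in s2_history[first:t_index + 1]]


# Toy history: frame i holds eight copies of the value i.
s2_history = [[float(i)] * 8 for i in range(20)]
X = build_rnn_input(s2_history, t_index=19)
```

Each row of `X` is one time step and each column one of the eight features of Equation (13), which is the (time steps × features) layout that recurrent layers commonly expect.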

3.4. Credibility Discrimination Calculation

There are two main indexes in the evaluation method of credibility discrimination, namely, the average cumulative relative error, $E_{r}(t)$, and the average cumulative absolute error, $E_{a}(t)$:

$$E_{r}(t) = \frac{1}{N} \sum_{i=0}^{N} \frac{| v_{p,x}(t - i \Delta t) - v_{s,x}(t - i \Delta t) |}{v_{s,x}(t - i \Delta t)} \tag{15}$$

$$E_{a}(t) = \frac{1}{N} \sum_{i=0}^{N} | v_{p,y}(t - i \Delta t) - v_{s,y}(t - i \Delta t) | \tag{16}$$

In the formulas, $v_{s,x}$ and $v_{s,y}$ are the longitudinal and transverse velocities of the vehicle to be discriminated, and $N$ is the number of discriminative collection points, with $N = 5$. The credibility discrimination calculation method is based on the identification result of the vehicle state. When the vehicle is in the state of LC, the transverse and longitudinal velocities must be considered at the same time. In the process of LC, the longitudinal speed changes within $\Delta t = 0.1$ s are relatively stable, so the average cumulative relative error is selected as the evaluation index for the longitudinal speed. Because LC is divided into several processes, the transverse speed of the vehicle fluctuates greatly over the whole LC process. When the vehicle reaches the lane boundary, the transverse speed is larger, and when the vehicle is about to end the LC, the transverse speed gradually approaches 0, so the average cumulative absolute error is selected for the transverse speed. Based on the above rules, the identification expression is as follows:

$$F(t) = \begin{cases} 1, & E_{r}(t) < \varepsilon_{1} \text{ and } E_{a}(t) < \varepsilon_{2} \\ 0, & \text{otherwise} \end{cases} \tag{17}$$

The values of $\varepsilon_{1}$ and $\varepsilon_{2}$ should separate measured (normal) speeds from abnormal speeds as reliably as possible. After testing, $\varepsilon_{1} = 2\%$ and $\varepsilon_{2} = 0.05$ m/s were chosen; that is, $F(t) = 1$ only if $E_{r}(t) < 2\%$ and $E_{a}(t) < 0.05$ m/s.
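The decision rule of Equations (15) to (17) can be sketched in a few lines. This is a hedged illustration rather than the authors' code: the function name `credibility` is hypothetical, the averages are taken over exactly the $N$ samples passed in (the summation limits printed in Equations (15) and (16) are not reproduced literally), and the received $x$ speeds are assumed to be non-zero.

```python
def credibility(v_px, v_sx, v_py, v_sy, eps_rel=0.02, eps_abs=0.05):
    """Return (F, E_r, E_a) for paired predicted / received speed samples.

    v_px, v_sx : predicted and received speeds used with the relative error
    v_py, v_sy : predicted and received speeds used with the absolute error (m/s)
    eps_rel, eps_abs : thresholds 2 % and 0.05 m/s taken from the text
    """
    n = len(v_px)
    if n == 0 or not (len(v_sx) == len(v_py) == len(v_sy) == n):
        raise ValueError("all four sequences must share the same non-zero length")
    # Average cumulative relative error, Eq. (15); assumes received speeds != 0.
    e_r = sum(abs(p - s) / abs(s) for p, s in zip(v_px, v_sx)) / n
    # Average cumulative absolute error, Eq. (16).
    e_a = sum(abs(p - s) for p, s in zip(v_py, v_sy)) / n
    # Eq. (17): credible only if both errors are below their thresholds.
    flag = 1 if (e_r < eps_rel and e_a < eps_abs) else 0
    return flag, e_r, e_a


# Toy values with N = 5 samples (illustrative, not NGSIM data).
ok = credibility([10.0] * 5, [10.1] * 5, [0.50] * 5, [0.52] * 5)
tampered = credibility([10.0] * 5, [12.0] * 5, [0.50] * 5, [0.52] * 5)
```

In the toy case the first call is accepted, while the second is rejected because the received speed deviates from the prediction by far more than 2%.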

4. Model Training and Verification

The hardware and software environment verified by the test is described as follows: CPU 12th Gen Intel(R) Core (TM) i9-12900KF@ 3.20 GHz, RAM 64 GB, GPU NVIDIA GeForce RTX 3080 Ti, software PyCharm Community Edition 2022.3.2. The package used and corresponding version are shown in Table 1.

4.1. Training and Testing of Vehicle LC Behavior Recognition Algorithm Based on SVM

4.1.1. Data Preparation

The US101 highway vehicle data from the NGSIM data set were selected for research. The data included instantaneous speed, acceleration, coordinates, vehicle type, lane number and other information, which were updated every 0.1 s. After eliminating the on-ramp data of side roads and expressways as well as the data of non-small vehicles, the vehicle steering angle, $\theta$, within 0.1 s was calculated from the instantaneous coordinates of the vehicle at times $t$ and $t - 0.1$. The velocity at time $t$ was orthogonally decomposed according to the vehicle steering angle to obtain $v_{x}$ and $v_{y}$.
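The orthogonal decomposition can be sketched as below. The text does not give the exact formula, so this sketch assumes that $\theta$ is the angle between the displacement over 0.1 s and the lane (longitudinal) direction, with $v_{x} = v \cos\theta$ and $v_{y} = v \sin\theta$; the function name and argument layout are hypothetical.

```python
import math


def decompose_speed(prev_pos, curr_pos, speed):
    """Split `speed` into (v_x, v_y, theta) = (longitudinal, transverse, angle).

    prev_pos, curr_pos : (lateral, longitudinal) coordinates at t - 0.1 s and t
    speed              : instantaneous speed at time t
    Assumption: theta is measured from the longitudinal (lane) direction.
    """
    d_lateral = curr_pos[0] - prev_pos[0]
    d_longitudinal = curr_pos[1] - prev_pos[1]
    theta = math.atan2(d_lateral, d_longitudinal)
    return speed * math.cos(theta), speed * math.sin(theta), theta


# Toy displacement of 0.3 m sideways and 0.4 m forward at a speed of 5 m/s.
v_x, v_y, theta = decompose_speed((0.0, 0.0), (0.3, 0.4), speed=5.0)
```

With these toy values the longitudinal component is 4 m/s and the transverse component 3 m/s; a leftward or rightward drift simply changes the sign of $v_{y}$ under this convention.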

According to Wang et al. [46], the average LC time of vehicles in the NGSIM data is 6.8 s. To reduce the data fragments whose transverse speed was close to 0 at the initial and end stages of LC, the moment at which the lane number changed was taken as the LC moment of the vehicle. The 2.5 s segments before and after the LC moment were taken as LC data and labeled 1, and the remaining data were taken as non-LC data and labeled 0.
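The labeling rule can be sketched for a single vehicle track as follows. This sketch assumes 0.1 s frames, so 2.5 s corresponds to 25 frames, and it treats the window as inclusive at both ends; the function name `label_lane_change` is hypothetical.

```python
def label_lane_change(lane_ids, half_window=25):
    """Label each 0.1 s frame: 1 within +/- 2.5 s of a lane-number change, else 0.

    lane_ids    : lane number of one vehicle, one entry per 0.1 s frame (assumed)
    half_window : 25 frames = 2.5 s at 10 Hz
    """
    labels = [0] * len(lane_ids)
    for i in range(1, len(lane_ids)):
        if lane_ids[i] != lane_ids[i - 1]:
            # Frame i is the LC moment; mark the surrounding window as LC.
            start = max(0, i - half_window)
            stop = min(len(lane_ids), i + half_window + 1)
            for j in range(start, stop):
                labels[j] = 1
    return labels


# Toy track: 6 s in lane 2, then 6 s in lane 3.
lane_ids = [2] * 60 + [3] * 60
labels = label_lane_change(lane_ids)
```

In the toy track the lane number changes at frame 60, so frames 35 to 85 are labeled as LC and all other frames as non-LC.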

Because the transverse speed of the vehicle is small and the sampling frequency is high, the data are strongly affected by noise. To reduce the influence of accidental factors over very short intervals during LC, make the speed change more continuous, and improve the model effect, the moving average method was used for noise reduction [33,47]. However, if the time window is too large, the variation pattern of the transverse speed during LC is weakened, so a time window of $n = 3$ was specified for smoothing the $v_{y}$ data.
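A moving average with $n = 3$ can be written as a short convolution. This is a sketch only: the text does not say whether the window is centered or trailing, nor how the ends of the series are treated, so the code simply returns the averages of all complete windows.

```python
import numpy as np

def moving_average(values, n=3):
    """Average each run of n consecutive samples (complete windows only)."""
    values = np.asarray(values, float)
    kernel = np.ones(n) / n
    # mode="valid" drops the incomplete windows at both ends
    return np.convolve(values, kernel, mode="valid")
```

The output is `n - 1` samples shorter than the input; padding or a trailing window would be needed if the original length must be preserved.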

4.1.2. Training and Testing

A total of 43,531 LC samples and 59,816 non-LC samples were randomly selected from the sample set and divided into a training set and a test set at a ratio of 7:3. Min-max normalization was applied to the training set to scale the data to the range [0, 1], and the normalization parameters of the training set were then used to normalize the test set:

x^{*} = \frac{x - x_{\min}}{x_{\max} - x_{\min}}

(18)

where $x^{*}$ is the normalized sample data; $x_{\min}$ is the minimum value of the sample data; $x_{\max}$ is the maximum value of the sample data; and $x$ is the sample data.

Because the penalty parameter $c$ and the kernel bandwidth parameter $γ$ of the support vector machine model are not easy to determine, the Grid Search method was used before data fitting to traverse candidate combinations and find the optimal one. To keep the search sufficiently wide, the initial value range of $c$ was set to $(10^{-2}, 10^{3})$ and that of $γ$ to $(10^{-3}, 10^{2})$ [11]. Using 5-fold cross-validation, the hyperparameter combination with the highest accuracy was selected as the optimization result, as shown in Figure 5.
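The normalization and grid search described above might look like the following scikit-learn sketch. Several details are assumptions rather than the authors' settings: the RBF kernel (suggested by the bandwidth parameter $γ$), the power-of-ten grid points, the stratified split, and the hypothetical function name `tune_svm`. The scaler is placed inside a pipeline so that it is always fitted on training data only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

def tune_svm(X, y, seed=0):
    """Grid-search C and gamma with 5-fold CV; return the search and test accuracy."""
    # 7:3 split into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y)
    # Min-max scaling is learned from training data and reused on the test set
    pipeline = make_pipeline(MinMaxScaler(), SVC(kernel="rbf"))
    param_grid = {
        "svc__C": np.logspace(-2, 3, 6),       # assumed grid: 1e-2 ... 1e3
        "svc__gamma": np.logspace(-3, 2, 6),   # assumed grid: 1e-3 ... 1e2
    }
    search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    return search, search.score(X_test, y_test)
```

After fitting, `search.best_params_` holds the selected combination and `search.cv_results_` holds the per-combination scores that a surface plot such as Figure 5 would be drawn from.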

The optimal combination was $c = 100$, $γ = 10$; the model was trained with this parameter combination and saved. Table 2 and Table 3 show the confusion matrices of the model on the training set and test set, respectively, under this hyperparameter combination.

Tables 2 and 3 show that, on the training set, the model achieved an accuracy of 99.86%, a recall of 99.70%, and an F1 score of 99.83%; on the test set, it achieved an accuracy of 99.18%, a recall of 99.45%, and an F1 score of 99.03%. For comparison, the random forest optimization model based on a multi-layer perceptron neural network proposed by Liu et al. [48] reached an accuracy of 91.9% in recognizing four typical driving behavior patterns (free driving, CF, and left and right LCs). Although the tasks and parameters differ slightly, this comparison suggests that the vehicle LC behavior recognition model proposed in this paper has a better recognition effect.
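The reported metrics follow directly from the confusion matrices, as the short sketch below shows with the counts from Tables 2 and 3 (LC is treated as the positive class). The function name is hypothetical.

```python
def binary_metrics(tn, fp, fn, tp):
    """Accuracy, precision, recall and F1 from a 2x2 confusion matrix."""
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Counts from Table 2 (training set) and Table 3 (test set)
train_metrics = binary_metrics(tn=41840, fp=9, fn=91, tp=30402)
test_metrics = binary_metrics(tn=17786, fp=181, fn=72, tp=12966)
```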

4.2. Training and Testing of LC Speed Prediction Algorithm Based on RNN

4.2.1. Data Preparation

The LC fragments labeled in Section 4.1.1 were selected. Each fragment obtained from one LC of a vehicle contains 5 s of vehicle data, during which the vehicle passes through the states before LC, executing LC, and ending LC. To keep the speed prediction accurate across these states, once the input sequence had been determined, 14,499 groups of sample data were extracted with the sliding window method within the 5 s fragments. According to the input settings in Section 3.3, the data of the 15 sampling points preceding the prediction time for the same vehicle were combined into a 15 × 8 matrix $X$ as the input of the neural network.
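The sliding-window sampling can be sketched as below. The function name is hypothetical, and two details are assumptions: the window advances by one sampling point at a time, and the prediction targets are the two velocity components at the sampling point that follows each window (taken here from columns 0 and 1 of the feature matrix).

```python
import numpy as np

def make_windows(fragment, window=15, target_cols=(0, 1)):
    """Cut one LC fragment (T x 8 features) into 15 x 8 inputs and next-step targets."""
    fragment = np.asarray(fragment, float)
    inputs, targets = [], []
    for start in range(len(fragment) - window):
        inputs.append(fragment[start:start + window])
        # Target: selected columns at the sample right after the window
        targets.append(fragment[start + window, list(target_cols)])
    return np.stack(inputs), np.stack(targets)
```

With a 50-point fragment this yields 35 input matrices of shape 15 × 8.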

4.2.2. Training and Testing

The data were normalized in the same way as for the SVM and divided into a training set, a validation set, and a test set at a ratio of 3:1:1. The batch size for model training was set to 128 and the number of epochs to 1000.

To prevent overfitting during RNN training, a larger data set was used, and a dropout layer was added between the input layer and the hidden layer and between the hidden layer and the output layer [49,50]. Dropout randomly drops some neurons during training, so that the model does not depend too heavily on any single neuron; this helps to avoid overfitting, reduces the effective complexity of the model during training, and improves its generalization ability. Using the control variable method with all other parameters unchanged, the dropout rate was set to 0.5, 0.1, 0.05, and 0.01 in turn, and the model was trained for each value [51]. The training results are shown in Table 4 and Table 5. With a dropout rate of 0.01, the mean absolute error (MAE) values on the training, validation, and test sets were the smallest, indicating the best model fit.

To balance prediction accuracy and efficiency, two further hyperparameters besides the dropout rate strongly affect the model results: the number of neurons and the learning rate. Too many neurons make the model complex and prone to overfitting, while too few make it too simple and prone to underfitting. Too large a learning rate may prevent the model from converging during training, or even cause the loss to oscillate, while too small a learning rate may lead to slow convergence and long training times. Combinations of 16, 32, 64, 128, and 256 neurons [52] with learning rates of 0.0001, 0.001, and 0.01 [51] were tested, and the convergence of the training loss was observed to determine the parameter combination. The results for these hyperparameter combinations are shown in Table 6.

When the number of neurons in each layer was 64 and the learning rate was 0.001, the MAE on the test set was minimal. The structure parameters and training parameters of the RNN were therefore set as shown in Table 7.
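To make the structure concrete, the following NumPy sketch shows a forward pass through a single-hidden-layer recurrent network with dropout applied before and after the hidden layer. It is illustrative only and is not the Keras/TensorFlow model used in the experiments: the function names, the tanh activation, the weight initialization, and the two-value output (longitudinal and transverse speed) are assumptions, while the 15 × 8 input, the 64 hidden units, and the dropout rate of 0.01 are taken from the text.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero a random fraction `rate` of entries, rescale the rest."""
    if not training or rate == 0.0:
        return x
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def init_params(n_in=8, n_hidden=64, n_out=2, seed=0):
    """Small random weights for an Elman-style RNN (hypothetical initialization)."""
    rng = np.random.default_rng(seed)
    return (rng.normal(0, 0.1, (n_in, n_hidden)),      # input -> hidden
            rng.normal(0, 0.1, (n_hidden, n_hidden)),  # hidden -> hidden
            np.zeros(n_hidden),
            rng.normal(0, 0.1, (n_hidden, n_out)),     # hidden -> output
            np.zeros(n_out))

def rnn_forward(X, params, rate=0.01, rng=None, training=False):
    """Map one 15 x 8 input sequence to a 2-value speed prediction."""
    if rng is None:
        rng = np.random.default_rng(0)
    W_xh, W_hh, b_h, W_hy, b_y = params
    h = np.zeros(W_hh.shape[0])
    # Dropout between the input layer and the hidden layer
    for x_t in dropout(np.asarray(X, float), rate, rng, training):
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)
    # Dropout between the hidden layer and the output layer
    return dropout(h, rate, rng, training) @ W_hy + b_y
```

Dropout is active only when `training=True`; at prediction time the network is deterministic.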

The RNN was trained with the Adam optimizer, and the model was evaluated on the validation set once after each training round. The mean square error was taken as the loss function. The change trend of the training loss is shown in Figure 6. The minimum mean square error loss over all rounds was 2.83 × 10⁻⁵ on the training set and 1.30 × 10⁻⁵ on the validation set. The parameters from that round were used as the parameters of the vehicle LC speed prediction model.

After training, the mean absolute error (MAE) and mean square error (MSE) were used to evaluate the performance of the RNN. The performance on the training set, validation set, and test set is shown in Table 8; the error is on the order of cm/s. The CNN-Bi-LSTM-Attention model proposed by Zhao et al. [53] predicted vehicle acceleration with an average absolute error of 0.0531 m/s. Although the prediction objects differ, the comparison still indicates that the model used in this paper achieved high precision.
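For reference, the two evaluation indices can be computed as in this short sketch (function names are hypothetical). Applied to speeds in m/s, the MAE is in m/s and the MSE in m²/s², which matches the units in Table 8.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))

def mse(y_true, y_pred):
    """Mean square error."""
    return float(np.mean((np.asarray(y_true, float) - np.asarray(y_pred, float)) ** 2))
```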

Taking the data fragment of vehicle ID 1 in the test set as an example, the predicted speed value was compared with the real value, and the prediction result is shown in Figure 7.

The blue curve represents the trend of the longitudinal speed of the vehicle and corresponds to the scale on the left vertical axis. The red curve shows the trend of the transverse speed of the vehicle and corresponds to the scale on the right vertical axis. The blue and red dots at the end of the curves represent the longitudinal and transverse velocity values at time $t$ predicted by the model, respectively. The predicted velocities coincide with the ends of the actual velocity curves, which shows that the model had a good prediction effect.

4.3. Credibility Discrimination Model Calculation

4.3.1. Data Preparation

To verify the validity of the credibility discrimination, previously unused data were selected for this experiment: 2 s of operation data from 250 vehicles in the LC state and 250 vehicles in the non-LC state. Within the five sampling points of the last 0.5 s, the real velocity at one, two, or three randomly chosen points was altered. For single-point processing, the longitudinal velocity was randomly increased or decreased by 10~12% and the transverse velocity was randomly increased or decreased by 0.25~0.40 m/s; these changes are referred to here as increment and decrement. For two-point processing, the change applied to each point was 1/2 of the single-point value, and for three-point processing it was 1/3. This processing yielded six groups of abnormal data; together with one group of unprocessed real data, seven groups were used for test verification.
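The anomaly construction can be sketched as follows. The function name and signature are hypothetical, and the code assumes that the same randomly chosen points are altered in both the longitudinal and the transverse series, which the text does not state explicitly.

```python
import numpy as np

def inject_anomaly(v_x, v_y, n_points, sign, rng):
    """Alter n_points of the last five samples; sign=+1 increments, -1 decrements."""
    v_x = np.array(v_x, dtype=float)
    v_y = np.array(v_y, dtype=float)
    # Choose which of the last five sampling points (0.5 s) to alter
    idx = len(v_x) - 5 + rng.choice(5, size=n_points, replace=False)
    # Single-point magnitudes, scaled by 1/n_points for two- and three-point cases
    rel = rng.uniform(0.10, 0.12, size=n_points) / n_points
    absolute = rng.uniform(0.25, 0.40, size=n_points) / n_points
    v_x[idx] *= 1 + sign * rel       # longitudinal: relative change
    v_y[idx] += sign * absolute      # transverse: absolute change in m/s
    return v_x, v_y, np.sort(idx)
```

Calling the function with `n_points` of 1, 2, and 3 and `sign` of +1 and -1 gives the six abnormal groups.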

4.3.2. Verification Results and Analysis

The confusion matrix of vehicle state recognition in the preparatory stage is shown in Table 9. The overall accuracy was 99.60%, the recall was 99.20%, and the precision was 100%, which shows that the SVM classifier had a good recognition effect in this experiment.

After the vehicle speed was predicted, the last 0.5 s of data of the 248 correctly identified LC vehicles were classified again, and all were recognized as being in the LC state. The discrimination calculation was then carried out on these 248 LC vehicles, and the results are shown in Table 10. The exact number is the number of correct discrimination results: for abnormal data, it is the number of samples identified as anomalous; for real data, it is the number of samples identified as credible. Longitudinal velocity was evaluated with $ε_{1}$, transverse velocity with $ε_{2}$, and LC behavior with both $ε_{1}$ and $ε_{2}$.

The discrimination accuracy for both the transverse and the longitudinal abnormal velocity of LC behavior was more than 93% in every group, and more than 97% in all groups except the random three-point anomaly groups. The recognition rate for random three-point abnormal LC velocity was slightly lower because each altered point deviates less from the true value, so the errors of more samples lay close to the preset error thresholds. Overall, the credibility discrimination model had a good effect on the recognition of vehicle LC behavior and the discrimination of random abnormal vehicle speed data.

Taking the 227th vehicle in the random three-point abnormal incremental data set as an example, the comparison of the true value, predicted value and abnormal value of the transverse and longitudinal speed is shown in Figure 8.

The blue curve represents the trend of the longitudinal speed of the vehicle and corresponds to the scale on the left vertical axis. The red curve shows the trend of the transverse speed of the vehicle and corresponds to the scale on the right vertical axis. The five blue and red dots represent the longitudinal and transverse velocity values predicted by the model for five consecutive moments. The three blue and red crosses represent the abnormal longitudinal and transverse velocity values at the three random moments, respectively. The predicted velocity curve was very close to the actual velocity curve and clearly distinguishable from the abnormal velocity curve, which shows that the model had a good discrimination ability. The speed prediction accuracy was high, so even small abnormal fluctuations could be discriminated accurately.

5. Conclusions and Future Developments

5.1. Conclusions

To prevent the speed information of connected vehicles from being maliciously tampered with, this paper proposed a method of lane-changing behavior recognition and information credibility discrimination based on machine learning. The vehicle state was identified with the SVM model, the transverse and longitudinal velocities were predicted with the RNN model, and a discrimination mechanism was established to evaluate the difference between the collected speed and the predicted speed.

Key research results are summarized below:

(1): To discriminate the credibility of vehicle speed during LC, an information credibility discrimination method was proposed based on traffic business characteristics. Taking the LC scene as an example, traffic business characteristics (such as vehicle speed and location) were used as model inputs to identify LC behavior, predict vehicle LC speed, and discriminate abnormal speed data. The experimental results show that the method achieves high recognition, prediction, and discrimination accuracy, which reflects the feasibility and effectiveness of using traffic business characteristics to identify abnormal information.
(2): In the treatment of LC anomalous speed, the longitudinal speed at a single point among five consecutive sampling points within 0.5 s was randomly increased or decreased by 10~12%, and the transverse speed was randomly increased or decreased by 0.25~0.40 m/s. The changes applied in the two-point and three-point cases were 1/2 and 1/3 of the single-point change, respectively. Although the abnormal speed was already very close to the real speed, the model still had a good discriminative effect. If attackers or malicious vehicles change the LC speed interaction information by a larger amount, the accuracy of discrimination will be higher.
(3): This model can be applied to on-board equipment or roadside units and combined with credibility discrimination models for other traffic business characteristics, such as the speed and acceleration of other vehicle behaviors. Furthermore, it can provide support for defending VANET and i-VICS against attacks that use abnormal LC vehicle speed information.

5.2. Limitations and Future Development

The LC behavior studied in this paper was not further subdivided into left LC and right LC, or into forced LC [54] and free LC. Due to the limitations of the data set, the trained model may not apply to LC behavior on urban roads. At the same time, LC behavior can be further divided into several stages, such as LC preparation and LC execution. If the different stages and types of LC behavior are classified and the speed is predicted separately for each, the credibility discrimination effect is expected to improve. In addition to CF and LC behaviors, the reliability of vehicle speed could also be discriminated based on traffic business characteristics such as sudden braking and U-turns.

In the future, we can adopt a variety of strategies to enhance real-time discrimination. The efficiency of the system can be improved through continuous optimization of the model algorithm, which involves modifying the existing algorithm or finding a more efficient alternative algorithm. Steps can be taken to improve the hardware performance, and parallel computing technology can be used to improve the overall computational efficiency of the algorithm.

Author Contributions

Conceptualization, S.Y. and X.C.; methodology, X.C. and S.Y.; software, X.C. and J.W.; validation, X.C. and J.W.; formal analysis, X.C. and Y.Z.; writing—original draft preparation, X.C. and S.Y.; writing—review and editing, X.C.; visualization, X.C. and S.Y.; supervision, J.W. and S.Y.; project administration, S.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Key Research and Development Program of China, grant number 2018YFB1600600, and also by the People’s Public Security University of China Basic Scientific Research for New Teachers Starting Fund Project, grant number 2022JKF434.

Data Availability Statement

The data presented in this study are available in the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, Y.; Yao, D.Y.; Li, L.; Pei, H.; Yan, S.; Ge, J.W. Technologies and Applications for Intelligent Vehicle-infrastructure Cooperation Systems. J. Transp. Syst. Eng. Inf. Technol. 2021, 21, 40–51.
2. Sun, X.; Qi, Z.F. Analysis on Cyber Security of Vehicle-infrastructure Cooperation. J. Highw. Transp. Res. Dev. 2020, 37, 142–146.
3. Kim, K.; Kim, J.S.; Jeong, S.; Park, J.-H.; Kim, H.K. Cybersecurity for autonomous vehicles: Review of attacks and defense. Comput. Secur. 2021, 103, 102150.
4. Yao, X.X.; Zhang, X.L.; Ning, H.S.; Li, P.J. Using trust model to ensure reliable data acquisition in VANETs. Ad Hoc Netw. 2017, 55, 107–118.
5. Azees, M.; Vijayakumar, P.; Deboarh, L.J. EAAP: Efficient Anonymous Authentication with Conditional Privacy-Preserving Scheme for Vehicular Ad Hoc Networks. IEEE Trans. Intell. Transp. Syst. 2017, 18, 2467–2476.
6. El-Rewini, Z.; Sadatsharan, K.; Selvaraj, D.F.; Plathottam, S.J.; Ranganathan, P. Cybersecurity challenges in vehicular communications. Veh. Commun. 2020, 23, 100214.
7. Zhou, Q.; Zeng, Z.K.; Wang, K.M.; Chen, M.L. Location Privacy Protection Scheme for Group Signature with Backward Unlinkability. Netinfo Secur. 2023, 23, 62–72.
8. Zheng, M.H.; Duan, Y.Y.; Lyu, H.X. Research on Identity Authentication Protocol Group Signature-based in Internet of Vehicles. Adv. Eng. Sci. 2018, 50, 130–134.
9. Yang, X.T.; Li, Z. Identity Authentication Scheme Based on Vehicle Behavior Prediction for Iov. Comput. Eng. 2021, 47, 129–138.
10. Yang, C.J.; Peng, J.S.; Xu, Y.; Wei, Q.J.; Zhou, L.; Tang, Y.N. Edge Computing-Based VANETs’ Anonymous Message Authentication. Symmetry 2022, 14, 2662.
11. Shi, Y.C.; Yan, S.; Yao, D.Y.; Zhang, Y. SVM-LSTM-based car-following behavior recognition and information credibility discrimination. J. Traffic Transp. Eng. 2022, 22, 115–125.
12. Yao, J.F.; He, R.; Shi, T.T.; Wang, P.; Zhao, X.M. Review on machine learning-based traffic flow prediction methods. J. Traffic Transp. Eng. 2023, 23, 44–67.
13. Ji, H.H.; Mei, J.; Wang, L.; Liu, S.D.; Ren, Y. Data-Driven Kalman Consensus Filtering for Connected Vehicle Speed Estimation in a Multi-Sensor Network. Symmetry 2023, 15, 1699.
14. Liu, G.; He, S.; Han, X.; Luo, Q.Y.; Du, R.H.; Fu, X.S.; Zhao, L. Self-Supervised Spatiotemporal Masking Strategy-Based Models for Traffic Flow Forecasting. Symmetry 2023, 15, 2002.
15. Ji, X.F.; Lu, M.Y.; Qin, W.W. Passenger Cars Driving Behaviors Recognition Under Truck Movement Interruption. J. Transp. Syst. Eng. Inf. Technol. 2021, 21, 174–182.
16. Chen, L.; Feng, Y.C.; Li, Q.R. Probe into the Multi-class SVM-based recognition model for the vehicle lane-altering behaviors. J. Saf. Environ. 2020, 20, 193–199.
17. Xie, D.F.; Fang, Z.Z.; Jia, B.; He, Z.B. A data-driven lane-changing model based on deep learning. Transp. Res. Part C Emerg. Technol. 2019, 106, 41–60.
18. Huang, X.L.; Sun, J.; Sun, J. A car-following model considering asymmetric driving behavior based on long short-term memory neural networks. Transp. Res. Part C Emerg. Technol. 2018, 95, 346–362.
19. Zhao, J.D.; Jiao, L.X.; Zhao, Z.M.; Qu, Y.C.; Sun, H.J. A Car-Following Model Driven by Combination of Theory and Data Considering Effects of Lane Change of Side Cars. J. South China Univ. Technol. 2023, 51, 10–19.
20. Cai, Y.F.; Tai, K.S.; Wang, H.; Li, Y.C.; Chen, L. Research on Behavior Recognition Algorithm of Surrounding Vehicles for Driverless Car. Automot. Eng. 2020, 42, 1464–1472+1505.
21. Zhao, S.E.; Su, T.B.; Zhao, D.Y. Interactive Vehicle Driving Intention Recognition and Trajectory Prediction Based on Graph Neural Network. Automob. Technol. 2023, 07, 24–30.
22. Huang, K.Q.; Luo, T. Vehicle lane change intention recognition based on attention-bilstm network. J. Zhejiang Univ. Technol. 2023, 51, 264–270.
23. Feng, Y.H.; Huang, S.E.; Wong, W.; Chen, Q.A.; Mao, Z.M.; Liu, H.X. On the Cybersecurity of Traffic Signal Control System with Connected Vehicles. IEEE Trans. Intell. Transp. Syst. 2022, 23, 16267–16279.
24. Iqbal, S.; Ball, P.; Kamarudin, M.H.; Bradley, A. Simulating Malicious Attacks on VANETs for Connected and Autonomous Vehicle Cybersecurity: A Machine Learning Dataset. In Proceedings of the 2022 13th International Symposium on Communication Systems, Networks and Digital Signal Processing (CSNDSP), Porto, Portugal, 20–22 July 2022.
25. So, S.; Sharma, P.; Petit, J. Integrating Plausibility Checks and Machine Learning for Misbehavior Detection in VANET. In Proceedings of the 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), Orlando, FL, USA, 17–20 December 2018.
26. Huang, S.E.; Feng, Y.H.; Liu, H.X. A data-driven method for falsified vehicle trajectory identification by anomaly detection. Transp. Res. Part C Emerg. Technol. 2021, 128, 103196.
27. Shangguan, W.; Zha, Y.Y.; Fu, Y.; Zheng, S.F.; Chai, L.G. Vehicle-infrastructure cooperative credible interaction method based on traffic business characteristics understanding. J. Traffic Transp. Eng. 2022, 22, 348–360.
28. Wu, X.K.; He, S.; Zhang, S.W.; He, X.Z.; Wang, S.F. An Improved Lane-changing Model for Connected Automated Vehicles Under Cyberattacks. J. Tongji Univ. 2022, 50, 1715–1727.
29. Chen, T.Y.; Shi, X.P.; Wong, Y.D. A lane-changing risk profile analysis method based on time-series clustering. Phys. A 2021, 565, 125567.
30. Cortes, C.; Vapnik, V. Support-Vector Networks. Mach. Learn. 1995, 20, 273–297.
31. Vapnik, V. The Nature of Statistical Learning Theory; Springer: Berlin/Heidelberg, Germany, 1995.
32. Chen, R.B.; Ye, X.E.; Zhang, F.Y.; Zhao, D. Research on recognition method of vehicle lane-change behavior based on video image. In Proceedings of the International Conference on Applied Science & Engineering Innovation, Jinan, China, 30–31 August 2015.
33. Zhang, C.; Han, Y.; Liu, K.W.; Wang, D.; Lin, Y. Lane-changing recognition and analysis using driver behavior data. J. Chongqing Univ. Technol. 2022, 36, 52–59.
34. Ma, T.T.; Xu, X.J.; Zhu, W.D. Driver’s Overtaking Intention Recognition Based on Support Vector Machine. J. Shanghai Univ. Eng. Sci. 2016, 30, 203–208.
35. Mi, J.X.; Yu, H.L.; Xi, J.Q. Prediction of Driver’s Lane Changing Behavior Based on MLP-SVM. Acta Armamentarii 2022, 43, 3020–3029.
36. Liu, Z.Q.; Wu, X.G.; Ni, J.; Zhang, T. Driving Intention Recognition Based on HMM and SVM Cascade Algorithm. Automot. Eng. 2018, 40, 858–864.
37. Erzurum Cicek, Z.I.; Kamisli Ozturk, Z. Prediction of fatal traffic accidents using one-class SVMs: A case study in Eskisehir, Turkey. Int. J. Crashworthiness 2022, 27, 1433–1443.
38. Sun, Y.X.; Shao, C.F.; Yue, H.; Zhu, L. Urban traffic accident severity analysis based on sensitivity analysis of support vector machine. J. Jilin Univ. 2014, 44, 1315–1320.
39. Wen, X.; Deng, C. GA-SVM Distracted Driving State Discrimination Model Based on Vehicle Lateral Running Data. Sci. Technol. Eng. 2023, 23, 10990–10996.
40. Ma, Y.L.; Gu, G.F.; Gao, Y.E.; Ma, Y. Driver Distraction Judging Model Under In-vehicle Information System Operation Based on Driving Performance. China J. Highw. Transp. 2016, 29, 123–129.
41. Song, L.; Wang, R.J.; Xiao, D.; Han, X.T.; Cai, Y.N.; Shi, C. Anomalous Trajectory Detection Using Recurrent Neural Network. In Proceedings of the 14th International Conference on Advanced Data Mining and Applications, Nanjing, China, 16–18 November 2018.
42. Li, C.N.; Feng, G.W.; Liu, R.Y.; Miao, Q.G. Traffic Trajectory Anomaly Detection Method Based on Reconstruction Error. Comput. Sci. 2022, 49, 149–155.
43. Zhang, Q.Y.; Li, C.W.; Guo, G.M.; Wang, J.J. Short-term Traffic Flow Prediction Based on MCTLBO-RNN. J. Wuhan Univ. Technol. 2020, 42, 92–99.
44. An, L.; Zhao, S.L.; Wu, Y.L.; Chen, R.Z.; Li, J.X. Prediction method of supply and demand for online car based on recurrent neural networks. Appl. Res. Comput. 2019, 36, 756–761.
45. Liu, W.; Zhang, X.L.; Sun, S.B.; Zhao, P.C. The Forecast of the Traffic Event of the Recurrent Neural Network Based on the Random Deactivation. Comput. Simul. 2021, 38, 78–82+87.
46. Wang, Q.; Li, Z.H.; Li, L. Investigation of Discretionary Lane-Change Characteristics Using Next-Generation Simulation Data Sets. J. Intell. Transp. Syst. 2014, 18, 246–253.
47. Chen, J.X.; Cheng, W.Y.; Wan, J.; Wang, Y.R. Safety Evaluation of Car Following Behavior Based on NGSIM Micro Trajectory Data. J. Chongqing Jiaotong Univ. 2022, 41, 1–6+21.
48. Liu, T.; Xu, L.; Zhang, X.L.; Peng, J.S. Driving Behavior Patterns Recognition Method in High Speed Conditions Based on Multi-source Parameters. J. Chongqing Jiaotong Univ. 2023, 42, 88–97.
49. Hwang, S.H. Vehicle Trajectory Prediction with Lane Stream Attention-Based LSTMs and Road Geometry Linearization. Sensors 2021, 21, 8152.
50. Ashfaq, F.; Ghoniem, R.M.; Jhanjhi, N.Z.; Khan, N.A.; Algarni, A.D. Using Dual Attention BiLSTM to Predict Vehicle Lane Changing Maneuvers on Highway Dataset. Systems 2023, 11, 196.
51. Zhao, X.M.; Sun, K.; Gong, S.Y.; Wu, X. RF-BiLSTM Neural Network Incorporating Attention Mechanism for Online Ride-Hailing Demand Forecasting. Symmetry 2023, 15, 670.
52. Zhang, Q.Y.; Zhou, L.F.; Su, Y.X.; Xia, H.W.; Xu, B.R. Gated Recurrent Unit Embedded with Dual Spatial Convolution for Long-Term Traffic Flow Prediction. ISPRS Int. J. Geo-Inf. 2023, 12, 366.
53. Zhao, J.D.; Zhao, Z.M.; Qu, Y.C.; Xie, D.F.; Sun, H.J. Vehicle Lane Change Intention Recognition Driven by Trajectory Data. J. Transp. Syst. Eng. Inf. Technol. 2022, 22, 63–71.
54. Li, H.; Cheng, H.H.; Wang, J.W.; An, Y.S. Application of Improved Sliding Window Algorithm and SVM in Vehicle Lane Change Behavior Recognition. Comput. Syst. Appl. 2019, 28, 113–118.

Figure 1. Vehicle lane-changing (LC) scene.

Figure 2. Credibility discrimination flow chart.

Figure 3. Schematic diagram of SVM. The red and blue circles in the diagram represent two types of points in the space that the instance feature vectors map to.

Figure 4. RNN neural network structure.

Figure 5. Grid search parametric surface plot.

Figure 6. RNN loss variation.

Figure 7. Comparison between predicted and real speeds.

Figure 8. Discrimination of abnormal speed.

Table 1. The package used in the experiment and the corresponding version.

Package	Version
Tensorflow	2.6.0
Numpy	1.22.0
Keras	2.6.0
Scikit-learn	1.2.2
Pandas	2.0.2

Table 2. SVM training set confusion matrix.

Actual State	Recognized as Non-LC (0)	Recognized as LC (1)	Total
non-LC (0)	41,840	9	41,849
LC (1)	91	30,402	30,493
Total	41,931	30,411	72,342

Table 3. SVM test set confusion matrix.

Actual State	Recognized as Non-LC (0)	Recognized as LC (1)	Total
non-LC (0)	17,786	181	17,967
LC (1)	72	12,966	13,038
Total	17,858	13,147	31,005

Table 4. Comparison of MAE ($v_{p, x}$) results with different dropouts.

Dropout Value	Training Set	Validation Set	Test Set
0.5	1.08 × 10⁻¹	1.06 × 10⁻¹	1.08 × 10⁻¹
0.1	6.09 × 10⁻²	6.02 × 10⁻²	6.04 × 10⁻²
0.05	5.62 × 10⁻²	5.56 × 10⁻²	5.51 × 10⁻²
0.01	3.82 × 10⁻²	3.88 × 10⁻²	3.75 × 10⁻²

Table 5. Comparison of MAE ($v_{p, y}$) results with different dropouts.

Dropout Value	Training Set	Validation Set	Test Set
0.5	3.12 × 10⁻²	3.19 × 10⁻²	3.18 × 10⁻²
0.1	1.68 × 10⁻²	1.70 × 10⁻²	1.70 × 10⁻²
0.05	1.59 × 10⁻²	1.57 × 10⁻²	1.59 × 10⁻²
0.01	1.37 × 10⁻²	1.37 × 10⁻²	1.38 × 10⁻²

Table 6. Comparison of test set MAE results with different hyperparameters.

Number of Neurons	Learning Rate	$v_{p, x}$ MAE	$v_{p, y}$ MAE
16	0.01	3.61 × 10⁻²	1.53 × 10⁻²
16	0.001	2.47 × 10⁻²	1.27 × 10⁻²
16	0.0001	6.14 × 10⁻²	1.77 × 10⁻²
32	0.01	4.26 × 10⁻²	1.65 × 10⁻²
32	0.001	1.88 × 10⁻²	1.24 × 10⁻²
32	0.0001	4.23 × 10⁻²	1.42 × 10⁻²
64	0.01	7.35 × 10⁻²	2.36 × 10⁻²
64	0.001	1.71 × 10⁻²	1.22 × 10⁻²
64	0.0001	3.73 × 10⁻²	1.33 × 10⁻²
128	0.01	2.13 × 10⁻¹	7.50 × 10⁻¹
128	0.001	2.09 × 10⁻²	1.22 × 10⁻²
128	0.0001	3.06 × 10⁻²	1.28 × 10⁻²
256	0.01	9.33 × 10⁻¹	4.68 × 10⁻¹
256	0.001	2.62 × 10⁻²	1.41 × 10⁻²
256	0.0001	2.31 × 10⁻²	1.23 × 10⁻²

Table 7. RNN neural network parameters.

Parameter	Value
Number of hidden layers	1
Units per hidden layer	64
Batch Size	128
Learning Rate	0.001
Epochs	1000

Table 8. RNN model evaluation index.

Evaluation Index	$v_{p, x}$ MAE	$v_{p, x}$ MSE	$v_{p, y}$ MAE	$v_{p, y}$ MSE
Training Set	1.73 × 10⁻² m/s	6.31 × 10⁻⁴ m²/s²	1.22 × 10⁻² m/s	3.23 × 10⁻⁴ m²/s²
Validation Set	1.75 × 10⁻² m/s	6.40 × 10⁻⁴ m²/s²	1.23 × 10⁻² m/s	3.34 × 10⁻⁴ m²/s²
Test Set	1.71 × 10⁻² m/s	6.43 × 10⁻⁴ m²/s²	1.22 × 10⁻² m/s	3.20 × 10⁻⁴ m²/s²

Table 9. Confusion matrix for vehicle state recognition in the preparatory phase.

Actual State	Recognized as Non-LC	Recognized as LC	Total
non-LC	250	0	250
LC	2	248	250
Total	252	248	500

Table 10. Results of credibility discrimination.

Exception Handling Mode	Longitudinal Velocity: Exact Number	Longitudinal Velocity: Accuracy Rate	Transverse Velocity: Exact Number	Transverse Velocity: Accuracy Rate	LC Behavior: Exact Number	LC Behavior: Accuracy Rate
Random Single Point Abnormal Increment	248	100%	248	100%	248	100%
Random Single Point Abnormal Decrement	247	99.60%	247	99.60%	247	99.60%
Random Two-point Abnormal Increment	246	99.19%	247	99.60%	245	98.79%
Random Two-point Abnormal Decrement	248	100%	247	99.60%	247	99.60%
Random Three-Point Abnormal Increment	234	94.35%	244	98.39%	230	92.74%
Random Three-Point Abnormal Decrement	246	99.19%	239	93.15%	230	92.74%
Unprocessed True Value	246	99.19%	244	98.39%	243	97.98%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

