#### *3.1. Speed Reliability Evaluation of Drivers in Curved Road Tests*

Curves 6 and 9 are randomly selected from the experimental curves for detailed analysis [37–39]. For curve 6 (R = 150 m), the drivers are grouped by the speed at which the vehicle enters the curve: 40 km/h, 50 km/h or 60 km/h. Seven testers (No. 5, 7, 11, 13, 19, 20 and 23) take the curve at 60 km/h, eleven testers (No. 4, 8, 9, 10, 15, 16, 17, 18, 21, 26 and 27) at 50 km/h, and eight testers (No. 1, 2, 6, 12, 14, 22, 24 and 25) at 40 km/h. The speed analysis shows that the drivers who turn at 60 km/h are generally skilled drivers. Only tester 23 is nominally a new driver, but with 4 years of driving experience and 50,000 km driven, this tester is treated as a skilled driver in this research. In the 50 km/h group, only testers 8, 9 and 27 are novice drivers, whereas in the 40 km/h group, drivers 6, 12, 14, 22, 24 and 25 are novices. This grouping further indicates that the skilled drivers perform better when cornering. The related results are shown in Figure 4. The vehicles entering the curve at 50 km/h and 60 km/h exhibit a more regular speed profile, i.e., deceleration first followed by acceleration, whereas the vehicles entering the curve at 40 km/h basically accelerate gradually after entering the curve. This can be explained by the fact that the initial speeds of 50 km/h and 60 km/h are mainly adopted by skilled drivers, which also suggests that skilled drivers possess better turning maneuverability. The novice drivers mainly take the curve at 40 km/h. The forward movement data show that the novice drivers tend to slow down before the curve because of the tight cornering, so the vehicle speed has already been reduced to the design speed in the turning phase.
Overall, the skilled drivers show a higher speed when entering curve 6. By contrast, when the novice drivers enter the curve, the speed trend basically maintains the characteristics of deceleration first and subsequent acceleration.

**Figure 4.** Speed analysis under curve 6. (**a**) Average velocity distribution; (**b**) Novice and skilled velocity distribution.

For curve 9 (R = 55 m), the drivers are classified according to the critical turning speeds of 30 km/h, 40 km/h and 50 km/h. Tester 11, who turns at 60 km/h, is excluded because of higher-risk driving. Eight testers (No. 7, 13, 16, 20, 22, 24, 25 and 26) turn at 50 km/h, among whom testers 20, 22, 24 and 25 are novice drivers. Thirteen testers (No. 1, 4, 5, 6, 8, 12, 14, 15, 17, 18, 19, 21 and 23) turn at 40 km/h, among whom testers 6, 8, 12, 14 and 23 are novice drivers. Testers 3, 9, 10 and 27 turn at 30 km/h, and tester 2 turns at 20 km/h. Since the high speed of 60 km/h and the low speed of 20 km/h are each represented by a single tester, these two cases are treated as abnormal data and are not discussed. As shown in Figure 5, the vehicles entering the curve at 40 km/h and 50 km/h exhibit a more regular speed profile, i.e., deceleration first followed by acceleration. The vehicles entering the curve at 30 km/h basically accelerate gradually after entering the curve. The analysis of testers entering the curve at different speeds reveals no specific distribution pattern between skilled and novice drivers. The forward movement data show that the drivers who enter the curve at 30 km/h tend to slow down before the curve, and their speed drops to its minimum in the turning phase. On the whole, the turning speed of the skilled drivers in curve 9 is basically the same as that of the novice drivers. The skilled drivers generally decelerate first and then accelerate, while the novice drivers show a more pronounced deceleration trend.

**Figure 5.** Trajectory analysis of curve 9: (**a**) average velocity distribution; (**b**) novice and skilled velocity distribution.

Based on the analysis of the vehicle speed, the reliability coefficients of curves 6 and 9 are further obtained. According to the validation method of the curve reliability, Cronbach's *α* reliability, the split-half reliability and the *r* reliability are selected to evaluate the simulator speed [25], as shown in the following equations. Cronbach's *α* reliability can be presented as:

$$\alpha = \frac{k}{k-1} \left[ \frac{S_X^2 - \sum S_{X_i}^2}{S_X^2} \right] = \frac{k}{k-1} \left[ 1 - \frac{\sum S_{X_i}^2}{S_X^2} \right] \tag{2}$$

where *k* is the number of stakes (sampling stations), *S*<sub>*X*</sub><sup>2</sup> is the total variance of the sampled vehicle speed, and *S*<sub>*X*<sub>*i*</sub></sub><sup>2</sup> is the variance of the drivers' speed at the *i*-th stake. The split-half reliability can be expressed by the Spearman-Brown coefficient, as:

$$r_{xx} = \frac{2 \times \frac{\sum X_1 X_2 / n - \overline{X}_1 \overline{X}_2}{S_{x_1} S_{x_2}}}{1 + \frac{\sum X_1 X_2 / n - \overline{X}_1 \overline{X}_2}{S_{x_1} S_{x_2}}} = \frac{2\left(\sum X_1 X_2 / n - \overline{X}_1 \overline{X}_2\right)}{S_{x_1} S_{x_2} + \left(\sum X_1 X_2 / n - \overline{X}_1 \overline{X}_2\right)} \tag{3}$$

where *r*<sub>*xx*</sub> is the reliability coefficient of the simulator curve experiment, *X*<sub>1</sub> is the sum of the speeds at the odd-numbered stakes of the curve, *X*<sub>2</sub> is the sum of the speeds at the even-numbered stakes of the curve, *X̄*<sub>1</sub> is the mean value of the odd-numbered speed sum, *X̄*<sub>2</sub> is the mean value of the even-numbered speed sum, *n* is the number of test drivers, *S*<sub>*x*1</sub> is the standard deviation of the odd-numbered speed sum and *S*<sub>*x*2</sub> is the standard deviation of the even-numbered speed sum. The *r* reliability [41] can be formulated as:

$$r = 1 - \frac{\sqrt{k \sum_{i=1}^{k} S_i^2 - S^2 \times 3.92}}{\sqrt{k-1}\,(X_{\max} - X_{\min})} \tag{4}$$

where *S*<sup>2</sup> is the total variance of the sum of speeds in the curve, *S*<sub>*i*</sub><sup>2</sup> is the variance of the speed at the *i*-th stake, *k* is the number of stakes, and *X*<sub>max</sub> and *X*<sub>min</sub> are the highest and lowest values of the sum of speeds. The resulting coefficients are shown in Table 1. Applying the reliability model to the experimental data of curves with different radii shows that all three reliability coefficients for curve 6 meet the experimental reliability requirements, whereas for curve 9 the *α* reliability coefficient and the Spearman-Brown coefficient meet the requirements but the *r* reliability coefficient is low. Note that when evaluating the reliability of a test, the *r* coefficient should generally be higher than 0.70 and the other reliability coefficients should generally be higher than 0.80 [41–43]. Therefore, the data consistency of curve 9 is insufficient only from the perspective of the *r* reliability coefficient. Meanwhile, the reliability coefficients improve after data cleaning, which indicates that the difference between testers is a key factor that should be taken into consideration. Furthermore, the comparison of the reliability coefficients shows that the reliability of the tested driving simulator is low in small-radius turns, because the steering scene does not follow up sufficiently when testers turn in the simulator. As a result, the testers lack a full view of the lateral scene and cannot accurately complete the driving operation.


**Table 1.** Reliability coefficient results of curve 6 and curve 9.
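As an illustration of Equations (2) and (3), the following minimal Python sketch computes Cronbach's *α* and the Spearman-Brown split-half coefficient from a drivers × stakes speed matrix. The function names and the `toy_speeds` values are hypothetical and are not the experimental data. Population variances are assumed throughout, which matches the Σ*X*<sub>1</sub>*X*<sub>2</sub>/*n* form of Equation (3). The *r* reliability of Equation (4) is not included.

```python
from statistics import fmean, pstdev, pvariance


def cronbach_alpha(speeds):
    """Cronbach's alpha, Eq. (2), for a drivers x stakes speed matrix."""
    k = len(speeds[0])  # number of stakes
    # Variance of the speed at each stake, taken across drivers.
    stake_vars = [pvariance([row[i] for row in speeds]) for i in range(k)]
    # Variance of each driver's summed speed over all stakes.
    total_var = pvariance([sum(row) for row in speeds])
    return k / (k - 1) * (1 - sum(stake_vars) / total_var)


def split_half_reliability(speeds):
    """Spearman-Brown corrected odd/even split-half reliability, Eq. (3)."""
    x1 = [sum(row[0::2]) for row in speeds]  # odd-numbered stakes (1st, 3rd, ...)
    x2 = [sum(row[1::2]) for row in speeds]  # even-numbered stakes (2nd, 4th, ...)
    n = len(speeds)
    cov = sum(a * b for a, b in zip(x1, x2)) / n - fmean(x1) * fmean(x2)
    r_half = cov / (pstdev(x1) * pstdev(x2))
    return 2 * r_half / (1 + r_half)


# Hypothetical speeds (km/h): 4 drivers x 4 stakes, every driver following
# the same speed profile shifted by a personal base speed.
profile = [0, -3, -5, -2]
toy_speeds = [[base + delta for delta in profile] for base in (40, 45, 50, 55)]

alpha = cronbach_alpha(toy_speeds)
r_xx = split_half_reliability(toy_speeds)
```

Because every hypothetical driver follows the same profile, both coefficients equal 1 up to rounding error, which is the perfectly consistent limiting case. Real data with driver-to-driver differences in the profile give lower values.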

#### *3.2. Verification of Driving Speed in Curve Test*

Based on the basic information of the experimental road, the operating speed is theoretically calculated according to the speed prediction model detailed in the specification [44], as:

$$v_{\rm middle} = -244.123 + 0.6 v_{\rm in} + 40\ln(R_{\rm now} + 500), \quad v_{\rm in} \in [30, 120],\ R_{\rm now} \in [55, 600] \tag{5}$$

$$v_{\rm out} = -183.092 + 0.7 v_{\rm middle} + 30\ln(R_{\rm front} + 500), \quad v_{\rm middle} \in [30, 120],\ R_{\rm front} \in [55, 600] \tag{6}$$

where *v*<sub>in</sub> is the running speed at the entrance of the curve, *v*<sub>middle</sub> is the running speed at the midpoint of the curve, *v*<sub>out</sub> is the running speed at the exit of the curve, *R*<sub>now</sub> is the radius of the current curve and *R*<sub>front</sub> is the radius of the curve to be driven into next. When the road ahead is a straight line, *R*<sub>front</sub> = 600 m. If *R*<sub>front</sub> > 5*R*<sub>now</sub>, *R*<sub>front</sub> is set to 5*R*<sub>now</sub>.
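Equations (5) and (6) can be evaluated with a few lines of Python. The sketch below is only an illustration: the function name and the range checks are assumptions added here, while the coefficients, the 600 m default for a straight section ahead and the 5*R*<sub>now</sub> cap are taken from the text above.

```python
import math


def predict_curve_speeds(v_in, r_now, r_front=600.0):
    """Return (v_middle, v_out) in km/h according to Eqs. (5) and (6).

    r_front defaults to 600 m, the value used when a straight line lies ahead.
    """
    if not 30 <= v_in <= 120:
        raise ValueError("v_in is outside the model range [30, 120] km/h")
    if not 55 <= r_now <= 600:
        raise ValueError("r_now is outside the model range [55, 600] m")
    # Cap the radius of the next element at five times the current radius.
    r_front = min(r_front, 5 * r_now)
    v_middle = -244.123 + 0.6 * v_in + 40 * math.log(r_now + 500)
    v_out = -183.092 + 0.7 * v_middle + 30 * math.log(r_front + 500)
    return v_middle, v_out


# Curve 6 (R = 150 m) entered at 60 km/h with a straight section ahead.
v_middle_6, v_out_6 = predict_curve_speeds(60, 150)
```

For this input the model gives roughly 51 km/h at the midpoint and roughly 63 km/h at the exit. For curve 9 (R = 55 m) the cap is active, so *R*<sub>front</sub> becomes 275 m instead of 600 m.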

The test values and predicted values at the sampling points are compared in Figure 6. To analyze the data intuitively, the simulator test data sampled at the starting point, midpoint and end point of each curve are compared with the theoretical values calculated from the speed prediction model in the design specification. The overall trend of the speed is consistent, although large speed deviations exist at the entrance of individual curves. The results of the simulator experiment are generally higher than the theoretical values. The points where the simulated value is lower than the theoretical speed appear in curves 9, 10, 14 and 16. The curve radius query shows that the radius of curves 9 and 16 is 55 m and the radius of curve 14 is 150 m. This demonstrates that drivers are more aggressive in large-radius curves in the driving simulator but tend to be more cautious in small-radius sharp curves. The latter is related to the limited effectiveness of the simulator for drivers in sharp turns, and it agrees with the conclusion of the reliability evaluation that the driver test has low reliability in small-radius curves.

**Figure 6.** Speed comparisons of curve sampling points.

From a quantitative point of view, according to the criterion-related validity theory, the Pearson correlation method is employed, giving a validity coefficient of 0.802 between the simulator data and the theoretical data. Based on the obtained correlation coefficient, Student's *t* test is performed. Student's *t* test uses the *t* distribution to infer the probability of a difference arising, so as to judge whether the difference between two averages is significant. According to the result of the paired-sample test, with the preset significance level *α* = 0.05, *t* is 0.781 and the observed significance (*p* value) is 0.438. Since 0.438 > 0.05, the difference between the two groups of data can be considered not significant. The validity coefficient and the statistical test together indicate that the driving simulator is effective in simulating turning road conditions. Based on the validity analysis, following the meta-analysis method, the *t* value of the difference is used to calculate Cohen's *d* index as:

$$d = \frac{2t}{\sqrt{df}}\tag{7}$$

Substituting the test value into (7) gives *d* = 0.2. According to Cohen's classification of effect levels, the effect level is low in this case. Judging from an inspection of the test values, this is mainly due to the large differences in the test values on the small-radius curves. Excluding the small-radius curve data from the verification gives *t* = 3.659, *df* = 47 and *d* = 1.07, indicating a large effect level for the simulator test. Therefore, the simulator produces a large effect in turning conditions with a radius greater than 150 m, but a low effect on small-radius curves. The test simulator has a high validity coefficient under turning road conditions when the radius of the curve is greater than 150 m. However, the validity on small-radius curves with a radius of less than 55 m is low and needs to be improved.
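The arithmetic behind Equation (7) can be checked with a short Python sketch. The function names are hypothetical, and the thresholds in `effect_level` are Cohen's conventional benchmarks (0.2, 0.5 and 0.8), which are assumed here because the text does not list the cut-offs it uses.

```python
import math


def cohens_d_from_t(t, df):
    """Effect size from a t statistic and its degrees of freedom, Eq. (7)."""
    return 2 * t / math.sqrt(df)


def effect_level(d):
    """Label an effect size using Cohen's conventional benchmarks (assumed)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Values reported in the text after excluding the small-radius curves.
d_large_radius = cohens_d_from_t(3.659, 47)
print(round(d_large_radius, 2), effect_level(d_large_radius))  # 1.07 large
```

With *t* = 3.659 and *df* = 47 the sketch reproduces the reported *d* = 1.07, while the reported *d* = 0.2 for the full data set falls at the lower edge of the small-effect band.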

#### *3.3. Verification of Vehicle Trajectory in Curve Test*

According to the designed experimental scheme, reference trajectory data for the selected curve 6 (R = 150 m) and curve 9 (R = 55 m) are obtained using the rapidly-exploring random tree (RRT) algorithm, a typical vehicle trajectory planning algorithm. Under the same virtual landmark, the lateral offset of the RRT-planned path in the curve is obtained after transformation. RRT is a tree-growing search algorithm based on random sampling that is widely used in robot path planning [45]. The experimental vehicle trajectory and the RRT-predicted trajectory for curves 6 and 9 are illustrated in Figures 7 and 8, and the trajectory similarity values are shown in Tables 2–4.
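To make the random-sampling tree growth concrete, the following Python sketch implements a generic textbook RRT in a two-dimensional rectangle. It is not the planner configuration used in this study: the function name, the step size, the goal bias, the goal tolerance and the `is_free` collision callback are all assumptions chosen for illustration, and no road geometry is modeled.

```python
import math
import random


def rrt(start, goal, bounds, is_free=lambda p: True, step=1.0,
        goal_tol=1.0, goal_bias=0.1, max_iter=5000, seed=0):
    """Grow a tree from start by random sampling; return a path or None."""
    rng = random.Random(seed)
    (xmin, xmax), (ymin, ymax) = bounds
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iter):
        # Sample a random point, occasionally the goal itself (goal bias).
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Find the tree node nearest to the sample.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Steer from the nearest node toward the sample by at most one step.
        scale = min(step, d) / d
        new = (near[0] + scale * (sample[0] - near[0]),
               near[1] + scale * (sample[1] - near[1]))
        # Only the new node is checked here, not the whole edge.
        if not is_free(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk back through the parents to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


# Obstacle-free 10 m x 10 m region, hypothetical start and goal points.
path = rrt((0.0, 0.0), (9.0, 9.0), ((0.0, 10.0), (0.0, 10.0)))
```

The sketch also shows why plain RRT ignores vehicle behavior: each new node depends only on geometry (nearest node, step length and collision check), with no term for speed or steering.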

**Figure 8.** Curves of RRT lateral offset at different speeds in Turn 9.

**Table 2.** Trajectory similarity of different types of drivers.



**Table 3.** Cosine similarity of different vehicle speeds and RRT trajectory adjustment in turn 6.


**Table 4.** Cosine similarity of different vehicle speeds and RRT trajectory adjustment in turn 9.


For curve 6, a higher vehicle speed leads to larger fluctuation of the lateral offset, and the vehicle tends to be driven closer to the center line of the lane at the midpoint of the curve. Thus, the tendency of skilled drivers to drive in the middle of the road is more obvious. According to the classification of drivers by speed in curve 6, all those who turn at 60 km/h are skilled drivers, while those who turn at 40 km/h are mostly novice drivers together with a few cautious skilled drivers. The skilled drivers have more advanced driving skills, can judge turns more accurately and turn at 60 km/h. The trajectory similarity among the skilled drivers is as high as 0.9715. The novice drivers drive in an unskilled and cautious manner and basically keep the speed at a low level of 40 km/h to pass through the turn. The trajectory similarity among the novice drivers is lower than that among the skilled drivers, although it still reaches 0.8583. In the driving simulator, the trajectory similarity between vehicles at 40 km/h and at 60 km/h is the largest, reaching 0.8252, which indicates that novice drivers at low speed can achieve a trajectory similar to that of skilled drivers at 60 km/h. In curve 9, among the drivers who turn at the same speed of 40 km/h, the trajectory similarity reaches 0.9785 for the novice drivers and 0.9956 for the skilled drivers, so the similarity among the skilled drivers is again higher. The vehicles at 30 km/h and 40 km/h have the highest trajectory similarity, 0.6475. However, the analysis of the trajectory cosine similarity shows that the overall trajectory of the driving simulator in curve 9 fluctuates prominently, and the trajectory similarity between vehicles at different speeds is markedly low.

Through comprehensive analysis of the RRT trajectory calculation process under the designed curve radii, the comparison between the experimental vehicle trajectory and the RRT trajectory, and the similarity values, it can be concluded that the path points planned by RRT lie quite close to the obstacle, which is the edge of the road, and that excessive turning emerges when the curve path is planned. In curve 6, the similarity values between the RRT-planned trajectory and the driving-simulator trajectories are all negative, indicating that the two trajectories are opposite. In curve 9, the similarity between the RRT-planned trajectory and the driving-simulator trajectory at 50 km/h is negative, indicating that the two trajectories are opposite at this speed, while the trajectories at the other speeds are consistent with the RRT trajectory, although the similarity is not high.
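The sign convention used in this comparison can be illustrated with a short Python sketch of cosine similarity between two lateral-offset sequences. It assumes that both sequences are sampled at the same stakes, which the text implies but does not state explicitly. The function name and the offset values are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long lateral-offset sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be sampled at the same stakes")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# Hypothetical lateral offsets (m) at five stakes along a curve.
driver = [0.10, 0.35, 0.50, 0.30, 0.05]
same_shape = [2 * x for x in driver]   # same shape, larger amplitude
mirrored = [-x for x in driver]        # offset toward the other lane edge

sim_same = cosine_similarity(driver, same_shape)
sim_mirrored = cosine_similarity(driver, mirrored)
```

A value near 1 means the two offset profiles have the same shape regardless of amplitude, and a negative value means the offsets lie on opposite sides of the reference line, which is how the negative RRT similarities above are read.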

#### **4. Improvement of Simulator Curve Track Planning Model**

Through the analysis of the experimental data, it is concluded that RRT does not consider the actual driving speed, steering and other vehicle behavior factors in the curve trajectory planning process. It is therefore necessary to establish a curve trajectory planning model driven by human-simulating operation data to provide a trajectory calibration reference for the simulator's validation under turning conditions.

#### *4.1. Data-Driven Modeling Method*

#### 4.1.1. Long Short-Term Memory Recurrent Neural Network

Since the lateral offset during vehicle driving is time series data, a supervised learning model can be constructed to predict this variable. The model uses time series data including the longitudinal displacement of the vehicle, the lateral displacement of the vehicle, the longitudinal speed of the vehicle and the steering wheel angle. Time series data are affected not only by the current input characteristics but also by the input characteristics at previous moments. In view of these characteristics, this paper adopts a neural network with time series properties for data fitting and model construction. As shown in Figure 9, when a recurrent neural network (RNN) processes sequence data, it retains information from the data before the current state, saves it in the model and exploits it in the current network output, making it suitable for predicting the lateral offset of vehicles as a time series. In actual calculations, in order to reduce the complexity of the algorithm and increase the inference speed of the model, it can be assumed that the current state is related to the previous states. The RNN repeatedly receives the input *x*<sub>*t*</sub> and loops inside the model, and each calculation uses the information from the previous calculation.

**Figure 9.** Schematic diagram of RNN.

From the perspective of the data set, the sampling frequency of vehicle operation is relatively high, and the current input of the model is related to the inputs over a long preceding period. When the RNN model processes such long-period data, too many earlier sequence states are involved, and the gradient may explode or vanish because of the long-term dependence. Therefore, a long short-term memory (LSTM) RNN is adopted to enhance the model's ability to learn long-term dependence [46]. Figure 10 presents the network structure of the LSTM, where the cell state *C*<sub>*t*</sub> represents the memory information of the model at time *t*. The forget gate *f*<sub>*t*</sub> takes the previous output *h*<sub>*t*−1</sub> and the current input *x*<sub>*t*</sub> as input and determines, through the activation function, which part of the data information is forgotten and which part is retained. The sigmoid function is employed as the activation function, and its output lies between 0 (forget) and 1 (retain). The forget gate is described as:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{8}$$

**Figure 10.** LSTM network structure diagram.

The input gate processes the input information of the current sequence and determines which data are used to update the model state. The sigmoid function determines which information characteristics of the data are added to the model, and the tanh function converts the new characteristic information into the candidate data to be added, as:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{9}$$

$$\overline{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C) \tag{10}$$

The update gate updates the cell state using the information received from the forget gate and the input gate. The term *f*<sub>*t*</sub> ∗ *C*<sub>*t*−1</sub> represents the part of the previous cell state that is kept after forgetting, and *i*<sub>*t*</sub> ∗ *C̄*<sub>*t*</sub> represents the new data information added to the model cell. The update gate is formulated as:

$$C_t = f_t * C_{t-1} + i_t * \overline{C}_t \tag{11}$$

The output gate applies the sigmoid function to confirm the output content, and the tanh function to process the model cell content, then the output information is obtained by multiplying the two parts, as:

$$O_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{12}$$

$$h_t = O_t * \tanh(C_t) \tag{13}$$

Here, *σ* denotes the sigmoid function, i.e., the logistic function, which describes an S-shaped growth curve in biology. Because it is monotonically increasing, it is often adopted as the activation function of a neural network, mapping the input variables of the network to the interval (0, 1), as:

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{14}$$

The tanh function is one of the hyperbolic functions. Unlike the sigmoid function, whose output is not centered at zero, the tanh function is symmetric about the origin, and its value range is (−1, 1), as:

$$\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}} \tag{15}$$

The forget gate, update gate and output gate all adopt non-linear activation functions to enhance the learning ability of the network and enable the model to handle high-dimensional complex data.
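The gate equations (8)–(13) can be traced step by step with the following dependency-free Python sketch of a single LSTM time step. The function names, the `params` dictionary layout and the all-zero weights are hypothetical and serve only to show the data flow. A practical model would use trained weights and a deep learning framework.

```python
import math


def sigmoid(x):
    """Logistic function, Eq. (14), written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(W, v, b):
    """Return W v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (8)-(13)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]       # (8)
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]       # (9)
    c_bar = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]  # (10)
    c_t = [fk * ck + ik * gk
           for fk, ck, ik, gk in zip(f, c_prev, i, c_bar)]                  # (11)
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]       # (12)
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]                    # (13)
    return h_t, c_t


# Hypothetical setup: 4 inputs (longitudinal displacement, lateral displacement,
# longitudinal speed, steering wheel angle) and a single hidden unit.
zero_row = [[0.0] * 5]
params = {name: zero_row for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})

h_t, c_t = lstm_step([12.0, 0.3, 11.1, 5.0], [0.0], [1.0], params)
```

With all weights set to zero, every gate outputs 0.5 and the candidate state is 0, so the cell state is simply halved. This makes the roles of the forget, input and output gates easy to verify by hand.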

#### 4.1.2. Bi–LSTM

The Bi–LSTM is an extension of the traditional LSTM that can improve the performance of the model on time series regression problems. Given the characteristics of the lateral offset, a prediction that is jointly determined by several previous inputs and several subsequent inputs can be more accurate. Therefore, the Bi–LSTM is employed in this study. It processes the time series in both the past and future directions, training the network on both the preceding and following data, and uses a forward and a backward LSTM to model the data. The forward LSTM performs forward calculations on the data from 1 to *t* and saves the forward output at each time step, and the backward LSTM performs reverse calculations on the data from *t* to 1 and saves the backward output at each time step. Combining the forward and backward LSTM models to obtain the final output [47] yields:

$$h_t = f(w_1 x_t + w_2 h_{t-1}) \tag{16}$$

$$h_t' = f(w_3 x_t + w_5 h_{t+1}') \tag{17}$$

$$O_t = g(w_4 h_t + w_6 h_t') \tag{18}$$
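A scalar Python sketch of Equations (16)–(18) shows how the two directions are combined. It uses the simplified recurrences exactly as written above rather than full LSTM cells. The zero initial states, the choice of tanh for *f*, the identity for *g*, the function name and the weight values are assumptions for illustration.

```python
import math


def bidirectional_pass(xs, w, f=math.tanh, g=lambda a: a):
    """Scalar version of Eqs. (16)-(18); w maps 1..6 to the weights w_1..w_6."""
    n = len(xs)
    h_fwd, h_bwd = [0.0] * n, [0.0] * n
    prev = 0.0  # assumed zero initial forward state
    for t in range(n):  # forward pass, first to last time step
        prev = f(w[1] * xs[t] + w[2] * prev)          # Eq. (16)
        h_fwd[t] = prev
    nxt = 0.0   # assumed zero initial backward state
    for t in reversed(range(n)):  # backward pass, last to first time step
        nxt = f(w[3] * xs[t] + w[5] * nxt)            # Eq. (17)
        h_bwd[t] = nxt
    # Combine both directions at every time step, Eq. (18).
    return [g(w[4] * hf + w[6] * hb) for hf, hb in zip(h_fwd, h_bwd)]


# Hypothetical weights shared by both directions and a symmetric input series.
weights = {1: 1.0, 2: 0.5, 3: 1.0, 4: 1.0, 5: 0.5, 6: 1.0}
outputs = bidirectional_pass([0.2, 0.5, 0.2], weights)
```

Because the weights are shared and the input is symmetric, the first and last outputs coincide, which shows that each output depends on both earlier and later inputs.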

#### 4.1.3. Multilayer Perceptron

The multilayer perceptron (MLP), also a type of artificial neural network, is inspired by bionics: it takes the way human sensory organs perceive and acquire information as its prototype. The perceptron was proposed by Frank Rosenblatt in 1958 [48].
