#### 4.1.4. Prediction Results and Model Validation

To evaluate the predictive performance of the model, seven error validation indicators are combined into a multi-indicator fusion evaluation scheme [49]:

(a) Goodness of fit (*R*<sup>2</sup>)

$$R^2 = \frac{SSR}{SST} = \frac{\sum_{i=1}^{n} \left(\hat{y}_i - \overline{y}\right)^2}{\sum_{i=1}^{n} \left(y_i - \overline{y}\right)^2} = 1 - \frac{SSE}{SST} \tag{19}$$

(b) Mean of absolute error (Error Mean)

$$\mu = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right) \tag{20}$$

(c) Standard deviation of absolute error (Error Std)

$$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i - \mu \right)^2} \tag{21}$$

(d) Mean square error (MSE)

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2 \tag{22}$$

(e) Root mean square error (RMSE)

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2} \tag{23}$$

(f) Normalized root mean square error (NRMSE)

$$NRMSE = \frac{\sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2}}{\overline{y}} \tag{24}$$

(g) Rank Correlation

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}\tag{25}$$

where *R*<sup>2</sup> is the goodness of fit, *y<sub>i</sub>* is the original trajectory data (*i* = 1, 2, ··· , *n*), *ŷ<sub>i</sub>* is the predicted value, *y* is the average value of the original data, SSR is the regression sum of squares, SST is the total sum of squares, and SSE is the residual sum of squares. *r<sub>s</sub>* is the rank correlation coefficient, *d<sub>i</sub>* is the rank difference between the *i*-th pair of samples of the two variables, and *n* is the sample size.
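As an illustration, the seven indicators above can be computed directly from a pair of observed/predicted sequences. The sketch below is one possible NumPy implementation (the function name `evaluate` is hypothetical); the rank correlation follows Eq. (25) under a no-ties assumption.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Compute the seven validation indicators of Eqs. (19)-(25)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    err = y_pred - y_true                       # signed error, hat(y)_i - y_i
    sse = np.sum(err ** 2)                      # residual sum of squares
    sst = np.sum((y_true - y_true.mean()) ** 2) # total sum of squares
    mse = sse / n
    rmse = np.sqrt(mse)
    # Spearman rank correlation via Eq. (25), assuming no tied values
    d = y_true.argsort().argsort() - y_pred.argsort().argsort()
    r_s = 1.0 - 6.0 * np.sum(d.astype(float) ** 2) / (n * (n ** 2 - 1))
    return {
        "R2": 1.0 - sse / sst,        # Eq. (19)
        "ErrorMean": err.mean(),      # Eq. (20)
        "ErrorStd": err.std(),        # Eq. (21), population std (ddof=0)
        "MSE": mse,                   # Eq. (22)
        "RMSE": rmse,                 # Eq. (23)
        "NRMSE": rmse / y_true.mean(),# Eq. (24)
        "RankCorr": r_s,              # Eq. (25)
    }
```

Note that NRMSE is normalized here by the mean of the observed values, matching the reading of Eq. (24); normalization by the observed range is also common in the literature.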

#### 4.1.5. Model Training

To demonstrate the performance advantages of the Bi–LSTM model and to avoid overfitting, the structure of the Bi–LSTM model is tuned manually for optimality. Twenty-four crossover experiments of the form number of hidden layers × number of traversal rounds × batch size were designed, as shown in Table 5. The model covers four classes of hidden-layer structures: class 1 has a single hidden layer of 100 cells, denoted *h*<sub>100</sub>; class 2 has two hidden layers of 100 neurons each, denoted *h*<sub>100</sub> × *h*<sub>100</sub>; the remaining two classes are denoted *h*<sub>100</sub> × *h*<sub>100</sub> × *h*<sub>75</sub> and *h*<sub>100</sub> × *h*<sub>100</sub> × *h*<sub>75</sub> × *h*<sub>75</sub>.
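For concreteness, the 24 experimental groups (4 hidden-layer classes × 2 traversal-round settings × 3 batch sizes) can be enumerated as below; the ordering, tuple encoding of layer widths, and field names are illustrative assumptions, not taken from the paper's code.

```python
from itertools import product

# Four hidden-layer structures, as widths per layer
hidden_structures = [
    (100,),                # class 1: h100
    (100, 100),            # class 2: h100 x h100
    (100, 100, 75),        # class 3: h100 x h100 x h75
    (100, 100, 75, 75),    # class 4: h100 x h100 x h75 x h75
]
epochs_options = [100, 200]   # traversal rounds
batch_sizes = [32, 64, 128]   # batch size

# Cartesian product: hidden structure outermost, batch size innermost,
# so groups 1-3 are (h100, 100 rounds) and groups 4-6 are (h100, 200 rounds)
experiments = [
    {"hidden": h, "epochs": e, "batch": b}
    for h, e, b in product(hidden_structures, epochs_options, batch_sizes)
]
assert len(experiments) == 24
```

Under this ordering, group 4 (index 3) is the configuration *h*<sub>100</sub> × 200 × 32 that the training results below identify as optimal.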

**Table 5.** The structure of the cross-experiment.


The training process sets the base learning rate *β* to 0.005 and the dropout value to 0.5. The Adam optimization algorithm [50] is used, which dynamically adjusts the learning rate of each parameter using first-order and second-order moment estimates of the gradient, with the following equations.

$$m_t = \lambda \times m_{t-1} + (1 - \lambda) \times g_t \tag{26}$$

$$n_t = \gamma \times n_{t-1} + (1 - \gamma) \times g_t^2 \tag{27}$$

$$\hat{m}_t = \frac{m_t}{1 - \lambda^t} \tag{28}$$

$$\hat{n}_t = \frac{n_t}{1 - \gamma^t} \tag{29}$$

$$\theta_t = \theta_{t-1} - \frac{\hat{m}_t}{\sqrt{\hat{n}_t} + \varepsilon} \times \beta \tag{30}$$

where *g<sub>t</sub>* is the gradient at time step *t*; *m<sub>t</sub>* is the first-order moment estimate of *g<sub>t</sub>*, i.e., the exponential moving average of *g<sub>t</sub>*; *n<sub>t</sub>* is the second-order moment estimate of *g<sub>t</sub>*, i.e., the exponential moving average of *g<sub>t</sub>*<sup>2</sup>; *λ* and *γ* are the exponential decay rates; *m̂<sub>t</sub>* is the bias-corrected *m<sub>t</sub>*; *n̂<sub>t</sub>* is the bias-corrected *n<sub>t</sub>*; *θ<sub>t</sub>* is the parameter vector at time step *t*; *β* is the learning rate; and *ε* is a residual term, taken as 10<sup>−8</sup> by default.
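A minimal sketch of a single Adam update following Eqs. (26)–(30). The decay rates *λ* = 0.9 and *γ* = 0.999 are the standard Adam defaults and are assumed here, since the text only fixes *β* = 0.005 and *ε* = 10<sup>−8</sup>; the function name `adam_step` is hypothetical.

```python
import numpy as np

def adam_step(theta, grad, m, n, t, beta=0.005, lam=0.9, gamma=0.999, eps=1e-8):
    """One Adam parameter update, Eqs. (26)-(30); t is the 1-based step index."""
    m = lam * m + (1 - lam) * grad              # Eq. (26): first-moment EMA
    n = gamma * n + (1 - gamma) * grad ** 2     # Eq. (27): second-moment EMA
    m_hat = m / (1 - lam ** t)                  # Eq. (28): bias correction
    n_hat = n / (1 - gamma ** t)                # Eq. (29): bias correction
    theta = theta - beta * m_hat / (np.sqrt(n_hat) + eps)  # Eq. (30)
    return theta, m, n
```

At *t* = 1 the bias corrections recover the full gradient statistics, so the very first step moves each parameter by roughly *β* regardless of the gradient's magnitude.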

After completing the 24 sets of training tests, the distributions of *L*0.05 and *MRMSE* of the model are obtained. As the number of hidden layers increases, the *L*0.05 and *MRMSE* curves rise; the values for the group 4 experiment remain at a low level, i.e., the structure of the group 4 model performs best. With the same number of hidden layers, *L*0.05 and *MRMSE* decrease as the number of traversal rounds increases. Taking groups 1–6 as an example: the number of traversal rounds is 100 for groups 1–3 (*h*<sub>100</sub> × 100 × 32, *h*<sub>100</sub> × 100 × 64, *h*<sub>100</sub> × 100 × 128) and 200 for groups 4–6 (*h*<sub>100</sub> × 200 × 32, *h*<sub>100</sub> × 200 × 64, *h*<sub>100</sub> × 200 × 128); increasing the number of traversal rounds significantly decreases both *L*0.05 and *MRMSE*. With the same hidden-layer structure and number of traversal rounds, *L*0.05 and *MRMSE* tend to increase as the batch size increases. Based on the above analysis, group 4 (hidden layer *h*<sub>100</sub>, batch size of 32, and 200 traversal rounds) is finally adopted as the optimal Bi–LSTM model.

#### *4.2. Model Verification and Testing*

To compare the prediction performance of LSTM, MLP, and Mul–Bi–LSTM, 28 sets of curve data from the same vehicle are used to establish trajectory planning models. The inputs of the Mul–Bi–LSTM model are the longitudinal displacement, lateral displacement and longitudinal speed of the vehicle and the steering wheel angle, and the output is the lateral offset; the input and output of the LSTM and MLP models are both the lateral offset. After data processing, the prediction results of the single-variable models and of the Mul–Bi–LSTM model are obtained. Three sets of typical data are randomly selected for comparative analysis, as shown in Figures 11–13.

**Figure 11.** Trajectory planning model of sample a: (**a**) rank correlation of LSTM, (**b**) rank correlation of MLP, (**c**) rank correlation of Mul–Bi–LSTM, (**d**) R<sup>2</sup> of LSTM, (**e**) R<sup>2</sup> of MLP, (**f**) R<sup>2</sup> of Mul–Bi–LSTM.

**Figure 12.** Trajectory planning model of sample b: (**a**) rank correlation of LSTM, (**b**) rank correlation of MLP, (**c**) rank correlation of Mul–Bi–LSTM, (**d**) R<sup>2</sup> of LSTM, (**e**) R<sup>2</sup> of MLP, (**f**) R<sup>2</sup> of Mul–Bi–LSTM.

**Figure 13.** Trajectory planning model of sample c: (**a**) rank correlation of LSTM, (**b**) rank correlation of MLP, (**c**) rank correlation of Mul–Bi–LSTM, (**d**) R<sup>2</sup> of LSTM, (**e**) R<sup>2</sup> of MLP, (**f**) R<sup>2</sup> of Mul–Bi–LSTM.

As shown in Table 6, three test samples are randomly selected for model verification. In terms of both rank correlation and R<sup>2</sup>, the prediction results of the Mul–Bi–LSTM trajectory planning model are generally better than those of LSTM and MLP. To further verify the applicability of the LSTM model framework across different curves, curves 6, 9 and 13 are randomly selected for performance validation, and all the sample data are summarized and analyzed to obtain the planned trajectory of each curve, as shown in Table 7. Table 7 shows that, in the curve trajectory planning model, the prediction indicators of Mul–Bi–LSTM are higher than those of LSTM: compared with the LSTM model, the rank correlation of the Mul–Bi–LSTM trajectory prediction model increases by 1.3% and R<sup>2</sup> by 11.65% on curve 6; the rank correlation increases by 1.49% and R<sup>2</sup> by 14.35% on curve 9; and the rank correlation increases by 1.64% and R<sup>2</sup> by 4.85% on curve 13.

Finally, to verify the portability of the LSTM model framework under the full-sample condition, 80% of the samples are drawn as the training set and the remaining 20% as the test set. Mean square error (MSE), root mean square error (RMSE) and normalized root mean square error (NRMSE) are calculated to evaluate the prediction performance of the models. The trajectory planning results for each curve are shown in Table 8. Table 8 shows that, compared with LSTM and MLP, the Error Mean of Mul–Bi–LSTM is reduced by 0.00275 and 0.00742, the Error Std by 0.00282 and 0.02955, and the MSE by 0.00356 and 0.00261, respectively. The results show that the values predicted by Mul–Bi–LSTM deviate less from, and lie closer to, the true values.

Hence, the proposed Mul–Bi–LSTM possesses the best prediction performance. With the introduction of human-like driving characteristic variables, the prediction performance of the model improves significantly and the predictions become closer to the true values, because a strong recursive relationship exists within the trajectory data. The driving-characteristic variables better describe and reflect the behavior of the lateral offset, whereas the single-variable LSTM model severs the connections between the data, lowering its overall prediction performance compared with that of the Mul–Bi–LSTM model.

**Table 8.** Comparison of several models.


From the previous results, it is clear that the Mul–Bi–LSTM model performs better than the MLP and LSTM models in predicting the simulator data. To further validate the adaptability of the Mul–Bi–LSTM model to the simulator data, a *K*-fold cross-validation method is used. In this method every sub-sample participates in both training and testing, which reduces the generalization error. *K*-fold cross-validation divides the data set into *K* regions, selects each region in turn as the test set, uses the remaining *K* − 1 regions as the training set, and performs *K* rounds of model validation; the average accuracy over the *K* rounds is taken as the estimate of the accuracy of the algorithm. The driving simulation result dataset is divided into 2 to 10 folds, and the *R*<sup>2</sup> of each fold is estimated to prevent overfitting, as shown in Figure 14. The test results of six-fold cross-validation give the highest *R*<sup>2</sup>. Combined with the results of the *K*-fold cross-validation, the Mul–Bi–LSTM model proposed in this paper shows good stability and generalizability.
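The *K*-fold procedure described above can be sketched as follows; the function name `kfold_r2` and the plug-in `fit`/`predict` callables are hypothetical names used for illustration, standing in for training and evaluating the network on each split.

```python
import numpy as np

def kfold_r2(X, y, fit, predict, k=6, seed=0):
    """Mean R^2 over k folds; each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))          # shuffle once, then split
    folds = np.array_split(idx, k)         # k disjoint regions
    scores = []
    for i in range(k):
        test = folds[i]                    # region i is the test set
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])    # train on the other k-1 regions
        y_hat = predict(model, X[test])
        sse = np.sum((y[test] - y_hat) ** 2)
        sst = np.sum((y[test] - y[test].mean()) ** 2)
        scores.append(1.0 - sse / sst)     # R^2 on the held-out fold
    return float(np.mean(scores))
```

Any model obeying the `fit`/`predict` contract can be cross-validated this way; sweeping `k` from 2 to 10 and plotting the returned averages reproduces the kind of comparison shown in Figure 14.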

**Figure 14.** *R*<sup>2</sup> of 2- to 10-fold cross-validation.

#### **5. Conclusions**

This paper mainly evaluates the effectiveness of the driving behavior characteristics of the driver under curved road conditions in the simulator, with the speed on curves and the curve vehicle trajectory considered as the validation indicators. A driving simulator and 27 testers of different types are selected to design curve experiments with radii of 55 m, 150 m, 250 m, 350 m, 450 m and 550 m for validating the simulator. For the evaluation of vehicle speed, a method for evaluating vehicle speed consistency based on reliability coefficients is proposed, and Cronbach's *α* reliability, split-half reliability and *r* reliability are selected to verify the reliability of the experimental speed results. Sampling analysis shows that the overall reliability of the speed simulation of the tested driving simulator is relatively high, but that it is low when turning on a small-radius curve. For the evaluation of the speed accuracy on simulator curves, the speeds given by the speed model at the starting point, midpoint and end point of the curve are adopted as calibration criteria, and criterion-related validity and the Cohen's *d* index method are used for the validity analysis. The results reveal that the tested simulator performs well in prediction overall.

In the evaluation of the curve trajectory, the concept of lateral offset is proposed as a virtual coordinate based on the longitudinal and lateral displacement of the vehicle along the designed road stationing. The trajectory predicted by the RRT model is adopted as the evaluation standard for the curve trajectory, and the revised cosine trajectory similarity method is selected for the comparison of indicators. The experimental results show that the overall reliability of the vehicle turning trajectory in the tested simulator is high when the curve radius is greater than or equal to 150 m, while it is relatively poor when the curve radius is 55 m. Based on the experimental results, and in view of the low trajectory similarity between the RRT trajectory planning model and the simulator, the idea of using human-simulating operation data is proposed to establish a more complete human-simulating curve trajectory planning model with Mul–Bi–LSTM. Firstly, based on the trajectory similarity, it is concluded that the driver's curve trajectory presents strong regularity. Secondly, the trajectory planning performance of the LSTM, MLP and Mul–Bi–LSTM models on the verification set and test set is compared. The results show that the Mul–Bi–LSTM model, which considers driving behavior factors, generates the trajectory most similar to the driver's and has superior generalization performance. In conclusion, with speed and trajectory as the validation indicators, the velocity model and the Mul–Bi–LSTM trajectory model can be used to evaluate the validity of the driving simulator under complex road conditions.

Future work will include: (1) refinement of the validation system for the vehicle driving simulator, and (2) development of vehicle driving simulators for intelligent transportation [51]. The validation of a driving simulator is a multi-level comprehensive system that still needs improvement: the indicator system should be further refined, and the validation indicators should be continuously revised through real-vehicle experiments. In addition, although the curve trajectory planning model based on human-simulating operation data proposed in this paper can be applied to emerging fields such as human–machine co-driving and autonomous driving, it is limited by the experimental conditions; the experimental scenarios and related tests in this direction need further development.

**Author Contributions:** Conceptualization, L.C. and F.G.; methodology, L.C.; investigation, L.C. and J.X.; data curation, W.T.; writing—original draft preparation, L.C. and S.W.; writing—review and editing, F.G. and Z.C.; supervision, F.G.; funding acquisition, F.G. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by NATIONAL NATURAL SCIENCE FOUNDATION OF CHINA, Grant Number 71961012.

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
