**4. Methodology**

The camera modelling approach and process are presented in Figure 4. The test vehicle collects information from the mounted experimental equipment introduced in Section 3, such as vehicle dynamics data, GPS sensor data, and camera data. These data are used for target determination and feature selection. Depending on the modelling requirements, the sensor detection results contain the essential information about the modelling target, which facilitates the calculation of the differences between measured and reference data. This error represents both the camera's performance and its uncertainty in lane detection, and it serves as the target of the model. In order to improve the performance of the camera model and decrease the training time of the NN, data extraction and input feature selection are applied to retain the features that contribute most to the prediction variable. The features selected by the ReliefF algorithm are used as inputs of the MLP. The relationship between each input feature and the target is evaluated using ReliefF, a feature weighting method designed for multi-class, noisy, and incomplete dataset classification problems [42,43]. Once inputs and targets are determined, the MLP-based approach is applied for modelling.

**Figure 4.** Schematic representation of necessary components for camera model.

#### *4.1. Target Determination*

As previously discussed in Section 2, our model primarily focuses on straight highway segments. Therefore, the *C*0 Lane Position Error (*C*0-LPE) and the *C*1 Heading Angle Error (*C*1-HAE) are considered as the targets of the MLP. Reference data for the two targets were taken from the M86 road marking coordinates and the ADMA-RTK reference system, respectively. In addition, the detection data for each target were taken from the MOBILEYE camera.

*C*0-LPE is calculated as the difference between the M86 road marking coordinates and the detection data. The calculation process is defined in the following steps:


*C*1-HAE is calculated as the difference between the heading angle data provided by the reference system and the detection output of the camera. The calculation is conducted in the same way as for *C*0-LPE, resulting in a one-dimensional vector as the target of the MLP. As detailed in the next section, different input features were selected for each target.
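As a sketch, the *C*1-HAE computation described above amounts to an element-wise difference between time-aligned reference and detection signals. The function name and values below are illustrative, not from the study:

```python
import numpy as np

def heading_angle_error(ref_heading, cam_heading):
    """C1-HAE sketch: difference between the reference-system heading
    angle and the camera's detection output, both already sampled on a
    common timeline."""
    ref = np.asarray(ref_heading, dtype=float)
    cam = np.asarray(cam_heading, dtype=float)
    return ref - cam  # one-dimensional error vector used as the MLP target

# illustrative values (radians)
ref = np.array([0.010, 0.012, 0.011])
cam = np.array([0.008, 0.013, 0.010])
err = heading_angle_error(ref, cam)
```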

#### *4.2. Feature Selection*

Various features were collected from different experimental devices and electronic controllers during the measurement process. However, such a mass of data often contains many irrelevant or redundant features. In this study, the ADMA reference system provides details on the available data. Some features (ambient temperature, GPS receiver states, altitude, etc.) were discarded because they do not significantly affect vehicle dynamics or camera model functionality. Feature selection aims to maximize the information associated with the target, carried by the features extracted from the raw data. Additionally, since different features have different update cycles, time synchronization is required during data processing to align all features on the same timeline. The time synchronization process is as follows:
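The alignment of differently sampled features onto one timeline can be sketched, assuming simple linear interpolation onto a common master timeline (feature names and update rates below are illustrative):

```python
import numpy as np

def synchronize(t_common, signals):
    """Resample each signal onto a common timeline by linear
    interpolation. `signals` maps a feature name to a
    (timestamps, values) pair; update cycles may differ per device."""
    return {name: np.interp(t_common, t, v)
            for name, (t, v) in signals.items()}

# two features logged at different rates (illustrative)
t_common = np.arange(0.0, 1.0, 0.1)  # 10 Hz master timeline
signals = {
    "yaw_rate": (np.arange(0.0, 1.0, 0.01), np.linspace(0.0, 1.0, 100)),  # 100 Hz
    "lat_acc":  (np.arange(0.0, 1.0, 0.05), np.linspace(2.0, 4.0, 20)),   # 20 Hz
}
aligned = synchronize(t_common, signals)
```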


In order to further refine and reduce the parameters input to the predictive model, features should be selected from the extracted data so that the number of input features is minimized. The benefits of this process are reduced training time, a lower risk of overfitting, and improved model performance. The principal idea of ReliefF is to rate the quality of features according to how well they distinguish samples that are close to one another, and a final weight is assigned to each feature. According to the ReliefF results, the features with the greatest relevance to each target are selected and shown in Table 2, while the corresponding quantities are illustrated in Figure 5. Each set of chosen input features has a defined target, and these features are used as inputs to the corresponding MLP model.
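For illustration only, a heavily simplified Relief-style weighting (single nearest hit/miss, two classes, in contrast to the k-neighbour, multi-class ReliefF actually used) conveys the idea of rating features by how well they separate nearby samples:

```python
import numpy as np

def relief_weights(X, y, n_iter=100, rng=None):
    """Simplified Relief sketch: for a random sample, find its nearest
    same-class neighbour (hit) and nearest other-class neighbour (miss);
    features that separate classes gain weight, features that vary
    within a class lose weight. ReliefF generalises this to k
    neighbours, multi-class, and noisy/incomplete data."""
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    span = X.max(axis=0) - X.min(axis=0)  # scale diffs to [0, 1]
    span[span == 0] = 1.0
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        i = rng.integers(len(X))
        d = np.abs(X - X[i]).sum(axis=1)  # Manhattan distance
        d[i] = np.inf                     # exclude the sample itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))
        miss = np.argmin(np.where(y != y[i], d, np.inf))
        w += (np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])) / span
    return w / n_iter

# toy data: feature 0 separates the classes, feature 1 is pure noise
rng = np.random.default_rng(0)
X = np.vstack([np.column_stack([rng.normal(0, .1, 50), rng.normal(0, 1, 50)]),
               np.column_stack([rng.normal(3, .1, 50), rng.normal(0, 1, 50)])])
y = np.array([0] * 50 + [1] * 50)
w = relief_weights(X, y, n_iter=200, rng=1)
```

On such data the informative feature receives a clearly higher weight than the noise feature, which is exactly the criterion used to rank candidate inputs.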

**Table 2.** Final Feature Selection for each target.


All features are defined in the body coordinate system, described in [44].

**Figure 5.** Illustration of the selected input feature variables on side and top views of the vehicle.

#### *4.3. Neural Network Modelling*

The MLP is used here to estimate the performance of the camera. It is widely used in fields such as system modelling, anomaly detection, and classification to solve complex problems in a variety of computing applications [45–49]. Additionally, the MLP approach has been preferred as a method for state estimation and simulation implementation [50], and it is valued for its ability to approximate complex nonlinear relationships. Therefore, it is employed here to estimate *C*0-LPE and *C*1-HAE.

A typical MLP architecture has one input layer, one or more hidden layers, and one output layer. Its working principle is to transfer normalized input data to the output through connected layers composed of neurons. The number of layers in the network and the number of neurons in each layer are typically determined empirically. The architecture of the MLP used here is presented in Figure 6 and is shared by both prediction models, i.e., the estimation of the heading angle and lateral position errors. The mapping from input data to output data is given in Equations (2)–(4), and the arithmetic process within a node is illustrated in Figure 7.

**Figure 6.** Proposed MLP architecture.

**Figure 7.** Data processing in a neural network node.

$$O\_l = \left(\sum\_{i=1}^{n} w\_{i,l} x\_i\right) + b\_l \quad l = 1, 2, \dots, m \tag{2}$$

Each hidden layer contains associated weights and biases. The inputs of each node come either from the previous layer or from the initial input of the network, and the result of the node's weighted sum is given by Equation (2), where *x* is the normalized input variable, *w* is the weight of each input, *i* is the input counter, *b* is the bias of the node, *n* is the number of input variables, *l* is the node counter, and *m* is the number of nodes in the layer.

$$\mathcal{F}\_l(O\_l) = \frac{2}{1 + e^{-2O\_l}} - 1 \tag{3}$$

Subsequently, the result *Ol* is passed to the function *Fl*. Here, the hyperbolic tangent sigmoid is used as the activation function, calculated by Equation (3), which defines the output of the node for a given input or set of inputs.

$$\widehat{\mathcal{Y}} = \left(\sum\_{l=1}^{m} w\_{out,l} \mathcal{F}\_l(O\_l)\right) + b\_{out} \tag{4}$$

Finally, multiple nodes and hidden layers build up the MLP, as shown in Figure 6. The output of each node is forwarded to the next layer, where the same operations continue. The output layer of the entire network is defined by Equation (4), where the output *y*ˆ is the weighted sum of the signals provided by the last hidden layer; the associated coefficients are the output weights *wout*,*<sup>l</sup>* and the bias *bout*.
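A minimal sketch of the forward pass defined by Equations (2)–(4), using the hyperbolic tangent sigmoid of Equation (3), might look as follows; the layer sizes and weights here are illustrative, not those of the trained model:

```python
import numpy as np

def tansig(o):
    """Hyperbolic tangent sigmoid activation, Equation (3);
    mathematically identical to np.tanh(o)."""
    return 2.0 / (1.0 + np.exp(-2.0 * o)) - 1.0

def mlp_forward(x, layers, w_out, b_out):
    """Forward pass: each hidden layer applies Equation (2)
    (weighted sum plus bias) followed by tansig; the output layer
    applies the linear combination of Equation (4)."""
    a = np.asarray(x, dtype=float)
    for W, b in layers:          # W: (n_out, n_in), b: (n_out,)
        a = tansig(W @ a + b)
    return w_out @ a + b_out

# tiny illustrative network: 2 inputs -> 3 hidden neurons -> 1 output
rng = np.random.default_rng(42)
layers = [(rng.standard_normal((3, 2)), rng.standard_normal(3))]
y_hat = mlp_forward([0.5, -0.2], layers, rng.standard_normal(3), 0.1)
```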

The most critical step in MLP modelling is training. Backpropagation is the most often used training scheme: it adjusts the network parameters (weights and biases) to minimize the error function between the estimated and real outputs. Among the available backpropagation algorithms, the Scaled Conjugate Gradient (SCG) supervised learning algorithm was selected [51].

By comparing training performance, four hidden layers were utilized in the MLP model, with 50, 30, 10, and 10 neurons per layer, respectively. The same hidden-layer architecture is used for both prediction models (*C*0-LPE and *C*1-HAE estimation), while the number of inputs for each target follows the feature selection of Section 4.2 and is shown in Table 2.
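As an illustration only, a comparable 50-30-10-10 tanh network could be declared with, e.g., scikit-learn. Note that scikit-learn does not provide the SCG solver used in this work, so the solver below is a stand-in, and the toy data are not the road-test dataset:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# 4 hidden layers (50, 30, 10, 10 neurons) with tanh activations,
# mirroring the architecture described above. 'lbfgs' replaces the
# SCG solver, which scikit-learn does not offer.
model = MLPRegressor(hidden_layer_sizes=(50, 30, 10, 10),
                     activation="tanh",
                     solver="lbfgs",
                     max_iter=500,
                     random_state=0)

# toy surrogate data: 5 input features, 1 continuous target
X = np.random.default_rng(0).standard_normal((200, 5))
y = 0.5 * X[:, 0] + X[:, 1] ** 2
model.fit(X, y)
```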

After the definition of the MLP architecture, its performance is evaluated using three metrics: Mean Squared Error (MSE), Root Mean Square Error (RMSE), and the coefficient of determination (*R*<sup>2</sup>). MSE represents the average squared difference between the estimated and actual values (see Equation (5)). RMSE is a typical metric for regression models and quantifies the model's prediction error, with larger errors resulting in a higher value (see Equation (6)). Finally, *R*<sup>2</sup> represents the proportion of the real output dynamics captured by the MLP model. *R*<sup>2</sup> varies between 0 and 1, and a higher value indicates more accurate predictions (see Equation (7)).

$$MSE = \frac{1}{n} \sum\_{i=1}^{n} (y\_i - \hat{y\_i})^2 \tag{5}$$

$$RMSE = \left(\frac{1}{n} \sum\_{i=1}^{n} (y\_i - \hat{y}\_i)^2\right)^{\frac{1}{2}} \tag{6}$$

$$R^2 = 1 - \frac{\sum\_{i=1}^{n} (y\_i - \hat{y}\_i)^2}{\sum\_{i=1}^{n} (y\_i - \overline{y})^2} \tag{7}$$

In Equations (5)–(7), *yi* is the target (observed) value, *ŷi* is the estimated value, *ȳ* is the average value of the target, and *n* is the number of MLP output data samples.
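The three metrics of Equations (5)–(7) can be computed directly, for example:

```python
import numpy as np

def mse(y, y_hat):
    """Mean Squared Error, Equation (5)."""
    return np.mean((np.asarray(y, float) - np.asarray(y_hat, float)) ** 2)

def rmse(y, y_hat):
    """Root Mean Square Error, Equation (6)."""
    return np.sqrt(mse(y, y_hat))

def r2(y, y_hat):
    """Coefficient of determination, Equation (7)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# illustrative values
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])
```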

#### **5. Results and Discussion**

Real-world road test data are utilized to train the network model in order to evaluate the accuracy of the PLDM. This section explains the MLP training results. Moreover, in order to verify the accuracy of the model predictions and the validity of the approach, the results of the employed MLP model are compared with those of five other algorithms. Finally, the model is deployed in the vehicle simulation software CarMaker from IPG Automotive GmbH [52].

#### *5.1. Training Results*

In order to train the MLP model, the data gathered by the various devices were synchronized, and 9010 samples were selected from the collected data to optimize the model. The input data are randomly separated into three sets: training (70%), validation (15%), and test (15%). The configuration of the MLP model is discussed in Section 4.3 and presented in Table 3. Supervised training is performed using the training set, the validation set is used to mitigate overfitting, and the test set is used to evaluate model performance on unseen data.
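A minimal sketch of such a random 70/15/15 split over sample indices (function name and seed are illustrative):

```python
import numpy as np

def split_indices(n_samples, ratios=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition sample indices into training, validation,
    and test sets according to the given ratios."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(ratios[0] * n_samples)
    n_val = int(ratios[1] * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train_idx, val_idx, test_idx = split_indices(9010)
```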

**Table 3.** MLP network model configuration for *C*0 and *C*1 estimation.


As discussed in Sections 4.1 and 4.2, one MLP model is used to estimate *C*0-LPE, with five input features and a two-dimensional target, while a combination of five input features and a one-dimensional target is used to estimate *C*1-HAE. The regression graphs obtained from MLP training are given in Figure 8. The models are evaluated on the test set after convergence, and the regression accuracy reaches 94.0% and 95.5% for *C*0-LPE and *C*1-HAE estimation, respectively. To further evaluate the estimation models, the evaluation metrics (MSE, RMSE, and *R*2) are used to measure the regression performance; the resulting MLP training performance is provided in Table 4. The results show good agreement between actual and estimated values, and the prediction errors are within an acceptable range. The predicted values follow the trend of the real values. Therefore, the training of the network was completed successfully.

**Table 4.** Performance evaluation of MLP for *C*0-LPE and *C*1-HAE estimation.


**Figure 8.** MLP training regression graph. (**a**) Training regression result for *C*0-LPE (**b**) Training regression result for *C*1-HAE.

#### *5.2. Comparing with Other Approaches*

The proposed method is used for *C*0-LPE and *C*1-HAE prediction, and to assess its effectiveness and accuracy, it is compared with five other machine learning methods introduced in [53,54]. These algorithms are categorized as follows:


Finally, these five algorithms and the MLP model are compared via performance metrics, as shown in Table 5. In this comparison, the suggested MLP achieves outstanding results and outperforms the alternative approaches. Additionally, the GPR demonstrated a rather good regression result, with an accuracy of 83% and 86% for *C*0-LPE and *C*1-HAE estimation, respectively; this is probably because the GPR kernel can extract sequential information from complex temporal structures. The three models SVM, LR, and SR showed comparable performance and underfitting for *C*0-LPE. Furthermore, two other data-driven neural network models are introduced in [23,24]: the Mixture Density Network (MDN) and the deep Gaussian Process (GP). The MDN outputs a Gaussian mixture through a multilayer perceptron; each Gaussian distribution is assigned a corresponding weight, and the network predicts the entire probability distribution. The deep GP is a deep belief network based on GP mappings, in which the data are modelled as the outputs of a multivariate GP. Both models can represent the uncertainty between camera detection and measurement results, but they do not produce an estimate as accurate as the MLP's. Therefore, driven by the goal of the digital twin, the MLP can more accurately represent the behaviour of sensors in real environments and shows substantial advantages.
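For orientation, a comparison of this kind could be reproduced for some of the classical baselines (LR, SVM, GPR) with scikit-learn defaults on surrogate data; the models, hyperparameters, and data below are illustrative and not those used in the study:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# toy surrogate data standing in for the road-test features/targets
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 5))
y = np.sin(X[:, 0]) + 0.3 * X[:, 1] + 0.05 * rng.standard_normal(300)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          random_state=0)

models = {
    "LR": LinearRegression(),
    "SVM": SVR(),
    "GPR": GaussianProcessRegressor(),
}
# fit each baseline and score it on held-out data with R^2
scores = {name: r2_score(y_te, m.fit(X_tr, y_tr).predict(X_te))
          for name, m in models.items()}
```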


**Table 5.** Performance comparison between several regression algorithms.

#### *5.3. Virtual Validation in CarMaker*

In this section, a test run is randomly selected from the test set samples and carried out in a co-simulation based on the CarMaker-Simulink software, which provides a multi-body simulation environment that includes vehicle dynamics control and sensor modules; these modules support custom modifications. Thus, the PLDM replaces the default camera model in CarMaker, and its detection performance is tested in a virtual environment. As illustrated in Figure 9, the entire model is integrated into the co-simulation platform. A realistic reproduction of the virtual scenario is produced using the digital twin-based M86 map. Subsequently, at each time step, the ideal CarMaker object sensor detects objects from the map and provides precise information feedback in list format. In particular, test run data from previous tests conducted in a real-world environment, such as vehicle dynamics and positioning information, are stored in an external file that can be utilized as input for free movement in CarMaker. This module is mainly responsible for real measurement playback, and the necessary data are transmitted to the MLP-based error estimation module, where the estimator predicts the corresponding error values for *C*0-LPE and *C*1-HAE, respectively, based on the current vehicle state. Since the MLP model was successfully trained on the prior training data, the ground-truth lane marking data are manipulated according to the model's output. In this way, the two polynomial coefficients (*C*0 and *C*1) of the lane marking detection can be determined.

**Figure 9.** The procedure of the phenomenological lane detection model in simulation.

As shown in Figure 10, the estimated values and real values of the randomly selected samples are generally consistent in their trends. Affected by real-world factors, the predictions for certain samples still contain some error. A larger *C*0 estimation error can be observed for the left lane than for the right lane, probably because the left lane line is dashed while the right lane line is continuously solid; a dashed lane marking is usually more challenging to distinguish from the background in a captured image, as explained in [60]. Overall, the predicted peaks and valleys of the estimated values still deviate somewhat from the actual values. However, the maximum error value of 0.05 m is still acceptable. The reason is that this method takes into account many more factors than traditional regression forecasting methods, and errors in the weights of some secondary factors are hard to avoid. Nevertheless, in terms of the overall trend, the effectiveness of the chosen model is demonstrated.

**Figure 10.** Simulation result for *C*0 and *C*1 estimation. (**a**) *C*1 estimation comparison between camera detection data and MLP based output. (**b**) *C*0 of left lane estimation comparison between camera detection data and MLP based output. (**c**) *C*0 of right lane estimation comparison between camera detection data and MLP-based output.

