**1. Introduction**

The traffic safety problem is severe, with an increasing number of vehicles on the road. According to [1], approximately 11 percent of road accidents result from lane departures caused by inattentive, distracted, or drowsy drivers. According to statistics from [2], in 2015 nearly 13,000 people died in single-vehicle run-off-road, head-on, and sideswipe crashes in which a passenger vehicle left the lane without warning. Lane Keeping Assist (LKA) and lane departure warning systems are designed to reduce this risk and improve driving safety by supporting the driver in maintaining safe lateral vehicle control. A study that investigated the safety potential of Lane Keeping Assist systems shows that the possibility of avoiding fatal accidents is between 16.4% and 29.2%, depending on the capability of the system [3]. For passenger vehicles, these values even reach 23.2% to 40.9%.

Nowadays, almost every installed system relies on vision-based technologies to detect and trace lane markings. In most conventional methods [4–6], the lane edge is detected within a region of interest by image filtering and thresholding. With the development of artificial intelligence, convolutional Neural Network (NN)-based approaches have stimulated a promising research direction for the extraction of lane markings from acquired images [7–9]. In contrast, Kim et al. [10] use a Multilayer Perceptron (MLP) in the fully connected layer, manually extract the Region of Interest as the input of the convolutional NN, and directly output lane marking candidates. This approach ultimately outputs the detected lane marking by fitting a function. Thus, the camera's computational performance and the algorithm's detection efficiency affect the accuracy of the detection results. An appropriate lane marking detection model is required to analyze and validate vision-based lane marking detection systems. This model is developed based on the ground truth of digital twin maps, which provide an excellent setting for detecting and reading a list of lane marking points to validate the performance of the lane marking model.
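For illustration, the following is a minimal sketch of such a conventional pipeline, assuming OpenCV and NumPy are available; the region of interest, thresholds, and Hough parameters are placeholder values rather than those of the cited works [4–6]:

```python
import cv2
import numpy as np

def detect_lane_segments(frame_bgr):
    """Conventional lane edge extraction: filtering, gradient
    thresholding, ROI masking, and a Hough transform for line candidates."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)   # suppress sensor noise
    edges = cv2.Canny(blurred, 50, 150)           # gradient thresholding

    # Keep only the lower half of the image, where lane markings appear.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    mask[h // 2:, :] = 255
    edges_roi = cv2.bitwise_and(edges, mask)

    # Fit straight-line candidates to the remaining edge pixels.
    return cv2.HoughLinesP(edges_roi, rho=1, theta=np.pi / 180,
                           threshold=40, minLineLength=30, maxLineGap=20)
```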

Meanwhile, Kalra et al. [11] and Shladover et al. [12] demonstrated that using Autonomous Driving Systems (ADS) statistically results in fewer collisions. However, hundreds of millions of kilometres of test drives would have to be conducted to verify the robustness of ADS algorithms and software. Furthermore, ADS are subject to different research challenges (technical, non-technical, social, and policy) [13]. In particular, different driving scenarios related to traffic and humans impose new system requirements on ADS [14]. These challenges imply that certification of an automated system can only be achieved with the support of modelling and simulation [15]. More specifically, to realistically capture the complexity and diversity of the real world in a virtual environment, models that combine virtual scenarios, flexible simulations, and real measurement data should be considered [16,17].

In order to accommodate the different requirements encountered during the vehicle development process, various camera model types with distinct detection performance have been developed, as demonstrated in prior studies. For example, Schlager et al. [18] defined low-fidelity sensor modules whose input and output are object lists, which are filtered according to the sensor-specific Field of View (FOV). In [19], an error-free camera model is introduced, which correctly recognizes all objects within the FOV. Based on this sensor model, a more refined sensor model is proposed in [19,20], which supports arbitrarily shaped FOVs. In order to standardize the modelling process, a modular architecture was proposed [21], which defines the filtering process for input object lists according to different sensor effects and occlusion situations [20]. A significant advantage of the described model is that it only considers detection results within the FOV of the sensor, which results in lower computing complexity.
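The filtering idea behind these low-fidelity models can be sketched as follows, assuming a planar, wedge-shaped FOV in the sensor frame; the class and parameter values are illustrative and not taken from [18–21]:

```python
import math
from dataclasses import dataclass

@dataclass
class ObjectState:
    x: float  # longitudinal position in the sensor frame [m]
    y: float  # lateral position in the sensor frame [m]

def filter_by_fov(objects, max_range=80.0, half_angle_deg=25.0):
    """Low-fidelity sensor model: pass every ground-truth object that
    lies inside a wedge-shaped FOV to the output list, drop the rest."""
    visible = []
    for obj in objects:
        dist = math.hypot(obj.x, obj.y)
        bearing = math.degrees(math.atan2(obj.y, obj.x))
        if dist <= max_range and abs(bearing) <= half_angle_deg:
            visible.append(obj)
    return visible
```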

However, due to the low fidelity provided by the model, the detection performance of a specific sensor cannot be accurately replicated. Therefore, a stochastic model for errors in the position measurement is constructed based on an ideal sensor in [21], where the variation is random Gaussian white noise. Real detection behaviour is still not reflected by such a random error distribution. In order to improve the realism of sensor simulation and approximate the distribution of given measurements or a dataset, non-parametric machine learning approaches can be used. They estimate the outputs and ensure that the shape of the distribution is learned from the data automatically [22–24]. Furthermore, the details of the perception function are usually not accessible to the developer of the automated driving system, i.e., the vehicle manufacturer, and the measurement process of a comprehensive physical model is computationally expensive. Accordingly, a statistical model of the perception process is proposed. Examples of statistical models can be found in [25,26]. In these models, the measurement and reference data drive the construction of the sensor model: errors are calculated between measurement and reference data, and probability functions map these errors onto the reference data as the outputs of the model [27]. This approach can implicitly depict several sources of error. In contrast to previous techniques, the resulting sensor output distribution is no longer limited to a specific set of distributions. This statistical model was also employed in [28] as a lane marking detection model, where a direct relationship between sensing distance and error was developed by measuring the errors of a real camera system [22]. These models only account for the measurement error of the camera and ignore the impact of the environment and the vehicle's dynamic movement on the results. Hence, it is impossible to predict the output correctly based on the vehicle's current status.
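A minimal sketch of such a data-driven error model is given below, here using distance-binned empirical error distributions; the binning scheme and class name are illustrative assumptions, not the exact method of [25–28]:

```python
import numpy as np

class StatisticalErrorModel:
    """Learn the empirical distribution of detection errors per
    sensing-distance bin from paired measurement and reference data,
    then sample from it at simulation time."""

    def __init__(self, distances, errors, bin_edges):
        self.bin_edges = np.asarray(bin_edges)
        idx = np.digitize(distances, self.bin_edges)
        # One pool of observed errors per distance bin.
        self.pools = [np.asarray(errors)[idx == i]
                      for i in range(len(self.bin_edges) + 1)]

    def sample(self, distance, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        pool = self.pools[np.digitize([distance], self.bin_edges)[0]]
        return float(rng.choice(pool)) if pool.size else 0.0

# Usage: perturb a ground-truth lateral offset with a sampled error.
# model = StatisticalErrorModel(train_dist, train_err, bin_edges=[10, 30, 50])
# y_sim = y_ground_truth + model.sample(distance=42.0)
```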

In order to enhance the fidelity of camera simulation, a complex camera model that mimics the physics of the imaging process is proposed in [29,30]: optical effects (e.g., optical distortion, blur, and vignetting) and additionally the image processing modules (e.g., signal amplification and object or feature identification and detection) are modelled. In [31], an optical model was presented to validate the functional and safety limits of camera-based ADAS, which is based on the real, measured lens used in the product. In addition, Carlson et al. [32] proposed an efficient, automatic, and physically based augmentation pipeline that varies sensor effects to augment camera simulation performance. As more or changing requirements emerge, the model must be updated with optical characterization models, which results in increasing effort. Therefore, the main design paradigm of the model presents a barrier to iterative development cycles.
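For illustration, such optical effects can be roughly approximated on an image as sketched below, assuming OpenCV and a three-channel input; all coefficients are placeholders, not the measured lens characteristics used in [29–31]:

```python
import cv2
import numpy as np

def apply_optical_effects(img_bgr, k1=-0.2, blur_sigma=1.0, vignette=0.4):
    """Rough physical-style camera effects: radial lens distortion,
    defocus blur, and vignetting (all coefficients are illustrative)."""
    h, w = img_bgr.shape[:2]
    # Radial distortion via a simple pinhole camera matrix; feeding a
    # negative k1 into undistort is used here to emulate distortion.
    K = np.array([[w, 0, w / 2], [0, w, h / 2], [0, 0, 1]], dtype=np.float64)
    distorted = cv2.undistort(img_bgr, K, np.array([k1, 0.0, 0.0, 0.0]))
    # Defocus blur; the kernel size is derived from sigma.
    blurred = cv2.GaussianBlur(distorted, (0, 0), blur_sigma)
    # Vignetting: darken pixels with distance from the image centre.
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(xx - w / 2, yy - h / 2) / np.hypot(w / 2, h / 2)
    gain = 1.0 - vignette * r**2
    return (blurred * gain[..., None]).clip(0, 255).astype(np.uint8)
```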

Additionally, a semi-physical model combining geometric and stochastic approaches to simulate dedicated short-range communication was developed in [33] and calibrated for different environmental conditions with on-road measurements.

This paper aims to overcome the drawbacks and limitations of these previous studies by fitting lane marking detection errors. It is based on statistical models using real-time vehicle measurement data collected in real-world tests. As the camera sensing algorithm is highly confidential, it is impossible to determine the primary factors driving the detection error directly from the extensive vehicle data. Therefore, feature selection is introduced, which removes redundant or irrelevant features from the data without losing informative ones. In this study, the lane detection error model is constructed with an MLP. One of the main advantages of the MLP is its capability of modelling both linear and nonlinear relationships between the parameters. Meanwhile, the trained MLP is applied to estimate the output for new input data in the virtual simulation environment.
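As an illustration of this workflow, the following minimal sketch uses scikit-learn; the feature matrix, the selector, and the network size are placeholder assumptions, while the actual data and architecture are described in Sections 3 and 4:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_regression
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: vehicle-based features (e.g., speed, yaw rate, lateral offset, ...).
# y: observed detection error, e.g., camera C0 minus ground-truth C0.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 12))   # placeholder feature matrix
y = rng.normal(size=1000)         # placeholder error targets

model = make_pipeline(
    StandardScaler(),                          # scale features for the MLP
    SelectKBest(mutual_info_regression, k=6),  # drop redundant features
    MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
)
model.fit(X, y)

# At simulation time, the predicted error is added to the map ground truth.
predicted_error = model.predict(X[:1])
```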

The structure of the subsequent sections of this paper is as follows: The problem is defined in Section 2. Section 3 presents the method for data collection and the ground truth definition. Section 4 describes the methodology and structure of the designed MLP for lane marking detection using vehicle-based data. Experimental results are presented and discussed in Section 5. Finally, a conclusion is provided in Section 6.

#### **2. Problem Definition**

Numerical models of cameras can be used for simulation and digital twin-based testing for automated vehicles. In prior studies [28,34,35], varieties of sensor models with a distinct performance and detail profile were introduced that can replicate the performance of real cameras in simulation. These camera models can be adapted to accommodate specific simulation requirements. Three camera models that are frequently utilized in a simulation scenario can be categorized as follows:


MOBILEYE camera series 630 [36] is used, which includes complicated and confidential perception algorithms that are difficult to simulate in software.

• Phenomenological Sensor Model: It simulates sensor performance, where phenomenological output effects are modelled without consideration of the internal processes or algorithms of a camera, with an emphasis on reproducing the real effects, i.e., the difference between camera outputs and reference data. The phenomenological sensor model places greater emphasis on physical effects to establish the relationship between the input and output of the camera model. With this model, it is possible to map the realistic behaviour of lane detection more quickly and efficiently. Moreover, the camera modelling framework avoids complex algorithms.

Camera recognition is mainly responsible for detecting road markings. For the current study, our test vehicle is equipped with a MOBILEYE camera series 630, which employs a third-degree polynomial to represent detected lane markings. Thus, the stored output of the image processing unit is four coefficients *C* ∈ R<sup>4</sup>, *C* = [*C*0, *C*1, *C*2, *C*3] for each detected lane marking; the polynomial function is presented in Equation (1).

$$Y_{Cam}(X_{Cam}) = \sum_{i=0}^{3} C_i \cdot X_{Cam}^{i} \tag{1}$$

The measurement coordinate system is relative to the camera, where *XCam* points in the forward direction and *YCam* points to the right side, as illustrated in Figure 1.

**Figure 1.** Illustration of lane marking detection.
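For illustration, Equation (1) can be evaluated to recover lane marking points in the camera frame as sketched below (function and variable names are illustrative):

```python
import numpy as np

def lane_points(C, x_max=60.0, step=1.0):
    """Evaluate the third-degree lane polynomial of Equation (1):
    Y_Cam(X_Cam) = C0 + C1*X_Cam + C2*X_Cam**2 + C3*X_Cam**3."""
    x = np.arange(0.0, x_max, step)      # forward distances X_Cam [m]
    y = C[0] + C[1] * x + C[2] * x**2 + C[3] * x**3
    return np.column_stack((x, y))       # (X_Cam, Y_Cam) pairs

# Example: 1.8 m lateral offset and a slight heading deviation.
pts = lane_points(C=[1.8, 0.01, 0.0, 0.0])
```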

These coefficients are explained in Table 1. Since our test scenarios primarily focus on straight segments of the highway, *C*2 and *C*3 are ignored. *C*0 is the lateral distance to the detected lane marking at the height of the camera, and *C*1 indicates the vehicle heading relative to the lane heading. The road markings on the measurement section are symmetrical, implying that the *C*1 values for the left and right lane markings are identical. As a result, this paper will only focus on the estimation of *C*0 and *C*1, as shown in Figure 1.


**Table 1.** Lane detection coefficients from MOBILEYE camera.

Vision-based lane detection is influenced by different factors, including external environmental parameters [37] (e.g., lane line reflectivity, appearance, and lighting conditions) as well as vehicle dynamic behaviour [38] (e.g., speed, heading angle, and departure from the road centerline), resulting in discrepancies between detection results and the ground truth. This phenomenon can be observed by comparing the two different road markings in Figure 1. According to the Guide to the Expression of Uncertainty in Measurement, the detection result of the camera can be stated as the best reference quantity plus the measurement uncertainty [39], where the uncertainty can be treated as the detection error; it is estimated using an NN-based approach in this paper. Finally, a phenomenological camera model is proposed to approximate real-world camera detection performance.
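Expressed as a minimal sketch, where `error_model` is a hypothetical stand-in for the MLP developed later in this paper:

```python
def camera_output(c0_ref, c1_ref, features, error_model):
    """Phenomenological camera model: the simulated detection equals the
    digital-twin reference value plus an estimated detection error [39].
    `error_model` is assumed to predict (C0 error, C1 error) from the
    current environmental and vehicle-dynamics features."""
    e_c0, e_c1 = error_model.predict([features])[0]
    return c0_ref + e_c0, c1_ref + e_c1
```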

#### **3. Experimental Setup**

#### *3.1. Data Collection*

The highway was closed to the public during data collection. Lane detection data were collected using the MOBILEYE 630 system installed in the test car. The MOBILEYE camera provides real-time image processing to recognize various road objects such as lane markings, pedestrians, and so on. For this study, the data related to the type of detected longitudinal marking (continuous or dashed), the polynomial coefficients of the lane marking, and the view range were recorded. Meanwhile, the six-degree-of-freedom inertial measurement system of the GENESYS Automotive Dynamic Motion Analyzer (ADMA) for motion analysis is combined with the NOVATEL RTK-GPS receiver to provide highly accurate vehicle kinematic data. Figure 2 shows the measurement setup used for data collection.

**Figure 2.** Measurement setup for measuring vehicle.

#### *3.2. Ground Truth Definition*

The ADMA-RTK combination is a strap-down inertial measurement system. The extended Kalman filter used in the ADMA can estimate several important sensor errors in order to enhance system performance. Depending on the capability of the GPS receiver, the position accuracy ranges down to 1 cm. Meanwhile, six inertial sensors provide high-accuracy data [40]. Due to the accurate performance of this combination, it is used as the reference system. The measurements and data collection were conducted on the M86 highway in Hungary, see Figure 3. The closed road section facilitates the development and testing of connected and autonomous vehicles. The total length of the test road section is 3.4 km [41].

**Figure 3.** M86 freeway located near Csorna (Hungary) on route E65 (GNSS coordinates: 47.625778, 17.270162).

In order to duplicate the real-world test scenario in the simulation environment as faithfully as possible, the M86 road was converted into an Ultra-High-Definition (UHD) map, a digital twin that accurately represents every detail of the test environment. The workflow applied for the production of the UHD map was presented in [41]. The digital twin-based M86 map was explicitly produced for testing and validating ADAS/AD driving functions with an absolute precision of ±2 cm as a quality reference source. The extremely high precision of the lane marking data in this map will be used as the ground truth for comparison with the camera detection output. Additionally, this map will also be used for further virtual tests that reproduce the measurement scenarios in simulation.
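A minimal sketch of how reference coefficients could be derived from the map for this comparison, assuming lane points and vehicle pose are given in a common planar frame (the names and the fitting choice are illustrative assumptions):

```python
import numpy as np

def reference_coefficients(map_points_xy, vehicle_pose, degree=3):
    """Derive ground-truth lane coefficients from UHD-map lane points:
    transform world points into the vehicle frame using the ADMA/RTK
    pose, then fit the same third-degree polynomial as the camera."""
    x0, y0, yaw = vehicle_pose              # position [m], heading [rad]
    pts = np.asarray(map_points_xy, dtype=float) - np.array([x0, y0])
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s], [s, c]])         # rotation by -yaw
    local = pts @ R.T                       # world -> vehicle frame
    # polyfit returns the highest degree first; reverse to [C0, C1, C2, C3].
    return np.polyfit(local[:, 0], local[:, 1], degree)[::-1]
```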
