**3. Methodology**

#### *3.1. Dynamic Ground Truth Sensor Model Validation Approach*

The DGT-SMV approach is depicted in Figure 1. The process starts with the definition of driving scenarios (*scenario definition*), which are related to phenomena of the evaluated sensor, such as multipath propagation and separation capability [24,25]. In the next step, the tests are performed on a proving ground or in public traffic (*real test drive*) using accurate measurement equipment.

The measurement data are used to label the recorded low-level sensor output (*measurement data labelling*). To enable a direct comparison, the test drives are then re-simulated in a detailed virtual representation using a digital twin of the environment and the investigated virtual sensor (*virtual sensor replay*). The virtual sensor produces the low-level sensor output that is finally compared with the measured sensor output using statistical methods (*performance evaluation*). The individual steps of the DGT-SMV process are described in detail in the following sections.

#### *3.2. On-Road Measurements*

Virtual testing can be conducted on a road section edited with a simple road editor in the initial development phase; however, commercially available simulation programs can also be used to re-simulate real-world measurements of road sections. Ultra-High-Definition maps [25] allow the simulation of existing real road geometries, thus facilitating the realistic modelling of the virtual environment. This method allows the analysis of the complete functional chain, from the detection of environment perception sensors to intervention systems.

On the digital map, one can accurately measure the position, direction and movement of the test vehicle, as well as the reference distance to static objects such as lane markings, guardrails and curbs, using a high-precision inertial measurement unit (IMU). As part of the measurement campaign in cooperation with the Department of Automotive Technologies of the Budapest University of Technology and Economics [25], several driving manoeuvres were performed to determine the validation criteria for radar-sensor models by collecting real radar-sensor data.

**Figure 1.** Dynamic ground truth sensor model validation process.

#### 3.2.1. Driving Scenario

In our literature review, we did not find systematic validation and verification methods that would allow for the objective testing of available sensor models, in particular, with respect to the different phenomena that occur in different test scenarios during test drives with real sensors. The problem was first raised in the ENABLE-S3 [26] project. The ENABLE-S3 EU project aimed to develop an innovative V&V methodology that could combine the virtual and the real world in an optimal way. Several experiments were conducted to define and test validation criteria for sensor models. Three radar-specific phenomena were identified to be investigated in detail.

All of these phenomena derive from the physics of radar detection and are widely used to describe the performance of radar sensors. These are the ability to detect occluded objects (multipath propagation), the separation of close objects (separability) and the rapid fluctuation of the measured radar cross-section (RCS) over azimuth angles. As a result of this research, Holder et al. [24] concluded that validating and verifying sensor models and measurement data for repeatability and reliability is a difficult and complex task due to the highly stochastic nature of the radar output data. Furthermore, they found that radar-specific characteristics can be related either to the hardware architecture of the sensor under investigation (i.e., separability) or to signal propagation properties, such as multipath propagation, scattering and reflections (i.e., detection of occluded objects).

In this research work, the commercially available Continental ARS308 radar sensor was used to generate radar-sensor measurement data to characterise the behaviour of the real radar sensor under different driving scenarios. Since we did not have detailed technical documentation for the radar sensor used in our experiments, which would allow us to infer its performance and the expected hardware-related radar-specific phenomena, we decided to treat the sensor as a black box. Thus, we only used the information given in the sensor data sheet to set the parameters of the virtual sensor model.

The measurement campaign on the M86 highway section in Hungary was conducted in cooperation with international industrial and academic partners. In this campaign, two important aspects of the assessment of ADAS/AD functions were considered. First, the mapping of the road geometry in order to produce a UHD map of the highway section.

Secondly, the creation of a ground truth database of all participating test cars using a high-accuracy Global Navigation Satellite System (GNSS). Detailed insights into the entire measurement campaign can be found in [25]. Taking into account the number of available potential target vehicles, a tailor-made manoeuvre catalogue was prepared that includes a total of 17 manoeuvres, with 12 designed for measurement runs on the M86 highway section and five for the dynamic platform of the ZalaZone automotive proving ground [27].

As several target vehicles were available, the manoeuvres were performed with up to five vehicles in different configurations of distance, speed and acceleration. For completeness, it should be mentioned that there were also manoeuvres involving up to 11 vehicles and two trucks, for which high-accuracy GPS data are also available. In this research work, we demonstrate the potential of our assessment method using one selected manoeuvre, the *Range test target leaving*, which is depicted in Figure 2.

**Figure 2.** Target leaving with constant delta speed.

This driving scenario contains four driving manoeuvres with varying target-vehicle speed parameters. The initial state is defined as follows: both vehicles, with the FSRA (Stop-and-Go ACC) activated and the follow time set to the minimum, reached the initial speed, which was set to 30 km/h for this manoeuvre. The distance to the TARGET vehicle, controlled by the FSRA system of the EGO vehicle, is in a steady-state condition. After the initial conditions are reached, the driver of the target vehicle changes the set speed of the FSRA system from the initial 30 km/h to vset\_TAR = v1-4\_TAR and pulls away from the EGO vehicle.

The measurement is considered complete when the distance between the vehicles has reached 250 m. To obtain the best measurement result, the angular orientation deviation (with respect to the direction of movement of the sensor) of the tested vehicles shall be kept below 1 degree. Unfortunately, the highway section used in this joint research project is slightly curved, so the angular orientation deviation changes continuously during the test runs and increases beyond 1 degree. This road geometry causes the reflection points in a cumulative representation to be shifted.

#### 3.2.2. Vehicle Set-Up and Measurement System

The sensor performance evaluation process is based on the comparison of low-level detection points from real measurements using automotive radar sensors with the corresponding simulation. This requires highly accurate ground truth reference data of the environment, including static and dynamic objects. An appropriate approach to generating such ground truth data by means of high-accuracy measurements is illustrated in Figure 3.

To capture as much information as possible about the car's surroundings, the ego vehicle was equipped with the following sensors for environmental perception:



The radar sensors mounted on the ego vehicle are shown in Figure 4. In addition to the environmental sensors, the DEWETRON-CAPS measurement system [28] was mounted on both the target and the ego vehicle. The CAPS measurement system allows the implementation of a wirelessly connected topology of data acquisition units, consisting of one master and several slave measurement PCs. In addition to the time-stamped input and output hardware interfaces, the core of the CAPS system is a high-accuracy inertial measurement unit (IMU).

In our experimental setup, we used the most advanced Automotive Dynamic Motion Analyzer (ADMA-G-Pro+) GPS/INS-IMU from GeneSys Ltd. The ADMA, combined with an RTK-DGPS receiver [29] and connected to the Hungarian Positioning Service, provided high accuracy dynamic state and position information of the test vehicles in real time. In addition, an accurate time measurement is derived from the GPS/PPS signal, allowing synchronous measurement of all connected data acquisition units.

As all measurement inputs are time-stamped, the measurement system ensures that all data streams, received locally or via WLAN, are stored in time synchronisation. In addition, the robust WLAN connection allows the real-time transmission of dynamic state and position information from the target vehicle to the ego vehicle. The transmitted data allow the driver to monitor the driving scenario online and thus quickly ensure the quality of the measurement process.

**Figure 3.** Schematic diagram of the measurement setup.

**Figure 4.** Test vehicle measurement setup.

#### *3.3. Re-Simulation of Experiments*

When on-road measurements are conducted, it is possible to generate exactly the same scenarios in the virtual simulation using the recorded vehicle trajectories. IPG CarMaker® was selected as the simulation environment, as this software package provides a Virtual Vehicle Environment (VVE) that represents a multi-body simulation (MBS). This includes the equations of motion, kinematics and, for ADAS applications, sensor models. For building a digital twin of the highway section, a detailed Ultra-High-Definition (UHD) map was provided by Joanneum Research Forschungsgesellschaft, which was also a partner in the consortium [25]. This map is a highly accurate representation of the highway section in the OpenDrive file format. As the road geometry in IPG CarMaker® [30] is defined in the RD5 file format, the UHD map in OpenDrive format has to be converted.

The virtual map includes the following road geometry items [25]:


This detailed description of the environment makes it possible to reduce deviations between the real and virtual world to a minimum. In addition to the preparation of the virtual environment, the recorded trajectories must be transformed from the geodetic WGS-84 coordinate system to a metric coordinate system, which is used in the simulation. Since the operating radius of the experiments is smaller than 50 km, the curvature of the earth can be neglected, and the transformation can be performed on a plane metric coordinate system, which is spanned relative to a reference point [28].
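The planar transformation described above can be sketched as follows. This is a minimal equirectangular approximation relative to a reference point, valid for the small operating radius mentioned; the function and parameter names are illustrative, not those of the toolchain actually used:

```python
import math

# Mean Earth radius in metres (curvature is neglected within ~50 km).
EARTH_RADIUS_M = 6_371_000.0

def wgs84_to_local_plane(lat_deg, lon_deg, ref_lat_deg, ref_lon_deg):
    """Project a WGS-84 point onto a local metric x/y plane around a reference point.

    Equirectangular approximation: x points east, y points north, both in metres.
    """
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    ref_lat, ref_lon = math.radians(ref_lat_deg), math.radians(ref_lon_deg)
    x = EARTH_RADIUS_M * (lon - ref_lon) * math.cos(ref_lat)  # east offset
    y = EARTH_RADIUS_M * (lat - ref_lat)                      # north offset
    return x, y
```

Applied to every recorded trajectory sample, this yields the metric coordinates the simulation consumes; the reference point itself maps to the origin.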

IPG CarMaker distinguishes between two categories of vehicles: the ego vehicle and the traffic vehicle. The former represents the Vehicle Under Test (VUT), including a multi-body representation in which all sub-systems can be changed by the user, e.g., by mounting ADAS sensors on the vehicle, whereas the traffic vehicle is represented only by a motion model, in our case a single-track model. In order to make the traffic vehicle follow the previously recorded and transformed trajectory, the exact position in the *x*- and *y*-direction was given to the vehicle at every time step.

The ego vehicle is controlled via the IPG Driver, a mathematical representation of human driving behaviour. This Driver performs all interactions with the car, e.g., steering or accelerating/braking. If the ego vehicle is given a target trajectory, the IPG Driver would approximate it, just as a human driver would. However, since in our case an exact following of the recorded trajectories is essential for the evaluation of the sensor model, the driver model has to be bypassed. This was done with a modification in the C-code interface provided by the software vendor IPG. With this adaptation, the ego vehicle reproduces in the virtual environment the same trajectory as measured in the real world, ignoring any intervention by the IPG Driver.

For the replay of the scenarios, a standard IPG-Car parameter set was used, including models of powertrain, tires, chassis, steering, aerodynamics and sensors. Since the ego vehicle exactly follows the recorded trajectory, no detailed parameter setting is required. The simulation software offers a number of different sensor models, which operate at different levels, ranging from ideal sensors to phenomenological sensors and to raw signal interfaces.

Since this paper focuses on the validation of low-level sensor models, the RSI radar-sensor model from IPG CarMaker® V 8.1.1 [30] is considered and described in detail.

#### 3.3.1. IPG RSI Radar Sensor Model

This sensor model provided by IPG CarMaker imitates physical wave propagation using an optical ray-tracing approach. It includes the major effects of wave propagation, e.g.,


Using this ray-optical sensor model in a virtual environment requires modelling the material properties of objects, such as the relative electric permittivity for electromagnetic waves and scattering effects. These parameters have a significant influence on the direction and field strength of the reflected wave. The reflections are created by a detailed 3D surface in the visualisation. In the set-up used, the default values provided by the simulation tool for a 77 GHz radar were applied.

#### 3.3.2. Parameter Setting of the Sensor Model

Radar sensors are influenced by a multitude of parameters, which makes the parameter setting of such models complex. To ensure the comparability of the sensor model, the real hardware was treated as a black box, so the sensor model was configured with the parameters given in the data sheet provided by the manufacturer of the radar sensor. In addition to the parameters for the atmospheric environment and temperature, the following data-sheet-based parameters were used: Field of View, Range, Cycle Time, Max. Channels, Frequency, Separability Distance, Separability Azimuth, Separability Elevation and Separability Speed.

With the two additionally offered "design" parameters and the scattering effect, CarMaker® gives users the possibility to fine-tune the sensor model. However, one parameter given in the data sheet of the real hardware was not adjustable in the software package: the inaccuracy depending on the distance of the detected object. This inaccuracy was therefore superimposed on the simulation results afterwards, as this leads to more robust results. To make this modification visible in the results, the data are marked as *modified data* in Section 4 [30,31].
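The superimposing of a distance-dependent inaccuracy can be illustrated with a small sketch. The linear noise model and its coefficients below are assumptions for illustration only, not values from the ARS308 data sheet:

```python
import random

def superimpose_range_error(detections, sigma0=0.05, sigma_per_m=0.001, seed=None):
    """Add zero-mean Gaussian range noise whose standard deviation grows with distance.

    `detections` is a list of (range_m, azimuth_rad) tuples. The noise model
    sigma(r) = sigma0 + sigma_per_m * r is an illustrative assumption.
    """
    rng = random.Random(seed)
    noisy = []
    for r, az in detections:
        sigma = sigma0 + sigma_per_m * r  # inaccuracy grows with object distance
        noisy.append((r + rng.gauss(0.0, sigma), az))
    return noisy
```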

#### *3.4. Labelling of Radar Measurement Data*

In order to assign the individual reflection points of the radar sensor to the dynamic targets, a method already known from the field of object tracking is used, namely the gating technique. Using the ground truth information of the dynamic objects, only target points within a specifically shaped area around an object of interest are considered. Figure 5 shows the gating area with the associated and non-associated target points. The shape of the gating area can be designed in various ways, such as rectangular or elliptical [32].

In accordance with the shape of an average car, we used a rectangular gate. Only dynamic objects are considered, as no static object information, e.g., bridge heads or overhead traffic signs, is available in the virtual map. The evaluation is therefore limited to moving objects whose ground truth is measured with the RTK-GPS IMU equipment, but it is also applicable to static objects, provided their ground truth is referenced.
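A rectangular gate of the kind described above can be sketched as follows; the gate dimensions and function names are illustrative (roughly an average car plus margin):

```python
import math

def gate_rectangular(points, ref_xy, yaw_rad, length_m=5.0, width_m=2.5):
    """Associate radar detection points with a target via a rectangular gate.

    The gate is a length x width rectangle centred on the ground-truth reference
    point and aligned with the target's heading. Returns (associated,
    not_associated) lists of (x, y) points.
    """
    cos_y, sin_y = math.cos(yaw_rad), math.sin(yaw_rad)
    associated, rest = [], []
    for x, y in points:
        # Transform the detection into the target's body-fixed frame.
        dx, dy = x - ref_xy[0], y - ref_xy[1]
        lon = cos_y * dx + sin_y * dy   # longitudinal offset
        lat = -sin_y * dx + cos_y * dy  # lateral offset
        if abs(lon) <= length_m / 2 and abs(lat) <= width_m / 2:
            associated.append((x, y))
        else:
            rest.append((x, y))
    return associated, rest
```

An elliptical gate would only change the membership test, e.g. `(lon / a)**2 + (lat / b)**2 <= 1`.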

**Figure 5.** Gating area of the target vehicle with associated and non-associated reflection points according to [31].

#### *3.5. Evaluation Procedure*

Different evaluation metrics are given in the literature, such as the comparison of occupancy grid maps, statistical hypothesis testing, confidence intervals, correlation measurements and the generation of probability density functions [18,21,33,34]. In contrast to object-list-based deterministic sensor models, physical non-deterministic sensor models do not allow a direct comparison between experimental observations and simulation models. To make the highly stochastic output of a physical radar-sensor model comparable, including physical attributes such as the relative velocity or RCS value, statistical evaluation methods are best suited to describe the distribution of parameters in space and time.

Using the previously labelled data, it is possible to evaluate them by statistical means in such a way that a quantitative statement can be made about the quality of the sensor model in comparison to the real hardware. Introducing a reference point $P_{ref}(x, y)$ on the target vehicle enables the calculation of the deviation of every radar detection point from the ground truth of the dynamic object, see Figure 6. Radar detection points are represented by the vector $\zeta_r$ for the measured sensor data and $\zeta_s$ for the simulated data. The deviation is calculated with

$$\zeta\_{s,\Delta}(x, y) = \zeta\_s(x, y) - P\_{ref}(x, y),\tag{1}$$

$$\zeta\_{r,\Delta}(x, y) = \zeta\_r(x, y) - P\_{ref}(x, y).\tag{2}$$

where $\zeta_{s,\Delta}$ represents the deviation of the simulation data and $\zeta_{r,\Delta}$ the deviation of the real sensor data from the reference point.

**Figure 6.** Reference point $P_{ref}(x, y)$ on the dynamic object; deviation of the real sensor target points from the reference point $\zeta_{r,\Delta}(x, y)$ and deviation of the simulation target points from the reference point $\zeta_{s,\Delta}(x, y)$, according to [31].

As radar sensors are subject to a highly stochastic process, the detection points can be treated as realizations of a distribution function [35] (p. 35). Using methods such as kernel density estimation (KDE), a probability density function (PDF) can be generated from the large number of realizations.
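The KDE step can be sketched with a minimal pure-Python Gaussian kernel density estimator using Silverman's rule-of-thumb bandwidth. In practice a library implementation such as `scipy.stats.gaussian_kde` would typically be used; this sketch only illustrates the technique:

```python
import math

def gaussian_kde_pdf(samples, bandwidth=None):
    """Return a PDF estimated from 1-D samples via Gaussian kernel density estimation.

    The bandwidth defaults to Silverman's rule of thumb: h = 1.06 * std * n^(-1/5).
    """
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
        bandwidth = 1.06 * std * n ** (-1 / 5)  # Silverman's rule of thumb
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        # Sum of Gaussian kernels centred on each realization.
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf
```

Evaluating the returned `pdf` over, e.g., the lateral deviations $\zeta_{r,\Delta}$ yields a smooth density estimate from the discrete detection points.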

#### *3.6. Validation Metrics for Comparing Probability Distributions*

The validation of simulation models is based on the numerical comparison of data sets from experimental observations and the computational model output for a given use case. To quantify the comparison, validation metrics can be defined to measure the difference between the physical observation and the simulated output. Whether comparing measured physical quantities or virtual simulations, observed values contain uncertainties. In the presence of uncertainties, the observed values subject to validation are samples from a distribution of possible measured values, which are usually unknown.

To optimally quantify the difference or similarity between distributions, we need the actual distributions. For the empirical data sets resulting from the experimental observations (real radar sensor) and the output of the computational model (physical radar-sensor model), we do not know the actual distribution, or even its shape. Although one can always make parametric assumptions or use kernel density estimates (KDE), these are not ideal in practice, as their analysis is limited to specific types of distributions or kernels. To stay as close to the data as possible, we therefore consider a non-parametric divergence measure. Non-parametric models are extremely useful when moving from discrete data to probability functions or distributions.

Non-parametric approaches are another way to estimate distributions and can map discrete distributions of any shape. The simplest implementation of non-parametric distribution estimation is the histogram. Histograms benefit from knowledge of the data sets to be estimated and require fine-tuning to achieve optimal estimation results. In our application, this knowledge is available, since the bin width of the histogram can be determined according to the real sensor's data sheet.
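A histogram-based estimate with a fixed bin width, as described above, can be sketched as follows. The support bounds `lo`/`hi` are an assumption for illustration; the bin width would come from the sensor's data sheet (e.g. its range resolution):

```python
def histogram_distribution(samples, bin_width, lo, hi):
    """Estimate a discrete probability distribution from samples with a fixed bin width.

    Samples outside [lo, hi) are ignored. Returns normalised bin probabilities
    that sum to 1 (if any sample fell inside the support).
    """
    n_bins = max(1, round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            idx = min(int((s - lo) / bin_width), n_bins - 1)
            counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts
```

Estimating both the real and the simulated deviations on the same bin grid yields the two discrete distributions compared in the next section.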

As stated above, a metric is a mathematical operator that gives a formal measure of the difference between experimental and model results. The metric plays a central role, as it describes the fidelity of sensor models used to validate ADAS/AD functions; a low metric value means a good match and vice versa. According to [18], a good metric must be intuitively understandable, applicable to both deterministic and non-deterministic data, define a confidence interval as a function of the number of measured data points, and satisfy the mathematical properties of a metric.

The variables measured by perception sensors are usually non-parametric due to the highly stochastic nature of the output data [24]. Based on these properties, one possible description of the correspondence between synthetic and real perception data is the comparison of their probability distribution functions. In the context of validating perception-sensor models, the most useful characterization appears to be the comparison of the distributions of random variables and the shapes of the corresponding observations. Random variables whose distribution functions are the same are said to be equal in distribution.

If the shapes of the distributions are not exactly the same, the difference can be measured using several possible measures. Maupin et al. [36] described a number of validation metrics for deterministic and probabilistic data that are used to validate computational models by quantifying the information provided by physical and simulated observations. In the context of this research, we propose to use the Jensen–Shannon Divergence (JSD) [37], as it provides a quantified expression of the comparison between two or more discrete probability distributions in a normalised manner.

The JSD is a symmetrised version of the Kullback–Leibler Divergence, described in detail in [18,36]. We consider a true discrete probability distribution P and its approximation Q over the values taken on by the random variable. The Jensen–Shannon Divergence is calculated with

$$D\_{JS}(\mathcal{P}\,\|\,\mathcal{Q}) = \frac{1}{2} D\_{KL}(\mathcal{P}\,\|\,\mathcal{M}) + \frac{1}{2} D\_{KL}(\mathcal{Q}\,\|\,\mathcal{M}) \tag{3}$$

where M is the mean distribution for P and Q, as given by

$$\mathcal{M} = \frac{\mathcal{P} + \mathcal{Q}}{2} \tag{4}$$

The Jensen–Shannon Divergence uses the Kullback–Leibler Divergence to calculate a normalised measure. If P and Q describe the probability distributions of two discrete random variables, the KL Divergence is calculated according to Equation (5).

$$D\_{KL}(\mathcal{P}\,\|\,\mathcal{Q}) = \sum\_{i} \mathcal{P}\_i(x) \log\left(\frac{\mathcal{P}\_i(x)}{\mathcal{Q}\_i(x)}\right) \tag{5}$$

Since the JS Divergence is a smoothed and normalised measure derived from the KL Divergence, it can easily be integrated into development processes. By definition, the square root of the Jensen–Shannon Divergence is the Jensen–Shannon distance.

$$Dist\_{JS}(\mathcal{P}\,\|\,\mathcal{Q}) = \sqrt{D\_{JS}(\mathcal{P}\,\|\,\mathcal{Q})} \tag{6}$$

As both the divergence $D_{JS}(\mathcal{P}\,\|\,\mathcal{Q})$ and the distance $Dist_{JS}(\mathcal{P}\,\|\,\mathcal{Q})$ are symmetric with respect to the arguments P and Q, and the JS Divergence is always non-negative, the value of $D_{JS}(\mathcal{P}\,\|\,\mathcal{Q})$ is always a real number in the closed interval [0, 1] (when the base-2 logarithm is used).

$$0 \le D\_{JS}(\mathcal{P}\,\|\,\mathcal{Q}) \le 1 \tag{7}$$

If the value is 0, the two distributions P and Q are identical; if the value is 1, they are as different as possible. For better interpretation, we present the JSD as percentage values in the following. As $Dist_{JS}(\mathcal{P}\,\|\,\mathcal{Q})$ fulfils the mathematical properties of a true metric [38], namely symmetry, the triangle inequality and identity, the Jensen–Shannon distance is a valid distance metric.
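The computation of Equations (3)-(7) on two discrete distributions can be sketched as follows; the base-2 logarithm is assumed, which yields the [0, 1] bound stated above:

```python
import math

def kl_divergence(p, q):
    """Discrete KL Divergence D_KL(P||Q), Equation (5), using the base-2 logarithm.

    Terms with p_i = 0 contribute nothing and are skipped.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon Divergence, Equations (3) and (4), bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mean distribution M
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def js_distance(p, q):
    """Jensen-Shannon distance, Equation (6): the square root of the JS Divergence."""
    return math.sqrt(js_divergence(p, q))
```

Fed with the histogram estimates of the real and simulated deviation data, `js_divergence` returns 0 for identical distributions and 1 for distributions with disjoint support.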
