2.1. Scenario and Calibration Parameters
As shown in the state-of-the-art, different scenarios must be considered for ADAS/ADS calibration. For this purpose, the framework links the concept of scenario-based testing with virtual calibration. A scenario is defined as a description of a temporal sequence of events in road traffic [4]. While functional scenarios describe events linguistically, so-called logical scenarios use scenario parameters to represent different forms of a scenario [15]. For example, a logical target cut-in scenario can occur at various distances and velocities at which another vehicle cuts into the lane in front of the ego vehicle. These values are denoted as scenario parameters. Scenario parameters have different co-domains and units. If a fixed value is selected for each scenario parameter of a logical scenario, a so-called concrete scenario is derived from the logical scenario. An example of a concrete scenario is a target cut-in scenario with a cut-in distance of 50 m and an initial velocity of 100 km/h. A logical scenario therefore comprises a group of concrete scenarios with their respective scenario parameters and corresponding co-domains.
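To make the distinction concrete, the following minimal Python sketch models a logical cut-in scenario as a set of scenario parameters with co-domains and derives a concrete scenario by fixing each value. The parameter names and co-domains are illustrative, not taken from the framework.

```python
import random
from dataclasses import dataclass

@dataclass
class ScenarioParameter:
    """A scenario parameter with its co-domain (value range) and unit."""
    name: str
    lower: float
    upper: float
    unit: str

# Logical scenario: target cut-in, described by parameter co-domains.
logical_cut_in = [
    ScenarioParameter("cut_in_distance", 10.0, 120.0, "m"),
    ScenarioParameter("initial_velocity", 60.0, 130.0, "km/h"),
]

def derive_concrete_scenario(logical_scenario):
    """Derive a concrete scenario by fixing every scenario parameter."""
    return {p.name: random.uniform(p.lower, p.upper) for p in logical_scenario}

# A possible result: {"cut_in_distance": 50.0, "initial_velocity": 100.0}
concrete_scenario = derive_concrete_scenario(logical_cut_in)
```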
The concept of logical and concrete scenarios provides a systematic approach to mathematically describing the ODD of ADAS/ADS. Like simulation models, scenario parameters can be clustered into parameters controlling the driving environment, the (ego) vehicle, and driver behavior in the logical scenario.
Scenario parameters controlling the driving environment can influence all layers of the data layer model for the scenario description of Bock et al. [16]. Parameters can be implemented to control road networks, road furniture and rules, temporal modifications and events, moving objects, environmental conditions, and digital information in the logical scenario. A simple example is using scenario parameters to define the trajectories of other vehicles.
To perform calibration for different vehicle derivatives, derivative-specific ego vehicle characteristics have to be adaptable via scenario parameters. Differences in the chassis, powertrain, and aerodynamics of the ego vehicle have to be considered. In particular, differences in the sensor sets are relevant for ADAS/ADS calibration and need to be controlled by scenario parameters, e.g., the position of cameras. Additionally, characteristics of partner software and hardware, such as delays in sensor fusion algorithms, can be represented by scenario parameters for vehicle clusters.
For the calibration of HMI functionalities, it is important to consider different scenarios regarding the behavior of the driver. For example, during calibration in take-over scenarios, in which the driver has to take over the driving task because the ADAS/ADS reaches the limits of its operating conditions, driver attention is a variable that can be implemented as a scenario parameter.
The modular testing framework features a logical scenario database, in which both machine-readable scripts for implementing scenarios in simulation and a database with all scenario parameters and their properties can be stored [14]. The creation of scenario models is the objective of research in the field of virtual ADAS/ADS validation [17,18,19]. Models can be derived from accident databases, field operational tests, driving simulator studies, traffic simulations, and expert knowledge [20].
In order to consider different logical and concrete scenarios during virtual calibration, several test cases with different concrete scenarios must be created for each data set to be tested. In the virtual testing framework, the data set to be used in a test is defined by individual calibration parameters. Just as for scenario parameters, information about the available calibration parameters of the SUT is stored in the exchangeable calibration parameter database of the framework [14]. Co-domains and default values are stored there as information for the testing agent strategy and for the parameter writing method of the simulation module.
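A hedged sketch of what such a database entry stores, exactly the information named above, i.e., co-domain bounds and a default value, might look as follows. The schema and the parameter names are assumptions, not the framework's actual layout.

```python
from dataclasses import dataclass

@dataclass
class CalibrationParameter:
    """Entry of the calibration parameter database (illustrative schema)."""
    name: str       # identifier used when writing the data set to the SUT
    lower: float    # lower bound of the co-domain
    upper: float    # upper bound of the co-domain
    default: float  # default value if the parameter is not varied

# Hypothetical entries for an ACC-like SUT.
calibration_parameter_db = {
    "time_gap_setpoint": CalibrationParameter("time_gap_setpoint", 1.0, 3.0, 1.8),
    "max_deceleration":  CalibrationParameter("max_deceleration", 2.0, 6.0, 3.5),
}
```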
2.2. Evaluation Parameters
The performance evaluation of the SUT in the simulated scenario is particularly important, as it serves as a target value for virtual calibration methods. Evaluation metrics have to quantify functionality, comfort, safety, efficiency, and the naturalness of driving. Since some of these performance aspects are perceived subjectively and may vary depending on the test driver, there is research on objectifying ADAS/ADS evaluation [11,21,22,23]. Studies are performed with different test drivers and driving scenarios to collect objective characteristic values from measurements together with subjective feedback. Based on correlation analysis, evaluation metrics can be created that predict subjective driver perceptions of SUT performance from objectively measured characteristic values. The studies can be performed on proving grounds or, in a more efficient, safe, and reproducible manner, in a driving simulator (Driver-in-the-Loop). In addition, ADAS/ADS evaluation metrics can be designed based on system specifications or the knowledge of system experts.
Because ADAS/ADS performance comprises different aspects, virtual calibration can be framed as the search for a Pareto-optimal solution. However, it has been shown in the state-of-the-art that reducing the individual aspects to an overall rating via weighted sums is better suited for automated virtual calibration than identifying Pareto fronts in the space of calibration parameters [10]. One reason is the comparatively long simulation time, which makes the number of test cases to be simulated and evaluated a critical factor.
Nevertheless, the virtual testing framework of Markofsky and Schramm [14] offers the possibility of testing different weightings of individual performance aspects. It deploys a concept of performance evaluation based on direct and indirect evaluation parameters, so-called key performance indicators (KPIs). Direct KPIs are calculated directly from simulated measurement data or logged internal signals of the SUT. Indirect KPIs process multiple direct KPIs or scenario parameters and map them to new indices, e.g., using correlation models or quality loss functions. This concept allows for the implementation of the complex multi-layer performance rating metrics needed for virtual calibration.
Figure 2. Structure of multi-layer evaluation metrics in the virtual ADAS/ADS testing framework.
Figure 2 shows the structure of evaluation metrics in the virtual ADAS/ADS testing framework. In the bottom layer, direct KPIs are evaluated from simulated measurement data and SUT signals logged during the simulation. Direct KPIs mostly represent physical values, e.g., a minimal distance or a maximum jerk during a target cut-in scenario. These direct KPIs are processed by indirect KPI evaluation metrics, which calculate new numerical indicators from one or more direct KPIs or scenario parameters. Different metrics, such as prediction models known from state-of-the-art research, can be implemented as indirect KPI evaluation metrics. Several metrics can be linked sequentially, which enables the design of processing chains, represented in Figure 2 as a dotted arrow. This results in individual ratings for different aspects of SUT performance, such as comfort and safety. These ratings are weighted and summed up, resulting in an overall rating for the performance of the SUT in the simulated scenario.
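A minimal sketch of this layering could read as follows; the KPI definitions, the toy correlation and quality loss models, and the weights are assumptions rather than the framework's actual metrics.

```python
import numpy as np

def direct_kpis(log):
    """Bottom layer: direct KPIs from simulated measurement data (illustrative)."""
    return {
        "min_distance": float(np.min(log["distance"])),  # m
        "max_jerk": float(np.max(np.abs(log["jerk"]))),  # m/s^3
    }

def indirect_kpis(direct):
    """Middle layer: map direct KPIs to aspect ratings in [0, 10] (toy models)."""
    return {
        "safety": min(10.0, 2.0 * direct["min_distance"]),    # toy correlation model
        "comfort": max(0.0, 10.0 - 2.0 * direct["max_jerk"]), # toy quality loss function
    }

def overall_rating(aspect_ratings, weights):
    """Top layer: weighted sum of aspect ratings gives the overall rating."""
    return sum(weights[a] * r for a, r in aspect_ratings.items())

# Example: evaluate a logged cut-in test case with hypothetical weights.
log = {"distance": np.array([48.0, 31.5, 27.2]), "jerk": np.array([0.4, 1.1, 0.8])}
rating = overall_rating(indirect_kpis(direct_kpis(log)),
                        {"safety": 0.6, "comfort": 0.4})
```

Because the direct KPIs are persisted, the two upper layers can be recomputed at any time, which is exactly the advantage discussed next.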
This structure brings the advantage that all calculations above the pool of direct KPIs can be repeated without re-simulating the test case. Thus, new indirect KPI evaluation metrics and new weights of the performance aspect ratings can be implemented and analyzed for already evaluated test cases.
In the virtual ADAS/ADS testing framework, so-called simulation quality criteria (SQCs) are used to check in postprocessing whether the defined test case was implemented correctly in simulation. This is conducted after the SUT’s performance has been evaluated. Errors in simulation models, for example, due to an improper choice of scenario parameters, can lead to deviations of events in simulation from the specified scenario. SQCs can be used to verify the correct execution of simulation test cases and ensure that calculated KPIs are valid.
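As a hedged illustration of such a criterion (the signal name and tolerance below are assumptions), an SQC for the cut-in example could compare the cut-in distance realized in simulation against the specified scenario parameter:

```python
def sqc_cut_in_distance(log, scenario_params, tolerance_m=2.0):
    """Illustrative simulation quality criterion: verify in postprocessing that
    the cut-in occurred at the specified distance (within a tolerance), so that
    the KPIs calculated for this test case can be considered valid."""
    realized = log["distance_at_cut_in"]            # extracted from simulation logs
    specified = scenario_params["cut_in_distance"]  # from the concrete scenario
    return abs(realized - specified) <= tolerance_m
```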
All evaluation parameters and corresponding evaluation scripts are stored in the evaluation database of the modular framework, where they can be accessed by the testing agent and the evaluation module [14]. This enables the testing agent to integrate only case-relevant evaluation parameters into the test case description during sampling, which enhances postprocessing efficiency.
2.3. Test Case Definition and Sampling
A test case in the virtual ADAS/ADS testing framework includes a logical scenario, defined scenario parameters to derive a concrete scenario in simulation, calibration parameters defining the data set of the SUT to be used, and evaluation parameters to be calculated in postprocessing. This definition turns test case sampling, i.e., the creation of new test cases and the analysis of test case results, into the processing of mathematical parameter spaces.
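Read literally, this definition suggests a plain record of four constituents; the following sketch uses illustrative field names, not the framework's actual test case schema:

```python
from dataclasses import dataclass, field

@dataclass
class TestCase:
    """A test case according to the definition above (illustrative schema)."""
    logical_scenario: str                 # reference into the logical scenario database
    scenario_params: dict[str, float]     # fixed values deriving the concrete scenario
    calibration_params: dict[str, float]  # data set of the SUT to be tested
    evaluation_params: list[str] = field(default_factory=list)  # KPIs/SQCs to compute
```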
The testing agent performs test case sampling in the virtual ADAS/ADS testing framework. It creates new test cases in the dynamic test database, transfers them to the processing pipeline, and analyzes results. In order to obtain the necessary information about various parameters, it has access to the calibration parameter database, the logical scenario database, and the evaluation database. To cover different use cases of virtual verification, validation, and calibration of ADAS/ADS, the testing agent allows the implementation of different test case sampling strategies through the exchangeable testing agent strategy module.
In virtual calibration, the basic objective of test case sampling is to vary the calibration parameters of the SUT to achieve the highest possible overall performance rating in different scenarios. One way to achieve this objective is to test the calibration parameter space using a grid of nodes created by a design of experiments, as proposed by Beglerovic et al. [11]. A simple form is a full factorial design, where a grid of test cases is placed over the entire parameter space. The resulting number of test cases $n_{\mathrm{tc}}$ can be calculated according to formula (1), with $k$ calibration parameters to be considered, $m_i$ nodes of the individual parameters, $s_{\mathrm{log}}$ logical scenarios to be considered, and $s_{\mathrm{con}}$ concrete scenarios to be considered for each logical scenario:

$$n_{\mathrm{tc}} = s_{\mathrm{log}} \cdot s_{\mathrm{con}} \cdot \prod_{i=1}^{k} m_i \qquad (1)$$
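As a purely illustrative calculation (the values are chosen here, not from the source): with $k = 3$ calibration parameters at $m_i = 5$ nodes each, $s_{\mathrm{log}} = 2$ logical scenarios, and $s_{\mathrm{con}} = 10$ concrete scenarios per logical scenario, formula (1) already yields $n_{\mathrm{tc}} = 2 \cdot 10 \cdot 5^{3} = 2500$ test cases.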
This test case sampling method is suitable for rough testing of the calibration parameter space, but $n_{\mathrm{tc}}$ grows exponentially with the number of calibration parameters $k$ and quickly becomes prohibitive as the number of nodes $m_i$ is increased for higher accuracy. Therefore, most approaches in the state-of-the-art use optimization algorithms as test case sampling methods for virtual calibration [7,10]. Particle swarm optimization (PSO) has proven particularly suitable for this task [10]. Compared to deterministic methods, the population-based, stochastic PSO has the advantage of only gradually approaching optimal solutions and thus not converging to a local minimum too early. In addition, the algorithm efficiently solves complex optimization problems without requiring information about the solution space, which is the case in ADAS/ADS calibration [24]. Even for in-house developed ADAS/ADS, the behavior with respect to the performance rating in different scenarios is difficult to predict, especially for black-box systems purchased from suppliers.
In addition, particle swarm optimization is robust against non-deterministic behavior of the simulation. In the simulation for virtual ADAS/ADS testing, the SUT partly interacts with other actors, such as road users, whose behavior can be implemented based on stochastic algorithms. This can lead to slight deviations in KPIs for the same test case. The stochastic implementation of particle swarm optimization, whereby the algorithm does not try to find the one optimal solution but rather near-optimal solutions, makes it robust to small non-deterministic variations in the KPIs [10].
Another advantage of the population-based approach is the parallel processing of several test cases. Given the comparatively long processing times of test cases in virtual ADAS/ADS testing, this offers considerable potential for increasing efficiency. When hardware-in-the-loop tests are used, the simulation must run in real time, but test cases can be distributed across a cluster of test benches and performed in parallel.
The following Algorithm 1 shows the implementation of a PSO for virtual ADAS/ADS calibration in the testing agent strategy module of the testing framework of Markofsky and Schramm [14].
Algorithm 1: PSO Calibration Test Case Sampling
First, for all calibration parameters to be varied by the optimizer, the limits of their co-domains are retrieved from the calibration parameter database (line 1). If no initial particle positions were specified in the input of Algorithm 1, the particle positions $X$ are randomly initialized in the space of the calibration parameters to vary, considering the respective co-domains (line 3). Otherwise, $X$ is set to the specified initial positions (line 5). In addition, particle velocities $V$ are randomly initialized, and the local best position of each individual particle is set to its current position (lines 7 and 8). Due to the high processing time of test cases in virtual ADAS/ADS testing, all evaluated particle positions as well as their costs are saved in history matrices, which are initialized in line 9. This allows Algorithm 2 to check whether the particle positions to be evaluated are already present in the history, in which case the associated test cases do not need to be processed again. Algorithm 2 is used in line 10 for the first time to determine the costs $C$ for the initial particle positions $X$. Based on the smallest value in $C$, a global best position is defined from the corresponding position in $X$ (line 11). After this initial cost calculation, the main loop of the algorithm starts.
PSO iteratively adjusts the positions of the particles in the calibration parameter space to converge to a global optimum with minimal cost. For this purpose, each initialized particle $i$ is moved with a velocity $v_i^{t+1}$ in each iteration $t$, which is determined according to the formula in line 14; in standard PSO notation, with particle position $x_i^{t}$, local best position $p_i$, and global best position $g$, this update reads

$$v_i^{t+1} = w \, v_i^{t} + c_1 r_1 \left( p_i - x_i^{t} \right) + c_2 r_2 \left( g - x_i^{t} \right).$$

The particle velocity of the previous iteration $v_i^{t}$ and the connection vectors from the current particle position to the local best position of the respective particle and to the global best position are considered with weights. The inertia factor $w$ weights the particle velocity of the previous iteration $v_i^{t}$. A higher value results in more exploration of the search space but also slows down the convergence of the PSO [10]. The acceleration constants $c_1$ and $c_2$ are used to influence the motion tendency of the particles toward the local or global best positions, while $r_1$ and $r_2$ are two uniformly distributed random factors that are supposed to impose a natural behavior on the particle swarm.
After moving the particle positions with the particle velocities (line 15), the new particle positions are checked to ensure that they lie within the co-domains of the corresponding calibration parameters. The original PSO does not take constraints of the parameter space into account, so measures have to be implemented to deal with out-of-bound positions. In the implementation presented in Section 4, the algorithm shows the best performance when a periodic search space, according to Zhang et al. [25], is applied, where out-of-bound particles are reset into the search space before evaluation (lines 16 to 22).
After this step, the particle positions are processed to obtain costs for each particle (line 26). The local best position of each particle and the global best position are updated if the new costs are lower than the costs of the previously stored positions (lines 27 and 28). Afterwards, the global optimal position and the corresponding costs are returned together with the evaluation history.
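Since the original algorithm listing is not reproduced here, the following minimal Python sketch illustrates the described loop under stated assumptions: `process_particles` stands in for Algorithm 2, the bounds `lower`/`upper` come from the calibration parameter database, and all hyperparameter defaults are illustrative rather than the framework's actual values. The line comments refer to the algorithm lines cited in the text.

```python
import numpy as np

def pso_calibrate(process_particles, lower, upper, n_particles=20, n_iter=30,
                  w=0.7, c1=1.5, c2=1.5, x_init=None, seed=0):
    """Sketch of Algorithm 1: PSO over the calibration parameter space.

    `process_particles(positions, history)` plays the role of Algorithm 2 and
    returns one cost per particle as a NumPy array.
    """
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    span, dim = upper - lower, lower.size

    # Lines 2-5: random initialization unless initial positions are given.
    x = (rng.uniform(lower, upper, (n_particles, dim)) if x_init is None
         else np.asarray(x_init, float))
    # Lines 7-9: random velocities, local best positions, evaluation history.
    v = rng.uniform(-span, span, (n_particles, dim))
    p_best = x.copy()
    history = {}  # evaluated positions -> costs, reused by Algorithm 2
    # Lines 10-11: initial cost calculation and global best position.
    p_cost = process_particles(x, history)
    g_best, g_cost = p_best[np.argmin(p_cost)].copy(), p_cost.min()

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Line 14: inertia plus weighted pulls toward local and global bests.
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        x = x + v  # line 15: move particles
        # Lines 16-22: periodic search space, wrap out-of-bound positions back in.
        x = lower + np.mod(x - lower, span)
        # Lines 26-28: evaluate particles, update local and global best positions.
        cost = process_particles(x, history)
        better = cost < p_cost
        p_best[better], p_cost[better] = x[better], cost[better]
        if p_cost.min() < g_cost:
            g_best, g_cost = p_best[np.argmin(p_cost)].copy(), p_cost.min()

    return g_best, g_cost, history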
The following Algorithm 2 shows the implementation of the particle processing procedure applied in Algorithm 1.
Algorithm 2: Particle Processing
Algorithm 2 forms a loop for processing particles individually (line 1). As mentioned earlier, a strength of PSO is the parallel processing of particle positions and their associated test cases. The loop in line 1 can be replaced by an appropriate implementation to distribute particle positions on a cluster of test benches.
In the first step of particle processing, each value of a particle position is rounded to two decimal places (line 2). This measure is implemented to increase the reusability of already processed particles, which is checked in line 3. If a particle position has already been processed and can be located in the evaluation history, the corresponding particle costs saved in the history are reused. Since particles in PSO converge to local or global best positions after a few iterations, this measure strongly increases the efficiency of the algorithm.
For each particle position that has not been processed before, the necessary test cases are created. For each concrete scenario to be considered during virtual calibration, there needs to be a dedicated test case in order to calculate the overall cost of the particle position. Thereby, $s_{\mathrm{log}}$ different logical scenarios are considered, each containing $s_{\mathrm{con}}$ concrete scenarios. The necessary scenario parameter values are defined in matrices or can be retrieved as default values from the logical scenario database of the framework. To create a test case, a logical scenario and key-value pairs of the scenario parameters must be specified, which are created in this step (line 7). Additionally, the currently investigated particle position, together with constant calibration parameters and default values retrieved from the calibration parameter database, is converted to key-value pairs to define the data set to be tested in this test case (line 8). In line 9, the names of the performance rating KPI and of all other KPIs needed for evaluating the SUT's performance in the respective logical scenario are retrieved from the evaluation database of the framework. This information completes the test case description. After the creation of all test cases (line 10), these are processed (line 12), and the costs for each particle position are calculated (line 13). Since the PSO is designed to minimize the cost of particle positions, the cost is calculated in this step as the difference between the maximum achievable performance rating and the mean value of the performance ratings over all test cases. In addition, it is checked whether the SQCs of all test cases were positive. If an SQC is negative and additional error-handling measures are not activated, it can be assumed that the test case was not implemented correctly in the simulation. Therefore, the costs of this particle position are set to the maximum value, and the affected test cases are labeled accordingly in the dynamic test database so that they can be investigated manually (line 14). This procedure is repeated for all particle positions, and the cost vector is returned together with the updated evaluation history.
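As with Algorithm 1, the listing is summarized here by a hedged Python sketch; `run_test_cases`, `param_names`, and the maximum rating value are assumptions standing in for the framework's test case pipeline, and the line comments refer to the algorithm lines cited in the text.

```python
import numpy as np

MAX_RATING = 10.0  # assumed maximum achievable performance rating

def make_process_particles(scenarios, param_names, const_params, run_test_cases):
    """Build an Algorithm 2-style particle processing function (illustrative).

    `scenarios` is a list of (logical_scenario, [scenario_param_dicts]) pairs,
    `param_names` are the calibration parameters varied by the PSO, and
    `run_test_cases(test_cases)` returns (performance_ratings, sqcs_ok).
    """
    def process_particles(positions, history):
        costs = np.empty(len(positions))
        for i, pos in enumerate(positions):
            # Line 2: round to two decimal places to increase cache reuse.
            key = tuple(np.round(pos, 2))
            # Lines 3-4: reuse costs of already evaluated particle positions.
            if key in history:
                costs[i] = history[key]
                continue
            # Lines 5-10: one test case per concrete scenario of each logical scenario.
            calibration = {**const_params, **dict(zip(param_names, key))}
            test_cases = [(logical, scenario_params, calibration)
                          for logical, concrete_list in scenarios
                          for scenario_params in concrete_list]
            # Lines 12-13: process test cases; cost = max rating - mean rating.
            ratings, sqcs_ok = run_test_cases(test_cases)
            if all(sqcs_ok):
                costs[i] = MAX_RATING - float(np.mean(ratings))
            else:
                # Line 14: invalid SQC -> maximum cost, flag for manual review.
                costs[i] = MAX_RATING
            history[key] = costs[i]
        return costs
    return process_particles
```

The returned closure matches the `process_particles(positions, history)` interface assumed in the Algorithm 1 sketch above; binding the scenario and parameter arguments beforehand keeps the optimizer itself agnostic of the test case pipeline.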
With this PSO implementation consisting of Algorithm 1 and Algorithm 2, in which no additional stop conditions are implemented, the number of test cases $n_{\mathrm{tc}}$ to be simulated can be calculated according to formula (2), with $n_{\mathrm{p}}$ particles and $i_{\max}$ iterations of the main loop, plus the initial cost calculation:

$$n_{\mathrm{tc}} = s_{\mathrm{log}} \cdot s_{\mathrm{con}} \cdot n_{\mathrm{p}} \cdot \left( i_{\max} + 1 \right) \qquad (2)$$

Nevertheless, the actual number of test cases is generally below this value due to the converging nature of the PSO and the reuse of already evaluated particle positions (Algorithm 2, line 4), which strongly increases optimization efficiency.
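As a purely illustrative calculation with the example values from above ($s_{\mathrm{log}} = 2$, $s_{\mathrm{con}} = 10$) and assumed optimizer settings of $n_{\mathrm{p}} = 20$ particles and $i_{\max} = 30$ iterations, formula (2) gives an upper bound of $n_{\mathrm{tc}} = 2 \cdot 10 \cdot 20 \cdot 31 = 12{,}400$ test cases; the reuse of cached particle positions typically keeps the actual number well below this bound.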