#### *2.1. Sample*

A total of 60 of the 112 registered participants took part in the test procedure; the selection aimed at a well-distributed study sample. Of those 60 participants, 55 completed the test successfully, and the results of a further 7 participants were removed due to incomplete questionnaire datasets. This yielded 48 complete datasets for the analysis of the expression of trust (EOT), raw NASA TLX, and system usability scale (SUS) questionnaires.

The sample is grouped by the participants' demographic factors into the defined categories of age, gender, yearly driving distance, driver assistance experience, and educational level. The distribution is as follows:


#### *2.2. Study Materials*

Three different questionnaires were used to obtain subjective ratings of trust, system usability, and workload. They gather the participants' feedback on their global perception of trust in automation and the use of AD systems. The questionnaires are the EOT, the raw NASA TLX, and the SUS. The questions are presented on a tablet computer, allowing the subjects to provide their ratings by tapping the screen. A rating may be corrected up to the point when the participant confirms the questionnaire as complete. The answers are recorded in a decimal format. To help the participants enter their intended answer, the answer input area is supported by a color code, smileys, and a written explanation. An example of the visualization is shown in Figure 2.


**Figure 2.** Example of a question presented to the participants [44].

The EOT is a modified version of Helldin et al. [45] and relies on the original questionnaire on trust in automation from Jian et al. [39]. The questions assess user trust in AD functionality on a seven-point Likert scale ranging from "do not agree at all" to "agree completely". The questionnaire is presented before the participants get into the cockpit and start the test execution, and again after they leave the cockpit. This repeated testing allows measuring changes in the participants' trust subject to their experience in the test environment. The following questions are used in this questionnaire:


Each question reflects a specific aspect of human trust in automation. In the present study, all single answers of a participant are summarised into a mean value according to Equation (1) to reflect the overall change of the response from before to after the experiment, where *Q0p* is the newly built overall score per participant and *Api* is the single response to one of questions one to seven of that participant.

$$Q0_p = \frac{1}{n} \sum_{i=1}^{n} A_{p_i} \tag{1}$$


The mean value is treated as a new dependent variable and processed together with the demographic interactions in the analysis of variances and correlations.
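The per-participant averaging in Equation (1) and the resulting pre/post difference can be sketched as follows; this is a minimal illustration, and the variable names and example ratings are invented, not taken from the study data.

```python
# Sketch of the per-participant EOT score from Equation (1):
# the mean of the seven Likert responses A_p_i (scale 1-7).
def eot_score(answers):
    """Overall score Q0_p: mean of a participant's answers."""
    return sum(answers) / len(answers)

# Hypothetical pre- and post-exposure responses of one participant
pre = [4, 5, 3, 4, 5, 4, 5]
post = [6, 6, 5, 5, 6, 6, 6]

q0_pre = eot_score(pre)            # overall score before the drive
q0_post = eot_score(post)          # overall score after the drive
trust_change = q0_post - q0_pre    # change used as dependent variable
```

The difference `trust_change` corresponds to the before/after comparison described above, which is then entered into the variance and correlation analyses.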

After the test execution, the usability of the system is assessed by the SUS [46]. The ten-question questionnaire is adapted to fit the boundaries of the study and adopts the advice from Grier et al. [47]. The questions used are:


Q10 I had to learn a lot of things before I could start using this assistance system.

The possible answers range from absolutely disagree to absolutely agree, with numerical values from 0 to 4, where 4 is the highest possible value of an answer. For questions 1, 3, 5, 7, and 9, the values are used as given; for the negatively formulated questions 2, 4, 6, 8, and 10, the polarity is reversed (4 to 0) prior to the statistical analysis. Over the ten questions, the maximum summed result is 40 points. The summed value is scaled by a factor of 2.5 to obtain a score from 0 to 100 for each participant. A mean value for the system is then generated from all participants' answers [48,49].
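The scoring steps above can be sketched in a few lines; this is an illustrative implementation assuming the raw item values are already coded 0 to 4 in question order, with the function name chosen here for clarity.

```python
# Sketch of SUS scoring: odd-numbered items (Q1, Q3, ...) keep their
# raw value, even-numbered (negatively formulated) items are reversed
# as (4 - x), and the 0..40 sum is scaled by 2.5 to a 0..100 score.
def sus_score(answers):
    """answers: ten raw item values (0-4) in order Q1..Q10."""
    assert len(answers) == 10
    total = 0
    for i, a in enumerate(answers):
        if i % 2 == 0:
            total += a          # positive items: Q1, Q3, Q5, Q7, Q9
        else:
            total += 4 - a      # reversed items: Q2, Q4, Q6, Q8, Q10
    return total * 2.5          # scale 0..40 -> 0..100

# Best possible answer on every item yields the maximum score
best = sus_score([4, 0, 4, 0, 4, 0, 4, 0, 4, 0])  # 100.0
```

The system-level SUS value is then simply the mean of these per-participant scores.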

The third post-testing questionnaire is the NASA TLX [50], a standardised questionnaire on participants' perceived workload. For the present purpose, the questions are presented as single questions without the weighting of the question pairs, also called "raw TLX". This was chosen to keep the time required of each participant within reasonable bounds. Consequently, an overall workload score is excluded from the data analysis procedure, as the official rules cannot be followed [50].


The answer options range from very low to very high, with the exception of question 4, which offers an answer scale from perfect to failed; all answers are given on a seven-point scale.
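Without the pairwise weighting, no official overall workload score is formed, so each dimension is analysed on its own; a minimal sketch of this per-dimension treatment is shown below. The dimension names follow the standard NASA TLX, but the ratings are made up for illustration.

```python
# Raw TLX handling: six dimensions, each rated on a 7-point scale,
# analysed individually without the pairwise weighting step.
RAW_TLX_DIMENSIONS = [
    "mental demand", "physical demand", "temporal demand",
    "performance", "effort", "frustration",
]

def dimension_means(ratings_per_participant):
    """Mean rating per dimension across participants (unweighted)."""
    n = len(ratings_per_participant)
    return {
        dim: sum(r[i] for r in ratings_per_participant) / n
        for i, dim in enumerate(RAW_TLX_DIMENSIONS)
    }

sample = [
    [3, 2, 4, 2, 3, 1],   # hypothetical participant 1
    [5, 2, 3, 3, 4, 2],   # hypothetical participant 2
]
means = dimension_means(sample)
```

Each dimension mean can then enter the statistical analysis as its own variable, mirroring the exclusion of an overall workload score.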

#### *2.3. Experimental Procedure*

Each participant followed a standardised experimental procedure also described in Clement et al. [44]. Figure 3 offers a top-level visualization of this procedure, starting with the introduction phase and continuing on to the post-testing phase. These two phases are the measurement phases for investigating the change of attitude due to the participant's exposure to AD in a moving driving simulator.

In phase one, the "introduction phase", the participants are introduced to the testing procedure and informed about the relevant data protection rules (according to the GDPR [51]) and the measurements to be taken. This is done by a psychologist to ensure informed consent and to provide a trained professional to deal with unforeseen circumstances. Following the initial talk, the participants' expression of trust in AD and their prior experience are evaluated by a questionnaire. Participants may withdraw from the experiment in any phase.

In phase two, the "get prepared phase", the participants are equipped with the necessary sensors and connected to the database for recording. They are seated in the simulator, which they see for the first time, for further instructions. The participants do not see or meet each other to avoid any influence in this regard. They get informed about the detailed testing procedure and the equipment used for it, especially the simulator vehicle cockpit in which they are seated and which they need to handle. In particular, the interaction with the human–machine interface (HMI) is explained, as the information about disabled functionality is displayed in the driver's dashboard and the interaction with the presented Scenario Specific Questionnaire on Trust (SSQT) questions takes place on the central infotainment display.

In phase three, the "get used to it phase", the participants drive in the virtual environment on their own, five minutes without the moving hexa-pod platform and then five minutes with the movement platform activated. Test questionnaires of the SSQT are presented in the central infotainment display to avoid uncertainties within the experiment.

In phase four, the "test execution phase", the actual exposure to the highly automated driving scenarios takes place. The participants drive through the scenarios with the task of observing the behaviour of the vehicle and its reactions to the environment, and of answering the SSQT to provide the required feedback. All ten scenarios and their consecutive SSQTs are processed automatically one after another.

In phase five, the "get cleared phase", the participants leave the cockpit once all scenarios are completed successfully, and all sensors are removed, so they are able to recover before they move on to phase six.

In phase six, the "post-testing phase", the same questionnaire as in phase one is presented to assess the participants' expression of trust in AD after their simulator experience and to evaluate the differences. Additionally, the NASA TLX [50] questionnaire is presented to the participants for workload measurement without pairwise weighting. Furthermore, the SUS [40] is used to evaluate the subjective usability of the testing system.

**Figure 3.** Experiment procedure for each individual participant. (XY represents the number of the scenario, XY.1 the first scene within the scenario XY, and XY.1.Q1 represents question one targeting the specific scene within the scenario XY).

#### *2.4. Equipment and Techniques*

For the virtual environment, an actual vehicle cockpit, cut from a real vehicle and modified with additional HMI displays and buttons, is mounted on a moving hexa-pod platform [52]. A 180° canvas displays the simulated scenery of a real road using three video projectors (4K, 100 Hz) to help prevent motion sickness. The platform is capable of movements in six degrees of freedom: translations in three directions with accelerations of up to 6 m/s<sup>2</sup> in any direction and up to 1.5 m of travel, and rotations around the three axes. The virtual environment is provided by Vires VTD [53], whereas the platform movement is calculated by a vehicle simulation model [54] for better accuracy. All components of the environment are synchronised by the co-simulation framework Model.Connect [55]. The simulator can reproduce all movements of normal driving conditions up to the aforementioned 6 m/s<sup>2</sup> accelerations [44].

The environment is operated from a control (and preparation) room and is supported by a data server for data management. The automated data collection focuses on the participant's state, the responses to the questionnaires, and the biometric data. The biometric sensor setup can be modified in accordance with the study design and the expected output. This study makes use of time-of-flight cameras, a chest belt for heart-rate measurement, a wristband for assessing skin conductivity and body temperature, and an eye-tracking device. The setup is characterised by the combination of physiological data assessment and interactive subjective questionnaires.

#### *2.5. Scenarios*

Phase 4 of the testing procedure, shown in Figure 3, contains ten consecutive scenarios which are executed in a standardised manner for each participant. The scenarios are designed to give the subject the impression of and experience with future AD. All scenarios contain potentially safety-critical situations, which may influence the participants' subjectively perceived safety and therefore affect trust in automated driving systems. Even though participants do not get harmed in the simulated situations, they can predict neither the system's behaviour nor the final outcome. The simulated safety-critical scenarios represent situations that may happen in real-world driving, with an estimated impact on trust and an influence on the participants' subjective ratings. Each scenario is followed by an SSQT, supported by a video reminder of the action to be evaluated. The scenarios are scheduled automatically by the test automation service once the SSQT is completed. The ten scenarios are divided into the following four clusters (scenarios with a similar use case [56]), which are also shown in Figure 4.

Simulated safety critical situations:


Each scenario is conducted in high driving automation mode (SAE level 4—no need for driver intervention, but still possible) [1], see also Figure 1, while the participants are tasked with observing the vehicle behaviour and the environment and providing their subjective feedback. The scenario clusters are designed in the same environment [56]. Their AD parameters differ so that two driving modes are available: sporty or comfortable. Compared to the comfortable configuration, the sporty configuration allows a shorter time to collision and higher deceleration/acceleration rates as well as smaller following and stopping distances. Full details of the scenarios are published in [57].

Cluster one describes a constant drive at 100 km/h approaching a vehicle driving at 70 km/h on a straight one-way two-lane road in an urban area. The driver is informed in the HMI about the high driving automation mode. As all parameters, such as speed difference, other traffic, and range of vision for an automated take-over, are met, the vehicle controller performs an automated take-over of the other vehicle. After the take-over, the ego vehicle automatically changes back to the initial lane and continues to drive at 100 km/h. Next, it is passed by another vehicle, which splashes dirt onto the ego vehicle's sensor setup. The driver is informed about a decrease in take-over functionality. The next vehicle approached by the ego vehicle is driving at 70 km/h. As the sensor setup has not yet recovered, no automated take-over can be performed and the ego vehicle follows the slower vehicle in the lane ahead. After a predefined time (10 s) of sensor cleaning, the take-over functionality is available again and displayed to the driver. As all mandatory conditions are met, another automated take-over is performed in the same manner as the first one. After changing back to the initial lane, the drive continues at 100 km/h and the scenario ends. This scenario is performed twice, once with each controller configuration.

**Figure 4.** Pictures of critical events of the four scenario clusters: (**top left**) C1—at the beginning of the second take-over; (**top right**) C2—appearance of the stopped truck in the fog; (**bottom left**) C3—appearance of the child between the parked vehicles; (**bottom right**) C4—cut-in of the delivery van from the construction site with nearly zero speed.

Cluster two is described by the ego vehicle starting off and following another vehicle driving at 50 km/h in high driving automation mode on a straight two-way road. The speed remains constant and appropriate to the road conditions. After a defined time of 40 s of following the vehicle ahead, the weather conditions deteriorate, as appearing fog limits visibility. Due to the reduced sensor visibility, the vehicle's speed is automatically reduced by the automated driving controller to 30 km/h and the driver is informed about the issue via the HMI. The vehicle in front drives faster and disappears in the fog. As the drive continues, the vehicle ahead suddenly reappears in the fog, stopping behind a truck that has already stopped there. The automated driving system handles the emergency braking situation and automatically stops behind the vehicle at a suitable distance. Due to the higher speed, the *sporty* controller setting creates a more critical situation and a shorter stopping distance. No accident occurs. This scenario is performed twice, once with each controller configuration.

The three scenarios of cluster three are characterised by a low-speed drive in high driving automation mode in an urban residential area with parked vehicles. The road is straight. The ego vehicle approaches, at 30 km/h, a narrow lane with vehicles parked on both sides of the one-way road. The ego vehicle automatically changes to the free middle lane and the speed is reduced further to 15 km/h due to the reduced sensor range between the parked vehicles. After some driving time between the parked vehicles, a child suddenly appears between two parked vehicles on the right and crosses the road just ahead of the ego vehicle. The automated driving controller reacts to the situation and nearly stops the ego vehicle (<5 km/h) in front of the child. After the situation is cleared and the child leaves the scene, the controller automatically continues driving. This scenario is performed three times, and the stopping distance and deceleration values differ between the runs depending on the controller configuration (runs 1 and 3 comfortable, run 2 sporty).

Cluster four implements three evolving scenarios. Each consecutive scenario extends the previous one, adding poor weather in the second repetition and a delivery van cutting into the driving line in the third repetition, while the controller for the high driving automation is always set to comfortable mode. All scenarios are driven in high driving automation mode. The basic/first scenario is defined by driving in the first lane of a three-lane motorway at 100 km/h; this lane is then closed due to a construction site ahead with a speed limit of 80 km/h. The automated driving controller automatically changes from the first to the second lane, as all conditions for an automated lane change are met. The speed is reduced as required by the traffic signs and the construction site is passed successfully. After the construction site, the speed is increased back to 100 km/h and the scenario ends. In the second scenario, the first scenario is repeated under poor weather conditions, which reduce the visibility. The rest of the scenario remains unchanged. In the third scenario, the weather conditions from the second scenario remain the same and an additional critical situation is provoked by a delivery van leaving the construction site right in front of the ego vehicle. The automated driving controller performs an emergency brake and reduces the speed from 80 km/h to 18 km/h to avoid a collision. The distance to the delivery van is reduced to as little as 2 m, creating a critical situation at the higher speed. The delivery van accelerates slowly, as is typical for this vehicle type. After the construction site, the delivery van changes to the first lane. The ego vehicle's lane is cleared, the automated driving controller accelerates back to the initial speed of 100 km/h, and the scenario ends.
