**Feeling Uncertain—Effects of a Vibrotactile Belt that Communicates Vehicle Sensor Uncertainty**

#### **Matti Krüger <sup>1,†</sup>, Tom Driessen <sup>2,†</sup>, Christiane B. Wiebel-Herboth <sup>1</sup>, Joost C. F. de Winter <sup>2,\*</sup> and Heiko Wersing <sup>1</sup>**


Received: 31 May 2020; Accepted: 30 June 2020; Published: 6 July 2020

**Abstract:** With the rise of partially automated cars, drivers are increasingly required to judge the degree of responsibility that can be delegated to vehicle assistance systems. This judgment can be supported by interfaces that intuitively convey the real-time reliability of system functions such as environment sensing. We designed a vibrotactile interface that communicates spatiotemporal information about surrounding vehicles and encodes a representation of spatial uncertainty in a novel way. We evaluated this interface in a driving simulator experiment in which high and low levels of machine and human confidence were induced by simulating degraded vehicle sensor precision and limiting the driver's visibility range, respectively. We were interested in whether drivers (i) could perceive and understand the vibrotactile encoding of spatial uncertainty, (ii) would subjectively benefit from the encoded information, (iii) would be disturbed in cases of information redundancy, and (iv) would gain objective safety benefits from the encoded information. To measure subjective understanding and benefit, a custom questionnaire, Van der Laan acceptance ratings, and NASA TLX scores were used. To measure the objective benefit, we computed the minimum time-to-contact as a measure of safety and gaze distributions as an indicator of attention guidance. Results indicate that participants were able to understand the encoded uncertainty and spatiotemporal information and purposefully utilized it when needed. The tactile interface provided meaningful support despite sensory restrictions. By encoding spatial uncertainties, it successfully extended the operating range of the assistance system.

**Keywords:** spatiotemporal displays; sensory augmentation; reliability display; uncertainty encoding; automotive HMI; human-machine cooperation; cooperative driver assistance; state transparency display

#### **1. Introduction**

Modern cars are equipped with sensor systems that surpass human perception in various ways. For example, camera systems may offer continuous 360-degree vision and Lidar can provide vision in the dark. Advanced driver assistance systems use these sensor capabilities by providing the driver with supportive information (e.g., lane departure warning, blind-spot detection, navigation) or by taking over control (e.g., adaptive cruise control, automated lane-keeping). However, the reliability of sensory systems may degrade due to changes in the environment. For example, the accuracy of Lidar measurements tends to decrease in the rain [1], and car manufacturers warn about reduced reliability of sensors in tunnels (e.g., Reference [2] (p. 96)). Since drivers cannot be expected to have an understanding of the functioning (or the mere existence) of these sensor systems, they may benefit from the availability of information on sensor reliability. An assistance system could assess such measures of uncertainty by itself, where the level of uncertainty may be based on signal variance or the disagreement between different sensor signals. A system that would share information on sensor uncertainty could help drivers adjust their level of trust in the automation to appropriate levels [3]. This approach is in line with a cooperative automation framework, which challenges designers to regard assistance functions as cooperative partners or team agents, rather than as tools, for example, References [4–8]. Among ten challenges to make automation a team player, Klein et al. [6] (p. 93) listed the team agent's ability to "make pertinent aspects of their status and intentions obvious to their teammates". Communicating system uncertainty might be one step in this direction.

#### *1.1. Related Work*

Drivers have been found to show safer behavior when given appropriate supplementary information about the traffic environment (see e.g., References [9–11], but also Reference [12] for potential adverse effects). Several studies in the automotive context have further investigated the potential of reliability displays, especially for automated driving. Most attempts to communicate system uncertainty have focused on visual displays [13–18]. Variants of such displays include function-specific versus function-unspecific uncertainty encodings and different types of implicit and explicit visualization. Qualitative displays, for example, have illustrated uncertainty through icons, while quantitative displays have incorporated multiple levels or continuous measures of uncertainty using graphs and scales. Beller et al. [13] used an emoji-like icon showing a confused face reaching out with open palms to indicate system uncertainty in a driving simulator experiment. Helldin et al. [15] investigated the impact of visualizing assistance uncertainty on drivers' trust by displaying a visualization of assistance competence (*SAE* level 2 [19]) in a driving simulation with varying weather conditions. The level of machine confidence was displayed by means of seven empty bars that filled up as confidence increased, similar to the signal-quality bars on mobile phones. Kunze et al. [16] designed an anthropomorphic reliability display for a simulated *SAE* level 3 automated vehicle. They made a visual display showing a peak from a heartbeat graph that lit up according to a simulated heartbeat frequency between 50 bpm (high reliability) and 140 bpm (low reliability). In addition to the graph, a numeric value of the current machine heart rate was visible.

Uncertainty communication has been shown to be beneficial. Previous work has found improved safety measures [13] and faster take-over times [15,16,20], as well as accompanying changes in gaze behavior [15,16,20]. Furthermore, it was found that drivers showed a more appropriate trust calibration [13,15,18] and gave higher acceptance ratings for such systems [13] compared to baseline conditions. Also, system comprehension [13] and situation awareness [13] were shown to be improved due to uncertainty communication. However, the deployment of the visual modality as a feedback channel has also been subject to criticism. One disadvantage of visual uncertainty communication is that the driver's visual modality might not be continuously available for input as other activities compete for visual attention. When observing the road or engaging in non-driving tasks, drivers may neglect continuous visual displays [21]. This might become especially problematic in automated driving, where the driver is likely to be engaged in a non-driving task. Thus, the use of visual displays for communicating uncertainty carries the risk of disuse or an increase in perceptual workload [16,20].

Recent studies have investigated the use of touch [22], olfaction [23], as well as peripheral vision to share measures of system uncertainty with the driver. In particular, a driving simulator study by Kunze et al. [22] investigated different variants of vibrotactile feedback in a car seat to communicate increases or decreases in the global uncertainty of an automated vehicle for initiating a takeover by the driver. They showed that encodings of uncertainty increase were more intuitive to users than encodings of uncertainty decrease. Moreover, changes in amplitude and rhythm of the vibrotactile feedback were rated highest. The authors did not investigate the effect of the tactile uncertainty feedback on objective measures and recommended that it should still be examined whether people can make use of the feedback and respond to it appropriately. In another study, Kunze et al. [20] coupled a peripheral awareness display with vibrotactile feedback in order to communicate different levels of global system uncertainty in an automated driving simulator experiment. However, they only used the vibrotactile feedback to communicate the highest level of system uncertainty. Results showed that driver workload was significantly lower compared to a visual display condition that needed focal visual attention for the uncertainty communication to be perceived. In addition, they found that users had a more appropriate attention distribution and showed better take-over performance.

Apart from their potential for reliability communication, vibrotactile interfaces have been identified as promising elements of user interfaces [24] and as particularly applicable in the context of driver assistance [25], for example for driving support [26–30] or navigation support [31–39]. In addition, advanced tactile encodings of relevant information such as spatial distances [40–46], directions [32,47–52], and spatiotemporal measures [53,54] have been investigated.

The encouraging results of these studies lead us to conclude that vibrotactile feedback is a promising candidate for uncertainty communication in the automotive context and deserves closer investigation. To our knowledge, no study so far has investigated tactile communication of system uncertainty related to the sensing of, and signaling about, individual traffic participants. Here, we extend previous research by augmenting a previously presented vibrotactile driving assistance system [53,54] with an uncertainty communication functionality.

#### *1.2. Current Study*

The main goal of this study is to evaluate driving experience and performance with a driving assistance system that communicates safety-relevant information and additionally conveys its uncertainty about this information. Using a driving simulation environment, we test how the tactile encoding of one dimension of system uncertainty affects the driver's perception of the system in terms of its usefulness and satisfaction and how it affects perceived workload. In addition, we explore whether such a signal influences measures of driving safety and gaze-based attention.

We extend a vibrotactile driving assistance interface that has previously been shown to support drivers in gaining a better understanding of the environment through sensory augmentation [53,54]. The tactile assistance provides two types of information: the temporal distances and the directions of objects that are on a collision trajectory with the ego-vehicle. The extension introduced here consists of additionally encoding, in the tactile stimuli, the system's uncertainty about the directions of directly approaching objects. We refer to this uncertainty as directional or spatial uncertainty. Because the underlying assistance system provides information about both direction and temporal distance, temporal uncertainty, that is, uncertainty about temporal distances, can also exist. This dimension of uncertainty is not investigated here; in this study, the system is assumed to always have full temporal certainty.

We expect that the effect of directional uncertainty communication will be moderated by the driver's own certainty about the directions of potential collision objects. More specifically, we propose the following hypotheses:

**Hypothesis 1 (H1).** *Understanding. Drivers perceive and understand directional uncertainty encoded in tactile stimuli which communicate spatiotemporal distances of approaching vehicles.*

**Hypothesis 2 (H2).** *Subjective Benefit. Drivers utilize complementary uncertainty information in tactile stimuli for their subjective benefit.*

**Hypothesis 3 (H3).** *Disturbance. Drivers are not disturbed by receiving redundant uncertainty information.*

**Hypothesis 4 (H4).** *Safety. Signaling complementary uncertainty information leads to higher objective safety.*

We here understand *subjective benefit* as a term that subsumes impressions of usefulness, satisfaction, and reduced workload, and *objective safety* as an expression of safety derived from driving data, such as the smallest predicted time-to-contact to any vehicle that is on a collision trajectory with the ego-vehicle (i.e., the minimal time-to-contact, see Sections 2.3 and 2.5.5.4). *Complementary uncertainty* information is here defined as information that augments uncertain human perception. *Redundant uncertainty* information is defined as information that is already fully covered by more certain human perception. *Disturbance* should be understood as the opposite of benefit and would be expressed in lower scores on the subjective measures and lower performance on the objective measures. For this study, we created conditions that enable us to induce both machine and human sensory uncertainty and thereby determine how complementary or how redundant the encoded uncertainty information becomes.
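As a minimal illustration of the objective safety measure, the minimal time-to-contact can be derived from logged per-timestep TTC values. The data layout below is an illustrative assumption, not the format actually used in the experiment:

```python
# Sketch: "objective safety" as the smallest predicted time-to-contact (TTC)
# recorded over a drive. Each timestep lists the predicted TTCs (in seconds)
# to all vehicles currently on a collision trajectory with the ego-vehicle;
# an empty list means no conflicting vehicle at that timestep.
ttc_log = [
    [8.2, 12.5],
    [6.9],
    [],
    [5.1, 7.3],
]

# Minimum over vehicles and time; infinite if no conflict ever occurred.
min_ttc = min((min(ttcs) for ttcs in ttc_log if ttcs), default=float("inf"))
print(min_ttc)  # 5.1
```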

#### **2. Materials and Methods**

#### *2.1. Participants*

Fourteen drivers (1 female) between 21 and 41 years old (*M* = 29.1, *SD* = 5.4) participated in the study. All participants reported normal or corrected-to-normal vision and held a valid driving license for an average of 11 years. All participants gave their written informed consent before taking part in the study.

#### *2.2. Experimental Setup*

The experiment was conducted in a static driving simulator (Figure 1) with controls for steering, braking, and accelerating. Gear-shifting/transmission was set to automatic mode. Three display panels (50 inch diagonal, 1080p each, 60 Hz) presented the driving scenario and the remaining parts of the interior (dashboard, instrument cluster, mirrors), using the SILAB 5.1 driving simulation software developed by the WIVW GmbH (Würzburg Institute for Traffic Sciences, Germany). Participants wore a 120 Hz monocular eye-tracker (Pupil Labs GmbH [55]). In addition, participants wore a waist belt (feelSpace GmbH [56]) containing 16 equally spaced vibromotors (spaced between 4.9 and 7.5 cm apart, depending on the size of the belt). In particular, the belt contains eccentric rotating mass motors with a maximum amplitude of 2.2 g and a frequency spectrum of 50–240 Hz (0.45–3.3 V), triggered with a 50 ms latency. Frequency and amplitude were set to scale approximately linearly with voltage. Four belt sizes were used in the experiment to ensure a good fit for all participants. The firmware of the belt interface was customized for the experiment.

**Figure 1.** Driving simulator setup in the foggy tunnel scenario. The experimenter screen (bottom left) shows a visualization of the tactile stimuli. In this visualization (magnified in the white box on the right side) the location of a dark dot corresponds to the current direction communicated via a tactile stimulus and the size of the dot indicates the intensity of the respective stimulus. Black bars mark the boundaries between which stimuli oscillate dependent on the current range of spatial uncertainty. This visualization was not available to participants.

#### *2.3. Stimuli*

The tactile communication was implemented with a signaling mode similar to the interface used in the experiments by Krüger et al. [53,54]. Two information dimensions about approaching objects were encoded in the tactile stimuli. First, the direction of approaching objects relative to the ego-vehicle was encoded in a mapping of stimulus location on the belt. That is, stimulus location signaled from which lane(s) and lane segments (i.e., center front/back, left front/back, right front/back) vehicles were approaching by activating pre-defined vibromotors that corresponded to the direction of the lane and segment. In previous studies [53,54], we found a circular arrangement of actuators, as provided by the feelSpace belt, to be suitable for intuitive signaling of direction information. Nevertheless, other arrangements may also be suitable and could be preferred when working with specific design constraints. Six of the 16 vibromotors were chosen to realize this mapping (Figure 2). The vibromotors for directional lane encoding were distributed according to the schematic shown in Figure 2. We chose to set the distances between dorsal actuators to be larger than those for the front direction because of differences in spatial discriminability between dorsal and ventral regions [47,57]. A similar direction encoding with eight actuators, but without differential treatment of ventral and dorsal regions, has been successfully employed before, for instance by Van Erp et al. [32].
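A sketch of this lane-segment-to-actuator mapping is given below. The motor indices are taken from the example in Figure 2, and the dictionary keys are our own labels; both should be read as illustrative assumptions rather than the exact firmware mapping:

```python
# Assumed mapping from lane segments to vibromotor indices (0-15 on the ring),
# based on the example in Figure 2. Front (ventral) motors sit one position
# apart; rear (dorsal) motors sit two positions apart to account for coarser
# spatial discriminability on the back.
LANE_SEGMENT_TO_MOTOR = {
    ("left", "front"): 14,
    ("ego", "front"): 0,
    ("right", "front"): 2,
    ("left", "rear"): 11,
    ("ego", "rear"): 8,
    ("right", "rear"): 5,
}

def motors_for(approaching):
    """Return the set of vibromotors to activate for a list of
    (lane, segment) tuples describing approaching vehicles."""
    return {LANE_SEGMENT_TO_MOTOR[key] for key in approaching}

# Example from Figure 2: objects approaching from every left and center(=ego)
# lane direction.
print(sorted(motors_for([("left", "front"), ("ego", "front"),
                         ("left", "rear"), ("ego", "rear")])))  # [0, 8, 11, 14]
```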

Second, the temporal proximity to the approaching object was encoded in the stimulus intensity. We defined the temporal proximity as the complement of the time to collision (TTC) towards a surrounding object that is on a collision track with the ego-vehicle within a fixed temporal range. Assuming that an object *b* is moving behind an object *a* along the same path and trajectory with velocities *Va* and *Vb* and *a* and *b* are distance *Dab* apart, the TTC between *a* and *b* is given by:

$$TTC = \begin{cases} \frac{D_{ab}}{V_b - V_a}, & \text{if } V_b > V_a \\ \infty, & \text{otherwise.} \end{cases} \tag{1}$$

For the left and right lanes, we simplified the TTC computation by calculating the *L*<sup>2</sup> norm of a vector consisting of the respective hypothetical longitudinal TTC (*TTC<sub>Long</sub>*) (i.e., assuming the ego-vehicle were already on the respective lane) and the time to lane crossing (*TLC*) for that lane, according to Equation (2). The TLC is derived as a TTC that is based on the lateral velocity relative to the lane and the distance to the lane boundary.

$$TTC_{L/R} = \left(TTC_{Long}^2 + TLC_{L/R}^2\right)^{\frac{1}{2}}.\tag{2}$$
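Equations (1) and (2) can be written down directly in code; the function names and example numbers below are illustrative:

```python
import math

# Eq. (1): longitudinal TTC between follower b and leader a; infinite when
# the gap between the vehicles is not closing.
def ttc_longitudinal(d_ab: float, v_b: float, v_a: float) -> float:
    return d_ab / (v_b - v_a) if v_b > v_a else math.inf

# Eq. (2): combined left/right-lane TTC as the L2 norm of the hypothetical
# longitudinal TTC and the time to lane crossing (TLC).
def ttc_lane(ttc_long: float, tlc: float) -> float:
    return math.hypot(ttc_long, tlc)

print(ttc_longitudinal(50.0, 30.0, 20.0))  # 5.0 s (50 m gap closing at 10 m/s)
print(round(ttc_lane(5.0, 4.0), 2))        # 6.4 s (sqrt(5^2 + 4^2))
```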

The TTC defines the time it would take until a collision occurred if two objects maintained their current velocities and direction of travel. In the present experiment, we made the stimulus intensity correspond to the complement of the TTC over a temporal range between zero and nine seconds. Stimulus onset occurred whenever the TTC between the ego-vehicle and a surrounding object dropped below a threshold (*θ*) of nine seconds. This value was chosen as a compromise between the goal of maximizing the range of intensity coding and the need to keep stimuli in a range that participants can still perceive as relevant. Stimulus intensity at onset was set to the smallest perceivable intensity identified by the experimenter and increased linearly as the TTC dropped (Equation (3)). If the TTC was zero (a collision), stimulus intensity reached its maximum, which was equal to the maximum intensity provided by the tactile interface. Accordingly, close temporal proximities were signaled with more intense vibration and vice versa.

$$\text{Intensity} = \max\left(\frac{\theta - TTC}{\theta}, 0\right). \tag{3}$$
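Equation (3), together with the approximately linear intensity-to-voltage scaling mentioned in Section 2.2, can be sketched as follows. The voltage mapping and its endpoints are our assumption, derived from the stated 0.45–3.3 V drive range:

```python
THETA = 9.0  # TTC threshold theta in seconds (stimulus onset)

# Eq. (3): normalized stimulus intensity, 0 at or above the threshold and
# 1 at TTC = 0 (imminent collision).
def intensity(ttc: float, theta: float = THETA) -> float:
    return max((theta - ttc) / theta, 0.0)

# Assumed linear mapping onto the belt's drive-voltage range (Section 2.2);
# at onset (intensity 0) this yields the smallest drive voltage.
def drive_voltage(i: float, v_min: float = 0.45, v_max: float = 3.3) -> float:
    return v_min + i * (v_max - v_min)

print(intensity(9.0), intensity(4.5), intensity(0.0))  # 0.0 0.5 1.0
```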

The tactile interface can give exact signals about the location and temporal proximity of an approaching object as long as the vehicle has precise knowledge about the location and velocity of this object. We refer to this signal as the precise signal, which served as a baseline.

**Figure 2.** (**Left**) Schematic of the belt in an example situation where an object (large gray dot) is approaching from every left and center lane direction with a time to collision (TTC) value under nine seconds. Vibromotors no. 0, 14, 8, and 11 (small gray dots) would activate in this case. If the ego-vehicle drove on the left lane, the activations would occur at vibromotors 0, 2, 5, and 8. Note that the selected vibromotors on the rear were spaced two instead of one gap apart to account for differences in spatial discriminability between dorsal and ventral regions [47,57]. (**Right**) Photograph of the tactile waist belt (© feelSpace GmbH).

#### 2.3.1. Uncertainty Communication

In addition to the precise signal, a second signaling mode was realized to communicate the machine's uncertainty about an exact object direction to the user. We refer to this signal as the uncertainty communication. For the uncertainty communication, the encoding of temporal proximity was identical to the precise signal; only the location encoding was varied. The rationale behind the uncertainty communication was that, due to environmental conditions, the vehicle's sensory system may be unable to measure precise object locations (the exact lane) but could still signal the *presence* of an approaching object from either the front or the back, without specifying the ego- or a neighboring lane. In order to convey this information to the user, the direction of approach for a vehicle was no longer signaled by one unique stimulus location, but through a dynamic vibration pattern traveling over a specific range that represented the overlap between the two lanes on which a vehicle might appear. Upon stimulus onset, neighboring vibromotors were successively activated in the clockwise or counter-clockwise direction, creating a tactile illusion of *apparent motion* [24]. The initial vibromotor position and direction were chosen randomly from the available vibromotors within the respective uncertainty range.

Figure 3A shows a schematic of the uncertainty signal. The stimulus development is illustrated by the pointer oscillating between the two borders with a constant frequency (1.0 Hz, from start-to-start point). The next vibromotor activated at the same instant that its predecessor switched off (Figure 3B). The pointer continued to bounce between these borders until either one of two events occurred: (1) the TTC became greater than nine seconds, in which case the signal disappeared, or (2) a reliable estimate of the current lane of the approaching vehicle became available. In the latter case, the width of the range converged to a single vibromotor, conveying the same unique direction as in the precise signal condition. We also experimented with other representations of uncertainty, such as synchronously activating multiple actuators in the uncertainty range. However, such variants, which employ co-activation of nearby actuators, can produce side effects like the funneling illusion [58] and a perceived increase in stimulus intensity [59]. Because such effects would interfere with the encoding of information in stimulus direction and intensity, we favored the described method of sequential activation.

**Figure 3.** Uncertainty signal for an object approaching from the front on a two-lane road (**A**). Grey dots indicate possible locations of the approaching vehicle as signaled by the system. The stimulus *traveled* between the borders and bounced back in the other direction as it hit one of the borders (**B**). The width of the range was chosen to be between the vibromotors that were allocated for the static signal (Figure 2) plus one extra vibromotor on each side. Thus, in the example in this image, the signal bounced between vibromotors 13 and 1.
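The bouncing single-motor activation pattern can be sketched as below. The wrap-around arithmetic, function name, and example borders (taken from the Figure 3 example) are illustrative assumptions, and the 1.0 Hz start-to-start timing is omitted:

```python
N_MOTORS = 16  # vibromotors arranged in a ring

def bouncing_sequence(low, high, start, step, n):
    """Return n successive single-motor activations, reversing direction at
    the borders. The range low..high runs clockwise along the ring and may
    wrap past motor 0 (e.g., 13..1 covers motors 13, 14, 15, 0, 1)."""
    span = (high - low) % N_MOTORS   # range width in motor positions
    pos = (start - low) % N_MOTORS   # unwrapped position inside the range
    seq = []
    for _ in range(n):
        seq.append((low + pos) % N_MOTORS)
        if span == 0:
            continue                 # degenerate range: single motor
        if not 0 <= pos + step <= span:
            step = -step             # bounce off the border
        pos += step
    return seq

# Figure 3 example: range 13..1 (wrapping past 0), starting at motor 15,
# initially moving clockwise.
print(bouncing_sequence(13, 1, 15, +1, 8))  # [15, 0, 1, 0, 15, 14, 13, 14]
```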

#### *2.4. Experimental Design*

#### Independent Variables

Two factors were systematically varied in the experiment in order to evaluate the proposed uncertainty communication system. First, we varied the availability of uncertainty communication (on vs. off). Second, we varied the perceptual uncertainty in the different scenarios between human and machine (machine certain-human uncertain (MC-HU), machine uncertain-human certain (MU-HC), both uncertain (MU-HU)). The uncertainty manipulation was realized through contextual conditions in the driving scenarios that aimed at independently modulating the uncertainty of the vehicle's observations and the uncertainty of the human's observations. Machine uncertainty was introduced by means of driving through (a) a foggy tunnel and (b) rain. Both situations would decrease sensor reliability and increase machine uncertainty. Human uncertainty was provoked by driving through (a) a foggy tunnel and (b) a foggy road. The foggy tunnel thus served as the joint uncertainty condition, in which both the human and the machine suffered from limited sensory input. Since the goal of this study was to examine the effects of uncertainty communication in human-machine cooperation, we decided to omit a condition in which both the human and the machine would be certain. In the foggy road scenario, the machine had an accurate estimate of the position of vehicles at any distance away from it, and it could always communicate the precise signal. Therefore, uncertainty communication (uc) was only available in the foggy tunnel and rain scenarios. Participants drove through these scenarios twice: once without (MU-HU, MU-HC) and once with the uncertainty communication functionality enabled (MU-HU-uc, MU-HC-uc). When the uncertainty communication was disabled, the vibrotactile interface provided a precise signal only once the approaching car entered the visible range (see Section 2.5 for details). When the uncertainty communication was enabled, the vibrotactile interface communicated the uncertain signal whenever the TTC to an approaching object dropped below the defined threshold of nine seconds. This resulted in a total of five experimental conditions, the characteristics of which are summarized in Figure 4.


**Figure 4.** Overview of five experimental conditions with corresponding ranges for human vision and machine sensors. Colors are assigned to individual conditions to facilitate condition mapping of the results. For machine uncertain conditions (blue and green), the light colors mark conditions without uncertainty communication while their dark counterparts indicate uncertainty communication.
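The signal-selection logic implied by this design can be sketched as follows. The decision function itself is our reconstruction; the 33 m precise-sensing range and the nine-second threshold are taken from Sections 2.3 and 2.5:

```python
# Sketch: which stimulus the belt presents for an approaching vehicle, given
# the predicted TTC, the vehicle's distance, and whether uncertainty
# communication (uc) is enabled. Thresholds follow the paper; the function
# itself is an illustrative reconstruction, not the authors' implementation.
def select_signal(ttc: float, distance_m: float, uc_enabled: bool,
                  precise_range_m: float = 33.0, theta: float = 9.0) -> str:
    if ttc >= theta:
        return "none"         # above the TTC threshold: no stimulus
    if distance_m <= precise_range_m:
        return "precise"      # exact lane known: static location encoding
    return "uncertainty" if uc_enabled else "none"

print(select_signal(ttc=6.0, distance_m=20.0, uc_enabled=False))  # precise
print(select_signal(ttc=6.0, distance_m=60.0, uc_enabled=True))   # uncertainty
print(select_signal(ttc=6.0, distance_m=60.0, uc_enabled=False))  # none
```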

#### *2.5. Procedure*

The study was structured into five experimental and two familiarization blocks. The two familiarization blocks had the purpose of introducing the participants to the driving simulator and the tactile interface. The first familiarization procedure was carried out according to guidelines specified by Hoffmann and Buld [60]. This procedure aimed at reducing the probability of causing simulator sickness by gradually increasing exposure to virtual accelerations. The second familiarization scenario allowed the driver to explore the direction and temporal proximity encoding provided by the tactile interface in a scenario where the machine was certain (precise signal). In the five experimental blocks, the participant's task was to maintain a speed of 120 km/h where possible and to avoid collisions with other vehicles. All scenarios consisted of a straight two-lane highway. To control for potential learning effects, the order in which experimental conditions were conducted was varied between participants. Half of the participants started with the two uncertainty communication conditions and half without. Foggy scenarios and rain scenarios were alternated. Before the uncertainty communication conditions, participants were verbally instructed by the experimenter about the machine limitations as follows—"In this section, you will drive through rain/a tunnel. Therefore, the vehicle is less certain about the locations of vehicles that are further away". The following sections further detail the design of the scenarios. Conceptually, each scenario followed the same structure: to maintain the target speed of 120 km/h, the driver had to detect and overtake slower cars approaching on the left or right lane from the front, and avoid faster cars that approached at 160 km/h from the rear, possibly changing lanes for overtaking.

#### 2.5.1. Familiarization—System Exploration Scenario

The scenario consisted of a two-lane highway on a sunny day. Participants were not informed about the functionality of the tactile interface and were asked to maintain a speed of 120 km/h where possible. Since vehicles on the passing lane were designed to drive faster than the target speed, the task was most easily fulfilled by driving on the rightmost lane. However, vehicles on the right lane that were trailed by the ego-vehicle would occasionally slow down, forcing the participant to either overtake via the left lane or brake to avoid a collision. These instances ensured that the time to collision between the ego-vehicle and its surrounding vehicles dropped below the threshold value of nine seconds, causing exposure to the tactile stimuli (the precise signal). After five minutes of driving, participants were asked to park their car on the emergency lane, and the system exploration scenario was stopped. Participants were then asked what they thought the tactile stimuli communicated, and they were informed about the true nature of the assistance function. This scenario was similar to the experimental scenario by Krüger et al. [53,54], who found that participants were able to develop an intuitive understanding of the stimuli within four minutes of system exposure. Similarly rapid user understanding times for directional tactile displays were described by Cassinelli et al. [40] and Hogema et al. [61].

#### 2.5.2. Experimental Block-Foggy Road: Machine Certain, Human Uncertain (MC-HU)

The foggy road scenario was simulated as a night-time scenario, designed to make the human uncertain by inserting a dense fog field and disabling the lights of surrounding traffic. The fog was parameterized to limit the look-ahead distance to about 33 m (Figure 5), corresponding to a look-ahead time of about one second assuming the driver drove at the target speed. A temporal distance of one second has been suggested as the threshold below which a driving situation can be considered critical [62,63]. We assumed that this look-ahead distance would induce uncertainty in drivers, as they would need to be continuously prepared for the occurrence of a critical situation.

**Figure 5.** Visibility in the foggy scenarios. Vehicles disappear at a distance of approximately 33 m.

Machine observations were not affected by the mist or darkness, so a precise signal was communicated for vehicles driving at any distance away from the ego-vehicle. The experimenter triggered the onset of a target vehicle approaching the ego-vehicle according to a fixed script. This approach allowed for an easy verification that participants were driving at the approximate target speed, which was a prerequisite for the correct situation development. When a command was given, the target vehicle started approaching behind the fog barrier from one of the four possible lane directions (front, front-left, rear, rear-left). Vehicles coming from the rear were driving at a speed of 160 km/h. Vehicles in the front were driving at 80 km/h. As a consequence, the target vehicle would overtake (or be overtaken by) the ego-vehicle, assuming that the participant kept driving around the target speed of 120 km/h. Vehicles that approached from the rear on the right lane were programmed to change lanes and overtake the ego-vehicle at a distance of 30 m. After the target vehicle had passed and disappeared into the fog again, and the experimenter confirmed that the participant was driving at the target speed, the next target vehicle was launched. This procedure was carried out 14 times. Directions from which cars approached were pseudo-randomized.

#### 2.5.3. Experimental Block-Foggy Tunnel: Machine and Human Uncertain (MU-HU)

The foggy tunnel scenario was identical to the foggy road (MC-HU) scenario, except for the addition of a tunnel that ran for the entire course and a change in *sensor reliability* such that vehicles outside a 33 m radius from the ego-vehicle could at most be signaled via uncertainty communication as described in Section 2.3.1. Limitations of the look-ahead distance were the same as in the foggy road condition (33 m, 1 s) for the human. For comparability reasons, traffic definitions were identical to the foggy road scenario (MC-HU).

#### 2.5.4. Experimental Block-Rain: Machine Uncertain, Human Certain (MU-HC)

The rain scenario consisted of a straight road on a rainy day. The rain was visually present, though at an intensity that had little influence on the driver's visual perception. The reliability of the machine was negatively affected by the rain in the same manner as in the foggy tunnel scenario. That is, the look-ahead distance of the machine for precise direction identification and signaling was limited to 33 m. Because the driver's field of view was not obstructed, the traffic setup had to be organized differently compared to the fog conditions. The altered traffic profile for the rain scenario is explained in Figure 6.

**Figure 6.** Traffic definition in the rain (machine uncertain-human certain condition (MU-HC)) scenarios. Five vehicles were driving on the right lane at 80 km/h, spaced 250 m apart (I). The ego-vehicle could maintain the target speed (120 km/h) by overtaking the vehicles. When the front truck (E) was overtaken, a trigger point was activated that made the trailing cars B, C, and D switch to the left lane, and adjust their speed to 160 km/h (II). This resulted in B, C and D eventually overtaking the ego-vehicle from the rear. When D passed the ego-vehicle (III), the leading vehicle (A) accelerated to 160 km/h, and it changed to the left lane if it came within a distance of 30 m of the ego-vehicle.

#### 2.5.5. Dependent Measures

As dependent variables, we recorded subjective measures concerning the usefulness, satisfaction, and perceived workload in the different experimental conditions, as well as the overall understanding and experience. In addition, we were interested in objective measures that express effects on people's gaze behavior and their performance in the driving task.

We used three questionnaires for the subjective evaluation. These were used to gain insight into the subjective experiences that the different experimental conditions induced and to check whether the conditions were correctly perceived and understood.

#### 2.5.5.1. Task Load, Usefulness, Satisfaction

After each experimental condition, the NASA Task Load Index (Raw-TLX, [64]) assessment was conducted. Usefulness and satisfaction ratings were obtained using the Van der Laan acceptance scale [65].

#### 2.5.5.2. Understanding and Experience

Furthermore, after every experimental block, participants were asked to rate a number of statements on a 5-point Likert scale (strongly disagree to strongly agree). These statements were included to check if (a) the modulation of human perceptual confidence through environmental factors was successful, (b) the participants had understood the machine's level of uncertainty, and (c) participants experienced that the machine expressed its level of uncertainty.

#### 2.5.5.3. Gaze Distributions

The front gaze ratio was computed as the ratio of the number of gaze points on the windshield to the total number of gaze points on the mirrors and windshield (Equation (4)). A higher front gaze ratio indicates that the driver allocated more attention towards the front; a lower front gaze ratio indicates that the driver allocated more attention towards the rear. By means of this measure, we aimed to evaluate whether the uncertainty communication caused shifts in visual attention towards the direction of the presented signal.

$$\text{front gaze ratio} = \frac{\text{gaze count on windshield}}{\text{gaze count on windshield} + \text{gaze count on mirrors}}.\tag{4}$$
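As an illustration, Equation (4) can be computed directly from per-trial gaze counts. The function name and the sample counts below are hypothetical and are not part of the authors' analysis code:

```python
def front_gaze_ratio(windshield_count: int, mirror_count: int) -> float:
    """Equation (4): share of gaze samples on the windshield relative to
    all gaze samples on windshield and mirrors. Values near 1 indicate
    attention to the front; lower values indicate more mirror glances."""
    total = windshield_count + mirror_count
    if total == 0:
        raise ValueError("no gaze samples recorded")
    return windshield_count / total

# e.g., 850 windshield samples vs. 150 mirror samples
print(front_gaze_ratio(850, 150))  # → 0.85
```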

#### 2.5.5.4. Trial Safety

Trial safety was operationalized as the *Minimum Time-to-Contact* (MTTC) recorded in each trial in any direction. The MTTC can be understood as a conservative measure of safety that only takes into account the smallest recorded TTC and thus indicates how dangerous a trial became at its most critical moment (see, e.g., References [20,54]).
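Assuming the standard definition of TTC as the remaining gap divided by the closing speed (Equation (1) is not reproduced in this section), the MTTC of a trial can be sketched as follows; the function names and the sample values are illustrative assumptions:

```python
def ttc(gap_m: float, closing_speed_mps: float) -> float:
    """Time-to-contact: remaining gap divided by closing speed.
    Infinite when the vehicles are not closing in on each other."""
    if closing_speed_mps <= 0:
        return float("inf")
    return gap_m / closing_speed_mps

def mttc(ttc_series):
    """Minimum Time-to-Contact over a trial: the single most critical
    moment, used as a conservative per-trial safety measure."""
    return min(ttc_series)

# a vehicle 40 m behind, closing at (160 - 120) km/h ≈ 11.1 m/s
print(round(ttc(40.0, (160 - 120) / 3.6), 2))  # → 3.6
print(mttc([5.2, 3.1, 2.4, 4.8]))              # → 2.4
```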

#### 2.5.5.5. Trial Definition

We restricted the analysis of gaze distributions and safety to specific periods of interest, which we refer to as trials. A trial occurred for every vehicle that overtook or was overtaken by the ego-vehicle. The starting point of a trial was set to the moment at which the time-to-passing (TTP) of a surrounding vehicle dropped below nine seconds. Here, we defined the TTP as the time it would take until two vehicles pass each other if they maintained their current velocities. The TTP can be understood as a TTC (see Equation (1)) without the requirement of being on a collision trajectory. We set the end point of a trial to the moment at which the ego-vehicle and the other vehicle passed each other.
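The trial definition above can be sketched as a small segmentation routine. The array-based layout and names below are hypothetical assumptions, not the authors' pipeline:

```python
TTP_THRESHOLD_S = 9.0

def extract_trial(times_s, gaps_m, closing_mps):
    """Slice one trial from a recording: it starts when the
    time-to-passing (TTP = gap / closing speed) first drops below
    9 s and ends when the vehicles pass each other (gap reaches 0)."""
    start = None
    for i, (gap, v) in enumerate(zip(gaps_m, closing_mps)):
        ttp = gap / v if v > 0 else float("inf")
        if start is None and ttp < TTP_THRESHOLD_S:
            start = i
        if gap <= 0 and start is not None:
            return times_s[start], times_s[i]  # (trial start, trial end)
    return None

# gap shrinks from 100 m to 0 m at a constant 10 m/s closing speed
print(extract_trial([0, 1, 2, 3, 4],
                    [100, 50, 20, 5, 0],
                    [10, 10, 10, 10, 10]))  # → (1, 4)
```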

#### *2.6. Analysis*

We split the analysis of the data into three parts—(1) custom questionnaire data, (2) subjective data on perceived workload as well as on perceived system acceptance in terms of usefulness and satisfaction, and (3) objective behavioral and performance data, including gaze distribution results and measures of trial safety. To rule out potential confounds, we only ran statistical tests between experimental conditions that shared the same traffic profiles. While the differences in traffic profiles prevented comparisons between fog and rain conditions, this design choice did not impair the investigation of our research hypotheses. It allowed us to prioritize internal validity through the implementation of scenarios that contained credible sources of uncertainty for each environmental condition.

Statistical analysis was carried out using the *scipy* python library. Plots were generated using the python packages *matplotlib* and *seaborn*.

#### 2.6.1. Custom Questionnaire Data—H1 (Understanding)

Custom questionnaire data for all conditions were analyzed descriptively based on median responses and interquartile ranges. According to H1, we expected participants to indicate understanding of the uncertainty encoding stimuli.

#### 2.6.2. Acceptance and Workload—H2 (Subjective Benefit) and H3 (Disturbance)

Usefulness and satisfaction scores were obtained by mapping subsets of Van der Laan questionnaire responses to two respective scales in the [−2, 2] range (see [65]). Figure 7 illustrates the outcomes we would expect for usefulness, satisfaction, and workload under our research hypotheses H2 and H3. We expected usefulness and satisfaction to be higher in human uncertain (HU) conditions with uncertainty communication than without it. We further assumed an advantage of the machine certain (MC—red) condition over the uncertainty communication (dark blue) condition due to the higher information gain achievable by precise signals. On the other hand, for cases with higher human certainty (HC—green), we expected information from uncertainty communication to be redundant and therefore to yield no advantage over an omission of signals in the uncertainty range. However, under H3, no disadvantage from redundant uncertainty communication was assumed either.
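As a rough sketch of the Van der Laan mapping: the exact item order, wording, and which items are mirrored (and therefore reverse-coded before this step) should be taken from [65]; the nine-item layout assumed below is purely illustrative:

```python
# Assumed layout: nine 5-point items coded -2..+2, with usefulness
# averaged over items 1, 3, 5, 7, 9 and satisfaction over items
# 2, 4, 6, 8, after mirrored items have been reverse-coded.
def van_der_laan_scores(items):
    usefulness = [items[i] for i in (0, 2, 4, 6, 8)]
    satisfaction = [items[i] for i in (1, 3, 5, 7)]
    return (sum(usefulness) / len(usefulness),
            sum(satisfaction) / len(satisfaction))

print(van_der_laan_scores([2, 1, 2, 1, 2, 1, 2, 1, 2]))  # → (2.0, 1.0)
```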

For workload, measured as the NASA Task Load Index (Raw-TLX [64]), the expected relationship would be reversed, because we treat workload and benefit as inversely related: a high workload reflects a low benefit, whereas a low workload can indicate a higher benefit.
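A minimal sketch of the Raw-TLX computation, assuming six subscale ratings that are already on a 0–100 scale (the unweighted variant of the NASA-TLX, as also used for Figure 10c):

```python
def raw_tlx(subscales):
    """Raw-TLX: unweighted mean of the six NASA-TLX subscales
    (mental, physical, and temporal demand, performance, effort,
    frustration), each rated on a 0-100 scale."""
    if len(subscales) != 6:
        raise ValueError("expected six subscale ratings")
    return sum(subscales) / 6

print(round(raw_tlx([40, 10, 55, 30, 45, 20]), 1))  # → 33.3
```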

We compared scores of the human uncertain conditions (MC-HU, MU-HU-uc, MU-HU—red, blue) using Friedman tests and post-hoc one-sided Wilcoxon signed-rank tests with Bonferroni-adjusted alpha levels for repeated testing. As there were only two human certain conditions (MU-HC-uc, MU-HC—green), we directly compared scores for these conditions using Wilcoxon signed-rank tests with Bonferroni-adjusted alpha levels.
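This test procedure can be sketched with *scipy*. The per-participant scores below are randomly generated placeholders, and the Bonferroni-adjusted alpha is shown here for a family of three pairwise comparisons; the adjusted levels in the study differ per comparison family:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# hypothetical per-participant usefulness scores (n = 14) for the
# three human-uncertain conditions; substitute the real data
mc_hu = rng.normal(1.2, 0.4, 14)
mu_hu_uc = rng.normal(1.0, 0.4, 14)
mu_hu = rng.normal(-0.2, 0.4, 14)

# omnibus test across the three repeated-measures conditions
chi2, p = stats.friedmanchisquare(mc_hu, mu_hu_uc, mu_hu)
print(f"Friedman: chi2 = {chi2:.2f}, p = {p:.4f}")

# post-hoc one-sided Wilcoxon signed-rank tests with a
# Bonferroni-adjusted alpha for three pairwise comparisons
alpha = 0.05 / 3
for a, b, label in [(mu_hu_uc, mu_hu, "MU-HU-uc > MU-HU"),
                    (mc_hu, mu_hu, "MC-HU > MU-HU"),
                    (mc_hu, mu_hu_uc, "MC-HU > MU-HU-uc")]:
    w, p_pair = stats.wilcoxon(a, b, alternative="greater")
    print(f"{label}: w = {w:.1f}, p = {p_pair:.4f}, "
          f"significant = {p_pair < alpha}")
```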


**Figure 7.** Predicted outcome of subjective evaluations according to our research hypotheses when assuming successful experimental manipulations. Usefulness and satisfaction: Symbols +,0 are used to illustrate the predicted valuation. Relative workload predictions were given verbally. For machine uncertain conditions (blue and green), the light colors mark conditions without uncertainty communication. Their dark counterparts indicate uncertainty communication.

#### 2.6.3. Gaze Distribution and Safety—H4 (Safety)

Figure 8 illustrates the outcome that we would expect for safety and gaze guidance under H4. While gaze guidance is not directly subsumed under the *benefit* term, we understand it here as a behavioral indicator of an influence on people's information sampling, which relates to our second and fourth hypotheses. The assistance system primes relevant regions of interest through tactile stimuli, which may prompt users to shift their gaze accordingly in order to acquire additional information or visual confirmation. Under H2 and H4, we would therefore expect gaze guidance to be observable for conditions in which the system can provide novel information, that is, the machine certain (MC—red) and human uncertain with uncertainty communication (MU-HU-uc—dark blue) conditions. In contrast, according to H3, this should not be the case when human uncertainty is equal to or lower than machine uncertainty (light blue and green).

Prior to the gaze distribution analysis, we filtered the data to only include trials in which vehicles approached from behind. As driving requires frontal visual attention most of the time, especially under low-visibility conditions, a comparison of front gaze ratios is more meaningful for situations in which safety-relevant events take place behind the ego-vehicle. Due to the presence of outliers and a violation of the normality assumption, we compared the front gaze ratios of the human uncertain conditions (MC-HU, MU-HU-uc, MU-HU—red, blue) using Friedman tests and post-hoc one-sided Wilcoxon signed-rank tests with Bonferroni-adjusted alpha levels for repeated testing. As there were only two human certain conditions (MU-HC-uc, MU-HC—green), we directly compared the front gaze ratios for these conditions using one-sided Wilcoxon signed-rank tests with Bonferroni-adjusted alpha levels.

For the analysis of safety, we focused on the human uncertain conditions and on trials in which vehicles approached from the front right lane, because these trials required corrective actions by the driver to ensure safety. In line with H4, we expected safety to be highest in the machine certain (MC—red) condition, lowest in the absence of >33 m signaling (MU-HU—light blue), and intermediate with uncertainty communication enabled (MU-HU-uc—dark blue). MTTC scores (see Section 2.5.5.4) were calculated for each trial, and mean MTTC scores per participant and condition were compared using a Friedman test and post-hoc one-sided Wilcoxon signed-rank tests with Bonferroni-adjusted alpha levels for repeated testing.


**Figure 8.** Predicted outcome of behavioral measures according to our research hypotheses when assuming successful experimental manipulations through the introduced conditions. For machine uncertain conditions (blue and green), the light colors mark conditions without uncertainty communication. Their dark counterparts indicate uncertainty communication.

#### **3. Results**

#### *3.1. Subjective Reports*

#### 3.1.1. Custom Questionnaire—H1 (Understanding)

Response distributions for the eight Likert items of our customized questionnaire are shown in Figure 9 for each experimental condition. For the human uncertain conditions, participants strongly indicated the weather conditions as a cause for feeling unconfident, whereas other road users had a smaller influence and belt signals did not negatively affect confidence. For the human certain conditions, none of these three factors reduced confidence. These ratings suggest that our experimental manipulation of human uncertainty through different weather conditions was successful. Statements 4 and 5 targeted the understanding of the tactile stimuli and the machine uncertainty state. In support of H1, participants generally identified system uncertainty when it was present (MU), especially with uncertainty communication (uc), and correctly indicated its absence (MC). This suggests that the state transparency achieved by the uncertainty communication supported system state understanding. The last three statements were included to estimate which modalities the participants relied on during the different conditions. Reliance on own capabilities and visual sensing was highest in the human certain conditions (HC). For the human uncertain conditions (HU), reliance on the tactile stimuli was high, especially in the machine certain (MC) and machine uncertain + communication (MU-HU-uc) conditions. This was no longer the case when the uncertainty communication was disabled (MU-HU). In support of H2 and H3, this suggests that participants utilized the tactile stimuli depending on system reliability and their own confidence state. In summary, participant responses suggest that the experimental manipulations worked as intended and induced different levels of congruency between human and machine perceptual uncertainty.

**Figure 9.** Median agreement ratings (square) and 25th and 75th percentiles on a custom 5-point Likert scale questionnaire. SD = Strongly Disagree, D = Disagree, N = Neutral, A = Agree, SA = Strongly Agree.

#### 3.1.2. Usefulness and Satisfaction—H2 (Subjective Benefit) and H3 (Disturbance)

An overview of the usefulness and satisfaction scores that were obtained in each experimental condition can be found in Figure 10b. As expected, the overall highest score was found for the machine certain and human uncertain condition (MC-HU). The overall lowest score was obtained for the machine uncertain-human certain condition (MU-HC). We were interested in comparing conditions within a given level of human certainty, that is a comparison between the three human uncertain conditions (HU—red and blue) and between the two human certain conditions (HC—green).

The human uncertain conditions (MC-HU, MU-HU-uc, MU-HU) differed significantly in usefulness, *χ*2(2) = 20.87, *p* < 0.001 (<*α* = 0.025), as well as in satisfaction scores, *χ*2(2) = 16.62, *p* < 0.001 (<*α* = 0.025). Post-hoc comparisons revealed that usefulness was rated significantly higher with uncertainty communication enabled (MU-HU-uc—dark blue) than disabled (MU-HU—light blue), MU-HU-uc vs. MU-HU: *w* = 0.0, *p* < 0.001 (<*α* = 0.008), where *w* denotes the sum of the ranks of the differences above zero (in contrast to the test statistics of many parametric tests, a small value of *w* is therefore a strong indicator of consistent and significant differences). Similarly, usefulness in the machine certain condition (MC—red) was rated significantly higher than in the machine uncertain condition without uncertainty communication (MU-HU), MC-HU vs. MU-HU: *w* = 0.0, *p* < 0.001 (<*α* = 0.008). However, there was no significant difference in usefulness ratings between the machine certain (MC-HU) and the uncertainty communication condition (MU-HU-uc), MC-HU vs. MU-HU-uc: *w* = 32.0, *p* = 0.289 (>*α* = 0.008). The same pattern of results was observed for the satisfaction ratings, MU-HU-uc vs. MU-HU: *w* = 10.5, *p* = 0.004 (<*α* = 0.008), MC-HU vs. MU-HU: *w* = 0.0, *p* < 0.001 (<*α* = 0.008), MC-HU vs. MU-HU-uc: *w* = 34.5, *p* = 0.219 (>*α* = 0.008).


**Figure 10.** Results of subjective measures for different conditions. Conditions are visually represented by distinct colors. For machine uncertain conditions (blue and green), the light colors mark conditions without uncertainty communication. Their dark counterparts indicate uncertainty communication. (**a**) Mean usefulness, satisfaction, and NASA TLX scores for each condition. Standard deviations are shown in brackets. Asterisks indicate statistically significant differences between conditions linked by brackets; (**b**) Mean usefulness and satisfaction scores of the assistance functionality in MC-HU (Foggy Road), MU-HU-uc (Foggy Tunnel), MU-HC-uc (Rain), MU-HU (Foggy Tunnel, no UC), MU-HC (Rain, no UC). Error bars display the standard deviation; (**c**) NASA Raw TLX scores per condition. Scores of individual questions were averaged to obtain the overall RTLX score in the range [0,100].

These results support the prediction driven by H2 that usefulness and satisfaction ratings should be higher with enabled than disabled uncertainty communication. However, contrary to our assumption, no advantage of the machine certain (MC-HU) over the uncertainty communication (MU-HU-uc) condition, reflecting a difference in potential information gain, could be confirmed.

Also for the human certain conditions (HC—green), we found that usefulness was rated significantly higher with uncertainty communication enabled (MU-HC-uc) than disabled (MU-HC), MU-HC-uc vs. MU-HC: *w* = 16.5, *p* = 0.012 (<*α* = 0.05). For the satisfaction ratings, the difference between the human certain conditions was not significant, MU-HC-uc vs. MU-HC: *w* = 21.0, *p* = 0.429 (>*α* = 0.05). While average satisfaction ratings were roughly neutral for both conditions, the usefulness of the late-supporting system without uncertainty communication (MU-HC) was judged negatively. The roughly neutral average usefulness ratings for the uncertainty communication condition support our predictions made under H3, presumably because the additional information was neither needed nor disturbing.

#### 3.1.3. Workload—H2 (Subjective Benefit) and H3 (Disturbance)

NASA TLX workload ratings (Figure 10c) differed significantly between human uncertain conditions (MC-HU, MU-HU-uc, MU-HU), *χ*2(2) = 11.66, *p* = 0.003 (<*α* = 0.05). Post-hoc comparisons revealed that workload was rated significantly lower with uncertainty communication enabled (MU-HU-uc—dark blue) than disabled (MU-HU—light blue), MU-HU-uc vs. MU-HU: *w* = 14.0, *p* = 0.008 (<*α* = 0.016). Also in the machine certain condition (MC—red), workload was rated significantly lower than in the machine uncertain condition without uncertainty communication (MU-HU), MC-HU vs. MU-HU: *w* = 1.0, *p* = 0.001 (<*α* = 0.016). These results confirm the prediction that workload should be reduced when enabling uncertainty communication and thus support H2. However, differences in subjective workload between the machine certain (MC-HU) and the uncertainty communication condition (MU-HU-uc) were not significant, MC-HU vs. MU-HU-uc: *w* = 19.0, *p* = 0.032 (>*α* = 0.016). In contrast to H2, an assumed advantage of the machine certain (MC-HU) over the uncertainty communication (MU-HU-uc) could therefore not be confirmed.

For the human certain conditions (HC—green), workload ratings were comparably low and did not differ significantly between conditions with uncertainty communication enabled (MU-HC-uc—dark green) and disabled (MU-HC—light green), MU-HC-uc vs. MU-HC: *w* = 31.0, *p* = 0.310 (>*α* = 0.05). When contrasted with results from the human uncertain (HU) conditions, the low averages and the lack of difference in satisfaction and workload between the two human certain (HC) conditions may be seen as support for H3. However, due to the use of different driving profiles, a formal comparison of differences would not be valid.

#### *3.2. Gaze Distribution—H2 (Subjective Benefit) and H4 (Safety)*

Figure 11b shows the ratio of gaze points on the front (front window) divided by front + back (front window + mirrors). Front gaze ratios differed significantly between the human uncertain conditions (MC-HU, MU-HU-uc, MU-HU) for trials in which vehicles approached from the back, *χ*2(2) = 16.0, *p* < 0.001 (<*α* = 0.05). Post-hoc comparisons revealed that the front gaze ratios were significantly lower with uncertainty communication enabled (MU-HU-uc—dark blue) than disabled (MU-HU—light blue), MU-HU-uc vs. MU-HU: *w* = 0.0, *p* < 0.001 (<*α* = 0.016). Also in the machine certain condition (MC—red), front gaze ratios were significantly lower than in the machine uncertain condition without uncertainty communication (MU-HU), MC-HU vs. MU-HU: *w* = 2.0, *p* < 0.001 (<*α* = 0.016). Differences in front gaze ratios between the machine certain (MC-HU) and the uncertainty communication condition (MU-HU-uc) were not significant, MC-HU vs. MU-HU-uc: *w* = 14.0, *p* = 0.007 (<*α* = 0.016 but *w* > *w*critical = 12).

Between human certain conditions (MU-HC, MU-HC-uc—green), differences between front gaze ratios could not be regarded as significant for trials in which vehicles approached from the back, MU-HC vs. MU-HC-uc: *w* = 24.0, *p* = 0.037 (>*α* = 0.016 and *w* > *wcritical* = 12). These findings indicate an increased overt attention guidance for conditions in which the assistance can provide novel relevant information. They are therefore in line with our predictions (see Figure 8) made under H2 and H4.

For comparison, for situations in which vehicles approached from the front (Figure 11c), the gaze distributions substantially shifted to the front (MU-HC: *M* = 0.92, *SD* = 0.05; MU-HC-uc: *M* = 0.91, *SD* = 0.06; MU-HU: *M* = 0.97, *SD* = 0.02; MU-HU-uc: *M* = 0.96, *SD* = 0.04; MC-HU: *M* = 0.94, *SD* = 0.07) across all conditions. Differences between uncertainty communication and no uncertainty communication diminished, as stimuli with uncertain direction encoding only drew attention to front regions.



**Figure 11.** Results of objective measures for different conditions. Conditions are visually represented by distinct colors. For machine uncertain conditions (blue and green), the light colors mark conditions without uncertainty communication. Their dark counterparts indicate uncertainty communication. (**a**) Mean front gaze ratios and MTTC scores for each applicable condition. Standard deviations are shown in brackets. Asterisks indicate statistically significant differences between conditions linked by brackets; (**b**) Gaze ratio for conditions in which the machine was uncertain and for trials in which vehicles were approaching from the rear. Lower values indicate more gazing towards the mirrors. Due to failed eye tracking recordings, *n* = 13 (instead of 14) for all conditions; (**c**) Gaze ratio for conditions in which the machine was uncertain and for trials in which vehicles were approaching from the front.

#### *3.3. Trial Safety—H4 (Safety)*

Figure 12 displays the MTTC scores for human uncertain conditions. We only considered the data of the human uncertain (HU—blue and red) conditions for statistical tests. MTTCs differed significantly between human uncertain conditions (MC-HU, MU-HU-uc, MU-HU), *χ*2(2) = 24.14, *p* < 0.001 (<*α* = 0.05). We found that the MTTCs were significantly higher for the MU-HU-uc condition (*M* = 2.59 s, *SD* = 0.88) than for the MU-HU condition (*M* = 1.24 s, *SD* = 0.46); *w* = 4.0, *p* = 0.001 (<*α* = 0.016). Furthermore, driving safety in terms of MTTC was also significantly higher in the MC-HU condition (*M* = 3.92 s, *SD* = 1.11) than in the MU-HU-uc condition, *w* = 7.0, *p* = 0.002 (<*α* = 0.016) and the MU-HU condition, *w* = 0.0, *p* < 0.001 (<*α* = 0.016). In poor visibility conditions (MU), imprecise tactile direction signaling (MU-HU-uc) appears superior to a variant only capable of signaling specific, reliable observations within a substantially constrained spatial range (MU-HU). In accordance with H4, participants thus seem to have taken advantage of the information available in the tactile stimuli to adjust their driving behavior for achieving higher safety.

**Figure 12.** Minimum Time-to-Contact (MTTC) scores for human-uncertain conditions (*n* = 14).

#### **4. Discussion**

In the present driving simulator study, we investigated the effects of a novel approach to encode spatial uncertainty in the stimuli of a vibrotactile assistance system. We aimed at evaluating the influence of the uncertainty communication on subjective measures indicative of perceived usefulness, satisfaction, and workload, as well as on behavioral measures, that is, driving safety and gaze allocation. We assumed that any effect of the uncertainty communication would be influenced by the relation of spatial uncertainty in human perception and the assistance system. Therefore, we experimentally varied the driving scenarios to simulate machine uncertainty (tunnel + fog, rain) and to induce human uncertainty (fog, tunnel + fog). We found that our suggested uncertainty communication mode was understood by participants and had significant effects on both subjective and objective behavioral measures. Thereby the utility of the system seemed to depend on the driver's perceptual confidence state. In our experiment, the uncertainty communication was regarded as beneficial and had a measurable influence on driver behavior in cases where the human driver was uncertain as well.

#### *4.1. Signal Understanding and Experiment Validation*

A prerequisite to this study was that our environmental scenario manipulations had the effect that we intended. Data from our custom questionnaire indicate that this was indeed the case. Participants reported that they felt uncertain due to the weather conditions and agreed that they relied more on the belt signal than on their own perception in the human uncertain conditions. Furthermore, participants experienced higher workload in the human uncertain conditions compared to the human certain conditions.

Besides, we were interested in the participants' subjective agreement on understanding the manipulation of machine uncertainty and the respective uncertainty communication signal. This was important to further validate our experimental procedure and the design of our uncertainty signal. Participants indicated that they had understood when the machine was uncertain and that they understood the meaning of the signal. Interestingly, they seemed to have noticed the machine uncertainty more strongly in the conditions in which the uncertainty communication was enabled, which suggests that this feature helped to make the machine state more transparent. Taken together, in support of hypothesis H1 (Understanding), these results indicate that our experimental manipulations were valid and that participants seemed to have an appropriate understanding of the uncertainty communication.

An important difference between earlier studies that have demonstrated successful communication of uncertainty (e.g., References [13,16,23]) and the work presented here is that we relied on an *implicit* representation of uncertainty in the tactile modality: the uncertainty component was encoded within the spatiotemporal signaling functionality of our vibrotactile interface. Instead of explicitly stating "I am uncertain", the machine agent implicitly communicates uncertainty by being less specific in its display of the location of objects. We argue that the distinction between *implicit* and *explicit* uncertainty communication may be useful for the future design of reliability displays. Implicit uncertainty communication is characterized by an increase in ambiguity or vagueness, or a decrease in the specificity of the presented information. One example of implicit uncertainty communication that we encountered in the literature is by Finger and Bisantz [14], who added distortions to an image to make it increasingly difficult to identify the underlying image.

#### *4.2. Uncertainty Signaling in Human Uncertain Conditions*

In terms of behavioral adaptations and user acceptance, we found substantial differences in the results between the human certain and the human uncertain conditions. In particular, in case of both human and system uncertainty, uncertainty communication was perceived as significantly more useful and satisfying compared to the no uncertainty communication conditions. Uncertainty communication also yielded significantly lower workload, increased driving safety and more strongly guided gaze behavior, indicating that more attention was allocated towards the direction of the uncertainty signal. These results support hypotheses H2 (Subjective Benefit) and H4 (Safety) by showing that the vibrotactile uncertainty communication had beneficial effects on driving comfort and safety.

In the human uncertain conditions, the uncertainty communication signal was not perceived as significantly different from the precise signal in terms of perceived usefulness, satisfaction, or perceived workload. This is somewhat surprising, as one might expect participants to value the full information provided by the precise signal more highly than the more ambiguous uncertainty signal. Overall, this outcome indicates that making the vehicle's perceptual state transparent is appreciated by participants. Our results suggest that users are still satisfied with the directional cues and recognize the usefulness of the uncertainty signals, despite their lower information specificity. However, in the case of driving safety, we observed a significant advantage of the precise signal over the uncertainty communication signal. That is, we observed the safest driving behavior in terms of MTTC scores in the conditions where the machine's sensory capabilities were unaffected by the environment.

We conclude that the precise signal was appropriately used by participants to acquire a more accurate understanding of the direction of surrounding objects. This finding is in line with the reports by Krüger et al. [53,54], who found that participants rapidly gained an understanding of vibrotactile stimuli and displayed safer driving behavior when using the same vibrotactile assistance with a precise signal mode compared to driving without it.

#### *4.3. Uncertainty Signaling in Human Certain Conditions*

Analysis of the eye-tracking data revealed that visual attention distributions were affected significantly by the uncertainty signaling in scenarios in which human visibility was limited (human uncertain conditions), but not in the human certain conditions. Furthermore, usefulness and satisfaction ratings showed neutral ratings in the human certain conditions. In agreement with hypothesis H3 (Disturbance) this suggests that there is no direct disadvantage but also no benefit in sharing observations continuously when the human is confident.

For successful human-machine cooperation [7,8] or teaming, a human mental representation of system uncertainty may not be enough. When the machine also has a representation of human confidence in different environments, it allows the machine to decide under what conditions to provide support to the user. However, such a selective and presumably personalized communication could induce confusion when violating a user's assumptions on what the machine is communicating. In this example, it might not even be possible for a user to unambiguously distinguish between cases in which the machine is not providing stimuli because it has not detected a potential collision event and cases in which it has selectively disabled communication because it could confirm that the user has a sufficient scene understanding. Selectively deactivating systems that implicitly encode the absence of issues through an absence of stimuli could therefore be problematic but may be an important challenge to tackle in the design of future driving assistance systems.

#### *4.4. Limitations*

Despite the relatively small sample size, the results show clear statistical significance and accordingly provide support for the benefits of uncertainty communication. A limitation of the current study is that the sample (technically schooled, 13/14 male) was not balanced to be representative of a diverse population. Consequently, inferences are restricted to mostly male drivers younger than 42 years. It is well known that age is associated with sensory and cognitive decline [66]. However, prior work on sensory integration [67] and proximity alerting [68] suggests a positive relationship between age and multimodal facilitation effects such as reaction-time shortening. Future work should investigate whether such a relationship also exists with our system. Another limitation comes from the restriction to highly challenging situations in the cases with human uncertainty. An advantage of the fast succession of safety-critical situations is that it ensured exposure to the functionality of the device, which currently only provides stimuli when operating outside a safety margin; in safe conditions, the system does not produce any stimuli. The fact that the system proved its usefulness in challenging situations can be seen as a strength. However, we do not know whether the observed effects would persist with less frequent system activation under more common traffic conditions. Future work could address this issue by implementing easier scenarios in which a participant encounters fewer safety-critical events over a longer overall exposure time.

#### *4.5. Conclusions*

Taken together, the study yields new insights into the communication of directional uncertainty by a driving assistance system in the tactile modality. We found that an implicit encoding of spatial uncertainty in a vibrotactile interface was easily understood and used by participants, and that its impact on drivers depended on the drivers' sense of certainty. Importantly, when the human driver was uncertain, the uncertainty communication signal was perceived as being as useful and satisfying as a precise signal from the assistance system. Along with previous literature, our findings stress the value and importance of communicating appropriate information and making machine states transparent to the user. Our results suggest that the tactile modality is a suitable candidate for communicating such information to the user unobtrusively and intuitively while potentially circumventing the risks and challenges that an additional use of the visual modality would introduce.

**Author Contributions:** Conceptualization, M.K., T.D., C.B.W.-H. and J.C.F.d.W.; Methodology, M.K., T.D., C.B.W.-H. and J.C.F.d.W.; Software, T.D. and M.K.; Validation, M.K. and J.C.F.d.W.; Formal analysis, T.D. and M.K.; Investigation, T.D.; Resources, H.W.; Data curation, T.D. and M.K.; Writing–original draft preparation, T.D. and M.K.; Writing–review and editing, M.K., C.B.W.-H., J.C.F.d.W., H.W. and T.D.; Visualization, T.D. and M.K.; Supervision, M.K., C.B.W.-H., J.C.F.d.W. and H.W.; Project administration, C.B.W.-H. and H.W.; Funding acquisition, H.W.; All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Honda Research Institute Europe GmbH.

**Conflicts of Interest:** The authors declare no conflict of interest.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **Standardized Test Procedure for External Human–Machine Interfaces of Automated Vehicles**

**Christina Kaß 1,\*, Stefanie Schoch 1, Frederik Naujoks 2, Sebastian Hergeth 2, Andreas Keinath <sup>2</sup> and Alexandra Neukum <sup>1</sup>**


Received: 28 February 2020; Accepted: 23 March 2020; Published: 24 March 2020

**Abstract:** Research on external human–machine interfaces (eHMIs) has recently become a major area of interest in the field of human factors research on automated driving. The broad variety of methodological approaches renders the current state of research inconclusive and comparisons between interface designs impossible. To date, there are no standardized test procedures to evaluate and compare different design variants of eHMIs with each other and with interactions without eHMIs. This article presents a standardized test procedure that enables the effective usability evaluation of eHMI design solutions. First, the test procedure provides a methodological approach to deduce relevant use cases for the evaluation of an eHMI. In addition, we define specific usability requirements that must be fulfilled by an eHMI to be effective, efficient, and satisfying. To test whether an eHMI meets the defined requirements, we have developed a test protocol for the empirical evaluation of an eHMI in a participant study. The article elucidates the underlying considerations and details of the test protocol, which serves as a framework to measure the behavior and subjective evaluations of non-automated road users when interacting with automated vehicles in an experimental setting. The standardized test procedure provides a useful framework for researchers and practitioners.

**Keywords:** eHMI; standardized test procedure; use cases; test protocol; automated driving

#### **1. Introduction**

With the introduction of automated vehicles into mixed traffic environments, drivers may be (temporarily) allowed to engage in non-driving-related tasks while driving. As a consequence, the drivers of automated vehicles will often be unavailable for communication while their vehicle is interacting with non-automated road users. To address this change and to ensure safe interactions, there is broad agreement among practitioners and researchers that, in some situations, automated vehicles may need to replace the informal communication of human drivers (such as gestures and eye contact) with external human–machine interfaces (eHMIs) [1,2]. Currently, eHMI systems represent a completely new and immature technology. Before introducing such a new technological system to the market and to the traffic environment, it is important to carefully determine its usability.

Since 2017, a large body of research has been investigating the impact of different eHMI approaches on the subjective evaluations and behavior of non-automated road users. Previously studied eHMI approaches differed mainly with regard to the content of communication (e.g., maneuver intention, automation status, and request for action) [3] and concrete interface design solutions (e.g., the position and modality of the signal) [4–8]. Results are inconclusive regarding the benefit of using an eHMI to signal maneuver intentions of automated vehicles. In some studies, communicating the maneuver intention of the automated vehicle increased the subjective ratings of interaction partners in comparison to interactions without an eHMI [9–12]. In other studies, such eHMI concepts did not have any impact on pedestrians' perceived trust and safety [1] or even had a negative effect on pedestrians' workload during interactions with automated vehicles [13]. Moreover, it is still unclear whether communicating the vehicle's automation status with eHMI systems improves the subjective experiences of interaction partners. On the one hand, eHMI systems that signaled the automation status with light-emitting diode (LED) strips had a positive effect on pedestrians' emotional experience [11] and perceived safety [12] compared to interactions without an eHMI. On the other hand, other studies did not reveal an impact of communicating the automation status on pedestrians' perceived stress [14] and perceived safety [1], or on cyclists' reported behavior [15]. Furthermore, previous studies have offered contradictory findings on the effect of eHMI signals on the behavioral decisions of non-automated road users. eHMI concepts that communicated the vehicle's intention to stop [10] or gave a concrete request for action ("Walk!" or "Ok") [16] increased pedestrians' willingness to cross the road in a shared space compared to interactions without an eHMI. In addition, two studies found that pedestrians needed less time to make their decision to cross or not cross the road with than without an eHMI [6,17]. However, the results of [14,18] revealed that pedestrians focused to a higher degree on vehicle speed and distance to the vehicle when making crossing decisions than on eHMI signals. Deb et al. [19] found that a verbal warning saying "safe to cross" shortened the time pedestrians needed to cross the street compared to no eHMI, while different visual eHMI concepts had no effect on crossing time.

Overall, although extensive human factors research has been carried out on eHMIs, a systematic understanding of the usability of different eHMI concepts is still lacking. Previous research has used very different methodological approaches and has had methodological limitations. Methodological limitations include a lack of behavioral measurements [8,9], small sample sizes [12,13], and vague result reports [4,20]. In some studies, participants evaluated the eHMI after they had received a thorough briefing and explanation of the signal meanings [9,11,12]. In other studies, participants reported their subjective ratings of the situation even though some of them had not even perceived the eHMI [1,14]. Another limitation pertains to the research environments used in previous studies. Commonly used methods such as the Wizard of Oz technique, virtual reality (VR) pedestrian simulators, and video or photo studies use only simplified behavioral measurements, resulting in limited external validity. For example, participants were instructed to simply report their behavior [15], to press a button [10], or to take only one step forward to indicate their intention to cross [14]. The outlined methodological differences and limitations render comparisons of different eHMI variants impossible. Therefore, results are inconclusive regarding the required content of communication (e.g., maneuver intention, automation status, and detection feedback), interface design requirements (e.g., modality, position, and text or symbols), the operational design domain for eHMIs (e.g., urban environment, crosswalks, and intersections), and the role of the interaction partner (e.g., pedestrian, cyclist, or manual driver).

Furthermore, most studies have not provided an explanation for the selection of the use case under investigation. The majority of studies have examined interactions in urban areas in a low speed range where communication was required to negotiate the right of way. The most frequently investigated use cases so far have been interactions with pedestrians at crosswalks [4,6,9,12,17–19,21–23] or crossing situations with an ambiguous right of way, e.g., shared spaces or parking areas [1,8,10–14,16,18,20,24,25]. While prior work has already developed frameworks to derive use cases to test the in-vehicle HMIs of automated driving systems [26–29], there has been very limited research on taxonomies for use cases of eHMIs [30].

To date, there are no standardized test procedures to assess the usability of eHMIs of automated vehicles. There is no consensus on relevant use cases, evaluation requirements, and proper experimental designs yet. To advance the development of eHMIs, there is a necessity to standardize the evaluation process of eHMIs. Standardized test procedures allow for reliable and meaningful conclusions and enable comparisons between different studies and interface designs. Standardized methods already exist for other research areas of traffic psychology, e.g., for the evaluation of the in-vehicle HMIs of vehicles with automated driving systems [31] or to measure the eyes-off-road time as an indicator of distraction potential when interacting with in-vehicle information systems [32,33]. In their review article on the current state of research on eHMIs, Rouchitsas and Alm [34] declared that the "standardization of relevant procedures is a fundamental requirement for effective interface evaluations and meaningful comparisons. Therefore, future conceptual and empirical work in the field should primarily be concerned with producing standardized procedures for evaluating and comparing different implementations"(p. 10). The present article provides a response to this request. We propose a newly developed methodological framework that standardizes the usability evaluation process of eHMIs. This standardized test procedure consists of three parts:

- a methodological approach to deduce relevant use cases for the evaluation of an eHMI;
- usability requirements, with measurable parameters and pass/fail criteria, that an eHMI must fulfill to be effective, efficient, and satisfying;
- a test protocol for the empirical evaluation of an eHMI in a participant study.

#### **2. Methods and Results**

#### *2.1. Definition of Use Cases*

Prior research has mainly focused on vehicle–pedestrian interactions at crosswalks or at ambiguous crossing points in urban environments at low speed. However, this represents only a limited selection of the possible use cases of an eHMI. To evaluate the usability of an eHMI in a standardized way, it is important that study participants encounter the eHMI in a set of relevant use cases. Thus, the definition of relevant use cases is the core of each evaluation process [31], as it ensures that the test procedure generates meaningful and comparable results. Fuest, Sorokin, Bellem, and Bengler [30] published a taxonomy of traffic situations that is intended to serve as a basis for assessing the communication between automated vehicles and human road users. Their taxonomy provides an overview of attributes and associated value facets that are considered to influence implicit and explicit communication in traffic, e.g., the attribute "right of way" with the value facets automated vehicle, human road user, or undefined. To define a traffic situation, one can choose and combine the attributes and value facets that are relevant for the research question at hand. The combination of all listed value facets results in 373,248 situations. However, the authors do not provide instructions on how to deduce relevant use cases. Furthermore, the taxonomy lacks attributes that specify the approach direction of the interaction partners and the driving maneuver currently executed by the automated vehicle. Therefore, we developed a new methodological approach to deduce relevant use cases for a given eHMI.

We used a multi-stage, gradual methodological approach that aims to cover an exhaustive set of use cases of an eHMI. These use cases are subsequently reduced step by step by applying different filters. More specifically, the collection and combination of use cases and their specifications alternate with stepwise reductions of use cases based on redundancies and on theoretical and practical considerations. Figure 1 gives an overview of the procedure of this approach.

**Figure 1.** Overview of the methodological approach to select relevant use cases of an external human–machine interface (eHMI).

#### 2.1.1. Defining a Use Case of an eHMI

The basis of the approach was the definition of a use case of an eHMI. A use case of an eHMI is defined as a situation where an automated vehicle and at least one non-automated road user intend to "occupy the same region of space at the same time in the near future" [36]. This situation requires the interactive behavior of at least one involved road user to avoid a potential traffic conflict. Interactive behavior signifies that the road user adapts its initially planned behavior to the anticipated behavior of the other road user, e.g., by changing speed or trajectory. Traffic conflicts arise when "two or more road users approach each other in space and time to such an extent that a collision is imminent if their movements remain unchanged" [37]. The use of eHMIs as communication aids of automated vehicles can potentially support non-automated road users in understanding and anticipating the interactive behavior of the automated vehicle. From this, the users can draw conclusions for their own interactive behavior.

#### 2.1.2. System-Based Approach

The system-based approach was used to collect all possible driving maneuvers that an automated vehicle can execute. Driving maneuvers were divided into lateral and longitudinal maneuvers. Lateral maneuvers consist of driving straight ahead, turning (left, right), and changing lanes (left, right). When the vehicle is in motion, the longitudinal maneuvers are keeping a constant speed, decelerating, and accelerating. When at a standstill, the longitudinal maneuvers are keeping a constant speed (0 km/h), starting to drive forward, and reversing. Filter A (Figure 1) reduced the number of collected maneuvers based on the assumption that an eHMI should only be used in situations in which it adds benefit beyond conventional lighting. Consequently, all lateral maneuvers and the reversing maneuver were filtered out, as they can be signaled by the turn signal and the reversing light. In principle, acceleration, deceleration, and keeping a constant speed can be perceived by other road users by observing the automated vehicle. However, these cues are often very subtle, and eHMIs could support their perception by signaling these maneuver intentions prior to action execution. In conclusion, the resulting maneuvers are keeping a constant speed (while driving or at a standstill), accelerating (while driving or from a standstill), and decelerating (while driving).
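Under these assumptions, Filter A amounts to a simple set difference. The following sketch reproduces the filtering step; the maneuver names are taken from the text, while the list-based representation is our own illustrative choice:

```python
# Maneuver collection from the system-based approach.
lateral = ["straight ahead", "turn left", "turn right",
           "lane change left", "lane change right"]
longitudinal = ["constant speed", "decelerate", "accelerate",  # while driving
                "standstill", "start forward", "reverse"]      # at standstill

# Filter A: remove maneuvers already communicated by conventional lighting,
# i.e., all lateral maneuvers (turn signal) and reversing (reversing light).
filtered_by_lighting = set(lateral) | {"reverse"}
remaining = [m for m in lateral + longitudinal if m not in filtered_by_lighting]

print(remaining)
# ['constant speed', 'decelerate', 'accelerate', 'standstill', 'start forward']
```

The five remaining entries correspond to the constant-speed, acceleration, and deceleration variants listed in the text.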

#### 2.1.3. Generic Situation-Based Approach

The generic situation-based approach considered all factors that characterize interactions between traffic participants. This approach is generic because it does not consider the context in which a situation takes place, e.g., urban context, highway, intersection, or parking area. The first factor represents the intended moving direction of the interaction partner, which may be in the opposite direction to the automated vehicle, at a crossing angle to the automated vehicle, laterally approaching the automated vehicle in the same direction, and driving in the same direction as the automated vehicle (see first row of Figure 2). The second factor represents the position of the interaction partner relative to the automated vehicle (see second row of Figure 2). A combination of these two factors leads to certain combinations that would never result in traffic conflicts between the two traffic participants (see definition of eHMI use cases), e.g., when the interaction partner is located next to the automated vehicle while driving in the opposite direction. Filter B (Figures 1 and 2) was used to reduce those combinations that cannot lead to traffic conflicts. The remaining combinations represent situations that would result in traffic conflicts without the interactive behavior of at least one involved road user (see third row of Figure 2). We hypothesized that the driving direction of the interaction partner (left or right) and the exact start position of the interaction partner in a merging situation do not lead to relevant differences between the resulting use cases. These redundant situations are indicated by blue boxes in Figure 2. Thus, Filter C (Figures 1 and 2) filtered out these redundant situations. The resulting three generic situations are shown in the bottom row of Figure 2: The interaction partner approaches the automated vehicle frontally, orthogonally from the side, or merges in front of the automated vehicle with a lateral approach direction. 
Figure 3 illustrates possible ways to implement these three situations in a driving simulation with a cyclist as the interaction partner.

**Figure 2.** Generic situation-based approach with two filters. Grey squares represent an arbitrary interaction partner, and the white vehicles represent automated vehicles. Blue boxes indicate redundant scenarios.

**Figure 3.** Implementation of the three derived situations in a driving simulation with a cyclist as the interaction partner. The arrows indicate the trajectories of the automated vehicle and the cyclist, and the black cross represents their virtual crossing point.

#### 2.1.4. Combination of Maneuvers and Situations: Context-Independent Use Cases

In the next step, the remaining maneuvers of the automated vehicle and generic situations were combined (see Figures 1 and 4). The resulting context-independent use cases are illustrated in Figure 4. For example, the interaction partner could approach the automated vehicle orthogonally from the side while the automated vehicle keeps a constant speed.

**Figure 4.** Combination of situations and maneuvers of the automated vehicle, resulting in context-independent use cases. Grey squares represent an arbitrary interaction partner, and the white vehicles represent automated vehicles.
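The combination step can be sketched as a cartesian product. This is a minimal illustration: the situation and maneuver labels paraphrase Sections 2.1.2 and 2.1.3, and the resulting count is a property of this sketch rather than a figure reported in the article.

```python
from itertools import product

# The three generic situations that survive Filters B and C (bottom row of Figure 2).
situations = ["frontal approach", "orthogonal approach", "lateral merge"]

# The maneuvers that survive Filter A (Section 2.1.2).
maneuvers = ["constant speed (driving)", "constant speed (standstill)",
             "accelerate (driving)", "accelerate (standstill)", "decelerate (driving)"]

# Context-independent use cases as the cartesian product of both sets.
use_cases = list(product(situations, maneuvers))
print(len(use_cases))  # 15
```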

#### 2.1.5. Collection of Situation-Specific Factors

In order to ensure an exhaustive set of use cases, we collected a set of all situation-specific factors that could potentially influence an interaction between the automated vehicle and its interaction partner. Following the procedure by Fuest et al. [30], we assigned value facets to the collected factors. Table 1 presents the collected situation-specific factors and their value facets. Filter D was used to reduce certain value facets or complete factors (filtered factors and value facets are marked by <sup>2</sup> in Table 1). This reduction was based on the guideline that the use cases should be used to evaluate the usability of an eHMI. Accordingly, if we expected that a certain factor and/or its corresponding value facets would not lead to different requirements for an eHMI, they were not further considered. The following paragraph elucidates the collected situation-specific factors and the application of Filter D.

The type of road can be either urban, rural, or a highway [38]. We assumed that the usability of an eHMI would not differ depending on the type of road on which an interaction partner experiences the system. For example, a certain eHMI signal should have the same usability during a merging maneuver regardless of whether the maneuver takes place on an urban or rural road. Independent of the type of road, an eHMI must be able to communicate whether it is letting the interaction partner merge or whether he/she must brake and merge behind the vehicle. Furthermore, use cases can take place in different traffic environments, such as at intersections, in parking areas, or somewhere on the road. It was hypothesized that the system perception and interpretation and, thus, the requirements for an eHMI will not change depending on the traffic environment. Regardless of communicating at an intersection or on the road, the eHMI must signal if the automated vehicle will let the interaction partner cross or not. Thus, by applying Filter D, the factors of the type of road and traffic environment were not further considered as situation-specific factors.


**Table 1.** Situation-specific factors, their value facets, and the application of Filter D.

<sup>1</sup> These factors are based on the taxonomy by Fuest et al. [30]. <sup>2</sup> These factors and value facets are filtered by Filter D. <sup>3</sup> These value facets were combined by Filter D.

The right of way can either be assigned to the automated vehicle (e.g., green traffic light), be assigned to the interaction partner (e.g., crosswalk), or be undefined [30]. To test the usability of an eHMI, the eHMI should be the only means that influences the interaction between the automated vehicle and the non-automated road user. Thus, we decided to filter out those value facets in which clear traffic rules determine the right of way for one of the interaction partners. The use cases for the eHMI test procedure should take place in a traffic environment without right-of-way rules.

The type of interaction partner can be either a motorized vehicle (cars, powered two-wheelers, and trucks) or a non-motorized vulnerable road user (VRU, such as a pedestrian or cyclist). To date, there have been no studies that systematically compare the impact of eHMI signals on the interaction of automated vehicles with different types of road users. Interactions with motorized vehicles are usually more dynamic (higher velocities) than those with VRUs. The drivers of motorized vehicles often have a different visual perspective on the automated vehicle than VRUs. However, prior research and technical developments have suggested that automated vehicles will primarily use vehicle-to-X (V2X) technology to communicate with manual car drivers. With this technology, automated vehicles can send messages directly to the in-vehicle displays of manually-driven vehicles, e.g., about their intent, their willingness to cooperate, or requests for cooperative behavior from the human driver [39]. Thus, with V2X communication, automated vehicles do not necessarily need an eHMI to communicate with the human drivers of manually-driven vehicles. Furthermore, unsuccessful interactions usually have more severe consequences for VRUs than for the drivers of motorized vehicles. Compared to pedestrians, interactions with cyclists are considered more critical because cyclists move at higher speeds and, thus, interactions evolve more dynamically [15]. These differences might lead to different requirements for an eHMI when the automated vehicle interacts with different types of interaction partners. In principle, the use cases should be experienced from the perspective of a manual car driver (as the most common representative of a motorized interaction partner), as well as from the perspective of a cyclist (as the worst-case representative of a VRU). However, because V2X technology offers another communication aid between automated and manually-driven vehicles, we recommend primarily focusing on use cases with VRUs as interaction partners.

The automation level represents a further potentially relevant factor. The categorization published by the Society of Automotive Engineers defines six automation levels [40]. Driving at automation levels 0–2 does not represent a use case for an eHMI, as the human driver is responsible for monitoring the driving environment and must remain attentive. Thus, the drivers themselves can still communicate with other road users. At automation levels 3–5, the driver is allowed to engage in non-driving-related tasks as soon as the automated driving system is activated. The system makes decisions about upcoming driving maneuvers and could communicate these to other road users via an eHMI. In general, levels 3–5 can be considered as a single use case because the requirements for an eHMI do not differ. In comparison to levels 4 and 5, however, an automated driving system at level 3 could potentially hand over control to the driver during an interaction situation when a system limit is reached. Such a takeover situation results in the additional requirement that the interaction partner needs to understand that the previous eHMI signal might no longer be valid once the driver has taken control. As a consequence, a takeover situation during an interaction with an automated vehicle at level 3 should be considered as an additional, special use case.

Visibility conditions might influence the perceptibility of an eHMI and, thus, might lead to different requirements for an eHMI. However, these requirements relate to the mere visibility of eHMI signals under different visibility conditions rather than to the usability of the system. Thus, Filter D neglects different visibility conditions. Use cases to test the usability of an eHMI should take place under normal visibility conditions.

As additional factors, the speeds of both interaction partners determine how fast an interaction builds up and develops. This might lead to different requirements for the eHMI with regard to the degree of detail and the required speed of communication. For example, it is conceivable that communication with eHMI signals must be faster when the driving speed is higher. Additionally, a more detailed eHMI signal could be more useful at a low speed than at a high speed. The speed of the automated vehicle at the beginning of an interaction depends on the automated driving system used and its operational design domain. According to the taxonomy by Fuest et al. [30], 0 km/h represents a vehicle at a standstill, 30 km/h is considered a low speed range, 50 km/h is considered an urban speed range, and 130 km/h is the permissible maximum speed in most European countries. These different speeds of the automated vehicle should be considered as use cases within the scope of the operational design domain of the respective automated driving system. The speed of the interaction partner at the beginning of an interaction depends on the type of interaction partner. Motorized vehicles could theoretically approach the automated vehicle at many different speeds. We assumed an average speed of 4.4 km/h for pedestrians [30,41] and 17.5 km/h for cyclists [30,42]. Additionally, speeds of 0 km/h (interaction partner at a standstill), 30 km/h (low speed range), 50 km/h (urban speed range), and 130 km/h (maximum speed) should be considered. The reduction of these value facets depends on the type of interaction partner.

The distance between the automated vehicle and the interaction partner at the beginning of the interaction depends on their current speeds. Given the initial speeds of both interaction partners, the prerequisite that both should theoretically arrive at their "virtual crossing point" at the same time (see Figure 3), and a predefined time for the interaction partner to perceive and interpret the eHMI, make a behavioral decision, and execute an action, one can calculate the distance between the interaction partners at the beginning of the interaction. For example, consider the use case in which the interaction partner (a cyclist at 17.5 km/h) approaches the automated vehicle (low speed range of 30 km/h) frontally. If we assume a time interval of 2.5 s, the cyclist travels about 12.2 m and the vehicle about 20.8 m before they reach the virtual crossing point. Thus, the total initial distance must be about 33 m. Based on this procedure, the distance does not represent an independent factor but results from the other factors.
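This calculation can be sketched as follows. The function name and time budget are our own illustrative choices; the speeds are those of the cyclist and vehicle from the text. Note that, at a combined closing speed of 47.5 km/h, the stated total distance of 33 m corresponds to a time budget of roughly 2.5 s.

```python
def initial_distance(v_partner_kmh: float, v_vehicle_kmh: float, t_budget_s: float) -> float:
    """Distance (m) between both road users at the beginning of the interaction,
    assuming each reaches the virtual crossing point after t_budget_s seconds."""
    closing_speed_ms = (v_partner_kmh + v_vehicle_kmh) / 3.6  # km/h -> m/s
    return closing_speed_ms * t_budget_s

# Frontal approach: cyclist (17.5 km/h) meets automated vehicle (30 km/h).
print(round(initial_distance(17.5, 30.0, 2.5), 1))  # 33.0
```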

#### 2.1.6. Combination of Context-Independent Use Cases and Situation-Specific Factors

In the next step, the context-independent use cases and the remaining situation-specific factors were combined (Figure 1). This still left 864 possible combinations for deducing use cases. Filter E deleted implausible use cases from the full use-case set (Figure 1). This reduction was based on an analysis of realistic and unrealistic combinations of the type of interaction partner, speed, situation, and maneuver of the automated vehicle. For example, when the automated vehicle is at a standstill, it can only remain at a standstill or start from a standstill (lower part of Figure 4); a deceleration maneuver is not possible. Other examples concern realistic speeds for the three situations (upper part of Figure 4): an initial speed of 130 km/h is not realistic for any of the interaction partners in those situations in which the interaction partner approaches the automated vehicle frontally or orthogonally.
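Filter E can be sketched as a plausibility predicate over the combined attributes. This is a hypothetical encoding: only the two exclusion rules named in the text are implemented, and the attribute names are our own.

```python
def plausible(situation: str, vehicle_moving: bool, maneuver: str, speed_kmh: int) -> bool:
    """Filter E (sketch): reject implausible attribute combinations."""
    # A vehicle at standstill can only remain at standstill or start to drive;
    # it cannot decelerate.
    if not vehicle_moving and maneuver == "decelerate":
        return False
    # An initial speed of 130 km/h is unrealistic in frontal or orthogonal
    # approach situations.
    if speed_kmh == 130 and situation in ("frontal approach", "orthogonal approach"):
        return False
    return True

print(plausible("frontal approach", True, "decelerate", 30))       # True
print(plausible("frontal approach", True, "constant speed", 130))  # False
```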

#### 2.1.7. Deduction of Relevant Use Cases

In the last step, Filter F serves to select those use cases that are relevant for testing the usability requirements defined in Section 2.2 with the eHMI and the automated driving system under investigation. For example, we would like to test the usability of an eHMI of an "urban pilot" with the following specifications: The operational design domain of the system is in urban areas with a speed range between 0 and 30 km/h. If the system detects another road user within a radius of 60 meters, it will not accelerate due to safety reasons. Furthermore, the eHMI signal for keeping a constant speed is the same when the vehicle is at standstill or is moving. When deducing the relevant test cases from the use case set, these specifications further reduce the number of relevant test cases. Filter F can be applied to test different eHMI variants of automated driving systems with varying specifications.

The advantage of this methodological approach is that it provides a reproducible and clear procedure for selecting relevant use cases to test the usability of any given eHMI. The present set of use cases represents all scenarios that are relevant for testing the usability of eHMIs during interactions with automated vehicles. It should be noted that controllability or misuse tests might require different procedures for reducing and selecting relevant use cases. Furthermore, it must be emphasized that this method cannot and will not cover all conceivable use cases and situations; in particular, sound adaptations will be required for corner cases. Accordingly, researchers and practitioners who want to use this method will have to apply it with care, thus extending and strengthening its validity.

#### *2.2. Usability Requirements, Parameters, and Criteria*

Prior research on eHMIs has not yet provided consensus on specific requirements for the usability of eHMIs. For the evaluation of the in-vehicle HMIs of automated driving systems, the National Highway Traffic Safety Administration (NHTSA) has defined minimum requirements that must be fulfilled by an HMI [43]. However, there are no published standardized requirements to assess the usability of eHMIs.

In order to define evaluation requirements, it is important to recall the initial considerations for the development of eHMIs. There were concerns that interactions between automated vehicles and other road users could result in difficulties and dangerous situations because the driver/passenger will not be available for informal communication [18]. Therefore, automated vehicles must ensure safe and efficient interactions with other road users [3]. The implementation of eHMIs is one possible way to support non-automated road users during interactions with automated vehicles. An alternative or complementary approach is to informally communicate driving behavior and intentions to other road users by developing appropriate driving strategies of automated vehicles [44,45]. In order to justify the implementation of an eHMI, it must have advantages for interaction partners compared to automated vehicles without an eHMI. At a minimum, it should not deteriorate the quality of the interaction. Thus, the basic requirement for an eHMI is its usability. According to the usability definition by ISO 9241-11 [35], the usability of a system is determined by its effectiveness, efficiency, and satisfaction. To be effective, an eHMI must support the non-automated road user in choosing an accurate behavioral decision during interactions with automated vehicles. An eHMI improves the interaction partner's efficiency if it has a positive effect on the time and mental effort required for a successful interaction. To be satisfying, the interaction partner must perceive the use of the eHMI as pleasant. This is relevant to facilitate its use and acceptance. As such, we defined three usability requirements for an eHMI:

- **Effectiveness:** The eHMI must support the non-automated road user in making a correct behavioral decision during interactions with automated vehicles.
- **Efficiency:** The eHMI must have a positive effect on the time and mental effort required for a successful interaction.
- **Satisfaction:** The interaction partner must perceive the use of the eHMI as pleasant.

The test procedure needs to differentiate between eHMIs that fulfill or do not fulfill these requirements. To decide whether a certain eHMI meets the defined requirements, it is necessary to define parameters for each requirement. These parameters are used to make the respective usability requirement measurable. The following paragraphs define specific parameters for each usability requirement (effectiveness, efficiency, and satisfaction) and propose methods for how to assess these parameters. To finally decide whether an eHMI is compliant with the respective requirement, it is necessary to define a pass/fail criterion for each parameter. In sum, an eHMI fulfills a specific usability requirement as a whole only when it passes the specified criteria of all parameters for that requirement.
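
The all-parameters-must-pass logic described above can be sketched in a few lines of code; the function and parameter names below are illustrative, not taken from the test procedure itself:

```python
# Sketch of the pass/fail logic: an eHMI fulfills a usability requirement
# only if every parameter criterion for that requirement is passed.
# All names here are illustrative assumptions, not from the paper.

def requirement_fulfilled(criteria_results):
    """criteria_results: dict mapping parameter name -> bool (criterion passed?)."""
    return all(criteria_results.values())

# Example: effectiveness has two parameters in the proposed procedure.
effectiveness = {
    "system_comprehension": True,   # e.g., at least 85% correct answers
    "behavioral_decision": False,   # e.g., one safety-critical minimum distance
}
print(requirement_fulfilled(effectiveness))  # -> False
```

A single failing parameter (here, one safety-critical event) is enough to fail the whole requirement, which mirrors the strict pass/fail reading of the criteria.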

Such parameters can be assessed by behavioral or self-reported measures. Behavioral measures can indicate if and how fast the interaction partner is able to understand the eHMI signal and whether they are able to deduce correct behavioral decisions. However, there is a certain probability that the interaction partner makes the correct decision simply by guessing (e.g., the binary choice to either continue driving or to brake/stop). Furthermore, the driving behavior of the automated vehicle serves as an additional indicator for the interaction partner to make an appropriate behavioral decision. Thus, correct behavioral decisions of the interaction partner cannot be exclusively explained by a correct understanding of the eHMI signal. Additionally, self-reported measures are necessary to assess whether or not the interaction partner correctly understands the eHMI signal. On the other hand, self-reported measures alone would be insufficient, because it must be ensured that a correct system understanding leads to correct behavior. Therefore, we propose a combination of both behavioral and self-reported measures.

Compared to an interaction without an eHMI, an eHMI should improve the effectiveness and efficiency of an interaction. At a minimum, it should not deteriorate the interaction. To assess this difference between interactions with and without an eHMI, a baseline condition without an eHMI is required. With this methodological approach, relative criteria can be used to assess the effectiveness and efficiency of an eHMI. However, certain parameters require an absolute instead of a relative criterion. For example, an eHMI should completely prevent safety-critical behavior of interaction partners. Thus, the investigation should not focus on whether there are fewer safety-critical situations with an eHMI than without one; instead, it is most important that there are no safety-critical situations with an eHMI at all (absolute criterion). Additionally, to evaluate the satisfaction with an eHMI, absolute criteria appear more appropriate than relative criteria.

#### 2.2.1. Parameters and Criteria to Prove the Effectiveness of an eHMI

The effectiveness of an eHMI can be assessed by the parameters system comprehension and correctness of the behavioral decision. To measure system comprehension without giving participants the possibility to additionally consider the observed driving behavior of the automated vehicle as a confounding factor, we propose the occlusion method (see Section 2.3.2 for a detailed explanation). After the view of the automated vehicle has been occluded, participants answer the open-ended question "What will the automated vehicle do next?" The experimenter categorizes the answer as either correct or incorrect. The occlusion method does not allow for a comparison with the baseline condition because the screen is blanked before participants can deduce the vehicle's intention from its driving behavior. An absolute criterion can therefore be used to evaluate system comprehension; we propose a criterion of 85% correct answers for each use case. The appropriate indicators to assess the correctness of the behavioral decision depend on the driving maneuver of the automated vehicle in the respective use case. When the automated vehicle decelerates, the correctness of the behavioral decision can be measured by the minimal speed of the interaction partner during the interaction. The eHMI can be considered effective if the interaction partners reduce their speed to a significantly lower extent with an eHMI than without one (relative criterion). No or only slight reductions of speed would demonstrate that the eHMI supported interaction partners in predicting the behavior of the automated vehicle before it could be observed. When the automated vehicle keeps a constant speed or accelerates, the interaction partner must reduce his or her speed or wait to prevent a safety-critical situation; continued driving or walking represents an incorrect behavioral decision.
However, the correctness of the behavioral decision should be assessed by an absolute criterion with a pass/fail logic. The relevant criterion is the resulting minimum distance between the automated vehicle and the interaction partner; a minimum distance below one meter can be considered safety-critical. Following the guidelines of the RESPONSE Code of Practice [46], 20 of 20 participants need to pass the defined criterion to support the assumption that 85% of the population would also pass it.
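
One way to make this sample-size rationale concrete is a simple binomial calculation. This is our own reading of the guideline, not a formula taken from [46]: if the true pass rate in the population were only 85%, observing all 20 of 20 participants passing would be an unlikely event.

```python
# Hedged illustration of the RESPONSE-style rationale: how likely is it
# that a sample of 20 participants all pass if the population pass rate
# is only 85%? (Binomial reading; our interpretation, not from [46].)
p_population = 0.85
n = 20
prob_all_pass = p_population ** n  # P(20 of 20 pass | p = 0.85)
print(round(prob_all_pass, 4))  # -> 0.0388, below the conventional 5% level
```

In other words, a flawless 20/20 result is evidence against any population pass rate at or below 85%, which is what the criterion is meant to establish.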

#### 2.2.2. Parameters and Criteria to Prove the Efficiency of an eHMI

To measure the efficiency of an eHMI, we propose the parameters mental workload, time to cross, and visual attention. Mental workload can be assessed by a self-reported measure: after each interaction, the participant answers the question "How high was your mental workload during the interaction with the automated vehicle?" on a 7-point Likert scale ranging from very low to very high. Using a relative criterion, the mental workload should be significantly lower with than without an eHMI. To measure whether the eHMI improved the temporal efficiency of the interaction, the time between the first visual contact with the automated vehicle and the crossing of the virtual crossing point (see Figure 3) can be compared with and without the eHMI. The time to cross should be significantly shorter with than without the eHMI (relative criterion). To determine whether the eHMI improved the efficiency of the interaction with regard to the required visual attention, the proportion of visual attention directed towards the automated vehicle during the interaction should be significantly lower with than without the eHMI (relative criterion). Visual attention can be measured by eye tracking, head tracking, or video coding.
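
A relative criterion of this kind amounts to a paired comparison of the two conditions. The sketch below computes a paired t statistic for invented workload ratings; in practice, a statistical package would be used and the p-value reported.

```python
# Minimal sketch of a relative criterion: paired comparison of, e.g.,
# self-reported mental workload with and without the eHMI. The ratings
# are made up for illustration; they are not data from the paper.
import math
import statistics

workload_with_ehmi = [2, 3, 2, 4, 3, 2, 3, 2]   # 7-point Likert ratings
workload_without = [4, 5, 3, 5, 4, 4, 5, 3]

diffs = [w - wo for w, wo in zip(workload_with_ehmi, workload_without)]
n = len(diffs)
t_stat = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
print(round(t_stat, 2))  # -> -7.94; a negative t means lower workload with the eHMI
```

The same structure applies to time to cross and the proportion of visual attention: compute per-participant differences and test whether the mean difference departs significantly from zero in the expected direction.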

#### 2.2.3. Parameters and Criteria to Prove the Satisfaction with an eHMI

The satisfaction with the eHMI can be determined by the parameters satisfaction, attitude toward use, behavioral intention, and preference. All parameters are measured by items after participants have encountered all use cases with an eHMI. Table 2 includes a list of proposed items and the respective scales. With regard to satisfaction, attitude toward use, and behavioral intention, an absolute criterion applies: at least 85% of all participants must choose a positive judgement (ratings between 5 and 7 on a 7-point Likert scale). To assess preference, participants decide whether they would prefer future interactions with automated vehicles with or without an eHMI. To pass this criterion, a significantly higher proportion of participants must prefer future interactions with an eHMI to interactions without one.
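
The absolute 85% criterion can be checked mechanically once the ratings are collected; the ratings below are invented purely for illustration:

```python
# Sketch of the absolute satisfaction criterion: at least 85% of the
# participants must give a positive judgement (rating 5-7 on a 7-point
# Likert scale). The ratings are illustrative, not data from the paper.
ratings = [6, 7, 5, 6, 4, 7, 6, 5, 7, 6, 5, 6, 7, 5, 6, 6, 7, 5, 6, 3]
positive_share = sum(r >= 5 for r in ratings) / len(ratings)
print(positive_share >= 0.85)  # -> True (18 of 20 ratings are positive)
```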


**Table 2.** Parameters and items to assess the satisfaction with an eHMI.

<sup>1</sup> Item adapted from [47].

The proposed requirements, parameters, and criteria contribute to the standardization of test procedures for evaluating the usability of eHMIs. Together with the definition of use cases, these standardized requirements form the basis for reliable eHMI evaluations and allow for meaningful comparisons between different eHMI variants and the results of different studies. Overall, this contribution will support the definition of design requirements for optimal interface specifications. The selection of the parameters can be adapted to the respective research questions and selected use cases.

#### *2.3. Test Protocol*

To evaluate the usability of an eHMI, it is important that users interact with the system in a standardized manner. We developed a test protocol for the empirical evaluation of eHMIs with a user study. The test protocol provides a proper experimental design to systematically investigate the usability of eHMIs. The objective is to prove whether a certain eHMI meets the usability requirements defined in Section 2.2. For this purpose, the test protocol defines a methodological procedure to observe and measure users' behavior and subjective evaluations during specified use cases and experimental conditions. The following sections elucidate the methodological details of the test protocol and underlying considerations.

#### 2.3.1. Test Environment

The test environment must allow for controlled, standardized, and economic testing in a safe environment. At the same time, participants should encounter realistic scenarios to guarantee external validity. Furthermore, it is important that the parameters defined in Section 2.2 can be measured. Thus, the test environment must enable behavioral measurements, the observation of participants' behavior, and the communication between experimenter and participants for interim questions. Additionally, a realistic implementation of the eHMI is important. Prior research has mainly used methods such as VR pedestrian simulators with head-mounted displays [7], desktop computers displaying photos or videos [23], or the Wizard of Oz technique [14]. These test environments often do not enable the dynamic development of interactions, which limits external validity, use case selection, and the possibilities to measure behavioral data. We recommend the use of high-fidelity driving simulators to investigate interactions with motorized interaction partners (cars, trucks, powered two-wheelers) or VRUs (cyclists). The chosen simulator should include a realistic mock-up; active intervention options for braking, accelerating, and steering; and the possibility to implement the eHMI. To investigate interactions with pedestrians, VR pedestrian simulators remain the most suitable test environment. However, it is important that the pedestrian simulator provides a sufficiently large physical environment to enable dynamic interactions and possibilities to measure dynamic pedestrian behavior, e.g., by using a motion suit [6].

#### 2.3.2. Procedure and Instruction

The procedure of the test protocol is shown in Figure 5. The instruction informs participants that the study investigates interactions between automated vehicles and manual drivers/cyclists/pedestrians. They are told that automated driving systems perform the entire dynamic driving task, at least in a specific operational design domain. Thus, the car driver can perform tasks other than driving. Furthermore, participants are informed that the experimental drive will take place on a simulated test track without right of way rules. The latter information is very important to ensure ambiguous interaction situations. The instruction at the beginning of the study does not include any information about the eHMI.

**Figure 5.** Procedure with measured parameters.

After a short familiarization with the respective simulator (about 5 min) without any interactions with an automated vehicle, participants go through a learning period. Here, they already encounter all use cases in which they interact with automated vehicles that use the tested eHMI. The learning period serves as the opportunity to learn to associate the eHMI signals with the subsequently executed driving maneuver of the automated vehicle. After the learning period, a short interview is conducted. The experimenter asks the following questions:


The participants' answers to these questions indicate the perceptibility and visibility of the eHMI signals. Furthermore, the questions serve to assess a first, global understanding of the eHMI. Independent of the answers of the respective participant, the experimenter explains at the end of the interview that the study aims to investigate signals that automated vehicles use to communicate with other road users. This information is important to achieve a common basis for all participants for the subsequent test blocks. The experimenter emphasizes that no advice or help can be given during the experimental drive, as the objective is to investigate whether the signals are comprehensible and helpful.

Thereupon, participants either first experience the test block with the eHMI (Test Block 1) or without the eHMI (Test Block 2). The sequence of the test blocks should be counterbalanced to control for transition and learning effects. Test Block 1 consists of three parts that should be encountered in the same recommended sequence (see Figure 5). In Test Block 1a, participants encounter all use cases with the eHMI while behavioral data (driving data and visual attention) are continuously recorded. Additionally, participants verbally indicate their mental workload after each interaction (see Section 2.2). The scale to measure mental workload should be placed in the simulator where it is visible to the participants. Test Block 1b, the occlusion test block, serves to measure system comprehension. For this purpose, participants experience all use cases once again. With the occlusion method, the simulation screen is blanked during each interaction at predefined points in time. This method was adapted from [48] and is intended to leave the outcome of the situation open. The screen should be blanked once the eHMI has signaled the subsequent intention (or communication content in general) but before the automated vehicle has started to execute the signaled maneuver. After some seconds (e.g., 5 s), the screen shows the last scene again, with the automated vehicle removed in the meantime. To prevent simulator sickness, it is recommended to automatically brake the participant's vehicle to a standstill while the screen is blanked. The suggested open-ended question "What will the automated vehicle do next?" can be adapted to the communication content of the tested eHMI. After the occlusion test block, participants answer a survey that includes different items to measure satisfaction with the eHMI (see Table 2, except for the preference item).
In Test Block 2, participants encounter all use cases without the eHMI while behavioral data are recorded and they indicate their mental workload after each interaction. At the end of the study, participants finally evaluate their preference for future interactions with automated vehicles with or without an eHMI. After each test block, participants have the opportunity to take a break. At the end of the experiment, the experimenter thoroughly debriefs the participant.

To control for transition effects between the different use cases, it is recommended to permute the sequence of the use cases into three different sequences. Thus, the use cases of each test block (1a, 1b, and 2) are encountered in different sequences (A, B, and C). Furthermore, a given test block should not be experienced in the same sequence by all participants (e.g., every participant experiencing Test Block 2 in Sequence C); therefore, the sequences of use cases should additionally be counterbalanced between the three test blocks. The sequence of the use cases in the learning period can be the same for all participants. In conclusion, the outlined considerations require an equal division of the participants into six different groups. Table 3 shows an exemplary experimental design with six experimental groups that differ according to the sequence of Test Blocks 1 and 2 and the sequence of use cases in the different test blocks.
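
The six groups (two test-block orders crossed with three rotations of the use-case sequences) can be generated programmatically. The assignment of rotations to blocks below is one illustrative choice; Table 3 defines the exact design used in the procedure.

```python
# Sketch of the six-group design: two test-block orders crossed with
# three cyclic rotations of the use-case sequences A, B, C over the
# blocks 1a, 1b, and 2. The concrete assignment is illustrative.
from itertools import product

block_orders = [("TB1", "TB2"), ("TB2", "TB1")]
sequences = ["A", "B", "C"]
rotations = [sequences[i:] + sequences[:i] for i in range(3)]  # ABC, BCA, CAB

groups = []
for order, (s1a, s1b, s2) in product(block_orders, rotations):
    groups.append({"block_order": order, "TB1a": s1a, "TB1b": s1b, "TB2": s2})

print(len(groups))  # -> 6 experimental groups
```

The cyclic rotations guarantee that, within each block order, each of Sequences A, B, and C is used exactly once per test block, so no sequence is tied to a particular block across all participants.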


**Table 3.** Example of the experimental design with different sequences of test blocks and use cases.

Note. TB = Test block, Seq. = Sequence of use cases in the test block.

#### 2.3.3. Sample

To deduce reliable conclusions from the experimental data, the sample size should be sufficiently large. In reference to RESPONSE [46], at least 20 test persons should take part in the study. The target population of persons who will interact with automated vehicles in the future is very broad. Accordingly, people of all ages, nationalities, educational levels, body heights, and so forth should be eligible for studies that test the effects of eHMIs. To achieve a representative age distribution, NHTSA [43] proposed four age groups of *n* = 5 each: 18–24, 25–39, 40–54, and older than 54 years. Beyond these age groups, it is important to examine the effects of eHMIs on children's behavior and comprehension [21]. Depending on the interaction partner under investigation, participants may need to fulfill further specific prerequisites. For example, participants in a bicycle simulator study should ride a bike on a regular basis, and participants in a driving simulator study should hold a driver's license.

#### **3. Discussion**

Due to the great variety of methodological approaches and their limitations, the current state of research on the usability of eHMIs does not allow general conclusions to be drawn. The standardization of test procedures is thus a fundamental prerequisite for effectively evaluating and comparing different eHMI design variants. Therefore, the aim of the present article was to outline a standardized test procedure that allows for the systematic investigation of the usability of eHMIs. We have proposed a methodological framework that consists of a method to deduce relevant use cases, a definition of specific usability requirements and appropriate parameters, and a test protocol for the empirical evaluation of an eHMI.

The definition of relevant use cases provides the basis of the test procedure to ensure meaningful and comparable results. To draw reliable conclusions, the usability of an eHMI must be proved in use cases previously defined as relevant. Prior studies on eHMIs have often used only a single, arbitrarily selected use case. The multi-stage methodological approach presented in this article aims to consider all theoretically possible use cases of an eHMI. Using a variety of theoretical and practical considerations, the approach finally results in a set of use cases that are relevant to evaluate the usability of an eHMI. The intersection scenario represents the use case that has been studied most often in previous work on eHMIs [1,3–5,7,8,11–23]. To the best of our knowledge, there has so far been only one study that used a road narrowing as a use case of an eHMI [49], and no study has examined a merging scenario. Thus, the approach to define use cases provides new perspectives for future research on eHMIs. Researchers can easily apply the proposed procedure to select use cases for the eHMI and automated driving system under investigation. All stages and filters before Filter F can be taken as defaults; therefore, the selection process can be entered at Filter F. At this point, users can select those use cases that are relevant for the eHMI and automated driving system at hand. A potential limitation of the presented methodological approach is that it only considers use cases in which automated vehicles interact with one non-automated road user. In principle, an eHMI that exclusively communicates information about the automated vehicle, such as its status and intentions, should always have the same usability, independently of the number of non-automated road users with which it is currently interacting.
With this content of communication, it is not relevant whether only one pedestrian, or three pedestrians and two cyclists, need to understand the meaning of the eHMI signal and make decisions about their subsequent behavior. However, if an eHMI directly addresses its message to a specific road user, interactions with more than one non-automated road user quickly become very complex and require an extended approach to deduce use cases. For example, many previous studies have examined eHMI signals that tell pedestrians to "walk," "go ahead," or "don't walk" [8,16,22], that project green arrows [8] or crosswalks [17] on the road surface in front of the vehicle, or that show a green pedestrian in the windscreen [22]. If another traffic participant feels addressed by such an eHMI signal that was initially directed at another road user, the situation can become very critical. Therefore, we strongly recommend not using eHMI signals that ask a particular road user to take a specific action. As a result, the methodological approach presented in this paper provides an appropriate tool to deduce use cases for eHMIs that communicate information about the automated vehicle itself rather than requests for action to other road users.

The definition of evaluation requirements constitutes an additional prerequisite for standardizing the evaluation process of eHMIs. Following the ISO definition of usability [35], we derived three requirements: an eHMI must render the communication of automated vehicles with non-automated road users effective, efficient, and satisfying. By defining specific parameters and criteria for each usability requirement, the test procedure can differentiate between eHMIs that fulfill or do not fulfill these requirements. Further work is necessary to evaluate the discriminatory power of the proposed parameters; some parameters might differentiate better than others between eHMIs that do and do not meet the requirements. With increasing experience based on future empirical studies, the specific measurement methods of the parameters can be adapted and extended, e.g., the selection of appropriate items to measure the satisfaction parameters. The parameters could be supplemented by further parameters, and the criteria could be adapted if necessary. For example, following the controllability guidelines of the RESPONSE Code of Practice [46], it would also be justified to aim at a system comprehension rate of 100%, as the guideline requires that 20 of 20 participants pass the predefined criterion and give the correct answer. However, it must be emphasized that the proposed requirements, parameters, and criteria focus on the usability testing of eHMIs. To prove the controllability of eHMIs, the test procedure needs to be adapted. Nevertheless, part of our test procedure already addresses controllability testing, as the criterion for the minimal distance to the automated vehicle has a pass/fail logic and does not allow for even one fail event in 20 subjects.

The test protocol provides a proper experimental design to systematically evaluate eHMI variants with user studies in a standardized way, and it offers several advantages. First, the results of studies conducted in accordance with the test protocol allow for reliable conclusions regarding whether the tested eHMI fulfills the defined usability requirements. Second, the results of different studies that followed the test protocol allow for comparisons between the tested interface designs. Thus, the test protocol constitutes a basis to derive optimal interface specifications based on comparisons of different studies. Another major advantage of the test protocol is that it enables the measurement of an eHMI's usability without confounding factors. As there are no right of way rules and the sequence of the test blocks with and without the eHMI is counterbalanced, different behavioral decisions of the interaction partners in the different test blocks can be attributed to the usability of the tested eHMI. Similarly, the occlusion method ensures that comprehension measurements are based essentially on the comprehensibility of the eHMI. To compare two or more eHMI variants with each other, the test protocol can be adapted and extended. Test Block 1 can be repeated with an additional eHMI variant with the same group of subjects in a repeated-measures design. However, it is very important to always compare participants' behavior with an eHMI with their behavior during interactions without an eHMI in Test Block 2. Moreover, the inclusion of a further test block requires the permutation of the three test blocks and a random distribution of the participants over the resulting sequences. Alternatively, the test protocol allows for the comparison of different eHMI variants that were examined in different studies with different samples. As prerequisites, the samples must be comparable and the studies must select the same use cases.

The next step is the application of the test procedure for the usability evaluation of different eHMI design variants and automated driving systems with different specifications. With increasing experience, the method can be iteratively refined and improved. In turn, the standardized evaluation procedure will become a valuable tool for the scientific and technical community. The standardized test procedure can serve as a basis to establish best practices in the field of communication between automated vehicles and non-automated road users.

**Author Contributions:** Conceptualization, F.N., S.H., A.K. and A.N.; methodology, C.K. and S.S.; writing—original draft preparation, C.K.; writing—review and editing, S.H., F.N. and S.S.; supervision, A.K. and A.N.; project administration, C.K., S.S., F.N. and S.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Acknowledgments:** We thank Stefanie Ebert, Thomas Stemmler, and Florian Fischer for their technical support.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

### *Article* **How Much Space Is Required? Effect of Distance, Content, and Color on External Human–Machine Interface Size**

#### **Michael Rettenmaier \*, Jonas Schulze and Klaus Bengler**

Chair of Ergonomics, Technical University of Munich, 85748 Garching, Germany; schulze.jonas@mytum.de (J.S.); bengler@tum.de (K.B.)

**\*** Correspondence: michael.rettenmaier@tum.de

Received: 3 May 2020; Accepted: 1 July 2020; Published: 3 July 2020

**Abstract:** The communication of an automated vehicle (AV) with human road users can be realized by means of an external human–machine interface (eHMI), such as displays mounted on the AV's surface. For this purpose, the amount of time needed for a human interaction partner to perceive the AV's message and to act accordingly has to be taken into account. Any message displayed by an AV must satisfy minimum size requirements based on the dynamics of the road traffic and the time required by the human. This paper examines the size requirements of displayed text or symbols for ensuring the legibility of a message. Based on the limitations of available package space in current vehicle models and the ergonomic requirements of the interface design, an eHMI prototype was developed. A study involving 30 participants varied the content type (text and symbols) and content color (white, red, green) in a repeated measures design. We investigated the influence of content type on content size to ensure legibility from a constant distance. We also analyzed the influence of content type and content color on the human detection range. The results show that, at a fixed distance, text has to be larger than symbols in order to maintain legibility. Moreover, symbols can be discerned from a greater distance than text. Across content types, color had no effect on the human detection range. In order to ensure the maximum possible detection range among human road users, an AV should display symbols rather than text. Additionally, the symbols could be color-coded for better message comprehension without affecting the human detection range.

**Keywords:** automated driving; external human–machine interface; interface size; legibility

#### **1. Introduction**

The process of introducing automated vehicles (AVs) into road traffic is progressing. In urban areas in particular, a gradual change is taking place towards mixed traffic, including AVs, human drivers, cyclists, and pedestrians. From automation level 2 upward, the system performs sustained lateral and longitudinal vehicle motion control [1], which could directly impact the nature of the interactions between the AV and road users in the surroundings. One approach for enabling communication of AVs with their environments is to use external human–machine interfaces (eHMIs). These include displays mounted on the surface of the vehicle [2,3], light strips [4–6], and projections on the road [7,8]. These devices enable AVs to indicate, for instance, their status, perception, or intention [9] in relevant scenarios, such as at intersections, in parking lots, in narrow spaces, or in merging traffic [10,11]. Current research is almost exclusively devoted to the question of what content these interfaces should display in order for them to be comprehensible to pedestrians [12] or human drivers [2]. Based on a comprehensible eHMI design, the interaction is comfortable, efficient, and safe if the human interaction partner has enough time to perceive and process the eHMI content and act accordingly. The dynamics of road traffic and the time required by the receiver result in a certain lead time within which the AV has to communicate its message. In turn, a minimum content size is required in which the AV has to display its message.

For the purpose of dimensioning the eHMI, this paper makes reference to the road bottleneck scenario from two previous studies [2,7], with obstacles on both sides of the road due to double-parked vehicles. In this scenario, an AV and a simultaneously oncoming human driver negotiate the right of way within a 30 km/h speed limit zone. The AV displays its message to the human driver at a distance of 100 m. Rettenmaier, Pietsch, and Bengler [7] recommend that in such a bottleneck scenario an AV should communicate via a display mounted on the front of the vehicle, in order for the interaction to be efficient and safe. Front-mounted displays are particularly suitable for communication purposes in straight-approach scenarios [13]. Owing to the high dynamics and relative speeds of the AV and the human driver when approaching the road bottleneck, the resulting required eHMI size exceeds that which would be needed for interactions in a tighter space. Thus, the determined size is also suitable for communicating with pedestrians in road crossing scenarios in which the AV's communication commences at a shorter distance between the AV and the pedestrian, for instance, 45 m [4] or 50 m [12]. Despite these advantages, one disadvantage of communicating via displays is that the content size must be large to be viewed at a distance [14]. However, we found no research addressing how large text or symbols need to be, with respect to content and color, in order to be legible from a particular distance. As there are as yet no standards governing the design of eHMIs, this paper investigates the size that displayed text or symbols must have in order to render a message legibly in a bottleneck scenario.
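
As a rough orientation, the relation between viewing distance and minimum legible content height follows from simple visual-angle geometry, h = 2 · d · tan(θ/2). The 20-arcminute threshold used below is an assumed value for illustration only; the study reported in this paper measures the actually required sizes empirically.

```python
# Back-of-the-envelope visual-angle calculation for the minimum content
# height at a given viewing distance: h = 2 * d * tan(theta / 2).
# The 20-arcminute legibility threshold is an assumption for illustration,
# not a value taken from this study.
import math

def min_height_m(distance_m, threshold_arcmin=20.0):
    theta = math.radians(threshold_arcmin / 60.0)  # arcminutes -> radians
    return 2.0 * distance_m * math.tan(theta / 2.0)

print(round(min_height_m(100.0), 3))  # -> 0.582 (metres, at 100 m)
```

Such a calculation shows why the 100 m communication distance in the bottleneck scenario drives the eHMI toward large content sizes, and why the package space analysis in Section 3.1 matters.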

#### **2. Objectives**

The present study aims to determine the content size required to render distinct communication at a certain distance for different content types. An additional aim is to examine the influence of content color and content type on the human detection range, which we defined as the distance from which a certain content size is legible. For this purpose, we developed an eHMI prototype (Section 3) including a package space analysis (Section 3.1), ergonomic requirements (Section 3.2), the selection of hardware and software (Section 3.3), and the presentation of the final prototype (Section 3.4). We conducted a study involving 30 participants (Section 4) to analyze the effects of distance, content type, and content color on the required content size, and we set up the following research questions (RQs):

RQ1: Is there any difference in the required content size for it to be legible at a certain distance depending on the content type?

RQ2: Is there any difference in the human detection range depending on the content type?

RQ3: Is there any difference in the human detection range depending on the content color?

#### **3. Development of External HMI Prototype**

#### *3.1. Package Space Analysis of Existing Vehicle Models*

An AV communicates with an oncoming human driver via its external display. For this reason, the vehicle front is the only surface suitable for displaying information. This surface can be divided into the bumper, radiator grille, headlights, hood, windshield, and rear of the side mirrors. The area of the side mirrors is small and incoherent, while the projection area of the hood is small in the vertical plane. Moreover, the AV's passenger must be able to use the windshield for monitoring the driving scene, while the function of the headlights is to illuminate the road ahead. For these reasons, we considered the bumper and the radiator grille as suitable areas for implementing the eHMI, as the radiator grille is no longer required for engine cooling in an electric vehicle. It is also a suitable area for displaying messages from the AV in straight approach scenarios [13].

We selected three car models to represent each of the six vehicle categories of the Commission for European Communities (mini cars, small cars, large cars, executive cars, luxury cars, and sport utility cars) [15]. The selection was based on the new registration data published by the German Federal Motor Transport Authority for the month of June 2019 [16]. The vehicle's front dimensions were determined by digital measurement of the official dimensions given in a technical drawing and subdividing this area into individual sections for the radiator grille and the bumper (Figure 1). The scale of the technical drawing was recorded, while the pixel size and, in turn, the size of the defined sections were calculated using a digital pixel meter. The potential eHMI size dimensions were determined as the minimum height (*H*) of the radiator grille and the bumper together (Mercedes C-Class: 459 mm) and the minimum width (*W*) of the radiator grille (VW Up: 772 mm) of all car models.
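The minimum-dimension logic described above can be sketched as follows. The per-model grille/bumper splits below are illustrative placeholders, not the measured data; only the resulting minima reproduce the reported 459 mm and 772 mm.

```python
# Hypothetical per-model measurements in mm (grille height, bumper height,
# grille width); the individual splits are illustrative, but the resulting
# minima reproduce the 459 mm and 772 mm reported above.
models = {
    "Mercedes C-Class": (250, 209, 900),
    "VW Up":            (300, 260, 772),
    "BMW 5 Touring":    (310, 240, 950),
}

# eHMI height: minimum combined grille + bumper height across all models
ehmi_height = min(g + b for g, b, _ in models.values())
# eHMI width: minimum grille width across all models
ehmi_width = min(w for _, _, w in models.values())

print(ehmi_height, ehmi_width)  # 459 772
```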

**Figure 1.** Dimensioning of the vehicle front using the technical drawing [17] of a BMW 5 Touring model as an example. We divided the front into separate radiator grille and bumper sections.

#### *3.2. Ergonomic Requirements*

Due to the complexity of the driving task during manual driving, it is necessary that all pertinent information is easily and comfortably legible for drivers. Similarly, the eHMI must be legible at all times of day and night. During the day, the required luminance of the display varies between 1000 cd/m2 [18] and 5000 cd/m2 [19] for outdoor use. At night, the eHMI must not be so bright as to dazzle nearby road users. Therefore, the display luminance must be adjustable so as not to impair the eye's adaptability to changes in light levels [20]. Another requirement is that the eHMI should display bright text and symbols on a dark background and not the other way around since this display mode is suitable for day and night use [21]. The contrast ratio of the display between the text/symbol and the background should be 5:1 at high brightness and at least 3:1 at common brightness levels [18].

The symbols on the display should have a minimum visual angle of 20 min of arc (MOA). The minimum visual angle of text written in Latin letters must be 16 MOA and 20–22 MOA for comfortable reading. Moreover, the ratio between letter height and letter width should be 0.7:1–0.9:1. The line width of sans-serif fonts should be 10–17% of the letter height, and there should be a space of one line width between letters [20].

The letter or symbol height requirements specify the minimum display height, while the number of letters in a word determines the minimum display width.
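The visual-angle arithmetic underlying these requirements can be sketched as follows (the function names are ours; the printed values reproduce figures quoted elsewhere in this paper):

```python
import math

def size_for_angle(distance_m: float, arc_minutes: float) -> float:
    """Content height (mm) that subtends a given visual angle at a distance."""
    angle_rad = math.radians(arc_minutes / 60.0)
    return 2 * distance_m * 1000 * math.tan(angle_rad / 2)

def angle_for_size(distance_m: float, height_mm: float) -> float:
    """Visual angle (MOA) subtended by a content height at a distance."""
    return 60 * math.degrees(2 * math.atan(height_mm / 2 / (distance_m * 1000)))

print(round(size_for_angle(100, 20)))     # 582 mm: 20 MOA at 100 m
print(round(size_for_angle(88, 20)))      # 512 mm: 20 MOA at 88 m
print(round(angle_for_size(88, 170), 2))  # 6.64 MOA: 170 mm content at 88 m
```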

#### *3.3. Hardware and Software*

The prototype consists of 12 outdoor light-emitting diode (LED) modules made by Coreman Technology Co. [22]. Each red-green-blue (RGB) LED matrix measures 256 × 128 mm with 64 × 32 pixels and a pixel pitch of 4 mm. A minimum visual angle of 20 MOA at a distance of 100 m, as considered in the bottleneck scenario [2,7], corresponds to a matrix height of *H* = 582 mm. The available package space dimensions are *W* = 772 mm and *H* = 459 mm. The 4 × 3 matrix layout has a size of *W* = 768 mm and *H* = 512 mm, with a resolution of 192 × 128 pixels. This represents a good compromise between the theoretically required space and the available space. The luminance of each module exceeds 6000 cd/m2, resulting in a maximum luminous intensity of 2358 cd when the whole matrix illuminates at full brightness in white. This exceeds the limit value of 1200 cd [23] prescribed for road traffic. However, since fewer than 50% of the pixels illuminate when displaying symbols and letters, the luminous intensity in operation remains below the legal threshold.
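A back-of-the-envelope check of these photometry figures, assuming the panel radiates as a flat diffuse emitter (intensity ≈ luminance × emitting area):

```python
# Rough photometric check of the LED matrix figures quoted above, assuming a
# flat diffuse emitter: luminous intensity (cd) = luminance (cd/m^2) x area (m^2).
luminance = 6000              # cd/m^2, module specification
width, height = 0.768, 0.512  # m, dimensions of the 4 x 3 module layout

full_white_intensity = luminance * width * height
print(round(full_white_intensity))  # ~2359 cd (the text quotes 2358 cd; rounding)

# With symbols or text, fewer than 50% of the pixels are lit:
print(round(full_white_intensity * 0.5))  # ~1180 cd, below the 1200 cd limit [23]
```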

The working temperature of the modules is between −30 °C and +55 °C. The LED matrix is driven by a Raspberry Pi 4 Model B with 2 GB of memory and a quad-core 64-bit processor clocked at 1.5 GHz [24], running the official operating system Raspbian, based on Debian GNU/Linux. The Raspberry Pi is operated from a laptop via a remote desktop connection, and the matrix itself is controlled through an open-source C++ library [25], which makes it possible to display pictures, texts, and animations [26].

#### *3.4. Final eHMI Prototype*

Figure 2 shows the final eHMI prototype. The LED modules are screw-fitted to a frame made from aluminum sheets. The prototype satisfies the visual angle requirements of 20 MOA at a distance of 88 m between display and participant pursuant to DIN EN ISO 9241-303 [20] with a display size of 768 × 512 mm. This eHMI display distance is less than the 100 m used in the previous studies [2,7], on which the present investigation is based, but it would provide the human driver in the bottleneck scenario sufficient time to interact comfortably with the AV [27].

**Figure 2.** The external human–machine interface (eHMI) prototype developed and evaluated in the present investigation. The content colors do not match the real colors due to the display angle and camera distortion.

#### **4. Evaluation of External HMI Prototype**

#### *4.1. Sample*

Thirty participants took part in the experiment. As no data were discarded, there were 30 valid data sets in the study. The age of the sample was *M* = 31.07 years (*SD* = 12.54 years). The age span ranged from 18 years to 69 years. Nineteen participants were male and 11 were female. Eighteen participants had a visual impairment, which was corrected in 17 cases in the course of the experiment and not corrected in one case. Additionally, there was one participant with red-green deficiency. We refrained from excluding these two data sets from the analysis, as persons with visual impairments also participate in real road traffic. The eye test [28,29] resulted in a visual acuity of *M* = 1.47 (*SD* = 0.37). The visual acuity ranged from 0.8 to 2.0. The participants were recruited at the Technical University of Munich and did not receive an expense allowance.

#### *4.2. Display Content*

Figure 3 shows the three different content types displayed by the eHMI prototype during experiment 1 and experiment 2 (Figure 4). The text fulfills the ergonomic requirements (Section 3.2). In experiment 1, we chose to display four letters, since this number was easily readable from a distance of 88 m in a pre-test. In experiment 2, the eHMI displayed five letters (E, P, C, F, D). In both experiments, the eHMI displayed cryptic chunks of letters, so that it was hardly possible to guess the sequence of letters. To avoid the effect of varying legibility for different letters, the display showed the same letters for each participant, but in a randomized order. The letters were derived from one row of the Snellen chart. In addition to text, the prototype also displayed two types of symbols. The arrow and the "E" from the E chart were visualized in four degrees of rotation (0°, 90°, 180°, 270°) such that the limbs of the E and the arrow tip were pointing up, down, to the left, or to the right. The content size is defined throughout this article as the height of the text or the height of the arrow and the E in the orientation given in Figure 3. Even if the arrow is rotated by 90°, its size is the distance from the end of the arrow to its tip.


**Figure 3.** The three different content types displayed by the eHMI in the present study.

#### *4.3. Experimental Design*

It was necessary to conduct two experiments (Figure 4) in order to obtain answers to the research questions. In experiment 1, the participants were at a constant distance of 88 m to the prototype. This distance corresponds to the recommended visual angle of 20 MOA for a prototype height of 512 mm [20]. Following a pre-test, the symbols were scaled to six sizes (ranging from 80 mm to 230 mm), while the text was scaled to five different sizes (from 80 mm to 200 mm) (Table 1) for determining the size required for it to be legible at a distance of 88 m. In experiment 1, the prototype displayed the message in white (R = 255, G = 255, B = 255), since this represents the highest contrast to the LED matrix.

**Table 1.** Content sizes used in experiment 1. The distance from which the respective size has a visual angle of 20 min of arc (MOA) is presented according to DIN EN ISO 9241-303 [20].


Experiment 2 analyzed the effect of content type and content color on the human detection range. The size of symbols and texts was set to 164 mm, which made it possible to display five letters on the prototype. Additionally, texts and symbols were displayed in the colors white (R = 255, G = 255, B = 255), red (R = 255, G = 0, B = 0), and green (R = 0, G = 255, B = 0). We decided to use red and green in addition to white, as they are already familiar in the context of traffic as an indication of either yielding or insisting on the right of way. In experiment 2, the participants approached the eHMI prototype from a distance of 150 m. The participants stopped at a distance *X* from the eHMI as soon as the content type became legible and thus their detection range was attained.

**Figure 4.** Experiment 1 evaluated the required content size for it to be legible from a distance of 88 m. Experiment 2 analyzed the human detection range (*X*) depending on content type and content color.

The participants performed experiment 1 and experiment 2 in a permuted order (Figure 5). In experiment 1, the participants read the text (in five different sizes), the arrow (in six different sizes), and the E (in six different sizes) in a permuted order. The text segment of the experiment also displayed two distracting text blocks, in which letters appeared twice, after the first text and after the third text, such that the participants could not assume that the respective letters only appeared once within a text. The data from these two distractor texts were not considered in the evaluation.

In experiment 2, the participants approached the prototype displaying the text, arrow, and E three times each. In each of the three parts, the message was displayed once in white, red, and green.

**Figure 5.** Experimental design dividing the study into two experiments. Both the experiments and the different content types within each experiment were presented in a permuted order.

#### *4.4. Procedure*

Once they had been duly informed about the experiment, the participants gave their written consent to take part in the study. They then filled in a demographic questionnaire, which included questions on age and gender. The participants were also asked to indicate whether they had any visual impairment or color vision deficiency. Afterwards, they underwent eye testing using the software FrACT 3.10.2 [29], which displayed the Landolt-C on a computer monitor. The participants had to discern in which of the eight possible positions the Landolt-C opening appeared. The distance between monitor and participant and the number of trials can be configured in the software. The participants then received the instructions for the study, after which experiment 1 and experiment 2 were conducted in a permuted order. The participants were not subject to time limits when identifying the displayed items. Prior to the experiments, the illuminance was measured directly at the eHMI prototype because of the possibility of ambient illumination affecting contrast requirements [30]. The average illuminance was *M* = 2812 lx (*SD* = 1092 lx), with a range of 132 lx to 5483 lx. The total duration of the experiment was about 45 min.

#### *4.5. Dependent Variables*

The correctness of the text and symbol identification was evaluated in both experiments. The text was correctly identified and was considered legible if the participant read the sequence of letters in the right order. The arrow and the E were considered legible if the respective symbol and its orientation were correctly identified. In experiment 2, the participants additionally had to state the content color for correct identification. In experiment 1, the content size required for legibility at a distance of 88 m was calculated from the correctly identified content data, while the human detection range from which content of a certain size became legible was investigated in experiment 2.

Experiment 1 collected subjective data regarding the legibility of content, the concentration required for identifying the content, and the participants' confidence in having correctly identified the content, each on a 5-point Likert scale (Table 2). Experiment 2 collected subjective data regarding the participants' confidence in having identified the eHMI content correctly.

**Table 2.** The three items used to collect subjective data.


#### *4.6. Statistical Analysis*

Data preparation was performed with Excel and the statistical analysis was conducted using the software JASP [31]. In experiment 1, since the data were not normally distributed, we applied a Friedman test to analyze the content size required for legibility from a constant distance of 88 m. Post hoc comparisons were conducted using Wilcoxon tests and a Bonferroni correction was applied. The effect size of the Friedman test was classified using Kendall's *W* (small effect: *W* = 0.1; medium effect: *W* = 0.3; large effect: *W* = 0.5). In the case of the Wilcoxon tests, we classified the effect sizes with the Pearson moment correlation *r* (small effect: *r* = 0.1; medium effect: *r* = 0.3; large effect: *r* = 0.5) [32].
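The non-parametric pipeline described above can be sketched with SciPy on synthetic data; the variable names and generated values below are illustrative, not the study data.

```python
# Sketch of the Friedman test with Bonferroni-corrected Wilcoxon post-hocs,
# run on synthetic within-subject data (three content types, n = 30).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 30, 3  # participants, content types (text, arrow, E)
text  = 110 + rng.normal(0, 15, n)  # hypothetical required sizes in mm
arrow = 95 + rng.normal(0, 15, n)
e     = 95 + rng.normal(0, 15, n)

# Friedman test across the three within-subject conditions
chi2, p = stats.friedmanchisquare(text, arrow, e)

# Kendall's W as the effect size of the Friedman test
kendalls_w = chi2 / (n * (k - 1))

# Bonferroni-corrected post-hoc Wilcoxon signed-rank tests
alpha = 0.05 / 3  # three pairwise comparisons
for label, (a, b) in {"text-arrow": (text, arrow),
                      "text-E": (text, e),
                      "arrow-E": (arrow, e)}.items():
    w_stat, p_pair = stats.wilcoxon(a, b)
    print(f"{label}: p = {p_pair:.4f}, significant = {p_pair < alpha}")
```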

In experiment 2, we conducted repeated-measures ANOVAs to evaluate the effects of content type (one ANOVA per content color) and content color (one ANOVA per content type). The assumption of sphericity (Mauchly's test: *p* > 0.05) was always fulfilled. In both cases, we performed a Bonferroni correction. We refrained from analyzing content type and content color within a single ANOVA, as there were missing values for the text, which would have resulted in the exclusion of nine participants from the analysis as a whole. Our approach allowed the data of these participants to be at least partially incorporated into the statistical analysis. We rated the effect sizes by applying partial eta squared η*p*² (small effect: η*p*² = 0.01; medium effect: η*p*² = 0.06; large effect: η*p*² = 0.14) for the ANOVAs and Cohen's benchmarks for *d* (small effect: *d* = 0.2; medium effect: *d* = 0.5; large effect: *d* = 0.8) for the post-hoc comparisons [32].
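The effect-size conventions above can be expressed as small helpers; the function names are ours, and the example F statistic is hypothetical.

```python
# Helpers for the effect-size benchmarks cited above; the function names are
# ours, and the example F value is hypothetical.
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an ANOVA F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def classify(value: float, small: float, medium: float, large: float) -> str:
    """Map an effect-size value onto Cohen-style benchmark labels."""
    if value >= large:
        return "large"
    if value >= medium:
        return "medium"
    if value >= small:
        return "small"
    return "negligible"

# Hypothetical repeated-measures result: F(2, 58) = 20.0
eta2p = partial_eta_squared(20.0, 2, 58)
print(round(eta2p, 3), classify(eta2p, 0.01, 0.06, 0.14))  # 0.408 large
```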

#### **5. Results**

#### *5.1. Experiment 1*

#### 5.1.1. Effect of Content Size

Table 3 shows the absolute number and the percentage of correct identifications according to content size. Almost all participants recognized the three largest sizes, regardless of the content; the only exception was one participant who could not identify the orientation of the arrow at a size of 170 mm. At sizes of 110 mm and 80 mm, the number of correct identifications of the text was considerably lower than that of the arrow and the E.


**Table 3.** Correct identification in absolute and relative terms (*n* = 30).

Figure 6 shows the content size from which the participants could correctly identify the contents. The text was identified correctly at a size of *Mdn* = 110 mm. The arrow (*Mdn* = 95 mm) and the E (*Mdn* = 95 mm) could be identified at a smaller size. The Friedman test reveals a significant effect of content type on the required content size (χ² = 14.59, *p* < 0.001, Kendall's *W* = 0.549). The post-hoc comparisons using Wilcoxon tests (Table 4) show significant differences between the text and the arrow and between the text and the E, each with a large effect.

**Figure 6.** Content size from which the text and symbols were correctly identified (*n* = 30).


**Table 4.** Post-hoc comparisons using Wilcoxon tests.

Note: A Bonferroni correction was applied, and the corrected level of significance was set to α = 0.0167.

#### 5.1.2. Subjective Results

Table 5 contains the participants' subjective ratings of legibility, concentration, and confidence on a 5-point Likert scale. For legibility and concentration, the two biggest content sizes received high ratings of *Mdn* = 4 and *Mdn* = 5, whereas the two smallest content sizes received low ratings of *Mdn* = 2 and *Mdn* = 1. As for their confidence in identifying the display content, the participants gave high ratings for the four biggest content sizes and considerably lower ones for the two smallest content sizes.

**Table 5.** Subjective participant ratings on a 5-point Likert scale with regard to legibility, concentration, and confidence (*n* = 30).


#### *5.2. Experiment 2*

#### Effect of Content Type and Content Color

Figure 7 shows the detection range from which the participants were able to identify the eHMI content for each content color. The text yielded the smallest detection distance for all three colors (Table 6). Table 7 contains the three ANOVAs, one for each color, conducted to evaluate the effect of the content type. For all colors, there were significant effects of the content type on the detection range, with large effect sizes. Post-hoc comparisons for the color white (Table 8) reveal significant differences with a medium effect between the text and the arrow and a large effect between the text and the E. The analysis of the red content indicates a significant difference between the text and the E, with a medium effect. The post-hoc comparison of the green content shows significant differences between all three content types, with a large effect between the text and the E and medium effect sizes between the text and the arrow as well as between the arrow and the E.

We analyzed the influence of content color by conducting three ANOVAs (Table 9). For text, there was a significant difference with regard to the color, with a large effect. Post-hoc comparisons (Table 10) reveal a significant difference in the distance between the colors white and red and a significant difference between the colors red and green, each with a medium effect.

**Table 6.** Descriptive data giving the distance from which the display content could be identified divided by content type and content color.


**Figure 7.** Distance from which the display content could be identified correctly divided by content type and content color.

**Table 7.** Statistics for the ANOVAs conducted to evaluate the effect of content type with respect to content color.


Note: A Bonferroni correction was applied, and the corrected level of significance was set to α = 0.0167.


**Table 8.** Post-hoc comparisons analyzing the content type.

**Table 9.** Statistics for the ANOVAs conducted to evaluate the effect of content color with respect to content type.


Note: A Bonferroni correction was applied, and the corrected level of significance was set to α = 0.0167.

**Table 10.** Post hoc comparisons analyzing the text color.


#### **6. Discussion**

#### *6.1. Effect of Content Type*

An increase in content size increases the legibility of the display content regardless of the content type, as reflected by the higher numbers of correct identifications from a distance of 88 m as well as by the participants' higher legibility ratings. Moreover, the concentration required for identifying the content decreases and the confidence in identifying it increases. An increase in text or symbol size means that the display content takes up more of the total area of the prototype. Since the brightness of each LED was the same within a color scheme in all trials, larger texts or symbols result in a greater number of illuminated LEDs and thus a higher luminous output of the message. In addition to the larger visual angle, the increase in luminance raises the participants' visual acuity [33], so that larger content sizes improve legibility through both mechanisms.

Text and symbols should be at least 140 mm high to be legible from a distance of 88 m. The participants rated their confidence in identifying the display content as sure (*Mdn* = 4) for all content types. Moreover, the percentage of correct identifications drops considerably with smaller content sizes. For safety-critical interactions with AVs at a road bottleneck, the oncoming human driver must always be able to identify the message with confidence. Moreover, in real traffic interactions, environmental factors such as vehicle body movements, as well as the driving activity itself, distract the driver from focusing on the eHMI. We can therefore state that the AV should display its message in a slightly larger size than the minimum value. We recommend a value of between 170 mm (6.64 MOA) and 200 mm (7.81 MOA), as these sizes resulted in participants feeling very confident in identifying the display content. For this content size, a display width of 768 mm was sufficient for displaying different symbols and small blocks of text comprising four to five letters, such as "WALK", "GO", "OK", and "STOP", as proposed in several studies [12,34,35].

According to the standard DIN EN ISO 9241-303 [20], 170 mm is the size that should be used at distances of less than 29 m, while the content size for a distance of 88 m should be 512 mm to comply with a recommended visual angle of 20 MOA. However, according to our findings, a content size of 6.64 MOA to 7.81 MOA is sufficient for good legibility. This result underlines the importance of new international standards for future eHMI development: findings from guidelines that provide technology-, task-, and environment-independent performance specifications and recommendations [20] do not transfer directly to the eHMI use case.

Symbols require a smaller size than text for them to be legible, which coincides with the findings of Kline, Ghali, Kline, and Brown [36]. Moreover, symbols of equal size were legible over longer distances than text. The prototype displayed the symbols individually and not surrounded by other elements. The letters within the text did not stand alone and were not delimited from each other, which complicated the correct identification of individual letters. In addition, it can be assumed that the contours of texts and symbols were blurred by the haze effect [20], which depends, among other things, on the relative atmospheric humidity [37]. Even though the haze effect affects symbols and text equally, the contours of text tend to merge in letters that are close together. The impact of haze and the small distances between multiple letters resulted in the text being misread in 13 attempts in experiment 2. The blurred delimitation of individual letters led to confusion, for instance, between the letters C, O, and G, as well as between F and P. In contrast to text identification, participants expected the symbol to be displayed, which means that the symbol type was already identified and only its orientation had to be determined. For safety-critical AV–human driver interaction at road bottlenecks, these findings imply that standalone symbols should be used for communication in order to achieve the most accurate identification and the greatest possible legibility of the AV's message. Moreover, taking into account the comments of the participants, it can be concluded that if using arrows for communication, the arrow tips should be designed more distinctly to improve identification of its orientation. This is reflected in the lower legibility rating of the arrow compared with the distinct orientation of the E for sizes greater than 170 mm.

#### *6.2. Effect of Content Color*

The statistical analysis showed that the effect of color was significant for displaying text, in a way that the color red was found to be readable from greater distances, although this color had the lowest contrast ratio. There was no significant effect of symbol color on the human detection range. This finding may be due to the fact that contrast and luminance are confounded variables [30] and thus human visual performance varies with different ratios of contrast and luminance [38]. The red light may have affected the contrast–luminance ratio between several letters in favor of better legibility. All in all, we can state that the influence of color was negligible, which corresponds with the findings of Lin [39], who showed that the color of letters has no significant effect on the visual performance of text identification on TFT-LCD monitors.

We recommend the use of symbols for AV communication (Section 6.1). As the factor of color has no effect on the human detection range, we are free to use red, green, or white in an eHMI design in order to attain good legibility. Moreover, the display provides color fidelity at viewing angles of less than 140°, and humans are able to perceive the colors red and green within an area of 65° and 60°, respectively [40]. Therefore, in straight approach scenarios like the AV–human driver interaction at a road bottleneck, it is possible to communicate via color and, at the same time, there is no risk of reducing the human detection range. This enables coding of AV messages via colors, leading to faster reaction times if the color meets the expectation of the human interaction partner [41]. Red and green are familiar from traffic in the context of yielding or insisting on the right of way. As an example, when texts are green, participants perceive a higher level of safety to cross the street [42], while using green symbols to communicate yielding the right of way at a road bottleneck enables an efficient and safe passage for the human driver [2].

#### *6.3. Limitations*

The sample taking part in the study consisted mainly of young participants between the ages of 25 years and 30 years. This means that a considerable proportion of human drivers were not represented. Elderly people, in particular, are more likely to suffer from vision deficiency such as impaired contrast sensitivity [43], which can influence the results of the experiments. A future study should therefore use an age-balanced sample.

Moreover, in contrast to interactions at road bottlenecks, the participants identified the display content without sitting in a vehicle. Thus, the investigation did not take into account the potential influence of the windshield on the legibility of the display. Additionally, vehicle body movements and dirt can impair the eHMI's legibility in real traffic. A further limitation is that the absence of any driving activity meant that participants could devote their full attention to the display. To counteract these effects, we do not recommend the minimum legible content size of 140 mm, but instead a range of 170 mm–200 mm for use in eHMI designs.

The experiments were conducted on dry winter days. Thus, the analysis did not consider the influence of summer light conditions or rainfall. Before conducting the experiments, we measured the illuminance. Initial analysis indicated an effect of illuminance on the human detection range such that an increase in illuminance led to an increase in range. We refrained from presenting this result in the present paper, because in addition to illuminance, there are several other parameters, such as luminance distribution, light color, and glare [44], which characterize real-life lighting conditions, while haze [37] and thus legibility are affected by air humidity and fog. Therefore, we could not assign the effect only to illuminance. To investigate the influence of individual factors, these need to be isolated and examined in a controlled environment in future work.

#### **7. Conclusions**

Content type significantly influences the required display size, with a large effect. Symbols can be displayed in a smaller size than text for them to be legible from a constant distance. Moreover, symbols can be identified at a greater distance than text, which means that in the same scenario the human interaction partner has more time to perceive and process an AV's message in the form of a symbol. In the bottleneck scenario, we state that the height of the display content should be 170 mm (6.64 MOA) to 200 mm (7.81 MOA), as this leads to very good legibility at a distance of 88 m and the majority of the participants were able to identify the smaller content in experiment 2 from even greater distances. In addition, this recommendation considers potential environmental influences that may negatively affect legibility.

Regardless of the display content, we did not find a consistent effect of color on the human detection range; the influence of color was only significant when displaying text. In conclusion, we state that in order to ensure the widest possible range of AV communication, the colors investigated in this study are suitable for displaying simple symbols without running the risk of negatively influencing legibility. Therefore, color coding in addition to the symbol shape can be employed in the interests of good legibility and communicating AV messages more clearly.

**Author Contributions:** Conceptualization, M.R. and J.S.; data curation, M.R. and J.S.; formal analysis, M.R. and J.S.; funding acquisition, K.B.; investigation, M.R. and J.S.; methodology, M.R. and J.S.; project administration, M.R.; software, J.S.; supervision, K.B.; validation, M.R. and J.S.; visualization, M.R.; writing—original draft, M.R.; writing—review and editing, M.R. All authors have read and agreed to the published version of the manuscript.

**Funding:** The German Federal Ministry of Economics and Energy funded this research within the project @City: Automated Cars and Intelligent Traffic in the City, grant number 19A17015B.

**Conflicts of Interest:** The authors declare no conflicts of interest.

#### **References**


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

