*2.6. Validation Methods and Techniques*

During the simulation runs, all aircraft data were recorded, including performance data such as velocity, three-dimensional position and actual thrust value. The resulting log files were used in post-analysis to examine the concept's impact on the dependent variables flight distance and number of landed aircraft.

The dependent variables mental workload, perceived safety and situation awareness were assessed on the basis of questionnaires and debriefing sessions. Two different sets of questionnaires were administered. The PRQ was used after each simulation run. It includes the NASA Task Load Index (NASA-TLX) [47,48] and the situation awareness part (SASHA) of the Solutions for Human Automation Partnerships in European ATM (SHAPE) questionnaire, which was developed to assess the effects of system automation and trust on ATCOs' situation awareness [49,50].

NASA-TLX was used to assess the different dimensions of workload [47]. It comprises the subscales mental demand, physical demand, temporal demand, performance, effort and frustration. The physical demand subscale was omitted in the present trials, as no physical demand was expected for the task. Participants were instructed to place a score on slider bars with 21 gradations each, ranging from 0 (low) to 100 (high) (or 0 (good) to 100 (poor) in the case of performance) in steps of 5. Raw TLX ratings were used; i.e., the subscales were not weighted. According to Hart [47], this is common practice and does not reduce sensitivity. A global raw TLX score was computed as the mean of the five subscale ratings.
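
The computation of the global raw TLX score described above can be illustrated with a minimal sketch. The function name `raw_tlx`, the subscale keys and the input checks are assumptions made for illustration only; they are not taken from the study's analysis software.

```python
# Illustrative sketch only: names and validation rules are assumptions,
# not the analysis code used in the study.

# The five subscales retained in the trials (physical demand was omitted).
SUBSCALES = (
    "mental_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings: dict) -> float:
    """Return the unweighted (raw) TLX score: the mean of the five subscale ratings."""
    missing = [name for name in SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscale ratings: {missing}")
    for name in SUBSCALES:
        value = ratings[name]
        # Slider bars run from 0 to 100 in steps of 5 (21 gradations).
        if not 0 <= value <= 100 or value % 5 != 0:
            raise ValueError(f"invalid rating for {name}: {value}")
    return sum(ratings[name] for name in SUBSCALES) / len(SUBSCALES)


# Hypothetical ratings for one simulation run.
example = {
    "mental_demand": 55,
    "temporal_demand": 70,
    "performance": 30,
    "effort": 60,
    "frustration": 25,
}
print(raw_tlx(example))  # 48.0
```

Because the subscales are not weighted, each of the five ratings contributes equally to the global score.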

In order to assess ATCOs' experienced situation awareness, the SASHA questionnaire [50] was administered. SASHA consists of six items on a 7-point Likert scale from 0 (never) to 6 (always) [51]. By inverting the ratings of items 2, 3, 5 and 6 and then calculating the mean of all item ratings, the overall SASHA score was computed [50]. A higher score represented higher situation awareness and was thus preferable.
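
The SASHA scoring can be sketched as follows. The sketch assumes that inverting a rating on the 0 to 6 scale means replacing a rating r with 6 - r, which is the usual reverse-scoring convention but is not spelled out in the text; the function name `sasha_score` is likewise hypothetical.

```python
# Illustrative sketch only. Assumption: "inverting" a rating r on the
# 0-6 scale means replacing it with 6 - r (standard reverse scoring).

REVERSED_ITEMS = {2, 3, 5, 6}  # items inverted before averaging (1-based)
SCALE_MAX = 6                  # scale runs from 0 (never) to 6 (always)


def sasha_score(ratings: list) -> float:
    """Return the overall SASHA score for six item ratings given in item order."""
    if len(ratings) != 6:
        raise ValueError("SASHA requires exactly six item ratings")
    adjusted = []
    for item_no, rating in enumerate(ratings, start=1):
        if not 0 <= rating <= SCALE_MAX:
            raise ValueError(f"item {item_no}: rating {rating} is outside 0-6")
        if item_no in REVERSED_ITEMS:
            rating = SCALE_MAX - rating
        adjusted.append(rating)
    return sum(adjusted) / len(adjusted)


# Hypothetical ratings for items 1 to 6.
print(sasha_score([4, 1, 2, 5, 0, 3]))  # 4.5
```

With this convention the score ranges from 0 to 6, and the best possible response pattern (6 on items 1 and 4, 0 on the inverted items) yields a score of 6.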

The post-exercise questionnaire (PEQ) was administered after the ATCOs had completed the full simulation day. The PEQ included a bespoke questionnaire, of which only selected statements about situation awareness and perceived safety are reported in this paper. Statements were rated on a 5-point Likert scale from 1 (strongly disagree) to 5 (strongly agree). Means and standard deviations were calculated for the bespoke statements, with a mean rating of 3 used as the success criterion.

In addition to the introduced ATCO radar and supporting tools, the Instantaneous Self Assessment (ISA) measure was integrated into the CWP on a second touchscreen to obtain subjective mental workload ratings [52–54]. The ATCO was prompted every five minutes to rate their perceived mental workload on a five-point scale (1 = under-utilised, 5 = excessively busy) [55]. The data were used afterwards to evaluate the ATCOs' perceived mental workload in different traffic situations.
