**4. Discussion**

The traffic and trajectory analysis shows that in the experimental scenarios S30 and S60, trajectories were on average 7.5% shorter than in the baselines R1 and R2. Considered cumulatively, the share of 4D-FMS aircraft allowed a greater number of flights to be conducted with shorter approach distances on average. Although capacity was not intended to be addressed by this solution, the ATCO assistance tools were observed to support the ATCOs effectively in guiding traffic during the validation activities. Taking this into consideration, a greater number of FMS-equipped aircraft were routed more efficiently for landing by all ATCOs.

Two conclusions about mental workload can be drawn from the ISA measurements. Firstly, mental workload remained at acceptable levels in both the S30 and S60 scenarios, indicating that no mental overload arose. This evaluation was confirmed by the ATCOs' feedback during the debriefing sessions. However, ISA ratings during the S80 scenario pointed towards mental underload, because increasing automation took away much of the traffic guidance work. Mental workload could be expected to be lower in simulations than in real operations. Nevertheless, since mental underload is a potential safety risk because crucial events can be missed, this should be tested in further validation campaigns with adjusted preconditions, for example, improved visualisation of the tactical assistance systems and a larger sample size. Secondly, the mental workload experienced by the ATCOs seemed to be inversely related to the percentage of 4D-FMS aircraft. Increasing the number of untouchable 4D-FMS aircraft reduces the share of 3D-FMS aircraft navigated by the ATCO. Indeed, the number of aircraft an ATCO manages simultaneously at a given time has been the most commonly used index to estimate workload [56]. However, this index is influenced by the way aircraft are spread over space and time [56], and therefore fewer aircraft to be managed does not necessarily result in less workload. An alternative explanation could be linked to the main task of the ATCO: given the route structure (separated by design) and the sequence proposed by the AMAN (considering required separation), the ATCO mainly monitored and guided the 3D-FMS aircraft towards the TargetWindow to meet the optimal position on final approach, unless the ATCO chose an alternative path based on direct routing. Consequently, the higher the number of 4D-FMS aircraft, the less intervention is required from the ATCO, potentially resulting in lower mental workload.

The results of the NASA TLX analysis show that the mean global score and all mean sub-scores were higher in the S30 scenario than in the S60 scenario, indicating, on a descriptive level, higher overall workload in S30 than in S60. These results coincide with the ISA ratings. Additionally, standard deviations were especially high for the frustration and performance subscales, meaning that answers on these subscales varied widely. This divergence was also apparent during the debriefings and explorative simulation runs. For instance, some ATCOs felt comfortable being in charge of fewer 3D-FMS aircraft, while others expressed frustration about the untouchable character of 4D-FMS aircraft. Likewise, some ATCOs tried to further optimise the sequence proposed by the AMAN, while others reported strictly following the proposed sequence. This difference in strategy might have affected the ATCOs' perceived performance ratings. Nevertheless, the NASA TLX analysis shows that increasing the number of 4D-FMS aircraft lowers the ATCOs' perceived workload. This could free up spare mental capacity, which can be used for other ATCO tasks, such as safety monitoring or improving sequence planning.
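The descriptive comparison above (mean and standard deviation per subscale, plus a global score) can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in the study: it assumes the unweighted ("raw") TLX variant, in which the global score is the plain average of the six subscales, and a 0 to 100 rating scale. The function name `describe_tlx` and the dictionary layout are hypothetical.

```python
from statistics import mean, stdev

# The six standard NASA TLX subscales.
SUBSCALES = ("mental", "physical", "temporal",
             "performance", "effort", "frustration")


def describe_tlx(ratings):
    """Return {name: (mean, sample SD)} for each subscale and the global score.

    `ratings` is a list with one dict per participant, mapping each
    subscale name to that participant's rating (assumed 0-100).
    Assumption: raw TLX, i.e. the global score of a participant is the
    unweighted mean of the six subscale ratings.
    """
    summary = {}
    for name in SUBSCALES:
        values = [r[name] for r in ratings]
        summary[name] = (mean(values), stdev(values))
    # One global score per participant, then summarised across participants.
    global_scores = [mean(r[name] for name in SUBSCALES) for r in ratings]
    summary["global"] = (mean(global_scores), stdev(global_scores))
    return summary
```

Calling `describe_tlx` once per scenario (for example, once with the S30 ratings and once with the S60 ratings) yields the per-subscale means and standard deviations that can then be compared side by side; a large standard deviation on a subscale reflects the kind of divergence between participants described above.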

During the debriefing session, ATCOs named both the ghosts and the TargetWindows as beneficial, if not essential, for increasing and maintaining situation awareness. However, one ATCO raised the concern that situation awareness could be lost if the share of 4D-FMS-equipped aircraft is too high. Communication between ATCOs and 4D-FMS pilots is reduced to a minimum after the initial call, and such a limited exchange of information could reduce situation awareness for specific aircraft. In short, apart from the effects discussed, the perceived ATCO situation awareness remained at an acceptable level for a 4D-FMS aircraft percentage of up to 60%. More research will be needed to assess the impact of higher percentages of untouchable 4D-FMS aircraft on situation awareness. Monitoring automated systems and assuming a more passive role instead of actively engaging with a system can impair situation awareness, possibly resulting in an out-of-the-loop performance problem [57]. As a higher share of 4D-FMS aircraft leads to the ATCO passively monitoring more aircraft, an overly large number of 4D-FMS aircraft might lower situation awareness. This possibility should be critically considered in future research.

To sum up, the new airspace design and supporting functions were considered acceptable in terms of safety from the ATCOs' perspective. ATCOs felt able to provide the same safety level as in current operations. Nevertheless, they pointed out potential safety risks, such as the possible loss of situation awareness for overly high shares of 4D-FMS aircraft.

Although the qualitative and quantitative assessments provided promising initial results, some limitations remain. These limitations are mainly related to the constraints on the large-scale implementation of such systems. For example, no baseline scenario was used to compare human performance results. Additionally, the human performance data were analysed on a non-parametric, descriptive level; i.e., no statements can be made regarding statistically significant differences due to the sample size of five per iteration. Therefore, the GreAT concept and its impacts on efficiency, mental workload and situation awareness should be tested with a larger sample.
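To illustrate why a sample of five limits inferential statements, the sketch below enumerates the exact null distribution of a Wilcoxon signed-rank test and returns the smallest two-sided p-value it can produce. This is an illustrative assumption, not the study's analysis: it presumes a paired (within-participant) comparison with no ties and no zero differences, and the function name `min_two_sided_p` is hypothetical.

```python
from itertools import product


def min_two_sided_p(n):
    """Smallest two-sided p-value of an exact Wilcoxon signed-rank test.

    Assumes n paired differences with no ties and no zeros. Under the
    null hypothesis every one of the 2**n sign patterns of the ranks
    1..n is equally likely.
    """
    ranks = range(1, n + 1)
    # Sum of positively signed ranks for every possible sign pattern.
    w_plus = [sum(r for r, s in zip(ranks, signs) if s)
              for signs in product((0, 1), repeat=n)]
    w_max = n * (n + 1) // 2  # all differences point the same way
    # One-sided probability of the most extreme possible outcome.
    p_one_sided = sum(1 for w in w_plus if w == w_max) / len(w_plus)
    return min(1.0, 2 * p_one_sided)
```

Under these assumptions, `min_two_sided_p(5)` evaluates to 2/32 = 0.0625, so even a perfectly consistent effect across all five participants could not fall below the conventional 0.05 threshold, whereas six participants would give 2/64 = 0.03125.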
