*4.3. Explainability of the Machine Learning Model*

Once the model has been validated and the indicators are found to be above the preset minimum, a study of the model's explainability is carried out.

Explainability shows how the machine learning model has learned: which behavioural patterns it found in the data and how it arrives at its predictions. The analysis is performed with the Python SHAP library, which also produces the explainability graphs discussed in this section. The method was introduced in [41], whose authors describe it as follows: "SHAP values attribute to each feature the change in the expected model prediction when conditioning on that feature. They explain how to get from the base value E[f(z)] that would be predicted if we did not know any features to the current output f(x)" [41].
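To make the quoted definition concrete, the following minimal sketch computes exact Shapley values by brute force for a toy scoring function. The function `toy_model`, its three features and the background rows are hypothetical and are not the model or data of this study, and the SHAP library relies on far more efficient algorithms than this enumeration. The sketch only illustrates that the base value plus the per-feature attributions recovers the model output.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, background):
    """Exact Shapley values of f at x.

    Features outside a coalition are treated as unknown and replaced,
    in turn, by the values of each background row; the outputs are averaged.
    """
    n = len(x)

    def value(subset):
        # Expected model output when only the features in `subset` are known.
        total = 0.0
        for z in background:
            mixed = [x[i] if i in subset else z[i] for i in range(n)]
            total += f(mixed)
        return total / len(background)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            for s in combinations(others, size):
                # Classical Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                contribution += weight * (value(set(s) | {i}) - value(set(s)))
        phi.append(contribution)

    base_value = value(set())  # E[f(z)]: prediction with no feature known
    return base_value, phi


# Hypothetical toy score over [is_summer_month, is_saturday, n_aircraft].
def toy_model(row):
    is_summer, is_saturday, n_aircraft = row
    return (0.5 * is_summer + 0.3 * is_saturday
            + 0.05 * n_aircraft + 0.2 * is_summer * is_saturday)


# Hypothetical background sample and instance to explain.
background = [[0, 0, 2], [1, 0, 8], [0, 1, 4], [1, 1, 10]]
x = [1, 1, 12]

base_value, phi = shapley_values(toy_model, x, background)
print("base value:", base_value)
print("attributions:", phi)
print("base + sum(attributions):", base_value + sum(phi))
print("model output f(x):", toy_model(x))
```

The last two printed values agree up to floating-point error, which is the path "from the base value E[f(z)] to the current output f(x)" described in the quotation.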

The first graph shows the influence of the temporal variables on the model's prediction. The SHAP algorithm starts from an initial expected value and adjusts it towards the actual prediction according to the values of the input variables. It is therefore possible to analyse the effect of each input variable: whether it pushes the prediction towards regulated (adding to the initial expected value) or towards unregulated (subtracting from it). Figure 9 presents the effect of the period of the day (a), the day of the week (b), and the month of the year (c) on the final prediction.

Figure 9 closely resembles the histograms shown in Figure 6. Because Figure 9 represents the influence of every sample in the training data, it follows the distribution of the training sample seen above. Regarding the period of the day, up to about period 40 (06:30), this variable allows predicting the sector as regulated. On the other hand, from period 90 to 120 (15:00–20:00), there are two peaks where this variable helps the sector to be regulated. This is analogous to Figure 6c, where the time of occurrence of regulations peaks at 15:00 and 18:00.

**Figure 9.** Effect of date-based variables on the output of the model. Analysis of hour (**a**), day of the week (**b**) and month (**c**).

Furthermore, the day of the week pushes the prediction towards regulated only sometimes on Fridays and Sundays, and always on Saturdays. This is clearly analogous to Figure 6b, where most of the regulations in LECMPAU occur on Saturdays and Sundays.

As for the months of the year, the trend is also similar to the histogram in Figure 6a. From June to September, the model tends to predict the sector as regulated, while during the rest of the year it tends to predict it as unregulated.
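The reading of Figure 9 can be reproduced numerically by averaging the SHAP values of a variable for each of its values. The sketch below does this for the month of the year. The months and SHAP values are made-up numbers chosen only for illustration, not results of this study, and `mean_shap_by_value` is a hypothetical helper rather than part of the SHAP library.

```python
from collections import defaultdict
from statistics import mean


def mean_shap_by_value(feature_values, shap_values):
    """Average SHAP value of one feature, grouped by the feature's value."""
    groups = defaultdict(list)
    for value, shap_value in zip(feature_values, shap_values):
        groups[value].append(shap_value)
    return {value: mean(groups[value]) for value in sorted(groups)}


# Made-up month (1-12) of eight samples and the SHAP value of the
# month variable for each of them.
months = [1, 1, 3, 7, 7, 8, 8, 11]
month_shap = [-0.4, -0.2, -0.3, 0.6, 0.4, 0.5, 0.7, -0.1]

profile = mean_shap_by_value(months, month_shap)

# A positive mean pushes the prediction towards "regulated",
# a negative mean towards "unregulated".
pushes_regulated = [month for month, s in profile.items() if s > 0]
print(profile)
print("months pushing towards regulated:", pushes_regulated)
```

With these illustrative numbers, only the summer months have a positive mean contribution, which is the kind of pattern described for Figure 9c.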

Based on these results, it can be said that the model has captured the temporal trends and tailors its prediction to the time data. This suggests that the approach of the model, which focuses mainly on the analysis of temporal patterns, is sound. Furthermore, Figure 10 shows the same type of graph for the number of aircraft (a) and the number of flows (b) in the sector. This analysis checks whether the model has also been able to find patterns in the traffic data of the sector.

**Figure 10.** Effect of the number of aircraft and flows on the output of the model. Analysis of number of aircraft (**a**) and number of flows (**b**).

These variables, which represent air traffic behaviour, also behave quite intuitively. When there are hardly any aircraft in the sector (from 0 to 5), this variable pushes the prediction towards not regulated, whereas from five aircraft upwards it pushes the prediction towards regulated. This makes sense from an operational point of view, as the sector is most likely to be regulated when there is a considerable amount of traffic. The number of flows follows a similar trend, with the threshold at four flows: from there upwards, this variable pushes the prediction towards regulated.
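A threshold of this kind can be located as the smallest value from which the mean SHAP contribution of a variable stays positive. The following sketch assumes a per-value profile of mean SHAP values is already available. The numbers in `flows_profile` are made up for illustration, and `regulation_threshold` is a hypothetical helper.

```python
def regulation_threshold(profile):
    """Smallest feature value from which all mean SHAP values are positive.

    `profile` maps a feature value to its mean SHAP value.
    Returns None if the largest feature value does not have a positive mean.
    """
    threshold = None
    # Walk down from the largest value while the contribution stays positive.
    for value in sorted(profile, reverse=True):
        if profile[value] > 0:
            threshold = value
        else:
            break
    return threshold


# Made-up mean SHAP value of the number-of-flows variable per flow count.
flows_profile = {0: -0.30, 1: -0.25, 2: -0.15, 3: -0.05,
                 4: 0.10, 5: 0.20, 6: 0.35}

print("threshold:", regulation_threshold(flows_profile))
```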

In addition, this explanatory study presents the relative importance of the model's variables. The SHAP algorithm can also identify which variables are most important in the prediction of both classes. The graph is presented in Figure 11.

**Figure 11.** Relative importance of the variables in the machine learning model.
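A common way to obtain such a ranking from SHAP values is to average their absolute value per variable over all samples. The sketch below assumes this convention, which the text does not state explicitly. The variable names and SHAP values are made up for illustration, and `mean_abs_importance` is a hypothetical helper.

```python
def mean_abs_importance(shap_rows, feature_names):
    """Rank features by the mean absolute SHAP value over all samples."""
    n_samples = len(shap_rows)
    totals = [0.0] * len(feature_names)
    for row in shap_rows:
        for j, shap_value in enumerate(row):
            totals[j] += abs(shap_value)
    importance = {name: total / n_samples
                  for name, total in zip(feature_names, totals)}
    # Most important feature first.
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


# Made-up SHAP values: one row per sample, one column per variable.
feature_names = ["period_of_day", "day_of_week", "month",
                 "n_aircraft", "n_flows"]
shap_rows = [
    [0.40, -0.30, 0.50, 0.20, 0.10],
    [-0.50, 0.20, -0.60, -0.10, -0.05],
    [0.30, -0.40, 0.70, 0.15, 0.05],
]

ranking = mean_abs_importance(shap_rows, feature_names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Taking the absolute value means that a variable counts as important whether it pushes the prediction towards regulated or towards unregulated.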

The variables with the greatest relative importance are the time variables. This leads to the conclusion that the model relies on time-based components, and that it is possible to predict whether the sector will be regulated or unregulated from the date of analysis. Next in importance are the number of aircraft in the sector and the presence of various traffic flows. In particular:


The flows include the main flows of each of the previously classified groups. Two of them represent the flows that cross the sector from north to south, as traffic from Madrid-Barajas is the most influential in LECMPAU.

To conclude the explanatory analysis, two examples of predictions and how they are influenced by different variables are presented. Figure 12 shows an example where the sector will not be regulated, and Figure 13 shows an example where it will be regulated.

**Figure 12.** SHAP explanation of a not-regulated example.

The variables that push the prediction towards regulated are shown in red, and those that push it towards not regulated are shown in blue. In the first example, a Tuesday in March, the time component dominates the prediction even though there are 12 aircraft in 5 flows in the sector, so the sector is predicted as not regulated. In the second example, a Monday in October, the sector is predicted as regulated. In this case, the time component is more complex. The period of the day and the day of the week push towards not regulated, but the month of the year compensates for this tendency. The latter variable, together with the traffic component of the sector, means that the sector is expected to be regulated.
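The logic of these individual explanations can be sketched as a sum of signed contributions around the base value. The example below mimics the first case (a Tuesday in March with 12 aircraft in 5 flows). All numbers, including the base value, are made up for illustration and are not read from Figure 12, and the convention that a negative output means not regulated is an assumption of the sketch.

```python
def explain_prediction(base_value, contributions):
    """Split SHAP contributions by sign and rebuild the model output."""
    pushing_up = {k: v for k, v in contributions.items() if v > 0}    # red
    pushing_down = {k: v for k, v in contributions.items() if v < 0}  # blue
    output = base_value + sum(contributions.values())
    return output, pushing_up, pushing_down


# Made-up base value and per-variable SHAP contributions.
base_value = -0.2
contributions = {
    "month=March": -0.9,
    "day_of_week=Tuesday": -0.5,
    "period_of_day": -0.3,
    "n_aircraft=12": 0.4,
    "n_flows=5": 0.3,
}

output, red, blue = explain_prediction(base_value, contributions)

# Assumed convention: output > 0 means "regulated".
label = "regulated" if output > 0 else "not regulated"
print("towards regulated (red):", red)
print("towards not regulated (blue):", blue)
print("output:", output, "->", label)
```

In this illustration the traffic variables push towards regulated, but the three time variables outweigh them, so the output stays below zero.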

This explainability analysis gives an idea of how the algorithm behaves and on which variables it bases its predictions. Thanks to this analysis, it is possible to validate the model and the methodology developed from an operational point of view. This, together with the results obtained, which are above the established standards, means that the methodology is considered a success in LECMPAU.
