1. Introduction
The aim of this paper is to identify the causes of flight delays and to analyze the changes in the delay situation over the period 2013–2019. Furthermore, the impact of various factors on the length of flight delays for a selected airline operating in Europe was assessed. The aim is also to identify the potential risks and causes of delays in air transport, which entail a risk in particular in relation to passenger dissatisfaction and the resulting need to pay compensation for delayed flights. This imposes high financial costs on airlines under EU law.
This research has been conducted for a minor airline (approx. 10,000 flights per year, approx. 30 airplanes) operating in the European region. The results could therefore be generalized and applied to similarly sized airlines, while the primary airline will use the data to optimize their internal policies in order to curb the reasons for delays.
This article attempts to provide an evaluation of the causes of flight delays at European international airports used by a selected airline. The classification of the delay causes was based on the so-called IATA codes modified for the company’s needs; the codes are listed in Table 1. The delay causes were compared against other factors affecting the delays. These factors emerged directly from our own research. In essence, these are categorical data or at least data suitable for categorization. Using tests of independence in contingency tables, the development of the delay causes and flight parameters over time was investigated. The results are also displayed graphically via correspondence analysis.
Delays affect both passengers and airport staff; the issue is addressed, for example, by Wu and Truong [2] and by Yıldız et al. [3]. They compared the IATA delay data system with a coding system developed by the authors themselves. Hsu et al. [4] suggested strategies to reduce delays, such as setting scheduled times for the completion of a process, increasing the number of service counters, and providing priority service for emergent flights. Optimization of airport check-in scheduling at the passenger terminal was the subject of the article by Al-Sultan [5]. The article by Skorupski and Wierzbinska [6] deals with the difficulties encountered due to late check-ins and looks for an optimal time limit after which it is appropriate to stop waiting for latecomers. The article by Evans and Schäfer [7] describes the situation at airports using game theory and subsequent simulations, with the goal of finding a way to avoid long delays. Their paper presents a model that depicts and predicts the operational responses of airlines to airport capacity problems. Short-term prediction of national airport throughput with a graphical attention recurrent neural network was investigated by Zhu et al. [8]. Slot allocation for an airport network in the presence of uncertainty was addressed by Liu et al. [9].
The association between delays and higher fuel consumption is addressed by Ryerson et al. [10], who concluded that the major culprit of delays and, therefore, higher fuel consumption is inefficient flight plans. Many other authors describe potential flight optimization that would minimize the occurrence and propagation of delays, for example Xing et al. [11] and Gladys et al. [12]. The economic aspects of the subject, including possible optimizations, are widely discussed by Pan et al. [13] and by Addu et al. [14]. The issue of flight optimization is also analyzed in [15,16,17,18]. The work by Samá et al. [19] suggests models that optimize air traffic with the aim of minimizing total travel and delay time caused by aircraft conflicts. According to Ionescu et al. [20], flight schedules often do not sufficiently account for delays that could occur due to unforeseeable events, late notification of a technical defect, or congestion of the airport or airspace. The designed models allow for more accurate predictions of delays.
Further research focused on modeling the course and propagation of delays during subsequent flights; see, for example, Campanelli et al. and Rebollo and Balakrishnan [21,22]. The article by Kafle and Zou [23] examines more specifically chained delays, i.e., the propagation of delays to subsequent flights. To tackle this issue, the authors employ analytical and econometric estimation. Their model identifies and quantifies various factors influencing the delay. The results of the analysis may help airlines plan their flights. The description and measurement of the propagation of delays to other flights were discussed in detail by Brueckner [24]. Holdups entailing delays of subsequent flights are the prevailing topic of the article by Hao et al. [25], which uses data from New York airports. Both our study and that of Hao et al. are based on data on delayed flights covering the past few years. Hao et al. [25] compare an econometric model and a simulation model. Since the simulation model fails to depict all factors, the results of the two models may differ on several occasions. Another article, by Arikan et al. [26], focuses on the situation at US airports. The authors search for the airports that are the source of the highest number of delays influencing subsequent flights, and they address the question of how increasing the capacity of these airports would improve the situation. They look for an appropriate metric to measure the robustness of airline schedules, and the proposed models aim to make these schedules more resilient. The authors seek an answer to the following question: which flight in the daily aircraft rotation causes the greatest difficulties and therefore requires special attention? They identified two factors influencing the delay of consecutive flights. The first is the randomness of the intrinsic flight time for scheduled flights, and the second is the propagation of this randomness through the transport infrastructure and air travel network. The stochastic models used are intended to develop robustness measures for airline networks.
The article by Liang and Chaovalitwongse [27] describes the optimization of flight plans that would allow for maximizing aircraft utilization and minimizing maintenance costs, thus improving the competitiveness of the airline. They suggest optimization models intended to tackle logistical problems. Others [28] propose an optimization model for flight plans to minimize delays, while some [29] suggest flight planning models based on logistic and quantitative regression methods. These models would predict the situation up to several months ahead. Similarly to this paper, Samá et al. [19] focus on aircraft delays at major European airports. They design various optimization models, focusing especially on aircraft movement through the airport area. Optimization of departure runway planning, including arrival crossings, was addressed by Ma et al. [30]. The article by Simaiakis and Balakrishnan [31] sets out the optimization of traffic flow, queuing, and waiting of aircraft during taxi-out and on departure runways at the airport using queuing systems. The authors of another article [32] address the imbalance between demand and airport capacity with regard to possible administrative and economic measures.
Optimization of airport operations and flight schedules leads to the elimination of aircraft delays and, therefore, fuel savings, which can have positive environmental impacts. Paprocki [33] describes the business model of the virtual airport hub, an innovative solution supported by digital technologies, the implementation of which in continental air transport can lead to a reduction in energy consumption and greenhouse gas emissions. Szaruga and Zaloga [34] proved that airports compete with each other for passengers and carriers instead of cooperating and shifting some traffic to geographically close airports. If airports cooperated, the idea of sustainable development would be respected (with respect to the rationalization of energy consumption).
Zámková and Prokop [35,36,37] analyzed factors influencing flight delays at Czech international airports, with a focus on the financial implications of delays in the airline business, and also analyzed the factors influencing flight delays at airports located in popular destinations for Czech tourists. Forbes et al. [38] claim that it would be appropriate for airlines to release information concerning the number of flights delayed by more than 15 min. The views of selected sources and authors on this issue are summarised in Table 2.
Air traffic management is one of the most important strategic management factors for airlines; it deals with the management of internal processes, which also include, e.g., operational management and financial management [39,40].
A detailed description of the data, the selected variables and the methods used to process them follows in the next section of the paper. The results section is followed by an analysis of the different causes of delays and their effects on the length of aircraft delays. Furthermore, the change in the situation regarding flight delays during the period 2013–2019 is examined in detail.
2. Materials and Methods
The data originated from an internal database of an airline operating in the European region. Primary data were acquired within the peak season (June–September) in 2013–2019, and they contained information about the lengths and the causes of delays as well as the aircraft type, departure dates and times, and delay reasons according to the IATA classification. The numbers of both delayed and not delayed flights are listed below in Table 2. The departure and arrival airports include airports in central Europe, as well as airports in the vicinity of holiday destinations, mostly in the Mediterranean. The data were obtained in cooperation with a specific airline, operating primarily in Europe, that asked the authors to publish the data anonymously. The sample contains complete data for all flights in the summer season (June–September) in 2013–2019 and covers tens of thousands of flights.
The data obtained from the internal database are categorical in nature. Therefore methods suitable for processing this type of data were used. The aim of the statistical analysis was to determine the relationship between the length of flight delays and the type of aircraft, and the occupancy of the aircraft. Furthermore, it was examined how the frequency of delayed flights varies over the period of interest for different types of flights, different times of the day, different causes of delay and different lengths of delay. These dependencies were examined by contingency table analysis and graphically represented using bivariate correspondence analysis. Three-dimensional correspondence analysis was also used to examine the dependencies of multiple variables simultaneously.
The variables used for the analysis were: length of delay, categorized as (No delay, less than 15 min, 0:15–0:30, 0:31–1:00, 1:01–2:00, 2:01 and more); aircraft type (Airbus A319, Airbus A320, Boeing 737–500, Boeing 737–700, Boeing 737–800); load factor (0–20%, 21–40%, 41–60%, 61–80%, 81–100%); flight type (empty, charter, regular); year observed (2013–2019); time of day (0:01–6:00, 6:01–12:00, 12:01–18:00, 18:01–24:00); cause of delay (see Table 1).
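As an illustration of how a raw delay could be mapped to the delay-length categories listed above, the following is a minimal sketch. It assumes the delay is available in whole minutes and that early or on-time departures count as "No delay"; the names `DELAY_BINS` and `delay_category` are hypothetical and do not come from the airline's database.

```python
# Delay categories as listed in the text. The minute thresholds are an
# assumption about how the clock-style labels map to whole minutes.
DELAY_BINS = [
    (0, "No delay"),
    (14, "less than 15 min"),
    (30, "0:15–0:30"),
    (60, "0:31–1:00"),
    (120, "1:01–2:00"),
]


def delay_category(delay_minutes):
    """Map a delay in whole minutes to one of the six categories."""
    for upper_bound, label in DELAY_BINS:
        if delay_minutes <= upper_bound:
            return label
    return "2:01 and more"


# Usage: categorize a few hypothetical delays.
print([delay_category(m) for m in (0, 7, 15, 45, 121)])
```

Once every flight carries such a label, the categories can be cross-tabulated against aircraft type, load factor, or year to form the contingency tables analyzed later.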
In categorical data analysis, contingency tables represent an easy way to illustrate the relations in the data. Depending on the character of the data, we used applicable tests of independence. According to Řezanková [41], in the case of a contingency table, we use Pearson’s chi-square test; for further details, see also [42,43,44,45]. The intensity of the dependencies in the contingency tables was measured by the contingency coefficient and Cramer’s V. A dependency was considered statistically significant if the p-value was less than 0.001.
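The measures named above can be sketched in a few lines of plain Python. The function below computes Pearson's chi-square statistic, the degrees of freedom, the contingency coefficient and Cramer's V using their standard textbook definitions; the function name `association_measures` and the toy counts are hypothetical and are not taken from the study's data or software.

```python
from math import sqrt


def association_measures(table):
    """Return (chi2, df, contingency coefficient C, Cramer's V) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Pearson's chi-square: squared deviations from the counts expected
    # under independence, scaled by those expected counts.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    k = min(len(row_totals), len(col_totals))
    c = sqrt(chi2 / (chi2 + n))      # contingency coefficient
    v = sqrt(chi2 / (n * (k - 1)))   # Cramer's V
    return chi2, df, c, v


# Hypothetical counts (rows: two aircraft types, columns: punctual / delayed).
toy_table = [[30, 10], [20, 40]]
chi2, df, c, v = association_measures(toy_table)
print(round(chi2, 3), df, round(c, 3), round(v, 3))
```

Both coefficients are zero under exact independence and grow with the strength of the association, which is why values well below 0.2 are read as weak dependence even when the test itself is significant in a large sample.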
Correspondence analysis (CA) is a multivariate statistical technique. It is conceptually similar to principal component analysis but applies to categorical rather than continuous data. In a manner similar to principal component analysis, it provides a means of displaying or summarizing a set of data in two-dimensional graphical form. For details, see Blasius and Greenacre [46].
All data should be nonnegative and on the same scale for CA to be applicable, and the method treats rows and columns equivalently. It is traditionally applied to contingency tables: CA decomposes the chi-squared statistic associated with the table into orthogonal factors. The distance between single points is defined as a chi-squared distance. The distance between the $i$-th and $i'$-th rows is given by the formula

$$d(i, i') = \sqrt{\sum_{j} \frac{(r_{ij} - r_{i'j})^2}{c_j}},$$

where $r_{ij}$ are the elements of the row profiles matrix $R$ and the weights $c_j$ correspond to the elements of the column loadings vector $c^T$, which is equal to the mean column profile (centroid) of column profiles in multidimensional space. The distance between columns $j$ and $j'$ is defined similarly; the weights correspond to the elements of the row loadings vector $r$, and the sum runs over all rows.
The total variance of the data matrix is measured by inertia [47], which resembles a chi-square statistic but is calculated based on relative observed and expected frequencies.
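The chi-squared distance between row profiles and the total inertia can be illustrated with a small sketch. It uses the standard definitions (row profiles are rows divided by their totals, column masses serve as weights, and inertia is the mass-weighted sum of squared distances of the row profiles from their centroid); the function names and the toy counts are hypothetical.

```python
from math import sqrt


def profiles_and_masses(table):
    """Return row profiles, row masses and column masses of a table of counts."""
    n = sum(sum(row) for row in table)
    row_profiles = [[x / sum(row) for x in row] for row in table]
    row_masses = [sum(row) / n for row in table]
    col_masses = [sum(col) / n for col in zip(*table)]
    return row_profiles, row_masses, col_masses


def chi2_distance(profile_a, profile_b, col_masses):
    """Chi-squared distance between two row profiles, weighted by 1 / column mass."""
    return sqrt(sum((a - b) ** 2 / c
                    for a, b, c in zip(profile_a, profile_b, col_masses)))


def total_inertia(table):
    """Mass-weighted sum of squared chi-squared distances to the centroid."""
    row_profiles, row_masses, col_masses = profiles_and_masses(table)
    # The centroid of the row profiles has the column masses as coordinates.
    return sum(m * chi2_distance(p, col_masses, col_masses) ** 2
               for m, p in zip(row_masses, row_profiles))


# Hypothetical counts (rows: two aircraft types, columns: punctual / delayed).
toy_table = [[30, 10], [20, 40]]
row_profiles, row_masses, col_masses = profiles_and_masses(toy_table)
distance = chi2_distance(row_profiles[0], row_profiles[1], col_masses)
inertia = total_inertia(toy_table)
print(round(distance, 4), round(inertia, 4))
```

For this toy table the inertia equals the chi-square statistic divided by the number of observations, which is the sense in which inertia "resembles" the chi-square statistic.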
In correspondence analysis, we observe the relation between individual categories of two categorical variables. The result of this analysis is a correspondence map introducing the axes of the reduced coordinates system, where individual categories of both variables are presented in graphical form. Graphic tools of this method allow us to describe the association of nominal or ordinal variables and to obtain a graphic representation of a relationship in multidimensional space. The aim of this analysis is to reduce the multidimensional space of row and column profiles while preserving as much of the original information as possible [48]. Each row and column of a correspondence table can be displayed in c-dimensional (or r-dimensional, respectively) space with coordinates equal to the values of the corresponding profiles. The row and column coordinates on each axis are scaled to have inertia equal to the principal inertia along that axis: these are the principal row and column coordinates. Unistat and Statistica software were used for primary data processing.
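For readers who want to see how principal coordinates of this kind are typically obtained, the following is the standard textbook formulation via the singular value decomposition. It is given as a sketch only, and it is an assumption that the software used here follows the same scheme. The symbols are introduced purely for this illustration: $N$ is the contingency table of counts, $n$ its grand total, $r$ and $c$ the vectors of row and column masses, $D_r$ and $D_c$ the diagonal matrices built from them, and $\sigma_k$ the singular values on the diagonal of $\Sigma$.

```latex
\begin{aligned}
P &= \tfrac{1}{n} N, \qquad r = P\mathbf{1}, \qquad c = P^{T}\mathbf{1}, \\
S &= D_r^{-1/2}\,\bigl(P - r c^{T}\bigr)\,D_c^{-1/2} = U \Sigma V^{T}, \\
F &= D_r^{-1/2}\, U \Sigma, \qquad G = D_c^{-1/2}\, V \Sigma, \\
\text{total inertia} &= \sum_{k} \sigma_k^{2} = \frac{\chi^{2}}{n}.
\end{aligned}
```

Here the rows of $F$ and $G$ hold the principal row and column coordinates, and $\sigma_k^{2}$ is the principal inertia along the $k$-th axis, so a two-dimensional correspondence map plots the first two columns of $F$ and $G$.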
3. Results
Table 3 clearly shows that the total number of flights operated by the selected airline has increased over the years under review, most notably in the last two years. The table indicates that the proportion of delayed flights during the considered period is neither increasing nor decreasing; the value is oscillating around 50%.
As illustrated in Table 4, with increasing delay, the effect of the aircraft capacity becomes more important. The fleet of the observed airline consists of Airbus A320 (320), Airbus A319 (319), Boeing 737–500 (500), Boeing 737–700 (700), and Boeing 737–800 (800). Larger aircraft (A320, B737–800) are associated with slightly longer delays. As for shorter delays, the differences between the various types of aircraft are not significant. The intensity of the dependence is low (Chi-square = 588.22; df = 16; p-value < 0.001; contingency coefficient = 0.103; Cramer’s V = 0.05).
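The reported coefficients can be cross-checked against each other with a short calculation. The sketch below inverts the definition of the contingency coefficient to obtain the sample size it implies and then recomputes Cramer's V. It assumes a 5 × 5 table (consistent with df = 16), so that the smaller table dimension is 5; the function names are hypothetical, and because the published coefficient is rounded, the implied sample size is only approximate.

```python
from math import sqrt


def implied_sample_size(chi2, c):
    """Sample size implied by C = sqrt(chi2 / (chi2 + n))."""
    return chi2 * (1 - c ** 2) / c ** 2


def cramers_v(chi2, n, k):
    """Cramer's V for a table whose smaller dimension is k."""
    return sqrt(chi2 / (n * (k - 1)))


# Values reported for the aircraft type vs. delay length table.
chi2_reported, c_reported = 588.22, 0.103
n_implied = implied_sample_size(chi2_reported, c_reported)
v_recomputed = cramers_v(chi2_reported, n_implied, k=5)  # k=5 is an assumption
print(round(n_implied), round(v_recomputed, 2))
```

Under these assumptions the implied sample size is in the tens of thousands of flights, in line with the description of the data set, and the recomputed Cramer's V rounds to the reported 0.05.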
The dependence is also evident from the correspondence map in Figure 1. Large Airbus A320 aircraft often face long delays of more than one hour; similarly, Boeing 737–800 aircraft encounter delays of more than half an hour. On the contrary, smaller Boeing 737–500 and 737–700 aircraft often fly with minimal delay or no delay at all.
Table 5 shows that the higher the load factor of the aircraft, the less likely it is that the aircraft will be punctual. When it comes to longer delays, the effect of the load factor on the delay occurrence decreases. The intensity of the dependence is low (Chi-square = 191.48; df = 16; p-value < 0.001; contingency coefficient = 0.06; Cramer’s V = 0.03).
It is obvious from the correspondence map in Figure 2 that, with longer delays, there is no significant effect of the load factor on the delays. All values representing the load factor are very distant from the values of these delays in the graph. On the contrary, short delays of up to 15 min occur frequently on busier flights.
The proportion of delayed flights is approx. 50%, and it does not change very much during the period under review. In most cases, the proportion of delayed flights is slightly lower for charter flights than for regular ones; see Table 6.
Table 7 shows that at any time of day, the proportion of delayed flights did not change markedly during the years under review. The distribution of delayed flights across different parts of the day remains essentially unchanged. The intensity of the dependence is weak (Chi-square = 74.14; df = 18; p-value < 0.001; contingency coefficient = 0.037; Cramer’s V = 0.02).
Small differences are visible in the correspondence map in Figure 3: the proportion of delayed flights peaked in the night hours in 2017, in the evening hours in 2015, in the afternoon in 2019, and in the morning in 2016.
Table 8 shows that the share of delays attributed to the AIC, ARH, TAE, FOC, and AGA reasons in the total number of delayed flights is slightly increasing. The R reason occurs more and more frequently, except for the last year surveyed. The share of the ATFMR delay reason is rather decreasing, with the exception of the last year. The intensity of the dependence is relatively low (Chi-square = 1799.14; df = 48; p-value < 0.001; contingency coefficient = 0.178; Cramer’s V = 0.074).
The correspondence map in Figure 4 indicates that the ATFMR reason was more frequent in the earlier years, while the AIC, ARH, TAE, and AGA reasons have been more common in recent years.
Table 9 demonstrates that the proportions of delayed flights of different durations do not show a marked development in 2013–2019. The intensity of the dependence is relatively low (Chi-square = 930.51; df = 24; p-value < 0.001; contingency coefficient = 0.129; Cramer’s V = 0.065).
The correspondence map in Figure 5 illustrates that delays of around two hours are more likely to occur in the middle of the reference period, i.e., in 2015 and 2016, while the frequency of these delays decreases towards the end of the reference period.
Multidimensional correspondence analysis was employed to search for dependencies between categories of more than two variables. The correspondence map in Figure 6 includes the following variables: time of day, length of the delay, and delay causes. It shows that short delays (under 15 min) occur mainly in the nighttime, with the most frequent causes being MISC and AGA. Longer delays (under 30 min) occur more often in the afternoon, with frequent causes being AIC, ATFMR, ARH, PB, and FOC. Even longer delays are often caused by TAE and R and happen more frequently in the afternoon and in the evening.
The multidimensional correspondence map (Figure 7) illustrates the following categories of variables: type of flight, aircraft, and load factor. The highest load factor has been recorded for the A320, B737-800, and A319 aircraft since these planes are often used for charter flights. The aircraft used less often for charter flights tend to show a lower load factor. Charter flights are usually more heavily loaded than regular flights. All of the above was confirmed by the respective two-dimensional analyses.
Overall, the results showed that approximately 50% of flights were delayed during the study period. Furthermore, the length of delays was analyzed in relation to various parameters. It has been shown that longer delays are more frequent for larger and busier aircraft. Delays are less frequent for charter flights than for scheduled flights. The proportion of delayed flights in the different parts of the day did not change markedly during the years under review. The paper focused on the evolution of the causes of flight delays. The following causes slightly increased during the period under study: operational reasons of the airline; delays caused by suppliers (handling, fuelling, catering); delays caused by technical maintenance or aircraft defects; delays caused by operational requirements and crew duty norms; and delays caused by airport restrictions. With the exception of the last year under review, delays due to delays of previous flights steadily increased, while the incidence of delays due to air traffic control rather decreased. A more detailed analysis of the interaction of several factors revealed that short delays (up to 15 min) occur mainly during nighttime, with specific delays and delays caused by airport restrictions being the most common causes. Longer delays (up to 30 min) occur more frequently in the afternoon, with common causes being operational reasons of the airline, delays caused by air traffic control, delays caused by suppliers (handling, fuelling, catering), delays caused by passenger and baggage handling, and delays caused by operational requirements and crew duty norms. Even longer delays are often caused by technical maintenance or aircraft defects and by the delay of the previous flight, and they occur more frequently in the afternoon and evening.
4. Discussion
In line with the quoted authors, our research has likewise implied that the propagation of delays to subsequent flights is the main cause of delayed flights. Xing et al. [11] describe the optimization of consecutive flights to avoid chained delays. An analysis of the propagation of delays to subsequent flights is provided by Campanelli et al. [21], who focus on airline systems behaving in a nonlinear way that is difficult to predict. For the purposes of further research, it would be advisable to focus on optimization models for air traffic control and traffic on flight paths, for the assignment of aircraft to gates, and for the appropriate use of flight slots. Some authors, for instance [11,21,22], have already addressed this issue. A comparison of the different types of models introduced in their articles could therefore be of great interest for further research. As suggested by Zámková and Prokop [35], delays entailed by the delay of previous flights generally occur very often. Zámková and Prokop [37] moreover concluded that delays caused by the delays of previous flights are the most frequent case at airports close to popular holiday destinations. It has now been found that even at European airports, propagated delays are becoming, with the exception of the last year, more and more frequent.
The present analysis reached the same conclusion as the articles by Zámková and Prokop [35]: the delays caused by supply companies are usually minimal. Both articles imply that there is a clear effort to achieve smooth operation. In addition, it has now been shown that, out of the possible causes of delays at European airports, only air traffic control shows a positive trend over the years in question. Ivanov [49] offers yet another perspective on the influence of air traffic control on the emergence of delays. It was furthermore concluded that air traffic control personnel in flight planning often fail to take into account the situation where one delay propagates to the next flight. They primarily try to minimize the very short delay caused by themselves, while the chaining of delays may result in a delay possibly even ten times greater. The authors recommend controlling the distribution of air traffic flow management delay so as to minimize the delay propagated to subsequent flights and also to increase flights’ adherence to airport slots at coordinated airports. The most important thing, however, is to respect the flight schedule and to track the origin of delayed arriving flights. Flight optimization options for air traffic control are covered in the works of Wu et al. [16] and Belkoura et al. [17]. The influence of air traffic control on air traffic flows is dealt with in the article by Weosonga [50].
The article by Zámková and Prokop [35] showed that staffing is not a major problem when it comes to delays, but the current research suggests that crew standards are becoming a more frequent cause of delays at European airports, and therefore a reinforcement of aviation personnel should be considered. Optimization of delays emerging due to aircraft and crew scheduling is addressed by AhmadBeygi et al. [15]. Airline workforce deployment is examined by Chung et al. [51]. They propose to reduce flight delays with the use of crew pairing, i.e., the use of a reserve crew. A lack of flight crew members may disrupt, delay, or completely cancel a flight. This often happens because the crew gets stuck on a delayed previous flight. The authors solve the problem using neural networks based on a large amount of historical arrival and departure data. Their analysis shows the possible need for one or more reserve crews. With a clear definition of the need for and number of these crews, the company can save costs significantly and create stable flight schedules. Our analysis has shown that delays associated with passengers and their luggage are not a common problem at European airports. The articles by Huang et al. [52] and Abdelghany et al. [53] deal with the question of how to solve possible problems in this area. They suggest optimization models for luggage handling. Problems with lost luggage are discussed by Alsyouf et al. [54]. Similarly, the study by Hsu et al. [4] comes up with various strategies to reduce holdups, but again from a slightly different viewpoint. It recommends, for example, increasing the number of service counters for check-in. While the article by Skorupski and Wierzbinska [6] deals with the difficulties caused by waiting for late passengers, our analysis has shown that latecomers are not a major problem at European airports.
According to Forbes, Lederman, and Tombe [38], it would be advisable for airlines to release information on flights delayed by more than 15 min. In our opinion, it would be highly appropriate for all airlines to provide this information publicly; it would be beneficial for both the passengers and the airlines themselves. To increase their competitiveness, airlines could improve internal corporate practices on the basis of these data. According to a well-informed employee, and in line with the suggestion made by Pan et al. [13], the monitored airline tries to optimize its operating costs as well.
A limitation of the research may be that the conclusions are based on data from one airline in the European area, so it is not possible to generalize to other locations (such as America and Africa). However, it is possible to generalize the results to the European area, which addresses the issue of delays in a comprehensive way, taking into account the need for compensation for delayed flights.
5. Conclusions
The authors of the article processed data from the database of an airline operating in the European region, and they are the first to analyze the evolution of the causes and duration of flight delays over the period from 2013 to 2019. The research has shown that the selected airline was flourishing over the years under review, as evidenced, inter alia, by the increasing number of operated flights. The most significant increase in the number of flights was recorded in the last two years. Another positive result of the research is the fact that the total share of delayed flights of the selected company did not increase over the years. This fact can be considered a success, considering the increasing occupancy of European airports. The data analysis has revealed other factors influencing delays, which complement the data on the causes of delays listed in the airline database and classified according to the IATA methodology. Other identified factors include, for example, aircraft capacity, aircraft utilization, flight type, flight distribution in different parts of the day, and the year under review.
The findings of this research suggest that higher-capacity aircraft are more prone to longer delays. The dependence of the delay time on the airplane’s load factor was not very strong: only for short delays of up to 15 min can it be stated that delays were more frequent on busier flights, while less busy flights were more often completed without any delay. The share of delayed charter flights during the period under review is several percentage points lower than that of regular flights. This is a positive outcome from the company’s perspective, as charter flights make up most of its portfolio. The proportion of delayed flights according to the time of day proved to be almost constant.
Drawing on the data on the delay causes in the airline’s database, it has become clear from the analysis that delays entailed by the delays of previous flights were becoming more and more frequent, with the exception of the last year. The company strives to achieve maximum utilization of its fleet. According to airline employees, airplanes spend more and more time in service and are scheduled for more flights during the day, which is why propagated delays may occur more frequently. The same growing trend was observed with respect to the airline’s operational reasons, aircraft clearance by supplier companies, technical maintenance and aircraft defects, operating procedures and crew flight standards, and airport restrictions. The airline’s operational reasons include, for example, the transportation of people with disabilities, which now occurs more and more often. Delays associated with technical maintenance might be more frequent due to stricter regulations that require more frequent checks and due to new procedures being introduced. Crew standards are becoming a more frequent cause of delays, and a possible reinforcement of human resources should therefore be considered. Restrictions at airports turned out to be a frequent reason for delays; examples include flight crew strikes and the ever-stricter hygiene standards. Only air traffic control demonstrated a positive trend in the number of delays triggered during the reference period. This illustrates the successful efforts of employees to achieve smooth operations at airports and in the airspace. The share of flights with different durations of delays has not changed significantly over the years under review.
Due to the fact that on average more than half of the flights go on as scheduled and that the share of flights with long delays was very low, we may conclude on a positive note, since the situation has not deteriorated in the years under review.
All tested dependencies proved to be statistically significant (p-value less than 0.001). The results of this research have been discussed with an expert working for the airline.