1. Introduction
For several decades, weather has significantly reduced airspace efficiency and capacity around the world. It ranks first among the causes of flight irregularities, accounting for around 50% of them [1]. To manage air traffic during poor weather conditions or airspace congestion, traffic flow managers may implement various air traffic flow management (ATFM) measures, including ground delay programs, ground stops, reroute advisories, and mileage restrictions [2]. Among these measures, ground delay programs are often used to regulate air traffic flow, resulting in significant flight delays. In 2011, the United States alone had 1065 ground delay programs, causing 519,940 flights to be delayed by a total of 26.8 million min, or an average delay of 52 min per flight [3].
These delays cause considerable financial losses for the aviation sector. By predicting ground delay programs, the relevant authorities can take preemptive measures, such as adjusting flight plans and refining resource allocation. Such measures can reduce the frequency of flight delays and fuel consumption, improve operational efficiency and service quality, lessen the economic and environmental impact of flights, and improve the efficiency and safety of air traffic management.
Current research on ground delay programs (GDPs) can be broadly categorized into two approaches. The first category assumes that a GDP is necessary and uses simulation methods or mathematical modeling to find more efficient and cost-effective strategies for optimizing flight times [4,5,6]; this approach aims to minimize the impact on airport and airline operations. The second category employs machine learning techniques to predict ground delay programs [7,8,9]. Mathematical GDP models aim to reduce flight delays by determining which flights must wait and for how long [10]. Ball et al. [4] investigated two-layer-network-structured integer programming to address the ground-holding problem at a single airport. Mukherjee et al. [5] presented a dynamic stochastic linear programming model that uses weather forecasts at various decision points to revise ground delays, thus solving single-airport ground-holding problems. Later, Mukherjee et al. [11] introduced probabilistic airport capacity forecasts to determine flight departure delays. This approach optimizes the number of scheduled arriving flights in stages using a static stochastic ground-holding model, which is more straightforward to construct than earlier stochastic dynamic optimization models, and it also offers a new perspective on ground delay programs. Yan et al. [12] established a comprehensive platform to model flight operations during a GDP and proposed flight route recovery schemes. Jacquillat [13] developed a large-scale integer optimization model using a passenger-centric GDP (GDP-PAX) optimization strategy, which significantly reduced passenger delays while only slightly increasing aircraft delay costs.
The application of machine learning in the aviation industry has gained significant momentum in recent years [14,15]. With the availability of historical data, researchers have been able to predict the occurrence of GDPs and identify their causes. Grabbe et al. [2,16] applied improved clustering algorithms to three years of GDP records to identify the primary factors leading to GDPs. The results showed that clustering techniques have great potential for determining the causes of GDPs. Liu et al. [17] proposed a semi-supervised learning algorithm that evaluates the similarity of weather forecasts for strategic air traffic management and determines whether a GDP should be issued by searching for similar days. Smith et al. [7] used terminal aerodrome forecasts (TAF) as input variables to a support vector machine (SVM) algorithm to predict aircraft arrival rates. They then used the airport acceptance rate (AAR) to determine the planned rate, duration, and passenger delays for GDPs. Liu et al. [17] also compared the performance of logistic regression and the Random Forest algorithm in predicting GDP incidence using weather and arrival demand variables that were generated by SVM. Wang et al. [9] studied the impact of dynamic airport ground weather on GDPs and used T-WITI-FA (terminal weather-impacted traffic index forecast accuracy) and air traffic data to model GDP prediction. Chen et al. [8] employed multi-agent reinforcement learning (MARL) to simulate the application of GDPs to resolve the demand and capacity balancing problem in high-density situations in the pre-tactical stage. The MARL approach, using a double Q-learning network (DQN), has the potential to significantly decrease the number of flights delayed and the average delay duration.
The abovementioned GDP models focus on defining objective functions and selecting decision variables; in addition, they rely on many predetermined assumptions. As a result, they can be difficult to solve and have limited practical applicability. The quantitative analysis of weather data is also critical, and many researchers utilize various forms of the weather-impacted traffic index (WITI) [18]. However, due to their complex nature and diverse forms, many types of WITI are not well suited for fast-time modeling and forecasting. Previous studies [9,18] have attempted to predict GDP duration in a given hour, yielding promising results. Meanwhile, Bloem [13] found that machine learning models had difficulty predicting the initiation and cancellation of GDPs. As such, predicting the departure delay time during a GDP remains a relevant challenge.
Collectively, the results described in this paper show that the proposed methodology offers significant contributions to GDP decision makers. (a) We employed the ATMAP method to evaluate the airport weather and to quantify the meteorological aerodrome report (METAR) messages. The ATMAP score, calculated using a simple equation, was easier to determine than other metrics, such as the WITI. We then created a dataset for predictive modeling from actual flight operating data, meteorological data, and ATMAP scores. (b) We comprehensively compared the predictive abilities of three classification models, namely SVM, random forests, and XGBoost, using Bayesian parameter tuning to enhance their predictive accuracy. We also investigated the correlation between actual GDP runs and feature importance. (c) We established a departure flight delay prediction model based on known GDP duration and assessed the model’s ability to predict outcomes with and without ATMAP scores; in addition, the reliability of the ATMAP scores was assessed.
The rest of the paper is organized as follows:
Section 2 briefly describes GDPs and states the research objective.
Section 3 explains the process of gathering and processing data, as well as the machine learning classification model and the regression model used to forecast GDP occurrences and delay time, respectively.
Section 4 describes the experimental findings, assesses the performance of the machine learning models, and examines the influence of various features.
Finally, Section 5 summarizes the conclusions and suggests areas for future study.
5. Conclusions
This paper presents a novel approach for predicting GDP incidence using machine learning techniques and evaluates the performance of three models, namely SVM, Random Forest, and XGBoost. Although the models achieved similar results in terms of AUC, XGBoost outperformed the others in F1 score, accuracy, and Kappa. (The test set accuracy was 0.902 when using the XGBoost model.) Regarding feature importance, the highest contributing factor was the ceiling, which was consistent with the analysis of local weather conducted by the ATMAP algorithm.
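For readers less familiar with the classification metrics mentioned here, accuracy, F1 score, and Cohen's Kappa can all be derived from a binary confusion matrix. The following is a minimal illustrative sketch and not the code used in this study; the function name `binary_metrics` and the toy labels are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, F1 score, and Cohen's kappa for binary labels.

    Labels are assumed to be 1 (GDP issued) and 0 (no GDP); this encoding
    is an illustrative assumption, not taken from the study.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    # Accuracy: share of correctly classified samples.
    accuracy = (tp + tn) / n

    # F1: harmonic mean of precision and recall, written as 2TP / (2TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) / n) * ((tp + fn) / n) + ((tn + fn) / n) * ((tn + fp) / n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0

    return {"accuracy": accuracy, "f1": f1, "kappa": kappa}


# Toy example with made-up labels (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.8, F1 0.75, kappa about 0.583
```

Kappa discounts the agreement that would be expected by chance, so on imbalanced data, where GDP hours are much rarer than non-GDP hours, it can separate models whose raw accuracy looks similar.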
We then investigated the forecasting of departure delays during a GDP using both linear and non-linear regression models. The results showed that the decision tree model outperformed the ridge and LASSO models, with a minimum MAE of 10.9 to 12 min. Incorporating the ATMAP score into the models improved accuracy, especially for the decision tree model, which improved by 8%. This could be attributed to the ATMAP score’s ability to reflect non-linear weather variance and capture the ground-holding delays caused by weather events, resulting in a more precise prediction of flight delays. The results also show that the ATMAP scores introduced in this study are important for the accuracy of flight departure delay prediction during GDPs.
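As a reminder of how such a comparison can be quantified, the sketch below computes the mean absolute error (MAE) of two sets of delay predictions and the relative MAE reduction between them. It is a minimal illustration under stated assumptions: the helper names and all numbers are hypothetical, and the relative-reduction formula is only one plausible way to express an improvement, not necessarily the measure behind the figure reported here.

```python
def mean_absolute_error(actual, predicted):
    """Mean absolute error between actual and predicted delays (in minutes)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and equal in length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_mae_reduction(mae_baseline, mae_with_feature):
    """Fraction by which the MAE drops when an extra feature is added."""
    if mae_baseline <= 0:
        raise ValueError("baseline MAE must be positive")
    return (mae_baseline - mae_with_feature) / mae_baseline


# Hypothetical departure delays in minutes (illustrative values only).
actual_delay = [30, 45, 60, 20]
pred_without_score = [40, 35, 75, 25]  # model trained without the weather score
pred_with_score = [36, 39, 69, 23]     # model trained with the weather score

mae_without = mean_absolute_error(actual_delay, pred_without_score)  # 10.0
mae_with = mean_absolute_error(actual_delay, pred_with_score)        # 6.0
reduction = relative_mae_reduction(mae_without, mae_with)            # 0.4
print(f"MAE without: {mae_without}, with: {mae_with}, reduction: {reduction:.0%}")
```

Reporting the change relative to the baseline MAE makes improvements comparable across models whose absolute errors differ.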
In conclusion, the GDP prediction model outlined in this paper sheds light on the interplay between weather conditions and air traffic flow with regard to GDP occurrences. These machine learning models offer a valuable tool for controllers to make informed decisions on GDP activation and help to anticipate the extent of its impact on flight delays, thus improving operational efficiency and reducing the environmental impact. The combination of these models has the potential to predict the entire GDP process accurately. This study sets the stage for further research to expand our understanding of GDP formation and its effects. It is recommended that future studies explore the collaborative decision-making paradigm and examine the ripple effect of GDP among airport clusters, taking into account its contributing factors.