Article

Short-Term Air Traffic Flow Prediction Based on CEEMD-LSTM of Bayesian Optimization and Differential Processing

College of Air Traffic Management, Civil Aviation Flight University of China, Guanghan 618307, China
*
Author to whom correspondence should be addressed.
Electronics 2024, 13(10), 1896; https://doi.org/10.3390/electronics13101896
Submission received: 8 April 2024 / Revised: 7 May 2024 / Accepted: 10 May 2024 / Published: 12 May 2024

Abstract
With the rapid development of China's civil aviation, air traffic flow in terminal areas is also increasing. Short-term air traffic flow prediction is of great significance for the accurate implementation of air traffic flow management. To enhance the accuracy of short-term air traffic flow prediction, this paper proposes a short-term air traffic flow prediction model based on complementary ensemble empirical mode decomposition (CEEMD) and long short-term memory (LSTM), combined with the Bayesian optimization algorithm and data differential processing. Initially, the model performs CEEMD on the short-term air traffic flow series. Subsequently, to improve prediction accuracy, data differencing is employed to stabilize the time series. Finally, the smoothed sequences are separately input into an LSTM network whose hyperparameters are optimized by the Bayesian optimization algorithm. After data reconstruction, the final short-term flow prediction result is obtained. The proposed model is verified using data from Shanghai Pudong International Airport. The results show that the evaluation indexes of prediction accuracy and fitting degree, RMSE (Root Mean Square Error), MAE (Mean Absolute Error), and R2 (Coefficient of Determination), are 0.336, 0.239, and 97.535%, respectively. Compared to other classical time-series prediction models, the prediction accuracy is greatly improved, which can provide a useful reference for short-term air traffic flow prediction.

1. Introduction

In recent years, China's civil aviation has developed rapidly, and airspace traffic in the approach control area has been increasing. To facilitate supervision and management by air traffic controllers, the current airspace route system mainly consists of predetermined routes [1]. However, the significant increase in demand for air traffic operations caused by a surge in airspace traffic can disrupt the capacity balance within the air traffic control system [2], leading to uncertainty in aircraft trajectories within the airspace, potential traffic conflicts, and ultimately an overload of air traffic control personnel due to increased workload complexity [3]. Short-term air traffic flow prediction can enable air traffic control departments to implement accurate control over air traffic flow and help air traffic controllers anticipate risks and potential hazards in the next few minutes. This allows staff to adjust their plans promptly, optimize the management of the airspace route system, and provide crucial decision support for the aviation transport system [4].
At present, both domestically and internationally, there are three main methods for short-term air traffic flow prediction: linear prediction, nonlinear prediction, and model prediction based on data mining and computer algorithms [5]. Among them, linear prediction is relatively simple and suitable for long-term macroscopic forecasting; however, for short-term traffic flow with non-stationary and nonlinear characteristics, its predictive performance is not ideal [6]. In terms of nonlinear prediction, Packard et al. [7] applied the theory of phase space reconstruction to the study of chaotic time series. Polson et al. [8] realized traffic flow prediction for special-period events by establishing a framework of nonlinear spatiotemporal flow effects. Based on the non-iterative ELM algorithm, Zhang et al. [9] constructed an air traffic prediction system for the periods before and after the epidemic to predict airspace traffic. To reduce the uncertainty of continuous descent trajectories in high-density airspace, Zhang et al. [10] established a dynamic trajectory prediction model based on the UKF using real-time ADS-B data and realized the prediction of dynamic flight trajectories. However, short-term traffic flow prediction methods based on nonlinear features cannot accurately describe the random characteristics of traffic flow [11], the accuracy and stability of such models cannot be guaranteed [12], and some nonlinear filtering algorithms may even reduce model accuracy or cause the algorithm to diverge [13]. In the research on combination models based on data mining and computer algorithms, Razali et al. [14] summarized the use of convolutional neural networks and combination techniques to enhance the accuracy of traffic flow prediction.
Gui [15] mined ADS-B data, extracted useful information, mapped it to routes, and established an air traffic flow platform based on aviation big data to predict and count air traffic flow between different cities. Li et al. [16] established a segment traffic flow prediction model using a neural network, considering segment correlation and parameter fusion, and predicted segment traffic flow per unit time period. Combination models based on data mining and computer algorithms must mine the complex relationships among the data for self-learning; however, this requires a large amount of data [17], and the prediction effect is not ideal when the amount of data is small.
According to the non-stationary and nonlinear data characteristics of short-term air traffic flow series, this paper proposes a new short-term air traffic flow prediction model, based on complementary ensemble empirical mode decomposition (CEEMD) and long short-term memory (LSTM) of the Bayesian optimization algorithm and data differential processing. By smoothing the noise of the short-term air traffic flow series, it can more accurately capture the dynamic changes in traffic flow and improve prediction accuracy.
In light of the information presented, the contributions of this paper are as follows:
1.
The flight flow sequence is decomposed by CEEMD to effectively reduce the impact of noise in prediction.
2.
The data difference processing method is used to improve the stability of the flight flow sequence and ensure prediction accuracy.
3.
The Bayesian optimization algorithm is used to find the optimal hyperparameters of the neural network, which saves computing resources and enhances the prediction ability of the model.
4.
The proposed model, based on CEEMD-LSTM with Bayesian optimization and differential processing, is verified on an actual flight flow series and shown to be superior to other classical models on typical indicators of prediction accuracy and fitting degree.

2. Materials and Methods

2.1. Complementary Ensemble Empirical Mode Decomposition

Complementary ensemble empirical mode decomposition (CEEMD) is a decomposition method built on empirical mode decomposition (EMD) and ensemble empirical mode decomposition (EEMD). Unlike decomposition methods based on priors, EMD, the core of the Hilbert–Huang Transform (HHT), is an adaptive decomposition method that does not rely on any predefined basis functions [18]. It divides a complex signal solely into oscillatory components (Intrinsic Mode Functions, IMFs) spanning low to high frequencies and a smooth monotonic residual. The decomposition formula of EMD is shown in Equation (1):
y(t) = \sum_{i=1}^{n} k_i(t) + r(t),
where y(t) is the original signal, ki(t) is the ith IMF obtained after EMD, n is the number of IMFs obtained after decomposition, and r(t) is the residual component.
When the original signal is complex and changes sharply, the IMF components obtained through EMD may suffer from modal aliasing and end effects, leading to inaccurate decomposition results. To suppress modal aliasing, EEMD superimposes Gaussian white noise on the signal and performs empirical mode decomposition multiple times. Exploiting the uniform frequency distribution of Gaussian white noise, a different white noise of the same amplitude is added in each trial to change the extreme-point characteristics of the signal, and the corresponding IMFs obtained from the repeated EMD runs are averaged to offset the added noise.
However, it is difficult to completely eliminate the residual white noise in EEMD by finite ensemble averaging alone, which produces large reconstruction errors and affects prediction accuracy. Building on EEMD, CEEMD eliminates the influence of residual noise by adding pairs of Gaussian white noise with opposite signs and equal amplitude to the original signal. The decomposition flow chart of the CEEMD algorithm is shown in Figure 1.
The specific decomposition process steps are as follows:
(1) Add mixed Gaussian white noise signals with positive and negative signs and equal amplitudes to the initial time series y(t):
y_i^+(t) = y(t) + e_i^+(t),
y_i^-(t) = y(t) + e_i^-(t),
where y_i^+(t) and y_i^-(t) are the signals after adding the mixed Gaussian white noise with positive and negative signs;
e_i^+(t) and e_i^-(t) are the mixed Gaussian white noise signals with positive and negative signs and equal amplitudes;
y(t) is the original signal.
(2) Decompose each noise-added signal using EMD to obtain:
y_i^+(t) = \sum_{j=1}^{n} k_{ij}^+(t) + r_i^+(t),
y_i^-(t) = \sum_{j=1}^{n} k_{ij}^-(t) + r_i^-(t),
where k_{ij}^+(t) is the jth IMF component after adding positive Gaussian noise for the ith time;
k_{ij}^-(t) is the jth IMF component after adding negative Gaussian noise for the ith time;
r_i^+(t) and r_i^-(t) are the residual components;
y(t) is the original signal.
(3) Repeat Steps (1) and (2) N times, adding different Gaussian white noise each time.
(4) Take the average of the obtained positive and negative IMF components to obtain the jth IMF component and residual component:
k_j(t) = \frac{1}{2N} \sum_{i=1}^{N} \left( k_{ij}^+(t) + k_{ij}^-(t) \right),
r(t) = y(t) - \sum_{j=1}^{n} k_j(t),
where k_j(t) represents the jth averaged IMF component, j = 1, 2, …, n;
r(t) represents the residual component.
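The complementary noise-averaging step above can be sketched in a few lines. This is a minimal illustration, not a full implementation: the `decompose` argument stands in for a complete EMD sifting routine, and the default identity decomposer is used only to show how the paired positive/negative noise cancels in the ensemble average.

```python
import random

def ceemd_average(y, n_trials=10, noise_amp=0.1, decompose=None):
    """Sketch of the CEEMD ensemble step (Steps 1-4 above).

    `decompose` is a placeholder for a full EMD routine returning a
    list of IMFs; the identity default illustrates how the paired
    +/- noise cancels exactly in the 1/(2N) average."""
    if decompose is None:
        decompose = lambda s: [s]  # stand-in for EMD -> list of IMFs
    n = len(y)
    acc = [0.0] * n
    for _ in range(n_trials):
        noise = [random.gauss(0.0, noise_amp) for _ in range(n)]
        y_plus = [a + e for a, e in zip(y, noise)]   # y_i^+(t), Eq. (2)
        y_minus = [a - e for a, e in zip(y, noise)]  # y_i^-(t), Eq. (3)
        for comp in (decompose(y_plus)[0], decompose(y_minus)[0]):
            acc = [a + c for a, c in zip(acc, comp)]
    # average over the 2N noise-added decompositions, Eq. (6)
    return [a / (2 * n_trials) for a in acc]
```

With the identity decomposer, the returned sequence equals the original signal up to floating-point rounding, since each positive-noise trial is cancelled by its negative-noise twin.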

2.2. Data Processing

2.2.1. Stationarity Test for Data

The data stationarity test is a statistical method used to determine whether time series data exhibits stationarity [19]. In the analysis and prediction of time series, stationarity is an important assumption. It refers to the statistical characteristics of the data that remain stable over time, that is, the mean, variance, and autocorrelation function do not change with time. In order to ensure the prediction accuracy, it is usually necessary to ensure that the data are stable [20].
The Augmented Dickey–Fuller (ADF) test is a unit root test used to examine the stationarity of time series data [21]. It is an improved version of the Dickey–Fuller (DF) test that adds autoregressive and lag terms to better handle time series with trends [22]. The null hypothesis of the ADF test is that the time series contains a unit root, indicating non-stationarity. The hypothesis is evaluated by comparing the test statistic (T-value) with the critical values at a given significance level (usually 0.01 or 0.05) [23]. If the p-value is less than the significance level and the T-value is smaller than the critical value, the null hypothesis is rejected, indicating that the data are stationary. Conversely, if the p-value is greater than the significance level or the T-value is larger than the critical value, the null hypothesis cannot be rejected, suggesting that the data are non-stationary.
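The decision rule described above can be captured in a small helper; the function name and argument layout here are illustrative. In practice the statistic, p-value, and critical values would come from an ADF implementation such as `statsmodels.tsa.stattools.adfuller`.

```python
def adf_is_stationary(t_stat, p_value, critical_value, alpha=0.05):
    """Decision rule described above: reject the unit-root null
    (i.e. call the series stationary) only when the p-value is below
    the significance level AND the test statistic lies below the
    critical value at that level."""
    return p_value < alpha and t_stat < critical_value
```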

2.2.2. Data Differential Processing

Data differential processing (DF) is a common data processing method, which is usually used to reduce data trends, stabilize sequence variance, remove noise, and extract features [24]. It works by calculating the differences between adjacent time points or adjacent spatial points to enhance the stationarity of the time series. The data differential processing is shown in Equation (8):
d_m = q_{m+1} - q_m,
where dm represents the mth differenced data, qm+1 is the (m + 1)th sampling point of the original sequence, and qm is the mth sampling point of the original sequence.
The short-term traffic sequence is decomposed by the CEEMD algorithm to obtain the original time series matrix Y_IMF shown in Equation (9); applying the data differential processing of Equation (8) to each row yields the detrended differenced sequence matrix Y_d, as shown in Equation (10):
Y_{IMF} = \begin{bmatrix} IMF_1 \\ IMF_2 \\ \vdots \\ IMF_n \end{bmatrix} = \begin{bmatrix} q_{11} & q_{12} & \cdots & q_{1m} \\ q_{21} & q_{22} & \cdots & q_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ q_{n1} & q_{n2} & \cdots & q_{nm} \end{bmatrix},
Y_d = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} = \begin{bmatrix} d_{11} & d_{12} & \cdots & d_{1(m-1)} \\ d_{21} & d_{22} & \cdots & d_{2(m-1)} \\ \vdots & \vdots & \ddots & \vdots \\ d_{n1} & d_{n2} & \cdots & d_{n(m-1)} \end{bmatrix}.
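Equation (8) and its inverse, which is needed later at the data-reconstruction stage, can be sketched as follows; the function names are illustrative:

```python
def difference(q):
    """First-order differencing, Equation (8): d_m = q_{m+1} - q_m."""
    return [q[m + 1] - q[m] for m in range(len(q) - 1)]

def inverse_difference(d, q0):
    """Rebuild the original series from the differenced one, given the
    first sample q0 (used when reconstructing predictions)."""
    out = [q0]
    for dm in d:
        out.append(out[-1] + dm)
    return out
```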

2.3. Long Short-Term Memory Network

The long short-term memory network (LSTM) is a special recurrent neural network that differs from the traditional recurrent neural network [25]. It has clear advantages in dealing with time series problems, so it is often used for time series prediction. Short-term traffic flow has complex nonlinear relationships, and LSTM introduces special gating mechanisms to extract long-term features of traffic flow data. It can efficiently handle long-term temporal dependencies and performs well in capturing spatiotemporal relationships [26]. Therefore, the LSTM neural network can be used to predict short-term air traffic flow. The LSTM neural network is composed of the basic components of the LSTM unit, including input gates, output gates, forget gates, and the cell state. The basic structure is shown in Figure 2, where I_t represents the input gate at the current time step, F_t is the forget gate, and O_t is the output gate; following [27], their expressions are shown in Formulas (11)–(13). H_{t−1} and H_t are the hidden states at the previous and current time steps, X_t is the current input, G_t is the candidate memory cell at the current time step, C_{t−1} and C_t represent the memory cells at the previous and current time steps, and Y_t is the output.
At each time step, the LSTM network takes the input X(t) = (X_1, X_2, …, X_n); the hidden layer output is H(t) = (H_1, H_2, …, H_n), the output layer is Y(t) = (Y_1, Y_2, …, Y_n), and the cell state is C(t) = (C_1, C_2, …, C_n). The gates are expressed as
I_t = \sigma(\omega_{x,i} x_t + \omega_{h,i} h_{t-1} + \omega_{c,i} c_{t-1} + b_i),
F_t = \sigma(\omega_{x,f} x_t + \omega_{h,f} h_{t-1} + \omega_{c,f} c_{t-1} + b_f),
O_t = \sigma(\omega_{x,o} x_t + \omega_{h,o} h_{t-1} + \omega_{c,o} c_{t-1} + b_o).
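A scalar sketch of the gate computations in Formulas (11)–(13), with illustrative (untrained) weights: each gate squashes a weighted sum of the current input, previous hidden state, and previous cell state into (0, 1).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_gates(x_t, h_prev, c_prev, w, b):
    """Scalar sketch of Formulas (11)-(13). `w` maps a gate name to
    its (w_x, w_h, w_c) weights (the w_c term is the peephole
    connection to the previous cell state); `b` maps a gate name to
    its bias. All weights here are illustrative, not trained values."""
    gates = {}
    for g in ("input", "forget", "output"):
        w_x, w_h, w_c = w[g]
        gates[g] = sigmoid(w_x * x_t + w_h * h_prev + w_c * c_prev + b[g])
    return gates
```

With zero inputs the weighted sum is zero and every gate opens halfway (sigmoid(0) = 0.5), which is a quick sanity check on the implementation.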

2.4. Bayesian Optimization Algorithm

The Bayesian optimization algorithm (BO) is widely used in the fields of machine learning, deep learning hyperparameter tuning, neural network architecture search, automated machine learning, engineering design optimization, and more. Due to its ability to find the optimal hyperparameter configuration in relatively few iterations, thus saving computational resources, it is often applied to problems that involve finding the best hyperparameter settings [28]. In this article, to optimize the hyperparameters of the prediction model, the BO is used for parameter optimization. In the case of an unknown function, the BO estimates the posterior distribution of the objective function based on known data and prior distributions and selects the next sample point based on this distribution. The basic flowchart of the BO is shown in Figure 3.
The specific steps are as follows:
1.
Randomly generate initial sample points within the optimization range of the model hyperparameters. Input these points into the Gaussian process and train the corresponding model. Evaluate and adjust the Gaussian process based on the loss values output by the model’s objective function, allowing the model to approximate the true function distribution;
2.
After evaluating and adjusting the Gaussian model, use the sampling function to select the next set of sample points to input into the model for training. Obtain new output values for the model’s objective function loss, thereby updating the Gaussian model and the sample set;
3.
If the loss value of the newly selected sample points meets the requirements, terminate the process and output the currently selected best parameter combination along with the corresponding loss value of the model’s objective function;
4.
If the loss value of the newly selected sample points does not meet the requirements, update the sample points in the sample set and return to Step 2. Continue evaluating and adjusting the Gaussian model until the requirements are met.
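Steps 1–4 above can be sketched as a minimal one-dimensional Bayesian optimization loop: a Gaussian-process surrogate with an RBF kernel and an expected-improvement acquisition evaluated on a candidate grid. The kernel length-scale, jitter, and grid are illustrative choices, not the settings used in this paper.

```python
import math
import numpy as np

def rbf(a, b, ls=0.2):
    """RBF kernel between two 1-D point sets (illustrative length-scale)."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def bayes_opt(f, n_init=3, n_iter=10, seed=0):
    """Minimal BO loop mirroring Steps 1-4: fit a GP to evaluated
    points, pick the next sample by expected improvement, update."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, n_init)            # Step 1: initial samples
    Y = np.array([f(x) for x in X])
    grid = np.linspace(0.0, 1.0, 201)            # candidate points
    for _ in range(n_iter):
        K = rbf(X, X) + 1e-6 * np.eye(len(X))    # GP fit (jitter for stability)
        Kinv = np.linalg.inv(K)
        y_mean = Y.mean()
        ks = rbf(grid, X)
        mu = ks @ Kinv @ (Y - y_mean) + y_mean   # posterior mean
        var = np.clip(1.0 - np.sum((ks @ Kinv) * ks, axis=1), 1e-12, None)
        sd = np.sqrt(var)                        # posterior std. dev.
        best = Y.min()
        z = (best - mu) / sd
        cdf = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
        pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
        ei = (best - mu) * cdf + sd * pdf        # Step 2: acquisition (EI)
        x_next = grid[np.argmax(ei)]
        X = np.append(X, x_next)                 # Step 4: update sample set
        Y = np.append(Y, f(x_next))
    i = int(np.argmin(Y))
    return X[i], Y[i]
```

In real hyperparameter tuning, f would train the LSTM with a candidate configuration and return the validation loss; libraries such as scikit-optimize package this same loop for multi-dimensional search spaces.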

2.5. CEEMD-DF-BO-LSTM Prediction Model

The non-stationarity and nonlinearity of short-term traffic flow data can degrade prediction accuracy [29]. When there are significant trends or seasonal variations in traffic flow data, models may fail to accurately capture these features. To stabilize the variance of the traffic flow time series and reduce the impact of trends, we decomposed the traffic flow using CEEMD and then applied differencing to each component series. Using first-order differencing, we calculated the differences between adjacent time points to obtain a more stable time series. Subsequently, we used the Bayesian optimization algorithm to determine the hyperparameters of the LSTM network for prediction.
To improve the prediction accuracy of the short-term traffic flow, this article proposes a predictive model of “data decomposition-model prediction-data recombination”, as shown in Figure 4. This predictive model mainly consists of three steps:
1.
Data pre-processing: After conducting the ADF test on the original traffic flow sequence, it is decomposed into multiple IMFs and a residual using CEEMD. The ADF test and differencing are applied to the components other than IMF1, resulting in (n − 1) differenced sequences. These differenced sequences are then subjected to the ADF test and data scaling for subsequent modeling. After scaling, the differenced traffic flow data in the short-term spatial-temporal domain lie in the range [−1, 1];
2.
LSTM modeling: After determining the hyperparameters of the predictive model using the Bayesian optimization algorithm, the scaled data are used as model input. The data are divided into two parts: 70% as the training set and 30% as the test set. The training set is used to train the model, while the test set is used to predict the differenced traffic flow sequences in the short-term spatial-temporal domain, yielding (n − 1) predicted differenced sequences;
3.
Data reconstruction: The predicted values of differenced sequences are inverse-scaled and inverse-differenced to obtain the predicted values of each decomposition component. The data of each component and IMF1 decomposed by CEEMD are cumulatively added up to generate the final prediction result.
Figure 4. The framework diagram of the CEEMD-DF-LSTM prediction model.

3. Experiment and Results

3.1. Data Description

To verify the feasibility of the model, this study collected flight data from Shanghai Pudong International Airport in December 2023 for experimentation. The number of flights is counted every 5 min; the distribution of flights in the airspace within one five-minute interval is shown in Figure 5. A total of 864 short-term traffic flow data points were obtained. The data were divided into a 70% training set and a 30% testing set: two days of flight data (576 points, No. 0–575) were used for training, and one day of flight data (288 points, No. 576–863) was used for testing. The flight data are shown in Figure 6.

3.2. Performance Evaluation

In order to measure the fit and accuracy of the prediction model to the actual air traffic flow and compare the performance of different prediction models, according to reference [30], the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Coefficient of Determination (R2), and Relative Error (RE) are commonly used in short-term traffic flow prediction, as shown in Equations (14)–(17):
RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2},
MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|,
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
RE = \frac{\hat{y}_i - y_i}{y_i},
where n represents the total number of predicted data points, y_i represents the true value of flight traffic, and \hat{y}_i represents the predicted value of flight traffic.
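Equations (14)–(17) can be implemented directly; a minimal sketch:

```python
def rmse(y, y_hat):
    """Root Mean Square Error, Equation (14)."""
    return (sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y)) ** 0.5

def mae(y, y_hat):
    """Mean Absolute Error, Equation (15)."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)

def r2(y, y_hat):
    """Coefficient of Determination, Equation (16)."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

def re(y_i, y_hat_i):
    """Relative Error at one point, Equation (17)."""
    return (y_hat_i - y_i) / y_i
```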

3.3. Data Pre-Processing

3.3.1. Data Decomposition

The above short-term traffic flow data were decomposed using the EMD, EEMD, and CEEMD methods (Figure 7, Figure 8 and Figure 9), and the reconstruction errors were calculated (Figure 10). It can be observed that after EMD, the IMF2 and IMF3 components contain features with different time scales, but there is a significant modal aliasing phenomenon, resulting in unsatisfactory decomposition results. Moreover, the reconstruction error of EEMD is significantly larger than that of CEEMD; CEEMD resolves the incomplete decomposition issue of EMD and improves the decomposition quality and prediction accuracy. Overall, CEEMD achieves the best decomposition results.

3.3.2. Data Differential Processing

A stationary initial sequence results in smaller prediction errors. As shown in Table 1, the p-value of the initial sequence is greater than 0.05 and the T-value exceeds the critical values at all three confidence levels; therefore, the initial sequence is non-stationary and requires stabilization. From Figure 5, it can be observed that the initial sequence exhibits significant fluctuations and a periodic "daily" pattern. Consequently, the sequences obtained from the CEEMD decomposition are differenced according to Equation (8) to transform them into stationary time series. During this process, the stationarity of the component sequences is tested using the ADF test. As indicated in Table 2, the stationarity of the differenced sequences is significantly improved.

3.3.3. Data Scaling

The normalization of the data is carried out using the Min–Max method, and its calculation formula is shown in Equation (18):
\tilde{x}_i = \frac{x_i - x_{min}}{x_{max} - x_{min}},
where x_i represents the actual value, x_min and x_max represent the minimum and maximum values, respectively, and \tilde{x}_i represents the normalized value.
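Equation (18) and its inverse, together with the remap to the [−1, 1] range used in the pre-processing step of Section 2.5, can be sketched as follows; the function names are illustrative:

```python
def min_max_scale(x):
    """Equation (18): map each value to [0, 1]; also return the
    extremes needed to undo the scaling later."""
    x_min, x_max = min(x), max(x)
    span = x_max - x_min
    return [(v - x_min) / span for v in x], x_min, x_max

def to_symmetric(scaled):
    """Remap [0, 1] to the [-1, 1] range used in the pipeline."""
    return [2.0 * v - 1.0 for v in scaled]

def inverse_scale(scaled, x_min, x_max):
    """Undo Equation (18) at the data-reconstruction stage."""
    return [v * (x_max - x_min) + x_min for v in scaled]
```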

3.4. Hyperparameters of Predicting Model by Bayesian Optimization

The Bayesian optimization algorithm is used to optimize the hyperparameters of the LSTM prediction model. By automatically searching the hyperparameter space, it reduces the need for manual tuning, improves tuning efficiency and accuracy, and improves the performance and generalization ability of the model. In this study, manual tuning selected 128 hidden units, while Bayesian tuning selected 23: by searching the parameter space more effectively, Bayesian tuning found a smaller but still effective number of hidden units, so the model performed well while reducing the risk of overfitting. The initial learning rate determines the step size of each parameter update during training; Bayesian tuning can explore the parameter space more carefully by reducing the initial learning rate, which improves the search for local optima and helps avoid instability during training. Manual tuning selected 150 iterations, while Bayesian tuning selected 26: Bayesian tuning adopts a finer strategy, exploring the parameter space in fewer iterations and finding better parameter settings in a shorter time, thereby reducing tuning time and computing resources. The optimized values of the network hyperparameters are shown in Table 3.

4. Results Discussion

To further verify the predictive accuracy and effectiveness of the proposed short-term air traffic flow prediction model, three sets of comparative experiments were conducted.
1.
The comparative experiment on the prediction results of the single models: The short-term air traffic flow sequence was only subjected to data scaling without data decomposition. Three models, namely Support Vector Regression (SVR), Autoregressive Integrated Moving Average (ARIMA), and the LSTM neural network, were used to predict the short-term air traffic flow. The results are shown in Figure 11, and the local Relative Error was calculated and presented in Figure 12;
Overall, the SVR model exhibits the lowest predictive accuracy, while the ARIMA and LSTM models generally align with the trend of the original sequence. Enlarged plots of local regions show that the LSTM model achieves significantly higher prediction accuracy. An analysis of the Relative Error plots reveals that the SVR and ARIMA models have relatively large errors, whereas the Relative Errors of the LSTM model are mostly around 0.5. A comparison of prediction errors among the models in Table 4 shows that the LSTM model outperforms the SVR and ARIMA models across all evaluation metrics, indicating that the LSTM model possesses superior fitting capability and higher predictive accuracy when dealing with non-stationary, nonlinear short-term spatial-temporal traffic flow data.
2.
The comparative experiment on the prediction results of the composite models: The original short-term air traffic flow sequence was decomposed using EMD, EEMD, and CEEMD. The decomposed Intrinsic Mode Functions (IMFs) were then scaled and fed into the LSTM neural network model for prediction. After data reconstruction, the final prediction results were obtained, as illustrated in Figure 13, with the corresponding Relative Errors shown in Figure 14.
Upon examination, it is evident that the prediction curves of the models incorporating the three decomposition methods are notably closer to the actual curve and exhibit higher predictive accuracy than the single LSTM model. The prediction errors of the EMD-LSTM and EEMD-LSTM combination models fluctuate greatly, while those of the CEEMD-LSTM model fluctuate less. Comparing the error indexes in Table 5, the R2 score of the CEEMD-LSTM model is close to 65%; furthermore, compared to the EMD-LSTM and EEMD-LSTM models, the CEEMD-LSTM model exhibits a lower Root Mean Square Error and Mean Absolute Error, reflecting higher overall predictive accuracy with respect to the original sequence. This also suggests that the CEEMD method not only effectively suppresses mode mixing but also minimizes reconstruction errors, thereby improving predictive accuracy.
3.
The comparison experiment on prediction results with Bayesian optimization and differential processing: As can be seen from Figure 15, the prediction accuracy and fitting ability of the CEEMD-DF-BO-LSTM combined neural network model are significantly better than those of the combined model without differential processing and Bayesian optimization. The CEEMD-DF-BO-LSTM model is also the most accurate at predicting peak values. Figure 16 shows that the Relative Error of the CEEMD-DF-BO-LSTM model stays below 0.25, indicating very small prediction fluctuation. The evaluation indexes in Table 6 show that the Root Mean Square Error decreases by 73.7% and the R2 reaches 97%, confirming that this model achieves the highest prediction accuracy and fitting degree.

5. Conclusions

Aiming at the problem of low accuracy of traffic flow prediction caused by the characteristics of short-term air traffic flow, such as nonlinearity, non-stationarity, and complex change trend, this paper proposes a short-term air traffic flow prediction model based on CEEMD and LSTM of the Bayesian optimization algorithm and data differential processing (CEEMD-DF-BO-LSTM). Using the data from Shanghai Pudong Airport, the model is theoretically analyzed and experimentally verified, and the short-term traffic flow prediction with a five-minute interval is realized. Our conclusions are as follows:
1.
The original short-term air traffic flow sequence is decomposed into a finite number of IMF components by CEEMD decomposition, which suppresses the modal aliasing phenomenon, reduces the reconstruction error, and effectively reduces the influence of noise in prediction.
2.
Differencing each component except IMF1 effectively eliminates the trend of the air traffic flow time series, improves the stability of the flight flow sequence, and ensures prediction accuracy.
3.
The parameters of the LSTM network model are optimized by the Bayesian optimization algorithm, which not only saves computing resources but also brings significant advantages to the real-time task of the model.
The final prediction results not only fully exploit the hidden time series characteristics of the original sequence but also eliminate the influence of nonlinear and non-stationary factors on the prediction model. The evaluation indexes RMSE, MAE, and R2 are 0.336, 0.239, and 97.535%, respectively, much better than those of other classical models, and the model can accurately simulate the dynamic changes in complex short-term air traffic flow. This model and method can provide a reference for air traffic flow prediction and practical air traffic control applications.

Author Contributions

Conceptualization, R.Z. and S.Q.; methodology, R.Z.; software, S.Q.; validation, S.Q. and M.L.; formal analysis, S.Q.; investigation, S.M.; resources, Q.Z.; data curation, S.Q.; writing—original draft preparation, S.Q.; writing—review and editing, R.Z.; visualization, S.Q.; supervision, R.Z.; project administration, R.Z.; funding acquisition, R.Z. and Q.Z. All authors have read and agreed to the published version of this manuscript.

Funding

This research was funded by the Central University Basic Scientific Research Business Expenses Special Fund Support (No. ZJ2023-007), the Key R&D Project of Sichuan Provincial Science and Technology Plan (No. 2022YFG0353), Fundamental Research Funds for the Central Universities (No. J2022-056), Sichuan Provincial College Student Innovation and Entrepreneurship Training Program (No. S202310624288).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are available upon reasonable request from the corresponding author.

Acknowledgments

We acknowledge the equipment support provided by the Civil Aviation Flight University of China.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of this study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
CEEMD: complementary ensemble empirical mode decomposition
LSTM: long short-term memory
SVR: Support Vector Regression
ARIMA: Autoregressive Integrated Moving Average
BO: Bayesian optimization algorithm
DF: data differential processing
EMD: empirical mode decomposition
EEMD: ensemble empirical mode decomposition
IMFs: Intrinsic Mode Functions
RES: residual component
ADF: Augmented Dickey–Fuller
RMSE: Root Mean Square Error
MAE: Mean Absolute Error
R2: Coefficient of Determination
RE: Relative Error

References

  1. Gerdes, I.; Temme, A. Traffic network identification using trajectory intersection clustering. Aerospace 2020, 7, 175. [Google Scholar] [CrossRef]
  2. Moreno, F.P.; Comendador, V.F.G.; Jurado, R.D.A.; Suárez, M.Z.; Janisch, D.; Valdés, R.M.A. Methodology of air traffic flow clustering and 3-D prediction of air traffic density in ATC sectors based on machine learning models. Expert Syst. Appl. 2023, 223, 119897. [Google Scholar] [CrossRef]
  3. Corver, S.C.; Unger, D.; Grote, G. Predicting air traffic controller workload: Trajectory uncertainty as the moderator of the indirect effect of traffic density on controller workload through traffic conflict. Hum. Factors 2016, 58, 560–573. [Google Scholar] [CrossRef]
  4. Wang, J.H.; Zhu, X.B.; Xia, Z.H. Visualization Analysis of Domestic Air Traffic Management Based on Knowledge Graph. Traffic Inf. Saf. 2019, 37, 11–19. [Google Scholar]
  5. Wang, C.; Zheng, X.F.; Wang, L. Research on Nonlinear Characteristics of Air Traffic Flow in Intersecting Routes. J. Southwest Jiaotong Univ. 2017, 52, 171–178. [Google Scholar]
  6. Wang, L.L.; Zhao, Y.F. Air Traffic Flow Prediction Method Based on GA, RBF, and Improved Cao Method. Transp. Inf. Saf. 2023, 41, 115–123. [Google Scholar]
  7. Packard, N.H.; Crutchfield, J.P.; Farmer, J.D.; Shaw, R.S. Geometry From a Time Series. Phys. Rev. Lett. 1980, 45, 712. [Google Scholar] [CrossRef]
  8. Polson, N.G.; Sokolov, V.O. Deep Learning for Short-Term Traffic Flow Prediction. Transp. Res. Part Emerg. Technol. 2017, 79, 1–17. [Google Scholar] [CrossRef]
  9. Zhang, Z.; Zhang, A.; Sun, C.; Xiang, S.; Guan, J.; Huang, X. Research on Air Traffic Flow Forecast Based on ELM Non-iterative Algorithm. Mob. Netw. Appl. 2021, 26, 425–439. [Google Scholar] [CrossRef]
  10. Zhang, J.; Wang, G.; Xiao, G. Dynamic Trajectory Prediction for Continuous Descend Operations Based on Unscented Kalman Filter. In Proceedings of the Chinese Intelligent Systems Conference; Springer: Singapore, 2021; Volume I, pp. 206–216. [Google Scholar]
  11. Dong, C.X.; Wei, X.; Zhang, K.P. Large-scale Road Network Traffic Flow Prediction Based on Graph Transformer. Ind. Eng. 2023, 26, 159–167. [Google Scholar]
  12. Rong, J.; Liu, L.; Wen, H.; Wang, Q.; Zhou, L. Application of GM(1,1)-AR Forecasting Model Based on Kalman Filter in Deformation Prediction. J. Guilin Univ. Technol. 2018, 38, 301–305. [Google Scholar]
  13. Wang, G.; Yang, C.; Ma, L.; Dai, W. Nonlinear Kalman Filtering Based on Gaussian-Generalized Hyperbolic Mixture Distribution. Acta Autom. Sin. 2023, 49, 448–460. [Google Scholar]
  14. Razali, N.A.M.; Shamsaimon, N.; Ishak, K.K.; Ramli, S.; Amran, M.F.M.; Sukardi, S. Gap, Techniques and Evaluation: Traffic Flow Prediction Using Machine Learning and Deep Learning. J. Big Data 2021, 8, 152. [Google Scholar] [CrossRef]
  15. Gui, G.; Zhou, Z.; Wang, J.; Liu, F.; Sun, J. Machine Learning Aided Air Traffic Flow Analysis Based on Aviation Big Data. IEEE Trans. Veh. Technol. 2020, 69, 4817–4826. [Google Scholar] [CrossRef]
  16. Li, G.Y.; Hu, M.H. A Multi-model Fusion Dynamic Forecasting Method for Route Congestion Situation Considering Segment Correlation. Transp. Syst. Eng. Inf. 2018, 18, 215–222. [Google Scholar]
  17. Wang, H.; Zhang, R.; Cheng, X.; Yang, L. Hierarchical Traffic Flow Prediction Based on Spatial-temporal Graph Convolutional Network. IEEE Trans. Intell. Transp. Syst. 2022, 23, 16137–16147. [Google Scholar] [CrossRef]
  18. Opeyemi, I.O.; Akanni, O.O. Tool and Workpiece Condition Classification Using Empirical Mode Decomposition (EMD) with Hilbert–Huang Transform (HHT) of Vibration Signals and Machine Learning Models. Appl. Sci. 2023, 13, 2248. [Google Scholar] [CrossRef]
  19. Liu, L.Y. Research on the Meaning and Nature of Econometrics. Ph.D. Thesis, Dongbei University of Finance and Economics, Dalian, China, 2013. [Google Scholar]
  20. Meng, Q.L.; Dou, Y. Research on Short-term Forecast of Railway Passenger Volume Based on EMD-CNN-LSTM Model. Railw. Transp. Econ. 2023, 45, 65–73. [Google Scholar]
  21. Yang, Y.J.; Tang, D. Research on Real-time Warning of Price Bubbles in Nonferrous Metal Futures Market in China: An Analysis Based on Upper Bound ADF Test. Price Theory Pract. 2022, 12, 114–117+202. [Google Scholar]
  22. Ogbemudia, H.O.; Aghogho, B.E.; Usunobun, M.O. Bearing Failure Diagnosis and Prognostics Modeling in Plants for Industrial Purpose. J. Eng. Appl. Sci. 2023, 70, 17. [Google Scholar]
  23. Cui, J.S.; Li, X. Research on CNN-LSTM Coupled Model for Chemical Process Early Warning. J. Process. Eng. 2024, 1–9. [Google Scholar]
  24. Han, Z.Y. Inventory Demand Forecasting Based on LSTM Model Based on Data Difference. China Storage Transp. 2023, 9, 152–153. [Google Scholar]
  25. Zhao, M.W.; Zhang, W.S.; Wang, K.W. Short-term Passenger Flow Prediction of Urban Rail Transit Based on EMD-PSO-LSTM Combined Model. Railw. Transp. Econ. 2022, 44, 110–118. [Google Scholar]
  26. Cheng, Z.L.; Zhang, X.Q.; Liang, Y. Railway Freight Volume Prediction Based on LSTM Network. J. Railw. Sci. 2020, 42, 15–21. [Google Scholar]
  27. Smagulova, K.; James, A.P. A survey on LSTM memristive neural network architectures and applications. Eur. Phys. J. Spec. Top. 2019, 228, 2313–2324. [Google Scholar] [CrossRef]
  28. Kim, J.Y.; Oh, J.S. Electric Consumption Forecast for Ships Using Multivariate Bayesian Optimization-SE-CNN-LSTM. J. Mar. Sci. Eng. 2023, 11, 292. [Google Scholar] [CrossRef]
  29. Wang, F. Nonlinear Fractal Characteristics of Air Traffic Flow. J. Southwest Jiaotong Univ. 2019, 54, 1147–1154. [Google Scholar]
  30. Yan, Z.; Yang, H.; Li, F.; Lin, Y. A deep learning approach for short-term airport traffic flow prediction. Aerospace 2021, 9, 11. [Google Scholar] [CrossRef]
Figure 1. Flowchart of CEEMD decomposition.
Figure 2. LSTM cell unit.
Figure 3. Flowchart of BO algorithm.
Figure 5. 5 min airspace flow chart.
Figure 6. The original flight flow data.
Figure 7. Results of EMD decomposition.
Figure 8. Results of EEMD decomposition.
Figure 9. Results of CEEMD decomposition.
Figure 10. Reconstruction errors.
Figure 11. The prediction results of the single models.
Figure 12. Local prediction Relative Error of single models.
Figure 13. The prediction results of the composite models.
Figure 14. Local prediction Relative Error of composite models.
Figure 15. The prediction results of the optimized model.
Figure 16. Local prediction Relative Error of optimized model.
Table 1. ADF test for initial sequence (the 1%, 5%, and 10% columns are the ADF critical values at those confidence levels).

Time Series | T | 1% | 5% | 10% | p
5-min flight traffic | −2.750 | −3.43808 | −2.86495 | −2.56859 | 0.06576
Table 2. ADF test for differenced sequence (the 1%, 5%, and 10% columns are the ADF critical values at those confidence levels).

Time Series | T | 1% | 5% | 10% | p
IMF2 | −11.700 | −3.43813 | −2.86497 | −2.56860 | 1.573 × 10⁻²¹
IMF3 | −11.080 | −3.43813 | −2.86497 | −2.56860 | 4.334 × 10⁻²⁰
IMF4 | −10.448 | −3.43806 | −2.86494 | −2.56858 | 1.460 × 10⁻¹⁸
IMF5 | −5.539 | −3.43812 | −2.86497 | −2.56860 | 1.716 × 10⁻⁶
IMF6 | −3.594 | −3.43800 | −2.86492 | −2.56857 | 5.886 × 10⁻³
Res | −2.918 | −3.43816 | −2.86499 | −2.56861 | 4.324 × 10⁻²
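The data differential processing (DF) step that produces the stationary sequences tested in Table 2 is first-order differencing, which must be inverted after prediction to recover flow values. A short sketch in Python (the series here is synthetic; the function names are illustrative):

```python
import numpy as np

def difference(series):
    """First-order differencing: d[t] = x[t+1] - x[t]."""
    series = np.asarray(series, dtype=float)
    return series[0], np.diff(series)  # keep x[0] so the series can be restored

def restore(first_value, diffed):
    """Invert first-order differencing by cumulative summation."""
    return np.concatenate(([first_value], first_value + np.cumsum(diffed)))

# synthetic non-stationary component (trend plus noise stands in for an IMF)
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(0.2, 1.0, size=200))

x0, d = difference(x)
x_back = restore(x0, d)            # lossless round-trip reconstruction
assert np.allclose(x, x_back)
```

Differencing removes the slow trend that makes the raw components fail the ADF test, which is why the p-values in Table 2 drop well below the 5% significance threshold.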
Table 3. Hyperparameter results of LSTM model optimization.

Parameter | Parameter Range | Manual Tuning | Bayesian Tuning
LSTM Layers | [1, 4] | 3 | 2
Hidden Units | [16, 256] | 128 | 23
Initial Learning Rate | [1 × 10⁻⁵, 1 × 10⁻²] | 0.01 | 0.001
Number of Iterations | [10, 200] | 150 | 26
Dropout Rate | [1 × 10⁻³, 1 × 10⁻¹] | 0.1 | 0.052
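The Bayesian tuning column in Table 3 comes from the BO loop: fit a probabilistic surrogate to the evaluations seen so far, pick the next candidate by an acquisition function, evaluate, and repeat. A self-contained toy sketch of that loop in one dimension, using a Gaussian-process surrogate and expected improvement (all names, kernel settings, and the stand-in objective are illustrative, not the authors' implementation):

```python
import math
import numpy as np

def rbf_kernel(a, b, length=0.3):
    """Squared-exponential kernel between two 1-D input arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, noise=1e-6):
    """Gaussian-process posterior mean and std at candidate points."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf_kernel(x_obs, x_new)
    K_inv = np.linalg.inv(K)
    mu = Ks.T @ K_inv @ y_obs
    var = np.diag(rbf_kernel(x_new, x_new) - Ks.T @ K_inv @ Ks)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def expected_improvement(mu, sigma, y_best):
    """EI for minimization: expected amount each candidate improves on y_best."""
    z = (y_best - mu) / sigma
    phi = np.exp(-0.5 * z ** 2) / math.sqrt(2 * math.pi)      # standard normal pdf
    Phi = 0.5 * (1 + np.vectorize(math.erf)(z / math.sqrt(2)))  # standard normal cdf
    return (y_best - mu) * Phi + sigma * phi

def bayes_opt(objective, bounds=(0.0, 1.0), n_init=4, n_iter=10, seed=0):
    """Minimal BO loop: fit GP, evaluate the max-EI candidate, repeat."""
    rng = np.random.default_rng(seed)
    x_obs = rng.uniform(*bounds, size=n_init)
    y_obs = np.array([objective(x) for x in x_obs])
    candidates = np.linspace(*bounds, 200)
    for _ in range(n_iter):
        mu, sigma = gp_posterior(x_obs, y_obs, candidates)
        ei = expected_improvement(mu, sigma, y_obs.min())
        x_next = candidates[np.argmax(ei)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = np.argmin(y_obs)
    return x_obs[best], y_obs[best]

# toy objective standing in for validation loss as a function of one hyperparameter
f = lambda x: (x - 0.3) ** 2
x_best, y_best = bayes_opt(f)
```

In the paper the objective is the LSTM's validation error over the five hyperparameters in Table 3 rather than this one-dimensional toy, but the surrogate-acquisition cycle is the same, which is why BO reaches a better configuration (e.g. 26 iterations instead of 150) with far fewer trial trainings than manual tuning.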
Table 4. The prediction error of the single models.

Prediction Model | RMSE/Flights | MAE/Flights | R2/%
SVR | 1.811 | 1.431 | 28.374
ARIMA | 1.850 | 1.483 | 25.297
LSTM | 1.637 | 1.313 | 41.466
Table 5. The prediction error of the composite models.

Prediction Model | RMSE/Flights | MAE/Flights | R2/%
EMD-LSTM | 1.356 | 1.115 | 59.880
EEMD-LSTM | 1.381 | 1.127 | 58.338
CEEMD-LSTM | 1.277 | 1.033 | 64.424
Table 6. The prediction error of the optimized model.

Prediction Model | RMSE/Flights | MAE/Flights | R2/%
CEEMD-LSTM | 1.277 | 1.033 | 64.424
CEEMD-DF-BO-LSTM | 0.336 | 0.239 | 97.535
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zhou, R.; Qiu, S.; Li, M.; Meng, S.; Zhang, Q. Short-Term Air Traffic Flow Prediction Based on CEEMD-LSTM of Bayesian Optimization and Differential Processing. Electronics 2024, 13, 1896. https://doi.org/10.3390/electronics13101896

