Turbulence along the Runway Glide Path: The Invisible Hazard Assessment Based on a Wind Tunnel Study and Interpretable TPE-Optimized KTBoost Approach

Khattak, Afaq; Zhang, Jianping; Chan, Pak-Wai; Chen, Feng

doi:10.3390/atmos14060920


¹ The Key Laboratory of Infrastructure Durability and Operation Safety in Airfield of CAAC, College of Transportation Engineering, Tongji University, 4800 Cao’an Road, Jiading, Shanghai 201804, China

² The Second Research Institute of Civil Aviation Administration of China, Civil Unmanned Aircraft Traffic Management Key Laboratory of Sichuan Province, Chengdu 610041, China

³ Hong Kong Observatory, 134A Nathan Road, Kowloon, Hong Kong, China

* Authors to whom correspondence should be addressed.

Atmosphere 2023, 14(6), 920; https://doi.org/10.3390/atmos14060920

Submission received: 16 April 2023 / Revised: 16 May 2023 / Accepted: 23 May 2023 / Published: 24 May 2023

(This article belongs to the Special Issue Aviation Meteorology: Current Status and Perspective)


Abstract

Aircraft landings can be dangerous near airport runways due to wind variability. As a result, an aircraft could miss an approach or deviate from its flight path. In this study, turbulence intensity along the runway glide path was investigated using a scaled-down model of Hong Kong International Airport (HKIA) and the complex terrain nearby, built in the TJ-3 atmospheric boundary layer wind tunnel. Different factors, including the effect of terrain, distance from the runway threshold, assigned approach runway, wind direction, and wind speed, were taken into consideration. Next, based on the experimental results, we trained and tested a novel tree-structured Parzen estimator (TPE)-optimized kernel and tree-boosting (KTBoost) model. The TPE-optimized KTBoost model outperformed other advanced machine learning models on the testing data in terms of MAE (0.83), MSE (1.44), RMSE (1.20), and R² (0.89). The permutation-based importance analysis using the TPE-optimized KTBoost model also revealed that the top three factors contributing to high turbulence intensity were the effect of terrain, distance from the runway threshold, and wind direction. The presence of terrain, a shorter distance from the runway, and a wind direction from 90 degrees to 165 degrees all contributed to high turbulence intensity.

Keywords: aviation safety; turbulence intensity; wind tunnel; TPE-optimized KTBoost

1. Introduction

Wind shear is defined as a sudden change in wind speed or wind direction that occurs within about 5.5 km (3 nautical miles) of the airport runway threshold at an altitude of 500 m (~1600 ft) or lower [1]. Wind shear can significantly change the lift that an airplane experiences during takeoff and landing, putting the lives of everyone on board in jeopardy. Terrain-induced wind shear becomes particularly significant when an airport is located in an area with complex terrain, such as rocky mountains, because complicated topography fosters more turbulent airflow. Likewise, high-rise buildings in the surrounding area lead to building-induced wind shear and intense crosswinds. As illustrated in Figure 1, these events have a very high intensity, a very small spatial scale, and an unpredictable nature. Because of these characteristics, they pose a significant threat to aircraft takeoffs and landings. They also make airport operations more difficult, which in turn lengthens passenger delays.

Several researchers have used numerical simulation models to simulate the impacts of wind fields, snow loads, and air infiltration on runways and around airport terminals and urban communities. Lin et al. [2] simulated an advection fog event over Shanghai Pudong International Airport using the Weather Research and Forecasting (WRF) model to investigate its origin, progression, and spread. Liu et al. [3] calculated air infiltration at Chengdu Shuangliu International Airport by combining Computational Fluid Dynamics (CFD) models and on-site measurements. Zhou and Li [4] evaluated the impacts of snow redistribution on the roof of the Beijing airport terminal using CFD modeling and simulation. Leung et al. [5] used a CFD model to predict the wind field at Hong Kong International Airport (HKIA), taking into account the impact of both natural terrain and man-made structures. Li and Chan [6] employed a CFD model to better understand the frequency of wind shear at HKIA during a typhoon in light of the complicated topography nearby. Li and Chan [7] employed a CFD approach and a mesoscale model at HKIA to investigate the vortex shedding brought on by the topography. Another study employed CFD models based on the steady Reynolds-averaged Navier–Stokes (RANS) equations with the k–ω turbulence model to estimate the wind flow distribution around urban community buildings in the Hail area of Saudi Arabia [8]. The existing literature indicates that CFD models based on the RANS equations can model and predict the average characteristics of the airport wind field but cannot estimate the intensity of the turbulence.

For the analysis of turbulence characteristics under complex terrain conditions, wind tunnel studies provide a practical substitute for numerical simulation models. The effects of wind on bridges, wind turbines, and low- and high-rise building structures have all been studied in wind tunnel experiments by researchers from a wide range of disciplines [9,10,11,12]. The findings of wind tunnel tests also provide a critical foundation for assessing the reliability of numerical simulations. The primary disadvantages of this strategy, however, are the high cost of wind tunnel testing and the limited availability of testing facilities and time. Numerous tests have to be carried out under different conditions to obtain the required findings, which reduces efficiency and consumes time and money. Moving from experimental work to empirical modeling techniques is therefore important to address these issues.

Recent years have seen a significant increase in the application of artificial intelligence techniques, including machine learning, deep learning, and reinforcement learning procedures [13,14,15,16]. This is because sophisticated computing techniques are increasingly needed to process massive data sets. Hu and Kwok [17] built machine learning models to predict wind pressures around circular cylinders. Kim et al. [18] used unsupervised machine learning to identify pressure patterns on building structures. Similarly, Wada et al. [19] used deep reinforcement learning with discrete actions, likewise based on wind tunnel tests, to regulate the pitch of an aerial vehicle.

It is evident that machine learning methods have been employed in a wide variety of applications to estimate wind-induced responses of structures. By contrast, their application to wind field characteristics along airport runway glide paths, such as turbulence intensity, remains very limited. This study proposes a novel tree-structured Parzen estimator (TPE)-optimized KTBoost model as an alternative to performing simulation analysis and wind tunnel tests in order to construct non-parametric models for estimating turbulence intensity along the airport runway glide path. Data from the TJ-3 atmospheric boundary layer (ABL) wind tunnel were used for the training and testing of the KTBoost model [20], which had its hyperparameters tuned using TPE optimization [21]. In addition, the TPE-optimized KTBoost model was evaluated alongside other advanced machine learning models. The TPE-optimized KTBoost model was then used to investigate the overall importance of the factors, as well as how those factors relate to the turbulence intensity.

2. Background and Methods

The Hong Kong International Airport (HKIA) was selected as the case study for the evaluation of turbulence along the runway glide path. A scaled model of the HKIA and the surrounding landscape was built at the State Key Laboratory for Disaster Prevention in Civil Engineering at Tongji University in Shanghai. The experimental findings were used to develop the TPE-optimized KTBoost model. The flowchart for determining the turbulence intensity from wind tunnel measurements and the TPE-optimized KTBoost model is shown in Figure 2. HKIA is located on an island called Lantau in a subtropical region off the southeast coast of mainland China (Figure 3). Many observational and modeling investigations have demonstrated that the HKIA’s significant land–sea contrast and complex orography are favorable for the formation of low-level turbulence and wind shear [22,23,24].

2.1. Wind Tunnel Experiments

This study used wind tunnel tests to assess the level of turbulence along the north runway of HKIA under various inflow wind conditions. The testing range, which had a total diameter of 27.2 km and an average height of roughly 425.2 m, encompassed Lantau Island and the HKIA. Experiments were conducted in the TJ-3 ABL wind tunnel of the State Key Laboratory for Disaster Prevention in Civil Engineering at Tongji University in Shanghai. It is a closed-circuit, vertical-return, low-speed wind tunnel. The dimensions of the test section of the wind tunnel are 14 m × 15 m × 2 m (length × width × height).

2.1.1. Terrain Model

A model of the complex terrain, comprising Lantau Island, the north runway of HKIA, and the surrounding buildings, was built at a geometric scaling ratio of 1:4000 with a model diameter of 6.80 m (Figure 4). The height within the simulation range was 0.106 m. The surrounding terrain model was constructed layer by layer along the contour lines using dense foam with a one-inch texture (equivalent to a difference in actual terrain height of 40 m). The exterior of the scaled-down model was painted to resemble the roughness of a real mountain’s surface. The wind tunnel blockage ratio $b_{r}$, which is the ratio of the windward (projected) area of the test model $A_{pm}$ to the cross-sectional area of the wind tunnel test section $A_{ct}$, was calculated using Equation (1):

$$b_{r} = \frac{A_{pm}}{A_{ct}} \times 100\% = \frac{0.106 \times 6.80}{15 \times 2} \times 100\% = 2.40\% \tag{1}$$

The wind tunnel blockage ratio was 2.40%, which is below the 5% limit preferred for wind tunnel research and therefore satisfied the requirements of the wind tunnel experiments.
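The arithmetic in Equation (1) can be checked with a few lines of Python. This is only a sketch: the function and argument names are illustrative, and the projected area is approximated, as in Equation (1), by the model height times the model diameter.

```python
def blockage_ratio_percent(model_height_m, model_width_m, tunnel_width_m, tunnel_height_m):
    """Blockage ratio b_r in percent: projected model area over tunnel cross-section."""
    projected_area = model_height_m * model_width_m    # A_pm
    cross_section = tunnel_width_m * tunnel_height_m   # A_ct
    return projected_area / cross_section * 100.0

# Values quoted in the text: 0.106 m x 6.80 m model, 15 m x 2 m test section.
b_r = blockage_ratio_percent(0.106, 6.80, 15.0, 2.0)
print(f"{b_r:.2f} %")  # 2.40 %
```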

2.1.2. Inflow Configuration

The study considered the prevailing wind patterns of Hong Kong, including east to southeasterly winds and the southwest monsoon, and measured wind direction from 90 to 240 degrees in 15-degree intervals, as depicted in Figure 5. Eleven distinct wind conditions were present. The wind directions were specified as 0 degrees for north, 90 degrees for east, 180 degrees for south, and 270 degrees for west.

2.1.3. Measuring Location

Aircraft typically follow a three-degree glide slope during the last three nautical miles of their descent, as illustrated by the black dotted line in Figure 6. For the wind tunnel tests, eight measurement sites in two sets, (a1, a2, a3, a4) and (b1, b2, b3, b4), were placed along the glide paths of runways 07LA and 25RA, respectively. Airport runways are numbered according to compass bearings, where north is represented by 360, east by 90, south by 180, and west by 270 degrees. The designation 07LA denotes the left arrival runway at a heading of 070 degrees.
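As a rough illustration of the geometry described above, the following sketch computes the height of a straight three-degree glide path at a given distance from the threshold, and derives a two-digit runway number from a heading. The function names are illustrative, the threshold crossing height is ignored, and the sample distances are not the actual positions of sites a1 to b4, which are not listed here.

```python
import math

NM_TO_M = 1852.0  # metres per nautical mile (exact by definition)

def glide_path_height_m(distance_nm, slope_deg=3.0):
    """Height of a straight glide path above the threshold at a given distance."""
    return distance_nm * NM_TO_M * math.tan(math.radians(slope_deg))

def runway_number(heading_deg):
    """Two-digit runway number: heading rounded to the nearest 10 degrees, 36 for north."""
    n = int(heading_deg / 10.0 + 0.5) % 36
    if n == 0:
        n = 36
    return f"{n:02d}"

for d in (1.0, 2.0, 3.0):  # illustrative distances in nautical miles
    print(d, round(glide_path_height_m(d), 1))
print(runway_number(70), runway_number(250))  # 07 25
```

At three nautical miles the path is roughly 290 m above the threshold, which is consistent with the 500 m altitude band used in the wind shear definition.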

At each measurement point, a Series 100 Cobra Probe from TFI (Turbulent Flow Instrumentation Pty Ltd., Australia) was utilized to assess the fluctuating wind speeds, or turbulence intensities. The Cobra probe is a dynamic multi-hole pressure probe that can measure three components of mean and varying wind speeds. This instrument has a frequency response of 0–2000 Hz and can gauge wind speeds between 2 m/s and 100 m/s with a precision of 0.5 m/s, which is adequate to assess turbulent flows within a wind tunnel.

The acceptance angle for effective measurement with this Cobra probe is 45°; that is, effective measurement can be carried out when the included angle between the direction of the inflow and the forward axis of the probe does not exceed 45°. The Cobra probes were installed on custom stands, and the total height of each installation was adjusted to match the height of the measuring site. After each test condition, the probes were reoriented towards the inflow direction. Each operating condition was recorded for a total of 65.536 s at a single-point sampling rate of 1000 Hz.

2.2. Model Development

To estimate the turbulence intensity along the airport runway glide path, a novel TPE-optimized KTBoost model was built using data from wind tunnel experiments. Figure 7 illustrates the procedure for developing a TPE-optimized KTBoost model with factors that serve as inputs and outputs. It is essential to perform the label encoding before a model is developed, as shown in Table 1.
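Label encoding simply maps each category of a factor to an integer code. The sketch below shows the idea with hypothetical category names and codes; the actual codes used in the study are those listed in Table 1, which may differ.

```python
def fit_label_encoder(categories):
    """Assign integer codes 0, 1, 2, ... to categories in the order given."""
    return {category: code for code, category in enumerate(categories)}

# Hypothetical categories and ordering, for illustration only.
terrain_codes = fit_label_encoder(["without terrain", "with terrain"])
runway_codes = fit_label_encoder(["07LA", "25RA"])

records = [("with terrain", "07LA"), ("without terrain", "25RA")]
encoded = [(terrain_codes[t], runway_codes[r]) for t, r in records]
print(encoded)  # [(1, 0), (0, 1)]
```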

2.2.1. Combined Kernel and Tree Boosting (KTBoost)

In this study, we employ the combined kernel and tree-boosting algorithm known as KTBoost, which combines the concepts of kernel boosting and tree boosting [25]. In each boosting iteration, KTBoost adds to the ensemble $Ψ(x)$ either a regression tree or a penalized reproducing kernel Hilbert space (RKHS) regression function, also known as kernel ridge regression (KRR). To accomplish this, a candidate regression tree $f_{m}^{T}(x)$ and a candidate RKHS function $f_{m}^{K}(x)$ are first learned using a functional Newton or functional gradient descent step based on the loss function $L$. The base learner whose addition to the ensemble results in the lowest empirical risk is then chosen. In this way, at each iteration the KTBoost algorithm chooses a base estimator from two distinct function classes. RKHS functions are continuous and, depending on the kernel function, can be smooth, whereas trees are discontinuous functions. The core concept behind this algorithm is that the two types of base estimators are complementary, so that combining regression trees and RKHS functions as base estimators may achieve higher predictive accuracy. The pseudocode for the KTBoost algorithm is given in Algorithm 1:

Algorithm 1. Kernel and tree boosting (KTBoost)
1	Initialization: $Ψ_{0}(x) = \arg\min_{c \in \mathbb{R}^{d}} ℜ(c)$
2	for $m = 1$ to $Ξ$ do
3		Compute the functional gradient $g_{m,i} = \frac{\partial}{\partial Ψ} L(y_{i}, Ψ) \big|_{Ψ = Ψ_{m-1}(x_{i})}$ and the Hessian $h_{m,i} = \frac{\partial^{2}}{\partial Ψ^{2}} L(y_{i}, Ψ) \big|_{Ψ = Ψ_{m-1}(x_{i})}$ at the current estimate $Ψ_{m-1}(x)$ in the direction of the indicator functions $I_{\{x = x_{i}\}}(x)$, where $I_{\{x = x_{i}\}}(x) = 1$ if $x = x_{i}$ and 0 otherwise.
4		Compute the candidate regression tree $f_{m}^{T}(x) = \arg\min_{f \in τ} ℜ^{2}(Ψ_{m-1} + f)$ and the candidate RKHS regression function $f_{m}^{K}(x) = \arg\min_{f \in H} ℜ^{2}(Ψ_{m-1} + f) + \frac{1}{2} λ ‖f‖_{H}^{2}$, where the approximate empirical risk is defined as $ℜ^{2}(Ψ_{m-1} + f) = \sum_{i=1}^{n} \left( g_{m,i} f(x_{i}) + \frac{1}{2} h_{m,i} f(x_{i})^{2} \right)$
5		if $ℜ^{2}(Ψ_{m-1} + v f_{m}^{T}(x)) \leq ℜ^{2}(Ψ_{m-1} + v f_{m}^{K}(x))$ then
6		$f_{m}(x) = f_{m}^{T}(x)$
7		else
8		$f_{m}(x) = f_{m}^{K}(x)$
9		end if
10		Update $Ψ_{m}(x) = Ψ_{m-1}(x) + v f_{m}(x)$
11	end for
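To make the selection step of Algorithm 1 concrete, here is a minimal pure-Python sketch. It is not the authors' implementation or the KTBoost package: it assumes a squared-error loss (so the gradient is $Ψ - y$ and the Hessian is 1), one-dimensional inputs with at least two distinct values, depth-1 trees, and a Gaussian kernel with hand-picked width and penalty. All function names are illustrative.

```python
import math

def fit_stump(x, r):
    """Depth-1 regression tree: least-squares fit to targets r on 1-D inputs x."""
    best = None
    for s in sorted(set(x))[:-1]:
        left = [ri for xi, ri in zip(x, r) if xi <= s]
        right = [ri for xi, ri in zip(x, r) if xi > s]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - ml) ** 2 for ri in left) + sum((ri - mr) ** 2 for ri in right)
        if best is None or sse < best[0]:
            best = (sse, s, ml, mr)
    _, s, ml, mr = best
    return lambda q: ml if q <= s else mr

def solve(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(m[i][c]))
        m[c], m[p] = m[p], m[c]
        for i in range(c + 1, n):
            f = m[i][c] / m[c][c]
            for j in range(c, n + 1):
                m[i][j] -= f * m[c][j]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (m[i][n] - sum(m[i][j] * z[j] for j in range(i + 1, n))) / m[i][i]
    return z

def fit_krr(x, r, lam=1.0, width=1.0):
    """Kernel ridge regression with a Gaussian kernel: solve (K + lam*I) alpha = r."""
    k = lambda a, b: math.exp(-((a - b) ** 2) / (2 * width ** 2))
    gram = [[k(xi, xj) + (lam if i == j else 0.0) for j, xj in enumerate(x)]
            for i, xi in enumerate(x)]
    alpha = solve(gram, r)
    return lambda q: sum(a * k(q, xi) for a, xi in zip(alpha, x))

def approx_risk(g, f_vals):
    """Second-order risk for squared loss (Hessian h = 1): sum of g*f + 0.5*f^2."""
    return sum(gi * fi + 0.5 * fi ** 2 for gi, fi in zip(g, f_vals))

def ktboost_fit(x, y, n_iter=20, nu=0.5):
    """Return (predict, choices); choices records 'tree' or 'kernel' per iteration."""
    init = sum(y) / len(y)  # constant minimizing the squared loss
    learners, choices = [], []
    predict = lambda q: init + nu * sum(f(q) for f in learners)
    for _ in range(n_iter):
        g = [predict(xi) - yi for xi, yi in zip(x, y)]  # gradient of 0.5*(y - psi)^2
        neg_g = [-gi for gi in g]                       # Newton target when h = 1
        cands = {"tree": fit_stump(x, neg_g), "kernel": fit_krr(x, neg_g)}
        risks = {name: approx_risk(g, [nu * f(xi) for xi in x])
                 for name, f in cands.items()}
        name = "tree" if risks["tree"] <= risks["kernel"] else "kernel"
        learners.append(cands[name])
        choices.append(name)
    return predict, choices

# Synthetic target with a smooth part and a jump, so both learner types are useful.
x = [i / 10 for i in range(30)]
y = [math.sin(xi) + (1.0 if xi > 1.5 else 0.0) for xi in x]
model, choices = ktboost_fit(x, y)
mse_const = sum((yi - sum(y) / len(y)) ** 2 for yi in y) / len(y)
mse_model = sum((yi - model(xi)) ** 2 for xi, yi in zip(x, y)) / len(y)
print(choices, mse_const, mse_model)
```

Inspecting `choices` shows which function class was selected at each iteration, which is the complementarity argument made above in miniature.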

2.2.2. Tree-Structured Parzen Estimator (TPE)

The effectiveness of every machine learning model is contingent on its hyperparameters. They control the learning algorithm or the model’s underlying structure. In practice, however, there is no standard method for selecting hyperparameters. In lieu of this, hyperparameters are frequently set by trial and error, utilizing optimization search strategies, or are occasionally left at their default values. Hyperparameter optimization provides a systematic approach to this issue by framing it as an optimization problem: a good set of hyperparameters should (at the very least) minimize the difference between predicted and actual values.

In this study, we employed the TPE algorithm for the hyperparameter tuning. The TPE algorithm is a sequential model-based global optimization algorithm that effectively determines the hyperparameters of a machine learning model. It was developed to overcome the shortcomings of the traditional Bayesian Optimization approaches when dealing with categorical and conditional hyperparameters by introducing Parzen window estimators, thereby enhancing the performance of the hyperparameter search strategy [26,27]. Employing the Parzen-window density estimation, the TPE algorithm produces probability density functions within a hyperparametric search space. The search space may be built using a uniform distribution, a discrete uniform distribution, or a logarithmic uniform distribution.

During the startup iterations, a random search initializes the distributions by randomly sampling the response surface $\{θ^{(i)}, y^{(i)}\}$, $i = 1, 2, \dots, N_{int}$, where $θ$ denotes the hyperparameter set, $y$ denotes the corresponding value of the response surface, and $N_{int}$ indicates the number of startup iterations. In contrast to standard Bayesian Optimization, the TPE algorithm employs Parzen window estimators as its building block. The Parzen window estimator, also called the kernel density estimator, is a statistical model for density estimation. Here it is employed to compute the densities of good and bad hyperparameters. The evaluated hyperparameters are split into two sets using a threshold value $y^{*}$, which is set at a user-chosen quantile of the observed values. The conditional density $p(θ \mid y)$ is then defined piecewise over the algorithm’s configuration space, with one density estimated from the good hyperparameter samples and one from the poor samples, as shown in Equation (2):

$$p(θ \mid y) = \begin{cases} pr_{good}(θ) & \text{if } y < y^{*} \\ pr_{bad}(θ) & \text{if } y \geq y^{*} \end{cases} \tag{2}$$

where $y < y^{*}$ represents a value of the function lower than the threshold. Equation (2) thus yields two different distributions for the hyperparameters: one density, $pr_{good}(θ)$, for which the value of the function is less than the threshold value, and another density, $pr_{bad}(θ)$, for which the value of the function is greater than or equal to the threshold value. The optimal hyperparameter configuration is determined by Equation (3):

$$θ^{*} = \arg\min_{θ} \frac{pr_{bad}(θ)}{pr_{good}(θ)} \tag{3}$$

In other words, the TPE algorithm proposes hyperparameters that are likely under the distribution of the best observations and unlikely under the distribution of the remaining ones. The TPE algorithm’s general flowchart is shown in Figure 8.
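The loop described by Equations (2) and (3) can be sketched for a single continuous hyperparameter as follows. This is a simplified illustration rather than the implementation used in the study: the fixed bandwidth, the quantile `gamma`, the number of candidates, and the toy objective are all assumptions, and at least two startup trials are assumed.

```python
import math
import random

def kde(points, bandwidth):
    """Parzen-window (Gaussian kernel) density estimate built from 1-D points."""
    norm = len(points) * bandwidth * math.sqrt(2 * math.pi)
    return lambda t: sum(math.exp(-0.5 * ((t - p) / bandwidth) ** 2) for p in points) / norm

def tpe_minimize(objective, low, high, n_startup=10, n_trials=50,
                 gamma=0.25, n_candidates=24, seed=0):
    """Minimize objective(theta) on [low, high]; returns (best_y, best_theta)."""
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # fixed bandwidth, an assumption of this sketch
    trials = []
    for _ in range(n_startup):       # random-search startup phase
        theta = rng.uniform(low, high)
        trials.append((objective(theta), theta))
    for _ in range(n_trials - n_startup):
        trials.sort()                # ascending in y
        n_good = max(1, math.ceil(gamma * len(trials)))
        good = [t for _, t in trials[:n_good]]  # samples with y < y*
        bad = [t for _, t in trials[n_good:]]   # samples with y >= y*
        pr_good, pr_bad = kde(good, bandwidth), kde(bad, bandwidth)
        # Draw candidates from pr_good (a Gaussian mixture), clipped to the range.
        cands = [min(high, max(low, rng.gauss(rng.choice(good), bandwidth)))
                 for _ in range(n_candidates)]
        # Equation (3): keep the candidate with the smallest pr_bad / pr_good ratio.
        theta = min(cands, key=lambda t: pr_bad(t) / max(pr_good(t), 1e-12))
        trials.append((objective(theta), theta))
    return min(trials)

# Toy objective with its minimum at theta = 0.3 (purely illustrative).
best_y, best_theta = tpe_minimize(lambda t: (t - 0.3) ** 2, 0.0, 1.0)
print(best_theta, best_y)
```

In a real tuning run, `objective` would train the model with the proposed hyperparameters and return a validation error such as the MAE.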

2.3. Performance Measures

Four metrics, namely mean absolute error (MAE), mean squared error (MSE), root mean square error (RMSE), and the coefficient of determination (R²), were employed to compare the performance of the different models. The MAE is the average absolute prediction error across all instances (Equation (4)). The MSE is the average of the squared differences between the actual and predicted values (Equation (5)). The RMSE is the square root of the MSE (Equation (6)). R², which typically ranges from 0 to 1, indicates how much of the variance in the actual values the model explains (Equation (7)).

$$MAE = \frac{1}{ζ} \sum_{x=1}^{ζ} \left| \partial_{x} - \bar{\partial}_{x} \right| \tag{4}$$

$$MSE = \frac{1}{ζ} \sum_{x=1}^{ζ} \left( \partial_{x} - \bar{\partial}_{x} \right)^{2} \tag{5}$$

$$RMSE = \sqrt{\frac{1}{ζ} \sum_{x=1}^{ζ} \left( \partial_{x} - \bar{\partial}_{x} \right)^{2}} \tag{6}$$

$$R^{2} = 1 - \frac{\sum_{x=1}^{ζ} \left( \partial_{x} - \bar{\partial}_{x} \right)^{2}}{\sum_{x=1}^{ζ} \left( \partial_{x} - \partial_{avg} \right)^{2}} \tag{7}$$

where $ζ$ is the total number of observations (in the training/testing dataset), $\partial_{x}$ represents the actual observation (in the training/testing dataset), $\bar{\partial}_{x}$ represents the predicted value, and $\partial_{avg}$ represents the average of the actual observations.
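Equations (4) to (7) translate directly into code. The sketch below uses made-up actual and predicted values purely to exercise the formulas; the function name is illustrative.

```python
import math

def regression_metrics(actual, predicted):
    """MAE, MSE, RMSE and R^2 following Equations (4) to (7)."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1.0 - sum(e * e for e in errors) / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}

actual = [10.0, 12.0, 15.0, 11.0]     # made-up values
predicted = [11.0, 12.0, 13.0, 12.0]  # made-up values
metrics = regression_metrics(actual, predicted)
print(metrics)  # MAE 1.0, MSE 1.5, RMSE about 1.22, R2 about 0.57
```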

3. Results

3.1. Turbulence Intensity

The turbulence intensity is the ratio of the standard deviation of the wind speed to the mean wind speed, as shown by Equation (8):

$$T_{i} = \frac{σ}{\bar{v}} \tag{8}$$

where $T_{i}$ denotes the turbulence intensity, $σ$ represents the standard deviation of the wind speed, and $\bar{v}$ represents the mean wind speed.
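Equation (8) can be applied to a recorded wind speed series as in the following sketch. The speed values are made up, and the use of the population standard deviation is an assumption; for the record length described earlier (65.536 s at 1000 Hz, i.e. 65,536 samples), the difference from the sample standard deviation is negligible.

```python
import statistics

def turbulence_intensity(speeds):
    """T_i = sigma / mean wind speed, with sigma as the population standard deviation."""
    return statistics.pstdev(speeds) / statistics.fmean(speeds)

speeds = [9.0, 11.0, 10.0, 12.0, 8.0]  # made-up wind speeds in m/s
ti = turbulence_intensity(speeds)
print(f"T_i = {ti:.3f} ({100 * ti:.1f} %)")  # T_i = 0.141 (14.1 %)

n_samples = round(65.536 * 1000)  # samples per record at 1000 Hz
print(n_samples)  # 65536
```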

3.2. Correlation Analysis

Prior to building the KTBoost model from the wind tunnel data, the correlation among the input factors was examined using Pearson correlation coefficients. A coefficient between 0 and 1 indicates a positive correlation, and a coefficient between −1 and 0 indicates a negative correlation. The greater the absolute value, the stronger the relationship between the two variables. Generally speaking, the correlation is very weak between 0.0 and 0.2, moderate between 0.2 and 0.6, strong between 0.6 and 0.8, and extremely strong between 0.8 and 1.0. Figure 9 illustrates the correlations between the different input factors. All coefficients are very low, so there is no strong correlation among the factors. In light of this finding, all of the factors were retained for the TPE-optimized KTBoost modeling.
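A minimal sketch of the Pearson coefficient and the strength bands quoted above is given below. The two input columns are made-up values, and the function names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def strength(r):
    """Verbal strength band for |r|, using the thresholds given in the text."""
    a = abs(r)
    if a < 0.2:
        return "very weak"
    if a < 0.6:
        return "moderate"
    if a < 0.8:
        return "strong"
    return "extremely strong"

wind_direction = [90.0, 105.0, 120.0, 135.0]  # made-up values
wind_speed = [6.0, 8.0, 8.0, 6.0]             # made-up values
r = pearson_r(wind_direction, wind_speed)
print(r, strength(r))  # 0.0 very weak
```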

3.3. Hyperparameter Tuning

The values of a machine learning model’s hyperparameters have a substantial effect on the accuracy of the model’s predictions. In this study, the hyperparameters of the KTBoost model, including n_estimators, learning_rate, max_depth, and min_samples_leaf, were optimized using the TPE technique in order to maximize the model’s performance and minimize the MAE. For the KTBoost model, n_estimators is the number of boosting iterations to perform. max_depth is the maximum depth of the regression trees; it constrains the number of nodes in each tree and thereby controls the degree of interaction among the predictor variables. learning_rate shrinks the contribution of each learner, so a trade-off exists between learning_rate and n_estimators. min_samples_leaf is the minimum number of samples required at a leaf node.

The progress of the TPE procedure over 100 iterations is depicted in Figure 10, with MAE as the training metric. Figure 10 shows that the MAE of the KTBoost model tended to decline as the number of trials increased. The TPE-optimized KTBoost model converged at iteration 80, indicating that the search was able to escape a local optimum and arrive at a better solution. The hyperparameters associated with the best MAE were used for the final prediction model: n_estimators was 85, learning_rate was 0.11, max_depth was 5, and min_samples_leaf was 2.

3.4. Prediction Results and Comparative Analysis

To ascertain whether the proposed TPE-optimized KTBoost model is effective in estimating turbulence intensity along the runway glide path, four other machine learning models, namely the TPE-optimized Extra Tree (ET) model [28], the TPE-optimized Light Gradient Boosting Machine (LightGBM) model [29], the TPE-optimized Extreme Gradient Boosting (XGBoost) model [30], and the TPE-optimized Gradient Boosting (GB) model [31], as well as a statistical multivariate linear regression model [32], were also employed to estimate the turbulence intensity. In addition to hyperparameter tuning to prevent over-fitting, the shuffled dataset was also partitioned with 40% and 50% test data, and the metric values for model testing remained consistent within a 95% confidence interval; changing the number of test samples produced no unusual behavior in the metrics. Table 2 lists the performance measures for both the training dataset and the test dataset. The TPE-optimized KTBoost model outperformed the other models, with an MAE of 0.58, MSE of 0.88, RMSE of 0.94, and R² of 0.95 for training data, and an MAE of 0.83, MSE of 1.44, RMSE of 1.20, and R² of 0.89 for testing data. The multivariate linear regression model showed the worst performance, with an MAE of 1.80, MSE of 5.84, RMSE of 2.41, and R² of 0.68 for training data, and an MAE of 1.50, MSE of 4.19, RMSE of 2.04, and R² of 0.71 for testing data.

The scatter plots of the experimental results against the values predicted by the TPE-optimized machine learning models and the multivariate linear regression model are displayed in Figure 11. For both the training and testing datasets, the TPE-optimized KTBoost predictions were generally closer to the 45-degree reference line than those of the other models. The fitted points for the other models were less tightly clustered around the 45-degree reference line, and their prediction accuracy lagged behind that of the TPE-optimized KTBoost model.

3.5. Model Uncertainty Analysis

When employing the TPE-optimized KTBoost approach for estimating turbulence intensity, it is crucial to address the uncertainties associated with the proposed computational scheme in addition to estimating the prediction error. This was done using the experimental-to-predicted ratio for the TPE-optimized KTBoost model and the other models, as plotted in Figure 12. A high percentage of data points (training and testing) lying close to the unit line indicates coherence between experimental results and predicted values and strongly suggests low uncertainty. Additionally, each sub-figure reports the computed mean and standard deviation (SD) of the ratio. The closer the mean is to 1 and the lower the SD, the lower the uncertainty of the model. The TPE-optimized KTBoost model, with a mean of 1.00 and SD of 0.08, provides low uncertainty and demonstrates strong coherence between experimental and predicted values.

3.6. KTBoost Factor Importance

The assigned factors for estimating the intensity of turbulence along the airport runway glide path were effect of terrain, distance from runway threshold, assigned runway direction, wind direction, and wind speed. Regarding the importance of various factors in the estimation of turbulence intensity, permutation-based factor importance analysis with a TPE-optimized KTBoost model was used to evaluate the most influential factors on turbulence intensity. Figure 13 depicts the permutation-based importance ranking of the turbulence intensity prediction factors. The mean importance of effect of terrain was 1.47, distance from runway threshold was 0.43, and wind direction was 0.36. Effect of terrain contributed 70% towards turbulence intensity, distance from runway threshold contributed 15.5% towards turbulence intensity, and wind direction contributed 12.5% towards turbulence intensity. The total importance of the top three factors was 98%.
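Permutation-based importance measures how much the prediction error grows when the values of one factor are shuffled across the dataset. The sketch below illustrates the procedure with MAE as the error metric and a toy linear stand-in model with made-up coefficients; it does not reproduce the values reported in Figure 13, and all names are illustrative.

```python
import random

def mae(model, rows, targets):
    """Mean absolute error of model predictions over the rows."""
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Mean increase in MAE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mae(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            permuted = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mae(model, permuted, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy stand-in model: depends strongly on feature 0, weakly on feature 1,
# and not at all on feature 2 (coefficients are made up).
toy_model = lambda r: 3.0 * r[0] + 0.5 * r[1]
rows = [(a, b, c) for a in (0.0, 1.0) for b in (0.0, 1.0, 2.0) for c in (0.0, 1.0)]
targets = [toy_model(r) for r in rows]
importances = permutation_importance(toy_model, rows, targets)
print(importances)
```

A feature the model ignores gets an importance of exactly zero, while shuffling a feature the model relies on raises the error in proportion to that reliance.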

3.7. KTBoost Partial Dependence Plots

Factor importance shows which factors have the greatest influence on predictions, whereas partial dependence plots show the relationship between a factor and predictions. By repeatedly changing the value of just one factor in the TPE-optimized KTBoost model, which estimates the turbulence intensity, we can plot the predicted results to show how they depend on the various factors and when they plateau. The plots in Figure 14 should be interpreted as:

● The intensity of the turbulence increases with increasing terrain effect values. This shows that the presence of terrain increased the intensity of the turbulence (Figure 14a).
● The turbulence intensity decreases as the distance from the runway threshold increases. This demonstrates that the turbulence intensity is greater along the glide path very close to the runway (Figure 14b).
● As the wind direction increases from 90 degrees, the turbulence intensity increases until 165 degrees and then decreases abruptly. This may be because the southern mountains block the wind, which ultimately results in fluctuations (Figure 14c).
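A partial dependence curve of the kind interpreted above is obtained by fixing one factor at each value of a grid, averaging the model predictions over the dataset, and plotting the averages against the grid. The following sketch shows the computation with a toy stand-in model and made-up data; the coefficients are not fitted to the wind tunnel results.

```python
def partial_dependence(model, rows, feature_index, grid):
    """Average prediction over all rows with one feature fixed at each grid value."""
    curve = []
    for value in grid:
        preds = [model(r[:feature_index] + (value,) + r[feature_index + 1:])
                 for r in rows]
        curve.append(sum(preds) / len(preds))
    return curve

# Toy stand-in model over (terrain code, distance); coefficients are made up.
toy_model = lambda r: 5.0 + 4.0 * r[0] - 0.5 * r[1]
rows = [(0.0, 1.0), (1.0, 2.0), (1.0, 3.0), (0.0, 3.0)]

pd_distance = partial_dependence(toy_model, rows, 1, [1.0, 2.0, 3.0])
pd_terrain = partial_dependence(toy_model, rows, 0, [0.0, 1.0])
print(pd_distance)  # [6.5, 6.0, 5.5]
print(pd_terrain)   # [3.875, 7.875]
```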

4. Conclusions and Recommendations

In this study, we proposed a novel TPE-optimized KTBoost model for the estimation of turbulence intensity along the runway glide path based on the wind tunnel experiments. The predictive performance of the TPE-optimized KTBoost model was also compared with other state-of-the-art machine learning models. The analysis revealed that the TPE-optimized KTBoost model demonstrated higher predictive performance with MAE (0.58), MSE (0.88), RMSE (0.94), and R² (0.95) for training data, and MAE (0.83), MSE (1.44), RMSE (1.20), and R² (0.89) for testing data. The worst performance was shown by the multivariate linear regression model with MAE (1.80), MSE (5.84), RMSE (2.41), and R² (0.65) for training data, and MAE (1.50), MSE (4.18), RMSE (2.04), and R² (0.71) for testing data.

Furthermore, the permutation-based factor importance analysis of the TPE-optimized KTBoost model illustrated that the effect of terrain is a highly influential factor, followed by distance from the runway threshold and wind direction. The presence of terrain, the shorter distance from the runway, and the wind direction of 90 degrees to 165 degrees all contributed to high turbulence intensity.

Although this study utilized a number of different input parameters to estimate the turbulence intensity along the runway glide path, many other parameters, such as atmospheric pressure and temperature, could be considered in future studies. Future research can take into account post hoc interpretation techniques for optimal models such as SHapley Additive exPlanations (SHAP) and local interpretable model–agnostic explanations (LIME). Based on the SHAP and LIME approaches, the model can be interpreted from both a global and local perspective. In this study, the turbulence intensity along the airport runway glide path was a parameter of concern. In future, similar considerations can be given to headwind speed, crosswind speed, and turbulence integral length scale, which are also significant wind field characteristics.

Author Contributions

Conceptualization, A.K.; Data curation, J.Z. and P.-W.C.; Formal analysis, A.K. and P.-W.C.; Funding acquisition, F.C.; Methodology, A.K. and P.-W.C.; Project administration, J.Z. and F.C.; Software, A.K. and F.C.; Supervision, F.C.; Validation, J.Z. and F.C.; Writing—original draft, A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by National Natural Science Foundation of China [Grant No. 52250410351 and U1733113], National Foreign Expert Project [Grant No. QN2022133001L], Shanghai Municipal Science and Technology Major Project [Grant No. 2021SHZDZX0100], and the Fundamental Research Funds for the Central Universities.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is unavailable due to privacy.

Acknowledgments

We would also like to express our gratitude to our colleagues at the Hong Kong Observatory of Hong Kong International Airport for their guidance.

Conflicts of Interest

The authors declare no conflict of interest.


Figure 1. Occurrence of turbulence along the airport runway glide path.

Figure 2. Proposed framework for the estimation of turbulence intensity.

Figure 3. HKIA and surrounding environment.

Figure 4. Scaled-down model of HKIA buildings and surrounding terrain.

Figure 5. Direction of inflow.

Figure 6. Cobra probe setup along the north runway glide slope.

Figure 7. Flow chart of TPE-optimized KTBoost modeling process with input and output factors.

Figure 8. Flow chart of TPE optimization.

Figure 9. Correlation matrix of factors.

Figure 10. Progress of TPE optimization.

Figure 11. Prediction error plots: (a) TPE-optimized KTBoost model based on training data; (b) TPE-optimized KTBoost model based on testing data; (c) TPE-optimized ET model based on training data; (d) TPE-optimized ET model based on testing data; (e) TPE-optimized XGBoost model based on training data; (f) TPE-optimized XGBoost model based on testing data; (g) TPE-optimized LightGBM model based on training data; (h) TPE-optimized LightGBM model based on testing data; (i) TPE-optimized GB model based on training data; (j) TPE-optimized GB model based on testing data; (k) multivariate linear regression model based on training data; (l) multivariate linear regression model based on testing data.

Figure 12. Model uncertainty analysis: (a) experimental-to-predicted ratio for TPE-optimized KTBoost model based on training and testing data; (b) experimental-to-predicted ratio for TPE-optimized ET model based on training and testing data; (c) experimental-to-predicted ratio for TPE-optimized XGBoost model based on training and testing data; (d) experimental-to-predicted ratio for TPE-optimized LightGBM model based on training and testing data; (e) experimental-to-predicted ratio for TPE-optimized GB model based on training and testing data; (f) experimental-to-predicted ratio for linear regression model based on training and testing data.

Figure 13. Permutation-based factor importance by TPE-optimized KTBoost model.

Figure 14. Partial dependence plots: (a) effect of terrain; (b) distance from runway threshold; (c) wind direction.

Table 1. Label encoding of the factors.

Factors	Data Type	Coding
Turbulence Intensity	Continuous	-
Building Effect	Discrete	1: If the effects of surrounding terrain are taken into account. 0: If the effects of surrounding terrain are ignored.
Wind Direction	Continuous	-
Runway Orientation	Discrete	1: When the glide slope of Runway 25RA is utilized. 0: When the glide slope of Runway 07LA is utilized.
Distance from Runway	Discrete	0: When the distance is 0.25 nautical miles (0.25 MF) from the end of the approaching runway. 1: When the distance is 0.75 nautical miles (0.75 MF) from the end of the approaching runway. 2: When the distance is 1.25 nautical miles (1.25 MF) from the end of the approaching runway. 3: When the distance is 1.75 nautical miles (1.75 MF) from the end of the approaching runway.
Wind Speed	Continuous	-
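
The coding scheme in Table 1 can be illustrated with a short sketch. The integer codes are copied from the table. The dictionary keys, the function name `encode_record`, and the example measurement are hypothetical and serve only to show how one record might be turned into a numeric feature vector.

```python
# Codes copied from Table 1; key names and function name are hypothetical.
TERRAIN_CODES = {"with_terrain": 1, "without_terrain": 0}
RUNWAY_CODES = {"25RA": 1, "07LA": 0}
DISTANCE_CODES = {0.25: 0, 0.75: 1, 1.25: 2, 1.75: 3}  # nautical miles

def encode_record(terrain, wind_direction_deg, runway, distance_nm, wind_speed):
    """Return the feature vector for one measurement.

    Discrete factors are mapped to their Table 1 codes; continuous
    factors (wind direction, wind speed) pass through unchanged.
    """
    return [
        TERRAIN_CODES[terrain],
        float(wind_direction_deg),
        RUNWAY_CODES[runway],
        DISTANCE_CODES[distance_nm],
        float(wind_speed),
    ]

# Hypothetical measurement, for illustration only.
example = encode_record("with_terrain", 135, "25RA", 0.75, 8.0)
print(example)
```

An unknown category raises a `KeyError`, which makes a mislabelled record visible instead of silently assigning it a code.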

Table 2. Performance measures of various models on the training (70%) and testing (30%) datasets.

Models	Training MAE	Training MSE	Training RMSE	Training R²	Testing MAE	Testing MSE	Testing RMSE	Testing R²
TPE-KTBoost	0.58	0.88	0.94	0.95	0.83	1.44	1.20	0.89
TPE-ET	0.97	2.67	1.61	0.84	0.78	1.89	1.37	0.86
TPE-GB	1.07	2.73	1.64	0.83	1.10	2.35	1.54	0.83
TPE-XGBoost	1.16	2.30	1.51	0.86	1.26	2.78	1.67	0.79
TPE-LightGBM	1.04	2.11	1.46	0.87	1.09	1.82	1.35	0.87
Linear Regression	1.80	5.84	2.41	0.68	1.50	4.19	2.04	0.71

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

