1. Introduction
Owing to component mismatch and variations in the flight environment, the inlet flow conditions of compressors and turbines often change in service. At different cruising altitudes, the inlet total pressure and Reynolds number of low-pressure turbines (LPTs) vary considerably. As flight attitude changes, flow distortion with varied total pressure and boundary layer is commonly found at the compressor inlet. Under the influence of combustion instability, the inlet flow of a high-pressure turbine exhibits strong uncertainties, which in turn propagate to the downstream LPT [1,2]. In such situations, the in-service performance of blades designed by deterministic methods, i.e., methods that neglect the effects of uncertainties, inevitably deviates from the design objective. Such designs suffer not only a decrease in mean performance but also an increase in performance dispersion. To improve the mean performance and reduce the dispersion, robust design optimization has attracted wide attention. Ghisu et al. [3] found that the probability of compressor stall is considerable when stochastic inlet flow variations are taken into account, especially at operating conditions with low rotational speed. They then performed robust design optimization with the statistical performance as the objective and successfully decreased the probability of compressor stall. Luo et al. [4] found that the impact of stochastic variations of inlet flow angle on the flow loss of a turbine cascade is severe under off-design operating conditions. They then performed robust design optimization and improved the robustness by reducing the variance of flow loss. For robust design optimization, evaluating the uncertainty impact and calculating the statistical performance is a key issue. Usually, a large number of uncertainty quantifications are needed, which consumes substantial computing resources. Therefore, the quantification method should be both efficient and accurate.
With the development of computational fluid dynamics (CFD), more and more researchers have conducted uncertainty quantification (UQ) investigations of turbine cascades [5,6]. Different quantification methods have been developed [7], which fall mainly into two categories: those based on sensitivity analysis and those based on surrogate models. Sensitivity-based UQ analysis is suitable for cases where the nonlinear dependence of the output on the input uncertainties is not strong. Putko et al. [8] used the method of moments (MM) to propagate the uncertainties of geometric changes and flow changes in quasi-one-dimensional flows. Luo et al. [9] and Xu et al. [10] evaluated performance changes using sensitivity-based UQ methods. With this approach, the computational cost of statistical analysis is significantly reduced. However, it cannot be used for strongly nonlinear uncertainty problems.
UQ methods based on surrogate models have been widely studied and applied. The polynomial chaos (PC) model is a popular method that constructs the model through the expansion of multiple sets of orthogonal random processes. Compared with Monte Carlo simulation (MCS), the PC method requires fewer samples while retaining high prediction accuracy, which makes it more attractive [11]. Hosder et al. [12] applied the non-intrusive polynomial chaos (NIPC) method to a three-dimensional wing flow in which the stochastic variations of free-stream Mach number and angle of attack were taken into account. Simon et al. [13] used a sparse PC method to study the transonic unsteady flow around an airfoil under geometric uncertainty. Guo et al. [14] used NIPC to evaluate the performance changes in a compressor cascade under the influence of installation angle and contour errors. Chen et al. [15] used the adaptive NIPC method to study the performance of a transonic compressor blade under uncertain inlet flow angle changes. Emory et al. [16] used NIPC to investigate the influence of the compounded variations of inlet total pressure, inlet turbulence intensity, and wall temperature on the performance of a turbine blade. Moreover, Gopinathrao et al. [17] used the NIPC method to analyze the influence of stochastic variations of inlet total pressure on the total pressure ratio and adiabatic efficiency of NASA Rotor 37. Tang et al. [18] constructed a Kriging model, which is a specific case of the Gaussian process (GP), for performance prediction of a centrifugal compressor, in which the effects of stochastic variations of inlet flow angle, tip clearance, etc. were considered. In recent years, rapidly developing machine-learning methods have been widely used in various disciplines [19]. With machine learning, surrogate models can be constructed through supervised learning. Hu et al. [20] applied support vector regression to model construction, and the model was then used for predicting the aerodynamic performance of a transonic compressor. Wang et al. [21], He et al. [22], and Cao et al. [23] investigated the impact of uncertainties on performance changes in compressors using artificial neural networks (ANNs).
However, training ANN models is time-consuming because numerous hyperparameters, including the number of hidden layers and the number of neurons in each hidden layer, need to be optimized. By contrast, GP is more flexible and requires little fine-tuning of hyperparameters owing to its non-parametric nature [24]. Additionally, GP exhibits higher accuracy than the NIPC method, a comparison that is also examined in this study. Generally, it is necessary to construct a surrogate model with high response accuracy using as few training samples as possible, and GP with adaptive sampling is a good choice for this purpose. GP was originally developed as a probability theory concept by Wiener and Kolmogorov in the 1940s. It originated as a regression tool in geostatistics by Krige [25] and later found applications in spatial statistics [26], general regression [27], computer experiments [28], and machine learning [29]. In the context of machine learning, GP is applied in nonlinear regression and classification [29]. GP not only predicts the mean response at test points but also estimates the variance within the sampling space. This capability makes it straightforward to conduct adaptive sampling using the variance, which reduces the number of training samples needed for model training. Owing to these characteristics, GP has become popular in various applications [24,30].
In the literature reviewed above on the performance impact of stochastic inlet flow variations, the effects of inlet flow angle, inlet total pressure, and inlet turbulence intensity were considered in most cases. In the present study, the uncertainty impact on the performance parameters of an LPT cascade, namely the total pressure-loss coefficient, outflow angle, and Zweifel lift coefficient, is investigated. Changes in inlet flow angle and inlet total pressure influence the lift of the turbine cascade, which in turn changes the Zweifel lift coefficient and the flow deviation at the outlet, while changes in inlet turbulence intensity usually have a strong impact on the flow transition in an LPT. Thus, the effects of inlet flow angle, inlet total pressure, and turbulence intensity are considered in the study, and the adaptive Gaussian process (ADGP) is used for performance prediction. The organization of this paper is as follows. Flow simulations are first carried out, and the impact of each uncertainty on performance changes is analyzed. The principles of ADGP are then introduced, and the method is verified and validated through a series of function experiments. The surrogate models of the performance parameters with respect to the uncertainties are learned by ADGP. Using the models, performance changes in the LPT cascade are finally quantified. MCS-based statistical analysis of the flow fields is performed to reveal the impact mechanisms of the uncertainties. Moreover, the results of the Sobol sensitivity analysis are given to illustrate the contribution of each uncertainty to performance changes.
2. Numerical Simulation
The two-dimensional cascade of the first rotor in a two-stage LPT of a small aero engine is utilized in the study. The uncertain effects of inlet flow angle, inlet total pressure, and inlet turbulence intensity on the cascade flow are considered. The flow simulation uses an in-house program that solves the Reynolds-averaged Navier–Stokes (RANS) equations together with the SST turbulence model and transition model equations. LU-SGS time-marching is used, and multigrid and local time-stepping techniques are used to accelerate convergence.
In this paper, the total pressure-loss coefficient, outlet flow angle, and Zweifel number of the turbine cascade are calculated. The total pressure-loss coefficient and Zweifel number are defined as:
where
and
P are total pressure and static pressure, respectively; the subscripts
and
represent inlet and outlet, respectively; the subscripts
p and
s represent pressure side and suction side, respectively;
x and
are the distance from the leading edge and axial chord, respectively.
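For reference, a commonly used pressure-based form of these two definitions that matches the symbols listed above is sketched here. The notation is an assumption: $P_t$ for total pressure, subscripts 1 and 2 for inlet and outlet, $C_x$ for axial chord, $\omega$ for the loss coefficient, and $Zw$ for the Zweifel number. The exact normalization used in the study may differ.

```latex
\omega = \frac{P_{t,1} - P_{t,2}}{P_{t,1} - P_{2}},
\qquad
Zw = \int_{0}^{1} \frac{P_{p} - P_{s}}{P_{t,1} - P_{2}}\,
     \mathrm{d}\!\left(\frac{x}{C_x}\right)
```

In this form, both quantities are normalized by the inlet total pressure minus the outlet static pressure, so a change in inlet total pressure affects the Zweifel number through the denominator as well as through the blade loading.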
Specifications of the turbine cascade are given in Table 1, which lists the inlet total pressure, inlet total temperature, inlet flow angle, turbulence intensity, and outlet back pressure. The inlet total pressure and temperature are uniformly distributed in the circumferential direction, and a mass-averaged calculation is performed to obtain the outlet back pressure and angle.
Figure 1 presents the cascade geometry and the topology of multi-block grids. Four sets of grids are used for flow simulations, the resolutions of which are
, respectively.
Figure 2 shows the flow solutions of the total pressure-loss coefficient, outlet flow angle, and Zweifel number on the four grids, where N is the serial number of the grid and each performance parameter is scaled with its fourth-grid value as the reference. It is evident that as the grid resolution increases, all the performance parameters approach those of the fourth grid. The results of the third and fourth grids are almost the same. The grid-independent flow solutions of the three performance parameters are also given in Table 1. In the following study, the third grid is utilized.
To demonstrate the effects of inlet flow parameters on the aerodynamic performance, the inlet total pressure and inlet flow angle are perturbed within ±10% of their reference values, while the turbulence intensity is perturbed within ±60%. The relative variations of the inlet flow parameters are defined as
where
f is a universal function representing
,
, and
, and the subscript
denotes the reference value.
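The relative variation described above can be sketched as a small helper. The function names `relative_variation` and `perturbed_value` are hypothetical, and expressing the result in percent is an assumption chosen to match the ±10% and ±60% intervals quoted in the text.

```python
def relative_variation(f, f_ref):
    """Relative variation of f with respect to its reference value, in percent."""
    return 100.0 * (f - f_ref) / f_ref


def perturbed_value(f_ref, delta_percent):
    """Inverse mapping: the value of f that gives a relative variation of delta_percent."""
    return f_ref * (1.0 + delta_percent / 100.0)
```

For example, a turbulence intensity perturbed by +60% from a reference value of 2.0 corresponds to `perturbed_value(2.0, 60.0)`, i.e., 3.2.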
Figure 3 presents the relative variations of performance parameters versus the relative variations of inlet flow parameters. Generally, the impact of inlet flow variations on the total pressure-loss coefficient is considerable. At the interval boundaries, more than
variations in
can be found. It is obvious that the increase of inlet flow angle and turbulence intensity induce more flow losses to the turbine cascade, while the increase of inlet total pressure is effective in reducing the flow loss. By contrast, the impact of inlet flow parameters on outlet flow angle is rather weak since the maximum relative variation of
is about
. Moreover,
is almost independent of the variations of inlet flow angle and turbulence intensity. Similar results can be found for the variations of the Zweifel number, as shown in
Figure 3c. Besides the inlet total pressure, the inlet flow angle also changes Zweifel number obviously. It is known that both outlet flow angle and Zweifel number are closely dependent on the lift of the turbine cascade. The increase of inlet flow angle and inlet total pressure has been well recognized to be effective in increasing blade loading, which undoubtedly results in increased Zweifel number and increased flow-turning angle. It should be noticed that in
Figure 3b, as
increases, the negative relative change in outlet flow angle is attributed to the increased inlet flow angle, although the flow-turning angle increases. Moreover, as shown in
Figure 3c, the decrease of Zweifel number resulting from inlet total pressure increase is attributed to the increase of
, as shown in Equation (
2).
Laminar-turbulent transition commonly occurs on the suction side of a turbine cascade, and it immediately and significantly changes the flow loss. To further understand the mechanisms by which inlet flow variations change the total pressure-loss coefficient, the contours of the intermittency factor in the suction-side boundary layer of the turbine cascade are given in Figure 4. Figure 4b–d are the contours with maximum absolute variations of inlet flow angle, inlet total pressure, and turbulence intensity, respectively. From Figure 4b,d it can be observed that when the relative variations of inlet flow angle and turbulence intensity are +10% and +60%, respectively, the flow transition on the suction side moves upstream compared with the contour shown in Figure 4a. When the relative variations of inlet flow angle and turbulence intensity are −10% and −60%, respectively, the flow transition on the suction side moves downstream, resulting in reduced flow losses. However, as shown in Figure 4c, the movements of the flow transition on the suction side under varied inlet total pressure are opposite to those given in Figure 4b,d. Consequently, the total pressure-loss coefficient varies with inlet total pressure quite differently from the way it varies with inlet flow angle and turbulence intensity, as shown in Figure 3a.
3. Adaptive Gaussian Process
3.1. Gaussian Process
In probability theory, a Gaussian process (GP) is a stochastic process defined as a collection of random variables, any finite number of which have a joint Gaussian distribution [29]. From the function-space perspective, a GP can be described succinctly: it is completely specified by its mean function and covariance function. The mean function $m(x)$ and the covariance function $k(x, x')$ of a real-valued process $f(x)$ are defined as follows:
$$m(x) = \mathbb{E}[f(x)],$$
$$k(x, x') = \mathbb{E}\left[(f(x) - m(x))(f(x') - m(x'))\right].$$
The GP can be written as:
$$f(x) \sim \mathcal{GP}\left(m(x), k(x, x')\right).$$
Usually, for notational simplicity, the mean function is assumed to be zero, although this is not required. The random variable in the definition represents the function value $f(x)$, where $x$ can be either time or a multidimensional parameter.
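Covariance functions are also called kernel functions, which define nearness or similarity between data points. There are many choices of kernel function, such as the squared exponential, rational quadratic, and Matérn-class functions. In the present study, the squared exponential function is used as the kernel function:
$$k(x, x') = \sigma_f^2 \exp\left(-\frac{1}{2}\sum_{l}\frac{(x_l - x'_l)^2}{\lambda_l^2}\right),$$
where $\sigma_f$ is the output length scale, $x_l$ is the $l$-th component of the input vector, and $\lambda_l$ is the feature-length scale on dimension $l$. Given a training set $D = \{X, Y\}$, where $X$ is the set of input variables and $Y$ is the set of corresponding output values, the output values $f_*$ at the test points $X_*$ satisfy the following joint Gaussian distribution:
$$\begin{bmatrix} Y \\ f_* \end{bmatrix} \sim \mathcal{N}\left(0, \begin{bmatrix} K(X, X) + \sigma_n^2 I & K(X, X_*) \\ K(X_*, X) & K(X_*, X_*) \end{bmatrix}\right),$$
where $\sigma_n^2$ is the observed noise variance. Then the conditional distribution of $f_*$ can be determined as:
$$f_* \mid X, Y, X_* \sim \mathcal{N}\left(\bar{f}_*, \mathrm{cov}(f_*)\right),$$
$$\bar{f}_* = K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} Y,$$
$$\mathrm{cov}(f_*) = K(X_*, X_*) - K(X_*, X)\left[K(X, X) + \sigma_n^2 I\right]^{-1} K(X, X_*).$$
The hyperparameters $\sigma_f$, $\lambda_l$, and $\sigma_n$ can be determined by maximizing the marginal likelihood function. The expression of the log marginal likelihood is given by Equation (11), where $K(X, X)$ is simplified to $K$ and $n$ is the number of training samples:
$$\log p(Y \mid X) = -\frac{1}{2} Y^{T}\left(K + \sigma_n^2 I\right)^{-1} Y - \frac{1}{2}\log\left|K + \sigma_n^2 I\right| - \frac{n}{2}\log 2\pi. \qquad (11)$$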
Finding the optimal hyperparameters is an optimization problem, which can be solved efficiently with the gradient-based Adam method [31].
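The regression described in this subsection can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function names (`sq_exp_kernel`, `gp_posterior`, `log_marginal_likelihood`) and argument names are hypothetical, a zero mean function is assumed, and the hyperparameters are passed in as fixed values instead of being optimized with Adam.

```python
import numpy as np


def sq_exp_kernel(A, B, sigma_f, length):
    """Squared exponential kernel with one length scale per input dimension."""
    A = np.atleast_2d(A) / length
    B = np.atleast_2d(B) / length
    # Pairwise squared distances in the length-scaled space.
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return sigma_f**2 * np.exp(-0.5 * np.maximum(d2, 0.0))


def gp_posterior(X, Y, X_star, sigma_f, length, sigma_n):
    """Posterior mean and covariance at X_star for a zero-mean GP."""
    K = sq_exp_kernel(X, X, sigma_f, length) + sigma_n**2 * np.eye(len(X))
    K_s = sq_exp_kernel(X, X_star, sigma_f, length)
    K_ss = sq_exp_kernel(X_star, X_star, sigma_f, length)
    L = np.linalg.cholesky(K)                       # K = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    v = np.linalg.solve(L, K_s)
    mean = K_s.T @ alpha
    cov = K_ss - v.T @ v
    return mean, cov


def log_marginal_likelihood(X, Y, sigma_f, length, sigma_n):
    """Log marginal likelihood of the training outputs Y."""
    n = len(X)
    K = sq_exp_kernel(X, X, sigma_f, length) + sigma_n**2 * np.eye(n)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, Y))
    # 0.5 * log|K| equals the sum of the log-diagonal of the Cholesky factor.
    return -0.5 * Y @ alpha - np.sum(np.log(np.diag(L))) - 0.5 * n * np.log(2.0 * np.pi)
```

The Cholesky factorization is used instead of an explicit matrix inverse because it is cheaper and numerically more stable. In a full workflow, `log_marginal_likelihood` would be the objective passed to the hyperparameter optimizer.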
3.2. Adaptive Sampling
The accuracy of the GP regression model depends largely on the selection of training samples. Training samples can be selected by manual one-time selection (such as random or uniform sampling in the sample space) or adaptive sampling. The GP regression model can predict the mean and variance of any test point in the sampling space. The variance represents the uncertainty of output, which can be used for adaptive sampling.
In the iterative sampling process, the ADGP selects new training samples at the positions of maximum predicted uncertainty. The new training samples are then added to the training set, and the hyperparameter optimization is carried out in each iteration. In this way, the accuracy of the model is improved step by step, which is more systematic and efficient than manual one-time selection.
The standard ADGP usually produces only one sample per iteration. To improve sampling efficiency, this paper adopts a batch sampling method, which adds multiple samples per iteration. Batch sampling also has a disadvantage, i.e., the newly sampled points may cluster within a small region. To avoid this, after the first new sample point with the largest uncertainty is obtained in each iteration, the covariance matrix of the GP is updated directly with the augmented sample set, without retraining the model, and the second new sample point is then selected from the updated uncertainty [32].
The key point of adaptive sampling is the sampling criterion. In the study, the standard deviation predicted by the model is used as the acquisition function to guide the sampling, i.e., the acquisition function at a test point equals the square root of the predictive variance at that point. A convergence threshold is given to stop the adaptive sampling.
Figure 5 gives the flowchart of ADGP. The processes can be briefly described as follows.
Step 1: The initial training sets are prepared for GP training, and a batch of test sets are prepared for prediction.
Step 2: Train the GP model by hyperparameter optimization, which is then used for function predictions of the test sets.
Step 3: Calculate the acquisition function at each test point and determine its maximum, which is then compared with the threshold.
Step 4: If the maximum is below the threshold, model training is complete. Otherwise, the current GP model does not meet the accuracy requirement, and the first selected test point is added to the training sets. The acquisition function is then calculated a second time, and its new maximum is compared with the threshold.
Step 5: If the new maximum is below the threshold, model training is complete. Otherwise, the second selected test point is added to the training sets, and the procedure returns to Step 2.
It should be noted that after adding the first selected test point to the training sets, it is not necessary to retrain the GP model as in Step 2; only the covariance matrix needs to be updated. In this way, two new training samples are selected per iteration.
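The steps above can be sketched as a short loop. This is an illustrative sketch, not the authors' code: the names `kernel`, `predictive_std`, and `adaptive_batch_sampling` are hypothetical, the hyperparameters are held fixed (so the retraining of Step 2 is only marked by a comment), and candidates are taken from a finite test set. It relies on the fact that the GP predictive variance depends only on the input locations, which is why the second point of a batch can be chosen after updating the covariance matrix alone.

```python
import numpy as np


def kernel(A, B, length=0.2, sigma_f=1.0):
    """Squared exponential kernel with a single shared length scale (assumed)."""
    diff = (A[:, None, :] - B[None, :, :]) / length
    return sigma_f**2 * np.exp(-0.5 * np.sum(diff**2, axis=2))


def predictive_std(X_train, X_cand, sigma_n=1e-3, sigma_f=1.0):
    """Acquisition function: GP predictive standard deviation at each candidate."""
    K = kernel(X_train, X_train, sigma_f=sigma_f) + sigma_n**2 * np.eye(len(X_train))
    v = np.linalg.solve(np.linalg.cholesky(K), kernel(X_train, X_cand, sigma_f=sigma_f))
    var = sigma_f**2 - np.sum(v**2, axis=0)
    return np.sqrt(np.maximum(var, 0.0))


def adaptive_batch_sampling(X_init, X_cand, threshold, max_iter=50):
    """Add up to two samples per iteration until the maximum std is below threshold."""
    X_train = X_init.copy()
    history = []  # maximum acquisition value at every check
    for _ in range(max_iter):
        # Step 2: hyperparameters would be re-optimized here (omitted in this sketch).
        for _ in range(2):  # Steps 3-5: select a batch of two points
            std = predictive_std(X_train, X_cand)  # covariance update only
            i = int(np.argmax(std))
            history.append(std[i])
            if std[i] < threshold:
                return X_train, history
            X_train = np.vstack([X_train, X_cand[i]])
    return X_train, history
```

Because a selected point's predicted standard deviation collapses to roughly the noise level once it enters the covariance matrix, the second point of each batch is pushed away from the first, which is the mechanism that counteracts clustering.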
3.3. Function Test
To verify the prediction accuracy of ADGP and the computational cost of model training, three different function experiments are presented. Moreover, GP without adaptive sampling and the widely used adaptive NIPC method are also used, and the results are compared. The functions are given as follows.
Himmelblau function (2-d) [
33]:
Four-dimension function (4-d) [
34]:
Six-dimension function (6-d):
where
and
.
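For the two-dimensional case, the Himmelblau function has a well-known closed form, sketched below. The function name `himmelblau` is illustrative, and the domain and any scaling used in the study are not assumed here. The four- and six-dimensional functions are not reproduced.

```python
def himmelblau(x, y):
    """Standard Himmelblau function; it has four minima with value 0, e.g., at (3, 2)."""
    return (x**2 + y - 11.0) ** 2 + (x + y**2 - 7.0) ** 2
```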
Within the domain of each function, the initial training samples of ADGP are obtained by the Latin Hypercube Sampling (LHS) method. The numbers of initial and total training samples and the thresholds for the three function experiments are given in Table 2.
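A minimal Latin hypercube sampler is sketched below to illustrate how such initial samples can be generated. The function name `latin_hypercube`, its arguments, and the fixed seed are assumptions for illustration; library implementations could be used instead.

```python
import numpy as np


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each dimension has one point per stratum."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)  # shape (d, 2): lower and upper limits
    d = len(bounds)
    # One uniform draw inside each of the n_samples equal-width strata of [0, 1).
    u = (rng.random((n_samples, d)) + np.arange(n_samples)[:, None]) / n_samples
    # Shuffle the strata independently in every dimension.
    for j in range(d):
        u[:, j] = rng.permutation(u[:, j])
    return bounds[:, 0] + u * (bounds[:, 1] - bounds[:, 0])
```

Compared with plain random sampling, this stratification guarantees that the projection of the samples onto every input axis covers the whole range, which is useful when only a handful of initial samples is affordable.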
Figure 6 presents the convergence history of the maximum acquisition function, where N is the iteration counter. Starting from the initial training samples, the maximum acquisition function decreases, demonstrating the exploitation phase of adaptive sampling and showing that the prediction accuracy of ADGP is gradually improved.
To compare the prediction accuracy of different surrogate models, for each function experiment, the same number of training samples is generated by LHS for static GP and adaptive NIPC. The principles and procedures of an adaptive NIPC based on a sparse grid sampling technique have been introduced in previous studies [15,35]. In the study, after trying polynomials of different orders, seventh-order polynomials are used, and the total numbers of samples are kept equal to those of ADGP. Moreover, a large number of points are selected in the function domain and used as test samples for assessing the prediction accuracy of the surrogate models. The prediction accuracy is measured by the mean absolute percentage error (MAPE), the definition of which is given as
where
is the prediction of
i-th sample,
n is the number of test samples, the subscripts
and
represent model prediction and exact values, respectively.
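The error measure can be sketched as follows, assuming the usual definition of MAPE as the mean of the absolute relative errors expressed in percent. The function name `mape` is illustrative, and the exact values are assumed to be non-zero.

```python
import numpy as np


def mape(y_pred, y_exact):
    """Mean absolute percentage error between model predictions and exact values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_exact = np.asarray(y_exact, dtype=float)
    return 100.0 * np.mean(np.abs((y_pred - y_exact) / y_exact))
```

Because each error is normalized by the exact value, test points where the exact value is close to zero dominate the measure, so such points need care when the test set is built.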
Table 3 shows the MAPE of the three models, where GP and ANIPC are static GP and adaptive NIPC, respectively. It is obvious that compared with static GP and ANIPC, the prediction accuracy of ADGP is higher in all three function experiments using the same number of training samples. Moreover, the prediction accuracy of both static GP and ADGP is higher than that of ANIPC, demonstrating that machine learning is useful in improving the prediction accuracy of the surrogate models.
To compare the sample distributions in the same function domain, Figure 7 shows the original two-dimensional function and the distributions of training samples. Compared with the sample distributions of both static GP and ADGP, the samples of adaptive NIPC are concentrated in the region adjacent to the maximum value. That is because the new sparse grids are produced by a simple interpolation method, and more grid points are required to reduce the interpolation error near the maximum value. In Figure 7d, the red and black points are the initial and adaptively generated training samples, respectively. Compared with the clustered sample distribution of adaptive NIPC, the samples of ADGP are more uniformly distributed on the boundaries.
3.4. ADGP for Aerodynamic Parameters
In the following, ADGP is used for predicting the aerodynamic performance of the turbine cascade shown in Figure 1. The compounded effects of inlet flow angle, inlet total pressure, and turbulence intensity are taken into account. For each performance parameter (total pressure-loss coefficient, outflow angle, and Zweifel number), a corresponding surrogate model based on ADGP is trained. Following the same procedures as shown in
Figure 5, eight initial training samples are used. The threshold of the acquisition function is 0.1 for
and
, while it is 0.01 for
. Notice that all the training samples are generated in the same space as mentioned before, i.e., the uncertainty is 10% for inlet total pressure and inlet flow angle, and it is 60% for inlet turbulence intensity.
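Figure 8 shows the convergence histories of the maximum acquisition function for the three performance parameters. After 10, 4, and 3 iterations, the maximum acquisition function of the total pressure-loss coefficient, outflow angle, and Zweifel number, respectively, decreases below the corresponding threshold. With eight initial samples and two new samples per iteration, the total sample numbers are 28, 16, and 14 for training the ADGP models of the total pressure-loss coefficient, outflow angle, and Zweifel number, respectively. The samples necessary for training the ADGP models of the outflow angle and Zweifel number are far fewer than those for the total pressure-loss coefficient. This is mainly attributed to the weaker nonlinear dependence of the outflow angle and Zweifel number on the inlet flow variations, as shown in Figure 3b,c. As shown in Figure 3a, the total pressure-loss coefficient is nonlinearly dependent on the inlet flow variations and thus requires more training samples for model training.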
Again, the prediction accuracy of the ADGPs is evaluated using test samples. Four thousand test samples drawn from a truncated Gaussian distribution are generated, which will also be used in the following UQ studies. The scaled uncertainty variable follows the standard normal distribution truncated at $\pm E$, where $E$ is the truncation boundary, and is defined as the deviation of the uncertainty variable $x$ from its reference value divided by the standard deviation of $x$. Here $x$ is the universal uncertainty variable considered in the study (inlet flow angle, inlet total pressure, inlet turbulence intensity), and the reference value under the baseline operation condition equals the statistical mean of $x$. In the present study, the standard deviations are 5%, 5%, and 30% for the relative variations of inlet flow angle, inlet total pressure, and inlet turbulence intensity, respectively. The maximum relative variations of inlet flow angle, inlet total pressure, and inlet turbulence intensity are therefore 10%, 10%, and 60%, respectively, which corresponds to truncation at two standard deviations.
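The generation of such test samples can be sketched with simple rejection sampling. The names `sample_truncated_normal`, `xi`, and `rel` are hypothetical, and the truncation boundary of 2 is an assumption inferred from the stated standard deviations (5%, 5%, 30%) and maximum relative variations (10%, 10%, 60%).

```python
import numpy as np


def sample_truncated_normal(n, E=2.0, seed=0):
    """Draw n standard-normal samples restricted to [-E, E] by rejection."""
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    while kept.size < n:
        xi = rng.standard_normal(2 * n)
        kept = np.concatenate([kept, xi[np.abs(xi) <= E]])
    return kept[:n]


# Scaled variables for inlet flow angle, inlet total pressure, turbulence intensity.
xi = np.column_stack([sample_truncated_normal(4000, seed=s) for s in range(3)])

# Relative variations in percent: standard deviation times scaled variable.
sigma_percent = np.array([5.0, 5.0, 30.0])
rel = xi * sigma_percent
```

Each row of `rel` is one test sample in terms of relative variations. Multiplying by the reference values, as in the definition of the scaled variable, gives the physical inlet conditions at which the surrogate models are evaluated.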
Performance parameters of each test sample are predicted by the corresponding ADGP. Then MAPE is calculated and shown in
Table 4. The prediction accuracy of all ADGPs is high enough, and they can be used in the following UQ studies.