1. Introduction
The performance of an aero-engine is determined by numerous factors, such as its geometric dimensions, material properties, and operating environment. During equipment design and manufacturing, these factors inevitably undergo random variations, which introduce safety risks into a deterministic design [1,2,3,4]. The traditional approach is to apply a safety factor to account for these random uncertainties, which often results in increased structural weight. This is particularly unfavorable for aero-engines, which pursue high thrust-to-weight ratios. Therefore, reliability analysis, which considers the influence of various random factors, has emerged [5,6,7,8,9,10,11,12]. The Monte Carlo (MC) method [13] depends on numerous sample points to obtain statistical data. It can be performed efficiently when the performance function y(x) is expressed explicitly in terms of the random variables x. When the performance function is implicit, the response must be obtained through experiments or numerical simulations. Obviously, the MC method is impractical for numerous costly experiments or time-consuming numerical simulations, such as 3-D Finite Element (FE) simulations [14,15], which are abundant in the design and manufacturing of aero-engines. Therefore, a surrogate model is introduced to replace these expensive experiments or simulations; it uses approximate methods to establish an approximate function that can represent the real one. Only a limited number of experiments or numerical simulations is needed to establish a surrogate model. The MC method is then performed on the established surrogate model, greatly improving the computational efficiency while maintaining accuracy. This approach has been used in many engineering applications, including predicting the Low Cycle Fatigue (LCF) life of turbine disks [16,17,18,19,20] and structural optimization [21,22,23].
Due to its extremely complex structure and harsh operating environment, the performance of an aero-engine exhibits high nonlinearity in practice. The original and widely used Response Surface Method (RSM) [24,25,26,27,28] underperforms on this type of problem. Although its nonlinear capability can be increased by raising the polynomial order, overfitting will occur. A large amount of work has been carried out to improve the traditional RSM. Bai and Fei [29] utilized distributed collaborative RSM for the reliability design of the high-pressure turbine blade tip clearance of an aero-engine and verified the feasibility and effectiveness of the method; Li and Lu [30] and Fei and Bai [31] combined the support vector machine (SVM) with RSM to analyze the reliability of structures; Yan et al. [32], Lu et al. [33], and Ren and Bai [34] combined the neural network (NN) algorithm with RSM to establish the neural network RSM and used it for the regression analysis of the limit state surface of complex structures to improve the calculation accuracy. However, these methods still have relatively large limitations for highly nonlinear problems.
Generally, for some highly nonlinear problems in engineering, not only global optimal solutions but also local optimal solutions are required. Thus, the surrogate model needs the ability to balance global and local approximations. The Radial Basis Function (RBF) method uses all the sample points in the design space to approximate the response function at a given point and has strong global fitting capabilities [14,35,36], but it makes inefficient use of the sample points around the prediction point. The Kriging method (KG) is a local approximation scheme that largely depends on the sample points around the prediction point [37,38,39]; its global representation ability decreases as the dimension increases. Arroyo and Ortiz [40] proposed a Local Maximum-Entropy (LME) approximation for the data interpolation of mesh-free methods, which combined the global maximum-entropy approximation with the minimum-width basis functions based on Delaunay triangulation. Its global or local capability can be adjusted by a dimensionless parameter $\gamma$. Based on this, Fan et al. [18] proposed an LME-based Surrogate Model (LMESM) considering both global and local approximations and found, through a systematic comparison with RSM, RBF, and KG, that the LME approximation has great potential for dealing with highly nonlinear problems. In highly nonlinear problems, the degree of nonlinearity varies over the design space, so the number of sample points used for fitting or interpolating should differ between prediction points. For example, in low-order nonlinear or linear regions, only a small number of sample points is needed to achieve high accuracy, whereas highly nonlinear regions need more sample points to ensure accuracy. Therefore, when constructing a surrogate model, the influence of a different number of sample points on the prediction points within regions of different nonlinearity should be considered. In the original LME approximation and LMESM, the number of samples near the prediction points in different regions can be controlled by changing the basis function parameter $\gamma$. However, a globally unified $\gamma$ was adopted in both, instead of updating $\gamma$ according to the different number of sample points around each prediction point, which greatly limits the highly nonlinear approximation ability. Moreover, $\gamma$ was determined manually, which requires numerous trial-and-error attempts; a large amount of time is spent adjusting $\gamma$ in the actual analysis, and even so it is difficult to ensure the optimality of $\gamma$.
In this study, a new model that automatically updates the basis functions, the Adaptive Local Maximum-Entropy Surrogate Model (ALMESM), is proposed. A Particle Swarm Optimization (PSO) algorithm is adopted to minimize an objective function, the sum of the squared errors between the real and predicted values of the sample points, so that the optimal $\gamma$ in different regions is obtained automatically. The Centroidal Voronoi Tessellations (CVT) sampling method is adopted to ensure the uniformity of the sample point distribution. The performance of ALMESM is systematically studied on two explicit, highly nonlinear mathematical functions by comparison with the original LME approximation with a globally unified $\gamma$, RBF, and KG. The model is then applied to the LCF life reliability analysis of an aero-engine turbine disk considering the uncertainty of the geometric parameters. The results show that ALMESM maintains higher accuracy than the original LME approximation, RBF, and KG owing to its refined local characterization. Finally, the combination of ALMESM and the MC method is used to calculate the LCF life under different reliability levels and is compared with the direct MC method; the relative error is within 1%, which demonstrates that ALMESM can be used for the life reliability analysis of turbine disks.
The remainder of this study is organized as follows. In Section 2, the ALMESM approximation scheme is introduced. In Section 3, the performance of ALMESM is evaluated on two explicit, highly nonlinear mathematical functions. The results from applying the proposed surrogate model to an industrial application are presented in Section 4. Conclusions are summarized in Section 5.
2. ALMESM
The original LME approximation method can be found in detail in Arroyo and Ortiz [40]. For the sake of completeness, a brief review of the method is given here. In the original LME approximation, an implicit performance function $y(\mathbf{x})$, at a given prediction point $\mathbf{x}$, is approximated by:

$$y(\mathbf{x}) \approx \sum_{a=1}^{N} p_a(\mathbf{x})\, y_a \qquad (1)$$

where $\mathbf{x}$ is the $n$-dimensional vector $\mathbf{x} = (x_1, x_2, \ldots, x_n)^{\mathrm{T}}$, $X = \{\mathbf{x}_a,\; a = 1, \ldots, N\}$ is the set of $N$ sample points, $y_a$ is the corresponding response at $\mathbf{x}_a$, and $p_a(\mathbf{x})$ is the basis function of the sample point $\mathbf{x}_a$ at the given prediction point $\mathbf{x}$.
Let $\mathrm{conv}(X)$ be the convex hull of the sample point set $X$. The basis functions of the sample points at a given prediction point $\mathbf{x} \in \mathrm{conv}(X)$ are defined as the solution of problem (2):

given $\beta \in [0, \infty)$, for fixed $\mathbf{x}$,
$$\min_{\mathbf{p}} \; f_\beta(\mathbf{x}, \mathbf{p}) = \beta \sum_{a=1}^{N} p_a(\mathbf{x}) \left| \mathbf{x} - \mathbf{x}_a \right|^2 + \sum_{a=1}^{N} p_a(\mathbf{x}) \ln p_a(\mathbf{x})$$
$$\text{subject to} \quad p_a(\mathbf{x}) \ge 0, \quad \sum_{a=1}^{N} p_a(\mathbf{x}) = 1, \quad \sum_{a=1}^{N} p_a(\mathbf{x})\, \mathbf{x}_a = \mathbf{x} \qquad (2)$$

where $\beta$ is a Pareto optimal parameter used to weigh the locality term against the entropy term in the objective function.
The constraints in the minimization problem ensure that the resulting basis functions satisfy the zeroth- and first-order consistency conditions. Consequently, this model can exactly reproduce constant and linear functions. The solution of the minimization problem is unique; therefore, the basis function can be directly calculated as:

$$p_a(\mathbf{x}) = \frac{\exp\!\left[-\beta \left|\mathbf{x} - \mathbf{x}_a\right|^2 + \boldsymbol{\lambda}^* \cdot (\mathbf{x} - \mathbf{x}_a)\right]}{Z\!\left(\mathbf{x}, \boldsymbol{\lambda}^*(\mathbf{x})\right)} \qquad (3)$$

where $Z(\mathbf{x}, \boldsymbol{\lambda}) = \sum_{b=1}^{N} \exp\!\left[-\beta \left|\mathbf{x} - \mathbf{x}_b\right|^2 + \boldsymbol{\lambda} \cdot (\mathbf{x} - \mathbf{x}_b)\right]$ is the partition function, and $\boldsymbol{\lambda}^*(\mathbf{x})$ is the minimizer

$$\boldsymbol{\lambda}^*(\mathbf{x}) = \arg\min_{\boldsymbol{\lambda} \in \mathbb{R}^n} \ln Z(\mathbf{x}, \boldsymbol{\lambda}) \qquad (4)$$

The calculation of the basis function in (3) requires finding $\boldsymbol{\lambda}^*$ and $\beta$. $\beta$ is determined by a dimensionless parameter $\gamma$ and the local characteristic length $h(\mathbf{x})$ of the prediction point $\mathbf{x}$:

$$\beta = \frac{\gamma}{h(\mathbf{x})^2} \qquad (5)$$
where the characteristic length $h(\mathbf{x})$ of a given prediction point $\mathbf{x}$ is calculated by first finding the sample point $\mathbf{x}_a$ nearest to $\mathbf{x}$, and then finding the sample point $\mathbf{x}_b$ nearest to $\mathbf{x}_a$; the Euclidean distance between $\mathbf{x}_a$ and $\mathbf{x}_b$ is defined as $h(\mathbf{x})$. In the previous report [18], $\gamma$ usually ranges from 0.1 to 6.8. It determines the support width of the LME basis functions, which introduces a seamless bridge between local approximation and global approximation, as shown in Figure 1. The LME basis functions are thus allowed to depend on the location of the prediction point and can adjust themselves adaptively by varying $\gamma$ to achieve optimal degrees of locality, which provides a great advantage for the construction of high-order approximations. The basis function itself covers all the sample points in the design space, but its locality determines the contribution of each sample point to the prediction point. By setting a truncation tolerance, the sample points with negligible contributions can be ignored. Therefore, by controlling the locality of the basis function, the global and local capabilities of the surrogate model can be adjusted. As shown in Figure 1, as $\gamma$ increases, the basis function support width gradually narrows, fewer sample points make a significant contribution, and the local capability becomes stronger; conversely, as $\gamma$ decreases, the global capability is strengthened.
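To make Equations (1)–(5) concrete, the following Python sketch evaluates the LME basis functions and the resulting surrogate prediction at a single point. It is only an illustration under our own naming conventions and assumptions: $\boldsymbol{\lambda}^*$ is found here with a general-purpose SciPy minimizer rather than the Newton–Raphson or Nelder–Mead schemes used in the original works, and the snippet is not the implementation used in this study.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import logsumexp

def lme_basis(x, X, gamma):
    """LME basis functions p_a(x) at prediction point x for the sample set X (shape N x n)."""
    # Characteristic length h(x): find the sample x_a nearest to x, then the distance
    # from x_a to its own nearest neighbouring sample x_b.
    d = np.linalg.norm(X - x, axis=1)
    a = np.argmin(d)
    d_ab = np.linalg.norm(X - X[a], axis=1)
    d_ab[a] = np.inf
    h = d_ab.min()
    beta = gamma / h**2                       # Eq. (5)

    diff = X - x                              # rows are (x_a - x)
    r2 = np.sum(diff**2, axis=1)              # |x - x_a|^2

    def log_Z(lam):                           # ln Z(x, lambda), cf. Eq. (4)
        return logsumexp(-beta * r2 + diff @ lam)

    lam_star = minimize(log_Z, np.zeros(X.shape[1]), method="Nelder-Mead").x
    e = -beta * r2 + diff @ lam_star
    return np.exp(e - logsumexp(e))           # Eq. (3), normalised by the partition function

def lme_predict(x, X, y, gamma):
    """Surrogate prediction of Eq. (1): y(x) ~ sum_a p_a(x) y_a."""
    return lme_basis(x, X, gamma) @ y
```

In practice a Newton iteration on the gradient of $\ln Z$ converges faster; the sketch keeps the inner optimization generic for clarity.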
Nevertheless, in the previous research [18,40], the value of $\gamma$ was manually set to a globally uniform value, which makes it difficult to determine an optimal value and requires considerable trial and error. In this study, the optimal $\gamma$ of each sample point is obtained by introducing an objective function for optimization.
The optimal $\gamma$ should minimize the fitting error of the surrogate model; that is, the predicted value at a sample point should be as close as possible to the real value. In this study, when calculating the optimal $\gamma$ of a sample point, the surrounding sample points with a significant contribution to this sample point are selected as a local neighborhood $\mathcal{N}$. Then, the sum of the squared errors between the real and predicted values of the sample points in $\mathcal{N}$ is taken as the objective function for optimization, i.e.,

$$F(\gamma) = \sum_{\mathbf{x}_i \in \mathcal{N}} \left[ y(\mathbf{x}_i) - \hat{y}(\mathbf{x}_i) \right]^2 \qquad (7)$$

where $y(\mathbf{x}_i)$ is the real response value of sample point $\mathbf{x}_i$, $\hat{y}(\mathbf{x}_i)$ is its predicted value, and the local set $\mathcal{N}$ of the sample point under consideration contains $m$ sample points selected according to $\gamma$.
It can be observed that $\boldsymbol{\lambda}^*$ and $\gamma$ can be obtained by iteratively solving (4) and (7). In the original LME approximation, $\boldsymbol{\lambda}^*$ was optimized by a Newton–Raphson algorithm and a Nelder–Mead simplex algorithm. However, the gradient-based Newton–Raphson algorithm is usually limited by the initial value and the derivative operations, and may converge to a local minimum or even diverge. Intelligent optimization algorithms effectively avoid these problems. The PSO algorithm, proposed by Kennedy and Eberhart [41], has the advantages of fast convergence and strong global optimization ability [42]. Therefore, the PSO algorithm is adopted in this study to solve (4) and (7) and obtain $\boldsymbol{\lambda}^*$ and $\gamma$.
Suppose there is a swarm $S = \{\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_{N_p}\}$ of $N_p$ particles in a $D$-dimensional search space. The $i$-th particle is represented as a $D$-dimensional solution vector $\mathbf{X}_i = (x_{i1}, x_{i2}, \ldots, x_{iD})$, which describes the position of the particle in the search space. The velocity of the individual particle is recorded as $\mathbf{V}_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$. The best position that the individual particle has experienced, i.e., the individual extreme value, is recorded as $\mathbf{P}_i = (p_{i1}, p_{i2}, \ldots, p_{iD})$, and the best position that the entire swarm has experienced, i.e., the group extreme value, is recorded as $\mathbf{P}_g = (p_{g1}, p_{g2}, \ldots, p_{gD})$. The PSO algorithm updates the particle velocity and position based on $\mathbf{P}_i$ and $\mathbf{P}_g$ in the following manner:

$$v_{id}^{k+1} = w\, v_{id}^{k} + c_1 r_1 \left( p_{id} - x_{id}^{k} \right) + c_2 r_2 \left( p_{gd} - x_{id}^{k} \right) \qquad (8)$$
$$x_{id}^{k+1} = x_{id}^{k} + \eta\, v_{id}^{k+1} \qquad (9)$$

where $w$ is the inertia factor, $c_1$ and $c_2$ are the acceleration factors, $r_1$ and $r_2$ are random numbers in [0, 1], and $\eta$ is the constraint factor, i.e., the weight that controls the velocity. The convergence condition adopted in this study is that the values of the objective functions (4) and (7) remain unchanged for five consecutive iterations. Finally, the ALMESM algorithm resulting from the preceding scheme is listed in Table 1.
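As an illustration of the velocity and position updates (8) and (9), the sketch below applies a one-dimensional PSO search to the objective (7) for a single sample point. The swarm size, inertia and acceleration factors, constraint factor, search interval for $\gamma$, and the neighborhood handling are illustrative assumptions rather than the settings of this study; `lme_predict` refers to the earlier sketch.

```python
import numpy as np

def pso_minimize(f, lb, ub, n_particles=20, n_iter=200,
                 w=0.7, c1=1.5, c2=1.5, eta=1.0, patience=5, seed=0):
    """Minimise a scalar function f on [lb, ub] using the updates of Eqs. (8) and (9)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lb, ub, n_particles)            # particle positions (candidate gamma values)
    v = np.zeros(n_particles)                       # particle velocities
    p_best = x.copy()                               # individual best positions
    p_val = np.array([f(xi) for xi in x])
    g_best = p_best[np.argmin(p_val)]               # swarm best position
    best, stall = p_val.min(), 0
    for _ in range(n_iter):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)   # Eq. (8)
        x = np.clip(x + eta * v, lb, ub)                              # Eq. (9)
        val = np.array([f(xi) for xi in x])
        better = val < p_val
        p_best[better], p_val[better] = x[better], val[better]
        g_best = p_best[np.argmin(p_val)]
        # stop once the best objective value has not changed for `patience` iterations
        if np.isclose(p_val.min(), best):
            stall += 1
            if stall >= patience:
                break
        else:
            best, stall = p_val.min(), 0
    return g_best

# Objective (7) for sample point j over its neighbourhood indices `nb` (both assumed given):
# objective = lambda g: sum((y[i] - lme_predict(X[i], X, y, g))**2 for i in nb)
# gamma_j = pso_minimize(objective, 0.1, 20.0)
```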
Through the ALMESM algorithm, an optimized $\gamma$ can be solved automatically according to the degree of nonlinearity at the location of each sample point, which yields a better fit to the real model. Figure 2 illustrates the characteristics of the ALMESM algorithm. The gray solid line represents the original function, and the dots on the line represent the sample points. The two prediction points lie in regions with different degrees of nonlinearity, and each has its own neighborhood. It can be observed from Figure 2 that these two neighborhoods contain different numbers of sample points, and the value of $\gamma$ at each sample point is different, leading to different basis functions.
3. Performance
To verify that ALMESM can obtain the optimal $\gamma$ of each sample point and improves on the performance of the original LME approximation, we compare it with the original LME approximation, an RBF surrogate model with a Gaussian basis function, and a KG surrogate model with a Gaussian correlation function. Detailed introductions to RBF and KG can be found in Simpson et al. [43], Wild et al. [44], and Forrester and Keane [45]. Two explicit mathematical functions, intended for testing highly nonlinear optimization algorithms, are selected as test cases from Hock and Schittkowski [46]:
- (a) A 3-D function with high-order nonlinearity
- (b) A 6-D function with high-order nonlinearity
R-squared ($R^2$) is a widely used assessment metric for surrogate models; it is used in this study to quantitatively assess the fitting performance of a surrogate model:

$$R^2 = 1 - \frac{\mathrm{MSE}}{\mathrm{Variance}} = 1 - \frac{\sum_{i} \left( y_i - \hat{y}_i \right)^2}{\sum_{i} \left( y_i - \bar{y} \right)^2} \qquad (12)$$

where $\hat{y}_i$ is the prediction value obtained by the surrogate model, and $\bar{y}$ is the mean of the response values of all the sample points. While the Mean Square Error (MSE) represents the departure of the surrogate model from the real function, the variance captures how irregular the problem is. $R^2$ embodies the degree of fit between the surrogate model and the real model: the larger the value, the higher the fitting accuracy of the surrogate model.
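A minimal sketch of Eq. (12), assuming `y_true` holds the real responses at the prediction points and `y_pred` the corresponding surrogate predictions:

```python
import numpy as np

def r_squared(y_true, y_pred):
    """R^2 of Eq. (12): one minus the ratio of MSE to the variance of the responses."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    ss_res = np.sum((y_true - y_pred) ** 2)          # departure from the real function
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # irregularity (spread) of the problem
    return 1.0 - ss_res / ss_tot
```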
When constructing a surrogate model, it is important to select the sample points scientifically to ensure the accuracy of the surrogate model. Generally, the selected sample points are required to cover the entire design space evenly, and the number of sample points should be as small as possible. Considering these two requirements, this article uses the Centroidal Voronoi Tessellations (CVT) sampling method. Let $\{\mathbf{z}_i\}_{i=1}^{k}$ be $k$ points in an $M$-dimensional hypercubic space. The Voronoi cell $V_i$ corresponding to $\mathbf{z}_i$ is defined as

$$V_i = \left\{ \mathbf{x} : \left\| \mathbf{x} - \mathbf{z}_i \right\| \le \left\| \mathbf{x} - \mathbf{z}_j \right\|,\; j = 1, \ldots, k,\; j \ne i \right\} \qquad (13)$$

where $\left\| \mathbf{x} - \mathbf{z}_i \right\|$ represents the Euclidean distance between $\mathbf{x}$ and $\mathbf{z}_i$; $\{V_i\}_{i=1}^{k}$ is called the Voronoi diagram. For each Voronoi cell, the center of gravity $\mathbf{z}_i^*$ is defined as

$$\mathbf{z}_i^* = \frac{\int_{V_i} \mathbf{x}\, \mathrm{d}\mathbf{x}}{\int_{V_i} \mathrm{d}\mathbf{x}} \qquad (14)$$

A Voronoi diagram is called a centroidal (gravity-center) Voronoi diagram if and only if it satisfies condition (15):

$$\mathbf{z}_i = \mathbf{z}_i^*, \quad i = 1, \ldots, k \qquad (15)$$

That is, in the centroidal Voronoi diagram, the generating point of each Voronoi cell is exactly the center of gravity of that cell. In this case, $\{\mathbf{z}_i\}_{i=1}^{k}$ forms a set of CVT points.
Figure 3 shows a set of CVT points in a two-dimensional space with 100 sample points and 50 prediction points. It can be observed that the sample points and prediction points evenly cover the entire design space.
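The study does not detail how its CVT point sets are generated; one common construction, sketched below under that caveat, is a Lloyd-type iteration in which each cell centroid is estimated from a dense cloud of uniform random points. Point counts and iteration numbers are illustrative.

```python
import numpy as np

def cvt_points(n_points, dim, n_dense=20_000, n_iter=50, seed=0):
    """Approximate CVT generators in the unit hypercube: repeatedly assign a dense
    uniform cloud to its nearest generators (the Voronoi cells of Eq. (13)) and move
    each generator to the centroid of its cell until condition (15) is roughly met."""
    rng = np.random.default_rng(seed)
    dense = rng.random((n_dense, dim))      # dense cloud used to estimate cell centroids
    gen = rng.random((n_points, dim))       # initial generators
    for _ in range(n_iter):
        d = np.linalg.norm(dense[:, None, :] - gen[None, :, :], axis=2)
        cell = np.argmin(d, axis=1)         # nearest-generator assignment
        for j in range(n_points):
            members = dense[cell == j]
            if len(members):
                gen[j] = members.mean(axis=0)   # move generator to its cell centroid
    return gen

# e.g., 100 sample points and 50 prediction points in 2-D, as in Figure 3:
# samples, predictions = cvt_points(100, 2), cvt_points(50, 2, seed=1)
```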
We first explored the influence of the randomness of the sampling on the results of the original LME approximation and ALMESM. Denoting a sampling scheme by (A, B), where A is the number of sample points and B is the number of prediction points, we selected four sampling schemes, i.e., (60, 30), (80, 40), (100, 50), and (120, 60), and conducted five CVT random samplings for each scheme. Taking the 3-D function (10) as the example, we compared the $R^2$ of the original LME approximation with the globally unified $\gamma$ = 0.8 (randomly selected) and ALMESM. It can be observed from Figure 4 that the results of the two algorithms show little dispersion, and as the number of sample points increases, the dispersion tends to decrease further. Moreover, statistics on the dispersion results are given in Table 2, where $R^2_{\max}$ and $R^2_{\min}$ represent the maximum and minimum values of $R^2$ in the five CVT samplings, respectively. The maximum absolute error in $R^2$ of the original LME approximation is 0.08, corresponding to a relative error of 8.70%, whereas ALMESM gives an absolute error of 0.04 and a relative error of 4.21%. These results show that both algorithms are only mildly affected by the randomness of sampling, and that the stability of ALMESM (the degree of variation in prediction accuracy caused by the number of sample points and the randomness of sampling) is further improved relative to the original LME approximation.
For the above mathematical functions, eight sets of sample points, with sizes in the range [60, 200], were extracted by the CVT method. Table 3 lists the selection scheme of the sample points and prediction points. We compared the $R^2$ trends of ALMESM, the original LME approximation with a globally unified $\gamma$, RBF, and KG.
For the 3-D function (Figure 5), the accuracy of the original LME approximation changes significantly with $\gamma$ when the sample size is small; as the sample size increases, the accuracy gradually stabilizes. When the dimension increases (Figure 6), however, the accuracy of the original LME approximation is strongly influenced by both $\gamma$ and the number of sample points: it varies widely, and even negative $R^2$ values appear. This is because, as the number of sample points increases, the global and local nonlinearity of the original function becomes more prominent, so a basis function that was suitable for a smaller number of sample points is no longer applicable, which reduces the accuracy. The original LME approximation cannot automatically adjust $\gamma$ to change the locality of the basis function, and this leads to low accuracy. In contrast, both examples show that ALMESM always maintains high accuracy: it automatically adjusts $\gamma$ so that the basis function adapts to each situation, and the accuracy therefore remains high and stable under different numbers of sample points.
Figure 7 shows the optimal $\gamma$ of each sample point obtained by ALMESM in the two examples. It can be observed that the increase in dimensionality and degree of nonlinearity causes the $\gamma$ of each sample point in the 6-D case to vary more drastically than in the 3-D case. The reason is that the nonlinearity of the 6-D function is more severe than that of the 3-D function, so basis functions with higher $\gamma$ are required to capture the local features. Consequently, a large proportion of the optimal $\gamma$ values obtained by ALMESM for the 6-D function are quite large (>10.0), and it is difficult to achieve high accuracy with a globally unified $\gamma$. In summary, compared to the original LME approximation, ALMESM improves both the accuracy and the stability of the calculations.
In Figure 8, the accuracy of KG is poor when the sample size is small, and high accuracy is not obtained until the number of sample points reaches 140, whereas RBF and ALMESM reach high accuracy throughout the range of 60–200 sample points. With the increase in dimension and nonlinear order (Figure 9), the fitting accuracy of all three models gradually increases with the number of samples, but ALMESM has the highest accuracy. The two examples show that ALMESM realizes the automatic optimization of $\gamma$ for each sample point and achieves higher accuracy for highly nonlinear problems. In the following, ALMESM is used for the LCF life and reliability analysis of an aero-engine turbine disk considering geometric uncertainty.
4. Application
The turbine disk is a life-limiting part of an aero-engine; its reliability is a weak link in aero-engine design and has become a bottleneck restricting engine research and development [47]. LCF is the most important failure mode of turbine disks. In practice, the geometry of the turbine disk is uncertain due to the dispersion of the dimensional tolerances, so the actual life of the turbine disk is statistically dispersed. Meanwhile, the operating environment of the turbine disk is complex, with multiple load conditions, which leads to strong nonlinearity in the reliability prediction of the turbine disk.
Structural reliability is the probability that a structure completes a specified function under specified conditions within a specified time $t$. Assuming that the performance function of the structure is $g(\mathbf{x})$, the limit state equation $g(\mathbf{x}) = 0$ divides the basic variable space of the structure into a failure domain and a reliable domain. The failure probability $P_f$ can be obtained by integrating the joint probability density function of the basic random variables $\mathbf{x}$ over the failure domain. For general multi-dimensional problems and for problems with complex or implicit integration domains, the failure probability integral has no analytical solution; in such cases, the MC method can be used. Based on the MC method, the reliability function $R(t)$ can be estimated as:

$$R(t) = 1 - \frac{N_f(t)}{N} \qquad (16)$$

where $N$ represents the total number of sample points and $N_f(t)$ represents the cumulative number of sample points falling in the failure domain during the working time from 0 to $t$.
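As a sketch of Eq. (16) for a life-type limit state, assuming failure means that the predicted LCF life is shorter than the required working time $t$ (expressed in cycles):

```python
import numpy as np

def reliability(life_samples, t):
    """Eq. (16): R(t) = 1 - N_f(t) / N, with N_f(t) the number of MC samples whose
    predicted life is already exhausted by t cycles."""
    life_samples = np.asarray(life_samples)
    n_fail = np.count_nonzero(life_samples < t)
    return 1.0 - n_fail / life_samples.size
```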
In this study, ALMESM was used to predict the LCF life and perform the reliability analysis of a turbine disk considering geometric uncertainty. The procedure is divided into the following four steps (also outlined in Figure 10):
- (1) Filter the key geometric variables. As shown in Figure 11, the turbine disk has numerous geometric parameters, and a sensitivity analysis of all of them is necessary to screen out the geometric variables that have the greatest influence on the LCF life. Fan et al. [17] have detailed this sensitivity analysis, identifying the inner diameter (d_R1), the outer diameter (d_R2), and the rim thickness (d_W6) as the three key variables for the LCF analysis.
- (2) Determine the probability distributions of the key variables. The fidelity of the reliability analysis strongly depends on how well the underlying distributions of the random variables are known. In the Turbine Disk Reliability Analysis (TDRA), the Kolmogorov–Smirnov (K-S) test [48,49] is adopted to determine the probability distributions of the three key geometric variables. Fan et al. [17] found that the three geometric variables obey normal distributions.
- (3) Construct the ALMESM. For fast and high-fidelity MC simulations, the ALMESM needs to be constructed. First, CVT sampling is performed on the three key geometric variables; then, FE analyses are carried out at the sample points, the LCF life of each sample point is obtained from the LCF life formula, and finally the ALMESM is trained.
- (4) Predict the probability distribution of the fatigue life. The MC simulation is performed on the constructed ALMESM to obtain the reliability analysis results (a minimal sketch combining steps (3) and (4) is given after this list).
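The sketch below combines steps (3) and (4). It assumes a trained surrogate `predict_life` standing in for ALMESM, and uses placeholder means and standard deviations for d_R1, d_R2, and d_W6, since the actual disk dimensions and tolerances are not reproduced here.

```python
import numpy as np

def mc_life_at_reliability(predict_life, reliability=0.99, n_mc=100_000, seed=0):
    """MC-sample the three key geometric variables, push them through the surrogate,
    and return the LCF life reached with the given reliability (a lower quantile)."""
    rng = np.random.default_rng(seed)
    geom = np.column_stack([
        rng.normal(100.0, 0.05, n_mc),   # d_R1, inner diameter  (placeholder mean/std)
        rng.normal(400.0, 0.10, n_mc),   # d_R2, outer diameter  (placeholder mean/std)
        rng.normal(30.0, 0.02, n_mc),    # d_W6, rim thickness   (placeholder mean/std)
    ])
    life = predict_life(geom)                        # predicted LCF lives, shape (n_mc,)
    return np.quantile(life, 1.0 - reliability)      # life exceeded by a fraction `reliability`

# Smoke test with a dummy surrogate (replace with the trained ALMESM in practice):
# print(mc_life_at_reliability(lambda g: 1.5e5 + 10.0 * g.sum(axis=1)))
```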
Figure 10.
Flow chart for the reliability analysis of a turbine disk considering geometric uncertainties in the Turbine Disk Reliability Analysis (TDRA).
Figure 11.
Variable geometric parameters of the GH720Li turbine disk, where the three key parameters are marked with blue circles.
The material properties of GH720Li at 650 °C and the boundary conditions applied to the disk in the FE simulations of step (3) are listed in Table 4. According to the LCF test data of GH720Li, together with the stress–strain model in Tang and Lu [50] and the strain–life model in Gao et al. [51], the Manson–Coffin model of the GH720Li material at 650 °C was obtained, as shown in (17), which expresses the LCF life $N_f$ in terms of the stress range $\Delta\sigma$, the mean stress $\sigma_m$, the elastic modulus $E$, and two standard normal random variables.
A total of 10 sets of sample points (Table 5) were selected by CVT sampling to test the accuracy of the surrogate model. To verify the performance of the proposed model, we compared it with the original LME approximation with a globally unified $\gamma$, RBF, and KG. From Figure 12, the accuracy of the original LME approximation is greatly influenced by $\gamma$ and the number of sample points; an improper value of $\gamma$ can even lead to a negative value of $R^2$ ($\gamma$ = 6.8, 120 sample points). In contrast, under different numbers of sample points, the accuracy of ALMESM is always maintained at a high level, with even the lowest accuracy above 0.80. Overall, the accuracy tends to increase as the number of sample points increases; when the number of sample points is 240, the accuracy of ALMESM reaches 0.96 and gradually converges. In summary, compared to the original LME approximation, ALMESM realizes the automatic optimization of $\gamma$ for each sample point and improves the accuracy and stability of the calculations. From Figure 13, KG has a certain prediction accuracy when the sample size is small. However, as the number of sample points increases (>80), the predictive ability of KG is lost in this highly nonlinear engineering case; this phenomenon was also reported in the literature [18]. The KG model is a local approximation scheme, and its correlation matrix can become singular if multiple sample points are spaced close to one another. Under different numbers of sample points, the accuracy of both ALMESM and RBF is maintained at a high level, but ALMESM has higher accuracy overall; when the number of sample points is 100, the difference in $R^2$ between ALMESM and RBF is the largest, with a value of 0.11. In summary, ALMESM achieves higher accuracy in approximating the LCF life of the GH720Li turbine disk considering geometric uncertainty.
Subsequently, the LCF life reliability of the turbine disk was calculated using ALMESM, and the LCF life values at different reliability levels were obtained. Since the accuracy of ALMESM has converged at 240 samples (Figure 13), the surrogate model established with this sample set was used for the TDRA. A total of 100,000 sample points were created using MC random sampling. By conducting 100,000 MC simulations on ALMESM, the frequency histogram of the LCF life was obtained, as shown in Figure 14. It can be observed that, within the scope of the sampling, the LCF life approximately obeys a lognormal distribution. The minimum LCF life is 148,392 cycles and the maximum is 154,896 cycles; the highest frequency, 0.057, occurs at an LCF life of 150,892 cycles. In addition, the reliability–life curve was obtained, as shown in Figure 15. It is evident that, although the variation of the dimensional tolerances is small, it still has a significant effect on the LCF life of the turbine disk.
To verify the reliability results of ALMESM, direct MC simulations (FE + MC) were performed on the turbine disk model using the same 100,000 samples. Table 6 compares the LCF life at given reliability levels obtained by the two methods and the total computational cost of each analysis. The results of the two methods are close to each other, with a relative error within 1%, and ALMESM is more efficient than the direct MC simulation. This comparison fully demonstrates that ALMESM is feasible for reliability analysis.