2.1. Basic Identification Procedure
The model calculates the engine gas path parameters $Y$ at steady-state modes depending on the operating point and external conditions $U$ and on the component performance parameters $\Theta$. Hence, in the general case, it is presented as:
$$Y = F(U, \Theta) \qquad (1)$$
The linear model may be formulated as
$$\delta Y = H\,\delta\Theta, \qquad (2)$$
which relates small deviations of the gas path parameters $\delta Y$ and of the component performance parameters $\delta\Theta$ at a single operating point. H is an influence coefficient matrix (ICM). The linear model (2) is a component of the identification algorithm for the non-linear model.
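Identification of the non-linear model (1) from the measurement results $Y^*$ and $U^*$ requires determining the estimates $\hat{\Theta}$, which are solutions of the following optimization task:
$$\hat{\Theta} = \arg\min_{\Theta} \Phi, \qquad \Phi = \left\| Y^* - F(U^*, \Theta) \right\|^2 \qquad (3)$$
where $\|\cdot\|$ is the Euclidean norm of a vector and Φ is the least squares functional to be minimized.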
The precision of estimation improves when the amount of available information exceeds the number of unknown parameters. Therefore, the identification can be based on a set of measurements taken at N different operating points. For this purpose, the generalized vector of residuals is minimized:
$$\Delta Y_\Sigma = \left[\Delta Y_1^T, \Delta Y_2^T, \ldots, \Delta Y_N^T\right]^T, \qquad \Delta Y_j = Y_j^* - F(U_j^*, \Theta) \qquad (4)$$
Model (1), which enters (4), is numerical. Therefore, the minimization of residuals (4) is an iterative numerical procedure in which each iteration determines the solution as the sum of the previous solution and the current correction:
$$\hat{\Theta}^{(k+1)} = \hat{\Theta}^{(k)} + \Delta\hat{\Theta}^{(k)}, \qquad (5)$$
and iterations continue until the corrections become negligible.
The correction $\Delta\hat{\Theta}$ is determined as the solution of the overdetermined system of linear algebraic equations:
$$H_\Sigma\,\Delta\Theta = \Delta Y_\Sigma \qquad (6)$$
where $H_\Sigma = \left[H_1^T, H_2^T, \ldots, H_N^T\right]^T$ is the generalized matrix of influence coefficients, composed of the elementary matrices $H_j$ determined for each operating point. In accordance with Equation (6), each iteration finds the correction $\Delta\hat{\Theta}$ of the required parameters that minimizes the residuals $\Delta Y_\Sigma$ (4).
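The iterative procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_model` is a hypothetical stand-in for the engine model (1), the influence coefficient matrices are assumed to be obtained by forward finite differences, and the stacked system is solved without weights by `numpy.linalg.lstsq`.

```python
import numpy as np

def toy_model(u, theta):
    """Hypothetical stand-in for model (1): three 'gas path' outputs."""
    a, b = theta
    return np.array([a * u + b, a * u**2 - b * u, np.exp(0.1 * a) + b * u])

def influence_matrix(model, u, theta, step=1e-6):
    """Finite-difference influence coefficient matrix at one operating point."""
    y0 = model(u, theta)
    H = np.empty((y0.size, theta.size))
    for j in range(theta.size):
        shifted = theta.copy()
        shifted[j] += step
        H[:, j] = (model(u, shifted) - y0) / step
    return H

def identify(model, u_points, y_measured, theta0, tol=1e-10, max_iter=50):
    """Iterate theta <- theta + correction until the correction is negligible."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(max_iter):
        # Stack residuals and elementary matrices over all N operating points.
        residual = np.concatenate(
            [y - model(u, theta) for u, y in zip(u_points, y_measured)])
        H_sigma = np.vstack([influence_matrix(model, u, theta) for u in u_points])
        correction, *_ = np.linalg.lstsq(H_sigma, residual, rcond=None)
        theta += correction
        if np.linalg.norm(correction) < tol:
            break
    return theta

u_points = np.linspace(0.5, 2.0, 6)          # N = 6 hypothetical operating points
theta_true = np.array([1.3, -0.4])
y_measured = [toy_model(u, theta_true) for u in u_points]   # noise-free "measurements"
theta_hat = identify(toy_model, u_points, y_measured, theta0=[1.0, 0.0])
```

With noise-free synthetic data the loop recovers `theta_true`; with noisy data the same loop returns the least squares estimate.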
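The known solution of the linear system (6) by the least squares method takes the form:
$$\Delta\hat{\Theta} = \left(H_\Sigma^T G H_\Sigma\right)^{-1} H_\Sigma^T G\,\Delta Y_\Sigma \qquad (7)$$
where $H_\Sigma^T G H_\Sigma$ is the Fisher information matrix and G is the diagonal weight matrix whose elements are the inverses of the variances $\sigma_i^2$ of the measurement errors.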
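A short numerical sketch of the weighted least squares correction may help here. It assumes the standard weighted normal equations with weights equal to inverse error variances; the matrix `H`, the deviations and the error levels `sigma` are made-up illustrative values.

```python
import numpy as np

def weighted_lsm_correction(H, residual, sigma):
    """Solve H @ d_theta ~ residual with weights G = diag(1 / sigma**2)."""
    G = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)
    fisher = H.T @ G @ H              # Fisher information matrix
    rhs = H.T @ G @ residual
    return np.linalg.solve(fisher, rhs), fisher

# Illustrative (hypothetical) influence coefficients: 4 measurements, 2 parameters.
H = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.3], [0.4, 0.9]])
d_theta_true = np.array([0.02, -0.01])
residual = H @ d_theta_true                      # consistent, noise-free residuals
sigma = np.array([0.01, 0.02, 0.01, 0.05])       # assumed measurement error levels
d_theta, fisher = weighted_lsm_correction(H, residual, sigma)
```

For consistent, noise-free residuals the weights do not change the solution; they matter once the measurements carry errors of different size.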
Unfortunately, the least squares method is sensitive to outliers on the right-hand side of system (6), which can be caused by faults in the experimental data. Therefore, practical applications of the classic LSM suffer from unstable estimates and poor convergence of the identification algorithm as a whole. In some cases, the estimates are far from the expected values. The reasons may be a lack of empirical information, an excessive number of estimated parameters, or correlation between two or more state parameters. Each of these makes the Fisher information matrix ill-conditioned and produces excessive deviations of the estimates. Such estimates lose physical sense (they fall outside the possible range of the component performance parameters; for example, an efficiency greater than one) and can cause the model calculation program to crash.
Hence, the LSM identification procedure needs modification to ensure its stability and physically adequate estimates.
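The effect of correlated state parameters on the conditioning of the Fisher information matrix can be illustrated with a small experiment. The matrices below are random and purely hypothetical, and unit weights are assumed; the only point is that two nearly collinear columns of the influence matrix inflate the condition number.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=(20, 1))

# Two unrelated columns versus two almost identical (correlated) columns.
H_independent = np.hstack([base, rng.normal(size=(20, 1))])
H_correlated = np.hstack([base, base + 1e-3 * rng.normal(size=(20, 1))])

def fisher_condition(H):
    """Condition number of the Fisher matrix H^T H (unit weights assumed)."""
    return np.linalg.cond(H.T @ H)

cond_independent = fisher_condition(H_independent)
cond_correlated = fisher_condition(H_correlated)
print(f"independent columns: {cond_independent:.3g}")
print(f"correlated columns:  {cond_correlated:.3g}")
```

A large condition number means that small measurement errors are amplified into large changes of the estimates, which is the instability described above.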
2.2. Regularized Identification Procedure
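In the well-known Tikhonov regularization procedure, a generalized functional is minimized which, besides the residuals of the measured parameters, includes a norm of the sought parameter vector with a weighting factor α. The identification task takes the following form:
$$\hat{\Theta} = \arg\min_{\Theta} \left[ \sum_i g_i \left(y_i - \hat{y}_i\right)^2 + \alpha \left\| \Theta \right\|^2 \right] \qquad (9)$$
where $g_i$ is the weight coefficient that describes the measurement error ($g_i = 1/\sigma_i^2$), $y_i$ is a component of the measured vector $Y^*$ and $\hat{y}_i$ is a component of the calculated vector $\hat{Y}$.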
The regularized identification is based on the least squares method described above. As shown in Equation (9), the minimized functional is extended by terms related to the parameters $\Theta$ to be found. Thus, the main changes in the identification procedure concern the generalized vector of residuals $\Delta Y_\Sigma$ and the generalized matrix of influence coefficients $H_\Sigma$. The modified terms have the following structure:
where I is an identity matrix.
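One possible way to realize such an extension is sketched below. It is an assumption-based illustration, not the authors' exact structure: the influence matrix is extended by an identity block, the residual vector by the negated current estimate `theta_k`, and the weights by α, so that the ordinary weighted least squares machinery minimizes the weighted residuals plus α times the squared norm of the updated parameters. All numerical values are hypothetical.

```python
import numpy as np

def regularized_correction(H, residual, sigma, theta_k, alpha):
    """Weighted least squares on the augmented system [H; I], [residual; -theta_k]."""
    r = H.shape[1]
    H_aug = np.vstack([H, np.eye(r)])
    residual_aug = np.concatenate([residual, -theta_k])
    weights = np.concatenate([1.0 / sigma**2, np.full(r, alpha)])
    fisher = H_aug.T @ (weights[:, None] * H_aug)    # equals H^T G H + alpha * I
    rhs = H_aug.T @ (weights * residual_aug)         # equals H^T G r - alpha * theta_k
    return np.linalg.solve(fisher, rhs)

H = np.array([[1.0, 0.5], [0.2, 1.0], [0.7, 0.3], [0.4, 0.9]])
sigma = np.ones(4)                       # normalized measurement errors (assumed)
theta_k = np.array([0.01, -0.02])        # current estimate (deviation from initial model)
residual = H @ np.array([0.02, -0.01])

d_plain = regularized_correction(H, residual, sigma, theta_k, alpha=0.0)
d_reg = regularized_correction(H, residual, sigma, theta_k, alpha=0.04)
```

With `alpha=0.0` the function returns the non-regularized correction, and a positive `alpha` pulls the updated parameters `theta_k + d_reg` towards zero, that is, towards the initial model.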
This identification procedure delivers the non-regularized solution if the regularization coefficient is zero. Otherwise, the influence of the coefficient α depends on the proportion between the two components of the generalized functional: the weighted sum of squared residuals and the norm of the parameter vector. Therefore, the effect of regularization is determined by the conditions of identification, such as the number of measured parameters m, the number of operating points N, the measurement errors G and the number of estimated parameters r.
2.3. Numerical Simulation
To assess the impact of the regularization coefficient α, the identification procedure was tested with the zero-dimensional thermodynamic model of a turboshaft engine [17] (Figure 2), based on experimental component maps. The numerical experiments involved calculation of N = 12 operating points for two scenarios:
- (1) Without measurement noise, for a wide range of α values;
- (2) With measurement noise, for selected α values.
Five gas path parameters were used for model identification:
- (1) Compressor discharge pressure p2;
- (2) Turbine inlet temperature TIT;
- (3) Power turbine discharge temperature;
- (4) Rotational speeds of rotors N2 and N1.
Initial deviations of two compressor performance parameters were assumed as follows:
Four estimated parameters included:
Deviation of flow rate: $\Delta W_C$;
Deviation of compressor efficiency: $\Delta\eta_C$;
Deviation of HPT mass flow: $\Delta W_{HPT}$;
Deviation of HPT efficiency: $\Delta\eta_{HPT}$.
Figure 3 presents the estimates and the simulation results for different values of α, which varied in the range from 0 to 1. $\Delta Y_{av}$ is the average deviation between the initial and estimated models:
and
is the average deviation between measurements and estimated model:
Figure 3 shows that, despite the initial growth of the absolute values of the estimates, the average deviation $\Delta Y_{av}$ decreases monotonically, whereas the residual between the measurements and the corrected model increases. This corresponds to theoretical predictions of the influence of regularization on the identification process.
The simulation results help to choose the value of the regularization coefficient α. If the deviation of parameters is to be lower than 5 % and the deviation of parameters lower than 1%, then the value of α cannot exceed 0.03.
To obtain precise values of the variance and to analyse the distributions of the estimates, a sequence of data sets with random measurement errors of a given variance was generated and identified 1000 times. This provided an average precision of about 1% of the initial residuals for parameters Y and about 0.5% of the simulated deviation 0.03 for the estimated parameters.
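The repeated-identification experiment can be imitated on a toy linear problem. Everything in this sketch is hypothetical (the random influence matrix with two correlated columns, the noise level, the true deviations and the value of α); it only illustrates how the mean and scatter of the estimates are collected over 1000 noisy data sets, assuming unit weights and a zero initial estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_rows, n_trials, noise_sigma = 60, 1000, 0.01

# Two strongly correlated columns imitate correlated state parameters.
h = rng.normal(size=n_rows)
H = np.column_stack([h, h + 0.05 * rng.normal(size=n_rows)])
theta_true = np.array([0.03, 0.0])           # hypothetical simulated deviations

def estimate(H, dY, alpha):
    """Regularized least squares estimate for every column of dY (unit weights)."""
    fisher = H.T @ H + alpha * np.eye(H.shape[1])
    return np.linalg.solve(fisher, H.T @ dY)

# Each column of dY is one data set with its own random measurement errors.
dY = (H @ theta_true)[:, None] + noise_sigma * rng.normal(size=(n_rows, n_trials))

stats = {}
for alpha in (0.0, 0.1):
    estimates = estimate(H, dY, alpha)        # shape (2, n_trials)
    stats[alpha] = (estimates.mean(axis=1), estimates.std(axis=1))
    print(f"alpha={alpha}: mean={stats[alpha][0]}, std={stats[alpha][1]}")
```

In this toy setting the regularized estimates show a smaller scatter but a shifted mean, which is the trade-off that has to be balanced when α is selected.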
Figure 4 presents histograms for calculations with two estimated parameters at α = 0 and α = 0.04. Both non-regularized and regularized estimations have distributions close to normal ones. Total scatter of estimations is also similar, but centres of regularized estimations moved from their true values by 0.008 and 0.005, respectively.
When four parameters were estimated (Figure 5), two of the estimates preserved their distributions, while the other two differed significantly. Such abnormal behaviour of the latter two parameters is explained by their correlation.
At low values of the regularization coefficient α, the centres of the distributions are close to their true values, but the scatter is high and the distributions are strongly asymmetrical. As α increases, the mean values shift, but the scatter decreases and the distributions approach the normal one.
As shown above, the value of the regularization coefficient α has to be set appropriately to provide meaningful results of regularized least squares. This procedure is useful in ill-conditioned engine model fitting but needs preliminary adjustment and sensitivity analysis.
2.4. Regularized Identification Procedure Development Using a Priori Information
This paper demonstrates an idea to implement engine heuristics involving component parameters and performances to improve the precision and stability of model identification. The main difficulty is in the diversity of this information, which is presented in one of the following forms:
Exact statement (for example, a part-load performance in a determined area is smooth);
Statement in the form of limitations of the area of acceptable solutions (for example, the efficiency of the individual compressor cannot differ more than 3% from the efficiency of an “average” compressor, the performance of which is used in the initial model);
Statement in the form of fuzzy information (for example, the gas temperature in the turbine will grow with the engine’s life);
Statistical form (for example, probability density functions of parameters).
The next difficulty is in the formalization of the parameters that characterize the model quality. These parameters are set on the basis of the subjective preferences of decision makers (DM). The same difficulties appear in ranking partial criteria and limitations according to their significance for the estimation of the model quality. The analysis showed that the main problem is that probabilistic methods are difficult to apply in the presence of uncertainties related to subjective preferences, whose nature is not statistical. In fact, the choice of the model structure is a DM procedure, which in the multi-criteria case is inevitably highly subjective. As the complexity of the task increases, the role of qualitative factors grows. Therefore, a proper mathematical tool is needed to take all criteria into account.
We propose to use fuzzy set theory [33] as this tool. This theory provides a uniform basis for describing the information given in all the forms listed above, thus providing a correct mathematical definition of the identification task.
An example of applying fuzzy sets in the GPA and engine model matching is given by M. Zwingenberg et al. [34]. They used fuzzy logic for the evaluation of sensor failures. In contrast, we introduce a priori information about engine performance and experimental data directly into the stabilizing functional through the fuzzy logic approach.
Generalization of the functional (9) gives:
$$\Phi = \Phi_e + \Phi_a$$
where $\Phi_e$ is the least squares functional (3), which minimizes the measurement error and is therefore called the empirical risk functional, and $\Phi_a$ is the stabilizing functional, which is considered as the functional of a priori risk.
Deterministic a priori information is set as a limitation, which may take the form of an equality or an inequality. A more general way of setting a priori information is to represent it as a fuzzy set. The fuzzy set of a parameter x is represented by a definition domain and a membership function on this domain. For the considered task, the parameters x may be the model parameters or the output (calculated) variables.
Next, we will use bounded normal fuzzy sets with a bounded definition domain ($x_{min}$, $x_{max}$) and $\max \mu(x) = 1$. We will express each item of a priori information as a particular functional of a priori risk $\Phi_{a,k}$ and will determine the general functional of a priori risk as a linear combination of the particular functionals:
$$\Phi_a = \sum_k \alpha_k\,\Phi_{a,k}$$
where $\alpha_k$ are weight coefficients.
Let us consider some types of a priori information and its expression in view of fuzzy sets:
Case 1. Limitations of some parameters are known. For example, it is known that 0 < η < 1 and 0 < σ < 1, etc. Using experience and calculation results, these limits can be significantly narrowed; for example, 0.5 < η < 0.9 and 0.9 < σ < 0.99 (Figure 6).
Case 2. The a priori mathematical model is known. This can be the model with design maps of components or the model of the average engine, which is matched with previous testing results.
This information may be expressed as a relationship between $\mu_a$ or $\Phi_a$ and the difference between the parameters that correspond to the matched and a priori models. These parameters may be the model parameters as well as measured (for example, fuel flow) or non-measured calculated parameters (for example, thrust). The membership function, in this case, can be of symmetrical triangular shape, and the functional of a priori risk that characterizes the similarity between the values of the parameter $x_q$ in the matched and a priori models can be formed as
where $x_q$ is a calculated parameter of the engine; $x_q(\Theta)$ is the value calculated by the model to be matched; $x_q(\Theta_a)$ is the value calculated by the a priori model; and $\Theta_a$ are the parameters of the a priori model.
Case 3. Information about the confidence in different sets of experimental data obtained in different conditions with different precision.
Case 4. Confidence in the available maps of the engine components. For example, we know that the compressor map used in the model corresponds to the old version of the engine and is far from the actual map. Thus, we can express this knowledge in view of the confidence functions that are in this case the membership functions.
The empirical risk functional $\Phi_e$ in Equation (3) may be considered as the square of the Euclidean distance between two sets, one of which contains the experimental data and the other the simulation results:
$$\Phi_e = \rho^2\left(Y^*, \hat{Y}\right) = \sum_i \left(y_i^* - \hat{y}_i\right)^2$$
By analogy, the stabilizing functional can be formed as the squared distance from the estimated parameter (or from a function x(Θ) determined using the estimated parameters) to the fuzzy set that contains the a priori information.
The main problem in this analogy is that the fuzzy set that contains a priori information is infinite, so the sum in the above equation must be replaced with an integral. For example, the squared distance from the engine map parameter θ = x to the fuzzy set with a given membership function μ(θ) equals:
In the iteration process of the task solution, each integral must be calculated numerically. This requires a lot of time and high computational capacity. A numerical solution can be found for certain types of the membership function. We considered the trapezoidal function, the particular cases of which are the rectangle, triangle and isosceles trapezium (Figure 7).
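A trapezoidal membership function and a numerically integrated distance to the corresponding fuzzy set can be sketched as follows. The form of the distance used here, a membership-weighted mean of squared differences, is an assumption made for illustration and is not necessarily the expression used in this work; the function names and the grid size are hypothetical, and the limits 0.5 and 0.9 are borrowed from the efficiency example of Case 1.

```python
import numpy as np

def trapezoid_membership(theta, a, b, c, d):
    """Membership rising on (a, b), equal to 1 on [b, c], falling on (c, d), 0 outside."""
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    mu = np.zeros_like(theta)
    mu[(theta >= b) & (theta <= c)] = 1.0
    left = (theta > a) & (theta < b)
    mu[left] = (theta[left] - a) / (b - a)
    right = (theta > c) & (theta < d)
    mu[right] = (d - theta[right]) / (d - c)
    return mu

def _trapezoid_rule(values, grid):
    """Composite trapezoid rule written out explicitly."""
    return np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))

def squared_distance_to_fuzzy_set(x, a, b, c, d, n_grid=2001):
    """Assumed form: integral of mu * (x - theta)^2 divided by integral of mu."""
    grid = np.linspace(a, d, n_grid)
    mu = trapezoid_membership(grid, a, b, c, d)
    return _trapezoid_rule(mu * (x - grid) ** 2, grid) / _trapezoid_rule(mu, grid)

# Rectangle (a == b, c == d) and isosceles trapezium as particular cases.
rho2_rectangle = squared_distance_to_fuzzy_set(0.7, 0.5, 0.5, 0.9, 0.9)
rho2_inside = squared_distance_to_fuzzy_set(0.7, 0.5, 0.6, 0.8, 0.9)
rho2_outside = squared_distance_to_fuzzy_set(0.95, 0.5, 0.6, 0.8, 0.9)
```

Under this assumed definition the distance grows as the parameter moves away from the centre of the fuzzy set, so it can act as a penalty in the stabilizing functional.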
The functional derived for the trapezium:
The proposed modifications of the objective functional make the estimation problem non-linear, so traditional solution methods are not applicable. This problem is overcome by adapting a genetic optimization algorithm to the specifics of engine model matching [35]. Genetic algorithms are increasingly used in gas turbines to solve complex optimization problems [4,10,36,37].
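For orientation, a generic real-coded genetic algorithm is sketched below. It is not the adapted algorithm of [35]: the selection, crossover and mutation operators, the population settings and the quadratic `objective` are all assumptions chosen for brevity, and the search bounds reuse the narrowed limits of Case 1.

```python
import numpy as np

def genetic_minimize(objective, bounds, pop_size=40, generations=60,
                     mutation_scale=0.05, seed=0):
    """Minimal real-coded GA: truncation selection, arithmetic crossover, elitism."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    population = rng.uniform(low, high, size=(pop_size, low.size))
    history = []
    for _ in range(generations):
        fitness = np.array([objective(ind) for ind in population])
        population = population[np.argsort(fitness)]
        history.append(fitness.min())
        parents = population[: pop_size // 2]              # keep the better half
        idx = rng.integers(0, len(parents), size=(pop_size - 1, 2))
        mix = rng.uniform(size=(pop_size - 1, 1))
        children = mix * parents[idx[:, 0]] + (1 - mix) * parents[idx[:, 1]]
        children += rng.normal(scale=mutation_scale * (high - low), size=children.shape)
        children = np.clip(children, low, high)            # stay inside the limits
        population = np.vstack([population[:1], children]) # elitism: best one survives
    fitness = np.array([objective(ind) for ind in population])
    history.append(fitness.min())
    return population[np.argmin(fitness)], np.array(history)

def objective(theta):
    """Hypothetical functional with its minimum at eta = 0.8, sigma = 0.95."""
    eta, sigma = theta
    return (eta - 0.8) ** 2 + (sigma - 0.95) ** 2

best, history = genetic_minimize(objective, bounds=[(0.5, 0.9), (0.9, 0.99)])
```

Because the best individual is always carried over, the recorded best value never increases from one generation to the next; in practice `objective` would be replaced by the combined empirical and a priori risk functional.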