**5. Simulation**

Based on the benchmark simulation demo for a 2 MW wind turbine in GH Bladed software, no excitation signals are applied on the controller or the output. Meanwhile, wind speeds with di fferent turbulence intensities are produced to stimulate the benchmark demo where operation data under di fferent wind scenarios can be acquired. The defaulted controllers in the benchmark demo are used as shown in Figures 3 and 5. The controllers can arbitrarily switch with the varying wind speed.

## *5.1. Parameters Setting*

GH Bladed is a high-fidelity software for wind turbine simulation. In this software, a benchmark simulation demo of a 2 MW wind turbine with gear-box and DFIG is adopted. It uses the controllers and control strategies shown in Figures 3 and 5. The main parameters are listed as follows: rated capacity (2 MW), radius (40 m), hub height (61.5 m), gear-box ratio (83.33), cut-in wind speed (4 m/s), rated wind speed (10m/s), cut-out wind speed (25 m/s), generator inertia (60 kg·<sup>m</sup>2), sti ffness coe fficient (1.6 × 10<sup>8</sup> Nm/rad), damping coe fficient (2.5 × 10<sup>5</sup> Nm·<sup>s</sup>/rad), variable speed controller (OTC), variable pitch controller (gain scheduling PI controller). Concretely, for this benchmark 2 MW wind turbine model, the horizontal-axis three blades are controlled integratively. For each blade, the blade length is 38.75 m where thickness to chord ratio, Reynolds number, pitching moment center and deployment angle are set as 21%, 2 × 106, 25% and 0◦. More information of the turbine blades such as blade structure and aerofoil parameters are shown in Figure A1, Tables A1 and A2 in the 'Appendix A' part.

#### *5.2. Scenarios Setting and Simulation*

Under the closed-loop condition, four types of wind scenarios with di fferent turbulence intensities are produced to stimulate the benchmark demo of a 2 MW wind turbine along the whole wind speed range. For each wind scenario, operation data are acquired for identification.

Wind speeds excite wind turbine dynamics. To evaluate volatility of wind speed, turbulence intensity is used, calculated as follows:

$$I\_{\text{tur}} = \frac{\sigma\_{\text{std}}}{V\_{\text{mean}}} \tag{14}$$

$$
\sigma\_{\rm std} = \sqrt{\frac{1}{N\_{\rm WS} - 1} \sum\_{i=1}^{N\_{\rm WS}} (V\_i - V\_{\rm mean})^2} \tag{15}
$$

where *V*mean is mean wind speed in time window *T*WS with sampling period *<sup>T</sup>*sp; *N*WS = *T*WS/*T*SP is number of sampling points; σstd is standard deviation; *I*tur is turbulence intensity; *Vi* is sampled values. According to IEC 61400-1 [35], three grades of turbulence intensity such as A(0.16), B(0.14) and C(0.12) are given. In this section, using wind-generation function of GH Bladed [19], wind scenarios with di fferent mean values and turbulence intensities are generated, shown in Figure 6. Equations (14) and (15) were used to calculate mean values and turbulence intensities with di fferent time windows, shown in Table 2. Di fferent wind scenarios' settings have some di fferences but the whole wind speed range and the whole intensity range of turbulence can both be tested. Finally, the identified datasets with di fferent operation information caused by di fferent wind scenarios and switched controllers can be obtained for further grey-box parameter identification and validation.

**Figure 6.** Wind scenarios.


**Table 2.** Characteristic parameters of wind scenarios.

Using wind speeds in Figure 6 as inputs of a 2 MW wind turbine model in GH Bladed, operation data can be acquired. Using the benchmark demo of a 2 MW wind turbine in the GH Bladed software, the virtual sensors and their measuring points are shown in Figure 7. For the identification of the two-mass model, the measuring points include 'Nominal wind speed at hub position', 'Rotor speed', 'Rotor azimuth angle', 'Generator speed', 'Generator azimuthal position', 'Generator torque', 'Aerodynamic torque' and 'Low speed shaft torque'. For the identification of the tower model, the measuring points include 'Nominal wind speed at hub position', 'Tower *Fx*–Tower station height = 60 m', 'Nacelle *x*-deflection', 'Nacelle *x*-velocity' and 'Nacelle *x*-acceleration'.

**Figure 7.** Diagram of wind turbine structure and sensor locations.

Then, MI values of two-mass model and tower model were calculated under di fferent wind scenarios, shown in Tables 3 and 4. In Table 3, divided by the maximum value of each row, normalized MI values of each row can be obtained.


**Table 3.** Mutual information (MI) values of two-mass model under di fferent wind scenarios.

**Table 4.** MI values of tower model under di fferent wind scenarios.


According to MI values in Table 3, <sup>ω</sup>r–ωg has the highest correlation. Correlation of *<sup>T</sup>*g–*T*shaf is relatively high, reflecting the similar dynamics of *T*g and *T*shaf. Correlation of *<sup>T</sup>*g–ωr is similar to *<sup>T</sup>*g–ωg. Correlations of <sup>ω</sup>r–*T*shaf and <sup>ω</sup>g–*T*shaf are similar. Correlation of *T*r–ωr is similar to that of *<sup>T</sup>*r–ωg. Obviously, due to di fferent control strategies and controllers being used below or above rated wind speed, MI values are di fferent under di fferent wind scenarios. From Figure 5, the two-mass model locates at the forward channel of a closed loop. Scenarios Type 1 and Type 2 work below rated wind speed where OTC controller acts. Scenarios Type 3 and Type 4 work above rated wind speed where variable-pitch controller acts. Depending on the acquired data, di fferent data sets yield di fferent MI values. Then, through analysis, connection of control loops and information contained in the acquired data can be preliminarily understood according to control strategies in Figure 5.

The tower model (Equation (7)) is an open-loop system. As a result, in Table 4, correlations between *F*t and *z* are very similar under di fferent wind scenarios.

From the view of basic principle, subspace identification is di fferent from grey-box identification. For subspace identification, estimating the Kalman state vector from the input–output data is the first step and using the Kalman state vector to reconstruct the state space model is the second step. Among them, how to estimate the Kalman state vector from the input–output data is a critical step for subspace identification. For grey-box identification including RLS identification and optimal identification, estimating the model parameters from input–output data based on the selected state-space model structure is the first step and using the obtained state-space model to estimate the Kalman state vector is the second step. The above three identification methods have di fferent basic principles.

Firstly, due to the identified two-mass model including no-self-balancing channel, which is unstable, exploration of di fferent identification methods for direct identification of the two-mass model without excitation experiment under closed-loop condition is mainly studied as follows.

For two-mass model (Equation (6)), define *x* = [<sup>ω</sup>r, <sup>ω</sup>g, Δδ] T, *u* = [*T*r, *<sup>T</sup>*g] T, *y* = [<sup>ω</sup>g, *T*shaf] T. Then, the state-space equation and transfer function matrix can be obtained, shown as following

$$M\_{\text{TWO-mases}} = \begin{bmatrix} \frac{b\_1s^2 + b\_2s + b\_3}{s(s^2 + a\_1s + a\_2)} & \frac{-(h\omega + b\overline{\omega})}{s(s^2 + a\_1s + a\_2)}\\ \frac{b\_6s + b\_7}{s^2 + a\_1s + a\_2} & \frac{b\_8s + b\_9}{s^2 + a\_1s + a\_2} \end{bmatrix} \tag{16}$$

where *a*1 = *<sup>B</sup>*damp/*J*r + *<sup>B</sup>*damp/[*J*g (*N*gear) 2], *a*2 = *A*stif/*J*r + *<sup>A</sup>*stif/[*J*g (*N*gear) 2]; *b*1 = 1/*J*r, *b*2 = *<sup>B</sup>*damp/[*J*r*J*g (*N*gear) 2], *b*3 = *<sup>A</sup>*stif/[*J*r*J*g (*N*gear) 2]; *b*4 = *<sup>N</sup>*gear *b*2, *b*5 = *<sup>N</sup>*gear *b*3; *b*6 = *<sup>B</sup>*damp *b*1, *b*7 = *A*stif *b*1; *b*8 = *J*r *b*4, *b*9 = *J*r *b*5. Obviously, for the input–output channels from *T*r to <sup>ω</sup>g and *T*g to <sup>ω</sup>g, transfer functions with no-self-balancing ability are obtained. For the input–output channels from *T*r to *T*shaf and *T*g to *T*shaf, transfer functions with self-balancing ability are obtained. In execution, no-self-balancing object is di fficult to be identified.

For the two-mass model, the measured data are obtained from operation of the two megawatts wind turbine in GH Bladed using wind speed inputs in Figure 6. Under the Hammerstein structure, inputs are turbine rotor torque, *T*r, and generator reaction torque, *<sup>T</sup>*g, for two-mass model. Outputs are generator rotor speed, <sup>ω</sup>g, and internal shaft torque, *T*shaf. Then, measured data can be used for identification. Utilizing the identified state-space model with *T*r and *T*g as inputs, the estimated generator rotor speed and internal shaft torque can be acquired. Herein, all the measured data for identification are produced by the wind turbine model in GH Bladed with high-order nonlinear characteristics. As a result, comparisons of the three identification methods under di fferent wind scenarios are shown in Table 5 and Figures 8–11.


**Table 5.** Comparison of identification methods for two-mass model.

**Figure 8.** (**a**) Comparison of generator rotor speed under scenario Type 1; (**b**) Comparison of internal shaft torque under scenario Type 1.

**Figure 9.** (**a**) Comparison of generator rotor speed under scenario Type 2; (**b**) Comparison of internal shaft torque under scenario Type 2.

**Figure 10.** (**a**) Comparison of generator rotor speed under scenario Type 3; (**b**) Comparison of internal shaft torque under scenario Type 3.

**Figure 11.** (**a**) Comparison of generator rotor speed under scenario Type 4; (**b**) Comparison of internal shaft torque under scenario Type 4.

Subspace identification—a data-driven black-box identification method—is mainly realized via robust numerical solution with QR decomposition or singular value decomposition. All the subspace identification methods assume that input signal and process noise signal are independent with each other. It suggests that the system does not have a feedback loop and all poles strictly fall in the

left-half-plane, and thus subspace identification can have unbiased estimation for open-loop system. Under closed-loop condition, input signal becomes dependent with process noise, so the closed-loop system identification via subspace identification is di fficult. If directly using subspace identification for the closed-loop condition, a biased estimate or even error estimate may happen. Then, eliminating the dependence between input signal and process noise becomes a basic step when using subspace identification for a closed-loop system. In this case, if su fficient excitation can be designed and injected into the closed-loop system and higher signal-to-noise ratio can be yielded, better results can be obtained by subspace identification for the closed-loop system. Its identification accuracy highly depends on the design of the test signal. However, in this paper, system identification under closed-loop condition without excitation signal is mainly studied for the wind turbine model in GH Bladed with high-order nonlinear characteristics. In theory, the identified system will remain biased or with error if no-excitation signal is used under closed-loop condition. From Table 5 and Figures 8–11, these methods all have limited identification performance. Through comparison, subspace identification via MOESP has the best performance, showing its good adaptability to closed-loops. Even for the mixed input–output channels, where *<sup>T</sup>*g–ωg and *<sup>T</sup>*r–ωg represent no-self-balancing channels and *<sup>T</sup>*g–*T*shaf and *T*r–*T*shaf represent self-balancing channels, subspace identification has good fit-percent using weighted loss function. However, reconstructed states of the discrete high-order system have no-physical-meanings and are disadvantage for control design to improve dynamic performance of the system. Besides, the identified system cannot guarantee its autonomous stability. As a result, subspace identification only yields a good black-box result from the view of fit-percent based on its high-order discrete model structure, whereas the identified system is still unbiased both in theory and in practice.

RLS identification—a black-box or grey-box identification method—is a generalization to LS identification and is mainly solved by numerical solution algorithm. Herein, the drive-train model structure has been selected, so a grey-box problem is formed. RLS identification can have unbiased estimation for an open-loop system without feedback loop where the process noise signal and input signal are independent. Yet, the drive-train model is a no-self-balancing and instable model where closed-loop identification is necessary. Under closed-loop condition, the system needs to be fully excited where a suitable excitation signal should be designed and injected into the system to reduce or even eliminate the dependence between process noise signal and input signal. In this case, the RLS algorithm may have a better identification result. Herein, the simplest identification condition is provided for RLS identification where no-excitation-signal is injected into the closed-loop of drive-train and the data yielded by the closed-loop system are directly acquired for identification. In theory, only unbiased or error estimate may be obtained. RLS identification is just applicable for single output system. Based on Equation (12), *T*r is used to estimate parameters. Fit-percent and estimation error of the identification results for the two-mass model performs the worst, as shown in Table 5 and Figures 8–11. Besides, for the identified parameters, negative values often appear. These parameters does not always keep consistent with their physical meanings and cannot guarantee autonomous stability of system. Using the estimated parameters, the simulated outputs deviate a lot to the measured outputs and error estimate happens in practice. As a result, for the direct identification of the two-mass model using operation data under the closed-loop condition, RLS identification is di fficult to have proper results. It reflects that RLS method hugely relies on the acquired data and excitation experiment.

Grey-box identification utilizes the physical structure and input–output data to establish a system model where a priori physical knowledge and model uncertainty caused by process noise are both considered. It has obvious advantages for capturing the natural behavior of the actual system. Usually, the selected model structure needs include the noise terms to describe model uncertainty. Yet, many unknown noise signals are di fficult to be estimated and improper estimation may yield bad results. Herein, the two-mass model is selected as the identified model structure whereas its noise terms are not added, so the approaching ability of the identified two-mass model may limited to the high-order nonlinearity. Then, based on the two-mass model structure, optimization search algorithm is used to find the parameters with the best estimation performance. Shown in Table 5 and Figures 8–11, grey-box optimization identification has moderate performance. Although it is applicable for multi-output system, it performs poorly for the mixed channels. Because estimation bias always exists and even diverges for the no-self balancing channels, the fit-percent and estimation error have poor performance. To display the performance, single-step estimates of grey-box optimization identification are also shown in Figure 8a, Figure 9a, Figure 10a, and Figure 11a. It can be found that the single-step estimates are convergen<sup>t</sup> and show their trend tracking abilities to the measured data. For the self-balancing channels, only limited estimation performance is obtained due to the simplified model structure without noise terms and weighted identification with the no-self-balancing channels. Even so, grey-box identification has obvious advantages such as better adaptability to identified data for direct identification under closed-loop condition and is a more convenient identification procedure. Besides, it has the most important advantage that the estimated parameters have physical meaning and they can guarantee autonomous stability of the identified system. As a result, even if the grey-box identification may have limited accuracy, it is still very useful for control design. Especially, through choosing better identified data, a better result of grey-box identification can be yielded. For wind scenarios with higher MI values, better identification performance can be obtained. In Type 1, Type 2 and Type 3, *<sup>T</sup>*r–ωg, *<sup>T</sup>*g–ωg, *T*r–*T*shaf and *<sup>T</sup>*g–*T*shaf have higher MI values than that of Type 4, so the identification results of Type 1, Type 2 and Type 3 have less MSE and better fit-percent.

In summary, for parameter identification of the two-mass model, subspace identification gets the best fit-percent. It is suitable for the predictive control design while unsuitable for many advanced optimal control design methods, such as *H* ∞ control and linear quadratic regulator control, etc. RLS identification is proved to be invalid for the two-mass model under the simple identification condition in this paper. Optimization identification has moderate performance. It is suitable for all kinds of control design algorithms. However, dynamics based on the simplified model structure need to be compensated. Besides, a novel method is needed to realize accurate identification for the no-self-balancing channel. Comparison analyses of above identification results for the two-mass model are briefly summarized and shown in Table 6.


**Table 6.** Comparison analyses of identification results for two-mass model.

Then, open-loop identification of the tower model is studied as follows. It can directly show the identification performance of di fferent methods.

For tower model (Equation (7)), define *x* = [*d*, . *d*] T, *u* = *F*t, *y* = *d*. Then, state-space equation and transfer function of this two-order spring-damping model can be obtained, shown as follows:

$$M\_{\text{Tower}} = \frac{1}{s(M\_{\text{ft}}s + D\_{\text{t}}) + K\_{\text{t}}} \tag{17}$$

For the input–output channel from *F*t to *d*, it has self-balancing ability.

Under open-loop condition, performance of the three identification methods are strongly dependent on that whether the identified data contain fully excited dynamic information. Herein, only a very simple identification condition is set without excitation signal, so the identification result only depends on the acquired operation data.

For tower model, comparisons of three identification methods are shown in Table 7 and Figures 12–15. RLS identification and grey-box optimization identification both performs better than the subspace identification. It suggests the grea<sup>t</sup> advantage of priori model structure information to accurately approach the high-order nonlinear dynamics of tower system. Meanwhile, the arbitrarily acquired data may contain insufficient excited dynamic information which affect the estimation of Kalman state vector form the input–output data, so subspace identification performs the worst. It also suggests that subspace identification is a data-driven black-box identification method and its performance strongly depends on the acquired data and excitation signal.


**Table 7.** Comparison of identification methods for tower model.

**Figure 12.** Comparison of tower deflection under scenario Type 1.

**Figure 13.** Comparison of tower deflection under scenario Type 2.

**Figure 14.** Comparison of tower deflection under scenario Type 3.

**Figure 15.** Comparison of tower deflection under scenario Type 4.

Especially, RLS achieves the best fit-percent, while parameters incongruent with its physical meaning may be identified and they cannot guarantee autonomous stability of the system. Subspace identification via CVA has the worst fit-percent, which just uses input–output data and reconstructs the states with no-physical meanings. It also cannot guarantee autonomous stability of the identified system. In contrast, optimization identification has moderate fit-percent, but it is sensitive to the set of initial domains of parameters. Its advantage is that the identified parameters have clear physical meanings and they can guarantee autonomous stability of the identified system. However, the simplified model structure limits its representational ability to the nonlinear tower dynamics. For the subsequent applications, subspace identification is suitable to test identifying feasibility of acquired data in the early stage. RLS is suitable to determine the rough range of identified parameters in the middle stage. Then, optimization identification is suitable for the final stage to obtain parameters with physical meaning while guaranteeing autonomous stability of the open-loop tower system. Of course, if simplified mechanism model is adopted and the fit-percent does not perform well, reasonable dynamic compensation can be added. Besides, because the two-order spring-damping model of the tower system lies in an open-loop, similar MI values are obtained under different wind scenarios. As a result, identification performance under different wind scenarios are very similar, too. Comparison analyses of the above identification results for the tower model are briefly summarized and shown in Table 8.

**Table 8.** Comparison analyses of identification results for tower model.


The main purpose of this paper is to explore the identification performance of three representative methods under a simple condition with minimal complexity. At last, application potential and features of these identification methods are summarized and shown in Figure 16. Overall, using the minimal complexity for identification, grey-box optimization identification has the best adaptability to obtain a valid state-space model with physical meaning and guaranteed stability. It is very helpful for subsequent control design to improve both the stable and dynamic performance of the system. The work herein will be part of a hybrid modeling method in future where the biased estimation of the identified state-space model will be compensated by a non-parametric machine learning algorithm. As a result, the hybrid modeling method can efficiently balance modeling complexity and accuracy where the state-space model with high-precision approaching ability to high-order nonlinearity can be obtained.

**Figure 16.** Application potential and features of different identification methods.
