1. Introduction
Operation and maintenance activities can account for up to 25% of the overall cost of a wind farm project [1], and this share can reach 35% for offshore installations [2]. This motivates the widespread interest in wind turbine condition monitoring [3,4], because the ability to plan interventions adequately is fundamental for reducing energy production losses.
In [5], it is reported that for a typical 2 MW wind turbine the cost of the generator is in the order of 10% of the total component cost and that generator failure is among the factors with the highest impact on the number of producible hours lost (around 200). The electric generator is estimated to be among the top three contributors to the failure rates and the downtime of wind turbines, according to [6,7]. These facts motivate the development of reliable methodologies for the early detection of wind turbine generator damage and for the evaluation of generator health status.
Faults related to wind turbine generators are particularly difficult to diagnose in real-world applications because they evolve quickly in an uncontrolled environment with variable operating conditions. This holds for mechanical damage to rotating elements, such as generator bearings, and for electrical damage to generator components.
Furthermore, in wind energy practice, the most widely used information source is supervisory control and data acquisition (SCADA) data, which have a sampling time of minutes (typically ten). The main drawback is that this time scale is far from optimal, in particular for the diagnosis of electrical faults. In order to acquire electrical measurements with the appropriate sampling frequency (in the order of kHz, as is done for example in [8,9]), it is at present unavoidable to intervene on site, and it is impractical to do this on a condition basis. Consequently, the standard practice is that inspections of wind turbine generators are periodic and therefore poorly related to the incipience of possible faults.
In light of the above considerations, it would be valuable to develop SCADA data analysis methodologies that help evaluate the health status of wind turbine generators. This is the objective of the present study, which is organized as a discussion of a real-world test case.
The innovative aspects of the present study can be appreciated in light of a brief discussion of the literature on wind turbine generator fault diagnosis, which shows that most studies using SCADA data deal with mechanical faults. As often happens when SCADA data are employed for this aim [10,11,12,13,14], the analysis of sub-component temperatures is useful. In [15,16,17], the targets are the stator winding temperature, the generator bearing temperature, and the generator slip ring temperatures. In [18], a test case of generator damage (rotor winding failure) is analyzed: the diagnosis is based on a dynamic model sensor method representing the relationship between the generator temperature, wind speed, and ambient temperature.
Two inspiring studies for the purposes of the present work are [19,20]. In [19], a series of phenomena possibly related to wind turbine generator damage are listed, which can be identified through SCADA data analysis. In addition to anomalous heating, these are a loss of correlation between generator speed and produced power, or reactive power, and anomalies in the shaft torque. In [20], it is shown that real-world generator damage can be diagnosed by analyzing the Mahalanobis distance and the correlation matrix of a set of features.
As discussed above, the present study aims to contribute to SCADA data analysis methods for wind turbine generator fault diagnosis. Thanks to the support of the Lucky Wind spa company, which provided the data sets employed in this study, it has been possible to investigate the behavior of a Vestas V52 wind turbine before and after electrical damage occurred at the generator. The most important operation variables (such as blade pitch and rotational speed) and electrical parameters have been analyzed; in light of the above literature discussion, a relevant point of novelty of this study is the use of SCADA measurements of electrical parameters. In particular, this work shows that it is possible to construct data-driven normal behavior models describing the relation between electrical parameters and operation variables, and that these models are responsive in identifying incipient electrical damage. The normal behavior model is constructed through support vector regression (SVR) with a Gaussian kernel, because this has been shown to be effective for tackling the typical non-linear problems arising in wind energy data practice. The features are orthogonalized and reduced in dimension through principal component analysis (PCA).
To summarize, in this work a reference healthy wind turbine and the target damaged one are analyzed in parallel. It is shown that the damaged wind turbine can be distinguished from the healthy one when the fault is incipient (in the order of two weeks before the fault) and that, after the replacement of the generator, the observations are compatible with the normal behavior model.
The organization of this work is as follows: Section 2 is devoted to the description of the test cases and of the data sets; in Section 3, the methods are described; the results are collected and discussed in Section 4; finally, conclusions are drawn in Section 5.
2. Test Cases and Data Sets
The wind farm of interest features six Vestas V52 wind turbines, installed in the year 2007, and it is sited in Italy on mountainous terrain.
The SCADA data sets that were used have a sampling time of ten minutes and span from 1 January 2017 to 31 December 2018. At the beginning of March 2018, the target wind turbine (named Tar in this study) experienced electrical damage at the generator, as a consequence of which the generator had to be replaced. In [21], a study was devoted to the analysis of the performance of the Tar wind turbine before and after the generator replacement, in comparison to the other wind turbines in the farm. The objective of that study was the assessment of the impact of generator aging on wind turbine performance. The result was that after the generator replacement the performance of Tar slightly recovered, while the performance of the rest of the wind farm kept slightly worsening due to the effect of aging. In [21], only the main operation variables were analyzed.
The present study is instead devoted to the diagnosis of generator damage before it occurs and, for this purpose, the data set used also included the most important electrical parameters. The measurements used for this study were:
Wind speed v (m/s);
Power P (kW);
Blade pitch angle (°);
Rotor speed (rpm);
Generator speed (rpm);
Gear bearing temperature (K);
Generator Phase 1 temperature (K);
Generator Phase 2 temperature (K);
Generator Phase 3 temperature (K);
Current Phase 1 (A);
Current Phase 2 (A);
Current Phase 3 (A);
Voltage Phase 1 (V);
Voltage Phase 2 (V);
Voltage Phase 3 (V).
Section 3.3 features a detailed discussion about how the data sets were arranged for the diagnosis of the damage at the Tar wind turbine and for the comparison against a healthy reference (named Ref). For completeness, Tar and Ref correspond to ITA4 and ITA3 according to the nomenclature of [
21].
As regards the data pre-processing, in general it should be noted that the operation of wind turbines in industrial farms is affected by curtailments dictated by grid requirements and it is therefore necessary to filter the data appropriately. This was done as follows (similarly to [
22]):
filter using the run-time counter, requiring production for 600 s out of 600;
filter below rated power (approximately );
the data corresponding to operation under grid curtailment are filtered out by automatic data clustering through the random forest algorithm [
23].
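The first two filters above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Record` fields, the function names, and the rated-power value passed by the caller are hypothetical, and the curtailment clustering step is not covered here.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One ten-minute SCADA record (hypothetical field names)."""
    wind_speed: float  # m/s
    power: float       # kW
    run_time: float    # seconds of production within the interval

# Length of one SCADA interval in seconds (ten-minute sampling).
FULL_INTERVAL_S = 600.0


def keep(record, rated_power_kw):
    """Return True if the record passes the run-time and rated-power filters."""
    produced_full_interval = record.run_time >= FULL_INTERVAL_S
    below_rated = record.power < rated_power_kw
    return produced_full_interval and below_rated


def prefilter(records, rated_power_kw):
    """Keep only records with full production time and power below rated."""
    return [r for r in records if keep(r, rated_power_kw)]


# Usage with made-up records; the threshold is an illustrative value
# supplied by the caller, not a figure taken from the text.
sample = [
    Record(wind_speed=8.0, power=400.0, run_time=600.0),
    Record(wind_speed=8.0, power=400.0, run_time=540.0),
]
filtered = prefilter(sample, rated_power_kw=850.0)
```

Passing the rated power as an argument keeps the sketch independent of any specific turbine model.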
An example of the scattered power curve before and after data pre-processing is reported in
Figure 1.
4. Results
A preliminary result regards the positive effect of the use of PCA for the normal behavior model. Table 2 reports an estimate of the computational time required by the models with and without PCA. The improvement in computational time is appreciable.
Figure 3 reports an example of a time series of simulated and measured current for the healthy Ref wind turbine. Figure 3 shows that the dimension reduction through PCA does not significantly affect the behavior of the model estimates, whose error metrics for the test data set are summarized in Table 3. Table 3 shows that the error metrics on the test data set are acceptably low, and therefore the model is potentially capable of detecting anomalies, as shown in the following. In order to estimate margins for the reliability of the proposed model, a 10-fold cross validation was performed with 300 model runs: for each model run, the error metrics were computed, and the results reported in Table 3 should be understood as the average metrics over the model runs, with the associated standard deviation. The average error metrics can be considered robust because their standard deviation is very low. Furthermore, as regards the power P, a comparison with the literature is possible: the order of the error metrics is similar to the results collected in [36] for the same test case. Therefore, it can be stated that the dimension reduction through PCA does not remarkably affect the quality of the regression and helps in highlighting the dependence of the target on the orthogonalized input variable matrix.
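The averaging of error metrics over repeated cross-validation runs can be illustrated with the following minimal Python sketch. It reflects one possible reading of the procedure and relies on assumptions: a one-feature least-squares line stands in for the SVR model, the plain mean absolute error stands in for the error metrics of Table 3, and all function names are hypothetical.

```python
import random
import statistics


def mean_absolute_error(y_true, y_pred):
    """Plain MAE, used here as a stand-in error metric."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_line(x, y):
    """Least-squares line on one feature (stand-in for the SVR model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def k_fold_mae(x, y, k, rng):
    """One cross-validation run: average held-out MAE over k random folds."""
    idx = list(range(len(x)))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    errors = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        predictions = [slope * x[i] + intercept for i in fold]
        errors.append(mean_absolute_error([y[i] for i in fold], predictions))
    return statistics.fmean(errors)


def repeated_cv(x, y, k=10, runs=300, seed=0):
    """Mean and sample standard deviation of the metric over repeated runs."""
    rng = random.Random(seed)
    per_run = [k_fold_mae(x, y, k, rng) for _ in range(runs)]
    return statistics.fmean(per_run), statistics.stdev(per_run)
```

A low standard deviation across runs indicates that the metric depends little on how the data are split, which is the sense in which the averages are described as robust.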
In Figure 4, qualitative results are reported that are meaningful for the identification of the incoming fault and that will be elaborated further on. The simulated power is reported as a function of the measured power for two data sets approaching the date of the fault. The plots for the reference wind turbine Ref are on the left, while those for the target faulty wind turbine Tar are on the right. Approximately two weeks before the fault, anomalous behavior can be distinguished at the Tar wind turbine: the simulated power is significantly higher than the measured value. This means that, given the input features, the extracted power should be higher according to the normal behavior model. In Figure 5, a similar plot is reported for the simulation of the current approaching the fault (the same data set as Figure 4c,d); this figure shows that the fault onset can also be identified by directly modeling the electrical parameters of the generator.
In Figure 6, the same kind of plot as in Figure 4 is reported for a sample data set after the replacement of the Tar generator. Figure 6 shows that the reference and the target are hardly distinguishable, in contrast to what happened in proximity of the fault.
The situation depicted in
Figure 4 and
Figure 5 can be interpreted quantitatively by computing the Pearson correlation coefficients
between the measurements of the targets and of the input variables in the data set
for the target wind turbine. The results are reported in
Table 4, from which it arises that the correlation coefficients change drastically with respect to the
data set (
Table 1). This confirms that the onset of generator damage can also be identified as a change in the correlation between relevant operation variables (as suggested in [19]). In
Table 5, the Pearson correlation coefficients
are reported for the data set
and, consistently, the values return to the order of those reported in
Table 1 for the target wind turbine in healthy conditions.
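For reference, the Pearson correlation coefficient between a target and each input variable can be computed as in the following minimal Python sketch. The function names and the dictionary layout of the features are illustrative assumptions and do not reproduce the authors' code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlation_table(target, features):
    """Correlation of each named input variable with the target series."""
    return {name: pearson(values, target) for name, values in features.items()}
```

Computing such a table on a healthy period and on a period approaching the fault allows the two sets of coefficients to be compared directly.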
In
Figure 7, the behavior of
Figure 4 is elaborated through the indication of meaningful historical thresholds for the detection of the incoming fault. In
Figure 7, the difference in NMAE between the Tar and the Ref wind turbines is plotted for the
data set. The statistical indicators for the quantity of
Figure 7 have been computed using the
data set: these are the average (blue line) and the standard deviation (yellow line). Therefore, for each measurement of
Figure 7 in the
data set, it is possible to estimate by how many standard deviations the difference in NMAE between Tar and Ref deviates from the historical normal trend. This allows the evolution of the incoming fault to be observed, and two reasonable thresholds can be defined: a pre-alert threshold (two standard deviations) and an alert threshold (three standard deviations). From
Figure 7, it emerges that two relevant peaks above the alert threshold occurred before the wind turbine stopped. A crosscheck against the alarm log book of the wind turbines showed that both peaks coincide with the onset of alarms identifying current anomalies reaching the converter: in light of the present work, this phenomenon can be explained by anomalous electrical behavior of the generator, eventually resulting in converter current anomalies. It should be pointed out that the present method is more informative than the analysis of the alarm log book: alarms are impulsive events, while the approach of this study can be employed for continuous online monitoring. With an appropriately defined set of thresholds, as is done in this work, possible faults can be identified earlier than through the mere processing of the alarm logs. A further observation is that in the
, the quantity reported in
Figure 7 deviates by more than one standard deviation from the historical average only in anticipation of an alarm event. This suggests that the number of false positives produced by this method can be reasonably low; this aspect will be analyzed further when other test cases are available to the authors. Another interesting aspect of
Figure 7 is that the latter peak anticipates the damage, and two observations arise: this peak is higher than the former alarm event, and the alert threshold is exceeded approximately two weeks before the wind turbine stopped. These facts support the usefulness of the proposed method, because it is responsive and can also anticipate error logs that are not associated with short-term stopping of the wind turbine. This is particularly important in view of using this kind of method for condition assessment.
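The threshold logic described above can be sketched in Python as follows. This is a minimal illustration under assumptions: the function names are hypothetical, the sample standard deviation is used, and only upward deviations are flagged, since the text refers to peaks above the thresholds.

```python
import statistics


def classify_deviation(delta_nmae, hist_mean, hist_std,
                       pre_alert_k=2.0, alert_k=3.0):
    """Label one Tar-minus-Ref NMAE difference against historical statistics."""
    if hist_std <= 0:
        raise ValueError("historical standard deviation must be positive")
    # Number of historical standard deviations above the historical average.
    z = (delta_nmae - hist_mean) / hist_std
    if z >= alert_k:
        return "alert"
    if z >= pre_alert_k:
        return "pre-alert"
    return "normal"


def monitor(historical_delta, recent_delta):
    """Classify recent differences using mean and std of a historical period."""
    hist_mean = statistics.fmean(historical_delta)
    hist_std = statistics.stdev(historical_delta)
    return [classify_deviation(d, hist_mean, hist_std) for d in recent_delta]
```

Because the statistics come from a historical period in which both wind turbines were healthy, the labels express how unusual the current mismatch is with respect to normal operation.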
5. Conclusions
The present study has been devoted to the use of SCADA data for the diagnosis of electrical damage at wind turbine generators. As discussed in Section 1, this objective is particularly challenging because the sampling time of SCADA data is not the most appropriate for interpreting the dynamics of electrical phenomena in machines subjected to non-stationary conditions, as wind turbines are. Nevertheless, the widespread use of SCADA data and the potential applications in wind energy practice motivate a continuously growing scientific interest.
The main result of this study is that appropriate SCADA data analysis methods are helpful in diagnosing electrical damage to wind turbine generators. This has been accomplished through the analysis of a real-world test case, which is the breakdown and the consequent replacement of a generator at a Vestas V52 wind turbine sited in southern Italy.
The proposed methodology is based on the construction of normal behavior models for the power and for the voltages and currents of the wind turbine. The relevant features were selected based on their Pearson correlation coefficient with the target and the PCA was employed to reduce the dimension of the problem and to deal with the collinearity of the regressors. The non-linear relation between the features and the target was taken into account using support vector regression with a Gaussian kernel.
Through the analysis of how the mismatch between model estimates and measurements evolved approaching the time of the fault, it was possible to identify the damage approximately two weeks in advance. Considering the nature of the damage and the type of data employed, this result is very promising as regards the use of SCADA data for monitoring the health status of wind turbine generators.
A valuable further direction of the present work would be the use of time-resolved SCADA data with a sampling time of the order of one second [37]. Although in wind energy practice such data are more complex to manage from the OPC-DA servers of wind turbines, they could provide deeper insight into the dynamics of wind turbines and thus further improve the diagnostic capability of data-driven methods.
In general, a realistic application of the type of analysis presented in this work is to provide a first indication for the assessment of generator conditions, based on which dedicated inspections could be planned using methodologies similar to those in [8]. From this point of view, the present study represents a contribution to the methodologies for condition-based maintenance of wind turbine generators.