Article

Wind Farm Prediction of Icing Based on SCADA Data

by Yujie Zhang 1, Mario Rotea 2,* and Nasser Kehtarnavaz 3
1 Center for Wind Energy, Department of Electrical and Computer Engineering, University of Texas at Dallas, Richardson, TX 75080, USA
2 Center for Wind Energy, Mechanical Engineering Department, University of Texas at Dallas, Richardson, TX 75080, USA
3 Department of Electrical and Computer Engineering, University of Texas at Dallas, Richardson, TX 75080, USA
* Author to whom correspondence should be addressed.
Energies 2024, 17(18), 4629; https://doi.org/10.3390/en17184629
Submission received: 30 July 2024 / Revised: 12 September 2024 / Accepted: 13 September 2024 / Published: 15 September 2024
(This article belongs to the Topic Advances in Wind Energy Technology)

Abstract

In cold climates, ice formation on wind turbines reduces the power produced by a wind farm. This paper introduces a framework to predict icing at the farm level based on our recently developed Temporal Convolutional Network prediction model for a single turbine using SCADA data. First, a cross-validation study is carried out to evaluate the extent to which predictors trained on a single turbine of a wind farm can be used to predict icing on the other turbines of the farm. Then, two approaches are considered to conduct farm-level icing prediction: decision fusion and feature fusion. In decision fusion, icing prediction decisions from individual turbines are combined in a majority voting manner. In feature fusion, the features of the individual turbines are averaged before conducting prediction. Both approaches combine information from multiple turbines, thereby providing predictions at the wind farm level. The results obtained indicate that both the decision fusion and feature fusion approaches generate farm-level icing prediction accuracies that are 7% higher, with lower standard deviations or fluctuations across different prediction horizons, when compared with predictions for a single turbine.

1. Introduction

The prediction of icing is critical to the normal operation of wind farms in cold climates. Ice formation on the blades of wind turbines can cause load imbalance and structural damage, posing safety risks to the surrounding area [1]. Prediction of icing allows operators to put remedies in place before power losses occur due to icing events [2]. Developing a prediction framework for an entire wind farm makes it possible to anticipate the amount of power loss to the grid and to take appropriate actions.
A review of machine learning approaches for the prediction of icing on wind turbines based on Supervisory Control and Data Acquisition (SCADA) data was conducted by our research team and is reported in [3]. That review discusses existing machine learning methods for icing prediction on wind turbines in detail. A brief summary of the articles reviewed is as follows. In [4], a data-driven neural network approach was used to predict icing on wind turbines based on SCADA data and historical weather data, reporting an accuracy of 83% for the dataset examined. In [5], Federated Learning (FL) was used to predict icing on wind turbine blades; a local model was trained for each turbine and a global model was then aggregated from all the local models, reporting a prediction accuracy of 70% for the dataset examined. In [6], Random Forest (RF) was used to predict icing events on wind turbine blades, reporting an accuracy of 74% for the dataset examined. In [7], a Recurrent Neural Network (RNN) was used to predict icing, reporting an accuracy of 72% for the dataset examined. In [8], a data-driven Graph Neural Network (GNN) was used to predict icing on wind turbine blades, reporting an accuracy of 75% for the dataset examined.
In our previous work [9], we developed a framework to predict icing on a wind turbine. A Temporal Convolutional Network (TCN) prediction model was used, which generated an average prediction accuracy of 77.6% for future times up to 48 h or 2 days ahead. Only SCADA data and meteorological data were used as input to the prediction model. The prediction model did not rely on the installation of any additional sensors on the turbine.
In this follow-up work, two questions are addressed:
  • Can the predictors trained on one turbine of a wind farm be used to conduct icing predictions for the other turbines in the wind farm?
  • How can the turbine-level icing prediction framework be extended to an entire wind farm?
The first question is addressed by carrying out cross-validation, i.e., by examining the generalization ability of TCN predictors trained on a single turbine. In other words, predictors trained on one turbine are tested on the other turbines in the same wind farm. The common performance measures of accuracy and F1-score are used to evaluate the generalization ability. Accuracy represents the fraction of predictions that are performed correctly across all the predictions performed. The F1-score provides a combined representation inversely proportional to the number of false positive and false negative predictions across all the predictions performed. A higher F1-score indicates fewer incorrect predictions; see [9] for the formulas for accuracy and F1-score. The second question is addressed by carrying out two types of fusion approaches: decision fusion and feature fusion. Fusion combines results from multiple turbines and then gives a final prediction for the wind farm. In decision fusion, prediction is performed for each turbine independently or individually. Then, all the individual prediction decisions are combined by majority voting to obtain a farm-level icing prediction. In feature fusion, the features of all the individual turbines are combined via averaging. Then, farm-level icing prediction is achieved by one predictor per prediction horizon. Fusion approaches have been previously used in other engineering applications, e.g., [10,11,12]. However, it is worth mentioning that this is the first time fusion approaches are used to achieve farm-level icing prediction. More specifically, the contributions of this work are twofold: (i) an examination of the generalization ability of predictors trained on a single turbine for the other turbines in a wind farm, and (ii) the development of a farm-level icing prediction framework based on two fusion approaches.
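To make these two performance measures concrete, the short sketch below computes accuracy and F1-score for a set of binary icing predictions using scikit-learn. The arrays are hypothetical placeholders, and the exact formulas used in this work are those given in [9].

```python
# A minimal sketch (not the authors' code): computing accuracy and F1-score
# for binary icing predictions with scikit-learn. Arrays are hypothetical.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 0])  # labeled ice states (1 = ice)
y_pred = np.array([0, 1, 1, 1, 0, 0, 0, 0])  # predicted ice states

acc = accuracy_score(y_true, y_pred)   # fraction of correct predictions
f1 = f1_score(y_true, y_pred)          # penalizes false positives and false negatives

print(f"accuracy = {acc:.2f}, F1-score = {f1:.2f}")
```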
The remainder of this paper is organized as follows. Section 2 describes the cross-validation study conducted to answer the first question. Section 3 describes the fusion approaches to conduct icing prediction for an entire wind farm answering the second question. The icing prediction results for an entire farm are reported and discussed in Section 4. Finally, the paper is concluded in Section 5.

2. Cross Validation: Generalization Ability of a Single Turbine Predictor

In [9], we developed a prediction framework to forecast icing on wind turbines up to 2 days ahead using only SCADA data and meteorological data, if available. This approach is based on TCN predictors for different times in the future (prediction horizons). This prediction framework includes the modules of data preprocessing, prediction model training and testing, and prediction model evaluation. Based on the SCADA data from a single turbine, our TCN predictors produced an average prediction accuracy of 77.6% across different prediction horizons from 10 min ahead to 2 days ahead.
The SCADA dataset used in this paper is from a wind farm located in the northern part of the US. This dataset includes 11 features or variables of all the turbines in the wind farm measured every 10 min from January 2023 through July 2023. These features or variables are listed in Table 1. In addition to the SCADA dataset, weather data features or variables listed in Table 2 for the same location and time period were acquired from the VisualCrossing weather database [13].
For the utilization of these predictors at the farm level, it is necessary to examine their performance on SCADA data from other turbines. The wind farm layout considered is shown in Figure 1. The rated power of each wind turbine in the farm is 2 MW with a cut-in wind speed of 4 m/s, a rated wind speed of 12 m/s, and a cut-out wind speed of 25 m/s. The prevailing wind direction is shown in the figure.
Cross-validation is often used to evaluate the performance of a model on unseen data [14]. An illustration of the cross-validation conducted here is shown in Figure 2. For each turbine, predictors are trained using its own SCADA data and are then tested on the SCADA data of all the other turbines in the wind farm. An assessment metric P_{i,j}, consisting of accuracy and F1-score, is used to evaluate the generalization ability, where i denotes the index of the turbine a predictor is trained on and j denotes the index of the turbine the predictor is tested on. The metric P_{i,j} is defined as follows:
P_{i,j} = \left( \mathrm{accuracy}_{i,j},\; F_1\text{-score}_{i,j} \right)
where accuracy reflects the number of correct predictions, whereas the F1-score reflects the number of incorrect predictions. The equations for accuracy and F1-score appear in [9]. Note that each accuracy_{i,j} value and F1-score_{i,j} value is the average over the prediction horizons. The average assessment metric P_k is obtained by averaging P_{i,j} over j, that is, over all the testing turbines. The average assessment metric P_k can thus be used to assess the prediction performance of the predictors that are trained on turbine k and tested on all the other turbines.
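The bookkeeping behind this cross-validation can be sketched as follows. The code is illustrative only: evaluate(i, j) is a hypothetical stand-in for training the predictors on turbine i and testing them on turbine j, and the qualified-turbine list follows Section 3.1.

```python
# A minimal sketch of the cross-validation bookkeeping described above.
# evaluate(i, j) is a hypothetical placeholder for training predictors on
# turbine i and testing them on turbine j; it returns (accuracy, F1-score)
# already averaged over the prediction horizons.
import numpy as np

def evaluate(i: int, j: int) -> tuple[float, float]:
    rng = np.random.default_rng(i * 100 + j)      # placeholder for real training/testing
    return rng.uniform(0.6, 0.9), rng.uniform(0.3, 0.6)

turbines = [t for t in range(1, 76) if t not in (1, 13, 15, 24)]  # qualified turbines (Section 3.1)
P = {}                                            # P[(i, j)] = (accuracy_ij, F1_ij)
for i in turbines:
    for j in turbines:
        if i != j:
            P[(i, j)] = evaluate(i, j)

# Average assessment metric P_k: mean accuracy and F1-score over all testing turbines j.
P_avg = {k: tuple(np.mean([P[(k, j)] for j in turbines if j != k], axis=0))
         for k in turbines}
best_k = max(P_avg, key=lambda k: P_avg[k])       # turbine whose predictors generalize best
print(best_k, P_avg[best_k])
```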
The outcome of the cross-validation is provided in Figure 3. For the predictors trained on each turbine, the average accuracy and F1-score (average assessment metric) over the testing data from the other turbines are plotted. This figure shows the generalization ability of the predictors trained on a single turbine across all the other turbines. Turbines T1, T13, T15, and T24 were not fully operational and thus were not used in our analysis, which explains the missing accuracy and F1-scores in Figure 3 for these turbines.
The predictors trained on turbine T55 exhibited the highest accuracy and F1-score (dashed vertical line in Figure 3). The accuracy and F1-score of the predictors trained on T55 and tested on all the other turbines are shown in Figure 4. The average metric P_55 (defined in Figure 2) consists of an average accuracy of 86.10% and an average F1-score of 0.50, indicating that the predictors trained on turbine T55 have the best performance when predicting icing on the other turbines in the same wind farm.
For each tested turbine, the accuracy and F1-score are drawn as a box plot. The box plot indicates the accuracy and F1-score range across all the prediction horizons, from 10 min ahead to 2 days ahead. Each box plot contains statistical information including the minimum, maximum, median, first quartile (Q1), and third quartile (Q3) values. It is seen from the box plots that the accuracy and F1-score values vary between the tested turbines, indicating that the predictors trained on T55 perform well on some turbines but not on others.
The predictors trained on turbine T56 exhibited the lowest accuracy and F1-score. The accuracy and F1-score of the predictors trained on T56 and tested on all the other turbines are shown in Figure 5. The average metric P_56 (defined in Figure 2) consists of an average accuracy of 63.88% and an average F1-score of 0.39, indicating that the predictors trained on turbine T56 have the worst performance when predicting icing on the other turbines in the same wind farm. By comparing Figure 4 and Figure 5, it can be observed that the box plots in Figure 5 have lower mean values and higher variances than those in Figure 4, indicating that the predictors trained on turbine T55 outperform the predictors trained on turbine T56.
The above analysis suggests that predictors trained on the data associated with a specific turbine can perform well when tested on the data associated with another turbine if the distributions of the SCADA features of the testing and training data are close, and may not perform well when these distributions are not close. As an example, Table 3 shows the Fisher distance [9,15], a measure of closeness of two distributions, for the features of three turbines, where the predictors are trained on T55 and tested on T54 and T56, respectively. Inspecting the features in Table 3, the Fisher distance discrepancies between testing turbine T56 and training turbine T55 are more significant than the discrepancies between testing turbine T54 and T55. Therefore, the testing accuracy on T56 is lower than the testing accuracy on T54, as illustrated in Figure 6. This figure shows the distribution across all 288 prediction horizons (10 min to 2 days) of the prediction accuracy when T54 and T56 use the predictors from T55. While the histograms for T55 and T54 almost overlap, the histogram for T56 is skewed to the left, clearly showing reduced accuracy.
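For illustration, the sketch below computes a per-feature Fisher distance between two sets of samples using the common form (μ1 − μ2)² / (σ1² + σ2²). This is a generic definition with hypothetical inputs; the exact formulation used in this work is the one referenced in [9,15].

```python
# A minimal sketch (not the authors' code) of a per-feature Fisher distance,
# using the common form (mu1 - mu2)^2 / (var1 + var2); the exact definition
# used in [9,15] may differ. Inputs are hypothetical 1-D feature samples.
import numpy as np

def fisher_distance(x1: np.ndarray, x2: np.ndarray) -> float:
    """Separation between two sample distributions of one feature."""
    return (x1.mean() - x2.mean()) ** 2 / (x1.var() + x2.var())

rng = np.random.default_rng(0)
ice_samples = rng.normal(-2.0, 1.0, 500)     # e.g., a feature during "ice" samples
normal_samples = rng.normal(1.0, 1.5, 500)   # the same feature during "normal" samples
print(f"Fisher distance: {fisher_distance(ice_samples, normal_samples):.3f}")
```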
Hence, the answer to the question posed earlier, “Can the predictors trained on one turbine in a wind farm be used to conduct icing predictions for all the other turbines in the same wind farm?”, is that the predictors trained on one turbine can be used to conduct predictions on the other turbines only if the distributions of the SCADA data features used for training are close to the distributions of the features for the other turbines. However, since there are variations in the SCADA data among the turbines in a wind farm, the distributions of the features may not be close between different turbines, and thus the predictors of one turbine cannot, in general, be generalized to the other turbines in a wind farm.

3. Farm-Level Prediction by Fusion

In this section, first, an overview of the framework reported in [9] using single-turbine SCADA data is provided to set the stage for conducting prediction at the farm level. Next, the second question stated earlier is addressed, that is, “How can the icing prediction framework of a single turbine be extended to an entire wind farm?”.
A TCN prediction model was used in our single-turbine prediction framework. The architecture of the TCN is shown in Figure 7. This deep learning model consists of convolution layers, ReLU (Rectified Linear Unit) layers, and dropout layers [16]. The convolution layer takes in SCADA feature data as an input tensor of size w_s × F, where w_s denotes the input window size and F denotes the number of features; see Figure 7. For each turbine, the best input window size and number of features are determined by carrying out grid search experiments. The output of the network is a binary value indicating the prediction outcome (1 if the prediction corresponds to the “ice” state and 0 if the prediction corresponds to the “normal” operation state). The parameters of the TCN model are listed in Table 4. Interested readers are referred to [9] for the experimentations conducted to reach these parameters.
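A minimal sketch of a TCN-style binary classifier is given below in Keras, wired with the settings listed in Table 4 (Adam optimizer, binary cross-entropy loss, learning rate 0.001, kernel size 3, dropout 0.2). The number of convolution blocks, filter counts, and dilation rates are illustrative assumptions and not the exact architecture of [9].

```python
# A minimal TCN-style sketch in Keras (assumed: number of blocks, filters, and
# dilation rates); the training settings follow Table 4 where stated.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_tcn(window_size: int, num_features: int) -> tf.keras.Model:
    inputs = layers.Input(shape=(window_size, num_features))   # w_s x F input tensor
    x = inputs
    for dilation in (1, 2, 4):                                  # assumed dilation schedule
        x = layers.Conv1D(32, kernel_size=3, padding="causal",
                          dilation_rate=dilation, activation="relu")(x)
        x = layers.Dropout(0.2)(x)                              # dropout probability (Table 4)
    x = layers.GlobalAveragePooling1D()(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)          # 1 = "ice", 0 = "normal"
    model = models.Model(inputs, outputs)
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

# Hypothetical usage with Table 4 settings:
# model = build_tcn(window_size=36, num_features=13)
# model.fit(X_train, y_train, epochs=10, batch_size=8)
```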
Figure 8 illustrates the prediction horizon, that is, a specific time in the future for which the prediction is performed. For example, if at time t_0 the user desires to predict icing one hour into the future (or 6 samples ahead, noting that samples are taken every 10 min), the features in the red (past) and green (present) boxes are used to predict the icing condition at the blue stem. This process is repeated every 10 min.
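The construction of the input windows and prediction targets illustrated in Figure 8 can be sketched as follows; the array names are hypothetical and a 10 min sampling interval is assumed.

```python
# A minimal sketch of building (input window, future label) pairs for one
# prediction horizon; `features` and `ice_labels` are hypothetical arrays
# sampled every 10 minutes.
import numpy as np

def make_windows(features: np.ndarray, ice_labels: np.ndarray,
                 window_size: int, horizon: int):
    """features: (T, F) array; ice_labels: (T,) array of 0/1 ice states."""
    X, y = [], []
    for t in range(window_size, len(features) - horizon):
        X.append(features[t - window_size:t])   # past + present samples (red/green boxes)
        y.append(ice_labels[t + horizon])       # ice state at the prediction horizon (blue stem)
    return np.array(X), np.array(y)

# One hour ahead corresponds to 6 samples ahead at a 10 min sampling interval:
# X, y = make_windows(features, ice_labels, window_size=36, horizon=6)
```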

3.1. Qualified Turbines

There are 75 turbines in the wind farm. The wind rose of each turbine is first checked in order to exclude, from farm-level icing prediction, turbines whose wind rose is narrowly defined with respect to the other turbines. For the wind farm examined, turbines T1, T13, T15, and T24 were excluded.

3.2. Rules Used for Labeling Ice Condition

For each turbine, the ice condition is labeled using the three rules in Table 5, since SCADA datasets normally do not provide ice condition labels. These three rules reflect temperature, relative humidity, and actual power, as described in [17,18]. If all three rules are met for a data sample, that data sample is labeled as an “ice” state (“1”). Otherwise, it is labeled as a “normal” state (“0”).
For an entire wind farm, ice labels need to be generated. This is necessary in order to test the farm-level predictors for accuracy and F 1 -score. Ice labels were generated based on all the turbines using a majority voting scheme as illustrated in Figure 9. At each time step, each turbine generates “ice” or “normal” labels independently. Then, a farm-level ice condition is generated using all the turbine labels via majority voting.
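The turbine-level labeling rules of Table 5 and the farm-level majority vote of Figure 9 can be sketched as follows. This is an illustrative reading of the rules, not the authors' code: expected_power_from_curve is a hypothetical lookup of the turbine power curve, and the Region 2/Region 3 split is approximated here by comparing the wind speed with the 12 m/s rated wind speed.

```python
# A minimal sketch (not the authors' code) of the labeling rules in Table 5 and
# the farm-level majority vote of Figure 9. `expected_power_from_curve` is a
# hypothetical power-curve lookup; thresholds follow Table 5.
import numpy as np

RATED_POWER_KW = 2000.0          # 2 MW turbines

def label_turbine_sample(temp_c, rel_humidity, actual_power_kw,
                         wind_speed, expected_power_from_curve) -> int:
    """Return 1 ("ice") if all three rules are met, else 0 ("normal")."""
    in_region_3 = wind_speed >= 12.0   # assumed: at or above the rated wind speed
    power_ref = RATED_POWER_KW if in_region_3 else expected_power_from_curve(wind_speed)
    ice = (temp_c < 0.0) and (rel_humidity > 85.0) and (actual_power_kw < 0.85 * power_ref)
    return int(ice)

def farm_level_label(turbine_labels: np.ndarray) -> int:
    """Majority vote over the per-turbine labels at one time step."""
    return int(np.sum(turbine_labels) > len(turbine_labels) / 2)
```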

3.3. Fusion Approaches

Two fusion approaches are proposed in this work: decision fusion and feature fusion. In decision fusion, individual predictors for each turbine make independent decisions, and these decisions are then combined to generate the farm-level decision. Majority voting is often used for this purpose, where each decision is given the same importance or weight and the overall decision is the one with the highest vote [19,20]. The decision fusion approach is illustrated in Figure 10.
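A minimal sketch of decision fusion at one time step is shown below; predictors and turbine_windows are hypothetical stand-ins for the per-turbine models (for a given prediction horizon) and their current input windows.

```python
# A minimal sketch of decision fusion: each qualified turbine's predictor votes
# independently and the farm-level decision is the majority vote. `predictors`
# and `turbine_windows` are hypothetical per-turbine Keras models and inputs.
import numpy as np

def decision_fusion(predictors, turbine_windows, threshold=0.5) -> int:
    votes = [int(model.predict(x[None, ...], verbose=0)[0, 0] > threshold)
             for model, x in zip(predictors, turbine_windows)]
    return int(np.sum(votes) > len(votes) / 2)   # farm-level icing decision
```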
In feature fusion, each of the thirteen features listed in Table 1 and Table 2 is combined across all the wind turbines by averaging the corresponding data samples before carrying out predictions. The averaged features are then used to train one single predictor per prediction horizon. In this work, the predictor architecture used is from [9]. The feature fusion approach is illustrated in Figure 11.
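Feature fusion can be sketched as a simple average over the turbine dimension, as shown below; turbine_features is a hypothetical array holding the per-turbine feature time series.

```python
# A minimal sketch of feature fusion: every feature is averaged across all
# qualified turbines at each time step, and one predictor per horizon is then
# trained on the averaged features. `turbine_features` is a hypothetical array.
import numpy as np

def fuse_features(turbine_features: np.ndarray) -> np.ndarray:
    """turbine_features: (num_turbines, T, F) per-turbine feature time series."""
    return turbine_features.mean(axis=0)   # farm-average feature series, shape (T, F)

# The fused (T, F) series is then windowed (as in the earlier sketch) and used
# to train a single TCN predictor per prediction horizon.
```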
For the farm-level ice labeling of data samples, as well as for the decision fusion of predictions, the majority voting scheme is used, which involves counting outcomes. A simple illustration of the majority voting scheme is shown in Figure 12. In example 1, the count of ones is greater than the count of zeros, leading to an output or outcome of “1”. In example 2, the count of zeros is greater than the count of ones, leading to an output or outcome of “0”.

4. Farm-Level Prediction Results

In this section, the results of the prediction of icing using decision fusion and feature fusion are presented. Comparisons are made between fusion and single-turbine approaches.
The icing prediction accuracy across the 288 prediction horizons, covering 10 min ahead to 2 days ahead, is shown in Figure 13. For each prediction horizon, our predictor made icing predictions for the time-series testing samples covering the winter season from January to April. The green curve represents the outcome of decision fusion, the blue curve represents the outcome of feature fusion, and the red curve represents a single turbine. Compared with the prediction accuracy using a single turbine, decision fusion demonstrates higher prediction accuracy and fewer fluctuations across different prediction horizons. In decision fusion, prediction is performed independently for each turbine. This lowers the chance of overlap among the prediction errors of different turbines. The predictions from all the turbines are then combined using majority voting. In other words, even if some of the turbines provide incorrect predictions, the final decision is determined by the majority of predictions. This makes the decision fusion approach more robust to prediction errors than the feature fusion approach. Feature fusion also exhibits an improvement in prediction accuracy, with fewer fluctuations, over that of a single turbine, but more fluctuations than the decision fusion approach. In feature fusion, each data feature is averaged across all turbines in the farm. Since there is only one predictor per prediction horizon, the chance of making a prediction error is higher than in the decision fusion approach, because in the latter case there are many predictors per prediction horizon.
The distributions of accuracy across the different prediction horizons are shown in Figure 14. The average accuracy and standard deviation are given in Table 6. As can be seen from this figure and table, both decision fusion and feature fusion increase the prediction accuracy and decrease the standard deviation with respect to the single-turbine case for all 288 prediction horizons. Decision fusion has the advantage of the smallest standard deviation, or fewest fluctuations, due to the smoothing resulting from combining many decisions. Feature fusion has the advantage of requiring the training of only one predictor per prediction horizon, which translates into less training time compared with decision fusion. Note that the latter requires training predictors for all the turbines in the wind farm for each prediction horizon; in this case, this results in an approximately 75-fold increase in the number of predictors.
An example prediction time series (simulating the way prediction is actually conducted in real-time) for the decision fusion approach is shown in Figure 15 for the prediction horizon of ten minutes ahead. The predicted icing time-series is plotted in green for the decision fusion, while the actual farm-level icing time-series is plotted in blue. Recall that “icing” is labeled as “one” and “no icing” is labeled as “zero”. As illustrated in these time-series plots, most of the icing events are correctly predicted by the decision fusion approach.

5. Conclusions

A framework has been introduced in this paper to predict icing at the farm level based on our recently developed Temporal Convolutional Network model using SCADA data. This is the first time icing prediction has been performed at the farm level rather than for an individual turbine. A cross-validation study has been conducted to evaluate whether predictors trained on a single turbine can be used to predict icing on other turbines in the wind farm. Then, two fusion approaches, using the SCADA data from each individual turbine, have been carried out to provide icing predictions at the farm level. The key contributions of this work are listed below:
(i)
Cross-validation experiments demonstrated that the predictors trained using the SCADA data from a single turbine can be used to predict icing using the SCADA data from another turbine in the wind farm provided that the distributions of the SCADA features for the two turbines are similar. However, when the distributions of the SCADA features are not similar, the predictors of one turbine cannot be used to predict icing on another turbine.
(ii)
Two fusion approaches are introduced to predict icing for an entire farm. Testing results indicate that both of the fusion approaches generate farm-level icing prediction accuracies that are approximately 7% higher than prediction accuracies associated with a single turbine.
(iii)
The prediction accuracies of the decision and feature fusion approaches are comparable. However, for decision fusion, the predictors from all the turbines need to be trained for each prediction horizon, whereas for feature fusion, only one predictor needs to be trained per prediction horizon. If training time is a concern, the feature fusion approach is recommended. Otherwise, the decision fusion approach is recommended, as it provides a smaller standard deviation of prediction accuracy than the feature fusion approach.
(iv)
When performing icing prediction for an entire farm, the SCADA data from all the turbines in the farm are required in order to conduct the fusion approaches. This may pose a challenge, as the data from key turbines could be missing or corrupted. Also, due to the unavailability of icing labels in typical SCADA data, the ice labeling of the data samples in this study is conducted using three rules, which can introduce errors in the ice labels. A possible future work that would improve the prediction accuracy involves more accurate ice labeling of data samples by using ice detection sensors on the turbines.

Author Contributions

Conceptualization, M.R. and N.K.; methodology, Y.Z., M.R. and N.K.; software, Y.Z.; validation, Y.Z., M.R. and N.K.; formal analysis, Y.Z., M.R. and N.K.; investigation, Y.Z., M.R. and N.K.; resources, M.R. and N.K.; data curation, Y.Z., M.R. and N.K.; writing—original draft preparation, Y.Z. and N.K.; writing—review and editing, Y.Z., M.R. and N.K.; visualization, Y.Z., M.R. and N.K.; supervision, M.R. and N.K.; project administration, M.R. and N.K.; funding acquisition, M.R. and N.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Science Foundation under Award Number 1916776, Phase II IUCRC at UT Dallas: Center for Wind Energy Science, Technology and Research (WindSTAR) and the WindSTAR IUCRC Company Members. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or WindSTAR members.

Data Availability Statement

The data presented in this study are available on request from the corresponding author due to privacy.

Acknowledgments

The authors wish to thank Xcel Energy for providing the data for this study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gao, L.; Hu, H. Wind turbine icing characteristics and icing-induced power losses to utility-scale wind turbines. Proc. Natl. Acad. Sci. USA 2021, 118, e2111461118.
  2. Fakorede, O.; Feger, Z.; Ibrahim, H.; Ilinca, A.; Perron, J.; Masson, C. Ice protection systems for wind turbines in cold climate: Characteristics, comparisons and analysis. Renew. Sustain. Energy Rev. 2016, 65, 662–675.
  3. Dai, J.; Zhang, Y.; Rotea, M.; Kehtarnavaz, N. A review of machine learning approaches for prediction of icing on wind turbines. In Proceedings of the 19th IEEE Conference on Industrial Electronics and Applications, Kristiansand, Norway, 5–8 August 2024.
  4. Kreutz, M.; Ait-Alla, A.; Varasteh, K.; Oelker, S.; Greulich, A.; Freitag, M.; Thoben, K. Machine learning-based icing prediction on wind turbines. Procedia CIRP 2019, 81, 423–428.
  5. Zhang, D.; Tian, W.; Cheng, X.; Shi, F.; Qiu, H.; Liu, X.; Chen, S. FedBIP: A federated learning-based model for wind turbine blade icing prediction. IEEE Trans. Instrum. Meas. 2023, 72, 3516011.
  6. Ge, Y.; Yue, D.; Chen, L. Prediction of wind turbine blades icing based on MBK-SMOTE and random forest in imbalanced data set. In Proceedings of the 2017 IEEE Conference on Energy Internet and Energy System Integration (EI2), Beijing, China, 26–28 November 2017; pp. 1–6.
  7. Zhang, Z.; Fan, B.; Liu, Y.; Zhang, P.; Wang, J.; Du, W. Rapid warning of wind turbine blade icing based on MIV-tSNE-RNN. J. Mech. Sci. Technol. 2021, 35, 5453–5459.
  8. Ying, L.; Xu, Z.; Zhang, H.; Xu, J.; Cheng, X. Graph Temporal Attention Network for Imbalanced Wind Turbine Blade Icing Prediction. IEEE Sens. J. 2024, 24, 9187–9196.
  9. Zhang, Y.; Kehtarnavaz, N.; Rotea, M.; Dasari, T. Prediction of Icing on Wind Turbines Based on SCADA Data via Temporal Convolutional Network. Energies 2024, 17, 2175.
  10. Ruta, D.; Gabrys, B. An overview of classifier fusion methods. Comput. Inf. Syst. 2000, 7, 1–10.
  11. Ren, Z.; Gallo, O.; Sun, D.; Yang, M.; Sudderth, E.; Kautz, J. A fusion approach for multi-frame optical flow estimation. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 2077–2086.
  12. Jeon, B.; Landgrebe, D. Decision fusion approach for multitemporal classification. IEEE Trans. Geosci. Remote Sens. 1999, 37, 1227–1233.
  13. Visualcrossing. Available online: https://www.visualcrossing.com (accessed on 13 November 2023).
  14. Schaffer, C. Selecting a classification method by cross-validation. Mach. Learn. 1993, 13, 135–143.
  15. Mika, S.; Ratsch, G.; Weston, J.; Scholkopf, B.; Mullers, K.R. Fisher discriminant analysis with kernels. In Neural Networks for Signal Processing IX: Proceedings of the 1999 IEEE Signal Processing Society Workshop (Cat. No. 98TH8468); IEEE: Piscataway, NJ, USA, 1999; pp. 41–48.
  16. Karami, F.; Zhang, Y.; Rotea, M.; Bernardoni, F.; Leonardi, S. Real-time wind direction estimation using machine learning on operational wind farm data. In Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), Austin, TX, USA, 13–17 December 2021; pp. 2456–2461.
  17. Davis, N.N.; Pinson, P.; Hahmann, A.N.; Clausen, N.E.; Zagar, M. Identifying and characterizing the impact of turbine icing on wind farm power generation. Wind Energy 2016, 19, 1503–1518.
  18. Swenson, L.; Gao, L.; Hong, J.; Shen, L. An efficacious model for predicting icing-induced energy loss for wind turbines. Appl. Energy 2022, 305, 117809.
  19. Dogan, A.; Birant, D. A weighted majority voting ensemble approach for classification. In Proceedings of the 2019 4th International Conference on Computer Science and Engineering (UBMK), Samsun, Turkey, 11–15 September 2019; pp. 1–6.
  20. Cheng, X.; Shi, F.; Liu, Y.; Liu, X.; Huang, L. Wind turbine blade icing detection: A federated learning approach. Energy 2022, 254, 124441.
Figure 1. Wind farm layout. The numbers indicate the location of each turbine.
Figure 2. Cross-validation: Predictors trained on a single turbine (red) and tested on the other turbines (blue) but not on itself (white). Evaluation of accuracy and F1-score for each tested turbine (green). For example, the first row tests the predictors trained with turbine 1 data on turbines 2 through 75.
Figure 3. Accuracy and F1-score (components of the assessment metric P_k) of the predictors trained on turbine k, k = 1, …, 75, and tested on all the other turbines in the wind farm.
Figure 4. Accuracy and F1-score of predictors (trained on T55) tested on all the other turbines.
Figure 5. Accuracy and F1-score of predictors (trained on T56) tested on all the other turbines.
Figure 6. The distribution across 288 prediction horizons (10 min to 2 days) of the prediction accuracy when T54 (green) and T56 (red) use the predictors from T55 (blue).
Figure 7. TCN architecture used for single-turbine icing prediction.
Figure 8. Illustration of the input to the prediction model, together with the prediction horizon. The input to the prediction model contains features in the past (red) and present (green). The prediction horizon defines a specific future time (blue stem) when the prediction is made.
Figure 9. Ice condition labeling scheme using majority voting (farm level).
Figure 10. Decision fusion: Each turbine makes icing predictions individually. Then, all the prediction decisions are combined via majority voting for farm-level icing prediction.
Figure 11. Feature fusion: For each feature, the data samples from all the wind turbines are combined via averaging. Then, farm-level icing prediction is achieved by one predictor per prediction horizon.
Figure 12. Illustration of the majority voting scheme.
Figure 13. Prediction accuracy across prediction horizons. Green: decision fusion, blue: feature fusion, red: single turbine (T55).
Figure 14. Histogram of prediction accuracies across prediction horizons using the decision fusion, feature fusion, and single-turbine approaches.
Figure 15. Time series of icing prediction using decision fusion with a one-step-ahead predictor.
Table 1. Features in the SCADA dataset.

Variable or Feature | Unit | Description
Power_Avg | kW | Generated power
Wind Speed | m/s | Wind speed
Gen_RPM | RPM | Generator speed
Wind Direction | degree (°) | Wind direction
Nacel_Direct | degree (°) | Nacelle direction
Blade_Pitch | degree (°) | Blade pitch angle
Yaw_Error | degree (°) | Yaw error
Temper_Nac | Celsius (°C) | Nacelle temperature
Temper_Amb | Celsius (°C) | Ambient temperature
Temper_Gen | Celsius (°C) | Generator bearing temperature
Temper_Gear | Celsius (°C) | Gear bearing temperature
Oper_State | – | Flagged for normal operating condition
Table 2. Features in the weather database.

Variable or Feature | Unit | Description
Temperature | Celsius (°C) | Air temperature from the weather database
Relative Humidity | % | Relative humidity from the weather database
Table 3. Fisher distances of the features for the training turbine (T55) and the testing turbines (T54 and T56).

Fisher Distance | Training Turbine (T55) | Testing Turbine (T54) | Training Turbine (T55) | Testing Turbine (T56)
Temp_Gear | 1.37 × 10^0 | 1.94 × 10^0 | 1.37 × 10^0 | 3.47 × 10^−1
Power_Avg | 1.14 × 10^0 | 1.67 × 10^0 | 1.14 × 10^0 | 5.53 × 10^−1
Gen_RPM | 1.00 × 10^0 | 1.30 × 10^0 | 1.00 × 10^0 | 4.17 × 10^−1
Temp_Gen | 9.14 × 10^−1 | 1.66 × 10^0 | 9.14 × 10^−1 | 3.96 × 10^−1
Relative Humidity (weather station) | 7.66 × 10^−1 | 7.41 × 10^−1 | 7.66 × 10^−1 | 6.37 × 10^−1
Wind Speed | 3.41 × 10^−1 | 9.60 × 10^−1 | 3.41 × 10^−1 | 2.07 × 10^−1
Blade_Pitch | 2.62 × 10^−1 | 2.26 × 10^−1 | 2.62 × 10^−1 | 1.28 × 10^−1
Temper_Nac | 4.34 × 10^−2 | 5.21 × 10^−3 | 4.34 × 10^−2 | 1.53 × 10^−1
Temperature (weather station) | 1.72 × 10^−2 | 1.53 × 10^−2 | 1.72 × 10^−2 | 2.88 × 10^−1
Yaw_Error | 1.26 × 10^−2 | 1.46 × 10^−2 | 1.26 × 10^−2 | 8.68 × 10^−4
Temper_Amb | 2.77 × 10^−3 | 4.58 × 10^−3 | 2.77 × 10^−3 | 6.12 × 10^−4
Nacel_Direct | 2.62 × 10^−3 | 2.07 × 10^−4 | 2.62 × 10^−3 | 2.90 × 10^−4
Wind Direction | 9.10 × 10^−4 | 1.85 × 10^−4 | 9.10 × 10^−4 | 1.11 × 10^−5
Table 4. TCN model parameters used.

Parameter | Value or Setting
Optimizer | Adam
Loss Function | Binary Cross-Entropy
Epochs | 10
Learning rate | 0.001
Batch size | 8
Kernel size | 3
Dropout Probability | 0.2
Table 5. Rules for labeling the ice state of data samples (turbine level).

Region 2 of Power Curve | Region 3 of Power Curve
Temperature < 0 °C | Temperature < 0 °C
Relative Humidity > 85% | Relative Humidity > 85%
Actual Power < 85% × Power Curve | Actual Power < 85% × Rated Power
Table 6. Prediction accuracy mean and standard deviation using the decision fusion, feature fusion, and single-turbine approaches.

Prediction | Average Accuracy across All Prediction Horizons (%) | Standard Deviation (%)
Decision fusion | 88.5 | 1.1
Feature fusion | 89.1 | 3.3
Single turbine | 81.2 | 4.4