1. Introduction
The number and size of wind turbines (WTs) are increasing, and operation and maintenance costs constitute up to 30% of the total energy cost of WTs [
1]. Therefore, the cost of operation and maintenance is a serious problem for most WT operators [
2].
Figure 1a shows the breakdown of operational expenditures of an offshore wind farm [
3]. An effective health monitoring system can therefore prevent large repair costs and the cost of unsupplied wind energy [
4]. WTs usually face severe weather conditions, especially offshore, e.g., extremely high or low temperatures, high humidity, high wind speeds, and direct sunlight. Moreover, WTs include many moving mechanical systems, which increases the probability of WT failure [
5].
Prioritizing WT component failures provides a deeper understanding of the maintenance scheduling problem [
6].
Figure 1b shows the primary causes of WT downtime [
7]. A report from the National Renewable Energy Laboratory investigated the failure rates of the key components of a WT [
7]. Based on this report, three elements that comprise the WT drivetrain—the
gearbox,
generator, and
main shaft/bearing—cause about 44% of the total WT downtime. Moreover, the electric parts of the WT, i.e.,
generator,
transformer,
converter, and
control system, cause about 40% of the total WT failures. Thus, the four leading causes of WT failures can be listed as
gearbox,
generator,
transformer, and
converter. Accordingly, the condition of all four is considered in this paper for proper health monitoring of WTs.
Temperature monitoring of the WT key components provides relevant information on their health condition [
8]. The temperature of the WT key components should be kept within a safe range and must not exceed it under normal operating conditions [
9]. A temperature above the safe band may indicate an anomaly in the corresponding component of the WT, e.g., rotor over-speed, aging, short circuits, or lubrication failure. Thus, temperature monitoring is an acceptable approach for health monitoring of WTs and can be applied in the diagnosis stage of a maintenance management system [
10].
Since the bearing is one of the critical components of the WT, the bearing temperature is considered for the condition monitoring of the WT in [
11]. A temperature higher than the allowed limit indicates a probable bearing malfunction, such as lubrication issues, electrical leakage through the shaft, aging, or variability of external loads. When the bearing does not operate properly, the characteristics of the lubricating oil change, which can result in higher fatigue loads [
12]. The bearing then becomes less efficient, which leads to energy losses during operation and an increase in the bearing oil temperature.
A rise in the temperature of WT components can be due to many factors. It is usually difficult to identify the main source of the abnormal temperature [
13]. If the temperature of a component increases due to a failure, it affects the temperature of the nearby components. Thus, nearby sensors may report elevated temperatures even though there is no problem in their own components, which are merely affected by the neighboring component [
14]. In practice, it is very difficult to identify the main cause of abnormal temperatures across all of the sensors. Therefore, a coordinated monitoring system should record the temperature of all components and perform an integrated assessment. In addition, because WTs are installed in different locations and their sensor data are received via the SCADA system, interpreting the SCADA data and analyzing the alarms in a trustworthy way can be another problem in this domain [
15].
A bearing degradation model is proposed in [
16] to estimate the real-time remaining useful life (RUL). In this method, the SCADA data provide the WT bearing temperatures, and the relative temperature is calculated by means of a moving average. The performance degradation model is built using a Wiener process with linear fluctuations. The parameters of this model are tuned using the maximum likelihood estimation method. The results of this study indicated that the real-time RUL estimation method can be more effective than traditional methods. The time at which the bearing temperature first exceeds the safe band is described by an inverse Gaussian distribution, which can help improve WT operation and maintenance strategies [
17].
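As a rough illustration of this kind of degradation model (not the implementation of [16]), the sketch below assumes a Wiener process with linear drift, X(t) = x0 + mu*t + sigma*B(t), and estimates mu and sigma^2 by maximum likelihood from the increments of the signal. For such a process with positive drift, the first time a threshold is crossed follows an inverse Gaussian distribution whose mean is the remaining distance to the threshold divided by the drift. The function names, the threshold, and the sample values are hypothetical.

```python
def fit_linear_wiener(times, values):
    """Maximum likelihood estimates of drift (mu) and diffusion (sigma^2)
    for an assumed model X(t) = x0 + mu*t + sigma*B(t)."""
    dts = [t1 - t0 for t0, t1 in zip(times, times[1:])]
    dxs = [x1 - x0 for x0, x1 in zip(values, values[1:])]
    mu = sum(dxs) / sum(dts)
    sigma2 = sum((dx - mu * dt) ** 2 / dt for dx, dt in zip(dxs, dts)) / len(dxs)
    return mu, sigma2


def mean_rul(current, threshold, mu):
    """Mean first-passage time to the threshold (mean of the inverse
    Gaussian distribution); infinite if the drift is not positive."""
    if mu <= 0:
        return float("inf")
    return max(threshold - current, 0.0) / mu


# Illustrative values only: time in hours, smoothed relative temperature in °C.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
relative_temp = [1.0, 1.4, 1.9, 2.1, 2.8]
mu_hat, sigma2_hat = fit_linear_wiener(times, relative_temp)
rul_hours = mean_rul(relative_temp[-1], threshold=5.0, mu=mu_hat)
```

Re-fitting the parameters each time a new sample arrives is what would make such an estimate "real-time"; the moving-average smoothing step is assumed to have been applied before the fit.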
The converter is one of the important components in a WT and has a considerable failure rate. Thus, detection or prediction of upcoming failures is crucial in WT condition monitoring systems. The authors in [
18] present a method for WT converter fault detection using convolutional neural network models that are developed using data from the WT SCADA system. The proposed method begins with the selection of fault indicator variables, and then the fault indicator variable data are extracted from the WT SCADA system. The convolutional neural network models use the generated data to extract features from the radar charts and analyze feature characteristics for fault detection. Power transformers in WTs are exposed to mechanical, thermal, and electrical stresses during the operation period. The authors in [
19] propose an improved aging model of transformers using the Frequency Response Analysis (FRA) method to detect faults and locate mechanical deformations of their live parts; a correlation function is used to determine the level of the detected fault. Another important component of a WT is its generator. Since the generator includes both mechanical and electrical parts, it contributes considerably to WT failures. The authors of [
20] develop a test case for the detection of damage to the slip rings of the WT generator. A principal component regression is applied to the temperature measured at the slip ring. Moreover, using the data collected at nearby WTs on the farm, it is possible to identify the incoming fault approximately one day before the occurrence of a failure [
20]. The transformer is another important component in a WT. The most common fault of WT transformers is combustion, and an abnormal increase in electrical resistance has frequently been detected in a large number of windings and lead bars. To address these problems, the authors of [
21] propose a series of characterization methods to investigate the assembly structure, matrix materials, and macro/microscopic morphologies of failed transformers. A temperature simulation experiment was also carried out in [
21] to evaluate normal operating conditions. The analysis results in [
21] showed that improper installation, unreasonable design, unqualified fabrication, and improper maintenance were the main causes of WT transformer failure.
The relationship between WT component temperature variation and WT health condition was investigated in [
22]. It has been shown that the probability of failure occurrence can be calculated by studying the overheating behavior of the WT’s bearing. The outcomes of [
22] indicated that it is possible to predict a failure as early as one month in advance. Bayesian inference can be appropriate for the prediction of WT failures, and it offers a trade-off between model performance and computational efficiency. The contributions of this paper are listed as follows:
Introducing an optimal risk-based methodology for WT condition monitoring;
Proposing an artificial neural network-based model for estimating the normal condition of WT key components;
Presenting a real-time risk indicator, which is used in the health monitoring and anomaly detection of WT.
In this paper, an optimal temperature-based condition monitoring approach is proposed for WTs. In the first stage, the normal condition of the WT’s key components, i.e., gearbox, converter, generator, and transformer, is estimated through an artificial neural network model. In the second stage, the deviation of the real-time measurements from the estimated values is calculated. The estimated values represent the healthy conditions of the WT components, and any deviation from these reference values can be marked as an anomaly. A risk indicator is also introduced, which is calculated on the basis of a safe band. The safe band represents the maximum acceptable deviation between the real-time temperature measurement and the estimated normal temperature of a WT key component. Therefore, the calculation of the safe band plays an important role in the calculation of the risk indicator. One of the main contributions of this paper is a flowchart and model for the optimal calculation of the safe band to increase the accuracy of the WT condition monitoring system. Finally, the effectiveness of the model is demonstrated using real data from an offshore wind farm in Germany.
The rest of the paper is structured as follows:
Section 2 presents background on different WT maintenance strategies. The conceptual framework of the proposed condition monitoring model is explained in
Section 3, and its corresponding mathematical modeling is addressed in
Section 4. The results of numerical studies are discussed in
Section 5 and finally
Section 6 concludes the paper.
3. Optimal Temperature-Based Condition Monitoring Framework
Today, several information sources exist within a wind farm that can assist in decision-making during a maintenance scheduling process, but sometimes they are not sufficiently integrated into a comprehensive model of the system. This section describes an integrated predictive maintenance framework in which several tools are combined to assist asset management in a wind farm. In order to make a trade-off between enhancing the short-term reliability of each individual WT and reducing the maintenance costs, an optimal temperature-based condition monitoring framework is proposed, which is shown in
Figure 3. As seen in
Figure 3, the proposed framework includes four main parts, which are explained in detail as follows:
Input-Data Preparation Unit: This unit provides the input data for the proposed model. There are two sources of data, i.e., the historical (logged) data and the real-time measurement data. The output of this unit includes not only the real-time measurements of the different sensors, which are suitable for the real-time calculations, but also the historical data used for training the ANN. These input data consist of ambient temperature (°C), wind speed (m/s), nacelle temperature (°C), and the active generated power (kW) of the wind turbine.
Normal Operation Estimation Unit: In this stage, the input data from the previous unit are fed into four independent neural network predictors, which are trained on data from normal operation of the WT, i.e., when there are no reported component alarms. The goal of this unit is to estimate the normal (or healthy) condition of the system. Thus, in this stage, we are able to determine the healthy conditions of the WT main components based on the operating conditions. As indicated in the Introduction, temperature monitoring is an appropriate way of analyzing the health condition of the WT main components. All four predictors receive the same input data, prepared in the previous unit, and each predicts a different temperature, i.e., the temperature of the gearbox oil, converter, generator winding, or transformer oil.
Optimal Safe-Band Calculator: The safe band represents the maximum acceptable or tolerable deviation between the real-time temperature measurement and the estimated normal conditions for the temperature of the WT key components. Therefore, the calculation of the safe band plays an important role in calculating the risk indicator and enhancing the precision of the WT condition monitoring system. In this unit, the dependency between the historical alarms and the risky conditions of the WT is evaluated for different safe-band values.
Health Monitoring and Anomaly Detection Unit: This stage is the heart of the proposed framework, which is shown in
Figure 3. The Normal Operation Estimation Unit provides temperature data as a benchmark to represent the healthy condition of the WT main components, and in parallel the real-time values of these parameters are entered into the
Health Monitoring and Anomaly Detection Unit from the measurement units. The
real-time values—from the sensors/measurement devices—and
normal estimated values are compared with each other, and the deviation between these two values is calculated. A larger deviation represents a riskier situation. These deviations between real and expected conditions may be more or less severe for the component’s life. Another important contribution of this paper is the risk indicator, which is calculated based on these deviations. It can be interpreted as a symptom leading to possible failure modes.
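The comparison performed in the last unit can be sketched as follows. This is a minimal illustration rather than the paper's implementation: the exact form of the deviation is given by Equation (4), which is not reproduced in this section, so the function below assumes that a deviation is the amount by which a measurement exceeds the estimated normal temperature plus the safe band, and is zero otherwise. The names `temperature_deviation`, `measured`, `estimated`, and `safe_band`, as well as all numbers, are hypothetical.

```python
def temperature_deviation(measured, estimated, safe_band):
    """Assumed form of the deviation: excess of each measurement over
    (estimated normal value + safe band), clipped at zero."""
    return [max(0.0, m - (e + safe_band)) for m, e in zip(measured, estimated)]


# Illustrative temperatures in °C for the four monitored components.
# The "estimated" series stand in for the outputs of the four ANN predictors.
estimated = {
    "gearbox_oil": [55.0, 56.0, 57.0, 58.0],
    "converter": [40.0, 40.0, 41.0, 41.0],
    "generator_winding": [70.0, 71.0, 72.0, 72.0],
    "transformer_oil": [50.0, 50.0, 51.0, 51.0],
}
measured = {
    "gearbox_oil": [57.0, 70.0, 66.0, 75.0],
    "converter": [42.0, 43.0, 44.0, 45.0],
    "generator_winding": [75.0, 85.0, 90.0, 80.0],
    "transformer_oil": [52.0, 53.0, 52.0, 54.0],
}

deviations = {
    name: temperature_deviation(measured[name], estimated[name], safe_band=10.0)
    for name in estimated
}
```

Each component is evaluated independently against its own estimate, which mirrors the four independent predictors described above.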
5. Results
The proposed model has been applied to real data from an offshore wind farm in Germany, which includes 30 WTs. The data were logged for two years, from October 2016 to October 2018, at a sampling interval of 10 min. We used the first 18 months of the data set for training and the last 6 months as test data. The results shown in this section are obtained from the test data.
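The chronological split described above can be sketched as follows. The exact first and last timestamps are not stated in the text, so the boundaries below (1 October 2016, 1 April 2018, 1 October 2018) are assumptions, and `chronological_split` is a hypothetical helper. The point of the sketch is that the split is made by time rather than by random sampling, so that no test sample precedes a training sample.

```python
from datetime import datetime, timedelta

SAMPLE_STEP = timedelta(minutes=10)  # sampling interval stated in the text


def chronological_split(timestamps, split_time):
    """Return (train, test): samples strictly before and from split_time on."""
    train = [t for t in timestamps if t < split_time]
    test = [t for t in timestamps if t >= split_time]
    return train, test


start = datetime(2016, 10, 1)      # assumed start of logging
end = datetime(2018, 10, 1)        # assumed end of logging (exclusive)
split_time = datetime(2018, 4, 1)  # 18 months after the assumed start

timestamps = []
current = start
while current < end:
    timestamps.append(current)
    current += SAMPLE_STEP

train, test = chronological_split(timestamps, split_time)
train_fraction = len(train) / len(timestamps)
```

With these assumed boundaries, roughly three quarters of the samples of one WT fall into the training part.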
Figure 6 presents the input data of the proposed framework.
Figure 6a shows the wind speed (m/s) and the active generated power (kW) of a WT. The ambient and nacelle temperatures (°C) are shown in
Figure 6b. As shown in this figure, the temperature inside the nacelle is about 15–20 °C higher than the ambient temperature, which results from the operation of the WT and the cooling system.
As can be seen in
Figure 7, the optimization coefficient is maximized at a particular value of the safe band. Thus, the optimal value for the safe band is 10.8 °C. This value of the safe band is used in the remaining simulations.
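The search for the safe band that maximizes a coefficient can be sketched as a simple grid search. The definition of the optimization coefficient is not given in this section, so the sketch below substitutes a stand-in score: the fraction of time steps at which "measurement exceeds estimate plus safe band" agrees with the historical alarm record. The functions `agreement` and `best_safe_band`, the candidate values, and the toy data are all hypothetical.

```python
def agreement(flags, alarms):
    """Stand-in score: share of time steps where the flag matches the alarm."""
    return sum(f == a for f, a in zip(flags, alarms)) / len(alarms)


def best_safe_band(measured, estimated, alarms, candidates):
    """Evaluate each candidate safe band and return the best (band, score)."""
    best_band, best_score = None, -1.0
    for band in candidates:
        flags = [m - e > band for m, e in zip(measured, estimated)]
        score = agreement(flags, alarms)
        if score > best_score:
            best_band, best_score = band, score
    return best_band, best_score


# Toy data in °C; alarms mark time steps with a recorded historical alarm.
estimated = [60.0] * 8
measured = [62.0, 68.0, 72.0, 75.0, 63.0, 71.0, 69.0, 74.0]
alarms = [False, False, True, True, False, True, False, True]
candidates = [2.0, 6.0, 10.0, 13.0, 16.0]

band, score = best_safe_band(measured, estimated, alarms, candidates)
```

A band that is too narrow flags normal fluctuations as risky, while a band that is too wide misses real alarms, which is why a score of this kind peaks at an intermediate value.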
Figure 8 presents the results of the study, which include both the normal conditions of the WT key components and the corresponding deviations. In
Figure 8, the green line shows the estimated value, which represents the normal or healthy conditions of the system, as well as the safe band, which lies 10.8 °C above the estimated value. The red line shows the measurement data. The deviation, which is defined by Equation (
4), is shown on the second axis in blue. As can be seen in
Figure 8, if the measurement data (red line) exceed the upper limit of the safe band, a deviation is reported.
The accuracy of “detecting failures” compared with the “historical failures” during the test period differed between the WT components: 94% for the gearbox, 91% for the transformer, 90% for the generator, and 87% for the converter.
A closer look at
Figure 8 reveals different characteristics and behaviors in the variation of the real-time measurements and the estimated data. For example, the variation of the real-time measurement data in
Figure 8c,d is much higher than that of the estimated data, which vary more smoothly. In addition, the variation of the estimated data in
Figure 8b, especially on 25 and 27 August, is greater than that of the real-time measurement data. Assuming that the sensor data do not meet the frequency requirements of the thermal signal, this may be due to the thermal inertia of the system, e.g., the oil temperature in
Figure 8b. Thus, the question arises whether the high-frequency variation of the oil temperature estimate is feasible. To answer this question, we tried to account for the thermal inertia in the proposed ANN model (
Figure 4). However, from
Figure 8a,b, it can be seen that this does not work ideally. In other words, because the model accounts for the thermal inertia by estimating the current values based on the previous values, smoother variations were expected. However, as can be seen in
Figure 8a,b, this was not fully achieved. For this reason, as future work, we have begun to study the thermal inertia of the oil temperature in both the gearbox and the transformer in physical terms, and not only in mathematical terms, in order to analyze the estimated values and develop a physical model to examine whether oil temperatures in these mechanisms can change at such a high frequency as indicated on 27 August in
Figure 8b. In fact, in this paper, we proposed an optimal value of the safe band to cover the mismatch between the variation frequency of the real-time measurements and that of the estimated normal values. However, the main concern regarding this mismatch is the additional deviation data (blue bars) in
Figure 8a,b (for example, those for days 23–24 and 28–29 in
Figure 8b). To mitigate the impact of this problem, we proposed a cumulative risk indicator in (
6) to reduce the impact of these small deviations. In other words, the proposed cumulative risk indicator is affected when deviations persist over a period of time and not just at a single instant.
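The effect of accumulating deviations can be illustrated with a short sketch. Equations (5) and (6) are not reproduced in this section, so the code below simply assumes that the cumulative indicator is a running sum of the deviations; the actual definition may include weighting or normalization. The function name and the two deviation series are hypothetical.

```python
from itertools import accumulate


def cumulative_risk(deviations):
    """Assumed form of a cumulative indicator: running sum of deviations."""
    return list(accumulate(deviations))


# An isolated small deviation versus deviations that persist and grow (°C).
isolated = [0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
sustained = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5]

risk_isolated = cumulative_risk(isolated)
risk_sustained = cumulative_risk(sustained)
```

Under this assumption, the isolated deviation shifts the indicator once and leaves it flat afterwards, whereas persistent deviations produce a steadily increasing slope, which is the behavior the indicator is meant to highlight.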
Figure 9 illustrates the variation of the risk indicator over four months of the test period with respect to the temperature deviations of the gearbox, generator, converter, and transformer from the normal condition model that estimates their healthy operating condition. Using the deviation information included in
Figure 8 and using Equations (
5) and (
6), the values of the risk indicator have been calculated. The representation of these values in
Figure 9 covers four months and makes it possible to detect abnormal temperature deviations in the key components of the WT. Focusing on the proposed risk indicator of the gearbox (blue line), a large increase can be seen in the second week of July. The gearbox did not show a considerable deviation from its normal expected condition for about the first three weeks of July 2018. However, around the second week of July 2018, something happened in the gearbox that caused high deviation values and an important change in the value of the risk indicator. A certain amount of stress associated with this failure mode appeared. A process of rapid degradation was observed from this moment, with a progressive increase in both the risk indicator values and the slope of the curve. An interesting point regarding the gearbox is that it operated without any failure during the entire four-month period. However, the change in the slope of the gearbox risk indicator indicates the need for an appropriate maintenance program as soon as possible to prevent severe damage to the gearbox and, consequently, large maintenance costs.