#### *3.2. Using the Local Regression Method (LRM) and the Smooth-Line Approach*

Next, using the first 1024 data entries in each dataset (data sequence), we fitted a smooth line representing the trend of the data points with the local regression method (LRM). In Appendix D, this line is plotted in bold blue in Figures A4–A6 for different wind-speeds under the 3.0-blade (full blades), 2.5-blade and 2.0-blade turbine settings, respectively. For reasons of space, only the subfigures used in the subsequent discussion are reproduced here in Figures 5–7.

**Figure 5.** Magnitude and trend of acceleration for the full (normal) 3.0-blade turbine. (**a**) Based on datasets recorded along the X axis with no wind imposed. (**b**) Based on datasets recorded along the Y axis with no wind imposed. (**c**) Based on datasets recorded along the Z axis with no wind imposed.

**Figure 6.** Magnitude and trend of acceleration for the 2.5-blade (partially broken) turbine. (**a**) Based on datasets recorded along the X axis with no wind imposed. (**b**) Based on datasets recorded along the Y axis with no wind imposed. (**c**) Based on datasets recorded along the Z axis with no wind imposed.

**Figure 7.** *Cont*.

In these visualisations, each original data point is drawn as a small grey circle rather than a dot, so that the density of the data points can be seen from their overlaps. Presented this way, the period of the waveform roughly coincides with the interval between adjacent 'peaks', or between adjacent 'valleys', of the original acceleration data. In most of these figures (except for the acceleration signals recorded along the X axis and the Y axis for the 2.5-blade turbine with no wind, i.e., Figure 6a,b), a waveform consisting of several repeated waves with a highly stable period appeared.

However, although some 2.5-blade cases could be distinguished easily, the 3.0-blade cases and the 2.0-blade cases looked similar. No rule to differentiate the three cases could therefore be established by eye at this stage, and such a vision-based process would in any case lack a mathematical foundation. The only useful observation here is the stable period: in each subfigure, adjacent peaks (and adjacent valleys) were almost equally spaced, so the peak-to-peak and valley-to-valley intervals were consistent.
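The equal-spacing observation can be checked numerically rather than by eye. The following is a minimal sketch, assuming the waveform has already been smoothed; the function names, the tolerance and the synthetic sine wave are illustrative assumptions and are not taken from the turbine datasets.

```python
import math

def peak_intervals(wave):
    """Return the sample distances between consecutive strict local maxima."""
    peaks = [i for i in range(1, len(wave) - 1)
             if wave[i] > wave[i - 1] and wave[i] > wave[i + 1]]
    return [b - a for a, b in zip(peaks, peaks[1:])]

def is_equally_spaced(intervals, tolerance=1):
    """True if all intervals agree to within `tolerance` samples.

    The tolerance is a hypothetical parameter, not a value from the study.
    """
    return bool(intervals) and max(intervals) - min(intervals) <= tolerance

# Synthetic waveform with a period of 16 samples (not the turbine data).
wave = [math.sin(2 * math.pi * j / 16) for j in range(64)]
intervals = peak_intervals(wave)
print(intervals, is_equally_spaced(intervals))
```

The same check applies to valleys by negating the waveform before calling `peak_intervals`.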

However, from these graphs we also saw that (1) the number of data points outside the waveform, (2) the degree to which they were outside the waveform, (3) the concentration of these data points and (4) the proportion of data outliers were all different (i.e., they varied from case to case). For example, in general, on the same axis, there were more acceleration data outliers in Figure A5 for the 2.5-blade setting than in Figure A4 for the 3.0-blade setting and in Figure A6 for the 2.0-blade setting.

Take, as an arbitrary example, the settings with a wind-speed of 0.6 m/s. In Figure A5a (upper-right subfigure), there are 14 outliers (out of the first 1024 data entries), according to the standard outlier criterion below:

$$\begin{cases} Outliers(VD) = \mathcal{U}(VD) \setminus \{ vd \in VD \mid vd \in \left[ Q_1 - 1.5 \times IQR,\; Q_3 + 1.5 \times IQR \right] \} \\ IQR = Q_3 - Q_1 \end{cases}$$

Under the same wind-speed, the upper-right subfigure in Figure A4a (no blade broken) contains no outliers (0). Surprisingly, the upper-right subfigure in Figure A6a (a full blade broken), again with the same wind-speed imposed, contains no outliers (0) either.
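The outlier count used in this comparison can be sketched as follows. This is only an illustration of the 1.5 × IQR criterion: the quartile convention (`method="inclusive"`) is an assumption, since the text does not state which one was used, and the sample values are made up rather than taken from the recorded data.

```python
import statistics

def iqr_outliers(values):
    """Return the values lying outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    # quantiles(..., n=4) returns the three quartile cut points (Q1, Q2, Q3).
    # The 'inclusive' interpolation method is an assumed convention.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]

# Made-up sample with one obviously extreme value (not the turbine data).
sample = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 5.0]
outliers = iqr_outliers(sample)
print(len(outliers), outliers)
```

Applied to the first 1024 entries of a data sequence, `len(iqr_outliers(...))` would give the kind of count reported above, although the exact number can vary slightly with the quartile convention.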

In addition, in Appendix D in general, on the X axis and the Y axis, apart from the acceleration magnitude ranges, it seems that accelerations with the 2.0-blade setting (e.g., Figure 7a,b) were more concentrated than those with the 3.0-blade setting (e.g., Figure 5a,b). Combined with the previous observations, these provided further clues for obtaining the effective information for constructing the judgement rules in the next subsection.

#### *3.3. Further Transformation: Resampling and Resmoothing*

In this part of the analysis, we found that the initial 1024 data points in each data sample (i.e., the 'data digest') were sufficient to establish the final rules to identify the different (faulty) situations of turbine blade malfunction after further data transformations using resampling and resmoothing. After conducting experiments, we also found that resampling eight consecutive data points as a representative one (i.e., eight original sampling time units as a 'clock') was appropriate. Therefore, three processes were run as follows.

First, each shortened dataset was resampled using the new clock. We let the original dataset be $a(t)$, where $t = t_j$, $j = 0, 1, \ldots, 1023$ was the original sampling time sequence, and then defined the resampled dataset as $a(\overline{t})$, where $\overline{t}$ was the redefined clock sequence and $\overline{t}_i = \{t_{8i}, t_{8i+1}, t_{8i+2}, \cdots, t_{8i+7}\}$, $i = 0, 1, \ldots, 127$ (i.e., 1024/8 = 128 clocks). For each clock, the information from the original data was preserved as follows:

$$m(\overline{t}_i) = \sum_{j=0}^{7} \frac{a(t_{8i+j})}{8},$$

$$v(\overline{t}_i) = \frac{1}{8} \sum_{j=0}^{7} \left( a(t_{8i+j}) - m(\overline{t}_i) \right)^2,$$

where $m(\overline{t}_i)$ and $v(\overline{t}_i)$ are, respectively, the mean and the variance of the original data sequence within each clock of the redefined clock sequence $\overline{t}$.
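As an illustration of this resampling step, the sketch below groups a sequence into clocks of eight samples and computes the per-clock mean and variance, using the 1/8 normalisation of the formulas above. The function name and the synthetic input are assumptions made for the example only.

```python
def resample_clocks(a, width=8):
    """Group consecutive samples into clocks of `width` samples.

    Returns two lists: the per-clock means and the per-clock variances
    (population variance, i.e. divided by `width`, matching the formulas).
    A trailing partial clock, if any, is dropped.
    """
    n_clocks = len(a) // width
    means, variances = [], []
    for i in range(n_clocks):
        block = a[width * i: width * (i + 1)]
        m = sum(block) / width
        v = sum((x - m) ** 2 for x in block) / width
        means.append(m)
        variances.append(v)
    return means, variances

# Synthetic 1024-sample sequence (not the turbine data): 0, 1, ..., 7 repeated.
a = [float(j % 8) for j in range(1024)]
means, variances = resample_clocks(a)
print(len(means), means[0], variances[0])
```

With 1024 input samples and a clock width of eight, the result contains 128 clocks, indexed 0 to 127.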

Next, the curve-like piecewise line was fitted using LRM; it is in fact a predictor that also produces the 'theoretical value' of the acceleration at any specified time point *t*, i.e., *P*(*t*). Therefore, when the data were reconsidered on the new 'clock', the same clock was also applied to the predictor function. We denote this new predictor by $\overline{P}(\overline{t})$, where $\overline{t}$ is the redefined clock sequence and $\overline{t}_i = \{t_{8i}, t_{8i+1}, t_{8i+2}, \cdots, t_{8i+7}\}$. Thus, $\overline{P}(\overline{t})$ can simply be redefined as:

$$
\overline{P}(\overline{t}) = \overline{P}(m(\overline{t}_i)).
$$

Note that, in the above equation, the function $\overline{P}(\cdot)$ is identical to $P(\cdot)$ because both use local regression as the smoother function.
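The text does not specify which LRM implementation or parameters were used, so the following is only a sketch of one common variant: a LOESS-style local linear fit with tricube weights and no robustness iterations. The function name and the `frac` parameter (the fraction of points used in each local fit) are assumptions.

```python
def local_linear_smooth(x, y, frac=0.3):
    """Smooth y over x with a tricube-weighted local linear fit at each x[i].

    `frac` is the fraction of points in each local neighbourhood
    (a hypothetical default, not a value from the study).
    """
    n = len(x)
    k = max(2, min(n, round(frac * n)))  # neighbours per local fit
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1.0  # bandwidth: distance to the k-th nearest point
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Sanity check on exactly linear data: a local linear fit should reproduce it.
xs = [float(i) for i in range(20)]
ys = [2.0 * xi + 1.0 for xi in xs]
smoothed = local_linear_smooth(xs, ys)
print(max(abs(s - yi) for s, yi in zip(smoothed, ys)))
```

In the setting described here, `x` would be the clock indices and `y` the per-clock means, so that the same smoother serves for both the original and the resampled sequences.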

Third, the computational results are rendered in Figures A7–A9 for the 3.0-blade, 2.5-blade and 2.0-blade settings, respectively. Once again, for these settings, only the no-wind cases on the three axes are presented in Figures 8–10 for simplicity. These simplified plots were sufficient to clarify the features of the different faulty blade cases and to establish the rules for distinguishing them automatically.

**Figure 8.** Means and peak/valley variances of resampled data for the 3.0-blade turbine. (**a**) Based on datasets recorded along the X axis with no wind imposed. (**b**) Based on datasets recorded along the Y axis with no wind imposed. (**c**) Based on datasets recorded along the Z axis with no wind imposed.

**Figure 9.** Means and peak/valley variances of resampled data for the 2.5-blade turbine. (**a**) Based on datasets recorded along the X axis with no wind imposed. (**b**) Based on datasets recorded along the Y axis with no wind imposed. (**c**) Based on datasets recorded along the Z axis with no wind imposed.

**Figure 10.** Means and peak/valley variances of resampled data for the 2.0-blade turbine. (**a**) Based on datasets recorded along the X axis with no wind imposed. (**b**) Based on datasets recorded along the Y axis with no wind imposed. (**c**) Based on datasets recorded along the Z axis with no wind imposed.

As can be seen, these plots provided clear ways to compare and establish the rules to identify the two malfunctioning blade cases against that with full blades working normally.

#### **4. Establishment of Rules and Discussion**

Based on these results, the rules to judge whether a blade on the wind turbine was half-broken (i.e., the 2.5-blade case), normal with full-blades running (i.e., the 3.0-blade case), or completely missing a blade (i.e., the 2.0-blade case), could be established.

#### *4.1. First Rule to Judge the 2.5-Blade Case*

First, certain features of the 2.5-blade cases emerged as salient. From the graphs in Figure A8 (e.g., Figure 9b), it was easily observed that many of the 2.5-blade cases had distorted or abnormal waveshapes after the data points (*t*, *a*) were smoothly interpolated using LRM, compared with the 3.0-blade or 2.0-blade cases in Figures 8b and 10b. The extreme case is the plot for the 2.5-blade, Y axis, no-wind setting, while less pronounced cases are the 2.5-blade, X axis, no wind; 2.5-blade, X axis, wind-speed = 12 m/s; 2.5-blade, X axis, wind-speed = 18 m/s; 2.5-blade, Y axis, wind-speed = 12 m/s and 2.5-blade, Y axis, wind-speed = 18 m/s settings, as seen in Figure A8. Almost all of these cases arose in the data series recorded on the X and Y axes, although the level of the peaks/valleys was also slightly jittered in the Z-axis data at certain wind-speeds, e.g., '2.5-blade, Z axis, no wind' and '2.5-blade, Z axis, wind-speed = 18 m/s'.

However, despite its simplicity, a rule based on the visualisation process was difficult to implement because it was established through human-based pattern recognition. For example, to what extent could the so-called 'distortion' and 'abnormality' be justified for a wave-like plot? Therefore, a numerical rule needed to be established so that an algorithm could be implemented based on the results for automatic detection in the future. This relied on the true mean and variance values of the acceleration source data that corresponded to a peak or valley in a resampled and smoothed waveform, which were displayed as bold vertical red line segments in the plot. Each such line ranged from (mean − variance) to (mean + variance) of a source data slice associated with some peak or valley.

As can be observed, in general, the red line segments were longer in the figures plotted based on the X-axis and Y-axis vibration data subject to the 2.5-blade setting, no matter how great the wind-speed imposed on the turbine, compared to those plotted based on the X-axis and Y-axis vibration data subject to either the 2.0-blade setting or the 3.0-blade setting. Moreover, no such situation was found for the figures plotted based on the Z-axis vibration data subject to the same 2.5-blade setting.

As such, this feature (Rule 1), i.e., occurrences of the long red line segments around the peaks and valleys of the LRM-smoothed and resampled waveform that appear for the X-axis and Y-axis vibration data under different wind-speed settings, can be used to identify whether half (0.5) a blade on the wind turbine, or a part of a blade, is broken (i.e., the 2.5-blade case).
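Rule 1 is stated qualitatively, so any numerical version requires choices that the text does not make. The sketch below is one possible reading: it locates the peaks and valleys of the smoothed waveform, takes the length of each red segment as twice the variance of the corresponding clock (from mean − variance to mean + variance), and flags the case when the average length exceeds a threshold. The threshold, the use of an average and all names are hypothetical.

```python
def extrema_indices(smoothed):
    """Indices of strict local maxima (peaks) and minima (valleys)."""
    return [i for i in range(1, len(smoothed) - 1)
            if (smoothed[i] > smoothed[i - 1] and smoothed[i] > smoothed[i + 1])
            or (smoothed[i] < smoothed[i - 1] and smoothed[i] < smoothed[i + 1])]

def rule1_flags_partial_break(smoothed, variances, threshold):
    """True if the red segments at peaks/valleys are, on average, 'long'.

    smoothed  : smoothed waveform, one value per clock
    variances : per-clock variances of the source data, same length
    threshold : hypothetical cut-off that would need calibration on real data
    """
    idx = extrema_indices(smoothed)
    if not idx:
        return False
    # Each segment runs from (mean - variance) to (mean + variance).
    lengths = [2 * variances[i] for i in idx]
    return sum(lengths) / len(lengths) > threshold

# Made-up waveform and variances (not the turbine data).
waveform = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0]
normal_case = rule1_flags_partial_break(waveform, [0.01] * 7, threshold=0.2)
broken_case = rule1_flags_partial_break(waveform, [0.5] * 7, threshold=0.2)
print(normal_case, broken_case)
```

Because the rule refers to the X-axis and Y-axis data, such a check would be run on those two axes; how to combine the two results is left open here.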

#### *4.2. Second Rule to Judge the 2.0-Blade Case*

Next, since the 2.5-blade case could be excluded using the above rule, the remaining problem was how to distinguish the 2.0-blade (one blade totally missing) case from the 3.0-blade case. The clue was that a turbine with full blades (the 3.0-blade setting) should be heavier than the same turbine with a blade totally missing (the 2.0-blade setting) when there is no wind (i.e., static, with no other conditions changing). That is, unlike the vibration data recorded along the X axis or Y axis, the Z-axis data correspond to the vertical force (i.e., the weight factor of the turbine) exerted by the turbine on the foundation structure and the ground. Therefore, accelerations of lower magnitude should be detected along the Z axis for a turbine with a full blade missing than for a normal (full-bladed) turbine, and this effect can be compared and displayed most clearly when no wind is imposed.

This was reflected in the experimental results. Comparing the acceleration data recorded along the Z axis with no wind (0 m/s) (see Figures 8c and 10c), the interval of the predicted acceleration values for the normal 3.0-blade case was [−1.4, −2.0] (m/s) (bounded by the peaks and valleys of the waveform). In contrast, the interval of those predictions for the 2.0-blade case was [−1.2, −1.8] (m/s), a window of lower magnitude (i.e., shifted upward, towards zero).

Therefore, the results supported our theoretical suppositions. This became the second rule, which distinguishes a turbine with a blade totally missing (i.e., the 2.0-blade case) from a normal turbine (Rule 2): if the case is not 'a part of a blade is broken' (which can be detected using the first rule), the predictive waveform identified from the Z-axis data can be used to see whether a blade is missing for any reason, by checking whether the fluctuation window of the resampled data, i.e., the interval delimited by the peaks and valleys of the smoothed function, $[\min \overline{P}(\overline{t}), \max \overline{P}(\overline{t})]$, is shifted upward (towards lower magnitudes) relative to the normal case.

#### *4.3. Discussion*

In short, the means and variances of the time-domain data recorded along the X axis or the Y axis around the peaks and valleys of the locally regressed and resampled waveform can be used to determine whether the wind turbine already has a partially broken blade (i.e., the 2.5-blade case). The bold vertical red line segments are measured based on these means and variances, and the extraordinary appearance of these line segments may indicate the faulty 2.5-blade case or perhaps harm to a blade.

Following this rule and excluding the 2.5-blade case, the fluctuation range of the predictive waveform obtained based on the data interval recorded along the Z axis when there is no wind can be used as a measure to distinguish the 2.0-blade case from the normal 3.0-blade case. When this interval rises further upward than usual, the case in which a blade is totally missing (the 2.0-blade case) can be detected.
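A numerical form of this check could look like the sketch below. It compares the window of the smoothed Z-axis waveform with a baseline window from a healthy turbine and flags the case when both bounds have moved upward by at least a minimum shift. The baseline uses the no-wind values reported for the 3.0-blade case (−2.0 to −1.4), while the minimum shift, the function names and the sample waveforms are hypothetical.

```python
def waveform_window(smoothed):
    """Return (lower bound, upper bound) of the smoothed waveform."""
    return min(smoothed), max(smoothed)

def rule2_flags_missing_blade(smoothed_z, baseline_window, min_shift):
    """True if the Z-axis window lies above the baseline window.

    baseline_window : (low, high) window of a normal turbine with no wind
    min_shift       : hypothetical minimum upward shift required for both bounds
    """
    low, high = waveform_window(smoothed_z)
    base_low, base_high = baseline_window
    return (low - base_low) >= min_shift and (high - base_high) >= min_shift

baseline = (-2.0, -1.4)  # reported window for the 3.0-blade, no-wind case

# Made-up waveforms spanning the two reported windows (not the turbine data).
normal_wave = [-2.0, -1.7, -1.4, -1.7, -2.0]
shifted_wave = [-1.8, -1.5, -1.2, -1.5, -1.8]
normal_flag = rule2_flags_missing_blade(normal_wave, baseline, min_shift=0.1)
missing_flag = rule2_flags_missing_blade(shifted_wave, baseline, min_shift=0.1)
print(normal_flag, missing_flag)
```

Both reported windows have the same width (0.6), which is why this sketch tests for a shift of the bounds rather than a change in width. As with Rule 1, the cut-off would need to be calibrated on real recordings.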

Combining these two rules, all three cases (the normal case and the two faulty cases of wind turbine fan-blade damage) can be explained systematically. As these rules are simple (with limited computational complexity), they can be used to detect these malfunctions almost in real time and to transmit the necessary warning messages to those in charge just in time, given that the vibration datasets for the wind turbines are synchronised routinely (within a short period of time) on the server side of the CMS. This greatly benefits the unplanned maintenance efforts of wind farm operations.

The Circum-Pacific Belt area is prone to super typhoons and strong typhoons, whose wind-speeds may easily reach 150 km/h or above, and thus a situation in which a wind turbine blade breaks apart or falls off completely should not be news to anyone in the green energy industry. Although the proposed set of rules does not fit the case in which a turbine totally collapses (as it is unclear whether the accelerometer and remote-side CMS would still work in this case), in most cases it can serve as a computerised remedy to detect whether a blade is broken or falling off, if the wireless transmission works.

Since another common cause of turbine damage in the studied area involves earthquakes, which are usually as unpredictable as typhoons, the maintenance by prediction mechanism is suggested as a supplement to regular predictive maintenance, even during periods of predictable weather conditions (i.e., to best control that which is controllable). In short, from the above discussion of the Circum-Pacific Belt area, and returning to the outset of this study (see Section 1.1), it should be clear that the proposed mechanism, although only one piece of the puzzle (detecting damage to the fan component of a turbine) within the overall picture of turbine maintenance, can improve the operating efficiency of turbines. This reduces maintenance costs and benefits the unplanned maintenance of wind farm operations.

Note that, in terms of digital signal processing (DSP), we established the set of detection rules based on the converted (original) time-domain data rather than on data in any other domain, e.g., frequency-domain data. Doing so not only keeps future implementation simple and keeps the entire process faithful to the original data, but also avoids possible ambiguity. For example, if the upper and lower limits of the predictive waveform window determined by the peaks and valleys are log-transformed, it becomes harder to examine whether there has been an upward shift in the interval from the 3.0-blade setting to the 2.0-blade setting (i.e., for the second rule), simply because the window in the log domain would become narrower.
