Identification and Localization of Wind Turbine Blade Faults Using Deep Learning

Davis, Mason; Nazario Dejesus, Edwin; Shekaramiz, Mohammad; Zander, Joshua; Memari, Majid

doi:10.3390/app14146319


by Mason Davis, Edwin Nazario Dejesus, Mohammad Shekaramiz *, Joshua Zander and Majid Memari

Machine Learning and Drone Lab, Electrical and Computer Engineering Department, Utah Valley University, Orem, UT 84097, USA

* Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(14), 6319; https://doi.org/10.3390/app14146319

Submission received: 13 June 2024 / Revised: 10 July 2024 / Accepted: 16 July 2024 / Published: 19 July 2024

(This article belongs to the Section Green Sustainable Science and Technology)


Abstract

This study addresses the challenges inherent in the maintenance and inspection of wind turbines through the application of deep learning methodologies for fault detection on Wind Turbine Blades (WTBs). Specifically, this research focuses on defect detection on the blades of small-scale wind turbines due to the unavailability of commercial wind turbines. This research compared popular object localization architectures, YOLO and Mask R-CNN, to identify the most effective model for detecting common WTB defects, including cracks, holes, and erosion. YOLOv9 C emerged as the most effective model, with the highest mAP50 and mAP50-95 scores of 0.849 and 0.539, respectively. Modifications to Mask R-CNN, specifically integrating a ResNet18-FPN network, reduced computational complexity by 32 layers and achieved a mAP50 of 0.8415. The findings highlight the potential of deep learning and computer vision in improving WTB fault analysis and inspection.

Keywords: YOLO; Mask R-CNN; fault detection; wind turbine blades; inspection; deep learning

1. Introduction

As the demand for renewable energy sources grows, wind power has emerged as an efficient and safe solution. This sector has experienced significant expansion in recent years, driven by governmental incentives, lower production costs, and advancements in turbine technology [1]. Within the United States, wind power accounted for 22% of new electricity capacity installed in 2022, reaching a total capacity of 90,000 turbines [2]. Globally, the trend is equally robust. Europe increased wind generation capacity by 18.3 GW in 2023 as part of their continued commitment to green energy and 2030 energy targets [3]. This rapid growth in wind energy production presents new challenges in maintenance and inspection, necessitating innovative solutions to sustain efficiency and reliability.

Inspection and early preventive maintenance of damages become vital to ensure the safe production of wind energy. Throughout the lifetime of the turbine, the stresses and harsh operating environment lead to damages that, if not addressed, can result in total loss. In addition, blade repairs range from 3000 to 16,000 USD, depending on the severity and size of the damage, while total blade replacement requires significantly more capital [4]. Common damage to WTBs includes delamination, structural cracks, and erosion at the leading edge. These are commonly a result of external factors such as strong wind, rain, snow, salt, and temperature fluctuations [5]. Delamination and surface cracks are the result of adhesive joint failures, core debonding, or excess loads, while leading edge erosion is often the result of cutting through particulate matter over the lifetime of the turbine.

Current WTB fault detection methods include sensor-based monitoring, Supervisory Control and Data Acquisition (SCADA) system analysis, and visual inspection [6]. Visual inspection is used to identify and monitor blade defects. Its methods have included RGB imaging, thermal imaging, and a fusion of these modes to make auxiliary information available during inference [7]. Deep learning has proved to be a pivotal technology in advancing this field [6]. Additionally, the integration of deep learning and drones allows for an autonomous solution to the inspection problem, which is proving to be a promising direction as turbines continue to scale [8]. Existing work in the detection of WTB faults typically addresses architectural changes to a single deep learning network to improve performance for a given task. While this is an important contribution, the comparison between distinct architectural designs, namely single-stage and two-stage detectors, could be improved. This study investigates the performance of these architectural design patterns through YOLO and Mask R-CNN to determine which method achieves the best performance on turbine blade fault localization. The dataset used in this study was introduced in our previous classification research [9]. Building upon this, the dataset is annotated for object detection architectures, allowing for the model comparison presented here. Additionally, methods such as the Adaptive Kriging Damage Assessment (AK-DA) improve inspection efficiency and reliability, further aiding in maintenance and extending equipment lifespans [10].

The focus of this study is on the halted wind turbine scenario, which allows for the inspection of the WTBs without rotation. For wind turbines in operation, wind speed and direction can cause drone destabilization, changes in turbine yaw direction, and spinning of the blades. This leads to blurry or unfocused aerial images, which make blade fault inspection more challenging. Although this is not the direction of our work here, recent works explore CNN-BiLSTM networks for predictive analysis of wind speeds [11], which can help stabilize the drone while collecting images, and pre-processing techniques to de-blur the aerial images [12]. These could be incorporated to allow inspection in the active turbine scenario. The contributions of this work are as follows:

Examination of single-stage and two-stage detection algorithms, YOLO and Mask R-CNN, in WTB fault localization
Modifications to the backbone of Mask Region-based convolutional neural network (R-CNN) to reduce computational complexity and increase accuracy

In the following section, we investigate the current methods and architectures that exist for WTB fault localization and their performance. Then, an overview of the architectures investigated in our research is discussed along with the model precursors, simulations, and results. Finally, a discussion of the findings is made, and conclusions are drawn.

2. Literature Review

The growing availability of high-performance computational resources coupled with the surge in research interest surrounding deep learning has increased the application of these algorithms to the domain of wind turbine inspection [13]. Of these inspection methods, object detection remains dominant in the literature, providing an accurate solution to the localization and classification of WTB anomalies. The algorithms deployed in this sector consist of either a single-stage model or a two-stage model. The inherent trade-offs in these different architecture design methods include inference computation time and accuracy, with two-stage models traditionally garnering higher accuracy overall. However, in resource-constrained or time-sensitive environments, single-stage architectures can provide close to real-time inference. Here, the literature introduces novel modifications to increase the accuracy, inference speed, or both. In this section, a breakdown of the recent methods and results is discussed for both single-stage detectors and two-stage detectors, followed by alternative methods.

To reduce the computational complexity of the backbone of You Only Look Once (YOLO) version 4, Zhang, Yang, and Yang incorporated a MobileNetv1 architecture as the feature extractor [14]. The model was further enhanced with attention modules, including Squeeze and Excitation Networks (SENet), Convolutional Block Attention Module (CBAM), and Efficient Channel Attention for Deep Convolutional Neural Networks (ECANet) [15]. A sensitivity study showed that the backbone-only adjustment reduced overall model complexity but decreased performance compared to base YOLOv4. With the addition of attention modules, specifically SENet and ECANet, the model precision overtook base YOLOv4 while maintaining a reduced model complexity due to the backbone alteration [14]. The proposed model, MobileNetv1-YOLOv4, achieved a mAP50 of 88.61% on their created dataset consisting of spalling, pitting, crack, and contamination faults on WTBs [14]. Similarly, Mohammadi and Sharifian investigated the performance of YOLOv4 by adding Spatial Attention Modules (SAMs) and a unique Mish activation function [16]. With the added generalization of pre-trained COCO weights for distillation and data augmentation schemes to introduce training variance, the proposed model achieved 86.17% mAP50 [16].

To increase the performance of YOLOv5 on minor target defects in WTBs, Ran et al. proposed novel architecture modifications, including an improved feature pyramid network, a Coordinate Attention (CA) module, and integration of an Efficient Intersection over Union (EIoU) loss function [17]. The alteration to the pyramid network utilizes the Bi-directional Feature Pyramid Network (BiFPN) to enable the network to learn input features of varying importance through weighted feature fusion [17]. Additionally, the CA module allows the model to augment the representations of interest. Finally, with the EIoU loss implemented, the model achieved better localization of the WTB defects. To test their proposed model, a dataset of 599 images collected with Unmanned Aerial Vehicles (UAVs) was utilized after being synthetically augmented to increase the dataset to 2995 images. The subsequent ablation experiment indicated that adding the CA module gave the largest single improvement, increasing mAP50 from 80.5% in base YOLOv5s to 82.5%. Finally, with all modifications, the model achieved 83.7% [17].

Echoing the performance abilities of YOLOv5, Liu, An, and Yang also investigated a BiFPN to replace the base YOLOv5 Feature Pyramid Network (FPN) and Path Aggregation Network (PANet) to allow better feature fusion on WTBs [18]. Additional modifications in the proposed architecture included focal loss, which emphasizes hard-to-classify examples to better handle class imbalance. SENet attention modules were also included to enhance feature extraction and reduce redundant feature information. Finally, the C3 module was replaced with the C2F module, resulting in a more abundant gradient flow [18]. With these architecture changes, precision increased by 1.9% over the base YOLOv5s [18]. Using the performance gains of BiFPN and CBAM, Yu et al. proposed an enhanced YOLOv8 architecture [19]. As in [17,18], BiFPN replaced PANet, which prioritizes small target features and increases multi-scale fusion. To further improve small-target acquisition, the CBAM attention module was integrated into the backbone. Finally, EIoU was deployed to help with model training and convergence. For the validation of model performance, a dataset was created through UAV inspection of WTBs with the following classes: breakages, cracks, and scratches [19]. The ablation experiment indicated the highest performance increase through BiFPN, which achieved an mAP50 of 83.3% compared to 81.9% for base YOLOv8 [19].

In an effort to mitigate the complexity of YOLOx and enhance multi-scale small-target WTB feature detection, Yao, Wang, and Fan introduced WT-YOLOx [20]. This novel architecture incorporates a RepVGG backbone, which effectively diminishes the computational burden of YOLOx, thereby augmenting real-time performance while preserving extensive feature fusion through a cascade feature fusion module. To further optimize model performance and address class imbalance, focal loss was employed. For the purpose of evaluating the architecture, a total of 725 images were captured using UAVs at a wind farm in Mongolia. To augment the dataset’s variability, image augmentation techniques were applied, culminating in a comprehensive dataset of 5800 images encompassing the classes of pollution, fix, crack, and break [20]. These architectural enhancements culminated in an impressive 94.29% mAP50, surpassing the performance of both YOLOx and SSD architectures [20]. The modifications of the YOLO architectures for WTB fault diagnosis can be summarized as follows.

Backbone Alteration: By adjusting the backbone feature extraction architecture, higher performance can be gained in low-level feature propagation through the network. This can result in higher accuracy in hard-to-detect classes in fault diagnosis. Additionally, modifications can strive for a reduction in overall computational complexity resulting in better real-time and edge performance.
Loss Function: In changing the loss function, the model can be driven toward specific goals. Commonly, the focal loss is deployed which achieves better performance on hard-to-detect or partially obscured defects by weighting these over easy objects of interest.
Attention Modules: In the pursuit of enhancing the feature maps produced by the network, attention modules allow a unique weighting to prominent features to increase representation through the network. This has been shown to be a valuable addition in fault identification, specifically in the performance on smaller anomalies.
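The loss-function item above can be made concrete with a minimal sketch of a binary focal loss. It follows the commonly cited form FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t); the default values gamma = 2 and alpha = 0.25 and the function names are assumptions chosen for illustration, not the settings reported in the reviewed works.

```python
import math


def cross_entropy(p, y):
    """Plain binary cross-entropy for one prediction (p in (0, 1), y in {0, 1})."""
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one prediction.

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    (easy) examples, so hard examples dominate the total loss.
    gamma and alpha are illustrative defaults, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confident, correct) versus a hard positive (low confidence).
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.30, 1)

# Relative weight of the hard example under each loss.
ce_ratio = cross_entropy(0.30, 1) / cross_entropy(0.95, 1)
focal_ratio = hard / easy
```

With gamma set to 0 the modulating factor disappears and the expression reduces to alpha-weighted cross-entropy, which is a convenient sanity check when experimenting with the parameters.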

Mask R-CNN, a prominent network architecture in WTB visual analysis, offers instance segmentation alongside precise bounding box predictions. In this context, the term mask refers to the capability of the model to generate pixel-wise segmentation masks for each detected object instance. These masks outline the exact shape and extent of the object within the image, providing detailed information about its spatial distribution. Consequently, the masks generated by Mask R-CNN enable researchers to perform further analysis, such as assessing the size and severity of defects on WTBs. For instance, Zhang, Wen, and Liu proposed a pre-processing pipeline aimed at refining mask data obtained from UAV imagery to filter out background noise [21]. This was accomplished through the proposed MinReact Network (MRnet) where the Graham scan algorithm extracts the convex hull points of a fault and allows the orientation of the image to minimize the area of the bounding box. This was followed by DenseNet-121 forming a new predictive head [21]. This method achieved an mAP50 of 98.7% on the utilized dataset, showing promising accuracy in WTB fault detection. However, the authors indicated a reduction in inference speed due to redundant calculations [21].
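The idea of orienting a fault so that its bounding box has minimal area can be illustrated with a small geometric sketch. This is not the MRnet implementation from [21]: the hull is built with Andrew's monotone chain (a variant of the Graham scan mentioned above), and the rectangle search simply tries each hull edge as a candidate orientation. All function names and the sample points are hypothetical.

```python
import math


def cross(o, a, b):
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_area_rect(points):
    """Return (area, angle) of the smallest enclosing rectangle.

    The minimum-area rectangle shares a side direction with some hull
    edge, so each edge is tried in turn: the hull is rotated so that the
    edge is horizontal and an axis-aligned box is measured.
    """
    hull = convex_hull(points)
    best = None
    for i in range(len(hull)):
        (x1, y1), (x2, y2) = hull[i], hull[(i + 1) % len(hull)]
        theta = math.atan2(y2 - y1, x2 - x1)
        c, s = math.cos(-theta), math.sin(-theta)
        xs = [x * c - y * s for x, y in hull]
        ys = [x * s + y * c for x, y in hull]
        area = (max(xs) - min(xs)) * (max(ys) - min(ys))
        if best is None or area < best[0]:
            best = (area, theta)
    return best


# A diamond-shaped region (square rotated by 45 degrees) with one interior point.
fault_pixels = [(0, 1), (1, 0), (2, 1), (1, 2), (1, 1)]
hull = convex_hull(fault_pixels)
area, angle = min_area_rect(fault_pixels)
axis_aligned_area = 2 * 2  # width 2, height 2 for the same points
```

For this diamond, the rotated rectangle covers half the area of the axis-aligned box, which is the kind of background reduction the pre-processing step aims for.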

To address gaps in object detection performance metrics, Zhang, Cosma, and Watkins proposed three new evaluation measures for fault detection, namely Prediction Box Accuracy (PBA), Recognition Rate (RR), and False Label Rate (FLR) [22]. Additionally, an image augmentation pipeline was introduced to further enhance the predictive capabilities of the Mask R-CNN architecture. The simulations used a dataset created through routine WTB maintenance and consisting of the following classes: cracks, erosion, voids, and others. With the proposed image enhancement and augmentation pipeline consisting of flipping, rotation, and histogram equalization, the model reached an mAP50 of 84.21% [22]. Seeking computational performance gains, Diaz and Tittus utilized depthwise separable convolution in the ResNet50 backbone of the Cascade Mask R-CNN model [23]. This reduction in backbone complexity resulted in a 13 million parameter decrease overall. Inference time, when compared to Mask R-CNN, was reduced dramatically from 368 s to 136 s [23]. The proposed model was evaluated on a dataset consisting of lightning damage, rust, erosion, crack, and broken parts, along with data augmentation. Here, the decoupled convolution layers in the proposed architecture provide comparable accuracy with a considerable inference speed increase, at an overall mAP of 84.36% [23].
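To see why depthwise separable convolution lowers the parameter count, the weights of a single layer can be counted directly. The sketch below ignores bias terms, and the layer size (3 × 3 kernel, 256 input and 256 output channels) is a hypothetical example rather than a layer taken from [23].

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1
    pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 256 -> 256 channels.
standard = standard_conv_params(3, 256, 256)
separable = depthwise_separable_params(3, 256, 256)
reduction = standard / separable
```

The saving grows with the kernel size and the number of output channels, which is why replacing many backbone layers can remove millions of parameters.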

After reviewing all these related works, it is clear that while significant advancements have been made in applying deep learning to WTB fault detection, there remain critical gaps that our study addresses. Current research predominantly focuses on individual architectural modifications without comprehensive comparative analyses between single-stage and two-stage detectors. Our study is crucial as it provides a detailed comparison of YOLO and Mask R-CNN architectures, highlighting their performance in terms of accuracy and computational efficiency under consistent conditions. Additionally, the creation of a new synthetic dataset simulating real-world conditions fills a vital resource gap, enabling more practical and scalable solutions. By addressing these gaps, our study enhances the robustness and efficiency of WTB fault detection systems, contributing to improved maintenance and reliability in wind turbine operations. The modifications of the Mask R-CNN architectures for WTB fault diagnosis can be summarized as follows.

Image Enhancement and Data Augmentation: With the inherent difficulty of image collection for WTB inspection, image enhancement techniques can provide de-blurring or feature enrichment. Additionally, the use of data augmentation has been heavily explored for further performance gains.
Backbone Alteration: Because of the two-stage design of Mask R-CNN, its inference speed is considerably slower than that of YOLO. Novel backbone changes that reduce complexity can bring this architecture closer to real-time detection with minimal accuracy impact. Additionally, architecture improvements can further increase the accuracy of this model for defect identification.

3. Materials and Methods

The following section discusses the two architectures utilized in this study. It then introduces the dataset used for training and analyzing these models. Lastly, the evaluation metrics employed for comparison and discussion are presented.

3.1. YOLO

The YOLO algorithm is a popular object detection method that comprises a backbone network for feature extraction, a neck network for feature aggregation, and predictive heads. Originally created by Redmon et al. in 2015, the architecture makes use of a single network for bounding box predictions and class probabilities [24]. Since the introduction of YOLO, many new architectures have been proposed that build upon its foundation and improve detection accuracy and speed, such as V5 [25], V7 [26], V8 [27], and V9 [28]. Each of these architectures has its own merits and drawbacks in terms of speed versus accuracy, depending on the application. Our focus in this paper is on YOLOv5, YOLOv8, and YOLOv9 due to their performance compared to the other architectures.

YOLOv5 modifies the backbone and neck, improving both the accuracy of the model and the real-time performance [25]. The modifications made to the backbone architecture include the design and implementation of Cross Stage Partial (CSP)-Darknet53, and the modifications to the neck include the Cross Stage Partial Path Aggregation Network (CSP-PAN) and Spatial Pyramid Pooling-Fast (SPPF) structures [25].

YOLOv7 proposes a trainable bag-of-freebies to allow for the optimization of the training process in which accuracy can be improved without high inference computational costs. The key proposed features include model re-parameterization to help with gradient flow and dynamic label assignment to better handle multi-head training, among other reductions in model size through proposed scaling systems.

YOLOv8’s proposed architecture enhancements include improvements to the backbone, neck, head, and anchor proposal system [27]. An overview of the architecture can be seen in Figure 1.

YOLOv9 is one of the most recent iterations of the YOLO family of architectures [28]. This work addresses the loss of information as data pass through traditional deep networks by means of the proposed Programmable Gradient Information (PGI) structure. This structure comprises three main components: the main branch, the auxiliary reversible branch, and multi-level auxiliary information [28]. The main branch is responsible for inference, while the two auxiliary components feed information to the main branch, assisting in learning. The auxiliary branches are illustrated in Figure 2.

3.2. R-CNN

Region-Based Convolutional Neural Networks (R-CNNs) are another popular family of object detection algorithms. The first iteration of R-CNN was developed in 2014 by Girshick, Donahue, Darrell, and Malik [29]. The architecture used selective search to propose regions of interest, which were fed into the backbone feature extractor to extract key features and compute classifications for each proposed region. The backbone architecture used for R-CNN was Support Vector Machines (SVMs), classifying the features based on a vector map and using kernel functions to classify images [30]. Since the selective search algorithm was the first block in R-CNN, it had to propose regions of interest on the image without any prior knowledge of key features. As a consequence, two thousand proposals were needed to effectively perform object detection. This made training an R-CNN model time-consuming. It also meant that R-CNN could not perform real-time object detection, taking an average of 43 s to make a prediction on a test image.

3.2.1. Fast R-CNN

The developers of R-CNN addressed these drawbacks with Fast R-CNN in 2015 [31]. Instead of having the selective search algorithm propose initial regions of interest, the algorithm was placed at the end of the backbone layer. After the selective search algorithm, a new algorithm, the Region of Interest (ROI), was used to turn the feature maps and region proposals into bounding boxes. This allowed the selective search algorithm to propose drastically fewer regions of interest, effectively making Fast R-CNN a real-time object detection algorithm. Fast R-CNN's backbone extractor was also switched from SVMs to CNNs such as Visual Geometry Group-16 (VGG-16).

3.2.2. Faster R-CNN

While Fast R-CNN improved the accuracy and speed of R-CNN, the selective search algorithm was still hampering the success of R-CNN architectures because it is a static algorithm. The selective search algorithm was unable to learn the key features of an image, slowing down Fast R-CNN. Faster R-CNN fixes this issue by replacing the selective search algorithm with a Region Proposal Network (RPN) [32]. This speeds up Faster R-CNN, since the RPN learns the key features of the image and proposes regions of interest that only consist of key features.

3.2.3. Mask R-CNN

Mask R-CNN was introduced in 2018 by Kaiming He et al. and built on the Faster R-CNN architecture with an additional predictive path creating instance segmentation along with object detection [33]. This is achieved by modifying the ROI algorithm to output new features of interest for instance segmentation alongside object detection. The features of interest, along with the feature maps and bounding boxes, flow into the mask head generating segmentation masks. This architecture design is visualized in Figure 3 where the two distinct predictive paths can be seen.

3.3. Dataset

To allow the benchmarking of model performance, the Combined Aerial and Indoor Small Wind Turbine Blade (CAI-SWTB) dataset introduced in [9] was leveraged. In [9], due to the lack of available faulty images, we created intentional synthetic damages to the blades of a small-scale turbine, simulating the most common faults in operational commercial wind turbines, including leading edge erosion, cracks, and holes. The selected fault types focus on the common damage types of commercial wind turbines. It is worth noting that other damages may exist in commercial turbine blades, such as fiber splitting, buckling due to compressive load, and gel-coat debonding, among other less common manufacturing errors [6]. However, such damages were not studied in our work. For the creation of this dataset, a Primus AirMax turbine (Ryse Energy, Houston, TX, USA), shown in Figure 4, was studied. This turbine provides output power within the range of 0–0.425 kW. The CAI-SWTB dataset was created via a handheld cell phone camera along with aerial footage from a DJI Mini 3 Pro drone (DJI Technology, Nanshan, Shenzhen, China). We investigated the binary classification of healthy versus faulty images of the blades in our previous work [9].

Building upon our previous work, the object detection problem is investigated to provide additional positional data along with distinct fault instance classification. To allow for object localization, the dataset was hand-labeled for each of the three fault types: holes, erosion, and cracks. A sample from the training set of this dataset can be seen in Figure 5. Here, Figure 5a illustrates samples with the synthetic erosion and hole damage types, simulating leading edge erosion and impact damage typical of commercial-grade turbines. Figure 5b contains images with synthetic crack damage, which simulates common delamination and stress fractures that appear in turbines over time. Finally, Figure 5c illustrates the healthy examples from the dataset, which contain no faults.

The final dataset consists of a total of 6000 RGB images with a size of 300 × 300 with a training, validation, and testing split of 70%, 10%, and 20%, respectively. This is further visualized in Table 1. Additionally, the distribution of annotated faults provides details on the dataset and is shown in Table 2. Here, it can be seen the crack frequency is higher than the other classes but maintains a similar distribution between splits.

3.4. Metrics

To allow comparison between the architectures' fault localization abilities, performance metrics are defined. By making use of the Intersection over Union (IoU) of the model's predicted bounding boxes and the labeled ground truth bounding boxes, the amount of overlap can be quantified. Here, varying levels of confidence are set to determine classification. For each confidence value in the range [0, 1], precision and recall scores are calculated for the model. Precision is defined as the number of correct predictions, True Positives (TPs), out of the total number of positive predictions, that is, true positives and False Positives (FPs). Recall determines the true positive rate or sensitivity of the model by accounting for the true positives correctly identified out of the entire positive label distribution, that is, true positives and False Negatives (FNs). Precision and recall equations are shown below.

\text{precision} = \frac{TP}{TP + FP} \qquad (1)

\text{recall} = \frac{TP}{TP + FN} \qquad (2)
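As an illustration of Equations (1) and (2), the sketch below computes IoU for axis-aligned boxes and counts TP, FP, and FN for a single class at one IoU threshold. The greedy one-to-one matching, the (x1, y1, x2, y2) box format, and the sample boxes are assumptions made for this example; evaluation toolkits typically also rank predictions by confidence before matching.

```python
def box_area(b):
    """Area of a box given as (x1, y1, x2, y2)."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(preds, truths, iou_thr=0.5):
    """Precision and recall for one class at a single IoU threshold.

    Each prediction is greedily matched to the unmatched ground truth
    with the highest IoU; a match counts as a TP only if IoU >= iou_thr.
    """
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = None, iou_thr
        for j, t in enumerate(truths):
            if j in matched:
                continue
            v = iou(p, t)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp    # predictions with no matching ground truth
    fn = len(truths) - tp   # ground truths that were never detected
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if truths else 0.0
    return precision, recall


# Hypothetical boxes: two labeled faults, three predictions (one spurious).
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60), (21, 21, 31, 31)]
precision, recall = precision_recall(preds, truths, iou_thr=0.5)
```

Raising `iou_thr` makes the match criterion stricter, so the same predictions can move from TP to FP without any change in the model.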

Upon deriving precision and recall metrics across diverse confidence thresholds for each class within the dataset, the area under the curve (AUC) can be computed. The AUC quantifies the Average Precision (AP) of the model for each respective class. By averaging these AP values across all classes, the mean Average Precision (mAP) is obtained. Furthermore, by adjusting the Intersection over Union (IoU) threshold, the mAP can be modified to enforce more stringent criteria for bounding box overlap. Specifically, mAP0.5 denotes the mAP where the predicted bounding box must overlap the ground truth by at least 50% to be deemed accurate. These metrics will be employed to evaluate and compare model performance in the subsequent section.
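The AP and mAP computation described above can be sketched as follows. The example assumes each detection has already been marked as a true or false positive at a fixed IoU threshold, and it uses all-point interpolation of the precision-recall curve, which is one common convention and not necessarily the exact procedure of the toolkits used in this study. The detections and the second class's AP value are hypothetical. Under the COCO convention, mAP50-95 is obtained by repeating this computation for IoU thresholds from 0.50 to 0.95 in steps of 0.05 and averaging the results.

```python
def average_precision(scored_hits, n_truth):
    """Area under the precision-recall curve for one class.

    scored_hits: list of (confidence, is_true_positive) per detection.
    n_truth:     number of ground-truth objects of this class.
    """
    if n_truth == 0 or not scored_hits:
        return 0.0
    ranked = sorted(scored_hits, key=lambda s: s[0], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for _, hit in ranked:
        if hit:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / n_truth)
        precisions.append(tp / (tp + fp))
    # Make precision monotonically non-increasing (precision envelope).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum rectangle areas between consecutive recall values.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical detections for one class with two ground-truth objects.
crack_hits = [(0.9, True), (0.8, False), (0.7, True)]
ap_crack = average_precision(crack_hits, n_truth=2)
m_ap = mean_average_precision({"crack": ap_crack, "hole": 0.5})
```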

4. Simulations and Results

This study investigates the performance of the YOLO family of architectures at differing parameter sizes, along with Mask R-CNN with altered backbone networks, for the purpose of fault localization in wind turbines. Various hardware platforms were utilized to accomplish the training including the use of online cloud-based solutions such as Google Colaboratory and local machines with NVIDIA RTX 4080 Graphic Processors (Nvidia, Santa Clara, CA, USA). Additionally, PyTorch v. 2.3.1 was leveraged as the deep learning framework for both YOLO and Mask R-CNN training.

4.1. YOLO Results

With the rapid evolution and continuous enhancement of YOLO architectures, the task of WTB fault localization has become an area of intense interest. In this study, we compared the performance of the YOLOv5, YOLOv7, YOLOv8, and the latest YOLOv9, which was unveiled in February 2024. We also trained and compared the nano, medium, and extra-large model sizes to identify the optimal performance among the YOLO architectures. This research promises to bring new insights and advancements to the field.

To ensure a fair and reliable comparison between architectures, we deployed image augmentation through the provided repositories. The same parameters were used for each YOLO model, as shown in Table 3. Mosaic augmentation, a strategy that combines four images along with their corresponding labels into one image, was employed. This approach simulates various scene compositions and object interactions, thereby enhancing the performance of object detectors.
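The label bookkeeping behind mosaic augmentation can be sketched with a simplified fixed 2 × 2 layout. The YOLO repositories additionally use a random mosaic center, scaling, and cropping, which are omitted here; the function name, the tile order, and the (class, x1, y1, x2, y2) box format are assumptions for this example.

```python
def mosaic(tiles, boxes_per_tile):
    """Combine four equally sized square images into one 2 x 2 mosaic.

    tiles:          four images, each a list of pixel rows, ordered
                    top-left, top-right, bottom-left, bottom-right.
    boxes_per_tile: for each tile, a list of (cls, x1, y1, x2, y2)
                    boxes in that tile's pixel coordinates.
    Returns the mosaic image and the boxes shifted into mosaic coordinates.
    """
    size = len(tiles[0])
    offsets = [(0, 0), (size, 0), (0, size), (size, size)]
    # Concatenate rows: left and right tiles side by side, top above bottom.
    top = [tiles[0][r] + tiles[1][r] for r in range(size)]
    bottom = [tiles[2][r] + tiles[3][r] for r in range(size)]
    canvas = top + bottom
    boxes = []
    for (dx, dy), tile_boxes in zip(offsets, boxes_per_tile):
        for cls, x1, y1, x2, y2 in tile_boxes:
            boxes.append((cls, x1 + dx, y1 + dy, x2 + dx, y2 + dy))
    return canvas, boxes


# Four tiny 2 x 2 "images" filled with a constant value each.
tiles = [[[v] * 2 for _ in range(2)] for v in (1, 2, 3, 4)]
labels = [[("crack", 0, 0, 1, 1)], [], [], [("hole", 1, 1, 2, 2)]]
canvas, boxes = mosaic(tiles, labels)
```

Because every box is shifted by the offset of its tile, the labels stay aligned with the pixels they annotate after the four images are merged.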

Each architecture and corresponding model size was trained with the parameters shown in Table 4. Following training, the test set was used to determine model performance on unseen data. The corresponding results can be seen in Table 5. Here, each of the YOLO versions along with model sizes are shown with the obtained mAP50 and mAP50-95 metrics. The newest YOLO model, version 9, provided the highest mAP50 and mAP50-95, showing a considerable improvement over previous architecture designs. Between the two available sizes, the smaller C variant provided the best performance on this dataset, obtaining an mAP50 of 0.8490 and mAP50-95 of 0.5390. For the remaining YOLO versions, the medium-size architectures provided a considerable performance increase over the nano and extra-large variants. Here, YOLOv8 Medium achieved the third best performance, just short of both YOLOv9 sizes.

YOLOv9C, the model with the best performance, is further evaluated using the confusion matrix in Figure 6, which visualizes the predictions against the ground truths. The confusion matrix reveals that the model infrequently predicts other fault types among the existing false negatives. However, it often misclassifies backgrounds as containing no faulty blades. The fault class hole shows the lowest accuracy, with 77% of labels correctly predicted, while the crack class exhibits the highest accuracy, with 90% of labels correctly predicted. This disparity in class performance is also reflected in the precision–recall curve shown in Figure 7.

To visually inspect the different YOLO architecture versions, a graph of the validation mAP50 values versus the epoch of training is plotted in Figure 8. Here, the best corresponding runs from each architecture were chosen. It can be seen that the YOLOv9 architecture achieves a higher mAP50 on the validation data at a faster rate than the previous versions. YOLOv5 and YOLOv8 are comparable throughout each epoch with YOLOv5 achieving a marginally better mAP50 at epoch 100.

When comparing the performance of each YOLO architecture on the test set, there is a common trend among the three classes: hole, erosion, and crack. Each YOLO version excels in crack localization, with the lowest reported mAP50 obtained by YOLOv8 Nano at 0.8373 and the highest by YOLOv9 C and E at 0.9180. However, all models suffered performance loss in the localization and classification of hole damage in the dataset. YOLOv9 C achieved the highest score in the hole class with a mAP50 of 76%, while YOLOv5 extra-large had the lowest performance of 64%. As reflected in Table 5, the lowest performing class for each YOLO model was hole damage.

4.2. Mask R-CNN Results

In the backbone comparison study for Mask R-CNN, different CNN architectures were implemented as the feature extraction layers. The traditional ResNet50-FPN network, along with ResNet18-FPN and VGG19-FPN networks, was trained and compared on the WTB fault localization task [34,35]. Table 6 shows the parameters that were used to train each variant of the Mask R-CNN backbone. Data augmentation schemes were not deployed with Mask R-CNN.

Each of the modified-backbone Mask R-CNN models was trained using a distinct set of parameters identified through the hyperparameter tuning procedure. After training, each model was evaluated on the test dataset, yielding the following metrics: box mAP50, box mAP50-95, segmentation mAP50, and segmentation mAP50-95. The segmentation scores were instrumental in assessing the overall performance of a Mask R-CNN model, given its capability to generate both bounding boxes and segmentation masks. The overall loss for each backbone variant is plotted against epochs in Figure 9. The ResNet-50 backbone attains the lowest loss after epoch 25; however, the other two variants exhibit similar trends and remain close to it. The best evaluation metrics for each backbone variant are shown in Table 7. In terms of combined box mAP50, the ResNet18-FPN backbone performed best (0.8415), while the ResNet50-FPN backbone outperformed the other backbones in combined box mAP50-95 (0.4670). The opposite trend occurs in the segmentation scores: the ResNet50-FPN backbone was the best-performing model in terms of combined segmentation mAP50 (0.7940), while ResNet18-FPN performed best in terms of combined segmentation mAP50-95 (0.3128).
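
To make the difference between the box and segmentation metrics concrete, the sketch below computes Intersection over Union (IoU) for a pair of boxes and for a pair of masks, and lists the IoU thresholds at which a prediction would still count as a match. It assumes the usual COCO-style convention of ten thresholds from 0.50 to 0.95 in steps of 0.05 for mAP50-95. The coordinates and pixel sets are made-up examples, not data from this study.

```python
# Thresholds 0.50, 0.55, ..., 0.95 (assumed COCO-style convention).
IOU_THRESHOLDS = [(50 + 5 * i) / 100 for i in range(10)]


def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(a, b):
    """IoU of two masks given as sets of (row, col) pixel coordinates."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def thresholds_passed(iou):
    """IoU thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


if __name__ == "__main__":
    predicted_box = (0, 0, 10, 10)
    labeled_box = (2, 0, 12, 10)
    iou = box_iou(predicted_box, labeled_box)
    # A match for mAP50, but rejected at the stricter thresholds.
    print(f"box IoU = {iou:.3f}, passes {thresholds_passed(iou)}")
```

A prediction that overlaps its label only moderately contributes to mAP50 but not to the stricter thresholds, which is one reason the mAP50-95 values in Tables 5 and 7 are much lower than the mAP50 values.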

5. Discussion

After identifying the best-performing configuration of each architecture, the model predictions were investigated further. Test-set inference results from the top model and size were visualized to allow analysis of trends and patterns in the predictions. Figure 10 shows a sample of the predictions from the YOLOv9 C model. Figure 10a illustrates the misclassifications of the model. Reflections from the shiny blade produced features that looked like cracks, leading to an increased number of false detections. Additionally, crack-like features outside the blade area, arising from the background and the turbine chassis, also contributed to false detections. Figure 10b illustrates the confident predictions from the YOLOv9 C model. The model was capable of localizing each small hole feature and crack along the width of the turbine blade. Furthermore, the erosion damage, which is imaged at an angle that shows more background, was detected with high confidence values.

Analyzing the inference results for each YOLO model and size, including v5, v8, and v9, shows a common trend in hole classification that explains the lower mAP values in Table 5. Figure 11 contains a sample of these inference results along with their corresponding ground truth labels. Here, the model commonly classifies the mounting hardware as hole damage. While the mounting points contained distinctly hole-like features, they were not labeled as such during dataset creation. The models generalized the hole damage features and detected the mounting points as holes; because these detections count as false positives, the measured performance is lower. Additionally, the crack misclassifications observed with YOLOv9 also appeared in every other YOLO architecture, with the turbine chassis and light reflections causing false detections.

As discussed earlier, Table 7 reports the performance of Mask R-CNN with three backbones, VGG19, ResNet50, and ResNet18, on the CAI-SWTB dataset. According to these results, Mask R-CNN with ResNet18 is the most promising, with a higher combined box mAP50 and a lower computational cost. Figure 12 shows a sample of inference results from the Mask R-CNN model with the ResNet18-FPN backbone network. The instance segmentation and localization abilities suffer when the turbine chassis is present in the image. The shadow created by the hub, chassis, and blade, shown in Figure 12a, leads to false crack and erosion detections, which explains the lower erosion class metrics for each of the modified backbone architectures. However, when the blade is the main component of the composition, faults are localized with considerably higher confidence. This is further illustrated in Figure 12b, where the confidence for erosion, crack, and hole detection is well above 90% and instance segmentation provides pixel-level classification. Figure 13a shows additional samples of these common misclassifications from Mask R-CNN with a ResNet18 backbone; the same misclassifications were observed with ResNet50, but with different confidence values. The corresponding ground truth labels of healthy blade images are illustrated in Figure 13b.

This study investigated two popular architectures for the localization of faults in WTBs: YOLO and Mask R-CNN. While traditionally two-stage detectors, like Mask R-CNN, are considered more accurate, our study shows that the single-stage design of YOLO achieves a comparable performance. This raises the question of which architecture should be chosen for fault localization in WTBs. To answer this, multiple aspects must be considered. YOLO achieves promising precision while maintaining low computational costs and the ability to achieve close-to-real-time inference. However, Mask R-CNN provides comparable precision with the addition of instance segmentation. For the case of fault localization in WTBs, real-time performance is not essential, and thus, the greater level of detail could be considered a larger asset for maintenance engineers, allowing more precise size estimation. Alternatively, if computational performance is limited, a light YOLO model could be deployed to provide quick inspection results in a constrained environment.
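
The point about size estimation can be illustrated with a small example: for a thin, diagonal defect, a bounding box greatly overstates the damaged area, whereas a segmentation mask counts only the defect pixels. The sketch below is a minimal illustration under stated assumptions. The mask is a made-up 5 x 5 example, and `mm_per_px` is a hypothetical calibration factor; the paper does not describe a size-estimation procedure.

```python
def mask_area_px(mask):
    """Number of defect pixels in a binary mask (list of rows of 0/1)."""
    return sum(sum(row) for row in mask)


def bounding_box(mask):
    """Tight box (x1, y1, x2, y2) around the defect pixels of a non-empty mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    return min(cols), min(rows), max(cols) + 1, max(rows) + 1


def box_area_px(box):
    """Area of a box given as (x1, y1, x2, y2), in pixels."""
    return (box[2] - box[0]) * (box[3] - box[1])


def to_mm2(area_px, mm_per_px):
    """Convert a pixel area to square millimetres (hypothetical calibration)."""
    return area_px * mm_per_px ** 2


# Made-up example: a one-pixel-wide diagonal crack.
CRACK_MASK = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]

if __name__ == "__main__":
    mm_per_px = 0.5  # assumed scale, for illustration only
    mask_mm2 = to_mm2(mask_area_px(CRACK_MASK), mm_per_px)
    box_mm2 = to_mm2(box_area_px(bounding_box(CRACK_MASK)), mm_per_px)
    print(f"mask estimate: {mask_mm2} mm^2, box estimate: {box_mm2} mm^2")
```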

6. Conclusions and Discussion on Limitations

With the growing number of installed turbines comes a mounting need for improved maintenance and inspection techniques. In this research, two popular object detection architectures were deployed and compared on the task of localizing WTB faults. The YOLO family was represented by versions 5, 8, and 9, with up to three model sizes each, to allow a comprehensive performance comparison. This study showed that the most recent version, YOLOv9 with size C, provided the highest mAP50 and mAP50-95, at 0.849 and 0.539, respectively. In addition to the YOLO architectures, Mask R-CNN was investigated with altered feature extraction layers for improved performance. Here, it was shown that the base ResNet50-FPN network provides adequate performance but can be improved through a reduction in layers and parameters with a ResNet18-FPN backbone. This alteration provided a higher box mAP50 of 0.8415 and a comparable instance segmentation mAP50 of 0.7933 while reducing computation through a 32-layer backbone reduction. While the scores obtained by the modified Mask R-CNN did not surpass the latest YOLOv9 results, the model did provide a greater level of detail through instance segmentation masks. The comparative study on WTB fault localization obtained promising results. This enhanced precision can be leveraged to improve inspection accuracy and throughput as increased demand drives turbine installation numbers.

While this study provides promising results on the identification and localization of wind turbine blade faults, challenges remain. The dataset used in this study was created in a controlled environment simulating a halted wind turbine, which reflects current inspection practice. However, the loss of power production during a halt is a drawback of this method. If automated methods could collect images while the wind turbine is active, this loss could be eliminated. In that case, the rotation of the blades, along with the yaw of the turbine in response to the ambient wind direction and speed, can blur the aerial images taken by the drone, resulting in poor image quality. The drone itself can also experience turbulence, causing unfocused imaging. To tackle these issues, more wind-resistant drones with better camera payloads can be a solution at an additional cost. As an alternative, de-blurring methods such as the Lucy–Richardson or Wiener filter methods can be applied as a pre-processing stage to reduce the blur in the captured images before they are fed to the deep learning stage for blade fault identification and localization [12].
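
For readers unfamiliar with the de-blurring step mentioned above, the sketch below shows the multiplicative update at the core of the Richardson-Lucy (Lucy-Richardson) algorithm. It is a simplified illustration, not the method of [12]: it works on a one-dimensional signal instead of an image, assumes the blur kernel (point spread function) is known, symmetric in the example, and normalized to sum to one, and uses zero padding at the borders. All names are hypothetical.

```python
def convolve_same(signal, kernel):
    """1D convolution with zero padding; output has the length of `signal`.

    The kernel length is assumed to be odd.
    """
    half = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + half - k  # index of the input sample paired with kernel[k]
            if 0 <= j < n:
                acc += signal[j] * weight
        out.append(acc)
    return out


def richardson_lucy_1d(observed, psf, num_iter=50, eps=1e-12):
    """Iteratively sharpen `observed`, assuming it was blurred by `psf`."""
    psf_mirror = psf[::-1]
    estimate = [max(v, eps) for v in observed]  # start from the blurred signal
    for _ in range(num_iter):
        reblurred = convolve_same(estimate, psf)
        ratio = [o / max(r, eps) for o, r in zip(observed, reblurred)]
        correction = convolve_same(ratio, psf_mirror)
        estimate = [e * c for e, c in zip(estimate, correction)]
    return estimate


if __name__ == "__main__":
    sharp = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
    psf = [0.25, 0.5, 0.25]  # assumed, known blur kernel
    blurred = convolve_same(sharp, psf)
    restored = richardson_lucy_1d(blurred, psf)
    print("blurred :", [round(v, 3) for v in blurred])
    print("restored:", [round(v, 3) for v in restored])
```

In practice the blur kernel of a rotating blade is not known in advance and would have to be estimated, for example from the blade velocity as the title of [12] suggests, before an update of this kind could be applied to real images.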

Another solution includes mounting an anemometer, such as the TriSonica Mini Ultrasonic Anemometer, to obtain real-time wind speed and direction data and better stabilize the drone as it takes aerial images of the blades. This approach would allow the drone to collect more focused images of halted turbine blades in the harsh environments presented in onshore and offshore wind farms. As an alternative, CNN-based wind forecasting approaches can also be used for drone stabilization and path-planning [11].

Finally, to allow for a more generic solution for inference on commercial WTBs, more publicly available datasets are required. While the presented dataset is limited in scope and size, the goal was to increase the availability of wind turbine fault datasets through open access. Furthermore, the dataset leveraged in this study is limited to three major fault types, i.e., erosion, holes, and cracks. While these are some of the most common faults, they do not form a comprehensive list of the faults seen in operational wind turbines. Broadening the range of fault types will allow even greater levels of generalization for fault detection.

Author Contributions

Conceptualization, M.S.; methodology, M.S., M.D. and E.N.D.; software, M.D. and E.N.D.; validation, M.S., M.D., E.N.D. and M.M.; formal analysis, M.S., M.D., E.N.D., J.Z. and M.M.; investigation, M.S., M.D., E.N.D., J.Z. and M.M.; resources, M.S.; data curation, M.D., E.N.D., J.Z.; writing—original draft preparation, M.D., E.N.D., J.Z. and M.M.; writing—review and editing, M.S., M.D., E.N.D., M.M.; visualization, M.S., M.D., E.N.D., J.Z. and M.M.; supervision, M.S.; project administration, M.S.; funding acquisition, M.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Office of the Commissioner of Utah System of Higher Education (USHE)—Deep Technology Initiative Grant 20210016UT.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to confidentiality.

Acknowledgments

The authors would also like to thank Bridger Altice for his support in creating the initial labeled dataset.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CBAM	Convolutional Block Attention Module
CNN	Convolutional Neural Network
COCO	Common Objects in Context
CAI-SWTB	Combined Aerial and Indoor Small Wind Turbine Blade
EIoU	Efficient Intersection over Union
FPN	Feature Pyramid Network
GELAN	Generalized Efficient Layer Aggregation Network
mAP	mean Average Precision
mAP50	mean Average Precision at 50% IoU
mAP50-95	mean Average Precision at IoU thresholds from 50% to 95%
PGI	Programmable Gradient Information
RGB	Red, Green, Blue
ROI	Region of Interest
RPN	Region Proposal Network
R-CNN	Region-based Convolutional Neural Network
SCADA	Supervisory Control and Data Acquisition
SENet	Squeeze and Excitation Networks
SVM	Support Vector Machine
UAV	Unmanned Aerial Vehicle
VGG	Visual Geometry Group
WTB	Wind Turbine Blade
YOLO	You Only Look Once

References

1. Hashimshony Yaffe, N.; Segal-Klein, H. Renewable energy and the centralisation of power. The case study of Lake Turkana Wind Power, Kenya. Political Geogr. 2023, 102, 102819.
2. Department of Energy. US Department of Energy Projects Strong Growth US Wind Power Sector. 2023. Available online: https://www.energy.gov/articles/us-department-energy-projects-strong-growth-us-wind-power-sector (accessed on 13 June 2024).
3. Wind Europe. Wind Energy in Europe: 2023 Statistics and the Outlook for 2024–2030. 2024. Available online: https://windeurope.org/intelligence-platform/product/wind-energy-in-europe-2023-statistics-and-the-outlook-for-2024-2030 (accessed on 13 June 2024).
4. Boopathi, K.; Mishnaevsky, L., Jr.; Sumantraa, B.; Premkumar, S.A.; Thamodharan, K.; Balaraman, K. Failure mechanisms of wind turbine blades in India: Climatic, regional, and seasonal variability. Wind Energy 2022, 25, 968–979.
5. Wang, W.; Xue, Y.; He, C.; Zhao, Y. Review of the Typical Damage and Damage-Detection Methods of Large Wind Turbine Blades. Energies 2022, 15, 5672.
6. Memari, M.; Shakya, P.; Shekaramiz, M.; Seibi, A.C.; Masoum, M.A.S. Review on the Advancements in Wind Turbine Blade Inspection: Integrating Drone and Deep Learning Technologies for Enhanced Defect Detection. IEEE Access 2024, 12, 33236–33282.
7. Memari, M.; Shekaramiz, M.; Masoum, M.A.S.; Seibi, A.C. Data Fusion and Ensemble Learning for Advanced Anomaly Detection Using Multi-Spectral RGB and Thermal Imaging of Small Wind Turbine Blades. Energies 2024, 17, 673.
8. Shihavuddin, A.; Chen, X.; Fedorov, V.; Nymark Christensen, A.; Andre Brogaard Riis, N.; Branner, K.; Bjorholm Dahl, A.; Reinhold Paulsen, R. Wind Turbine Surface Damage Detection by Deep Learning Aided Drone Inspection Analysis. Energies 2019, 12, 676.
9. Altice, B.; Nazario, E.; Davis, M.; Shekaramiz, M.; Moon, T.K.; Masoum, M.A.S. Anomaly Detection on Small Wind Turbine Blades Using Deep Learning Algorithms. Energies 2024, 17, 982.
10. Ren, C.; Xing, Y.; Patel, K.S. Application of an active learning method for cumulative fatigue damage assessment of floating wind turbine mooring lines. Results Eng. 2024, 22, 102122.
11. Zhang, Y.M.; Wang, H. Multi-head attention-based probabilistic CNN-BiLSTM for day-ahead wind speed forecasting. Energy 2023, 278, 127865.
12. Altice, B.; Moon, T.K.; Shekaramiz, M. Velocity-Based Wind Turbine Blade Deblurring Using Richardson-Lucy Algorithm. In Proceedings of the 2024 Intermountain Engineering, Technology and Computing (IETC), Logan, UT, USA, 12–14 May 2024; pp. 215–220.
13. Liu, X.; Liu, C.; Jiang, D. Wind Turbine Blade Surface Defect Detection Based on YOLO Algorithm. In International Congress and Workshop on Industrial AI; Springer: Cham, Switzerland, 2023; pp. 367–380.
14. Zhang, C.; Yang, T.; Yang, J. Image Recognition of Wind Turbine Blade Defects Using Attention-Based MobileNetv1-YOLOv4 and Transfer Learning. Sensors 2022, 22, 6009.
15. Wang, Q.; Wu, B.; Zhu, P.; Li, P.; Zuo, W.; Hu, Q. ECA-Net: Efficient channel attention for deep convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11534–11542.
16. Mohammadi, R.; Sharifian, S. Improving Wind Turbines Blades Damage Detection by Using YOLO BoF and BoS. In Proceedings of the 2023 31st International Conference on Electrical Engineering (ICEE), Tehran, Iran, 9–11 May 2023; pp. 871–876.
17. Ran, X.; Zhang, S.; Wang, H.; Zhang, Z. An improved algorithm for wind turbine blade defect detection. IEEE Access 2022, 10, 122171–122181.
18. Liu, C.; An, C.; Yang, Y. Wind Turbine Surface Defect Detection Method Based on YOLOv5s-L. Non-Destr. Test. (NDT) 2023, 1, 46–57.
19. Yu, H.; Wang, J.; Han, Y.; Fan, B.; Zhang, C. Research on an Intelligent Identification Method for Wind Turbine Blade Damage Based on CBAM-BiFPN-YOLOv8. Processes 2024, 12, 205.
20. Yao, Y.; Wang, G.; Fan, J. WT-YOLOX: An Efficient Detection Algorithm for Wind Turbine Blade Damage Based on YOLOX. Energies 2023, 16, 3776.
21. Zhang, C.; Wen, C.; Liu, J. Mask-MRNet: A deep neural network for wind turbine blade fault detection. Renew. Sustain. Energy 2020, 12, 053302.
22. Zhang, J.; Cosma, G.; Watkins, J. Image Enhanced Mask R-CNN: A Deep Learning Pipeline with New Evaluation Measures for Wind Turbine Blade Defect Detection and Classification. Imaging 2021, 7, 46.
23. Diaz, P.; Tittus, P. Fast detection of wind turbine blade damage using Cascade Mask R-DSCNN-aided drone inspection analysis. Signal Image Video Process. 2023, 17, 2333–2341.
24. Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
25. Jocher, G. Ultralytics YOLOv5. GitHub. Available online: https://github.com/ultralytics/yolov5 (accessed on 5 June 2024).
26. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023.
27. Jocher, G.; Chaurasia, A.; Qiu, J. Ultralytics YOLOv8. GitHub. Available online: https://github.com/ultralytics/ultralytics (accessed on 5 June 2024).
28. Wang, C.Y.; Yeh, I.H.; Liao, H.Y.M. YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information. arXiv 2024, arXiv:2402.13616.
29. Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
30. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
31. Girshick, R. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1440–1448.
32. Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
33. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
34. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
35. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.

Figure 1. YOLOv8 architecture overview.

Figure 2. YOLOv9 architecture overview.

Figure 3. Mask R-CNN architecture overview.

Figure 4. Primus AirMax wind turbine.

Figure 5. Samples from the labeled small WTB dataset. (a) Sample images containing erosion and hole faults; (b) Sample images containing crack faults; (c) Sample images containing no faults.

Figure 6. YOLOv9 C confusion matrix.

Figure 7. YOLOv9 C precision–recall curve.

Figure 8. YOLO models’ best training run compared.

Figure 9. Mask R-CNN model’s best loss scores compared.

Figure 10. Sample of YOLOv9 C inference results. (a) False positive classifications; (b) True positive classifications.

Figure 11. YOLO v5, v8, and v9 commonly misclassified samples from the test data. (a) YOLO model inference results; (b) Labeled ground truth.

Figure 12. Sample of Mask R-CNN ResNet-18 FPN backbone inference results. (a) False positive classifications. The confidence levels in these plots read as follows. Top Left: Crack 0.8639; Top Right: Erosion 0.080, Erosion 0.202; Bottom left: Crack 0.387, Erosion 0.095; Bottom Right: Crack 0.345; (b) True positive classifications.

Figure 13. Additional samples of Mask R-CNN ResNet-18 FPN backbone inference results. (a) Mask R-CNN model inference results with ResNet18 backbone. The confidence levels in these plots read as follows. Top Left: Crack 0.859; Top Right: Crack 0.306; Bottom left: Erosion 0.187; Bottom Right: Crack 0.063, Erosion 0.947; (b) Labeled ground truth.

Table 1. CAI-SWTB dataset split [9].

Image Label	Training	Validation	Test	Total Images
Indoor Faulty	1400	200	400	2000
Indoor Healthy	1400	200	400	2000
Outdoor Faulty	700	100	200	1000
Outdoor Healthy	700	100	200	1000
Total	4200	600	1200	6000

Table 2. CAI-SWTB dataset label distribution.

Class	Training	Validation	Test	Total Labels
Crack	4001	536	1096	5633
Hole	1092	215	353	1660
Erosion	867	173	283	1323
Total	5960	924	1732	8616

Table 3. YOLO architectures image augmentation parameters.

Augmentation Parameter	Value
Hue	0.015
Saturation	0.7
Brightness	0.4
Translation	0.1
Scale	0.5
Flip Left/Right	0.5
Mosaic	1.0

Table 4. YOLO architecture training parameters.

Training Parameter	Value
Epoch	100
Optimizer	AdamW
Image Size	300

Table 5. The performance outcomes of the YOLO models trained on the dataset.

Model	Class	Box Precision	Box Recall	mAP 50	mAP 50-95
YOLOv5 Nano	Combined	0.8332	0.7086	0.7550	0.4430
	Crack	0.8723	0.8165	0.8495	0.5060
	Hole	0.8455	0.6261	0.6653	0.3375
	Erosion	0.7817	0.6833	0.7501	0.4855
YOLOv5 Medium	Combined	0.8753	0.7566	0.8058	0.5004
	Crack	0.8931	0.8604	0.8954	0.5693
	Hole	0.8804	0.6674	0.7281	0.4027
	Erosion	0.8523	0.7420	0.7938	0.5291
YOLOv5 X-Large *	Combined	0.8504	0.7336	0.7637	0.4579
	Crack	0.9044	0.8370	0.8754	0.5456
	Hole	0.7824	0.6289	0.6377	0.3087
	Erosion	0.8645	0.7350	0.7781	0.5194
YOLOv8 Nano	Combined	0.8574	0.6755	0.7564	0.4436
	Crack	0.8601	0.7876	0.8373	0.5097
	Hole	0.9229	0.5943	0.7095	0.3710
	Erosion	0.7892	0.6445	0.7225	0.4502
YOLOv8 Medium	Combined	0.8811	0.7667	0.8233	0.5194
	Crack	0.9307	0.8581	0.9048	0.5872
	Hole	0.8952	0.6827	0.7430	0.4122
	Erosion	0.8174	0.7592	0.8222	0.5589
YOLOv8 X-Large *	Combined	0.8900	0.7526	0.8110	0.5176
	Crack	0.9254	0.8644	0.9052	0.5960
	Hole	0.8814	0.6486	0.7108	0.3907
	Erosion	0.8631	0.7448	0.8170	0.5660
YOLOv9 C	Combined	0.8850	0.7920	0.8490	0.5390
	Crack	0.9250	0.8790	0.9180	0.5980
	Hole	0.8610	0.6940	0.7570	0.4230
	Erosion	0.8700	0.8020	0.8710	0.5980
YOLOv9 E	Combined	0.8910	0.7830	0.8450	0.5380
	Crack	0.9330	0.8740	0.9180	0.6030
	Hole	0.8800	0.6770	0.7510	0.4130
	Erosion	0.8590	0.7980	0.8650	0.5990

* X-Large indicates extra large size.

Table 6. Training parameters for the Mask R-CNN.

Training Parameter	Value
Epoch	25, 50
Optimizer	Adam, AdamW, RMSprop, SGD
Image Size	300

Table 7. The performance outcomes of the Mask R-CNN model trained on the dataset.

Model	Class	Box mAP 50	Box mAP 50-95	Segmentation mAP 50	Segmentation mAP 50-95
VGG-19 FPN	Combined	0.7947	0.4223	0.7503	0.3044
	Crack	0.8081	0.4683	0.7676	0.2667
	Hole	0.8789	0.4531	0.8667	0.4309
	Erosion	0.6973	0.3456	0.6166	0.2156
ResNet-50 FPN	Combined	0.8372	0.4670	0.7940	0.3029
	Crack	0.8378	0.5089	0.7791	0.2749
	Hole	0.8686	0.4500	0.8406	0.3718
	Erosion	0.8052	0.4420	0.7622	0.2020
ResNet-18 FPN	Combined	0.8415	0.4434	0.7933	0.3128
	Crack	0.8421	0.5080	0.7717	0.2582
	Hole	0.8843	0.4290	0.8709	0.4146
	Erosion	0.7981	0.3932	0.7374	0.2687

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
