Figure 1.
Locations on the coast of Mallorca and Cabrera where H.i images were taken. The red dots mark these locations: Ses Illetes, Cala Blava, and Colònia de Sant Jordi in Mallorca, and sa Platgeta in Cabrera.
Figure 2.
Dataset examples illustrating the types of scenarios in which H.i can be found, depending on its invasive stage. Left: sparse scenario, corresponding to an early stage of invasion. Right: dense scenario, corresponding to an advanced stage of invasion of the algae.
Figure 3.
Image annotation example. (a) Original image. (b) Ground truth label map, where the H.i is marked in white and the background in black. (c) Original image. (d) Labelled image marking the H.i instances with bounding boxes.
Figure 4.
Dataset management. The complete dataset is partitioned into the sparse dataset and the dense dataset.
Figure 5.
Schema of the U-net network architecture, redrawn from [29]. The left side of the U-shape is the contraction path, where each layer consists of two 3 × 3 convolutions with ReLU activation and a 2 × 2 max-pooling layer. The right side is the expansion path, which consists of the decoding stage and an upsampling process realized via 2 × 2 deconvolution to halve the number of input channels.
Figure 6.
Diagram of the SS module. The trained U-net processes the input images and outputs a probability map that is then binarized using a threshold to obtain the coverage map. The probability map and the coverage map are the outputs of the SS module.
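The binarization step described in this caption can be sketched as follows. This is a minimal illustration, assuming the probability map is a NumPy array of per-pixel values in [0, 1]; the function name `ss_outputs` and the default threshold of 0.5 are hypothetical and not taken from the paper.

```python
import numpy as np

def ss_outputs(prob_map, threshold=0.5):
    """Return the two SS module outputs: probability map and coverage map.

    `threshold` is a hypothetical default; the paper's value is not given here.
    """
    # Pixels at or above the threshold are marked as H.i (1), the rest as background (0).
    coverage_map = (prob_map >= threshold).astype(np.uint8)
    return prob_map, coverage_map

# Toy 2 x 2 probability map standing in for the U-net output.
prob = np.array([[0.1, 0.6],
                 [0.9, 0.4]])
_, cov = ss_outputs(prob, threshold=0.5)
```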
Figure 7.
Schema of the YOLOv5 network architecture, redrawn from [41]. It is composed of three main parts: Backbone (CSPDarknet), Neck (PANet), and Head (YOLOv5 Head). The data are first input to CSPDarknet for feature extraction and then fed to PANet for feature fusion. Finally, the Head outputs the detection results (class, score, and bounding box).
Figure 8.
Diagram of the OD module. The trained YOLOv5 network processes the input images and outputs an instance array, which is thresholded to obtain a new array. Both instance arrays are converted to image format, generating the probability map and the coverage map, respectively. The probability map, the coverage map, and the instance array are the outputs of the module.
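One possible way to convert instance arrays into the two maps is sketched below. It assumes each instance is a tuple `(x1, y1, x2, y2, confidence)` in pixel coordinates with exclusive upper bounds, and that overlapping boxes keep the highest confidence; both choices, the function name `od_maps`, and the default threshold are assumptions made for illustration, not details confirmed by the paper.

```python
import numpy as np

def od_maps(instances, shape, conf_threshold=0.25):
    """Rasterize detections into a probability map and a coverage map.

    instances: list of (x1, y1, x2, y2, confidence); x2 and y2 are exclusive.
    conf_threshold: hypothetical value used to build the thresholded array.
    """
    prob_map = np.zeros(shape, dtype=float)
    coverage_map = np.zeros(shape, dtype=np.uint8)

    # Probability map: paint every box with its confidence (max on overlap, an assumption).
    for x1, y1, x2, y2, conf in instances:
        prob_map[y1:y2, x1:x2] = np.maximum(prob_map[y1:y2, x1:x2], conf)

    # Thresholded array: keep only the instances above the confidence threshold.
    kept = [inst for inst in instances if inst[4] >= conf_threshold]

    # Coverage map: binary mask of the kept boxes.
    for x1, y1, x2, y2, _ in kept:
        coverage_map[y1:y2, x1:x2] = 1

    return prob_map, coverage_map, kept

detections = [(0, 0, 2, 2, 0.9), (1, 1, 3, 3, 0.1)]
prob, cov, kept = od_maps(detections, shape=(3, 3))
```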
Figure 9.
Weighted merging pipeline. It performs a weighted combination of the probability maps outputted by the SS and OD modules and applies a confidence threshold to obtain the weighted coverage map.
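The weighted combination can be illustrated with the short sketch below. The weights `w_ss` and `w_od`, the normalization by their sum, and the confidence threshold of 0.5 are placeholder assumptions; the values used in the paper are not stated in this caption.

```python
import numpy as np

def weighted_merge(ss_prob, od_prob, w_ss=0.5, w_od=0.5, conf_threshold=0.5):
    """Combine two probability maps and binarize the result.

    All default values are hypothetical placeholders.
    """
    # Weighted average of the SS and OD probability maps, pixel by pixel.
    merged = (w_ss * ss_prob + w_od * od_prob) / (w_ss + w_od)
    # Apply the confidence threshold to obtain the weighted coverage map.
    return (merged >= conf_threshold).astype(np.uint8)

ss = np.array([[0.8, 0.2]])
od = np.array([[0.4, 0.6]])
weighted_cov = weighted_merge(ss, od)
```

With equal weights, the first pixel averages to 0.6 and is kept, while the second averages to 0.4 and is discarded.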
Figure 10.
AUC merging pipeline. It uses OD instance information to validate areas of coverage generated from the SS coverage map. Later, this information is merged with the OD coverage map to obtain the AUC coverage map.
Figure 11.
SS coverage map clustering. (a) SS coverage map. (b) Blobs after applying the opening operation. (c) Blobs after applying the 4-connectivity connected-component algorithm. (d) Final blobs after applying the threshold.
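Steps (c) and (d) can be sketched with a breadth-first labeling over 4-connected neighbors. The sketch omits the morphological opening of step (b), and it assumes that the final threshold is a minimum blob area in pixels; that interpretation, the name `label_blobs`, and the parameter `min_area` are assumptions, since the caption does not specify what the threshold applies to.

```python
from collections import deque

def label_blobs(coverage_map, min_area=1):
    """Group foreground pixels into 4-connected blobs and drop small ones.

    coverage_map: list of rows containing 0/1 values.
    min_area: hypothetical minimum number of pixels for a blob to be kept.
    """
    rows, cols = len(coverage_map), len(coverage_map[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if coverage_map[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                # 4-connectivity: only up, down, left, and right neighbors.
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and coverage_map[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_area:
                blobs.append(blob)
    return blobs

grid = [[1, 1, 0, 0],
        [1, 0, 0, 1],
        [0, 0, 1, 0]]
all_blobs = label_blobs(grid)
large_blobs = label_blobs(grid, min_area=2)
```

In the toy grid, the two isolated pixels touch only diagonally, so 4-connectivity keeps them as separate blobs.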
Figure 12.
Curves indicating the increase in TPs, the increase in FPs, and their difference for each evaluated value.
Figure 13.
Process for obtaining the AUC and the instance count of a blob. (a) Original image; (b) instances from the OD module output printed over the original image; (c) blob generated from the coverage map outputted by the SS module; (d) coverage-confidence curve, indicating its AUC, and the instance count, obtained as the number of instances that intersect with the blob; (e–g) instances with confidence above 5%, 20%, and 40%, respectively, printed over the blob, showcasing its coverage at these confidence thresholds.
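The two blob metrics in this caption can be approximated as in the sketch below. It assumes a blob is a set of `(row, col)` pixels, that instances are `(x1, y1, x2, y2, confidence)` boxes with exclusive upper bounds, that coverage is the fraction of blob pixels lying inside at least one instance above the current confidence threshold, and that the AUC is integrated with the trapezoidal rule over thresholds from 0 to 1. The function name `blob_metrics` and the number of steps are hypothetical.

```python
def blob_metrics(blob, instances, steps=100):
    """Return (AUC of the coverage-confidence curve, instance count) for one blob."""
    def covers(inst, pixel):
        x1, y1, x2, y2, _ = inst
        r, c = pixel
        return y1 <= r < y2 and x1 <= c < x2

    # Instance count: detections that intersect the blob in at least one pixel.
    touching = [i for i in instances if any(covers(i, p) for p in blob)]
    instance_count = len(touching)

    # Coverage of the blob at each confidence threshold t in [0, 1].
    coverages = []
    for k in range(steps + 1):
        t = k / steps
        active = [i for i in touching if i[4] > t]
        covered = sum(1 for p in blob if any(covers(i, p) for i in active))
        coverages.append(covered / len(blob))

    # Trapezoidal integration of the coverage-confidence curve.
    auc = sum((coverages[k] + coverages[k + 1]) / 2 for k in range(steps)) / steps
    return auc, instance_count

blob = {(0, 0), (0, 1), (1, 0), (1, 1)}
detections = [(0, 0, 2, 1, 0.8), (5, 5, 6, 6, 0.9)]
auc, instance_count = blob_metrics(blob, detections)
```

In this toy case one detection covers half of the blob up to a confidence of 0.8, so the AUC is close to 0.5 × 0.8 = 0.4, and the second detection does not count because it never touches the blob.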
Figure 14.
(Left): Flow chart of the blob validation algorithm. For each blob generated from the SS coverage map clustering, both its instance count and its AUC metric are computed. Afterwards, the AUC range corresponding to the AUC value of the blob is determined. If the instance count of the blob surpasses the threshold value associated with the blob's AUC range, the blob is validated; otherwise, it is discarded. (Right): Blob validation example. The blob validation algorithm is applied to each blob from the SS coverage map clustering. In this example, the yellow and pink blobs are validated because their instance count is greater than the threshold value of the AUC range matching their AUC value. Conversely, the blue and green blobs are discarded because they fail to meet this condition.
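The validation rule in the flow chart can be sketched as a table lookup followed by a comparison. The threshold list reproduces the values of Table 3; the function name `validate_blob`, the strict "greater than" comparison, and the handling of values that fall exactly on a range boundary are assumptions of this sketch.

```python
# Threshold per 10%-wide AUC range (0-10, 10-20, ..., 90-100), as listed in Table 3.
THRESHOLDS = [80, 15, 1, 1, 1, 1, 1, 1, 1, 1]

def validate_blob(auc_percent, instance_count, thresholds=THRESHOLDS):
    """Return True if the blob is validated, False if it is discarded."""
    # Locate the AUC range; an AUC of exactly 100% falls in the last range.
    index = min(int(auc_percent // 10), len(thresholds) - 1)
    # The blob is kept only if its instance count surpasses the range threshold.
    return instance_count > thresholds[index]

kept_low_auc = validate_blob(auc_percent=5, instance_count=81)
dropped_low_auc = validate_blob(auc_percent=5, instance_count=80)
kept_mid_auc = validate_blob(auc_percent=55, instance_count=2)
```

The thresholds show the intent of the rule: a blob with a low AUC needs many intersecting instances to be trusted, whereas a blob with a high AUC needs only a couple.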
Figure 15.
SPARUS II AUV.
Table 1.
Combinations of hyperparameters tested for the SS module. The network was trained for each learning rate with and without data augmentation. The learning rate values were selected by scaling the default U-net learning rate, 0.001, by successive factors of 3.
| Data Augmentation | Learning Rate |
|---|---|
| Yes | 0.009 |
| Yes | 0.003 |
| Yes | 0.001 |
| Yes | 0.00033 |
| Yes | 0.00011 |
| No | 0.009 |
| No | 0.003 |
| No | 0.001 |
| No | 0.00033 |
| No | 0.00011 |
Table 2.
Combinations of hyperparameters tested for the OD module. The network was trained for each learning rate with and without data augmentation. The learning rate values were selected by scaling the default YOLOv5 learning rate, 0.01, by successive factors of 3.
| Data Augmentation | Learning Rate |
|---|---|
| Yes | 0.03 |
| Yes | 0.01 |
| Yes | 0.0033 |
| Yes | 0.0011 |
| Yes | 0.00037 |
| No | 0.03 |
| No | 0.01 |
| No | 0.0033 |
| No | 0.0011 |
| No | 0.00037 |
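The learning rates in Tables 1 and 2 are consistent with multiplying each default rate by integer powers of 3. The sketch below reproduces both grids; the exponent ranges are inferred from the tabulated values rather than stated in the captions, and the helper name `lr_grid` is hypothetical.

```python
def lr_grid(default, exponents):
    """Scale a default learning rate by powers of 3."""
    return [default * 3.0 ** e for e in exponents]

# SS module (U-net default 0.001): two steps up and two steps down.
ss_rates = lr_grid(0.001, [2, 1, 0, -1, -2])
# OD module (YOLOv5 default 0.01): one step up and three steps down.
od_rates = lr_grid(0.01, [1, 0, -1, -2, -3])

# Round to two significant digits, as in the tables.
ss_rounded = [float(f"{r:.2g}") for r in ss_rates]
od_rounded = [float(f"{r:.2g}") for r in od_rates]
```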
Table 3.
Instance-count threshold values used to validate blobs for each AUC range.
| AUC (%) | 0–10 | 10–20 | 20–30 | 30–40 | 40–50 | 50–60 | 60–70 | 70–80 | 80–90 | 90–100 |
|---|---|---|---|---|---|---|---|---|---|---|
| Instance-count threshold | 80 | 15 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 4.
Results of the hyperparameter study for the SS module. The table reports, for each combination of hyperparameters, the mean F1-score obtained from five-fold cross-validation.
| Data Augmentation | Learning Rate | F1-Score |
|---|---|---|
| No | 0.009 | 73.8% |
| No | 0.003 | 76.4% |
| No | 0.001 | 84.7% |
| No | 0.00033 | 87.8% |
| No | 0.00011 | 85.2% |
| Yes | 0.009 | 73.1% |
| Yes | 0.003 | 73.7% |
| Yes | 0.001 | 81.8% |
| Yes | 0.00033 | 85.9% |
| Yes | 0.00011 | 84.5% |
Table 5.
Results of the hyperparameter study for the OD module. The table reports, for each combination of hyperparameters, the mean F1-score obtained from five-fold cross-validation.
| Data Augmentation | Learning Rate | F1-Score |
|---|---|---|
| No | 0.03 | 69.3% |
| No | 0.01 | 72.9% |
| No | 0.0033 | 68.2% |
| No | 0.0011 | 67.8% |
| No | 0.00037 | 66.8% |
| Yes | 0.03 | 76.7% |
| Yes | 0.01 | 76.7% |
| Yes | 0.0033 | 77.4% |
| Yes | 0.0011 | 77.4% |
| Yes | 0.00037 | 76.7% |
Table 6.
Comparison of the SS and OD modules and the merging methods on the test set of the complete dataset.
| Approach | F1-Score | Coverage Error |
|---|---|---|
| SS Module | 78.8% | 10.0% |
| OD Module | 35.0% | 19.9% |
| Weighted Merging | 79.7% | 8.8% |
| AUC Merging | 84.2% | 5.9% |
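For reference, the two metrics in Table 6 could be computed from a predicted and a ground truth coverage map as sketched below. The pixel-wise F1-score follows the standard definition. The coverage error is an assumed definition, taken here as the absolute difference between the predicted and the ground truth coverage expressed as a fraction of the image area; the paper's exact formula is not given in this caption.

```python
def f1_and_coverage_error(pred, truth):
    """Return (pixel-wise F1-score, coverage error) for two binary maps.

    The coverage error formula is an assumption made for illustration.
    """
    flat_pred = [p for row in pred for p in row]
    flat_true = [t for row in truth for t in row]

    tp = sum(1 for p, t in zip(flat_pred, flat_true) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(flat_pred, flat_true) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(flat_pred, flat_true) if p == 0 and t == 1)

    # F1 = 2TP / (2TP + FP + FN); defined as 1.0 when both maps are empty.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 1.0

    # Assumed coverage error: difference in covered area over the image area.
    coverage_error = abs(sum(flat_pred) - sum(flat_true)) / len(flat_true)
    return f1, coverage_error

pred = [[1, 1], [1, 0]]
truth = [[1, 0], [0, 0]]
f1, cov_err = f1_and_coverage_error(pred, truth)
```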
Table 7.
Inference times for each YOLOv5 model size. The table reports the inference time on the AUV onboard computer for each YOLOv5 size. The total time is the sum of the OD module inference time and the SS module inference time.
| YOLOv5 size | XL | Large | Medium | Small | Nano |
|---|---|---|---|---|---|
| SS inference time (s) | 1.13 | 1.13 | 1.13 | 1.13 | 1.13 |
| OD inference time (s) | 1.63 | 0.93 | 0.45 | 0.2 | 0.09 |
| Total inference time (s) | 2.75 | 2.05 | 1.60 | 1.34 | 1.23 |