Application and Evaluation of the AI-Powered Segment Anything Model (SAM) in Seafloor Mapping: A Case Study from Puck Lagoon, Poland

Janowski, Łukasz; Wróblewski, Radosław

doi:10.3390/rs16142638


by Łukasz Janowski ¹,* and Radosław Wróblewski ²,³

¹ Maritime Institute, Gdynia Maritime University, Roberta de Plelo 20, 80-548 Gdańsk, Poland
² Department of Geophysics, University of Gdansk, Piłsudskiego 46, 81-378 Gdynia, Poland
³ MEWO S.A., Starogardzka 17A, 83-010 Straszyn, Poland
* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(14), 2638; https://doi.org/10.3390/rs16142638

Submission received: 30 April 2024 / Revised: 11 June 2024 / Accepted: 17 July 2024 / Published: 18 July 2024

(This article belongs to the Special Issue Advanced Remote Sensing Technology in Geodesy, Surveying and Mapping)


Abstract:

The digital representation of the seafloor, a challenge in UNESCO’s Ocean Decade initiative, is essential for supporting sustainable development and protecting the marine environment, in line with the United Nations’ 2030 program goals. Accurate seafloor representation can be achieved through remote sensing measurements, including acoustic and laser sources. Integrating ground truth information facilitates comprehensive seafloor assessment. The current seafloor mapping paradigm benefits from the object-based image analysis (OBIA) approach, which manages high-resolution remote sensing measurements effectively. A critical OBIA step is segmentation, for which various algorithms are available. Recent advances in artificial intelligence have led to the development of AI-powered segmentation algorithms, such as the Segment Anything Model (SAM) by META AI. This paper presents the first evaluation of the SAM approach for seafloor mapping. The benchmark remote sensing dataset covers Puck Lagoon, Poland, and includes measurements from various sources, primarily multibeam echosounders, bathymetric lidar, airborne photogrammetry, and satellite imagery. The SAM algorithm’s performance was evaluated on an affordable workstation equipped with an NVIDIA GPU, enabling the use of the CUDA architecture. The growing popularity of and demand for AI-based services suggest their widespread application in future underwater remote sensing studies, regardless of the measurement technology used (acoustic, laser, or imagery). The application of SAM to Puck Lagoon seafloor mapping may benefit other seafloor mapping studies intending to employ AI technology.

Keywords:

Segment Anything; object-based image analysis; underwater remote sensing; multibeam echosounder; bathymetric LiDAR; satellite-derived bathymetry; seafloor mapping; geomorphological bedforms; Puck Lagoon

1. Introduction

Challenge 8, also recognized as the Digital Representation of the Ocean, is a significant initiative under UNESCO’s Ocean Decade [1]. This challenge is dedicated to producing a complete digital model of the ocean, including a dynamic ocean map. The main objective is to develop a digital tool that will facilitate understanding of the ocean’s health, enable prediction of future changes, and guide the sustainable utilization of ocean resources. Additionally, it aims to contribute to the protection of the marine environment.

Presently, a significant contribution to the digital model of the seabed and its features comes from direct high-resolution measurements of the seafloor using remote sensing methods. Depending on the survey area and depth, various methods are utilized. These include traditional direct acoustic measurements using for example multibeam echosounders and side-scan sonars [2]. In addition, they also include laser or photogrammetric measurements from the air [3]. It is also important to consider alternative remote sensing methods, which enable effective bathymetry acquisition from space, generating specific products like satellite-derived bathymetry (SDB) [4,5].

In addition to the bathymetry digital elevation model (DEM), modern multibeam echosounder (MBES) and airborne laser bathymetry (ALB) devices can capture the intensity of the acoustic or laser signals reflected off the seabed [6,7,8]. These measurements, often referred to as backscatter intensity, can act as indicators for various characteristics of the seabed. While these measurements are more complex than basic bathymetry, they provide a more comprehensive understanding of the seafloor. However, it is crucial to note that accurately identifying specific features like sediment types or vegetation requires corroborating information from on-site observations. Consequently, backscatter intensity forms the foundation of the contemporary discipline of benthic habitat mapping [9,10], which is closely tied to recent advancements in underwater remote sensing technology. This interdisciplinary approach combines insights from various fields including oceanography, underwater acoustics, ecology, sedimentology, geomorphology, statistics, geoinformation, geoengineering, geodesy, and numerical modelling [11]. While the collection of MBES bathymetry is standardized for hydrographic applications [12], interpreting the intensity measurements of acoustic/laser signals from the seabed is a more complex task [13].

Modern remote sensing studies have evolved from pixel-based image analysis to object-based image analysis (OBIA), which is now widely applied in the discipline [14]. In parallel, image classification techniques have seen significant advancements, moving from unsupervised to supervised methods, and more recently including neural networks and artificial intelligence (AI). A fundamental step of OBIA is segmentation [15]. Choosing an effective segmentation plan is important to perform efficient remote sensing analysis of geospatial data. One of the most powerful and well-known segmentation algorithms is multiresolution segmentation (MRS) [16], which has been applied in numerous remote sensing studies and has become synonymous with the fundamental step of OBIA.

In 2023, META AI developed the Segment Anything Model (SAM) [17], which has emerged as a potent tool for remote sensing image analysis. SAM, an AI-powered algorithm, can identify and isolate individual objects in images. It was trained on over 1 billion segmentation masks collected from 11 million images from diverse geographic locations [17]. SAM can be utilized across various domains, including the geospatial industry, simplifying tasks such as identifying man-made objects, like buildings and cars, as well as natural features like individual trees in remote sensing imagery. SAM’s strength lies in its remarkable ability to generalize in zero-shot scenarios [18,19]. This makes it a valuable asset for analyzing aerial and orbital images from a wide range of geographical areas, a feature that could be particularly beneficial in diverse and variable underwater environments. The model has been put to the test on datasets of various scales, using different input prompts such as bounding boxes, individual points, and text descriptors. Its versatility has been tested in various disciplines, including medical sciences [20,21,22], biosciences [23], terrestrial remote sensing [18,24,25], planetary exploration [26], and even cultural heritage studies [27].

To the best of our knowledge, SAM has not yet been applied to seafloor mapping studies. Underwater remote sensing surveys are unique due to the additional water medium covering the seabed. This makes it challenging to directly determine what lies on the bottom, such as sediments or benthic organisms [28]. The spatial features of benthic habitats, seabed geomorphological formations, and other objects on the seabed can take various forms [29]. These forms are not repeatable and are therefore difficult to classify unambiguously, unlike objects found on land, such as characteristic buildings, trees, roads, agricultural fields, and so on. Moreover, since scattering intensity information from acoustic or laser signals serves only as a proxy for the character of the seabed, we typically need additional information in the form of direct ground truthing [30]. For this reason, survey campaigns are usually planned with a focus on collecting as much ground truth information as possible to gather comprehensive data about the characteristics of the seabed [31]. Such ground truth information is most often provided in the form of sediment distributions from sediment samples or underwater photographs or videos [32]. Because these data are sparse and point-based, they often require further interpretation or a classification approach based on supervised methods. This is why we are particularly interested in applying SAM in undersea research. In this context, objects on the seabed may not be as they initially appear, especially when compared to the everyday objects seen on the surface that SAM was pre-trained on. The potential of SAM to adapt to this unique environment and provide valuable insights is an exciting prospect worth exploring.

Therefore, this study aims to address specific performance aspects of SAM, including the delineation accuracy in response to visible seafloor properties and the computation speed required to process the benchmark dataset. The following research questions were raised:

(1) What are the application possibilities of SAM in current underwater remote sensing studies involving various types of measurements?
(2) How do the SAM and SAM + MRS algorithms perform in terms of delineation of bedforms compared to standalone MRS segmentation?
(3) Which typical types of geomorphological bedforms can be detected with the SAM algorithm?

2. Materials and Methods

2.1. Study Area

Puck Lagoon, situated in the northwestern part of the Gulf of Gdansk (southern Baltic), is a unique geographical feature [33]. It is isolated from the open sea by the Hel Peninsula and separated from Puck Bay by a partially submerged sand barrier known as Seagull Shallow [34]. Puck Lagoon is one of the most valuable areas in terms of biodiversity in the Polish part of the southern Baltic. It is therefore not surprising that it has been meticulously studied in a diverse range of environmental studies, including hydrobiology [35,36], hydrochemistry and marine pollution [37,38,39], sediment transport [40], shoreline migrations [41], and geological origin [42]. The lagoon is a shallow body of water characterized by a diverse bottom relief, featuring depressions reaching up to 9 m deep and long sand bars with an average depth of 1–2 m. The bottom relief of Puck Lagoon showcases forms associated with the development of barriers, lagoons, and river deltas, as well as remnants of glacial landforms [43].

The algorithms were tested on a benchmark remote sensing dataset consisting of raster grids from MBES, bathymetric lidar, airborne photogrammetry, and satellite imagery, covering an overall area of 121 km² of the Puck Lagoon, Poland. Most of the dataset, specifically the MBES, ALB, and airborne photogrammetry datasets, were acquired in 2022. SDB was performed based on a SPOT-6 satellite image, collected on 19 April 2021. Figure 1 illustrates the spatial extent of all the datasets, as well as the location of the research site within central Europe.

A detailed description of the datasets, including the exact devices used for acquisition, collection objectives, and validation, is available in Janowski, et al. [6]. Table 1 shows a summary of all datasets with their type, resolution, and pixel size used as input layers for SAM and MRS algorithms.

2.2. Description of the Methodology

The performance of the SAM algorithm was evaluated on an affordable workstation equipped with the following specifications: AMD Ryzen 5800H CPU, Nvidia GeForce RTX3070 GPU, 64 GB DDR4 3200 MHz RAM, and a Samsung 970 Evo Plus SSD. We evaluated both the standalone SAM algorithm and a combination of the MRS and SAM algorithms to assess the potential benefits of using MRS objects as seeds for SAM. All segmentation procedures were applied in Trimble eCognition software (version 10.4), which makes direct use of the PyTorch library in a Python environment.

SAM, a brainchild of META AI, is an innovative AI model that has the capability to isolate any object in an image with just a single click [44]. It is a unique segmentation system that can adapt to new and unfamiliar objects and images without the necessity for further training [17].

The implementation of the SAM algorithm allows for selection of three vision transformer (ViT) model types: ViT-B, ViT-L, and ViT-H [45]. The first model type (ViT-B) is the fastest and it produces the coarsest results with low complexity. The last model type (ViT-H) gives very detailed results, but its performance is the slowest of all. The middle model type (ViT-L) is characterized by intermediate properties giving a compromise between quality and performance. In this study, we tested all available model types.

SAM operates using a diverse range of input prompts. These prompts, which specify what to segment in an image, enable SAM to perform a broad spectrum of segmentation tasks without additional training [23,46]. One of the key strengths of SAM lies in its advanced training methodology. It has been trained on millions of images and masks, collected via a “data engine” that operates in a model-in-the-loop manner. This engine allows researchers to use SAM and its data to annotate images interactively and update the model. This iterative process has been repeated numerous times to enhance both the model and the dataset. The final dataset is quite extensive, comprising over 1.1 billion segmentation masks collected from approximately 11 million licensed and privacy-preserving images [47,48].

The output from SAM, in the form of masks, can be utilized as inputs to other AI systems. For instance, object masks can be tracked in videos, used in image editing applications, converted to 3D, or employed for creative tasks such as collaging. SAM has developed a comprehensive understanding of what constitutes an object. This understanding facilitates zero-shot generalization to unfamiliar objects and images, eliminating the need for additional training [18,49].

In addition, the SAM algorithm was evaluated and compared with the standalone results of the MRS algorithm. MRS has been widely applied in other seafloor mapping studies, e.g., [32,50,51]. MRS creates image objects using a bottom-up region merging technique, starting from one-pixel objects. This process is guided by primary features such as greyscale or shape [16]. Adjacent image objects are merged step by step, with each step involving the pair of objects whose merger results in the smallest increase in heterogeneity. The process is controlled by a scale parameter, which halts the fusion of image objects once a homogeneity criterion is met [16]. This homogeneity criterion can be understood as the minimum standard deviation of heterogeneity, which is determined by the relationship between the color, shape, compactness, and smoothness of image objects. These parameters are grouped into two weighted pairs: color/shape and smoothness/compactness. In this study, the color parameter was associated with the relative values of bathymetry, backscatter intensity, and orthophoto values within the considered image objects. The shape parameter is composed of the remaining parameters: smoothness and compactness. Smoothness is defined as the ratio of the border length of an image object to the border length of its bounding box, while compactness is the ratio of the border length of an image object to the square root of the number of pixels within that object [52]. Each parameter in a weighted pair can be assigned a value ranging from 0.1 to 0.9, with the two values of each pair summing to 1. The detailed equations for all heterogeneity parameters, including the scale parameter, are provided in Benz, et al. [16]. The scale parameter determines the size of the created image objects: a larger scale parameter allows more objects to be merged and larger objects (as defined by the heterogeneity parameters) to be formed [52]. Notably, the scale parameter has no units. In the standalone MRS segmentation approach, all input layers were included for the generation of image objects. The other multiresolution segmentation parameters, shape and compactness, were set to 0.1 and 0.5, respectively. These same values have been used in other marine habitat mapping studies that employed the multiresolution segmentation method, e.g., [29,53,54].
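The two shape measures described above can be illustrated on toy image objects. The following is a minimal sketch, assuming an object is given as a set of (row, column) pixel cells and that border length is counted in exposed pixel edges; the function names are hypothetical and do not come from eCognition. It computes per-object values only, whereas the full merging criterion in Benz, et al. [16] evaluates how these quantities change when two objects are fused.

```python
from math import sqrt

def border_length(pixels):
    """Count the exposed pixel edges of an object given as a set of (row, col) cells."""
    edges = 0
    for r, c in pixels:
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if (r + dr, c + dc) not in pixels:
                edges += 1
    return edges

def bounding_box_perimeter(pixels):
    """Perimeter, in pixel edges, of the axis-aligned bounding box of the object."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return 2 * (height + width)

def smoothness(pixels):
    """Border length divided by the bounding-box perimeter (1.0 for a rectangle)."""
    return border_length(pixels) / bounding_box_perimeter(pixels)

def compactness(pixels):
    """Border length divided by the square root of the object's pixel count."""
    return border_length(pixels) / sqrt(len(pixels))

# Weights stated in the text: shape = 0.1 and compactness = 0.5.
# Each pair sums to 1, so the complementary weights follow directly.
SHAPE_WEIGHT = 0.1
COLOR_WEIGHT = 1.0 - SHAPE_WEIGHT
COMPACTNESS_WEIGHT = 0.5
SMOOTHNESS_WEIGHT = 1.0 - COMPACTNESS_WEIGHT

# Three toy objects: a 4 x 4 square, a 1 x 16 strip, and a 3 x 3 block with a notch.
square = {(r, c) for r in range(4) for c in range(4)}
strip = {(0, c) for c in range(16)}
notched = {(r, c) for r in range(3) for c in range(3)} - {(0, 1)}

for name, obj in (("square", square), ("strip", strip), ("notched", notched)):
    print(name, round(smoothness(obj), 3), round(compactness(obj), 3))
```

The square and the strip contain the same number of pixels and are equally smooth, but the strip is far less compact; the notch raises the smoothness value above 1 because the border becomes longer than that of the bounding box.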

The SAM algorithm creates a regular grid of initial points, or “seeds”, whose density is set by the “point_per_side” parameter. These seeds are used to segment images, but because they are placed without regard to image content, they may not coincide with meaningful features. To gain more control over seed creation for object delineation by SAM, we applied an alternative approach in which MRS objects serve as seeds for SAM (SAM + MRS). Applying MRS before SAM makes the seeds more meaningful, leading to more precise object delineation in certain situations. The scale parameter of the MRS algorithm was set to 250 to produce seeds that more accurately represent the objects in the benchmark dataset.
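The difference between the two seeding strategies can be sketched in a few lines. The example below is illustrative only: `point_grid` builds an evenly spaced grid in normalized image coordinates, and `segment_centroids` derives one seed per segment from a label raster. How eCognition actually converts MRS objects into SAM seeds is not described here, so the use of segment centroids is an assumption, as are both function names.

```python
def point_grid(points_per_side):
    """Regular grid of seed points in normalized [0, 1] x [0, 1] image coordinates.

    Seeds are centred in the cells of a points_per_side x points_per_side grid,
    so their positions are independent of the image content.
    """
    offset = 1.0 / (2 * points_per_side)
    step = 1.0 / points_per_side
    coords = [offset + i * step for i in range(points_per_side)]
    return [(x, y) for y in coords for x in coords]

def segment_centroids(labels):
    """One seed per segment: the mean (x, y) pixel position of each label value.

    `labels` is a list of rows of integer segment ids, standing in for an MRS
    result. Assumption: a centroid is an acceptable seed; for strongly concave
    segments it may fall outside the segment itself.
    """
    sums = {}
    for row_idx, row in enumerate(labels):
        for col_idx, label in enumerate(row):
            sx, sy, n = sums.get(label, (0.0, 0.0, 0))
            sums[label] = (sx + col_idx, sy + row_idx, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

# A toy label raster with three segments of different sizes.
toy_labels = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
]

grid_seeds = point_grid(2)
mrs_seeds = segment_centroids(toy_labels)
print(grid_seeds)
print(mrs_seeds)
```

The grid always yields the same number of seeds regardless of content, while the segment-based variant yields exactly one seed per pre-segmented object, which is the kind of control over seed placement that the combined approach aims for.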

In the subsequent phase of the process, we employed a supervised classification technique utilizing the machine learning-based random forest (RF) classifier [55]. The manual delineation of bedforms served as the basis for generating 2100 random control points. We ensured these points were strategically and representatively placed, maintaining a minimum distance of 4 m from each other and 10 m from the class boundaries [8]. Following this, we divided the control points into training and validation samples, adhering to a 70/30 split [56]. The training points were utilized to execute the random forest classification, while the validation samples were used to assess the accuracy of the classification [57,58,59]. In the confusion matrix, the diagonal elements denote instances that have been correctly classified for each respective class. Conversely, the off-diagonal elements signify instances that have been misclassified. The producer’s accuracy is a measure of the likelihood that a given ground truth class is accurately classified. On the other hand, the user’s accuracy signifies the probability that a predicted class corresponds to that class in reality. The kappa per class metric provides a measure of agreement between the predicted and actual class labels, adjusted for chance agreement [60]. This value ranges from −1, indicating total disagreement, to 1, indicating perfect agreement. A value of 0 would suggest an agreement equivalent to random classification. The overall accuracy is calculated as the ratio of the total number of correct classifications to the total number of instances [57]. Lastly, the kappa statistic measures the degree of agreement between the predicted and actual class labels, taking into account the possibility of chance agreement [60]. A kappa value of 1 signifies perfect agreement, while a kappa value of 0 suggests agreement no better than random classification. This approach allowed us to maintain the integrity of the classification process while also providing a means for performance evaluation. A detailed workflow presenting all steps mentioned in this section is provided in Figure 2.
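The accuracy measures defined above can be written out for a small matrix. This is a minimal sketch, assuming rows hold the reference (ground truth) classes and columns the predicted classes; the two-class matrix is a toy example and not data from this study, and the function name is hypothetical. Kappa per class is omitted for brevity.

```python
def accuracy_report(matrix):
    """Accuracy measures from a square confusion matrix.

    matrix[i][j] is the number of validation points whose reference class is i
    and whose predicted class is j (assumed orientation).
    """
    n_classes = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]  # reference totals per class
    col_totals = [sum(row[j] for row in matrix) for j in range(n_classes)]  # predicted totals
    correct = sum(matrix[i][i] for i in range(n_classes))

    overall = correct / total
    # Agreement expected by chance, from the row and column marginals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)

    producers = [matrix[i][i] / row_totals[i] if row_totals[i] else None
                 for i in range(n_classes)]
    users = [matrix[i][i] / col_totals[i] if col_totals[i] else None
             for i in range(n_classes)]
    return {
        "overall_accuracy": overall,
        "kappa": kappa,
        "producers_accuracy": producers,
        "users_accuracy": users,
    }

# 70/30 split of the 2100 control points mentioned in the text.
n_points = 2100
n_training = n_points * 70 // 100
n_validation = n_points - n_training

# Toy two-class confusion matrix (illustrative values only).
toy_matrix = [
    [8, 2],
    [1, 9],
]
report = accuracy_report(toy_matrix)
print(n_training, n_validation)
print(report)
```

A class that is never predicted has a column total of zero, so its user's accuracy is reported as `None` rather than as zero; this distinction matters when a coarse segmentation leaves many classes unassigned.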

Data processing performance was a crucial determinant in this study. In relation to the units employed in digital photography, most of the remote sensing data used in this study have a resolution surpassing 6000 Mpix. As the processing performance is contingent on the datasets with the greatest size (and smallest pixel size), it was hypothesized that if the algorithm struggled to yield a result for the original pixel size datasets, we would test the lower resolutions using the “copy with scale” command in eCognition software.
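The effect of pixel size on raster size follows from simple arithmetic, sketched below. The calculation counts only the 121 km² of mapped area stated earlier, so it gives a lower bound: a rectangular raster that also stores no-data cells around the lagoon is larger, and the figures here are not meant to reproduce the 6000 Mpix quoted above. The function name is hypothetical, and the 0.2 m value is used purely as an example of a fine pixel size.

```python
def megapixels(area_m2, pixel_size_m):
    """Approximate number of megapixels needed to cover an area at a given pixel size."""
    return area_m2 / (pixel_size_m ** 2) / 1e6

AREA_M2 = 121e6  # 121 km², the coverage stated for the benchmark dataset

for pixel_size_m in (0.2, 3.0, 4.0, 5.0):
    print(f"{pixel_size_m:>4} m -> {megapixels(AREA_M2, pixel_size_m):10.2f} Mpix")

# The pixel count falls with the square of the pixel size.
reduction = megapixels(AREA_M2, 0.2) / megapixels(AREA_M2, 5.0)
print(f"0.2 m -> 5 m reduces the pixel count by a factor of {reduction:.0f}")
```

Because the count scales with the inverse square of the pixel size, even a modest coarsening shrinks the input substantially, which is why rescaling is a practical fallback when a run does not complete at the original resolution.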

The precision of the SAM algorithm was determined through an expert interpretation of significant geomorphological bedforms present in Puck Lagoon. A thorough manual investigation was conducted to meticulously examine these bedforms. Due to the necessary level of detail, the delineation was conducted based on all the datasets except SDB. The process involved the analysis of seafloor relief, where the boundaries of the bedforms were defined. The interpretation of depth, slope, and aspect facilitated the delineation of these boundaries, which encompassed edges, slope bases, bedform ridges, and trough form axes. The analysis was based on bathymetry data, derived at a scale of 1:5000 or larger. The slope, aspect, and bathymetric profiles were generated using Global Mapper software (version 22.1).
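For readers unfamiliar with slope as a terrain derivative, a simplified illustration follows. It uses plain central differences on a small bathymetric grid; this is an assumption made for clarity and is not a description of the algorithm implemented in Global Mapper. Aspect could be derived from the same two gradients but is left out to avoid committing to a particular angular convention.

```python
from math import atan, degrees, hypot

def slope_degrees(dem, cell_size_m):
    """Slope in degrees for the interior cells of a gridded surface.

    `dem` is a list of rows of depths or elevations in metres. Border cells
    are left as None because central differences need both neighbours.
    """
    n_rows, n_cols = len(dem), len(dem[0])
    slope = [[None] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size_m)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size_m)
            slope[r][c] = degrees(atan(hypot(dz_dx, dz_dy)))
    return slope

# Toy surfaces: a ramp deepening by 1 m per 1 m cell, and a flat bottom at 2 m depth.
ramp = [[-float(c) for c in range(4)] for _ in range(4)]
flat = [[-2.0] * 4 for _ in range(4)]

print(slope_degrees(ramp, 1.0)[1][1])
print(slope_degrees(flat, 1.0)[1][1])
```

Steep, continuous gradients of the kind the ramp imitates are the features that mark edges and slope bases in the manual delineation.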

In the final analysis, we were able to distinguish 21 distinct types of bedforms (Table 2). Each of these bedforms represents a unique aspect of the marine environment. To enhance clarity in the subsequent sections of the paper, all bedforms have been denoted using specific symbols. Additionally, the Abbreviations table provides a comprehensive summary of all abbreviations utilized in this study.

3. Results

The application of the SAM algorithm to the benchmark remote sensing datasets of Puck Lagoon showed that the time required by the algorithm varies depending on the resolution of the dataset and the model type used. The algorithm requires a significant amount of GPU memory, and it appears that the 8 GB of memory on the Nvidia RTX3070 may be insufficient for datasets as large or heterogeneous as the Puck Lagoon dataset, especially for high-resolution applications. Table 3 provides a summary of the parameters used to evaluate the performance of the SAM algorithm, including the pixel size of the datasets, the model type used, the output (whether the result was achieved), and the time required to complete the computations.

Similarly, the tests involving the combined SAM and MRS approach yielded comparable results (Table 4). Due to the more complex combined segmentation design, the generation of results took significantly longer. Furthermore, it was observed that one fewer result was obtained with the ViT-B model type (with a pixel size of 3 m), while one additional result was obtained for the ViT-L model type (with a pixel resolution of 4 m).

Figure 3 depicts all the successful results of both the SAM and SAM + MRS algorithms obtained in this study, presented as maps with spatial representation of image segments. It is evident that the results from the standalone SAM algorithm are largely similar (Figure 3a–c). The only exception is the result that used the ViT-L model type and a resolution of 5 m, which differs from the others (Figure 3d). More variations become apparent after the introduction of the combined SAM + MRS algorithm. In these scenarios, the results appear finer, slightly more complex, and represent more seafloor features that could be bedforms (Figure 3e–h). While none of the results are as complex as the manual map prepared at a 1:5000 scale or larger (Figure 3i), there are visible similarities between some features in both the artificial and manual approaches.

The segmentation process effectively captured elements of considerable size or those with significant length and continuity. This was particularly true for forms with substantial slope gradients. However, elements similar in scale but with smaller elevation differences, such as the bottom relief elements of the deeper parts of Puck Bay, were not as well represented due to the resolution of the data.

The most accurately depicted elements included the following:

- Anthropogenic relief elements at the bottom of Puck Lagoon, which were clearly captured due to the significant depth difference and the steepness of the slopes.
- Long, continuous areas of slopes with pronounced gradients, including the distal slopes of sand waves.

These results align with the accepted resolution of the input data that were analyzed. This suggests that the imaging process was successful in capturing the major features of the seabed, while some of the finer details may have been lost due to limitations in data resolution.

Figure 4 illustrates the outcome of the MRS segmentation and RF classification. The figure also plots the extents of the manual extractions and one of the SAM application results (SAM + MRS algorithms, ViT-L model type, and 5 m pixel size). The random forest classification, as depicted in the results, is capable of handling the significant complexity of the task, which involves division into 21 classes and the distinction of a multitude of finer details. Despite the presence of several misclassifications, which are notably evident in the residual measurement artifacts (for instance, the measurement noise visible in certain areas of the ALB remote sensing data, as shown in Figure 1d), the overall result is comprehensive. It distinctly demonstrates the division into 21 bedform classes. Furthermore, it is important to remember that the control points relied on manual delineation. Because high-resolution data were available for only part of the study area, the manual delineation did not encompass the entire area. However, the application of MRS across all data types enabled the delineation of segments and facilitated predictions for the entire study area (Figure 4a).

In contrast, Figure 4b depicts the outcome of applying the RF classification algorithm to the combined MRS and SAM results. It is evident that the majority of the area was categorized as “anthropogenic formations”. This was followed by “uneven seabed”, and finally “accumulations of organics”. Due to the significantly lower level of segmentation complexity compared to the MRS result, only 3 out of 21 classes were assigned to various image objects with differing levels of accuracy.

The accuracy assessment results, as shown in Table 5, offer valuable insights into the performance of the classification algorithm in the MRS + RF scenario. For example, the class of “megaripplemarks rhomboidal” has 24 instances that were correctly classified, while “slightly undulating seabed” has 26. The algorithm correctly identified 83% of the actual instances of “megaripplemarks rhomboidal”, yielding a producer’s accuracy of 0.83. When the algorithm designates an instance as “megaripplemarks rhomboidal”, it is accurate 75% of the time, indicating a user’s accuracy of 0.75. A kappa statistic of 0.82 for “megaripplemarks rhomboidal” illustrates a significant concurrence between the predicted and actual classifications. The overall accuracy of the algorithm, which is the total number of correct classifications divided by the total number of instances, stands at 0.66 or 66%. The overall kappa statistic is 0.64, suggesting a substantial agreement.

While the algorithm demonstrates a reasonable level of accuracy in classifying certain classes, it struggles with others. This could be attributed to various factors such as class imbalance, noise in the data, or the algorithm’s inability to capture the distinguishing features of certain classes.

For comparison, we also conducted an accuracy assessment for the SAM + MRS + RF algorithm (Table 6). The results indicate a marginally higher overall accuracy compared to the previous findings. However, it is important to note that due to the reduced complexity of segmentation, the number of classes that emerged decreased to 5 out of 21. Furthermore, even though some classes were present in the validation ground truth dataset, they were not classified at all. This led to false positive results, particularly in the “undulating seabed” and “foreshore slope” classes.

Table 7 presents the percentage spatial distribution of different types of bedforms, as determined by three methods: manual delineation, MRS + RF, and SAM + MRS + RF. The manual delineation is treated as the reference for the other two methods. Compared with the manual delineation, the MRS + RF method strongly over-represents “uneven seabed” (from 1.26% to 30.90%) and “flat, even seabed with vegetation” (from 5.04% to 21.13%), while it strongly under-represents “slightly undulating seabed” (from 32.74% to 2.34%), “undulating seabed” (from 11.71% to 1.63%), and “flat, even seabed” (from 22.12% to 1.82%). The SAM + MRS + RF method assigns most of the area to two bedforms, “uneven seabed” (19.60%) and “anthropogenic formations” (77.97%), over-representing both relative to the manual delineation (from 1.26% to 19.60% and from 0.87% to 77.97%, respectively).
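The departures from the manual reference can be expressed as differences in percentage points. The short sketch below uses only the area shares quoted in the paragraph above (not the full Table 7); the dictionary layout and function name are hypothetical.

```python
# Area shares in percent, as quoted in the text for selected bedform classes.
AREA_SHARES = {
    "uneven seabed": {"manual": 1.26, "mrs_rf": 30.90, "sam_mrs_rf": 19.60},
    "flat, even seabed with vegetation": {"manual": 5.04, "mrs_rf": 21.13},
    "slightly undulating seabed": {"manual": 32.74, "mrs_rf": 2.34},
    "undulating seabed": {"manual": 11.71, "mrs_rf": 1.63},
    "flat, even seabed": {"manual": 22.12, "mrs_rf": 1.82},
    "anthropogenic formations": {"manual": 0.87, "sam_mrs_rf": 77.97},
}

def departure_from_manual(method):
    """Percentage-point difference between a method's area share and the manual share.

    Positive values mean the class covers more area than in the manual map,
    negative values mean it covers less. Classes without a quoted value for
    the method are skipped.
    """
    return {
        bedform: round(shares[method] - shares["manual"], 2)
        for bedform, shares in AREA_SHARES.items()
        if method in shares
    }

for method in ("mrs_rf", "sam_mrs_rf"):
    print(method)
    for bedform, diff in departure_from_manual(method).items():
        print(f"  {bedform}: {diff:+.2f} percentage points")
```

Signed differences make it immediately visible which classes absorb area and which lose it, something that is harder to read from the raw shares alone.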

This analysis suggests that while the MRS + RF and SAM + MRS + RF methods over-represent certain bedforms relative to the manual delineation, they substantially under-represent others. This could be due to the different characteristics of the bedforms and the specific conditions under which the methods are applied. Considering the complexity of the task, it is clear that the MRS + RF method has a broader application, as it was able to classify all 21 bedforms. This suggests that it might be a more general method, capable of handling a wider variety of bedforms.

4. Discussion

Based on the results provided above, we can derive some meaningful insights. The results from the standalone SAM algorithm provide a comprehensive overview of the performance of the method on the Puck Lagoon dataset. It is evident that the algorithm’s performance is significantly influenced by the pixel size of the datasets and the model type used. The results from the combined SAM and MRS approach provide some additional insights. The increased complexity of the combined segmentation design led to longer computation times, but it also resulted in more detailed and complex results, particularly for the ViT-B and ViT-L model types with a pixel resolution of 4 m and 5 m.

According to Wei et al. [61], ViT-H shows substantial improvement over ViT-B, but only marginal gains over ViT-L. Most runs of the SAM algorithm resulted in an “Out of memory” error, suggesting that these attempts might require significant GPU memory resources and their performance could be constrained by the available memory. The ViT-B model was capable of producing outputs at pixel sizes of 3, 4, and 5 m without any errors, indicating that ViT-B might be more memory-efficient compared to the other models. The ViT-L model was only able to produce an output at a pixel size of 5 m before running into memory issues, suggesting that ViT-L might require more memory resources than ViT-B. All attempts with the ViT-H model resulted in “Out of memory” errors, indicating that ViT-H might be the most memory-intensive among the three models.
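The trial-and-error procedure implied by these outcomes, running the finest pixel size first and coarsening after each memory failure, can be sketched as a simple loop. The code below does not call SAM: `make_stub` is a hypothetical stand-in that merely replays the standalone SAM outcomes reported in the paragraph above, and Python's built-in `MemoryError` is used as a placeholder for whatever out-of-memory condition the real software reports.

```python
def first_feasible(pixel_sizes_m, run_segmentation):
    """Try pixel sizes from finest to coarsest and return the first that completes.

    Returns a (pixel_size, result) tuple, or None if every attempt runs out of memory.
    """
    for size in sorted(pixel_sizes_m):
        try:
            return size, run_segmentation(size)
        except MemoryError:
            continue  # fall back to the next coarser pixel size
    return None

# Pixel sizes (m) at which the text reports a successful standalone SAM run.
REPORTED_OK = {"ViT-B": {3, 4, 5}, "ViT-L": {5}, "ViT-H": set()}

def make_stub(model_type):
    """Hypothetical stand-in for a segmentation run; it only replays REPORTED_OK."""
    def run(pixel_size_m):
        if pixel_size_m not in REPORTED_OK[model_type]:
            raise MemoryError(f"{model_type} at {pixel_size_m} m: out of memory")
        return f"{model_type} segments at {pixel_size_m} m"
    return run

for model_type in ("ViT-B", "ViT-L", "ViT-H"):
    print(model_type, first_feasible([3, 4, 5], make_stub(model_type)))
```

Starting from the finest pixel size keeps as much detail as the hardware permits, at the cost of a few failed attempts before a run completes.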

The SAM + MRS algorithm, with ViT-B and ViT-L models, was able to produce outputs at pixel sizes of 4 and 5 m without any errors. This suggests that these models might perform optimally in environments with more memory, particularly when handling larger pixel sizes. The ViT-H model consistently ran out of memory across all pixel sizes, indicating that it may be more memory-intensive and might not be suitable for environments with limited memory resources. The processing time varied significantly, with some instances taking just a few seconds while others took over an hour. This indicates that the processing time is highly dependent on both the model type and pixel size.

The spatial representation of image segments clearly illustrates the differences between the standalone SAM algorithm and the combined SAM + MRS approach. While the standalone SAM algorithm produced largely similar results, the combined approach resulted in finer, more complex results that represent more seafloor features that could be bedforms. This suggests that the combined SAM + MRS approach may be more effective at capturing the complexity of the seabed.

The model did encounter some difficulties when dealing with images of lower spatial resolution. However, SAM showed promising adaptability in the analysis of remote sensing data. The results align with the resolution of the input data analyzed. It is clear that there is a need for further refinement of the discussed methods and algorithms. While the primary elements of the bottom relief are discernible at resolutions of 3, 4, and 5 m per pixel, there is substantial room for improvement. When compared to manual analyses conducted on data at a resolution of 0.2 m and a scale greater than 1:5000, it is evident that the efficiency of the algorithms used needs to be enhanced. This is an area that warrants further work and development. Future research should aim to enhance the model’s effectiveness by integrating it with additional fine-tuning techniques and other networks.

In our analysis, the algorithm primarily emphasizes elements that form continuous, elongated structures, with a significant length-to-width ratio. This approach effectively captures strong features associated with distinct changes in the bottom surface, such as steep slopes and abrupt depth changes (for example, anthropogenic post-dredging pits that are remnants of sediment extractions for beach and shore refinement [62,63,64]). Given the resolution of the data used in the calculations, it is not entirely feasible to verify whether the algorithm would have equally detected the shorter edge features of smaller formations.
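One way to put a number on the "length-to-width ratio" mentioned above is an elongation index computed from the spread of a segment's pixel coordinates. The sketch below is purely illustrative and was not used in this study. The function name `elongation` and the toy segments are hypothetical, and the index (the ratio of the standard deviations along the two principal axes) only approximates a true length-to-width ratio.

```python
import math

def elongation(pixels):
    """Elongation index of a segment given as a list of (row, col) pixels.

    Computed as sqrt(major / minor), where major and minor are the
    eigenvalues of the 2 x 2 covariance matrix of the coordinates.
    """
    n = len(pixels)
    mean_r = sum(r for r, _ in pixels) / n
    mean_c = sum(c for _, c in pixels) / n
    var_r = sum((r - mean_r) ** 2 for r, _ in pixels) / n
    var_c = sum((c - mean_c) ** 2 for _, c in pixels) / n
    cov = sum((r - mean_r) * (c - mean_c) for r, c in pixels) / n
    half_trace = (var_r + var_c) / 2
    det = var_r * var_c - cov ** 2
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    major, minor = half_trace + root, half_trace - root
    # A one-pixel-wide line has no spread across its width.
    return math.inf if minor <= 0 else math.sqrt(major / minor)

# Toy segments: a 2 x 40 ribbon and a 10 x 10 patch.
ribbon = [(r, c) for r in range(2) for c in range(40)]
patch = [(r, c) for r in range(10) for c in range(10)]
print(elongation(ribbon), elongation(patch))
```

An index of this kind could be used to check whether the segments returned at a given pixel size are dominated by elongated features, although any threshold would have to be chosen empirically.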

We believe that additional tests are necessary to evaluate how the algorithm would perform in the geomorphological analysis of a specific section of the bottom surface. This includes not only the shape, slopes, edges, and distinct depth changes, but also the nature of the infilling of the resulting area shape (and any potential further subdivisions based on this). At present, the feature most strongly emphasized by the algorithm is the edges. The surface approach, which involves the analysis of the bottom surface texture [65], its roughness [66], and its ripples, does not appear to be the algorithm’s strong suit, as it does not form the basis for indicating boundaries. Therefore, at this stage, it is challenging to assess how well the algorithm distinguishes between different types of bottom surfaces, particularly in terms of describing the character of the surface, its roughness, ripples, and so on [67].

However, the resolution of the data can significantly affect the effectiveness of the segmentation process [68]. While elements of considerable size or those with significant length and continuity were effectively captured, smaller elements with less pronounced gradients were not as well represented. This highlights the need for both high-resolution data and powerful GPUs to capture the full complexity of the seabed.

The results from the MRS segmentation and RF classification further underscore the potential of these algorithms for seabed mapping [8]. Despite the presence of several misclassifications, the overall result is comprehensive and demonstrates the division into 21 bedform classes. The application of MRS across all data types enabled the delineation of segments and facilitated predictions for the entire study area, suggesting that this approach could be effective for large-scale seabed mapping [69].

These results highlight the importance of selecting the appropriate model and pixel size for remote sensing applications. Further research could focus on optimizing the SAM algorithm for different model types and pixel sizes to improve its performance and efficiency. Additionally, considering the significant GPU memory requirement of the algorithm, future work could also explore methods to reduce the memory footprint of the algorithm, enabling its application on larger or more heterogeneous datasets.
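One commonly used way to reduce peak memory for large rasters is to process them in overlapping tiles, so that only one tile is held on the GPU at a time. The sketch below shows how such tile windows could be generated. It is an assumption about a possible direction, not a method used or evaluated in this study. The function name `tile_windows` and the default tile and overlap sizes are hypothetical, and merging segments across tile borders is left open.

```python
def tile_windows(height, width, tile=1024, overlap=128):
    """Yield (row, col, h, w) windows that together cover the raster.

    Neighbouring windows share `overlap` pixels so that features cut
    by a tile border appear whole in at least one window.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    step = tile - overlap
    for row in range(0, height, step):
        h = min(tile, height - row)
        for col in range(0, width, step):
            w = min(tile, width - col)
            yield row, col, h, w
            if col + tile >= width:   # this window already reaches the right edge
                break
        if row + tile >= height:      # this row of windows reaches the bottom edge
            break

# Example: windows for a hypothetical 2500 x 3000 pixel raster.
windows = list(tile_windows(2500, 3000))
print(len(windows), windows[0], windows[-1])
```

Each window can then be read, segmented, and written independently, at the cost of an extra step to reconcile segments in the overlapping strips.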

While both the standalone SAM algorithm and the combined SAM + MRS approach show promise for seabed mapping, careful consideration must be given to the choice of model type, pixel size, and data resolution. Beyond reducing computation time and memory requirements, improving the accuracy of segmentation and classification, particularly for smaller elements with less pronounced gradients, is an important area for future research.

The application of SAM in underwater remote sensing surveys presents a promising avenue for future research. Its ability to identify and isolate individual objects in images could potentially enhance our understanding of the seabed and its diverse features. However, further research is needed to explore the effectiveness of SAM in this unique and complex environment. This could pave the way for more comprehensive and accurate seabed mapping studies, ultimately contributing to advancements in the field of underwater remote sensing and the broader goal of the Digital Representation of the Ocean initiative under UNESCO’s Ocean Decade.

5. Conclusions

The Segment Anything Model (SAM), developed by Meta AI in 2023, has demonstrated significant potential in the field of remote sensing image analysis. Its ability to identify and isolate individual objects in images, coupled with its remarkable capacity to generalize and learn from zero-shot scenarios, makes it a valuable tool across various domains, including the geospatial industry. Despite its proven effectiveness in various disciplines, SAM has not yet been applied to seafloor mapping studies. The unique challenges posed by underwater remote sensing surveys, such as the additional water medium covering the seabed and the variable forms of benthic habitats and seabed geomorphological formations, make this a complex yet intriguing area for the application of SAM.

In this study, we addressed the research areas that emerged from the questions outlined in the Introduction.

(1): The SAM algorithm can be applied broadly in current underwater remote sensing studies, but its performance is significantly influenced by the pixel size of the datasets and the model type used. The resolution of the data also affects the segmentation: elements of considerable size or those with significant length and continuity were effectively captured, whereas smaller elements with less pronounced gradients were not as well represented. This highlights the need for high-resolution data to capture the full complexity of the seabed, and for sufficient processing power to run the SAM algorithm on such data.
(2): The standalone SAM algorithm and the combined SAM + MRS approach both have their strengths and weaknesses in terms of delineating bedforms. While the standalone SAM algorithm produced largely similar segmentations across the tested settings, the combined approach produced finer, more complex segments that delineate more seafloor features that could be bedforms. This suggests that the combined SAM + MRS approach may be more effective at capturing the complexity of the seabed. However, the increased complexity of the combined segmentation design led to longer computation times.
(3): The SAM algorithm is capable of detecting some types of geomorphological bedforms. The results from the MRS segmentation and RF classification underscore the potential of these algorithms for seabed mapping. Despite the presence of several misclassifications, the overall result is comprehensive and demonstrates the division into 21 bedform classes. The application of MRS across all data types enabled the delineation of segments and facilitated predictions for the entire study area, suggesting that this approach could be effective for large-scale seabed mapping. However, it is important to note that the effectiveness of the SAM algorithm in detecting these bedforms is significantly influenced by the pixel size of the datasets and the model type used.

The need for additional information in the form of direct ground truthing, typically provided by sediment distributions from sediment samples or underwater photographs or videos, further underscores the potential of SAM in this context. Given the sparse, point-based nature of data obtained from undersea research, the application of SAM could potentially facilitate further interpretation and classification based on supervised methods.

While objects on the seabed may not initially appear as they do on the surface, the potential of SAM to adapt to this unique environment and provide valuable insights is an exciting prospect. As such, future research could focus on exploring the application of SAM in undersea research, with the aim of harnessing its capabilities to enhance our understanding of the seabed and its diverse features. This could pave the way for more comprehensive and accurate seabed mapping studies, ultimately contributing to advancements in the field of underwater remote sensing.

Author Contributions

Conceptualization, Ł.J.; methodology, Ł.J.; software, Ł.J.; validation, Ł.J. and R.W.; formal analysis, Ł.J.; investigation, Ł.J.; resources, Ł.J.; data curation, Ł.J. and R.W.; writing—original draft preparation, Ł.J. and R.W.; writing—review and editing, Ł.J.; visualization, Ł.J.; supervision, Ł.J.; project administration, Ł.J.; funding acquisition, Ł.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded in whole or in part by National Science Centre, Poland [Grant number: 2021/40/C/ST10/00240].

Data Availability Statement

All remote sensing datasets utilized in this paper can be accessed through the Marine Geoscience Data System [70].

Conflicts of Interest

Author Radosław Wróblewski was employed by the company MEWO S.A. The remaining author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

OBIA	Object-based image analysis
SAM	Segment Anything Model
SDB	Satellite-derived bathymetry
DEM	Digital elevation model
MBES	Multibeam echosounder
ALB	Airborne laser bathymetry
AI	Artificial intelligence
MRS	Multiresolution segmentation
RF	Random forest
GPU	Graphics processing unit
ViT-B	Vision transformer model type with low complexity
ViT-L	Vision transformer model type with moderate complexity
ViT-H	Vision transformer model type with high complexity
sbz	Sandbank zone
fsl	Foreshore slope
ssl	Steep slopes
esb	Flat, even seabed
isb	Uneven seabed
ssb	Slightly undulating seabed
usb	Undulating seabed
srb	Sand ribbons
swm	Sand waves with mega ripple marks
mrp	Mega ripple marks
rmr	Rhomboidal mega ripple marks
mmr	Medium-sized mega ripple marks with ripple marks
smr	Small mega ripple marks with ripple marks
sbv	Flat, even seabed with vegetation
uso	Undulating seabed with minor accumulations of organics
org	Accumulations of organics
del	Delta
rdf	Relict deltaic formations
gou	Glacial outcrops
rsb	Relict sandbanks
ant	Anthropogenic formations

References

1. Guan, S.; Qu, F.; Qiao, F. United Nations Decade of Ocean Science for Sustainable Development (2021–2030): From innovation of ocean science to science-based ocean governance. Front. Mar. Sci. 2023, 9, 1091598.
2. De Moustier, C.; Matsumoto, H. Seafloor acoustic remote sensing with multibeam echo-sounders and bathymetric sidescan sonar systems. Mar. Geophys. Res. 1993, 15, 27–42.
3. Doneus, M.; Doneus, N.; Briese, C.; Pregesbauer, M.; Mandlburger, G.; Verhoeven, G. Airborne laser bathymetry—Detecting and recording submerged archaeological sites from the air. J. Archaeol. Sci. 2013, 40, 2136–2151.
4. Sagawa, T.; Yamashita, Y.; Okumura, T.; Yamanokuchi, T. Satellite derived bathymetry using machine learning and multi-temporal satellite images. Remote Sens. 2019, 11, 1155.
5. Ashphaq, M.; Srivastava, P.K.; Mitra, D. Review of near-shore satellite derived bathymetry: Classification and account of five decades of coastal bathymetry research. J. Ocean. Eng. Sci. 2021, 6, 340–359.
6. Janowski, Ł.; Skarlatos, D.; Agrafiotis, P.; Tysiąc, P.; Pydyn, A.; Popek, M.; Kotarba-Morley, A.M.; Mandlburger, G.; Gajewski, Ł.; Kołakowski, M.; et al. High resolution optical and acoustic remote sensing datasets of the Puck Lagoon. Sci. Data 2024, 11, 360.
7. Mandlburger, G.; Pfennigbauer, M.; Schwarz, R.; Pöppl, F. A decade of progress in topo-bathymetric laser scanning exemplified by the Pielach River dataset. ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci. 2023, X-1/W1-2023, 1123–1130.
8. Janowski, L.; Wroblewski, R.; Rucinska, M.; Kubowicz-Grajewska, A.; Tysiac, P. Automatic classification and mapping of the seabed using airborne LiDAR bathymetry. Eng. Geol. 2022, 301, 106615.
9. Misiuk, B.; Brown, C.J. Benthic habitat mapping: A review of three decades of mapping biological patterns on the seafloor. Estuar. Coast. Shelf Sci. 2024, 296, 108699.
10. Brown, C.J.; Smith, S.J.; Lawton, P.; Anderson, J.T. Benthic habitat mapping: A review of progress towards improved understanding of the spatial ecology of the seafloor using acoustic techniques. Estuar. Coast. Shelf Sci. 2011, 92, 502–520.
11. Lecours, V.; Devillers, R.; Schneider, D.C.; Lucieer, V.L.; Brown, C.J.; Edinger, E.N. Spatial scale and geographic context in benthic habitat mapping: Review and future directions. Mar. Ecol. Prog. Ser. 2015, 535, 259–284.
12. International Hydrographic Organization. IHO Standards for Hydrographic Surveys S-44 ed. 6.0; International Hydrographic Organization: Monaco, Monaco, 2020.
13. Lamarche, G.; Lurton, X. Recommendations for improved and coherent acquisition and processing of backscatter data from seafloor-mapping sonars. Mar. Geophys. Res. 2017, 39, 5–22.
14. Lu, D.; Weng, Q. A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sens. 2007, 28, 823–870.
15. Blaschke, T.; Hay, G.J.; Kelly, M.; Lang, S.; Hofmann, P.; Addink, E.; Queiroz Feitosa, R.; van der Meer, F.; van der Werff, H.; van Coillie, F.; et al. Geographic Object-Based Image Analysis—Towards a new paradigm. ISPRS J. Photogramm. Remote Sens. 2014, 87, 180–191.
16. Benz, U.C.; Hofmann, P.; Willhauck, G.; Lingenfelder, I.; Heynen, M. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS J. Photogramm. Remote Sens. 2004, 58, 239–258.
17. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.Z.; Rolland, C.; Gustafson, L.; Xiao, T.T.; Whitehead, S.; Berg, A.C.; Lo, W.Y.; et al. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 3992–4003.
18. Osco, L.P.; Wu, Q.S.; de Lemos, E.L.; Gonsalves, W.N.; Ramos, A.P.M.; Li, J.; Marcato, J., Jr. The Segment Anything Model (SAM) for remote sensing applications: From zero to one shot. Int. J. Appl. Earth Obs. Geoinf. 2023, 124, 103540.
19. Nanni, L.; Fusaro, D.; Fantozzi, C.; Pretto, A. Improving existing segmentators performance with zero-shot segmentators. Entropy 2023, 25, 1502.
20. Ma, J.; He, Y.T.; Li, F.F.; Han, L.; You, C.Y.; Wang, B. Segment anything in medical images. Nat. Commun. 2024, 15, 654.
21. Mazurowski, M.A.; Dong, H.Y.; Gu, H.X.; Yang, J.C.; Konz, N.; Zhang, Y.X. Segment anything model for medical image analysis: An experimental study. Med. Image Anal. 2023, 89, 102918.
22. Shi, P.L.; Qiu, J.N.; Abaxi, S.M.D.; Wei, H.; Lo, F.P.W.; Yuan, W. Generalist Vision Foundation Models for Medical Imaging: A Case Study of Segment Anything Model on Zero-Shot Medical Segmentation. Diagnostics 2023, 13, 1947.
23. Chen, F.; Chen, L.Y.; Han, H.J.; Zhang, S.N.; Zhang, D.Q.; Liao, H.E. The ability of Segmenting Anything Model (SAM) to segment ultrasound images. Biosci. Trends 2023, 17, 211–218.
24. Ding, L.; Zhu, K.; Peng, D.F.; Tang, H.; Yang, K.W.; Bruzzone, L. Adapting Segment Anything Model for Change Detection in VHR Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5611711.
25. Li, Y.Q.; Wang, D.D.; Yuan, C.; Li, H.; Hu, J. Enhancing Agricultural Image Segmentation with an Agricultural Segment Anything Model Adapter. Sensors 2023, 23, 7884.
26. Giannakis, I.; Bhardwaj, A.; Sam, L.; Leontidis, G. A flexible deep learning crater detection scheme using Segment Anything Model (SAM). Icarus 2024, 408, 115797.
27. Réby, E.; Guilhelm, A.; De Luca, L. Semantic Segmentation using Foundation Models for Cultural Heritage: An Experimental Study on Notre-Dame de Paris. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 1681–1689.
28. Huy, D.Q.; Sadjoli, N.; Azam, A.B.; Elhadidi, B.; Cai, Y.; Seet, G. Object perception in underwater environments: A survey on sensors and sensing methodologies. Ocean. Eng. 2023, 267, 113202.
29. Lucieer, V.; Hill, N.A.; Barrett, N.S.; Nichol, S. Do marine substrates ‘look’ and ‘sound’ the same? Supervised classification of multibeam acoustic data using autonomous underwater vehicle images. Estuar. Coast. Shelf Sci. 2013, 117, 94–106.
30. Stephens, D.; Diesing, M. A comparison of supervised classification methods for the prediction of substrate type using multibeam acoustic and legacy grain-size data. PLoS ONE 2014, 9, e93950.
31. Diesing, M.; Mitchell, P.; Stephens, D. Image-based seabed classification: What can we learn from terrestrial remote sensing? ICES J. Mar. Sci. 2016, 73, 2425–2441.
32. Ierodiaconou, D.; Schimel, A.C.G.; Kennedy, D.; Monk, J.; Gaylard, G.; Young, M.; Diesing, M.; Rattray, A. Combining pixel and object based image analysis of ultra-high resolution multibeam bathymetry and backscatter for habitat mapping in shallow marine waters. Mar. Geophys. Res. 2018, 39, 271–288.
33. Sokołowski, A.; Jankowska, E.; Balazy, P.; Jędruch, A. Distribution and extent of benthic habitats in Puck Bay (Gulf of Gdańsk, southern Baltic Sea). Oceanologia 2021, 63, 301–320.
34. Szymczak, E.; Rucińska, M. Characteristics of morphodynamic conditions in the shallows of Puck Bay (southern Baltic Sea). Oceanol. Hydrobiol. Stud. 2021, 50, 220–231.
35. Greszkiewicz, M.; Fey, D.P.; Lejk, A.M.; Zimak, M. The effect of salinity on the development of freshwater pike (Esox lucius) eggs in the context of drastic pike population decline in Puck Lagoon, Baltic Sea. Hydrobiologia 2022, 849, 2781–2795.
36. Kruk-Dowgiałło, L. Long-term changes in the structure of underwater meadows of the Puck Lagoon. Acta Ichthyol. Piscat. 1991, 21, 77–84.
37. Jędruch, A.; Bełdowska, M.; Ziółkowska, M. The role of benthic macrofauna in the trophic transfer of mercury in a low-diversity temperate coastal ecosystem (Puck Lagoon, southern Baltic Sea). Environ. Monit. Assess. 2019, 191, 1–25.
38. Glasby, G.; Szefer, P. Marine pollution in Gdansk Bay, Puck Bay and the Vistula lagoon, Poland: An overview. Sci. Total Environ. 1998, 212, 49–57.
39. Ciszewski, P.; Kruk-Dowgiałło, L.; Andrulewicz, E. A study on pollution of the Puck Lagoon and possibility of restoring the lagoon's original ecological state. Acta Ichthyol. Piscat. 1991, 21, 29–37.
40. Szmytkiewicz, A.; Szymczak, E. Sediment deposition in the Puck Lagoon (Southern Baltic Sea, Poland). Baltica 2014, 27, 105–118.
41. Miotk-Szpiganowicz, G. Holocene shoreline migrations in the Puck Lagoon (Southern Baltic Sea) based on the Rzucewo Headland case study. Landf. Anal. 2003, 4, 3–97.
42. Kramarska, R.; Uścinowicz, S.; Zachowicz, J.; Kawińska, M. Origin and evolution of the Puck Lagoon. J. Coast. Res. 1995, 187–191.
43. Uścinowicz, S.; Witak, M.; Miotk-Szpiganowicz, G.; Burska, D.; Cieślikiewicz, W.; Jegliński, W.; Jurys, L.; Sydor, P.; Pawlyta, J.; Piotrowska, N. Climate and sea level variability on a centennial time scale over the last 1500 years as inferred from the Coastal Peatland of Puck Lagoon (southern Baltic Sea). Holocene 2020, 30, 1801–1816.
44. Carraro, A.; Sozzi, M.; Marinello, F. The Segment Anything Model (SAM) for accelerating the smart farming revolution. Smart Agric. Technol. 2023, 6, 100367.
45. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
46. Ren, S.; Luzi, F.; Lahrichi, S.; Kassaw, K.; Collins, L.M.; Bradbury, K.; Malof, J.M. Segment anything, from space? In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 3–8 January 2024; pp. 8355–8365.
47. Wu, Q.; Osco, L.P. samgeo: A Python package for segmenting geospatial data with the Segment Anything Model (SAM). J. Open Source Softw. 2023, 8, 5663.
48. He, S.; Bao, R.; Li, J.; Grant, P.E.; Ou, Y. Accuracy of segment-anything model (sam) in medical image segmentation tasks. arXiv 2023, arXiv:2304.09324.
49. Maquiling, V.; Byrne, S.A.; Niehorster, D.C.; Nyström, M.; Kasneci, E. Zero-shot segmentation of eye features using the segment anything model (sam). Proc. ACM Comput. Graph. Interact. Tech. 2024, 7, 1–16.
50. Summers, G.; Lim, A.; Wheeler, A.J. A Scalable, Supervised Classification of Seabed Sediment Waves Using an Object-Based Image Analysis Approach. Remote Sens. 2021, 13, 2317.
51. Prampolini, M.; Angeletti, L.; Castellan, G.; Grande, V.; Le Bas, T.; Taviani, M.; Foglini, F. Benthic Habitat Map of the Southern Adriatic Sea (Mediterranean Sea) from Object-Based Image Analysis of Multi-Source Acoustic Backscatter Data. Remote Sens. 2021, 13, 2913.
52. Baatz, M.; Schape, A. Multiresolution Segmentation: An Optimization Approach for High Quality Multi-Scale Image Segmentation; Angewandte Geographische Informationsverarbeitung: Beijing, China, 2000; pp. 12–23.
53. Diesing, M.; Green, S.L.; Stephens, D.; Lark, R.M.; Stewart, H.A.; Dove, D. Mapping seabed sediments: Comparison of manual, geostatistical, object-based image analysis and machine learning approaches. Cont. Shelf Res. 2014, 84, 107–119.
54. Montereale Gavazzi, G.; Madricardo, F.; Janowski, L.; Kruss, A.; Blondel, P.; Sigovini, M.; Foglini, F. Evaluation of seabed mapping methods for fine-scale classification of extremely shallow benthic habitats—Application to the Venice Lagoon, Italy. Estuar. Coast. Shelf Sci. 2016, 170, 45–60.
55. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
56. Mohan, A.; Singh, A.K.; Kumar, B.; Dwivedi, R. Review on remote sensing methods for landslide detection using machine and deep learning. Trans. Emerg. Telecommun. Technol. 2020, 32, ett.3998.
57. Foody, G.M. Status of land cover classification accuracy assessment. Remote Sens. Environ. 2002, 80, 185–201.
58. Congalton, R.G. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sens. Environ. 1991, 37, 35–46.
59. Story, M.; Congalton, R.G. Accuracy assessment: A user’s perspective. Photogramm. Eng. Remote Sens. 1986, 52, 397–399.
60. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46.
61. Wei, Y.; Luo, S.; Xu, C.; Fu, Y.; Dong, Q.; Zhang, Y.; Qu, F.; Cheng, G.; Ho, Y.-P.; Ho, H.-P. SAM-dPCR: Real-Time and High-throughput Absolute Quantification of Biological Samples Using Zero-Shot Segment Anything Model. arXiv 2024, arXiv:2403.18826.
62. Kotwicki, L.; Szymelfenig, M.; Fiers, F.; Graca, B. Diversity and environmental control of benthic harpacticoids of an offshore post-dredging pit in coastal waters of Puck Bay, Baltic Sea. Mar. Biol. Res. 2015, 11, 572–583.
63. Graca, B. The Puck Bay as an example of deep dredging unfavorably affecting the aquatic environment. Oceanol. Hydrobiol. Stud. 2009, 38, 109–127.
64. Szymelfenig, M.; Kotwicki, L.; Graca, B. Benthic re-colonization in post-dredging pits in the Puck Bay (Southern Baltic Sea). Estuar. Coast. Shelf Sci. 2006, 68, 489–498.
65. Masetti, G.; Mayer, L.; Ward, L. A Bathymetry- and Reflectivity-Based Approach for Seafloor Segmentation. Geosciences 2018, 8, 14.
66. Schönke, M.; Wiesenberg, L.; Schulze, I.; Wilken, D.; Darr, A.; Papenmeier, S.; Feldens, P. Impact of Sparse Benthic Life on Seafloor Roughness and High-Frequency Acoustic Scatter. Geosciences 2019, 9, 454.
67. Lurton, X.; Eleftherakis, D.; Augustin, J.M. Analysis of seafloor backscatter strength dependence on the survey azimuth using multibeam echosounder data. Mar. Geophys. Res. 2017, 39, 183–203.
68. Hao, S.; Cui, Y.; Wang, J. Segmentation scale effect analysis in the object-oriented method of high-spatial-resolution image classification. Sensors 2021, 21, 7935.
69. Janowski, L.; Wroblewski, R.; Dworniczak, J.; Kolakowski, M.; Rogowska, K.; Wojcik, M.; Gajewski, J. Offshore benthic habitat mapping based on object-based image analysis and geomorphometric approach. A case study from the Slupsk Bank, Southern Baltic Sea. Sci. Total Environ. 2021, 801, 149712.
70. Janowski, Ł.; Skarlatos, D.; Agrafiotis, P.; Tysiąc, P.; Pydyn, A.; Popek, M.; Kotarba-Morley, A.; Mandlburger, G.; Gajewski, Ł.; Kolakowski, M.; et al. Bathymetry and Remote Sensing Data of the Puck Lagoon, Southern Baltic Sea; Interdisciplinary Earth Data Alliance: Palisades, NY, USA, 2023.

Figure 1. Geographical representation of the study site and remote sensing datasets used as a benchmark in this study: (a) location of the study site within central Europe, marked by the red area; (b) MBES bathymetry; (c) MBES backscatter; (d) bathymetric LiDAR intensity; (e) orthophoto of the study site; (f) SDB; (g) joint DEM generated by integration of MBES and ALB bathymetries.

Figure 2. Detailed flow chart of the methods used in this study.

Figure 3. Side-by-side presentation of the spatial results of SAM, allowing a comparative analysis of different SAM parameters and manual discrimination. Image segments are outlined with a black border over SDB bathymetry: (a) SAM algorithm, ViT-B model type, and 3 m pixel size; (b) SAM algorithm, ViT-B model type, and 4 m pixel size; (c) SAM algorithm, ViT-B model type, and 5 m pixel size; (d) SAM algorithm, ViT-L model type, and 5 m pixel size; (e) SAM + MRS algorithms, ViT-B model type, and 4 m pixel size; (f) SAM + MRS algorithms, ViT-B model type, and 5 m pixel size; (g) SAM + MRS algorithms, ViT-L model type, and 4 m pixel size; (h) SAM + MRS algorithms, ViT-L model type, and 5 m pixel size; (i) result of manual image segmentation by expert interpretation.

Figure 4. (a) The map presents a comparative analysis of the results obtained from the application of three methods: SAM + MRS (represented by a black solid line), manual delineation (depicted by a grey dashed line), and MRS with the RF algorithm (the classification of which is expressed in a color scale). (b) The map presents the result of RF classification over the SAM + MRS outcome. Distinct types of bedforms in both maps are labeled according to the “Symbol” column in Table 2.

Table 1. List of all remote sensing datasets used in this study.

Source	Type	Pixel Size [m]	Resolution [pix]	Size
MBES	Bathymetry (xyz)	0.2	57,837 × 78,665	1.49 GB
MBES	Backscatter (xyz)	0.2	57,554 × 79,170	1.57 GB
ALB	Intensity (xyz)	0.2	68,618 × 88,627	5.71 GB
Airborne camera	Orthophoto (rgb)	0.2	68,379 × 88,627	1.71 GB
Satellite image	Bathymetry (xyz)	~5 × 8	2631 × 3461	14.4 MB
MBES and ALB	Bathymetry (xyz)	0.2	68,632 × 90,280	6.48 GB

Table 2. Distinct types of bedforms identified in this study.

No	Symbol	Bedform	Description
1	sbz	Sandbank zone	Area in the direct vicinity of the shore, characterized by sandbanks parallel to the shore, following its course
2	fsl	Foreshore slope	A gently sloping section of the bottom bordering the sandbank zone on its seaward side
3	ssl	Steep slopes	Areas with a sharp incline
4	esb	Flat, even seabed	A seabed that is level and uniform
5	isb	Uneven seabed	A seabed that is irregular or not level
6	ssb	Slightly undulating seabed	A seabed with small, gentle waves or curves
7	usb	Undulating seabed	A seabed with waves or curves
8	srb	Sand ribbons	Long, narrow bands of sand
9	swm	Sand waves with mega ripple marks	Large-scale wave-like structures in sandy seabeds, accompanied by large ripple marks
10	mrp	Mega ripple marks	Large ripple marks on the seabed
11	rmr	Rhomboidal mega ripple marks	Diamond-shaped large ripple marks
12	mmr	Medium-sized mega ripple marks with ripple marks	Medium-sized mega ripple marks accompanied by smaller ripple marks
13	smr	Small mega ripple marks with ripple marks	Small mega ripple marks accompanied by smaller ripple marks
14	sbv	Flat, even seabed with vegetation	A level and uniform seabed that has plant life
15	uso	Undulating seabed with minor accumulations of organics	A wavy seabed with small amounts of organic material
16	org	Accumulations of organics	Areas where organic material has gathered and forms distinct clusters
17	del	Delta	A landform at the mouth of a river created by sediment deposits
18	rdf	Relict deltaic formations	Remnants of old delta formations
19	gou	Glacial outcrops	Fragments of glacial forms exposed as small elevations
20	rsb	Relict sandbanks	Remnants of old sandbanks
21	ant	Anthropogenic formations	Structures or features caused by human activity

Table 3. Results of SAM algorithm performance.

Pixel Size [m]	Model Type	Output (Y/N)	Time [hh:mm:ss.sss]
0.2	ViT-B	N	00:13:49.049
0.4	ViT-B	N	00:01:45.203
0.6	ViT-B	N	00:01:19.328
0.8	ViT-B	N	00:01:04.016
1	ViT-B	N	00:00:28.047
2	ViT-B	N	00:00:19.859
3	ViT-B	Y	00:07:42.453
4	ViT-B	Y	00:03:30.282
5	ViT-B	Y	00:02:39.468
0.2	ViT-L	N	00:02:46.672
0.4	ViT-L	N	00:01:52.954
0.6	ViT-L	N	00:01:00.453
0.8	ViT-L	N	00:00:43.359
1	ViT-L	N	00:00:35.360
2	ViT-L	N	00:00:24.891
3	ViT-L	N	00:00:17.187
4	ViT-L	N	00:00:27.125
5	ViT-L	Y	00:04:54.328
0.2	ViT-H	N	00:02:46.141
0.4	ViT-H	N	00:01:05.328
0.6	ViT-H	N	00:00:39.656
0.8	ViT-H	N	00:00:37.953
1	ViT-H	N	00:00:03.515
2	ViT-H	N	00:00:13.844
3	ViT-H	N	00:00:09.719
4	ViT-H	N	00:00:08.765
5	ViT-H	N	00:00:06.469

Table 4. Results of SAM + MRS algorithm performance.

Pixel Size [m]	Model Type	Output (Y/N)	Time [hh:mm:ss.sss]
0.2	ViT-B	N	01:13:09.828
0.4	ViT-B	N	00:21:04.859
0.6	ViT-B	N	00:11:55.218
0.8	ViT-B	N	00:11:01.140
1	ViT-B	N	00:11:22.890
2	ViT-B	N	00:05:34.156
3	ViT-B	N	02:52:26.344
4	ViT-B	Y	02:10:36.766
5	ViT-B	Y	00:57:33.937
0.2	ViT-L	N	01:02:58.500
0.4	ViT-L	N	00:25:54.203
0.6	ViT-L	N	00:12:13.328
0.8	ViT-L	N	00:10:02.625
1	ViT-L	N	00:12:10.125
2	ViT-L	N	00:06:47.063
3	ViT-L	N	05:54:14.531
4	ViT-L	Y	06:34:50.813
5	ViT-L	Y	04:44:28.703
0.2	ViT-H	N	01:02:54.937
0.4	ViT-H	N	00:24:27.546
0.6	ViT-H	N	00:10:54.063
0.8	ViT-H	N	00:10:20.593
1	ViT-H	N	00:04:13.719
2	ViT-H	N	00:05:35.234
3	ViT-H	N	00:04:36.016
4	ViT-H	N	00:02:39.578
5	ViT-H	N	00:02:37.156
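
The processing times in Tables 3 and 4 are easier to compare once they are converted to seconds. The following is a minimal sketch, assuming the times follow the `hh:mm:ss.sss` pattern; the helper name `to_seconds` is hypothetical and not part of the original study. It uses the ViT-B results at 4 m pixel size as an example.

```python
def to_seconds(stamp: str) -> float:
    """Convert an 'hh:mm:ss.sss' string into seconds (hypothetical helper)."""
    # Unpacking fails with ValueError if the string does not have three fields.
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# ViT-B at 4 m pixel size: SAM alone (Table 3) and SAM + MRS (Table 4).
sam = to_seconds("00:03:30.282")
sam_mrs = to_seconds("02:10:36.766")

# Ratio of the two processing times for the same model and pixel size.
slowdown = sam_mrs / sam
print(f"SAM: {sam:.3f} s, SAM + MRS: {sam_mrs:.3f} s, ratio: {slowdown:.1f}x")
```

A malformed time string raises `ValueError` instead of being converted silently, which helps catch transcription errors in tabulated times.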

Table 5. Accuracy assessment, including the confusion matrix and accuracy statistics, for the MRS and RF algorithms. Bedform types are named according to the “symbol names” column in Table 2.

		Reference
		mrp	ssb	uso	rmr	fsl	sbz	esb	smr	swm	isb	del	rdf	sbv	org	srb	ant	usb	mmr	ssl	rsb	gou	Sum
Prediction	mrp	24	0	0	4	0	2	0	0	0	0	0	1	0	0	0	1	0	0	0	0	0	32
	ssb	0	26	0	2	0	1	0	0	1	2	0	0	0	0	0	0	0	0	0	0	0	32
	uso	1	0	11	0	3	0	2	0	0	0	0	0	0	0	0	0	0	5	0	0	1	23
	rmr	0	0	0	19	3	0	0	0	0	1	0	0	0	0	0	0	0	0	0	0	0	23
	fsl	0	0	1	0	15	1	3	0	0	0	0	0	0	0	0	1	0	3	1	0	0	25
	sbz	0	1	0	0	0	15	0	0	0	2	2	0	1	3	0	0	0	0	0	0	0	24
	esb	1	0	6	0	1	0	19	0	0	1	1	0	0	2	0	0	3	1	0	0	0	35
	smr	0	0	1	0	0	0	0	26	0	0	1	0	0	0	0	0	0	0	0	0	0	28
	swm	0	0	0	2	0	0	0	0	27	0	0	3	2	0	0	0	0	1	0	0	0	35
	isb	0	1	0	1	1	0	0	0	0	16	0	0	1	0	0	1	0	0	2	1	0	24
	del	0	0	0	0	0	0	0	1	0	0	15	0	0	2	0	0	6	2	1	0	2	29
	rdf	0	1	0	1	0	0	0	1	2	0	1	24	0	0	0	0	1	0	0	0	0	31
	sbv	0	0	0	0	0	0	0	0	0	1	1	0	15	1	0	0	0	0	4	1	0	23
	org	2	0	2	0	0	3	1	0	0	0	2	0	4	12	0	0	7	0	0	0	0	33
	srb	0	0	0	0	0	0	0	0	0	0	2	0	0	0	27	0	0	0	0	0	0	29
	ant	0	0	0	0	0	0	0	0	0	0	0	1	0	0	0	21	1	0	0	0	0	23
	usb	0	0	8	0	1	0	1	1	0	0	0	1	1	7	0	0	6	2	0	0	0	28
	mmr	0	0	0	0	2	0	4	0	0	0	1	0	0	0	0	0	2	16	0	0	0	25
	ssl	0	0	0	0	4	7	0	0	0	5	1	0	5	0	0	1	0	0	22	0	0	45
	rsb	0	0	0	0	0	1	0	0	0	1	0	0	0	0	0	1	0	0	0	14	0	17
	gou	1	0	0	0	0	0	0	0	0	1	3	0	1	3	0	0	4	0	0	0	26	39
	Sum	29	29	29	29	30	30	30	29	30	30	30	30	30	30	27	26	30	30	30	16	29	603
	Producer	0.83	0.90	0.38	0.66	0.50	0.50	0.63	0.90	0.90	0.53	0.50	0.80	0.50	0.40	1.00	0.81	0.20	0.53	0.73	0.88	0.90
	User	0.75	0.81	0.48	0.83	0.60	0.63	0.54	0.93	0.77	0.67	0.52	0.77	0.65	0.36	0.93	0.91	0.21	0.64	0.49	0.82	0.67
	Kappa per class	0.82	0.89	0.35	0.64	0.48	0.48	0.61	0.89	0.89	0.51	0.47	0.79	0.48	0.37	1.00	0.80	0.16	0.51	0.71	0.87	0.89
	Overall accuracy	0.66
	Kappa	0.64

Table 6. Accuracy assessment, including the confusion matrix and accuracy statistics, for the SAM + MRS and RF algorithms. Bedform types are named according to the “symbol names” column in Table 2.

		Reference
		ant	org	usb	fsl	isb	Sum
Prediction	ant	3	0	0	1	1	5
	org	0	3	1	0	0	4
	usb	0	0	0	0	0	0
	fsl	0	0	0	0	0	0
	isb	0	0	0	0	1	1
	Sum	3	3	1	1	2	10
	Producer	1.00	1.00	0.00	0.00	0.50
	User	0.60	0.75	-	-	1.00
	Kappa per class	1.00	1.00	0.00	0.00	0.44
	Overall accuracy	0.70
	Kappa	0.58
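
The statistics reported under the confusion matrix can be recomputed from the matrix itself. The sketch below uses the Table 6 counts and standard definitions of producer's accuracy, user's accuracy, overall accuracy, and Cohen's kappa. The source does not state which formula was used for "Kappa per class"; the conditional form used here is an assumption, chosen because it reproduces the tabulated values. The names `accuracy_stats` and `ratio` are hypothetical.

```python
import math

# Confusion matrix copied from Table 6 (rows: prediction, columns: reference).
CLASSES = ["ant", "org", "usb", "fsl", "isb"]
MATRIX = [
    [3, 0, 0, 1, 1],
    [0, 3, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
]


def ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero ('-' in the table)."""
    return numerator / denominator if denominator else math.nan


def accuracy_stats(matrix):
    """Hypothetical helper: accuracy statistics from a square confusion matrix."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]                      # predicted totals
    col_sums = [sum(row[j] for row in matrix) for j in range(size)]  # reference totals
    diagonal = [matrix[i][i] for i in range(size)]               # correct samples

    producer = [ratio(diagonal[i], col_sums[i]) for i in range(size)]
    user = [ratio(diagonal[i], row_sums[i]) for i in range(size)]
    overall = sum(diagonal) / total

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # Assumed per-class (conditional) kappa: producer's accuracy corrected by
    # the share of all samples that were predicted as that class.
    shares = [r / total for r in row_sums]
    kappa_per_class = [
        ratio(producer[i] - shares[i], 1 - shares[i]) for i in range(size)
    ]

    return {
        "producer": producer,
        "user": user,
        "overall": overall,
        "kappa": kappa,
        "kappa_per_class": kappa_per_class,
    }


stats = accuracy_stats(MATRIX)
for name, p, u, k in zip(
    CLASSES, stats["producer"], stats["user"], stats["kappa_per_class"]
):
    print(f"{name}: producer={p:.2f} user={u:.2f} kappa={k:.2f}")
print(f"overall accuracy={stats['overall']:.2f} kappa={stats['kappa']:.2f}")
```

Classes that were never predicted (usb and fsl) have no defined user's accuracy, so the sketch returns NaN where the table shows "-".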

Table 7. Spatial distribution, in percent, of the bedform types obtained with the three methods (manual delineation, MRS + RF, and SAM + MRS + RF). Bedform types are named according to the “symbol names” column in Table 2.

No.	Symbol	Manual [%]	MRS + RF [%]	SAM + MRS + RF [%]
1	ssb	32.74	2.34	0.00
2	sbv	5.04	21.13	0.00
3	mrp	1.64	0.87	0.00
4	uso	0.06	2.30	0.00
5	isb	1.26	30.90	19.60
6	gou	0.21	1.44	0.00
7	smr	2.22	0.36	0.00
8	usb	11.71	1.63	0.00
9	esb	22.12	1.82	0.00
10	sbz	1.56	5.58	0.00
11	ant	0.87	0.57	77.97
12	fsl	0.23	1.44	0.00
13	del	1.12	2.10	0.00
14	rdf	2.25	1.61	0.00
15	rsb	1.50	0.61	0.00
16	srb	0.41	0.27	0.00
17	swm	11.31	0.58	0.00
18	ssl	0.46	14.67	0.00
19	org	0.13	6.37	2.43
20	mmr	3.09	1.18	0.00
21	rmr	0.05	2.26	0.00

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
