1. Introduction
Understanding and monitoring the seafloor is fundamental for both marine conservation and geohazard assessment. In this context, pockmarks, widespread seabed depressions, are often created when fluids or gases from marine sediments concentrate and escape. Usually round to elliptical in shape, they frequently have flat to gently concave floors and steep margins, creating U- or V-shaped profiles [
1,
2]. Pockmarks vary in scale, with the smallest measuring ~1 m and the largest exceeding 200 m in diameter and 20 m in depth [
3,
4]. These features occur in a variety of marine environments, ranging from deep basins to continental shelves [
5,
6,
7,
8]. They are especially prevalent in fine-grained, low-permeability substrates where fluid mobility is limited to distinct pathways [
9]. Their distribution can be affected by underlying geological conditions such as faults, gas reservoirs, and hydrate systems [
4,
10].
The seepage process may be triggered by several factors, including thermogenic or biogenic methane, porewater expulsion by compaction, gas hydrate dissociation, or groundwater flow [
1,
11,
12]. However, pockmarks can also form through non-fluid-related mechanisms such as benthic foraging by marine mammals, scouring by bottom currents, or sediment reworking associated with sea-level regressions and storm events [
13,
14].
Beyond their geological significance, pockmarks are also ecologically and environmentally important. They can act as indicators of methane emission hotspots, which are in turn linked to climate change [
9,
12,
15]. Furthermore, by increasing habitat heterogeneity, pockmarks support benthic biodiversity and can provide refuge from anthropogenic disturbances such as trawling [
16,
17].
Pockmarks are therefore complex seafloor features formed by the interaction of biological, environmental, and sedimentary processes, and they demand detailed spatial analysis beyond the capability of traditional visual interpretation methods. Benthic habitat mapping offers a robust way to study them by integrating geophysical and geomorphological data [
18,
19]. In addition, advancements in hydroacoustic technologies, particularly multibeam echo sounders (MBES), have significantly improved data resolution, leading to more frequent and fine-scale detection of seabed features such as pockmarks [
20,
21,
22,
23]. Geomorphological mapping has evolved over the past few decades from a largely expert-interpretative practice into a data-intensive science, with growing support from computational approaches.
The complexity of pockmark genesis and distribution, often influenced by multiple geological and hydrodynamic factors, requires detailed and consistent seafloor mapping. With the increasing availability of high-resolution bathymetric data from advanced acoustic systems, traditional expert-based methods have become insufficient; manual interpretation is time-consuming, subjective, and not scalable for large datasets. To address this, the field has shifted toward semi-automated or fully automated workflows that offer greater objectivity, reproducibility, and analytical consistency. In particular, GIS-based tools, rule-based classification systems, and multiscale geomorphometric approaches have significantly improved the efficiency and consistency of seabed mapping.
Automated and semi-automated approaches for mapping seabed habitat have been the subject of numerous studies over the past 20 years, with a particular emphasis on terrain attributes derived from bathymetric data. A foundational workflow was established by early attempts, like the Benthic Terrain Modeler (BTM) [
24,
25], which classified seabed features by computing derivatives (e.g., BPI, slope, and curvature). These concepts were extended into domain-specific toolboxes and techniques for a range of geomorphological contexts. For example, the authors in [
26] integrated multiscale bathymetric data and ROV-derived microbathymetry to semi-automatically delineate over 500 Lophelia pertusa mini-mounds using the British Geological Survey (BGS) seabed mapping toolbox. Likewise, with an emphasis on cold-water coral systems, a study [
27] showed that carbonate mounds could be accurately mapped with a semi-automated, pixel-based terrain analysis of bathymetry derivatives, with results that were on par with manual mapping. To adapt fluvial taxonomy to marine environments, the authors in [
28] developed a semi-automated multi-scale geomorphometric method for classifying seabed morphology. This method used optimized spatial scales to objectively delineate terrain features. Specifically for seabed depressions, a GIS-based semi-automated toolbox was created for the mapping and morphometric characterization of pockmarks and has been applied successfully in various continental shelf environments [
29].
Building upon these frameworks, more recent toolboxes have placed a greater emphasis on script-based implementation and automation, reducing manual input while increasing analytical capabilities. For example, Geoscience Australia’s Semi-automated Morphological Mapping Tools (GA-SaMMT) toolbox [
30] integrates more than 70 morphometric metrics into an ArcGIS Pro graphical user interface (GUI), offering consistent mapping in a variety of contexts. In a similar vein, feature identification is automated by the MATLAB-based POSIT toolbox [
31] using convolution and correlation with user-defined structural elements. The TargAn toolbox has been used to automate the analysis of pockmark fields in swath sonar data, demonstrating that shape quantification descriptors, such as elongation, complexity, and orientation, can effectively characterize these fluid escape features [
32]. Finally, an ArcGIS Pro Python solution for semi-automated delineation of seabed features using boundary- and element-based approaches with morphometric filtering is offered by the CoMMa (Confined Morphologies Mapping) Toolbox [
33]. Benchmark tests showed that it improves efficiency and reproducibility while maintaining accuracy comparable to manual interpretation [
33]. The effectiveness of CoMMa has been demonstrated in recent studies, such as in [
34], where it enabled accurate and efficient delineation of reefs. All of these developments show a continuous shift away from manual processes and towards scalable, code-based solutions that improve accuracy, efficiency, and comparability across studies.
Even though these developments mark a substantial increase in automation and analytical power, the quality of the input data is still a major limitation. Despite its high resolution, MBES bathymetry is frequently affected by artifacts from platform motion, environmental interference, and system-specific noise [
35]. These artifacts have the potential to distort terrain attributes and hinder the accuracy of automated mapping outputs. By taking advantage of their robustness to noisy data, machine learning classifiers, such as Random Forests, can be used in post-processing to address these problems. More specifically, they can lower false positives, improve geomorphological segmentation results, and lessen the negative impacts of artifacts. Beyond post-processing refinement, machine learning has also been widely adopted for predictive modeling in marine geomorphology and habitat mapping. It offers strong tools for classifying seafloor features and predicting spatial patterns based on multivariate terrain descriptors [
36,
37].
This study presents a semi-automated workflow designed to delineate and classify geomorphologically and ecologically significant seafloor depressions, specifically pockmarks, in Flensburg Fjord (Germany–Denmark). The approach integrates terrain-based geomorphometric analysis, carried out using the CoMMa toolbox, with a machine learning refinement step. The characteristic concave morphology of these features was captured by applying feature-specific terrain derivatives, including the Bathymetric Position Index (BPI). To enhance mapping accuracy and filter false positives, a Random Forest classifier was then trained on morphometric attributes. The aim was to develop a reproducible, parameter-based framework for the semi-automated mapping of pockmarks that minimizes noise, and to assess the effectiveness, scalability, and reliability of combining supervised classification with GIS-based geomorphometry for seabed feature delineation. The broader goal is to improve habitat mapping practices and standardize methods in marine spatial analysis.
2. Materials and Methods
2.1. Study Area
Flensburg Fjord is a 50 km long, narrow bay in the southwestern Baltic Sea, forming the border between Denmark and Germany. It is subdivided into inner (10–20 m), middle (18–20 m), and outer (10–39 m) sections, including Sonderborg Bay, Gelting Bay, and adjacent open waters [
38]. Due to a 10 m sill at Holnis Peninsula, water exchange in the inner fjord is limited, occasionally leading to stratification [
39]. Sediments in the inner fjord are mainly dark sandy mud and soft mud [
39]. In the Flensburg Fjord, a study from 2023 [
40] identified and categorized numerous circular to elliptical pockmarks with shallow relief, diameters of 6–29 m, and slope angles up to 54°, often associated with gas-related acoustic anomalies and the absence of elevated rims. Grab samples from these pockmarks reveal sandy silt on high-backscatter rims and silty, black, H₂S-rich sediments in depression centers, with most depressions occurring in silty substrates at 13–24 m depth. Acoustic turbidity in sub-bottom profiler data supports the interpretation of active or past fluid escape activity. Our study includes several of these depressions, highlighting their prevalence and relevance within the study area. The training area was selected for its morphological diversity, containing both clustered pockmark fields and isolated features, which helped train the model on varied expressions of pockmarks. The validation area lies outside Flensburg Bay and was used to assess transferability in a similar, though not identical, geological and acoustic setting.
2.2. Bathymetric Data and Preprocessing
The multibeam data in raster format for Flensburg Fjord were obtained from the Federal Maritime and Hydrographic Agency of Germany (BSH) (
https://www.bsh.de). The provided raster had a resolution of 1 m for both the training and validation areas, which are shown in
Figure 1.
2.3. Terrain Derivatives and Feature Extraction
In order to extract geomorphic features from bathymetric data, a semi-automated method was applied, utilizing terrain derivatives designed to highlight the unique morphological characteristics of each feature type. For the detection of pockmarks, the Bathymetric Position Index (BPI) was used due to its effectiveness in identifying localized depressions over varying spatial scales. BPI compares the elevation of each cell in a bathymetric grid to the average elevation of its surrounding neighborhood, producing positive values for elevated features, negative values for depressions, and values near zero for flat terrain [
24].
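The BPI definition above can be illustrated with a minimal sketch. It assumes an annulus-shaped neighbourhood with radii expressed in grid cells (equal to metres on a 1 m raster) and elevations stored as negative depths; the function name `bpi` and the list-of-lists grid are illustrative choices and not part of BTM or CoMMa. Some implementations additionally round or standardize the result, which is omitted here.

```python
import math

def bpi(grid, inner_radius, outer_radius):
    """Annulus BPI: focal value minus the mean of the cells whose distance d
    (in cells) from the focal cell satisfies inner_radius <= d <= outer_radius."""
    rows, cols = len(grid), len(grid[0])
    # Offsets of all cells inside the annulus, computed once.
    offsets = [
        (dr, dc)
        for dr in range(-outer_radius, outer_radius + 1)
        for dc in range(-outer_radius, outer_radius + 1)
        if inner_radius <= math.hypot(dr, dc) <= outer_radius
    ]
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Neighbours falling outside the raster are simply skipped.
            neighbours = [
                grid[r + dr][c + dc]
                for dr, dc in offsets
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
            out[r][c] = grid[r][c] - sum(neighbours) / len(neighbours)
    return out

# Synthetic example: flat seabed at -15.0 m with a single 0.5 m deep pit.
seabed = [[-15.0] * 11 for _ in range(11)]
seabed[5][5] = -15.5
result = bpi(seabed, inner_radius=1, outer_radius=5)
```

In this synthetic case the pit cell receives a negative BPI, cells far from the pit receive zero, and cells adjacent to the pit receive slightly positive values because the pit lowers their neighbourhood mean.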
The approach adopted an intentionally inclusive strategy, prioritizing the detection of true positives (correctly identified geomorphic features) even at the risk of increased false positives (incorrectly classified non-features). This strategy influenced not only the delineation thresholds but also the parameterization of terrain derivatives, including the BPI inner and outer radii. Various combinations of inner and outer radii were empirically tested to optimize the detection of concave depressions such as pockmarks. For example, combinations such as 1–3, 1–5, and 1–10 were selected for testing based on the mean size of the observed features, which was often only a few meters. A small inner radius (e.g., 1) ensures that local variations within the feature are captured, while a progressively larger outer radius (e.g., 3, 5, or 10) allows for sufficient contextual contrast with the surrounding seafloor. By employing this permissive parameterization, morphologically subtle or uncertain features were preserved at the initial processing stage, thus ensuring comprehensive feature detection before subsequent refinement. Based on these tests, an inner radius of 1 m and an outer radius of 5 m were chosen to emphasize the contrast between local depressions and the surrounding seafloor.
After the BPI raster was created, candidate depressions were extracted in the CoMMa toolbox as contiguous low-value zones and converted into polygon features. To avoid the accidental merging of adjacent features, a minimum vertical cutoff value of 0.03 m was applied, thus preserving each pockmark’s individual geometry. The minimum vertical relief criterion removed shallow, minor depressions by requiring a difference of at least 0.3 m between the maximum and minimum elevation values within each polygon, computed using zonal statistics from the original bathymetric data. A minimum width criterion of 5 m was used to discard small or negligible features. The width-to-length ratio was used to discard very elongated polygons that lacked the typical circularity of pockmarks: values approaching one indicate near-circularity, and polygons with values below 0.3 were excluded. Finally, a buffer distance of 0.7 m was applied to adjust polygon borders, improving the alignment between the mapped shapes and the physical extent of the depressions.
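The relief, width, and elongation criteria described above amount to a simple rule-based filter, sketched below under stated assumptions. The dictionary keys (`depth_min`, `depth_max`, `width`, `length`) and the function name are hypothetical and do not correspond to CoMMa field names; only the threshold values (0.3 m relief, 5 m width, width-to-length ratio of 0.3) are taken from the text.

```python
def passes_morphometric_filters(feature,
                                min_relief_m=0.3,
                                min_width_m=5.0,
                                min_w_l_ratio=0.3):
    """Return True if a candidate polygon meets all three criteria.

    `feature` is a dict with hypothetical keys: depth_min and depth_max
    (elevations in metres from zonal statistics), width and length
    (minimum bounding geometry dimensions in metres).
    """
    relief = feature["depth_max"] - feature["depth_min"]
    w_l_ratio = feature["width"] / feature["length"]
    return (relief >= min_relief_m
            and feature["width"] >= min_width_m
            and w_l_ratio >= min_w_l_ratio)

# Illustrative candidates (values are invented for the example).
candidates = [
    {"id": 1, "depth_min": -15.6, "depth_max": -15.0, "width": 12.0, "length": 14.0},
    {"id": 2, "depth_min": -15.1, "depth_max": -15.0, "width": 12.0, "length": 14.0},  # too shallow
    {"id": 3, "depth_min": -16.0, "depth_max": -15.0, "width": 6.0, "length": 40.0},   # too elongated
    {"id": 4, "depth_min": -16.0, "depth_max": -15.0, "width": 3.0, "length": 4.0},    # too narrow
]
kept_ids = [f["id"] for f in candidates if passes_morphometric_filters(f)]
```

Keeping the thresholds as keyword arguments makes the permissive settings explicit and easy to adjust when the filter is applied to a different survey area.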
2.4. Morphometric Characterization
Since the initial delineation process deliberately favored over-segmentation to guarantee complete capture of all possible seafloor features, a refinement process was necessary in order to distinguish valid geomorphic features. Therefore, an extensive set of morphometric descriptors was extracted for each delineated polygon using tools embedded in the CoMMa toolbox and ArcGIS Pro. These descriptors, later used as input variables in the supervised classification step, helped differentiate true pockmarks from non-characteristic depressions based on shape, relief, and slope. The full list of extracted morphometric descriptors, including their definitions and categorization, is presented in
Table 1, which outlines the parameters used for characterizing delineated seafloor features.
These descriptors quantify various aspects of geometry, shape complexity, topographic variability, and internal surface structure, providing a strong input basis for subsequent classification. To describe the geometry and topographic expression of each delineated polygon, a set of shape and morphometric attributes was calculated. Surface area, perimeter, and minimum bounding geometry (MBG) dimensions were among the shape metrics that were used to calculate width, length, orientation, and width-to-length ratio. Convex hull-based metrics and the Polsby–Popper score were used to measure compactness and edge complexity.
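Two of the shape descriptors mentioned above have simple closed forms. The Polsby–Popper score is 4πA/P², where A is the polygon area and P its perimeter; it equals 1 for a circle and decreases for less compact outlines. The sketch below computes it together with the width-to-length ratio; the function names are illustrative and are not taken from CoMMa or ArcGIS Pro.

```python
import math

def polsby_popper(area, perimeter):
    """Compactness score 4*pi*A / P**2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2

def width_to_length(width, length):
    """Ratio of the short to the long side of the minimum bounding geometry."""
    return width / length

# A circle of radius 10 m and a 10 m x 10 m square as reference shapes.
circle_score = polsby_popper(math.pi * 10.0 ** 2, 2.0 * math.pi * 10.0)
square_score = polsby_popper(10.0 * 10.0, 4.0 * 10.0)
```

For the square the score is π/4 (about 0.785), which gives a sense of scale when interpreting compactness values of delineated polygons.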
In addition to vertical relief estimated by conventional and confined fill operations, topographic descriptors included minimum, mean, and maximum values for both depth and slope. The dissection index measured internal variation in elevation, whereas depth range acted as a stand-in for vertical extent. Mean, maximum, and variance metrics were used to evaluate local deviation from global bathymetric trends (LDfG), with a quintile rank added to show relative prominence. It should be noted that the absolute depth parameters (Depth_MIN, Depth_MAX, and Depth_MEAN) were deliberately excluded from the classification model. Although these variables describe absolute bathymetric position, they depend strongly on site-specific topography and the overall seafloor gradient, which can vary substantially between datasets and geographic areas. Using them could therefore bias the trained model towards local conditions, such as the particular depth range of the features at one site, rather than their intrinsic morphology.
2.5. Supervised Classification
Supervised classification was employed to refine the results of the semi-automated delineation workflow. The initial polygon features generated through the CoMMa toolbox, supplemented with morphometric descriptors for both the training and validation areas, were exported as a comma-separated values (CSV) file. Each feature was labeled based on a manual classification of verified pockmarks, and assignments were determined by whether the features intersected with these manually classified pockmarks, ensuring an accurate distinction between pockmarks and other depressions. Features that did not intersect with manually classified pockmarks were assigned to the ‘Noise’ class. This category includes a range of non-pockmark features, such as irregular depressions, artifacts arising from data processing or seafloor heterogeneity, and oversegmented seabed structures, all of which were carefully identified through visual inspection of the bathymetry.
The dataset was then imported into Waikato Environment for Knowledge Analysis (WEKA) (version 3.8.6), an open-source machine learning platform, where a Random Forest (RF) classifier was used. The selection of Random Forest hyperparameters in this study was guided by a combination of bibliographic precedent and empirical testing. Specifically, the configuration was based on recommendations from [
41], who successfully applied similar settings for the supervised classification of seafloor features. Initial trial-and-error experiments with alternative parameter values showed negligible improvements in validation performance, supporting the decision to adopt a streamlined and literature-informed approach.
More specifically, the classifier was trained using 200 trees (numIterations = 200) with a 70% training and 30% testing split. Preliminary tests showed that using more than 200 trees provided only slight improvements at the cost of much longer computation time.
The Random Forest classifier was configured to maximize model interpretability and performance without excessive parameter tuning. The maximum depth of trees was set to unlimited (maxDepth = 0), allowing the model to fully explore decision paths until pure nodes were reached. All available features were considered at each split (numFeatures = 0), enabling the model to identify the most informative attribute without restriction, which is appropriate given the relatively moderate number of morphometric variables. A batch size of 30 (batchSize = 30) was used for processing efficiency during training, and bagging was implemented with 100% of the training data (bagSizePercent = 100), ensuring maximal use of the dataset in each bootstrap sample.
The next step was the score threshold optimization. Threshold selection was performed by visual inspection of the precision, recall, and F-measure vs. threshold curves produced by WEKA’s threshold curve visualization tool. The range from the intersection point of the precision and recall curves up to the point where the F-measure remained stable was chosen as the optimal threshold range for classification. The trained Random Forest classifiers were saved and subsequently applied to the validation datasets with WEKA’s AddClassification filter, generating a new classification score for every polygon.
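The curves inspected in this step can be tabulated with a short sketch. It is not WEKA’s implementation: it assumes that a polygon is predicted as a pockmark when its score is greater than or equal to the threshold, and the function names, scores, and labels are invented for illustration. The threshold itself was chosen visually in this study; the sketch only produces the values that such an inspection relies on.

```python
def precision_recall_f1(scores, labels, threshold):
    """Precision, recall and F-measure when score >= threshold means 'pockmark'."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y)
    fp = sum(1 for s, y in pairs if s >= threshold and not y)
    fn = sum(1 for s, y in pairs if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def threshold_curve(scores, labels, steps=20):
    """Rows of (threshold, precision, recall, f1) for evenly spaced thresholds."""
    return [
        (i / steps, *precision_recall_f1(scores, labels, i / steps))
        for i in range(steps + 1)
    ]

# Invented classifier scores and reference labels (1 = pockmark, 0 = noise).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
curve = threshold_curve(scores, labels, steps=4)
```

Each row of `curve` can be plotted or scanned to locate where precision and recall cross and where the F-measure plateaus.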
The resulting classifications were then exported and joined back to the original polygon geometries through attribute joins in ArcGIS Pro. A final filtering step applied the optimized probability threshold, retaining only those features that exceeded the score cutoff. This provided a filtered subset of classified pockmarks that was used in subsequent spatial analyses and evaluations.
4. Discussion
The study of seabed depressions is important for understanding both seafloor morphology and the subsurface processes that shape it. Beyond their morphological expression, they often reflect subsurface processes such as fluid or gas migration, including methane seepage from deeper sediment layers, pore-water expulsion due to sediment loading, or deformation associated with tectonic structures. Such processes can have important implications for seabed stability, benthic habitat distribution, and the identification of potential geohazard zones [
9]. Against this broader geological context, the present study developed a semi-automated workflow by integrating terrain analysis and machine learning to better delineate and classify these seafloor geomorphological features. We applied a morphometric mapping approach with a boundary-based method on a high-resolution bathymetric dataset. Our results demonstrate that this approach allows for more permissive mapping thresholds by using Random Forest classification trained on morphometric descriptors, effectively refining overinclusive polygon datasets while preserving true positives and minimizing false detections. The final mapped polygons closely match expert-generated data, showing how combining geomorphometric methods with machine learning can significantly reduce time and manual effort without losing accuracy. The BPI-based pockmark definitions at Flensburg Fjord were effective in capturing small, circular depressions.
This highlights the strength of combining data-driven methods with expert knowledge. Importantly, the intentional application of permissive parameterization in both workflows enabled the capture of real geomorphic features with high recall but also produced a large number of false positives. The segmentation accuracy and general shape fidelity of delineated features compared well with expert-derived mappings, reinforcing the utility of terrain-driven semi-automation in heterogeneous seabed environments.
Through morphometric analysis, the features delineated in both the training and validation datasets exhibit dimensions and geometries that are consistent with unit pockmarks. Specifically, their diameters predominantly fall within the 18–20 m range, with maximum values not exceeding 40 m, and are associated with moderate relief and flanking slopes between 3° and >6°. These metrics align with typical descriptions of unit pockmarks in comparable sedimentary environments [
42]. Furthermore, statistical analyses indicated similar geometries and topographies across datasets, with the pockmarks showing compact, near-circular planforms (average circularity: 0.66), low aspect ratios, and subdued relief (mean: 0.22 m), supporting their interpretation as individual, gas-related seabed depressions. These consistent morphometric traits support confidence in the delineation procedure.
The Random Forest classifiers trained on morphometric characteristics were effective at suppressing noise and detected pockmarks with useful accuracy. The classifier’s generalizability was supported by validation metrics from independent regions, which also attested to the effectiveness of morphometric thresholds derived from training data across datasets that differed geographically and morphologically. Model performance metrics (precision, recall, and F1-score) exceeded 0.80, with secondary validation areas showing only slight performance degradation. Cohen’s Kappa values of 0.60 to 0.68 indicate substantial agreement between the reference and predicted data.
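For reference, Cohen’s Kappa compares the observed agreement p_o with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). A minimal sketch for the two-class case (pockmark vs. noise) follows; the function name is illustrative and the confusion-matrix counts are invented, not results from this study.

```python
def cohens_kappa(tp, fp, fn, tn):
    """Cohen's Kappa from a 2x2 confusion matrix of counts."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and references.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    if expected == 1.0:
        return 1.0  # degenerate case: only one class present on both sides
    return (observed - expected) / (1.0 - expected)

# Invented counts: 80% observed agreement with balanced classes.
kappa_example = cohens_kappa(tp=40, fp=10, fn=10, tn=40)
```

With these balanced counts the chance agreement is 0.5, so an observed agreement of 0.8 corresponds to a Kappa of 0.6, which illustrates why Kappa is lower than raw accuracy for the same classification.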
Our findings align with earlier research demonstrating the efficacy of semi-automated geomorphometric workflows for seafloor mapping. Foundational tools such as the Benthic Terrain Modeler (BTM) [
24,
25] have been instrumental in rule-based classification using terrain derivatives like slope and BPI. More recent advancements, such as GA-SaMMT [
30] and CoMMa [
33,
34], introduced multiscale morphometric filtering within reproducible GIS environments, tailored for confined morphologies. In this study, CoMMa’s delineation capabilities were effectively used to identify pockmarks in the southwestern Baltic Sea, although permissive parameterization in these tools reinforced the need for post-segmentation refinement.
Building on this, a key contribution of this study is the integration of Random Forest classification to filter and validate delineated features. This approach is innovative in combining standard bathymetric morphometry tools with machine learning to separate noise from true seabed elements, addressing inherent MBES artifacts that often challenge traditional methods. The model’s use of MBG-derived traits (length, width, and width-to-length ratio) as primary predictors shows that pockmark identification relies fundamentally on their specific morphology, including their bounded dimensions and proportional geometry. These metrics encode the characteristic depression morphology that distinguishes pockmarks from other seabed features; the width-to-length ratio in particular helps distinguish “actual” pockmarks from irregular depressions. Confined relief (Conf_R) and minimum slope (Slope_Min) add relevant topographic information. Conf_R measures local depth differences and thus helps assess the intensity of a pockmark depression. It may be affected by seafloor slope, but as most pockmarks occur on gentle slopes, this effect is limited; future work could apply slope correction to refine estimates. Slope_Min increases sensitivity to sharp gradients at pockmark edges, helping to separate geomorphic features from terrain irregularities or oversegmented surface features that do not reflect fluid-induced formation processes. Collectively, these high-ranking features reflect typical pockmark morphology, such as compactness (MBG_W_L), confined geometry (MBG_Length and Width), and localized depth drops (Conf_R).
Their importance in the classification model confirms that geometric confinement and relative relief are key discriminators between pockmarks and MBES-derived noise, in agreement with expert morphological interpretation.
This aligns with recent work highlighting RF’s advantages in geomorphological classification, including robustness to overfitting, ability to model non-linear relationships, and minimal distributional assumptions [
36]. The use of WEKA (version 3.8.6) software further facilitated rapid model development and reproducibility, which is critical for transparent and scalable habitat mapping.
Despite these strengths, the workflow has limitations. The reliability of terrain derivatives depends heavily on the resolution and quality of the underlying bathymetric data. In areas with lower data fidelity or more noise, small pockmarks may be misclassified or missed. Additionally, while parameter thresholds were empirically tuned, subjectivity remains in defining criteria for initial segmentation (e.g., BPI radius, cutoff values). The reliance on expert-derived training data may also introduce bias, potentially limiting the model’s generalizability. Furthermore, as evidenced by the variation in precision and recall across validation areas, site-specific factors such as sediment composition, the quality of the acoustic survey, and MBES artifacts can influence feature expression. These artifacts, which are also observable through morphometric analysis as distortions in parameters like area and relief, further complicate classification. Applying the methodology to new areas will therefore likely require site-specific calibration to maintain classification efficacy. Even with such calibration, the method remains more objective than manual visual analysis.
Future research should explore automated threshold optimization methods and compare performance across different classifiers, such as Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Convolutional Neural Networks (CNNs). Subsequent work should consider integrating multiple expert interpretations or validating against independent ground-truth datasets (e.g., sub-bottom profiles, sediment cores) to minimize subjectivity and improve robustness. Further directions could also involve hybrid modeling approaches (ensemble methods, rule-based plus deep learning) to balance performance and interpretability. Future projects could incorporate adaptive delineation methods that extract defining features to optimize parameters based on their context and meaning. Furthermore, beyond terrain-based approaches, other automated classification methods have also proven effective for benthic habitat mapping. For example, Object-Based Image Analysis (OBIA) using acoustic backscatter has proven effective for classifying benthic habitats [
43,
44], reinforcing the value of multisensor workflows. As such, future research could benefit from systematically comparing the performance, adaptability, and data requirements of our terrain-based workflow against these alternative approaches to better define their respective strengths and applications. We also anticipate that the method could be transferable to seafloor features exhibiting both positive and negative relief (e.g., coral fields), although this requires further validation.
This workflow can be implemented directly in applications such as marine spatial planning, environmental monitoring, and offshore infrastructure development. By delivering a reproducible and time-efficient process for mapping the seafloor, it has the potential to help with environmental impact evaluation, habitat conservation, and geological hazard assessment. Most importantly, the scalability of this approach, which uses the same logic and requires only parameter tuning, could prove valuable as marine research and policy contend with continually growing bathymetric datasets.
5. Conclusions
In this work, a semi-automated workflow that combines machine learning and terrain analysis is presented for identifying and categorizing pockmarks from high-resolution multibeam bathymetric data in Flensburg Fjord. The workflow makes use of the boundary-based delineation feature of the CoMMa toolbox, and permissive BPI-derived parameters guarantee that morphologically subtle depressions are recorded during the first segmentation step. A supervised Random Forest classifier was trained using a variety of morphometric descriptors that were taken from every polygon in order to refine the resulting overinclusive dataset. The classifier successfully decreased false positives while maintaining the essential morphometric features of authentic pockmarks, achieving high precision and generalizability. Important parameters that best captured the compactness, symmetry, and local depth contrast characteristic of pockmark morphology were found to be MBG-derived width, length, and width-to-length ratio, as well as confined relief (Conf_R) and minimum slope. These findings support expert-based interpretations, emphasizing the importance of relative relief and geometric confinement in differentiating pockmarks from irregular depressions or MBES artifacts. The model’s robustness was validated using a separate test area, although there was some performance degradation that was probably caused by site-specific differences in the acoustic quality, sediment composition, or artifacts like nadir gaps. The trained model achieved 86.16% classification accuracy and performed consistently across validation areas, demonstrating its robustness in detecting pockmarks with high shape fidelity and effective suppression of false positives. 
To further improve robustness, scalability, and applicability to larger mapping initiatives, future studies should investigate threshold automation, alternative classifiers, and multi-sensor integration, and should obtain ground-truth validation via sub-bottom profiling, sediment cores, or visual surveys.
In conclusion, the workflow offers a scalable and time-efficient approach to seabed feature classification. Given its adaptability and reproducibility, this method holds strong potential for integration into national seabed mapping initiatives, environmental impact assessments (EIAs), and offshore infrastructure planning frameworks, particularly in areas where rapid and standardized habitat evaluation is essential for informed decision-making.