1. Introduction
The Clarion Clipperton Fracture Zone (CCZ), positioned in the northeast Pacific, is in industrial focus as metal-rich polymetallic nodules occur in high abundances on the seafloor in this area [
1,
2]. Several governmental and private companies have obtained licenses to explore 75,000 km
2-large areas under the auspices of the International Seabed Authority (ISA) (
https://isa.org.jm/). There is consensus that potential future mining of these nodules will have distinct impacts on the ecosystem, with slow recovery rates of most organisms and potential changes in biodiversity [
3,
4,
5]. Although investigation of the benthic communities in these nodule areas already started in the 1970s [
5], many fundamental questions concerning their ecology remain unanswered [
6]. A major difficulty is the high biodiversity of the communities, in contrast with very low abundances in the deep sea [
7,
8]. In the CCZ, habitat heterogeneity is high, not only due to areas with varying nodule sizes and densities alternating with nodule-free areas, but also due to topographic features such as seamounts and abyssal hills [
6,
9].
Here, we focus on the study of metazoan meiofauna, a group containing all organisms dwelling in benthic sediments that pass through a sieve with a mesh size of 1 mm but are retained by a 32 µm mesh. Meiofauna occurs in high abundances in the deep sediments of the CCZ, with more than several thousand individuals usually being observed per 100 cm
2 of seafloor sediment [
8,
10,
11,
12,
13]. Although the abundance of meiofauna is high in the CCZ [
8,
12], diversity of the most abundant taxon Nematoda is comparable to other deep-sea areas [
10]. Comparing nodule-free and nodule-covered areas, abundance is higher in nodule-free areas, whereas diversity is higher at areas covered with nodules [
14], most likely due to increased substrate complexity through the nodules [
15]. In the Peru Basin, another nodule area in the southeastern Pacific, communities with different taxonomic composition have been described from nodule crevices compared to the adjacent sediments, thus increasing overall biodiversity [
16,
17].
Although polymetallic nodules are an important environmental attribute of the CCZ, meiofauna is known to be influenced by a variety of different environmental variables ranging from sediment composition [
18] and food availability [
19] to currents [
20] and topographic features [
21]. Additionally, distinct differences were observed whilst comparing samples obtained in 2004 and 2012 from the same areas, with 1.5-fold higher abundance in 2012 and clear shifts in nematode community composition [
11]. These differences were hypothesized to be attributable do differences in primary production and, hence, nutrient input in the study area [
11]. Strong seasonal variation, however, seems unlikely as no significant intra-annual differences in meiofauna and nematode communities could be observed between April and October 2015 in the eastern Belgian license area in the CCZ [
12].
All of these investigations are based on point-source data. Another approach is to predict meiofauna abundance and diversity spatially across areas using predictive models, which is often used for management and conservation planning [
22]. However, there are a variety of different methods that can be implemented, depending on their aims and applications [
23]. These range from simple spatial interpolation from the nearest neighbour as used by Wedding et al. [
24] for macrofaunal abundances in the CCZ to more complex models integrating environmental variables and even biological interactions (e.g., [
25]). In the marine environment, General Linear Models (GLMs) have been commonly used [
23]. However, caution has to be given to possible non-linear responses to environmental parameters, especially those using many variables of unknown influence [
23]. The tree-based approach random forest, introduced by Breiman [
26], is more robust under such conditions [
27] and is hence preferred in this study.
In the CCZ, spatial prediction using random forest regression has been suggested as an appropriate tool for the definition of preservation areas [
13] at the scale of a whole contractor area (approx. 75,000 km
2). On that scale, only continuous predictor data obtained and derived from either multibeam echosounder systems or satellite-based remote sensing are available [
13]. However, distribution models can also be a useful tool to investigate possible influences of environmental variables as well as temporal changes [
23], especially in smaller areas, e.g., at the potential scale of a mining operation (~1500 km
2). Although environmental data are often available as point-source data from the same positions as meiofauna abundance, they can potentially also be obtained from different positions. In this case, the most straightforward approach is to also model and spatially predict the distribution of environmental variables such as grain size of the sediment or organic contents [
28].
In this study, we aim to predict the spatial and temporal variation (regular sampling between 2010 and 2018) of meiofauna abundance at a scale half as large as a potential mining operation (~600 km2) using sediment and nodule characteristics as predictors in a random forest approach. Furthermore, we compare distribution models of meiofauna computed solely with remote-sensing data such as bathymetry and backscatter to models also including the predicted layers of sediment and nodule characteristics. Inter-annual variability in meiofauna abundance is explored in a similar way, using data only from each sampling year as response variables in the models. Hence, we aim at evaluating the use of distribution models and predicted distributions to describe and investigate the meiofauna community.
2. Materials and Methods
The study area is a prospective nodule area that could be envisaged as a future mining site in the CCZ and is located within a license area assigned to the Federal Institute for Geosciences and Natural Resources (BGR), Germany, by the ISA for the exploration of polymetallic nodules. The study area has a length of ~20 km from north to south and ~29 km from east to west (
Figure 1). Meiofauna has been sampled in the study area in 2010 as well as every year between 2013 and 2018 except for 2017 [
29,
30,
31,
32] (Access can be given by the BGR on request:
[email protected]), [
33,
34] in water depths between −4156 m and −4076 m. In all years, sampling was conducted between the end of March and mid-May. Six stations in the study area have been assigned to a time series; highest repetition of stations was achieved for two stations, with samples being available from the years 2013, 2014, 2016 and 2018.
Sampling was conducted with a multicorer, adding up to a total of 35 multicorer deployments in total for all years together. A varying number (1–5) of cores was available from each deployment; hence a number of 106 core samples could be used for this study. The upper five centimetres of sediment were fixed in a 4% formaldehyde/seawater solution with the overlying bottom water. Core diameter amounted to 9.6 cm, except for 2013, when smaller cores with a diameter of only 7.4 cm were used. To extract the meiofauna from the sediment, the differential flotation method [
35] was used with the colloidal gel Levasil
® (Bohus, Sweden). The supernatant containing the organisms was stained with Rose Bengal to simplify the recognition of the meiofauna, and abundances were counted on high taxonomic level (e.g., Nematoda, Copepoda, Tardigrada) using a dissecting microscope. Due to the use of cores with a smaller diameter in 2013 and hence a smaller sampling volume, all abundances were converted to abundance per 100 cm
2. Based on abundances, taxon richness was computed as the number of higher taxa occurring at a site; Simpson’s Index [
36] and Pielou’s Evenness [
37] were computed as measures of biodiversity on a higher taxonomic level.
Characteristics of sediments and polymetallic nodules have been investigated using samples obtained with a boxcore (50 × 50 cm
2) during seven cruises between 2008 and 2016 [
29,
30,
31,
38,
39] (Access can be given by the BGR on request:
[email protected]), [
33,
40] (
Figure 1). Sediment parameters include grain size, shear strength and wet and dry bulk density, as well as the content of carbon and different metal elements in the sediment. For nodules, the total nodule number, weight and size were determined from each boxcore. Additionally, the chemical composition of the nodules was determined.
The main information source for the spatial prediction of environmental variables as well as meiofauna abundance and diversity is ship-based bathymetry obtained with a Simrad EM 120 with a resolution of 100 m [
38]. A second parameter is the backscatter value, also obtained from the sea surface via a Simrad EM 120, available with a resolution of 120 m [
38]. Hence, bathymetry was resampled to the lower resolution using the function
resample() from the R package raster [
41]. Based on bathymetry, the function
terrain(), also part of the R-package raster [
41], was used to compute slope, aspect, roughness, Terrain Ruggedness Index (TRI) and the direction of the greatest drop in elevation. The Bathymetric Position Index (BPI) was computed on three different scales of 300 m, 1.5 km and 5 km using the function
focal() from the same R package [
41].
Spatial predictions were computed with random forest regression [
42] as applied by Uhlenkott et al. [
13]. Predictor variables for model computation were subsets from the available environmental layers as well as bathymetry and backscatter at the sampling positions. For spatial prediction, the layers were used as new data to allow for continuous prediction of sediment and nodule characteristics as well as meiofauna abundance and diversity. Meiofauna community analysis was based on Bray-Curtis Dissimilarity [
43], fitting environmental variables as vectors on a non-metric multidimensional scaling (nMDS) ordination using the function
envfit() [
44]. Differences between years were investigated with a permutational analysis of variance (PERMANOVA) [
45] as implemented in the function
adonis() [
44]. For pairwise comparison, the function pairwise.adonis was used [
46] (available from Github (
https://github.com/pmartinezarbizu/pairwiseAdonis)). Additionally, differences in multivariate dispersion between years were investigated using an analysis of multivariate homogeneity of group dispersions as implemented in the functions
betadisper() and
permutest() [
44].
All computations were carried out in the statistical environment of R [
47] using the R packages colorRamps [
48], randomForest [
42], rangeBuilder [
49], raster [
41], vegan [
44] and viridisLite [
50].
4. Discussion
Distribution modelling can be a powerful tool, not only for management issues [
22], but also to investigate potential influences of environmental variables on organisms [
23]. Hence, the spatial description of the abiotic environment is an important first step to investigate patterns affecting faunal distributions [
28]. However, the only continuous and available layers with appropriate resolution were bathymetric variables and backscatter value; all of these variables were obtained as remote-sensing multibeam data. Continuous layers of relevant environmental variables such as grain size are scarcely available in the CCZ [
13,
24] or rather available on very low, global-scale resolution with information on, e.g., particulate organic matter [
51], temperature, salinity or dissolved oxygen [
52]. In this study, only 26 out of 73 parameters describing sediment and nodule characteristics proved to be significant and useful for modelling purposes. Hence, it is possible that some parameters display a distribution that is not mirrored in large-scale geographic features and also cannot be described using a combination of these.
Environmentally, the abyssal plain is a habitat characterized by low food flux and low physical energy [
53] and, due to these conditions, the deep-sea fauna is usually highly diverse but occurs in low abundances [
7]. Hence, the natural variability of the benthic fauna of all size classes is high on a small scale [
8,
54,
55]. Therefore, if natural variability of environmental parameters as well as within the benthic fauna is too large, the model also cannot find patterns within the area. Regarding meiofauna, the variability between cores of the same multicorer deployment is often as large or even larger than the variability between cores of geographically separated deployments [
56]. In our study, we could observe that such high variability can also be observed temporally, with heightened meiofauna abundance in only one core of the time series samples obtained in 2013.
Differences in predicted spatial distribution of meiofauna, however, are only minor, regardless of whether the model is computed based on bathymetry and backscatter, the predicted environmental parameters or all available variables. This might be due to the fact that the models predicting distribution of sediment and nodule characteristics use bathymetry and backscatter as predictor variables. Hence, significant influences of environmental variables on the community analysis based on Bray Curtis Dissimilarity largely mirror the influences of water depth and aspect. However, geographic features are known to mirror environmental drivers [
23,
57], and therefore, it is nonetheless useful to include predicted environmental layers to investigate their influences on the meiofauna. Most meiofauna variables have a smoother predicted distribution when integrating environmental parameters into the models. The use of higher resolution might actually not be useful for modelling the absolute values of meiofauna densities due to the high natural variability [
13,
56,
58], which mainly allows for the prediction of trends. For computing predictions on species level, other taxa or other size classes, the additional use of environmental variables might become more important.
Water depth is often related to food availability, which usually decreases with increasing water depth [
19,
57,
59]. In the study area, water depth varies only slightly between sampling positions, with the largest difference amounting to 80 m. However, despite similar directions of vectors across the meiofauna community depicted in Bray-Curtis Dissimilarity, Pearson’s correlation of −0.17 (p: ~0) between water depth and total organic carbon is even slightly negative in the study area, which may be the result of small-scale lateral transport of food particles. On a larger scale, there are distinct gradients in POC-flux across the CCZ [
60] that are likely to influence meiofauna patterns across a distance of hundreds of kilometres [
8]. These differences were found to even mask the small-scale influence of grain size on meiofauna [
8]. On the small scale of this study, it is more likely that bottom currents influenced by topographic features such as seamounts and ridges as well as nodule sizes shape food availability and other sediment characteristics such as grain size [
61,
62,
63]. Still, the variability of meiofauna sampled with the multicore might be influenced by variability on a smaller scale than that obtained with the larger boxcore samples and expressed in the predicted distribution of environmental variables.
The other continuously measured variable, backscatter value, is a proxy for the firmness of the seafloor and can therefore be used to predict the nodule cover on the soft-sedimented seafloor [
9,
64]. Different nodule patterns correlate with meiofauna in different ways. High nodule abundance can lead to lower meiofauna abundance simply by reducing the available sediment volume. However, communities in nodule areas also vary from communities observed in nodule-free areas [
10,
14], and in the Peru Basin, different communities have also been described from nodule crevices [
16,
17]. Nevertheless, it is unlikely that parameters such as the content of molybdenum or other elements in the nodules directly influence meiofauna patterns, as suggested by the community analysis (
Figure 2). Most likely, they are either mirroring other features of the polymetallic nodules, e.g., porosity or size, or are proxies of other environmental variables that do influence the meiofauna community. Hence, they can be used as predictor variables for distribution modelling just as geographic features, although they possibly do not influence the meiofauna abundance directly.
In contrast, variation between models computed for the different years is high. Natural temporal variability of environmental conditions is high in the deep sea [
5] and significant inter-annual differences have already been described for meiofauna in the French license area in the CCZ [
11]. In this study, abundance of Nematoda was 1.5-fold higher in 2012 compared to abundance in 2004 [
11], which is similar to the differences observed in this study, with a steady increase from 2010 to 2016, followed by a drop in abundance back to the level of 2010 in 2018. Spatially, a shift in abundance patterns seems to occur between 2013 and 2014 in our study area. A possible explanation could be shifts in the hydrodynamic regime of the study area due to changes in intertidal flow or benthic storms initiated by mesoscale eddies passing through the area [
65]. Such events might influence meiofauna abundance within the sediment in different ways through, e.g., input and redistribution of food particles. At current velocities of 8 cm/s, resuspension of particles can be observed [
63], which might limit the access of meiofauna to food particles deposited on the seafloor. However, model performance is weak if the models are solely based on data obtained in 2013, 2014 and 2016, and additionally, the number of samples available for 2010 and 2018 is too low for model computation. Therefore, it is difficult to clearly separate bias from natural patterns. Still, distribution models proved to be a useful tool to investigate temporal patterns in the spatial distribution of meiofauna abundance and diversity.
In the context of potential nodule mining, distribution models are an important tool to define appropriate protected areas and preservation zones within the CCZ and within individual contractor areas, respectively [
13]. Here, we show that random forest regression can also help to investigate relationships between environmental variables and deep-sea meiofauna at the geographic scale of a potential mining operation even if these variables are obtained from different sampling sites. Although the variance explained by the models is rather low due to a small number of observations contrasting high variability, such models are the best possibility to investigate spatial distribution of the benthic fauna. Hence, the distribution modelling approach applied in this study could also prove useful for other size classes as well as on lower taxonomic levels, e.g., of the most abundant meiofauna taxon Nematoda. A potential issue might be the low dominance of species and genera in the deep sea [
7], which can lead to few observations of the same taxon even within a distinctive sampling approach. However, the great differences between years also advocate for extended investigation of temporal variability to better define the levels of natural variability in order to obtain a better understanding of possible mining impacts and ecosystem recovery.
A similar modelling approach could be used to investigate the influence of pilot mining actions that do not only remove the sediment from the mining area but also provoke a sediment plume that will additionally affect more distant areas [
5]. Although the specific effects of nodule mining in the deep sea are uncertain, impacts are likely to be severe [
5]. Combining distribution models of benthic organisms with models of the impact, e.g., of the sediment plume [
63], could help in the establishment of preservation zones that are out of the impact area of the disturbance. Furthermore, distribution models can help to estimate possible spatial extents of the impact during monitoring and later to investigate recovery of the disturbed areas.