#### 2.1.3. Pleiades-1A Multispectral Satellite Imagery

An MS Pleiades-1A image acquired on 25 May 2020 at 07 h 24 min UTC was provided by the French space agency CNES through the DINAMIS data platform (https://dinamis.data-terra.org/, last accessed: 28 December 2021). Pleiades-1A imagery is delivered with four MS bands at 2 m pixel size and an 11-bit dynamic range: blue (430–550 nm), green (500–620 nm), red (590–710 nm), and near infrared (740–940 nm). A panchromatic band (470–830 nm) at 0.5 m pixel size is also included, with the same radiometric resolution [35]. The four-band imagery is projected in the WGS84/UTM 38S coordinate system and radiometrically corrected to top-of-atmosphere (TOA) reflectance.

#### 2.1.4. ICESat-2 LiDAR Satellite Soundings

ICESat-2 is in a near-polar orbit at an altitude of 496 km and operates with a revisit period of 91 days over oceans [19,36]. ICESat-2 was mainly designed to measure ice sheet topography, sea ice, and various inherent properties of the atmosphere and terrestrial vegetation, although ocean and inland surface waters are also observed. The Advanced Topographic Laser Altimeter System (ATLAS), a photon-counting LiDAR, is the only sensor onboard the satellite and emits a green laser beam at a wavelength of 532 nm. ATLAS enhances spatial sampling by splitting the laser beam into three pairs of beams separated by 3.3 km. The two beams of each pair are separated by 90 m and consist of a "weak" beam and a "strong" beam with four times the pulse energy [19,36]. ICESat-2 data can be downloaded at different levels of processing, depending on the user's needs. This study uses version 3 of the L2 ATL03 georeferenced photons (data publicly available at https://search.earthdata.nasa.gov/search, last accessed: 28 December 2021) [37]. Each photon is provided with its latitude, longitude, and height relative to the WGS84 ellipsoid, along with other ancillary information. Because ICESat-2 was not designed to study the water column or the bottom topography, the analysis must include a correction for the refraction bias induced by the water column.

We selected the ICESat-2 track acquired on the date closest to the acquisition date of the MS imagery. The two datasets were acquired 10 days, 10 h and 33 min apart. Then, the specific study area in Mayotte was selected based on the range of depths for which calibration data were available. ICESat-2 passed over Mayotte on 14 May 2020, at 20 h 51 min UTC, and collected bathymetric data down to a depth of 15 m.
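The time gap quoted above can be checked directly from the two acquisition times. The following is a minimal sketch using only the timestamps given in the text; the variable names are illustrative.

```python
from datetime import datetime, timezone

# Acquisition times quoted in the text (UTC).
icesat2_pass = datetime(2020, 5, 14, 20, 51, tzinfo=timezone.utc)
pleiades_scene = datetime(2020, 5, 25, 7, 24, tzinfo=timezone.utc)

# Difference between the MS image and the ICESat-2 track.
gap = pleiades_scene - icesat2_pass
hours, remainder = divmod(gap.seconds, 3600)
minutes = remainder // 60

print(f"{gap.days} days, {hours} h and {minutes} min")
# prints "10 days, 10 h and 33 min"
```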

#### *2.2. Data Processing*

Most of the ICESat-2 photons that reach the oceans penetrate into the water. However, compared to the water surface returns, only a small fraction is returned by water column backscatter and bottom reflectance. Therefore, the ICESat-2 photon returns correspond primarily to water surface reflectance, water column backscatter, seabed reflectance, and noise.

ICESat-2 ATL03 data are provided with a preliminary classification of every photon indicating how likely it is to be signal or noise (the confidence levels are Noise, Low, Medium, High, and Buffer). Photons classified as "Buffer" are identified after all of the signal photons have been clustered. These are photons for which doubt remains, lying at the limit of being identified as part of the signal. This category was created to ensure that all of the photons identified as signal are present in the corrected product [37]. Figure 2 presents the transect used in this study, with the original classification of the georeferenced photons. In this figure, photon positions (latitude and longitude) were projected onto a local geographic plane, so that the horizontal axis corresponds to the along-track distance. The origin corresponds to the northernmost location of the trajectory. However, the ATL03 classification is not suited to underwater environments, as it labels a considerable share of the seafloor photons as noise. Therefore, all of the photons were considered, and a modified density-based spatial clustering of applications with noise (DBSCAN) algorithm was used to separate the photons characterizing the noise and the sea surface from those related to the seabed [17,38–40].

**Figure 2.** Photon point clouds of the transect along ICESat-2 gt1l (strong beam), acquired on 14 May 2020. The confidence levels provided by ATL03 are displayed.
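The along-track axis used in Figure 2 can be illustrated with a short sketch. The text does not specify which projection was applied, so the example below assumes a local equirectangular projection on a spherical Earth, centred on the northernmost photon; the function name, the Earth radius constant, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius (spherical approximation)


def along_track_distance(lats, lons):
    """Return the distance (m) of each photon from the northernmost photon.

    Assumption: a local equirectangular projection centred on the origin
    photon, which is reasonable for a transect of a few tens of kilometres.
    """
    origin = max(range(len(lats)), key=lambda i: lats[i])
    lat0 = math.radians(lats[origin])
    lon0 = math.radians(lons[origin])
    distances = []
    for lat, lon in zip(lats, lons):
        east = EARTH_RADIUS_M * (math.radians(lon) - lon0) * math.cos(lat0)
        north = EARTH_RADIUS_M * (math.radians(lat) - lat0)
        distances.append(math.hypot(east, north))
    return distances


# Hypothetical photon positions on a roughly north-south track.
lats = [-12.70, -12.75, -12.80]
lons = [45.100, 45.095, 45.090]
distances = along_track_distance(lats, lons)
```

The first photon is the northernmost one, so its distance is zero and the values increase southward along the track.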

#### 2.2.1. Noise Removal and Detection of the Sea Surface

In the dataset, noise corresponds to sparse points with a low spatial density compared to the sea surface and seabed clusters. Georeferenced photons likely to be noise were removed, and photons associated with the sea surface were identified.

Here, density-based spatial clustering, an unsupervised learning method for identifying clusters in a dataset, was used. The method is based on the premise that each cluster is a region of points of a given density, spatially isolated from other clusters by areas of lower density. The DBSCAN algorithm scans the entire dataset, placing a search circle on each point in turn. A point whose search circle contains at least *MinPts* points is a "core point". DBSCAN is configured by two user-specified parameters: the radius of the search circle and the minimum number of points *MinPts*. Once the density criterion is no longer satisfied, the algorithm begins a new classification group [41].
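The clustering principle described above can be sketched in a few lines. This is a plain textbook DBSCAN with a circular Euclidean neighbourhood, not the modified variant used in this study; the function name, the synthetic photons, and the parameter values are assumptions chosen for illustration only.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each (x, z) point with a cluster id, or NOISE (-1)."""

    def neighbors(i):
        # Indices of all points inside the search circle of point i
        # (the point itself is included in the count).
        xi, zi = points[i]
        return [j for j, (xj, zj) in enumerate(points)
                if math.hypot(xi - xj, zi - zj) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1           # point i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins, but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is also a core point
                queue.extend(reachable)
    return labels


# Synthetic transect: (along-track distance in m, height in m).
surface = [(0.5 * k, 0.0) for k in range(20)]
seabed = [(0.5 * k, -8.0) for k in range(20)]
sparse = [(3.0, -4.0), (7.0, 5.0)]
labels = dbscan(surface + seabed + sparse, eps=1.0, min_pts=3)
```

With these synthetic values, the dense surface and seabed layers form two separate clusters, and the two isolated photons are labelled as noise.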

Previous studies successfully applied DBSCAN to ICESat-2 data over islands located in the south of China and in the Bahamas. One particular study provides formulas to configure the *MinPts* and radius parameters of the DBSCAN algorithm [17]. In the present research, the search radius was chosen manually to guide the clustering process and optimize the results. The *MinPts* parameter was defined by Equation (1) [17] (this formula is suited to a water column whose depth is not expected to exceed 60 m).

$$MinPts = \frac{2SN_1 - SN_2}{\ln\left(\frac{2SN_1}{SN_2}\right)},\tag{1}$$

where *SN*<sub>1</sub> is the expected number of signal and noise photons, defined by Equation (2):

$$SN_1 = \frac{\pi \epsilon^2 N_1}{h l},\tag{2}$$

where *N*<sub>1</sub> is the total number of photons (both signal and noise), *ε* is the search radius, *h* is the vertical range, and *l* is the along-track range. *SN*<sub>2</sub> is the expected number of noise photons, defined by Equation (3):

$$SN_2 = \frac{\pi \epsilon^2 N_2}{h_2 l},\tag{3}$$

where *N*<sub>2</sub> is the number of photons in the layer with the fewest bathymetric photons, and *h*<sub>2</sub> is the height of that layer [17].

The variable *MinPts* is constrained to a value no lower than 3: if Equation (1) yields a value below this threshold, *MinPts* is set to 3 [17]. This algorithm might not be optimal in the present situation, since the dataset contains isolated seabed photons that could be identified as noise. Given the small number of seabed photons, it was decided not to optimize the noise-cleaning process further, even though this meant that some manual cleaning was required. The remaining noise points were therefore removed manually using Global Mapper 22.1.0 software (Blue Marble Geographics, Hallowell, ME, USA).
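As an illustration, Equations (1)–(3) and the lower bound of 3 can be evaluated as in the following sketch. The function name, argument names, and input values are hypothetical and are not taken from the dataset used in this study. Rounding to an integer is not specified in the text and is left to the caller.

```python
import math


def min_pts(n1, h, l, n2, h2, eps, floor=3):
    """Evaluate Equations (1)-(3), with MinPts constrained to at least `floor`.

    n1  : total number of photons (signal and noise)
    h   : vertical range (m)
    l   : along-track range (m)
    n2  : number of photons in the layer with the fewest bathymetric photons
    h2  : height of that layer (m)
    eps : search radius (m)
    """
    if n2 <= 0:
        raise ValueError("n2 must be positive for Equation (1) to be defined")
    sn1 = math.pi * eps**2 * n1 / (h * l)    # Equation (2)
    sn2 = math.pi * eps**2 * n2 / (h2 * l)   # Equation (3)
    value = (2 * sn1 - sn2) / math.log(2 * sn1 / sn2)  # Equation (1)
    return max(floor, value)


# Hypothetical inputs: a 60 m vertical range over a 10 km transect.
small_radius = min_pts(n1=20_000, h=60, l=10_000, n2=500, h2=5, eps=1.5)
large_radius = min_pts(n1=20_000, h=60, l=10_000, n2=500, h2=5, eps=15)
```

With the smaller radius, Equation (1) yields a value below the threshold, so the function returns 3. With the larger radius, the formula value is returned unchanged.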
