#### *2.2. Vegetation Indices*

To determine which vegetation indices delineated patches with the highest accuracy, 14 commonly used indices were calculated from the RGB aerial image that covered the study area (Figure 1), using the equations presented in Table 1. The image contained individual and clumped mangrove trees within a graminoid marsh matrix. The two vegetation classes (marsh and mangrove) differ in how they absorb and reflect electromagnetic radiation of different wavelengths, and trees also cast shadows onto other trees and marsh vegetation (Figure 3). As expected, mangroves reflected more light in the green spectrum than the surrounding marsh vegetation or shadows (Figure 3). Suitable vegetation indices enhance the contrast between tree patches, marsh matrix, and shadows.

**Figure 3.** RGB spectral values shown across a selected linear profile AB extracted from the aerial photograph.

**Table 1.** Commonly used vegetation indices, their equations, and source references.




No normalization was applied to the brightness values because the vegetation indices were mainly used to identify tree markers, and normalization does not necessarily enhance the contrast in index values between trees and marsh. The calculation of vegetation indices resulted in grayscale images, as shown for the ExG and ExR images (Figure 4). The grayscale index images were then binarized with Otsu's automatic thresholding method and used to delineate markers.

**Figure 4.** The original RGB aerial photograph and grayscale images of vegetation indices ExG and ExR.
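As a minimal sketch of how an index image is derived from the three bands, the snippet below computes ExG and ExR on raw (non-normalized) brightness values. The equations of Table 1 are not reproduced in this section, so the forms used here (ExG = 2G − R − B and ExR = 1.4R − G) are assumptions based on their commonly cited definitions, and the function names are hypothetical.

```python
import numpy as np

def excess_green(rgb):
    """ExG on raw brightness values. Assumed form: 2G - R - B."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return 2.0 * g - r - b

def excess_red(rgb):
    """ExR on raw brightness values. Assumed form: 1.4R - G."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g = rgb[..., 0], rgb[..., 1]
    return 1.4 * r - g

# One green-dominated pixel and one dark (shadow-like) pixel, values invented.
pixels = np.array([[[60, 120, 40], [20, 22, 25]]], dtype=np.uint8)
exg = excess_green(pixels)   # single-band grayscale index image
exr = excess_red(pixels)
```

The conversion to floating point before the arithmetic matters: computing on `uint8` directly would wrap around for negative results.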

#### *2.3. Otsu's Thresholding Method*

Otsu's automatic thresholding method [37] was used to generate the binary images for tree patches (with values of 1) and background marsh matrix (with values of 0). This non-parametric method operates on the histogram of pixel brightness values of a grayscale image and assumes that the histogram is bimodal, i.e., composed of two approximately normal intensity distributions [25]. One distribution represents the target pixels (i.e., mangrove patches) and the other represents the background (i.e., marsh matrix). Figure 5a shows the histograms of five vegetation indices that display narrow to widely spread bimodal distributions. Otsu's method selects the threshold that maximizes the between-class variance of the intensity values in the image, which is equivalent to minimizing the within-class variance, thereby providing optimal thresholding for an index (Figure 5b).

**Figure 5.** (**a**) Histogram of five vegetation grayscale indices (ExG, ExGR, COM, TGI, and GRB); (**b**) Otsu's automatic threshold for the ExG grayscale image.
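The threshold search described above can be sketched in a few lines of NumPy. This is an illustrative implementation of the between-class variance criterion, not the code used in the study; the function name `otsu_threshold` and the choice of 256 histogram bins are assumptions.

```python
import numpy as np

def otsu_threshold(gray, nbins=256):
    """Return the threshold that maximizes the between-class variance."""
    values = np.asarray(gray, dtype=np.float64).ravel()
    hist, edges = np.histogram(values, bins=nbins)
    centers = (edges[:-1] + edges[1:]) / 2.0
    counts = hist.astype(np.float64)

    w0 = np.cumsum(counts)              # pixel count in bins 0..k
    w1 = w0[-1] - w0                    # pixel count in bins k+1..end
    s0 = np.cumsum(counts * centers)    # intensity sum in bins 0..k
    s1 = s0[-1] - s0                    # intensity sum in bins k+1..end

    # w0 * w1 * (mean0 - mean1)^2 is proportional to the between-class
    # variance; splits that leave one class empty keep a score of zero.
    valid = (w0 > 0) & (w1 > 0)
    between = np.zeros(nbins)
    between[valid] = (w0[valid] * w1[valid]
                      * (s0[valid] / w0[valid] - s1[valid] / w1[valid]) ** 2)

    k = int(np.argmax(between))
    return edges[k + 1]                 # upper edge of the last "background" bin

# Synthetic bimodal index values (invented for illustration).
index = np.concatenate([np.linspace(40, 60, 500), np.linspace(140, 160, 500)])
t = otsu_threshold(index)
binary = (index >= t).astype(np.uint8)  # 1 = tree patch, 0 = marsh matrix
```

Because the returned value is a bin edge, pixels are assigned to the foreground with `>=` so that the split matches the histogram bins exactly.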

#### *2.4. Marker Detection and Watershed Segmentation*

The markers for watershed segmentation were delineated from the binary image by performing the following steps:


**Figure 6.** (**a**) Mangrove patches in the original RGB aerial photograph, (**b**) unequivocal tree patches derived from ExG index, and (**c**) marker image showing expanded tree patches (gray), background (brown), and indeterminate region (black) derived from ExG index.
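The marker image of Figure 6c can be approximated with the three morphological operations named in Section 2.6 (opening, dilation, and distance transform). The sketch below uses `scipy.ndimage` and follows a widely used marker recipe; it is an assumption-laden illustration rather than the study's implementation. In particular, thresholding the distance transform at DTC times its maximum, the square kernel, the default parameter values, and the function name `build_markers` are all assumptions.

```python
import numpy as np
from scipy import ndimage as ndi

def build_markers(binary, mks=3, open_iter=2, dilate_iter=3, dtc=0.5):
    """Label image: 0 = indeterminate, 1 = background, 2.. = tree markers."""
    mask = np.asarray(binary).astype(bool)
    kernel = np.ones((mks, mks), dtype=bool)   # assumed square kernel of size MKS

    # Opening removes small speckles from the Otsu binary image.
    opened = ndi.binary_opening(mask, structure=kernel, iterations=open_iter)
    # Dilation expands patches; everything outside is treated as sure background.
    expanded = ndi.binary_dilation(opened, structure=kernel, iterations=dilate_iter)
    # Pixels far from a patch edge are treated as unequivocal tree pixels.
    dist = ndi.distance_transform_edt(opened)
    sure_fg = dist > dtc * dist.max()

    labels, n_patches = ndi.label(sure_fg)
    markers = labels + 1                       # background becomes 1
    markers[expanded & ~sure_fg] = 0           # indeterminate ring around cores
    return markers, n_patches

# Two synthetic 10 x 10 "tree patches" on an empty background.
demo = np.zeros((40, 40), dtype=np.uint8)
demo[5:15, 5:15] = 1
demo[25:35, 25:35] = 1
markers, n_patches = build_markers(demo)
```

A marker image of this form (zero for the indeterminate region, positive labels elsewhere) is the input expected by marker-based watershed implementations such as OpenCV's `cv2.watershed`, which then assigns the indeterminate pixels to one of the labeled regions.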

#### *2.5. Removal of Shadows*

After applying the watershed segmentation algorithm, the following steps removed shadows:

1. A mean RGB (mRGB) index image was calculated by summing the intensity values of all three bands and dividing by three. A mask image of the same size as the mRGB index image was created, in which pixels were set to 0 where the mRGB value was less than the first percentile and to 1 where it was equal to or greater than the first percentile.
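The mRGB index and its percentile mask can be sketched as follows. This is a minimal illustration of step 1 only; the function name `shadow_mask` is hypothetical, and the default linear interpolation of `np.percentile` is an assumption about how the first percentile is computed.

```python
import numpy as np

def shadow_mask(rgb, percentile=1.0):
    """Return (mRGB image, mask) with mask = 0 below the given percentile."""
    rgb = np.asarray(rgb, dtype=np.float64)
    mrgb = rgb[..., :3].sum(axis=-1) / 3.0     # mean of the R, G, and B bands
    cutoff = np.percentile(mrgb, percentile)   # first percentile by default
    mask = (mrgb >= cutoff).astype(np.uint8)   # 1 = keep, 0 = darkest pixels
    return mrgb, mask

# Synthetic 10 x 10 image with gray levels 0..99 in every band.
levels = np.arange(100, dtype=np.float64).reshape(10, 10)
image = np.stack([levels, levels, levels], axis=-1)
mrgb, mask = shadow_mask(image)
```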


#### *2.6. Parameter Sensitivity*

The marker-detection process consisted of three morphological operations, as described in Section 2.4. The parameters of these operations were the morphological kernel size (MKS) for opening and dilation, the number of opening and dilation iterations, and the distance transform coefficient (DTC). A sensitivity analysis was performed to determine the parameter values that enhanced segmentation, which was evaluated on the basis of overall accuracy of tree detection. The parameters and test values are provided in Table 2. The full-factorial design produced 90 model combinations per index image, resulting in a total of 2520 models. Point-based accuracy estimates as described in Section 2.7 for each of these 2520 models were used to determine optimal parameter combinations and indices.


**Table 2.** Parameters and their values used for sensitivity analysis.

#### *2.7. Tree-Cover Estimation from Random Samples and Tree Detection Accuracy*

To evaluate the performance of each index, a simple random sample reference data set was generated. The first objective was to estimate the tree cover (area of patches) within the study area as a reference, and the second was to establish a reference for overall and class-specific omission and commission errors for each of the predicted tree cover maps. Since each map was to be evaluated with the same sample set, we chose a simple random sampling design [58]. The required minimum number of simple random sample points was calculated for a 2% precision (d = ±2%) estimate within a 95% confidence interval (z = 1.96) (Equation (1)) [58].

$$m = \frac{z^2 \times p \times (1 - p)}{d^2} \tag{1}$$

Considering the worst-case sampling scenario of p = 50% tree cover, a minimum of 2401 samples was required to estimate the tree-cover proportion within a 2% margin at 95% confidence. The sample points were randomly generated within the study area. Because the resolution and contrast of the aerial photograph were high enough to visually distinguish trees from marsh and shadow, and because it is optimal to evaluate maps against their photo source data to avoid potential changes [59], we visually evaluated each random sample point on the 2017 aerial photograph and assigned a class label (tree, marsh, or shadow). The visually interpreted random points were then used to estimate the tree cover within the study area. For this estimate, the marsh and shadow classes were combined into a no-tree class.
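The sample-size arithmetic of Equation (1) can be checked with a short script. Exact rational arithmetic is used here only to avoid floating-point rounding near the integer boundary; the function name `min_sample_size` is hypothetical.

```python
import math
from fractions import Fraction

def min_sample_size(z, p, d):
    """Minimum simple random sample size m = z^2 * p * (1 - p) / d^2."""
    z, p, d = (Fraction(str(v)) for v in (z, p, d))
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)

# z = 1.96 (95% confidence), p = 0.5 (worst case), d = 0.02 (2% precision)
print(min_sample_size(1.96, 0.5, 0.02))  # 2401
```

Because p(1 − p) peaks at p = 0.5, any other assumed tree-cover proportion would yield a smaller required sample.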

To estimate overall and class-specific user's and producer's accuracy for each of the algorithm-predicted maps, the classified cover type was extracted for all random samples from each tree-cover map. The extracted values and the reference labels were then cross-tabulated to generate confusion matrices. From the confusion matrices we estimated adjusted overall accuracy and adjusted class-specific user's and producer's accuracies for both the tree and no-tree classes, along with their standard errors [60], as well as adjusted tree cover proportions, factoring in the class proportion information of each map [58]. In Sections 3–5, the terms overall, user's, and producer's accuracy refer to these adjusted values. Furthermore, we were interested in how the presence of shadows affected the accuracy of segmentation. For each index for which segmented images with and without shadow removal were generated, the differences in overall, user's, and producer's accuracy, and in proportional area, were calculated and compared.
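One common way to adjust accuracy estimates for map class proportions is to weight each row of the confusion matrix by the mapped area share of that class. The sketch below illustrates that estimator; it is an assumption that this matches the formulas of [58,60], standard errors are omitted, and the counts and proportions in the example are invented.

```python
def adjusted_accuracy(counts, map_props):
    """Area-weighted accuracy estimates from a confusion matrix.

    counts[i][j]: number of samples mapped as class i with reference class j.
    map_props[i]: proportion of the map area assigned to class i.
    """
    k = len(counts)
    row_totals = [sum(row) for row in counts]
    # Estimated area proportion of each (map class, reference class) cell.
    p = [[map_props[i] * counts[i][j] / row_totals[i] for j in range(k)]
         for i in range(k)]
    ref_props = [sum(p[i][j] for i in range(k)) for j in range(k)]
    return {
        "overall": sum(p[i][i] for i in range(k)),
        "users": [p[i][i] / sum(p[i]) for i in range(k)],
        "producers": [p[j][j] / ref_props[j] for j in range(k)],
        "area_proportions": ref_props,   # adjusted cover per reference class
    }

# Hypothetical example: class 0 = tree, class 1 = no-tree.
result = adjusted_accuracy([[90, 10], [20, 80]], [0.3, 0.7])
```

In this hypothetical example the map assigns 30% of the area to trees, while the adjusted tree-cover proportion is 0.41, because a share of the mapped no-tree area is tree in the reference data.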

#### *2.8. Object-Based Overlap Accuracy Assessment*

The performance of the index images in delineating patches was further evaluated by overlap analysis of automatically detected patches against a manually digitized reference dataset. We used an object-based approach with tree polygons as sampling units and a post-classification simple random sampling design with equal probabilities for all polygons. In point-based accuracy assessment, the same reference data can be used to evaluate all models. Object-based evaluation, by contrast, requires separate reference data for each model output, because each model generates a different number of polygons with different sizes and must therefore be sampled individually. Consequently, it was not feasible to evaluate all 2520 models. Instead, we selected the two models with the highest point-based overall accuracy: one with shadows and the other with shadows removed.

For each of the two predicted tree cover maps, we selected 50 polygons using simple random sampling from a list frame. Random sampling from a list of all units within a population ensured equal selection probability for every polygon regardless of size, whereas point sampling would have favored large polygons and over-represented them at the expense of small polygons of individual trees [59]. We digitized tree patches manually from the original RGB aerial photograph. Because the patch polygons differed in size and contained either an individual tree or a group of trees (clumps), we digitized, in addition to patch boundaries and when possible, individual tree crowns with their centers inside a predicted polygon. We then assigned the sample identifier of the predicted polygon to all digitized patches in order to evaluate the count of trees within each polygon delineated by the watershed segmentation.

The spatial union of reference data and model-generated patches produced three types of areas: (1) correctly predicted tree patches, i.e., areas where prediction and reference agreed; (2) areas of omission error, i.e., tree polygons in the reference data that were missed by the model; and (3) areas of commission error, i.e., algorithm-delineated portions of mangrove polygons that were not part of the reference data. We quantified the three area types using Equations (2)–(4).

$$\text{Actual Tree Area} = \frac{\Sigma \text{Area of true overlap of patches from automatic segmentation}}{\Sigma \text{Area of patches from reference data}} \tag{2}$$

$$\text{Omission Error of predicted tree crowns} = \frac{\Sigma \text{Area of omitted patches from automatic segmentation}}{\Sigma \text{Area of patches from reference data}} \tag{3}$$

$$\text{Commission Error of predicted tree crowns} = \frac{\Sigma \text{Area of committed patches from automatic segmentation}}{\Sigma \text{Area of true patches from automatic segmentation}} \tag{4}$$

We also tabulated the total number of individual trees in each predicted polygon to determine whether each detected patch was an individual tree or a clump of trees.
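The three area types can be illustrated with boolean rasters standing in for the predicted and reference polygons. This is a simplified raster sketch, not the vector overlay used in the study; the function names and the unit cell area are assumptions. Only the ratios of Equations (2) and (3) are computed, since both divide by the reference area; the commission area is returned so that it can be divided by the denominator defined in Equation (4).

```python
import numpy as np

def overlap_areas(pred_mask, ref_mask, cell_area=1.0):
    """Areas of agreement, omission, and commission between two masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    ref = np.asarray(ref_mask, dtype=bool)
    return {
        "overlap": float((pred & ref).sum()) * cell_area,      # both agree
        "omission": float((~pred & ref).sum()) * cell_area,    # missed by model
        "commission": float((pred & ~ref).sum()) * cell_area,  # not in reference
        "reference": float(ref.sum()) * cell_area,
    }

def actual_tree_area(areas):
    """Equation (2): true overlap divided by reference area."""
    return areas["overlap"] / areas["reference"]

def omission_error(areas):
    """Equation (3): omitted area divided by reference area."""
    return areas["omission"] / areas["reference"]

# Two partially overlapping 4 x 4 squares on an 8 x 8 grid (synthetic).
ref = np.zeros((8, 8), dtype=bool)
ref[0:4, 0:4] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 2:6] = True
areas = overlap_areas(pred, ref)
```

Since overlap and omission together make up the reference area, the values of Equations (2) and (3) sum to one for any pair of masks with a non-empty reference.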
