## 2.5.3. Feature Importance

The variables in RFR and in gradient boosting machine algorithms, such as XGBR and GBR, are often ranked by the variable-importance approach [55,63,64]. Relative variable importance is computed as follows. The first step searches for a candidate subset of variables (in this case, by the grid search approach). Initially, the grid search includes all variables derived from the S-2, VIs, and ALOS-2 PALSAR-2 datasets. These variables are input to the XGBR model, which ranks them in descending order of importance based on the root mean squared error (RMSE) and the coefficient of determination (*R*<sup>2</sup>). Next, a certain number of the least important variables are removed, and the surviving variables form a variable subset. In this paper, the search/selection iterations were terminated when the *R*<sup>2</sup> of the prediction model built on the subset no longer improved on the test set. The final step validates the selected variable subset and determines the relative variable importance (in this case, by the five-fold CV approach).

The XGBR modeling and the computation of variable importance were implemented in Python.
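The selection loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: scikit-learn's `GradientBoostingRegressor` stands in for XGBR (`xgboost.XGBRegressor` exposes the same `feature_importances_` attribute and could be substituted), the hyperparameter grid search is replaced by fixed settings, and the data, the variable names (`var_0` to `var_7`), the helper names `make_model` and `backward_select`, the 70/30 split, and `drop_per_round` are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score, train_test_split


def make_model(seed=0):
    # Stand-in for XGBR; hyperparameters are illustrative, not tuned.
    return GradientBoostingRegressor(n_estimators=100, random_state=seed)


def backward_select(X, y, names, drop_per_round=2, seed=0):
    """Drop the least important variables until the test R^2 stops improving."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    keep = list(range(X.shape[1]))          # start from all variables
    best_keep, best_r2 = keep, -np.inf
    while True:
        model = make_model(seed).fit(X_tr[:, keep], y_tr)
        r2 = model.score(X_te[:, keep], y_te)   # R^2 on the held-out test set
        if r2 <= best_r2:
            break                           # test R^2 no longer improves
        best_keep, best_r2 = keep, r2
        if len(keep) <= drop_per_round:
            break                           # too few variables left to remove
        order = np.argsort(model.feature_importances_)  # least important first
        keep = [keep[i] for i in sorted(order[drop_per_round:])]

    # Final step: five-fold CV on the selected subset, then relative importance.
    X_sel = X[:, best_keep]
    cv_r2 = cross_val_score(
        make_model(seed), X_sel, y, cv=5, scoring="r2").mean()
    importances = make_model(seed).fit(X_sel, y).feature_importances_
    selected = [names[i] for i in best_keep]
    ranking = sorted(zip(selected, importances),
                     key=lambda pair: pair[1], reverse=True)
    return selected, cv_r2, ranking


# Synthetic data (hypothetical): only var_0 and var_1 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.1, size=120)
names = [f"var_{i}" for i in range(8)]

selected, cv_r2, ranking = backward_select(X, y, names)
print("selected variables:", selected)
print("five-fold CV R^2:", round(float(cv_r2), 3))
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The importances returned by the fitted model sum to one, so each value in `ranking` can be read directly as a relative share of the total importance.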

## 2.5.4. Model Evaluation

The performances of the various ML models were evaluated and compared by the RMSE (Equation (2)) and *R*<sup>2</sup> (Equation (3)), which are widely employed in forest AGB estimation. Both metrics evaluate the errors of a regression model from the differences between the measured data (the mangrove forest measurements) and the estimated AGB data [50]. A well-performing model achieves a high *R*<sup>2</sup> and a low RMSE [24,47].

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} \left( ye_i - ym_i \right)^2}{n}} \tag{2}$$

$$R^2 = \left( \frac{\sum_{i=1}^{n} \left( ye_i - \overline{ye} \right) \left( ym_i - \overline{ym} \right)}{\sqrt{\sum_{i=1}^{n} \left( ye_i - \overline{ye} \right)^2 \sum_{i=1}^{n} \left( ym_i - \overline{ym} \right)^2}} \right)^2 \tag{3}$$

In the above expressions, *ye<sub>i</sub>* is the mangrove AGB predicted by the ML model, *ym<sub>i</sub>* is the measured mangrove AGB, *n* is the total number of sampling plots, and $\overline{ye}$ and $\overline{ym}$ are the mean values of the predicted and measured mangrove AGBs, respectively.
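As a worked illustration, the two metrics can be computed with NumPy as sketched below. The sketch assumes that *R*<sup>2</sup> is taken as the squared Pearson correlation between predicted and measured AGB, which is one reading of Equation (3); it differs from the 1 − SS<sub>res</sub>/SS<sub>tot</sub> definition used by, for example, scikit-learn's `r2_score`. The function names and the four plot values are hypothetical.

```python
import numpy as np


def rmse(ye, ym):
    """Root mean squared error between predicted (ye) and measured (ym) AGB."""
    ye, ym = np.asarray(ye, dtype=float), np.asarray(ym, dtype=float)
    return float(np.sqrt(np.mean((ye - ym) ** 2)))


def r2_pearson(ye, ym):
    """Squared Pearson correlation between predicted and measured AGB."""
    ye, ym = np.asarray(ye, dtype=float), np.asarray(ym, dtype=float)
    de, dm = ye - ye.mean(), ym - ym.mean()   # deviations from the means
    r = np.sum(de * dm) / np.sqrt(np.sum(de ** 2) * np.sum(dm ** 2))
    return float(r ** 2)


# Hypothetical AGB values for four sampling plots.
measured = [100.0, 150.0, 200.0, 250.0]
predicted = [110.0, 140.0, 210.0, 240.0]

rmse_value = rmse(predicted, measured)       # every error is +/-10, so RMSE = 10.0
r2_value = r2_pearson(predicted, measured)   # 11500^2 / (10900 * 12500), about 0.9706
print(rmse_value, round(r2_value, 4))
```

The RMSE is expressed in the same units as the AGB values, whereas *R*<sup>2</sup> is dimensionless, so the two are best reported together.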
