**3. Results**

#### *3.1. Performance of the GBDT Estimation Model*

The optimal hyperparameter values for the GBDT, determined by exhaustive grid search, were 0.01 for the shrinkage factor, 3 for the tree depth, and 1500 for the number of trees. Figure 6 depicts the GBDT estimation results for the training, validation, and testing sets. Figure 6a shows that all of the training data points are grouped closely around the 45-degree line. Table 3 reports very low error metrics (an AAE of 0.33 m<sup>3</sup>/t, an ARE of 2.31%, and an RMSE of 0.42 m<sup>3</sup>/t) and a high R<sup>2</sup> of 0.993 for the training set. These evaluation metrics indicate that the GBDT can accurately reproduce the adsorption amount from the input variables. For the validation and testing sets, although the cross plots of the measured versus estimated values are more scattered than for the training set, the majority of the data points are distributed around the 45-degree line and the deviations are within small ranges (Figure 6b,c). The AAE, ARE, RMSE, and coefficient of determination (R<sup>2</sup>) are 0.83 m<sup>3</sup>/t, 5.97%, 1.00 m<sup>3</sup>/t, and 0.950 for the validation set, and 0.85 m<sup>3</sup>/t, 6.35%, 1.06 m<sup>3</sup>/t, and 0.946 for the testing set, respectively. The error metrics for the validation and testing sets are quite comparable, suggesting strong robustness of the constructed model (Table 3). In this regard, the GBDT model can be considered to have a strong generalization capability, as indicated by the relatively low error metrics and high R<sup>2</sup>.
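For readers who want to reproduce the four statistics quoted above, the following is a minimal sketch. It assumes the conventional definitions (mean absolute error for AAE, mean absolute relative error in percent for ARE, root-mean-square error, and R<sup>2</sup> as one minus the ratio of residual to total sum of squares); the function name `error_metrics` and the sample numbers are illustrative only and are not taken from the data set.

```python
import math


def error_metrics(measured, estimated):
    """Return AAE, ARE (%), RMSE, and R^2 for paired values.

    Assumes the conventional definitions of each statistic; the
    measured values must be non-zero for ARE to be defined.
    """
    if len(measured) != len(estimated) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(measured)
    residuals = [e - m for m, e in zip(measured, estimated)]

    # Average absolute error, in the units of the data (e.g. m3/t).
    aae = sum(abs(r) for r in residuals) / n
    # Average relative error, expressed as a percentage.
    are = 100.0 * sum(abs(r) / abs(m) for r, m in zip(residuals, measured)) / n
    # Root-mean-square error.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    # Coefficient of determination.
    mean_measured = sum(measured) / n
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    r2 = 1.0 - ss_res / ss_tot

    return {"AAE": aae, "ARE": are, "RMSE": rmse, "R2": r2}


# Made-up adsorption amounts (m3/t), purely to exercise the function.
demo = error_metrics([10.0, 20.0, 30.0], [11.0, 19.0, 30.0])
print(demo)
```

Applying the same function separately to the training, validation, and testing predictions would give one row of statistics per data set, in the layout of Table 3.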

**Figure 6.** Cross plot of the gradient boosting decision trees (GBDT) estimated versus measured adsorption amounts for the (**a**) training, (**b**) validation, and (**c**) testing sets. Open circles are data points; red lines are 45-degree lines.


**Table 3.** Error statistics of the GBDT, BP-ANN, and SVM models.

To further demonstrate the accuracy of the GBDT model in reproducing the adsorption isotherm of an individual coal sample, the estimated and measured adsorption isotherms were compared for typical samples in the testing set. As mentioned in Section 2.4.1, the methane adsorption capacity of the coal samples is predominantly controlled by the ash and fixed carbon contents. Therefore, four typical samples were selected from the testing set to illustrate the model accuracy: the samples with the highest ash content, the lowest ash content, the highest fixed carbon content, and the lowest fixed carbon content.
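The selection rule described above can be sketched in a few lines. The records, sample identifiers, and field names below are hypothetical placeholders, not values from the testing set; only the rule itself (take the extremes of ash and fixed carbon content) comes from the text.

```python
def pick_extreme_samples(samples):
    """Return the ids of the samples with extreme ash and fixed carbon contents."""
    return {
        "highest_ash": max(samples, key=lambda s: s["ash"])["id"],
        "lowest_ash": min(samples, key=lambda s: s["ash"])["id"],
        "highest_fixed_carbon": max(samples, key=lambda s: s["fixed_carbon"])["id"],
        "lowest_fixed_carbon": min(samples, key=lambda s: s["fixed_carbon"])["id"],
    }


# Hypothetical testing-set records (contents in %), for illustration only.
testing_samples = [
    {"id": "S1", "ash": 10.0, "fixed_carbon": 88.0},
    {"id": "S2", "ash": 40.0, "fixed_carbon": 86.0},
    {"id": "S3", "ash": 25.0, "fixed_carbon": 92.0},
    {"id": "S4", "ash": 20.0, "fixed_carbon": 84.0},
]

selected = pick_extreme_samples(testing_samples)
print(selected)
```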

For the two samples with respective ash contents of 9.6% and 39.96% and the sample with low fixed carbon content (83.88%), the estimated adsorption isotherms are in excellent agreement with the measured ones, as can be seen from Figure 7. For the sample with high fixed carbon content (91.54%), the estimated adsorption equilibrium points at lower pressures (≤≈4.0 MPa) agree well with the measured ones, whereas certain deviations exist at higher pressures (>≈4.0 MPa). The maximum error occurs at an equilibrium pressure of ≈8.0 MPa, where the estimated and measured adsorption amounts are 23.71 and 25.23 m<sup>3</sup>/t, respectively. This discrepancy can be considered acceptable given the uncertainties associated with sample preparation, data acquisition, and measurement operations [12]. Previous reproducibility tests [50,51] showed that discrepancies in adsorption isotherm measurements on the same coal sample may reach 10–15%, which is even higher than the GBDT estimation errors. It should also be pointed out that the estimated adsorption amount increases monotonically with increasing pressure (a basic characteristic of methane adsorption isotherms on coals), although no specific constraint was applied in the training process to enforce such monotonicity. These results confirm the reliability of the constructed GBDT model in estimating methane adsorption isotherms on coals with reasonable accuracy.
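The two checks discussed in this paragraph can be written out explicitly. The sketch below computes the relative deviation at ≈8.0 MPa from the two reported adsorption amounts and compares it with the lower end of the 10–15% reproducibility range; it also includes a simple monotonicity test. The helper names are assumptions for illustration, and the isotherm passed to the monotonicity check is a made-up placeholder, not model output.

```python
def relative_error_percent(measured, estimated):
    """Absolute deviation of the estimate, as a percentage of the measured value."""
    return 100.0 * abs(estimated - measured) / abs(measured)


def is_monotonic_increasing(values):
    """True if each value is at least as large as the one before it."""
    return all(later >= earlier for earlier, later in zip(values, values[1:]))


# Reported amounts at ~8.0 MPa for the high fixed carbon sample (m3/t).
max_deviation = relative_error_percent(measured=25.23, estimated=23.71)
print(f"maximum relative deviation: {max_deviation:.2f}%")

# Lower end of the reproducibility range quoted from refs [50,51].
within_reproducibility = max_deviation < 10.0
print("below measurement reproducibility:", within_reproducibility)

# Placeholder isotherm (m3/t at increasing pressure), illustrative only.
placeholder_isotherm = [8.0, 14.0, 18.5, 21.0, 22.5]
print("monotonic:", is_monotonic_increasing(placeholder_isotherm))
```

With the reported values, the deviation is about 6%, which is below the quoted measurement reproducibility.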

**Figure 7.** Comparison of the estimated with measured adsorption isotherms for samples with (**a**) ash contents of 9.6% and 39.96%, respectively, and (**b**) fixed carbon contents of 83.88% and 91.54%, respectively.

#### *3.2. Comparison with BP-ANN and SVM*

Figure 8 shows the cross plots of the BP-ANN estimated versus measured adsorption amounts for the training, validation, and testing sets. As can be seen from Figure 8a, the training data points are generally located on the 45-degree line, which suggests that the BP-ANN correlates the output with the input variables very accurately for the training set. Table 3 shows that the BP-ANN outperforms the GBDT in terms of error metrics for the training set. However, Figure 8b,c shows that a noticeable number of data points deviate severely from the 45-degree line for both the validation and testing sets, resulting in higher errors (AAE, ARE, and RMSE) and lower R<sup>2</sup> than the GBDT (Table 3). These observations suggest that the generalization capability of the BP-ANN is highly questionable and that severe over-fitting occurs. As such, the BP-ANN should not be considered suitable for accurately estimating the adsorption isotherms.
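Over-fitting of this kind is usually diagnosed by comparing training statistics with those of held-out data. The sketch below shows one way to quantify that gap; the function name and the choice of indicators (RMSE ratio and drop in R<sup>2</sup>) are assumptions, not the authors' procedure. Because the BP-ANN values of Table 3 are not reproduced in this section, the example uses the GBDT statistics quoted in Section 3.1; the BP-ANN and SVM rows of Table 3 could be passed in the same way.

```python
def generalization_gap(train, held_out):
    """Compare training statistics with validation or testing statistics.

    A larger RMSE ratio or R^2 drop indicates a wider gap between
    fitting the training data and predicting unseen data.
    """
    return {
        "rmse_ratio": held_out["RMSE"] / train["RMSE"],
        "r2_drop": train["R2"] - held_out["R2"],
    }


# GBDT statistics as quoted in Section 3.1 (RMSE in m3/t).
gbdt_train = {"RMSE": 0.42, "R2": 0.993}
gbdt_validation = {"RMSE": 1.00, "R2": 0.950}
gbdt_testing = {"RMSE": 1.06, "R2": 0.946}

gap_validation = generalization_gap(gbdt_train, gbdt_validation)
gap_testing = generalization_gap(gbdt_train, gbdt_testing)
print("validation gap:", gap_validation)
print("testing gap:", gap_testing)
```

Comparing these indicators across the three models gives a numerical counterpart to the visual comparison of the cross plots.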

**Figure 8.** Cross plot of the BP-ANN estimated versus measured adsorption amounts for the (**a**) training, (**b**) validation, and (**c**) testing sets. Open circles are data points; red lines are 45-degree lines.

Figure 9 depicts the estimation results of the SVM regression. As shown, a noticeable number of data points deviate severely from the 45-degree line for the training, validation, and testing sets. Thus, the SVM is neither capable of accurately learning the underlying correlations between the output and input variables nor capable of giving reasonable predictions. Comparison of the evaluation metrics for the SVM with those for the GBDT and BP-ANN (Table 3) suggests that the SVM has a better generalization capability than the BP-ANN but performs worse than the GBDT.

**Figure 9.** Cross plot of the SVM estimated versus measured adsorption amounts for the (**a**) training, (**b**) validation, and (**c**) testing sets. Open circles are data points; red lines are 45-degree lines.

**Figure 10.** Relative importance of the input variables to the adsorption isotherm.

**Figure 11.** Dependence of vitrinite reflectance on fixed carbon of the coal samples.

**Figure 12.** Calculated adsorption isotherms using the constructed GBDT model with reference to varying (**a**) fixed carbon (d.a.f.), (**b**) ash (a.d.), (**c**) inherent moisture (a.r.), (**d**) equilibrium moisture, (**e**) temperature, (**f**) vitrinite (m.m.f.), and (**g**) vitrinite reflectance.
