**1. Introduction**

As an unconventional hydrocarbon resource, coalbed methane (CBM) has been unlocked for commercial development in the USA, China, Australia, Canada, and India [1]. The recovery of CBM from coal seams has multiple favorable effects, such as the reduction of greenhouse gas release into the atmosphere, enhancement of underground coal mining safety, and addition to natural gas supply [2,3]. It is commonly believed that the majority of methane exists within coal seams via physical adsorption [4,5]. The accurate characterization of methane adsorption isotherms in coals is crucial for the successful development of CBM resources, because the isotherm determines the in situ level of gas saturation, which significantly affects CBM production rates [6].

To date, the experimental methods commonly used for measuring high-pressure methane adsorption isotherms have included the manometric, the gravimetric, and the volumetric methods [7]. Although these methods differ in the means by which the adsorption amount is determined, they all require indispensable procedures that typically include sample preparation, adsorption equilibrium, and data reduction. Such tedious experimental procedures are not only time-consuming, but may also introduce various sources of uncertainty. Previous studies [8,9] showed that adsorption isotherms measured on the same sample in different laboratories may exhibit noticeable discrepancies, which may be attributed to uncertainties that stem from, e.g., the determination of reference/pump and void volumes [10,11], the choice of equation of state (EoS) [8], and impurities in the measurement gas [9]. As pointed out in [12], extremely tedious procedures, including thorough calibration of the instrument, careful operation, and repeatability checks, are needed in order to ensure accuracy and consistency in adsorption isotherm measurements.

When compared with the measurement of adsorption isotherms, determinations of coal properties (e.g., proximate analysis, maceral group identification, vitrinite reflectance measurement) are much easier and faster. Numerous studies have shown that the methane adsorption capacity of coals is potentially affected by coal properties (e.g., ash, fixed carbon and inherent moisture contents, maceral group, vitrinite reflectance) and experimental conditions (e.g., sample particle size, equilibrium moisture, and temperature) [13–15]. As such, it is reasonable and should be viable to estimate the adsorption isotherm using mathematical regression techniques based on these influencing factors. Feng et al. [16] quantitatively correlated the Langmuir volume (VL) with vitrinite reflectance, proximate parameters, vitrinite content, and temperature using the alternating conditional expectation (ACE) algorithm. More recently, Chattaraj et al. [1] applied multiple regression analysis to develop a predictive model for VL based on proximate and ultimate parameters. It should be noted that only VL was estimated in Feng et al. and Chattaraj et al.; neither study considered the estimation of Langmuir pressure (PL), which determines the curvature of an adsorption isotherm. In other words, the models proposed in [1,16] can only predict the maximum adsorption capacity rather than the full adsorption isotherm. The difficulty in accurately estimating PL may be due to the uncertainties in its correlations with coal properties. For example, Laxminarayana and Crosdale [17] and Dutta et al. [18] found that Langmuir pressure decreases with increasing vitrinite reflectance for Australian and Indian coals. However, Busch et al.'s [13] statistics on ≈1000 coal samples show a very scattered pattern between PL and vitrinite reflectance. Zhang et al. [19] proposed the use of a deep neural network (DNN) to predict CO2 adsorption on porous carbon based on surface area, micropore, and mesopore volumes. However, gas adsorption behavior on coals is more complicated than on porous carbons due to the higher degree of chemical and physical complexity of coals.
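To make the distinct roles of VL and PL concrete, the following minimal sketch evaluates the standard Langmuir isotherm, V = VL·P/(PL + P), for two hypothetical coals that share the same VL but differ in PL. The function name and all parameter values are illustrative assumptions and are not taken from the data of this study.

```python
def langmuir_volume(pressure, v_l, p_l):
    """Adsorbed amount from the Langmuir isotherm V = V_L * P / (P_L + P).

    pressure and p_l share one pressure unit (e.g., MPa); the result has
    the unit of v_l (e.g., cm^3/g). All values used below are illustrative.
    """
    if pressure < 0 or p_l <= 0:
        raise ValueError("pressure must be >= 0 and p_l must be > 0")
    return v_l * pressure / (p_l + pressure)


# Hypothetical pressure steps in MPa (not the experimental pressures).
pressures = [0.5, 1.0, 2.0, 4.0, 8.0]

# Same Langmuir volume, different Langmuir pressures (assumed values).
steep_isotherm = [langmuir_volume(p, v_l=30.0, p_l=1.0) for p in pressures]
flat_isotherm = [langmuir_volume(p, v_l=30.0, p_l=4.0) for p in pressures]

for p, steep, flat in zip(pressures, steep_isotherm, flat_isotherm):
    print(f"P = {p:4.1f} MPa  V(PL=1) = {steep:6.2f}  V(PL=4) = {flat:6.2f}")
```

Both curves approach the same plateau VL, yet at any finite pressure the coal with the smaller PL adsorbs more; by definition, V reaches VL/2 at P = PL. A model that predicts only VL therefore cannot distinguish between the two isotherms.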

Given these issues, it is of practical significance to accurately estimate the adsorption isotherm from parameters that are easy and fast to determine, in order to reduce the time-consuming and expensive work of adsorption isotherm measurement. This paper proposes the use of gradient boosting decision trees (GBDT) [20,21] to accurately estimate adsorption isotherms based on coal properties (ash, fixed carbon, moisture, vitrinite, and vitrinite reflectance) and experimental conditions (equilibrium moisture and temperature) for coal samples acquired from the Qinshui basin. The GBDT is an ensemble method that combines a number of base estimators (decision trees) with the gradient boosting algorithm in order to improve robustness over a single estimator. The GBDT has been empirically shown to be highly efficient and promising for solving various regression and classification problems in the field of energy and petroleum engineering [22,23]. However, to the best of the authors' knowledge, the application of GBDT to estimating the adsorption isotherm has not yet been reported. The superiority of the GBDT in terms of accuracy and robustness is then confirmed by comparison with the back-propagation artificial neural network (BP-ANN) and support vector machine (SVM). Sensitivity analysis is then conducted using the constructed GBDT model to analyze the effect of each input variable on the adsorption isotherm.

#### **2. Materials and Methods**

#### *2.1. Geological Background of the Study Area*

The study area is the Anze block in the southern Qinshui basin, North China (Figure 1), where commercial development of CBM resources has been ongoing for more than two decades. The Qinshui Basin is a large compound synclinal basin surrounded by the uplifts of Wutai Mountain, Taihang Mountain, Zhongtiao Mountain, and Huo Mountain [24]. The study area consists of the Pennsylvanian Benxi and Taiyuan formations, the Permian Shanxi, Xiashihezi, Shangshihezi, and Shiqianfeng formations, and Triassic to Quaternary deposits. The primary CBM-bearing formations are the No. 3 coal seam in the Shanxi formation and the No. 15 coal seam in the Taiyuan formation (Figure 2). The No. 3 and No. 5 coals are characterized by a high degree of metamorphism. The coal ranks range from low volatile bituminous to anthracite, with Ro,m as high as 4.5% [25]. Maceral compositions are dominated by vitrinite with subordinate inertinite, while liptinite is microscopically unrecognizable. The lithotypes are primarily semi-bright and bright coals.

**Figure 1.** Illustration of the study area where coal samples were retrieved. Reprinted with permission [24]; 2011, Elsevier Ltd.

#### *2.2. Samples and Experiments*

A total of 165 coal samples were acquired using the downhole coring technique from 72 CBM wellbores in No. 3 and No. 5 coal seams. After being transported to the laboratory in sealed tanks, the coal samples were subjected to proximate analysis, vitrinite reflectance, and adsorption isotherm measurements in order to develop a database for machine learning. Proximate analysis was conducted following the Chinese standard GB/T 212-2008 [26]. The maceral group was identified at 50× magnification under plane polarized reflected light with a fluorescence illuminator, following the Chinese standard GB/T 8899-2013 [27]. Vitrinite reflectance (Romax) was measured according to the Chinese standard GB/T 6948-2008 [28] at a magnification of 500× under oil immersion. More details on the analysis procedure are given in [29]. Methane adsorption isotherms were measured on 60–80 mesh moisture-equilibrated coal powders using the manometric method [7]. For each sample, the experimental temperature was set to be identical to the in situ temperature at the depth where the sample was retrieved. Each adsorption isotherm comprises eight equilibrium pressures (ranging from ≈0.5 to ≈8.5 MPa) with corresponding adsorption amounts, which results in a total of 8 × 165 = 1320 data points in the database. Table 1 summarizes the experimental data for the samples.
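The bookkeeping behind the 1320 data points can be sketched as follows. This is only an illustration of one plausible layout, assuming that each (sample, equilibrium pressure) pair becomes one row of the database; the record type, field names, and evenly spaced placeholder pressures are hypothetical, and the adsorption amounts are left as NaN because no measured values are reproduced here.

```python
from dataclasses import dataclass

N_SAMPLES = 165          # coal samples reported in the text
POINTS_PER_ISOTHERM = 8  # equilibrium pressures per isotherm


@dataclass
class IsothermPoint:
    """One database row (hypothetical layout): a single isotherm point."""
    sample_id: int
    pressure_mpa: float
    adsorbed_amount: float  # NaN placeholder; no measured data included


def flatten(isotherms):
    """Turn {sample_id: [(pressure, amount), ...]} into a flat row list."""
    rows = []
    for sample_id, points in isotherms.items():
        for pressure, amount in points:
            rows.append(IsothermPoint(sample_id, pressure, amount))
    return rows


# Placeholder isotherms: eight evenly spaced pressures from 0.5 to 8.5 MPa.
# The real pressure steps are only known to span roughly this range.
placeholder = {
    sid: [(0.5 + k * 8.0 / 7, float("nan")) for k in range(POINTS_PER_ISOTHERM)]
    for sid in range(N_SAMPLES)
}

rows = flatten(placeholder)
print(len(rows))  # 8 points x 165 samples
```

In such a layout, the coal properties and experimental conditions of a sample would be repeated on each of its eight rows, so that every row is a self-contained training example.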

**Figure 2.** Stratigraphy of the coal-bearing strata in the study area. Reprinted with permission [24]; 2011, Elsevier Ltd.

**Table 1.** Summary of the experimental data.


Note: a.r.—as received; a.d.—air dry; d.a.f—dry ash free; m.m.f—mineral matter free.

#### *2.3. Basics of GBDT*

The basic philosophy behind the GBDT is to use an ensemble of classification and regression trees (CARTs) to fit the training data samples by minimizing a regularized objective function. Each CART comprises a number of leaf nodes, and each leaf node is associated with a binary decision rule structure and a continuous score. In GBDT, a number of CARTs are developed in a sequential manner in order to form an accurate ensemble model. For completeness, the GBDT algorithm is briefly described as follows. Readers are referred to [20,21] for more details on the GBDT algorithm.

For a given data set with $n$ examples and $d$ dimensions, $D = \{(\mathbf{x}\_i, y\_i)\}$, $\mathbf{x}\_i \in \mathbb{R}^d$, $y\_i \in \mathbb{R}$, $i = 1, 2, \ldots, n$, the output $F$ is predicted as a sum of additive functions, which is written as

$$F(\mathbf{x}) = \sum\_{m=0}^{M} \beta\_m h(\mathbf{x}; \{\mathcal{R}\_{lm}\}\_{l}^{L}) \tag{1}$$

where $h$ represents a tree with $L$ terminal nodes; $\mathcal{R}\_{lm}$ represents the partitioned region that is defined by the terminal node $l$ of the $m$th tree; and $\{\beta\_m\}\_{0}^{M}$ are expansion coefficients that are jointly fit with $\{\mathcal{R}\_{lm}\}\_{l}^{L}$ to the training data set by minimizing a regularized objective function:

$$\mathcal{L} = \sum\_{i}^{n} \psi(y\_i, F(\mathbf{x}\_i)) \tag{2}$$

where ψ is a differentiable loss function, which was assigned as the squared error in this study.
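As a brief illustration of why the squared error is a convenient choice, assume the common scaling with a factor of 1/2 (an assumption; the text only states that the squared error was used). The negative gradient of the loss with respect to the current prediction is then simply the residual, so each additional tree is in effect fitted to the residuals of the current ensemble:

$$\psi(y\_i, F(\mathbf{x}\_i)) = \frac{1}{2}\left(y\_i - F(\mathbf{x}\_i)\right)^2 \quad \Rightarrow \quad -\frac{\partial \psi(y\_i, F(\mathbf{x}\_i))}{\partial F(\mathbf{x}\_i)} = y\_i - F(\mathbf{x}\_i)$$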

The minimization of the loss function is achieved by iteratively adding leaf nodes that result in the steepest descent [21], which is mathematically expressed as:

$$\gamma\_{lm} = \underset{\gamma}{\text{argmin}} \sum\_{\mathbf{x}\_i \in \mathcal{R}\_{lm}} \psi(y\_i, F\_{m-1}(\mathbf{x}\_i) + \gamma) \tag{3}$$

$$F\_m(\mathbf{x}) = F\_{m-1}(\mathbf{x}) + \nu \gamma\_{lm} \mathbf{1}(\mathbf{x} \in \mathcal{R}\_{lm}) \tag{4}$$

where $\nu$ is the shrinkage factor in the range of (0, 1] that controls the learning rate of the training process. Empirically, small values of $\nu$ make the model more conservative and, thus, help to increase its generalization capability [22].
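The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: it handles a single input feature, uses one-split trees (two terminal regions) as base learners, and takes the squared error as the loss, in which case the optimal leaf value for a region is the mean residual of the examples in that region. The function names, the synthetic data, and the hyperparameter values are all hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals.

    Returns a function mapping x to the leaf value of its region. With the
    squared-error loss, the best leaf value is the mean residual there.
    Assumes xs contains at least two distinct values.
    """
    best = None
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        g_left, g_right = mean(left), mean(right)
        sse = (sum((r - g_left) ** 2 for r in left)
               + sum((r - g_right) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, g_left, g_right)
    _, t, g_left, g_right = best
    return lambda x: g_left if x <= t else g_right


def fit_gbdt(xs, ys, n_trees=100, nu=0.1):
    """Boost stumps sequentially: F_m = F_{m-1} + nu * (leaf value)."""
    f0 = mean(ys)                      # initial constant model F_0
    trees = []
    preds = [f0] * len(ys)
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + nu * tree(x) for p, x in zip(preds, xs)]
    return lambda x: f0 + nu * sum(tree(x) for tree in trees)


# Synthetic, Langmuir-shaped toy data (illustrative values only).
pressures = [0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5, 8.5]
amounts = [30.0 * p / (2.0 + p) for p in pressures]

model = fit_gbdt(pressures, amounts, n_trees=200, nu=0.1)
for p, y in zip(pressures, amounts):
    print(f"P = {p:3.1f}  target = {y:6.2f}  fitted = {model(p):6.2f}")
```

With a shrinkage factor below 1, each tree corrects only a fraction of the remaining residual, so more trees are needed to reach a given training error; this trade-off between the shrinkage factor and the number of trees is what makes small values act as a form of regularization.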

#### *2.4. Construction of the GBDT Estimation Model*
