## *3.2. eXtreme Gradient Boosting (XGBoost)*

The eXtreme Gradient Boosting (XGBoost) method, proposed by Chen et al. [49], is also an ensemble learning method based on gradient boosting machines. Like RF, XGBoost builds on Classification and Regression Trees (CART); it ensembles multiple CART trees by optimizing the traditional Gradient Boosting Decision Tree (GBDT) framework, and it can be applied to a wide range of machine learning problems, including classification and regression. Whereas the trees in RF are trained in parallel and independently, the decision trees in XGBoost are not mutually independent. The model is constructed as follows: first, an initial tree is trained on the training set, leaving residuals between the predicted and actual values; then, at each iteration, a new tree is added to fit the residuals of the current prediction, until training terminates. The result is an iteratively built collection of residual-fitting trees, i.e., an ensemble of many tree models. The predicted value is calculated as follows:

$$\hat{y}_i = \sum_{k=1}^{K} f_k(\mathbf{x}_i) \tag{4}$$

where $\hat{y}_i$ represents the final model prediction, *K* is the total number of CART trees built, $\mathbf{x}_i$ denotes the feature vector of the *i*-th sample, and $f_k(\mathbf{x}_i)$ is the prediction of the *k*-th tree. The objective function of XGBoost is given in Equation (5):
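
As a concrete illustration of Equation (4) and the residual-fitting construction described above, the following is a minimal sketch of gradient boosting on squared-error residuals, not the authors' implementation; the scikit-learn base learner, learning rate, and tree count are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_boosted_trees(X, y, n_trees=50, max_depth=3, lr=0.1):
    """Sequentially fit trees to the residuals of the current ensemble."""
    trees, pred = [], np.zeros(len(y))
    for _ in range(n_trees):
        residual = y - pred                  # error left by the previous trees
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residual)
        pred += lr * tree.predict(X)         # add the new tree's contribution
        trees.append(tree)
    return trees

def predict(trees, X, lr=0.1):
    # Equation (4): y_hat_i = sum_k f_k(x_i), here with a shrinkage factor lr
    return sum(lr * t.predict(X) for t in trees)

# hypothetical usage:
# trees = fit_boosted_trees(X_train, y_train); y_hat = predict(trees, X_test)
```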

$$Obj = \sum_{i=1}^{m} l(\hat{y}_i, y_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{5}$$

where *m* is the total number of samples fed into the *k*-th tree. The first term is the loss function, which measures the error between the true value $y_i$ and the predicted value $\hat{y}_i$. The second term is a regularization term, used to control the model's complexity and prevent overfitting. The complexity of each tree is defined as:

$$\Omega(f) = \gamma T + \frac{1}{2}\lambda \|w\|^2 \tag{6}$$

where *γ* is the penalty coefficient governing the difficulty of node splitting, *T* is the number of leaf nodes, *λ* is the L2 regularization coefficient that prevents overfitting, and *w* is the vector of leaf node weights (so $\|w\|^2$ is its squared L2 norm).
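
To make Equations (5) and (6) concrete, the sketch below evaluates the regularized objective for a fitted ensemble, assuming a squared-error loss for *l*; the per-tree leaf-weight vectors are hypothetical inputs.

```python
import numpy as np

def tree_complexity(w, gamma=1.0, lam=1.0):
    """Equation (6): Omega(f) = gamma * T + 0.5 * lambda * ||w||^2,
    where w holds the leaf weights and T = len(w) is the leaf count."""
    return gamma * len(w) + 0.5 * lam * np.sum(np.square(w))

def objective(y_true, y_pred, leaf_weights, gamma=1.0, lam=1.0):
    """Equation (5), with squared error standing in for the loss l."""
    loss = np.sum((y_true - y_pred) ** 2)
    penalty = sum(tree_complexity(w, gamma, lam) for w in leaf_weights)
    return loss + penalty
```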

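In the xgboost library itself, these quantities map directly onto hyperparameters of its scikit-learn interface: `n_estimators` corresponds to *K*, `gamma` to *γ*, and `reg_lambda` to *λ*. A minimal usage sketch with illustrative data and values:

```python
import numpy as np
from xgboost import XGBRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = XGBRegressor(
    n_estimators=100,   # K: number of CART trees in the ensemble
    gamma=0.1,          # gamma: minimum loss reduction required to split a node
    reg_lambda=1.0,     # lambda: L2 penalty on leaf weights
    learning_rate=0.1,
)
model.fit(X, y)
print(model.predict(X[:3]))
```
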
## *3.3. Light Gradient Boosting Machine (LGBM)*

Developed by Microsoft Research in 2017, the Light Gradient Boosting Machine (LGBM) is among the most effective and advanced gradient boosting algorithms [50]. LGBM evolved from boosting regression algorithms. It employs a histogram-based algorithm that buckets continuous features into discrete bins, which accelerates training and reduces memory usage. In addition, LGBM grows trees leaf-wise: at each step it splits the leaf with the largest delta loss. This contrasts with many boosting algorithms (such as XGBoost) that grow trees level-wise. Whereas level-wise growth keeps the number of leaves consistent at each level, leaf-wise growth yields a varying number of leaves per level, which helps LGBM achieve lower loss; both design choices are illustrated in the sketch at the end of this subsection. The main process of the LGBM algorithm is shown in Equation (7):

$$F\_n(\mathbf{x}) = \alpha\_0 f\_0(\mathbf{x}) + \alpha\_1 f\_1(\mathbf{x}) + \dots + \alpha\_n f\_n(\mathbf{x}) \tag{7}$$

where the classifier begins with *n* decision trees and each training sample is initially assigned a weight of $\frac{1}{n}$. At each step, a weak classifier *f*(*x*) and its weight *α* are determined; the classifier then adjusts the sample weights and continues until the final classifier, denoted $F_n(\mathbf{x})$, is obtained. In summary, the main goal of the LGBM algorithm is to improve training efficiency and accuracy through feature parallelism and a histogram-based decision tree algorithm, while using gradient boosting to continuously optimize the model, thereby achieving better classification and regression results.
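
In the lightgbm package, the two design choices discussed above surface directly as hyperparameters of its scikit-learn interface: `max_bin` sets the number of histogram bins for continuous features, and `num_leaves` caps leaf-wise tree growth. A minimal sketch with illustrative data and values:

```python
import numpy as np
from lightgbm import LGBMRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=500)

model = LGBMRegressor(
    n_estimators=100,   # number of boosting iterations (trees)
    num_leaves=31,      # cap on leaves per tree under leaf-wise growth
    max_bin=255,        # histogram bins used to discretize continuous features
    learning_rate=0.1,
)
model.fit(X, y)
print(model.predict(X[:3]))
```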
