*3.1. Objective Function of the Model*

Building on the Gradient Boosting Decision Tree (GBDT) algorithm, XGBoost adds a regularization factor Ω(θ) to represent the complexity of the trees, and it defines the objective function optimized during training using the equation below.

$$Obj(\theta) = L(\theta) + \Omega(\theta) \tag{18}$$

where θ denotes the model parameters, Ω(θ) is the regularization term, which represents the complexity of the model, and *L*(θ) is the loss function, which measures how well the model fits the training set.
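As a point of reference, both terms of Eq. (18) surface directly as hyperparameters in the open-source xgboost library; a minimal sketch (assuming the Python xgboost package is installed, with illustrative toy data) shows the correspondence.

```python
# Hedged sketch: mapping the two terms of Eq. (18) onto xgboost hyperparameters.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                # n = 100 examples, m = 4 features
y = 2.0 * X[:, 0] + rng.normal(size=100)     # noisy linear target (toy data)

model = xgb.XGBRegressor(
    n_estimators=50,                 # number of additive trees in the ensemble
    objective="reg:squarederror",    # the loss term L(theta)
    reg_lambda=1.0,                  # L2 penalty on leaf weights (part of Omega)
    gamma=0.1,                       # penalty per additional leaf (part of Omega)
)
model.fit(X, y)
```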

For a given data set with *n* examples and *m* features, $D = \{(\mathbf{x}_i, y_i)\}$ $(i = 1, 2, \ldots, n$, $\mathbf{x}_i \in \mathbb{R}^m$, $y_i \in \mathbb{R})$, a tree ensemble model uses $S$ additive functions $\theta = \{f_1, f_2, \cdots, f_s, \cdots, f_S\}$ to predict the output.

$$\hat{y}_i = \sum_{s=1}^{S} f_s(\mathbf{x}_i), \quad f_s \in F \tag{19}$$

where $F = \{f(\mathbf{x}) = w_{q(\mathbf{x})}\}$ $(w \in \mathbb{R}^T$, $q : \mathbb{R}^m \to \{1, \ldots, T\})$ is the space of regression trees (also known as CART). Here, $q$ represents the structure of each tree, mapping an example to the corresponding leaf index, and $T$ is the number of leaves in the tree. Each $f_s$ corresponds to an independent tree structure $q$ and leaf weights $w$. Unlike decision trees, each regression tree carries a continuous score on each of its leaves, and we use $w_i$ to represent the score on the $i$-th leaf. For a given example, the decision rules in the trees (given by $q$) classify it into leaves, and the final prediction is calculated by summing up the scores of the corresponding leaves (given by $w$).
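To make the roles of $q$, $w$, and $T$ concrete, the following minimal sketch implements Eq. (19) directly; the class, function names, and toy stumps are illustrative rather than part of any library.

```python
# A regression tree is summarized by its structure q (example -> leaf index)
# and its leaf weights w (one continuous score per leaf), so f(x) = w_{q(x)}.
from typing import Callable, List
import numpy as np

class RegressionTree:
    def __init__(self, q: Callable[[np.ndarray], int], w: np.ndarray):
        self.q = q  # structure: maps an example x in R^m to one of T leaves
        self.w = w  # leaf weights: w in R^T

    def __call__(self, x: np.ndarray) -> float:
        return self.w[self.q(x)]  # f(x) = w_{q(x)}

def ensemble_predict(trees: List[RegressionTree], x: np.ndarray) -> float:
    # Eq. (19): y_hat = sum_{s=1}^{S} f_s(x)
    return sum(f(x) for f in trees)

# Toy ensemble of S = 2 depth-1 trees (stumps) on a 2-feature input.
t1 = RegressionTree(q=lambda x: 0 if x[0] < 0.5 else 1, w=np.array([-0.3, 0.4]))
t2 = RegressionTree(q=lambda x: 0 if x[1] < 1.0 else 1, w=np.array([0.1, -0.2]))
print(ensemble_predict([t1, t2], np.array([0.2, 2.0])))  # -0.3 + (-0.2) = -0.5
```

To learn the set of functions used in the model, we minimize the following regularized objective.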

$$Obj(\theta) = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{s=1}^{S} \Omega(f_s) \tag{20}$$

where "*<sup>n</sup> i*=1 *l*(*yi*, *y*ˆ*i*) is a differentiable convex loss function that measures the difference between the prediction *<sup>y</sup>*ˆ*<sup>i</sup>* and the target *<sup>y</sup>*ˆ*i*. The second term "*<sup>S</sup> s*=1 Ω(*fs*) penalizes the complexity of the trees.
