#### *3.3. Complexity Calculation in the Objective Function*

For a given tree ensemble model, the complexity of the model can be defined by the equation below.

$$
\Omega(f\_t) = \gamma T + \frac{1}{2}\lambda \sum\_{j=1}^{T} w\_j^2 \tag{26}
$$

where γ and λ are both regularization factors. γ is the parameter used to control tree node splitting: when the gain obtained by splitting a node is less than this value, the node is not split, and when it is greater, the node is split. λ is the regularization weight, *T* is the number of leaf nodes, and $w\_j$ is the weight of leaf *j*.
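As a concrete illustration, the following minimal Python sketch evaluates Equation (26) for a single tree; the function name `tree_complexity` and its arguments are hypothetical and only serve to spell out the formula.

```python
import numpy as np

def tree_complexity(leaf_weights, gamma, lam):
    """Complexity of one tree, Equation (26):
    Omega(f_t) = gamma * T + 0.5 * lambda * sum_j w_j^2."""
    T = len(leaf_weights)                      # T: number of leaf nodes
    w = np.asarray(leaf_weights, dtype=float)  # w_j: leaf weights
    return gamma * T + 0.5 * lam * np.sum(w ** 2)

# Example: a tree with 3 leaves
print(tree_complexity([0.4, -0.2, 0.1], gamma=1.0, lam=1.0))  # 3.105
```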

Define $I\_j = \{ i \mid q(\mathbf{x}\_i) = j \}$ as the instance set of leaf *j*, where $q$ maps an instance to its leaf index. Substituting Equation (26) into the objective function Equation (25) gives:

$$\begin{aligned} Obj^{(t)} &= \sum\_{i=1}^{n} \left[ g\_i f\_t(\mathbf{x}\_i) + \frac{1}{2} h\_i f\_t^2(\mathbf{x}\_i) \right] + \gamma T + \frac{1}{2} \lambda \sum\_{j=1}^{T} w\_j^2 \\ &= \sum\_{i=1}^{n} \left[ g\_i w\_{q(\mathbf{x}\_i)} + \frac{1}{2} h\_i w\_{q(\mathbf{x}\_i)}^2 \right] + \gamma T + \frac{1}{2} \lambda \sum\_{j=1}^{T} w\_j^2 \\ &= \sum\_{j=1}^{T} \left[ \left( \sum\_{i \in I\_j} g\_i \right) w\_j + \frac{1}{2} \left( \sum\_{i \in I\_j} h\_i + \lambda \right) w\_j^2 \right] + \gamma T \end{aligned} \tag{27}$$

If we define $G\_j = \sum\_{i \in I\_j} g\_i$ and $H\_j = \sum\_{i \in I\_j} h\_i$, then Equation (27) can be abbreviated as:

$$Obj^{(t)} = \sum\_{j=1}^{T} \left[ G\_j w\_j + \frac{1}{2} (H\_j + \lambda) w\_j^2 \right] + \gamma T \tag{28}$$
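The grouping of per-instance gradients into the per-leaf sums $G\_j$ and $H\_j$ translates directly into code. The sketch below is illustrative only; `leaf_index` (the array of leaf assignments $q(\mathbf{x}\_i)$), `g`, `h`, and `w` are assumed inputs, not part of any library API.

```python
import numpy as np

def objective(leaf_index, g, h, w, gamma, lam):
    """Equation (28): Obj = sum_j [G_j w_j + 0.5 (H_j + lambda) w_j^2] + gamma * T.

    leaf_index[i] = q(x_i), the leaf that instance i falls into;
    g, h are the first- and second-order gradients of the loss;
    w holds one weight per leaf.
    """
    T = len(w)
    # G_j and H_j: gradient sums over the instance set I_j of each leaf
    G = np.bincount(leaf_index, weights=g, minlength=T)
    H = np.bincount(leaf_index, weights=h, minlength=T)
    return np.sum(G * w + 0.5 * (H + lam) * w ** 2) + gamma * T
```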

#### *3.4. Optimization of the Objective Function*

For a fixed tree structure $q(\mathbf{x})$, with $I\_j = \{ i \mid q(\mathbf{x}\_i) = j \}$ as defined above, we can compute the optimal weight $w\_j^\*$ of leaf *j* by using the equation below. Since Equation (28) is a quadratic function of each $w\_j$, the optimum follows from setting its derivative with respect to $w\_j$ to zero.

$$w\_j^\* = -\frac{\sum\_{i \in I\_j} g\_i}{\sum\_{i \in I\_j} h\_i + \lambda} = -\frac{G\_j}{H\_j + \lambda} \tag{29}$$

When Equation (29) is substituted into the objective function Equation (28), the optimal value of the objective function is found.

$$Obj^\* = -\frac{1}{2} \sum\_{j=1}^{T} \frac{G\_j^2}{H\_j + \lambda} + \gamma T \tag{30}$$

By optimizing the objective function, the optimal structure of the decision tree can be obtained. The value of the objective function can be understood as a score measuring the quality of a tree structure, similar to an impurity score: the lower the value, the better the structure.
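Equations (29) and (30) can be sketched in a few lines of Python, reusing the per-leaf sums $G\_j$ and $H\_j$ from the previous sketch; all names are illustrative assumptions.

```python
import numpy as np

def optimal_leaf_weights(G, H, lam):
    """Equation (29): w_j* = -G_j / (H_j + lambda), element-wise per leaf."""
    return -G / (H + lam)

def optimal_objective(G, H, gamma, lam):
    """Equation (30): Obj* = -0.5 * sum_j G_j^2 / (H_j + lambda) + gamma * T."""
    return -0.5 * np.sum(G ** 2 / (H + lam)) + gamma * len(G)
```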

A split finding algorithm is used in the XGBoost algorithm: candidate splitting points are evaluated at each leaf node. If a node is split into two leaf nodes, the resulting gain in score is given by the equation below.

$$\text{Gain} = \frac{1}{2} \left[ \frac{G\_L^2}{H\_L + \lambda} + \frac{G\_R^2}{H\_R + \lambda} - \frac{\left(G\_L + G\_R\right)^2}{H\_L + H\_R + \lambda} \right] - \gamma \tag{31}$$

where $G\_L$ and $H\_L$ are the sums of $g\_i$ and $h\_i$ over the samples assigned to the left leaf, and $G\_R$ and $H\_R$ are the sums of $g\_i$ and $h\_i$ over the samples assigned to the right leaf. In Equation (31), the first term in square brackets is the score of the left node, the second term is the score of the right node, and the third term is the score of the original (unsplit) node. To search for a split, select a feature as the reference quantity and scan from left to right with a certain step length in order to find the gain of each candidate splitting point. The point with the largest gain is taken as the splitting point, and it is not necessary to add a branch if the loss reduction in square brackets is less than γ, that is, if the Gain in Equation (31) is negative. This scan is illustrated in the sketch below.
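The sketch below assumes an exact greedy variant of the scan that sorts the samples by one feature and tries the boundary between every pair of consecutive distinct values; all names are illustrative.

```python
import numpy as np

def best_split(x, g, h, gamma, lam):
    """Scan one feature for the split with the largest gain, Equation (31).

    x: feature values; g, h: per-sample gradients (NumPy arrays).
    Returns (best_gain, threshold), with threshold None when no split
    yields a positive gain, in which case no branch is added.
    """
    order = np.argsort(x)
    x_s, g_s, h_s = x[order], g[order], h[order]
    G_total, H_total = g_s.sum(), h_s.sum()
    parent_score = G_total ** 2 / (H_total + lam)  # score of the unsplit node
    G_L = H_L = 0.0
    best_gain, best_threshold = 0.0, None
    for i in range(len(x_s) - 1):
        G_L += g_s[i]
        H_L += h_s[i]
        if x_s[i] == x_s[i + 1]:
            continue  # cannot split between equal feature values
        G_R, H_R = G_total - G_L, H_total - H_L
        gain = 0.5 * (G_L ** 2 / (H_L + lam)
                      + G_R ** 2 / (H_R + lam)
                      - parent_score) - gamma
        if gain > best_gain:
            best_gain = gain
            best_threshold = 0.5 * (x_s[i] + x_s[i + 1])  # midpoint threshold
    return best_gain, best_threshold
```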

Based on this principle of node splitting, the XGBoost model continuously optimizes itself according to the residuals at each iteration. Since the objective function used for node splitting contains both an error term and a regularization term, the model achieves high precision.
