3.3.3. Gini Index

It is obtained by subtracting the sum of the squared probabilities of each class from one. A lower Gini index value indicates higher homogeneity, while a higher value indicates greater impurity. The CART algorithm uses the Gini index to create splits in the data [37]. The Gini index at a node is given by

$$Gini = 1 - \sum_{i=1}^{K} (p_i)^2 \tag{6}$$

where $p_i$ is the proportion of class *i* in the node, and the index *i* runs from 1 to the number of classes *K*. The Gini index measures the "impurity" of a dataset: it takes a minimal value of zero for a pure node and a maximal value of (1 − 1/*K*) when all classes are equally represented. In the attribute selection process of Decision Tree modeling, the attribute that yields the largest reduction in the Gini index is selected for the split. In practice, a reduction in the Gini index is typically accompanied by a lowering of entropy.
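The computation described above can be sketched as follows. This is a minimal illustration, not the CART implementation itself; the function names `gini` and `gini_reduction` are chosen here for clarity.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions (Eq. 6)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_reduction(parent, left, right):
    """Reduction in Gini achieved by splitting `parent` into `left` and `right`:
    the parent's Gini minus the size-weighted average Gini of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted

# A pure node has Gini 0; a balanced two-class node reaches the maximum 1 - 1/K = 0.5.
print(gini(["a", "a", "a", "a"]))  # 0.0
print(gini(["a", "a", "b", "b"]))  # 0.5
# A split that perfectly separates the two classes removes all impurity.
print(gini_reduction(["a", "a", "b", "b"], ["a", "a"], ["b", "b"]))  # 0.5
```

A split-search procedure would evaluate `gini_reduction` for every candidate attribute and threshold, then choose the one with the largest reduction.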
