*3.2. GEP Network*

Gene Expression Programming (GEP) was proposed by Ferreira in 2001. It is a relatively new evolutionary model belonging to the family of genetic algorithms [18,19]. Like the Genetic Algorithm (GA) and Genetic Programming (GP), GEP is a computational model that simulates the evolutionary process of living things. Because GEP chromosomes are simple, linear, and compact, genetic operations are easy to carry out, and GEP has stronger problem-solving capability, running 2 to 4 orders of magnitude faster than GA and GP [19]. Owing to these advantages, GEP has attracted the attention of many researchers and has been applied in machine learning fields such as function discovery, symbolic regression, classification, clustering, and association rule analysis.

A GEP gene is represented as a 3-tuple (*S*, *F*, *T*), where *S* is a fixed-length string, *F* is the set of functions, and *T* is the set of terminals. For convenience, the fixed-length string *S* itself is sometimes called the gene. The gene is divided into two parts, a head and a tail: head symbols may be taken from either *F* or *T*, whereas tail symbols must be taken from *T* only. These coding rules guarantee that every gene can be decoded into an expression tree corresponding to a legal mathematical expression. Suppose the head length is *h*, the tail length is *t*, and *n*<sub>max</sub> is the maximum number of parameters (arity) of any function in the function set; then the relationship between *h* and *t* is given by Equation (6).

$$t = h(n\_{\text{max}} - 1) + 1\tag{6}$$
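As an illustration (not the paper's implementation), the head/tail rule of Equation (6) and the level-order decoding of a gene into an expression tree can be sketched in Python. The function set `FUNCS` and terminal set `TERMS` below are assumed examples:

```python
import random

FUNCS = {'+': 2, '-': 2, '*': 2, '/': 2}   # function set F with arities
TERMS = ['a', 'b']                          # terminal set T

def tail_length(h, n_max):
    # Equation (6): t = h * (n_max - 1) + 1
    return h * (n_max - 1) + 1

def random_gene(h):
    """Build a random gene: head symbols from F ∪ T, tail symbols from T only."""
    n_max = max(FUNCS.values())
    t = tail_length(h, n_max)
    head = [random.choice(list(FUNCS) + TERMS) for _ in range(h)]
    tail = [random.choice(TERMS) for _ in range(t)]
    return head + tail

def decode(gene):
    """Decode a gene into a nested-tuple expression tree, filling level by level."""
    root = [gene[0], []]                    # [symbol, children]
    queue = [root] if gene[0] in FUNCS else []
    pos = 1
    while queue:
        node = queue.pop(0)                 # breadth-first: fill tree level by level
        for _ in range(FUNCS[node[0]]):
            child = [gene[pos], []]
            pos += 1
            node[1].append(child)
            if child[0] in FUNCS:
                queue.append(child)
    def to_tuple(n):
        return n[0] if not n[1] else (n[0],) + tuple(to_tuple(c) for c in n[1])
    return to_tuple(root)
```

For example, with `h = 3` and binary functions (`n_max = 2`), Equation (6) gives `t = 4`, and the gene `*+-abab` decodes to the tree `('*', ('+', 'a', 'b'), ('-', 'a', 'b'))`, i.e. (a + b) * (a - b). The tail of terminals is what guarantees every gene decodes into a legal expression.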

GEP's neural network selective integration process is divided into two stages.

**Stage 1**: Network group generation.

We use existing methods such as Boosting and Bagging to generate the network group. Assume the output vector of the network group is *Y* = (*y*<sub>1</sub>, *y*<sub>2</sub>, ..., *y*<sub>n</sub>), with *y*<sub>i</sub> ∈ {0, 1}.
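As a minimal sketch of Stage 1 (a hypothetical illustration, not the paper's training procedure), Bagging trains each member network on a bootstrap resample of the data; `train_fn` is an assumed placeholder for any network-training routine:

```python
import random

def bagging_group(train_fn, data, n_networks, rng=random.Random(0)):
    """Generate a network group: train each member on a bootstrap resample."""
    networks = []
    for _ in range(n_networks):
        sample = [rng.choice(data) for _ in data]  # sample with replacement
        networks.append(train_fn(sample))
    return networks

def group_output(networks, x):
    # Output vector Y = (y_1, ..., y_n), y_i in {0, 1}
    return tuple(net(x) for net in networks)

# Toy stand-in for network training: a constant majority-label classifier.
def train_majority(sample):
    ones = sum(label for _, label in sample)
    pred = 1 if ones * 2 >= len(sample) else 0
    return lambda x: pred
```

In practice each member would be a trained neural network; what Stage 2 consumes is only the binary output vector *Y*.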

**Stage 2**: Individual network selection and conclusion generation based on GEP.

For input *x*, there are the following integrated classification results:

$$y(\mathbf{x}) = \text{sign}(f(Y')).\tag{7}$$

Here $Y' \subseteq Y$, $Y' = (y_{i_1}, y_{i_2}, \ldots, y_{i_m})$. That is, the final classification result is synthesized from the outputs of a subset of the networks in the group. Because GEP has powerful function-discovery and parameter-selection capabilities, the combining function *f*(*Y*′) can be discovered using the GEP method.

Taking the threshold *λ* = 0.5, the integrated classification result *y*(**x**) is calculated as follows:

$$y(\mathbf{x}) = \begin{cases} 1 & f(\mathbf{Y'}) \ge 0.5 \\ 0 & f(\mathbf{Y'}) < 0.5 \end{cases} \tag{8}$$
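Equation (8) can be sketched in Python as follows (an illustrative assumption: here `f` stands for the GEP-evolved combining function, and the averaging combiner in the example is a hypothetical stand-in for it):

```python
def integrated_classify(f, Y, subset, lam=0.5):
    """Equation (8): threshold the evolved combiner f over the selected outputs Y'."""
    Y_sub = tuple(Y[i] for i in subset)   # Y' ⊆ Y: outputs of the selected networks
    return 1 if f(Y_sub) >= lam else 0

# Example with a simple averaging combiner in place of the evolved f:
f_mean = lambda ys: sum(ys) / len(ys)
```

For instance, with group outputs `Y = (1, 0, 1, 1, 0)` and selected subset indices `(0, 2, 3)`, the average of *Y*′ is 1.0 ≥ 0.5, so *y*(**x**) = 1; selecting `(1, 4)` gives an average of 0 < 0.5, so *y*(**x**) = 0.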
