#### *3.3. Fitness Function Design*

The fitness function guides the evolution of the GEP algorithm. For feed-forward neural networks, topology selection and weight training can be regarded as a single optimization process whose goal is to design the network so that the fitness value is maximized. The performance of the current network on a given training data set is described by a least-squares error function.

We use category-based multilabel evaluation indicators: we first measure the classifier's two-class classification performance on each individual class, and then average the performance over all classes to obtain the classifier's evaluation index value. Suppose we have a multilabel test set with *p* samples, $S = \{(x_i, Y_i) \mid 1 \le i \le p\}$, and consider the *j*th category $y_j$ ($1 \le j \le q$). For a multilabel classifier $h(\cdot)$, the two-class classification performance on this category can be described by the four statistics given in Equations (9)–(12).

1. $TP_j$ (#true positive instances)

$$TP_j = \left|\left\{\, x_i \mid y_j \in Y_i \land y_j \in h(x_i),\ (x_i, Y_i) \in S \,\right\}\right| \tag{9}$$

2. $FP_j$ (#false positive instances)

$$FP_j = \left|\left\{\, x_i \mid y_j \notin Y_i \land y_j \in h(x_i),\ (x_i, Y_i) \in S \,\right\}\right| \tag{10}$$

3. $TN_j$ (#true negative instances)

$$TN_j = \left|\left\{\, x_i \mid y_j \notin Y_i \land y_j \notin h(x_i),\ (x_i, Y_i) \in S \,\right\}\right| \tag{11}$$

4. $FN_j$ (#false negative instances)

$$FN_j = \left|\left\{\, x_i \mid y_j \in Y_i \land y_j \notin h(x_i),\ (x_i, Y_i) \in S \,\right\}\right| \tag{12}$$

From Equations (9)–(12), $TP_j + FP_j + TN_j + FN_j = p$ holds. Most classification performance indicators, such as accuracy, precision, and recall, can be derived from these four statistics; see Equations (13)–(15).

$$\text{Accuracy} = B\left(TP_j, FP_j, TN_j, FN_j\right) = \frac{TP_j + TN_j}{TP_j + FP_j + TN_j + FN_j} \tag{13}$$

$$\text{Precision} = B\left(TP_j, FP_j, TN_j, FN_j\right) = \frac{TP_j}{TP_j + FP_j} \tag{14}$$

$$\text{Recall} = B\left(TP_j, FP_j, TN_j, FN_j\right) = \frac{TP_j}{TP_j + FN_j} \tag{15}$$
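A minimal sketch of these per-category statistics and the derived metrics from Equations (9)–(15); the function names and data layout below are illustrative assumptions, with label sets given as Python sets:

```python
def label_stats(samples, predict, label):
    """Count TP/FP/TN/FN for one label over (x, Y) pairs (Eqs. 9-12).

    samples: iterable of (x_i, Y_i) pairs, where Y_i is a set of labels.
    predict: callable returning the predicted label set h(x_i).
    """
    tp = fp = tn = fn = 0
    for x, true_labels in samples:
        in_true = label in true_labels
        in_pred = label in predict(x)
        if in_true and in_pred:
            tp += 1       # Eq. (9)
        elif not in_true and in_pred:
            fp += 1       # Eq. (10)
        elif not in_true and not in_pred:
            tn += 1       # Eq. (11)
        else:
            fn += 1       # Eq. (12)
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):   # Eq. (13)
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp, tn, fn):  # Eq. (14)
    return tp / (tp + fp)

def recall(tp, fp, tn, fn):     # Eq. (15)
    return tp / (tp + fn)
```

By construction the four counts partition the test set, so they always sum to *p*, as stated above.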

Therefore, combining these category-based multilabel classification evaluation indices, we design fitness functions for the MGEP classification algorithm.

1. Design of fitness function based on macro-averaging:

$$Fit_i = \left| R - 100 \times \frac{1}{q} \sum_{j=1}^{q} B\left(TP_j, FP_j, TN_j, FN_j\right) \right| \tag{16}$$

2. Design of fitness function based on micro-averaging:

$$Fit_i = \left| R - B\left( \sum_{j=1}^{q} TP_j,\ \sum_{j=1}^{q} FP_j,\ \sum_{j=1}^{q} TN_j,\ \sum_{j=1}^{q} FN_j \right) \right| \tag{17}$$

In Equations (16) and (17), $Fit_i$ is the fitness value of the *i*th individual in its environment, and *R* is the selected bandwidth.
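The difference between the two averaging schemes can be sketched as follows; `B` is any metric from Equations (13)–(15), and the default values of `R` and the scaling factor are illustrative assumptions:

```python
def acc(tp, fp, tn, fn):
    """Example B function: accuracy, Eq. (13)."""
    return (tp + tn) / (tp + fp + tn + fn)

def fitness_macro(counts, B, R=100.0):
    """Eq. (16): |R - 100 * mean of B over the q categories|.

    counts: list of (TP_j, FP_j, TN_j, FN_j) tuples, one per category.
    Macro-averaging applies B per category, then averages.
    """
    q = len(counts)
    return abs(R - 100.0 * sum(B(*c) for c in counts) / q)

def fitness_micro(counts, B, R=100.0):
    """Eq. (17): pool the counts over all categories, then apply B once."""
    tp, fp, tn, fn = (sum(col) for col in zip(*counts))
    return abs(R - B(tp, fp, tn, fn))
```

Macro-averaging weights every category equally, while micro-averaging weights every sample equally, so the two fitness values diverge whenever the categories are imbalanced.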

#### *3.4. MGEP Classification before CNN Image Super-Resolution*

In image super-resolution reconstruction, the traditional K-means clustering algorithm [20] is used to classify the training image set so as to improve the training effect and reduce training time. However, in K-means the value of *K* must be fixed in advance and cannot change during the run of the algorithm, which makes it difficult to estimate *K* accurately when training on high-dimensional data sets.

We therefore use the MGEP classification algorithm in place of the K-means clustering algorithm; its preclassification model is shown in Figure 3.

**Figure 3.** The Multilabel Gene Expression Programming (MGEP) preclassification model.

Let *p* sample pattern pairs $(x_k, y_k)$, $k = 1, 2, \ldots, p$, constitute the training set. According to the definition of the GEP classifier, for sample *k* we have $x_k = (x_{k1}, x_{k2}, \ldots, x_{km})$ and $y_k = (y_{k1}, y_{k2}, \ldots, y_{kn})$, where $x_{kj}$ is the value of the sample on attribute $A_j$ ($j = 1, 2, \ldots, m$) and $y_{ki}$ is the degree of membership of the sample in category $C_i$ ($i = 1, 2, \ldots, n$). The training set constitutes the set of adaptive instances, and the set of adapted instances of a particular problem forms the adaptive environment of the MGEP algorithm. In this environment, starting from an initial population, selection, replication, and the various genetic operations are performed according to individual fitness to form a new population; this process is repeated until the optimal individual evolves and is decoded into a GEP multilabel classifier. In MGEP preclassification learning, we find samples similar to the LR image blocks in color and texture features, thereby shortening the subsequent precise matching and improving the efficiency and effect of SR image restoration.
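The evolutionary loop described above can be sketched generically as follows. The population representation, tournament selection, and stopping rule here are illustrative assumptions, not the paper's GEP implementation; lower fitness is treated as better, matching Equations (16) and (17), where 0 is optimal:

```python
import random

def evolve(init_population, fitness, mutate, crossover,
           generations=100, target=0.0):
    """Generic evolutionary loop: evaluate fitness, select,
    replicate, and apply genetic operators until the best
    individual reaches the target fitness."""
    population = list(init_population)
    best = min(population, key=fitness)
    for _ in range(generations):
        if fitness(best) <= target:
            break
        # tournament selection with replication
        selected = [min(random.sample(population, 2), key=fitness)
                    for _ in population]
        # genetic operators: cross consecutive pairs, then mutate
        offspring = []
        for a, b in zip(selected[::2], selected[1::2]):
            offspring.extend(crossover(a, b))
        population = [mutate(ind) for ind in offspring] or population
        best = min(population + [best], key=fitness)  # elitism
    return best
```

In the actual MGEP algorithm the individuals are gene expression chromosomes that decode into multilabel classifiers, and the genetic operators include GEP-specific replication, mutation, and recombination.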

In training the CNN, the MGEP algorithm is used to classify the training image set, grouping similar images into one category and reducing the parameter scale of the convolutional neural network model; this shortens network training time to a certain extent and improves the training efficiency of the CNN. The improved algorithm flow based on SRCNN is shown in Figure 4.

**Figure 4.** MGEP-SRCNN algorithm flow chart.

The algorithm uses Wiener filtering to construct a deconvolution layer for multiscale image reconstruction. The deconvolution part adopts the mirror structure of the convolution network; its purpose is to reconstruct the shape of the input target, so the multilevel deconvolution structure can capture shape details at different levels, just as the convolution network does. In a convolutional network model, low-level features describe coarse information about the target, such as its position and approximate shape, while more complex high-level features have classification characteristics and contain more detailed target information.
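A minimal NumPy sketch of frequency-domain Wiener deconvolution, the operation underlying such a layer; the kernel handling and the noise-to-signal ratio `K` are illustrative assumptions, not the paper's layer construction:

```python
import numpy as np

def wiener_deconvolve(blurred, kernel, K=0.01):
    """Wiener filter in the frequency domain:
    X_hat = conj(H) / (|H|^2 + K) * Y,
    where H is the kernel's transfer function, Y the blurred
    image's spectrum, and K approximates the noise-to-signal
    power ratio (K -> 0 gives plain inverse filtering)."""
    H = np.fft.fft2(kernel, s=blurred.shape)  # zero-pad kernel
    Y = np.fft.fft2(blurred)
    G = np.conj(H) / (np.abs(H) ** 2 + K)     # Wiener gain
    return np.real(np.fft.ifft2(G * Y))
```

The regularization term `K` keeps the filter stable at frequencies where the kernel response is small, which is what makes Wiener filtering preferable to naive inversion for reconstruction.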

#### **4. Experiments**

#### *4.1. Experimental Environment and Parameter Settings*

The experimental software environment was Ubuntu 14.04, Python 2.7, and TensorFlow 1.4; the hardware was an Intel Core i7-6700K CPU, 16 GB of RAM, and an NVIDIA GTX 1080 GPU.

As the training set, we used ImageNet-91 [5], and as the test sets, we used Set5 [21], Set14 [22], BSD100 [23], and Urban100 [24]. We tested three commonly used scale factors, ×2, ×3, and ×4, and compared the results with those of the Bicubic, SCN [9], SRCNN [5], VDSR [8], and DRCN [10] algorithms. Two evaluation indexes, Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity (SSIM), were selected as an objective reference basis for measuring the quality of the reconstructed images.
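PSNR follows directly from the mean squared error; a minimal sketch, assuming 8-bit images with values in [0, 255]:

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE).
    Higher is better; identical images give infinity."""
    mse = np.mean((reference.astype(np.float64) -
                   restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float('inf')
    return 10.0 * np.log10(max_val ** 2 / mse)
```

SSIM is more involved (it compares local luminance, contrast, and structure statistics over sliding windows), so implementations typically rely on an existing library routine rather than a few lines of code.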
