Article

Automatic Feature Construction-Based Genetic Programming for Degraded Image Classification

1 School of Computer and Electronics and Information, Guangxi University, Nanning 530004, China
2 Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning 530004, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(4), 1613; https://doi.org/10.3390/app14041613
Submission received: 21 January 2024 / Revised: 12 February 2024 / Accepted: 13 February 2024 / Published: 17 February 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract: Accurately classifying degraded images is a challenging task that relies on domain expertise to devise effective image processing techniques for various levels of degradation. Genetic Programming (GP) has proven to be an excellent approach for solving image classification tasks. However, the program structures designed in current GP-based methods are not effective in classifying images with quality degradation. During the iterative process of GP algorithms, the high similarity between individuals often results in convergence to local optima, hindering the discovery of the best solutions. Moreover, the varied degrees of image quality degradation often lead to overfitting in the solutions derived by GP. Therefore, this research introduces an innovative program structure, distinct from the traditional program structure, which automates the creation of new features by transmitting information learned across multiple nodes, thus improving the ability of GP individuals to construct discriminative features. An accompanying evolution strategy addresses high similarity among GP individuals by retaining promising ones, thereby refining the algorithm's development of more effective GP solutions. To counter the potential overfitting of the best GP individual, a multi-generational individual ensemble strategy is proposed, focusing on constructing an ensemble GP individual with enhanced generalization capability. The new method is evaluated in original, blurry, low-contrast, noisy, and occlusion scenarios on six datasets of different types and is compared with a range of effective methods. The results show that the new method achieves better classification performance on degraded images than the comparative methods.

1. Introduction

Image classification, one of the core tasks in computer vision, aims to analyze image content and assign the correct category labels. This technology has been widely applied in many fields, such as image retrieval, face recognition, autonomous driving, and medical image analysis [1,2,3,4]. Adequate, high-quality training samples allow the key features needed for classification to be captured precisely. In the real world, however, many of the images obtained are of low quality, often containing considerable blur and noise. Analyzing such degraded images and classifying them accurately is therefore a more challenging task.
Extracting effective features is crucial for image classification tasks. Classic manual feature extraction methods such as Local Binary Patterns (LBP) [5], Scale-Invariant Feature Transform (SIFT) [6], and Histogram of Oriented Gradients (HOG) [7] aim to capture key information features of images, thereby achieving efficient image classification. However, despite these manual feature extraction methods demonstrating excellent performance on specific datasets, their classification performance may not be guaranteed when facing unknown types of classification tasks.
Compared with manual feature extraction methods, feature learning approaches are usually more effective for image classification. This is because they can automatically extract distinctive features from raw images. Convolutional Neural Networks (CNNs), as the current mainstream method for feature learning, exhibit outstanding performance in many tasks. However, their performance is dependent on the expert’s model design. Additionally, understanding and interpreting the decision-making process of CNNs remains a challenge.
As a type of evolutionary algorithm, Genetic Programming (GP) [8] has attracted widespread attention due to its excellent learning performance and good interpretability. It is applied in various tasks, such as workshop scheduling [9], symbolic regression [10], and image classification [11]. In GP-based image classification methods, a tree representation structure is generally used. In tree-structured GP, the original image and the parameters required by internal nodes are referred to as leaf nodes or terminals, serving as inputs to the program. Image filters, feature descriptors, and other image processing-related operators are used as internal nodes/functions in tree-structured GP [12]. According to this definition, each individual generated by the GP program can be viewed as a potential solution for the classification task.
Currently, image classification methods based on GP have made significant progress. Some approaches directly use raw images as inputs to the GP program. A common strategy among these is for the GP program to output learned high-level features, which are then fed into a classification algorithm for classification processing [13,14]. Another approach integrates a variety of classifiers within the internal nodes of the GP program. This allows the GP to automatically choose the most suitable classifiers and produce the predicted classification results [15].
However, the aforementioned GP methods primarily focus on processing limited types and higher quality samples, and have not thoroughly explored the issue of degraded image classification. Although some studies have proposed the EFLGP method [16] aimed at degraded image tasks, limitations in its program structure and internal node design may lead to poor classification performance. Furthermore, a common issue in the aforementioned GP methods is the excessive homogeneity of individuals within the population during the middle and later stages of the GP algorithm, which can result in a stagnation in fitness improvement for the best GP individual/program. On the other hand, existing research typically selects the best individual as the solution for classification tasks, a method that easily leads to the occurrence of overfitting.
To confront the challenges previously mentioned, this research introduces a new program structure based on information transmission and develops a novel function set and terminal set for it. The design of the structure enables nodes in specific layers to transmit the key information they learn to nodes in subsequent layers. This promotes the construction of more effective new features by GP individuals, thereby enhancing their performance as solutions for classification tasks. Furthermore, this research devises an accompanying evolution strategy that enhances population diversity by replacing redundant individuals after stagnation, thereby facilitating the search for superior GP individuals. Moreover, the proposed multi-generational individual ensemble strategy aims to combine several efficient GP individuals to construct an ensemble of GP individuals with enhanced generalization capabilities, serving as an effective solution for classification tasks. The proposed new method is abbreviated as ITACIE-GP.
The contributions of this paper lie in the following aspects.
(1) Due to the limitations in program structure and node design in existing GP methods, the performance of GP individuals is restricted. To address this, the study develops a new program structure based on information transmission, which is centered on allowing nodes in specific layers to transmit effective information to subsequent layers. Such a design aids GP individuals in constructing distinctive features, thereby enhancing their performance as solutions for classification tasks.
(2) To solve the problem of stagnation in fitness growth of the best GP individuals during the iterative process, this study proposes an accompanying evolution strategy. When stagnation in the fitness growth of the best GP individuals is detected, this strategy guides the population toward exploring in the direction of more optimized GP individuals.
(3) To address the potential overfitting issue when using a single GP individual as the solution, this study proposes a multi-generational individual ensemble strategy. This strategy, by combining several efficient GP individuals, constructs an ensemble GP individual with stronger generalization capabilities, effectively reducing the overfitting phenomenon.
(4) ITACIE-GP is an effective method suitable for degraded image classification tasks. Its performance is evaluated in original, blurry, low contrast, noisy, and occlusion scenarios on six different types of datasets and compared with several benchmark methods. Moreover, through detailed program examples, this paper delves into the reasons behind the high performance demonstrated by this method.
The remainder of this paper is structured as follows. Section 2 introduces the related work. Section 3 presents the proposed ITACIE-GP method. Section 4 details the design of the experiments. Section 5 discusses and analyzes the experimental results, and Section 6 concludes the work.

2. Related Work

2.1. GP and Strongly Typed GP

As an evolutionary algorithm, GP models the solutions to various application problems as individuals within a population. In this process, each individual is assigned a fitness score, which is determined based on a predefined fitness function. The population then undergoes several generations of evolution, thereby improving the fitness scores. At the end of the algorithm, the individual with the highest fitness score is selected as the best solution to the problem. In the application of image classification, Strongly Typed Genetic Programming (STGP) [17], based on tree representation, is widely used. This is because it allows for the specification of the input and output types of internal nodes as well as the output types of leaf nodes, enforcing constraints on the construction of the program tree, ensuring that only nodes with matching types can be connected. For example, in some operations in image processing (such as image filtering or feature extraction), the required input is of a specific type of image, and the output will also be of a specific type of image or feature value [12]. STGP is used to ensure that the generated program strictly follows the preset steps and requirements when processing images and extracting features.
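For illustration, the following is a minimal sketch of how such type constraints can be declared with DEAP, the library used in Section 4.3. The Image/FeatureVec type tags and the two operators are illustrative placeholders, not the actual function set of Section 3.2.1.

```python
# Minimal STGP sketch with DEAP: type tags restrict which nodes can connect.
import numpy as np
from deap import gp

class Image(np.ndarray):       # type tag for image data
    pass

class FeatureVec(np.ndarray):  # type tag for extracted features
    pass

def mean_filter(img):
    # 3x3 mean smoothing with edge padding; output keeps the input's size
    p = np.pad(img, 1, mode="edge")
    out = sum(p[i:i + img.shape[0], j:j + img.shape[1]]
              for i in range(3) for j in range(3)) / 9.0
    return out.view(Image)

def pixel_features(img):
    return np.asarray(img).flatten().view(FeatureVec)

# Each tree takes one Image terminal and must return a FeatureVec, so only
# nodes with matching input/output types can be connected during generation.
pset = gp.PrimitiveSetTyped("main", [Image], FeatureVec)
pset.addPrimitive(mean_filter, [Image], Image)          # Image -> Image
pset.addPrimitive(pixel_features, [Image], FeatureVec)  # Image -> FeatureVec
```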

2.2. Degraded Image

Degraded images are typically those affected by factors such as low contrast, occlusion, noise, or blur [16,18]. The low quality of these images may arise for a variety of reasons, including suboptimal shooting conditions, the limitations of the equipment’s performance, or interference from external environmental factors. Currently, researchers have proposed various methods to deal with degraded images. For example, some studies use image enhancement techniques to improve image quality, while others focus on reconstructing degraded images. These representative research works are displayed in Table 1.
However, the methods mentioned above all depend on diverse domain knowledge to design new solutions for different classification problems. Additionally, in real life, the decline in image quality is rarely due to a single adverse factor. Therefore, the field of degraded image classification needs a new method that does not rely heavily on domain knowledge, adapts to a variety of task requirements, and effectively solves degraded image classification problems affected by multiple adverse factors.

2.3. Image Classification Methods

In recent years, to more effectively carry out image classification, researchers have proposed various technical solutions and strategies. The existing image classification methods can be summarized into the following categories: manual feature-based image classification methods, CNN-based image classification methods, and GP-based image classification methods.

2.3.1. Manual Feature-Based Image Classification Methods

Image classification methods based on manual feature extraction are designed by experts based on an understanding of images and domain knowledge. In specific application scenarios, these methods often demonstrate good stability and effectiveness. Table 2 displays image classification methods based on manual features.
Although image classification methods based on manual features may demonstrate good performance in specific tasks and have the advantage of being easy to understand, their efficiency may not be high when dealing with unknown or complex classification tasks.

2.3.2. CNN-Based Image Classification Methods

CNNs are capable of automatically learning and extracting features, and they usually achieve better performance for large-scale and complex tasks. Table 3 displays image classification methods based on CNNs.
CNN-based methods have achieved significant success in image classification tasks. However, these methods typically require substantial computational resources to train parameters and often rely on expert-designed efficient architectures.

2.3.3. GP-Based Image Classification Methods

Tree-based GP has been successfully and widely applied in image classification. This method predefines certain operators and, during runtime, combines these operators within the problem space through an evolutionary algorithm to learn the best solution or program. This means that GP does not require extensive domain knowledge to adjust the model or tune parameters to obtain solutions for classification tasks, and these solutions often have a high level of interpretability.
In current research on GP-based image classification, there is a method that utilizes the GP algorithm to construct classifiers for binary classification tasks. This method takes images as direct inputs to the GP program and performs operations such as feature extraction and feature construction simultaneously in this process. Ultimately, the GP program outputs a high-level feature for making classification decisions. Table 4 displays the GP-based binary image classification methods.
These methods excel in binary image classification, but their performance in multiclass image classification has not been extensively studied yet. Another line of GP methods addresses multiclass classification: the program processes original images and either directly outputs predicted class labels at the root node or extracts high-level features there, which are then fed into a predefined classifier. Table 5 displays the GP-based multiclass image classification methods.
Furthermore, some studies focus on improving genetic operations with the goal of obtaining GP individuals with excellent classification performance during the iterative evolution process. Table 6 displays the methods of improvement on GP genetic operators.
GP-based image classification methods have achieved notable progress. As shown in Table 4 and Table 5, some studies have achieved success in limited types of image classification tasks by designing new program structures and function sets. Others have proposed new GP methods for classifying degraded images affected by blurriness, low contrast, or noise. However, the designed program structures of these methods are not sufficiently effective, limiting the classification performance of the algorithm. As shown in Table 6, some studies have focused on improvements in genetic operators, with the aim of obtaining GP individuals with excellent classification performance. Nonetheless, these enhancements are mainly limited to a few datasets and may still face fitness stagnation in broader or more complex scenarios due to the high similarity among individuals. In cases with fewer image samples and adverse factors like blurriness and noise, the best individuals or solutions derived by GP algorithms are prone to overfitting, a problem that has not received adequate attention in current research.
Therefore, this study proposes a new GP-based method for degraded image classification. This method employs a new program structure that facilitates the transmission of information acquired by nodes across layers through an information transmission mechanism, thereby evolving efficient individuals with discriminative features. Additionally, the proposed accompanying evolution strategy retains promising GP individuals alongside the population during the evolutionary process and guides the algorithm to explore more effective GP individuals when the fitness of the best individual stagnates. Simultaneously, this paper introduces a multi-generation individual ensemble strategy, which selects efficient individuals based on the frequency of different functions appearing in each individual and integrates them into a solution for classification tasks.

3. The Proposed Approach

This section provides a detailed introduction to the ITACIE-GP method proposed. It covers an algorithmic framework, program structure based on information transmission, new function set and terminal set, accompanying evolution strategy, and multi-generational individual ensemble strategy.

3.1. Algorithmic Framework

To effectively address the problem of image classification, this study developed the ITACIE-GP method. ITACIE-GP automates the construction of efficient image classification solutions by utilizing a predefined program structure, function set, and terminal set. Figure 1 displays the training and testing process of ITACIE-GP. ITACIE-GP takes the training dataset as its input and subsequently initializes the population based on the newly defined function set and terminal set. Then, individuals/solutions in the population are evaluated for fitness using a fitness function. In each generation, an elitism strategy copies several excellent individuals directly from the current population to the next generation. The algorithm employs a tournament selection method to choose individuals with superior fitness values and stores the first type of accompanying individuals according to the storage strategy in the accompanying evolution strategy. During the iterative process, if the fitness of the best individual stagnates for the predetermined number of generations, the replacement strategy within the accompanying evolution strategy is activated. This breaks the stagnation in fitness, thereby assisting the algorithm in searching for more superior individuals. After this, the algorithm performs genetic operators (i.e., crossover and mutation) to generate a new population. If the population has been updated through the accompanying evolution strategy, the algorithm stores the successfully updated individuals as the second type of accompanying individuals. This process continues until the predefined termination conditions are met. Finally, a set of GP individuals is selected for ensembling based on the multi-generational individual ensemble strategy, and the ensembled individual/solution is returned.
To validate the performance of the ensembled individual/solution obtained by the ITACIE-GP method, a test set is used as the input for the ensembled individual/solution. Then, the predicted class labels of the test set are output, and finally, the performance of this solution is evaluated based on the actual labels of the test set.

3.2. New Program Structure Based on Information Transmission

To enhance the performance of GP programs in classification tasks, this study proposes a program structure based on information transmission. This structure incorporates 11 distinct functional layers, of which 7 are designated as fixed and 4 as flexible. The fixed layers ensure the consistent inclusion of predefined functions within these layers in the GP program. In contrast, the flexible layers allow for the functions from these layers to appear in the generated GP program zero times, once, or multiple times.
By introducing the concept of information transmission, the GP program selects specific functional layers and selectively passes the information obtained from these layers to multiple subsequent layers.
Figure 2 displays this program structure and provides an example of a potential GP program. In this figure, layers connected by solid lines signify that information transmission necessarily occurs at these layers. Conversely, layers connected by dashed lines indicate that information transmission happens only in specific scenarios. Each node in the GP program example represents a function included in various layers.
The image preprocessing layer preliminarily processes the raw image data, including noise reduction and smoothing. The region extraction layer extracts image regions containing key information from the original or filtered image data. The image fusion layer performs pixel-level fusion of different image regions to obtain a new region rich in discriminative content. The downsampling layer performs downsampling operations on regions to reduce the dimensionality of features, yielding new, smaller images. Notably, based on the concept of information transmission, this layer also preserves the original image region and passes it to subsequent layers as needed. The feature extraction layer extracts features from the image region. The feature construction layer constructs features based on multiple types of features. The transmission layer decides whether to pass the effective information generated by the feature construction layer to subsequent layers, based on the type of input. The feature concatenation layer combines the input data into a feature vector. The classification layer utilizes a fixed classification algorithm to conduct predictive classification based on the input features, ultimately outputting the predicted class labels.

3.2.1. New Function and Terminal Set

To meet the functional requirements of each layer in the newly designed GP program, this study designs a corresponding terminal set and function set for each layer. The function set includes the image processing functions necessary for constructing GP programs.
The terminal set includes original image data and parameters for image processing functions, with a detailed list provided in Table 7. Notably, the terminal m represents three different manual feature extraction methods. When the value of m is 0, the SIFT method is used for feature extraction, and the input image region is treated as a key point [37]. When the value of m is 1, the HOG method is employed to extract features, calculating the average of a 10 × 10 grid from the image area with a step size of 10 [7]. When the value of m is 2, the uniform LBP method is selected to extract histogram features, with a radius of 1.5 and a neighbor count of 8 [5].
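To make the dispatch on the terminal m concrete, the following is a hedged sketch using scikit-image for the HOG and uniform LBP branches; the SIFT branch, which treats the whole region as a single key point, depends on the descriptor library and is only indicated here.

```python
# Hedged sketch of the terminal-m feature dispatch described above.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def manual_features(region, m):
    if m == 0:
        # SIFT: the paper treats the whole region as one key point; a
        # descriptor library (e.g., OpenCV) would be plugged in here.
        raise NotImplementedError("SIFT descriptor omitted in this sketch")
    if m == 1:
        # HOG over 10x10-pixel cells (one reading of the grid described above)
        return hog(region, pixels_per_cell=(10, 10), cells_per_block=(1, 1))
    # m == 2: uniform LBP with radius 1.5 and 8 neighbours -> 10-bin histogram
    codes = local_binary_pattern(region, P=8, R=1.5, method="uniform")
    hist, _ = np.histogram(codes, bins=np.arange(11), density=True)
    return hist
```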
The required parameter values for these functions are automatically selected within a predefined range through the evolutionary process. The new method evolves GP programs with a multi-layer structure, capable of automatically selecting the appropriate functions and their corresponding parameters for each layer, tailored to different image classification tasks.
Table 8 provides a detailed list of the function set from the image preprocessing layer to the downsampling layer in the image preprocessing stage.
Table 9 displays the function set used from the feature extraction layer to the classification layer in the feature construction and classification stages.
Functions of the image preprocessing layer: The image preprocessing layer integrates a variety of image filtering and enhancement functions. Processing image data through these functions enhances image quality. Each function accepts either a raw image or a previously processed image, along with related parameters if required, as input. It then produces an output image, processed and of the same size as the input.
Functions of the region extraction layer: The primary function of the region extraction layer is to extract a significant region from image data, reducing the computational load for irrelevant information within the image data. The region extraction layer accomplishes its function through the Region_R and Region_S functions. The input data for these functions include either raw image data or preprocessed image data, as well as Coordinate and Size. Based on Coordinate and Size, the function extracts rectangular or square regions from the image data to serve as input for subsequent processing layers.
Functions of the image fusion layer: The primary function of the image fusion layer is to merge information from multiple image regions, aiming to create a new image region enriched with more comprehensive information. These functions are capable of processing two image regions of possibly different sizes. They use the size of the larger image region as a reference and apply bilinear interpolation to unify the sizes. Subsequent pixel-level fusion is performed to generate a new image region with a greater amount of information. The Min_F function merges by selecting the minimum value of corresponding pixels from both image regions. The Median_F function fuses by choosing the median value of corresponding pixels from the two image regions. The Mean_F function performs fusion by calculating the average value of corresponding pixels from both image regions.
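As one concrete reading of these operators, the following is a minimal sketch that uses scikit-image's bilinear resize (order=1) to bring both regions to the larger size before pixel-level fusion.

```python
# Minimal sketch of the three fusion functions described above.
import numpy as np
from skimage.transform import resize

def _align(r1, r2):
    # Resize both regions to the larger region's shape (bilinear, order=1)
    target = max(r1.shape, r2.shape, key=lambda s: s[0] * s[1])
    return (resize(r1, target, order=1, preserve_range=True),
            resize(r2, target, order=1, preserve_range=True))

def Min_F(r1, r2):
    a, b = _align(r1, r2)
    return np.minimum(a, b)                      # pixel-wise minimum

def Median_F(r1, r2):
    a, b = _align(r1, r2)
    return np.median(np.stack([a, b]), axis=0)   # pixel-wise median

def Mean_F(r1, r2):
    a, b = _align(r1, r2)
    return (a + b) / 2.0                         # pixel-wise average
```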
Functions of the downsampling layer: The primary task of the downsampling layer is to downsample the input image regions to reduce image sizes, thereby lessening the computational burden. To adapt to various classification tasks, this layer provides multiple downsampling functions. Specifically, the function set of this layer includes the Max_P function, Ave_P function, and Bilin_D function. These functions take an image region and a sliding window size formed by the integers K1 and K2 as input. The Max_P function emphasizes the most significant features by retaining the maximum value in each window. The Ave_P function calculates the average of all values within the window to preserve more information. The Bilin_D function, through the use of bilinear interpolation, effectively preserves image details while reducing the size of the image. It is important to note that while the downsampled region generally retains key features, the loss of useful information can still occur. This might lead to reduced performance when using manual feature extraction methods. Hence, this layer incorporates the concept of information transmission, passing the original region to subsequent layers, ensuring the use of the information-rich original region for manual feature extraction when necessary. Given this consideration, the layer's output is designed in a tuple format, including both the original and the downsampled region.
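A sketch of the three downsampling functions follows; each returns a tuple that keeps the original region alongside the downsampled one, as just described (block_reduce is scikit-image's windowed aggregation).

```python
# Sketch of the downsampling functions with a (K1, K2) window.
import numpy as np
from skimage.measure import block_reduce
from skimage.transform import resize

def Max_P(region, k1, k2):
    return region, block_reduce(region, (k1, k2), np.max)   # keep max per window

def Ave_P(region, k1, k2):
    return region, block_reduce(region, (k1, k2), np.mean)  # average per window

def Bilin_D(region, k1, k2):
    h, w = region.shape
    down = resize(region, (max(1, h // k1), max(1, w // k2)),
                  order=1, preserve_range=True)              # bilinear shrink
    return region, down
```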
Functions of the feature extraction layer: The primary responsibility of the feature extraction layer is to extract features from the input image region. The Feature_E function takes Tuple1 as input. For the original image region, it selects a manual feature extraction method based on the terminal parameter m. For the non-original image region, each row of the region's pixels is concatenated into a vector to form pixel features. Then, these two types of features are concatenated to form fused features, which are output. Integrating different types of features helps to comprehensively capture image information. It is noteworthy that the original image region passed from the downsampling layer may still be useful in subsequent layers; thus, the function output still includes the original image region. Additionally, the output includes a temporary variable t, initialized within the function. This value is key for further information transmission and will be introduced in the subsequent transmission layer.
Functions of the feature construction layer: To construct new, distinctive features, an innovative feature construction layer has been designed. Feature_C accepts Tuple2 as input and generates new manual features based on the value of m produced by the terminal. These newly generated features are multiplied by the value of parameter i generated by the terminal, while the second element of Tuple2 is multiplied by (1 − i). Subsequently, these two parts are added together to construct new features. It is important to note that since the function has the same type for both input and output, the constructed features may be returned to the same layer for additional processing. During this process, the original image region will continually produce new handcrafted features, which will be preserved as output. At the same time, the already constructed features will be combined with the newly generated features for further feature construction.
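The following hedged sketch illustrates this weighted construction. Here Tuple2 is read as (original region, constructed features, t), manual_features() is the sketch given earlier, and the crude length alignment of the two vectors is an assumption of the sketch, since the paper does not detail it.

```python
# Hedged sketch of Feature_C: i-weighted new manual features plus
# (1 - i)-weighted previously constructed features.
import numpy as np

def Feature_C(tuple2, m, i):
    original, constructed, t = tuple2
    new_manual = manual_features(original, m)   # new handcrafted features
    n = min(len(new_manual), len(constructed))  # assumed length alignment
    combined = (i * np.asarray(new_manual[:n])
                + (1 - i) * np.asarray(constructed[:n]))
    # Input and output share a type, so the result may re-enter this layer.
    return original, combined, t
```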
Functions of the transmission layer: The primary function of the transmission layer is to decide whether to pass information to subsequent layers based on the type of input. When the function Transmit_F receives input originating from the feature extraction layer, it checks whether the third element of the tuple is a feature vector. If it is, the extracted features are returned directly in vector form. If it is not a feature vector, the decision to pass the newly generated manual features from the feature construction layer to subsequent layers is made based on the parameter p. Furthermore, considering that the program structure uses strongly typed GP, the transmission layer also acts as a transitional layer for input-output type conversion, ensuring that the feature concatenation layer can be defined as a flexible layer.
Functions of the feature concatenation layer: The primary function of the Concat2 function in the feature concatenation layer is to concatenate the feature data input from two transmission layers and output the result. Due to the specific design of the transmission layers, the input data may include handcrafted features generated by the feature construction layer. Based on this, the Concat2 function integrates all input features, forming a feature vector as the output. As a flexible layer, it can combine various features from the same region or similar features from different regions.
Functions of the classification layer: The SVM function in the classification layer receives the learned features and inputs them into a Support Vector Machine (SVM) classifier for categorization, thereby outputting the predicted category labels.

3.2.2. Fitness Function

Function set and terminal set are used to generate GP individuals/programs. Since ITACIE-GP deals with image classification tasks, classification accuracy is adopted as the fitness function for the individuals, which is defined as
$\mathrm{Accuracy} = \dfrac{N_{\mathrm{true}}}{N_{\mathrm{total}}} \times 100\%,$ (1)
where $N_{\mathrm{true}}$ represents the number of instances correctly classified, and $N_{\mathrm{total}}$ denotes the total number of instances in the training set. The ITACIE-GP programs/individuals are evaluated using a stratified 5-fold cross-validation (5-fold CV) method.
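As a sketch, this fitness evaluation can be written with scikit-learn as follows; the SVM kernel is left at the library default, which is an assumption, since the text only fixes the penalty factor C = 1 (Section 4.3).

```python
# Sketch of the Equation (1) fitness: mean stratified 5-fold CV accuracy.
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC

def fitness(X, y):
    # X: features produced by one GP individual for the training images
    scores = cross_val_score(SVC(C=1), X, y,
                             cv=StratifiedKFold(n_splits=5),
                             scoring="accuracy")
    return scores.mean() * 100  # percentage, as in Equation (1)
```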

3.3. Accompanying Evolution

To address the issue of fitness stagnation of the best individual during the iterative process of the GP algorithm, this paper introduces the accompanying evolution strategy. The accompanying evolution strategy consists of two sub-strategies: storage strategy and replacement strategy. The storage strategy is employed to identify and preserve individuals showing potential, referred to as accompanying individuals. These individuals are retained and not immediately reintroduced into the population. When the algorithm detects fitness stagnation of the best individual, the replacement strategy reintroduces these accompanying individuals into the population, thereby effectively enhancing the probability of finding the best individual.

3.3.1. Storage Strategy

In the storage strategy of the algorithm, two storage methods for accompanying individuals are introduced, aiming to store potential individuals at different evolutionary stages. The first method primarily focuses on high-fitness individuals that are eliminated during the evolutionary process, while the second method targets individuals that have been updated and enhanced their fitness within the new population.
The first storage method focuses on preserving individuals that, although eliminated during the evolutionary process, exhibit high fitness. This approach avoids introducing lower-quality individuals that might degrade the overall population quality, without adding extra computational resource consumption. At the end of the selection process for each generation, the eliminated individuals are ranked based on their fitness values. They are then stored as accompanying individuals according to the ratio $\alpha$ and subsequently incorporated into $A_1$. This ratio $\alpha$ is defined in Equation (2),
$\alpha = \begin{cases} 0.15, & \text{if } 1 \le g \le 10 \\ 0.10, & \text{if } 11 \le g \le 30 \\ 0.05, & \text{otherwise} \end{cases}$ (2)
where α takes on different values depending on g, which represents the evolutionary generation. In the initial stages of the algorithm, the population diversity is typically high. As evolution progresses, the population might begin to converge, potentially leading to a decrease in diversity. To adapt to this transition, the algorithm reduces the number of accompanying individuals it retains.
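In code, Equation (2) and the first storage step amount to the following sketch; ind.fitness is an illustrative attribute standing in for an individual's fitness score.

```python
# Equation (2) and the first storage step: keep the top alpha(g) fraction
# of the individuals eliminated by selection in generation g.
def alpha(g):
    if 1 <= g <= 10:
        return 0.15
    if 11 <= g <= 30:
        return 0.10
    return 0.05

def store_type1(eliminated, g, A1):
    eliminated.sort(key=lambda ind: ind.fitness, reverse=True)  # rank by fitness
    A1.extend(eliminated[:int(alpha(g) * len(eliminated))])
```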
The second storage method focuses on preserving individuals that have successfully updated. Specifically, after the first type of accompanying individuals replace some redundant individuals in the population, a new population is generated through crossover and mutation. Within this new population, if any individual successfully updates (improves its fitness), this indicates that the new individual has potential for development. Individuals that have updated successfully are stored in $A_2$ as the second type of accompanying individuals.

3.3.2. Replacement Strategy

In the replacement strategy, to determine the appropriate timing for substituting some redundant individuals with accompanying individuals, the parameter q is introduced, as defined in Equation (3),
$q = \begin{cases} q + 1, & \text{if } f_{\mathrm{best},g} \le f_{\mathrm{best},g-1} \\ 0, & \text{otherwise} \end{cases}$ (3)
where $f_{\mathrm{best},g}$ denotes the fitness value of the best solution in the population at the g-th generation, and q represents the number of times the best individual in the population fails to update.
During the evolutionary process, if the population's $f_{\mathrm{best}}$ fails to update, the algorithm determines that the population's evolution has stagnated, leading to an increment in the value of q. When $f_{\mathrm{best}}$ updates successfully or when q reaches the predetermined threshold T, the algorithm resets the value of q to zero.
When q reaches the designated threshold T, set at 5 in this study, for the first time, the algorithm selectively removes a certain proportion of redundant individuals from the population and selects individuals from $A_1$ for replacement. The number of individuals selected follows Equation (4),
$\Gamma_1 = \beta \times P_{\mathrm{redundant}}$ (4)
where $P_{\mathrm{redundant}}$ represents the number of redundant individuals identified in the current population, while $\Gamma_1$ denotes the number of individuals selected from $A_1$. $\beta$ is a fixed proportionality coefficient, set at 0.25 here.
When the value of q reaches the threshold T again, the algorithm evaluates whether the previous replacement strategy was effective. If replacing individuals with those from $A_1$ has led to an improvement in the fitness of the best individual, the algorithm will continue with this strategy. Otherwise, individuals from $A_2$ will be selected. At this point, the number of individuals chosen follows Equation (5),
$\Gamma_2 = \min(\beta \times P_{\mathrm{redundant}},\ P_{\mathrm{updated}})$ (5)
where $P_{\mathrm{updated}}$ represents the number of individuals in the population whose performance has successfully improved, while $\Gamma_2$ denotes the number of individuals selected from $A_2$. Equation (5) ensures that the number of individuals chosen from $A_2$ does not exceed the number of accompanying individuals available in $A_2$. Using Equations (4) and (5), the appropriate number of replacements is determined to substitute the redundant individuals in the population. When the fitness growth of the best individual in the population slows down or even stagnates, this assists the algorithm in exploring better individuals.
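A condensed sketch of Equations (3)-(5), with beta = 0.25 and T = 5 as stated in the text:

```python
# Stagnation counter and replacement sizes for the replacement strategy.
BETA, T = 0.25, 5

def update_q(q, f_best_g, f_best_prev):
    # Equation (3): increment while the best fitness fails to improve;
    # q is also reset once it reaches T and a replacement is triggered.
    return q + 1 if f_best_g <= f_best_prev else 0

def gamma1(p_redundant):
    return int(BETA * p_redundant)                    # Equation (4)

def gamma2(p_redundant, p_updated):
    return min(int(BETA * p_redundant), p_updated)    # Equation (5)
```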

3.4. Multi-Generation Individual Ensemble Strategy

To avoid the risk of overfitting that might be introduced by a single GP program/individual, a multi-generation individual ensemble strategy is proposed. Specifically, starting from the 10th generation, the best GP program of each generation is continually saved. Upon completion of the GP algorithm iteration, the collection comprising the best GP program throughout the entire algorithm process and the best GP program of each generation is returned, denoted as $I_b = \{i_1^b, i_2^b, \ldots, i_m^b\}$, where $i_n^b$ represents the best GP program generated by the n-th generation. Then, based on the function set defined in Section 3.2.1, the frequency of the different functions contained in each GP program in $I_b$ is calculated and denoted $F_i$. Based on these frequencies, the difference between the GP programs in the collection $I_b$ and the algorithm's best GP program is calculated, as shown in Equation (6),
$D_i = |F_{\mathrm{best}} - F_i|$ (6)
where $F_{\mathrm{best}}$ denotes the function frequency of the algorithm's best GP program, and $F_i$ represents the function frequency of each GP program in the set $I_b$. $D_i$ is defined as the difference in function frequency between the best GP program and each of the best GP programs of every generation. The three GP programs with the highest $D_i$ values, along with the two best GP programs, are then combined into an ensemble GP program. This process can be explained with the following example: assume that the predicted class labels obtained from the three GP programs with the highest $D_i$ values are 0, 0, and 1, and those from the two best GP programs are 1 and 1. These outputs are combined into the sequence 00111, and a majority vote over them generates the predicted class label, which in this example is 1.
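A hedged sketch of this selection and voting step follows. Equation (6) is read here as an L1 distance over per-function counts (one possible interpretation), trees are assumed to be DEAP PrimitiveTree objects, and member.predict is an illustrative stand-in for running one evolved program on a test instance.

```python
# Function-frequency distance (Equation (6)) and majority-vote prediction.
from collections import Counter
from deap import gp

def count_functions(tree):
    # frequency of each function (internal node) in one GP tree
    return Counter(node.name for node in tree if isinstance(node, gp.Primitive))

def freq_distance(tree, best_tree):
    fa, fb = count_functions(tree), count_functions(best_tree)
    return sum(abs(fa[k] - fb[k]) for k in set(fa) | set(fb))

def ensemble_predict(members, x):
    votes = [member.predict(x) for member in members]  # e.g., [0, 0, 1, 1, 1]
    return Counter(votes).most_common(1)[0][0]         # majority label -> 1
```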

4. Experimental Design

This section discusses the design of the experiment, including detailed information about the benchmark dataset, benchmark methods used for comparison, and parameter settings.

4.1. Benchmark Datasets

The proposed ITACIE-GP method is evaluated on image datasets of varying difficulties. They are FEI1 [38], FEI2 [38], JAFFE [39], KTH [40], Flower [41], and ORL [42]. Figure 3 displays examples of these image datasets.
FEI1 and FEI2 are datasets related to binary facial expression classifications, distinguishing primarily between smiling and natural expressions. The JAFFE dataset comprises images categorized into seven facial emotions: anger, fear, disgust, neutral, sadness, surprise, and happiness. The KTH dataset is designed for texture classification and includes images from 10 different categories. These images differ in lighting, angles, and sizes. The Flower dataset focuses on flower classification, featuring images of two distinct flower types. ORL, commonly used in face recognition research, consists of images from 40 different individuals. Each individual contributes 10 distinct facial photos, capturing a range of facial expressions and variables, including the presence or absence of glasses. Table 10 describes the detailed information of the image datasets, including image size, the split between training and testing sets, and the number of classes in each dataset.
This paper formed degraded image datasets based on the parameters proposed in references [16,18] and the aforementioned image datasets. Specifically, different levels of blur, low contrast, and noise, as well as square occlusions of various sizes at different positions, were introduced separately into the datasets, yielding datasets for blurred, low-contrast, noisy, and occluded scenarios.
Table 11 provides a detailed list of the parameters used to create these datasets, including the degree of blur, level of contrast, size of noise, and size of the occluding squares. Figure 4 shows examples of degraded images in four different scenarios.
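For illustration, the four degradation types can be generated along the following lines; the parameter values here (blur sigma, contrast factor, noise rate, occlusion size) are placeholders rather than the Table 11 settings.

```python
# Illustrative degradations for the four scenario datasets.
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def degrade(img):  # img: 2-D uint8 array larger than the 20x20 occlusion
    blurred = gaussian_filter(img, sigma=2)
    low_contrast = (0.4 * (img - img.mean()) + img.mean()).astype(np.uint8)
    noisy = img.copy()
    mask = rng.random(img.shape) < 0.05                 # salt-and-pepper noise
    noisy[mask] = rng.choice([0, 255], size=int(mask.sum()))
    occluded = img.copy()
    y = rng.integers(0, img.shape[0] - 20)
    x = rng.integers(0, img.shape[1] - 20)
    occluded[y:y + 20, x:x + 20] = 0                    # square occlusion
    return blurred, low_contrast, noisy, occluded
```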

4.2. Benchmark Methods

To verify the performance of the ITACIE-GP method, this paper compares it with several benchmark methods. These benchmark methods include manual feature-based methods, CNN-based methods, deep feature-based methods, and a GP-based method.
For manual feature-based methods, this paper employs several image classification techniques that combine various handcrafted feature methods with the Support Vector Machine (SVM) classification algorithm, namely SVM + Histogram, SVM + DWT [43], SVM + Gabor [44], SVM + SIFT [6], SVM + HOG [7], and SVM + uLBP [5]. These methods demonstrate relatively low computational complexity and good performance across multiple datasets.
For CNN-based methods, two widely used network architectures in computer vision tasks are selected for experimentation: CNN-5 and LeNet-5 [45].
For deep feature-based methods, this paper selects several deep models pre-trained on the ImageNet dataset [46], namely ResNet [47], VGG [48], AlexNet [49], MnasNet [50], and SqueezeNet [51], for performance evaluation. The ResNet, VGG, and MnasNet models have their last fully connected layer removed, while AlexNet has all its fully connected layers removed, thereby obtaining network structures for feature extraction. The extracted features are then fed into the SVM classification algorithm to complete the final classification task.
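As a sketch of this pipeline, ResNet-18 is used below for brevity; the exact model variants, weights, and preprocessing follow the cited papers and are not reproduced here.

```python
# Pretrained backbone as a fixed feature extractor feeding an SVM.
import torch
import torchvision
from sklearn.svm import SVC

backbone = torchvision.models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()   # drop the last fully connected layer
backbone.eval()

@torch.no_grad()
def deep_features(batch):           # batch: (N, 3, 224, 224) float tensor
    return backbone(batch).cpu().numpy()

# clf = SVC(C=1).fit(deep_features(train_batch), y_train)
```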
For the GP-based method, EFLGP [16] is considered to be representative in this field. The GP program of this method generates a set of feature vectors, which are then used as input to an SVM. This method has demonstrated superior performance on datasets with issues like blurriness, low contrast, and noise.
These benchmark methods were chosen because their performance on various publicly available datasets has been widely validated and is commonly used in this domain. The experimental design strictly accords with parameter configurations from the existing literature and assesses the classification performance of all methods on selected datasets.

4.3. Parameter Settings

The ITACIE-GP method proposed in this paper was implemented using the DEAP (Distributed Evolutionary Algorithms in Python) [52] library. The detailed parameter configurations for ITACIE-GP can be found in Table 12. The experiments were conducted on a personal computer equipped with an Intel Core i5 CPU at 3.2 GHz and 32 GB of memory.
The SVM classification algorithm was implemented using scikit-learn [53], with the penalty factor C set to 1. The implementations of the CNN-based and deep feature-based methods were built upon PyTorch [54]. For the CNN-5 method, a structure identical to that described in [28] was established, while the LeNet-5 approach adhered to the parameter settings from its original paper. In both the CNN-based and deep feature-based methods, the learning rate was set at 0.05.

5. Results and Discussion

5.1. Classification Performance

To ensure fairness in the experiments and minimize potential biases, each method was independently run 30 times on each dataset with different random seeds. The analysis of the experimental results primarily relies on the average classification accuracy and standard deviation on the test sets. Table 13, Table 14, Table 15, Table 16 and Table 17 list the average classification accuracy and standard deviation for ITACIE-GP and the comparison methods on the FEI_1, FEI_2, FLOWER, JAFFE, KTH, and ORL datasets in five scenarios. The non-parametric Wilcoxon Signed-Rank test assesses the significance of differences between the methods without assuming that the results follow a normal distribution. ITACIE-GP is compared with its competitors at a significance level of α = 0.05, denoted in the W-test column. The symbols (+/=/−) respectively indicate that ITACIE-GP is better than/similar to/inferior to the comparison methods on the benchmark datasets. The best classification results for each dataset are highlighted in bold and marked in red, the second-best results are marked in green, and the third-best results are marked in blue. The overall results of the significance tests are summarized in the last row.
In 180 comparisons with the SVM + Gabor, SVM + Histogram, SVM + HOG, SVM + SIFT, SVM + DWT, and SVM + uLBP methods, the ITACIE-GP approach significantly outperformed the comparative methods in 178 comparisons and demonstrated similar performance in 2. Only in the original and low-contrast scenarios does SVM + SIFT perform similarly to ITACIE-GP on the ORL dataset. In the other scenarios, ITACIE-GP holds a notable advantage. This indicates that simply extracting manual features and then classifying is ineffective, and that performance degrades under the influence of various adverse scenarios. In contrast, benefiting from its program structure based on information transmission, the ITACIE-GP method can evolve and construct effective features for accurate classification based on the characteristics of the classification problem. Moreover, it does not experience a significant decline in performance due to interference from adverse scenarios.
In 60 comparisons with LeNet-5 and CNN-5, the ITACIE-GP method significantly outperformed the comparative methods in 59 comparisons and showed similar performance in 1 comparison. The ITACIE-GP method demonstrated superior performance on almost all datasets. This may be attributed to the relatively simple architectures of LeNet-5 and CNN-5, whereas the ITACIE-GP method can evolve solutions of different scales based on the problem definition.
The ITACIE-GP method demonstrated significant advantages in 150 comparisons with MnasNet, VGG, AlexNet, ResNet, and SqueezeNet. Specifically, in 136 comparisons, the ITACIE-GP method was significantly superior to the comparison method; in 5 comparisons, its performance was similar; and in 9 comparisons, its performance was inferior. Across all scenarios on the facial expression datasets (FEI_1, FEI_2, and JAFFE) and the face recognition dataset (ORL), the performance of ITACIE-GP was significantly superior to these five deep feature-based methods. However, on the Flower dataset, in the original, blurred, and occlusion scenarios, the ITACIE-GP method's performance was comparable to or slightly inferior to that of MnasNet, VGG, and ResNet. This may be because deep feature-based methods learn distinctive features for flower classification in the original, blurred, and occlusion scenarios, features that are not easily affected by such adverse factors. In contrast, in the low-contrast and noisy scenarios, the low contrast and noise in images significantly impair the performance of these methods. Although ITACIE-GP may not outperform the deep feature-based methods in the original, blurred, and occlusion scenarios, it still demonstrates similar results, and it significantly surpasses these methods in the other two challenging scenarios, showcasing its robustness. On the KTH dataset, ITACIE-GP's performance in the original scenario was not as good as that of all the deep feature-based methods considered, but in the blurred and noisy scenarios, its performance was comparable to or slightly inferior to some of them. It is particularly noteworthy that under low contrast and occlusion, ITACIE-GP significantly outperforms the five deep feature-based methods, highlighting the potential instability of deep feature approaches on image datasets under adverse scenarios. At the same time, ITACIE-GP's design goal of maintaining good performance across various challenging scenarios was clearly demonstrated.
In the 30 comparisons with EFLGP, the ITACIE-GP method demonstrated a significant advantage in the majority of comparisons, specifically in 28 out of 30. In the 2 comparisons under low contrast scenarios, its performance was similar to that of EFLGP. This advantage mainly results from the program structure based on information transmission designed in this paper, which ensures that the initialized solutions have higher performance. Moreover, thanks to the accompanying evolution strategy, the classification accuracy continues to improve even in the later stages of the evolutionary process.
Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9 illustrate the convergence curves of the average classification accuracy of the best individuals in each generation during 30 runs of EFLGP and ITACIE-GP across five datasets, FEI_1, FLOWER, JAFFE, KTH, and ORL, under five different scenarios. The figures demonstrate that the innovative program structure of ITACIE-GP enables optimal individuals to achieve high classification performance on the training set from the very beginning. As the evolutionary process progresses, especially in the middle and later stages, ITACIE-GP continues to enhance the training set classification performance of the best individuals due to its accompanying evolution strategy. In contrast, the training set classification performance of the best individuals in EFLGP might slow down or even stagnate after a certain generation, indicating that it may be less effective in continuous optimization compared with ITACIE-GP. Overall, ITACIE-GP uses its outstanding program structure to attain superior initial training set classification performance and continuously improves this performance through its evolutionary strategy.

5.2. Ablation Experiments

To verify the effectiveness of the strategies proposed in ITACIE-GP, this paper conducted ablation experiments for comparison. The comparison methods were divided into four types: the ITACIE-GP method proposed in this paper, ACIE-GP without the program structure based on information transmission, ITIE-GP, which has the same program structure as ITACIE-GP but does not use the accompanying evolution strategy, and ITAC-GP, which has the same program structure as ITACIE-GP but lacks a multi-generational individual ensemble strategy.
The four methods were run independently five times on different datasets using the same random seed. The experimental results have been analyzed based on the average and standard deviation of the classification accuracy on the test sets.
As shown in Table 18, in the overall row, the performance differences of ITACIE-GP compared with ACIE-GP, ITAC-GP, and ITIE-GP in terms of average values are shown. Specifically, "+" indicates that ITACIE-GP outperforms the compared methods, "=" indicates that the performance of ITACIE-GP is identical to that of the comparison methods, and "-" denotes that ITACIE-GP underperforms relative to the comparison methods.
In the 16 performance comparisons with ACIE-GP, ITACIE-GP outperformed the comparison method in all scenarios. ITACIE-GP builds efficient feature combinations based on predefined feature types, delivering outstanding classification performance for datasets in complex scenarios. Conversely, ACIE-GP mainly relies on predefined manual features, leading to diminished classification capabilities when handling images from diverse categories or complex scenarios.
In 16 performance comparisons with ITIE-GP, ITACIE-GP outperformed the comparison method in 15 instances and performed equally in 1. This result indicates that a significant decline in classification performance occurs after losing the accompanying evolution strategy for continuous optimization. The reason is the lack of substantial improvement in multiple generations of optimal individuals. When implementing an individual ensemble strategy, it may combine solutions that are too similar and not particularly distinguished in classification performance, which not only fails to improve classification efficacy but sometimes may even lead to a reduction in performance.
In 16 performance comparisons, ITACIE-GP showed superior performance to ITAC-GP in 10 instances, presented equivalent performance in 4, and was slightly inferior in 2. Generally, the absence of the individual ensemble strategy leads to reduced classification performance. This happens because forming a final solution with multi-generational individuals/solutions effectively avoids the overfitting issue that reliance on a single solution might cause.

5.3. Performance Analysis

To clarify why ITACIE-GP is able to achieve good classification performance, this section analyzes the single programs/solutions obtained by ITACIE-GP in the FEI1 dataset under occlusion scenarios and in the JAFFE dataset under original scenarios.

5.3.1. Example Program of FEI1 Dataset under Occlusion Scenarios

Figure 10 displays an example program evolved by ITACIE-GP for the FEI1 dataset in occlusion scenarios, along with its two visualization examples. This program achieves an accuracy of 99.33% on the training set and can reach an accuracy of 98% on the test set.
In the left branch of the example program, the Prewitt operator is first applied to highlight edge information, and the mouth region, which has significant discriminative ability, is extracted from the result. Then, the extracted image region undergoes downsampling. Next, fused features are generated by combining the pixel features after downsampling with the LBP features of the original region. Based on these fused features, as well as the information transmitted from the downsampling layer, new features are further constructed. In the right branch of the example program, median fusion is performed on two different extracted image regions to construct a new image region with higher discriminative power. Subsequently, the new image region undergoes downsampling, resulting in fused features based on pixel and SIFT features. Ultimately, the features of the left and right branches of the example program are concatenated for the final classification decision.
It is worth mentioning that, for Happy category example images in occlusion scenarios (visualized on the left side of Figure 10), the ITACIE-GP method effectively extracts key regions in the images and performs efficient feature extraction and construction. This enables the effective classification of images obstructed in the same location. However, for Nature category example images in occlusion scenarios (visualized on the right side of Figure 10), the region extracted by the left branch might yield ineffective features due to the obstruction. Despite this, the fused region obtained by the right branch still contains valid information, making the classification decision based on these features effective. This demonstrates the strong generalization ability of the ITACIE-GP algorithm, which is capable of extracting multiple sets of discriminative features, enabling effective classification decisions even in obstructed or complex scenarios.

5.3.2. Example Program of the JAFFE Dataset in the Original Scenarios

Figure 11 shows an example program evolved by ITACIE-GP for the JAFFE dataset in the original scenarios, along with its two visualization examples. This program achieves an accuracy of 92.85% on the training set and can reach an accuracy of 89.04% on the test set.
In the left branch of the example program, two different image fusion operations are executed to generate new image regions rich in information. Subsequently, based on these new image regions, pixel features and LBP features are extracted and concatenated to obtain fused features. In the feature construction layer, new features are further constructed by combining the information transmitted from the downsampling layer. In the right branch of the example program, an image fusion operation is conducted, resulting in the creation of a new image region. By extracting pixel and SIFT features from this new image region, fused features are obtained. In the feature construction layer, new features are further constructed by integrating the information transmitted from the downsampling layer. Ultimately, the features from both the left and right branches are concatenated, and during this process, the handcrafted features generated in the feature construction process are passed to the feature concatenation layer, thereby assisting in the classification decision.
Overall, the in-depth analysis of these two example programs demonstrates that the solutions generated by ITACIE-GP are not only easy to interpret but also exhibit excellent generalization ability and classification accuracy, thanks to their program structure based on information transmission. Moreover, the ITACIE-GP method can construct features according to the needs of different classification tasks, thereby ensuring effective classification decisions even in occlusion or other complex scenarios.

6. Conclusions

In this paper, a new GP-based method is introduced for the classification of degraded images. The proposed program structure, based on information transmission, significantly enhances the potential of GP individuals: nodes in specific layers can automatically transmit effective information to nodes in subsequent layers, enabling the automatic construction of effective features and markedly improving the resulting solutions. The accompanying evolution strategy addresses fitness stagnation during the iterative process through storage and replacement: when population fitness stagnates, redundant individuals are replaced with stored individuals that retain developmental potential, directing the search toward superior solutions. Moreover, the multi-generational individual ensemble strategy uses individual distances to combine several effective GP individuals into an ensemble solution, effectively mitigating overfitting. Leveraging these three strategies, the ITACIE-GP algorithm exhibits excellent classification performance on datasets of original, blurry, low-contrast, noisy, and occluded images. Experimental results indicate that ITACIE-GP consistently achieves comparable or superior results, particularly in the classification of degraded images. Ablation studies further confirm the effectiveness and synergy of the three strategies, and the performance analyses explain why ITACIE-GP evolves high-performing solutions.
In future work, we will prioritize classification tasks that are more closely aligned with real-world applications and develop more advanced techniques to further improve classification performance on degraded images.

Author Contributions

Conceptualization, Y.S. and Z.Z.; methodology, Z.Z.; validation, Y.S. and Z.Z.; formal analysis, Z.Z.; investigation, Z.Z.; resources, Z.Z.; data curation, Z.Z.; writing—original draft preparation, Z.Z.; writing—review and editing, Y.S. and Z.Z.; visualization, Z.Z.; project administration, Y.S.; funding acquisition, Y.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Grant Nos. 61763002 and 62072124), the Guangxi Major Projects of Science and Technology (Grant No. 2020AA21077021), and the Foundation of the Guangxi Experiment Center of Information Science (Grant No. KF1401).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets used in this paper are available at the following URLs: FEI Face Database (accessed on 29 January 2024) (https://fei.edu.br/~cet/facedatabase.html), Cambridge ORL Face Database (accessed on 29 January 2024) (http://cam-orl.co.uk/facedatabase.html), JAFFE Database (accessed on 28 January 2024) (https://www.kasrl.org/jaffe_download.html), and KTH-TIPS Material Database (accessed on 6 February 2024) (https://www.csc.kth.se/cvap/databases/kth-tips/credits.html). The code for this manuscript has been uploaded to GitHub, available at https://github.com/zzq12-zzq/Automatic-Feature-Construction-Based-Genetic-Programming (accessed on 29 January 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GP: Genetic Programming
LBP: Local Binary Patterns
SIFT: Scale-Invariant Feature Transform
HOG: Histogram of Oriented Gradients
CNNs: Convolutional Neural Networks
STGP: Strongly Typed Genetic Programming
SVM: Support Vector Machine
BoW: Bag of Words
RF: Random Forest
IPF: Image Preprocessing Layer Function
REF: Region Extraction Layer Function
IFF: Image Fusion Layer Function
DF: Downsampling Layer Function
FEF: Feature Extraction Layer Function
FCF: Feature Construction Layer Function
TF: Transmission Layer Function
FCF: Feature Concatenation Layer Function
Min_F: Minimum Fusion Function
Med_F: Median Fusion Function
Mean_F: Mean Fusion Function
Max_P: Max Pooling Function
Ave_P: Average Pooling Function
Bilin_D: Bilinear Downsampling Function
Feature_E: Feature Extraction Function
Feature_C: Feature Construction Function
Transmit_F: Transmission Layer Function
Concat2: Feature Concatenation Function
DWT: Discrete Wavelet Transform
LeNet-5: LeNet-5 Convolutional Neural Network
CNN-5: 5-Layer Convolutional Neural Network
MnasNet: MnasNet Convolutional Neural Network
AlexNet: AlexNet Convolutional Neural Network
VGG: Visual Geometry Group Neural Network
ResNet: Residual Neural Network
SqueezeNet: SqueezeNet Convolutional Neural Network
IT: Information Transmission
AC: Accompanying Evolution
IE: Individual Ensemble

References

  1. Naeem, A.; Anees, T.; Ahmed, K.T.; Naqvi, R.A.; Ahmad, S.; Whangbo, T. Deep learned vectors’ formation using auto-correlation, scaling, and derivations with CNN for complex and huge image retrieval. Complex Intell. Syst. 2023, 9, 1729–1751. [Google Scholar] [CrossRef]
  2. Nguyen, H.D.; Kim, S.H.; Lee, G.S.; Yang, H.J.; Na, I.S.; Kim, S.H. Facial expression recognition using a temporal ensemble of multi-level convolutional neural networks. IEEE Trans. Affect. Comput. 2019, 13, 226–237. [Google Scholar] [CrossRef]
  3. Ni, J.; Shen, K.; Chen, Y.; Cao, W.; Yang, S.X. An improved deep network-based scene classification method for self-driving cars. IEEE Trans. Instrum. Meas. 2022, 71, 1–14. [Google Scholar] [CrossRef]
  4. Abdar, M.; Fahami, M.A.; Rundo, L.; Radeva, P.; Frangi, A.F.; Acharya, U.R.; Khosravi, A.; Lam, H.K.; Jung, A.; Nahavandi, S. Hercules: Deep hierarchical attentive multilevel fusion model with uncertainty quantification for medical image classification. IEEE Trans. Ind. Inform. 2022, 19, 274–285. [Google Scholar] [CrossRef]
  5. Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
  6. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar] [CrossRef]
  7. Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893. [Google Scholar]
  8. Koza, J.R. Genetic programming as a means for programming computers by natural selection. Stat. Comput. 1994, 4, 87–112. [Google Scholar] [CrossRef]
  9. Qin, M.; Wang, R.; Shi, Z.; Liu, L.; Shi, L. A genetic programming-based scheduling approach for hybrid flow shop with a batch processor and waiting time constraint. IEEE Trans. Autom. Sci. Eng. 2019, 18, 94–105. [Google Scholar] [CrossRef]
  10. Chen, Q.; Xue, B.; Zhang, M. Genetic programming for instance transfer learning in symbolic regression. IEEE Trans. Cybern. 2020, 52, 25–38. [Google Scholar] [CrossRef] [PubMed]
  11. Bi, Y.; Xue, B.; Zhang, M. A Gaussian filter-based feature learning approach using genetic programming to image classification. In Proceedings of the AI 2018: Advances in Artificial Intelligence: 31st Australasian Joint Conference, Wellington, New Zealand, 11–14 December 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 251–257. [Google Scholar]
  12. Bi, Y.; Xue, B.; Zhang, M. Genetic Programming for Image Classification: An Automated Approach to Feature Learning; Springer Nature: Cham, Switzerland, 2021; Volume 24. [Google Scholar]
  13. Wu, M.; Li, M.; He, C.; Chen, H.; Wang, Y.; Li, Z. Facial Expression Recognition Based on Genetic Programming Learning CCA Fusion. In Proceedings of the 2022 5th International Conference on Pattern Recognition and Artificial Intelligence (PRAI), Chengdu, China, 19–21 August 2022; pp. 526–532. [Google Scholar]
  14. Bi, Y.; Xue, B.; Zhang, M. Genetic programming with image-related operators and a flexible program structure for feature learning in image classification. IEEE Trans. Evol. Comput. 2020, 25, 87–101. [Google Scholar] [CrossRef]
  15. Yang, L.; He, F.; Dai, L.; Zhang, L. An Automatical And Efficient Image Classification Based On Improved Genetic Programming. In Proceedings of the 2022 IEEE 25th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Hangzhou, China, 4–6 May 2022; pp. 477–483. [Google Scholar]
  16. Bi, Y.; Xue, B.; Zhang, M. Genetic programming-based discriminative feature learning for low-quality image classification. IEEE Trans. Cybern. 2021, 52, 8272–8285. [Google Scholar] [CrossRef]
  17. Montana, D.J. Strongly typed genetic programming. Evol. Comput. 1995, 3, 199–230. [Google Scholar] [CrossRef]
  18. Yang, S.; Zhang, L.; He, L.; Wen, Y. Sparse low-rank component-based representation for face recognition with low-quality images. IEEE Trans. Inf. Forensics Secur. 2018, 14, 251–261. [Google Scholar] [CrossRef]
  19. Abayomi-Alli, O.O.; Damaševičius, R.; Misra, S.; Maskeliūnas, R. Cassava disease recognition from low-quality images using enhanced data augmentation model and deep learning. Expert Syst. 2021, 38, e12746. [Google Scholar] [CrossRef]
  20. Gao, Y.; Gao, L.; Li, X. A generative adversarial network based deep learning method for low-quality defect image reconstruction and recognition. IEEE Trans. Ind. Inform. 2020, 17, 3231–3240. [Google Scholar] [CrossRef]
  21. Yadav, A.K.; Gupta, N.; Khan, A.; Jalal, A.S. Robust face recognition under partial occlusion based on local generic features. Int. J. Cogn. Inform. Nat. Intell. (IJCINI) 2021, 15, 47–57. [Google Scholar] [CrossRef]
  22. Attarmoghaddam, N.; Li, K.F. An area-efficient FPGA implementation of a real-time multi-class classifier for binary images. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 2306–2310. [Google Scholar] [CrossRef]
  23. Quoc, T.N.; Hoang, V.T. A new local image descriptor based on local and global color features for medicinal plant images classification. In Proceedings of the 2021 International Conference on Decision Aid Sciences and Application (DASA), Virtual, 7–8 December 2021; pp. 409–413. [Google Scholar]
  24. Wu, X.; Feng, Y.; Xu, H.; Lin, Z.; Chen, T.; Li, S.; Qiu, S.; Liu, Q.; Ma, Y.; Zhang, S. CTransCNN: Combining transformer and CNN in multilabel medical image classification. Knowl.-Based Syst. 2023, 281, 111030. [Google Scholar] [CrossRef]
  25. Han, Q.; Qian, X.; Xu, H.; Wu, K.; Meng, L.; Qiu, Z.; Weng, T.; Zhou, B.; Gao, X. DM-CNN: Dynamic Multi-scale Convolutional Neural Network with uncertainty quantification for medical image classification. Comput. Biol. Med. 2023, 168, 107758. [Google Scholar] [CrossRef]
  26. Shi, C.; Wu, H.; Wang, L. CEGAT: A CNN and enhanced-GAT based on key sample selection strategy for hyperspectral image classification. Neural Netw. 2023, 168, 105–122. [Google Scholar] [CrossRef]
  27. Atkins, D.; Neshatian, K.; Zhang, M. A domain independent genetic programming approach to automatic feature extraction for image classification. In Proceedings of the 2011 IEEE Congress of Evolutionary Computation (CEC), New Orleans, LA, USA, 5–8 June 2011; pp. 238–245. [Google Scholar]
  28. Evans, B.; Al-Sahaf, H.; Xue, B.; Zhang, M. Evolutionary deep learning: A genetic programming approach to image classification. In Proceedings of the 2018 IEEE Congress on Evolutionary Computation (CEC), Rio de Janeiro, Brazil, 8–13 July 2018; pp. 1–6. [Google Scholar]
  29. Bi, Y.; Zhang, M.; Xue, B. An automatic region detection and processing approach in genetic programming for binary image classification. In Proceedings of the 2017 International Conference on Image and Vision Computing New Zealand (IVCNZ), Christchurch, New Zealand, 4–6 December 2017; pp. 1–6. [Google Scholar]
  30. Shao, L.; Liu, L.; Li, X. Feature learning for image classification via multiobjective genetic programming. IEEE Trans. Neural Netw. Learn. Syst. 2013, 25, 1359–1371. [Google Scholar] [CrossRef]
  31. Bi, Y.; Xue, B.; Zhang, M. An effective feature learning approach using genetic programming with image descriptors for image classification [research frontier]. IEEE Comput. Intell. Mag. 2020, 15, 65–77. [Google Scholar] [CrossRef]
  32. Yan, Z.; Bi, Y.; Xue, B.; Zhang, M. Automatically extracting features using genetic programming for low-quality fish image classification. In Proceedings of the 2021 IEEE Congress on Evolutionary Computation (CEC), Krakow, Poland, 28 June–1 July 2021; pp. 2015–2022. [Google Scholar]
  33. Fan, Q.; Bi, Y.; Xue, B.; Zhang, M. Genetic programming for image classification: A new program representation with flexible feature reuse. IEEE Trans. Evol. Comput. 2023, 27, 460–474. [Google Scholar] [CrossRef]
  34. Bi, Y.; Xue, B.; Zhang, M. Genetic programming with a new representation to automatically learn features and evolve ensembles for image classification. IEEE Trans. Cybern. 2020, 51, 1769–1783. [Google Scholar] [CrossRef] [PubMed]
  35. Price, S.R.; Anderson, D.T.; Price, S.R. Goofed: Extracting advanced features for image classification via improved genetic programming. In Proceedings of the 2019 IEEE Congress on Evolutionary Computation (CEC), Wellington, New Zealand, 10–13 June 2019; pp. 1596–1603. [Google Scholar]
  36. Fan, Q.; Bi, Y.; Xue, B.; Zhang, M. Genetic programming for feature extraction and construction in image classification. Appl. Soft Comput. 2022, 118, 108509. [Google Scholar] [CrossRef]
  37. Vedaldi, A.; Fulkerson, B. VLFeat: An open and portable library of computer vision algorithms. In Proceedings of the 18th ACM international conference on Multimedia, Firenze, Italy, 25–29 October 2010; pp. 1469–1472. [Google Scholar]
  38. Thomaz, C.E. FEI Face Database. 2012. Available online: https://fei.edu.br/~cet/facedatabase.html (accessed on 29 January 2024). [Google Scholar]
  39. Lyons, M.; Akamatsu, S.; Kamachi, M.; Gyoba, J. Coding facial expressions with gabor wavelets. In Proceedings of the Third IEEE International Conference on Automatic Face and Gesture Recognition, Nara, Japan, 14–16 April 1998; pp. 200–205. [Google Scholar]
  40. Mallikarjuna, P.; Targhi, A.T.; Fritz, M.; Hayman, E.; Caputo, B.; Eklundh, J.O. The KTH-TIPS2 database. Comput. Vis. Act. Percept. Lab. Stock. Swed. 2006, 11, 12. [Google Scholar]
  41. Fei-Fei, L.; Fergus, R.; Perona, P. Learning generative visual models from few training examples: An incremental bayesian approach tested on 101 object categories. In Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop, Washington, DC, USA, 27 June–2 July 2004; p. 178. [Google Scholar]
  42. Samaria, F.S.; Harter, A.C. Parameterisation of a stochastic model for human face identification. In Proceedings of the 1994 IEEE Workshop on Applications of Computer Vision, Sarasota, FL, USA, 5–7 December 1994; pp. 138–142. [Google Scholar]
  43. Lee, G.; Gommers, R.; Waselewski, F.; Wohlfahrt, K.; O’Leary, A. PyWavelets: A Python package for wavelet analysis. J. Open Source Softw. 2019, 4, 1237. [Google Scholar] [CrossRef]
  44. Tao, D.; Li, X.; Wu, X.; Maybank, S.J. General tensor discriminant analysis and gabor features for gait recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 1700–1715. [Google Scholar] [CrossRef]
  45. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [Google Scholar] [CrossRef]
  46. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
  47. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  48. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
  49. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25. [Google Scholar] [CrossRef]
  50. Tan, M.; Chen, B.; Pang, R.; Vasudevan, V.; Sandler, M.; Howard, A.; Le, Q.V. Mnasnet: Platform-aware neural architecture search for mobile. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2820–2828. [Google Scholar]
  51. Iandola, F.N.; Han, S.; Moskewicz, M.W.; Ashraf, K.; Dally, W.J.; Keutzer, K. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv 2016, arXiv:1602.07360. [Google Scholar]
  52. Fortin, F.A.; De Rainville, F.M.; Gardner, M.A.G.; Parizeau, M.; Gagné, C. DEAP: Evolutionary algorithms made easy. J. Mach. Learn. Res. 2012, 13, 2171–2175. [Google Scholar]
  53. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  54. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
Figure 1. The training process and testing process of the proposed ITACIE-GP method.
Figure 2. The proposed ITACIE-GP program structure (IPF: Image Preprocessing Layer Function; REF: Region Extraction Layer Function; IFF: Image Fusion Layer Function; DF: Downsampling Layer Function; FEF: Feature Extraction Layer Function; FCF: Feature Construction Layer Function; TF: Transmission Layer Function; FCF: Feature Concatenation Layer Function).
Figure 3. Example images from the FEI_1, FEI_2, JAFFE, FLOWER, KTH, and ORL datasets.
Figure 4. Some examples from image datasets in four different scenarios. (a) Examples of blurred images with different standard deviations σ; (b) examples of low-light images with different contrasts; (c) examples of noisy images with different variances σ²; (d) examples of occluded images with obstructions of different sizes.
Figure 5. Convergence curves of EFLGP and ITACIE-GP on the FEI_1 dataset in five different scenarios.
Figure 6. Convergence curves of EFLGP and ITACIE-GP on the Flower dataset in five different scenarios.
Figure 7. Convergence curves of EFLGP and ITACIE-GP on the JAFFE dataset in five different scenarios.
Figure 8. Convergence curves of EFLGP and ITACIE-GP on the KTH dataset in five different scenarios.
Figure 9. Convergence curves of EFLGP and ITACIE-GP on the ORL dataset in five different scenarios.
Figure 10. Example program evolved from the FEI_1 dataset through ITACIE-GP in occlusion scenarios, along with its two visualization examples. (a) Example program; (b) visualization example 1; (c) visualization example 2.
Figure 11. Example program evolved from the JAFFE dataset through ITACIE-GP in the original scenarios, along with its two visualization examples.
Table 1. Methods for processing degraded images.
Yang et al. [18] proposed a new method for low-quality image face recognition, named Sparse Low-Rank Component-Based Representation (SLCR). This method effectively describes the features of low-quality facial samples by constructing a new dictionary and uses the minimum category reconstruction residual as the recognition rule. Experimental results show that SLCR outperforms other sparse representation-based methods in low-quality image face recognition.
Abayomi-Alli et al. [19] proposed a new data augmentation method that uses the convolution of Chebyshev orthogonal functions with probability distribution functions (PDFs). Experiments conducted on a degraded cassava leaf disease dataset demonstrated that this method effectively improved the accuracy of the enhanced MobileNetV2 network in identifying cassava leaf diseases.
Gao et al. [20] proposed a new method based on Generative Adversarial Networks (GANs), specifically designed for the recognition of degraded defective images. This method reconstructs degraded images using GAN technology and utilizes the VGG16 network for recognition. Experimental results indicate that this method significantly improves accuracy compared with other methods.
Table 2. Manual feature-based image classification methods.
Yadav et al. [21] developed a face recognition method for partially occluded scenarios, combining SIFT and MB-LBP techniques to utilize Local Generic Features and robust kernel classification. Tests on the AR database demonstrated its superior performance over existing technologies.
Attarmoghaddam et al. [22] developed a binary image multiclass classifier for embedded systems, combining HOG feature extraction with SVM classification. The integration of two binarization steps in HOG improves processing speed and resource efficiency without compromising accuracy.
Quoc et al. [23] introduced a novel approach combining SIFT, BoW, HOG, and Hu moments, integrating these extracted features with machine learning algorithms like SVM and RF for identifying different plant species from complex backgrounds.
Table 3. CNN-based image classification methods.
Wu et al. [24] developed CTransCNN, a hybrid multilabel medical image classification model combining CNN and Transformer, enhanced by MMAEF, MBR, and IIM modules, showing excellent performance and generalization in tests.
Han et al. [25] developed DM-CNN, a new CNN model for medical image diagnosis, improving feature extraction and addressing uncertainty quantification, showing superior performance in various medical tests.
Shi et al. [26] proposed CEGAT, a fusion network for hyperspectral image classification, integrating CNN with an enhanced graph attention network, using spectral discrimination and attention modules, and a key sample selection strategy, achieving superior performance with limited labels.
Table 4. GP-based binary image classification methods.
Atkins et al. [27] designed a three-layer structure (classification, aggregation, and filtering layers) where the order of the layers is fixed, but the number of functions within each layer is not limited, allowing for varying depths in each layer. Experiments have shown that this structure has comparable performance to GP-based classifiers with domain knowledge.
Evans et al. [28] proposed an image classification method that combines useful operators of CNNs, such as convolution and pooling, with GP. This approach fully utilizes the flexibility of GP, enabling the learning of convolution filter coefficients and the extraction of features from useful regions of images to construct classifiers.
Bi et al. [29] proposed an automatic region detection method aimed at identifying and extracting important regions in images. This method not only significantly reduces the dimensionality of image data, thereby saving computational costs, but also automatically extracts and constructs effective features for efficient image classification.
Table 5. GP-based multiclass image classification methods.
Shao et al. [30] developed a new GP-based image classification method that utilizes Multi-Objective Genetic Programming (MOGP) to automatically generate feature descriptors adaptable to various image domains. Tests on multiple datasets have shown that its classification accuracy significantly surpasses that of comparative methods.
Bi et al. [31] proposed a new GP-based program structure that automatically and simultaneously extracts both global and local features from original images. The performance of the proposed method is significantly superior to or on par with all comparative methods across four image classification datasets of varying difficulty.
Yan et al. [32] proposed a new GP-based method specifically for the classification of low-quality fish images. This method automatically selects image operators to process images and extract features. Its performance on a well-known fish image dataset is significantly better than various benchmark methods.
Bi et al. [16] proposed a new GP-based method for low-quality image classification, capable of extracting discriminative features from low-quality images. This method has been tested for performance in various scenarios, including original (clean), blur, low contrast, and noise scenarios. The test results demonstrate that this method excels in classifying low-quality images.
Fan et al. [33] designed a new program structure that allows for the flexible reuse of features generated by different nodes. Through this design, a combination of low-level and manual features has been achieved. The new method integrates classification algorithms into the program structure, enabling the automatic selection of suitable classification algorithms during the evolutionary process.
Bi et al. [34] integrated the selection and combination of base classifiers into a single tree, ultimately ensemble-classifying with multiple base classifiers, achieving higher generalization performance than a single classifier.
Table 6. Improvement methods for GP genetic operator.
Price et al. [35] introduced a unique adaptive mutation method, as well as a new crossover technique to control tree bloat during the evolutionary process, thereby enhancing classification performance.
Fan et al. [36] proposed a new mutation operator specifically targeting the control of program tree bloat. This operator dynamically adjusts the size of the program tree during the evolutionary process. Experimental results indicate that the method of dynamically adjusting the size of the program tree effectively enhances classification performance.
Table 7. Terminal set.
| Terminal | Type | Value Range | Description |
|---|---|---|---|
| Image Data | Array | [0, 1] | N images, each represented as a two-dimensional array, with pixel values normalized to the range [0, 1] |
| σ1 | Integer | [1, 3] | Standard deviation parameter for Gaussian filtering in the Gaussian and GauDeriv functions |
| σ2 | Integer | [1, 2] | Standard deviation parameter for Gaussian filtering in the LoG function |
| o1, o2 | Integer | [0, 2] | Orders of the Gaussian derivatives |
| Coordinate | Integer | ([0, W - 20], [0, H - 20]) | Coordinate of the top-left corner of the selected region, where W and H respectively represent the width and height of each image |
| Size | Integer | [20, 50] | Width and height of a rectangular region, or side length of a square region |
| K1, K2 | Integer | [2, 10] | Sliding window size required by the downsampling layer functions |
| m | Integer | {0, 1, 2} | Different values select different manual feature extraction methods |
| i | Float | (0, 1) | Random number, rounded to two decimal places |
| p | Integer | {0, 1} | If p is 1, the manual features generated by the feature construction layer are sent to the feature concatenation layer |
Table 8. Image preprocessing stage function set.
| Function | Input | Output | Description |
|---|---|---|---|
| Roberts | Image Data | Image Data | Roberts operator for edge detection |
| Prewitt | Image Data | Image Data | Prewitt operator for edge detection |
| Relu | Image Data | Image Data | Rectified Linear Unit function |
| Sqrt | Image Data | Image Data | Square root of each pixel |
| Mean | Image Data | Image Data | 3 × 3 mean filter |
| Min | Image Data | Image Data | 3 × 3 min filter |
| Max | Image Data | Image Data | 3 × 3 max filter |
| Med | Image Data | Image Data | 3 × 3 median filter |
| Sobel | Image Data | Image Data | Sobel edge detector |
| LoG | Image Data, σ2 | Image Data | Laplacian of Gaussian filter |
| Abs | Image Data | Image Data | Absolute value of each pixel |
| Lap | Image Data | Image Data | Laplacian filter |
| Gaussian | Image Data, σ1 | Image Data | Gaussian filter |
| GauDeriv | Image Data, σ1, o1, o2 | Image Data | Gaussian derivative filter with orders o1 and o2 |
| Equalize | Image Data | Image Data | Histogram equalization |
| Region_R [12] | Image Data, Coordinate, Size | Region | Extracts rectangular regions from image data |
| Region_S [12] | Image Data, Coordinate, Size | Region | Extracts square regions from image data |
| Min_F | 2 Region | Region | Selects the minimum of corresponding pixels to achieve image fusion |
| Median_F | 2 Region | Region | Selects the median of corresponding pixels to achieve image fusion |
| Mean_F | 2 Region | Region | Averages corresponding pixels to achieve image fusion |
| Max_P | Region, K1, K2 | Tuple1 (Original Region, Downsampled Region) | Max pooling on the region |
| Ave_P | Region, K1, K2 | Tuple1 (Original Region, Downsampled Region) | Average pooling on the region |
| Bilin_D | Region, K1, K2 | Tuple1 (Original Region, Downsampled Region) | Bilinear downsampling on the region |
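For illustration, the following NumPy sketch re-implements three of the functions above purely from their textual descriptions in Table 8; the actual ITACIE-GP operators may differ in details such as padding and output sizes.

```python
# Hedged re-implementations of Min_F, Median_F, Max_P, and Bilin_D from
# Table 8; these follow the table's descriptions, not the released code.
import numpy as np
from skimage.measure import block_reduce
from skimage.transform import resize

def Min_F(region_a, region_b):
    # Pixel-wise minimum of two equally sized regions (image fusion).
    return np.minimum(region_a, region_b)

def Median_F(region_a, region_b):
    # Pixel-wise median of the corresponding pixels of two regions.
    return np.median(np.stack([region_a, region_b]), axis=0)

def Max_P(region, k1, k2):
    # Max pooling with a k1 x k2 window; returns the (original region,
    # downsampled region) pair that Table 8 calls Tuple1.
    return region, block_reduce(region, (k1, k2), np.max)

def Bilin_D(region, k1, k2):
    # Bilinear downsampling by factors k1 and k2 (one plausible reading).
    h, w = region.shape
    return region, resize(region, (max(1, h // k1), max(1, w // k2)), order=1)
```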
Table 9. Feature construction and classification stage function set.
| Function | Input | Output | Description |
|---|---|---|---|
| Feature_E | Tuple1, m | Tuple2 (Original Region, Concatenated Features, t) | Extracts manual features and pixel features |
| Feature_C | Tuple2, m, i | Tuple2 (Original Region, Constructed Features, Handcrafted Features) | Constructs new features |
| Transmit_F | Tuple2 | Vector | Transmits feature information to subsequent layers |
| Concat2 | Vector | Vector | Concatenates vectors into a single feature vector |
| SVM | Vector | Label | SVM classification algorithm |
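Because the function and terminal sets are strongly typed, they map naturally onto DEAP's strongly typed GP [52]. The sketch below shows one plausible registration of a few of the layers; the placeholder lambdas, the Coord type, and the coordinate range are illustrative assumptions, not the released code.

```python
# A hedged sketch of registering part of the typed function set (Tables 7-9)
# with DEAP's strongly typed GP.
import random
from deap import gp

class Img: pass        # Image Data
class Coord: pass      # Coordinate terminal
class Region: pass
class Tuple1: pass     # (Original Region, Downsampled Region)
class Tuple2: pass     # (Original Region, Features, ...)
class Vector: pass

pset = gp.PrimitiveSetTyped("MAIN", [Img], Vector)  # ARG0 is the input image
pset.addPrimitive(lambda i, c, s: None, [Img, Coord, int], Region, name="Region_S")
pset.addPrimitive(lambda a, b: None, [Region, Region], Region, name="Median_F")
pset.addPrimitive(lambda r, k1, k2: None, [Region, int, int], Tuple1, name="Max_P")
pset.addPrimitive(lambda t, m: None, [Tuple1, int], Tuple2, name="Feature_E")
pset.addPrimitive(lambda t: None, [Tuple2], Vector, name="Transmit_F")
pset.addPrimitive(lambda a, b: None, [Vector, Vector], Vector, name="Concat2")
pset.addEphemeralConstant("Coordinate", lambda: (random.randint(0, 80),
                                                 random.randint(0, 80)), Coord)
pset.addEphemeralConstant("K", lambda: random.randint(2, 10), int)
# In the full system each terminal in Table 7 would get its own type so that,
# e.g., K1/K2 and m cannot be interchanged; a single int type is used here
# only to keep the sketch short, and the lambdas stand in for real operators.
```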
Table 10. Image dataset parameters.
| Dataset | Image Size | Training Set Size | Test Set Size | # Class |
|---|---|---|---|---|
| FEI1 | 180 × 130 | 150 | 50 | 2 |
| FEI2 | 180 × 130 | 150 | 50 | 2 |
| Jaffe | 128 × 128 | 140 | 73 | 7 |
| Flower | 100 × 100 | 113 | 38 | 2 |
| KTH | 100 × 100 | 600 | 210 | 10 |
| ORL | 92 × 112 | 280 | 120 | 40 |
Table 11. Degraded image dataset parameters.
| Degradation | Description | Value |
|---|---|---|
| Blur | Gaussian blur with different standard deviations σ is applied to each image with a random probability of 25% | [1, 2, 4, 6] |
| Low-Contrast | Each image's contrast is randomly reduced with a 25% probability | [0, 0.8], [0, 0.6], [0, 0.4], [0, 0.2] |
| Noise | Gaussian noise with different variances σ² is randomly added to each image with a 25% probability | [0.01, 0.1, 0.5, 1] |
| Occlusion | Square occlusions of varying sizes are randomly introduced at different positions on each image with a 25% probability | [0.15, 0.30, 0.45, 0.60] |
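A hedged sketch of how the four degradations in Table 11 could be generated is given below; details such as zero-valued occlusion patches and interpreting the occlusion values as fractions of the image side are assumptions made for illustration.

```python
# Each corruption is applied independently with probability 0.25, and its
# level is drawn from the value list in Table 11 (a sketch, not the authors'
# exact generation code).
import numpy as np
from scipy.ndimage import gaussian_filter

rng = np.random.default_rng(0)

def degrade(img):
    out = img.copy()                       # img: float array in [0, 1]
    if rng.random() < 0.25:                # blur
        out = gaussian_filter(out, sigma=rng.choice([1, 2, 4, 6]))
    if rng.random() < 0.25:                # low contrast: squeeze into [0, c]
        out = out * rng.choice([0.8, 0.6, 0.4, 0.2])
    if rng.random() < 0.25:                # additive Gaussian noise
        var = rng.choice([0.01, 0.1, 0.5, 1.0])
        out = np.clip(out + rng.normal(0, np.sqrt(var), out.shape), 0, 1)
    if rng.random() < 0.25:                # square occlusion (assumed zeros)
        h, w = out.shape
        s = int(rng.choice([0.15, 0.30, 0.45, 0.60]) * min(h, w))
        y, x = rng.integers(0, h - s), rng.integers(0, w - s)
        out[y:y + s, x:x + s] = 0.0
    return out
```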
Table 12. GP parameters.
| Parameter | Value |
|---|---|
| Population size | 250 |
| Generations | 50 |
| Crossover rate | 0.8 |
| Mutation rate | 0.19 |
| Elitism rate | 0.01 |
| Tree depth | 2-11 |
| Selection method | Tournament selection |
| Tournament size | 7 |
| Initialization method | Ramped half-and-half |
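The following sketch wires the Table 12 parameters into a DEAP toolbox [52], reusing the pset from the earlier registration sketch; the fitness evaluation and elitism mechanism are placeholders for illustration only.

```python
# Hedged DEAP wiring for the Table 12 settings; not the authors' script.
from deap import base, creator, gp, tools

creator.create("FitnessMax", base.Fitness, weights=(1.0,))   # maximize accuracy
creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
# Ramped half-and-half initialization with tree depth in [2, 11] (Table 12).
toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=2, max_=11)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("select", tools.selTournament, tournsize=7)  # tournament of 7
toolbox.register("mate", gp.cxOnePoint)                       # subtree crossover
toolbox.register("expr_mut", gp.genFull, min_=0, max_=2)
toolbox.register("mutate", gp.mutUniform, expr=toolbox.expr_mut, pset=pset)

POP_SIZE, N_GEN = 250, 50                # population size and generations
CXPB, MUTPB, ELITISM = 0.8, 0.19, 0.01   # rates from Table 12
# The evolutionary loop would evaluate training accuracy (via the SVM at the
# tree root), copy the top ELITISM fraction each generation, and apply
# crossover/mutation with probabilities CXPB and MUTPB.
```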
Table 13. Classification accuracy (%) of ITACIE-GP and 14 comparison methods across 6 datasets in original scenarios.
| Methods | FEI_1 Mean ± Std (W-Test) | FEI_2 Mean ± Std (W-Test) | FLOWER Mean ± Std (W-Test) | JAFFE Mean ± Std (W-Test) | KTH Mean ± Std (W-Test) | ORL Mean ± Std (W-Test) |
|---|---|---|---|---|---|---|
| SVM + Gabor | 83.86 ± 5.40 (+) | 80.26 ± 5.72 (+) | 80.52 ± 7.06 (+) | 47.90 ± 4.19 (+) | 41.15 ± 2.56 (+) | 66.02 ± 3.27 (+) |
| SVM + Hist | 45.40 ± 7.72 (+) | 41.53 ± 5.74 (+) | 53.07 ± 6.26 (+) | 22.55 ± 3.55 (+) | 52.90 ± 3.03 (+) | 92.83 ± 2.02 (+) |
| SVM + HOG | 93.40 ± 2.97 (+) | 86.86 ± 4.09 (+) | 60.61 ± 6.12 (+) | 78.94 ± 3.72 (+) | 49.52 ± 2.87 (+) | 96.19 ± 1.47 (+) |
| SVM + LBP | 67.46 ± 5.53 (+) | 68.46 ± 7.45 (+) | 67.45 ± 6.53 (+) | 23.88 ± 4.87 (+) | 88.76 ± 2.70 (+) | 92.66 ± 1.42 (+) |
| SVM + SIFT | 89.06 ± 3.99 (+) | 90.53 ± 3.85 (+) | 84.12 ± 5.71 (+) | 77.02 ± 5.65 (+) | 80.57 ± 1.99 (+) | 99.13 ± 0.94 (=) |
| SVM + DWT | 93.00 ± 3.70 (+) | 89.73 ± 4.15 (+) | 85.52 ± 5.97 (+) | 75.79 ± 5.29 (+) | 45.42 ± 2.09 (+) | 96.44 ± 1.45 (+) |
| LeNet-5 | 89.87 ± 3.58 (+) | 87.07 ± 3.64 (+) | 81.84 ± 6.01 (+) | 82.51 ± 6.69 (+) | 67.70 ± 3.68 (+) | 94.97 ± 2.54 (+) |
| CNN-5 | 90.80 ± 3.78 (+) | 87.73 ± 5.05 (+) | 85.44 ± 4.64 (+) | 81.42 ± 5.05 (+) | 81.81 ± 2.88 (+) | 97.39 ± 1.36 (+) |
| MnasNet | 87.86 ± 4.06 (+) | 85.86 ± 4.25 (+) | 91.57 ± 4.67 (=) | 75.79 ± 5.42 (+) | 96.95 ± 1.19 (-) | 96.08 ± 2.47 (+) |
| VGG | 81.60 ± 5.82 (+) | 79.40 ± 3.93 (+) | 92.89 ± 3.33 (-) | 72.42 ± 5.34 (+) | 97.82 ± 0.98 (-) | 95.52 ± 1.76 (+) |
| AlexNet | 88.60 ± 4.26 (+) | 86.40 ± 3.40 (+) | 88.77 ± 4.23 (+) | 78.67 ± 6.23 (+) | 95.77 ± 1.22 (-) | 96.25 ± 1.66 (+) |
| ResNet | 89.86 ± 3.93 (+) | 88.80 ± 2.94 (+) | 94.47 ± 3.28 (-) | 78.40 ± 5.13 (+) | 98.15 ± 0.84 (-) | 97.88 ± 1.09 (+) |
| SqueezeNet | 85.00 ± 4.37 (+) | 84.26 ± 4.46 (+) | 83.15 ± 4.23 (+) | 74.29 ± 5.12 (+) | 95.90 ± 1.12 (-) | 95.19 ± 2.13 (+) |
| EFLGP | 93.06 ± 3.95 (+) | 88.86 ± 4.65 (+) | 85.26 ± 4.06 (+) | 83.42 ± 5.20 (+) | 78.25 ± 5.89 (+) | 95.61 ± 2.74 (+) |
| ITACIE-GP | 95.06 ± 2.51 | 93.93 ± 3.48 | 90.96 ± 4.74 | 88.58 ± 4.26 | 93.25 ± 1.88 | 99.05 ± 0.74 |
| +/=/- | 14/0/0 | 14/0/0 | 11/1/2 | 14/0/0 | 9/0/5 | 13/1/0 |
Table 14. Classification accuracy (%) of ITACIE-GP and 14 comparison methods across 6 datasets in blur scenarios.
| Methods | FEI_1 Mean ± Std (W-Test) | FEI_2 Mean ± Std (W-Test) | FLOWER Mean ± Std (W-Test) | JAFFE Mean ± Std (W-Test) | KTH Mean ± Std (W-Test) | ORL Mean ± Std (W-Test) |
|---|---|---|---|---|---|---|
| SVM + Gabor | 80.46 ± 5.05 (+) | 80.20 ± 5.39 (+) | 81.40 ± 5.16 (+) | 45.57 ± 4.20 (+) | 40.80 ± 3.61 (+) | 72.47 ± 4.48 (+) |
| SVM + Hist | 43.73 ± 7.21 (+) | 41.73 ± 6.74 (+) | 54.03 ± 6.43 (+) | 18.86 ± 4.76 (+) | 40.95 ± 3.12 (+) | 73.80 ± 3.21 (+) |
| SVM + HOG | 90.06 ± 3.98 (+) | 84.46 ± 5.00 (+) | 60.96 ± 6.55 (+) | 53.37 ± 5.58 (+) | 42.28 ± 2.61 (+) | 89.05 ± 2.69 (+) |
| SVM + LBP | 58.73 ± 6.63 (+) | 64.53 ± 5.50 (+) | 74.12 ± 6.14 (+) | 25.66 ± 4.46 (+) | 75.88 ± 2.55 (+) | 76.80 ± 3.83 (+) |
| SVM + SIFT | 88.20 ± 3.87 (+) | 87.73 ± 5.08 (+) | 83.07 ± 6.61 (+) | 65.97 ± 5.01 (+) | 72.44 ± 2.58 (+) | 94.94 ± 1.82 (+) |
| SVM + DWT | 92.06 ± 3.34 (+) | 88.26 ± 4.12 (+) | 84.38 ± 5.21 (+) | 71.32 ± 5.27 (+) | 44.39 ± 6.17 (+) | 96.44 ± 1.51 (+) |
| LeNet-5 | 89.27 ± 5.30 (+) | 89.93 ± 4.75 (+) | 83.16 ± 7.08 (+) | 75.84 ± 5.66 (+) | 60.41 ± 4.54 (+) | 89.93 ± 4.75 (+) |
| CNN-5 | 89.07 ± 3.99 (+) | 87.67 ± 4.41 (+) | 86.75 ± 5.50 (+) | 71.60 ± 4.60 (+) | 69.14 ± 2.45 (+) | 97.50 ± 1.60 (=) |
| MnasNet | 77.06 ± 5.13 (+) | 74.33 ± 4.76 (+) | 86.84 ± 4.45 (=) | 47.76 ± 5.04 (+) | 85.38 ± 2.62 (=) | 69.97 ± 3.47 (+) |
| VGG | 74.06 ± 6.20 (+) | 71.60 ± 8.39 (+) | 88.50 ± 5.06 (=) | 46.94 ± 5.18 (+) | 81.98 ± 2.34 (+) | 73.00 ± 3.87 (+) |
| AlexNet | 80.73 ± 4.82 (+) | 74.80 ± 4.83 (+) | 83.42 ± 5.09 (+) | 54.52 ± 5.98 (+) | 82.04 ± 2.10 (+) | 72.25 ± 3.65 (+) |
| ResNet | 79.66 ± 4.79 (+) | 80.73 ± 3.74 (+) | 85.96 ± 4.91 (+) | 55.06 ± 5.98 (+) | 90.52 ± 1.84 (-) | 82.61 ± 3.70 (+) |
| SqueezeNet | 78.66 ± 3.87 (+) | 77.93 ± 4.39 (+) | 85.00 ± 5.73 (+) | 54.52 ± 5.98 (+) | 83.38 ± 2.45 (+) | 75.72 ± 2.54 (+) |
| EFLGP | 91.73 ± 3.99 (+) | 89.60 ± 5.44 (+) | 82.89 ± 6.28 (+) | 76.29 ± 5.73 (+) | 58.85 ± 7.64 (+) | 88.55 ± 4.10 (+) |
| ITACIE-GP | 93.93 ± 3.48 | 92.53 ± 4.55 | 87.89 ± 5.33 | 80.45 ± 5.36 | 84.90 ± 2.72 | 97.30 ± 1.32 |
| +/=/- | 14/0/0 | 14/0/0 | 12/2/0 | 14/0/0 | 13/1/1 | 13/1/0 |
Table 15. Classification accuracy (%) of ITACIE-GP and 14 comparison methods across 6 datasets in low-contrast scenarios.
| Methods | FEI_1 Mean ± Std (W-Test) | FEI_2 Mean ± Std (W-Test) | FLOWER Mean ± Std (W-Test) | JAFFE Mean ± Std (W-Test) | KTH Mean ± Std (W-Test) | ORL Mean ± Std (W-Test) |
|---|---|---|---|---|---|---|
| SVM + Gabor | 72.33 ± 6.78 (+) | 78.20 ± 7.50 (+) | 73.68 ± 7.24 (+) | 34.79 ± 5.58 (+) | 28.27 ± 2.85 (+) | 45.08 ± 3.56 (+) |
| SVM + Hist | 48.86 ± 6.27 (+) | 48.13 ± 4.60 (+) | 49.91 ± 5.84 (+) | 14.29 ± 3.51 (+) | 15.66 ± 2.25 (+) | 7.83 ± 2.19 (+) |
| SVM + HOG | 88.06 ± 3.50 (+) | 88.13 ± 4.13 (+) | 55.61 ± 7.68 (+) | 50.45 ± 5.71 (+) | 31.33 ± 2.79 (+) | 85.75 ± 3.31 (+) |
| SVM + LBP | 56.40 ± 5.83 (+) | 62.86 ± 5.79 (+) | 76.93 ± 5.72 (+) | 21.09 ± 3.57 (+) | 73.03 ± 2.88 (+) | 82.75 ± 2.39 (+) |
| SVM + SIFT | 85.06 ± 4.38 (+) | 84.26 ± 3.47 (+) | 82.89 ± 7.17 (+) | 74.33 ± 5.82 (+) | 76.57 ± 2.55 (+) | 97.22 ± 1.35 (=) |
| SVM + DWT | 75.26 ± 7.41 (+) | 76.66 ± 8.05 (+) | 76.75 ± 7.34 (+) | 54.47 ± 5.53 (+) | 29.46 ± 7.10 (+) | 75.97 ± 7.60 (+) |
| LeNet-5 | 82.33 ± 8.53 (+) | 84.47 ± 5.97 (+) | 82.81 ± 6.33 (+) | 61.83 ± 11.34 (+) | 51.59 ± 3.76 (+) | 76.87 ± 9.00 (+) |
| CNN-5 | 80.53 ± 4.90 (+) | 80.27 ± 5.13 (+) | 81.32 ± 5.82 (+) | 52.65 ± 7.50 (+) | 62.52 ± 3.94 (+) | 88.97 ± 2.89 (+) |
| MnasNet | 60.40 ± 5.17 (+) | 59.40 ± 7.00 (+) | 70.61 ± 7.56 (+) | 27.12 ± 4.75 (+) | 64.28 ± 2.57 (+) | 50.22 ± 4.17 (+) |
| VGG | 59.66 ± 6.02 (+) | 60.66 ± 5.52 (+) | 69.47 ± 5.24 (+) | 33.37 ± 4.19 (+) | 59.50 ± 3.12 (+) | 51.97 ± 4.47 (+) |
| AlexNet | 64.53 ± 6.61 (+) | 62.13 ± 7.88 (+) | 74.91 ± 8.42 (+) | 31.59 ± 5.59 (+) | 66.79 ± 2.96 (+) | 47.38 ± 3.68 (+) |
| ResNet | 58.00 ± 4.92 (+) | 59.00 ± 7.24 (+) | 79.91 ± 4.87 (+) | 41.00 ± 4.70 (+) | 72.49 ± 2.34 (+) | 68.13 ± 5.22 (+) |
| SqueezeNet | 65.66 ± 7.06 (+) | 64.93 ± 5.94 (+) | 75.00 ± 4.33 (+) | 31.59 ± 5.59 (+) | 62.50 ± 3.91 (+) | 60.13 ± 4.25 (+) |
| EFLGP | 90.93 ± 4.47 (=) | 88.66 ± 3.37 (=) | 81.22 ± 6.24 (+) | 74.70 ± 6.52 (+) | 64.65 ± 6.98 (+) | 88.55 ± 4.10 (+) |
| ITACIE-GP | 91.33 ± 3.66 | 90.00 ± 3.42 | 86.92 ± 4.83 | 83.11 ± 4.80 | 86.38 ± 1.90 | 96.08 ± 1.92 |
| +/=/- | 13/1/0 | 13/1/0 | 14/0/0 | 14/0/0 | 14/0/0 | 13/1/0 |
Table 16. Classification accuracy (%) of ITACIE-GP and 14 comparison methods across 6 datasets in noise scenarios.
| Methods | FEI_1 Mean ± Std (W-Test) | FEI_2 Mean ± Std (W-Test) | FLOWER Mean ± Std (W-Test) | JAFFE Mean ± Std (W-Test) | KTH Mean ± Std (W-Test) | ORL Mean ± Std (W-Test) |
|---|---|---|---|---|---|---|
| SVM + Gabor | 72.33 ± 5.80 (+) | 68.93 ± 7.38 (+) | 79.64 ± 6.69 (+) | 30.09 ± 4.61 (+) | 28.30 ± 3.06 (+) | 28.33 ± 2.86 (+) |
| SVM + Hist | 53.40 ± 6.08 (+) | 52.13 ± 5.91 (+) | 54.56 ± 6.62 (+) | 11.78 ± 3.59 (+) | 33.90 ± 2.59 (+) | 17.08 ± 2.68 (+) |
| SVM + HOG | 47.13 ± 6.27 (+) | 53.06 ± 6.34 (+) | 55.17 ± 8.10 (+) | 24.43 ± 3.67 (+) | 21.03 ± 2.14 (+) | 22.63 ± 3.14 (+) |
| SVM + LBP | 50.20 ± 6.58 (+) | 48.40 ± 6.41 (+) | 62.89 ± 7.29 (+) | 15.38 ± 4.28 (+) | 39.71 ± 2.74 (+) | 6.52 ± 2.14 (+) |
| SVM + SIFT | 76.00 ± 4.98 (+) | 72.26 ± 5.77 (+) | 81.14 ± 6.98 (+) | 44.52 ± 6.13 (+) | 44.09 ± 2.98 (+) | 40.25 ± 3.79 (+) |
| SVM + DWT | 88.93 ± 3.00 (+) | 84.53 ± 4.54 (+) | 80.96 ± 7.49 (+) | 47.17 ± 5.63 (+) | 31.60 ± 2.51 (+) | 70.05 ± 3.96 (+) |
| LeNet-5 | 85.67 ± 6.83 (+) | 76.87 ± 9.00 (+) | 83.77 ± 6.00 (+) | 49.31 ± 4.70 (+) | 31.81 ± 2.80 (+) | 76.87 ± 9.00 (+) |
| CNN-5 | 84.27 ± 6.49 (+) | 81.40 ± 4.98 (+) | 82.46 ± 6.77 (+) | 45.89 ± 6.06 (+) | 50.33 ± 3.54 (=) | 79.47 ± 6.62 (+) |
| MnasNet | 61.93 ± 7.32 (+) | 57.53 ± 5.67 (+) | 76.75 ± 7.65 (+) | 19.95 ± 4.98 (+) | 48.46 ± 2.39 (+) | 16.36 ± 2.97 (+) |
| VGG | 53.26 ± 7.07 (+) | 54.46 ± 6.42 (+) | 75.70 ± 4.54 (+) | 19.86 ± 4.62 (+) | 47.84 ± 2.13 (+) | 16.19 ± 2.22 (+) |
| AlexNet | 51.06 ± 6.54 (+) | 55.53 ± 5.76 (+) | 79.73 ± 5.65 (+) | 24.79 ± 4.81 (+) | 46.41 ± 2.89 (+) | 11.83 ± 2.33 (+) |
| ResNet | 62.06 ± 4.80 (+) | 57.00 ± 5.65 (+) | 80.87 ± 6.15 (+) | 26.84 ± 4.41 (+) | 51.61 ± 2.49 (=) | 24.86 ± 3.40 (+) |
| SqueezeNet | 58.60 ± 6.71 (+) | 57.60 ± 5.47 (+) | 77.89 ± 5.28 (+) | 24.79 ± 4.81 (+) | 48.30 ± 3.24 (+) | 17.11 ± 3.31 (+) |
| EFLGP | 91.06 ± 4.24 (+) | 86.20 ± 4.74 (+) | 82.10 ± 5.83 (+) | 62.82 ± 6.55 (+) | 36.66 ± 4.15 (+) | 76.05 ± 9.83 (+) |
| ITACIE-GP | 93.00 ± 2.40 | 88.33 ± 3.34 | 85.78 ± 4.68 | 67.85 ± 4.79 | 51.77 ± 3.94 | 89.55 ± 4.05 |
| +/=/- | 14/0/0 | 14/0/0 | 14/0/0 | 14/0/0 | 12/2/0 | 14/0/0 |
Table 17. Classification accuracy (%) of ITACIE-GP and 14 comparison methods across 6 datasets in occlusion scenarios.
| Methods | FEI_1 Mean ± Std (W-Test) | FEI_2 Mean ± Std (W-Test) | FLOWER Mean ± Std (W-Test) | JAFFE Mean ± Std (W-Test) | KTH Mean ± Std (W-Test) | ORL Mean ± Std (W-Test) |
|---|---|---|---|---|---|---|
| SVM + Gabor | 56.20 ± 6.24 (+) | 60.33 ± 6.46 (+) | 59.12 ± 6.55 (+) | 23.06 ± 4.83 (+) | 26.78 ± 2.64 (+) | 22.47 ± 3.12 (+) |
| SVM + Hist | 47.07 ± 6.38 (+) | 43.13 ± 6.12 (+) | 51.93 ± 7.15 (+) | 18.13 ± 4.70 (+) | 49.76 ± 2.78 (+) | 86.89 ± 2.37 (+) |
| SVM + HOG | 68.60 ± 4.40 (+) | 63.13 ± 5.98 (+) | 54.30 ± 5.96 (+) | 21.28 ± 4.67 (+) | 30.97 ± 3.37 (+) | 63.67 ± 3.66 (+) |
| SVM + LBP | 62.27 ± 6.25 (+) | 61.13 ± 6.64 (+) | 66.32 ± 6.42 (+) | 16.76 ± 3.56 (+) | 84.94 ± 2.29 (+) | 80.44 ± 4.17 (+) |
| SVM + SIFT | 70.80 ± 6.38 (+) | 64.93 ± 5.43 (+) | 74.39 ± 7.05 (+) | 29.91 ± 4.27 (+) | 59.92 ± 2.74 (+) | 85.47 ± 3.12 (+) |
| SVM + DWT | 74.87 ± 4.19 (+) | 66.40 ± 6.59 (+) | 68.51 ± 7.36 (+) | 42.74 ± 4.54 (+) | 28.44 ± 3.02 (+) | 79.50 ± 2.40 (+) |
| LeNet-5 | 64.60 ± 11.02 (+) | 62.93 ± 10.69 (+) | 64.74 ± 7.40 (+) | 32.97 ± 4.36 (+) | 33.41 ± 3.90 (+) | 67.69 ± 7.06 (+) |
| CNN-5 | 66.40 ± 6.70 (+) | 61.60 ± 7.27 (+) | 66.66 ± 8.63 (+) | 36.80 ± 4.52 (+) | 58.49 ± 3.60 (+) | 87.02 ± 3.21 (+) |
| MnasNet | 76.73 ± 5.83 (+) | 74.33 ± 6.45 (+) | 87.28 ± 4.60 (-) | 21.19 ± 3.51 (+) | 83.25 ± 2.37 (+) | 64.17 ± 4.92 (+) |
| VGG | 76.07 ± 4.83 (+) | 65.53 ± 5.77 (+) | 78.16 ± 5.77 (+) | 24.61 ± 4.22 (+) | 84.63 ± 2.44 (+) | 52.17 ± 4.03 (+) |
| AlexNet | 77.40 ± 5.27 (+) | 68.13 ± 5.63 (+) | 77.11 ± 7.51 (=) | 29.36 ± 4.30 (+) | 69.05 ± 2.75 (+) | 38.61 ± 2.82 (+) |
| ResNet | 79.07 ± 3.60 (+) | 71.33 ± 5.02 (+) | 86.05 ± 5.81 (-) | 23.65 ± 4.27 (+) | 85.92 ± 2.45 (=) | 69.33 ± 4.66 (+) |
| SqueezeNet | 79.00 ± 4.52 (+) | 66.67 ± 6.48 (+) | 75.44 ± 7.14 (+) | 25.43 ± 4.06 (+) | 72.22 ± 3.45 (+) | 61.81 ± 3.45 (+) |
| EFLGP | 83.26 ± 4.77 (+) | 74.26 ± 5.05 (+) | 62.19 ± 8.70 (+) | 41.00 ± 6.98 (+) | 60.39 ± 9.19 (+) | 80.05 ± 7.72 (+) |
| ITACIE-GP | 90.13 ± 3.82 | 84.00 ± 4.28 | 77.28 ± 6.56 | 50.13 ± 4.80 | 86.23 ± 2.45 | 97.47 ± 1.43 |
| +/=/- | 14/0/0 | 14/0/0 | 11/1/2 | 14/0/0 | 13/1/0 | 14/0/0 |
Table 18. Ablation experiment.
| Dataset | ITACIE-GP Mean ± Std | ACIE-GP Mean ± Std (W-Test) | ITIE-GP Mean ± Std (W-Test) | ITAC-GP Mean ± Std (W-Test) |
|---|---|---|---|---|
| FEI_1 | 94.40 ± 2.65 | 92.40 ± 2.33 (+) | 94.40 ± 2.65 (=) | 93.60 ± 2.33 (+) |
| FEI_1_Blur | 93.20 ± 2.40 | 90.80 ± 1.60 (+) | 92.00 ± 1.26 (+) | 93.20 ± 3.25 (+) |
| FEI_1_Low | 92.00 ± 1.79 | 87.60 ± 4.45 (+) | 91.20 ± 4.31 (+) | 90.80 ± 2.71 (+) |
| FEI_1_Noise | 91.60 ± 3.44 | 89.60 ± 4.45 (+) | 89.60 ± 4.63 (+) | 91.60 ± 2.71 (=) |
| JAFFE | 91.23 ± 1.86 | 85.75 ± 3.63 (+) | 89.04 ± 5.20 (+) | 90.41 ± 2.29 (+) |
| JAFFE_Blur | 84.11 ± 2.39 | 82.47 ± 4.62 (+) | 82.19 ± 2.74 (+) | 83.56 ± 3.00 (+) |
| JAFFE_Low | 84.93 ± 5.27 | 82.19 ± 4.67 (+) | 84.66 ± 5.44 (+) | 84.66 ± 5.44 (+) |
| JAFFE_Noise | 67.12 ± 3.87 | 63.84 ± 6.46 (+) | 66.30 ± 3.93 (+) | 67.12 ± 3.87 (=) |
| FLOWER | 87.89 ± 4.88 | 85.96 ± 3.28 (+) | 86.32 ± 5.86 (+) | 86.84 ± 4.99 (+) |
| FLOWER_Blur | 90.00 ± 3.87 | 85.26 ± 3.94 (+) | 87.37 ± 1.97 (+) | 89.47 ± 4.40 (+) |
| FLOWER_Low | 86.84 ± 3.33 | 85.26 ± 4.88 (+) | 78.95 ± 2.35 (+) | 87.37 ± 3.49 (-) |
| FLOWER_Noise | 84.74 ± 4.53 | 83.68 ± 4.82 (+) | 83.68 ± 1.97 (+) | 84.74 ± 4.53 (=) |
| ORL | 99.00 ± 0.33 | 96.83 ± 1.62 (+) | 98.67 ± 0.67 (+) | 99.00 ± 0.62 (+) |
| ORL_Blur | 90.00 ± 3.87 | 85.96 ± 3.28 (+) | 87.37 ± 1.97 (+) | 89.47 ± 4.40 (+) |
| ORL_Low | 96.83 ± 0.97 | 95.67 ± 1.93 (+) | 95.67 ± 1.78 (+) | 97.00 ± 1.00 (-) |
| ORL_Noise | 91.83 ± 2.26 | 88.83 ± 2.51 (+) | 89.83 ± 3.78 (+) | 91.83 ± 2.26 (=) |

Overall W-Test summary across the three ablated variants: 41 (+), 5 (=), 2 (-).