Article

A New Parallel Cuckoo Flower Search Algorithm for Training Multi-Layer Perceptron

1
Faculty of Physics and Applied Computer Science, AGH University of Science & Technology, 30-059 Krakow, Poland
2
MEU Research Unit, Middle East University, Amman 11813, Jordan
3
University Centre for Research and Development, Chandigarh University, Mohali 140413, India
*
Author to whom correspondence should be addressed.
Mathematics 2023, 11(14), 3080; https://doi.org/10.3390/math11143080
Submission received: 21 June 2023 / Revised: 7 July 2023 / Accepted: 10 July 2023 / Published: 12 July 2023
(This article belongs to the Special Issue Biologically Inspired Computing, 2nd Edition)

Abstract

This paper introduces a parallel meta-heuristic algorithm called Cuckoo Flower Search (CFS). This algorithm combines the Flower Pollination Algorithm (FPA) and Cuckoo Search (CS) to train Multi-Layer Perceptron (MLP) models. The algorithm is evaluated on standard benchmark problems and its competitiveness is demonstrated against other state-of-the-art algorithms. Multiple datasets are utilized to assess the performance of CFS for MLP training. The experimental results are compared with various algorithms such as Genetic Algorithm (GA), Grey Wolf Optimization (GWO), Particle Swarm Optimization (PSO), Evolutionary Search (ES), Ant Colony Optimization (ACO), and Population-based Incremental Learning (PBIL). Statistical tests are conducted to validate the superiority of the CFS algorithm in finding global optimum solutions. The results indicate that CFS achieves significantly better outcomes with a higher convergence rate when compared to the other algorithms tested. This highlights the effectiveness of CFS in solving MLP optimization problems and its potential as a competitive algorithm in the field.

1. Introduction

Over the past decades, artificial intelligence (AI), and particularly machine learning (ML), has paved the way for researchers to study nature and build problem-solving models. In particular, studying the phenomena of natural selection, social behavior, and other patterns has led to the rise of evolutionary computing, swarm intelligence, and neural networks (NNs). NNs, inspired by the neurons of the human brain, are among the most significant inventions in soft computing. The basic NN model was conceptualized by McCulloch and Pitts [1]. There are various types of NNs, including Kohonen self-organizing networks [2], recurrent NNs [3], spiking NNs [4], feed-forward networks (FNNs) [5], and others. Among these, FNNs are the simplest, with low computational cost and high performance. An FNN receives input on one side and provides output on the other; it is unidirectional, with one or more layers in between. If there is only a single layer, the network is called a single-layer perceptron (SLP) [6]; SLPs are used for solving linear problems. If there are multiple layers, the network is called a multi-layer perceptron (MLP) [1,7]; MLPs are used to solve non-linear problems.
All NNs share the ability to learn from experience. Such networks are called artificial NNs (ANNs), and they adapt themselves to a given set of inputs. ANNs can be supervised, using an external source of feedback [8,9], or unsupervised [10,11], adapting to their inputs without any external feedback. Training an NN to achieve the highest possible performance is carried out by a trainer. The trainer provides the NN with a set of input samples and modifies the structural parameters of the NN; when the training process is complete, the trainer is removed and the NN is ready for use. There are two types of trainers: deterministic and stochastic. Back propagation (BP) [12] and gradient-based search are deterministic methods that train the network through mathematical optimization to achieve maximum performance. These trainers are simple and converge quickly, starting from a single solution and guiding it toward an optimum. However, they can become trapped in a local optimum that is mistaken for the global optimum. Stochastic training methods, on the other hand, use stochastic optimization to achieve the desired performance. They start training from a random solution and improve it toward a global optimum. Randomness helps these methods avoid local optima, but they are slower than deterministic methods [13,14]. Stochastic trainers are widely used in the literature because of their strong local optima avoidance.
Stochastic trainers can be single-solution or multi-solution. A single-solution trainer starts from one random solution and evolves it iteratively until a stopping criterion is satisfied. Simulated annealing (SA) [15,16], hill climbing [17], and others [18,19] are examples of single-solution trainers. Multi-solution trainers, on the other hand, start from multiple random solutions and evolve each of them until the stopping criterion is met. These trainers include the Genetic algorithm (GA) [20], Ant colony optimization (ACO) [21], Artificial Bee colony (ABC) [22,23], Particle swarm optimization (PSO) [24,25], Differential evolution (DE) [26], Teacher-learning based optimization (TLBO) [27], Invasive weed algorithm (IWO) [28], ensemble techniques [29], Grey Wolf optimization (GWO) [30], and others. These algorithms perform well at finding approximate global optimum solutions. This inspires us to develop a new meta-heuristic and apply it efficiently for training NNs.
In this work, a new parallel algorithm based on Cuckoo Search (CS) [31] and the Flower Pollination Algorithm (FPA) [32], which we have named Cuckoo Flower Search (CFS), is introduced. The main motivation for this work is the local optima stagnation and premature convergence of existing algorithms. CFS has been tested on standard benchmark functions and compared with state-of-the-art algorithms to establish its competitiveness. In addition, it has been tested on FNN-MLP training as an application to real-world problems. Nineteen benchmark functions have been used to analyze the performance of the proposed algorithm. These benchmark functions consist of unimodal functions, multi-modal functions, and fixed-dimension functions. These problems are highly challenging, and an algorithm that performs well on them is generally regarded as competitive. A comparison with GWO, CS, FPA, BFP, and others was also conducted. Statistical tests have also been performed to support the superiority of CFS over the other algorithms compared. The major contributions of the paper are as follows:
  • To avoid premature convergence and local optima stagnation, best known properties of FPA and CS are added to the proposed algorithm.
  • The global and local search phase equations of FPA and CS are optimized for addition in the proposed algorithm.
  • Solutions generated by FPA and CS are compared and best among the two is selected as the current best solution. These solutions are further generated over the course of iterations to find the global best solution.
  • A greedy selection operation is followed for retaining the best solution over subsequent iterations.
  • The proposed algorithm is tested on 19 classical benchmark functions, and a Wilcoxon rank-sum test is performed to establish the statistical significance of the results.
  • Finally, MLPs are trained on five real-world datasets (Heart, Breast Cancer, Iris, Balloon, and XOR) using the proposed algorithm.
  • The source code of CFS algorithm is available at: https://github.com/rohitsalgotra/CFS (accessed on 20 June 2023).
The rest of the paper is organized as follows: Section 2 describes the preliminary definitions of FNN and MLP. The basics of CS and FPA are detailed in Section 3. Section 4 describes the proposed CFS algorithm. Section 5 presents the results and discussion. Finally, Section 6 concludes the paper.

2. Feed-Forward Neural Networks and Multi-Layer Perceptron

FNNs are unidirectional networks with one-way connections between neurons. They contain several parallel layers in which neurons are arranged [33]. The first layer is the input layer and the last layer is the output layer. In between these are one or more hidden layers. A three-layer MLP with n input nodes, h hidden nodes, and m output nodes is shown in Figure 1, showing a simple unidirectional connection between the nodes. The outputs are calculated as in [34]:
  • Weighted sum of inputs is given by:
$$ s_j = \sum_{i=1}^{n} W_{ij} \cdot X_i - \theta_j, \quad j = 1, 2, \ldots, h \tag{1} $$
  • Outputs of hidden layers are calculated as:
$$ S_j = \mathrm{sigmoid}(s_j) = \frac{1}{1 + \exp(-s_j)}, \quad j = 1, 2, \ldots, h \tag{2} $$
  • Final output based on the hidden node outputs is given as:
$$ o_k = \sum_{j=1}^{h} W_{jk} \cdot S_j - \theta_k, \quad k = 1, 2, \ldots, m \tag{3} $$
$$ O_k = \mathrm{sigmoid}(o_k) = \frac{1}{1 + \exp(-o_k)}, \quad k = 1, 2, \ldots, m \tag{4} $$
where $W_{ij}$ is the connection weight from the $i$th node in the input layer to the $j$th node in the hidden layer, $W_{jk}$ is the connection weight from the $j$th hidden node to the $k$th output node, $\theta_j$ and $\theta_k$ are the thresholds of the $j$th hidden node and the $k$th output node, respectively, and $X_i$ is the $i$th input.
From the above equations, it can be seen that weights and thresholds define the final value of the MLPs. The major concern is finding optimum weights and thresholds (biases) for achieving a balanced relation between input and outputs.
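The forward pass described by these equations can be sketched in a few lines of Python. This is only an illustration: the function and argument names are hypothetical, and the thresholds are assumed to be subtracted from the weighted sums.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def mlp_forward(x, w_ih, theta_h, w_ho, theta_o):
    """Forward pass of a three-layer n-h-m MLP.

    x       : list of n inputs X_i
    w_ih    : n x h nested list, w_ih[i][j] = W_ij (input i -> hidden j)
    theta_h : list of h hidden thresholds theta_j
    w_ho    : h x m nested list, w_ho[j][k] = W_jk (hidden j -> output k)
    theta_o : list of m output thresholds theta_k
    """
    n, h, m = len(x), len(theta_h), len(theta_o)
    # Weighted sum of the inputs minus the threshold (assumed sign), then sigmoid.
    hidden = [
        sigmoid(sum(w_ih[i][j] * x[i] for i in range(n)) - theta_h[j])
        for j in range(h)
    ]
    # The same two steps map the hidden outputs S_j to the final outputs.
    return [
        sigmoid(sum(w_ho[j][k] * hidden[j] for j in range(h)) - theta_o[k])
        for k in range(m)
    ]
```

With all weights and thresholds set to zero, every node outputs sigmoid(0) = 0.5 regardless of the input, which makes a convenient sanity check.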

3. Basic Cuckoo Search and Flower Pollination Algorithm

3.1. Cuckoo Search Algorithm

The CS algorithm is inspired by the obligate brood parasitism of cuckoos [35]: some cuckoo species lay their eggs in the nests of host birds. CS is competitive with existing algorithms. It selects the best solution and ensures that this solution is passed on to the next generation. It employs a local random walk for local exploitation and randomization via Lévy flights for global exploration. Three idealized rules describe Cuckoo Search in a simple way:
  • Each cuckoo lays one egg and dumps it in a random nest;
  • The nests with the highest fitness carry over to the next generation;
  • The host bird discovers the cuckoo's egg with a probability $p_a \in [0, 1]$. A fixed number of host nests is available. Depending on $p_a$, the host bird either throws the egg away or abandons the nest and builds a new nest at a new location.
In CS, a solution is an egg that is already present in a nest, and a new solution is an egg laid by a cuckoo. The not-so-good solutions in the nests are replaced by new and better solutions [35]. More complicated cases arise when multiple eggs are present in each nest; in these cases, the extended form of the algorithm can be used. Based on the above three rules, a Lévy flight is performed to produce a new solution $x_i^{t+1}$ for the $i$th cuckoo, as given in Equation (5):
$$ x_i^{t+1} = x_i^{t} + \alpha \oplus \text{Lévy}(\lambda) \tag{5} $$
where $x_i^{t}$ is the previous solution, $\oplus$ denotes entry-wise multiplication, and $\alpha > 0$ is the step size. In most cases, $\alpha = 1$ is used. The above equation is the stochastic equation for a random walk, in which the next location depends on the current location and the transition probability. PSO also uses this type of entry-wise product.
Cuckoos usually search for food using a basic random walk. This is a Markov chain whose updated position is determined by the present location and the transition probability of the following position. The performance of CS is enhanced using Lévy flights [36]. A Lévy flight is a random walk whose step lengths follow a heavy-tailed probability distribution. Although Lévy flights are usually defined in a continuous space, the term is also used for random walks on a discrete grid [37,38,39]. The Lévy flight is employed in this study because of its greater efficiency in exploring the search space. The step lengths in our algorithm are drawn from a Lévy distribution with infinite mean and variance.
As the random walk occurs via Lévy flight, the Lévy distribution draws the random step length as:
$$ \text{Lévy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3 \tag{6} $$
This random walk process has a heavy-tailed step-length distribution. The Lévy walk generates new solutions near the best solutions, which speeds up the local search [36]. Some of the solutions should be created by far-field randomization in order to prevent the system from getting stuck in a local optimum. Several points show that CS is analogous to, and competitive with, other optimization algorithms. First, as with GA and PSO, CS is a population-based algorithm. Second, because of the heavy-tailed step length, large steps are possible in CS and the randomization is more efficient. Third, CS can be adapted to a wide class of optimization problems because it has fewer parameters to tune than PSO and GA.
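The update in Equation (5) and the third rule can be sketched as follows. The text does not say how Lévy steps are sampled, so this sketch assumes Mantegna's algorithm, a common choice in Cuckoo Search implementations. The index `beta`, its default of 1.5, the default `pa = 0.25`, and all function names are illustrative assumptions rather than the authors' settings.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed sampler)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # guard against division by zero
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_move(x, alpha=1.0, beta=1.5, rng=random):
    """New solution x + alpha * (Levy step), applied entry by entry."""
    return [xj + alpha * levy_step(beta, rng) for xj in x]


def abandon_nests(nests, lower, upper, pa=0.25, rng=random):
    """Rule 3: rebuild each nest at a random location with probability pa."""
    return [
        [lower + rng.random() * (upper - lower) for _ in nest]
        if rng.random() < pa else nest
        for nest in nests
    ]
```

Most draws from `levy_step` are small, with occasional very large ones; this is the behaviour that lets the walk mix local refinement with long jumps.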

3.2. Flower Pollination Algorithm

Flowering plants are a fascinating group. Dating from the Cretaceous period, they are estimated to comprise about 80 percent of all plant species [40]. About 250,000 species of flowering plants have been found on earth. The ultimate aim of flowers is to reproduce, and this reproduction occurs mainly by pollination. In pollination, pollen is transferred from one flower to another by pollinators. Cross-pollination means that pollination occurs with pollen from a different plant. Self-pollination, on the other hand, means fertilization by pollen from the same flower or from different flowers of the same plant. Pollinators can be insects, birds, or other animals. Some flowers attract only specific kinds of insects for pollination, showing a sort of flower-insect partnership; this is referred to as flower constancy. Pollinators such as honeybees have been found to develop flower constancy. This property leads pollinators to visit only particular plant species, increasing the chances of reproduction for the flower and, in turn, maximizing the nectar supply for the pollinator [41]. When the pollen is carried by pollinators such as insects and animals, the process is called biotic pollination (about 90 percent of pollination is biotic). When it occurs via diffusion or wind, the process is called abiotic pollination [42] (this constitutes about 10 percent of pollination). In total, there are about 200,000 varieties of pollinators on earth. Biotic cross-pollination occurs over long distances and is facilitated by birds, bats, bees, and fireflies, among other animals. This is often referred to as global pollination. Self-pollination, meanwhile, is termed local pollination.
The above characteristics are idealized into four rules [43]:
  • Global pollination arises via biotic and cross-pollination.
  • Local pollination occurs via abiotic and self-pollination.
  • Flower constancy, termed as reproduction probability, is proportional to the similarity of two flowers.
  • Switch probability $p \in [0, 1]$ balances global and local pollination.
When designing the algorithm, it is assumed that each plant has only one flower and that each flower produces only a single pollen gamete. We can therefore use $y_i$ as a solution equivalent to a flower or a pollen gamete, defining a single-objective problem.
The above characteristics have been combined to design the FPA, which mainly consists of local and global pollination. In the former case listed in the rules (global pollination), pollination and reproduction of the fittest flower are ensured, and the rule is represented mathematically as:
$$ y_i^{t+1} = y_i^{t} + \alpha L(\lambda) \left( R^{*} - y_i^{t} \right) \tag{7} $$
where $R^{*}$ is the current best solution, $y_i^{t}$ is the potential solution at iteration $t$, and $\alpha$ is the scaling factor that controls the Lévy flight-based step size $L(\lambda)$. The Lévy flight is expressed as:
$$ L \sim \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad (s \gg s_0 > 0) \tag{8} $$
where Γ(λ) is the standard gamma function.
The local pollination rule can be mathematically represented as:
$$ y_i^{t+1} = y_i^{t} + \epsilon \left( y_j^{t} - y_k^{t} \right) \tag{9} $$
where $y_j^{t}$ and $y_k^{t}$ are pollens from different flowers of the same plant species. This mimics flower constancy in a limited neighbourhood and corresponds to a local random walk when $\epsilon$ is drawn from a uniform distribution in [0, 1].
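A minimal sketch of one FPA update, switching between the global and local pollination rules, is given below. The step length is drawn from a Pareto distribution whose tail decays as $1/s^{1+\lambda}$; this is a simple stand-in that matches only the tail behaviour of the Lévy expression, not the exact distribution. The defaults `p = 0.8`, `alpha = 0.1`, `lam = 1.5`, `s0 = 0.01`, and all function names are assumptions made for illustration.

```python
import random


def power_law_step(lam=1.5, s0=0.01, rng=random):
    """Step length s >= s0 with density proportional to 1 / s**(1 + lam)."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids division by zero
    return s0 * u ** (-1.0 / lam)


def fpa_update(pop, i, best, p=0.8, alpha=0.1, lam=1.5, rng=random):
    """Return a candidate replacing pop[i] (needs at least two flowers)."""
    y = pop[i]
    if rng.random() < p:
        # Global pollination: heavy-tailed move relative to the current best.
        return [
            yj + alpha * power_law_step(lam, rng=rng) * (bj - yj)
            for yj, bj in zip(y, best)
        ]
    # Local pollination: random walk along the difference of two other flowers.
    eps = rng.random()
    j, k = rng.sample(range(len(pop)), 2)
    return [yj + eps * (a - b) for yj, a, b in zip(y, pop[j], pop[k])]
```

A flower that already equals the current best is left unchanged by the global rule, since the difference term vanishes.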

4. Cuckoo Flower Search Algorithm

4.1. Algorithm Definition

The CFS algorithm is proposed as a hybrid version of the CS and FPA algorithms. Both these algorithms work in coordination to attain a global optimum solution. The main idea is to generate the current best solution for both cuckoos and flower pollinators. After finding this solution, both are compared, the best solution is considered, and the process is repeated. The solution after first evaluation is fed back to the cuckoos and flower pollinators. This procedure is continued until the termination criteria are met. The final solution is the most appropriate solution to the problem under discussion. There are three phases to the proposed CFS algorithm:

  • Initialization

This is the first phase of the CFS algorithm, in which the population is randomly initialized. Each solution is initialized according to Equation (10) and acts as a potential solution to the problem under examination. The initial population consists of N cuckoos and flower pollinators (termed CF).
$$ CF_{i,j} = CF_{min,j} + ab \left( CF_{max,j} - CF_{min,j} \right) \tag{10} $$
where $i \in \{1, \ldots, CF\}$, $j \in \{1, \ldots, D\}$, $CF_{i,j}$ is the $i$th solution in the $j$th dimension, D is the dimension or number of variables in the problem being studied, $CF_{min,j}$ and $CF_{max,j}$ are the lower and upper bounds, respectively, and $ab$ is a random number in [0, 1]. The population initialized in Equation (10) is the same for both cuckoos and flower pollinators. After initialization, the fitness of each solution is evaluated with the objective function, and the best solution attained is treated as the initial best for all cuckoos ($CF$) and flower pollinators ($FC$).
  • Solution generation
After the initial step, two new solutions are generated: one solution is inspired by cuckoo brood parasitism and the other by flower pollinators. The main concern here is to follow exploration and exploitation in a well-defined fashion. In cuckoos, exploration is achieved by randomization via Lévy flights. Local random walk is used to achieve exploitation. The new solution $x_i^{t+1}$ is generated as per Equation (6) and its fitness is evaluated for the optimization problem being tested. The solution $x_i^{t+1}$ obtained in this manner is compared with $CF$, and the best ($CF_{best}$) among them is retained.
In the case of flower pollinators, exploration and exploitation are balanced by global pollination and local pollination, respectively. Equations (7) and (9) are selected according to a random probability in [0, 1], often called the switch probability, and are used to find the second new solution ($y_i^{t+1}$). This solution $y_i^{t+1}$ and the initial best ($FC$) solution are also compared, resulting in another best solution $FC_{best}$.
  • Final evaluation
After comparing the best-fit solutions obtained by cuckoos ($CF_{best}$) and flower pollinators ($FC_{best}$), the better of the two is retained as the optimum solution. For both cuckoos and flower pollinators, the solution generated at the last stage is set as the initial best ($CF$ and $FC$, respectively). The same procedure is repeated until the termination requirements are met. The final solution obtained in this manner is the most appropriate solution. It is also worth noting that cuckoos and flower pollinators both seek the most appropriate solutions in parallel. If a cuckoo-produced solution becomes trapped in a local optimum and is unable to reach the global optimum, the flower pollinators help it to leave the local trap, and vice versa. This characteristic increases the likelihood of the CFS algorithm reaching the global optimum solution.
Both CS and FPA are good algorithms in terms of finding a global optimum solution, but the real problem is their inconsistency in finding the best-fit individual every time the algorithm is run. This inconsistency is caused by getting stuck in local optima while moving toward the global optimum. As a result, a better solution is required to move the algorithm closer to the global optimum. If the CS algorithm becomes stuck, FPA moves it towards the global optimum, and vice versa. In this way, cuckoos and flower pollinators collaborate to obtain a global optimum. Figure 2 shows the flowchart of the CFS algorithm. The pseudocode of the proposed algorithm is given in Algorithm 1.
Algorithm 1: Pseudocode of CFS algorithm
begin:
      1.  Initialize: α, β0, γ, maximum iterations
      2.  Define Population, objective function f(x)
      3.  While (t < maximum iterations)
                For i = 1 to n
                      For j = 1 to n
                            Evaluate new solution using CS inspired equation;
                            Evaluate new solution using FPA inspired equation;
                            Find the best among the two using greedy selection;
                      End for j
                End for i
      4.  Update current best.
      5.  End while
      6.  Find final best
end.
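One possible reading of Algorithm 1 is sketched below in Python. It is not the authors' implementation (their source code is linked in the Introduction): the inner loop is collapsed into a single pass over the population, a signed power-law step stands in for the Lévy flight, candidates are clipped to the bounds, and all parameter defaults and names are assumptions.

```python
import random


def heavy_step(lam=1.5, s0=0.01, rng=random):
    """Signed power-law step (an assumed stand-in for a Levy flight)."""
    u = 1.0 - rng.random()  # uniform in (0, 1]
    return rng.choice((-1.0, 1.0)) * s0 * u ** (-1.0 / lam)


def clip(x, lo, hi):
    """Keep every coordinate inside the search bounds."""
    return [min(max(v, lo), hi) for v in x]


def cfs_minimize(f, dim, lo, hi, n=20, iters=200, p=0.8, alpha=1.0, seed=0):
    """Minimize f over [lo, hi]^dim with a CS/FPA hybrid (illustrative)."""
    rng = random.Random(seed)
    # Initialization: uniform random population shared by both operators.
    pop = [[lo + rng.random() * (hi - lo) for _ in range(dim)] for _ in range(n)]
    fit = [f(x) for x in pop]
    b = min(range(n), key=fit.__getitem__)
    best, best_fit = pop[b][:], fit[b]
    for _ in range(iters):
        for i in range(n):
            x = pop[i]
            # CS-inspired candidate: heavy-tailed random walk around x.
            cs = clip([v + alpha * heavy_step(rng=rng) for v in x], lo, hi)
            # FPA-inspired candidate: global or local pollination.
            if rng.random() < p:
                fp = [v + abs(heavy_step(rng=rng)) * (g - v)
                      for v, g in zip(x, best)]
            else:
                j, k = rng.sample(range(n), 2)
                eps = rng.random()
                fp = [v + eps * (a - c) for v, a, c in zip(x, pop[j], pop[k])]
            fp = clip(fp, lo, hi)
            # Greedy selection: keep a candidate only if it improves on x.
            for cand in (cs, fp):
                fc = f(cand)
                if fc < fit[i]:
                    pop[i], fit[i] = cand, fc
        # Update the current best after each sweep over the population.
        b = min(range(n), key=fit.__getitem__)
        if fit[b] < best_fit:
            best, best_fit = pop[b][:], fit[b]
    return best, best_fit
```

For example, `cfs_minimize(lambda x: sum(v * v for v in x), dim=2, lo=-5.0, hi=5.0)` runs the sketch on a sphere function. Because selection is greedy, the returned fitness can never be worse than that of the best initial solution.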

4.2. CFS-MLP Trainer

When training an MLP, the first step is to formulate the problem [44] and find the values of the weights and biases that give the best accuracy, classification, or statistical results. These values should be found by a trainer. This step is achieved by training the MLP using meta-heuristic algorithms. There are three methods for training MLPs using meta-heuristic algorithms [45]:
  • To find a combination of weights and biases of the MLP that achieves the minimum error using meta-heuristic algorithms. In this approach, proper values of the weights are found without changing the basic architecture of the MLP. It has a simple encoding phase and a difficult decoding phase, and so is often used for simple NNs.
  • To find proper architecture for an MLP using heuristic algorithms. In this method, the architecture varies and it can be achieved by varying the connections between hidden nodes, layers, and neurons, as proposed in [46]. This method has a simple decoding phase but, due to complexity in the encoding phase, it is used for complex structures.
  • To tune the gradient-based learning algorithm parameters using a heuristic approach. This method has been used to train FNNs using EAs [47] and others, such as GA [48], using a combination of methods to tune FNN. In this method, the decoding and encoding processes are very complicated and hence the structure becomes very complex.
In the present work, the CFS algorithm is proposed and applied to train an MLP using the first method. The weights and biases for the CFS algorithm are given in the form of a vector as follows:
$$ \vec{C} = \{ \vec{W}, \vec{\theta} \} = \{ W_{1,1}, W_{1,2}, \ldots, W_{n,n}, \theta_1, \theta_2, \ldots, \theta_h \} \tag{11} $$
where n is the number of nodes, $W_{ij}$ is the connection weight between the $i$th and $j$th nodes, and $\theta_j$ is the bias of the $j$th hidden node. After setting the initial variables, the fitness function for the CFS algorithm has to be designed. This is achieved by defining a common metric for the evaluation of the MLP, the Mean Square Error (MSE). The MSE measures the difference between the desired output and the value obtained from the MLP. The performance of the MLP is based on the average MSE over all training samples and is given by:
$$ \overline{MSE} = \sum_{k=1}^{s} \frac{\sum_{i=1}^{m} \left( o_i^{k} - d_i^{k} \right)^2}{s} \tag{12} $$
where s is the number of training samples, m is the number of outputs, and $o_i^{k}$ and $d_i^{k}$ are the actual output and desired output of the $i$th output unit for the $k$th training sample, respectively. Based on the MSE, the final objective function can be formulated as:
$$ \text{Minimize:} \quad F(\vec{C}) = \overline{MSE} \tag{13} $$
In the overall process, the CFS algorithm supplies the MLP with weights and biases and, in turn, receives the average MSE over all training samples. The CFS algorithm updates the weights and biases iteratively in order to minimize the average MSE. The best MSE is obtained from the last iteration of the algorithm. Because the weights and biases move toward the best MLP found so far, there is a good chance of obtaining an improved MLP at each iteration. Thus, the CFS algorithm converges toward a better solution than the initial random solution.
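The fitness evaluation described above can be sketched as a function that decodes a flat candidate vector into weights and biases and returns the average MSE. The ordering of entries within the vector (input-to-hidden weights, hidden-to-output weights, hidden thresholds, output thresholds) and the subtraction of thresholds are assumptions, as are all names. The 3-bit XOR samples are generated here as the parity of three input bits.

```python
import math
from itertools import product


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def decode(vec, n, h, m):
    """Split a flat vector into (W_ih, W_ho, theta_h, theta_o); assumed layout."""
    if len(vec) != n * h + h * m + h + m:
        raise ValueError("vector length does not match the n-h-m structure")
    it = iter(vec)
    w_ih = [[next(it) for _ in range(h)] for _ in range(n)]
    w_ho = [[next(it) for _ in range(m)] for _ in range(h)]
    th_h = [next(it) for _ in range(h)]
    th_o = [next(it) for _ in range(m)]
    return w_ih, w_ho, th_h, th_o


def mse_fitness(vec, samples, n, h, m):
    """Average over samples of the summed squared output error."""
    w_ih, w_ho, th_h, th_o = decode(vec, n, h, m)
    total = 0.0
    for x, d in samples:
        hid = [_sigmoid(sum(w_ih[i][j] * x[i] for i in range(n)) - th_h[j])
               for j in range(h)]
        out = [_sigmoid(sum(w_ho[j][k] * hid[j] for j in range(h)) - th_o[k])
               for k in range(m)]
        total += sum((o - t) ** 2 for o, t in zip(out, d))
    return total / len(samples)


# 3-bit XOR: the target is the parity of the three input bits.
XOR3 = [((a, b, c), (a ^ b ^ c,)) for a, b, c in product((0, 1), repeat=3)]
```

An optimizer would then minimize `lambda v: mse_fitness(v, XOR3, 3, 7, 1)` over 36-dimensional vectors. The all-zero vector outputs 0.5 for every sample and therefore scores 0.25, which gives a baseline that a trained network should beat.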

5. Results and Discussion

This section presents details on the application of the proposed CFS algorithm to classical benchmark problems and to the real-world optimization of FNN-MLPs. We have used 19 benchmark functions, consisting of unimodal, multi-modal, and fixed dimension problems. For the optimization of real-world FNN-MLPs, five highly challenging datasets have been used. More details are presented in the following subsections. For performance analysis, the simulations were performed in MATLAB on a 64-bit Windows 10 machine with an Intel Core i3 processor and 8 GB of RAM.

5.1. Benchmark Problems

To check the effectiveness of the CFS algorithm, it was tested on nineteen well-known benchmark problems. Wilcoxon rank-sum tests were performed to validate the results statistically. This non-parametric test detects whether the difference between the results of two algorithms is statistically significant by comparing the two populations of results. The test returns a p-value; a p-value below 0.05 indicates that the difference between the two algorithms is statistically significant [49]. The proposed algorithm is compared with the ABC [50], Firefly Algorithm (FA) [51], FPA, CS, and Bat Flower Pollinator (BFP) [52] algorithms. The parameter settings used to test each algorithm on the benchmark problems are shown in Table 1.
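As an illustration of how such a p-value can be obtained, the sketch below implements the two-sided rank-sum test with the usual normal approximation, using average ranks for ties and no tie correction of the variance. Statistical packages may use an exact method for small samples, so their p-values can differ slightly. The function name and the sample data in the usage note are hypothetical.

```python
import math


def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test; returns (z, p) by normal approximation."""
    combined = sorted((v, g) for g, s in enumerate((a, b)) for v in s)
    total = len(combined)
    rank_sum_a = 0.0
    i = 0
    while i < total:
        # Find the block of tied values starting at position i.
        j = i
        while j < total and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    mu = n1 * (n1 + n2 + 1) / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_a - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail probability
    return z, p
```

Given two lists of best fitness values from repeated runs of two algorithms, `rank_sum_test(runs_a, runs_b)` returns a negative z when the first list tends to contain smaller values, and the difference would be called significant at the 5 percent level when p is below 0.05.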

5.1.1. Unimodal Functions

Unimodal functions have no local solutions; they have a single global solution. These benchmark functions are useful for evaluating the convergence characteristics of heuristic optimization techniques. The CFS algorithm was applied to four unimodal benchmark problems with three dimension sets (30, 50, and 100), as given in Table 2. The CFS algorithm was compared to the ABC, FA, FPA, CS, and BFP algorithms (see Table 3, Table 4, Table 5, Table 6, Table 7 and Table 8). For the 30 (Table 3) and 50 (Table 5) dimension (D) problems, with the function f1, the FPA algorithm has a better mean and best value, but CS and CFS give the best values of standard deviation. For function f2, the FA is found to be better, with a highly competitive result for the CFS algorithm. For f3, the CFS algorithm provides better results, and for f4, the ABC and CFS algorithms are both competitive when compared to the rest of the algorithms. For 100 D (Table 7), the CFS algorithm performs better for f2 and f3. For f1, FPA is better, and for f4, ABC is better. The BFP algorithm is not the best for any of the functions. The rank-sum tests in Table 4, Table 6 and Table 8 support the superior performance of the CFS algorithm. The convergence characteristics are shown in Figure 3.

5.1.2. Multimodal Functions

Multimodal benchmark functions feature several local minima that grow in number exponentially with dimension. As such, they are good for testing an algorithm's ability to avoid local minima. The CFS algorithm has been applied to six multimodal benchmark problems, with three dimension sets (30, 50, and 100), as shown in Table 9. The algorithm has been compared to the ABC, FA, FPA, BFP, and CS algorithms. For 30 D and 50 D problems, with the f5, f6, and f7 functions, the CFS algorithm performs better, while only the FA algorithm performs better than CFS with the f8, f9, and f10 functions, as shown in Table 10, Table 11, Table 12 and Table 13, respectively. For 100 D (Table 14 and Table 15), the CFS algorithm performs better for the f5, f6, f7, and f9 functions. For f8 and f10, the FA algorithm performs better. The rank-sum tests in Table 11, Table 13 and Table 15 show that the performance of the CFS algorithm is statistically better. The convergence characteristics are shown in Figure 4.

5.1.3. Fixed Dimension Functions

Fixed dimension benchmark functions have finite dimensional space. The CFS algorithm has been applied to nine benchmark functions, as shown in Table 16, and the results have been compared to the ABC, CS, FA, BFP, and FPA algorithms. It can be seen from Table 17 and Table 18 that the CFS algorithm performs better than the other algorithms for all the test problems. For functions f14, f15, f16, f17, and f18, the ABC, FA, FPA, and CS algorithms are not able to achieve the global optimum. For the remaining algorithms, even if the global optimum is met, the CFS algorithm shows superior consistency because of its better standard deviation. The CFS algorithm’s results are also statistically better, as shown in Table 18. The convergence curves are shown in Figure 5.

5.2. FNN–MLP Datasets

The proposed CFS algorithm was used to train FNN-MLPs on standard datasets. The benchmark datasets were obtained from the Machine Learning Repository of the University of California, Irvine [53]. The datasets used are: Breast Cancer, XOR, Balloon, Iris, and Heart. The results are compared with the PSO, GA, ES, ACO, GWO, and PBIL algorithms [30,54,55,56,57], and with the Whale Optimization Algorithm (WOA) [58] and Moth Flame Optimization (MFO) [59] for verification. The optimization parameter settings for the CFS algorithm are presented in Table 19. Table 20 details the specifications of the datasets used for comparison. The simplest dataset is the 3-bit XOR; it has three attributes and eight training/test samples. For the Balloon dataset, there are four attributes and 16 training/test samples. For the Iris dataset, there are 150 training/test samples with four attributes. The Breast Cancer dataset has the highest number of samples, with 599 training samples, 100 test samples, and nine attributes. For the Heart dataset, there are 80 training samples, 187 test samples, and 22 attributes. The number of classes for each dataset is two, except for Iris, which has three. These datasets are highly complex sets of problems and are employed to test the performance of the CFS algorithm. The total number of runs for the CFS algorithm is set to 10; this is the same as used in study [30]. The number of function evaluations for the XOR and Balloon datasets is 50 × 250 = 12,500 for all the algorithms. For the Iris, Breast Cancer, and Heart datasets, the number of function evaluations is 20 × 250 = 5000 for the CFS algorithm and 200 × 250 = 50,000 for the rest. The results are reported as the average and standard deviation, over 10 runs, of the best MSE obtained in the last iteration of each run. The best results are those with the lowest average and standard deviation, indicating the better performance of the proposed approach [60,61].
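Min-max normalization maps an attribute value $x$ from the interval $[a, b]$ to the interval $[c, d]$:
$$ X_t = \frac{(x - a) \times (d - c)}{(b - a)} + c \tag{14} $$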
The number of hidden nodes for a dataset with N inputs is kept constant and is given by 2 × N + 1. The structure for each MLP is given in Table 21.
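The normalization formula and the 2 × N + 1 rule are easy to check numerically. The sketch below uses hypothetical function names and assumes that the number of optimized variables counts every connection weight plus one bias per hidden and output node.

```python
def min_max(x, a, b, c, d):
    """Map x from the interval [a, b] to the interval [c, d]."""
    return (x - a) * (d - c) / (b - a) + c


def mlp_structure(n_inputs, n_outputs):
    """Return (hidden nodes, number of variables) for an N-(2N+1)-M MLP."""
    hidden = 2 * n_inputs + 1
    weights = n_inputs * hidden + hidden * n_outputs
    biases = hidden + n_outputs  # assumed: one bias per hidden and output node
    return hidden, weights + biases
```

Under these assumptions, `mlp_structure(3, 1)` gives 7 hidden nodes and 36 variables for the 3-bit XOR problem, and `mlp_structure(22, 1)` gives 45 hidden nodes and 1081 variables for the Heart dataset.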

5.2.1. XOR Dataset

This dataset returns the XOR of the inputs as output. It has three inputs, eight training/test samples, and one output. The problem has a dimension of 36 with a range of [−10, 10], and an MLP structure of 3−7−1. The results, in terms of average and standard deviation, are given in Table 22. It can be seen in Table 22 that the performance of the CFS-MLP algorithm is far better than that of all other algorithms tested.

5.2.2. Balloon Dataset

The Balloon dataset has a dimension of 55, with a range of [−10, 10]. This dataset has 16 training/test samples, with four attributes and two classes, and an MLP structure of 4−9−1. The results are given in Table 23. The results show that the CFS algorithm gives far better average and standard deviation values when compared to the GWO, PSO, GA, ACO, ES, PBIL, WOA, and MFO algorithms.

5.2.3. Iris Dataset

The Iris dataset has 75 variables to be optimized in the range of [−10, 10]. It has 150 training/test samples, with four attributes and three classes. An MLP structure of 4−9−3 is used for this dataset. The results are shown in Table 24. For the Iris dataset, the results of the CFS algorithm are competitive with GWO in terms of average; in terms of standard deviation, the results of the CFS algorithm are superior.

5.2.4. Breast Cancer Dataset

This is a challenging dataset, with 100 test samples, 599 training samples, nine attributes, and two classes. It has 209 dimensions, with an MLP structure of 9−19−1. The outcomes for this dataset are given in Table 25. The results show that the CFS algorithm is far superior to the PSO, GA, ACO, PBIL, WOA, and MFO algorithms. When compared to GWO, the CFS results are highly competitive.

5.2.5. Heart Dataset

This is the last dataset used in this paper and was solved with an MLP structure of 22−45−1. It has 187 test samples, 80 training samples, 22 attributes, and two classes. The results are shown in Table 26. The Heart dataset is a very challenging dataset, with 1081 dimensions. The CFS algorithm performs better on this dataset than the other algorithms.

5.3. Discussion of Results

The comparison of the CFS algorithm with the ABC, CS, FA, FPA, and BFP algorithms shows that, on the test functions, the CFS algorithm delivers very competitive results for unimodal and multimodal benchmark problems. For fixed dimension problems, none of ABC, CS, FA, FPA, and BFP is comparable to CFS. This occurs as a result of the inability of these algorithms to escape from local minima. The ABC algorithm tends to become stuck in local minima, while the CS and FPA algorithms are inconsistent because they cannot reliably escape from local minima. In its initial stage, the FA algorithm converges slowly because of its random distribution and insufficient exploration ability. In the last stage, fireflies gather around the optimal solution but, due to random motion and attractiveness, flight errors can occur, and hence the solution converges very slowly.
The results of the CFS-MLP show that, for the XOR and Balloon datasets, all algorithms use the same number of function evaluations and the CFS results are both better and significant. For the Iris, Breast Cancer, and Heart datasets, the CFS algorithm uses 5000 function evaluations, while the other algorithms use 50,000. Hence, the CFS algorithm is able to achieve a more significant result with fewer function evaluations. This demonstrates the superiority of the CFS algorithm over the GWO, PSO, GA, ACO, ES, PBIL, WOA, and MFO algorithms.
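The evaluation budgets compared above follow from simple arithmetic. In the sketch below, reading the first factor as the population size and the second as the number of iterations is an assumption (the text gives only the products), and the function name is hypothetical.

```python
def function_evaluations(population_size, iterations):
    """Total objective evaluations, assuming one evaluation
    per agent per iteration."""
    return population_size * iterations


xor_balloon_budget = function_evaluations(50, 250)   # all algorithms
cfs_budget = function_evaluations(20, 250)           # CFS on Iris, Breast Cancer, Heart
others_budget = function_evaluations(200, 250)       # other algorithms on the same datasets

ratio = others_budget / cfs_budget
print(xor_balloon_budget, cfs_budget, others_budget, ratio)
```

Under these figures, the other algorithms are given ten times as many function evaluations as CFS on the three larger datasets.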
In the CFS algorithm, there are two search agents: cuckoos and flower pollinators. When cuckoos are not able to find the optimal solution, flower pollinators help them; in turn, the flower pollinators are helped by cuckoos when stuck in a local optimum. Therefore, two solutions (one from cuckoos and the other from flower pollinators) are generated. The final solution is the better of the two. This helps the CFS algorithm to achieve faster convergence and consistency in finding the optimal solution.
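The "better of two candidates" idea can be illustrated with a minimal sketch. It is not the CFS update itself: Gaussian perturbations stand in as placeholders for the cuckoo and flower-pollination moves, the greedy acceptance rule is an assumption, and the sphere objective and all names are hypothetical.

```python
import random


def sphere(x):
    """Simple test objective: sum of squares (minimum 0 at the origin)."""
    return sum(v * v for v in x)


def best_of_two_step(current, objective, cuckoo_move, pollinator_move):
    """Generate one candidate per search agent and keep the better one,
    but only if it improves on the current solution (greedy acceptance)."""
    candidate_a = cuckoo_move(current)
    candidate_b = pollinator_move(current)
    best_candidate = min((candidate_a, candidate_b), key=objective)
    if objective(best_candidate) < objective(current):
        return best_candidate
    return current


def make_gaussian_move(rng, scale):
    """Placeholder move; NOT the cuckoo or pollination update of the paper."""
    def move(x):
        return [v + rng.gauss(0.0, scale) for v in x]
    return move


rng = random.Random(1)
wide_move = make_gaussian_move(rng, 1.0)    # stands in for a global step
local_move = make_gaussian_move(rng, 0.1)   # stands in for a local step

x = [rng.uniform(-5.0, 5.0) for _ in range(5)]
start = sphere(x)
for _ in range(200):
    x = best_of_two_step(x, sphere, wide_move, local_move)
print(start, sphere(x))
```

Because a candidate is accepted only when it improves the objective, the value never increases from one step to the next; having two different move types simply gives each step two chances to find an improvement.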

6. Conclusions

In this work, a new CFS algorithm was proposed for MLP training. The algorithm was first tested on 19 standard benchmark functions and its results were statistically compared with those of the ABC, CS, FA, FPA, and BFP algorithms. The results demonstrate that the CFS algorithm performs significantly better, with higher consistency, in avoiding local minima when compared to the ABC, CS, FA, FPA, and BFP algorithms. The CFS algorithm was then used to train MLPs on five datasets. The results of the CFS-MLP were compared, in terms of average and standard deviation, with the GWO, PSO, GA, ACO, ES, PBIL, WOA, and MFO algorithms. The experimental results again demonstrated the superiority of the CFS algorithm for MLPs.
Despite this, the proposed algorithm has certain drawbacks. Because of its stochastic nature, the algorithm is inefficient for several kinds of problems. As the CFS algorithm uses two general equations for finding new solutions, its computational complexity increases. Thus, it is imperative to find new ways of dealing with this complexity. As a future direction of study, the parameters of the algorithm can be tuned to find the best parameter set. In addition, the CFS algorithm can be applied to various other domains, including clustering, antenna array synthesis, feature selection, medical imaging, segmentation, and others. Apart from that, the proposed algorithm can be extended to multi-objective optimization, hyper-parameter optimization, and other settings.

Author Contributions

Conceptualization, R.S.; methodology, R.S.; software, N.M.; validation, R.S., N.M. and V.M.; formal analysis, V.M.; investigation, R.S.; resources, N.M.; data curation, V.M.; writing—original draft preparation, R.S.; writing—review and editing, R.S. and N.M.; visualization, R.S.; supervision, R.S.; project administration, R.S. and N.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Data will be made available on request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. McCulloch, W.S.; Pitts, W. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 1943, 5, 115–133.
  2. Kohonen, T. The self-organizing map. Proc. IEEE 1990, 78, 1464–1480.
  3. Dorffner, G. Neural networks for time series processing. Neural Netw. World 1996.
  4. Ghosh-Dastidar, S.; Adeli, H. Spiking neural networks. Int. J. Neural Syst. 2009, 19, 295–308.
  5. Bebis, G.; Georgiopoulos, M. Feed-forward neural networks. IEEE Potentials 1994, 13, 27–31.
  6. Rosenblatt, F. The Perceptron, A Perceiving and Recognizing Automaton Project Para; Cornell Aeronautical Laboratory: Buffalo, NY, USA, 1957.
  7. Werbos, P. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Ph.D. Thesis, Harvard University, Cambridge, MA, USA, 1974.
  8. Reed, R.D.; Marks, R.J. Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks; MIT Press: Cambridge, MA, USA, 1998.
  9. Caruana, R.; Niculescu-Mizil, A. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on Machine Learning, Pittsburgh, PA, USA, 25–29 June 2006; pp. 161–168.
  10. Hinton, G.E.; Sejnowski, T.J. Unsupervised Learning: Foundations of Neural Computation; MIT Press: Cambridge, MA, USA, 1999.
  11. Wang, D. Unsupervised Learning: Foundations of Neural Computation; MIT Press: Cambridge, MA, USA, 2001; p. 101.
  12. Hertz, J. Introduction to the Theory of Neural Computation. Basic Books 1; Taylor Francis: Abingdon, UK, 1991.
  13. Wang, G.-G.; Guo, L.; Gandomi, A.H.; Hao, G.-S.; Wang, H. Chaotic krill herd algorithm. Inf. Sci. 2014, 274, 17–34.
  14. Wang, G.-G.; Gandomi, A.H.; Alavi, A.H.; Hao, G.-S. Hybrid krill herd algorithm with differential evolution for global numerical optimization. Neural Comput. Appl. 2013, 25, 297–308.
  15. Van Laarhoven, P.J.; Aarts, E.H. Simulated Annealing; Springer: Berlin/Heidelberg, Germany, 1987.
  16. Szu, H.; Hartley, R. Fast simulated annealing. Phys. Lett. A 1987, 122, 157–162.
  17. Mitchell, M.; Holland, J.H.; Forrest, S. When will a genetic algorithm outperform hill climbing? NIPS 1993, 51–58.
  18. Sanju, P. Enhancing Intrusion Detection in IoT Systems: A Hybrid Metaheuristics-Deep Learning Approach with Ensemble of Recurrent Neural Networks. J. Eng. Res. 2023; in press.
  19. Mirjalili, S.; Mohd Hashim, S.Z.; Moradian Sardroudi, H. Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm. Appl. Math. Comput. 2012, 218, 11125–11137.
  20. Whitley, D.; Starkweather, T.; Bogart, C. Genetic algorithms and neural networks: Optimizing connections and connectivity. Parallel Comput. 1990, 14, 347–361.
  21. Shokouhifar, A.; Shokouhifar, M.; Sabbaghian, M.; Soltanian-Zadeh, H. Swarm intelligence empowered three-stage ensemble deep learning for arm volume measurement in patients with lymphedema. Biomed. Signal Process. Control. 2023, 85, 105027.
  22. Socha, K.; Blum, C. An ant colony optimization algorithm for continuous optimization: Application to feed-forward neural network training. Neural Comput. Appl. 2007, 16, 235–247.
  23. Ozturk, C.; Karaboga, D. Hybrid Artificial Bee Colony algorithm for neural network training. In Proceedings of the 2011 IEEE Congress on Evolutionary Computation (CEC), New Orleans, LA, USA, 5–8 June 2011; pp. 84–88.
  24. Mendes, R.; Cortez, P.; Rocha, M.; Neves, J. Particle swarms for feed forward neural network training. In Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN’02 (Cat. No.02CH37290), Honolulu, HI, USA, 12–17 May 2002.
  25. Gudise, V.G.; Venayagamoorthy, G.K. Comparison of particle swarm optimization and backpropagation as training algorithms for neural networks. In Proceedings of the Swarm Intelligence Symposium, SIS’03, Indianapolis, IN, USA, 26 April 2003; pp. 110–117.
  26. Ilonen, J.; Kamarainen, J.-K.; Lampinen, J. Differential evolution training algorithm for feed-forward neural networks. Neural Process. Lett. 2003, 17, 93–105.
  27. Uzlu, E.; Kankal, M.; Akpınar, A.; Dede, T. Estimates of energy consumption in Turkey using neural networks with the teaching–learning-based optimization algorithm. Energy 2014, 75, 295–303.
  28. Moallem, P.; Razmjooy, N. A multi-layer perceptron neural network trained by invasive weed optimization for potato color image segmentation. Trends Appl. Sci. Res. 2012, 7, 445–455.
  29. Darekar, R.V.; Chavan, M.; Sharanyaa, S.; Ranjan, N.M. A hybrid meta-heuristic ensemble based classification technique speech emotion recognition. Adv. Eng. Softw. 2023, 180, 103412.
  30. Mirjalili, S. How effective is the Grey Wolf Optimizer in training multi-layer perceptrons. Appl. Intell. 2015, 43, 150–161.
  31. Yang, X.-S.; Deb, S. Engineering optimization by cuckoo search. Int. J. Math. Model. Numer. Optim. 2010, 1, 330–343.
  32. Yang, X.-S. Flower Pollination Algorithm for Global Optimization. In Proceedings of the 11th International Conference, UCNC 2012, Orléans, France, 3–7 September 2012; Volume 7445, pp. 240–249.
  33. Fine, T.L. Feedforward Neural Network Methodology; Springer: Berlin/Heidelberg, Germany, 1999.
  34. Mirjalili, S.; Sadiq, A.S. Magnetic optimization algorithm for training multi-layer perceptron. In Proceedings of the Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference, Xi’an, China, 27–29 May 2011; pp. 42–46.
  35. Payne, R.B.; Sorenson, M.D.; Klitz, K. The Cuckoos; Oxford University Press: Oxford, UK, 2005.
  36. Barthelemy, P.; Bertolotti, J.; Wiersma, D.S. A Lévy flight for light. Nature 2008, 453, 495–498.
  37. Yang, X.-S.; Deb, S. Cuckoo Search via Lévy Flights. In Proceedings of the 2009 World Congress on Nature & Biologically Inspired Computing (NaBIC), Coimbatore, India, 9–11 December 2009; IEEE Publications: Piscataway, NJ, USA, 2009.
  38. Brown, C.; Liebovitch, L.S.; Glendon, R. Lévy Flights in Dobe Ju/’hoansi Foraging Patterns. Human Ecol. 2007, 35, 129–138.
  39. Pavlyukevich, I. Cooling down Lévy flights. J. Phys. A Math. Theor. 2007, 40, 12299–12313.
  40. Walker, M. How Flowers Conquered the World, BBC Earth News, 10 July 2009. Available online: http://news.bbc.co.uk/earth/hi/earth_news/newsid_8143000/8143095.stm (accessed on 1 January 2019).
  41. Waser, N.M. Flower constancy: Definition, cause and measurement. Am. Nat. 1986, 127, 596–603.
  42. Glover, B.J. Understanding Flowers and Flowering: An Integrated Approach; Oxford University Press: Oxford, UK, 2007.
  43. Yang, X.-S.; Karamanoglu, M.; He, X. Flower pollination algorithm: A novel approach for multiobjective optimization. Eng. Optim. 2014, 46, 1222–1237.
  44. Belew, R.K.; McInerney, J.; Schraudolph, N.N. Evolving Networks: Using the Genetic Algorithm with Connectionist Learning; Cognitive Computer Science Research Group: La Jolla, CA, USA, 1990.
  45. Mizuta, S.; Sato, T.; Lao, D.; Ikeda, M.; Shimizu, T. Structure design of neural networks using genetic algorithms. Complex Syst. 2001, 13, 161–176.
  46. Yu, J.; Wang, S.; Xi, L. Evolving artificial neural networks using an improved PSO and DPSO. Neurocomputing 2008, 71, 1054–1060.
  47. Leung, F.H.; Lam, H.; Ling, S.; Tam, P.K.S. Tuning of the structure and parameters of a neural network using an improved genetic algorithm. IEEE Trans. Neural Netw. 2003, 14, 79–88.
  48. Montana, D.J.; Davis, L. Training Feedforward Neural Networks Using Genetic Algorithms. IJCAI 1989, 89, 762–767.
  49. Derrac, J.; García, S.; Molina, D.; Herrera, F. A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms. Swarm Evolut. Comput. 2011, 1, 3–18.
  50. Karaboga, D. An Idea Based on Honey Bee Swarm for Numerical Optimization; Technical Report TR-06; Erciyes University, Engineering Faculty, Computer Engineering Department: Kayseri, Turkey, 2005.
  51. Yang, X.S. Firefly algorithms for multimodal optimization. In Stochastic Algorithms: Foundations and Applications; Lecture Notes in Computer Sciences; SAGA: Chicago, IL, USA, 2009; Volume 5792, pp. 169–178.
  52. Singh, U.; Salgotra, R. Synthesis of linear antenna array using flower pollination algorithm. Neural Comput. Appl. 2016, 29, 435–445.
  53. Blake, C.; Merz, C.J. UCI Repository of Machine Learning Databases; University of California, Irvine: Irvine, CA, USA, 1998.
  54. Beyer, H.-G.; Schwefel, H.-P. Evolution strategies—A comprehensive introduction. Nat. Comput. 2002, 1, 3–52.
  55. Yao, X.; Liu, Y.; Lin, G. Evolutionary programming made faster. IEEE Trans. Evol. Comput. 1999, 3, 82–102.
  56. Yao, X.; Liu, Y. Fast evolution strategies. In Proceedings of the Evolutionary Programming VI, Indianapolis, IN, USA, 13–16 April 1997; pp. 149–161.
  57. Baluja, S. Population-Based Incremental Learning: A Method for Integrating Genetic Search-Based Function Optimization and Competitive Learning; DTIC Document; Carnegie Mellon University: Pittsburgh, PA, USA, 1994.
  58. Mirjalili, S.; Lewis, A. The whale optimization algorithm. Adv. Eng. Softw. 2016, 95, 51–67.
  59. Mirjalili, S. Moth-flame optimization algorithm: A novel nature-inspired heuristic paradigm. Knowl. Based Syst. 2015, 89, 228–249.
  60. Zhou, Y.; Niu, Y.; Luo, Q.; Jiang, M. Teaching learning-based whale optimization algorithm for multi-layer perceptron neural network training. Math. Biosci. Eng. 2020, 17, 5987–6025.
  61. Chong, H.Y.; Yap, H.J.; Tan, S.C.; Yap, K.S.; Wong, S.Y. Advances of metaheuristic algorithms in training neural networks for industrial applications. Soft Comput. 2021, 25, 11209–11233.
Figure 1. An MLP with one hidden node.
Figure 2. Flow-code for CFS algorithm.
Figure 3. Convergence curves for unimodal functions.
Figure 4. Convergence curves for multimodal functions.
Figure 5. Convergence curves for fixed dimension problems.
Table 1. Parameter settings for various algorithms.

| Algorithm | Parameters | Values |
| --- | --- | --- |
| FA | Number of fireflies | 20 |
| FA | Alpha (α) | 0.5 |
| FA | Beta (β) | 0.2 |
| FA | Gamma (γ) | 1 |
| FA | Stopping Criteria | 200 Iterations |
| ABC | Swarm Size | 20 |
| ABC | Limit | 100 |
| ABC | Stopping Criteria | 200 Iterations |
| FPA | Population Size | 20 |
| FPA | Probability Switch | 0.8 |
| FPA | Stopping Criteria | 200 Iterations |
| CS | Population Size | 20 |
| CS | Discovery Rate of alien egg | 0.25 |
| CS | Maximum number of iterations | 200 |
| CS | Stopping Criteria | Max Iteration |
| BFP | Population size | 20 |
| BFP | Probability Switch | 0.8 |
| BFP | Alpha (α) | 0.5 |
| BFP | Stopping Criteria | 200 Iterations |
| CFS | Population size | 20 |
| CFS | Probability switch | 0.8 |
| CFS | Discovery rate of alien egg (pa) | 0.25 |
| CFS | Stopping Criteria | 200 Iterations |
Table 2. Description of Unimodal Test functions.

| Unimodal Test Problems | Objective Function | Search Range | Optimum Value | D |
| --- | --- | --- | --- | --- |
| Schwefel function | $f_1(x) = \sum_{i=1}^{D} \left[ -x_i \sin\left( \sqrt{\lvert x_i \rvert} \right) \right]$ | [−500, 500] | −418.9829 × D | 30, 50, 100 |
| Sphere function | $f_2(x) = \sum_{i=1}^{D} x_i^2$ | [−100, 100] | 0 | 30, 50, 100 |
| Elliptic function | $f_3(x) = \sum_{i=1}^{D} (10^6)^{\frac{i-1}{D-1}} x_i^2$ | [−100, 100] | 0 | 30, 50, 100 |
| Schaffer function | $f_4(x) = \left[ \frac{1}{n-1} \sum_{i=1}^{n-1} \sqrt{s_i} \cdot \left( \sin\left( 50.0\, s_i^{1/5} \right) + 1 \right) \right]^2$, $s_i = \sqrt{x_i^2 + x_{i+1}^2}$ | [−100, 100] | 0 | 30, 50, 100 |
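Table 3. Results comparison for unimodal functions (30 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | CFS | −1.16 × 10^4 | −1.03 × 10^4 | −1.08 × 10^4 | 3.47 × 10^2 |
| $f_1(x)$ | FA | −4.85 × 10^3 | −2.53 × 10^3 | −3.78 × 10^3 | 6.61 × 10^2 |
| $f_1(x)$ | ABC | −9.65 × 10^3 | −7.67 × 10^3 | −8.68 × 10^3 | 4.93 × 10^2 |
| $f_1(x)$ | FPA | −6.36 × 10^19 | −4.73 × 10^15 | −3.72 × 10^18 | 1.41 × 10^19 |
| $f_1(x)$ | CS | −7.34 × 10^3 | −6.49 × 10^3 | −6.94 × 10^3 | 2.30 × 10^2 |
| $f_1(x)$ | BFP | −5.19 × 10^10 | −2.08 × 10^3 | −2.76 × 10^9 | 1.15 × 10^10 |
| $f_2(x)$ | CFS | 1.0666 | 2.9397 | 2.0917 | 0.4731 |
| $f_2(x)$ | FA | 0.0282 | 0.0818 | 0.0567 | 0.0137 |
| $f_2(x)$ | ABC | 1.09 × 10^4 | 2.31 × 10^4 | 1.56 × 10^4 | 3.27 × 10^3 |
| $f_2(x)$ | FPA | 9.52 × 10^3 | 2.28 × 10^4 | 1.53 × 10^4 | 3.20 × 10^3 |
| $f_2(x)$ | CS | 2.93 × 10^2 | 1.23 × 10^3 | 8.07 × 10^2 | 2.45 × 10^2 |
| $f_2(x)$ | BFP | 3.49 × 10^4 | 7.41 × 10^4 | 6.00 × 10^4 | 1.27 × 10^4 |
| $f_3(x)$ | CFS | 9.73 × 10^3 | 3.56 × 10^4 | 2.08 × 10^4 | 6.88 × 10^3 |
| $f_3(x)$ | FA | 1.95 × 10^6 | 1.66 × 10^7 | 6.96 × 10^6 | 4.06 × 10^6 |
| $f_3(x)$ | ABC | 6.75 × 10^6 | 5.16 × 10^8 | 1.04 × 10^8 | 1.17 × 10^8 |
| $f_3(x)$ | FPA | 1.60 × 10^8 | 5.09 × 10^8 | 2.81 × 10^8 | 8.27 × 10^7 |
| $f_3(x)$ | CS | 9.32 × 10^5 | 5.87 × 10^6 | 2.31 × 10^6 | 1.13 × 10^6 |
| $f_3(x)$ | BFP | 9.86 × 10^8 | 4.43 × 10^9 | 2.73 × 10^9 | 7.60 × 10^8 |
| $f_4(x)$ | CFS | 0 | 6.43 × 10^−14 | 1.40 × 10^−14 | 1.74 × 10^−14 |
| $f_4(x)$ | FA | 3.61 × 10^−10 | 0.0298 | 0.0066 | 0.0091 |
| $f_4(x)$ | ABC | 0 | 0 | 0 | 0 |
| $f_4(x)$ | FPA | 1.28 × 10^−5 | 0.0029 | 5.12 × 10^−4 | 7.13 × 10^−4 |
| $f_4(x)$ | CS | 1.82 × 10^−8 | 5.84 × 10^−5 | 1.05 × 10^−5 | 1.73 × 10^−5 |
| $f_4(x)$ | BFP | 4.67 × 10^−2 | 4.75 × 10^−1 | 3.25 × 10^−1 | 1.51 × 10^−1 |

Bold values in the table correspond to the best algorithmic values.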
Table 4. P-test values of simulated algorithms for unimodal functions (30 Dimension).

| Objective Function | FA | FPA | CS | ABC | CFS |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | 6.79 × 10^−8 | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_2(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_3(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_4(x)$ | 8.00 × 10^−9 | 8.00 × 10^−9 | 8.00 × 10^−9 | NA | NA |
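Table 5. Results comparison for unimodal functions (50 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | CFS | −1.66 × 10^4 | −1.53 × 10^4 | −1.60 × 10^4 | 4.10 × 10^2 |
| $f_1(x)$ | FA | −9.18 × 10^3 | −3.88 × 10^3 | −6.19 × 10^3 | 1.59 × 10^3 |
| $f_1(x)$ | ABC | −1.36 × 10^4 | −1.11 × 10^4 | −1.23 × 10^4 | 7.09 × 10^2 |
| $f_1(x)$ | FPA | −1.18 × 10^20 | −9.35 × 10^15 | −7.75 × 10^18 | 2.69 × 10^19 |
| $f_1(x)$ | CS | −1.08 × 10^4 | −9.62 × 10^3 | −1.00 × 10^4 | 3.52 × 10^2 |
| $f_1(x)$ | BFP | −6.32 × 10^11 | −1.22 × 10^3 | −3.31 × 10^10 | 1.41 × 10^11 |
| $f_2(x)$ | CFS | 4.5385 | 11.9049 | 9.2753 | 4.5385 |
| $f_2(x)$ | FA | 0.1062 | 0.2069 | 0.1578 | 0.0303 |
| $f_2(x)$ | ABC | 5.83 × 10^3 | 1.81 × 10^4 | 1.37 × 10^4 | 3.21 × 10^3 |
| $f_2(x)$ | FPA | 1.46 × 10^4 | 4.84 × 10^4 | 3.03 × 10^4 | 8.91 × 10^3 |
| $f_2(x)$ | CS | 2.09 × 10^3 | 5.50 × 10^3 | 3.83 × 10^3 | 8.82 × 10^2 |
| $f_2(x)$ | BFP | 9.06 × 10^4 | 1.43 × 10^5 | 1.18 × 10^5 | 1.63 × 10^4 |
| $f_3(x)$ | CFS | 1.14 × 10^4 | 3.03 × 10^4 | 1.95 × 10^4 | 5.71 × 10^3 |
| $f_3(x)$ | FA | 2.80 × 10^6 | 1.34 × 10^7 | 6.66 × 10^6 | 2.99 × 10^6 |
| $f_3(x)$ | ABC | 2.91 × 10^7 | 1.14 × 10^9 | 5.12 × 10^8 | 3.16 × 10^9 |
| $f_3(x)$ | FPA | 1.16 × 10^8 | 4.53 × 10^8 | 2.76 × 10^8 | 1.04 × 10^8 |
| $f_3(x)$ | CS | 1.17 × 10^6 | 4.58 × 10^6 | 2.42 × 10^6 | 8.49 × 10^5 |
| $f_3(x)$ | BFP | 1.71 × 10^9 | 4.60 × 10^9 | 2.70 × 10^9 | 8.34 × 10^8 |
| $f_4(x)$ | CFS | 0 | 7.88 × 10^−14 | 1.63 × 10^−14 | 2.32 × 10^−14 |
| $f_4(x)$ | FA | 9.36 × 10^−10 | 0.0336 | 0.0082 | 0.0106 |
| $f_4(x)$ | ABC | 0 | 0 | 0 | 0 |
| $f_4(x)$ | FPA | 2.38 × 10^−5 | 0.0069 | 0.0012 | 0.0019 |
| $f_4(x)$ | CS | 2.37 × 10^−8 | 2.39 × 10^−4 | 3.22 × 10^−5 | 7.18 × 10^−5 |
| $f_4(x)$ | BFP | 2.19 × 10^−2 | 4.86 × 10^−1 | 3.33 × 10^−1 | 1.36 × 10^−1 |

Bold values in the table correspond to the best algorithmic values.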
Table 6. P-test values of simulated algorithms for unimodal functions (50 Dimension).

| Objective Function | FA | FPA | CS | ABC | CFS |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | 6.79 × 10^−8 | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_2(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_3(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_4(x)$ | 8.00 × 10^−9 | 8.00 × 10^−9 | 8.00 × 10^−9 | NA | NA |
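Table 7. Results comparison for unimodal functions (100 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | CFS | −2.78 × 10^4 | −2.36 × 10^4 | −2.60 × 10^4 | 1.07 × 10^3 |
| $f_1(x)$ | FA | −1.54 × 10^4 | −5.88 × 10^3 | −9.40 × 10^3 | 3.05 × 10^3 |
| $f_1(x)$ | ABC | −2.28 × 10^4 | −1.73 × 10^4 | −1.98 × 10^4 | 1.45 × 10^3 |
| $f_1(x)$ | FPA | −1.63 × 10^19 | −1.24 × 10^16 | −1.50 × 10^18 | 3.85 × 10^18 |
| $f_1(x)$ | CS | −1.05 × 10^4 | −9.41 × 10^3 | −1.00 × 10^4 | 2.79 × 10^2 |
| $f_1(x)$ | BFP | −5.75 × 10^8 | −4.56 × 10^3 | −5.87 × 10^7 | 1.73 × 10^8 |
| $f_2(x)$ | CFS | 32.0745 | 1.01 × 10^2 | 69.1336 | 19.4049 |
| $f_2(x)$ | FA | 14.1504 | 1.69 × 10^2 | 55.1173 | 40.4882 |
| $f_2(x)$ | ABC | 5.70 × 10^3 | 1.88 × 10^4 | 1.23 × 10^4 | 3.88 × 10^3 |
| $f_2(x)$ | FPA | 3.03 × 10^4 | 9.66 × 10^4 | 5.99 × 10^4 | 1.93 × 10^4 |
| $f_2(x)$ | CS | 1.38 × 10^4 | 2.45 × 10^4 | 1.69 × 10^4 | 2.59 × 10^3 |
| $f_2(x)$ | BFP | 1.61 × 10^5 | 3.16 × 10^5 | 2.53 × 10^5 | 4.57 × 10^4 |
| $f_3(x)$ | CFS | 6.22 × 10^3 | 3.48 × 10^4 | 2.11 × 10^4 | 6.55 × 10^3 |
| $f_3(x)$ | FA | 1.89 × 10^6 | 1.10 × 10^7 | 5.29 × 10^6 | 2.83 × 10^6 |
| $f_3(x)$ | ABC | 2.57 × 10^8 | 1.69 × 10^9 | 1.03 × 10^9 | 3.57 × 10^8 |
| $f_3(x)$ | FPA | 1.71 × 10^8 | 4.66 × 10^8 | 3.22 × 10^8 | 8.91 × 10^7 |
| $f_3(x)$ | CS | 1.26 × 10^6 | 6.12 × 10^6 | 2.69 × 10^6 | 1.06 × 10^6 |
| $f_3(x)$ | BFP | 1.92 × 10^9 | 4.22 × 10^9 | 2.87 × 10^8 | 2.87 × 10^9 |
| $f_4(x)$ | CFS | 2.22 × 10^−16 | 2.83 × 10^−13 | 2.49 × 10^−14 | 6.35 × 10^−14 |
| $f_4(x)$ | FA | 1.36 × 10^−11 | 0.0667 | 0.0121 | 0.0164 |
| $f_4(x)$ | ABC | 0 | 0 | 0 | 0 |
| $f_4(x)$ | FPA | 1.34 × 10^−6 | 0.0028 | 4.55 × 10^−4 | 7.03 × 10^−4 |
| $f_4(x)$ | CS | 1.80 × 10^−8 | 6.88 × 10^−5 | 1.43 × 10^−5 | 2.11 × 10^−5 |
| $f_4(x)$ | BFP | 3.10 × 10^−2 | 4.92 × 10^−1 | 3.30 × 10^−1 | 1.42 × 10^−1 |

Bold values in the table correspond to the best algorithmic values.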
Table 8. P-test values of simulated algorithms for unimodal functions (100 Dimension).

| Objective Function | FA | FPA | CS | ABC | CFS |
| --- | --- | --- | --- | --- | --- |
| $f_1(x)$ | 6.79 × 10^−8 | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_2(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_3(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_4(x)$ | 8.00 × 10^−9 | 8.00 × 10^−9 | 8.00 × 10^−9 | NA | 7.97 × 10^−9 |
Table 9. Description of multimodal test problems.

| Multimodal Test Problems | Objective Function | Search Range | Optimum Value | D |
| --- | --- | --- | --- | --- |
| Rastrigin function | $f_5(x) = 10D + \sum_{i=1}^{D} \left[ x_i^2 - 10 \cos(2\pi x_i) \right]$ | [−5.12, 5.12] | 0 | 30, 50, 100 |
| Weierstrass function | $f_6(x) = \sum_{i=1}^{D} \left( \sum_{k=0}^{k_{max}} \left[ a^k \cos\left( 2\pi b^k (x_i + 0.5) \right) \right] \right) - D \sum_{k=0}^{k_{max}} \left[ a^k \cos\left( 2\pi b^k \cdot 0.5 \right) \right]$, where a = 0.5, b = 3, kmax = 20 | [−0.5, 0.5] | 0 | 30, 50, 100 |
| Griewank | $f_7(x) = \frac{1}{4000} \sum_{i=1}^{N} x_i^2 - \prod_{i=1}^{N} \cos\left( \frac{x_i}{\sqrt{i}} \right) + 1$ | [−600, 600] | 0 | 30, 50, 100 |
| Penalized 1 function | $f_8(x) = \frac{\pi}{n} \left\{ 10 \sin(\pi y_1) + \sum_{i=1}^{n-1} (y_i - 1)^2 \left[ 1 + 10 \sin^2(\pi y_{i+1}) \right] + (y_n - 1)^2 \right\} + \sum_{i=1}^{n} u(x_i, 10, 100, 4)$, with $y_i = 1 + \frac{x_i + 1}{4}$ | [−50, 50] | 0 | 30, 50, 100 |
| Penalized 2 function | $f_9(x) = 0.1 \left\{ \sin^2(3\pi x_1) + \sum_{i=1}^{n} (x_i - 1)^2 \left[ 1 + \sin^2(3\pi x_i + 1) \right] + (x_n - 1)^2 \left[ 1 + \sin^2(2\pi x_n) \right] \right\} + \sum_{i=1}^{n} u(x_i, 5, 100, 4)$ | [−50, 50] | 0 | 30, 50, 100 |
| Ackley function | $f_{10}(x) = -20 \exp\left( -0.2 \sqrt{\frac{1}{D} \sum_{i=1}^{D} x_i^2} \right) - \exp\left( \frac{1}{D} \sum_{i=1}^{D} \cos(2\pi x_i) \right) + 20 + e$ | [−100, 100] | 0 | 30, 50, 100 |

For the penalized functions, $u(x_i, a, k, m) = \begin{cases} k (x_i - a)^m, & x_i > a \\ 0, & -a < x_i < a \\ k (-x_i - a)^m, & x_i < -a \end{cases}$
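Table 10. Results comparison for multimodal functions (30 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | CFS | 7.53 × 10^−13 | 3.59 × 10^−9 | 7.46 × 10^−10 | 9.59 × 10^−10 |
| $f_5(x)$ | FA | 2.07 × 10^−9 | 0.339 | 0.0226 | 0.0765 |
| $f_5(x)$ | ABC | 7.24 × 10^2 | 4.17 × 10^3 | 2.06 × 10^3 | 9.69 × 10^2 |
| $f_5(x)$ | FPA | 0.0017 | 0.2469 | 0.0595 | 0.063 |
| $f_5(x)$ | CS | 8.00 × 10^2 | 1.86 × 10^3 | 1.16 × 10^3 | 3.00 × 10^2 |
| $f_5(x)$ | BFP | 4.00 × 10^−3 | 1.70 × 10^1 | 8.13 × 10^0 | 3.16 × 10^0 |
| $f_6(x)$ | CFS | 1.9004 | 2.6108 | 2.3049 | 0.2186 |
| $f_6(x)$ | FA | 13.279 | 21.3048 | 16.8613 | 1.8416 |
| $f_6(x)$ | ABC | 11.1346 | 19.6264 | 15.5206 | 2.4429 |
| $f_6(x)$ | FPA | 35.7517 | 39.4474 | 37.5697 | 1.2557 |
| $f_6(x)$ | CS | 16.6273 | 23.8924 | 19.9146 | 1.9027 |
| $f_6(x)$ | BFP | 42.4496 | 50.6708 | 46.8925 | 2.1102 |
| $f_7(x)$ | CFS | 1.31 × 10^−13 | 8.61 × 10^−11 | 2.18 × 10^−11 | 2.44 × 10^−11 |
| $f_7(x)$ | FA | 2.25 × 10^−7 | 1.48 × 10^−5 | 4.01 × 10^−6 | 3.54 × 10^−6 |
| $f_7(x)$ | ABC | 3.5295 | 72.9612 | 8.0289 | 16.7278 |
| $f_7(x)$ | FPA | 1.50 × 10^−4 | 0.0874 | 0.0161 | 0.0231 |
| $f_7(x)$ | CS | 4.5803 | 18.3641 | 8.8762 | 3.1545 |
| $f_7(x)$ | BFP | 7.5279 | 1.65 × 10^2 | 7.69 × 10^1 | 4.51 × 10^1 |
| $f_8(x)$ | CFS | 0.187 | 4.4737 | 0.5345 | 0.9324 |
| $f_8(x)$ | FA | 0.0013 | 0.133 | 0.0167 | 0.0287 |
| $f_8(x)$ | ABC | 8.04 × 10^6 | 1.68 × 10^8 | 6.93 × 10^7 | 4.39 × 10^7 |
| $f_8(x)$ | FPA | 5.42 × 10^5 | 3.37 × 10^7 | 8.71 × 10^6 | 8.63 × 10^6 |
| $f_8(x)$ | CS | 9.846 | 81.0669 | 24.5174 | 15.5959 |
| $f_8(x)$ | BFP | 3.63 × 10^8 | 9.02 × 10^8 | 5.82 × 10^8 | 1.64 × 10^8 |
| $f_9(x)$ | CFS | 0.0086 | 0.0873 | 0.0286 | 0.0015 |
| $f_9(x)$ | FA | 0.0059 | 0.0619 | 0.0119 | 0.0031 |
| $f_9(x)$ | ABC | 2.41 × 10^7 | 3.19 × 10^8 | 1.49 × 10^8 | 8.95 × 10^7 |
| $f_9(x)$ | FPA | 1.38 × 10^7 | 1.09 × 10^8 | 4.99 × 10^7 | 2.45 × 10^7 |
| $f_9(x)$ | CS | 51.2308 | 1.83 × 10^5 | 3.13 × 10^4 | 5.18 × 10^4 |
| $f_9(x)$ | BFP | 4.59 × 10^8 | 1.65 × 10^9 | 1.13 × 10^9 | 3.41 × 10^8 |
| $f_{10}(x)$ | CFS | 0.5159 | 0.879 | 0.6915 | 0.0111 |
| $f_{10}(x)$ | FA | 0.1462 | 0.469 | 0.261 | 0.0745 |
| $f_{10}(x)$ | ABC | 6.0052 | 14.9839 | 11.0513 | 2.6699 |
| $f_{10}(x)$ | FPA | 14.0678 | 19.0195 | 17.4088 | 1.0373 |
| $f_{10}(x)$ | CS | 10.0393 | 17.1319 | 13.0073 | 1.6329 |
| $f_{10}(x)$ | BFP | 19.8877 | 20.849 | 20.5093 | 0.2645 |

Bold values in the table correspond to the best algorithmic values.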
Table 11. P-test values of various algorithms for multimodal functions (30 Dimension).

| Objective Function | FA | FPA | CS | ABC | BFP | CFS |
| --- | --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | 1.23 × 10^−7 | 1.23 × 10^−7 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_6(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_7(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_8(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_9(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 7.57 × 10^−4 |
| $f_{10}(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
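Table 12. Results comparison for multimodal functions (50 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | CFS | 7.88 × 10^−10 | 3.30 × 10^−9 | 7.39 × 10^−10 | 9.49 × 10^−10 |
| $f_5(x)$ | FA | 1.85 × 10^−9 | 0.1989 | 0.0109 | 0.0444 |
| $f_5(x)$ | ABC | 7.08 × 10^3 | 2.62 × 10^4 | 1.74 × 10^4 | 6.25 × 10^3 |
| $f_5(x)$ | FPA | 0.0083 | 0.323 | 0.1085 | 0.1053 |
| $f_5(x)$ | CS | 3.08 × 10^3 | 5.85 × 10^3 | 4.34 × 10^3 | 8.13 × 10^2 |
| $f_5(x)$ | BFP | 1.51 × 10^0 | 1.98 × 10^1 | 1.09 × 10^1 | 5.08 × 10^0 |
| $f_6(x)$ | CFS | 4.4288 | 6.3445 | 5.6194 | 0.5645 |
| $f_6(x)$ | FA | 28.8712 | 39.2514 | 33.3655 | 3.1566 |
| $f_6(x)$ | ABC | 32.506 | 46.6375 | 38.085 | 3.2999 |
| $f_6(x)$ | FPA | 66.9832 | 74.2188 | 70.8866 | 2.0856 |
| $f_6(x)$ | CS | 33.8792 | 46.2211 | 39.4868 | 3.1594 |
| $f_6(x)$ | BFP | 68.1743 | 89.0548 | 80.9581 | 5.9281 |
| $f_7(x)$ | CFS | 9.55 × 10^−14 | 3.91 × 10^−10 | 4.20 × 10^−11 | 9.14 × 10^−11 |
| $f_7(x)$ | FA | 1.31 × 10^−9 | 0.0038 | 1.91 × 10^−4 | 8.44 × 10^−4 |
| $f_7(x)$ | ABC | 45.2242 | 3.06 × 10^2 | 1.84 × 10^2 | 67.741 |
| $f_7(x)$ | FPA | 0.0017 | 0.0634 | 0.0144 | 0.0174 |
| $f_7(x)$ | CS | 21.4091 | 67.0806 | 38.1323 | 9.8239 |
| $f_7(x)$ | BFP | 1.5237 | 1.77 × 10^2 | 8.07 × 10^1 | 6.33 × 10^1 |
| $f_8(x)$ | CFS | 1.4169 | 11.1903 | 4.2718 | 2.5936 |
| $f_8(x)$ | FA | 0.0061 | 2.0945 | 0.4204 | 0.5428 |
| $f_8(x)$ | ABC | 8.05 × 10^6 | 1.07 × 10^8 | 5.58 × 10^7 | 2.92 × 10^7 |
| $f_8(x)$ | FPA | 4.61 × 10^5 | 9.02 × 10^7 | 2.58 × 10^7 | 2.07 × 10^7 |
| $f_8(x)$ | CS | 65.7835 | 8.56 × 10^5 | 6.28 × 10^4 | 1.90 × 10^5 |
| $f_8(x)$ | BFP | 2.73 × 10^8 | 1.33 × 10^9 | 9.55 × 10^8 | 3.23 × 10^8 |
| $f_9(x)$ | CFS | 0.0809 | 1.446 | 0.4249 | 0.0559 |
| $f_9(x)$ | FA | 0.0075 | 0.3539 | 0.0444 | 0.0817 |
| $f_9(x)$ | ABC | 8.40 × 10^7 | 2.33 × 10^8 | 1.52 × 10^8 | 4.62 × 10^7 |
| $f_9(x)$ | FPA | 3.80 × 10^7 | 3.59 × 10^8 | 1.24 × 10^8 | 7.85 × 10^7 |
| $f_9(x)$ | CS | 1.47 × 10^5 | 1.09 × 10^6 | 5.76 × 10^5 | 2.86 × 10^5 |
| $f_9(x)$ | BFP | 4.59 × 10^8 | 1.65 × 10^9 | 1.13 × 10^9 | 3.41 × 10^9 |
| $f_{10}(x)$ | CFS | 2.3662 | 18.0276 | 9.7138 | 4.4605 |
| $f_{10}(x)$ | FA | 0.3074 | 0.8458 | 0.3074 | 0.1439 |
| $f_{10}(x)$ | ABC | 10.5978 | 18.3258 | 15.8443 | 1.9489 |
| $f_{10}(x)$ | FPA | 16.287 | 18.9147 | 17.6333 | 0.7004 |
| $f_{10}(x)$ | CS | 10.3972 | 17.8387 | 13.9978 | 1.9681 |
| $f_{10}(x)$ | BFP | 19.5794 | 20.8868 | 20.6361 | 0.3188 |

Bold values in the table correspond to the best algorithmic values.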
Table 13. P-test values of various algorithms for multimodal functions (50 Dimension).

| Objective Function | FA | FPA | CS | ABC | BFP | CFS |
| --- | --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | 1.06 × 10^−7 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_6(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_7(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_8(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_9(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 9.12 × 10^−7 |
| $f_{10}(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
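Table 14. Results comparison for multimodal functions (100 Dimension).

| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
| --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | CFS | 3.21 × 10^−10 | 2.21 × 10^−9 | 7.12 × 10^−10 | 7.73 × 10^−10 |
| $f_5(x)$ | FA | 1.49 × 10^−8 | 1.91 × 10^−6 | 3.98 × 10^−7 | 5.25 × 10^−7 |
| $f_5(x)$ | ABC | 8.09 × 10^4 | 1.34 × 10^5 | 1.05 × 10^5 | 1.44 × 10^4 |
| $f_5(x)$ | FPA | 0.0033 | 0.2715 | 0.0813 | 0.0755 |
| $f_5(x)$ | CS | 1.22 × 10^4 | 2.03 × 10^4 | 1.65 × 10^4 | 2.37 × 10^3 |
| $f_5(x)$ | BFP | 2.10 × 10^−3 | 1.49 × 10^1 | 8.27 × 10^0 | 4.11 × 10^0 |
| $f_6(x)$ | CFS | 16.4301 | 22.9819 | 18.3616 | 1.4303 |
| $f_6(x)$ | FA | 67.1686 | 81.6839 | 74.2381 | 4.1565 |
| $f_6(x)$ | ABC | 1.03 × 10^2 | 1.25 × 10^2 | 1.16 × 10^2 | 6.363 |
| $f_6(x)$ | FPA | 1.23 × 10^2 | 1.62 × 10^2 | 1.53 × 10^2 | 9.3169 |
| $f_6(x)$ | CS | 81.7266 | 96.5011 | 89.2076 | 4.8649 |
| $f_6(x)$ | BFP | 1.47 × 10^2 | 1.86 × 10^2 | 1.72 × 10^2 | 1.03 × 10^1 |
| $f_7(x)$ | CFS | 9.20 × 10^−14 | 3.20 × 10^−10 | 4.46 × 10^−11 | 7.82 × 10^−11 |
| $f_7(x)$ | FA | 1.11 × 10^−6 | 6.13 × 10^−6 | 1.83 × 10^−6 | 1.65 × 10^−6 |
| $f_7(x)$ | ABC | 6.68 × 10^2 | 1.08 × 10^3 | 8.83 × 10^2 | 1.15 × 10^2 |
| $f_7(x)$ | FPA | 0.003 | 0.071 | 0.022 | 0.0212 |
| $f_7(x)$ | CS | 1.06 × 10^2 | 1.91 × 10^2 | 1.41 × 10^2 | 22.8388 |
| $f_7(x)$ | BFP | 1.0051 | 1.60 × 10^2 | 5.98 × 10^1 | 4.05 × 10^1 |
| $f_8(x)$ | CFS | 18.9008 | 1.86 × 10^2 | 50.7091 | 35.0491 |
| $f_8(x)$ | FA | 9.0521 | 48.0274 | 28.2233 | 10.0771 |
| $f_8(x)$ | ABC | 3.42 × 10^6 | 7.72 × 10^7 | 3.44 × 10^7 | 1.94 × 10^7 |
| $f_8(x)$ | FPA | 3.35 × 10^7 | 1.78 × 10^7 | 8.73 × 10^7 | 4.64 × 10^7 |
| $f_8(x)$ | CS | 2.67 × 10^4 | 4.32 × 10^6 | 7.91 × 10^5 | 9.75 × 10^5 |
| $f_8(x)$ | BFP | 7.01 × 10^8 | 3.61 × 10^9 | 2.49 × 10^9 | 8.69 × 10^8 |
| $f_9(x)$ | CFS | 1.7343 | 5.7693 | 3.5451 | 1.1998 |
| $f_9(x)$ | FA | 2.3704 | 9.5091 | 4.5693 | 1.5113 |
| $f_9(x)$ | ABC | 3.89 × 10^7 | 2.01 × 10^8 | 1.07 × 10^8 | 4.86 × 10^8 |
| $f_9(x)$ | FPA | 5.12 × 10^7 | 7.27 × 10^8 | 3.17 × 10^8 | 2.03 × 10^8 |
| $f_9(x)$ | CS | 2.86 × 10^6 | 2.04 × 10^7 | 7.63 × 10^6 | 5.15 × 10^6 |
| $f_9(x)$ | BFP | 1.64 × 10^9 | 6.02 × 10^9 | 4.75 × 10^9 | 1.40 × 10^9 |
| $f_{10}(x)$ | CFS | 4.5948 | 19.4525 | 12.4022 | 3.9196 |
| $f_{10}(x)$ | FA | 1.3412 | 3.5494 | 2.7283 | 0.5434 |
| $f_{10}(x)$ | ABC | 18.2683 | 19.7601 | 19.1378 | 0.3621 |
| $f_{10}(x)$ | FPA | 16.6528 | 19.686 | 18.2081 | 0.8273 |
| $f_{10}(x)$ | CS | 13.5684 | 17.8497 | 15.6218 | 1.4927 |
| $f_{10}(x)$ | BFP | 20.0389 | 21.0637 | 20.7694 | 0.2923 |

Bold values in the table correspond to the best algorithmic values.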
Table 15. P-test values of various algorithms for multimodal functions (100 Dimension).

| Objective Function | FA | FPA | CS | ABC | BFP | CFS |
| --- | --- | --- | --- | --- | --- | --- |
| $f_5(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_6(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_7(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_8(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
| $f_9(x)$ | 0.0439 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{10}(x)$ | NA | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 |
Table 16. Description of fixed dimension test functions.
Table 16. Description of fixed dimension test functions.
| Fixed Dimension Test Problem | Objective Function | Search Range | Optimum Value | D |
|---|---|---|---|---|
| Branin RCOS function | $f_{11}(x) = \left(x_2 - \frac{5.1}{4\pi^2} x_1^2 + \frac{5}{\pi} x_1 - 6\right)^2 + 10\left(1 - \frac{1}{8\pi}\right)\cos x_1 + 10$ | $x_1 \in [-5, 10]$, $x_2 \in [0, 15]$ | 0.397887 | 2 |
| Six Hump Camel function | $f_{12}(x) = \left(4 - 2.1 x_1^2 + \frac{x_1^4}{3}\right) x_1^2 + x_1 x_2 + \left(-4 + 4 x_2^2\right) x_2^2$ | [−5, 5] | −1.0316 | 2 |
| Goldstein & Price function | $f_{13}(x) = \left[1 + (x_1 + x_2 + 1)^2 (19 - 14 x_1 + 3 x_1^2 - 14 x_2 + 6 x_1 x_2 + 3 x_2^2)\right] \left[30 + (2 x_1 - 3 x_2)^2 (18 - 32 x_1 + 12 x_1^2 + 48 x_2 - 36 x_1 x_2 + 27 x_2^2)\right]$ | [−2, 2] | 3 | 2 |
| Hartmann function 3 | $f_{14}(x) = -\sum_{i=1}^{4} \alpha_i \exp\left[-\sum_{j=1}^{3} A_{ij} (x_j - P_{ij})^2\right]$ | [0, 1] | −3.86278 | 3 |
| Hartmann function 6 | $f_{15}(x) = -\sum_{i=1}^{4} \alpha_i \exp\left[-\sum_{j=1}^{6} A_{ij} (x_j - P_{ij})^2\right]$ | [0, 1] | −3.32237 | 6 |
| Shekel 5 | $f_{16}(x) = -\sum_{j=1}^{5} \left[\sum_{i=1}^{4} (x_i - C_{ij})^2 + \beta_j\right]^{-1}$ | [0, 10] | −10.1532 | 4 |
| Shekel 7 | $f_{17}(x) = -\sum_{j=1}^{7} \left[\sum_{i=1}^{4} (x_i - C_{ij})^2 + \beta_j\right]^{-1}$ | [0, 10] | −10.4029 | 4 |
| Shekel 10 | $f_{18}(x) = -\sum_{j=1}^{10} \left[\sum_{i=1}^{4} (x_i - C_{ij})^2 + \beta_j\right]^{-1}$ | [0, 10] | −10.5364 | 4 |
| Easom function | $f_{19}(x) = -\cos x_1 \cos x_2 \exp\left(-(x_1 - \pi)^2 - (x_2 - \pi)^2\right)$ | [−10, 10] | −1 | 2 |
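As a sanity check on the two-dimensional entries of Table 16, the sketch below evaluates four of the functions at their commonly cited minimizers and compares the results with the listed optimum values. The function names and the `checks` dictionary are illustrative; the minimizer coordinates are the standard literature values, not taken from this article.

```python
import math

def branin(x1, x2):
    inner = x2 - 5.1 / (4 * math.pi ** 2) * x1 ** 2 + 5 / math.pi * x1 - 6
    return inner ** 2 + 10 * (1 - 1 / (8 * math.pi)) * math.cos(x1) + 10

def six_hump_camel(x1, x2):
    return ((4 - 2.1 * x1 ** 2 + x1 ** 4 / 3) * x1 ** 2
            + x1 * x2
            + (-4 + 4 * x2 ** 2) * x2 ** 2)

def goldstein_price(x1, x2):
    first = 1 + (x1 + x2 + 1) ** 2 * (
        19 - 14 * x1 + 3 * x1 ** 2 - 14 * x2 + 6 * x1 * x2 + 3 * x2 ** 2)
    second = 30 + (2 * x1 - 3 * x2) ** 2 * (
        18 - 32 * x1 + 12 * x1 ** 2 + 48 * x2 - 36 * x1 * x2 + 27 * x2 ** 2)
    return first * second

def easom(x1, x2):
    return (-math.cos(x1) * math.cos(x2)
            * math.exp(-(x1 - math.pi) ** 2 - (x2 - math.pi) ** 2))

# Value of each function at a commonly cited global minimizer.
checks = {
    "branin": branin(math.pi, 2.275),
    "six_hump_camel": six_hump_camel(0.0898, -0.7126),
    "goldstein_price": goldstein_price(0.0, -1.0),
    "easom": easom(math.pi, math.pi),
}
for name, value in checks.items():
    print(f"{name}: {value:.6f}")
```

The Hartmann and Shekel functions are omitted because they depend on the constant tables $\alpha$, $A$, $P$, $C$, and $\beta$, which are not reproduced in this section.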
Table 17. Results comparison for fixed dimension functions.
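| Objective Function | Algorithm | Best | Worst | Mean | Standard Deviation |
|---|---|---|---|---|---|
| $f_{11}(x)$ | CFS | 0.3979 | 0.3979 | 0.3979 | 2.19 × 10^−11 |
| | FA | 0.3979 | 0.3979 | 0.3979 | 1.30 × 10^−8 |
| | ABC | 0 | 0 | 0 | 0 |
| | FPA | 0.3979 | 0.3983 | 0.398 | 9.64 × 10^−5 |
| | CS | 0.3979 | 0.3979 | 0.3979 | 5.32 × 10^−8 |
| | BFP | 0.4416 | 5.3576 | 3.0721 | 1.63 × 10^0 |
| $f_{12}(x)$ | CFS | −1.0316 | −1.0316 | −1.0316 | 1.66 × 10^−10 |
| | FA | −1.3016 | −1.3015 | −1.0316 | 3.47 × 10^−5 |
| | ABC | −1.3016 | −1.0250 | −1.0310 | 0.0015 |
| | FPA | −1.3016 | −1.3016 | −1.3016 | 1.24 × 10^−5 |
| | CS | −1.3016 | −1.0316 | −1.3016 | 8.22 × 10^−11 |
| | BFP | −0.9884 | 4.4587 | 0.1862 | 1.50 × 10^0 |
| $f_{13}(x)$ | CFS | 3 | 3 | 3 | 1.54 × 10^−12 |
| | FA | 3 | 3 | 3 | 1.51 × 10^−7 |
| | ABC | 3.0004 | 3.0531 | 3.0107 | 0.0148 |
| | FPA | 3 | 3.0015 | 3.0004 | 4.73 × 10^−4 |
| | CS | 3 | 3 | 3 | 9.86 × 10^−9 |
| | BFP | 3.3525 | 98.258 | 47.458 | 3.46 × 10^1 |
| $f_{14}(x)$ | CFS | −3.8628 | −3.8628 | −3.8628 | 6.56 × 10^−12 |
| | FA | −3.8628 | −2.1968 | −3.3064 | 0.6077 |
| | ABC | −3.8628 | −3.8621 | −3.8626 | 2.17 × 10^−4 |
| | FPA | −3.8325 | −1.5171 | −3.3253 | 0.6709 |
| | CS | −3.8628 | −3.8628 | −3.8628 | 1.13 × 10^−8 |
| | BFP | −0.5359 | −3.25 × 10^−6 | −0.0814 | 1.65 × 10^−1 |
| $f_{15}(x)$ | CFS | −3.3224 | −3.3224 | −3.3224 | 3.74 × 10^−7 |
| | FA | −3.3224 | −3.0639 | −3.2469 | 9.32 × 10^−2 |
| | ABC | −3.3223 | −3.1954 | −3.2461 | 0.059 |
| | FPA | −3.2275 | −2.9663 | −3.1345 | 0.0702 |
| | CS | −3.3223 | −3.3140 | −3.3201 | 0.0028 |
| | BFP | −2.6298 | −0.7595 | −1.6330 | 0.5693 |
| $f_{16}(x)$ | CFS | −10.1532 | −10.1532 | −10.1532 | 5.74 × 10^−5 |
| | FA | −5.0552 | −5.0552 | −5.0552 | 1.03 × 10^−8 |
| | ABC | −10.1486 | −2.6075 | −5.5322 | 3.4454 |
| | FPA | −5.0546 | −5.0419 | −5.0513 | 0.0033 |
| | CS | −10.0826 | −9.3309 | −10.0826 | 0.1799 |
| | BFP | −3.9584 | −1.2893 | −2.3915 | 0.8119 |
| $f_{17}(x)$ | CFS | −10.4029 | −10.4029 | −10.4029 | 1.33 × 10^−4 |
| | FA | −5.0877 | −5.0877 | −5.0877 | 9.13 × 10^−9 |
| | ABC | −10.5359 | −2.4206 | −5.3332 | 3.1817 |
| | FPA | −5.0864 | −5.0771 | −5.0837 | 0.0025 |
| | CS | −10.5358 | −73868 | −10.3006 | 0.6974 |
| | BFP | −4.4980 | −1.6336 | −2.6498 | 0.8739 |
| $f_{18}(x)$ | CFS | −10.5364 | −10.5364 | −10.5364 | 1.87 × 10^−6 |
| | FA | −5.1285 | −5.1285 | −5.1285 | 9.16 × 10^−9 |
| | ABC | −10.4895 | −1.8556 | −4.6289 | 3.0032 |
| | FPA | −5.1279 | −5.1185 | −5.1244 | 0.0028 |
| | CS | −10.5357 | −9.8686 | −10.4320 | 0.1724 |
| | BFP | −4.3369 | −1.6523 | −2.5939 | 0.7823 |
| $f_{19}(x)$ | CFS | −1 | −1.0000 | −1.0000 | 6.07 × 10^−14 |
| | FA | −1.0000 | −1.0000 | −1.0000 | 1.26 × 10^−8 |
| | ABC | −1.0000 | −0.9886 | −0.9977 | 0.0029 |
| | FPA | −1.0000 | −0.9998 | −0.9999 | 6.79 × 10^−5 |
| | CS | −1.0000 | −1.0000 | −1.0000 | 5.30 × 10^−10 |
| | BFP | −0.5894 | −2.18 × 10^−13 | −0.0574 | 1.51 × 10^−1 |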
Bold values in the table correspond to the best algorithmic values.
Table 18. P-test values of various algorithms for fixed dimension functions.
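| Objective Function | FA | FPA | CS | ABC | BFP | CFS |
|---|---|---|---|---|---|---|
| $f_{11}(x)$ | 7.89 × 10^−8 | 7.89 × 10^−8 | 9.17 × 10^−8 | 8.00 × 10^−9 | 6.79 × 10^−8 | NA |
| $f_{12}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 0.0679 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{13}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{14}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 7.89 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{15}(x)$ | 0.1895 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{16}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{17}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 0.0012 | 1.60 × 10^−4 | 6.79 × 10^−8 | NA |
| $f_{18}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |
| $f_{19}(x)$ | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | 6.79 × 10^−8 | NA |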
Table 19. Parameters for algorithms.
| Algorithm | Parameter | Value |
|---|---|---|
| CFS | Population size | 50 for XOR and Balloon; 20 for the rest |
| | Probability switch | 0.8 |
| | Discovery rate of alien egg (pa) | 0.25 |
| | Maximum number of iterations | 250 |
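For readers reproducing the experiments, the settings in Table 19 could be captured in a small configuration object. This is only one possible encoding: the class `CFSParams`, its field names, and the helper `cfs_params_for` are hypothetical, and only the numeric values come from the table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFSParams:
    """CFS settings from Table 19 (field names are illustrative)."""
    population_size: int
    switch_probability: float = 0.8   # "Probability switch"
    discovery_rate: float = 0.25      # "Discovery rate of alien egg (pa)"
    max_iterations: int = 250

def cfs_params_for(dataset: str) -> CFSParams:
    # Table 19: population of 50 for XOR and Balloon, 20 for the other datasets.
    small_datasets = {"3-bit XOR", "Balloon"}
    return CFSParams(population_size=50 if dataset in small_datasets else 20)

for name in ("3-bit XOR", "Balloon", "Iris", "Breast Cancer", "Heart"):
    print(name, cfs_params_for(name))
```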
Table 20. Classification datasets.
| Classification Datasets | Attributes Count | Training Samples Count | Test Samples Count | Number of Classes |
|---|---|---|---|---|
| 3-bit XOR | 3 | 8 | 8 as training samples | 2 |
| Balloon | 4 | 16 | 16 as training samples | 2 |
| Iris | 4 | 150 | 150 as training samples | 3 |
| Breast Cancer | 9 | 599 | 100 | 2 |
| Heart | 22 | 80 | 187 | 2 |
Table 21. MLP structure for each dataset.
| Classification Datasets | Attributes Count | MLP Structure |
|---|---|---|
| 3-bit XOR | 3 | 3-7-1 |
| Balloon | 4 | 4-9-1 |
| Iris | 4 | 4-9-3 |
| Breast Cancer | 9 | 9-19-1 |
| Heart | 22 | 22-45-1 |
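Each structure in Table 21 fixes the length of the vector that a population-based trainer has to optimize. The sketch below counts weights and biases for every structure, assuming a fully connected network with one bias per hidden and output node; that bias convention is an assumption, and the function name `mlp_dimension` is hypothetical. Every hidden layer in the table has 2n + 1 nodes for n input attributes, which the code also checks.

```python
def mlp_dimension(structure: str) -> int:
    """Number of weights plus biases for a fully connected MLP.

    Assumes one bias per non-input node (not stated in this section).
    """
    layers = [int(n) for n in structure.split("-")]
    weights = sum(a * b for a, b in zip(layers, layers[1:]))
    biases = sum(layers[1:])
    return weights + biases

structures = {
    "3-bit XOR": "3-7-1",
    "Balloon": "4-9-1",
    "Iris": "4-9-3",
    "Breast Cancer": "9-19-1",
    "Heart": "22-45-1",
}
dimensions = {name: mlp_dimension(s) for name, s in structures.items()}

# Check the 2n + 1 pattern for the hidden layer of every structure.
hidden_rule_holds = all(
    int(s.split("-")[1]) == 2 * int(s.split("-")[0]) + 1
    for s in structures.values()
)
print(dimensions, hidden_rule_holds)
```

Under this convention the search space grows from 36 variables for 3-bit XOR to 1081 for Heart.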
Table 22. Comparison results of CFS-MLP for XOR dataset.
| Algorithm | Average | Standard Deviation |
|---|---|---|
| CFS-MLP | 9.687 × 10^−12 | 2.520 × 10^−11 |
| GWO-MLP | 9.410 × 10^−3 | 2.950 × 10^−1 |
| PSO-MLP | 8.405 × 10^−2 | 3.594 × 10^−2 |
| GA-MLP | 1.810 × 10^−4 | 4.130 × 10^−4 |
| ACO-MLP | 1.803 × 10^−1 | 2.526 × 10^−2 |
| ES-MLP | 1.187 × 10^−1 | 1.157 × 10^−2 |
| PBIL-MLP | 3.022 × 10^−2 | 3.966 × 10^−2 |
| WOA-MLP | 8.420 × 10^−2 | 5.140 × 10^−2 |
| MFO-MLP | 5.298 × 10^−6 | 1.038 × 10^−5 |
Bold values in the table correspond to the best algorithmic values.
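The quantities in this table are most naturally read as the mean and standard deviation of a training error over repeated runs. As a rough illustration of what each candidate solution is scored on, the sketch below computes the mean squared error of a 3-7-1 MLP on the 3-bit XOR data from a flat parameter vector. Several points are assumptions rather than details from this section: MSE as the error measure, sigmoid activations, the parity definition of 3-bit XOR, and the packing order of the 36 parameters. All names are hypothetical.

```python
import math
from itertools import product

N_IN, N_HID = 3, 7  # 3-7-1 structure from Table 21

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xor3_dataset():
    # 8 samples; the target is the parity of the three input bits (assumed).
    return [(bits, float(sum(bits) % 2)) for bits in product((0, 1), repeat=N_IN)]

def mse_fitness(params):
    """MSE of a 3-7-1 MLP whose parameters are packed in a flat list.

    Assumed layout: hidden weights (7 x 3), hidden biases (7),
    output weights (7), output bias (1), for 36 values in total.
    """
    assert len(params) == N_HID * N_IN + 2 * N_HID + 1
    w_hid = [params[i * N_IN:(i + 1) * N_IN] for i in range(N_HID)]
    offset = N_HID * N_IN
    b_hid = params[offset:offset + N_HID]
    w_out = params[offset + N_HID:offset + 2 * N_HID]
    b_out = params[-1]
    data = xor3_dataset()
    total = 0.0
    for bits, target in data:
        hidden = [
            sigmoid(sum(w * x for w, x in zip(w_hid[i], bits)) + b_hid[i])
            for i in range(N_HID)
        ]
        output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
        total += (output - target) ** 2
    return total / len(data)

# An all-zero vector outputs 0.5 for every sample, so its MSE is 0.25.
baseline = mse_fitness([0.0] * 36)
print(baseline)
```

A metaheuristic such as CFS would call a function of this kind once per candidate per iteration and try to drive the value toward zero.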
Table 23. Comparison results of CFS-MLP for the Balloon dataset.
| Algorithm | Average | Standard Deviation |
|---|---|---|
| CFS-MLP | 1.19 × 10^−41 | 1.90 × 10^−41 |
| GWO-MLP | 9.38 × 10^−15 | 2.81 × 10^−14 |
| PSO-MLP | 0.000585 | 0.000749 |
| GA-MLP | 5.08 × 10^−24 | 1.06 × 10^−23 |
| ACO-MLP | 0.004854 | 0.00776 |
| ES-MLP | 0.019055 | 0.17026 |
| PBIL-MLP | 2.49 × 10^−5 | 5.27 × 10^−5 |
| WOA-MLP | 4.88 × 10^−6 | 1.41 × 10^−5 |
| MFO-MLP | 1.85 × 10^−15 | 6.18 × 10^−15 |
Bold values in the table correspond to the best algorithmic values.
Table 24. Comparison results of CFS-MLP for the Iris dataset.
| Algorithm | Average | Standard Deviation |
|---|---|---|
| CFS-MLP | 0.06673 | 5.31 × 10^−4 |
| GWO-MLP | 0.0229 | 0.0032 |
| PSO-MLP | 0.22868 | 0.057235 |
| GA-MLP | 0.089912 | 0.123638 |
| ACO-MLP | 0.405979 | 0.053775 |
| ES-MLP | 0.31434 | 0.052142 |
| PBIL-MLP | 0.116067 | 0.036355 |
| WOA-MLP | 0.734134 | 0.051808 |
| MFO-MLP | 0.667957 | 0.003467 |
Bold values in the table correspond to the best algorithmic values.
Table 25. Comparison results of CFS-MLP for the Breast Cancer dataset.
| Algorithm | Average | Standard Deviation |
|---|---|---|
| CFS-MLP | 0.0018 | 2.83 × 10^−4 |
| GWO-MLP | 0.0012 | 7.44 × 10^−5 |
| PSO-MLP | 0.034881 | 0.002472 |
| GA-MLP | 0.003026 | 0.0015 |
| ACO-MLP | 0.01351 | 0.002137 |
| ES-MLP | 0.04032 | 0.00247 |
| PBIL-MLP | 0.032009 | 0.003065 |
| WOA-MLP | 0.006243 | 0.003128 |
| MFO-MLP | 0.004038 | 0.003041 |
Bold values in the table correspond to the best algorithmic values.
Table 26. Comparison results of CFS-MLP for the Heart dataset.
| Algorithm | Average | Standard Deviation |
|---|---|---|
| CFS-MLP | 0.0686 | 0.0067 |
| GWO-MLP | 0.1226 | 0.0077 |
| PSO-MLP | 0.188568 | 0.008939 |
| GA-MLP | 0.093047 | 0.02246 |
| ACO-MLP | 0.22843 | 0.004979 |
| ES-MLP | 0.192473 | 0.015174 |
| PBIL-MLP | 0.154096 | 0.018204 |
| WOA-MLP | 0.179664 | 0.052152 |
| MFO-MLP | 0.08321 | 0.02062 |
Bold values in the table correspond to the best algorithmic values.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Salgotra, R.; Mittal, N.; Mittal, V. A New Parallel Cuckoo Flower Search Algorithm for Training Multi-Layer Perceptron. Mathematics 2023, 11, 3080. https://doi.org/10.3390/math11143080
