1. Introduction
The primary concern of optimisation is finding the minima or maxima of an objective function, subject to given constraints. Optimisation problems occur naturally in machine learning, artificial intelligence, computer science, and operations research, and optimisation has been used to improve processes across virtually every human endeavour. A wide variety of optimisation techniques exist, including linear programming, quadratic programming, convex optimisation, interior-point methods, trust-region methods, conjugate-gradient methods, evolutionary algorithms, heuristics, and metaheuristics [
1]. The era of artificial intelligence ushered in optimisation techniques capable of finding near-optimal solutions to challenging, complex real-world problems. Then came the era of nature-inspired and bio-inspired metaheuristic optimisation, which has recorded huge successes and gained increasing popularity over the past four decades.
Many attribute the popularity of nature-inspired and bio-inspired metaheuristic optimisation algorithms to their ability to find near-optimal solutions [
2]. This success stems from how these optimisers mimic natural phenomena [
3]; such phenomena have inspired the development of almost all metaheuristic algorithms. Evolutionary techniques are gaining popularity in the same vein, with many novel techniques developed regularly, and their performance matches that of the nature-inspired and bio-inspired algorithms [
4].
The successes of these metaheuristic optimisers on real-world problems come with tremendous challenges, which arise because real-world optimisation problems are complex and subject to multiple nonlinear constraints. The ability of optimisers to navigate these challenges and achieve optimality depends heavily on how the initial population is distributed, especially for gradient-based optimisers [
5]. Though metaheuristic optimisers are gradient-free, they must also be initialised; thus, they are greatly influenced by the nature of the initial population, especially on large-scale multimodal problems. Population-based metaheuristic algorithms show varying abilities to reach the global optimum when the initialisation scheme is varied [
6].
Interestingly, in the last decade, there has been exponential growth in the number of proposed nature-inspired optimisation algorithms, accompanied by corresponding claims of novelty and of solid capability as powerful optimisation tools. Unfortunately, most of these algorithms do not seem to draw genuine inspiration from nature or to incorporate any methodology that successfully mimics natural phenomena or systems [
7]. From the perspective of the No Free Lunch theorem, many real-world optimisation problems still require new approaches or methods before they can be solved optimally. The theorem establishes that although any given method may solve some problems efficiently, no single method can effectively solve all problems. Studies have shown the popularity and success of algorithms that mimic the behaviours of animals in solving optimisation problems with reasonable accuracy.
The distribution most commonly used for initialisation by metaheuristic algorithms is the random number generator, which produces sequences (used as position vectors) that follow the uniform probability distribution. However, these sequences do not have low discrepancy: they are not equidistributed in a given search area and do not cover the search space efficiently [
8]. On the other hand, quasirandom numbers can be generated with low discrepancy; these have been proven to have optimal discrepancy because they tend to cover the search space better, which makes them helpful in optimisation [
9]. Low-discrepancy sequences, such as Van der Corput, Sobol, Faure, and Halton, are potent computational tools and have been used to improve the performance of optimisation algorithms. Many other approaches exist in the literature, and these are presented in this paper.
When the set of initial solutions happens to lie near the true optimum, the probability of finding that optimum increases and the search effort is significantly reduced. In optimisation problems, however, the location of the true optimum is unknown a priori, and initialisation is a stochastic process. Population size is equally important: in high-dimensional problems, a small population may lie sparsely in unpromising regions and return biased, suboptimal solutions. In addition, the different distributions used to generate position vectors for the initial population may have different sampling emphases and hence produce different degrees of diversity.
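To make the notion of population diversity concrete, the short Python sketch below computes one commonly used measure, the mean pairwise Euclidean distance of a population. The population sizes, dimension, and bounds are illustrative assumptions, not values taken from any study cited here.

```python
import numpy as np

def mean_pairwise_distance(population: np.ndarray) -> float:
    """One common diversity measure: the mean Euclidean distance
    between all pairs of individuals in the population."""
    n = len(population)
    # Pairwise differences via broadcasting; O(n^2 * d) memory.
    diffs = population[:, None, :] - population[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Average over the n*(n-1) off-diagonal pairs.
    return dists.sum() / (n * (n - 1))

rng = np.random.default_rng(seed=42)
small = rng.uniform(-10, 10, size=(10, 30))    # small population, 30-D
large = rng.uniform(-10, 10, size=(100, 30))   # larger population, 30-D
print(mean_pairwise_distance(small), mean_pairwise_distance(large))
```

Tracking such a measure across initialisation schemes gives a simple, quantitative way to compare their sampling emphases.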
To demonstrate the importance of initialisation, consider the Bukin N.6 function shown in
Figure 1. We assumed a bounded search space. The advanced arithmetic optimization algorithm (nAOA) [
10] was initialised with the beta distribution, and the distribution of the population after the first iteration is shown in
Figure 2. The blue dots represent the current locations of the population, the red asterisk (*) represents the current best solution, and the red star (★) denotes the global optimum of the Bukin function. The nAOA converged towards the optimal solution after a few iterations, as shown in
Figure 3. Similarly, the nAOA was initialised with the random number generator, and the distribution of the population after the first iteration is shown in
Figure 4. In this case, the population of the nAOA quickly fell into a local optimum after a few iterations, as shown in
Figure 5.
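This demonstration is easy to reproduce. The sketch below defines the Bukin N.6 function (global minimum f(−10, 1) = 0, with a narrow curved valley that is notoriously hard for optimisers) and draws a beta-distributed initial population over the function's usual domain. The beta parameters (α = β = 2) and population size are illustrative assumptions; the experiment above does not specify the exact settings used with the nAOA.

```python
import numpy as np

def bukin_n6(x, y):
    """Bukin N.6 test function; global minimum f(-10, 1) = 0."""
    return 100.0 * np.sqrt(np.abs(y - 0.01 * x**2)) + 0.01 * np.abs(x + 10.0)

# Beta-distributed initial population, scaled to the usual domain
# x in [-15, -5], y in [-3, 3]; alpha = beta = 2.0 is an illustrative choice.
rng = np.random.default_rng(seed=1)
n = 50
u = rng.beta(2.0, 2.0, size=(n, 2))            # samples in (0, 1)
lb, ub = np.array([-15.0, -3.0]), np.array([-5.0, 3.0])
pop = lb + u * (ub - lb)                       # scale into the search space
fitness = bukin_n6(pop[:, 0], pop[:, 1])
print("best initial solution:", pop[fitness.argmin()], fitness.min())
```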
Although initialisation plays a significant role in the performance of most metaheuristic optimisers, few studies or surveys have been conducted in this area. A search using the keywords survey OR review, initialisation (initialization), and metaheuristics yielded no comprehensive review or survey articles in the literature. However, in discussing PSO variants, ref. [
11] provides a paragraph on attempts to improve PSO performance using different initialisation schemes. The authors discuss how low-discrepancy sequences and variants of opposition-based learning enhance the initial swarm population. Another attempt, using GA, was presented by [
12], where the effect of three initialisation functions, namely nearest neighbour (NN), insertion (In), and Solomon’s heuristic, was studied. Li, Liu, and Yang [
13] evaluated the effect of 22 different probability distribution initialisation methods on the convergence and accuracy of five optimisation algorithms. In this regard, we formulate the research question given below to accomplish our work:
What literature has modified the initialisation control parameters, comprising the size and diversity of the population and the maximum number of iterations, to improve algorithm performance?
The following questions are formulated to answer the main research question:
What research exists that used distributions other than the random number generator to initialise the population and improve the performance of metaheuristic algorithms?
What studies exist that fine-tuned the population size and the number of iterations of different algorithms?
What are the major initialisation distributions used by population-based algorithms?
What problems were solved by the modified algorithms?
What are other challenges yet to be explored by researchers in the research area?
To the best of our knowledge, no survey or review article in the literature focuses on general efforts to improve the performance of different metaheuristic optimisers using different initialisation schemes, which motivates the current research contribution. Therefore, this study presents a comprehensive survey of the initialisation methods employed by metaheuristic algorithm designers and optimisation enthusiasts to improve the performance of the metaheuristic optimisers available in the literature. The study covers articles published between 2000 and 2021, and the specific contributions of this paper are summarised as follows:
We present a comprehensive review of the different distributions used to improve the diversity of the initial population of population-based metaheuristic algorithms.
We categorise the schemes into random numbers, quasirandom sequences, chaos theory, probability distributions, hybrids of other heuristic or metaheuristic algorithms, Lévy, and others.
We also discuss the different levels of success of these schemes and identify their limitations.
We present an in-depth glossary of the efforts to improve the performance of metaheuristic algorithms using various initialisation schemes, which metaheuristic research enthusiasts can easily reference.
Finally, we provide the research gaps, useful insights, and future directions.
The rest of the paper is organised as follows. In
Section 2, we provide the methodology used for collecting papers. The major initialisation methods used to improve the performance of the algorithms are presented in
Section 3. In
Section 4, we discuss the various application areas of the present study. Results and discussion of findings from our experiment are presented in
Section 5. Finally,
Section 6 presents the concluding remarks.
3. Major Initialisation Methods
This section discusses current efforts to improve the initial population of metaheuristic algorithms. It answers our research question: what research exists that used distributions other than the random number generator to initialise the population and improve the performance of metaheuristic algorithms? The different initialisation schemes identified in the literature were categorised into pseudo-random number (Monte Carlo) methods, quasirandom methods, probability distributions, hybrids, chaos theory, Lévy flights, ad hoc knowledge of the domain, and others. This categorisation structures our discussion of the identified schemes.
3.1. Pseudo-Random Number or Monte Carlo Methods
By default, random number generation (the Monte Carlo method) is the most widely used initialisation scheme for metaheuristic algorithms. It uses the uniform probability distribution to generate pseudo-random number sequences that serve as location vectors for the population. Many population-based metaheuristic algorithms use this scheme, and interested readers can refer to the respective optimisers for details. The role of random number generation as an essential part of the initialisation process has been greatly emphasised [
16,
17]. Despite its popularity, the random number sequence suffers because its discrepancy is not low, and it does not cover the search space efficiently [
8]. The discrepancy of the random numbers greatly influences how genuinely random the generated solutions are within the search space [
18]. Research works, such as those by [
19,
20], have shown that random numbers do not achieve an optimal discrepancy that aids the convergence of the algorithms.
Figure 8 shows how the random numbers tend to form clusters after several iterations, instead of filling up the search space. This is a significant disadvantage of using random number generators to initialise the population of the metaheuristic algorithms.
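For reference, this default scheme amounts to only a few lines of code. The sketch below (NumPy, with illustrative bounds and sizes) generates i.i.d. uniform position vectors, which is what most of the optimisers cited in this survey do by default.

```python
import numpy as np

def random_init(pop_size: int, dim: int, lb, ub, seed=None):
    """Default Monte Carlo initialisation: i.i.d. uniform samples in
    [lb, ub]^dim used as the initial position vectors of the population."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    return lb + rng.uniform(size=(pop_size, dim)) * (ub - lb)

pop = random_init(pop_size=30, dim=10, lb=-100, ub=100, seed=0)
```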
We did not include a table for this category because most existing metaheuristic algorithms belong here; such a table would be enormous, and there is no application area to which this scheme has not been applied.
3.2. Quasirandom Methods
Quasirandom number generators are known to generate sequences that are proven to have low discrepancy [
9]. Low-discrepancy sequences such as Van der Corput, Sobol, Faure, and Halton are potent computational tools that have been used to improve the performance of optimisation algorithms. Quasirandom numbers are effective initialisation mechanisms that allow metaheuristic algorithms to cover the search space uniformly and thereby reach the optimal solution. The particle swarm population in the work of [
21] was initialised using the randomised low-discrepancy sequences of Halton, Sobol, and Faure. The three modified PSOs were applied to benchmark test functions, and the results were compared with the global best PSO. PSO was significantly improved with Sobol, while Faure and Halton showed varying degrees of improvement. Similarly, the Van der Corput and Sobol sequences were used to initialise PSO, which was then applied to solve benchmark functions [
8]. The results obtained were promising compared to the original PSO.
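These sequences are readily available in modern libraries. The hedged sketch below uses SciPy's stats.qmc module (available since SciPy 1.7) to draw pseudo-random, Sobol, and Halton samples, compare their centred discrepancies, and scale a low-discrepancy sample into an optimiser's search bounds; the sample size, dimension, and bounds are illustrative.

```python
import numpy as np
from scipy.stats import qmc

n, dim = 128, 2
rng = np.random.default_rng(seed=0)

pseudo = rng.uniform(size=(n, dim))                  # plain pseudo-random
sobol = qmc.Sobol(d=dim, seed=0).random_base2(m=7)   # 2^7 = 128 Sobol points
halton = qmc.Halton(d=dim, seed=0).random(n)         # 128 Halton points

# Lower centred discrepancy (SciPy's default method) = more even coverage.
for name, pts in [("pseudo", pseudo), ("sobol", sobol), ("halton", halton)]:
    print(name, qmc.discrepancy(pts))

# Scale a low-discrepancy sample into the optimiser's search bounds.
swarm = qmc.scale(sobol, l_bounds=[-100, -100], u_bounds=[100, 100])
```

The pseudo-random sample typically reports a noticeably higher discrepancy than the Sobol and Halton samples, which is the property the studies in this section exploit.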
The krill population in the KH algorithm was initialised using the Faure, Sobol, and Van der Corput sequences [
22]. Benchmark test functions were used to test the efficacy of the modified KH, and the findings revealed significant improvements in the performance of the KH algorithm when initialised using the Faure, Sobol, and Van der Corput low-discrepancy sequences. A similar result was obtained with the guaranteed convergence particle swarm optimization (GCPSO) algorithm, which used Niching methods to initialise the swarm population [
23]. The Niching methods are based on the Faure low-discrepancy sequence, and benchmark test functions were used to evaluate the performance of GCPSO, with promising results.
Initialisation schemes implemented using low-discrepancy sequences are, however, known to perform poorly as the problem dimension or graph size scales up.
Figure 9 shows how the Halton sequence spreads and fills the search space by the 1000th iteration, improving the convergence of algorithms. For example, the authors of [
24] used the Halton sequence to initialise the search agents of the Wingsuit Flying Search (WFS) algorithm.
Table 2 summarises the glossary of efforts that used low-discrepancy sequences (quasirandom numbers) to initialise the population of metaheuristic optimisers; interested readers can refer to the cited references for further details. In all the papers reviewed in this section, the authors claimed that fine-tuning the initialisation control parameters (population size, population diversity, and the maximum number of iterations or function evaluations) improved the performance of the algorithm.
3.3. Probability Distributions
A probability distribution describes the possible values of a random variable and their likelihoods within a defined interval. Different probability distributions, with their rigorous statistical properties, can be used to initialise the population of metaheuristic algorithms. Li, Liu, and Yang [
13] used variants of the Beta, uniform, normal, logarithmic normal, exponential, Rayleigh, and Weibull distributions, together with Latin hypercube sampling [
31], to form 22 different initialisation schemes in order to evaluate PSO, CS, DE, ABC, and GA. The variants of the probability distributions are as follows:
The Beta distribution is a continuous probability distribution over the interval (0, 1). It can be written as $X \sim \mathrm{Beta}(\alpha, \beta)$. Varying the values of $\alpha$ and $\beta$ results in variants of the Beta distribution, generating sequences with different behaviours in the search space. Three variants of the Beta distribution were used.
A uniform distribution is defined over the interval $[a, b]$ and is usually written as $X \sim U(a, b)$. One variant of the uniform distribution was used.
The Gaussian normal distribution is usually written as $X \sim \mathcal{N}(\mu, \sigma^2)$. Varying the values of $\mu$ and $\sigma$ resulted in three variants of the normal distribution, which generate sequences with different behaviours in the search space.
The logarithmic normal distribution is often written as $X \sim \ln\mathcal{N}(\mu, \sigma^2)$. Four variants of the logarithmic normal distribution were created by varying the values of $\mu$ and $\sigma$.
An exponential distribution is asymmetric with a long tail and can be written as $X \sim \mathrm{Exp}(\lambda)$. Varying $\lambda$ resulted in three variants of the distribution, which were used to initialise the population of the five algorithms.
The Rayleigh distribution can be written as $X \sim \mathrm{Rayleigh}(\sigma)$. Three variants of the distribution were created by varying the value of $\sigma$.
The Weibull distribution can be considered a generalisation of several other distributions. It can be written as $X \sim \mathrm{Weibull}(\lambda, k)$. For example, $k = 1$ corresponds to an exponential distribution, while $k = 2$ leads to the Rayleigh distribution. In the same vein, three variants of the distribution were created.
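As an illustration of how such distributions can seed a population, the sketch below draws raw samples from each family, normalises them into [0, 1], and scales them to the search bounds. The parameter values are illustrative assumptions; they are not the 22 specific variants used in [13].

```python
import numpy as np

def init_population(dist: str, pop_size: int, dim: int, lb, ub, seed=None):
    """Draw raw samples from a chosen distribution, then min-max
    normalise them into [0, 1] and scale to the search bounds."""
    rng = np.random.default_rng(seed)
    shape = (pop_size, dim)
    if dist == "beta":
        u = rng.beta(2.0, 5.0, shape)          # already in (0, 1)
    elif dist == "uniform":
        u = rng.uniform(size=shape)
    elif dist == "normal":
        u = rng.normal(0.5, 0.15, shape)
    elif dist == "lognormal":
        u = rng.lognormal(0.0, 0.5, shape)
    elif dist == "exponential":
        u = rng.exponential(1.0, shape)
    elif dist == "rayleigh":
        u = rng.rayleigh(1.0, shape)
    elif dist == "weibull":
        u = rng.weibull(1.5, shape)
    else:
        raise ValueError(f"unknown distribution: {dist}")
    # Min-max normalisation keeps unbounded draws inside the bounds.
    u = (u - u.min()) / (u.max() - u.min())
    lb, ub = np.asarray(lb, float), np.asarray(ub, float)
    return lb + u * (ub - lb)

pop = init_population("rayleigh", pop_size=40, dim=10, lb=-5.12, ub=5.12, seed=7)
```

Min-max scaling is only one way to map unbounded draws into the bounds; clipping or truncated sampling are alternatives with different boundary behaviour.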
The convergence and accuracy of the five metaheuristic optimisers were evaluated on classical benchmark test functions and the CEC2020 test functions, with the optimisers initialised using the 22 different initialisation schemes [
13]. The findings showed that PSO and CS are more sensitive to the initialisation scheme used, whereas DE was less sensitive. In addition, PSO relies on a larger population size, whereas CS requires a smaller one, and DE does well with an increased number of iterations. The Beta, Rayleigh, and exponential distributions performed strongly, with the results showing that they greatly influence the convergence of the optimisers.
Georgioudakis, Lagaros, and Papadrakakis [
31] incorporated Latin hypercube sampling (LHS) to initialise four optimisers, namely evolution strategies (ES), covariance matrix adaptation (CMA), elitist covariance matrix adaptation (ECMA), and differential evolution (DE). They used these optimisers to investigate the relationship between the geometry of structural components and their service life, aiming to improve the service life of structural components under fatigue. Their choice of LHS instead of random Monte Carlo simulation optimised the number of samples needed to compute the statistical quantities in the problem formulation.
The stochastic fractal search (SFS) technique was used in the work of [
32] to improve the performance of the multi-layer perceptron (MLP) neural network by obtaining the optimal set of weights and threshold parameters. The hybrid approach was tested on the IEEE 14- and 118-bus systems, and the results were compared with the non-optimised MLP and with MLPs optimised by the genetic algorithm (MLP-GA) and particle swarm optimisation (MLP-PSO). Precision increased by 20–50%, and computational time decreased by 30–50%. However, SFS tends to neglect local search; a correct balance between global and local search is desired. Similarly, the Lévy flight was replaced by stochastic random sampling from simpler fat-tailed distributions, enhanced with scaled chaotic sequences, to boost cuckoo search (CS) performance in solving the complex wellbore trajectory problem [
33].
Probability distributions generally suffer from issues such as equiprobable disjunct intervals and errors in the correlations between variables. We summarise the efforts in this category in
Table 3.
3.4. Hybrid with Other Metaheuristic Algorithms
In this approach, researchers use another metaheuristic algorithm to find an optimal set of initial positions for the population. Metaheuristic algorithms with a high convergence rate in a specific problem domain are often used to find an initial solution, which is then fed into the other metaheuristic algorithm as its initial condition. A hybridisation of ABC and TS was proposed in the work of [
39], where the bee population was initialised using randomised breadth-first search (BFS). The performance of their hybrid was better than the algorithms it was compared with; however, it suffers from the time complexity of BFS. The authors of [
40] initialised the monarch butterfly algorithm by equally partitioning the search space and using the F and t random distributions to mutate the divided population. The results showed significant improvements. The krill in the work of [
41] were initialised using pairwise linear optimisation, which uses fuzzy rules to create clusters that serve as the initial points for KH. However, the results showed that this improvement only suits systems based on fuzzy approximators. The CRO was improved using the VNS algorithm with a new processor selection model for initialisation; the results are promising, but parameter sensitivity still needs to be resolved [
42].
The cuckoo population was initialised using quasi-opposition-based learning (QOBL) [
43], in which reaching the optimum is accelerated by considering both a guess and its quasi-opposite guess (see the sketch after this paragraph). The initialisation scheme of BA was improved using a low-discrepancy quasirandom sequence called Torus [
25]. Their results were good; however, they were not evaluated on higher-dimensional problems. Four different dispatching-rule (DR)-based initialisation strategies were used by [
44], with varying advantages and disadvantages. The best result was obtained when all of the strategies were used together, which suggests that the diversity of the initial population contributed to the algorithm’s overall performance. In [
45], a scheme inspired by SAM was developed: a simplified heuristic model that begins the swarm search with an initial set of high-quality solutions.
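The quasi-opposition idea of [43] translates into a compact initialisation routine. The following sketch is our minimal reading of the scheme, not the authors' exact implementation: generate a random population, form quasi-opposite points lying uniformly between the interval centre and each full opposite point, and keep the fitter half (minimisation assumed; the sphere function is only a placeholder objective).

```python
import numpy as np

def qobl_init(objective, pop_size, dim, lb, ub, seed=None):
    """Quasi-opposition-based initialisation: pair every random individual
    with a quasi-opposite point and keep the fitter of the two halves."""
    rng = np.random.default_rng(seed)
    lb = np.full(dim, lb, dtype=float)
    ub = np.full(dim, ub, dtype=float)
    pop = lb + rng.uniform(size=(pop_size, dim)) * (ub - lb)
    opposite = lb + ub - pop                      # classic opposite point
    centre = (lb + ub) / 2.0
    # Quasi-opposite: uniform between the interval centre and the opposite.
    quasi = centre + rng.uniform(size=pop.shape) * (opposite - centre)
    both = np.vstack([pop, quasi])
    fitness = np.apply_along_axis(objective, 1, both)
    return both[np.argsort(fitness)[:pop_size]]   # best pop_size survive

sphere = lambda x: float(np.sum(x ** 2))          # placeholder objective
initial_pop = qobl_init(sphere, pop_size=30, dim=10, lb=-100, ub=100, seed=3)
```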
ABC was used to find the optimal cluster centres of the FCM [
46]. An improved ABC was also proposed to solve the vehicle routing problem (VRP) [
47]. Among other improvements, the bees were initialised using push-forward insertion. An improved DE, named the enhanced differential evolution algorithm (EDE), used opposition-based learning for initialisation, along with other improvements, to enhance the performance of DE [
48]. The optimised stream clustering algorithm (OpStream) used the optimal solution of a metaheuristic algorithm to initialise the first set of clusters [
49]. The optimal solution of the optimal shortening of covering arrays (OSCAR) problem was used as the initialisation function of a metaheuristic algorithm [
50].
Mandal, Chatterjee, and Maitra [
51] used PSO to solve a problem that hampers the Chan and Vese algorithm for image segmentation, namely poor performance when the contours are not well initialised; the contours are initialised simultaneously with the population. Their hybrid solution made contour initialisation irrelevant to the performance of the algorithm. Another effort was presented by [
52], where a scheme to initialise the fuzzy c-means (FCM) clustering algorithm using PSO was proposed, with finding the optimal cluster centres set as the objective function of the PSO.
A memetic algorithm that uses the greedy randomized adaptive search procedure (GRASP) metaheuristic and path relinking to initialise and mutate the population was proposed [
53]. However, the scalability of the MA was untested. The authors of [
54] proposed an initialisation scheme that uses both the Metropolis-Hastings (MH) algorithm and a function domain contraction technique (FDCT). MH is helpful when direct sampling from a probability distribution is difficult; however, it is best suited to complex, high-dimensional optimisations, and its suitability is problem-dependent. In other situations, the FDCT is employed. The FDCT is a sequential three-step procedure that starts with a random solution generator; if this is not feasible, the GBEST PSO generator is applied, and if both fail, the search space reduction technique (SSRT) is applied. These steps ensure that the initialised population leads to a better solution.
The competitive swarm optimizer (CSO) is a variant of PSO used by [
55] to improve extreme learning machine (ELM) networks by relying on individual competition among the particles, which optimises the network’s weights and structure. Although the results show great promise, generating effective models took more training time. Sawant, Prabukumar, and Samiappan [
56] proposed an approach that initialises the cuckoo nests based on the correlation between the spectral bands, the goal being to ensure convergence by making sure nest locations do not repeat. The k-means clustering algorithm is used to select specific clusters on the bands based on their correlation coefficients. Another approach addresses the lack of diversity of PSO and its sensitivity to initialisation, which quickly lead to premature convergence: the crown jewel defence (CJD) is used to escape being stuck in local optima by relocating and reinitialising the global and local best positions. However, the performance of this improvement was not tested in higher dimensions [
57].
The DE and local search were combined to enhance the chances of finding an optimal solution to the hybrid flow-shop scheduling problem [
58]. The brainstorm optimisation (BSO) was improved in the work of [
59] by implementing a mechanism that triggers reinitialisation based on the current population. In the work of [
60], the authors used FA to detect the maxima and the number of image clusters through histogram-based segmentation; the maxima are then used to initialise the parameter estimates of the Gaussian mixture model (GMM). In the work of [
61], the authors proposed a scheme that enhances the initial conditions of an algorithm by treating them as a sub-optimisation problem in which the initial conditions are the parameters optimised by the MLA. Their results showed improvements over the other algorithms used. The FA was also used in the work of [
62] as an optimiser to obtain the initial locations of the translation parameters for WNNs. This reduced the number of hidden nodes of the WNN and simultaneously increased its generalisation capability significantly.
However, time and computational complexity may be a problem for this approach. In addition, there is no proven way to hybridise these algorithms, so success depends greatly on the experience of the researcher. A summary of research efforts in this category is given in
Table 4.
3.5. Chaos Theory
Chaos theory describes the unpredictability of systems, and over the years, many advances have been made in this area. Chaotic sequences exhibit the following properties: sensitivity to initial conditions, ergodicity, and randomicity. Such sequences have the advantage of introducing chaos, or unpredictability, into the optimisation, increasing the range of chaotic motion and using the chaos-induced variables to search the space effectively [
69].
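A typical chaotic initialiser is easy to state. The sketch below uses the logistic map x_{n+1} = μx_n(1 − x_n) with μ = 4, the most common choice in the papers cited in this section; the seed x_0 = 0.7 and the bounds are illustrative assumptions.

```python
import numpy as np

def logistic_chaotic_init(pop_size, dim, lb, ub, mu=4.0, x0=0.7):
    """Chaotic initialisation via the logistic map x_{n+1} = mu*x_n*(1 - x_n).
    With mu = 4 the map is fully chaotic on (0, 1); avoid seeds such as
    0, 0.25, 0.5, 0.75, or 1, which hit fixed or degenerate points."""
    x = x0
    seq = np.empty(pop_size * dim)
    for i in range(seq.size):
        x = mu * x * (1.0 - x)   # iterate the map; values stay in (0, 1)
        seq[i] = x
    u = seq.reshape(pop_size, dim)
    return lb + u * (ub - lb)    # scale into the search bounds

pop = logistic_chaotic_init(pop_size=30, dim=10, lb=-100, ub=100)
```

Unlike pseudo-random draws, the sequence is fully deterministic given x_0, which is precisely the sensitivity to initial conditions that these schemes exploit.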
Using the logistic chaotic function, ref. [
70] proposed novel improvements to CS, one of which is the use of the logistic chaotic function to initialise the population. While their results are promising, they suffer from high computational complexity. The same scheme was used in the work of [
71] to improve BA, where the bat population was initialised using chaotic sequences instead of the random number generator. In addition, the bacterial population of BFO was initialised using chaotic sequences generated by logistic mapping [
72]. Similarly, the butterflies in the work of [
73] were initialised using a homogeneous chaotic sequence that was adapted to the ultraviolet changes. Among other improvements proposed in the work of [
74], the chaotic initialisation strategy was used to initialise the whales in the multi-strategy ensemble whale optimization algorithm (MSWOA).
The chaos theory was used to initialise the moth-flame optimization (MFO) [
75], firefly algorithm (FA) [
76], artificial bee colony (ABC) [
77], biogeography based optimization (BBO) [
78]), krill herd (KH) [
79], water cycle algorithm (WCA) [
80], and grey wolf optimizer (GWO) [
81]. In all cases, the authors claimed superiority of their results over other algorithms; however, high computational complexity remains an issue for this category. We provide a summary of efforts in
Table 5.
3.6. Ad Hoc Knowledge of the Domain
In the ad hoc knowledge of the domain approach, authors use background knowledge of the problem domain to design the initialisation scheme of an algorithm; the nature of the problem influences the diversity and spread of the initial population. The approach proposed in the work of [
86] used domain knowledge to generate initial solutions that serve as the starting point for the metaheuristic method. Their results were better and, in some cases, competitive; however, we believe this method is excessively problem-dependent, so generalisation is impossible. In the same vein, ref. [
87] proposed an initialisation of the bats based on ad hoc knowledge of the PV domain. Specifically, they used the peaks with similar duty ratios that occur in the power versus duty ratio curve of the boost converter. Yao et al. [
88] used the objective function to minimise the wear and tear of the actuators when initialising the population.
The clans in EHO [
89] were initialised by considering the acoustic decay model used to obtain the distance between the sensor and the noise source. Depending on the noise level, the source coordinates lie at the intersection of the radii, which is unlikely to be a single point; the clans are initialised at the centre of this intersection. The technique suffers from being problem-dependent and requires much adaptation before it can be used in other domains. Finally, a scheme that helps PSO avoid reinitialisation when capturing the global peaks, as PSO changes its position and value on the P-V curve, was developed by [
90]. Particles are sent to areas of anticipated peaks; once a peak is located, particles are dispatched there to cover it.
Table 6 gives a summary of this approach.
3.7. Lévy Flights
A two-way approach to improving the initialisation scheme for the bees algorithm was also proposed [
93]. The patch environment and Lévy motion imitate the natural food environment and the foraging motion of the bees, respectively. Although the patch concept is used in the original bees algorithm for the neighbourhood search, its use for initialisation, together with the Lévy motion, greatly improved performance. In addition, the performance of the GWO algorithm [
94] was enhanced using Lévy flight (LF) and greedy selection, and an improved modified GWO algorithm was proposed to solve global and real-world optimisation problems. To boost the efficacy of GWO, both strategies were integrated into the modified hunting phases. However, no test was carried out on a specific optimisation domain; hence, no comparison was made. A glossary of efforts using this approach is given in
Table 7; the authors claimed superiority of their results over other algorithms.
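For completeness, Lévy-distributed step lengths are usually generated with Mantegna's algorithm. The sketch below implements that standard recipe and uses it to scatter an initial population around the search-space centre; the stability index β = 1.5 and the 0.01 step scale are illustrative assumptions, not parameters taken from [93] or [94].

```python
import numpy as np
from math import gamma, pi, sin

def levy_steps(size, beta=1.5, rng=None):
    """Mantegna's algorithm for Levy-stable step lengths of index beta:
    step = u / |v|^(1/beta), with u ~ N(0, sigma_u^2) and v ~ N(0, 1)."""
    rng = rng or np.random.default_rng()
    sigma_u = (gamma(1 + beta) * sin(pi * beta / 2) /
               (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma_u, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

# Scatter an initial population around the search-space centre with
# heavy-tailed Levy steps, clipped back into the bounds.
rng = np.random.default_rng(0)
lb, ub, pop_size, dim = -100.0, 100.0, 30, 10
centre = (lb + ub) / 2.0
steps = levy_steps((pop_size, dim), rng=rng)
pop = np.clip(centre + 0.01 * (ub - lb) * steps, lb, ub)
```

The heavy tail occasionally produces very long jumps, which is what lets Lévy-based initialisers mix dense local coverage with a few far-flung explorers.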
3.8. Others
Other approaches to improving the diversity, spread, and optimality of the initial population of metaheuristic algorithms exist in the literature. This category includes approaches that use mathematical and statistical functions to help the initial population cover the search space exhaustively.
A nonlinear simplex method was used to initialise the swarms [
102]. Their results showed that the particles gravitated better towards high-quality solutions. An approach in which one particle is placed at the centre and the rest are spread around it in the search space was considered by [
103]. Their result is promising; however, it is not entirely without bias. The use of complex-valued encoding in metaheuristic optimisation research is gaining attention from researchers. A comprehensive and extensive overview of this approach is presented in [
104].
Complex-valued encoding metaheuristic algorithms have been applied extensively in function optimisation, engineering design optimisation, and combinatorial optimisation. Regular metaheuristic algorithms are based on continuous or discrete encodings; the advantage of complex-valued encoding is that it expands the search region and efficiently avoids falling into local minima. Finally, eight metaheuristic algorithms were enhanced using complex-valued encoding and tested on 29 benchmark test functions and five engineering design optimisation problems. The superiority of complex-valued encoding was demonstrated by analysing and comparing the results for statistical significance, with the complex-valued encoding metaheuristic algorithms returning the best performances. We present a summary of the work in this category in
Table 8.
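To illustrate the encoding idea, the sketch below initialises a population in complex-valued form, holding each variable as a modulus-argument pair. The decoding step (modulus signed by the imaginary part, shifted to the interval centre) is only one variant seen in the literature and should be treated as an assumption rather than the canonical mapping of [104].

```python
import numpy as np

def complex_encoded_init(pop_size, dim, lb, ub, seed=None):
    """A sketch of complex-valued encoding: each variable is held as a
    complex number whose modulus and argument the optimiser can update.
    The decoding below is one plausible variant, not the canonical form."""
    rng = np.random.default_rng(seed)
    half = (ub - lb) / 2.0
    rho = rng.uniform(0.0, half, size=(pop_size, dim))            # modulus
    theta = rng.uniform(0.0, 2.0 * np.pi, size=(pop_size, dim))   # argument
    z = rho * np.exp(1j * theta)                                  # complex genes
    # Decode: modulus signed by the imaginary part, centred on the interval.
    x = np.abs(z) * np.sign(z.imag) + (lb + ub) / 2.0
    return z, np.clip(x, lb, ub)

genes, pop = complex_encoded_init(pop_size=30, dim=10, lb=-100.0, ub=100.0, seed=5)
```

Because the real and imaginary parts evolve as two coupled degrees of freedom per variable, the encoding effectively doubles the representational space, which is the intuition behind the expanded search region claimed above.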
6. Conclusions
Many works in the literature clearly outline the role of the initial population in the overall performance of metaheuristic algorithms. However, despite this role and the efforts of researchers in the area, to our knowledge, no comprehensive survey of articles on the subject exists. Therefore, the present study presents a comprehensive survey of different approaches to improving the performance of metaheuristic optimisers through their initialisation schemes. We also show the publication trends for research in this area and the number of citations. Finally, we provide a glossary of the efforts made to improve the performance of metaheuristic algorithms via their initialisation schemes, including the areas of application of these improvements, for easy reference by metaheuristic research enthusiasts.
The number of articles published to date in the repositories discussed earlier shows that the area focusing on the initialisation of the population of metaheuristic algorithms remains relatively uncharted. Many metaheuristic algorithms have been proposed, yet less effort has been devoted to their initialisation schemes. Most researchers opt for the commonly used random number generator, whose disadvantages have been studied extensively; its ease of implementation may have contributed to its prevalence. The hybridisation of metaheuristic algorithms, by contrast, has yielded great results in the literature, and authors have achieved considerable success using different initialisation schemes for the algorithms. We see a promising avenue whereby researchers can explore these high-performing initialisation schemes to assess their efficacy. The size of the population and the number of iterations can be varied along with these schemes, which can help increase the performance of the algorithms.
Our experiments demonstrate that, for the classical functions under consideration, BA is sensitive to the initialisation scheme, whereas GWO and BOA are not. The sensitivity of the algorithms is also problem-dependent, meaning that some functions were insensitive to the initialisation scheme. Population size and the number of iterations play a role in the performance of the algorithms: we found that BA performed better with larger population sizes, while GWO and BOA performed better with more iterations. This conclusion depends heavily on the problem dimension; however, we believe that good population diversity and a sufficient number of iterations will most likely lead to optimal solutions.
We also identified the need for initialisation methods that are best suited to specific problem domains, with statistical backing, to yield optimal solutions for those sets of problems. Unfortunately, most papers on metaheuristics perform very little statistical validation, and when they do, it is often on only a single problem that the researchers describe. Benchmarking metaheuristics with systematic and sound statistical techniques is usually lacking in many published works. In addition, a tuning/adaptive scheme could be developed that is capable of choosing an initialisation method from a suite of initialisation schemes to yield better solutions, depending on the nature of the problem encountered. This approach would also promote population diversity.