1. Introduction
Heuristic optimization algorithms have been studied and discovered over the past three decades and have been fully proven to solve a variety of complex, nonlinear optimization problems. These methods are both user-friendly and do not necessitate mathematical analysis of the optimization problem. Compared with the traditional methods, they have the advantages of flexibility, no gradient mechanism and avoiding being trapped in the local optimum [
1]. These features drew a large number of researchers to participate in the design. Most heuristic algorithms are inspired by the author’s observation of animal and plant phenomena in nature. To solve optimization problems, algorithms are used to simulate growth and evolution. In real scenarios, heuristics are also widely used, such as path planning [
2], wireless sensor network localization problem [
3], wireless sensor network routing problem [
4], airport gate assignment [
5] and cloud computing workflow scheduling [
6].
Heuristic algorithms can be divided into four categories [
7]. The first category is algorithms based on swarm intelligence techniques. Much of the inspiration for such algorithms comes from observations of social animals. In a population, each individual exhibits a certain degree of independence while still interacting with the entire group. The main representative algorithms are particle swarm optimization (PSO) [
8], the phasmatodea population evolution algorithm (PPE) [
9], the Gannet optimization algorithm (GOA) [
10], the grey wolf optimizer (GWO) [
11], cat swarm optimization (CSO) [
12], etc. The second category is algorithms based on evolutionary techniques. Such algorithms are inspired by developments in biology. The initial random population is gradually iterated to achieve the final optimization purpose through crossover, mutation, selection and other operations. The main representative algorithms are the genetic algorithm (GA) [
13], the differential evolution algorithm (DE) [
14], the quantum evolutionary algorithm (QEA) [
15], etc. The third category is the algorithm based on physical phenomena. This kind of algorithm simulates the law of some natural phenomena. The main representative algorithms are the Archimedes optimization algorithm (AOA) [
16], the simulated annealing algorithm (SAA) [
17], the sine cosine algorithm (SCA) [
18], etc. The fourth category of algorithms is those based on human-related technology. As an independent intelligent and rational individual, each person has unique physical and psychological behavior. The main representative algorithms are the Teaching–Learning-Based Optimization Algorithm (TLBO) [
19], the Gaining–Sharing Knowledge-based algorithm (GSK) [
7], etc.
The original GSK simulates the behavior of acquiring and sharing knowledge throughout a person’s life, culminating in the maturation of the individual [
20]. The author divides human life into two distinct phases: the junior phase, which corresponds to childhood, and the senior phase, which corresponds to adulthood. The strategies for knowledge acquisition and sharing are different in these two phases. At the beginning of the algorithm, individuals tend to use a relatively naive method to acquire and share knowledge. However, not all disciplines (on all dimensions of the solution) use this naive method. In some disciplines, individuals will also use relatively advanced methods for knowledge acquisition and sharing. With the growth of individuals, the algorithm enters the middle stage and the learning of knowledge is more inclined to use the advanced method, while a few disciplines still use the naive method. Individuals go through two stages, alternating between naive or advanced strategies to update their knowledge in each discipline. Individuals eventually reach maturity, which is when they find their optimal position. Ali Wagdy Mohamed demonstrated its powerful optimization capabilities on the CEC2017 test suite when he presented the GSK algorithm. Although the GSK algorithm demonstrates excellent convergence in solving the optimization problem, there is room for improvement in avoiding locally optimal solutions and convergence speed. To further improve the performance of the GSK algorithm, we propose several approaches to be incorporated into the GSK algorithm, which are described next in turn. Experiments have been conducted to demonstrate that these approaches are effective in improving the performance of the GSK algorithm.
Parallel processing is concerned with producing the same results using multiple processors with the goal of reducing the running time [
21]. Because physical parallel processing cannot be used in the optimization algorithm, we adopt an alternative approach. The main idea of the parallel mechanism is to divide the initial population into several different groups. Each group performs iterative updates independently and communicates regularly between groups. The parallel mechanism has been applied widely, including to the parallel particle swarm algorithm (Chu S C 2005) [
21] and parallel genetic algorithms [
22]. In addition, parallel strategy is also used in multi-objective optimization algorithms. Cai D proposed an evolutionary algorithm based on uniform and contraction for many-objective optimization [
23], which uses a parallel mechanism to enhance the local search ability.
The communication strategies between groups can be varied for optimizing different algorithms. This paper presents a communication strategy using the Taguchi method. The main idea is to use a pre-designed orthogonal table for crossover experiments. Compared with the traditional experimental method, it can achieve almost the same effect while obviously reducing the number of experiments. The Taguchi method has the advantages of reducing the number of experiments, reducing the cost of experiments and improving the efficiency of experiments [
24]. It has been successfully applied to improve the genetic algorithm (Jinn-Tsong Tsai 2004) [
25], the Archimedes optimization algorithm (Shi-Jie Jiang 2022) [
26], the cat swarm optimization algorithm (Tsai P W 2012) [
24], etc. In addition, opposition-based learning (OBL) was also incorporated into GSK. The concept of OBL was proposed by Tizhoosh (2005) [
27]. After that, some classical optimization algorithms started to introduce this idea. It has been successfully applied to improve grey wolf optimization (Souvik Dhargupta 2020) [
28] (Dhargupta S 2020) [
29], the differential evolution algorithm (Rahnamayan S 2008) [
30] (Wu Deng2021) [
31], particle swarm optimization (Wang H 2011) [
32], the grasshopper optimization algorithm (Ahmed A. Ewees 2018) [
33], etc.
In order to reach a convincing conclusion, the performance of any optimization or evolutionary algorithm can only be judged via extensive benchmark function tests [
34]. Some diverse and difficult test problems are required for this purpose and the CEC2017 test suite [
35] is a widely accepted test problem. In order to apply the optimization algorithm to the complex real-world optimization problem, it is necessary that the optimization algorithm can effectively solve the single objective optimization problem. CEC2017 test suite contains 30 single-objective real-parameter numerical optimization questions. Compared with CEC2013 and CEC2014, in CEC2017, several test problems with new characteristics are proposed, such as new basic problems, composing test problems by extracting features dimension-wise from several problems, graded level of linkages, rotated trap problems and so on. In this paper, the CEC2017 test suite is utilized to evaluate the proposed algorithm (POGSK), the original algorithm and several classical algorithms.
The IoV enables vehicles on the road to exchange information with roadside units (RSUs) [
36]. Therefore, users can expect quick, comprehensive and convenient services, such as road condition information, traffic jam section information, traffic condition information, city entertainment information, etc. However, traditional resource-allocation strategies may not be able to provide satisfactory Quality of Service (QoS) due to several factors. These include resource constraints, network transmission delays and the deployment of RSUs. Scheduling problems can be solved using various methods, which can be roughly grouped into three categories: exact, approximate and heuristic [
37]. The exact method is to calculate all the solutions of the whole search space to find the optimal solution, which is obviously only suitable for small-scale problems. The approximation method uses certain mathematical rules to find the optimal solution, which requires different analyses for different problems. However, in most cases, the mathematical analysis of the problem is difficult. Therefore, the heuristic approach is a decent option. The main objective of scheduling algorithms is to find the best resources in the cloud for the applications (tasks) of the end user. This improves the QoS parameters and resource utilization [
38]. In order to solve this optimization problem, this paper proposes using the heuristic algorithm POGSK to complete resource scheduling.
The main contributions of this paper are as follows:
1. An improved Gaining–Sharing Knowledge-based algorithm (POGSK) is proposed, which uses parallel strategy and OBL strategy. The use of parallel strategy increases the diversity of the population so that the algorithm can effectively avoid local optimal solutions. The OBL strategy can correct the convergence direction and improve the convergence accuracy.
2. A new inter-group communication strategy is designed. Specifically, the Taguchi communication strategy and the population-merger communication strategy were used. This enables efficient exchange of information between subpopulations and avoids the weakening of algorithm performance caused by the reduction of the number of individuals in subpopulations.
3. POGSK is used in the resource-scheduling problems of the IoV to improve QoS, which can reflect the performance of the algorithm in real scenarios. Simulation results show that POGSK is more competitive than other algorithms.
2. Related Works
2.1. GSK Algorithm
GSK is a human-based heuristic algorithm that gradually updates knowledge (corresponding to the solution of the algorithm) by simulating the process of knowledge sharing and acquisition in human life. The algorithm mainly consists of two phases: the junior phase and the senior phase, for knowledge gaining and sharing in phases have different processes [
7].
When individuals are young, they prefer to interact with individuals who are similar to themselves. Despite their immature ability to distinguish right from wrong, they are willing to communicate with unfamiliar people, which represents curiosity in the junior phase. People who are similar to themselves correspond to their relatives, friends and other small surrounding groups. People who are unfamiliar correspond to the stranger. The above scheme is the junior gaining–sharing scheme. In the junior phase, more dimensions are updated using this scheme than using the other (senior gaining–sharing) scheme.
After updating and iteration, individuals reach middle age gradually. As the capacity to distinguish between right and wrong gradually increases, individuals are more willing to divide the crowd into three different populations: the advantaged, the disadvantaged and the general population. Individuals improve themselves by interacting with these three groups. The above scheme is the senior gaining–sharing scheme. In the senior phase, more dimensions are updated using the senior scheme than using the junior scheme. In the following, we describe in detail the dimensions in which the two schemes are utilized and their respective processes.
Let
, i = 1, 2, 3, …, N; N is population size,
represents the individual members of the population.
= (
,
,
, …,
), D is the size of the problem and
represents the value of an individual in this dimension. Before the renewal of each generation, we need to determine which dimensions use the senior scheme and which dimensions use the junior scheme for each individual. Based on the concept of constant human growth, these dimensions are determined to use the following nonlinear increasing or decreasing empirical equation [
7].
where K is knowledge rate of a real number, K > 0, G is generation number and GEN is the maximum number of generations.
The steps for the junior scheme are as follows:
1. The fitness of all individuals is calculated and the sequence that follows is generated by ranking the individual from high to low fitness: (, , , …, , ). Each individual chooses three individuals to communicate with according to step 2.
2. For each individual
which is not the best and the worst, select
and
. For the best individual
, select
and
. For the worst individual
, select
,
. In addition, an individual
is selected in a random way. These three individuals are sources of information. The pseudo-code for the above junior scheme is presented in Algorithm 1:
Algorithm 1: Junior scheme pseudo-code |
|
Here, represents the knowledge factor ( > 0).
The steps for the senior scheme are as follows:
1. The fitness of all people is calculated and the sequence that follows is generated by ranking the individual from high to low fitness: (, , , …, , ).
2. The population is divided into three parts: the first 100p% with better fitness, the last 100p% with poor fitness and the middle NP − (2 * 100p%), where p is the proportion of a population division. For example, if p = 0.1, NP = 100, the best group is the top 10 peoples with better fitness, the worst group is the bottom 10 peoples with poor fitness and the middle group is the middle 80 peoples. From these three groups,
,
and
are selected as sources of information. The pseudo-code for the above senior scheme is presented in Algorithm 2:
Algorithm 2: senior scheme pseudo-code |
|
There are several important parameters in GSK, respectively knowledge rate K that controls the proportion of junior and senior schemes in the individual renewal scheme, knowledge factor
that controls the total amount of knowledge currently learned by the individual from others, knowledge ratio
that controls the ratio between the current and acquired experience. The pseudo-code is presented in Algorithm 3:
Algorithm 3: GSK pseudo-code |
|
2.2. Taguchi Method and Parallel Mechanism
Every engineer wants to design a satisfactory product with minimum cost and minimum time. However, many factors can impact the quality of the product and it will take a long time to test one by one. The Taguchi method was developed by Dr. Genichi Taguchi in Japan after World War II [
25]. When this method was used in practical production, it greatly promoted Japan’s economic recovery. The orthogonal matrix experiment is one of the important tools of the Taguchi method. Suppose a product has K influencing factors, each of which has Q levels. If the influence of each factor level on product quality is tested one by one,
experiments are required [
24]. This will consume a lot of time and production costs are difficult to control. The orthogonal matrix experiment uses the pre-designed orthogonal matrix to conduct a small number of experiments on each factor, which can achieve almost the same effect while greatly reducing the number of experiments. Assume that there are seven factors affecting the product, each of which has two levels, then we can use the
(
) shown in Equation (
3). Each row in the table represents an experiment, where the values represent the current level of factor adoption. It can be observed that in each column, the number of occurrences is the same for both levels, which guarantees the fairness of the experiment.
This paper mainly uses two levels orthogonal matrix. For the 10-dimensional experiment uses (). For the 30-dimensional experiment uses ().
The parallel mechanism, also known as multi-population strategy, is a popular algorithm-optimization method for increasing population diversity. The main idea is to divide the initial population into several subpopulations. The subpopulations were searched independently after initialization and communicated with other subpopulations according to certain conditions. For parallel strategies, the inter-group communication strategy is critical. Because the number of individuals in the subpopulation is reduced, the algorithm easily falls into the local optimum via independent search. Excellent intergroup communication strategies can make subpopulations gain the ability to escape the local optimum and effective communication strategies enable populations to exchange a small amount of information while improving their search ability. Chai proposed the tribal annexation communication strategy and herd mentality communication strategy to improve the searching ability of the whale optimization algorithm [
3]. Tsai used the Taguchi communication strategy to enhance the search capability of the cat swarm optimization algorithm [
24]. The above scheme demonstrates the diversity of communication strategies for different algorithms. In this paper, a suitable communication strategy is proposed according to the characteristics of the GSK algorithm. Specifically, the Taguchi communication strategy and the population-merger communication strategy were used.
2.3. Opposition-Based Learning
OBL was proposed by Tizhoosh (2005) [
27], which is fundamentally based on estimates and counter estimates. OBL can modify the convergence direction of the algorithm and improve the search accuracy of the algorithm. In order to get close to the optimal position quickly, we generally want the population to be near it. However, the initial populations are generated randomly and it may be far from the optimal position or even the exact opposite. As a result the algorithm converges so slowly that it does not converge to near the optimal solution under the specified conditions. The main idea of OBL is to find the opposite position of the random initial population, evaluate it and select the better position to replace the initial population. Furthermore, during the population updating process, the position of the current individual and its opposite are evaluated and the better individuals are left.
Definition 1 (Opposite Number [
27])
. Let x be a real number defined in the interval [a,b], then the Opposite Number of x is defined by the following equation: Definition 2 (Opposite Point [
27])
. Let P (, , …, ) be a point in an n-dimensional coordinate system with (, …, ) being real numbers, where each of is defined in the interval [, ]. Then, the Opposite Point (, , …, ) is defined by the following equation: In the actual search process, the search space usually changes dynamically, the Opposite Point
= ((
,
, …,
)) of
is defined by the following equation;
belongs to the population P = (
,
, …,
).
2.4. Resource-Scheduling Problem of the IoV
With intelligent transportation and smart city development, the IoV has received more and more attention. Based on the information interaction between the on-board unit and the roadside unit, a successful architecture of the IoV is formed. In this process, due to roadside unit deployment, own resource limitation and network delay, it may not guarantee satisfactory QoS for users. In this case, the most likely bottleneck is the proper scheduling of resources [
39]. Therefore, we need a scheduling algorithm to distribute the user workload into the RSUs. The algorithm must be based on resource capacity and solve the problem of over- and underutilization [
6]. The algorithm should take into account the available resources and work to improve QoS. Based on practical considerations, the performance of the algorithm can be evaluated by resource utilization, load balancing, maximum completion time, execution cost, power consumption, reliability and other indicators.
To improve the computing capabilities of mobile devices, edge computing transfers computation-intensive applications from resource-constrained smart mobile devices to nearby edge servers with computational capabilities [
40]. Bin Cao proposed a space–air–ground-integrated network (SAGIN)-IoV edge-cloud architecture based on software-defined networking (SDN) and network function virtualization (NFV) [
36] that takes into consideration that the actual user needs to establish an optimization model. Yao proposed a big data-based heterogeneous Internet of Vehicles engineering cloud system resource allocation optimization algorithm [
39]. Filip proposed a new model for scheduling microservices over heterogeneous cloud-edge environments [
41]. This model aims to improve the resource utilization of edge computing equipment and reduce cost. Farid M proposed a new multi-objective scheduling algorithm with fuzzy resource utilization (FR-MOS) for scheduling scientific workflow based on the PSO method [
6]. This algorithm’s primary objective is to minimize cost and makespan in consideration of reliability constraints, where the constraint coefficient is determined by cloud resource utilization.
Existing resource-scheduling algorithms generally account for a variety of usage scenarios. In our proposed case study, we recognize that it is practical to employ these existing methods. In this article, service delay, resource utilization, load balancing and security are considered simultaneously and an optimization model is constructed.
3. Proposed Algorithm and Its Application
3.1. Parallel Communication Strategy
Choosing a suitable communication strategy is critical in parallel strategy since it facilitates information exchange between two groups, thereby enhancing their search capabilities. In this paper, two primary communication strategies are used. The first primary communication strategy is controlled by the communication control factor R. If the random number generated in the communication is greater than R, all groups will be matched in pairs and the following Taguchi communication will be performed:
1. Select the optimal solution of the two groups and select the appropriate orthogonal table based on the number of levels and factors. For example, the experiment is two-level if it contains two candidates and is seven-factor if each candidate in the experiment contains seven influencing factors. Conduct the experiment according to the orthogonal table and calculate the fitness value for each new individual.
2. In each dimension, calculate the fitness sum of the two levels separately. For each dimension, select the level with better fitness as a candidate. Combine the candidates to produce an optimal individual.
When the random number is less than R, the optimal solutions within all groups are compared with the global optimal solution using the following steps:
1. If it is worse than the global optimal solution, the intra-group optimal solution is replaced by the global optimal solution.
2. If it is better than or equal to the global optimal solution, a random mutation operation is performed.
Every individual has the potential to excel in some dimensions. The Taguchi method can efficiently excavate these excellent dimensions and then combine them together. The communication process described above will be visualized through an example. The fitness function is assumed to be:
Suppose the search goal is to find the smallest fitness value. The two candidate individuals are shown in
Table 1. The Taguchi orthogonal experiment is a two-level seven-factor experiment that employs the
(
) orthogonal table shown in Equation (
3).
Table 2 depicts the specific operation process. The cumulative fitness value of the two candidate solutions in the table is calculated based on whether the solution is selected in the orthogonal table. In the first dimension of
Table 2, the candidate solution
was used in experiments 5 to 8, the cumulative fitness value for the first dimension of
is the sum of the fitness values from 5 to 8 experiments.
In addition, this paper employs the population-merger communication strategy as the second primary communication strategy. In a swarm intelligence algorithm with parallel strategies, multiple subpopulations search independently and communicate with each other at intervals. The parallel strategy increases the diversity of the algorithms, but reduces the number of individuals in each subpopulation and some algorithms require more individual data for search. This conflict weakens the performance of the algorithm. After testing, the search performance of the original GSK algorithm was significantly reduced when the algorithm was divided into several subpopulations. In order to solve the above problem, the population-merger communication strategy was adopted. Specifically, each subpopulation searched independently in the early stage of the algorithm. Once a specific condition is met, the two adjacent subpopulations combine into one population. The newly formed population incorporates all the information from both subpopulations. Finally, before the end of the algorithm, all the individuals are merged into one population, which contains all the information of the original subpopulation. In the first stage of this paper, the initial GSK population was divided into four groups. In the second stage, the four groups were merged into two groups. In the third stage, the two groups are merged into one group. The condition of merger refers to the number of fitness functions.
3.2. Incorporate OBL into GSK
There are two primary steps involved in adding OBL to GSK. Using OBL, optimize the initial population as the first step. In the second step, the opposite population is generated to correct the convergence direction.
For the original algorithm (GSK), the initial population is randomly generated within a defined range. The initial individuals thus generated may be too far away from the global optimal position. The utilization of the OBL strategy enables the generation of a population closer to the optimal position, thereby facilitating more effective algorithm optimization. The steps in detail are as follows:
1. Initialize the population X = {
,
, …,
} randomly according to the defined range, n denotes the number of individuals in the population. Generate the opposing population
= {
,
, …,
} with the following formula:
where
represents the upper bound of the current dimension and
represents the lower bound.
2. Select individuals with excellent fitness from {, } and combine them into NP, taking NP as the initial population.
In the process of population updating, by using similar methods to generate the opposite population and for evaluation, the current population can be guaranteed to be closer to the global optimal position. The probability of generating an opposing population can be controlled by adjusting the jump rate r. The steps in detail are as follows:
1. After each update of the population, a random number is generated to compare with the jump rate r. If the random number is less than r, the opposite population of the current population will be generated, the following formula shows the process:
where
represents the maximum value of the j dimension in the current population and
represents the minimum value.
2. Current and opposing populations are combined, and the fitness is tested separately. Then, the fittest individuals are selected from {, }.
In this paper, several different approaches are considered for integration with GSK and
Figure 1 illustrates the process. In POGSK, the initial population is divided into four subpopulations after OBL optimization and the four subpopulations independently conduct GSK and communicate with each generation according to the Taguchi method. After updating the current population, the OBL operation is performed. The pseudocode for POGSK is shown in Algorithm 4. Moreover, in order to demonstrate the POGSK process more visually,
Figure 2 shows its main processes.
Algorithm 4: POGSK pseudo-code |
|
3.3. Apply the POGSK to Solve the Resource-Scheduling Problem in IoV
In order to reasonably allocate the resources of RSUs in IoV and enhance the QoS, this paper proposes the following mathematical model. Assume that there are multiple vehicles on the road and a total of n tasks are submitted simultaneously and each task contains four attributes: (1) the size of the task; (2) deadlines for tasks; (3) type and quantity of resources required; (4) the transfer time of the submitted work. The n tasks are represented as follows [
36]:
Suppose that there are m processing nodes in the current scenario. Each processing unit contains two attributes: (1) type and quantity of resources owned by the processing unit and (2) the processing capacity of the processing unit. The m processing nodes are represented as follows [
36]:
Service delay
In order to provide users with faster services, service latency should be as short as possible. A processing node can handle multiple tasks simultaneously, with different processing capabilities for each node. Then, the processing time of a task on the processing node is [
36]
where
represents the size of the task
and
represents the processing capability of the processing node
. Then, the time required for a processing node to complete all the tasks assigned to it is
where C(i,j) is a binary value and indicates whether task
is assigned to node
, denoted as
The sum of the processing delays of all nodes is
Resource utilization
According to research, the energy consumption of the server in the idle state can account for more than 60% of the full load operation [
42], which leads to a large amount of energy wasted on the idle server. So we want the roadside unit to be as resource-efficient as possible. The service request in the IoV requires the support of four kinds of computer resources, namely CPU, memory, disk and bandwidth. We need to pay attention to all four sources. Then, the total resource utilization is
where CR(i,k) represents the number of
required by
, PR(j,k) represents the number of
owned by
and
is a binary value indicating whether
is turned on or not (k = 1,2,3,4 represents four resources).
Load balancing
The service request in the IoV has different demands on different resources; it is easy to cause the load of different types of resources to be unbalanced. A valuable load-balancing technique in cloud computing can enhance the accuracy and efficiency of cloud computing performance [
43]. So we want the processing unit to be as load-balanced as possible. Then, the utilization of
in
is
where C(i,j) is calculated using Equation (
16). The mean of resource utilization of
is
The variance of resource utilization of
is
The average resource utilization variance for all processing units is
where
is calculated by Equation (
19) and z represents the number of processing units opened.
Security
In the IoV, tasks must be completed on schedule to ensure safety, as service requests are made at high speeds. In real scenarios, the network latency and security of the task is important [
6,
44]. Task deadlines will be sent with task submissions and we want as many tasks as possible to be completed on time. Then, the actual time required to complete the task is
where
is calculated with Equation (
14) and TL(i,j) represents the transmission buffer time from
to
. Then, whether the task is completed on time is expressed as a binary value:
where
indicates the deadtime of
, which is uploaded when the task is submitted. Then, we express the degree of security as the successful execution rate of the task.
Considering the above four objectives, we propose the following fitness function:
For processing unit
, the number of various resources required by all the tasks running on it is not permitted to exceed the number of resources owned by the unit. The workflow is shown in
Figure 3. The constraint conditions are: