2.1. Principle of SVM
In the linearly separable case, the SVM is derived from the optimal classification surface [23]. Assume a training set $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^{d}$ is the input vector and $y_i \in \{-1, +1\}$ is the output label. For the two-class classification problem, the purpose of classification is to find a hyperplane that completely separates the two classes of samples. The hyperplane is obtained through a nonlinear mapping $\phi(x)$. It is vital not only to separate the samples correctly but also to maximize the classification margin. Finding the optimal separating hyperplane is thus translated into solving the following optimization problem:
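In the standard soft-margin form (the symbols here follow the usual SVM notation), this problem reads
$$
\min_{\omega,\,b,\,\xi}\ \frac{1}{2}\|\omega\|^{2} + C\sum_{i=1}^{n}\xi_i
\quad \text{s.t.}\quad y_i\big(\omega\cdot\phi(x_i) + b\big) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1,\dots,n.
$$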
where $\omega$ is the weight vector of the hyperplane, $b$ is the bias, $C$ is the penalty factor, which is one of the important parameters affecting the classification performance of the SVM, and $\xi_i$ is the slack variable. By introducing the Lagrangian function, the original optimization problem is transformed into its dual form, given by the following Equation (4):
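In its standard form, this dual problem (Equation (4)) is
$$
\max_{\alpha}\ \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j y_i y_j K(x_i, x_j)
\quad \text{s.t.}\quad \sum_{i=1}^{n}\alpha_i y_i = 0,\quad 0 \le \alpha_i \le C,\quad i = 1,\dots,n.
$$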
Here, $\alpha_i$ is the Lagrange multiplier and $K(x_i, x_j)$ is the kernel function. The kernel functions commonly used in SVM include the linear kernel, the polynomial kernel, the RBF kernel, and the sigmoid kernel. We use the widely applicable RBF kernel, whose superiority has been demonstrated in the literature [15]. The expression of this kernel function is:
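In its usual Gaussian form,
$$
K(x_i, x_j) = \exp\!\left(-\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}}\right).
$$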
Here, $\sigma$ is the kernel parameter that controls the range of action of the Gaussian kernel; it is another parameter that affects the SVM classification performance. Using the radial basis kernel function, the decision function is obtained as:
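In its standard form,
$$
f(x) = \operatorname{sgn}\!\left(\sum_{i=1}^{n}\alpha_i y_i K(x_i, x) + b\right).
$$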
2.2. Algorithm and Theory of IPSO
The PSO algorithm is similar to the genetic algorithm: it starts from a random solution, searches for the optimal solution through iteration, and evaluates the quality of each solution by its fitness. Its implementation is simpler, however, and it searches for the global optimum by following the current best value. This paper proposes an improved particle swarm optimization (IPSO) algorithm to optimize the hyperparameters of the SVM. The algorithm adjusts the particle update scheme of the standard PSO, which accelerates convergence in the later stages of the swarm's evolution and helps the search avoid falling into local optima.
IPSO is used to optimize the hyperparameters of the SVM. Based on the particle swarm optimization (PSO) algorithm given in Equations (7) and (8), a new dynamic inertia weight and an optimized particle velocity and position update strategy are introduced to prevent the algorithm from falling into a local optimum and to improve the generalization performance of the SVM model.
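In their standard form, these velocity and position update rules are
$$
v_{id}^{k+1} = w\,v_{id}^{k} + c_1 r_1\big(p_{id} - x_{id}^{k}\big) + c_2 r_2\big(p_{gd} - x_{id}^{k}\big), \qquad (7)
$$
$$
x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}. \qquad (8)
$$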
Here, $i = 1, 2, \dots, m$ and $d = 1, 2, \dots, D$, where $D$ is the dimensionality of the solution vector space and $m$ is the number of particles in the population; $c_1$ and $c_2$ are two positive acceleration constants; $r_1$ and $r_2$ are two independent random numbers drawn from $[0, 1]$; $w$ is the inertia weight (the coefficient of the momentum term); $p_{id}$ denotes the best position experienced by particle $i$ itself; and $p_{gd}$ denotes the position of the best particle in the population.
The IPSO algorithm is constructed by improving the above general particle swarm algorithm [16] in the following two aspects.
1. The IPSO algorithm takes into account the effect of the other particles in the population on each particle's search during the iteration. Each particle's velocity is updated according to the following three factors: the particle's own historical best value $p_{id}$, the best value found within the particle's neighborhood, and the global best value of the population $p_{gd}$.
In each iteration, the distance between every particle and the other particles is determined. The distance between the current particle $i$ and any other particle $j$ is denoted $d_{ij}$, and the maximum of these distances is denoted $d_{\max}$. The ratio is calculated as $r_{ij} = d_{ij}/d_{\max}$. A threshold $R_0$ varies with the number of iterations, where $t$ denotes the current iteration and the maximum number of iterations is $T_{\max}$. When the ratio $r_{ij}$ is smaller than the threshold $R_0$ (for $j \neq i$), particle $j$ is considered to lie in the neighborhood of particle $i$. A quality learning factor $c_3$ and a random number $r_3$ are introduced, and the particle velocity is modified according to a new update equation (a sketch of this neighborhood mechanism is given below). If the ratio is not smaller than the threshold, the velocity of the particle is updated according to Equation (7).
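A minimal Python sketch of this neighborhood mechanism follows. The linear decay of the threshold $R_0$, the form of the extra $c_3 r_3$ term, and the use of the individual bests to pick the neighborhood best are illustrative assumptions based on the description above, not the authors' exact formulation.

import numpy as np

def ipso_velocity_update(X, V, p_best, p_best_fit, g_best, t, T_max,
                         w, c1, c2, c3, rng):
    # X, V       : (m, D) particle positions and velocities
    # p_best     : (m, D) individual best positions; p_best_fit: (m,) their fitness (higher is better)
    # g_best     : (D,) global best position
    m, D = X.shape
    # Distance d_ij between every pair of particles and the maximum distance d_max.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    ratio = d / (d.max() + 1e-12)
    # Threshold R0 that varies with the iteration count (assumed to shrink linearly).
    R0 = 1.0 - t / T_max
    V_new = np.empty_like(V)
    for i in range(m):
        r1, r2, r3 = rng.random(D), rng.random(D), rng.random(D)
        # Standard terms of Equation (7): inertia, cognitive, and social components.
        v = w * V[i] + c1 * r1 * (p_best[i] - X[i]) + c2 * r2 * (g_best - X[i])
        # Particles j whose distance ratio to i is below the threshold form the neighborhood of i.
        neighbors = [j for j in range(m) if j != i and ratio[i, j] < R0]
        if neighbors:
            # Best individual position inside the neighborhood of particle i.
            n_best = p_best[max(neighbors, key=lambda j: p_best_fit[j])]
            # Extra term driven by the quality learning factor c3 and the random number r3.
            v = v + c3 * r3 * (n_best - X[i])
        V_new[i] = v
    return V_new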
2. The standard PSO algorithm decreases the inertia weight $w$ linearly with the number of iterations, so that the search step gradually shrinks and the iterations converge toward the extreme point [7]. The drawback of this method is that the algorithm is likely to fall into a local optimum. To address this drawback, in the IPSO the inertia weight $w$ decreases dynamically following an S-shaped function: it is set to a large value at the beginning of the optimization process to facilitate the global search, and becomes smaller at the end of the search process to facilitate local convergence. The inertia weight of the IPSO algorithm is expressed as follows:
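One common sigmoid-shaped decay with this behavior is, for example,
$$
w(t) = w_{\min} + \frac{w_{\max} - w_{\min}}{1 + \exp\!\big(a\,(t/T_{\max} - 1/2)\big)},
$$
where $w_{\max}$ and $w_{\min}$ are the initial and final weights, $t$ is the current iteration, $T_{\max}$ is the maximum number of iterations, and $a > 0$ controls the steepness of the transition; this specific expression is one illustrative possibility.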
The procedure of the IPSO algorithm is shown in
Figure 2.
Step 1: Set the important IPSO parameters such as learning factor, the maximum number of iterations, population size, etc.
Step 2: Initialize each particle's individual best position $p_{id}$, the corresponding individual best fitness value, the global best position $p_{gd}$, and the corresponding global best fitness value.
Step 3: Evaluate the fitness value of every particle.
Step 4: Compare each particle's fitness value with its individual best and with the global best, and update $p_{id}$ and $p_{gd}$ accordingly.
Step 5: Update the particles' positions and keep them within their limits:
If $x_{id} > x_{\max}$, then $x_{id} = x_{\max}$;
If $x_{id} < x_{\min}$, then $x_{id} = x_{\min}$;
Otherwise, $x_{id}$ does not change,
where $x_{\max}$ and $x_{\min}$ are the maximum and minimum positions.
Step 6: Terminate the iteration if the maximum number of iterations is reached or the cutoff accuracy is satisfied; otherwise, return to Step 2.
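To make Steps 1–6 concrete, the following Python sketch tunes the SVM penalty factor $C$ and the RBF kernel parameter with a simplified IPSO loop, using scikit-learn's SVC and cross-validated accuracy as the fitness. The search ranges, swarm size, sigmoid inertia weight, and the omission of the neighborhood term sketched earlier are simplifying assumptions made for illustration.

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_data, y_data = load_breast_cancer(return_X_y=True)

def fitness(pos):
    # Fitness: cross-validated accuracy of an RBF-kernel SVM for a given (C, gamma),
    # with the position interpreted in log-space (an assumption of this sketch).
    C, gamma = np.exp(pos)
    return cross_val_score(SVC(C=C, gamma=gamma, kernel="rbf"), X_data, y_data, cv=3).mean()

# Step 1: IPSO parameters (values are illustrative).
m, D, T_max = 20, 2, 30
c1 = c2 = 2.0
w_max, w_min = 0.9, 0.4
x_min, x_max = np.log([1e-2, 1e-4]), np.log([1e3, 1e1])   # bounds for [C, gamma]

# Step 2: initialize positions, velocities, individual bests, and the global best.
X = rng.uniform(x_min, x_max, size=(m, D))
V = np.zeros((m, D))
p_best, p_fit = X.copy(), np.array([fitness(x) for x in X])
g_best, g_fit = p_best[p_fit.argmax()].copy(), p_fit.max()

for t in range(T_max):
    # Sigmoid-shaped inertia weight: large early (global search), small late (local convergence).
    w = w_min + (w_max - w_min) / (1.0 + np.exp(10.0 * (t / T_max - 0.5)))
    for i in range(m):
        r1, r2 = rng.random(D), rng.random(D)
        # Step 5: update velocity and position, keeping the position within its limits.
        V[i] = w * V[i] + c1 * r1 * (p_best[i] - X[i]) + c2 * r2 * (g_best - X[i])
        X[i] = np.clip(X[i] + V[i], x_min, x_max)
        # Step 3: evaluate the fitness of the particle.
        f = fitness(X[i])
        # Step 4: compare with the individual and global bests and update them.
        if f > p_fit[i]:
            p_best[i], p_fit[i] = X[i].copy(), f
        if f > g_fit:
            g_best, g_fit = X[i].copy(), f

print("best (C, gamma):", np.exp(g_best), "cv accuracy:", g_fit)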