#### *3.1. Basic Idea*

When solving multimodal optimisation problems, the main aim is to find multiple optimal solutions (global optimum and local optima) associated with a single cost function. Theoretically, multiswarm PSO is suitable for optimisation in multimodal problems with multiple local optima, because it can achieve a good balance between exploration and exploitation behaviours. However, the performance of multiswarm PSO algorithms is dependent on the starting points selected in the search process. When solving multimodal problems, new starting points can be randomly selected or derived from known solutions. In general, starting points are randomly selected. However, many search spaces are globally convex; thus, the quality of the local optima increases as the distance from the global optimum decreases. In the global convex search space of the LSO algorithm, if the starting point is selected near the area of the optimum, the global optimum solution can be obtained using the gradient descent algorithm. To explore possible new solutions, the proposed LS-PSO algorithm uses two behaviours of biological locust swarms, namely, solitary operation and social operation.

Solitary operation. Similar to the behaviour of biological locust swarms, when neighbouring particles are far away from each other within the swarm, the AF ensures group cohesion. Conversely, when neighbouring particles are too close, they are expelled by the SRF into new subswarms, which prevents premature convergence. The SRF can accelerate the particles to separate in different directions at a fixed time interval (Figure 3).

**Figure 3.** Use of the SRF to prevent the LS-PSO algorithm from converging prematurely.

To improve the searching ability of the LSO algorithm, a scout particle is introduced in a swarm to suggest a search direction. In particular, the scout particle recommends the starting point of the re-searching process to find the best path to food sources at the ending period of the search. Thus, the re-searching process prevents most particle swarms from converging prematurely to a local suboptimal solution when the fitness function value has stabilised.

Social operation. To prevent most particle swarms from converging prematurely to local optima, the search space must be expanded using neighbouring particles to form a new particle swarm according to the social behaviour of locust swarms. In practice, the initial values of the particles of the new subswarms are set as close as possible to the global best solution, which decreases the search time for the regional optimal solution. Thus, the intelligent selection of a starting point effectively reduces the computational time while maintaining the accuracy of the recursive re-searching process. Therefore, this research focused on finding the best starting points for the scattering operation.

When all the particles converge quickly to a single attack path, the particle subswarms are forced to make dynamic changes in the neighbourhood structure, as illustrated in Figure 4. Thus, adjacent subswarms are reorganised by grouping partial particles into new subswarms for expanding the search space of each subswarm. When each subregion is reorganised and generated at each iteration, some of the particles of the subswarms are periodically randomly recombined, and the new subswarms search the adjacent regions again. *R* denotes the reorganisation period. In the aforementioned method, each subswarm can fully exchange information with the other subswarms. Compared with the traditional (static) neighbourhood structure, the new neighbourhood structure has greater freedom, which increases the diversity of the particle swarm searching.

**Figure 4.** Regroup strategy for multiswarm optimisation in the LS-PSO.

To prevent premature convergence to local optima, two modified approaches are proposed with updated rules for multi-objective searching: (1) multiswarm optimisation and (2) intelligent starting point selection. In multiswarm optimisation, which is inspired by the DMS-PSO method [11–13], the locust swarm periodically regroups the particles of the subswarms after they have converged into new subswarms. The new swarms are produced using particles from previous swarms using the regroup strategy (Figure 4).
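The periodic regrouping step can be illustrated with a minimal sketch. It assumes that all particles are pooled and redistributed at random every `period` iterations, as in DMS-PSO; the text above states only that some particles are recombined, so full pooling is a simplification. The function names and the list-of-lists representation are hypothetical.

```python
import random


def regroup(subswarms, rng):
    """Pool every particle, shuffle the pool, and deal it back out.

    The subswarm sizes are preserved; only the membership changes.
    """
    sizes = [len(s) for s in subswarms]
    pool = [particle for s in subswarms for particle in s]
    rng.shuffle(pool)
    regrouped, start = [], 0
    for size in sizes:
        regrouped.append(pool[start:start + size])
        start += size
    return regrouped


def maybe_regroup(subswarms, iteration, period, rng):
    """Regroup only when `iteration` is a positive multiple of `period`."""
    if iteration > 0 and iteration % period == 0:
        return regroup(subswarms, rng)
    return subswarms
```

Passing a seeded `random.Random` instance as `rng` keeps a run reproducible, which is convenient when comparing reorganisation periods.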

In intelligent starting point selection, the starting point is selected near the best area in the global convex search space by using a nonrandom adaptive subswarm scattering strategy. The LSO algorithm attempts to jump using a fixed time interval and direction according to the suggested scout particle scattering at the starting point of the local optimum.

#### *3.2. Tracing the Sources of DDoS Attacks by Using the LS-PSO Algorithm*

To prevent particle swarms from converging quickly on a single path, the proposed LS-PSO algorithm divides them into several subswarms. Furthermore, to solve the multimodal search optimisation problem, the rules of each subswarm must be updated in the proposed LS-PSO algorithm.

Assuming that the particle swarm represents a group of packets in the attack path, each packet header record includes the source IP address, the address of the next route, and the destination address. Moreover, we consider that the highest fitness value would be obtained for the most recent experience in which particles travel on the best path. Multiple possible attack paths exist between the nodes (*ni*, ..., *nk*, ..., *nj*). The fitness value of each path is calculated to check whether a particle has travelled on a low-cost path. Usually, the path search algorithm is used to improve the efficiency of a travel routing system by selecting low-cost network routes, that is, routes where the distance between two nodes (i.e., *ni* and *nj*) is shorter, the hop count (*dij*) is smaller, and the path between the two nodes (*ni*, *nj*) has a high quality of service (QoS). Therefore, the path search algorithm usually selects the path with the lowest routing cost (i.e., the shortest travel distance and highest QoS to reduce the routing time). In general, the route cost of path *Ci* from node *xi* to the victim is inversely proportional to the distance travelled and directly proportional to the QoS (i.e., a high QoS corresponds to a low transmission delay and low traffic congestion); thus, *Lp* = *f*(QoS, 1/*dij*). Theoretically, the minimum cost function (*Lp*) must be determined to solve the multipeak optimisation problem; however, this function is subject to routing cost constraints. The minimum cost function (*Lp*) is a positive number (∑ *Ci* · *pnij* > 0, summed over *j* = 1, ..., *n*), which is expressed as follows:

$$Fitness = L\_{p} = \sum\_{j=1}^{n} C\_{i} \cdot p\_{ij}^{n}, \tag{1}$$

$$\text{Min } L\_{p}, \quad \forall\, i, j$$

$$\text{Subject to } \sum\_{j=1}^{n} C\_{i} \cdot p\_{ij}^{n} > 0, \tag{2}$$

where *Fitness* represents the fitness value of a path. Adaptability is considered to evaluate the suitability of each path, and *pnij* indicates whether a path exists from node *i* to node *j* for particle *n*. A *pnij* value of 1 indicates that such a path exists for particle *n*, and a *pnij* value of 0 indicates that it does not.
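As a small worked illustration of Equations (1) and (2), the sketch below sums the costs of the links that a particle actually traverses and rejects a non-positive total. Treating the cost term as a per-link value, and the names `path_cost`, `link_costs`, and `on_path`, are assumptions made for this example only.

```python
def path_cost(link_costs, on_path):
    """Return L_p = sum_j C_j * p_j for one particle.

    link_costs: cost of each candidate link (assumed per-link values).
    on_path:    1 if the particle traverses the link, otherwise 0.
    """
    if len(link_costs) != len(on_path):
        raise ValueError("link_costs and on_path must have the same length")
    cost = sum(c * p for c, p in zip(link_costs, on_path))
    if cost <= 0:
        # Constraint (2): the routing cost must be strictly positive.
        raise ValueError("routing cost must be positive")
    return cost
```

For example, with link costs `[2.0, 3.0, 5.0]` and indicators `[1, 0, 1]`, only the first and third links contribute, so the path cost is 7.0.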

Route searching approach: In the proposed LS-PSO algorithm, a two-stage route searching approach based on the cluster first, route second (CFRS) strategy is used for path searching in the entire solution space. Our solution technique involves creating subswarms of particles that contain certain information regarding the destination. Inspired by the CFRS strategy used in capacity-constrained vehicle routing problems, this study divided the attack source into multiple network areas according to the IP domain associated with the timing data from DNS logs to determine the minimum cost path to the destination on the basis of a weighted graph theory.

The CFRS performs a single swarming of the vertex set and then determines a route with the minimum cost for each swarm. It also regularly expands possible paths from the destination node by examining the possible paths of the starting node until the end condition is satisfied for reconstructing the overall attack paths. In addition, the CFRS assigns several subswarms of particles in sequence to each local area. It uses heuristic algorithms to acquire the global optimum. The advantage of using FBCFRS is that by clustering the routing traffic, the attack sources can be found within multiple local areas in advance. Moreover, the redundancy of the attack path reconstruction can be reduced.
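The two stages of the cluster first, route second idea can be sketched as follows. This is not the swarm-based search of the LS-PSO algorithm: the clustering uses a plain IP-prefix grouping as a stand-in for the IP-domain and DNS-timing clustering, and the routing stage uses Dijkstra's algorithm as a stand-in for the per-swarm search. All function names and the graph layout are hypothetical.

```python
import heapq
from collections import defaultdict


def cluster_by_prefix(source_ips, octets=2):
    """Stage 1: group source addresses by their leading octets."""
    clusters = defaultdict(list)
    for ip in source_ips:
        clusters[".".join(ip.split(".")[:octets])].append(ip)
    return dict(clusters)


def min_cost_path(graph, start, goal):
    """Dijkstra on {node: {neighbour: cost}}; returns (cost, path)."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return float("inf"), []


def cluster_first_route_second(source_ips, graph, victim):
    """Stage 2: keep the cheapest route to the victim for each cluster."""
    routes = {}
    for prefix, members in cluster_by_prefix(source_ips).items():
        routes[prefix] = min(min_cost_path(graph, ip, victim) for ip in members)
    return routes
```

Routing each cluster separately keeps the per-cluster search small, which mirrors the reduction in redundant path reconstruction described above.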

Exploration and exploitation processes: The exploration and exploitation processes follow different strategies. When solving multimodal optimisation problems, exploration involves following a new route, whereas exploitation involves following an existing route. In the exploitation process, the focus is on determining the local optimal solution by using the local and global updates of the position and velocity vectors. Therefore, the fitness value of each path in each subswarm must be updated to evaluate whether the particles travel on attack paths towards the attack sources. In the exploration stage, the global optimum is found using the regrouping method.

On the basis of previous studies on the use of the LSO algorithm in IPTBK analysis, the operation and verification process of the proposed LS-PSO algorithm is divided into three subphases, as depicted in Figure 5.

**Figure 5.** Flowchart of the LSO algorithm for the analysis of the network DDoS source tracing.

(1) Data preprocessing phase: The tcpdump tool is used to filter and collect the network routing packets required for path exploration, and to mark these packets for subsequent analysis and reorganisation. Then, Unicast Reverse Path Forwarding is used to check each router that the packets pass through. The source IP of the packet header is used to determine the path of the transmission connection.

(2) Route reconstruction phase: In multimodal optimisation methods, the exploration and exploitation processes are generally performed in different stages [13]. In the solitary operation stage, each subswarm is used to explore possible solutions. In this stage, the proposed LS-PSO algorithm focuses on determining the local optimal solutions in the solution space and prevents particle swarms from rapidly converging on a single path. In the social operation stage, the global optimal position is determined through a regrouping strategy.

2.1 Solitary operation: To increase the search efficiency of the particle swarms, the proposed LS-PSO algorithm divides them into several subswarms when solving multimodal optimisation problems, where the local update rules of each subswarm must be determined. In the LS-PSO algorithm, attack paths are explored and reconstructed on the basis of the route packets collected from the victim to calculate the fitness of each path.

The first particle swarm generates *R* = 5000 particles, and each subswarm has 20 particles (*S* = 20). The initial speed set for the LS-PSO algorithm is the same as that set for the LSO algorithm [14].

$$v\_o = c\_1 \cdot \left(\frac{Range}{2}\right) \cdot (c\_2 \cdot rand() - 1) \tag{3}$$

where *c*1 and *c*2 represent acceleration constants (*c*1 = 0.5 and *c*2 = 2), *Range* represents a unit value of particle position updating between the particle position and the centre of the subswarm, and *rand()* represents a random number in the range (0, 1). Suitable acceleration constants can control the particle speed. Route construction is performed using a velocity state updating rule for conducting position updates over 500 iterations (*n* = 500). The particle position for each iteration is updated using Equations (4) and (5).
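Equation (3) can be checked numerically with a short sketch. With *c*1 = 0.5 and *c*2 = 2, the factor (*c*2 · *rand()* − 1) lies between −1 and 1, so the initial speed stays within ±*Range*/4. The function name `initial_velocity` and the scalar treatment of *Range* are assumptions for illustration.

```python
import random


def initial_velocity(search_range, c1=0.5, c2=2.0, rng=random):
    """Equation (3): v_o = c1 * (Range / 2) * (c2 * rand() - 1)."""
    return c1 * (search_range / 2.0) * (c2 * rng.random() - 1.0)
```

A seeded generator such as `random.Random(1)` can be passed as `rng` to reproduce a particular initialisation.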

$$\mathbf{x}\_i^k(t) = \mathbf{x}\_i^k(t-1) + G \cdot \mathbf{v}\_i^k \; (t-1) \tag{4}$$

To examine whether any particle exists in a particle swarm, the LSO algorithm explores the best position of the particle swarm (*Pbest*) by using the gravity vector *G* (*G* = (0.95, 0.05)) [14] in Equation (4). Thus, the gravity force attracts all single particles to search the solution space. In the LSO algorithm, a fixed speed ratio of 0.95 is used to update the distance without considering the effects of the network capability (i.e., the node distance (*dij*) and QoS). Consequently, determining the best route between two edge nodes is difficult, and most particles travel on the frequently travelled paths. Therefore, the current study considered two important factors, namely *dij* and the QoS (Equation (5)).

$$\mathbf{x}\_i^k(t) = \mathbf{x}\_i^k(t-1) + \Delta \tau\_{ij}^k(t-1) \tag{5}$$

$$
\Delta \tau\_{ij}^k (t - 1) = \begin{cases}
\frac{v\_i^k(t) \cdot \text{QoS}}{d\_{ij}^k} & \text{for the optimal path of subswarm } k \\
0 & \text{otherwise}
\end{cases}
$$

In Equation (5), Δ*τkij*(*t* − 1) represents the movement of the particle in one time step Δ*t*, which is inversely proportional to the path distance *dkij* between the two end nodes. The parameter *dkij* represents the number of hops on the *i*th attack path in the *k*th subswarm.
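The position update of Equation (5) can be sketched for a scalar position as follows. The step is scaled by the QoS and divided by the hop count, and it is zero for particles that are not on the optimal path of their subswarm. The function names and the Boolean flag `on_optimal_path` are hypothetical.

```python
def position_step(velocity, qos, hop_count, on_optimal_path):
    """Step of Equation (5): v * QoS / d on the optimal path, otherwise 0."""
    if not on_optimal_path:
        return 0.0
    if hop_count <= 0:
        raise ValueError("hop_count must be positive")
    return velocity * qos / hop_count


def update_position(position, velocity, qos, hop_count, on_optimal_path):
    """x(t) = x(t-1) + step(t-1)."""
    return position + position_step(velocity, qos, hop_count, on_optimal_path)
```

For instance, a particle at position 10.0 with velocity 2.0, QoS 0.5, and a hop count of 4 moves by 2.0 × 0.5 / 4 = 0.25, whereas the same particle off the optimal path does not move.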

In each iteration, the velocity of each particle is updated using Equation (6).

$$v\_i^k(t) = w\_i \cdot v\_i^k(t-1) + (1 - w\_i) \cdot \left(p\_{best} - x\_i^k(t-1)\right) \tag{6}$$

A high *wi* value allows the particles to overshoot the destination easily, whereas a small *wi* value leads to slow convergence. The LS-PSO algorithm uses the gradient descent algorithm to search for the optimal acceleration factor.
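The trade-off controlled by *wi* can be seen in a minimal sketch of Equation (6), reading its left-hand side as the updated velocity (an interpretation, since the update blends the previous velocity with the pull towards *Pbest*). Scalar quantities and the function name `update_velocity` are assumptions.

```python
def update_velocity(velocity, position, pbest_position, w):
    """Equation (6): blend momentum with the pull towards the best position."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w is assumed to lie in [0, 1]")
    return w * velocity + (1.0 - w) * (pbest_position - position)
```

With `w = 1.0` the particle keeps its previous velocity and ignores *Pbest*, and with `w = 0.0` the velocity equals the remaining distance to *Pbest*; intermediate values interpolate between the two.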

To determine the acceleration factor *wi* (Equation (6)), this study used a greedy local search technique associated with the quasi-Newton Broyden–Fletcher–Goldfarb–Shanno (BFGS) method to identify possible local optima with an intelligent reconnaissance strategy (Equation (7)). Theoretically, the BFGS algorithm can efficiently search for the optimal particle positions when the particle is alone (*Pbest*) and in a subswarm (*Pgbest*) in a convex space. Moreover, it can efficiently improve the solution quality of each particle. To determine *Pbest* and *Pgbest* for a subswarm, the BFGS algorithm can be used for dynamically adjusting the particle acceleration (weight: *wi*) to avoid overfitting by minimising the routing cost *Ci* (Equation (7)).

$$w\_i(t+1) = w\_i(t) - \eta \frac{\partial C\_i}{\partial w\_i} \tag{7}$$

where *η* is the learning factor.
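The update rule of Equation (7) can be sketched with a plain gradient step. This is first-order gradient descent as the equation is written, not a BFGS implementation, and the derivative is approximated by a central finite difference. The quadratic `toy_cost` with its minimum at 0.7 is a made-up stand-in for the routing cost, and all names are hypothetical.

```python
def gradient_step(w, cost, eta=0.1, h=1e-6):
    """One step of w <- w - eta * dC/dw using a central difference."""
    grad = (cost(w + h) - cost(w - h)) / (2.0 * h)
    return w - eta * grad


def tune_weight(w0, cost, eta=0.1, steps=200):
    """Repeat the gradient step a fixed number of times."""
    w = w0
    for _ in range(steps):
        w = gradient_step(w, cost, eta)
    return w


def toy_cost(w):
    """Hypothetical convex cost with its minimum at w = 0.7."""
    return (w - 0.7) ** 2
```

Starting from `w0 = 0.2`, `tune_weight(0.2, toy_cost)` approaches 0.7, because each step shrinks the distance to the minimiser by the factor 1 − 2*η*.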

The recursive process with the aforementioned updating rule generates *Pbest* and *Pgbest* values for estimating the fitness value for each particle. The fitness value of each particle is calculated to examine whether the particle selects the best route. When a particle moves to a new position, the fitness value is calculated for this position. If the fitness value for the new position is higher than that for the previous best position (i.e., *Pbest*), the value of *Pbest* must be replaced by the fitness value for the new position, updated according to the particle's optimal experience. Similarly, *Pgbest* must be replaced by *Pbest* if the fitness value of the new position is higher than *Pgbest*.
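The bookkeeping for *Pbest* and *Pgbest* described above can be sketched as follows. The sketch assumes, as in the preceding paragraph, that a higher fitness value is better, and it stores each best as a position together with its fitness. The `Best` record and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Best:
    position: float
    fitness: float


def update_bests(position, fitness, pbest, gbest):
    """Replace Pbest, then Pgbest, whenever a higher fitness is found."""
    if fitness > pbest.fitness:
        pbest = Best(position, fitness)
    if pbest.fitness > gbest.fitness:
        gbest = Best(pbest.position, pbest.fitness)
    return pbest, gbest
```

Returning new records rather than mutating the inputs keeps the previous bests available, which is useful when a subswarm is later restarted from them.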

2.2 Social operation: To prevent the majority of subswarms from converging quickly to local optima, the LS-PSO algorithm uses the regrouping strategy to enable particles to escape from the original subswarms because of the mutual RF between particles. A fraction (e.g., 30%) of the particle subswarm is randomly selected to form a new subswarm. In the new subswarms, the starting points of the jumping particles are maintained around the best position *Pgbest* so as to improve the search results in the social operation process. The particle position is updated as follows:

$$\mathbf{x}\_{i}^{k}(t) = \mathbf{x}\_{i}^{k}(t-1) + \Delta \mathbf{x}\_{i}^{k}(t-1), \qquad \Delta \mathbf{x}\_{i}^{k}(t-1) = \pm \text{Range} \ast \left(1 + |\text{rand}() \ast \text{spacing}|\right) \tag{8}$$

where *Range* represents a unit value of particle position updating between the particle position and the centre of the subswarm. The random jump distance is set using the term |*rand*() ∗ *spacing*|; for example, spacing = 0.3 gives a small variation. The initial velocity is set using Equation (9) to accelerate the particles away from the previous local optima.

$$v\_{0}^{k}(t) = v\_{o} + v\_{i}^{k}(t-1) \cdot \left(\mathbf{x}\_{i}^{k}(t-1) - P\_{best}\right) \tag{9}$$

where *vo* is given by Equation (3).
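The scattering jump of Equation (8) and the restart velocity of Equation (9) can be sketched for scalar quantities as follows. The sign of the jump is drawn at random here, which is an assumption, since the text does not state how the direction is chosen; the function names are hypothetical.

```python
import random


def scatter_step(search_range, spacing=0.3, rng=random):
    """Equation (8): jump of +/- Range * (1 + |rand() * spacing|)."""
    sign = rng.choice((-1.0, 1.0))
    return sign * search_range * (1.0 + abs(rng.random() * spacing))


def restart_velocity(v0, prev_velocity, prev_position, pbest_position):
    """Equation (9): v_o plus the previous velocity scaled by the offset from Pbest."""
    return v0 + prev_velocity * (prev_position - pbest_position)
```

With spacing = 0.3, the magnitude of every jump lies between *Range* and 1.3 × *Range*, so the scattered particles stay near the previous best position without landing on it.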

The original subswarm and new subswarm then restart the search process and continue searching until the cost function error is less than the pre-set value or the maximum number of iterations is reached.

(3) Model validation phase: After updating the velocities and positions of the particle swarms, the proposed LS-PSO algorithm must determine the best path for successfully tracing the sources of DDoS attacks. The model accuracy is evaluated using the coverage percentage (%), which is the ratio of the average number of packets on an attack path to the total number of routing packets. The coverage percentage is expressed as follows:

$$\text{Coverage percentage (\%)} = \frac{\text{Average number of packets on an attack path}}{\text{Total number of routing packets}}, \tag{10}$$

where the average number of packets on an attack path is computed as the total number of packets on the route divided by the routing distance (in terms of the hop count). If the converged solution is not the true attack node, then the average number of packets on the route is reset to 0 and the search for the true route is resumed. The complete process is summarised as follows (Algorithm 2).

**Algorithm 2** Pseudocode of the LS-PSO Algorithm

