1. Introduction
With the increasing number of Internet users and the continual evolution of Internet services, the proportion of real-time multimedia transmission scenarios has increased significantly, leading to higher requirements for information transmission. Driven by these requirements, IP multicast technology has developed rapidly. As a one-to-many communication mode, IP multicast can effectively save network bandwidth and reduce the network load. It is suitable for applications that are centralized in time and distributed in space, such as video conferencing and streaming media. However, due to the charging mechanisms and technical limitations of Internet service providers (ISPs), the adoption of IP multicast [1,2] on the Internet is restricted. In contrast, application layer multicast (ALM) [3] migrates multicast data transmission from the IP layer to the application layer, where data are replicated and forwarded by end-hosts. Such approaches are easy to deploy and economical, as communication beneath ALM sessions still relies on widely available unicast technology.
The key in application layer multicast communication is the construction of an ALM routing tree, which determines the tree structure along which data are delivered from the sender to all the receivers in the group. ALM routing trees are composed of user nodes, which may exit or fail. This uncontrollability can lead to instability in the ALM routing tree, thus affecting the ability of users to receive multicast data [4]. Many researchers have attempted to reduce the instability caused by user nodes’ behavior by optimizing the topology of ALM routing trees [5]. Based on the behavior and attributes of the user nodes, end-hosts with high stability are preferred as core nodes for transmitting data. To optimize the ALM routing tree topology, Cao et al. established an instantaneous stability model for application layer multicast [6] and addressed the bounded-delay and high-stability model challenges [7]. In application layer multicast optimization, the delay is also an important optimization objective. Huo et al. proposed an algorithm based on the stability probability and contribution link of nodes (CL-S) [8], which also takes the node out-degree and edge delay into account. Mercan et al. proposed virtual direction multicast (VDM) [9] and noted that, as long as the virtual distance is based on the delay and the stability, VDM can construct a stable ALM routing tree with a low transmission delay. Li et al. noted that, in the overlay network, apart from the link delay, the replication delay of user nodes in processing messages should also be considered [10]. Liao et al. proposed an ALM model based on the node potential (NP) and a topological index (TI), which is suitable for large-scale, real-time multimedia environments [11]. Li et al. proposed a class of algorithms that create a greedy multicast tree based on the ratio of fan-out to delay (RFD) and the probability of terminal stability to obtain high performance in multicast sessions [12].
The ALM routing tree construction problem belongs to the class of combinatorial optimization problems, which are characterized by a high degree of complexity and computational difficulty. Intelligent algorithms have significant advantages in this regard. Some scholars have used neural networks to solve similar problems [13,14], and others have used evolutionary algorithms. For example, Pan et al. designed a genetic algorithm to minimize the end-to-end delay under the out-degree constraint [15]. In addition to the delay, Ma et al. considered the average path stretch and used the artificial fish swarm algorithm to solve the problem [16]. Based on previous research, Liu et al. further considered the instability index of an ALM routing tree and designed an encoding-free non-dominated sorting genetic algorithm to simultaneously optimize the total delay and instability of the ALM routing tree [5].
The above algorithms mainly optimize the delay and stability of ALM routing trees; however, several problems remain to be solved. The existing research has optimized a single multicast session at a time. However, it is fairly common for multiple multicast sessions to exist simultaneously. At present, studies on the simultaneous optimization of multiple co-existing ALM routing trees are rare. One feasible method is to apply a single-tree construction method multiple times; that is, the algorithm is used sequentially to construct each ALM routing tree. It is worth noting that, to improve the stability of data transmission, the user nodes with a higher stability are preferentially selected as the core nodes for data forwarding when an ALM routing tree is constructed. However, if these user nodes appear in multiple co-existing ALM routing trees at the same time, their out-degree (the number of times an end-host copies and forwards the data) increases significantly. As end-hosts have a limited ability to copy and forward data, node congestion occurs when the out-degree of a user node is too large. This is especially relevant for forwarding nodes that are close to the source, which may experience severe stress [17], further affecting the stability of the ALM routing tree. Therefore, when multiple ALM routing trees are optimized at the same time, the out-degree of the user nodes in each ALM routing tree needs to be reasonably distributed to ensure that the total out-degree of each end-host does not exceed its capability.
This study aims to obtain multiple co-existing ALM routing trees for multiple co-existing multicast sessions while striking a balance between minimizing the total delay and the total instability of these trees. We introduce the node out-degree as a constraint to prevent the instability of multicast sessions caused by node congestion. First, a low-delay and low-instability model of multiple co-existing ALM routing trees is established. To achieve the optimization goal, a one-off solution method based on a discrete artificial fish swarm algorithm (DAFSA) is proposed. In this method, the encoding of the DAFSA represents the selection scheme of Steiner node sets for multiple multicast sessions, and multiple ALM routing trees are then obtained from the complete graphs corresponding to the Steiner node sets by means of a spanning tree algorithm. The fitness function in the DAFSA is used to evaluate the generated ALM routing trees, and the search is iterated to find the optimal ALM routing trees. Node congestion analysis is performed to verify the effectiveness of the algorithm in dealing with the node out-degree constraints, and the performance of the algorithm is verified through detailed simulation experiments. Because the two objective functions (the delay and the instability) differ greatly in importance, a weight selection method is used to assist in decision making.
The rest of this paper is organized as follows. In Section 2, the constructed application layer multicast stability model is introduced. In Section 3, the idea used to solve the model is introduced, which is divided into two parts: selecting the Steiner point sets and improving the spanning tree algorithm. In Section 4, the design of the DAFSA and the improvement of Prim’s spanning tree algorithm are described in detail. In Section 5, exhaustive simulation experiments are shown, and the obtained results are analyzed. In Section 6, the experimental results and the design approach of this paper are discussed. In Section 7, a summary is given.
2. Optimization Model for Multiple Co-Existing ALM Routing Trees
The application layer network can be expressed as a graph G = (V, E), consisting of a vertex set V and an edge set E. Each vertex v ∈ V represents a user node, and each edge e ∈ E represents the communication channel between two user nodes. Each communication channel e has a transmission delay, and each user node v has a processing delay caused by message handling. A user node v leaves graph G with a certain probability. Each user node v also has an out-degree, which cannot exceed the maximum out-degree of that node, and a number of descendants. In this paper, we mainly optimize the delay and instability of ALM routing trees. The routing tree for a single multicast session includes one source and multiple destinations. The optimization model for multiple co-existing ALM routing trees is based on K multicast sessions, each with one source and M destinations, and generates K ALM routing trees. The out-degree of a user node is counted separately in each ALM routing tree.
2.1. Delay
Delay refers to the time required for data to travel from a source node to a destination node. In an application layer multicast session, the intermediate nodes that forward data are the end-hosts. End-host equipment has a limited forwarding capability, so the processing delay cannot be ignored. Therefore, the delay in this paper includes two parts: the transmission delay and the processing delay in end-hosts. The delay is defined for each ALM routing tree, and the total delay is calculated as shown in Equation (1).
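The two delay components can be illustrated with a minimal Python sketch. It assumes (this is an assumption, not the exact form of Equation (1)) that the delay of a tree is the sum of the source-to-destination delays and that each hop adds the edge delay plus the processing delay of the forwarding node. The names `path_delay`, `tree_delay`, `total_delay`, and the example data are hypothetical.

```python
def path_delay(parent, edge_delay, proc_delay, node):
    """Delay from the source to `node` along the tree.

    Assumption: every hop adds the edge delay plus the processing
    (replication) delay of the node that forwards over that hop.
    """
    total = 0.0
    while parent[node] is not None:
        forwarder = parent[node]
        total += edge_delay[(forwarder, node)] + proc_delay[forwarder]
        node = forwarder
    return total


def tree_delay(parent, edge_delay, proc_delay, destinations):
    """Assumed tree delay: sum of the source-to-destination delays."""
    return sum(path_delay(parent, edge_delay, proc_delay, d) for d in destinations)


def total_delay(trees):
    """Total delay over co-existing trees; each item is an argument tuple."""
    return sum(tree_delay(*tree) for tree in trees)


# Hypothetical three-node tree: s -> a -> b (child -> parent map).
parent = {"s": None, "a": "s", "b": "a"}
edge_delay = {("s", "a"): 2.0, ("a", "b"): 3.0}
proc_delay = {"s": 1.0, "a": 0.5, "b": 0.5}
example_delay = tree_delay(parent, edge_delay, proc_delay, ["a", "b"])
```

In this toy tree, destination a is reached after 3.0 time units and destination b after 6.5, so the assumed tree delay is 9.5.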
2.2. Instability
Instability mainly focuses on the exit and failure of user nodes. Node exiting means that a user node voluntarily leaves the application layer multicast session, while user node failure means that a user node leaves the application layer multicast session without notifying any other user nodes. In the ALM routing tree, the exit and failure behaviors of non-leaf nodes cause their descendant nodes to lose connectivity with the root node of the multicast tree.
2.2.1. Reducing the Impact of User Nodes’ Exiting Behavior
User node exiting is a spontaneous behavior. As the distribution of the online times of the end-hosts in multicast sessions is heavy-tailed [7,18], this study focuses on the probability of user nodes exiting and uses the average number of descendant user nodes affected by the exit of a user node to measure the instability of ALM routing trees. The instability is defined for each ALM routing tree, and the total instability is calculated as shown in Equation (2).
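A minimal Python sketch of this measure follows. It assumes that the instability of a tree is the expected number of descendants cut off, that is, the sum over forwarding nodes of the leaving probability times the number of descendants, and that the source is excluded; both points are assumptions rather than the exact form of Equation (2). The function names and example values are hypothetical.

```python
from collections import defaultdict


def descendant_counts(parent):
    """Number of descendants of every node in a child -> parent map."""
    children = defaultdict(list)
    root = None
    for node, p in parent.items():
        if p is None:
            root = node
        else:
            children[p].append(node)

    counts = {}

    def count(node):
        # A node's descendants are its children plus their descendants.
        counts[node] = sum(1 + count(child) for child in children[node])
        return counts[node]

    count(root)
    return counts


def tree_instability(parent, leave_prob):
    """Assumed instability: expected number of descendants that lose data."""
    counts = descendant_counts(parent)
    return sum(leave_prob[v] * counts[v] for v in counts if parent[v] is not None)


# Hypothetical tree: s -> a -> {b, c}
parent = {"s": None, "a": "s", "b": "a", "c": "a"}
leave_prob = {"a": 0.1, "b": 0.2, "c": 0.3}
example_instability = tree_instability(parent, leave_prob)
```

Leaf nodes contribute nothing because they have no descendants, which is why placing stable nodes in forwarding positions lowers this value.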
2.2.2. Reducing the Risk of User Nodes’ Failure
User node failure is a passive behavior, which usually occurs when a user node loses the ability to forward data under a heavy load. Therefore, in this study, the out-degree of a node is limited to reduce the load on the end-host. Equation (3) is the constraint.
In this study, the delay and instability are considered as the optimization objectives. However, these two objective functions may be in conflict. To find an appropriate trade-off, weights for the objective functions are introduced to convert the multi-objective problem into a single-objective problem. Equation (4) is the specific formula.
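As a hedged illustration, the model described in this section can be written compactly as below. All symbols are assumptions introduced for this sketch and may differ from the notation of Equations (1)-(4): $T_k$ is the routing tree of session $k$, $D(T_k)$ and $I(T_k)$ are its delay and instability, $q(v)$ is the leaving probability of node $v$, $n_k(v)$ is its number of descendants in $T_k$, $\deg_k(v)$ is its out-degree in $T_k$, $\deg_{\max}(v)$ is its out-degree limit, and $w_1, w_2$ are the objective weights.

```latex
\begin{aligned}
D_{\mathrm{total}} &= \sum_{k=1}^{K} D(T_k), \\
I_{\mathrm{total}} &= \sum_{k=1}^{K} I(T_k),
  \qquad I(T_k) = \sum_{v \in T_k} q(v)\, n_k(v), \\
\sum_{k=1}^{K} \deg_k(v) &\le \deg_{\max}(v) \qquad \forall v \in V, \\
\min\; F &= w_1\, D_{\mathrm{total}} + w_2\, I_{\mathrm{total}} .
\end{aligned}
```

The third line expresses the requirement that the total out-degree of an end-host across all co-existing trees stays within its capability; the last line is the weighted-sum scalarization of the two objectives.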
3. One-Off Optimization
The problem of ALM routing tree construction is essentially the Steiner tree problem in graph theory [19,20]. This problem requires finding the optimal tree that contains specified terminal nodes. However, solving this problem is very difficult: it has been proven to be NP-complete [21], which means that no polynomial-time algorithm for solving it is known, and exact methods must, in the worst case, search the solution space with exponential or even factorial complexity.
In the construction of multiple co-existing ALM routing trees, multiple co-existing application layer multicast sessions correspond to multiple Steiner trees. This further escalates the difficulty of solving the problem, as different multicast sessions may share nodes, and the out-degree of a node needs to be guaranteed not to exceed the performance limit of the node.
Although the co-existing Steiner tree optimization problem is difficult to solve, the spanning tree problem, which involves finding a single tree that contains all the vertices, is relatively simple. It has been studied in depth and includes the minimum spanning tree problem [22,23], the degree-constrained minimum spanning tree problem [24], the multi-objective spanning tree problem [25], and so on.
In addition, it is very difficult to rationally allocate the out-degree of nodes between multiple co-existing ALM routing trees, which often results in an inability to obtain a feasible solution. However, the good adaptability and global search ability of the DAFSA enable it to perform well when dealing with problems involving complex constraints [26]. At present, the processing methods for infeasible solutions include the use of penalty functions, repair methods, and so on.
In this study, the problem is decomposed into the following two parts.
3.1. Evolution: Using the DAFSA, Based on the Actual Source Nodes and the Destination Nodes, an Appropriate Set of Steiner Nodes Is Selected through a Population Iteration
The key to solving the considered problem is selecting the Steiner nodes, that is, user nodes that are neither the source nor the destinations of a session. These nodes serve as the core nodes that connect the destination nodes. The positions and numbers of these nodes usually vary according to the nature of the problem and the optimization goal. A trade-off needs to be struck between low node instability and a low delay between the source and the destinations, while also considering the out-degree constraints of the user nodes so that the Steiner nodes are rationally distributed among the trees. For each session, the selected Steiner nodes, the source node, and the destination nodes are combined into a complete subgraph.
The discrete artificial fish swarm algorithm is a swarm intelligence algorithm. The basic idea of this algorithm is to simulate the behavior of individual fish in a fish swarm, such that the whole swarm can cooperatively find an optimal solution in the solution space. Each artificial fish represents a candidate solution in the solution space, and the fish exchange information and adjust their positions to find an optimal solution. Owing to a number of salient properties, which include flexibility, fast convergence, and insensitivity to the initial parameter settings, the AFSA family has emerged as an effective swarm intelligence (SI) methodology that has been widely applied to solving real-world optimization problems [27]. One of its main advantages is the ability to perform a global search in the search space and avoid becoming trapped in local optimal solutions.
The algorithm contains a series of behavior rules, such as foraging, following, randomly moving, and so on. These rules simulate the behavior of individual artificial fish when searching for food and avoiding danger:
(1) Randomly moving behavior: The individual randomly moves in various directions within its step limit.
(2) Foraging behavior: The individual randomly explores a new position within its visual limit. If the new position has a better fitness, it moves toward this position within its step limit; otherwise, if a position with a better fitness cannot be found within try_number attempts, it will move randomly.
(3) Following behavior: The individual perceives the optimal individual within its visual limit and moves toward that individual if the surrounding area is not crowded; otherwise, the individual performs foraging.
In this study, the artificial fish school behavior strategy designed by Ma et al. [16] was used. First, whether the artificial fish (AF) are crowded or not is determined. If not, the fish perform the following behavior and the algorithm ends. Otherwise, the individual enters into foraging behavior.
3.2. Evaluation: Based on the Spanning Tree Algorithm, the Complete Subgraph Is Converted into an ALM Routing Tree, and the Fitness Value Is Calculated
For this part, an ALM routing tree must be constructed based on the obtained complete subgraph; that is, all of the terminal nodes are connected using Steiner nodes, ensuring that the objective function is optimized. This problem is similar to the minimum spanning tree problem.
Prim’s algorithm [22] is simple and efficient for the minimum spanning tree problem. Its basic idea is to start from an initial node and repeatedly add the shortest edge that connects the current tree to a node outside it until all the nodes are covered. According to the objective function defined above, this study improves Prim’s algorithm to heuristically construct an ALM routing tree with a low delay and better stability.
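For reference, the classic (unmodified) Prim's algorithm can be sketched in Python as follows. This is the textbook version driven by edge weights only, not the improved variant used in this study; the function name `prim_mst` and the example graph are hypothetical.

```python
import heapq


def prim_mst(weights, start):
    """Classic Prim's algorithm on an undirected weighted graph.

    weights: dict mapping node -> {neighbor: edge weight}.
    Returns the tree as a list of (parent, child, weight) edges.
    """
    visited = {start}
    heap = [(w, start, v) for v, w in weights[start].items()]
    heapq.heapify(heap)
    tree = []
    while heap and len(visited) < len(weights):
        w, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # this edge would close a cycle
        visited.add(v)
        tree.append((u, v, w))
        for nxt, w2 in weights[v].items():
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return tree


# Hypothetical four-node graph.
graph = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 3},
    "d": {"b": 5, "c": 3},
}
mst = prim_mst(graph, "a")
```

Starting from a, the sketch adds edges a-b, b-c, and c-d in that order, for a total weight of 6.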
4. One-Off Optimization Method for Multiple Co-Existing Application Layer Multicast Trees
In this study, the DAFSA is used as the core method for the optimization of multiple co-existing ALM routing trees. First, based on the input multicast sessions, multiple suitable Steiner node sets are selected, each forming a complete subgraph, as shown in Figure 1. Then, the subgraphs are converted into ALM routing trees using the improved spanning tree algorithm. Subsequently, the trees are evaluated and the bulletin board (used to store the set of optimal routing trees) is updated. The optimal ALM routing trees are ultimately obtained through continuous iteration. It is worth noting that the improved spanning tree algorithm is a deterministic algorithm, so the selected Steiner node set directly determines the fitness value of the resulting ALM routing tree.
4.1. Application of DAFSA in Multiple Co-Existing ALM Routing Trees
4.1.1. Encoding
The genotypes of the artificial fish are represented using matrix coding, where each row represents a Steiner node selection scheme for a multicast session, and the selected set of nodes forms a complete subgraph. Equation (5) represents the code for artificial fish X (AF-X) as a binary matrix with one row per multicast session, where each row has |V| elements and each element can only be 0 or 1. If the complete subgraph of session k contains vertex i, then the element in row k and column i is 1; otherwise, it is 0. All the elements in any row corresponding to the source and destinations of that session should always be 1, as all potential complete subgraphs must contain the source and destinations.
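The encoding can be illustrated with a minimal Python sketch, assuming nodes are indexed from 0 and each session is given as a (source, destinations) pair. The names `random_fish` and `steiner_sets` are hypothetical helpers, not part of the original implementation.

```python
import random


def random_fish(num_nodes, sessions, rng):
    """Build one binary encoding matrix (one row per multicast session).

    sessions: list of (source, destinations) with node indices in
    range(num_nodes). Source and destination bits are forced to 1;
    every other bit is a random Steiner-node choice.
    """
    fish = []
    for source, destinations in sessions:
        fixed = {source, *destinations}
        row = [1 if i in fixed else rng.randint(0, 1) for i in range(num_nodes)]
        fish.append(row)
    return fish


def steiner_sets(fish):
    """Decode each row into the node set of its complete subgraph."""
    return [{i for i, bit in enumerate(row) if bit} for row in fish]


# Hypothetical example: 6 user nodes, 2 sessions.
sessions = [(0, [1, 2]), (3, [4])]
fish = random_fish(6, sessions, random.Random(0))
node_sets = steiner_sets(fish)
```

Whatever the random bits are, each decoded node set contains the source and destinations of its session, so every row always describes a candidate subgraph that can be turned into a routing tree.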
4.1.2. Fitness Function
The fitness function is used to evaluate the quality of the artificial fish. To handle artificial fish that do not satisfy the constraints, a penalty value is introduced into the fitness function. Because a constraint violation makes the fitness function take a large value, the artificial fish that do not meet the constraints are eliminated in the iterative process. This strategy emphasizes the importance of satisfying the constraint conditions and guides the algorithm toward suitable solutions in the search space. The Fitness is defined in Equations (6) and (7), where p in Equation (6) is the penalty factor and the violation term in Equation (7) represents the amount by which the out-degree of a node exceeds the degree constraint.
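A penalized fitness of this kind can be sketched in Python as follows. The sketch assumes a weighted sum of total delay and total instability plus the penalty factor times the total out-degree excess, with out-degrees accumulated over all co-existing trees; the exact form of Equations (6) and (7) may differ. The function names, weights, and example values are hypothetical.

```python
from collections import Counter


def out_degree_violation(trees, max_out_degree):
    """Total amount by which accumulated out-degrees exceed the limits.

    trees: list of child -> parent maps (root maps to None).
    """
    used = Counter()
    for parent in trees:
        for child, p in parent.items():
            if p is not None:
                used[p] += 1  # each child costs its parent one copy
    return sum(max(0, used[v] - max_out_degree[v]) for v in used)


def fitness(total_delay, total_instability, violation, w_delay, w_instab, penalty):
    """Assumed penalized fitness (smaller is better)."""
    return w_delay * total_delay + w_instab * total_instability + penalty * violation


# Hypothetical example: node "a" forwards to two children in each tree.
tree1 = {"s1": None, "a": "s1", "b": "a", "c": "a"}
tree2 = {"s2": None, "a": "s2", "d": "a", "e": "a"}
limits = {"s1": 3, "s2": 3, "a": 3}
violation = out_degree_violation([tree1, tree2], limits)
score = fitness(10.0, 2.0, violation, 0.5, 0.5, 100.0)
```

Node a stays within its limit in each tree taken alone but exceeds it by one across both trees, so the penalty term dominates the score and the candidate would be driven out of the population.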
4.1.3. Behavior of Artificial Fish
The artificial fish cooperatively search the solution space through the execution of behaviors. Specifically, the search for better solutions is realized through changes in spatial position. As the solution space is discrete, the Hamming distance [28] is used to measure the distance between two artificial fish. In this study, the behaviors used in the DAFSA were designed as follows:
(1) Randomly moving behavior: The encoding method used in this study is binary encoding. To implement this behavior, we only need to randomly flip elements of the encoding matrix of AF-X that do not correspond to the source and destinations, that is, change 0 to 1 and 1 to 0.
(2) Foraging behavior: Suppose the current position of an AF is X. The AF randomly explores a new position. If the foraging behavior is successful (i.e., the new position has a better fitness than X), then the AF randomly selects some of the elements in which X and the new position differ and overwrites them in X with the corresponding elements of the new position; otherwise, the AF performs random movement.
(3) Following behavior: Following (or tail-chasing) is a behavior that imitates other AFs, especially those that perform well. Suppose that, within the visual range of AF-X, there are n AFs, and consider the one with the best fitness. The Hamming distance between X and this best neighbor is the number of elements in the encoding matrix of X that differ from the corresponding ones in the best neighbor. The following behavior is executed if and only if the best neighbor has a better fitness than X and its surrounding area is not crowded. In that case, some of the differing elements are randomly selected, and the values of the best neighbor overwrite the corresponding elements of X, such that the distance between the two AFs decreases and their similarity increases.
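The position updates behind the three behaviors can be sketched in Python on binary matrices. The sketch assumes that a `step` parameter bounds how many elements change per move and that the fixed cells (source and destination bits) are passed in explicitly; the names `hamming`, `random_move`, and `move_toward` are hypothetical.

```python
import random


def hamming(x, y):
    """Number of matrix cells in which two encodings differ."""
    return sum(a != b for rx, ry in zip(x, y) for a, b in zip(rx, ry))


def random_move(x, fixed, step, rng):
    """Flip up to `step` free bits; cells in `fixed` are never touched."""
    free = [(r, c) for r, row in enumerate(x)
            for c in range(len(row)) if (r, c) not in fixed]
    new = [row[:] for row in x]
    for r, c in rng.sample(free, min(step, len(free))):
        new[r][c] = 1 - new[r][c]
    return new


def move_toward(x, target, step, rng):
    """Copy up to `step` differing cells from `target` into a copy of x.

    Used for both foraging (target = better explored position) and
    following (target = best neighbor in the visual range).
    """
    diff = [(r, c) for r, row in enumerate(x)
            for c in range(len(row)) if row[c] != target[r][c]]
    new = [row[:] for row in x]
    for r, c in rng.sample(diff, min(step, len(diff))):
        new[r][c] = target[r][c]
    return new


# Hypothetical 2-session, 4-node example; column 0 holds fixed bits.
rng = random.Random(1)
x = [[1, 0, 0, 1], [1, 1, 0, 0]]
best = [[1, 1, 1, 1], [1, 0, 0, 1]]
closer = move_toward(x, best, 2, rng)
moved = random_move(x, {(0, 0), (1, 0)}, 3, rng)
```

Each call to `move_toward` reduces the Hamming distance to the target by exactly the number of copied cells, which is the sense in which a fish approaches a better position in a discrete space.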
4.2. Improved Spanning Tree Algorithm
During the decoding of an individual artificial fish, a tree that connects all the nodes needs to be obtained from a complete graph. To make the constructed tree more stable with less delay while satisfying the out-degree constraint of the user nodes, this study improves Prim’s algorithm: instead of using the edge weights alone, the order in which nodes join the tree is determined by jointly considering the delay and the instability. We use the delay and instability contribution (DIC), calculated as shown in Equation (8). In Equation (8), the DIC of a candidate node depends on four quantities: the depth at which the node joins the tree, the corresponding edge delay after the node is added to the tree, the node replication delay, and the probability of the node leaving a multicast session.
The node depth refers to the number of nodes on the path from the source node to a given node. The greater the depth of a node, the more unstable its data transmission path is, as the departure of any of the node’s ancestor nodes will cause it to receive no data. Therefore, to increase the stability of the entire tree, the depth of each node should be kept as small as possible.
When the delays of the end-hosts are the same, the preference is to choose the end-hosts with a low leaving rate, as the nodes that are preferentially added to the tree are more likely to serve as transit nodes for data forwarding. In this way, the overall stability of the multicast tree can be increased. Similarly, when nodes have the same probability of leaving, the node with the shortest delay is selected first, which can reduce the overall delay. Smaller DIC nodes should be at the upper level of the multicast tree, in order to take full advantage of their low delay and low instability, thus improving the two target values of the ALM routing tree.
By borrowing ideas from Prim’s algorithm, a preliminary ALM routing tree can be obtained that connects all the nodes in the complete graph. However, in the process of generating the tree, the phenomenon of node redundancy may occur due to improper selection of the Steiner node set; that is, non-destination nodes may appear at leaf nodes and are only involved in receiving data, not in forwarding it. The data transmission corresponding to this part has no practical significance and will only increase the delay and instability. These redundant branches need to be pruned, in order to ensure that the leaf nodes only contain the destination nodes of the session.
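The pruning step described above can be sketched in Python as follows, assuming the tree is stored as a child -> parent map. The function name `prune_redundant_leaves` and the example tree are hypothetical.

```python
def prune_redundant_leaves(parent, destinations):
    """Repeatedly remove leaves that are not destinations.

    parent: child -> parent map (the root maps to None).
    Returns a pruned copy; the root is never removed.
    """
    parent = dict(parent)
    keep = set(destinations)
    while True:
        has_child = {p for p in parent.values() if p is not None}
        redundant = [n for n, p in parent.items()
                     if p is not None and n not in has_child and n not in keep]
        if not redundant:
            return parent
        for n in redundant:
            del parent[n]  # may expose a new redundant leaf next round


# Hypothetical tree: s -> a -> x -> y and s -> d; destinations are a and d.
tree = {"s": None, "a": "s", "x": "a", "y": "x", "d": "s"}
pruned = prune_redundant_leaves(tree, ["a", "d"])
```

The loop matters because removing one redundant leaf (y here) can turn its parent (x) into a new redundant leaf, so a single pass is not enough.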
The improved spanning tree algorithm based on Prim’s algorithm is constructed in Algorithm 1.
Algorithm 1: DIC-based tree generation algorithm
4.3. Algorithm Process
(1) The application layer network is input, and the relevant sources and destinations in K co-existing multicast sessions are specified;
(2) The algorithm-related parameters, including the penalty factor p, are set;
(3) Individual artificial fish execute the behavior strategy and obtain multiple Steiner node sets;
(4) The improved spanning tree algorithm is used to obtain the co-existing ALM routing trees corresponding to the multiple Steiner node sets obtained for the AF;
(5) The fitness of each AF individual is evaluated by calculating the delay and instability of the multiple co-existing ALM routing trees. The current best AF individual is compared with the one recorded on the bulletin board, and if its fitness is better, the bulletin board is updated;
(6) It is determined whether the algorithm termination condition has been met. If not, steps (3)–(6) are repeated; otherwise, the ALM routing trees corresponding to the multicast sessions are output.
5. Simulation Experiment Analysis
The DAFSA approach designed in this paper was implemented and tested in C++. The simulations were run on a computer with an AMD Ryzen 7 5700U processor with Radeon Graphics (1.80 GHz), 16.00 GB of RAM, and the Windows 7 (x64) operating system. The parameter settings were chosen experimentally; a detailed discussion of the parameter settings is given in Section 5.4 and Section 5.5.
Figure 2 shows the IP network diagram. The circles in the diagram represent the user nodes, and the squares represent the router nodes. Each user node has two transmission parameters: the node replication delay and the departure probability. The weights between nodes represent the data transfer delays. Although the application layer multicast approach uses user nodes to forward data, the data are still carried between user nodes by unicast through the router nodes of the underlying network. The edge delay between each pair of user nodes was obtained using Dijkstra’s shortest path algorithm.
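Deriving overlay edge delays from the underlying network can be sketched in Python with a standard heap-based Dijkstra. The function names and the small example topology (user nodes u1-u3, routers r1-r2) are hypothetical and do not reproduce the network of Figure 2.

```python
import heapq


def dijkstra(graph, source):
    """Shortest delays from `source`; graph: node -> {neighbor: delay}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def overlay_delays(graph, user_nodes):
    """Lowest-delay unicast path between every ordered pair of user nodes."""
    delays = {}
    for u in user_nodes:
        dist = dijkstra(graph, u)
        for v in user_nodes:
            if v != u:
                delays[(u, v)] = dist[v]
    return delays


# Hypothetical underlay: user nodes attach to routers r1 and r2.
underlay = {
    "u1": {"r1": 1},
    "u2": {"r2": 1},
    "u3": {"r1": 5, "r2": 1},
    "r1": {"u1": 1, "r2": 2, "u3": 5},
    "r2": {"u2": 1, "r1": 2, "u3": 1},
}
delays = overlay_delays(underlay, ["u1", "u2", "u3"])
```

The resulting dictionary defines a complete graph over the user nodes, which is the graph on which the ALM routing trees are built.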
The session results for the optimization of four co-existing multicast sessions, each with one source node and eight destination nodes, are shown in Table 1, and the ALM routing trees obtained using the proposed algorithm are shown in Figure 3. For each ALM multicast tree corresponding to a multicast session, the out-degrees of all the nodes in Figure 3 satisfied the constraint. The out-degrees of nodes 8, 30, 38, and 24 were all 5. The instability probabilities of nodes 8, 30, and 38 were very low (i.e., two orders of magnitude lower than those of the other nodes). Therefore, when constructing the ALM routing trees, these three nodes were preferentially selected as the transfer nodes for data transmission. The out-degree of node 24 was also 5 because the out-degrees of nodes 8, 30, and 38 had already been allocated, so the data could only be forwarded through other nodes. However, the other nodes had a high probability of instability and, thus, were not suitable as transfer nodes. Therefore, the root node was directly used to transmit data to reduce the depth of the entire tree, thereby reducing the instability of the ALM routing tree.
Table 2 lists the delay and instability of the ALM routing tree for the four multicast sessions. As analyzed above, ALM trees a, b, and c used nodes 8, 30, and 38 as the transit nodes, respectively, which effectively reduced the instability. However, to satisfy the node out-degree constraint, the algorithm eventually selected some transit nodes (i.e., non-source and non-destination Steiner nodes), resulting in an increase in the link delay. In contrast, although ALM tree d (corresponding to multicast session 4) achieved a lower delay, it paid a higher price with its instability, which further illustrates that the algorithm made a certain trade-off between delay and stability.
In fact, the routing tree obtained with the algorithm is defined at the application layer, and the actual data forwarding between user nodes is performed by the router nodes in the form of IP unicast. Taking ALM tree b from session 2 as an example, the actual data transmission process is shown in Figure 4. The transmission path between each pair of nodes was the transmission path with the lowest delay.
5.1. Comparison between One-Off Optimization and Sequential Optimization
5.1.1. Comparison of Sequential Optimization That Does Not Consider the Out-Degree Constraint
In sequential optimization without considering the out-degree constraint, only one multicast session is optimized at a time, and the out-degree constraint on nodes is not considered. In one-off optimization, multiple multicast sessions are considered simultaneously to yield all multicast session transmission schemes. Different numbers of multicast sessions and destination nodes in each multicast session were set, and the node congestion under the two approaches described above was analyzed.
Figure 5 shows the fitting curves under sequential optimization and one-off optimization. The black dots indicate the out-degree violations under the two algorithms. The fitting surface shows that the node out-degree violation under sequential optimization increased exponentially with the number of multicast sessions and destination nodes, while one-off optimization presented no node constraint violations.
From the above analysis, as nodes 8, 30, and 38 were suitable transit nodes for forwarding data, their out-degree easily exceeded the constraint. For further analysis, we designed each session to contain five destination nodes and tested the out-degree of these three nodes under different numbers of multicast sessions.
Figure 6 shows that, in sequential optimization, when the number of multicast sessions was greater than four, the out-degree of node 30 exceeded the constraint, and when the number of multicast sessions was greater than six, the out-degree of node 38 exceeded the constraint; meanwhile, for node 8, the out-degree was basically maintained at 3 and was within the constraint. With an increase in the number of sessions, the out-degree of nodes 30 and 38 increased significantly. In addition, we found that the sum of the out-degree violation levels for nodes 30 and 38 and of all the nodes was equal, indicating that, under the considered experimental conditions, these two nodes caused the ALM routing tree to fail to satisfy the constraints.
In contrast, in one-off optimization, when the number of sessions reached four, the out-degrees of nodes 8, 30, and 38 were all 5, equal to the critical constraint value. However, as the number of multicast sessions increased, the out-degree of these three nodes did not exceed the constraint. This indicates that the one-off optimization method can make full use of the out-degree of core nodes and obtain an optimal solution under the constraint conditions.
5.1.2. Comparison with Sequential Optimization While Considering the Out-Degree Constraint
The above experiments provided in-depth information on the impact of not introducing constraint processing technology in sequential optimization. Notably, sequential optimization can also consider the out-degree of a node as a constraint condition. We adopted the node out-degree reservation strategy; that is, each time the optimization of an ALM routing tree is completed, the out-degree of the corresponding node is purposefully reduced. In the next optimization of the ALM routing tree, we can choose only those nodes that still have a valid out-degree. However, this strategy may trap the entire ALM routing tree in a local optimal solution.
This occurs because, during the construction of the ALM routing tree, better nodes are initially selected. As the out-degree of such core nodes is exhausted, the subsequent ALM routing tree can use only other nodes with a greater delay and a greater instability, resulting in a sharp increase in the instability and delay of the whole tree.
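The out-degree reservation strategy for sequential optimization can be sketched in Python as follows. The sketch assumes a shared residual budget per node that is reduced after each finished tree; the function name `reserve_out_degree` and the example values are hypothetical.

```python
def reserve_out_degree(residual, parent):
    """Charge one finished tree against the shared out-degree budget.

    residual: node -> remaining out-degree (updated in place).
    parent: child -> parent map of the tree that was just built.
    Returns the nodes that can still forward data in later trees.
    """
    for child, p in parent.items():
        if p is not None:
            residual[p] -= 1
    return {v for v, left in residual.items() if left > 0}


# Hypothetical budget of 2 per node and a first tree s -> a -> {b, c}.
residual = {"s": 2, "a": 2, "b": 2, "c": 2}
first_tree = {"s": None, "a": "s", "b": "a", "c": "a"}
usable = reserve_out_degree(residual, first_tree)
```

After the first tree, node a has no budget left, so later trees must route around it; this is the mechanism by which early trees can exhaust the best forwarding nodes.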
Table 3 shows the optimization results obtained for four multicast sessions. The number of destination nodes for each session was five, and the out-degree of each node was two. Although the one-off optimization method was not as good as the sequential optimization method in the construction of the first ALM routing tree, the results of the one-off optimization method showed a lower delay and instability when constructing the third and fourth ALM routing trees. When considering multiple co-existing ALM routing trees, the overall delay and stability were significantly better than those of the trees constructed using the sequential optimization method.
As can be seen from Figure 7, in the first ALM routing tree, nodes 8, 30, and 38 were used, which decreased the delay and instability. However, in the third and fourth ALM routing trees, the out-degrees of the selected core nodes 8, 30, and 38 had already been used up, so only other nodes could be selected to transmit data, resulting in significant increases in delay and instability in these two routing trees.
In contrast, Figure 8 shows the results of one-off optimization. Through the rational distribution of nodes (for example, by using the out-degree of nodes 14 and 38 in ALM routing trees 3 and 4), the delay and instability of ALM routing trees 3 and 4 were reduced. Although this optimized allocation slightly increased the delay and instability of the first routing tree, it reduced the delay and instability of the multiple ALM routing trees as a whole. This result further clarifies the limitations of independently optimizing the ALM routing tree for each session. In contrast, the one-off optimization method used in this paper can more effectively optimize the overall performance.
5.2. Validation of the Penalty Function Mechanism
For the optimization of multiple co-existing ALM multicast routing trees, violating the out-degree constraint of nodes may cause failure of data transmission. Therefore, determining how to guide individual artificial fish to search in the feasible solution domain is highly important. In this study, a penalty mechanism was introduced to eliminate solutions that do not satisfy the constraints. In Figure 9, we compare the effect of the algorithm with and without the use of the penalty mechanism regarding the out-degree violation of nodes.
As the scale of the multicast sessions increased, the results obtained without the penalty mechanism showed a growing node out-degree violation, indicating that a large number of destination nodes had to copy and forward data beyond their out-degree limit. When the performance limit of a node is exceeded, the end-host goes down, which causes the session to fail. In contrast, when the penalty mechanism was used, the results did not include any node exceeding the out-degree constraint. This demonstrates that the penalty mechanism can effectively handle the out-degree constraint. In particular, as the scale of the multicast sessions increased, the algorithm with the penalty mechanism performed better in terms of reducing the out-degree violations of nodes.
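A penalty mechanism of this kind can be sketched as follows. The additive form, the penalty coefficient, and the function names are assumptions for illustration; this section does not state the exact penalty function used. Each tree is represented as a list of (parent, child) edges, and the out-degree of a node is counted across all co-existing trees.

```python
from collections import Counter


def out_degree_violation(trees, max_out_degree):
    """Total out-degree in excess of the limit, summed over all nodes and
    counted across every co-existing tree."""
    out_degree = Counter(parent for tree in trees for parent, _child in tree)
    return sum(max(0, d - max_out_degree) for d in out_degree.values())


def penalized_fitness(objective, trees, max_out_degree, penalty=1e3):
    """Lower is better: infeasible solutions receive a large additive penalty
    (the coefficient 1e3 is a hypothetical choice)."""
    return objective + penalty * out_degree_violation(trees, max_out_degree)


# Node 8 forwards to six children in total across two trees.
trees = [[(8, 1), (8, 2), (8, 3)], [(8, 4), (8, 5), (8, 6)]]
print(out_degree_violation(trees, max_out_degree=5))
print(penalized_fitness(0.7, trees, max_out_degree=5))
```

Because the violation is computed over all trees at once, a node that is feasible within each individual tree can still be penalized when the sessions are considered together.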
5.3. Algorithm Convergence Analysis
The execution of various behaviors enables the artificial fish swarm to perform more flexible and diverse searches in the solution space. However, under some circumstances—especially when the problem is complex and the solution space is large—these behavior modes may cause the algorithm to converge slowly, and the algorithm may be prone to becoming trapped in local optimal solutions. To verify the convergence and accuracy of the algorithm for the optimization problem in this paper, we conducted an analysis of scenarios using networks containing 25, 50, and 75 randomly distributed user nodes. In these networks, four multicast sessions were input, where each multicast session contained one source node and eight destination nodes.
The randomness of the artificial fish swarm algorithm may make the algorithm unstable. To reduce the impact of randomness on the results, multicast optimization was performed 50 times for each network, and a box plot was generated to show the center and range of the distribution of these results. As shown in Figure 10, for the networks with 25 and 50 user nodes, the box plot collapses to a straight line; that is, the maximum and minimum values are the same, and there are no outliers. When the network size increased to 75, the convergence stability of the algorithm decreased slightly, with a maximum value of 0.74014 and a minimum value of 0.73346, a difference of only about 0.9%. This indicates that the algorithm had relatively good stability under different network sizes.
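The reported spread can be checked with a short calculation. The sketch below assumes that the difference is taken relative to the minimum value; taking it relative to the maximum gives nearly the same figure.

```python
def relative_spread(values):
    """Range of the values divided by their minimum."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo


# Extreme fitness values reported for the 75-node network.
spread = relative_spread([0.73346, 0.74014])
print(f"{spread:.2%}")
```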
To further verify the convergence ability of the algorithm, the fitness values of the 50 results were averaged, and the obtained iteration diagram is shown in Figure 11a. With network sizes of 25, 50, and 75 user nodes, the fitness value decreased rapidly at the beginning of the iteration, as the algorithm eliminated infeasible solutions. When the number of iterations reached approximately 10, the decrease in the fitness value slowed down, as the solutions listed in the bulletin board already satisfied the constraints and their fitness values were already low. Subsequently, as shown in Figure 11a–c, the algorithm approached the optimal solution as it iterated and converged at 81, 98, and 191 iterations, respectively.
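The averaging and the reading of a convergence point can be sketched as follows. The sketch assumes that the runs are averaged per iteration and that convergence means the averaged curve stays within a small tolerance of its final value; neither the tolerance nor this criterion is stated in the text.

```python
def mean_curve(runs):
    """Element-wise mean of equally long best-fitness histories."""
    n = len(runs)
    return [sum(step) / n for step in zip(*runs)]


def convergence_iteration(curve, tol=1e-6):
    """First 1-based iteration from which the curve stays within tol of its
    final value (hypothetical criterion)."""
    final = curve[-1]
    for i in range(len(curve) - 1, -1, -1):
        if abs(curve[i] - final) > tol:
            return i + 2  # the iteration right after the last deviation
    return 1


# Two short hypothetical runs; real histories would have hundreds of entries.
runs = [[4.0, 2.0, 2.0, 2.0], [6.0, 4.0, 2.0, 2.0]]
curve = mean_curve(runs)
print(curve, convergence_iteration(curve))
```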
5.4. Parameter Sensitivity Analysis
Swarm intelligence algorithms usually exhibit good adaptability. However, setting reasonable parameters is still a key task when using optimization algorithms. The appropriate selection of parameters can significantly improve the performance of the algorithm. The main parameters of the artificial fish swarm algorithm include the population size, the field of view, the step size, the number of attempts, and the degree of congestion. Figure 12 shows the results of the algorithm from 20 to 200 iterations under different parameter settings.
Regarding the effect of the population size on the algorithm, as shown in Figure 12a, when the population size increased, the number of iterations needed for the algorithm to converge decreased. However, in each iteration, the number of artificial fish participating in the optimization search increased. Therefore, this parameter had no significant impact on the overall convergence time. As can be seen from Figure 12b–e, setting different values for the other parameters affected only the iterative process of the algorithm and had a relatively insignificant impact on the final convergence result, which indicates that the algorithm is insensitive to parameter changes and has good robustness.
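A one-at-a-time sweep of this kind can be organized as in the following sketch. The field names and default values are placeholders chosen for illustration, not the settings used in the experiments, and the optimizer that would consume each configuration is omitted.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AFSAParams:
    """The five main parameters; names and defaults are hypothetical."""
    population: int = 30
    visual: float = 5.0
    step: float = 1.0
    try_number: int = 10
    crowding: float = 0.6


def one_at_a_time(base, grid):
    """Yield (name, value, params), varying one parameter at a time while
    all the others keep their base values."""
    for name, values in grid.items():
        for value in values:
            yield name, value, replace(base, **{name: value})


grid = {"population": [10, 20], "step": [0.5]}
configs = list(one_at_a_time(AFSAParams(), grid))
for name, value, params in configs:
    print(name, value, params)
```

Each yielded configuration would then be passed to the optimizer, and the resulting convergence curves compared per parameter, as in the panels of Figure 12.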
5.5. Selection of Weights
In this study, the optimization of the ALM routing tree involves two objectives, namely, the delay and the instability, each with a corresponding weight. The selection of these weights directly affects the performance of the algorithm and the search results. In the experiments, the magnitude of the observed delay was much greater than that of the instability. This may cause the delay to dominate the overall optimization process, so that the contribution of the instability is effectively ignored. By adjusting the weights, the influence of the different objectives during the optimization process can be controlled.
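The scale problem can be made concrete with a small sketch. It assumes a plain weighted sum of the two objectives; the names `w_delay` and `w_instability` and the two candidate solutions are hypothetical.

```python
def weighted_objective(delay_ms, instability, w_delay, w_instability):
    """Weighted sum of the two objectives (lower is better)."""
    return w_delay * delay_ms + w_instability * instability


# Two hypothetical candidates: A is far more stable, B is 10 ms faster.
a = {"delay_ms": 3010.0, "instability": 0.60}
b = {"delay_ms": 3000.0, "instability": 0.90}

# With equal weights, the delay term swamps the instability term.
equal_a = weighted_objective(**a, w_delay=1.0, w_instability=1.0)
equal_b = weighted_objective(**b, w_delay=1.0, w_instability=1.0)

# Scaling the delay weight down lets the instability influence the choice.
scaled_a = weighted_objective(**a, w_delay=0.001, w_instability=1.0)
scaled_b = weighted_objective(**b, w_delay=0.001, w_instability=1.0)

print(equal_a < equal_b, scaled_a < scaled_b)
```

With equal weights, candidate B wins on a 10 ms advantage despite its much higher instability; with the smaller delay weight, candidate A is preferred.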
The weights can be determined in a number of ways, for example, by the subjective judgment method [29], a statistical method [30], or sensitivity analysis [31]. However, neither of the first two methods is applicable here. The subjective judgment method requires an expert’s deep understanding of the problem and an accurate estimation of the contribution of each objective. Statistical methods require a large amount of supporting data; however, the resulting data for this problem depend on the source nodes, the destination nodes, the number of multicast sessions, and the network distribution and size, making this method costly. In contrast, sensitivity analysis, which directly assesses the impact of input parameters on the model output, is a simple and intuitive approach that requires less data and is easy to understand and implement.
Therefore, we used sensitivity analysis, with different weight combinations covering the possible range of weight values. The influence of these weights on the final optimization result is shown in Figure 13. In general, as the weight ratio increased, the instability of the ALM routing tree gradually increased, while the total delay continuously decreased. This is because the weight of the delay increased; that is, the contribution of the delay to the objective increased.
In the process of gradually increasing the weight ratio, several inflection points appeared, indicated by the red points (a, b, c, and d) in the figure. These points are the turning points where the rate of decrease in the delay became slower, the rate of increase in the instability became greater, or both.
Table 4 lists the results under the weight ratios corresponding to these points. By analyzing these points, we could obtain a locally optimal weight ratio; that is, a significant reduction in the instability or delay can be obtained without a significant increase in the delay or instability, respectively.
For example, consider the process from point a to point b: the delay was sharply reduced, while the instability increased only slightly. Choosing point a therefore gives a lower instability than any point between a and b, while choosing point b gives a lower delay without a significant increase in instability.
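Turning points of this kind can also be located programmatically. The sketch below is one possible reading of the criterion, not the procedure used to mark points a to d: it flags a sample when the slope of the delay curve shrinks by at least a given factor or the slope of the instability curve grows by at least that factor. The factor and the sample data are hypothetical.

```python
def slopes(xs, ys):
    """Finite-difference slopes between consecutive samples."""
    return [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]


def turning_points(ratios, delays, instabilities, factor=2.0):
    """Indices of samples where the delay decreases markedly more slowly or
    the instability increases markedly faster than in the previous segment."""
    d = slopes(ratios, delays)          # expected to be negative
    s = slopes(ratios, instabilities)   # expected to be positive
    points = []
    for i in range(1, len(d)):
        delay_slows = abs(d[i - 1]) > 0 and abs(d[i]) * factor <= abs(d[i - 1])
        instability_speeds = s[i] > 0 and s[i] >= factor * s[i - 1]
        if delay_slows or instability_speeds:
            points.append(i)
    return points


# Hypothetical sweep: the delay drops quickly until the third sample.
ratios = [1, 2, 3, 4, 5]
delays = [3000, 2900, 2800, 2790, 2780]
instabilities = [0.60, 0.61, 0.62, 0.63, 0.64]
print(turning_points(ratios, delays, instabilities))
```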
When only the instability was optimized, the instability value reached 0.567. Meanwhile, when only the total delay was optimized, the total delay reached 2788 ms. These results provide a reference for weight selection under different optimization objectives, such that the algorithm can be flexibly adapted to the specific needs of a given application. Decision makers can consider the importance of each objective to the overall goal and determine the optimal combination of weights by considering the practicality, expertise, and relevant interests.
5.6. Analysis of Solution to the Routing Tree Problem for a Single Multicast Session
Although this study addresses the optimization of multiple co-existing application layer multicast routing trees, the method is equivalent to optimizing a single multicast session when the input contains only one multicast session. To evaluate the performance of the proposed algorithm on a single multicast session, we set up a multicast session in which the source node was node 8 and the destination nodes were nodes 2, 3, 14, 26, 24, 37, 22, 35, 28, 29, and 31. The proposed algorithm was compared with three multi-objective optimization algorithms designed for single multicast sessions, namely, Cao’s algorithm [7], the CL-S [8], and the VDM [9]. Because their optimization models differ from the one we formulated, we set the end-to-end latency constraint and the degree constraint in Cao’s algorithm to 300 and 5, respectively, made the CL-S use the transmission delay from our optimization model, and constructed the virtual distance for the VDM based on our objective. These modifications ensure comparability without altering the performance of the algorithms.
Table 5 shows that the proposed algorithm was superior to the three algorithms used for comparison, in terms of its total delay and instability.
6. Discussion
We investigated a key limitation of existing application layer multicast (ALM) routing optimization algorithms: they mainly focus on optimizing individual multicast routing trees, so multiple co-existing multicast sessions usually have to be optimized sequentially, one by one. As the experimental results show, this sequential approach can easily assign an excessive out-degree to the user nodes best suited for forwarding data, triggering node congestion. The phenomenon becomes more serious as the multicast session size increases and can lead to the failure of data transmission in the session. Moreover, it is difficult to make reasonable use of the node out-degree when the out-degree constraints are taken into account in sequential optimization. The node out-degree reservation strategy described in the previous section is a simple method of out-degree allocation, but it only ensures that all nodes satisfy the constraints and can easily become trapped in a local optimum. Specifically, the routing trees optimized first perform well, while those optimized later perform progressively worse. For multiple co-existing multicast sessions, such an allocation is unfair, and the performance of the sessions as a whole cannot be optimized.
The discrete artificial fish swarm algorithm (DAFSA) we designed treats multiple co-existing multicast sessions as a whole and optimizes the objective function value of the multiple co-existing ALM routing trees by continuously evolving the artificial fish toward better fitness values. With the penalty function mechanism, the algorithm filters out solutions that do not satisfy the node out-degree constraints and avoids the session instability caused by node congestion. Experimental results show that the algorithm achieves satisfactory results, balancing node allocation across multicast sessions and reducing the overall delay and instability. In addition, the proposed algorithm is also effective in optimizing individual multicast sessions. The Steiner nodes selected by the DAFSA proved to be well suited as intermediate nodes for forwarding data, thus guaranteeing the performance of individual multicast sessions.
However, the present algorithm also has limitations. First, regarding the weights, when the network structure changes significantly, such as when the number of co-existing application layer multicast sessions increases or the nodes of each session become more complex, we are unable to find the optimal weighting parameters through repeated experiments because of the huge cost involved. This is mainly because single-objective weighting methods are very sensitive to the choice of weights and are usually difficult to adapt to new contexts or changes in objectives. To overcome these problems, we propose considering a multi-objective decision-making approach [32] to optimize the relationship between multiple objectives more comprehensively. Second, this study handles the node out-degree constraints using a penalty function, but the effect of the penalty function is highly dependent on the chosen penalty parameter, which adds to the complexity of the problem [33]. An improved spanning tree algorithm can ensure that the nodes in the current multicast session do not exceed the constraints. However, how to consider multiple co-existing complete graphs and generate application layer multicast routing trees that satisfy the constraints without resorting to a penalty function remains a problem that requires in-depth research.
Moreover, the application layer multicast routing tree construction problem is usually dynamic; that is, the network topology, multicast session members, and so on may change over time. Swarm intelligence algorithms have difficulty dealing with dynamically changing problems, and as the problem becomes more complex, the search space grows, which can easily reduce the search efficiency. In contrast, trained neural networks can generalize to unseen situations and can adapt to the complexity of the problem by learning patterns and features of the data without the need for explicit rules [34]. Therefore, we will include neural network methods in our future research.