1. Introduction
In recent years, the Internet of Things (IoT) has expanded significantly, leading to a large amount of data being generated by IoT devices. These data are sent over various networks to cloud-based servers and other data consumers. To cope with this large amount of data, decentralized fog-based architectures can be used. Such architectures ensure low latency and efficient resource usage because the IoT data can be processed close to the data sources. Fog networks consist of heterogeneous fog nodes and edge devices with different resource constraints, such as battery level, security level, central processing unit (CPU) use, and memory use. Moreover, fog nodes as well as user end-devices can be mobile (e.g., vehicles, smartwatches, smartphones, and mobile sensors), which means that the fog network architecture is not static, but dynamic, with a self-organizing ad hoc structure. This leads to two major problems related to service provisioning in fog networks: optimal service (application) placement in the fog nodes and optimal data routing between the user end-device and the fog node that provides those services.
Each time the user end-device asks the nearest fog node to provide some services for it, the fog-based system should decide on the best fog nodes for service placement, considering various constraints when searching for the optimal placement, including battery level, CPU use, memory use, and security level. Recently, a lot of research effort has gone into developing optimal application and service placement algorithms and architectures for fog networks, which are reviewed in [1, 2, 3, 4]. However, usually, optimal placement is solved as a separate problem and the cases when services are placed in fog nodes not adjacent to the user end-device are ignored. In such cases, fog systems need to route the IoT data from the user end-device to the fog nodes where required services are placed. These data are routed along a path starting at the user end-device (data source) passing through several fog and edge nodes until the data reach the final node or nodes, which process the data. Fog systems consist of heterogeneous nodes with different computational capacities and constraints, and the routing algorithm should consider these constraints while calculating the best path to route the data. For example, the path should exclude nodes with insufficient computational or energy resources or nodes that have low security levels if the user application requires a high level of data protection. Moreover, the algorithm for determining the data path should be simple and lightweight enough to be able to run on constrained devices, such as fog nodes, and provide a good enough path for data transfer. In such a scenario, it is not important to locate the optimal path as long as the path found using a lightweight algorithm addresses all the requirements and ensures good overall results. In such applications, heuristic optimization methods are usually used.
Therefore, data routing in heterogeneous fog networks becomes a multi-objective optimization task, which considers various computational, resource, and security constraints of the fog and edge nodes, and identifies the best path in terms of optimizing latency, energy consumption, bandwidth, etc. There are usually several alternative routing paths, which should be evaluated in real time to achieve the lowest possible response time and latency. Since fog network nodes (especially edge nodes) usually have limited resources, they rapidly reach their capacity, leading to longer and more complex paths from the user end-device to the remote fog node(s) suitable for service(s) placement. Moreover, the mobility of user end-devices and even fog nodes adds another level of complexity, requiring the algorithm to perform constant service replacement and, thus, constantly recalculate the best routing path.
Over the years, various researchers have addressed the optimal routing or path-finding problem. Most of these works in the field of wireless sensor networks (WSNs), fog, and IoT systems focus on optimal routing protocols and data forwarding techniques, which in most cases have a single objective, e.g., optimizing energy usage [5, 6, 7]. As concluded in [5], the drawbacks of existing routing techniques include fixed static routing and the reliability of decision-making nodes. The survey dedicated solely to nature-inspired algorithms for WSNs [8] concluded that these algorithms are well-suited for solving multi-objective real-world optimization problems, while traditional algorithms fail to provide satisfactory results because the problem is complex. Nature-inspired algorithms, including particle swarm optimization (PSO), are used for energy-efficient clustering and routing, optimal coverage, data aggregation, and sensor localization, as classified in [8]. The systematic study of topology control methods and routing techniques in wireless sensor networks [7] reviewed recent articles, including topology-aware PSO and ant colony optimization (ACO)-based routing techniques in static and mobile wireless sensor networks, and stated that these techniques do not incorporate delay-sensitive routing or timely data delivery. According to [7], the latest routing algorithms based on PSO and ACO techniques lag where the topology and routing requirements are delay-sensitive and concerned with data delivery ratio, throughput, and quality of service (QoS). Although several real-time routing techniques attempt to satisfy minimum latency and maximum throughput, the study [7] concluded that the research in this field is in an early stage and limited to some well-known protocols. These findings are further supported by the survey of swarm-intelligence-based optimization techniques for WSNs [9], which concluded that there are still some open research challenges, including weighing the energy consumption, QoS, security, and reliability of the network. The authors of [9] also found that most of the previous work optimized the performance of WSNs only from a single perspective (a single objective).
In [10], the authors reviewed the multi-objective optimization techniques and challenges in WSNs, stating that in WSNs, routing is an essential factor that should be performed optimally, though there are various routing challenges, including scalability, energy consumption, connectivity, deployment, security, and coverage. According to [10], multi-objective routing algorithms should consider coverage, throughput, end-to-end delay, capacity, collision, etc. Existing routing algorithms usually avoid security, though it is an essential factor in WSNs [10]. In [11], the authors reviewed path optimization techniques and noted that most path optimization techniques use performance measures, including packet delivery rate, network lifetime, energy consumption, delay, and distance, but do not consider the analysis of messages and the time complexity of their techniques. Therefore, real-world implementations of the proposed techniques pose a significant challenge as well [11]. Finally, the survey on path planning for mobile sink in IoT-enabled WSNs [12] paid attention to the fact that most research assumes an obstacle-free network environment, while in a real environment, obstacles are usually present.
To counter the aforementioned problems, we propose a multi-objective PSO-based path optimization method that can identify the best routing path from the user end-node to the fog node(s) where services for the user device are placed. Our method not only considers multiple constraints of the fog nodes along the path (CPU, memory, battery, security, etc.), but also supports the mobility of the user and fog nodes, as well as failures of the nodes, constantly reconfiguring the fog network and path. Instead of routing the IoT data dynamically until it reaches a suitable fog node for further processing (which does not guarantee the optimal path), our method calculates the optimal path each time a change occurs in the fog network, such as user end-node movement, computational resource changes in the fog nodes, and fog node failures. This algorithm selects the best of all possible paths and forwards the data directly to the processing fog node(s). As a result, latency and response time are low and bandwidth and energy usage are minimized. This is achieved by combining the distributed orchestrator model, which we proposed in [13], with the PSO-based optimization method, which locates the best path from the user end-node to the fog node(s) where the distributed orchestrator places the services for that user device. The distributed orchestrator constantly synchronizes the computational resources and constraints of each fog node, which allows the PSO-based algorithm to immediately identify an optimal (or suboptimal) path, instead of using various dynamic or evolutionary approaches, which achieve good results only after some time. Therefore, we propose a method that is able to simultaneously cope with two major problems in fog networks: optimal service (application) placement and optimal routing.
Heuristic nature-inspired algorithms, such as PSO [14], genetic algorithm (GA) [15], and ACO [16], or even the cuckoo search algorithm (CS) [17], help identify the optimal path in similar applications. The objective functions used in these methods vary from maximization of the packet delivery rate to data transmission latency, overall power consumption, delay time, and minimization of bandwidth consumption. Most methods tend to identify an optimal path for data flow based on one, most important, parameter of the IoT system or by combining several characteristics, such as latency, bandwidth, or energy, into one composite criterion using a simple linear combining objective function. However, using a composite of many criteria is not always the ideal solution to this issue because it is challenging to correctly determine the weights of the individual criteria. The authors of [18] recommend the use of simulation and trial and error to adjust the weights of the criteria for constructing the linear combined function. One of the possible solutions to this problem is to use multi-objective optimization to identify all nondominated data paths, that is, the Pareto Frontier of the problem solutions [19]. The final challenge in such cases is to choose the best solution from all nondominated candidates. Several different approaches may be used to compare the alternatives. If the application area is fixed, well defined, and extensively investigated in advance, some form of aggregation function of several competing factors can be constructed using mathematically calculated decision matrices [20]. As the final step in the optimization process for selecting the best path from all alternatives, we propose using the analytic hierarchy process (AHP) [21]. AHP uses only simple pairwise comparisons of all alternatives using all objective functions and can be easily adapted for use by machine-based decision making [22, 23, 24]. The values of all criteria are normalized, allowing us to use heterogeneous measurement scales for different objective functions. The importance of the criteria used to construct the decision matrix is also evaluated using the same methodology, allowing us to skip the most controversial step, that is, manual weight assignment to different criteria. The decision matrix is prepared in advance by the experts in the application area and used during the execution of the algorithm. Being deterministic and easy to implement, the AHP fits the constrained nature of the fog nodes well.
Our main contribution to the field of data path optimization in fog architectures is a novel two-stage optimal-path-finding algorithm based on the multi-objective particle swarm optimization and the analytical hierarchy process. During the first stage of the proposed method, a Pareto set of nondominated alternative paths is found. Then, AHP is used to choose the best path according to the provided application-specific judgment matrix.
The article is organized as follows:
Section 2 presents the conceptual model of the fog system and a formal definition of the best data routing path finding problem.
Section 3 presents the proposed two-stage multi-objective optimization method to determine the best data routing path.
Section 4 covers the experimental evaluation and discusses the results obtained.
Section 5 concludes the article.
3. Two-Stage Multi-Objective Optimization Method for Finding the Best Data Path
In real applications, the objective functions contradict each other. For example, the highest security increases CPU and RAM usage. One of the obvious approaches used in many solutions is to combine all objective functions into one composite criterion using simple linear equations. In this case, it is difficult to choose the “proper” coefficients, especially when the number of criteria increases. We propose the use of the two-stage optimization process presented in Figure 4.
In step 1, the multi-objective particle swarm optimization (MOPSO) method was used to determine a Pareto set of nondominated solutions to the problem. In step 2, the analytical hierarchy process (AHP) [21, 34] was used to choose the best solution from the Pareto set. The AHP uses the application-specific judgment matrix that represents the importance of objective functions in the specific application area. These matrices may be constructed beforehand by experts in the field using a simple pairwise comparison of criteria.
3.1. Multi-Objective Particle Swarm Optimization for Finding a Pareto Set of Alternative Paths
The PSO is inspired by the behavior of flocking birds. Individuals in the swarm are called particles and have assigned velocities. The particles fly through the search space according to personal experience and are also attracted by the best individual of the swarm. The MOPSO method proposed by Coello et al. in [29] was used to find the optimal path. This modification of continuous-space PSO tries to find a Pareto optimal set of solutions (also called a Pareto Frontier). The Pareto set includes all nondominated solutions, meaning that for each solution in this set, no other solution is at least as good according to every optimization criterion and strictly better according to at least one of them.
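The notion of a nondominated solution can be made concrete with a short sketch. It assumes that every objective is minimized (as required later in Section 3.3) and uses hypothetical objective values; the function names `dominates` and `pareto_set` are illustrative and do not come from the original implementation.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if objective vector a Pareto-dominates b (all objectives minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_set(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (bandwidth, latency) values of four candidate paths.
candidates = [(10, 50), (20, 30), (25, 55), (15, 40)]
front = pareto_set(candidates)  # (25, 55) is dominated by (10, 50) and is dropped
```

This brute-force filter compares every pair of candidates, which is acceptable for the small sets of paths discussed here; MOPSO itself maintains the nondominated set incrementally in a repository.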
Figure 5 presents a generalized flowchart of the multi-objective particle swarm optimization process used to find the Pareto set of paths.
Step 4 uses the inertia weight parameter of the PSO algorithm, whose initial value was 0.4, two coefficients that are random numbers, and the velocity of each particle.
In step 5, the new position of the particle was calculated. If the particle was outside the definition range (i.e., one of the elements of the particle had a negative value), the direction of its speed was reversed and the position of the particle was set to the edge of the range of its definition (i.e., the search space).
In steps 6 and 7, a random-number function generated a uniformly distributed random number from a fixed interval.
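A minimal sketch of the velocity and position update described above follows. It is not the authors' MATLAB code: the update formula is the common textbook form with an inertia weight and two uniformly random coefficients, and the names `update_particle`, `personal_best`, `leader`, and the upper bound of 1.0 are assumptions made for illustration. Only the inertia weight of 0.4 and the reflection at the edge of the definition range are taken from the text.

```python
import random
from typing import List, Optional, Sequence, Tuple


def update_particle(
    x: Sequence[float],
    v: Sequence[float],
    personal_best: Sequence[float],
    leader: Sequence[float],
    w: float = 0.4,        # inertia weight; 0.4 is the initial value given in the text
    upper: float = 1.0,    # assumed upper edge of the definition range
    rng: Optional[random.Random] = None,
) -> Tuple[List[float], List[float]]:
    """Return the new position and velocity of one particle."""
    rng = rng or random.Random()
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, personal_best, leader):
        r1, r2 = rng.random(), rng.random()  # uniform random coefficients
        vi = w * vi + r1 * (pi - xi) + r2 * (gi - xi)
        xi = xi + vi
        # Outside the definition range: reverse the speed and place the
        # element on the edge of the search space.
        if xi < 0.0:
            xi, vi = 0.0, -vi
        elif xi > upper:
            xi, vi = upper, -vi
        new_x.append(xi)
        new_v.append(vi)
    return new_x, new_v
```

In a multi-objective setting the `leader` argument would be a member of the current nondominated set rather than a single global best, since no single best particle exists.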
For particle encoding, we used the indirect (sometimes called priority-based) encoding approach. Each particle is a vector of real numbers and represents one possible path in the graph from the first node to the last (destination) one. The elements of the particle are the probabilities of the corresponding nodes used during the construction of the path from the particle. When a new particle was generated, the elements of the particle vector were populated with random real numbers from the definition interval. Algorithm 1 describes the construction of the path corresponding to the particle:
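Algorithm 1: Path construction algorithm
Input parameters: graph defined using the edge matrix, particle.
Step 1. Include the first node in the path and make it the current node.
Step 2. Initialize the set of nodes available for the path construction with all nodes except the first one.
Step 3. Repeat until the last (destination) node is included in the path or the step limit is exceeded: among the available nodes reachable from the current node according to the edge matrix, select the node with the highest particle element value; include it in the path; remove it from the set of available nodes; make it the current node.
Step 4. If the destination node was not included in the path, mark the particle as invalid.
Result: path corresponding to the particle (or invalid particle).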
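The path construction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the original implementation: nodes are numbered from 0, the first node is the source and the last node is the destination, a nonzero entry in the edge matrix means that an edge exists, and the helper names `adjacency` and `build_path` are hypothetical. The small graph below is also hypothetical and is not the graph of Figure 2.

```python
from typing import List, Optional, Sequence, Tuple


def adjacency(n: int, links: Sequence[Tuple[int, int]]) -> List[List[int]]:
    """Build a symmetric 0/1 edge matrix for an undirected graph."""
    m = [[0] * n for _ in range(n)]
    for a, b in links:
        m[a][b] = m[b][a] = 1
    return m


def build_path(edges: Sequence[Sequence[float]],
               particle: Sequence[float]) -> Optional[List[int]]:
    """Decode a priority-encoded particle into a path, or None if it is invalid."""
    n = len(particle)
    destination = n - 1
    path, current = [0], 0
    available = set(range(1, n))          # the source node is already used
    for _ in range(n):                    # step limit: at most n evaluations
        if current == destination:
            return path
        # Available neighbors of the current node, in a fixed order.
        neighbors = [j for j in sorted(available) if edges[current][j] > 0]
        if not neighbors:
            return None                   # dead end: the particle is invalid
        current = max(neighbors, key=lambda j: particle[j])
        path.append(current)
        available.discard(current)
    return path if current == destination else None


graph = adjacency(5, [(0, 1), (0, 2), (1, 3), (2, 3), (2, 4), (3, 4)])
particle = [0.5, 0.3, 0.94, 0.8, 0.6]
path = build_path(graph, particle)        # node 2 wins at the source (0.94 vs. 0.3)
```

Because the particle stores one priority per node rather than the path itself, the continuous PSO update can be applied unchanged, and every decoded path is loop-free by construction.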
For example, consider the graphs presented in Figure 2 and Figure 3. The construction of the path for a random particle begins with the assignment of the source node to the path. As it is already included in the path, the source node is marked as unavailable (red color) for further evaluation. Then, the probabilities of all possible edges starting at the source node (according to the edge matrix) are compared using the element values of the particle. The neighboring node with the highest probability (0.94 vs. 0.3) among all possible edges is appended to the path and marked as unavailable for further path construction. Then, all possible edges starting from the newly added node are evaluated in the same manner, and the node with the highest probability (0.8) is added to the path. Finally, the destination node is added to the path, as it has the highest probability compared to all other nodes reachable from the previously added node. The resulting sequence of nodes is the final path that corresponds to the given particle.
3.2. AHP for Optimal Path Selection
AHP was used to choose the optimal path from the Pareto set.
Figure 6 presents a generalized flowchart of AHP:
In step 1, a three-level AHP framework was constructed (Figure 7). The main objective of the process, that is, determining the best path from the source node to the destination node, comprised the first level. All objectives of the PSO optimization phase were formalized as criteria of AHP and became the second level. The weights of the criteria were calculated on the basis of a pairwise comparison usually conducted manually by experts in the application field. The final result of this step was the so-called judgment matrix (that is, the matrix used in step 2), provided to the algorithm beforehand. All alternative paths from the Pareto set formed the third level of the AHP framework. In step 3, the weight coefficient matrices for the paths from the Pareto set were formed by calculating their elements using a special comparison function. A comparison function uses the corresponding objective function, calculates its values for the two paths being compared, compares them, and transforms the result to a value from the interval required by AHP. These comparison functions depend heavily on the nature of the criteria and are defined specifically and differently for each criterion.
Then, AHP was started (step 4 in the flowchart) and one best path was selected as the final result (step 5 in the flowchart).
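The AHP stage can be illustrated with the following sketch. Several details are assumptions, since the text does not fix them: priorities are approximated by normalized geometric means of the matrix rows (a common approximation of the principal eigenvector), the comparison function is a simple cost ratio clipped to the 1/9 to 9 scale, and the alternative with the highest aggregated score is taken as the best one. The names `priorities`, `compare`, and `ahp_select` and all numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def priorities(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Approximate AHP priorities by normalized geometric means of the rows."""
    means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(means)
    return [m / total for m in means]


def compare(cost_a: float, cost_b: float) -> float:
    """Hypothetical comparison function: how strongly a is preferred over b.

    Objectives are minimized and positive, so a smaller cost gives a larger ratio.
    """
    return min(9.0, max(1.0 / 9.0, cost_b / cost_a))


def ahp_select(judgment: Sequence[Sequence[float]],
               alternatives: Sequence[Sequence[float]]) -> Tuple[int, List[float]]:
    """Return the index of the best alternative and the scores of all alternatives."""
    weights = priorities(judgment)            # importance of each criterion
    m = len(alternatives)
    scores = [0.0] * m
    for k, weight in enumerate(weights):
        # Pairwise comparison of all alternatives under criterion k.
        pairwise = [[compare(alternatives[i][k], alternatives[j][k])
                     for j in range(m)] for i in range(m)]
        local = priorities(pairwise)
        for i in range(m):
            scores[i] += weight * local[i]
    best = max(range(m), key=lambda i: scores[i])
    return best, scores


# Two hypothetical paths evaluated by two criteria; the second criterion
# is judged five times more important than the first.
judgment = [[1.0, 1.0 / 5.0], [5.0, 1.0]]
paths = [(10.0, 100.0), (20.0, 50.0)]
best, scores = ahp_select(judgment, paths)
```

With the judgment matrix reversed, the first path would win instead, which mirrors how an application-specific matrix steers the final choice without changing the Pareto set.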
3.3. Objective Functions and Constraints
Different IoT devices have different performances, network bandwidths, security characteristics, etc. Therefore, the objective functions and the constraints should be defined according to the situation in the real infrastructure. In the experiments presented in this paper, we used the following objective functions for the evaluation:
The total bandwidth used by the data traveling through the path was calculated as the total weight of the graph edges, i.e., the sum of the elements of the bandwidth usage matrix that correspond to the consecutive pairs of nodes in the data path under evaluation. If the matrix carries latency values, a similar equation is also applicable for a network-induced latency evaluation.
Some objective functions could not be expressed by the total weight of the edges because their value depended on the nodes included in the path. For example, CPU and RAM use should be calculated from the weights of the nodes included in the path, where the weight vector represents CPU use in MIPS by the data transfer through the corresponding nodes. Similarly, RAM use is calculated from the RAM-usage vector of the corresponding nodes expressed in MB.
The security objective function used in this paper was calculated using yet another expression. The security of the entire data transferred along the path was defined by the lowest security of all nodes included in the path. We assigned security levels (expressed in security bits, according to the NIST publication [35]) to nodes on the basis of their ability to support the corresponding security protocols. In this case, the value of the objective function is 512 minus the lowest element, over the nodes of the path, of the vector of the security values of the corresponding nodes. The expression 512 - x was used because the PSO algorithm tried to minimize the objective function. Thus, better security should correspond to smaller values of the objective function.
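The three kinds of objective functions described above can be sketched as follows. The function names, the matrices, and the vectors are hypothetical, and summing the node weights along the path is an assumption about the node-based expression; only the edge-weight sum and the `512 - x` form of the security function are stated in the text.

```python
from typing import Sequence


def edge_cost(path: Sequence[int], weights: Sequence[Sequence[float]]) -> float:
    """Total weight of the edges on the path (bandwidth or latency matrix)."""
    return sum(weights[a][b] for a, b in zip(path, path[1:]))


def node_cost(path: Sequence[int], node_weights: Sequence[float]) -> float:
    """Total weight of the nodes on the path (e.g., CPU in MIPS or RAM in MB)."""
    return sum(node_weights[i] for i in path)


def security_cost(path: Sequence[int], security_bits: Sequence[int]) -> int:
    """512 minus the weakest security level on the path, so that smaller is better."""
    return 512 - min(security_bits[i] for i in path)


# Hypothetical four-node infrastructure.
bandwidth = [[0, 4, 7, 0],
             [4, 0, 0, 5],
             [7, 0, 0, 3],
             [0, 5, 3, 0]]
cpu_mips = [100, 250, 50, 80]
security = [256, 128, 112, 256]

route = [0, 2, 3]
values = (edge_cost(route, bandwidth), node_cost(route, cpu_mips),
          security_cost(route, security))
```

All three functions return positive numbers for which smaller is better, which is the only contract the optimization stage relies on.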
Other application-specific objective functions may also be used, such as power requirements and energy consumption. The concrete definition may also vary according to the system characteristics important in a selected scenario. The proposed optimization method was not limited to any specific number or nature of the objective functions, as long as they satisfied these two simple requirements:
The result of the objective function is a positive real number.
Better values of the criteria are expressed by smaller numbers (i.e., the PSO method searches for a minimum of the function).
The constraints were also specific application-dependent functions. For example, total memory consumption or CPU use could not exceed the physical capabilities of the corresponding node. If the application area required a specific level of security, then it should be expressed as a constraint, for example, as a lower bound on the security of the path, which is calculated in exactly the same manner as described above. During the PSO phase of optimization, particles that violated the constraints were assigned large penalties and were naturally eliminated from the optimization process.
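One simple way to realize such penalties is sketched below. The penalty constant, the additive form, and the names `PENALTY`, `evaluate`, and `min_security` are assumptions for illustration; the text only states that violating particles receive large penalties.

```python
from typing import Callable, List, Sequence, Tuple

PENALTY = 1.0e6  # assumed constant, chosen to be far larger than any real objective value

Path = Sequence[int]


def evaluate(path: Path,
             objectives: Sequence[Callable[[Path], float]],
             constraints: Sequence[Callable[[Path], bool]]) -> Tuple[float, ...]:
    """Evaluate all objectives and add a penalty for every violated constraint."""
    violated = sum(1 for satisfied in constraints if not satisfied(path))
    return tuple(f(path) + PENALTY * violated for f in objectives)


# Hypothetical node security levels and a required minimum of 128 bits.
security = [256, 112, 256]


def min_security(path: Path) -> bool:
    """Constraint: the weakest node on the path must offer at least 128 bits."""
    return min(security[i] for i in path) >= 128


objectives: List[Callable[[Path], float]] = [lambda p: float(len(p))]
ok = evaluate([0, 2], objectives, [min_security])
bad = evaluate([0, 1, 2], objectives, [min_security])
```

Because a penalized particle is worse in every objective, it is dominated by any feasible particle and drops out of the nondominated set without special handling.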
4. Results and Discussion
In this section, we summarize the implementation results of the proposed method. The main objective was to evaluate the characteristics of the algorithm under different situations and to test the feasibility of using it in real-life scenarios.
The method proposed for determining the best path was implemented using MATLAB. As input, the implementation used graph data with several weight matrices and vectors used to calculate the values of multiple objective functions. All the concrete numbers used here were only for illustration purposes and did not have any specific meaning. To better understand the context, we called the first objective function the bandwidth evaluation function, the second objective function the latency function, and the third objective function the security function. All these objective functions were calculated as described in the previous section. The implemented version of the algorithm performed a multi-objective particle swarm optimization, found a Pareto optimal set of paths, automatically formed the required comparison matrices used in AHP, and chose the best path using a provided judgment matrix.
To illustrate the proposed optimization method, we considered the example graph presented in Figure 2 (Graph A). Assume that the weights of the edges marked in blue represent the bandwidth requirements. In Figure 8, the same graph is supplemented with latency requirements marked in green near the edges and the security evaluation of the infrastructure elements, which are marked by different colors of the corresponding graph nodes.
Suppose our objective was to determine the best route from the source node to the destination node that ensured minimal total bandwidth usage and minimal total latency and also guaranteed maximal security. In this case, the three-dimensional objective function consisted of the bandwidth, latency, and security functions. The PSO stage of the proposed method produced a Pareto set of nondominated solutions, as presented in Table 2.
If we used the one-dimensional PSO method to find the best paths using all three objective functions separately, then the results would be as follows: a minimal bandwidth use of 56 if only bandwidth use was considered; a best latency of 132 if only latency was optimized; and 256 bits of total security if only security was optimized. As one can see, all optimal values of the one-dimensional optimization cases are present in the Pareto set, complemented by some additional paths, which may also be chosen during the AHP step. The presence of the best values of the one-dimensional optimization cases in the Pareto set indicates that the multi-objective optimization method works correctly and finds the most important alternatives. During this step, a swarm of 20 particles was used and the number of iterations was 50.
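The sanity check described above, that the best value of every single objective also appears in the Pareto set, can be expressed as a short sketch. The function names and all numbers are hypothetical, and the reference set stands in for the results of separate single-objective runs.

```python
from typing import List, Sequence, Tuple


def single_objective_optima(points: Sequence[Tuple[float, ...]]) -> List[float]:
    """Best (smallest) value of each objective over a set of evaluated paths."""
    return [min(p[k] for p in points) for k in range(len(points[0]))]


def front_covers_optima(front: Sequence[Tuple[float, ...]],
                        reference: Sequence[Tuple[float, ...]]) -> bool:
    """True if the front attains the best known value of every objective."""
    return single_objective_optima(front) == single_objective_optima(reference)


# Hypothetical (bandwidth, latency, security cost) triples.
reference = [(10, 50, 300), (20, 30, 350), (25, 55, 400), (15, 40, 256)]
front = [(10, 50, 300), (20, 30, 350), (15, 40, 256)]
covered = front_covers_optima(front, reference)
```

A passing check is a necessary condition only: a front can contain every single-objective optimum and still miss intermediate trade-off paths.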
The judgment matrix used during the AHP stage of optimization means that minimal bandwidth consumption is more important than overall latency (2 vs. 1), but the security of the data path is much more important than both bandwidth and latency (7 and 3 vs. 1 accordingly). The results of the AHP evaluation of alternatives are summarized in Table 3.
The best path selected by AHP, together with the corresponding collection of values of the objective functions, is reported in Table 3.
In the second scenario, we used the graph that was evaluated by other authors [15, 19, 30]. We assumed that the standard edge weights used in one-dimensional optimization scenarios were bandwidth use.
In Figure 9, the optimal path, considering only one objective function, found by the algorithm proposed by the authors of [15] is shown by the bold lines. The total weight of this path is 142. Moreover, other algorithms have found only suboptimal paths: Munetomo’s [32] algorithm found a path with a total weight of 187 and Inagaki’s [36] algorithm found one with a weight of 234.
To use multi-objective optimization, we added a second set of weights (i.e., latency) to the edges and defined the security levels of the nodes. Figure 10 presents the corresponding weight matrix. A separate judgment matrix was used for the AHP stage.
Figure 11 presents a graph of the Pareto front while optimizing using only bandwidth and latency objective functions.
The complete results are summarized in Table 4, with the AHP evaluation scores added as the fifth column. We used a swarm of 40 particles and 50 iterations for the PSO part of the optimization.
The best alternative is identified by its AHP score in Table 4. One can easily view all the optimal and suboptimal paths (considering only the bandwidth objective function) discussed above among the members of the Pareto set (the optimal path while using one-dimensional optimization according to latency was 7010).
To show the influence of the judgment matrix on the final result, we used all three objective functions and two different judgment matrices. The first judgment matrix prioritizes security, and the second judgment matrix prioritizes bandwidth over all other objectives.
The description of graph B was complemented by the node security vector. During the PSO stage of optimization, we used a particle swarm of 40 particles and 50 iterations. Figure 12 presents the Pareto set of solutions.
The best path chosen in the AHP step depends on the judgment matrix used: the two matrices lead to different best paths with different values of the objective functions. If the second judgment matrix is used, then the objective functions of the best path have the following scores: (142, 10, 580, 56).
To test the proposed algorithm with graphs of different sizes, we generated some random graphs with random weight values assigned to the edges (representing bandwidth and latency) and nodes. Graphs with different numbers of nodes (from 20 to 45) are presented in
Figure 13. For example,
Figure 13a presents Graph20, with 20 nodes, and
Figure 13b presents Graph25, with 25 nodes.
Table 5 summarizes the results when the proposed method is applied to all these graphs. The judgment matrix used during the evaluation was the one from Equation (3).
The experimental evaluation shows that the proposed method effectively finds the Pareto front in cases with graphs containing up to 45 nodes. If the graph size increases, then the PSO stage of the algorithm is not as effective because, in some cases, the method behaves in an unstable manner, i.e., in some cases, it does not include optimal paths in the Pareto set.
5. Conclusions
In this paper, we proposed a novel approach for finding the optimal data path in a heterogeneous IoT infrastructure. The proposed two-stage method used multi-objective particle swarm optimization to find a Pareto optimal set of alternative data paths, and then an analytical hierarchy process was applied to select the best alternative. The alternatives were evaluated using judgment matrices created after experts had evaluated the optimization criteria used during the process. This approach had a twofold benefit: (1) it allowed us to compare different criteria, which is always challenging because the criteria may be qualitative or quantitative, may use different units of measurement, etc.; and (2) it accommodates the fact that, in different application areas, the objective functions may differ in terms of importance. In such an instance, a different judgment matrix, prepared beforehand by experts in the corresponding application area, is sufficient to adapt the method to a different scenario. Moreover, the proposed method not only provided the whole set of alternative solutions, but also evaluated each of them. It allowed us to choose the second- or third-best alternative if the first was not suitable for some reason.
The proposed method worked with a wide range of objective functions, which can be easily expanded. In the examples presented in this article, we used two methods to evaluate objective functions. One can easily combine both approaches or even define more complex or even dynamic objective functions. The proposed approach was transparent as to the nature of the objective function, as long as two simple requirements were met: the result of the objective function was a positive, real number and better values of the criteria were expressed by smaller numbers (i.e., the PSO method searches for a minimum of the function).
The main advantages of the proposed method were its simplicity and the fact that it can be adapted to limited available resources, because both algorithms used during the two stages were well-suited to the constrained nature of fog devices. If the calculation characteristics of the fog node are limited, then the PSO algorithm can be used with fewer particles and/or iterations. Even in such cases, some suboptimal solutions would be found and provide “good enough” results. In addition, the second stage of the proposed method (AHP) was a simple deterministic method, which always chose the best alternative from the given set.
If the complexity of the graph representing the IoT infrastructure did not exceed 40 nodes and 120 edges, then the proposed algorithm produced a Pareto set of alternatives that included all the optimal paths found while considering each objective separately. If the complexity of the graph increased, the effectiveness of the PSO part of the algorithm was not sufficient. This limitation was not critical, considering the nature of the application of the proposed method, i.e., the IoT infrastructure. The graphs generated from real IoT devices are not expected to exceed a few dozen nodes and edges.
It was difficult to compare the proposed method with other similar optimization methods because few of them produced the full Pareto set. Usually, some kinds of combining functions are used during the search for an optimal solution. We tried to assess the correctness of the final Pareto set by applying the single-objective path optimization methods. The experimental results show that all the best paths found using all objective functions individually are also present in the Pareto set. This shows that the proposed method successfully finds alternatives that are known beforehand to be nondominated.
Several interesting aspects of the proposed method could be explored in the future. It would be interesting to use it in a real IoT infrastructure and evaluate the number of resources saved or the level to which the QoS is improved. Furthermore, the construction of objective functions could be investigated and adapted to real measurements of real hardware.
We believe that the results of this work will be useful in future research in the area of IoT fog computing, data path optimization, and service orchestration, and will allow us to develop more efficient IoT systems.