Article

An Optimized Nature-Inspired Metaheuristic Algorithm for Application Mapping in 2D-NoC

1 Computer Engineering Department, University of Engineering and Technology, Taxila 47050, Pakistan
2 Department of Information and Communication Engineering, Yeungnam University, Gyeongsan 38541, Korea
3 Department of Electronics and Information Engineering, Korea University, Sejong 30019, Korea
* Authors to whom correspondence should be addressed.
Sensors 2021, 21(15), 5102; https://doi.org/10.3390/s21155102
Submission received: 14 July 2021 / Revised: 26 July 2021 / Accepted: 26 July 2021 / Published: 28 July 2021
(This article belongs to the Section Communications)

Abstract

Mapping application task graphs on intellectual property (IP) cores into network-on-chip (NoC) is a non-deterministic polynomial-time hard problem. The evolution of network performance mainly depends on an effective and efficient mapping technique and the optimization of performance and cost metrics. These metrics mainly include power, reliability, area, thermal distribution and delay. A state-of-the-art mapping technique for NoC, called the sailfish optimization algorithm (SFOA), is introduced. The proposed algorithm minimizes the power dissipation of NoC via an empirical base that applies a shared k-nearest neighbor clustering approach, and it gives quicker mapping over six considered standard benchmarks. The experimental results indicate that the proposed technique outperforms other existing nature-inspired metaheuristic approaches, especially in large application task graphs.

1. Introduction

The overall performance and scalability of the system-on-chip (SoC) are degraded because of the increasing number of intellectual property (IP) cores embedding on the SoC. For the improvement of overall performance and flexibility of the SoC, new promising solutions have been proposed, and they are called network-on-chip (NoC) [1]. NoC is an on-chip, packet-based communication switching network which is created for interaction between IP cores of the SoC designs [2]. Routers (switch fabric) are linked in some standard topology for communications among IP cores. A router is available for every IP core in an NoC. The router is a basic building block of the NoC architecture; a fault-resilient router architecture is necessary for reliable on-chip communication. The authors of [3,4,5,6] did some architectural modifications in the existing NoC routers designs to propose a reliable on-chip network communication infrastructure. A message passing technique is used for the exchange of data between IP cores. As per the multi-core system principle, the contribution of NoC in power consumption of the total system is around 40%, and this has a vital role in network performance [1,7]. The power, latency and area of NoC-based systems are conspicuously impacted by the selection of an on-chip interconnection architecture [7]. Depending on the interconnection networks, numerous standard topologies are established for the NoC. The most renowned topology out of all prevailing conventional topologies of the NoC architecture is a mesh topology [8].
In the mesh topology, there are short paths for communication between IP cores and high bisection width. The interconnected structure is regular and fixed, and the links are of equal size. Considering this context, various techniques for applications mapping have been proposed using search-based and exact optimization methods. Additionally, proper modeling via an analytical approach has been investigated to reduce the area, latency and power in NoCs.
Because computation time to solve the mapping problem increases with the size of the application to be mapped, it is known that an application mapping is a non-deterministic polynomial-time (NP)-hard problem. To obtain the optimal solution over NoC performance metrics, search-based optimization techniques have been considered. Therefore, the solution of NP-hard problems is significantly dependent on the choice of the best heuristic or metaheuristic technique.
In practical systems, resources are limited so that an efficient utilization of given resources is a critical issue. Optimization techniques can be employed in a wide range of areas, including engineering, finance, resource planning and Internet routing. Using a mathematical model of the social and political progression, metaheuristic algorithms provide an effective algorithm to solve the given optimization problems. These algorithms can obtain a universal solution by facilitating interaction between high level approaches and local improvement methods.
Furthermore, a metaheuristic algorithm can be efficient only if it offers a realistic balance between exploration and exploitation on a given optimization problem. Intensification (i.e., exploitation) is associated with local search, while diversification (i.e., exploration) is associated with global search. Diversification tends to find diverse solutions globally. Intensification, on the other hand, focuses on searching a local region using knowledge of the current best solution in that region. No initial solution is required for global search, while local search starts from an initial candidate solution. The mobility of candidate solutions should be randomized as far as possible during the exploration phase, whereas the exploitation process entails thorough investigation of the promising area(s). The most dominant difference between current metaheuristic algorithms, in general, is how they balance the exploration and exploitation phases. Given this context, sailfish optimization (SFO) is considered in this study.
SFO provides a suitable equilibrium between intensification (exploitation) and diversification (exploration) to avoid early convergence. To examine the performance metrics of NoC, the novel metaheuristic optimization algorithm used in this paper, that is, SFO, is described in [9]. The SFO algorithm is modeled after a sailfish group targeting a school of sardine prey in a series of attacks. To begin, SFO uses two assortments of prey and predator species to replicate the technique of group hunting. Second, the presented algorithm breaks down the mutual security of grouping prey by alternating attacks. Third, prey mobility can be changed across the search region, allowing the hunter to capture the right prey and improve its fitness. The effectiveness of the SFO algorithm is verified by examining the optimal mapping for eight NoC benchmarks for the two-dimensional (2D) mesh topology.
The remainder of the paper is structured as follows. The related work is given in Section 2. The inspiration for the sailfish optimization algorithm is described in Section 3. The mapping using SFO, models used for the analysis of metrics and the proposed algorithm are described in Section 4, Section 5 and Section 6, respectively. The experimental setup along with considered benchmarks and results are summarized and analyzed in Section 7. Section 8 ends with some conclusive remarks.

2. Related Work

In [10], Araki and Yoshihiro presented a multi-path reliable distance-vector routing strategy by utilizing multiple paths for the extension of reliable distance-vector routing (RDV) for the improvement of communication performance, decreased delivery delay, higher load-balancing and more substantial network capacity. In comparison to RDV, fault tolerance is also greater against the topology modifications. In [4], Rashid et al. proposed a reliable on-chip network communication architecture by making some architectural improvements in the existing NoC routers’ designs. In [11], a router’s controllers design based on finite-state machine (FSM) is presented for the minimization of error propagation, aiming at low utilization of logical resources.
In [12], Wu and Cai presented a Fibonacci tree optimization strategy (FTOS) for the scheduling query of wireless sensor networks. The proposed algorithm provided less energy consumption and optimization of detection efficiency. In [13], Rhee et al. presented an artificial neural network (ANN) model combined with the genetic algorithm (GA) for the cost-effective operation of a silo. The combined technique gave the optimized results with the improvement in the accuracy of internal level prediction of the silo, and an efficient number of sensors and their positions of installation are determined. In [14], the authors presented a comprehensive overview of the algorithms of machine learning for embedded systems and mobile computing space. In [15], the authors presented a heuristic technique based on the moth-flame optimization (MFO) algorithm for resolving the weak exploration problem of the k-means data clustering algorithm.
The problem of application mapping has stimulated the research community because of the expeditious growth in NoC. Tosun et al. proposed integer linear programming (ILP) as an exact mapping method for the mesh-based two-dimensional NoC with an energy minimization principle in [8]. In [16], Hu and Marculescu presented a branch and bound (BB) mapping solution for the topological allocation of IP cores on an NoC platform for the minimization of the total consumption of energy with the limitation of bandwidth of the link. In [17], Lei et al. presented a two-step genetic algorithm (GA) based on delay for the communication of NoC. The prime function for the scheduling and mapping of IPs was the minimization of overall execution time. Murali and Micheli proposed a heuristic approach based on a mapping algorithm for cores mapping on 2D mesh topology with the restraint of bandwidth reservation in [18]. In [19], Lu et al. presented a clustering algorithm based on simulated annealing for reducing the simulation time of an annealing process of a large system. The process of clustering compromised the optimum results but accelerated the computation time. In [20], Radu and Vintan proposed an optimized simulated annealing (OSA) algorithm for 2D mesh mapping by optimizing the parameters of the annealing process for producing the optimum outcomes with less time than the conventional simulated annealing schemes. Ascia et al. [21] presented a multi-objective GA for mapping of IP cores in a 2D mesh topology for optimizing the power consumption and network performance. In [22], Jena and Sharma presented a heuristics search based multi-objective GA for the mapping of IP cores on a 2D mesh topology for the optimization of link bandwidth, the performance of the network and power dissipation. Sepulvada et al. also presented a multi-objective adaptive immune algorithm (MAIA) for the problem of application mapping of NoC architecture [23]. 
In [24], Harmanani and Farah proposed an algorithm for assigning tasks to the nodes of a 2D mesh network based on simulated annealing. Hu et al. proposed a task mapping technique for the NoC architecture with a constraint of bandwidth [25]. This technique was energy aware and expedited the run-time of the process of task mapping, but it shows trade-off in the network performance results.
Ye et al. derived the power models for connectivity wires, switch and inbuilt buffer in [26]. In [27], the authors provided a well-accepted mathematical term for 2D NoC interconnect energy models. Kahng et al. [28] and Ost et al. [29] created a practical power model for 2D NoC as a follow-up to the one in [27]. In [28], the power model takes into account architecture-level power as well as region modeling and router capacity for the router. The power modeling in [28] was validated and checked by Ost et al. [29]. The authors of [30] calculated the efficiency of mesh-dependent 2D and 3D NoCs based on the comprehension of energy depletion between the cores and the routing area. The thesis by Sahu and Chattopadhyay [31] takes advantage of a comprehensive review of framework mapping techniques for NoC and examines various mapping methods proposed during the last period. As per Sahu and Chattopadhyay [31], a heuristic-based mapping strategy provided a better end result in terms of network output metrics optimization.
In [32], a simulated annealing (SA) algorithm is implemented as a metaheuristic approach to create an efficient mapping with IP connectivity specifications as a restriction for 2D NoC. The authors of [33] implemented mapping by scheduling with an ant colony optimization (ACO) approach for 2D NoC. In [34], a particle swarm optimization (PSO) is used as a mapping technique on both 2D and 3D NoCs, with the connectivity metric as the objective function. To tackle the problem addressed in [32], a power-aware mapping technique for 2D NoC utilizing SA with tabu search (SAT) was proposed by Alagarsamy and Gopalakrishnan [35]. In [36], a mapping technique for a 2D NoC is presented. The foremost objective is to build a chain of linked cores that can be used to construct a new mapping system. In comparison to similar ones, the authors of [36] attempted to use less bandwidth. In [37], Tosun presented a heuristic approach for a mesh 2D NoC in which a priority list based on overall and average communication bandwidth was established.
In [38], a reliability-aware technique is presented. The featured graph is divided into two sub-graphs, which are used to reduce transmission flow. As a result, transmission flow between the two sub-graphs is reduced, while traffic within every graph increases. Niknam and Amiri presented a novel hybrid PSO-based approach to address the clustering issue in [39]. For better performance, ACO and k-means techniques were used. The presented approach was tested and validated on various publicly available datasets, and the preliminary observations are optimistic. The suggested hybrid approach was shown to coincide with an optimal solution in the majority of instances. Junior et al. [40] also presented an ACO-based approach for finding and maximizing directions in a mesh-based NoC. Routed optimization was achieved by reducing the total delay in packet transmission between activities. The visionary conclusions showed the efficiency of the ACO-based technique. In addition, Xie et al. proposed an online mapping protocol to refine task mapping methodology for minimizing connection power consumption [41]. First, the run-time interconnection point of applications was investigated. Secondly, this method measured the mapping assignment and used real-time web mapping.

3. Sailfish Optimizer

In this section, the key inspiration for the SFO algorithm (SFOA) is discussed. The suggested algorithm and mathematical models are then thoroughly explained.

Inspiration

Shadravan et al. [9] recently introduced a new metaheuristic technique called SFO, which incorporates the action of both a predatory group of sailfish and a prey group of sardines. The sailfish is known as a social predator since it attacks and catches its prey in groups. Predators use various killing techniques in cooperative hunting. The class of sailfish, for example, is distinguished by the alternation of attack techniques. It entails that each member of the group attacks the school of prey (sardine) alone at a given time, injuring or hunting some of them while the other group members conserve their strength. Whenever a sailfish attacks a school of prey, it will update its location concerning them. Furthermore, the sailfish will update their location to occupy vacant space around the prey school and imitate circling the prey. When a member of the sardine group (prey) is wounded, the sardine group changes direction to avoid the sailfish’s subsequent attacks. The general procedure of the sailfish optimizer algorithm is defined in the subsections that follow.
Group hunting is an intriguing illustration of collective activity in communities of invertebrates, fishes, birds and mammals. Compared to hunting alone, predators do not require a lot of power to kill their prey while hunting in groups.
Predators in the most basic type of group hunting aim to finish off the prey by step-by-step planning of the attack, whereas predators under the more sophisticated class of group hunting practice specialized positions to mob and capture the prey [42]. The alternation of attacks is one of the most complicated group hunting techniques. This tactic allows the hunter to save strength when other predators are injuring the prey. Sailfish hunting in groups that alternate attacks on the schooling sardines is an illustration of this kind of method [43,44].
The fastest fish in the ocean, sailfish can attain speeds up to 62 miles per hour. They hunt in groups, herding schools of smaller fish, such as sardines, near the surface. Sailfish find the sardines’ mobility and speed during the assault very difficult to deal with. The sailfish either slashes multiple sardines with its rostrum or taps a single sardine, causing it to become unstable. Sardines cannot swim quickly enough to dodge the tip of the sailfish’s rostrum and are incapable of responding to this group hunting because the sailfish has one of the fastest accelerations ever observed in an aquatic creature. According to observations of sardine behavior, wounded sardines become isolated from the prey shoal and unable to travel with it, resulting in their capture by the sailfish [42].
The majority of sailfish attacks do not result in sardine deaths, and only a small percentage of sardines are directly caught. However, as sailfish attacks become more common, an increasing number of sardines are injured. Animals who hunt in groups, such as wolves, are more likely to engage in this form of hunting. On the other hand, these sailfish groups split up and regroup with new members daily. During an assault, a sailfish keeps its large dorsal fin and pelvic fins upright to stabilize its body. Often, right before an attack, sailfish change their body color from the usual bluish-silver to nearly black. The purpose of the color change is unclear, but it appears to be a form of communication between sailfish [42]. Sailfish use shifts in their body color to signal which one should attack first, allowing them to avoid being injured by a companion. The attack-alternation technique of sailfish group hunting is the key inspiration for the SFO algorithm. The natural actions of sailfish and sardines are mathematically represented in the following subsection, and an optimization approach based on this mathematical model is developed.

4. Mapping Using SFOA

4.1. Problem Formulation

An application is characterized by a directed graph of the network in NoC, which is later scheduled by the scheduler using another directed core graph of the network on the existing IP-cores. The directed core graph is transmuted and depicted via an effective mapping method on the NoC topological architecture using an architecture graph.
Definition 1. 
Directed Task Graph (DTG): The task graph of the network is a directed acyclic graph DTG(P, E), where every node of the graph symbolizes a task of the computational process of the application. In addition, the directed edges or links represent the communication or data volume among the tasks communicating.
$$DTG(P, E)$$
where $P$ and $E$ are the sets of nodes, which correspond to the processes or tasks, and links or edges, respectively, and $p_i \in P$, $e_{i,j} \in E$ for $i, j = 1, 2, 3, \ldots$.
Definition 2. 
Directed Core Graph (DCG): The core graph of the NoC architecture is a directed graph DCG(C, D), where every node of the graph symbolizes an IP core in the topology. The directed edges represent the direct communication among the nodes (i.e., IP cores $c_i$ and $c_j$).
$$DCG(C, D)$$
where $C$ is the set of IP cores or processing elements and $D$ denotes the set of links or edges with communication directions in the architecture graph. Elements in $C$ and $D$ are defined as $c_i \in C$ and $d_{i,j} \in D$ for $i, j = 1, 2, 3, \ldots$.
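As a rough illustration of Definitions 1 and 2, the two graphs can be held in plain Python containers. This is only a sketch: the helper name `mesh_core_graph`, the task names and the communication volumes are hypothetical and do not come from the paper.

```python
from typing import Dict, Set, Tuple

Link = Tuple[int, int]


def mesh_core_graph(rows: int, cols: int) -> Tuple[Set[int], Set[Link]]:
    """Build DCG(C, D) for a rows x cols 2D mesh.

    Cores are numbered row-major; each pair of adjacent cores gets one
    directed link per direction.
    """
    cores = set(range(rows * cols))
    links: Set[Link] = set()
    for r in range(rows):
        for c in range(cols):
            u = r * cols + c
            if c + 1 < cols:  # east neighbour
                links.update({(u, u + 1), (u + 1, u)})
            if r + 1 < rows:  # south neighbour
                links.update({(u, u + cols), (u + cols, u)})
    return cores, links


# DTG(P, E): each directed edge e_ij carries a communication volume.
# Task names and volumes are made up for illustration.
task_edges: Dict[Tuple[str, str], float] = {
    ("t0", "t1"): 70.0,
    ("t1", "t2"): 362.0,
    ("t0", "t2"): 16.0,
}
tasks = {t for edge in task_edges for t in edge}

C, D = mesh_core_graph(2, 2)
```

A mapping is then any one-to-one assignment of the elements of `tasks` to the elements of `C`.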

4.2. SFOA for NoC Mapping

The initial sailfish and sardine populations are generated using the initial mapping and weight of the task graph given at time $t = 0$. Considering the settings of parameters of the proposed algorithm, the fitness value, which is the communication cost (CC) of the best sailfish (i.e., mapping solution), is computed. (For CC, refer to Equation (8), which is defined in Section 5.) Later, the positions of sailfish and sardine are updated in consideration of attack power (AP). (For position updates of sailfish and sardine, refer to Equations (21) and (27), respectively. For AP, refer to Equation (24) in Section 6.) After updating the positions, the optimized result of mapping (sailfish) can be obtained.

4.3. Parameters Setting for SFOA

The proposed algorithm requires the setting of a few basic parameters to verify the efficiency of group hunting. In the proposed algorithm, the fitness function under consideration is the cost for communication, which is denoted by CC. The population size is 300, the number of iterations is 150 and pp, the ratio between the sailfish and sardine populations (i.e., the fraction of the sardine population which forms the initial sailfish population), is set to 0.1; these values are set for the application mapping on 2D NoC. These values are set based on the number of iterations run and optimization acquired for deducing an optimal solution. They also differ as per the properties of the application considered for mapping.
For the analysis of the performance parameters of an NoC such as energy, power and communication cost computation along with latency and average throughput, two models are used in this work. These two models are named the Bit Energy model and CMOS cell library model, and their mathematical expressions are explained in detail in the next section.

5. Models Used for Analysis of Metrics

For analyzing the performance metrics of an NoC, two models are considered in the presented work [25,43]. An effective trade-off between the faster mapping over 2D mesh and performance metrics of NoC is presented by SFOA in this study.

5.1. Bit Energy Model

For the estimation of consumption of power of the router in the network, an energy model [25] is considered as follows:
$$E_B = E_{SB} + E_{LB},$$
where $E_B$ is the energy used up for transferring 1 bit of data from the source node to the destination node, which comprises the energy of the switch ($E_{SB}$) and energy of the link ($E_{LB}$) of the NoC network. The average network energy consumption $E_B(p_i, p_j)$ for transferring 1 bit of data from a source node $p_i$ to the destination node $p_j$ is calculated by the following equation:
$$E_B(p_i, p_j) = H_{count} \times E_{SB} + (H_{count} - 1) \times E_{LB},$$
where $H_{count}$ is the Manhattan distance between the source node $(a_i, a_j)$ and the destination node $(b_i, b_j)$, which is obtained by
$$H_{count} = |a_i - b_i| + |a_j - b_j|.$$
Therefore, the total energy consumption of the network ($E_T$) is calculated by using the average network energy and the link bandwidth, $BW(p_i, p_j)$, between nodes $p_i$ and $p_j$.
$$E_T = \sum_{i,j} \left( E_B(p_i, p_j) \times BW(p_i, p_j) \right)$$
Substituting Equation (4) into Equation (6), $E_T$ can be rewritten by
$$E_T = \sum_{i,j} \left( H_{count} \times E_{SB} + (H_{count} - 1) \times E_{LB} \right) \times BW(p_i, p_j).$$
Moreover, the cost of communication is defined by
$$CC = \sum_{i,j} H_{count} \times BW(p_i, p_j).$$
Different mapping results generate different energy and cost values. The prime concern is to obtain a mapping function that provides minimal cost for the whole network. The communication cost of the applications of NoC is considered the performance measure for distinct applications in this research work.
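The bit energy model above can be sketched in a few lines of Python. The function names, the toy bandwidths, the placement and the energy values ($E_{SB} = 2$, $E_{LB} = 1$ in arbitrary units) are all assumptions chosen for illustration; they are not taken from the paper or its benchmarks.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]


def hop_count(a: Coord, b: Coord) -> int:
    """Manhattan distance between two mesh coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bit_energy(hops: int, e_sb: float, e_lb: float) -> float:
    """Energy to move one bit across `hops` hops, as in the per-bit equation."""
    return hops * e_sb + (hops - 1) * e_lb


def total_energy(bw: Dict[Tuple[str, str], float],
                 placement: Dict[str, Coord],
                 e_sb: float, e_lb: float) -> float:
    """Sum of per-bit energy weighted by bandwidth over all task pairs."""
    return sum(
        bit_energy(hop_count(placement[s], placement[d]), e_sb, e_lb) * w
        for (s, d), w in bw.items()
    )


def communication_cost(bw: Dict[Tuple[str, str], float],
                       placement: Dict[str, Coord]) -> float:
    """CC: hop count weighted by bandwidth over all task pairs."""
    return sum(
        hop_count(placement[s], placement[d]) * w for (s, d), w in bw.items()
    )


# Hypothetical three-task example placed on a 2 x 2 mesh (row, column).
bw = {("t0", "t1"): 70, ("t1", "t2"): 362, ("t0", "t2"): 16}
placement = {"t0": (0, 0), "t1": (0, 1), "t2": (1, 1)}

print(communication_cost(bw, placement))       # 70*1 + 362*1 + 16*2
print(total_energy(bw, placement, 2.0, 1.0))
```

Swapping two entries of `placement` and recomputing `communication_cost` is the basic operation a search-based mapper repeats when comparing candidate mappings.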

5.2. CMOS Cell Library Model

The proposed SFO algorithm utilizes the standard CMOS cell library model [43] for the calculation of network power, latency, energy consumption of packets and throughput of an NoC system. For the computation of average latency of the network via this model, the following equation is used:
$$Lat_{avg} = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{N_i} \sum_{j=1}^{N_i} Lat(i, j),$$
where $N$ is the total number of processors or cores in the network, $N_i$ is the total number of packets received by core $i$ and $Lat(i, j)$ is the latency of packet $j$ at destination node $i$.
The average throughput of the network, $TP_{avg}$, is evaluated as follows:
$$TP_{avg} = \frac{1}{N (T_S - T_W)} \sum_{i=1}^{N} N_i,$$
where $T_W$ is the warm-up time of the simulation and $T_S$ is the simulation time.
The network average power, P N a v g , is computed by
$$PN_{avg} = \frac{1}{N} \sum_{i=1}^{N} \sum_{k=1}^{N_i} \left[ \alpha(i, k)\, PN(act, k) + \left(1 - \alpha(i, k)\right) PN(inact, k) \right],$$
where $\alpha(i, k)$ is the active probability of component $k$ in router $i$ after $T_W$. Moreover, $PN(act, k)$ and $PN(inact, k)$ are the post-layout active and inactive power of the component $k$.
Finally, the network average energy consumption by every packet is given by
$$EP_{avg} = \frac{T_S - T_W}{N \, N_{pack}} \sum_{i=1}^{N} \sum_{k=1}^{N_i} \left[ \alpha(i, k)\, PN(act, k) + \left(1 - \alpha(i, k)\right) PN(inact, k) \right],$$
where $N$ is the total number of cores available in the network. $N_{pack} = \sum_{i=1}^{N} N_i$ is the total number of packets injected in the network. For a certain number of experiments, $N$ remains the same, and $N_{pack}$ can be changed by increasing or decreasing the packet injection rate.
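The averaging in the latency, throughput and power expressions can be illustrated with a short Python sketch. The function names and the sample numbers are hypothetical. As a further assumption not stated in the paper, a core that receives no packets is treated as contributing zero to the latency sum, which avoids a division by zero.

```python
from typing import Sequence


def average_latency(latencies: Sequence[Sequence[float]]) -> float:
    """latencies[i] holds the latencies of the packets received at core i."""
    n = len(latencies)
    per_core = [sum(lat) / len(lat) for lat in latencies if lat]
    return sum(per_core) / n


def average_throughput(received: Sequence[int], t_s: float, t_w: float) -> float:
    """Packets received per core per time unit after the warm-up period."""
    return sum(received) / (len(received) * (t_s - t_w))


def average_power(alpha: Sequence[Sequence[float]],
                  p_act: Sequence[float],
                  p_inact: Sequence[float]) -> float:
    """alpha[i][k] is the active probability of component k in router i."""
    total = sum(
        a * p_act[k] + (1.0 - a) * p_inact[k]
        for row in alpha
        for k, a in enumerate(row)
    )
    return total / len(alpha)


# Made-up measurements for a three-core and a two-router example.
lat = average_latency([[10, 20], [30], [20, 40, 60]])
tp = average_throughput([2, 1, 3], t_s=1000.0, t_w=100.0)
pw = average_power([[0.5, 1.0], [0.0, 0.5]], p_act=[4.0, 2.0], p_inact=[1.0, 0.5])
```

In practice these inputs would come from a cycle-accurate NoC simulator; the sketch only shows how the per-core quantities are aggregated.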

6. The Proposed Algorithm: SFOA

The proposed SFOA takes the inputs, directed task graph, $DCG$, and directed network graph, $DNG$, and effectively performs the mapping of the task onto the cores of the 2D NoC topological architecture.

6.1. Empirical Base for Initial Mapping

To create the empirical base for the initial mapping, the following five steps of the self-adaptive chicken swarm optimization (SCSO) algorithm [44] are considered. Furthermore, Figure 1 shows the flowchart for initial mapping procedure.
  • Step 1: From DCG, randomly select the IP-Core
    $$Rand(c_i), \quad \text{for } c_i \in C$$
  • Step 2: Use the DC matrix to find the presence of direct connection of the selected core with each core.
    $$DC = \begin{cases} 1, & \text{if } (c_i, c_j) = d_{ij} \in D \\ 0, & \text{otherwise} \end{cases}$$
  • Step 3: Calculate the average CC ($A_i$) and weight ($W_i$) for each core ($c_i$) as follows:
$$W_i = \sum_{d_{ij} \in D} w_{ij}$$
$$A_i = \frac{\sum_{d_{ij} \in D} w_{ij}}{|N(c_i)|},$$
where $w_{ij}$ is the weight between cores $c_i$ and $c_j$ and $N(c_i)$ is the open neighborhood of $c_i$.
For the identification of neighbors, use the following equation:
$$N(c_i) = \{\, c_j \in C \mid (c_i, c_j) = d_{ij} \in D \,\}$$
  • Step 4: For the identification of hop counts among the source node $c_i$ and sink node $c_j$, use the following matrix:
$$H = [H_{ij}]$$
where $[H_{ij}]$ means that the $(i, j)$ element of matrix $H$ is given by $H_{ij}$. Matrix $H$ indicates the minimum probable links for communication between the source and sink nodes. Considering $d(c_i, c_j)$ is the shortest path between the cores $c_i$ and $c_j$, $N(c_i, c_j)$ is the number of hops in the shortest path.
$$H_{ij} = \min N(c_i, c_j)$$
  • Step 5: Using the shared K-nearest neighbor clustering approach, form a diverse cluster. If $c_i$ and $c_j$ have each other in their closest K-nearest neighbors list, then an edge exists between them. The strength of this edge is evaluated using:
$$str(c_i, c_j) = \sum (K + 1 - o) \times (K + 1 - p)$$
where $K$ is the size of the neighbor’s list, $o$ is the position of shared near-neighbor in $c_i$ list and $p$ is the position of shared near-neighbor in $c_j$ list. Hence, $c_i^{o}$, i.e., the shared near-neighbor in $c_i$ list, is equal to $c_j^{p}$, that is the shared near-neighbor in $c_j$ list.
After Step 5, an empirical base is created with clustered DCG. Figure 2, Figure 3 and Figure 4 show the standard NoC video object plane decoder (VOPD) benchmark, clustering of VOPD task graph and its initial mapping on a 4 × 4 mesh, respectively.
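Steps 2 to 5 can be sketched as follows. This is a minimal illustration under several assumptions: the function names are invented, the weights are made up, $W_i$ and $N(c_i)$ are computed over outgoing links only, hop counts are found with a breadth-first search, and the edge strength is summed over every shared near-neighbor. The paper may resolve these details differently.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple

INF = float("inf")
Weights = Dict[Tuple[int, int], float]  # (c_i, c_j) -> w_ij for each link d_ij


def dc_matrix(n: int, weights: Weights) -> List[List[int]]:
    """Step 2: DC[i][j] is 1 when a direct link d_ij exists, else 0."""
    return [[1 if (i, j) in weights else 0 for j in range(n)] for i in range(n)]


def weight_and_average(n: int, weights: Weights) -> Tuple[List[float], List[float]]:
    """Step 3: W_i sums the outgoing weights; A_i divides by the neighbor count."""
    W, A = [0.0] * n, [0.0] * n
    for i in range(n):
        out = [w for (src, _), w in weights.items() if src == i]
        W[i] = float(sum(out))
        A[i] = W[i] / len(out) if out else 0.0
    return W, A


def hop_matrix(n: int, weights: Weights) -> List[List[float]]:
    """Step 4: minimum hop count between every pair, via BFS from each node."""
    adj = {i: [j for (s, j) in weights if s == i] for i in range(n)}
    H = [[INF] * n for _ in range(n)]
    for src in range(n):
        H[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if H[src][v] == INF:
                    H[src][v] = H[src][u] + 1
                    queue.append(v)
    return H


def snn_strength(list_i: Sequence[int], list_j: Sequence[int], i: int, j: int) -> int:
    """Step 5: strength of the edge between i and j from their K-NN lists."""
    if j not in list_i or i not in list_j:
        return 0  # no edge unless each node lists the other
    K = len(list_i)
    return sum(
        (K + 1 - o) * (K + 1 - p)
        for o, x in enumerate(list_i, start=1)
        for p, y in enumerate(list_j, start=1)
        if x == y
    )


# Hypothetical four-node graph with made-up weights.
weights = {(0, 1): 70, (1, 2): 362, (0, 2): 16, (2, 3): 49}
DC = dc_matrix(4, weights)
W, A = weight_and_average(4, weights)
H = hop_matrix(4, weights)
```

For example, with neighbor lists `[1, 2, 3]` for node 0 and `[2, 0, 3]` for node 1, the shared near-neighbors are 2 and 3, and `snn_strength` adds one product per shared node.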

6.2. Video Object Plane Decoder

Video object plane decoder (VOPD) is an application comprising several sub-tasks: run-length decoder, downsampler, quantizer, etc. These sub-tasks require communication among themselves at the rates specified in MB/s on the edges between them. Figure 5 represents the architectural diagram of the VOPD, while Figure 2 illustrates the graphical representation of the VOPD tasks. VOPD consists of 16 sub-tasks having 21 edges labeled with distinct communication bandwidth.
For the initial phase of mapping, a random procedure is adopted as a mapping strategy. The outcome of this initial mapping is considered the input for the proposed SFOA to minimize the consumption of power and communication cost of 2D NoC. Figure 6 represents the flowchart for SFOA.

6.3. SFOA Algorithm

6.3.1. Initialization

The first step of SFOA comprises initialization of the sailfish and sardine populations. The population generation/initialization is random. Variable position vectors represent that the sailfish can search in multiple dimensions. In this algorithm, the candidate solution considered is sailfish and the positions of sailfish in the search space are the variables of the problem. Firstly, the sailfish and sardine populations are randomly initialized as $X_{SF_i}^{itr}$ and $X_{S_j}^{itr}$, which are the positions of the sailfish and sardine populations, where the subscripts $i$ and $j$ are the indices of sailfish and sardine from the initialized population and the superscript $itr$ denotes the index of iteration.

6.3.2. Aristocracy

A sailfish hunts the sardine while exploring the search region and updating its location/position to find a better solution. While updating the location of the sailfish, which is the search agent in this algorithm, better solutions may be lost. There is the possibility that the updated positions can be worse than the previous positions, thus elitism/aristocracy is applied.
Aristocracy involves finding the best search agent via the best sailfish fitness value and, for the sardines, the best fitness value of injured sardine and replicating the unchanged best solutions to the next generation. The best position of the search agent (sailfish) is kept in every iteration and measured as an Elite. The best or the fittest sailfish acquired until now is the Elite sailfish. It would be the one affecting the maneuverability and speeding up of sardines during the attacking. The location of any injured sardine is also saved in every iteration, which the sailfish will consider for group hunting as the best target selected.
Secondly, the fitness of each sailfish and sardine in the population is calculated using the fitness function (i.e., CC in the proposed algorithm). Based on this, Elite (i.e., the best sailfish) and injured sardine are acquired. The best sailfish is the one having the smallest fitness function value at iteration $itr$.
$$X_{SF_{best}}^{itr} = \{\, X_{SF}^{itr} \mid \text{sailfish with the smallest fitness value} \,\}$$
Similarly, the injured sardine is the one which has been attacked and injured by the sailfish and having the smallest value of CC.
$$X_{S_{inj}}^{itr} = \{\, X_{S}^{itr} \mid \text{sardine injured by the best sailfish} \,\}$$
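The elitism step can be illustrated with a small Python sketch. It assumes, purely for illustration, that a candidate is a tuple whose entry `t` is the row-major core index hosting task `t`, and that fitness is CC on a mesh with `cols` columns. The names `make_cc` and `update_elite` and the bandwidths are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Placement = Tuple[int, ...]  # Placement[t] = mesh core index hosting task t


def make_cc(bw: Dict[Tuple[int, int], float], cols: int) -> Callable[[Placement], float]:
    """Return a CC fitness function for a mesh with `cols` columns."""
    def cc(m: Placement) -> float:
        total = 0.0
        for (s, d), w in bw.items():
            rs, cs = divmod(m[s], cols)
            rd, cd = divmod(m[d], cols)
            total += (abs(rs - rd) + abs(cs - cd)) * w
        return total
    return cc


def update_elite(elite: Optional[Placement],
                 population: Sequence[Placement],
                 fitness: Callable[[Placement], float]) -> Placement:
    """Keep the best candidate seen so far (smallest fitness value)."""
    best = min(population, key=fitness)
    if elite is None or fitness(best) < fitness(elite):
        return best
    return elite


# Made-up bandwidths between three tasks, mapped onto a 2 x 2 mesh.
cc = make_cc({(0, 1): 70, (1, 2): 362, (0, 2): 16}, cols=2)
elite = update_elite(None, [(1, 0, 3), (0, 1, 3), (0, 3, 1)], cc)
elite = update_elite(elite, [(0, 3, 1)], cc)  # a worse generation keeps the elite
```

The same helper can track the injured sardine by calling it on the sardine population instead of the sailfish population.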

6.3.3. Attack-Alternation Technique

Sailfish improve the success rate of hunting their prey by attacking in coordination. Sailfish chase and herd their prey and change their own position according to the position of the other hunting sailfish, without even directly communicating with each other. Through this attack-alternation technique, sailfish injure more sardines during the first phase of hunting, which leads to a higher rate of success in capturing the prey at advanced phases of group hunting.
Afterward, the termination condition is checked. If the condition is not satisfied, the position of sailfish is updated with the following equation:
$$X_{SF_{new}}^{itr} = X_{SF_{best}}^{itr} - \delta^{itr} \times \left( \varphi \times \frac{X_{SF_{best}}^{itr} + X_{S_{inj}}^{itr}}{2} - X_{SF_{old}}^{itr} \right).$$
The symbols in the above update equation are defined as follows: $X_{SF_{new}}^{itr}$ is the updated position of the sailfish, $X_{SF_{best}}^{itr}$ is the position of the best sailfish, $\delta^{itr}$ is the coefficient at iteration $itr$, $\varphi$ is a random number between 0 and 1, $X_{S_{inj}}^{itr}$ is the position of the injured sardine and $X_{SF_{old}}^{itr}$ is the current position of the sailfish. The coefficient is computed as
$$\delta^{itr} = 2 \times \varphi \times P,$$
where P denotes the prey density.
The prey density represents the quantity of prey at each iteration. It is an important factor when updating the position of sailfish because the number of prey (i.e., sardines) will decline in group hunting as follows:
$$P = 1 - \frac{n_{SF}}{n_{SF} + n_{S}},$$
where $n_{SF}$ and $n_{S}$ denote the numbers of sailfish and sardines, respectively, in each iteration.
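The sailfish update can be traced numerically with a short Python sketch. It assumes that positions are continuous vectors stored as lists of floats and that the same random number φ is used in the coefficient and in the update step, as the shared symbol suggests; whether independent draws are intended is not stated here. The function names are illustrative and not taken from the authors' code.

```python
import random
from typing import List, Optional, Sequence


def prey_density(n_sailfish: int, n_sardines: int) -> float:
    """P = 1 - n_SF / (n_SF + n_S)."""
    return 1.0 - n_sailfish / (n_sailfish + n_sardines)


def update_sailfish(old: Sequence[float], best: Sequence[float],
                    injured: Sequence[float], n_sailfish: int, n_sardines: int,
                    phi: Optional[float] = None) -> List[float]:
    """Apply the sailfish position update to one sailfish, coordinate by coordinate."""
    if phi is None:
        phi = random.random()  # random number in [0, 1)
    # Assumption: the same phi is reused for delta and for the update step.
    delta = 2.0 * phi * prey_density(n_sailfish, n_sardines)
    return [b - delta * (phi * (b + s) / 2.0 - o)
            for o, b, s in zip(old, best, injured)]


# One-dimensional example with a fixed phi so the arithmetic can be followed.
new_position = update_sailfish(old=[0.0], best=[2.0], injured=[4.0],
                               n_sailfish=10, n_sardines=30, phi=0.5)
```

With 10 sailfish and 30 sardines, P = 0.75; for φ = 0.5 the coefficient is 0.75, and the example position moves to 2 − 0.75 × (0.5 × 3 − 0) = 0.875.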
After Equation (21) is used to update the position of the sailfish, the attack power of the sailfish, $AP$, at iteration $itr$ is calculated with
$$AP = C \times \left( 1 - 2 \times itr \times \epsilon \right),$$
where $C$ and $\epsilon$ are the coefficients that decrease $AP$ linearly.
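To see how the linear decay behaves, the sketch below evaluates the attack power over the iterations and finds the first iteration at which it falls below a given threshold. The values `C = 4.0` and `EPSILON = 0.001` are illustrative assumptions only; this section does not state the coefficients used in the experiments.

```python
def attack_power(c: float, epsilon: float, itr: int) -> float:
    """AP = C * (1 - 2 * itr * epsilon); decreases linearly with the iteration."""
    return c * (1.0 - 2.0 * itr * epsilon)


def first_iteration_below(threshold: float, c: float, epsilon: float,
                          max_itr: int = 100_000) -> int:
    """Return the first iteration whose AP is below `threshold`, or -1 if none."""
    for itr in range(max_itr):
        if attack_power(c, epsilon, itr) < threshold:
            return itr
    return -1


# Illustrative values only (assumed, not taken from the paper).
C, EPSILON = 4.0, 0.001
switch_itr = first_iteration_below(0.5, C, EPSILON)
```

Under these assumed coefficients, AP starts at 4.0 and first drops below 0.5 at iteration 438, so a threshold test on AP would only change behaviour in runs longer than that.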

6.3.4. Hunting Prey

A complete massacre of sardines is rarely observed at the beginning of group hunting. In more than 90% of cases, the sailfish's strike only removes scales from the sardines' bodies. At the start of the hunting phase, the sailfish has more energy for hunting and catching its prey, and the sardines are not yet tired or injured, which is why they retain a high escape speed and high maneuverability. The sailfish's attack power declines steadily over the course of the hunt.
The position of every sardine in the population is also updated based on the current position of sailfish and AP at every iteration. The following formula is used for updating the position of sardine:
$$X_{S_{new}}^{itr} = \varphi_{1} \times \left( X_{SF_{best}}^{itr} - X_{S_{old}}^{itr} + AP \right),$$
where $X_{S_{new}}^{itr}$ and $X_{S_{old}}^{itr}$ are the updated and previous positions of the sardine and $\varphi_{1}$ is a random number between 0 and 1.
Considering the value of $AP$, if the attack power of the sailfish is less than 0.5, the positions of only $S$ sardines are updated. Otherwise, the positions of all sardines are updated. Here, $S$ is determined by
$$S = n_{S} \times AP.$$
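The sardine update and the AP < 0.5 rule can be sketched as follows. Several details are assumptions made for illustration, because the text does not specify them: S is truncated to an integer and clamped at zero, the S sardines are chosen uniformly at random, and a fresh random number is drawn for each updated sardine. The function names are hypothetical.

```python
import random
from typing import List, Sequence


def sardines_to_update(n_sardines: int, ap: float) -> int:
    """Number of sardines whose positions are updated at the current AP."""
    if ap >= 0.5:
        return n_sardines
    # S = n_S * AP; truncation to an integer and clamping at 0 are assumptions.
    return max(0, int(n_sardines * ap))


def update_sardine(old: Sequence[float], best_sailfish: Sequence[float],
                   ap: float, phi1: float) -> List[float]:
    """X_new = phi1 * (X_best_sailfish - X_old + AP), coordinate by coordinate."""
    return [phi1 * (b - o + ap) for o, b in zip(old, best_sailfish)]


def update_sardines(sardines: List[Sequence[float]], best_sailfish: Sequence[float],
                    ap: float, rng: random.Random) -> List[List[float]]:
    """Update all sardines if AP >= 0.5, otherwise a random subset of size S."""
    count = sardines_to_update(len(sardines), ap)
    chosen = set(rng.sample(range(len(sardines)), count))  # assumed selection rule
    return [update_sardine(s, best_sailfish, ap, rng.random()) if i in chosen
            else list(s)
            for i, s in enumerate(sardines)]


school = [[1.0], [2.0], [3.0], [4.0]]
rng = random.Random(0)
moved_all = update_sardines(school, [0.0], ap=1.0, rng=rng)     # every sardine moves
moved_some = update_sardines(school, [0.0], ap=0.25, rng=rng)   # at most one moves
```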
Next, the fitness value (i.e., CC) of all the sardines and sailfish is recalculated from their updated positions, and the population is sorted.

6.3.5. Catching Prey

As the attack power of the sailfish declines, the energy of the sardines also drops because of the repeated powerful attacks. The attacks also reduce the sardines' maneuverability, because they impair the prey's ability to detect directional information about the position of the sailfish. As a result, sardines slashed by a sailfish's rostrum are separated from the school and are then quickly captured.
In the last phase of hunting, the separated sardines are quickly captured by the sailfish. In this algorithm, a sardine that becomes fitter than a sailfish is considered caught and is removed from the sardine population. The sailfish then takes the position of that sardine:
$$X_{SF}^{itr} = X_{S}^{itr}, \quad \text{if } CC_{S}^{itr} < CC_{SF}^{itr},$$
where $CC_{S}^{itr}$ and $CC_{SF}^{itr}$ denote the fitness values (i.e., CC values) of the sardine and the sailfish at iteration $itr$.
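The replacement rule can be illustrated with a small Python sketch. How sardines are paired with sailfish is not specified in this section, so the sketch assumes that each sailfish is compared with the fittest remaining sardine; `toy_cost` is a hypothetical stand-in for CC, and the function name `catch_prey` is illustrative.

```python
from typing import Callable, List, Sequence, Tuple


def catch_prey(sailfish: List[Sequence[float]], sardines: List[Sequence[float]],
               cost: Callable[[Sequence[float]], float]
               ) -> Tuple[List[List[float]], List[List[float]]]:
    """Replace a sailfish by a fitter sardine and remove that sardine."""
    hunters = [list(p) for p in sailfish]
    school = [list(p) for p in sardines]
    for i in range(len(hunters)):
        if not school:
            break  # no prey left to catch
        # Assumed pairing: compare with the fittest sardine still in the school.
        j = min(range(len(school)), key=lambda k: cost(school[k]))
        if cost(school[j]) < cost(hunters[i]):
            hunters[i] = school.pop(j)  # sailfish takes the sardine's position
    return hunters, school


def toy_cost(position: Sequence[float]) -> float:
    """Hypothetical stand-in for CC (sum of squares), for illustration only."""
    return sum(x * x for x in position)


new_sailfish, remaining = catch_prey([[3.0], [1.0]], [[0.5], [2.0], [4.0]], toy_cost)
```

In this example the first sailfish (cost 9) is replaced by the sardine at 0.5 (cost 0.25), which leaves the school; the second sailfish (cost 1) is already fitter than every remaining sardine and stays where it is.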
Thereafter, the positions of the best sailfish and the injured sardine are also updated at every iteration.

6.3.6. Deducing Optimal Sailfish

An injured sardine that is separated from the school is quickly captured. In SFOA, when a sardine becomes weak, the corresponding sailfish catches it. The hunted sardine's position replaces the sailfish's position, which raises the probability of hunting new prey. Once the termination condition is satisfied, the best sailfish is returned along with its fitness value, that is, CC.
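The steps described in this section can be combined into one loop. The sketch below is a continuous toy version written under several assumptions, not the authors' mapping implementation: positions are real-valued vectors rather than task-to-tile mappings, the termination condition is a fixed iteration budget or an empty sardine population, the sardines updated when AP < 0.5 are chosen at random, and all parameter values (`c`, `epsilon`, population sizes, `bounds`) are illustrative.

```python
import random
from typing import Callable, List, Sequence, Tuple


def sfo_minimize(cost: Callable[[Sequence[float]], float], dim: int,
                 n_sailfish: int = 5, n_sardines: int = 20, iterations: int = 50,
                 c: float = 4.0, epsilon: float = 0.001,
                 bounds: Tuple[float, float] = (-5.0, 5.0),
                 seed: int = 0) -> Tuple[List[float], List[float]]:
    """Toy sailfish-optimizer loop; returns the elite and its cost history."""
    rng = random.Random(seed)

    def random_position() -> List[float]:
        return [rng.uniform(*bounds) for _ in range(dim)]

    sailfish = [random_position() for _ in range(n_sailfish)]
    sardines = [random_position() for _ in range(n_sardines)]
    elite = list(min(sailfish, key=cost))
    history = [cost(elite)]
    for itr in range(iterations):
        if not sardines:  # assumed extra stop rule: no prey left
            break
        injured = min(sardines, key=cost)
        p = 1.0 - len(sailfish) / (len(sailfish) + len(sardines))  # prey density
        for i, old in enumerate(sailfish):  # attack-alternation update
            phi = rng.random()
            delta = 2.0 * phi * p
            sailfish[i] = [b - delta * (phi * (b + s) / 2.0 - o)
                           for o, b, s in zip(old, elite, injured)]
        ap = c * (1.0 - 2.0 * itr * epsilon)  # attack power
        n = len(sardines)
        count = n if ap >= 0.5 else max(0, int(n * ap))
        for j in rng.sample(range(n), count):  # hunting-prey update
            phi1 = rng.random()
            sardines[j] = [phi1 * (b - o + ap) for o, b in zip(sardines[j], elite)]
        for i in range(len(sailfish)):  # catching prey
            if not sardines:
                break
            j = min(range(len(sardines)), key=lambda k: cost(sardines[k]))
            if cost(sardines[j]) < cost(sailfish[i]):
                sailfish[i] = sardines.pop(j)
        candidate = min(sailfish, key=cost)  # elitism: keep the best so far
        if cost(candidate) < cost(elite):
            elite = list(candidate)
        history.append(cost(elite))
    return elite, history


def sphere(position: Sequence[float]) -> float:
    """Hypothetical test objective (sum of squares), not the paper's CC."""
    return sum(x * x for x in position)


best, trace = sfo_minimize(sphere, dim=3)
```

Because the elite is replaced only by a strictly better sailfish, the recorded cost history never increases, which is the property elitism is meant to provide.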

7. Results and Discussion

This section presents the performance analysis of SFOA for 2D NoC on the six standard NoC benchmarks listed in Table 1. For a fair comparison with previous state-of-the-art work, the network size is the standard 4 × 4 mesh for all benchmarks. The VOPD application consists of 16 sub-tasks, which fill the 4 × 4 mesh completely. For MPEG4, MWD, MP3encMP3dec, 263encMP3dec and 263decMP3dec, however, 4, 4, 3, 4 and 2 routers are idle, respectively, given the node counts in Table 1.

7.1. Experimental Setup

To evaluate the performance of the proposed SFOA, various experiments were conducted on different standard NoC benchmarks. The proposed algorithm was compared for 2D NoC against other nature-inspired algorithms such as ACO, PSO, GA, SA and CSO. The code for the proposed SFOA was written in Python and implemented on the NoCTweak simulator [43]. All experiments were run on a PC with an Intel(R) Core(TM) i7 2.30 GHz processor and 16 GB of RAM. Table 2 gives the details of the NoCTweak simulation platform.

7.2. Average Power Dissipation Analysis

To evaluate the efficiency of the proposed algorithm, a power minimization analysis was also performed. It shows that SFOA outperformed the other existing mapping techniques and gives the average percentage improvement in power over the other nature-inspired algorithms.
Table 3 shows the total power consumption in watts (W) of the 2D 4 × 4 mesh for the six standard NoC benchmarks. According to Table 3, the average improvement in power minimization of the proposed SFOA is 3.63%, 23.7%, 18.70%, 22.14%, 27.25%, 18.66%, 12.08% and 4.73% over ILP, ACO, PSO, SAT, SA, GA, BA and CSO, respectively.

7.3. Communication Cost and Computation Time Analysis

This section compares the proposed SFOA with other existing nature-inspired mapping algorithms. Table 4 gives the average communication cost ($H_{count} \times BW$) from Equation (8) for the VOPD [34] and MPEG4 [30] standard NoC benchmarks on a two-dimensional NoC.
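As a rough illustration of the hops × bandwidth metric, the sketch below evaluates two mappings of a small task graph on a 4 × 4 mesh. It assumes that the cost sums hop count times bandwidth over all edges of the task graph and that the hop count is the Manhattan distance between tiles, as under XY routing; the four-task graph, its bandwidths and both mappings are hypothetical and are not taken from any benchmark.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (source task, destination task, bandwidth in MB/s)


def hop_count(tile_a: int, tile_b: int, cols: int = 4) -> int:
    """Manhattan distance between two tiles numbered row by row on the mesh."""
    row_a, col_a = divmod(tile_a, cols)
    row_b, col_b = divmod(tile_b, cols)
    return abs(row_a - row_b) + abs(col_a - col_b)


def communication_cost(edges: List[Edge], mapping: Dict[str, int],
                       cols: int = 4) -> int:
    """Sum of hop count x bandwidth over all edges for a given task-to-tile mapping."""
    return sum(bw * hop_count(mapping[src], mapping[dst], cols)
               for src, dst, bw in edges)


# Hypothetical task graph and mappings, for illustration only.
edges = [("a", "b", 70), ("b", "c", 362), ("c", "d", 27)]
compact = {"a": 0, "b": 1, "c": 5, "d": 6}    # communicating tasks on adjacent tiles
spread = {"a": 0, "b": 3, "c": 12, "d": 15}   # communicating tasks far apart

compact_cost = communication_cost(edges, compact)
spread_cost = communication_cost(edges, spread)
```

The compact mapping costs 459 and the spread mapping 2463, which shows why a mapping algorithm tries to place heavily communicating tasks on neighbouring tiles.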
As ILP [8] is regarded as one of the most competent exact mapping methods for communication cost estimation, the proposed SFOA was explicitly compared with ILP as well as with the other algorithms. As Table 4 shows, SFOA achieves the same communication cost as ILP. A few values are missing in Table 4, Table 5 and Table 6 for some benchmarks because they were not provided by the authors of the base papers for ACO [33] and SA [32].
The percentage deviation of the heuristic-based mapping techniques from the exact ILP-based mapping for 2D NoC is shown in Table 5. The proposed SFOA gives the best results among the nature-inspired algorithms, as shown in Table 6, and takes 69% less computation time than the other existing mapping techniques. Table 6 lists the computation time in seconds and the communication cost in MB/s of the two-dimensional 4 × 4 mesh for the six standard NoC benchmarks.
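The deviations in Table 5 can be recomputed from the communication costs in Table 4. The sketch below assumes that the deviation is defined as 100 × (cost − ILP cost) / ILP cost, which is not stated explicitly in this section; it includes only the entries that Table 5 reports, so SA on MPEG4 and ACO on VOPD are left out.

```python
from typing import Dict

# Communication costs (hops x bandwidth, MB/s) as listed in Table 4.
ILP_COST: Dict[str, int] = {"VOPD": 4119, "MPEG4": 3567}
HEURISTIC_COST: Dict[str, Dict[str, int]] = {
    "ACO": {"MPEG4": 3633},
    "PSO": {"VOPD": 4119, "MPEG4": 3567},
    "SA": {"VOPD": 4231},
    "GA": {"VOPD": 4218, "MPEG4": 3772},
}


def percent_deviation(cost: float, reference: float) -> float:
    """Relative excess of `cost` over `reference`, in percent (assumed definition)."""
    return 100.0 * (cost - reference) / reference


deviation = {
    algorithm: {bench: round(percent_deviation(cost, ILP_COST[bench]), 1)
                for bench, cost in costs.items()}
    for algorithm, costs in HEURISTIC_COST.items()
}
```

Under this definition the sketch gives 2.4% and 5.7% for GA, 2.7% for SA on VOPD and 1.9% for ACO on MPEG4, in agreement with the values listed in Table 5.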

7.4. Average Network Latency Analysis

To analyze the performance of the proposed SFOA, the average network latency was also examined under different traffic patterns on the mesh topology. Two traffic patterns were considered: uniform random and tornado. A traffic pattern defines how the IP cores of the NoC communicate with each other.
The uniform random traffic pattern distributes traffic uniformly and balances the load, since each source is equally likely to communicate with each destination. The tornado traffic pattern was devised as an adversarial pattern for torus topologies.
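For readers unfamiliar with these patterns, the sketch below shows how a destination could be chosen for each source on a 4 × 4 mesh. Both definitions are assumptions made for illustration: uniform random traffic is taken to exclude the source itself, and tornado traffic follows one common textbook form in which each coordinate is shifted by ⌈k/2⌉ − 1 positions modulo k. The exact definitions used by the simulator may differ.

```python
import math
import random


def uniform_random_destination(src: int, n_nodes: int, rng: random.Random) -> int:
    """Pick any node other than `src` with equal probability (assumed variant)."""
    dst = rng.randrange(n_nodes - 1)
    return dst if dst < src else dst + 1  # skip over the source itself


def tornado_destination(src: int, k: int = 4) -> int:
    """Shift both coordinates by ceil(k/2) - 1 modulo k (assumed variant)."""
    shift = math.ceil(k / 2) - 1
    row, col = divmod(src, k)
    return ((row + shift) % k) * k + (col + shift) % k


rng = random.Random(1)
random_pairs = [(s, uniform_random_destination(s, 16, rng)) for s in range(16)]
tornado_pairs = [(s, tornado_destination(s)) for s in range(16)]
```

Under this tornado definition every node sends to exactly one fixed destination and receives from exactly one source, whereas the uniform random pattern spreads each node's traffic over all other nodes.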
The performance analysis of the considered 4 × 4 mesh-based NoC architecture was done using the XY-routing algorithm via the NoCTweak simulator [43]. The average network latency of the proposed algorithm, i.e., SFOA, was evaluated for the above-considered two types of traffic patterns compared with other existing nature-inspired heuristics algorithms.
Figure 7 depicts the graphical results of the average network latency in contrast to different rates of injection load under uniform random traffic patterns. It is evident from this graph that SFOA outperformed PSO, GA, BA and CSO by 11.23%, 16.40%, 8.65% and 4.42%, respectively, for uniform random traffic pattern. Furthermore, Figure 8 illustrates the graphical results of the average network latency compared to different injection load rates under tornado traffic patterns. It can be seen from this graph that SFOA outperformed PSO, GA, BA and CSO by 24.06%, 25.45%, 13.89% and 5.82%, respectively, for tornado traffic patterns.
SFOA gives the best latency among the nature-inspired algorithms considered (PSO, GA, BA and CSO) when the minimum hop count mapping technique is used.
The mapping results clearly indicate that the proposed SFOA is more efficient than the other existing nature-inspired algorithms. The figures and tables show reductions in average network power consumption, computation time, communication cost and average network latency.

8. Conclusions

This paper presents a state-of-the-art nature-inspired metaheuristic algorithm, SFOA, which offers two main advantages. The first is high-speed convergence, achieved by strengthening the search process used for the best sailfish group. The second is robust optimization, achieved by strengthening the search space through the diversity of the sardine population. SFOA is used for the optimized mapping of the application task graph on a two-dimensional NoC with mesh topology. The efficiency of the proposed approach was assessed on the performance analysis parameters for six standard NoC benchmarks. SFOA was evaluated through multiple experiments against alternative algorithms such as ACO, PSO, SA, GA, BA and CSO. The results in the previous section indicate that the average improvement in power minimization of the proposed SFOA is 3.63%, 23.7%, 18.70%, 22.14%, 27.25%, 18.66%, 12.08% and 4.73% over ILP, ACO, PSO, SAT, SA, GA, BA and CSO, respectively. Compared with the other existing mapping techniques, the proposed SFOA takes 69% less computation time. The average network latency graphs show that SFOA outperformed PSO, GA, BA and CSO by 11.23%, 16.40%, 8.65% and 4.42%, respectively, for uniform random traffic patterns and by 24.06%, 25.45%, 13.89% and 5.82%, respectively, for tornado traffic patterns. The experimental results reveal that SFOA outperformed the other nature-inspired algorithms in minimizing power consumption, computation time, communication cost and latency. This work can be continued in various ways; for example, hybrid algorithms could be introduced to reduce computation time further. The algorithm could also be implemented on 2D and 3D NoC architectures with different topologies.

Author Contributions

Conceptualization, S.S. and N.K.B.; Formal analysis, S.S., N.K.B., F.H. and W.A.; Investigation, S.S., N.K.B., F.H., W.A., Y.B.Z. and H.Y.; Methodology, S.S., N.K.B., F.H., W.A., Y.B.Z. and H.Y.; Resources, Y.B.Z. and H.Y.; Software, S.S. and N.K.B.; Validation, N.K.B., Y.B.Z. and H.Y.; Visualization, S.S., N.K.B., F.H. and W.A.; Writing—original draft, S.S. and N.K.B.; Writing—review and editing, Y.B.Z. and H.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (2019R1A2C1083988), in part by Basic Science Research Program through the NRF funded by the Ministry of Education (2021R1I1A3041887) and in part by MSIT, Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2016-0-00313) supervised by the IITP (Institute for Information & communications Technology Promotion).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bjerregaard, T.; Mahadevan, S. A survey of research and practices of network-on-chip. ACM Comput. Surv. (CSUR) 2006, 38, 1–es.
  2. Tsai, W.C.; Lan, Y.C.; Hu, Y.H.; Chen, S.J. Networks on chips: Structure and design methodologies. J. Electr. Comput. Eng. 2012, 2012, 4.
  3. Baloch, N.K.; Baig, M.I.; Daneshtalab, M. Defender: A low overhead and efficient fault-tolerant mechanism for reliable on-chip router. IEEE Access 2019, 7, 142843–142854.
  4. Rashid, M.; Baloch, N.K.; Shafique, M.A.; Hussain, F.; Saleem, S.; Zikria, Y.B.; Yu, H. Fault-Tolerant Network-On-Chip Router Architecture Design for Heterogeneous Computing Systems in the Context of Internet of Things. Sensors 2020, 20, 5355.
  5. Shafique, M.A.; Baloch, N.K.; Baig, M.I.; Hussain, F.; Zikria, Y.B.; Kim, S.W. NoCGuard: A Reliable Network-on-Chip Router Architecture. Electronics 2020, 9, 342.
  6. Ibrahim, M.; Baloch, N.K.; Anjum, S.; Zikria, Y.B.; Kim, S.W. An energy efficient and low overhead fault mitigation technique for internet of thing edge devices reliable on-chip communication. Softw. Pract. Exp. 2020.
  7. Bononi, L.; Concer, N. Simulation and analysis of network on chip architectures: Ring, spidergon and 2D mesh. In Proceedings of the Design Automation & Test in Europe Conference, Munich, Germany, 6–10 March 2006; Volume 2, p. 6.
  8. Tosun, S.; Ozturk, O.; Ozen, M. An ILP formulation for application mapping onto network-on-chips. In Proceedings of the 2009 International Conference on Application of Information and Communication Technologies, Baku, Azerbaijan, 14–16 October 2009; pp. 1–5.
  9. Shadravan, S.; Naji, H.; Bardsiri, V.K. The Sailfish Optimizer: A novel nature-inspired metaheuristic algorithm for solving constrained engineering optimization problems. Eng. Appl. Artif. Intell. 2019, 80, 20–34.
  10. Araki, D.; Yoshihiro, T. A Distance-Vector-Based Multi-Path Routing Scheme for Static-Node-Assisted Vehicular Networks. Sensors 2019, 19, 2688.
  11. Melo, D.R.; Zeferino, C.A.; Dilillo, L.; Bezerra, E.A. Maximizing the inner resilience of a network-on-chip through router controllers design. Sensors 2019, 19, 5416.
  12. Wu, L.; Cai, H. Energy-Efficient Adaptive Sensing Scheduling in Wireless Sensor Networks using Fibonacci Tree Optimization Algorithm. Sensors 2021, 21, 5002.
  13. Rhee, J.H.; Kim, S.I.; Lee, K.M.; Kim, M.K.; Lim, Y.M. Optimization of Position and Number of Hotspot Detectors Using Artificial Neural Network and Genetic Algorithm to Estimate Material Levels Inside a Silo. Sensors 2021, 21, 4427.
  14. Ajani, T.S.; Imoize, A.L.; Atayero, A.A. An Overview of Machine Learning within Embedded and Mobile Devices–Optimizations and Applications. Sensors 2021, 21, 4412.
  15. Singh, T.; Saxena, N.; Khurana, M.; Singh, D.; Abdalla, M.; Alshazly, H. Data Clustering Using Moth-Flame Optimization Algorithm. Sensors 2021, 21, 4086.
  16. Hu, J.; Marculescu, R. Energy-and performance-aware mapping for regular NoC architectures. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2005, 24, 551–562.
  17. Lei, T.; Kumar, S. A two-step genetic algorithm for mapping task graphs to a network on chip architecture. In Proceedings of the Euromicro Symposium on Digital System Design, Belek-Antalya, Turkey, 1–6 September 2003; pp. 180–187.
  18. Murali, S.; De Micheli, G. Bandwidth-constrained mapping of cores onto NoC architectures. In Proceedings of the Design, Automation and Test in Europe Conference and Exhibition, Paris, France, 16–20 February 2004; Volume 2, pp. 896–901.
  19. Lu, Z.; Xia, L.; Jantsch, A. Cluster-based simulated annealing for mapping cores onto 2D mesh networks on chip. In Proceedings of the 2008 11th IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems, Bratislava, Slovakia, 16–18 April 2008; pp. 1–6.
  20. Radu, C.; Vinţan, L. Domain-knowledge optimized simulated annealing for Network-on-Chip application mapping. In Advances in Intelligent Control Systems and Computer Science; Springer: Berlin/Heidelberg, Germany, 2013; pp. 473–487.
  21. Ascia, G.; Catania, V.; Palesi, M. Multi-objective mapping for mesh-based NoC architectures. In Proceedings of the 2nd IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis, Stockholm, Sweden, 8–10 September 2004; pp. 182–187.
  22. Jena, R.; Sharma, G. Application mapping of mesh based-NoC using multi-objective genetic algorithm. Int. J. Comput. Appl. 2008, 30, 17–22.
  23. Sepúlveda, J.; Strum, M.; Chau, W.J.; Gogniat, G. A multi-objective approach for multi-application NoC mapping. In Proceedings of the 2011 IEEE Second Latin American Symposium on Circuits and Systems (LASCAS), Bogota, Colombia, 23–25 February 2011; pp. 1–4.
  24. Harmanani, H.M.; Farah, R. A method for efficient mapping and reliable routing for NoC architectures with minimum bandwidth and area. In Proceedings of the 2008 Joint 6th International IEEE Northeast Workshop on Circuits and Systems and TAISA Conference, Montreal, QC, Canada, 22–25 June 2008; pp. 29–32.
  25. Hu, J.; Marculescu, R. Energy-aware mapping for tile-based NoC architectures under performance constraints. In Proceedings of the 2003 Asia and South Pacific Design Automation Conference, Kitakyushu, Japan, 24 January 2003; pp. 233–239.
  26. Ye, T.T.; Micheli, G.D.; Benini, L. Analysis of power consumption on switch fabrics in network routers. In Proceedings of the 39th Annual Design Automation Conference, New Orleans, LA, USA, 10–14 June 2002; pp. 524–529.
  27. Bhat, S. Energy Models for Network-on-Chip Components; Master of Science, Department of Mathematics and Computer Science, Technische Universiteit Eindhoven: Eindhoven, The Netherlands, 2005.
  28. Kahng, A.B.; Lin, B.; Samadi, K. Improved on-chip router analytical power and area modeling. In Proceedings of the 2010 15th Asia and South Pacific Design Automation Conference (ASP-DAC), Taipei, Taiwan, 18–21 January 2010; pp. 241–246.
  29. Ost, L.; Guindani, G.; Moraes, F.; Indrusiak, L.; Määttä, S. Exploring NoC-based MPSoC design space with power estimation models. IEEE Des. Test Comput. 2010, 28, 16–29.
  30. Feero, B.S.; Pande, P.P. Networks-on-chip in a three-dimensional environment: A performance evaluation. IEEE Trans. Comput. 2008, 58, 32–45.
  31. Sahu, P.K.; Chattopadhyay, S. A survey on application mapping strategies for network-on-chip design. J. Syst. Archit. 2013, 59, 60–76.
  32. Marcon, C.; Borin, A.; Susin, A.; Carro, L.; Wagner, F. Time and energy efficient mapping of embedded applications onto NoCs. In Proceedings of the 2005 Asia and South Pacific Design Automation Conference, Shanghai, China, 21–21 January 2005; pp. 33–38.
  33. Ferrandi, F.; Lanzi, P.L.; Pilato, C.; Sciuto, D.; Tumeo, A. Ant colony heuristic for mapping and scheduling tasks and communications on heterogeneous embedded systems. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2010, 29, 911–924.
  34. Sahu, P.K.; Shah, T.; Manna, K.; Chattopadhyay, S. Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 2013, 22, 300–312.
  35. Alagarsamy, A.; Gopalakrishnan, L. SAT: A new application mapping method for power optimization in 2D—NoC. In Proceedings of the 2016 20th International Symposium on VLSI Design and Test (VDAT), Guwahati, India, 24–27 May 2016; pp. 1–6.
  36. Tavanpour, M.; Khademzadeh, A.; Janidarmian, M. Chain-Mapping for mesh based Network-on-Chip architecture. IEICE Electron. Express 2009, 6, 1535–1541.
  37. Tosun, S. New heuristic algorithms for energy aware application mapping and routing on mesh-based NoCs. J. Syst. Archit. 2011, 57, 69–78.
  38. Patooghy, A.; Tabkhi, H.; Miremadi, S.G. RMAP: A reliability-aware application mapping for network-on-chips. In Proceedings of the 2010 Third International Conference on Dependability, Venice, Italy, 18–25 July 2010; pp. 112–117.
  39. Niknam, T.; Amiri, B. An efficient hybrid approach based on PSO, ACO and k-means for cluster analysis. Appl. Soft Comput. 2010, 10, 183–197.
  40. Junior, L.S.; Nedjah, N.; de Macedo Mourelle, L. Routing for applications in NoC using ACO-based algorithms. Appl. Soft Comput. 2013, 13, 2224–2231.
  41. Xie, B.; Chen, T.; Hu, W.; Tang, X.; Wang, D. An energy-aware online task mapping algorithm in NoC-based system. J. Supercomput. 2013, 64, 1021–1037.
  42. Herbert-Read, J.E.; Romanczuk, P.; Krause, S.; Strömbom, D.; Couillaud, P.; Domenici, P.; Kurvers, R.H.; Marras, S.; Steffensen, J.F.; Wilson, A.D.; et al. Proto-cooperation: Group hunting sailfish improve hunting success by alternating attacks on grouping prey. Proc. R. Soc. Biol. Sci. 2016, 283, 20161671.
  43. Tran, A.T.; Baas, B. NoCTweak: A Highly Parameterizable Simulator for Early Exploration of Performance and Energy of Networks on-Chip; Tech. Rep. ECE-VCL-2012-2; VLSI Computation Lab, ECE Department, University of California: Davis, CA, USA, 2012.
  44. Alagarsamy, A.; Gopalakrishnan, L.; Mahilmaran, S.; Ko, S.B. A self-adaptive mapping approach for network on chip with low power consumption. IEEE Access 2019, 7, 84066–84081.
  45. Li, J.; Song, G.; Ma, Y.; Wang, C.; Zhu, B.; Chai, Y.; Rong, J. Bat algorithm based low power mapping methods for 3D network-on-chips. In Proceedings of the National Conference of Theoretical Computer Science, Nanning, China, 13–15 November 2020; pp. 277–295.

Short Biography of Authors

Saleha Sikandar completed her BS in Electronic Engineering from International Islamic University Islamabad (IIUI), Pakistan in 2017. She is currently doing her MSc in Computer Engineering from University of Engineering & Technology (UET) Taxila, Pakistan. She has vast experience in research and development in various embedded systems companies. She worked on the development of many embedded and chip design systems. Her research interests include embedded systems design, Network on Chip (NoC), and reconfigurable systems designs. She is currently working on the low-cost application mapping on NoC.
 
Naveed Khan Baloch received his BSc degree in Computer Engineering from the University of Engineering and Technology, Taxila, Pakistan in 2007. He has worked in multinational companies as an embedded system designer from 2007 to 2010. He joined the university as a lecturer after completing his MS degree from UET Taxila. He recently completed his Ph.D. in Computer Engineering from the same university. He has published many research papers in his field and has experience in embedded system design, fault tolerant systems, reconfigurable computing, and he is currently working on self-healing digital systems. Nowadays, he is working as an Assistant Professor in Computer Engineering Department UET Taxila. During his tenure in academia, he did many collaborations with industry and foreign universities in the field of on-chip networks, embedded vision, and reconfigurable computing.
 
Fawad Hussain received his B.Sc. in Computer Engineering, M.Sc. in Electrical Engineering, and Ph.D. in Computer Engineering at the University of Engineering and Technology (UET), Taxila, Pakistan in 2005, 2009, and 2015, respectively. He is currently working as an Assistant Professor in the Computer Engineering Department, the University of Engineering and Technology (UET), Taxila, Pakistan. His research interest includes speech and audio processing, computer vision, human activity recognition, and emotion recognition. He is currently leading research for his MS and Ph.D. students in the mentioned areas of interest.
 
Waqar Amin received his B.Sc. degree in computer engineering from the University of Engineering and Technology at Taxila (UET Taxila), Pakistan, in 2007, and MS degree also from the UET Taxila in 2014. Currently, he is pursuing his Ph.D. degree. He has vast experience in research and development in various embedded systems companies. He worked on the development of many GSM and 3G systems. His research interests include fault-tolerant systems, Network on Chip (NoC), self-healing, and reconfigurable systems designs. Currently, he is working on the low-cost application mapping on NoC.
 
Yousaf Bin Zikria is currently working as an Assistant Professor in the Department of Information and Communication Engineering, College of Engineering, Yeungnam University, Gyeongsan-Si, South Korea. He received a Ph.D. degree from the Department of Information and Communication Engineering, Yeungnam University, Korea, in 2016. He has more than ten years of experience in research, academia, and industry in the field of Information and Communication Engineering and Computer Science. He authored more than 90 scientific peer-reviewed journals, conferences, patents, and book chapters. GoogleScholar: https://scholar.google.com/citations?user=K90qMyMAAAAJhl=en (accessed on 20 July 2021), Website: https://sites.google.com/view/ybzikria (accessed on 20 July 2021), Researchgate: https://www.researchgate.net/profile/YousafZikria (accessed on 20 July 2021).
 
Heejung Yu (Senior Member, IEEE) received a B.S. degree in Radio Science and Engineering from Korea University, Seoul, South Korea, in 1999 and M.S. and Ph.D. degrees in Electrical Engineering from the Korea Advanced Institute of Science and Technology, Daejeon, South Korea, in 2001 and 2011, respectively. From 2001 to 2012, he was with the Electronics and Telecommunications Research Institute, Daejeon, South Korea, and, from 2012 to 2019, he was with Yeungnam University, Gyeongsan, South Korea. Currently, he is an associate professor with the Department of Electronics and Information Engineering, Korea University, Sejong, South Korea. His areas of interest include statistical signal processing and communication theory.
Figure 1. Flowchart for initial mapping.
Figure 2. Standard NoC VOPD benchmark.
Figure 3. Clustering of VOPD task graph.
Figure 4. Initial mapping of VOPD task graph on 4 × 4 mesh.
Figure 5. Architectural representation of VOPD.
Figure 6. Flowchart for proposed SFOA.
Figure 7. Average network latency for uniform random traffic patterns.
Figure 8. Average network latency for Tornado traffic patterns.
Table 1. Standard NoC benchmarks details with mesh sizes.

| Benchmark | Nodes | Edges | 2D Mesh Size |
|---|---|---|---|
| VOPD [35] | 16 | 21 | 4 × 4 |
| MPEG4 [31] | 12 | 26 | 4 × 4 |
| MWD [35] | 12 | 13 | 4 × 4 |
| MP3encMP3dec [31] | 13 | 14 | 4 × 4 |
| 263encMP3dec [31] | 12 | 12 | 4 × 4 |
| 263decMP3dec [31] | 14 | 15 | 4 × 4 |
Table 2. Simulation environment description.

| Parameter | Value |
|---|---|
| Network Type | 2D Mesh |
| Type of Platform | EMBEDDED |
| Embedded applications | VOPD, MPEG4, MWD, MP3encMP3dec, 263encMP3dec, 263decMP3dec |
| Mapping algorithm | SFOA, CSO, ACO, PSO, SA |
| Type of Router | WORMHOLE-PIPELINE |
| Routing algorithm | XY DIMENSION-ORDERED |
| Arbitration Policy | VIRTUAL CHANNEL ARBITRATION |
| Packet delivery type | WITHOUT ACK |
| Packet distribution | EXPONENTIAL |
| Sending ACK policy | SEND ACK OPTIMALLY |
| Packet length (fixed) | 10 (flits) |
| Injection rate (flit) | 0.1 (flits/cycle/node) |
| Output channel selection | XY-ORDERED |
| Buffer size | 8 (flits) |
| Inter-route link length | 10,000 (µm) |
| Pipeline type | 8 |
| Pipeline stages | 4 |
| Input clock frequency | 1000 (MHz) |
| Operating clock frequency | 1000 (MHz) |
| Warm-up time | 20,000 cycles |
Table 3. Total Power (W) for 2D NoC 4 × 4 Mesh for standard NoC benchmarks.

| Mapping Algorithm | VOPD | MPEG4 | MWD | MP3encMP3dec | 263encMP3dec | 263decMP3dec |
|---|---|---|---|---|---|---|
| ILP | 1.528 | 1.137 | 1.012 | 1.228 | 1.286 | 1.211 |
| ACO | 1.920 | 1.423 | 1.218 | 1.498 | 1.599 | 1.738 |
| PSO | 1.841 | 1.357 | 1.112 | 1.507 | 1.445 | 1.561 |
| SAT | 1.856 | 1.370 | 1.236 | 1.524 | 1.563 | 1.624 |
| SA | 1.971 | 1.478 | 1.256 | 1.590 | 1.697 | 1.877 |
| GA | 1.843 | 1.356 | 1.109 | 1.507 | 1.445 | 1.561 |
| BA | 1.634 | 1.247 | 1.110 | 1.486 | 1.323 | 1.313 |
| CSO | 1.518 | 1.219 | 1.023 | 1.228 | 1.286 | 1.198 |
| Proposed Algorithm | 1.311 | 1.201 | 1.015 | 1.228 | 1.286 | 1.148 |
Table 4. Estimation of communication cost for 2D NoC. Communication cost (hops × bandwidth) is given in MB/s.

| Mapping Algorithm | VOPD | MPEG4 |
|---|---|---|
| ILP [8] | 4119 | 3567 |
| ACO [33] | - | 3633 |
| PSO [34] | 4119 | 3567 |
| SA [32] | 4231 | 3567 |
| GA [21] | 4218 | 3772 |
| BA [45] | 4119 | 3567 |
| CSO [44] | 4119 | 3567 |
| Proposed Algorithm | 4119 | 3567 |
Table 5. Percentage deviation over ILP based mapping techniques. Values are the percentage of communication cost deviation.

| Mapping Algorithm | VOPD | MPEG4 |
|---|---|---|
| ACO | - | 1.9 |
| PSO | 0.0 | 0.0 |
| SA | 2.7 | - |
| GA | 2.4 | 5.7 |
| BA | 0.0 | 0.0 |
| CSO | 0.0 | 0.0 |
| Proposed Algorithm | 0.0 | 0.0 |
Table 6. Communication cost and computation time of 2D NoC 4 × 4 mesh for standard NoC benchmarks. CC: communication cost (hops × bandwidth) in MB/s; Time: computation time in seconds.

| Mapping Algorithm | VOPD CC | VOPD Time | MPEG4 CC | MPEG4 Time | MWD CC | MWD Time |
|---|---|---|---|---|---|---|
| ILP | 4119 | 4679.341 | 3567 | 22.340 | 1120 | 210.021 |
| ACO | - | - | 3633 | 18.652 | - | - |
| PSO | 4119 | 3.785 | 3567 | 3.465 | 1120 | 3.432 |
| SA | 4231 | 3878.527 | 3567 | - | 1451 | 197.541 |
| GA | 4218 | 3.925 | 3772 | 3.234 | 1321 | 3.420 |
| BA | 4119 | 2.231 | 3567 | 2.925 | 1122 | 2.894 |
| CSO | 4119 | 2.231 | 3567 | 2.010 | 1122 | 1.996 |
| Proposed Algorithm | 4119 | 1.98 | 3567 | 1.96 | 1120 | 1.886 |

| Mapping Algorithm | MP3encMP3dec CC | MP3encMP3dec Time | 263encMP3dec CC | 263encMP3dec Time | 263decMP3dec CC | 263decMP3dec Time |
|---|---|---|---|---|---|---|
| ILP | 17.021 | 1435.012 | 230.407 | 193.035 | 19.823 | 4897.210 |
| ACO | 17.231 | 1196.856 | - | - | - | - |
| PSO | 17.021 | 3.194 | 230.407 | 3.185 | 19.823 | 3.188 |
| GA | 17.133 | 3.194 | 230.698 | 3.185 | 19.911 | 3.174 |
| BA | 17.834 | 2.653 | 231.450 | 2.345 | 19.936 | 2.350 |
| CSO | 17.021 | 1.785 | 230.407 | 1.527 | 19.823 | 1.511 |
| Proposed Algorithm | 17.021 | 1.585 | 230.407 | 1.227 | 19.823 | 1.011 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Sikandar, S.; Baloch, N.K.; Hussain, F.; Amin, W.; Zikria, Y.B.; Yu, H. An Optimized Nature-Inspired Metaheuristic Algorithm for Application Mapping in 2D-NoC. Sensors 2021, 21, 5102. https://doi.org/10.3390/s21155102
