Article

A Path Load-Aware Based Caching Strategy for Information-Centric Networking

1 National Network New Media Engineering Research Center, Institute of Acoustics, Chinese Academy of Sciences, No. 21, North Fourth Ring Road, Haidian District, Beijing 100190, China
2 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, No. 19(A), Yuquan Road, Shijingshan District, Beijing 100049, China
* Author to whom correspondence should be addressed.
Electronics 2022, 11(19), 3088; https://doi.org/10.3390/electronics11193088
Submission received: 2 September 2022 / Revised: 21 September 2022 / Accepted: 23 September 2022 / Published: 27 September 2022

Abstract:
Ubiquitous in-network caching plays an important role in improving the efficiency of content access and distribution in Information-Centric Networking (ICN). Content placement strategies, which determine the distribution of content replicas in the network, have a decisive impact on the performance of the cache system. Existing strategies primarily focus on pushing popular content to the network edge, aiming to improve the overall cache hit ratio while neglecting to balance the traffic load between network links; this leads to insufficient utilization of network bandwidth resources and, in turn, excessive content delivery times and degraded user QoE. In this paper, a Path Load-Aware Based Caching strategy (PLABC) is proposed, in which content-related information and dynamic network-related information are comprehensively considered to make cache decisions. Specifically, the utility of caching the content at each on-path node is calculated according to the bandwidth consumption savings and the load level of the transmission path, and the node with the greatest utility value is selected as the caching node. Extensive simulations are conducted to compare the performance of PLABC with other state-of-the-art schemes through quantitative analysis. The simulation results validate the effectiveness of the PLABC strategy, especially in balancing link load and reducing content delivery time.

1. Introduction

In recent years, with the thriving of content-based services and the explosive growth of network traffic, the usage model of the Internet has gradually evolved from traditional end-to-end communication to large-scale content access and distribution. At the same time, the widespread commercial deployment of fifth-generation (5G) mobile networks puts forward higher requirements on key network performance metrics, such as bandwidth, latency and throughput [1,2]. These trends bring tremendous challenges to the current TCP/IP-based network architecture, which was originally designed for communication between pairs of hosts. Under these circumstances, ICN was proposed and is regarded as a promising solution to the problems faced by TCP/IP, such as scalability and mobility, due to its novel features, including identity/location separation, ubiquitous in-network caching, and native multicast capabilities [3,4].
As one of the most important features of ICN, in-network caching places content replicas at in-network nodes that are closer to end users, so that when subsequent requests hit a cache on the transmission path, they can be served directly by the cache node instead of fetching the content from the remote server every time. In this way, both the content retrieval delay and the redundant traffic in the network can be reduced, thus improving network performance and content distribution efficiency.
Although caching technology has been extensively studied in traditional cache systems (e.g., Web caches [5,6,7,8,9], Content Distribution Networks (CDNs) [10,11,12], and P2P [13,14]) over the past few decades, many new characteristics of ICN caching make it fundamentally different from previous studies, including its ubiquity in the network, its coupling with network layer functions, and its transparency to upper-layer applications [15,16]. These new features make the design of the ICN cache system more complex, but they also endow ICN nodes with natural content-aware and network-state-aware capabilities, so that multidimensional information can be used to design more efficient cache management mechanisms.
Caching strategies that consider using such multidimensional information, namely network-related information and content-related information, have been proposed in the literature in the context of ICN. For example, several studies comprehensively exploit the node topology information and content popularity information during the cache decision process, so as to improve the cache hit ratio and reduce overall network traffic [15,17,18]. By placing popular content on nodes close to users, these strategies claim to reduce service distance and delay. However, considering that the bandwidth capacity of the network link is finite, a shorter service distance does not always mean a shorter service completion time, as the transmission time of the served content is closely related to the congestion state of the transmission path.
Given that the content transmission time is one of the key indicators of user QoE, researchers have proposed using information indicating network congestion, such as RTT, queuing delay and link bandwidth [19,20,21,22], to manage cached contents. However, most of these approaches work as cache replacement strategies: each node makes content insertion/replacement decisions independently, which inevitably causes cache redundancy and a low cache hit ratio, and in turn a heavier load on the network links.
Focusing on the problems above, this paper proposes a Path Load-Aware Based Caching strategy (PLABC). In PLABC, the load level of the transmission path and the bandwidth consumption saved by content caching are jointly considered in the cache decision process. This strategy aims to balance the traffic load between network links by carefully selecting the cache location of contents, so as to increase the throughput of end-to-end transmissions and reduce the content transmission time. At the same time, the number of cache insertion/replacement operations is limited by considering the cache replacement cost, thereby reducing cache redundancy and improving the overall cache hit ratio. The main contributions of this paper are as follows:
  • We formulate the content placement problem as the average content transmission time minimization problem, and we derive the design principles of the caching strategy based on the analysis of the objective function.
  • We propose a caching strategy based on path load awareness, in which we comprehensively consider the content-related information (e.g., content size and popularity) and the network-related information (e.g., transmission path distance and load) to make cache decisions.
  • We conduct extensive simulations to study the impact of various experimental parameters on the performance of PLABC and demonstrate that it achieves the highest cache hit ratio and minimum average content transmission time compared with state-of-the-art strategies.
The rest of the paper is organized as follows. First, the discussion of related work is presented in Section 2. Then the name resolution-based ICN architecture and the formulated optimization problem are introduced in Section 3. Section 4 describes the algorithm and implementation of the PLABC strategy. Performance evaluation is then presented in Section 5. Finally, Section 6 provides a conclusion of the paper as well as an outlook for future work.

2. Related Work

Researchers have proposed many ICN architectures, which all adopt the communication paradigm of accessing contents and services by name rather than by location, though they differ considerably in their implementations. According to the routing scheme, these architectures can be divided into two categories [4,23]. One is the name-based routing approach, represented by CCN/NDN [24,25], in which a request is routed based on the name of the content. The other is the name resolution approach, in which the content name is resolved into a single network address or a set of network addresses (e.g., IP), and the request is then routed to one of these addresses. Representative architectures of this category include DONA [26], PURSUIT [27], MobilityFirst [28], NetInf [29] and SEANet [30].
Among the many research topics in ICN, the design of in-network caching has received extensive attention from researchers due to its fundamental differences compared to traditional cache systems. The caching strategy determines which content is placed on which nodes. Thus, it plays an important role in improving the performance of the cache system and the content distribution efficiency of the entire network. A variety of caching strategies have been proposed in the context of a Web cache, CDNs as well as ICN; these strategies are based on different criteria when making cache decisions, so they can achieve different optimization goals. Specifically, in the area of ICN, existing caching strategies proposed in the literature can be roughly divided into three categories according to the criteria used in their cache decision process: content-based caching, network-based caching, and hybrid caching.
In content-based caching, cache decisions are made based only on content-related information (such as content size, popularity, freshness, etc.), which are usually maintained and updated independently by each cache node. OCPCP [31] calculates the importance (i.e., popularity) of each content based on its request records, and each node only caches the top n content in importance, where n is the cache capacity of the node. MPC [32] tags content as popular once the number of requests for it reaches a predefined popularity threshold. If the node holds this popular content, then it will suggest to all its neighbor nodes to cache this content. Similarly, FGPC [33] also only caches content whose popularity exceeds the predefined threshold. The difference between FGPC and MPC is that when the remaining cache space is enough, FGPC will cache every newly arriving content while MPC will not. In addition, the authors also proposed a variant of FGPC, called D-FGPC, to dynamically adjust the popularity threshold. In WAVE [34], each content is divided into small-sized chunks, and the upstream node on the transmission path recommends the number of chunks to be cached at its downstream node. The recommended value is exponentially increased as the request number increases to reflect the content’s popularity. TLRU [35] is an extension of simple LRU, in which each node calculates a timestamp for each arriving content, which is determined by factors such as content request frequency and content size. The arriving content is cached only if its corresponding timestamp is greater than its average request time. PB-NCC [36] is a popularity-based caching strategy with number-of-copies control, in which the on-path node that carries the content of minimum popularity is chosen as the caching node.
On the other hand, network-based caching exploits network-related information to make cache decisions. Such information may be static network topology information, such as hop count and graph-related centrality metrics. Moreover, this information may be dynamic network state information, such as the queue length and bandwidth occupation ratio of the forwarding port. In LCD (Leave Copy Down) and MCD (Move Copy Down) [9], each content is only cached at the node one hop downstream of the cache hit node. In addition, in MCD, the requested content needs to be evicted from the cache where the hit occurred. Similarly, in CLS [37], each content copy is pulled down one level towards the user by request and pushed up one level towards the server by the cache eviction. Additionally, CLS creates a caching trail along the download path to assist chunk search, which is similar to Breadcrumbs [38]. On the contrary, literature [39] suggests caching the content at the edge node closest to the user. CL4M [40] further studies the topology information of nodes and proposes to cache the content at the downstream node with the largest betweenness centrality. In ProbCache [41], each router probabilistically caches the content, where the cache probability is calculated based on the caching capability of the downstream path and the router’s distance from the user.
Hybrid caching strategies jointly consider content-related information and network-related information when making cache decisions. For example, MAGIC [18] defines the cache placement gain of each content as the bandwidth consumption that can be saved if the current node caches the content, where the saved bandwidth consumption is calculated by jointly considering the content popularity and the hop reduction. On the content download path, the node with the largest difference between the cache placement gain and the cache replacement penalty is selected as the caching node. Ref. [15] proposed an age-based cache replacement policy, where the lifetime of each content replica is determined by its age, and the age value depends on the content's location and popularity. CRCache [42] exploits the correlation between content popularity and router importance to select several routers along the content delivery path to cache the corresponding content. Several studies consider the congestion state of the content delivery path, aiming to reduce the content transmission time [19,20,21,22,43,44]. For example, CAC [19] calculates the caching utility of each content according to its popularity and the congestion degree of its transmission path, so that popular content is preferentially cached at the downstream nodes of congested links. In CPC [20], existing congestion feedback signals are used to guide cache decisions at each content router. In DCP [43], popular content is distributed to network regions with less traffic load and more accessible routers to balance the traffic load of congested routers.
In general, content-based or network-based caching alone is not enough to improve the performance of the cache system. Content-based caching faces the problem of redundant caching of popular content, while network-based caching has a low cache hit ratio under arbitrary topology. Most importantly, they all lack consideration for limited link capacity and network congestion, so content delivery time cannot be guaranteed. Existing hybrid caching strategies that consider network congestion perform poorly in terms of the cache hit ratio, which makes the overall load of the network links relatively high. As a result, they provide limited performance improvement in terms of network service capacity. To this end, this paper proposes a path load-aware based caching strategy. In the cache decision process, the load level of the transmission path and the bandwidth consumption saved by content caching are comprehensively considered, which can effectively reduce the content transmission time and improve the cache hit ratio.

3. System Model

This section first introduces the name resolution-based ICN architecture [45], followed by a description of the process of content retrieval under this architecture. Then the process of formulating the content placement problem as the average content transmission time minimization problem is presented. Finally, based on the analysis of the objective function, the design principles of the caching strategy are derived.

3.1. Name Resolution Based ICN Architecture

In previous publications [45], we proposed an ICN architecture that adopts the name resolution-based routing approach. As shown in Figure 1, a Name Resolution System (NRS) is responsible for maintaining the mapping between a content name and the network address (NA) of the node where the content is located. The ICN content routers (i.e., $R_1$, $R_2$, $R_3$) implement packet forwarding and in-network caching functions, and those located at the network edge are also called edge routers (i.e., $R_1$, $R_3$). Edge routers are connected to the local area networks where the end users are located. The details of content retrieval and caching in the network are as follows.
When the origin server generates a new content $m$, it first registers the binding between its NA and the content's name with the NRS. After that, when user $A$ wants to obtain content $m$, it first resolves, through the NRS, the NA of the node where the content is located according to the content's name. Then, the content request is issued and routed to this node. When the origin server receives this request, it returns the corresponding content data to user $A$.
In the process of content transmission, the intermediate nodes need to decide whether to cache the content, so a caching strategy needs to be specified for these nodes. For example, under the Leave Copy Everywhere (LCE) strategy [16], every node stores a copy of every content that passes through it. When the caching operation is done, the routers also need to register the binding between their NA and the content's name with the NRS.
User $B$ retrieves the content following the same process. First, it resolves through the NRS the NAs of all the nodes holding content $m$. Then it selects one node among them according to a specific replica selection strategy. Finally, it sends the request to the selected node. For example, under Nearest Replica Routing (NRR) [46], router $R_2$ will be selected as the serving node.
Note that the content mentioned above can exist in the form of a data chunk, which is used in some ICN architectures; these chunks are generated by dividing the original application-layer content object according to a certain size, which can be several MBs, such as 2MB, 5MB, 10MB, etc. Then a globally unique name is assigned to each chunk, and chunks are used as the basic unit of in-network caching. In the rest of this paper, the terms content and chunk will be used without distinction, both of which refer to the named data chunk.

3.2. Problem Formulation

We consider a network of arbitrary topology, represented by $G = (V, E)$, where $V$ is the set of nodes in the network, including the ICN content routers and origin servers, and $E$ is the set of links directly connecting them. In addition, we assume that each end user is connected to an edge router, and the request incoming rate at each edge router represents the aggregated request rate of all the users connected to it. We denote by $V_e \subseteq V$ the set of edge routers.
Each cache node $v \in V$ has a cache capacity denoted as $c_v$. Generally, when the cache system reaches a steady state, the space of each node is full. Thus, every time new content is inserted into a cache node, some old contents need to be evicted to free up the required storage space.
Each link $l_{i,j} \in E$ has a bandwidth capacity of $BW_{i,j}$ bits/s, where $i, j \in V$ are the two neighboring nodes connected by link $l_{i,j}$. We denote by $\varphi(l_{i,j})$ the bandwidth occupation ratio of the link. The residual bandwidth of link $l_{i,j}$ is denoted by $RestBw_{i,j}$ and can be calculated as:
$$RestBw_{i,j} = BW_{i,j} \times \left(1 - \varphi(l_{i,j})\right) \qquad (1)$$
We use $M$ to denote the set of content items available in the network. For each content $m \in M$, we use $s^m$ to denote its size (in bits). Each content has an origin server node, which is responsible for its permanent storage. Moreover, the access frequencies of different content items are assumed to follow the Zipf distribution [47]; that is, the $k$-th most popular content has a request probability proportional to $1/k^{\alpha}$ for some $\alpha > 0$. Early research found that the relative popularity of web requests follows a Zipf-like distribution with $\alpha$ varying from 0.64 to 0.83 [5]. Statistics gathered in [48] for a video-on-demand (VoD) service suggest a Zipf popularity distribution with $0.65 \le \alpha \le 1$. Since the in-network caching of ICN needs to handle the mixed traffic of different applications rather than a single one, this paper investigates the impact of a wider range of $\alpha$ values (0.4–1.2) on cache performance in the simulation section.
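To make the popularity model concrete, the following minimal Python sketch (illustrative only; function name and parameter values are our own) normalizes the Zipf weights into request probabilities:

import math

# Hypothetical sketch: the k-th most popular of K contents is requested
# with probability p(k) proportional to 1/k^alpha (Zipf distribution).
def zipf_probabilities(num_contents: int, alpha: float) -> list:
    weights = [1.0 / (k ** alpha) for k in range(1, num_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]

# With alpha = 0.8, the head of the catalog draws a disproportionate
# share of all requests compared with the tail.
probs = zipf_probabilities(100_000, alpha=0.8)
print(probs[0], probs[-1])  # head vs. tail request probability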
As for the requests, we assume that the total exogenous request arrival rate of the network is a constant value over a period of time, denoted as $R$. We use $r_u^m$ to denote the aggregate incoming request rate (in requests per second) for content $m$ at any edge node $u \in V_e$. Furthermore, we assume that the request arrival process at each edge node is a Poisson process.
As shown in Equation (2), we formulate the optimization problem whose objective is to minimize the overall average content transmission time $D$ for all users in the network and over all requested contents. Table 1 lists all the notations mentioned in this section.
$$\min D = \frac{1}{R} \sum_{u \in V_e} \sum_{m \in M} \sum_{v \in V} r_u^m \times h_v^m \times a_{u,v}^m \times t_{v,u}^m \qquad (2)$$
$h_v^m$ is the cache indication variable, with a value of 0/1 indicating whether content $m$ exists in node $v$'s local cache. $a_{u,v}^m$ is the variable that indicates whether node $v$ is selected as the serving node for the request; its value is also 0 or 1, depending on the replica selection strategy adopted. We assume that for any request, only one replica node is selected as the serving node.
Content transmission time is often directly related to the available bandwidth of the bottleneck link of the delivery path [49]. Referring to the modeling method of [50], the transmission time of content $m$ can be approximated as:
$$t_{v,u}^m = \frac{s^m}{RestBw_{v,u}} \qquad (3)$$
where $s^m$ is the size of content $m$ and $RestBw_{v,u}$ is the available bandwidth of the transmission path $p_{v,u}$. Assuming that the bottleneck link of path $p_{v,u}$ is denoted as $btl_{v,u}$, the value of $RestBw_{v,u}$ can be approximated as the residual bandwidth of link $btl_{v,u}$, which is defined as:
$$RestBw_{v,u} = BW_{btl_{v,u}} \times \left(1 - \varphi(btl_{v,u})\right) \qquad (4)$$
where $BW_{btl_{v,u}}$ is the bandwidth capacity of link $btl_{v,u}$, and $\varphi(btl_{v,u})$ is its bandwidth occupation ratio.
Based on the above analysis, the objective function can be converted into:
$$\min D = \frac{1}{R} \sum_{u \in V_e} \sum_{m \in M} \sum_{v \in V} r_u^m \times h_v^m \times a_{u,v}^m \times \frac{s^m}{BW_{btl_{v,u}} \times \left(1 - \varphi(btl_{v,u})\right)} \qquad (5)$$
subject to:
$$\sum_{m \in M} s^m h_v^m \le c_v, \quad \forall v \in V \qquad (6)$$
$$0 < \varphi(l_{i,j}) < 1, \quad \forall l_{i,j} \in E \qquad (7)$$
$$\sum_{v \in V} h_v^m \ge 1, \quad \forall m \in M \qquad (8)$$
$$\sum_{v \in V} a_{u,v}^m = 1, \quad \forall u \in V_e, \forall m \in M \qquad (9)$$
The decision variable in this optimization problem can be represented by a binary matrix $H = \left(h_v^m \in \{0,1\},\ v \in V,\ m \in M\right)$ of size $|V| \times |M|$, which specifies the contents of each cache in the network. Note that the goal of this paper is not to compute the optimal $H$ in an offline manner, which would introduce unacceptable computation and communication overhead. Instead, we dynamically adjust the values of some elements of $H$ (i.e., $h_v^m$) by caching contents at the on-path nodes during the content transmission process, thus continuously reducing the value of the objective function.

3.3. Design Principles of the Caching Strategy

The problem of minimizing the average content transmission time is essentially a static content placement problem, whose NP-completeness has been proven in [10]; thus, no polynomial-time solution is known, which is prohibitive especially for resource-constrained ICN routers. Therefore, we consider a heuristic way to solve this problem. Based on the analysis of the objective function and our experience, the following design principles for the caching strategy can be derived.
(1) Content with higher popularity should be cached preferentially. Higher popularity means a higher request rate for these contents, i.e., a larger number of requests. All else being equal, caching these contents brings a larger reduction in the average content transmission time.
Proof. 
In the topology shown in Figure 2, the request rates for contents $m_1$ and $m_2$ at user $u$ are denoted as $r_u^{m_1}$ and $r_u^{m_2}$, respectively. We assume that $r_u^{m_1} > r_u^{m_2}$. In addition, we assume that the two contents are of the same size and that the residual bandwidth of their corresponding transmission paths is also the same. That is:
$$s^{m_1} = s^{m_2} \qquad (10)$$
$$RestBw_{v_1,u} = RestBw_{v_2,u} \qquad (11)$$
The transmission time saved by caching content $m_1$ at node $i$ is:
$$\Delta t_1 = r_u^{m_1} \times s^{m_1} \times \left( \frac{1}{RestBw_{v_1,u}} - \frac{1}{RestBw_{i,u}} \right) \qquad (12)$$
The transmission time saved by caching content $m_2$ at node $i$ is:
$$\Delta t_2 = r_u^{m_2} \times s^{m_2} \times \left( \frac{1}{RestBw_{v_2,u}} - \frac{1}{RestBw_{i,u}} \right) \qquad (13)$$
From Equations (10)–(13), we can easily get $\Delta t_1 > \Delta t_2$; that is, caching more popular contents brings larger transmission time savings. □
(2) Content whose transmission path has lower residual bandwidth should also be cached preferentially. On the one hand, placing the content so that subsequent requests are served over high-bandwidth paths reduces the delivery time of those requests. On the other hand, it effectively reduces the traffic on upstream overloaded links, achieves load balancing between network links, and improves the overall throughput of the network.
Proof. 
Using the topology shown in Figure 2 as an example, we assume that $RestBw_{v_1,u} > RestBw_{v_2,u}$, i.e., the transmission path of content $m_1$ has higher residual bandwidth. In addition, we assume that the request rates and the data sizes of the two contents are the same, namely $r_u^{m_1} = r_u^{m_2}$ and $s^{m_1} = s^{m_2}$. As in Equations (12) and (13), the transmission time saved by caching content $m_1$ and $m_2$ at node $i$ can be calculated separately, which is not repeated here. We can easily derive that $\Delta t_1 < \Delta t_2$; that is, caching content with lower residual bandwidth on its transmission path brings larger transmission time savings. □
(3) Content with a longer transmission distance should also be cached preferentially, because it occupies more network bandwidth. We use the number of hops traversed during content transmission to represent the transmission distance and denote it as $d_{v,u}$. The total traffic load incurred by any content request can be defined by Equation (14), which is the product of the request rate, the content size, and the transmission distance; it can also be seen as the total amount of network bandwidth occupied by the transfer of the requested content. Other things being equal, the shorter the transmission distance, the less bandwidth the transfer consumes, which means that the available bandwidth for other content flows sharing the same links is increased, thereby reducing the average content transmission time.
$$LoadIncur_u^m = r_u^m \times s^m \times d_{v,u} \qquad (14)$$
Proof. 
Consider the network shown in Figure 3, assuming that contents $m_1$ and $m_2$ have the same size and request rate, and that the available bandwidth of their transmission paths is the same. According to the conclusions of (1) and (2) above, these two contents should have the same cache priority at node $i$, since the transmission time saving of caching either of them is the same. However, considering that content $m_2$ has a longer transmission distance, its total bandwidth consumption is higher. In addition to reducing the transmission time of its own subsequent requests, caching $m_2$ at node $i$ also increases the available bandwidth for other content flows ($m_3$ in the figure) that share the same upstream link ($l_{v_2,j}$, drawn with a red line in the figure). As a result, the transmission time of these contents can also be reduced. To sum up, caching $m_2$ achieves a larger reduction in the overall content transmission time. □
Based on the above analysis, we conclude that all the above parameters should be considered comprehensively in the cache decision process, including content popularity, the residual bandwidth and distance of the transmission path, etc. In this way, the traffic load incurred by the requests for these content replicas can be effectively balanced between different network links, so as to increase their available transmission bandwidth and reduce the average content transmission time.
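These principles can also be checked numerically. The following sketch (with hypothetical rates and bandwidths, not values from the paper) evaluates the time saving of Equations (12) and (13) and confirms principles (1) and (2):

# Illustrative check of design principles (1) and (2); all values hypothetical.
def time_saving(req_rate: float, size_bits: float,
                restbw_full_path: float, restbw_cached_path: float) -> float:
    # Eqs. (12)/(13): saving = r * s * (1/RestBw_{v,u} - 1/RestBw_{i,u})
    return req_rate * size_bits * (1.0 / restbw_full_path - 1.0 / restbw_cached_path)

MB = 8e6  # one megabyte expressed in bits
# Principle (1): a higher request rate yields a larger saving, all else equal.
assert time_saving(10, 2 * MB, 100e6, 400e6) > time_saving(5, 2 * MB, 100e6, 400e6)
# Principle (2): lower residual bandwidth on the full path yields a larger saving.
assert time_saving(10, 2 * MB, 50e6, 400e6) > time_saving(10, 2 * MB, 100e6, 400e6)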

4. Path Load-Aware Based Caching (PLABC) Strategy

In this section, we propose a path load-aware based caching strategy, which is an on-path caching scheme [51]. We aim to minimize the average content transmission time of all requested contents, and the idea is to balance the traffic load between network links and minimize the bandwidth consumption of content transmission by carefully choosing the cache location of the content replicas, thereby increasing the available bandwidth of subsequent requests and reducing the content transmission time.
To achieve this, a utility is defined for each content whose value depends on the difference between the benefit of cache placement and the replacement cost of the corresponding node, and the intermediate node with the highest utility value will be selected as the caching node. In the calculation of these values, the content-related information (e.g., content size and popularity) and the network-related information (e.g., transmission path distance and load) are comprehensively considered.
In the remainder of this section, we first introduce the key components of our caching strategy, including popularity estimation and upstream path load estimation. Second, we introduce the calculation methods for the cache placement benefit, the cache replacement cost, and the cache utility in turn. Then, a max-utility-based caching node selection method is proposed. Finally, we introduce the implementation of the caching strategy under the network architecture introduced in Section 3, as well as an overhead analysis of our scheme.

4.1. Popularity Estimation

For in-network nodes, accurately estimating the popularity of all contents is a difficult task, for two reasons: the number of items in the content space is massive, and content popularity changes dynamically over time. Therefore, having every node maintain popularity information for all contents would lead to unacceptable computation and memory overhead. Due to the resource constraints of in-network nodes and the performance requirements of high-speed packet forwarding, the calculation of content popularity should be both lightweight and fast. In this article, we suggest that each node maintain request information only for the contents that exist in its local cache. Therefore, for the retrieval process of a given content, the popularity information can only be obtained from the serving node (i.e., the origin server or an upstream cache hit node).
For any content $m$ in node $v$'s local cache, we define its popularity $p_v^m$ as the ratio of the number of requests for content $m$ to the number of all requests received by node $v$. Considering the time-varying nature of content popularity, we count the number of requests for different contents in a periodic manner, where the statistical period (denoted as $T_p$) can be 1 min, 1 h or even 1 week. Then $p_v^m$ can be defined as:
$$p_v^m = \frac{N_v^m}{N_v}, \quad m \in M_v, v \in V \qquad (15)$$
$M_v$ represents the set of contents cached at node $v$. $N_v^m$ and $N_v$ represent the number of requests received by node $v$ in time period $T_p$ for content $m$ and for all locally cached contents, respectively. Algorithm 1 shows the detailed process of estimating the popularity of content $m$ at node $v$.
Algorithm 1: Popularity Estimation
Input: Content name $m$
Output: Popularity of content $m$ at node $v$: $p_v^m$
Parameters: Statistical period $T_p$
1:  initialization: $N_v^m \leftarrow 0$, $N_v \leftarrow 0$, $t \leftarrow 0$, $p_v^m \leftarrow 0$
2:  while true do
3:      Cache node $v$ receives a request for content $m$;
4:      if content $m$ is in node $v$'s local cache then
5:          $N_v^m \leftarrow N_v^m + 1$;
6:          $N_v \leftarrow N_v + 1$;
7:      else
8:          Forward the request packet to the next hop;
9:      end if
10:     if $t \ge T_p$ then
11:         $p_v^m \leftarrow N_v^m / N_v$;
12:         $N_v^m \leftarrow 0$, $N_v \leftarrow 0$, $t \leftarrow 0$;
13:     end if
14: end while
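For illustration, a minimal Python sketch of Algorithm 1 is given below; the class and field names are our own and not part of the paper:

# Minimal sketch of Algorithm 1 (hypothetical names): each node counts
# requests only for locally cached contents and refreshes its popularity
# estimates once per statistical period T_p.
class PopularityEstimator:
    def __init__(self, period_s: float):
        self.period_s = period_s   # statistical period T_p
        self.hits = {}             # N_v^m per cached content m
        self.total = 0             # N_v: all requests served from the cache
        self.popularity = {}       # p_v^m from the previous period

    def on_request(self, name: str, in_cache: bool) -> None:
        if in_cache:               # cache misses are just forwarded upstream
            self.hits[name] = self.hits.get(name, 0) + 1
            self.total += 1

    def end_of_period(self) -> None:
        if self.total > 0:         # Eq. (15): p_v^m = N_v^m / N_v
            self.popularity = {m: n / self.total for m, n in self.hits.items()}
        self.hits, self.total = {}, 0   # reset counters for the next period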
Furthermore, assuming that the same content has similar popularity at different nodes, then for any intermediate node $i$ on the path $p_{v,u}$, we have $p_i^m \approx p_v^m$, where $v$ and $u$ represent the serving node and the consumer node, respectively, and $p_{v,u}$ represents the transmission path between them, as shown in Figure 4. Further, the request rate of content $m$ at node $i$ can be estimated as:
$$\lambda_i^m = \lambda_i \times p_i^m \approx \lambda_i \times p_v^m, \quad m \in M_v, i \in p_{v,u}, v \in V \qquad (16)$$
where $\lambda_i$ is the total request rate of all the contents cached at node $i$, which can be calculated as in Equation (17):
$$\lambda_i = \frac{N_i}{T_p}, \quad i \in p_{v,u} \qquad (17)$$
In Equation (17), $T_p$ represents the specified statistical period as described above, and $N_i$ represents the total number of requests for all locally cached contents received by node $i$ within $T_p$.

4.2. Upstream Path Load Estimation

The upstream path (i.e., the path segment from the serving node to the current node, as shown in Figure 4) consists of a set of links with different traffic loads. To this end, a variable called the upstream path load is defined to quantify the overall load level of the upstream path; its value is jointly determined by the average and maximum link loads of all the links included in the upstream path. For any intermediate node $i$ on path $p_{v,u}$, its upstream path is denoted as $p_{v,i}$, and its load can be defined as:
$$\varphi(p_{v,i}) = \alpha \times \varphi_{v,i}^{avg} + \beta \times \varphi_{v,i}^{btl}, \quad i \in p_{v,u} \qquad (18)$$
where
$$\alpha + \beta = 1, \quad 0 \le \alpha, \beta \le 1 \qquad (19)$$
$$\varphi_{v,i}^{avg} = \sum_{l_i \in p_{v,i}} \varphi(l_i) / N_{v,i} \qquad (20)$$
$$\varphi_{v,i}^{btl} = \max\{\varphi(l_i)\}, \quad l_i \in p_{v,i} \qquad (21)$$
Specifically, $\varphi_{v,i}^{avg}$ represents the average load of all the links of the upstream path $p_{v,i}$, $\varphi_{v,i}^{btl}$ represents the load of the bottleneck link among them (i.e., the link with the maximum load), and $\alpha$ and $\beta$ represent their respective weights. $\varphi(l_i)$ represents the load of link $l_i$, and $N_{v,i}$ is the number of links contained in path $p_{v,i}$. $\varphi(l_i)$ can be measured by each router periodically calculating the bandwidth occupation ratio of each of its output ports, which is already implemented in many SDN-based switches. Algorithm 2 shows the detailed process of upstream path load estimation.
Algorithm 2: Upstream Path Load Estimation
Input: Upstream path $p_{v,i}$
Output: Upstream path load $\varphi(p_{v,i})$
Parameters: $\alpha$, $\beta$
1:  initialization: $TotalLoad \leftarrow 0$, $AvgLoad \leftarrow 0$, $MaxLoad \leftarrow 0$, $LinkNum \leftarrow 0$
2:  for each link $l_i$ on the upstream path $p_{v,i}$ do
3:      Get $\varphi(l_i)$ from the local port statistics of the router;
4:      if $\varphi(l_i) > MaxLoad$ then
5:          $MaxLoad \leftarrow \varphi(l_i)$;
6:      $TotalLoad \leftarrow TotalLoad + \varphi(l_i)$;
7:      $LinkNum \leftarrow LinkNum + 1$;
8:  end for
9:  $AvgLoad \leftarrow TotalLoad / LinkNum$;
10: $\varphi(p_{v,i}) \leftarrow \alpha \times AvgLoad + \beta \times MaxLoad$;
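A compact Python rendering of Algorithm 2, assuming the per-link loads have already been read from the router's port statistics (the function name and example values are hypothetical):

# Sketch of Algorithm 2: the upstream path load is a weighted mix of the
# average and maximum link loads (Eq. (18)).
def upstream_path_load(link_loads: list, alpha: float = 0.5,
                       beta: float = 0.5) -> float:
    assert abs(alpha + beta - 1.0) < 1e-9          # Eq. (19): alpha + beta = 1
    avg_load = sum(link_loads) / len(link_loads)   # phi_avg, Eq. (20)
    max_load = max(link_loads)                     # phi_btl, Eq. (21)
    return alpha * avg_load + beta * max_load

print(upstream_path_load([0.2, 0.9, 0.4]))  # 0.5 * 0.5 + 0.5 * 0.9 = 0.7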

4.3. Cache Benefit Calculation

One of the most important benefits of in-network caching is to reduce the traffic load on upstream links by serving requests at intermediate nodes, i.e., the so-called “storage for bandwidth”. From a load balancing perspective, we want to reduce traffic over overloaded links while increasing traffic over underloaded links. The way we achieve this is to set a higher cache benefit value for those contents transmitted through the overloaded links, so that they have a higher probability of being cached at downstream nodes, thereby reducing the number of subsequent requests for these contents to be forwarded through the upstream overloaded links. Specifically, the cache benefit value is jointly determined by parameters such as content popularity, transmission distance, and transmission path load, so as to ensure that it can achieve a high cache hit ratio while balancing the load between network links.
The cache benefit of placing a new content $m$ at node $i$ is defined in Equation (22); its value is jointly determined by two parameters: the reduction of network load (denoted as $LoadReduction_i^m$) and the transmission cost of the upstream path (denoted as $TransCost_{v,i}$).
$$CacheBenefit_i^m = LoadReduction_i^m \times TransCost_{v,i}, \quad i \in p_{v,u} \qquad (22)$$
$LoadReduction_i^m$ represents the amount of traffic that could potentially be reduced if content $m$ is cached at node $i$, as defined in Equation (23). In this equation, $\lambda_i^m$ is the request rate of content $m$ at node $i$, which can be estimated from its popularity, as shown in Equation (16). $s^m$ is the size of content $m$ (in bits). $d_{v,i}$ is the distance (hop count) between the serving node $v$ and the current node $i$, which is also equal to the number of links included in the upstream path $p_{v,i}$.
$$LoadReduction_i^m = \lambda_i^m \times s^m \times d_{v,i}, \quad i \in p_{v,u} \qquad (23)$$
$TransCost_{v,i}$ represents the transmission cost per unit of traffic on the path $p_{v,i}$; its value depends on the upstream path load $\varphi(p_{v,i})$. Our idea is that, for a given path, the heavier its load, the higher its transmission cost should be. Therefore, the transmission cost of path $p_{v,i}$ can be defined by Equation (24):
$$TransCost_{v,i} = \left\lfloor \frac{1}{1 - \varphi(p_{v,i})} \right\rfloor^{\gamma} \qquad (24)$$
Typically, we can set $\gamma = 1$, so that when $\varphi(p_{v,i}) < 50\%$, $TransCost_{v,i} = 1$; that is, no load balancing is considered for lightly loaded paths in this process. When $\varphi(p_{v,i}) \ge 50\%$, the cost grows with the load; for example, when $\varphi(p_{v,i}) = 90\%$, $TransCost_{v,i} = 10$, meaning that the cost of using this path for content transfer is 10 times that of using an idle path.
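The following sketch (hypothetical inputs) reproduces the cache benefit calculation of Equations (22)–(24), including the worked values for $\gamma = 1$:

import math

# Sketch of Eqs. (22)-(24). The floor ensures lightly loaded paths
# (load below 50% with gamma = 1) cost exactly 1 unit of transmission.
def trans_cost(path_load: float, gamma: float = 1.0) -> float:
    return math.floor(1.0 / (1.0 - path_load)) ** gamma   # Eq. (24)

def cache_benefit(req_rate: float, size_bits: float, hops: int,
                  path_load: float) -> float:
    load_reduction = req_rate * size_bits * hops           # Eq. (23)
    return load_reduction * trans_cost(path_load)          # Eq. (22)

print(trans_cost(0.4))   # 1: no load-balancing pressure on an idle path
print(trans_cost(0.9))   # 10: a congested path is 10x more expensive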

4.4. Cache Replacement Cost Calculation

Due to the huge number of content items and the limited cache capacity, when the cache system reaches a stable state, the cache space is usually full. Thus, the insertion of new content into the cache is always accompanied by the eviction of some cached contents, so as to make room for the newly arriving one. This process is the so-called "cache replacement". Contrary to the benefits of inserting new contents into the cache, evicting contents from the cache increases the overall network load (i.e., bandwidth consumption) and transmission cost.

4.4.1. Eviction Cost of the Content

We first define an eviction cost for each content in cache node $i$, which represents the potential increase in the overall transmission cost if this content is evicted from the cache. At the time when content $m$ is first inserted into node $i$'s local cache, its eviction cost and its cache benefit at this node are of equal value; that is, both are equal to the product of the load reduction and the upstream path transmission cost. Thus, similar to Equation (22), we obtain:
$$ec_i^m = LoadReduction_i^m \times TransCost_{v,i}, \quad i \in p_{v,u} \qquad (25)$$
It should be noted that since the upstream link load and transmission cost may change dynamically over time, the eviction cost of each cached content should also change dynamically with it. However, such an update mechanism may introduce high computational overhead to nodes. In addition, the network traffic usually presents obvious time-dependent characteristics, such as significant increases in traffic load on certain links during certain periods of the day [52]. However, the time scale over which the link load changes significantly is often much larger than the residence time of the content in the LRU-based cache. Therefore, in this paper, the eviction cost of each content is assumed to remain unchanged after it is inserted into the cache.

4.4.2. Replacement Cost of the Node

Based on the above analysis, we now define the cache replacement cost of node $i$, denoted as $ReplacementCost_i$; it depends on which contents currently in the cache are evicted. If the remaining cache space of node $i$ is enough to accommodate the newly arriving content, then no content will be evicted; thus, $ReplacementCost_i$ is zero. Otherwise, some contents need to be removed from the cache to free up the required storage space, and the replacement cost of node $i$ is equal to the sum of the eviction costs of all the evicted contents. Algorithm 3 describes the calculation of the node replacement cost in detail. Therefore, we obtain:
$$ReplacementCost_i = \begin{cases} \sum_{m' \in M_i^{evict}} ec_i^{m'}, & \text{if } cs_i^{rest} < s^m \\ 0, & \text{if } cs_i^{rest} \ge s^m \end{cases} \qquad (26)$$
where $cs_i^{rest}$ represents the available cache space of node $i$, and $s^m$ is the size of the newly arriving content $m$. $M_i^{evict}$ represents the set of contents that need to be evicted from node $i$'s local cache, and it needs to satisfy the following constraint:
$$\sum_{m' \in M_i^{evict}} s^{m'} \ge s^m - cs_i^{rest} \qquad (27)$$
Algorithm 3: Replacement Cost Calculation
Input: Size of content $m$: $s^m$; remaining cache space: $cs_i^{rest}$
Output: $ReplacementCost_i$
Parameters: Cache replacement policy: LRU
1:  initialization: $ReplacementCost_i \leftarrow 0$;
2:  if $cs_i^{rest} < s^m$ then
3:      $AvailableCacheSpace \leftarrow cs_i^{rest}$;
4:      while $AvailableCacheSpace < s^m$ do
5:          Get a to-be-evicted content $m'$ from the local cache according to LRU;
6:          $AvailableCacheSpace \leftarrow AvailableCacheSpace + s^{m'}$;
7:          $ReplacementCost_i \leftarrow ReplacementCost_i + ec_i^{m'}$;
8:      end while
9:  else
10:     $ReplacementCost_i \leftarrow 0$;
11: end if
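A minimal Python sketch of Algorithm 3, assuming the cache is kept as an LRU-ordered mapping from content name to (size, eviction cost); the data structure and names are our own:

from collections import OrderedDict

# Sketch of Algorithm 3 under LRU: walk the queue from the least-recently-
# used end and sum the eviction costs (Eq. (26)) of the contents that would
# be removed to make room for the new content.
def replacement_cost(lru: OrderedDict,   # name -> (size, eviction_cost), oldest first
                     rest_space: int, new_size: int) -> float:
    if rest_space >= new_size:
        return 0.0                       # enough free space: nothing is evicted
    cost, available = 0.0, rest_space
    for name, (size, eviction_cost) in lru.items():
        if available >= new_size:        # constraint (27) is now satisfied
            break
        available += size
        cost += eviction_cost
    return cost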

4.5. Max-Utility-Based Caching Node Selection

The utility of caching content $m$ at node $i$ is defined as the difference between the corresponding cache benefit and replacement cost, as shown in Equation (28):
$$CacheUtility_i^m = CacheBenefit_i^m - ReplacementCost_i, \quad i \in p_{v,u} \qquad (28)$$
Obviously, caching operations should only be performed when the cache utility is greater than zero, because this means that replacing the old contents in the cache with the new content $m$ brings higher benefits.
However, among the nodes on the transmission path $p_{v,u}$, there may be more than one node with a cache utility greater than zero. To avoid cache redundancy and to utilize storage resources more efficiently, only one node on the transmission path is selected to cache the content. To minimize the overall transmission cost, the node with the maximum cache utility is selected as the caching node. Therefore, we obtain:
$$CachingNode = \underset{i \in p_{v,u}}{\arg\max}\ CacheUtility_i^m \qquad (29)$$
Algorithm 4 describes the process of caching node selection in detail.
Algorithm 4: Max-Utility-Based Caching Node Selection
Input: Transmission path $p_{v,u}$
Output: $CacheNode$
1:  initialization: $MaxUtility \leftarrow 0$, $MaxNode \leftarrow 0$, $CacheNode \leftarrow 0$;
2:  for each node $i$ on the path $p_{v,u}$ do
3:      Calculate $CacheBenefit_i^m$;
4:      Calculate $ReplacementCost_i$;
5:      $CacheUtility_i^m \leftarrow CacheBenefit_i^m - ReplacementCost_i$;
6:      if $CacheUtility_i^m > MaxUtility$ then
7:          $MaxUtility \leftarrow CacheUtility_i^m$;
8:          $MaxNode \leftarrow i$;
9:      end if
10: end for
11: if $MaxUtility > 0$ then
12:     $CacheNode \leftarrow MaxNode$;
13: else
14:     Do not cache content $m$ on path $p_{v,u}$;
15: end if
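Algorithm 4 thus reduces to a single pass over the path. A Python sketch with a hypothetical per-node record structure:

# Sketch of Algorithm 4: pick the on-path node whose cache utility
# (benefit minus replacement cost, Eq. (28)) is largest and positive.
def select_caching_node(path_nodes: list):
    # Each element is a hypothetical per-node record:
    # {"id": ..., "benefit": CacheBenefit_i^m, "repl_cost": ReplacementCost_i}
    best_id, best_utility = None, 0.0
    for node in path_nodes:
        utility = node["benefit"] - node["repl_cost"]
        if utility > best_utility:
            best_utility, best_id = utility, node["id"]
    return best_id   # None means: do not cache m anywhere on this path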

4.6. Implementation

As mentioned in Section 3.1, the size of the content (i.e., the named data chunk) is assumed to be several MBs. Due to the limitations of the underlying network infrastructure, the data of each content needs to be split into a series of packets for transmission. The receiver-driven transmission control protocol is widely used in ICN, in which the receiver controls the sending rate of the request messages, or the amount of data specified in each request message, according to the congestion state of the transmission path. In our implementation, the cache utility value of each node is calculated during the forwarding of the first data packet of each content, and the node corresponding to the maximum cache utility is recorded. The detailed process of the cache decision for a given content $m$ is as follows.
(1)
Forwarding process of the first request packet
As shown in Figure 5, we add two cache auxiliary fields to the header of the request packet, which are used to record the maximum cache utility value and the ID of its corresponding node, respectively. We denote by $req_1^m$ the first request packet of content $m$. For $req_1^m$, both fields are set to null since the cache decision has not yet been made.
(2)
Forwarding process of the first data packet
As shown in Figure 6, when the serving node receives $req_1^m$, it needs to return the requested data to the receiver, which we denote as $data_1^m$. In addition to the max utility and node ID fields mentioned above, we also need to add the locally maintained content size and popularity information, as well as the hop count and path load fields updated during transmission, to the header of the data packet. The information carried in these fields is used by each on-path node to calculate the cache utility of the content.
The max utility field in the data packet header is initialized to zero. During the hop-by-hop forwarding of $data_1^m$, each node calculates the utility of caching the content locally. If the utility value is greater than zero, the node temporarily decides to cache $data_1^m$ and allocates memory space for assembling the subsequent packets into a complete chunk. In addition, if the utility value of the current node is greater than the $MaxUtility$ in the packet header, the corresponding fields are updated. For example, since $CacheUtility_3 > MaxUtility$ and $CacheUtility_3 > 0$, we cache $data_1^m$ in $R_3$ and update $MaxUtility = CacheUtility_3$, $NodeID = R_3$. Therefore, when $data_1^m$ reaches the receiver, the cache auxiliary fields record the maximum cache utility value of all the nodes along the transmission path and the ID of its corresponding node, i.e., $MaxUtility = 3$, $NodeID = R_2$. By now, the cache decision process is complete, and we choose the node with the largest utility value (i.e., $R_2$) as the caching node.
(3)
Forwarding process of subsequent request packets
For subsequent requests, the cache fields of the packet header are directly set to the maximum utility value and its corresponding node ID, i.e., $MaxUtility = 3$, $NodeID = R_2$. During the hop-by-hop forwarding of $req_2^m$, each node checks whether its own ID is consistent with the node ID in the request header. If not, it needs to delete the data temporarily cached locally in step (2) and terminate the data assembling process of the corresponding content. For example, router $R_3$ deletes the cached $data_1^m$ of content $m$, quits assembling chunk $m$ and releases the allocated memory. This process is shown in Figure 7.
(4)
Forwarding process of subsequent data packets
For subsequent data packets, set the cache fields in the packet header to be the same as the request packet. The intermediate nodes directly match the node ID in the packet header to quickly determine whether the packet needs to be cached locally. When all the data packets of the content have been forwarded and assembled, the cache node can register this content with NRS and provide data services. This process is shown in Figure 8.
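To summarize steps (1)–(4), the sketch below models the cache auxiliary fields and the per-hop update performed while forwarding the first data packet. The exact field layout is not specified above; in particular, splitting the path load into average and maximum components is our assumption, chosen so that Equation (18) can be evaluated at each hop:

from dataclasses import dataclass

# Hypothetical layout of the cache auxiliary fields carried in-band.
# Request packets and subsequent data packets carry only the first two
# fields; the first data packet of a content also carries the content
# metadata and the path statistics accumulated hop by hop.
@dataclass
class CacheAuxFields:
    max_utility: float          # best cache utility seen so far on the path
    node_id: int                # ID of the node holding that utility
    content_size: int = 0       # s^m, set by the serving node
    popularity: float = 0.0     # p_v^m, set by the serving node
    hop_count: int = 0          # d_{v,i}, incremented at every hop
    avg_path_load: float = 0.0  # running average of link loads (Eq. (20))
    max_path_load: float = 0.0  # bottleneck link load (Eq. (21))

def update_at_hop(f: CacheAuxFields, link_load: float,
                  my_utility: float, my_id: int) -> None:
    # Per-hop update during the forwarding of the first data packet.
    f.avg_path_load = (f.avg_path_load * f.hop_count + link_load) / (f.hop_count + 1)
    f.max_path_load = max(f.max_path_load, link_load)
    f.hop_count += 1
    if my_utility > f.max_utility:      # this node becomes the best candidate
        f.max_utility, f.node_id = my_utility, my_id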

4.7. Overhead Analysis

4.7.1. Storage Overhead

As shown in Figure 9, in our design, each node needs to maintain necessary attribute information for locally cached contents, including content size, request frequency, eviction cost and data address. Each information record is created along with the cache of its corresponding content and deleted as the content is evicted. Although the overhead of maintaining this information increases linearly with the number of cached contents, we point out that the storage space occupied by recording this information is completely affordable compared to the node’s cache capacity.

4.7.2. Communication Overhead

The communication overhead introduced by our scheme mainly comes from the cache auxiliary fields added in the packet header, including seven fields in the data packet of the first response, and two fields in other packets, which are only a few bits of overhead. Furthermore, our algorithm does not introduce any overhead caused by out-of-band signaling interactions. Compared with the bandwidth savings brought by our algorithm (which will be introduced in Section 5), the communication cost incurred by these additional fields is negligible and worthwhile.

4.7.3. Computation Overhead

In our design, the main computational complexity is caused by each node calculating the cache utility of each passing content. However, the parameters used in this calculation are directly obtained by parsing the data packet header, which only involves simple multiplication of several parameters and does not involve any complex operations, such as exponentiation, lookup or sorting. In addition, our scheme adopts the LRU replacement policy with $O(1)$ complexity, which also ensures that our scheme has low computational overhead. More importantly, our cache decision is made at the content level; that is, the above calculation is only performed during the transmission of the first data packet of each content, while only simple field matching operations are performed on the other data packets.

5. Performance Evaluation

In this section we present the results of our simulations. We first introduce the experimental setup. Then we investigate the performance of our proposed caching strategy and compare it against several representative related works under various experiment parameters. Finally, we present the analysis and discussion of the simulation results.

5.1. Experimental Setup

We evaluate the performance of our proposed strategy with trace-driven simulations, using the Icarus simulator [53]. Icarus is a Python-based discrete-event simulator, which is used to implement and evaluate different ICN caching and replacement schemes. We first implement the name resolution-based request routing mechanism, as well as our proposed caching strategy PLABC in Icarus. In addition, we implement the representative approaches mentioned in Section 2, including LCE, LCD [9], ProbCache [41], CAC [19] and MAGIC [18]. We compare the caching performance of PLABC with these strategies under different experiment parameters. Table 2 lists the main parameters involved in the simulation.
For the network configuration, we use three real-world network topologies, Abovenet (182 nodes), Tiscali (240 nodes) and Telstra (318 nodes), from the Rocketfuel dataset [54]. We assume that the NRS is deployed in a centralized manner, and each request is forwarded along the shortest path towards its nearest replica node. For simplicity, we assume that all the network links have an equal bandwidth capacity of 1 Gbps. All the in-network nodes are equipped with a cache of the same size. Further, we assume that all the nodes adopt the LRU policy to manage the content replicas in the local cache, unless a replica management method is specified by the caching strategy, as in CAC and MAGIC.
The total number of requestable contents in the network is set to $3 \times 10^5$. The popularity of different content items is assumed to follow a Zipf distribution [47], and the Zipf exponent (denoted as $\alpha$), which represents the content popularity skewness, is varied from 0.4 to 1.2. Moreover, we assume that all the content items have the same size, which is 10 MB.
A total of $6 \times 10^5$ requests are generated in each experiment scenario. The first $3 \times 10^5$ requests are used to allow the caches to converge, while the remaining requests are used to record the evaluation metrics. We assume that the request arrival process at the edge nodes is a Poisson process, and the total number of requests generated per unit time is varied from 100 to 500.
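For reference, a generic Python sketch (not the Icarus API) of how such a workload can be generated, using the catalog size, Zipf popularity and Poisson arrivals described above:

import bisect
import itertools
import random

# Hedged workload sketch: 6x10^5 requests over a catalog of 3x10^5 contents,
# Zipf-distributed popularity and Poisson request arrivals.
def generate_requests(n_requests: int = 600_000, n_contents: int = 300_000,
                      alpha: float = 0.8, rate: float = 500.0):
    # Zipf weights and their cumulative sums, computed once.
    weights = [1.0 / (k ** alpha) for k in range(1, n_contents + 1)]
    cum = list(itertools.accumulate(weights))
    total, t = cum[-1], 0.0
    for _ in range(n_requests):
        t += random.expovariate(rate)                    # Poisson arrivals
        k = bisect.bisect(cum, random.random() * total)  # Zipf-distributed rank
        yield t, k                                       # (timestamp, content id)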

5.2. Simulation Results

We set different experiment scenarios by adjusting the values of parameters including network topology, Zipf exponent α , cache size and request rate. For each experiment scenario, we use four metrics to quantitatively analyze the performance of different strategies, including cache hit ratio, average link load, load balancing degree and average content transmission time. Each experiment is performed 10 times and the average value of the results is recorded.
For the convenience of explanation, we present the experiment results according to different evaluation metrics in the following subsections.

5.2.1. Cache Hit Ratio

The cache hit ratio is defined as the ratio of requests served by the cache nodes to the total number of requests from all users. The remaining requests will be forwarded to the origin server, hence this metric can also reflect the effect of in-network caching on server load reduction.
We first investigate the effect of different values of $\alpha$ on the cache hit ratio of different caching strategies. The relative cache size and the request rate are set to 0.2 and 500, respectively. We choose the value range of $\alpha$ to be 0.4–1.2, which covers the popularity distributions of different types of application traffic. A larger value of $\alpha$ means that more requests are concentrated on a small number of popular contents, i.e., the proportion of repeated requests is higher, so the cache hit ratio is also higher. Figure 10 confirms this conclusion: in all three simulated topologies, the cache hit ratio of all strategies increases with the value of $\alpha$.
As shown in Figure 10, both PLABC and MAGIC achieve a considerably higher cache hit ratio than the other strategies, and this holds for all topologies used. This is expected, since they both consider the cost of cache replacement in the cache decision process, thus avoiding popular content being evicted from the cache. ProbCache performs worse than PLABC since its cache decision is made without considering content popularity. The remaining strategies, i.e., LCD, LCE and CAC, all perform poorly in terms of the cache hit ratio. Similar to ProbCache, neither LCD nor LCE explicitly records the popularity of content, so neither can accurately identify and cache truly popular content. As for CAC, its basic principle is to cache content whose transmission path is more congested, rather than more popular content, which dooms it to a low hit ratio.
Figure 11 shows the cache hit ratio of different caching strategies when the cache capacity varies. The Zipf parameter $\alpha$ and the total request rate are fixed at 0.8 and 500, respectively. The results demonstrate that all simulated strategies achieve high cache hit ratios when the cache capacity is large. Obviously, the increase in the cache hit ratio is not linear in the cache size, i.e., the growth rate of the cache hit ratio gradually decreases. In addition, the differences in cache hit ratio between strategies are basically consistent with the above analysis. Under different cache sizes, PLABC and MAGIC achieve the highest cache hit ratio, while LCE, LCD and CAC achieve the lowest. Note that when the cache size is small, PLABC performs slightly better than MAGIC. The main reason is that MAGIC adopts a replacement policy based on placement gain ranking, which makes it difficult to evict outdated popular content from the cache, resulting in cache redundancy and inefficient use of cache resources. In contrast, we adopt a simple LRU-based cache management policy, so that content with decreasing popularity can be deleted in time to free up cache space.
Figure 12 presents the cache hit ratio results under different request rates, while the Zipf parameter $\alpha$ and the relative cache size are set to 0.8 and 0.2, respectively. Generally speaking, the hit ratio of a good caching strategy should not be affected by the link load; that is, it should remain stable under different request rates. Figure 12 shows that all simulated strategies can guarantee a certain cache hit ratio at any request rate. As mentioned above, the hit ratio of PLABC is always slightly higher than that of MAGIC due to the small cache size set in this experiment. It is worth noting that under the Tiscali topology, the difference between MAGIC and PLABC is smaller; this is mainly because the topologies differ in the proportion of cache nodes and the number of links, which slightly affects the performance of the different strategies but does not change our conclusions.

5.2.2. Average Link Load

The average link load is the traffic forwarded by each link per unit time, averaged over all links and over the simulation time. It is an effective metric for showing the traffic mitigation achieved by in-network caching schemes, as it reflects the overall load level of the network links. As mentioned in Section 3, the link load incurred by transferring a content depends on the transmission distance and the content size. Since all contents are set to the same size throughout our simulation, the average link load depends primarily on the average content transmission distance, which is in turn determined by the cache hit ratio and the hit location.
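Assuming the simulator records the total bits forwarded on each link, the metric can be computed as in the following sketch (the data layout is our assumption for illustration):

```python
def average_link_load(bits_per_link, sim_duration):
    """Average link load: the mean over all links of
    (total bits forwarded on the link) / (simulation duration).

    bits_per_link: dict mapping a link (i, j) -> bits forwarded on it.
    sim_duration: simulation time in seconds, giving loads in bit/s.
    """
    loads = [bits / sim_duration for bits in bits_per_link.values()]
    return sum(loads) / len(loads)
```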
Figure 13 shows that the average link load of all strategies decreases as α increases, mainly because the cache hit ratio rises: more requests are served at cache nodes, which reduces the average transmission distance of the requested contents. The average link load of the different strategies is very close, but PLABC still performs slightly better than the others. Note that for the CAC strategy the average link load decreases noticeably more slowly. Although the cache hit ratio of CAC is similar to that of LCE and LCD in this experiment, its average transmission distance is longer because it preferentially caches, at downstream nodes, the content with the more congested transmission paths rather than the more popular content.
Figure 14 shows that the average link load of all strategies also decreases gradually as the cache capacity grows, again because the cache hit ratio increases. PLABC improves the average link load over MAGIC by approximately 6.1%, 5.1%, and 6.2% in the three topologies, respectively, which indicates that the average cache hit distance of PLABC is shorter. This is mainly because we use a simple LRU queue to manage cached contents, which prevents outdated popular content from occupying the cache space indefinitely. An interesting phenomenon is that although the cache hit ratio of LCD is very low, it achieves the second-best average link load; since LCD moves a copy one hop closer to the requester on each hit, popular content gradually migrates toward the network edge, so the hits it does achieve occur over short distances.
A higher request rate means that more traffic must be transmitted over the network; accordingly, the link load is heavier and the links are more likely to become congested. As shown in Figure 15, the average link load of all strategies therefore increases linearly with the request rate.

5.2.3. Load Balancing Degree

The load balancing degree is defined as the standard deviation of the link load. We use this metric to evaluate whether a caching strategy can balance the load across network links. According to the analysis in Section 3, load balancing increases the average throughput of content download flows, so this metric also reflects a caching strategy's ability to reduce content transmission time.
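Under the same assumed per-link traffic counters as above, the load balancing degree is simply the standard deviation of the per-link loads:

```python
import statistics

def load_balancing_degree(bits_per_link, sim_duration):
    """Load balancing degree: the standard deviation of per-link loads.
    A lower value means the traffic is spread more evenly across links."""
    loads = [bits / sim_duration for bits in bits_per_link.values()]
    return statistics.pstdev(loads)  # population std dev over all links
```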
Figure 16 shows the load balancing degree of the different strategies as the Zipf exponent α varies. PLABC clearly outperforms the other strategies in all three topologies. As α increases, the differences in load balancing degree between the strategies gradually shrink. CAC is an exception: since its overall link load is higher (see Figure 13), the deviation of its link load values is also greater.
We further study the impact of varying the cache capacity on the load balancing performance of the different strategies in Figure 17. PLABC achieves the best load balancing results in all topologies. In contrast, most other strategies do not account for the dynamic link load, so their performance is unstable across topologies.
Figure 18 shows the impact of varying the total request rate on load balancing. Again, PLABC outperforms the other caching strategies in all three topologies. The load balancing degree of all strategies grows with the request rate, but for PLABC it grows significantly more slowly, which indicates good adaptability to network load fluctuations.

5.2.4. Average Content Transmission Time

This metric is the average transmission time over all requested contents. It is mainly determined by the available bandwidth of the content download flow, so it is related not only to the cache hit ratio but also to the load balancing ability of the caching strategy. We use this metric to reflect how the different caching strategies affect user QoE.
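As a simplified illustration, if each download flow is limited by the available bandwidth of the bottleneck link on its path, the metric can be computed as follows (the flow model here is our assumption, not the exact simulator mechanics):

```python
def average_transmission_time(transfers):
    """Average content transmission time over all completed downloads.

    transfers: list of (content_size_bits, available_bandwidth_bps) pairs,
    where the available bandwidth is that of the flow's bottleneck link.
    """
    times = [size_bits / bandwidth_bps for size_bits, bandwidth_bps in transfers]
    return sum(times) / len(times)
```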
Figure 19 shows the average content transmission time as a function of α; the trend is similar to that of the load balancing degree (see Figure 16). PLABC achieves the shortest average content transmission time among all compared strategies. The primary reason is PLABC's good load balancing capability, which increases the transmission bandwidth available to content flows and thus reduces transmission time. In addition, we notice that as α increases, the transmission times of all strategies converge to a similar value. This is mainly because the cache hit ratio is very high in this regime: most requests are satisfied at nodes close to the network edge, so most contents can be downloaded at a rate close to the access bandwidth.
Figure 20 compares the average content transmission time of the different strategies as the cache capacity varies. In the three topologies, PLABC reduces the average transmission time by 14.3%, 17.9%, and 16.1%, respectively, compared with MAGIC. Although MAGIC achieves a high hit ratio, it does not consider link load and therefore cannot ensure that requests are forwarded along low-load paths, so it performs poorly in terms of transmission time. Compared with CAC, which is specifically designed to optimize transmission time, PLABC still achieves improvements of 7.9%, 14.5%, and 11.4%, respectively. More importantly, PLABC's performance is stable across topologies, which is of great significance for application scenarios in which the topology changes dynamically.
Figure 21 further investigates the effect of the request rate on the average transmission time. As mentioned earlier, a higher request rate means more contents to deliver, which reduces the bandwidth available to each download flow and therefore increases the delivery time of all strategies. However, the average transmission time of PLABC grows significantly more slowly, demonstrating the good scalability of our approach. Under the same QoS requirement on content transmission time, our strategy therefore gives the cache network a higher service capacity, i.e., the network can handle a larger demand.

5.3. Discussion

According to the above analysis, PLABC achieves the best overall performance in all scenarios. Awareness of the transmission path load in the cache decision process causes content transmitted over high-load paths to be cached preferentially, which enables PLABC to achieve effective load balancing and the lowest average content transmission time. In addition, frequent cache replacement is limited based on the replacement cost, which avoids evicting popular content from the cache, and max-utility cache node selection largely avoids cache redundancy. These mechanisms allow PLABC to maintain a high cache hit ratio while achieving the best content transmission time. In contrast, MAGIC ignores the impact of link load on cache performance, so it attains a high hit ratio at the expense of transmission time. Similarly, lacking restrictions on cache redundancy, CAC improves transmission time at the expense of cache hit ratio. The other strategies perform poorly in all scenarios, mainly because they lack the necessary cache decision information, such as content popularity and transmission distance.
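To summarize the decision pattern, the following sketch shows max-utility cache node selection over the on-path nodes. The utility form used here (bandwidth saved by future hits, weighted by the load of the upstream path being relieved) is only an assumed stand-in for illustration; the exact utility function PLABC uses is the one defined earlier in the paper and is not reproduced here.

```python
def select_cache_node(on_path_nodes, popularity, size, upstream_load):
    """Return the on-path node with the greatest caching utility.

    on_path_nodes: node ids ordered from the server towards the requester.
    upstream_load: dict mapping node id -> assumed load level of the path
    segment between the server and that node (higher = more congested).
    Illustrative utility: bandwidth saved by future hits (popularity *
    size * hops no longer traversed), scaled up when the upstream path
    segment being relieved is heavily loaded.
    """
    best_node, best_utility = None, float("-inf")
    for hops_saved, node in enumerate(on_path_nodes, start=1):
        utility = popularity * size * hops_saved * (1.0 + upstream_load[node])
        if utility > best_utility:
            best_node, best_utility = node, utility
    return best_node

# Example usage with three hypothetical on-path nodes; "n3" is selected
# here because its combination of hops saved and path load tops the rest.
print(select_cache_node(["n1", "n2", "n3"], popularity=0.3, size=10e6,
                        upstream_load={"n1": 0.3, "n2": 0.9, "n3": 0.5}))
```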

6. Conclusions

In this paper, we proposed PLABC, an on-path caching strategy for ICN based on path load awareness, which aims to balance the load between network links while maintaining a high cache hit ratio. We jointly considered content-related information (e.g., content size and popularity) and network-related information (e.g., transmission path distance and load) in the cache decision process: contents with heavily loaded transmission paths, as well as those that contribute more to saving bandwidth consumption, are cached preferentially. Through extensive simulations, we demonstrated that PLABC significantly outperforms state-of-the-art schemes and enhances the service capability of the network with higher throughput and shorter content delivery time. Our current work assumed that all requests are forwarded using the nearest-replica routing strategy by default. Future work includes investigating the interaction between request routing and caching strategies and their joint design, to further exploit their potential for congestion avoidance and load balancing.

Author Contributions

Conceptualization, Y.C., H.N. and R.H.; methodology, Y.C. and R.H.; software, Y.C.; writing—original draft preparation, Y.C.; writing—review and editing, Y.C., H.N. and R.H.; supervision, H.N.; project administration, R.H.; funding acquisition, H.N. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Strategic Leadership Project of the Chinese Academy of Sciences: SEANET Technology Standardization Research System Development (Project No. XDC02070100).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Acknowledgments

We would like to express our gratitude to Bo Li and Yuanhang Li for their valuable support of this work.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Ravindran, R.; Chakraborti, A.; Amin, S.O.; Azgin, A.; Wang, G. 5G-ICN: Delivering ICN Services over 5G Using Network Slicing. IEEE Commun. Mag. 2017, 55, 101–107. [Google Scholar] [CrossRef]
  2. Serhane, O.; Yahyaoui, K.; Nour, B.; Moungla, H. A Survey of ICN Content Naming and In-Network Caching in 5G and beyond Networks. IEEE Internet Things J. 2021, 8, 4081–4104. [Google Scholar] [CrossRef]
  3. Ahlgren, B.; Dannewitz, C.; Imbrenda, C.; Kutscher, D.; Ohlman, B. A Survey of Information-Centric Networking. IEEE Commun. Mag. 2012, 50, 26–36. [Google Scholar] [CrossRef]
  4. Xylomenos, G.; Ververidis, C.N.; Siris, V.A.; Fotiou, N.; Tsilopoulos, C.; Vasilakos, X.; Katsaros, K.V.; Polyzos, G.C. A Survey of Information-Centric Networking Research. IEEE Commun. Surv. Tutor. 2014, 16, 1024–1049. [Google Scholar] [CrossRef]
  5. Breslau, L.; Cao, P.; Fan, L.; Phillips, G.; Shenker, S. Web Caching and Zipf-like Distributions: Evidence and Implications. In Proceedings of the IEEE INFOCOM ’99, Conference on Computer Communications, Proceedings, Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, The Future Is Now (Cat. No.99CH36320), New York, NY, USA, 21–25 March 1999; Volume 1, pp. 126–134. [Google Scholar]
  6. Li, B.; Golin, M.J.; Italiano, G.F.; Deng, X.; Sohraby, K. On the Optimal Placement of Web Proxies in the Internet. In Proceedings of the IEEE INFOCOM ’99, Conference on Computer Communications, Proceedings, Eighteenth Annual Joint Conference of the IEEE Computer and Communications Societies, The Future Is Now (Cat. No.99CH36320), New York, NY, USA, 21–25 March 1999; Volume 3, pp. 1282–1290. [Google Scholar]
  7. Che, H.; Wang, Z.; Tung, Y. Analysis and Design of Hierarchical Web Caching Systems. In Proceedings of the IEEE INFOCOM 2001 Conference on Computer Communications, Twentieth Annual Joint Conference of the IEEE Computer and Communications Society (Cat. No.01CH37213), Anchorage, AK, USA, 22–26 April 2001; Volume 3, pp. 1416–1424. [Google Scholar]
  8. Qiu, L.; Padmanabhan, V.N.; Voelker, G.M. On the Placement of Web Server Replicas. Proc. IEEE Infocom 2001, 3, 1587–1596. [Google Scholar] [CrossRef]
  9. Laoutaris, N.; Syntila, S.; Stavrakakis, I. Meta Algorithms for Hierarchical Web Caches. In Proceedings of the IEEE International Conference on Performance, Computing, and Communications, 2004, Phoenix, AZ, USA, 15–17 April 2004; pp. 445–452. [Google Scholar]
  10. Kangasharju, J.; Roberts, J.; Ross, K.W. Object Replication Strategies in Content Distribution Networks. Comput. Commun. 2002, 25, 376–383. [Google Scholar] [CrossRef]
  11. Wauters, T.; Coppens, J.; Dhoedt, B.; Demeester, P. Load Balancing through Efficient Distributed Content Placement. In Proceedings of the Next Generation Internet Networks, 2005, Rome, Italy, 18–20 April 2005; IEEE: Piscataway, NJ, USA, 2005; pp. 99–105. [Google Scholar]
  12. Amble, M.M.; Parag, P.; Shakkottai, S.; Ying, L. Content-Aware Caching and Traffic Management in Content Distribution Networks. In Proceedings of the 2011 Proceedings IEEE INFOCOM, Shanghai, China, 10–15 April 2011; pp. 2858–2866. [Google Scholar]
  13. Rao, A.; Lakshminarayanan, K.; Surana, S.; Karp, R.; Stoica, I. Load Balancing in Structured P2P Systems. In Proceedings of the Peer-to-Peer Systems II, Berkeley, CA, USA, 21–22 February 2003; Kaashoek, M.F., Stoica, I., Eds.; Springer: Berlin/Heidelberg, Germany, 2003; pp. 68–79. [Google Scholar]
  14. Wierzbicki, A.; Leibowitz, N.; Ripeanu, M.; Wozniak, R. Cache Replacement Policies Revisited: The Case of P2P Traffic. In Proceedings of the IEEE International Symposium on Cluster Computing and the Grid, 2004, CCGrid 2004, Chicago, IL, USA, 19–22 April 2004; pp. 182–189. [Google Scholar]
  15. Ming, Z.; Xu, M.; Wang, D. Age-Based Cooperative Caching in Information-Centric Networking. In Proceedings of the 2014 23rd International Conference on Computer Communication and Networks (ICCCN), Shanghai, China, 4–7 August 2014; pp. 1–8. [Google Scholar]
  16. Zhang, G.; Li, Y.; Lin, T. Caching in Information Centric Networking: A Survey. Comput. Netw. 2013, 57, 3128–3141. [Google Scholar] [CrossRef]
  17. Din, I.U.; Hassan, S.; Khan, M.K.; Guizani, M.; Ghazali, O.; Habbal, A. Caching in Information-Centric Networking: Strategies, Challenges, and Future Research Directions. IEEE Commun. Surv. Tutor. 2018, 20, 1443–1474. [Google Scholar] [CrossRef]
  18. Ren, J.; Qi, W.; Westphal, C.; Wang, J.; Lu, K.; Liu, S.; Wang, S. MAGIC: A Distributed MAx-Gain In-Network Caching Strategy in Information-Centric Networks. In Proceedings of the 2014 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Toronto, ON, Canada, 27 April–2 May 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 470–475. [Google Scholar]
  19. Badov, M.; Seetharam, A.; Kurose, J.; Firoiu, V.; Nanda, S. Congestion-Aware Caching and Search in Information-Centric Networks. In Proceedings of the 1st International Conference on Information-Centric Networking—INC ’14, Paris, France, 24–26 September 2014; ACM Press: New York, NY, USA, 2014; pp. 37–46. [Google Scholar]
  20. Nguyen, D.; Sugiyama, K.; Tagami, A. Congestion Price for Cache Management in Information-Centric Networking. In Proceedings of the 2015 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Hong Kong, China, 26 April–1 May 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 287–292. [Google Scholar]
  21. Carofiglio, G.; Mekinda, L.; Muscariello, L. LAC: Introducing Latency-Aware Caching in Information-Centric Networks. In Proceedings of the 2015 IEEE 40th Conference on Local Computer Networks (LCN), Clearwater Beach, FL, USA, 26–29 October 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 422–425. [Google Scholar]
  22. Yokota, K.; Sugiyama, K.; Kurihara, J.; Tagami, A. RTT-Based Caching Policies to Improve User-Centric Performance in CCN. In Proceedings of the 2016 IEEE 30th International Conference on Advanced Information Networking and Applications (AINA), Crans-Montana, Switzerland, 23–25 March 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 124–131. [Google Scholar]
  23. Liao, Y.; Sheng, Y.; Wang, J. A Survey on the Name Resolution Technologies in Information Centric Networking. J. Netw. New Media 2020, 9, 1–9. [Google Scholar]
  24. Jacobson, V.; Smetters, D.K.; Thornton, J.D.; Plass, M.F.; Briggs, N.H.; Braynard, R.L. Networking Named Content. In Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, Rome, Italy, 1–4 December 2009; Association for Computing Machinery: New York, NY, USA, 2009; pp. 1–12. [Google Scholar]
  25. Zhang, L.; Afanasyev, A.; Burke, J.; Jacobson, V.; Claffy, K.C.; Crowley, P.; Papadopoulos, C.; Wang, L.; Zhang, B. Named Data Networking. SIGCOMM Comput. Commun. Rev. 2014, 44, 66–73. [Google Scholar] [CrossRef]
  26. Koponen, T.; Chawla, M.; Chun, B.-G.; Ermolinskiy, A.; Kim, K.H.; Shenker, S.; Stoica, I. A Data-Oriented (and beyond) Network Architecture. In Proceedings of the 2007 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications, Kyoto, Japan, 27–31 August 2007; Association for Computing Machinery: New York, NY, USA, 2007; pp. 181–192. [Google Scholar]
  27. Fotiou, N.; Nikander, P.; Trossen, D.; Polyzos, G.C. Developing Information Networking Further: From PSIRP to PURSUIT. In Proceedings of the Broadband Communications, Networks, and Systems, Faro, Portugal, 19–20 September 2018; Tomkos, I., Bouras, C.J., Ellinas, G., Demestichas, P., Sinha, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 1–13. [Google Scholar]
  28. Raychaudhuri, D.; Nagaraja, K.; Venkataramani, A. MobilityFirst: A Robust and Trustworthy Mobility-Centric Architecture for the Future Internet. SIGMOBILE Mob. Comput. Commun. Rev. 2012, 16, 2–13. [Google Scholar] [CrossRef]
  29. Dannewitz, C.; Kutscher, D.; Ohlman, B.; Farrell, S.; Ahlgren, B.; Karl, H. Network of Information (NetInf)—An Information-Centric Networking Architecture. Comput. Commun. 2013, 36, 721–735. [Google Scholar] [CrossRef]
  30. Wang, J.; Cheng, G.; You, J.; Sun, P. SEANet: Architecture and Technologies of an On-site, Elastic, Autonomous Network. J. Netw. New Media 2020, 9, 1–8. [Google Scholar]
  31. Zhang, G.; Tang, B.; Wang, X.; Wu, Y. An Optimal Cache Placement Strategy Based on Content Popularity in Content Centric Network. J. Inf. Comput. Sci. 2014, 11, 2759. [Google Scholar] [CrossRef]
  32. Bernardini, C.; Silverston, T.; Festor, O. MPC: Popularity-Based Caching Strategy for Content Centric Networks. In Proceedings of the 2013 IEEE International Conference on Communications (ICC), Budapest, Hungary, 9–13 June 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 3619–3623. [Google Scholar]
  33. Ong, M.D.; Chen, M.; Taleb, T.; Wang, X.; Leung, V.C.M. FGPC: Fine-Grained Popularity-Based Caching Design for Content Centric Networking. In Proceedings of the 17th ACM International Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems—MSWiM ’14, Montreal, QC, Canada, 21–26 September 2014; ACM Press: New York, NY, USA, 2014; pp. 295–302. [Google Scholar]
  34. Cho, K.; Lee, M.; Park, K.; Kwon, T.T.; Choi, Y.; Pack, S. WAVE: Popularity-Based and Collaborative in-Network Caching for Content-Oriented Networks. In Proceedings of the 2012 Proceedings IEEE INFOCOM Workshops, Orlando, FL, USA, 25–30 March 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 316–321. [Google Scholar]
  35. Bilal, M.; Kang, S.-G. Time Aware Least Recent Used (TLRU) Cache Management Policy in ICN. In Proceedings of the 16th International Conference on Advanced Communication Technology, Bikaner, India, 12–13 August 2014; pp. 528–532. [Google Scholar]
  36. Li, Y.; Wang, J.; Han, R. PB-NCC: A Popularity-Based Caching Strategy with Number-of-Copies Control in Information-Centric Networks. Appl. Sci. 2022, 12, 653. [Google Scholar] [CrossRef]
  37. Li, Y.; Lin, T.; Tang, H.; Sun, P. A Chunk Caching Location and Searching Scheme in Content Centric Networking. In Proceedings of the 2012 IEEE International Conference on Communications (ICC), Ottawa, ON, Canada, 10–15 June 2012; pp. 2655–2659. [Google Scholar]
  38. Rosensweig, E.J.; Kurose, J. Breadcrumbs: Efficient, Best-Effort Content Location in Cache Networks. In Proceedings of the IEEE INFOCOM 2009—The 28th Conference on Computer Communications, Rio De Janeiro, Brazil, 19–25 April 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 2631–2635. [Google Scholar]
  39. Sourlas, V.; Paschos, G.S.; Flegkas, P.; Tassiulas, L. Caching in Content-Based Publish/Subscribe Systems. In Proceedings of the GLOBECOM 2009—2009 IEEE Global Telecommunications Conference, Honolulu, HI, USA, 30 November–4 December 2009; pp. 1–6. [Google Scholar]
  40. Chai, W.K.; He, D.; Psaras, I.; Pavlou, G. Cache “Less for More” in Information-Centric Networks. In Proceedings of the NETWORKING 2012, Prague, Czech Republic, 21–25 May 2012; Bestak, R., Kencl, L., Li, L.E., Widmer, J., Yin, H., Eds.; Springer: Berlin/Heidelberg, Germany, 2012; pp. 27–40. [Google Scholar]
  41. Psaras, I.; Chai, W.K.; Pavlou, G. Probabilistic In-Network Caching for Information-Centric Networks. In Proceedings of the Second Edition of the ICN Workshop on Information-Centric Networking—ICN ’12, Helsinki, Finland, 17 August 2012; ACM Press: New York, NY, USA, 2012; p. 55. [Google Scholar]
  42. Wang, W.; Sun, Y.; Guo, Y.; Kaafar, D.; Jin, J.; Li, J.; Li, Z. CRCache: Exploiting the Correlation between Content Popularity and Network Topology Information for ICN Caching. In Proceedings of the 2014 IEEE International Conference on Communications (ICC), Sydney, NSW, Australia, 10–14 June 2014; IEEE: Piscataway, NJ, USA, 2014; pp. 3191–3196. [Google Scholar]
  43. Nikmard, B.; Movahhedinia, N.; Khayyambashi, M.R. Congestion Avoidance by Dynamically Cache Placement Method in Named Data Networking. J. Supercomput. 2022, 78, 5779–5805. [Google Scholar] [CrossRef]
  44. Dutta, N.; Patel, S.K.; Faragallah, O.S.; Baz, M.; Rashed, A.N.Z. Caching Scheme for Information-Centric Networks with Balanced Content Distribution. Int. J. Commun. Syst. 2022, 35, e5104. [Google Scholar] [CrossRef]
  45. Zeng, L.; Ni, H.; Han, R. An Incrementally Deployable IP-Compatible-Information-Centric Networking Hierarchical Cache System. Appl. Sci. 2020, 10, 6228. [Google Scholar] [CrossRef]
  46. Zhang, F.; Zhang, Y.; Raychaudhuri, D. Edge Caching and Nearest Replica Routing in Information-Centric Networking. In Proceedings of the 2016 IEEE 37th Sarnoff Symposium, Newark, NJ, USA, 19–21 September 2016; pp. 181–186. [Google Scholar]
  47. Ioannou, A.; Weber, S. A Survey of Caching Policies and Forwarding Mechanisms in Information-Centric Networking. IEEE Commun. Surv. Tutor. 2016, 18, 2847–2886. [Google Scholar] [CrossRef]
  48. Yu, H.; Zheng, D.; Zhao, B.Y.; Zheng, W. Understanding User Behavior in Large-Scale Video-on-Demand Systems. In Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, Leuven, Belgium, 18–21 April 2006; Association for Computing Machinery: New York, NY, USA, 2006; pp. 333–344. [Google Scholar]
  49. Carter, R.L.; Crovella, M.E. Measuring Bottleneck Link Speed in Packet-Switched Networks. Perform. Eval. 1996, 27–28, 297–318. [Google Scholar] [CrossRef]
  50. Zeng, X.; Wang, D.; Han, S.; Yao, W.; Wang, Z.; Chen, R. An Effective Load Balance Using Link Bandwidth for SDN-Based Data Centers. In Proceedings of the Artificial Intelligence and Security, Dublin, Ireland, 19–23 July 2021; Sun, X., Pan, Z., Bertino, E., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 256–265. [Google Scholar]
  51. Zhang, M.; Luo, H.; Zhang, H. A Survey of Caching Mechanisms in Information-Centric Networking. IEEE Commun. Surv. Tutor. 2015, 17, 1473–1499. [Google Scholar] [CrossRef]
  52. Liu, Z.; Wang, K.; Li, W.; Xiao, Q.; Shi, D.; He, G. Measurement and Modeling Study of IPTV CDN Network. In Proceedings of the 2009 IEEE International Conference on Network Infrastructure and Digital Content, Beijing, China, 6–8 November 2009; pp. 302–306. [Google Scholar]
  53. Saino, L.; Psaras, I.; Pavlou, G. Icarus: A Caching Simulator for Information Centric Networking (ICN). Available online: https://doi.org/10.4108/icst.simutools.2014.254630 (accessed on 15 August 2022).
  54. Spring, N.; Mahajan, R.; Wetherall, D. Measuring ISP Topologies with Rocketfuel. SIGCOMM Comput. Commun. Rev. 2002, 32, 133–145. [Google Scholar] [CrossRef]
Figure 1. A generic caching scenario in the name resolution-based ICN network.
Figure 2. Example of caching content with different popularity (residual bandwidth).
Figure 3. Example of caching content with different transmission distances.
Figure 4. The illustration of the upstream path.
Figure 5. Forwarding process of $req_1^m$.
Figure 6. Forwarding process of $data_1^m$.
Figure 7. Forwarding process of subsequent request packets.
Figure 8. Forwarding process of subsequent data packets.
Figure 9. Example of the content information table maintained by each node.
Figure 10. Cache hit ratio of different caching strategies when the Zipf exponent α varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 11. Cache hit ratio of different caching strategies when the relative cache size varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 12. Cache hit ratio of different caching strategies when the total request rate varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 13. Average link load of different caching strategies when the Zipf exponent α varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 14. Average link load of different caching strategies when the relative cache size varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 15. Average link load of different caching strategies when the total request rate varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 16. Load balancing degree of different caching strategies when the Zipf exponent α varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 17. Load balancing degree of different caching strategies when the relative cache size varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 18. Load balancing degree of different caching strategies when the total request rate varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 19. Average content transmission time of different caching strategies when the Zipf exponent α varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 20. Average content transmission time of different caching strategies when the relative cache size varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Figure 21. Average content transmission time of different caching strategies when the total request rate varies, under topology: (a) Abovenet; (b) Tiscali; (c) Telstra.
Table 1. Summary of the notations.

$V$: the set of in-network nodes (routers, servers, etc.)
$V_e$: the set of edge nodes, a subset of $V$
$E$: the set of links between nodes
$c_v$: the cache capacity of cache node $v$ (in bits)
$l_{i,j}$: the directed link connecting nodes $i$ and $j$
$BW_{i,j}$: the bandwidth capacity of link $l_{i,j}$
$\varphi_{l_{i,j}}$: the bandwidth occupation ratio of link $l_{i,j}$
$M$: the set of contents requested in the network
$s_m$: the size of content $m$ (in bits)
$R$: the total request rate at all edge nodes
$r_u^m$: the request rate for content $m$ at edge node $u$
$h_v^m$: a 0–1 variable indicating whether content $m$ is in node $v$'s local cache
$a_{u,v}^m$: a 0–1 variable indicating whether node $v$ is selected as the service node for edge node $u$'s request for $m$
$t_{v,u}^m$: the time required to transfer content $m$ from node $v$ to node $u$
Table 2. Experiment parameters.

Category   Parameter                    Value
Network    Topology                     Abovenet / Tiscali / Telstra
Network    Bandwidth of Network Links   1 Gbps
Network    Cache Placement              Uniform
Network    Content Replacement Policy   LRU / others
Content    Content Size                 10 MB
Content    Popularity Model             Zipf (α = 0.4–1.2)
Content    Number of Contents           3 × 10⁵
Request    Number of Requests           6 × 10⁵
Request    Request Distribution         Poisson distribution
Request    Total Request Rate           100–500 req/s
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
