Article
Peer-Review Record

A Hierarchical Cache Size Allocation Scheme Based on Content Dissemination in Information-Centric Networks

Future Internet 2021, 13(5), 131; https://doi.org/10.3390/fi13050131
by Hongyu Liu 1,2 and Rui Han 1,2,*
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 8 April 2021 / Revised: 7 May 2021 / Accepted: 13 May 2021 / Published: 15 May 2021
(This article belongs to the Section Network Virtualization and Edge/Fog Computing)

Round 1

Reviewer 1 Report

The submission on "A Hierarchical Cache Size Allocation Scheme Based on Content Dissemination in Information-Centric Networks" (ICN) evaluates the performance of distributed caching schemes on ICN nodes with regard to the hit ratio and the effects on delay and content server load. A weighting scheme for node efficiency is proposed that captures each node's importance in terms of its distance to content servers and to users, and the request rates of the user population that a cache on the node can serve. The cache size is then allocated according to the node weight within a limited total cache size budget.
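For orientation, the allocation principle can be sketched in a few lines of Python; the node names, weights and budget below are illustrative assumptions, not values from the submission:

def allocate_cache(node_weights, total_budget):
    """Split a total cache budget across nodes in proportion to their weights."""
    total_weight = sum(node_weights.values())
    return {node: total_budget * weight / total_weight
            for node, weight in node_weights.items()}

# Hypothetical example: three ICN nodes whose weights summarize distance to
# content servers/users and the request rates a cache on the node can serve.
weights = {"edge_a": 3.0, "core_b": 1.5, "edge_c": 2.5}
print(allocate_cache(weights, total_budget=700))
# {'edge_a': 300.0, 'core_b': 150.0, 'edge_c': 250.0}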

On the whole, the submission shows that the proposed weighting scheme, which is based on a simple summation of per-node relevance indicators, can outperform alternative weighting measures based on node degree and betweenness. The results are plausible, since the scheme uses more information related to caching efficiency than the alternatives do. A strength of the submission is that the main effects driving the results are demonstrated in simple and clear hit-ratio figures.

The presentation is comprehensible but can be improved in many details. Technological aspects and the related literature are largely covered.

However, the reviewer would like to point out gaps in the treatment of Zipf-distributed requests and the LRU caching strategy, which should be addressed more carefully, along with a longer list of remarks on text passages and other details that have potential for improvement.

 

Remarks on Zipf-distributed requests:

The performance evaluation of the submission is based on simulation using Zipf-distributed requests with parameter Alpha, as explained on page 12.

It remains unclear whether the requests are independent, but since no traces are used and no correlation among the requests is described, the reader has to assume independent requests.

Therefore, please state explicitly whether the requests are independent.

The parameter Alpha of the Zipf distribution is assumed to lie in the range 0.8 – 1.1. However, no hint or reference is given as to why. Well-known measurement studies on Zipf-distributed web requests indicate that Alpha is in the range 0.6 – 0.9, as confirmed e.g. by [1].

[1] L. Breslau et al., Web caching and Zipf-like distributions: Evidence and implications, Proc. IEEE Infocom (1999) 126-134

Thus, the range of Alpha in the submission seems to be shifted by about 0.2 beyond what is realistic, with the consequence that cache hit ratios are massively overestimated. The results in Fig. 9 of the submission show that increasing Alpha by 0.2 raises the hit ratio by about 30%. Therefore, the authors should give reasons why their range for Alpha is realistic, or adapt the range.
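To make the sensitivity to Alpha tangible, the following minimal simulation sketch generates independent Zipf-distributed requests and measures the hit ratio of a single LRU cache; catalogue size, cache size and request count are arbitrary assumptions, and the single cache merely stands in for the hierarchical setup of the submission:

import random
from collections import OrderedDict

def zipf_requests(n_objects, alpha, n_requests, rng):
    # Unnormalized Zipf weights: popularity of rank r is proportional to 1/r^alpha.
    probs = [1.0 / (rank + 1) ** alpha for rank in range(n_objects)]
    return rng.choices(range(n_objects), weights=probs, k=n_requests)

def lru_hit_ratio(requests, cache_size):
    cache, hits = OrderedDict(), 0
    for obj in requests:
        if obj in cache:
            hits += 1
            cache.move_to_end(obj)          # refresh recency on a hit
        else:
            cache[obj] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)   # evict the least recently used object
    return hits / len(requests)

rng = random.Random(42)
for alpha in (0.6, 0.8, 1.0):
    reqs = zipf_requests(n_objects=10_000, alpha=alpha, n_requests=100_000, rng=rng)
    print(f"alpha={alpha}: LRU hit ratio = {lru_hit_ratio(reqs, cache_size=500):.3f}")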

 

Remarks on the LRU caching strategy:

The submission applies the LRU caching strategy, as many ICN cache papers do. However, many studies have shown LRU to be inferior to other strategies for web caching. This especially holds for independent Zipf-distributed requests, where LFU and other methods that take request frequency, object size or combined scores into account are more efficient, as shown e.g. by [2]–[4].

[2] N. Megiddo and S. Modha, Outperforming LRU with an adaptive replacement cache algorithm, IEEE Computer (Apr. 2004) 4-11

[3] G. Hasslinger et al., Performance evaluation for new web caching strategies combining LRU with score-based selection, Computer Networks 125 (2017) 172-186

[4] H. ElAarag, Web proxy cache replacement strategies: Simulation, implementation and performance evaluation, Springer Publ. (2013) 1-103

Therefore, the authors should state reasons why their approach to ICN caching has to be restricted to the LRU strategy, or include more efficient methods.
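As a point of comparison, a simple frequency-aware policy (in-cache perfect LFU with frequency-based admission) can be sketched as follows; it reuses zipf_requests() and lru_hit_ratio() from the sketch above, and all sizes are again arbitrary assumptions:

import random
from collections import Counter

def lfu_hit_ratio(requests, cache_size):
    cache, freq, hits = set(), Counter(), 0
    for obj in requests:
        freq[obj] += 1
        if obj in cache:
            hits += 1
        elif len(cache) < cache_size:
            cache.add(obj)
        else:
            # Admit the new object only if it has been requested more often
            # than the least frequently requested object currently cached.
            victim = min(cache, key=lambda o: freq[o])
            if freq[obj] > freq[victim]:
                cache.remove(victim)
                cache.add(obj)
    return hits / len(requests)

reqs = zipf_requests(n_objects=10_000, alpha=0.8, n_requests=100_000,
                     rng=random.Random(7))
print("LRU hit ratio:", round(lru_hit_ratio(reqs, cache_size=500), 3))
print("LFU hit ratio:", round(lfu_hit_ratio(reqs, cache_size=500), 3))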

 

Remarks on text passages that should be checked for improvement or correction:

On page 1:

"The mismatch between the location of the content and the content to retrieve": the syntax and meaning are not clear. Maybe it should read:

“The mismatch between location based routing and distributed retrieval schemes”

“to smoothly transition“ -> “to smoothly transfer”

 

On page 3:

"Therefore, the file in ICN is divided into smaller independently identifiable data blocks (chunks), and chunks are the basic caching units [9,18,20]. This makes many analysis reference models of traditional cache invalid for ICN."

The last statement suggests that the usage of data chunks is a new feature enabled by ICN. On the contrary, data chunks have been well established in web caching, CDNs and P2P systems since around 2005.

 

Please unify citation style, e.g. examples on page 4:

“. Xu [25] used” “. Chui [26] defined” “. Wang [27] analyzed” <- versus -> “, [24] considered” “. [28] was” “. [30] [31] established”

 

"The other is to formulate an optimization problem and solve it. This kind of researches designed optimization problems, . . ." -> "On the other hand, these studies designed optimization problems, . . ."

The English style is sufficient to make the submission readable, but there are many unusual text passages and syntax errors that could be improved by an experienced English speaker.

 

On page 5:

“Some studies found the key factors that“

"some studies used a t-SNE algorithm to reduce"

Which ones? Please cite such a study. What does “t-SNE” mean?

 

On page 6:

"the node forwards the request to the next node"

But: what is the next node in a network?

 

Figure 3: Cache units are surrounded by a dotted line, but for node SNa the dotted line includes 3 units in both cases. It would be better to include only 2 cache units in the dotted line of SNa in (a) and leave the third unit outside, to clearly show that it is outside of the cache space.

 

"Traditional caching networks usually cache the same type of content or application." This may have held for caching 20 years ago, but it does not hold for today's CDNs and clouds.

 

Math notations in Table 1 and in the formulas on the following pages:

Instead of "CD_wi; DP_wi; NO_wi; N_Gn" it would be better to use the format "CDW_i; DPW_i; NW_i; NG_n". Then the abbreviation, e.g. "Node Weight", is clearly distinguished from the numerical index "i"; currently it seems as if "_wi" refers to two different indices "w" and "i".
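For illustration, the suggested change rendered in LaTeX (a sketch only; the symbol names are those listed in Table 1 of the submission):

% Current subscripted form vs. the suggested form, so that the index i is
% visually separated from the abbreviation of the quantity.
$CD_{wi},\; DP_{wi},\; NO_{wi},\; N_{Gn}
  \quad\longrightarrow\quad
  \mathit{CDW}_i,\; \mathit{DPW}_i,\; \mathit{NW}_i,\; \mathit{NG}_n$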

 

“N_Num” is not clearly defined and should be added to Table 1.

In formula (8) on page 12, the sums should have lower bounds “a=1” and “b=1”.

In table 2: “Replament policy” -> “Replacement policy”

Author Response

Thanks for your comments. Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

This paper proposes a hierarchical cache size allocation scheme which considers the node weights in terms of content domains, data path and node location. The idea is interesting. However, more details on the design and evaluation should be provided to verify the effectiveness of the proposed algorithm. More detailed comments are given below:

1) The paper proposes to calculate the weights based on content domain, data path and node location. A set of parameters is proposed for calculating these weights (Equations 1-5). It is not clear why these parameters are chosen, or why they are the most important factors affecting the caching performance. Why are Equations 1 and 3 simplified to Equations 2 and 4? Why are the parameters in Equations 2 and 4 more important than other parameters, and how important are they compared with the others?

2) How will the proposed algorithm be deployed? Will it be deployed on a central server or in a distributed way? How are the parameters obtained to calculate the weights? It seems that these parameters need to be measured from run-time status. How will parameter changes affect the cache size allocation? Is it practical to adjust cache sizes after the caches have been allocated and deployed in the routers?

3) The evaluation does not show how variations of the parameters in Equations 1-5 affect the caching performance. Which is the most influential parameter? How do different combinations of these parameters affect the caching performance? Consequently, it is difficult to judge whether the proposed algorithm works well or not.

4) The presentation needs to be improved and careful proof-reading is required. Some examples are given below.

a) What is the meaning of the node colours in Figure 4?

b) Line 237: The description of Figure 3 is not clear; e.g., in "The request for content C3 was forwarded to SNa", who sends the content out?

Author Response

Thanks for your comments. Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

The modified paper has addressed some of the reviewer’s previous comments. However, some important concepts require further clarification:

1) The meaning of size sensitivity is not clear. It is convincing that size sensitivity and content popularity are ignored due to the difficulty of obtaining their values (Equations 1 and 2), but how will this simplification affect the algorithm's performance? How is Equation 3 simplified to Equation 4?

2) How will the proposed algorithm be deployed? Will it be deployed on a central server or in a distributed way? How are the parameters obtained to calculate the weights? It seems that these parameters need to be measured from run-time status. How will parameter changes affect the cache size allocation? Is it practical to adjust cache sizes after the caches have been allocated and deployed in the routers?

3) The evaluation does not show how variations of the parameters in Equations 1-5 affect the caching performance. Which is the most influential parameter? How do different combinations of these parameters affect the caching performance? Consequently, it is difficult to judge whether the proposed algorithm works well or not.

4) What is the meaning of the node colours in Figure 4?

5) This paper proposes a heterogeneous allocation scheme. Why does the evaluation not show a comparison with other heterogeneous allocation schemes?

6) What do LRU and LFU stand for?

Author Response

Thanks for your comments. Please see the attachment.

Author Response File: Author Response.pdf

Round 3

Reviewer 2 Report

All the comments have been addressed. 
