Peer-Review Record

A Prediction-Based Model for Consistent Adaptive Routing in Back-Bone Networks at Extreme Situations

Electronics 2020, 9(12), 2146; https://doi.org/10.3390/electronics9122146

by Qianru Zhou¹ and Dimitrios Pezaros²,*

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Reviewer 3: Anonymous

Submission received: 12 November 2020 / Revised: 6 December 2020 / Accepted: 8 December 2020 / Published: 15 December 2020

(This article belongs to the Special Issue Future Networks: New Advances and Challenges)

Round 1

Reviewer 1 Report

The authors analyze the problem of path (in)stability under extreme congestion, and propose a solution that switches to backup paths when congestion is triggered. Contrary to one of the claims, the issue of route flapping (with or without congestion) has been extensively studied in the literature, and a number of proposed solutions exist. These are neither discussed nor referred to in the paper. Additionally, the paper has a number of issues that, in my opinion, should be addressed:

1) The notation is sometimes inconsistent, e.g. in equations (3), (4) and (5) there is some confusion regarding t and Δt in the differential congestion indicators. More importantly, the DPCI (1) is presented as a Markov process, but it is actually not Markovian unless further assumptions are made on L(t). Fortunately, the Markovian property is not exploited at all in the paper.

2) The cumulative DPCI presented in (6)-(9) is basically a moving average, which is not novel. It is rather a well-known mechanism for smoothing a volatile signal. The key observation is, as the authors recognize, that these quantities depend crucially on a manually set parameter, θ, acting as a hysteresis factor, and its effect is not discussed at all in the paper.

3) From an implementation point of view, how is latency measured between two endpoints in a path? I guess it is simply the RTT in the transport protocol (e.g., TCP), but this is not what is done in the experiments, where ping is employed. Are ping probes necessary in parallel to measure latency?

4) Given that delay-based congestion control algorithms are increasingly being used in the Internet, and that these suffer from fewer congestion episodes than loss-based congestion control protocols, to what extent is the proposed method appropriate in modern networks?

5) The experiments presented use the Abilene topology as an example, and quite old traffic traces. While the topology might be acceptable (even if the current topology of the Internet is more 'flat' than 20 years ago), the traffic traces are outdated and not representative of typical applications today.

6) In Figs. 5-8, it is said that 'different parameters θ' are compared. Which ones? No mention or discussion accompanies them in the main text.
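
The smoothing-plus-hysteresis mechanism referred to in point 2 can be illustrated with a minimal sketch. This is not the authors' cumulative DPCI or PSI, whose definitions (equations (6)-(9) of the manuscript) are not reproduced in this record; it is a generic moving average of the latency difference between two paths with a switching threshold. The function name, the `window` parameter, and the use of `theta` as a symmetric threshold are all assumptions made for illustration.

```python
from collections import deque


def smoothed_switch_decisions(latency_a, latency_b, window=5, theta=2.0):
    """Return the active path ("a" or "b") after each latency sample.

    Hypothetical illustration only: the signal is the moving average of
    (latency of path a - latency of path b) over the last `window` samples,
    and the active path changes only when that average crosses +/- theta.
    """
    recent = deque(maxlen=window)  # most recent latency differences
    active = "a"
    decisions = []
    for la, lb in zip(latency_a, latency_b):
        recent.append(la - lb)
        avg = sum(recent) / len(recent)
        if active == "a" and avg > theta:
            active = "b"  # path a has been consistently worse
        elif active == "b" and avg < -theta:
            active = "a"  # path b has been consistently worse
        decisions.append(active)
    return decisions


# Oscillating latency on path a: a rule that switched on every sample
# would flap, while the smoothed rule keeps path a throughout.
flapping = smoothed_switch_decisions(
    [10, 14, 10, 14, 10, 14], [12] * 6, window=3, theta=1.0
)

# Sustained degradation on path a: the smoothed rule switches once.
sustained = smoothed_switch_decisions(
    [10, 20, 20, 20], [10, 10, 10, 10], window=2, theta=5.0
)
```

In this sketch a larger `theta` suppresses more oscillation but delays the reaction to a genuine deterioration, which is the trade-off the reviewer asks the authors to discuss.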

Overall, the support and evidence that the proposed metric for switching paths is robust need improvement: the formulation needs to be refined, the experiments need more detail, the traffic models seem to be unresponsive to congestion, and the main concepts that support the proposal are not particularly novel (a moving average of delay measurements). All of these require further clarification.

Author Response

The authors would like to thank the reviewer for his/her helpful and insightful comments. We have carefully improved the quality of the manuscript by taking all the comments into account. The modifications to the manuscript and the point-by-point responses to the review are described below in red.

Existing work on reducing route flapping mainly relies on route dampening and route aggregation. These methods reduce route flapping passively: they do not change the flapping route itself, but perform additional calculations and operations on it. Moreover, to our knowledge, existing work has not reported experimental results for the same use case as ours, nor considered congestion situations in which the traffic load is 5, 7.5, and 20 times the bandwidth. By taking the latency history of the path and of the competing path into consideration, the proposed method achieves greater stability while still achieving an acceptable improvement in channel conditions (a stable route lasts more than 3000 seconds at longest, and more than 1000 seconds on average). In the revised paper, more description has been added on Page 2, in the second paragraph of Section 2, marked in red.

The notation t is used when describing a continuous-time process, while the notation Δt is used for a discrete-time process. We have added further clarification on Page 4, Lines 105, 110, and 114.

In the revised manuscript, we have removed the description of the Markovian property.

Indeed, the proposed model has a "moving" property; however, it is not a moving average, as defined in Section 3. The stability of parameter θ is addressed in detail in the presentation of the experiment results in Section 4. In the revised paper, more details of the effect and importance of parameter θ have been added on Page 3, in the second paragraph of Section 3.1.1.

We use the RTT measured by ping as the latency, as presented in Section 4, on pages 6 to 7. In the revised manuscript, more detail has been added on Page 7, Lines 207-215, marked in red.
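
As a rough illustration of how an RTT figure can be taken from ping, the sketch below extracts the average RTT from a ping summary line. It assumes the summary format printed by the Linux iputils `ping` (`rtt min/avg/max/mdev = ... ms`); the authors' actual measurement tooling is not described in this record, and the function name and sample text are hypothetical.

```python
import re

# Summary line as printed by Linux iputils ping (assumed format).
_RTT_SUMMARY = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_avg_rtt_ms(ping_output: str) -> float:
    """Return the average RTT in milliseconds from ping's summary line."""
    match = _RTT_SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no RTT summary line found in ping output")
    return float(match.group(2))  # second field is the average


# Hypothetical captured output, used here instead of running ping.
sample_output = (
    "3 packets transmitted, 3 received, 0% packet loss, time 2003ms\n"
    "rtt min/avg/max/mdev = 0.045/0.052/0.061/0.006 ms\n"
)
avg_rtt = parse_avg_rtt_ms(sample_output)
```

Parsing captured text rather than invoking ping keeps the sketch independent of the host it runs on; other ping implementations print a different summary line and would need a different pattern.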

In our experiment, we chose delay measurement as the metric because, according to Hayes et al. [1], delay provides more timely feedback than packet-loss information on the state of the network. In our experiment data, the packet-loss rate is directly proportional to latency when the channel is extremely congested. We think it is an interesting direction for future work.

[1] D. A. Hayes and D. Ros, "Delay-based Congestion Control for Low Latency," 2013.

The focus of this paper is on a backbone network topology rather than on capturing the Internet-wide, cross-domain structure. Indeed, the traffic dataset dates from 2003, which is somewhat outdated, but it only serves as a baseline, and we have tried different congestion situations in which the traffic load is 5, 7.5, and 20 times the channel bandwidth; the different experiment results provide the same evidence of the effect of the proposed PSI.

6) In Figs. 5-8, it is said that 'different parameters θ' are compared. Which ones? No mention or discussion accompanies in the main text.

In the revised paper, the captions of Fig. 5-8 are revised.

Reviewer 2 Report

The authors have proposed a solution based on a prediction-based model for consistent adaptive routing in back-bone networks that is capable of measuring the consistency of path latency difference. By learning the latency history of all optional paths, PSI is able to predict the onset of an obvious and steady channel deterioration, and make the decision to switch path. The topic is really interesting, especially for dense edge networks with cloud integration. The proposed solution is supported by the results. However, I would like to ask the authors to add the following information.

The authors have mentioned the use of the Docker platform; however, very little information is provided about its use, hosting and containerization. This information should be added in detail.
Since the authors' main focus is latency, what will be the impact on latency if cloudlets (Docker) are hosted near the edge nodes as compared to a central cloud (Docker)? Resources, optimization, and density of the network nodes should be discussed.
The authors should compare their results with already available techniques, let's say in Fig. 4, with AI-based predictive routing. That will give the readers an idea about the complexity of the algorithms vs. performance.

Author Response

1. The authors have mentioned the use of the Docker platform; however, very little information is provided about its use, hosting and containerization. This information should be added in detail.

In the revised manuscript, more details of the Docker platform are provided on Pages 6-7, Lines 187-191.

2. Since the authors' main focus is latency, what will be the impact on latency if cloudlets (Docker) are hosted near the edge nodes as compared to a central cloud (Docker)? Resources, optimization, and density of the network nodes should be discussed.

An edge-based cloudlet approach would definitely help reduce the latency of particular traffic flows if they exploited locality characteristics, but the backbone network could/would still experience high congestion levels that would impact all traffic aggregates over its connecting POPs. In this study, we have focused on improving the latency of such aggregates.

3. The authors should compare their results with already available techniques, let's say in Fig. 4, with AI-based predictive routing. That will give the readers an idea about the complexity of the algorithms vs. performance.

In the revised paper, a comparison between AI-based approaches and our proposed one has been added on Page 3, second paragraph, marked in red.

Reviewer 3 Report

The authors of this paper study the problem of selecting paths that stay consistently optimal for a long term in extremely congested situations. To solve that issue, a model that can measure the consistency of path latency difference is proposed and evaluated.

The work is quite interesting and deserves publication. However, some minor comments should be taken into consideration:

1) First of all, it is necessary to revise the manuscript, since there are a lot of grammatical/syntax errors that should be corrected. An indicative (not full) list is presented below.

Page 2, line 33, revise the phrase: “Although numbers work has been…”

Page 93, the acronym RTT is not defined (although it is a quite well known acronym).

Page 3, line 104, revise the phrase: “…from the definition Eq. 1…”. Also, for a reader who is not familiar with the subject, it would be nice to explain why it is obvious that ω has the Markov property in (1).

Page 3, line 113, revise the phrase: “…we are able to describe the consistency of the relative path congestion situation use the…”

Page 4, line 117, revise the phrase: “…can be get from…”

Page 4, line 146, revise the phrase: “…indicators may becomes…and oscillations happens”

Page 5, line 147, revise the phrase: “They are depend on…”

Page 5, line 149, revise the phrase: “In our evaluation experiment, where the traffic data used…”

Page 5, line 163, revise the phrase: “…are 1, it happens when both…”

In line 165, the number of the Table is missing.

Line 172, revise the phrase: “…with topology the…”

In line 179, the acronym PSI should be used (and not Path Swap Indicator). The same applies to other lines as well.

Lines 180-181, revise the phrase: “…what is the proportions of path i’s congestion situation is really worse…”

Line 184, revise the phrase: “…before the indicator change…”

Line 206, revise the phrase: “A tool to active measure…”

Line 210: the acronym OSPF should be used.

Line 213, revise the phrase: “…peak traffic severely exceed…”

Line 220, revise the phrase: “…and in each networks…”

Line 340, revise the phrase: “…we proposed an Markov…”

2) Some future directions should be included in the conclusion. In that case is it possible to discuss how difficult it is to incorporate the determination of call blocking probabilities in your method?

Author Response

The work is quite interesting and deserves publication. However, some minor comments should be taken into consideration:

1) First of all, it is necessary to revise the manuscript, since there are a lot of grammatical/syntax errors that should be corrected. An indicative (not full) list is presented below.

Page 2, line 33, revise the phrase: “Although numbers work has been…”