1. Introduction
Modern networks face an increasingly complex and dynamic environment, which makes it increasingly difficult to accurately measure user demands and infer reliable predictions [
1]. This is a problem for network operators, as routing is usually optimized permanently with respect to the expected traffic matrix [
2,
3,
4,
5,
6]. Without a good estimate, networks must be able to dynamically react to the momentary changes in user demands and network state. This involves the careful readjustment of the link costs in shortest path routing, setting up new label switched paths in MPLS, the arrangement of traffic splitting ratios in multipath routing, etc., in concert with the momentary traffic matrix experienced at ingress routers. It is widely held that an adaptive routing algorithm should be adopted for this purpose [
7,
8,
9,
10,
11,
12,
13,
14].
Adaptive routing algorithms (e.g., the Interior Gateway Routing Protocol (IGRP) and the Border Gateway Protocol (BGP)) are methods that change their routing decisions to reflect changes in the topology and in the traffic [
15]. Routing algebra is an algebraic framework that provides a mathematical template for the specification, design, and verification of routing policies [
16,
17]. It can be shown, for example, that the IGRP can be made to converge to shortest and widest paths, but that the composite metric of the protocol does not lead to optimal paths [
16].
There is another way. A carefully optimized static routing can also ensure low congestion, no matter what demands the users impose on the network [
18,
19,
20,
21,
22,
23,
24,
25,
26]. The idea is to design a single static routing, the so-called
oblivious routing, that works reasonably for
all traffic matrices. The routing performance is measured by the
competitive ratio (or oblivious performance ratio), defined as the maximal ratio between the congestion produced by oblivious routing and that produced by some hypothetical ideal routing scheme, taken over all traffic matrices.
Curiously, in undirected networks the competitive ratio is only a logarithmic function of the number of nodes [
22], and even though in directed graphs it can grow as $\Omega(\sqrt{n})$
[
23], extensive simulations on real and synthetic topologies revealed that its value usually remains under 2 [
24].
Recently, oblivious routing was combined with other approaches; semi-oblivious and hybrid traffic engineering solutions were developed and investigated [
27,
28,
29,
30]. Thus, a deep understanding of the oblivious routing and the analysis of its performance with the competitive ratio might be an important step towards better traffic engineering solutions.
Under the hood, the competitive ratio is a worst case metric, conveying the impression that, if its value is low, then we can expect the worst case congestion to manifest over only a small fraction of the traffic matrices and most “normal” traffic matrices will be routed with no congestion at all. In this paper, we ask to what extent this intuitive worst-case performance characterization remains valid in a broader statistical sense.
We introduce two new routing performance metrics. The probability of congestion (PoC) describes the probability that, over a given demand distribution, congestion shows up somewhere in the network. Clearly, if the PoC approaches 1, the network spends most of the time in a congested state. The second metric, the expected value of congestion (EVoC), quantifies the anticipated extent of congestion. We argue that the PoC and the EVoC are revealing statistical performance measures for general static routing algorithms as well as for oblivious routing itself.
Not only are they useful as absolute performance metrics, but they also admit an appealing relative interpretation. In this setting, traffic matrices are picked uniformly from the set of traffic matrices routable without congestion in the network, and we characterize the competitiveness of oblivious routing by comparing the PoC and the EVoC to those attainable by an optimal routing algorithm. As a main contribution of the paper, we show that the performance penalty of oblivious routing can become arbitrarily large in this sense. In particular, we show that the PoC for oblivious routing grows as in certain directed graphs on n nodes and in some undirected graphs it approaches 1 with , , and this can occur even when the competitive ratio is only (in particular, 2 in the first case and in the second).
In the second part of the paper, numerical evaluations on real-life topologies are presented, suggesting that the PoC and the EVoC indeed exhibit the worst-case behavior that we identified analytically.
The rest of the paper is organized as follows. Related literature is surveyed in
Section 2. Then, in
Section 3, we recall the geometric framework from [
13], and formal definitions for the PoC and the EVoC as well as methods to evaluate them are provided in
Section 4. In
Section 5, we turn to view the PoC and the EVoC as competitive measures. Numerical studies are given in
Section 6, and finally
Section 7 concludes the paper.
2. Related Work
Oblivious routing is an attractive paradigm for networks where centralized control or frequent reconfigurations are infeasible or unacceptable. It answers the question of how to route traffic without a priori knowledge of the traffic demand.
One of the first results on oblivious routing dates back to the 1980s, when Valiant and Brebner studied oblivious routing on hypercubes [
18]. Their randomized scheme has the competitive ratio $O(\log n)$, where
n is the number of nodes in the network. Following these early results, the breakthrough came only in 2002, when Räcke showed that, for generic undirected graphs, the maximum congestion is within an $O(\log^3 n)$
factor of the lowest possible congestion [
19]. This result was non-constructive, and the running time of the algorithm’s precomputation phase was exponential.
In the following years, the competitive ratio and the complexity of the algorithms were improved by Harrelson, Hildrum and Rao [
20], and later by Räcke [
22] from $O(\log^3 n)$ to $O(\log^2 n \log\log n)$ and to $O(\log n)$, respectively (see
Table 1). It was also shown that Räcke’s $O(\log n)$
bound is asymptotically tight, as there are networks on which no oblivious routing algorithm exists with sub-logarithmic competitive ratio [
31,
32]. However, for directed graphs, no logarithmic congestion guarantee exists [
23].
Meanwhile, algorithms were proposed to solve the oblivious routing problem in polynomial time, most notably the LP formulation given by Applegate et al. [
24]. Their numerical evaluations also suggest that the competitive ratio in most real-world network topologies is 2 or less: in contrast to the $O(\log n)$
overprovisioning suggested by Räcke’s worst-case result, it is sufficient to overprovision the capacity of each edge of the network by a factor of at most 2.
Oblivious routing algorithms require routing tables polynomial in the network size. Räcke and Schmid designed an oblivious routing scheme that requires only small routing tables (polylogarithmic in the network size) and, at the same time, still guarantees a close-to-optimal load [
33].
Oblivious routing is a well-studied problem in traffic engineering on the Internet [
34,
35,
36,
37]. Owing to its ability to avoid continuous route reconfiguration, it has also found its place in the wireless world [
38,
39].
Rétvári et al. were the first ones to systematically study the geometric properties of various routings [
13,
40,
41,
42]. They showed that, under reasonable regularity conditions, the set of demand combinations a capacitated graph admits (the so-called throughput polytope), and the set of demands a routing can place in the network without congestion (the feasible region), are compact, down-monotone,
K-dimensional polyhedra, where
K is the number of source–destination pairs.
They used this geometric view to study fair allocation of network resources in a routing-independent manner [
40] and to develop new, hybrid distributed-centralized semi-oblivious routing algorithms [
13]. Nevertheless, no attempts have been made thus far to convert this geometric framework into a comprehensive performance characterization for oblivious routing.
3. Model and Notations
Let the network topology be given by a directed or undirected graph $G(V, E)$, and let $n = |V|$ and $m = |E|$. For simplicity, we shall refer to both the arcs in directed graphs and the edges in undirected graphs as links, using the notation $(i, j) \in E$. Let the vector of (finite) link capacities be $c$ (see Table 2 for a summary of notations). Users are represented by the set of unique source–destination pairs $(s_k, d_k)$, $k = 1, \ldots, K$. Let $\theta_k$ denote the momentary demand of the kth user presented at the source node $s_k$. The vector $\theta = [\theta_k]$ is called a traffic matrix.
A convenient way to describe a routing algorithm is through its associated
routing function . In general, a routing function tells how to map a traffic matrix to the links in the network:
. Here,
u is a column
-vector formed by separate column
m-vectors
for each
, where
describes the amount of flow from
routed to the link
. The routing function
is often decomposed into separate routing functions
for each
:
, and for each link
:
. For the rest of this paper, we only consider
distributed, static routing functions
, where
only depends on
and
is a static
matrix for each
k. The congestion produced by a routing function
, when subjected to some traffic matrix
, is measured by the
maximum link utilization :
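As a minimal sketch of how the maximum link utilization behaves under a distributed, static routing, the Python snippet below assumes that each user k places a fixed fraction of its demand on every link. The per-user list `split[k][l]` (share of user k's demand on link l), the capacity list `cap`, and the two-link toy network are illustrative assumptions and not the paper's notation.

```python
def link_loads(split, theta):
    """Load on every link: sum over users of (static split fraction) * (demand)."""
    num_links = len(split[0])
    return [sum(split[k][l] * theta[k] for k in range(len(theta)))
            for l in range(num_links)]


def max_link_utilization(split, cap, theta):
    """Largest load-to-capacity ratio over all links."""
    loads = link_loads(split, theta)
    return max(load / c for load, c in zip(loads, cap))


# Toy example (assumed): two users, two unit-capacity links.
# User 0 sends everything over link 0; user 1 splits evenly over both links.
split = [[1.0, 0.0],
         [0.5, 0.5]]
cap = [1.0, 1.0]

mlu = max_link_utilization(split, cap, theta=[0.6, 0.4])
print(mlu)  # link 0 carries 0.6 + 0.5 * 0.4 and is the bottleneck
```

Because the splits are fixed, every link load is linear in the demands, so the maximum link utilization is a maximum of finitely many linear functions of the traffic matrix.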
In the geometric model of [
13], a traffic matrix
corresponds to a point in the
K-dimensional Euclidean space
, a routing
u is embedded in the
-dimensional Euclidean space
, and the routing function is a mapping from the former space to the latter space. The
throughput or
demand polytope (Throughout this paper, we will use the term throughput polytope to emphasize that we mean the polytope of all demands routable somehow through the network without any link over-utilization.)
T is defined as the set of traffic matrices
for which there is a routing
u that accommodates
in the network with no link over-utilization:
. Some
is called
admissible if
. The throughput polytope has the following geometric properties [
40]:
T is the intersection of finitely many half-spaces, and thus it is indeed a polytope [
43];
as such, T is convex: for all ;
under mild regularity conditions, T is full-dimensional;
T is down-monotone: for each , it holds that .
The second geometric ingredient is the feasible region. The
feasible region for some
describes the set of traffic matrices that can be accommodated in the network by
without causing link over-subscription:
. For a static routing function
, the feasible region
is a down-monotone polytope [
13]. We call
an
optimal routing function if its feasible region coincides with the set of all admissible traffic matrices:
. An optimal routing function is guaranteed to exist [
13,
14], but it might need to be
adaptive (i.e.,
) and/or
centralized (
).
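The feasible region of a static routing can be probed with a simple membership test. The sketch below reuses the hypothetical `split[k][l]` and `cap[l]` representation (fixed share of user k's demand on link l, and link capacities); these names and the toy values are assumptions made for illustration only.

```python
def is_feasible(split, cap, theta, tol=1e-12):
    """True if the static routing carries theta without overloading any link."""
    if any(t < 0 for t in theta):
        return False  # demands are non-negative by definition
    for l, c in enumerate(cap):
        load = sum(split[k][l] * theta[k] for k in range(len(theta)))
        if load > c + tol:
            return False
    return True


# Toy example (assumed): two users, two unit-capacity links.
split = [[1.0, 0.0],
         [0.5, 0.5]]
cap = [1.0, 1.0]

inside = is_feasible(split, cap, [0.6, 0.4])    # link 0 load is 0.8
outside = is_feasible(split, cap, [0.9, 0.4])   # link 0 load is 1.1
shrunk = is_feasible(split, cap, [0.3, 0.1])    # smaller than a feasible demand
print(inside, outside, shrunk)
```

The third call illustrates down-monotonicity: reducing any component of a feasible traffic matrix cannot create an overload, since all split fractions are non-negative.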
In contrast to optimal routing,
oblivious routing is simple, static, and distributed. Within the context of this paper, we only treat the so-called
congestion minimizing version of oblivious routing algorithms, defined as the static routing function
that minimizes the maximum congestion compared to an optimal routing algorithm
over all traffic matrices [
23,
24]:
This definition implicitly fixes the so-called competitive ratio as the way to measure the performance and the competitiveness of a routing function. In the next section, we argue that this metric leaves some important aspects of performance obscured.
4. New Performance Metrics for Static Routing
The competitive ratio is a worst-case metric. Underlying this metric is the intuition that a low competitive ratio guarantees good routing, because we can expect the worst-case congestion to appear for only a few pathological traffic matrices. To determine whether this intuition is plausible, we need to be able to answer questions like “What is the chance that we experience congestion somewhere in the network?”, or “How grave a congestion can we expect?”. To do this, we need to develop new, fundamentally statistical performance measures. Consider the following definitions.
Definition 1. Given a static routing function , the probability of congestion
for is the probability that at least one network link becomes overloaded by : According to this definition, the PoC takes its values from the interval $[0, 1]$. The lower the value of the PoC, the smaller the chance that we find the network in a congested state at a random time instance.
Definition 2. Given a static routing function , the expected value of congestion
is the mean value of the maximum link utilization produced by : The EVoC quantifies the extent of congestion one can expect at a random time instance. As such, an EVoC below 1 indicates that the network is in a safe operational domain, whereas an EVoC above 1 forecasts severe congestion.
4.1. Traffic Model
In order to evaluate the PoC and the EVoC, we need to be able to characterize the probability distribution by which a traffic matrix shows up in the network. To reflect that the oblivious routing algorithm does not assume any a priori information on input traffic, our traffic model assumes that traffic matrices arrive with equal probability from some pre-defined input set.
Assumption 1. The probability that a traffic matrix θ is realized by the users is given by , where Θ is some pre-defined set of traffic matrices, and Vol denotes the standard Lebesgue-measure (i.e., the K-dimensional volume).
Here, might be confined by the topology of the network (e.g., the capacity of the links at the border), it might be shaped by some admission control mechanism at the network ingress, or it might be any bounded subset of . For the rest of this paper, we assume that is compact and polyhedral. The very definition of the performance measures, and the essential methodology we introduce later on to calculate them, remain valid even if Assumption 1 fails, but the development becomes somewhat more complex.
4.2. Evaluating the PoC and the EVoC
Next, we discuss how to evaluate the above formulas numerically in the case when
S is an arbitrary static routing function. First, we write (
2) as a conditional probability:
. Observing that
if
and 1 otherwise, for the probability of congestion we find:
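Under the uniform traffic model, the PoC is the fraction of the input set that falls outside the feasible region, that is, one minus a ratio of volumes. The sketch below swaps exact volume computation for plain Monte Carlo sampling, and it assumes for simplicity that the input set is an axis-aligned box; the names `split`, `cap`, and `box` are hypothetical and serve only this illustration.

```python
import random


def estimate_poc(split, cap, box, samples=100_000, seed=0):
    """Monte Carlo estimate of the probability of congestion.

    split[k][l]: assumed fixed share of user k's demand placed on link l
    cap[l]:      capacity of link l
    box[k]:      upper bound of user k's demand; demands are uniform on [0, box[k]]
    """
    rng = random.Random(seed)
    congested = 0
    for _ in range(samples):
        theta = [rng.uniform(0.0, b) for b in box]
        overloaded = any(
            sum(split[k][l] * theta[k] for k in range(len(theta))) > cap[l]
            for l in range(len(cap))
        )
        congested += overloaded  # True counts as 1
    return congested / samples


# Toy example (assumed): two users share a single unit-capacity link and
# demands are uniform on the unit square. The feasible region is the
# triangle theta_1 + theta_2 <= 1, which covers half of the square.
split = [[1.0], [1.0]]
cap = [1.0]
poc = estimate_poc(split, cap, box=[1.0, 1.0])
print(poc)  # close to 0.5, up to sampling error
```

Sampling of this kind scales to many source–destination pairs, but the estimate carries statistical error, so it complements rather than replaces exact volume computation.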
Obtaining a closed form for the expected value of congestion needs a little more work. Writing (
3) as a conditional expected value, we find:
To evaluate (
5), we need to evaluate the function
over all
. This amounts to solving the multi-parametric program
as the function of the parameter
. For a static routing function, this multi-parametric program is linear, therefore the value function
is piecewise affine over
:
where
I is a finite index set,
is a polyhedral complex on
, and
are affine functions over
[
44]. Triangulate each subregion
and let the resultant simplices be
. Denote the vertices of the simplex
by
. Then, using the Lasserre–Avrachenkov theorem [
45,
46]:
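For an affine integrand, simplex integration formulas of this kind reduce to a well-known rule: the integral over a simplex equals the simplex volume times the average of the function values at its vertices. The sketch below illustrates only this special case in plain Python; the helper names are hypothetical and the triangle is a toy input.

```python
from math import factorial


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def simplex_volume(vertices):
    """Volume of a K-simplex given by K+1 vertices in K dimensions."""
    base = vertices[0]
    edges = [[v[i] - base[i] for i in range(len(base))] for v in vertices[1:]]
    return abs(determinant(edges)) / factorial(len(base))


def integrate_affine(vertices, f):
    """Integral of an affine function f: volume times the vertex average."""
    return simplex_volume(vertices) * sum(f(v) for v in vertices) / len(vertices)


# Toy check: integrate x + y over the triangle (0,0), (1,0), (0,1).
triangle = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
value = integrate_affine(triangle, lambda x: x[0] + x[1])
print(value)  # analytically 1/3
```

Summing such terms over all simplices of a triangulation, with the affine piece that is active on each simplex, gives the integral of a piecewise affine function.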
4.3. Approximating the EVoC
Unfortunately, evaluating the form (
6) for larger networks is computationally demanding. Therefore, below, we prove a simple formula that gives a useful lower bound.
Lemma 1. For the expected value of congestion of a static routing function over the traffic matrix set Θ: Proof. Instead of integrating over
, we integrate over another polytope
that is a scalar multiple of
:
. We choose
so that
. Now, since
is structurally similar to
, the integration of
over
is much simpler. First, write
in the following form:
As
, we have
. Therefore,
. Thus, we obtain
Hence, a lower estimate on
is simply
To evaluate the integral, we partition the space into layers
. The measure of a layer
is
. Now, we can evaluate the estimate:
The last step is because, by definition, . □
5. On the Competitiveness of Oblivious Routing
Thus far, we have defined the PoC and the EVoC in terms of an arbitrary (but bounded) input demand set . This provides a convenient absolute performance characterization where we can reason about the performance of some static routing quantitatively by evaluating the PoC and the EVoC for a particular network and a particular . A perhaps even more descriptive qualitative characterization emerges if we think of the PoC and the EVoC as performance metrics relative to optimal routing.
For this, we only need to observe that, under the choice , the probability of congestion for an optimal routing function is (as and so ), and, for the expected value of congestion, we find . Therefore, under the assumption , the PoC essentially tells the probability that oblivious routing overloads some link, provided that optimal routing could route input traffic matrices without congestion. This gives a new competitive measure for comparing oblivious routing to optimal routing. A similar relative interpretation can be given for the EVoC as well.
In the rest of this paper, we use this relative interpretation to study the competitiveness of oblivious routing. Correspondingly, we shall assume that
, and we shall use the short-hand notations
,
,
, and
. Evaluating (
4) under this assumption yields
. Hence,
in the competitive interpretation describes the quantity of demands routable by oblivious routing compared to the quantity of demands routable at all. Second, the substitution
in (
7) gives:
Here,
is trivial from the definition. Note that
is well-defined, as
and so
.
, which means that, under our traffic model, in very large networks, we can expect at least one link to operate at maximum capacity when using oblivious routing. Observe also that, appealingly, (
8) relates each of the three fundamental performance metrics focal in this paper: the competitive ratio
, the probability of congestion
, and the expected value of congestion
.
5.1. Statistical Competitiveness in Directed Graphs
Ideally, we would like the PoC to be upper-bounded by some constant, say, $1/2$,
in all graphs. That would mean that there is at least a 50% chance that a traffic matrix routable by an optimal algorithm is also routable by the oblivious algorithm. If this is too much to hope for, it would be sufficient if
approached 1 slowly. Unfortunately, it turns out that there is no general upper bound on
apart from the trivial
and
can grow as
when
is
in certain directed graphs [
42].
Consider the directed graph of
nodes in
Figure 1 for any
, let all link capacities be 1, and let the source–destination pairs be
. The first source–destination pair has only a single path; thus, all its traffic is sent through this path. The rest of the users
have two paths.
Due to the symmetry of the network, the oblivious routing function is determined by a single variable , the fraction of traffic sent at to the path passing through node . The largest link loads are produced by the traffic matrices and . For the first one, the maximum load occurs on link . The latter one causes maximum load on link . The load is minimal and equal to , when , resulting in .
The probability of congestion can be approximated by approximating the volume of
T and
R.
T is down-monotone, thus, the
K-hypercube of size 1 is inside
T. There are another
half-hypercubes in
T as well (see
Figure 1). The volume of this
polytope is
. On the other hand,
R is enclosed by a hyper-rectangle
, whose lower left corner is the origin and whose upper right corner is the point
. The volume of
is
.
Thus,
with
from below and with the substitution
.
5.2. Statistical Competitiveness in Undirected Graphs
Oblivious routing is particularly competitive in undirected graphs in the worst-case sense, thanks to the firm upper bound on . In a statistical sense, however, its competitiveness is less appealing. Below, we show that there exist undirected graphs on nodes in which grows as where is constant, even though is still .
Theorem 1. For any integer and even, there is an undirected graph on nodes with and .
Proof. Consider the undirected graph in
Figure 2 for any
and even, let all link capacities be 1, and let the source–destination pairs be
. First, we construct the oblivious routing and calculate
as a function of
K, and then we obtain an approximation on
.
1. Competitive ratio: Along the same lines as above, for the splitting ratios (see
Figure 2) we find
and
, and so
. Observe that
.
2. The probability of congestion: Using the above expressions for
and
, the feasible region
R is given by the inequality representation
, and the throughput polytope
T can be written as:
Here, again, instead of calculating the volumes directly we take lower and upper approximations. Define as . Clearly, is obtained from R by excluding a valid inequality, and thus . We derive the volume of by enumerating its extreme points and then obtaining a triangulation.
The extreme points of can be written as , where for all and is the -th permutation of the vector containing exactly ones. We observe that the set together with the origin gives precisely the extreme points of a unit hypercube of dimension . Any -dimensional hypercube can be split into simplices, and, as is structurally equivalent to such a -dimensional hypercube, this gives a valid triangulation for as well.
The volume of a simplex in our triangulation is given by , where is the volume of the ith simplex. The volume of is obtained by summing the volumes of these simplices: . Here, the last equation comes from the fact that is exactly the volume of the unit hypercube.
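The hypercube triangulation used in this argument can be illustrated on the unit hypercube itself. The sketch below builds the standard permutation-based triangulation, in which a d-dimensional unit hypercube is split into d! simplices of equal volume, and checks that the volumes sum to 1. The count d! and the function names are assumptions of this illustration; it does not reproduce the polytope of the proof.

```python
from itertools import permutations
from math import factorial


def kuhn_simplices(dim):
    """Yield the vertex lists of the dim! simplices tiling the unit hypercube."""
    for order in permutations(range(dim)):
        vertex = [0.0] * dim
        vertices = [vertex[:]]          # every simplex starts at the origin
        for axis in order:
            vertex[axis] = 1.0          # walk along one coordinate at a time
            vertices.append(vertex[:])
        yield vertices


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def simplex_volume(vertices):
    """Volume of a simplex from its edge vectors at the first vertex."""
    base = vertices[0]
    edges = [[v[i] - base[i] for i in range(len(base))] for v in vertices[1:]]
    return abs(determinant(edges)) / factorial(len(base))


dim = 3
volumes = [simplex_volume(s) for s in kuhn_simplices(dim)]
total = sum(volumes)
print(len(volumes), total)  # number of simplices and their summed volume
```

Each simplex in this triangulation has an edge matrix with determinant of absolute value 1, which is why all pieces have the same volume.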
The volume of
T can be derived similarly. Let
be defined as the intersection of the directions
with the boundary of
T and define
as the convex hull of these points plus the origin. Clearly, the convex hull defines a polytope that is inscribed in
T, so
. The points
themselves can be calculated by observing that a ray of direction
hits constraints (
9) as well as (10), and the smaller of these two determines
:
.
again has the same structure as a unit hypercube, and thus, along the same reasoning as before, we find
.
Consider the ratio and examine the fractions more closely.
If , then and ; thus, .
If , then .
If , then and .
Note that the last point of is outside of T; however, this does not cause problems as this is the only such point. Hence, , which completes the proof. □
6. Numerical Results
We conducted several rounds of numerical evaluations on various network topologies obtained from real sources to augment the mathematical analysis with practical insights. The goal was to see whether oblivious routing produced the same poor competitiveness in real networks that we identified in the course of the mathematical analysis. In line with this goal, we treated the PoC and the EVoC as competitive measures, and thus we fixed .
We ran the first round of evaluations on the ISP data maps from the Rocketfuel dataset [
47,
48]. We used the same method as in [
24] to obtain approximate POP-level topologies: we collapsed the topologies so that nodes correspond to cities, we eliminated leaf-nodes, and we set link capacities inversely proportional to the link weights. In this paper, only the results for the AS1239, AS3257, AS6461, AS3967, and AS1755 topologies are shown (for more details see
Table 3).
Another round of evaluations was conducted on the NSFNET Phase II topology [
49]. Note that these topologies are inherently directed and symmetric. To facilitate experimenting with undirected networks as well, we obtained undirected versions by substituting each pair of oppositely directed arcs with an undirected edge.
From these topologies, we generated a series of increasingly complex configurations by gradually adding more source–destination pairs. Recall that the number of source–destination pairs K determines the dimension of the underlying geometric space, and thus it has a profound impact on both the PoC and the EVoC. In our experiments, K was increased from 2 to 30. For each K, fifteen independent samples were generated, picking the source and the destination uniformly at random according to a bimodal distribution. In every instance, two maximally node-disjoint paths were provisioned per user. We experimented with more paths per user, but the results remained basically the same (see later).
For the PoC, we need to obtain the throughput polytope (which is a difficult task in itself [
40]), the feasible region, and the volumes thereof. For
, we computed the volumes exactly using Vinci [
50], and, for the rest of the data points, due to the high complexity of exact volume computation, we used a volume approximation algorithm based on Bollobás’ algorithm [
51] that we implemented in C++. For the EVoC, we also need to solve a multi-parametric program as mentioned above, for which we used the Multi-Parametric Toolbox [
52] for Matlab. Finally, the competitive ratio was always computed accurately in Matlab.
Figure 3 depicts the exact and approximate results for the PoC, the competitive ratio, and the EVoC in four directed and two undirected networks. Our most important observation is that real networks, directed or undirected, exhibited the same worst-case competitiveness in terms of the PoC and the EVoC that we found earlier analytically. The PoC grew beyond 80% for as few as about 10 users, and as the number of users entered the range of the number of nodes (about 20–30 in our cases), it surpassed 90% and approached 100% with very high confidence.
The EVoC grew beyond 1, indicating symptoms of grave congestion. Recall that this is despite the fact that an optimal routing algorithm would not produce any congestion at all in these scenarios, and even though the competitive ratio remains relatively low (less than –). An interesting further observation is that, in undirected networks, the variance of results is much higher than in directed ones; however, even for undirected graphs, the confidence of our approximations is very good in the range where competitiveness really matters (i.e., for higher Ks).
Finally, we also found that path diversity does not influence the results substantially.
Figure 4 depicts the PoC and the competitive ratio as we increased the number of maximally node disjoint paths per user from 2 to 6 in the directed AS1239 and AS6461 topologies. It appears that competitiveness is mostly determined by the number of source–destination pairs, and path diversity has very little impact.
7. Conclusions
Oblivious routing is a possible alternative to adaptive routing strategies when engineering network traffic, as it avoids continuous reconfiguration of routing policies. Furthermore, oblivious routing is a promising candidate for minimum-congestion routing in large networks. This is due to the fact that it is fundamentally distributed and comes equipped with a hard logarithmic congestion bound. This bound, however, is intrinsically of a worst-case nature. In this paper, we took the first steps toward a statistical performance characterization. Our contributions in this regard are three-fold.
First, we defined new performance metrics, the probability of congestion (PoC) and the expected value of congestion (EVoC), valid for any static routing algorithm. Second, we showed how to evaluate the PoC and the EVoC quantitatively, under the assumption that traffic matrices are uniformly distributed in a demand set Θ. We believe that this assumption is reasonable, especially in the case of oblivious routing. The development is similar even if the assumption does not hold, although volume computation then generally needs to be replaced by integration over convex bodies, a problem of the same computational complexity. Third, we restated the PoC and the EVoC as qualitative statistical competitive measures, and we found that the competitiveness of oblivious routing vanishes in terms of these measures.
Our findings could, in part, be attributed to the strange behavior of volumes in large dimensions that is often referred to as the curse of dimensionality. In extremely large dimensions grows very fast with for any polytope P, and this causes to almost surely approach 1 as . Our analytic results, however, indicate that this occurs even when the dimension K is still moderate enough to be relevant in practice. Moreover, it appears that competitiveness is mostly determined by the number of source–destination pairs, and path diversity has very little impact.
It is important to emphasize that these results do not disqualify oblivious routing
per se. First, oblivious routing is optimized for the competitive ratio and not for the PoC or the EVoC, and it is perfectly possible (however improbable) that an algorithm tuned for the PoC would yield better competitiveness. Second, our results apply only to the case when no information on anticipated demands is available. Indeed, earlier work [
6] showed that, with a suitably precise guess on input traffic, oblivious routing can become highly efficient. However, compared to an optimal adaptive algorithm [
13,
14] (which never causes congestion when subjected to a routable demand), oblivious routing (which almost surely does) is not competitive at all. These findings perhaps shed some new light on the age-old static vs. adaptive routing debate as well.
Finally, it is an open question as to whether it is possible to close the performance gap—measured in terms of the competitive ratio and PoC—between an optimal adaptive routing algorithm and the oblivious routing, by introducing some adaptiveness to the oblivious routing in the form of arbitrary finite routing regions.