Figure 4 is explained as follows. In
Section 3.1, given
,
, and
as the sets of RF chains, beams, and devices, we employ integer programming to implement a greedy approach for approximate throughput maximization where the throughput (
) and SNR experienced by a link to the device
ı using beam
b and chain
r are modeled. Using the optimal throughput,
from
Section 3.1, we capture the dynamics of (ergodic) inter-packet delay and propagation delay in
Section 3.2 as observed from the transport layer for balancing delays across all links. In
Section 3.2, given a total of
W packets to be transmitted across the associated links, our goal is to determine a vector
so as to minimize the total transmission time
. However, we observe that finding optimal
for MPTCP scheduling is a highly challenging and non-trivial task. Therefore, in
Section 4, we model multipath scheduling as a sequential decision problem using Multi-Armed Bandits (MAB), with the multipath scheduler acting as the actor and the links acting as the arms of the bandit. While the MAB approach resolves the optimization problem centrally, its dependence on local learning often leads to undesirable suboptimal behavior, whereas a decentralized approach would require data sharing from neighbors. In
Section 4.1, we enhance the MAB with FL and propose to provide congestion information to new joining MPTCP users during the handshake process. Further enhancements appear to be plausible with opportunistic scheduling; therefore, we provide some new insights and mathematical interpretations of adopting flexible packet-cloning (redundancy for forward erasure correction) for low-delay MPTCP scheduling, which requires further investigations and is left for future work.
3.1. Modeling Beamforming and Selection Mechanism
To capture the dynamics of the beamforming and selection mechanism at the AP, we develop a high-level, pragmatic model that tracks the allocation of the RF chains and beams. We formulate an integer programming problem and develop a greedy beam-selection approach. One can view this as a cross-layer approach, where the beamforming and selection process assigns RF chains and beams to appropriate end devices in order to maximize the total network throughput during each (congestion) window scheduling round at the TRA layer.
Consider that
,
, and
represent the sets of RF chains, beams, and devices, respectively. With normalized transmission power, the throughput, say
perceived by a link to the device
ı using beam
b and chain
r, can be estimated as
where
is the effective received power of the
device using the
beam and the
RF chain;
represents crosstalk on the
beam caused by the simultaneous transmission over the
beam using the
chain; and
is the noise (say Gaussian).
is the indicator variable (binary) to track whether the
beam and the
chain are assigned to
link or not. Using this formulation, we implement Algorithm 1, a greedy approach for the joint allocation of RF chains and beams to devices. For a finite number of devices, beams, and RF chains, we can devise a joint
beamforming and device selection mechanism as a nonlinear throughput maximization problem
Algorithm 1 Beamforming and RF chain allocation
procedure Select RF Chain, , and
  while do
    procedure Update ,
      Increment and ;
    end procedure
  end while
  while do
  end while
end procedure
procedure ()
  while do
  end while
  Return
end procedure
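The greedy idea behind Algorithm 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `gain` and `crosstalk` arrays, the Shannon-style rate `log2(1 + SINR)`, and the simplified interference model (a fixed leakage factor per pair of active beams) are all assumptions introduced here for illustration. In each round, the sketch adds the (device, beam, chain) triple that yields the largest increase in total throughput and stops when no triple improves it.

```python
import math


def total_throughput(assign, gain, crosstalk, noise):
    """Sum of log2(1 + SINR) over all assigned links (assumed rate model)."""
    total = 0.0
    for i, (b, r) in assign.items():
        # Assumed interference model: leakage onto beam b from every other active beam.
        interference = sum(crosstalk[b][c] for j, (c, _) in assign.items() if j != i)
        total += math.log2(1.0 + gain[i][b][r] / (interference + noise))
    return total


def greedy_allocate(gain, crosstalk, noise=1.0):
    """Greedily assign one (beam, chain) pair per device.

    gain[i][b][r]   -- hypothetical effective received power of device i
                       on beam b and RF chain r
    crosstalk[b][c] -- hypothetical leakage factor from active beam c onto beam b
    Returns a dict {device: (beam, chain)}.
    """
    n_dev, n_beam, n_chain = len(gain), len(gain[0]), len(gain[0][0])
    assign = {}
    while len(assign) < min(n_dev, n_beam, n_chain):
        used_beams = {b for b, _ in assign.values()}
        used_chains = {r for _, r in assign.values()}
        current = total_throughput(assign, gain, crosstalk, noise)
        best_delta, best = 0.0, None
        for i in range(n_dev):
            if i in assign:
                continue
            for b in range(n_beam):
                if b in used_beams:
                    continue
                for r in range(n_chain):
                    if r in used_chains:
                        continue
                    trial = dict(assign)
                    trial[i] = (b, r)
                    delta = total_throughput(trial, gain, crosstalk, noise) - current
                    if delta > best_delta:
                        best_delta, best = delta, (i, b, r)
        if best is None:  # no remaining triple increases total throughput
            break
        i, b, r = best
        assign[i] = (b, r)
    return assign
```

Each round scans every free (device, beam, chain) triple, so the sketch trades optimality for a polynomial number of throughput evaluations, in contrast to an exhaustive search over all joint assignments.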
Remark 1. The computational complexity of the proposed beamforming and device selection mechanism is very high, and the mechanism may not be feasible in real time, as it requires an exhaustive search of O() operations. Therefore, along the lines of [20,21], we implement a greedy algorithm for the joint allocation of RF chains and beams to devices. Observe in Algorithm 1 that we attempt to allocate the RF chain with the largest throughput gain (where possible) in each window scheduling round. In particular, the computational complexity of Algorithm 1 is O().
3.2. Understanding Delay Balancing
In light of the above discussion, we now formulate an optimization problem and explain the challenges in solving it. Our aim is to design a low-delay MPTCP scheduler that (i) is aware of the congestion windows across all links and (ii) exploits the impact of MAC/PHY fluctuations on the window transmission time. Let be the number of available links and let and denote the (ergodic) inter-packet delay and propagation delay, respectively, experienced by a packet m transmitted over link ı, where after each scheduling round can be estimated from Algorithm 1 as .
Given a total of
W packets to be transmitted across the associated links, let
denote the number of packets (congestion window) sent over link
ı, with
, and
. The total transmission time of
W packets (
W is the aggregate congestion window) is then given by
Observe in (
5) that the challenge in designing an MPTCP scheduling algorithm for bulk file transfer is, in fact, that the per-packet delays
are uncertain and variable, and so is the total delay
. One could attempt to compute the distribution of
for all links, but this becomes computationally expensive (and incurs long delays) with increasing
W (so it does not work effectively for real-time packet transmissions). We also lack details of the distribution of per-packet delays on 802.11ay links at the TRA layer.
Next, we observe the following approximate approach. For
, in our framework, the
are assumed to be independent but not identically distributed. This is precisely the feature that we want to exploit, as the transmission times along different links stem from different distributions. Then, by using (
5), we have
Unlike other models, we relax the independence assumption between packets on a link (there will typically be dependence), instead assuming independence between packet delays on distinct links. To approximate
in (
7), we select the variability parameters
,
so that
where
and
N is a standard normal random variable. By considering the delays,
, as normal random variables, we can compute the variability parameter,
, as
where the second term in is a sum over
such that
m and
is a pair selected without repetitions out of
, where
.
As a result, we solve (
7), using
as the standard normal cumulative distribution function,
The mean and the variability parameter on link ı can be approximated by computing the average, variance, and covariance of the previous 200 packets (a reasonable upper bound on the current size of web objects), using the times between arrivals of acknowledgments observed from the multipath scheduling source.
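The estimation step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name `LinkDelayEstimator`, the lag cutoff `max_lag`, and the stationarity assumption are introduced here. Under stationarity, the general identity Var(d_1 + ... + d_w) = sum of variances + 2 × sum of pairwise covariances reduces to w·c_0 + 2·Σ_k (w − k)·c_k, where c_k is the lag-k autocovariance of the inter-ACK delays.

```python
from collections import deque


class LinkDelayEstimator:
    """Per-link delay statistics from inter-ACK times (hypothetical helper)."""

    def __init__(self, history=200):
        # Sliding window over the most recent inter-ACK delays.
        self.samples = deque(maxlen=history)
        self.last_ack = None

    def on_ack(self, arrival_time):
        """Record the time between this ACK and the previous one."""
        if self.last_ack is not None:
            self.samples.append(arrival_time - self.last_ack)
        self.last_ack = arrival_time

    def mean(self):
        """Average inter-packet delay; needs at least one recorded sample."""
        return sum(self.samples) / len(self.samples)

    def autocovariance(self, lag):
        """Biased (1/n) sample autocovariance at the given lag."""
        xs = list(self.samples)
        mu = self.mean()
        n = len(xs)
        return sum((xs[k] - mu) * (xs[k + lag] - mu) for k in range(n - lag)) / n

    def window_variance(self, w, max_lag=5):
        """Variance of the sum of w consecutive delays, assuming stationarity.

        Lags beyond max_lag are ignored, which is an assumption of this sketch.
        """
        total = w * self.autocovariance(0)
        top = min(max_lag, w - 1, len(self.samples) - 1)
        for lag in range(1, top + 1):
            total += 2 * (w - lag) * self.autocovariance(lag)
        # Truncating the lag sum can give a small negative value; clamp it.
        return max(total, 0.0)
```

A scheduler would call `on_ack` for every acknowledgment on a link and read `mean()` and `window_variance(w)` when evaluating a candidate window size `w`.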
Our objective in the proposed scheduling is to find appropriate
w that approximately solves the following optimization
where
represents short run averages of
. The idea is to stochastically schedule
packets into
windows. When we relax the optimization (
11) so that
, then we observe that a fractional number of packets can be sent on each link; for sufficiently large
W, the solution to the optimization in (
11) ensures (approximately) equal delay on all links and so is of Wardrop form. As will be discussed soon in
Section 5, however, we are restricted to transmitting an integer number of packets on each link, at the cost of small inequalities in delay due to quantization. We can observe that the solution
of (
11)
intuitively satisfies
given that all links are used to transmit packets, i.e., {all elements of
. Note that the inter-packet delay is never zero on any link; therefore, the mean link delay
and the variability parameter
are both nonzero. Then, observe that
remains satisfied, when we reduce
by a small positive constant,
, and increase
by
. This is always possible since
and {
and/or
}
. Furthermore, we find
by using (
10) to (
12), which is our proxy for the
.
3.3. How the MPTCP Scheduler May Solve (11)
Consider that we solve (
11) by using feedback-based optimization, i.e., first solving (
12) (convex optimization) and then searching over the integer vectors obtained by taking the floor or ceiling of each
to find the integer vector
w minimizing
. This search is over
combinations and is fast for a realistic number of links. For example, when
, one can use a binary search of computational cost
. With new insights from (
12), we include a
target delay factor,
, in our multipath scheduler, representing the overall delay limits,
. Furthermore, we set
as a global parameter in our design, equal for all subflows of a connection across all links, thus solving (
12). More specifically, the anticipated MPTCP scheduling algorithm should solve (
12) (feedback-based autonomous optimization) jointly by (i) improving the congestion avoidance (by approximately solving (
11) for actuation) and (ii) learning the required parameters (locally in the system). We demonstrate examples of such simultaneous
explore-exploit approaches using bandits and federated bandits in the next section.
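The floor/ceiling search described in this subsection can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' scheduler: the relaxed (fractional) solution `w_relaxed` is taken as given, and the per-link delay proxy `link_delay` (propagation delay plus mean term plus a `z`-weighted spread that grows with the square root of the window) is a hypothetical stand-in for the paper's delay expression.

```python
import itertools
import math


def link_delay(w, mu, sigma, prop, z=1.0):
    """Hypothetical delay proxy for sending w packets on one link."""
    if w == 0:
        return 0.0  # an unused link contributes no delay
    return prop + w * mu + z * sigma * math.sqrt(w)


def round_schedule(w_relaxed, total_packets, mu, sigma, prop, z=1.0):
    """Search floor/ceiling roundings of the relaxed solution.

    Returns (best integer vector, its worst-link delay), considering only
    vectors whose entries sum to total_packets. At most 2**n candidates
    are examined for n links.
    """
    choices = [sorted({math.floor(x), math.ceil(x)}) for x in w_relaxed]
    best_w, best_cost = None, math.inf
    for cand in itertools.product(*choices):
        if sum(cand) != total_packets:
            continue
        # Total transmission time is governed by the slowest link.
        cost = max(
            link_delay(w, mu[i], sigma[i], prop[i], z) for i, w in enumerate(cand)
        )
        if cost < best_cost:
            best_w, best_cost = cand, cost
    return best_w, best_cost
```

For example, with two links, a relaxed solution of `[2.5, 1.5]`, four packets in total, mean delays `[1.0, 2.0]`, and zero spread and propagation delay, the candidates `(2, 2)` and `(3, 1)` are compared and `(3, 1)` is selected because its slowest link finishes earlier.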