Internet of Drones: Improving Multipath TCP over WiFi with Federated Multi-Armed Bandits for Limitless Connectivity

Pokhrel, Shiva Raj; Mandjes, Michel

doi:10.3390/drones7010030


by Shiva Raj Pokhrel ¹,* and Michel Mandjes ²

¹ School of Information Technology, Deakin University, Geelong, VIC 3220, Australia
² Faculty of Science, Mathematics and Informatics Korteweg-de Vries Institute, University of Amsterdam, 1090 GE Amsterdam, The Netherlands
* Author to whom correspondence should be addressed.

Drones 2023, 7(1), 30; https://doi.org/10.3390/drones7010030

Submission received: 24 November 2022 / Revised: 17 December 2022 / Accepted: 20 December 2022 / Published: 31 December 2022


Abstract:

We consider multipath TCP (MPTCP) flows over the data networking dynamics of IEEE 802.11ay for drone surveillance of areas using high-definition video streaming. Mobility-induced handoffs are critical in IEEE 802.11ay (because of the smaller coverage of mmWaves) and adversely affect the performance of such data streaming flows. As a result of the enhanced 802.11ay network events and features (triggered by beamforming, channel bonding, MIMO, mobility-induced handoffs, channel sharing, retransmissions, etc.), the time taken for packets to travel end-to-end in 802.11ay is inherently time-varying. Several fundamental assumptions inherent in stochastic TCP models, including Poisson arrivals of packets, Gaussian processes, and parameter certainty, are challenged by the improved data traffic dynamics over IEEE 802.11ay networks. Consequently, the MPTCP model’s state estimation differs largely from the actual network values. We develop a new data-driven stochastic framework to address current deficiencies of MPTCP models and design a foundational architecture for intelligent multipath scheduling (at the transport layer) considering lower layer (hybrid) beamforming. At the heart of our cross-layer architecture is an intelligent learning agent for actuating and interfacing, which learns optimal packet cloning, scheduling, aggregation, and beamforming from experience, using successful features of multi-armed bandits and federated learning. We demonstrate that the proposed framework can estimate and optimize jointly (explore–exploit) and is more practicable for designing the next generation of low-delay and robust MPTCP models.

Keywords:

internet of drones (IoD); multi-armed bandits (MAB); multipath TCP (MPTCP); scheduling; beamforming; cross-layer approach; federated learning; packet replication; opportunistic scheduling; IEEE 802.11ay

1. Introduction

The IEEE task group ‘ay’ defined technical advances for harnessing MIMO (Multiple Input Multiple Output), channel bonding/aggregation, and beamforming techniques to deliver very high speed (100 Gbit/s) and extremely low latency over WiFi [1]. The emerging concept of ‘limitless connectivity’ mandates the availability of reliable WiFi connections that can handle extreme demands for both speed and coverage. This calls for innovation in WiFi architecture and new methods of data networking. With the growing presence of bandwidth-hungry mobile applications, extremely high throughput demands, and WiFi 6 (and 6E) saturation, the WiFi alliance is investigating the IEEE 802.11be standard [2], known as WiFi 7 (a major move towards limitless connectivity). The standard is currently in the drafting phase, and final approval of WiFi 7 is expected in 2024, although products will likely be out sooner once the draft protocols are approved.

Of particular relevance to this work is the niche 60 GHz WiFi, 802.11ay, which has specified new advances to the physical (PHY) and medium access control (MAC) layers [1]. Built on IEEE 802.11ad, 802.11ay brings numerous technical advances, including MIMO, channel bonding, aggregation, and beamforming, to deliver 100 Gbit/s of throughput and extremely low latency [3,4]. This wide arsenal of technological advances opens the door for emerging applications, such as distributed mmWave networks, virtual and augmented reality, real-time control/actuation, and 8K video streaming. The coverage area of IEEE 802.11ay is relatively small due to its use of millimeter waves (mmWave). However, IEEE 802.11ay has an inherent capability to meet real-time constraints even in a highly congested coverage area by adopting mmWave and MIMO-based beamforming.

The new design of IEEE 802.11ay focuses mainly on the PHY and MAC layer solutions. In particular, the three main design goals specified by the IEEE Task Group are to

Provide a minimum of 20 Gbit/s throughputs;
Be backward compatible with IEEE 802.11ad; and
Broaden the collection of conceivable use cases and situations by offering innovative PHY and MAC layer solutions.

The integration of sophisticated technologies at these two layers, prevalent in modern WiFi networks, enables the majority of future application and service requirements. Channel bonding, MIMO, aggregation, multi-user transmission, and fast beamforming training are examples of such enhancements. However, because of the smaller mmWave coverage area and the limited capabilities of TCP congestion control models, the utilization of such advanced features is often limited to stationary networks with fixed devices. In mobile networks, handoffs are common due to the small coverage, which adversely impacts the TCP model’s performance. Multipath TCP (MPTCP) [5] has been standardized by the IETF and is used by Apple; however, the transport (TRA) layer still provides less support than required. The resulting delay for (re)connection due to loss recovery and handover management can be substantial, to the point where it essentially undermines the multipath performance gain from the aforementioned PHY/MAC layer advances.

In particular, the stochastic design of MPTCP requires a major rethink to realize the anticipated changes. There are several fundamental assumptions that are challenged, including Poisson arrivals of packets, Gaussian process, and parameter certainty inherent in the stochastic TCP modeling at the TRA layer, which requires a drastically new approach to harness the IEEE 802.11ay PHY/MAC layer advances.

1.1. Motivating Example

Our motivating example and research hypothesis in this paper can be outlined as follows. A motivating scenario of a convoy of drones hovering over the Darling Harbour of Sydney (Australia) is shown in Figure 1. A navy ship is visiting a civilian port at the Darling Harbour. A festival (e.g., New Year’s Eve) is taking place, and the ship has docked alongside the wharf. Improvised hazards to the ship might be experienced when the ship approaches the Harbour or offshore. The area is highly congested, consisting of many people, vehicles, media, and buildings. The Harbour is temporarily secured, so access to the ship is highly restricted. The safety of the ship and its crew, together with the public attending the event and residing in the neighborhood, is highly important.

Various possible dangers in and around the wharf and across the neighborhood must be monitored continuously, detected precisely, and resolved in real time. The responses should be seamless, without any adverse impact, and not be detrimental to the public and properties. To this end, we advocate a solution considering a convoy of drones for surveillance (high-definition video) over the Harbour area using an 802.11ay antenna array system, as illustrated in Figure 1. We use the blueprint from https://www.turbosquid.com/3d-models/3d-sydney-darling-harbour-convention-model-1304512 (accessed on 20 November 2022).

To elaborate, in such a mobile environment of the Internet of Drones, it is highly challenging and non-trivial to meet real-time constraints and mobility control requirements over 802.11ay networks. More than just an increase in speed, in congested areas such as stadiums and festivals, IEEE 802.11ay has been designed with MIMO beamforming and antenna arrays to facilitate multi-connectivity and improved performance. However, mobility-induced handoffs are critical in 802.11ay because of the smaller coverage of mmWaves, which adversely affects the performance of the MPTCP flows when used for surveillance of the harbor areas with HD video streaming.

As mentioned, the mobility of nodes often causes frequent handoffs, timeouts, and Doppler effects due to the small coverage area. The handover process of mmWave WiFi in an environment with multiple access points (APs) is quite complex and involves several signaling exchanges, which is time-consuming. Such a handover procedure increases delay, making it impracticable for future real-time communication and control applications. Therefore, our primary focus in this work is to alleviate the need for handoffs by exploiting the proven multi-connectivity features of the TRA layer (e.g., MPTCP [5]) over 802.11ay networks for real-time video streaming and control (e.g., actuation, automation, orchestration of a convoy of drones for surveillance).

With such multi-connectivity, driven by an efficient cross-layer architecture that tightly couples the TRA, MAC, and PHY layer mechanisms over multiple paths, the IEEE 802.11ay Task Group’s vision could become a reality at the application level: to efficiently harness advances in PHY/MAC layer technology, including MIMO, beamforming, and channel bonding, and thereby enhance performance and reliability by multiple orders of magnitude.

Complementary to PHY/MAC-Layer advances, TRA layer multipath TCP solutions (for addressing handoffs and harnessing bandwidths) apply load-balanced control over several links to provide end-to-end multipath network flow and ease congestion; see [6] and the references therein, e.g., MPTCP [5,7] and emerging multipath versions of QUIC (MPQUIC) [8]. The scheduling mechanism in MPTCP and not-so-futuristic MPQUIC plays a critical role in efficiently allocating packets over the surrounding multiple links simultaneously.

1.2. Challenges and Contributions

In this subsection, we will first discuss the technical challenges of existing MPTCP over IEEE 802.11ay networks. Thereafter, we discuss the impending challenges of existing stochastic and queuing theories used for designing MPTCP protocols. Stochastic theories and fluid modeling have been celebrated in designing and evaluating new TCP and MPTCP algorithms. In general, we follow a waterfall approach for designing these algorithms [7]. To this end, we first develop approximate models using fluid and statistical estimation, evaluate their performance with queuing theory and/or probabilistic reasoning, and then employ optimization for maximizing network utility (commonly referred to as Network Utility Maximization in TCP literature) by adjusting the algorithm parameters.

1.2.1. Multipath TCP over IEEE 802.11ay

Discovering a prevailing, efficient, and high-performing multipath TCP scheduling mechanism over 802.11ay mmWave and MIMO beamforming links is undoubtedly a non-trivial task. Substantial research works are being conducted on multipath scheduling to overcome the impeding limitations [9]. To the best of our knowledge, this paper is a first step towards developing an intelligent cross-Layer framework for the analysis and collaborative design of efficient multipath scheduling with optimal bonding and beamforming over IEEE 802.11ay networks.

Multipath data transport protocols have lately been promoted for network convergence. The potentiality of MPTCP is advocated by 3GPP [10], where the proposed ATSSS architecture envisions 5GC-based support for transport layer multipath connectivity across 3GPP and non-3GPP networks. As a result, MPTCP schedulers that operate efficaciously in convergent network situations are in very high demand [10].

Future 6G (sixth generation) and WiFi networks will present new issues for MPTCP scheduling because their links (e.g., mmWave, MIMO, and beamforming advances) impose substantially more transient variation and dynamicity than that of the existing lower frequency networks. Furthermore, future use cases (recall Figure 1) necessitate significant gains in targeted performance requirements and QoE. As a result, multipath scheduler investigation can help illuminate existing and potential unanticipated performance restraints of the emerging applications, ultimately aiding in the invention of a more efficient TRA-Layer approach for 802.11ay.

To this end, we identify the following key bottleneck challenges, being gaps in our knowledge:

∘: Uncertain mmWave links and delays. Future applications such as trajectory control and haptic communications require extremely low delay for control information and high-speed video delivery guarantees. However, as a result of queueing, beamforming, channel bonding, MIMO, mobility-induced handoffs, other connections sharing the link, retransmissions, etc., the time taken for a packet to travel end-to-end is inherently stochastic and time-varying. Therefore, packets can easily arrive at the devices reordered, in which case they need to be buffered until they can be delivered in the correct order to the TRA layer [11], creating head-of-line (HoL) blocking. The resulting buffering delay can be substantial, to the point where it essentially undermines the speed/latency gain from the use of several 802.11ay links.
∘: Unknown delays in learning PHY/MAC dynamics from the TRA layer. When a connection is first established, the end hosts are unaware of the link’s properties (e.g., queuing/retransmissions) and PHY/MAC techniques (e.g., MIMO/bonding, backoffs, etc.). Typically, information is limited to what is obtained during the first connection handshake and, perhaps, stored historical data. As the connection proceeds, feedback is received from packet transmissions, but this feedback is delayed (one round trip time (RTT) can correspond to the transmission of hundreds of packets over high-speed 802.11ay). Small transfers (with short control information from the server) thus have limited information about link characteristics, and bulk connections (8K video streaming uplink to the server) need to learn the link characteristics on the fly (e.g., simultaneous tracking, estimation, and optimization) as packets are being transmitted.
∘: Low delay forward error correction (FEC). The most efficient way to use coded packets or packet replication (e.g., cloning) over modern WiFi is currently unclear, and the design of low-delay cloning/coding for multipath links remains poorly understood in the literature.
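To make the delayed-feedback point in the second challenge concrete, the following minimal sketch computes how many packets are sent during one RTT. The link rate (8 Gbit/s), the RTT (1 ms), and the 1500-byte packet size are illustrative assumptions, not measurements from this paper, and the function name is hypothetical.

```python
def packets_per_rtt(rate_bps: float, rtt_s: float, packet_bytes: int = 1500) -> float:
    """Packets transmitted during one RTT (the bandwidth-delay product in packets)."""
    return rate_bps * rtt_s / (packet_bytes * 8)


# Assumed values: 8 Gbit/s link, 1 ms RTT, 1500-byte packets.
in_flight = packets_per_rtt(rate_bps=8e9, rtt_s=1e-3)
print(f"about {in_flight:.0f} packets are sent before the first feedback arrives")
```

Under these assumptions, several hundred packets have already been scheduled before the sender learns anything about the fate of the first one, which is why the scheduler has to estimate and optimize at the same time.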

1.2.2. Modeling Challenges

There is significant room for improvement in this stochastic TCP design process, as we need to estimate and optimize simultaneously (instead of fully decoupled estimation followed by optimization). In addition, we typically make several fundamental assumptions (including Poisson arrivals of packets, Gaussianity of underlying processes, etc.) and take system parameters for granted without accounting for their uncertainties.

With advances in AI and machine learning, the emergence of high-speed wireless networks, the availability of big data, and improved computing capabilities, we can develop interfaces for TCP sources (Delay Adapted Linked Increases Algorithm (DALIA) [11]) to interact with the system and create a new paradigm of MPTCP design using exploration and exploitation. We need to advance our understanding of TCP modeling. Are existing assumptions (Poisson arrivals, Gaussianity [12], etc.) realistic? For example, arrivals in the highly fluctuating IEEE 802.11ay networks are often more variable (overdispersed [13]) than captured by the Poisson process due to external events. (Perhaps today’s Internet traffic over IEEE 802.11ay cannot be captured by a pure Gaussian process [12].) Therefore, we need to rethink the underlying staffing rules [14]. Indeed, there are various features that could not be captured and explained by existing stochastic TCP models [15,16] for data packet flows in modern wireless networks.
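Overdispersion can be checked with the index of dispersion (variance divided by mean) of per-slot arrival counts, which equals 1 for a Poisson process. The sketch below compares a plain Poisson stream with a stream whose rate switches randomly between two levels. The rates (2, 5, and 8 packets per slot), the sample size, and all function names are assumptions chosen for illustration; they are not traffic parameters from this paper.

```python
import math
import random


def poisson_sample(rate, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit = math.exp(-rate)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def dispersion_index(counts):
    """Sample variance divided by sample mean (1 for Poisson counts)."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
    return var / mean


rng = random.Random(1)
n = 20_000
# Constant-rate arrivals: the Poisson assumption holds.
poisson_counts = [poisson_sample(5.0, rng) for _ in range(n)]
# Rate modulated by an external event (assumed two-level switching).
bursty_counts = [poisson_sample(rng.choice([2.0, 8.0]), rng) for _ in range(n)]

print("Poisson index:", round(dispersion_index(poisson_counts), 2))
print("Modulated index:", round(dispersion_index(bursty_counts), 2))
```

Both streams have the same mean of 5 packets per slot, but the modulated stream has variance 5 + 9 = 14, so its index is close to 2.8 rather than 1.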

Another impeding challenge concerns the estimation of the input of queues from observations, so as to detect the changes in arrivals (i.e., changepoint detection [17]). In addition, one would like to have techniques to take into account the uncertainty of the parameters [14]. Data-driven techniques sometimes reveal unforeseen scenarios and phenomena, and it is essential to build robust TCP models with improved queuing-theoretic results that can drive the congestion dynamics in a realistic and desirable fashion. Besides all of these observations, our main focus in this work is on techniques like (i) bandits [18], which can help to explore and exploit in the queuing context and (ii) federated learning [19], which provides a collaborative network interface to give congestion information to the TCP models.

1.2.3. Contributions

We develop a novel cross-layer architecture to address current deficiencies in the mobile IEEE 802.11ay framework that enables simultaneous modeling and optimization of multipath schedulers and beamforming mechanisms. We handle the aforementioned challenges in developing a new architecture and experience-driven protocol with the following three key solutions:

Intelligent multipath scheduler. We investigate developing learning-based (intelligent) multipath scheduling algorithms that distribute information and coded packets to links with the least amount of delay at the receiver.
Collaborative beamforming and scheduling design. Considering the challenges in designing intelligent FEC-aware scheduling and low latency error-correcting codes at the PHY/MAC layer, the influence of the data size on the ideal multipath scheduler, and the necessity for collaborative bonding, beamforming, and scheduling, we employ multi-armed bandit with federated learning (FL) over the links, to explore and exploit optimal characteristics for the proposed integrated cross-layer design.
Multipath network support from the TRA layer. In addition to the above innovations, we develop a coupled low-delay multipath scheduling protocol (LD-MPSP) for 802.11ay networks. We design tuneable packet cloning to adapt to the size of the flows and resolve the impeding challenges over erroneous 802.11ay links.
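The explore-exploit idea behind a learning-based scheduler can be sketched with a basic epsilon-greedy bandit that picks, for each packet, the path with the lowest estimated delay. This is a generic textbook bandit under assumed Gaussian path delays, not the algorithm developed later in this paper; the mean delays, noise level, and all names are hypothetical.

```python
import random


def epsilon_greedy_paths(mean_delays_ms, rounds=2000, epsilon=0.1, seed=0):
    """Return (pulls, avg_delay) per path after epsilon-greedy path selection."""
    rng = random.Random(seed)
    n = len(mean_delays_ms)
    pulls = [0] * n
    avg_delay = [0.0] * n
    for t in range(rounds):
        if t < n:
            arm = t  # try every path once
        elif rng.random() < epsilon:
            arm = rng.randrange(n)  # explore a random path
        else:
            arm = min(range(n), key=lambda i: avg_delay[i])  # exploit best estimate
        # Assumed delay model: Gaussian noise (std 5 ms), truncated at zero.
        delay = max(0.0, rng.gauss(mean_delays_ms[arm], 5.0))
        pulls[arm] += 1
        avg_delay[arm] += (delay - avg_delay[arm]) / pulls[arm]  # running mean
    return pulls, avg_delay


pulls, avg_delay = epsilon_greedy_paths([10.0, 30.0])
print("packets per path:", pulls)
print("estimated delays (ms):", [round(d, 1) for d in avg_delay])
```

The scheduler keeps sending a small fraction of packets on the slower path, which is what allows it to notice when the path delays change.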

1.3. Literature

Of particular relevance to this research work is the large-scale MIMO designs presented in [20,21]. HEKATON [20] presented a compressive sensing-based estimation scheme for a phased-array antenna using the sounding reference signal of LTE. BUSH [21] developed a blind estimation mechanism that works with a single RF chain without any reference signal. However, they [20,21] do not consider the new advances in 802.11ay networks.

Looking ahead, several more 6G goals and the progression of the data speeds of wireless networks generally target Tbps from 2030 [22]. The primary goal of multi-connectivity with IEEE 802.11ay networks lies in enhancing the performance and quality of experience (QoE) at the end device while promoting load balancing and bonding across several links. Within the envisaged environment, this objective of sustainable convergence will undoubtedly lead to many concerns for the ongoing wireless research as to whether and how this convergence and the target data rate are feasible. Since different experts from several fields also pose an identical problem, we look from a cross-layer perspective to the upcoming IEEE 802.11ay obstacles and opportunities before us [4].

Perhaps two best-known examples of a TRA-Layer multipath transport that guarantees reliable in-order delivery are MPTCP LIA [23] and DALIA [11] (as reported in [24]). Multipath schedulers have been developed based on a round-robin approach, a lowest RTT first strategy, retransmission and penalization, and delay awareness (using an indirect measurement of HoL blocking). More recently, schedulers have been proposed based on bufferbloat mitigation. Earliest Delivery Path First (EDPF) has served as the baseline scheduler for much work. While it is optimal for loss-free links with a constant known delay, EDPF cannot take into account the time-varying nature of link delays (due to bonding, beamforming, MIMO, etc.) and also packet loss. Mitigating the HoL blocking in multipath connections remains a challenging and largely unsolved research problem [9], especially for short control information transfers and/or mmWave links with asymmetric and fluctuating characteristics [6,9].
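The earliest-delivery idea behind EDPF can be sketched as follows: each packet goes to the path on which its estimated delivery time is smallest. The sketch assumes loss-free paths with constant, known bandwidth and one-way delay, which is exactly the regime in which the text says EDPF is optimal. The path parameters and function name are hypothetical, and the sketch is a simplification rather than the published scheduler.

```python
def edpf_schedule(num_packets, packet_bits, paths):
    """Assign packets to paths by earliest estimated delivery time.

    paths: list of (bandwidth_bps, one_way_delay_s) tuples, assumed constant.
    Returns a list of (path_index, estimated_delivery_time_s) per packet.
    """
    free_at = [0.0] * len(paths)  # time at which each path can start a new packet
    schedule = []
    for _ in range(num_packets):
        best_path, best_time = 0, float("inf")
        for i, (bandwidth, delay) in enumerate(paths):
            delivery = free_at[i] + packet_bits / bandwidth + delay
            if delivery < best_time:
                best_path, best_time = i, delivery
        free_at[best_path] += packet_bits / paths[best_path][0]
        schedule.append((best_path, best_time))
    return schedule


# Two assumed 1 Mbit/s paths with 10 ms and 45 ms one-way delay.
plan = edpf_schedule(num_packets=6, packet_bits=10_000,
                     paths=[(1e6, 0.010), (1e6, 0.045)])
print([path for path, _ in plan])
```

The slower path is only used once the queue on the faster path has grown enough to offset the delay difference. When the delays fluctuate, as over 802.11ay, the `delay` values used in the estimate are stale, which is the weakness noted above.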

Fortunately, federated learning (FL) proposed by Google [25] (and enhanced by Deakin [19]) can potentially perform the required distributed learning, where each device uploads its local learning model at fixed slots to the server for (ensemble) averaging of the learning and develops a global model, which in turn is downloaded by the device for their local learning updates [25,26]. However, to fully utilize the potential of the federated learning for intelligent scheduling over IEEE 802.11ay, its slot duration needs to be highly adaptive, and the multi-armed bandit approach [27,28] can efficiently moderate such a distributed dynamism [29].
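The server-side averaging step described above can be sketched in a few lines. Weighting each device by its number of local samples follows the common federated averaging recipe and is an assumption here; the text only states that the server averages the uploaded models. Models are represented as plain lists of parameters, and the function name is hypothetical.

```python
def federated_average(local_models, sample_counts):
    """Sample-weighted average of local parameter vectors (one list per device)."""
    total = sum(sample_counts)
    dim = len(local_models[0])
    return [
        sum(model[k] * n for model, n in zip(local_models, sample_counts)) / total
        for k in range(dim)
    ]


# Two devices upload two-parameter models; the second saw three times more data.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], sample_counts=[1, 3])
print(global_model)
```

Each device would then download `global_model` and continue its local updates from it until the next slot.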

1.4. Organization

The remainder of this paper is structured as follows. In Section 2, we explain the underlying background and network scenario of 802.11ay in detail, discuss observations, and explain the problems of MPTCP packet scheduling from the transport layer. Our modeling (including its beamforming and selection mechanism, delay balancing across multiple links, and their interactions and the complexity of the optimization) is described in Section 3. In Section 4, we employ bandits to solve the delay-balanced multipath scheduling and improve it with federated learning and a packet cloning mechanism. With relevant insights developed from analytic modeling, learning, and exploitation, we develop a novel low-delay multipath scheduling protocol in Section 5 and explain its properties. In Section 6, we validate the results from our analytical model using NS-3 simulation and compare the proposed framework with celebrated MPTCP protocols. Section 7 concludes the paper.

2. Network Scenarios and Problems

In this section, we first explain the underlying background of IEEE 802.11ay in detail, then discuss throughput observations of various cases of TCP and MPTCP with four bonded channels and explain the problems of MPTCP packet scheduling from the transport layer and the TCP design issues. In particular, we explain several fundamental assumptions inherent in stochastic TCP models that are challenged by the improved data traffic dynamics of IEEE 802.11ay networks.

2.1. Background of 802.11ay

IEEE 802.11ay supports EDMG (Enhanced Directional Multi-Gigabit) training fields. As shown in Figure 2, the EDMG frame consists of DMG (Directional Multi-Gigabit) preambles and DMG header fields for compatibility with 802.11ad. It is separated into two portions: (i) the non-EDMG portion (identifiable by DMG devices and consisting of an L-STF (Legacy-Short Training Field), L-CEF (Legacy-Channel Estimation Field), and legacy header fields); (ii) the EDMG portion, comprising the fields recognized by EDMG Stations (STAs), such as their STF and CEF fields, as well as the new headers A and B. IEEE 802.11ay supports three types of frames: Control (DMG beacons, Beamforming Training, etc.), Single Carrier (SC), and Orthogonal Frequency Division Multiplexing (OFDM). For data networking, we can use EDMG SC (modulation and coding schemes 1-21, maximum of 8085 Mbit/s) or EDMG OFDM (20 modulation and coding schemes, maximum of 8316 Mbit/s).

A high-level view of the considered multipath data networking scenario in the Internet of Drones using IEEE 802.11ay is shown in Figure 3. The channel configuration of IEEE 802.11ay allows eight 2.16 GHz channels in operation. It facilitates the bonding of a contiguous group of channels to enhance the transmission rate significantly. Bonding a maximum of four channels results in a width of 8.64 GHz, and the standard entails using two bonded channels.
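The bonding arithmetic can be checked with a short sketch that lists every contiguous group of channels of a given size and the resulting bonded width. It enumerates groups purely arithmetically over channels numbered 1 to 8; the channel combinations actually permitted by the standard may be narrower, and the constant and function names are hypothetical.

```python
CHANNEL_WIDTH_MHZ = 2160  # one 2.16 GHz channel
NUM_CHANNELS = 8


def bonded_groups(k):
    """All contiguous groups of k channels, channels numbered 1..NUM_CHANNELS."""
    return [tuple(range(start, start + k))
            for start in range(1, NUM_CHANNELS - k + 2)]


def bonded_width_ghz(k):
    """Total width in GHz when k channels are bonded."""
    return k * CHANNEL_WIDTH_MHZ / 1000


for k in (2, 4):
    print(k, "channels:", bonded_width_ghz(k), "GHz,",
          len(bonded_groups(k)), "contiguous groups")
```

For four bonded channels this gives the 8.64 GHz width quoted above.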

To address the switching time limitation (36 ns) of the beam refinement protocol (BRP) introduced in 802.11ad, IEEE 802.11ay redesigned the training field to adapt to device heterogeneity and flexible switching times. The training field consists of a configurable number of units, and each unit includes several subfields, including six Golay sequences. IEEE 802.11ay adds a variable Golay sequence that can be set flexibly by the user and, in the case of channel bonding, also depends on the number of contiguous channels. Golay sequences have strong correlation characteristics, making them ideal for estimation and beamforming. To assist the estimation and beamforming for MIMO communication, IEEE 802.11ay enables a uniquely identified orthogonal set of these sequences for each space-time stream.
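The correlation property that makes Golay sequences attractive for channel estimation can be verified numerically: for a complementary pair, the two aperiodic autocorrelations cancel at every nonzero lag. The sketch below uses the generic recursive concatenation construction for illustration; it does not reproduce the specific sequences defined in 802.11ad/ay, and the function names are hypothetical.

```python
def golay_pair(levels):
    """Build a Golay complementary pair of length 2**levels by concatenation."""
    a, b = [1], [1]
    for _ in range(levels):
        a, b = a + b, a + [-x for x in b]
    return a, b


def autocorrelation(seq, lag):
    """Aperiodic autocorrelation of seq at the given non-negative lag."""
    return sum(seq[i] * seq[i + lag] for i in range(len(seq) - lag))


a, b = golay_pair(5)  # two sequences of length 32
combined = [autocorrelation(a, k) + autocorrelation(b, k) for k in range(len(a))]
print("lag 0:", combined[0], "| all other lags zero:",
      all(v == 0 for v in combined[1:]))
```

The summed autocorrelation is a single peak of height twice the sequence length, so a receiver correlating against both sequences sees no sidelobes from the training field itself.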

IEEE 802.11ay has two types of MIMO: (i) SU-MIMO is a single-user MIMO that enables two devices to transmit and receive multiple spatial streams (up to eight); (ii) MU-MIMO is a multi-user MIMO for downlinks, which enables an AP to send distinct spatial streams to several devices (up to eight) at the same time. In particular, 802.11ay supports MIMO for a multifold improvement in throughput. It allows simultaneous transmission and reception of up to eight spatial streams at the same frequency and time. In addition, for MIMO transmission, the standard requires the capability of analog RF precoding.

Phased antenna arrays can synthesize a narrow beam pattern in this mode to establish a spatial channel for each stream; however, depending on the quality of the phase shifters and the arrangement, it is often not plausible to generate pencil beams with low inter-stream interference. This explains the need for the collaborative design of beamforming and channel allocation at lower layers while performing performance-oriented MPTCP congestion control and scheduling from the TRA layer. As illustrated in Figure 3, it is worth noting that our main focus in this paper is to develop a complete understanding of the dynamics of MPTCP over IEEE 802.11ay networks to identify and ameliorate the MPTCP design issues, which is still lacking, and this is a highly challenging task.

2.2. Observations

First, we perform simulations for both EDMG modes with different channel widths to observe the throughputs perceived by the fixed and mobile devices over IEEE 802.11ay using the TCP and MPTCP protocols (see details of the network protocol and parameters later in Table 2). Our test scenario comprises two IEEE 802.11ay devices connected by a one-meter Line-Of-Sight (LOS) AP. We begin by setting the two devices to utilize the best beam pattern possible, resulting in a high SNR value and eliminating packet loss. The results of maximum throughput are shown in the first row of Table 1. Thereafter, we repeat the same experiment under the same setup, except that the devices are moving across multiple APs at a speed of 10 m/s (see the second row of Table 1). The throughput deteriorates substantially due to mobility (a complicated and time-consuming handover process). Furthermore, we repeat the same experiment under the same setup, except that the devices use the MPTCP protocol and are associated with two adjacent APs while moving across multiple APs at a speed of 10 m/s (see the third row of Table 1). We can clearly see that MPTCP outperforms TCP when devices are moving from one AP to another. However, the maximum throughput with MPTCP (two paths) when devices are stationary is much lower than expected. We find that this is due to a lack of synchronization and interaction between multipath scheduling at the transport layer and the underlying MAC/PHY advances of 802.11ay.

2.3. Packet Scheduling over 802.11ay from TRA Layer

MPTCP at the TRA layer has the potential to harness bandwidths and handle handoffs seamlessly; however, the MPTCP Scheduling design is a delicate balancing act. Multipath scheduler design over IEEE 802.11ay is essentially a decision problem, namely deciding which packets to transmit down which link at which time for given beamforming and device allocation. However, deriving an optimal ultra-low delay scheduler is intractable due to the large state space (the state is combinatorial in the number of packets in flight, which is often several hundred) combined with uncertain knowledge of the link delays and channel losses. To this end, we have the following observations on the capabilities of an existing MPTCP modeling approach, protocol designs and their inherent problems.

Analytical modeling of multipath data transport (see [7] and references therein) focuses on using fluid models to enhance linked scheduler designs. It is generally known that a fluid-based model averages out short-term stochastic fluctuations in parameters of interest, such as windows. Naturally, it leads to coupled partial differential equations as the system dynamics description [7]. Such a pragmatic approach dramatically simplifies the modeling; for example, under the core assumption of a fluid flow model, the number of bytes transferred over a period of length $T$ seconds, with $w(t)$ the congestion window and $\mathrm{RTT}(t)$ the RTT at time $t$, is

$$\int_{t=0}^{T} \frac{w(t)}{\mathrm{RTT}(t)} \, dt, \qquad (1)$$

where the period is the time taken for $w(t)$ to grow from $w_{\max}/2$ packets to $w_{\max}$, and $w_{\max}$ denotes the maximum window size in packets. Within each round, the window growth is roughly linear. It will always reach $w_{\max}$ (according to this assumption), which is difficult to realize over 802.11ay links. Therefore, we need to relax this approximation while designing a new multipath scheduler and, eventually, intelligently harness the PHY/MAC/TRA layer advances over the ergodic characteristics of multiple 802.11ay links.
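The integral in (1) can be evaluated numerically for one idealized period. The sketch below assumes a constant RTT and a window that grows linearly by one packet per RTT from half the maximum window to the maximum window; both are simplifying assumptions added for illustration, and the function name is hypothetical. Because the window is counted in packets, the result is in packets. Under these assumptions, the closed form is three eighths of the squared maximum window.

```python
def packets_per_period(w_max, rtt_s, steps=10_000):
    """Trapezoidal evaluation of the integral of w(t)/RTT over one growth period."""
    period = (w_max / 2) * rtt_s  # time to grow from w_max/2 to w_max at 1 pkt/RTT
    dt = period / steps
    total = 0.0
    for k in range(steps):
        w_start = w_max / 2 + (k * dt) / rtt_s
        w_end = w_max / 2 + ((k + 1) * dt) / rtt_s
        total += 0.5 * (w_start + w_end) / rtt_s * dt
    return total


w_max, rtt_s = 100, 0.002  # assumed: 100-packet maximum window, 2 ms RTT
print("numerical :", packets_per_period(w_max, rtt_s))
print("closed form:", 3 * w_max ** 2 / 8)
```

The RTT cancels out of the result, which illustrates how strongly the fluid model depends on the window reliably reaching its maximum in every period.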

On the other hand, the renowned utility maximization idea for the dynamical system of single-path TCP users associates a utility function $U_{s}(w_{s})$ with each device $s$ and interprets the sources and the network as executing a distributed algorithm that maximizes the aggregate utility perceived by the devices. The fundamental notion of this idea is to link the equilibria of the dynamical system with the solution of an optimization problem. Therefore, the nonlinear dynamical system can be viewed as one that is solving the utility maximization problem in a distributed manner. For example, $(w_{s}^{\star}, p_{l}^{\star})$ is an equilibrium of the dynamical system of single-path TCP users iff $w_{s}^{\star}$ is the optimal congestion window for

$$\text{Maximize} \; \sum_{s \in S} U_{s}(w_{s}) \quad \text{s.t.} \quad \lambda_{l} \leq \theta_{l},$$

where $p_{l}^{\star}$ is the Lagrange multiplier or price (path loss) associated with the link at the optimal congestion window $w_{s}^{\star}$. For the case of MPTCP, $\lambda_{l} \leq \theta_{l}$ means that the aggregate traffic ($\lambda_{l}$) over each link does not exceed the capacity ($\theta_{l}$) of the link. In fact, almost all of the currently deployed utility functions of single-path TCP algorithms presented in the literature are concave, reflecting a single stable equilibrium [7]. However, for the case of the multipath algorithm, the utility function does not exist, and the same interpretation of utility maximization seems infeasible (Section III-B [7]).
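For the single-path case, the link between prices and the optimization problem can be illustrated with a dual gradient iteration: links raise their price when the aggregate traffic exceeds capacity, and each source picks the rate at which its marginal utility equals the sum of the prices on its route. The sketch assumes logarithmic utilities, works with rates rather than windows, and uses a made-up two-link, three-source topology with a fixed step size; all of these and the function name are illustrative assumptions.

```python
def solve_num(routes, theta, step=0.1, iters=5000):
    """Dual gradient iteration for max sum(log x_s) s.t. link load <= theta.

    routes[s] lists the links used by source s; theta[l] is link l's capacity.
    Returns (rates, prices).
    """
    price = [1.0] * len(theta)
    rates = [0.0] * len(routes)
    for _ in range(iters):
        # Source step: with U(x) = log x, the condition U'(x) = 1/x = route price.
        rates = [1.0 / sum(price[l] for l in route) for route in routes]
        # Link step: move each price along the constraint violation lam - theta.
        for l in range(len(theta)):
            lam = sum(x for x, route in zip(rates, routes) if l in route)
            price[l] = max(1e-9, price[l] + step * (lam - theta[l]))
    return rates, price


# Source 0 crosses both links; sources 1 and 2 use one link each.
rates, price = solve_num(routes=[[0, 1], [0], [1]], theta=[1.0, 1.0])
print("rates :", [round(x, 3) for x in rates])
print("prices:", [round(p, 3) for p in price])
```

At the fixed point, both capacity constraints are tight and the prices play the role of the Lagrange multipliers in the problem above. This is the structure that has no direct counterpart for the multipath algorithms.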

We observe that the traditional fluid-flow approach neither works well nor adapts intelligently over the highly asymmetric, dynamic, and complicated mix of PHY/MAC/TRA layer advances over 802.11ay networks. Because it lacks careful consideration of runtime states and yields suboptimal solutions, it typically performs poorly over IEEE 802.11ay networks. We introduce a new multi-armed bandit-based scheduling with a packet cloning mechanism and an (upper bound) delay factor that is promising for dynamic scheduling and congestion avoidance. The proposed approach supports flexible runtime control and can respond intelligently to highly dynamic and time-variant state spaces. To improve further, we develop a federated learning-based bandit that intelligently accounts for the MAC/PHY beamforming/selection mechanism across all available links, ameliorating the mobility effects. This approach helps to alleviate the adverse impacts of congestion control approaches at the TRA layer (prevalent with the pioneering MPTCP designs: BALIA [7], OLIA [30] and LIA [23]).

A recurrent theme in BALIA [7], OLIA [30] and LIA [23] is to presume loss probabilities and RTTs as given variables (parameter certainty) and then use conventional TCP models to determine the congestion windows (the number of packets to be transmitted) for each path. However, with the WiFi MAC/PHY advances and the dynamics of impairments over 802.11ay networks, the loss probabilities and RTTs are not exogenous quantities. In reality, beamforming and selection mechanisms should be viewed as functions of the congestion windows for all links because of the coupled interactions in our design. Moreover, as a function of the loss probability and RTTs, the average congestion window cannot be estimated without including uncertainties in the equations of the conventional TCP models.

3. Analytic Modeling

We take a modular approach to creating the analytical model, segregating the study of various elements of system behavior into separate modules. Figure 4 depicts the components of our overall analytic model and their interdependencies with the proposed algorithms. In order to create the approximate mathematical interpretation for each module, it is assumed that the output values of all other modules are available.

Figure 4 is explained as follows. In Section 3.1, given

R

,

B

, and

I

as the sets of RF chains, beams, and devices, we employ integer programming to implement a greedy approach for approximate throughput maximization where the throughput (

θ_{r, b}^{ı}

) and SNR experienced by a link to the device ı using beam b and chain r are modeled. Using the optimal throughput,

θ_{r, b}^{ı}

from Section 3.1, we capture the dynamics of (ergodic) inter-packet delay and propagation delay in Section 3.2 as observed from the transport layer for balancing delays across all links. In Section 3.2, given a total of W packets to be transmitted across the associated links, our goal is to determine a vector

w = [w_{1}, w_{2}, \dots, w_{n}]

so as to minimize the total transmission time

T (w)

. However, we observe that finding optimal

w^{★}

for MPTCP scheduling is a highly challenging and non-trivial task. Therefore, in Section 4, we model multipath scheduling as a sequential decision problem, namely a Multi-Armed Bandit (MAB) problem, with the multipath scheduler acting as the actor and the links as the arms of the bandit. While the MAB approach resolves the optimization problem centrally, its dependence on local learning often leads to undesirable suboptimal behaviors, whereas a decentralized approach requires data sharing among neighbors. In Section 4.1, we enhance the MAB with FL and propose to provide congestion information to newly joining MPTCP users during the handshake process. Further enhancements appear plausible with opportunistic scheduling; therefore, we provide some new insights and mathematical interpretations of adopting flexible packet cloning (redundancy for forward erasure correction) for low-delay MPTCP scheduling, which requires further investigation and is left for future work.

3.1. Modeling Beamforming and Selection Mechanism

For tracking beamforming and selection mechanism dynamics at the AP, we develop a high-level pragmatic model to track the allocation of the RF chains and beams. We formulate an integer programming problem and develop a greedy beam-selection approach. One can view this as a cross-layer approach, where the beamforming and selection process assigns RF chains and beams to appropriate end devices in order to maximize the total network throughput during each (congestion) window scheduling round at the TRA layer.

Consider that

R

,

B

, and

I

represent the sets of RF chains, beams, and devices, respectively. With normalized transmission power, the throughput, say

θ_{r, b}^{ı}

perceived by a link to the device ı using beam b and chain r, can be estimated as

\begin{matrix} θ_{r, b}^{ı} & = & log (1 + S N R_{r, b}^{ı}); \end{matrix}

(2)

\begin{matrix} S N R_{r, b}^{ı} & = & P_{r, b}^{ı} / (\sum_{i \in B} \sum_{j \in R} \sum_{k \in I ∖ {ı}} I_{i, j, k} C_{i, j}^{b} + ς^{2}) \end{matrix}

(3)

where

P_{r, b}^{ı}

is the effective received power of the

ı^{t h}

device using the

b^{t h}

beam and the

r^{t h}

RF chain;

C_{i, j}^{b}

represents crosstalk on the

b^{t h}

beam caused by the simultaneous transmission over the

i^{t h}

beam using the

j^{t h}

chain; and

ς^{2}

is the noise (say Gaussian).

I_{i, j, k} \in {0, 1}

is the indicator variable (binary) to track whether the

i^{t h}

beam and the

j^{t h}

chain are assigned to

k^{t h}

link or not. Using this formulation, we implement Algorithm 1, a greedy approach for the joint allocation of RF chains and beams to devices. For a finite number of devices, beams, and RF chains, we can devise a joint beamforming and device selection mechanism as a nonlinear throughput maximization problem

\begin{matrix} max_{I} \sum_{b \in B} \sum_{r \in R} \sum_{ı \in I} I_{b, r, ı} θ_{r, b}^{ı} s . t . \sum_{b \in B} \sum_{ı \in I} I_{b, r, ı} \leq 1, \forall r \in R . \end{matrix}

(4)

Algorithm 1 Beamforming and RF chain allocation

procedure Select RF Chain $I \leftarrow 0$ , $\tilde{R} \leftarrow ϕ$ , $\tilde{B} \leftarrow ϕ$ and $\tilde{I} \leftarrow ϕ$
while $(A l l o c a t e (\tilde{B} \cup {b}, \tilde{R} \cup {r}, \tilde{I} \cup {ı}) - A l l o c a t e (\tilde{B}, \tilde{R}, \tilde{I})) \leq 0$ do
procedure Update
$\tilde{b}, \tilde{r}$ , $\tilde{ı} \leftarrow {max}_{b \in B, r \in R ∖ \tilde{R}, ı \in I} (A l l o c a t e (\tilde{B} \cup {b}, \tilde{R} \cup {r}, \tilde{I} \cup {ı}) - A l l o c a t e (\tilde{B}, \tilde{R}, \tilde{I}))$
Increment $\tilde{b}, \tilde{r}$ and $\tilde{ı}$ ;
end procedure
end while
while $ı = 1, ı < | B |, ı + +$ do
$I_{\tilde{b}, \tilde{r}, \tilde{ı}} \leftarrow 1$
end while
end procedure
procedure ( $A l l o c a t e (B, R, I)$ )
while $ı = 1, ı < | R |, ı + +$ do
$I_{\tilde{b}, \tilde{r}, \tilde{ı}} \leftarrow 1$
end while
Return $\sum_{b \in B} \sum_{r \in R} \sum_{ı \in I} I_{b, r, ı} θ_{r, b}^{ı}$
end procedure

Remark 1.

The computational complexity of solving the joint beamforming and device selection problem (4) exactly is very high and may not be feasible in real time, as it requires an exhaustive search of O(

{| I |}^{| R | | B |}

) operations. Therefore, along the lines of [20,21], we implement a greedy algorithm for the joint allocation of RF chains and beams to devices. Observe in Algorithm 1 that we attempt to allocate the RF chain with the largest throughput gain (where possible) in each window scheduling round. In particular, the computation complexity of Algorithm 1 is O(

{| I | | R |}^{3} | B |

).
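The greedy idea behind Algorithm 1 can be sketched in a few lines of Python. This is a minimal illustration under simplifying assumptions that are ours, not the paper's: the received power depends only on the beam and the device (`power[b][d]`), crosstalk is a beam-to-beam matrix (`crosstalk[b2][b]`), and each beam, chain, and device is used at most once. All function and parameter names are hypothetical.

```python
import math
from itertools import product

def total_throughput(assign, power, crosstalk, noise=1.0):
    """Objective of (4): sum of log(1 + SNR) over assigned (beam, chain, device) triples."""
    total = 0.0
    for (b, r, d) in assign:
        # Interference on beam b from every other active beam serving another device.
        interference = sum(crosstalk[b2][b] for (b2, r2, d2) in assign if d2 != d)
        snr = power[b][d] / (interference + noise)
        total += math.log(1.0 + snr)
    return total

def greedy_allocate(n_beams, n_chains, n_devices, power, crosstalk, noise=1.0):
    """Repeatedly add the triple with the largest positive throughput gain."""
    assign = []
    while True:
        used_beams = {b for (b, _, _) in assign}
        used_chains = {r for (_, r, _) in assign}
        used_devices = {d for (_, _, d) in assign}
        base = total_throughput(assign, power, crosstalk, noise)
        best_gain, best = 0.0, None
        for b, r, d in product(range(n_beams), range(n_chains), range(n_devices)):
            # Assumption: one triple per beam, per chain, and per device.
            if b in used_beams or r in used_chains or d in used_devices:
                continue
            gain = total_throughput(assign + [(b, r, d)], power, crosstalk, noise) - base
            if gain > best_gain:
                best_gain, best = gain, (b, r, d)
        if best is None:  # no remaining triple improves the objective
            return assign
        assign.append(best)
```

Each round re-evaluates the objective for every free triple, so the sketch is polynomial in the set sizes rather than exponential, which is the point of replacing the exhaustive search.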

3.2. Understanding Delay Balancing

In light of the above discussion, we now formulate an optimization problem and explain the challenges in solving it. Our aim is to design a low-delay MPTCP scheduler that (i) is aware of the congestion windows across all links and (ii) exploits the impact of MAC/PHY fluctuations on the window transmission time. Let

ı = 1, 2, \dots, n

index the n available links and let

P_{m}^{ı}

and

D_{ı}

denote the (ergodic) inter-packet delay and propagation delay, respectively, experienced by a packet m transmitted over link ı, where

P_{m}^{ı}

after each scheduling round can be estimated from Algorithm 1 as

\frac{file size}{θ_{r, b}^{ı}}

.

Given a total of W packets to be transmitted across the associated links, let

w_{ı}

denote the number of packets (congestion window) sent over link ı, with

\sum_{ı} w_{ı} = W

, and

w = [w_{1}, w_{2}, \dots, w_{n}]

. The total transmission time of W packets (W is the aggregate congestion window) is then given by

\begin{matrix} T (w) = max_{ı \in {1, 2, \dots, n}} (D_{ı} + \sum_{m = 1}^{w_{ı}} P_{m}^{ı}) . \end{matrix}

(5)

Observe in (5) that the challenge in designing an MPTCP scheduling algorithm for bulk file transfer is, in fact, that the per-packet delays

P_{m}^{ı}

are uncertain and variable, and so is the total delay

P^{ı} (w_{ı}) = \sum_{m = 1}^{w_{ı}} P_{m}^{ı}

. One could attempt to compute the distribution of

T (w)

for all links, but this becomes computationally expensive (and incurs long delays) with increasing W, so it does not work effectively for real-time packet transmissions. We also lack details of the distribution of per-packet delays on 802.11ay links at the TRA layer.

Next, we observe the following approximate approach. For

ı \in {1, 2, \dots, n}

, in our framework, the

D_{ı} + \sum_{m} P_{m}^{ı}

are assumed to be independent but not identically distributed. In fact, this is precisely the feature that we want to exploit, as the transmission times along different links stem from different distributions. Then, by using (5), we have

\begin{matrix} \Pr (max_{ı} (D_{ı} + P^{ı} (w_{ı})) > τ) & = 1 - \Pr (max_{ı} (D_{ı} + P^{ı} (w_{ı})) < τ) \end{matrix}

(6)

\begin{array}{l} = 1 - \prod_{ı} \Pr (P^{ı} (w_{ı}) < (τ - D_{ı})) . \end{array}

(7)

Unlike other models, we relax the independence assumption between packets on a link (there will typically be dependence), instead assuming independence between packet delays on distinct links. To approximate

P^{ı} (w_{ı})

in (7), we select the variability parameters

u_{ı}

,

u_{ı} \geq 0

so that

\begin{matrix} P^{ı} (w_{ı}) & \approx & γ_{ı} w_{ı} + u_{ı} \sqrt{w_{ı}} N, \end{matrix}

(8)

where

γ_{ı} = E [P_{m}^{ı}] \geq 0

and N is a standard normal random variable. By considering the delays,

P_{m}^{ı}

, as normal random variables, we can compute the variability parameter,

u_{ı}

, as

\begin{matrix} u_{ı} = \sum_{m = 1}^{w_{ı}} V a r [P_{m}^{ı}] + 2 \sum_{m < \hat{m}} c o v [P_{m}^{ı}, P_{\hat{m}}^{ı}] \end{matrix}

(9)

where the second term in (9) is a sum over

c o v [P_{m}^{ı}, P_{\hat{m}}^{ı}]

such that m and

\hat{m}

form a pair selected without repetition from

1, 2, 3, \dots w_{ı}

, where

m < \hat{m}

.

As a result, we solve (7), using

Φ

as the standard normal cumulative distribution function,

\begin{matrix} \Pr (max_{ı} (D_{ı} + P^{ı} (w_{ı})) > τ) & \approx 1 - \prod_{ı} Φ (\frac{τ - D_{ı} - γ_{ı} w_{ı}}{u_{ı} \sqrt{w_{ı}}}) \end{matrix}

(10)

The mean

γ_{ı}

and the variability parameter

u_{ı}

on link ı can be approximated by computing the average, variance, and covariance over the previous 200 packets (a reasonable upper bound on the current size of web objects), observing the time between arrivals of acknowledgments at the multipath scheduling source.
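To make (10) concrete, the following minimal sketch evaluates its right-hand side for a given window vector. The function names are illustrative, and skipping links with an empty window is our assumption; the per-link values of the mean delay and the variability parameter are taken as already estimated.

```python
import math

def std_normal_cdf(x):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def prob_exceeds(tau, w, D, gamma, u):
    """Approximate Pr(max_i (D_i + P^i(w_i)) > tau) as in (10)."""
    prod = 1.0
    for w_i, D_i, g_i, u_i in zip(w, D, gamma, u):
        if w_i == 0:
            continue  # assumption: an unused link contributes no delay
        z = (tau - D_i - g_i * w_i) / (u_i * math.sqrt(w_i))
        prod *= std_normal_cdf(z)
    return 1.0 - prod
```

For two identical links with `tau` set exactly to the mean delay of each link, every factor equals 0.5, so the approximation gives 1 - 0.25 = 0.75; raising `tau` lowers the probability.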

Our objective in the proposed scheduling is to find appropriate w that approximately solves the following optimization

\begin{matrix} min_{w \in N^{n}} \bar{T} (w) s . t . \sum_{ı} w_{ı} = W, \end{matrix}

(11)

where

\bar{T} (w)

represents short run averages of

T (w)

. The idea is to stochastically schedule

W = \sum w_{ı}

packets into

w_{1}, w_{2}, \dots, w_{n}

windows. When we relax the optimization (11) so that

w \in R_{+}^{n}

, then we observe that a fractional number of packets can be sent on each link; for sufficiently large W, the solution to the optimization in (11) ensures (approximately) equal delay on all links and so is of Wardrop form. As will be discussed soon in Section 5, we, however, are restricted to transmitting an integer number of packets on each link with minimal inequalities in delay due to quantization. We can observe that the solution

w^{★}

of (11)

{min}_{w \in R_{+}^{n}} \bar{T} (w) s . t . \sum_{ı} w_{ı} = W

intuitively satisfies

\begin{matrix} | {\bar{P}}^{ı} (w_{ı}) + D_{ı} | - | ({\bar{P}}^{j} (w_{j}) + D_{j}) | \approx 0, for all ı, j \in {1, 2, \dots, n} \end{matrix}

(12)

given that all links are used to transmit packets, i.e., {all elements of

w^{★}} \geq 1

. Note that the inter-packet delay is never zero in any link; therefore, mean link delay

γ_{ı}

and variability parameter

u_{ı}

both are not zero. Then, observe that

\sum_{ı} w_{ı} = W

remains satisfied, when we reduce

w_{j}

by a small positive constant,

η > 0

, and increase

w_{ı}

by

η

. This is always possible since

w_{ı} \geq 1

and {

γ_{ı}

and/or

u_{ı}

}

\geq 0

. Furthermore, we find

τ

by using (10) to (12), which is our proxy for the

{max}_{ı} (D_{ı} + P^{ı} (w_{ı}))

.

3.3. How the MPTCP Scheduler May Solve (11)

Consider that we solve (11) by using feedback-based optimization, i.e., first solving (12) (convex optimization) and then searching over the integer vectors obtained by taking the ceiling or flooring of each

w_{ı}

to find the integer vector w minimizing

T (w)

. This search is over

2^{n}

combinations and is fast for a realistic number of links. For example, when

n = 2

, one can use the binary search of computational cost

O (log n)

. With new insights from (12), we include a target delay factor,

τ

, in our multipath scheduler, representing the overall delay limits,

(P^{ı} (w_{ı}) + D_{ı})

. Furthermore, we set

τ

as a global parameter in our design, equal for all subflows of a connection over all of the links, thus solving (12). More specifically, the anticipated MPTCP scheduling algorithm should solve (12) (feedback-based autonomous optimization) jointly by (i) improving the congestion avoidance (by approximately solving (11) for actuation) and (ii) learning the required parameters (locally in the system). We demonstrate examples of such simultaneous explore-exploit approaches using bandits and federated bandits in the next section.
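The two-stage procedure described in this subsection (equalize delays in the relaxed problem, then search the floor/ceiling combinations) can be illustrated as follows. This is a sketch under an assumption of ours: the short-run average delay on each link is taken as the linear mean D_i + γ_i w_i, ignoring the variability term, and all links carry traffic. Function names are hypothetical.

```python
import math
from itertools import product

def relaxed_split(W, D, gamma):
    """Equalize D_i + gamma_i * w_i across links subject to sum(w) = W (cf. (12))."""
    tau = (W + sum(d / g for d, g in zip(D, gamma))) / sum(1.0 / g for g in gamma)
    return tau, [(tau - d) / g for d, g in zip(D, gamma)]

def mean_time(w, D, gamma):
    """Mean-delay proxy for T(w): the slowest link that carries packets."""
    return max(d + g * wi for wi, d, g in zip(w, D, gamma) if wi > 0)

def integer_split(W, D, gamma):
    """Search the (at most 2^n) floor/ceiling roundings of the relaxed solution."""
    _, w_real = relaxed_split(W, D, gamma)
    choices = [(math.floor(x), math.ceil(x)) for x in w_real]
    best = None
    for cand in product(*choices):
        if sum(cand) != W or min(cand) < 0:
            continue  # keep only feasible integer vectors
        t = mean_time(cand, D, gamma)
        if best is None or t < best[0]:
            best = (t, list(cand))
    return best
```

With W = 10 packets, zero propagation delay, and per-packet means of 1 and 2, the relaxed split is roughly (6.67, 3.33); among the feasible roundings, (7, 3) gives the smaller maximum delay.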

4. Multi-Armed Bandit (MAB) for Optimal MPTCP Scheduling

In this section, we first provide a short background of MAB and then employ MAB to model and solve the scheduling problem (11). Next, we improve MAB’s efficiency with federated learning and opportunistic scheduling.

MAB employs a sequential decision approach in which an actor must decide which arms of the bandits to draw in each epoch [27]. Arm draws are MAB's actions, and after every action a reward (payoff) is observed. Within our TRA-layer framework, the goal of the MAB approach is to make progressive scheduling decisions that maximize the overall reward earned through a series of actions during packet transmissions. Because of its limited knowledge, the actor must balance exploiting past actions that worked well against exploring actions that may provide larger rewards [28].

The scheduling problem (11) can be modeled as a MAB with the multipath scheduler and links acting as the actor and arms, respectively. The selected subset of arms

S

is referred to as a super arm, and the appropriate normalized reward is

R (w) = 1 - \frac{1}{\bar{T} (w)} \in [0, 1), w \in S

. As a result, the goal of (11) may be viewed as determining a temporal sequence of actions, based on a specific policy, that maximizes the cumulative reward, i.e.,

\begin{matrix} max_{w \in S} R (w) s . t . \sum_{ı} w_{ı} = W . \end{matrix}

(13)

To this end, we quantify the action selection policies for the scheduler in terms of regret. In particular, the regret is estimated as the difference between the predicted reward of optimum actions (equilibrium windows,

w^{*}

) and the reward achieved by the proposed strategy (given windows, w). It is worth noting that minimizing regret is the same as maximizing the overall gain. Let

R_{ı} (t)

denote the normalized reward of link ı (arm ı) at time t; its expected value,

{\bar{R}}_{ı}

, can be estimated as

\begin{matrix} {\bar{R}}_{ı} = E [1 - \frac{(D_{ı} (t) + \sum_{m = 1}^{w_{ı} (t)} P_{m}^{ı} (t))}{T (w (t))}], \end{matrix}

(14)

and thus the desired optimal super arm will be

\begin{matrix} S^{★} = \arg\max_{S \subset I, | S | = n} \{min_{ı \in S} {\bar{R}}_{ı}\} . \end{matrix}

(15)

Given policy

π

, the cumulative regret up to time

T

can then be computed as

\begin{matrix} Regret (π) = T {\bar{R}}_{ı} (S^{★}) - E [\sum_{t = 1}^{T} R (S (t))], \end{matrix}

(16)

thus the multipath scheduling problem can be reformulated as

\begin{matrix} min_{S (t), t \geq 1} Regret (π (S)) . \end{matrix}

(17)

With relevant insights from [27,28,31], using (17), we have developed and implemented multipath scheduling based on upper confidence bound policy, referred to as MS-UCB, as shown in Algorithm 2.

Algorithm 2 MS-UCB multipath scheduling algorithm

1: procedure Initialize
2: while $t = 1, 2 \dots$ do
3: Random Selection: Superarm $S (t)$ and $| S (t) | = n$
4: Compute $u (t)$ and $v (t)$
5: end while
6: end procedure
7: procedure Update
8: while $({(T - t)}^{2} > t o l e r a n c e, ϵ)$ do
9: Renew $u (t)$ and $v (t)$ using (18) and (19)
10: Compute $S (t + 1)$ using (20)
11: Increment $t \leftarrow t + 1$ ;
12: end while
13: end procedure

At a higher level, the feedback observation module for each arm/path is at the core of this algorithm (instead of every super arm or action as a whole). The same arm might be seen with respect to multiple actions, and in order to make better recommendations in the future, we gather and use the experience of the rewards of one arm from several actions.

In Algorithm 2, we define two vectors

u (t)

and

v (t)

representing the number of times that each arm/link has been scheduled until time t and the mean observed reward values of the arms/links up to the current round t, respectively. The vectors

u (t)

and

v (t)

, initialized as

u (t) = v (t) = 0

, evolve over time t (as an ergodic process). The evolution of

u_{ı} (t)

and

v_{ı} (t)

for an arm/path ı is as follows:

\begin{matrix} u_{ı} (t + 1) & = & \{\begin{matrix} u_{ı} (t) + 1 & if ı \in S (t) \\ u_{ı} (t) & otherwise \end{matrix} \end{matrix}

(18)

and

\begin{matrix} v_{ı} (t + 1) & = & \{\begin{matrix} \frac{R_{ı} (t) + u_{ı} (t) v_{ı} (t)}{1 + u_{ı} (t)} & if ı \in S (t) \\ v_{ı} (t) & otherwise . \end{matrix} \end{matrix}

(19)

As shown in Algorithm 2 (step 10), the multipath scheduler enters the while loop to choose super arm

S (t + 1)

, after every transmission round by

\begin{matrix} S (t + 1) = \arg\max_{S \subset I, | S | = n} \sum_{ı \in S} (\underset{exploration}{\underset{︸}{\sqrt{(n + 1) \frac{log (t + 1)}{u_{ı} (t)}}}} + \underset{exploitation}{\underset{︸}{v_{ı} (t + 1)}}) \end{matrix}

(20)
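The update rules (18) and (19) and the index-based selection in (20) can be sketched as below. This is an illustrative reading rather than the authors' implementation: because the objective in (20) is a sum over the arms in S, we assume the maximizing super arm consists of the n arms with the largest individual indices, we give never-scheduled arms an infinite index so that they are tried first, and we evaluate both terms with the latest counters. All names are hypothetical.

```python
import math

def ucb_index(n, t, u_i, v_i):
    """Exploration bonus plus mean reward for one arm, following (20)."""
    if u_i == 0:
        return math.inf  # assumption: unexplored arms are scheduled first
    return math.sqrt((n + 1) * math.log(t + 1) / u_i) + v_i

def select_super_arm(n, t, u, v):
    """Pick the n arms with the largest indices (maximizes the sum in (20))."""
    ranked = sorted(range(len(u)), key=lambda i: ucb_index(n, t, u[i], v[i]), reverse=True)
    return sorted(ranked[:n])

def update(u, v, chosen, rewards):
    """Apply (19) then (18) in place for the arms in the chosen super arm."""
    for i in chosen:
        v[i] = (rewards[i] + u[i] * v[i]) / (1 + u[i])  # running mean reward, (19)
        u[i] += 1                                        # schedule counter, (18)
```

A typical round calls `select_super_arm`, transmits on the chosen links, computes the normalized rewards, and then calls `update` with a mapping from arm index to reward.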

Remark 2

(Upper Bound on Regret). We believe that the average regret attained by the proposed MS-UCB multipath scheduling algorithm (Algorithm 2) is upper bounded by a factor whose derivation requires further investigation. With relevant insights and modeling ideas from this work, the stability and convergence conditions of the network dynamics with MS-UCB will be studied in the future.

4.1. With Federated Learning

Looking ahead, we consider a decentralized MAB problem for scheduling multiple subflows from x Internet users sharing the link/arm ı, who can communicate their data, in particular their local updates (

u (t), v (t)

), exclusively with their neighboring multipath TCP users using the same link. Without FL, each user has a series of choices about picking the n arms/links but only has access to local feedback/evaluations of the actual reward/regret for the actions made. In addition, these local evaluations are often biased due to the skewness of the underlying distributions, as reported in [29]. Such dependence on local learning alone often leads to suboptimal behaviors because of unforeseen network environments and dynamics, whereas converging with a decentralized approach requires data sharing from neighbors. Motivated by the potential of FL and with relevant insights from [29], we propose to provide congestion information to newly joining MPTCP users during the handshake process. In particular, we seek a solution that will allow multipath schedulers to share local experiences (

u^{x} (t), v^{x} (t)

) in the network, which can be shared as a private copy of the learned model with the neighbors.

Different from Algorithm 2 (step 10), the multipath scheduler enters the main while loop to choose super arm

S (t + 1)

, after every transmission round by

\begin{matrix} S (t + 1) & = & \arg\max_{S \subset I, | S | = n} \sum_{ı \in S} (\underset{exploration}{\underset{︸}{\sqrt{(n + 1) \frac{log (t + 1)}{{\bar{u}}_{ı} (t)}}}} + \underset{exploitation}{\underset{︸}{{\bar{v}}_{ı} (t + 1)}}) . \end{matrix}

where we use the renewal reward theorem to compute the exploration and exploitation based on the data sets from all multipath TCP users sharing the arm/path ı. Considering

P (x)

as the stationary distribution of the underlying Markovian Process,

{(X_{t})}

for tracking the evolution of multiple multipath users sharing link ı, with a slight abuse of notation, (we take

P (x) = 1 / x

for simplification in our numerical analysis):

\begin{matrix} {\bar{u}}_{ı} (t) & = & lim_{x \to \infty} \frac{U_{x, ı} (t)}{T_{x + 1}} \approx \frac{\sum_{x} P (x) u_{ı}^{x} (t)}{\sum_{x} P (x) x}; \end{matrix}

(21)

\begin{matrix} {\bar{v}}_{ı} (t + 1) & = & lim_{x \to \infty} \frac{V_{x, ı} (t)}{T_{x + 1}} \approx \frac{\sum_{x} P (x) v_{ı}^{x} (t)}{\sum_{x} P (x) x} . \end{matrix}

(22)
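The aggregation on the right-hand sides of (21) and (22) can be computed as in the short sketch below. We assume, for illustration only, that the users sharing a link are indexed x = 1, ..., X and that the weights follow the simplification P(x) = 1/x mentioned above; the function name is hypothetical.

```python
def federated_average(local_values, P=lambda x: 1.0 / x):
    """Weighted aggregate of per-user statistics on one link, as in (21) and (22).

    local_values[x - 1] holds the local u or v value reported by user x.
    """
    xs = range(1, len(local_values) + 1)
    numerator = sum(P(x) * value for x, value in zip(xs, local_values))
    denominator = sum(P(x) * x for x in xs)
    return numerator / denominator
```

With P(x) = 1/x the denominator reduces to the number of users, so two users reporting 4 and 2 yield (4 + 2/2) / 2 = 2.5.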

Remark 3

(Opportunistic Scheduling). The stochastic delay-aware scheduling proposed above allocates packet transmissions for an application amongst the active links so as to minimize the average delay. Moreover, it ensures with a high probability a multipath connection (with W packets) that guarantees low delay and high speed. However, we may sometimes be fortunate when the delays experienced on one or more links happen to be lower than usual. Observe that a different realization of random delay might also lead to scheduling on links yielding lower delays. Of course, we will discover the delay realization only after transmitting the packets; however, by transmitting additional redundant packets (packet cloning) over and above the minimum number W, we can still opportunistically reduce the delivery delay by exploiting link delay realizations when they occur. Observe that we are trading off the link capacity (as we send more packets) for the likelihood of lower delay.

Next, we explain our idea of cloning to jointly (i) consider erratic PHY/MAC errors into stochastic models, and (ii) perform opportunistic scheduling.

4.2. With Packet Cloning

We develop packet cloning inside our LD-MPSP algorithm to resolve the challenges over 802.11ay networks. If packets are cloned before transmission over multiple links, then the performance of short flows will no longer be limited by the worst link. In fact, it will be guaranteed (opportunistically) to be at least equal to that due to the best link. Packet cloning is more implementable over wireless links than sending packets only on the best link. In practice, the best link is never known a priori (and is expensive to determine). Moreover, replication-based MPTCP will maintain the worse link(s), which will increase the robustness against possible failure of the better link(s).

By performing such data replication inside LD-MPSP, we provide an opportunistic and elegant solution for short data flows. This is easy to understand by considering the extreme case of full-cloning in which all the data packets are copied over all the available links. With full-cloning, one can solve the packet reordering problem, and it is extremely simple to implement. The simplicity of cloning will be highly effective for improving the performance of short flows that often finish before any sophisticated control algorithm could possibly take effect. However, cloning of packets will increase the data traffic on the Internet and can adversely affect the competing TCP and MPTCP flows which are not using redundancy. Prolonged TCP flows may learn link characteristics on the fly as packets are transferred, but smaller TCP flows have minimal information and guidance on the link characteristics.

With partial-cloning, one can improve the performance of longer flows that are amenable to control by providing just the right amount of redundancy over multiple heterogeneous links (dynamically, according to the changing conditions of the wireless links). Therefore, we exploit loss-based adaptive cloning in our LD-MPSP, in which w MPTCP packets are cloned randomly over all links with probability:

\begin{matrix} c_{p} = min {k, p_{e} (t) / (1 - p_{e} (t))}, 0.5 \leq k \leq 1 \end{matrix}

(23)

such that the total number of MPTCP packets (including the cloned packets) is given by

\begin{matrix} w^{'} = min {(1 + k) w, ⌈ w /(1 - p_{e} (t))⌉} \end{matrix}

(24)

where

p_{e} (t) : = {max}_{ı} {p_{e}^{ı} (t)}

,

p_{e}^{ı} (t)

is the packet loss probability due to wireless channel errors along link ı.
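As a worked illustration of (23) and (24), the sketch below computes the cloning probability and the resulting packet budget from per-link loss estimates. The function name, the default k, and the input validation are our assumptions.

```python
import math

def cloning_plan(w, link_loss, k=0.5):
    """Return (c_p, w_total) following (23) and (24).

    w         : number of original MPTCP packets
    link_loss : per-link loss probabilities p_e^i(t); the maximum is used
    k         : cap on the cloning probability, 0.5 <= k <= 1
    """
    if not 0.5 <= k <= 1.0:
        raise ValueError("k must lie in [0.5, 1]")
    p_e = max(link_loss)
    if not 0.0 <= p_e < 1.0:
        raise ValueError("loss probability must lie in [0, 1)")
    c_p = min(k, p_e / (1.0 - p_e))                           # (23)
    w_total = min((1.0 + k) * w, math.ceil(w / (1.0 - p_e)))  # (24)
    return c_p, w_total
```

For example, with w = 10, a worst-link loss of 0.2, and k = 0.5, the cloning probability is 0.25 and 13 packets are sent in total; at a loss of 0.4 the cap k binds, giving a probability of 0.5 and 15 packets.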

5. Multi-Connectivity over IEEE 802.11ay Links

Based on our requirement analysis in Section 3.2 and MAB solution in Section 4, we design a Low Delay Multipath Scheduling Protocol (LD-MPSP) in this section. LD-MPSP algorithm design is driven by an intelligent controller, which depends on the difference between the target delay

τ

and the estimated one-way delay,

q (t)

. As noted earlier, the values of

τ

and

q (t)

for a link are obtained from the federated MAB modules. The goal is to search for an appropriate integer, either

⌈ w_{ı} ⌉

or

⌊ w_{ı} ⌋

and maintain a small non-zero target queuing delay parameter,

τ

, equal for all links (

τ

can be evaluated at the multipath scheduling source by using the distributed learning agent discussed in Section 3.2). Each multipath scheduling from a source has a set of links, and each link ı maintains a congestion window

w_{ı}

and measures its delay,

q_{ı}

(the overall delay perceived by

w_{ı}

packets). Motivated by the findings in [32], observe in Algorithm 3 that we propose the following congestion window adaptation (cf. the MPTCP LIA, OLIA, and BALIA versions in [7]):

∘: For each Ack on link ı, $w_{ı} \leftarrow w_{ı} + α / w_{ı}$ , and
∘: For each packet loss on link ı, $w_{ı} \leftarrow w_{ı} / 2$

where

α = (τ - q_{ı}) / τ

.

Algorithm 3 LD-MPSP Algorithm

1: procedure Initialize
2: Target Delay ( $τ$ ), One-way Delay (q), Window (w)
3: end procedure
4: procedure congestion avoidance
5: while (window Adapt) do
6: if (ACK on link ı) then update $q_{ı}$
7: $w_{ı} \leftarrow w_{ı} + ((τ - q_{ı}) / τ) / w_{ı}$
8: else $w_{ı} \leftarrow w_{ı} / 2$
9: end if
10: end while
11: end procedure
12: procedure Quantization
13: $w$ ← integer vector ( $w$ ) s.t. $min T$
14: Evaluate $τ$
15: end procedure
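The per-link window adaptation of Algorithm 3 reduces to two small update rules, sketched here with hypothetical function names. The sketch follows the stated rule with α = (τ − q_ı)/τ and adds nothing beyond it (no minimum window, no timers).

```python
def on_ack(w, q, tau):
    """Window update on an ACK: w <- w + alpha / w with alpha = (tau - q) / tau."""
    alpha = (tau - q) / tau
    return w + alpha / w

def on_loss(w):
    """Window update on a packet loss: multiplicative decrease by one half."""
    return w / 2.0
```

When the measured delay q equals the target τ, α is zero and the window stays put; when q exceeds τ, α turns negative and each ACK shrinks the window slightly, which is how the controller backs off before losses occur.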

Observe from the window adaptation process in Algorithm 3 that, once the target delay is attained, LD-MPSP persists in the state unless packet loss or other traffic perturbs the

q (t)

. Furthermore, steps (13, 14) of Algorithm 3 mandate that LD-MPSP absorbs the uncertain delay variation by updating

τ

at the multipath scheduling sources dynamically (thus solving the above-derived equations approximately). Our LD-MPSP algorithm shares some design aspects with other lower-than-best-effort data transfer service algorithms, such as competitive and considerate congestion avoidance (4CP) and low extra delay background transfer (LEDBAT).

LD-MPSP requires a timestamp field from the source in each packet, which the receiver uses to determine the one-way delay from the multipath scheduling source; the receiver then returns this calculated value to the source in the Ack packet. This feature makes LD-MPSP well suited to time-varying 802.11ay networks, since it can continuously monitor its one-way latency and back off its window if the latency surpasses a (customizable) threshold.

Figure 5 shows our early findings: the time-varying throughput of 10 TCP Reno vs. 10 multipath (LIA and LD-MPSP) devices over 802.11ay, from 0 s to 200 s, under an erroneous channel (40% lossy channel) in two different experiments. In Experiment 1, we observe that LIA cannot compete well with TCP (see

{tcp}_{lia}

) and, in Experiment 2, we can see that our algorithm (LD-MPSP) performs better than LIA when competing with TCP (see

{tcp}_{our}

) flows over 802.11ay networks.

5.1. Fluid Approximation

A simple fluid-flow representation of the congestion window evolution of a subflow ı,

w_{ı} (t)

, in continuous time is given by:

\begin{matrix} \frac{d w_{ı} (t)}{d t} & = & \frac{((τ - q_{ı} (t)) / τ)}{R T T_{ı} (t)} - \frac{2 w_{ı}^{2} (t) p_{ı} (t)}{3 R T T_{ı} (t)}, \end{matrix}

(25)

where

p_{ı} (t)

is the loss probability of the flow ı at time t. Observe in (25) that the LD-MPSP window oscillates between

w_{min}^{ı} = 2 w_{ı} (t) / 3

and

w_{max}^{ı} = 4 w_{ı} (t) / 3

with an average of

w_{ı} (t)

. Similarly, the simplest network link algorithm guaranteeing that these equilibrium losses are indeed Lagrange multipliers (as shown in [7]) is based on applying the gradient projection algorithm to the dual problem. Therefore, the fluid-flow representation for loss dynamics in the 802.11ay links can be captured as

\begin{matrix} \frac{d p_{ı} (t)}{d t} = \frac{1}{m_{ı}} {(λ_{ı} (t) - 1_{q_{ı} (t)} (θ_{ı} (t)))}_{p_{ı}}^{+}, \end{matrix}

(26)

where

λ_{ı} (t) = w_{ı} / R T T_{ı} (t)

and

1_{q_{ı} (t)} (θ_{ı} (t))

denotes that the packet service rate of the network link ı at time t is

θ_{ı}

only when the link queue,

q_{ı} (t)

, is saturated.

5.2. Equilibrium Conditions

A point (

w^{★}, p^{★}

) is said to be an equilibrium of (25) and (26) if it satisfies, for all links,

\begin{matrix} {(\frac{3 ((τ - q_{ı}) / τ)}{2 w_{ı}^{2}} - p_{ı})}_{w_{ı}}^{+} = 0, \frac{1}{m_{ı}} {(λ_{ı} - θ_{ı})}_{p_{ı}}^{+} = 0, or \end{matrix}

\begin{matrix} w_{ı} \geq 0, \frac{3 ((τ - q_{ı}) / τ)}{2 w_{ı}^{2}} \leq p_{ı}, and \frac{3 ((τ - q_{ı}) / τ)}{2 w_{ı}^{2}} = p_{ı} if w_{ı} > 0, \end{matrix}

equivalently,

\begin{matrix} p_{ı} \geq 0, λ_{ı} \leq θ_{ı}, and λ_{ı} = θ_{ı} if p_{ı} > 0 . \end{matrix}

(27)

Any finite equilibrium (

w^{★}, p^{★}

) must have

p_{ı} > 0

for all links ı. We focus on finite equilibria; interpreting (27) in terms of

w_{max}^{ı}

provides us with an important insight about the bandwidth delay product (

C_{ı}

) of the network circuit link (

w_{max}^{ı} \approx C_{ı}

),

\begin{matrix} p_{ı} = \frac{8 ((τ - q_{ı}) / τ)}{3 {(w_{\max}^{ı})}^{2}} \approx \frac{8 ((τ - q_{ı}) / τ)}{3 {C_{ı}}^{2}} \end{matrix}

(28)
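The window dynamics (25) and the equilibrium relation 3((τ − q_ı)/τ)/(2 w_ı²) = p_ı can be checked numerically with a forward-Euler integration. The sketch below holds the delay, loss probability, and RTT constant, which is a simplification of ours (in the full model they evolve jointly with (26)); the function names, step size, and parameter values are illustrative.

```python
import math

def simulate_window(tau, q, p, rtt, w0=1.0, dt=0.01, steps=20000):
    """Forward-Euler integration of (25) with q, p and RTT held constant."""
    alpha = (tau - q) / tau
    w = w0
    for _ in range(steps):
        dw = alpha / rtt - 2.0 * w * w * p / (3.0 * rtt)
        w += dt * dw
    return w

def equilibrium_window(tau, q, p):
    """Window at which the right-hand side of (25) vanishes."""
    return math.sqrt(3.0 * ((tau - q) / tau) / (2.0 * p))
```

With τ = 0.1 s, q = 0.05 s, p = 0.01, and RTT = 0.1 s, the trajectory settles at about 8.66 packets, the square root of 75, in agreement with the equilibrium condition.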

5.3. Utility Maximization

For an MPTCP algorithm with equilibrium (

w_{s}^{★}, p_{l}^{★}

), the same interpretation of network utility maximization discussed earlier may not hold (Section III-B [7]). Nevertheless, the solution of the suitably constructed utility maximization problem, i.e.,

\begin{matrix} Maximize \sum_{s \in S} U_{s} (w_{s}) s . t . λ_{l} \leq θ_{l}, \forall l \end{matrix}

(29)

helps us understand the dynamics of the system (for instance, in constructing a Lyapunov function for convergence and stability analysis). The case of utility maximization for MPTCP design is much more delicate than for single-path TCP: whether an underlying utility function exists depends on the design choice of the multipath congestion avoidance mechanism.

5.4. Computational Complexity of LD-MPSP

In this part of our investigation, we will determine in what order bottlenecks occur along different paths. Consider

G (S) = max_{ı \in S} \{\frac{\frac{\sqrt{w_{ı}}}{R T T_{ı}}}{\sum_{ı} \frac{w_{ı}}{R T T_{ı}}}\}

(30)

and the fractional ratios of the numerator terms in (30) across different paths, in ascending order, are

\begin{matrix} \{\frac{\sqrt{w_{1}}}{R T T_{1}}\} \leq \{\frac{\sqrt{w_{2}}}{R T T_{2}}\} \leq \{\frac{\sqrt{w_{3}}}{R T T_{3}}\} \leq \dots \{\frac{\sqrt{w_{k}}}{R T T_{k}}\} . \end{matrix}

(31)

Note that, with the order in (31), the finite equilibrium of the window increase of LD-MPSP can be estimated as [5]

min_{S \subset R : ı \in S} \frac{\frac{w_{max (S)}}{R T T_{max (S)}^{2}}}{{(\sum_{s \in S} \frac{w_{s}}{R T T_{s}})}^{2}} = min_{ı \leq j \leq k} \frac{\frac{w_{j}}{R T T_{j}^{2}}}{{(\sum_{t \leq j} \frac{w_{t}}{R T T_{t}})}^{2}}

(32)

Observe that we can calculate (32) using a linear search.
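A linear search of this kind might look as follows. The sketch assumes that the numerator index on the right-hand side of (32) is the last path j of each prefix in the ordering (31); that reading, along with the function name and the `start` parameter (the position of path ı in the sorted order), is our assumption.

```python
import math

def min_prefix_ratio(w, rtt, start=0):
    """Scan prefixes of the paths sorted by sqrt(w)/RTT and return the minimum ratio."""
    order = sorted(range(len(w)), key=lambda i: math.sqrt(w[i]) / rtt[i])
    best = math.inf
    rate_sum = 0.0
    for position, i in enumerate(order):
        rate_sum += w[i] / rtt[i]  # running sum over the prefix t <= j
        if position >= start:      # only prefixes that contain the path of interest
            best = min(best, (w[i] / rtt[i] ** 2) / rate_sum ** 2)
    return best
```

Apart from the initial sort, the scan visits each path once and keeps a running sum, so its cost is linear in the number of paths.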

6. Results and Discussion

In this section, we provide the details of the simulation setup, performance evaluation, and discuss the implications and results. We evaluate the assumptions made while designing the three algorithms.

6.1. Simulation Setup

We use an open-source, high-fidelity simulation platform for the IEEE 802.11ad/ay standards and extend it with our multipath congestion control and scheduling design at the TRA layer. We build upon the NS-3 modules developed in [11,33]; the simulation setup consists of four main software tools:

(i) An NS-3 MPTCP module [11] that adheres to RFC 6824 and closely resembles the MPTCP Linux kernel architecture; see [11] for a quick overview. The module supports several path management approaches (e.g., FullMesh and NdiffPorts) and MPTCP congestion control and scheduling protocols (e.g., LIA, OLIA, BALIA, DALIA).
(ii) The IEEE 802.11ad/ay NS-3 implementation [33], the other main foundation of our evaluation setup. We use its IEEE 802.11ay MAC layer design with an accurately abstracted PHY layer. In addition, we implement a realistic mmWave channel model based on channel profiles obtained from ray-tracing software at 60 GHz. Our NS-3 MPTCP module is tightly coupled with the 802.11ad/ay implementation.
(iii) A MATLAB-based codebook (beam) generator for creating codebook instances that describe the beam patterns of the phased antenna array(s) of an IEEE 802.11ad/ay-enabled device (https://github.com/wigig-tools/codebook-generator, accessed on 22 November 2022).
(iv) A MATLAB-based mmWave channel model, used as ray-tracing software for channel realization, which builds upon the quasi-deterministic channel realization software publicly available on GitHub (https://github.com/wigig-tools/qd-realization, accessed on 22 November 2022).

The training and ablation analysis in this paper follows the lines of [34,35,36,37]; we therefore include only the essential steps.

In our network setup, ten devices move along an elliptical path and are associated with multiple APs, as shown in Figure 6, with the following settings (unless stated otherwise): (i) between the server and the devices, we use three MPTCP connections (each with three subflows selected from seven links EDMG-SC 802.11ay networks with spatial streaming up to eight); (ii) five APs are deployed, and the devices move continuously in an elliptical orbit at a speed of 15 m/s; (iii) the emulated packet traffic is created by downloading files of various sizes, ranging from 1 MB to 20 MB, from an HTTP server. Other parameters and protocols used in the simulation are summarized in Table 2.

6.2. Evaluating Multipath Scheduling with Beamforming and Selection

We examine MS-UCB with beamforming and selection over 802.11ay by comparing its overhead (in milliseconds) with that of the existing 802.11ad. In our setup, we bypass the physical layer interface and simulate the overhead of the proposed cross-layer framework in the MAC layer with multipath scheduling at the transport layer. This approach filters out the physical layer impacts completely and facilitates a reasonable and tractable comparison. To reduce beamforming training costs, we deploy an experimental MAC layer implementation [33] that allows us to examine PHY layer characteristics without introducing the full PHY complexity. The whole Beacon Interval (BI) is reserved for data transmission in this MAC implementation. With this setup, we employ up to seven EDMG-SC links and eight RF chains, with multiple RF chains and multiple links serving one (selected) device simultaneously. Observe in Figure 7 that, as the number of devices increases, there is no noticeable increase in the overhead (in milliseconds) due to MS-UCB. Unlike 802.11ad (dashed red curve in Figure 7), the proposed framework first selects the devices (pull-based approach) and then considers only the selected devices in the contention of the 802.11ay system.

More importantly, the overhead with the proposed MS-UCB multipath scheduling (solid blue curve in Figure 7) decreases sharply as the number of devices increases. For example, when there are only a few devices (<40), the overhead of 802.11ad is relatively low, whereas the multipath (MS-UCB) overhead is comparatively high. The overhead of MPTCP LIA over 802.11ad rises gradually as the number of devices increases (dashed red curve), and a realistic IoT scenario typically involves a large number of devices. With the proposed cross-layer approach, the overhead depends only on the number of selected (and served) devices and is independent of the total number of devices. In addition, the overhead due to the proposed MS-UCB multipath scheduling is relatively small once the number of devices is reasonably large (>60). It is worth noting that the observed enhancements are mostly the result of MAB coupled with multipath scheduling and beam selection for hybrid beamforming.
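The device selection step described above relies on a multi-armed bandit. As a rough illustration, the sketch below runs the generic UCB1 index policy of [31] over a set of candidate devices; it is not the MS-UCB algorithm itself. The arm rewards (hypothetical normalized throughputs in [0, 1]), the deterministic reward model, and the name `ucb1_pull_counts` are assumptions made for the example.

```python
import math


def ucb1_pull_counts(mean_rewards, horizon):
    """Run UCB1 for `horizon` rounds and return how often each arm was pulled.

    Each arm stands for a candidate device; rewards are deterministic here
    (each pull returns the arm's mean) so the example is reproducible.
    """
    num_arms = len(mean_rewards)
    pulls = [0] * num_arms
    totals = [0.0] * num_arms

    def index(arm, t):
        # Empirical mean plus the UCB1 exploration bonus.
        return totals[arm] / pulls[arm] + math.sqrt(2.0 * math.log(t) / pulls[arm])

    for t in range(1, horizon + 1):
        if t <= num_arms:
            arm = t - 1  # pull every arm once to initialise its estimate
        else:
            arm = max(range(num_arms), key=lambda a: index(a, t))
        pulls[arm] += 1
        totals[arm] += mean_rewards[arm]
    return pulls


# Three hypothetical devices with normalized throughputs 0.2, 0.5 and 0.9.
print(ucb1_pull_counts([0.2, 0.5, 0.9], horizon=500))
```

The exploration bonus shrinks as an arm is pulled more often, so the selection concentrates on the best device while the others are still sampled occasionally.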

6.3. Observing Application Performance from the TRA Layer

A file transfer application is emulated using the proposed multipath and TCP New Reno (for single path 802.11ad, HEKATON, BUSH). For data transfer, we set the packet size to 1500 bytes. Each user’s SNR is taken from physical layer traces, which are fed into the NS-3 simulation to evaluate application performance at the TRA layer. The comparison of throughput (in Mbps) evolution over time is shown in Figure 8. We adopt Direct Code Execution, a framework for ns-3 that allows existing implementations of userspace and kernel-space network protocols or programs to be run without requiring source code modifications. As a result, the setup is close to a real-world scenario. The data transmission takes around 100 s, during which we randomly vary the RTT of the paths between 10 and 40 milliseconds. Despite all of the considered overheads, such as headers, timeouts, slow start and congestion avoidance, we observe in Figure 8 that multipath outperforms BUSH, HEKATON and 802.11ad MU-MIMO.

6.4. Comparing Energy Efficiency

We also assess the energy efficiency (bits per Joule) of the proposed multipath system. For tractability and a fair comparison, we use two RF chains and two devices. To normalize the energy efficiency figures, we use 802.11ad MU-MIMO as the baseline. Figure 9 depicts the mean normalized energy efficiency. Compared to 802.11ad, HEKATON, Multipath and BUSH all achieve superior energy efficiency. This is because of their low power consumption: all of them employ phased arrays to enhance energy efficiency. We observe in Figure 9 that Multipath outperforms BUSH, and BUSH outperforms HEKATON. This is because an optimum technique should operate multiple RF chains simultaneously while serving one device. Multipath always employs the best strategy (two or more RFs where feasible); BUSH uses two RFs in some low-SNR circumstances; HEKATON does not use this strategy.

6.5. Evaluating Aggregate Throughput

We begin by investigating the case with multiple links and several RF chains, namely seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight). In contrast to HEKATON, the proposed cross-layer multipath MS-UCB scheme uses a flexible path, beam, and device selection algorithm. That is, unlike HEKATON (one-to-one mapping), the proposed multipath MS-UCB system is a one-to-many mapping solution: several paths and beams can serve the same device if doing so yields a higher throughput than serving multiple devices. In our experiments, we find that the proposed cross-layer multipath system performs better than HEKATON and BUSH. In extremely low SNR instances, BUSH can use two RFs and rule out one device to obtain the best overall throughput for another (one-to-two mapping) in each transmission, whereas HEKATON always serves several users.

As shown in Figure 10, we investigate the scalability with an increasing number of devices. We increase the total number of devices from 5 to 20 and measure the overall network throughput. As the device count increases, Multipath always outperforms BUSH, and BUSH consistently beats HEKATON and the vanilla 802.11ad baseline.

6.6. Comparing the Proposed Framework with Celebrated Multipath TCP

We have tested our LD-MPSP by comparing it with the leading MPTCP algorithms DALIA [11] and LIA [23]. We set up a testbed with seven links EDMG-SC 802.11ay networks and spatial streaming up to eight. We use 1500-byte packets and set a target delay of 20 milliseconds. Our fluid modeling and simulation settings are flexible enough to handle various data speeds and network parameters.

In Figure 5 above, we compare the evolution of the WiFi throughputs between regular TCP and MPTCP flows as observed in ns-3 simulations. We found that, due to the inbuilt packet cloning mechanism, the results of the throughput with LD-MPSP are more stable and highly improved as compared to that with DALIA (highly fluctuating) even with a high wireless loss.

Next, to investigate the robustness of our algorithm to delay variations, we have performed extensive numerical experiments. In Figure 11, we did this by changing the number of flows over links in time, i.e., LD-MPSP flows are competing with TCP flows in the 802.11ay links. By changing the number of flows, we guarantee that there will be fluctuations in the shared buffer occupancy at the AP, thus creating delay variations. LD-MPSP outperforms MPTCP DALIA in terms of ameliorating the adverse impact of fluctuating channel and delay dynamics. In addition, we observed that the LD-MPSP algorithm is robust and remains stable against such delay changes. It is also interesting to observe that, with the injected delay variations, the throughputs perceived by all existing MPTCP flows oscillate sometimes before they converge (which is prevalent in a single-path TCP as well).

Another performance assessment of our LD-MPSP is shown in Figure 12 with the following settings extended from Figure 6: (i) between the server and two devices, we used five MPTCP connections (each with five subflows selected from seven links EDMG-SC 802.11ay networks with spatial streaming up to eight); (ii) seven APs are deployed, and the devices move continuously at a speed of 25 m/s from one AP towards the next; (iii) the emulated packet traffic is created by files from an HTTP server, ranging in size from 2 MB to 10 MB. Observe in Figure 12 that LD-MPSP considerably outperforms the state-of-the-art MPTCP protocols in terms of ‘network utility maximization (NUM)’. More importantly, for the motivating example discussed in Table 1 earlier, the throughputs perceived by the two devices with LD-MPSP are 56.20 Gbps (EDMG SC) and 61.01 Gbps (EDMG OFDM); see Table 3.
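The relative improvement implied by Table 3 can be read off with a few lines of arithmetic. The sketch below copies the MPTCP and LD-MPSP throughputs (in Gbps) from Table 3 and computes the ratio of LD-MPSP to MPTCP for each case; the dictionary layout and the name `relative_gains` are illustrative choices, not part of the original evaluation.

```python
# Throughputs in Gbps, copied from Table 3 (four bonded channels).
TABLE_3 = {
    "EDMG SC": {
        "mobility": {"mptcp": 16.36, "ld_mpsp": 37.16},
        "maximum": {"mptcp": 41.32, "ld_mpsp": 56.20},
    },
    "EDMG OFDM": {
        "mobility": {"mptcp": 16.15, "ld_mpsp": 36.95},
        "maximum": {"mptcp": 42.13, "ld_mpsp": 61.01},
    },
}


def relative_gains(table):
    """Return the LD-MPSP / MPTCP throughput ratio for every mode and case."""
    return {
        mode: {case: row["ld_mpsp"] / row["mptcp"] for case, row in cases.items()}
        for mode, cases in table.items()
    }


for mode, gains in relative_gains(TABLE_3).items():
    for case, ratio in gains.items():
        print(f"{mode:10s} {case:9s} LD-MPSP / MPTCP = {ratio:.2f}")
```

On these figures, LD-MPSP carries roughly 2.3 times the MPTCP throughput under mobility and between roughly 1.36 and 1.45 times the maximum MPTCP throughput.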

7. Conclusions

In this paper, we have developed a complete understanding of the dynamics and problems of MPTCP over IEEE 802.11ay networks. We have demonstrated several ways to capture the impact of device mobility on MPTCP and to exploit it while scheduling packet transmissions among multiple 802.11ay links from the transport layer. We have found that the requirement is usually to transmit groups of packets with low latency. Thus, from the user’s perspective, the aggregate data transfer completion time is a more significant factor than the per-packet delay. We have developed a delay analysis model based on stochastic multipath scheduling of packets and hybrid beamforming of 802.11ay links, which enables us to design a novel MPTCP that considers packets inside congestion windows and their relationship to hybrid beamforming and device selection. More importantly, we have shed light on a data-driven TRA-Layer design platform. For example, we demonstrate how a multi-armed bandit can achieve delay balancing (with and without federated learning) and how to schedule replicated packets opportunistically for short and longer connections. A detailed investigation of the impact of low delay with forward error correction, 802.11 support for MPTCP sources, and extending the design over WiFi 7 are topics of our future study.

Author Contributions

Conceptualization, S.R.P. and M.M.; Methodology, S.R.P. and M.M.; Software, S.R.P.; Validation, S.R.P.; Formal analysis, S.R.P. and M.M.; Investigation, S.R.P. and M.M.; Resources, S.R.P. and M.M.; Data curation, S.R.P. and M.M.; Writing—original draft, S.R.P.; Writing—review & editing, M.M.; Visualization, S.R.P.; Supervision, M.M.; Funding acquisition, S.R.P. and M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was jointly funded by the Faculty of Science, Engineering and Built Environment (FSEBE/Deakin) and the school of IT (SIT/Deakin) Deakin University internal research grants and the Comcast/USA innovation fund 2022.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. IEEE Draft Standard for Information Technology-Telecommunications and Information Exchange Between Systems—Local and Metropolitan Area Networks-Specific Requirements Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications-Amendment 2: Enhanced Throughput for Operation in License-Exempt Bands Above 45 GHz; IEEE: Piscataway, NJ, USA, 2020.
2. Huang, K.; Huang, L.; Quan, Y.; Du, H.; Luo, C.; Lu, L.; Hou, R. Multi-Link Channel Access Schemes for IEEE 802.11be Extremely High Throughput. IEEE Commun. Stand. Mag. 2022, 6, 46–51.
3. Sahoo, A.; Gao, W.; Ropitault, T.; Golmie, N. Admission Control and Scheduling of Isochronous Traffic With Guard Time in IEEE 802.11ad MAC. IEEE Trans. Mob. Comput. 2022, 1–10.
4. Ghasempour, Y.; da Silva, C.R.C.M.; Cordeiro, C.; Knightly, E.W. IEEE 802.11ay: Next-Generation 60 GHz Communication for 100 Gb/s WiFi. IEEE Commun. Mag. 2017, 55, 186–192.
5. Wischik, D.; Raiciu, C.; Greenhalgh, A.; Handley, M. Design, Implementation and Evaluation of Congestion Control for Multipath TCP. In Proceedings of the USENIX NSDI Conference, Boston, MA, USA, 30 March–1 April 2011; Volume 11, p. 8.
6. Wu, H.; Caso, G.; Ferlin, S.; Alay, O.; Brunstrom, A. Multipath Scheduling for 5G Networks: Evaluation and Outlook. IEEE Commun. Mag. 2021, 59, 44–50.
7. Peng, Q.; Walid, A.; Hwang, J.; Low, S.H. Multipath TCP: Analysis, Design, and Implementation. IEEE/ACM Trans. Netw. 2016, 24, 596–609.
8. Michel, F.; Cohen, A.; Malak, D.; Coninck, Q.D.; Médard, M.; Bonaventure, O. FlEC: Enhancing QUIC With Application-Tailored Reliability Mechanisms. IEEE/ACM Trans. Netw. 2022, 1–14.
9. Wei, W.; Xue, K.; Han, J.; Wei, D.S.; Hong, P. Shared bottleneck-based congestion control and packet scheduling for multipath TCP. IEEE/ACM Trans. Netw. 2020, 28, 653–666.
10. 3GPP. 3GPP TS 23.501, V16.6.0 (ETSI); System Architecture for the 5G System; 3GPP: Paris, France, 2020.
11. Pokhrel, S.R.; Mandjes, M. Improving Multipath TCP Performance over WiFi and Cellular Networks: An Analytical Approach. IEEE Trans. Mob. Comput. 2019, 18, 2562–2576.
12. Van De Meent, R.; Mandjes, M.; Pras, A. Gaussian traffic everywhere? In Proceedings of the 2006 IEEE International Conference on Communications, Istanbul, Turkey, 11–15 June 2006; Volume 2, pp. 573–578.
13. Hohn, N.; Veitch, D.; Abry, P. Cluster processes: A natural language for network traffic. IEEE Trans. Signal Process. 2003, 51, 2229–2244.
14. Heemskerk, M.; Mandjes, M.; Mathijsen, B. Staffing for many-server systems facing non-standard arrival processes. Eur. J. Oper. Res. 2022, 296, 900–913.
15. Mandjes, M.; Van De Meent, R. Resource dimensioning through buffer sampling. IEEE/ACM Trans. Netw. 2009, 17, 1631–1644.
16. Asanjarani, A.; Nazarathy, Y.; Taylor, P. A survey of parameter and state estimation in queues. Queueing Syst. 2021, 97, 39–80.
17. Mandjes, M.; Ravner, L. Hypothesis testing for a Lévy-driven storage system by Poisson sampling. Stoch. Process. Their Appl. 2021, 133, 41–73.
18. Krishnasamy, S.; Sen, R.; Johari, R.; Shakkottai, S. Learning unknown service rates in queues: A multiarmed bandit approach. Oper. Res. 2021, 69, 315–330.
19. Pokhrel, S.R.; Choi, J. Low-Delay Scheduling for Internet of Vehicles: Load-Balanced Multipath Communication With FEC. IEEE Trans. Commun. 2019, 67, 8489–8501.
20. Xie, X.; Chai, E.; Zhang, X.; Sundaresan, K.; Khojastepour, A.; Rangarajan, S. Hekaton: Efficient and practical large-scale MIMO. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, Paris, France, 7–11 September 2015; pp. 304–316.
21. Chen, Z.; Zhang, X.; Wang, S.; Xu, Y.; Xiong, J.; Wang, X. Enabling Practical Large-Scale MIMO in WLANs With Hybrid Beamforming. IEEE/ACM Trans. Netw. 2021, 29, 1605–1619.
22. Amakawa, S.; Aslam, Z.; Buckwater, J.; Caputo, S.; Chaoub, A.; Chen, Y.; Corre, Y.; Fujishima, M.; Ganghua, Y.; Gao, S.; et al. White Paper on RF Enabling 6G—Opportunities and Challenges from Technology to Spectrum; 6G Flagship: Oulu, Finland, 2021.
23. Raiciu, C.; Handley, M.; Wischik, D. Coupled Congestion Control for Multipath Transport Protocols; IETF RFC 6356; Internet Engineering Task Force: Fremont, CA, USA, 2011.
24. Lee, C.; Jung, J.; Chung, J.M. DEFT: Multipath TCP for High Speed Low Latency Communications in 5G Networks. IEEE Trans. Mob. Comput. 2020, 20, 3311–3323.
25. McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B.A.y. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
26. Bellavista, P.; Giannelli, C.; Mamei, M.; Mendula, M.; Picone, M. Application-driven Network-aware Digital Twin Management in Industrial Edge Environments. IEEE Trans. Ind. Inform. 2021, 17, 7791–7801.
27. Gai, Y.; Krishnamachari, B.; Jain, R. Combinatorial network optimization with unknown variables: Multi-armed bandits with linear rewards and individual observations. IEEE/ACM Trans. Netw. 2012, 20, 1466–1478.
28. Xia, W.; Quek, T.Q.; Guo, K.; Wen, W.; Yang, H.H.; Zhu, H. Multi-Armed Bandit-Based Client Scheduling for Federated Learning. IEEE Trans. Wirel. Commun. 2020, 19, 7108–7123.
29. Zhu, Z.; Zhu, J.; Liu, J.; Liu, Y. Federated Bandit: A Gossiping Approach. Proc. ACM Meas. Anal. Comput. Syst. 2021, 5, 3–4.
30. Khalili, R.; Gast, N.; Popovic, M.; Le Boudec, J.Y. MPTCP is Not Pareto-optimal: Performance Issues and a Possible Solution. IEEE/ACM Trans. Netw. 2013, 21, 1651–1665.
31. Auer, P.; Cesa-Bianchi, N.; Fischer, P. Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 2002, 47, 235–256.
32. Pokhrel, S.R.; Williamson, C. A Rent-Seeking Framework for Multipath TCP. ACM SIGMETRICS Perform. Eval. Rev. 2021, 48, 63–70.
33. Assasa, H.; Grosheva, N.; Ropitault, T.; Blandino, S.; Golmie, N.; Widmer, J. Implementation and evaluation of a WLAN IEEE 802.11ay model in network simulator ns-3. In Proceedings of the WNS3 2021: 2021 Workshop on ns-3, Virtual, 23–24 June 2021; pp. 9–16.
34. Pokhrel, S.R.; Walid, A. Learning to Harness Bandwidth with Multipath Congestion Control and Scheduling. IEEE Trans. Mob. Comput. 2021, 1.
35. Pokhrel, S.R.; Choi, J. Federated Learning With Blockchain for Autonomous Vehicles: Analysis and Design Challenges. IEEE Trans. Commun. 2020, 68, 4734–4746.
36. Pokhrel, S.R.; Choi, J. Improving TCP Performance Over WiFi for Internet of Vehicles: A Federated Learning Approach. IEEE Trans. Veh. Technol. 2020, 69, 6798–6802.
37. Pokhrel, S.R.; Choi, J.; Walid, A. Fair and Efficient Distributed Edge Learning With Hybrid Multipath TCP. IEEE/ACM Trans. Netw. 2022, 1–13.

Figure 1. Motivating Example: Consider a fast-moving convoy of drones flying over the Darling Harbour of Sydney, Australia. The convoy (head to tail) has direct access to several Internet connection sites via mmWave WiFi 802.11ay antenna array system. Control information and actions from the server are disseminated within the convoy instantaneously via uplinks. Downlinks are mostly used for HD video streaming for monitoring anomalies around the Harbour area.

Figure 2. Enhanced directional multi-gigabit (EDMG) fields of 802.11ay.

Figure 3. A high-level view of the considered MPTCP data networking scenario in the Internet of Drones using multiple APs of IEEE 802.11ay. Our main focus in this paper is to develop a complete understanding of such dynamics of MPTCP over 802.11ay WiFi to identify and ameliorate the MPTCP design issues at the transport layer.

Figure 4. An abstract view of the analytic modeling approach demonstrating interactions between different cross-layer components of the design towards novel insights in developing the three main algorithms in this paper.

Figure 5. Temporal evolution of 5 TCP Reno vs. 5 MPTCP (LIA and LD-MPSP) users’ 802.11ay throughputs obtained from 0 s to 200 s over seven links EDMG-SC ay networks with spatial streaming (up to eight) with two different experiments. Experiment 1: observe that LIA cannot compete well with TCP (see ${tcp}_{Lia}$); Experiment 2: our algorithm (LD-MPSP) performs better than LIA when competing with TCP (see ${tcp}_{Ld-mpsp}$) flows (simulation is summarized in Table 2).

Figure 6. Simulation network topology: Multiple devices are moving in an elliptical orbit and are associated with several 802.11ay APs simultaneously using multipath TCP protocol at the TRA layer.

Figure 7. Quantifying and comparing the overhead in milliseconds of MS-UCB over seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight) with MPTCP LIA over 802.11ad.

Figure 8. Comparison of the temporal evolution of TRA layer throughputs in Mbps while transmitting bulk files. The proposed cross-layer MS-UCB’s performance is compared with Multipath TCP-based BUSH and HEKATON over seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight). MPTCP LIA’s throughput over 802.11ad is a baseline.

Figure 9. Comparing the energy efficiency of the proposed multipath MS-UCB with MPTCP-based BUSH and HEKATON over seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight). MPTCP LIA efficiency over 802.11ad is taken as a baseline.

Figure 10. Performance comparison of the proposed multipath MS-UCB with MPTCP-based BUSH and HEKATON over seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight). MPTCP LIA performance over 802.11ad is taken as a baseline.

Figure 11. Throughputs in Gbps over time of LD-MPSP and DALIA users in the 802.11ay links as observed at the devices using five links EDMG-SC 802.11ay networks with spatial streaming (up to eight).

Figure 12. Impacts of varying file size downloads (from 2 MB to 8 MB) on the perceived network utilities when devices are using seven links EDMG-SC 802.11ay networks with spatial streaming (up to eight) (several runs with LIA, DALIA and LD-MPSP using the extended setup of Figure 6).

Table 1. Throughputs of various cases with four bonded channels.

MCS	EDMG SC	EDMG OFDM
Maximum TCP Throughput	28.96 Gbps	30.95 Gbps
Mobility TCP Throughput	9.6 Gbps	9.25 Gbps
Mobility MPTCP Throughput	16.36 Gbps	16.15 Gbps
Maximum MPTCP Throughput	41.32 Gbps	42.13 Gbps

Table 2. Network and Protocol Parameters.

Parameter	Value	Parameter	Value
TRA Protocol	TCP/MPTCP	TCP Header	20 bytes
Aggregation	MSDU/MPDU	A-MSDU bytes	7935
IP Header	20 bytes	A-MPDU bytes	4,194,303
Payload size	1460 bytes	Propagation delay	6 ms
Block Ack Size	1024 frames	No. of Transmits	27 sectors
Sector Azimuth	−80° : 20° : 80°	Congestion window	1 (initial)
Sector Elevation	−45° : R0° : 45°	Transmit Power	12 dBm
MPTCP	OLIA/BALIA	File size	small/bulk

Table 3. Throughputs of various cases with four bonded channels (Cf., Table 1).

MCS	EDMG SC	EDMG OFDM
Maximum TCP Throughput	28.96 Gbps	30.95 Gbps
Mobility TCP Throughput	9.6 Gbps	9.25 Gbps
Mobility MPTCP Throughput	16.36 Gbps	16.15 Gbps
Maximum MPTCP Throughput	41.32 Gbps	42.13 Gbps
Mobility LD-MPSP Throughput	37.16 Gbps	36.95 Gbps
Maximum LD-MPSP Throughput	56.20 Gbps	61.01 Gbps

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Pokhrel, S.R.; Mandjes, M. Internet of Drones: Improving Multipath TCP over WiFi with Federated Multi-Armed Bandits for Limitless Connectivity. Drones 2023, 7, 30. https://doi.org/10.3390/drones7010030
