1. Introduction
Millimeter wave (mmWave) frequencies constitute a major component of standalone (SA) 5G networks, supporting high data rates in enhanced mobile broadband (eMBB). One key advantage at these bands is the large amount of contiguous available spectrum. However, the aggregated path losses impose the use of beamforming techniques to achieve higher link gains. This results in directional transmission and reception modes, which yield prolonged access times. Meanwhile, the International Mobile Telecommunications (IMT) framework specifies 10 millisecond (ms) latency levels for eMBB in 5G systems [
1]. Hence, a major challenge is to provide fast access schemes that feature ultralow access times, along with reduced power and energy consumption. Additionally, these access schemes need to account for channel fluctuations and variations in link status due to blockage, as well as mobility effects.
Currently, conventional schemes dictate that the mobile station (MS) and base station (BS) perform a spatial search over all directions in order to determine the beamforming and combining vectors with the highest received signal level. For example, the work in [
2] proposes a hierarchical codebook for iterative search that uses wide beams in the initial search stages; refinement is then conducted in subsequent stages using narrow beams. However, this technique can suffer from reduced directivity, outages and sensitivity to blockage due to the low gains achieved in the initial codebook stage. Moreover, the work in [
3,
4] uses metaheuristics, e.g., generalized pattern search and Hooke–Jeeves methods, in efforts to shorten access times and reduce energy consumption. The work in [
5] exploits the sidelobe information to retrieve the direction of the main lobe. However, this scheme is limited to line-of-sight (LoS) and single-ray channels. Moreover, the work in [
6] exploits grating lobes for simultaneous transmission to increase directivity. However, this scheme features a complex beamforming structure with a large number of antennas and high power requirements.
Furthermore, the geolocation-assisted access scheme in [
7] utilizes the global positioning system (GPS) at the MS to determine the BS location. However, this context-based scheme is limited to outdoor settings with permanent GPS connectivity, and it requires the BS to conduct an exhaustive beam search to locate the MS. The work in [
8] proposes a single radio frequency (RF) chain architecture for multi-user operation that uses downlink (DL)–uplink (UL) and DL–DL beam-training techniques, where a subset beam group is trained in a single time slot. However, the use of a single RF chain at the BS limits the number of connected MSs and the scalability. Finally, the work in [
9] proposes a subarray-cooperation multiresolution codebook design. It features a beam alignment scheme that adaptively selects initial layers based on various simultaneous signal-to-noise ratio (SNR) levels. Hence, it quickly aligns the desired beam pairs under single-dominant-path channels by using hybrid beamforming. Overall, the aforementioned schemes still yield high computational complexity, prolonged access times and high power and energy requirements.
Various studies have used deep learning to solve the problem of beam management for mmWave communication systems. First, the authors of [
10] use a deep neural network (DNN) to predict the beam direction at the BS, while implementing an omnidirectional antenna at the MS. The work only studies the accuracy of the DNN algorithm using 24 beams at the BS and aims to outperform the exhaustive beam mechanism. Additionally, the work in [
10] considers an omnidirectional MS and a directional BS. However, the omnidirectional mode at the MS presents various challenges in terms of signal quality and throughput, which require further investigation. By contrast, the work proposed in this paper uses 64 beams and considers overall system performance, including access times, power and energy consumption. Furthermore, results are compared to the fastest beam access schemes reported in the literature. In addition, this paper proposes beamforming models at the MS and BS, which are then used with the deep learning network to study comprehensive performance metrics.
Furthermore, deep-learning-based beam selection is proposed in [
11] to reduce the time overhead by exploiting sub-6 GHz channel information. A DNN algorithm is used to estimate the power delay profile (PDP) of the sub-6 GHz channel, which then acts as an input to the DNN. Overall, this work relies on the support of sub-6 GHz connections, thus limiting the ability of mmWave networks to operate separately. This becomes inefficient for a 5G New Radio network operating at FR2. Moreover, the work assumes that the sub-6 GHz link is already established, which makes the access time incomplete, i.e., it is required here to study the time complexity for beam association from the instant a MS joins the network until the start of the data plane. There is also a lack of comprehensive beamforming designs at the MS and BS, as the work is limited to a conventional discrete Fourier transform (DFT)-based codebook. Similarly, the work in [
12] also relies on sub-6 GHz channel vectors for initial beam access and blockage prediction in mmWave systems. As opposed to the methodology of [
11], which extracts spatial channel characteristics at the sub-6 GHz band and then uses them to reduce the mmWave beam training overhead, here the mapping functions are predicted directly from the sub-6 GHz channel. Specifically, the model leverages transfer learning to reduce the learning time overhead. However, the estimation of the mapping functions is often complicated and requires a large neural network to achieve accuracy. In addition, the work again relies on sub-6 GHz bands to realize beam access at mmWave, i.e., dual-band systems (microwave and mmWave transceivers) are needed at the BS and MS. Here, the power consumption and access time need to be further investigated.
For mmWave vehicular communication, the authors of [
13] propose a beam alignment procedure based on a fingerprinting approach, in which a set of beam pairs constitutes the fingerprint of a given location. Deep learning is deployed at the BS to adapt and update these fingerprints. Moreover, a plurality mechanism is proposed for the beams that satisfy the received signal strength requirement, i.e., to achieve multiplexing and diversity gains. The outcomes aim to improve the fidelity as compared to exhaustive beam search fingerprinting without deep learning.
Another use of DNNs is beam management and interference coordination in dense indoor mmWave networks for IEEE 802.11ay in [
14], where the beam directions, beamwidths and transmit power are optimized. The goal is to reduce the computational complexity and time while obtaining a sum-rate comparable to conventional methods. However, the scheme uses a beamforming training mechanism to establish the directional links between the mobile access point (MAP) and stationary access point (SAP), which is subsequently used to generate training data for the DNN to mitigate interference. Therefore, the deep learning network is not used here for initial access. Moreover, the implementation is limited to indoor wireless local area networks (WLAN) and is not applied to outdoor settings at larger separation distances.
The authors of [
15] describe a specific dataset for beam selection techniques in vehicle-to-infrastructure communications using millimeter waves. A methodology for channel data generation in mmWave multiple-input multiple-output (MIMO) scenarios is presented, aiming to simplify the creation of data in mobility scenarios by invoking a traffic simulator and a ray-tracing simulator. However, the context here is different and unrelated to initial access between the MS and BS cells, i.e., the propagation channel dataset developed here by ray tracing is specific to vehicle-to-infrastructure mmWave networks. Overall, the work in [
15] focuses on modeling mobility only and lacks the analysis of the downlink performance. By contrast, the proposed work in this paper focuses on standalone mmWave networks, considering link performance with beamforming architectures at the BS and MS.
Furthermore, deep learning is also used for beam training in [
16] for a mmWave massive MIMO system. The nonlinear properties of channel power leakage are used in the estimation process, where a DNN predicts the best beam combination that yields the strongest channel path based on a probability vector, in efforts to improve the success and achievable rates at a lower overhead. However, this work lacks latency and power models, as well as comprehensive beamforming modeling at the BS/MS.
Moreover, the work in [
17] presents a beam alignment technique with partial beams using neural networks for multi-user mmWave massive MIMO systems, in efforts to improve the spectral efficiency at a reduced training overhead as compared to hierarchical search and compressed sensing methods. Offline training is conducted on the channel model, after which online prediction of the beam distribution vector is achieved using partial beams. The dominant indices obtained from the beam distribution vector are then used to align the beams for the multiple users. The work in [
18] combines machine learning and situational awareness to learn the power and optimal beam index, whereby the angles of arrival (AoAs) are first estimated based on the location and then used as input to the neural network for beam selection. However, the requirement for user location information (prior knowledge) for training weakens the proposed algorithm and adds to the system complexity.
A joint beamforming approach between distributed BSs is developed in [
19] that deploys machine learning to simultaneously serve a mobile MS. The MS transmits a single uplink training sequence to the participating BSs using omni or quasi-omni beam patterns to develop location signatures. The signatures are then used at the deep learning stage to estimate the beamforming vectors at these BSs, thus reducing the training overhead. The limitation of this work is the use of wide beams (omni or quasi-omni), which makes it inefficient in blockage scenarios during user mobility. These wide beams also yield low channel gains and throughput levels. Out-of-band information is likewise used in [
20] for a deep learning beam prediction scheme that minimizes the training complexity. Namely, a dual-band (sub-6 GHz and mmWave links) approach is implemented, where the optimal beam in the mmWave band is estimated from sub-6 GHz channel state information (CSI). The work focuses on testing the network accuracy without investigating the beamforming and channel models. One limitation here is the assumption of similar spatial features between the channels of the two bands, which is inaccurate.
Overall, existing deep-learning-based beam access schemes (summarized in
Table 1) still rely on key operating assumptions that conflict with the objectives of the FR2 NR of 5G systems. First and foremost, some models assume an omnidirectional mode at the MS, where the beam discovery is limited to the BS. Others depend upon multiple BSs, the MS location or sub-6 GHz bands, and thus fail to operate mmWave as a standalone network. Other limitations include indoor-only implementation and marginal enhancement over existing conventional methods (e.g., beam sweeping and exhaustive searches). Overall, these models lack time delay and power consumption models in the control plane, and hence work is needed to investigate the delay in standalone beamforming-based mmWave networks.
In light of the above, this paper proposes a first use of a deep learning network model for initial beam access in mmWave communications, with the goal of developing one of the fastest beam access schemes. The model operates in learning and training modes and aims to predict the best beam index over subsequent time steps, where these indices are affiliated with specific beamforming and combining vectors.
The key practical application of the proposed work is enhancing mmWave networks as part of the 5G FR2 New Radio. Current 5G implementations rely on conventional sub-6 GHz bands and leverage mmWave bands as a supplementary component, e.g., dual bands and carrier aggregation. However, this is projected only for the first phase of 5G, as the mmWave bands are expected to work independently starting in 2022–2024, contingent upon the development of mature technologies and optimization to support the targeted throughput and latencies. Hence, the mmWave bands are projected to provide standalone service without dependency on the microwave (legacy) bands. Thus, the beamforming capability enhances the channel quality and throughput for the user. Furthermore, the deep learning algorithm reduces the access times and control-plane latencies, meeting the ultra-low delays targeted by the 3GPP at 1 ms. This improves the quality of service (QoS) and enables the implementation of mmWave standalone networks.
Moreover, the technique can be adopted in wireless local area networks (WLAN) as part of the IEEE 802.11ay standard, as well as in mmWave links for vehicle-to-everything (V2X) after adding the mobility component. Furthermore, the deep learning algorithm enables the use of highly directional beams without reliance on wide-beam codebooks; this in turn eliminates the vulnerability to beam blockage due to low directivity. Namely, the MS will be able to use narrow beams when transiting from sleep (off) mode to idle or active mode in the control plane, thus helping to reduce the beam search time. As a result, high data rates can be supported here, i.e., leveraging the high channel capacities and aggregated antenna gains.
This paper is organized as follows.
Section 2 presents the beamforming, signals and channel models. Then the beam access scheme is proposed in
Section 3. Performance evaluation is presented in
Section 4, along with conclusions in
Section 5.
3. Beam Prediction Access Scheme
The key processing elements of the proposed BRNN-LSTM deep learning model for beam prediction are now presented. First, a unidirectional recurrent neural network (RNN) updates hidden layers based upon information received from the input layer as well as the activation state. However, a limitation of the unidirectional RNN is that it learns from the past only. Hence, a bidirectional approach is adopted in this work to improve the RNN. In this merge, one direction learns the past state, whereas the other learns the future state, and the two outputs are then combined for an enhanced estimate. Therefore, the bidirectional feature enables the long short-term memory (LSTM) network to train each input sequence in disjoint forward and backward states that are subsequently connected to the same output layer. This strengthens the ability of the LSTM to retrieve additional beam index contextual information as compared to the conventional LSTM method. The processes at the backward and forward states are similar at each of the bidirectional units.
Another limitation of RNN networks during the training process is the vanishing gradient problem for long data sequences. Hence, LSTM networks are adopted to solve this problem by introducing memory blocks (units) that are comprised of self-connected memory cells and multiplicative gates, thus enabling the learning of long-term dependencies. Therefore, the work here combines the strengths of past/future (backward and forward) state information in the BRNN with the powerful memory blocks of the LSTM, for extended training periods and more information, to improve the quality and accuracy of the beam prediction. Furthermore, four BRNN-LSTM layers are stacked (chained) to achieve higher precision. The proposed BRNN-LSTM method has three phases, i.e., input, hidden layers and output, where each hidden layer is represented by a bidirectional LSTM cell. Along these lines, the prediction scheme combines the BRNN and LSTM to achieve a suitable solution for time-series prediction of variable sequence lengths, i.e., duplicating the training on the input sequences (information from the dataset) by leveraging forward and backward states. The architecture for the proposed scheme is presented next.
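To illustrate the forward/backward fusion described above, the following minimal NumPy sketch uses a plain RNN cell as a stand-in for the full LSTM unit; the weights, dimensions and the `simple_rnn_pass`/`bidirectional_pass` helpers are hypothetical, for exposition only:

```python
import numpy as np

def simple_rnn_pass(x_seq, w_x, w_h):
    """Run a plain RNN over a sequence; return the hidden state at each step."""
    h = np.zeros(w_h.shape[0])
    states = []
    for x in x_seq:
        h = np.tanh(w_x @ x + w_h @ h)
        states.append(h)
    return states

def bidirectional_pass(x_seq, w_x, w_h):
    """Forward and backward passes are computed independently, then the two
    hidden states at each time step are concatenated (fused) for the output layer."""
    fwd = simple_rnn_pass(x_seq, w_x, w_h)
    bwd = simple_rnn_pass(x_seq[::-1], w_x, w_h)[::-1]  # reverse back to align steps
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(0)
seq = [rng.standard_normal(3) for _ in range(5)]   # 5 time steps, 3 features each
w_x = rng.standard_normal((4, 3))                  # input-to-hidden weights (4 hidden units)
w_h = rng.standard_normal((4, 4))                  # hidden-to-hidden weights
out = bidirectional_pass(seq, w_x, w_h)
print(len(out), out[0].shape)                      # 5 fused states, each of size 2 x 4
```

In a real bidirectional layer the forward and backward directions carry separate weights; a single shared weight set is used here only to keep the sketch short.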
3.1. Network Architecture
The network architecture for the proposed scheme is presented in
Figure 2. It is composed of the input sequences and four BRNN-LSTM layers, where each layer is composed of 50 cells (neurons).
Each LSTM cell in the BRNN-LSTM model is composed of input (i_t), input modulation (g_t), forget (f_t) and output (o_t) gates that determine the information entering the cell state; see
Figure 3.
The output of the last BRNN-LSTM layer is fed as the input of the dense layer (as per
Figure 2), which uses a linear activation function. The output of the dense layer is the output of the proposed scheme, i.e., the beam index prediction at time step t + 1. The process by which this prediction is achieved is now presented.
3.2. Operating Modes
The network operates in two modes, i.e., the learning (Mode I) and training (Mode II) modes.
Learning Mode (Mode I): The network here operates in the normal mode, where beam scanning is performed at the MS and BS using conventional schemes, e.g., codebook-based iterative search. Namely, once a MS transits from sleep to active mode and joins the mmWave network, a search is conducted over all beamforming and combining vectors to determine the best beam index and its affiliated direction at time step t, i.e., the pair yielding the highest received signal level.
Thereafter, the BS and MS feed the best beam index at every time step to the model for use in the training mode. Once the model is well trained, the MS and BS leverage it to predict the next best beam, as presented next in Mode II.
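The conventional search of Mode I can be sketched as follows; the DFT codebooks, array sizes and random channel matrix below are illustrative assumptions, not the system model of this paper:

```python
import numpy as np

def dft_codebook(n_ant, n_beams):
    """Columns are DFT beamforming vectors (a common codebook choice)."""
    k = np.arange(n_ant)[:, None] * np.arange(n_beams)[None, :]
    return np.exp(-2j * np.pi * k / n_beams) / np.sqrt(n_ant)

def exhaustive_search(H, F, W):
    """Scan all (combining, beamforming) pairs and return the index pair
    with the highest received signal level |w^H H f|."""
    gains = np.abs(W.conj().T @ H @ F)   # rows: MS beams, cols: BS beams
    return np.unravel_index(np.argmax(gains), gains.shape)

rng = np.random.default_rng(1)
F = dft_codebook(n_ant=16, n_beams=64)   # BS codebook (hypothetical sizes)
W = dft_codebook(n_ant=8, n_beams=16)    # MS codebook
H = rng.standard_normal((8, 16)) + 1j * rng.standard_normal((8, 16))  # MS x BS channel
ms_beam, bs_beam = exhaustive_search(H, F, W)
print(ms_beam, bs_beam)
```

Every codebook pair is evaluated, which is exactly the cost the prediction scheme of Mode II is designed to avoid at subsequent time steps.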
Training Mode (Mode II): Given the sequences of the most selected beam indices over the time steps, retrieved from the dataset, the MS and BS now predict the next most likely beam to be used at time step t + 1. Namely, the prediction scheme leverages parametric information from previous time steps (periods) and then labels the next step to predict the beam index that returns the highest signal. The BRNN-LSTM scheme recursively processes the beam sequences at every time step of the input. It then maintains a hidden state which is a function of the previous state and the current input.
Problem Formulation: Consider the prediction status at time step t. The beam access problem is then defined as the prediction of the best beam direction at time step t + 1, given the status at time step t. Thus, the goal is to maximize the probability of successful beam prediction at the BS using the proposed BRNN-LSTM deep learning model.
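As a hypothetical illustration of this objective, the empirical success probability can be estimated as the fraction of time steps at which the predicted beam index matches the true best index (the index values below are made up):

```python
import numpy as np

# True best beam indices per time step vs. the model's predictions (hypothetical).
true_beams = np.array([12, 12, 13, 14, 14, 15, 15, 16])
pred_beams = np.array([12, 12, 13, 13, 14, 15, 15, 16])

# Empirical probability of successful beam prediction.
success_prob = np.mean(pred_beams == true_beams)
print(success_prob)  # 0.875 (one mismatch out of eight steps)
```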
3.3. BRNN-LSTM Deep Learning Model
The beam prediction algorithm relies on the BRNN-LSTM network in two stages, i.e., the outer processing stage between the layers and the inner stage inside the LSTM cell. For the outer stage between the layers, the input data is first processed by both the bidirectional forward and backward layers to obtain the hidden states. These data reflect contextual information about the beam indices and affiliated power levels. Following this step, the hidden states are fused to obtain the output layer. Here, the bidirectional property enables the LSTM to retrieve additional beam index contextual information as compared to the conventional LSTM, i.e., obtaining the current and future time steps via the backward and forward states. The processes at the backward and forward states are similar at each of the bidirectional LSTM units, where each LSTM cell is composed of input (i_t), input modulation (g_t), forget (f_t) and output (o_t) gates that determine the information entering the cell state. Now consider the details inside the LSTM cell.
For the inner stage inside the LSTM unit, the cell state c_t at time step t, which specifies the information carried to the next sequence, is first modified by the input gate i_t in the sigmoid layer placed underneath it, which is in turn adjusted by the input modulation g_t that delivers the new candidate cell state. The forget gate f_t receives the hidden-state vector h_{t−1} (output vector of the LSTM unit) at time step t − 1 and the input vector x_t at time step t as its inputs. This gate then produces an output number between 0 and 1 for each number in the previous cell state c_{t−1}. Namely, the output of f_t instructs the cell state on which information to forget or discard by multiplying a position in the matrix by 0. Meanwhile, if the output of f_t is 1, then the information is kept in the cell state. Here, a sigmoid function σ is applied to the weighted input and previous hidden state. Equations (7)–(11) represent the i_t, g_t, f_t, o_t, and c_t formulations at time step t, respectively, expressed as [22]

i_t = σ(W_i · [h_{t−1}, x_t] + b_i), (7)
g_t = tanh(W_g · [h_{t−1}, x_t] + b_g), (8)
f_t = σ(W_f · [h_{t−1}, x_t] + b_f), (9)
o_t = σ(W_o · [h_{t−1}, x_t] + b_o), (10)
c_t = f_t ⊙ c_{t−1} + i_t ⊙ g_t. (11)
The parameters W_i, W_g, W_f and W_o are the weight matrices and b_i, b_g, b_f and b_o are the bias vectors for i_t, g_t, f_t and o_t, respectively, i.e., learnt during the training mode. Finally, the hidden-state layer output h_t (working memory) is modeled as h_t = o_t ⊙ tanh(c_t). Note that the key components of the model are the logistic sigmoid σ and the hyperbolic tangent (tanh) nonlinear activation functions for each gate, used to predict the probability of the output. Foremost, i_t uses a sigmoid function with range [0, 1] that can only add memory. Namely, the sigmoid function is unable to forget/clear memory, since the cell state equation is a summation between the gated previous cell state and the new candidate state. Therefore, g_t is activated with a tanh function with a [−1, 1] range, which allows the cell state to forget memory. Overall, the training settings for the model include four BRNN-LSTM layers and one dense layer, as presented in
Figure 2.
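A minimal NumPy sketch of the per-cell update in Equations (7)–(11) is given below; the dimensions and randomly initialized weights are hypothetical, for illustration only:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell update: the four gates act on the concatenated
    previous hidden state and current input."""
    z = np.concatenate([h_prev, x_t])
    i_t = sigmoid(W["i"] @ z + b["i"])   # input gate, range (0, 1): adds memory
    g_t = np.tanh(W["g"] @ z + b["g"])   # input modulation, range (-1, 1)
    f_t = sigmoid(W["f"] @ z + b["f"])   # forget gate: 0 discards, 1 keeps
    o_t = sigmoid(W["o"] @ z + b["o"])   # output gate
    c_t = f_t * c_prev + i_t * g_t       # new cell state (long-term memory)
    h_t = o_t * np.tanh(c_t)             # hidden-state output (working memory)
    return h_t, c_t

rng = np.random.default_rng(2)
n_in, n_hid = 1, 4                       # hypothetical small dimensions
W = {k: rng.standard_normal((n_hid, n_hid + n_in)) for k in "igfo"}
b = {k: np.zeros(n_hid) for k in "igfo"}
h, c = lstm_step(np.array([0.5]), np.zeros(n_hid), np.zeros(n_hid), W, b)
print(h.shape, c.shape)
```

Note how the cell state is a sum of the gated previous state and the gated candidate, matching the discussion of why only the tanh-activated term can drive the state back toward zero.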
The dropout layer is used to regularize the hidden layers, where the dropout rate in each layer is set at 0.2. The model is trained with 350 epochs over a period of two weeks. A data structure is created with 60 time steps, each of 10 min, and a single output is created, since LSTM cells store a long-term memory state. Hence, in each training stage, there are 60 previous training set elements for each taken sample. Consequently, in the testing stage, the first 60 samples are needed for an accurate estimate of the subsequent best beam index. Overall, the training objective is to compute the weight matrices and bias vectors that minimize the loss function over all training time steps, as shown next. See
Table 2 for the parametric settings chosen for the layers in the BRNN-LSTM model.
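The 60-step data structure described above can be sketched as a sliding-window construction; the `make_windows` helper and the synthetic beam log are hypothetical stand-ins for the real dataset:

```python
import numpy as np

def make_windows(series, n_steps=60):
    """Build supervised pairs: each sample holds the previous n_steps beam
    indices, and its label is the index at the next time step."""
    X = np.array([series[i:i + n_steps] for i in range(len(series) - n_steps)])
    y = series[n_steps:]
    return X, y

beam_log = np.arange(200) % 64           # hypothetical logged best-beam indices
X, y = make_windows(beam_log, n_steps=60)
print(X.shape, y.shape)                  # (140, 60) (140,)
```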
Dataset: The dataset used in this paper is part of the
BigData Challenge in [
23], recorded over a period of two weeks. Namely, this dataset reveals the MS traffic volumes and the used beam indices in sectorized geographical grids.