Article

Budgeted Thompson Sampling for IRS Enabled WiGig Relaying

1
Computational Learning Theory Team, RIKEN-Advanced Intelligence Project (AIP), Fukuoka 819-0395, Japan
2
Engineering Department, NRC, Egyptian Atomic Energy Authority, Cairo 13759, Egypt
3
Faculty of Arts and Science, Kyushu University, Fukuoka 819-0395, Japan
4
Department of Informatics, Kyushu University, Fukuoka 819-0395, Japan
5
Department of Electrical Engineering, College of Engineering in Wadi Addawasir, Prince Sattam Bin Abdulaziz University, Wadi Addawasir 11991, Saudi Arabia
6
Department of Electrical Engineering, Aswan University, Aswan 81542, Egypt
*
Author to whom correspondence should be addressed.
Electronics 2023, 12(5), 1146; https://doi.org/10.3390/electronics12051146
Submission received: 15 January 2023 / Revised: 17 February 2023 / Accepted: 22 February 2023 / Published: 27 February 2023

Abstract

Intelligent reconfigurable surface (IRS) is a competitive relaying technology for widening WiGig coverage, as it offers an effective means of addressing blockage. However, selecting the IRS relay that maximizes the attainable data rate is time-consuming, because it requires WiGig beamforming training (BT) to tune the phase shifts (PSs) of the WiGig base station (WGBS) and the IRS relays. This paper proposes a self-learning budgeted Thompson sampling approach for IRS relay probing (BTS-IRS) to address this challenge. The BT time cost of probing an IRS relay is incorporated into the main BTS formula: the payoff and cost posterior distributions are sampled separately, their ratio is computed, and the arm/IRS relay with the highest ratio is selected. This enables an IRS relay to be chosen at a low BT time cost. Numerical results demonstrate the improved performance of the BTS-IRS relaying technique in terms of BT time cost, spectral efficiency, and attainable data rate when compared to other benchmarks.

1. Introduction

The 30–300 GHz band used by WiGig communications is regarded as an essential element of the upcoming beyond fifth-generation (B5G) and sixth-generation (6G) wireless networks, as it provides an extensive spectrum for transmission [1]. Nevertheless, operating at such high frequencies leads to harsh channel characteristics and a high sensitivity to blockage, as even shadowing by individuals and raindrops can disrupt a WiGig link [1]. Furthermore, the short transmission range limits the coverage of WiGig [2]. Thus, diverse beamforming training (BT) approaches are commonly used in conjunction with WiGig transmissions to extend the communication range [2]. However, BT incurs a considerable time overhead, which reduces the attainable data rate of the WiGig link.
Currently, the intelligent reconfigurable surface (IRS) is recommended as a critical enabling technology for future B5G/6G wireless networks [3]. IRS panels comprise passive antenna elements whose phase shifts (PSs) are calibrated to steer incoming electromagnetic (EM) waves towards a designated receiver [4]. Hence, an IRS significantly boosts and extends the wireless transmission/reception range [3,4].
In contrast to traditional relaying methods, such as amplify and forward (AF) or decode and forward (DF), IRS is particularly well-suited for low-cost installation, given the absence of the costly radio frequency (RF) chains required by traditional relaying [5]. Therefore, the literature has studied and recommended IRS applications in various wireless scenarios [3,4].
Additionally, there is a mutually beneficial relationship in IRS-enabled WiGig communications. The IRS can act as a relay for the WiGig signal, extending its range and circumventing obstacles. In turn, WiGig base stations (WGBSs) can use IRS panels to direct beams to specific WiGig receivers or destinations. This concept can also be applied to terahertz (THz) communications in forthcoming 6G networks, which have fragile channels and high levels of path blockage [6]. Furthermore, IRS-enabled WiGig relaying reduces the complexity of traditional WiGig relaying by eliminating the large number of RF chains. Hence, some related work has studied IRS-enabled WiGig communications, particularly in high-blockage scenarios [7,8,9]. The literature of [10,11,12] has extensively explored challenges such as estimating the cascaded IRS-enabled WiGig channel and optimizing the antenna phase shifts of both the WGBS and the IRS relay [13,14] (herein, the IRS panels act as relays, and we aim to find the best nearby panel; a similar term is used in [13,14]). IRS relay probing has a range of real-world applications, including extending the coverage of wireless networks in rural and remote areas, enabling emergency response and disaster relief, improving wireless capacity, helping to develop and deploy B5G/6G systems, and providing a reliable and consistent connection for the Internet of Things (IoT).
Our paper is motivated by the investigation of WiGig-aided IRS relaying. In recent years, WiGig and IRSs have been widely heralded as groundbreaking technologies that could enhance the performance of wireless communication systems, particularly in terms of increased spectral efficiency and reduced energy consumption. Nonetheless, the performance of an IRS depends heavily on the selected precoding technique and on the training time required to determine the optimal precoder weights. This is because an IRS manipulates the signal by tuning the reflection coefficients of its elements to adapt to channel conditions. Therefore, our objective is to identify the best surrounding IRS panel for relaying information from the WGBS to the mobile user when the LoS path is blocked. Specifically, this research focuses on finding the best IRS relay among the scattered relays (IRS boards), i.e., the relay/board that maximizes the attainable throughput at the WiGig client. This is a complex practical issue in deploying IRS-empowered WiGig relaying. The difficulty comes from the significant BT overhead needed to jointly tune the phase shifts of the WGBS and the IRS relay (first, the phase shifts from the WGBS to the probed IRS panel have to be tuned; then, another tuning process at the IRS directs the signal to the client). Hence, exhaustively probing all of the deployed IRS boards to locate the best one is impractical due to the severe time consumption. In this paper, the IRS relay probing problem is therefore formulated as a budgeted multi-armed bandit (MAB) problem [15].
The key findings and achievements outlined in this paper are listed as follows:
  • The WiGig IRS relay probing problem is mathematically formulated as an optimization problem that chooses the IRS relay maximizing the attainable data rate of the WiGig link, subject to the time consumed by the BT procedure of the chosen IRS panel;
  • The WiGig-IRS relay probing problem is reformulated as an online budgeted MAB, and budgeted Thompson sampling (BTS-IRS) with guaranteed time-cost performance is proposed to handle it. Herein, the WGBS, IRS boards/relays, throughput, and BT time are the bandit player, arms, reward, and cost, respectively;
  • The proposed BTS-IRS algorithm performs well against other benchmark schemes in numerical simulations of various scenarios because the BT processing time is added to the classical TS formula as a budget/cost;
  • Numerous numerical evaluations were carried out to compare the proposed BTS scheme with benchmarks such as the brute-force solution, the classical TS algorithm [16,17], and random selection of IRS relays. Moreover, BTS-IRS is compared to previous cost-effective UCB-based MAB schemes. The BTS-IRS scheme exhibited an average spectral efficiency superior to the other cost-effective MAB schemes, with comparable BT time cost.
The rest of this paper is organized as follows. Section 2 details existing research related to the topic. Section 3 presents the WiGig-IRS system model, which includes the IRS-assisted WiGig channel model and the optimization problem for IRS relay probing. Section 4 presents the proposed BTS-IRS scheme. Numerical simulations are discussed in Section 5. Finally, Section 6 offers the conclusions and findings.

2. Related Works

Thanks to advances in EM technology, IRS-aided communication is envisioned to overcome the line-of-sight difficulties of future wireless communications [3,4]. IRS panels can assist satellite communications [18,19,20], IoT networks [21], UAV communications [7], etc. Besides saving energy, an IRS overcomes the half-wavelength issue of antenna elements. Moreover, IRSs reuse EM waves, resulting in a satisfactory power expenditure. Furthermore, their innovative radio propagation capabilities enable ubiquitous secured wireless transmissions [22]. The two main active applications of IRS in wireless systems are IRS modulators and IRS relays. The authors of [23] investigated IRS-enhanced modulation, which is achieved by adjusting the IRS reflection coefficients to efficiently forward the transmitted signal from the antenna feeder. In IRS relay implementations, the power transmitted from the WGBS to the IRS panel is redirected to the client via continuous readjustment of the PSs [24]. Few related studies have investigated the impact of IRS on WiGig wireless networks. In [25], coauthors of this work proposed double-stage MAB methodologies for WiGig-IRS wireless systems to select the route via the IRS panel that attains the maximum spectral efficiency (SE). Still, relay probing was not considered in that work. They also leveraged MAB schemes in [7] to determine the best route via IRS in a UAV-mounted IRS system that enhances WiGig communications in hotspot zones. Furthermore, the authors of [8] derived a tractable coverage model of IRS-aided WiGig networks based on stochastic geometry. A federated learning (FL) aided model of WiGig-enabled IRS systems was proposed in [9]. The authors of [10] explored an effective two-stage channel estimation formulation for hybrid IRS-assisted WiGig MIMO systems, while [11] derived it via atomic norm minimization.
In [12], the average spectral efficiency of an IRS-aided WiGig MIMO architecture was optimized using hybrid precoders for the WGBS and passive precoders for the IRS, taking advantage of time-varying channel state information (CSI). Furthermore, the work of [24] studied a multi-user hybrid precoding (HP) scheme for IRS-enhanced WiGig systems. The work of [26] proposed a deep learning (DL) scheme that attains the optimal communication rate for IRS systems. In [27], the authors derived a machine learning (ML) enabled solution for efficient beam management in IRS-aided WiGig systems. Hybrid IRS relay structures were established in [28] with better performance than traditional IRS models. Precise SINR formulations for an IRS-enabled WiGig AF relay were derived in [29], where the optimal PSs with maximal end-to-end SINR were attained by allocating the optimal power of the AF relay. The work of [30] investigated sequential fractional programming (SFP) for multi-objective optimization of passive beamforming at the WiGig-assisted IRS, user association, and power allocation. Furthermore, different IRS-aided WiGig BT structures were explored in [31] to determine the optimal transmission coefficients. IRS panels were utilized to assist cell-free MIMO communications in [32,33]. The research in [34] examined IRS selection in vehicular systems, where the vehicle chooses the IRS panel that results in the highest received signal. However, [32,33,34] did not address WiGig IRS relaying and neglected the BT time minimization problem. In [35], coauthors of this work applied cost-effective MABs based on modified UCB schemes, i.e., MAB-CE1 and MAB-CE2, to IRS millimeter wave relaying. Since TS has been reported to outperform UCB in many bandit settings, this paper leverages BTS for WiGig-IRS relay probing with guaranteed performance.

3. System Model and Problem Statement

The system model depicted in Figure 1 involves a group of IRS relays, numbered from 1 to R, within the coverage area of an IRS-enabled WiGig system. The IRS relays are strategically placed to boost the WiGig link. The figure shows a total blockage of the direct link, i.e., the line of sight (LoS), between the WGBS and the mobile user (MU)/client. Therefore, the WGBS should choose one of the surrounding IRS panels to relay its data to the MU via an NLoS path that avoids the blocker. The WGBS comprises M antenna elements arranged as a uniform linear array. In addition, each IRS panel has an array of N antenna elements in a uniform planar layout. The IRS controller tunes the phase shifts of the IRS antenna array based on the phase shift vector chosen by the WGBS.
To exchange control information between the WGBS and an IRS relay, a dedicated communication channel is operated between them. Following the IRS-assisted WiGig channel models provided in [25,36], the signal delivered to the MU can be mathematically described as:
$$y_r = \mathbf{h}_{rU}^{H}\,\boldsymbol{\Theta}_r^{i}\,\mathbf{H}_{Br}\,\mathbf{f}_j\,x + n, \tag{1}$$
where $1 \le r \le R$, $1 \le i \le |\Omega_r|$, and $1 \le j \le |\mathcal{F}|$.
Here, $y_r$ and $x$ are the received and transmitted symbols via IRS relay $r$, respectively. The transmission power equals $P = \mathbb{E}[x x^{H}]$, where $(\cdot)^{H}$ is the Hermitian transpose, and $n \sim \mathcal{CN}(0, \sigma_0^2)$ is the noise. $\mathbf{f}_j \in \mathbb{C}^{M \times 1}$ is the $M \times 1$ analog precoder vector of the WGBS (further investigation of digital or hybrid precoder types will be considered as future work), while $\boldsymbol{\Theta}_r^{i} \in \mathbb{C}^{N_r \times N_r}$ is the $N_r \times N_r$ diagonal matrix whose main diagonal is the PS vector of the antenna elements of IRS board $r$. Analog precoders have reduced complexity compared to digital or hybrid precoders. They can be implemented with low-complexity circuits, which makes them suitable for low-cost and low-power IRS systems. In addition, they can implement continuous phase shifts, which can improve the performance of the IRS system. In contrast, digital precoders are limited to discrete phase shifts, which can result in reduced performance. $i$ and $j$ are the indexes of the utilized $\boldsymbol{\Theta}_r$ and $\mathbf{f}$, where $\Omega_r$ and $\mathcal{F}$ are the sets of accessible PS vectors of IRS relay $r$ and the WGBS, respectively. $\mathbf{H}_{Br} \in \mathbb{C}^{N_r \times M}$ is the $N_r \times M$ channel matrix between the WGBS and IRS relay $r$. $\mathbf{h}_{rU} \in \mathbb{C}^{N_r \times 1}$ is the $N_r \times 1$ channel vector between IRS relay $r$ and the MU. $\mathbf{H}_{Br}$ and $\mathbf{h}_{rU}$ can be expressed as follows [25,36]:
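$$\mathbf{H}_{Br} = \sqrt{\frac{N_r M}{K_{Br}}} \sum_{k=1}^{K_{Br}} \eta_k\, \mathbf{a}_r\!\left(\pi_k^{AoA}, o_k^{AoA}\right) \mathbf{g}_B\!\left(\pi_k^{AoD}\right)^{H}, \tag{2}$$
$$\mathbf{h}_{rU} = \sqrt{\frac{N_r}{K_{rU}}} \sum_{k=1}^{K_{rU}} \nu_k\, \mathbf{a}_r\!\left(\theta_k^{AoD}, \phi_k^{AoD}\right), \tag{3}$$
where $K_{Br}$ is the number of paths between the WGBS and IRS relay $r$, and $K_{rU}$ is the number of paths between IRS relay $r$ and the MU. $\eta_k$ and $\nu_k$ are the large-scale fading coefficients [36], distributed as complex Gaussian, i.e., $\mathcal{CN}\!\left(0, 10^{PL_k/10}\right)$, where $PL_k\,[\mathrm{dB}] = -PL(d_0) - 10\alpha \log_{10}(d) - \rho_k$. $PL(d_0)$ is the path loss at a reference distance $d_0 = 1$ m [36], $\alpha$ is the path loss exponent, $d$ is the distance, and $\rho_k \sim \mathcal{CN}\!\left(0, \sigma_{\rho_k}^2\right)$ is the shadowing term of path $k$ in $K_{Br}$ and $K_{rU}$. $\mathbf{a}_r\!\left(\pi_k^{AoA}, o_k^{AoA}\right)$ and $\mathbf{g}_B\!\left(\pi_k^{AoD}\right)$ are the array response vectors of the $k$-th path at IRS relay $r$ and the WGBS, where $\left(\pi_k^{AoA}, o_k^{AoA}\right)$ is the angle of arrival (AoA) and $\pi_k^{AoD}$ is the angle of departure (AoD) [25,36]. $\mathbf{a}_r\!\left(\theta_k^{AoD}, \phi_k^{AoD}\right)$ is the response vector of the $k$-th path at IRS panel $r$, where $\theta_k^{AoD}$ is the AoD azimuth angle and $\phi_k^{AoD}$ is the AoD elevation angle. Mainly, $\mathbf{a}_r(\theta, \phi)$ is mathematically stated as follows [25]:
$$\mathbf{a}_r(\theta, \phi) = \frac{1}{\sqrt{N_r}} \left[1, \ldots, e^{j\frac{2\pi}{\lambda} s \left(y \sin\theta + z \cos\phi\right)}, \ldots\right]^{T}, \tag{4}$$
where $s$ and $\lambda$ refer to the antenna spacing and carrier wavelength, respectively, and $0 \le y, z \le \sqrt{N_r} - 1$. Similarly, $\mathbf{g}_B\!\left(\pi_k^{AoD}\right)$ is given as [25]:
$$\mathbf{g}_B(\theta) = \frac{1}{\sqrt{M}} \left[1, \ldots, e^{j\frac{2\pi}{\lambda} s\, n \sin\theta}, \ldots\right]^{T}, \tag{5}$$
where $0 \le n \le M - 1$.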
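The array response vectors in (4) and (5) can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the half-wavelength spacing, and the 5 mm wavelength (roughly 60 GHz) are assumptions, and the planar array is assumed to be square so that both indices run from 0 to √N_r − 1.

```python
import cmath
import math


def upa_response(n_elems, theta, phi, spacing, wavelength):
    """Response vector a_r(theta, phi) of a square UPA, following (4).

    Assumption: n_elems is a perfect square (sqrt(N_r) x sqrt(N_r) layout).
    """
    side = math.isqrt(n_elems)
    if side * side != n_elems:
        raise ValueError("n_elems must be a perfect square in this sketch")
    k = 2 * math.pi / wavelength
    scale = 1 / math.sqrt(n_elems)
    return [
        scale * cmath.exp(1j * k * spacing * (y * math.sin(theta) + z * math.cos(phi)))
        for y in range(side)
        for z in range(side)
    ]


def ula_response(m_elems, theta, spacing, wavelength):
    """Response vector g_B(theta) of a uniform linear array, following (5)."""
    k = 2 * math.pi / wavelength
    scale = 1 / math.sqrt(m_elems)
    return [scale * cmath.exp(1j * k * spacing * n * math.sin(theta)) for n in range(m_elems)]


def norm_sq(vec):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(v) ** 2 for v in vec)


# Hypothetical example values: 16-element IRS panel, 8-element WGBS array.
a = upa_response(16, theta=0.3, phi=1.1, spacing=0.0025, wavelength=0.005)
g = ula_response(8, theta=-0.4, spacing=0.0025, wavelength=0.005)
```

Because every entry has magnitude 1/√N_r (or 1/√M), both vectors have unit norm, which is why the √(N_r M / K_Br) factor in (2) carries the array gain.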
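The goal of the IRS-assisted WiGig relay selection problem is to select the optimal IRS relay, $r^{*}$, that maximizes the attainable throughput, $\Psi$, measured in bits per second. This can be mathematically represented as
$$r^{*} = \arg\max_{r} \Psi_r = \arg\max_{r}\; BW\,\psi_r\, \frac{T_D}{T_D + \sum_{r=1}^{M_Q} T_{BT}^{r}}, \tag{6a}$$
$$\psi_r = \log_2\!\left(1 + \frac{P \left(\mathbf{h}_{rU}^{H} \boldsymbol{\Phi}_r^{i} \mathbf{H}_{Br} \mathbf{f}_j\right) \left(\mathbf{h}_{rU}^{H} \boldsymbol{\Phi}_r^{i} \mathbf{H}_{Br} \mathbf{f}_j\right)^{H}}{\sigma_0^2}\right), \tag{6b}$$
s.t.
$$i \in \Omega_r, \tag{6c}$$
$$M_Q \le R, \tag{6d}$$
$$j \in \mathcal{F}, \tag{6e}$$
where $BW$ and $T_D$ are the utilized bandwidth and data transmission time, respectively, and $\boldsymbol{\Phi}_r^{i}$ denotes the PS matrix of IRS relay $r$, written as $\boldsymbol{\Theta}_r^{i}$ in (1). $\sum_{r=1}^{M_Q} T_{BT}^{r}$ is the overall beamforming time needed to probe $M_Q \le R$ IRS relays prior to deciding the best one with the highest data rate. $T_{BT}^{r}$ is the beamforming training time of relay $r$, mathematically expressed as:
$$T_{BT}^{r} = |\Omega_r|\,|\mathcal{F}|\,\Delta_{BT}, \tag{7}$$
where $\Delta_{BT}$ is the BT time of a single pair of $\boldsymbol{\Phi}_r^{i}$ and $\mathbf{f}_j$.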
Brute-force searching all of the accessible IRS relays, i.e., $M_Q = R$, and then deciding the best one, i.e., $r^{*}$, with maximum $\psi$, greatly degrades the attainable data rate $\Psi$ because of the huge overall BT time cost accumulated over all $R$ deployed IRS relays. The best data rate is obtained by probing only the single relay with the largest spectral efficiency. Hence, this paper proposes a self-learning solution that formulates the problem as a BMAB bounded by the BT cost of the chosen IRS relay, where a single IRS relay is chosen in every time slot $t$.
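To make the effect of BT overhead in (6a) and (7) concrete, the following sketch compares probing one relay against brute-force probing of all relays. All numbers are hypothetical placeholders rather than the paper's simulation settings (apart from the 2.16 GHz WiGig channel bandwidth), and the effective channel gain is passed in as a single complex scalar standing for the product of channel, PS matrix, and precoder in (6b).

```python
import math


def spectral_efficiency(tx_power, effective_gain, noise_power):
    """SE in bit/s/Hz as in (6b); effective_gain is the scalar h^H * Phi * H * f."""
    return math.log2(1 + tx_power * abs(effective_gain) ** 2 / noise_power)


def bt_time(num_irs_ps, num_bs_beams, delta_bt):
    """Beamforming training time of one relay, as in (7)."""
    return num_irs_ps * num_bs_beams * delta_bt


def throughput(bandwidth_hz, se, t_data, total_bt_time):
    """Attainable throughput in bit/s, as in (6a)."""
    return bandwidth_hz * se * t_data / (t_data + total_bt_time)


# Hypothetical values: 64 IRS PS vectors, 16 WGBS beams, 1 us per beam pair,
# 1 ms of data transmission, SE of 5 bit/s/Hz, 20 deployed relays.
per_relay_bt = bt_time(64, 16, 1e-6)
single = throughput(2.16e9, 5.0, 1e-3, per_relay_bt)
brute = throughput(2.16e9, 5.0, 1e-3, 20 * per_relay_bt)
print(f"single relay: {single / 1e9:.2f} Gbit/s, brute force: {brute / 1e9:.2f} Gbit/s")
```

Even if brute force finds a relay with a somewhat higher SE, the denominator in (6a) grows with every probed relay, which is the degradation described above.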

4. Envisioned BTS-IRS Algorithm

Herein, we briefly discuss the concept behind MABs, including BMABs, and then re-formulate (6) as a BMAB game. Then, the BTS-IRS scheme with efficient time cost is presented to deal with the discussed optimization problem.

4.1. MAB Concept

MAB is a well-known problem in reinforcement learning in which a decision-maker must choose which arm of a $K$-armed bandit to pull at each round in order to maximize the cumulative reward over time. The decision-maker does not know the true expected reward of each arm and must learn it through trial and error. MAB is a promising online/self-learning methodology with a broad range of feasible applications [15]. In the typical MAB setting, a player intends to maximize the long-term gain by playing the arms $k \in \{1, 2, \ldots, K\}$ of a slot machine. The player draws one arm per time $t$, i.e., $k_t$, and collects its payoff, $R_{k_t}$. Initially, the player explores every arm at least once to observe its payoff, and then decides the next arm based on the observed payoffs. Typically, the player handles the exploitation-exploration trade-off, i.e., exploiting the arm with the largest reward discovered so far or exploring new ones [15,17,37,38]. BMABs are similar to MABs, except that playing any arm requires paying a cost to reveal the reward. The way the cost is handled mathematically determines the BMAB type: some types embed the cost term in both the exploration and exploitation terms of the BMAB algorithm, while others only subtract the cost term in the exploration term. Hence, in BMABs, the player attempts to maximize the payoff and minimize the paid cost. Thus, BTS is proposed to handle the IRS relay probing scenario [35,39]. As will be shown later, the BTS solution is sub-optimal, as it intends to minimize its regret, defined as:
$$R(T_{Hr}) = \mathbb{E}\!\left[\sum_{t=1}^{T_{Hr}} \left(\mu^{*}_{R(i,t)} - \mu_{R(i,t)}\right)\right], \tag{8}$$
Over the time horizon $T_{Hr}$, the mean reward of the optimal arm is denoted by $\mu^{*}_{R(i,t)}$, and the mean reward of the arm selected by the BTS scheme is denoted by $\mu_{R(i,t)}$. The expectation $\mathbb{E}$ is taken over the random selection of $R(i,t)$ by the BTS scheme. BTS endeavors to reduce the instantaneous regret at each $t$, yet the cumulative regret still increases over $T_{Hr}$. This leads to suboptimal performance of the BTS algorithm at individual time slots $t$, while it is close to optimal over the extended training period $T_{Hr}$.
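The regret in (8) can be illustrated with a small helper that accumulates the gap between the best arm's mean reward and the mean reward of each chosen arm. This is a sketch under the assumption that the true mean rewards are known and stationary (which is only possible in simulation); the function name and the example values are hypothetical.

```python
def cumulative_regret(mean_rewards, chosen_arms):
    """Running sum of (best mean reward - mean reward of the chosen arm)."""
    best = max(mean_rewards)
    total = 0.0
    history = []
    for arm in chosen_arms:
        total += best - mean_rewards[arm]
        history.append(total)
    return history


# Three arms with hypothetical mean rewards; the policy picks arms 0, 1, 2, 1.
print(cumulative_regret([1.0, 3.0, 2.0], [0, 1, 2, 1]))  # [2.0, 2.0, 3.0, 3.0]
```

The curve never decreases, matching the remark above that cumulative regret keeps growing; a good policy only makes it flatten as the horizon increases.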
The IRS-assisted WiGig relaying described in (6) can be re-formulated as a BMAB by taking into account the BT time cost of the chosen relay. The player in this scenario is the WGBS, whose objective is to obtain the maximal long-term reward, i.e., the attainable $\psi$, within a certain time horizon $T_{Hr}$ by choosing among the $R$ accessible IRS relays (arms) while simultaneously reducing the beamforming time cost of the chosen IRS relay at each $t$. Hence, (6) is re-expressed as a sequential optimization problem:
$$\max_{L(1), \ldots, L(T_{Hr})}\; \frac{BW}{T_{Hr}} \sum_{t=1}^{T_{Hr}} \sum_{r=1}^{R} L_r(t)\, \psi_r(t)\, \frac{T_D}{T_D + T_{BT}^{r}(t)}, \tag{9a}$$
$$\psi_r(t) = \log_2\!\left(1 + \frac{P \left(\mathbf{h}_{r(t)U}^{H} \boldsymbol{\Phi}_{r(t)}^{i} \mathbf{H}_{Br(t)} \mathbf{f}_j\right) \left(\mathbf{h}_{r(t)U}^{H} \boldsymbol{\Phi}_{r(t)}^{i} \mathbf{H}_{Br(t)} \mathbf{f}_j\right)^{H}}{\sigma_0^2}\right), \tag{9b}$$
s.t.
$$T_{Hr} \in (0, \mathbb{Z}^{+}), \tag{9c}$$
$$i \in \Omega_r, \tag{9d}$$
$$j \in \mathcal{F}, \tag{9e}$$
$$T_{BT}^{r}(t) \le \max_{r}\left(T_{BT}^{r}\right), \tag{9f}$$
$$\sum_{r} L_r(t) = 1, \quad 1 \le r \le R, \tag{9g}$$
where $\mathbb{Z}^{+}$ is the set of positive integers. $L_r(t)$ is a link indicator that equals 1 if IRS relay $r$ is chosen to build the WiGig relay connection at time $t$ and 0 otherwise. Constraint (9f) states that the beamforming training time cost of the chosen IRS relay $r$ is bounded by the maximum $T_{BT}$ among all $R$ available IRS panels. Constraint (9g) ensures that a single IRS relay is pulled at time $t$. Equation (9) involves sequentially selecting an IRS relay over the bandit horizon $T_{Hr}$, where the chosen relay should have the highest spectral efficiency while satisfying the beamforming training time cost bound. Transforming (6a) into (9a) decreases the beamforming training time cost of IRS relay selection by choosing only one IRS board at a time, which mimics the optimal process and leads to an observable increase in the achievable data rate.
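The objective in (9a) can be evaluated for any given selection sequence with a short helper. The sketch below assumes, purely for illustration, that each relay has a fixed SE and a fixed BT time over the horizon; the function name and the example values are hypothetical.

```python
def average_throughput(bandwidth_hz, t_data, choices, se_per_relay, bt_per_relay):
    """Time-averaged throughput of (9a) for a sequence of chosen relay indices.

    Picking exactly one relay index per slot satisfies constraint (9g).
    """
    total = 0.0
    for r in choices:
        total += se_per_relay[r] * t_data / (t_data + bt_per_relay[r])
    return bandwidth_hz * total / len(choices)


# Relay 0 has the higher SE but a much longer BT time than relay 1.
se = [6.0, 5.0]
bt = [4e-3, 1e-3]
always_0 = average_throughput(2.16e9, 1e-3, [0] * 10, se, bt)
always_1 = average_throughput(2.16e9, 1e-3, [1] * 10, se, bt)
print(always_0 < always_1)  # True: the cheaper relay wins despite its lower SE
```

This is the trade-off the budgeted formulation targets: the relay with the highest SE is not necessarily the one with the highest throughput once its BT time is charged.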

4.2. Envisioned BTS Algorithm

BTS is a variant of the classic TS algorithm used for solving BMAB problems [38]. It is a mathematically rigorous method for balancing the exploration and exploitation of different arms (IRS relays in our case) in order to maximize the expected reward (spectral efficiency) under a given budget (BT time). It can be modeled as a Bayesian optimization problem with the following components:
  • Prior distributions: each arm is associated with a prior distribution over its reward. These distributions are updated as more observations are made;
  • Posterior distributions: the posterior distribution over the reward of each arm is obtained by combining the prior with the observed rewards;
  • Sampling: a sample is drawn from the posterior distribution of each arm, and the arm with the highest sample is chosen for the next round;
  • Budget constraint: a budget constraint is imposed, which limits the total number of times that an arm can be pulled over a fixed period.
BTS is mathematically sound and has provable theoretical performance guarantees. In our case, i.e., IRS relay probing, the prior distribution is Gaussian, owing to the additive white Gaussian noise, and the budget is the maximum allowable BT time. The optimal BTS-IRS strategy is to continuously select the optimal arm, i.e., the arm with the maximum ratio of mean payoff to mean cost. Hence, BTS selects that arm as often as possible during the time horizon.
Algorithm 1 details the proposed BTS-IRS. Its inputs are $T_{BT}^{r}$ and $\Omega_r$ for every relay $r \le R$, and $\mathcal{F}$. The output is the IRS relay chosen at trial $t$, $r^{*}(t)$. As an initialization step, $x_r(t)$, the number of draws of IRS panel $r$ up to time $t$, is set to zero. In addition, $\bar{\psi}_r(0) = 0$ and $\bar{T}_{BT}^{r}(0) = 0$, where the former is the mean spectral efficiency/reward of IRS panel $r$ and the latter is its mean time cost. The algorithm then runs over the whole time horizon, selecting an IRS relay at each step using the BTS policy. This is carried out by sampling a spectral efficiency $\theta_i^{r}(t)$ and a BT time cost $\theta_i^{c}(t)$ for each relay $i$ from normal distributions, and then drawing the IRS relay with the largest ratio of spectral efficiency to BT cost, i.e., $r^{*}(t) = \arg\max_{i \in \{1, \ldots, R\}} \theta_i^{r}(t) / \theta_i^{c}(t)$. Afterwards, we observe the spectral efficiency $\psi_{r^{*}}(t)$ and BT cost $T_{BT}^{r^{*}}(t)$ of this specific IRS relay and update the distributions of both spectral efficiency and BT cost as follows:
$$\bar{\psi}_{r^{*}}(t) = \frac{1}{x_{r^{*}}(t)} \sum_{i=1}^{t} L_{r^{*}}(i)\, \psi_{r^{*}}(i), \tag{10}$$
$$\bar{T}_{BT}^{r^{*}}(t) = \frac{1}{x_{r^{*}}(t)} \sum_{i=1}^{t} L_{r^{*}}(i)\, T_{BT}^{r^{*}}(i). \tag{11}$$
Algorithm 1: Proposed BTS-IRS Algorithm
[Algorithm 1 listing: provided as an image in the original article.]
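Since the original listing is not available as text, the following Python sketch shows one way the steps described above could be realized; it is not the authors' exact listing. The assumptions are: rewards and costs are normalized to roughly [0, 1]; each sample is Gaussian around the running mean with variance 1/(x_r + 1), a common choice for Gaussian Thompson sampling; sampled costs are clipped at a small positive `min_cost` so the ratio stays defined; and `pull` is a hypothetical callback that probes a relay and returns its (reward, cost) pair.

```python
import random


def bts_irs(pull, num_relays, horizon, rng=None, min_cost=1e-6):
    """Budgeted Thompson sampling sketch: pick the relay with the best
    sampled reward-to-cost ratio, then update its running means."""
    if rng is None:
        rng = random.Random()
    counts = [0] * num_relays          # x_r(t): draws of each relay
    mean_reward = [0.0] * num_relays   # running mean SE, as in (10)
    mean_cost = [0.0] * num_relays     # running mean BT cost, as in (11)
    choices = []
    for _ in range(horizon):
        best, best_ratio = 0, float("-inf")
        for i in range(num_relays):
            std = (counts[i] + 1) ** -0.5  # assumed posterior spread
            theta_r = rng.gauss(mean_reward[i], std)
            theta_c = max(rng.gauss(mean_cost[i], std), min_cost)
            ratio = theta_r / theta_c
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        reward, cost = pull(best)
        counts[best] += 1
        # Incremental form of the sample means in (10) and (11).
        mean_reward[best] += (reward - mean_reward[best]) / counts[best]
        mean_cost[best] += (cost - mean_cost[best]) / counts[best]
        choices.append(best)
    return choices, counts, mean_reward, mean_cost


# Hypothetical noiseless environment: relay 1 has the best reward/cost ratio.
rewards = [0.9, 0.8, 0.3]
costs = [0.9, 0.2, 0.5]
choices, counts, mean_reward, mean_cost = bts_irs(
    lambda r: (rewards[r], costs[r]), num_relays=3, horizon=3000, rng=random.Random(0)
)
print(counts)
```

In this toy setting relay 0 has the highest reward, yet the policy concentrates its pulls on relay 1 because its ratio of reward to cost is the largest, which is the behavior the budgeted formulation is designed to produce.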

5. Numerical Analysis

Herein, numerical simulations are implemented to validate the efficiency of the proposed BTS-IRS scheme by comparing it with benchmark techniques, including brute-force search, random selection, and the classical TS algorithm. As depicted in Figure 1, our experimental setup considers a coverage region where the WGBS is located on one edge and the MU is positioned on the opposite edge. The direct link between them is blocked. A random number of IRS boards are spread inside the communication area. The target is to find the best IRS panel/relay. In the brute-force search, the entire set of IRS relays is systematically evaluated to identify the relay with the maximum reward, i.e., spectral efficiency. The random approach selects an IRS relay at random. The classical TS method does not include any cost function and only considers the TS values; the relay with the highest TS value is chosen to establish the WiGig relay link, which only improves the distribution of spectral efficiencies. The classical TS method is used as a benchmark for the proposed BTS-IRS, as it is the original algorithm from which BTS is derived. The brute-force search is included because it is a conventional approach that leads to the optimal spectral efficiency, yet at the highest BT time cost, while random selection is included as a simpler algorithm with lower computational requirements. Furthermore, the previous work of [35] is included in the comparison through MAB-CE1, a budgeted UCB algorithm [40] that reflects the cost in both the exploitation and exploration terms, and MAB-CE2, which is derived from an explore-then-commit algorithm [39]. MAB-CE2 first explores the probed IRS relays and then selects the lowest-cost (BT time) IRS relay from a set of highly rewarded IRS relays [35].
The setup considers IRS relays randomly distributed across a simulated area of 2.5 km², unless otherwise specified. The WGBS and MU are located at opposite extremities of the simulated region. For consistency, the number of beam patterns of the WGBS and of each IRS panel equals the respective number of antenna elements, i.e., $|\mathcal{F}| = M$ and $|\Omega_r| = N_r$. The number of beams issued from the WGBS is 16. The IRS panels have different UPA sizes and hence different beam counts, where $M_q$ is uniformly distributed between 60 and 600 antenna elements. Table 1 states other important simulation parameters.
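The panel-size part of this setup can be sketched as follows. The sketch assumes that the panel size is drawn as a uniform integer between 60 and 600, that each panel's number of PS vectors equals its number of elements, and that the WGBS uses 16 beams as stated above; the per-pair BT time `delta_bt` and the function name are hypothetical placeholders.

```python
import random


def draw_setup(num_relays, delta_bt, bs_beams=16, rng=None):
    """Draw random IRS panel sizes and the resulting per-relay BT times (7)."""
    if rng is None:
        rng = random.Random()
    sizes = [rng.randint(60, 600) for _ in range(num_relays)]
    bt_times = [n * bs_beams * delta_bt for n in sizes]
    return sizes, bt_times


sizes, bt_times = draw_setup(num_relays=20, delta_bt=1e-8, rng=random.Random(1))
print(min(bt_times), max(bt_times))
```

Because the BT time scales linearly with the panel size, larger panels are more expensive to probe, which gives the budgeted bandit a genuine reward-versus-cost trade-off across relays.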

5.1. Performance versus Different Numbers of IRS Relays

Figure 2 plots the mean spectral efficiency of the compared schemes versus the number of IRS relays at 30 dBm transmitter power. As the number of IRS relays increases, the spectral efficiencies of all IRS relay probing schemes also increase due to the higher probability of finding a good IRS relay. The brute-force scheme has the best performance because it finds the best relay after searching all scattered relays. The random approach delivers the worst performance, as it picks an IRS relay stochastically without regard to spectral efficiency. Classical TS performs better than BTS-IRS, MAB-CE1, and MAB-CE2 due to its continuous selection of the highest-rewarded IRS relay. BTS-IRS performs better than the other cost-effective MAB schemes due to its efficient decision policy. At eight IRS relays, around 95%, 77%, 75%, 61%, and 40% of the brute-force performance is attained using classical TS, BTS-IRS, MAB-CE1, MAB-CE2, and random IRS relay selection, respectively.

5.2. Performance versus Different Values of Transmitter Power

Figure 3 compares the IRS relay probing schemes in terms of mean spectral efficiency versus transmitter power using 20 IRS panels/relays. For all schemes, the spectral efficiency increases with the transmitter power. The brute-force scheme has the best performance due to its policy of probing all accessible IRS relays. The random scheme has the worst performance because of its stochastic probing policy. The classical TS-IRS scheme achieves a better spectral efficiency than the proposed BTS-IRS because it chooses the IRS relay with the highest spectral efficiency without considering any cost. Our proposed BTS-IRS scheme performs better than the previous cost-effective solutions, i.e., MAB-CE1 and MAB-CE2 [35], due to its simple probing policy based on the posterior distributions of the best IRS relay. At a transmitter power of 10 dBm, around 93%, 75%, 40%, 35%, and 10.46% of the brute-force performance is achieved using classical TS, BTS-IRS, MAB-CE1, MAB-CE2, and random IRS relay probing, respectively.
Figure 4 presents the mean throughput of all compared probing methods versus the transmitter power using 20 IRS relays. The brute-force scheme has the worst performance due to the large BT time accumulated from trying all possible relays. Our proposed BTS-IRS achieves throughput comparable to or higher than that of the other schemes due to its efficient handling of BT. When the transmitter power equals 30 dBm, around 16, 15, 12, 9, and 5 times the brute-force value is achieved using BTS-IRS, MAB-CE2, TS, MAB-CE1, and random IRS relay probing, respectively.
Figure 5 illustrates the mean BT time cost, in units of $10^{-5}$ s, of the compared IRS relay probing schemes versus the transmitter power at a fixed number of 20 IRS relays. The BT cost remains constant as the transmission power is varied. The brute-force search incurs the highest BT time cost, 0.92 ms, followed by TS-IRS, which ignores the BT cost and cares only about picking the relay with the highest spectral efficiency. Random IRS probing delivers a constant BT cost of 46 μs, lower than that of classical TS, because of its randomized IRS relay probing strategy. The proposed BTS-IRS delivers a comparable BT time cost. Note that MAB-CE2 has the lowest BT consumption due to its strategy of choosing the minimum-cost arm from a set of highly rewarded arms after sufficient exploration. This confirms the effectiveness of solving the problem with MAB methodologies that take the BT cost into account.
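As a quick consistency check on the BT figures quoted above, and assuming (an inference from these numbers, not a statement in the text) that every probed relay costs the same 46 μs of BT time, exhaustively probing all 20 relays reproduces the brute-force value:

```latex
\sum_{r=1}^{R} T_{BT}^{r} \;=\; R \times 46\ \mu\mathrm{s} \;=\; 20 \times 46\ \mu\mathrm{s} \;=\; 920\ \mu\mathrm{s} \;=\; 0.92\ \mathrm{ms}
```

On the $10^{-5}$ s axis of Figure 5, these two values would correspond to 4.6 for a single probed relay and 92 for the brute-force search.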

5.3. Complexity Analysis

The IRS-enabled WiGig relay probing problem has two major sources of complexity: the complexity issued from the Beamforming Training (BT) process and the utilized algorithm computational complexity.
The main source of complexity is BT, which takes a significant amount of time (46 μs) to test one TX/RX beam pair. The brute-force search algorithm has the largest BT complexity, $O(R\,\Omega_r F)$, since it exhaustively evaluates all accessible IRS relays before deciding on the best one. In contrast, algorithms such as classical TS and random selection have much lower BT complexity, i.e., $O(\Omega_r F)$, because only one BT process is required per time slot. The BT complexity of the proposed BTS-IRS is $O(\ln \Omega_r F)$ [41].
The algorithmic computational complexity, i.e., the execution speed of the algorithm, is minimal compared with the complexity of the BT process. BTS-IRS and TS have similar computational complexities of $O(R+1)$, which stems from sampling a 1D Gaussian random variable for each relay and updating the parameters associated with the selected one [41].
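The per-slot work described above (draw Gaussian samples for every relay, then update only the chosen one) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes rewards and costs are normalized to roughly [0, 1], uses a N(mean, 1/(n+1)) posterior for both quantities, and clips sampled costs at a small positive floor before taking the ratio. The class name `BudgetedGaussianTS`, the floor value, and the toy environment are all hypothetical.

```python
import math
import random


class BudgetedGaussianTS:
    """Illustrative budgeted Thompson sampling with 1D Gaussian posteriors."""

    def __init__(self, num_arms: int, seed: int = 0):
        self.rng = random.Random(seed)
        self.counts = [0] * num_arms          # number of probes per relay
        self.reward_mean = [0.0] * num_arms   # empirical mean reward per relay
        self.cost_mean = [0.0] * num_arms     # empirical mean cost per relay

    def select(self) -> int:
        """Sample reward and cost per arm and return the arm with the highest ratio."""
        best_arm, best_ratio = 0, -math.inf
        for arm, n in enumerate(self.counts):
            std = 1.0 / math.sqrt(n + 1)  # assumed posterior spread
            reward_sample = self.rng.gauss(self.reward_mean[arm], std)
            # Clip the sampled cost so the ratio stays well defined (assumption).
            cost_sample = max(self.rng.gauss(self.cost_mean[arm], std), 1e-6)
            ratio = reward_sample / cost_sample
            if ratio > best_ratio:
                best_arm, best_ratio = arm, ratio
        return best_arm

    def update(self, arm: int, reward: float, cost: float) -> None:
        """Incrementally update the running means of the probed arm only."""
        self.counts[arm] += 1
        n = self.counts[arm]
        self.reward_mean[arm] += (reward - self.reward_mean[arm]) / n
        self.cost_mean[arm] += (cost - self.cost_mean[arm]) / n


# Toy environment: arm 1 has the best reward-to-cost ratio (0.8 / 0.3).
true_reward = [0.9, 0.8, 0.3]
true_cost = [0.9, 0.3, 0.3]

agent = BudgetedGaussianTS(num_arms=3, seed=1)
env_rng = random.Random(2)
for _ in range(3000):
    arm = agent.select()
    reward = true_reward[arm] + env_rng.gauss(0.0, 0.05)
    cost = true_cost[arm] + env_rng.gauss(0.0, 0.05)
    agent.update(arm, reward, cost)

best_arm = max(range(3), key=lambda a: agent.counts[a])
print("probes per arm:", agent.counts, "most probed arm:", best_arm)
```

Each call to `select` draws a constant number of Gaussian samples per relay and each call to `update` touches a single relay, which is the kind of per-slot cost that the $O(R+1)$ figure refers to. In this toy setting the arm with the highest spectral efficiency (arm 0) is not the one that gets probed most often, because its cost is also the highest.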

6. Conclusions

This paper addressed the problem of IRS-enabled WiGig relay probing by formulating it as a budgeted MAB problem. The main target is to identify the best IRS relay, i.e., the one with the highest data rate and the lowest BT time cost. Hence, we proposed the BTS-IRS algorithm, a variant of the theoretically guaranteed TS, in which the optimal IRS relay is selected via the maximal ratio of spectral efficiency to BT time cost. BTS-IRS delivered higher mean throughput than the classical TS, brute-force, and random IRS probing schemes. It also delivered BT time consumption comparable to that of MAB-CE1 and MAB-CE2. This paper paves the way for using advanced MAB schemes to handle various problems in IRS-enabled wireless communications effectively.

Author Contributions

Conceptualization, S.H., K.H. and E.M.M.; methodology, S.H.; software, S.H. and E.M.M.; validation, K.H., E.T. and E.M.M.; formal analysis, S.H. and E.M.M.; investigation, K.H. and E.T.; resources, E.M.M.; data curation, E.T.; writing—original draft preparation, S.H.; writing—review and editing, S.H., K.H. and E.T.; visualization, E.M.M.; supervision, K.H. and E.T.; project administration, K.H.; funding acquisition, S.H., K.H. and E.M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by JSPS KAKENHI Grant Nos. JP21K14162, JP22H03649, JP19H04067, and JP20H05967, Japan. It was also co-funded by Prince Sattam bin Abdulaziz University, Project No. (PSAU/2023/R/1444), KSA.

Data Availability Statement

Not applicable.

Acknowledgments

This work was supported by JSPS KAKENHI Grant Nos. JP21K14162, JP22H03649, JP19H04067, and JP20H05967 Japan. It is also supported via funding from Prince Sattam bin Abdulaziz University Project No. (PSAU/2023/R/1444) KSA.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Uwaechia, A.N.; Mahyuddin, N.M. A Comprehensive Survey on Millimeter Wave Communications for Fifth-Generation Wireless Networks: Feasibility and Challenges. IEEE Access 2020, 8, 62367–62414.
  2. Hong, W.; Jiang, Z.H.; Yu, C.; Hou, D.; Wang, H.; Guo, C.; Hu, Y.; Kuai, L.; Yu, Y.; Jiang, Z.; et al. The Role of Millimeter-Wave Technologies in 5G/6G Wireless Communications. IEEE J. Microw. 2021, 1, 101–122.
  3. Liu, Y.; Liu, X.; Mu, X.; Hou, T.; Xu, J.; Di Renzo, M.; Al-Dhahir, N. Reconfigurable Intelligent Surfaces: Principles and Opportunities. IEEE Commun. Surv. Tutor. 2021, 23, 1546–1577.
  4. Sejan, M.A.S.; Rahman, M.H.; Shin, B.-S.; Oh, J.-H.; You, Y.-H.; Song, H.-K. Machine Learning for Intelligent-Reflecting-Surface-Based Wireless Communication towards 6G: A Review. Sensors 2022, 22, 5405.
  5. Björnson, E.; Özdogan, Ö.; Larsson, E.G. Intelligent Reflecting Surface Versus Decode-and-Forward: How Large Surfaces are Needed to Beat Relaying? IEEE Wirel. Commun. Lett. 2020, 9, 244–248.
  6. Almohamad, A.; Hasna, M.; Zorba, N.; Khattab, T. Performance of THz Communications Over Cascaded RISs: A Practical Solution to the Over-Determined Formulation. IEEE Commun. Lett. 2022, 26, 291–295.
  7. Hashima, S.; Hatano, K.; Mohamed, E.M. Advanced MAB Schemes for WiGig-Aided Aerial Mounted RIS Wireless Networks. In Proceedings of the IEEE Consumer Communications & Networking Conference, Las Vegas, NV, USA, 8–11 January 2023.
  8. Chen, Y.; Wang, Y.; Jiao, L. Robust Transmission for Reconfigurable Intelligent Surface Aided Millimeter Wave Vehicular Communications With Statistical CSI. IEEE Trans. Wirel. Commun. 2022, 21, 928–944.
  9. Luo, R.; Ni, W.; Tian, H.; Cheng, J. Federated Deep Reinforcement Learning for RIS-Assisted Indoor Multi-Robot Communication Systems. IEEE Trans. Veh. Technol. 2022, 71, 12321–12326.
  10. Schroeder, R.; He, J.; Brante, G.; Juntti, M. Two-Stage Channel Estimation for Hybrid RIS Assisted MIMO Systems. IEEE Trans. Commun. 2022, 70, 4793–4806.
  11. He, J.; Wymeersch, H.; Juntti, M. Channel Estimation for RIS-Aided mmWave MIMO Systems via Atomic Norm Minimization. IEEE Trans. Wirel. Commun. 2021, 20, 5786–5797.
  12. Yang, F.; Wang, J.-B.; Zhang, H.; Lin, M.; Cheng, J. Multi-IRS-Assisted mmWave MIMO Communication Using Twin-Timescale Channel State Information. IEEE Trans. Commun. 2022, 70, 6370–6384.
  13. Nguyen, T.V.; Truong, T.P.; Nguyen, T.M.T.; Noh, W.; Cho, S. Achievable Rate Analysis of Two-Hop Interference Channel with Coordinated IRS Relay. IEEE Trans. Wirel. Commun. 2022, 21, 7055–7071.
  14. Yuan, Y.; Wu, D.; Huang, Y.; Chih-Lin, I. Reconfigurable Intelligent Surface Relay: Lessons of the Past and Strategies for Its Success. IEEE Commun. Mag. 2022, 60, 117–123.
  15. Xia, Y.; Qin, T.; Weidong, M.; Nenghai, Y.; Liu, T. Budgeted Multi-Armed Bandits with Multiple Plays. In Proceedings of the 25th International Joint Conference on Artificial Intelligence IJCAI, New York, NY, USA, 9–15 July 2016.
  16. Agrawal, S.; Goyal, N. Analysis of Thompson Sampling for the Multi-armed Bandit Problem. In Proceedings of the 25th Annual Conference on Learning Theory, Edinburgh, UK, 25–27 June 2012.
  17. Audibert, J.; Munos, R.; Szepesvari, C. Exploration-exploitation tradeoff using variance estimates in multi-armed bandits. Theor. Comput. Sci. 2009, 410, 1876–1902.
  18. Lin, Z.; Niu, H.; An, K.; Wang, Y.; Zheng, G.; Chatzinotas, S.; Hu, Y. Refracting RIS-Aided Hybrid Satellite-Terrestrial Relay Networks: Joint Beamforming Design and Optimization. IEEE Trans. Aerosp. Electron. Syst. 2022, 58, 3717–3724.
  19. Lin, Z.; An, K.; Niu, H.; Hu, Y.; Chatzinotas, S.; Zheng, G.; Wang, J. SLNR-based Secure Energy Efficient Beamforming in Multibeam Satellite Systems. IEEE Trans. Aerosp. Electron. Syst. 2022. early access.
  20. Lin, Z.; Lin, M.; Wang, J.-B.; de Cola, T.; Wang, J. Joint Beamforming and Power Allocation for Satellite-Terrestrial Integrated Networks with Non-Orthogonal Multiple Access. IEEE J. Sel. Top. Signal Process. 2019, 13, 657–670.
  21. Niu, H.; Lin, Z.; Chu, Z.; Zhu, Z.; Xiao, P.; Nguyen, H.X.; Lee, I.; Al-Dhahir, N. Joint Beamforming Design for Secure RIS-Assisted IoT Networks. IEEE Internet Things J. 2023, 10, 1628–1641.
  22. Pei, X.; Yin, H.; Tan, L.; Cao, L.; Li, Z.; Wang, K.; Zhang, K.; Bjornson, E. RIS-Aided Wireless Communications: Prototyping, Adaptive Beamforming, and Indoor/Outdoor Field Trials. IEEE Trans. Commun. 2021, 69, 8627–8640.
  23. Tang, W.; Chen, M.Z.; Dai, J.Y.; Zeng, Y.; Zhao, X.; Jin, S.; Cheng, Q.; Cui, T.J. Wireless Communications with Programmable Metasurface: New Paradigms, Opportunities, and Challenges on Transceiver Design. IEEE Wirel. Commun. 2020, 27, 180–187.
  24. Cheng, Q.; Li, L.; Zhao, M.-M.; Zhao, M.-J. Cooperative Localization for Reconfigurable Intelligent Surface-Aided mmWave Systems. In Proceedings of the 2022 IEEE Wireless Communications and Networking Conference (WCNC), Austin, TX, USA, 10–13 April 2022; pp. 1051–1056.
  25. Mohamed, E.M.; Hashima, S.; Anjum, N.; Hatano, K.; Shafai, W.E.; Elhalawany, B.M. Reconfigurable intelligent surface-aided millimeter wave communications utilizing two-phase minimax optimal stochastic strategy bandit. IET Commun. 2022, 16, 2200–2207.
  26. Taha, A.; Alrabeiah, M.; Alkhateeb, A. Enabling Large Intelligent Surfaces with Compressive Sensing and Deep Learning. IEEE Access 2021, 9, 44304–44321.
  27. Jia, C.; Gao, H.; Chen, N.; He, Y. Machine learning empowered beam management for intelligent reflecting surface assisted MmWave networks. China Commun. 2020, 17, 100–114.
  28. Abdullah, Z.; Chen, G.; Lambotharan, S.; Chambers, J.A. A Hybrid Relay and Intelligent Reflecting Surface Network and Its Ergodic Performance Analysis. IEEE Wirel. Commun. Lett. 2020, 9, 1653–1657.
  29. Torkzaban, N.; Khojastepour, M.A.A. Shaping mmWave Wireless Channel via Multi-Beam Design using Reconfigurable Intelligent Surfaces. In Proceedings of the 2021 IEEE Globecom Workshops (GC Wkshps), Madrid, Spain, 7–11 December 2021; pp. 1–6.
  30. Zhao, D.; Lu, H.; Wang, Y.; Sun, H. Joint Passive Beamforming and User Association Optimization for IRS-assisted mmWave Systems. In Proceedings of the GLOBECOM 2020—2020 IEEE Global Communications Conference, Taipei, Taiwan, 7–11 December 2020; pp. 1–6.
  31. Wang, P.; Fang, J.; Zhang, W.; Li, H. Fast Beam Training and Alignment for IRS-Assisted Millimeter Wave/Terahertz Systems. IEEE Trans. Wirel. Commun. 2022, 21, 2710–2724.
  32. Zhang, Y.; Di, B.; Zhang, H.; Lin, J.; Xu, C.; Zhang, D.; Li, Y.; Song, L. Beyond Cell-Free MIMO: Energy Efficient Reconfigurable Intelligent Surface Aided Cell-Free MIMO Communications. IEEE Trans. Cogn. Commun. Netw. 2021, 7, 412–426.
  33. Hao, W.; Li, J.; Sun, G.; Zeng, M.; Dobre, O.A. Securing Reconfigurable Intelligent Surface-Aided Cell-Free Networks. IEEE Trans. Inf. Forensics Secur. 2022, 17, 3720–3733.
  34. Mensi, N.; Rawat, D.B. On the Performance of Partial RIS Selection vs. Partial Relay Selection for Vehicular Communications. IEEE Trans. Veh. Technol. 2022, 71, 9475–9489.
  35. Mohamed, E.M.; Hashima, S.; Hatano, K.; Fouda, M.M. Cost-Effective MAB Approaches for Reconfigurable Intelligent Surface Aided Millimeter Wave Relaying. IEEE Access 2022, 10, 81642–81653.
  36. Guo, X.; Chen, Y.; Wang, Y. Learning-Based Robust and Secure Transmission for Reconfigurable Intelligent Surface Aided Millimeter Wave UAV Communications. IEEE Wirel. Commun. Lett. 2021, 10, 1795–1799.
  37. Francisco-Valencia, I.; Marcial-Romero, J.R.; Valdovinos, R.M. A comparison between UCB and UCB-Tuned as selection policies in GGP. J. Intell. Fuzzy Syst. 2019, 36, 5073–5079.
  38. Agrawal, S.; Goyal, N. Near-Optimal Regret Bounds for Thompson Sampling. J. ACM 2017, 64, 30.
  39. Sinha, D.; Sankararama, K.A.; Kazerouni, A.; Avadhanula, V. Multi-armed Bandits with Cost Subsidy. In Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, San Diego, CA, USA, 13–15 April 2021.
  40. Ding, W.; Qin, T.; Zhang, X.; Liu, T. Multi-Armed Bandit with Budget Constraint and Variable Costs. In Proceedings of the AAAI Conference on Artificial Intelligence, Bellevue, WA, USA, 14–18 July 2013.
  41. Xia, Y.; Li, H.; Qin, T.; Yu, N.; Liu, T. Thompson Sampling for Budgeted Multi-Armed Bandits. In Proceedings of the International Joint Conference on Artificial Intelligence, Buenos Aires, Argentina, 25–31 July 2015.
Figure 1. IRS-enabled WiGig Relaying.
Figure 2. Mean spectral efficiency versus IRS relays [35].
Figure 3. Mean spectral efficiency versus transmitter power [35].
Figure 4. Mean throughput versus transmitter power [35].
Figure 5. Mean BT time cost versus transmitter power [35].
Table 1. Simulation parameters.
Parameter | Value
$P_t$ | 30 dBm
$BW$ | 2.16 GHz
$\sigma_0^2$ | −114 dBm
$K_{Br}$ | 5
$K_{ru}$ | 5
$M$ | 16
$|F|$ | 16
$T_{Hr}$ | 1000
$T_D$ | 20 ms
$\tau_{BT}$ | 46 μs
$PL(d_0)$ | 61 dB
