Machine Learning Strategies for Reconfigurable Intelligent Surface-Assisted Communication Systems—A Review

Ibarra-Hernández, Roilhi F.; Castillo-Soria, Francisco R.; Gutiérrez, Carlos A.; García-Barrientos, Abel; Vásquez-Toledo, Luis Alberto; Del-Puerto-Flores, J. Alberto

doi:10.3390/fi16050173

¹ Faculty of Science, Autonomous University of San Luis Potosí, Av. Chapultepec 1570, Privadas del Pedregal, San Luis Potosí 78295, Mexico

² Electrical Engineering Department, Metropolitan Autonomous University Iztapalapa, Av. San Rafael Atlixco 186, Leyes de Reforma 1ra Secc, Iztapalapa, Mexico City 09340, Mexico

³ Facultad de Ingeniería, Universidad Panamericana, Álvaro del Portillo 49, Zapopan 45010, Mexico

* Authors to whom correspondence should be addressed.

Future Internet 2024, 16(5), 173; https://doi.org/10.3390/fi16050173

Submission received: 22 April 2024 / Revised: 11 May 2024 / Accepted: 14 May 2024 / Published: 17 May 2024

(This article belongs to the Special Issue 6G Wireless Communication Systems: Applications, Opportunities and Challenges, Volume III)

Abstract

Machine learning (ML) algorithms have been widely used to improve the performance of telecommunications systems, including reconfigurable intelligent surface (RIS)-assisted wireless communication systems. The RIS can be considered a key part of the backbone of sixth-generation (6G) communication mainly due to its electromagnetic properties for controlling the propagation of the signals in the wireless channel. The ML-optimized (RIS)-assisted wireless communication systems can be an effective alternative to mitigate the degradation suffered by the signal in the wireless channel, providing significant advantages in the system’s performance. However, the variety of approaches, system configurations, and channel conditions make it difficult to determine the best technique or group of techniques for effectively implementing an optimal solution. This paper presents a comprehensive review of the reported frameworks in the literature that apply ML and RISs to improve the overall performance of the wireless communication system. This paper compares the ML strategies that can be used to address the RIS-assisted system design. The systems are classified according to the ML method, the databases used, the implementation complexity, and the reported performance gains. Finally, we shed light on the challenges and opportunities in designing and implementing future RIS-assisted wireless communication systems based on ML strategies.

Keywords:

reconfigurable intelligent surface (RIS); machine learning; deep learning; wireless communication systems

1. Introduction

Efficient design strategies for wireless communication systems are essential for the fifth and sixth generations (5G and 6G) of communication networks. Meeting the service requirements of 5G/6G networks, such as high data traffic, throughput, latency, availability, energy efficiency, and cost-efficiency, makes the efficient implementation of applications such as vehicular communication, Industry 4.0, and the Internet of Things (IoT) possible [1,2]. In contrast to 4G networks, 5G offers native support for application-oriented communications between machines [3]. More recently, research has focused on 6G wireless communication systems, which add higher reliability and ultra-high data rates. Their design combines simultaneous communication and sensing to detect the presence of objects and identify their shape, location, and speed [4,5,6]. Moreover, 6G networks will be designed to be self-protected and self-planned, and to self-manage their resources despite energy constraints [7]. To achieve these goals, new transmission technologies have been proposed; among them, reconfigurable intelligent surfaces (RISs) are considered one of the key technologies in 6G communication systems. RIS-assisted wireless systems can tune the wireless propagation environment, creating a smart radio environment [8]. An RIS is also known as an intelligent reflecting surface (IRS) or a large intelligent surface (LIS). This technology is a two-dimensional surface comprising controllable elements built from low-cost passive components. The most important property of an RIS is its capability to dynamically modify the wireless communication environment [9]. Owing to these properties, RISs are attracting interest for a variety of applications such as power transfer, physical layer security [10,11], unmanned aerial vehicle communication [12,13], IoT and sensor networks [14], non-orthogonal multiple access (NOMA) [15,16], and spatial modulation (SM) [16,17], among others.
In 6G communications, the RIS has become a leading technology for mmWave communications [18,19]. On the other hand, machine learning (ML) is a branch of artificial intelligence (AI) that focuses on extracting knowledge from data (learning) to find patterns from which rules or methods for making predictions are derived, enabling interactive and smart decisions and efficient parameter estimation. ML has been successfully applied to computer vision, signal processing, and language processing tasks. Due to its capabilities, ML is considered a powerful technique to assess wireless network performance. This can be done by learning, in almost real time, the relevant features of a physical communication environment [20]. The intersection of ML and wireless communication systems has become a prolific field of research and an alternative for the design of future-generation networks [21].

Next-generation networks will have higher levels of complexity due to their higher data rates, self-configuration, and self-management capabilities. In this scenario, ML-based wireless communications can be designed without relying on complex mathematical models [22]. Recently, the research on ML methods applied to wireless systems has gained notable attention. Learning algorithms enhance the performance, reliability, and adaptability of networks, where predictions lead to better decisions in real-time systems [23]. Most surveys and tutorials in the state of the art focus on RIS fundamentals and operating principles, whereas other surveys list ML applications for RIS-aided networks [24,25,26,27,28,29,30]. In [24], the authors focused their review on DL-based schemes to enhance RIS communications. They propose a taxonomy of ML RIS systems to summarize the algorithms applied, such as supervised learning (SL), unsupervised learning (UL), federated learning (FL), and reinforcement learning (RL). The paper states the ML model architecture, the major contributions, and remarks for each reviewed research framework. The authors in [25,26] survey current ML techniques used to enhance RIS-aided communications. Their research framework compares the ML applications on RISs by the type of algorithm and model architecture used, the type of system model according to the inputs and outputs provided, single- and multiple-user scenarios, and the goal of each paper examined. The study in [27] presents a signal processing-based review of research frameworks that merge AI and RIS. The authors divide the study of AI-RIS into different scenarios, such as environmental sensing, channel acquisition, beamforming design, and resource scheduling. The literature review covers the type of AI algorithm employed and the application scenario. The paper by Zhou et al. [28] focuses on the optimization approaches for RIS-aided wireless networks. Among the described techniques, the research lists several ML algorithms for RIS wireless communications, introducing the novel concepts of graph learning, transfer learning, and hierarchical learning. This research summarizes the application of ML to RIS communications, dividing it into different solutions such as the maximization of sum rate/capacity, energy efficiency, phase-shift resolutions, scenarios, channel settings, and the channel state information (CSI) dissemination method. Some constraints on the signal-to-interference-plus-noise ratio (SINR), the RIS phase, and computational resources are also stated. The authors also provide the drawbacks, difficulties, and application scenarios of each reviewed framework [31].

In contrast to these previous works, this paper focuses on novel road maps for implementing AI and ML applications in RIS-assisted wireless communication systems and on the potential of RISs to enhance the capacity and quality of communication links. It shows how the convergence of ML and RIS can unleash the potential of both technologies towards an optimal design of future wireless communication systems. The main contributions of this paper are summarized as follows:

We provide an updated list of the research frameworks that apply ML methods, DL architectures, and the underlying learning schemes for RIS-assisted communications.
We present a list of data sources for training the ML algorithms that enhance the performance of the RIS-assisted communication system. We highlight research works that provide the source code for testing and replicating the reviewed approaches.
We present an overview of future challenges and opportunities in ML applications for RIS-assisted wireless communication systems, and we provide new directions on ML for designing RIS-assisted communication systems for engineers and researchers.

The rest of the paper is organized as follows. Section 2 states the basis of the RIS technology. Section 3 describes the ML frameworks existing in the literature. Section 4 describes the reviewed papers related to the applications of ML in RIS-assisted wireless communication systems. Based on the reviewed papers, challenges and future trends are discussed in Section 5. Finally, the conclusions are shown in Section 6.

2. Foundations on RISs

This section introduces the fundamental concepts of RISs, such as their hardware architecture and operating principles. It also provides a basic description of an RIS-assisted communication system, including the end-to-end configuration and the channel model.

2.1. RIS Hardware Architecture

An RIS is a planar array with electromagnetic properties designed to control the signal propagation. An RIS is also considered a metasurface because it is composed of a metamaterial, which is digitally programmable [32]. Figure 1 shows the basic hardware architecture of an RIS; on the left, we can see the metasurface, which is a planar array of units called reflective elements or meta-atoms [33]. The properties of the RIS elements (i.e., geometrical shape, size, dimension, orientation, and arrangement) modify the electromagnetic (EM) wave response. Changing the meta-atom reflection coefficient allows the wireless channel to be reconfigured dynamically in real time, making this technology suitable for wireless communication applications.

Figure 1 shows the structure of a reflective element. Each element contains an embedded PIN diode, which switches between two states (on and off). The PIN diode is controlled by the biasing DC voltage applied to each element. The equivalent circuits of the two states have a phase-shift difference of $\pi$ radians between them [32]. Therefore, it is possible to obtain different discrete phase shifts by setting different voltages through the RIS elements. Furthermore, the amplitude of each reflected signal can be controlled by setting different resistor values in each element [34].

Figure 1 also shows the three-layer structure of the RIS hardware. The outer layer comprises an array of metallic patches (reflective elements) printed on a dielectric substrate that directly interacts with an incident signal. The next layer, behind the metallic patches, consists of a copper plate, which is used to avoid signal energy leakage. Finally, the inner layer is a control circuit board whose main task is to adjust the amplitude or phase shift of the incident signal in each element. Typical electronic devices that can serve as RIS controllers are field-programmable gate arrays (FPGAs) or microcontrollers.

2.2. RIS-Assisted Wireless Communication Systems

In a realistic wireless communication scenario, the electromagnetic (EM) wave suffers from many effects. As the waves travel through the radio channel, they interact with different objects in the propagation environment, resulting in multipath propagation and power attenuation. In this scenario, the RIS can control both the transmitter and receiver channel links [9,35]. RISs enhance communication between the base station (BS) and the destination by reflecting waves, where the surface elements work together to redirect the signals in a specific direction [36]. Hence, the received signal strength can be enhanced at the final user device. As a result, the probabilistic wireless channel turns into a deterministic model controlled by software [37]. Moreover, RIS devices are effective solutions that provide alternative paths between the transmitter and the receivers when direct communication is impossible [38].

Figure 2 shows the RIS-assisted communication system model. The BS is equipped with $N_{t}$ transmit antennas (Tx), while each of the $K$ destination user devices has $N_{r}$ antennas. The system uses $N$ RISs, each with $N_{s}$ reflective elements. The direct link between the BS and the $k$th user is a matrix defined as $H_{k}^{DP} \in \mathbb{C}^{N_{r} \times N_{t}}$, $k \in \{1, 2, \dots, K\}$. The link from the $n$th RIS to the $k$th user is a matrix defined as $H_{n,k} \in \mathbb{C}^{N_{r} \times N_{s}}$, and the link from the BS to the $n$th RIS is defined as $H_{n}^{SR} \in \mathbb{C}^{N_{s} \times N_{t}}$, $n \in \{1, 2, \dots, N\}$. In this example, each of the links defined in the scheme is assumed to be quasi-static with Rayleigh fading, i.e., its elements are complex Gaussian random variables with zero mean and unit variance, $\mathcal{CN}(0, 1)$. However, more realistic channels can be considered for better system performance prediction.
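As an illustration of the channel assumptions above, the following minimal NumPy sketch draws one realization of each link with i.i.d. $\mathcal{CN}(0, 1)$ entries. The helper name `rayleigh_channel`, the list names, and the numerical sizes are assumptions chosen only for this example; they do not come from the reviewed works.

```python
import numpy as np

def rayleigh_channel(rows, cols, rng):
    """Draw a rows x cols matrix with i.i.d. CN(0, 1) entries."""
    # Real and imaginary parts each have variance 1/2, so E[|h|^2] = 1.
    real = rng.standard_normal((rows, cols))
    imag = rng.standard_normal((rows, cols))
    return (real + 1j * imag) / np.sqrt(2.0)

# Hypothetical sizes, chosen only for illustration.
N_t, N_r, N_s, N, K = 4, 2, 16, 2, 3
rng = np.random.default_rng(0)

# Direct BS -> user links, one N_r x N_t matrix per user k.
H_dp = [rayleigh_channel(N_r, N_t, rng) for _ in range(K)]
# BS -> RIS links, one N_s x N_t matrix per RIS n.
H_sr = [rayleigh_channel(N_s, N_t, rng) for _ in range(N)]
# RIS -> user links, one N_r x N_s matrix per (RIS n, user k) pair.
H_ru = [[rayleigh_channel(N_r, N_s, rng) for _ in range(K)] for _ in range(N)]
```

A quasi-static model would keep each realization fixed over a transmission block and redraw it for the next block.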

2.3. RIS-Assisted Communication System Model

Following the scheme in Figure 2, the transmitter at the BS has $m$ inputs, the encoded bits $a_{k} = \{d_{n}\}_{n=0}^{m}$, where $d_{n} \in \{0, 1\}$. The outputs are the signals $x_{k}$, which come from the modulated bits. The system can use $M$-ary quadrature amplitude modulation (QAM) or any other modulation scheme. The signal received by the $k$th user via the RISs, indexed by $l = 1, \dots, L$ (with $L = N$ in the model of Figure 2), and the direct path is defined as [39]

$$ \begin{aligned} y_{k} = {} & \sqrt{\gamma_{k}} H_{1,k} \Theta_{1} G_{1} H_{1}^{SR} x_{k} + \sqrt{\gamma_{k}} H_{2,k} \Theta_{2} G_{2} H_{2}^{SR} x_{k} + \dots \\ & + \sqrt{\gamma_{k}} H_{l,k} \Theta_{l} G_{l} H_{l}^{SR} x_{k} + \dots + \sqrt{\gamma_{k}} H_{L,k} \Theta_{L} G_{L} H_{L}^{SR} x_{k} + \sqrt{\gamma_{k}} H_{k}^{DP} x_{k} + n_{k}, \end{aligned} \tag{1} $$

where $\Theta_{l}$ is the $l$th matrix of phase shifts, defined as $\Theta_{l} = \mathrm{diag}(e^{j\theta_{l,1}}, e^{j\theta_{l,2}}, \dots, e^{j\theta_{l,N_{s}}})$. In the given expression, $e^{j\theta_{l,i}}$ denotes the phase shift of the $i$th reflecting element of the $l$th RIS, and $n_{k} \in \mathbb{C}^{N_{r} \times 1}$ stands for the noise, whose samples are assumed to be independent and identically distributed (i.i.d.) with $\mathcal{CN}(0, \sigma^{2})$. Moreover, $\gamma_{k}$ in the term $\sqrt{\gamma_{k}}$ corresponds to the signal-to-noise ratio (SNR) at the destination. Expression (1) can also be written as

$$ y_{k} = \sqrt{\gamma_{k}} \left( \sum_{l=1}^{L} H_{l,k} \Theta_{l} G_{l} H_{l}^{SR} + H_{k}^{DP} \right) x_{k} + n_{k}. \tag{2} $$
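To make the structure of Equation (2) concrete, the sketch below evaluates the received signal for one user with NumPy. It is a sketch under stated assumptions: the amplitude matrix $G_{l}$ is taken as diagonal and placed between $\Theta_{l}$ and the BS-to-RIS link so that the matrix dimensions agree, and the function names, sizes, SNR value, and QPSK-like symbols are illustrative choices rather than values from [39].

```python
import numpy as np

def cn(shape, rng, var=1.0):
    """i.i.d. circularly symmetric complex Gaussian samples CN(0, var)."""
    return np.sqrt(var / 2.0) * (rng.standard_normal(shape)
                                 + 1j * rng.standard_normal(shape))

def received_signal(x, H_ru, thetas, gains, H_sr, H_dp, snr, noise):
    """y_k = sqrt(snr) * (sum_l H_ru[l] Theta_l G_l H_sr[l] + H_dp) x + noise."""
    H_eff = H_dp.astype(complex)                   # start from the direct path
    for H_l, theta_l, g_l, H_sr_l in zip(H_ru, thetas, gains, H_sr):
        Theta_l = np.diag(np.exp(1j * theta_l))    # phase-shift matrix
        G_l = np.diag(g_l)                         # amplitude matrix (assumed diagonal)
        H_eff = H_eff + H_l @ Theta_l @ G_l @ H_sr_l
    return np.sqrt(snr) * (H_eff @ x) + noise

# Hypothetical sizes and parameters, for illustration only.
rng = np.random.default_rng(0)
N_t, N_r, N_s, L = 4, 2, 8, 2
H_dp = cn((N_r, N_t), rng)
H_sr = [cn((N_s, N_t), rng) for _ in range(L)]
H_ru = [cn((N_r, N_s), rng) for _ in range(L)]
thetas = [rng.uniform(0.0, 2.0 * np.pi, N_s) for _ in range(L)]
gains = [np.ones(N_s) for _ in range(L)]           # unit reflection amplitude
x = np.array([1, -1, 1j, -1j]) / 2.0               # QPSK-like symbols, unit total power
noise = cn(N_r, rng, var=0.1)
y = received_signal(x, H_ru, thetas, gains, H_sr, H_dp, snr=4.0, noise=noise)
```

Setting every gain to zero removes the reflected paths and leaves only the direct-path term, which is a convenient sanity check.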

Let us consider Equation (1); the received signal $y_{k}$ is directly affected by the channel matrices $H_{l,k}$. These links are difficult to estimate if a passive RIS without signal processing capabilities is considered [40]. Two common reference models for the channel are the Rayleigh [41] and the Rician fading channels. Channel estimation nevertheless remains one of the principal open problems in enhancing RIS-assisted communications. In [42], the authors propose an ON/OFF protocol to estimate the cascaded channel $H_{l,k}$ by dividing the process into stages, where each stage estimates only one column vector associated with one RIS element. More specifically, the cascaded channel is represented by $N_{s}$ columns as

$$ H_{l,k} = [h_{l,k}^{(1)}, h_{l,k}^{(2)}, \dots, h_{l,k}^{(N_{s})}], \tag{3} $$

where $h_{l,k}^{(i)}$ represents a column vector associated with the individual channel corresponding to the $i$th RIS element. At the $i$th stage, only the $i$th RIS element is turned on, while the remaining elements are turned off. The channel estimation procedure is as follows: the direct channel $H_{k}^{DP}$ is estimated by adopting a conventional wireless transmission strategy among users, such as sending pilot signals. Then, the channels $h_{l,k}^{(i)}$ associated with the RIS elements are estimated independently using various methods. However, the channel estimation accuracy can be degraded because only one RIS element reflects the pilot signal to the BS. Another estimation strategy uses the power in all RIS elements but still divides the channel matrix $H_{l,k}$ into columns to improve the estimation accuracy [43].
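The ON/OFF stages described above can be sketched as follows. This is a deliberately simplified, noiseless illustration and not the protocol of [42] itself: it assumes a single scalar pilot, an $N_{r}$-antenna receiver, and a hypothetical callback `observe(mask)` that returns the received pilot when the RIS elements flagged in `mask` are on.

```python
import numpy as np

PILOT = 1.0 + 0.0j   # known pilot symbol (assumed scalar for simplicity)

def on_off_estimate(observe, n_elements, pilot=PILOT):
    """Estimate the direct channel and the per-element cascaded columns.

    `observe(mask)` is a hypothetical callback: it returns the received
    pilot vector when the elements flagged in the boolean `mask` are on.
    """
    all_off = np.zeros(n_elements, dtype=bool)
    h_direct = observe(all_off) / pilot            # stage 0: direct path only
    columns = []
    for i in range(n_elements):
        mask = all_off.copy()
        mask[i] = True                             # stage i: only element i is on
        columns.append(observe(mask) / pilot - h_direct)
    return h_direct, np.stack(columns, axis=1)     # shapes (N_r,), (N_r, N_s)

# Toy ground truth used only to exercise the sketch.
rng = np.random.default_rng(0)
N_r, N_s = 2, 6
h_d_true = rng.standard_normal(N_r) + 1j * rng.standard_normal(N_r)
H_true = rng.standard_normal((N_r, N_s)) + 1j * rng.standard_normal((N_r, N_s))

def noiseless_observe(mask):
    # Received pilot = (direct path + sum of the active cascaded columns) * pilot.
    return (h_d_true + H_true[:, mask].sum(axis=1)) * PILOT

h_d_hat, H_hat = on_off_estimate(noiseless_observe, N_s)
```

With noise added to `observe`, each column estimate relies on the weak reflection of a single element, which is the accuracy limitation noted in the text.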

The matrices $G_{l}$ and $\Theta_{l}$ in Equation (1) can be adjusted to control the amplitude and phase, respectively, of the signal impinging on the RIS elements. One of the most important advantages of RIS-assisted communications is beamforming, which allows the signal reflection to be steered towards a desired direction to improve the reception quality. The main approaches for RIS beamforming are:

Active beamforming: Requires additional active components or circuits to manipulate the amplitude and phase of the RIS-impinging signals. Although the signal adjustments are implemented in real time, the power consumption and system complexity increase.
Passive beamforming: The system relies on exploiting the properties of RISs by software rather than adding additional electronic circuits. These systems are less complex than active beamforming and consume less power.
Hybrid beamforming: Several approaches combine the active and passive designs [38,44].

In practice, the RIS is usually designed to have many reflecting elements. In this case, it is more efficient to implement discrete amplitude/phase-shift levels, which require only a small number of control bits for each element. As expected, the cost of phase-shift control is higher than that of amplitude control, since the optimal phase shifts should align all the signals reflected by the RIS, regardless of their strength, with the signal transmitted directly from the BS [38]. The alignment produces a coherently combined signal that maximizes the signal power at the receiver.
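The phase alignment and the discrete phase-shift levels discussed above can be illustrated with a small sketch. It assumes a single-antenna BS and user, so the direct channel is a scalar and each RIS element contributes one cascaded coefficient; the function names and the choice of 64 elements are assumptions made only for this example.

```python
import numpy as np

def aligned_phases(h_direct, cascaded):
    """Continuous phase shifts that co-phase every reflected path with the direct one."""
    return np.mod(np.angle(h_direct) - np.angle(cascaded), 2.0 * np.pi)

def quantize_phases(theta, bits):
    """Round each phase to the nearest of 2**bits uniformly spaced levels."""
    step = 2.0 * np.pi / (2 ** bits)
    return np.mod(np.round(theta / step) * step, 2.0 * np.pi)

def received_power(h_direct, cascaded, theta):
    """|h_d + sum_i c_i exp(j theta_i)|^2 for a single-antenna link."""
    return np.abs(h_direct + np.sum(cascaded * np.exp(1j * theta))) ** 2

# Toy single-antenna scenario with hypothetical values.
rng = np.random.default_rng(0)
N_s = 64
h_d = (rng.standard_normal() + 1j * rng.standard_normal()) / np.sqrt(2.0)
c = (rng.standard_normal(N_s) + 1j * rng.standard_normal(N_s)) / np.sqrt(2.0)

theta_opt = aligned_phases(h_d, c)
p_opt = received_power(h_d, c, theta_opt)                              # ideal alignment
p_1bit = received_power(h_d, c, quantize_phases(theta_opt, bits=1))    # two levels only
```

With continuous phases, all terms add in phase and the power equals the squared sum of the magnitudes. With one control bit per element, every residual phase error is at most $\pi/2$, so each reflected path still contributes non-negatively along the direct path.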

3. Basics of ML Algorithms

This section explains the basics of ML, research areas, and the algorithms applied to RIS-assisted wireless communications systems.

ML is a research area that comprises many algorithms that can extract rules or knowledge from data patterns. These methods are known as learning algorithms. Figure 3 presents two diagrams that compare the traditional programming and ML paradigms. In traditional programming, a set of variables and input data is transformed according to a set of rules or instructions (each defined by a line of code in a programming language) to produce a desired output. In the ML paradigm, by contrast, we provide the input data and the desired output (in the case of supervised learning) to obtain a set of rules or instructions that predict outputs for novel inputs. Although ML algorithms have been developed since the 1970s, they are experiencing a new era of study thanks to the computational resources currently available. Nowadays, it is possible to train learning algorithms on different devices and use graphics processing units (GPUs) to accelerate the training process. ML addresses problems of automatic improvement over time, inference, and decision-making [45].

3.1. Types of ML Algorithms

Learning algorithms can be categorized in multiple ways. The most common is according to the desired output. When the output is a discrete value or a categorical label, we have a classification problem. On the other hand, it is a regression problem when continuous values are considered. Regarding the availability of the outputs, supervised learning is used when the output values or labels are provided together with the input data. In contrast, when the data are unlabeled, it corresponds to unsupervised learning. Figure 4 illustrates an example of the aforementioned divisions. At the top, we can notice that classification is the process of creating a border between the discrete outputs, labeled as green and red dots. In contrast, the regression process computes a model that predicts continuous outputs, such as a linear function. In supervised learning, we are aware of the labels of the outputs (red and green), while unsupervised learning finds connections or groups in unlabeled data (dots of the same color).

A second classification of ML methods is according to the type of algorithm involved in the learning task. For instance, Figure 5 depicts a Venn diagram containing the different ML areas arising from the learning method employed. This division is a group of nested subsets, where artificial intelligence (AI) is the outer level since it comprises data science and the different areas involved in conducting automated and human-free reasoning tasks to make decisions. ML is a subset of AI because it comprises computational methods and procedures to extract knowledge from data. The third level, called deep learning (DL), emerges from neural networks, which are learning algorithms that extract patterns based on the biological model of the neuron units of the brain. The next level is called reinforcement learning (RL). Rather than taking a dataset as input, it is based on the design of an agent. The agent explores an environment and learns how to obtain rewards from it by trial and error. The next subsections briefly describe each of these areas and introduce the concept of federated learning, which is derived from having independent ML models working on each user device.

3.2. Deep Learning (DL) Algorithms

As mentioned before, DL algorithms are based on neural networks (NNs), which are computational models based on neurons, the basic units of the human brain. A neuron structure is depicted in Figure 6; we define numerical inputs that are multiplied by real coefficients called weights. Then, a term called bias is added to the weighted sum of the inputs. Finally, there is a stage called the activation function, which discriminates or separates the weighted sum of inputs plus the bias to obtain an output. Among the most commonly used activation functions, we find the rectified linear unit (ReLU), the sigmoid function, and the hyperbolic tangent function.
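The neuron described above reduces to a weighted sum, a bias, and an activation function. The following minimal sketch uses NumPy; the input, weight, and bias values are arbitrary numbers chosen only for illustration.

```python
import numpy as np

def relu(z):
    """Rectified linear unit: max(0, z)."""
    return np.maximum(0.0, z)

def sigmoid(z):
    """Logistic sigmoid, mapping any real z to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, activation):
    """Single neuron: activation(weighted sum of the inputs plus bias)."""
    return activation(np.dot(w, x) + b)

# Arbitrary illustrative values; the pre-activation is 0.5 - 2.0 + 0.75 + 0.1 = -0.65.
x = np.array([1.0, 2.0, 3.0])
w = np.array([0.5, -1.0, 0.25])
b = 0.1

out_relu = neuron(x, w, b, relu)        # negative pre-activation is clipped to 0
out_tanh = neuron(x, w, b, np.tanh)     # hyperbolic tangent, in (-1, 1)
out_sigmoid = neuron(x, w, b, sigmoid)  # in (0, 1)
```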

An NN is a group of layers, each comprising a stack of neurons. The simplest case has three components: an input layer, an output layer, and one or more hidden layers. The term deep means that these hidden layers provide some depth to the network in order to learn. The term NN refers to schemes with a single hidden layer. Likewise, if more than one hidden layer is used, the structure is known as a deep neural network (DNN). Figure 7 depicts the aforementioned DNN architecture for the most basic structure. However, more sophisticated NNs, including convolutional neural networks (CNNs) and recurrent neural networks, can be found in the literature. The learning process of a DNN consists of adjusting the weights appropriately to find the minimum of a loss function. This function compares the true (target) output with the one obtained from the NN. In the literature and applications, the most commonly used optimizers are stochastic gradient descent (SGD) [46] and the Adam optimizer [47].
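The idea of adjusting weights to minimize a loss can be shown with mini-batch SGD on a mean squared error loss. For brevity, the sketch below trains a single linear neuron rather than a DNN, so the gradients can be written by hand; the learning rate, batch size, and synthetic data are assumptions made only for this example.

```python
import numpy as np

def sgd_fit(x, y, lr=0.1, steps=500, batch_size=16, seed=0):
    """Fit y ~ w * x + b by mini-batch SGD on the mean squared error."""
    rng = np.random.default_rng(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        idx = rng.integers(0, len(x), size=batch_size)   # random mini-batch
        err = (w * x[idx] + b) - y[idx]                  # prediction error on the batch
        w -= lr * 2.0 * np.mean(err * x[idx])            # gradient of the MSE w.r.t. w
        b -= lr * 2.0 * np.mean(err)                     # gradient of the MSE w.r.t. b
    return w, b

# Synthetic, noiseless data generated from y = 2x + 1.
x = np.linspace(-1.0, 1.0, 200)
y = 2.0 * x + 1.0
w_hat, b_hat = sgd_fit(x, y)
```

In a DNN, the same update rule is applied to every weight, with the gradients obtained by backpropagation instead of a closed-form expression.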

3.3. Reinforcement Learning Algorithms

For ML and DL algorithms, the training process consists of feeding the algorithm with labeled or unlabeled data from measurements or observations previously gathered and stored in a dataset. Then, the ML algorithm learns from patterns found in these data. Reinforcement learning (RL) algorithms do not use the same paradigm since they are focused on training an agent, which has to be capable of accumulating rewards r through actions a while interacting with a given environment. The information about the environment at each step is summarized in a state s. Figure 8 describes this process, which consists of a loop whose outputs are data sequences of the state, the action, the reward, and the next state values.

During training, the agent performs different actions in the environment, which is modeled as a Markov decision process (MDP); each action is penalized or rewarded [48]. The RL agent learns to make smart decisions given a particular state. The function that evaluates the state and decides the best output action is the policy $\pi(s)$. An RL agent aims to collect the maximum sum of rewards $R(\tau)$, which are commonly weighted by a discount factor as follows:

$$ R(\tau) = \sum_{k=0}^{\infty} \gamma^{k} r_{t+k+1}, \tag{4} $$

where $\tau$ represents a sequence of states and actions over time $t$, $r_{t}$ is the reward collected at that time, and $\gamma$ is the discount factor. The objective of RL is to find the optimal policy $\pi^{*}$, which is the one that leads to the best cumulative reward. There are two main types of RL methods to reach it:

Policy-based methods: To learn which action to take given a state, the policy is directly trained.
Value-based methods: A value function is trained to learn the most valuable state, and this value is used to take the action that leads to it. This value function is denoted as $V^{*}$ (state value) or $Q^{*}$ (action value).
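For a finite episode, the discounted sum in Equation (4) can be computed with a short backward recursion. In this sketch, `rewards[k]` plays the role of $r_{t+k+1}$, and the reward values are arbitrary.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * rewards[k], accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g   # G_k = r_k + gamma * G_{k+1}
    return g

# Three unit rewards with gamma = 0.5: 1 + 0.5 + 0.25 = 1.75.
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```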

A common approach in RL is Q-learning, an off-policy and value-based method that uses a Q-table to update the action–value function at each step rather than updating it at the end of each episode. Each entry of the Q-table holds the value of a state–action pair. The “Q” stands for the “Quality” of that action at that state. During training, the agent explores the environment to update the Q-table and determine a better approximation of the optimal policy. Figure 9 shows the Q-learning and deep Q-learning (DQL) approaches, where DQL replaces the Q-table with a DL architecture that outputs the Q-values. In RL, the Bellman equation is the foundation for relating the value of a state or state–action pair to the expected return or cumulative reward. It is essential to understanding and solving reinforcement learning problems. Bellman’s equation is expressed as

$$ V(s) = \max_{a} [R(s, a) + \gamma V(s_{t+1})], \tag{5} $$

where $R(s, a)$ refers to the expected reward given the current state and action, and $V(s_{t+1})$ to the associated value of the next state. Given the Bellman equation, it is possible to obtain the optimal policy function, which returns the best action to take given a state:

$$ \pi^{*}(s) = \arg\max_{a} Q^{*}(s, a). \tag{6} $$
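The per-step Q-table update and the greedy policy of Equation (6) can be illustrated with tabular Q-learning on a toy problem. The five-state chain environment, the uniform random behavior policy, and all hyperparameters below are assumptions introduced only for this sketch.

```python
import numpy as np

N_STATES, N_ACTIONS, GOAL = 5, 2, 4   # toy chain; action 0 = left, 1 = right

def step(state, action):
    """Deterministic toy environment: reward 1 only when the goal is reached."""
    nxt = max(state - 1, 0) if action == 0 else state + 1
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((N_STATES, N_ACTIONS))
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = int(rng.integers(N_ACTIONS))          # uniform random behavior policy
            s_next, r, done = step(s, a)
            target = r + gamma * np.max(Q[s_next])    # Bellman backup; Q[GOAL] stays 0
            Q[s, a] += alpha * (target - Q[s, a])     # update at every step
            s = s_next
            if done:
                break
    return Q

Q = q_learning()
greedy_policy = np.argmax(Q, axis=1)   # pi*(s) = argmax_a Q(s, a)
```

Because Q-learning is off-policy, the table converges towards the optimal values even though the actions are chosen at random; here, the greedy policy moves right in every non-terminal state.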

Among the RL algorithms that build on Q-learning, some commonly used approaches in RIS-aided communication systems are the deep deterministic policy gradient (DDPG) [49], the twin-delayed deep deterministic policy gradient (TD3) [50], and the soft actor–critic (SAC) [51] methods.

In DDPG, the agent learns deterministic policies from action spaces with high dimensions and continuous outputs. It uses two NNs: the actor network and the critic network. Both networks compute the current state’s action predictions and generate an error signal called temporal difference (TD) at each time step. The actor network takes as input the current state and outputs a real value that represents an action chosen from a continuous action space. The function of the critic network is to estimate the Q-value of the current state of the action given by the actor. The training consists of updating the weights of the actor-network values and calculating the gradient of the policy value function.

The TD3 method minimizes the effects of overestimated values and function approximation errors in deep Q-learning applications. It follows the DDPG approach but uses two critic networks. This variant of double Q-learning aims to limit the possible overestimation. The SAC algorithm, in turn, is an off-policy method in which the actor aims to maximize the entropy of the policy while maximizing the reward. SAC typically uses four neural networks: an actor network, which takes the current state as input and outputs a probability distribution over the possible actions; two critic networks, which take state–action pairs as inputs and estimate the Q-value of the expected return, mitigating overestimation and stabilizing training; and a value network, which estimates how good it is to be in a particular state to further improve stability. Deep reinforcement learning (DRL) approaches for communications and networking cover network control, adaptive rate, proactive caching, data offloading, network security and connectivity preservation, traffic routing, resource scheduling, and data collection, among others [52].
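The way two critics limit overestimation can be sketched through the target value used to train them: the smaller of the two next-state Q-value estimates is taken. The function below is a minimal illustration of this clipped double-Q target; the `done` mask, the argument names, and the numerical values are assumptions for the example, and the target networks and policy smoothing of a full TD3 implementation are omitted.

```python
import numpy as np

def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Target y = r + gamma * min(Q1', Q2'), with no bootstrap at terminal states."""
    min_q = np.minimum(q1_next, q2_next)          # pessimistic estimate of the two critics
    return reward + gamma * (1.0 - done) * min_q

# Two illustrative transitions; the second one is terminal.
targets = clipped_double_q_target(
    reward=np.array([1.0, 0.0]),
    done=np.array([0.0, 1.0]),
    q1_next=np.array([2.0, 5.0]),
    q2_next=np.array([3.0, 4.0]),
    gamma=0.9,
)
```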

3.4. Federated Learning (FL)

The common way to apply ML to RIS applications is to train a learning algorithm using the signals that arrive at the RIS as input data, generating a model to enhance transmission performance. Figure 10 depicts a traditional learning scheme for the improvement of an RIS-assisted wireless communication link. Here, we have a centralized model directly computed at the BS, which is trained with data collected from different users within the network. The model requires raw data transmission to a central server, where training is performed. Notably, this scheme does not address the privacy and security concerns that arise from the sensitive nature of the wireless link. Moreover, latency issues and scalability limitations can affect the centralized model. Figure 10 shows that the input settings are the goal to achieve, the parameters for the ML algorithm, and the optimization choices. To evaluate the model, an adequate performance metric has to be chosen according to the selected task.

In contrast to the traditional ML scheme, FL is a decentralized ML paradigm in which mobile devices collaborate to train a global model without sharing their private data [53]. Each user device computes updates on its local model. The RIS relays the parameters learned by the user equipment (UE) to the BS; then, the BS trains a global model using these parameters. The function of the aggregation unit is to update and manage the global model associated with the BS to improve it. Systematic improvement is made by selecting local models with the best conditions, such as proper channel scenarios and relevant training updates. This process is repeated until the global model achieves the desired performance or the system reaches a defined deadline [54]. Figure 11 illustrates the FL-based application of an RIS communication system.
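The local-update and aggregation loop described above can be sketched with a federated averaging rule, in which the global model is the average of the local models weighted by local dataset size. This is one common aggregation choice and is an assumption here, as are the linear least-squares local model, the three synthetic devices, and all hyperparameters; the RIS-assisted parameter transport and the model selection step are not modeled.

```python
import numpy as np

def local_update(weights, x, y, lr=0.1, epochs=5):
    """A few gradient steps of linear least squares on one device's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * x.T @ (x @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(local_weights, n_samples):
    """Weighted mean of the local models, proportional to local dataset size."""
    n = np.asarray(n_samples, dtype=float)
    return np.tensordot(n / n.sum(), np.stack(local_weights), axes=1)

# Synthetic private datasets for three hypothetical devices.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0])
sizes = [20, 30, 50]
devices = []
for n in sizes:
    x = rng.standard_normal((n, 2))
    devices.append((x, x @ w_true))    # raw data never leaves the device

w_global = np.zeros(2)
for _ in range(50):                    # communication rounds
    local = [local_update(w_global, x, y) for x, y in devices]
    w_global = federated_average(local, sizes)   # only parameters are shared
```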

4. ML Applications for RIS-Assisted Communications

This section presents the research frameworks for RIS-assisted wireless communication systems based on ML. Our comparative analysis comprises the type of ML algorithm applied, the network architecture, the type of database used (if applicable), and whether or not the source code is available. Figure 12 shows the principal issues related to the application of ML to RIS-assisted wireless systems. Most reported frameworks are based on channel estimation and beamforming for optimal phase-shift selection. Signal detection approaches can be based on sensing the spectrum, classifying the modulation, or demodulating the transmission signal. Finally, we review the FL methods (Figure 11), where the RIS is used to distribute the parameters coming from the UE.

In this section, we also describe the source databases available to the research community for wireless applications, focusing on RIS, MIMO, and mmWave systems. Finally, we shed light on the main frameworks based on RIS-assisted wireless communication systems as a research problem.

4.1. Resources for Generating Databases to Train ML on Wireless Communications

ML algorithms aim to find useful patterns from data to make automatic decisions. Most of the research papers in wireless communication systems feed the ML algorithm employed by performing Monte Carlo realizations of the wireless channel, following the Rayleigh, Rician, and Saleh–Valenzuela models. Since channel realizations are complex-valued and most neural network implementations operate on real-valued inputs, DL-based approaches handle the channel values by processing their real and imaginary parts separately or by taking their magnitude. Table 1 lists the main datasets for testing ML algorithms applied for the improvement of RIS-aided communications. In [55], a non-publicly available resource that generates data from MIMO mmWave systems is presented. The system includes a ray-tracing simulator and a vehicle traffic simulator in a 5G environment with mobility. DeepMIMO [56] is one of the most popular choices among researchers, since it is an open-source project to generate wireless communication datasets for training and testing ML algorithms. It is based on the ray-tracing simulator Remcom Wireless InSite [57]. Tewes et al. [58] gathered physical channel measurements of various geometric antenna arrangements and RISs to generate a dataset. The data are publicly available on the IEEEDataPort platform [59] and on a GitHub repository [60]. Among the different settings considered are the RIS-to-antenna distances, the reflection angles, and the positions of the RIS when it is rotated. Another publicly available dataset of measurements is presented in [61], where OFDM transceivers are included. The details about the design of the RIS are also provided.
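The two preprocessing options mentioned above (separate real and imaginary parts, or magnitudes) can be illustrated on a synthetic Rayleigh realization. The snippet is a minimal sketch under the assumption of i.i.d. unit-variance complex Gaussian coefficients; the helper names are hypothetical and do not come from any of the cited datasets.

```python
import random
from typing import List


def rayleigh_channel(num_elements: int, rng: random.Random) -> List[complex]:
    """Draw i.i.d. complex Gaussian coefficients whose magnitudes are Rayleigh-distributed."""
    sigma = 0.5 ** 0.5  # per-component standard deviation so that E[|h|^2] = 1
    return [complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(num_elements)]


def to_real_features(h: List[complex]) -> List[float]:
    """Stack the real parts followed by the imaginary parts (length 2N)."""
    return [z.real for z in h] + [z.imag for z in h]


def to_magnitude_features(h: List[complex]) -> List[float]:
    """Keep only the magnitudes (length N); the phase information is discarded."""
    return [abs(z) for z in h]


rng = random.Random(0)           # fixed seed for a reproducible Monte Carlo draw
h = rayleigh_channel(4, rng)     # one realization with four coefficients
x_real_imag = to_real_features(h)
x_magnitude = to_magnitude_features(h)
```

The real/imaginary stacking doubles the input dimension but keeps the phase, which matters when the network output is a phase-shift configuration.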

Other publicly available datasets exist for ML applications in MIMO wireless systems beyond RISs. The applications comprise automated modulation recognition (AMR) to detect the modulation scheme without needing any prior information using DL methods [62,63,64,65]. For mmWave MIMO applications, the datasets available are used for beam selection as the desired output [66,67,68].

4.2. Estimation of CSI

The process of CSI estimation for RIS-assisted communication systems using ML takes pilot signals from the receiver or channel realizations as input data. Then, an ML algorithm is trained offline or online to estimate the output CSI channel matrices. Most of the reported research frameworks are DL-based, since the channels of MU systems are naturally represented as matrices. CSI estimation is one of the most challenging tasks for RIS-aided communications since channel estimation errors are associated with transmission failures [69]. Hence, the phase-shift selection at the RIS can be improved when performing an adequate channel estimation [70].

Figure 13 shows the channel estimation process aided by ML. We notice that a loss function aids the DL model in computing the distance between the actual and estimated output. The reported frameworks for CSI adapt the loss function according to the mean square error (MSE), having some variations such as the minimum mean square error (MMSE) and the least mean square error (LMSE), among others. Other important design choices are the optimizer, which adjusts the model parameters during training, and the activation functions used by the neurons. Table 2 lists DL-based solutions for CSI estimation on RIS-aided wireless systems. Here, we summarize the contributions of each framework and the corresponding remarks.
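As a concrete reading of the distance computed by the loss function, the sketch below evaluates the MSE between a true and an estimated complex channel vector, together with the normalized MSE that is commonly reported alongside it. The normalized variant and the example values are assumptions added for illustration.

```python
from typing import Sequence


def mse(h_true: Sequence[complex], h_est: Sequence[complex]) -> float:
    """Mean squared error between true and estimated channel coefficients."""
    return sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est)) / len(h_true)


def nmse(h_true: Sequence[complex], h_est: Sequence[complex]) -> float:
    """Squared error normalized by the energy of the true channel."""
    error = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    energy = sum(abs(a) ** 2 for a in h_true)
    return error / energy


# Hypothetical two-coefficient channel and an estimate that misses one imaginary part.
h_true = [1 + 1j, 2 - 1j]
h_est = [1 + 0j, 2 - 1j]
loss_mse = mse(h_true, h_est)
loss_nmse = nmse(h_true, h_est)
```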

One of the pioneering works introducing a DL framework for CSI in RISs was proposed by Elibir et al. [71]. The DL model used the pilot signals that arrived at the UE device as input. The training data were synthetically generated when RIS elements were turned ON/OFF by making input–output pairs for several channel realizations. Input data came from direct and cascaded channels, and the DNN output was vectorized in channel matrices. The proposed approach did not need to be re-trained when the user location changed up to 4 degrees. For the DL architecture, there were two CNNs composed of nine layers, using an SGD optimizer, dropout, and a mini-batch of 128 samples. Taha et al. [72] propose a system divided into two phases. Firstly, in the learning phase, the RIS employs an exhaustive search to collect the dataset for the DL model. The optimal beamforming vector with the highest achievable rate reflects the transmitted data. Hence, the model learns how to map the input channel vector to an output vector. Finally, in the prediction stage, the DL model predicts the beamforming vector from the estimated sampled channel. The DL architecture consists of an adapted NN whose number of layers varies, and the neurons are activated by a ReLU function. Also, the model is trained using the DeepMIMO dataset. In [73], the authors developed a DL-based detector called DeepRIS for wireless receivers. The DeepRIS framework estimated the channel and phase angles from the received signal. The model was trained offline with synthetic channel realizations, phase variances, and random sequences of bits. At the output, the model estimated the transmitted symbol and the CSI. This system model consisted of an ANN with a variable number of fully connected layers using a tanh activation function to keep negative weights and the Adam optimizer. A deep-denoising neural network (DDNN) for CSI estimation on mmWave RIS systems was proposed in [74]. Only a few elements were activated using partial channels in the training stage. The complete channel matrix could be reconstructed from the limited measurements by using compressive sensing (CS) with orthogonal matching pursuit (OMP) due to the sparsity of cascaded channels [80]. The complete OMP-DL solution obtained multi-carrier pilots and then built a redundant dictionary to improve system accuracy. The model architecture consisted of a CNN with 15 convolutional layers, with 64 filters each of size 3 × 3 × 64, using a ReLU activation function and Adam as the optimizer. A CNN can be used to reduce the complexity of the RIS in CSI estimations. The authors in [75] developed a DL framework based on CNN architectures for this task. The first was the single-enhanced deep super-resolution neural network (ESDR), which obtained an accurate CSI for a single-ray scenario. The second architecture, called multi-scale deep super-resolution neural network (MSDR), was proposed to estimate CSI on multi-scale sparse low-resolution channel scenarios. Both proposed architectures used ReLU as the activation function. The output was an estimated channel matrix. The framework proposed in [76] is also a CNN-based architecture called fast and flexible denoising network (FFDNet). The dataset used as input comes from synthetic channel realizations that are separated into real and imaginary components. FFDNet uses noise variance information, 2D convolutional layers, and ReLU activations with the Adam optimization. The distributed ML (DML) framework was introduced in [77] to build a downlink CSI estimation. The CNN-based scheme improved the estimation accuracy by extracting different channel features from different scenarios. This model worked when the user moved from one channel scenario to another cell, and the system built a global model shared by the BS with all users that could be jointly trained based on local training sets available to all users. The model was generated using the outdoor tracing scenario from the DeepMIMO dataset. For this architecture, the model used 32 filters with 3 × 3 kernels, batch normalization, and ReLU activation in combination with max-pooling and the Adam optimization. He et al. [78] consider that the rank-deficiency property of the cascaded channel matrix can be used to develop a deep unfolding method architecture to reduce the training overhead and estimate CSI. The design is an ANN-based architecture, which consists of linear layers whose inputs are synthetic channel realizations. The model uses the Adam optimization and ReLU activation functions for all layers except the last one.
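The ON/OFF pilot idea mentioned for [71] can be illustrated with a noise-free, single-antenna toy model. The sketch below is only an assumption about how such measurements separate the direct and cascaded channels, not the authors' implementation: with every RIS element OFF the receiver observes the direct channel, and switching a single element ON adds the cascaded coefficient of that element. The function name `on_off_ls_estimate` and all numeric values are hypothetical.

```python
from typing import List, Tuple


def on_off_ls_estimate(y_off: complex, y_on: List[complex],
                       pilot: complex) -> Tuple[complex, List[complex]]:
    """Least-squares estimates from ON/OFF pilot observations (noise-free toy model).

    y_off : received pilot with all RIS elements OFF (direct link only)
    y_on  : received pilots with exactly one element ON, one entry per element
    """
    h_direct = y_off / pilot
    # Subtract the direct link to isolate each element's cascaded coefficient.
    cascaded = [y_n / pilot - h_direct for y_n in y_on]
    return h_direct, cascaded


# Hypothetical ground-truth channels used only to generate the observations.
true_direct = 0.3 - 0.1j
true_cascaded = [0.5 + 0.2j, -0.4 + 0.1j, 0.05 - 0.6j]
pilot = 1 + 1j

y_off = true_direct * pilot
y_on = [(true_direct + g) * pilot for g in true_cascaded]
est_direct, est_cascaded = on_off_ls_estimate(y_off, y_on, pilot)
```

With noise added to `y_off` and `y_on`, these least-squares estimates become noisy, which is the situation where a learned estimator can be compared against them.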

The combination of precoding for multiuser (MU) downlink and CSI was addressed in [79], where the authors proposed an RIS-reflective network architecture of linear layers with a sigmoid activation and Adam optimization. The proposed model could be implemented when the LOS between the BS and user devices was blocked. In that framework, the CSI was acquired at the BS using the DL network with low pilot inputs and feedback signaling overhead.

In Table 3, we list the analyzed frameworks to compare them by defining the type of ML algorithm used, the network architecture, and the DL parameters reported according to Figure 13. We also list the type of data generated, following the approaches described in Section 4.1. Finally, the last column of Table 3 indicates whether the authors share the source code with the community to reproduce the results described by their framework. Most of the codes are provided by the collection at the GitHub repository uploaded by Ken Wang [81], where we can also see the programming language utilized.

4.3. Beamforming Applications

ML-based beamforming for RIS-assisted wireless systems considers pilot signals or channel realizations as input data for training, similarly to the CSI approaches listed in Section 4.2. However, in most of the reported state-of-the-art research frameworks, the optimization problem relies on maximizing the sum rate using a DL scheme. Figure 14 shows the process of beamforming solutions based on DL for RIS-assisted systems, where the benchmark is similar to the one depicted in Figure 13, except for the learning stage. In the beamforming case, the sum-rate maximization is added to the optimization process. The outputs consist of beamforming and phase-shift matrices at the RIS [82].

Several solutions have been initiated using DRL schemes for beamforming rather than DL, avoiding the training of datasets. Here, channel matrices can be considered an RL environment to build the state space; the actions are the optimized phase shift and the beamforming matrices at each channel realization, while the sum rate is used as an immediate reward. Figure 15 shows a general approach for applying the DRL loop to RIS-aided wireless communications. The action A^{(t)} consists of a space given by the phase shifts and beamforming matrices at a given time t. The state S^{(t)} consists of a space given by all channel matrices defined in Equation (1) and the last action performed by the agent, whose objective is to maximize the accumulated sum of rewards. For the task of beamforming, a common choice for the rewards R^{(t)} is the sum-rate metric ρ_{k} [82,83]. A model is built at each time interval by updating the actions using a DRL algorithm such as DDPG, TD3, or SAC (since the output is continuous). As soon as the sum-rate rewards are maximized, the closed loop depicted in Figure 15 updates and optimizes the beamforming and phase-shift matrices.
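The sum-rate reward used in this loop can be sketched for a downlink with single-antenna users. The effective-channel convention below (direct link plus RIS-reflected link, without conjugation) is an assumption made for illustration and may differ from Equation (1); the function names and the example values are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def effective_channel(h_d: Sequence[complex], G: Sequence[Sequence[complex]],
                      h_r: Sequence[complex], theta: Sequence[float]) -> List[complex]:
    """h[m] = h_d[m] + sum_n h_r[n] * exp(j*theta[n]) * G[n][m] (assumed model).

    h_d: BS-user direct channel (M), G: BS-RIS channel (N x M),
    h_r: RIS-user channel (N), theta: RIS phase shifts in radians (N).
    """
    return [h_d[m] + sum(h_r[n] * cmath.exp(1j * theta[n]) * G[n][m]
                         for n in range(len(theta)))
            for m in range(len(h_d))]


def sum_rate(H_eff: Sequence[Sequence[complex]], W: Sequence[Sequence[complex]],
             noise_power: float) -> float:
    """Sum over users of log2(1 + SINR_k); W[k] is the beamformer of user k."""
    total = 0.0
    for k, h_k in enumerate(H_eff):
        gains = [abs(sum(a * b for a, b in zip(h_k, w_j))) ** 2 for w_j in W]
        interference = sum(gains) - gains[k]
        total += math.log2(1.0 + gains[k] / (interference + noise_power))
    return total


# One user, one BS antenna, two RIS elements, no direct path.
G = [[1 + 0j], [1 + 0j]]
h_r = [1 + 0j, 1 + 0j]
W = [[1 + 0j]]
aligned = sum_rate([effective_channel([0j], G, h_r, [0.0, 0.0])], W, 1.0)
opposed = sum_rate([effective_channel([0j], G, h_r, [0.0, math.pi])], W, 1.0)
```

With aligned phases the two reflected paths add coherently, whereas a phase difference of π makes them cancel. This gap between `aligned` and `opposed` is the signal a DRL agent would exploit when the reward is the sum rate.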

In Table 4, we show the reviewed applications of ML in RIS-aided communication systems. The contributions and remarks for each approach are listed. In the first reviewed approach, Taha et al. [84] propose a novel RIS architecture where all elements are passive except for a few active elements. The RIS learns how to optimally interact with the incident signal, given only the channels in the active elements. The authors use the DeepMIMO dataset to generate channels based on the outdoor ray-tracing scenario. The DL architecture is based on a multilayer perceptron (MLP) with a variable number of fully connected layers, ReLU activation, and MSE loss function. In [85], the authors developed a DL approach for phase configuration on an RIS-MIMO system. The architecture consisted of two networks. The first DNN was fed with the pilot signals and compared with a least-squares estimator for CSI. In contrast, the second network received shorter pilot sequences as input and predicted the optimum phases and beamforming vectors online. The architecture of the first DNN consisted of three fully connected hidden layers with standard-scaler preprocessing of inputs, the Adam optimizer, and a ReLU activation function. The second DNN consisted of four fully connected layers with the Adam optimizer and a ReLU activation function. Also, the pilot signals were generated synthetically from channel realizations.

Unsupervised learning solutions have also been proposed for beamforming. The research in [86] performs an offline training mechanism to make real-time predictions of RIS phase shifts online, maintaining a desired rate of performance. The proposed architecture consists of an ANN with five fully connected layers with a variable number of neurons, a ReLU activation, batch normalization, and the Adam optimizer. The DL algorithm’s input data were synthetically generated and previously normalized. An ML-based framework capable of directly optimizing both the BS’s beamformers and the RIS’s reflective coefficients was proposed in [87]. This work addressed the problem of optimally tuning the RIS elements for capacity enhancement in a multiuser cellular network.

In [88], the signal processing functions at the RIS were optimized by learning algorithms, where the modulation and beamforming were performed in parallel. The authors developed a DL architecture that consisted of two DNNs: one for the BS and another for the user device. The BS DNN mapped the transmitted bit stream to the transmitted signal modulation and beamforming, while the user DNN mapped the received signal to the estimated soft bit stream. The outputs from the second DNN were combined and demodulated at the UE device. The goal was to achieve optimal BER performance, where the network was trained once. The designed end-to-end learning scheme could simultaneously optimize the BS precoding, the RIS, and the UE device. The architecture consisted of one input layer, one hidden layer, and one output layer for the BS network. The layers were fully connected with a ReLU activation. The user DNN had the same architecture as the first layer but used the Xavier initialization and Adam optimization. The framework suggested a type of transfer-learning algorithm. In [89], a DL approach is proposed to maximize the weighted sum-rate (WSR) [8,95,96] for beamforming. The system is called RISnet, a dedicated architecture where all antennas share the same parameters. It applies an alternating optimization procedure for the phase shift and precoding matrices. The structure of the NN consists of a variable number of layers according to the channel inputs, followed by ReLU activations and the Adam optimizer. The output is the estimated phase shift, where the NN takes a separate channel feature for each user.

When training is performed online, input data are not required for DRL-based beamforming schemes. In [83], the DDPG algorithm was used for the joint design of transmit beamforming and phase-shift matrices at the RIS. The sum rate was used to train the DRL agent as the immediate reward. The method required an actor and a critic network, whose architecture comprised one input layer, one hidden layer, and one output layer with a tanh activation function at the output. The number of neurons corresponded to the number of users and antennas at the BS and RIS elements. The scheme performed a whitening process to avoid correlations between entries. Saglam et al. [90] addressed the beamforming optimization research problem under imperfect CSI and hardware impairments. The algorithm employed a closed form to model the phase-dependent amplitude entries developed in [97]. Using the SAC DRL method, the agent learned the optimal beamforming and phase-shift matrices (actions) that maximized the sum rate (reward). The SAC framework consisted of two Q-networks and one stochastic policy network, with the Adam optimization, a tanh activation for the output layer, and a ReLU activation for the hidden layers.
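A bounded tanh output is convenient for continuous-action agents because it can be mapped onto the physical constraints of the problem. The sketch below shows one possible mapping, assumed here for illustration and not taken from [83] or [90]: the squashed output is scaled to a phase in (-π, π), which yields unit-modulus RIS coefficients, and the beamforming vector is rescaled to meet a transmit power budget. Both function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence


def phases_from_actor(raw_outputs: Sequence[float]) -> List[complex]:
    """Map unbounded actor outputs to unit-modulus reflection coefficients."""
    # tanh squashes to (-1, 1); multiplying by pi gives a phase in (-pi, pi).
    return [cmath.exp(1j * math.pi * math.tanh(x)) for x in raw_outputs]


def scale_to_power(w: Sequence[complex], max_power: float) -> List[complex]:
    """Rescale a beamforming vector so that sum(|w_m|^2) equals max_power."""
    power = sum(abs(v) ** 2 for v in w)
    if power == 0.0:
        return list(w)  # nothing to scale
    scale = math.sqrt(max_power / power)
    return [scale * v for v in w]


ris_coefficients = phases_from_actor([-3.0, 0.0, 0.7, 12.0])
beamformer = scale_to_power([0.2 + 0.1j, -1.5j, 0.4 + 0.0j], max_power=2.0)
```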

A model-free design to configure reflections on RIS-aided communications was proposed in [91]. This DRL approach did not require the CSI of the sub-channels and operated under the TDD-MIMO scheme. The DRL-based method used a double DQN (DDQN) to perform coarse phase control in real time, enhancing the phase-shift optimization. The Q-learning-based architecture consisted of four layers with 128 units each, using the Adam optimizer and MSE loss function. DRL approaches have also been applied to the joint design of the 3D trajectory and the phase shifts in RIS-assisted UAV systems. The authors in [92] perform this optimization using DDPG and DDQN algorithms. The framework explores the mobility of the UAV in a 3D space and reaches the optimal phase shift of the RIS with high accuracy. The architecture consists of two-layer networks with 30 units, applying a ReLU activation and the Adam optimization. The state space consists of channel realizations and the locations of UAVs, while the proposed reward is a ratio of the total quantity of data, state, location of UAVs, flight time, and scheduling.
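The double DQN used for discrete phase control differs from a plain DQN in how the learning target is formed: the online network selects the next action and the target network evaluates it. The snippet is a minimal sketch of that standard target; the Q-values, reward, and discount factor are hypothetical, and the network details of [91,92] are not reproduced.

```python
from typing import Sequence


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float],
                      done: bool = False) -> float:
    """Target y = r + gamma * Q_target(s', argmax_a Q_online(s', a))."""
    if done:
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ... while its value is read from the target network.
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for three discrete phase-shift actions in the next state.
q_online_next = [1.0, 3.0, 2.0]
q_target_next = [0.5, 0.7, 0.9]
target = double_dqn_target(1.0, 0.9, q_online_next, q_target_next)
plain_dqn_target = 1.0 + 0.9 * max(q_target_next)
```

In this example the plain DQN target is larger because it takes the maximum over the target network itself, which is the overestimation that the double estimator is meant to reduce.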

Another framework that applies DRL for joint optimization in a UAV environment is proposed in [93]. In that paper, the authors design a scheme for energy harvesting for resource allocation and convex optimization of phase shifts in a UAV-RIS system. The proposed method is based on an SD3 scheme that uses a pair of critic networks. The reward is evaluated in terms of the harvested energy. The state space consists of the equivalent channels from the BS to the UAV-RIS, the distance between the UAV elements and meta-surface, the meta-surface location, and each UAV antenna’s position. The action space consists of the phase shift at the RIS, the transmit power at each UAV, and the scheduling. The physical layer security in RIS-assisted mmWave UAV communications under multiple eavesdroppers and imperfect CSI was investigated in [94]. This framework aimed to maximize UAVs’ worst-case secrecy energy efficiency (SEE) via a joint optimization of the flight trajectory, UAV active beamforming, and RIS passive beamforming. The TD3-based proposed scheme had two agents composed of three networks each. All networks were based on MLP architectures. The state space consisted of the local information at the UAVs, whereas the action space captured the flying direction and completed the UAV trajectory using SEE as a reward. As in Section 4.2, we compare the ML algorithms and parameters for beamforming applications in Table 5, which also lists the source database used for each approach and the source code availability.

4.4. Federated Learning Applications

As depicted in Figure 11, FL is an approach where each UE device has its own model, whose parameters are relayed by the RIS to the BS. A global ML model updates all parameters using a global aggregation method. In [98], the authors propose an AirComp-based FL model [99], which jointly optimizes the device selection, the aggregation beamformer at the BS, and the phase shifts at the RIS to maximize the number of devices that participate in the model aggregation under MSE requirements. The proposal consists of a two-step framework. In the first step, sparsity is exploited for device selection, and the maximum feasible device set is found by solving a series of MSE minimization problems. Then, an alternating optimization framework is used to design the aggregation beamformers efficiently. The authors in [100] propose an offline enhancement of the RIS FL scheme with AirComp. The FL procedure is decomposed utilizing the look-ahead information and a Lyapunov framework. A block coordinate descent (BCD) algorithm is used for phase-shift tuning and decoupled at the transceiver [100]. Then, a low-complexity algorithm based on an element-wise successive refinement is applied to make the practical implementation of an RIS with a discrete phase-shift constraint possible. The research in [101] studies the computation and communication resource allocation of an FL-RIS-assisted communication system to minimize the training latency. The authors propose a BCD technique, an alternating optimization over blocks of variables. Then, user selection and transmit power allocation are optimized via the maximization-minimization (MM) algorithm. Finally, semidefinite relaxation (SDR) and Gaussian minimization are used to obtain the RIS phase shift [101].
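Element-wise successive refinement under a discrete phase-shift constraint can be sketched as a coordinate search: one RIS element is updated at a time, keeping the quantized phase that most improves the objective while the others are fixed. The objective below (magnitude of the combined direct and reflected channel) is a stand-in chosen for illustration; [100] optimizes an AirComp-related objective that is not reproduced here, and all names and values are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def successive_refinement(h_d: complex, c: Sequence[complex], bits: int,
                          max_sweeps: int = 10) -> Tuple[List[float], float]:
    """Coordinate search over quantized phases to increase |h_d + sum_n c_n e^{j theta_n}|."""
    levels = [2.0 * math.pi * i / (2 ** bits) for i in range(2 ** bits)]
    theta = [0.0] * len(c)

    def gain(phases: Sequence[float]) -> float:
        return abs(h_d + sum(c_n * cmath.exp(1j * t) for c_n, t in zip(c, phases)))

    best = gain(theta)
    for _ in range(max_sweeps):
        improved = False
        for n in range(len(c)):
            for level in levels:
                trial = theta[:n] + [level] + theta[n + 1:]
                g = gain(trial)
                if g > best + 1e-12:  # accept strict improvements only
                    best, theta, improved = g, trial, True
        if not improved:
            break  # a full sweep without improvement: stop
    return theta, best


# Hypothetical direct channel, cascaded coefficients, and 2-bit phase resolution.
h_d = 1 + 0j
cascaded = [1 + 0j, 1j, -1 + 0j]
initial_gain = abs(h_d + sum(cascaded))
theta_opt, final_gain = successive_refinement(h_d, cascaded, bits=2)
```

For these values the search stops at a gain of about 3.16, although perfectly aligned phases would give 4. This illustrates that the low complexity of a coordinate search comes with convergence to a local optimum.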

Resource allocation has also been studied for FL Internet-of-things (IoT) applications [102]. In this framework, IoT devices collaborate by using an FL scheme to train an ML model, preserving the privacy of each device. However, the energy constraint can limit the interaction between the central server and the IoT devices. To address this issue, the authors in [102] propose an iterative resource allocation algorithm to reduce the total energy consumption by jointly performing local training and model uploading. FL applications have also been used on RIS-assisted UAV communications. The framework reported in [103] proposes an FL network via over-the-air communications to achieve high-quality and ubiquitous network coverage under privacy and low-latency requirements. The goal is to minimize the worst-case MSE by jointly optimizing the RIS phase shifts and the noise factor for noise suppression using the power transmitted by the UAV and the trajectory [103].

4.5. ML Applications for Signal Decoding

The optimal decoding of the signal at the UE device reduces the transmission error and enhances the overall efficiency of an RIS-assisted system. In [104], the authors proposed demodulating an OFDM signal for an RIS MIMO system based on a CNN. The transmitted data at the BS were used to test the model, which was trained on synthetic channel realizations. The framework considered the Saleh–Valenzuela model to generate the Rayleigh fading channels. For the architecture, the framework considered a CNN with two hidden layers, the Adam optimization, and a ReLU activation function, using 32 filters of size 3 × 3. The output layer used a softmax activation function to determine the outputs. The BER and symbol error rate (SER) performance were evaluated [104].
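The role of the softmax output layer and the SER metric can be illustrated with a few hand-made network outputs. The sketch below assumes a four-symbol constellation and hypothetical logits; it does not reproduce the CNN of [104].

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: class probabilities that sum to one."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]


def detect(logits: Sequence[float]) -> int:
    """Pick the symbol index with the highest softmax probability."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def symbol_error_rate(sent: Sequence[int], detected: Sequence[int]) -> float:
    """Fraction of symbols detected incorrectly."""
    errors = sum(1 for a, b in zip(sent, detected) if a != b)
    return errors / len(sent)


# Hypothetical output-layer logits for four received symbols.
logits_per_symbol = [
    [2.0, 0.1, -1.0, 0.3],
    [0.0, 0.2, 3.0, -0.5],
    [1.5, 1.7, 0.0, 0.1],   # close call that ends in a wrong decision
    [-1.0, 0.0, 0.5, 2.2],
]
sent = [0, 2, 0, 3]
detected = [detect(l) for l in logits_per_symbol]
ser = symbol_error_rate(sent, detected)
```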

In [105], the authors developed a signal detection scheme for RIS-assisted communication systems using a hybrid CNN-GRU architecture. The combined network was trained offline by simulating the OFDM data over the channel. The BER and SER metrics were evaluated for that model. For the CNN, the architecture consisted of 64 filters of size 3 × 3 using a pooling layer. The GRU network consisted of four layers, where the hidden layers used a tanh activation while a sigmoid activation was applied at the output layer. The output layer showed the probability of belonging to one of the classes [105]. The architecture applied a cross-entropy loss and the Adam optimization.

4.6. ML-Based Applications for RIS-Assisted Communications Modeling

The design of tools for simulating RIS-assisted communication systems provides a road map for researchers to understand how RIS surfaces interact with a network. For instance, open-source channel modeling solutions have been provided for indoor/outdoor scenarios, including the physical aspects of wireless propagation in the presence of an RIS [106].

The authors in [107] use ML to configure RIS-assisted communication environments, modeling wireless propagation as a custom interpretable backpropagation NN. The NN learns the propagation parameters and configures them to facilitate communication for users in the vicinity. Each RIS element is configured as a neuron or node whose weight corresponds to the power distribution. The proposed NN minimizes the number of tiles required to achieve efficient communication.

5. Future Trends, Challenges, and Opportunities

RISs have been considered as a key enabling technology for 5G and 6G systems to aid wireless communications. A higher data rate, low latency, spectral efficiency, and support for a massive number of users are some of the benefits RISs provide. According to the listed ML applications in this survey, the future directions that authors pointed out are the following:

The extension or scalability of systems from MISO to MIMO;
The use of multiple RISs;
The use of larger datasets;
Improvements in the ML-trained model to avoid overfitting;
Hyperparameter tuning for DL and DRL approaches.

Additionally, since the challenges for 6G consist of having self-planned and self-organized networks, recent studies have applied ML algorithms to some challenges of RIS-assisted communication. Despite the significant advances in ML approaches for RIS communications, it is crucial to understand that this research area is relatively new, and there are still some open research problems and issues to address.

5.1. Database Shortage

Most of the frameworks reported are supervised approaches, where using labeled data to train a model is a crucial step. ML is a tool for large, complex data processing, where huge numbers of inputs are required [108,109]. The model performance depends directly on data. Although there exist benchmarks that use channel realizations to generate an input to build DL-based solutions for RIS-assisted wireless communications [56], only a few researchers have considered gathering physical measurements to train ML algorithms [58]. Ensuring data integrity, security and scalability are the main concerns when creating a database for the implementation of learning algorithms. Moreover, for applications on wireless communications, labeling or indexing schemes are crucial for the efficient optimization of systems since most of the data are complex and continuous. It is also important to motivate researchers and engineers to collaborate in the creation, use, and implementation of ML-based frameworks on existing databases to facilitate the knowledge sharing and implementation processes.

5.2. Source Code Sharing

The availability of a dataset for ML training is important for testing different ML algorithms and constructing models for RIS-assisted communications systems. Sharing source code with the community motivates researchers to strengthen the design of ML models. Researchers are also motivated to reproduce the experiments and results using the available source code. Having reproducible results is important for scientific research since it ensures the findings can be validated and verified.

Source code sharing encourages collaboration between researchers to enhance ML models and the performance of the systems. Collaboration allows the scientific community to contribute improvements and suggest modifications, leading to more robust and reliable solutions [110]. It brings different perspectives and expertise to the benchmarks. The educational value of sharing the source code resides in the insights and best practices gained by researchers, students, and practitioners, who can go deep into implementation details and novel techniques about ML methods in this area. Open-source initiatives also accelerate the innovation of techniques to enhance RIS-assisted wireless systems [111], leading to a faster development of new algorithms, models, and methodologies.

5.3. Model Deployment and Updating

Wireless communications systems operate under naturally dynamic conditions. However, most reported ML approaches perform offline training. Given the quick variation in channel conditions, monitoring and updating ML models is crucial to ensure the model’s performance under these dynamics. Hyperparameter tuning is another important issue for the effectiveness of an ML model [112]. ML benchmarks can be improved by properly adjusting parameters, such as the number of layers, the number of neurons, the choice of the optimizer, and the loss function, among others.

5.4. Exploring Different Learning Approaches

Most reported approaches for enhancing RIS-assisted wireless networks are based on DL. Several frameworks have begun to incorporate RL and FL as well. However, there are ML algorithms that have not yet been widely explored.

Transfer learning (TL) exploits a trained model’s learning ability and applies this knowledge to a different but related task. We stated before the importance of sharing databases and source code, but sharing the trained ML model and performing transfer learning is another area of opportunity that can offer the following benefits:

Carbon footprint and environmental sustainability: The computational cost of the training stage involves energy consumption. However, transfer learning allows researchers to enhance trained ML models to reduce training time and resource consumption.
Ensure data efficiency: If the model has been trained on large datasets, it can be fine-tuned on smaller datasets. Here, the challenge is to acquire extensive labeled data.
Generalization of models: Transfer learning allows the development of models trained from different sources and tasks to be generalized. Generalized models could perform well on different communications scenarios and new paradigms such as NOMA [113] or cell-free schemes [114].

Distributed machine learning (DML) is a decentralized ML approach which has been applied to RIS-assisted networks. Unlike FL, each device training is initialized independently since no additional devices are participating in the network. On the other hand, in DML, it is possible to exchange parameters and gradients among nodes without coordination or synchronization techniques [115]. Adding graph neural networks (GNN) to DML optimization can reveal the relations among the wireless network elements. However, sophisticated DML techniques have not yet been studied [28,116].

Extreme machine learning (EML) is a new paradigm in neural networks [117]. EML model training is faster, and the number of hyperparameters is reduced compared to DL, making tuning more straightforward. Applications of EML can benefit RIS-assisted IoT schemes since they require less memory than conventional DL, making EML suitable for those IoT devices with hardware constraints.

Transformer architectures: With the rise of large language models and natural language processing leveraged by applications such as ChatGPT, transformer architectures have recently become a subject of interest. Sixth-generation intelligent communications based on transformers have been proposed as a promising approach for future networks [118,119,120].

TinyML: Most current works rarely consider complex computation, massive storage, and consumed energy issues when training and deploying an ML framework to enhance RIS-aided wireless systems. Here, tinyML presents hardware-efficient solutions for beamforming, channel estimation, and federated learning problems [121,122].

Swarm intelligence methods: The enhancement of communication systems has been addressed by swarm methods for beamforming, resource allocation, and other related issues [123]. This area has been considered a subset of AI to obtain high-quality solutions for optimizing wireless systems. The combination of DL and swarm intelligence is a potential solution for the optimization of next-generation networks.

6. Conclusions

In this work, applications of ML on RIS-assisted communications systems were reviewed. The work focused on database generation for training, the type of ML algorithm applied, and the problems exposed in each research framework listed. We shed light on the areas of opportunity, future applications, and ML schemes to be developed for RIS-assisted networks. Since this is an interesting area of intense research, it is important to analyze and explore the capabilities of ML algorithms to optimize and reduce computational complexity in the design of wireless communication systems. We hope this paper is useful as a guideline for researchers, students, and engineers to develop novel and efficient schemes for the next generations of wireless communication systems.

Author Contributions

Conceptualization, R.F.I.-H., F.R.C.-S. and C.A.G.; methodology, R.F.I.-H. and F.R.C.-S.; software, R.F.I.-H.; validation, F.R.C.-S., C.A.G., A.G.-B., L.A.V.-T. and J.A.D.-P.-F.; Writing—original draft, R.F.I.-H. and F.R.C.-S.; writing—review and editing, F.R.C.-S. and C.A.G.; visualization, A.G.-B., L.A.V.-T. and J.A.D.-P.-F. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Universidad Panamericana and the National Council of Humanities, Science, and Technology in Mexico (CONAHCYT).

Data Availability Statement

Data sharing is not applicable to this article as no new data were created or analyzed in this study.

Acknowledgments

Francisco R. Castillo Soria’s contribution was carried out during his research stay at the Universidad Autónoma Metropolitana (UAM) Iztapalapa, Mexico City.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

RIS	Reconfigurable intelligent surface
5G	Fifth-generation communications
6G	Sixth-generation communications
THz	Terahertz
IoT	Internet of things
MIMO	Multiple input, multiple output
CSI	Channel state information
ML	Machine learning
AI	Artificial intelligence
DL	Deep learning
SL	Supervised learning
UL	Unsupervised learning
RL	Reinforcement learning
FL	Federated learning
DML	Distributed machine learning
DRL	Deep reinforcement learning
DQL	Deep Q-learning
DQN	Deep Q-network
NN	Neural network
ANN	Artificial neural network
DNN	Deep neural network
CNN	Convolutional neural network
GNN	Graph neural network
ReLU	Rectified linear unit
tanh	Hyperbolic tangent
SGD	Stochastic gradient descent
MSE	Mean squared error
MMSE	Minimum mean squared error
LMSE	Least mean square error
Adam	Adaptive moment estimation
MDP	Markov decision process
DDPG	Deep deterministic policy gradient
SAC	Soft actor–critic
TD	Temporal difference
TD3	Twin-delayed deep-deterministic policy gradient
EML	Extreme machine learning
EM	Electromagnetic
PIN	Positive–intrinsic–negative
DC	Direct current
mmWave	Millimeter wave
FPGA	Field-programmable gate array
UE	User equipment
BS	Base station
AP	Access point
Tx	Transmit antenna
Rx	Receive antenna
QAM	Quadrature amplitude modulation
QSM	Quadrature spatial modulation
OFDM	Orthogonal frequency division multiplexing
GPU	Graphics processing unit
CPU	Central processing unit
MU	Multiuser
LOS	Line of sight
UAV	Unmanned aerial vehicle
TDD	Time-division duplexing
BCD	Block coordinate descent
MM	Majorization-minimization
SDR	Semidefinite relaxation
AirCom	Aeronautical communications

References

Rost, P.; Mannweiler, C.; Michalopoulos, D.S.; Sartori, C.; Sciancalepore, V.; Sastry, N.; Holland, O.; Tayade, S.; Han, B.; Bega, D.; et al. Network slicing to enable scalability and flexibility in 5G mobile networks. IEEE Commun. Mag. 2017, 55, 72–79.
Andrews, J.G.; Buzzi, S.; Choi, W.; Hanly, S.V.; Lozano, A.; Soong, A.C.; Zhang, J.C. What will 5G be? IEEE J. Sel. Areas Commun. 2014, 32, 1065–1082.
Gutierrez, C.A.; Caicedo, O.; Campos-Delgado, D.U. 5G and beyond: Past, present and future of the mobile communications. IEEE Lat. Am. Trans. 2021, 19, 1702–1736.
Wild, T.; Braun, V.; Viswanathan, H. Joint design of communication and sensing for beyond 5G and 6G systems. IEEE Access 2021, 9, 30845–30857.
Tan, D.K.P.; He, J.; Li, Y.; Bayesteh, A.; Chen, Y.; Zhu, P.; Tong, W. Integrated sensing and communication in 6G: Motivations, use cases, requirements, challenges and future directions. In Proceedings of the 2021 1st IEEE International Online Symposium on Joint Communications & Sensing (JC&S), Dresden, Germany, 23–24 February 2021; pp. 1–6.
Subrt, L.; Pechac, P. Intelligent walls as autonomous parts of smart indoor environments. IET Commun. 2012, 6, 1004–1010.
Di Renzo, M.; Ntontin, K.; Song, J.; Danufane, F.H.; Qian, X.; Lazarakis, F.; De Rosny, J.; Phan-Huy, D.T.; Simeone, O.; Zhang, R.; et al. Reconfigurable intelligent surfaces vs. relaying: Differences, similarities, and performance comparison. IEEE Open J. Commun. Soc. 2020, 1, 798–807.
Guo, H.; Liang, Y.C.; Chen, J.; Larsson, E.G. Weighted sum-rate maximization for intelligent reflecting surface enhanced wireless networks. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019.
Liu, Y.; Liu, X.; Mu, X.; Hou, T.; Xu, J.; Di Renzo, M.; Al-Dhahir, N. Reconfigurable intelligent surfaces: Principles and opportunities. IEEE Commun. Surv. Tutor. 2021, 23, 1546–1577.
Khalid, W.; Yu, H.; Do, D.T.; Kaleem, Z.; Noh, S. RIS-aided physical layer security with full-duplex jamming in underlay D2D networks. IEEE Access 2021, 9, 99667–99679.
Tang, Z.; Hou, T.; Liu, Y.; Zhang, J.; Zhong, C. A novel design of RIS for enhancing the physical layer security for RIS-aided NOMA networks. IEEE Wirel. Commun. Lett. 2021, 10, 2398–2401.
Yang, L.; Meng, F.; Zhang, J.; Hasna, M.O.; Di Renzo, M. On the performance of RIS-assisted dual-hop UAV communication systems. IEEE Trans. Veh. Technol. 2020, 69, 10385–10390.
Rahmatov, N.; Baek, H. RIS-carried UAV communication: Current research, challenges, and future trends. ICT Express 2023, 9, 961–973.
Niu, H.; Lin, Z.; Chu, Z.; Zhu, Z.; Xiao, P.; Nguyen, H.X.; Lee, I.; Al-Dhahir, N. Joint beamforming design for secure RIS-assisted IoT networks. IEEE Internet Things J. 2022, 10, 1628–1641.
Kumaravelu, V.B.; Imoize, A.L.; Soria, F.R.C.; Velmurugan, P.G.S.; Thiruvengadam, S.J.; Do, D.T.; Murugadass, A. RIS-Assisted Fixed NOMA: Outage Probability Analysis and Transmit Power Optimization. Future Internet 2023, 15, 249.
Castillo-Soria, F.; Gutierrez, C.; Kumaravelu, V.; García-Barrientos, A. RIS-Assisted Non-orthogonal Multiple Access System Based on SSK. Wirel. Pers. Commun. 2024, 134, 2391–2412.
Castillo-Soria, F.R.; Macias-Velasquez, S.; Kumaravelu, V.B.; Ramos, V.; Azurdia-Meza, C.A. Multiple Parallel RIS-Assisted MU-MIMO-DQSM System; Blind and Intelligent Approaches. Available online: http://www.cic-chinacommunications.cn/EN/10.23919/JCC.ja.2023-0695 (accessed on 13 May 2024).
Di Renzo, M.; Zappone, A.; Debbah, M.; Alouini, M.S.; Yuen, C.; De Rosny, J.; Tretyakov, S. Smart radio environments empowered by reconfigurable intelligent surfaces: How it works, state of research, and the road ahead. IEEE J. Sel. Areas Commun. 2020, 38, 2450–2525.
Tang, W.; Chen, M.Z.; Chen, X.; Dai, J.Y.; Han, Y.; Di Renzo, M.; Zeng, Y.; Jin, S.; Cheng, Q.; Cui, T.J. Wireless communications with reconfigurable intelligent surface: Path loss modeling and experimental measurement. IEEE Trans. Wirel. Commun. 2020, 20, 421–439.
Wang, J.; Jiang, C.; Zhang, H.; Ren, Y.; Chen, K.C.; Hanzo, L. Thirty years of machine learning: The road to Pareto-optimal wireless networks. IEEE Commun. Surv. Tutor. 2020, 22, 1472–1514.
Hellström, H.; da Silva, J.M.B., Jr.; Amiri, M.M.; Chen, M.; Fodor, V.; Poor, H.V.; Fischione, C. Wireless for machine learning: A survey. Found. Trends® Signal Process. 2022, 15, 290–399.
Zappone, A.; Di Renzo, M.; Debbah, M. Wireless networks design in the era of deep learning: Model-based, AI-based, or both? IEEE Trans. Commun. 2019, 67, 7331–7376.
Wang, C.X.; Di Renzo, M.; Stanczak, S.; Wang, S.; Larsson, E.G. Artificial intelligence enabled wireless networking for 5G and beyond: Recent advances and future challenges. IEEE Wirel. Commun. 2020, 27, 16–23.
Sejan, M.A.S.; Rahman, M.H.; Shin, B.S.; Oh, J.H.; You, Y.H.; Song, H.K. Machine Learning for Intelligent-Reflecting-Surface-Based Wireless Communication towards 6G: A Review. Sensors 2022, 22, 5405.
Faisal, K.; Choi, W. Machine learning approaches for reconfigurable intelligent surfaces: A survey. IEEE Access 2022, 10, 27343–27367.
Faisal, K.; Choi, W. A study on machine learning-based approaches for reconfigurable intelligent surface. In Proceedings of the 2021 International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Republic of Korea, 20–22 October 2021; pp. 227–232.
Zhang, S.; Li, M.; Jian, M.; Zhao, Y.; Gao, F. AIRIS: Artificial intelligence enhanced signal processing in reconfigurable intelligent surface communications. China Commun. 2021, 18, 158–171.
Zhou, H.; Erol-Kantarci, M.; Liu, Y.; Poor, H.V. A Survey on Model-based, Heuristic, and Machine Learning Optimization Approaches in RIS-aided Wireless Networks. arXiv 2023, arXiv:2303.14320.
Wong, Y.H.; Chiong, C.W. Transceiver Design for Secure Wireless Communication Networks with IRS using Deep Learning: A Survey. In Proceedings of the 2023 International Conference on Digital Applications, Transformation & Economy (ICDATE), Miri, Malaysia, 14–16 July 2023; pp. 245–249.
Elbir, A.M.; Mishra, K.V. A survey of deep learning architectures for intelligent reflecting surfaces. arXiv 2020, arXiv:2009.02540.
ElMossallamy, M.A.; Zhang, H.; Song, L.; Seddik, K.G.; Han, Z.; Li, G.Y. Reconfigurable intelligent surfaces for wireless communications: Principles, challenges, and opportunities. IEEE Trans. Cogn. Commun. Netw. 2020, 6, 990–1002.
Cui, T.J.; Qi, M.Q.; Wan, X.; Zhao, J.; Cheng, Q. Coding metamaterials, digital metamaterials and programmable metamaterials. Light Sci. Appl. 2014, 3, e218.
Wu, Q.; Zhang, R. Towards smart and reconfigurable environment: Intelligent reflecting surface aided wireless network. IEEE Commun. Mag. 2019, 58, 106–112.
Yang, H.; Chen, X.; Yang, F.; Xu, S.; Cao, X.; Li, M.; Gao, J. Design of resistor-loaded reflectarray elements for both amplitude and phase control. IEEE Antennas Wirel. Propag. Lett. 2016, 16, 1159–1162.
Björnson, E.; Özdogan, Ö.; Larsson, E.G. Reconfigurable intelligent surfaces: Three myths and two critical questions. IEEE Commun. Mag. 2020, 58, 90–96.
Özdogan, Ö.; Björnson, E.; Larsson, E.G. Intelligent reflecting surfaces: Physics, propagation, and pathloss modeling. IEEE Wirel. Commun. Lett. 2019, 9, 581–585.
Basar, E.; Di Renzo, M.; De Rosny, J.; Debbah, M.; Alouini, M.S.; Zhang, R. Wireless communications through reconfigurable intelligent surfaces. IEEE Access 2019, 7, 116753–116773.
Wu, Q.; Zhang, S.; Zheng, B.; You, C.; Zhang, R. Intelligent reflecting surface-aided wireless communications: A tutorial. IEEE Trans. Commun. 2021, 69, 3313–3351.
Castillo-Soria, F.R.; Del Puerto-Flores, J.A.; Azurdia-Meza, C.A.; Babu Kumaravelu, V.; Simón, J.; Gutierrez, C.A. Precoding for RIS-Assisted Multi-User MIMO-DQSM Transmission Systems. Future Internet 2023, 15, 299.
Wei, X.; Shen, D.; Dai, L. Channel estimation for RIS assisted wireless communications—Part I: Fundamentals, solutions, and future opportunities. IEEE Commun. Lett. 2021, 25, 1398–1402.
Björnson, E.; Sanguinetti, L. Rayleigh fading modeling and channel hardening for reconfigurable intelligent surfaces. IEEE Wirel. Commun. Lett. 2020, 10, 830–834.
Mishra, D.; Johansson, H. Channel estimation and low-complexity beamforming design for passive intelligent surface assisted MISO wireless energy transfer. In Proceedings of the ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Brighton, UK, 12–17 May 2019; pp. 4659–4663.
Alwazani, H.; Kammoun, A.; Chaaban, A.; Debbah, M.; Alouini, M.S. Intelligent reflecting surface-assisted multi-user MISO communication: Channel estimation and beamforming design. IEEE Open J. Commun. Soc. 2020, 1, 661–680.
Lyu, J.; Zhang, R. Hybrid active/passive wireless network aided by intelligent reflecting surface: System modeling and performance analysis. IEEE Trans. Wirel. Commun. 2021, 20, 7196–7212.
Jordan, M.I.; Mitchell, T.M. Machine learning: Trends, perspectives, and prospects. Science 2015, 349, 255–260.
Ketkar, N. Stochastic gradient descent. In Deep Learning with Python: A Hands-on Introduction; Springer: Berlin/Heidelberg, Germany, 2017; pp. 113–132.
Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Mnih, V.; Kavukcuoglu, K.; Silver, D.; Rusu, A.A.; Veness, J.; Bellemare, M.G.; Graves, A.; Riedmiller, M.; Fidjeland, A.K.; Ostrovski, G.; et al. Human-level control through deep reinforcement learning. Nature 2015, 518, 529–533.
Lillicrap, T.P.; Hunt, J.J.; Pritzel, A.; Heess, N.; Erez, T.; Tassa, Y.; Silver, D.; Wierstra, D. Continuous control with deep reinforcement learning. arXiv 2015, arXiv:1509.02971.
Fujimoto, S.; Hoof, H.; Meger, D. Addressing function approximation error in actor-critic methods. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1587–1596.
Haarnoja, T.; Zhou, A.; Abbeel, P.; Levine, S. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In Proceedings of the International Conference on Machine Learning, Stockholm, Sweden, 10–15 July 2018; pp. 1861–1870.
Luong, N.C.; Hoang, D.T.; Gong, S.; Niyato, D.; Wang, P.; Liang, Y.C.; Kim, D.I. Applications of deep reinforcement learning in communications and networking: A survey. IEEE Commun. Surv. Tutor. 2019, 21, 3133–3174.
McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Aguera y Arcas, B. Communication-efficient learning of deep networks from decentralized data. In Proceedings of the Artificial Intelligence and Statistics, Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
Li, L.; Fan, Y.; Tse, M.; Lin, K.Y. A review of applications in federated learning. Comput. Ind. Eng. 2020, 149, 106854.
Klautau, A.; Batista, P.; González-Prelcic, N.; Wang, Y.; Heath, R.W. 5G MIMO data for machine learning: Application to beam-selection using deep learning. In Proceedings of the 2018 Information Theory and Applications Workshop (ITA), San Diego, CA, USA, 11–16 February 2018; pp. 1–9.
Alkhateeb, A. DeepMIMO: A generic deep learning dataset for millimeter wave and massive MIMO applications. arXiv 2019, arXiv:1902.06435.
Remcom Wireless Insite. Available online: https://www.remcom.com/wireless-insite-em-propagation-software (accessed on 30 October 2023).
Tewes, S.; Heinrichs, M.; Weinberger, K.; Kronberger, R.; Sezgin, A. A comprehensive dataset of RIS-based channel measurements in the 5GHz band. In Proceedings of the 2023 IEEE 97th Vehicular Technology Conference (VTC2023-Spring), Florence, Italy, 20–23 June 2023; pp. 1–5.
IEEEDataPort. Available online: https://ieee-dataport.org/ (accessed on 13 November 2023).
GitHub. Available online: https://github.com (accessed on 20 October 2023).
Rossanese, M.; Mursia, P.; Garcia-Saavedra, A.; Sciancalepore, V.; Asadi, A.; Costa-Perez, X. Open Experimental Measurements of Sub-6GHz Reconfigurable Intelligent Surfaces. IEEE Internet Comput. 2024, 28, 19–28.
Zhang, F.; Luo, C.; Xu, J.; Luo, Y.; Zheng, F.C. Deep learning based automatic modulation recognition: Models, datasets, and challenges. Digit. Signal Process. 2022, 129, 103650.
O’Shea, T.J.; Corgan, J.; Clancy, T.C. Convolutional radio modulation recognition networks. In Proceedings of the Engineering Applications of Neural Networks: 17th International Conference, EANN 2016, Aberdeen, UK, 2–5 September 2016; pp. 213–226.
O’Shea, T.J.; West, N. Radio machine learning dataset generation with GNU Radio. In Proceedings of the GNU Radio Conference, Boulder, CO, USA, 12–16 September 2016; Volume 1.
O’Shea, T.J.; Roy, T.; Clancy, T.C. Over-the-air deep learning based radio signal classification. IEEE J. Sel. Top. Signal Process. 2018, 12, 168–179.
Gu, J.; Salehi, B.; Roy, D.; Chowdhury, K.R. Multimodality in mmWave MIMO beam selection using deep learning: Datasets and challenges. IEEE Commun. Mag. 2022, 60, 36–41.
Salehi, B.; Belgiovine, M.; Sanchez, S.G.; Dy, J.; Ioannidis, S.; Chowdhury, K. Machine learning on camera images for fast mmwave beamforming. In Proceedings of the 2020 IEEE 17th International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Delhi, India, 10–13 December 2020; pp. 338–346.
Salehi, B.; Gu, J.; Roy, D.; Chowdhury, K. FLASH: Federated learning for automated selection of high-band mmwave sectors. In Proceedings of the IEEE INFOCOM 2022-IEEE Conference on Computer Communications, London, UK, 2–5 May 2022; pp. 1719–1728.
Sun, R.; Wang, W.; Chen, L.; Wei, G.; Zhang, W. Diagnosis of intelligent reflecting surface in millimeter-wave communication systems. IEEE Trans. Wirel. Commun. 2021, 21, 3921–3934.
Demir, Ö.T.; Björnson, E. Is channel estimation necessary to select phase-shifts for RIS-assisted massive MIMO? IEEE Trans. Wirel. Commun. 2022, 21, 9537–9552.
Elbir, A.M.; Papazafeiropoulos, A.; Kourtessis, P.; Chatzinotas, S. Deep channel learning for large intelligent surfaces aided mm-wave massive MIMO systems. IEEE Wirel. Commun. Lett. 2020, 9, 1447–1451.
Taha, A.; Alrabeiah, M.; Alkhateeb, A. Enabling large intelligent surfaces with compressive sensing and deep learning. IEEE Access 2021, 9, 44304–44321.
Khan, S.; Khan, K.S.; Haider, N.; Shin, S.Y. Deep-learning-aided detection for reconfigurable intelligent surfaces. arXiv 2019, arXiv:1910.09136.
Liu, S.; Gao, Z.; Zhang, J.; Di Renzo, M.; Alouini, M.S. Deep denoising neural network assisted compressive channel estimation for mmWave intelligent reflecting surfaces. IEEE Trans. Veh. Technol. 2020, 69, 9223–9228.
Jin, Y.; Zhang, J.; Zhang, X.; Xiao, H.; Ai, B.; Ng, D.W.K. Channel estimation for semi-passive reconfigurable intelligent surfaces with enhanced deep residual networks. IEEE Trans. Veh. Technol. 2021, 70, 11083–11088.
Kundu, N.K.; McKay, M.R. Channel estimation for reconfigurable intelligent surface aided MISO communications: From LMMSE to deep learning solutions. IEEE Open J. Commun. Soc. 2021, 2, 471–487.
Dai, L.; Wei, X. Distributed machine learning based downlink channel estimation for RIS assisted wireless communications. IEEE Trans. Commun. 2022, 70, 4900–4909.
He, J.; Wymeersch, H.; Di Renzo, M.; Juntti, M. Learning to estimate RIS-aided mmWave channels. IEEE Wirel. Commun. Lett. 2022, 11, 841–845.
Wu, M.; Gao, Z.; Huang, Y.; Xiao, Z.; Ng, D.W.K.; Zhang, Z. Deep learning-based rate-splitting multiple access for reconfigurable intelligent surface-aided tera-hertz massive MIMO. IEEE J. Sel. Areas Commun. 2023, 41, 1431–1451.
Chen, J.; Liang, Y.C.; Cheng, H.V.; Yu, W. Channel estimation for reconfigurable intelligent surface aided multi-user mmWave MIMO systems. IEEE Trans. Wirel. Commun. 2023, 22, 6853–6869.
Wang, K. RIS-Codes-Collection: A Complete Collection Contains the Codes for RIS (IRS) Papers. 2022. Available online: https://github.com/ken0225/RIS-Codes-Collection#ris-codes-collection-a-complete-collection-contains-the-codes-for-risirs-papers (accessed on 13 May 2024).
Di, B.; Zhang, H.; Song, L.; Li, Y.; Han, Z.; Poor, H.V. Hybrid beamforming for reconfigurable intelligent surface based multi-user communications: Achievable rates with limited discrete phase shifts. IEEE J. Sel. Areas Commun. 2020, 38, 1809–1822.
Huang, C.; Mo, R.; Yuen, C. Reconfigurable intelligent surface assisted multiuser MISO systems exploiting deep reinforcement learning. IEEE J. Sel. Areas Commun. 2020, 38, 1839–1850.
Taha, A.; Alrabeiah, M.; Alkhateeb, A. Deep learning for large intelligent surfaces in millimeter wave and massive MIMO systems. In Proceedings of the 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 9–13 December 2019.
Özdoğan, Ö.; Björnson, E. Deep learning-based phase reconfiguration for intelligent reflecting surfaces. In Proceedings of the 2020 54th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, CA, USA, 1–4 November 2020; pp. 707–711.
Gao, J.; Zhong, C.; Chen, X.; Lin, H.; Zhang, Z. Unsupervised learning for passive beamforming. IEEE Commun. Lett. 2020, 24, 1052–1056.
Jiang, T.; Cheng, H.V.; Yu, W. Learning to reflect and to beamform for intelligent reflecting surface with implicit channel estimation. IEEE J. Sel. Areas Commun. 2021, 39, 1931–1945.
Jiang, H.; Dai, L.; Hao, M.; MacKenzie, R. End-to-end learning for RIS-aided communication systems. IEEE Trans. Veh. Technol. 2022, 71, 6778–6783.
Peng, B.; Siegismund-Poschmann, F.; Jorswieck, E.A. RISnet: A Dedicated Scalable Neural Network Architecture for Optimization of Reconfigurable Intelligent Surfaces. In Proceedings of the WSA & SCC 2023; 26th International ITG Workshop on Smart Antennas and 13th Conference on Systems, Communications, and Coding, VDE, Braunschweig, Germany, 27 February 2023; pp. 1–6.
Saglam, B.; Gurgunoglu, D.; Kozat, S.S. Deep Reinforcement Learning Based Joint Downlink Beamforming and RIS Configuration in RIS-aided MU-MISO Systems Under Hardware Impairments and Imperfect CSI. arXiv 2022, arXiv:2211.09702.
Wang, W.; Zhang, W. Intelligent reflecting surface configurations for smart radio using deep reinforcement learning. IEEE J. Sel. Areas Commun. 2022, 40, 2335–2346.
Mei, H.; Yang, K.; Liu, Q.; Wang, K. 3D-trajectory and phase-shift design for RIS-assisted UAV systems using deep reinforcement learning. IEEE Trans. Veh. Technol. 2022, 71, 3020–3029.
Peng, H.; Wang, L.C. Energy harvesting reconfigurable intelligent surface for UAV based on robust deep reinforcement learning. IEEE Trans. Wirel. Commun. 2023, 22, 6826–6838.
Tham, M.L.; Wong, Y.J.; Iqbal, A.; Ramli, N.B.; Zhu, Y.; Dagiuklas, T. Deep Reinforcement Learning for Secrecy Energy-Efficient UAV Communication with Reconfigurable Intelligent Surface. In Proceedings of the 2023 IEEE Wireless Communications and Networking Conference (WCNC), Glasgow, UK, 26–29 March 2023; pp. 1–6.
Guo, H.; Liang, Y.C.; Chen, J.; Larsson, E.G. Weighted sum-rate maximization for reconfigurable intelligent surface aided wireless networks. IEEE Trans. Wirel. Commun. 2020, 19, 3064–3076.
Cao, Y.; Lv, T.; Ni, W. Intelligent reflecting surface aided multi-user mmWave communications for coverage enhancement. In Proceedings of the 2020 IEEE 31st Annual International Symposium on Personal, Indoor and Mobile Radio Communications, London, UK, 31 August–3 September 2020; pp. 1–6.
Abeywickrama, S.; Zhang, R.; Wu, Q.; Yuen, C. Intelligent reflecting surface: Practical phase shift model and beamforming optimization. IEEE Trans. Commun. 2020, 68, 5849–5863.
Wang, Z.; Qiu, J.; Zhou, Y.; Shi, Y.; Fu, L.; Chen, W.; Letaief, K.B. Federated learning via intelligent reflecting surface. IEEE Trans. Wirel. Commun. 2021, 21, 808–822.
Wang, Z.; Zhao, Y.; Zhou, Y.; Shi, Y.; Jiang, C.; Letaief, K.B. Over-the-air computation: Foundations, technologies, and applications. arXiv 2022, arXiv:2210.10524.
Zhao, Y.; Wu, Q.; Chen, W.; Wu, C.; Poor, H.V. Performance-oriented design for intelligent reflecting surface assisted federated learning. IEEE Trans. Commun. 2023, 71, 5228–5243.
Zhao, L.; Xu, H.; Wang, J.; Chen, Y.; Chen, X.; Wang, Z. Computation–communication resource allocation for federated learning system with intelligent reflecting surfaces. Arab. J. Sci. Eng. 2022, 47, 10203–10209.
Zhang, T.; Mao, S. Energy-efficient federated learning with intelligent reflecting surface. IEEE Trans. Green Commun. Netw. 2021, 6, 845–858.
Liu, H.; Yuan, X.; Zhang, Y.J.A. Reconfigurable intelligent surface enabled federated learning: A unified communication-learning design approach. IEEE Trans. Wirel. Commun. 2021, 20, 7595–7609.
Sejan, M.A.S.; Rahman, M.H.; Song, H.K. Demod-CNN: A Robust Deep Learning Approach for Intelligent Reflecting Surface-Assisted Multiuser MIMO Communication. Sensors 2022, 22, 5971.
Rahman, M.H.; Sejan, M.A.S.; Aziz, M.A.; Kim, D.S.; You, Y.H.; Song, H.K. Deep Convolutional and Recurrent Neural-Network-Based Optimal Decoding for RIS-Assisted MIMO Communication. Mathematics 2023, 11, 3397.
Basar, E.; Yildirim, I. SimRIS channel simulator for reconfigurable intelligent surface-empowered communication systems. In Proceedings of the 2020 IEEE Latin-American Conference on Communications (LATINCOM), Santo Domingo, Dominican Republic, 18–20 November 2020; pp. 1–6.
Liaskos, C.; Tsioliaridou, A.; Nie, S.; Pitsillides, A.; Ioannidis, S.; Akyildiz, I. An interpretable neural network for configuring programmable wireless environments. In Proceedings of the 2019 IEEE 20th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Cannes, France, 2–5 July 2019; pp. 1–5.
Géron, A. Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2022.
Chollet, F. Deep Learning with Python; Simon and Schuster: New York, NY, USA, 2021.
Walters, W.P. Code sharing in the open science era. J. Chem. Inf. Model. 2020, 60, 4417–4420.
Lerner, J.; Tirole, J. The economics of technology sharing: Open source and beyond. J. Econ. Perspect. 2005, 19, 99–120.
Bardenet, R.; Brendel, M.; Kégl, B.; Sebag, M. Collaborative hyperparameter tuning. In Proceedings of the International Conference on Machine Learning, Atlanta, GA, USA, 16–21 June 2013; pp. 199–207.
Pham, Q.V.; Huynh-The, T.; Alazab, M.; Zhao, J.; Hwang, W.J. Sum-rate maximization for UAV-assisted visible light communications using NOMA: Swarm intelligence meets machine learning. IEEE Internet Things J. 2020, 7, 10375–10387.
Biswas, S.; Vijayakumar, P. AP selection in cell-free massive MIMO system using machine learning algorithm. In Proceedings of the 2021 Sixth International Conference on Wireless Communications, Signal Processing and Networking (WiSPNET), Chennai, India, 25–27 March 2021; pp. 158–161.
Asad, M.; Moustafa, A.; Ito, T. Federated learning versus classical machine learning: A convergence comparison. arXiv 2021, arXiv:2107.10976.
Chen, C.; Xu, S.; Zhang, J.; Zhang, J. A Distributed Machine Learning-Based Approach for IRS-Enhanced Cell-Free MIMO Networks. arXiv 2023, arXiv:2301.08077.
Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: A new learning scheme of feedforward neural networks. In Proceedings of the 2004 IEEE International Joint Conference on Neural Networks (IEEE Cat. No. 04CH37541), Budapest, Hungary, 25–29 July 2004; Volume 2, pp. 985–990.
Wang, Y.; Gao, Z.; Zheng, D.; Chen, S.; Gündüz, D.; Poor, H.V. Transformer-empowered 6G intelligent networks: From massive MIMO processing to semantic communication. IEEE Wirel. Commun. 2022, 30, 127–135.
Han, X.; Zhiqin, W.; Dexin, L.; Wenqiang, T.; Xiaofeng, L.; Wendong, L.; Shi, J.; Jia, S.; Zhi, Z.; Ning, Y. AI enlightens wireless communication: A transformer backbone for CSI feedback. China Commun. 2024, 1–14.
Zhang, J.; Li, J.; Shi, L.; Wang, Z.; Jin, S.; Chen, W.; Poor, H.V. Decision Transformer for Wireless Communications: A New Paradigm of Resource Management. arXiv 2024, arXiv:2404.05199.
Liu, H.; Wei, Z.; Zhang, H.; Li, B.; Zhao, C. Tiny machine learning (tiny-ml) for efficient channel estimation and signal detection. IEEE Trans. Veh. Technol. 2022, 71, 6795–6800.
Kopparapu, K.; Lin, E.; Breslin, J.G.; Sudharsan, B. TinyFedTL: Federated transfer learning on ubiquitous tiny IoT devices. In Proceedings of the 2022 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops), Pisa, Italy, 21–25 March 2022; pp. 79–81.
Pham, Q.V.; Nguyen, D.C.; Mirjalili, S.; Hoang, D.T.; Nguyen, D.N.; Pathirana, P.N.; Hwang, W.J. Swarm intelligence for next-generation networks: Recent advances and applications. J. Netw. Comput. Appl. 2021, 191, 103141.

Figure 1. RIS hardware architecture, detailing reflecting elements and equivalent circuits.

Figure 2. RIS-aided wireless communication system scheme.

Figure 3. Paradigms of classical programming and machine learning.

Figure 4. Classification of ML methods according to the output.

Figure 5. Diagram representation of the different ML areas.

Figure 6. Representation of a neuron unit.

Figure 7. Basic architecture of a neural network.

Figure 8. Basic scheme of the RL process.

Figure 9. Q-learning and deep Q-learning approaches.

Figure 10. A traditional ML scheme for RIS-assisted communication systems.

Figure 11. Federated learning scheme application on an RIS-aided communication system.

Figure 12. Applications of ML for enhancing RIS-aided wireless systems.

Figure 13. Diagram of ML-based CSI estimation for RIS communication systems.

Figure 14. Scheme for DL-based beamforming for RIS-assisted communications systems.

Figure 15. General approach of DRL for beamforming on RIS-aided communication systems.

Table 1. Datasets for training ML algorithms applied to RIS-assisted communication systems’ enhancement.

Reference	Type of Data Source	Description	Publicly Available?
[55]	Synthetic	Vehicle traffic and ray-tracing simulator to generate 5G-based datasets	No
[56]	Synthetic	Dataset generator tailored to ML algorithms based on the Remcom ray-tracing application	Yes
[58]	Physical	Dataset of channel measurements for different geometric antenna arrangements and RISs	Yes
[61]	Physical	Dataset of measurements on RIS in the sub-6 GHz band and OFDM transceivers	Yes

Table 2. ML applications for CSI estimation on RIS-aided communications.

Reference	Contributions	Remarks
[71]	DL twin CNN-based architecture for the estimation of CSI	The model does not need to be re-trained when the user location changes by up to 4 degrees
[72]	Combined DL and compressive sensing to estimate the CSI using only the active reflective elements of the RIS	The DL scheme does not need any knowledge of the RIS array geometry, but a large dataset is needed
[73]	DL approach for the estimation of CSI and symbol detection in RIS wireless systems	The proposed model estimates the CSI and phase angles from the received signals
[74]	DL-based estimation of channels from UE to RIS that works at different SNR levels and numbers of multipaths	The model achieves strong NMSE performance with only a few elements activated during the training stage
[75]	Proposal of two DL architectures for CSI estimation exploiting the low-rank sparsity of channels	The proposal can increase its performance when increasing the density of sensing devices
[76]	Formulated channel estimation as an image denoising problem using a CNN-based architecture	Numerical results show that the proposed model's performance is close to that of a system with perfect channel knowledge
[77]	DML-based scheme for CSI estimation where the BS and the users perform training collaboratively	The proposal can achieve higher estimation accuracy when the pilot overhead is reduced to 1/8
[78]	Proposed a deep unfolding network for the estimation of the downlink channel of an RIS wireless system	The proposal outperforms the least-squares estimator and has lower complexity using a smaller training overhead
[79]	DL-based estimation of CSI for RIS-aided and massive-MIMO systems which extracts the correlation features of subcarriers	The proposal outperforms in terms of spectral efficiency with a lower signaling overhead

Table 3. ML architectures, methods, and data sources for the estimation of CSI in RIS-aided communication systems.

Reference	ML Algorithm	Architecture and ML Methods	Database Used	Available Source Code?
[71]	DL	2 CNNs of 9 layers, dropout, SGD optimizer, minibatch	Synthetic	Yes, MATLAB R2018b
[72]	DL	Adapted NN with variable number of layers, ReLU activation function	DeepMIMO [56]	Yes, MATLAB R2018b
[73]	DL	Adapted NN with a variable number of layers, tanh activation function, Adam optimizer	Synthetic	No
[74]	DL	CNN of 15 convolutional layers, 64 filters of size $3 \times 3 \times 64$ , ReLU activation function, Adam optimizer	Synthetic	Yes, MATLAB R2018b and Python
[75]	DL	CNN-based EDSR, MDSR architectures, ReLU activation	Synthetic	Yes, Python
[76]	DL	CNN-based FFDNet, filters of $3 \times 3$ and conv2D layers, ReLU activation, batch normalization	Synthetic	Yes, Python
[77]	DL	CNN-based network, 32 filters, ReLU activation	DeepMIMO [56]	Yes
[78]	DL	NN-based network, ReLU activation, NMSE loss function	Synthetic	Yes, Python
[79]	DL	ANN with linear layers, sigmoid activation, Adam optimizer	Synthetic	Yes, Python
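
Tables 2 and 3 refer to the NMSE both as a performance metric and as a loss function. The following is a minimal sketch of that metric, assuming the commonly used definition NMSE = ||Ĥ − H||² / ||H||² over a flattened channel vector. The function names `nmse` and `nmse_db` are illustrative and are not taken from any of the reviewed works.

```python
import math


def nmse(h_true, h_est):
    """Normalized mean squared error between two equal-length channel vectors.

    Assumed definition: sum |h_est - h_true|^2 / sum |h_true|^2.
    Entries may be real or complex numbers.
    """
    if len(h_true) != len(h_est):
        raise ValueError("channel vectors must have the same length")
    err = sum(abs(e - t) ** 2 for t, e in zip(h_true, h_est))
    ref = sum(abs(t) ** 2 for t in h_true)
    if ref == 0:
        raise ValueError("reference channel has zero energy")
    return err / ref


def nmse_db(h_true, h_est):
    """NMSE expressed in decibels; a perfect estimate maps to -inf."""
    value = nmse(h_true, h_est)
    return float("-inf") if value == 0 else 10 * math.log10(value)


# Toy example with made-up channel coefficients (not measured data).
h = [1 + 1j, 2 - 1j, -0.5j]
h_hat = [1.1 + 0.9j, 1.9 - 1j, -0.4j]
print(f"NMSE = {nmse(h, h_hat):.4f} ({nmse_db(h, h_hat):.2f} dB)")
```

Lower values indicate a better estimate; reporting the metric in dB, as in `nmse_db`, is a common convention when comparing estimators across SNR levels.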

Table 4. Reviewed ML-based approaches for beamforming in RIS-assisted communication systems.

Reference	Contributions	Remarks
[84]	DL-based approach where the RIS learns the optimal interaction with the incident signal; only channels at the active elements are given	The proposal can achieve nearly optimal data rates without any knowledge of the RIS array geometry
[85]	DL-based approach for phase reconfiguration at the RIS that uses the local propagation environment	The model outperforms the classical least-squares estimator with low training overhead
[86]	Unsupervised DL-based approach for phase-shift prediction of RIS reflecting elements	The model is able to perform shift configuration in real time while maintaining a reasonable rate
[87]	Optimization of both beamformers at the BS and reflective coefficients at the RIS based on a DL scheme	With a small number of pilots, the proposal can learn to maximize a rate or minimize an objective
[88]	DL-based solution for optimizing the active beamforming from the BS to users and passive beamforming for the RIS	The proposed solution can achieve better BER performance than a conventional RIS system
[89]	ML approach to maximize the weighted sum-rate designed according to the property of product in RIS-aided systems	The proposal outperforms the block coordinate descent algorithm (BCD) solution
[83]	Joint design solution with beamforming at the access point and phase vector at RIS elements based on the BCD algorithm	The proposal achieves significant performance gain compared to benchmarks that use 100 RIS elements
[90]	DRL approach for the joint design of transmit beamforming and phase shifts at the RIS in an MU-MISO environment	The proposal overcomes hardware impairments for RIS-aided wireless systems
[91]	DRL solution for the real-time control of the phase at the RIS, which is independent of CSI	The proposal outperforms the model-free RIS control without sub-channel CSI
[92]	DRL approach for the joint design of the phase shift at the RIS and the control of trajectories of UAVs	The proposal improves the energy-efficiency performance of an RIS-assisted UAV system
[93]	DRL-based solution for the energy harvesting and phase shift control of an RIS-assisted UAV system	The proposal outperforms in terms of trade-off efficiency and practicality
[94]	DRL approach for the joint optimization of a UAV trajectory and the active/passive beamforming of the RIS	The proposal achieves greater performance in terms of energy savings and sum-rate

Table 5. ML algorithms, parameters, databases, and source code availability for the reviewed beamforming applications on RIS-aided wireless systems.

Reference	ML Algorithm	Architecture and ML Methods	Database Used	Available Source Code?
[84]	DL	Perceptron with variable fully connected layers, ReLU activation, MSE loss	DeepMIMO [56]	No
[85]	DL	2 NNs with fully connected layers, ReLU activation, Adam optimizer	Synthetic	No
[86]	DL	5 fully connected layers ANN with variable number of neurons, ReLU activation, Adam optimizer	Synthetic	No
[87]	DL	DNN to parametrize pilots, GNN to capture interactions among users, Adam optimizer, ReLU activation function	Synthetic	Yes, MATLAB R2018b and Python
[88]	DL	2 DNN (BS and UE) fully connected networks, ReLU and sigmoid activation functions, cross-entropy loss function, Adam optimizer	Synthetic	Yes, MATLAB R2018b and Python
[89]	DL	NN-based architecture, ReLU activation, Adam optimizer	Synthetic	Yes, Python
[83]	RL	DDPG algorithm with both actor and critic networks, tanh activation function	n/a	Yes, Python (reproduction work)
[90]	RL	SAC algorithm with 3 MLPs, Adam optimizer, Xavier initialization, tanh activation function	n/a	Yes, Python
[91]	RL	Q-learning-based scheme with DQN agent with 4 layers, 128 units, MSE loss function, Adam optimizer	n/a	Yes, Python
[92]	RL	DQN and DDPG algorithms, 2-layer networks with 30 units, ReLU activation, Adam optimizer	n/a	Yes, Python
[93]	RL	SD3 algorithm for dual-domain energy harvesting and joint optimization of phase shifts and transmit power	n/a	Yes, Python
[94]	RL	2 agents based on TD3 algorithm with 3 networks each, MLP architecture	n/a	Yes, Python
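
Several RL entries in Table 5 build on Q-learning, either directly or through a DQN that approximates the Q-table with a neural network. As a rough illustration, the tabular update rule that those agents approximate can be sketched as follows. The state and action encoding, the learning rate `alpha`, the discount factor `gamma`, and the function name `q_update` are assumptions chosen for illustration and do not reproduce any of the reviewed implementations.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q_table[next_state])      # greedy value of the next state
    td_target = reward + gamma * best_next    # bootstrapped return estimate
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]


# Hypothetical toy setting: 2 states, 2 actions (e.g., two candidate
# phase-shift configurations); the reward stands in for an achieved rate.
q = [[0.0, 0.0], [1.0, 2.0]]
new_value = q_update(q, state=0, action=1, reward=1.0, next_state=1,
                     alpha=0.5, gamma=0.5)
print(new_value)
```

A DQN replaces the table lookup `q_table[state]` with the output of a neural network, which is what makes the approach usable when the state space (for example, continuous channel observations) is too large to enumerate.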

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Ibarra-Hernández, R.F.; Castillo-Soria, F.R.; Gutiérrez, C.A.; García-Barrientos, A.; Vásquez-Toledo, L.A.; Del-Puerto-Flores, J.A. Machine Learning Strategies for Reconfigurable Intelligent Surface-Assisted Communication Systems—A Review. Future Internet 2024, 16, 173. https://doi.org/10.3390/fi16050173

