Deep Learning-Empowered RF Sensing in Outdoor Environments: Recent Advances, Challenges, and Future Directions

Nguyen, Quang D. M.; Lukito, William D.; Liu, Xuemeng; Liu, Chang

doi:10.3390/electronics14010125

Open Access Review

¹ School of Computing, Engineering and Mathematical Sciences, La Trobe University, Melbourne, VIC 3086, Australia

² School of Electrical and Information Engineering, The University of Sydney, Sydney, NSW 2006, Australia

* Author to whom correspondence should be addressed.

Electronics 2025, 14(1), 125; https://doi.org/10.3390/electronics14010125

Submission received: 26 November 2024 / Revised: 27 December 2024 / Accepted: 29 December 2024 / Published: 31 December 2024

(This article belongs to the Special Issue Next-Generation Sensing and Communication Technologies)

Abstract

Recently, with advancements in Deep Learning (DL) technology, Radio Frequency (RF) sensing has seen substantial improvements, particularly in outdoor applications. Motivated by these developments, this survey presents a comprehensive review of state-of-the-art RF sensing techniques in challenging outdoor scenarios with practical issues such as fading, interference, and environmental dynamics. We first investigate the characteristics of outdoor environments and explore potential wireless technologies. Then, we study the current trends in applying DL to RF-based systems and highlight its advantages in dealing with large-scale and dynamic outdoor environments. Furthermore, this paper provides a detailed comparison between discriminative and generative DL models in support of RF sensing, offering insights into both the theoretical underpinnings and practical applications of these technologies. Finally, we discuss the research challenges and present future directions of leveraging DL in outdoor RF sensing.

Keywords: RF signals; sensing; radio frequency; outdoor environments; deep learning; wireless technologies

1. Introduction

Radio Frequency (RF) sensing is a technology that utilizes wireless signals to detect and monitor objects, activities, or environmental changes. By interpreting changes in RF signal properties, such as amplitude, phase, and frequency, it can extract meaningful information about the surrounding environment [1]. This approach encompasses various platforms, including LoRa (“Long Range”), Wi-Fi, radar, Software-Defined Radio (SDR), and Radio Frequency Identification (RFID), each enabling specific methods for sensing applications [2]. RF sensing is known for being non-intrusive, cost-effective, and energy-efficient, making it a practical solution for monitoring multiple subjects simultaneously in large areas [3].

In recent years, RF sensing has expanded to outdoor environments, where it supports applications in various sectors [4,5,6,7]. Moreover, RF sensing is crucial across various industries, such as telecommunications and structural health monitoring [3,8,9]. Within the telecommunications sector, RF sensing optimizes network efficiency and spectrum management by providing precise positioning and spatial analysis, which enhance spectrum usage and enable real-time adjustments to fluctuating RF environments [10]. In the context of structural health monitoring, RF sensing technologies, such as RFID sensors, support non-contact, real-time surveillance of infrastructure [11]. RF sensing enables human activity detection for applications such as crowd management, movement tracking, and public safety, helping authorities monitor pedestrian flows during large events [1]. These systems rely on RF signals to detect the presence, movement patterns, and locations of individuals, enhancing urban traffic management [12] and security. Similarly, vehicles and unmanned aerial vehicles (UAVs) could benefit from RF sensing through technologies like millimeter-wave (mmWave) radar, which is able to provide real-time localization, object detection, and tracking. Additionally, RF sensing aids environmental monitoring by providing insights into air quality, temperature, and pollution levels, contributing to sustainable urban development [13]. However, outdoor environments pose unique challenges for RF sensing, including multipath interference, signal attenuation, and environmental variability [14]. Obstacles like buildings, vehicles, and natural terrain disrupt RF signals, causing reflection, scattering, or attenuation. Additionally, harsh weather conditions, including rain and high humidity, can negatively impact signal quality [15]. 
Furthermore, overlapping signals from other wireless systems operating in the same environment introduce significant interference, complicating the reliability of RF sensing [16,17]. To overcome these challenges, various signal processing techniques or machine learning algorithms have been applied, enhancing RF sensing capabilities in complex outdoor environments [17,18].

Deep Learning (DL), a subcategory of Machine Learning (ML), has revolutionized the way data-driven solutions address these complex challenges. Over the past decade, advancements in DL have demonstrated exceptional performance across diverse domains [19,20,21,22,23], driven by innovations in model architectures, such as Multilayer Perceptrons (MLPs), Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and generative models like Autoencoders (AEs), Generative Adversarial Networks (GANs), Diffusion Models (DMs) and Large Language Models (LLMs). While shallow ML models, such as Support Vector Machines, Decision Trees, and shallow Neural Networks, have been employed in RF sensing and classification tasks, they often rely on manually crafted features and may struggle in high-dimensional or unstructured data settings [24,25]. In contrast, DL models have shown remarkable adaptability and scalability, enabling them to process and analyze high-dimensional data with minimal manual intervention [26,27]. DL simplifies the process of traditional feature engineering by hierarchically representing raw data, resulting in more reliable solutions. It has emerged as a revolutionary approach in RF sensing, automating the extraction of complex features from raw signals and removing the reliance on manually crafted feature designs. Discriminative models like CNNs and transformers excel in tasks such as signal classification [28], anomaly detection [29], and behavioral prediction [30,31,32,33], while generative approaches, including GANs and DMs, enhance detection capabilities in noisy or low-data scenarios by modeling signal distributions. DL bridges the gap between traditional signal processing and intelligent automation, driving advancements in applications ranging from human activity recognition and health monitoring [1,6,34] to spectrum management and anomaly detection in communication systems [7,12,35,36,37]. 
These models adapt to dynamic environments, enabling continuous innovation and improving RF sensing performance in complex outdoor scenarios [1,4].

The main objective of this review is to systematically explore three key research questions that are crucial to advancing RF sensing in outdoor environments. Specifically, this work seeks to explore: (1) the recent advancements in outdoor RF sensing and their effectiveness in addressing unique challenges such as signal interference and environmental variability; (2) the wireless technologies commonly employed in outdoor applications and their suitability for various sensing requirements; (3) the role of DL in enhancing RF sensing performance, particularly regarding accuracy and adaptability in complex outdoor settings. By thoroughly examining these questions, this review offers a structured framework for understanding the current landscape and future directions of DL-enabled RF sensing in outdoor applications.

Although previous surveys have extensively addressed areas such as human sensing [1,6,34], autonomous vehicles and smart homes [4], material identification [38], or indoor localization [39,40,41], most of these works concentrate on controlled indoor environments, narrowly defined applications, or a single class of DL models, such as generative models [42]. In contrast, this study centers on the unique challenges and potential solutions for outdoor RF sensing, where dynamic, large-scale environments introduce distinct problems, such as fading, interference, and environmental variability. Rather than fixating on tightly constrained scenarios, we highlight the common obstacles that broadly characterize outdoor conditions. Building on this foundation, we assess how commonly employed wireless technologies adapt to these challenges, noting their relative strengths and limitations. Beyond technical considerations, we examine a broad range of DL models, showcasing how these approaches can enhance RF sensing performance and reliability in complex, real-world outdoor settings.

The novelty of this review lies in its emphasis on leveraging DL, including the latest trending models such as LLMs, to address these challenges in outdoor settings, offering new perspectives on integrating DL techniques to improve RF sensing performance in terms of accuracy, adaptability, and scalability. As far as we are aware, this is the first in-depth review to explore DL-empowered RF sensing in outdoor environments. The main contributions of this paper are summarized as follows:

Unlike existing surveys, this survey provides a comprehensive review of RF sensing in outdoor environments, identifying key challenges and examining wireless technologies best suited for these settings.
We provide a detailed examination of DL techniques, both generative and discriminative, alongside recent outdoor RF sensing studies. We highlight the benefits and limitations of each approach, compare these two modeling paradigms, and discuss the advantages and disadvantages of integrating them.
This survey explores the open challenges of leveraging DL in outdoor RF sensing and offers insights and possible solutions to guide future research.

The structure of this paper is shown in Figure 1. Following this introduction, Section 2 provides an overview of RF sensing in outdoor environments, with a focus on the unique challenges and the wireless technologies best suited to these settings. Building on this, Section 3 explores how DL models enhance RF sensing, categorizing prominent approaches and assessing their practical applications. Section 4 then addresses the core challenges in integrating DL with RF sensing, offering future research directions to address these issues. Finally, Section 5 encapsulates the key insights, underscoring potential avenues for advancing RF sensing in outdoor environments.

2. Overview of RF Sensing in Outdoor Environments

RF sensing uses wireless signals to detect and monitor objects, activities, or environmental changes by analyzing variations in properties such as amplitude, phase, and frequency. However, outdoor environments introduce unique challenges like fading and interference, which complicate control and significantly affect accuracy and reliability. These challenges include storm clouds obstructing line-of-sight (LoS) sensing signals, signal degradation due to multipath effects caused by reflections off objects, and interference from other wireless systems, as illustrated in Figure 2. Addressing these issues requires a thorough understanding of factors such as multipath effects, signal attenuation, environmental variability, and external interference. This section explores these challenges in detail and introduces wireless technologies best suited to ensure reliable performance in outdoor environments.

2.1. Challenges of RF Sensing in Outdoor Environments

RF sensing systems deployed in outdoor environments face numerous challenges that are not present or easily managed in indoor settings. In outdoor scenarios, environmental conditions are dynamic and difficult to control, leading to fluctuations in signal quality and reliability [43,44]. These challenges arise from several key factors, including signal attenuation, multipath propagation, non-LoS (NLoS) conditions, interference by other wireless systems, and environmental variability.

One of the foremost challenges in RF signal transmission is attenuation, where the signal strength diminishes as the distance between the transmitter and receiver increases. In free space, the path loss can be described by

$$PL_{F}(d)\,[\mathrm{dB}] = 20 \log_{10}\left(\frac{4 \pi d f}{c}\right), \tag{1}$$

where d is the distance between the transmitter and receiver, f is the frequency, and c is the speed of light. This equation quantifies the predictable loss in signal strength as a function of distance and frequency in an unobstructed environment.
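
As a quick numerical check of Equation (1), the following minimal sketch evaluates the free-space path loss for a few distances. The function name `free_space_path_loss_db` and the 2.4 GHz carrier are illustrative assumptions, not values taken from the surveyed works.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # speed of light c in m/s


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Equation (1): 20 * log10(4 * pi * d * f / c), returned in dB."""
    if distance_m <= 0 or frequency_hz <= 0:
        raise ValueError("distance and frequency must be positive")
    ratio = 4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    # Hypothetical example: a 2.4 GHz link at increasing distances.
    for d in (10.0, 100.0, 1000.0):
        loss = free_space_path_loss_db(d, 2.4e9)
        print(f"d = {d:7.1f} m -> {loss:.1f} dB")
```

Under this model, every tenfold increase in distance adds 20 dB of loss, and every doubling adds about 6 dB.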

To establish a connection between the free-space model and more complex real-world scenarios, the log-distance path loss model introduces a path loss exponent n, which accounts for various environmental factors that influence signal attenuation:

$$PL_{LD}(d)\,[\mathrm{dB}] = PL_{F}(d_{0}) + 10\,n \log_{10}\left(\frac{d}{d_{0}}\right), \tag{2}$$

where $d_{0}$ is a reference distance at which the path loss characteristics resemble free-space conditions as described in Equation (1). The path loss exponent n typically ranges from 2 to 6, depending on the propagation environment. Notably, when $n = 2$, Equation (2) simplifies to the free-space path loss model, highlighting how n modulates the free-space assumption to accommodate more complex environments.
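
The effect of the path loss exponent can be illustrated with a short sketch of Equation (2). The reference distance of 1 m, the 2.4 GHz carrier, and the function names are assumptions chosen for illustration only.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # speed of light c in m/s


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Equation (1): free-space path loss in dB."""
    ratio = 4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT
    return 20.0 * math.log10(ratio)


def log_distance_path_loss_db(distance_m: float, frequency_hz: float,
                              exponent: float, reference_m: float = 1.0) -> float:
    """Equation (2): PL_F(d0) + 10 * n * log10(d / d0), in dB.

    The default reference distance d0 = 1 m is an assumption for this sketch.
    """
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    reference_loss = free_space_path_loss_db(reference_m, frequency_hz)
    return reference_loss + 10.0 * exponent * math.log10(distance_m / reference_m)


if __name__ == "__main__":
    # Compare exponents within the 2 to 6 range quoted in the text.
    for n in (2.0, 3.0, 4.0, 5.0):
        loss = log_distance_path_loss_db(100.0, 2.4e9, n)
        print(f"n = {n:.0f}: {loss:.1f} dB at 100 m")
```

With n = 2 the result coincides with the free-space value, while each unit increase in n adds 20 dB at 100 m for a 1 m reference distance.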

As illustrated in Table 1, the path loss exponent n varies significantly across different environments, underscoring the inherent challenges in outdoor signal propagation. Indoor environments typically exhibit lower path loss exponents, ranging from 1.6 to 3 in LoS or obstructed settings within buildings or factories, except for obstructed in-building scenarios, where n reaches 4–6. In contrast, outdoor environments, such as urban or shadowed urban areas, show higher path loss exponents, ranging from 2 to 5, even under typical conditions. Although the obstructed in-building exponent surpasses outdoor values, this may result from experimental setups involving signal propagation through multiple floors, which may not represent realistic scenarios for modern sensing applications [45]. Moreover, outdoor environments introduce additional challenges, including typically longer transmission distances, which inherently increase path loss. These factors collectively indicate that outdoor sensing environments are generally more challenging than indoor ones.

The analysis of Table 1 highlights that outdoor environments present significant challenges for RF signal propagation. Urban areas, especially those with substantial obstructions like buildings and other infrastructure, exhibit higher path loss exponents, leading to more pronounced signal attenuation.

In these environments, attenuation is affected by both large-scale and small-scale fading. Large-scale fading describes the path loss caused by terrain and significant obstacles over extended distances. On the other hand, small-scale fading involves quick variations in signal strength due to multipath propagation, which occurs as signals reflect off nearby objects, like walls and vehicles [44]. These fading effects are essential to understanding RF performance across diverse environments.

A significant challenge in RF sensing is multipath propagation, which complicates reliable signal interpretation. In multipath environments, signals reflect off surfaces like buildings, vehicles, and the ground before reaching the receiver, creating NLoS conditions, as shown in Figure 2. As illustrated in Figure 3, each reflected signal can be modeled as a delayed, attenuated copy of the original, known as a delay tap. These overlapping signal paths interfere with each other at the receiver, causing distortions that obscure the true features of the original signal. This interference degrades the accuracy of the received signal, making it challenging to reliably interpret the original transmission or reconstruct the sensed environment [44]. Multipath propagation is particularly problematic in urban areas, where abundant reflective surfaces amplify this overlapping effect.
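
The delay-tap description above can be sketched as a tapped delay line, in which the received signal is the sum of delayed, scaled copies of the transmitted one. The tap delays and gains below are hypothetical values chosen only to make the overlap visible. They are not measurements, and real taps are generally complex-valued.

```python
def apply_multipath(signal, taps):
    """Sum delayed, attenuated copies of `signal`.

    `taps` is a list of (delay_in_samples, gain) pairs, one per propagation path.
    """
    max_delay = max(delay for delay, _ in taps)
    received = [0.0] * (len(signal) + max_delay)
    for delay, gain in taps:
        for index, sample in enumerate(signal):
            received[index + delay] += gain * sample
    return received


if __name__ == "__main__":
    # A single pulse reveals the taps directly (the channel impulse response).
    pulse = [1.0, 0.0, 0.0, 0.0]
    taps = [(0, 1.0), (2, 0.5), (3, -0.25)]  # direct path plus two reflections
    print(apply_multipath(pulse, taps))

    # A longer signal shows overlapping copies partially cancelling each other.
    print(apply_multipath([1.0, 1.0, 1.0], [(0, 1.0), (1, -1.0)]))
```

In the second call, the reflected copy cancels most of the direct signal, which is the kind of distortion that obscures the original waveform at the receiver.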

Reconfigurable intelligent surfaces (RIS) can help address multipath propagation, as they offer a way to create a controlled virtual LoS [47]. RIS technology can mitigate multipath fading by redirecting RF signals in real time and enhancing signal strength at the receiver [48]. However, optimizing RIS configurations for high accuracy in sensing tasks, such as human posture recognition, remains challenging [49]. Additionally, constraints such as the area and bandwidth of influence, as highlighted by Alexandropoulos et al. [50], require careful planning for RIS deployment in smart wireless environments.

Interference from other wireless systems is another challenge that can degrade the performance of outdoor RF sensing. The proliferation of communication technologies, such as Wi-Fi networks, cellular towers, and IoT (Internet of Things) devices, creates overlapping signals in the same frequency bands, producing noise that degrades signal quality [51]. This interference reduces the reliability of measurements and increases the likelihood of errors, particularly in crowded urban areas where spectrum congestion is common.

In addition to interference, environmental factors such as weather conditions and moving objects introduce further variability. Temperature, humidity, and precipitation affect signal propagation by altering the path and strength of the signals [15,52]. For example, rain or fog can absorb or scatter RF signals, causing attenuation and reducing the effective range of the system. Furthermore, outdoor environments are dynamic, with constantly changing elements, such as moving vehicles and pedestrians. These temporary obstructions can significantly affect signal paths in RF sensing applications. For example, vehicle vibrations and non-linear movements can degrade automotive radar sensor signals, impacting accuracy and detection probability [53].

2.2. Wireless Technologies for Outdoor Environments

To address the challenges mentioned above, the selection of appropriate wireless technologies is critical. Certain wireless technologies are less suitable for outdoor RF sensing applications due to limitations in range, susceptibility to interference, or insufficient robustness under varying conditions. Table 2 compares current wireless technologies, underlining the sensing range and power characteristics of each technology in experimental settings. For example, although Bluetooth is designed to operate effectively in noisy environments and can handle fading and interference, its relatively short range makes it less ideal for large-scale outdoor RF sensing [54]. ZigBee, while advantageous for low-power applications, faces challenges in outdoor settings due to interference, range limitations, and vulnerability to multipath effects and environmental fluctuations, limiting its effectiveness [55]. Similarly, Sigfox, though suited to low-power IoT use cases, is omitted here due to its limited data rate and daily message constraints [56], which are unsuitable for continuous outdoor sensing. In the following section, we discuss some of the best wireless technology candidates for RF sensing in outdoor environments.

Long Range (LoRa) [57] is a proprietary physical-layer technique based on spread spectrum modulation derived from chirp spread spectrum (CSS) technology, offering distinct advantages for RF-based sensing, particularly in large-scale applications. With an extended range often spanning several kilometers, LoRa effectively addresses the range limitations common in expansive outdoor environments. Its CSS modulation and various spreading factors provide resilience to intra-technology interference and some tolerance to multipath effects [65]. However, despite this robustness, LoRa remains vulnerable to inter-technology interference from other devices sharing unlicensed frequency bands [66]. LoRa’s minimal power consumption makes it especially well suited for applications that require continuous battery operation in sensing devices. Moreover, its affordability and ease of implementation increase accessibility for researchers and practitioners [67]. Nevertheless, the small bandwidth of LoRa limits its capacity to capture and transmit detailed information. As a result, LoRa is commonly used in scenarios where low data rates are sufficient, such as object localization [68] and environmental monitoring [69].

Millimeter-Wave (mmWave) technology is an advanced wireless communication technology operating in the frequency range of 30–300 GHz. The directional character of mmWave signals, meaning they travel in focused, narrow beams rather than spreading broadly, helps to reduce interference from other wireless systems, enhancing reliability in densely populated areas. Additionally, mmWave radars are effective in various weather conditions and NLoS scenarios, making them a viable alternative to sensor-based methods like cameras and LiDAR in complex environments [1]. Its high bandwidth enables the capturing of detailed data, significantly enhancing the resolution and accuracy of sensing applications compared to technologies like LoRa, Wi-Fi, and traditional RF systems. This makes mmWave ideal for high-precision tasks that require detailed environmental information [4]. In localization, Hao et al. [70] proposed a mmWave-based multipath-assisted localization model that leverages multipath effects to improve indoor localization accuracy, whereas traditional models often filter out these effects to minimize errors. This approach utilizes the high spatial resolution of mmWave to achieve precise position estimates. Despite these advantages, mmWave technology faces limitations in outdoor environments, particularly reduced range due to signal attenuation at high frequencies. This relationship follows from Equation (1), where free-space path loss increases with frequency. Thus, while mmWave excels in short-range, high-resolution sensing, its use in large-scale outdoor settings requires careful planning. Common applications include high-resolution localization [71], human sensing [1], and automotive radar for object detection in autonomous vehicles [4].

Long-Term Evolution (LTE) is a widely used wireless communication technology with emerging applications in RF sensing, particularly for outdoor environments. Originally designed for communication, LTE operates over a broad frequency range (450 MHz–3.8 GHz) with extensive coverage, which makes it adaptable for sensing tasks. Recent studies have utilized LTE to detect environmental changes, such as traffic and human activity, by analyzing signal reflections and interference patterns [72,73,74]. This approach allows the reuse of existing cellular infrastructure, potentially reducing additional sensor costs. However, LTE faces challenges in sensing specific targets due to interference from non-target objects, such as trees, vehicles, and pedestrians, over long distances. These factors, along with multipath effects, can degrade signal quality [72]. Additionally, integrating sensor networks into LTE may lead to network overload in dense areas [75]. Despite these issues, LTE’s vast infrastructure and broad coverage make it a promising option for large-scale outdoor sensing, such as asset tracking or environmental sensing, where extended connectivity is essential.

Wireless Fidelity (Wi-Fi) operates in the 2.4 GHz, 5 GHz, and 6 GHz frequency bands, as defined by IEEE standards 802.11b/g/n (2.4 GHz), 802.11a/n/ac (5 GHz), and 802.11ax (6 GHz) [60]. Due to advantages such as affordability and widespread availability, Wi-Fi sensing has emerged as a versatile technology for applications like human activity recognition and movement tracking. Compared to radar, Wi-Fi-based sensing provides extensive coverage with fewer blind spots [6]. Additionally, Wi-Fi signals are able to penetrate walls and other obstacles, making them suitable for through-the-wall sensing tasks, such as human presence detection. This capability, along with compatibility with commodity devices like smartphones, enables activity sensing within rooms from outdoor locations without the need for specialized or invasive equipment [76]. However, Wi-Fi signals are highly susceptible to various factors, including environmental variability and multipath effects, particularly in multi-object scenarios [6]. Ensuring reliable Wi-Fi sensing across diverse real-world settings is challenging, as it involves complex optimization problems. Moreover, Wi-Fi was not originally intended for sensing; thus, using it for such applications can degrade communication performance due to network interference and limited resources [77].

Radio Frequency Identification (RFID) technology, encompassing both passive and active systems, is widely utilized in RF sensing applications for object tracking and environmental monitoring [78]. An RFID system comprises tags, readers, and antennas. Passive tags, which rely on energy from the RFID reader to operate, are advantageous for long-term monitoring applications, such as environmental sensing or asset tracking, due to their low power requirements [79]. However, passive tags are more susceptible to environmental factors like rain, humidity, and temperature variations, which can cause phase drift and signal attenuation, reducing accuracy in outdoor location tracking [80]. In contrast, active RFID tags have an internal power source, enabling them to broadcast signals continuously or at specified intervals. This results in a greater range and improved signal strength, making them especially suitable for outdoor applications with significant distances between the tag and reader. Due to their ability to emit signals and operate at higher frequencies than passive RFID, active RFID tags can also be effectively used in localization systems [81].

Ultra-Wideband (UWB) radar has been employed in a variety of military and civilian contexts for high-resolution sensing and imaging [82]. Recently, it has gained recognition as an effective solution for accurate localization, particularly in scenarios where global navigation satellite systems (GNSS) are unavailable, due to its exceptional precision in range estimation [83,84]. Compared to other common localization technologies like Bluetooth, ZigBee, and RFID, UWB offers improved ranging accuracy, effective multipath resolution, and strong resistance to interference, attributed to its wide bandwidth of 800 MHz [62]. This capability is particularly valuable in scenarios requiring high-precision localization [85]. Despite these advantages, UWB’s range is limited compared to other RF sensing technologies like LoRa or LTE, making it more suitable for localized sensing tasks than for long-range applications.

Terahertz (THz) radiation occupies the electromagnetic spectrum between the microwave and far-infrared ranges, serving as a link between electronics and optics within the so-called “terahertz gap” [86]. THz wavelengths are generally defined as ranging from 1.0 to 0.1 mm, corresponding to frequencies of 300 GHz to 3 THz [87]. The short wavelengths of THz radiation enable high-resolution imaging, which is beneficial for detailed inspections and for identifying fine structures in materials. Additionally, THz photon energies are far lower than those of X-rays, making them non-ionizing and safe for human-involved applications, such as imaging and security screening [87]. However, THz radiation has a significant drawback: high signal loss. THz waves experience greater free-space path loss than lower frequency bands, as path loss increases with the square of the frequency, as shown in Equation (1). Consequently, THz radiation is mainly applied to short-range, high-resolution tasks, such as imaging [88], food quality inspection [89], and various pharmaceutical industry processes [90]. Finally, Naftaly et al. [91] reviewed both current and emerging industrial applications, highlighting the significant market potential of THz technology.
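
The wavelength range and the frequency dependence of loss quoted above can be checked with a few lines of arithmetic. The sketch below assumes the free-space model, in which linear loss scales with the square of the frequency. The function names and the 3 GHz comparison band are illustrative assumptions.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # speed of light c in m/s


def wavelength_mm(frequency_hz: float) -> float:
    """Free-space wavelength lambda = c / f, converted to millimetres."""
    return SPEED_OF_LIGHT / frequency_hz * 1e3


def extra_free_space_loss_db(frequency_hz: float, baseline_hz: float) -> float:
    """Additional free-space loss at `frequency_hz` relative to `baseline_hz`.

    Free-space loss grows with f squared, i.e. 20 * log10(f / f_baseline) dB
    at a fixed distance.
    """
    return 20.0 * math.log10(frequency_hz / baseline_hz)


if __name__ == "__main__":
    for f in (300e9, 3e12):
        print(f"{f / 1e9:6.0f} GHz -> wavelength {wavelength_mm(f):.3f} mm")
    # Hypothetical comparison: 300 GHz against a 3 GHz band at equal distance.
    print(f"extra loss: {extra_free_space_loss_db(300e9, 3e9):.1f} dB")
```

A hundredfold increase in frequency therefore costs 40 dB of additional free-space loss at the same distance, before any atmospheric absorption is considered.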

In summary, each wireless technology presents distinct characteristics, strengths, and limitations. Selecting an appropriate one based on the specific requirements of outdoor sensing tasks is essential for effectively handling diverse challenges—whether that involves achieving long-distance coverage or capturing detailed environmental information. The chosen technology should balance cost-effectiveness, durability, and performance to ensure reliable operation in complex outdoor conditions. However, selecting the right wireless technology is only part of the solution. Implementation strategies also play a critical role in enhancing overall system performance. Data-driven approaches, particularly those based on DL, have shown substantial promise in improving accuracy, robustness, and adaptability in dynamic scenarios. In the next section, we will explore how DL can bolster RF sensing performance. Moreover, to clarify the range of technologies incorporated in DL-empowered RF sensing research, Section 3.2 reviews recent work employing various wireless technologies, such as Wi-Fi, Bluetooth, and mmWave, summarized in Table 3.

3. The Role of Deep Learning in RF Sensing

Over the past decade, DL has made significant strides [19,20,21,22,23], influencing a wide range of fields, including RF sensing [6,17,18,34]. While wireless technologies provide the foundational infrastructure for outdoor RF sensing, the performance of traditional approaches remains limited in addressing the complexities of outdoor environments. DL addresses these limitations by automatically identifying patterns and extracting essential features, enabling improved accuracy and adaptability in outdoor scenarios. The following section explores the role of DL in RF sensing, focusing on common DL models and analyzing recent studies that leverage these techniques for outdoor applications.

3.1. Deep Learning Models in RF Sensing

Before discussing model architectures, it is essential to differentiate between two key approaches to learning patterns in data: discriminative and generative models [109]. Discriminative models aim to model the connections between input and output variables by directly modeling the conditional probability $P(Y \mid X)$, where X describes the input and Y the output. This method allows them to predict the outcomes corresponding to observed data, making them effective for classification tasks. In contrast, generative models seek to understand the data generation process by modeling the joint probability $P(X, Y)$. They model the underlying data distribution, allowing them to produce new samples that closely mirror the input data [109]. This distinction is crucial in RF sensing; while discriminative models are effective for various prediction tasks, such as detecting human presence or specific movements from RF signals, generative models excel in simulating or reconstructing RF environments. They allow for more accurate predictions, particularly in scenarios with limited data or unseen environments [42]. In the following sections, we will briefly explain the architecture and mechanisms of popular DL models used in RF sensing, as illustrated by Figure 4 and Figure 5. Table 4 presents the type of approach, objectives, advantages, and disadvantages of each model.
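
The difference between the two views can be made concrete with a toy example. The sketch below uses a small, entirely hypothetical joint distribution over a binary RF feature X and a presence label Y. The probabilities are invented for illustration and do not come from any dataset. The conditional query corresponds to the discriminative view, while sampling from the joint corresponds to the generative view.

```python
import random

# Hypothetical joint distribution P(X, Y): X is a coarse RF feature,
# Y indicates whether a person is present. Values are illustrative only.
JOINT = {
    ("low_variance", "absent"): 0.40,
    ("low_variance", "present"): 0.10,
    ("high_variance", "absent"): 0.05,
    ("high_variance", "present"): 0.45,
}


def conditional(joint, x):
    """Discriminative view: P(Y | X = x) = P(x, Y) / sum over y of P(x, y)."""
    marginal = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / marginal for (xi, y), p in joint.items() if xi == x}


def sample(joint, rng):
    """Generative view: draw a new (x, y) pair from the joint P(X, Y)."""
    pairs = list(joint)
    weights = [joint[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights, k=1)[0]


if __name__ == "__main__":
    print(conditional(JOINT, "high_variance"))
    rng = random.Random(0)
    print([sample(JOINT, rng) for _ in range(3)])
```

A model of the joint distribution can always answer the conditional query, whereas a model trained only on the conditional cannot generate new inputs. This is one reason generative models are useful when data are scarce.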

Multilayer Perceptrons (MLPs) form the core of neural network architectures, comprising an input layer, multiple hidden layers, and an output layer, all interconnected as depicted in Figure 4. Each neuron in a layer connects to every neuron in the subsequent layer. To capture non-linear relationships, MLPs utilize activation functions such as ReLU or sigmoid, enabling the learning of complex patterns. Data flows through the network in a feed-forward fashion, starting from the input layer and passing through hidden layers to the output layer without any loops or recurrences. Training involves backpropagation, which updates weights and biases by minimizing the loss function based on the error between actual and predicted outputs. MLPs are well suited for structured data and simple tasks like classification. In RF sensing, they can utilize channel state information (CSI) for downstream tasks such as localization [110] and activity recognition [111].
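
The feed-forward pass described above can be sketched in plain Python. The layer sizes and weights below are hypothetical and untrained, and the softmax output is one common choice for classification that the text does not prescribe. Backpropagation is omitted for brevity.

```python
import math


def relu(values):
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """Fully connected layer: weights[j][i] links input i to output neuron j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(values):
    """Numerically stable softmax, turning scores into class probabilities."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_forward(inputs, layers):
    """Feed-forward pass: ReLU on hidden layers, softmax on the output layer."""
    activations = inputs
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = layers[-1]
    return softmax(dense(activations, weights, biases))


if __name__ == "__main__":
    # Hypothetical 2-2-2 network with hand-picked (untrained) parameters.
    layers = [
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # hidden layer
        ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),  # output layer
    ]
    print(mlp_forward([1.0, -2.0], layers))
```

In an RF sensing setting, the input vector would typically hold features such as CSI amplitudes, and training would adjust the weights rather than fixing them by hand.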

Convolutional Neural Networks (CNNs) are specifically designed to analyze and extract spatial features from high-resolution 2-dimensional matrices. Building upon the MLP architecture [112], CNNs incorporate convolutional and pooling layers, which are particularly effective in reducing the output size of each layer, especially for inputs with high dimensions, as illustrated in Figure 4. Convolutional layers utilize filters (kernels) to scan the input image and identify patterns such as edges, textures, or shapes, generating feature maps that emphasize various characteristics of the input. Pooling layers follow, reducing the spatial dimensions of these feature maps while retaining critical features and lowering computational demands. A fully connected layer is then used to link every neuron from the previous layer to those in the next, producing the final classification output. This hierarchical feature extraction enables CNNs to excel in tasks like image classification, object detection, and segmentation by progressively capturing more abstract features at deeper layers [113]. Moreover, when RF data are structured appropriately, they can display spatial patterns similar to those in large-scale 2-dimensional matrices, making CNNs highly effective for RF representation learning [30,103].
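To make the roles of the convolutional and pooling layers concrete, the following sketch applies one hand-set kernel and one max-pooling step to a toy 5 × 5 input. The input matrix and the edge-detecting kernel are assumptions for illustration; in a trained CNN the kernel values are learned, and, as in most DL frameworks, the "convolution" here is implemented as cross-correlation (no kernel flip).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    return [[max(feature_map[r + i][c + j]
                 for i in range(size) for j in range(size))
             for c in range(0, len(feature_map[0]) - size + 1, size)]
            for r in range(0, len(feature_map) - size + 1, size)]

# Toy input with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
vertical_edge_kernel = [[-1, 1],
                        [-1, 1]]

feature_map = conv2d_valid(image, vertical_edge_kernel)  # 4 x 4
pooled = max_pool2d(feature_map)                         # 2 x 2
print(feature_map[0], pooled)
```

The feature map responds only where the edge lies, and pooling halves each spatial dimension while keeping that response, which is the size reduction the paragraph refers to.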

Recurrent Neural Networks (RNNs) are designed to address sequence prediction tasks by incorporating recurrent layers, where the output of a neuron at one time step is used as input for the same neuron at the next step. This structure creates a hidden state that acts as a memory, carrying information across the sequence. Despite their utility, RNNs face challenges such as vanishing and exploding gradients, which can lead to diminishing or excessively growing weight updates, complicating training and reducing their overall effectiveness [114]. To overcome these issues, advanced variants like Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks were developed. These models introduce mechanisms to handle long-term dependencies in sequential data by replacing the traditional single non-linear activation function with multiple specialized functions and incorporating copying and concatenation mechanisms, enabling the retention of crucial information over extended sequences. Despite these advancements, RNNs still face challenges, such as sequential processing and limited memory capacity, which restrict parallel computation and complicate the modeling of very long-term dependencies, even with LSTM and GRU networks [115]. Nonetheless, RNNs remain valuable for applications involving RF time-series data, such as activity recognition [105,106].
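The hidden-state recurrence can be illustrated with a plain (non-gated) RNN cell. The sketch below is a simplified example with hypothetical weights, not an LSTM or GRU: a single input pulse is followed by zeros, so the later hidden states show both the memory effect and how quickly the stored information fades when the recurrent weights are small.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One recurrence: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return [math.tanh(sum(wx * x for wx, x in zip(w_x[j], x_t))
                      + sum(wh * h for wh, h in zip(w_h[j], h_prev))
                      + b[j])
            for j in range(len(b))]

def run_rnn(sequence, w_x, w_h, b):
    """Process a sequence step by step, returning every hidden state."""
    h = [0.0] * len(b)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# Hypothetical weights: 1 input feature, 2 hidden units.
w_x = [[0.5], [-0.5]]
w_h = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]

sequence = [[1.0], [0.0], [0.0]]  # one pulse followed by silence
states = run_rnn(sequence, w_x, w_h, b)
print(states)
```

Because each state depends on the previous one, the loop cannot be parallelized across time steps, which is the sequential-processing limitation noted above.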

Autoencoders (AEs) [116] are unsupervised learning neural networks designed to learn compact data representations through compression and reconstruction. Their architecture is composed of two main parts: an encoder and a decoder, as illustrated in Figure 5. The encoder compresses the input data into a lower-dimensional latent space, capturing its key features, while the decoder reconstructs the original data from this reduced representation. The network is trained by minimizing the reconstruction loss, which quantifies the difference between the input and its reconstruction, enabling AEs to effectively capture meaningful and concise representations of the data. This capability makes AEs well suited for tasks like dimensionality reduction and anomaly detection [117]. Variational AEs (VAEs) are a type of AE that encode inputs as probability distributions, typically Gaussian, defined by mean and variance parameters. This probabilistic framework allows VAEs to model data variability and generate new data points by sampling from these distributions, making them effective for tasks such as data generation, image synthesis, and anomaly detection [118]. More advanced variants, including convolutional and denoising AEs, refine the basic structure for specialized purposes, such as bridging gaps in radio maps [93] or filtering noise from RF signal data [119]. Additionally, sparse AEs, which introduce sparsity constraints on hidden units, are often used to learn efficient representations in scenarios with limited data or significant noise [120].
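The encode, decode, and reconstruction-loss steps can be shown with a deliberately tiny linear autoencoder. This is a sketch under strong assumptions: the 2-D inputs, the 1-D latent space, and the direction `U` are hand-set rather than learned, purely to show why reconstruction error can flag anomalies.

```python
import math

# Hand-set (NOT learned) latent direction: data are assumed to lie near x1 == x2.
U = (1 / math.sqrt(2), 1 / math.sqrt(2))

def encode(x):
    """Compress a 2-D sample into a 1-D latent code (projection onto U)."""
    return U[0] * x[0] + U[1] * x[1]

def decode(z):
    """Reconstruct a 2-D sample from the latent code."""
    return [z * U[0], z * U[1]]

def reconstruction_error(x):
    """Mean squared error between a sample and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

normal_sample = [1.0, 1.0]      # consistent with the assumed structure
anomalous_sample = [1.0, -1.0]  # violates it

print(reconstruction_error(normal_sample))
print(reconstruction_error(anomalous_sample))
```

A trained AE plays the same role with non-linear layers: samples resembling the training data reconstruct well, while unusual samples produce a large loss that can be thresholded for anomaly detection.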

Generative Adversarial Networks (GANs) [121] are a category of generative DL models that utilize a generator-discriminator architecture to produce realistic data samples. GANs are composed of two neural networks: a generator, which creates synthetic data samples (e.g., RF signals, images) from random noise, and a discriminator, which attempts to distinguish between real and synthetic samples, as depicted in Figure 5. These networks are trained simultaneously in an adversarial setup, where the generator strives to generate increasingly convincing samples, and the discriminator works to accurately differentiate between real and generated data. This adversarial training continues until the generator becomes proficient enough that the discriminator can no longer reliably differentiate between the two. GANs are particularly valuable for tasks such as data generation, augmentation, and anomaly detection. However, training GANs can be challenging due to issues like mode collapse and unstable convergence [122]. Conditional GANs (cGANs) extend the standard GAN framework by incorporating additional conditional information, such as class labels or attributes, to guide the generation process [123]. By conditioning both the generator and discriminator on this auxiliary information, cGANs enable more controllable and targeted data generation, addressing some limitations of traditional GANs. In RF sensing, GANs find applications in areas like radio map construction [30,92].
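The adversarial objective can be written down compactly. The sketch below computes the two losses from discriminator outputs only; the networks themselves are omitted, the probability values are made up, and the generator loss uses the commonly adopted non-saturating form, which is an assumption rather than something the surveyed works necessarily share.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, generated ones 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating generator loss: reward fakes that the discriminator scores as real."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that a sample is real).
early_real, early_fake = [0.9, 0.8], [0.1, 0.2]   # discriminator is winning
balanced_real, balanced_fake = [0.5, 0.5], [0.5, 0.5]  # it can no longer tell

print(discriminator_loss(early_real, early_fake), generator_loss(early_fake))
print(discriminator_loss(balanced_real, balanced_fake), generator_loss(balanced_fake))
```

When the discriminator outputs 0.5 for every sample, its loss settles at 2 ln 2 and the generator loss at ln 2, which corresponds to the end state described above where real and generated data are indistinguishable.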

Diffusion Models (DMs) [124] are a type of generative model that uses a two-phase process involving forward and reverse diffusion. In the forward phase, noise is gradually added to the data over multiple steps, transforming them into a noise distribution and obscuring the original structure. The reverse phase, handled by a neural network, learns to remove this noise step-by-step to reconstruct the original data. This training process enables the network to generate new samples from noise by reversing the diffusion. DMs are highly effective for creating high-quality images, often surpassing traditional methods like GANs in stability and output quality. Their probabilistic framework offers controlled data generation, making them appropriate for synthetic RF signal generation [31,94], or RF data augmentation [95].
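The forward (noising) phase has a convenient closed form that is easy to sketch. The example assumes a linear noise schedule with 1000 steps and the start and end values shown below, which are common defaults rather than settings taken from [31], [94], or [95]; the learned reverse phase is not implemented.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (assumed schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alpha(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars

def q_sample(x0, t, alpha_bars, rng):
    """Jump directly to step t: sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]

betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alpha(betas)

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]                      # stand-in for clean signal samples
slightly_noisy = q_sample(x0, 0, alpha_bars, rng)
mostly_noise = q_sample(x0, 999, alpha_bars, rng)
print(alpha_bars[0], alpha_bars[-1])
```

Because the signal weight shrinks toward zero at the final step, the data end up close to pure Gaussian noise; the reverse network is trained to undo this corruption one step at a time.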

Large Language Models (LLMs), such as GPT (Generative Pre-trained Transformer) [125] and LLaMA [126], are built upon a transformer architecture [127]. As displayed in Figure 5, the input text is tokenized and then encoded with positional information to capture word order before being passed through multiple blocks comprising self-attention and feed-forward neural network layers. This structure concludes with a softmax output layer for predicting the next sequence. The transformer architecture enables parallelized training, significantly accelerating the training process, particularly when handling vast datasets. Moreover, the self-attention mechanism determines the contextual relevance of each word in relation to others, allowing the model to capture long-range dependencies and context, a capability that addresses the limitations of RNNs in maintaining long-term memory. During training, LLMs are exposed to extensive text corpora and are typically optimized with self-supervised objectives, such as masked language modeling and next-word prediction. These approaches result in models proficient at diverse natural language processing tasks, such as text generation, translation, and summarization. Although limited research has explored the application of LLMs to RF sensing, their emerging capabilities, such as in-context learning, instruction following, and adaptability to varying conditions, suggest the potential for RF applications, especially in scenarios characterized by variability in devices and environments.
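The self-attention step at the heart of the transformer can be sketched as scaled dot-product attention. This is a simplified illustration: real transformer blocks first map tokens to queries, keys, and values with learned projection matrices and use multiple heads, whereas here the toy 2-D token vectors are used directly for all three roles.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention; returns outputs and attention weights."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[d] for wj, v in zip(w, values))
                        for d in range(len(values[0]))])
    return outputs, weights

# Toy token embeddings: tokens 0 and 2 are identical, token 1 differs.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens, tokens, tokens)
print(weights[0])
```

Token 0 attends equally to itself and to the identical token two positions away, regardless of distance, and every row of weights is computed independently, which is what allows the parallel training mentioned above.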

3.2. Deep Learning-Empowered Outdoor RF Sensing

Having explored a range of DL architectures commonly employed in RF sensing, including foundational models for feature extraction and advanced architectures for generating and interpreting complex data patterns, we now turn to recent studies that leverage these techniques specifically for outdoor RF sensing. These studies demonstrate how various DL approaches address outdoor challenges such as signal interference, multipath effects, and environmental variability, showcasing the practical applications and effectiveness of these models in real-world scenarios.

3.2.1. Generative Models

Applications of generative models in recent RF sensing research have focused on enhancing data reconstruction, augmentation, and synthesis to address challenges such as sparse datasets, missing signal information, and low signal-to-noise ratios (SNRs). Leveraging their powerful ability to generate synthetic RF data, these models can fill in missing points in radio maps and create diverse scenarios for model training. For example, techniques like GANs and DMs are used to reconstruct and predict RF signals by learning from visual or prior RF data. Conditional GANs [92] support accurate localization and traffic monitoring by correcting errors in situations where the geographical map data are flawed, while DMs, such as RF-Diffusion [31], achieve high accuracy in reconstructing RF signals across various domains. Additionally, models like RFGen [94] synthesize realistic RF data using simulation-based approaches, which significantly enhance applications like gesture recognition and posture estimation in previously unseen environments. By generating and expanding data, generative DL models improve the performance and generalization capabilities of RF sensing systems, enabling more accurate detection, classification, and localization tasks.

Moreover, generative models are also highly effective in multimodality tasks within RF sensing, where data from different sources, or modalities, are combined to improve learning and inference. In multimodality, models can use information from one type of data, such as RF signals, visual or audio, to enhance understanding of another type. This approach is particularly beneficial when one modality has sparse or incomplete data, as models can leverage information from another modality to fill in gaps and improve accuracy [128]. With recent advances in Multimodal Large Language Models (MLLMs), DL now enables the integration of multiple modalities, such as LiDAR, images, and RF signals, creating unified representations of surrounding environments and providing comprehensive datasets to train other models more effectively.

Radio map construction is a critical application in outdoor RF sensing, allowing for the creation of spatial maps that capture signal characteristics across different areas. This process is essential for tasks like localization and environmental monitoring, where accurately mapping RF signals enhances situational awareness. Generative models, particularly GANs, are well suited for this task due to their architecture, which can learn complex data distributions and recover missing signal information. For example, CoSense [30] utilizes conditional GANs to reconstruct incomplete mmWave signals by learning patterns from visual data, addressing the challenge of weak reflectivity in mmWave signals and filling in gaps within the radio map. CoSense combines networking and sensing tasks through time-sharing and residual networks to dynamically predict heatmaps, which improves traffic monitoring and pedestrian safety at corners, even in adverse weather conditions. These models were developed and assessed using a custom dataset gathered over a six-month period with a unique hardware configuration, including a mmWave cascade device and a ZED stereo camera. In terms of performance, CoSense achieves a median Intersection over Union (IoU) of 0.55 for pedestrians and 0.63 for vehicles. IoU is a metric used to evaluate the overlap between the predicted area (e.g., of an object) and the actual area, with values closer to 1 indicating higher accuracy. Therefore, these IoU scores demonstrate a reasonable level of accuracy for detecting and localizing pedestrians and vehicles while maintaining network throughput and reducing sensing overhead by 70%. However, the system’s reliance on a custom-built hardware setup for data collection limits its reproducibility and broader adoption. Additionally, the mean depth prediction error increases for vehicles and other large objects due to their complex structure and the variability of mmWave reflections.
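As a concrete reading of the IoU metric, the sketch below computes it for two axis-aligned rectangles. The box representation `(x_min, y_min, x_max, y_max)` and the coordinates are assumptions for illustration; CoSense may compute IoU over heatmap regions rather than boxes, so this shows only the definition, not that system's evaluation code.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x_min, y_min, x_max, y_max)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    inter_box = (max(box_a[0], box_b[0]), max(box_a[1], box_b[1]),
                 min(box_a[2], box_b[2]), min(box_a[3], box_b[3]))
    intersection = box_area(inter_box)
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0

predicted = (0.0, 0.0, 4.0, 4.0)     # hypothetical predicted region
ground_truth = (2.0, 0.0, 6.0, 4.0)  # hypothetical actual region

print(iou(predicted, ground_truth))
```

Here half of each box overlaps the other, yet the IoU is only one third, which gives a sense of how demanding the metric is and why median values around 0.55 to 0.63 can still reflect usable localization.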

In real-time RF sensing scenarios, transmitter information, such as the location and power of base stations or mobile devices, is often unavailable due to practical limitations. Zhang et al. [92] develop a cooperative radio map estimation (CRME) approach for 6G networks, which estimates spatial radio signal distribution; unlike traditional radio map estimation, this approach does not rely on transmitter information such as location and power. The authors implement a conditional GAN framework, called GAN-CRME, which leverages distributed received signal strength (RSS) samples from mobile users to infer the radio map. The GAN-CRME method achieves a root mean squared error (RMSE) of approximately 0.009 when using over 1000 RSS samples, outperforming the state-of-the-art RadioUNet model, which scores 0.021 in RMSE. Additionally, GAN-CRME offers low data acquisition costs and computational complexity, making it suitable for real-time applications in dynamic environments.
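For reference, the RMSE used to compare radio map estimators can be computed as follows. The toy map values are hypothetical; only the two reported scores (0.009 and 0.021) come from the text, and the relative reduction derived from them is simple arithmetic rather than a figure stated in [92].

```python
import math

def rmse(predicted, reference):
    """Root mean squared error between two equally sized lists of values."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical flattened radio map values (normalized signal strength).
toy_estimate = [0.10, 0.20, 0.30, 0.40]
toy_truth = [0.10, 0.25, 0.30, 0.35]
toy_rmse = rmse(toy_estimate, toy_truth)

# Relative RMSE reduction implied by the two reported scores.
relative_reduction = (0.021 - 0.009) / 0.021

print(toy_rmse, relative_reduction)
```

The reported scores correspond to an RMSE reduction of roughly 57% relative to RadioUNet, assuming both values were measured on the same normalized scale.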

Teganya and Romero [93] developed a DL approach using an AE for accurate radio map estimation by filling in the missing points in the radio map, focusing on power spectral density (PSD) estimation in wireless communication networks. The authors proposed a deep completion AE architecture that leverages spatial and frequency information. They trained the model using both synthetic and real-world data, applying techniques such as denoising and transfer learning to enhance the network’s performance. The proposed method achieves a 30% reduction in root mean square error (RMSE) compared to existing techniques when using synthetic data, and up to 100% improvement in RMSE for real-world data in urban environments, demonstrating its superior performance in both scenarios.

Generative DL models can also produce synthetic data, which helps address sparse datasets when training models for downstream tasks. For example, RFGen [94] enhances the generalization capabilities of mmWave sensing systems by synthesizing RF data using cross-modal diffusion and ray tracing simulation with a prompting technique. The authors integrate a physics-based ray tracing simulator with a DM to generate diverse 3D scenes and corresponding RF data. They utilize RF sensing prompts to create specific applications and environments, combining structured multipath noise and environmental context through a path-based intermediate representation to simulate realistic RF signals. RFGen shows a significant improvement in unseen scenarios, increasing gesture recognition accuracy to above 90% across various orientations and reducing average posture estimation errors by up to 90% compared to baseline models. Nonetheless, the synthetic data produced by RFGen might not completely represent the complexity of real-world settings, particularly those with dense clutter and complicated object interactions. Additionally, creating a dataset using RFGen still requires significant time and computational resources.

Among significant works using DMs, Chi et al. [31] introduced RF-Diffusion, a versatile generative DL model for RF signals based on DMs. This approach leverages diffusion processes across time and frequency domains to reconstruct RF signals accurately. The model is trained on diverse RF signal datasets, including Wi-Fi signals (802.11 transmissions) for gesture recognition, Frequency Modulated Continuous Wave (FMCW) radar signals for object detection, and simulated 5G channel estimation signals under varying noise and interference conditions. These datasets encompass both real-world and synthetic data, enabling the model to capture complex RF signal characteristics, such as amplitude, phase, and frequency variations. RF-Diffusion employs a complex-valued neural network to handle the inherent properties of RF signals and incorporates attention-based modules for effective feature extraction. It achieves an average Structural Similarity Index Measure (SSIM) of 0.81 for Wi-Fi signals and 0.75 for FMCW radar signals, surpassing baseline models by up to 71.3%.

Due to the limitations of traditional augmentation methods, such as rotation and flipping, and the modest improvements achieved by advanced techniques like GANs in automatic modulation classification (AMC), Xu et al. [95] introduce a Diffusion-based Radio Signal Augmentation (DiRSA) algorithm, enhancing dataset diversity and reducing overfitting for DL models. DiRSA reconstructs radio signals from noise using modulation category prompts to expand and diversify the training dataset, specifically targeting scenarios with limited data availability. DiRSA improves classification accuracy by up to 6% at SNR levels above 0 dB when applied with the LSTM model, compared to traditional methods like rotation and flipping.

Chang et al. [96] improved the accuracy of radio frequency fingerprint (RFF) recognition by developing a method that integrates prior information of wireless signals to combine the characteristics of transient and steady-state signals for enhanced classification, particularly under low SNR conditions. The proposed model segments incoming signals into transient and steady-state components and uses an autoencoder for denoising transient features. The model fuses these features to form a comprehensive RFF, leveraging high-SNR prior information and reaching 100% recognition accuracy on the LFM6 dataset and 99.91% accuracy on the LFM15 dataset at 5 dB SNR. It also maintains 90.64% accuracy even under challenging conditions with -5 dB SNR, demonstrating its robustness and effectiveness in various environments. However, acquiring high-SNR original signals, which is required to guide the denoising process, can be expensive.
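To give a sense of what the SNR levels quoted above mean, the sketch below adds white Gaussian noise to a clean tone at a target SNR in dB and then measures the result. The real-valued sine wave and the sample count are assumptions for illustration and are unrelated to the LFM6 or LFM15 datasets.

```python
import math
import random

def mean_power(samples):
    """Average power of a real-valued signal."""
    return sum(s * s for s in samples) / len(samples)

def add_awgn(signal, target_snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power matches the target."""
    noise_power = mean_power(signal) / (10 ** (target_snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]

def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, given the clean reference signal."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(noise))

rng = random.Random(0)
clean = [math.sin(2 * math.pi * 0.05 * t) for t in range(5000)]  # hypothetical tone
noisy = add_awgn(clean, -5.0, rng)

print(measured_snr_db(clean, noisy))
```

At -5 dB the noise carries about 3.2 times the power of the signal (10^0.5), which is why recognition at that level is considered challenging.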

Another significant work, Babel [98], aims to enhance multimodal sensing through a technique known as expandable modality alignment. Babel incorporates a pre-trained modality tower (BERT, STGCN, ResNet3D, ViT, and Point Transformer) to encode multiple sensing modalities, such as inertial measurement unit (IMU) data, which capture motion, as well as video, Wi-Fi signals, mmWave signals, and LiDAR data, into a unified representation for downstream sensing tasks. The results show that the framework improves human activity recognition accuracy by up to 22% compared to state-of-the-art methods, with an average accuracy improvement of 12.02% across six modalities. However, despite the innovative approach, the work faces a key limitation with the datasets used, such as UTD-MHAD, MM-Fi, OPERANet, XRF55, and Kinetics-400. These datasets primarily provide paired samples for only two modalities at a time, which restricts the framework’s ability to comprehensively align all six modalities simultaneously. The authors suggest that future work should extend Babel to additional sensing modalities and invite contributions from the broader community, since Babel is designed with a scalable architecture.

AirECG [99], a contactless electrocardiogram (ECG) monitoring system using mmWave sensing, employs a custom cross-domain DM that translates mmWave signals into ECG data through multiple denoising iterations and calibration guidance. AirECG uses a CNN-based patchification approach to process the multichannel mmWave data. The CNN encodes the data into patches (tokens) that contain cardiac features. These tokens serve as the input for the diffusion process, allowing the system to integrate and enhance cardiac information from multiple reflection points on the chest. The core of the denoising process is a hierarchical Transformer model, which takes the CNN-encoded mmWave tokens and performs denoising, generating, and updating synthetic ECG data in multiple iterations. To further enhance the fidelity of the generated ECG data, a calibration guidance mechanism is integrated. This module uses historical ECG data (from reference devices) to guide the denoising steps. AirECG achieved a Pearson correlation coefficient (PCC) of 0.955 and 0.860 for normal heartbeat and abnormal beats, respectively, showing 15.0% to 21.1% improvements over existing methods. Despite its innovations, the system has limitations. Historical ECG data are required for calibration, which may limit its use in scenarios involving entirely new participants. Additionally, performance degrades in noisy or highly dynamic environments, where mmWave sensing is prone to interference. Addressing these challenges could enable AirECG to support broader applications, such as contactless sports tracking and other dynamic monitoring contexts.
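The Pearson correlation coefficient used to score the generated ECG against the reference can be computed as shown below. The short waveforms are made-up numbers meant only to demonstrate the formula; they are not ECG data and the function is not taken from AirECG.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

reference = [0.0, 0.2, 1.0, 0.1, 0.0, 0.3]   # hypothetical reference waveform
scaled_copy = [2 * v + 0.5 for v in reference]  # same shape, different gain and offset
distorted = [0.0, 0.3, 0.8, 0.3, 0.1, 0.2]     # similar shape with some error

print(pearson(reference, scaled_copy), pearson(reference, distorted))
```

Since the coefficient is invariant to gain and offset, it measures how well the shape of the waveform is reproduced, not its absolute amplitude.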

Recently, LLMs have gained significant attention due to their exceptional comprehension and reasoning capabilities, driving progress across various fields [129]. However, the fundamental difference between textual data and RF signal data poses a significant challenge in applying LLMs to RF sensing tasks. A common solution is to leverage ML or DL models to convert sensing signal data into textual representations, which can then serve as input for LLMs to perform downstream tasks, such as reasoning, answering questions, or summarization. Ouyang and Srivastava [130] introduce LLMSense, a framework that utilizes LLMs for high-level reasoning on long-term spatiotemporal sensor data by adopting this approach. To address the challenge of long sensor traces, the authors propose two strategies: summarization before reasoning and selective inclusion of historical traces. LLMSense achieves approximately 80% accuracy on tasks like dementia diagnosis and occupancy tracking, even with datasets containing limited training samples. An additional advantage of LLMs is their ability to perform zero-shot learning in classification tasks. Ji et al. [131] demonstrate that, with effective prompting, LLMs can directly interpret raw IMU data based on their extensive knowledge base, showcasing their promising potential for analyzing raw sensor data in the physical world.
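The idea of converting sensor traces into text and summarizing them before reasoning can be sketched as a small preprocessing step. Everything here is hypothetical: the function names, the window size, the RSSI values, and the prompt layout are illustrative choices, not the LLMSense implementation, and no LLM is actually called.

```python
from statistics import mean

def summarize_trace(readings, window):
    """Turn a long numeric trace into one short text line per window."""
    lines = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        lines.append(
            f"samples {start}-{start + len(chunk) - 1}: "
            f"mean={mean(chunk):.2f}, min={min(chunk):.2f}, max={max(chunk):.2f}"
        )
    return lines

def build_prompt(summary_lines, question, history=()):
    """Assemble a text prompt from summaries, optional selected history, and a question."""
    parts = ["Sensor summary:"]
    parts += list(summary_lines)
    if history:
        parts.append("Selected earlier context:")
        parts += list(history)
    parts.append(f"Question: {question}")
    return "\n".join(parts)

# Hypothetical RSSI trace in dBm with a level change halfway through.
rssi_trace = [-70.0, -71.0, -69.0, -72.0, -55.0, -54.0, -56.0, -55.0]
summary = summarize_trace(rssi_trace, window=4)
prompt = build_prompt(summary, "Is the room likely occupied in the second window?")
print(prompt)
```

Summarizing first keeps the prompt length bounded as the trace grows, and the optional `history` argument mirrors the selective inclusion of past traces mentioned above.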

Another notable advancement in this domain is IoT-LLM [97], a framework designed to address IoT-related real-world problems through a structured process. The framework begins by simplifying raw IoT data and enriching it with meaningful context to make it interpretable for LLMs. It then retrieves domain knowledge and task-specific demonstrations to enhance the reasoning capabilities of LLMs. Finally, it configures prompts to activate the step-by-step reasoning capabilities of LLMs, enabling them to handle complex IoT tasks effectively. Unlike previous works that primarily use LLMs as user interfaces or coordinators, IoT-LLM directly integrates IoT sensor data into LLM reasoning workflows, bridging a critical gap in the field. The framework was evaluated on five IoT datasets, including IMU data, ECG data, Wi-Fi CSI data, and Received Signal Strength Indicator (RSSI) data, demonstrating its effectiveness in tasks such as human activity recognition and industrial anomaly detection. IoT-LLM improves LLM performance by approximately 65%, showcasing its potential in enhancing LLM-based IoT task reasoning. Despite these advancements, LLMs still face challenges with high-dimensional data, such as audio and 3D point clouds, due to their complexity and length. Furthermore, the study does not directly compare the classification performance of LLMs with conventional DL methods. However, LLMs exhibit a clear advantage in providing both accurate results and explainable reasoning processes, offering a level of comprehensibility that surpasses traditional approaches in many scenarios.

3.2.2. Discriminative Models

Discriminative DL models in RF sensing are typically applied for tasks like signal detection, classification, and localization [7,17,34,132]. These models, such as CNNs, LSTMs, and advanced hybrid networks, excel in extracting and classifying features from RF signals to identify patterns or specific entities (e.g., drones, human activities, and vehicles) and manage RF spectrum efficiently. For instance, models like WRIST [100] and DeepFeat [101] use spectro-temporal and LTE-specific features, respectively, to detect and localize RF emissions or devices, achieving high accuracy levels. These applications extend to human and drone recognition tasks, where CNNs, LSTMs, and attention-based models process RF data such as spectrograms, which show how a signal’s frequency content changes over time, and CSI for precise identification, even in dynamic or congested environments [30,103]. Discriminative models are also used in health monitoring systems like HealthDAR [104], where they identify vital signs and human activities through RF signals, offering non-intrusive monitoring solutions. Despite their effectiveness, these models often face challenges in maintaining performance under varying environmental conditions and depend on high-quality training data.

To improve real-time outdoor localization accuracy in dense urban environments where GNSS-based methods fail due to LoS issues, Yapar et al. introduced LocUNet [14], an end-to-end CNN for localization using path loss radio maps. The authors compare four approaches: RadioUNet [133], fingerprint-based kNN [134], Adaptive KNN [135], and LocUNet, which is trained using path loss radio maps to improve accuracy despite signal variability and interference. For training, the authors introduce two novel datasets: RadioLocSeer, which includes simulated RSS data, urban city maps, and Base Station (BS) locations; and RadioToASeer, which provides Time of Arrival (ToA) measurements across various propagation scenarios. LocUNet outperforms the other models with an average localization error of 5 m, showing improvements of 11 m and 14 m over the kNN-based methods [134,135]. However, the study primarily focuses on static urban scenarios without validation in dynamic urban settings. Dynamic elements, such as moving vehicles or pedestrians, are either simplified or omitted. Consequently, future research should involve validating and improving the model using real-world datasets to better capture the complexities of dynamic environments.

Nguyen et al. [100] present a wideband, real-time, spectro-temporal (WRIST) RF identification system aimed at addressing the critical challenge of spectrum scarcity and dynamic RF spectrum management. The authors employ a deep learning framework inspired by YOLO (You Only Look Once) [136], a well-known object detection model. By leveraging transfer learning, they adapt YOLO to develop an RF identification system, enabling accurate detection and classification of emissions. WRIST was trained and evaluated using two datasets: a synthetic dataset of spectrum snapshots, each representing the RF signal power across frequencies at a single point in time, generated for five RF technologies (Wi-Fi, Bluetooth, ZigBee, Lightbridge, and XPD); and a real emissions dataset collected from the 2.4 GHz industrial, scientific, and medical (ISM) band in both controlled and real-world environments. WRIST supports the detection, classification, and precise localization of RF emissions in the 2.4 GHz ISM band, achieving high performance with a class detection accuracy of over 99% and precision and recall values reaching 94% in controlled environments. However, challenges persist in maintaining the system’s accuracy in highly congested, real-world scenarios. The authors suggest future work should focus on expanding the dataset and enhancing the system’s capabilities to improve robustness and adaptability under diverse and dense spectrum conditions.

DeepFeat [101] is a deep-learning-based framework optimized for large-scale outdoor localization in LTE networks. The approach integrates a feature selection module utilizing chi-square and correlation techniques, effectively reducing the computational load by 20.6% while enhancing accuracy through a refined selection of 12 LTE-specific features. By employing a deep feed-forward neural network, the system reaches a median localization accuracy of 13.179 m for a 6.27 km² area and 13.7 m for a 45 km² region. Despite these positive outcomes, maintaining consistent accuracy in varied environments remains challenging, with future efforts focused on improving model adaptability and performance in larger urban areas.

UAVs have become a popular research focus due to their extensive applications in areas like public safety, agriculture, and communications [137,138]. However, with this growth comes the need for advanced detection and classification techniques to ensure safe and secure UAV operation. Xue et al. [102] propose a DL approach for UAV identification using RF signals, specifically addressing challenges in scenarios involving nonstandard waveforms, such as unknown operating channels and environmental variations. To address these challenges, the system includes a method to correct frequency mismatches in signals, known as carrier frequency offset (CFO) compensation, using a technique called morphological filtering, which helps align signals accurately under different conditions. The authors test various ways of representing the radio signals, such as basic signal components, signal strength over time, and spectrograms, which are heatmap-like visual representations of the signal’s frequency and time information. Their findings indicate that a real-valued CNN with spectrogram inputs performs best, reaching 97% classification accuracy while ensuring efficient processing. Despite its success, the system requires further improvements to enhance resilience in dynamic wireless environments and maintain accuracy across real-world scenarios.

Another approach transforms RF signals into spectrograms, which are then processed using deep neural networks (DNNs) as proposed by Podder et al. [103]. The ResNet-50V2 model initially applied in noise-free indoor conditions achieved an 85.39% accuracy. However, in outdoor environments at 50 m and 100 m distances, the accuracy decreased to 68.90% and 56.88%, respectively. To address this, the authors developed a CNN model optimized for outdoor settings, which improved classification accuracy to 78.12%. They further enhanced their system using a binary classification task, achieving a 95.08% accuracy on a mixed dataset of UAV and non-UAV images. Despite these achievements, challenges remain, especially in maintaining high accuracy at longer distances and under varied noise conditions, where different levels of background noise distort signal quality by lowering the SNR.
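The spectrogram transformation underlying these image-based classifiers can be sketched with a short-time Fourier transform. This is a minimal stdlib-only illustration with assumed parameters (32-sample frames, 16-sample hop, Hann window) and a synthetic test signal; it does not reproduce the preprocessing of [103], and a practical pipeline would use an FFT library.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a real-valued frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]

def spectrogram(signal, frame_len=32, hop=16):
    """Short-time Fourier transform magnitudes, indexed as [time][frequency]."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / (frame_len - 1))
              for i in range(frame_len)]  # Hann window
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        frames.append(dft_magnitudes(frame))
    return frames

def peak_bin(spectrum):
    """Index of the strongest frequency bin in one time slice."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)

# Synthetic signal: the tone hops from frequency bin 4 to bin 8 halfway through.
signal = ([math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
          + [math.sin(2 * math.pi * 8 * t / 32) for t in range(64, 128)])

spec = spectrogram(signal)
print([peak_bin(frame) for frame in spec])
```

The resulting time-frequency matrix can be treated as a single-channel image, which is what allows image classifiers such as ResNet variants to be applied to RF signals.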

Alam et al. [5] proposed an end-to-end DL model for detecting and identifying drones from RF signals, addressing the interference caused by other signals, such as Bluetooth and Wi-Fi, operating in the same 2.4 GHz band. Their approach employs a multiscale feature extraction technique using CNNs to extract enriched features without manual intervention, reducing computational overhead. The model achieves 97.53% accuracy for overall detection, with precision, sensitivity, and F1-score values reaching over 98% across varying SNRs. However, challenges persist, particularly in maintaining high accuracy under low SNR conditions, where signal clarity is compromised by increased noise, and in complex, real-world environments, which introduce additional signal interference and environmental variability.

RF sensing is also applied to human subjects for tasks such as activity recognition and presence detection. HealthDAR [104] is a contactless health monitoring system designed for vital sign monitoring, human activity recognition, and tracking. The system leverages a compact, low-energy radar integrated with a DL network to detect coughs and monitor vital signs like heart rate and breath rate. HealthDAR achieves high precision in uncontrolled environments, showing a Pearson correlation coefficient of 0.99 for heart rate and 0.98 for breath rate when compared to ground truth data, indicating a strong similarity between HealthDAR’s estimations and the actual measured values. The mean error for heart rate is 0.3 beats per minute with a standard deviation of 0.04, while for breath rate, it consistently falls within the expected range for adult respiratory rates. Despite its promising results, the system faces challenges in accurately recognizing activities for new subjects and under diverse scenarios, indicating the need for further enhancements to improve robustness and generalization.
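The agreement statistics reported above can be computed as in the following sketch, which evaluates the Pearson correlation coefficient and the error statistics between estimated and reference heart rates. The sample values are invented for illustration and do not come from [104]; the error is taken here as the absolute difference, which is an assumption about how the mean error is defined.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

def error_stats(estimated, reference):
    """Mean and standard deviation of the absolute estimation error."""
    err = np.abs(np.asarray(estimated, dtype=float) - np.asarray(reference, dtype=float))
    return float(err.mean()), float(err.std())

# Hypothetical heart-rate readings in beats per minute.
radar_bpm = [72.1, 75.4, 80.2, 68.9, 77.0]
reference_bpm = [72.0, 75.0, 80.0, 69.0, 77.0]

r = pearson_r(radar_bpm, reference_bpm)
mean_err, std_err = error_stats(radar_bpm, reference_bpm)
```

A correlation near 1 shows that the estimates track the reference trend, while the mean error captures any remaining absolute deviation, so the two figures are complementary.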

Wang et al. [105] propose mmParse, a novel system for human parsing using mmWave radar point clouds, designed to overcome challenges like sparsity and specular reflection inherent in such data. The system employs a multi-task learning framework, combining human parsing with auxiliary tasks like pose estimation to enhance structural feature extraction. In particular, they employ an LSTM as part of the feature extraction process for pose estimation. Evaluations demonstrate that mmParse achieves around 92% overall accuracy and an 84% mean IoU across various environments. Nonetheless, challenges remain, especially regarding signal deflection, which can lead to missing body parts in the data. To overcome these challenges, future research could explore data augmentation techniques or additional sensor fusion approaches to compensate for incomplete data and validate the system’s performance in a wider range of real-world conditions.
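Since overall accuracy and mean IoU are both quoted above, a small sketch may help show how they differ for per-point labels. The label arrays are toy values, not mmParse data, and skipping classes absent from both prediction and ground truth is one common convention rather than a confirmed detail of [105].

```python
import numpy as np

def overall_accuracy(pred, target):
    """Fraction of points whose predicted label matches the ground truth."""
    return float(np.mean(np.asarray(pred) == np.asarray(target)))

def mean_iou(pred, target, num_classes):
    """Average intersection-over-union across classes that appear in pred or target."""
    pred, target = np.asarray(pred), np.asarray(target)
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent everywhere: skip instead of counting as 0 or 1
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy labels for six radar points and three body-part classes.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 0, 1, 2, 2, 2]

acc = overall_accuracy(pred, target)
miou = mean_iou(pred, target, num_classes=3)
```

Mean IoU penalizes both missed and spurious points per class, so it is typically lower than overall accuracy, consistent with the gap between the two figures reported above.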

To accelerate research in the area of human sensing, Yang et al. [106] presented SenseFi, a comprehensive DL framework and benchmark designed for Wi-Fi-based human sensing. The system evaluates various DL models, such as CNNs, LSTMs, and transformers, using CSI data for tasks like human activity recognition, gesture recognition, and human identification. Extensive experiments demonstrate that shallow models, such as CNN-5, often outperform deeper architectures, like ResNet, in diverse Wi-Fi environments, achieving an accuracy of 98.11% on the UT-HAR dataset. Despite this success, challenges include optimizing models for cross-domain adaptation and ensuring efficiency in real-time applications.

Human-vehicle recognition (HVR) has attracted significant interest due to its potential for enabling non-intrusive, efficient detection of traffic participants, which is essential for enhancing intelligent transportation systems. Song et al. [107] introduced wireless-based lightweight attention deep learning (Wi-LADL), a lightweight DL model for HVR that leverages attention mechanisms in wireless sensing to improve feature discrimination. Wi-LADL uses RSS data processed with convolutional block attention modules (CBAM) to capture detailed features, achieving an impressive 98.8% average accuracy at a 2.4 GHz frequency with an antenna height of 0.8 m. This model effectively distinguishes five categories: one-pedestrian, two-pedestrian, one-bicycle, two-bicycles, and one-car. Although Wi-LADL demonstrates high accuracy, challenges persist, particularly in adapting the model to varying antenna heights and frequencies.
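To illustrate the attention mechanism mentioned above, the following sketch implements only the channel-attention branch of a convolutional block attention module in plain NumPy. It is not the Wi-LADL architecture: the feature-map shape, reduction ratio, and random weights are assumptions, and the spatial-attention branch of CBAM is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w1, w2):
    """Reweight the channels of a feature map x with shape (C, H, W).

    w1 (C // r, C) and w2 (C, C // r) form a shared two-layer MLP that is
    applied to both the average-pooled and the max-pooled channel descriptors.
    """
    avg_desc = x.mean(axis=(1, 2))
    max_desc = x.max(axis=(1, 2))

    def shared_mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU between the two layers

    weights = sigmoid(shared_mlp(avg_desc) + shared_mlp(max_desc))
    return x * weights[:, None, None], weights

# Hypothetical feature map and untrained weights, for shape checking only.
rng = np.random.default_rng(0)
channels, reduction = 8, 4
features = rng.standard_normal((channels, 16, 16))
w1 = 0.1 * rng.standard_normal((channels // reduction, channels))
w2 = 0.1 * rng.standard_normal((channels, channels // reduction))

refined, weights = channel_attention(features, w1, w2)
```

Each channel is scaled by a value between 0 and 1, so informative channels can be emphasized at a small parameter cost, which fits the lightweight design goal described above.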

Wang et al. [108] introduce UAV-CTNet, a hybrid DL network designed to improve UAV detection and identification in response to growing security concerns. UAVs often operate in the 2.4 GHz ISM band, where their signals are difficult to detect reliably. UAV-CTNet integrates CNN and Transformer architectures: the Transformer captures global features of the RF signals, while the CNN focuses on extracting local features from minimum variance distortionless response (MVDR) spectral vectors, an advanced spectral estimation technique that enhances frequency resolution. The model is trained on the CardRF Dataset, a publicly available experimental dataset with raw waveforms of UAV control signals, downlink signals, and interference signals, such as Wi-Fi and Bluetooth. This approach achieves over 90% detection accuracy under various SNR values, outperforming conventional methods (FBREWT + CNNNet) [139]. The proposed system uses MVDR spectral features, which provide higher frequency resolution than traditional Fourier spectra; however, calculating MVDR spectra requires operations such as covariance matrix inversion, which can be computationally intensive for large datasets or real-time applications.
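The computational cost of MVDR features noted above can be seen in a textbook-style sketch of the MVDR (Capon) spectrum, $P(f) = 1 / (a^H(f) R^{-1} a(f))$, where $R$ is a sample covariance matrix and $a(f)$ a steering vector. This is a generic estimator, not the feature pipeline of [108]; the filter order, diagonal loading, and frequency grid are assumptions.

```python
import numpy as np

def mvdr_spectrum(x, order=16, n_freqs=512, loading=1e-3):
    """Estimate the MVDR (Capon) spectrum of a 1-D complex signal.

    Frequencies are returned in cycles per sample on [-0.5, 0.5).
    """
    x = np.asarray(x)
    n_snap = len(x) - order + 1
    # Sliding-window snapshots, shape (order, n_snap).
    snaps = np.stack([x[i:i + order] for i in range(n_snap)], axis=1)
    R = snaps @ snaps.conj().T / n_snap
    # Diagonal loading keeps the inversion well conditioned.
    R = R + loading * np.trace(R).real / order * np.eye(order)
    R_inv = np.linalg.inv(R)  # the costly step, cubic in `order`

    freqs = np.linspace(-0.5, 0.5, n_freqs, endpoint=False)
    A = np.exp(2j * np.pi * np.outer(np.arange(order), freqs))  # steering vectors
    denom = np.sum(A.conj() * (R_inv @ A), axis=0).real         # a^H R^-1 a per frequency
    return freqs, 1.0 / denom

# Synthetic test signal: one tone at 0.2 cycles/sample in light noise.
rng = np.random.default_rng(0)
n = np.arange(400)
x = np.exp(2j * np.pi * 0.2 * n) + 0.1 * (
    rng.standard_normal(400) + 1j * rng.standard_normal(400)
)

freqs, power = mvdr_spectrum(x)
peak = freqs[np.argmax(power)]
```

The inversion must be repeated for every analysis window, which is where the overhead arises when many windows or large orders are processed in real time.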

Nie et al. [67] employ four DL models, namely CNN-LSTM, Swin Transformer, ConvNext [140], and Vision TF [141], to enhance human activity recognition using LoRa wireless RF signals. The approach leverages signal transformations like Short-Time Fourier Transform (STFT), differential signal processing (DSP), and frequency-to-image conversion to extract discriminative features from LoRa signals. Each model is trained and evaluated across four tasks: activity classification, identity recognition, room identification, and presence detection. The results show that ConvNext achieved the highest performance, with 96.7% accuracy in activity classification and 97.9% in identity recognition. Vision TF excelled in presence detection with 98.5% accuracy. CNN-LSTM and Swin Transformer showed moderate performance, highlighting ConvNext’s superiority in spatial feature extraction and Vision TF’s effectiveness in global context understanding; however, the experiments were conducted in controlled indoor environments.

3.2.3. Comparison and Integration of Discriminative and Generative Models

The preceding sections highlighted how generative models can reconstruct or synthesize RF data to handle challenges like missing information, low-SNR conditions, and complex multipath environments, while discriminative models excel at classifying signals or estimating target properties from enriched feature representations. Here, we consider how these two approaches compare and can be integrated to improve RF sensing performance.

Generative models have shown promise in addressing data scarcity and improving model robustness, for instance, by filling gaps in radio maps [30] or generating synthetic RF data to enhance gesture recognition [94]. They also enable multimodal fusion by incorporating signals from cameras, LiDAR, and other sensors [97,98]. Yet, these models can be resource-intensive, and their fidelity to real-world complexities remains a challenge [94,142].

Discriminative models, on the other hand, offer direct classification or localization solutions when sufficient high-quality data are available [100,101]. They often achieve fast inference and strong performance in controlled conditions, but they may struggle when the environment changes or when labeled training data are limited. Generative models can partially alleviate these issues by supplementing training sets with synthetic samples that improve domain adaptation and generalization [67,108].

Understanding the strengths and limitations of both generative and discriminative models is crucial for developing more effective and adaptable RF sensing solutions. In essence, generative models can bolster discriminative models in outdoor scenarios by supplying diverse training data that capture realistic environmental complexities. For instance, advanced diffusion models [31,94,95] can simulate various multipath conditions, moving obstacles, interference patterns, and weather-related changes. This enriched training set reduces the need for expensive, perfectly labeled real-world data and improves the domain adaptation capabilities of discriminative models, enabling them to generalize better to new locations, frequencies, or system configurations [99]. Figure 6 exemplifies how these concepts can be operationalized. In this conceptual pipeline, raw, low-quality RF signals are first augmented and preprocessed by a generative model, then passed to a discriminative model tasked with predicting the number of people. This combined approach can enhance performance, even under challenging real-world conditions.

Recently, LLMs have begun to blur the traditional distinctions between generative and discriminative approaches. Thanks to their training on vast and diverse corpora, LLMs can perform zero-shot learning [131], handle multimodal inputs, and address multiple tasks simultaneously [97]. However, these models typically require more computational resources than generative models like AEs or DMs, and they may struggle to scale effectively to the complex, high-dimensional conditions of outdoor RF sensing [130]. Although promising, their current applicability remains limited, necessitating further research into efficient adaptations for RF-based tasks.

Despite these benefits, integrating generative and discriminative models introduces new challenges. Synthetic data may not perfectly represent real-world distributions, reducing the effectiveness of domain adaptation. Incorporating generative models can also increase computational overhead, potentially limiting applicability in resource-constrained systems. Finally, ensuring proper calibration, quality control, and validation of synthetic data is essential for maintaining the trustworthiness and interpretability of these integrated systems. In the following section, we delve deeper into the challenges of deploying deep learning in RF sensing, particularly in complex outdoor environments.

4. Challenges and Future Directions

Despite the promising progress highlighted in the previous section, several challenges remain in applying DL to outdoor RF sensing. These challenges arise from the unique features of RF data and the diverse conditions of outdoor applications, ranging from data scarcity and processing demands to the complexities of integrating multimodal data. Addressing these issues is essential for the continued advancement of RF sensing technology and its applicability across domains like traffic management, environmental monitoring, and urban infrastructure. In this section, we investigate key challenges in RF sensing, including data limitations, the gap between synthetic and real-world data, and the need for integrated sensing and communication systems, as well as emerging approaches, such as federated learning, that hold promise for overcoming these obstacles.

4.1. The Scarcity of Training Data

Acquiring labeled data for applications like human tracking, activity recognition, or vehicle monitoring poses substantial privacy and ethical concerns, restricting the availability of large-scale datasets for these tasks [98,143,144]. Even when sufficient training data are available, manual annotation of specific events is often required, which is labor-intensive, time-consuming, and costly [145]. This issue is particularly severe for RF data labeling because, unlike visual data, which can be easily reviewed offline through recordings, RF data are not intuitively interpretable by humans without specialized tools [34]. One potential solution is to combine RF data with more easily labeled modalities, such as vision or audio, to build more robust multi-modal models. This approach allows DL models to leverage information from more interpretable data types, reducing dependence on large volumes of purely RF-labeled data. However, collecting synchronized multi-modal data in outdoor environments remains technically challenging due to the complexity of integrating and aligning data from different sensors, especially when deployed over large areas.

4.2. The Gap Between Synthetic and Real-World Data

A principal challenge in training with synthetic data is bridging the gap between controlled simulations and the variability of real-world scenarios [14,133]. While simulated environments allow precise control, real-world data introduce variations in material properties, signal conditions, and dynamic outdoor factors that are difficult to replicate accurately. This difference makes it challenging to verify that models trained on synthetic data function effectively in real-world scenarios. Elements such as changing weather, diverse terrains, and sensor noise add further complexity to generating synthetic data that capture the nuances of actual environments. Addressing this gap remains critical for deploying reliable models in practical settings. Current methods mitigate this issue by balancing synthetic and real-world data in model training. Additionally, cross-modality data generation offers promise; for example, Li et al. [146] proposed SBRF, a model to generate RF signals from video data by integrating ray tracing with electromagnetic computation. This physics-based approach overcomes some limitations of purely data-driven and model-driven techniques, which may lack precision or be costly and labor-intensive. However, challenges persist in adapting this method to complex, real-world conditions.

4.3. The Data Preprocessing Effort

Generative models are highly effective at processing raw signal data, particularly in outdoor environments where RF signals are often disrupted by ambient factors. However, generative AI models face challenges with latency and computational efficiency, making them less suitable for low-latency, resource-constrained scenarios, such as autonomous vehicles or real-time surveillance [143]. For example, Zeng et al. [147] introduced a radio anomaly detection framework using denoising diffusion probabilistic models that address issues like unstable training and poor performance with low SNR signals. Despite its strong performance, the framework demands significant computational resources to achieve high anomaly detection accuracy. To address this, applying compression techniques or model distillation methods, as suggested by Menghani [148], could reduce computational demands while maintaining the model’s desired performance, offering a practical solution for efficiency-sensitive applications. Another notable work by Liu et al. [149] introduced a predictive communication protocol utilizing a convolutional LSTM network based on historical channel data. This approach eliminates the need for explicit channel tracking, thereby significantly reducing signaling overhead.
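As a rough illustration of the model distillation idea mentioned above, the sketch below computes a temperature-scaled distillation loss, in which a small student model is trained to match the softened output distribution of a larger teacher. This is a generic formulation and not a procedure taken from [147] or [148]; the logits and the temperature are made-up values.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis with temperature scaling."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence from teacher to student soft targets, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    eps = 1e-12  # guards against log(0)
    kl = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(temperature ** 2 * kl.mean())

# Hypothetical logits for a batch of two samples and three classes.
teacher = np.array([[2.0, 0.5, -1.0], [0.1, 1.5, 0.3]])
student = np.array([[1.0, 0.2, -0.5], [0.0, 0.9, 0.4]])

loss = distillation_loss(student, teacher)
loss_when_identical = distillation_loss(teacher, teacher)
```

In practice this term is usually combined with the ordinary supervised loss, so that the compact model inherits the teacher's behavior while remaining cheap enough for latency-sensitive deployment.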

4.4. Multimodal RF Sensing

Integrating multiple data modalities into a single RF sensing system can significantly reduce the cost and complexity of using separate systems for various urban applications while enhancing their connectivity. For instance, in traffic management, RF sensing can complement vision systems and sensor networks to enable vehicle detection, monitor pedestrian movements, and control smart traffic lights dynamically, leading to optimized traffic flow and improved safety [30]. In environmental monitoring, RF sensing combined with air quality sensors offers real-time tracking of pollution levels and environmental conditions [13]. However, achieving a functional multimodal system presents notable challenges, especially in integrating and synchronizing data from sensors with different sampling rates and formats, which requires sophisticated alignment methods. Furthermore, implementing real-time processing in large-scale urban settings necessitates robust data fusion algorithms and advanced infrastructure capable of managing high data volumes consistently and efficiently.
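One simple way to approach the alignment problem described above is to resample a slow sensor stream onto the timestamps of a faster one by linear interpolation. The sketch below assumes a 100 Hz RF feature stream and a 1 Hz air-quality sensor; the rates, names, and the constant clock offset are illustrative assumptions, and real deployments may need more careful handling of drift and missing samples.

```python
import numpy as np

def align_to_timeline(target_t, source_t, source_values, clock_offset=0.0):
    """Linearly interpolate a source stream onto target timestamps.

    clock_offset is the amount (in seconds) by which the source clock runs
    ahead of the target clock. Values outside the source time range are held
    at the nearest endpoint, which is the default behavior of np.interp.
    """
    corrected_t = np.asarray(source_t, dtype=float) - clock_offset
    return np.interp(target_t, corrected_t, source_values)

# Hypothetical streams: RF features at 100 Hz, air-quality readings at 1 Hz.
rf_t = np.arange(0.0, 5.0, 0.01)
aq_t = np.arange(0.0, 6.0, 1.0)
aq_values = 10.0 + 2.0 * aq_t  # synthetic, slowly rising pollution level

aq_on_rf_timeline = align_to_timeline(rf_t, aq_t, aq_values)
```

After this step, both modalities share one time axis and can be stacked sample by sample before being passed to a fusion model.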

4.5. Integrated Sensing and Communication (ISAC)

RF sensing can be implemented within existing communication infrastructure, effectively utilizing resources while preserving communication capacity. This approach, known as Integrated Sensing and Communication (ISAC), has emerged as a compelling research area. ISAC refers to a design approach and set of enabling technologies that combine sensing and communication capabilities, aiming to optimize the utilization of wireless resources while providing mutual advantages [12]. This integration enables advancements in fields such as mobile crowd sensing, channel knowledge mapping, passive sensing networks, vehicular communications, satellite imaging, and broadcasting [12]. Additionally, ISAC leverages recent progress in machine learning and DL. Specifically, the functions of machine learning in ISAC systems are outlined by Demirhan and Alkhateeb [150], spanning joint sensing and communication (JSC), sensing-aided communication, and communication-aided sensing. These roles include optimizing waveform design, predicting beam patterns, enhancing system security, and improving network-level operations, all contributing to significant performance gains based on real-world data.

4.6. Federated Learning

With the rapid advancement of RF sensing systems, scaling these systems to cover larger areas and track more objects introduces a significant challenge: the need for scalable, privacy-preserving solutions. These solutions must efficiently process vast amounts of data generated by multiple entities while protecting sensitive information collected in public spaces. One promising approach is federated learning [151], a machine learning framework that distributes model training across multiple devices or decentralized data sources. In this setup, each device retains its local data and trains the model locally, sharing only model updates with a central server instead of transmitting raw data. Federated learning provides several benefits for RF sensing, including enhanced data privacy through local data retention, reduced network latency by avoiding the transmission of large datasets, and improved training efficiency. By leveraging the computational power and diverse datasets across a network of IoT devices, federated learning can also accelerate model convergence and enhance overall learning accuracy [152].
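The training loop described above can be sketched with a toy federated-averaging example in which three simulated devices fit a shared linear model and only their weights are sent to the server. The data, model, learning rate, and number of rounds are all assumptions chosen for brevity, and the sketch ignores communication, security, and device heterogeneity.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one device's private data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the mean squared error
        w = w - lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Server step: average client models, weighted by local dataset size."""
    return np.average(
        np.stack(client_weights), axis=0, weights=np.asarray(client_sizes, dtype=float)
    )

# Three simulated devices with different amounts of synthetic local data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for n_samples in (50, 100, 200):
    X = rng.standard_normal((n_samples, 2))
    y = X @ true_w + 0.01 * rng.standard_normal(n_samples)
    clients.append((X, y))

w_global = np.zeros(2)
for _ in range(20):  # communication rounds
    updates = [local_update(w_global, X, y) for X, y in clients]
    w_global = federated_average(updates, [len(y) for _, y in clients])
```

Only the two-element weight vectors leave each device, while the raw samples stay local, which is the privacy property highlighted above.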

5. Conclusions

In this comprehensive survey, we have examined recent applications of DL architectures and techniques in outdoor RF sensing. Our analysis indicates that the rapid advancement of generative DL models has significantly enhanced RF sensing systems, particularly in outdoor environments where RF signal quality and quantity are often compromised due to environmental challenges. The findings underscore the pivotal role of wireless technologies in determining RF sensing performance and the need for continuous development in this area. Moreover, the diversity of settings and devices presents challenges in establishing a universal framework for RF sensing tasks, which typically require high-quality, large-scale datasets for effective training. Looking ahead, we anticipate that improvements in infrastructure, leading to increased computational power and the ability to generate synthetic data, will facilitate the integration of more advanced DL techniques, such as MLLMs, thereby significantly enhancing the performance of RF sensing systems.

Author Contributions

Conceptualization, C.L. and Q.D.M.N.; methodology, W.D.L. and C.L.; software, Q.D.M.N. and C.L.; validation, C.L., W.D.L. and X.L.; formal analysis, Q.D.M.N.; investigation, Q.D.M.N.; resources, C.L., W.D.L., Q.D.M.N. and X.L.; data curation: Q.D.M.N., C.L. and X.L.; writing—original draft preparation, Q.D.M.N.; writing—review and editing, C.L., W.D.L. and X.L.; visualization, Q.D.M.N.; supervision, C.L. and W.D.L.; project administration, C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflicts of interest.

References

Zhang, J.; Xi, R.; He, Y.; Sun, Y.; Guo, X.; Wang, W.; Na, X.; Liu, Y.; Shi, Z.; Gu, T. A Survey of mmWave-Based Human Sensing: Technology, Platforms and Applications. IEEE Commun. Surv. Tutor. 2023, 25, 2052–2087. [Google Scholar] [CrossRef]
Chen, Z.; Zheng, T.; Luo, J. Octopus: A Practical and Versatile Wideband MIMO Sensing Platform. In Proceedings of the 27th Annual International Conference on Mobile Computing and Networking (MobiCom ’21), New Orleans, LA, USA, 31 January–4 February 2021; pp. 601–614. [Google Scholar] [CrossRef]
Lubna, L.; Hameed, H.; Ansari, S.; Zahid, A.; Sharif, A.; Abbas, H.T.; Alqahtani, F.; Mufti, N.; Ullah, S.; Imran, M.A.; et al. Radio frequency sensing and its innovative applications in diverse sectors: A comprehensive study. Front. Commun. Netw. 2022, 3, 1010228. [Google Scholar] [CrossRef]
Kong, H.; Huang, C.; Yu, J.; Shen, X. A Survey of mmWave Radar-Based Sensing in Autonomous Vehicles, Smart Homes and Industry. IEEE Commun. Surv. Tutor. 2024; early access. [Google Scholar] [CrossRef]
Alam, S.S.; Chakma, A.; Rahman, M.H.; Bin Mofidul, R.; Alam, M.M.; Utama, I.B.K.Y.; Jang, Y.M. RF-Enabled Deep-Learning-Assisted Drone Detection and Identification: An End-to-End Approach. Sensors 2023, 23, 4202. [Google Scholar] [CrossRef] [PubMed]
Ahmad, I.; Ullah, A.; Choi, W. WiFi-Based Human Sensing with Deep Learning: Recent Advances, Challenges, and Opportunities. IEEE Open J. Commun. Soc. 2024, 5, 3595–3623. [Google Scholar] [CrossRef]
Jagannath, A.; Jagannath, J.; Kumar, P.S.P.V. A comprehensive survey on radio frequency (RF) fingerprinting: Traditional approaches, deep learning, and open challenges. Comput. Netw. 2022, 219, 109455. [Google Scholar] [CrossRef]
Cui, L.; Zhang, Z.; Gao, N.; Meng, Z.; Li, Z. Radio frequency identification and sensing techniques and their applications—A review of the state-of-the-art. Sensors 2019, 19, 4012. [Google Scholar] [CrossRef]
Moloudian, G.; Hosseinifard, M.; Kumar, S.; Simorangkir, R.B.; Buckley, J.L.; Song, C.; Fantoni, G.; O’Flynn, B. RF energy harvesting techniques for battery-less wireless sensing, industry 4.0 and internet of things: A review. IEEE Sens. J. 2024, 24, 5732–5745. [Google Scholar] [CrossRef]
Khunteta, S.; Saikrishna, P.; Agrawal, A.; Kumar, A.; Chavva, A.K.R. RF-Sensing: A New Way to Observe Surroundings. IEEE Access 2022, 10, 129653–129665. [Google Scholar] [CrossRef]
Pathak, N.P. Non-invasive RF sensors: Design and analysis. In Proceedings of the 2017 2nd International Conference on Telecommunication and Networks (TEL-NET), Noida, India, 10–11 August 2017; p. 1. [Google Scholar] [CrossRef]
Liu, F.; Cui, Y.; Masouros, C.; Xu, J.; Han, T.X.; Eldar, Y.C.; Buzzi, S. Integrated Sensing and Communications: Toward Dual-Functional Wireless Networks for 6G and Beyond. IEEE J. Select. Areas Commun. 2022, 40, 1728–1767. [Google Scholar] [CrossRef]
Van Truong, T.; Nayyar, A.; Masud, M. A novel air quality monitoring and improvement system based on wireless sensor and actuator networks using LoRa communication. PeerJ Comput. Sci. 2021, 7, e711. [Google Scholar] [CrossRef] [PubMed]
Yapar, C.; Levie, R.; Kutyniok, G.; Caire, G. Real-Time Outdoor Localization Using Radio Maps: A Deep Learning Approach. IEEE Trans. Wirel. Commun. 2023, 22, 9703–9717. [Google Scholar] [CrossRef]
Yi Lim, N.C.; Yong, L.; Su, H.T.; Yu Hao Chai, A.; Vithanawasam, C.K.; Then, Y.L.; Siang Tay, F. Review of Temperature and Humidity Impacts on RF Signals. In Proceedings of the 13th International UNIMAS Engineering Conference (EnCon 2020), Kota Samarahan, Malaysia, 27–28 October 2020; pp. 1–8. [Google Scholar] [CrossRef]
Haenggi, M.; Ganti, R.K. Interference in large wireless networks. Found. Trends Netw. 2009, 3, 127–248. [Google Scholar] [CrossRef]
Zheng, T.; Chen, Z.; Ding, S.; Luo, J. Enhancing RF Sensing with Deep Learning: A Layered Approach. IEEE Commun. Mag. 2021, 59, 70–76. [Google Scholar] [CrossRef]
Wang, X.; Wang, X.; Mao, S. RF Sensing in the Internet of Things: A General Deep Learning Framework. IEEE Commun. Mag. 2018, 56, 62–67. [Google Scholar] [CrossRef]
Wang, X.; Zhao, Y.; Pourpanah, F. Recent advances in deep learning. Int. J. Mach. Learn. Cybern. 2020, 11, 747–750. [Google Scholar] [CrossRef]
Bayoudh, K.; Knani, R.; Hamdaoui, F.; Mtibaa, A. A survey on deep multimodal learning for computer vision: Advances, trends, applications, and datasets. Vis. Comput. 2022, 38, 2939–2970. [Google Scholar] [CrossRef] [PubMed]
Ni, J.; Young, T.; Pandelea, V.; Xue, F.; Cambria, E. Recent advances in deep learning based dialogue systems: A systematic survey. Artif. Intell. Rev. 2023, 56, 3055–3155. [Google Scholar] [CrossRef]
Iman, M.; Arabnia, H.R.; Rasheed, K. A review of deep transfer learning and recent advancements. Technologies 2023, 11, 40. [Google Scholar] [CrossRef]
Choudhary, K.; DeCost, B.; Chen, C.; Jain, A.; Tavazza, F.; Cohn, R.; Park, C.W.; Choudhary, A.; Agrawal, A.; Billinge, S.J.; et al. Recent advances and applications of deep learning methods in materials science. Npj Comput. Mater. 2022, 8, 59. [Google Scholar] [CrossRef]
Sigg, S.; Scholz, M.; Shi, S.; Ji, Y.; Beigl, M. RF-sensing of activities from non-cooperative subjects in device-free recognition systems using ambient and local signals. IEEE Trans. Mob. Comput. 2013, 13, 907–920. [Google Scholar] [CrossRef]
Wang, J.; Zhang, X.; Gao, Q.; Ma, X.; Feng, X.; Wang, H. Device-free simultaneous wireless localization and activity recognition with wavelet feature. IEEE Trans. Veh. Technol. 2016, 66, 1659–1669. [Google Scholar] [CrossRef]
LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [CrossRef] [PubMed]
Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
Elbir, A.M. DeepMUSIC: Multiple signal classification via deep learning. IEEE Sens. Lett. 2020, 4, 1–4. [Google Scholar] [CrossRef]
Conn, M.A.; Josyula, D. Radio Frequency Classification and Anomaly Detection Using Convolutional Neural Networks. In Proceedings of the 2019 IEEE Radar Conference (RadarConf), Boston, MA, USA, 22–26 April 2019; pp. 1–6. [Google Scholar] [CrossRef]
Regmi, H.; Sur, S. CoSense: Deep Learning Augmented Sensing for Coexistence with Networking in Millimeter-Wave Picocells. ACM Trans. Internet Things 2024, 5, 17:1–17:35. [Google Scholar] [CrossRef]
Chi, G.; Yang, Z.; Wu, C.; Xu, J.; Gao, Y.; Liu, Y.; Han, T.X. RF-Diffusion: Radio Signal Generation via Time-Frequency Diffusion. In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (MobiCom ’24), Washington, DC, USA, 18–22 November 2024; pp. 77–92. [Google Scholar] [CrossRef]
Liu, C.; Liu, X.; Li, S.; Yuan, W.; Ng, D.W.K. Deep CLSTM for Predictive Beamforming in Integrated Sensing and Communication-Enabled Vehicular Networks. J. Commun. Inf. Netw. 2022, 7, 269–277. [Google Scholar] [CrossRef]
Liu, C.; Liu, X.; Wei, Z.; Ng, D.W.K.; Schober, R. Scalable Predictive Beamforming for IRS-Assisted Multi-User Communications: A Deep Learning Approach. arXiv 2022, arXiv:2211.12644. [Google Scholar]
Nirmal, I.; Khamis, A.; Hassan, M.; Hu, W.; Zhu, X. Deep Learning for Radio-Based Human Sensing: Recent Advances and Future Directions. IEEE Commun. Surv. Tutor. 2021, 23, 995–1019. [Google Scholar] [CrossRef]
Burghal, D.; Ravi, A.T.; Rao, V.; Alghafis, A.A.; Molisch, A.F. A Comprehensive Survey of Machine Learning Based Localization with Wireless Signals. arXiv 2020, arXiv:2012.11171. [Google Scholar] [CrossRef]
Amjad, B.; Ahmed, Q.Z.; Lazaridis, P.I.; Hafeez, M.; Khan, F.A.; Zaharis, Z.D. Radio SLAM: A Review on Radio-Based Simultaneous Localization and Mapping. IEEE Access 2023, 11, 9260–9278. [Google Scholar] [CrossRef]
Soumya, A.; Krishna Mohan, C.; Cenkeramaddi, L.R. Recent Advances in mmWave-Radar-Based Sensing, Its Applications, and Machine Learning Techniques: A Review. Sensors 2023, 23, 8901. [Google Scholar] [CrossRef] [PubMed]
Chen, Y.; Xu, C.; Li, K.; Zhang, J.; Guo, X.; Jin, M.; Zheng, X.; He, Y. Wireless sensing for material identification: A survey. IEEE Commun. Surv. Tutor. 2024; Early Access. [Google Scholar] [CrossRef]
Kerdjidj, O.; Himeur, Y.; Sohail, S.S.; Amira, A.; Fadli, F.; Attala, S.; Mansoor, W.; Copiaco, A.; Gawanmeh, A.; Miniaoui, S.; et al. Uncovering the potential of indoor localization: Role of deep and transfer learning. IEEE Access 2024, 12, 73980–74010. [Google Scholar] [CrossRef]
Zafari, F.; Gkelias, A.; Leung, K.K. A Survey of Indoor Localization Systems and Technologies. IEEE Commun. Surv. Tutor. 2019, 21, 2568–2599. [Google Scholar] [CrossRef]
Yassin, A.; Nasser, Y.; Awad, M.; Al-Dubai, A.; Liu, R.; Yuen, C.; Raulefs, R.; Aboutanios, E. Recent Advances in Indoor Localization: A Survey on Theoretical Approaches and Applications. IEEE Commun. Surv. Tutor. 2017, 19, 1327–1346. [Google Scholar] [CrossRef]
Wang, L.; Zhang, C.; Zhao, Q.; Zou, H.; Lasaulce, S.; Valenzise, G.; He, Z.; Debbah, M. Generative AI for RF Sensing in IoT Systems. arXiv 2024, arXiv:2407.07506. [Google Scholar]
Dogan, D.; Dalveren, Y.; Kara, A. A Mini-Review on Radio Frequency Fingerprinting Localization in Outdoor Environments: Recent Advances and Challenges. In Proceedings of the 14th International Conference on Communications (COMM), Bucharest, Romania, 16–18 June 2022; pp. 1–5. [Google Scholar] [CrossRef]
Budalal, A.A.; Islam, M.R. Path loss models for outdoor environment—With a focus on rain attenuation impact on short-range millimeter-wave links. E-Prime-Adv. Electr. Eng. Electron. Energy 2023, 3, 100106. [Google Scholar] [CrossRef]
Rappaport, T.S.; Blankenship, K.; Xu, H. Propagation and Radio System Design Issues in Mobile Radio Systems for the Glomo Project; Virginia Polytechnic Institute and State University: Blacksburg, VA, USA, 1997. [Google Scholar]
Cho, Y.S.; Kim, J.; Yang, W.Y.; Kang, C.G. MIMO-OFDM Wireless Communications with MATLAB; John Wiley & Sons: Hoboken, NJ, USA, 2010. [Google Scholar]
Liu, Y.; Liu, X.; Mu, X.; Hou, T.; Xu, J.; Di Renzo, M.; Al-Dhahir, N. Reconfigurable Intelligent Surfaces: Principles and Opportunities. IEEE Commun. Surv. Tutor. 2021, 23, 1546–1577. [Google Scholar] [CrossRef]
Trichopoulos, G.C.; Theofanopoulos, P.; Kashyap, B.; Shekhawat, A.; Modi, A.; Osman, T.; Kumar, S.; Sengar, A.; Chang, A.; Alkhateeb, A. Design and Evaluation of Reconfigurable Intelligent Surfaces in Real-World Environment. IEEE Open J. Commun. Soc. 2022, 3, 462–474.
Hu, J.; Zhang, H.; Di, B.; Li, L.; Bian, K.; Song, L.; Li, Y.; Han, Z.; Poor, H.V. Reconfigurable Intelligent Surface Based RF Sensing: Design, Optimization, and Implementation. IEEE J. Sel. Areas Commun. 2020, 38, 2700–2716.
Alexandropoulos, G.C.; Crozzoli, M.; Phan-Huy, D.T.; Katsanos, K.D.; Wymeersch, H.; Popovski, P.; Ratajczak, P.; Bénédic, Y.; Hamon, M.H.; Gonzalez, S.H.; et al. Smart Wireless Environments Enabled by RISs: Deployment Scenarios and Two Key Challenges. arXiv 2022, arXiv:2203.13478.
Huang, Y.; Chen, Z.; Wen, C.; Li, J.; Xia, X.G.; Hong, W. An Efficient Radio Frequency Interference Mitigation Algorithm in Real Synthetic Aperture Radar Data. IEEE Trans. Geosci. Remote Sens. 2022, 60, 1–12.
Ameloot, T.; Van Torre, P.; Rogier, H. Variable Link Performance Due to Weather Effects in a Long-Range, Low-Power LoRa Sensor Network. Sensors 2021, 21, 3128.
Hau, F.; Baumgärtner, F.; Vossiek, M. The Degradation of Automotive Radar Sensor Signals Caused by Vehicle Vibrations and Other Nonlinear Movements. Sensors 2020, 20, 6195.
Iannizzotto, G.; Milici, M.; Nucita, A.; Lo Bello, L. A Perspective on Passive Human Sensing with Bluetooth. Sensors 2022, 22, 3523.
Tabassum, M.; Zen, K. Performance Evaluation of ZigBee in Indoor and Outdoor Environment. In Proceedings of the 9th International Conference on IT in Asia (CITA), Kuching, Malaysia, 4–5 August 2015; pp. 1–7.
Sigfox. Sigfox: A Global 0G Network for IoT. 2024. Available online: https://build.sigfox.com/sigfox (accessed on 21 December 2024).
LoRa, Platform for IoT | Semtech. Available online: https://www.semtech.com/lora (accessed on 21 December 2024).
Augustin, A.; Yi, J.; Clausen, T.; Townsley, W.M. A Study of LoRa: Long Range & Low Power Networks for the Internet of Things. Sensors 2016, 16, 1466.
Chu, N.H.; Nguyen, D.N.; Hoang, D.T.; Pham, Q.V.; Phan, K.T.; Hwang, W.J.; Dutkiewicz, E. AI-Enabled mm-Waveform Configuration for Autonomous Vehicles with Integrated Communication and Sensing. IEEE Internet Things J. 2023, 10, 16727–16743.
Chen, C.; Chen, X.; Das, D.; Akhmetov, D.; Cordeiro, C. Overview and Performance Evaluation of Wi-Fi 7. IEEE Commun. Stand. Mag. 2022, 6, 12–18.
Ahson, S.A.; Ilyas, M. RFID Handbook: Applications, Technology, Security, and Privacy; CRC Press: Boca Raton, FL, USA, 2017; ISBN 978-1-4200-5499-6.
Barbieri, L.; Brambilla, M.; Trabattoni, A.; Mervic, S.; Nicoli, M. UWB Localization in a Smart Factory: Augmentation Methods and Experimental Assessment. IEEE Trans. Instrum. Meas. 2021, 70, 1–18.
Hillger, P.; Grzyb, J.; Jain, R.; Pfeiffer, U.R. Terahertz Imaging and Sensing Applications with Silicon-Based Technologies. IEEE Trans. Terahertz Sci. Technol. 2019, 9, 1–19.
Li, Y.; Chi, Z.; Liu, X.; Zhu, T. Passive-ZigBee: Enabling ZigBee Communication in IoT Networks with 1000X+ Less Power Consumption. In Proceedings of the 16th ACM Conference on Embedded Networked Sensor Systems (SenSys ’18), Shenzhen, China, 4–7 November 2018; pp. 159–171.
Demeslay, C.; Rostaing, P.; Gautier, R. Theoretical Performance of LoRa System in Multipath and Interference Channels. IEEE Internet Things J. 2022, 9, 6830–6843.
Haxhibeqiri, J.; Van den Abeele, F.; Moerman, I.; Hoebeke, J. LoRa Scalability: A Simulation Model Based on Interference Measurements. Sensors 2017, 17, 1193.
Nie, M.; Zou, L.; Cui, H.; Zhou, X.; Wan, Y. Enhancing Human Activity Recognition with LoRa Wireless RF Signal Preprocessing and Deep Learning. Electronics 2024, 13, 264.
Islam, K.Z.; Murray, D.; Diepeveen, D.; Jones, M.G.K.; Sohel, F. LoRa-based outdoor localization and tracking using unsupervised symbolization. Internet Things 2024, 25, 101016.
Wu, D.; Liebeherr, J. A Low-Cost Low-Power LoRa Mesh Network for Large-Scale Environmental Sensing. IEEE Internet Things J. 2023, 10, 16700–16714.
Hao, Z.; Yan, H.; Dang, X.; Ma, Z.; Jin, P.; Ke, W. Millimeter-Wave Radar Localization Using Indoor Multipath Effect. Sensors 2022, 22, 5671.
Kwon, G.; Liu, Z.; Conti, A.; Park, H.; Win, M.Z. Integrated Localization and Communication for Efficient Millimeter Wave Networks. IEEE J. Sel. Areas Commun. 2023, 41, 3925–3941.
Feng, Y.; Xie, Y.; Ganesan, D.; Xiong, J. LTE-based Pervasive Sensing Across Indoor and Outdoor. In Proceedings of the 19th ACM Conference on Embedded Networked Sensor Systems (SenSys ’21), New York, NY, USA, 15–17 November 2021; pp. 138–151.
Sardar, S.; Mishra, A.K.; Khan, M.Z.A. Vehicle detection and classification using LTE-CommSense. IET Radar Sonar Navig. 2019, 13, 850–857.
Sonny, A.; Rai, P.K.; Kumar, A.; Khan, M.Z.A. Deep learning-based smart parking solution using channel state information in LTE-based cellular networks. In Proceedings of the International Conference on COMmunication Systems & NETworkS (COMSNETS), Bangalore, India, 8–10 January 2020; pp. 642–645.
Jabbar, A.; Abdullah, F.Y. Long term evolution (LTE) scheduling algorithms in wireless sensor networks (WSN). Int. J. Comput. Appl. 2015, 121.
Gu, Y.; Chen, J.; He, K.; Wu, C.; Zhao, Z.; Du, R. WiFiLeaks: Exposing Stationary Human Presence through a Wall with Commodity Mobile Devices. IEEE Trans. Mob. Comput. 2024, 23, 6997–7011.
Ma, Y.; Zhou, G.; Wang, S. WiFi Sensing with Channel State Information. ACM Comput. Surv. 2019, 52, 1–36.
Landaluce, H.; Arjona, L.; Perallos, A.; Falcone, F.; Angulo, I.; Muralter, F. A Review of IoT Sensing Applications and Challenges Using RFID and Wireless Sensor Networks. Sensors 2020, 20, 2495.
Shen, E.; Yang, W.; Wang, X.; Kang, B.; Mao, S. TagSense: Robust Wheat Moisture and Temperature Sensing Using RFID. IEEE J. Radio Freq. Identif. 2024, 8, 76–87.
Le Breton, M.; Baillet, L.; Larose, E.; Rey, E.; Benech, P.; Jongmans, D.; Guyoton, F. Outdoor UHF RFID: Phase Stabilization for Real-world Applications. IEEE J. Radio Freq. Identif. 2017, 1, 279–290.
Zhang, D.; Yang, L.T.; Chen, M.; Zhao, S.; Guo, M.; Zhang, Y. Real-time locating systems using active RFID for Internet of Things. IEEE Syst. J. 2014, 10, 1226–1235.
Florentin, I. Discussion on UWB Technology and Its Applicability in Different Fields. J. Mil. Technol. 2020, 4.
Queralta, J.P.; Martínez Almansa, C.; Schiano, F.; Floreano, D.; Westerlund, T. UWB-based System for UAV Localization in GNSS-Denied Environments: Characterization and Dataset. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 4521–4528.
Gabela, J.; Retscher, G.; Goel, S.; Perakis, H.; Masiero, A.; Toth, C.; Gikas, V.; Kealy, A.; Koppányi, Z.; Błaszczak-Bąk, W.; et al. Experimental Evaluation of a UWB-Based Cooperative Positioning System for Pedestrians in GNSS-Denied Environment. Sensors 2019, 19, 5274.
Yang, J.; Dong, B.; Wang, J. VULoc: Accurate UWB Localization for Countless Targets without Synchronization. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2022, 6, 1–25.
Siegel, P. Terahertz technology. IEEE Trans. Microw. Theory Tech. 2002, 50, 910–928.
Bogue, R. Sensing with terahertz radiation: A review of recent progress. Sens. Rev. 2018, 38, 216–222.
Jansen, C.; Wietzke, S.; Peters, O.; Scheller, M.; Vieweg, N.; Salhi, M.; Krumbholz, N.; Jördens, C.; Hochrein, T.; Koch, M. Terahertz imaging: Applications and perspectives. Appl. Opt. 2010, 49, E48–E57.
Ren, A.; Zahid, A.; Fan, D.; Yang, X.; Imran, M.A.; Alomainy, A.; Abbasi, Q.H. State-of-the-Art in Terahertz Sensing for Food and Water Security—A Comprehensive Review. Trends Food Sci. Technol. 2019, 85, 241–251.
Pawar, A.Y.; Sonawane, D.D.; Erande, K.B.; Derle, D.V. Terahertz technology and its applications. Drug Invent. Today 2013, 5, 157–163.
Naftaly, M.; Vieweg, N.; Deninger, A. Industrial applications of terahertz sensing: State of play. Sensors 2019, 19, 4203.
Zhang, Z.; Zhu, G.; Chen, J.; Cui, S. Fast and Accurate Cooperative Radio Map Estimation Enabled by GAN. arXiv 2024, arXiv:2402.02729.
Teganya, Y.; Romero, D. Deep Completion Autoencoders for Radio Map Estimation. IEEE Trans. Wirel. Commun. 2022, 21, 1710–1724.
Chen, X.; Zhang, X. RF Genesis: Zero-Shot Generalization of mmWave Sensing through Simulation-Based Data Synthesis and Generative Diffusion Models. In Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems (SenSys ’23), Istanbul, Turkey, 6–9 November 2024; pp. 28–42.
Xu, Y.; Huang, L.; Zhang, L.; Qian, L.; Yang, X. Diffusion-Based Radio Signal Augmentation for Automatic Modulation Classification. Electronics 2024, 13, 2063.
Chang, J.; Zhou, Z.; Mi, S.; Zhang, Y. Radio frequency fingerprint recognition method based on prior information. Comput. Electr. Eng. 2024, 120, 109684.
An, T.; Zhou, Y.; Zou, H.; Yang, J. IoT-LLM: Enhancing Real-World IoT Task Reasoning with Large Language Models. arXiv 2024, arXiv:2410.02429.
Dai, S.; Jiang, S.; Yang, Y.; Cao, T.; Li, M.; Banerjee, S.; Qiu, L. Advancing Multi-Modal Sensing Through Expandable Modality Alignment. arXiv 2024, arXiv:2407.17777.
Zhao, L.; Lyu, R.; Lei, H.; Lin, Q.; Zhou, A.; Ma, H.; Wang, J.; Meng, X.; Shao, C.; Tang, Y.; et al. AirECG: Contactless Electrocardiogram for Cardiac Disease Monitoring via mmWave Sensing and Cross-domain Diffusion Model. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024, 8, 144:1–144:27.
Nguyen, H.N.; Vomvas, M.; Vo-Huu, T.D.; Noubir, G. WRIST: Wideband, Real-Time, Spectro-Temporal RF Identification System Using Deep Learning. IEEE Trans. Mob. Comput. 2024, 23, 1550–1567.
Mohamed, A.; Tharwat, M.; Magdy, M.; Abubakr, T.; Nasr, O.; Youssef, M. DeepFeat: Robust Large-Scale Multi-Features Outdoor Localization in LTE Networks Using Deep Learning. IEEE Access 2022, 10, 3400–3414.
Xue, C.; Li, T.; Li, Y.; Ruan, Y.; Zhang, R.; Dobre, O.A. Radio-Frequency Identification for Drones With Nonstandard Waveforms Using Deep Learning. IEEE Trans. Instrum. Meas. 2023, 72, 1–13.
Podder, P.; Zawodniok, M.; Madria, S. Deep Learning for UAV Detection and Classification via Radio Frequency Signal Analysis. In Proceedings of the 25th IEEE International Conference on Mobile Data Management (MDM), Brussels, Belgium, 24–27 June 2024; pp. 165–174.
Li, A.; Bodanese, E.; Poslad, S.; Chen, P.; Wang, J.; Fan, Y.; Hou, T. A Contactless Health Monitoring System for Vital Signs Monitoring, Human Activity Recognition, and Tracking. IEEE Internet Things J. 2024, 11, 29275–29286.
Wang, S.; Cao, D.; Liu, R.; Jiang, W.; Yao, T.; Lu, C.X. Human Parsing with Joint Learning for Dynamic mmWave Radar Point Cloud. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2023, 7, 34:1–34:22.
Yang, J.; Chen, X.; Zou, H.; Lu, C.X.; Wang, D.; Sun, S.; Xie, L. SenseFi: A library and benchmark on deep-learning-empowered WiFi human sensing. Patterns 2023, 4, 100703.
Song, M.; Lou, L.; Chen, X.; Zhao, X.; Hong, Y.; Zhang, S.; He, W. Wi-LADL: A Wireless-Based Lightweight Attention Deep Learning Method for Human-Vehicle Recognition. IEEE Sens. J. 2023, 23, 2803–2814.
Wang, Q.; Yang, P.; Yan, X.; Wu, H.C.; He, L. Radio Frequency-Based UAV Sensing Using Novel Hybrid Lightweight Learning Network. IEEE Sens. J. 2024, 24, 4841–4850.
Tomczak, J.M. Why Deep Generative Modeling? In Deep Generative Modeling; Springer Nature: Cham, Switzerland, 2021; pp. 1–12.
Zhou, R.; Tang, M.; Gong, Z.; Hao, M. FreeTrack: Device-free human tracking with deep neural networks and particle filtering. IEEE Syst. J. 2019, 14, 2990–3000.
Wu, X.; Chu, Z.; Yang, P.; Xiang, C.; Zheng, X.; Huang, W. TW-See: Human activity recognition through the wall with commodity Wi-Fi devices. IEEE Trans. Veh. Technol. 2018, 68, 306–319.
Zhao, Y.; Wang, G.; Tang, C.; Luo, C.; Zeng, W.; Zha, Z.J. A Battle of Network Structures: An Empirical Study of CNN, Transformer, and MLP. arXiv 2021, arXiv:2108.13002.
Alzubaidi, L.; Zhang, J.; Humaidi, A.J.; Al-Dujaili, A.; Duan, Y.; Al-Shamma, O.; Santamaría, J.; Fadhel, M.A.; Al-Amidie, M.; Farhan, L. Review of deep learning: Concepts, CNN architectures, challenges, applications, future directions. J. Big Data 2021, 8, 1–74.
Shewalkar, A.; Nyavanandi, D.; Ludwig, S.A. Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU. J. Artif. Intell. Soft Comput. Res. 2019, 9, 235–245.
Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
Kingma, D.P.; Welling, M. Auto-Encoding Variational Bayes. arXiv 2013, arXiv:1312.6114.
Bank, D.; Koenigstein, N.; Giryes, R. Autoencoders. In Machine Learning for Data Science Handbook: Data Mining and Knowledge Discovery Handbook; Springer: Berlin/Heidelberg, Germany, 2023; pp. 353–374.
Doersch, C. Tutorial on variational autoencoders. arXiv 2016, arXiv:1606.05908.
Almazrouei, E.; Gianini, G.; Almoosa, N.; Damiani, E. A Deep Learning Approach to Radio Signal Denoising. In Proceedings of the 2019 IEEE Wireless Communications and Networking Conference Workshops (WCNCW), Marrakech, Morocco, 1–8 April 2019; pp. 1–8.
Migliori, B.; Zeller-Townson, R.; Grady, D.; Gebhardt, D. Biologically inspired radio signal feature extraction with sparse denoising autoencoders. arXiv 2016, arXiv:1605.05239.
Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144.
Creswell, A.; White, T.; Dumoulin, V.; Arulkumaran, K.; Sengupta, B.; Bharath, A.A. Generative adversarial networks: An overview. IEEE Signal Process. Mag. 2018, 35, 53–65.
Mirza, M.; Osindero, S. Conditional generative adversarial nets. arXiv 2014, arXiv:1411.1784.
Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. Adv. Neural Inf. Process. Syst. 2020, 33, 6840–6851.
Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. OpenAI. 2018. Available online: https://api.semanticscholar.org/CorpusID:49313245 (accessed on 21 December 2024).
Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.A.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F.; et al. LLaMA: Open and Efficient Foundation Language Models. arXiv 2023, arXiv:2302.13971.
Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.; Polosukhin, I. Attention Is All You Need. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
Liu, G.; Van Huynh, N.; Du, H.; Hoang, D.T.; Niyato, D.; Zhu, K.; Kang, J.; Xiong, Z.; Jamalipour, A.; Kim, D.I. Generative AI for Unmanned Vehicle Swarms: Challenges, Applications and Opportunities. arXiv 2024, arXiv:2402.18062.
Zhou, H.; Hu, C.; Yuan, Y.; Cui, Y.; Jin, Y.; Chen, C.; Wu, H.; Yuan, D.; Jiang, L.; Wu, D.; et al. Large language model (LLM) for telecommunications: A comprehensive survey on principles, key techniques, and opportunities. arXiv 2024, arXiv:2405.10825.
Ouyang, X.; Srivastava, M. LLMSense: Harnessing LLMs for High-level Reasoning Over Spatiotemporal Sensor Traces. arXiv 2024, arXiv:2403.19857.
Ji, S.; Zheng, X.; Wu, C. HARGPT: Are LLMs Zero-Shot Human Activity Recognizers? arXiv 2024, arXiv:2403.02727.
Liu, C.; Wei, Z.; Ng, D.W.K.; Yuan, J.; Liang, Y.C. Deep Transfer Learning for Signal Detection in Ambient Backscatter Communications. IEEE Trans. Wirel. Commun. 2021, 20, 1624–1638.
Levie, R.; Yapar, C.; Kutyniok, G.; Caire, G. RadioUNet: Fast Radio Map Estimation with Convolutional Neural Networks. IEEE Trans. Wirel. Commun. 2021, 20, 4001–4015.
Bahl, P.; Padmanabhan, V.N. RADAR: An In-Building RF-Based User Location and Tracking System. In Proceedings of the 19th Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM 2000), Tel Aviv, Israel, 26–30 March 2000; pp. 775–784.
Oh, J.; Kim, J. Adaptive K-nearest neighbour algorithm for WiFi fingerprint positioning. ICT Express 2018, 4, 91–94.
Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
Yao, Y.; Lv, K.; Huang, S.; Li, X.; Xiang, W. UAV trajectory and energy efficiency optimization in RIS-assisted multi-user air-to-ground communications networks. Drones 2023, 7, 272.
Lukito, W.D.; Xiang, W.; Lai, P.; Cheng, P.; Liu, C.; Yu, K.; Zhu, X. Integrated STAR-RIS and UAV for Satellite IoT Communications: An Energy-Efficient Approach. IEEE Internet Things J. 2024; Early Access.
Bremnes, K.; Moen, R.; Yeduri, S.R.; Yakkati, R.R.; Cenkeramaddi, L.R. Classification of UAVs utilizing fixed boundary empirical wavelet sub-bands of RF fingerprints and deep convolutional neural network. IEEE Sens. J. 2022, 22, 21248–21256.
Liu, Z.; Mao, H.; Wu, C.Y.; Feichtenhofer, C.; Darrell, T.; Xie, S. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 11976–11986.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In Proceedings of the 9th International Conference on Learning Representations (ICLR), Virtual Conference, 3–7 May 2021.
Medaiyese, O.O.; Ezuma, M.; Lauf, A.P.; Adeniran, A.A. Hierarchical Learning Framework for UAV Detection and Identification. IEEE J. Radio Freq. Identif. 2022, 6, 176–188.
Lee, W.; Park, J. LLM-Empowered Resource Allocation in Wireless Communications Systems. arXiv 2024, arXiv:2408.02944.
Khachatrian, H.; Mkrtchyan, R.; Raptis, T.P. Outdoor Environment Reconstruction with Deep Learning on Radio Propagation Paths. arXiv 2024, arXiv:2402.17336.
Chen, L.; Zheng, L.; Xia, D.; Sun, D.; Liu, W. STL-Detector: Detecting City-Wide Ride-Sharing Cars via Self-Taught Learning. IEEE Internet Things J. 2022, 9, 2346–2360.
Li, J.; Zhang, D.; Wu, Z.; Yu, C.; Li, Y.; Chen, Q.; Hu, Y.; Sun, Q.; Chen, Y. SBRF: A Fine-Grained Radar Signal Generator for Human Sensing. IEEE Trans. Mob. Comput. 2024; 1–17, Early Access.
Zeng, J.; Liu, X.; Li, Z. Radio Anomaly Detection Based on Improved Denoising Diffusion Probabilistic Models. IEEE Commun. Lett. 2023, 27, 1979–1983.
Menghani, G. Efficient deep learning: A survey on making deep learning models smaller, faster, and better. ACM Comput. Surv. 2023, 55, 1–37.
Liu, C.; Yuan, W.; Li, S.; Liu, X.; Li, H.; Ng, D.W.K.; Li, Y. Learning-based Predictive Beamforming for Integrated Sensing and Communication in Vehicular Networks. IEEE J. Sel. Areas Commun. 2022, 40, 2317–2334.
Demirhan, U.; Alkhateeb, A. Integrated Sensing and Communication for 6G: Ten Key Machine Learning Roles. IEEE Commun. Mag. 2023, 61, 113–119.
McMahan, B.; Moore, E.; Ramage, D.; Hampson, S.; Aguera y Arcas, B. Communication-Efficient Learning of Deep Networks from Decentralized Data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, FL, USA, 20–22 April 2017; pp. 1273–1282.
Nguyen, D.C.; Ding, M.; Pathirana, P.N.; Seneviratne, A.; Li, J.; Vincent Poor, H. Federated Learning for Internet of Things: A Comprehensive Survey. IEEE Commun. Surv. Tutor. 2021, 23, 1622–1658.

Figure 1. Organization of this survey.

Figure 2. An illustration of the challenges of Radio Frequency (RF) sensing in outdoor environments, including multipath interference and environmental factors.

Figure 3. An illustration of multipath propagation.

Figure 4. Representative architectures of discriminative models commonly applied in RF sensing.

Figure 5. Representative architectures of generative models commonly applied in RF sensing.

Figure 6. A proposed pipeline for integrating generative and discriminative models in RF sensing.

Table 1. Path loss exponent n in various environments [46].

Environment	Path Loss Exponent (n)
Free space (Outdoor)	2
Urban area cellular radio (Outdoor)	2.7–3.5
Shadowed urban cellular radio (Outdoor)	3–5
In-building LoS (Indoor)	1.6–1.8
Obstructed in building (Indoor)	4–6
Obstructed in factories (Indoor)	2–3
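
As a rough illustration of how the exponents in Table 1 are typically used, the sketch below evaluates the standard log-distance path loss model, PL(d) = PL(d0) + 10 n log10(d/d0), with the reference loss PL(d0) taken from the free-space formula. The function names, the 1 m reference distance, and the use of range midpoints for n are assumptions made for this example; they are not taken from the survey.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0

# Path loss exponents from Table 1. Where the table gives a range, the
# midpoint is used here purely for illustration (an assumption).
PATH_LOSS_EXPONENT = {
    "free_space": 2.0,
    "urban_cellular": 3.1,           # table range: 2.7-3.5
    "shadowed_urban_cellular": 4.0,  # table range: 3-5
    "in_building_los": 1.7,          # table range: 1.6-1.8
    "obstructed_in_building": 5.0,   # table range: 4-6
    "obstructed_in_factories": 2.5,  # table range: 2-3
}


def free_space_path_loss_db(distance_m: float, frequency_hz: float) -> float:
    """Free-space path loss in dB between isotropic antennas."""
    return 20.0 * math.log10(
        4.0 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_S
    )


def log_distance_path_loss_db(
    distance_m: float, frequency_hz: float, n: float, d0_m: float = 1.0
) -> float:
    """Log-distance path loss in dB with exponent n.

    The reference loss at d0_m is assumed to follow the free-space formula.
    """
    if distance_m < d0_m:
        raise ValueError("distance_m must be at least the reference distance d0_m")
    reference_loss_db = free_space_path_loss_db(d0_m, frequency_hz)
    return reference_loss_db + 10.0 * n * math.log10(distance_m / d0_m)


if __name__ == "__main__":
    # Example: an 868 MHz link (one of the LoRa bands in Table 2) over 1 km.
    for environment, n in PATH_LOSS_EXPONENT.items():
        loss_db = log_distance_path_loss_db(1000.0, 868e6, n)
        print(f"{environment:>25s}: n = {n:.1f}, path loss = {loss_db:.1f} dB")
```

With n = 2 the model reduces to free-space loss, and each tenfold increase in distance adds 10 n dB, which is why shadowed urban links lose signal so much faster than line-of-sight ones.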

Table 2. Comparison of wireless technologies for sensing applications.

Name	Sensing Range	Transmission Power	Operating Frequency	Outdoor Application
LoRa [57,58]	Up to 15 km	Up to 20 dBm	433 MHz, 868 MHz, 915 MHz	Environmental monitoring, outdoor localization
mmWave [4,59]	Up to 500 m	30–40 dBm	30–300 GHz	High-resolution outdoor surveillance, autonomous vehicle navigation
LTE	Up to 100 m	23–43 dBm	450 MHz–3.8 GHz	Environmental sensing, asset tracking
Wi-Fi [60]	Up to 100 m	Up to 30 dBm	2.4 GHz, 5 GHz, 6 GHz	Public space monitoring, human sensing
RFID [61]	Up to 10 cm	N/A	125–134 kHz (Low Frequency)	Asset tracking, wildlife monitoring
RFID [61]	Up to 1 m	N/A	13.56 MHz (High Frequency)	Asset tracking, wildlife monitoring
RFID [61]	Up to 10 m	N/A	860–960 MHz (Ultra-High Frequency)	Asset tracking, wildlife monitoring
UWB [62]	Up to 200 m	−41.3 dBm/MHz	3.1–10.6 GHz	High-precision localization
Terahertz [63]	Up to 10 m	N/A	0.3–3 THz	Non-invasive inspection, imaging
ZigBee [64]	Up to 100 m	Up to 20 dBm	2.4 GHz	Short-range sensing
Bluetooth [54]	Up to 100 m	0–20 dBm	2.4 GHz	Short-range sensing

Table 3. Summary of recent works leveraging Deep Learning models in RF sensing.

Study	Wireless Technology	Data Type	Algorithm
CoSense [30]	mmWave radar	Experimental	Conditional GAN (mmWave radio map reconstruction)
Zhang et al. [92]	Ray tracing (5.9 GHz)	Numerical	Conditional GAN (radio map estimation without transmitter info)
Teganya and Romero [93]	Path loss, shadowing, ray tracing	Mixed	Deep convolutional autoencoder (radio map completion)
RFGen [94]	mmWave radar (24 GHz, 77 GHz)	Numerical	Generative Diffusion Models (synthetic RF data generation)
Chi et al. [31]	Wi-Fi (5.825 GHz), mmWave	Mixed	Hierarchical Diffusion Transformer (RF signal reconstruction)
Xu et al. [95]	RF signals (RadioML2016.10a)	Numerical	Diffusion-based augmentation (DiRSA) (for AMC)
Chang et al. [96]	RF signals (low SNR)	Experimental	GRU + Transformer + MFCC (RFF classification under low SNR)
IoT-LLM [97]	IMU, ECG, Wi-Fi CSI, RSSI	Experimental	Structured LLM Reasoning Framework (IoT task reasoning)
Babel [98]	Wi-Fi, mmWave	Experimental	Multi-modal contrastive learning (multimodal alignment for HAR)
AirECG [99]	mmWave radar (60–77 GHz)	Experimental	Cross-domain Diffusion Model (mmWave-to-ECG translation)
Yapar et al. [14]	Custom RF (Ray tracing)	Numerical	LocUNet (UNet-based CNN) (localization from path loss maps)
WRIST [100]	2.4 GHz ISM (Wi-Fi, Bluetooth, ZigBee)	Mixed	YOLO-based DL model (RF emission detection and classification)
DeepFeat [101]	LTE (4G networks)	Experimental	Deep Feed-Forward Neural Network (large-scale outdoor localization)
Xue et al. [102]	5.725–5.850 GHz ISM	Experimental	CNN + LSTM (UAV classification under varying conditions)
Podder et al. [103]	2.4 GHz ISM (UAV protocols)	Experimental	CNN (ResNet-50V2) (UAV recognition from spectrograms)
Alam et al. [5]	2.4 GHz ISM (Wi-Fi, UAV controllers)	Experimental	Deep CNN with residual connections (UAV detection in congested bands)
HealthDAR [104]	mmWave FMCW Radar (77–79.585 GHz)	Experimental	ResNet-18 + SSPD (vital signs and activity monitoring)
Wang et al. [105]	mmWave FMCW Radar (60–64 GHz)	Experimental	mmParse (PointNet + NLN + attention) (human parsing)
SenseFi [106]	Wi-Fi (CSI data)	Experimental	MLP, CNN, LSTM, Transformer (benchmark for Wi-Fi-based human sensing)
Song et al. [107]	433 MHz, 915 MHz, 2.4 GHz	Experimental	Lightweight CNN + CBAM (human-vehicle recognition)
Wang et al. [108]	2.4 GHz ISM (UAV, Wi-Fi, Bluetooth)	Experimental	UAV-CTNet (CNN + Transformer) (UAV detection and identification)
Nie et al. [67]	LoRa signals	Experimental	CNN-LSTM, Swin Transformer, ConvNeXt, Vision Transformer (LoRa-based HAR)

Table 4. Comparison of Deep Learning model types in RF sensing.

Type of Approach	Model	Objectives	Advantages	Disadvantages
Discriminative	MLPs	Classification, regression	Simple architecture, easy to implement, efficient for small datasets	Limited capacity for spatial/temporal information, not scalable for complex tasks
Discriminative	CNNs	Signal representation, feature extraction	Good at extracting spatial features	Limited for temporal information without additional structures
Discriminative	RNNs	Sequential signal analysis, time-series prediction	Handles sequential and temporal dependencies well	Prone to vanishing/exploding gradient problems, less efficient for long sequences
Generative	AEs	Dimensionality reduction, anomaly detection	Good for feature extraction, data compression	Poor reconstruction with complex signals, requires tuning of latent space size
Generative	GANs	RF signal generation, data augmentation, anomaly detection	Capable of generating realistic data	Difficult to train, sensitive to hyperparameters
Generative	DMs	Signal denoising, enhancement, and generative modeling	High quality in denoising and generating diverse data, robust training	Computationally intensive, slow to generate outputs compared to GANs
Generative	LLMs	Cross-modal RF sensing, sequence modeling	Excellent for capturing long-range dependencies, scalable, adaptable to different tasks (e.g., classification, localization)	Requires large datasets or well-pre-trained models, computationally expensive

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Nguyen, Q.D.M.; Lukito, W.D.; Liu, X.; Liu, C. Deep Learning-Empowered RF Sensing in Outdoor Environments: Recent Advances, Challenges, and Future Directions. Electronics 2025, 14, 125. https://doi.org/10.3390/electronics14010125

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers.