Article

1D-CLANet: A Novel Network for NLoS Classification in UWB Indoor Positioning System

1 Light Alloy Research Institute, Central South University, Changsha 410083, China
2 State Key Laboratory of Precision Manufacturing for Extreme Service Performance, Changsha 410083, China
3 School of Mechanical and Electrical Engineering, Central South University, Changsha 410083, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7609; https://doi.org/10.3390/app14177609
Submission received: 4 July 2024 / Revised: 1 August 2024 / Accepted: 19 August 2024 / Published: 28 August 2024
(This article belongs to the Section Electrical, Electronics and Communications Engineering)

Abstract

Ultra-Wideband (UWB) technology is crucial for indoor localization systems due to its high accuracy and robustness in multipath environments. However, Non-Line-of-Sight (NLoS) conditions can cause UWB signal distortion, significantly reducing positioning accuracy. Thus, distinguishing between NLoS and LoS scenarios and mitigating positioning errors is crucial for enhancing UWB system performance. This research proposes a novel 1D-ConvLSTM-Attention network (1D-CLANet) for extracting UWB temporal channel impulse response (CIR) features and identifying NLoS scenarios. The model combines the convolutional neural network (CNN) and Long Short-Term memory (LSTM) architectures to extract temporal CIR features and introduces the Squeeze-and-Excitation (SE) attention mechanism to enhance critical features. Integrating SE attention with LSTM outputs boosts the model’s ability to differentiate between various NLoS categories. Experimental results show that the proposed 1D-CLANet with SE attention achieves superior performance in differentiating multiple NLoS scenarios with limited computational resources, attaining an accuracy of 95.58%. It outperforms other attention mechanisms and the version of 1D-CLANet without attention. Compared to advanced methods, the SE-enhanced 1D-CLANet significantly improves the ability to distinguish between LoS and similar NLoS scenarios, such as human obstructions, enhancing overall recognition accuracy in complex environments.

1. Introduction

In recent years, the need for precise positioning has increased across various sectors. While satellite-based positioning systems are effective outdoors, their limited signal penetration makes them unsuitable for indoor navigation, prompting the development of indoor positioning systems [1,2]. Advances in sensing technologies have introduced methods such as Wi-Fi, RSSI, and Ultra-Wideband (UWB) positioning, which are widely utilized in the Internet of Things (IoT), smart buildings, healthcare, and industrial navigation [3]. UWB technology is particularly notable for its high temporal accuracy, low power consumption, and strong penetration capabilities, making it ideal for precise indoor applications [4].
However, UWB indoor positioning systems (IPS) face significant challenges due to their susceptibility to Non-Line-of-Sight (NLoS) conditions, which greatly reduce positioning accuracy [5]. As shown in Figure 1, under NLoS conditions, signals from the transmitter (TX) are prone to refraction, reflection, and diffraction. Only a small portion of the signal can penetrate obstacles and reach the receiver (RX) directly. The different arrival times and phases of these signal components can cause distortion or errors in the original signal, a phenomenon known as the multipath effect [6]. This issue is particularly severe in complex indoor environments with significant occlusion. Accurate distance estimation in UWB networks relies on measuring the time of flight (ToF) under optimal Line-of-Sight (LoS) conditions. Nevertheless, NLoS conditions, prevalent in indoor settings, distort signal delays, leading to positive biases in ToF estimates and subsequent distance measurements [7]. Most UWB positioning algorithms, such as trilateration [8] and the least squares method [9], assume ideal LoS conditions between anchors and tags, which can result in significant errors when NLoS and multipath components are present [10]. Therefore, identifying and mitigating the negative impact of NLoS is crucial.
Recent advancements in NLoS identification techniques for UWB positioning have greatly enhanced accuracy in indoor environments [11]. Key approaches include statistical hypothesis testing [12], machine learning (ML) [13], and deep learning (DL) methods [14]. Statistical hypothesis testing evaluates the likelihood of NLoS conditions based on signal characteristics [12], analyzing the statistical properties of the received signal metrics, such as the signal strength or time-of-flight, to detect deviations indicative of NLoS propagation. ML algorithms rely on extracting hand-crafted features from the channel impulse response (CIR) data and training models using labeled datasets to differentiate between LoS and NLoS scenarios [13], employing techniques like decision trees, support vector machines (SVMs), and ensemble methods to identify patterns in signal data characteristic of NLoS conditions. DL techniques, including recurrent neural networks (RNNs) and convolutional neural networks (CNNs), directly process raw UWB signal data [14], automatically learning hierarchical features that capture complex temporal and spatial dependencies. For instance, CNNs extract spatial features through convolutional layers, while RNNs, particularly Long Short-Term memory (LSTM) networks, capture sequential dependencies, making them well-suited for time-series data inherent in UWB signals [15]. These advancements are crucial for enhancing the accuracy and reliability of UWB ranging and positioning systems in dynamic surroundings with obstacles, effectively identifying and mitigating NLoS conditions to ensure robust performance [16].
Although binary NLoS recognition methods have achieved high accuracy, they struggle to differentiate subtle variations in complex indoor environments. To address this, we introduce an innovative NLoS recognition model based on a low-power UWB chip, named the 1D-ConvLSTM-Attention network (1D-CLANet). This model utilizes CIR data to accurately identify LoS and segmented NLoS conditions, requiring no prior knowledge and making it adaptable to various deployment scenarios.
The main contributions are as follows:
(1) A novel network that integrates CNN, LSTM, and SE attention mechanisms is proposed. This architecture efficiently combines the deep feature extraction capability of 1D-CNN, the time series processing strength of LSTM, and the SE attention mechanism’s focus on critical features, significantly enhancing NLoS classification accuracy for UWB indoor positioning.
(2) By leveraging 1D-CNN and other lightweight design considerations, the proposed method achieves a balance between high classification accuracy and low processing time. The integration of LSTM spatiotemporal features with the SE attention mechanism enables the model to effectively distinguish challenging features that other methods often confuse, such as LoS and human obstructions.
(3) We validated the efficacy of our proposed method on both publicly available datasets and multiclass datasets collected from experiments. The experimental outcomes indicate that our approach not only significantly improves NLoS binary classification performance but also shows excellent performance in multiclass recognition of common NLoS obstructions in indoor environments.
The structure of the remaining sections of this paper is as follows. Section 2 reviews the related work on UWB NLoS identification. Section 3 explores the feasibility of using DL for UWB NLoS identification. Section 4 describes the 1D-CLANet model for NLoS identification, which is based on CNN, LSTM, and Squeeze-and-Excitation (SE) attention mechanisms. Section 5 details the findings from our experiments. Finally, Section 6 summarizes the article and briefly addresses future research directions.
All symbols used in the paper are defined as shown in Table 1.

2. Related Work

Traditional methods for NLoS identification rely on signal propagation characteristics and prior information [17], including distance-based, prior map-based, and channel feature-based approaches.
Distance-based methods typically use statistical analysis of the estimated or measured distances to identify NLoS conditions, employing metrics including time of arrival (TOA), time difference of arrival (TDOA), angle of arrival (AOA), and received signal strength indicator (RSSI) [18,19], or their combinations [19]. Shi et al. [20] introduced a low-cost UWB distance compensation model that uses kurtosis to detect NLoS environments, addressing electromagnetic wave loss issues in real indoor settings. Liu et al. [21] developed an SVM-based method using ratio data from ranging statistics for NLoS identification. Despite their efficiency, these methods are prone to misjudgments due to distance information fluctuations, increasing NLoS measurement errors.
In static, well-mapped indoor environments, prior map-based methods are more suitable. These methods use the environmental layout and tag positions to identify NLoS conditions [22]. Wang et al. [23] created LoS/NLoS maps and proposed an NLoS correction localization method based on these maps, using a map-corrected extended Kalman filter (MKF) algorithm for optimizing trajectories in NLoS environments with few anchors. The downside is the extensive data collection required for prior maps and the inability to identify the dynamic obstacles causing NLoS occlusion.
Channel feature-based methods leverage advancements in signal systems to extract features from UWB data. Compared to distance-based methods, they better utilize signal propagation information and are often combined with ML techniques like SVM [24] and Gaussian distribution (GD) models [25]. These methods have shown success in binary LoS/NLoS classification. For example, Wymeersch et al. [26] used static parameters from waveforms and trained an SVM for prediction, while Che et al. [27] proposed a generalized Gaussian distribution (GGD) technique for NLoS recognition. Yang et al. [28] introduced a two-step NLoS detection method using decision trees and feedforward neural networks for fine-tuning. However, these methods often fail to generalize across various NLoS types, and obtaining accurately labeled datasets is challenging, resulting in limited precision due to insufficient deep feature extraction by ML models.
In contrast, DL methods excel at feature extraction in NLoS scenarios. Si et al. [29] proposed a multilayer perceptron (MLP) model combining raw CIR features with handcrafted features for LoS/NLoS recognition. Jiang et al. [30] improved CNN-based NLoS recognition by researching CIR denoising methods, achieving higher accuracy. Miramá et al. [31] used LSTM networks to identify and classify human obstructions under 6.5 GHz NLoS conditions. Jeong et al. [32] introduced a hybrid quantum CNN (HQCNN) for LoS/NLoS classification, showing excellent performance. Additionally, attention mechanisms integrated into DL models have yielded promising results. Yang et al. [33] combined attention mechanisms with Bi-directional LSTM (BiLSTM) and CNN for high-accuracy NLoS classification, while Niu et al. [34] embedded an efficient channel attention (ECA) module into a residual network for greenhouse environments. Tian et al. [35] combined gated recurrent units (GRU) with attention mechanisms to capture dynamic UWB signal information, reducing NLoS classification errors.
All relevant methods are summarized in Table 2. Despite the significant advancements in DL-based UWB NLoS research in recent years, most methods still focus on LoS/NLoS binary classification. There is limited exploration of NLoS multi-class classification, especially in complex indoor environments. Research on NLoS multi-class identification can help systems better understand UWB signal propagation and obstacle types, thereby addressing specific ranging and positioning errors. While some advanced techniques have significantly improved model performance, incorporating attention mechanisms and complex model structures typically increases model size and computational requirements, leading to higher hardware costs. For UWB positioning tasks, it is therefore crucial to design more lightweight and efficient models.

3. Theoretical Analysis

As shown in Figure 2, we take a 3-anchor positioning model as an example. Assume the coordinates of each anchor are known and denoted as ( x 1 , y 1 ) , ( x 2 , y 2 ) , …, ( x n , y n ) where n = 3 in Figure 2. The coordinates of the tag to be determined are denoted as ( x , y ). The distances from the tag to each anchor, measured by the UWB device using ToF, are d 1 , d 2 , …, d n . This yields a system of nonlinear equations for the distances from the tag to each anchor:
$$
\begin{cases}
d_1 = \sqrt{(x_1 - x)^2 + (y_1 - y)^2} \\
d_2 = \sqrt{(x_2 - x)^2 + (y_2 - y)^2} \\
\qquad\vdots \\
d_n = \sqrt{(x_n - x)^2 + (y_n - y)^2}
\end{cases} \tag{1}
$$
Square both sides of each equation in Equation (1) and subtract the n-th equation from each of the first $n-1$ equations. Combine the terms containing $x$ and $y$, and express the resulting equations in matrix form:
$$
\begin{bmatrix}
2(x_1 - x_n) & 2(y_1 - y_n) \\
2(x_2 - x_n) & 2(y_2 - y_n) \\
\vdots & \vdots \\
2(x_{n-1} - x_n) & 2(y_{n-1} - y_n)
\end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix}
d_n^2 - d_1^2 + x_1^2 - x_n^2 + y_1^2 - y_n^2 \\
d_n^2 - d_2^2 + x_2^2 - x_n^2 + y_2^2 - y_n^2 \\
\vdots \\
d_n^2 - d_{n-1}^2 + x_{n-1}^2 - x_n^2 + y_{n-1}^2 - y_n^2
\end{bmatrix} \tag{2}
$$
Simplify the above equations to the form $A_{ls} X = B_{ls}$, where $A_{ls}$ and $B_{ls}$ are given by
$$
A_{ls} = \begin{bmatrix}
2(x_1 - x_n) & 2(y_1 - y_n) \\
2(x_2 - x_n) & 2(y_2 - y_n) \\
\vdots & \vdots \\
2(x_{n-1} - x_n) & 2(y_{n-1} - y_n)
\end{bmatrix} \tag{3}
$$
$$
B_{ls} = \begin{bmatrix}
d_n^2 - d_1^2 + x_1^2 - x_n^2 + y_1^2 - y_n^2 \\
d_n^2 - d_2^2 + x_2^2 - x_n^2 + y_2^2 - y_n^2 \\
\vdots \\
d_n^2 - d_{n-1}^2 + x_{n-1}^2 - x_n^2 + y_{n-1}^2 - y_n^2
\end{bmatrix} \tag{4}
$$
After converting to a system of linear equations, we can solve the above equation using the least squares method.
$$
X = \left(A_{ls}^{T} A_{ls}\right)^{-1} A_{ls}^{T} B_{ls} \tag{5}
$$
where $X = [x \;\; y]^{T}$; this allows us to obtain the calculated position of the UWB tag within the multi-anchor positioning network.
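To make the least-squares step concrete, the following NumPy sketch solves Equations (2)–(5) for a tag position; the anchor layout, noise level, and function names are illustrative assumptions rather than part of the proposed system.
```python
# Minimal least-squares trilateration sketch (NumPy), following Equations (2)-(5).
import numpy as np

def solve_position_ls(anchors, d):
    """Estimate tag position X = (A^T A)^{-1} A^T B from n anchors and n ranges d."""
    anchors = np.asarray(anchors, dtype=float)   # shape (n, 2)
    d = np.asarray(d, dtype=float)               # shape (n,)
    xn, yn = anchors[-1]                         # use the last anchor as reference
    A = 2.0 * (anchors[:-1] - anchors[-1])       # rows: [2(x_i - x_n), 2(y_i - y_n)]
    B = (d[-1] ** 2 - d[:-1] ** 2
         + anchors[:-1, 0] ** 2 - xn ** 2
         + anchors[:-1, 1] ** 2 - yn ** 2)
    # X = (A^T A)^{-1} A^T B; lstsq is the numerically safer equivalent
    X, *_ = np.linalg.lstsq(A, B, rcond=None)
    return X

# Example with three anchors (hypothetical layout) and noisy ranges to a tag at (2, 3)
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_pos = np.array([2.0, 3.0])
d = [np.linalg.norm(true_pos - np.array(a)) + np.random.normal(0, 0.05) for a in anchors]
print(solve_position_ls(anchors, d))             # ~[2, 3]
```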
There are many other methods for calculating positions in UWB positioning networks, such as gradient descent and robust regression. However, as discussed in the introduction, nearly all positioning methods are inevitably affected by UWB ranging errors, especially under NLoS conditions. As shown in Equation (6), NLoS can significantly increase the gross error Δ d ( t ) in the measured distance. As shown in Figure 2b, NLoS intuitively causes a positive error in the measurement d 2 from AP2 to the tag, resulting in a positioning error that shifts the result from the blue point to the red point.
$$
\begin{cases}
d_{\mathrm{LoS}} = d^{*} + \eta(t) \\
d_{\mathrm{NLoS}} = d^{*} + \Delta d(t) + \eta(t)
\end{cases} \tag{6}
$$
where $d_{\mathrm{LoS}}$ and $d_{\mathrm{NLoS}}$ are the UWB measurements under LoS and NLoS conditions, respectively, $d^{*}$ is the true distance between tag and anchor, and $\eta(t)$ is the UWB measurement noise.
Therefore, mitigating UWB ranging errors under NLoS conditions or selecting the ranging results from anchors under LoS conditions for position calculations is crucial for reducing positioning errors in UWB IPS. To achieve these technical effects, identifying LoS/NLoS propagation conditions is an essential prerequisite for both error mitigation methods. Currently, the most common approach is to use UWB CIR features to identify the channel’s LoS/NLoS propagation conditions. The CIR describes the characteristics of a channel by detailing its response to a brief impulse signal, thereby outlining the communication channel’s properties. In a UWB localization system, the transmitted symbol s ( t ) from each anchor or tag can be represented as [36]
$$
s(t) = A \sum_{m=0}^{M-1} p(t - mT_p) \tag{7}
$$
where $A$ denotes the amplitude of the transmitted pulse, $p(t)$ represents the single Gaussian pulse waveform, and $M$ pulses, each with a period of $T_p$, form specific frames. The impulse response of the transmission channel can be expressed as
$$
h(t) = \sum_{c=1}^{C} \eta_c \, \delta(t - \alpha_c) \tag{8}
$$
where $\eta_c$ and $\alpha_c$ denote the fading coefficient and the time delay of the c-th path, respectively. Consequently, the received signal is the sum of transmitted signals, each experiencing attenuation and delay through multiple paths, represented as
$$
r(t) = \sum_{c=1}^{C} \eta_c \, s(t - \alpha_c) + n(t) \tag{9}
$$
where $n(t)$ is the additive white Gaussian noise (AWGN) with variance $\sigma_r^2$.
Figure 3 illustrates the CIR curves under both LoS and NLoS conditions. The primary distinction lies in the first path (FPH); LoS curves feature a consistent FPH peak, while NLoS curves exhibit a clipped FPH without a peak. Additionally, the maximum value of LoS CIR curves surpasses that of NLoS curves. Multipath effects influence NLoS CIR curves, creating multiple local peaks, with more severe obstructions causing an increase in these peaks. These varying characteristics enable the differentiation between LoS and NLoS conditions using ML methods.
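As a rough illustration of the signal model in Equations (7)–(9), the NumPy sketch below superimposes a few attenuated, delayed Gaussian pulses plus AWGN; the path gains, delays, sampling rate, and noise level are hypothetical values chosen only to mimic the qualitative LoS/NLoS differences described above, not measured channel parameters.
```python
# Illustrative discrete-time sketch of the multipath model in Equations (7)-(9).
import numpy as np

fs = 1e9                                    # assumed sample rate: 1 ns resolution
t = np.arange(0, 200e-9, 1 / fs)

def gaussian_pulse(t, t0, sigma=2e-9):
    """Single Gaussian pulse p(t - t0) used as the transmitted waveform."""
    return np.exp(-0.5 * ((t - t0) / sigma) ** 2)

def received_signal(path_gains, path_delays, noise_std=0.02):
    """r(t) = sum_c eta_c * s(t - alpha_c) + n(t), with s(t) a single pulse here."""
    r = np.zeros_like(t)
    for eta_c, alpha_c in zip(path_gains, path_delays):
        r += eta_c * gaussian_pulse(t, 20e-9 + alpha_c)
    return r + np.random.normal(0, noise_std, t.shape)   # AWGN n(t)

# Strong direct path vs. attenuated first path with extra multipath components
r_los  = received_signal([1.0, 0.3],           [0.0, 15e-9])
r_nlos = received_signal([0.3, 0.5, 0.4, 0.2], [5e-9, 18e-9, 30e-9, 55e-9])
```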

4. Proposed Method

This paper introduces a lightweight method for UWB NLoS multi-class identification based on 1D-CLANet. The 1D-CLANet architecture is designed to extract the spatiotemporal features of CIR signals for channel model classification. As illustrated in Figure 4, the proposed network is composed of two main components: a feature extraction module and a feature fusion and classification module. At the heart of the network is the feature extraction module, which integrates CNN and LSTM. This module includes a CNN-based section for spatial feature extraction and an LSTM-based section for temporal feature extraction. The SE attention mechanism module further adjusts the weights of each channel. The feature fusion module then combines the spatiotemporal features with the key features highlighted by the SE attention mechanism and feeds them into a classifier. The following sections provide a detailed overview of each module.

4.1. Convolutional Neural Network

Figure 5 illustrates the structure of our 1D-CNN module, where the input $1 \times 512$ sequence data are indexed from positions 504 to 1016 of the original $1 \times 1016$ CIR data using a fixed-size window. The core concept of the 1D-CNN is the convolution step, which can be viewed as a correlation process. Our 1D-CNN module consists of three convolutional filters with batch normalization (BN); the kernel sizes for Conv1 to Conv3 are $16 \times 3$, $32 \times 3$, and $64 \times 3$, respectively. All Maxpooling layers have a kernel size of $1 \times 2$. The convolutional filters are defined as follows:
$$
u_C = \mathbf{v}_C * \mathbf{Z} = \sum_{s=1}^{C} v_C^{s} \, z_s \tag{10}
$$
where $\mathbf{Z} = [z_1, z_2, \ldots, z_L]$ is the input CIR signal with length $L = 512$ and $\mathbf{U} = [u_1, u_2, \ldots, u_C] \in \mathbb{R}^{H \times W \times C}$ represents the output feature map after the convolution operation. The feature map has spatial dimensions $H \times W$ and $C$ channels. $\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_C]$ is the set of learned filter kernels, where each filter $\mathbf{v}_C$ is defined as $[v_C^1, v_C^2, \ldots, v_C^L]$. The kernels are applied with a stride of 1 and padding set to 1.
Normalizing the input can significantly reduce the number of epochs required for training. It also has a regularization effect, which helps to reduce generalization error. The batch normalization layer is defined as follows:
$$
\hat{u}_c = \gamma_{BN} \frac{u_c - \mu}{\sqrt{\sigma^2 + \varepsilon}} + \beta \tag{11}
$$
where $\gamma_{BN}$ and $\beta$ are learnable parameters, $\mu$ and $\sigma^2$ are the mean and variance of the batch, and $\varepsilon$ is a small constant added to the variance for numerical stability (commonly $\varepsilon = 10^{-5}$).
After the batch normalization operation, the outputs U ^ = [ u ^ 1 , u ^ 2 , , u ^ C ] are passed through the ReLU activation function. Pooling layers are typically added after CNN modules to perform spatial downsampling of the feature maps. Here, the ReLU activation function and Maxpooling layer work together to introduce sparsity and enhance nonlinearity.
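A minimal PyTorch sketch of the 1D-CNN branch described above is given below, assuming three Conv1d (kernel 3, stride 1, padding 1) + BN + ReLU + MaxPool(2) stages with 16, 32, and 64 output channels applied to a 1 × 512 CIR window; the layer names and exact stage ordering are our assumptions, not the authors' released code.
```python
# Sketch of the 1D-CNN spatial feature extractor (Conv1d + BN + ReLU + MaxPool x3).
import torch
import torch.nn as nn

class CNNBranch(nn.Module):
    def __init__(self):
        super().__init__()
        def block(c_in, c_out):
            return nn.Sequential(
                nn.Conv1d(c_in, c_out, kernel_size=3, stride=1, padding=1),
                nn.BatchNorm1d(c_out),        # batch normalization, Equation (11)
                nn.ReLU(),
                nn.MaxPool1d(kernel_size=2),  # spatial downsampling
            )
        self.features = nn.Sequential(block(1, 16), block(16, 32), block(32, 64))

    def forward(self, x):                     # x: (batch, 1, 512) normalized CIR window
        return self.features(x)               # -> (batch, 64, 64) after three pools

cir = torch.randn(32, 1, 512)                 # a batch of 32 CIR windows
print(CNNBranch()(cir).shape)                 # torch.Size([32, 64, 64])
```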

4.2. Long Short-Term Memory Cell

However, 1D-CNNs have some limitations, such as their inability to capture long-term dependencies due to fixed-size windows, a tendency to overfit due to numerous parameters, and insufficient recognition of sequential data relationships. To tackle these challenges, the LSTM network, introduced by Hochreiter and Schmidhuber [37], provides an effective solution. Illustrated in Figure 6, the LSTM network, a variant of RNN, adeptly handles time-series sequence issues by incorporating memory blocks. These blocks contain a forget gate, an input gate, a cell state update, and an output gate. In the 1D-CLANet model, the deep spatial features extracted by the 1D-CNN are flattened by Flatten1 and then fed into an LSTM with 512 hidden units to produce spatiotemporal features (sequences). This architecture allows the network to effectively manage temporal dependencies and complex sequences, thereby alleviating the computational constraints and error propagation issues found in traditional feedforward neural networks [31,38].
The calculation for the memory block depicted in Figure 6 is as follows:
$$
f_t = \sigma\left(w_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{12}
$$
$$
i_t = \sigma\left(w_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{13}
$$
$$
\hat{C}_t = \tanh\left(w_c \cdot [h_{t-1}, x_t] + b_c\right) \tag{14}
$$
$$
C_t = f_t \odot C_{t-1} + i_t \odot \hat{C}_t \tag{15}
$$
$$
O_t = \sigma\left(w_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{16}
$$
$$
h_t = O_t \odot \tanh(C_t) \tag{17}
$$
where $f_t$ represents the forget gate’s activation vector, $i_t$ denotes the input/update gate’s activation vector, $\hat{C}_t$ is the cell input activation vector, $C_t$ is the current cell memory, $O_t$ stands for the output gate’s activation vector, and $h_t$ is the current cell output. The bias vectors and weight matrices for the input gate ($i$), output gate ($o$), forget gate ($f$), and memory cell ($c$) are denoted by $b$ and $w$, respectively. $h_{t-1}$ is the previous cell output, $C_{t-1}$ is the previous cell memory, $\sigma$ signifies the sigmoid function, and “$\odot$” indicates the Hadamard product [38].
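The NumPy sketch below is a literal, single-step transcription of Equations (12)–(17); the toy dimensions and random weights are placeholders, and in 1D-CLANet this role is played by an LSTM layer with 512 hidden units.
```python
# One LSTM memory-block update, mirroring Equations (12)-(17).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """W[g] maps the concatenation [h_{t-1}, x_t] to the hidden size for gate g in {f, i, c, o}."""
    z = np.concatenate([h_prev, x_t])                # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])               # forget gate, Eq. (12)
    i_t = sigmoid(W["i"] @ z + b["i"])               # input/update gate, Eq. (13)
    c_hat = np.tanh(W["c"] @ z + b["c"])             # cell input activation, Eq. (14)
    c_t = f_t * c_prev + i_t * c_hat                 # new cell memory, Eq. (15)
    o_t = sigmoid(W["o"] @ z + b["o"])               # output gate, Eq. (16)
    h_t = o_t * np.tanh(c_t)                         # cell output, Eq. (17)
    return h_t, c_t

hidden, inp = 8, 4                                   # toy sizes for illustration only
rng = np.random.default_rng(0)
W = {g: rng.normal(0, 0.1, (hidden, hidden + inp)) for g in "fico"}
b = {g: np.zeros(hidden) for g in "fico"}
h, c = lstm_step(rng.normal(size=inp), np.zeros(hidden), np.zeros(hidden), W, b)
```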

4.3. Squeeze-and-Excitation Block

In complex electromagnetic environments, the distinguishing features between UWB base stations and tags are weak. Traditional methods struggle with CIR data for LoS/NLoS classification due to limited feature extraction, inadequate long-term dependency capture, and high reliance on feature engineering. Research in computer vision shows that attention mechanisms, such as SE attention [39], self-attention [40], and coordinate attention (CA) [41], enhance deep neural networks’ feature learning. These mechanisms adaptively weight features based on their importance. In UWB NLoS recognition, sequential signal samples with data dimensions of 1 × 1016 are affected by hardware defects and temporal correlations, making feature importance assessment challenging. The SE attention mechanism selectively emphasizes crucial channel features while suppressing less useful ones, maintaining a low parameter count. Given the limited computing resources of IoT devices, the proposed lightweight model, 1D-CLANet, feeds the spatiotemporal CIR features extracted by the 1D-CNN and LSTM into the SE attention module (numChannels = 512, reductionRatio = 4) to enhance feature extraction and classification accuracy.
Figure 7 shows the details of the SE attention block. In the squeeze operation, global average pooling is applied to the channel dimensions of the feature map $\mathbf{H}_{\mathrm{lstm}} = [h_1, h_2, \ldots, h_t, \ldots, h_T]$ from the LSTM. This pooling operation compresses $\mathbf{H}_{\mathrm{lstm}}$ across its spatial dimensions, resulting in the statistic $z_m \in \mathbb{R}^{C}$. The element $z_m^c$ of the c-th channel can be computed as:
$$
z_m^c = F_{sq}(\mathbf{H}_{\mathrm{lstm}}) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} h_c(i, j) \tag{18}
$$
where $H$ and $W$ are the height and width of the feature map and $h_c(i, j)$ is the feature value at spatial location $(i, j)$.
After the squeeze operation, the excitation step captures the channel dependencies of the aggregated information. We use two fully connected (FC) layers to learn the scale of the channel weights, and the scale vector $s_m$ is constrained to [0, 1] using the sigmoid function:
$$
s_m = F_{ex}(z_m, W) = \sigma\left(W_2 \, \mathrm{ReLU}(W_1 z_m + b_1) + b_2\right) \tag{19}
$$
where $W_1$ and $W_2$ are the weights of the FC layers, $b_1$ and $b_2$ are biases, $\sigma$ is the sigmoid activation function, and $\mathrm{ReLU}(\cdot)$ is the ReLU function.
The final step is to scale the original feature map using an activation function, which involves a channel-wise multiplication between the scalar $s_m^c$ and $h_c$ for each channel $c$, resulting in the final output $\tilde{\mathbf{H}}_{SE} = [\tilde{h}_1, \tilde{h}_2, \ldots, \tilde{h}_t, \ldots, \tilde{h}_T]$ of the SE attention block:
$$
\tilde{h}_c = F_{scale}(h_c, s_m^c) = s_m^c \cdot h_c \tag{20}
$$
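A minimal PyTorch sketch of the SE block of Equations (18)–(20), with numChannels = 512 and reductionRatio = 4 as stated above, might look as follows; treating the LSTM output as a (batch, channels, length) tensor is our assumption about the implementation.
```python
# Squeeze-and-Excitation block: global average pooling -> two FC layers -> channel scaling.
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels=512, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),  # W1, Eq. (19)
            nn.ReLU(),
            nn.Linear(channels // reduction, channels),  # W2, Eq. (19)
            nn.Sigmoid(),                                # constrain weights to [0, 1]
        )

    def forward(self, h):                    # h: (batch, C, L) LSTM features
        z = h.mean(dim=-1)                   # squeeze: global average pooling, Eq. (18)
        s = self.fc(z)                       # excitation: per-channel weights
        return h * s.unsqueeze(-1)           # scale: channel-wise reweighting, Eq. (20)

h_lstm = torch.randn(32, 512, 1)             # e.g. 512 hidden units per sample
print(SEBlock()(h_lstm).shape)               # torch.Size([32, 512, 1])
```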

4.4. Feature Fusion and Classification Module

After enhancing the critical UWB features in the LSTM output sequence using the SE attention mechanism, we fuse the deep spatio-temporal features $\mathbf{H}_{\mathrm{lstm}}$ extracted by the original 1D-CNN and LSTM with the key features $\tilde{\mathbf{H}}_{SE}$ from the SE attention block. This results in a fused feature vector $\mathbf{F}_{\mathrm{concat}} = [\mathbf{H}_{\mathrm{lstm}}, \tilde{\mathbf{H}}_{SE}]$, which is then flattened using a flatten layer. Next, two FC layers are used, with a ReLU layer and a dropout (0.5) layer inserted between FC1 (128) and FC2 (5). This configuration allows the network to learn complex patterns while minimizing overfitting. The second FC layer is defined as the output layer and consists of five fully connected neurons. Finally, a softmax layer is connected as the classifier to output class probabilities:
$$
\mathrm{Softmax}\left(x_{i_s}^{\mathrm{softmax}}\right) = \frac{\exp\left(x_{i_s}^{\mathrm{softmax}}\right)}{\sum_{j_s=1}^{5} \exp\left(x_{j_s}^{\mathrm{softmax}}\right)}, \quad 1 \le i_s, j_s \le 5 \tag{21}
$$
where $x_{i_s}^{\mathrm{softmax}}$ and $x_{j_s}^{\mathrm{softmax}}$ are the $i_s$-th and $j_s$-th input elements, respectively.
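A hedged PyTorch sketch of this fusion-and-classification head is shown below: the LSTM features and their SE-reweighted copy are concatenated, flattened, and passed through FC1 (128), ReLU, dropout (0.5), FC2 (5), and softmax; the 512-dimensional feature size and layer names are assumptions consistent with the text, not the authors' exact implementation.
```python
# Feature fusion and classification head: concat -> flatten -> FC1 -> ReLU -> dropout -> FC2 -> softmax.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, feat_dim=1024, num_classes=5):
        super().__init__()
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim, 128),        # FC1 (128)
            nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes),     # FC2 (5), one neuron per channel class
        )

    def forward(self, h_lstm, h_se):
        f_concat = torch.cat([h_lstm, h_se], dim=1)       # F_concat = [H_lstm, H_SE]
        return torch.softmax(self.head(f_concat), dim=1)  # class probabilities, Eq. (21)

probs = FusionClassifier()(torch.randn(32, 512), torch.randn(32, 512))
print(probs.shape, probs.sum(dim=1)[:3])                  # (32, 5); rows sum to 1
```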
The network architecture and parameters of the 1D-CLANet are shown in Table 3. Moreover, this model employs focal loss as the multi-class loss function, which helps address class imbalance by focusing more on difficult-to-classify samples. Focal loss adds a modulating factor to the cross-entropy loss to give greater attention to these challenging samples.
$$
L = -\alpha \left(1 - \hat{y}_{i_y}\right)^{\gamma} \log \hat{y}_{i_y} \tag{22}
$$
where $\alpha$ is a balancing factor, $\gamma$ is a focusing parameter, and $\hat{y}_{i_y}$ is the predicted probability for class $i_y$.
The hyperparameter settings used in the training process of the 1D-CLANet model are shown in Table 4. During training, the Adam algorithm is used to optimize the parameters of each layer of the 1D-CLANet network. The batch size is set to 32, the initial learning rate is set to 0.0001, and the total number of training epochs is 20. To prevent overfitting and improve generalization ability, we apply L2 regularization, learning rate decay, and dropout during the training process.
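The sketch below ties the focal loss of Equation (22) to the training configuration in Table 4 (Adam, learning rate 0.0001, batch size 32, 20 epochs, L2 regularization via weight decay, and step learning-rate decay); the α, γ, weight-decay, and scheduler values, the dummy data, and the stand-in model are illustrative placeholders rather than the paper's reported settings.
```python
# Training-setup sketch: focal loss + Adam (lr 1e-4), batch size 32, 20 epochs.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Focal loss on softmax outputs: L = -alpha * (1 - p_y)^gamma * log(p_y), Eq. (22)."""
    p_y = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-8)
    return (-alpha * (1.0 - p_y) ** gamma * torch.log(p_y)).mean()

# Dummy data standing in for normalized 1 x 512 CIR windows and 5-class labels
X, y = torch.randn(320, 1, 512), torch.randint(0, 5, (320,))
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(nn.Flatten(), nn.Linear(512, 5), nn.Softmax(dim=1))        # stand-in for 1D-CLANet
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)     # Adam + L2 regularization
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)   # learning-rate decay

for epoch in range(20):                       # 20 training epochs
    for cir_batch, labels in loader:          # batches of 32
        optimizer.zero_grad()
        loss = focal_loss(model(cir_batch), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```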

5. Experiments and Analysis

5.1. Experimental Setup and Dataset Description

We tested the binary classification performance of our proposed model for NLoS identification using the publicly available dataset from [42]. This dataset was collected from seven different locations: office 1, office 2, a small apartment, a small workshop, a kitchen with a living room, a bedroom, and a boiler room. Each location provided 3000 LoS samples and 3000 NLoS samples, resulting in a total of 42,000 samples. Due to the lack of detailed CIR multi-class datasets online, we validated the model’s performance using our own data. The experimental scenarios shown in Figure 8 include the office area and stair corridor of a university laboratory. Data collection was performed using the DecaWave DWM3000 chip, with modules fixed on tripods at a height of 1.6 m. The specific parameter configurations are shown in Table 5. The distance information between the base station and the tag was obtained using a Leica Disto X3 measurement instrument with an accuracy of ±1 mm, as shown in Figure 8b.
The collected dataset comprises CIR data from four common indoor occlusion obstacles: human body, glass, wooden door, and concrete wall, with 3000 samples per scenario. In total, we collected and screened 15,000 samples representing five categories of NLoS obstructions or LoS signals for the training and testing datasets. During the CIR data collection process for each batch, the position of the anchor “ Δ ” remained stationary, while the tag “ ” was moved a fixed distance of 1 m each time to collect a set of data. Figure 8 provides detailed location information of the anchor and tag for each batch of CIR data collection, as well as initial distance measurements between the base station and the tag and equidistant distance information between each occlusion obstacle.
In the stairwell scenario shown in Figure 8a, data were collected for LoS I, wooden door I (0.036 m), human I (1.74 m, 64 kg), and concrete wall I for NLoS obstructions. The distance between human I and the anchor behind it was maintained at 0.1 m, and human I remained in the same position during each batch of data collection (measuring tag and anchor positions at the five locations shown in the figure). After each batch, human I moved to the next position (backward by 2 m). In the office corridor scenario shown in Figure 8b, data were collected for LoS II, wooden door II (0.06 m), human II (1.68 m, 53 kg), glass IV (0.02 m), and concrete wall II for NLoS obstructions. The distance between human II and the anchor behind it was maintained at 0.5 m. Human II moved 2 m for each batch of data collection, following the same measurement method as described for Figure 8a.
Data processing and model training were performed on a system with a Ryzen 3600 processor (3.59 GHz), GTX 1660S graphics card, and 16 GB RAM.
The following metrics are used to evaluate NLoS identification accuracy [17]:
$$
\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{23}
$$
$$
\mathrm{Recall} = \frac{1}{n} \sum_{i_c=1}^{n} \frac{TP_{i_c}}{TP_{i_c} + FN_{i_c}} \tag{24}
$$
$$
\mathrm{Precision} = \frac{1}{n} \sum_{i_c=1}^{n} \frac{TP_{i_c}}{TP_{i_c} + FP_{i_c}} \tag{25}
$$
$$
F_1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{26}
$$
where $i_c = 1, 2, 3, 4, 5$ indexes the five types of CIR signal propagation channel, true positive (TP) is the correctly classified positive samples, true negative (TN) is the correctly classified negative samples, false positive (FP) is the misclassified positive samples, and false negative (FN) is the misclassified negative samples.
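A small NumPy sketch of these macro-averaged metrics (Equations (23)–(26)) computed from a multi-class confusion matrix is given below; the labels are dummy values used only to exercise the functions.
```python
# Macro-averaged accuracy, precision, recall, and F1 from a confusion matrix.
import numpy as np

def macro_metrics(y_true, y_pred, n_classes=5):
    cm = np.zeros((n_classes, n_classes), dtype=int)      # rows: true class, cols: predicted class
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    accuracy  = tp.sum() / cm.sum()                       # Eq. (23), multi-class form
    precision = np.mean(tp / np.maximum(tp + fp, 1))      # Eq. (25)
    recall    = np.mean(tp / np.maximum(tp + fn, 1))      # Eq. (24)
    f1 = 2 * precision * recall / (precision + recall)    # Eq. (26)
    return accuracy, precision, recall, f1

y_true = np.random.randint(0, 5, 1857)                    # test-set size from Section 5.1
y_pred = np.where(np.random.rand(1857) < 0.9, y_true, np.random.randint(0, 5, 1857))
print(macro_metrics(y_true, y_pred))
```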

5.2. Results and Analysis

This paper evaluates the proposed 1D-CLANet method from three perspectives. First, we assess its effectiveness in traditional binary classification of LoS/NLoS scenarios. Second, we evaluate its effectiveness in classifying LoS and multiple NLoS scenarios to further validate the method. Lastly, we conduct ablation experiments to test the impact of integrating different attention mechanisms on the model’s multi-class classification performance and training costs, confirming the effectiveness of incorporating the SE attention mechanism into our model.
(1) Performance Evaluation of Traditional Binary Classification: To assess the performance of the 1D-CLANet model in traditional binary classification, we compared it with four ML (including DL) methods: CNN-LSTM (CNSM), ResNet (without ECA) [34], CNN [30], and SVM [26], and a traditional method based purely on signal processing: kurtosis identification [20]. Apart from CNSM and ResNet, which use the network structures shown in Figure 9a,b (with FC2 in Figure 9a and FC in Figure 9b set to 1), the other comparison methods, CNN [30] and SVM [26], employ the original model structures and parameters from [30] and [26]. The kurtosis identification method [20] determines an appropriate threshold by calculating the distribution of kurtosis of the NLoS and LoS CIR data from the training dataset. When the kurtosis of the input CIR is greater than or equal to this threshold, it is classified as LoS; otherwise, it is classified as NLoS.
We adjusted the configuration of the training and test sets. The training data were selected from the first five scenarios of the dataset [42]: office 1, office 2, a small apartment, a small workshop, and a kitchen with a living room. This amounted to a total of 30,000 sets of CIR data, with an equal split between LoS and NLoS samples. The performance evaluation was conducted using the remaining two scenarios of the dataset, a bedroom and a boiler room, totaling 12,000 CIR samples, also with an equal split between LoS and NLoS. This setup ensured a clear distinction between the training and testing data to validate the model’s generalization ability.
Except for SVM, which used manually extracted CIR features such as energy $\varepsilon_r$, maximum amplitude $r_{max}$, rise time $t_{rise}$, mean excess delay $\tau_{med}$, root-mean-square delay spread $\tau_{rms}$, kurtosis $k$, and estimated distance $d_t$ as training inputs, all other comparison methods used normalized CIR signal data as the input, as shown in Table 6. The computation methods for the manually extracted CIR features can be found in [26]. To ensure a fair comparison, the hyperparameters and training configurations for ResNet and SVM were set according to [34] and [26], respectively. Since [30] did not provide training parameter configurations for the CNN, the hyperparameters and training configurations from Table 4 and Section 4.4 were used for 1D-CLANet, CNSM, and CNN. The threshold value of 100.34 for kurtosis identification was determined using the method mentioned earlier.
Table 7 presents the performance comparison of various methods for LoS/NLoS binary classification. It lists four evaluation metrics used to assess each ML approach. As shown, even in NLoS binary classification across different environments (where the training and test sets are from different environments, and the test set includes two types of scenarios), the proposed method demonstrates higher generalization compared to other methods. Both 1D-CLANet and CNSM achieve over 91% on all four metrics, outperforming other traditional methods in recognizing LoS and NLoS scenarios.
Compared to the baseline method CNSM, 1D-CLANet improves precision, recall, accuracy, and F1-score by 5.83%, 5.88%, 5.86%, and 5.85%, respectively. This indicates that incorporating the SE attention mechanism can enhance the model’s binary classification accuracy. Among the five comparison methods, ResNet achieves the highest accuracy at 88.94%, followed by CNN at 82.97% and SVM at 79.66%. The kurtosis identification method, based purely on signal processing, performs the worst, with an accuracy of only 70.80%. This is because the selection of the CIR kurtosis threshold is limited by the environment. In a fixed environment where all obstacles have similar dielectric constants, a fixed-threshold kurtosis-based method can achieve relatively high accuracy. However, in this experiment, the training data came from five different environments, and the test data came from two other environments, leading to significant differences in dielectric constants between environments. This variation caused the kurtosis identification method to fail and resulted in low accuracy. Due to these shortcomings of pure signal processing methods, recent research tends to use ML or DL methods for UWB NLoS propagation identification to address the challenge of high generalization across different environments. For SVM, the lack of high-dimensional features from the CIR signal resulted in a poorer performance.
Furthermore, Figure 10 presents the receiver operating characteristic (ROC) curves to further evaluate the effectiveness of the 1D-CLANet model. It is evident that the area under the curve (AUC) for both CNSM (0.969) and 1D-CLANet (0.996) is higher than that of other common approaches. This indicates that 1D-CLANet outperforms conventional methods in LoS/NLoS binary classification. In summary, integrating the SE attention mechanism with the LSTM output enhances the accuracy of LoS/NLoS binary classification. However, the applicability of the 1D-CLANet method to more complex scenarios remains to be further explored.
(2) Performance Evaluation of NLoS Multi-Class Classification: To demonstrate the performance of our method in recognizing various NLoS scenarios, we compared it with LSTM, SVM [26], HQCNN [32], MLP [29], and the baseline method CNSM without any attention mechanisms. The LSTM model consists of 512 hidden units and two FC layers (LSTM-512-FC1-128-FC2-5, sequence). CNSM uses the network structure shown in Figure 9a. Unlike in binary classification, the output unit of FC2 in the multi-classification task is set to 5. The other comparison methods, MLP [29], HQCNN [32], and SVM [26], use the original model structures and parameters from [29], [32] and [26], respectively.
The training data for multi-class classification was collected from the stairway passage and office corridor scenarios shown in Figure 8a,b, as detailed in Section 5.1. We used a total of 7000 CIR data samples from the office corridor scene depicted in Figure 8b for training, covering five types of CIR data: human II (1.68 m, 53 kg), glass IV (0.02 m), wooden door II (0.06 m), concrete wall II, and LoS II, with each category comprising 20% of the samples.
For performance evaluation, we used a test set consisting of a total of 1857 CIR samples. This includes four categories of CIR data from the stairway passage scenario shown in Figure 8a: human I (1.74 m, 64 kg, 497 samples), wooden door I (0.036 m, 325 samples), concrete wall I (245 samples), and LoS I (364 samples). Additionally, we incorporated CIR data for glass IV (0.02 m, 426 samples) from Figure 8b. Compared to the training dataset, the glass IV CIR data in the test set has different anchor-to-tag distances to ensure a clear distinction between the training and testing data. This distinction allows for a fair evaluation of the model’s multi-class classification performance.
The MLP [29] method used both normalized CIR data and manually extracted features: first path strength difference (FPSD), first path distance difference (FPDD), number of pseudo peaks (NPP), τ m e d , τ r m s , and k . SVM [26] used manually extracted CIR features consistent with those in the binary classification experiment. The other comparison methods used normalized CIR signal data as input, as shown in Table 8. The calculation methods for the manually extracted CIR features used in SVM and MLP can be found in [26,29], respectively. To ensure a fair comparison, the hyperparameters and training configurations for MLP, HQCNN, and SVM were based on [26,29,32], respectively. The hyperparameters and training configurations for 1D-CLANet, CNSM, and LSTM were based on those in Table 4 and Section 4.4.
Figure 11 illustrates the precision, recall, F1, and accuracy of four common approaches, CNSM without any attention mechanism, and the 1D-CLANet method. The performance of common techniques declines when handling multi-class scene classification, particularly for SVM and LSTM-related methods, indicating that these approaches cannot capture enough features to differentiate between the various scenes. Clearly, both CNSM and 1D-CLANet outperform traditional methods in multi-scene classification. Compared to CNSM, 1D-CLANet shows improvements of 4.87% in precision, 6.32% in recall, 5.88% in accuracy, and 5.76% in F1. Unlike CNSM, 1D-CLANet incorporates the SE attention mechanism, which aids in distinguishing scenes with similar CIR characteristics, thereby enhancing overall performance.
The confusion matrix provides detailed information on the classification performance of various methods. As shown in Figure 12a–d, Sce. 1 to Sce. 5 correspond to LoS, human, glass, wooden door, and wall obstruction scenarios, respectively. CNSM, MLP, and HQCNN achieved a certain degree of multi-scene recognition. As depicted in Figure 12d,f, MLP even outperformed 1D-CLANet in terms of classification accuracy for Sce. 4. However, misclassifications in scenarios such as Sce. 1, Sce. 2 and Sce. 4 led to overall poor performance for MLP, HQCNN, and SVM. While LSTM performs well in time series prediction, it struggles to extract sufficient features from CIR signals for LoS/NLoS classification with minimal time complexity, particularly in distinguishing Sce. 5, where it performs poorly compared to other models. The proposed 1D-CLANet, even without the attention mechanism (CNSM), achieved significantly better accuracy (95.58%, 89.82%) compared to MLP, HQCNN, SVM, and LSTM methods (85.68%, 80.24%, 76.95%, and 74.04%, respectively), as shown in Figure 12e,f. With the SE attention mechanism, 1D-CLANet improved the recall rates for Sce. 1 and Sce. 2 to 98.9% and 89.54%, respectively, which other methods struggled to achieve. However, the recognition accuracy for Sce. 5 decreased due to SE prioritizing other features. Despite this, the overall accuracy of the SE-enhanced 1D-CLANet model remains superior to the model without any attention mechanisms (CNSM).
(3) The impact of different attention mechanisms: To validate the effectiveness of integrating the SE attention mechanism into our model’s feature extractor, we conducted ablation experiments comparing the following: (1) no attention mechanism, which omits the feature fusion layer and outputs LSTM features directly to the FC layer, as shown in Figure 9a; (2) 1D-CLANet combined with self-attention, with numHeads = 4 and numKeyChannels = 512; (3) 1D-CLANet combined with coordinate attention; and (4) our proposed 1D-CLANet combined with SE attention. Table 9 shows the feature extraction layer structures; for fairness, the CA and SE attention mechanisms are configured with numChannels = 512 and reductionRatio = 4, using the same datasets and hyperparameters from the previous multi-classification experiment as described in Table 4 and Section 4.4.
Table 10 summarizes the multi-class recognition accuracy, training time, and number of trainable parameters for each model on the NLoS multi-classification test set, where training time and accuracy are averaged over 100 experiments. Our proposed 1D-CLANet with the SE attention mechanism achieves the highest overall recognition rate of 95.58% across the five scenarios, with a parameter count similar to the original model without any attention mechanism (1.4 M) and a shorter training time (100.31 s) compared to models with other attention mechanisms. The accuracy improves by 5.6% over the 1D-CLANet without attention. While the 1D-CLANet+self and 1D-CLANet+CA methods also show excellent performance, with multi-classification accuracies of 93.37% and 94.91%, respectively, our model outperforms them, indicating that the SE attention mechanism enhances recognition accuracy by more effectively focusing on important features. Although CA mechanisms are beneficial for emphasizing features in image recognition tasks, they may not perform as well as SE attention in extracting features from sequence signals like UWB CIR.

6. Conclusions

Classification of LoS/NLoS scenarios is a crucial issue for high-precision indoor UWB positioning. This article proposes a novel lightweight network, 1D-CLANet, which combines CNN, LSTM, and SE attention mechanisms. Using UWB CIR data as input, this approach leverages the deep feature extraction capability of 1D-CNN, the time series processing strength of LSTM, and the SE attention mechanism’s focus on critical features, significantly enhancing NLoS classification accuracy. Application results on public datasets show that 1D-CLANet achieves a binary classification accuracy of up to 96.88%, with precision, recall, and F1 scores improving by at least 5.83% compared to traditional ML and DL methods. In experiments involving NLoS multi-class recognition of common indoor obstacles, the proposed model also demonstrates excellent performance, achieving a classification accuracy of 95.58% and improving precision and recall metrics by 4.87% over other advanced methods such as MLP and HQCNN. Thanks to the SE attention mechanism, the proposed method effectively differentiates between features that other methods find confusing, such as LoS and human features. Compared to methods without any attention mechanism like CNSM or those incorporating other attention mechanisms, 1D-CLANet achieves a balance of optimal performance and lower processing time (100.31 s). In future work, we will continue to improve the NLoS identification model proposed in this paper and explore its application in indoor positioning systems by integrating it with ranging and localization error mitigation methods.

Author Contributions

Conceptualization, Q.W.; methodology, Q.W. and C.Z.; software, Q.W. and J.L.; validation, X.Y.; formal analysis, Q.W. and J.L.; investigation, J.L.; resources, X.Y. and K.L.; data curation, Q.W.; writing—original draft preparation, Q.W., J.L. and X.Y.; writing—review and editing, M.C. and Y.L.; visualization, Q.W. and J.L.; supervision, M.C.; project administration, Y.L.; funding acquisition, M.C. and Y.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Key Research and Development Program of China and the National Natural Science Foundation of China under Grants 2022YFB3706902 and 52305563.

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

The authors express their gratitude to Engineer Sun from Dalian Haoru Technology for his contributions to code debugging and experimental work.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this paper:
Abbreviation	Full Name of Abbreviation
UWB	Ultra-Wideband
NLoS/LoS	Non-Line-of-Sight/Line-of-Sight
1D-CLANet	1D-ConvLSTM-Attention network
CIR	Channel impulse response
CNN	Convolutional neural network
LSTM	Long Short-Term Memory
SE	Squeeze-and-Excitation
IoT	Internet of Things
IPS	Indoor positioning systems
TX	Transmitter
RX	Receiver
ToF	Time of flight
ML	Machine learning
DL	Deep learning
SVMs	Support vector machines
RNNs	Recurrent neural networks
TOA	Time of arrival
TDOA	Time difference of arrival
AOA	Angle of arrival
RSSI	Received signal strength indicator
MKF	Map-corrected extended Kalman filter
GD	Gaussian distribution
GGD	Generalized Gaussian distribution
MLP	Multilayer perceptron
HQCNN	Hybrid quantum CNN
BiLSTM	Bi-directional LSTM
ECA	Efficient channel attention
GRU	Gated recurrent units
AWGN	Additive white Gaussian noise
FPH	First path
BN	Batch normalization
CA	Coordinate attention
GAP	Global average pooling
FC	Fully connected
TP/TN	True positive/True negative
FP/FN	False positive/False negative
CNSM	CNN-LSTM
ROC	Receiver operating characteristic
AUC	Area under the curve
FPSD	First path strength difference
FPDD	First path distance difference
NPP	Number of pseudo peaks

References

  1. Ngamakeur, K.; Yongchareon, S.; Yu, J.; Rehman, S.U. A survey on device-free indoor localization and tracking in the multi-resident environment. ACM Comput. Surv. CSUR 2020, 53, 71. [Google Scholar] [CrossRef]
  2. Nascita, A.; Montieri, A.; Aceto, G.; Ciuonzo, D.; Persico, V.; Pescapé, A. Xai meets mobile traffic classification: Understanding and improving multimodal deep learning architectures. IEEE Trans. Netw. Serv. 2021, 18, 4225–4246. [Google Scholar] [CrossRef]
  3. Xiao, Y.; Zhu, J.; Yan, S.; Song, H.; Zhang, S. PEiD: Precise and Real-Time LOS/NLOS Path Identification Based on Peak Energy Index Distribution. Appl. Sci. 2023, 13, 7458. [Google Scholar] [CrossRef]
  4. Martalo, M.; Perri, S.; Verdano, G.; De Mola, F.; Monica, F.; Ferrari, G. Improved UWB TDoA-based positioning using a single hotspot for industrial IoT applications. IEEE Trans. Industr. Inform. 2022, 18, 3915–3925. [Google Scholar] [CrossRef]
  5. Yu, K.; Wen, K.; Li, Y.; Zhang, S.; Zhang, K. A novel nlos mitigation algorithm for UWB localization in harsh indoor environments. IEEE Trans. Veh. Technol. 2019, 68, 686–699. [Google Scholar] [CrossRef]
  6. Zhao, Y.; Wang, M. The LOS/NLOS classification method based on deep learning for the UWB localization system in coal mines. Appl. Sci. 2022, 12, 6484. [Google Scholar] [CrossRef]
  7. Yang, X.; Wang, J.; Song, D.; Feng, B.; Ye, H. A novel NLOS error compensation method based IMU for UWB indoor positioning system. IEEE Sens. J. 2021, 21, 11203–11212. [Google Scholar] [CrossRef]
  8. Csík, D.; Sarcevic, P.; Pesti, R.; Odry, Á. Comparison of different radio communication-based technologies for indoor localization using trilateration. In Proceedings of the 2023 IEEE 17th International Symposium on Applied Computational Intelligence and Informatics (SACI), Timisoara, Romania, 23–26 May 2023; pp. 487–492. [Google Scholar]
  9. Yang, H.; Wang, Y.; Seow, C.K.; Sun, M.; Si, M.; Huang, L. UWB sensor-based indoor LoS/NLoS localization with support vector machine learning. IEEE Sens. J. 2023, 23, 2988–3004. [Google Scholar] [CrossRef]
  10. Djosic, S.; Stojanovic, I.; Jovanovic, M.; Djordjevic, G.L. Multi-algorithm UWB-based localization method for mixed LOS/NLOS environments. Comput. Commun. 2022, 181, 365–373. [Google Scholar] [CrossRef]
  11. Yao, L.; Yao, L.; Wu, Y. Analysis and improvement of indoor positioning accuracy for uwb sensors. Sensors 2021, 21, 5731. [Google Scholar] [CrossRef]
  12. Silva, B.; Hancke, G.P. Ir-uwb-based non-line-of-sight identification in harsh environments: Principles and challenges. IEEE Trans. Industr. Inform. 2016, 12, 1188–1195. [Google Scholar] [CrossRef]
  13. Sang, C.L.; Steinhagen, B.; Homburg, J.D.; Adams, M.; Hesse, M.; Rückert, U. Identification of NLOS and multi-path conditions in UWB localization using machine learning methods. Appl. Sci. 2020, 10, 3980. [Google Scholar] [CrossRef]
  14. Hajiakhondi-Meybodi, Z.; Mohammadi, A.; Hou, M.; Plataniotis, K.N. DQLEL: Deep Q-learning for energy-optimized LoS/NLoS UWB node selection. IEEE Trans. Signal Process. 2022, 70, 2532–2547. [Google Scholar] [CrossRef]
  15. Kim, D.H.; Farhad, A.; Pyun, J.Y. UWB positioning system based on LSTM classification with mitigated NLOS effects. IEEE Internet Things J. 2022, 10, 1822–1835. [Google Scholar] [CrossRef]
  16. Wang, Q.; Chen, M.; Wang, G.; Li, K.; Lin, Y.; Li, Z.; Zhang, C. A novel nlos identification and error mitigation method for uwb ranging and positioning. IEEE Commun. Lett. 2024, 28, 48–52. [Google Scholar] [CrossRef]
  17. Liu, Q.; Yin, Z.; Zhao, Y.; Wu, Z.; Wu, M. UWB LOS/NLOS identification in multiple indoor environments using deep learning methods. Phys. Commun. 2022, 52, 101695. [Google Scholar] [CrossRef]
  18. Vaghefi, R.M.; Gholami, M.R.; Buehrer, R.M.; Ström, E.G. Cooperative Received Signal Strength-Based Sensor Localization with Unknown Transmit Powers. IEEE Trans. Signal Process. 2013, 61, 1389–1403. [Google Scholar] [CrossRef]
  19. Zhang, Q.; Cheng, X.; Wang, K.; Cao, Z.; Hong, Y. NLOS Error Suppression Method based on UWB Indoor Positioning. In Proceedings of the 2023 IEEE International Conference on Mechatronics and Automation (ICMA), Harbin, China, 6–9 August 2023; pp. 1125–1130. [Google Scholar]
  20. Shi, Z.; Wang, J.; Zeng, X.; Yang, H. An improved positioning method based on compensation and optimization of ultra-wideband ranging results. Meas. Sci. Technol. 2024, 35, 086305. [Google Scholar] [CrossRef]
  21. Liu, J.; Zhang, L.; Xu, J.; Shi, J. Dynamic Feasible Region Based IMU/UWB Fusion Method for Indoor Positioning. IEEE Sens. J. 2024, 24, 21447–21457. [Google Scholar] [CrossRef]
  22. Zhu, X.; Yi, J.; Cheng, J.; He, L. Adapted error map based mobile robot UWB indoor positioning. IEEE Trans. Instrum. Meas. 2020, 69, 6336–6350. [Google Scholar] [CrossRef]
  23. Wang, Q.; Li, Z.; Zhang, H.; Yang, Y.; Meng, X. An Indoor UWB NLOS Correction Positioning Method Based on Anchor LOS/NLOS Map. IEEE Sens. J. 2023, 23, 30739–30750. [Google Scholar] [CrossRef]
  24. Ferreira, A.G.; Fernandes, D.; Branco, S.; Catarino, A.P.; Monteiro, J.L. Feature selection for real-time NLOS identification and mitigation for body-mounted UWB transceivers. IEEE Trans. Instrum. Meas. 2021, 70, 1–10. [Google Scholar] [CrossRef]
  25. Wang, F.; Tang, H.; Chen, J. Survey on NLOS identification and error mitigation for UWB indoor positioning. Electronics 2023, 12, 1678. [Google Scholar] [CrossRef]
  26. Wymeersch, H.; Maranò, S.; Gifford, W.M.; Win, M.Z. A machine learning approach to ranging error mitigation for UWB localization. IEEE Trans. Commun. 2012, 60, 1719–1728. [Google Scholar] [CrossRef]
  27. Che, F.; Ahmed, Q.Z.; Fontainc, J.; Van Herbruggen, B.; Shahid, A.; De Poorter, E.; Lazaridis, P.I. Feature-based generalized gaussian distribution method for nlos detection in ultra-wideband (uwb) indoor positioning system. IEEE Sens. J. 2022, 22, 18726–18739. [Google Scholar] [CrossRef]
  28. Yang, H.; Wang, Y.; Xu, S.; Bi, J.; Jia, H.; Seow, C.K. UWB ranging errors mitigation with novel CIR feature parameters and two-step NLOS identification. Sensors 2024, 24, 1703. [Google Scholar] [CrossRef] [PubMed]
  29. Si, M.; Wang, Y.; Siljak, H.; Seow, C.; Yang, H. A lightweight CIR-based CNN with MLP for nlos/los identification in a UWB positioning system. IEEE Commun. Lett. 2023, 27, 1332–1336. [Google Scholar] [CrossRef]
  30. Jiang, C.; Chen, S.; Chen, Y.; Liu, D.; Bo, Y. An UWB Channel Impulse Response De-Noising Method for NLOS/LOS Classification Boosting. IEEE Commun. Lett. 2020, 24, 2513–2517. [Google Scholar] [CrossRef]
  31. Miramá, V.; Bahillo, A.; Quintero, V.; Díez, L.E. NLOS detection generated by body shadowing in a 6.5 GHz UWB localization system using machine learning. IEEE Sens. J. 2023, 23, 20400–20411. [Google Scholar] [CrossRef]
  32. Jeong, S.-G.; Do, Q.-V.; Hwang, H.-J.; Hasegawa, M.; Sekiya, H.; Hwang, W.-J. UWB NLOS/LOS Classification Using Hybrid Quantum Convolutional Neural Networks. In Proceedings of the 2023 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), Busan, Republic of Korea, 23–25 October 2023; pp. 1–2. [Google Scholar]
  33. Yang, Y.; Ke, H.; Gan, W.; Deng, Z. CNN-BiLSTM-ATTENTION: A Novel Neural Network with Attention Mechanism for NLOS Identification of UWB Signal. In Proceedings of the 2023 3rd International Conference on Intelligent Communications and Computing (ICC), Nanchang, China, 24–26 November 2023; pp. 279–283. [Google Scholar]
  34. Niu, Z.; Yang, H.; Zhou, L.; Taha, M.F.; He, Y.; Qiu, Z. Deep learning-based ranging error mitigation method for UWB localization system in greenhouse. Comput. Electron. Agric. 2023, 205, 107573. [Google Scholar] [CrossRef]
  35. Tian, Y.; Lian, Z.; Amparo Núñez-Andrés, M.; Yue, Z.; Li, K.; Wang, P.; Wang, M. The application of gated recurrent unit algorithm with fused attention mechanism in UWB indoor localization. Measurement 2024, 234, 114835. [Google Scholar] [CrossRef]
  36. Wei, J.; Wang, H.; Su, S.; Tang, Y.; Guo, X.; Sun, X. NLOS identification using parallel deep learning model and time-frequency information in UWB-based positioning system. Measurement 2022, 195, 111191. [Google Scholar] [CrossRef]
  37. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  38. Smits, J.R.M.; Melssen, W.J.; Buydens, L.M.C.; Kateman, G. Using artificial neural networks for solving chemical problems: Part I. Multi-layer feed-forward networks. Chemometr. Intell. Lab. Syst. 1994, 22, 165–189. [Google Scholar] [CrossRef]
  39. Jin, X.; Xie, Y.; Wei, X.-S.; Zhao, B.-R.; Chen, Z.-M.; Tan, X. Delving deep into spatial pooling for squecze-and-excitation networks. Pattern Recognit. 2022, 121, 108159. [Google Scholar] [CrossRef]
  40. Zhao, H.; Jia, J.; Koltun, V. Exploring self-attention for image recognition. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 10076–10085. [Google Scholar]
  41. Hou, Q.; Zhou, D.; Feng, J. Coordinate attention for efficient mobile network design. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13713–13722. [Google Scholar]
  42. Bregar, K.; Andrej, H.; Mohorcic, M. NLoS channel detection with multilayer perceptron in low-rate personal area networks for indoor localization accuracy improvement. In Proceedings of the 8th Jožef Stefan International Postgraduate School Student Conference, Ljubljana, Slovenia, 1 June 2016; Voume 31, pp. 130–139. [Google Scholar]
Figure 1. Example of NLoS and LoS propagation in a UWB IPS.
Figure 2. Example of a trilateration-based 3-anchor positioning model: (a) positioning under LoS conditions, (b) positioning under NLoS conditions.
Figure 3. CIR curve from typical (a) LoS, (b) other NLoS scenarios.
Figure 4. The network structure diagram of 1D-CLANet.
Figure 5. The structure diagram of 1D-CNN.
Figure 6. The architecture of a LSTM cell.
Figure 7. The architecture of SE attention block.
Figure 8. Instruments and experimental environment. The anchor point (“Δ”) and tag (“ ”) are positioned as shown. LoS ranging positions are shown in blue and NLoS ranging positions are shown in red. (a) Stairway passage, (b) office corridor.
Figure 9. Explanation of network structures: (a) CNSM, (b) ResNet without ECA.
Figure 10. ROC curve for different methods in NLoS binary classification.
Figure 11. Performance comparison of different methods for NLoS multi-classification.
Figure 12. Confusion matrix outcomes for multiclassification. The Sce.1, Sce.2, Sce.3, Sce.4, and Sce.5 correspond to LoS, human, glass, door, and wall, respectively. (a) LSTM. (b) SVM. (c) HQCNN. (d) MLP. (e) CNSM. (f) 1D-CLANet.
Table 1. List of notations.
Notation | Definition
d_i | The distance from the tag to the i-th anchor
d_LoS / d_NLoS | The UWB measurements under LoS/NLoS
d* | The true distance between tag and anchor
η(t) | The UWB measurement noise
s(t) | The transmitted symbol
A | The amplitude of the transmitted pulse
p(t) | The single Gaussian pulse waveform
M | The number of pulses
T_p | The period of each pulse
η_c | The fading coefficient of the c-th path
α_c | The time delay of the c-th path
n(t) | Additive white Gaussian noise with variance σ_r²
Z | Input CIR signal
L | The length of the input CIR signal
U | The output feature map after the convolution operation
H × W × C | The spatial dimensions of the feature map
V | The set of learned filter kernels
γ_BN, β_BN | The learnable parameters of the BN operation
ε | A small constant added to the variance
Û | The outputs after the batch normalization operation
f_t | The forget gate's activation vector
i_t | The input/update gate's activation vector
Ĉ_t | Cell input activation vector
C_t | Current cell memory
C_{t-1} | Previous cell memory
C_t | The updated cell state
O_t | The output gate's activation vector
h_t | Current cell output
tanh | The tanh activation function
h_{t-1} | The previous cell output
b | The bias vector
w | The weight matrices
h_c(i, j) | The feature value at spatial location (i, j)
W_1 and W_2 | Weights of the FC layers
b_1 and b_2 | Biases of the FC layers
ReLU(·) | The ReLU function
s_{mc} | The SE attention excitation output of the c-th channel
h̃_c | The per-channel multiplication output
H̃_SE | The final output of the SE attention block
H_lstm | Deep spatiotemporal features extracted by the LSTM
F_concat | Fused feature vector
x_{i_s}^softmax / x_{j_s}^softmax | The i_s-th/j_s-th input elements of the Softmax function
σ | The sigmoid function
α | The balancing factor
γ | The focusing parameter
ŷ_{i_y} | The predicted probability for class i_y
x_t | The LSTM input
i_c | The five types of CIR signal propagation channel
ε_r | Energy of the CIR
r_max | Maximum amplitude of the CIR
τ_med | Mean excess delay of the CIR
τ_rms | Root-mean-square delay spread of the CIR
k | CIR kurtosis
d_t | UWB ToF estimated distance
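For readers reconstructing the SE attention block from the notation above, the squeeze, excitation, and re-weighting steps typically take the following form. This is a sketch of the standard squeeze-and-excitation formulation [39] written with this paper's symbols; the pooled channel descriptor z_c is an auxiliary symbol introduced here, and the exact equations used by the authors are given in the main text rather than reproduced here.

```latex
% Squeeze: global average pooling over the feature map of channel c
z_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} h_c(i, j)

% Excitation: two FC layers (weights W_1, W_2; biases b_1, b_2), ReLU, then sigmoid
s_{mc} = \sigma\!\left( W_2 \, \mathrm{ReLU}(W_1 z + b_1) + b_2 \right)_c

% Scale: per-channel multiplication and concatenation of the re-weighted channels
\tilde{h}_c = s_{mc} \cdot h_c, \qquad
\tilde{H}_{\mathrm{SE}} = \left[ \tilde{h}_1, \tilde{h}_2, \dots, \tilde{h}_C \right]
```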
Table 2. Summary of UWB NLoS identification methods.
Category | Sub-Category | Advantage | Shortcoming | Related Methods
Traditional NLoS identification methods | Distance-based methods | The identification principle is simple and the computational efficiency is high | The identification accuracy is easily affected by fluctuations in distance information | Kurtosis [20], SVM [21]
Traditional NLoS identification methods | Prior map-based methods | NLoS identification performs well in static, well-mapped, known indoor environments | Requires extensive data and cannot identify dynamic obstacles causing NLoS | MKF [23]
Channel feature-based NLoS identification methods | Machine learning-based methods | Offers higher accuracy than traditional methods and greater efficiency than deep learning methods | Collecting a representative UWB channel feature dataset is challenging, and ML models often struggle with generalization in NLoS identification, leading to reduced accuracy | SVM [24,26], GD [25], GGD [27], Two-Step [28]
Channel feature-based NLoS identification methods | Deep learning-based methods | Excels at deep NLoS feature extraction and scene recognition, with no need for manual feature extraction and strong robustness | Research on NLoS multi-class classification is limited, and complex models with attention mechanisms require costly hardware and may lack efficiency | MLP [29], CNN [30], LSTM [31], HQCNN [32], BiLSTM-CNN [33], ECA-ResNet [34], GRU [35]
Table 3. The network structure of 1D-CLANet.
Module | Network Layer | Kernel Size | Output Size
Feature extractor | Conv1 | 16 × 3 | 16 × 512
 | BN1 + ReLU + Pool1 | 1 × 2 | 16 × 256
 | Conv2 | 32 × 3 | 32 × 256
 | BN2 + ReLU + Pool2 | 1 × 2 | 32 × 128
 | Conv3 | 64 × 3 | 64 × 256
 | BN3 + ReLU + Pool3 | 1 × 2 | 64 × 64
 | Flatten1 | / | 1 × 4096
 | LSTM | 512 | Sequence
 | SE | 512, 4 | /
 | Concat | / | /
Label predictor | Flatten2 | / | /
 | FC1 + ReLU | 128 | /
 | Dropout | 0.5 | /
 | FC2 | 5 | /
 | Output | 1 × 5 | /
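To make the layer progression in Table 3 concrete, the following is a minimal PyTorch sketch of a network with this shape. The channel counts, kernel sizes, LSTM hidden size (512), SE bottleneck (4), FC sizes (128, 5), and dropout (0.5) follow the table; the padding, the interpretation of the Concat step (CNN features concatenated with the SE-reweighted last LSTM output), and the class/symbol names such as CLANetSketch are assumptions for illustration, not the authors' released code.

```python
import torch
import torch.nn as nn

class SEBlock1D(nn.Module):
    """Squeeze-and-excitation over a feature vector of `channels` dimensions."""
    def __init__(self, channels: int, bottleneck: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(channels, bottleneck)
        self.fc2 = nn.Linear(bottleneck, channels)

    def forward(self, x):                      # x: (batch, channels)
        s = torch.sigmoid(self.fc2(torch.relu(self.fc1(x))))
        return x * s                           # per-channel re-weighting

class CLANetSketch(nn.Module):
    """Rough re-implementation of the Table 3 layout (not the authors' code)."""
    def __init__(self, cir_len: int = 512, num_classes: int = 5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 16, 3, padding=1), nn.BatchNorm1d(16), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(16, 32, 3, padding=1), nn.BatchNorm1d(32), nn.ReLU(), nn.MaxPool1d(2),
            nn.Conv1d(32, 64, 3, padding=1), nn.BatchNorm1d(64), nn.ReLU(), nn.MaxPool1d(2),
        )                                      # output: (batch, 64, cir_len // 8)
        self.lstm = nn.LSTM(input_size=64, hidden_size=512, batch_first=True)
        self.se = SEBlock1D(512, bottleneck=4)
        flat_dim = 64 * (cir_len // 8)         # 4096 when cir_len = 512
        self.head = nn.Sequential(
            nn.Linear(flat_dim + 512, 128), nn.ReLU(),
            nn.Dropout(0.5),
            nn.Linear(128, num_classes),
        )

    def forward(self, cir):                    # cir: (batch, 1, cir_len)
        feat = self.cnn(cir)                   # (batch, 64, L/8)
        seq = feat.permute(0, 2, 1)            # treat the 64-channel map as an LSTM sequence
        out, _ = self.lstm(seq)
        h_last = self.se(out[:, -1, :])        # SE re-weighting of the last LSTM output
        fused = torch.cat([feat.flatten(1), h_last], dim=1)
        return self.head(fused)

# quick shape check on a random CIR batch
logits = CLANetSketch()(torch.randn(2, 1, 512))
print(logits.shape)                            # torch.Size([2, 5])
```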
Table 4. The hyper-parameter settings used during 1D-CLANet model training.
Hyper-Parameter | Set Value
Batch size | 32
Training epochs | 20
Optimizer | Adam
Learning rate | 0.0001
Loss function | Focal loss
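A minimal training-loop sketch consistent with Table 4 is shown below. The focal loss follows the standard formulation FL(p_t) = −α(1 − p_t)^γ log(p_t); the α and γ defaults here are placeholders, since the paper's exact balancing factor and focusing parameter are not listed in this table, and batch size 32 is assumed to be set in the data loader.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FocalLoss(nn.Module):
    """Multi-class focal loss: FL(p_t) = -alpha * (1 - p_t)**gamma * log(p_t)."""
    def __init__(self, alpha: float = 0.25, gamma: float = 2.0):
        super().__init__()
        self.alpha, self.gamma = alpha, gamma

    def forward(self, logits, targets):
        # log-probability of the true class for each sample
        log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        return (-self.alpha * (1 - pt) ** self.gamma * log_pt).mean()

def train(model, loader, epochs: int = 20, lr: float = 1e-4, device: str = "cpu"):
    """Adam optimizer, learning rate 1e-4, 20 epochs, as in Table 4."""
    model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = FocalLoss()
    for _ in range(epochs):
        for cir, label in loader:              # cir: (32, 1, L), label: (32,)
            cir, label = cir.to(device), label.to(device)
            optimizer.zero_grad()
            loss = criterion(model(cir), label)
            loss.backward()
            optimizer.step()
```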
Table 5. DWM3000 configurations.
Parameter | Configuration Value
Central frequency | 6489.6 MHz
Bandwidth | 499.2 MHz
Pulse repetition frequency | 64 MHz
Channel | 5
Inter-ranging delay | 200 ms
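For reference, the radio settings of Table 5 can be captured in a small configuration record such as the one below; the field names are illustrative only and are not part of any Qorvo/Decawave driver API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dwm3000Config:
    """UWB radio settings from Table 5 (field names are illustrative)."""
    channel: int = 5                  # UWB channel 5
    central_freq_mhz: float = 6489.6  # channel-5 center frequency
    bandwidth_mhz: float = 499.2
    prf_mhz: int = 64                 # pulse repetition frequency
    inter_ranging_delay_ms: int = 200

print(Dwm3000Config())
```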
Table 6. Data and features employed by different comparison methods for binary classification.
Method | Feature or Data Used
Kurtosis [20] | CIR feature k
SVM [26] | CIR features ε_r, r_max, t_rise, τ_med, τ_rms, k, and d_t
CNN [30] | CIR signal Z = [z_1, z_2, …, z_L]
ResNet (without ECA) [34] | CIR signal Z = [z_1, z_2, …, z_L]
CNSM | CIR signal Z = [z_1, z_2, …, z_L]
1D-CLANet | CIR signal Z = [z_1, z_2, …, z_L]
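The hand-crafted CIR statistics used by the Kurtosis and SVM baselines can be computed roughly as follows. This sketch uses common definitions of these quantities; the sample period, the rise time t_rise (omitted here because its threshold convention varies), and any normalization are assumptions rather than the exact pipeline of [20,26].

```python
import numpy as np

def cir_features(cir: np.ndarray, ts: float = 1.0) -> dict:
    """Common CIR statistics: energy, max amplitude, mean excess delay,
    RMS delay spread, and kurtosis. `ts` is the sample period (assumed)."""
    amp = np.abs(cir)
    energy = np.sum(amp ** 2)                            # epsilon_r
    r_max = amp.max()
    t = np.arange(len(cir)) * ts
    pdp = amp ** 2 / energy                              # normalized power delay profile
    tau_med = np.sum(t * pdp)                            # mean excess delay
    tau_rms = np.sqrt(np.sum((t - tau_med) ** 2 * pdp))  # RMS delay spread
    mu, sigma = amp.mean(), amp.std()
    kurt = np.mean((amp - mu) ** 4) / (sigma ** 4)       # kurtosis of the amplitude
    return {"energy": energy, "r_max": r_max,
            "tau_med": tau_med, "tau_rms": tau_rms, "kurtosis": kurt}

print(cir_features(np.random.randn(512)))
```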
Table 7. Performance comparison of various approaches for NLoS binary classification.
Method | Accuracy | Precision | Recall | F1
Kurtosis | 70.80% | 70.87% | 70.84% | 70.86%
SVM | 79.66% | 81.15% | 79.54% | 80.28%
CNN | 82.97% | 83.92% | 82.88% | 83.35%
ResNet | 85.80% | 86.55% | 85.72% | 86.09%
CNSM | 91.03% | 91.09% | 91.00% | 91.04%
1D-CLANet | 96.89% | 96.92% | 96.88% | 96.89%
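The accuracy, precision, recall, and F1 columns of Table 7 can be reproduced from model predictions with scikit-learn. Whether the paper macro-averages the metrics or treats NLoS as the positive class is not stated in this table, so the macro averaging below is an assumption.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def binary_report(y_true, y_pred):
    """Accuracy plus macro-averaged precision/recall/F1 (averaging choice assumed)."""
    acc = accuracy_score(y_true, y_pred)
    prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred, average="macro")
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}

# toy example: 0 = LoS, 1 = NLoS
print(binary_report([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]))
```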
Table 8. Data and features employed by different comparison methods for multi-classification.
Method | Feature or Data Used
LSTM | CIR signal Z = [z_1, z_2, …, z_L]
SVM [26] | CIR features ε_r, r_max, t_rise, τ_med, τ_rms, k, and d_t
HQCNN [32] | CIR signal Z = [z_1, z_2, …, z_L]
MLP [29] | CIR signal Z = [z_1, z_2, …, z_L] and CIR features FPSD, FPDD, NPP, τ_med, τ_rms, and k
CNSM | CIR signal Z = [z_1, z_2, …, z_L]
1D-CLANet | CIR signal Z = [z_1, z_2, …, z_L]
Table 9. Differences in the feature extractors of the compared models (the structure of each variant is shown as a diagram in the original table).
Model | 1D-CLANet without attention | 1D-CLANet with Self-attention | 1D-CLANet with CA | 1D-CLANet with SE
Table 10. Comparison of identification accuracy with various attention mechanisms.
Feature Extractor | Parameters/M | Accuracy/% | Time/s
1D-CLANet without attention | 1.2 | 89.82 | 85.43
1D-CLANet with Self-attention | 2.3 | 93.37 | 143.79
1D-CLANet with CA | 1.5 | 94.91 | 120.82
1D-CLANet with SE | 1.4 | 95.58 | 100.31
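Parameter counts such as the Parameters/M column of Table 10 can be checked directly on a PyTorch model; for example, applying the helper below to the hypothetical CLANetSketch class from the earlier architecture sketch (not the authors' exact model, so the count will differ from the table).

```python
import torch.nn as nn

def count_params_millions(model: nn.Module) -> float:
    """Total number of trainable parameters, reported in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# e.g. print(f"{count_params_millions(CLANetSketch()):.1f} M")
```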
