1. Introduction
The evolution of outdoor positioning technology, exemplified by the Global Positioning System (GPS), has established itself as a pivotal instrument for outdoor endeavors and spatial navigation. Likewise, indoor positioning assumes critical importance across diverse environments, including shopping malls, parking facilities, and industrial warehouses. In indoor environments, satellite positioning systems lose efficacy, resulting in hindered location detection and the inability to identify points of interest within facilities such as building complexes.
The most widely used indoor positioning approach is Received Signal Strength (RSS)-based positioning, which measures the power of radio signals received from transmitters at known locations, typically fixed Wi-Fi access points or Bluetooth beacons. The core concept is that signal strength decreases predictably as the distance from the source increases. By measuring the RSS values from multiple known points, the system can estimate the receiver’s location through techniques such as trilateration or fingerprinting.
In trilateration, RSS values are converted into distance estimates between the receiver and multiple transmitters. Using these distances, the system calculates the receiver’s location by finding where these distances intersect.
In fingerprinting, during a preliminary survey, RSS values at various known locations are recorded to create a “radio map”. Later, real-time RSS readings are matched against this map using algorithms like nearest neighbor or machine learning methods to estimate the user’s location based on the best match between the observed and recorded data.
Both methods rely on the propagation model of the environment, which can be influenced by factors such as obstacles, reflections, and interference, affecting the accuracy and reliability of the RSS-based positioning. Commonly employed techniques in indoor positioning include the use of Wi-Fi, RFID, and Bluetooth technologies, each having unique benefits and drawbacks. However, they often struggle with challenges such as signal interference, physical obstructions, and high power consumption, which can degrade positioning accuracy.
Historically, indoor positioning systems primarily relied on Wi-Fi because of its widespread availability. However, the massive deployment of Wi-Fi access points (APs) required to improve positioning accuracy can be costly in terms of power consumption and infrastructure. Extensive Wi-Fi deployments necessitate numerous APs to provide sufficient reference signals for precise positioning, which often prove impractical due to the high energy and operational costs involved.
In contrast, Bluetooth Low Energy (BLE) presents a more viable solution. BLE access points or beacons are significantly cheaper, have a smaller footprint, and are more power-efficient. With battery-powered units that can operate for over a year, BLE beacons can be easily deployed in indoor environments without the need for extensive cabling. This makes BLE an increasingly attractive option for efficient and cost-effective indoor positioning.
Angle of Arrival (AoA) positioning using Bluetooth Low Energy (BLE) is a technique where the direction of an incoming signal is determined using multiple antennas. This method enhances accuracy in determining locations, particularly useful in complex indoor environments for precise applications such as asset tracking. AoA offers superior accuracy and is less susceptible to multipath interference compared to other methods like RSSI, due to its focus on signal direction rather than strength.
However, implementing AoA can be complex and costly because it requires sophisticated antenna arrays and processing units. The setup also demands careful calibration and may need regular adjustment as the environment changes. Moreover, the effectiveness of AoA decreases over longer distances, and the required hardware can increase device size and power consumption, which is not ideal for small, battery-operated devices. With practical real-world deployment in mind, this study therefore focuses on RSS-based BLE positioning.
Using the Received Signal Strength from BLE beacons, the current location can be determined through positioning algorithms. In this research, we experimentally evaluate five machine learning algorithms: KNN, WKNN, NB, RSS-NN, and a new CNN-based method. The algorithms are evaluated in terms of accuracy, and our proposed CNN algorithm outperforms all other methods. Finally, we also investigated the Quuppa Intelligent Locating System, a commercial indoor positioning system, to gauge its effectiveness in a controlled indoor environment.
The main contributions of this study are as follows:
Enhanced Indoor Positioning Model: We introduce a novel indoor positioning model that uses a CNN, optimized for processing the spatial dependencies in RSS data transformed into an image-like format. This model addresses the shortcomings of traditional RSS fingerprinting methods by effectively handling the environmental dynamics within indoor settings.
Comparative Analysis of Algorithms: Our research systematically compares the performance of traditional algorithms with our CNN approach, providing insights into their practical applications and limitations in real-world indoor settings.
Experimental Validation: The effectiveness of the proposed model and comparative algorithms is validated through rigorous field tests within a meticulously prepared indoor environment. This includes the collection of a significant dataset that captures the complexity of real-world conditions.
Foundation for Future Research: Our findings lay the groundwork for future research, particularly in enhancing the accuracy of indoor positioning systems using deep learning techniques and adapting to environmental changes more dynamically.
This paper is organized as follows: Section 2 presents an overview of related research; Section 3 outlines the methodology employed for indoor positioning; Section 4 explains the design of the positioning algorithms; and Section 5 covers the construction of the experimental site, the data analysis, the evaluation of the algorithms, and the resolution of open questions. Finally, Section 6 summarizes the findings of our research, highlighting achievements and limitations, and proposes prospective research directions.
2. Related Works
With the advent of the Internet of Things (IoT) and the maturity of location-based technologies, the demand for indoor location services is growing.
Indoor positioning is in high demand across various domains, including enterprise management, security surveillance, emergency rescue, smart elderly care, and warehouse logistics. Nevertheless, existing solutions have proven inadequate in satisfying users' need for higher precision. Consequently, advances in the field have spurred the adoption of diverse indoor positioning technologies aimed at improving accuracy.
Lu Bai et al. [1] introduced a sensing system designed to capture raw Received Signal Strength Indicator (RSSI) data from BLE beacons. They proposed both a trilateration-based method and a fingerprinting-based method for indoor location determination and user tracking. Li et al. [2] employed virtual fingerprinting and two-way ranging techniques to address the four prominent limitations observed in traditional RSSI fingerprint-based methods. Subhan et al. [3] proposed a hybrid methodology that combines fingerprinting-based K-Nearest Neighbors (K-NN) with lateration-based MinMax position estimation. This approach utilizes a Euclidean distance formulation rather than relying on indoor radio channel modeling. Urano et al. [4] utilized an end-to-end Long Short-Term Memory (LSTM) neural network for indoor location estimation of BLE devices. Their approach demonstrates superior accuracy when compared to trilateration or fingerprint-based methods.
P. Weerasinghe et al. [5] introduced a Feed Forward Neural Network (FFNN) that uses RSSI values to determine the location of a moving object or individual, a capability that is essential for IoT-based Ambient Assisted Living (AAL) applications. Choi Jeongsik et al. [6] explored an unsupervised learning approach to predict the positions of unknown nodes using anchor nodes. They developed a calibration curve to rectify the distortion in raw distance measurements, enhancing the accuracy of node localization. Ding Hai-Lan et al. [7] introduced a hybrid algorithm merging Angle of Arrival (AoA) and RSSI to mitigate the limitations inherent in RSSI alone. Through repeated testing, they achieved an average location measurement accuracy of approximately 35 to 36 centimeters. Chia-Hsin et al. [8] employed AoA and RSSI techniques to mitigate Non-Line-of-Sight (NLOS) effects through appropriate range scaling and correction of the AoA to the base station. This approach effectively addresses NLOS errors, making it appealing for location estimation within cellular networks. Le, Anh Tuyen, et al. [9] introduced a range-based localization algorithm aimed at enhancing accuracy. The algorithm uses mathematical formulas combining AoA and RSSI to achieve precise estimation of target nodes. Although several studies have concentrated on BLE-based positioning, there remains a significant gap in the literature: a thorough investigation that systematically compares traditional algorithms and their performance within this specific context.
3. Indoor Positioning Method
3.1. Wireless Communication Methods
Indoor positioning technology can be generally classified into two primary categories: wireless communication-based and physical-based methods. Each of these technologies possesses distinct characteristics, and their respective accuracies can be compared to evaluate their performance.
This categorization is based on the principles used to achieve positioning. Wireless communication technology, such as Wi-Fi, RFID, UWB, ZigBee, and Bluetooth Low Energy, is one such category.
3.1.1. Wi-Fi
Indoor Wi-Fi networks can serve not only as conventional network infrastructure but also as a basis for positioning, by using the path loss of the propagated signal to infer location. Wi-Fi is frequently used for indoor positioning, with wireless local area networks (WLANs) comprising wireless access points, including wireless routers, supporting positioning, monitoring, and tracking within complex environments [10]. Nevertheless, Wi-Fi-based systems are susceptible to interference from other signals, which compromises their accuracy, and they impose high energy consumption on the positioning devices. Wi-Fi positioning often relies on the Received Signal Strength Indication (RSSI) to estimate distance and then applies trilateration to multiple distance measurements from different reference points. The position estimate therefore depends on the measured strength of the received signals, which also serves as an indicator of signal quality. However, RSSI-based systems typically offer only coarse-grained position estimates and may not provide sufficient precision to determine position accurately in all scenarios.
3.1.2. Bluetooth Low Energy
The low power consumption of Bluetooth Low Energy (BLE) beacons allows battery-powered operation for more than a year, which minimizes infrastructure expenses compared with alternative technologies. Although BLE-based systems share the same underlying Bluetooth technology, their implementation specifics diverge. Historically, indoor positioning systems have been built on either Apple's iBeacon protocol or Google's Eddystone protocol. The iBeacon protocol requires a considerable number of beacon devices to be deployed indoors to attain a positioning precision in the range of 1–2 m [11]. Bluetooth is a short-range, low-power wireless transmission technology; multiple Bluetooth local area network (LAN) access points can be installed indoors and configured as a multi-user network. In this arrangement, the Bluetooth LAN access points always act as the principal device within the micro-network, enabling user location to be determined through signal strength detection.
3.1.3. RFID
Radio frequency identification (RFID) technology operates on the principles of electromagnetic signal transmission to enable communication between RFID tags and readers. This technology presents numerous benefits, such as enhanced data rates, heightened security measures, and the capacity to mitigate challenges associated with non-line-of-sight communication [12]. Nonetheless, obstacles including the deployment of a substantial quantity of RFID devices and the necessity to track individuals carrying RFID tags have impeded its widespread commercial utilization within the realm of indoor positioning.
3.1.4. UWB
Ultra-wideband (UWB) technology is a short-range communication technology that transmits pulses on the order of 1 nanosecond in duration, occupying a bandwidth exceeding 500 MHz while maintaining low power consumption. UWB signals possess unique characteristics in terms of signal type and spectrum, making them less susceptible to multipath effects and highly adept at penetrating obstacles. Consequently, UWB technology can achieve positioning accuracy as fine as 10 cm, even in intricate indoor environments [13]. Nonetheless, the progression of UWB technical standards has been slow, curtailing its broad implementation in consumer electronics. Although acclaimed for its superior accuracy, UWB remains confined to select specialized industrial applications because mainstream mobile smart devices lack support for the UWB protocol. Compared with conventional narrow-band systems, UWB systems offer several advantages, including robust penetration capabilities, minimal power consumption, effective mitigation of multipath effects, elevated safety standards, and reduced system complexity, all of which contribute to a notable enhancement in positioning accuracy.
3.1.5. ZigBee
ZigBee technology, founded on the IEEE 802.15.4 [14] protocol, is designed for low-cost, low-power, and low-data-rate wireless sensor networks. Nonetheless, ZigBee finds limited adoption in commercial indoor positioning, primarily because the majority of consumer-oriented smart devices lack support for this protocol.
3.2. Physical Methods
Physical methods, on the other hand, use visible light, infrared, geomagnetism, and computer vision techniques to achieve positioning.
3.2.1. Infrared LED
Visible light communication technology harnesses high-frequency flashing signals emitted by LED light sources to encode positional information [15]. This encoded information is subsequently deciphered by light-sensitive sensors, enabling the determination of the mobile device's position. However, the efficacy of this technology relies on LED lights equipped with flashing coding functionality, and its accuracy may diminish due to challenges associated with non-line-of-sight communication.
3.2.2. Geomagnetic
Geomagnetic-based indoor location technology has emerged as an innovative positioning method increasingly adopted in commercial sectors in recent years. This approach capitalizes on the ubiquitous nature of the Earth's magnetic field. The geomagnetic field at a given latitude and longitude can be derived from the International Geomagnetic Reference Field (IGRF) model, which is maintained by the International Association of Geomagnetism and Aeronomy (IAGA) [16].
3.2.3. Computer Vision
The computer vision-based indoor positioning system is a well-established method in which a computer vision algorithm is employed to ascertain the location coordinates of an individual. This is achieved by comparing image data captured during the positioning process with a pre-existing visual database of the venue [17]. In recent years, propelled by rapid advancements in artificial intelligence technology, the accuracy of computer vision-based indoor positioning systems has improved notably, reaching precision levels of approximately 1–2 m.
3.3. Comparison of Different Indoor Positioning Technologies
By consolidating the distinctive characteristics of the indoor positioning technologies discussed earlier, we can present an overview in the form of Table 1. This table provides the key properties and attributes associated with each technology, together with their respective strengths and limitations.
When considering the diverse properties of various indoor positioning technologies, it becomes evident that they exhibit distinct characteristics. Consequently, when deploying these technologies in commercial environments, it is essential to consider not only the inherent accuracy of the technology, but also factors such as the specific environment, construction aspects, deployment strategies, rollout processes, and other relevant considerations. In this context, BLE indoor positioning technology emerges as a highly suitable solution, even in the prevalent era of iBeacon adoption. This is primarily due to the widespread prevalence of Bluetooth-enabled devices, making it a feasible choice for a broad range of commercial applications. Furthermore, it is important to acknowledge that besides the accuracy of the positioning device itself, the effectiveness of the algorithm in correcting measurement inaccuracies also plays a vital role.
3.4. Comparing Wi-Fi and BLE for Practical Mobile Positioning
For practical and easy deployment within existing environments, Wi-Fi and BLE are the most commonly used technologies for indoor positioning, as most modern smartphones are equipped with both. However, neither technology was originally designed with positioning in mind. Here, we discuss the practical issues related to the two technologies in terms of MAC mechanism and scanning capabilities.
3.4.1. MAC Mechanism in Wi-Fi and BLE
The Medium Access Control (MAC) mechanisms [18,19] of Wi-Fi and BLE are designed to optimize their specific wireless environments and usage models. Wi-Fi utilizes Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA), optionally enhanced with Request to Send/Clear to Send (RTS/CTS) protocols and acknowledgments (ACKs) to manage data traffic in dense and potentially congested environments, ensuring robust high-speed data transmission by avoiding collisions. In contrast, BLE employs a simpler, less resource-intensive approach using Adaptive Frequency Hopping (AFH) to mitigate interference and enhance connection stability in a primarily low-power, low-data-rate environment. BLE’s method involves less overhead and is geared towards maximizing energy efficiency, which is crucial for battery-operated devices like wearables and IoT sensors. For positioning scenarios that require long periods of continuous operation, BLE is therefore more favorable because of its low power consumption.
3.4.2. Scanning Capabilities in Wi-Fi and BLE
RSS-based positioning requires the mobile device to scan the surrounding Wi-Fi APs or BLE beacons constantly in order to update the user’s location in real time. The scanning capability of both technologies is therefore crucial for positioning.
Wi-Fi devices can perform network scans using either passive or active methods, with active scanning being quicker. Typically, a Wi-Fi device actively scans by spending about 50 milliseconds on each channel, allowing it to theoretically scan all 13 channels of the 2.4 GHz band roughly 92 times per minute. However, such frequent scanning is generally impractical as it would severely impact battery life and network efficiency. In reality, devices adjust their scanning frequency based on context and necessity, employing a mix of passive and active scanning to optimize both power consumption and connection stability. In our own test, a successful scanning of APs can take up to several seconds in smartphones.
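As a quick check of the figure quoted above, the following sketch recomputes the theoretical sweep rate from the dwell time and channel count given in the text. The function name is illustrative, and channel-switching overhead is ignored, so the result is an upper bound rather than a measured rate.

```python
# Back-of-envelope check of the active-scan rate quoted in the text.
DWELL_MS = 50   # dwell time per channel, as stated in the text
CHANNELS = 13   # 2.4 GHz channels, as stated in the text


def sweeps_per_minute(dwell_ms: float, channels: int) -> float:
    """Upper bound on full-band sweeps per minute (no switching overhead)."""
    sweep_ms = dwell_ms * channels  # time for one pass over all channels
    return 60_000 / sweep_ms


rate = sweeps_per_minute(DWELL_MS, CHANNELS)
print(f"{rate:.1f} sweeps per minute")
```

One sweep takes 650 ms under these assumptions, which is where the figure of roughly 92 sweeps per minute comes from.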
The speed at which a mobile phone can scan for BLE (Bluetooth Low Energy) beacons varies based on the scan settings—specifically, the scan interval and window. Typically, a mobile phone can alternate between fast and slow scanning modes depending on the application’s needs and battery life considerations. Fast scanning might involve a scan window of about 30 to 100 milliseconds and an interval of 100 to 500 milliseconds, allowing for the quick discovery of nearby devices. Slow scanning, used mainly to conserve battery when the app is running in the background or when high responsiveness is not critical, could extend the interval to several seconds. In our testing, smartphones can take several scans within one second; therefore, real-time position updates become possible.
3.5. RSS-Based BLE Method
Apple and Google have engaged in collaborative efforts towards the development of standards for BLE beacons [20]. Apple unveiled its iBeacon technology during the 2013 Apple Worldwide Developers Conference [21], while Google introduced its Eddystone technology in July 2015. These small wireless sensors transmit data and enable communication with other smart devices, making them a viable option for the Internet of Things (IoT) [22].
BLE beacons are built on Bluetooth Low Energy technology, offering advantages such as low cost, minimal power consumption, and high scanning/broadcasting speed. These devices broadcast radio signals carrying data, which can be received by nearby devices, and the received signal strengths can be analyzed to obtain additional information. BLE beacons can also be configured to transmit packets for positioning at high rates (e.g., every 100 ms), making them suitable for real-time continuous positioning scenarios.
Additionally, these devices can be deployed in a dense manner, making them particularly effective for accurate and real-time indoor positioning. The BLE positioning technique utilizes RSS to obtain an approximation of the distance between the transmitter and receiver. This distance information is then used in trilateration to determine the position of the receiver based on multiple distance measurements from different points. The quality of the detected signal is estimated based on the strength of the received signal, which is represented by a numerical value. However, using RSS alone can only provide a rough estimation of the position, and its accuracy may not always be sufficient for precise localization.
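The trilateration step described above can be sketched as a linearized least-squares problem. This is a minimal illustration, not the method used in this study: the anchor coordinates (the corners of a hypothetical 12 m × 6 m room), the function name `trilaterate`, and the noise-free distances are all assumptions made for the example.

```python
from math import hypot


def trilaterate(anchors, distances):
    """Estimate (x, y) from >= 3 anchor positions and ranges.

    Subtracting the last circle equation from the others gives a linear
    system, solved here through its 2x2 normal equations.
    """
    if len(anchors) < 3 or len(anchors) != len(distances):
        raise ValueError("need at least three anchors with matching distances")
    (xn, yn), dn = anchors[-1], distances[-1]
    s_aa = s_ab = s_bb = s_ac = s_bc = 0.0
    for (xi, yi), di in zip(anchors[:-1], distances[:-1]):
        a = 2.0 * (xi - xn)
        b = 2.0 * (yi - yn)
        c = (xi**2 - xn**2) + (yi**2 - yn**2) - (di**2 - dn**2)
        s_aa += a * a
        s_ab += a * b
        s_bb += b * b
        s_ac += a * c
        s_bc += b * c
    det = s_aa * s_bb - s_ab * s_ab
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is ambiguous")
    x = (s_ac * s_bb - s_ab * s_bc) / det
    y = (s_aa * s_bc - s_ab * s_ac) / det
    return x, y


# Hypothetical anchors and a hypothetical true position.
anchors = [(0.0, 0.0), (12.0, 0.0), (0.0, 6.0), (12.0, 6.0)]
true_pos = (4.0, 2.5)
distances = [hypot(true_pos[0] - ax, true_pos[1] - ay) for ax, ay in anchors]
print(trilaterate(anchors, distances))
```

With noise-free distances the sketch recovers the true point up to floating-point error. With RSS-derived distances, the ranges rarely intersect in a single point, which is why a least-squares formulation is used here rather than an exact intersection.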
Wireless-based positioning algorithms have been extensively studied for various wireless technologies over the years. Numerous RSS-based systems rely on free-space radio propagation loss equations. For instance, in free space the received power decays in proportion to $1/r^2$ (equivalently, the propagation loss grows as $r^2$), where $r$ denotes the distance between the transmitter and receiver.
The distance between the transmitter and receiver can be estimated from the RSS values by using the path loss model [23]. For outdoor locations, the relationship between distance $d$ and path loss $PL(d)$ in dB is given by Equation (1):
$$PL(d) = PL(d_0) + 10\,n\,\log_{10}\!\left(\frac{d}{d_0}\right) \quad (1)$$
where $d_0$ is the reference distance, $PL(d_0)$ is the respective path loss in dB, and $n$ is the path loss exponent, typically around 2 in free space.
In indoor settings, the presence of multiple obstacles, such as walls, furniture, and human bodies, between a transmitter and receiver results in significant propagation loss due to radio signal absorption and diffraction. The Log-Distance Path Loss model extends the free-space model by considering additional losses caused by obstacles and multipath propagation, where signals take multiple paths to the receiver due to reflections, as given by Equation (2):
$$PL(d) = PL(d_0) + 10\,n\,\log_{10}\!\left(\frac{d}{d_0}\right) + X_\sigma \quad (2)$$
where $X_\sigma$ is a Gaussian random variable representing the additional loss caused by obstacles and signal reflections, and $n$ is usually greater than 2, often ranging from 2.7 to 3.5 depending on the density of obstacles.
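To make the two path loss models concrete, the sketch below evaluates the log-distance model with and without the Gaussian shadowing term. The reference loss of 40 dB at 1 m, the shadowing standard deviation of 4 dB, and the function name `path_loss_db` are illustrative assumptions, not values measured in this study.

```python
import math
import random


def path_loss_db(d, pl_d0, n, d0=1.0, sigma=0.0, rng=None):
    """Log-distance path loss in dB at distance d (same unit as d0).

    With sigma == 0 this is the deterministic model; with sigma > 0 a
    zero-mean Gaussian shadowing term (std. dev. sigma, in dB) is added.
    """
    if d <= 0 or d0 <= 0:
        raise ValueError("distances must be positive")
    loss = pl_d0 + 10.0 * n * math.log10(d / d0)
    if sigma > 0:
        loss += (rng or random).gauss(0.0, sigma)
    return loss


PL_D0 = 40.0  # assumed path loss at the 1 m reference distance, in dB

free_space = path_loss_db(10.0, PL_D0, n=2.0)   # free-space exponent
indoor_mean = path_loss_db(10.0, PL_D0, n=3.0)  # exponent within the indoor range
rng = random.Random(0)                          # seeded for repeatability
indoor_samples = [path_loss_db(10.0, PL_D0, n=3.0, sigma=4.0, rng=rng)
                  for _ in range(5)]

print(free_space, indoor_mean)
print([round(s, 1) for s in indoor_samples])
```

Under these assumptions, raising the exponent from 2 to 3 adds 10 dB of loss at 10 m, and the shadowing term scatters individual readings around that mean, which is the behavior that makes single RSS readings unreliable indoors.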
However, for an accurate calculation of path loss in a complex indoor environment, ray tracing may be used to simulate every potential path that a signal might travel, including reflections, diffraction, and scattering, which provides a more faithful depiction of signal behavior than the statistical models above.
In real devices, signal strength is usually measured and reported as the Received Signal Strength Indicator (RSSI), expressed in decibels relative to one milliwatt (dBm). The value is typically acquired directly from the hardware of the receiving device and represents the power level of the received signal as detected by the receiver’s RF front end. Most Bluetooth devices can provide the RSSI value of a connection or of nearby devices in discoverable mode. The RSSI value can be approximately linked to the distance between devices using the logarithmic path loss models discussed previously. The formula relating RSSI to distance is derived from the general path loss model, as shown in Equation (3):
$$\mathrm{RSSI} = -10\,n\,\log_{10}(d) + A \quad (3)$$
where $A$ is the received power in dBm at a distance of 1 m from the transmitter, i.e., the RSS reading at 1 m; $n$ is the signal transmission constant, which is influenced by the signal transmission environment; and $d$ is the distance between the transmitter node and the receiver node. Although RSSI can indicate the distance between the transmitter and receiver, its value can vary significantly depending on environmental factors such as obstacles, walls, and interference from other wireless devices. Especially in indoor environments, RSSI values can fluctuate due to multipath effects, where the signal bounces off objects and arrives at the receiver multiple times along different paths. In addition, different devices may measure and report RSSI in slightly different ways, which can lead to inconsistencies across different types of Bluetooth hardware. For location tracking, RSSI should therefore be used carefully, as it is susceptible to various sources of error and environmental effects.
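Assuming the RSSI model takes the common form RSSI = A − 10·n·log10(d), it can be inverted to estimate distance from a reading. The sketch below does this and shows how a few dB of fluctuation changes the estimate. The values A = −59 dBm and n = 2.5, and the function name `rssi_to_distance`, are illustrative assumptions rather than calibration results from this study.

```python
def rssi_to_distance(rssi_dbm: float, a_dbm: float, n: float) -> float:
    """Invert RSSI = A - 10 * n * log10(d) to obtain d in metres."""
    if n <= 0:
        raise ValueError("path loss exponent must be positive")
    return 10.0 ** ((a_dbm - rssi_dbm) / (10.0 * n))


A_DBM = -59.0  # assumed RSSI at 1 m
N_EXP = 2.5    # assumed environment constant

nominal = rssi_to_distance(-84.0, A_DBM, N_EXP)
# The same position observed with a +/- 5 dB fluctuation in the reading.
near = rssi_to_distance(-79.0, A_DBM, N_EXP)
far = rssi_to_distance(-89.0, A_DBM, N_EXP)

print(f"nominal {nominal:.1f} m, range {near:.1f} m to {far:.1f} m")
```

Under these assumptions a ±5 dB swing turns a nominal 10 m estimate into anything between roughly 6 m and 16 m, which illustrates why raw RSSI ranging alone gives only a coarse position.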
4. Algorithms Design and Analysis
4.1. Experiment Environment
To test and compare the performance of different positioning algorithms, we set up the experiment in a large office. The layout of the experimental environment, which measures 12 m × 6 m, is shown in Figure 1; normal office furniture (desks and chairs) is present to provide a realistic, complex indoor environment. Eight BLE beacons are installed on the ceiling of the room and around the office. On average, each beacon is around 1.5 m from its nearby beacons. The room space was divided into 1.2 m × 1.2 m blocks. To compare the efficacy of the algorithms, a series of experiments was conducted in this space. The experimental data were acquired using a customized BLE scanner application running on an Android device, which was specifically configured to detect only these eight BLE beacons. Each data record consisted of eight RSS values, one per detected BLE beacon, along with the corresponding location coordinates.
Figure 1 displays the arrangement of the Beacons, training points, and testing points within the office setting. The 28 green circles denote the training points, with the distance between each training point and its adjacent point set at 1.5 m. Meanwhile, the 12 red triangles represent the testing points, and the 8 blue diamonds indicate the location of the BLE beacons. For each training point, approximately 500 to 600 records were collected in each of the four different directions. As for each testing point, approximately 50 records were gathered. Upon filtering out any invalid records, the total number of valid data points was found to be 540 testing records and 67,589 training records.
The raw data collected are presented in Figure 2. At each data collection point, multiple entries are recorded. For each entry, the mobile phone’s received signal strength in RSSI was collected from eight beacons (BC1 to BC8), along with the original coordinates ranging from A−J and 01−05. To simplify location calculations, these coordinates have been remapped to integer values ranging from 1−10, both horizontally and vertically. These coordinates will be utilized for both training the model and subsequent positioning tasks.
The experimental data were utilized to empirically assess the performance of five distinct algorithms, namely KNN, WKNN, NB, RSS-NN, and CNN. To compare the efficacy of these algorithms, the error distance between the actual location and the location estimated by each algorithm was calculated from their respective coordinates.
4.2. K-Nearest Neighbor Algorithm
The K-Nearest Neighbors (KNN) algorithm is a well-established and straightforward method for addressing classification tasks, as indicated by its prominence in the literature. Its fundamental premise is to identify the closest neighbors within the entire training dataset by using the received data of the object of interest, and to subsequently assign the object to the class that is most frequently represented among the nearest neighbors.
To identify the k-nearest neighbors, the Euclidean distance between the current RSS observation and each of the pre-existing RSS records is computed, resulting in a distance object that includes the corresponding location coordinates. The Euclidean distance is computed for every location as in Equation (4):
$$d = \sqrt{\sum_{i=1}^{n} \left(x_i - y_i\right)^2} \quad (4)$$
where $x_i$ and $y_i$ denote the $i$-th out of $n$ values within each pre-recorded RSS and the currently observed RSS, respectively.
Once the distance calculations have been completed, the results are sorted in ascending order based on their distances. The coordinate of the current location is then determined by taking the average of the coordinates of the k-nearest neighbors, which are represented by the first k elements in the sorted list of distance objects, as expressed in Algorithm 1.
Algorithm 1 Proposed KNN Algorithm
1: Input: z: testing sample, X: training data, L: class labels of X, k: number of nearest neighbors
2: Output: Class of the testing data z
3: Start
4: Initialize the distance set D = ∅
5: for each x ∈ X do
6: Calculate the distance:
7: $d(z, x) = \sqrt{\sum_{i=1}^{n} (x_i - z_i)^2}$, and add it to D
8: end
9: Sort D in ascending order;
10: Select, based on D, the set of training samples with the k smallest distances
11: Classify z in the majority class
12: End
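The coordinate-averaging variant described in the text can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `knn_locate`, the layout of `radio_map`, and the sample fingerprints are assumptions made for the example.

```python
import math

def knn_locate(observed, radio_map, k=3):
    """Estimate (x, y) by averaging the coordinates of the k closest fingerprints.

    radio_map is a list of (rss_vector, (x, y)) pairs recorded offline.
    """
    # Euclidean distance between the observation and every stored fingerprint.
    distances = [(math.dist(observed, rss), xy) for rss, xy in radio_map]
    # Ascending sort; the first k entries are the nearest neighbors.
    distances.sort(key=lambda item: item[0])
    nearest = distances[:k]
    x = sum(xy[0] for _, xy in nearest) / len(nearest)
    y = sum(xy[1] for _, xy in nearest) / len(nearest)
    return x, y

# Hypothetical fingerprints from three beacons (RSSI in dBm).
radio_map = [
    ([-60, -75, -80], (1, 1)),
    ([-62, -73, -82], (1, 2)),
    ([-80, -60, -70], (5, 1)),
    ([-85, -58, -65], (6, 1)),
]
estimate = knn_locate([-61, -74, -81], radio_map, k=2)
print(estimate)
```

Here the two closest fingerprints are those recorded at (1, 1) and (1, 2), so the estimate falls midway between them.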
4.3. Weight K-Nearest Neighbor Algorithm
To improve the accuracy of KNN, which can degrade when the nearest neighbors lie at varying distances from the actual location, the Weighted K-Nearest Neighbor (WKNN) algorithm was developed. In WKNN, each nearest neighbor is assigned a weight that decreases with its distance, so that closer neighbors receive higher weights than those further away. This reduces the impact of more distant neighbors on the final prediction. After the distance list is sorted, the coordinates of the current location are computed as a weighted average of the coordinates of the k nearest neighbors, as presented in Algorithm 2.
Algorithm 2 Proposed WKNN Algorithm
1: Input: z: testing sample, X: training data, L: class labels of X, k: number of nearest neighbors
2: Output: Class of the testing data z
3: Start
4: Initialize the distance set D = ∅
5: for each x ∈ X do
6: Calculate the distance:
7: $d(z, x) = \sqrt{\sum_{i=1}^{n} (x_i - z_i)^2}$, and add it to D
8: end
9: Sort D in ascending order;
10: Select, based on D, the set of training samples with the k smallest distances
11: Calculate the weight of each selected neighbor from its distance
12: Classify z in the majority class
13: End
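A weighted average of neighbor coordinates can be sketched as below. The exact weighting formula is not recoverable from the text, so this example assumes inverse-distance weights, w = 1 / (d + eps), which is one common choice; the function name and the sample data are also illustrative.

```python
import math

def wknn_locate(observed, radio_map, k=3, eps=1e-6):
    """Estimate (x, y) as a distance-weighted average of the k nearest fingerprints.

    Assumption: weights are inverse distances; eps avoids division by zero
    when the observation coincides with a stored fingerprint.
    """
    distances = sorted(
        ((math.dist(observed, rss), xy) for rss, xy in radio_map),
        key=lambda item: item[0],
    )
    nearest = distances[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    total = sum(weights)
    x = sum(w * xy[0] for w, (_, xy) in zip(weights, nearest)) / total
    y = sum(w * xy[1] for w, (_, xy) in zip(weights, nearest)) / total
    return x, y

# Hypothetical fingerprints from two beacons (RSSI in dBm).
wknn_map = [
    ([-60, -70], (0, 0)),
    ([-64, -70], (4, 0)),
]
# The observation is three times closer to the first fingerprint than to the second,
# so the estimate is pulled toward (0, 0) instead of landing at the midpoint.
wknn_estimate = wknn_locate([-61, -70], wknn_map, k=2)
print(wknn_estimate)
```

With plain KNN and k = 2 the same observation would be placed at the midpoint x = 2; the weighting moves it to roughly x = 1.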
Both KNN and WKNN face limitations due to their high sensitivity to environmental changes, such as physical barriers and signal interference, which affect signal strength and positioning accuracy. Both require a densely populated and precisely measured set of sample points for accurate functioning, making them vulnerable to errors in sparse or dynamically changing environments. Additionally, Weighted KNN’s complexity increases with the need to fine-tune the weighting factors based on distance, and both algorithms struggle with scalability and need frequent updates of the RSS map to cope with environmental changes, impacting their efficacy in large-scale or real-time applications.
4.4. Naïve Bayes Algorithm
The Naïve Bayes Classifier is a widely used and effective algorithm for addressing classification problems. It utilizes the renowned Bayes Theorem to make predictions about the probability of a given data point belonging to a specific class. The classifier calculates the probability for each class and selects the class with the highest probability as the final result.
Bayes Theorem provides a mathematical framework for estimating the probability of an event, which can be expressed using Equation (5). By applying Bayes Theorem, the Naïve Bayes Classifier can make probabilistic inferences and classify data points based on their likelihood of belonging to different classes.
$$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \qquad (5)$$

where P(A) is the probability that class A is true, P(B) is the probability that predictor B is true, P(A|B) is the probability of class A given that predictor B is true, and P(B|A) is the probability of predictor B given that class A is true.
The basic idea is to first train an NB classifier with the training dataset and then predict the location by applying the trained classifier to the currently observed RSS values. For classifier training, the first step is to pair each RSS sample in the training dataset with its corresponding coordinate and to create a classification dataset object containing a list of data point objects. The second step is to create an NB classifier and train it with the classification dataset object created in the first step.
For location prediction, the currently observed RSS values are first converted to a data point object, which is then classified by the trained classifier. The result is an object that contains the probabilities of all locations in the training set, and the most-likely function returns the index of the location with the highest probability in the location list.
However, in an actual scenario, the current location will not always coincide exactly with one of the training points. Even when it does, the highest probability may not be high in absolute terms, and several locations may have probabilities close to the highest one.
Thus, the averaging concept used in KNN and WKNN may help to improve the accuracy of NB. The probability value of each possible location is stored under the same index as that location in the location list. The probability values are sorted in descending order, and the indexes of the high-probability locations are found from their probability values. Locations whose probability differs from the highest probability by no more than 10% are averaged. In addition, for the weighted averaging approach, the locations are weighted by their probability.
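The training, prediction, and averaging steps can be sketched with a small Gaussian Naïve Bayes model. Several points are assumptions made for this example: the RSS values at each location are modeled as independent Gaussians, the 10% rule is read as an absolute probability difference of 0.10, and all function names and sample data are hypothetical.

```python
import math
from collections import defaultdict

def train_gaussian_nb(samples, min_var=1.0):
    """samples: list of (rss_vector, (x, y)). Returns {xy: (prior, means, variances)}."""
    grouped = defaultdict(list)
    for rss, xy in samples:
        grouped[xy].append(rss)
    model = {}
    for xy, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A variance floor keeps the likelihood finite for near-constant readings.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, min_var)
            for col, m in zip(zip(*rows), means)
        ]
        model[xy] = (n / len(samples), means, variances)
    return model

def posterior(model, observed):
    """Posterior probability of every trained location given the observed RSS."""
    log_scores = {}
    for xy, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for v, m, s2 in zip(observed, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
        log_scores[xy] = log_p
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    unnorm = {xy: math.exp(s - top) for xy, s in log_scores.items()}
    total = sum(unnorm.values())
    return {xy: p / total for xy, p in unnorm.items()}

def nb_locate(model, observed, margin=0.10, weighted=True):
    """Average the locations whose probability is within `margin` of the best one."""
    probs = posterior(model, observed)
    best = max(probs.values())
    candidates = [(xy, p) for xy, p in probs.items() if best - p <= margin]
    weights = [p if weighted else 1.0 for _, p in candidates]
    total = sum(weights)
    x = sum(w * xy[0] for w, (xy, _) in zip(weights, candidates)) / total
    y = sum(w * xy[1] for w, (xy, _) in zip(weights, candidates)) / total
    return x, y

# Hypothetical single-beacon training data for two locations.
nb_samples = [([-60.0], (1, 1)), ([-62.0], (1, 1)), ([-70.0], (3, 1)), ([-72.0], (3, 1))]
nb_model = train_gaussian_nb(nb_samples)
print(nb_locate(nb_model, [-66.0]))  # observation midway between the two locations
```

When one location clearly dominates, only that location passes the margin test and the estimate reduces to the plain most-likely result.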
4.5. RSS-Based Neural Network Algorithm
Our RSS-based neural network is a computational model inspired by biological neural networks. It is composed of interconnected layers, including an input layer, an output layer, and one or more hidden layers. Neurons within each layer are connected to neurons in the subsequent layer. Through the learning process using a given training set, the model assigns weights to these connections and adjusts them incrementally to improve the accuracy of predictions.
To implement the proposed RSS-based Deep Neural Network, the neural network architecture must be defined, specifying the number of neurons in the input, hidden, and output layers. Training and testing data can be imported from .csv files, and the number of input and output variables is provided to create the corresponding datasets. Once the neural network and datasets are established, the training process can commence.
Before training the neural network, parameters such as the maximum allowable error rate and the maximum number of iterations can be set. During the training phase, a graphical representation of the total network error is displayed, illustrating inputs, outputs, desired outputs, errors, and the total mean square error of the test dataset. This visualization aids in monitoring the network’s convergence and evaluating its performance.
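The role of the two stopping parameters can be illustrated with a very small network trained by gradient descent. This is a sketch under stated assumptions, not the network used in the experiments: the single hidden layer of four tanh units, the learning rate, the function name `train_mlp`, and the toy data (normalized RSS inputs mapped to normalized coordinates) are all hypothetical.

```python
import math
import random

def train_mlp(data, hidden=4, lr=0.05, max_error=0.01, max_iterations=2000, seed=0):
    """Train a one-hidden-layer regressor.

    Training stops when the mean squared error of an epoch drops to max_error
    or when max_iterations epochs have been run. Returns (predict, history).
    """
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    # Each weight row carries one extra entry for the bias term.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)] for _ in range(n_out)]

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(row, list(x) + [1.0]))) for row in w1]
        y = [sum(w * v for w, v in zip(row, h + [1.0])) for row in w2]
        return h, y

    history = []
    for _ in range(max_iterations):
        total = 0.0
        for x, t in data:
            h, y = forward(x)
            err = [yi - ti for yi, ti in zip(y, t)]
            total += sum(e * e for e in err) / n_out
            # Hidden-layer gradients are computed before the output weights change.
            grad_h = [
                sum(err[o] * w2[o][j] for o in range(n_out)) * (1 - h[j] ** 2)
                for j in range(hidden)
            ]
            for o in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    w2[o][j] -= lr * err[o] * v
            for j in range(hidden):
                for i, v in enumerate(list(x) + [1.0]):
                    w1[j][i] -= lr * grad_h[j] * v
        history.append(total / len(data))
        if history[-1] <= max_error:
            break
    return (lambda x: forward(x)[1]), history

# Hypothetical normalized RSS pairs mapped to normalized (x, y) coordinates.
toy_data = [
    ([0.2, 0.8], [0.1, 0.1]),
    ([0.8, 0.2], [0.9, 0.1]),
    ([0.5, 0.5], [0.5, 0.5]),
    ([0.3, 0.6], [0.2, 0.3]),
]
predict, history = train_mlp(toy_data)
print(len(history), history[0], history[-1])
```

The `history` list plays the role of the total network error curve: plotting it against the iteration index shows whether training is converging.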
4.6. Convolution Neural Network Algorithm
The Convolutional Neural Network (CNN) is a type of neural network that employs convolution as a distinct linear operation, as opposed to the conventional matrix multiplication, in at least one of its layers. This approach is particularly effective when processing data with a grid-like topology, such as image data or time series data. We utilized a pre-existing CNN model as a reference [24] and subsequently refined its parameters to enhance its performance, including the adjustment of filter sizes, feature maps, and number of CNN layers.
The room depicted in Figure 1 was partitioned into 1.2 m × 1.2 m blocks, with RSS collection being conducted at 28 designated training points. At each point, signals from multiple beacons were typically detected, and all measurements were captured. The complete labeled dataset comprised 2699 labeled data points, with 2159 data points utilized for training and 540 data points reserved for testing.
The original dataset consisted solely of RSSI readings from eight BLE beacons and the coordinates of each recording point. To facilitate the use of CNNs for feature extraction, we have transformed these data into an image-like format. Our testing area, originally divided into grids labeled A−J and 1−5, has been expanded to create a 10 × 10 grid representation, as illustrated in Figure 3. Notably, the original grid configuration was a 10 × 5 rectangle; for compatibility with CNN processing, we extended the Y-axis from 5 to 10 by padding the grid. The left side of Figure 3 displays this grid, with the RSSI values from each beacon plotted within the corresponding grid sections. To streamline processing with CNNs, the data have been normalized using a straightforward formula (as shown in Equation (6)) to scale the RSSI values to a real number between 0 and 1. In this scheme, 0 represents pure black and 1 represents pure white, with intermediate values depicted in varying shades of gray.
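The conversion of one RSSI record into a grayscale grid can be sketched as follows. The normalization here is assumed to be a min-max scaling with bounds of −100 dBm and 0 dBm, and the beacon cell positions in `BEACON_CELLS` are invented for illustration; neither is taken from Equation (6) or from the actual beacon layout.

```python
RSSI_MIN, RSSI_MAX = -100.0, 0.0  # assumed bounds in dBm

def normalize_rssi(rssi, rssi_min=RSSI_MIN, rssi_max=RSSI_MAX):
    """Scale an RSSI value to [0, 1]; 0 maps to black and 1 to white."""
    clipped = min(max(rssi, rssi_min), rssi_max)
    return (clipped - rssi_min) / (rssi_max - rssi_min)

# Hypothetical (row, column) cells of the eight beacons inside the original
# 10 x 5 area. Rows 5-9 are the padding that extends the grid to 10 x 10.
BEACON_CELLS = {
    "BC1": (0, 0), "BC2": (0, 3), "BC3": (0, 6), "BC4": (0, 9),
    "BC5": (4, 0), "BC6": (4, 3), "BC7": (4, 6), "BC8": (4, 9),
}

def rssi_to_image(readings, size=10):
    """Build a size x size grayscale image from one record of beacon readings."""
    image = [[0.0] * size for _ in range(size)]
    for beacon, rssi in readings.items():
        row, col = BEACON_CELLS[beacon]
        image[row][col] = normalize_rssi(rssi)
    return image

# Hypothetical record: one RSSI value per beacon.
record = {"BC1": -55, "BC2": -62, "BC3": -70, "BC4": -88,
          "BC5": -60, "BC6": -66, "BC7": -75, "BC8": -90}
image = rssi_to_image(record)
for row in image:
    print(" ".join(f"{value:.2f}" for value in row))
```

Each record produces one 10 × 10 single-channel image, which matches the [10, 10, 1] input shape of the network.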
Our CNN network architecture is illustrated in Figure 4.
The parameters of our CNN Architecture are set as follows: Input layer: [10, 10, 1] grayscale; 2 Convolution layers: number of the filters, respectively [8, 8], filter sizes: [3, 3] with valid padding; 1 Max-pooling Layer: number of filters [8, 8], filter sizes: [2, 2]; 1 Dense layer: 16 nodes; Output layer: 1 representing x and y coordinates; Activation function: ReLU. Additionally, other hyper-parameters, such as batch size, number of layers, and layer sizes, can be modified during the testing procedure to facilitate the attainment of optimal positioning performance.
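The effect of these settings on the feature map sizes can be traced with a short calculation. The sketch assumes a stride of 1 for both convolutions and a pooling stride equal to the 2 × 2 pool size, since the strides are not stated; the helper names are illustrative.

```python
def valid_output(size: int, kernel: int, stride: int = 1) -> int:
    """Spatial size after a convolution or pooling with valid (no) padding."""
    return (size - kernel) // stride + 1

def trace_cnn_shapes(side=10, filters=8, dense_nodes=16):
    """Follow a square input through conv, conv, max-pool, and flatten."""
    shapes = [("input", side, 1)]
    side = valid_output(side, 3)             # first 3 x 3 convolution
    shapes.append(("conv1", side, filters))
    side = valid_output(side, 3)             # second 3 x 3 convolution
    shapes.append(("conv2", side, filters))
    side = valid_output(side, 2, stride=2)   # 2 x 2 max-pooling (assumed stride 2)
    shapes.append(("pool", side, filters))
    flattened = side * side * filters
    dense_weights = flattened * dense_nodes + dense_nodes  # weights plus biases
    return shapes, flattened, dense_weights

shapes, flattened, dense_weights = trace_cnn_shapes()
for name, side, channels in shapes:
    print(f"{name}: {side} x {side} x {channels}")
print("flattened features:", flattened)
print("dense layer parameters:", dense_weights)
```

Under these assumptions the 10 × 10 input shrinks to 8 × 8, then 6 × 6, and finally 3 × 3 after pooling, so the dense layer receives 72 features.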
The proposed algorithm comprises two phases, an offline phase and an online phase. During the offline phase, RSSI measurements are acquired at various reference positions, yielding a dataset of N samples with M RSSI values each, where N represents the number of samples in the dataset and M represents the number of BLE beacons. The RSSI measurements are then reduced in dimensionality using Principal Component Analysis (PCA), filtered with a Kalman filter to mitigate noise, and normalized. After these preprocessing steps, the CNN architecture is employed for feature extraction, followed by a softmax layer for classification-based learning, which yields a localization classification model. In the online phase, the same preprocessing and feature extraction are applied to the observed data, and the classification model produces the final estimate of the positional coordinates.
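The noise-mitigation step can be illustrated with a scalar Kalman filter applied to the RSSI stream of a single beacon. This sketch covers only the filtering part of the preprocessing, not PCA, and it assumes a constant-signal model; the process and measurement variances, the function name, and the sample readings are hypothetical.

```python
def kalman_smooth(measurements, process_var=0.05, measurement_var=4.0):
    """Smooth a sequence of RSSI readings with a one-dimensional Kalman filter.

    Assumption: the true RSSI is roughly constant between readings, so the
    prediction step keeps the estimate and only increases its uncertainty.
    """
    estimate = float(measurements[0])
    error_var = 1.0
    smoothed = [estimate]
    for z in measurements[1:]:
        # Predict: uncertainty grows by the process variance.
        error_var += process_var
        # Update: blend the prediction with the new measurement.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed

# Hypothetical noisy RSSI readings (dBm) from one beacon at a fixed position.
raw = [-70, -75, -68, -80, -71, -69]
filtered = kalman_smooth(raw)
print([round(value, 2) for value in filtered])
```

A larger `measurement_var` makes the filter trust new readings less and produces a smoother but slower-reacting output.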