1. Introduction
With the rapid development of wireless communication technology in pervasive infrastructure and mobile clients, the demand for LBSs (location-based services) is also accelerating in people’s daily lives. Hence, indoor localization technologies have gained great concern in both academia and industry, and various indoor localization technologies and techniques have been developed [
1,
2]. In particular, because of the advantages of no need for additional infrastructure and low cost, indoor WLAN (Wireless Local Area Network)-based localization has become one of the most popular technologies [
3], among which fingerprint-based localization technique is one of the hot research topics. The fingerprint-based indoor localization is usually conducted in two phases, namely, the offline site survey phase and the online fingerprint matching phase.
So far, a lot of study work and achievements have been made in these two phases, and many systems have been developed to reduce labor costs and localization errors and improve localization accuracy [
4,
5]. In many existing indoor fingerprint localization techniques, we found that the similarity of positions is usually discriminated by comparing the similarity of RSS vectors and then used to estimate the coordinates of test points (TPs). For example, multidimensional scaling (MDS) technology is a data dimensionality reduction method widely applied in indoor localization. It often employed FDs to represent PDs to construct a distance matrix between nodes, thereby obtaining their similarity and computing the estimate of the TP [
6]. Also, there are a large number of machine learning algorithms that have been introduced into indoor fingerprint localization, many of which utilize FDs to get the estimated coordinate of TP [
5,
7].
That is to say, using the RSS vector distances (i.e., FDs) between two position points to discriminate the PD between them is a prerequisite and foundation for many fingerprint-based indoor localization techniques. Inspired by this viewpoint, we infer that if the discrimination capability of FDs for PDs is improved, the corresponding localization techniques will also be improved. However, most existing efforts mainly focus on improving fingerprint matching algorithms; the fundamental reasons for the impact of each AP on FD’s capability to discriminate PD have not been fully studied, such as the healthy states of APs and the positions of APs.
It is well known that indoor fingerprint-based localization can fully utilize public WLAN facilities but does not require the accurate locations of the APs. However, with the rapid development and popularization of WLAN infrastructure and smart devices, APs are widely deployed. Our intelligent terminal can easily receive high-quality signals from a large number of APs. The states of these APs are dynamic, with some appearing, some disappearing, some blocked, some malfunctioning, and so on. We refer to all of these situations as the healthy states of the APs. From the perspective of signal strength, the healthy states of APs can be divided into three categories: normal, slightly unhealthy, and complete failure. The latter two are also known as abnormal APs. Due to APs being ubiquitous, we still have sufficient normal APs to use. Meanwhile, in large indoor spaces, the set of APs that can be received varies with the TP’s position. So, in large-scale indoor spaces with a large number of APs, using fingerprints formed by RSSs from all APs not only increases fingerprint dimensions and is time-consuming but also greatly reduces localization accuracy due to dynamic APs’ states.
Although there have been some studies on the issue of AP selection, the AP selection criteria and models hardly have universality due to the complexity of indoor environments and applications [
8]. A prevalent AP selection method is to choose APs with the max RSS or mean RSS. However, since RSS similarity is the comparison of RSS vectors between two positions and the APs’ states are dynamic, it can only be used when the AP’s RSS values in both positions are normal; otherwise, this methodology may lead to the possible amplification of the error [
9]. By collecting data in the same position at different times and performing post-processing on them, unstable APs can also be found and eliminated, but it is time-consuming and costly [
10]. Correspondingly, based on the large number of APs present in indoor spaces, our proposed strategy does not require repeated sampling. Similarly, the AP selection algorithm based on loss rate requires scanning the channel multiple times during the sampling process to select the APs with low loss rate [
11], which is also time-consuming. In addition, the APs selection strategy can also be based on mutual information [
12], nonuniform quantization RSSI entropy [
13], information gain, joint information gain [
14], etc. The aforementioned APs selection strategies are mainly determined by analyzing the received RSS, either from a probabilistic or informational perspective, but they are all static. In real indoor scenarios, when comparing the similarity between two positions, it is common that an AP may not be suitable for two positions but may be suitable for the other two. Therefore, the selection of AP subsets should be dynamic, that is, different TPs may select different AP subsets to estimate the coordinates during the online phase.
For the positions of the APs problem, researchers in [
15] identified several crucial problems that caused the localization errors, including APs’ discrimination for fingerprinting a specific location and the transitional fingerprint problem on smartphones. However, the impact of APs on RSS similarity of location pairs has not been deeply studied, and the health states of APs have not been taken into account. For APs’ health state problem, in [
16], the problem of the impaired APs and the joining of the new APs were considered, and a secure fingerprint localization method was proposed for robust variable environments, but it is aimed at specific environments and collaborates by installing reference anchor nodes to update the fingerprint database. This technique, which requires installation facilities, cannot be directly applied in public, large indoor spaces. Although we have analyzed AP’s discrimination capability in [
17], no localization algorithm was provided. Hence, the research of [
17] was extended in this article, and a new FDDC-WKNN technique is redefined by combining the policies of improving FD’s discrimination capability to PD and WKNN algorithm, which can provide a solution for the aforementioned problem about APs’ status. As for WKNN, it is a simple and classic machine learning algorithm and also a typical algorithm widely used in indoor positioning [
10,
18]. The existing improved WKNN algorithm usually directly uses RSS similarity to discriminate the similarity of positions, mainly focusing on the weight parameters and the value of K. In recent years, many researchers have begun to combine WKNN with other more complex algorithms [
19,
20], which not only increases computational complexity but also reduces time efficiency.
The scenes considered in the article are AP-rich large-scale indoor environments. These APs with high-quality signals are installed for communication, they are evenly distributed and dense, but they are not uniformly managed. That is to say, we do not know the state of most of the APs. When we utilize these ubiquitous APs for localization, they will bring greater interference and uncertainty, seriously affecting localization performance. Therefore, the main focus of our study is to select an appropriate subset of normal APs dynamically based on the TPs and adjust their contribution to the discrimination capability of RSS similarity for PDs based on their states.
The main work and contributions of the article can be summarized as follows:
- (1)
The root causes limiting the discrimination capability of FDs related to APs’ state and position are investigated. We deeply analyzed and recognized three factors that influence the contribution of APs to FD’s discrimination capability, which are ① the distance between the AP and the sample position, ② the AP’s direction to the pairwise positions, and ③ the healthy states of the APs.
- (2)
For the above three problems, we provided corresponding solution policies to improve FD’s discrimination capability. For the healthy states of the APs, setting a threshold for all sample RSSs at both offline and online stages will eliminate abnormal APs dynamically. For the distance and direction of APs, a discrimination correction quantity and a priority weight were proposed to adjust APs’ contribution to FD’s discrimination capability.
- (3)
Ultimately, by integrating the solution policies with WKNN, we advanced a redefined indoor localization technique, that is, FDDC-WKNN, which has strong robustness and stability for the state of AP in AP-rich environments without knowing APs’ status.
The rest of the manuscript is organized as follows.
Section 2 is the related work on indoor localization;
Section 3 introduces the framework of FDDC-WKNN and presents preliminary knowledge and some symbols definitions; and in
Section 4, we conduct some observation to find APs’ factors affecting FD’s discrimination capability and provide corresponding solutions. The FDDC-WKNN algorithm process is described in detail in
Section 5, and in
Section 6, extensive experiments and simulations are executed. At last, the conclusions of the article are presented in
Section 7.
2. Related Works
With the modernization of production and lifestyle, more work and entertainment activities can be carried out indoors, and most people spend approximately 80% of their daily lives indoors [
5]. Outdoors, satellite positioning, and navigation have provided convenience in all aspects of our lives. However, these conveniences cannot be extended to indoor environments. Therefore, a variety of indoor localization technologies and techniques are booming, which will be reviewed in this section. In addition, the APs selection strategies are also briefly reviewed in this section.
Classified by the dependent infrastructure, indoor localization technologies include Wi-Fi [
21,
22], acoustic signals [
23], UWB (Ultra Wide Band) [
24], RFID (Radio Frequency Identification) [
25], Bluetooth, etc. Recently, LoRa (Long Range) communication extended these wireless technologies and was employed for indoor positioning. In [
26], the LoRa technology and its employment for RSSI-based indoor localization in the license-free 2.4 GHz band were studied, and an average localization error below 2.2 m was obtained. Moreover, other techniques based on acoustics, vision, magnetic fields, accelerometers, and so on are also used to estimate indoor locations, but due to the need to install additional equipment, the cost is high, and it is difficult to widely apply [
5]. In addition, a variety of other integrated localization technologies have also been proposed. For example, in [
27], a location-aware infrastructure is presented, which can give rise to a sensing mechanism that enables infacility crowdsourcing, which can help fingerprint localization services. Due to the similarity of RSS sequences, the assistant nodes were selected in [
28] to improve the localization accuracy, and by using an adaptive Kalman filter, the time-of-flight ranging error has been alleviated. Recently, a fingerprint augment framework based on super-resolution (FASR) was proposed in [
29] to reduce the cost at the offline phase and ensure localization accuracy. Based on the conversion between fingerprint images and fingerprint data, the framework achieved the fusion of fingerprint augment and super-resolution. A transportable laser range scanner was used to automatically label Wi-Fi scans in [
30] to get an accurate indoor fingerprint localization system with no associated data collection.
Classified by the nature of signals to be measured, various localization techniques have been developed based on the signal nature, such as TDOA (Time Difference Of Arrival), TOA (Time Of Arrival), RSS [
7], CSI (Channel State Information) [
31], etc. Among all the technologies mentioned above, the fingerprint localization technique based on RSS sampling has been widely studied, and with the vigorous development of artificial intelligence (AI) technology, more and more machine learning algorithms are being applied to the two phases of fingerprint localization [
32,
33], such as K-nearest neighbor (KNN) [
34], weighted K-nearest neighbor (WKNN) [
19,
35], support vector machine (SVM) [
36], artificial neural network (ANN) [
37,
38], and so on.
To improve indoor positioning accuracy and stability, in [
35], the WKNN algorithm was improved by fusing the spatial distance and physical distance (PD) of the RSS, while in [
19], a combined algorithm based on WKNN and extreme gradient boosting (XGBoost) was proposed for location determination. For other machine learning methods, to deal with high noise in RSSI measurements and ensure high target-localization accuracy, authors in [
39] proposed two range-free target-localization schemes on SVR, one is a plain support vector regression (SVR)-based model and the other is a fusion of SVR and Kalman filter (KF), while in [
36], based on multi-output least squares SVM regression, authors built a statistical regression model on the RSS dataset to obtain the locality of a mobile device. By using a back-propagation neural network (BPNN) to calculate the similarities between different fingerprints constructed based on absolute RSS values, a high-adaptability indoor localization method was proposed in [
38]. In [
40], BPNN is used to determine the distances between the TP and each reference point (RP) to mitigate the effect of the fluctuation of RSS. Moreover, the deep neural network (DNN) model is also considered in indoor fingerprint localization [
41], and the DNN models obtained from multiple training sessions are combined to locate the target, achieving a lower RMSE. In all, it has been proven that machine learning can effectively improve indoor localization accuracy, enhance system robustness, reduce costs, and improve the performance of the indoor localization method [
42]. For Bluetooth Low Energy technology, authors in [
32] implemented and analyzed the performance of four supervised learning techniques, that is, KNN, SVM, Random Forest (RF), and ANN, to explore the possible improvement in the localization accuracy and found that the most promising machine learning technique was Random Forest, with classification accuracy over 99%. Furthermore, with the growth of computationally capable smartphones, deep learning has also been used in indoor localization systems [
43,
44].
Referring to AP selection, some strategies have already been implemented [
9]. In the survey by [
11], five different AP selection strategies were concluded, and the max mean RSS method and loss rate method were used to perform automatic regulation for the proposed self-adaptive AP selection algorithm. In [
13], the authors introduced the nonuniform quantization RSSI entropy (NQRE) to quantify AP’s discernibility and select APs to construct an offline fingerprint database. For the scene where both indoor structure and user moving mode are fixed, an AP selection method was proposed in [
45] by adding a perturbation operator to the RSS dataset to extract the correlation between RSSs, which can reduce the time consumption at the online stage. On the contrary, to deploy an optimal AP placement configuration in indoor localization systems, the mean and variance of the RSS were used [
46]. Moreover, to effectively select stable and suitable APs, the authors in [
47] proposed an indoor localization algorithm based on multiple access point selection, in which two AP selection steps are added at the offline stage. The first step is to delete the APs with less frequency from the RSSI information at each RP, and the second step is using the k-means algorithm to cluster RPs and re-selecting the AP subset for each cluster. However, so far, it is hard to have the universality of AP selection criteria due to the complexity of indoor environments and applications.
4. Observation and Enhancement Policies for AP’s Discrimination Capability
In the current highly developed social environment of information technology, our smart devices can easily receive high-quality signals from many APs in public indoor environments.
These abundant APs are usually installed for the function of communication by different individuals, which are located at various positions and have various states, including being removed, impaired, unstable, etc. If there exist abnormal APs, it will inevitably lead to wrong FDs of pairwise positions, resulting in large localization errors. Therefore, utilizing such abundant existing APs to select suitable APs to improve the discrimination capability of FD has become an important factor affecting localization accuracy.
From the definition of FD, we have to emphasize AP’s contribution to RSS similarity. So, next, according to an in-depth analysis of the contribution of a single AP to the FD of a position pair, we aim to provide different AP RSS correction and selection ways, to greatly improve the discrimination capability of FD to a PD.
The strategy proposed in this section does not rely on increasing sample collection tasks or complex intelligent algorithms, but only on the presence of sufficient APs in indoor space. We aimed to improve FDDC by analyzing the physical performance of APs and RSS values.
4.1. Observations for Discrimination Capability
During the fingerprint-based indoor localization process, due to the influence of the distance, direction, and states of the APs, it often occurs that the identical RSSD implies a different PD between two positions.
4.1.1. Diverse FDs’ Discrimination Capability Caused by Different Distance APs
It can be seen from Formula (2),
, which indicates that
,where
is the RSSD and
denotes the corresponding distance difference. In other words, an identical
means a larger PD difference
at a farther position, or a smaller
at a closer position. Similarly, an identical
means a bigger RSSD
at a closer position, or a smaller
at a farther position.
Figure 2 illustrates the RSS spatial distribution. It can be seen that the same RSSD of
corresponds to the different difference in PD of two pairwise positions, depending on the distances of the specific AP.
In order to investigate the impact of distance variation between AP and a location on the contribution of AP to RSS similarity, we observed the variation of RSSD with distance variation between AP and a location through a simple experiment. For two locations A and B with
in our school library, connect BA and place an AP at different distances from A on the extension line on side A. Observe the RSSD of two locations A and B, as shown in the
Figure 3.
As consistent with the above analysis, for the identical , the correspondence values are different and decrease with increasing distance. That is to say, APs with different distances have different RSSDs for the same two locations, so the contribution to RSS similarity is also different.
In a word, the PDs of pairwise positions indicated by RSSDs depend on the receivers’ distance, leading to varying discrimination capability for various positions.
4.1.2. Diverse FDs’ Discrimination Capability Caused by APs with Different Directions
Inherent constraints are constrained by the law of radio signal propagation; APs have diverse discrimination capabilities for different pairwise positions. That is, an identical
can imply various PDs (see
Figure 4). For any two positions, one is on the ring with RSS = −50 dBm, and the other is on the ring with RSS = −70 dBm, the value of RSSD
is always 20, but the PD between them varies greatly. According to simple geometry, it’s easy to know that the PD of the position pair
is the smallest and
is the biggest. Meanwhile, the most accurate discrimination capability of RSSD
should be for position pair
, which is related to the direction of the APs.
In other words, see
Figure 5, assume that the pairwise position
is fixed, that is,
(
is a constant). Take position
as the center, and make a circle with a radius of
(smaller than
). Suppose an AP is located at
O, then
is also a constant and
, with
as the center, drawing two circles with a radius of
and
, respectively. Connecting
and
, which will intersect with the two circumferences at points
and
, respectively, and
is the included angle between
and
.
It is obvious that
changes with the location of the AP, but the correspondence between the RSSD
and radius difference
remain unchanged, so set
. Therefore, by studying the relationship between
and
, we can get that when the orientation of AP changes concerning the position pair, the corresponding discriminate capability of RSSD changes rules. For
, the following equation is gained:
We can obtain that the same value
corresponds to different
, depending on the angle
between line
and
, which means APs at different directions have different discrimination capabilities for the same position pair
. As seen from
Figure 5, when one
is located at point
and point
, the discrimination capability of RSSD is optimal and poorest for position pair
.
Remark that if is a very small value, which means that the space between and point is tiny enough, is about equal to the PD . Then, the RSSD can characterize all PDs well.
In a word, the PDs of the pairwise positions indicated by RSSDs depend on the receivers’ direction, leading to varying discrimination capability for various directions.
4.1.3. Diverse FDs’ Discrimination Capability Caused by APs with Different Health States
In order to make the localization more accurate, in addition to the above two aspects, abnormal APs should not be ignored. Abnormal APs can be roughly divided into slight unhealthy (such as slight malfunction/objects blocking) and complete failure (such as broken/removed/newly emerged), which may occur in the online and offline stages. A slight abnormality means that the AP can transmit, but due to the device being impaired, the signal transmitted by the device is weak or unstable. Here, considering the complexity and variability of the indoor environment, for the obstruction of the new facilities or walls results in the weak signal of an AP at a specific location, we also treat the AP as slightly abnormal at the position. For complete failure/removed/newly emerged APs, although there are already works that can automatically enrich old fingerprint databases with new information [
50], we are considering environments with rich APs and do not care about removing or installing individual APs. Hence, completely failed APs mean that the device cannot transmit signals at all in the offline and/or online stage.
In order to observe the effect of normal APs and abnormal APs, we carry out an experimental analysis of the RSS values of a specific position and the RSSD of a position pair. In this experiment, the position of the AP is known, and the power and signal lights of the AP are constantly being monitored. We use an iron box to rotate irregularly around the AP to simulate abnormal AP. First, for the sampling position that is fixed 3 m away from AP, we sample RSS values from a normal AP and an abnormal AP, respectively. The results are shown in
Figure 6a. From the figure, it can be seen that the RSS values of the abnormal AP are obviously lower than the RSS value of the normal AP. In addition, the RSS values of the abnormal AP are extremely unstable compared to the values of normal AP. Second, for the sampling position pair with 2.5 m PD, we calculate the RSSD of the position pair (see
Figure 6b). As seen, when the AP sampling at one position is normal and at the other position is abnormal, the RSSD is obviously larger than that of normal AP, and they are extremely unstable. When the AP sampling at both positions is abnormal, the RSSD is slightly smaller than that of normal AP and can be ignored when calculating FD.
In a word, the PDs of pairwise positions indicated by RSSDs are closely related to the AP’s state, leading to diverse FD’s discrimination capability for different healthy states.
4.2. FD’s Discrimination Capability Enhancement Policies
In this section, different discriminatory policies are discussed based on the above discussion and analysis. We first discuss the strategy to identify the abnormal Aps.
4.2.1. Abnormal Aps Identification
In this section, we design two policies to reduce the impact of abnormal Aps, one for slight abnormality Aps and the other for complete failure/removed/newly emerged Aps.
Since the abundance of Aps in the indoor environment and the possibility of abnormal Aps, it is unnecessary and also unrealistic to utilize the RSSDs of all Aps equally. Whether for the fingerprint in the fingerprint database or online collected the RSS of TP, set a threshold for RSS values to exclude Aps with weak signal strength.
For position
, the RSS vector collected from
Aps is
, the RSS value
is defined as follows:
where
is a threshold, only the RSS values of Aps with RSS values greater than
will be used to participate in the following steps, otherwise, RSS values will be set to 0. Through this method, slightly abnormal Aps with small RSS values (
) in the environment can be eliminated. Furthermore, from Equations (8) and (10), we can find that the component
of RSSD directly impacts the FD
. An AP is normal at one position but is a complete failure or removed or newly emerged at the other position, which can generate a significant
and affect the discrimination capability of FDs. To avoid this situation, make the following rule for
:
We assume that there always exist rich Aps in indoor scenes, and then through the above policies, abnormal Aps, especially the complete failure or removed or newly emerged Aps in the environment, can be eliminated almost.
4.2.2. Discrimination Correction Quantity
In indoor environments, due to signal reflection and refraction, or multipath fading and indoor noise, signal attenuation and RSS fluctuation are severe. Especially, they are closely related to the current environment. Therefore, we hope to adjust the RSS values generated by each AP at each location to weaken the impact of the environment. More generally speaking, the closer Aps are less affected by environmental uncertainty, and the correction rule is to emphasize closer Aps.
Here, we define a correction factor to quantitatively differentiate each AP for a specific location. According to the LDPL model (2) and the estimation of PDs between AP and a position, for the
k-th AP to the
i-th position, the correction factor is calculated as follows:
where
is the estimated distance between the
k-th AP and the
i-th position.
Furthermore, to reduce the strong signal influence, we introduce the way in [
17] to redefine the correction factor as follows:
where
is a flexible value, it can be given by empirical, for example, −50 dBm,
and
are constant parameters, which can be determined based on
such that
is continuous at
.
From Formula (17), because the reciprocal of PD is consistent with the monotonicity of the LDPL model, the closer the AP is, the greater the value of the correction factor. Therefore, the correction quantity is defined as follows:
where
is a flexible empirical value according to the environment conditions, usually between 3 dBm and 6 dBm, approximately equal to the value of signal fluctuation at a position.
4.2.3. Priority Weight
From Formula (13) in
Section 4.1.2, we can gain a rule: for Aps in different directions, the smaller the angle
is, the larger the RSSD
is, which means that the direction in which the AP generates a large RSSD is very advantageous for positions A and B. In other words, the Aps who get the larger RSSD have a greater contribution to FD’s discrimination capability. Here, we want to emphasize the importance of this AP by assigning higher priority weight to the RSSD.
Therefore, we grant each RSSD of each pairwise position from each AP a priority weight
as follows, which determines the priority and contribution of its participation in FD (see Equation (19)). As seen, the larger RSSD corresponds to a higher priority weight.
It is clear that
is the sigmoid function of
, so adding
to RSSD will not have a significant impact on the value of FD, avoiding affecting localization accuracy.
Note that abnormal Aps often generate large RSSD, and in this case, adding priority weights can increase the error. Therefore, priority weights need to be used after removing abnormal Aps.
Overall, through the above-provided discrimination capability improved policies, the Aps with larger contributions to FDs’ discrimination capability will be involved in the calculation of FDs, and the more suitable the Aps, the higher the degree of participation. Hence, FDs with strong discrimination capability to PDs are obtained, providing a guaranteed prerequisite for relevant fingerprint matching algorithms.
5. WKNN and FDDC-WKNN Algorithm
The K-nearest neighbor (KNN) algorithm is a simple and practical machine learning algorithm that is one of the most widely used fingerprint matching algorithms. By setting different weights to the K nearest neighbors, WKNN often performs better in localization performance.
Essentially, both KNN and WKNN obtain similar distances through RSS similarity. Thereby, we expect the above-provided FDs discrimination capability improved policies can be combined with WKNN to improve localization performance, namely FDDC-WKNN.
Moreover, for different TPs in a large-scale indoor space, it is often necessary to select different subsets of Aps. Hence, the FDDC-WKNN algorithm process also has the function of dynamically selecting AP subsets.
5.1. The Idea of KNN and WKNN Algorithms
The idea of the KNN algorithm is to select the positions of K fingerprints closest to the current positional fingerprint to estimate the current position, which is simple, intuitive, and effective. KNN is one of the algorithms that use FD of pairwise positions to discriminate PD between them.
We assume that there are
RPs in an indoor area, and
fingerprints were collected in the offline phase, which is denoted as
, where
, see Equations (6) and (7). In the online phase, the RSS vector of a TP is
. First, calculate all FDs
from position S to RPs; second, select
K nearest RPs based on all FDs; finally, use the following Formula (20) to get the estimation value of S coordinates:
The above formula indicates that the KNN algorithm sets the same weights to all K nearest neighbors. However, according to the propagation characteristics of indoor signals, it is known that the farther the distance, the stronger the signal fluctuation. For this reason, the WKNN algorithm sets different weights based on FD to emphasize that close neighbors contribute ability more, the estimation value of S coordinates can be expressed as follows:
where
is the weight related to distance. In this article we adopt the WKNN algorithm to be the indoor fingerprint algorithm, and the weights are given in the following formula.
Note that: Different K values mean that the number of nearest RPs selected is different, which can affect the accuracy of localization. The optimal K value is often gained from experience or practice.
5.2. The FDDC- WKNN Algorithms
Taking into account all the strategies and WKNN ideas discussed above, a uniform WKNN solution based on improving FD’s discrimination capability will be presented in this section, that is FDDC-WKNN. The presence of a large amount of abundant Aps in the indoor environment is a prerequisite for this solution.
For an indoor area, given that there are RPs and Aps and fingerprint databases have been built on the Server. In the online phase, when we have the RSS vectors at a TP, send them to the server, and want to get the TP location . Based on the policies aforementioned, the algorithm steps on the server side are given as follows:
Step1. Set a threshold , modify the RSS value in and the fingerprint data in the dataset by Formula (14). First, select a subset of Aps with a quantity of M: sort all RSS values > −100 and select the corresponding top Aps. Then, select RPs in the TP neighboring domain: check if all RSS of one RP under the above AP subset is greater than −100; if so, then it is considered that the RP belongs to the TP neighboring domain. Assume that the number of RPs belonging to the TP neighboring domain is , and the RSS vector of TP under this AP subset is denoted as .
Step2. Combine
and the RSS matrix of the
RPs to a new matrix
as follows, where the first row of
is RSS vector
at TP, and the other
rows are the RSS matrix at RPs.
Step3. Calculate the discrimination factor
for each RSS value and gain the discrimination correction quantity matrix
as follows:
where
Then, we calculate modified RSS data by
plus discrimination correction quantity matrix
, see Equation (26). Here,
can be seen as the RSS value corrected by the discrimination factor.
where
Step 4. Calculate the RSSD of the TP for each RP, that is, subtract each remaining row of the matrix
from the first row to obtain each RSSD vector
. Then, an RSSD matrix is gained as follows:
where
Use priority weight Formula (19) to calculate all priority weight of
, we can get the following priority weight matrix
.
where
.
Step5. Calculate the Hadamard product of
and
, that is multiply the corresponding elements of
and
as Equation (31). Then, we get the weighted RSSD
by priority weight.
where
Step 6. Calculate the FD of TP to each RP by Formula (10) to obtain an FD sequence
and express it by the following Equation (33):
where
Step7. Select the K RPs corresponding to the smallest K elements in sequence , and the positions of these K RPs are the K positions closest to TP. Then, calculate the weight of each RP according to Formula (22). Finally, according to Formula (21), we can get the location of TP.
In addition, in order to obtain the correlation and consistency between PD and FD in subsequent experiments, we need to calculate the PD of TP to each RP by Formula (9) to obtain an PD sequence
and express it as follows:
where
Compared to the WKNN algorithm, with a time complexity of , the FDDC-WKNN algorithm requires additional steps such as selecting subsets of Aps (step 1), calculating correction quantity (step 3), and calculating priority weight (step 4). With a time complexity of , the time complexity has increased by . During the fingerprint matching process, the sizes of and are both limited, ranging from tens to hundreds, so the impact on time efficiency is linear, and the FDDC algorithm has high time performance.
6. Simulation and Experiments
In this section, we will evaluate the policies and algorithms proposed in this article by conducting some experiments in two scenes: simulation and real environments. All simulations and experiments were conducted on a personal laptop using MATLAB V9.5. The computer is the 64-bit operating system based on x64 processors, and the hardware configuration is an Intel(R) Core(TM) i7-8550U CPU with 16.0 GB of RAM.
In particular, our technology is suitable for having a large number of Aps in both scenes, and there are phenomena such as Aps being damaged, malfunctioning, removed, etc.
6.1. Scenes Setup and Some Parameter Presets
First, the environment settings, fingerprint database construction, and some presets in both simulated and real scenarios were introduced.
6.1.1. Introduce for Experimental Scene
To simulate large-scale indoor space with rich Aps, we simulated a
sized indoor area with 18 Aps, and these Aps are placed on grid points every 10 m in the simulation area, as depicted in
Figure 7a. In our simulation process, RSS data are generated by ray tracing technology based on the LDPL model, which considers multiple propagation paths between the transmitting and receiving points, analyzes the signals of each path separately, and then superimposes to obtain the RSS value on each receiving point. Here, we mainly consider the issue of wall reflection paths and signal attenuation. The simulation was completed in MATLAB. Initially, we generated a simulation fingerprint dataset at position with an interval of 0.1 m. Then, we can get any offline fingerprint database from the dataset as needed. For example, an offline fingerprint database with an RPs interval of 1 m only needs to take values every 10 position points in the dataset, that is,
m, and so on.
In our following simulation experiment, the offline fingerprint database is from the above fingerprints dataset by setting the RPs interval of the offline database to 1 m, that is, there are RPs in the simulation area, and there are a total of RSS values from 18 Aps at each RP.
In the online phase, we simulated a target moving in a 100 square meter room in the simulation area (see the red rectangle in
Figure 7a) and obtained a trace with 100 trajectory points, as well as RSS on each trajectory point (see
Figure 7b). Meanwhile, we simulated two abnormal Aps, one of which was removed or broken, and the other had a minor malfunction. These 100 trajectory points are used as TPs for the indoor localization algorithm. In order to simulate people’s living habits, the movement trajectory moves in space within 1 m of the boundary.
The real experimental scene was on the second floor of the school library at Dalian University, and its plan is shown in
Figure 8a. The central position of this floor is the vacant part on the first floor, surrounded by an open bookshelf reading area, including a large number of open bookshelves and reading tables and chairs. The experiment area is located on the north side of the floor with about 1200 m
2, in the red rectangle area of
Figure 8a, and
Figure 8b is the enlarged experiment area. As can be seen, there are 261 RPs at an interval of 2 m. In addition, there are a total of 83 Aps that can work in the experimental area.
In the offline phase, we used an indoor positioning system that includes both server and mobile devices, which was developed and implemented in collaboration with a company. The server is deployed on a dual 2U ThinkServer, and the mobile APP is installed on the HUWEI P9 smartphone. Before building the fingerprint database, it is necessary to load the building structure plan into the server, and the real RPs can be marked in the plan, as shown in
Figure 8. Then, the handheld intelligent mobile terminal can be used for RSS collection in real RPs. At each RP, 15 RSS values for each received AP of 83 APs were collected, and the average value as the final RSS value was saved. In the online phase, to reduce the workload of data sampling, we randomly selected 50 points from these 261 to be TPs and set 8 abnormal APs with 4 APs removed or breakdowns and 4 APs with slight abnormality.
Overall, in order to have a clearer understanding of the two scenes, the setups for both scenes are shown in
Table 1.
6.1.2. Parameters Presetting
In the proposed algorithm, many parameters need to be set to experiment. Here, considering the environments of two scenes, the parameters are determined based on experience or specific conditions, respectively, as listed in
Table 2.
Due to the RSS being sampled directly in the real scene, the first three parameters are not required to be set.
6.2. Simulation and Analysis
Aiming to measure whether the proposed policies can improve the whole discrimination capability of FD to PD, we conducted some experiments to verify the correlation and consistency changes between FD and PD before and after using FDDC-WKNN. For the simulation environment, we first verified the effect of proposed policies on the discrimination capability of FD to PD. Then, for a random trace with 100 positions, we find the optimal value of K, and with the optimal K, we simulate the localization positions and derive the CDF curves to demonstrate the localization performance.
For the real scene in the Dalian University library, we also find the optimal value of K for 50 random TPs, and with the optimal K, we gain the localization results similarly.
6.2.1. Discrimination Capability of FD to PD
By calculating the PD sequence
using Equation (35) and the FD sequence
using Equation (33) between 100 TPs and 81 RPs in the localization area, we gain the correlation coefficient and consistency coefficient between each TP and RP. From the results shown in
Figure 9, we can see that both the correlation coefficient and the consistency coefficient have greatly improved. After implementing the policies proposed in this article, strong correlation increased from 0 to 76%, and consistency increased from 26% to 99%, which indicates that the discrimination capability of FD to PD has been greatly improved.
To further examine the impact of abnormal AP on FDs’ discrimination capability, we repeated the experiment of calculating
and
under diverse abnormal AP conditions to obtain the correlation coefficient and consistency coefficient between PD and FD. For clarity, the experimental conditions are recorded as the following situations, and the results before and after implementing FDDC_WKNN are displayed in
Table 3.
Ⅰ: All APs are in a normal state.
Ⅱ: There exists one AP with a slight abnormality.
Ⅲ: There exists one AP with a complete failure.
From the results in
Table 2, it can be seen that the presence of completely failed APs in the localization area has the greatest impact on the correlation and consistency between FD and PD. Moreover, regardless of the conditions, the policies proposed in this article have improved the correlation and consistency between FD and PD to varying degrees.
6.2.2. Setting of K
In the same conditions of localization area, different K values in FDDC-WKNN can lead to different localization accuracy. We assumed that the K value with the lowest average errors was optimal. The optimal K value is usually not fixed and varies depending on the environment. We compared the average errors under different K values for both simulation and real scenes to obtain the corresponding optimal K value.
As shown in
Figure 10, the optimal K value is 7 in the simulation scene and 6 in the real scene. However, we found that the average localization errors did not change significantly with different K values. Although our policies have greatly improved the FD’s capability to discriminate PD, the optimal K value is only reduced from 7 to 6.
6.2.3. Localization Examination by FDDC-WKNN under the Optimal K
For
in the simulation scene with 100 TPs and
in the real library scene with 50 TPs, we apply the FDDC-WKNN algorithm, and the localization results are demonstrated in
Figure 11 and
Figure 12, respectively.
In the simulation scene,
Figure 11a shows the locations of TPs and the estimated locations of TPs obtained based on the FDDC-WKNN algorithm. It can be seen from
Figure 11b that before using the FDDC-WKNN algorithm, due to the influence of abnormal Aps, the localization errors were large and fluctuated greatly, while the localization errors have significantly decreased and are stable after using the FDDC-WKNN algorithm.
In the real library scene, the localization results are basically consistent with the simulation scene (see
Figure 12).
Figure 12a shows the locations of TPs and the estimated locations of TPs obtained based on the FDDC-WKNN algorithm. Due to the complexity and variability of the real environment, the localization errors are larger and the error fluctuation is also greater (see
Figure 12b).
Moreover,
Figure 13 presents a comparison of the CDF between FDDC-WKNN and other algorithms, including KNN, Euclidian-WKNN, Manhattan-WKNN, and KNN, which are related popular algorithms involved in fingerprint-base localization. The FDDC-WKNN algorithm, which considers AP contribution issues, has better accuracy and stability than other algorithms.
In order to more accurately demonstrate the localization errors of several algorithms,
Table 3 presents the mean errors of these several algorithms. From
Table 4, it can be seen that the mean errors have been reduced from 2.2732
to 1.2290
and from 4.0489
to 2.4320
in both the simulation scene and the real library scene, respectively.
In summary, we have examined the WKNN algorithm based on the improved FDs’ discrimination capability proposed in this article through multi-directional experiments. All experimental results indicate that the proposed approach can improve FD’s capability to discriminate PD in environments with abnormal APs, thereby reducing localization errors and improving localization accuracy. In addition, the proposed algorithm performs stably in the presence of interference in the environment, proving its robustness.
7. Conclusions
In this article, by defining a threshold, a correct quantity, and a priority weight, the capability of FD to discriminate PD is greatly improved for complexity and AP-rich indoor environments. Furthermore, combined with WKNN, a new indoor fingerprint localization technique referred to as FDDC-WKNN was provided. The FDDC-WKNN algorithm is particularly suitable for large-scale indoor scenarios with ubiquitous uncontrollable APs and has a stable and robust localization performance.
However, the large-scale indoor scenes considered in this article are relatively stable; for indoor scenes with frequent changes, corresponding updates to the fingerprint dataset are required to cooperate, which will bring a heavy workload.
In addition, during large-scale indoor movement, the selected AP subset also changes dynamically due to the different positioning target positions. Therefore, it is necessary to select the AP subset for each TP. In order to improve time efficiency, the initial AP selection in this article is only filtered through a threshold. In fact, before conducting RSS similarity calculations, a suitable subset of AP can be further screened, which will be a future research direction.
The discrimination capability of FD to PD is the key foundation for most indoor localization methods based on FD, which is only applied and validated in the WKNN algorithm in this article. Next, the algorithm will be implemented, verified, and improved using different indoor localization techniques.
Also, improving the accuracy of localization with the other various intelligent algorithms and auxiliary facilities, studying different strategies based on various environmental requirements, and so on are the future study directions.