In this section, we give an overview of the system and then describe its three parts in detail: the BWFP Module, the BWFP Augmentation Module and the Motion Detection Module.
3.1. System Overview
In existing multi-story buildings, such as shopping malls and large office buildings, the indoor environment is often complex: hollow and non-hollow areas may intersect, and users may cross multiple floors by various means (elevators, escalators, or stairs). This poses a challenge to floor positioning and makes it difficult to rely on a single floor positioning approach. For conciseness, we refer to a non-hollow area as a closed area, meaning that adjacent floors are completely separated by concrete slabs.
Benefiting from the various sensors embedded in common smartphones, our system takes both the Wi-Fi [27] fingerprints received by mobile phones and the readings of accelerometers, gyroscopes [28] and barometers [29] as input. Through the combination and improvement of various methods, the floor can be located robustly and accurately.
As shown in Figure 1, our Motion Detection Module is designed to accurately detect users’ floor switching motions, including but not limited to going upstairs, going downstairs, taking an elevator up, and taking an elevator down. When no floor transition motion is detected, BWFP is performed to achieve accurate floor positioning in closed areas. In hollow areas, however, the Wi-Fi fingerprints of adjacent floors are observed to be very similar, because there is no concrete slab barrier and the wireless signal attenuation is very small, so it is difficult to localize floors accurately. To achieve accurate floor positioning in both hollow and closed areas, we develop the BWFP Augmentation Module, which combines BPFP and HMM as a compensation method. BPFP relies on the mapping between barometric pressure and altitude. HMM is utilized to correct the occasional floor positioning errors caused by the BWFP or the BPFP method. Once a floor transition event is detected, the vertical coordinate in the intermediate area between adjacent floors is estimated from the barometric pressure difference between the current time and the last time the motion detection result changed from the walking motion to the floor switching motion.
In order to accurately localize floors in all complex indoor areas, including hollow areas, closed areas and intermediate areas between adjacent floors, we must ensure that the right remedy is applied, i.e., the BWFP Module is applied in closed areas, and BPFP and HMM are combined to strengthen floor positioning in hollow areas. However, it is difficult to decide when and which method to apply, because the system does not know whether the user is in a hollow or closed area at any given time. We propose a confidence threshold judgment method to handle this problem. Our system uses the BWFP results with high confidence to opportunistically calibrate the barometric pressure. The detailed design is described in Section 3.3, and Section 3.4 describes how to judge the user’s floor switching behavior.
3.2. BWFP Module
Due to the barrier of concrete slabs, the access point set and received signal strength of the same AP are remarkably different between adjacent floors in closed areas. By making full use of these characteristics, we can achieve accurate floor positioning in such areas. We model the floor positioning as a supervised multi-classification problem. Each floor is given a different label. The floor with the largest prediction probability is taken as the final floor estimation.
The BWFP method contains two phases: an offline training phase and an online prediction phase. In the offline training phase, the AP fingerprints collected along all paths on the same floor are given the same label. The number of fingerprints on each floor should be balanced.
Let $L$ represent the $p$ consecutive floors of a multi-story building and $l_i$ represent the $i$-th floor:
$$L = \{l_1, l_2, \ldots, l_p\}$$
and let $O$ be a set of $D$ observations as given by:
$$O = \{o_1, o_2, \ldots, o_D\}$$
where $o_j = \{(ap_1, rss_1), (ap_2, rss_2), \ldots, (ap_m, rss_m)\}$ indicates an observation consisting of $m$ APs and their corresponding RSSI values.
Since the number of scanned APs differs between locations on a floor, the dimension of the observation vector is dynamic. For convenience, we use $rss_i$ to indicate the received signal strength from the $i$-th AP.
We introduce XGBoost [12] into the BWFP. The XGBoost classifier combines multiple tree models to build a strong multi-classifier. Before applying the XGBoost classifier, we embed the original AP fingerprints in the Euclidean space. In this way, we can define distances and similarities over features in the Euclidean space. Concretely, we apply an exponential operation to every $rss_i$ in $o_j$, as Equation (1) shows:
$$\widetilde{rss}_i = \lambda^{rss_i} \tag{1}$$
where $\lambda$ is the index coefficient, i.e., the base of the exponential. The hyperparameter $\lambda$ is determined by experiments (in this paper, it is set to 1.035). Since the received signal strengths are negative, the exponential operation maps the original signal strengths to values between 0 and 1, which can speed up the training process. $x_j$ is the signal vector of observation $o_j$ after the embedding operation. Next, we attach the ground truth floor $y_j$ as a label to the original data. The set of pairs $(x_j, y_j)$ represents the training and testing samples.
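The embedding of Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `embed_fingerprint`, the `ap_index` lookup table, and the choice to encode unobserved APs as 0 (the limit of $\lambda^{rss}$ for a vanishing signal) are assumptions made here to obtain a fixed-length vector.

```python
from typing import Dict, List

LAMBDA = 1.035  # base of the exponential, value reported in the text


def embed_fingerprint(scan: Dict[str, float], ap_index: Dict[str, int],
                      base: float = LAMBDA) -> List[float]:
    """Map a variable-length Wi-Fi scan to a fixed-length vector in [0, 1)."""
    # Assumption: APs that were not heard in this scan are encoded as 0.
    vec = [0.0] * len(ap_index)
    for bssid, rss_dbm in scan.items():
        col = ap_index.get(bssid)
        if col is not None:              # skip APs unknown to the training set
            vec[col] = base ** rss_dbm   # negative dBm gives a value in (0, 1)
    return vec


# Hypothetical AP identifiers and readings, for illustration only.
ap_index = {"ap:01": 0, "ap:02": 1, "ap:03": 2}
scan = {"ap:01": -45.0, "ap:03": -80.0}
x = embed_fingerprint(scan, ap_index)
```

A stronger signal yields a larger embedded value, so distances between embedded vectors are dominated by the APs that are heard well.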
Through training, $K$ trees are obtained in XGBoost, and the cumulative value of the $K$ trees is the estimated floor positioning result, as Equation (2) shows:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \quad f_k \in \mathcal{F} \tag{2}$$
where $\mathcal{F}$ is the set of all possible trees, $f_k(x_i)$ is the weight of sample $x_i$ on the leaves of the $k$-th tree, which represents the scores of the different floors, and $\hat{y}_i$ is the final estimated floor positioning result obtained by accumulating the $f_k(x_i)$ of all $K$ trees.
Our loss function is defined in Equation (3), and its second term is the regularization term given in Equation (4), where $T$ is the number of leaf nodes, $w$ is the output score of each leaf node, and $\gamma$ and $\lambda_w$ are the weights expressing the preference between the two terms. In Equation (3), $y_i$ represents the ground truth floor, $\hat{y}_i$ represents the output of the classifier and $l(y_i, \hat{y}_i)$ represents the MSE (Mean Squared Error) between $y_i$ and $\hat{y}_i$. $n$ is the number of training samples and $K$ is the number of trees in the model:
$$Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k) \tag{3}$$
$$\Omega(f_k) = \gamma T + \frac{1}{2} \lambda_w \lVert w \rVert^2 \tag{4}$$
Our goal is to balance the MSE and the regularization term so as to obtain the optimal model from a large amount of fingerprint training data. The regularization term avoids over-fitting by controlling the complexity of the model. During the online floor positioning phase, the exponential operation is first applied to the RSSI fingerprints collected in real time. The preprocessed RSSI fingerprint is then fed into the pre-trained model to identify the floor on which the user most likely stays.
We introduce a confidence parameter to represent the reliability of the BWFP result, defined as the maximum floor prediction probability output by BWFP. Only high-confidence floor positioning results are used to calibrate the barometric pressure (detailed in the next section). For example, suppose the BWFP Module is employed in a four-story building and the estimated probabilities of the four floors are (0.23, 0.28, 0.27, 0.22). Although the prediction probability of F2 is the highest, its margin over the other floors is so small that it is better to discard the floor estimation result and resort to other methods. The floor probabilities can be obtained directly by setting the ‘objective’ to ‘multi:softprob’ in XGBoost. Only floor estimations with high confidence are accepted; those with low confidence are discarded. The details are given in Section 3.3.
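The confidence rule can be illustrated with a short Python sketch that operates on a probability vector such as the one a ‘multi:softprob’ model returns. The function name `bwfp_decision` and the threshold value 0.9 are hypothetical; the paper selects its threshold experimentally.

```python
from typing import Optional, Sequence, Tuple


def bwfp_decision(probs: Sequence[float],
                  threshold: float) -> Tuple[Optional[int], float]:
    """Return (floor_index, confidence); floor_index is None if rejected."""
    confidence = max(probs)                    # confidence = maximum class probability
    floor_index = list(probs).index(confidence)
    if confidence < threshold:
        return None, confidence                # low confidence: discard the estimate
    return floor_index, confidence


# The four-floor example from the text; 0.9 is an assumed threshold.
floor, conf = bwfp_decision([0.23, 0.28, 0.27, 0.22], threshold=0.9)
```

Here the estimate is rejected even though F2 (index 1) has the highest probability, so the system would fall back on the barometer-based method.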
To evaluate the performance of BWFP, we conducted several experiments in four different buildings, whose structures are described in Table 1. Building 1 contains large hollow areas from the first floor to the third floor. Building 2 and Building 3 have small hollow interiors between adjacent floors. Building 4 is almost entirely hollow from the first floor to the fourth floor. Three Wi-Fi-enabled smartphones (Xiaomi 5s, Huawei Mate20 and Huawei Mate9) were used to collect floor fingerprints. We used an application that queried the Android Application Programming Interface (API) for the Wi-Fi fingerprints (0.5-Hz sampling rate for all smartphones). We uniformly collected the AP fingerprints along all paths of each floor in the four buildings.
We collected a total of 5930 training samples from the underground floor to the third floor in Building 1 (1223 samples on the underground floor, 2379 on the first floor, 1551 on the second floor and 777 on the third floor). Among them, 1661 samples were collected using the Xiaomi 5s, 1580 using the Huawei Mate20 and 2689 using the Huawei Mate9. A total of 4058 APs were observed in Building 1. Similarly, we collected 3017 training samples from the first floor to the third floor in Building 2 (1069 samples on the first floor, 938 on the second floor and 1010 on the third floor). Among them, 653 samples were collected using the Xiaomi 5s, 402 using the Huawei Mate20 and 1962 using the Huawei Mate9. A total of 1102 APs were observed in Building 2. We collected 812 training samples from the seventh floor to the eighth floor in Building 3 (325 samples on the seventh floor and 487 on the eighth floor). A total of 581 APs were observed in Building 3. We collected 6995 training samples from the first floor to the fourth floor in Building 4 (2416 samples on the first floor, 1994 on the second floor, 1390 on the third floor and 1195 on the fourth floor). A total of 1826 APs were observed in Building 4. The samples of Building 3 and Building 4 were all collected with the Huawei Mate9. The samples were collected while a tester walked along all paths within the building. In summary, there are 16754 training samples in total, and collecting the samples took about 13 min per floor on average.
We also collected 6571 test fingerprints in Building 1 about two months after the training samples were collected (1272 samples on the underground floor, 2751 on the first floor, 1595 on the second floor and 953 on the third floor). It is worth noting that 2252 of these test samples were collected near the hollow areas (825 on the first floor, 798 on the second floor and 629 on the third floor). The test samples of Building 2, Building 3 and Building 4 were collected five days after the training samples. In total, we collected 10849 test samples. The accuracy of floor positioning is defined as the number of test samples whose floor is correctly identified divided by the total number of test samples.
Table 2 compares the accuracy of our proposed classification-based BWFP using XGBoost with that of the Bayesian method used in HYFI [9] in the four experimental buildings. The table shows that the XGBoost-based method obtains better floor positioning accuracy than the likelihood-based method.
In order to verify whether our proposed BWFP model works well in the hollow areas, we set up a comparative experiment between hollow and closed areas. We used the pre-trained BWFP model to localize floors in the hollow and closed areas of Building 2, respectively. As can be seen from Table 3, the accuracy in the closed areas reaches nearly 100%, while the average accuracy in the hollow areas is about 95.3%. Moreover, the accuracy on F3 is only 89.3%, which is far from meeting the requirements of practical applications.
This shows that even though we use the powerful classifier XGBoost, the positioning accuracy of BWFP in the hollow areas is still unsatisfactory because of indistinguishable Wi-Fi fingerprints between adjacent floors.
3.3. BWFP Augmentation Module
The previous section shows that even though we use the powerful XGBoost classifier, the positioning accuracy of BWFP in the hollow areas is still unsatisfactory. In regions with hollow areas, the Wi-Fi signals of adjacent floors are similar and easily confused, so it is difficult to identify the floor accurately using BWFP alone. In addition, classification-based BWFP cannot localize the user in the intermediate areas between adjacent floors, which is critical for complete floor positioning. Therefore, we combine BPFP and HMM as a complement.
The BPFP utilizes the physical relationship [8] between altitude and barometric pressure, as Equation (5) shows, to obtain the altitude change for floor positioning:
$$h = \frac{T_0}{Lapse}\left[\left(\frac{P}{P_0}\right)^{-\frac{R \cdot Lapse}{g}} - 1\right] + h_0 \tag{5}$$
where $T_0$ and $P_0$ are the temperature and pressure at sea level, respectively. $T_0$ is 288.15 K and $P_0$ is 1013.25 hPa in the standard atmosphere. $Lapse$ is the lapse rate of temperature, $R$ is the gas constant and $g$ is the gravitational acceleration. $h_0$ is set to 0 at sea level, and $P$ represents the real-time barometer measurement. However, Equation (5) is only applicable under the standard atmosphere condition. For non-standard atmosphere conditions, the altitude obtained using Equation (5) needs to be further calibrated as described in [7].
For convenience, we use $h_c$ to represent the calibrated altitude:
$$h_c = h + h_{bias} \tag{6}$$
where parameter $h_{bias}$ represents the bias of altitude caused by different air-data conditions, which is opportunistically calibrated with high-confidence BWFP results. After calculating the calibrated barometric altitude, and assuming that the sea level approximates the geoid level, we can obtain the altitude from the barometer measurements and vice versa. The BPFP can not only compensate for the unreliable positioning of the BWFP in the hollow areas, but also estimate vertical coordinates in the intermediate area between adjacent floors during floor transitions.
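As a worked illustration of the pressure-to-altitude mapping, the sketch below evaluates the standard-atmosphere barometric formula in Python. Only the sea-level temperature and pressure come from the text; the lapse rate (written with a negative sign), the specific gas constant of dry air and the standard gravity are assumed textbook values, and the function names are hypothetical.

```python
T0 = 288.15        # K, sea-level temperature (from the text)
P0 = 1013.25       # hPa, sea-level pressure (from the text)
LAPSE = -0.0065    # K/m, assumed standard-atmosphere lapse rate (negative sign convention)
R = 287.053        # J/(kg K), assumed specific gas constant of dry air
G = 9.80665        # m/s^2, assumed standard gravity


def pressure_to_altitude(p_hpa: float, h0: float = 0.0) -> float:
    """Barometric altitude in metres under standard-atmosphere conditions."""
    return T0 / LAPSE * ((p_hpa / P0) ** (-R * LAPSE / G) - 1.0) + h0


def calibrated_altitude(p_hpa: float, bias_m: float) -> float:
    """Add an externally estimated altitude bias for non-standard conditions."""
    return pressure_to_altitude(p_hpa) + bias_m


# Height change that corresponds to a 0.5 hPa pressure drop near sea level.
delta_h = pressure_to_altitude(1012.75) - pressure_to_altitude(1013.25)
```

Near sea level a pressure change of about 0.5 hPa corresponds to roughly one storey of height, which is why small environmental pressure drifts matter for floor positioning.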
However, barometric pressure is susceptible to environmental factors, such as temperature and humidity change. In modern large-scale buildings, we find that the barometric pressure may be different between areas with a front view towards the sun and a rear view away from the sun. Air conditioning also may affect barometer measurements. In brief, barometer measurements cannot be directly used to identify floor levels without calibration.
Therefore, the mapping between floor levels and the corresponding barometric pressures must be updated in real time to ensure that the pressure maintained by the system is reliable and accurate for BPFP. Assuming that the height between each pair of adjacent floors is known in advance, we only need to maintain a “reference pressure” automatically for a multi-story building during the whole positioning period. We define the “reference pressure” as the real-time calibrated pressure of the “reference floor”, which can be any floor within the building (e.g., the first floor in this paper). In areas where the Wi-Fi signals can be distinguished with high confidence, the “reference pressure” is calibrated by the highly reliable result from BWFP, while in the hollow areas the floor is predicted using BPFP with the “reference pressure”.
As mentioned in Section 3.2, the BWFP Module gives a floor positioning confidence together with each floor positioning estimate. We use a confidence threshold judgment method to select appropriate BWFP results for BPFP calibration. As a hyperparameter, the floor positioning confidence threshold is selected based on experiments. The higher the threshold, the higher the positioning accuracy required of BWFP. By filtering on confidence with the preset threshold, the right remedy is applied. The relationship between the threshold and the number of BPFP triggers will be discussed in detail in Section 4.1. More specifically, the method includes the following two steps:
(1) “Reference pressure” calibration: when the confidence of BWFP exceeds the preset threshold, the “reference pressure” is updated by the algorithm shown in Figure 2. Parameter $P$ represents the current barometer measurement, and $f$ represents the floor estimation result obtained by BWFP. We use an array heights[] to record each height between adjacent floors within a multi-story building. Variable $\Delta H$ is the height difference between the current floor and the reference floor (the first floor in this paper), which can be estimated from the parameter $f$ and heights[]. The ref_pre is the calibrated pressure of the reference floor and reference_height is the estimated height of the reference floor.
(2) Floor prediction: the “reference pressure” is used to predict the floor level. The inference process is shown in Figure 3.
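The two steps can be sketched as follows. Since Figures 2 and 3 are not reproduced here, this is only one plausible reading of the procedure, not the authors' algorithm: the class `ReferencePressure`, the nearest-floor rule in `predict`, and the standard-atmosphere constants other than the sea-level temperature and pressure are assumptions. Floor index 0 denotes the reference floor.

```python
from itertools import accumulate
from typing import List, Optional, Tuple

T0, P0 = 288.15, 1013.25                   # K and hPa, sea-level values from the text
LAPSE, R, G = -0.0065, 287.053, 9.80665    # assumed standard-atmosphere constants


def altitude(p_hpa: float) -> float:
    """Barometric altitude in metres (standard atmosphere, h0 = 0)."""
    return T0 / LAPSE * ((p_hpa / P0) ** (-R * LAPSE / G) - 1.0)


def pressure(h_m: float) -> float:
    """Inverse of altitude(): pressure in hPa at a barometric altitude."""
    return P0 * (1.0 + LAPSE * h_m / T0) ** (-G / (R * LAPSE))


class ReferencePressure:
    """Maintains ref_pre, the calibrated pressure of the reference floor."""

    def __init__(self, heights: List[float]) -> None:
        # offsets[i] = height of floor i above the reference floor (index 0)
        self.offsets = [0.0] + list(accumulate(heights))
        self.ref_pre: Optional[float] = None

    def calibrate(self, p_hpa: float, floor: int) -> None:
        """Step 1: update ref_pre from a high-confidence BWFP floor."""
        reference_height = altitude(p_hpa) - self.offsets[floor]
        self.ref_pre = pressure(reference_height)

    def predict(self, p_hpa: float) -> Tuple[int, float]:
        """Step 2: return (nearest floor index, height above reference floor)."""
        if self.ref_pre is None:
            raise ValueError("reference pressure has not been calibrated yet")
        dh = altitude(p_hpa) - altitude(self.ref_pre)
        floor = min(range(len(self.offsets)),
                    key=lambda i: abs(dh - self.offsets[i]))
        return floor, dh


tracker = ReferencePressure(heights=[4.5, 4.0, 4.0])  # hypothetical 4-floor building
tracker.calibrate(pressure(54.5), floor=1)            # confident BWFP fix on floor index 1
floor, dh = tracker.predict(pressure(58.5))           # later reading in a hollow area
```

Because only a pressure difference relative to `ref_pre` is used, slow environmental drift is absorbed each time a confident BWFP result triggers `calibrate`.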
Considering that the floor level estimated by the abovementioned floor prediction method may not be exactly an integer, and a user may stay in the intermediate areas between two adjacent floors, we design a threshold-based floor detection scheme, as Figure 4 shows.
If $\Delta h_i$ is smaller than the preset threshold, the user is judged to stay on the $i$-th floor. Otherwise, once the floor switching motion is detected, which will be described in the next section, the user is judged to be situated in the intermediate area between adjacent floors and the vertical coordinates are then estimated using Equation (7).
We define:
$$\Delta h_i = \lvert h - H_i \rvert \tag{7}$$
where $h$ is the height calculated by the BPFP method, and $H_i$ is the real height of the $i$-th floor level.
Considering the instability of Wi-Fi signals and the sudden transitions between hollow and closed areas, occasional floor estimation errors may occur. To reduce such jumps, we leverage the Hidden Markov Model (HMM) [30] to correct the final floor prediction. Based on the Markov hypothesis, the HMM captures temporal dependence by carrying historical information. We define our HMM as Equation (8) shows:
$$\mathrm{HMM} = (A, B, \pi) \tag{8}$$
where $A$ is the transition probability, i.e., the probability of moving from one floor to another, calculated from the statistical analysis of user behavior training datasets. $B$ is the emission probability and is obtained from the prediction confusion matrix. Parameter $\pi$ is the initial state probability vector, obtained from statistics of the user behavior training dataset.
We regard the floor estimation obtained by BWFP or BPFP as the observation and the ground-truth floor as the hidden state in the Markov chain. We use the uncertain floor estimation obtained by BWFP or BPFP to infer the floor on which the user most likely stays. This is equivalent to the decoding problem in HMM. Given the model and the floor observation sequence, the HMM reveals the hidden (real) floor level using the Viterbi method [31]. We abbreviate the real floor as $rf$ and the predicted floor as $pf$, and let $N$ be the length of the observation sequence. The floor estimation is modeled as a maximization problem, as Equation (9) shows:
$$\widehat{rf}_{1:N} = \arg\max_{rf_{1:N}} P(rf_{1:N} \mid pf_{1:N}) \tag{9}$$
According to the Markov hypothesis, Equation (9) is equivalent to:
$$\widehat{rf}_{1:N} = \arg\max_{rf_{1:N}} \; \pi(rf_1)\, P(pf_1 \mid rf_1) \prod_{t=2}^{N} P(rf_t \mid rf_{t-1})\, P(pf_t \mid rf_t) \tag{10}$$
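The decoding step can be illustrated with a compact Viterbi implementation in Python. The transition matrix `A`, emission matrix `B` and initial vector `pi` below are made-up values for a three-floor example; in the paper they are estimated from user behavior data and the prediction confusion matrix.

```python
import math
from typing import List, Sequence


def _log(x: float) -> float:
    return math.log(x) if x > 0.0 else float("-inf")


def viterbi(obs: Sequence[int], pi: Sequence[float],
            A: Sequence[Sequence[float]],
            B: Sequence[Sequence[float]]) -> List[int]:
    """Most likely hidden floor sequence given the observed floor estimates."""
    n_states = len(pi)
    score = [_log(pi[s]) + _log(B[s][obs[0]]) for s in range(n_states)]
    back: List[List[int]] = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            # best predecessor of state s at this time step
            best = max(range(n_states), key=lambda r: prev[r] + _log(A[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(A[best][s]) + _log(B[s][o]))
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):           # backtrack from the last time step
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Illustrative parameters (assumptions): users mostly stay on their floor.
pi = [1 / 3, 1 / 3, 1 / 3]
A = [[0.90, 0.10, 0.00], [0.05, 0.90, 0.05], [0.00, 0.10, 0.90]]
B = [[0.80, 0.20, 0.00], [0.10, 0.80, 0.10], [0.00, 0.20, 0.80]]

smoothed = viterbi([0, 0, 1, 0, 0], pi, A, B)   # one isolated jump to floor 1
moved = viterbi([0, 0, 1, 1, 1], pi, A, B)      # a persistent change of floor
```

With these parameters the isolated jump is treated as an estimation error and removed, while the persistent change is kept as a real floor transition.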
In summary, we propose a BWFP augmentation with BPFP and HMM. The augmentation mainly includes two parts. First, BPFP is used to strengthen positioning in the hollow areas and in the intermediate areas between adjacent floors, where BWFP cannot provide accurate floor positioning results; for this, only the “reference pressure” needs to be maintained. Second, HMM is used to correct the occasional floor positioning errors. The overall performance is evaluated in Section 4.
3.4. Motion Detection Module
Most existing floor positioning methods can only estimate the floor level and therefore cannot provide services while users are moving in the intermediate area between adjacent floors. As Figure 1 shows, motion detection serves as a critical judgment condition: we first detect users’ motion using inertial sensor measurements. Once a floor transition motion is detected, the vertical coordinate of the user is continuously estimated. Otherwise, the floor is estimated by the BWFP Module or the BWFP Augmentation Module.
When a user moves between different floors (going upstairs/downstairs, taking an elevator up/down), the changing patterns of barometric pressure, acceleration and angular velocity differ from those observed when the user walks along the same floor. This motivates our approach to detecting floor switching motion accurately.
However, because the pressure drifts with environmental factors, calculating the height change directly from barometer measurements, as described in [32], is very unreliable. Therefore, we propose to detect users’ motion using accelerometer and gyroscope readings. We regard motion detection as a binary classification problem with two classes: floor switching motion and non-floor switching motion. Floor switching motions include but are not limited to going upstairs/downstairs and taking an elevator up/down. Once a floor switching motion is detected, we calculate the barometric pressure difference between the current time and the last time the motion detection result changed from the non-floor switching motion to the floor switching motion. Combined with Equation (6) in the previous section, the user’s vertical coordinates can be calculated.
Theoretically, acceleration can characterize a sudden change of user motion, such as an overweight or underweight state during floor switching. The gyroscope can detect the turning motion between floors when the user goes upstairs or downstairs. To verify the feasibility of this idea, we collected accelerometer and gyroscope measurements during walking, going upstairs, going downstairs, taking an elevator up, and taking an elevator down. The experimental results are shown in Figure 5. Note that the non-floor switching motion measurements were collected on a single floor, and we stretch the data for comparison. As can be seen from Figure 5a,b, different motions produce different acceleration change patterns, such as a sharp drop and rise while taking the elevator up or down. The acceleration also oscillates more strongly while going upstairs or downstairs than while walking on the same horizontal floor. As shown in Figure 5c,d, the gyro data differ slightly under different motion modes, and we expect the classifier to judge the user’s motion pattern by distinguishing the change patterns of the acceleration and gyro data.
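Since the device attitude fluctuates with different holding modes and the user’s walking produces various noise, we transform all sensor data from the carrier coordinate system (b-system) to the navigation coordinate system (n-system) to obtain uniform sensor features that are independent of holding mode and gait. The transformation from the carrier coordinate system to the navigation coordinate system [33] is as follows:
$$\Delta\theta_m = \int_{t_{m-1}}^{t_m} \omega \, dt$$
$$C_{b(m)}^{b(m-1)} = I + \frac{\sin\lVert\Delta\theta_m\rVert}{\lVert\Delta\theta_m\rVert}(\Delta\theta_m\times) + \frac{1-\cos\lVert\Delta\theta_m\rVert}{\lVert\Delta\theta_m\rVert^{2}}(\Delta\theta_m\times)^{2}$$
$$C_{b(m)}^{n} = C_{b(m-1)}^{n}\, C_{b(m)}^{b(m-1)}$$
where $\omega$ is the angular velocity obtained from the gyro measurements, $\Delta\theta_m$ is the angle through which the carrier rotates from time $m-1$ to $m$, and $(\Delta\theta_m\times)$ denotes its skew-symmetric matrix. $C_{b(m)}^{b(m-1)}$ is the transformation matrix of the b-system from time $m-1$ to $m$. $C_{b}^{n}$ is the attitude matrix and should be updated in real time based on the output of the gyro.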
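A gyro-driven attitude update of this kind can be sketched in plain Python. This is a generic strapdown update using the Rodrigues rotation formula, offered as an illustration rather than the authors' exact procedure; the helper names, the identity initial attitude and the constant-rate assumption within each 10 ms sample are all assumptions.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_increment(omega: Sequence[float], dt: float) -> Matrix:
    """Rotation of the body frame over one sample, assuming a constant rate."""
    tx, ty, tz = (w * dt for w in omega)           # rotation vector for this step
    angle = math.sqrt(tx * tx + ty * ty + tz * tz)
    skew = [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]
    if angle < 1e-12:                              # small-angle limits
        a, b = 1.0, 0.5
    else:
        a = math.sin(angle) / angle
        b = (1.0 - math.cos(angle)) / angle ** 2
    skew2 = matmul(skew, skew)
    return [[(1.0 if i == j else 0.0) + a * skew[i][j] + b * skew2[i][j]
             for j in range(3)] for i in range(3)]


def update_attitude(c_bn: Matrix, omega: Sequence[float], dt: float) -> Matrix:
    """Propagate the body-to-navigation attitude matrix by one gyro sample."""
    return matmul(c_bn, rotation_increment(omega, dt))


def to_navigation(c_bn: Matrix, v_body: Sequence[float]) -> List[float]:
    """Rotate a body-frame vector (e.g. acceleration) into the n-system."""
    return [sum(c_bn[i][k] * v_body[k] for k in range(3)) for i in range(3)]


# Phone turning about its z-axis at 90 deg/s for 1 s, sampled at 100 Hz.
C = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # assumed initial attitude
for _ in range(100):
    C = update_attitude(C, omega=(0.0, 0.0, math.pi / 2), dt=0.01)
acc_n = to_navigation(C, [1.0, 0.0, 9.8])
```

After the transformation the vertical component is unaffected by how the phone is turned in the hand, which is the property the feature extraction relies on.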
Motion detection is divided into an offline training phase and an online motion detection phase. In the offline training phase, to improve floor transition detection accuracy and reduce the detection delay, a sliding window with 50% data overlap is used to collect 2.56 s of accelerometer and gyro measurements (called a data frame). All training data frames are then labelled according to the corresponding user motion. The data frames of the different sensors are examined at four levels of granularity, i.e., statistical features, time domain features, peak and segment features, and frequency domain features, which are combined into the final frame feature vector.
More specifically, we extract a total of 46 features from the frame data, as Table 4 shows. “Vertical” represents Z-axis measurements and “horizontal” represents the modulus of the X-axis and Y-axis measurements. In this module, we also employ the XGBoost method to fit the collected training data, and the same features are extracted from the real-time readings for motion detection.
As a vital decision factor, the accuracy of motion detection must be ensured. As in Section 3.3, the HMM is also used to correct occasional motion detection errors. We collected training data comprising 2774 floor switching samples (including going upstairs and downstairs, taking an elevator up and down) and 3761 walking samples on horizontal floors using three different mobile phones (Huawei Honor V1, Huawei Mate9, Xiaomi MIX2).
The sensor sampling frequency is set to 100 Hz, i.e., an accelerometer sample and a gyroscope sample are collected from the mobile phone every 10 ms. Each data frame (256 samples spanning 2.56 s) is then used to construct a feature vector.
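The framing described above can be sketched as follows. The frame length, overlap and sampling rate follow the text, while the four features computed here (mean, standard deviation, range and dominant frequency) are only an illustrative subset chosen for this sketch and are not the 46 features of Table 4; all function names are hypothetical.

```python
import math
from typing import List, Sequence

FS_HZ = 100               # sampling rate from the text
FRAME_LEN = 256           # 2.56 s at 100 Hz
HOP = FRAME_LEN // 2      # 50% overlap between consecutive frames


def split_frames(signal: Sequence[float]) -> List[Sequence[float]]:
    """Cut one sensor channel into overlapping data frames."""
    return [signal[s:s + FRAME_LEN]
            for s in range(0, len(signal) - FRAME_LEN + 1, HOP)]


def horizontal(ax: float, ay: float) -> float:
    """Modulus of the X-axis and Y-axis measurements."""
    return math.hypot(ax, ay)


def dominant_frequency(frame: Sequence[float]) -> float:
    """Frequency in Hz of the largest DFT amplitude (naive DFT, DC removed)."""
    n = len(frame)
    mean = sum(frame) / n
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2):
        re = sum((v - mean) * math.cos(2 * math.pi * k * i / n)
                 for i, v in enumerate(frame))
        im = sum((v - mean) * math.sin(2 * math.pi * k * i / n)
                 for i, v in enumerate(frame))
        amp = math.hypot(re, im) / n
        if amp > best_amp:
            best_k, best_amp = k, amp
    return best_k * FS_HZ / n


def frame_features(frame: Sequence[float]) -> List[float]:
    n = len(frame)
    mean = sum(frame) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in frame) / n)
    return [mean, std, max(frame) - min(frame), dominant_frequency(frame)]


# Synthetic 10 s vertical acceleration with a 2 Hz step-like oscillation.
vertical_acc = [9.8 + math.sin(2 * math.pi * 2.0 * i / FS_HZ) for i in range(1000)]
frames = split_frames(vertical_acc)
features = [frame_features(f) for f in frames]
```

With 50% overlap a new frame becomes available every 1.28 s, which keeps the detection delay below the 2.56 s frame duration.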
From the simulation experiment, the importance of each feature for the XGBoost classifier was obtained. The features marked in red in Table 4 are the most important ones. It can be seen that the FFT amplitudes of acceleration and angular velocity can differentiate stride span and frequency between walking on the same floor and vertical movement.
In the real-time testing stage, two testers each walked on horizontal floors and performed floor switching activities for 5 min in Building 2. The motion detection accuracy is shown in Figure 6.