Detection of Indoor High-Density Crowds via Wi-Fi Tracking Data

Wang, Peixiao; Gao, Fei; Zhao, Yuhui; Li, Ming; Zhu, Xinyan

doi:10.3390/s20185078


by Peixiao Wang ¹, Fei Gao ¹, Yuhui Zhao ¹, Ming Li ² and Xinyan Zhu ¹,³,⁴,*

¹ State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China

² Institute of Space Science and Technology, Nanchang University, Nanchang 330031, China

³ Collaborative Innovation Center of Geospatial Technology, Wuhan University, Wuhan 430079, China

⁴ Key Laboratory of Aerospace Information Security and Trusted Computing of the Ministry of Education, Wuhan University, Wuhan 430079, China

* Author to whom correspondence should be addressed.

Sensors 2020, 20(18), 5078; https://doi.org/10.3390/s20185078

Submission received: 15 July 2020 / Revised: 13 August 2020 / Accepted: 20 August 2020 / Published: 7 September 2020

(This article belongs to the Section Remote Sensors)


Abstract:

Accurate detection of the locations of indoor high-density crowds is crucial for early warning and emergency rescue during indoor safety accidents. The spatial structure of indoor environments is more complicated than that of outdoor environments, and the locations of indoor high-density crowds are more likely to be the sites of safety accidents. Existing detection methods for high-density crowd locations mostly focus on outdoor environments, and relatively few exist for indoor environments. This study proposes a novel detection framework for high-density indoor crowd locations termed IndoorSRC (Simplification–Reconstruction–Cluster). First, a novel indoor spatiotemporal clustering algorithm called Indoor-STAGNES is proposed to detect stay points in indoor trajectories and thereby simplify the indoor movement trajectory. Then, a Kalman filter algorithm is used to reconstruct the indoor trajectory so that the data are properly aligned and resampled. Finally, an indoor spatiotemporal density clustering algorithm called Indoor-STOPTICS is proposed to detect the locations of high-density crowds in the indoor environment from the reconstructed trajectory. Extensive experiments were conducted using indoor Wi-Fi positioning datasets collected from a shopping mall. The results show that the IndoorSRC framework clearly outperforms the existing baseline method in terms of detection performance.

Keywords:

high-density crowd location detection; indoor trajectory; Indoor-STAGNES; Indoor-STOPTICS

1. Introduction

Indoor environments are the main space for human activities, with research showing that human activities occur indoors approximately 87% of the time [1]. As a result, many indoor spaces host large numbers of people at any point in time. High-density crowds are the primary cause of indoor emergency safety accidents, such as overcrowding and trampling [2,3]. Compared with the outdoor environment, indoor three-dimensional spatial structures are more complicated, and safety accidents are more likely to occur there. Therefore, accurately detecting the locations of these high-density indoor crowds is important for early warning and emergency rescues during instances of indoor safety accidents.

With the rapid development of the Internet, indoor positioning has gradually become an essential need [4,5,6]. In recent years, indoor positioning technology has matured and has been applied in daily life, for example in indoor navigation and indoor location tracking. Concurrently, the available indoor positioning data have grown severalfold, becoming a substantial data source for indoor-related research, such as indoor location prediction [7,8,9], indoor association rule mining [10], and indoor positioning methods [11,12,13]. Existing indoor-related research is mainly focused on indoor location services, and few studies address the detection of high-density indoor crowd locations. At present, studies related to high-density crowd detection are mainly focused on the outdoor environment. Compared with outdoor trajectories, indoor trajectories are of poorer quality and have typical three-dimensional characteristics, which makes it difficult to apply traditional outdoor high-density crowd detection algorithms to indoor spaces.

Therefore, a novel high-density location-detection framework for indoor crowds called IndoorSRC (Simplification–Reconstruction–Cluster) is proposed herein. This study uses clustering algorithms to find high-density locations of indoor crowds, thereby providing a scientific basis for indoor emergencies. Based on the characteristics of indoor trajectories, we have made certain improvements to the existing clustering algorithms. The significant contributions of the study are summarized as follows.

(1): A novel indoor spatiotemporal aggregation hierarchical clustering algorithm called Indoor-STAGNES is proposed for detecting the stay points of indoor trajectory and for simplifying indoor movement trajectory.
(2): The Kalman filter algorithm is proposed to reconstruct the indoor trajectory, thereby achieving the required alignment and resampling.
(3): A new indoor spatiotemporal density clustering algorithm called Indoor-STOPTICS is proposed for detecting the location of high-density crowds in the indoor environment from the reconstructed trajectory.
(4): Subsequently, we describe our evaluation of the performance of the IndoorSRC framework using real indoor trajectories. The results demonstrate the advantages of our approach compared to the baseline.

The rest of this paper is organized as follows. Section 2 reviews the literature on indoor trajectories and on the detection of high-density locations in outdoor environments. Section 3 gives the basic and problem definitions and describes a new methodological framework for detecting high-density locations of indoor crowds. Section 4 compares the proposed framework with a previously proposed method on real indoor Wi-Fi positioning data and presents the results. Section 5 concludes the study and suggests directions for further research.

2. Related Work

In this section, we first review the research related to indoor trajectories and, then, review the detection methods for high-density crowd locations in outdoor environments.

Existing research related to indoor trajectories is mainly focused on indoor positioning technology and indoor location services. Indoor positioning technology is used to improve the accuracy of indoor positioning to obtain more accurate indoor movement trajectories. For example, Ye et al. [14] used a hidden Markov model to improve the accuracy of indoor positioning based on traditional fingerprint positioning. Tomazic et al. [15] proposed a confidence interval fuzzy-logic model to improve the accuracy of indoor pedestrian positioning. Indoor location services primarily improve the indoor user experience from multiple perspectives. For instance, Wang et al. [16,17] proposed the Indoor-WhereNext and Markov-LSTM models from the perspectives of group users and individual users based on indoor trajectories of a mall to predict the next location of indoor users and achieved high prediction performance. Li et al. [18] used uncertain historical indoor mobility data to determine the top-k popular indoor semantic locations with the highest flow values. Mou et al. [10] proposed an R-FP-Growth algorithm based on the traditional FP-Growth algorithm to mine association rules among shops in shopping malls, thereby providing indoor location services. Liu et al. [19] designed a graph structure (IT-Graph) that captures indoor temporal variations to return the valid shortest path. Baba et al. [20] proposed the Indoor RFID Multi-variate Hidden Markov Model (IR-MHMM) to capture the uncertainties in indoor RFID data as well as the correlation between moving object locations and object RFID readings. In addition, several scholars have tried to determine high-density crowd locations from indoor trajectories. For example, Li et al. [21] proposed a data-driven approach that finds the top-k indoor density regions using indoor positioning data; however, there are only a few high-density indoor crowd detection methods, making this an ongoing problem in the field.

Detection methods for high-density crowds in outdoor environments are mainly used to find urban hotspots and alleviate traffic congestion. Identifying urban hotspots can reveal the travel characteristics of urban residents. For example, Zheng et al. [22] proposed a grid-based clustering algorithm based on taxi-trajectory data to find popular travel areas and, thus, analyzed the travel patterns of Chongqing residents. Lu et al. [23] presented a visual analysis system to explore the Origin–Destination patterns of hotspots to reveal the potential functions of urban regions. Zhao et al. [24] proposed a trajectory clustering method based on decision graphing and data fields to determine the dynamic pattern of urban hotspots. Easing traffic congestion mainly provides support for urban traffic planning and management. For instance, Li et al. [25] proposed a density-based clustering algorithm called FlowScan to identify high-density traffic locations at road-level to alleviate traffic congestion. Anbaroglu et al. [26] proposed a Non-Recurrent Congestion (NRC) events detection methodology to support the accurate detection of NRC events on large urban road networks. Cheng et al. [27,28] used a data-driven approach to predict changes in traffic flow to alleviate traffic congestion. However, the abovementioned methods mainly focus on high-density crowd detection in outdoor environments. Due to the low quality and three-dimensional characteristics of indoor trajectories, it is difficult to apply these methods directly to indoor environments.

In this study, we propose a novel high-density indoor crowd location detection framework, termed IndoorSRC. Compared with existing methods, the proposed framework is suitable for indoor spaces. It is a lightweight framework that is not only easy to implement but also combines the advantages of multiple clustering algorithms to improve detection performance.

3. Materials and Methods

First, it is necessary to define the terms utilized herein and identify the problems to be addressed.

Definition 1 (Indoor Trajectory).

An indoor trajectory, $traj = \{pt_{i}\}_{i=1}^{n}$, is an ordered sequence of points $pt_{i} = (id, t_{i}, x_{i}, y_{i}, f_{i})$, where $n$ is the length of the trajectory, $id$ is a unique user identifier, $t_{i}$ is the time at which $pt_{i}$ was collected, and $(x_{i}, y_{i}, f_{i})$ corresponds to the longitude, latitude, and floor, respectively, of the user at time $t_{i}$.

Definition 2 (Simplified Trajectory).

A simplified trajectory, $sim\_traj = \{sim\_pt_{i}\}_{i=1}^{k}$, simplifies the trajectory points caused by the stay event in the indoor trajectory. As shown in Figure 1b, $sim\_pt_{i} = (id, sim\_t_{i}, sim\_x_{i}, sim\_y_{i}, f_{i})$ is obtained by simplifying the points that are continuous in time and close to each other, where $sim\_t_{i}$ is the average time of the simplified points, $(sim\_x_{i}, sim\_y_{i})$ is the center coordinate of the simplified points, and $f_{i}$ is the floor on which the user is located.

Definition 3 (Reconstructed Trajectory).

A reconstructed trajectory, $rec\_traj = \{rec\_pt_{i}\}_{i=1}^{l} = \{(id, t_{i}, \hat{x}_{i}, \hat{y}_{i}, f_{i})\}_{i=1}^{l}$, reconstructs the missing data in the simplified trajectory. As shown in Figure 1c, $t_{i}$ is the recording time of the reconstructed point $rec\_pt_{i}$, $(\hat{x}_{i}, \hat{y}_{i})$ is the coordinate information of the user at time $t_{i}$, and $f_{i}$ is the floor on which the user is located.

Definition 4 (Reconstructed Trajectory Point Set).

The reconstructed trajectories of all users form a set of reconstructed trajectory points $DB = \{rec\_pt_{i}\}_{i=1}^{M}$, where $M$ is the total number of reconstructed trajectory points for all users.

The research object of this study is the trajectories $\{traj_{i}\}_{i=1}^{N}$ of the group users in the indoor environment. From the trajectories of the group users, the high-density locations of the crowds in the indoor environment are found, thereby assisting in early warning and emergency rescue during indoor safety accidents. The problem defined in this study is expressed by Equation (1):

$\{l_{i}\}_{i=1}^{m} = \mathcal{M} \leftarrow \{traj_{i}\}_{i=1}^{N}$

(1)

where $\{traj_{i}\}_{i=1}^{N}$ represents the group user trajectories used for modeling, $N$ represents the total number of users, $\mathcal{M}$ represents the IndoorSRC framework proposed in this study, which is used to detect high-density crowds in the trajectories of group users, and $\{l_{i}\}_{i=1}^{m}$ represents the high-density locations detected by framework $\mathcal{M}$.

The IndoorSRC structure is presented in Figure 2. Based on the bottom-up design principle, our method is divided into three phases: simplification of the indoor movement trajectory; reconstruction of the indoor movement trajectory; and detection of high-density indoor crowd locations. First, a new indoor spatiotemporal agglomeration nesting called Indoor-STAGNES is proposed, which is used to identify the stay point in the indoor trajectory and simplify the indoor movement trajectory. Second, we propose using a Kalman filter algorithm [29,30] to reconstruct the simplified trajectory to align and resample the indoor movement trajectory. Finally, an indoor spatiotemporal density clustering algorithm called Indoor-STOPTICS is proposed to detect the locations of high-density crowds in the indoor environment from the reconstructed trajectory point set.

3.1. Simplification of the Indoor Movement Trajectory

The sampling interval of indoor positioning data is heterogeneous; when a user stays in a specific area for a certain period, the mobile terminal will record more trajectory points in the limited area, thereby forming a cluster of trajectory points. If the original trajectory $traj$ is used to directly identify the high-density crowd locations, the high-density point locations are often obtained rather than the crowd locations. Therefore, we propose a novel Indoor-STAGNES algorithm to simplify the user trajectory and remove the stay point information from the trajectory.

The Indoor-STAGNES algorithm is an improvement over the traditional agglomerative nesting (AGNES) algorithm. Two major improvements have been made: time and floor constraints are added and, consequently, adjacent spatiotemporal trajectory points (clusters) on the same floor are merged by iteration. Finally, the original $traj = \{pt_{i}\}_{i=1}^{n}$ is divided into $k$ disjoint sequential clusters $\{C_{1}, C_{2}, \dots, C_{k}\}$. The point $sim\_pt_{i}$ is obtained by simplifying the points in cluster $C_{i}$, and $k$ simplified trajectory points are obtained from $k$ clusters, i.e., $sim\_traj = \{sim\_pt_{i}\}_{i=1}^{k}$. As shown in Figure 3, the points $pt_{1}, pt_{2}, pt_{3}, pt_{4}$ are iteratively merged into a new cluster $C_{i}$, which is then simplified into a trajectory point $sim\_pt_{i}$. The calculation methods of the spatial distance and time distance between clusters are shown in Equations (2) and (3), respectively:

$spatial\_dist(C_{i}, C_{j}) = \left\| \overline{pt}_{i} - \overline{pt}_{j} \right\|_{2}, \quad \overline{pt}_{i} = \frac{1}{|C_{i}|} \sum_{pt \in C_{i}} pt,$

(2)

$time\_dist(C_{i}, C_{j}) = \left| C_{i}.time_{ave} - C_{j}.time_{ave} \right|, \quad C_{i}.time_{ave} = \frac{1}{|C_{i}|} \sum_{pt \in C_{i}} pt.t,$

(3)

where $spatial\_dist$ is used to calculate the spatial distance between $C_{i}$ and $C_{j}$, $time\_dist$ is used to calculate the time distance between $C_{i}$ and $C_{j}$, $\overline{pt}_{i}$ represents the mean coordinate of cluster $C_{i}$, the number of points in $C_{i}$ is represented by $|C_{i}|$, and $C_{i}.time_{ave}$ represents the average recording time of the trajectory points in cluster $C_{i}$.
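As a small worked example of Equations (2) and (3), the following sketch computes both distances for two toy clusters. The tuple layout `(t, x, y)` and the helper names are assumptions made for illustration; they are not taken from the paper.

```python
from math import hypot

# Each point is (t_seconds, x_m, y_m); all points are assumed to be on one floor.

def centroid(cluster):
    """Mean coordinate of a cluster (second part of Eq. 2)."""
    n = len(cluster)
    return (sum(p[1] for p in cluster) / n, sum(p[2] for p in cluster) / n)

def time_ave(cluster):
    """Average recording time of a cluster (second part of Eq. 3)."""
    return sum(p[0] for p in cluster) / len(cluster)

def spatial_dist(ci, cj):
    """Euclidean distance between cluster centroids (Eq. 2)."""
    (xi, yi), (xj, yj) = centroid(ci), centroid(cj)
    return hypot(xi - xj, yi - yj)

def time_dist(ci, cj):
    """Absolute difference of average recording times (Eq. 3)."""
    return abs(time_ave(ci) - time_ave(cj))

c1 = [(0, 0.0, 0.0), (10, 2.0, 0.0)]   # centroid (1, 0), mean time 5 s
c2 = [(60, 4.0, 4.0), (80, 6.0, 4.0)]  # centroid (5, 4), mean time 70 s
print(spatial_dist(c1, c2), time_dist(c1, c2))
```

Here the centroids are 4 m apart in each axis, so the spatial distance is the square root of 32 (about 5.66 m), and the time distance is 65 s.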

The overall process of Indoor-STAGNES is shown in Algorithm 1.

(1): The time-ordered indoor trajectory $traj = \{pt_{i}\}_{i=1}^{n}$ is input, and each trajectory point is initialized as a cluster.
(2): The spatial distance matrix $SD$ and the time distance matrix $TD$ between the clusters are initialized, where $SD_{ij}$ represents the spatial distance between $C_{i}$ and $C_{j}$, and $TD_{ij}$ represents the time distance between $C_{i}$ and $C_{j}$. If $C_{i}$ and $C_{j}$ are not on the same floor, $SD_{ij}$ and $TD_{ij}$ are set to infinity.
(3): The minimum value $d_{\min}$ in the distance matrix $SD$ under the time threshold $t_{threh}$ is found. If $d_{\min}$ is smaller than the distance threshold $d_{threh}$, the two nearest clusters are merged, the spatial distance matrix $SD$ and the time distance matrix $TD$ are updated, and this step is repeated. Otherwise, step 4 is followed.
(4): The clusters $\{C_{1}, C_{2}, \dots, C_{k}\}$ are simplified to $\{sim\_pt_{1}, sim\_pt_{2}, \dots, sim\_pt_{k}\}$ in chronological order.

Algorithm 1 Indoor Spatiotemporal Agglomerative Nesting

Require: Individual trajectory: $traj = \{pt_{i}\}_{i=1}^{n}$; time threshold: $t_{threh}$; distance threshold: $d_{threh}$

Ensure: Individual simplified trajectory: $\{sim\_pt_{1}, sim\_pt_{2}, \dots, sim\_pt_{k}\}$

1. Initialize clusters $clsArr = \{C_{i}\}_{i=1}^{n}$ based on $traj = \{pt_{i}\}_{i=1}^{n}$

2. Construct the spatial distance matrix $SD$ and the time distance matrix $TD$

3. Search $d_{min}$ under the time threshold $t_{threh}$ in matrix $SD$

4. while $d_{min} \leq d_{threh}$ do

5.     Search two clusters $cls_{1}$, $cls_{2}$ that need to be merged based on $d_{min}$

6.     Merge cluster $cls_{1}$ and cluster $cls_{2}$, and update $clsArr$

7.     Update matrices $SD$ and $TD$ based on $clsArr$

8.     Search $d_{min}$ under the time threshold $t_{threh}$ in matrix $SD$

9. for each $cls \in clsArr$ do

10.     Simplify cluster $cls$ into simplified trajectory point $sim\_pt_{i}$

11. return $\{sim\_pt_{1}, sim\_pt_{2}, \dots, sim\_pt_{k}\}$
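The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `Point` class, the function names, and the choice to recompute distances on every pass (instead of maintaining the matrices $SD$ and $TD$) are assumptions, and "under the time threshold" is interpreted here as a time distance not exceeding `t_thresh`. The sketch also does not enforce that merged clusters are strictly consecutive in time.

```python
from dataclasses import dataclass
from math import hypot, inf

@dataclass
class Point:
    uid: str
    t: float    # recording time in seconds
    x: float
    y: float
    floor: int

def _mean(values):
    values = list(values)
    return sum(values) / len(values)

def _spatial_dist(ci, cj):
    return hypot(_mean(p.x for p in ci) - _mean(p.x for p in cj),
                 _mean(p.y for p in ci) - _mean(p.y for p in cj))

def _time_dist(ci, cj):
    return abs(_mean(p.t for p in ci) - _mean(p.t for p in cj))

def indoor_stagnes(traj, t_thresh, d_thresh):
    """Naive sketch of Algorithm 1 for one user's trajectory."""
    # Line 1: every trajectory point starts as its own cluster.
    clusters = [[p] for p in sorted(traj, key=lambda p: p.t)]
    while len(clusters) > 1:
        # Lines 3 and 8: find d_min among same-floor pairs within t_thresh.
        best, pair = inf, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci, cj = clusters[i], clusters[j]
                if ci[0].floor != cj[0].floor:
                    continue  # different floors: distance treated as infinite
                if _time_dist(ci, cj) > t_thresh:
                    continue
                d = _spatial_dist(ci, cj)
                if d < best:
                    best, pair = d, (i, j)
        # Line 4: stop once the closest admissible pair is too far apart.
        if pair is None or best > d_thresh:
            break
        # Lines 5-7: merge the two nearest clusters.
        i, j = pair
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    # Lines 9-11: one simplified point per cluster, in chronological order.
    sims = [Point(c[0].uid, _mean(p.t for p in c), _mean(p.x for p in c),
                  _mean(p.y for p in c), c[0].floor) for c in clusters]
    return sorted(sims, key=lambda p: p.t)

traj = [Point("u1", 0.0, 0.0, 0.0, 1), Point("u1", 10.0, 1.0, 0.0, 1),
        Point("u1", 20.0, 0.0, 1.0, 1), Point("u1", 300.0, 50.0, 50.0, 1)]
sim = indoor_stagnes(traj, t_thresh=240.0, d_thresh=5.0)
```

With the 4 min and 5 m thresholds used later in the paper, the first three points collapse into a single stay point and the distant fourth point is kept as is.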

3.2. Reconstruction of the Indoor Movement Trajectory

The simplified trajectory $sim\_traj$ generally reflects the skeleton of the user’s movement; however, it is not directly suitable for detecting high-density indoor crowd locations because it contains many missing trajectory points. If the simplified trajectory $sim\_traj$ is used directly to detect these locations, the detection performance will suffer. Therefore, indoor trajectory reconstruction is one of the key steps involved in this process. To fill in the missing trajectory points in the simplified trajectory, we propose using a Kalman filter algorithm to reconstruct it.

Kalman filtering is a linear optimal estimation algorithm that comprehensively considers measurement data and physical motion models and iteratively estimates the optimal location of a user at each moment, that is, the reconstructed trajectory point $rec\_pt$. The Kalman filtering algorithm reconstructs the indoor movement trajectory in two main stages:

(1): Identification of the missing trajectory points: The number of missing trajectory points in the simplified trajectory is determined according to the sampling interval of the simplified trajectory. As shown in Figure 4, a trajectory interval whose sampling interval exceeds twice the average sampling interval of the simplified trajectory is regarded as a missing trajectory interval. When the sampling interval of the missing trajectory interval is less than the 95th percentile of the sampling interval, the missing trajectory points need to be reconstructed. The number and time information of the missing trajectory points are calculated using Equations (4) and (5):

$count = \mathrm{floor}\left(\frac{missing\_interval}{ave\_interval}\right) + 1,$

(4)

$rec\_pt_{i+j}.t = rec\_pt_{i}.t + j \times ave\_interval, \quad 1 \leq j \leq count,$

(5)

where $missing\_interval$ represents the sampling interval of the missing trajectory interval, $ave\_interval$ represents the average sampling interval of the simplified trajectory points, $\mathrm{floor}(x)$ represents the downward rounding function, $count$ represents the number of missing trajectory points, and $rec\_pt_{i+j}.t$ represents the time information of the $j$-th missing trajectory point in the missing trajectory interval (for example, $rec\_pt_{i+1}.t$ represents the time information of the first missing trajectory point in the missing trajectory interval).
(2): Reconstruction of the missing trajectory points: The Kalman filtering algorithm iteratively solves the location of the reconstructed trajectory point at each moment, which is mainly divided into two stages: the location prediction and location update stages. In the location prediction stage, the physical motion model is used to predict the location of the next moment according to the optimal location of the previous moment. In the location update stage, the optimal location of the current moment is obtained by correcting the predicted location of the current moment using measurement data and error of the current moment. The iterative process is shown in Equations (6) and (7):

$rec\_pt_{i}.\hat{x} = kalManFilter(rec\_pt_{i-1}.\hat{x}, sim\_pt_{i}.x),$

(6)

$rec\_pt_{i}.\hat{y} = kalManFilter(rec\_pt_{i-1}.\hat{y}, sim\_pt_{i}.y),$

(7)

where $kalManFilter$ represents the Kalman filter algorithm, $(rec\_pt_{i}.\hat{x}, rec\_pt_{i}.\hat{y})$ represents the coordinates of the reconstructed trajectory point at the current moment, $(rec\_pt_{i-1}.\hat{x}, rec\_pt_{i-1}.\hat{y})$ represents the coordinates of the reconstructed trajectory point at the previous moment, and $(sim\_pt_{i}.x, sim\_pt_{i}.y)$ represents the coordinates of the simplified trajectory point at the current moment.
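The two stages above can be sketched as follows. `missing_times` applies Equations (4) and (5) as written. `kalman_1d` filters one coordinate at a time, as in Equations (6) and (7), and runs only the prediction stage where a measurement is missing. The paper does not specify the motion model or the noise settings, so the constant-velocity model and the parameters `dt`, `q`, `r`, and `v_var0` are assumptions chosen purely for illustration.

```python
from math import floor

def missing_times(t_start, missing_interval, ave_interval):
    """Timestamps of the points to reconstruct, following Eqs. (4) and (5)."""
    count = floor(missing_interval / ave_interval) + 1
    return [t_start + j * ave_interval for j in range(1, count + 1)]

def kalman_1d(zs, dt=1.0, q=0.01, r=1.0, v_var0=100.0):
    """Filter one coordinate; None in zs marks a missing trajectory point.

    Assumed (not from the paper): constant-velocity state (pos, vel),
    process noise q on both components, measurement noise r, and an
    initial velocity variance v_var0.
    """
    if not zs or zs[0] is None:
        raise ValueError("the first measurement must be present")
    pos, vel = float(zs[0]), 0.0
    p00, p01, p10, p11 = r, 0.0, 0.0, v_var0  # 2x2 state covariance
    out = [pos]
    for z in zs[1:]:
        # Prediction stage: propagate state and covariance (F P F^T + Q).
        pos = pos + vel * dt
        n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
        n01 = p01 + dt * p11
        n10 = p10 + dt * p11
        n11 = p11 + q
        p00, p01, p10, p11 = n00, n01, n10, n11
        if z is not None:
            # Update stage: correct the prediction with the measurement.
            s = p00 + r
            k0, k1 = p00 / s, p10 / s
            resid = z - pos
            pos += k0 * resid
            vel += k1 * resid
            p00, p01, p10, p11 = ((1 - k0) * p00, (1 - k0) * p01,
                                  p10 - k1 * p00, p11 - k1 * p01)
        out.append(pos)
    return out

times = missing_times(100, 25, 10)
xs = kalman_1d([0.0, 1.0, 2.0, None, None, 5.0])
```

In this toy run the user moves about one unit per step, so the two missing slots are filled with values close to 3 and 4. A full reconstruction would call `kalman_1d` once for the x coordinates and once for the y coordinates.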

3.3. Detection of High-Density Indoor Crowd Locations

The reconstructed trajectory point $rec\_pt$ can accurately reflect the movement of a user’s location. By combining the reconstructed trajectory points of all users, we can analyze the changes in the locations of indoor group users, thereby detecting the locations of high-density indoor crowds. We regard the high-density clusters in the reconstructed trajectory point set $DB$ (Definition 4) as these locations. Therefore, we propose a novel indoor spatiotemporal clustering algorithm based on ordering points to identify the clustering structure (Indoor-STOPTICS).

Definition 5 (Indoor Spatiotemporal Neighborhood).

For $rec\_pt_{i} \in DB$, the indoor spatiotemporal neighborhood is defined as a cylinder with $\epsilon_{1}$ as its radius and $\epsilon_{2}$ as its time window; $rec\_pt_{i}$ is the center of the cylinder, $N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})$ represents the subset of points contained inside the cylinder, and the points in $N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})$ are on the same floor as $rec\_pt_{i}$, as defined in Equation (8):

$N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i}) = \left\{ rec\_pt_{j} \in DB \ \ \text{s.t.} \ \ \begin{matrix} sd(rec\_pt_{j}, rec\_pt_{i}) \leq \epsilon_{1} \\ td(rec\_pt_{j}, rec\_pt_{i}) \leq \epsilon_{2} \\ rec\_pt_{j}.f_{j} = rec\_pt_{i}.f_{i} \end{matrix} \right\},$

(8)

where $sd$ is used to calculate the spatial distance between $rec\_pt_{i}$ and $rec\_pt_{j}$, $td$ is used to calculate the time distance between $rec\_pt_{i}$ and $rec\_pt_{j}$, and the number of points in $N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})$ is represented by $|N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})|$.

Definition 6 (Indoor Core Trajectory Point).

For $rec\_pt_{i} \in DB$, if its indoor spatiotemporal neighborhood $N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})$ contains at least $MinPts$ indoor trajectory points, that is, $|N_{\epsilon_{1}, \epsilon_{2}}(rec\_pt_{i})| \geq MinPts$, then $rec\_pt_{i}$ is called an indoor core trajectory point.
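Definitions 5 and 6 translate into a short brute-force check. The sketch below assumes each reconstructed point is stored as a tuple `(t, x, y, floor)` and follows the "at least $MinPts$" wording of Definition 6; the function names are hypothetical, and a practical implementation would likely use a spatial index rather than a linear scan.

```python
from math import hypot

def neighborhood(db, i, eps1, eps2):
    """Indices of points inside the cylinder around db[i] (Eq. 8).

    A point qualifies if it is within eps1 in space, within eps2 in time,
    and on the same floor. The center point itself is included.
    """
    t_i, x_i, y_i, f_i = db[i]
    return [j for j, (t, x, y, f) in enumerate(db)
            if f == f_i
            and abs(t - t_i) <= eps2
            and hypot(x - x_i, y - y_i) <= eps1]

def is_core(db, i, eps1, eps2, min_pts):
    """True if db[i] has at least min_pts points in its neighborhood."""
    return len(neighborhood(db, i, eps1, eps2)) >= min_pts

db = [
    (0, 0.0, 0.0, 1),
    (30, 1.0, 1.0, 1),
    (60, 2.0, 0.0, 1),
    (30, 1.0, 1.0, 2),    # same place and time, but on another floor
    (1000, 0.0, 0.0, 1),  # same place and floor, but far away in time
]
print(neighborhood(db, 0, eps1=3.0, eps2=120))
```

The floor test is what distinguishes this neighborhood from the plain spatiotemporal one: the fourth point is excluded even though its planar and temporal distances are small.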

Indoor-STOPTICS is an improved version of the spatiotemporal ordering points to identify the clustering structure (ST-OPTICS) algorithm [31,32]. Indoor-STOPTICS first considers the three-dimensional characteristics of indoor trajectories and adds the floor constraint to ST-OPTICS (Definition 5). Then, it uses core points (Definition 6) as drivers to determine the maximal sets of density-connected trajectory points on the same floor under spatiotemporal constraints. Unlike traditional density-based spatiotemporal clustering algorithms, the Indoor-STOPTICS algorithm does not explicitly generate clusters, but generates a reachable distance for each data point and an ordered list for analysis to assist in detecting high-density crowd locations in $DB$. The calculation method for the reachable distance of each point and the ordered list is the same as in ST-OPTICS [31,32]. As shown in Figure 5, the reconstructed trajectory point set $DB = \{rec\_pt_{i}\}_{i=1}^{M}$ generates an ordered point list $orderList = \{rec\_pt_{j}\}_{j=1}^{M}$ using the Indoor-STOPTICS algorithm. Taking the $orderList$ index $j$ as the horizontal axis and the reachable distance of $rec\_pt_{j}$ as the vertical axis, a decision graph of $DB$ can be obtained. The auxiliary information of the decision graph can be summarized as follows.

(1): When the spatial radius of the Indoor-STOPTICS algorithm is $\epsilon_{1} = r_{1}$, two clusters, $ClusterA$ and $ClusterB$, can be detected from the set $DB$.
(2): When the spatial radius of the Indoor-STOPTICS algorithm is $\epsilon_{1} = r_{2}$, $ClusterA$ is split into three small clusters, namely $ClusterA1$, $ClusterA2$, and $ClusterA3$. Thus, a total of four clusters can be identified from the set $DB$.
(3): When the spatial radius of the Indoor-STOPTICS algorithm is $\epsilon_{1} = r_{2}$, the trajectory points included in each cluster can be obtained by the corresponding horizontal axis index sequence. For example, the horizontal axis index sequence corresponding to $ClusterA1$ is $idxArr$ and the trajectory points included in $ClusterA1$ can be expressed as $\{orderList[i]\}_{i \in idxArr}$.
(4): When the spatial radius of the Indoor-STOPTICS algorithm is $\epsilon_{1} = r_{2}$, the cluster density can be approximated by the width of the cluster. For example, the density of $ClusterB$ can be represented by $w$. When $w$ is wider, the density of the cluster is greater.
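Reading clusters off the decision graph amounts to cutting the reachable-distance sequence at a chosen radius. The sketch below is a simplified thresholding under stated assumptions, not the full ST-OPTICS extraction: it ignores core distances, treats every value above the cut `r` as a peak that closes the current valley, and drops runs shorter than a hypothetical `min_size`. The reachability values are made up for illustration.

```python
from math import inf

def extract_clusters(reach, r, min_size=2):
    """Split orderList indices into clusters by cutting the plot at height r.

    reach[j] is the reachable distance of the j-th point in orderList
    (inf where it is undefined). Returns one index list (idxArr) per cluster.
    """
    clusters, current = [], []
    for idx, d in enumerate(reach):
        if d > r:
            # A peak closes the current valley; the peak point itself may be
            # the first point of the next valley.
            if len(current) >= min_size:
                clusters.append(current)
            current = [idx]
        else:
            current.append(idx)
    if len(current) >= min_size:
        clusters.append(current)
    return clusters

reach = [inf, 1, 1, 2, 9, 1, 1.5, 1, 9, 9, 3, 3]
coarse = extract_clusters(reach, r=4)
fine = extract_clusters(reach, r=1.7)
widths = [len(c) for c in coarse]  # width of each cluster on the index axis
```

Lowering the cut from 4 to 1.7 changes which index runs survive as clusters, which mirrors how choosing a different $\epsilon_{1}$ on the decision graph yields a different set of clusters. Each returned index list plays the role of $idxArr$ in item (3).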

4. Results

4.1. Data Preparation

4.1.1. Data Sources

The experimental data mainly included Wi-Fi positioning data from a shopping mall in Jinan City, China. The indoor Wi-Fi positioning data covered eight floors of the shopping mall from 23 December 2017 to 30 December 2017. Approximately 2 million indoor movement trajectories and 30 million indoor trajectory points were collected every day. The positioning accuracy was approximately 3 m, and trajectory points with a sampling interval of 1–5 s accounted for more than 70% of the collected data points. Table 1 lists the unique identifier of the user, record upload time, the user’s (X, Y) coordinates, and the unique floor identifier.

4.1.2. Data Preprocessing

The original Wi-Fi data were collected via fingerprint positioning technology. First, multiple Wi-Fi access points (APs) were deployed in the study area, and the coordinate information of each AP was calculated iteratively. After the coordinates of each AP were determined, the research area was divided into multiple non-overlapping grids, and fingerprint information from each grid was obtained to construct a fingerprint database. When a mobile terminal enters the coverage area of the APs, it matches the received AP signal strengths against the fingerprint database to determine its specific location. Because the signal of the mobile terminal is unstable and users may switch off Wi-Fi, abnormal, erroneous, and invalid data were easily generated. There were three types of noise in our dataset:

(1): The coordinate abnormal point. If the trajectory point fell outside the study area, it was regarded as a coordinate abnormal trajectory point.
(2): The trajectory point generated by fixed devices. If a user trajectory remained in the same area for more than eight hours, it was regarded as a trajectory point generated by fixed devices.
(3): The floor abnormal point. If a trajectory point of the user jumped between different floors within a short period, it was regarded as a floor abnormal point.
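The three noise rules can be sketched as a per-user filter. The paper does not give the size of the "same area" or the length of the "short period", so `area_size` and `jump_window_s` below are hypothetical parameters, as are the tuple layout `(t, x, y, floor)` and the rectangular study-area bounds.

```python
def clean_trajectory(traj, bounds, area_size=10.0,
                     max_stay_s=8 * 3600, jump_window_s=10.0):
    """Apply the three noise rules to one user's time-sorted trajectory.

    traj: list of (t_seconds, x, y, floor); bounds: (xmin, ymin, xmax, ymax).
    Returns the cleaned list, or [] if the user looks like a fixed device.
    """
    xmin, ymin, xmax, ymax = bounds
    # (1) Coordinate abnormal points: drop points outside the study area.
    pts = [p for p in traj if xmin <= p[1] <= xmax and ymin <= p[2] <= ymax]
    if not pts:
        return []
    # (2) Fixed devices: more than eight hours inside one small area.
    xs = [p[1] for p in pts]
    ys = [p[2] for p in pts]
    duration = pts[-1][0] - pts[0][0]
    if (duration > max_stay_s
            and max(xs) - min(xs) <= area_size
            and max(ys) - min(ys) <= area_size):
        return []
    # (3) Floor abnormal points: a brief jump to another floor and back.
    cleaned = []
    for k, p in enumerate(pts):
        if 0 < k < len(pts) - 1:
            prev, nxt = pts[k - 1], pts[k + 1]
            if prev[3] == nxt[3] != p[3] and nxt[0] - prev[0] <= jump_window_s:
                continue
        cleaned.append(p)
    return cleaned

walker = [(0, 1.0, 1.0, 1), (2, 1.5, 1.0, 3), (4, 2.0, 1.0, 1), (6, 500.0, 1.0, 1)]
router = [(0, 5.0, 5.0, 1), (30000, 5.2, 5.0, 1)]
print(clean_trajectory(walker, (0, 0, 100, 100)))
```

For `walker`, the out-of-area point and the two-second jump to floor 3 are removed; `router` stays within 0.2 m for more than eight hours and is discarded entirely.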

4.2. Evaluation Metrics

In this study, we regarded the clusters in the reconstructed trajectory point set $DB$ as the high-density indoor crowd locations and used crowd density ($CD$), point density ($PD$), and running time as the quantitative evaluation indexes of the IndoorSRC framework. The $CD$ and $PD$ can be defined by Equations (9) and (10), respectively:

$CD = \sum_{i=1}^{m} \frac{CrowdNum_{i}}{V_{i}} \times \frac{\Delta t}{m},$

(9)

$PD = \sum_{i=1}^{m} \frac{PointNum_{i}}{V_{i}} \times \frac{\Delta t}{m},$

(10)

where $m$ represents the number of clusters detected by the IndoorSRC framework, $V_{i}$ represents the volume of cluster $i$ (i.e., the convex hull volume of the three-dimensional point set), $CrowdNum_{i}$ represents the number of users in cluster $i$, $PointNum_{i}$ represents the number of trajectory points in cluster $i$, and $\Delta t$ represents the time step.
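A minimal sketch of Equations (9) and (10) follows. It assumes the convex hull volumes $V_{i}$ have already been computed and are passed in, and that each clustered point carries its user identifier as the first tuple element; the function name and data layout are illustrative only.

```python
def densities(clusters, volumes, dt):
    """Return (CD, PD) for the detected clusters, per Eqs. (9) and (10).

    clusters: list of clusters, each a list of (uid, t, x, y) points.
    volumes:  precomputed convex hull volume V_i of each cluster.
    dt:       time step (Delta t).
    """
    m = len(clusters)
    # CrowdNum_i counts distinct users; PointNum_i counts all points.
    cd = sum(len({p[0] for p in c}) / v for c, v in zip(clusters, volumes)) * dt / m
    pd = sum(len(c) / v for c, v in zip(clusters, volumes)) * dt / m
    return cd, pd

cluster_a = [("u1", 0, 0.0, 0.0), ("u1", 60, 1.0, 0.0),
             ("u2", 0, 0.0, 1.0), ("u2", 60, 1.0, 1.0)]  # 4 points, 2 users
cluster_b = [("u3", 0, 9.0, 9.0), ("u4", 0, 9.5, 9.0),
             ("u5", 60, 9.0, 9.5)]                        # 3 points, 3 users
cd, pd = densities([cluster_a, cluster_b], volumes=[8.0, 6.0], dt=2.0)
```

For these toy values, $CD = (2/8 + 3/6) \times 2/2 = 0.75$ and $PD = (4/8 + 3/6) \times 2/2 = 1.0$. The gap between the two numbers shows why a cluster dominated by one user's repeated points scores high on $PD$ but low on $CD$.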

4.3. Variable Estimation

The hyperparameters of the IndoorSRC framework primarily included parameters in the Indoor-STAGNES and Indoor-STOPTICS algorithms. The Indoor-STAGNES algorithm has two auxiliary functions. First, it simplifies the user trajectory and reduces the number of trajectory points, thereby reducing the running time of the Indoor-STOPTICS algorithm. Second, it ensures that a single user contains only one trajectory point in a particular spatiotemporal neighborhood. Thus, the detection performance of the IndoorSRC framework predominantly depends on the Indoor-STOPTICS algorithm. Hence, we set the distance threshold $d_{threh}$ and the time threshold $t_{threh}$ to fixed values in the Indoor-STAGNES algorithm, wherein the distance threshold $d_{threh}$ was fixed to 5 m with reference to the average distance between indoor shops and the time threshold $t_{threh}$ was fixed to 4 min.

The hyperparameters of Indoor-STOPTICS mainly include the radius $\epsilon_{1}$, the time window $\epsilon_{2}$, and the minimum number of points $MinPts$. In the Indoor-STOPTICS algorithm, the time window $\epsilon_{2}$ is the main parameter that influences the detection performance. To determine the parameters of Indoor-STOPTICS, the control variable method was used to find the combination of parameter values that yields the best detection performance. In the parameter estimation phase, the radius $\epsilon_{1}$ was set to infinity for generating the decision graph, the time window $\epsilon_{2}$ was chosen as the best-performing value among $\{1, 2, 3, \dots, 10\}$ min, and $MinPts$ was set to $5 \times \ln(M)$ [33], where $M$ represents the number of points in the set $DB$. Figure 6 shows the effect of the time window $\epsilon_{2}$ on detection performance. As $\epsilon_{2}$ increased, the crowd density first increased and then decreased, whereas the point density first increased and then stabilized. This is because, when the time window $\epsilon_{2}$ is greater than the time threshold $t_{threh}$ of the Indoor-STAGNES algorithm, the neighborhood (Definition 5) in the Indoor-STOPTICS algorithm is more likely to include multiple points of the same user, which decreases the crowd density. This also confirms, to a certain extent, the two auxiliary functions of the Indoor-STAGNES algorithm. In this study, we eventually fixed the time window $\epsilon_{2}$ to 5 min.
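The parameter choices above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the spatiotemporal neighborhood is assumed to contain all points within $\epsilon_{1}$ metres and $\epsilon_{2}$ seconds of a center point, rounding $5 \times \ln(M)$ to an integer is an assumption, and the function names and sample points are hypothetical.

```python
from math import hypot, inf, log


def min_pts(m):
    """MinPts = 5 * ln(M); rounding to the nearest integer is an assumption."""
    return max(1, round(5 * log(m)))


def neighborhood(points, center, eps1, eps2):
    """Points within eps1 metres and eps2 seconds of center.

    points and center are (user_id, x, y, t) tuples with t in seconds.
    """
    _, cx, cy, ct = center
    return [
        p for p in points
        if hypot(p[1] - cx, p[2] - cy) <= eps1 and abs(p[3] - ct) <= eps2
    ]


def max_points_per_user(neigh):
    """Largest number of points that any single user has in a neighborhood."""
    counts = {}
    for user_id, _x, _y, _t in neigh:
        counts[user_id] = counts.get(user_id, 0) + 1
    return max(counts.values())


# Candidate time windows of 1 to 10 min, expressed in seconds.
candidate_eps2 = [60 * k for k in range(1, 11)]

# Hypothetical simplified points: user "a" stays in one place and, with
# t_threh = 4 min, is represented by two points that are 4.5 min apart.
points = [("a", 0.0, 0.0, 0.0), ("b", 1.0, 0.0, 30.0), ("a", 0.0, 0.0, 270.0)]
center = points[0]

narrow = neighborhood(points, center, eps1=inf, eps2=120.0)  # 2 min window
wide = neighborhood(points, center, eps1=inf, eps2=300.0)    # 5 min window
```

In this toy case the 2 min window holds at most one point per user, whereas the 5 min window, which exceeds the 4 min simplification threshold, picks up both points of user "a". This is the effect the paragraph describes.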

4.4. IndoorSRC Framework Performance

Figure 7 presents the IndoorSRC detection results on specific floors, obtained with the optimal combination of parameters and using 11:00–16:00 as the study period. The reconstructed trajectory point set $DB$ shows an obvious aggregation pattern in different regions and at different times. After drawing the decision graph of the set $DB$, the spatial coordinates and time of each high-density crowd location can be detected. For example, when the radius $\epsilon_{1}$ = 6 m, eight clusters are found in the set $DB$; after further determining the width of each cluster, five high-density crowd locations are obtained. Table 2 shows the spatial and temporal information of each high-density crowd location. These locations are mostly crowded around noon and are primarily located in the dining area. For example, “Food Shangjia” is part of the food court inside the shopping mall, and three short-period high-density crowd locations formed there between late morning and early afternoon. “Fisherman’s lamp” and “Chinese Restaurant” are restaurants inside the shopping mall, where long-period high-density crowd locations formed.
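Reading clusters off the decision graph at a chosen $\epsilon_{1}$ can be sketched as below, assuming the decision graph plays the role of an OPTICS reachability plot. The sketch uses the DBSCAN-style cut described by Ankerst et al. for the original OPTICS algorithm, not necessarily the authors' exact procedure. The inputs `reachability` and `core_distance` are hypothetical lists given in cluster order, with `inf` standing for an undefined value. The subsequent width-based filtering step is not covered.

```python
from math import inf


def extract_clusters(reachability, core_distance, eps):
    """DBSCAN-style cut of an OPTICS cluster ordering at threshold eps.

    Both inputs are lists in cluster order. Returns a list of clusters,
    each a list of positions in that ordering; noise points are omitted.
    """
    clusters = []
    current = None  # most recently opened cluster, if any
    for idx, (reach, core) in enumerate(zip(reachability, core_distance)):
        if reach > eps:
            if core <= eps:
                # Not reachable from earlier points, but a core point itself:
                # it opens a new cluster.
                current = [idx]
                clusters.append(current)
            # Otherwise the point is noise at this threshold.
        elif current is not None:
            current.append(idx)
    return clusters


# Hypothetical values in metres, cut at eps = 6 m.
reachability = [inf, 2.0, 3.0, 9.0, 2.0, 1.0, inf, 8.0]
core_distance = [2.0, 2.0, 3.0, 2.0, 1.0, 1.0, 9.0, 9.0]
clusters = extract_clusters(reachability, core_distance, eps=6.0)
```

Each valley of the graph below the threshold becomes one cluster. Lowering the threshold generally splits or removes clusters, which is why the number of detected clusters depends on the chosen $\epsilon_{1}$.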

4.5. Comparison with Baselines

To verify the performance of the proposed IndoorSRC framework, it was compared with the existing ST-OPTICS algorithm, and the experimental results were analyzed from the perspectives of $CD$, $PD$, and running time.

Figure 8 compares the point and crowd densities with those of the baseline. The crowd density detected by the IndoorSRC framework is much higher than that obtained by ST-OPTICS. This is because indoor movement trajectories contain a large amount of stay-point information. If ST-OPTICS is used to identify high-density crowd locations directly, the clusters obtained are mostly “high-density point locations” rather than “high-density crowd locations.” The point density detected by the IndoorSRC framework is slightly lower than that of ST-OPTICS because the framework simplifies and reconstructs the indoor movement trajectory, so that adjacent points in the reconstructed trajectory are farther apart. This makes the trajectory points sparser and slightly decreases the point density. Therefore, the IndoorSRC framework is more suitable for detecting high-density indoor crowd locations.
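The distinction between a “high-density point location” and a “high-density crowd location” can be made concrete with a small sketch. It only counts points and distinct users in a cluster and does not reproduce the $CD$ and $PD$ metrics defined in Section 4.2. The function name and the sample data are hypothetical.

```python
def summarize(cluster):
    """Return (number of points, number of distinct users) for a cluster.

    cluster is a list of (user_id, x, y, t) tuples.
    """
    users = {user_id for user_id, _x, _y, _t in cluster}
    return len(cluster), len(users)


# One lingering user reported every 3 s, plus two passers-by.
raw_cluster = [("u1", 10.0 + 0.1 * k, 5.0, 3.0 * k) for k in range(10)]
raw_cluster += [("u2", 10.5, 5.2, 12.0), ("u3", 9.8, 4.9, 20.0)]

# The same situation after the lingering user is reduced to one stay point.
simplified_cluster = [
    ("u1", 10.45, 5.0, 13.5),
    ("u2", 10.5, 5.2, 12.0),
    ("u3", 9.8, 4.9, 20.0),
]

raw_summary = summarize(raw_cluster)
simplified_summary = summarize(simplified_cluster)
```

The raw cluster holds twelve points but only three users, so a point-based density is dominated by a single lingering device. After simplification, the point count matches the user count.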

Figure 9 compares the running time with that of the baseline approach. We compared the running time of the entire framework with the baseline, rather than that of a single component such as Indoor-STOPTICS or Indoor-STAGNES. When the number of users is small, the running time of the IndoorSRC framework is slightly higher than that of the ST-OPTICS algorithm because simplifying and reconstructing the indoor movement trajectory adds overhead. As the number of users increases, the running time of the ST-OPTICS algorithm exceeds that of the IndoorSRC framework. This is because, when the number of trajectory points is very large, simplification and reconstruction greatly reduce the number of trajectory points, and the time they consume is far less than that consumed by direct detection on the original points.

5. Conclusions and Future Work

Accurate and robust detection of high-density indoor crowd locations is of great significance for early warning and emergency rescue during indoor safety accidents. Compared with the outdoor environment, the spatial structure of the indoor environment is more complicated, and tightly packed indoor crowds are more likely to cause security accidents. In this paper, the IndoorSRC framework is proposed to detect high-density indoor crowd locations. First, Indoor-STAGNES is proposed to detect the stay points of the indoor trajectory and simplify it. Then, the use of a Kalman filter algorithm to reconstruct the indoor trajectory is proposed. Finally, Indoor-STOPTICS is proposed to detect the location of high-density crowds in the indoor environment.

Experimentally, two weeks of real indoor trajectory data were used to verify the detection performance of the proposed framework. First, we used the control variable method to obtain the optimal parameter combination of the IndoorSRC framework. Next, we analyzed the detection performance of the IndoorSRC framework on the dataset. Finally, we compared it with the existing ST-OPTICS algorithm. Compared with this baseline, the IndoorSRC framework considerably improved the crowd density and, for large numbers of users, the running time, which demonstrates the efficiency of the IndoorSRC framework.

The following problem needs to be investigated in the future. This study treated high crowd density as the only necessary condition for indoor congestion, trampling, and other safety accidents; however, the direction of user movement can also strongly affect the occurrence of such accidents. For example, when the crowd density is high and the users walk in the same direction, accidents often do not occur. When the crowd density is high and the users’ walking directions are disordered, an accident is more likely. Therefore, future studies should introduce additional constraints, such as direction, to further improve the practicality of the IndoorSRC framework.

Author Contributions

P.W. contributed to data preprocessing, experiment, and the writing of the manuscript; X.Z. formulated the general research idea and contributed to writing the manuscript; F.G. advised on the experimental discussion and materials; Y.Z. and M.L. contributed to the manuscript revisions. All authors have read and agreed to the published version of the manuscript.

Funding

This project was supported by the National Key Research and Development Program of China (Grant Nos. 2016YFB0502200, 2016YFB0502204) and the National Natural Science Foundation of China (Grant Nos. 41830645, 41701459).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Liu, Y.; Cheng, D.; Pei, T.; Shu, H.; Ge, X.; Ma, T.; Du, Y.; Ou, Y.; Wang, M.; Xu, L. Inferring gender and age of customers in shopping malls via indoor positioning data. Environ. Plan. B: Urban. Anal. City Sci. 2019.
2. Ahmed, T.; Pedersen, T.B.; Hua, L. Finding Dense Locations in Indoor Tracking Data. In Proceedings of the IEEE 15th International Conference on Mobile Data Management, Brisbane, Australia, 14–18 July 2014; IEEE: Piscataway, NJ, USA, 2014.
3. Ahmed, T.; Pedersen, T.B.; Hua, L. Finding dense locations in symbolic indoor tracking data: Modeling, indexing, and processing. Geoinformatica 2017, 21, 119–150.
4. Huang, C.; Jin, P.; Wang, H.; Na, W.; Wan, S.; Yue, L. Indoorstg: A flexible Tool to Generate Trajectory Data for Indoor Moving Objects. In Proceedings of the IEEE 13th International Conference on Mobile Data Management, Milan, Italy, 3–6 July 2013; IEEE: Piscataway, NJ, USA, 2013.
5. Guan, W.; Wu, Y.; Wen, S.; Hao, C.; Chen, Y.; Chen, Y.; Zhang, Z. A novel three-dimensional indoor positioning algorithm design based on visible light communication. Opt. Commun. 2017, 392, 282–293.
6. Guo, S.; Xiong, H.; Zheng, X.; Zhou, Y. Activity recognition and semantic description for indoor mobile localization. Sensors 2017, 17, 649.
7. Koehler, C.; Banovic, N.; Oakley, I.; Mankoff, J.; Dey, A.K. Indoor-Alps: An Adaptive Indoor Location Prediction System. In Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, Seattle, WA, USA, 13–17 September 2014; Association for Computing Machinery: New York, NY, USA, 2014; pp. 71–181.
8. Ang, B.-K.; Dahlmeier, D.; Lin, Z.; Huang, J.; Seeto, M.-L.; Shi, H. Indoor Next Location Prediction with Wi-Fi. In Proceedings of the Fourth International Conference on Digital Information Processing and Communications (ICDIPC 2014), Kuala Lumpur, Malaysia, 18–20 March 2014; pp. 107–113.
9. Petzold, J.; Pietzowski, A.; Faruk, B.; Trumler, W.; Ungerer, T. Prediction of Indoor Movements Using Bayesian Networks; Springer: Berlin/Heidelberg, Germany, 2005.
10. Naixia, M.; Hongen, W.; Hengcai, Z.; Xin, F. Association rule mining method based on the similarity metric of tuple-relation in indoor environment. IEEE Access 2020, 8, 52041–52051.
11. Liu, J.; Hang, Y.; Xie, F.; Li, R. Adaptive robust ultra-tightly coupled global navigation satellite system/inertial navigation system based on global positioning system/beidou vector tracking loops. IET Radar Sonar Navig. 2014, 8, 815–827.
12. Campos, R.S.; Lovisolo, L.; Campos, M.L.R.D. Wi-Fi multi-floor indoor positioning considering architectural aspects and controlled computational complexity. Expert Syst. Appl. 2014, 41, 6211–6223.
13. Fu, M.; Zhu, W.; Le, Z.; Manko, D.; Gorbov, I.; Beliak, I. Weighted average indoor positioning algorithm that uses leds and image sensors. Photonic Netw. Commun. 2017, 34, 202–212.
14. Ye, A.; Shao, J.; Xu, L.; Chen, J.; Xiong, J. A local hmm for indoor positioning based on fingerprinting and displacement ranging. IET Commun. 2018, 12, 1163–1170.
15. Tomazic, S.; Dovzan, D.; Skrjanc, I. Confidence-interval fuzzy model-based indoor localization. IEEE Trans. Ind. Electron. 2018, 66, 2015–2024.
16. Peixiao, W.; Sheng, W.; Hengcai, Z.; Feng, L. Indoor location prediction method for shopping malls based on location sequence similarity. ISPRS Int. J. Geo-Inf. 2019, 8, 517.
17. Peixiao, W.; Hongen, W.; Hengcai, Z.; Feng, L.; Sheng, W. A hybrid markov and lstm model for indoor location prediction. IEEE Access 2019, 7, 185928–185940.
18. Li, H.; Lu, H.; Shou, L.; Chen, G.; Chen, K. Finding Most Popular Indoor Semantic Locations Using Uncertain Mobility Data. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE), Macau SAR, China, 8–11 April 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 2139–2140.
19. Liu, T.; Feng, Z.; Li, H.; Lu, H.; Cheema, M.A.; Cheng, H.; Xu, J. Shortest Path Queries for Indoor Venues with Temporal Variations. In Proceedings of the 2020 IEEE 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 2014–2017.
20. Baba, A.I.; Jaeger, M.; Lu, H.; Pedersen, T.B.; Ku, W.-S.; Xie, X. Learning-based cleansing for indoor rfid data. In Proceedings of the 2016 International Conference on Management of Data, San Francisco, CA, USA, 26 June 2016–1 July 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 925–936.
21. Li, H.; Hua, L.; Shou, L.; Gang, C.; Ke, C. In search of indoor dense regions: An approach using indoor positioning data. IEEE Trans. Knowl. Data Eng. 2018, 30, 1481–1495.
22. Zheng, L.; Xia, D.; Zhao, X.; Tan, L.; Li, H.; Chen, L.; Liu, W. Spatial-temporal travel pattern mining using massive taxi trajectory data. Phys. Stat. Mech. Appl. 2018, 501, 24–41.
23. Lu, M.; Liang, J.; Wang, Z.; Yuan, X. Exploring od patterns of interested region based on taxi trajectories. J. Vis. 2016, 19, 811–821.
24. Zhao, P.; Qin, K.; Ye, X.; Wang, Y.; Chen, Y. A trajectory clustering approach based on decision graph and data field for detecting hotspots. Int. J. Geogr. Inf. Sci. 2017, 31, 1101–1127.
25. Li, X.; Han, J.; Lee, J.-G.; Gonzalez, H. Traffic Density-Based Discovery of Hot Routes in Road Networks; Springer: Berlin/Heidelberg, Germany, 2017; pp. 441–459.
26. Anbaroglu, B.; Heydecker, B.; Cheng, T. Spatio-temporal clustering for non-recurrent traffic congestion detection on urban road networks. Transp. Res. Part. C 2014, 48, 47–65.
27. Cheng, S.; Lu, F.; Peng, P.; Wu, S. A spatiotemporal multi-view-based learning method for short-term traffic forecasting. ISPRS Int. J. Geo-Inf. 2018, 7, 218.
28. Cheng, S.; Lu, F.; Peng, P.; Wu, S. Short-term traffic forecasting: An adaptive st-knn model that considers spatial heterogeneity. Comput. Environ. Urban. Syst. 2018, 71, 186–198.
29. Chui, C.K.; Chen, G. Kalman filtering with real-time applications. Appl. Opt. 1987, 28, 1841.
30. Zheng, Y.; Zhou, X. Computing with Spatial Trajectories; Springer Publishing Company Incorporated: Heidelberg, Germany, 2011.
31. Ankerst, M.; Breunig, M.; Kriegel, H.-P.; Sander, J. Optics: Ordering points to identify the clustering structure. Sigmod Rec. 1999, 28, 49–60.
32. Agrawal, K.P.; Garg, S.; Sharma, S.; Patel, P. Development and validation of optics based spatio-temporal clustering technique. Inf. Sci. 2016, 369, 388–401.
33. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the International Conference on Knowledge Discovery & Data Mining, Portland, OR, USA, 2–4 August 1996; AAAI Press: Palo Alto, CA, USA, 1996; pp. 226–231.

Figure 1. Basic definition: (a) raw trajectory of a user; (b) simplified trajectory of a user; and (c) reconstructed trajectory of a user.

Figure 2. IndoorSRC framework.

Figure 3. Simplified process of the Indoor-STAGNES algorithm: (a) indoor trajectory of a user, (b) indoor simplified trajectory of a user.

Figure 4. Simplified trajectory with missing trajectory points.

Figure 5. Detection process of the Indoor-STOPTICS algorithm: (a) reconstructed trajectory point set $DB$ and (b) decision graph of point set $DB$.

Figure 6. Impact of parameters ($\epsilon_{2}$) on IndoorSRC.

Figure 7. Detection results of the IndoorSRC framework: (a) floor graph of set $DB$; (b) spatiotemporal prism graph of set $DB$; (c) decision graph of set $DB$; and (d) detection results of set $DB$.

Figure 8. Comparison of the point density and crowd density with the baseline method: (a) crowd density comparison, (b) point density comparison.

Figure 9. Comparison of the running time with the baseline approach.

Table 1. Sample table of user trajectory data.

User ID	Date and Time	X (m)	Y (m)	Floor ID
2813BF ***	2017-12-29 09:25:58	130,219 ***	43,904 ***	2
2813BF ***	2017-12-29 09:26:01	130,219 ***	43,903 ***	2
2813BF ***	2017-12-29 09:26:05	130,219 ***	43,904 ***	2
……	……	……	……	……
2813BF ***	2017-12-29 20:18:48	130,219 ***	43,904 ***	5
2813BF ***	2017-12-29 20:18:51	130,219 ***	43,904 ***	5

*** means data omitted.

Table 2. Indoor high-density crowd locations and time characteristics.

	High-Density Crowd Locations	Time
Location 1	Food Shangjia	11:31–11:56
Location 2	Food Shangjia	12:07–12:34
Location 3	Chinese Restaurant	12:16–13:32
Location 4	Fisherman’s lamp	12:20–13:24
Location 5	Food Shangjia	14:14–14:17

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

