After the preliminary screening of abnormal trajectories, the next step is to judge the abnormal type from the movement features. We first discuss the types and definitions of abnormal behavior, then analyze the characteristics of the different types and introduce the methods used to judge them.
2.3.1. Abnormal Pattern Definition
The motion type embodies the basic characteristics of trajectory movement. Defining the motion type begins with extracting the motion characteristics of the trajectory, which in turn helps us judge the abnormal patterns. We used the Martino–Saltzman typology [26], which is the most widely acknowledged model, to define the motion types. Based on this model, the motion types are defined as follows:
Definition 1. direct: locomotion from a point to a destination along a straightforward path without significant indecision.
Definition 2. pacing: back and forth locomotion between two points, for which the directional heading is reversed.
Definition 3. lapping: circuitous locomotion revisiting, at least, three points sequentially along the path with several directional changes.
Definition 4. random: locomotion along a haphazard path with multiple changes in direction and several instances of indecision at any point along the path.
The motion types represent and define the initial movement characteristics of the trajectory, and the next critical step is to construct the relationship between the motion type and abnormal pattern.
In view of the characteristics of common crimes’ preparation [31], abnormal behaviors before a crime mainly include wandering, scouting, random walking, and trailing.
Definition 5. wandering: circuitous locomotion revisiting, at least, three points sequentially along the path with several directional changes.
The main feature of wandering behavior is to walk around a place repeatedly. The existing wandering behavior research mainly focuses on the abnormal behavior of patients with Alzheimer’s disease [7,32].
Definition 6. scouting: the same person has at least two similar wandering behavior trajectories.
Before committing a crime, most criminals will wander around the crime scene many times to ascertain the situation. Therefore, scouting behavior is the important characteristic of crime preparation, and Chinese law stipulates that scouting a spot is one of the important preparations for crimes. Since scouting behavior often lasts for multiple days, we can define it as a multi-day wandering behavior.
Definition 7. random walking: locomotion along a haphazard path with multiple changes in direction and several instances of indecision at any points along the path.
Random crimes are a common type of conventional crime. In random crimes, criminals tend to randomly walk around the crime site, choose the crime site and objects at any time, and commit crimes such as theft.
Definition 8. trailing: two persons take similar trajectories, and one person walks behind another.
Criminals usually follow the target person before committing a crime. Trailing behavior means that the criminal follows the target person in a certain area, walking behind or beside him/her. Regardless of whether the target person moves forward or turns, the criminal follows closely. Trailing behavior is an important form of crime preparation, and Chinese law lists it as one of the crime preparations.
Since wandering, scouting, and random walking behaviors can be determined from the characteristics of a single person’s trajectory, we group them into one category and discuss their characteristics and judgment methods in Section 2.3.2. Trailing behavior requires analyzing the trajectories of two people and is discussed in Section 2.3.3.
2.3.2. Wandering, Scouting, and Random Walking Behaviors’ Identification
To determine wandering, scouting, and random walking behaviors, we first examined the original trajectory to extract its motion type and then determined the type of abnormal activity pattern. The basic process of identification is shown in Figure 3.
The set Z of abnormal trajectory ids screened through Formula (7) is processed in two stages: we first distinguish the movement behavior of each trajectory and then determine its behavior pattern according to that movement behavior.
Because directly using the latitude and longitude information would impose a heavy computational burden on the algorithm, we divided the sensitive area with a fixed step into a longitude–latitude grid of small cells, each cell corresponding to an actual geographic area of the sensitive area. The longitude and latitude of each trajectory point are then mapped to a matrix whose entries correspond to these cells.
In this mapping, x and y refer to the horizontal and vertical coordinates of the matrix, and the i-th trajectory point is described by its time, its longitude, and its latitude.
After mapping the trajectories to the matrix, we consider two trajectory points equal if they fall into the same geographic grid cell.
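The grid mapping described above can be illustrated with a minimal sketch. The function names, the `lon_min`/`lat_min` origin, and the single `step` parameter are assumptions made for illustration; the paper's own mapping formulas are not reproduced here.

```python
import math

def to_cell(lon, lat, lon_min, lat_min, step):
    """Map a longitude/latitude pair to (x, y) grid indices.

    lon_min/lat_min (south-west corner of the sensitive area) and step
    (cell size in degrees) are hypothetical parameters.
    """
    x = math.floor((lon - lon_min) / step)
    y = math.floor((lat - lat_min) / step)
    return x, y

def map_trajectory(points, lon_min, lat_min, step):
    """Convert (t, lon, lat) samples into (t, x, y) grid samples."""
    return [(t, *to_cell(lon, lat, lon_min, lat_min, step))
            for t, lon, lat in points]

def same_point(cell_a, cell_b):
    """Two trajectory points are treated as equal if they share a cell."""
    return cell_a == cell_b
```

Comparing integer cell indices instead of raw coordinates is what makes the later revisit checks cheap.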
Next, we used the method proposed in [33] to classify the pattern of the trajectory. N.K. Vuong et al. [33] proposed an algorithm based on a deterministic predefined tree, which follows a state diagram describing how the trajectory shifts from direct to random, and then from random to pacing or lapping. The algorithm marks the locations of a trajectory as direct, pacing, or lapping. If a trajectory pattern does not belong to direct, pacing, or lapping, it is regarded as random. The trajectories processed by that algorithm are based on indoor locations, such as bathrooms and living rooms, while our trajectories cover a large range of sensitive areas with geographic coordinates. Therefore, we extended the algorithm to fit our scenarios; the concrete procedure is shown in Algorithm 2.
After the movement pattern of the trajectory is recognized, the algorithm uses the four abnormal behaviors defined above to determine the abnormal behavior of the trajectory.
For random walking behavior and direct behavior, which involve only a single day of movement, the judgment is immediate: a trajectory whose motion type is random is labeled as random walking behavior, and one whose motion type is direct is labeled as direct behavior.
For wandering behavior, to identify abnormal trajectories more accurately, we use the difference between distance and displacement to judge whether wandering occurs. This method is fast and can quickly detect wandering behavior in massive trajectory data. The concrete procedure is shown in Algorithm 3.
Algorithm 2 Identifying mobility patterns.
Input: the sequence of previously visited locations
Output: pattern type (“direct”, “random”, “lapping”, or “pacing”)
1: if (…) where (…) then
2:   label “direct” for the sequence;
3: else
4:   find circles in the sequence;
5:   label “pacing” for points in the circle whose length is 2;
6:   label “lapping” for points in the circle whose length is between 3 and (…);
7:   for each unlabeled sub-sequence of the sequence do
8:     if (…) where (…) then
9:       label “direct” for the sub-sequence;
10:     else
11:       label “random” for the sub-sequence;
12:     end if
13:   end for
14:   (…) = the number of sub-patterns labeled as “random”, “lapping”, and “pacing”, respectively;
15:   (…);
16:   if (…) then
17:     label “random” for the sequence;
18:   else
19:     if (…) then
20:       label “lapping” for the sequence;
21:     else
22:       label “pacing” for the sequence;
23:     end if
24:   end if
25: end if
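As a rough illustration of the labeling idea in Algorithm 2, the following sketch classifies a sequence of grid cells. It is not the authors' implementation: the `max_cycle` bound on lapping loops, the rule that an unlabeled run with repeated cells counts as "random", and the final voting rule are all assumptions made here to obtain something runnable.

```python
def classify_pattern(locations, max_cycle=6):
    """Label a cell sequence as direct, pacing, lapping, or random.

    max_cycle (longest loop still counted as lapping) and the final
    voting rule are hypothetical choices, not taken from the paper.
    """
    # Collapse consecutive duplicates (staying inside one cell).
    seq = [p for i, p in enumerate(locations) if i == 0 or p != locations[i - 1]]
    if len(set(seq)) == len(seq):
        return "direct"  # no location is ever revisited

    labels = [None] * len(seq)
    counts = {"pacing": 0, "lapping": 0, "random": 0}
    last_seen = {}
    for j, loc in enumerate(seq):
        i = last_seen.get(loc)
        if i is not None and j - i <= max_cycle:
            # A loop of 2 moves (A, B, A) is pacing; longer loops are lapping.
            kind = "pacing" if j - i == 2 else "lapping"
            counts[kind] += 1
            for k in range(i, j + 1):
                labels[k] = labels[k] or kind
        last_seen[loc] = j

    # Unlabeled runs that still contain a repeated cell count as random.
    run = []
    for loc, lab in zip(seq + [None], labels + ["end"]):
        if lab is None:
            run.append(loc)
        elif run:
            if len(set(run)) != len(run):
                counts["random"] += 1
            run = []

    if counts["random"] > counts["lapping"] + counts["pacing"]:
        return "random"
    return "lapping" if counts["lapping"] > counts["pacing"] else "pacing"
```

For example, `classify_pattern(["a", "b", "a", "b", "a"])` yields `"pacing"`, while a sequence that returns to its start only after a loop longer than `max_cycle` falls through to `"random"`.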
Algorithm 3 Judge wandering behavior.
Input: the trajectory, divided evenly into a areas according to time; a threshold (…)
Output: whether the trajectory has wandering behavior
1: for (…) to a do
2:   calculate the displacement and distance of the user’s movement in each area;
3:   if (…) then
4:     (…);
5:   end if
6: end for
7: if (…) then
8:   return true;
9: else
10:   return false;
11: end if
The distance and displacement in the collected trajectory data were calculated according to Formulas (11) and (12). We selected the threshold by adjusting its value repeatedly and keeping the one that worked best.
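The distance-versus-displacement test can be sketched as follows. This is a minimal illustration under stated assumptions: points are planar (x, y) coordinates sampled at even time intervals, the comparison is a ratio test with a hypothetical `ratio_threshold`, and `min_hits` (how many segments must trigger) is also hypothetical. The paper's Formulas (11) and (12) and its tuned threshold are not reproduced.

```python
import math

def path_length(points):
    """Total distance travelled along consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def has_wandering(points, segments=4, ratio_threshold=3.0, min_hits=1):
    """Flag a trajectory whose travelled distance far exceeds its displacement.

    segments, ratio_threshold, and min_hits are assumed parameters.
    """
    n = len(points)
    hits = 0
    for s in range(segments):
        # Evenly sized index ranges stand in for even time slices.
        lo = s * (n - 1) // segments
        hi = (s + 1) * (n - 1) // segments
        part = points[lo:hi + 1]
        if len(part) < 2:
            continue
        distance = path_length(part)
        displacement = math.dist(part[0], part[-1])
        if distance > 0 and distance > ratio_threshold * displacement:
            hits += 1
    return hits >= min_hits
```

A straight walk has distance equal to displacement in every segment and is never flagged, whereas a closed loop has near-zero displacement and is flagged as soon as it fits inside one segment.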
For scouting behavior, criminals usually prepare carefully before committing crimes in sensitive locations. Knowing the surrounding buildings and police locations in advance can help criminals increase the success rate of a crime. The trajectory of scouting behavior has the following characteristics: long staying time, long moving distance, and frequent visits. We identified scouting behavior based on the following rules.
(1) Long stay time: if the number of staying points of a certain trajectory is far more than the number of staying points of the other normal trajectories, the trajectory is suspected to be scouting behavior.
(2) Frequent and repeated visits: if a trajectory has multiple similar trajectories in the sensitive area, the trajectory is suspected to be scouting.
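The two rules above could be checked along the following lines. The text does not quantify "far more than", so this sketch assumes a mean-plus-k-standard-deviations cutoff; `k` and `min_similar` are hypothetical parameters, with `min_similar=2` chosen to echo Definition 6.

```python
from statistics import mean, pstdev

def scouting_flags(stay_count, normal_stay_counts, similar_count,
                   k=3.0, min_similar=2):
    """Return which scouting rules a trajectory triggers.

    k and min_similar are assumed thresholds, not values from the paper.
    """
    cutoff = mean(normal_stay_counts) + k * pstdev(normal_stay_counts)
    return {
        "long_stay": stay_count > cutoff,          # rule (1)
        "frequent": similar_count >= min_similar,  # rule (2)
    }

def is_scouting_suspect(stay_count, normal_stay_counts, similar_count, **kw):
    """A trajectory is suspected if either rule fires."""
    return any(scouting_flags(stay_count, normal_stay_counts,
                              similar_count, **kw).values())
```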
2.3.3. Trailing Recognition Based on the Probability Model
We proposed an effective method to determine the trailing relationship. Our method includes three steps: (1) determining the front–behind relationship; (2) spatiotemporal correlation calculation; (3) discovery of the trailing relationship based on the Gaussian kernel function.
(1) Front–behind relationship determination
We used preliminary screening to filter out unrelated data and improve the speed of the method. Using the known trajectory information, the method selects pairs of persons who moved within the same area during the same time period of one day.
First, we used linear interpolation to preprocess data to ensure that the time points in the trajectory were spaced evenly and the points with the same index had the same time point. Then, we selected the sub-interval formed by the suspected peer points to calculate the trajectory similarity and calculate the peer relationship.
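A minimal sketch of the interpolation step: both trajectories are resampled onto a shared list of timestamps so that points with the same index refer to the same time. The (t, x, y) tuple layout and the function name are assumptions for illustration.

```python
def resample(points, times):
    """Linearly interpolate (t, x, y) samples at the given timestamps.

    points must be sorted by t and contain at least two samples;
    times must be sorted and lie within the sampled time range.
    """
    out = []
    j = 0
    for t in times:
        # Advance to the segment [points[j], points[j + 1]] that contains t.
        while j + 1 < len(points) - 1 and points[j + 1][0] < t:
            j += 1
        t0, x0, y0 = points[j]
        t1, x1, y1 = points[j + 1]
        w = 0.0 if t1 == t0 else (t - t0) / (t1 - t0)
        out.append((t, x0 + w * (x1 - x0), y0 + w * (y1 - y0)))
    return out
```

Calling `resample` on two trajectories with the same `times` list gives index-aligned point sets for the pairwise distance checks.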
The meeting point is the first index in the trajectory sample dataset at which the Euclidean distance between the two corresponding points is less than a certain threshold. While Objects A and B remain in the same state, the Euclidean distance at each subsequent index stays below a threshold. Therefore, all points that maintain the peer behavior after the meeting are judged to be suspected peer points.
The Euclidean distance is computed one-to-one between the sample points that share the same index in the two sample sets. Two points whose distance is greater than the threshold maxdist are regarded as separate. The first subsequent index at which the distance is less than the threshold mindist is considered a meeting between the two. When the distance between the two is less than maxdist, there is a peer relationship between them. Thus, a state can be assigned to each trajectory point in the pre-processed dataset.
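The state assignment can be sketched as a small two-threshold state machine. This is one possible reading of the rules, not a confirmed one: it assumes that a peer state can only follow a meeting, so a pair that has not yet met stays "separate" even when its distance lies between mindist and maxdist. The state names are chosen here for illustration.

```python
import math

def label_states(track_a, track_b, mindist, maxdist):
    """Assign 'separate', 'meeting', or 'peer' to each index-aligned pair.

    Assumes both tracks were resampled to the same timestamps and that
    mindist <= maxdist. The hysteresis behaviour is an interpretation.
    """
    states = []
    together = False
    for p, q in zip(track_a, track_b):
        d = math.dist(p, q)
        if not together and d < mindist:
            together = True
            states.append("meeting")   # first close approach
        elif together and d < maxdist:
            states.append("peer")      # still travelling together
        else:
            together = False
            states.append("separate")
    return states
```

The indices labeled "meeting" or "peer" form the sub-interval of suspected peer points on which the direction similarity is then evaluated.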
Because the Euclidean distance only considers the distance relationship when evaluating the relationship between the movement trajectories of two objects, this paper used an additional cosine similarity index to evaluate the similarity of the movement direction points of the object. We then combined the distance relationship and directional relationship to comprehensively analyze trajectory similarity. Because the probability of a trajectory point that exceeds the distance threshold of the suspected peer point is extremely small, the trajectory part containing the suspected peer point can be selected as a sub-interval for measuring the similarity of the trajectory points. After the sub-intervals are set, the cosine similarity between the vectors of two adjacent points is used to determine the credibility of the peers for the sample points existing in the interval.
At a certain time point i, objects A and B each occupy a position point, and at the next time point i + 1 each occupies a new position point. The movement of each object between the two time points forms a vector, and the cosine of the angle between the two vectors is given by Formula (13).
The closer the value of cos is to 1, the more similar the two vectors are. After traversing all the points to obtain the cosine for each of the n pairs of vectors, we summed the values and divided by n to obtain the average cosine similarity as a reference value. Whether there was a front–behind relationship was determined according to this average similarity.
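The averaging step can be illustrated with the standard cosine of the angle between two displacement vectors, cos θ = (u · v) / (|u| |v|). The sketch below assumes planar (x, y) points that are already index-aligned; skipping steps in which either object does not move is a choice made here, not something stated in the text.

```python
import math

def mean_direction_cosine(track_a, track_b):
    """Average cosine between the step vectors of two aligned tracks."""
    total, n = 0.0, 0
    steps_a = zip(track_a, track_a[1:])
    steps_b = zip(track_b, track_b[1:])
    for (a0, a1), (b0, b1) in zip(steps_a, steps_b):
        ux, uy = a1[0] - a0[0], a1[1] - a0[1]
        vx, vy = b1[0] - b0[0], b1[1] - b0[1]
        nu, nv = math.hypot(ux, uy), math.hypot(vx, vy)
        if nu == 0 or nv == 0:
            continue  # direction undefined when an object stands still
        total += (ux * vx + uy * vy) / (nu * nv)
        n += 1
    return total / n if n else 0.0
```

Two objects walking in parallel give a value near 1, while objects heading in opposite directions give a value near -1.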
(2) Spatiotemporal correlation calculation
Based on the trajectory data obtained by the preliminary screening, the known trajectory data of the target and the preliminary screened trajectory data were used to calculate the spatiotemporal correlation degree. We employed the semantic spatiotemporal correlation method to calculate it.
Generally, when criminals intend to trail victims, they visit a place with different frequencies in different semantic time periods (random walk, trailing, escape). The screened trajectories still include both those of criminals and those of normal persons who briefly pass by the victim. A normal person will not walk randomly to find a target before trailing, nor escape quickly after trailing, whereas a criminal does both. Therefore, by using the probability distribution of the visit intensity in the different semantic time periods, normal persons can be excluded.
A trajectory is divided into three semantic time periods: random walk, trailing, and escape. After the preprocessing in the previous step, we can obtain the front–behind semantic time period between two trajectories. The time period before the beginning of the front–behind relationship is defined as the random walking period; the time period with the front–behind relationship is defined as the trailing period; and the time period after the end of the relationship is defined as the escape period.
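The three-way split can be sketched directly from these definitions. The function below assumes the start and end times of the front–behind relationship are already known from the previous step; the names are illustrative.

```python
def split_periods(times, trailing_start, trailing_end):
    """Group sample indices into random walk, trailing, and escape periods."""
    periods = {"random_walk": [], "trailing": [], "escape": []}
    for i, t in enumerate(times):
        if t < trailing_start:
            periods["random_walk"].append(i)  # before the relationship begins
        elif t <= trailing_end:
            periods["trailing"].append(i)     # while it holds
        else:
            periods["escape"].append(i)       # after it ends
    return periods
```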
We first set a threshold TN and filtered out all elements of the matrix calculated by Formulas (8) and (9) that were less than TN to form a region list. The region list of each trajectory in each semantic time period was ranked by frequency and used to produce the TOP-N region set. This set gives the probability distribution of a trajectory in the different semantic time periods (random walk, trailing, escape): each entry pairs the distribution frequency of the trajectory in the k-th region with the coordinate of that region. A very small value is added to the denominator to prevent it from becoming 0.
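One way to build such a TOP-N region set is sketched below. It assumes the visits of one semantic time period are given as a list of grid cells, treats TN as a minimum visit count, and places the small constant `eps` in the normalizing denominator; all three choices are assumptions, since the defining formulas are not reproduced here.

```python
from collections import Counter

def top_n_regions(cells, tn, n, eps=1e-9):
    """Return [(cell, frequency)] for the n most visited cells.

    Cells visited fewer than tn times are dropped first; eps keeps the
    denominator non-zero when nothing survives the filter.
    """
    counts = Counter(cells)
    kept = [(cell, c) for cell, c in counts.items() if c >= tn]
    kept.sort(key=lambda item: (-item[1], item[0]))  # most visited first
    top = kept[:n]
    total = sum(c for _, c in top)
    return [(cell, c / (total + eps)) for cell, c in top]
```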
After using the same method to obtain the TOP-N region set of Trajectory a and Trajectory b at different semantic time periods (random walk, trailing, escape), we performed the weighting calculation by Jensen–Shannon distances (JSD) to obtain the semantic space–time pattern similarity, which was used to judge whether Trajectory a and Trajectory b had the trailing relationship.
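The Jensen–Shannon distance is the square root of the Jensen–Shannon divergence; with base-2 logarithms it lies between 0 (identical distributions) and 1 (disjoint support). The sketch below computes it for two region distributions given as dictionaries and then combines the three semantic periods. Converting each distance to a similarity as 1 minus the distance and the per-period `weights` are assumptions made for illustration.

```python
import math

def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions.

    p and q map region keys to probabilities; missing keys count as 0.
    """
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    divergence = 0.5 * kl(p) + 0.5 * kl(q)
    return math.sqrt(max(divergence, 0.0))

def semantic_similarity(dists_a, dists_b, weights):
    """Weighted similarity over semantic periods (hypothetical weighting)."""
    return sum(w * (1.0 - js_distance(dists_a[k], dists_b[k]))
               for k, w in weights.items())
```

Where SciPy is available, `scipy.spatial.distance.jensenshannon` computes the same distance for aligned probability vectors.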
(3) Trailing recognition with Gaussian kernel function
After obtaining the spatiotemporal correlation of the trajectory data, the Gaussian kernel function was used to build the discriminant model, providing a list of people who were highly suspected of trailing along with their corresponding trajectory data.
The Gaussian kernel, also known as the radial basis function (RBF) kernel, is a commonly used kernel function. We used it to build the final discriminant model, which maps the finite-dimensional data to a high-dimensional space to obtain a more accurate discrimination result using Formula (18). We then determined whether Trajectory a and Trajectory b had trailing behavior based on whether this result was greater than a threshold.
In Formula (18), two weight coefficients control the influence of the correlation strength between the criminal and the victim in the two spatiotemporal models, and two bandwidth parameters of the Gaussian kernel control the classification strength of the model. The resulting value expresses the possibility of trailing, and a threshold on it can be used to filter out the trajectories and persons who may have a trailing relationship.
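The general shape of such a discriminant can be sketched with the standard Gaussian kernel exp(-d² / (2σ²)). The exact form of Formula (18) is not reproduced here, so the weighted sum below is only an assumed stand-in: `d1` and `d2` denote hypothetical distance-like scores from the two spatiotemporal models, and `alpha`, `beta`, `sigma1`, `sigma2`, and `threshold` are illustrative defaults.

```python
import math

def gaussian_kernel(distance, sigma):
    """Standard Gaussian (RBF) kernel value for a given distance."""
    return math.exp(-distance ** 2 / (2.0 * sigma ** 2))

def trailing_score(d1, d2, alpha=0.5, beta=0.5, sigma1=1.0, sigma2=1.0):
    """Weighted sum of two kernel responses (assumed form, not Formula (18))."""
    return alpha * gaussian_kernel(d1, sigma1) + beta * gaussian_kernel(d2, sigma2)

def is_trailing(d1, d2, threshold=0.8, **params):
    """Flag a pair of trajectories whose score exceeds the threshold."""
    return trailing_score(d1, d2, **params) > threshold
```

Smaller bandwidths make the score fall off faster as the two trajectories diverge, which is the sense in which the bandwidth controls the classification strength.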