*Article* **Detection of Salient Crowd Motion Based on Repulsive Force Network and Direction Entropy**

#### **Xuguang Zhang 1,\* , Dujun Lin 1, Juan Zheng 2, Xianghong Tang 1, Yinfeng Fang <sup>1</sup> and Hui Yu 3,\***


Received: 8 May 2019; Accepted: 18 June 2019; Published: 20 June 2019

**Abstract:** This paper proposes a method for salient crowd motion detection based on direction entropy and a repulsive force network. This work focuses on how to effectively detect salient regions in crowd movement through calculating the crowd vector field and constructing the weighted network using the repulsive force. The interaction force between two particles calculated by the repulsive force formula is used to determine the relationship between these two particles. The network node strength is used as a feature parameter to construct a two-dimensional feature matrix. Furthermore, the entropy of the velocity vector direction is calculated to describe the instability of the crowd movement. Finally, the feature matrix of the repulsive force network and direction entropy are integrated together to detect the salient crowd motion. Experimental results and comparison show that the proposed method can efficiently detect the salient crowd motion.

**Keywords:** crowd behavior analysis; salient crowd motion detection; repulsive force; direction entropy; node strength

#### **1. Introduction**

Video surveillance plays an important role in monitoring crowd safety, which is one of the key concerns in our daily life. Since the traditional human-computer interaction between video surveillance and crowd safety is time-consuming and labor-intensive, intelligent video surveillance issues such as target tracking, target detection and crowd analysis have become popular research topics. Crowd motion detection and analysis are essential for crowd behavior understanding [1,2]. It is thus very important to detect the salient motion in the crowd to monitor any potential threats or even damage to social safety. Salient motion has been defined as motion that is likely to result from a typical surveillance target as opposed to other distracting motions [3]. According to this definition, salient crowd motion usually indicates areas that are inconsistent with the mainstream pedestrians' movement. For video surveillance, these areas deserve more attention.

In recent years, due to the rapid development of computer vision technologies, progress has been made in the detection of crowd saliency. For example, Lim et al. [4,5] proposed a method for automatically detecting salient regions from the temporal variation of the flow field of a crowd scene, identifying the fluid activity in a given scene and detecting saliency with a minimal observation region. Methods that detect globally salient motion regions through spectral singularity analysis of the motion regions in video [6,7] have also been presented. Zhou et al. [8] studied the invariance of coherent neighbors as coherent motion priors, and proposed an effective clustering technique to detect crowd saliency. Solmaz et al. [9] overlaid the scene with a grid of particles driven by the dynamical system defined by the optical flow, and proposed a method to identify five crowd behaviors in the visual scene through time integration. Zhang et al. [10] surveyed physics-based methods for crowd video analysis and compiled the existing public databases for crowd video analysis. Although many methods have shown good performance in crowd salient motion detection, the internal mechanism of crowd movement still needs to be explored. The pattern of crowd movement depends on both individual movement and the interaction between individuals. It is of great value to explore a method that describes individual interaction and to apply it to crowd salient motion detection.

In this paper, we propose a salient crowd motion detection method based on a direction entropy and a repulsive force network. The optical flow is first obtained using the pyramid-based Lucas-Kanade optical flow algorithm. Then, the weighted network is constructed by the repulsive force and the node strength matrix is obtained by using the node degree as the characteristic parameter. Finally, the particle motion direction entropy is used to optimize the node strength matrix and to detect salient movements of the crowds. The framework of the proposed method is shown in Figure 1. A motion vector field is established by giving each pixel a velocity vector in each image through the Pyramid Lucas-Kanade optical flow algorithm. Each vector in the crowd vector field is treated as a moving micro-particle. In order to build a complex network model, we regard each particle and the relationship between two particles as node and edge in the network, respectively. In order to show whether there is a connection between two particle nodes, we use the interaction force to construct the network. After calculating by optical flow method, the position and velocity parameters of each particle can be determined. Whether there is an edge between the nodes depends on the value of repulsive force between these nodes. The repulsive force can be described by the inertial centrifugal force. The value of the inertial centrifugal force is the weight of the edge and a velocity vector node can be selected accordingly. In the neighborhood of the node, the relevancy between the two velocity vectors is taken as a condition to determine the relevancy between the corresponding nodes.

**Figure 1.** Framework of salient crowd motion detection based on repulsive force network and direction entropy.

A weighted crowd network model is constructed to obtain the adjacency matrix representing the crowd motion information. In order to obtain a complete boundary of salient motion region, the velocity field is reversed and the repulsive force between particles is calculated repeatedly to construct the repulsive force network model. Then, the edge and weight are constructed by the repulsive force model, and the results of the superposition are taken as a construction step. Once all nodes are traversed, the strength of each crowd-weighted network node is extracted as a characteristic parameter to construct the strength matrix of the nodes. By calculating the direction of the velocity entropy of each node in the neighborhood, we can obtain the direction entropy matrix of the node. Then, the normalized direction entropy matrix and the strength matrix of the node are used to further optimize the strength matrix of the node. Once the node strength matrix is obtained, the salient region in crowd movement can be detected.

#### **2. Calculation of Crowd Velocity Vector Field**

To calculate the velocity vector field, the crowd video is decomposed into image sequences. Then, each pixel of the image is given a velocity vector calculated using an optical flow algorithm. A motion vector field is thus established. In this paper, considering the spatio-temporal information in motion detection [11], we adopt an improved algorithm based on Lucas-Kanade optical flow algorithm [12] for this task, namely pyramid optical flow algorithm [13].

The Lucas-Kanade method assumes that a pixel *(x, y)* in the image has a brightness of *I(x, y, t)* at time *t*. After a small time interval Δ*t*, the point has moved and its brightness becomes *I(x* + Δ*x, y* + Δ*y, t* + Δ*t)*. Expanding with the Taylor formula and neglecting the higher-order terms as Δ*t* approaches zero gives:

$$I(x + \Delta x, y + \Delta y, t + \Delta t) = I(x, y, t) + \frac{\partial I}{\partial x} \Delta x + \frac{\partial I}{\partial y} \Delta y + \frac{\partial I}{\partial t} \Delta t \tag{1}$$

The optical flow constraint equation can be obtained from the brightness constancy assumption:

$$\frac{\partial I}{\partial x} \frac{dx}{dt} + \frac{\partial I}{\partial y} \frac{dy}{dt} + \frac{\partial I}{\partial t} = \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = I_x u + I_y v + I_t = 0 \tag{2}$$

Assuming that the optical flow is uniform over a small window of *n* pixels, we can establish the optical flow equations:

$$\begin{array}{l} I_{1x} u + I_{1y} v + I_{1t} = 0\\ I_{2x} u + I_{2y} v + I_{2t} = 0\\ \vdots\\ I_{nx} u + I_{ny} v + I_{nt} = 0 \end{array} \tag{3}$$

The Lucas-Kanade optical flow is then obtained by the least squares method, where *u* is the horizontal velocity and *v* is the vertical velocity:

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} I_{ix}^{2} & \sum_{i=1}^{n} I_{ix} I_{iy} \\ \sum_{i=1}^{n} I_{ix} I_{iy} & \sum_{i=1}^{n} I_{iy}^{2} \end{bmatrix}^{-1} \begin{bmatrix} -\sum_{i=1}^{n} I_{ix} I_{it} \\ -\sum_{i=1}^{n} I_{iy} I_{it} \end{bmatrix} \tag{4}$$
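The least squares step can be illustrated with a minimal sketch, assuming the brightness derivatives of one window are already available as plain Python lists. The function name `lucas_kanade_window` and the singularity threshold are illustrative choices, not part of the paper. Both right-hand-side entries carry a minus sign, which follows from moving the temporal terms of Equation (3) to the right-hand side.

```python
def lucas_kanade_window(ix, iy, it):
    """Solve the 2x2 normal equations for one window.

    ix, iy, it: equal-length sequences holding the spatial and temporal
    brightness derivatives at the n pixels of the window.
    Returns (u, v), or None when the system is singular, for example in
    a flat region or along a single straight edge.
    """
    sxx = sum(a * a for a in ix)
    sxy = sum(a * b for a, b in zip(ix, iy))
    syy = sum(b * b for b in iy)
    sxt = sum(a * c for a, c in zip(ix, it))
    syt = sum(b * c for b, c in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:  # illustrative threshold
        return None

    # Inverse of [[sxx, sxy], [sxy, syy]] applied to [-sxt, -syt].
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v
```

Returning `None` for a singular window is one possible design choice; a practical implementation might instead fall back to a regularized solve.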

The Lucas-Kanade optical flow algorithm is based on three assumptions: (1) constant brightness; (2) temporal continuity, i.e., the movement is small; (3) spatial consistency. If an object moves fast, the second assumption is not fully satisfied, and the value calculated by the traditional Lucas-Kanade optical flow has a larger deviation. The pyramid optical flow algorithm reduces the displacement of the target motion by downscaling the image layer by layer, which satisfies the assumptions of the optical flow calculation better and weakens the influence of fast target motion. In this paper, the crowd velocity field *Q* is obtained by the pyramid optical flow algorithm. All the velocity values in the horizontal and vertical directions are rounded up. The velocity vector field calculated using the pyramid Lucas-Kanade optical flow algorithm for a crowd scene is shown in Figure 2.

**Figure 2.** Crowd optical flow field of the sampled frame.

#### **3. Construction of Repulsive Force Network**

#### *3.1. Establishment of a Network Node*

A complex network is a useful tool for describing a complex system. Each element of the system is regarded as a node, and the relationship between elements is regarded as a connection, so that a complex system can be represented as a network [14]. The crowd velocity vector field can be described as a complex network in which each velocity vector is a node and the relationship between velocity vectors is an edge. If the properties of the velocity vectors are measured separately, the information stored in the cross-correlation of the velocity vectors cannot be obtained, because the correlation between velocity vectors carries more information than the properties of each individual velocity [15].

We use the interaction force between particles to construct the network [16,17]. After applying the optical flow method, each vector in the obtained crowd vector field is regarded as a moving microscopic particle. The position and velocity parameters of each particle can then be determined. In our crowd complex network, each particle is treated as a node, and the interaction force between two particles is treated as an edge in the network. Whether there is an edge between two nodes depends on the repulsive force between them. The repulsive force can be described by the inertial centrifugal force, and the value of the inertial centrifugal force is the weight of the edge. A weighted undirected network *G<sup>w</sup>* with node set *Q* = {*q*<sub>1</sub>, *q*<sub>2</sub>, ··· , *q<sub>n</sub>*} can thus be generated, where *n* is the total number of nodes. The number of network nodes is equal to the number of particles in the crowd velocity field.

#### *3.2. Establishing the Network Edges Using Repulsive Force Model*

In the whole particle field, the magnitude and direction of each particle's velocity are instantaneous, and the motion at the next moment is random. If each moving particle is regarded as an agent, there is a possibility of interaction and collision between particles in motion. To avoid collisions, each agent is subject to a repulsive force that keeps agents from colliding with each other. This repulsive force can be described by the inertial centrifugal force [18]. For a given crowd particle field *Q*(*M*, *N*) with *M* rows and *N* columns, a particle *q*<sub>*x*0*y*0</sub> is selected as the node, and a two-dimensional neighborhood δ of size (*x*<sub>0</sub> ± ε, *y*<sub>0</sub> ± ε) is constructed around it. In this region, the connection between *q*<sub>*x*0*y*0</sub> and another node *q<sub>xy</sub>* (*x* ≠ *x*<sub>0</sub>, *y* ≠ *y*<sub>0</sub>) is denoted *e*(*q*<sub>*x*0*y*0</sub>, *q<sub>xy</sub>*). Whether this connection exists is determined by the following formula:

$$e(q_{x_0 y_0}, q_{xy}) \begin{cases} \exists, & \overrightarrow{F}_{ij} \neq 0,\ q_{xy} \in \delta \\ \nexists, & \text{otherwise} \end{cases} \tag{5}$$

The formula for calculating the inertial centrifugal force is as follows:

$$\overrightarrow{F}_{ij} = -m_i k_{ij} \frac{v_{ij}^2}{dist_{ij}} \overrightarrow{e}_{ij} \tag{6}$$

where $\overrightarrow{e}_{ij}$ is the direction vector and *m<sub>i</sub>* is the mass of particle *q<sub>i</sub>*. In this paper, the mass of all particles is set to 1. *v<sub>ij</sub>* is the relative velocity of the two particles, *dist<sub>ij</sub>* is the distance between the two particles, and *k<sub>ij</sub>* is a coefficient. *v<sub>ij</sub>* and *k<sub>ij</sub>* are determined by the following formulas:

$$v_{ij} = \begin{cases} (\overrightarrow{v}_i - \overrightarrow{v}_j) \cdot \overrightarrow{e}_{ij}, & (\overrightarrow{v}_i - \overrightarrow{v}_j) \cdot \overrightarrow{e}_{ij} > 0\\ 0, & \text{otherwise} \end{cases} \tag{7}$$

$$k_{ij} = \begin{cases} (\overrightarrow{v}_i \cdot \overrightarrow{e}_{ij}) / v_i, & \overrightarrow{v}_i \cdot \overrightarrow{e}_{ij} > 0,\ v_i \neq 0\\ 0, & \text{otherwise} \end{cases} \tag{8}$$

The weight of the edge is then given by the magnitude of the repulsive force:

$$W_e = \left| \overrightarrow{F}_{ij} \right| \tag{9}$$
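The edge weight of Equations (6) to (9) can be sketched for a single pair of particles as follows. This is a minimal illustration under stated assumptions: the direction vector is taken to be the unit vector pointing from particle *i* to particle *j*, and the denominator in Equation (8) is taken to be the speed (magnitude) of particle *i*. Neither detail is spelled out in the text, and the function name `repulsive_force` is hypothetical.

```python
import math


def repulsive_force(pos_i, vel_i, pos_j, vel_j, mass=1.0):
    """Return the edge weight |F_ij| between particles i and j.

    pos_*: (x, y) positions; vel_*: (vx, vy) velocities.
    """
    dx, dy = pos_j[0] - pos_i[0], pos_j[1] - pos_i[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return 0.0  # coincident particles: no direction is defined
    ex, ey = dx / dist, dy / dist  # assumed: unit vector from i to j

    # Eq. (7): relative approach speed along e_ij, clipped at zero.
    v_ij = max(0.0, (vel_i[0] - vel_j[0]) * ex + (vel_i[1] - vel_j[1]) * ey)

    # Eq. (8): projection of v_i on e_ij divided by the speed of i
    # (assumed to be the magnitude of v_i), clipped at zero.
    speed_i = math.hypot(vel_i[0], vel_i[1])
    proj = vel_i[0] * ex + vel_i[1] * ey
    k_ij = proj / speed_i if (speed_i > 0.0 and proj > 0.0) else 0.0

    # Eqs. (6) and (9): weight = m_i * k_ij * v_ij^2 / dist_ij.
    return mass * k_ij * v_ij ** 2 / dist
```

Under these assumptions, two particles approaching each other head-on produce a positive weight, while the same pair moving apart produces zero, which is the asymmetry that motivates reversing the velocity field.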

According to the repulsive force formula, if a particle moves away from the affected particle, the repulsive force will be very low. As shown in Figure 3, the arrows represent the moving optical flow, and the blue lines represent the repulsive force generated. Figure 3a is the schematic diagram of the repulsive force in the original direction, and Figure 3b is the schematic diagram of the repulsive force in the opposite direction. Thus, in some applications of salient region detection, only half of the boundary can be detected.

**Figure 3.** Schematic diagram of repulsive force: (**a**) original motion flow; (**b**) motion flow reversed.

In order to get a complete boundary, the velocity field is reversed and the repulsive force between particles is calculated again. The repulsive force model is thus used twice to construct the edges and weights of the repulsive force network, and the superposition of the two results is taken as one construction step. Figure 4 shows an example. If we construct the repulsive force network for the optical flow field of the original video sequence, only half of the boundary can be obtained. If we construct the repulsive force network again after reversing the optical flow field, the other half of the boundary can be obtained.

**Figure 4.** Expressing the effect of optical flow reversal: (**a**) original sample frame; (**b**) detection result using original optical flow; (**c**) detection result using reversed optical flow; (**d**) detection result after the integration of optical flow.

The two-dimensional crowd velocity field is transformed into the weighted undirected network model *G<sup>w</sup>*(*Q*, *E*, *W<sub>e</sub>*) by repeating the above steps for each node. The corresponding node set of the weighted undirected network is *Q* = {*q*<sub>1</sub>, *q*<sub>2</sub>, ··· , *q<sub>n</sub>*} and the edge set is *E* = {*e*<sub>1</sub>, *e*<sub>2</sub>, ··· , *e<sub>m</sub>*}. In the crowd weighted network model, the connections between nodes and the degree of connection between nodes can be expressed by the following adjacency matrix:

$$A = \begin{bmatrix} \overrightarrow{F}_{11} & \overrightarrow{F}_{12} & \cdots & \overrightarrow{F}_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ \overrightarrow{F}_{n1} & \overrightarrow{F}_{n2} & \cdots & \overrightarrow{F}_{nn} \end{bmatrix} \tag{10}$$

#### *3.3. Calculation of Node Strength*

Statistical characteristic parameters, such as node degree, average path length, and clustering coefficient, can be used to represent the characteristics of a network. In this paper, node strength is chosen to describe the characteristics of the crowd complex network. In the complex network model, node strength is the generalization of node degree, which integrates the strength between edges and nodes [19,20]. From the adjacency matrix, the node strength *s*(*q<sub>i</sub>*) of node *q<sub>i</sub>* can be expressed as follows:

$$s(q_i) = \sum_{j=1}^{n} \left| \overrightarrow{F}_{ij} \right| \tag{11}$$

After calculating each point in the crowd velocity field, we can get the node strength of all nodes. The node strength field *S(M, N)* is also a two-dimensional matrix containing *M* rows and *N* columns. There is also a one-to-one correspondence between the node strength field and the crowd speed field:

$$S = \begin{bmatrix} S_{11} & S_{12} & \dots & S_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ S_{M1} & S_{M2} & \dots & S_{MN} \end{bmatrix} \tag{12}$$
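The construction of the node strength field, including the superposition of the original and reversed velocity fields described in Section 3.2, might be sketched as below. This is only an illustration: the function name `node_strength_field`, the interpretation of ε as a half-width in grid cells, and the pluggable `weight` callable (standing in for the edge weight of Equation (9)) are all assumptions made for this example.

```python
def node_strength_field(vel, half_width, weight):
    """Sum edge weights over each node's neighbourhood (Eq. (11)).

    vel: M x N grid (list of rows) of (vx, vy) velocity tuples.
    half_width: assumed neighbourhood half-width in grid cells.
    weight: hypothetical callable(pos_i, vel_i, pos_j, vel_j) returning
        a non-negative edge weight for one ordered pair of particles.
    """
    rows, cols = len(vel), len(vel[0])
    strength = [[0.0] * cols for _ in range(rows)]
    for y0 in range(rows):
        for x0 in range(cols):
            vi = vel[y0][x0]
            vi_rev = (-vi[0], -vi[1])
            total = 0.0
            for y in range(max(0, y0 - half_width), min(rows, y0 + half_width + 1)):
                for x in range(max(0, x0 - half_width), min(cols, x0 + half_width + 1)):
                    if (x, y) == (x0, y0):
                        continue  # a node has no edge to itself
                    vj = vel[y][x]
                    vj_rev = (-vj[0], -vj[1])
                    # Weight from the original field ...
                    total += weight((x0, y0), vi, (x, y), vj)
                    # ... plus the weight from the reversed field.
                    total += weight((x0, y0), vi_rev, (x, y), vj_rev)
            strength[y0][x0] = total
    return strength
```

Passing the weight as a callable keeps the sketch independent of the exact force model; the cost is four nested loops, so a real implementation on full-resolution frames would likely need vectorization.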

In order to facilitate the optimization of the node strength field at a later stage, the node strength field is normalized as follows:

$$S' = \frac{S - S_{min}}{S_{max} - S_{min}} = \begin{bmatrix} S'_{11} & S'_{12} & \dots & S'_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ S'_{M1} & S'_{M2} & \dots & S'_{MN} \end{bmatrix} \tag{13}$$

where *S<sub>max</sub>* and *S<sub>min</sub>* are the maximum and minimum values among all node strengths.
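The min-max normalization can be sketched in a few lines. The function name `min_max_normalize` is illustrative, and mapping a constant field to all zeros is an assumption made here to avoid division by zero; the paper does not discuss that case.

```python
def min_max_normalize(field):
    """Rescale a 2-D field (list of rows) linearly to the range [0, 1]."""
    flat = [value for row in field for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # Assumed convention: a constant field carries no contrast.
        return [[0.0 for _ in row] for row in field]
    span = hi - lo
    return [[(value - lo) / span for value in row] for row in field]
```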

#### **4. Optimizing Node Strength Field Using Direction Entropy**

#### *4.1. Establishment of Vector Direction Entropy Matrix*

For a crowd motion field *Q*(*M*, *N*) with *M* rows and *N* columns, a particle *q*<sub>*x*0*y*0</sub> is selected, and the direction angle of its motion is quantized into eight directions at 45-degree intervals. The velocity direction angle and the direction rank are determined by the following formulas:

$$\theta = \arctan \frac{q_{y_0}}{q_{x_0}} \tag{14}$$

$$d = \begin{cases} 1 & 0 \le \theta < \frac{\pi}{4} \\ \vdots & \vdots \\ 8 & \frac{7\pi}{4} \le \theta < 2\pi \end{cases} \tag{15}$$
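A possible implementation of the direction rank is sketched below. Since the arctangent of a ratio alone only covers half of the circle while the ranks span the range from 0 to 2π, this sketch uses the two-argument `math.atan2` and wraps the result into [0, 2π); that choice, the function name `direction_rank`, and the treatment of a zero vector as rank 1 are assumptions of this example.

```python
import math


def direction_rank(vx, vy):
    """Quantize a velocity direction into one of eight ranks (1 to 8)."""
    # atan2 returns an angle in (-pi, pi]; wrap it into [0, 2*pi).
    theta = math.atan2(vy, vx) % (2.0 * math.pi)
    # Each rank covers pi/4 radians; the min() guards against theta
    # rounding up to exactly 2*pi for tiny negative angles.
    return min(int(theta // (math.pi / 4.0)), 7) + 1
```

Note that in image coordinates the vertical axis usually points downward, so the ranks run clockwise on screen unless the sign of `vy` is flipped first.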

A two-dimensional neighborhood δ of size (*x*<sub>0</sub> ± ε, *y*<sub>0</sub> ± ε) is chosen, the same as the one used to construct the edges and weights of the repulsive force model. Within such a sub-image region, the particles move in different ways, so the direction of particle motion is distributed uncertainly over the eight direction ranks. Shannon entropy is a classical method to measure the uncertainty of information and is the basis of communication science [21–23]. In this paper, Shannon entropy is used to measure the uncertainty of the particle motion direction, that is, to describe how chaotic the crowd motion is. In a neighborhood δ, the direction rank *d* of each particle is obtained from Equation (15), and each direction rank occurs with a certain probability *p<sub>i</sub>* among all direction ranks in the neighborhood. According to the definition of Shannon entropy [21,23], the velocity direction entropy of the central particle *q*<sub>*x*0*y*0</sub> and the other particles *q<sub>xy</sub>* (*x* ≠ *x*<sub>0</sub>, *y* ≠ *y*<sub>0</sub>) neighboring it is calculated by the following formula:

$$H_{x_0 y_0} = -\sum_{i=1}^{n} p_i \log p_i, \qquad n = \varepsilon^2 \tag{16}$$
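As a small worked sketch of the entropy calculation, the function below takes the direction ranks collected in one neighborhood and estimates each probability as a relative frequency. The natural logarithm is assumed because the text does not state the base, and the function name `direction_entropy` is illustrative.

```python
import math
from collections import Counter


def direction_entropy(ranks):
    """Shannon entropy of the direction ranks found in one neighbourhood."""
    total = len(ranks)
    if total == 0:
        return 0.0
    counts = Counter(ranks)
    # p = count / total is the relative frequency of one direction rank;
    # ranks that never occur contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

With this definition, a neighborhood in which every particle shares one direction has zero entropy, and the entropy is largest when all eight directions are equally frequent.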

For each position, in the crowd particle field *Q* (*M, N*), the entropy can be calculated by repeating the steps mentioned above. Therefore, the direction entropy of each particle in the crowd particle field can be obtained. The two-dimensional crowd velocity vector field can be transformed into a particle direction entropy matrix:

$$H = \begin{bmatrix} H_{11} & H_{12} & \dots & H_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ H_{M1} & H_{M2} & \dots & H_{MN} \end{bmatrix} \tag{17}$$

where *H*<sub>11</sub>, *H*<sub>12</sub>, ..., *H<sub>MN</sub>* are the entropy values at the corresponding positions of the crowd particle field. In order to facilitate the optimization of the node strength field at a later stage, the direction entropy matrix is normalized as follows:

$$H' = \frac{H - H_{min}}{H_{max} - H_{min}} = \begin{bmatrix} H'_{11} & H'_{12} & \dots & H'_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ H'_{M1} & H'_{M2} & \dots & H'_{MN} \end{bmatrix} \tag{18}$$

where *H<sub>max</sub>* and *H<sub>min</sub>* are the maximum and minimum values in the direction entropy matrix.

#### *4.2. Optimizing the Node Strength Field*

The direction entropy matrix of crowd movement describes how strongly the direction of movement varies around each node, while the node strength field of the repulsive force network describes the degree of repulsion between each node and its surrounding nodes. In order to reduce the noise caused by other interfering motion, this paper combines these two models to optimize the node strength field. It is important to choose an effective way to integrate the two features, node strength and entropy. There are many ways to integrate features, such as multiplication and addition. For salient crowd motion detection, the fusion should express specific crowd motion regions prominently and adapt to changes of scene. We analyzed both features and found that the salient region can be detected by combining them through either multiplication or addition, but the salient region obtained by addition is more effective. Because the ranges of the two features are quite different and vary greatly between scenarios, it is difficult to determine combination weights for the raw values. Therefore, both features are normalized before being added together. Although the two features differ in dimension, the normalized features work well when integrated at the application level.

The direction entropy matrix of crowd motion is in one-to-one correspondence with the strength field of nodes; thus, the two are combined element by element according to the following formula:

$$P_{ij} = \begin{cases} S'_{ij} + H'_{ij}, & S'_{ij} \neq 0,\ H'_{ij} \neq 0\\ 0, & \text{otherwise} \end{cases} \tag{19}$$

The optimized node strength field is:

$$P = \begin{bmatrix} P_{11} & P_{12} & \dots & P_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ P_{M1} & P_{M2} & \dots & P_{MN} \end{bmatrix} \tag{20}$$
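The element-wise fusion of Equation (19) is short enough to sketch directly. The function name `fuse_fields` is illustrative, and both inputs are assumed to be already normalized fields of identical shape.

```python
def fuse_fields(s_norm, h_norm):
    """Add normalized node strength and direction entropy per element.

    An element is kept only where both features are non-zero;
    everywhere else the fused value is set to zero.
    """
    return [
        [s + h if (s != 0.0 and h != 0.0) else 0.0 for s, h in zip(s_row, h_row)]
        for s_row, h_row in zip(s_norm, h_norm)
    ]
```

Requiring both features to be non-zero acts as a logical AND on the two detections, which is how disturbances that appear in only one of the two fields are suppressed.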

The optimized node strength field is then normalized as follows:

$$P' = \frac{P - P_{min}}{P_{max} - P_{min}} = \begin{bmatrix} P'_{11} & P'_{12} & \dots & P'_{1N} \\ \vdots & \vdots & \ddots & \vdots \\ P'_{M1} & P'_{M2} & \dots & P'_{MN} \end{bmatrix} \tag{21}$$

After normalizing the strength field of the nodes, we smoothed the node strength field with a 3 × 3 mean filter template. It can eliminate the negative effects of the node strength caused by too high or too low values on the experimental results. In order to intuitively describe and observe the value of node strength, we use a pseudo-color image display method to visualize node strength. Pseudo-color image shows the pixel value corresponding to the node strength value. In a crowd scene, it is obvious that the node pixel values in salient regions are higher than those in other regions.
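The 3 × 3 mean filtering step could look like the following sketch. How the border is handled is not stated in the text; averaging over only those neighbors that lie inside the field is an assumption of this example, as is the function name `mean_filter_3x3`.

```python
def mean_filter_3x3(field):
    """Smooth a 2-D field (list of rows) with a 3 x 3 mean template."""
    rows, cols = len(field), len(field[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Collect the in-bounds cells of the 3 x 3 window around (x, y).
            window = [
                field[j][i]
                for j in range(max(0, y - 1), min(rows, y + 2))
                for i in range(max(0, x - 1), min(cols, x + 2))
            ]
            out[y][x] = sum(window) / len(window)
    return out
```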

#### **5. Experimental Results and Analysis**

In our experiments, we tested three crowded scene video sequences from the Crowd Saliency dataset [5] and a video sequence from [24] to show the performance of the proposed method. Retrograde and instability regions of a crowd were detected in the experiments. For the different crowded scenes, the scale of the velocity field *Q*(*M*, *N*) and the parameter ε (the size of the neighborhood) used in the experiments are shown in Table 1. The proposed method is effective for the images used in this experiment, which do not have a high resolution. For high-resolution images, there are two ways to process the data. One is to reduce the high-resolution image using interval sampling and local mean, and the other is to process the optical flow data by interval sampling.
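The two reduction strategies mentioned for high-resolution input, interval sampling and local mean, can be sketched on a plain 2-D grid. The function names and the `step` parameter are hypothetical, and the sketch operates on scalar values; a velocity field would be reduced once per component.

```python
def interval_sample(field, step):
    """Keep every step-th row and every step-th column."""
    return [row[::step] for row in field[::step]]


def local_mean(field, step):
    """Replace each non-overlapping step x step block by its mean.

    Trailing rows or columns that do not fill a whole block are dropped.
    """
    rows, cols = len(field), len(field[0])
    out = []
    for y in range(0, rows - step + 1, step):
        out_row = []
        for x in range(0, cols - step + 1, step):
            block = [
                field[j][i]
                for j in range(y, y + step)
                for i in range(x, x + step)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out
```

Interval sampling is cheaper but discards the skipped values entirely, whereas the local mean retains some of their information at the cost of smoothing.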


**Table 1.** Different scenes and parameter values.

#### *5.1. Crowd Retrograde Behavior Detection*

In this experiment, we used the train station scene and the single retrograde scene to show the salient motion detection for retrograde behavior. As shown in Figures 5 and 6, some pedestrians do not conform to the flow of the mainstream crowd, and a retrograde motion is formed instead. The particles thus have a larger repulsive force and direction entropy in this region. The proposed method can effectively detect human retrograde movement. Figures 5a and 6a show the original video frames. The node strength field calculated from the repulsive force network is shown in Figures 5b and 6b. It can be clearly seen that the regions with high node strength represent the retrograde motion. However, there are still some disturbances. As shown in Figures 5c and 6c, although the entropy value of the retrograde region is large, there are still some noise regions. Fortunately, the disturbance regions detected by node strength and by direction entropy are different. Therefore, we can optimize the saliency detection results by integrating node strength and direction entropy, as shown in Figures 5d and 6d. In order to illustrate the detection performance, we overlap the saliency detection results with the original video frames in Figures 5e and 6e. The experiments show that our method can detect pedestrians who move against the flow of the mainstream crowd.

**Figure 5.** Retrograde motion detection in train station scene: (**a**) input frame; (**b**) node strength field of repulsive force network; (**c**) detection result using direction entropy; (**d**) salient region detection after optimization; (**e**) salient region overlapped with the input frame.

**Figure 6.** Retrograde motion detection in single retrograde scene: (**a**) input frame; (**b**) node strength field of repulsive force network; (**c**) detection result using direction entropy; (**d**) salient region detection after optimization; (**e**) salient region overlapped with the input frame.

#### *5.2. Crowd Motion Instability Region Detection*

In the crowd surveillance system, the instability area of crowd movement often deserves attention. In this experiment, we used two scenes, including the marathon scene (Figure 7) and the pilgrimage scene (Figure 8) to show the performance of the proposed method for detecting the instability crowd motion.

The sample frames for the two scenarios are shown in Figures 7a and 8a. There are instability motion regions (some pedestrians are different from the mainstream crowd) in these two crowds. The results of node strength fields of two scenes are shown in Figures 7b and 8b, respectively. We can see that the node strength of the repulsive force model is larger in instability motion regions. The direction entropy fields of two scenarios are shown in Figures 7c and 8c, respectively. The entropy values of the instability region are clearly large. However, there is some noise in the unstable region detected by any single method. After integrating these two methods of node strength and direction entropy, the saliency detection results are optimized and the interference areas are effectively removed, which can be seen in Figures 7d and 8d. Figures 7e and 8e show saliency detection results after overlapping with the original video frame. Experimental results show that the proposed method can detect the salient crowd instability motion in large-scale crowded scenes.

**Figure 7.** Salient crowd instability motion detection in marathon scene: (**a**) original video frame and ground truth (red box); (**b**) node strength field of repulsive force network; (**c**) detection result using direction entropy; (**d**) salient region detection after optimization; (**e**) salient region overlapped with the original video frame.

**Figure 8.** Salient crowd instability motion detection in pilgrimage scene: (**a**) original video frame and ground truth (red box); (**b**) node strength field of repulsive force network; (**c**) detection result using direction entropy; (**d**) salient region detection after optimization; (**e**) salient region overlapped with the original video frame.

#### *5.3. Detection Results Using Different Neighborhood Size*

It is very important to select a suitable neighborhood size ε to construct the complex network. A neighborhood that is too small will not be sensitive to salient motion, while a neighborhood that is too large will introduce more noise. In this section, salient crowd motion is detected using different neighborhood sizes ε. For retrograde motion detection, the train station scene and the single retrograde scene are used to show the performance of the proposed method. From Figure 9, we can see that the salient motion region detected with a 5 × 5 neighborhood is slightly scattered, while the region detected with a 13 × 13 neighborhood is more complete. A larger neighborhood of 23 × 23 causes more noise in the detection results. In Figure 10, the salient motion region detected with the 5 × 5 neighborhood is small, while the region detected with the 15 × 15 neighborhood is more complete. With a larger neighborhood of 23 × 23, the result includes more noise. For instability motion detection, two scenes were used in this experiment. For the marathon scene, the detected region was usually not closed if the neighborhood was too small (5 × 5). With a neighborhood size of 23 × 23, noise interference is introduced, although closed salient motion regions can still be obtained. In our experiments, a closed salient motion region was obtained using a neighborhood size of 11 × 11 (Figure 11). For the pilgrimage scene, although the salient region can also be detected with a 5 × 5 neighborhood, the result obtained with a 15 × 15 neighborhood is closer to the ground truth (Figure 12).

**Figure 9.** Retrograde motion detection in train station scene using different neighborhood size: (**a**) original video frame; (**b**) detection result using 5 × 5 neighborhood; (**c**) detection result using 13 × 13 neighborhood; (**d**) detection result using 23 × 23 neighborhood.

**Figure 10.** Retrograde motion detection in single retrograde scene using different neighborhood size: (**a**) original video frame; (**b**) detection result using 5 × 5 neighborhood; (**c**) detection result using 15 × 15 neighborhood; (**d**) detection result using 23 × 23 neighborhood.

**Figure 11.** Instability motion detection in marathon scene using different neighborhood size: (**a**) original video frame and ground truth region; (**b**) detection result using 5 × 5 neighborhood; (**c**) detection result using 11 × 11 neighborhood; (**d**) detection result using 23 × 23 neighborhood.

**Figure 12.** Instability motion detection in pilgrimage scene using different neighborhood size: (**a**) original video frame and ground truth region; (**b**) detection result using 5 × 5 neighborhood; (**c**) detection result using 15 × 15 neighborhood; (**d**) detection result using 25 × 25 neighborhood.

#### *5.4. Performance Evaluation and Comparison*

The ground truth of crowd salient motion detection for the pilgrimage and marathon scenes is given in the Crowd Saliency dataset [5] as a rectangular area. In order to evaluate the performance of the proposed method, we calculate the minimum enclosing rectangle of the detected salient motion region. To evaluate the method quantitatively, two indicators (precision and recall) are calculated in our experiments. Precision, denoted *Pr*, is the ratio of the number of pixels in the detected region that belong to the ground truth to the number of pixels in the detected region; it indicates how accurate the detected local motion instability area is. Recall, denoted *R*, is the ratio of the number of pixels in the detected region that belong to the ground truth instability motion region to the number of all pixels of the ground truth [25]. The precision and recall are calculated as:

$$Pr = \frac{TP}{TP + FP} \tag{22}$$

$$R = \frac{TP}{TP + FN} \tag{23}$$

where *TP* is the number of pixels for which both the detection result and the ground truth are positive, *FP* is the number of pixels detected as positive but negative in the ground truth, and *FN* is the number of pixels detected as negative but positive in the ground truth. *TN*, the number of pixels for which both the detection result and the ground truth are negative, does not enter either measure.
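As an illustration of Equations (22) and (23), the following minimal sketch computes pixel-wise precision and recall from two binary masks. It is not the implementation used in the experiments: the function names are hypothetical, and the minimum enclosing rectangle is assumed to be axis-aligned, in line with the rectangular ground truth.

```python
import numpy as np


def enclosing_rectangle(mask):
    """Axis-aligned box (top, bottom, left, right), inclusive, around all True pixels.

    Assumption: "minimum enclosing rectangle" is read here as axis-aligned.
    Returns None when the mask contains no detected pixel.
    """
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(rows[-1]), int(cols[0]), int(cols[-1])


def rectangle_mask(shape, rect):
    """Binary mask of the given shape that is True inside the inclusive rectangle."""
    out = np.zeros(shape, dtype=bool)
    if rect is not None:
        top, bottom, left, right = rect
        out[top:bottom + 1, left:right + 1] = True
    return out


def precision_recall(detected, truth):
    """Pixel-wise precision (Eq. 22) and recall (Eq. 23) of two boolean masks."""
    tp = np.count_nonzero(detected & truth)    # detected and in ground truth
    fp = np.count_nonzero(detected & ~truth)   # detected but not in ground truth
    fn = np.count_nonzero(~detected & truth)   # in ground truth but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example on a 10 x 10 frame (illustrative values only, not data from the paper).
truth = rectangle_mask((10, 10), (2, 5, 2, 5))   # 4 x 4 ground-truth rectangle
raw = np.zeros((10, 10), dtype=bool)
raw[3, 3] = True                                 # two scattered detected pixels
raw[6, 6] = True
detected = rectangle_mask(raw.shape, enclosing_rectangle(raw))
pr, r = precision_recall(detected, truth)
print(pr, r)
```

In this toy case the enclosing rectangle covers 16 pixels, 9 of which fall inside the 16-pixel ground truth, so both measures equal 9/16. Returning zero when a denominator vanishes is a convention chosen for the sketch, not one stated in the paper.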

The precision and recall obtained for the pilgrimage and marathon scenes with different neighborhood sizes are given in Table 2. With the parameters selected in this paper, satisfactory detection accuracy is obtained, whereas a neighborhood that is too large or too small seriously degrades the detection accuracy. Figure 13 shows the detection results for the pilgrimage and marathon scenes using different methods. Both the proposed method and the method in [26] detect the salient region correctly; however, the rectangular region obtained by the proposed method is closer to the ground truth.


**Table 2.** The measurement of the accuracy of the detection results using different parameters.

**Figure 13.** Comparison of the proposed method with that of [26]: (**a**,**e**) ground truth of the marathon and pilgrimage scenes; (**b**,**f**) results obtained by our method; (**c**,**g**) results courtesy of reference [26]; (**d**,**h**) locally enlarged views of the results.

#### **6. Conclusions**

In this paper, we proposed a method for salient crowd motion detection based on direction entropy and a repulsive force network, focusing on how to detect salient regions in crowd movement effectively. Firstly, the frames of the crowd video sequence are processed by the optical flow algorithm to obtain the crowd velocity vector field. Secondly, the interaction force between two particles, calculated with the repulsive force model, is used to determine the relationship between them. From the resulting repulsive force network, the node strength of the weighted network is extracted as the feature parameter to construct a two-dimensional feature matrix. Finally, the direction entropy of the velocity vectors is combined with the feature matrix of the repulsive force network to detect the salient crowd motion. The experimental results on four crowd video sequences show that the proposed method can detect not only regions of retrograde behavior but also regions of unstable crowd movement in large-scale crowd scenes. In future work, we will focus on developing a method for adaptive selection of the threshold and the neighborhood size.

**Author Contributions:** X.Z. proposed the idea in this paper; X.Z., D.L., J.Z. and X.T. conceived and designed the experiments; D.L., J.Z. and Y.F. performed the experiments; X.Z., D.L. and H.Y. analyzed the data; X.Z. and D.L. wrote the paper; X.Z., Y.F., X.T. and H.Y. edited and reviewed the paper; all authors read and approved the final manuscript.

**Funding:** This research was supported by the National Natural Science Foundation of China (no. 61771418).

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**


© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

MDPI St. Alban-Anlage 66 4052 Basel Switzerland Tel. +41 61 683 77 34 Fax +41 61 302 89 18 www.mdpi.com

*Entropy* Editorial Office E-mail: entropy@mdpi.com www.mdpi.com/journal/entropy
