Swarm Intelligence Response Methods Based on Urban Crime Event Prediction

Wang, Changhao; Tian, Feng; Pan, Yan

doi:10.3390/electronics12224610


by Changhao Wang ¹, Feng Tian ¹ and Yan Pan ²,*

¹ School of Electronic Information and Artificial Intelligence, Shaanxi University of Science and Technology, Xi’an 710021, China

² Science and Technology on Information Systems Engineering Laboratory, National University of Defense Technology, Changsha 410073, China

* Author to whom correspondence should be addressed.

Electronics 2023, 12(22), 4610; https://doi.org/10.3390/electronics12224610

Submission received: 10 October 2023 / Revised: 3 November 2023 / Accepted: 6 November 2023 / Published: 11 November 2023

(This article belongs to the Special Issue AI in Disaster, Crisis, and Emergency Management)


Abstract:

Cities attract a large number of inhabitants due to their more advanced industrial and commercial sectors and more abundant and convenient living conditions. According to statistics, more than half of the world’s population resides in urban areas, contributing to the prosperity of cities. However, it also brings more crime risks to the city. Crime prediction based on spatiotemporal data, along with the implementation of multiple unmanned drone patrols and responses, can effectively reduce a city’s crime rate. This paper utilizes machine learning and data mining techniques, predicts crime incidents in small geographic areas with short timeframes, and proposes a random forest algorithm based on oversampling, which outperforms other prediction algorithms in terms of performance. The research results indicate that the random forest algorithm based on oversampling can effectively predict crimes with an accuracy rate of up to 95%, and an AUC value close to 0.99. Based on the crime prediction results, this paper proposes a multi-drone patrol response strategy to patrol and respond to predicted high-crime areas, which is based on target clustering and combined genetic algorithms. This strategy may help with the pre-warning patrol planning within an hourly range. This paper aims to combine crime event predictions with crowd-sourced cruise responses to proactively identify potential crimes, providing an effective solution to reduce urban crime rates.

Keywords:

random forest; patrol response; genetic algorithm; crime event prediction

1. Introduction

In recent years, with accelerating urbanization and improved living standards, cities have become hubs for very large populations. However, this has also brought about a series of security problems, leading to various types of crimes. According to the statistics from the United Nations Office on Drugs and Crime (UNODC), various types of crimes, including robbery, theft, and violent crimes, are on the rise worldwide [1]. Data released by the Chinese Ministry of Public Security show that overall crime cases in China have been on the rise in recent years, posing serious challenges to social order and public safety [2,3]. According to daily statistical reports from the China Emergency Service Network, in July 2018 alone, there were 21,426 reports of warning information and 5129 reports of emergency incidents nationwide [4]. Criminal events seriously affect economic development and social stability, posing a significant threat to the safety of people’s lives.

With the advancement of artificial intelligence and big data analytics technology, many scholars both domestically and internationally are using spatiotemporal data to predict occurrences of criminal events, providing effective means to reduce crime incidents. Using machine learning, researchers like Jenga [5] have assessed the latest technologies in crime prediction and proposed future research directions in the field of crime prediction. Zhang et al. [6], for instance, extracted patterns and features from historical crime data and improved the accuracy of LSTM models by incorporating built environment covariates. These achievements in crime event prediction have been substantial, but many have not addressed intelligent responses based on the prediction results.

Using crime prediction results as a base, designing real-time patrol routes for multiple drones to comprehensively cover high-crime areas can effectively prevent crime incidents and enable rapid responses when crimes occur, thereby reducing crime rates and losses. This paper utilizes machine learning and big data analytic techniques to propose a crowd-sourced response method based on crime event prediction for both proactive alerts and post-event responses. The main contributions of this work are as follows:

Crime events in short time frames and small areas are predicted using data mining techniques and various machine learning methods. The specific prediction process is described in Section 3, and the prediction results are described in Section 3.4.4. The oversampling-based random forest algorithm proposed in this paper outperforms other prediction algorithms. The results show that the oversampled random forest prediction has an accuracy of up to 95%, an AUC value close to 0.99, an F1-score of 0.94, and a recall of 0.95;
A drone patrol response strategy based on target clustering is designed on top of the preceding prediction results, as described in Section 4. Combined with a genetic algorithm, this strategy can be used to patrol and respond to high-crime areas predicted in advance. The experimental results can help patrol planning with area-wide pre-warning within one hour, providing an effective solution to reduce urban crime rates.

In this study, Figure 1 presents the proposed system structure, which consists of two parts: crime event prediction and the collective intelligence response. It includes a central database, a server, and drones equipped with various sensors (UAVS). The central database stores various feature data and prediction results. The server is the central processing entity in the system, responsible for data preprocessing, model computation, crime prediction, and the allocation of drones to patrolled areas. First, data are inputted into the model on the left, and the predicted output P is set to be the high-crime area in region $A_{n}$ at time period $T_{m}$, corresponding to $Targets_{i}$ in the circle on the right. Next, UAVS are clustered into $UAVS_{j}$, where i = j, with each UAV patrolling its corresponding targets. Finally, drones can collect data through sensors during patrolling and feed them back to the dataset before repeating the above process.

This paper aims to integrate crime event prediction with the collective intelligence response. It provides a comprehensive approach that can proactively identify potential crime events, optimize resource allocation, and plan response routes, thereby effectively addressing urban crime issues and enhancing public safety. This approach also offers valuable insights for predicting and responding to other emergent incidents, contributing to the field of smart disaster prevention in cities.

2. Relevant Work

2.1. Crime Event Prediction

Crime event prediction involves the utilization of techniques such as data analysis, statistical methods, and machine learning to forecast potential future criminal incidents by analyzing historical crime data along with spatiotemporal factors associated with criminal incidents.

Primarily, crime events are closely correlated with external factors and predicting them involves analyzing and modeling this relationship. In other words, it entails training a model that establishes a connection between crime events and external factors to enable the model to predict criminal incidents. For example, Catlett et al. [7] introduced a predictive approach based on a spatial analysis and autoregressive model. This method employs a density-sensitive clustering algorithm Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [8] and seasonal Autoregressive Integrated Moving Average (ARIMA) [9] models to automatically detect high-risk crime areas within a city and reliably forecast crime trends.

Furthermore, enhancing prediction accuracy can be achieved by analyzing the spatiotemporal characteristics of data. Yi et al. [10] proposed a Clustered Continuous Conditional Random Field (Clustered-CCRF) model that combines autoregressive temporal correlation and feature-based spatial correlation between regions. By utilizing a tree-based clustering algorithm, highly similar regions are identified, thereby, enhancing the performance of crime prediction models. Zhang et al. [11] developed an interpretable machine learning crime prediction model that combines Extreme Gradient Boosting (XGBoost) [12,13] and Shapley additive explanation (SHAP) [14] methods. This model elucidates the precise spatial variation of each variable, thereby, improving the accuracy and transparency of the crime prediction model. Hajela et al. [15] introduced a spatiotemporal crime prediction technique based on machine learning and a two-dimensional hotspot analysis. They employed a finer-grained partitioning method to capture the spatial distribution characteristics of crime and used multiple features and complex classification models to enhance prediction accuracy and robustness. These approaches have achieved certain breakthroughs in the realm of spatiotemporal features.

Additionally, in certain instances, different types of security events may exhibit evident or covert interactions. For instance, Ahsan et al. [16] proposed a machine learning-based approach to analyze and predict road traffic accident risks. They selected multiple attributes as input variables and utilized decision trees and random forest algorithms to predict accidents. Thakkar et al. [17] employed a random forest algorithm to predict fires, calculating the Pearson correlation coefficients between variables and using a correlation matrix to unveil relationships between different factors and the probability of fire occurrence. Their research provided a novel approach and reference for preventing and controlling fires and traffic accidents, concurrently offering valuable insights for the crime event prediction aspect of this paper.

The advantages and disadvantages of the aforementioned technologies are shown in Table 1.

Currently, both domestic and international research on crime event predictions primarily focuses on long-term and large-scale forecasting; yet, such coarse predictions fail to meet modern urban requirements for crime event responses. Therefore, this paper proposes a granular prediction algorithm based on spatiotemporal factors. By utilizing an oversampled random forest algorithm, it forecasts crime incidents in small geographical areas within short time periods for the subsequent collective intelligence responses.

2.2. Multi-Agent-Based Collective Intelligence Response

In recent years, intelligent agents represented by unmanned aerial vehicles (UAVs), as compared to ground vehicles, have gained widespread application in areas such as missing person searches, medical transports, and emergency communication due to their agility and efficiency [18,19,20,21,22,23,24]. Goodrich et al. [25] and Nakadai et al. [26] implemented camera-equipped and microphone-equipped intelligent collectives, respectively, and tested their performance in searching and rescuing tasks. Advanced planning and navigation algorithms are the key for intelligent collectives to accomplish emergency response tasks. Heintzman et al. [27] primarily focused on target-predictive motion models and investigated path planning for human-intelligent collective search and rescue movements. Wu et al. [28] utilized Markov models to address uncertainties in the deployment of intelligent collective-relief personnel. Liu et al. [29] and Huang et al. [30] employed road models to guide intelligent collectives in road anomaly detection and response. Ding et al. [31] improved the Particle Swarm Optimization (PSO) [32] algorithm and applied the enhanced Artificial Bee Colony-Particle Swarm Optimization (ABC-PSO) algorithm to solve task allocation problems, effectively addressing the reassignment of tasks in multi-agent emergency relief scenarios. Han et al. [33] introduced an optimized A* path planning algorithm, allocating intelligent collectives and rapidly devising optimal flight paths. Zhao et al. [34] proposed a unified framework for UAV-assisted disaster emergency networks and investigated optimization problems in three scenarios, including UAV trajectory and scheduling, transmitter–receiver design, multi-hop Device-to-Device (D2D) [35] communication, and multi-hop UAV relay. Jaradat et al. [36] employed a finite-state Q-Learning algorithm to enhance the efficiency of path planning in unknown environments. Liu et al. [37] proposed an improved reinforcement learning method, Neural Networks Heuristic Q-learning (NNH-QL), using Back Propagation (BP) [38] neural networks, enhancing the learning efficiency of the Q-learning algorithm with the neural network’s fitting and enabling the algorithm to operate effectively in larger environments. Zhao et al. [39] presented a UAV path planning algorithm based on deep multi-agent reinforcement learning, considering complex urban environments, flight time restrictions, wireless channel characteristics, and various scenario parameters, achieving generalization across diverse scenarios without the need for retraining or adaptation.

The advantages and disadvantages of the aforementioned technologies are shown in Table 2.

These studies collectively demonstrate significant advancements in intelligent collectives for emergency response tasks. Advanced planning and navigation algorithms empower them to efficiently tackle emergency communication and other tasks, offering robust support for societal safety and rescue efforts. However, most existing research predominantly focuses on post-incident responses, while this paper centers on proactive pre-warning patrols and post-emergency responses. By predicting potential events in advance, intelligent collectives can take preemptive measures before incidents occur, effectively managing crises and emergency situations. Urban-scale crime events possess highly dynamic spatiotemporal characteristics, providing intelligent collectives with the opportunity to forecast and optimize patrol strategies, facilitating both swift post-event responses and pre-event measures. This spatiotemporal dynamism presents intelligent collectives with more effective and flexible options for emergency responses.

3. Crime Event Prediction

3.1. Overview of Denver Crime Datasets

In this paper, a publicly available dataset from the city of Denver in the United States is used. It contains spatiotemporal information from 2016 to 2021. The raw data comprise 546,882 rows and 19 columns, with each row representing an individual event record and each column representing a distinct attribute of the event, as illustrated in Table 3.

3.2. Data Preprocessing

3.2.1. Attribute Reduction

The original dataset contains 19 attributes, some of which are irrelevant to this study. These redundant attributes are removed from the analysis.

3.2.2. Handling Missing Values

The original dataset comprises 546,882 records. Analysis shows that latitude and longitude exhibit the most severe instances of missing values, which account for approximately 0.8% of the total. As the proportion of missing values is relatively small, these instances were removed, leaving a final dataset of 505,285 records.

3.2.3. Feature Extraction

Temporal features such as “Year”, “Month”, “Weekday”, “Hour”, and “Minute”, which represent the incident occurrence time, are extracted from the “LAST_OCCURENCE_DAT” field. To facilitate the analysis of feature importance and model training, unique numerical identifiers are assigned to different blocks, months, and weekdays, for instance, {1: ‘Mon’, 2: ‘Tue’, …, 7: ‘Sun’}.
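The extraction step can be illustrated with a minimal sketch. The timestamp layout (`TIMESTAMP_FORMAT`) and the function name `extract_temporal_features` are assumptions made for illustration only; the actual export format of the field may differ.

```python
from datetime import datetime

# Hypothetical timestamp layout; the real format of the source field may differ.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def extract_temporal_features(timestamp: str) -> dict:
    """Split one occurrence timestamp into the numeric features named in the text."""
    moment = datetime.strptime(timestamp, TIMESTAMP_FORMAT)
    return {
        "Year": moment.year,
        "Month": moment.month,
        "Weekday": moment.isoweekday(),  # 1 = Monday ... 7 = Sunday
        "Hour": moment.hour,
        "Minute": moment.minute,
    }


features = extract_temporal_features("2020-08-14 17:35:00")
```

Using `isoweekday()` yields the 1 to 7 weekday coding directly, so no separate lookup table is needed for that feature.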

3.2.4. Spatial Division

This paper employs the United States National Grid (USNG) system to convert GPS coordinates into unique block identifiers serving as new input features. This feature is labeled as “Grid_3Km_no”, offering accuracy down to 10 m. The study divides the area into grid cells of 3 km × 3 km, resulting in a total of 84 grids. This division approach enhances the foundation for model training.
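The idea of mapping a coordinate to a 3 km × 3 km cell can be sketched as follows. This is not the USNG conversion used in the paper: it is a simplified stand-in based on a flat-earth approximation, and the origin coordinates, column count, and function name `grid_cell` are all hypothetical.

```python
import math

CELL_SIZE_KM = 3.0
KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude


def grid_cell(lat, lon, origin_lat, origin_lon, n_cols):
    """Return a row-major cell index for a point, relative to a south-west origin.

    Simplified stand-in for a USNG lookup: distances are approximated on a
    local plane, which is only reasonable over a city-sized area.
    """
    km_north = (lat - origin_lat) * KM_PER_DEG_LAT
    km_east = (lon - origin_lon) * KM_PER_DEG_LAT * math.cos(math.radians(origin_lat))
    row = int(km_north // CELL_SIZE_KM)
    col = int(km_east // CELL_SIZE_KM)
    return row * n_cols + col


# Hypothetical origin and grid width, chosen only for illustration.
ORIGIN = (39.60, -105.10)
cell_id = grid_cell(39.63, -105.10, ORIGIN[0], ORIGIN[1], n_cols=15)
```

Points south or west of the assumed origin would produce negative indices, so a real pipeline would need to validate coordinates against the study area first.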

3.3. Exploratory Analysis and Feature Selection

3.3.1. Temporal Correlation Analysis

In this section, an analysis of the number of crime events, which occurred during different time periods in Denver, is conducted, with different time intervals exhibiting distinct patterns in the distribution of crime events. As is shown in Figure 2a, there is a fluctuation in the number of crime events between 2020 and 2021, displaying an upward trend. This suggests that public security in Denver still requires improvement. Figure 2b illustrates the monthly distribution of crime events. The results reveal that August has the highest occurrence rate of crime events, followed by July and January, while February witnesses the lowest number of incidents. Figure 2c delves into the impact of dates within each month on the number of crime events. Notably, there is a dip in occurrences on the 31st day, as not all months have 31 days. On the contrary, a peak is observed on the 1st day, possibly attributed to New Year’s Day and its higher population density, which may lead to an abnormal increase in crime incidents.

As is depicted in Figure 2d and Figure 3, there are significant variations in the number of crime events during different time intervals throughout the day. On workdays, the occurrences of crime events peak between 12:00 and 18:00, with a smaller peak observed between 12:00 and 13:00. On weekends, the distribution trend of crime events is similar to that of the workdays. Before 5:00 AM, the total number of crime events on weekends tends to surpass that of the workdays.

The aforementioned analysis shows a close correlation between the occurrence of crime events and temporal factors. Consequently, features such as year (Year), month (Month), weekday (Week), and hour (Hour) are chosen as temporal features for the analysis and modeling, with an aim to achieve accurate crime event predictions and swift responses for an optimized resource allocation. This approach seeks to enhance public safety and effectively deal with security risks that arise during various time periods.

3.3.2. Spatial Correlation Analysis

In this section, we delve into the correlation between crime events and spatial factors. By means of a statistical analysis, the spatial distribution patterns of crime events can be derived, as is depicted in Figure 4. The entire city is partitioned into seven police districts, with each color representing the distribution of crime events in a specific district. A distinct difference in district sizes is evident from the figure. Specifically, District 3 has the largest area, followed by Districts 4 and 5, while District 6 encompasses the smallest area. One might expect that District 3, covering a relatively larger area, encompasses the most densely populated regions and various social activity hubs, which may result in a higher concentration of crime events. In contrast, District 6’s smaller area might indicate a more secluded region or a relatively sparser community population, potentially leading to fewer crime events and a comparatively safer environment.

However, reality does not always adhere to this notion. The size of a district does not necessarily correlate with the number of crime events, as is shown in Figure 5, which presents a heatmap of crime events in Denver. This heatmap illustrates the density of crime events within each police district, with brighter colors indicating a higher crime event density. Despite District 6 having the smallest area, it displays the highest crime density, indicating that it is a hotspot for criminal activity. This might be linked to factors such as population density, social activity hubs, or other variables that concentrate criminal incidents in this region. On the other hand, despite District 3’s area being the largest, its crime event density is relatively low. This suggests that while the district covers a larger region, it might include relatively safer communities or areas.

Taking into consideration the analysis from both Figure 4 and Figure 5, it can be concluded that the size of a police district does not directly determine the number of crime events. In fact, the spatial distribution of crime events is more influenced by internal social and environmental factors within each district. Therefore, when formulating crime prevention measures, it is crucial to consider both the district’s size and the hotspots of criminal activity within it. In this way, targeted measures can be adopted to enhance public safety effectively.

3.3.3. Feature Selection

Based on the aforementioned analysis, it is evident that the factors influencing crime primarily involve temporal and spatial aspects. To facilitate model computation and maximize the utilization of these attributes for predicting the probability of crime occurrence within a region, this paper has selected 14 attributes, as are illustrated in Table 4.

Among these attributes, “A_P_M”, “INCIDENT_ADDRESS”, “PRECINCT_ID”, and “NEIGHBORHOOD_ID” are nominal variables that must be converted into dummy variables before modeling. Thus, one-hot encoding is employed to convert these nominal variables into binary vectors, which allows the model to process them numerically.
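As a minimal illustration of one-hot encoding, the sketch below turns a nominal column into binary vectors. The function name `one_hot_encode` and the sample values are illustrative assumptions, not the paper's implementation.

```python
def one_hot_encode(values):
    """Map each nominal value to a binary vector with a single 1."""
    categories = sorted(set(values))
    index = {category: position for position, category in enumerate(categories)}
    vectors = []
    for value in values:
        vector = [0] * len(categories)
        vector[index[value]] = 1  # mark the column belonging to this category
        vectors.append(vector)
    return categories, vectors


# Toy column in the spirit of the "A_P_M" attribute (values are made up).
categories, vectors = one_hot_encode(["PM", "AM", "PM"])
```

One practical consequence is that high-cardinality attributes such as addresses produce very wide vectors, which is worth keeping in mind when sizing the feature matrix.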

3.4. Crime Event Prediction

This section focuses on predicting crime events. Considering both temporal and spatial factors, precise patterns and distribution characteristics of events are captured through the analysis of historical data and real-time information. This is achieved using an oversampling-based random forest algorithm to accurately predict crime events occurring within a short time frame in a small area.

3.4.1. Problem Description

Initially, the time series T is defined as a set comprising time points, denoted as $T = \{t_{0}, t_{1}, \dots, t_{m}\}$, where $m \in [0, 23]$. Here, m represents the division of a day into 24 equidistant time intervals, each spanning 1 h. This equidistant division of time intervals enhances the time series’ temporal relevance.

Subsequently, the spatial sequence A is defined as a collection composed of spatial regions, represented as $A = \{a_{0}, a_{1}, \dots, a_{n}\}$, where $n \in [0, k]$. In this context, n signifies the division of geographical latitude and longitude into distinct regions, with $a_{n}$ signifying the nth region. This grid-based partitioning facilitates the segmentation of geographical space into numerous smaller areas, facilitating the analysis of region-specific attributes and the spatial localization of crime incidents.

By merging the time series and spatial sequence, it becomes possible to conduct the analysis and prediction of crime incidents within specific time intervals and designated spatial regions.

Furthermore, the forecasting objective is as follows: the primary goal of this study is to predict the locations of crime incidents occurring within each hour, alongside estimating the probability of crime occurrences at each location. This can be represented in the following manner:

$$P = \Pr(IsCrime = 1 \mid T = t_{m}, A = a_{n}) \tag{1}$$

Equation (1) gives the probability of a crime incident occurring in the time interval [m, m + 1] at location $a_{n}$. Here, T represents the time series, $t_{m}$ denotes the mth hour, A signifies the spatial sequence, $a_{n}$ represents the nth region, and P represents the probability. The term “IsCrime = 1” within the formula signifies the occurrence of a crime incident at the specific location $a_{n}$ during the given time interval $t_{m}$.

3.4.2. Crime Prediction Model

The traditional Random Forest algorithm typically employs balanced sample sets during the training process of each decision tree. However, when dealing with imbalanced datasets, certain classes may have fewer samples, which can lead to a poorer performance of the model in predicting minority classes. Oversampling-based Random Forest algorithms can enhance the classification performance on imbalanced datasets, as the use of oversampling techniques helps balance the data distribution among different classes by replicating existing minority class samples or generating new synthetic samples. This, in turn, improves the model’s ability to learn from minority classes.

For each sample x in the minority class, calculate its Euclidean distance to all other samples in the minority class sample set to obtain its k nearest neighbors. Based on the imbalance ratio of the dataset, set a sampling ratio to determine the sampling multiplier N. For each minority class sample x, randomly select several samples from its k nearest neighbors, and denote a selected neighbor as $\tilde{x}$. For each randomly selected neighbor $\tilde{x}$, construct a new sample $x_{new}$ according to Formula (2).

$$x_{new} = x + rand(0,1) \times (\tilde{x} - x) \tag{2}$$
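The interpolation in Formula (2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `oversample_minority`, its parameters, and the toy points are hypothetical, and the neighbor search is a brute-force scan.

```python
import math
import random


def oversample_minority(minority, n_new_per_sample, k, seed=0):
    """Create synthetic minority samples by interpolating toward near neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest neighbors of x among the other minority samples
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(n_new_per_sample):
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # plays the role of rand(0, 1) in Formula (2)
            synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, neighbour)])
    return synthetic


toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = oversample_minority(toy_minority, n_new_per_sample=3, k=2)
```

Each synthetic point lies on the line segment between a minority sample and one of its neighbors, so the new samples stay inside the region already occupied by the minority class.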

The positive-to-negative sample ratio in this dataset is approximately 3:1, as illustrated in Figure 6. Negative samples are augmented to achieve a 1:1 positive-to-negative sample ratio.

To prevent overfitting, assess the model’s performance objectively, and ascertain its generalization capability, 5-fold cross-validation is employed. The original training dataset was evenly divided into five subsets. Each subset was used as the validation set in turn, while the remaining four subsets served as the training set. The model’s performance, as indicated in Table 5, demonstrates a notably high average accuracy of approximately 0.95 for the oversampled random forest model. After data preprocessing, the original dataset was randomized and split in an 8:2 ratio to create a training set and a test set.
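The fold rotation described above can be sketched as an index generator. The function name `k_fold_indices` and the sample count are illustrative assumptions, and model fitting is deliberately left out.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (training, validation) index lists, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal subsets
    for held_out in range(k):
        validation = folds[held_out]
        training = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield training, validation


splits = list(k_fold_indices(10, k=5))
```

Every sample appears in exactly one validation fold, so averaging the per-fold scores uses each record once for validation and four times for training.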

A random forest model with oversampling is used for crime prediction. Random forests utilize a decision tree approach, involving an optimal feature selection, with criteria such as the Gini Index and Information Gain [40]. Here, the Gini Index is used as the splitting criterion. This metric is highly sensitive in classification problems and effectively measures the impurity between different categories. As is shown in Table 6 through a simple validation, it can be observed that this metric results in a shorter training and prediction time. This makes it particularly advantageous to deal with large datasets, as it can significantly improve computational speed and reduce processing times.
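For reference, the Gini criterion for a node and for a candidate split can be computed as in the following sketch. The function names are illustrative, and the labels are toy values rather than data from the paper.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of one node: 1 minus the sum of squared class proportions."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def split_gini(left, right):
    """Size-weighted impurity of a split; lower values indicate a purer split."""
    total = len(left) + len(right)
    return (len(left) / total) * gini_impurity(left) + (len(right) / total) * gini_impurity(right)


mixed_node = gini_impurity([1, 1, 0, 0])
clean_split = split_gini([1, 1], [0, 0])
```

The criterion needs only counts and squares, with no logarithm, which is one plausible reason it tends to be cheaper to evaluate than Information Gain.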

3.4.3. Crime Prediction Process

As is shown in Figure 7, the crime event prediction process in this paper consists of the following steps:

Data Collection and Preprocessing. Collect historical crime data and other relevant information, and preprocess the data as necessary, including handling missing values, outlier treatment, and more.
Feature Engineering. Construct appropriate feature variables based on the data analysis’ results, which may include factors related to crime events such as time, location, etc.
Negative Sample Set Partitioning. Utilize feature recombination to reliably partition the negative sample set (samples where no crime occurred) to ensure the quality and effectiveness of the training set.
Training and Testing Set Splitting. Building upon the partitioned negative sample set, create training and testing sets for model training and evaluation.
Random Forest Model Construction. Select the random forest as the base classifier for learning due to its strong classification performance, ability to provide high accuracy and stable classification results, and effectiveness in handling high-dimensional data. The core pseudocode for this section is as shown in Algorithm 1, with specific parameter descriptions in Section 3.4.1.
Model Training. Train the random forest model using the training set, enabling the model to learn the characteristics and patterns of crime events.
Parameter Selection. Utilize 5-fold cross-validation and grid search techniques to select the optimal model parameters, further enhancing the model’s performance and generalization ability.
Model Evaluation and Prediction. Evaluate the well-trained random forest model using the testing set, calculate the metrics such as prediction accuracy, recall rate, etc., and employ the model to predict future crime events.

Algorithm 1: Crime Predict (Crime Datasets, Time, Area, Target = IsCrime)
INPUT: Crime Datasets (T, A, IsCrime)
OUTPUT: $P = \Pr(IsCrime = 1 \mid T = t_{m}, A = a_{n})$
$T = \{t_{0}, t_{1}, \dots, t_{m}\}$ ← define time series()	/* define time series */
$A = \{a_{0}, a_{1}, \dots, a_{n}\}$ ← define spatial sequence()	/* define spatial sequence */
merged data ← merge time and spatial (T, A)	/* merge time and spatial sequences */
FOR $t_{m}$ in T:	/* iterate over time codes */
    FOR $a_{n}$ in A:	/* iterate over region codes */
        P ← predict crime probability (merged data)	/* predict crime probability */
        IF $P(IsCrime) > 0.5$ OR IsCrime = 1:	/* crime has occurred */
            $P = \Pr(IsCrime = 1 \mid T = t_{m}, A = a_{n})$	/* probability of the crime */
        ELSE IF $P(IsCrime) < 0.5$ OR IsCrime = 0:	/* no crime has occurred */
            P = −1	/* output −1 */
    END FOR
END FOR	/* end the loops */
RETURN P	/* output result */
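The control flow of Algorithm 1 can be sketched in Python as below. This is an illustration under assumptions, not the authors' implementation: `predict_probability` is a hypothetical callable standing in for the trained classifier, and returning one value per (hour, region) pair in a dictionary is a design choice made here for clarity.

```python
def predict_crime(hours, regions, predict_probability, threshold=0.5):
    """Return, for each (hour, region) pair, the crime probability or -1.

    predict_probability is a hypothetical stand-in for the trained model;
    it must accept (hour, region) and return a value in [0, 1].
    """
    results = {}
    for hour in hours:            # iterate over time codes
        for region in regions:    # iterate over region codes
            probability = predict_probability(hour, region)
            # keep the probability for predicted crimes, otherwise output -1
            results[(hour, region)] = probability if probability > threshold else -1
    return results


def toy_model(hour, region):
    """Made-up predictor used only to exercise the loop."""
    return 0.9 if (hour, region) == (17, "a3") else 0.2


predictions = predict_crime(range(24), ["a1", "a3"], toy_model)
```

Keying the output by (hour, region) makes it straightforward to filter the pairs whose value is not −1 and pass them on as patrol targets.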

3.4.4. Crime Prediction Results

According to the prediction process shown in Figure 7, crime events were forecasted. The prediction models were evaluated using metrics such as accuracy, precision, recall, and F1-score, and were compared with algorithms including logistic regression, decision trees, Bayesian methods, random forests, and KNN [5,6,7]. Precision is the ratio of true positives (TP) to the sum of true positives (TP) and false positives (FP), recall is the ratio of true positives (TP) to the sum of true positives (TP) and false negatives (FN), and the F1-score is the harmonic mean of precision and recall. Accuracy refers to the percentage of correct predictions [5,13]. As is shown in Table 7, the oversampled random forest model utilized in this study exhibits a superior performance.
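The four metrics follow directly from the confusion-matrix counts, as the sketch below shows. The function name and the counts in the usage line are illustrative assumptions and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1-score, and accuracy from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Made-up counts, used only to show the calculation.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

The zero-denominator guards matter for very imbalanced folds, where a model may predict no positives at all.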

As is shown in Figure 8, the confusion matrices are presented for each crime prediction model. Clearly, the oversampled random forest model exhibits the highest proportion of TP (True Positives) and TN (True Negatives), indicating the best predictive performance. Therefore, for the remaining sections of this paper, we will use this model as the basis for predictions.

This section validates the algorithm’s performance by evaluating the model’s complexity. The experiments were conducted on a computer equipped with an Nvidia RTX 3080 graphics card and a dedicated 16-thread CPU. Table 8 presents the average accuracy, training time, testing set prediction time, and real-time simulation prediction time for each model over the 5-fold cross-validation. It can be observed that although the random forest model achieves the highest accuracy, it also exhibits relatively longer execution times, indirectly reflecting its higher computational complexity. However, in terms of real-time predictions, most models are completed within 10 s, achieving a nearly-real-time forecasting. In the future, we aim to further enhance the models, reduce the model’s complexity, and improve their prediction accuracy.

As depicted in Figure 9, the AUC value of the random forest curve approaches 0.99, which is the highest among the presented models. This signifies that, in terms of the AUC evaluation metric, the random forest model excels in the classification prediction performance.
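To make the AUC metric concrete, the sketch below uses its pairwise interpretation: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The function name and the toy scores are assumptions for illustration; the quadratic loop is suitable only for small examples.

```python
def auc_score(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Made-up labels and scores, used only to show the calculation.
toy_auc = auc_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])
```

An AUC close to 1 therefore means that almost every crime sample is scored above almost every non-crime sample, independent of the 0.5 decision threshold.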

Based on the analysis above, the model input data for 5 PM to 6 PM on a randomly selected day are used. As shown in Figure 10, each red dot represents a predicted crime event and includes attributes such as the time, location, and likelihood of the event occurring. The model input is updated every hour with information collected by various sensors, enabling real-time crime prediction. Target points are extracted from the prediction results and form the basis for multi-agent patrolling, which is elaborated in Section 4.

4. Multi-Drone Response Based on Crime Prediction

This section discusses the cruise response to the extracted target points. As depicted in Figure 10, each red point represents a target point for drone patrol. The swarm intelligence response can therefore be viewed as a multi-target response problem. As the number of targets increases and the coverage area expands, response times lengthen and the algorithmic calculations become more complex, which hampers the identification of an optimal response allocation scheme. To address these challenges, this paper proposes a drone response strategy based on target clustering, coupled with an enhanced genetic algorithm for simulating patrol responses.

4.1. Problem Description

The multi-drone cruise response strategy can be described as follows:

Given N predicted crime points as targets, employ the k-means algorithm to cluster the targets in the response area into M clusters. Consequently, the task of accessing N target points from M centers can be treated as a multi-center drone task allocation problem. Each center can dispatch a maximum of $K_m$ drones (m = 1, 2, …, M), and any drone associated with the centroid of a target can access that target point. The objective is to design a rational drone dispatch plan to minimize the drone flight paths while satisfying the following constraints:

Each target point must be accessed by only one drone, i.e., multiple drones cannot pass through a single target point simultaneously;
Drones must return to their original centers after visiting the target points, with the determined center as the starting point.

For the purpose of simplifying the model complexity, let the target points be encoded as {1, 2, 3, 4, 5, …, N}, and the center codes as {N + 1, N + 2, N + 3, …, N + M}. Define the variable $x_{ij}^{mk}$ to indicate whether drone k from the center m travels from target point i to target point j. If no travel occurs, $x_{ij}^{mk}$ is assigned a value of 0; otherwise, $x_{ij}^{mk}$ is assigned a value of 1, as shown in Equation (3):

$$x_{ij}^{mk} = \begin{cases} 1, & \text{if drone } k \text{ from center } m \text{ travels from } i \text{ to } j, \\ 0, & \text{otherwise.} \end{cases} \tag{3}$$

The total flight distance is denoted as totalDist, as shown in Equation (4), where $d_{ij}$ denotes the distance between points i and j:

$$\mathit{totalDist} = \sum_{i=1}^{N+M} \sum_{j=1}^{N+M} \sum_{m=1}^{M} \sum_{k=1}^{K_m} d_{ij} \, x_{ij}^{mk} \tag{4}$$
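Equation (4) can be evaluated directly once each drone's route is written as an ordered list of point codes, because the decision variable equals 1 exactly on consecutive pairs of a route. The sketch below assumes Euclidean distances and uses hypothetical coordinates; the function names are illustrative only.

```python
import math


def route_length(route, coords):
    """Length of one route, summed over consecutive point pairs (i, j)."""
    return sum(math.dist(coords[i], coords[j]) for i, j in zip(route, route[1:]))


def total_dist(routes, coords):
    """Total distance over all drones, i.e. the sum in Equation (4)."""
    return sum(route_length(route, coords) for route in routes)


# Hypothetical layout: point 0 is the center, points 1-3 are targets.
coords = {0: (0.0, 0.0), 1: (3.0, 0.0), 2: (3.0, 4.0), 3: (0.0, 4.0)}
routes = [[0, 1, 2, 0], [0, 3, 0]]   # two drones leaving from the same center
print(total_dist(routes, coords))
```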

The objective function aims to minimize the cost, where the cost is defined as the total distance multiplied by the distance weight, as expressed in Equation (5). The weights represent the probability of a crime event occurring at each target point.

$$\min \ \mathit{totalCost} = \mathit{totalDist} \times \mathit{Weight} \tag{5}$$

Among the constraints (6)–(9), Equation (6) stipulates that the starting and ending points of a drone’s cruise must be its associated center. Equations (7) and (8) denote that each target point should be visited by only one drone. Equation (9) ensures that drones cannot travel from one center to another center.

$$\sum_{j=1}^{N} x_{ij}^{mk} = \sum_{i=1}^{N} x_{ij}^{mk} \leq 1, \quad i = m \in \{N+1, N+2, \dots, N+M\}, \; k \in \{1, 2, \dots, K_m\} \tag{6}$$

$$\sum_{j=1}^{N+M} \sum_{m=1}^{M} \sum_{k=1}^{K_m} x_{ij}^{mk} = 1, \quad i \in \{1, 2, \dots, N\} \tag{7}$$

$$\sum_{i=1}^{N+M} \sum_{m=1}^{M} \sum_{k=1}^{K_m} x_{ij}^{mk} = 1, \quad j \in \{1, 2, \dots, N\} \tag{8}$$

$$\sum_{i=N+1}^{N+1} x_{ij}^{mk} = \sum_{j=N+1}^{N+M} x_{ij}^{mk} = 0, \quad i = m \in \{N+1, N+2, \dots, N+M\}, \; k \in \{1, 2, \dots, K_m\} \tag{9}$$
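A candidate dispatch plan can be checked against constraints (6)–(9) by counting the arcs each route uses. The following is a minimal sketch under the assumption that every route is a list of point codes that starts and ends at a center; the function name and the example data are hypothetical.

```python
from collections import Counter


def check_routes(routes, targets, centers):
    """Return a list of violated constraints; an empty list means feasible."""
    problems = []
    out_deg, in_deg = Counter(), Counter()
    for route in routes:
        # Constraint (6): a drone starts and ends at its own center.
        if route[0] not in centers or route[0] != route[-1]:
            problems.append(f"route {route} does not start and end at one center")
        for i, j in zip(route, route[1:]):
            # Constraint (9): no travel from one center to another.
            if i in centers and j in centers:
                problems.append(f"center-to-center arc {i}->{j}")
            out_deg[i] += 1
            in_deg[j] += 1
    for t in sorted(targets):
        # Constraints (7) and (8): exactly one departure and one arrival per target.
        if out_deg[t] != 1 or in_deg[t] != 1:
            problems.append(f"target {t} is not visited exactly once")
    return problems


# Hypothetical instance: N = 4 targets (codes 1-4), M = 1 center (code 5).
targets, centers = {1, 2, 3, 4}, {5}
feasible = check_routes([[5, 1, 2, 5], [5, 3, 4, 5]], targets, centers)
infeasible = check_routes([[5, 1, 2, 5], [5, 2, 3, 5]], targets, centers)
print(feasible)
print(infeasible)
```

In the second plan, target 2 is served by two drones and target 4 by none, so both are reported.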

4.2. Cruise Response Model

4.2.1. Target Selection Strategy

Initially, the k-means algorithm is employed to cluster targets within the response area based on the principle of minimizing distances. This is illustrated in Figure 11, where the right-side red section depicts the clustering of the target set into M clusters, each denoted as $\mathit{Cluster}_i$ (where i = 1, 2, 3, …, M). The left-side blue section represents the grouping of Unmanned Aerial Vehicles (UAVs), where drones are divided into M groups according to the same principle of an equal number of clusters and groups, each denoted as $\mathit{Group}_i$ (where i = 1, 2, 3, …, M). Subsequently, cluster allocation is performed.

Subsequently, following the strategy depicted in Figure 11, the predicted target points are clustered, resulting in the configuration in Figure 12. In this representation, red cross markers denote the centroids of each cluster, and the surrounding points of the same color signify multiple targets within each respective cluster.

Finally, an exemplary cruise response cluster is chosen from the clusters, as is illustrated in Figure 12. The cluster selected by the red circle is extracted, encompassing a total of 46 target points. However, it is important to note that some selected targets might appear duplicated due to multiple occurrences at certain locations. Yet, for the purpose of drone cruising, a single localization suffices, allowing for the removal of duplicates while still meeting the cruising criteria. This refinement leads to a reduced set of 36 unique target points, as demonstrated in Figure 13. The same procedure can be applied to other clusters, enabling concurrent responses.
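The target selection strategy (cluster the targets, pick one cluster, then drop repeated locations) can be sketched as follows. This is a plain Lloyd-style k-means written from scratch for illustration; it is not the implementation used in the experiments, and the coordinates, seed, and function name are hypothetical.

```python
import math
import random


def kmeans(points, m, max_iter=100, seed=0):
    """Basic k-means for 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialize from distinct locations so no two centroids coincide.
    centroids = rng.sample(sorted(set(points)), m)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(m), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break                      # assignments are stable: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(m):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels


# Hypothetical predicted targets; (1, 0) and (10, 10) occur twice.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (10.0, 10.0)]
centroids, labels = kmeans(points, m=2)

# Select the cluster containing the first point and remove duplicate locations.
cluster_a = [p for p, lab in zip(points, labels) if lab == labels[0]]
unique_targets = list(dict.fromkeys(cluster_a))
print(len(cluster_a), "targets,", len(unique_targets), "unique locations")
```

Using `dict.fromkeys` keeps the first occurrence of each location and preserves the original order, which mirrors the reduction from 46 targets to 36 unique points described above.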

4.2.2. Cruise Algorithm

This paper employs an enhanced genetic algorithm for the drone cruise response, as is depicted in Figure 14. The figure outlines the fundamental steps of the genetic algorithm process.

After finalizing the algorithm, improvements were made to the crossover operation. Unlike the traditional crossover approach, this paper introduces an enhanced crossover operation which is depicted in Figure 15. The specific steps are outlined as follows:

1. Set a mutation probability p.
2. For each parent chromosome in the population:
   - Randomly select a crossover point, G1, within the chromosome. Let us say G1 = 34.
   - Generate a random decimal number, R, between 0 and 1.
   - If R < p, go to Step 3. Otherwise, proceed to Step 4.
3. If R < p (mutation occurs):
   - Randomly select another point, G2, from the same individual’s chromosome.
   - Invert the segment between G1 and G2.
4. If R ≥ p (no mutation):
   - Select another individual, Parent B, randomly from the population.
   - Locate G1 = 34 within Parent B’s chromosome and identify the point before it as G3.
   - Invert the segment between G1 and G3 in the original parent chromosome.
5. Repeat the above steps for all parent chromosomes to generate the offspring population.

The pseudocode of the enhanced crossover operator (Algorithm 2) is as follows:

Algorithm 2: CrossoverIndividual(individual0, individual1, crossoverRate)
INPUT: individual0, individual1, crossoverRate
OUTPUT: newIndividual
D ← decision variable dimension
if rand > crossoverRate then
    r0, r1 ← two random integers between 1 and D
else
    r0 ← a random integer between 1 and D
    gene ← individual0[r0]                  /* the r0-th gene of individual0 */
    index ← find(individual1, gene)         /* location of gene in individual1 */
    if index < D then index ← index + 1     /* right neighbor */
    else index ← index − 1                  /* left neighbor */
    gene ← individual1[index]               /* the gene adjacent to the previous gene */
    r1 ← find(individual0, gene)            /* location of that gene in individual0 */
end if
if r0 > r1 then swap(r0, r1)
newIndividual ← individual0
newIndividual[r0..r1] ← reverse(individual0[r0..r1])    /* reverse the elements from r0 to r1 */
return newIndividual
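Algorithm 2 might be rendered in Python roughly as follows. This sketch makes three assumptions that the pseudocode does not state: indices are 0-based, the reversed segment includes both end points, and both parents are permutations of the same genes. The `rng` parameter is a hypothetical addition that lets a seeded generator be passed in for reproducibility.

```python
import random


def crossover_individual(individual0, individual1, crossover_rate, rng=random):
    """Inversion-based crossover following Algorithm 2 (0-based indices)."""
    d = len(individual0)
    if rng.random() > crossover_rate:
        # Plain inversion: two random cut points in the same parent.
        r0, r1 = rng.randrange(d), rng.randrange(d)
    else:
        # Guided inversion: use the neighbor of the chosen gene in the other parent.
        r0 = rng.randrange(d)
        gene = individual0[r0]
        index = individual1.index(gene)
        # Take the right neighbor, or the left one if the gene is last.
        index = index + 1 if index < d - 1 else index - 1
        r1 = individual0.index(individual1[index])
    if r0 > r1:
        r0, r1 = r1, r0
    child = list(individual0)
    # Reverse the segment from r0 to r1 (inclusive).
    child[r0:r1 + 1] = child[r0:r1 + 1][::-1]
    return child


# Hypothetical parents: two visiting orders over the same six targets.
parent_a = [1, 2, 3, 4, 5, 6]
parent_b = [3, 5, 2, 1, 6, 4]
child = crossover_individual(parent_a, parent_b, 0.9, rng=random.Random(42))
print(child)
```

Because the operator only reverses a segment of one parent, the child is always a valid permutation, so no repair step is needed to avoid duplicate or missing targets.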

4.2.3. Simulation Results

The cluster extracted in Figure 13 is used for the swarm response simulation. It contains 36 target points, numbered sequentially from 0 to 35. The experiment involves four drones, and the outcomes are presented in Table 9. Each row represents the response route of an individual drone, which starts from point 0, returns to point 0, and visits each of its target points only once.
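The routes reported in Table 9 can be checked mechanically against these requirements. The short script below parses the four waypoint strings exactly as listed in Table 9 and verifies that every route starts and ends at point 0 and that points 1 to 35 are each visited exactly once across the fleet; the variable names are illustrative.

```python
from collections import Counter

# Waypoint strings copied from Table 9.
table9 = {
    "Drone1": "0-10-30-2-34-12-32-8-20-6-5-24-22-31-25-0",
    "Drone2": "0-21-3-23-19-29-17-0",
    "Drone3": "0-15-11-4-13-14-18-0",
    "Drone4": "0-1-26-27-33-9-7-28-16-35-0",
}

routes = {name: [int(p) for p in path.split("-")] for name, path in table9.items()}

# Count visits to every point other than the start/end point 0.
visits = Counter(p for route in routes.values() for p in route[1:-1])

all_closed = all(route[0] == 0 and route[-1] == 0 for route in routes.values())
visited_once = all(visits[p] == 1 for p in range(1, 36))

for name, route in routes.items():
    print(name, "serves", len(route) - 2, "targets")
print("closed routes:", all_closed, "| each target visited once:", visited_once)
```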

Figure 16a illustrates the cruise trajectories from the experimental simulation. The four colors correspond to the four drones, whose trajectories neither overlap nor revisit target points, which avoids redundant flights and improves response efficiency. Figure 16b depicts the population evolution curve: the objective function gradually converges toward its optimal value as the iteration count increases and is nearly converged by 400 iterations.

In addition, the Ant Colony Optimization (ACO) algorithm is used for comparison, as shown in Table 10. With the same number of drones (four), the improved genetic algorithm requires fewer generations (326 versus 714) and yields a shorter flight distance (83.760 km versus 86.748 km) than ACO. These findings indicate the effectiveness of the approach and its potential for optimizing drone routing and responses in scenarios with similar parameters.

This article proposes a crowdsourced response method based on crime event prediction. First, Section 3 predicts crime hotspots using a random forest algorithm based on oversampling. Second, Section 4 designs a drone response strategy based on target clustering. This strategy clusters, segments, and extracts the predicted crime hotspots, addressing the high computational complexity of the multi-target response. Finally, an improved genetic algorithm is applied to the patrol response to obtain the optimal response allocation scheme. The specific research contributions are as follows:

High Accuracy: This method achieves a high accuracy of up to 95% in both the prediction and response, indicating its effectiveness in crime event predictions.
Prediction Granularity: The method can provide predictions on an hourly basis, enabling real-time prediction and warning responses, which are highly valuable for urban public safety management.
Intelligent Response Strategy: By using target clustering in the drone response strategy, responses to multiple targets can be effectively handled to reduce computational complexity and enhance response efficiency.
Genetic Algorithm Optimization: Combining genetic algorithms for optimizing patrol responses allows for finding the best response allocation scheme, reducing the number of iterations, and further improving the response effectiveness.
Providing Public Safety Insights: While perfect accuracy cannot be achieved, this method can offer valuable insights that contribute to urban public safety governance and help reduce crime rates.

This study obtained positive results toward real-time crime prediction and alert responses, and its predictions and patrol alerts offer valuable insights for urban public safety governance. In future work, deep learning and artificial intelligence techniques could improve the prediction algorithm, making it more adaptive and refined, which would help in recognizing new crime patterns and trends and in enabling timely decision-making. However, the article considers only a limited set of factors and therefore cannot achieve complete accuracy. So far, the method has been trained using specific attributes; obtaining more accurate predictions will require identifying additional crime attributes beyond those currently considered.

Author Contributions

Conceptualization, Y.P. and F.T.; methodology, Y.P.; software, F.T.; validation, F.T. and C.W.; formal analysis, C.W.; investigation, C.W.; resources, Y.P.; data curation, F.T.; writing—original draft preparation, F.T.; writing—review and editing, C.W.; visualization, F.T.; supervision, Y.P.; project administration, C.W.; funding acquisition, Y.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported in part by the National Natural Science Foundation of China under Grant No. 62102431, the National University of Defense Technology under Grant No. ZK21-32, and the Science and Technology on Information Systems Engineering Laboratory under Grant No. 6142101220209.

Data Availability Statement

The dataset pertaining to the City and County of Denver crime, available at https://www.denvergov.org/opendata/dataset/city-and-county-of-denver-crime, is publicly accessible.

Acknowledgments

The authors thank the editor, the anonymous reviewers, and all Intelligent Instrument Laboratory members for their insightful comments and feedback. The authors thank Yan Pan at the National University of Defense Technology for giving them suggestions.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Citaristi, I. United Nations Office on Drugs and Crime—UNODC. In The Europa Directory of International Organizations 2022; Routledge: New York, NY, USA, 2022; pp. 248–252.
2. Statistics on Major Violations and Crimes for the Third Quarter of 2021. The Police Association of China. Policing Stud. 2022, 93–96. Available online: http://www.tpaoc.org.cn/html/wenzhangxuandeng/2022/04/1663.html (accessed on 5 November 2023).
3. Lu, J.Q. Crime Statistics and Optimizing Crime Governance. Chin. Soc. Sci. 2021, 105–125+206–207.
4. China Emergency Service Network. Available online: http://www.52safety.com/yjfxbg/index.jhtml (accessed on 25 September 2018).
5. Jenga, K.; Catal, C.; Kar, G. Machine Learning in Crime Prediction. J. Ambient. Intell. Humaniz. Comput. 2023, 14, 2887–2913.
6. Zhang, X.; Liu, L.; Xiao, L.; Ji, J. Comparison of Machine Learning Algorithms for Predicting Crime Hotspots. IEEE Access 2020, 8, 181302–181310.
7. Catlett, C.; Cesario, E.; Talia, D.; Vinci, A. Spatio-Temporal Crime Predictions in Smart Cities: A Data-Driven Approach and Experiments. Pervasive Mob. Comput. 2019, 53, 62–74.
8. Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A density-based algorithm for discovering clusters in large spatial databases with noise. In KDD’96, Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; AAAI Press: Washington, DC, USA, 1996; Volume 96, pp. 226–231.
9. Catlett, C.; Cesario, E.; Talia, D.; Vinci, A. A data-driven approach for spatio-temporal crime predictions in smart cities. In Proceedings of the 2018 IEEE International Conference on Smart Computing, SMARTCOMP’18, Shanghai, China, 15–17 June 2018; pp. 17–24.
10. Yi, F.; Yu, Z.; Zhuang, F.; Zhang, X.; Xiong, H. An Integrated Model for Crime Prediction Using Temporal and Spatial Factors. In Proceedings of the 2018 IEEE International Conference on Data Mining (ICDM), Singapore, 17–20 November 2018; pp. 1386–1391.
11. Zhang, X.; Liu, L.; Lan, M.; Song, G.; Xiao, L.; Chen, J. Interpretable Machine Learning Models for Crime Prediction. Comput. Environ. Urban Syst. 2022, 94, 101789.
12. Ramraj, S.; Uzir, N.; Sunil, R.; Banerjee, S. Experimenting XGBoost algorithm for prediction and classification of different datasets. Int. J. Control. Theory Appl. 2016, 9, 651–662.
13. Mousa, S.R.; Bakhit, P.R.; Osman, O.A.; Ishak, S. A Comparative Analysis of Tree-Based Ensemble Methods for Detecting Imminent Lane Change Maneuvers in Connected Vehicle Environments. Transp. Res. Rec. J. Transp. Res. Board 2018, 2672, 268–279.
14. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems 30; NIPS: Long Beach, CA, USA, 2017.
15. Hajela, G.; Chawla, M.; Rasool, A. A Clustering Based Hotspot Identification Approach for Crime Prediction. Procedia Comput. Sci. 2020, 167, 1462–1470.
16. Ahsan, A.; Moon, N.N.; Sharmin, S.; Islam, M.M.; Hossain, R.A.; Nawshin, S. Machine Learning Approach to Predict Traffic Accident Occurrence in Bangladesh. In Proceedings of the 2021 IEEE International Conference on Biomedical Engineering, Computer and Information Technology for Health (BECITHCON), Dhaka, Bangladesh, 4–5 December 2021; pp. 30–33.
17. Thakkar, R.; Abhyankar, V.; Reddy, P.D.; Prakash, S. Environmental Fire Hazard Detection and Prediction Using Random Forest Algorithm. In Proceedings of the 2022 International Conference for Advancement in Technology (ICONAT), Goa, India, 21–22 January 2022; pp. 1–4.
18. Pan, Y.; Chen, Q.; Zhang, N.; Li, Z.; Zhu, T.; Han, Q. Extending delivery range and decelerating battery aging of logistics UAVs using public buses. IEEE Trans. Mob. Comput. 2022, 22, 5280–5295.
19. Pan, Y.; Li, S.; Chen, Q.; Zhang, N.; Cheng, T.; Li, Z.; Zhu, T. Efficient schedule of energy-constrained UAV using crowdsourced buses in last-mile parcel delivery. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2021, 5, 1–23.
20. Pan, Y.; Li, S.; Ning, Z.; Li, B.; Zhang, Q.; Zhu, T. auSense: Collaborative airspace sensing by commercial airplanes and unmanned aerial vehicles. IEEE Trans. Veh. Technol. 2020, 69, 5995–6010.
21. Pan, Y.; Li, S.; Li, B.; Bhargav, B.; Ning, Z.; Han, Q.; Zhu, T. When UAVs coexist with manned airplanes: Large-scale aerial network management using ADS-B. Trans. Emerg. Telecommun. Technol. 2019, 30, e3714.
22. Benarbia, T.; Kyamakya, K. A literature review of drone-based package delivery logistics systems and their implementation feasibility. Sustainability 2021, 14, 360.
23. Pan, Y.; Li, S.; Chang, J.L.; Yan, Y.; Xu, S.; An, Y.; Zhu, T. An unmanned aerial vehicle navigation mechanism with preserving privacy. In Proceedings of the ICC 2019—2019 IEEE International Conference on Communications (ICC), Changchun, China, 11–13 May 2019; IEEE: New York, NY, USA; pp. 1–6.
24. Pan, Y.; Li, S.; Zhang, X.; Liu, J.; Huang, Z.; Zhu, T. Directional monitoring of multiple moving targets by multiple unmanned aerial vehicles. In Proceedings of the GLOBECOM 2017—2017 IEEE Global Communications Conference, Singapore, 4–8 December 2017; IEEE: New York, NY, USA; pp. 1–6.
25. Goodrich, M.A.; Morse, B.S.; Gerhardt, D.; Cooper, J.L.; Quigley, M.; Adams, J.A.; Humphrey, C. Supporting Wilderness Search and Rescue Using a Camera-Equipped Mini UAV: UAV-Enabled WiSAR. J. Field Robot. 2008, 25, 89–110.
26. Nakadai, K.; Kumon, M.; Okuno, H.G.; Hoshiba, K.; Wakabayashi, M.; Washizaki, K.; Ishiki, T.; Gabriel, D.; Bando, Y.; Morito, T.; et al. Development of Microphone-Array-Embedded UAV for Search and Rescue Task. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Vancouver, BC, Canada, 24–28 September 2017; pp. 5985–5990.
27. Heintzman, L.; Hashimoto, A.; Abaid, N.; Williams, R.K. Anticipatory Planning and Dynamic Lost Person Models for Human-Robot Search and Rescue. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 8252–8258.
28. Wu, F.; Ramchurn, S.D.; Chen, X. Coordinating human-UAV teams in disaster response. In Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI), New York, NY, USA, 9–15 July 2016; pp. 524–530.
29. Liu, X.; Ma, J.; Chen, D.; Zhang, L.Y. Real-Time Unmanned Aerial Vehicle Cruise Route Optimization for Road Segment Surveillance Using Decomposition Algorithm. Robotica 2021, 39, 1007–1022.
30. Huang, H.; Savkin, A.V.; Huang, C. Decentralized Autonomous Navigation of a UAV Network for Road Traffic Monitoring. IEEE Trans. Aerosp. Electron. Syst. 2021, 57, 2558–2564.
31. Ding, Z.J. Research on Task Allocation Technology for Emergency Relief of Multiple Unmanned Aerial Vehicles in Urban Environment. Master’s Thesis, Nanjing University of Aeronautics and Astronautics, Nanjing, China, 2016.
32. Aje, O.; Anyandi, A.J. The particle swarm optimization (PSO) algorithm application—A review. Glob. J. Eng. Technol. Adv. 2020, 3, 001–006.
33. Han, X.W.; Han, Z.; Yue, G.F.; Cui, J.J. Path Planning Algorithm of Disaster Relief UAV Based on Optimized A*. Comput. Eng. Appl. 2021, 57, 232–238.
34. Zhao, N.; Lu, W.; Sheng, M.; Chen, Y.; Tang, J.; Yu, F.R.; Wong, K.-K. UAV-Assisted Emergency Networks in Disasters. IEEE Wirel. Commun. 2019, 26, 45–51.
35. Christy, R.P.E.; Astuti, B.; Syihabuddin, B.; Narottama, O.; Rhesa, F. Optimum UAV flying path for device-to-device communications in disaster area. In Proceedings of the 2017 International Conference on Signals and Systems (ICSigSys), Bali, Indonesia, 16–18 May 2017; pp. 318–322.
36. Kareem Jaradat, M.A.; Al-Rousan, M.; Quadan, L. Reinforcement Based Mobile Robot Navigation in Dynamic Environment. Robot. Comput. Integr. Manuf. 2011, 27, 135–149.
37. Liu, Z.B.; Zeng, X.Q.; Liu, H.Y.; Chu, R. A Heuristic Two-layer Reinforcement Learning Algorithm Based on BP Neural Networks. J. Comput. Res. Dev. 2015, 52, 579–587.
38. Cilimkovic, M. Neural Networks and Back Propagation Algorithm; Institute of Technology Blanchardstown: Dublin, Ireland, 2015; Volume 15.
39. Wei, Z.; Zhao, X. Multi-UAVs Cooperative Reconnaissance Task Allocation under Heterogeneous Target Values. IEEE Access 2022, 10, 70955–70963.
40. Suryakanthi, T. Evaluating the impact of GINI index and information gain on classification using decision tree classifier algorithm. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 612–619.

Figure 1. System structure proposed in this study.

Figure 2. Temporal correlation analysis. (a) The trend of crime numbers over the years; (b) The trend of crime numbers over the months; (c) The trend of crime numbers over the days; (d) The trend of crime numbers over the hours.

Figure 3. Incidence of crime by time of week.

Figure 4. Illustration of the distribution of crime incidents across police precincts.

Figure 5. Heatmap of crime incidents.

Figure 6. The oversampled dataset.

Figure 7. Process flow for crime event prediction.

Figure 8. The Confusion Matrices of the Six Models.

Figure 9. AUC curves for various models.

Figure 10. Visualization of the predictive results.

Figure 11. Depiction of the process of target clustering.

Figure 12. Display of the results of target clustering. Different colored dots represent different target clusters, and red cross symbols represent cluster centroids.

Figure 13. The selected target outcomes.

Figure 14. Outline of the basic process of the genetic algorithm.

Figure 15. Illustrations of the Enhanced Genetic Operations.

Figure 16. Cruise results. (a) Cruise trajectories of UAVs, where different colors represent different UAVs; (b) Population evolution curve, which shows the convergence of population evolution.

Table 1. Summary of Works Related to Crime Event Prediction.

Method/Model	Technical Advantages	Technical Disadvantages
Spatial analysis and ARIMA [7]	Automatic detection of high-risk crime areas using DBSCAN and ARIMA models	Requires appropriate parameter configuration
Clustered-CCRF model [10]	Combines temporal and spatial correlation, improving prediction performance	Requires an effective data clustering method
XGBoost and SHAP methods [11]	Enhanced model accuracy and transparency, explaining spatial variations	May require substantial computational resources
2D hotspot analysis and machine learning [15]	Improved prediction accuracy and robustness, fine-grained spatial partitioning	Complex classification models may require large datasets
Decision trees and random forest [16]	Effective prediction of road traffic accidents, multiple input attributes	More feature engineering and data preprocessing may be needed
Random forest algorithm [17]	Revealing correlations between different factors, providing fire prevention methods	Data quality and correlations may affect prediction accuracy

Table 2. Summary of Works Related to Collective Intelligence Response.

Approach/Algorithm	Technical Advantages	Technical Disadvantages
Camera-equipped intelligent collectives [25]	Enhanced situational awareness	Limited performance in complex tasks
Microphone-equipped intelligent collectives [26]	Enhanced audio-based search capabilities	Limited use in non-audio scenarios
Target-predictive motion models [27]	Improved path planning for collective	Complexity in modeling target motion
Markov models [28]	Addressing uncertainties effectively	Limited applicability in certain tasks
Road models [29]	Effective guidance for road-based tasks	Limited to tasks related to roads
Enhanced PSO (ABC-PSO) algorithm [31]	Improved task reassignment in emergencies	Potential complexity in parameter tuning
Optimized A* path planning algorithm [33]	Rapid task allocation and path planning	May not handle complex urban scenarios
Trajectory and scheduling optimization [34]	Comprehensive framework for UAV scenarios	May require complex optimization
Finite-state Q-Learning algorithm [36]	Enhanced path planning in unknown areas	Limited to simple Q-learning scenarios
Neural Networks Heuristic Q-learning [37]	Improved learning efficiency	Potential complexity in neural networks
Deep multi-agent reinforcement learning [39]	Generalization across diverse scenarios	Complex learning and adaptation process

Table 3. Attributes of the Crime Dataset.

Id	Columns
1	OFFENSE_ID
2	INCIDENT_ID
3	OFFENSE_CODE
4	OFFENSE_CODE_EXTENSION
5	OFFENSE_TYPE_ID
6	OFFENSE_CATEGORY_ID
7	FIRST_OCCURENCE_DATE
8	LAST_OCCURENCE_DATE
9	REPORTED_DATE
10	INCIDENT_ADDRESS
11	GEO_LON
12	GEO_LAT
13	GEO_X
14	GEO_Y
15	DISTRICT_ID
16	PRECINCT_ID
17	NEIGHBORHOOD_ID
18	IS_CRIME
19	IS_TRAFFIC

Table 4. Selected Features for the Analysis.

Factors	Feature
Temporal	Year
	Min
	Day
	Week
	Month
	AM/PM
Spatial	GEO_LON
	GEO_LAT
	PRECINCT_ID
	DISTRICT_ID
	INCIDENT_ADDRESS
	NEIGHBORHOOD_ID
	Gird_3Km_no
	IS_TRAFFIC

Table 5. 5-Fold Cross-Validation Average Accuracy Evaluation.

Model	Average Accuracy
Logistic Regression	0.6106596299259336
Bayesian Classifier	0.6690275878087176
KNN	0.6968863712611938
Decision Tree	0.9292069472760587
Random Forest	0.8563860906722042
Random Forest with Oversampling	0.9508477638132596

Table 6. Time and Accuracy of Different Metrics.

Criterion	Training Time/s	Prediction Time/s	Accuracy
Entropy	146.37401604652405	5.760926723480225	0.9504632377609745
Gini	140.28293132781982	5.735930585861206	0.9508309891868177

Table 7. Model Evaluation Metrics.

Model	Precision	Recall	F1-Score	Accuracy
Logistic Regression	0.62	0.61	0.61	0.61
Bayesian Classifier	0.67	0.67	0.67	0.67
KNN	0.70	0.70	0.69	0.70
Decision Tree	0.93	0.92	0.92	0.92
Random Forest	0.85	0.85	0.84	0.85
Random Forest with Oversampling	0.95	0.95	0.94	0.95

Table 8. The Time Complexity of the Model.

Model	Average Accuracy	Training Time/s	Testing Time/s	Actual Simulation Time/s
Logistic Regression	0.61	60.1	0.08567643165588379	0.000561516165111566
Bayesian Classifier	0.67	13.29	0.3382411003112793	0.003650665283203125
KNN	0.70	8511.59	1841.9281747341156	30.335577487945557
Decision Tree	0.92	92.37	0.11479330062866211	0.014016151428222656
Random Forest	0.85	740.73	6.163089036941528	0.024358510971069336
Random Forest with Oversampling	0.95	1203.10	7.163089036941528	0.10506129264831543

Table 9. Drone Cruise Waypoints.

Drone	Waypoints
Drone1	0-10-30-2-34-12-32-8-20-6-5-24-22-31-25-0
Drone2	0-21-3-23-19-29-17-0
Drone3	0-15-11-4-13-14-18-0
Drone4	0-1-26-27-33-9-7-28-16-35-0

Table 10. Comparative Simulation Results of Different Algorithms.

Algorithm	Flight Distance	UAV	Evolution Generations
Improved Genetic Algorithm (GA)	83.760 km	4	326
ACO	86.748 km	4	714

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
