**5. Conclusions**

In previous grid-based crime prediction studies, information from all cells was combined and used to train models on crimes in adjacent land. However, actual land is spatially continuous, and crime patterns and the environmental factors that affect them vary with the method of crime. This study proposed a spatial clustering technique to address this problem. The results showed that reflecting spatial continuity in crime prediction was effective in enhancing the model's performance. Moreover, the analysis of variable importance indicated that there were places where crimes were spatiotemporally concentrated. Among the time-related variables, more recent crimes are known to have a greater influence on future crimes. However, when the period set as a variable was too short, the variable had little influence on model training. Among the physical environment-related variables, the feature importance of restaurants and pubs was high, suggesting that spaces frequented by people in their daily lives are more closely related to crime. Further in-depth analysis of this relationship is therefore required.

Within the existing body of machine learning-based crime prediction research, this study is significant because it provides guidelines for future studies on applying spatial clustering in the crime prediction process and comparing the results across cluster configurations. Furthermore, by using a grid unit in the analysis, this study attempted to predict crime locations at a finer spatial scale. Once this study is supplemented and applied in practice, it can help improve the effectiveness of crime prevention by distributing crime prevention resources more efficiently.

This study has several limitations. First, while the physical and environmental factors described in environmental criminology are highly diverse, this study used only some of them as variables. Second, because machine learning algorithms focus on prediction and classification from the given data rather than on identifying correlations among variables, it is difficult to describe in detail the correlation between each variable and the prediction result. Third, this study performed spatial clustering based on the entire target area; future studies could consider a method that derives highly similar cells on a per-cell basis. Finally, this study is expected to serve as a basis for further studies that use various variables and conduct both regression and statistical analyses.

**Author Contributions:** Conceptualization, D.K. and S.J.; methodology, D.K. and Y.J.; software, D.K.; validation, S.J.; formal analysis, D.K., Y.J. and S.J.; investigation, D.K. and Y.J.; resources, Y.J. and S.J.; data curation, D.K. and Y.J.; writing—review and editing, D.K. and S.J.; visualization, D.K.; supervision, Y.J. and S.J.; project administration, S.J.; funding acquisition, S.J. All authors have read and agreed to the published version of the manuscript.

**Funding:** This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean Government (MSIT) (No. NRF-2018R1A2B2005528).

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Conflicts of Interest:** The authors declare no conflict of interest.
