SynthSecureNet: An Improved Deep Learning Architecture with Application to Intelligent Violence Detection
Abstract
1. Introduction
1.1. State-of-the-Art Overview
1.2. Proposed Contribution
2. Methodology
2.1. The Architecture
2.2. Combining ResNet and MobileNet
2.3. SynthSecureNet Architecture Network Development
- Random flipping;
- Random zooming;
- Random brightness adjustment;
- Random rotation.
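
The four augmentations listed above can be sketched in plain Python as follows. This is only an illustrative sketch, not the authors' pipeline: the function names, the parameter ranges (`max_delta`, `max_crop`), and the representation of a frame as a square list of lists with grayscale values in [0, 1] are all assumptions made for the example. Rotation is restricted to multiples of 90 degrees to keep the sketch dependency-free; a real pipeline would more likely use small arbitrary angles from an image library.

```python
import random


def random_flip(image, rng):
    """Mirror the frame horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return [row[:] for row in image]


def random_zoom(image, rng, max_crop=1):
    """Crop a random border (0..max_crop pixels) and resize back to the
    original size with nearest-neighbour sampling (hypothetical parameters)."""
    h, w = len(image), len(image[0])
    c = rng.randint(0, max_crop)
    if 2 * c >= min(h, w):
        c = 0  # the frame is too small to crop
    cropped = [row[c:w - c] for row in image[c:h - c]]
    ch, cw = len(cropped), len(cropped[0])
    return [[cropped[i * ch // h][j * cw // w] for j in range(w)]
            for i in range(h)]


def random_brightness(image, rng, max_delta=0.2):
    """Add one random offset to every pixel and clamp to [0, 1]."""
    delta = rng.uniform(-max_delta, max_delta)
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]


def random_rotation(image, rng):
    """Rotate by a random multiple of 90 degrees (a simplification)."""
    out = [row[:] for row in image]
    for _ in range(rng.randrange(4)):
        # Reverse the row order, then transpose: a 90-degree clockwise turn.
        out = [list(col) for col in zip(*out[::-1])]
    return out


def augment(image, seed=None):
    """Apply the four augmentations in sequence to one frame."""
    rng = random.Random(seed)
    for step in (random_flip, random_zoom, random_brightness, random_rotation):
        image = step(image, rng)
    return image
```

Calling `augment(frame, seed=0)` on a square frame returns a new frame of the same size and leaves the input untouched; passing a fixed seed makes the result reproducible.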
2.4. Ensemble Model (Benefits)
2.4.1. Better Predictive Performance
2.4.2. Less Overfitting
2.4.3. Increased Robustness
2.4.4. Dealing with Complex Relationships
2.4.5. Bias-Variance Reduction
2.4.6. Dealing with Data Diversity
2.4.7. Flexibility and Modularity
2.4.8. Parallel Processing
2.4.9. Interpretability and Insights
2.4.10. Robustness to Concept Drift
2.5. Block Diagram
2.6. Model Training Dataset
3. Experimental Results
3.1. Performance Metrics
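
The four metrics reported in the results tables (F1 score, recall, precision, accuracy) can be computed from predicted and true labels as in the minimal sketch below. It assumes binary labels with 1 denoting the violent class and uses the standard textbook definitions; the function name `binary_metrics` and the example labels are hypothetical, and the article may use a different averaging scheme.

```python
def binary_metrics(y_true, y_pred):
    """Return precision, recall, F1, and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up labels for illustration only (not taken from the article's data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

Because F1 is a harmonic mean, it always lies between precision and recall when all three are computed for the same class, which is a quick sanity check on any reported row.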
3.2. Experimental Procedure
3.3. Performance Evaluation
3.3.1. Dataset of 400 Videos
3.3.2. Dataset of 800 Videos
3.3.3. Dataset of 1200 Videos
3.3.4. Dataset of 1600 Videos
3.3.5. Dataset of 2000 Videos
3.4. Summary
4. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Results for the dataset of 400 videos (Section 3.3.1):

Model | F1 Score | Recall | Precision | Accuracy | Training Duration |
---|---|---|---|---|---|
SynthSecureNet | 96% | 96% | 96% | 96% | 15 min |
ResNetV2 | 91% | 88% | 89% | 89% | 9 min |
MobileNetV2 | 89% | 89% | 88% | 88% | 5 min |

Results for the dataset of 800 videos (Section 3.3.2):

Model | F1 Score | Recall | Precision | Accuracy | Training Duration |
---|---|---|---|---|---|
SynthSecureNet | 98% | 98% | 98% | 98% | 30 min |
ResNetV2 | 90% | 88% | 92% | 89% | 14 min |
MobileNetV2 | 89% | 89% | 89% | 89% | 9 min |

Results for the dataset of 1200 videos (Section 3.3.3):

Model | F1 Score | Recall | Precision | Accuracy | Training Duration |
---|---|---|---|---|---|
SynthSecureNet | 98.31% | 98.25% | 98.37% | 98.15% | 46 min |
ResNetV2 | 91% | 89% | 90% | 90% | 26 min |
MobileNetV2 | 91% | 92% | 90% | 90% | 19 min |

Results for the dataset of 1600 videos (Section 3.3.4):

Model | F1 Score | Recall | Precision | Accuracy | Training Duration |
---|---|---|---|---|---|
SynthSecureNet | 98.88% | 98.89% | 98.87% | 98.77% | 1 h 40 min |
ResNetV2 | 91% | 90% | 91% | 91% | 30 min |
MobileNetV2 | 91% | 91% | 90% | 90% | 1 h 11 min |

Results for the dataset of 2000 videos (Section 3.3.5):

Model | F1 Score | Recall | Precision | Accuracy | Training Duration |
---|---|---|---|---|---|
SynthSecureNet | 99.24% | 99.23% | 99.26% | 99.22% | 6 h 5 min |
ResNetV2 | 94% | 95% | 94% | 94% | 3 h 9 min |
MobileNetV2 | 91% | 91% | 90% | 90% | 2 h 30 min |

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zungu, N.; Olukanmi, P.; Bokoro, P. SynthSecureNet: An Improved Deep Learning Architecture with Application to Intelligent Violence Detection. Algorithms 2025, 18, 39. https://doi.org/10.3390/a18010039