LightAnomalyNet: A Lightweight Framework for Efficient Abnormal Behavior Detection
Abstract
1. Introduction
- The paper reviews recent proposals for abnormality detection and identifies a common set of anomalies found in uncrowded video scenes. Specifically, it focuses on scenes featuring only one or a few persons and involving falls, suspicious behavior (e.g., loitering, being in the wrong place (intrusions), or actions that deviate from learned behavior), or violence.
- By addressing more than one anomaly, the study combines the classification of several abnormal behaviors commonly found in uncrowded scenes. As reported in previous studies such as [10], the inherent similarities among the acts constituting different abnormal behaviors make their efficient joint detection (leading to an accurate classification into normal and abnormal behavior) a challenging problem, and hence a notable contribution of the current study.
- A dataset in the SG3I image format is provided, built from videos selected from publicly available datasets that are suitable for learning the behaviors considered in the study.
- The framework uses a deep learning architecture that employs an effective motion representation, thereby avoiding the computation of expensive optical flow.
- A lightweight CNN architecture is used that learns to classify the anomalies with high accuracy at low computational cost.
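The SG3I format referenced above (stacked grayscale 3-channel images, following Kim and Won [14]) encodes short-term motion without optical flow by placing three consecutive grayscale frames into the three channels of a single image, so that motion appears as inter-channel differences that a 2D CNN can learn from. The following is a minimal illustrative sketch of that idea; the paper's actual frame spacing, resizing, and other preprocessing may differ.

```python
import numpy as np

def make_sg3i(frames):
    """Build a stacked-grayscale 3-channel image (SG3I) from three
    consecutive grayscale frames: each frame becomes one channel,
    so short-term motion shows up as inter-channel differences."""
    assert len(frames) == 3, "SG3I stacks exactly three frames"
    f = [np.asarray(x, dtype=np.uint8) for x in frames]
    assert f[0].shape == f[1].shape == f[2].shape, "frames must share a size"
    return np.stack(f, axis=-1)  # H x W x 3

# Toy 4x4 "frames" with a single bright pixel moving to the right.
t0 = np.zeros((4, 4), dtype=np.uint8); t0[1, 1] = 255
t1 = np.zeros((4, 4), dtype=np.uint8); t1[1, 2] = 255
t2 = np.zeros((4, 4), dtype=np.uint8); t2[1, 3] = 255
sg3i = make_sg3i([t0, t1, t2])
print(sg3i.shape)  # (4, 4, 3)
```

A static scene yields three identical channels (a grayscale-looking image), while moving objects produce colored traces, which is what makes this a cheap stand-in for explicit motion features.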
2. Related Work
2.1. Traditional Methods
2.2. Deep Learning-Based Methods
3. The Proposed Framework
3.1. Input Images Generation
3.2. CNN Model Architecture
3.3. Model Training and Testing
4. Experiments
4.1. Datasets
4.2. Overall Performance Evaluation
4.3. Evaluation Based on Execution Time
4.4. Comparison with Other Networks
4.5. Comparison with the State-of-the-Art
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Khan, M.A.; Javed, K.; Khan, S.A.; Saba, T.; Habib, U.; Khan, J.A.; Abbasi, A.A. Human Action Recognition Using Fusion of Multiview and Deep Features: An Application to Video Surveillance. Multimed. Tools Appl. 2020, 79, 1–27.
- Khan, M.A.; Zhang, Y.D.; Khan, S.A.; Attique, M.; Rehman, A.; Seo, S. A Resource Conscious Human Action Recognition Framework Using 26-Layered Deep Convolutional Neural Network. Multimed. Tools Appl. 2021, 80, 35827–35849.
- Cheoi, K.J. Temporal Saliency-Based Suspicious Behavior Pattern Detection. Appl. Sci. 2020, 10, 1020.
- Harari, Y.; Shawen, N.; Mummidisetty, C.K.; Albert, M.V.; Kording, K.P.; Jayaraman, A. A Smartphone-Based Online System for Fall Detection with Alert Notifications and Contextual Information of Real-Life Falls. J. Neuro Eng. Rehabil. 2021, 18, 124.
- Vishnu, C.; Datla, R.; Roy, D.; Babu, S.; Mohan, C.K. Human Fall Detection in Surveillance Videos Using Fall Motion Vector Modeling. IEEE Sens. J. 2021, 21, 17162–17170.
- Yao, C.; Hu, J.; Min, W.; Deng, Z.; Zou, S.; Min, W. A Novel Real-Time Fall Detection Method Based on Head Segmentation and Convolutional Neural Network. J. Real-Time Image Process. 2020, 17, 1939–1949.
- Pan, J.; Liu, L.; Lin, M.; Luo, S.; Zhou, C.; Liao, H.; Wang, F. An Improved Two-Stream Inflated 3d Convnet for Abnormal Behavior Detection. Intell. Autom. Soft Comput. 2021, 30, 673–688.
- Rendón-Segador, F.J.; Álvarez-García, J.A.; Enríquez, F.; Deniz, O. ViolenceNet: Dense Multi-Head Self-Attention with Bidirectional Convolutional LSTM for Detecting Violence. Electronics 2021, 10, 1601.
- ben Mabrouk, A.; Zagrouba, E. Abnormal Behavior Recognition for Intelligent Video Surveillance Systems: A Review. Expert Syst. Appl. 2018, 91, 480–491.
- Mehmood, A. Abnormal Behavior Detection in Uncrowded Videos with Two-Stream 3D Convolutional Neural Networks. Appl. Sci. 2021, 11, 3523.
- Kim, D.; Kim, H.; Mok, Y.; Paik, J. Real-Time Surveillance System for Analyzing Abnormal Behavior of Pedestrians. Appl. Sci. 2021, 11, 6153.
- Sikdar, A.; Chowdhury, A.S. An Adaptive Training-Less Framework for Anomaly Detection in Crowd Scenes. Neurocomputing 2020, 415, 317–331.
- Asad, M.; Yang, J.; He, J.; Shamsolmoali, P.; He, X.J. Multi-Frame Feature-Fusion-Based Model for Violence Detection. Vis. Comput. 2020, 17, 1415–1431.
- Kim, J.; Won, C.S. Action Recognition in Videos Using Pre-Trained 2D Convolutional Neural Networks. IEEE Access 2020, 8, 60179–60188.
- Li, N.; Wu, X.; Xu, D.; Guo, H.; Feng, W. Spatio-Temporal Context Analysis within Video Volumes for Anomalous-Event Detection and Localization. Neurocomputing 2015, 155, 309–319.
- Hu, X.; Huang, Y.; Duan, Q.; Ci, W.; Dai, J.; Yang, H. Abnormal Event Detection in Crowded Scenes Using Histogram of Oriented Contextual Gradient Descriptor. Eurasip J. Adv. Signal Process. 2018, 2018, 54.
- Bansod, S.D.; Nandedkar, A.V. Crowd Anomaly Detection and Localization Using Histogram of Magnitude and Momentum. Vis. Comput. 2020, 36, 609–620.
- Zhang, X.; Ma, D.; Yu, H.; Huang, Y.; Howell, P.; Stevens, B. Scene Perception Guided Crowd Anomaly Detection. Neurocomputing 2020, 414, 291–302.
- Singh, G.; Kapoor, R.; Khosla, A. Optical Flow-Based Weighted Magnitude and Direction Histograms for the Detection of Abnormal Visual Events Using Combined Classifier. Int. J. Cogn. Inform. Nat. Intell. 2021, 15, 12–30.
- Brox, T.; Bruhn, A.; Papenberg, N.; Weickert, J. High Accuracy Optical Flow Estimation Based on a Theory for Warping. In Proceedings of the Computer Vision—ECCV 2004, Prague, Czech Republic, 11–14 May 2004; Pajdla, T., Matas, J., Eds.; Springer: Berlin, Heidelberg, 2004; pp. 25–36.
- Tran, D.; Bourdev, L.; Fergus, R.; Torresani, L.; Paluri, M. Learning Spatiotemporal Features with 3D Convolutional Networks. In Proceedings of the IEEE International Conference on Computer Vision 2015, Santiago, Chile, 7–13 December 2015; Volume 2015, pp. 4489–4497.
- Xie, S.; Sun, C.; Huang, J.; Tu, Z.; Murphy, K. Rethinking Spatiotemporal Feature Learning: Speed-Accuracy Trade-Offs in Video Classification. In Proceedings of the Computer Vision—ECCV 2018, Munich, Germany, 8–14 September 2018; Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y., Eds.; Springer International Publishing: Cham, Switzerland, 2018; pp. 318–335.
- Lapierre, N.; St-Arnaud, A.; Meunier, J.; Rousseau, J. Implementing an Intelligent Video Monitoring System to Detect Falls of Older Adults at Home: A Multiple Case Study. J. Enabling Technol. 2020, 14, 253–271.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Dai, W.; Chen, Y.; Huang, C.; Gao, M.K.; Zhang, X. Two-Stream Convolution Neural Network with Video-Stream for Action Recognition. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019; pp. 1–8.
- Ramya, P.; Rajeswari, R. Human Action Recognition Using Distance Transform and Entropy Based Features. Multimed. Tools Appl. 2021, 80, 8147–8173.
- Afza, F.; Khan, M.A.; Sharif, M.; Kadry, S.; Manogaran, G.; Saba, T.; Ashraf, I.; Damaševičius, R. A Framework of Human Action Recognition Using Length Control Features Fusion and Weighted Entropy-Variances Based Feature Selection. Image Vis. Comput. 2021, 106, 104090.
- Nasir, I.M.; Raza, M.; Shah, J.H.; Attique Khan, M.; Rehman, A. Human Action Recognition Using Machine Learning in Uncontrolled Environment. In Proceedings of the 2021 1st International Conference on Artificial Intelligence and Data Analytics, CAIDA 2021, Riyadh, Saudi Arabia, 6–7 April 2021; pp. 182–187.
- Kiran, S.; Khan, M.A.; Javed, M.Y.; Alhaisoni, M.; Tariq, U.; Nam, Y.; Damaševǐcius, R.; Sharif, M. Multi-Layered Deep Learning Features Fusion for Human Action Recognition. Comput. Mater. Contin. 2021, 69, 4061–4075.
- Khan, M.A.; Alhaisoni, M.; Armghan, A.; Alenezi, F.; Tariq, U.; Nam, Y.; Akram, T. Video Analytics Framework for Human Action Recognition. Comput. Mater. Contin. 2021, 68, 3841–3859.
- Rashid, M.; Khan, M.A.; Alhaisoni, M.; Wang, S.H.; Naqvi, S.R.; Rehman, A.; Saba, T. A Sustainable Deep Learning Framework for Object Recognition Using Multi-Layers Deep Features Fusion and Selection. Sustainability 2020, 12, 5037.
- Sharif, A.; Attique Khan, M.; Javed, K.; Gulfam Umer, H.; Iqbal, T.; Saba, T.; Ali, H.; Nisar, W. Intelligent Human Action Recognition: A Framework of Optimal Features Selection Based on Euclidean Distance and Strong Correlation. J. Control. Eng. Appl. Inform. 2019, 21, 3–11.
- Tsai, J.K.; Hsu, C.C.; Wang, W.Y.; Huang, S.K. Deep Learning-Based Real-Time Multiple-Person Action Recognition System. Sensors 2020, 20, 4758.
- Carreira, J.; Zisserman, A. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition, CVPR, Honolulu, HI, USA, 21–26 July 2017; pp. 4724–4733.
- Ionescu, R.T.; Khan, F.S.; Georgescu, M.I.; Shao, L. Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7834–7843.
- Smeureanu, S.; Ionescu, R.T.; Popescu, M.; Alexe, B. Deep Appearance Features for Abnormal Behavior Detection in Video. In Proceedings of the Image Analysis and Processing—ICIAP 2017, Catania, Italy, 11–15 September 2017; Battiato, S., Gallo, G., Schettini, R., Stanco, F., Eds.; Springer International Publishing: Cham, Switzerland, 2017; pp. 779–789.
- Zhang, X.; Zhang, Q.; Hu, S.; Guo, C.; Yu, H. Energy Level-Based Abnormal Crowd Behavior Detection. Sensors 2018, 18, 423.
- Mehmood, A. Efficient Anomaly Detection in Crowd Videos Using Pre-Trained 2D Convolutional Neural Networks. IEEE Access 2021, 9, 138283–138295.
- Bakalos, N.; Voulodimos, A.; Doulamis, N.; Doulamis, A.; Ostfeld, A.; Salomons, E.; Caubet, J.; Jimenez, V.; Li, P. Protecting Water Infrastructure from Cyber and Physical Threats: Using Multimodal Data Fusion and Adaptive Deep Learning to Monitor Critical Systems. IEEE Signal Process. Mag. 2019, 36, 36–48.
- Min, S.; Moon, J. Online Fall Detection Using Attended Memory Reference Network. In Proceedings of the 2021 3rd International Conference on Artificial Intelligence in Information and Communication, ICAIIC, Jeju Island, Korea, 13–16 April 2021; pp. 105–110.
- Zerrouki, N.; Houacine, A. Combined Curvelets and Hidden Markov Models for Human Fall Detection. Multimed. Tools Appl. 2018, 77, 6405–6424.
- Núñez-Marcos, A.; Azkune, G.; Arganda-Carreras, I. Vision-Based Fall Detection with Convolutional Neural Networks. Wirel. Commun. Mob. Comput. 2017, 2017, 9474806.
- Khraief, C.; Benzarti, F.; Amiri, H. Elderly Fall Detection Based on Multi-Stream Deep Convolutional Networks. Multimed. Tools Appl. 2020, 79, 19537–19560.
- Roman, D.G.C.; Chavez, G.C. Violence Detection and Localization in Surveillance Video. In Proceedings of the 2020 33rd SIBGRAPI Conference on Graphics, Patterns and Images, SIBGRAPI, Porto de Galinhas, Brazil, 7–10 November 2020; pp. 248–255.
- Ullah, F.U.M.; Obaidat, M.S.; Muhammad, K.; Ullah, A.; Baik, S.W.; Cuzzolin, F.; Rodrigues, J.J.P.C.; de Albuquerque, V.H.C. An Intelligent System for Complex Violence Pattern Analysis and Detection. Int. J. Intell. Syst. 2021, 36, 1–23.
- Ullah, W.; Ullah, A.; Haq, I.U.; Muhammad, K.; Sajjad, M.; Baik, S.W. CNN Features with Bi-Directional LSTM for Real-Time Anomaly Detection in Surveillance Networks. Multimed. Tools Appl. 2021, 80, 16979–16995.
- Ullah, A.; Muhammad, K.; Haydarov, K.; Haq, I.U.; Lee, M.; Baik, S.W. One-Shot Learning for Surveillance Anomaly Recognition Using Siamese 3D CNN. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, UK, 19–24 July 2020.
- Song, W.; Zhang, D.; Zhao, X.; Yu, J.; Zheng, R.; Wang, A. A Novel Violent Video Detection Scheme Based on Modified 3D Convolutional Neural Networks. IEEE Access 2019, 7, 39172–39179.
- Fang, M.T.; Przystupa, K.; Chen, Z.J.; Li, T.; Majka, M.; Kochan, O. Examination of Abnormal Behavior Detection Based on Improved YOLOv3. Electronics 2021, 10, 197.
- Sha, L.; Zhiwen, Y.; Kan, X.; Jinli, Z.; Honggang, D. An Improved Two-Stream CNN Method for Abnormal Behavior Detection. J. Phys. Conf. Ser. 2020, 1617, 012064.
- Chriki, A.; Touati, H.; Snoussi, H.; Kamoun, F. Deep Learning and Handcrafted Features for One-Class Anomaly Detection in UAV Video. Multimed. Tools Appl. 2021, 80, 2599–2620.
- Bilen, H.; Fernando, B.; Gavves, E.; Vedaldi, A.; Gould, S. Dynamic Image Networks for Action Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3034–3042.
- Li, D.; Zhang, Z.; Yu, K.; Huang, K.; Tan, T. ISEE: An Intelligent Scene Exploration and Evaluation Platform for Large-Scale Visual Surveillance. IEEE Trans. Parallel Distrib. Syst. 2019, 30, 2743–2758.
- Tang, Y.; Ma, L.; Zhou, L. Hallucinating Optical Flow Features for Video Classification. arXiv 2019, arXiv:1905.11799.
- Crasto, N.; Weinzaepfel, P.; Alahari, K.; Schmid, C. MARS: Motion-Augmented RGB Stream for Action Recognition. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7874–7883.
- Ji, S.; Xu, W.; Yang, M.; Yu, K. 3D Convolutional Neural Networks for Human Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 221–231.
- Varol, G.; Laptev, I.; Schmid, C. Long-Term Temporal Convolutions for Action Recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 1510–1517.
- Karpathy, A.; Toderici, G.; Shetty, S.; Leung, T.; Sukthankar, R.; Li, F.F. Large-Scale Video Classification with Convolutional Neural Networks. In Proceedings of the 2014 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1725–1732.
- Hara, K.; Kataoka, H.; Satoh, Y. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6546–6555.
- Qiu, Z.; Yao, T.; Mei, T. Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks. In Proceedings of the 2017 IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017.
- Direkoglu, C. Abnormal Crowd Behavior Detection Using Motion Information Images and Convolutional Neural Networks. IEEE Access 2020, 8, 80408–80416.
- Takahashi, R.; Matsubara, T.; Uehara, K. Data Augmentation Using Random Image Cropping and Patching for Deep CNNs. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 2917–2931.
- Kepski, M.; Kwolek, B. Fall Detection on Embedded Platform Using Kinect and Wireless Accelerometer. In Computers Helping People with Special Needs; Springer: Berlin, Heidelberg, 2012; pp. 407–414.
- Lu, C.; Shi, J.; Jia, J. Abnormal Event Detection at 150 FPS in MATLAB. In Proceedings of the 2013 IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 2720–2727.
- Bonetto, M.; Korshunov, P.; Ramponi, G.; Ebrahimi, T. Privacy in Mini-Drone Based Video Surveillance. In Proceedings of the 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), Ljubljana, Slovenia, 3–8 May 2015; Volume 4, pp. 1–6.
- Bermejo Nievas, E.; Deniz Suarez, O.; Bueno García, G.; Sukthankar, R. Violence Detection in Video Using Computer Vision Techniques. Comput. Anal. Images Patterns 2011, 6855, 332–339.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception Architecture for Computer Vision. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Huang, G.; Liu, Z.; van der Maaten, L.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2261–2269.
Reference | Data Used | Feature/Model | Type(s) of Anomaly Detected | Dataset(s)
---|---|---|---|---
Traditional Methods | | | |
Harari et al. [4] | Accelerometer data, gyroscope signals | Acceleration threshold, logistic regression-based classifier | Falling | Self-collected
Vishnu et al. [5] | RGB | GMM, FMMM, fall motion vector | Falling | UR Fall Detection, Montreal
Min and Moon [41] | RGB | Embedding module, attended memory module | Falling | AI Hub DS
Zerrouki and Houacine [42] | RGB | Curvelet transforms, area ratios features, SVM-HMM | Falling | UR Fall Detection
Cheoi [3] | Optical flow | Optical flow, temporal saliency map | Falling, violence, suspicious | UMN, Avenue, Self-collected from CCTV footage
Kim et al. [11] | RGB | Object detection, YOLOv4 | Falling, intrusion, loitering, violence | Korea Internet & Security DS
Deep Learning-Based Methods | | | |
Nunez et al. [43] | RGB, optical flow | 2D-CNN | Falling | UR Fall Detection, Multicam, FDD
Yao et al. [6] | RGB | GMM, 2D-CNN | Falling | Self-collected
Khraief et al. [44] | RGB, depth images | Multi-stream CNN | Falling | Self-collected, UR Fall Detection, FDD
Pan et al. [7] | RGB, optical flow | 3D-CNN | Violence | UCF-Crime, UCF-101
Roman and Chavez [45] | RGB | CNN | Violence | Hockey Fights, Violent Flows, UCFCrime2Local
Rendón-Segador et al. [8] | RGB, optical flow | Multi-head self-attention, bidirectional convolutional LSTM | Violence | Hockey Fights, Movies, Violent Flows, Real Life Violence Situations
Ullah et al. [46] | RGB, optical flow | CNN | Violence | Hockey Fights, Violent Flows, Surveillance Fight
Asad et al. [13] | RGB | Feature fusion, 2D-CNN, LSTM | Violence | Hockey Fights, Movies, Violent Flows, BEHAVE
Ullah et al. [47] | RGB | Spatiotemporal features, CNN, bidirectional convolutional LSTM | Violence | UCF-Crime, UCFCrime2Local
Ullah et al. [48] | RGB | 3D-CNN | Violence | UCF-Crime
Song et al. [49] | RGB | Key frame sampling, 3D-CNN | Violence | Hockey Fights, Movies, Violent Flows
Fang et al. [50] | RGB | CNN, YOLOv3 | Suspicious | Self-collected
Sha et al. [51] | RGB, optical flow | Two-stream 2D-CNN | Suspicious | Self-collected
Chriki et al. [52] | RGB | HOG, HOG3D, CNN | Suspicious | Mini-Drone Video Dataset
Mehmood [10] | RGB, optical flow | Two-stream 3D-CNN | Falling, loitering, violence | UFLV
Dataset | # Video Samples Used | Frame Rate | Resolution | Anomalous # Samples | Anomalous # Sequences | Anomalous # Frames | Non-Anomalous # Samples | Non-Anomalous # Sequences | Non-Anomalous # Frames
---|---|---|---|---|---|---|---|---|---
UR Fall * | 48 | 30 | | 24 | 24 | 720 | 24 | 250 | 7500
Avenue | 37 | 25 | | 18 | 57 | 3750 | 19 | 238 | 10,350
Mini-Drone Video | 38 | 30 | | 24 | 43 | 6380 | 10 | 24 | 2925
Hockey Fights ** | 70 | 25 | | 35 | 35 | 875 | 35 | 35 | 875
Metric | UR Fall | Avenue | Mini-Drone Video | Hockey Fights
---|---|---|---|---
Recall | 0.9892 | 0.9569 | 0.9659 | 0.9981 |
FP Rate | 0.0121 | 0.0513 | 0.0497 | 0.0034 |
Precision | 0.9879 | 0.9491 | 0.9511 | 0.9966 |
Accuracy | 0.9886 | 0.9528 | 0.9581 | 0.9974 |
F1 | 0.9886 | 0.9530 | 0.9584 | 0.9974 |
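The per-dataset metrics above follow directly from confusion-matrix counts. As a sanity check on the definitions (the counts below are made up for illustration, not taken from the paper):

```python
def metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion counts:
    recall (TPR), false-positive rate, precision, accuracy, and F1."""
    recall = tp / (tp + fn)
    fp_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, fp_rate, precision, accuracy, f1

# Hypothetical counts for a balanced 200-sample test split.
r, fpr, p, a, f1 = metrics(tp=98, fp=1, tn=99, fn=2)
print(round(r, 2), round(p, 2))  # 0.98 0.99
```

Note that with anomalous samples as the positive class, a low FP rate paired with high recall is what distinguishes the Hockey Fights column from the harder Avenue and Mini-Drone Video columns.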
Accuracy (%) over 100 iterations:

Dataset | Minimum | Average | Maximum | Standard Deviation | Standard Error | MoE
---|---|---|---|---|---|---
UR Fall | 97.01 | 98.06 | 98.88 | 0.5574 | 0.0258 | 0.1098
Avenue | 93.06 | 94.21 | 95.54 | 0.7432 | 0.0342 | 0.1464
Mini-Drone Video | 93.09 | 94.24 | 95.83 | 0.7643 | 0.0435 | 0.1506
Hockey Fights | 98.76 | 99.34 | 99.92 | 0.3247 | 0.0147 | 0.0640
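The summary statistics in this table can be computed as follows; this sketch uses the conventional definitions (sample standard deviation, standard error as std/sqrt(n), and a z-based margin of error with z = 1.96 for ~95% confidence), which may differ from the exact convention used in the paper.

```python
import math

def run_statistics(accuracies, z=1.96):
    """Summary statistics over repeated training/evaluation runs:
    min, mean, max, sample standard deviation, standard error,
    and a z-based margin of error."""
    n = len(accuracies)
    mean = sum(accuracies) / n
    var = sum((a - mean) ** 2 for a in accuracies) / (n - 1)  # sample variance
    std = math.sqrt(var)
    se = std / math.sqrt(n)
    return min(accuracies), mean, max(accuracies), std, se, z * se

# Hypothetical accuracies from four runs (the paper uses 100 iterations).
lo, mean, hi, std, se, moe = run_statistics([97.0, 98.0, 98.5, 99.0])
print(round(mean, 2))  # 98.12
```

Reporting min/max alongside the margin of error is useful here because it shows that even the worst run stays close to the average, i.e., the accuracy figures are stable across random initializations.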
Dataset | Optical Flow | Dynamic Image | SG3I
---|---|---|---
UR Fall | 16.59 | 175.10 | 719.61
Avenue | 15.93 | 184.65 | 776.12
Mini-Drone Video | 16.09 | 189.14 | 789.36
Hockey Fights | 16.77 | 177.85 | 745.70
Network | No. of Learnable Parameters | Size (MB) | Time per Inference Step (ms, CPU) | Time per Inference Step (ms, GPU) | UR Fall Accuracy (%) | Avenue Accuracy (%) | Mini-Drone Video Accuracy (%) | Hockey Fights Accuracy (%)
---|---|---|---|---|---|---|---|---
ResNet-50 + SG3I | 25M+ | 106 | 698.40 | 45.50 | 97.92 | 95.78 | 95.18 | 99.78 |
Inception-V3 + SG3I | 23M+ | 101 | 507.00 | 68.60 | 98.89 | 95.17 | 95.86 | 99.71 |
DenseNet-250 + SG3I | 15M+ | 93 | 1526.88 | 66.70 | 97.21 | 94.91 | 95.66 | 99.08 |
LightAnomalyNet | 7154 | 14 | 278.45 | 23.05 | 98.86 | 95.28 | 95.81 | 99.74 |
UR Fall Dataset:

Method | AUC (%) | Recall (%) | Precision (%) | Accuracy (%)
---|---|---|---|---
Vishnu et al. [5] | - | 97.5 | 96.9 | - |
Zerrouki and Houacine [42] | - | - | - | 97.0 |
Nunez et al. [43] | - | 100.0 | - | 95.0 |
Khraief et al. [44] | - | 100.0 | 95.0 | - |
LightAnomalyNet | 98.71 | 98.92 | 98.79 | 98.86 |
Method | Avenue AUC (%) | Avenue Recall (%) | Avenue Precision (%) | Avenue Accuracy (%) | Mini-Drone Video AUC (%) | Mini-Drone Video Recall (%) | Mini-Drone Video Precision (%) | Mini-Drone Video Accuracy (%)
---|---|---|---|---|---|---|---|---
Cheoi [3] | - | 94.5 | 93.2 | 90.1 | - | - | - | - |
Chriki et al. [52] | - | - | - | - | - | 100.0 | 88.37 | 93.57 |
LightAnomalyNet | 94.97 | 95.69 | 94.91 | 95.28 | 96.11 | 96.59 | 95.11 | 95.81 |
© 2021 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Mehmood, A. LightAnomalyNet: A Lightweight Framework for Efficient Abnormal Behavior Detection. Sensors 2021, 21, 8501. https://doi.org/10.3390/s21248501