Machine Learning for Human Activity Recognition: State-of-the-Art Techniques and Emerging Trends
Abstract
1. Introduction
2. Review Methodology
2.1. Eligibility Criteria
2.1.1. Inclusion Criteria
- The article either investigates or proposes methods for HAR.
- It uses sensor data relevant to HAR, such as RGB, depth, infrared, motion capture devices, skeleton joint positions, wearable devices, acoustic sensors, radar, WiFi, LiDAR, proximity sensors, or combinations of these.
- It applies machine learning or deep learning techniques for activity classification.
- The article is published in English in a peer-reviewed journal or conference proceeding.
2.1.2. Exclusion Criteria
- The article focuses on sensor development without direct relevance to HAR.
- The article does not employ machine learning or deep learning for activity recognition.
- The article primarily focuses on human detection, pose estimation, or motion identification without activity classification.
- The article is a review, is not peer-reviewed, or is not published in English. (A hypothetical screening filter applying these criteria is sketched after this list.)
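The criteria above compose naturally into an automated pre-filter ahead of manual review. The sketch below is hypothetical: the record fields (language, peer_reviewed, uses_ml, classifies_activities, is_review) are illustrative assumptions, not the tooling actually used in this review.

```python
# Hypothetical sketch: the inclusion/exclusion criteria as a record filter.
# Field names are illustrative, not taken from the review's own pipeline.
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    language: str
    peer_reviewed: bool
    uses_ml: bool                 # applies ML/DL for activity classification
    classifies_activities: bool   # not merely detection, pose, or motion
    is_review: bool

def passes_criteria(r: Record) -> bool:
    """Return True if a record survives the stated criteria."""
    if r.language != "en" or not r.peer_reviewed or r.is_review:
        return False
    return r.uses_ml and r.classifies_activities

records = [
    Record("LSTM-based HAR with smartphones", "en", True, True, True, False),
    Record("A survey of HAR methods", "en", True, True, True, True),
]
kept = [r for r in records if passes_criteria(r)]
print([r.title for r in kept])  # only the first record survives
```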
2.2. Search Criteria
- Scopus
- Google Scholar
- IEEE Xplore Digital Library
- ACM Digital Library
- Activity-related keywords: [“human activity recognition” OR “activity monitoring” OR “gesture recognition” OR “activity discovery”].
- Machine learning-related keywords: [machine learning OR deep learning OR convolutional neural network OR recurrent neural network OR support vector machine OR random forest OR reinforcement learning OR graph convolutional networks OR generative adversarial networks OR autoencoders OR artificial neural network OR ANN].
- Sensor-related keywords: [RGB OR depth OR infrared OR motion capture devices OR mocap OR skeleton joint positions OR wearable devices OR accelerometer OR gyroscope OR acoustic sensors OR radar OR WiFi OR LiDAR OR proximity sensors OR fusion HAR]. (A sketch assembling these groups into a single Boolean query follows this list.)
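As a minimal sketch, the three keyword groups can be assembled into one Boolean query of the form accepted by most of the listed databases. Joining the groups with AND is an assumption here; the text implies it but does not state the combining operator, and only a subset of the terms is shown.

```python
# Minimal sketch: combining the three keyword groups into a single
# Boolean query string. The AND between groups is an assumption.
activity = ['"human activity recognition"', '"activity monitoring"',
            '"gesture recognition"', '"activity discovery"']
ml = ["machine learning", "deep learning", "convolutional neural network",
      "recurrent neural network", "support vector machine", "random forest"]
sensors = ["RGB", "depth", "infrared", "accelerometer", "gyroscope",
           "radar", "WiFi", "LiDAR"]

def group(terms: list[str]) -> str:
    """Parenthesise a keyword group as an OR-disjunction."""
    return "(" + " OR ".join(terms) + ")"

query = " AND ".join(group(g) for g in (activity, ml, sensors))
print(query)
```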
2.3. Screening Strategy
- Identification: The initial search yielded 7938 records from the seven publisher databases: ACM (479), IEEE Xplore (4760), Elsevier (563), Springer (1570), Wiley (18), Taylor and Francis (32), and Nature (286).
- Screening: A total of 5759 records were screened on titles and abstracts to remove studies outside the scope of this review. In total, 1949 records were excluded for not utilising machine learning or deep learning techniques for HAR. Additionally, 3890 articles were inaccessible due to paywall restrictions, despite the use of institutional subscriptions and open-access alternatives. These constraints limited access to potentially relevant studies, but efforts were made to ensure that the available open-access literature provided a comprehensive and representative sample.
- Eligibility: Following the initial screening, 1869 full-text articles were assessed against the predefined inclusion and exclusion criteria. At this stage, 650 articles were excluded for reasons such as missing implementation details, a narrow focus on partial-body movements, or the absence of the performance metrics needed to evaluate HAR systems. A further 860 articles were excluded for substantial overlap with existing studies, so that only unique and methodologically sound research was retained.
- Inclusion: After the eligibility assessment, 359 studies met the inclusion criteria and were retained for detailed analysis. These studies span a broad range of sensor modalities, machine learning methodologies, and emerging trends in HAR research, addressing key challenges and identifying potential research directions. (The eligibility-stage arithmetic is checked in the short sketch below.)
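As a quick consistency check, the eligibility-stage figures reported above reconcile exactly; the toy sketch below re-derives the final count from the stated exclusions, using the numbers verbatim from the text.

```python
# Sketch checking the eligibility-stage arithmetic reported above:
# 1869 full texts assessed, 650 + 860 excluded, 359 included.
assessed = 1869
excluded_quality = 650   # missing details, partial-body focus, no metrics
excluded_overlap = 860   # substantial overlap with existing studies
included = assessed - excluded_quality - excluded_overlap
assert included == 359
print(f"{included} studies retained for detailed analysis")
```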
3. Human Activity Recognition (HAR)
3.1. Definition and Categories of Human Activity Recognition
3.1.1. Overview of Human Activity Recognition
3.1.2. Categories of Activities
3.2. Stages of HAR
3.3. Activity Discovery and Activity Recognition
3.4. Types of Sensors Used for Human Activity Recognition
4. Data Collection and Pre-Processing
4.1. Publicly Available Datasets for Human Activity Recognition
4.2. Data Pre-Processing
4.2.1. Importance of Data Pre-Processing in Human Activity Recognition
4.2.2. Data Pre-Processing Techniques
Data Preparation
Feature Engineering
Data Transformation and Standardisation
Data Enhancement and Handling Class Imbalances
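To make the four pre-processing steps named above concrete, here is a minimal sketch assuming windowed tri-axial accelerometer input; the window length, hand-crafted feature set, and naive random oversampling are illustrative choices, not prescriptions drawn from the reviewed literature.

```python
# Minimal sketch of typical HAR pre-processing: preparation, feature
# engineering, standardisation, and class-imbalance handling. All
# parameter choices below are illustrative assumptions.
import numpy as np

def sliding_windows(signal: np.ndarray, size: int = 128, step: int = 64):
    """Data preparation: segment a (T, 3) signal into overlapping windows."""
    return np.stack([signal[i:i + size]
                     for i in range(0, len(signal) - size + 1, step)])

def extract_features(windows: np.ndarray) -> np.ndarray:
    """Feature engineering: per-axis mean and standard deviation."""
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)

def standardise(features: np.ndarray) -> np.ndarray:
    """Data transformation: z-score standardisation per feature."""
    return (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)

def oversample(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Data enhancement: naive random oversampling of minority classes."""
    rng = np.random.default_rng(seed)
    counts = np.bincount(y)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=counts.max(), replace=True)
        for c in range(len(counts))])
    return X[idx], y[idx]

signal = np.random.default_rng(0).standard_normal((1000, 3))
X = standardise(extract_features(sliding_windows(signal)))
y = np.array([0] * (len(X) - 3) + [1] * 3)  # artificial imbalance
Xb, yb = oversample(X, y)
print(X.shape, Xb.shape)  # oversampled set balances both classes
```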
5. Machine Learning Techniques in Human Activity Recognition
5.1. Shallow Learning Models
5.1.1. Supervised Learning Techniques
5.1.2. Unsupervised Learning and Clustering Techniques
5.1.3. Semi-Supervised Learning Techniques
5.2. Deep Learning Models
5.2.1. CNN, RNN, GCN, and LSTM-Based Methods
5.2.2. Attention-Based Networks
5.2.3. Reinforcement Learning-Based Networks
5.3. Machine Learning Algorithm Performance Comparison
5.3.1. Evaluation Metrics
- Accuracy
- Precision
- Recall (Sensitivity)
- F1-Score
- Confusion Matrix
- Intersection-over-Union (IoU)
- Mean Average Precision (mAP)
- Receiver Operating Characteristic (ROC) and Area Under the Curve (AUC)
- Internal Cluster Evaluation Metrics
  - Silhouette Score
  - Davies–Bouldin Index (DBI)
  - Dunn Index
- External Evaluation Metrics
  - Adjusted Rand Index (ARI)
  - Normalised Mutual Information (NMI)
  - Fowlkes–Mallows Index (FMI) (a sketch computing several of these metrics follows this list)
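As an illustration, the sketch below computes a subset of these metrics with scikit-learn on toy labels. IoU, mAP, ROC/AUC, and the Dunn index are omitted: the first three are task-specific (detection/localisation or score-based), and scikit-learn does not ship a Dunn index. The values printed are illustrative only.

```python
# Hedged sketch: computing several of the listed metrics on toy data.
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, silhouette_score,
                             davies_bouldin_score, adjusted_rand_score,
                             normalized_mutual_info_score,
                             fowlkes_mallows_score)

# Classification metrics on toy predictions over three activity classes.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("F1       :", f1_score(y_true, y_pred, average="macro"))
print("confusion:\n", confusion_matrix(y_true, y_pred))

# Internal and external clustering metrics on toy 2-D points.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [9, 9], [9, 8]], float)
labels = np.array([0, 0, 1, 1, 2, 2])   # predicted cluster assignments
truth = np.array([0, 0, 1, 1, 1, 1])    # ground-truth grouping
print("silhouette    :", silhouette_score(X, labels))
print("Davies-Bouldin:", davies_bouldin_score(X, labels))
print("ARI           :", adjusted_rand_score(truth, labels))
print("NMI           :", normalized_mutual_info_score(truth, labels))
print("FMI           :", fowlkes_mallows_score(truth, labels))
```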
5.3.2. Computational Complexity
5.3.3. Scalability and Generalisation
6. Applications of Human Activity Recognition
7. Challenges and Future Directions
7.1. Emerging Trends and Future Directions in Human Activity Recognition Research
7.2. Research Challenges and Potential Solutions
8. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhou, X.; Liang, W.; Kevin, I.; Wang, K.; Wang, H.; Yang, L.T.; Jin, Q. Deep-Learning-Enhanced Human Activity Recognition for Internet of Healthcare Things. IEEE Internet Things J. 2020, 7, 6429–6438. [Google Scholar]
- Mekruksavanich, S.; Jitpattanakul, A. Lstm Networks Using Smartphone Data for Sensor-Based Human Activity Recognition in Smart Homes. Sensors 2021, 21, 1636. [Google Scholar] [CrossRef]
- Hsu, Y.-L.; Yang, S.-C.; Chang, H.-C.; Lai, H.-C. Human Daily and Sport Activity Recognition Using a Wearable Inertial Sensor Network. IEEE Access 2018, 6, 31715–31728. [Google Scholar]
- Beddiar, D.R.; Nini, B.; Sabokrou, M.; Hadid, A. Vision-Based Human Activity Recognition: A Survey. Multimed. Tools Appl. 2020, 79, 30509–30555. [Google Scholar] [CrossRef]
- He, J.; Zhang, C.; He, X.; Dong, R. Visual Recognition of Traffic Police Gestures with Convolutional Pose Machine and Handcrafted Features. Neurocomputing 2020, 390, 248–259. [Google Scholar]
- Bian, S.; Liu, M.; Zhou, B.; Lukowicz, P.; Magno, M. Body-Area Capacitive or Electric Field Sensing for Human Activity Recognition and Human-Computer Interaction: A Comprehensive Survey. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2024, 8, 1–49. [Google Scholar] [CrossRef]
- Al, G.A.; Estrela, P.; Martinez-Hernandez, U. Towards an Intuitive Human-Robot Interaction Based on Hand Gesture Recognition and Proximity Sensors. In Proceedings of the 2020 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Karlsruhe, Germany, 14–16 September 2020; pp. 330–335. [Google Scholar]
- Malche, T.; Tharewal, S.; Tiwari, P.K.; Jabarulla, M.Y.; Alnuaim, A.A.; Hatamleh, W.A.; Ullah, M.A. Artificial Intelligence of Things-(AIoT-) Based Patient Activity Tracking System for Remote Patient Monitoring. J. Healthc. Eng. 2022, 2022, 8732213. [Google Scholar]
- Zolfaghari, S.; Keyvanpour, M.R. SARF: Smart Activity Recognition Framework in Ambient Assisted Living. In Proceedings of the 2016 Federated Conference on Computer Science and Information Systems (FedCSIS), Gdansk, Poland, 11–14 September 2016; pp. 1435–1443. [Google Scholar]
- Hoang, V.-H.; Lee, J.W.; Piran, M.J.; Park, C. Advances in Skeleton-Based Fall Detection in RGB Videos: From Handcrafted to Deep Learning Approaches. IEEE Access 2023, 11, 92322–92352. [Google Scholar]
- Tien, P.W.; Wei, S.; Calautit, J.K.; Darkwa, J.; Wood, C. Vision-Based Human Activity Recognition for Reducing Building Energy Demand. Build. Serv. Eng. Res. Technol. 2021, 42, 691–713. [Google Scholar]
- Zhang, W.; Liu, Z.; Zhou, L.; Leung, H.; Chan, A.B. Martial Arts, Dancing and Sports Dataset: A Challenging Stereo and Multi-View Dataset for 3d Human Pose Estimation. Image Vis. Comput. 2017, 61, 22–39. [Google Scholar]
- Yu, S.J.; Koh, P.; Kwon, H.; Kim, D.S.; Kim, H.K. Hurst Parameter Based Anomaly Detection for Intrusion Detection System. In Proceedings of the 2016 IEEE International Conference on Computer and Information Technology (CIT), Nadi, Fiji, 8–10 December 2016; pp. 234–240. [Google Scholar]
- Lipton, Z.C.; Berkowitz, J.; Elkan, C. A Critical Review of Recurrent Neural Networks for Sequence Learning. arXiv 2015, arXiv:1506.00019. [Google Scholar]
- Li, Z.; Liu, F.; Yang, W.; Peng, S.; Zhou, J. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 6999–7019. [Google Scholar]
- Tay, Y.; Dehghani, M.; Bahri, D.; Metzler, D. Efficient Transformers: A Survey. ACM Comput. Surv. 2022, 55, 1–28. [Google Scholar]
- Khan, S.; Naseer, M.; Hayat, M.; Zamir, S.W.; Khan, F.S.; Shah, M. Transformers in Vision: A Survey. ACM Comput. Surv. 2022, 54, 1–41. [Google Scholar]
- Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; et al. Recent Advances in Convolutional Neural Networks. Pattern Recognit. 2018, 77, 354–377. [Google Scholar]
- Yu, Y.; Si, X.; Hu, C.; Zhang, J. A Review of Recurrent Neural Networks: LSTM Cells and Network Architectures. Neural Comput. 2019, 31, 1235–1270. [Google Scholar]
- Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network. Phys. D Nonlinear Phenom. 2020, 404, 132306. [Google Scholar]
- Zhang, S.; Tong, H.; Xu, J.; Maciejewski, R. Graph Convolutional Networks: A Comprehensive Review. Comput. Soc. Netw. 2019, 6, 11. [Google Scholar] [CrossRef] [PubMed]
- Wu, F.; Souza, A.; Zhang, T.; Fifty, C.; Yu, T.; Weinberger, K. Simplifying Graph Convolutional Networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 6861–6871. [Google Scholar]
- Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv 2014, arXiv:1412.3555. [Google Scholar]
- Dey, R.; Salem, F.M. Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention Is All You Need. Adv. Neural Inf. Process. Syst. 2017, 30, 6000–6010. [Google Scholar]
- Tasnim, N.; Islam, M.; Baek, J.H. Deep Learning-Based Action Recognition Using 3d Skeleton Joints Information. Inventions 2020, 5, 49. [Google Scholar] [CrossRef]
- Georis, B.; Maziere, M.; Bremond, F.; Thonnat, M. A Video Interpretation Platform Applied to Bank Agency Monitoring. In Proceedings of the IEE Intelligent Distributed Surveilliance Systems, London, UK, 23 February 2004; pp. 46–50. [Google Scholar]
- Turaga, P.; Chellappa, R.; Subrahmanian, V.S.; Udrea, O. Machine Recognition of Human Activities: A Survey. IEEE Trans. Circuits Syst. Video Technol. 2008, 18, 1473–1488. [Google Scholar] [CrossRef]
- Pawlyta, M.; Skurowski, P. A Survey of Selected Machine Learning Methods for the Segmentation of Raw Motion Capture Data into Functional Body Mesh. In Proceedings of the Information Technologies in Medicine: 5th International Conference (ITIB 2016), Kamień Śląski, Poland, 20–22 June 2016; Volume 2, pp. 321–336. [Google Scholar]
- Akula, A.; Shah, A.K.; Ghosh, R. Deep Learning Approach for Human Action Recognition in Infrared Images. Cogn. Syst. Res. 2018, 50, 146–154. [Google Scholar]
- Chapron, K.; Lapointe, P.; Bouchard, K.; Gaboury, S. Highly Accurate Bathroom Activity Recognition Using Infrared Proximity Sensors. IEEE J. Biomed. Health Inform. 2020, 24, 2368–2377. [Google Scholar] [CrossRef]
- Ghosh, R.; Gupta, A.; Nakagawa, A.; Soares, A.; Thakor, N. Spatiotemporal Filtering for Event-Based Action Recognition. arXiv 2019, arXiv:1903.07067. [Google Scholar]
- Wang, Y.; Xiao, Y.; Xiong, F.; Jiang, W.; Cao, Z.; Zhou, J.T.; Yuan, J. 3DV: 3D Dynamic Voxel for Action Recognition in Depth Video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 13–19 June 2020; pp. 511–520. [Google Scholar]
- Singh, A.D.; Sandha, S.S.; Garcia, L.; Srivastava, M. Radhar: Human Activity Recognition from Point Clouds Generated through a Millimeter-Wave Radar. In Proceedings of the 3rd ACM Workshop on Millimeter-Wave Networks and Sensing Systems, Los Cabos, Mexico, 21–25 October 2019; pp. 51–56. [Google Scholar]
- Duan, H.; Zhao, Y.; Chen, K.; Lin, D.; Dai, B. Revisiting Skeleton-Based Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 2969–2978. [Google Scholar]
- Donahue, J.; Anne Hendricks, L.; Guadarrama, S.; Rohrbach, M.; Venugopalan, S.; Saenko, K.; Darrell, T. Long-Term Recurrent Convolutional Networks for Visual Recognition and Description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 2625–2634. [Google Scholar]
- Liang, D.; Thomaz, E. Audio-Based Activities of Daily Living (Adl) Recognition with Large-Scale Acoustic Embeddings from Online Videos. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2019, 3, 1–18. [Google Scholar]
- Wang, F.; Song, Y.; Zhang, J.; Han, J.; Huang, D. Temporal Unet: Sample Level Human Action Recognition Using Wifi. arXiv 2019, arXiv:1904.11953. [Google Scholar]
- Roche, J.; De-Silva, V.; Hook, J.; Moencks, M.; Kondoz, A. A Multimodal Data Processing System for LiDAR-Based Human Activity Recognition. IEEE Trans. Cybern. 2022, 52, 10027–10040. [Google Scholar] [CrossRef]
- Attal, F.; Mohammed, S.; Dedabrishvili, M.; Chamroukhi, F.; Oukhellou, L.; Amirat, Y. Physical Human Activity Recognition Using Wearable Sensors. Sensors 2015, 15, 31314–31338. [Google Scholar] [CrossRef]
- Gupta, A.; Gupta, K.; Gupta, K.; Gupta, K. A Survey on Human Activity Recognition and Classification. In Proceedings of the 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India, 28–30 July 2020; pp. 915–919. [Google Scholar] [CrossRef]
- Zeng, M.; Nguyen, L.T.; Yu, B.; Mengshoel, O.J.; Zhu, J.; Wu, P.; Zhang, J. Convolutional Neural Networks for Human Activity Recognition Using Mobile Sensors. In Proceedings of the 6th International Conference on Mobile Computing, Applications and Services, Austin, TX, USA, 6–7 November 2014; pp. 197–205. [Google Scholar]
- Imran, J.; Raman, B. Evaluating Fusion of RGB-D and Inertial Sensors for Multimodal Human Action Recognition. J. Ambient Intell. Humaniz. Comput. 2020, 11, 189–208. [Google Scholar] [CrossRef]
- Aggarwal, J.K.; Cai, Q. Human Motion Analysis: A Review. Comput. Vis. Image Underst. 1999, 73, 428–440. [Google Scholar]
- Keyvanpour, M.R.; Vahidian, S.; Ramezani, M. HMR-Vid: A Comparative Analytical Survey on Human Motion Recognition in Video Data. Multimed. Tools Appl. 2020, 79, 31819–31863. [Google Scholar]
- Gupta, N.; Gupta, S.K.; Pathak, R.K.; Jain, V.; Rashidi, P.; Suri, J.S. Human Activity Recognition in Artificial Intelligence Framework: A Narrative Review. Artif. Intell. Rev. 2022, 55, 4755–4808. [Google Scholar]
- Ma, N.; Wu, Z.; Cheung, Y.; Guo, Y.; Gao, Y.; Li, J.; Jiang, B. A Survey of Human Action Recognition and Posture Prediction. Tsinghua Sci. Technol. 2022, 27, 973–1001. [Google Scholar] [CrossRef]
- Aggarwal, J.K.; Xia, L. Human Activity Recognition from 3D Data: A Review. Pattern Recognit. Lett. 2014, 48, 70–80. [Google Scholar] [CrossRef]
- Chen, L.; Wei, H.; Ferryman, J. A Survey of Human Motion Analysis Using Depth Imagery. Pattern Recognit. Lett. 2013, 34, 1995–2006. [Google Scholar] [CrossRef]
- Carvalho, L.I.; Sofia, R.C. A Review on Scaling Mobile Sensing Platforms for Human Activity Recognition: Challenges and Recommendations for Future Research. IoT 2020, 1, 451–473. [Google Scholar] [CrossRef]
- Homayounfar, S.Z.; Andrew, T.L. Wearable Sensors for Monitoring Human Motion: A Review on Mechanisms, Materials, and Challenges. SLAS Technol. Transl. Life Sci. Innov. 2020, 25, 9–24. [Google Scholar]
- Desmarais, Y.; Mottet, D.; Slangen, P.; Montesinos, P. A Review of 3D Human Pose Estimation Algorithms for Markerless Motion Capture. Comput. Vis. Image Underst. 2021, 212, 103275. [Google Scholar]
- Han, F.; Reily, B.; Hoff, W.; Zhang, H. Space-Time Representation of People Based on 3D Skeletal Data: A Review. Comput. Vis. Image Underst. 2017, 158, 85–105. [Google Scholar] [CrossRef]
- Singh, R.; Sonawane, A.; Srivastava, R. Recent Evolution of Modern Datasets for Human Activity Recognition: A Deep Survey. Multimed. Syst. 2020, 26, 83–106. [Google Scholar] [CrossRef]
- Sun, Z.; Ke, Q.; Rahmani, H.; Bennamoun, M.; Wang, G.; Liu, J. Human Action Recognition from Various Data Modalities: A Review. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 3200–3225. [Google Scholar]
- Pareek, P.; Thakkar, A. A Survey on Video-Based Human Action Recognition: Recent Updates, Datasets, Challenges, and Applications. Artif. Intell. Rev. 2021, 54, 2259–2322. [Google Scholar]
- Kong, Y.; Fu, Y. Human Action Recognition and Prediction: A Survey. Int. J. Comput. Vis. 2022, 130, 1366–1401. [Google Scholar]
- Azis, F.A.; Rijal, M.; Suhaimi, H.; Abas, P.E. Patent Landscape of Composting Technology: A Review. Inventions 2022, 7, 38. [Google Scholar] [CrossRef]
- Ratnayake, A.M.B.; Yasin, H.M.; Naim, A.G.; Abas, P.E. Buzzing through Data: Advancing Bee Species Identification with Machine Learning. Appl. Syst. Innov. 2024, 7, 62. [Google Scholar] [CrossRef]
- Hossen, M.A.; Abas, P.E. A Comparative Study of Supervised and Unsupervised Approaches in Human Activity Analysis Based on Skeleton Data. Int. J. Comput. Digit. Syst. 2023, 14, 10407–10421. [Google Scholar]
- Vallabh, P.; Malekian, R. Fall Detection Monitoring Systems: A Comprehensive Review. J. Ambient Intell. Humaniz. Comput. 2018, 9, 1809–1833. [Google Scholar] [CrossRef]
- Nweke, H.F.; Teh, Y.W.; Mujtaba, G.; Al-garadi, M.A. Data Fusion and Multiple Classifier Systems for Human Activity Detection and Health Monitoring: Review and Open Research Directions. Inf. Fusion 2019, 46, 147–170. [Google Scholar] [CrossRef]
- Frank, A.E.; Kubota, A.; Riek, L.D. Wearable Activity Recognition for Robust Human-Robot Teaming in Safety-Critical Environments via Hybrid Neural Networks. In Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 3–8 November 2019; pp. 449–454. [Google Scholar]
- Xia, L.; Chen, C.C.; Aggarwal, J.K. View Invariant Human Action Recognition Using Histograms of 3D Joints. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, 16–21 June 2012; pp. 20–27. [Google Scholar]
- Vrigkas, M.; Nikou, C.; Kakadiaris, I.A. A Review of Human Activity Recognition Methods. Front. Robot. AI 2015, 2, 28. [Google Scholar] [CrossRef]
- Materzynska, J.; Berger, G.; Bax, I.; Memisevic, R. The Jester Dataset: A Large-Scale Video Dataset of Human Gestures. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar]
- Fronteddu, G.; Porcu, S.; Floris, A.; Atzori, L. A Dynamic Hand Gesture Recognition Dataset for Human-Computer Interfaces. Comput. Netw. 2022, 205, 108781. [Google Scholar] [CrossRef]
- Minh Dang, L.; Min, K.; Wang, H.; Jalil Piran, M.; Hee Lee, C.; Moon, H. Sensor-Based and Vision-Based Human Activity Recognition: A Comprehensive Survey. Pattern Recognit. 2020, 108, 107561. [Google Scholar] [CrossRef]
- Zhang, S.; Wei, Z.; Nie, J.; Huang, L.; Wang, S.; Li, Z. A Review on Human Activity Recognition Using Vision-Based Method. J. Healthc. Eng. 2017, 2017, 3090343. [Google Scholar] [CrossRef] [PubMed]
- Bhatnagar, B.L.; Xie, X.; Petrov, I.A.; Sminchisescu, C.; Theobalt, C.; Pons-Moll, G. Behave: Dataset and Method for Tracking Human Object Interactions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 15935–15946. [Google Scholar]
- Lu, L.; Lu, Y.; Yu, R.; Di, H.; Zhang, L.; Wang, S. GAIM: Graph Attention Interaction Model for Collective Activity Recognition. IEEE Trans. Multimed. 2019, 22, 524–539. [Google Scholar] [CrossRef]
- Ibrahim, M.S.; Muralidharan, S.; Deng, Z.; Vahdat, A.; Mori, G. A Hierarchical Deep Temporal Model for Group Activity Recognition. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 1971–1980. [Google Scholar]
- Qi, M.; Wang, Y.; Qin, J.; Li, A.; Luo, J.; Van Gool, L. StagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 549–565. [Google Scholar] [CrossRef]
- Tang, Y.; Lu, J.; Wang, Z.; Yang, M.; Zhou, J. Learning Semantics-Preserving Attention and Contextual Interaction for Group Activity Recognition. IEEE Trans. Image Process. 2019, 28, 4997–5012. [Google Scholar] [CrossRef] [PubMed]
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364. [Google Scholar] [CrossRef]
- Hossen, M.A.; Hong, O.W.; Caesarendra, W. Investigation of the Unsupervised Machine Learning Techniques for Human Activity Discovery. In Proceedings of the 2nd International Conference on Electronics, Biomedical Engineering, and Health Informatics, Surabaya, Indonesia, 3–4 November 2021; Triwiyanto, T., Rizal, A., Caesarendra, W., Eds.; Springer Nature: Singapore, 2022; pp. 499–514. [Google Scholar]
- Huynh, T.; Fritz, M.; Schiele, B. Discovery of Activity Patterns Using Topic Models. In Proceedings of the 10th International Conference on Ubiquitous Computing (UbiComp ’08), Seoul, Republic of Korea, 21–24 September 2008; ACM Press: New York, NY, USA, 2008; pp. 10–19. [Google Scholar]
- Kim, E.; Helal, S.; Cook, D. Human Activity Recognition and Pattern Discovery. IEEE Pervasive Comput. 2010, 9, 48–53. [Google Scholar] [CrossRef]
- Gjoreski, H.; Roggen, D. Unsupervised Online Activity Discovery Using Temporal Behaviour Assumption. In Proceedings of the 2017 ACM International Symposium on Wearable Computers (ISWC 2017), Maui, HI, USA, 11–15 September 2017; pp. 42–49. [Google Scholar] [CrossRef]
- Rashidi, P. Stream Sequence Mining for Human Activity Discovery; Elsevier Inc.: Amsterdam, The Netherlands, 2014; ISBN 9780123985323. [Google Scholar]
- Ye, J.; Fang, L.; Dobson, S. Discovery and Recognition of Unknown Activities. In Proceedings of the 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct, Heidelberg, Germany, 12–16 September 2016; pp. 783–792. [Google Scholar] [CrossRef]
- Ong, W.H.; Koseki, T.; Palafox, L. Unsupervised Human Activity Detection with Skeleton Data from RGB-D Sensor. In Proceedings of the 2013 Fifth International Conference on Computational Intelligence, Communication Systems and Networks (CICSyN 2013), Madrid, Spain, 5–7 June 2013; pp. 30–35. [Google Scholar] [CrossRef]
- Steil, J.; Bulling, A. Discovery of Everyday Human Activities from Long-Term Visual Behaviour Using Topic Models. In Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp 2015), Osaka, Japan, 7–11 September 2015; pp. 75–85. [Google Scholar] [CrossRef]
- Wang, Y.; Jiang, H.; Drew, M.S.; Li, Z.N.; Mori, G. Unsupervised Discovery of Action Classes. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; Volume 2, pp. 1654–1661. [Google Scholar] [CrossRef]
- Fang, L.; Ye, J.; Dobson, S. Discovery and Recognition of Emerging Human Activities Using a Hierarchical Mixture of Directional Statistical Models. IEEE Trans. Knowl. Data Eng. 2019, 14, 1304–1316. [Google Scholar] [CrossRef]
- Rieping, K.; Englebienne, G.; Kröse, B. Behavior Analysis of Elderly Using Topic Models. Pervasive Mob. Comput. 2014, 15, 181–199. [Google Scholar] [CrossRef]
- Kwon, Y.; Kang, K.; Bae, C. Unsupervised Learning for Human Activity Recognition Using Smartphone Sensors. Expert Syst. Appl. 2014, 41, 6067–6074. [Google Scholar] [CrossRef]
- Veeraraghavan, A.; Roy-Chowdhury, A.K.; Chellappa, R. Matching Shape Sequences in Video with Applications in Human Movement Analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1896–1909. [Google Scholar] [PubMed]
- Han, J.; Bhanu, B. Human Activity Recognition in Thermal Infrared Imagery. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05)-Workshops, San Diego, CA, USA, 20–25 June 2005; p. 17. [Google Scholar]
- Hossen, M.A.; Naim, A.G.; Abas, P.E. Evaluation of 2D and 3D Posture for Human Activity Recognition. AIP Conf. Proc. 2023, 2643, 40013. [Google Scholar] [CrossRef]
- Barnachon, M.; Bouakaz, S.; Boufama, B.; Guillou, E. Ongoing Human Action Recognition with Motion Capture. Pattern Recognit. 2014, 47, 238–247. [Google Scholar] [CrossRef]
- Xu, T.; An, D.; Jia, Y.; Yue, Y. A Review: Point Cloud-Based 3d Human Joints Estimation. Sensors 2021, 21, 1684. [Google Scholar] [CrossRef] [PubMed]
- Calabrese, E.; Taverni, G.; Awai Easthope, C.; Skriabine, S.; Corradi, F.; Longinotti, L.; Eng, K.; Delbruck, T. DHP19: Dynamic Vision Sensor 3D Human Pose Dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–20 June 2019. [Google Scholar]
- Zia Uddin, M.; Khaksar, W.; Torresen, J. A Thermal Camera-Based Activity Recognition Using Discriminant Skeleton Features and RNN. In Proceedings of the 2019 IEEE 17th International Conference on Industrial Informatics (INDIN), Helsinki, Finland, 22–25 July 2019; Volume 1, pp. 777–782. [Google Scholar]
- Jung, M.; Chi, S. Human Activity Classification Based on Sound Recognition and Residual Convolutional Neural Network. Autom. Constr. 2020, 114, 103177. [Google Scholar]
- Fioranelli, D.F.; Shah, D.S.A.; Li1, H.; Shrestha, A.; Yang, D.S.; Kernec, D.J. Le Radar Sensing for Healthcare: Associate Editor Francesco Fioranelli on the Applications of Radar in Monitoring Vital Signs and Recognising Human Activity Patterns. Electron. Lett. 2019, 55, 1022–1024. [Google Scholar]
- Yadav, S.K.; Sai, S.; Gundewar, A.; Rathore, H.; Tiwari, K.; Pandey, H.M.; Mathur, M. CSITime: Privacy-Preserving Human Activity Recognition Using WiFi Channel State Information. Neural Netw. 2022, 146, 11–21. [Google Scholar] [CrossRef]
- Straczkiewicz, M.; Onnela, J.-P. A Systematic Review of Human Activity Recognition Using Smartphones. arXiv 2019, arXiv:1910.03970. [Google Scholar]
- Soomro, K.; Zamir, A.R.; Shah, M. UCF101: A Dataset of 101 Human Actions Classes from Videos in the Wild. arXiv 2012, arXiv:1212.0402. [Google Scholar]
- Karpathy, A.; Toderici, G.; Shetty, S.; Leung, T.; Sukthankar, R.; Fei-Fei, L. Large-Scale Video Classification with Convolutional Neural Networks. In Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1725–1732. [Google Scholar]
- Caba Heilbron, F.; Escorcia, V.; Ghanem, B.; Carlos Niebles, J. Activitynet: A Large-Scale Video Benchmark for Human Activity Understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 961–970. [Google Scholar]
- Gorban, A.; Idrees, H.; Jiang, Y.-G.; Zamir, A.R.; Laptev, I.; Shah, M.; Sukthankar, R. THUMOS Challenge: Action Recognition with a Large Number of Classes; 2015. Available online: http://www.thumos.info (accessed on 14 March 2025).
- Sigurdsson, G.A.; Varol, G.; Wang, X.; Farhadi, A.; Laptev, I.; Gupta, A. Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding. In Proceedings of the Computer Vision—ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part I 14. Springer: Cham, Switzerland; pp. 510–526. [Google Scholar]
- Gao, C.; Du, Y.; Liu, J.; Lv, J.; Yang, L.; Meng, D.; Hauptmann, A.G. Infar Dataset: Infrared Action Recognition at Different Times. Neurocomputing 2016, 212, 36–47. [Google Scholar] [CrossRef]
- Shahroudy, A.; Liu, J.; Ng, T.T.; Wang, G. NTU RGB+D: A Large Scale Dataset for 3D Human Activity Analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1010–1019. [Google Scholar] [CrossRef]
- Amir, A.; Taba, B.; Berg, D.; Melano, T.; McKinstry, J.; Di Nolfo, C.; Nayak, T.; Andreopoulos, A.; Garreau, G.; Mendoza, M.; et al. A Low Power, Fully Event-Based Gesture Recognition System. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7243–7252. [Google Scholar]
- Jiang, Y.-G.; Wu, Z.; Wang, J.; Xue, X.; Chang, S.-F. Exploiting Feature and Class Relationships in Video Categorization with Regularized Deep Neural Networks. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 352–364. [Google Scholar]
- Kay, W.; Carreira, J.; Simonyan, K.; Zhang, B.; Hillier, C.; Vijayanarasimhan, S.; Viola, F.; Green, T.; Back, T.; Natsev, P.; et al. The Kinetics Human Action Video Dataset. arXiv 2017, arXiv:1705.06950. [Google Scholar]
- Liu, C.; Hu, Y.; Li, Y.; Song, S.; Liu, J. Pku-Mmd: A Large Scale Benchmark for Continuous Multi-Modal Human Action Understanding. arXiv 2017, arXiv:1703.07475. [Google Scholar]
- Goyal, R.; Kahou, S.E.; Michalski, V.; Materzynska, J.; Westphal, S.; Kim, H.; Haenel, V.; Fruend, I.; Yianilos, P.; Mueller-Freitag, M.; et al. The “Something Something” Video Database for Learning and Evaluating Visual Common Sense. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5843–5851. [Google Scholar]
- Carreira, J.; Noland, E.; Banki-Horvath, A.; Hillier, C.; Zisserman, A. A Short Note about Kinetics-600. arXiv 2018, arXiv:1808.01340. [Google Scholar]
- Ji, Y.; Xu, F.; Yang, Y.; Shen, F.; Shen, H.T.; Zheng, W.-S. A Large-Scale RGB-D Database for Arbitrary-View Human Action Recognition. In Proceedings of the 26th ACM International Conference on Multimedia, Seoul, Republic of Korea, 22–26 October 2018; pp. 1510–1518. [Google Scholar]
- Martin, M.; Roitberg, A.; Haurilet, M.; Horne, M.; Reiß, S.; Voit, M.; Stiefelhagen, R. Drive & Act: A Multi-Modal Dataset for Fine-Grained Driver Behavior Recognition in Autonomous Vehicles. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 2801–2810. [Google Scholar]
- Zhang, Y.; Cao, C.; Cheng, J.; Lu, H. Egogesture: A New Dataset and Benchmark for Egocentric Hand Gesture Recognition. IEEE Trans. Multimed. 2018, 20, 1038–1050. [Google Scholar]
- Monfort, M.; Andonian, A.; Zhou, B.; Ramakrishnan, K.; Bargal, S.A.; Yan, T.; Brown, L.; Fan, Q.; Gutfreund, D.; Vondrick, C.; et al. Moments in Time Dataset: One Million Videos for Event Understanding. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 42, 502–508. [Google Scholar] [CrossRef]
- Liu, J.; Shahroudy, A.; Perez, M.; Wang, G.; Duan, L.-Y.; Kot, A.C. NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 42, 2684–2701. [Google Scholar] [CrossRef]
- Miech, A.; Alayrac, J.-B.; Laptev, I.; Sivic, J.; Zisserman, A. Rareact: A Video Dataset of Unusual Interactions. arXiv 2020, arXiv:2008.01018. [Google Scholar]
- Ghorbani, S.; Mahdaviani, K.; Thaler, A.; Kording, K.; Cook, D.J.; Blohm, G.; Troje, N.F. MoVi: A Large Multi-Purpose Human Motion and Video Dataset. PLoS ONE 2021, 16, e0253157. [Google Scholar]
- Li, T.; Liu, J.; Zhang, W.; Ni, Y.; Wang, W.; Li, Z. Uav-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 16266–16275. [Google Scholar]
- Rai, N.; Chen, H.; Ji, J.; Desai, R.; Kozuka, K.; Ishizaka, S.; Adeli, E.; Niebles, J.C. Home Action Genome: Cooperative Compositional Action Understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 19–25 June 2021; pp. 11184–11193. [Google Scholar]
- Grauman, K.; Westbury, A.; Byrne, E.; Chavis, Z.; Furnari, A.; Girdhar, R.; Hamburger, J.; Jiang, H.; Liu, M.; Liu, X.; et al. Ego4d: Around the World in 3000 Hours of Egocentric Video. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 18995–19012. [Google Scholar]
- Damen, D.; Doughty, H.; Farinella, G.M.; Furnari, A.; Kazakos, E.; Ma, J.; Moltisanti, D.; Munro, J.; Perrett, T.; Price, W.; et al. Rescaling Egocentric Vision: Collection, Pipeline and Challenges for Epic-Kitchens-100. Int. J. Comput. Vis. 2022, 130, 33–55. [Google Scholar]
- Damen, D.; Doughty, H.; Farinella, G.M.; Fidler, S.; Furnari, A.; Kazakos, E.; Moltisanti, D.; Munro, J.; Perrett, T.; Price, W.; et al. Scaling Egocentric Vision: The Epic-Kitchens Dataset. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 720–736. [Google Scholar]
- Seneviratne, S.; Hu, Y.; Nguyen, T.; Lan, G.; Khalifa, S.; Thilakarathna, K.; Hassan, M.; Seneviratne, A. A Survey of Wearable Devices and Challenges. IEEE Commun. Surv. Tutor. 2017, 19, 2573–2620. [Google Scholar] [CrossRef]
- Gawande, U.; Hajari, K.; Golhar, Y. Scale-Invariant Mask R-CNN for Pedestrian Detection. Electron. Lett. Comput. Vis. Image Anal. 2020, 19, 98–117. [Google Scholar] [CrossRef]
- Ziaeefard, M.; Bergevin, R. Semantic Human Activity Recognition: A Literature Review. Pattern Recognit. 2015, 48, 2329–2345. [Google Scholar] [CrossRef]
- Hossen, M.A.; Naim, A.G.; Abas, P.E. Deep Learning for Skeleton-Based Human Activity Segmentation: An Autoencoder Approach. Technologies 2024, 12, 96. [Google Scholar] [CrossRef]
- Kim, S.-Y.; Kim, M.; Ho, Y.-S. Depth Image Filter for Mixed and Noisy Pixel Removal in RGB-D Camera Systems. IEEE Trans. Consum. Electron. 2013, 59, 681–689. [Google Scholar]
- Ahad, M.A.R.; Ahmed, M.; Das Antar, A.; Makihara, Y.; Yagi, Y. Action Recognition Using Kinematics Posture Feature on 3D Skeleton Joint Locations. Pattern Recognit. Lett. 2021, 145, 216–224. [Google Scholar] [CrossRef]
- Uddin, M.Z.; Thang, N.D.; Kim, J.T.; Kim, T.S. Human Activity Recognition Using Body Joint-Angle Features and Hidden Markov Model. ETRI J. 2011, 33, 569–579. [Google Scholar] [CrossRef]
- Franco, A.; Magnani, A.; Maio, D. Joint Orientations from Skeleton Data for Human Activity Recognition. In Proceedings of the International Conference on Image Analysis and Processing, Modena, Italy, 11–15 September 2017; Volume 2, pp. 152–162, ISBN 9783319685601. [Google Scholar]
- Wang, Q.; Guo, Y.; Yu, L.; Chen, X.; Li, P. Deep Q-Network-Based Feature Selection for Multisourced Data Cleaning. IEEE Internet Things J. 2020, 8, 16153–16164. [Google Scholar]
- Richert, W.; Coelho, L.P. Building Machine Learning Systems with Python, 2nd ed.; Packt Publishing: Birmingham, UK, 2015; ISBN 978-1-78439-277-2. [Google Scholar]
- Müller, P.N.; Rauterberg, F.; Achenbach, P.; Tregel, T.; Göbel, S. Physical Exercise Quality Assessment Using Wearable Sensors. In Multimedia Tools and Applications; Springer: Cham, Switzerland, 2021; Volume 80, pp. 229–243. [Google Scholar]
- He, K.; Sun, J.; Tang, X. Guided Image Filtering. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 1397–1409. [Google Scholar]
- Vishwakarma, D.K. A Two-Fold Transformation Model for Human Action Recognition Using Decisive Pose. Cogn. Syst. Res. 2020, 61, 1–13. [Google Scholar] [CrossRef]
- Xu, D.; Tian, Y. A Comprehensive Survey of Clustering Algorithms. Ann. Data Sci. 2015, 2, 165–193. [Google Scholar] [CrossRef]
- Taherkhani, A.; Cosma, G.; McGinnity, T.M. AdaBoost-CNN: An Adaptive Boosting Algorithm for Convolutional Neural Networks to Classify Multi-Class Imbalanced Datasets Using Transfer Learning. Neurocomputing 2020, 404, 351–366. [Google Scholar]
- Ng, W.W.Y.; Zeng, G.; Zhang, J.; Yeung, D.S.; Pedrycz, W. Dual Autoencoders Features for Imbalance Classification Problem. Pattern Recognit. 2016, 60, 875–889. [Google Scholar]
- Dawar, N.; Ostadabbas, S.; Kehtarnavaz, N. Data Augmentation in Deep Learning-Based Fusion of Depth and Inertial Sensing for Action Recognition. IEEE Sens. Lett. 2018, 3, 7101004. [Google Scholar]
- Nan, M.; Florea, A.M. Fast Temporal Graph Convolutional Model for Skeleton-Based Action Recognition. Sensors 2022, 22, 7117. [Google Scholar] [CrossRef]
- Lopez-del Rio, A.; Martin, M.; Perera-Lluna, A.; Saidi, R. Effect of Sequence Padding on the Performance of Deep Learning Models in Archaeal Protein Functional Prediction. Sci. Rep. 2020, 10, 14634. [Google Scholar] [CrossRef]
- Lee, N.; Kim, J.-M. Conversion of Categorical Variables into Numerical Variables via Bayesian Network Classifiers for Binary Classifications. Comput. Stat. Data Anal. 2010, 54, 1247–1265. [Google Scholar]
- Angelov, P.; Filev, D.P.; Kasabov, N. Evolving Intelligent Systems: Methodology and Applications; John Wiley & Sons: Hoboken, NJ, USA, 2010; Volume 4, ISBN 9780470287194. [Google Scholar]
- Kumar, K.; Mishra, R.K. A Heuristic SVM Based Pedestrian Detection Approach Employing Shape and Texture Descriptors. Multimed. Tools Appl. 2020, 79, 21389–21408. [Google Scholar] [CrossRef]
- Fan, L.; Wang, Z.; Wang, H. Human Activity Recognition Model Based on Decision Tree. In Proceedings of the 2013 International Conference on Advanced Cloud and Big Data, Sanya, China, 16–18 December 2013; pp. 64–68. [Google Scholar]
- Nurwulan, N.R.; Selamaj, G. Human Daily Activities Recognition Using Decision Tree. J. Phys. Conf. Ser. 2021, 1833, 12039. [Google Scholar]
- Müller, P.N.; Rauterberg, F.; Achenbach, P.; Tregel, T.; Göbel, S. Physical Exercise Quality Assessment Using Wearable Sensors. In Proceedings of the Serious Games; Fletcher, B., Ma, M., Göbel, S., Baalsrud Hauge, J., Marsh, T., Eds.; Springer International Publishing: Cham, Switzerland, 2021; pp. 229–243. [Google Scholar]
- Lara, O.D.; Labrador, M.A. A Survey on Human Activity Recognition Using Wearable Sensors. IEEE Commun. Surv. Tutor. 2013, 15, 1192–1209. [Google Scholar] [CrossRef]
- Pärkkä, J.; Cluitmans, L.; Ermes, M. Personalization Algorithm for Real-Time Activity Recognition Using PDA, Wireless Motion Bands, and Binary Decision Tree. IEEE Trans. Inf. Technol. Biomed. 2010, 14, 1211–1215. [Google Scholar] [CrossRef]
- Balli, S.; Sa\ugba\cs, E.A.; Peker, M. Human Activity Recognition from Smart Watch Sensor Data Using a Hybrid of Principal Component Analysis and Random Forest Algorithm. Meas. Control 2019, 52, 37–45. [Google Scholar]
- Dewi, C.; Chen, R.-C. Human Activity Recognition Based on Evolution of Features Selection and Random Forest. In Proceedings of the 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, Italy, 6–9 October 2019; pp. 2496–2501. [Google Scholar]
- Zhu, Y.; Chen, W.; Guo, G. Evaluating Spatiotemporal Interest Point Features for Depth-Based Action Recognition. Image Vis. Comput. 2014, 32, 453–464. [Google Scholar]
- Nunes, U.M.; Faria, D.R.; Peixoto, P. A Human Activity Recognition Framework Using Max-Min Features and Key Poses with Differential Evolution Random Forests Classifier. Pattern Recognit. Lett. 2017, 99, 21–31. [Google Scholar] [CrossRef]
- Nurwulan, N.R.; Selamaj, G. Random Forest for Human Daily Activity Recognition. J. Phys. Conf. Ser. 2020, 1655, 12087. [Google Scholar]
- Gan, L.; Chen, F. Human Action Recognition Using APJ3D and Random Forests. J. Softw. 2013, 8, 2238–2245. [Google Scholar] [CrossRef]
- Cippitelli, E.; Gasparrini, S.; Gambi, E.; Spinsante, S. A Human Activity Recognition System Using Skeleton Data from RGBD Sensors. Comput. Intell. Neurosci. 2016, 2016, 4351435. [Google Scholar] [CrossRef]
- Tran, D.N.; Phan, D.D. Human Activities Recognition in Android Smartphone Using Support Vector Machine. In Proceedings of the 2016 7th International Conference on Intelligent Systems, Modelling and Simulation (ISMS), Bangkok, Thailand, 26–28 January 2016; pp. 64–68. [Google Scholar]
- Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. Human Activity Recognition on Smartphones Using a Multiclass Hardware-Friendly Support Vector Machine. In Ambient Assisted Living and Home Care, Proceedings of the 4th International Workshop (IWAAL 2012), Vitoria-Gasteiz, Spain, 3–5 December 2012; Proceedings 4; Springer: Berlin/Heidelberg, Germany, 2012; pp. 216–223. [Google Scholar]
- Kim, Y.; Ling, H. Human Activity Classification Based on Micro-Doppler Signatures Using a Support Vector Machine. IEEE Trans. Geosci. Remote Sens. 2009, 47, 1328–1337. [Google Scholar] [CrossRef]
- Shuvo, M.M.H.; Ahmed, N.; Nouduri, K.; Palaniappan, K. A Hybrid Approach for Human Activity Recognition with Support Vector Machine and 1D Convolutional Neural Network. In Proceedings of the 2020 IEEE Applied Imagery Pattern Recognition Workshop (AIPR), Virtual, 13–15 October 2020; pp. 1–5. [Google Scholar]
- Nurhanim, K.; Elamvazuthi, I.; Izhar, L.I.; Ganesan, T. Classification of Human Activity Based on Smartphone Inertial Sensor Using Support Vector Machine. In Proceedings of the 2017 IEEE 3rd International Symposium in Robotics and Manufacturing Automation (ROMA), Kuala Lumpur, Malaysia, 19–21 September 2017; pp. 1–5. [Google Scholar]
- Mohsen, S.; Elkaseer, A.; Scholz, S.G. Human Activity Recognition Using K-Nearest Neighbor Machine Learning Algorithm. In Proceedings of the International Conference on Sustainable Design and Manufacturing, Virtual, 15–17 September 2021; pp. 304–313. [Google Scholar]
- Kaghyan, S.; Sarukhanyan, H. Activity Recognition Using K-Nearest Neighbor Algorithm on Smartphone with Tri-Axial Accelerometer. Int. J. Inform. Model. Anal. (IJIMA) ITHEA Int. Sci. Soc. Bulg. 2012, 1, 146–156. [Google Scholar]
- Al-Akam, R.; Paulus, D. RGBD Human Action Recognition Using Multi-Features Combination and k-Nearest Neighbors Classification. Int. J. Adv. Comput. Sci. Appl. 2017, 8, 383–389. [Google Scholar] [CrossRef]
- Mandong, A.; Munir, U. Smartphone Based Activity Recognition Using K-Nearest Neighbor Algorithm. In Proceedings of the International Conference on Engineering Technologies, Konya, Turkey, 26–28 October 2018; pp. 26–28. [Google Scholar]
- Zhang, L.; Wu, X.; Luo, D. Human Activity Recognition with HMM-DNN Model. In Proceedings of the 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI* CC), Beijing, China, 15–17 June 2015; pp. 192–197. [Google Scholar]
- Ronao, C.A.; Cho, S.-B. Human Activity Recognition Using Smartphone Sensors with Two-Stage Continuous Hidden Markov Models. In Proceedings of the 2014 10th International Conference on Natural Computation (ICNC), Xiamen, China, 19–21 August 2014; pp. 681–686. [Google Scholar]
- San-Segundo, R.; Montero, J.M.; Moreno-Pimentel, J.; Pardo, J.M. HMM Adaptation for Improving a Human Activity Recognition System. Algorithms 2016, 9, 60. [Google Scholar] [CrossRef]
- Shen, J.; Fang, H. Human Activity Recognition Using Gaussian Naive Bayes Algorithm in Smart Home. J. Phys. Conf. Ser. 2020, 1631, 12059. [Google Scholar] [CrossRef]
- Kose, M.; Incel, O.D.; Ersoy, C. Online Human Activity Recognition on Smart Phones. In Proceedings of the Workshop on Mobile Sensing: From Smartphones and Wearables to Big Data, Pittsburgh, PA, USA, 5–8 September 2012; Volume 16, pp. 11–15. [Google Scholar]
- Sarkar, A.M.J.; Lee, Y.-K.; Lee, S. A Smoothed Naive Bayes-Based Classifier for Activity Recognition. IETE Tech. Rev. 2010, 27, 107–119. [Google Scholar] [CrossRef]
- Chetty, G.; White, M.; Akther, F. Smart Phone Based Data Mining for Human Activity Recognition. Procedia Comput. Sci. 2015, 46, 1181–1187. [Google Scholar] [CrossRef]
- Zhang, W.; Zhao, X.; Li, Z. A Comprehensive Study of Smartphone-Based Indoor Activity Recognition via Xgboost. IEEE Access 2019, 7, 80027–80042. [Google Scholar] [CrossRef]
- Bukht, T.F.N.; Jalal, A. A Robust Model of Human Activity Recognition Using Independent Component Analysis and XGBoost. In Proceedings of the 2024 5th International Conference on Advancements in Computational Sciences (ICACS), Dubai, United Arab Emirates, 18–20 March 2024; pp. 1–7. [Google Scholar]
- O’Halloran, J.; Curry, E. A Comparison of Deep Learning Models in Human Activity Recognition and Behavioural Prediction on the MHEALTH Dataset. In Proceedings of the AICS, Galway, Ireland, 5–6 December 2019; pp. 212–223. [Google Scholar]
- Ayumi, V. Pose-Based Human Action Recognition with Extreme Gradient Boosting. In Proceedings of the 2016 IEEE Student Conference on Research and Development (SCOReD), Malacca, Malaysia, 13–14 December 2016; pp. 1–5. [Google Scholar]
- Ambati, L.S.; El-Gayar, O. Human Activity Recognition: A Comparison of Machine Learning Approaches. J. Midwest Assoc. Inf. Syst. 2021, 2021, 4. [Google Scholar]
- Sozinov, K.; Vlassov, V.; Girdzijauskas, S. Human Activity Recognition Using Federated Learning. In Proceedings of the 2018 IEEE International Conference on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainC), Melbourne, Australia, 11–13 December 2018; pp. 1103–1111. [Google Scholar]
- Bustoni, I.A.; Hidayatulloh, I.; Ningtyas, A.M.; Purwaningsih, A.; Azhari, S.N. Classification Methods Performance on Human Activity Recognition. J. Phys. Conf. Ser. 2020, 1456, 12027. [Google Scholar] [CrossRef]
- Halim, N. Stochastic Recognition of Human Daily Activities via Hybrid Descriptors and Random Forest Using Wearable Sensors. Array 2022, 15, 100190. [Google Scholar] [CrossRef]
- Wang, L. Recognition of Human Activities Using Continuous Autoencoders with Wearable Sensors. Sensors 2016, 16, 189. [Google Scholar] [CrossRef]
- Shi, D.; Li, Y.; Ding, B. Unsupervised Feature Learning for Human Activity Recognition. Guofang Keji Daxue Xuebao/J. Natl. Univ. Def. Technol. 2015, 37, 128–134. [Google Scholar] [CrossRef]
- Jamel, A.A.M.; Akay, B. Human Activity Recognition Based on Parallel Approximation Kernel K-Means Algorithm. Comput. Syst. Sci. Eng. 2020, 35, 441. [Google Scholar] [CrossRef]
- Manzi, A.; Dario, P.; Cavallo, F. A Human Activity Recognition System Based on Dynamic Clustering of Skeleton Data. Sensors 2017, 17, 1100. [Google Scholar] [CrossRef]
- Gaglio, S.; Re, G.L.; Morana, M. Human Activity Recognition Process Using 3-D Posture Data. IEEE Trans. Human-Machine Syst. 2015, 45, 586–597. [Google Scholar] [CrossRef]
- Colpas, P.A.; Vicario, E.; De-La-Hoz-Franco, E.; Pineres-Melo, M.; Oviedo-Carrascal, A.; Patara, F. Unsupervised Human Activity Recognition Using the Clustering Approach: A Review. Sensors 2020, 20, 2702. [Google Scholar] [CrossRef]
- Wang, T.; Ng, W.W.Y.; Li, J.; Wu, Q.; Zhang, S.; Nugent, C.; Shewell, C. A Deep Clustering via Automatic Feature Embedded Learning for Human Activity Recognition. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 210–223. [Google Scholar]
- Mejia-Ricart, L.F.; Helling, P.; Olmsted, A. Evaluate Action Primitives for Human Activity Recognition Using Unsupervised Learning Approach. In Proceedings of the 2017 12th International Conference for Internet Technology and Secured Transactions (ICITST), London, UK, 11–14 December 2017; pp. 186–188. [Google Scholar]
- Dobbins, C.; Rawassizadeh, R. Towards Clustering of Mobile and Smartwatch Accelerometer Data for Physical Activity Recognition. Informatics 2018, 5, 29. [Google Scholar] [CrossRef]
- Piyathilaka, L.; Kodagoda, S. Gaussian Mixture Based HMM for Human Daily Activity Recognition Using 3D Skeleton Features. In Proceedings of the 2013 IEEE 8th Conference on Industrial Electronics and Applications (ICIEA), Melbourne, Australia, 19–21 June 2013; pp. 567–572. [Google Scholar]
- Cheng, X.; Huang, B. CSI-Based Human Continuous Activity Recognition Using GMM–HMM. IEEE Sens. J. 2022, 22, 18709–18717. [Google Scholar]
- Yang, X. Personalized Semi-Supervised Federated Learning for Human Activity Recognition. arXiv 2021, arXiv:2104.08094. [Google Scholar]
- Yang, S.-H.; Baek, D.-G.; Thapa, K. Semi-Supervised Adversarial Learning Using LSTM for Human Activity Recognition. Sensors 2022, 22, 4755. [Google Scholar] [CrossRef]
- Cook, D.; Crandall, A.; Thomas, B.; Krishnan, N. CASAS: A Smart Home in a Box. Computer 2013, 46, 62–69. [Google Scholar] [CrossRef]
- Oh, S.; Ashiquzzaman, A.; Lee, D.; Kim, Y.; Kim, J. Study on Human Activity Recognition Using Semi-Supervised Active Transfer Learning. Sensors 2021, 21, 2760. [Google Scholar] [CrossRef]
- Qu, Y.; Tang, Y.; Yang, X.; Wen, Y.; Zhang, W. Context-Aware Mutual Learning for Semi-Supervised Human Activity Recognition Using Wearable Sensors. Expert Syst. Appl. 2023, 219, 119679. [Google Scholar] [CrossRef]
- Anguita, D.; Ghio, A.; Oneto, L.; Parra, X.; Reyes-Ortiz, J.L. A Public Domain Dataset for Human Activity Recognition Using Smartphones. In Proceedings of the ESANN, Bruges, Belgium, 24–26 April 2013; Volume 3, p. 3. [Google Scholar]
- Wan, S.; Qi, L.; Xu, X.; Tong, C.; Gu, Z. Deep Learning Models for Real-Time Human Activity Recognition with Smartphones. Mob. Netw. Appl. 2020, 25, 743–755. [Google Scholar] [CrossRef]
- Banos, O.; Garcia, R.; Holgado-Terriza, J.A.; Damas, M.; Pomares, H.; Rojas, I.; Saez, A.; Villalonga, C. MHealthDroid: A Novel Framework for Agile Development of Mobile Health Applications. In Ambient Assisted Living and Daily Activities; Pecchia, L., Chen, L.L., Nugent, C., Bravo, J., Eds.; Springer International Publishing: Cham, Switzerland, 2014; pp. 91–98. [Google Scholar]
- Kim, D.; Han, S.; Son, H.; Lee, D. Human Activity Recognition Using Semi-Supervised Multi-Modal DEC for Instagram Data; Springer: Cham, Switzerland, 2020; pp. 869–880. ISBN 978-3-030-47425-6. [Google Scholar]
- Paola Patricia, A.C.; Rosberg, P.C.; Butt-Aziz, S.; Marlon Alberto, P.M.; Roberto-Cesar, M.O.; Miguel, U.T.; Naz, S. Semi-Supervised Ensemble Learning for Human Activity Recognition in Casas Kyoto Dataset. Heliyon 2024, 10, e29398. [Google Scholar] [CrossRef]
- Zeng, M.; Yu, T.; Wang, X.; Nguyen, L.T.; Mengshoel, O.J.; Lane, I. Semi-Supervised Convolutional Neural Networks for Human Activity Recognition. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data), Boston, MA, USA, 11–14 December 2017; pp. 522–529. [Google Scholar]
- Zilelioglu, H.; Khodabandelou, G.; Chibani, A.; Amirat, Y. Semisupervised Generative Adversarial Networks with Temporal Convolutions for Human Activity Recognition. IEEE Sens. J. 2023, 23, 12355–12369. [Google Scholar] [CrossRef]
- Mehta, D.; Rhodin, H.; Casas, D.; Fua, P.; Sotnychenko, O.; Xu, W.; Theobalt, C. Monocular 3d Human Pose Estimation in the Wild Using Improved Cnn Supervision. In Proceedings of the 2017 International Conference on 3D Vision (3DV), Qingdao, China, 10–12 October 2017; pp. 506–516. [Google Scholar]
- Ronao, C.A.; Cho, S.-B. Human Activity Recognition with Smartphone Sensors Using Deep Learning Neural Networks. Expert Syst. Appl. 2016, 59, 235–244. [Google Scholar] [CrossRef]
- Zhang, S.; Li, Y.; Zhang, S.; Shahabi, F.; Xia, S.; Deng, Y.; Alshurafa, N. Deep Learning in Human Activity Recognition with Wearable Sensors: A Review on Advances. Sensors 2022, 22, 1476. [Google Scholar] [CrossRef]
- Wang, H.; Zhao, J.; Li, J.; Tian, L.; Tu, P.; Cao, T.; An, Y.; Wang, K.; Li, S. Wearable Sensor-Based Human Activity Recognition Using Hybrid Deep Learning Techniques. Secur. Commun. Netw. 2020, 2020, 2132138. [Google Scholar] [CrossRef]
- Park, S.U.; Park, J.H.; Al-Masni, M.A.; Al-Antari, M.A.; Uddin, M.Z.; Kim, T.S. A Depth Camera-Based Human Activity Recognition via Deep Learning Recurrent Neural Network for Health and Social Care Services. Procedia Comput. Sci. 2016, 100, 78–84. [Google Scholar] [CrossRef]
- Ren, B.; Liu, M.; Ding, R.; Liu, H. A Survey on 3D Skeleton-Based Action Recognition Using Learning Method. arXiv 2020, arXiv:2002.05907. [Google Scholar]
- Schuster, M.; Paliwal, K.K. Bidirectional Recurrent Neural Networks. IEEE Trans. Signal Process. 1997, 45, 2673–2681. [Google Scholar] [CrossRef]
- Liu, Y.; Lu, Z.; Li, J.; Yang, T.; Yao, C. Global Temporal Representation Based CNNs for Infrared Action Recognition. IEEE Signal Process. Lett. 2018, 25, 848–852. [Google Scholar]
- Cardenas, E.J.E.; Chavez, G.C. Multimodal Human Action Recognition Based on a Fusion of Dynamic Images Using CNN Descriptors. In Proceedings of the SIBGRAPI, Foz do Iguaçu, Brazil, 7–10 October 2018. [Google Scholar]
- Bianchi, V.; Bassoli, M.; Lombardo, G.; Fornacciari, P.; Mordonini, M.; De Munari, I. IoT Wearable Sensor and Deep Learning: An Integrated Approach for Personalized Human Activity Recognition in a Smart Home Environment. IEEE Internet Things J. 2019, 6, 8553–8562. [Google Scholar]
- Herath, S.; Harandi, M.; Porikli, F. Going Deeper into Action Recognition: A Survey. Image Vis. Comput. 2017, 60, 4–21. [Google Scholar] [CrossRef]
- Nguyen, H.C.; Nguyen, T.H.; Scherer, R.; Le, V.H. Deep Learning for Human Activity Recognition on 3D Human Skeleton: Survey and Comparative Study. Sensors 2023, 23, 5121. [Google Scholar] [CrossRef]
- Wang, P.; Li, W.; Gao, Z.; Zhang, Y.; Tang, C.; Ogunbona, P. Scene Flow to Action Map: A New Representation for Rgb-d Based Action Recognition with Convolutional Neural Networks. In Proceedings of the CVPR, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Ke, Q.; An, S.; Bennamoun, M.; Sohel, F.; Boussaid, F. Skeletonnet: Mining Deep Part Features for 3-D Action Recognition. IEEE signal Process. lett. 2017, 24, 731–735. [Google Scholar] [CrossRef]
- Rahmani, H.; Bennamoun, M. Learning Action Recognition Model from Depth and Skeleton Videos. In Proceedings of the ICCV, Venice, Italy, 22–29 October 2017. [Google Scholar]
- Yan, S.; Xiong, Y.; Lin, D. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; AAAI Press: Washington, DC, USA, 2018. [Google Scholar]
- Wang, L.; Xiong, Y.; Wang, Z.; Qiao, Y.; Lin, D.; Tang, X.; Van Gool, L. Temporal Segment Networks for Action Recognition in Videos. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 2740–2755. [Google Scholar]
- Li, M.; Chen, S.; Chen, X.; Zhang, Y.; Wang, Y.; Tian, Q. Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 3590–3598. [Google Scholar]
- Song, Y.-F.; Zhang, Z.; Shan, C.; Wang, L. Richly Activated Graph Convolutional Network for Robust Skeleton-Based Action Recognition. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 1915–1925. [Google Scholar] [CrossRef]
- Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 12018–12027. [Google Scholar]
- Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Skeleton-Based Action Recognition with Directed Graph Neural Networks. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 7904–7913. [Google Scholar]
- Yang, H.; Yan, D.; Zhang, L.; Sun, Y.; Li, D.; Maybank, S.J. Feedback Graph Convolutional Network for Skeleton-Based Action Recognition. IEEE Trans. Image Process. 2022, 31, 164–175. [Google Scholar] [CrossRef]
- Cheng, K.; Zhang, Y.; He, X.; Chen, W.; Cheng, J.; Lu, H. Skeleton-Based Action Recognition with Shift Graph Convolutional Network. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 13–19 June 2020; pp. 180–189. [Google Scholar]
- Shi, L.; Zhang, Y.; Cheng, J.; Lu, H. Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action-Gesture Recognition. In Proceedings of the Asian Conference on Computer Vision, Macau, China, 1–4 December 2022; pp. 38–53. [Google Scholar]
- De Smedt, Q.; Wannous, H.; Vandeborre, J.-P.; Guerry, J.; Le Saux, B.; Filliat, D. SHREC’17 Track: 3D Hand Gesture Recognition Using a Depth and Skeletal Dataset. In Proceedings of the 3DOR—10th Eurographics Workshop on 3D Object Retrieval, Lyon, France, 26 April 2017; pp. 1–6. [Google Scholar]
- De Smedt, Q.; Wannous, H.; Vandeborre, J.-P. Skeleton-Based Dynamic Hand Gesture Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Las Vegas, NV, USA, 27–30 June 2016; pp. 1–9. [Google Scholar]
- Liu, Z.; Zhang, H.; Chen, Z.; Wang, Z.; Ouyang, W. Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, 13–19 June 2020; pp. 143–152. [Google Scholar]
- Kwon, H.; Kim, M.; Kwak, S.; Cho, M. Learning Self-Similarity in Space and Time as Generalized Motion for Video Action Recognition. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Virtual, 11–17 October 2021; pp. 13045–13055. [Google Scholar]
- Davoodikakhki, M.; Yin, K. Hierarchical Action Classification with Network Pruning. In Advances in Visual Computing, Proceedings of the 15th International Symposium (ISVC 2020), San Diego, CA, USA, 5–7 October 2020; Proceedings, Part I 15; Springer International Publishing: New York, NY, USA, 2020; pp. 291–305. [Google Scholar]
- Das, S.; Dai, R.; Yang, D.; Bremond, F. VPN++: Rethinking Video-Pose Embeddings for Understanding Activities of Daily Living. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 9703–9717. [Google Scholar]
- Das, S.; Dai, R.; Koperski, M.; Minciullo, L.; Garattoni, L.; Bremond, F.; Francesca, G. Toyota Smarthome: Real-World Activities of Daily Living. In Proceedings of the ICCV, Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar]
- Liu, Z.; Ning, J.; Cao, Y.; Wei, Y.; Zhang, Z.; Lin, S.; Hu, H. Video Swin Transformer. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 19–24 June 2022; pp. 3192–3201. [Google Scholar]
- Duan, H.; Zhao, Y.; Xiong, Y.; Liu, W.; Lin, D. Omni-Sourced Webly-Supervised Learning for Video Recognition. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 670–688. [Google Scholar]
- Cook, D.; Feuz, K.D.; Krishnan, N.C. Transfer Learning for Activity Recognition: A Survey. Knowl. Inf. Syst. 2013, 36, 537–556. [Google Scholar] [CrossRef] [PubMed]
- Ranasinghe, K.; Naseer, M.; Khan, S.; Khan, F.S.; Ryoo, M.S. Self-Supervised Video Transformer. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Bertasius, G.; Wang, H.; Torresani, L. Is Space-Time Attention All You Need for Video Understanding? In Proceedings of the ICML, Virtual, 18–24 July 2021. [Google Scholar]
- Yang, J.; Dong, X.; Liu, L.; Zhang, C.; Shen, J.; Yu, D. Recurring the Transformer for Video Action Recognition. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Carreira, J.; Zisserman, A. Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset. In Proceedings of the CVPR, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- Girdhar, R.; Carreira, J.; Doersch, C.; Zisserman, A. Video Action Transformer Network. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 244–253. [Google Scholar]
- Li, S.; Cao, Q.; Liu, L.; Yang, K.; Liu, S.; Hou, J.; Yi, S. GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer. In Proceedings of the ICCV, Virtual, 11–17 October 2021. [Google Scholar]
- Choi, W.; Shahid, K.; Savarese, S. What Are They Doing?: Collective Activity Classification Using Spatio-Temporal Relationship among People. In Proceedings of the 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), Kyoto, Japan, 27 September–4 October 2009; pp. 1282–1289. [Google Scholar] [CrossRef]
- Neimark, D.; Bar, O.; Zohar, M.; Asselmann, D. Video Transformer Network. In Proceedings of the ICCV, Virtual, 11–17 October 2021. [Google Scholar]
- Zhang, Y.; Li, X.; Liu, C.; Shuai, B.; Zhu, Y.; Brattoli, B.; Chen, H.; Marsic, I.; Tighe, J. VidTr: Video Transformer Without Convolutions. In Proceedings of the ICCV, Virtual, 11–17 October 2021. [Google Scholar]
- Kuehne, H.; Jhuang, H.; Garrote, E.; Poggio, T.; Serre, T. HMDB: A Large Video Database for Human Motion Recognition. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2556–2563. [Google Scholar]
- Bulat, A.; Perez Rua, J.M.; Sudhakaran, S.; Martinez, B.; Tzimiropoulos, G. Space-Time Mixing Attention for Video Transformer. Adv. Neural Inf. Process. Syst. 2021, 34, 19594–19607. [Google Scholar]
- Patrick, M.; Campbell, D.; Asano, Y.; Misra, I.; Metze, F.; Feichtenhofer, C.; Vedaldi, A.; Henriques, J.F. Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. Adv. Neural Inf. Process. Syst. 2021, 34, 12493–12506. [Google Scholar]
- Fan, H.; Xiong, B.; Mangalam, K.; Li, Y.; Yan, Z.; Malik, J.; Feichtenhofer, C. Multiscale Vision Transformers. In Proceedings of the ICCV, Virtual, 11–17 October 2021. [Google Scholar]
- Arnab, A.; Dehghani, M.; Heigold, G.; Sun, C.; Lučić, M.; Schmid, C. ViViT: A Video Vision Transformer. In Proceedings of the ICCV, Virtual, 11–17 October 2021. [Google Scholar]
- Ryoo, M.S.; Piergiovanni, A.J.; Arnab, A.; Dehghani, M.; Angelova, A. TokenLearner: What Can 8 Learned Tokens Do for Images and Videos? arXiv 2021, arXiv:2106.11297. [Google Scholar]
- Piergiovanni, A.J.; Ryoo, M. AViD Dataset: Anonymized Videos from Diverse Countries. Adv. Neural Inf. Process. Syst. 2020, 33, 16711–16721. [Google Scholar]
- Zha, X.; Zhu, W.; Xun, L.T.; Yang, S.; Liu, J. Shifted Chunk Transformer for Spatio-Temporal Representational Learning. In Proceedings of the NeurIPS, Virtual, 6–14 December 2021. [Google Scholar]
- Zhang, F.Z.; Campbell, D.; Gould, S. Efficient Two-Stage Detection of Human-Object Interactions with a Novel Unary-Pairwise Transformer. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Guo, H.; Wang, H.; Ji, Q. Uncertainty-Guided Probabilistic Transformer for Complex Action Recognition. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Chen, C.-F.; Panda, R.; Fan, Q. RegionViT: Regional-to-Local Attention for Vision Transformers. In Proceedings of the ICLR, Virtual, 25–29 April 2022. [Google Scholar]
- Truong, T.-D.; Bui, Q.-H.; Duong, C.N.; Seo, H.-S.; Phung, S.L.; Li, X.; Luu, K. DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Li, K.; Wang, Y.; Gao, P.; Song, G.; Liu, Y.; Li, H.; Qiao, Y. UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning. In Proceedings of the ICLR, Virtual, 25–29 April 2022. [Google Scholar]
- Fan, Q.; Chen, C.-F.; Panda, R. Can an Image Classifier Suffice For Action Recognition? In Proceedings of the ICLR, Virtual, 25–29 April 2022. [Google Scholar]
- Yan, S.; Xiong, X.; Arnab, A.; Lu, Z.; Zhang, M.; Sun, C.; Schmid, C. Multiview Transformers for Video Recognition. In Proceedings of the CVPR, New Orleans, LA, USA, 19–24 June 2022. [Google Scholar]
- Tong, Z.; Song, Y.; Wang, J.; Wang, L. VideoMAE: Masked Autoencoders Are Data-Efficient Learners for Self-Supervised Video Pre-Training. Adv. Neural Inf. Process. Syst. 2022, 35, 10078–10093. [Google Scholar]
- Feichtenhofer, C.; Li, Y.; He, K. Masked Autoencoders as Spatiotemporal Learners. Adv. Neural Inf. Process. Syst. 2022, 35, 35946–35958. [Google Scholar]
- Wang, J.; Bertasius, G.; Tran, D.; Torresani, L. Long-Short Temporal Contrastive Learning of Video Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 14010–14020. [Google Scholar]
- Wang, R.; Chen, D.; Wu, Z.; Chen, Y.; Dai, X.; Liu, M.; Jiang, Y.-G.; Zhou, L.; Yuan, L. BEVT: BERT Pretraining of Video Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 14733–14743. [Google Scholar]
- Girdhar, R.; El-Nouby, A.; Singh, M.; Alwala, K.V.; Joulin, A.; Misra, I. OmniMAE: Single Model Masked Pretraining on Images and Videos. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 10406–10417. [Google Scholar]
- Sun, X.; Chen, P.; Chen, L.; Li, C.; Li, T.H.; Tan, M.; Gan, C. Masked Motion Encoding for Self-Supervised Video Representation Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 2235–2245. [Google Scholar]
- Fan, D.; Wang, J.; Liao, S.; Zhu, Y.; Bhat, V.; Santos-Villalobos, H.; MV, R.; Li, X. Motion-Guided Masking for Spatiotemporal Representation Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 5619–5629. [Google Scholar]
- Nikpour, B.; Sinodinos, D.; Armanfard, N. Deep Reinforcement Learning in Human Activity Recognition: A Survey and Outlook. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 4267–4278. [Google Scholar]
- Laptev, I.; Marszalek, M.; Schmid, C.; Rozenfeld, B. Learning Realistic Human Actions from Movies. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
- Klaser, A.; Marszałek, M.; Schmid, C. A Spatio-Temporal Descriptor Based on 3D-Gradients. In Proceedings of the BMVC 2008—19th British Machine Vision Conference, Leeds, UK, 1–4 September 2008. [Google Scholar]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 2018. [Google Scholar]
- Liu, Z.; Yao, C.; Yu, H.; Wu, T. Deep Reinforcement Learning with Its Application for Lung Cancer Detection in Medical Internet of Things. Future Gener. Comput. Syst. 2019, 97, 1–9. [Google Scholar] [CrossRef]
- Shao, K.; Tang, Z.; Zhu, Y.; Li, N.; Zhao, D. A Survey of Deep Reinforcement Learning in Video Games. arXiv 2019, arXiv:1912.10944. [Google Scholar]
- Xu, W.; Yu, J.; Miao, Z.; Wan, L.; Ji, Q. Spatio-Temporal Deep Q-Networks for Human Activity Localization. IEEE Trans. Circuits Syst. Video Technol. 2019, 30, 2984–2999. [Google Scholar] [CrossRef]
- Rodriguez, M.D.; Ahmed, J.; Shah, M. Action Mach a Spatio-Temporal Maximum Average Correlation Height Filter for Action Recognition. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 23–28 June 2008; pp. 1–8. [Google Scholar]
- Wang, L.; Qiao, Y.; Tang, X. Video Action Detection with Relational Dynamic-Poselets. In Computer Vision—ECCV 2014, Proceedings of the 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13; Springer: Zurich, Switzerland, 2014; pp. 565–580. [Google Scholar]
- Tang, Y.; Tian, Y.; Lu, J.; Li, P.; Zhou, J. Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition. In Proceedings of the CVPR, Salt Lake City, UT, USA, 18–22 June 2018. [Google Scholar]
- Hu, J.-F.; Zheng, W.-S.; Lai, J.; Zhang, J. Jointly Learning Heterogeneous Features for RGB-D Activity Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 5344–5352. [Google Scholar]
- Wu, W.; He, D.; Tan, X.; Chen, S.; Wen, S. Multi-Agent Reinforcement Learning Based Frame Sampling for Effective Untrimmed Video Recognition. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6222–6231. [Google Scholar]
- Lu, Y.; Li, Y.; Velipasalar, S. Efficient Human Activity Classification from Egocentric Videos Incorporating Actor-Critic Reinforcement Learning. In Proceedings of the 2019 IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 564–568. [Google Scholar]
- Kumrai, T.; Korpela, J.; Maekawa, T.; Yu, Y.; Kanai, R. Human Activity Recognition with Deep Reinforcement Learning Using the Camera of a Mobile Robot. In Proceedings of the 2020 IEEE International Conference on Pervasive Computing and Communications (PerCom), Austin, TX, USA, 23–27 March 2020; pp. 1–10. [Google Scholar]
- Brodeur, S.; Perez, E.; Anand, A.; Golemo, F.; Celotti, L.; Strub, F.; Rouat, J.; Larochelle, H.; Courville, A. Home: A Household Multimodal Environment. arXiv 2017, arXiv:1711.11017. [Google Scholar]
- Ma, Y.; Arshad, S.; Muniraju, S.; Torkildson, E.; Rantala, E.; Doppler, K.; Zhou, G. Location-and Person-Independent Activity Recognition with WiFi, Deep Neural Networks, and Reinforcement Learning. ACM Trans. Internet Things 2021, 2, 1–25. [Google Scholar]
- Yousefi, S.; Narui, H.; Dayal, S.; Ermon, S.; Valaee, S. A Survey on Behavior Recognition Using WiFi Channel State Information. IEEE Commun. Mag. 2017, 55, 98–104. [Google Scholar] [CrossRef]
- Palipana, S.; Rojas, D.; Agrawal, P.; Pesch, D. FallDeFi: Ubiquitous Fall Detection Using Commodity Wi-Fi Devices. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2018, 1, 1–25. [Google Scholar]
- Gowda, S.N.; Sevilla-Lara, L.; Keller, F.; Rohrbach, M. Claster: Clustering with Reinforcement Learning for Zero-Shot Action Recognition. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 187–203. [Google Scholar]
- Niebles, J.C.; Chen, C.-W.; Fei-Fei, L. Modeling Temporal Structure of Decomposable Motion Segments for Activity Classification. In Proceedings of the Computer Vision—ECCV 2010: 11th European Conference on Computer Vision, Heraklion, Crete, Greece, 5–11 September 2010; Proceedings, Part II 11. Springer: Berlin/Heidelberg, Germany, 2010; pp. 392–405. [Google Scholar]
- Li, Q.; Gravina, R.; Li, Y.; Alsamhi, S.H.; Sun, F.; Fortino, G. Multi-User Activity Recognition: Challenges and Opportunities. Inf. Fusion 2020, 63, 121–135. [Google Scholar] [CrossRef]
- Wang, J.; Chen, Y.; Hao, S.; Peng, X.; Hu, L. Deep Learning for Sensor-Based Activity Recognition: A Survey. Pattern Recognit. Lett. 2019, 119, 3–11. [Google Scholar] [CrossRef]
- Zhang, H.; Zhang, L.; Qui, X.; Li, H.; Torr, P.H.S.; Koniusz, P. Few-Shot Action Recognition with Permutation-Invariant Attention. In Proceedings of the ECCV, Glasgow, UK, 23–28 August 2020. [Google Scholar]
- Wei, Y.; Liu, H.; Xie, T.; Ke, Q.; Guo, Y. Spatial-Temporal Transformer for 3D Point Cloud Sequences. In Proceedings of the WACV, Waikoloa, HI, USA, 4–8 January 2022. [Google Scholar]
- Dalianis, H. Evaluation Metrics and Evaluation. In Clinical Text Mining; Springer International Publishing: Cham, Switzerland, 2018; pp. 45–53. [Google Scholar]
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587. [Google Scholar]
- Davis, J.; Goadrich, M. The Relationship between Precision-Recall and ROC Curves. In Proceedings of the 23rd International Conference on Machine Learning—ICML ’06, Pittsburgh, PA, USA, 25–29 June 2006; ACM Press: New York, New York, USA, 2006; pp. 233–240. [Google Scholar]
- Arbelaitz, O.; Gurrutxaga, I.; Muguerza, J.; Pérez, J.M.; Perona, I. An Extensive Comparative Study of Cluster Validity Indices. Pattern Recognit. 2013, 46, 243–256. [Google Scholar] [CrossRef]
- Wijekoon, A.; Wiratunga, N.; Sani, S.; Cooper, K. A Knowledge-Light Approach to Personalised and Open-Ended Human Activity Recognition. Knowl.-Based Syst. 2020, 192, 105651. [Google Scholar] [CrossRef]
- Abdallah, Z.S.; Gaber, M.M.; Srinivasan, B.; Krishnaswamy, S. Activity Recognition with Evolving Data Streams: A Review. ACM Comput. Surv. 2018, 51, 1–36. [Google Scholar] [CrossRef]
- Saleem, G.; Raza, R.H. Toward Human Activity Recognition: A Survey; Springer: London, UK, 2022; Volume 4. [Google Scholar]
- Hassan, M.M.; Ullah, S.; Hossain, M.S.; Alelaiwi, A. An End-to-End Deep Learning Model for Human Activity Recognition from Highly Sparse Body Sensor Data in Internet of Medical Things Environment. J. Supercomput. 2021, 77, 2237–2250. [Google Scholar] [CrossRef]
- Long, X.; Yin, B.; Aarts, R.M. Single-Accelerometer-Based Daily Physical Activity Classification. In Proceedings of the 2009 Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2009), Minneapolis, MN, USA, 3–6 September 2009; pp. 6107–6110. [Google Scholar] [CrossRef]
- Fahad, L.G.; Tahir, S.F.; Rajarajan, M. Activity Recognition in Smart Homes Using Clustering Based Classification. In Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden, 24–28 August 2014; pp. 1348–1353. [Google Scholar]
- Ramirez, H.; Velastin, S.A.; Meza, I.; Fabregas, E.; Makris, D.; Farias, G. Fall Detection and Activity Recognition Using Human Skeleton Features. IEEE Access 2021, 9, 33532–33542. [Google Scholar]
- Yadav, S.K.; Tiwari, K.; Pandey, H.M.; Akbar, S.A. A Review of Multimodal Human Activity Recognition with Special Emphasis on Classification, Applications, Challenges and Future Directions. Knowl.-Based Syst. 2021, 223, 106970. [Google Scholar] [CrossRef]
- De, D.; Bharti, P.; Das, S.K.; Chellappan, S. Multimodal Wearable Sensing for Fine-Grained Activity Recognition in Healthcare. IEEE Internet Comput. 2015, 19, 26–35. [Google Scholar] [CrossRef]
- Chen, C.; Jafari, R.; Kehtarnavaz, N. A Survey of Depth and Inertial Sensor Fusion for Human Action Recognition. Multimed. Tools Appl. 2017, 76, 4405–4425. [Google Scholar] [CrossRef]
- Majumder, S.; Kehtarnavaz, N. Vision and Inertial Sensing Fusion for Human Action Recognition: A Review. IEEE Sens. J. 2020, 21, 2454–2467. [Google Scholar]
- Wei, H.; Kehtarnavaz, N. Simultaneous Utilization of Inertial and Video Sensing for Action Detection and Recognition in Continuous Action Streams. IEEE Sens. J. 2020, 20, 6055–6063. [Google Scholar] [CrossRef]
- Islam, M.M.; Nooruddin, S.; Karray, F.; Muhammad, G. Multi-Level Feature Fusion for Multimodal Human Activity Recognition in Internet of Healthcare Things. Inf. Fusion 2023, 94, 17–31. [Google Scholar] [CrossRef]
- Rokni, S.A.; Nourollahi, M.; Ghasemzadeh, H. Personalized Human Activity Recognition Using Convolutional Neural Networks. In Proceedings of the AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 32. [Google Scholar]
- Ferrari, A.; Micucci, D.; Mobilio, M.; Napoletano, P. On the Personalization of Classification Models for Human Activity Recognition. IEEE Access 2020, 8, 32066–32079. [Google Scholar] [CrossRef]
- Lendínez, A.M.; Linares, C.; Perandres, A.; Cruz, A.; Ruiz, J.L.L.; Nugent, C.; Espinilla, M. Transforming Elderly Care Through Ethical and Social Evaluation of Intelligent Activity Recognition Systems in Nursing Homes. In Proceedings of the International Conference on Ubiquitous Computing and Ambient Intelligence, Salamanca, Spain, 28 November–1 December 2023; pp. 221–227. [Google Scholar]
- Jyothi, A.P.; Kumar, M.; Pravallika, V.; Saha, R.; Sanjana, K.; Varshitha, V.; Shankar, A.; Narayan, A. Face and Action Detection for Crime Prevention: Balancing Promise and Ethical Considerations. In Proceedings of the 2023 International Conference on Integrated Intelligence and Communication Systems (ICIICS), Bengaluru, India, 14–16 December 2023; pp. 1–7. [Google Scholar]
- Chen, K.; Zhang, D.; Yao, L.; Guo, B.; Yu, Z.; Liu, Y. Deep Learning for Sensor-Based Human Activity Recognition: Overview, Challenges, and Opportunities. ACM Comput. Surv. 2021, 54, 1–40. [Google Scholar]
- Zhu, G.; Zhang, L.; Shen, P.; Song, J. An Online Continuous Human Action Recognition Algorithm Based on the Kinect Sensor. Sensors 2016, 16, 161. [Google Scholar] [CrossRef] [PubMed]
- Uddin, M.Z.; Soylu, A. Human Activity Recognition Using Wearable Sensors, Discriminant Analysis, and Long Short-Term Memory-Based Neural Structured Learning. Sci. Rep. 2021, 11, 16455. [Google Scholar] [CrossRef]
- Babaians, E.; Korghond, N.K.; Ahmadi, A.; Karimi, M.; Ghidary, S.S. Skeleton and Visual Tracking Fusion for Human Following Task of Service Robots. In Proceedings of the 2015 3rd RSI International Conference on Robotics and Mechatronics (ICROM), Tehran, Iran, 7–9 October 2015; pp. 761–766. [Google Scholar] [CrossRef]
- Sargano, A.B.; Wang, X.; Angelov, P.; Habib, Z. Human Action Recognition Using Transfer Learning with Deep Representations. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 463–469. [Google Scholar]
Method | Year | Modality | Top Accuracy (%) | Datasets Used |
---|---|---|---|---|
Estimation-based [77] | 2008 | Wearable | 72.7 | [77] |
UnADevs using temporal behaviour assumption [73] | 2017 | Wearable | 52.0, 61.0 | JSI-ADL, REALDISP
Stream sequence mining [79,80] | 2014 | Smart Home | - | CASAS |
Discover unknown activities [81] | 2016 | Smart Home | 68.01 | CASAS |
Incremental clustering using K-means [82] | 2013 | Skeleton | 71.9 | CAD-60
Particle swarm [83] | 2023 | Skeleton | 80.1, 58.10, 64.1, 45.1, 40.1 | CAD-60, UTK, F3D, KARD, MSR |
Long-term visual behaviour [84] | 2015 | RGB | 74.75 | Gaze [84] |
Unsupervised discovery using spectral clustering [85] | 2006 | RGB | 50.01, 52.1, 49.5 | Figure Skating [85], Baseball, Basketball |
HMDSM [86] | 2019 | Smart Home | 91.5 | UASH [86] |
Topic models [87] | 2014 | Smart Home | 85.1 | [87] |
Evaluation of multiple clustering algorithms (K-means, Spectral, GMM) [76] | 2023 | Skeleton | 70.1, 68.05, 74.5 | CAD-60, UTK, UBDKinect
DBSCAN [68] | 2014 | Smartphone | 80.4 | [68]
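Most of the activity-discovery methods in the table above share a common pipeline: segment the sensor stream into windows, summarise each window as a fixed-length feature vector, and cluster the vectors into candidate activities. The sketch below illustrates that pipeline with scikit-learn's K-means and DBSCAN on synthetic skeleton windows; the window shape, the feature choice (per-dimension mean and standard deviation), and all parameter values are illustrative assumptions, not taken from any specific method in the table.

```python
# Minimal sketch of clustering-based activity discovery.
# Assumes skeleton windows of shape (n_windows, n_frames, n_joints * 3);
# the features and parameters below are illustrative only.
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.preprocessing import StandardScaler

def window_features(windows: np.ndarray) -> np.ndarray:
    """Summarise each window by its per-dimension mean and standard deviation."""
    return np.concatenate([windows.mean(axis=1), windows.std(axis=1)], axis=1)

rng = np.random.default_rng(0)
windows = rng.normal(size=(200, 30, 45))  # 200 windows, 30 frames, 15 joints x 3 coords

X = StandardScaler().fit_transform(window_features(windows))

# K-means requires the number of activities up front (guessed as 6 here).
kmeans_labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)

# DBSCAN instead infers the number of clusters from density and
# marks outliers with the label -1.
dbscan_labels = DBSCAN(eps=3.0, min_samples=5).fit_predict(X)

print(np.bincount(kmeans_labels), np.unique(dbscan_labels))
```

In practice, the discovered cluster labels would then be mapped to activity names, for example by inspecting a few representative windows per cluster, which is where the surveyed methods differ most.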
Modality | Devices |
---|---|
Silhouette [88] | Different cameras
RGB | Standard RGB cameras
Infrared [89] | Infrared cameras
Depth [90] | Depth cameras such as Microsoft Kinect/Kinect Azure
Skeletons [34] | Derived sensors/RGB cameras
Mocap [91] | Commercial Mocap systems
Point cloud [92] | Depth sensors (such as Kinect, LiDAR, range sensors), stereo cameras
Event stream [93] | Specialised event cameras
Thermal [94] | Thermal cameras
Audio [95] | Microphones, audio recording devices
Radar [96] | Radar devices
WiFi CSI [97] | Standard WiFi routers, network interface cards
Inertial sensors [98] | Smartwatches, fitness trackers, smartphones with IMUs (accelerometers, gyroscopes, magnetometers)
Environmental sensor [39] | Sensors embedded in floors or wearables; various proximity sensor devices
Dataset | Year | Modality | #Class | #Subject | #Sample | #Viewpoint |
---|---|---|---|---|---|---|
UCF101 [99] | 2012 | RGB | 101 | - | 13,320 | - |
Sports-1M [100] | 2014 | RGB | 487 | - | 1,113,158 | - |
ActivityNet [101] | 2015 | RGB | 203 | - | 27,801 | - |
THUMOS Challenge 15 [102] | 2015 | RGB | 101 | - | 24,017 | - |
Charades [103] | 2016 | RGB | 157 | 267 | 9848 | - |
InfAR [104] | 2016 | IR | 12 | 40 | 600 | 2 |
NTU RGB+D [105] | 2016 | RGB, S, D, IR | 60 | 40 | 56,880 | 80
DvsGesture [106] | 2017 | Event Stream | 17 | 29 | - | - |
FCVID [107] | 2017 | RGB | 239 | - | 91,233 | - |
Kinetics-400 [108] | 2017 | RGB | 400 | - | 306,245 | - |
PKU-MMD [109] | 2017 | RGB, S, D, IR | 51 | 66 | 1076 | 3 |
Something-Something-v1 [110] | 2017 | RGB | 174 | - | 108,499 | - |
Kinetics-600 [111] | 2018 | RGB | 600 | - | 495,547 | - |
RGB-D Varying-view [112] | 2018 | RGB, S, D | 40 | 118 | 25,600 | 8 + 1 (360°) |
DHP19 [94] | 2019 | ES, S | 33 | 17 | - | -
Drive&Act [113] | 2019 | RGB, S, D, IR | 83 | 15 | - | 6 |
EgoGesture [114] | 2018 | RGB, D | 83 | 50 | 2,953,224 | Egocentric
Moments in time [115] | 2019 | RGB | 339 | - | ∼1,000,000 | - |
NTU RGB+D 120 [116] | 2019 | RGB, S, D, IR | 120 | 106 | 114,480 | 155 |
RareAct [117] | 2020 | RGB | 122 | - | 905 | - |
MoVi [118] | 2021 | RGB, Mocap | 21 | 90 | 7,344,000+ | 4 |
UAV-Human [119] | 2021 | RGB, S, D, IR, etc. | 155 | 119 | 67,428 | - |
HOMAGE [120] | 2021 | RGB, IR, Au, Ac, Gy | 86 | - | 26,000+ | 5
Ego4D [121] | 2022 | RGB, Au, Ac, etc. | - | 923 | - | Egocentric |
EPIC-KITCHENS-100 [122] | 2022 | RGB, Au, Ac | - | 45 | 89,979 | Egocentric |
EPIC-KITCHENS-55 [123] | 2023 | RGB | - | 397 | 2,000,000+ | Egocentric |
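Several of the RGB datasets listed above ship with ready-made loaders in common libraries. As one illustration, the sketch below reads UCF101 clips through torchvision's built-in wrapper; it assumes the videos and the official train/test split files have already been downloaded, the two paths are placeholders, and PyAV is installed for video decoding.

```python
# Illustrative sketch of loading UCF101 clips with torchvision.
# Paths are placeholders; the dataset and split files must be
# downloaded separately, and video decoding requires PyAV.
from torchvision.datasets import UCF101

dataset = UCF101(
    root="data/UCF-101",                      # folder of .avi videos (placeholder path)
    annotation_path="data/ucfTrainTestlist",  # official split files (placeholder path)
    frames_per_clip=16,                       # frames per sampled clip
    step_between_clips=8,                     # stride between consecutive clips
    train=True,
)

video, audio, label = dataset[0]              # video tensor shaped (T, H, W, C)
print(video.shape, dataset.classes[label])
```

Datasets without such wrappers (e.g., the skeleton or event-stream corpora in the table) typically require the authors' own parsing tools, which is worth factoring into benchmarking effort.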