A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia
Abstract
1. Introduction
1.1. Related Works
1.2. Objective
2. Materials and Methods
2.1. Data Collection and Processing
- Argumentative Behavior Dataset: This dataset combines images from two folders, courtyard_arguing_00 and downtown_arguing_00. Half of the images from each folder were used for training and the other half for testing. The images show two individuals in an argument that gradually escalates.
- Non-Argumentative Behavior Dataset: This dataset combines images from two folders, courtyard_warmWelcome_00 and courtyard_giveDirections_00. Half of the images were used for training and the other half for testing. The images show two individuals exchanging a warm greeting and giving directions to a specific destination.
- Number of Argumentative Images for training: 466 images;
- Number of Non-Argumentative Images for training: 466 images;
- Number of Argumentative Images for testing: 498 images;
- Number of Non-Argumentative Images for testing: 498 images.
2.2. Handling the Missing Data
2.3. Artifact and Noise Management
- Noise Reduction Techniques
- Quality Control
2.4. Extraction of Features Using MediaPipe
- Hand Gestures: A set of 21 landmarks per hand, each a key point with x, y, z coordinates. These landmarks capture the positioning of the fingers and the hand motions;
- Body Pose: A set of 33 landmarks for the body stance, each a key point with x, y, z coordinates. These landmarks capture the skeleton of the body, providing a full-body pose estimation;
- Facial Features: A set of 468 landmarks for the face, each a key point with x, y, z coordinates. These landmarks capture facial expressions and can be relevant to expressions of feelings such as aggression.
- 63 + 63 values for the hand landmarks (21 landmarks per hand multiplied by the three coordinates [x, y, z]);
- 99 values for the body pose landmarks (33 landmarks for the body stance multiplied by the three coordinates [x, y, z]);
- 1404 values for the facial landmarks (468 landmarks for the face multiplied by the three coordinates [x, y, z]).
- The image file was first converted to RGB from BGR, which is the default format for OpenCV, since MediaPipe expects an RGB image as the input;
- The model then processes the converted images in order to detect and return the landmarks for the body pose, hands and face;
- If landmark detection fails for a specific part, NaN values are assigned to that part's entries in the list.
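The extraction steps above can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names (`flatten_part`, `results_to_vector`, `extract_features`) and the ordering of the parts inside the vector are assumptions. The MediaPipe and OpenCV calls are imported lazily, so the flattening logic can be checked with stand-in objects even where those packages are not installed.

```python
import math
from types import SimpleNamespace

# Landmark counts per part, as listed above. The order of the parts in the
# final vector is an assumption made for this sketch.
PARTS = {
    "left_hand_landmarks": 21,
    "right_hand_landmarks": 21,
    "pose_landmarks": 33,
    "face_landmarks": 468,
}


def flatten_part(landmark_list, count):
    """Return count * 3 floats (x, y, z per landmark), or NaNs if detection failed."""
    if landmark_list is None:
        return [math.nan] * (count * 3)
    values = []
    for lm in landmark_list.landmark:
        values.extend([lm.x, lm.y, lm.z])
    return values


def results_to_vector(results):
    """Concatenate both hands, pose and face into one flat feature vector."""
    vector = []
    for attribute, count in PARTS.items():
        vector.extend(flatten_part(getattr(results, attribute, None), count))
    return vector


def extract_features(image_path):
    """Run MediaPipe Holistic on one image file (needs opencv-python and mediapipe)."""
    import cv2  # imported lazily so the helpers above work without these packages
    import mediapipe as mp

    image_bgr = cv2.imread(image_path)  # OpenCV loads images as BGR
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)  # MediaPipe expects RGB
    with mp.solutions.holistic.Holistic(static_image_mode=True) as holistic:
        results = holistic.process(image_rgb)
    return results_to_vector(results)


# Demo with stand-in objects shaped like MediaPipe results (no detection is run).
point = SimpleNamespace(x=0.1, y=0.2, z=0.3)
demo_results = SimpleNamespace(
    left_hand_landmarks=None,  # simulates a hand that was not detected
    right_hand_landmarks=SimpleNamespace(landmark=[point] * 21),
    pose_landmarks=SimpleNamespace(landmark=[point] * 33),
    face_landmarks=SimpleNamespace(landmark=[point] * 468),
)
demo_vector = results_to_vector(demo_results)
print(len(demo_vector))  # 63 + 63 + 99 + 1404 = 1629
```

Filling a missing part with NaN keeps every row at the same length of 1629 values, so rows with failed detections can still be stacked into one table and handled later by imputation.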
2.5. Model Development and Training
- Random Forest Classifier: A commonly used machine learning algorithm that combines the output of multiple decision trees to reach a single result.
- Gradient Boosting: A technique based on boosting in functional space. It produces a prediction model in the form of an ensemble of weak prediction models, which make very few assumptions about the data and are, in most cases, simple decision trees.
- Ridge Classifier: A linear classifier that converts the class labels to ±1 targets and fits them with ridge (L2-regularized) regression; it can also handle multi-class classification tasks.
2.6. Trained Model Evaluation
- Accuracy: The percentage of the correctly classified samples.
- Precision: The percentage of true positives out of all samples classified as positive.
- Recall: The percentage of true positives out of all actual positive samples.
- F1 Score: Harmonic mean of precision and recall.
- ROC Curve and AUC: A graphical representation of the trade-off between the true positive rate and the false positive rate at various thresholds, and the area under that curve. They are used to assess a model’s ability to discriminate between classes.
- Confusion Matrix: The number of true-negative, true-positive, false-negative and false-positive classifications of the model.
- Learning Curves: A graphical representation of the model’s performance on the training and validation datasets over time. This graph represents the model’s accuracy or loss during training.
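The threshold-free metrics in this list all follow from the four confusion-matrix counts. The sketch below shows that relationship in plain Python; the function names and the small label lists are illustrative only and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives and false negatives."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative labels only: 1 = argumentative, 0 = non-argumentative.
example_true = [1, 1, 1, 0, 0, 0, 1, 0]
example_pred = [1, 1, 0, 0, 0, 1, 1, 0]
example_metrics = classification_metrics(example_true, example_pred)
print(example_metrics)
```

In this toy example there are three true positives, one false positive, three true negatives and one false negative, so all four metrics equal 0.75.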
2.7. Testing Model’s Performance on New Data
- Hyperparameters over-tuned to the validation set, especially in cases when the validation set is small, leading to false predictions on the test set;
- Curse of Dimensionality [37]: With many hyperparameters, a grid search can find a good combination by chance rather than through true model improvement;
- Ignored Shifts in Data Distribution [38]: In our case, as the validation set’s distribution is different from the test set, fine-tuning may produce biased hyperparameters, causing poor performance on the test set.
- Model testing was performed through the classification of an argument between two individuals by using the second half of the images from the folders courtyard_arguing_00 and downtown_arguing_00. The total number of images in this class is 498.
- Casual conversation classification was performed by using images of two individuals casually chatting. The second half of the images from the folders courtyard_warmWelcome_00 and courtyard_giveDirections_00 were used. The total number of images in this class is also 498.
2.8. Final Model Evaluation
3. Results
3.1. Trained Model Performance
Class | Precision | Recall | F1-Score | Support |
---|---|---|---|---|
Class 0 (Non-Argumentative) | 1.00 | 1.00 | 1.00 | 89 |
Class 1 (Argumentative) | 1.00 | 1.00 | 1.00 | 98 |
Accuracy | | | 1.00 | 187 |
Macro avg | 1.00 | 1.00 | 1.00 | 187 |
Weighted avg | 1.00 | 1.00 | 1.00 | 187 |
Metric | Value |
---|---|
Accuracy | 1.00 |
Precision | 1.00 |
Recall | 1.00 |
F1-Score | 1.00 |
- Feature Engineering: The high-quality landmarks extracted from the images serve as the features for the model. If these features capture overly specific details, the model could overfit to this particular data and may not generalize to new data.
- Clean Dataset and High Performance: The dataset used for this model came from a clean environment with little noise and few artifacts. Although this favors training and contributes to the model’s exceptional performance, it may cause the model to perform worse in real-world scenarios.
- The code splits the dataset into 80% for training and 20% for testing. Since the models are trained on the training set and then evaluated on a test set containing data that they did not see during training, the performance results give a clear indication of good model generalization.
- The performance metrics on the test set are also very high (see Table 2), suggesting that the model learned meaningful patterns rather than memorizing the training data. A significant performance drop on the test set would indicate overfitting. In our case, the model does not exhibit such behavior.
- We performed 5-fold cross-validation during our model’s evaluation process. This technique is effective in controlling overfitting since it ensures that the model is trained and evaluated on multiple subsets of the dataset, exposing it to different distributions of data. This prevents the model from memorizing patterns in a single set of training data.
- The evaluation metrics (F1 score, recall, precision and accuracy) are averaged across the validation folds and reported with their standard deviations. The standard deviations show how much the performance varied across folds, providing insight into the consistency of the model’s predictions across data splits. High variance indicates potential overfitting. In our case, the low standard deviations suggest stable performance.
- The results for each model after cross-validation are very similar, indicating that the performance is good across all the different folds of the data, which is a sign of avoiding overfitting.
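The cross-validation procedure described above can be sketched in plain Python. This is a simplified illustration under stated assumptions: the helper names and the `score_fold` callback are hypothetical, the folds are shuffled but not stratified, and in practice a library routine such as scikit-learn's `cross_val_score` would normally do this work.

```python
import random
import statistics


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices) for k shuffled folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_validate(n_samples, score_fold, k=5, seed=0):
    """Score every fold with score_fold(train, validation); return mean and sample std."""
    scores = [
        score_fold(train, validation)
        for train, validation in k_fold_indices(n_samples, k, seed)
    ]
    return statistics.mean(scores), statistics.stdev(scores)


# Placeholder scorer: a real one would fit a model on `train` and return a
# metric such as accuracy measured on `validation`.
demo_mean, demo_std = cross_validate(10, lambda train, validation: 1.0)
print(demo_mean, demo_std)
```

Each sample lands in the validation fold exactly once, and the standard deviation across folds is the quantity that the stability argument above relies on.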
3.2. Final Model Performance
- Timing Function: We used Python’s time module to accurately calculate inference times for our model on our dataset;
- Adding inference_times list: This allowed us to collect the times for each batch size;
- Timing the predictions: The time was recorded before and after the predictions to calculate the inference time;
- Saving times: The results were saved in a .csv file called inference_times.csv.
- Parallelization: Using the joblib library, our model utilizes all available processing cores of the machine;
- Batch Processing: Inference was implemented in batches to show how the model handles larger data sizes;
- Batch Size Evaluation: Different batch sizes were evaluated in a loop.
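A minimal sketch of the timing loop described above follows. It is not the authors' script: `time_inference`, `write_times` and the placeholder `predict` function are hypothetical, `time.perf_counter` is one of several timers in Python's time module, and the column names simply mirror those in the reported timing table. The demo writes to an in-memory buffer; passing a file opened as inference_times.csv would produce the file mentioned above.

```python
import csv
import io
import time

BATCH_SIZES = [1, 5, 10, 20]  # batch sizes that appear in the reported timing table
FIELDS = ["True_Label", "BATCH_SIZE", "Inference_Time"]


def time_inference(predict, samples, label, batch_sizes=BATCH_SIZES):
    """Time one predict() call per batch size and collect the results."""
    inference_times = []
    for batch_size in batch_sizes:
        batch = samples[:batch_size]
        start = time.perf_counter()  # time before the prediction
        predict(batch)
        elapsed = time.perf_counter() - start  # time after, in seconds
        inference_times.append(
            {"True_Label": label, "BATCH_SIZE": batch_size, "Inference_Time": elapsed}
        )
    return inference_times


def write_times(rows, handle):
    """Write the collected timings as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder model: a trained classifier's predict method would go here.
def predict(batch):
    return [0 for _ in batch]


samples = [[0.0] * 1629 for _ in range(20)]  # dummy feature vectors
rows = time_inference(predict, samples, "argumentative")
buffer = io.StringIO()
write_times(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

A single timed call per batch size is noisy, so repeating each measurement and averaging would give steadier numbers.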
3.3. Secondary Analysis (Initial Model)
4. Discussion
Ethical Considerations
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Tsolaki, M. Introduction to a New Open Access Journal by MDPI: Journal of Dementia and Alzheimer’s Disease. J. Dement. Alzheimer’s Dis. 2024, 1, 1–2.
- AlShboul, R.; Thabtah, F.; Walter Scott, A.J.; Wang, Y. The Application of Intelligent Data Models for Dementia Classification. Appl. Sci. 2023, 13, 3612.
- Çelebi, S.B.; Emiroğlu, B.G. A Novel Deep Dense Block-Based Model for Detecting Alzheimer’s Disease. Appl. Sci. 2023, 13, 8686.
- Jia, Y.L.; Yang, B.N.; Yang, Y.H.; Zheng, W.M.; Wang, L.; Huang, C.Y.; Lu, J.; Chen, N. Application of machine learning techniques in the diagnostic approach of PTSD using MRI neuroimaging data: A systematic review. Heliyon 2024, 10, e28559.
- Lalousis, P.A.; Wood, S.J.; Schmaal, L.; Chisholm, K.; Griffiths, S.L.; Reniers, R.L.E.P.; Bertolino, A.; Borgwardt, S.; Brambilla, P.; Kambeitz, J.; et al. Heterogeneity and Classification of Recent Onset Psychosis and Depression: A Multimodal Machine Learning Approach. Schizophr. Bull. 2021, 47, 1130–1140.
- Nicholson, A.A.; Harricharan, S.; Densmore, M.; Neufeld, R.W.J.; Ros, T.; McKinnon, M.C.; Frewen, P.A.; Théberge, J.; Jetly, R.; Pedlar, D.; et al. Classifying heterogeneous presentations of PTSD via the default mode, central executive, and salience networks with machine learning. NeuroImage Clin. 2020, 27, 102262.
- Zandvakili, A.; Barredo, J.; Swearingen, H.R.; Aiken, E.M.; Berlow, Y.A.; Greenberg, B.D.; Carpenter, L.L.; Philip, N.S. Mapping PTSD symptoms to brain networks: A machine learning study. Transl. Psychiatry 2020, 10, 195.
- Hashmi, A.; Barukab, O. Dementia Classification Using Deep Reinforcement Learning for Early Diagnosis. Appl. Sci. 2023, 13, 1464.
- Irfan, M.; Shahrestani, S.; Elkhodr, M. Enhancing Early Dementia Detection: A Machine Learning Approach Leveraging Cognitive and Neuroimaging Features for Optimal Predictive Performance. Appl. Sci. 2023, 13, 10470.
- So, A.; Hooshyar, D.; Park, K.; Lim, H. Early Diagnosis of Dementia from Clinical Data by Machine Learning Techniques. Appl. Sci. 2017, 7, 651.
- Zadgaonkar, A.; Keskar, R.; Kakde, O. Towards a Machine Learning Model for Detection of Dementia Using Lifestyle Parameters. Appl. Sci. 2023, 13, 10630.
- Cho, E.; Kim, S.; Heo, S.-J.; Shin, J.; Hwang, S.; Kwon, E.; Lee, S.; Kim, S.; Kang, B. Machine learning-based predictive models for the occurrence of behavioral and psychological symptoms of dementia: Model development and validation. Sci. Rep. 2023, 13, 8073.
- HekmatiAthar, S.; Goins, H.; Samuel, R.; Byfield, G.; Anwar, M. Data-Driven Forecasting of Agitation for Persons with Dementia: A Deep Learning-Based Approach. SN Comput. Sci. 2021, 2, 326.
- Kim, K.; Jang, J.; Park, H.; Jeong, J.; Shin, D.; Shin, D. Detecting Abnormal Behaviors in Dementia Patients Using Lifelog Data: A Machine Learning Approach. Information 2023, 14, 433.
- Iaboni, A.; Spasojevic, S.; Newman, K.; Schindel Martin, L.; Wang, A.; Ye, B.; Mihailidis, A.; Khan, S.S. Wearable multimodal sensors for the detection of behavioral and psychological symptoms of dementia using personalized machine learning models. Alzheimer’s Dement. Diagn. Assess. Dis. Monit. 2022, 14, e12305.
- Cipriani, G.; Vedovello, M.; Nuti, A.; di Fiorino, M. Aggressive behavior in patients with dementia: Correlates and management. Geriatr. Gerontol. Int. 2011, 11, 408–413.
- Cohen-Mansfield, J. Agitated behavior in persons with dementia: The relationship between type of behavior, its frequency, and its disruptiveness. J. Psychiatr. Res. 2008, 43, 64–69.
- Enmarker, I.; Olsen, R.; Hellzen, O. Management of person with dementia with aggressive and violent behaviour: A systematic literature review. Int. J. Older People Nurs. 2011, 6, 153–162.
- Hall, K.A.; O’Connor, D.W. Correlates of aggressive behavior in dementia. Int. Psychogeriatr. 2004, 16, 141–158.
- McShane, R.; Keene, J.; Fairburn, C.; Jacoby, R.; Hope, T. Psychiatric symptoms in patients with dementia predict the later development of behavioural abnormalities. Psychol. Med. 1998, 28, 1119–1127.
- Patel, V.; Hope, T. Aggressive behaviour in elderly people with dementia: A review. Int. J. Geriatr. Psychiatry 1993, 8, 457–472.
- Pulsford, D.; Duxbury, J. Aggressive behaviour by people with dementia in residential care settings: A review. J. Psychiatr. Ment. Health Nurs. 2006, 13, 611–618.
- Swearer, J.M.; Drachman, D.A.; O’Donnell, B.F.; Mitchell, A.L. Troublesome and Disruptive Behaviors in Dementia. J. Am. Geriatr. Soc. 1988, 36, 784–790.
- Thabtah, F.; Peebles, D. Assessment for Alzheimer’s Disease Advancement Using Classification Models with Rules. Appl. Sci. 2023, 13, 12152.
- Amit, M.L.; Fajardo, A.C.; Medina, R.P. Recognition of Real-Time Hand Gestures using Mediapipe Holistic Model and LSTM with MLP Architecture. In Proceedings of the 2022 IEEE 10th Conference on Systems, Process & Control (ICSPC), Malacca, Malaysia, 17 December 2022; pp. 292–295.
- Rigatti, S.J. Random Forest. J. Insur. Med. 2017, 47, 31–39.
- Konstantinov, A.V.; Utkin, L.V. Interpretable machine learning with an ensemble of gradient boosting machines. Knowl.-Based Syst. 2021, 222, 106993.
- Singh, A.; Prakash, B.S.; Chandrasekaran, K. A comparison of linear discriminant analysis and ridge classifier on Twitter data. In Proceedings of the 2016 International Conference on Computing, Communication and Automation (ICCCA), Greater Noida, India, 29–30 April 2016; pp. 133–138.
- Jerez, J.M.; Molina, I.; García-Laencina, P.J.; Alba, E.; Ribelles, N.; Martín, M.; Franco, L. Missing data imputation using statistical and machine learning methods in a real breast cancer problem. Artif. Intell. Med. 2010, 50, 105–115.
- Von Marcard, T.; Henschel, R.; Black, M.J.; Rosenhahn, B.; Pons-Moll, G. Recovering Accurate 3D Human Pose in the Wild Using IMUs and a Moving Camera; Springer International Publishing: Berlin, Germany, 2018; pp. 614–631.
- Jabbar, H.K.; Khan, R.Z. Methods to Avoid Over-Fitting and Under-Fitting in Supervised Machine Learning (Comparative Study). Comput. Sci. Commun. Instrum. Devices 2014, 70, 163–172.
- Liu, Y.-L.; Wang, J.; Chen, X.; Guo, Y.-W.; Peng, Q.-S. A Robust and Fast Non-Local Means Algorithm for Image Denoising. J. Comput. Sci. Technol. 2008, 23, 270–279.
- Coupé, P.; Yger, P.; Barillot, C. Fast Non Local Means Denoising for 3D MR Images; Springer International Publishing: Berlin, Germany, 2006; pp. 33–40.
- Siam, A.I.; Soliman, N.F.; Algarni, A.D.; Abd El-Samie, F.E.; Sedik, A. Deploying Machine Learning Techniques for Human Emotion Detection. Comput. Intell. Neurosci. 2022, 2022, 032673.
- Farkhod, A.; Abdusalomov, A.B.; Mukhiddinov, M.; Cho, Y.-I. Development of Real-Time Landmark-Based Emotion Recognition CNN for Masked Faces. Sensors 2022, 22, 8704.
- Ahmad, G.N.; Fatima, H.; Ullah, S.; Salah Saidi, A.; Imdadullah. Efficient Medical Diagnosis of Human Heart Diseases Using Machine Learning Techniques with and Without GridSearchCV. IEEE Access 2022, 10, 80151–80173.
- Aremu, O.O.; Hyland-Wood, D.; McAree, P.R. A machine learning approach to circumventing the curse of dimensionality in discontinuous time series machine data. Reliab. Eng. Syst. Saf. 2020, 195, 106706.
- Becker, A.; Becker, J. Dataset shift assessment measures in monitoring predictive models. Procedia Comput. Sci. 2021, 192, 3391–3402.
- Freiesleben, T.; Grote, T. Beyond generalization: A theory of robustness in machine learning. Synthese 2023, 202, 109.
- Cánovas-García, F.; Alonso-Sarría, F.; Gomariz-Castillo, F.; Oñate-Valdivieso, F. Modification of the Random Forest algorithm to avoid statistical dependence problems when classifying remote sensing imagery. Comput. Geosci. 2017, 103, 1–11.
- Barreñada, L.; Dhiman, P.; Timmerman, D.; Boulesteix, A.-L.; van Calster, B. Understanding overfitting in Random Forest for probability estimation: A visualization and simulation study. Diagn. Progn. Res. 2024, 8, 14.
- Ng, A.Y. Feature selection, L1 vs. L2 regularization, and rotational invariance. In Proceedings of the Twenty-First International Conference on Machine Learning—ICML ’04, Banff, AB, Canada, 4–8 July 2004; Volume 78.
- Hamarashid, H.K. Utilizing Statistical Tests for Comparing Machine Learning Algorithms. Kurd. J. Appl. Res. 2021, 6, 69–74.
- Mwitta, C.; Rains, G.C.; Prostko, E. Evaluation of Inference Performance of Deep Learning Models for Real-Time Weed Detection in an Embedded Computer. Sensors 2024, 24, 514.
- Hwang, J.-S.; Lee, S.-S.; Gil, J.-W.; Lee, C.-K. Determination of Optimal Batch Size of Deep Learning Models with Time Series Data. Sustainability 2024, 16, 5936.
- Tueth, M.J. Dementia: Diagnosis and emergency behavioral complications. J. Emerg. Med. 1995, 13, 519–525.
- Soldatos, R.F.; Cearns, M.; Nielsen, M.Ø.; Kollias, C.; Xenaki, L.-A.; Stefanatou, P.; Ralli, I.; Dimitrakopoulos, S.; Hatzimanolis, A.; Kosteletos, I.; et al. Prediction of Early Symptom Remission in Two Independent Samples of First-Episode Psychosis Patients Using Machine Learning. Schizophr. Bull. 2022, 48, 122–133.
- Ogutu, J.O.; Piepho, H.-P.; Schulz-Streeck, T. A comparison of Random Forests, boosting and support vector machines for genomic selection. BMC Proc. 2011, 5, S11.
- Golden, C.E.; Rothrock, M.J.; Mishra, A. Comparison between Random Forest and gradient boosting machine methods for predicting Listeria spp. prevalence in the environment of pastured poultry farms. Food Res. Int. 2019, 122, 47–55.
- Joseph, J. Predicting crime or perpetuating bias? The AI dilemma. AI Society 2024.
- Farayola, M.M.; Tal, I.; Connolly, R.; Saber, T.; Bendechache, M. Ethics and Trustworthiness of AI for Predicting the Risk of Recidivism: A Systematic Literature Review. Information 2023, 14, 426.
Metric | Comparison | T-Statistic | p-Value | Statistical Significance |
---|---|---|---|---|
Accuracy | RF vs. GB | 1.633 | 0.1778 | Not significant |
Accuracy | RF vs. Ridge | 5.222 | 0.0064 | Significant |
Accuracy | GB vs. Ridge | 4.961 | 0.0077 | Significant |
Precision | RF vs. GB | 1.521 | 0.2030 | Not significant |
Precision | RF vs. Ridge | 6.678 | 0.0026 | Significant |
Precision | GB vs. Ridge | 7.115 | 0.0021 | Significant |
Recall | RF vs. GB | 1.000 | 0.3739 | Not significant |
Recall | RF vs. Ridge | 2.743 | 0.0517 | Not significant |
Recall | GB vs. Ridge | 2.260 | 0.0867 | Not significant |
F1 | RF vs. GB | 1.633 | 0.1779 | Not significant |
F1 | RF vs. Ridge | 5.302 | 0.0061 | Significant |
F1 | GB vs. Ridge | 4.986 | 0.0076 | Significant |
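The table does not state which test produced these values. As a sketch, assuming a paired t-test on the per-fold scores of two models over five folds (four degrees of freedom), the statistic and its two-sided p-value can be computed as below. The function names are hypothetical; the closed-form p-value holds only for exactly four degrees of freedom, and a general routine such as `scipy.stats.ttest_rel` would be the usual choice.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the per-fold score differences between two models.

    Raises ZeroDivisionError if all differences are identical (zero variance).
    """
    differences = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(differences)
    std_error = statistics.stdev(differences) / math.sqrt(len(differences))
    return mean_diff / std_error


def two_sided_p_value_df4(t):
    """Two-sided p-value for Student's t with exactly 4 degrees of freedom."""
    t = abs(t)
    u = 1.0 + t * t / 4.0
    cdf = 0.5 + 0.375 * (t / math.sqrt(u)) * (1.0 - t * t / (12.0 * u))
    return 2.0 * (1.0 - cdf)


# Check against the reported accuracy comparisons.
p_rf_vs_gb = two_sided_p_value_df4(1.633)
p_rf_vs_ridge = two_sided_p_value_df4(5.222)
print(round(p_rf_vs_gb, 4), round(p_rf_vs_ridge, 4))
```

Under this assumption, t = 1.633 gives p of about 0.178 and t = 5.222 gives p of about 0.0064, which agrees with the reported values and supports the reading of four degrees of freedom.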
No. | True_Label | BATCH_SIZE | Inference_Time |
---|---|---|---|
1 | argumentative | 1 | 0.0057275295257568367 |
2 | argumentative | 5 | 0.003355264663696289 |
3 | argumentative | 10 | 0.0031197071075439453 |
4 | argumentative | 20 | 0.0032854080200195312 |
5 | non-argumentative | 1 | 0.003977537155151367 |
6 | non-argumentative | 5 | 0.0030760765075683594 |
7 | non-argumentative | 10 | 0.003057241439819336 |
8 | non-argumentative | 20 | 0.003520488739013672 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Galanakis, I.; Soldatos, R.F.; Karanikolas, N.; Voulodimos, A.; Voyiatzis, I.; Samarakou, M. A MediaPipe Holistic Behavior Classification Model as a Potential Model for Predicting Aggressive Behavior in Individuals with Dementia. Appl. Sci. 2024, 14, 10266. https://doi.org/10.3390/app142210266