Addressing Class Imbalances in Video Time-Series Data for Estimation of Learner Engagement: “Over Sampling with Skipped Moving Average”
Abstract
1. Introduction
2. Related Work
2.1. Definition of Engagement
2.2. Approaches in Emotional Engagement Estimation
2.3. Computer-Vision-Based Features
2.4. Dataset
2.5. Architectures in Emotional Engagement Estimation
2.6. Issues Addressed
- RQ1: How do we deal with class-imbalanced datasets such as DAiSEE?
- RQ2: How does the proposed method affect the accuracy of engagement estimation?
3. Proposed Methods
3.1. Sampling Method
3.1.1. Skipped Moving Average and Video Frame Downsampling
3.1.2. Average Oversampling Input Videos
3.2. Feature Extraction Method
3.3. Training Method
4. Results
5. Discussion
6. Conclusions
- RQ1: How do we deal with class-imbalanced datasets such as DAiSEE?
- RQ2: How does the proposed method affect the accuracy of engagement estimation?
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Kage, M. Theory of Motivation to Learn: Motivational Educational Psychology; Kaneko Bookstore: Tokyo, Japan, 2013. [Google Scholar]
- Dewan, M.A.A.; Murshed, M.; Lin, F. Engagement detection in online learning: A review. Smart Learn. Environ. 2019, 6, 1. [Google Scholar] [CrossRef]
- Hollister, B.; Nair, P.; Hill-Lindsay, S.; Chukoskie, L. Engagement in online learning: Student attitudes and behavior during COVID-19. Front. Educ. 2022, 7, 851019. [Google Scholar] [CrossRef]
- Martin, F.; Bolliger, D.U. Engagement matters: Student perceptions on the importance of engagement strategies in the online learning environment. Online Learn. 2018, 22, 205–222. [Google Scholar] [CrossRef]
- Nouri, J. The flipped classroom: For active, effective and increased learning–especially for low achievers. Int. J. Educ. Technol. High. Educ. 2016, 13, 1–10. [Google Scholar] [CrossRef]
- Bolliger, D.U. Key factors for determining student satisfaction in online courses. Int. J. E-Learn. 2004, 3, 61–67. [Google Scholar]
- Fredricks, J.A.; Blumenfeld, P.C.; Paris, A.H. School engagement: Potential of the concept, state of the evidence. Rev. Educ. Res. 2004, 74, 59–109. [Google Scholar] [CrossRef]
- Karimah, S.N.; Hasegawa, S. Automatic engagement estimation in smart education/learning settings: A systematic review of engagement definitions, datasets, and methods. Smart Learn. Environ. 2022, 9, 1–48. [Google Scholar] [CrossRef]
- Kaur, A.; Mustafa, A.; Mehta, L.; Dhall, A. Prediction and localization of student engagement in the wild. In Proceedings of the 2018 Digital Image Computing: Techniques and Applications (DICTA), Canberra, Australia, 10–13 December 2018; pp. 1–8. [Google Scholar]
- Gupta, A.; D’Cunha, A.; Awasthi, K.; Balasubramanian, V. DAiSEE: Towards user engagement recognition in the wild. arXiv 2016, arXiv:1609.01885. [Google Scholar]
- Allen, R.L.; Davis, A.S. Hawthorne Effect. In Encyclopedia of Child Behavior and Development; Goldstein, S., Naglieri, J.A., Eds.; Springer: Boston, MA, USA, 2011. [Google Scholar] [CrossRef]
- Japkowicz, N.; Stephen, S. The class imbalance problem: A systematic study. Intell. Data Anal. 2002, 6, 429–449. [Google Scholar] [CrossRef]
- Dresvyanskiy, D.; Minker, W.; Karpov, A. Deep learning based engagement recognition in highly imbalanced data. In Proceedings of the 23rd International Conference, SPECOM 2021, St. Petersburg, Russia, 27–30 September 2021; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 166–178. [Google Scholar]
- Anderson, A.R.; Christenson, S.L.; Sinclair, M.F.; Lehr, C.A. Check & Connect: The importance of relationships for promoting engagement with school. J. Sch. Psychol. 2004, 42, 95–113. [Google Scholar]
- Reschly, A.L.; Christenson, S.L. Handbook of Research on Student Engagement; Springer Nature: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
- Cocea, M.; Weibelzahl, S. Log file analysis for disengagement detection in e-Learning environments. User Model. User-Adapt. Interact. 2009, 19, 341–385. [Google Scholar] [CrossRef]
- Chaouachi, M.; Pierre, C.; Jraidi, I.; Frasson, C. Affect and mental engagement: Towards adaptability for intelligent systems. In Proceedings of the Twenty-Third International FLAIRS Conference, Daytona Beach, FL, USA, 19–21 May 2010. [Google Scholar]
- Fairclough, S.H.; Venables, L. Prediction of subjective states from psychophysiology: A multivariate approach. Biol. Psychol. 2006, 71, 100–110. [Google Scholar] [CrossRef] [PubMed]
- Goldberg, B.S.; Sottilare, R.A.; Brawner, K.W.; Holden, H.K. Predicting learner engagement during well-defined and ill-defined computer-based intercultural interactions. In Proceedings of the 4th International Conference on Affective Computing and Intelligent Interaction, ACII 2011, Memphis, TN, USA, 9–12 October 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 538–547. [Google Scholar]
- Zhang, Z.; Li, Z.; Liu, H.; Cao, T.; Liu, S. Data-driven online learning engagement detection via facial expression and mouse behavior recognition technology. J. Educ. Comput. Res. 2020, 58, 63–86. [Google Scholar] [CrossRef]
- James, W.T. A study of the expression of bodily posture. J. Gen. Psychol. 1932, 7, 405–437. [Google Scholar] [CrossRef]
- Kleinsmith, A.; Bianchi-Berthouze, N. Affective body expression perception and recognition: A survey. IEEE Trans. Affect. Comput. 2012, 4, 15–33. [Google Scholar] [CrossRef]
- Ekman, P.; Friesen, W.V. Measuring facial movement. Environ. Psychol. Nonverbal Behav. 1976, 1, 56–75. [Google Scholar] [CrossRef]
- Ojala, T.; Pietikainen, M.; Maenpaa, T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 971–987. [Google Scholar] [CrossRef]
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–26 June 2005; Volume 1, pp. 886–893. [Google Scholar]
- Chang, C.; Zhang, C.; Chen, L.; Liu, Y. An ensemble model using face and body tracking for engagement detection. In Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, USA, 16–20 October 2018; pp. 616–622. [Google Scholar]
- Grafsgaard, J.; Wiggins, J.B.; Boyer, K.E.; Wiebe, E.N.; Lester, J. Automatically recognizing facial expression: Predicting engagement and frustration. In Proceedings of the Educational Data Mining, Memphis, TN, USA, 6–8 July 2013. [Google Scholar]
- Seventh Emotion Recognition in the Wild Challenge (EmotiW). Available online: https://sites.google.com/view/emotiw2019/home?authuser=0 (accessed on 17 May 2024).
- DAiSEE Dataset for Affective States in E-Environments. Available online: https://people.iith.ac.in/vineethnb/resources/daisee/index.html (accessed on 17 May 2024).
- Villaroya, S.M.; Gamboa-Montero, J.J.; Bernardino, A.; Maroto-Gómez, M.; Castillo, J.C.; Salichs, M.Á. Real-time Engagement Detection from Facial Features. In Proceedings of the 2022 IEEE International Conference on Development and Learning (ICDL), London, UK, 12–15 September 2022; pp. 231–237. [Google Scholar]
- Zheng, X.; Hasegawa, S.; Tran, M.T.; Ota, K.; Unoki, T. Estimation of learners’ engagement using face and body features by transfer learning. In Proceedings of the International Conference on Human–Computer Interaction, Virtual, 24–29 July 2021; Springer International Publishing: Cham, Switzerland, 2021; pp. 541–552. [Google Scholar]
- Zheng, X.; Tran, M.T.; Ota, K.; Unoki, T.; Hasegawa, S. Engagement Estimation using Time-series Facial and Body Features in an Unstable Dataset. In Proceedings of the 30th International Conference on Computers in Education (ICCE 2022), Kuala Lumpur, Malaysia, 28 November–2 December 2022; pp. 89–94. [Google Scholar]
- Ai, X.; Sheng, V.S.; Li, C.; Cui, Z. Class-attention video transformer for engagement intensity prediction. arXiv 2022, arXiv:2208.07216. [Google Scholar]
- Jeni, L.A.; Cohn, J.F.; De La Torre, F. Facing imbalanced data–recommendations for the use of performance metrics. In Proceedings of the 2013 Humaine Association Conference on Affective Computing and Intelligent Interaction, Geneva, Switzerland, 2–5 September 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 245–251. [Google Scholar]
- Hasegawa, S.; Hirako, A.; Zheng, X.; Karimah, S.N.; Ota, K.; Unoki, T. Learner’s mental state estimation with PC built-in camera. In Learning and Collaboration Technologies. Human and Technology Ecosystems: Proceedings of the 7th International Conference, LCT 2020, Held as Part of the 22nd HCI International Conference, HCII 2020, Copenhagen, Denmark, 19–24 July 2020; Springer International Publishing: Berlin/Heidelberg, Germany, 2020; pp. 165–175. [Google Scholar]
- Mohammed, R.; Rawashdeh, J.; Abdullah, M. Machine learning with oversampling and undersampling techniques: Overview study and experimental results. In Proceedings of the 2020 11th International Conference on Information and Communication Systems (ICICS), Irbid, Jordan, 7–9 April 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 243–248. [Google Scholar]
- Chawla, N.V.; Bowyer, K.W.; Hall, L.O.; Kegelmeyer, W.P. SMOTE: Synthetic minority over-sampling technique. J. Artif. Intell. Res. 2002, 16, 321–357. [Google Scholar] [CrossRef]
- Jiang, Z.; Pan, T.; Zhang, C.; Yang, J. A new oversampling method based on the classification contribution degree. Symmetry 2021, 13, 194. [Google Scholar] [CrossRef]
- Yao, B.; Ota, K.; Kashihara, A.; Unoki, T.; Hasegawa, S. Development of a Learning Companion Robot with Adaptive Engagement Enhancement. In Proceedings of the 30th International Conference on Computers in Education (ICCE 2022), Asia-Pacific Society for Computers in Education, Kuala Lumpur, Malaysia, 28 November–2 December 2022; pp. 111–117. [Google Scholar]
- Dewan, M.A.A.; Lin, F.; Wen, D.; Murshed, M.; Uddin, Z. A deep learning approach to detecting engagement of online learners. In Proceedings of the 2018 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), Guangzhou, China, 8–12 October 2018; pp. 1895–1902. [Google Scholar]
- Murshed, M.; Dewan, M.A.A.; Lin, F.; Wen, D. Engagement detection in e-learning environments using convolutional neural networks. In Proceedings of the 2019 IEEE International Conference on Dependable, Autonomic and Secure Computing, International Conference on Pervasive Intelligence and Computing, International Conference on Cloud and Big Data Computing, International Conference on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Fukuoka, Japan, 5–8 August 2019; pp. 80–86. [Google Scholar]
- Bosch, N. Detecting student engagement: Human versus machine. In Proceedings of the 2016 Conference on User Modeling Adaptation and Personalization, Halifax, NS, Canada, 13–17 July 2016; pp. 317–320. [Google Scholar]
- Bond, M.; Buntins, K.; Bedenlier, S.; Zawacki-Richter, O.; Kerres, M. Mapping research in student engagement and educational technology in higher education: A systematic evidence map. Int. J. Educ. Technol. High. Educ. 2020, 17, 1–30. [Google Scholar] [CrossRef]
- Mehrabian, A.; Friar, J.T. Encoding of attitude by a seated communicator via posture and position cues. J. Consult. Clin. Psychol. 1969, 33, 330. [Google Scholar] [CrossRef]
- Dael, N.; Mortillaro, M.; Scherer, K.R. Emotion expression in body action and posture. Emotion 2012, 12, 1085. [Google Scholar] [CrossRef]
- Cao, Z.; Hidalgo, G.; Simon, T.; Wei, S.-E.; Sheikh, Y. OpenPose: Realtime Multi-Person 2D Pose Estimation Using Part Affinity Fields. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 172–186. [Google Scholar] [CrossRef] [PubMed]
- Simon, T.; Joo, H.; Matthews, I.; Sheikh, Y. Hand keypoint detection in single images using multiview bootstrapping. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1145–1153. [Google Scholar]
- Ekman, P. Facial expressions. In Handbook of Cognition and Emotion; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 1999; Volume 16, p. e320. [Google Scholar]
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Karim, F.; Majumdar, S.; Darabi, H.; Chen, S. LSTM fully convolutional networks for time series classification. IEEE Access 2017, 6, 1662–1669. [Google Scholar] [CrossRef]
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems; MIT Press: Cambridge, MA, USA, 2019; Volume 32. [Google Scholar]
- LSTM-FCN-Pytorch. Available online: https://github.com/roytalman/LSTM-FCN-Pytorch (accessed on 28 January 2024).
Dataset | Subjects | Video Snippets | Snippet Length | Total Time
---|---|---|---|---
“in the wild” | 78 (male/female: 53/25) | 197 | 5 min | 59,100 s
DAiSEE | 112 (male/female: 80/32) | 9068 | 10 s | 90,680 s
Model | Dataset | Low (Recall/F1) | High (Recall/F1) | Very High (Recall/F1)
---|---|---|---|---
LSTM [31,32] | DAiSEE | 0.050/0.090 | 0.740/0.640 | 0.410/0.470
QRNN [31,32] | DAiSEE | 0.000/0.000 | 0.930/0.660 | 0.070/0.120
LSTM [33] | DAiSEE | 0.068/0.122 | 0.732/0.625 | 0.421/0.489
LSTM [33] | “in the wild” | 0.571/0.696 | 0.789/0.682 | 0.667/0.690
Model | Low (Recall/Precision/F1) | High (Recall/Precision/F1) | Very High (Recall/Precision/F1)
---|---|---|---
LSTM (3-frame average) | 0.295/0.079/0.125 | 0.490/0.508/0.499 | 0.381/0.507/0.435
LSTM (5-frame average) | 0.346/0.090/0.142 | 0.523/0.521/0.522 | 0.373/0.537/0.440
LSTM (6-frame average) | 0.192/0.066/0.098 | 0.544/0.501/0.521 | 0.354/0.500/0.414
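The 3/5/6-frame averages in the table above correspond to averaging per-frame features over non-overlapping windows, which is the essence of the skipped moving average downsampling in Section 3.1.1. Below is a minimal sketch of what such a downsampler might look like; the 30 fps rate, 68-dimensional feature vector, and function name are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def skipped_moving_average(features: np.ndarray, window: int) -> np.ndarray:
    """Downsample a (frames, feature_dim) time series by averaging each
    non-overlapping window of `window` frames (a skipped moving average).
    Trailing frames that do not fill a complete window are dropped."""
    n_frames, dim = features.shape
    n_windows = n_frames // window
    trimmed = features[: n_windows * window]
    return trimmed.reshape(n_windows, window, dim).mean(axis=1)

# Example: a 10 s DAiSEE clip, assuming 30 fps -> 300 frames; a 5-frame
# window yields a 60-step sequence for the LSTM.
clip = np.random.rand(300, 68)  # hypothetical 68-dim facial/body features
print(skipped_moving_average(clip, 5).shape)  # (60, 68)
```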
Label Set | Very Low/Low | High | Very High
---|---|---|---
Original labels | 61/459 | 4477 | 4071
Relabeled (merged) | 520 | 4477 | 4071
Oversampled | 2764 | 4009 | 3286
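The table above merges the 61 Very Low and 459 Low clips into a single class of 520 and then oversamples it toward 2764 (the High and Very High counts differ in the oversampled row, presumably reflecting the split used for training). A plausible sketch of averaging-based oversampling on time-series inputs, in the spirit of "Average Oversampling Input Videos" (Section 3.1.2), is shown below; the pair-averaging strategy and all names are assumptions for illustration, not the paper's exact procedure.

```python
import numpy as np

def average_oversample(samples: list[np.ndarray], target: int,
                       rng: np.random.Generator) -> list[np.ndarray]:
    """Grow a minority class to `target` samples by averaging randomly
    chosen pairs of same-class sequences, each shaped (steps, dim).
    A SMOTE-like idea applied directly to the time-series inputs."""
    out = list(samples)
    while len(out) < target:
        i, j = rng.choice(len(samples), size=2, replace=False)
        out.append((samples[i] + samples[j]) / 2.0)
    return out

# Example: grow the merged Very Low/Low class from 520 toward the 2764
# oversampled clips reported above (feature shapes are hypothetical).
rng = np.random.default_rng(0)
minority = [rng.random((60, 68)) for _ in range(520)]
balanced = average_oversample(minority, 2764, rng)
print(len(balanced))  # 2764
```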
Model (Sampling) | Low (Recall/Precision/F1) | High (Recall/Precision/F1) | Very High (Recall/Precision/F1)
---|---|---|---
LSTM (Original) | 0.069/0.114/0.086 | 0.587/0.558/0.572 | 0.482/0.491/0.486
LSTM-FCN (Original) | 0.049/0.211/0.079 | 0.654/0.562/0.605 | 0.500/0.553/0.525
LSTM (SMOTE) | 0.821/0.751/0.784 | 0.539/0.557/0.548 | 0.556/0.579/0.567
LSTM-FCN (SMOTE) | 0.792/0.690/0.738 | 0.538/0.530/0.534 | 0.515/0.598/0.554
LSTM (SMA) | 0.096/0.235/0.137 | 0.634/0.561/0.595 | 0.521/0.558/0.539
LSTM-FCN (SMA) | 0.036/0.348/0.065 | 0.694/0.547/0.612 | 0.502/0.590/0.543
LSTM (SMA+OS) | 0.806/0.702/0.751 | 0.474/0.525/0.498 | 0.539/0.544/0.541
LSTM-FCN (SMA+OS) | 0.637/0.623/0.630 | 0.527/0.498/0.512 | 0.510/0.557/0.533

(SMA: skipped moving average; OS: oversampling; SMOTE: synthetic minority oversampling technique.)
Model (Sampling) | Low (Recall/Precision/F1) | High (Recall/Precision/F1) | Very High (Recall/Precision/F1)
---|---|---|---
LSTM (Original) | 0.069/0.114/0.086 | 0.587/0.558/0.572 | 0.482/0.491/0.486
LSTM-FCN (Original) | 0.014/0.111/0.025 | 0.694/0.518/0.594 | 0.300/0.434/0.355
LSTM (SMOTE) | 0.295/0.053/0.089 | 0.385/0.475/0.425 | 0.350/0.487/0.407
LSTM-FCN (SMOTE) | 0.179/0.037/0.061 | 0.579/0.503/0.539 | 0.241/0.558/0.336
LSTM (SMA) | 0.192/0.109/0.140 | 0.665/0.510/0.577 | 0.314/0.526/0.393
LSTM-FCN (SMA) | 0.038/0.071/0.050 | 0.728/0.526/0.611 | 0.355/0.553/0.433
LSTM (SMA+OS) | 0.346/0.090/0.142 | 0.523/0.521/0.522 | 0.373/0.537/0.440
LSTM-FCN (SMA+OS) | 0.269/0.063/0.103 | 0.561/0.512/0.535 | 0.312/0.562/0.401
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zheng, X.; Hasegawa, S.; Gu, W.; Ota, K. Addressing Class Imbalances in Video Time-Series Data for Estimation of Learner Engagement: “Over Sampling with Skipped Moving Average”. Educ. Sci. 2024, 14, 556. https://doi.org/10.3390/educsci14060556