Occlusion Robust Cognitive Engagement Detection in Real-World Classroom
Abstract
1. Introduction
- A Students’ Cognitive Engagement (SCE) dataset is built from a real-world classroom. In contrast to experiment-induced behaviors, the visual data of students are collected non-invasively, providing valuable input for training automatic detection models in authentic classroom settings.
- The Object-Enhanced–YOLOv8n (OE-YOLOv8n) method is proposed to detect students’ cognitive engagement in real-world scenes. First, it improves the bounding-box regression of easily occluded behaviors. Second, it augments the small-scale cognitive engagement data.
- The OE-YOLOv8n method uses Mosaic and Cutout augmentation to enlarge the real-world cognitive engagement data. It then leverages the Inner Minimum Point Distance Intersection over Union (IMPDIoU) loss function, refining the key-point distance between a potentially occluded predicted box and its corresponding ground-truth box (a sketch of the augmentations follows this list).
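For concreteness, a minimal sketch of the two augmentations is given below in plain NumPy. It illustrates generic Mosaic tiling and Cutout masking on HWC uint8 images; it is not the authors’ exact training pipeline, and the matching bounding-box label remapping is omitted for brevity.

```python
import numpy as np

def cutout(image, n_holes=1, length=56, rng=None):
    """Cutout: zero out random square patches (DeVries & Taylor, 2017)."""
    rng = rng or np.random.default_rng()
    h, w = image.shape[:2]
    out = image.copy()
    for _ in range(n_holes):
        cy, cx = int(rng.integers(0, h)), int(rng.integers(0, w))
        y1, y2 = max(cy - length // 2, 0), min(cy + length // 2, h)
        x1, x2 = max(cx - length // 2, 0), min(cx + length // 2, w)
        out[y1:y2, x1:x2] = 0  # occlude the region; class labels stay unchanged
    return out

def mosaic(images, size=640, rng=None):
    """Mosaic: tile four images around a random centre on one canvas.

    Simplified: each image is cropped from its top-left corner to fit its
    quadrant; real pipelines also shift and scale the box labels accordingly.
    """
    rng = rng or np.random.default_rng()
    yc = int(rng.uniform(size * 0.25, size * 0.75))  # random mosaic centre
    xc = int(rng.uniform(size * 0.25, size * 0.75))
    canvas = np.full((size, size, 3), 114, dtype=np.uint8)  # gray fill value
    quadrants = [(0, 0, xc, yc), (xc, 0, size, yc),
                 (0, yc, xc, size), (xc, yc, size, size)]
    for img, (x1, y1, x2, y2) in zip(images, quadrants):
        patch = img[: y2 - y1, : x2 - x1]
        canvas[y1 : y1 + patch.shape[0], x1 : x1 + patch.shape[1]] = patch
    return canvas
```

Cutout in particular simulates partial occlusion, which is why it suits occlusion-heavy classroom scenes: the model must still recognize a behavior when part of a student is masked out.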
2. Related Works
2.1. Cognitive Engagement Detection
2.2. Small-Scale Object Detection with YOLO
2.3. Occluded Object Detection with YOLO
3. The Proposed Method
3.1. Overall Architecture
3.2. OE-YOLOv8n with Mosaic and Cutout Data Augmentation
3.3. OE-YOLOv8n with an IMPDIoU Loss Function
4. Experimental Results and Discussion
4.1. SCE Dataset
4.2. Experimental Setting
4.3. Training Procedures
4.4. Experimental Comparison
4.4.1. Comparison with Baseline
4.4.2. Ablation Experiments
4.4.3. Validity of the Method
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition
---|---
OE-YOLOv8n | Object-Enhanced–You Only Look Once version 8 nano
IMPDIoU | Inner Minimum Point Distance Intersection over Union
SCE | Students’ Cognitive Engagement
ICAPD | Interactive, Constructive, Active, Passive, and Disengaged
BCE | Binary Cross-Entropy
DFL-CIoU | Distribution Focal Loss and Complete Intersection over Union
Effect of the auxiliary-box ratio in the IMPDIoU loss on detection performance:

Loss (ratio) | P | R | F1 | mAP50 | mAP50-95
---|---|---|---|---|---
MPDIoU | 0.901 | 0.866 | 0.883 | 0.917 | 0.559
IMPDIoU (ratio = 0.6) | 0.913 | 0.879 | 0.896 | 0.924 | 0.565
IMPDIoU (ratio = 0.8) | 0.910 | 0.887 | 0.898 | 0.928 | 0.567
IMPDIoU (ratio = 1.0) | 0.921 | 0.885 | 0.903 | 0.932 | 0.575
IMPDIoU (ratio = 1.2) | 0.920 | 0.885 | 0.902 | 0.934 | 0.576
IMPDIoU (ratio = 1.4) | 0.918 | 0.891 | 0.904 | 0.935 | 0.573
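Consistent with the sweep above, ratios of 1.0–1.4 give the best F1 and mAP50 on the SCE data. The sketch below shows one way such an Inner-MPDIoU combination can be assembled in PyTorch: the IoU term follows Inner-IoU, computed on auxiliary boxes rescaled by `ratio` about each box centre, and the penalty terms follow MPDIoU, using the squared top-left and bottom-right corner distances normalized by the squared image diagonal. The function name and exact composition are illustrative assumptions, not the authors’ released code.

```python
import torch

def impdiou_loss(pred, target, img_w, img_h, ratio=1.0, eps=1e-7):
    """Hedged sketch of an Inner-MPDIoU loss.

    pred, target: (N, 4) boxes as (x1, y1, x2, y2) in pixels.
    ratio rescales auxiliary 'inner' boxes about each box centre
    (ratio > 1 enlarges them, easing gradients for occluded boxes).
    """
    def scaled(box):
        cx, cy = (box[:, 0] + box[:, 2]) / 2, (box[:, 1] + box[:, 3]) / 2
        hw = (box[:, 2] - box[:, 0]) * ratio / 2
        hh = (box[:, 3] - box[:, 1]) * ratio / 2
        return torch.stack([cx - hw, cy - hh, cx + hw, cy + hh], dim=1)

    p, t = scaled(pred), scaled(target)
    # IoU on the ratio-scaled auxiliary boxes (Inner-IoU part).
    ix1, iy1 = torch.max(p[:, 0], t[:, 0]), torch.max(p[:, 1], t[:, 1])
    ix2, iy2 = torch.min(p[:, 2], t[:, 2]), torch.min(p[:, 3], t[:, 3])
    inter = (ix2 - ix1).clamp(0) * (iy2 - iy1).clamp(0)
    area_p = (p[:, 2] - p[:, 0]) * (p[:, 3] - p[:, 1])
    area_t = (t[:, 2] - t[:, 0]) * (t[:, 3] - t[:, 1])
    iou = inter / (area_p + area_t - inter + eps)
    # Corner-distance penalties on the original boxes (MPDIoU part).
    d1 = (pred[:, 0] - target[:, 0]) ** 2 + (pred[:, 1] - target[:, 1]) ** 2
    d2 = (pred[:, 2] - target[:, 2]) ** 2 + (pred[:, 3] - target[:, 3]) ** 2
    norm = img_w ** 2 + img_h ** 2  # squared image diagonal
    return 1 - iou + d1 / norm + d2 / norm
```

Enlarging the auxiliary boxes (ratio > 1) keeps a non-zero overlap, and hence a useful gradient, for low-IoU pairs such as heavily occluded students, which matches the table’s preference for ratios above 1.0.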
ICAPD-based behavior classes of the SCE dataset:

Classes | Behaviors | Example Behaviors
---|---|---
Disengaged | Behavior unrelated to learning | Yawning, drinking water, lying on the desk, sleeping, looking out the window, playing on a phone/computer
Passive | Sitting silently | Seated with a static posture (no movement of the head, hands, or body)
Active | Thinking and operating learning materials | Pointing to materials, underlining sentences, taking out tools, scratching the head, hands on the face or head
Constructive | Generating and expressing new ideas | Taking notes, raising hands, drawing on paper
Interactive | Dialogue with teachers or students | Standing up to talk to teachers, turning the body and talking to peers, applauding mates, clapping hands, patting others
Instance counts of the SCE dataset by class and split:

Class | Training | Testing | Total
---|---|---|---
Passive | 33,613 | 8,309 | 41,922
Active | 9,481 | 2,530 | 12,011
Constructive | 4,071 | 1,001 | 5,072
Disengaged | 3,107 | 773 | 3,880
Interactive | 73,516 | 18,375 | 91,891
Total | 123,788 | 30,988 | 154,776
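For orientation, a dataset configuration in the Ultralytics-YOLO style for these five classes might look like the sketch below. Only the class names and split sizes come from the tables above; the directory layout, file names, and class-index order are illustrative assumptions.

```python
# Hypothetical Ultralytics-style data config for the five ICAPD classes.
# Paths and index order are assumptions; instance counts are per the table.
SCE_DATA_YAML = """\
path: datasets/SCE        # dataset root (assumed layout)
train: images/train       # 123,788 labeled instances
val: images/test          # 30,988 labeled instances
names:
  0: disengaged           # 3,880 instances overall
  1: passive              # 41,922
  2: active               # 12,011
  3: constructive         # 5,072
  4: interactive          # 91,891
"""
```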
Comparison with baseline detectors on the SCE dataset:

Method | P | R | F1 | mAP50 | mAP50-95
---|---|---|---|---|---
Faster R-CNN | 0.815 | 0.637 | 0.715 | 0.820 | 0.504
SSD | 0.831 | 0.573 | 0.678 | 0.833 | 0.479
YOLOv5n | 0.849 | 0.816 | 0.832 | 0.852 | 0.508
YOLOv8n | 0.853 | 0.823 | 0.838 | 0.866 | 0.549
simAM-YOLOv8n | 0.861 | 0.832 | 0.846 | 0.875 | 0.548
OE-YOLOv8n | 0.918 | 0.891 | 0.904 | 0.935 | 0.573
Ablation results on the SCE dataset:

Method | P | R | F1 | mAP50 | mAP50-95
---|---|---|---|---|---
YOLOv8n | 0.853 | 0.823 | 0.838 | 0.866 | 0.549
L-YOLOv8n | 0.870 | 0.841 | 0.855 | 0.880 | 0.520
OE-YOLOv8n | 0.918 | 0.891 | 0.904 | 0.935 | 0.573