Advancements in Gaze Coordinate Prediction Using Deep Learning: A Novel Ensemble Loss Approach
Abstract
1. Introduction
2. Related Works
3. Materials and Methods
3.1. Dataset
3.2. Models
3.3. Ensemble Loss Function
4. Results
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Majaranta, P.; Räihä, K.J. Twenty Years of Eye Typing: Systems and Design Issues. In Proceedings of the ETRA ’02: 2002 Symposium on Eye Tracking Research & Applications, New Orleans, LA, USA, 25–27 March 2002; pp. 15–22.
2. Ou, W.L.; Cheng, Y.H.; Chang, C.C.; Chen, H.L.; Fan, C.P. Calibration-free and deep-learning-based customer gaze direction detection technology based on the YOLOv3-tiny model for smart advertising displays. J. Chin. Inst. Eng. 2023, 46, 856–869.
3. He, H.; She, Y.; Xiahou, J.; Yao, J.; Li, J.; Hong, Q.; Ji, Y. Real-Time Eye-Gaze Based Interaction for Human Intention Prediction and Emotion Analysis. In Proceedings of the CGI 2018: Computer Graphics International, Bintan Island, Indonesia, 11–14 June 2018; pp. 185–194.
4. Damm, O.; Malchus, K.; Jaecks, P.; Krach, S.; Paulus, F.; Naber, M.; Jansen, A.; Kamp-Becker, I.; Einhäuser, W.; Stenneken, P.; et al. Different gaze behavior in human-robot interaction in Asperger’s syndrome: An eye-tracking study. In Proceedings of the 2013 IEEE RO-MAN, Gyeongju, Republic of Korea, 26–29 August 2013; pp. 368–369.
5. Chennamma, H.; Yuan, X. A Survey on Eye-Gaze Tracking Techniques. Indian J. Comput. Sci. Eng. 2013, 4, 388–393.
6. Zhang, X.; Sugano, Y.; Fritz, M.; Bulling, A. Appearance-based gaze estimation in the wild. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 4511–4520.
7. Krafka, K.; Khosla, A.; Kellnhofer, P.; Kannan, H.; Bhandarkar, S.; Matusik, W.; Torralba, A. Eye Tracking for Everyone. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016.
8. Kellnhofer, P.; Recasens, A.; Stent, S.; Matusik, W.; Torralba, A. Gaze360: Physically Unconstrained Gaze Estimation in the Wild. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
9. Park, S.; Mello, S.D.; Molchanov, P.; Iqbal, U.; Hilliges, O.; Kautz, J. Few-Shot Adaptive Gaze Estimation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 9367–9376.
10. He, J.; Pham, K.; Valliappan, N.; Xu, P.; Roberts, C.; Lagun, D.; Navalpakkam, V. On-Device Few-Shot Personalization for Real-Time Gaze Estimation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), Seoul, Republic of Korea, 27–28 October 2019; pp. 1149–1158.
11. Yang, H.; Yang, Z.; Liu, J.; Chi, J. A new appearance-based gaze estimation via multi-modal fusion. In Proceedings of the 2023 3rd International Conference on Neural Networks, Information and Communication Engineering (NNICE), Guangzhou, China, 24–26 February 2023; pp. 498–502.
12. Bandi, C.; Thomas, U. Face-Based Gaze Estimation Using Residual Attention Pooling Network. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, Lisbon, Portugal, 19–21 February 2023; pp. 541–549.
13. Huang, L.; Li, Y.; Wang, X.; Wang, H.; Bouridane, A.; Chaddad, A. Gaze Estimation Approach Using Deep Differential Residual Network. Sensors 2022, 22, 5462.
14. Negrinho, R.; Gordon, G. DeepArchitect: Automatically designing and training deep architectures. arXiv 2017, arXiv:1704.08792.
15. Dias, P.A.; Malafronte, D.; Medeiros, H.; Odone, F. Gaze Estimation for Assisted Living Environments. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 1–5 March 2020.
16. Cazzato, D.; Leo, M.; Distante, C.; Voos, H. When I look into your eyes: A survey on computer vision contributions for human gaze estimation and tracking. Sensors 2020, 20, 3739.
17. Hu, J.; Shen, L.; Sun, G. Squeeze-and-Excitation Networks. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
18. Chen, Z.; Shi, B.E. Appearance-Based Gaze Estimation Using Dilated-Convolutions. In Computer Vision—ACCV 2018; Jawahar, C., Li, H., Mori, G., Schindler, K., Eds.; Springer: Cham, Switzerland, 2019; pp. 309–324.
19. Palmero, C.; Selva, J.; Bagheri, M.A.; Escalera, S. Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues. In Proceedings of the British Machine Vision Conference (BMVC), Newcastle, UK, 3–6 September 2018.
20. Murthy, L.R.D.; Biswas, P. Appearance-based Gaze Estimation using Attention and Difference Mechanism. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Nashville, TN, USA, 19–25 June 2021; pp. 3137–3146.
21. Wong, E.T.; Yean, S.; Hu, Q.; Lee, B.S.; Liu, J.; Rajan, D. Gaze Estimation Using Residual Neural Network. In Proceedings of the 2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), Kyoto, Japan, 11–15 March 2019; pp. 411–414.
22. Funes Mora, K.A.; Monay, F.; Odobez, J.M. Eyediap: A database for the development and evaluation of gaze estimation algorithms from RGB and RGB-D cameras. In Proceedings of the Symposium on Eye Tracking Research and Applications (ETRA), Safety Harbor, FL, USA, 26–28 March 2014; pp. 255–258.
23. Shen, R.; Zhang, X.; Xiang, Y. AFFNet: Attention Mechanism Network Based on Fusion Feature for Image Cloud Removal. Int. J. Pattern Recognit. Artif. Intell. 2022, 36, 2254014.
24. Hastie, T.; Tibshirani, R.; Friedman, J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer Series in Statistics; Springer: Berlin/Heidelberg, Germany, 2009.
| Previous Study | Method | Performance (MAE) |
|---|---|---|
| Wong et al. [21] | ResNet-18; blinking data removed | GazeCapture: 3.05 cm |
| Zhang et al. [6] | VGG-based model; introduced the MPIIGaze dataset | MPIIGaze: 13.9°; Eyediap: 10.5° |
| FAZE [9] | Few-shot calibration | MPIIGaze: 3.42° |
| iTracker [7] | CNN that predicts (x, y) screen coordinates from images; introduced the GazeCapture dataset | GazeCapture (mobile phone): 1.71 cm; GazeCapture (tablet PC): 2.53 cm |
| AFF-Net [23] | CNN-based model; state-of-the-art performance on MPIIFaceGaze | GazeCapture (mobile phone): 1.62 cm; GazeCapture (tablet PC): 2.30 cm; MPIIFaceGaze: 3.9 cm |
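The studies above report error either as an on-screen distance (cm) or as an angular deviation (°), depending on the dataset. The two are related through the viewing distance: a gaze error of θ degrees corresponds to roughly d·tan(θ) cm on a screen viewed from d cm away. The snippet below is a minimal sketch of this conversion; the 40 cm viewing distance is an assumed illustrative value, not a figure taken from any of the cited papers.

```python
import math

def angular_to_screen_error(angle_deg: float, viewing_distance_cm: float) -> float:
    """Approximate on-screen error (cm) for a given angular gaze error.

    Assumes the gaze ray is near-perpendicular to the screen, so the
    on-screen displacement is viewing_distance * tan(angle).
    """
    return viewing_distance_cm * math.tan(math.radians(angle_deg))

# Example: a 3.42 deg error (FAZE on MPIIGaze) at an assumed 40 cm
# viewing distance corresponds to roughly 2.4 cm on screen.
print(f"{angular_to_screen_error(3.42, 40.0):.2f} cm")
```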
| Model | 2D Error (cm) | 3D Angular Error (°) | L2 Error within 2 cm (%) |
|---|---|---|---|
| iTracker (original) | 7.67 | 7.25 | 41.3912 |
| iTracker (with ensemble loss) | 1.31 | 1.51 | 86.7912 |
| AFF-Net (original) | 4.21 | 3.69 | 90.96 |
| AFF-Net (with ensemble loss) | 0.81 | 0.93 | 95.3423 |
| ResNet (original) | 7.50 | 8.58 | 5.9245 |
| ResNet (with ensemble loss) | 1.28 | 1.47 | 88.3912 |
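The rows marked "with ensemble loss" retrain each model under the proposed ensemble loss; its exact formulation is given in Section 3.3 and is not reproduced in this excerpt. As a rough illustration only, the sketch below shows the general shape of an ensemble loss for 2D gaze regression, a weighted combination of several standard coordinate losses. The component losses and the weights w1–w3 here are assumptions for illustration, not the paper's values.

```python
import torch
import torch.nn.functional as F

def ensemble_loss(pred: torch.Tensor, target: torch.Tensor,
                  w1: float = 1.0, w2: float = 1.0, w3: float = 1.0) -> torch.Tensor:
    """Illustrative ensemble of common 2D-coordinate regression losses.

    pred, target: (batch, 2) gaze coordinates in cm.
    Weights w1-w3 are placeholder values, not the paper's.
    """
    mse = F.mse_loss(pred, target)                    # squared-error term
    mae = F.l1_loss(pred, target)                     # absolute-error term
    euclid = torch.norm(pred - target, dim=1).mean()  # mean Euclidean distance
    return w1 * mse + w2 * mae + w3 * euclid
```

Combining losses in this way lets the squared-error term dominate on large outliers while the absolute and Euclidean terms keep gradients informative near the target, which is one plausible reason an ensemble objective could outperform any single loss.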
| Person ID | AFF-Net with Ensemble Loss (cm) | ResNet with Ensemble Loss (cm) | iTracker with Ensemble Loss (cm) |
|---|---|---|---|
| 0 | 2.60 | 2.46 | 2.89 |
| 1 | 0.53 | 1.19 | 1.11 |
| 2 | 0.69 | 1.32 | 1.08 |
| 3 | 0.64 | 1.11 | 1.03 |
| 4 | 0.61 | 1.10 | 1.14 |
| 5 | 0.61 | 1.10 | 1.10 |
| 6 | 0.77 | 1.45 | 1.50 |
| 7 | 1.08 | 1.69 | 1.59 |
| 8 | 0.87 | 1.49 | 1.42 |
| 9 | 0.64 | 1.25 | 1.13 |
| 10 | 0.62 | 1.46 | 1.17 |
| 11 | 0.65 | 1.13 | 1.42 |
| 12 | 0.82 | 1.46 | 1.25 |
| 13 | 0.56 | 1.21 | 1.18 |
| 14 | 0.50 | 0.89 | 0.75 |
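The per-person values above are mean 2D (Euclidean) distances between predicted and true on-screen coordinates for each test subject, consistent with the 2D error column of the previous table. Below is a minimal sketch of how such per-person scores and the "within 2 cm" rate can be computed; the array names and the NumPy-based layout are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np

def evaluate(preds: np.ndarray, targets: np.ndarray, person_ids: np.ndarray):
    """Per-person mean 2D error (cm) and overall within-2-cm rate.

    preds, targets: (N, 2) screen coordinates in cm; person_ids: (N,) ints.
    """
    errors = np.linalg.norm(preds - targets, axis=1)     # Euclidean error per sample
    within_2cm = 100.0 * np.mean(errors <= 2.0)          # % of samples within 2 cm
    per_person = {pid: errors[person_ids == pid].mean()  # mean error per subject
                  for pid in np.unique(person_ids)}
    return per_person, within_2cm
```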