Article

Single Camera Face Position-Invariant Driver’s Gaze Zone Classifier Based on Frame-Sequence Recognition Using 3D Convolutional Neural Networks

by Catherine Lollett 1,*, Mitsuhiro Kamezaki 2 and Shigeki Sugano 1

1 Graduate School of Creative Science and Engineering, Waseda University, Tokyo 169-8555, Japan
2 Research Institute for Science and Engineering (RISE), Waseda University, Tokyo 162-0044, Japan
* Author to whom correspondence should be addressed.
Sensors 2022, 22(15), 5857; https://doi.org/10.3390/s22155857
Submission received: 30 June 2022 / Revised: 29 July 2022 / Accepted: 1 August 2022 / Published: 5 August 2022
(This article belongs to the Special Issue Application of Deep Learning in Intelligent Transportation)

Abstract

Estimating the driver’s gaze in a natural real-world setting can be problematic under challenging scenario conditions. For example, faces undergo occlusions, changing illumination, and varied positions while driving. In this work, we aim to reduce misclassifications in driving situations in which the driver’s face lies at different distances from the camera. Three-dimensional Convolutional Neural Network (3D CNN) models can build a spatio-temporal representation of the driver that extracts features encoded in multiple adjacent frames and can therefore describe motion. This characteristic may ease the deficiencies of a per-frame recognition system caused by the lack of context information. For example, the front, navigator, right window, left window, back mirror, and speed meter are among the common areas drivers are known to check. Based on this, we implement and evaluate a model that detects the head direction toward these regions at various distances from the camera. In our evaluation, the 2D CNN model had a mean average recall of 74.96% across the three models, whereas the 3D CNN model had a mean average recall of 87.02%. This result shows that our proposed 3D CNN-based approach outperforms a 2D CNN per-frame recognition approach in driving situations in which the driver’s face lies at different distances from the camera.
Keywords: driver monitoring; gaze classification; convolutional neural networks
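As a rough illustration of the approach described in the abstract, the following minimal PyTorch sketch (not the authors’ implementation; the layer sizes, clip length, and input resolution are assumptions) shows how a small 3D CNN can map a short clip of face frames to one of the six gaze zones, pooling features over time as well as space rather than classifying each frame independently.

```python
# Minimal illustrative sketch (assumed architecture, not the paper's code):
# a small 3D CNN that maps a short clip of face frames to one of six gaze zones
# (front, navigator, right window, left window, back mirror, speed meter).
import torch
import torch.nn as nn

class GazeZone3DCNN(nn.Module):
    def __init__(self, num_zones: int = 6):
        super().__init__()
        # 3D convolutions slide over (time, height, width), so the learned
        # features encode motion across adjacent frames as well as appearance.
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((2, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.classifier = nn.Linear(64, num_zones)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (batch, channels, frames, height, width), e.g. (N, 3, 16, 112, 112)
        x = self.features(clip).flatten(1)
        return self.classifier(x)

# Usage example: classify a dummy 16-frame clip of 112x112 face crops.
model = GazeZone3DCNN()
logits = model(torch.randn(1, 3, 16, 112, 112))
predicted_zone = logits.argmax(dim=1)
```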
