Fatigue Driving Recognition Method Based on Multi-Scale Facial Landmark Detector
Abstract
1. Introduction
1.1. Background
1.2. Motivation
- (1) Existing fatigue driving recognition methods infer the open or closed state of the eyes and mouth from annotation boxes, or compute the eye and mouth aspect ratios from annotated key points. If the facial key points are detected with low accuracy, the accuracy of fatigue recognition drops accordingly. It is therefore necessary to design a high-accuracy facial landmark detector.
- (2) Most existing fatigue decision models use fixed thresholds to recognize driver fatigue. However, under fatigued conditions, the eye and mouth aspect ratios vary from driver to driver, so the fatigue-state threshold differs between drivers and must be adjusted dynamically.
- (3) Public fatigue driving datasets mainly focus on yawning and rarely cover dozing, while public facial key point datasets rarely contain driver behavior images, and manually annotating 68 or 98 key points per image is arduous. A dataset of driver behavior images therefore needs to be built; it should reduce the number of manually annotated key points per image and cover various driving behaviors, including dozing, yawning, talking, and normal driving.
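The eye and mouth aspect ratios discussed above are computed from annotated key points. As an illustration only, here is a minimal sketch using the widely used six-landmark EAR formulation (Soukupová and Čech) and a simple four-landmark MAR; the paper's own 23-point formulas may differ:

```python
import math

def euclidean(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks ordered p1..p6:
    sum of the two vertical gaps over twice the horizontal width.
    A closed eye drives this ratio toward zero."""
    v1 = euclidean(eye[1], eye[5])
    v2 = euclidean(eye[2], eye[4])
    h = euclidean(eye[0], eye[3])
    return (v1 + v2) / (2.0 * h)

def mouth_aspect_ratio(mouth):
    """MAR from four mouth landmarks: top, bottom, left corner, right corner.
    A wide-open (yawning) mouth drives this ratio up."""
    vertical = euclidean(mouth[0], mouth[1])
    horizontal = euclidean(mouth[2], mouth[3])
    return vertical / horizontal
```

A fatigue monitor would track these ratios per frame and compare them against per-driver thresholds, as discussed in Section 3.5.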
1.3. Our Contributions
- (1) A novel deep learning framework called the Multi-Scale Facial Landmark Detector (MSFLD) is proposed to detect 23 facial key points. The MSFLD replaces all the bottleneck layers of the traditional facial landmark detector with inverted residual blocks and increases the number of multi-scale fully connected layers, which reduces the number of model parameters. The proposed MSFLD thus improves the detection accuracy of facial key points while keeping the detection speed essentially unchanged.
- (2) An MSFLD-based method is proposed for fatigue driving behavior recognition. The method uses a spatial pyramid pooling and multi-scale output (SPP-MSFO) detection model to obtain the face region, detects 23 key points with the MSFLD, computes a fatigue parameter matrix from those key points, and determines the driver's fatigue status by combining an adaptive threshold with a statistical threshold. This not only improves the accuracy of fatigue driving behavior recognition, but also reduces the workload of labeling facial key points in dataset images.
- (3) Within the proposed method, a driving behavior decision strategy combining an adaptive threshold and a statistical threshold is presented. The adaptive threshold accommodates differences in the eye and mouth aspect ratios of individual drivers, while the statistical threshold corrects cases where the adaptive eye threshold is too low or the adaptive mouth threshold is too high. Their combination avoids misjudgment and improves the recognition accuracy of driving behavior.
- (4) The Hunan University Fatigue Driving Detection Dataset (HNUFDD) is built. It covers yawning, dozing, talking, mouth-closed, and normal driving behaviors, and annotates 23 key points in each driver's face region. The proposed method is evaluated on HNUFDD, and the results show its superior performance compared with state-of-the-art methods.
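Contribution (3) above combines a per-driver adaptive threshold with a population-level statistical one. The following is a minimal sketch of one plausible fusion rule; the clamping directions are an assumption inferred from the description ("eye threshold too low, mouth threshold too high"), and the statistical values 0.3091 and 0.6429 are taken from the experimental tables in Section 4:

```python
def fuse_thresholds(adaptive_ear, adaptive_mar,
                    stat_ear=0.3091, stat_mar=0.6429):
    """Guard a per-driver adaptive threshold with a statistical one:
    raise an eye threshold that calibrated too low, and lower a mouth
    threshold that calibrated too high (assumed fusion rule; the
    paper's exact strategy may differ)."""
    ear_thr = max(adaptive_ear, stat_ear)  # floor for the eye threshold
    mar_thr = min(adaptive_mar, stat_mar)  # ceiling for the mouth threshold
    return ear_thr, mar_thr

def is_fatigued(ear, mar, ear_thr, mar_thr):
    """Flag a frame as fatigue-related when the eyes look closed
    (EAR below threshold) or the mouth looks wide open (MAR above)."""
    return ear < ear_thr or mar > mar_thr
```

With this rule, a driver whose calibration produced an implausibly low eye threshold still gets judged against the population-level floor, which matches the misjudgment-avoidance goal stated above.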
2. Related Work
2.1. Facial Landmarks Detection
2.2. Fatigue Driving Recognition
3. Fatigue Driving Recognition Method Based on MSFLD
3.1. Overview of Architecture
3.2. SPP-MSFO Detection
3.3. Model of the MSFLD and Its Learning Algorithm
3.3.1. The MSFLD Model
3.3.2. Learning Algorithm of the MSFLD
Algorithm 1: Training strategy of MSFLD
Input: 17,441 face region images of size 112 × 112 as training samples, with their 23 key point annotations.
Output: The well-trained MSFLD model.
1: Construct the MSFLD model shown in Figure 3;
2: Initialize the parameters and set the batch size (i.e., 96);
3: Repeat
4:   Randomly select a batch of instances from the training set;
5:   Forward the training samples through the MSFLD model;
6:   Compute the loss function by Equation (1);
7:   Propagate the error back through MSFLD and update its parameters;
8:   Keep the parameters that minimize the loss;
9: Until the end condition is satisfied.
- (1) In Line 1, the structure of the MSFLD model is constructed. The model consists of convolutions, inverted bottleneck blocks, average pooling, and multi-scale fully connected layers; the overall architecture is illustrated in Figure 3.
- (2) In Line 2, the parameters of the MSFLD model, including the weights w, biases b, learning rate α, and batch size, are initialized. The initialization scheme for these parameters is described in detail in Section 4.
- (3) In Lines 3–9, forward learning and backward propagation are used to train the MSFLD model. During backward propagation, the model parameters are optimized with Adam. In Line 6, the L2 loss function is defined as shown in Equation (1).
- (4) In Line 9, training terminates once the end condition is met; an iteration limit and an early-stopping policy serve as end conditions. At the end of training, the MSFLD model with optimal parameters for 23 key point detection is obtained.
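The loop of Algorithm 1 can be sketched in PyTorch as follows. This is an approximation for illustration: the learning-rate and patience values are assumptions, and any regression model mapping a 112 × 112 face crop to 46 outputs (23 (x, y) key points) can stand in for the MSFLD:

```python
import torch
import torch.nn as nn

def train_msfld(model, loader, epochs=100, lr=1e-3, patience=10):
    """Mini-batch training with Adam, an L2 (mean-squared-error)
    landmark loss, and an iteration-limit + early-stopping end
    condition, mirroring Lines 3-9 of Algorithm 1."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()                      # L2 loss, cf. Equation (1)
    best_loss, stale = float("inf"), 0
    for epoch in range(epochs):                   # iteration limit
        epoch_loss = 0.0
        for images, landmarks in loader:          # random mini-batches (Line 4)
            optimizer.zero_grad()
            pred = model(images)                  # forward pass (Line 5)
            loss = criterion(pred, landmarks)     # loss (Line 6)
            loss.backward()                       # back-propagation (Line 7)
            optimizer.step()                      # parameter update
            epoch_loss += loss.item()
        if epoch_loss < best_loss - 1e-6:         # track improvement (Line 8)
            best_loss, stale = epoch_loss, 0
        else:
            stale += 1
            if stale >= patience:                 # early-stopping end condition
                break
    return model
```

In practice one would also checkpoint the best-performing parameters rather than only counting stale epochs, but the control flow above is the essence of the training strategy.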
3.4. Facial Fatigue Feature Parameter Matrix
3.4.1. Feature Extraction of Eye Fatigue
3.4.2. Feature Extraction of Mouth Fatigue
3.4.3. Calculate the Fatigue Parameter Matrix
3.5. Driving Behavior Decision Module
3.5.1. Adaptive Threshold Calculation
3.5.2. Statistical Threshold Calculation
3.5.3. Fusion Strategy of Adaptive and Statistical Thresholds
4. Experiments
4.1. Settings
4.1.1. Dataset Description
4.1.2. Dataset Preprocessing
4.1.3. Experimental Conditions
4.1.4. Evaluation Metrics
4.1.5. Baselines
4.2. Experimental Results
4.2.1. Convergence Analysis of the MSFLD
4.2.2. Ablation Study of the MSFLD
4.2.3. Facial Key Point Detection
4.2.4. Fatigue Driving Recognition
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. World Health Organization. Global status report on road safety 2013: Supporting a decade of action. Inj. Prev. 2013, 15, 286.
2. Road Safety in Canada. Available online: https://www.tc.gc.ca/ (accessed on 24 March 2022).
3. Azam, K.; Shakoor, A.; Shah, R.A.; Khan, A.; Shah, S.A.; Khalil, M.S. Comparison of fatigue related road traffic crashes on the national highways and motorways in Pakistan. J. Eng. Appl. Sci. 2014, 33, 47–54.
4. AAA Foundation for Traffic Safety. Available online: https://www.aaafoundation.org (accessed on 10 January 2022).
5. Fatigue. Available online: https://ec.europa.eu/transport/roadsafety/ (accessed on 21 January 2022).
6. Abtahi, S.; Omidyeganeh, M.; Shirmohammadi, S.; Hariri, B. YawDD: A Yawning Detection Dataset. In Proceedings of the ACM Multimedia Systems, Singapore, 19 March 2014; pp. 24–28.
7. Yang, H.; Liu, L.; Min, W.; Yang, X.; Xiong, X. Driver Yawning Detection Based on Subtle Facial Action Recognition. IEEE Trans. Multimed. 2021, 23, 572–583.
8. Köstinger, M.; Wohlhart, P.; Roth, P.M.; Bischof, H. Annotated Facial Landmarks in the Wild: A large-scale, real-world database for facial landmark localization. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Barcelona, Spain, 6–13 November 2011; pp. 2144–2151.
9. Savaş, B.K.; Becerikli, Y. Real Time Driver Fatigue Detection System Based on Multi-Task ConNN. IEEE Access 2020, 8, 12491–12498.
10. Liu, W.; Tang, M.; Wang, C.; Zhang, K.; Wang, Q.; Xu, X. Attention-guided Dual Enhancement Train Driver Fatigue Detection Based on MTCNN. In Proceedings of the International Academic Exchange Conference on Science and Technology Innovation (IAECST), Guangzhou, China, 10–12 December 2021; pp. 1324–1329.
11. Salem, E.; Hassaballah, M.; Mahmoud, M.M.; Ali, A.M.M. Facial Features Detection: A Comparative Study. In Proceedings of the International Conference on Artificial Intelligence and Computer Vision (AICV2021), Settat, Morocco, 28–30 June 2021; pp. 402–412.
12. Khabarlak, K.; Koriashkina, L. Fast facial landmark detection and applications: A survey. J. Comput. Sci. Technol. 2022, 22, 12–41.
13. Kazemi, V.; Sullivan, J. One millisecond face alignment with an ensemble of regression trees. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 1867–1874.
14. Hassaballah, M.; Bekhet, S.; Rashed, A.A.M.; Zhang, G. Facial Features Detection and Localization. Recent Adv. Comput. Vis. 2019, 804, 33–59.
15. Kansizoglou, I.; Misirlis, E.; Tsintotas, K.; Gasteratos, A. Continuous Emotion Recognition for Long-Term Behavior Modeling through Recurrent Neural Networks. Technologies 2022, 10, 59.
16. Sun, Y.; Wang, X.; Tang, X. Deep Convolutional Network Cascade for Facial Point Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Portland, OR, USA, 23–28 June 2013; pp. 3476–3483.
17. Zhang, K.; Zhang, Z.; Li, Z.; Qiao, Y. Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks. IEEE Signal Process. Lett. 2016, 23, 1499–1503.
18. Deng, J.; Guo, J.; Zhou, Y. RetinaFace: Single-Stage Dense Face Localisation in the Wild. Available online: https://arxiv.org/abs/1905.00641 (accessed on 20 May 2022).
19. Guo, X.J.; Li, S.Y.; Yu, J.K. PFLD: A Practical Facial Landmark Detector. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 16–20 June 2019; pp. 1–11.
20. Liu, Y. Grand Challenge of 106-Point Facial Landmark Localization. In Proceedings of the IEEE International Conference on Multimedia & Expo Workshops (ICMEW), Shanghai, China, 8–12 July 2019; pp. 613–616.
21. Hassaballah, M.; Salem, E.; Ali, A.M.M.; Mahmoud, M.M. Deep recurrent regression with a heatmap coupling module for facial landmarks detection. Cogn. Comput. 2022, 1–15.
22. Sikander, G.; Anwar, S. Driver Fatigue Detection Systems: A Review. IEEE Trans. Intell. Transp. 2019, 20, 2339–2352.
23. Portouli, E.; Bekiaris, E.; Papakostopoulos, V.; Maglaveras, N. On-road experiment for collecting driving behavioural data of sleepy drivers. Somnology 2007, 11, 259–267.
24. Yang, Z.; Ren, H. Feature Extraction and Simulation of EEG Signals During Exercise-Induced Fatigue. IEEE Access 2019, 7, 46389–46398.
25. Chui, K.T.; Tsang, K.F.; Chi, H.R.; Ling, B.W.; Wu, C.K. An Accurate ECG-Based Transportation Safety Drowsiness Detection Scheme. IEEE Trans. Ind. Inform. 2016, 12, 1438–1452.
26. Tsuchida, A.; Bhuiyan, M.S.; Oguri, K. Estimation of drowsiness level based on eyelid closure and heart rate variability. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Minneapolis, MN, USA, 3–6 September 2009; pp. 2543–2546.
27. Balasubramanian, V.; Adalarasu, K. EMG-based analysis of change in muscle activity during simulated driving. J. Bodyw. Mov. Ther. 2007, 11, 151–158.
28. Yang, J.H.; Mao, Z.H.; Tijerina, L.; Pilutti, T.; Coughlin, J.F.; Feron, E. Detection of Driver Fatigue Caused by Sleep Deprivation. IEEE Trans. Syst. Man Cybern. 2009, 39, 694–705.
29. Lee, B.G.; Chung, W.Y. Driver Alertness Monitoring Using Fusion of Facial Features and Bio-Signals. IEEE Sens. J. 2012, 12, 2416–2422.
30. Li, K.; Gong, Y.; Ren, Z. A Fatigue Driving Detection Algorithm Based on Facial Multi-Feature Fusion. IEEE Access 2020, 8, 101244–101259.
31. Du, G.; Li, T.; Li, C.; Liu, P.X.; Li, D. Vision-Based Fatigue Driving Recognition Method Integrating Heart Rate and Facial Features. IEEE Trans. Intell. Transp. 2021, 22, 3089–3100.
32. Raja, M.S.; Manu, V.S.; Reshma, D. A Real-time Fatigue Detection System using Multi-Task Cascaded CNN Model. In Proceedings of the IEEE International Conference on Communication Systems and Network Technologies (CSNT), Bhopal, India, 24–25 April 2021; pp. 674–679.
33. Jia, H.; Xiao, Z.; Ji, P. Fatigue Driving Detection Based on Deep Learning and Multi-Index Fusion. IEEE Access 2021, 9, 147054–147062.
34. Hao, Z.; Li, Z.; Dang, X.; Ma, Z.; Liu, G. MM-LMF: A Low-Rank Multimodal Fusion Dangerous Driving Behavior Recognition Method Based on FMCW Signals. Electronics 2022, 11, 3800.
Input | Operator | t | c | n | s | p |
---|---|---|---|---|---|---|
112 × 112 × 3 | Conv 3 × 3 | − | 64 | 1 | 2 | 1 |
56 × 56 × 64 | Depthwise Conv 3 × 3 | − | 64 | 1 | 1 | 1 |
56 × 56 × 64 | Inverted bottleneck | 2 | 64 | 3 | 2 | 1 |
28 × 28 × 64 | Inverted bottleneck | 3 | 96 | 3 | 2 | 1 |
14 × 14 × 96 | Inverted bottleneck | 4 | 144 | 4 | 2 | 1 |
7 × 7 × 144 | Inverted bottleneck | 2 | 16 | 1 | 1 | 1 |
7 × 7 × 16 | Conv 3 × 3 | − | 32 | 1 | 1 | 1 |
7 × 7 × 32 | Conv 7 × 7 | − | 128 | 1 | 1 | 0 |
(S1) 56 × 56 × 64 | Avg pool | − | 64 | 1 | 2 | 1 |
(S2) 28 × 28 × 64 | Avg pool | − | 96 | 1 | 2 | 1 |
(S3) 14 × 14 × 96 | Avg pool | − | 144 | 1 | 2 | 1 |
(S4) 7 × 7 × 144 | Avg pool | − | 128 | 1 | 1 | 0 |
(S5) 1 × 1 × 128 | − | − | 128 | − | − | − |
In_feature = 496 | Full connection | − | 46 | 1 | − | − |

t: expansion factor; c: output channels; n: number of repeats; s: stride; p: padding.
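The inverted bottleneck rows in the table above follow the MobileNetV2 pattern: a 1 × 1 expansion by factor t, a 3 × 3 depthwise convolution with stride s, and a linear 1 × 1 projection to c channels. A sketch of such a block (the activation choice and batch-norm placement are assumptions, not confirmed details of the MSFLD):

```python
import torch
import torch.nn as nn

class InvertedBottleneck(nn.Module):
    """MobileNetV2-style inverted residual block:
    1x1 expand (factor t) -> 3x3 depthwise (stride s) -> 1x1 project,
    with a residual skip when the stride is 1 and channels match."""
    def __init__(self, in_ch, out_ch, t, s):
        super().__init__()
        hid = in_ch * t
        self.use_skip = (s == 1 and in_ch == out_ch)
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hid, 1, bias=False),        # expand
            nn.BatchNorm2d(hid), nn.ReLU6(inplace=True),
            nn.Conv2d(hid, hid, 3, stride=s, padding=1,
                      groups=hid, bias=False),           # depthwise
            nn.BatchNorm2d(hid), nn.ReLU6(inplace=True),
            nn.Conv2d(hid, out_ch, 1, bias=False),       # linear project
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_skip else y
```

For example, the third table row (56 × 56 × 64, t = 2, c = 64, s = 2) halves the spatial resolution to 28 × 28 while keeping 64 output channels.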
Formulation | NME (%) | Model Size (MiB) |
---|---|---|
Bottleneck + S1 + S2 + S3 | 6.4803 | 6.6 |
Inverted bottleneck + S1 + S2 + S3 | 6.3845 | 6.1 |
Inverted bottleneck + S1 + S2 + S3 + S4 + S5 (MSFLD) | 5.4518 | 6.2 |
Method | NME (%) | Model Size (MiB) | Execution Time (s) |
---|---|---|---|
MTCNN [17] | 9.0951 | 1.97 | 10.72 |
RetinaFace_ResNet50 [18] | 5.7063 | 104.7 | 8.18 |
PFLD [19] | 6.4803 | 6.6 | 4.29 |
DURN_MobileNetV3 [20] | 9.1648 | 3.6 | 6.02 |
MSFLD | 5.4518 | 6.2 | 4.82 |
p | Accuracy (%) |
---|---|
30 | 90.4624 |
35 | 89.5954 |
40 | 89.0173 |
50 | 87.8613 |
EAR | MAR | Accuracy (%) |
---|---|---|
0.2452 | 0.8299 | 83.5260 |
0.3729 | 0.4559 | 44.5087 |
0.3091 | 0.6429 | 94.5087 |
p | EAR | MAR | Accuracy (%) |
---|---|---|---|
30 | 0.3091 | 0.6429 | 99.1329 |
35 | 0.3091 | 0.6429 | 97.9769 |
40 | 0.3091 | 0.6429 | 97.1098 |
50 | 0.3091 | 0.6429 | 95.9538 |
Threshold Strategy | Accuracy (%) |
---|---|
Adaptive threshold | 90.4624 |
Statistical threshold | 94.5087 |
Combination threshold | 99.1329 |
Share and Cite
Xiao, W.; Liu, H.; Ma, Z.; Chen, W.; Sun, C.; Shi, B. Fatigue Driving Recognition Method Based on Multi-Scale Facial Landmark Detector. Electronics 2022, 11, 4103. https://doi.org/10.3390/electronics11244103