Search Results (129)

Search Parameters:
Keywords = facial landmark detection

26 pages, 2634 KB  
Article
Minimal Angular Facial Representation for Real-Time Emotion Recognition
by Gerardo Garcia-Gil
Appl. Sci. 2026, 16(7), 3572; https://doi.org/10.3390/app16073572 - 6 Apr 2026
Viewed by 452
Abstract
Real-time facial emotion recognition remains challenging due to the high dimensionality and computational cost of dense facial representations, which limit their applicability in resource-constrained and real-time scenarios. This study proposes a compact, anatomically informed angular facial representation for efficient, interpretable emotion recognition under real-time constraints. Facial landmarks are first extracted using a standard landmark detection framework, from which a reduced facial mesh of 27 anatomically selected points is defined. Internal geometric angles computed from this mesh are analyzed using temporal variability and redundancy criteria, resulting in a minimal set of eight angular descriptors that capture the most expressive facial dynamics while preserving geometric invariance and computational efficiency. The proposed representation is evaluated using multiple supervised machine learning classifiers under two complementary validation strategies: stratified frame-level cross-validation and strict Leave-One-Subject-Out evaluation. Under mixed-subject stratified validation, the best-performing model (MLP) achieved macro-averaged F1-scores exceeding 0.95 and near-unity ROC–AUC values. However, subject-independent evaluation revealed reduced generalization performance (average accuracy ≈55%), highlighting the influence of inter-subject morphological variability embedded in absolute angular descriptors. These findings indicate that a minimal angular geometric encoding provides strong intra-subject discriminative capability while transparently characterizing its cross-subject generalization limits, offering a practical and interpretable alternative for data- and resource-constrained real-time scenarios.
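
The eight descriptors above are internal angles of a reduced landmark mesh. A minimal sketch of computing one such angle from 2D landmark coordinates (the specific points below are illustrative, not the paper's selected set):

```python
import numpy as np

def internal_angle(a, b, c):
    """Angle in degrees at vertex b of the landmark triangle a-b-c."""
    ba = np.asarray(a, dtype=float) - b
    bc = np.asarray(c, dtype=float) - b
    cos_t = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))

# Hypothetical normalized coordinates around a mouth corner:
print(internal_angle((0.42, 0.71), (0.38, 0.74), (0.40, 0.80)))
```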

19 pages, 3413 KB  
Article
AI-Based Angle Map Analysis of Facial Asymmetry in Peripheral Facial Palsy
by Andreas Heinrich, Gerd Fabian Volk, Christian Dobel and Orlando Guntinas-Lichius
Bioengineering 2026, 13(4), 426; https://doi.org/10.3390/bioengineering13040426 - 6 Apr 2026
Viewed by 449
Abstract
Peripheral facial palsy (PFP) causes pronounced facial asymmetry and functional impairment, highlighting the need for reliable, objective assessment. This study presents a novel, fully automated, reference-free method for quantifying facial symmetry using artificial intelligence (AI)-based facial landmark detection. A total of 405 datasets from 198 PFP patients were analyzed, each including nine standardized facial expressions covering both resting and dynamic movements. AI detected 478 landmarks per image, from which 225 paired landmarks were used to compute local asymmetry angles. Systematic evaluation identified 91 highly informative landmark pairs, primarily around the eyes, nose and mouth, which simplified the analysis and enhanced discriminatory power, while also enabling region-specific assessment of asymmetry. Statistical evaluation included Kruskal–Wallis H-tests across clinical scores and Spearman correlations, showing moderate to strong associations (0.32–0.73, p < 0.001). The fully automated pipeline produced reproducible results and demonstrated robustness to head rotation. Intuitive full-face angle maps allowed direct assessment of asymmetry without a reference image. This AI-driven approach provides a robust, objective, and visually interpretable framework for clinical monitoring, severity classification, and treatment evaluation in PFP, combining quantitative precision with practical applicability.
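
The abstract describes local asymmetry angles computed from paired left/right landmarks. One plausible reading of such an angle, assuming the face has been aligned so its symmetry axis is vertical (not necessarily the authors' exact formulation):

```python
import numpy as np

def pair_asymmetry_angle(left_pt, right_pt):
    """Deviation (degrees) of the line joining a left/right landmark
    pair from the horizontal; 0 for a perfectly symmetric pair."""
    dx, dy = np.subtract(right_pt, left_pt)
    return abs(np.degrees(np.arctan2(dy, dx)))

# A hypothetical mouth-corner pair in normalized image coordinates:
print(pair_asymmetry_angle((0.35, 0.70), (0.65, 0.72)))  # ~3.8 deg
```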

21 pages, 13964 KB  
Article
Towards Generalizable Deepfake Detection via Facial Landmark-Guided Convolution and Local Structure Awareness
by Hao Chen, Zhengxu Zhang, Qin Li and Chunhui Feng
Algorithms 2026, 19(4), 270; https://doi.org/10.3390/a19040270 - 1 Apr 2026
Viewed by 372
Abstract
As deepfakes become increasingly realistic, there is a growing need for robust and highly accurate facial forgery detection algorithms. Existing studies show that global feature modeling approaches (Transformer, VMamba) are effective in capturing long-range dependencies, yet they often lack sufficient sensitivity to localized facial tampering artifacts. Meanwhile, traditional convolutional methods excel at extracting local image features but struggle to incorporate prior knowledge about facial anatomy, resulting in limited representational capability. To address these limitations, this paper proposes LGMamba, a novel detection framework that integrates facial guidance, which focuses on key facial components and fine-grained detail regions commonly manipulated in deepfakes, with global modeling. First, we introduce an innovative Landmark-Guided Convolution (LGConv), which adaptively adjusts convolutional sampling positions using facial landmark information. This allows the model to attend to forgery-prone facial regions, such as the eyes and mouth. Second, we design a parallel Facial Structure Awareness Block (FSAB) to operate alongside the VMamba-based visual State-Space Model. Equipped with a multi-stage residual design and a CBAM attention mechanism, FSAB enhances the model’s sensitivity to subtle facial artifacts, enabling joint exploitation of global semantic consistency and fine-grained forgery cues within a unified architecture. The proposed LGMamba achieves superior performance compared to existing mainstream approaches. In cross-dataset evaluations, it attains AUC scores of 92.34% on CD1 and 96.01% on CD2, outperforming all compared methods.
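
LGConv adapts convolutional sampling positions; the generic mechanism it builds on is deformable convolution. A sketch using torchvision's deform_conv2d, with offsets predicted by a small learned convolution rather than derived from landmarks as in the paper (shapes are illustrative):

```python
import torch
from torchvision.ops import deform_conv2d

x = torch.randn(1, 16, 32, 32)             # feature map
weight = torch.randn(8, 16, 3, 3)          # 3x3 kernel, 8 output channels

# Two offsets (dy, dx) per kernel tap; LGConv would derive these from
# facial landmark positions instead of a learned conv head.
offset_head = torch.nn.Conv2d(16, 2 * 3 * 3, kernel_size=3, padding=1)
offset = offset_head(x)                    # (1, 18, 32, 32)

y = deform_conv2d(x, offset, weight, padding=1)
print(y.shape)                             # torch.Size([1, 8, 32, 32])
```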

21 pages, 22338 KB  
Article
Nighttime Driver Fatigue Detection Based on Real-Time Joint Face and Facial Landmarks Detection
by Zhuofan Huang, Shangkun Liu, Jingli Huang and Jie Huang
Modelling 2026, 7(2), 60; https://doi.org/10.3390/modelling7020060 - 21 Mar 2026
Viewed by 349
Abstract
Driver fatigue detection (DFD) in low-light nighttime driving environments is crucial for road safety, but it remains challenging due to degraded image quality and computational constraints. This paper proposes a real-time three-stage framework specifically designed for nighttime driver fatigue detection, integrating low-light image enhancement, joint face and facial landmark detection, and geometry-based fatigue judgment. In the initial stage, the framework utilizes the Zero-Reference Deep Curve Estimation (Zero-DCE) algorithm to improve the visual quality of input images under low-light conditions. Subsequently, a novel lightweight single-stage detector, You Only Look Once for Joint Face and Facial Landmark Detection (YOLOJFF), is introduced for efficient joint localization. Finally, fatigue judgment is performed in real-time by calculating the Eye Aspect Ratio (EAR) and Mouth Aspect Ratio (MAR) from the detected landmarks and using a sliding time window strategy. Experimental results demonstrate that the enhancement module significantly improves detection performance. The YOLOJFF model achieves a favorable balance, with 90.9% precision, 87.6% mean Average Precision (mAP), and a Normalized Mean Error (NME) of 5.2, while requiring only 3.7 million (M) parameters and running at 107.5 FPS. The proposed framework provides a robust and efficient solution for real-time DFD in nighttime scenarios.
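
EAR and MAR are standard geometric ratios over detected landmarks; EAR follows the Soukupová–Čech definition. A sketch of EAR plus a sliding-window fatigue rule (the threshold, window length, and closed-frame ratio below are illustrative, not the paper's values):

```python
import numpy as np
from collections import deque

def eye_aspect_ratio(eye):
    """eye: six (x, y) points p1..p6; EAR = (|p2-p6| + |p3-p5|) / (2|p1-p4|)."""
    p = np.asarray(eye, dtype=float)
    return (np.linalg.norm(p[1] - p[5]) + np.linalg.norm(p[2] - p[4])) / \
           (2.0 * np.linalg.norm(p[0] - p[3]))

EAR_THRESHOLD, WINDOW, CLOSED_RATIO = 0.2, 90, 0.7   # illustrative values
window = deque(maxlen=WINDOW)

def update_fatigue(ear):
    """True when most frames in the sliding window show closed eyes."""
    window.append(ear < EAR_THRESHOLD)
    return len(window) == WINDOW and sum(window) / WINDOW > CLOSED_RATIO
```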

19 pages, 7295 KB  
Article
Video Identifying and Eraser: Use Multi-Task Cascaded Convolutional Neural Network to Enhance Safety in a Text-to-Video Diffusion Model
by Shuang Lin, Ranran Zhou and Yong Wang
Appl. Sci. 2026, 16(6), 2995; https://doi.org/10.3390/app16062995 - 20 Mar 2026
Viewed by 276
Abstract
Current security solutions predominantly rely on cloud-based implementations, often neglecting computational resource constraints and operational efficiency. While contemporary methodologies typically require additional training, the few that operate without retraining frequently yield suboptimal performance. To address these limitations, this work leverages a pre-trained MTCNN architecture to detect faces of copyright-protected individuals. We construct a facial landmark database comprising five critical fiducial points, which serves as a supplementary module integrated into the stable diffusion framework, enabling real-time security filtering for synthesized video content. The proposed system utilizes MTCNN models pre-trained in the cloud to build a repository of copyrighted facial signatures, generating a geometric parameter database of facial landmarks. This database, coupled with a parallel verification unit, functions as a plugin within the standard Stable Diffusion pipeline. By leveraging Stable Diffusion’s native decoder, we decode stochastic frames from the U-Net latent representations and perform real-time comparative analysis to identify potential copyright violations in generated video sequences. Upon detecting an infringement, an on-screen display (OSD) alert notifies the user and immediately halts the text-to-video (T2V) generation process. Experimental evaluations demonstrate that our framework effectively mitigates the resource constraints and latency issues inherent in edge deployment scenarios of prior security implementations. Leveraging MTCNN’s proven robustness and extensive edge compatibility for facial recognition, the proposed detection and obfuscation plugin integrates seamlessly with Stable Diffusion while preserving generation quality.
(This article belongs to the Special Issue Applied Multimodal AI: Methods and Applications Across Domains)
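
The facial-signature database stores the geometry of MTCNN's five fiducial points. A sketch of one plausible matching scheme, normalizing away translation and scale before comparison (the tolerance and the normalization are assumptions, not the paper's stated method):

```python
import numpy as np

def normalize_five_points(pts):
    """Translation/scale-normalize MTCNN's 5 landmarks
    (two eyes, nose tip, two mouth corners)."""
    p = np.asarray(pts, dtype=float)
    p -= p.mean(axis=0)
    return p / np.linalg.norm(p)

def matches_protected_face(query, database, tol=0.05):
    """True if the query signature is close to any stored signature."""
    q = normalize_five_points(query)
    return any(np.linalg.norm(q - normalize_five_points(ref)) < tol
               for ref in database)
```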

19 pages, 34223 KB  
Article
A Real Time Multi Modal Computer Vision Framework for Automated Autism Spectrum Disorder Screening
by Lehel Dénes-Fazakas, Ioan Catalin Mateas, Alexandru George Berciu, László Szilágyi, Levente Kovács and Eva-H. Dulf
Electronics 2026, 15(6), 1287; https://doi.org/10.3390/electronics15061287 - 19 Mar 2026
Viewed by 419
Abstract
Background: The early detection of autism spectrum disorder (ASD) is imperative for enhancing long-term developmental outcomes. Nevertheless, conventional screening methods depend on time-consuming, expert-driven behavioral assessments and are characterized by limited scalability. Automated video-based analysis provides a noninvasive and objective approach for the extraction of behavioral biomarkers from naturalistic recordings. Methods: A modular multimodal framework was developed that integrates motion-based video analysis and facial feature extraction for the purpose of ASD versus typically developing (TD) classification. The system is capable of processing RGB videos, skeleton/stickman representations, and motion trajectory streams. A comprehensive set of kinematic features was extracted, encompassing joint trajectories, velocity and acceleration profiles, posture variability, movement smoothness, and bilateral asymmetry. The repetitive stereotypical behaviors exhibited by the subjects were characterized using frequency-domain analysis via FFT within the 0.3–7.0 Hz band. Facial expression features derived from normalized face crops and landmark-based morphological descriptors were integrated as complementary modalities. The feature-level fusion process was executed subsequent to z-score normalization, and the classification procedure was conducted using a Random Forest model with stratified 5-fold cross validation. The implementation of GPU acceleration was instrumental in facilitating near real-time inference. Results: The motion-based ComplexVideos pipeline demonstrated a cross-validated accuracy of 94.2 ± 2.1% with an area under the ROC curve (AUC) of 0.93. Skeleton-based KinectStickman inputs demonstrated moderate performance, with an accuracy range of 60–80%. In contrast, facial-only models exhibited an accuracy of approximately 60%. The integration of multiple modalities through feature fusion has been demonstrated to enhance the robustness of classification algorithms and mitigate the occurrence of false negative outcomes, thereby surpassing the performance of single-modality models. The mean inference time remained below one second per video frame under standard operating conditions. Conclusions: The experimental results demonstrate that the integration of multimodal cues, including motion and facial features, facilitates the development of effective and efficient video-based screening methods for autism spectrum disorder (ASD). The proposed framework is designed to offer a scalable, extensible, and computationally efficient solution that can support early screening in clinical and remote assessment settings.
(This article belongs to the Special Issue Computer Vision and Machine Learning for Biometric Systems)
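
Repetitive stereotyped movement is characterized here by spectral power in the 0.3–7.0 Hz band. A minimal FFT band-power sketch over a single joint trajectory (the framework's exact feature definition may differ):

```python
import numpy as np

def band_power_fraction(signal, fs, lo=0.3, hi=7.0):
    """Fraction of spectral power in [lo, hi] Hz; fs is the frame rate."""
    spectrum = np.abs(np.fft.rfft(signal - np.mean(signal))) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    return spectrum[band].sum() / max(spectrum.sum(), 1e-12)

# A 2 Hz oscillation sampled at 30 FPS falls squarely in the band:
t = np.arange(0, 10, 1 / 30)
print(band_power_fraction(np.sin(2 * np.pi * 2.0 * t), fs=30))  # ~1.0
```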

12 pages, 3058 KB  
Proceeding Paper
AI Facial Acupuncture Point Interactive Voice Health Care Teaching System
by Wen-Cheng Chen, Yu-Hsuan Chen, Yu-Hsing Chen, Jiu-Wen Wang, Hung-Jen Chen and Jr-Wei Tsai
Eng. Proc. 2026, 128(1), 37; https://doi.org/10.3390/engproc2026128037 - 16 Mar 2026
Viewed by 331
Abstract
We developed an AI-based system for facial acupoint recognition and healthcare support, integrating MediaPipe facial and hand tracking technologies to address the problems of inaccurate and non-standardized acupoint identification in traditional Chinese medicine (TCM). By leveraging facial landmark detection and fingertip tracking, the system enables accurate localization of facial acupoints to ensure precise stimulation. The system contributes to the standardization of acupoint recognition, intelligent health consultation, and the digital transformation of TCM practices. Further enhancements are necessary by expanding acupoint recognition to other body parts (e.g., ears, hands, feet, and back) and integrating with wearable devices to further promote personalized and precise TCM healthcare.
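
The core check reduces to whether a tracked fingertip coincides with a target face-mesh landmark. A sketch over MediaPipe-style normalized landmarks: landmark 8 is MediaPipe Hands' index fingertip, while the acupoint's face-mesh index and the tolerance below are hypothetical placeholders:

```python
import numpy as np

INDEX_FINGERTIP = 8       # MediaPipe Hands: index finger tip
ACUPOINT_LANDMARK = 168   # hypothetical face-mesh index for a target acupoint

def fingertip_on_acupoint(hand_lms, face_lms, tol=0.03):
    """hand_lms/face_lms: sequences of normalized (x, y) landmarks from
    MediaPipe Hands and FaceMesh; True when the fingertip is within tol."""
    tip = np.asarray(hand_lms[INDEX_FINGERTIP], dtype=float)
    pt = np.asarray(face_lms[ACUPOINT_LANDMARK], dtype=float)
    return float(np.linalg.norm(tip - pt)) < tol
```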

33 pages, 5767 KB  
Article
Hyper-Thyro Vision: An Integrated Framework for Hyperthyroidism Diagnostic Facial Image Analysis Based on Deep Learning
by Poonyisa Thepmangkorn and Suchada Sitjongsataporn
Biomimetics 2026, 11(3), 210; https://doi.org/10.3390/biomimetics11030210 - 15 Mar 2026
Viewed by 577
Abstract
This paper presents an integrated multi-modal framework for detecting hyperthyroidism-associated abnormalities, namely exophthalmos and thyroid-related neck swelling, through the joint analysis of frontal facial and neck images using a deep learning-based approach. The objective of this research is to develop an integrated AI framework that improves hyperthyroid-related abnormality detection by simultaneously analyzing facial images of both the eye and neck based on clinical pattern knowledge. The multi-modal framework mimics a biological visual mechanism by using a dual-pathway architecture that concurrently processes foveal-like details of the eyes and neck. It integrates these high-resolution visual embeddings with quantitative morphological measurements to simulate a clinician’s ability to fuse observation with physical assessment. The proposed system employs a multi-faceted decision-making process derived from three distinct data components: two from frontal face analysis and one from neck region analysis. Specifically, eye regions extracted from facial images are preprocessed using the YOLOv11s model. The proposed system leverages a dual-pathway processing architecture to extract comprehensive diagnostic features. For the eye dataset, the framework utilizes a face mesh-based eye landmark (FMEL) to extract both eye regions and perform eyes unfold processing. These regions are subsequently analyzed by the proposed sclera map unwrapping engine (SMUE) to derive quantitative sclera metrics from both the left and right eyes. To optimize classification, a dual-branch architecture is employed by integrating CNN visual embeddings with SMUE-derived statistical features through a feature fusion layer. Simultaneously, the neck processing path executes the neck region of interest (ROI) prediction {upper, lower} to segment critical regions for goiter assessment via the proposed neck μσ ensemble thresholding (NSET) algorithm. The experimental results demonstrate that the proposed algorithm for eye analysis achieved a mean average precision (mAP50) of 96.4%, with a specific mAP50 of 98.6% for the hyperthyroid class. Regarding quantitative scleral measurement, the SMUE process revealed distinct morphological differences, with the experimental data group exhibiting consistently higher pixel distances across the reference points compared with the normal group. Furthermore, the proposed NSET algorithm yielded the highest performance for swollen neck classification with an mAP50 of 92.0%, significantly outperforming the baseline deep learning models while maintaining lower computational complexity.
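
The abstract names a neck μσ ensemble thresholding (NSET) step without defining it. As a loose illustration only, a generic mean-plus-k-sigma threshold over a neck ROI; the constant k and the rule itself are assumptions, not the authors' algorithm:

```python
import numpy as np

def mu_sigma_threshold(roi, k=1.5):
    """Flag pixels brighter than mean + k*std inside a grayscale ROI
    (illustrative stand-in for a mu/sigma thresholding step)."""
    return roi > roi.mean() + k * roi.std()

roi = np.random.default_rng(0).normal(120, 15, (64, 64))
print(mu_sigma_threshold(roi).mean())  # fraction of flagged pixels
```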

16 pages, 1729 KB  
Article
Objective Dynamic Assessment of Facial Movement Asymmetry in Children Using a Marker-Based Video Method
by Dawid Danecki, Agata Sage, Zuzanna Miodońska, Sebastian Zowada, Anna Lipowicz, Andrzej Myśliwiec, Krzysztof Dowgierd, Ewa Piętka and Michał Kręcichowst
J. Clin. Med. 2026, 15(5), 1870; https://doi.org/10.3390/jcm15051870 - 28 Feb 2026
Viewed by 385
Abstract
Background: Facial movement symmetry is an important indicator of neuromuscular function, with asymmetries associated with neurological disorders, trauma, and surgery. Quantitative symmetry assessment supports diagnosis, therapy monitoring, and surgical planning. This study proposes a marker-based approach to improve tracking stability and investigates whether dynamic facial movement descriptors can distinguish symmetric from asymmetric exercise execution. Methods: Videos were recorded using a low-cost acquisition setup during two facial exercises: eyebrow raising and smiling (75 patients; mean age 14 ± 4 years). Seventeen ArUco markers were placed at predefined facial landmarks. The dataset comprised 134 recordings labeled as symmetric (S) or asymmetric (AS). The processing pipeline included marker and face detection, symmetry axis estimation, feature extraction, and statistical analysis. Features were based on distances between paired markers and the estimated facial symmetry axis, yielding two dynamic descriptors: VertDist (vertical displacement) and Ratio (relative position across facial halves), along with their first derivatives. Results: Group differences between S and AS movements were analyzed using Welch’s t-test with effect sizes quantified by Hedges’ g. Statistically significant differences were found primarily in the first derivatives of VertDist and Ratio. For eyebrow raising, VertDist showed large effects (Hedges’ |g| = 1.41–1.42) and Ratio moderate effects (|g| = 0.75–0.87). For smiling, VertDist demonstrated moderate effects (|g| = 0.87–0.93), while Ratio exhibited large effects (|g| = 1.14–1.21). Conclusions: The proposed marker-based method enables reliable, low-cost quantitative assessment of facial movement asymmetry. Dynamic descriptors derived from VertDist and Ratio effectively differentiate symmetric and asymmetric facial movements.
(This article belongs to the Section Clinical Pediatrics)
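
The statistics in the Results are standard: Welch's t-test with a Hedges' g effect size. A sketch using SciPy plus the usual small-sample bias correction:

```python
import numpy as np
from scipy import stats

def welch_and_hedges_g(a, b):
    """Welch's t-test p-value and Hedges' g for two feature samples."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    _, p = stats.ttest_ind(a, b, equal_var=False)
    n1, n2 = len(a), len(b)
    pooled = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
                     / (n1 + n2 - 2))
    d = (a.mean() - b.mean()) / pooled          # Cohen's d
    j = 1 - 3 / (4 * (n1 + n2) - 9)             # bias correction -> Hedges' g
    return p, j * d
```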

13 pages, 1494 KB  
Article
Development and Clinical Validation of an Artificial Intelligence-Based Automated Visual Acuity Testing System
by Kelvin Zhenghao Li, Hnin Hnin Oo, Kenneth Chee Wei Liang, Najah Ismail, Jasmine Ling Ling Chua, Jackson Jie Sheng Chng, Yang Wu, Daryl Wei Ren Wong, Sumaya Rani Khan, Boon Peng Yap, Rong Tong, Choon Meng Kiew, Yufei Huang, Chun Hau Chua, Alva Khai Shin Lim and Xiuyi Fan
Life 2026, 16(2), 357; https://doi.org/10.3390/life16020357 - 20 Feb 2026
Viewed by 800
Abstract
Background: To develop and validate an automated visual acuity (VA) testing system integrating artificial intelligence (AI)–driven speech and image recognition technologies, enabling self-administered, clinic-based VA assessment; Methods: The system incorporated a fine-tuned Whisper speech-recognition model with Silero voice activity detection and pose estimation through facial landmark and ArUco marker detection. A state-driven interface guided users through sequential testing with and without a pinhole. Speech recognition was enhanced using a local Singaporean English dataset. Laboratory validation assessed speech and pose recognition performance, while clinical validation compared automated and manual VA testing at a tertiary eye clinic; Results: The fine-tuned model reduced word error rates from 17.83% to 9.81% for letters and 2.76% to 1.97% for numbers. Pose detection accurately identified valid occluder states. Among 72 participants (144 eyes), automated unaided VA showed good agreement with manual VA (ICC = 0.77, 95% CI 0.62–0.85), while pinhole VA demonstrated moderate agreement (ICC = 0.63, 95% CI 0.25–0.83). Automated testing took longer (132.1 ± 47.5 s vs. 97.1 ± 47.8 s; p < 0.001), but user experience remained positive (mean Likert scale score 4.3 ± 0.8); Conclusions: The AI-based automated VA system delivered accurate, reliable, and user-friendly performance, supporting its feasibility for clinical implementation.
(This article belongs to the Section Biochemistry, Biophysics and Computational Biology)
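
Word error rate, the metric used to evaluate the fine-tuned Whisper model, is edit distance over words divided by reference length. A self-contained sketch:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / #reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, sub)
    return d[-1][-1] / len(ref)

print(word_error_rate("read letter e", "read letter c"))  # 0.33...
```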

22 pages, 1659 KB  
Article
Lightweight Depression Detection Using 3D Facial Landmark Pseudo-Images and CNN-LSTM on DAIC-WOZ and E-DAIC
by Achraf Jallaglag, My Abdelouahed Sabri, Ali Yahyaouy and Abdellah Aarab
BioMedInformatics 2026, 6(1), 8; https://doi.org/10.3390/biomedinformatics6010008 - 4 Feb 2026
Cited by 1 | Viewed by 1117
Abstract
Background: Depression is a common mental disorder, and early and objective diagnosis of depression is challenging. New advances in deep learning show promise for processing audio and video content when screening for depression. Nevertheless, the majority of current methods rely on raw video processing or multimodal pipelines, which are computationally costly, difficult to interpret, and raise privacy concerns, restricting their use in actual clinical settings. Methods: To overcome these constraints, we introduce a purely visual, lightweight deep learning framework based solely on spatiotemporal 3D facial landmarks extracted from clinical interview videos in the DAIC-WOZ and Extended DAIC-WOZ (E-DAIC) datasets. Our method uses no raw video and no semi-automated multimodal fusion: whereas raw video streaming is computationally expensive and poorly suited to investigating specific variables, we take a temporal series of 3D landmarks, convert it to pseudo-images (224 × 224 × 3), and process these within a CNN-LSTM framework, which captures both the spatial configuration and the temporal dynamics of facial behavior. Results: The experimental results indicate macro-average F1 scores of 0.74 on DAIC-WOZ and 0.762 on E-DAIC, demonstrating robust performance under heavy class imbalance, with a variability of ±0.03 across folds. Conclusion: These results indicate that landmark-based spatiotemporal modeling is a promising route to lightweight, interpretable, and scalable automatic depression detection, and suggest opportunities for embedding ADI systems within real-world MHA frameworks.
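
The pseudo-image step maps a (frames × landmarks × 3) tensor onto a fixed 224 × 224 × 3 image. One plausible encoding, min-max scaling each coordinate channel and resizing the time-landmark grid (the paper's exact mapping may differ):

```python
import numpy as np
import cv2

def landmarks_to_pseudo_image(seq):
    """seq: (T, N, 3) 3D landmarks over T frames -> uint8 224x224x3 image."""
    seq = np.asarray(seq, dtype=np.float32)
    lo = seq.min(axis=(0, 1), keepdims=True)
    hi = seq.max(axis=(0, 1), keepdims=True)
    img = 255.0 * (seq - lo) / np.maximum(hi - lo, 1e-6)
    return cv2.resize(img, (224, 224)).astype(np.uint8)

print(landmarks_to_pseudo_image(np.random.rand(90, 68, 3)).shape)  # (224, 224, 3)
```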

19 pages, 3470 KB  
Article
Driver Monitoring System Using Computer Vision for Real-Time Detection of Fatigue, Distraction and Emotion via Facial Landmarks and Deep Learning
by Tamia Zambrano, Luis Arias, Edgar Haro, Victor Santos and María Trujillo-Guerrero
Sensors 2026, 26(3), 889; https://doi.org/10.3390/s26030889 - 29 Jan 2026
Viewed by 1277
Abstract
Car accidents remain a leading cause of death worldwide, with drowsiness and distraction accounting for roughly 25% of fatal crashes in Ecuador. This study presents a real-time driver monitoring system that uses computer vision and deep learning to detect fatigue, distraction, and emotions from facial expressions. It combines a MobileNetV2-based CNN trained on RAF-DB for emotion recognition and MediaPipe’s 468 facial landmarks to compute the EAR (Eye Aspect Ratio), the MAR (Mouth Aspect Ratio), the gaze, and the head pose. Tests with 27 participants in both real and simulated driving environments showed strong results: 100% accuracy in detecting distraction, 85.19% for yawning, and 88.89% for eye closure. The system also effectively recognized happiness (100%) and anger/disgust (96.3%). However, it struggled with sadness and failed to detect fear, likely due to the subtlety of real-world expressions and limitations in the training dataset. Despite these challenges, the results highlight the importance of integrating emotional awareness into driver monitoring systems, which helps reduce false alarms and improve response accuracy. This work supports the development of lightweight, non-invasive technologies that enhance driving safety through intelligent behavior analysis.
(This article belongs to the Special Issue Sensor Fusion for the Safety of Automated Driving Systems)
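
Distraction detection here hinges on head pose derived from landmarks. As a coarse illustration (a heuristic stand-in, not the system's MediaPipe head-pose computation), yaw can be proxied by the asymmetry of nose-to-eye distances:

```python
import numpy as np

def coarse_yaw_ratio(left_eye, right_eye, nose_tip):
    """~1.0 when facing the camera; far from 1.0 when the head is turned."""
    dl = np.linalg.norm(np.subtract(nose_tip, left_eye))
    dr = np.linalg.norm(np.subtract(nose_tip, right_eye))
    return dl / dr

ratio = coarse_yaw_ratio((0.35, 0.40), (0.65, 0.40), (0.55, 0.55))
print(not (0.6 < ratio < 1.6))  # False: driver is facing forward
```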

18 pages, 4862 KB  
Article
Development of a Robot-Assisted TMS Localization System Using Dual Capacitive Sensors for Coil Tilt Detection
by Czaryn Diane Salazar Ompico, Julius Noel Banayo, Yamato Mashio, Masato Odagaki, Yutaka Kikuchi, Armyn Chang Sy and Hirofumi Kurosaki
Sensors 2026, 26(2), 693; https://doi.org/10.3390/s26020693 - 20 Jan 2026
Viewed by 636
Abstract
Transcranial Magnetic Stimulation (TMS) is a non-invasive technique for neurological research and therapy, but its effectiveness depends on accurate and stable coil placement. Manual localization based on anatomical landmarks is time-consuming and operator-dependent, while state-of-the-art robotic and neuronavigation systems achieve high accuracy using optical tracking with head-mounted markers and infrared cameras, at the cost of increased system complexity and setup burden. This study presents a cost-effective, markerless robotic-assisted TMS system that combines a 3D depth camera and textile capacitive sensors to assist coil localization and contact control. Facial landmarks detected by the depth camera are used to estimate the motor cortex (C3) location without external tracking markers, while a dual textile-sensor suspension provides compliant “soft-landing” behavior, contact confirmation, and coil-tilt estimation. Experimental evaluation with five participants showed reliable C3 targeting with valid motor evoked potentials (MEPs) obtained in most trials after initial calibration, and tilt-verification experiments revealed that peak MEP amplitudes occurred near balanced sensor readings in 12 of 15 trials (80%). The system employs a collaborative robot designed in accordance with international human–robot interaction safety standards, including force-limited actuation and monitored stopping. These results suggest that the proposed approach can improve the accessibility, safety, and consistency of TMS procedures while avoiding the complexity of conventional optical tracking systems.
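
Coil-tilt estimation from the dual textile sensors amounts to comparing the two capacitive readings; peak MEPs occurred near balanced readings. A minimal sketch of a normalized imbalance measure (illustrative; the paper's calibration is not specified in the abstract):

```python
def tilt_balance(sensor_a, sensor_b):
    """~0 when the coil sits flat on the scalp; sign gives tilt direction."""
    return (sensor_a - sensor_b) / max(sensor_a + sensor_b, 1e-9)

print(tilt_balance(512, 498))  # small positive value: slight tilt toward A
```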

33 pages, 5188 KB  
Article
Geometric Feature Enhancement for Robust Facial Landmark Detection in Makeup Paper Templates
by Cheng Chang, Yong-Yi Fanjiang and Chi-Huang Hung
Appl. Sci. 2026, 16(2), 977; https://doi.org/10.3390/app16020977 - 18 Jan 2026
Viewed by 788
Abstract
Traditional scoring of makeup face templates in beauty skill assessments heavily relies on manual judgment, leading to inconsistencies and subjective bias. Hand-drawn templates often exhibit proportion distortions, asymmetry, and occlusions that reduce the accuracy of conventional facial landmark detection algorithms. This study proposes a novel approach that integrates Geometric Feature Enhancement (GFE) with Dlib’s 68-landmark detection to improve the robustness and precision of landmark localization. A comprehensive comparison among Haar Cascade, MTCNN-MobileNetV2, and Dlib was conducted using a curated dataset of 11,600 hand-drawn facial templates. The proposed GFE-enhanced Dlib achieved 60.5% accuracy, outperforming MTCNN (23.4%) and Haar (20.3%) by approximately 37 percentage points, with precision and F1-score improvements exceeding 20% and 25%, respectively. The results demonstrate that the proposed method significantly enhances detection accuracy and scoring consistency, providing a reliable framework for automated beauty skill evaluation, and laying a solid foundation for future applications such as digital archiving and style-guided synthesis.
(This article belongs to the Special Issue Advances in Computer Vision and Digital Image Processing)
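
The baseline detector is Dlib's standard 68-landmark pipeline (the paper adds its GFE preprocessing before this step). A sketch of that pipeline using Dlib's published predictor model; the image filename is a placeholder:

```python
import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

gray = cv2.cvtColor(cv2.imread("template.png"), cv2.COLOR_BGR2GRAY)
for rect in detector(gray, 1):            # upsample once for small faces
    shape = predictor(gray, rect)
    points = [(shape.part(i).x, shape.part(i).y) for i in range(68)]
    print(len(points), "landmarks detected")
```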

15 pages, 4459 KB  
Article
Automated Custom Sunglasses Frame Design Using Artificial Intelligence and Computational Design
by Prodromos Minaoglou, Anastasios Tzotzis, Klodian Dhoska and Panagiotis Kyratsis
Machines 2026, 14(1), 109; https://doi.org/10.3390/machines14010109 - 17 Jan 2026
Viewed by 883
Abstract
Mass production in product design typically relies on standardized geometries and dimensions to accommodate a broad user population. However, when products are required to interface directly with the human body, such generalized design approaches often result in inadequate fit and reduced user comfort. This limitation highlights the necessity of fully personalized design methodologies based on individual anthropometric characteristics. This paper presents a novel application that automates the design of custom-fit sunglasses through the integration of Artificial Intelligence (AI) and Computational Design. The system is implemented using both textual (Python™ version 3.10.11) and visual (Grasshopper 3D™ version 1.0.0007) programming environments. The proposed workflow consists of the following four main stages: (a) acquisition of user facial images, (b) AI-based detection of facial landmarks, (c) three-dimensional reconstruction of facial features via an optimization process, and (d) generation of a personalized sunglass frame, exported as a three-dimensional model. The application demonstrates robust performance across a diverse set of test images, consistently generating geometries that conform closely to each user’s facial morphology. The accurate recognition of facial features enables the successful generation of customized sunglass frame designs. The system is further validated through the fabrication of a physical prototype using additive manufacturing, which confirms both the manufacturability and the fit of the final design. Overall, the results indicate that the combined use of AI-driven feature extraction and parametric Computational Design constitutes a powerful framework for the automated development of personalized wearable products.
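
Stage (d) parameterizes the frame from the reconstructed facial geometry. A toy sketch of deriving two sizing parameters from landmarks; the landmark indices and the pixel-to-millimeter conversion are hypothetical placeholders:

```python
import numpy as np

def frame_parameters(landmarks, px_per_mm):
    """landmarks: (K, 2) pixel coordinates; returns frame sizes in mm.
    Indices 0/1 = pupil centers, 2/3 = temple points (placeholders)."""
    lm = np.asarray(landmarks, dtype=float)
    return {
        "pupil_distance_mm": np.linalg.norm(lm[0] - lm[1]) / px_per_mm,
        "temple_width_mm": np.linalg.norm(lm[2] - lm[3]) / px_per_mm,
    }
```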
