Search Results (130)

Search Parameters:
Keywords = self-face recognition

17 pages, 2122 KiB  
Article
Improving Dynamic Gesture Recognition with Attention-Enhanced LSTM and Grounding SAM
by Jinlong Chen, Fuqiang Jin, Yingjie Jiao, Yongsong Zhan and Xingguo Qin
Electronics 2025, 14(9), 1793; https://doi.org/10.3390/electronics14091793 - 28 Apr 2025
Abstract
Dynamic gesture detection is a key topic in computer vision and deep learning, with applications in human–computer interaction and virtual reality. However, traditional methods struggle with long sequences, complex scenes, and multimodal data, facing issues such as high computational cost and background noise. This study proposes an Attention-Enhanced dual-layer LSTM (Long Short-Term Memory) network combined with Grounding SAM (Grounding Segment Anything Model) for gesture detection. The dual-layer LSTM captures long-term temporal dependencies, while a multi-head attention mechanism improves the extraction of global spatiotemporal features. Grounding SAM, composed of Grounding DINO for object localization and SAM (Segment Anything Model) for image segmentation, is employed during preprocessing to precisely extract gesture regions and remove background noise. This enhances feature quality and reduces interference during training. Experiments show that the proposed method achieves 96.3% accuracy on a self-constructed dataset and 96.1% on the SHREC 2017 dataset, outperforming several baseline methods by an average of 4.6 percentage points. It also demonstrates strong robustness under complex and dynamic conditions. This approach provides a reliable and efficient solution for future dynamic gesture-recognition systems. Full article
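The abstract describes a dual-layer LSTM whose per-frame features are re-weighted by multi-head attention. As a rough, dependency-free illustration (not the authors' code), a single-head scaled dot-product self-attention pass over a short feature sequence can be sketched as:

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention(seq):
    """Self-attention over a list of feature vectors (queries = keys = values).
    Returns one context vector per time step."""
    d = len(seq[0])
    out = []
    for q in seq:
        scores = [dot(q, k) / math.sqrt(d) for k in seq]  # scaled dot products
        weights = softmax(scores)                          # attention weights
        ctx = [sum(w * v[i] for w, v in zip(weights, seq)) for i in range(d)]
        out.append(ctx)
    return out

frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy per-frame LSTM features
ctx = attention(frames)
```

The multi-head variant in the paper would run several such passes with separate learned projections and concatenate the results.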

22 pages, 7640 KiB  
Article
MCL-SWT: Mirror Contrastive Learning with Sliding Window Transformer for Subject-Independent EEG Recognition
by Qi Mao, Hongke Zhu, Wenyao Yan, Yu Zhao, Xinhong Hei and Jing Luo
Brain Sci. 2025, 15(5), 460; https://doi.org/10.3390/brainsci15050460 - 27 Apr 2025
Abstract
Background: In brain–computer interfaces (BCIs), transformer-based models have found extensive application in motor imagery (MI)-based EEG signal recognition. However, for subject-independent EEG recognition, these models face challenges: low sensitivity to spatial dynamics of neural activity and difficulty balancing high temporal resolution features with manageable computational complexity. This work aims to address these critical issues. Methods: We introduce Mirror Contrastive Learning with Sliding Window Transformer (MCL-SWT). Inspired by the fact that left/right hand motor imagery induces event-related desynchronization (ERD) in the contralateral sensorimotor cortex, we develop a mirror contrastive loss function. It segregates the feature spaces of EEG signals from contralateral ERD locations while curtailing variability in signals sharing similar ERD locations. The Sliding Window Transformer computes self-attention scores over high temporal resolution features, enabling efficient capture of global temporal dependencies. Results: Evaluated on benchmark datasets for subject-independent MI EEG recognition, MCL-SWT achieves classification accuracies of 66.48% and 75.62%, outperforming state-of-the-art models by 2.82% and 2.17%, respectively. Ablation studies validate the efficacy of both the mirror contrastive loss and the sliding window mechanism. Conclusions: These findings underscore MCL-SWT’s potential as a robust, interpretable framework for subject-independent EEG recognition. By addressing existing challenges, MCL-SWT could significantly advance BCI technology development. Full article
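The mirror contrastive loss is only described in words. A toy margin-based version, with hand-picked 2-D vectors standing in for learned EEG embeddings, might look like the following (the margin form is an assumption, not the paper's exact formula):

```python
import math

def euclid(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mirror_contrastive_loss(feat, feat_same, feat_opposite, margin=1.0):
    """Toy margin-based contrastive term: pull features sharing the same
    (mirrored) ERD location together, push contralateral ones apart."""
    pull = euclid(feat, feat_same) ** 2
    push = max(0.0, margin - euclid(feat, feat_opposite)) ** 2
    return pull + push

f = [0.2, 0.8]         # feature of a left-hand MI trial
f_same = [0.25, 0.75]  # mirrored right-hand trial, same ERD location
f_opp = [0.8, 0.2]     # contralateral ERD location
loss = mirror_contrastive_loss(f, f_same, f_opp)
```

In the paper the "mirror" pair is produced by swapping left/right EEG channels; here the mirrored feature is simply given.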
(This article belongs to the Special Issue The Application of EEG in Neurorehabilitation)

16 pages, 1095 KiB  
Article
GCN-Former: A Method for Action Recognition Using Graph Convolutional Networks and Transformer
by Xueshen Cui, Jikai Zhang, Yihao He, Zhixing Wang and Wentao Zhao
Appl. Sci. 2025, 15(8), 4511; https://doi.org/10.3390/app15084511 - 19 Apr 2025
Abstract
Skeleton-based action recognition, which aims to classify human actions through the coordinates of body joints and their connectivity, is a significant research area in computer vision with broad application potential. Although Graph Convolutional Networks (GCNs) have made significant progress in processing skeleton data represented as graphs, their performance is constrained by local receptive fields and fixed joint connection patterns. Recently, researchers have introduced Transformer-based methods to overcome these limitations and better capture long-range dependencies. However, these methods face significant computational resource challenges when attempting to capture the correlations between all joints across all frames. This paper proposes an innovative spatio-temporal graph convolutional network, GCN-Former, which aims to enhance model performance in skeleton-based action recognition tasks. The model integrates the Transformer architecture with traditional GCNs, leveraging the Transformer’s powerful capability for handling long-sequence data and the effective capture of spatial dependencies by GCNs. Specifically, this study designs a Transformer Block temporal encoder based on the self-attention mechanism to model long temporal action sequences. The temporal encoder can effectively capture long-range dependencies in action sequences while retaining global contextual information in the temporal dimension. In addition, to achieve a smooth transition from GCNs to Transformers, we further develop a contextual temporal attention (CTA) module. These components are aimed at enhancing the understanding of temporal and spatial information within action sequences. Experimental validation on multiple benchmark datasets demonstrates that our approach not only surpasses existing techniques in prediction accuracy, but also has significant performance advantages in handling action recognition tasks involving long time sequences and can more effectively capture and understand long-range dependencies in complex action patterns. Full article
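The spatial half of such a model is an ordinary graph convolution over the skeleton. A minimal sketch of one layer, ReLU(Â · X · W), with a three-joint chain and made-up weights:

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def gcn_layer(A_hat, X, W):
    """One graph convolution over skeleton joints: ReLU(A_hat @ X @ W)."""
    H = matmul(matmul(A_hat, X), W)
    return [[max(0.0, v) for v in row] for row in H]

# 3 joints in a chain (e.g. shoulder-elbow-wrist), row-normalised adjacency
A_hat = [[0.5, 0.5, 0.0],
         [1/3, 1/3, 1/3],
         [0.0, 0.5, 0.5]]
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 2-D feature per joint
W = [[1.0, -1.0], [0.5, 0.5]]             # learnable weights (fixed here)
H = gcn_layer(A_hat, X, W)
```

In GCN-Former the output of such layers would then feed the Transformer temporal encoder; neither the weights nor the joint graph here are from the paper.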

30 pages, 71082 KiB  
Article
GTrXL-SAC-Based Path Planning and Obstacle-Aware Control Decision-Making for UAV Autonomous Control
by Jingyi Huang, Yujie Cui, Guipeng Xi, Shuangxia Bai, Bo Li, Geng Wang and Evgeny Neretin
Drones 2025, 9(4), 275; https://doi.org/10.3390/drones9040275 - 3 Apr 2025
Abstract
Research on UAV (unmanned aerial vehicle) path planning and obstacle avoidance control based on DRL (deep reinforcement learning) still faces limitations, as previous studies primarily utilized current perceptual inputs while neglecting the continuity of flight processes, resulting in low early-stage learning efficiency. To address these issues, this paper integrates DRL with the Transformer architecture to propose the GTrXL-SAC (gated Transformer-XL soft actor critic) algorithm. The algorithm performs positional embedding on multimodal data combining visual and sensor information. Leveraging the self-attention mechanism of GTrXL, it effectively focuses on different segments of multimodal data for encoding while capturing sequential relationships, significantly improving obstacle recognition accuracy and enhancing both learning efficiency and sample efficiency. Additionally, the algorithm capitalizes on GTrXL’s memory characteristics to generate current drone control decisions through the combined analysis of historical experiences and present states, effectively mitigating long-term dependency issues. Experimental results in the AirSim drone simulation environment demonstrate that compared to PPO and SAC algorithms, GTrXL-SAC achieves more precise policy exploration and optimization, enabling superior control of drone velocity and attitude for stabilized flight while accelerating convergence speed by nearly 20%. Full article
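GTrXL's distinguishing feature is a GRU-style gate in place of the plain residual connection. A toy element-wise version (the real gate has learned weight matrices; the scalar form below is a simplification) can be sketched as:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_residual(x, y, gate_bias=2.0):
    """GRU-style gating as used in GTrXL-like layers: blend the residual
    stream x with the sublayer output y through a gate. The bias initially
    favours x, giving near-identity behaviour early in training."""
    out = []
    for xi, yi in zip(x, y):
        g = sigmoid(xi + yi - gate_bias)  # stand-in for a learned gate
        out.append((1.0 - g) * xi + g * yi)
    return out

x = [0.5, -0.2, 1.0]  # residual stream
y = [1.5, 0.3, -0.5]  # attention sublayer output
z = gated_residual(x, y)
```

Each output element is a convex blend of the two streams, which is what stabilises early training in gated Transformer variants.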

23 pages, 13041 KiB  
Article
A Sheep Behavior Recognition Approach Based on Improved FESS-YOLOv8n Neural Network
by Xiuru Guo, Chunyue Ma, Chen Wang, Xiaochen Cui, Guangdi Xu, Ruimin Wang, Yuqi Liu, Bo Sun, Zhijun Wang and Xuchao Guo
Animals 2025, 15(6), 893; https://doi.org/10.3390/ani15060893 - 20 Mar 2025
Abstract
Sheep are an important livestock species in the northern regions of China, providing humans with nutritious meat and by-products. Therefore, it is essential to ensure the health status of sheep. Research has shown that the individual and group behaviors of sheep can reflect their overall health status. However, as the scale of farming expands, traditional behavior detection methods based on manual observation and those that employ contact-based devices face challenges, including poor real-time performance and unstable accuracy, making it difficult for them to meet current demands. To address these issues, this paper proposes a sheep behavior detection model, FESS-YOLOv8n, based on an enhanced YOLOv8n neural network. On the one hand, this approach achieves a lightweight model by introducing the FasterNet structure and the selective channel down-sampling module (SCDown). On the other hand, it utilizes the efficient multi-scale attention mechanism (EMA) as well as the spatial and channel synergistic attention module (SCSA) to improve recognition performance. The results on a self-built dataset show that FESS-YOLOv8n reduced the model size by 2.56 MB and increased the detection accuracy by 4.7%. It provides technical support for large-scale sheep behavior detection and lays a foundation for sheep health monitoring. Full article
(This article belongs to the Section Small Ruminants)

26 pages, 3153 KiB  
Article
The Role of Latin American Universities in Entrepreneurial Ecosystems: A Multi-Level Study of Academic Entrepreneurship in Ecuador
by Roberto Vallejo-Imbaquingo and Andrés Robalino-López
Adm. Sci. 2025, 15(3), 108; https://doi.org/10.3390/admsci15030108 - 18 Mar 2025
Abstract
Entrepreneurship plays a crucial role in driving innovation, productivity, and economic growth, with universities emerging as key actors within entrepreneurial ecosystems. This study seeks to expand the understanding of the role of Latin American universities in entrepreneurial ecosystems by examining the case of alumni from Escuela Politécnica Nacional (EPN). Employing a mixed-methods approach, this research explores individual, organizational, and institutional dynamics within the Ecuadorian entrepreneurial ecosystem. Results indicate that universities like EPN nurture professional and technical capabilities but face institutional obstacles that restrict their capacity to foster knowledge-based, high-growth ventures. This study highlights several institutional-level barriers that limit innovation, including market dominance, limited access to formal financing, corruption, and complex regulations. Thus, universities in the region play an important role in preparing potential entrepreneurs, yet their impact is ultimately restricted by contextual factors. To overcome these challenges, universities can strengthen their support by integrating entrepreneurship education, networking opportunities, early-stage venture experiences, and exposure to role models and success stories. Particularly in contexts like Ecuador, fostering self-efficacy, resilience, and opportunity recognition can boost entrepreneurial behavior. In addition, enhancing university–industry collaboration, encouraging business transparency, improving funding accessibility, and supporting knowledge-intensive businesses are essential steps to harness the full potential of universities in the entrepreneurial ecosystem. Full article

26 pages, 4668 KiB  
Article
Assessing Computational Thinking in Engineering and Computer Science Students: A Multi-Method Approach
by Farman Ali Pirzado, Awais Ahmed, Sadam Hussain, Gerardo Ibarra-Vázquez and Hugo Terashima-Marin
Educ. Sci. 2025, 15(3), 344; https://doi.org/10.3390/educsci15030344 - 11 Mar 2025
Abstract
The rapid integration of computational thinking (CT) into STEM education highlights its importance as a critical skill for problem-solving in the digital age, equipping students with the cognitive tools needed to address complex challenges systematically. This study evaluates CT skills among Engineering and Computer Science students using a multi-method approach, combining quantitative methods (CTT scores and CTS responses) with qualitative methods (thematic analysis of open-ended questions) to integrate objective assessments, self-perception scales, and qualitative insights. The Computational Thinking Test (CTT) measures proficiency in core CT sub-competencies (abstraction, decomposition, algorithmic thinking, and pattern recognition) through objective tests. The Computational Thinking Scale (CTS) captures students’ perceived CT skills, while open-ended questions elicit perspectives on the practical applications of CT in academic and professional contexts. Data from 196 students across two Mexican universities were analyzed through quantitative and thematic methods. The results show that students excel in pattern recognition and abstraction but face challenges in decomposition and algorithmic thinking. Cross-sectional analyses between the CTT, the CTS, and the open-ended responses compared CT skills across demographic groups, showing clear differences based on age, gender, and academic discipline, with Computer Science students performing better than Engineering students. These findings highlight the importance of CT in preparing students for modern challenges and provide a foundation for improving teaching methods and integrating these skills into university programs. Full article

23 pages, 1774 KiB  
Article
Adaptive Transformer-Based Deep Learning Framework for Continuous Sign Language Recognition and Translation
by Yahia Said, Sahbi Boubaker, Saleh M. Altowaijri, Ahmed A. Alsheikhy and Mohamed Atri
Mathematics 2025, 13(6), 909; https://doi.org/10.3390/math13060909 - 8 Mar 2025
Abstract
Sign language recognition and translation remain pivotal for facilitating communication among the deaf and hearing communities. However, end-to-end sign language translation (SLT) faces major challenges, including weak temporal correspondence between sign language (SL) video frames and gloss annotations and the complexity of sequence alignment between long SL videos and natural language sentences. In this paper, we propose an Adaptive Transformer (ADTR)-based deep learning framework that enhances SL video processing for robust and efficient SLT. The proposed model incorporates three novel modules: Adaptive Masking (AM), Local Clip Self-Attention (LCSA), and Adaptive Fusion (AF) to optimize feature representation. The AM module dynamically removes redundant video frame representations, improving temporal alignment, while the LCSA module learns hierarchical representations at both local clip and full-video levels using a refined self-attention mechanism. Additionally, the AF module fuses multi-scale temporal and spatial features to enhance model robustness. Unlike conventional SLT models, our framework eliminates the reliance on gloss annotations, enabling direct translation from SL video sequences to spoken language text. The proposed method was evaluated using the ArabSign dataset, demonstrating state-of-the-art performance in translation accuracy, processing efficiency, and real-time applicability. The achieved results confirm that ADTR is a highly effective and scalable deep learning solution for continuous sign language recognition, positioning it as a promising AI-driven approach for real-world assistive applications. Full article
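The Adaptive Masking module is said to drop redundant video frame representations. One plausible reading (an assumption, not the paper's algorithm) is to keep a frame only when it differs enough from the last kept frame, e.g. by cosine similarity:

```python
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def adaptive_mask(frames, threshold=0.98):
    """Keep a frame's feature vector only if it differs enough
    (cosine similarity below threshold) from the last kept frame."""
    kept = [frames[0]]
    for f in frames[1:]:
        if cosine(f, kept[-1]) < threshold:
            kept.append(f)
    return kept

# toy per-frame features: frames 2 and 4 are near-duplicates
frames = [[1.0, 0.0], [0.999, 0.001], [0.0, 1.0], [0.0, 0.9]]
kept = adaptive_mask(frames)
```

The learned AM module presumably makes this decision adaptively per video rather than with a fixed threshold.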
(This article belongs to the Special Issue Artificial Intelligence: Deep Learning and Computer Vision)

16 pages, 6568 KiB  
Article
Rapid Mental Stress Evaluation Based on Non-Invasive, Wearable Cortisol Detection with the Self-Assembly of Nanomagnetic Beads
by Junjie Li, Qian Chen, Weixia Li, Shuang Li, Cherie S. Tan, Shuai Ma, Shike Hou, Bin Fan and Zetao Chen
Biosensors 2025, 15(3), 140; https://doi.org/10.3390/bios15030140 - 23 Feb 2025
Abstract
The rapid and timely evaluation of the mental health of emergency rescuers can effectively improve the quality of emergency rescues. However, biosensors for mental health evaluation still face challenges, such as the rapid and portable detection of multiple mental biomarkers. In this study, a non-invasive, flexible, wearable electrochemical biosensor was constructed based on the self-assembly of nanomagnetic beads for the rapid detection of cortisol in interstitial fluid (ISF) to assess the mental stress of emergency rescuers. Using a one-step reduction, gold nanoparticles (AuNPs) were deposited on a screen-printed electrode to improve its electrochemical detection performance. Afterwards, nanocomposites of MXene and multi-wall carbon nanotubes were coated onto the AuNPs layer through physical deposition to enhance the electron transfer rate. Carboxylated nanomagnetic beads immobilized with a cortisol antibody served as sensing elements for the specific recognition of the mental stress marker, cortisol. Because magnets rapidly attract the nanomagnetic beads, the sensing element can be quickly and uniformly replaced on the electrode, which greatly improves detection efficiency. The detected linear response to cortisol was 0–32 ng/mL. With the integrated reverse iontophoresis technique on a flexible printed circuit board, the ISF can be extracted non-invasively for wearable cortisol detection. The stimulating current for extraction was kept under 1 mA, within the safe and acceptable range for human bodies. Therefore, based on the positive correlation between cortisol concentration and mental stress, the mental stress of emergency rescuers can be evaluated, providing feedback on the psychological status of rescuers and effectively improving rescuer safety and rescue efficiency. Full article
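The abstract reports a linear cortisol response over 0–32 ng/mL. Purely as an illustration of how such a linear calibration is applied in software (the blank current and slope below are hypothetical, not from the paper):

```python
def calibrate(current_uA, blank_uA, slope_uA_per_ng_ml):
    """Toy linear calibration: convert a measured sensor current into a
    cortisol concentration, clamped to the reported 0-32 ng/mL range."""
    conc = (current_uA - blank_uA) / slope_uA_per_ng_ml
    return min(max(conc, 0.0), 32.0)

# hypothetical calibration constants for illustration only
conc = calibrate(current_uA=5.0, blank_uA=1.0, slope_uA_per_ng_ml=0.25)
```

A real device would fit the blank current and slope from standard solutions of known concentration.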

17 pages, 6161 KiB  
Article
Efficient Triple Attention and AttentionMix: A Novel Network for Fine-Grained Crop Disease Classification
by Yanqi Zhang, Ning Zhang, Jingbo Zhu, Tan Sun, Xiujuan Chai and Wei Dong
Agriculture 2025, 15(3), 313; https://doi.org/10.3390/agriculture15030313 - 31 Jan 2025
Abstract
In the face of global climate change, crop pests and diseases have emerged on a large scale, with diverse species lasting for long periods and exerting wide-ranging impacts. Identifying crop pests and diseases efficiently and accurately is crucial in enhancing crop yields. Nonetheless, the complexity and variety of scenarios render this a challenging task. In this paper, we propose a fine-grained crop disease classification network integrating the efficient triple attention (ETA) module and the AttentionMix data enhancement strategy. The ETA module captures channel attention and spatial attention information more effectively, which enhances the representational capacity of deep CNNs. Additionally, AttentionMix effectively addresses the label misassignment issue in CutMix, a commonly used method for obtaining high-quality data samples. The ETA module and AttentionMix can work together on deep CNNs for greater performance gains. We conducted experiments on our self-constructed crop disease dataset and on the widely used IP102 plant pest and disease classification dataset. The results showed that the network combining the ETA module and AttentionMix reached an accuracy of 98.2% on our crop disease dataset. On the IP102 dataset, it achieved an accuracy of 78.7% and a recall of 70.2%. In comparison with advanced attention models such as ECANet and Triplet Attention, our proposed model exhibited an average performance improvement of 5.3% and 4.4%, respectively. This implies that the proposed method is both practical and applicable for classifying diseases in the majority of crop types. Building on the network's classification results, we developed an install-free WeChat mini program that performs real-time automated crop disease recognition from smartphone photos. This study can provide an accurate and timely diagnosis of crop pests and diseases, thereby providing a solution reference for smart agriculture. Full article
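The label misassignment that AttentionMix targets comes from the plain CutMix rule, which weights labels by pasted area regardless of what the box contains. A minimal sketch of that baseline rule (on tiny list-of-lists "images"; AttentionMix would weight by attention inside the box instead):

```python
import random

def cutmix(img_a, img_b, label_a, label_b, rng=random.Random(0)):
    """CutMix baseline: paste a random box from img_b into img_a and
    weight the labels by pasted area. The label misassignment arises
    because the box may contain no discriminative content at all."""
    h, w = len(img_a), len(img_a[0])
    bh, bw = rng.randint(1, h), rng.randint(1, w)
    top, left = rng.randint(0, h - bh), rng.randint(0, w - bw)
    mixed = [row[:] for row in img_a]
    for i in range(top, top + bh):
        for j in range(left, left + bw):
            mixed[i][j] = img_b[i][j]
    lam = 1.0 - (bh * bw) / (h * w)  # share of img_a that survives
    mixed_label = {label_a: lam, label_b: 1.0 - lam}
    return mixed, mixed_label

a = [[0, 0], [0, 0]]
b = [[1, 1], [1, 1]]
mixed, label = cutmix(a, b, "healthy", "rust")
```

The class names and pixel values are invented for illustration.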
(This article belongs to the Section Digital Agriculture)

12 pages, 945 KiB  
Article
Behavioral Inhibition and Fear: The Moderating Role of Emotional Competence and Gender in Preadolescents
by Andrea Baroncelli, Stefania Righi, Carolina Facci and Enrica Ciucci
Children 2025, 12(2), 179; https://doi.org/10.3390/children12020179 - 31 Jan 2025
Abstract
Background: The behavioral inhibition system (BIS) is a key motivational system that shapes human emotions and behaviors; specifically, the BIS regulates avoidance behaviors, and it is linked to negative emotions such as fear and anxiety. Previous studies have demonstrated a link between high BIS scores and attentional bias to threat in children, but the literature is inconsistent. This may be due to differences in individual awareness of emotions or in the accuracy of effectively detecting emotions. Moreover, the past literature has also found gender differences in BIS scores, which may suggest differential processes in boys and girls. Methods: The present study investigates whether BIS scores were associated with an attentional facilitation index of fear in a sample of preadolescents (n = 264; 52.27% girls; M age = 12.98 years; SD = 0.89 years), considering the potential moderating role of (a) the awareness of others’ emotions as assessed by a self-report questionnaire, (b) emotion perception accuracy of fear as assessed by a laboratory task of emotion recognition, and (c) gender. Results: Our results showed that, only in males, higher BIS scores were associated with a lower attentional facilitation index of fear under conditions of low emotional competence (i.e., low levels of self-reported awareness of others’ emotions or low accuracy in recognizing fearful faces). Conclusions: Results are discussed in light of both theories of emotional development and practical clinical implications, with special attention to the observed gender difference. Full article

17 pages, 2819 KiB  
Article
DGA-Based Fault Diagnosis Using Self-Organizing Neural Networks with Incremental Learning
by Siqi Liu, Zhiyuan Xie and Zhengwei Hu
Electronics 2025, 14(3), 424; https://doi.org/10.3390/electronics14030424 - 22 Jan 2025
Abstract
Power transformers are vital components of electrical power systems, ensuring reliable and efficient energy transfer between high-voltage transmission and low-voltage distribution networks. However, they are prone to various faults, such as insulation breakdowns, winding deformations, partial discharges, and short circuits, which can disrupt electrical service, incur significant economic losses, and pose safety risks. Traditional fault diagnosis methods, including visual inspection, dissolved gas analysis (DGA), and thermal imaging, face challenges such as subjectivity, intermittent data collection, and reliance on expert interpretation. To address these limitations, this paper proposes a novel distributed approach for multi-fault diagnosis of power transformers based on a self-organizing neural network combined with data augmentation and incremental learning techniques. The proposed framework addresses critical challenges, including data quality issues, computational complexity, and the need for real-time adaptability. Data cleaning and preprocessing techniques improve the reliability of input data, while data augmentation generates synthetic samples to mitigate data imbalance and enhance the recognition of rare fault patterns. A two-stage classification model integrates unsupervised and supervised learning, with k-means clustering applied in the first stage for initial fault categorization, followed by a self-organizing neural network in the second stage for refined fault diagnosis. The self-organizing neural network dynamically suppresses inactive nodes and optimizes its training parameter set, reducing computational complexity without sacrificing accuracy. Additionally, incremental learning enables the model to continuously adapt to new fault scenarios without modifying its architecture, ensuring real-time performance and adaptability across diverse operational conditions. Experimental validation demonstrates the effectiveness of the proposed method in achieving accurate, efficient, and adaptive fault diagnosis for power transformers, outperforming traditional and conventional machine learning approaches. This work provides a robust framework for integrating advanced machine learning techniques into power system monitoring, paving the way for automated, real-time, and reliable transformer fault diagnosis systems. Full article
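Stage one of the two-stage scheme is k-means categorization of DGA samples. Its inference step is just nearest-centroid assignment, sketched below with invented centroids in a 2-D gas-ratio space (the real model works on full dissolved-gas feature vectors):

```python
def assign(point, centroids):
    """Stage 1 of the two-stage scheme: nearest-centroid (k-means)
    assignment of a DGA sample to a coarse fault family."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: sqdist(point, centroids[k]))

# toy centroids in a 2-D gas-ratio space (hypothetical values)
centroids = [[0.1, 0.1],   # 0: normal
             [0.8, 0.2],   # 1: thermal fault
             [0.2, 0.9]]   # 2: discharge fault
family = assign([0.75, 0.25], centroids)
```

The stage-two self-organizing network would then refine this coarse family into a specific fault type.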
(This article belongs to the Special Issue New Advances in Distributed Computing and Its Applications)

24 pages, 5261 KiB  
Article
Extended Study of a Multi-Modal Loop Closure Detection Framework for SLAM Applications
by Mohammed Chghaf, Sergio Rodríguez Flórez and Abdelhafid El Ouardi
Electronics 2025, 14(3), 421; https://doi.org/10.3390/electronics14030421 - 21 Jan 2025
Abstract
Loop Closure (LC) is a crucial task in Simultaneous Localization and Mapping (SLAM) for Autonomous Ground Vehicles (AGV). It is an active research area because it improves global localization efficiency. The consistency of the global map and the accuracy of the AGV’s location in an unknown environment are highly correlated with the efficiency and robustness of Loop Closure Detection (LCD), especially when facing environmental changes or data unavailability. We propose to introduce multimodal complementary data to increase the algorithms’ resilience. Various methods using different data sources have been proposed to achieve precise place recognition. However, integrating a multimodal loop-closure fusion process that combines multiple information sources within a SLAM system has been explored less. Additionally, existing multimodal place recognition techniques are often difficult to integrate into existing frameworks. In this paper, we propose a fusion scheme of multiple place recognition methods based on camera and LiDAR data for a robust multimodal LCD. The presented approach uses Similarity-Guided Particle Filtering (SGPF) to identify and verify candidates for loop closure. Based on the ORB-SLAM2 framework, the proposed method uses two perception sensors (camera and LiDAR) under two data representation models for each. Our experiments on both KITTI and a self-collected dataset show that our approach outperforms the state-of-the-art methods in terms of place recognition metrics or localization accuracy metrics. The proposed Multi-Modal Loop Closure (MMLC) framework enhances the robustness and accuracy of AGV’s localization by fusing multiple sensor modalities, ensuring consistent performance across diverse environments. Its real-time operation and early loop closure detection enable timely trajectory corrections, reducing navigation errors and supporting cost-effective deployment with adaptable sensor configurations. Full article
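The paper's Similarity-Guided Particle Filtering is not detailed in the abstract. As a much simpler stand-in, a late fusion of per-candidate similarity scores from the two sensors conveys the multimodal weighting idea (candidate names and weights below are invented):

```python
def fuse_scores(cam_sim, lidar_sim, w_cam=0.5):
    """Toy late fusion of per-candidate place-recognition scores from two
    modalities; missing candidates in one modality score 0 there. SGPF in
    the paper propagates particles instead, but the weighting is similar."""
    fused = {}
    for cand in set(cam_sim) | set(lidar_sim):
        fused[cand] = (w_cam * cam_sim.get(cand, 0.0)
                       + (1.0 - w_cam) * lidar_sim.get(cand, 0.0))
    best = max(fused, key=fused.get)  # loop-closure candidate to verify
    return best, fused

cam = {"frame_102": 0.9, "frame_417": 0.4}
lidar = {"frame_102": 0.7, "frame_233": 0.8}
best, fused = fuse_scores(cam, lidar)
```

A candidate supported by both sensors outscores one supported by a single modality, which is the robustness argument the abstract makes.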
(This article belongs to the Special Issue Image Analysis Using LiDAR Data)
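The abstract does not detail the SGPF algorithm, but the core idea — maintaining particles over loop-closure candidate frames and weighting them by fused camera/LiDAR similarity — can be sketched in a few lines of NumPy. Everything below is an assumption for illustration: the function names, the linear fusion weight `alpha`, the diffusion `noise`, and the resampling rule are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_similarities(cam_sim, lidar_sim, alpha=0.5):
    """Blend per-candidate similarity scores from the two modalities (assumed linear fusion)."""
    return alpha * cam_sim + (1.0 - alpha) * lidar_sim

def sgpf_step(particles, weights, cam_sim, lidar_sim, noise=2.0):
    """One similarity-guided particle filter update over candidate frame indices."""
    # Diffuse particles slightly so they can explore neighboring candidates.
    particles = np.clip(particles + rng.normal(0.0, noise, size=particles.shape),
                        0, len(cam_sim) - 1)
    # Re-weight each particle by the fused multimodal similarity of its frame.
    fused = fuse_similarities(cam_sim, lidar_sim)
    weights = weights * fused[particles.round().astype(int)]
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses below half the particle count.
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        keep = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[keep]
        weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights
```

Iterating `sgpf_step` concentrates the particle set on frames where camera and LiDAR similarity agree, which matches the intuition of using one filter to both identify and verify loop-closure candidates.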
27 pages, 4439 KiB  
Article
Personal Identification Using Embedded Raspberry Pi-Based Face Recognition Systems
by Sebastian Pecolt, Andrzej Błażejewski, Tomasz Królikowski, Igor Maciejewski, Kacper Gierula and Sebastian Glowinski
Appl. Sci. 2025, 15(2), 887; https://doi.org/10.3390/app15020887 - 17 Jan 2025
Viewed by 1734
Abstract
Facial recognition technology has significantly advanced in recent years, with promising applications in fields ranging from security to consumer electronics. Its importance extends beyond convenience, offering enhanced security measures for sensitive areas and seamless user experiences in everyday devices. This study focuses on the development and validation of a facial recognition system utilizing a Haar cascade classifier and the AdaBoost machine learning algorithm. The system leverages characteristic facial features—distinct, measurable attributes used to identify and differentiate faces within images. A biometric facial recognition system was implemented on a Raspberry Pi microcomputer, capable of detecting and identifying faces using a self-contained reference image database. Verification involved selecting the similarity threshold, a critical factor influencing the balance between accuracy, security, and user experience in biometric systems. Testing under various environmental conditions, facial expressions, and user demographics confirmed the system’s accuracy and efficiency, achieving an average recognition time of 10.5 s under different lighting conditions, such as daylight, artificial light, and low-light scenarios. It is shown that the system’s accuracy and scalability can be enhanced through testing with larger databases, hardware upgrades like higher-resolution cameras, and advanced deep learning algorithms to address challenges such as extreme facial angles. Threshold optimization tests with six male participants revealed a value that effectively balances accuracy and efficiency. While the system performed effectively under controlled conditions, challenges such as biometric similarities and vulnerabilities to spoofing with printed photos underscore the need for additional security measures, such as thermal imaging. Potential applications include access control, surveillance, and statistical data collection, highlighting the system’s versatility and relevance. Full article
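The similarity-threshold trade-off described above can be made concrete: raising the threshold rejects more genuine users (higher false rejection rate, FRR), lowering it accepts more impostors (higher false acceptance rate, FAR). The sketch below is a generic illustration of threshold selection, not the paper's procedure; the equal FRR + FAR weighting and the function names are assumptions.

```python
import numpy as np

def recognition_rates(genuine, impostor, threshold):
    """Error rates at a similarity threshold (higher score = more similar).
    FRR: genuine comparisons wrongly rejected; FAR: impostors wrongly accepted."""
    genuine, impostor = np.asarray(genuine), np.asarray(impostor)
    frr = float(np.mean(genuine < threshold))
    far = float(np.mean(impostor >= threshold))
    return frr, far

def pick_threshold(genuine, impostor, candidates):
    """Choose the candidate threshold minimizing FRR + FAR (equal weighting assumed)."""
    return min(candidates, key=lambda t: sum(recognition_rates(genuine, impostor, t)))
```

In practice the weighting would depend on the deployment: an access-control system might penalize FAR more heavily than FRR, shifting the chosen threshold upward.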
30 pages, 6387 KiB  
Article
Transformer-Based Re-Ranking Model for Enhancing Contextual and Syntactic Translation in Low-Resource Neural Machine Translation
by Arifa Javed, Hongying Zan, Orken Mamyrbayev, Muhammad Abdullah, Kanwal Ahmed, Dina Oralbekova, Kassymova Dinara and Ainur Akhmediyarova
Electronics 2025, 14(2), 243; https://doi.org/10.3390/electronics14020243 - 8 Jan 2025
Viewed by 1755
Abstract
Neural machine translation (NMT) plays a vital role in modern communication by bridging language barriers and enabling effective information exchange across diverse linguistic communities. Due to the limited availability of data in low-resource languages, NMT faces significant translation challenges. Data sparsity limits NMT models’ ability to learn, generalize, and produce accurate translations, which leads to low coherence and poor context awareness. This paper proposes a transformer-based approach incorporating an encoder–decoder structure, bilingual curriculum learning, and contrastive re-ranking mechanisms. Our approach enriches the training dataset using back-translation and enhances the model’s contextual learning through BERT embeddings. An incomplete-trust (in-trust) loss function is introduced to replace the traditional cross-entropy loss during training. The proposed model effectively handles out-of-vocabulary words and integrates named entity recognition techniques to maintain semantic accuracy. Additionally, the self-attention layers in the transformer architecture enhance the model’s syntactic analysis capabilities, which enables better context awareness and more accurate translations. Extensive experiments are performed on a diverse Chinese–Urdu parallel corpus, developed using human effort and publicly available datasets such as OPUS, WMT, and WiLi. The proposed model demonstrates a BLEU score improvement of 1.80% for Zh→Ur and 2.22% for Ur→Zh compared to the highest-performing comparative model. This significant enhancement indicates better translation quality and accuracy. Full article
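The incomplete-trust (in-trust) loss mentioned in the abstract is designed for noisy labels: alongside standard cross-entropy, it adds a term in which the model partially "trusts" its own prediction p over the (possibly noisy) one-hot target q. The NumPy sketch below follows one published formulation of this loss; the exact definition and the hyperparameters alpha, beta, and delta used in this paper may differ, so treat it as an illustration only.

```python
import numpy as np

def in_trust_loss(logits, labels, alpha=1.0, beta=1.0, delta=0.5):
    """Sketch of an incomplete-trust loss: cross-entropy plus a term that mixes
    the model's prediction p into the target, softening noisy labels."""
    logits = np.asarray(logits, dtype=float)
    z = logits - logits.max(axis=-1, keepdims=True)          # numerically stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    q = np.eye(logits.shape[-1])[np.asarray(labels)]         # one-hot targets
    eps = 1e-12
    ce = -(q * np.log(p + eps)).sum(axis=-1)                 # standard cross-entropy
    # "Trust" term: target is a delta-weighted mixture of prediction and label.
    dce = -(p * np.log(delta * p + (1.0 - delta) * q + eps)).sum(axis=-1)
    return float((alpha * ce + beta * dce).mean())
```

When the model's prediction agrees with the label, both terms are near zero; when a label is likely noise, the mixture term dampens the penalty relative to pure cross-entropy.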