Search Results (53)

Search Parameters:
Keywords = virtual–real fusion interaction

21 pages, 20486 KB  
Article
Semantic–Physical Sensor Fusion for Safe Physical Human–Robot Interaction in Dual-Arm Rehabilitation
by Disha Zhu, Xuefeng Wang and Shaomei Shang
Sensors 2026, 26(5), 1510; https://doi.org/10.3390/s26051510 - 27 Feb 2026
Viewed by 687
Abstract
A safe physical human–robot interaction (pHRI) in rehabilitation requires reliable perception and low-latency decision making under heterogeneous and unreliable sensor inputs. This paper presents a multimodal sensor-fusion-based safety framework that integrates physical state estimation, semantic information fusion, and an edge-deployed large language model (LLM) for real-time pHRI safety control. A dynamics-based virtual sensing method is introduced to estimate internal joint torques from external force–torque measurements, achieving a normalized mean absolute error of 18.5% in real-world experiments. An asynchronous semantic state pool with a time-to-live mechanism is designed to fuse visual, force, posture, and human semantic cues while maintaining robustness to sensor delays and dropouts. Based on structured multimodal tokens, an instruction-tuned edge LLM outputs discrete safety decisions that are further mapped to continuous compliant control parameters. The framework is trained using a hybrid dataset consisting of limited real-world samples and LLM-augmented synthetic data, and evaluated on unseen real and mixed-condition scenarios. Experimental results show reliable detection of safety-critical events with a low emergency misdetection rate, while maintaining an end-to-end decision latency of approximately 223 ms on edge hardware. Real-world experiments on a rehabilitation robot demonstrate effective responses to impacts, user instability, and visual occlusions, indicating the practical applicability of the proposed approach for real-time pHRI safety monitoring. Full article
(This article belongs to the Section Biomedical Sensors)
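The asynchronous semantic state pool with a time-to-live mechanism mentioned in the abstract above can be pictured with a small sketch. The class name, method names, and TTL values below are illustrative assumptions, not the authors' implementation; the abstract only states that stale or missing sensor readings must not break the fusion.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class Entry:
    value: Any
    stamp: float  # time at which the reading was written
    ttl: float    # seconds for which the reading stays valid


class SemanticStatePool:
    """Hypothetical latest-value store in which each reading expires after its TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entries: Dict[str, Entry] = {}

    def update(self, key: str, value: Any, ttl: float) -> None:
        # Each sensor writes whenever it has data; no synchronization is assumed.
        self._entries[key] = Entry(value, self._clock(), ttl)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None or self._clock() - entry.stamp > entry.ttl:
            return None  # missing or stale: treat the modality as unavailable
        return entry.value

    def snapshot(self) -> Dict[str, Any]:
        # Only readings that are still fresh reach the decision stage.
        fresh = {}
        for key in self._entries:
            value = self.get(key)
            if value is not None:
                fresh[key] = value
        return fresh
```

Injecting the clock keeps the sketch testable; a real system would probably also need locking if sensors write from separate threads.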

22 pages, 1145 KB  
Article
TSMTFN: Two-Stream Temporal Shift Module Network for Efficient Egocentric Gesture Recognition in Virtual Reality
by Muhammad Abrar Hussain, Chanjun Chun and SeongKi Kim
Virtual Worlds 2025, 4(4), 58; https://doi.org/10.3390/virtualworlds4040058 - 4 Dec 2025
Cited by 1 | Viewed by 1082
Abstract
Egocentric hand gesture recognition is vital for natural human–computer interaction in augmented and virtual reality (AR/VR) systems. However, most deep learning models struggle to balance accuracy and efficiency, limiting real-time use on wearable devices. This paper introduces a Two-Stream Temporal Shift Module Transformer Fusion Network (TSMTFN) that achieves high recognition accuracy with low computational cost. The model integrates Temporal Shift Modules (TSMs) for efficient motion modeling and a Transformer-based fusion mechanism for long-range temporal understanding, operating on dual RGB-D streams to capture complementary visual and depth cues. Training stability and generalization are enhanced through full-layer training from epoch 1 and MixUp/CutMix augmentations. Evaluated on the EgoGesture dataset, TSMTFN attained 96.18% top-1 accuracy and 99.61% top-5 accuracy on the independent test set with only 16 GFLOPs and 21.3M parameters, offering a 2.4–4.7× reduction in computation compared to recent state-of-the-art methods. The model runs at 15.10 samples/s, achieving real-time performance. The results demonstrate robust recognition across over 95% of gesture classes and minimal inter-class confusion, establishing TSMTFN as an efficient, accurate, and deployable solution for next-generation wearable AR/VR gesture interfaces. Full article
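As a rough illustration of the temporal shift operation that Temporal Shift Modules are built on, the sketch below moves a fraction of the channels one step along the time axis and zero-fills the vacated slots. The function name, the list-of-lists layout, and the default `fold_div=4` are assumptions made for this example; the abstract does not state the shift ratio TSMTFN uses.

```python
from typing import List


def temporal_shift(frames: List[List[float]], fold_div: int = 4) -> List[List[float]]:
    """Shift a fraction of channels along the time axis.

    frames[t][c] is channel c of frame t. The first C // fold_div channels
    take their value from the next frame, the following C // fold_div from
    the previous frame, and the remaining channels stay in place.
    Slots with no source frame are filled with zero.
    """
    steps = len(frames)
    channels = len(frames[0])
    fold = channels // fold_div
    out = [[0.0] * channels for _ in range(steps)]
    for t in range(steps):
        for c in range(channels):
            if c < fold:
                src = t + 1   # read from the future frame
            elif c < 2 * fold:
                src = t - 1   # read from the past frame
            else:
                src = t       # unshifted channel
            if 0 <= src < steps:
                out[t][c] = frames[src][c]
    return out
```

The shift itself adds no multiplications, which is why this kind of module is attractive when the compute budget is tight.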

23 pages, 614 KB  
Article
MSF-Net: A Data-Driven Multimodal Transformer for Intelligent Behavior Recognition and Financial Risk Reasoning in Virtual Live-Streaming
by Yang Song, Liman Zhang, Ruoyun Zhang, Haoyuan Zhan, Mingyuan Dai, Xinyi Hu, Ranran Chen and Manzhou Li
Electronics 2025, 14(23), 4769; https://doi.org/10.3390/electronics14234769 - 4 Dec 2025
Cited by 1 | Viewed by 1092
Abstract
With the rapid advancement of virtual human technology and live-streaming e-commerce, virtual anchors have increasingly become key interactive entities in the digital economy. However, emerging issues such as fake reviews, abnormal tipping, and illegal transactions pose significant threats to platform financial security and user privacy. To address these challenges, a multimodal emotion–finance fusion security recognition framework (MSF-Net) is proposed, which integrates visual, audio, textual, and financial transaction signals to achieve cross-modal feature alignment and multi-signal risk modeling. The framework consists of three core modules: the multimodal alignment transformer (MAT), the fake review detection (FRD) module, and the multi-signal fusion decision module (MSFDM), enabling deep integration of semantic consistency modeling and emotion–behavior collaborative recognition. Experimental results demonstrate that MSF-Net achieves superior performance in virtual live-streaming financial security detection, reaching a precision of 0.932, a recall of 0.924, an F1-score of 0.928, an accuracy of 0.931, and an area under curve (AUC) of 0.956, while maintaining a real-time inference speed of 60.7 FPS, indicating outstanding precision and responsiveness. The ablation experiments further verify the necessity of each module, as the removal of any component leads to an F1-score decrease exceeding 4%, confirming the structural validity of the model’s hierarchical fusion design. In addition, a lightweight version of MSF-Net was developed through parameter distillation and quantization pruning techniques, achieving real-time deployment on mobile devices with an average latency of only 19.4 milliseconds while maintaining an F1-score of 0.923 and an AUC of 0.947. 
The results indicate that MSF-Net exhibits both innovation and practicality in multimodal deep fusion and security risk recognition, offering a scalable solution for intelligent risk control in data-driven artificial intelligence applications across financial and virtual interaction domains. Full article
(This article belongs to the Special Issue Advances in Data-Driven Artificial Intelligence)
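The precision, recall, and F1-score reported in the abstract above can be cross-checked with the standard harmonic-mean definition of F1. The helper name is an assumption for this sketch; the input figures are the ones quoted in the abstract.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract for the full MSF-Net model.
reported_precision = 0.932
reported_recall = 0.924
computed_f1 = round(f1_score(reported_precision, reported_recall), 3)
```

With these inputs the computed value rounds to 0.928, which agrees with the F1-score stated in the abstract.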

26 pages, 2310 KB  
Systematic Review
A Systematic Review of Intelligent Navigation in Smart Warehouses Using Prisma: Integrating AI, SLAM, and Sensor Fusion for Mobile Robots
by Domagoj Zimmer, Mladen Jurišić, Ivan Plaščak, Željko Barač, Hrvoje Glavaš, Dorijan Radočaj and Robert Benković
Eng 2025, 6(12), 339; https://doi.org/10.3390/eng6120339 - 1 Dec 2025
Cited by 1 | Viewed by 2692
Abstract
This systematic review focuses on intelligent navigation as a core enabler of autonomy in smart warehouses, where mobile robots must dynamically perceive, reason, and act in complex, human-shared environments. By synthesizing advancements in AI-driven decision-making, SLAM, and multi-sensor fusion, the study highlights how intelligent navigation architectures reduce operational uncertainty and enhance task efficiency in logistics automation. Smart warehouses, powered by mobile robots and AGVs and integrated with AI and algorithms, are enabling more efficient storage with less human labour. This systematic review followed PRISMA 2020 guidelines to systematically identify, screen, and synthesize evidence from 106 peer-reviewed scientific articles (including primary studies, technical papers, and reviews) published between 2020 and 2025, sourced from Web of Science. Thematic synthesis was conducted across 8 domains: AI, SLAM, sensor fusion, safety, network, path planning, implementation, and design. The transition to smart warehouses requires modern technologies to automate tasks and optimize resources. This article examines how intelligent systems can be integrated with mathematical models to improve navigation accuracy, reduce costs and prioritize human safety. Real-time data management with precise information for AMRs and AGVs is crucial for low-risk operation. This article studies AI, the IoT, LiDAR, machine learning (ML), SLAM and other new technologies for the successful implementation of mobile robots in smart warehouses. Modern technologies such as reinforcement learning optimize the routes and tasks of mobile robots. Data and sensor fusion methods integrate information from various sources to provide a more precise understanding of the indoor environment and inventory. Semantic mapping enables mobile robots to navigate and interact with complex warehouse environments with high accuracy in real time. 
The article also analyses how virtual reality (VR) can improve the spatial orientation of mobile robots by developing sophisticated navigation solutions that reduce time and financial costs. Full article
(This article belongs to the Special Issue Interdisciplinary Insights in Engineering Research)

28 pages, 4565 KB  
Article
Improving VR Welding Simulator Tracking Accuracy Through IMU-SLAM Fusion
by Kwang-Seong Shin, Jong Chan Kim, Kyung Won Cho and Won Ik Cho
Electronics 2025, 14(23), 4693; https://doi.org/10.3390/electronics14234693 - 28 Nov 2025
Cited by 1 | Viewed by 1629
Abstract
Virtual reality (VR) welding simulators provide safe and cost-effective training environments, but precise torch tracking remains a key challenge. Current commercial systems are limited in accurate bead simulation and posture feedback due to tracking errors of 3–10 mm, while external motion capture systems offer high precision but suffer from high cost and installation complexity issues. Therefore, a new approach is needed that achieves high precision while maintaining cost efficiency. This paper proposes an IMU-SLAM fusion-based tracking algorithm. The method combines Inertial Measurement Unit (IMU) data with visual–inertial SLAM (Simultaneous Localization and Mapping) for sensor fusion and applies a drift correction technique utilizing the periodic weaving patterns of the welding torch. This achieves precision below 5 mm without requiring external equipment. Experimental results demonstrate an average 3.8 mm RMSE (Root Mean Square Error) across 15 datasets spanning three welding scenarios, showing a 1.8× accuracy improvement over commercial baselines. Results were validated against OptiTrack ground truth data. Latency was maintained below 100 ms to meet real-time haptic feedback requirements, ensuring responsive interaction during training sessions. The proposed approach is a software solution using only standard VR hardware, eliminating the need for expensive external tracking equipment installation. User studies confirmed significant improvements in tracking quality perception from 6.8 to 8.4/10 and bead simulation realism from 7.1 to 8.7/10, demonstrating the practical effectiveness of the proposed method. Full article
(This article belongs to the Special Issue Virtual Reality Applications in Enhancing Human Lives)

44 pages, 1049 KB  
Review
Toward Intelligent AIoT: A Comprehensive Survey on Digital Twin and Multimodal Generative AI Integration
by Xiaoyi Luo, Aiwen Wang, Xinling Zhang, Kunda Huang, Songyu Wang, Lixin Chen and Yejia Cui
Mathematics 2025, 13(21), 3382; https://doi.org/10.3390/math13213382 - 23 Oct 2025
Cited by 5 | Viewed by 3764
Abstract
The Artificial Intelligence of Things (AIoT) is rapidly evolving from basic connectivity to intelligent perception, reasoning, and decision making across domains such as healthcare, manufacturing, transportation, and smart cities. Multimodal generative AI (GAI) and digital twins (DTs) provide complementary solutions. DTs deliver high-fidelity virtual replicas for real-time monitoring, simulation, and optimization with GAI enhancing cognition, cross-modal understanding, and the generation of synthetic data. This survey presents a comprehensive overview of DT–GAI integration in the AIoT. We review the foundations of DTs and multimodal GAI and highlight their complementary roles. We further introduce the Sense–Map–Generate–Act (SMGA) framework, illustrating their interaction through the SMGA loop. We discuss key enabling technologies, including multimodal data fusion, dynamic DT evolution, and cloud–edge–end collaboration. Representative application scenarios, including smart manufacturing, smart cities, autonomous driving, and healthcare, are examined to demonstrate their practical impact. Finally, we outline open challenges, including efficiency, reliability, privacy, and standardization, and we provide directions for future research toward sustainable, trustworthy, and intelligent AIoT systems. Full article

18 pages, 2718 KB  
Article
Metamodel-Based Digital Twin Architecture with ROS Integration for Heterogeneous Model Unification in Robot Shaping Processes
by Qingxin Li, Peng Zeng, Qiankun Wu and Hualiang Zhang
Machines 2025, 13(10), 898; https://doi.org/10.3390/machines13100898 - 1 Oct 2025
Cited by 3 | Viewed by 3777
Abstract
Precision manufacturing requires handling multi-physics coupling during processing, where digital twin and AI technologies enable rapid robot programming under customized requirements. However, heterogeneous data sources, diverse domain models, and rapidly changing demands pose significant challenges to digital twin system integration. To overcome these limitations, this paper proposes a digital twin modeling strategy based on a metamodel and a virtual–real fusion architecture, which unifies models between the virtual and physical domains. Within this framework, subsystems achieve rapid integration through ontology-driven knowledge configuration, while ROS provides the execution environment for establishing robot manufacturing digital twin scenarios. A case study of a robot shaping system demonstrates that the proposed architecture effectively addresses heterogeneous data association, model interaction, and application customization, thereby enhancing the adaptability and intelligence of precision manufacturing processes. Full article
(This article belongs to the Section Advanced Manufacturing)

30 pages, 9435 KB  
Article
Intelligent Fault Warning Method for Wind Turbine Gear Transmission System Driven by Digital Twin and Multi-Source Data Fusion
by Tiantian Xu, Xuedong Zhang and Wenlei Sun
Appl. Sci. 2025, 15(15), 8655; https://doi.org/10.3390/app15158655 - 5 Aug 2025
Cited by 5 | Viewed by 2375
Abstract
To meet the demands for real-time and accurate fault warning of wind turbine gear transmission systems, this study proposes an innovative intelligent warning method based on the integration of digital twin and multi-source data fusion. A digital twin system architecture is developed, comprising a high-precision geometric model and a dynamic mechanism model, enabling real-time interaction and data fusion between the physical transmission system and its virtual model. At the algorithmic level, a CNN-LSTM-Attention fault prediction model is proposed, which innovatively integrates the spatial feature extraction capabilities of a convolutional neural network (CNN), the temporal modeling advantages of long short-term memory (LSTM), and the key information-focusing characteristics of an attention mechanism. Experimental validation shows that this model outperforms traditional methods in prediction accuracy. Specifically, it achieves average improvements of 0.3945, 0.546 and 0.061 in Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE), and R-squared (R²) metrics, respectively. Building on the above findings, a monitoring and early warning platform for the wind turbine transmission system was developed, integrating digital twin visualization with intelligent prediction functions. This platform enables a fully intelligent process from data acquisition and status evaluation to fault warning, providing an innovative solution for the predictive maintenance of wind turbines. Full article
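For readers unfamiliar with the three metrics named in the abstract above, here is a minimal sketch of their usual definitions. The function names and the sample data in the usage note are assumptions for illustration; the abstract does not say how the authors computed or scaled these metrics (for example, whether MAPE is reported as a fraction or a percentage).

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error; lower is better."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; undefined if a true value is 0."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination; closer to 1 is better."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For example, with true values `[2.0, 4.0, 6.0, 8.0]` and predictions `[2.5, 3.5, 6.5, 7.5]`, every error has magnitude 0.5. Since RMSE and MAPE improve by decreasing while R² improves by increasing, the "average improvements" quoted in the abstract presumably point in opposite directions for the first two metrics and the third.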

16 pages, 4481 KB  
Article
Construction and Validation of a Digital Twin-Driven Virtual-Reality Fusion Control Platform for Industrial Robots
by Wenxuan Chang, Wenlei Sun, Pinghui Chen and Huangshuai Xu
Sensors 2025, 25(13), 4153; https://doi.org/10.3390/s25134153 - 3 Jul 2025
Cited by 5 | Viewed by 4254
Abstract
Traditional industrial robot programming methods often pose high usage thresholds due to their inherent complexity and lack of standardization. Manufacturers typically employ proprietary programming languages or user interfaces, resulting in steep learning curves and limited interoperability. Moreover, conventional systems generally lack capabilities for remote control and real-time status monitoring. In this study, a novel approach is proposed by integrating digital twin technology with traditional robot control methodologies to establish a virtual–real mapping architecture. A high-precision and efficient digital twin-based control platform for industrial robots is developed using the Unity3D (2022.3.53f1c1) engine, offering enhanced visualization, interaction, and system adaptability. The high-precision twin environment is constructed from the three dimensions of the physical layer, digital layer, and information fusion layer. The system adopts the socket communication mechanism based on TCP/IP protocol to realize the real-time acquisition of robot state information and the synchronous issuance of control commands, and constructs the virtual–real bidirectional mapping mechanism. The Unity3D platform is integrated to develop a visual human–computer interaction interface, and the user-oriented graphical interface and modular command system effectively reduce the threshold of robot use. A spatially curved part welding experiment is carried out to verify the adaptability and control accuracy of the system in complex trajectory tracking and flexible welding tasks, and the experimental results show that the system has high accuracy as well as good interactivity and stability. Full article
(This article belongs to the Section Sensors and Robotics)
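The abstract above describes TCP/IP socket communication carrying robot state in one direction and control commands in the other. Because TCP is a byte stream with no message boundaries, such a link needs some framing scheme. The sketch below shows one common choice, a 4-byte length prefix followed by a JSON payload. The message fields and function names are hypothetical; the abstract does not describe the platform's actual wire format.

```python
import json
import struct
from typing import Any, Dict, List, Tuple

HEADER = struct.Struct("!I")  # 4-byte big-endian payload length (assumed format)


def encode_message(message: Dict[str, Any]) -> bytes:
    """Serialize one message as length prefix + UTF-8 JSON payload."""
    payload = json.dumps(message).encode("utf-8")
    return HEADER.pack(len(payload)) + payload


def decode_messages(buffer: bytes) -> Tuple[List[Dict[str, Any]], bytes]:
    """Return every complete message in buffer plus the unconsumed remainder."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # partial message: wait for more bytes from the socket
        body = buffer[offset + HEADER.size:end]
        messages.append(json.loads(body.decode("utf-8")))
        offset = end
    return messages, buffer[offset:]
```

In use, the receiver would append each chunk returned by `socket.recv` to the remainder and call `decode_messages` again, so a state update split across two TCP segments is still delivered whole.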

17 pages, 4622 KB  
Article
Dual Focus-3D: A Hybrid Deep Learning Approach for Robust 3D Gaze Estimation
by Abderrahmen Bendimered, Rabah Iguernaissi, Mohamad Motasem Nawaf, Rim Cherif, Séverine Dubuisson and Djamal Merad
Sensors 2025, 25(13), 4086; https://doi.org/10.3390/s25134086 - 30 Jun 2025
Cited by 1 | Viewed by 2176
Abstract
Estimating gaze direction is a key task in computer vision, especially for understanding where a person is focusing their attention. It is essential for applications in assistive technology, medical diagnostics, virtual environments, and human–computer interaction. In this work, we introduce Dual Focus-3D, a novel hybrid deep learning architecture that combines appearance-based features from eye images with 3D head orientation data. This fusion enhances the model’s prediction accuracy and robustness, particularly in challenging natural environments. To support training and evaluation, we present EyeLis, a new dataset containing 5206 annotated samples with corresponding 3D gaze and head pose information. Our model achieves state-of-the-art performance, with a MAE of 1.64° on EyeLis, demonstrating its ability to generalize effectively across both synthetic and real datasets. Key innovations include a multimodal feature fusion strategy, an angular loss function optimized for 3D gaze prediction, and regularization techniques to mitigate overfitting. Our results show that including 3D spatial information directly in the learning process significantly improves accuracy. Full article
(This article belongs to the Special Issue Advances in Optical Sensing, Instrumentation and Systems: 2nd Edition)
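Gaze errors quoted in degrees, such as the 1.64° MAE in the abstract above, are commonly computed as the angle between predicted and ground-truth 3D gaze vectors. The sketch below uses that common definition as an assumption; the abstract does not spell out the authors' angular loss or evaluation formula, and the function names are illustrative.

```python
import math
from typing import Sequence


def angular_error_deg(pred: Sequence[float], target: Sequence[float]) -> float:
    """Angle in degrees between two non-zero 3D gaze vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(
    preds: Sequence[Sequence[float]], targets: Sequence[Sequence[float]]
) -> float:
    """Average angular error over a set of samples."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)
```

Clamping the cosine matters in practice: two nearly identical unit vectors can produce a dot product slightly above 1, which would otherwise make `math.acos` raise an error.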

18 pages, 1498 KB  
Article
Speech Emotion Recognition on MELD and RAVDESS Datasets Using CNN
by Gheed T. Waleed and Shaimaa H. Shaker
Information 2025, 16(7), 518; https://doi.org/10.3390/info16070518 - 21 Jun 2025
Cited by 12 | Viewed by 10088
Abstract
Speech emotion recognition (SER) plays a vital role in enhancing human–computer interaction (HCI) and can be applied in affective computing, virtual support, and healthcare. This research presents a high-performance SER framework based on a lightweight 1D Convolutional Neural Network (1D-CNN) and a multi-feature fusion technique. Rather than employing spectrograms as image-based input, frame-level characteristics (Mel-Frequency Cepstral Coefficients, Mel-Spectrograms, and Chroma vectors) are calculated throughout the sequences to preserve temporal information and reduce the computing expense. The model attained classification accuracies of 94.0% on MELD (multi-party talks) and 91.9% on RAVDESS (acted speech). Ablation experiments demonstrate that the integration of complementary features significantly outperforms the use of any single feature as a baseline. Data augmentation techniques, including Gaussian noise and time shifting, enhance model generalisation. The proposed method demonstrates significant potential for real-time emotion recognition using audio only in embedded or resource-constrained devices. Full article
(This article belongs to the Special Issue Artificial Intelligence Methods for Human-Computer Interaction)

23 pages, 9051 KB  
Article
Predicting User Attention States from Multimodal Eye–Hand Data in VR Selection Tasks
by Xiaoxi Du, Jinchun Wu, Xinyi Tang, Xiaolei Lv, Lesong Jia and Chengqi Xue
Electronics 2025, 14(10), 2052; https://doi.org/10.3390/electronics14102052 - 19 May 2025
Cited by 4 | Viewed by 2823
Abstract
Virtual reality (VR) devices that integrate eye-tracking and hand-tracking technologies can capture users’ natural eye–hand data in real time within a three-dimensional virtual space, providing new opportunities to explore users’ attentional states during natural 3D interactions. This study aims to develop an attention-state prediction model based on the multimodal fusion of eye and hand features, which distinguishes whether users primarily employ goal-directed attention or stimulus-driven attention during the execution of their intentions. In our experiment, we collected three types of data—eye movements, hand movements, and pupil changes—and instructed participants to complete a virtual button selection task. This setup allowed us to establish a binary ground truth label for attentional state during the execution of selection intentions for model training. To investigate the impact of different time windows on prediction performance, we designed eight time windows ranging from 0 to 4.0 s (in increments of 0.5 s) and compared the performance of eleven algorithms, including logistic regression, support vector machine, naïve Bayes, k-nearest neighbors, decision tree, linear discriminant analysis, random forest, AdaBoost, gradient boosting, XGBoost, and neural networks. The results indicate that, within the 3 s window, the gradient boosting model performed best, achieving a weighted F1-score of 0.8835 and an Accuracy of 0.8860. Furthermore, the analysis of feature importance demonstrated that the multimodal eye–hand features play a critical role in the prediction. Overall, this study introduces an innovative approach that integrates three types of multimodal eye–hand behavioral and physiological data within a virtual reality interaction context. This framework provides both theoretical and methodological support for predicting users’ attentional states within short time windows and contributes practical guidance for the design of attention-adaptive 3D interfaces. 
In addition, the proposed multimodal eye–hand data fusion framework also demonstrates potential applicability in other three-dimensional interaction domains, such as game experience optimization, rehabilitation training, and driver attention monitoring. Full article

40 pages, 24863 KB  
Article
Digital Twin-Based Technical Research on Comprehensive Gear Fault Diagnosis and Structural Performance Evaluation
by Qiang Zhang, Zhe Wu, Boshuo An, Ruitian Sun and Yanping Cui
Sensors 2025, 25(9), 2775; https://doi.org/10.3390/s25092775 - 27 Apr 2025
Cited by 13 | Viewed by 3394
Abstract
As the core transmission component of modern industrial equipment, the gearbox directly affects the overall performance and service life of the equipment through its operating state. However, gear operation still suffers from poor monitoring, reliance on a single detection index, and low data utilization, which lead to incomplete evaluation results. In view of these challenges, this paper proposes a shape-and-property integrated gearbox monitoring system based on digital twin technology and artificial intelligence, which aims to realize real-time fault diagnosis, performance prediction, and dynamic visualization of the gear through virtual–real mapping and data interaction, and lays the foundation for subsequent predictive maintenance applications. Taking the QPZZ-ii gearbox test bed as the physical entity, the research establishes a five-layer architecture (functional service layer, software support layer, model integration layer, data-driven layer, and digital twin layer) that forms a closed-loop feedback mechanism. In terms of technical implementation, the system combines refined mesh generation in HyperMesh 2023, ABAQUS 2023 simulation of the gear stress distribution under thermal–fluid–solid coupling conditions, a Gaussian process regression (GPR) stress prediction model, and a fault diagnosis algorithm based on the wavelet transform and a deep residual shrinkage network (DRSN), and analyzes the vibration signal and stress distribution of the gear under normal, broken tooth, wear, and pitting conditions. The experimental verification shows that the fault diagnosis accuracy of the system is more than 99%, that the average coefficient of determination (R²) of the stress prediction model is 0.9339 (driving wheel) and 0.9497 (driven wheel), and that the system supports the real-time display of three-dimensional cloud images. The advantage of the research lies in the interaction and visualization of fused multi-source data, but it is limited by the accuracy of the finite element simulation and the difficulty of obtaining actual stress data. This achievement provides a new method for intelligent monitoring of industrial equipment and effectively promotes the application of digital twin technology in the field of predictive maintenance. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

39 pages, 1564 KB  
Article
Future Outdoor Safety Monitoring: Integrating Human Activity Recognition with the Internet of Physical–Virtual Things
by Yu Chen, Jia Li, Erik Blasch and Qian Qu
Appl. Sci. 2025, 15(7), 3434; https://doi.org/10.3390/app15073434 - 21 Mar 2025
Cited by 7 | Viewed by 3335
Abstract
The convergence of the Internet of Physical–Virtual Things (IoPVT) and the Metaverse presents a transformative opportunity for safety and health monitoring in outdoor environments. This concept paper explores how integrating human activity recognition (HAR) with the IoPVT within the Metaverse can revolutionize public health and safety, particularly in urban settings with challenging climates and architectures. By seamlessly blending physical sensor networks with immersive virtual environments, the paper highlights a future where real-time data collection, digital twin modeling, advanced analytics, and predictive planning proactively enhance safety and well-being. Specifically, three dimensions of humans, technology, and the environment interact toward measuring safety, health, and climate. Three outdoor cultural scenarios showcase the opportunity to utilize HAR–IoPVT sensors for urban external staircases, rural health, climate, and coastal infrastructure. Advanced HAR–IoPVT algorithms and predictive analytics would identify potential hazards, enabling timely interventions and reducing accidents. The paper also explores the societal benefits, such as proactive health monitoring, enhanced emergency response, and contributions to smart city initiatives. Additionally, we address the challenges and research directions necessary to realize this future, emphasizing AI technical scalability, ethical considerations, and the importance of interdisciplinary collaboration for designs and policies. By articulating an AI-driven HAR vision along with required advancements in edge-based sensor data fusion, city responsiveness with fog computing, and social planning through cloud analytics, we aim to inspire the academic community, industry stakeholders, and policymakers to collaborate in shaping a future where technology profoundly improves outdoor health monitoring, enhances public safety, and enriches the quality of urban life. Full article
(This article belongs to the Special Issue Human Activity Recognition (HAR) in Healthcare, 2nd Edition)

22 pages, 3652 KB  
Article
Named Entity Recognition in Online Medical Consultation Using Deep Learning
by Ze Hu, Wenjun Li and Hongyu Yang
Appl. Sci. 2025, 15(6), 3033; https://doi.org/10.3390/app15063033 - 11 Mar 2025
Cited by 1 | Viewed by 2601
Abstract
Named entity recognition in online medical consultation aims to address the challenge of identifying various types of medical entities within complex and unstructured social text in the context of online medical consultations. This can provide important data support for constructing more powerful online medical consultation knowledge graphs and improving virtual intelligent health assistants. A dataset of 26 medical entity types for named entity recognition for online medical consultations is first constructed. Then, a novel approach for deep named entity recognition in the medical field based on the fusion context mechanism is proposed. This approach captures enhanced local and global contextual semantic representations of online medical consultation text while simultaneously modeling high- and low-order feature interactions between local and global contexts, thereby effectively improving the sequence labeling performance. The experimental results show that the proposed approach can effectively identify 26 medical entity types with an average F1 score of 85.47%, outperforming the state-of-the-art (SOTA) method. The practical significance of this study lies in improving the efficiency and performance of domain-specific knowledge extraction in online medical consultation, supporting the development of virtual intelligent health assistants based on large language models and enabling real-time intelligent medical decision-making, thereby helping patients and their caregivers access common medical information more promptly. Full article
