Search Results (1,049)

Search Parameters:
Keywords = VIT-2763

19 pages, 3604 KiB  
Article
An AI-Enabled Framework for Cacopsylla chinensis Monitoring and Population Dynamics Prediction
by Ruijun Jing, Deyan Peng, Jingtong Xu, Zhengjie Zhao, Xinyi Yang, Yihai Yu, Liu Yang, Ruiyan Ma and Zhiguo Zhao
Agriculture 2025, 15(11), 1210; https://doi.org/10.3390/agriculture15111210 - 1 Jun 2025
Viewed by 135
Abstract
The issue of pesticide and chemical residues in food has drawn increasing public attention, making effective control of plant pests and diseases a critical research focus in agriculture. Monitoring of pest populations is a key factor constraining the precision of pest management strategies, and low-cost, high-efficiency monitoring devices are highly desirable. To address these challenges, we focus on Cacopsylla chinensis and design a portable, AI-based detection device, along with an integrated online monitoring and forecasting system. First, to enhance the model’s capability for detecting small targets, we develop a backbone network based on the RepVit block and its variants. Additionally, we introduce a Dynamic Position Encoder module to improve feature position encoding. To further enhance detection performance, we adopt a Context Guide Fusion Module, which enables context-driven information guidance and adaptive feature adjustment. Second, we propose a framework that facilitates the development of an online monitoring system centered on Cacopsylla chinensis detection. The system incorporates a hybrid neural network model to establish the relationship between multiple environmental parameters and the Cacopsylla chinensis population, enabling trend prediction. We conduct feasibility validation experiments by comparing detection results with a manual survey. The experimental results show that the detection model achieves an accuracy of 87.4% on both test samples and edge devices. Furthermore, the population dynamics model yields a mean absolute error of 1.94% on the test dataset. These performance indicators fully meet the requirements of practical agricultural applications. Full article
(This article belongs to the Section Digital Agriculture)

32 pages, 6964 KiB  
Article
MDFT-GAN: A Multi-Domain Feature Transformer GAN for Bearing Fault Diagnosis Under Limited and Imbalanced Data Conditions
by Chenxi Guo, Vyacheslav V. Potekhin, Peng Li, Elena A. Kovalchuk and Jing Lian
Appl. Sci. 2025, 15(11), 6225; https://doi.org/10.3390/app15116225 - 31 May 2025
Viewed by 246
Abstract
In industrial scenarios, bearing fault diagnosis often suffers from data scarcity and class imbalance, which significantly hinders the generalization performance of data-driven models. While generative adversarial networks (GANs) have shown promise in data augmentation, their efficacy deteriorates in the presence of multi-category and structurally complex fault distributions. To address these challenges, this paper proposes a novel fault diagnosis framework based on a Multi-Domain Feature Transformer GAN (MDFT-GAN). Specifically, raw vibration signals are transformed into 2D RGB representations via joint time-domain, frequency-domain, and time–frequency-domain mappings, effectively encoding multi-perspective fault signatures. A Transformer-based feature extractor, integrated with Efficient Channel Attention (ECA), is embedded into both the generator and discriminator to capture global dependencies and channel-wise interactions, thereby enhancing the representation quality of synthetic samples. Furthermore, a gradient penalty (GP) term is introduced to stabilize adversarial training and suppress mode collapse. To improve classification performance, an Enhanced Hybrid Visual Transformer (EH-ViT) is constructed by coupling a lightweight convolutional stem with a ViT encoder, enabling robust and discriminative fault identification. Beyond performance metrics, this work also incorporates a Grad-CAM-based interpretability scheme to visualize hierarchical feature activation patterns within the discriminator, providing transparent insight into the model’s decision-making rationale across different fault types. Extensive experiments on the CWRU and Jiangnan University (JNU) bearing datasets validate that the proposed method achieves superior diagnostic accuracy, robustness under limited and imbalanced conditions, and enhanced interpretability compared to existing state-of-the-art approaches. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)
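The gradient penalty (GP) mentioned in the abstract above is the standard WGAN-GP regularizer, which pushes the critic's input-gradient norm toward 1 on random interpolates between real and generated samples. A minimal numpy sketch, not the authors' implementation: a linear critic is assumed so the input gradient is available in closed form.

```python
import numpy as np

def gradient_penalty_linear(w, real, fake, lam=10.0, seed=0):
    """WGAN-GP style penalty for a linear critic f(x) = x @ w.

    For a linear critic the input gradient is w everywhere, so the
    penalty reduces to lam * (||w|| - 1)^2 at every interpolate.
    """
    rng = np.random.default_rng(seed)
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake   # random interpolates
    grad = np.tile(w, (x_hat.shape[0], 1))    # d f / d x_hat = w for a linear critic
    norms = np.linalg.norm(grad, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)

w = np.array([0.6, 0.8])                      # ||w|| = 1 -> zero penalty
real = np.ones((4, 2)); fake = np.zeros((4, 2))
print(gradient_penalty_linear(w, real, fake)) # → 0.0
```

In a real GAN the gradient is obtained by automatic differentiation through the critic; the closed-form shortcut here only isolates the penalty term itself.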

9 pages, 299 KiB  
Article
Persistent Vitamin D Deficiency in Pediatric Patients with Cystic Fibrosis
by Magali Reyes-Apodaca, José L. Lezana-Fernández, Rodrigo Vázquez Frias, Mario E. Rendón-Macías, Aline González-Molina, Benjamín A. Rodríguez Espino, Isela Núñez-Barrera and Mara Medeiros
Nutrients 2025, 17(11), 1890; https://doi.org/10.3390/nu17111890 - 31 May 2025
Viewed by 174
Abstract
Background/Objectives: Cystic fibrosis (CF) is a multisystem disease caused by CFTR gene variants, with a high prevalence of vitamin D (VitD) deficiency despite the supplementation and schedules specifically developed for this population. Lower VitD levels have been associated with an increased risk of respiratory infections and pulmonary exacerbations in CF, with some pilot studies indicating the potential benefits of supplementation during acute episodes. This study aimed to describe the occurrence of VitD deficiency according to the supplemented dose in pediatric patients with CF. Methods: A cross-sectional analytical study was conducted to assess serum VitD levels in a pediatric population with cystic fibrosis. Clinical and biochemical data were collected, along with information on VitD intake and pancreatic enzyme dosage at the time of evaluation. Results: A total of 48 patients were included in the study. Normal VitD levels were observed in 41.7% of the patients, insufficiency in 31.3%, and deficiency in 27%. The median VitD intake was 2050 IU. A statistically significant difference was observed in patients with a daily intake exceeding 2000 IU. Only 10% of patients achieved levels above 30 ng/mL with a lower dose. No statistically significant association was identified between the pancreatic enzyme dosage and vitamin D levels. Conclusions: Vitamin D deficiency/insufficiency is a persistent problem in the pediatric CF population; interventions targeting the factors associated with this condition are required to refine supplementation schedules. These findings underscore the need for personalized strategies to optimize vitamin D status in people with CF (PwCF). Ideally, these strategies should consider all associated factors, including genetic variants; however, with limited resources, our results suggest that a daily dose of 2000 IU of vitamin D may represent a reasonable and effective starting point for supplementation. Full article

31 pages, 2654 KiB  
Article
A Hybrid Model of Feature Extraction and Dimensionality Reduction Using ViT, PCA, and Random Forest for Multi-Classification of Brain Cancer
by Hisham Allahem, Sameh Abd El-Ghany, A. A. Abd El-Aziz, Bader Aldughayfiq, Menwa Alshammeri and Malak Alamri
Diagnostics 2025, 15(11), 1392; https://doi.org/10.3390/diagnostics15111392 - 30 May 2025
Viewed by 226
Abstract
Background/Objectives: The brain serves as the central command center for the nervous system in the human body and is made up of nerve cells known as neurons. When these nerve cells grow rapidly and abnormally, it can lead to the development of a brain tumor. Brain tumors are severe conditions that can significantly reduce a person’s lifespan. Failure to detect or delayed diagnosis of brain tumors can have fatal consequences. Accurately identifying and classifying brain tumors poses a considerable challenge for medical professionals, especially in terms of diagnosing and treating them using medical imaging analysis. Errors in diagnosing brain tumors can significantly impact a person’s life expectancy. Magnetic Resonance Imaging (MRI) is highly effective in early detection, diagnosis, and classification of brain cancers due to its advanced imaging abilities for soft tissues. However, manual examination of brain MRI scans is prone to errors and heavily depends on radiologists’ experience and fatigue levels. Swift detection of brain tumors is crucial for ensuring patient safety. Methods: In recent years, computer-aided diagnosis (CAD) systems incorporating deep learning (DL) and machine learning (ML) technologies have gained popularity as they offer precise predictive outcomes based on MRI images using advanced computer vision techniques. This article introduces a novel hybrid CAD approach named ViT-PCA-RF, which integrates Vision Transformer (ViT) and Principal Component Analysis (PCA) with Random Forest (RF) for brain tumor classification, providing a new method in the field. ViT was employed for feature extraction, PCA for feature dimension reduction, and RF for brain tumor classification. The proposed ViT-PCA-RF model helps detect early brain tumors, enabling timely intervention, better patient outcomes, and streamlining the diagnostic process, reducing patient time and costs. 
Our model was trained and tested on the Brain Tumor MRI (BTM) dataset for multi-class brain tumor classification. The BTM dataset was preprocessed using resizing and normalization methods to ensure consistent input. Subsequently, our model was compared against traditional classifiers, showcasing impressive performance metrics. Results: It exhibited outstanding accuracy, specificity, precision, recall, and F1 score, with rates of 99%, 99.4%, 98.1%, 98.1%, and 98.1%, respectively. Conclusions: This evaluation underlined the potential of our model, which leverages ViT, PCA, and RF techniques and shows promise for the precise and effective detection of brain tumors. Full article
(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)
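The ViT → PCA → RF pipeline described above can be sketched with scikit-learn. In this illustrative stand-in, random synthetic class clusters take the place of ViT embeddings, and all names, dimensions, and hyperparameters are assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-in for ViT embeddings: 300 "scans", 768-dim features,
# 4 classes (e.g. glioma / meningioma / pituitary / no tumor).
rng = np.random.default_rng(0)
n, d, k = 300, 768, 4
y = rng.integers(0, k, size=n)
centers = rng.normal(size=(k, d))
X = centers[y] + 0.5 * rng.normal(size=(n, d))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# PCA compresses the 768-dim embedding before the Random Forest classifies it.
clf = make_pipeline(PCA(n_components=32), RandomForestClassifier(random_state=0))
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```

On real data the first stage would be a frozen ViT producing the feature vectors; only the PCA + RF stages shown here would be fit on the extracted embeddings.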

25 pages, 9740 KiB  
Article
Autism Spectrum Disorder Detection Using Skeleton-Based Body Movement Analysis via Dual-Stream Deep Learning
by Jungpil Shin, Abu Saleh Musa Miah, Manato Kakizaki, Najmul Hassan and Yoichi Tomioka
Electronics 2025, 14(11), 2231; https://doi.org/10.3390/electronics14112231 - 30 May 2025
Viewed by 116
Abstract
Autism Spectrum Disorder (ASD) poses significant challenges in diagnosis due to its diverse symptomatology and the complexity of early detection. Atypical gait and gesture patterns, prominent behavioural markers of ASD, hold immense potential for facilitating early intervention and optimising treatment outcomes. These patterns can be efficiently and non-intrusively captured using modern computational techniques, making them valuable for ASD recognition. Various types of research have been conducted to detect ASD through deep learning, including facial feature analysis, eye gaze analysis, and movement and gesture analysis. In this study, we optimise a dual-stream architecture that combines image classification and skeleton recognition models to analyse video data for body motion analysis. The first stream processes Skepxels—spatial representations derived from skeleton data—using ConvNeXt-Base, a robust image recognition model that efficiently captures aggregated spatial embeddings. The second stream encodes angular features, embedding relative joint angles into the skeleton sequence and extracting spatiotemporal dynamics using the Multi-Scale Graph 3D Convolutional Network (MSG3D), a combination of Graph Convolutional Networks (GCNs) and Temporal Convolutional Networks (TCNs). We replace the ViT model from the original architecture with ConvNeXt-Base to evaluate the efficacy of CNN-based models in capturing gesture-related features for ASD detection. Additionally, we experimented with a Stack Transformer in the second stream instead of MSG3D but found it to result in lower performance accuracy, thus highlighting the importance of GCN-based models for motion analysis. The integration of these two streams ensures comprehensive feature extraction, capturing both global and detailed motion patterns. A pairwise Euclidean distance loss is employed during training to enhance the consistency and robustness of feature representations.
The results from our experiments demonstrate that the two-stream approach, combining ConvNeXt-Base and MSG3D, offers a promising method for effective autism detection. This approach not only enhances accuracy but also contributes valuable insights into optimising deep learning models for gesture-based recognition. By integrating image classification and skeleton recognition, we can better capture both global and detailed motion patterns, which are crucial for improving early ASD diagnosis and intervention strategies. Full article
(This article belongs to the Special Issue Convolutional Neural Networks and Vision Applications, 4th Edition)
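The pairwise Euclidean distance loss used above to keep the two streams' features consistent can be sketched in a few lines of numpy. This is an illustrative reading of the idea, not the authors' exact formulation.

```python
import numpy as np

def pairwise_distance_loss(a, b):
    """Mean Euclidean distance between matched feature pairs.

    a, b: (batch, dim) embeddings from the two streams; minimising
    this drives the streams toward consistent representations.
    """
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

f1 = np.array([[0.0, 0.0], [1.0, 1.0]])  # stream-1 features
f2 = np.array([[3.0, 4.0], [1.0, 1.0]])  # stream-2 features
print(pairwise_distance_loss(f1, f2))    # (5.0 + 0.0) / 2 = 2.5
```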
19 pages, 4932 KiB  
Article
Deep Learning-Based Fluid Identification with Residual Vision Transformer Network (ResViTNet)
by Yunan Liang, Bin Zhang, Wenwen Wang, Sinan Fang, Zhansong Zhang, Liang Peng and Zhiyang Zhang
Processes 2025, 13(6), 1707; https://doi.org/10.3390/pr13061707 - 29 May 2025
Viewed by 208
Abstract
The tight sandstone gas reservoirs in the LX area of the Ordos Basin are characterized by low porosity, poor permeability, and strong heterogeneity, which significantly complicate fluid type identification. Conventional methods based on petrophysical logging and core analysis have shown limited effectiveness in this region, often resulting in low accuracy of fluid identification. To improve the precision of fluid property identification in such complex tight gas reservoirs, this study proposes a hybrid deep learning model named ResViTNet, which integrates ResNet (residual neural network) with ViT (vision transformer). The proposed method transforms multi-dimensional logging data into thermal maps and utilizes a sliding window sampling strategy combined with data augmentation techniques to generate high-dimensional image inputs. This enables automatic classification of different reservoir fluid types, including water zones, gas zones, and gas–water coexisting zones. Application of the method to a logging dataset from 80 wells in the LX block demonstrates a fluid identification accuracy of 97.4%, outperforming conventional statistical methods and standalone machine learning algorithms. The ResViTNet model exhibits strong robustness and generalization capability, providing technical support for fluid identification and productivity evaluation in the exploration and development of tight gas reservoirs. Full article
(This article belongs to the Section Energy Systems)
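The sliding-window sampling strategy described above — cutting multi-curve logging data into fixed-size windows that become image-like classifier inputs — might look like this in numpy. Window and step sizes are illustrative assumptions, not the paper's values.

```python
import numpy as np

def sliding_windows(curves, win, step):
    """Cut multi-curve well-log data into fixed-size windows.

    curves: (depth_samples, n_curves) array; each window becomes one
    (win, n_curves) "image" for the downstream classifier.
    """
    starts = range(0, curves.shape[0] - win + 1, step)
    return np.stack([curves[s:s + win] for s in starts])

logs = np.arange(100 * 5, dtype=float).reshape(100, 5)  # 100 depths, 5 curves
patches = sliding_windows(logs, win=32, step=8)
print(patches.shape)  # (9, 32, 5)
```

Overlapping windows (step < win) act as the data augmentation the abstract mentions, multiplying the number of training samples per well.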

24 pages, 2032 KiB  
Article
ViT-Based Classification and Self-Supervised 3D Human Mesh Generation from NIR Single-Pixel Imaging
by Carlos Osorio Quero, Daniel Durini and Jose Martinez-Carranza
Appl. Sci. 2025, 15(11), 6138; https://doi.org/10.3390/app15116138 - 29 May 2025
Viewed by 249
Abstract
Accurately estimating 3D human pose and body shape from a single monocular image remains challenging, especially under poor lighting or occlusions. Traditional RGB-based methods struggle in such conditions, whereas single-pixel imaging (SPI) in the Near-Infrared (NIR) spectrum offers a robust alternative. NIR penetrates clothing and adapts to illumination changes, enhancing body shape and pose estimation. This work explores an SPI camera (850–1550 nm) with Time-of-Flight (TOF) technology for human detection in low-light conditions. SPI-derived point clouds are processed using a Vision Transformer (ViT) to align poses with a predefined SMPL-X model. A self-supervised PointNet++ network estimates global rotation, translation, body shape, and pose, enabling precise 3D human mesh reconstruction. Laboratory experiments simulating night-time conditions validate NIR-SPI’s potential for real-world applications, including human detection in rescue missions. Full article
(This article belongs to the Special Issue Single-Pixel Imaging and Identification)

34 pages, 1606 KiB  
Systematic Review
Deep Learning Techniques for Lung Cancer Diagnosis with Computed Tomography Imaging: A Systematic Review for Detection, Segmentation, and Classification
by Kabiru Abdullahi, Ramakrishnan Kannan and Aziah Binti Ali
Information 2025, 16(6), 451; https://doi.org/10.3390/info16060451 - 27 May 2025
Viewed by 124
Abstract
Background/Objectives: Lung cancer is a major global health challenge and the leading cause of cancer-related mortality, due to its high morbidity and mortality rates. Early and accurate diagnosis is crucial for improving patient outcomes. Computed tomography (CT) imaging plays a vital role in detection, and deep learning (DL) has emerged as a transformative tool to enhance diagnostic precision and enable early identification. This systematic review examined the advancements, challenges, and clinical implications of DL in lung cancer diagnosis via CT imaging, focusing on model performance, data variability, generalizability, and clinical integration. Methods: Following the 2020 Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we analyzed 1448 articles published between 2015 and 2024. These articles were sourced from major scientific databases, including the Institute of Electrical and Electronics Engineers (IEEE), Scopus, Springer, PubMed, and Multidisciplinary Digital Publishing Institute (MDPI). After applying stringent inclusion and exclusion criteria, we selected 80 articles for review and analysis. Our analysis evaluated DL methodologies for lung nodule detection, segmentation, and classification, identified methodological limitations, and examined challenges to clinical adoption. Results: Deep learning (DL) models demonstrated high accuracy, achieving nodule detection rates exceeding 95% (with a maximum false-positive rate of 4 per scan) and a classification accuracy of 99% (sensitivity: 98%). However, challenges persist, including dataset scarcity, annotation variability, and population generalizability. Hybrid architectures, such as convolutional neural networks (CNNs) combined with transformers, show promise in improving nodule localization. Nevertheless, fewer than 15% of the studies validated models using multicenter datasets or diverse demographic data.
Conclusions: While DL exhibits significant potential for lung cancer diagnosis, limitations in reproducibility and real-world applicability hinder its clinical translation. Future research should prioritize explainable artificial intelligence (AI) frameworks, multimodal integration, and rigorous external validation across diverse clinical settings and patient populations to bridge the gap between theoretical innovation and practical deployment. Full article
21 pages, 20038 KiB  
Article
CN2VF-Net: A Hybrid Convolutional Neural Network and Vision Transformer Framework for Multi-Scale Fire Detection in Complex Environments
by Naveed Ahmad, Mariam Akbar, Eman H. Alkhammash and Mona M. Jamjoom
Fire 2025, 8(6), 211; https://doi.org/10.3390/fire8060211 - 26 May 2025
Viewed by 304
Abstract
Fire detection remains a challenging task due to varying fire scales, occlusions, and complex environmental conditions. This paper proposes the CN2VF-Net model, a novel hybrid architecture that combines vision Transformers (ViTs) and convolutional neural networks (CNNs), effectively addressing these challenges. By leveraging the global context understanding of ViTs and the local feature extraction capabilities of CNNs, the model learns a multi-scale attention mechanism that dynamically focuses on fire regions at different scales, thereby improving accuracy and robustness. The evaluation on the D-Fire dataset demonstrates that the proposed model achieves a mean average precision at an IoU threshold of 0.5 (mAP50) of 76.1%, an F1-score of 81.5%, a recall of 82.8%, a precision of 83.3%, and a mean IoU (mIoU50–95) of 77.1%. These results outperform existing methods by 1.6% in precision, 0.3% in recall, and 3.4% in F1-score. Furthermore, visualizations such as Grad-CAM heatmaps and prediction overlays provide insight into the model’s decision-making process, validating its capability to effectively detect and segment fire regions. These findings underscore the effectiveness of the proposed hybrid architecture and its applicability in real-world fire detection and monitoring systems. With its superior performance and interpretability, the CN2VF-Net architecture sets a new benchmark in fire detection and segmentation, offering a reliable approach to protecting life, property, and the environment. Full article
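The mAP50 metric reported above counts a detection as correct when its IoU (intersection over union) with a ground-truth box is at least 0.5. The underlying IoU computation is straightforward; the boxes below are invented for illustration.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection corners
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred, truth = (0, 0, 10, 10), (5, 0, 15, 10)
print(box_iou(pred, truth))  # 50 / 150 ≈ 0.333, below the 0.5 threshold
```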

19 pages, 14298 KiB  
Article
BETAV: A Unified BEV-Transformer and Bézier Optimization Framework for Jointly Optimized End-to-End Autonomous Driving
by Rui Zhao, Ziguo Chen, Yuze Fan, Fei Gao and Yuzhuo Men
Sensors 2025, 25(11), 3336; https://doi.org/10.3390/s25113336 - 26 May 2025
Viewed by 203
Abstract
End-to-end autonomous driving demands precise perception, robust motion planning, and efficient trajectory generation to navigate complex and dynamic environments. This paper proposes BETAV, a novel framework that addresses the persistent challenges of low 3D perception accuracy and suboptimal trajectory smoothness in autonomous driving systems through unified BEV-Transformer encoding and Bézier-optimized planning. By leveraging Vision Transformers (ViTs), our approach encodes multi-view camera data into a Bird’s Eye View (BEV) representation using a transformer architecture, capturing both spatial and temporal features to enhance scene understanding comprehensively. For motion planning, a Bézier curve-based planning decoder is proposed, offering a compact, continuous, and parameterized trajectory representation that inherently ensures motion smoothness, kinematic feasibility, and computational efficiency. Additionally, this paper introduces a set of constraints tailored to address vehicle kinematics, obstacle avoidance, and directional alignment, further enhancing trajectory accuracy and safety. Experimental evaluations on the nuScenes benchmark dataset and simulations demonstrate that our framework achieves state-of-the-art performance in trajectory prediction and planning tasks, exhibiting superior robustness and generalization across diverse and challenging Bench2Drive driving scenarios. Full article
(This article belongs to the Section Vehicular Sensing)
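A Bézier planning decoder of the kind described above outputs a small set of control points; evaluating the resulting trajectory is cheap and inherently smooth. A sketch using De Casteljau's algorithm, with control points invented for illustration:

```python
import numpy as np

def bezier(control, t):
    """Evaluate a Bézier curve at parameter t via De Casteljau's algorithm."""
    pts = np.asarray(control, dtype=float)
    while len(pts) > 1:
        # Repeated linear interpolation between consecutive control points.
        pts = (1.0 - t) * pts[:-1] + t * pts[1:]
    return pts[0]

# A cubic trajectory segment in the ego frame (control points assumed).
ctrl = [(0.0, 0.0), (2.0, 0.0), (4.0, 1.0), (6.0, 1.0)]
print(bezier(ctrl, 0.0))  # starts at the first control point
print(bezier(ctrl, 0.5))  # curve midpoint
```

Because the curve is a polynomial in t, its derivatives (heading, curvature) are continuous by construction, which is what gives the representation its smoothness guarantee.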

21 pages, 3661 KiB  
Article
WindDefNet: A Multi-Scale Attention-Enhanced ViT-Inception-ResNet Model for Real-Time Wind Turbine Blade Defect Detection
by Majad Mansoor, Xiyue Tan, Adeel Feroz Mirza, Tao Gong, Zhendong Song and Muhammad Irfan
Machines 2025, 13(6), 453; https://doi.org/10.3390/machines13060453 - 25 May 2025
Viewed by 239
Abstract
Real-time non-intrusive monitoring of wind turbines, blades, and defect surfaces poses a set of complex challenges related to accuracy, safety, cost, and computational efficiency. This work introduces an enhanced deep learning-based framework for real-time detection of wind turbine blade defects. The WindDefNet is introduced, which features Inception-ResNet modules, a Visual Transformer (ViT), and multi-scale attention mechanisms. WindDefNet utilizes modified cross-convolutional blocks, including the powerful Inception-ResNet hybrid, to capture both fine-grained and high-level features from input images. A multi-scale attention module is added to focus on important regions in the image, improving detection accuracy, especially in challenging areas of the wind turbine blades. We employ pretrained Inception-ResNet and ViT patch-embedding architectures to achieve superior performance in defect classification. WindDefNet’s capability to capture and integrate multi-scale feature representations enhances its effectiveness for robust wind turbine condition monitoring, thereby reducing operational downtime and minimizing maintenance costs. Our model integrates a novel attention mechanism, combining a custom-pretrained Inception-ResNet with the self-attention of a Visual Transformer encoder, to enhance feature extraction and improve model accuracy. The proposed method demonstrates significant improvements in classification performance, with precision, recall, and F1-scores of 0.88, 1.00, and 0.93 for the damage class, 1.00, 0.71, and 0.83 for the edge class, and 1.00, 1.00, and 1.00 for both the erosion and normal surfaces. The macro-average and weighted-average F1 scores both stand at 0.94, highlighting the robustness of our approach. These results underscore the potential of the proposed model for defect detection in industrial applications. Full article
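The macro-average F1 of 0.94 quoted above follows directly from the four per-class F1 scores, since macro-averaging weights every class equally regardless of how many samples each class has:

```python
# Per-class F1 scores reported above (damage, edge, erosion, normal).
f1_scores = {"damage": 0.93, "edge": 0.83, "erosion": 1.00, "normal": 1.00}

# Macro-average: the unweighted mean over classes.
macro_f1 = sum(f1_scores.values()) / len(f1_scores)
print(f"macro-average F1: {macro_f1:.2f}")  # macro-average F1: 0.94
```

A weighted average would instead scale each class's F1 by its support; that it also lands at 0.94 here suggests the class supports are fairly balanced.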

19 pages, 2321 KiB  
Article
Dual-Branch Network with Hybrid Attention for Multimodal Ophthalmic Diagnosis
by Xudong Wang, Anyu Cao, Caiye Fan, Zuoping Tan and Yuanyuan Wang
Bioengineering 2025, 12(6), 565; https://doi.org/10.3390/bioengineering12060565 - 25 May 2025
Viewed by 339
Abstract
In this paper, we propose a deep learning model based on dual-branch learning with a hybrid attention mechanism, designed to address the underutilization of features in ophthalmic image diagnosis and the limited generalization ability of traditional single-modal deep learning models trained on imbalanced data. Firstly, a dual-branch architecture is designed in which the left and right branches use residual blocks to process the features of a 2D image and a 3D volume, respectively. Secondly, a frequency-domain-transform-driven hybrid attention module is introduced, consisting of frequency-domain, spatial, and channel attention, to address the inefficiency of network feature extraction. Finally, a multi-scale grouped attention fusion mechanism integrates the local details and global structural information of the two modalities, addressing the fusion inefficiency caused by heterogeneous modal features. The experimental results show that the accuracy of MOD-Net improved by 1.66% and 1.14% over GeCoM-Net and ViT-2SPN, respectively. We conclude that the model effectively mines the deep correlation features of multimodal images through the hybrid attention mechanism, providing a new paradigm for the intelligent diagnosis of ophthalmic diseases. Full article
(This article belongs to the Special Issue AI in OCT (Optical Coherence Tomography) Image Analysis)
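The channel-attention component of a hybrid attention module like the one above can be sketched as a squeeze-and-excite style reweighting. This toy numpy version is an assumption-laden illustration: it uses a softmax over pooled channel statistics in place of the learned MLP-plus-sigmoid of a real attention block.

```python
import numpy as np

def channel_attention(feat):
    """Minimal squeeze-and-excite style channel attention (illustrative).

    feat: (C, H, W) feature map. Global-average-pool each channel,
    softmax the pooled vector into weights, and rescale the channels.
    """
    pooled = feat.mean(axis=(1, 2))                 # squeeze: (C,)
    w = np.exp(pooled - pooled.max())
    w = w / w.sum()                                 # channel weights sum to 1
    return feat * w[:, None, None]                  # excite: reweight channels

feat = np.ones((3, 4, 4)) * np.array([1.0, 2.0, 3.0])[:, None, None]
out = channel_attention(feat)
print(out[:, 0, 0])  # stronger channels receive larger weights
```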

16 pages, 1561 KiB  
Article
An Investigation into the Effects of Frailty and Sarcopenia on Postoperative Anesthesia Recovery and Complications Among Geriatric Patients Undergoing Colorectal Malignancy Surgery
by Rüştü Özdemir and Ferda Yaman
Medicina 2025, 61(6), 969; https://doi.org/10.3390/medicina61060969 - 23 May 2025
Viewed by 154
Abstract
Background and Objectives: In this study, we aimed to assess preoperative frailty among hospitalized patients over 60 undergoing colorectal cancer surgery. We investigated the impact of frailty and sarcopenia on postoperative recovery, complications, and discharge time, while also identifying a cost-effective, bedside-accessible ultrasonography (USG) parameter for diagnosing sarcopenia among patients assessed using the sonographic thigh adjustment ratio (STAR) method. Materials and Methods: In this prospective study, we investigated the impact of frailty and sarcopenia on the postoperative outcomes of 42 geriatric patients (with American Society of Anesthesiologists (ASA) scores of I–III) undergoing colorectal cancer surgery under general anesthesia. Frailty was assessed using the FRAIL scale, and sarcopenia was evaluated using the STAR. Ultrasonographic measurements of rectus femoris and vastus intermedius muscle thicknesses were taken, and thigh lengths (TLs) were recorded. The ratios of rectus femoris thickness to TL (RFT/TL), vastus intermedius thickness to TL (VIT/TL), and total muscle thickness to TL (TMT/TL) were calculated. Postoperative anesthesia recovery was monitored using the Modified Aldrete Score, which indicates readiness for discharge from the recovery unit. Complications were classified using the Clavien–Dindo system, and hospital discharge times were noted. Results: We observed significant differences between frailty status and ASA scores, as well as between age and frailty status. Muscle thickness differed significantly between the frail and pre-frail patients. Among the sarcopenic patients, age differences were significant. In men, VIT/TL was significantly correlated with a sarcopenia diagnosis, whereas, in women, RFT/TL, VIT/TL, and TMT/TL were all correlated with sarcopenia. Conclusions: Based on our results, we conclude that the VIT/TL measurement can serve as a predictive marker for preoperative sarcopenia, helping optimize patient health before surgery. Full article
(This article belongs to the Section Intensive Care/Anesthesiology)
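The STAR ratios described above are simple quotients of bedside ultrasound measurements. A short sketch with hypothetical, purely illustrative values (not patient data from the study):

```python
# Hypothetical bedside ultrasound measurements in cm (illustrative only).
rft = 1.10   # rectus femoris thickness
vit = 0.95   # vastus intermedius thickness
tl = 42.0    # thigh length

# The STAR-style ratios used in the abstract.
rft_tl = rft / tl                 # rectus femoris thickness / thigh length
vit_tl = vit / tl                 # vastus intermedius thickness / thigh length
tmt_tl = (rft + vit) / tl         # total muscle thickness / thigh length

print(f"RFT/TL={rft_tl:.4f}  VIT/TL={vit_tl:.4f}  TMT/TL={tmt_tl:.4f}")
```

Whether a given ratio indicates sarcopenia depends on sex-specific cutoffs established in the STAR literature, not on these example numbers.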

14 pages, 2145 KiB  
Article
Advanced AI-Driven Thermographic Analysis for Diagnosing Diabetic Peripheral Neuropathy and Peripheral Arterial Disease
by Albert Siré Langa, Jose Luis Lázaro-Martínez, Aroa Tardáguila-García, Irene Sanz-Corbalán, Sergi Grau-Carrión, Ibon Uribe-Elorrieta, Arià Jaimejuan-Comes and Ramon Reig-Bolaño
Appl. Sci. 2025, 15(11), 5886; https://doi.org/10.3390/app15115886 - 23 May 2025
Viewed by 365
Abstract
This study explores the integration of advanced artificial intelligence (AI) techniques with infrared thermography for diagnosing diabetic peripheral neuropathy (DPN) and peripheral arterial disease (PAD). Diabetes-related foot complications, including DPN and PAD, are leading causes of morbidity and disability worldwide. Traditional diagnostic methods, such as the monofilament test for DPN and the ankle–brachial pressure index for PAD, have limited sensitivity, highlighting the need for improved solutions. Thermographic imaging, a non-invasive, cost-effective, and reliable tool, captures the temperature distribution of the patient's plantar surface, enabling the detection of physiological changes linked to these conditions. This study collected thermographic data from diabetic patients and employed convolutional neural networks (CNNs) and vision transformers (ViTs) to classify individuals as healthy or affected by DPN or PAD (not healthy). These neural networks demonstrated superior diagnostic performance compared to traditional methods (an accuracy of 95.00%, a sensitivity of 100.00%, and a specificity of 90% in the case of the ResNet-50 network). The results underscore the potential of combining thermography with AI to provide scalable, accurate, and patient-friendly diagnostics for diabetic foot care. Future work should focus on expanding datasets and integrating explainability techniques to enhance clinical trust and adoption. Full article
(This article belongs to the Special Issue Applications of Sensors in Biomechanics and Biomedicine)
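The accuracy, sensitivity, and specificity figures quoted for ResNet-50 come directly from confusion-matrix counts. A minimal sketch of how such metrics are computed; the counts below are illustrative and chosen only to reproduce the reported percentages, not taken from the paper's test set:

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (true-positive rate), and specificity
    (true-negative rate) from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity

# Illustrative counts: 10 affected patients all detected (no false
# negatives), 10 healthy patients with one false alarm.
acc, sens, spec = binary_metrics(tp=10, fp=1, tn=9, fn=0)
print(acc, sens, spec)  # 0.95 1.0 0.9
```

Note that with imbalanced classes, sensitivity and specificity are more informative than accuracy alone, which is why all three are reported.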

36 pages, 28088 KiB  
Article
Sustainable Color Development Strategies for Ancient Chinese Historical Commercial Areas: A Case Study of Suzhou’s Xueshi Street–Wuzounfang Street
by Lyuhang Feng, Guanchao Yu, Mingrui Miao and Jiawei Sun
Sustainability 2025, 17(11), 4756; https://doi.org/10.3390/su17114756 - 22 May 2025
Viewed by 303
Abstract
This study focuses on the visual sustainability of color in commercial historical districts, taking the historical area of Xueshi Street–Wuzoufang Street in Suzhou, China as a case study, and explores how to balance modern commercial development with the protection of historical culture. Owing to the impact of commercialization and the introduction of immature protection policies, historical districts often face the coexisting dilemmas of “color conflict” and “color poverty”. Traditional color protection methods are either overly subjective or excessively quantitative, making it difficult to balance scientific rigor and adaptability. Therefore, this study provides a detailed literature review, compares current quantitative color research methods, and proposes a comprehensive color analysis framework based on ViT (Vision Transformer), the CIEDE2000 color difference model, and K-means clustering (the V-C-K framework). Using this framework, we conducted an in-depth analysis of color harmony in the studied area, aiming to accurately identify color issues in the district and provide optimization strategies. The experimental results show that the commercial colors of the Xueshi Street–Wuzoufang Street historical district are clearly polarized: some areas have colors that are overly bright, leading to visual conflict, while others have colors that are too dull, lacking vitality; some areas display both conditions. We then compared the extracted negative colors with the prohibited colors in mainstream urban color management guidelines based on the Munsell color system.
We found that the “high lightness, high saturation” colors strictly limited by traditional color criteria are not necessarily disharmonious, while the unrestricted “low lightness, low saturation” colors do not guarantee harmony and can exacerbate the area’s “dilapidated feeling”. In other words, traditional color protection standards often emphasize the safety of “low saturation, low lightness” colors while ignoring that such colors can also cause dullness and discordance in certain environments. Under a ΔE (color difference) threshold framework, color recognition is comparatively more sensitive, balancing inclusivity toward “vibrant” colors with caution against “dull” ones. Based on these experimental results, this study proposes the following recommendations: (1) use a ΔE00 threshold to control the commercial colors in the district, ensuring that the colors align with the historical atmosphere while retaining commercial vitality; (2) in protection practice, comprehensively apply the ViT, CIEDE2000, and K-means quantitative methods (i.e., the V-C-K framework) to reduce subjective error; and (3) building on this quantitative framework, and drawing on the reasonable parts of existing protection guidelines, combine collaborative cooperation, surveys of cultural groups’ color preferences, policy incentives, and continuous monitoring and feedback to construct an operable plan covering the entire “recognition–analysis–control” process. Full article
(This article belongs to the Collection Sustainable Conservation of Urban and Cultural Heritage)
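Two pieces of the V-C-K framework, palette extraction by K-means and color-difference thresholding, can be sketched compactly. This is not the paper's pipeline: it uses a plain K-means with deterministic initialization, and the simpler CIE76 Euclidean ΔE in Lab space rather than the full CIEDE2000 formula (which adds perceptual correction terms); the Lab values and the threshold of 10 are hypothetical.

```python
import numpy as np

def kmeans_colors(pixels, k=3, iters=20):
    """Plain K-means over CIELAB pixel values to extract a small palette.
    pixels: (N, 3) array; returns (k, 3) cluster centers (palette colors).
    Deterministic spread-out initialization keeps the sketch reproducible."""
    pixels = np.asarray(pixels, dtype=float)
    centers = pixels[np.linspace(0, len(pixels) - 1, k).astype(int)].copy()
    for _ in range(iters):
        # Assign each pixel to its nearest center, then recompute centers.
        d = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = pixels[labels == j].mean(axis=0)
    return centers

def delta_e76(lab1, lab2):
    """CIE76 color difference: a simple Euclidean stand-in for the
    CIEDE2000 formula used in the paper."""
    return float(np.linalg.norm(np.asarray(lab1, dtype=float)
                                - np.asarray(lab2, dtype=float)))

# A facade color is flagged when it drifts too far from the district palette.
facade = [55.0, 10.0, 20.0]     # hypothetical Lab value of a shopfront
reference = [50.0, 8.0, 18.0]   # hypothetical palette color
print(delta_e76(facade, reference) < 10.0)  # True: within the example threshold
```

In the paper's framework, the ViT component segments facade regions before color extraction, and the ΔE00 threshold is calibrated against the district's historical palette rather than fixed at an arbitrary value.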
