Search Results (1,575)

Search Parameters:
Keywords = visual attention mechanism

27 pages, 9415 KB  
Article
A Protocol for ZnO Nanoparticle Incorporation into Wood via Waterborne Seeding and Microwave-Assisted Growth: Effects on the Physicochemical and Mechanical Properties
by Christina Sperantza, George Vekinis, Stamatios Boyatzis, Anastasia Pournou and Eleni Makarona
Coatings 2026, 16(6), 708; https://doi.org/10.3390/coatings16060708 (registering DOI) - 13 Jun 2026
Abstract
Zinc oxide (ZnO) nanoparticles have attracted increasing attention in wood science due to their multifunctional properties, including antimicrobial activity, UV absorption, and photocatalytic behavior. Water-based deposition protocols offer clear advantages yet typically struggle with nanoparticle aggregation and limited adhesion to lignocellulosic substrates. This work introduces a rapid and scalable waterborne protocol combining catalyst-free aqueous seeding with microwave-assisted (MWA) growth under mild conditions. Pinus pinaster veneer samples were treated via dip-coating and spraying, with single and double seeding cycles, followed by MWA growth. Protocol efficiency was assessed through ZnO retention, SEM, and EDS analysis, while the impact on the substrate was assessed via mechanical testing, ATR-FTIR spectroscopy, and colorimetry. Dip-coating achieved significantly higher precursor uptake than spraying, while repeated seeding cycles further increased ZnO loading. Results suggest that incorporation may proceed through zinc–carboxylate bonds within the wood matrix, followed by localized ZnO nanostructure development. The effective integration did not weaken the mechanical properties, while color changes were significant for dip-coated samples and noticeable for sprayed ones. Overall, this methodology provides a fast, water-based, and minimally invasive route for ZnO incorporation into wood and a scalable pathway with retained mechanical and chemical properties and limited visual impact. Full article
(This article belongs to the Special Issue Innovations in Functional Coatings for Wood Processing)

33 pages, 22512 KB  
Article
A Simulation-Based Hybrid Quantum-Classical Channel Attention Network for Reliable Aircraft Skin Defect Recognition
by Shiqi Jiang, Hai Peng, Dingqi Zhang and Yupei Zhu
Technologies 2026, 14(6), 361; https://doi.org/10.3390/technologies14060361 (registering DOI) - 13 Jun 2026
Abstract
Aircraft skin defect recognition is a safety-critical visual inspection task in which lightweight models must maintain high diagnostic accuracy while suppressing false alarms caused by complex surface textures, illumination variations, and weak defect patterns. This study proposes HQCA-Net, a simulation-based hybrid quantum-classical channel attention network for reliable aircraft skin defect recognition. The core component, termed Residual Quantum Channel Attention (RQCA), embeds a 10-qubit variational quantum circuit into a classical ResNet-18 backbone to perform compact and structured nonlinear feature recalibration, introducing only 30 trainable quantum-gate parameters. The quantum circuit is evaluated using state-vector simulation, and this study focuses on model-level feature recalibration, reliability, and robustness within the evaluated dataset rather than implementation on physical quantum hardware. Experiments on a six-class aircraft skin defect dataset show that HQCA-Net achieves 97.93% classification accuracy and a global false positive rate of 0.49%, outperforming ResNet-18 and classical lightweight attention mechanisms including SE, ECA, and SimAM. Additional analyses using confidence calibration, Grad-CAM visualization, Gaussian noise perturbation, few-shot training, and circuit-depth ablation further indicate that the proposed RQCA module improves feature discrimination and false-alarm suppression under compact parameter constraints. These results suggest that the hybrid quantum-classical attention module can serve as a parameter-efficient nonlinear feature recalibration strategy for reliable visual defect inspection under the tested experimental conditions. Full article
(This article belongs to the Section Quantum Technologies)

22 pages, 920 KB  
Article
Early Detection of Fake News via Structured Social Interaction Simulation and Hierarchical Cross-Modal Fusion
by Ruihua Qi, Shuqin Chen, Weilong Li, Chenwei Zhang, Jiatai Lei, Haobo Lv and Yunhao Sun
Appl. Sci. 2026, 16(12), 6001; https://doi.org/10.3390/app16126001 (registering DOI) - 13 Jun 2026
Abstract
The widespread dissemination and societal impact of fake news underscore the critical need for effective detection. Existing methods remain limited, as they often fail to learn joint representations from multi-modal data and rely heavily on complete social interaction signals. Such information is frequently unavailable in practice, especially during the early propagation stages. To address early fake news detection in social media, this paper proposes a hierarchical cross-modal fusion framework with structured LLM-simulated social interaction (HCF-LSIM). The framework employs a progressive cross-modal attention mechanism to systematically align semantic representations across multiple levels, integrating textual, thematic, and visual features. Additionally, HCF-LSIM designs an LLM-powered social interaction simulator that generates structured triplets from adapted user profiles, effectively compensating for missing real-time interaction data. Experiments on public benchmarks demonstrate strong performance, with accuracies of 93.5% on Weibo and 87.2% on X (formerly Twitter), ranking first on Weibo and second on Twitter. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
30 pages, 6714 KB  
Article
Study on a Method for Identifying Particles Causing High-Speed Fluid Wear Based on Multi-Source Information Fusion
by Long Feng, Zhiyu Xiang, Junming Liu, Feng Zhu, Zhenzhen Zhang and Hongxin Xu
Processes 2026, 14(12), 1918; https://doi.org/10.3390/pr14121918 (registering DOI) - 12 Jun 2026
Viewed by 134
Abstract
Mechanical wear particle recognition is an important approach for equipment health monitoring and fault early warning. However, flow-field disturbances and high-speed particle motion in high-speed fluid environments can lead to image degradation, non-stationary electrostatic signals, and insufficient reliability of single-source recognition methods. Therefore, this study proposes a wear particle recognition method based on multi-source information fusion for high-speed fluid environments. The method establishes a multi-scale electrostatic sensing model to characterize the coupling relationship among particle material properties, motion states, and electrostatic response characteristics. Empirical mode decomposition and independent component analysis are combined for adaptive electrostatic signal denoising, and a Transformer network is used to extract multi-domain features. Meanwhile, an ECA-CNN model with an efficient channel attention mechanism is introduced to enhance the feature representation of degraded particle images. On this basis, a meta-learning-based sample-adaptive decision fusion framework is developed to achieve dynamic and complementary fusion of electrostatic and visual information. The experimental results demonstrate that the proposed method exhibits excellent recognition accuracy and robustness in the tested high-speed fluid environment of 10 m/s, achieving a fusion recognition accuracy of 96.0%, which is significantly superior to single-source recognition methods. Ablation experiments further show that removing the global scaling factor, guidance loss, interpolation loss, and category-specific weight generator decreases the average recognition accuracy by 0.7%, 1.2%, 0.4%, and 1.8%, respectively, confirming the contribution of each key module to fusion recognition performance.
These findings provide a new technical approach for the online intelligent recognition of wear particles under high-speed fluid conditions and offer theoretical support and methodological guidance for condition monitoring, health assessment, and intelligent operation and maintenance of large-scale equipment. Full article
(This article belongs to the Section Process Control, Modeling and Optimization)
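The efficient channel attention (ECA) idea named in the abstract above can be illustrated with a small example. This is a minimal pure-Python sketch of the general ECA pattern (global average pooling per channel, a short 1D convolution across neighbouring channels, sigmoid gating), not the authors' ECA-CNN model. The function name `eca_gate`, the kernel weights, and the toy feature map are hypothetical and chosen only for illustration.

```python
import math

def eca_gate(feature_map, kernel):
    """Rescale each channel by an ECA-style attention weight.

    feature_map: list of channels, each a 2D list (H x W) of floats.
    kernel: odd-length list of 1D convolution weights (illustrative values,
            in a trained network these would be learned).
    """
    # Step 1: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # Step 2: 1D convolution along the channel axis with zero padding,
    # so each channel interacts only with a local window of channels.
    half = len(kernel) // 2
    mixed = []
    for c in range(len(pooled)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = c + j - half
            if 0 <= idx < len(pooled):
                acc += w * pooled[idx]
        mixed.append(acc)
    # Step 3: a sigmoid maps the mixed descriptors to gates in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    # Step 4: every pixel of a channel is scaled by that channel's gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return gates, scaled

# Toy input: 3 channels of 2x2 pixels (hypothetical values).
fmap = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
gates, scaled = eca_gate(fmap, kernel=[0.5, 1.0, 0.5])
```

The gate for a channel depends only on the pooled values of its nearest channels, which is why this style of attention adds very few parameters (here, three kernel weights) compared with a fully connected squeeze-and-excitation block.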

20 pages, 27181 KB  
Communication
Infrared and Visible Image Fusion Network Based on Self-Compensating Lightweight Convolution
by Ruolin Li, Hongmei Wang, Qiaorong Wu, Cheng Liang, Haoyu Li and Jingyu Wang
Sensors 2026, 26(12), 3748; https://doi.org/10.3390/s26123748 - 12 Jun 2026
Viewed by 142
Abstract
Deep learning has significantly improved the quality of infrared and visible image fusion. However, existing mainstream deep fusion networks often come with complex architectures and a large number of parameters. While general lightweight techniques can effectively reduce model complexity, they often weaken feature interactions during the lightweighting process, resulting in the loss of complementary texture and thermal information in fused images and making it difficult to balance fusion performance and model efficiency. To address these issues, this paper constructs an infrared and visible image fusion network based on a self-compensating lightweight convolution mechanism, named LWC-DenseFuse. The core of the network lies in a self-compensating lightweight convolution module, which goes beyond conventional convolution replacement and explicitly addresses feature degradation introduced by lightweight design. The module decouples spatial and channel correlations of standard convolution through depthwise convolution and pointwise convolution, while incorporating a channel attention mechanism to adaptively enhance salient features. Additionally, channel shuffle technology is employed to promote information exchange between groups, thereby enhancing feature interaction and compensating for the loss of critical information caused by lightweight design. To further improve the representation capability of the lightweight network during optimization, a staged training strategy with progressive loss weighting is introduced. Experimental evaluations demonstrate that the proposed fusion network significantly reduces the number of model parameters while ensuring real-time inference performance. Meanwhile, it effectively alleviates the performance degradation typically associated with lightweight architectures, as evidenced by improvements in information entropy and visual fidelity. Full article
(This article belongs to the Collection Multi-Sensor Information Fusion)
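The channel shuffle step mentioned in the abstract above can be sketched as a simple index permutation. This is an illustration of the generic channel shuffle operation popularized by ShuffleNet-style networks, under the assumption that LWC-DenseFuse uses it in the usual way; the abstract does not give the group count or layer layout, so the function name `channel_shuffle` and the example sizes are hypothetical.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so that each output group mixes all input groups.

    channels: list of per-channel items (indices, tensors, anything).
    groups: number of convolution groups; must divide len(channels).
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Equivalent to viewing the list as (groups, per_group),
    # transposing to (per_group, groups), and flattening again.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]

# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle(list(range(6)), groups=2)
print(shuffled)  # prints [0, 3, 1, 4, 2, 5]
```

After the shuffle, a following grouped convolution sees channels from both original groups in each of its groups, which is the information exchange between groups that the abstract refers to.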

22 pages, 12892 KB  
Article
A Fault Diagnosis Method for Plunger Pumps Based on Multi-Scale Convolution and Attention
by Linlin Liu, Shuhui Hao, Ruonan Yin, Kewen Li and Liechong Wang
Appl. Sci. 2026, 16(12), 5944; https://doi.org/10.3390/app16125944 - 12 Jun 2026
Viewed by 135
Abstract
Plunger pumps serve as core power equipment in oilfield water injection systems, where their reliable operation directly affects crude oil recovery efficiency and production safety. Failures such as mechanical wear and seal leakage can cause injection pressure fluctuations, increased energy consumption, and even pipeline burst accidents. This study addresses the challenges in plunger pump fault diagnosis, including the difficulty in capturing multi-scale fault features, interference from redundant information in high-dimensional feature spaces, and high model computational complexity. We propose a lightweight fault diagnosis approach called Multi-scale Attention Neural Network (MSLAN), which combines multi-scale convolution and attention mechanisms. In this model, a Separable Multi-scale Fusion Module (SMSF) employs parallel multi-branch convolutional kernels to acquire fault signatures across multiple scales, while computational overhead is reduced through depthwise separable convolution and shared pointwise convolution. Additionally, a Multi-Branch Parallel Attention Module (MBPA) is introduced to finely model complex inter-channel dependencies through a four-branch parallel structure, enhancing the perception of key features and suppressing redundant information. Experimental results on a self-constructed plunger pump dataset, the Case Western Reserve University bearing dataset, and the Southeast University gearbox dataset demonstrate that MSLAN achieves F1-scores of 88.95%, 98.89%, and 99.90%, respectively. While maintaining high diagnostic accuracy, the model exhibits significantly lower parameter count and computational cost compared to baseline models, effectively balancing diagnostic precision and computational efficiency. Ablation studies and visualization analyses further validate the effectiveness of each module. 
This study establishes an accurate and efficient intelligent fault diagnosis solution for plunger pumps, which is also readily applicable to a broader range of rotating machinery. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
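To give a sense of why depthwise separable convolution reduces computational overhead, as the abstract above states, the following sketch compares weight counts for a standard convolution and its depthwise-plus-pointwise factorization. It uses the textbook formulas for 2D kernels without bias terms; the channel counts and kernel size are hypothetical examples and are not taken from the MSLAN paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 conv that mixes channels
    return depthwise + pointwise

# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
std = standard_conv_params(64, 128, 3)
sep = depthwise_separable_params(64, 128, 3)
ratio = sep / std
print(std, sep)  # prints 73728 8768
```

The ratio equals 1/c_out + 1/k^2, roughly 0.12 for this example, so the factorized layer needs about an eighth of the weights of the standard one.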

20 pages, 4278 KB  
Article
Image Watermarking Algorithm Leveraging Dual-Attention Synergy and Adaptive Multi-Scale Fusion
by Zhenghan Yang, Huadong Sun and Nuohan Lv
Electronics 2026, 15(12), 2580; https://doi.org/10.3390/electronics15122580 - 11 Jun 2026
Viewed by 163
Abstract
Blind image watermarking models such as HiDDeN have laid an important foundation for end-to-end watermarking. Nevertheless, they still suffer from three major limitations: single-scale feature extraction, fixed fusion weights, and slow training convergence. To address these issues, this paper proposes an adaptive multi-scale watermarking algorithm based on collaborative dual-attention mechanisms. The algorithm designs an adaptive multi-scale feature fusion module (MA-FFM) with a dynamic gating network in the encoder, which flexibly combines local multi-scale textures with global contextual information, overcoming the limitation of fixed fusion weights. In the decoder, a multi-level channel attention module is embedded to strengthen the extraction of watermark signals. The two attention modules work synergistically: the encoder focuses on adaptive feature fusion while the decoder leverages channel attention to selectively enhance watermark-related features, forming a dual-attention synergy that balances robustness and imperceptibility. Moreover, the dynamic gating network adaptively adjusts the contribution of local versus global features via learnable weights, whose evolution from approximately 0.51 to about 0.89 improves model interpretability. Experiments are conducted on the COCO 2017 dataset. Compared with HiDDeN, the proposed algorithm reduces the bit error rate (BER) from 0.1696 to 0.1538 under no attack with a relative reduction of 9.3%, increases PSNR by 0.61 dB, and improves SSIM from 0.9058 to 0.9077. Under various attacks—including JPEG compression, Gaussian noise, salt-and-pepper noise, and brightness/contrast adjustments—the BER remains consistently lower than that of HiDDeN. Ablation studies confirm the effectiveness of each module. Overall, the proposed algorithm preserves visual quality, improves the accuracy of watermark embedding and extraction, and exhibits strong generalization robustness against common image distortions. Full article
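The relative BER reduction quoted in the abstract above can be checked directly from the two reported values. The short sketch below only redoes that arithmetic; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline, improved):
    """Fractional decrease of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# BER values reported in the abstract: HiDDeN vs. the proposed algorithm.
ber_drop = relative_reduction(0.1696, 0.1538)
print(f"{ber_drop:.1%}")  # prints 9.3%
```

The result matches the stated 9.3% relative reduction, which is a ratio of the two error rates and not a difference in percentage points (the absolute difference is 0.0158).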

30 pages, 729 KB  
Article
Restorative Design Perception and User Satisfaction in Concert Hall Architecture: The Serial Mediating Roles of Flow Experience and Musical Resonance
by Jing Wang, Guangliang Sang and Ken Nah
Buildings 2026, 16(12), 2328; https://doi.org/10.3390/buildings16122328 - 11 Jun 2026
Viewed by 197
Abstract
With the continuous deepening of green building concepts and the sustained advancement of research on health-oriented design, increasing attention has been paid to the impact of architectural space on users’ psychological perception and behavioral outcomes. In China, the rapid development of urban cultural facilities and the growing emphasis on high-quality public cultural spaces have made concert halls an important context for examining how architectural environments shape user experience. In recent years, relevant studies have gradually expanded from energy conservation, function, and technical performance evaluation to discussion of the subjective experience of the architectural environment and its psychological effects. As a typical type of cultural building, the concert hall is an important place for music communication and artistic experience, and its spatial environment may also influence users’ state of immersion and emotional resonance. However, existing studies mostly focus on the acoustic quality, visual characteristics, and functional organization of concert halls, and still lack a systematic empirical explanation of how restorative design influences user satisfaction through psychological mechanisms. Using survey data from 972 users of six representative concert halls in six Chinese cities, this study constructs a theoretical model with perceived restorative design as the independent variable, flow experience and musical resonance as mediating variables, and user satisfaction as the dependent variable, aiming to broaden the understanding of the internal mechanism through which restorative design affects user satisfaction. 
The results show that: (1) perceived restorative design is positively associated with user satisfaction; (2) flow experience and musical resonance respectively play mediating roles between perceived restorative design and user satisfaction; and (3) flow experience and musical resonance respectively play a chain mediating role between perceived restorative design and user satisfaction. This study enriches the applied research on restorative design in the field of cultural architecture, reveals the psychological path through which restorative design in concert halls affects user satisfaction, and expands the theoretical boundaries of research on architectural environment experience. The conclusions provide a theoretical basis for optimizing the design of concert hall buildings and improving user experience, and also offer practical insights for the human-centered and high-quality development of cultural buildings in the context of green building. Full article
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

21 pages, 3212 KB  
Article
Strain Prediction of Pile-Type Adjustable Wind-Turbine Foundation Caps Using XGBoost–SHAP Feature Selection and the TimeXer Model
by Lei Bian, Cong Liu, Huanwei Wei, Honghua Zhao and Xinyang Li
Buildings 2026, 16(12), 2325; https://doi.org/10.3390/buildings16122325 - 10 Jun 2026
Viewed by 131
Abstract
Accurate prediction of pile-cap strain is crucial for the safety of wind-turbine foundations, yet conventional methods struggle to screen key features from high-dimensional monitoring data and to model the nonlinear coupling between endogenous and exogenous variables, hindering both accuracy and interpretability. To address these limitations, this paper proposes a pile-cap-strain prediction method integrating XGBoost-SHAP feature selection with the TimeXer deep-learning model. XGBoost-SHAP first identifies critical predictors from high-dimensional pile-stress data; the TimeXer model then exploits its endogenous–exogenous fusion mechanism for strain prediction. The results show that XGBoost-SHAP effectively selected 10 key features, of which the upper-middle and middle windward-side stresses (Z1-4A, Z1-5A) contributed over 40% of the explanatory power. This stage performs dimensionality reduction and sensor-importance interpretation, halving the input dimensionality while maintaining accuracy comparable to the full 19-channel input. TimeXer achieved a coefficient of determination (R2) of 0.993 in single-step prediction, comparable to the best-performing baselines, and maintained stable performance over a 120 min multi-step horizon. In a zero-shot cross-site transfer test, TimeXer attained the highest eight-step average R2 (0.914) among all models, indicating strong cross-site generalization. Attention-mechanism visualization further suggested consistency between the model’s prediction logic and structural mechanics principles. The proposed framework provides a technical solution combining high accuracy with strong interpretability for wind-turbine foundation health monitoring. Full article
(This article belongs to the Special Issue Structural Health Monitoring Through Advanced Artificial Intelligence)
18 pages, 1275 KB  
Article
Research on Two-Stream Networks Integrating Physiological Features and Attention Mechanisms for Motion Classification in Visually Impaired Individuals
by Wentong Wang, Changyuan Wang, Zehui Chen and Wenbo Huang
Sensors 2026, 26(12), 3681; https://doi.org/10.3390/s26123681 - 9 Jun 2026
Viewed by 257
Abstract
To address the issues of low perception accuracy and poor robustness in traditional motion recognition methods within complex walking environments for visually impaired individuals, this study utilizes multi-modal data, including ECG, PPG, and IMU, for classification. Regarding the low filtering efficiency of multi-modal data, an improved wavelet filtering algorithm based on LSTM is proposed. To further enhance classification accuracy, this paper introduces a motion recognition method for the blindfolded mobility simulation based on an Attention-based Two-Stream Deep Fusion Convolutional Neural Network (ATS-DFCNN). The proposed method constructs a two-stream heterogeneous feature extraction architecture by synchronously collecting tri-axial motion signals and physiological signals from subjects. A 1D-CNN is employed to capture the spatial geometric features of limb movements, while a hybrid CNN-GRU network is utilized to mine the temporal evolution patterns of physiological stress. Furthermore, an attention mechanism is introduced to achieve dynamic weighted fusion at the feature level, which strengthens critical motion features and suppresses environmental noise. Experiments were conducted with 10 subjects simulating the movements of visually impaired individuals, covering typical actions such as walking, standing, climbing stairs, descending stairs, and falling. The results demonstrate that the proposed adaptive filtering algorithm achieves an AUC of 0.942, significantly improving feature distinctiveness compared to traditional algorithms. The ATS-DFCNN model achieved an average recognition accuracy of 92.2% across five activity categories, representing a 4.8% performance increase over single IMU modal classification. Particularly in fall detection, the model effectively reduces false alarms through physiological feedback and accurately infers motion intentions, providing reliable technical support for the safety monitoring of intelligent walking-aid systems. 
Full article
(This article belongs to the Special Issue AI in Sensor-Based E-Health, Wearables and Assisted Technologies)

17 pages, 5529 KB  
Article
EA-StrongSORT: An Efficient Attention StrongSORT Framework for Detection-Based Tumor Tracking in Cine-MRI TrackRAD2025 Dataset
by Alyaa Amer, Noha Ghatwary, Salema Fayed, Sahar Magdy, Alla Hussein, Rania Kadry and Amina I. Abdelmaksoud
Mach. Learn. Knowl. Extr. 2026, 8(6), 158; https://doi.org/10.3390/make8060158 - 9 Jun 2026
Viewed by 113
Abstract
MRI-guided radiotherapy (MRIgRT) enables the real-time visualization of tumor motion, allowing adaptive radiation delivery based on dynamic anatomical changes. However, respiratory-induced tumor motion remains a major challenge, particularly for thoracic and abdominal tumors. Continuous tumor motion may reduce treatment accuracy and increase radiation exposure to surrounding healthy tissues. Therefore, reliable and efficient tumor tracking is essential for real-time motion management in MRI-guided radiotherapy. Recent advances in artificial intelligence have demonstrated significant potential for medical image analysis; however, many existing tumor tracking approaches rely on segmentation-based methods that require detailed annotations and complex processing, which can limit their use in real-time clinical environments. In this work, we propose a detection-based tumor tracking framework that integrates the YOLOv11 object detection model with an enhanced StrongSORT tracking algorithm (EA-StrongSORT). The proposed approach replaces the conventional re-identification backbone with a lightweight EfficientNetV2 architecture augmented with an Efficient Channel Attention (ECA) mechanism. The overall framework follows a tracking-by-detection concept, where tumor regions are first detected and subsequently associated across frames. The proposed framework is evaluated on the TrackRAD2025 dataset using multiple YOLOv11 variants to analyze the balance between performance and model complexity. Experimental results demonstrate that the lightweight YOLOv11n model achieves the best detection performance, with a precision of 0.912, recall of 0.607, mean Average Precision (mAP) of 0.771, and mAP50-95 of 0.608. Furthermore, the proposed tracking framework achieves stable temporal association, with Multiple-Object Tracking Accuracy (MOTA) scores exceeding 91% and Higher-Order Tracking Accuracy (HOTA) scores around 90%.
The proposed framework demonstrates the potential of detection-based tumor localization and tracking for real-time MRI-guided radiotherapy applications. Full article

25 pages, 4406 KB  
Article
Nondestructive Detection of Foreign Matter in Pu-erh Ripe Tea Based on Deep Learning
by Baijuan Wang, Xiaoxue Guo, Xin Fang, He Ji, Jihong Zhou, Junjie He, Shihao Zhang and Yuefei Wang
Foods 2026, 15(12), 2083; https://doi.org/10.3390/foods15122083 - 8 Jun 2026
Viewed by 166
Abstract
To address the challenges of small foreign matter size, severe occlusion, and complex backgrounds in Pu-erh ripe tea processing, this study drew inspiration from primate visual mechanisms and proposed an improved YOLOv13-based network, AE-YOLOv13-S. To mitigate loss of fine details, the weakening of discriminative features, and the frequent occurrence of missed and false detections, the Adaptive Sparse Self-Attention Network was introduced to optimize the backbone of the network, inspired by the sequential cognitive pattern of primates involving target search, local verification, selective integration, and final decision making. To address insufficient long-range semantic associations and the submergence of fine-grained differences in background noise, Emulating Self-Attention with Convolution was employed to optimize part of the Conv modules of the network, drawing on the hierarchical information processing mechanisms of primates from peripheral perception to central fine analysis. In response to the limitations of bounding boxes, such as approximate target enclosure, the large amount of geometric supervision noise, the obvious localization deviation, and delayed model convergence, a Scale-based Dynamic Loss, inspired by primate visual perception mechanisms, was introduced to optimize the network’s loss function. The results showed that, during training, compared with the baseline, AE-YOLOv13-S achieved lower training loss values: Box Loss declined by 6.76%, Cls Loss by 6.52%, and DFL Loss by 8.65%. On the validation dataset, the model demonstrated reductions of 6.58%, 16.39%, and 8.33% for these respective metrics. After the overall improvements, AE-YOLOv13-S achieved increases of 1.43, 4.85, and 2.69 percentage points in precision, recall, and mAP@50, respectively, with only a 0.3 G increase in FLOPs. 
The improved model can classify and detect foreign matter in Pu-erh ripe tea efficiently and accurately, providing not only a new technical pathway for foreign matter detection in tea processing but also a practically meaningful technical solution for intelligent quality control and food safety assurance in the tea processing chain.

23 pages, 3094 KB  
Article
A Camera-Based Visual Sensor Pipeline for Fine-Grained Human Activity Recognition in Classroom Scenes
by Cheng Sun, Danning Wu, Zihao Wu, Weibing Zhou and Jin Zhang
Sensors 2026, 26(12), 3666; https://doi.org/10.3390/s26123666 - 8 Jun 2026
Viewed by 279
Abstract
Student behavior recognition in classroom environments is important for teaching quality assessment and intelligent education, yet it remains challenging due to dense student distributions, frequent occlusion, substantial scale variation, and the subtle nature of common classroom activities. To address these issues, this paper proposes RepYOLOv5-SF3D, a cascaded visual perception framework for fine-grained student behavior recognition in complex classroom scenes. The framework integrates a lightweight RepYOLOv5m detector with a dual-stream SlowFast-3D recognition branch, enabling automated inference from raw video input to behavior labels. To improve robustness in dense and occluded scenes, the front-end detector serves as a spatial-prior module, while a decoupled training strategy reduces the impact of localization instability on back-end spatiotemporal learning. In addition, two task-oriented modules are introduced in the recognition branch: the Spatiotemporal Depthwise-Separable 3D module (SDS3D) and the Normalization-Based Temporal Attention Mechanism (NTAM). Experimental results on a real classroom dataset show that RepYOLOv5-SF3D achieves a mean average precision (mAP) of 88.83%, outperforming the baseline SlowFast model by 3.36% and surpassing the existing LSTC method by 2.05%, while maintaining a front-end inference latency of 12.5 ms per frame and a total model size of 151.46 MB. These results demonstrate a favorable balance between fine-grained recognition accuracy and edge-deployment efficiency in practical classroom visual sensing.
(This article belongs to the Special Issue Sensors for Human Activity Recognition: 3rd Edition)

28 pages, 14978 KB  
Article
SCT-YOLO: A Dual-Stream Defect Detection Network Utilizing Computational Shape, Texture, and Color Features
by Zhenning Mou, Yuchao Dai, Zihe Cao, Zhe Lv, Lei Wang, Yehu Shen and Guizhong Fu
Sensors 2026, 26(12), 3662; https://doi.org/10.3390/s26123662 - 8 Jun 2026
Viewed by 203
Abstract
Steel surface defect detection is a key component of industrial quality control. Existing deep learning methods mostly rely on single-backbone networks to extract high-level semantic features from raw images, yet fail to explicitly analyze low-level visual features. To overcome the limitations of conventional frameworks, this paper proposes SCT-YOLO, a defect detection model based on a dual-stream collaborative architecture that effectively integrates low-level visual features with high-level semantic information for complementary feature representation. The primary scientific novelty of this work lies in the formalization of a multi-dimensional prior feature-guided learning paradigm, which mathematically bridges explicit hand-crafted physical priors with deep latent representations within this dual-stream architecture, differing from conventional black-box deep feature extraction. On the SD-Saliency-900 dataset, the proposed SCT-YOLO achieves an mAP@0.5 of 88.9%, representing a significant 4.6% improvement over the baseline model YOLOv8n, while maintaining an inference speed of 257.2 FPS with only 5.66 M parameters and 15.2 GFLOPs, fully meeting the real-time deployment requirements of industrial production lines. Visualization analysis demonstrates that the method exhibits more stable detection capability for small defects in complex backgrounds. Meanwhile, experiments on the GC10-DET dataset further verify its excellent generalization performance, providing a reliable technical solution for other industrial defect detection scenarios.
(This article belongs to the Section Industrial Sensors)

17 pages, 40394 KB  
Article
Lightweight Low-Light Enhancement Network with Multi-Bio-Inspired Visual Mechanisms
by Yafeng Zhao, Xiang Li, Shuaipeng Hao, Min Yu, Yanli Gao and Shiwei Fan
Biomimetics 2026, 11(6), 401; https://doi.org/10.3390/biomimetics11060401 - 7 Jun 2026
Viewed by 254
Abstract
In edge deployment scenarios, low-light image enhancement faces a trade-off between model complexity and perceptual quality, limiting lightweight models under resource constraints. To address this problem, this paper proposes a perceptual quality optimization model inspired by biological visual mechanisms. Specifically, a GT-Mean loss is introduced to simulate the luminance adaptation property of the mammalian retina, effectively mitigating optimization bias caused by exposure inconsistency in imaging sensors, while the LPIPS loss, aligned with the perceptual preferences of the human visual system (HVS), is incorporated to enhance subjective visual quality. From a structural perspective, inspired by the multi-scale perception of insect compound eyes, biologically selective attention, and color constancy mechanisms, the proposed model integrates an efficient texture-aware attention module, an enhanced multi-scale feature fusion strategy, and a chrominance denoising module. Experimental results demonstrate that, while maintaining an extremely low parameter count of only 0.52 M, the proposed model consistently outperforms existing lightweight methods on the LOL series datasets in terms of PSNR, SSIM, and LPIPS. This work provides an efficient perceptual quality optimization solution for bioinspired visual sensing under resource-constrained conditions.
(This article belongs to the Special Issue Bionic Vision Applications and Validation)