Search Results (498)

Search Parameters:
Keywords = collaborative attention mechanism

28 pages, 7089 KB  
Article
Multi-Scale Context-Aware Network Implementation for Efficient Image Semantic Segmentation
by Yi Yang and Chong Guo
Appl. Sci. 2026, 16(8), 4033; https://doi.org/10.3390/app16084033 (registering DOI) - 21 Apr 2026
Abstract
Image semantic segmentation is essential in autonomous driving, medical imaging, and remote sensing. While convolutional neural networks (CNNs) excel at local feature extraction and spatial structure modeling, their limited receptive fields restrict the capture of long-range dependencies and global semantic consistency. Transformers provide strong global modeling through self-attention but often lack local inductive bias and show weaker generalization on small datasets. To address these limitations, this paper proposes a Multi-Scale Context-aware Network (MSC-Net) for image semantic segmentation. Under an encoder–decoder framework, MSC-Net combines a convolutional backbone with a Multi-Scale Self-Attention module to integrate the complementary strengths of CNNs and attention mechanisms. The backbone extracts local texture and structural information and can adopt architectures such as MobileNet, Xception, DRN, and ResNet, while the attention module captures long-range dependencies and multi-scale contextual information. This design improves cross-layer feature collaboration, multi-scale feature fusion, and boundary quality while maintaining computational efficiency. Experimental results show that MSC-Net achieves 38.8% mIoU and 98.4% ACC under comparable computational settings. Compared with SegFormer and DeepLabV3+, the model improves mIoU by approximately +3.0 and +3.3 percentage points, respectively, while reducing FLOPs and parameter size. Full article
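The MSC-Net abstract above combines a convolutional backbone with multi-scale self-attention. As a rough numpy illustration of that general mechanism (not the authors' code), the sketch below pools a feature map to several scales, runs plain single-head self-attention at each scale, and fuses the upsampled results; the identity Q/K/V projections and the scale set are simplifying assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens):
    # tokens: (N, C); single head, identity projections for brevity
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    return softmax(scores, axis=-1) @ tokens

def multi_scale_self_attention(feat, scales=(1, 2, 4)):
    # feat: (C, H, W) backbone feature map; H and W must divide by every scale
    C, H, W = feat.shape
    fused = np.zeros_like(feat)
    for s in scales:
        assert H % s == 0 and W % s == 0
        # average-pool to a coarser grid: (C, H//s, W//s)
        pooled = feat.reshape(C, H // s, s, W // s, s).mean(axis=(2, 4))
        tokens = pooled.reshape(C, -1).T                      # (h*w, C) tokens
        attended = self_attention(tokens).T.reshape(C, H // s, W // s)
        # nearest-neighbour upsample back to (C, H, W) and average over scales
        fused += np.kron(attended, np.ones((s, s))) / len(scales)
    return fused
```

Coarser scales give each token a wider effective receptive field, which is the complementary strength the abstract attributes to the attention branch.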

24 pages, 34048 KB  
Article
Unsupervised Hyperspectral Unmixing Based on Multi-Faceted Graph Representation and Curriculum Learning
by Ran Liu, Junfeng Pu, Yanru Chen, Yanling Miao, Dawei Liu and Qi Wang
Remote Sens. 2026, 18(8), 1250; https://doi.org/10.3390/rs18081250 (registering DOI) - 21 Apr 2026
Abstract
Hyperspectral unmixing aims to estimate endmember spectra and their corresponding abundance fractions at the subpixel scale, which is a critical preprocessing step for quantitative analysis of hyperspectral remote sensing imagery. While deep learning-based methods have achieved remarkable progress, three fundamental challenges remain: (i) reliance on a single shared spatial prior that cannot decouple the heterogeneous spatial patterns of different land covers; (ii) the lack of synergy in jointly optimizing endmember extraction and abundance estimation; (iii) the poor robustness of unsupervised training to complex mixtures, noise, and class imbalance. To address these issues, we propose a novel unsupervised unmixing framework that integrates adaptive orthogonal multi-faceted graph representation with curriculum learning. Specifically, we design an Adaptive Orthogonal Multi-Faceted Graph Generator (AOMFG) to learn a set of independent orthogonal graph structures, achieving spatially informed decoupling of land cover patterns. Then, a dual-branch collaborative optimization network is constructed: a Graph Convolutional Network (GCN) branch that incorporates the learned spatial topological priors for abundance estimation, and a 1D Convolutional Neural Network (1DCNN) branch that employs a query-attention mechanism to adaptively aggregate pure spectral features for endmember extraction. Finally, we introduce a three-stage curriculum learning strategy that progressively fine-tunes the model, which significantly enhances its performance. Extensive experiments on three widely used real-world benchmark datasets demonstrate that our proposed framework consistently outperforms state-of-the-art methods in both endmember extraction and abundance estimation accuracy. Comprehensive ablation studies, parameter sensitivity analysis, and noise robustness tests further validate the effectiveness of each core component. Full article
(This article belongs to the Section Remote Sensing Image Processing)
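For readers new to unmixing, the linear mixing model and its two classical abundance constraints (nonnegativity, ANC; sum-to-one, ASC) can be sketched as below. This is a crude projected-gradient solver for fixed endmembers, not the paper's learned dual-branch network; the learning rate and iteration count are arbitrary assumptions:

```python
import numpy as np

def linear_mixing(endmembers, abundances):
    # endmembers: (P, B) spectra; abundances: (N, P), rows nonnegative, sum to 1
    return abundances @ endmembers                     # (N, B) mixed pixel spectra

def abundances_fcls_like(pixels, endmembers, iters=500, lr=0.01):
    # crude projected-gradient solve of ||X - A E||^2 under ANC/ASC;
    # unmixing networks like the one above learn this mapping end to end
    N, P = pixels.shape[0], endmembers.shape[0]
    A = np.full((N, P), 1.0 / P)                       # start from uniform mixtures
    for _ in range(iters):
        grad = (A @ endmembers - pixels) @ endmembers.T
        A = np.clip(A - lr * grad, 0.0, None)          # nonnegativity (ANC)
        A = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)  # sum-to-one (ASC)
    return A
```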

22 pages, 2130 KB  
Article
MFAFENet: A Multi-Sensor Collaborative and Multi-Scale Feature Information Adaptive Fusion Network for Spindle Rotational Error Classification in CNC Machine Tools
by Fei Wang, Lin Song, Pengfei Wang, Ping Deng and Tianwei Lan
Entropy 2026, 28(4), 475; https://doi.org/10.3390/e28040475 - 20 Apr 2026
Abstract
Accurate classification of spindle rotational errors is critical for ensuring machining precision and operational reliability of CNC machine tools. However, existing methods face challenges in extracting discriminative feature information from vibration signals due to small inter-class differences and complex electromechanical interference. This paper proposes a novel deep learning model, MFAFENet, based on multi-sensor collaboration and multi-scale feature information adaptive fusion. Vibration signals from three mounting positions are transformed into time-frequency information representations via Short-time Fourier Transform. The proposed network adaptively fuses multi-scale feature information from parallel branches with different kernel sizes through a branch attention mechanism. An efficient channel attention module is then incorporated to recalibrate channel-wise feature responses. The cross-entropy loss function is employed to optimize the network parameters during training. Experiments on a spindle reliability test bench demonstrate that MFAFENet achieves 93.37% average test accuracy, outperforming other comparative methods. Ablation and comparative studies confirm the effectiveness of each module and the clear advantage of adaptive fusion over fixed-weight multi-scale methods. Multi-sensor fusion further improves accuracy by 7.23% over the best single-sensor setup. The proposed method establishes an effective end-to-end mapping between vibration signals and rotational errors, providing a promising solution for high-precision spindle condition monitoring. Full article
(This article belongs to the Section Multidisciplinary Applications)
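The first step of the MFAFENet pipeline, turning a vibration signal into a time-frequency image via the Short-time Fourier Transform, can be reproduced in a few lines of numpy. Frame length, hop, and the Hann window are common defaults, not values taken from the paper:

```python
import numpy as np

def stft_magnitude(signal, frame_len=256, hop=128):
    """STFT magnitude spectrogram (a time-frequency image).

    MFAFENet-style pipelines feed one such image per sensor position into
    parallel CNN branches; this sketch covers only the transform step.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T       # (freq_bins, n_frames)

# a vibration-like test signal: 50 Hz tone plus noise, sampled at 1 kHz
fs = 1000
t = np.arange(0, 2.0, 1 / fs)
sig = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.default_rng(0).normal(size=t.size)
spec = stft_magnitude(sig)
```

The 50 Hz tone shows up as a bright horizontal band near bin 13 (50 Hz / (1000 Hz / 256 bins) ≈ 12.8), which is the kind of structure the downstream attention branches learn to weight.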
24 pages, 5983 KB  
Article
Visual Understanding of Intelligent Apple Picking: Detection-Segmentation Joint Architecture Based on Improved YOLOv11
by Bin Yan and Qianru Wu
Horticulturae 2026, 12(4), 494; https://doi.org/10.3390/horticulturae12040494 - 18 Apr 2026
Abstract
Achieving precise fruit localization and fine branch segmentation simultaneously in unstructured orchard environments remains challenging due to variable lighting, occlusion, and complex backgrounds. This study proposed a joint detection–segmentation architecture based on an improved YOLOv11 network for collaborative perception of apples and tree branches. First, a dual-task dataset of spindle-type apple orchards was constructed with bounding-box annotations for fruits and pixel-level polygon masks for branches, encompassing diverse illumination and occlusion conditions. Second, Convolutional Block Attention Modules (CBAMs) are strategically embedded into the YOLOv11 backbone to enhance feature discrimination for slender branch structures while preserving high fruit detection accuracy. The enhanced model achieves precision of 0.981, recall of 0.986, and F1-score of 0.983 for apple detection, and precision of 0.803, recall of 0.715, mAP of 0.698, and IoU of 0.6066 for branch segmentation on the validation set. Comparative experiments against YOLOv8 and baseline YOLOv11 confirm improved segmentation continuity and finer branch delineation. The proposed integrated perception framework provides reliable visual guidance for collision-avoidance robotic harvesting and offers a practical reference for multi-task agricultural vision systems. Full article
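The CBAM blocks the authors embed in the YOLOv11 backbone have a simple, well-known forward pass: channel attention from pooled descriptors, then spatial attention from channel-wise statistics. The numpy sketch below shows that mechanism only; the 7x7 spatial convolution of the original module is replaced by an averaging shortcut, and the weights are given rather than trained:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbam(feat, w1, w2):
    """Convolutional Block Attention Module, forward pass only (a sketch).

    feat: (C, H, W); w1: (C//r, C) and w2: (C, C//r) are the shared-MLP
    weights of the channel branch. The spatial branch here just averages
    the channel-wise avg/max maps instead of convolving them.
    """
    C, H, W = feat.shape
    avg, mx = feat.mean(axis=(1, 2)), feat.max(axis=(1, 2))
    ch = sigmoid(w2 @ np.maximum(w1 @ avg, 0) + w2 @ np.maximum(w1 @ mx, 0))
    feat = feat * ch[:, None, None]                      # channel-refined
    sp = sigmoid((feat.mean(axis=0) + feat.max(axis=0)) / 2.0)
    return feat * sp[None, :, :]                         # spatially refined
```

Because both gates lie in (0, 1), the module can only re-weight features, never amplify them, which is why it is cheap to insert without destabilizing the detector.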
26 pages, 2494 KB  
Systematic Review
Project Delivery Methods (PDMs) in BIM Implementation: A Scoping Review
by Filip Ivančić and Mladen Vukomanović
Buildings 2026, 16(8), 1595; https://doi.org/10.3390/buildings16081595 - 18 Apr 2026
Abstract
Building Information Modeling (BIM) supports information integration and coordination across the construction lifecycle, but benefits depend on collaboration that is shaped by the selected project delivery method (PDM). BIM-PDM evidence is difficult to consolidate due to heterogeneous terminology and fragmented, context-specific studies. This scoping review maps which PDMs are addressed in the BIM-related literature and how adequacy is framed. Following PRISMA-ScR, Web of Science and Scopus were searched and 71 studies met the eligibility criteria. Publications increased markedly after 2018 and were geographically concentrated, with the largest shares associated with author affiliations in China, the United Kingdom, Australia, Canada, Malaysia, and the United States. Integrated Project Delivery (IPD) was the most frequently examined (46 studies), followed by Design–Bid–Build (DBB) (29), Design–Build (DB) (29), Public–Private Partnership (PPP) (17), and Engineering, Procurement, and Construction (EPC) (14), while Alliancing, Lean-oriented delivery approaches, and Construction Management were comparatively underrepresented. A temporal analysis indicates a recent shift toward collaborative delivery methods in BIM research. Case-based studies are predominantly situated in public sector projects, with DBB, DB, EPC, and IPD examined across both infrastructure and building contexts, while PPP is limited to infrastructure. The literature is largely focused on design and construction phases, with limited attention to early project stages and operation and maintenance. Results indicate both traditional and relationship-based PDMs are studied in the existing literature, with research framing PDMs that allow for early contractor involvement as most compatible with BIM. Moreover, IPD, DB, and EPC show the best alignment compared to most used traditional DBB methods primarily due to the early involvement of the contractor in the project. 
EPC and DB achieve this through the allocation of responsibility to the contractor, whereas IPD relies on the early engagement of key participants and the systematic alignment of their objectives. Collaborative and relationship-based approaches are consistently presented as the most suitable for BIM, while DBB tends to constrain BIM benefits because of its fragmented nature. This study contributes by providing a systematic synthesis of BIM-PDM relationships in the scientific literature, identifying the key mechanisms underlying the suitability of different delivery methods for BIM implementation, and offering recommendations for future research based on the identified gaps. Full article

40 pages, 7468 KB  
Review
Traffic Flow Prediction in Intelligent Transportation Systems: A Comprehensive Review of Graph Neural Networks and Hybrid Deep Learning Methods
by Zhenhua Wang, Xinmeng Wang, Lijun Wang, Zheng Wu, Jiangang Hu, Fujiang Yuan and Zhen Tian
Algorithms 2026, 19(4), 310; https://doi.org/10.3390/a19040310 - 16 Apr 2026
Abstract
Traffic flow prediction is a key component of Intelligent Transportation Systems (ITS), crucial for alleviating urban congestion, optimizing traffic management, and improving the overall efficiency of road networks. With the rapid growth in vehicle numbers and the increasing complexity of urban traffic patterns, accurate short-term traffic flow prediction has become increasingly important. This paper comprehensively reviews the latest advancements in traffic flow prediction methods, focusing on graph neural network (GNN)-based approaches and hybrid deep learning frameworks. First, we introduce the fundamental theoretical foundations, including graph neural networks, deep learning algorithms, heuristic optimization methods, and attention mechanisms. Subsequently, we summarize GNN-based prediction methods into four paradigms: (1) federated learning and privacy-preserving methods, enabling cross-regional collaboration while protecting sensitive data; (2) dynamically adaptive graph structure methods, capturing time-varying spatial dependencies; (3) multi-graph fusion and attention mechanism methods, enhancing feature representations from multiple perspectives; and (4) cross-domain technology integration methods, fusing novel architectures and interdisciplinary technologies. Furthermore, we investigate hybrid methods combining signal decomposition, heuristic optimization, and attention mechanisms with LSTM networks to address challenges related to non-stationarity and model optimization. For each category, we analyzed representative works and summarized their core innovations, strengths, and limitations using a systematic comparative table. Finally, we discussed current challenges, including computational complexity, model interpretability, and generalization ability, and outlined future research directions such as lightweight model design, uncertainty quantification, multimodal data fusion, and integration with traffic control systems. 
This review provides researchers and practitioners with a systematic understanding of the latest advances in traffic flow prediction and offers guidance for methodological selection and future research. Full article
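The GNN-based methods this review surveys all build on the same basic graph-convolution step: propagate node features over a symmetrically normalized adjacency with self-loops. A minimal single-layer sketch (textbook GCN, not any specific surveyed model):

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer: X' = ReLU(D^-1/2 (A+I) D^-1/2 X W).

    A: (N, N) road-network adjacency, X: (N, F) node features
    (e.g. recent flow readings per sensor), W: (F, F_out) weights.
    """
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))          # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
```

The "dynamically adaptive graph structure" paradigm in the review replaces the fixed `A` here with a matrix inferred at each time step.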

31 pages, 5534 KB  
Article
Precise Identification of Tomato Leaf Diseases: A VMamba-FCS Classification Model Based on Multi-Mechanism Synergistic Enhancement
by Ziming Liu, Zenglin Zhang and Sigao Li
Agriculture 2026, 16(8), 875; https://doi.org/10.3390/agriculture16080875 - 15 Apr 2026
Abstract
To address the challenge of balancing computational efficiency with fine-grained feature capture in complex field environments when using existing deep learning methods for tomato leaf disease detection, this paper proposes a novel lightweight classification model called Visual Mamba with Frequency-channel attention, Cross-layer attention and Salient feature suppression (VMamba-FCS). Based on the visual state-space model, this model integrates three collaborative enhancement mechanisms: a frequency-domain channel attention module, which improves the perception of disease-related textures by recalibrating features in the frequency domain; a cross-layer attention module, which promotes multi-scale feature fusion by integrating the semantic context of early layers; and a salient feature suppression module, which forces the network to learn more comprehensive discriminative features to improve robustness by suppressing overactivated feature regions during training. Experimental results on the real-world field dataset “Tomato-Village” demonstrate that VMamba-FCS achieves a classification accuracy of 93.62% and an inference speed of 126.5 frames per second (FPS) with only 1.20 M parameters, representing a 7.48% improvement in accuracy compared to the basic VMamba model. In the cross-dataset (PlantDoc) generalization test, VMamba-FCS significantly outperformed all comparison models with an accuracy of 71.3%, demonstrating its excellent domain adaptability and robustness. This work verifies the effectiveness of the multi-mechanism collaborative enhancement strategy in the state-space model architecture, providing a new lightweight solution for real-time and accurate agricultural disease detection on resource-constrained edge devices. Full article
(This article belongs to the Section Crop Protection, Diseases, Pests and Weeds)
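The salient feature suppression idea described above, masking over-activated regions during training so the network must learn complementary cues, is easy to state concretely. The sketch below is an assumption-laden stand-in (mean-absolute-activation saliency, a fixed quantile threshold), not the VMamba-FCS module itself:

```python
import numpy as np

def suppress_salient(feat, quantile=0.9):
    """Zero out the most-activated spatial positions of a feature map.

    Training-time regularizer: if the network over-relies on a few salient
    lesion regions, masking them forces it to use other evidence. Sketch only;
    the saliency measure and threshold are illustrative assumptions.
    """
    saliency = np.abs(feat).mean(axis=0)                 # (H, W) per-location score
    thresh = np.quantile(saliency, quantile)
    mask = (saliency < thresh).astype(feat.dtype)        # drop the top (1-q) share
    return feat * mask[None, :, :]
```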

24 pages, 6110 KB  
Article
Research on Medical Image Segmentation Based on Frequency-Domain Enhancement and Edge Awareness
by Jiamin Li, Yazhi Liu and Wei Li
Algorithms 2026, 19(4), 303; https://doi.org/10.3390/a19040303 - 12 Apr 2026
Abstract
Medical images commonly exhibit low contrast, weak boundaries, and complex textures. In addition, significant semantic differences exist between deep-level semantic features and shallow-level detail features, posing challenges for multi-scale feature fusion in terms of detail preservation and structural consistency. To address these issues, a frequency-enhanced and bidirectional feature-guided segmentation network (FBNet) is proposed. The network comprises two core components. The frequency-based enhancement (FBE) module employs the Fast Fourier Transform and applies adaptive modulation to the amplitude spectrum through a content-aware gating mechanism, enhancing detail expression and inter-structural contrast. The Bidirectional Guided Feature Fusion module (BGF) enables bidirectional interaction between shallow and deep features. Additionally, the Structure and Edge Awareness (SEA) module is constructed using directional and variance attention mechanisms to achieve collaborative optimization of structural modeling and edge perception. Experiments on four medical image segmentation datasets show that, compared to the second-best method, FBNet achieves improvements of 2.12, 1.57, 1.37, and 1.56 percentage points on the mIoU metric and 1.54, 1.11, 0.84, and 1.03 percentage points on the mDice metric. Full article
(This article belongs to the Section Evolutionary Algorithms and Machine Learning)
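The core trick of the FBE module, modulating the amplitude spectrum while preserving phase, can be shown in isolation. In FBNet the gains come from a learned content-aware gating network; here they are simply passed in, so this is a sketch of the mechanism, not the module:

```python
import numpy as np

def frequency_enhance(img, gain):
    """Modulate an image's amplitude spectrum while keeping its phase.

    img: (H, W) grayscale; gain: (H, W) multiplicative gates. Because phase
    is preserved, structures stay in place while contrast between frequency
    bands changes -- useful for low-contrast, weak-boundary medical images.
    """
    F = np.fft.fft2(img)
    amp, phase = np.abs(F), np.angle(F)
    F_mod = (amp * gain) * np.exp(1j * phase)     # recombine gated amplitude
    return np.real(np.fft.ifft2(F_mod))
```

A unit gain leaves the image unchanged; boosting only high-frequency bins sharpens boundaries without shifting them.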

27 pages, 667 KB  
Article
A Cross-Modal Temporal Alignment Framework for Artificial Intelligence-Driven Sensing in Multilingual Risk Monitoring
by Hanzhi Sun, Jiarui Zhang, Wei Hong, Yihan Fang, Mengqi Ma, Kehan Shi and Manzhou Li
Sensors 2026, 26(8), 2319; https://doi.org/10.3390/s26082319 - 9 Apr 2026
Abstract
Against the background of highly interconnected global capital markets and rapidly propagating cross-lingual information streams, traditional anomaly detection paradigms based solely on single-modality numerical time-series sensors are insufficient for forward-looking risk sensing. From the perspective of artificial intelligence-driven sensing, this study proposes a multilingual semantic–numerical collaborative Transformer framework to construct a unified multimodal financial sensing architecture for intelligent anomaly sensing and risk perception. Within the proposed sensing paradigm, multilingual texts are conceptualized as semantic sensors that continuously emit event-driven sensing signals, while market prices, trading volumes, and order book dynamics are modeled as heterogeneous numerical sensor streams reflecting behavioral market sensing responses. These heterogeneous sensors are jointly integrated through a cross-modal sensor fusion architecture. A cross-modal temporal alignment attention mechanism is designed to explicitly model dynamic lag structures between semantic sensing signals and numerical sensor responses, enabling temporally adaptive sensor-level alignment and fusion. To enhance sensing robustness, a multilingual semantic noise-robust encoding module is introduced to suppress unreliable textual sensor noise and stabilize cross-lingual semantic sensing representations. Furthermore, a semantic–numerical collaborative risk fusion module is constructed within a shared latent sensing space to achieve adaptive sensor contribution weighting and cross-sensor feature coupling, thereby improving anomaly sensing accuracy and robustness under complex multimodal sensing environments. Extensive experiments conducted on real-world multi-market financial sensing datasets demonstrate that the proposed artificial intelligence-driven sensing framework significantly outperforms representative statistical and deep learning baselines. 
The framework achieves a Precision of 0.852, Recall of 0.781, F1-score of 0.815, and an AUC of 0.892, while substantially improving early warning time in practical risk sensing scenarios. In cross-market transfer settings, the proposed sensing architecture maintains stable anomaly sensing performance under bidirectional domain shifts, with AUC consistently exceeding 0.86, indicating strong structural generalization across heterogeneous sensing environments. Ablation analysis further verifies that temporal sensor alignment, semantic sensor denoising, and collaborative cross-sensor risk coupling contribute independently and synergistically to the overall sensing performance. Overall, this study establishes a scalable multimodal intelligent sensing framework for dynamic financial anomaly sensing, providing an effective artificial intelligence-driven sensing solution for cross-market risk surveillance and adaptive financial signal sensing. Full article
(This article belongs to the Special Issue Artificial Intelligence-Driven Sensing)
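The cross-modal temporal alignment attention described above models the lag between a text (news) signal and the market's numerical response. A minimal numpy sketch of that idea: text-step queries attend over market steps, with a bias that only admits market observations up to `max_lag` steps after the text signal. The bias form and the shared clock are assumptions; the paper learns its alignment weights:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def lag_aware_cross_attention(text_q, market_kv, max_lag=5, lag_decay=0.5):
    """text_q: (T, d) text-sensor embeddings; market_kv: (T, d) market features.

    Each text step attends only to market steps 0..max_lag in its future
    (news first, price reaction later), with an exponential preference for
    shorter lags. Illustrative stand-in, not the paper's learned module."""
    T, d = text_q.shape
    scores = text_q @ market_kv.T / np.sqrt(d)
    i = np.arange(T)[:, None]
    j = np.arange(T)[None, :]
    lag = j - i                                   # how far the market step follows the text
    bias = np.where((lag >= 0) & (lag <= max_lag), -lag_decay * lag, -1e9)
    return softmax(scores + bias, axis=-1) @ market_kv
```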

20 pages, 4468 KB  
Article
Regional Integration, University Resources, and Firm Performance: Evidence from the Yangtze River Delta in China
by Jiawen Zhou, Fei Peng, Qi Chen and Sajid Anwar
Economies 2026, 14(4), 128; https://doi.org/10.3390/economies14040128 - 9 Apr 2026
Abstract
Universities play a critical role in knowledge creation and technological innovation, serving as key drivers of regional development. However, existing research has paid limited attention to the mechanisms through which university innovation inputs translate into firm-level performance, particularly in the context of science and technology corridors in emerging economies. This study investigates how university innovation resources affect enterprise performance in the G60 Science and Technology Corridor within China’s Yangtze River Delta, one of the country’s most dynamic innovation regions. Using a panel dataset of 55 universities across nine cities from 2008 to 2017, we employ spatial analysis and fixed-effects panel regression models to examine the relationship between university innovation inputs and firm performance and further explore the mediating roles of local human capital and firm R&D investment. The results show that university innovation inputs significantly enhance enterprise performance, although excessive human resource inputs exhibit a negative effect on both short-term and long-term outcomes. Local human capital and firm R&D investment serve as key mediating mechanisms, with input and output resources influencing enterprise performance through distinct pathways. Heterogeneity analysis reveals that non-state-owned enterprises and small- and medium-sized enterprises derive greater long-term benefits from university resources. These findings contribute to the literature by clarifying the conceptual distinction between university innovation inputs and outputs, and by demonstrating the micro-level mechanisms—R&D investment and human capital—through which university-generated knowledge affects firm performance. The results also provide empirical evidence from an emerging economic context, extending the applicability of knowledge spillover and absorptive capacity theories. 
Policy implications include optimizing university human resource allocation, strengthening university–enterprise collaboration, and providing targeted support for non-state-owned enterprises and SMEs. Future research may extend the analysis to include institutional factors and university heterogeneity. Full article
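The fixed-effects panel regression this study relies on has a compact textbook form: demean the outcome and regressors within each entity (the "within" transformation) to sweep out time-invariant firm effects, then run OLS. The sketch below shows that estimator only, not the paper's full specification with mediators and controls:

```python
import numpy as np

def fixed_effects_beta(y, X, groups):
    """Within (fixed-effects) estimator.

    y: (n,) outcome (e.g. firm performance), X: (n, k) regressors
    (e.g. university innovation inputs), groups: (n,) entity ids.
    Demeaning inside each entity removes its fixed effect; OLS on the
    demeaned data then recovers the slope coefficients."""
    yd, Xd = y.astype(float).copy(), X.astype(float).copy()
    for g in np.unique(groups):
        m = groups == g
        yd[m] -= yd[m].mean()
        Xd[m] -= Xd[m].mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xd, yd, rcond=None)
    return beta
```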

27 pages, 2114 KB  
Article
MSFE-YOLO: A Steel Surface Defect Detection Algorithm Integrating Multi-Scale Frequency Domain and Defect-Aware Attention
by Siqi Su, Jiale Shen, Peiyi Lin, Wanhe Tang, Weijie Zhang and Zhen Chen
Sensors 2026, 26(8), 2311; https://doi.org/10.3390/s26082311 - 9 Apr 2026
Abstract
Detecting surface defects on steel products is crucial for maintaining quality standards in industrial manufacturing. However, existing detection algorithms face several challenges, including the difficulty of capturing multi-scale defect characteristics with fixed receptive fields, insufficient utilization of defect edge and frequency domain features, and simplistic feature fusion strategies. In response to the above challenges, this paper proposed the Multi-Scale Frequency-Enhanced YOLO (MSFE-YOLO) algorithm that integrates multi-scale frequency domain enhancement with defect-aware attention mechanisms. First, a Multi-Scale Frequency-Enhanced Convolution (MSFC) module was constructed, which extracted multi-scale spatial features in parallel through depth-adaptive dilated convolutions, explicitly modeled high-frequency edge information using the Laplacian operator, and achieved adaptive fusion of multi-branch features via learnable weights. Second, a Cross-Stage Partial with Multi-Scale Defect-Aware Attention (C2MSDA) module was designed, integrating Sobel operator-based edge perception, multi-scale spatial attention, and adaptive channel attention to collaboratively enhance features across spatial, channel, and edge domains through a gated fusion strategy. Finally, an Adaptive Feature Fusion Enhancement (AFFE) module was proposed to achieve adaptive aggregation of multi-level features through a data-driven weight generation network and cross-scale feature interaction mechanism. Experimental results on the NEU-DET and GC10-DET datasets demonstrated that MSFE-YOLO achieved mAP@0.5 of 79.8% and 66.7%, respectively, which were 1.7% and 2.1% higher than the benchmark model YOLOv11s, while maintaining an inference speed of 89.3 FPS, which satisfied the real-time detection requirements in industrial scenarios. Full article
(This article belongs to the Special Issue AI-Based Visual Sensing for Object Detection)
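The Laplacian operator the MSFC module uses to model high-frequency edge information is a fixed 3x3 kernel; applying it is a one-screen exercise. The sketch below uses the standard 4-neighbour kernel with zero padding (border handling is an assumption):

```python
import numpy as np

def laplacian_highfreq(img):
    """High-frequency (edge) response via the 4-neighbour Laplacian kernel,
    the kind of explicit edge prior MSFC-style branches add to learned
    features. img: (H, W); zero padding at the borders."""
    k = np.array([[0,  1, 0],
                  [1, -4, 1],
                  [0,  1, 0]], dtype=float)
    H, W = img.shape
    padded = np.pad(img.astype(float), 1)
    out = np.zeros((H, W))
    for dy in range(3):                 # correlate with the (symmetric) kernel
        for dx in range(3):
            out += k[dy, dx] * padded[dy:dy + H, dx:dx + W]
    return out
```

Flat regions produce zero response while intensity steps produce paired positive/negative responses, which is why the filter isolates defect edges.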

23 pages, 2687 KB  
Article
Eye-Tracking Response Modeling and Design Optimization Method for Smart Home Interface Based on Transformer Attention Mechanism
by Yanping Lu and Myun Kim
Electronics 2026, 15(8), 1562; https://doi.org/10.3390/electronics15081562 - 8 Apr 2026
Abstract
In response to the redundant spatio-temporal modeling and insufficient adaptation to dynamic decision-making in eye-tracking interaction of smart home interfaces, a smart home interface eye-tracking response optimization model based on spatio-temporal Transformer and gate control cross-attention is proposed. It adapts the physiological characteristics of eye-tracking jumps through dynamic sparse attention gating to compress computational redundancy and combines multi-objective reinforcement learning attention modulation to construct a closed-loop decision-making mechanism, optimizing interface parameters in real-time. Experiments showed that the model reduced eye-tracking trajectory prediction error by 23.7% compared to advanced benchmarks, increased the success rate of adapting to dynamic mutation scenarios to 89.2%, and controlled performance fluctuations within 2.3% under noise interference. In high-fidelity user testing, the accuracy of cross-task gaze transfer reached 93.4%, the failure rate of glare interference was optimized to 2.4%, and the user cognitive load index was reduced by 27.9%. Its resource consumption and energy consumption were reduced by 26.7% and 44.9%, respectively, while its posture deviation tolerance remained at 3.5°. The sparse spatio-temporal modeling of the spatio-temporal adaptive Transformer module and the enhanced gating mechanism of the hierarchical gated cross-attention module work together to break through the limitations of traditional methods in computational efficiency and dynamic feedback, providing high-precision and low-latency eye-tracking interaction solutions for smart home interface systems, and promoting the practical evolution of personalized human–machine collaborative control. Full article
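The "dynamic sparse attention gating" motivated above exploits the fact that gaze jumps attend to only a few interface targets at a time. A simple stand-in for that idea, not the paper's exact gating rule, is top-k sparse attention, where each query keeps only its k highest-scoring keys:

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k=4):
    """Scaled dot-product attention where each query row keeps only its k
    highest-scoring keys; all other keys get zero weight. Illustrative
    sketch of sparse attention gating, with k fixed rather than learned."""
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    # mask everything below each row's k-th largest score
    kth = np.sort(scores, axis=1)[:, -k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    e = np.exp(masked - masked.max(axis=1, keepdims=True))   # exp(-inf) -> 0
    return (e / e.sum(axis=1, keepdims=True)) @ V
```

Dropping all but k keys per query reduces the effective attention cost from O(T^2) toward O(kT) when combined with a sparse kernel, which is the computational-redundancy argument the abstract makes.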
30 pages, 4987 KB  
Article
AT-BSS: A Broker Selection Strategy for Efficient Cross-Shard Processing in Sharded IoT–Blockchain Systems
by Yue Su, Yang Xiang, Kien Nguyen and Hiroo Sekiya
Sensors 2026, 26(8), 2296; https://doi.org/10.3390/s26082296 - 8 Apr 2026
Abstract
The deep integration of the Internet of Things (IoT) and blockchain technology enables emerging applications in multi-party collaboration and trusted data sharing. However, the scalability constraints of blockchain networks remain a major bottleneck when handling high-frequency interactions in IoT–blockchain systems. Sharding addresses this challenge by partitioning the blockchain network into parallel sub-networks. Nevertheless, it introduces significant coordination overhead for cross-shard transactions. Among mitigation strategies, Broker-based mechanisms (e.g., BrokerChain) have attracted increasing attention for their efficiency in handling cross-shard communication by reducing verification overhead and communication latency. Despite these advantages, existing research typically treats the Broker group as a fixed configuration, neglecting the impact of Broker selection on system performance. To bridge this gap, this paper proposes the Accumulative Activity–Temporal Liveness Broker Selection Strategy (AT-BSS) to optimize cross-shard transaction processing in sharded IoT–blockchains. Specifically, we formally characterize the Accumulative Activity and Temporal Liveness of accounts in the account–transaction network and use these two metrics to identify accounts that maximize transaction-aggregation efficiency. We implement AT-BSS on the BlockEmulator platform and evaluate it against two baselines, namely, ABChain and BrokerChain. Under different settings of the number of Brokers (BrokerNum), number of shards (ShardNum), transaction arrival rate (InjectSpeed), and maximum block size (MaxBlockSize), AT-BSS consistently outperforms both baselines in terms of Transactions Per Second (TPS), Transaction Confirmation Latency (TCL), and Cross-shard Transaction Ratio (CTX). Compared with ABChain, AT-BSS achieves up to 15.5% higher TPS and reduces TCL and CTX by up to 80.2% and 28.7%, respectively. AT-BSS yields more pronounced results over BrokerChain, with TPS improvements of up to 229% and reductions of up to 97.7% in TCL and 80.5% in CTX. Full article
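The selection idea behind AT-BSS can be sketched as scoring each account by two signals, one loosely analogous to Accumulative Activity (total transactions an account touches) and one to Temporal Liveness (recency-weighted activity), then picking the top BrokerNum accounts. The exponential decay, equal mixing weights, and `half_life` below are invented for illustration and are not the paper's formal definitions.

```python
from collections import defaultdict

def select_brokers(transactions, now, broker_num=2, half_life=10.0):
    """Rank accounts by a mix of total activity and recency-weighted
    liveness, and return the top broker_num candidates."""
    activity = defaultdict(int)      # transactions touched, all time
    liveness = defaultdict(float)    # recency-weighted activity
    for sender, receiver, t in transactions:
        decay = 0.5 ** ((now - t) / half_life)   # recent txs weigh more
        for acct in (sender, receiver):
            activity[acct] += 1
            liveness[acct] += decay
    # normalize each metric to [0, 1] and mix with equal weights (assumed)
    max_a = max(activity.values())
    max_l = max(liveness.values())
    score = {a: 0.5 * activity[a] / max_a + 0.5 * liveness[a] / max_l
             for a in activity}
    return sorted(score, key=score.get, reverse=True)[:broker_num]

# (sender, receiver, timestamp): A is busiest overall, C is most recent
txs = [("A", "B", 1), ("A", "C", 2), ("B", "C", 9), ("A", "D", 10)]
brokers = select_brokers(txs, now=10, broker_num=2)
```

A practical strategy would recompute these scores per epoch so the Broker set tracks shifting transaction patterns, which is the point of moving away from a fixed Broker group.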
23 pages, 3301 KB  
Article
Hierarchical Active Perception and Stability Control for Multi-Robot Collaborative Search in Unknown Environments
by Zeyu Xu, Kai Xue, Ping Wang and Decheng Kong
Actuators 2026, 15(4), 209; https://doi.org/10.3390/act15040209 - 7 Apr 2026
Abstract
Multi-robot systems (MRS) have attracted considerable research attention owing to their widespread application across diverse environments. However, two problems often arise in multi-robot collaborative search tasks: sparse rewards for capturing targets and control oscillations. To address these issues, this paper proposes the hierarchical active perception multi-agent deep deterministic policy gradient (HAP-MADDPG) framework. The framework guides robots to explore maps and discover targets efficiently through global utility planning driven by the global exploration rate and local information aggregation driven by the local exploration rate. A stability control mechanism combining hysteresis logic and reward decay is introduced to suppress control oscillations. Experimental results show that HAP-MADDPG achieves a success rate of 96.25% and an average search time of 216.3 steps, with smooth path trajectories, demonstrating the effectiveness of the proposed approach. Full article
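The hysteresis logic mentioned for suppressing control oscillations can be sketched as a switching rule: an agent abandons its current action only when an alternative's value exceeds the current action's value by a margin, so two near-equal actions do not cause rapid flip-flopping. The margin value is an illustrative constant, not one from the paper.

```python
def hysteresis_select(q_values, prev_action, margin=0.15):
    """Pick an action with hysteresis: switch away from prev_action
    only when the best alternative is better by more than `margin`;
    otherwise hold the current action to avoid control oscillation."""
    best = max(range(len(q_values)), key=q_values.__getitem__)
    if prev_action is None:
        return best                      # no history: take the greedy action
    if q_values[best] - q_values[prev_action] > margin:
        return best                      # clearly better: allow the switch
    return prev_action                   # inside the hysteresis band: hold

# near-equal values: the agent holds its current action
held = hysteresis_select([0.50, 0.52], prev_action=0)
# clearly better alternative: the agent switches
switched = hysteresis_select([0.50, 0.80], prev_action=0)
```

The same band idea appears in thermostats and comparator circuits; here it trades a little greediness for smooth trajectories.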
20 pages, 12712 KB  
Article
Large-Scale Airborne LiDAR Point Cloud Building Extraction Based on Improved Voxelized Deep Learning Network
by Bai Xue, Yanru Song, Pi Ai, Hongzhou Li, Shuhan Liu and Li Guo
Buildings 2026, 16(7), 1450; https://doi.org/10.3390/buildings16071450 - 7 Apr 2026
Abstract
High-precision 3D building data are pivotal for smart city development, urban planning, and disaster management. However, large-scale building extraction from airborne LiDAR point clouds remains challenging due to semantic ambiguity, uneven point density, and complex architectural structures. To address these limitations, we propose a novel framework integrating geometric topology perception with cross-dimensional attention mechanisms within a Sparse Voxel Convolutional Neural Network (SPVCNN). The key contributions include: (1) an enhanced LaserMix++ multi-scale hybrid augmentation strategy featuring cross-scene block replacement, ground normal–constrained rotation, and non-uniform scaling; (2) a dual-branch SPVCNN architecture embedding a collaborative module of Geometric Self-Attention (GSA) and Cross-Space Residual Attention (CSRA) to preserve topological consistency and enable cross-dimensional feature interaction; and (3) a Boundary Enhancement Module (BEM) specifically designed to resolve boundary ambiguity and overlapping predictions. Evaluated on a 177 km² dataset covering Washington, D.C., our method significantly outperforms the baseline SPVCNN, improving accuracy by 12.04 percentage points (0.8212 to 0.9416) and Intersection over Union (IoU) by 9.96 percentage points (0.866 to 0.9656). Furthermore, it surpasses mainstream networks such as Cylinder3D and MinkResNet by over 50% in absolute accuracy gain. These results demonstrate the effectiveness of synergistically combining geometric perception with adaptive attention for robust building extraction from large-scale LiDAR data. Full article
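Before any of the attention modules above can run, a sparse voxel network such as SPVCNN first maps the raw point cloud onto a sparse voxel grid. A minimal sketch of that first step, assuming an arbitrary `voxel_size` (the paper's actual preprocessing pipeline is not reproduced here):

```python
import numpy as np

def voxelize(points, voxel_size=0.5):
    """Map each 3D point to an integer voxel index and deduplicate:
    returns the occupied voxel coordinates and, for every input point,
    the index of the voxel it falls into. This is the sparse-grid
    representation that sparse convolutions then operate on."""
    idx = np.floor(points / voxel_size).astype(np.int64)   # (N, 3) voxel coords
    voxels, inverse = np.unique(idx, axis=0, return_inverse=True)
    return voxels, inverse.reshape(-1)

pts = np.array([[0.1, 0.1, 0.1],   # these two points share a voxel
                [0.2, 0.3, 0.0],
                [1.4, 0.0, 0.0]])  # this one occupies a second voxel
voxels, assign = voxelize(pts)
```

Only occupied voxels are stored, which is what keeps memory tractable on a 177 km² airborne survey where the overwhelming majority of the 3D grid is empty.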
(This article belongs to the Section Construction Management, and Computers & Digitization)