Search Results (191)

Search Parameters:
Keywords = Wise-IoU

28 pages, 17257 KB  
Article
A Box-Based Method for Regularizing the Prediction of Semantic Segmentation of Building Facades
by Shuyu Liu, Zhihui Wang, Yuexia Hu, Xiaoyu Zhao and Si Zhang
Buildings 2025, 15(19), 3562; https://doi.org/10.3390/buildings15193562 - 2 Oct 2025
Viewed by 240
Abstract
Semantic segmentation of building facade images has enabled a lot of intelligent support for architectural research and practice in the last decade. However, the classifiers for semantic segmentation usually predict facade elements (e.g., windows) as graphics in irregular shapes. The non-smooth edges and hard-to-define shapes impede the further use of the predicted graphics. This study proposes a method to regularize the predicted graphics following the prior knowledge of composition principles of building facades. Specifically, we define four types of boxes for each predicted graphic, namely minimum circumscribed box (MCB), maximum inscribed box (MIB), candidate box (CB), and best overlapping box (BOB). Based on these boxes, a three-stage process, consisting of denoising, BOB finding, and BOB stacking, was established to regularize the predicted graphics of facade elements into basic rectilinear polygons. To compare the proposed and existing methods of graphic regularization, an experiment was conducted based on the predicted graphics of facade elements obtained from four pixel-wise annotated building facade datasets, Irregular Facades (IRFs), CMP Facade Database, ECP Paris, and ICG Graz50. The results demonstrate that the graphics regularized by our method align more closely with real facade elements in shape and edge. Moreover, our method avoids the prevalent issue of correctness degradation observed in existing methods. Compared with the predicted graphics, the average IoU and F1-score of our method-regularized graphics respectively increase by 0.001–0.017 and 0.000–0.012 across the datasets, while those of previous method-regularized graphics decrease by 0.002–0.021 and 0.002–0.015. The regularized graphics contribute to improving the precision and depth of semantic segmentation-based applications of building facades. They are also expected to be useful for the exploration of data mining on urban images in the future. Full article
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)
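Nearly every result in this listing evaluates segmentation quality with Intersection-over-Union (IoU) and F1/Dice. As a reference point only (not any paper's actual evaluation code), these two metrics can be computed for binary masks as follows:

```python
import numpy as np

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """Intersection-over-Union of two boolean masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def f1(pred: np.ndarray, gt: np.ndarray) -> float:
    """F1 (equivalently Dice) score: 2*|A∩B| / (|A| + |B|)."""
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2 * inter / total if total else 1.0

# toy masks: intersection = 2 pixels, union = 4 pixels
pred = np.array([[1, 1, 0], [0, 1, 0]], dtype=bool)
gt   = np.array([[1, 0, 0], [0, 1, 1]], dtype=bool)
# iou(pred, gt) → 0.5; f1(pred, gt) → 2*2/(3+3) ≈ 0.667
```

F1 and Dice are the same quantity under different names, which is why papers above report them interchangeably.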

18 pages, 1985 KB  
Article
AI-Enhanced Deep Learning Framework for Pulmonary Embolism Detection in CT Angiography
by Nan-Han Lu, Chi-Yuan Wang, Kuo-Ying Liu, Yung-Hui Huang and Tai-Been Chen
Bioengineering 2025, 12(10), 1055; https://doi.org/10.3390/bioengineering12101055 - 29 Sep 2025
Viewed by 297
Abstract
Pulmonary embolism (PE) on CT pulmonary angiography (CTPA) demands rapid, accurate assessment, yet small, low-contrast clots in distal arteries remain challenging. We benchmarked ten fully convolutional network (FCN) backbones and introduced Consensus Intersection-Optimized Fusion (CIOF)—a K-of-M, pixel-wise mask fusion with the voting threshold K* selected on training patients to maximize IoU. Using the FUMPE cohort (35 patients; 12,034 slices) with patient-based random splits (18 train, 17 test), we trained five FCN architectures (each with Adam and SGDM) and evaluated segmentation with IoU, Dice, FNR/FPR, and latency. CIOF achieved the best overall performance (mean IoU 0.569; mean Dice 0.691; FNR 0.262), albeit with a higher runtime (~63.7 s per case) because all ten models are executed and fused; the strongest single backbone was Inception-ResNetV2 + SGDM (IoU 0.530; Dice 0.648). Stratified by embolization ratio, CIOF remained superior across <10−4, 10−4–10−3, and >10−3 clot burdens, with mean IoU/Dice = 0.238/0.328, 0.566/0.698, and 0.739/0.846, respectively—demonstrating gains for tiny, subsegmental emboli. These results position CIOF as an accuracy-oriented, interpretable ensemble for offline or second-reader use, while faster single backbones remain candidates for time-critical triage. Full article
(This article belongs to the Section Biosignal Processing)
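The CIOF idea above — a pixel-wise K-of-M vote across M model masks, with the voting threshold K* chosen on training data to maximize IoU — can be sketched roughly as follows. This is an illustrative reconstruction under stated assumptions (function names, boolean-mask input, single ground-truth image), not the authors' code:

```python
import numpy as np

def fuse_k_of_m(masks: np.ndarray, k: int) -> np.ndarray:
    """K-of-M consensus: keep pixels predicted positive by >= k of the
    M masks. `masks` stacks one boolean mask per model along axis 0."""
    return masks.sum(axis=0) >= k

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0

def select_k(train_masks: np.ndarray, train_gt: np.ndarray) -> int:
    """Choose the voting threshold K* that maximizes training IoU."""
    m = train_masks.shape[0]
    return max(range(1, m + 1),
               key=lambda k: mask_iou(fuse_k_of_m(train_masks, k), train_gt))

# toy example: three model masks over four pixels, ground truth [1,1,0,0]
masks = np.array([[1, 1, 1, 0],
                  [1, 1, 0, 0],
                  [0, 1, 0, 1]], dtype=bool)
gt = np.array([1, 1, 0, 0], dtype=bool)
# votes per pixel are [2, 3, 1, 1], so a threshold of k=2 recovers gt exactly
```

The runtime cost noted in the abstract follows directly from this design: all M backbones must be run before the vote can be taken.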

20 pages, 1860 KB  
Article
An Improved YOLOv11n Model Based on Wavelet Convolution for Object Detection in Soccer Scenes
by Yue Wu, Lanxin Geng, Xinqi Guo, Chao Wu and Gui Yu
Symmetry 2025, 17(10), 1612; https://doi.org/10.3390/sym17101612 - 28 Sep 2025
Viewed by 226
Abstract
Object detection in soccer scenes serves as a fundamental task for soccer video analysis and target tracking. This paper proposes WCC-YOLO, a symmetry-enhanced object detection framework based on YOLOv11n. Our approach integrates symmetry principles at multiple levels: (1) The novel C3k2-WTConv module synergistically combines conventional convolution with wavelet decomposition, leveraging the orthogonal symmetry of Haar wavelet quadrature mirror filters (QMFs) to achieve balanced frequency-domain decomposition and enhance multi-scale feature representation. (2) The Channel Prior Convolutional Attention (CPCA) mechanism incorporates symmetrical operations—using average-max pooling pairs in channel attention and multi-scale convolutional kernels in spatial attention—to automatically learn to prioritize semantically salient regions through channel-wise feature recalibration, thereby enabling balanced feature representation. Coupled with InnerShape-IoU for refined bounding box regression, WCC-YOLO achieves a 4.5% improvement in mAP@0.5:0.95 and a 5.7% gain in mAP@0.5 compared to the baseline YOLOv11n while simultaneously reducing the number of parameters and maintaining near-identical inference latency (δ < 0.1 ms). This work demonstrates the value of explicit symmetry-aware modeling for sports analytics. Full article
(This article belongs to the Section Computer)

30 pages, 14129 KB  
Article
Evaluating Two Approaches for Mapping Solar Installations to Support Sustainable Land Monitoring: Semantic Segmentation on Orthophotos vs. Multitemporal Sentinel-2 Classification
by Adolfo Lozano-Tello, Andrés Caballero-Mancera, Jorge Luceño and Pedro J. Clemente
Sustainability 2025, 17(19), 8628; https://doi.org/10.3390/su17198628 - 25 Sep 2025
Viewed by 324
Abstract
This study evaluates two approaches for detecting solar photovoltaic (PV) installations across agricultural areas, emphasizing their role in supporting sustainable energy monitoring, land management, and planning. Accurate PV mapping is essential for tracking renewable energy deployment, guiding infrastructure development, assessing land-use impacts, and informing policy decisions aimed at reducing carbon emissions and fostering climate resilience. The first approach applies deep learning-based semantic segmentation to high-resolution RGB orthophotos, using the pretrained “Solar PV Segmentation” model, which achieves an F1-score of 95.27% and an IoU of 91.04%, providing highly reliable PV identification. The second approach employs multitemporal pixel-wise spectral classification using Sentinel-2 imagery, where the best-performing neural network achieved a precision of 99.22%, a recall of 96.69%, and an overall accuracy of 98.22%. Both approaches coincided in detecting 86.67% of the identified parcels, with an average surface difference of less than 6.5 hectares per parcel. The Sentinel-2 method leverages its multispectral bands and frequent revisit rate, enabling timely detection of new or evolving installations. The proposed methodology supports the sustainable management of land resources by enabling automated, scalable, and cost-effective monitoring of solar infrastructures using open-access satellite data. This contributes directly to the goals of climate action and sustainable land-use planning and provides a replicable framework for assessing human-induced changes in land cover at regional and national scales. Full article

27 pages, 5776 KB  
Article
R-SWTNet: A Context-Aware U-Net-Based Framework for Segmenting Rural Roads and Alleys in China with the SQVillages Dataset
by Jianing Wu, Junqi Yang, Xiaoyu Xu, Ying Zeng, Yan Cheng, Xiaodong Liu and Hong Zhang
Land 2025, 14(10), 1930; https://doi.org/10.3390/land14101930 - 23 Sep 2025
Viewed by 297
Abstract
Rural road networks are vital for rural development, yet narrow alleys and occluded segments remain underrepresented in digital maps due to irregular morphology, spectral ambiguity, and limited model generalization. Traditional segmentation models struggle to balance local detail preservation and long-range dependency modeling, prioritizing either local features or global context alone. Hypothesizing that integrating hierarchical local features and global context will mitigate these limitations, this study aims to accurately segment such rural roads by proposing R-SWTNet, a context-aware U-Net-based framework, and constructing the SQVillages dataset. R-SWTNet integrates ResNet34 for hierarchical feature extraction, Swin Transformer for long-range dependency modeling, ASPP for multi-scale context fusion, and CAM-Residual blocks for channel-wise attention. The SQVillages dataset, built from multi-source remote sensing imagery, includes 18 diverse villages with adaptive augmentation to mitigate class imbalance. Experimental results show R-SWTNet achieves a validation IoU of 54.88% and F1-score of 70.87%, outperforming U-Net and Swin-UNet, and with less overfitting than R-Net and D-LinkNet. Its lightweight variant supports edge deployment, enabling on-site road management. This work provides a data-driven tool for infrastructure planning under China’s Rural Revitalization Strategy, with potential scalability to global unstructured rural road scenes. Full article
(This article belongs to the Section Land Innovations – Data and Machine Learning)

28 pages, 14913 KB  
Article
Turning Seasonal Signals into Segmentation Cues: Recolouring the Harmonic Normalized Difference Vegetation Index for Agricultural Field Delineation
by Filip Papić, Luka Rumora, Damir Medak and Mario Miler
Sensors 2025, 25(18), 5926; https://doi.org/10.3390/s25185926 - 22 Sep 2025
Viewed by 318
Abstract
Accurate delineation of fields is difficult in fragmented landscapes where single-date images provide no seasonal cues and supervised models require labels. We propose a method that explicitly represents phenology to improve zero-shot delineation. Using 22 cloud-free PlanetScope scenes over a 5 × 5 km area, a single harmonic model is fitted to the NDVI per pixel to obtain the phase, amplitude and mean. These values are then mapped into cylindrical colour spaces (Hue–Saturation–Value, Hue–Whiteness–Blackness, Luminance-Chroma-Hue). The resulting recoloured composites are segmented using the Segment Anything Model (SAM), without fine-tuning. The results are evaluated object-wise, object-wise grouped by area size, and pixel-wise. Pixel-wise evaluation achieved up to F1 = 0.898, and a mean Intersection-over-Union (mIoU) of 0.815, while object-wise performance reached F1 = 0.610. HSV achieved the strongest area match, while HWB produced the fewest fragments. The ordinal time-of-day basis provided better parcel separability than the annual radian adjustment. The main errors were over-segmentation and fragmentation. As the parcel size increased, the IoU increased, but the precision decreased. It is concluded that recolouring using harmonic NDVI time series is a simple, scalable, and interpretable basis for field delineation that can be easily improved. Full article
(This article belongs to the Special Issue Sensors and Data-Driven Precision Agriculture—Second Edition)
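The per-pixel harmonic model described above (a phase, amplitude, and mean summarizing one annual NDVI cycle) can be fitted with ordinary least squares against cosine/sine basis functions. A minimal single-pixel sketch — the 365-day period, sample count, and function name are assumptions for illustration:

```python
import numpy as np

def fit_harmonic(t: np.ndarray, ndvi: np.ndarray, period: float = 365.0):
    """Least-squares fit of ndvi(t) ~ mean + a*cos(wt) + b*sin(wt) for one
    pixel; returns (phase, amplitude, mean) of the annual cycle."""
    w = 2.0 * np.pi * t / period
    X = np.column_stack([np.ones_like(t), np.cos(w), np.sin(w)])
    mean, a, b = np.linalg.lstsq(X, ndvi, rcond=None)[0]
    return np.arctan2(b, a), np.hypot(a, b), mean

# synthetic pixel: 22 acquisitions over one year, peak shifted by 1 radian
t = np.linspace(0.0, 365.0, 22)
ndvi = 0.5 + 0.3 * np.cos(2.0 * np.pi * t / 365.0 - 1.0)
phase, amplitude, mean = fit_harmonic(t, ndvi)
# phase ≈ 1.0, amplitude ≈ 0.3, mean ≈ 0.5
```

The three fitted values are then what gets recoloured: phase maps naturally to the angular channel (hue) of a cylindrical colour space, with amplitude and mean filling the remaining two channels, before the composite is handed to SAM.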

21 pages, 5771 KB  
Article
SCOPE: Spatial Context-Aware Pointcloud Encoder for Denoising Under the Adverse Weather Conditions
by Hyeong-Geun Kim
Appl. Sci. 2025, 15(18), 10113; https://doi.org/10.3390/app151810113 - 16 Sep 2025
Viewed by 324
Abstract
Reliable LiDAR point clouds are essential for perception in robotics and autonomous driving. However, adverse weather conditions introduce substantial noise that significantly degrades perception performance. To tackle this challenge, we first introduce a novel, point-wise annotated dataset of over 800 scenes, created by collecting and comparing point clouds from real-world adverse and clear weather conditions. Building upon this comprehensive dataset, we propose the Spatial Context-Aware Point Cloud Encoder Network (SCOPE), a deep learning framework that identifies noise by effectively learning spatial relationships from sparse point clouds. SCOPE partitions the input into voxels and utilizes a Voxel Spatial Feature Extractor with contrastive learning to distinguish weather-induced noise from structural points. Experimental results validate SCOPE’s effectiveness, achieving high Intersection-over-Union (mIoU) scores in snow (88.66%), rain (92.33%), and fog (88.77%), with a mean mIoU of 89.92%. These consistent results across diverse scenarios confirm the robustness and practical effectiveness of our method in challenging environments. Full article
(This article belongs to the Special Issue AI-Aided Intelligent Vehicle Positioning in Urban Areas)

15 pages, 1836 KB  
Article
Public Security Patrol and Alert Recognition for Police Patrol Robots Based on Improved YOLOv8 Algorithm
by Yuehan Shi, Xiaoming Zhang, Qilei Wang and Xiaojun Liu
Math. Comput. Appl. 2025, 30(5), 97; https://doi.org/10.3390/mca30050097 - 10 Sep 2025
Viewed by 511
Abstract
Addressing the prevalent challenges of inadequate detection accuracy and sluggish detection speed encountered by police patrol robots during security patrols, we propose an innovative algorithm based on the YOLOv8 model. Our method consists of substituting the backbone network of YOLOv8 with FasterNet. As a result, the model’s ability to identify accurately is enhanced, and its computational performance is improved. Additionally, the extraction of geographical data becomes more efficient. In addition, we introduce the BiFormer attention mechanism, incorporating dynamic sparse attention to significantly improve algorithm performance and computational efficiency. Furthermore, to bolster the regression performance of bounding boxes and enhance detection robustness, we utilize Wise-IoU as the loss function. Through experimentation across three perilous police scenarios—fighting, knife threats, and gun incidents—we demonstrate the efficacy of our proposed algorithm. The results indicate notable improvements over the original model, with enhancements of 2.42% and 5.83% in detection accuracy and speed for behavioral recognition of fighting, 2.87% and 4.67% for knife threat detection, and 3.01% and 4.91% for gun-related situation detection, respectively. Full article
(This article belongs to the Section Engineering)
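Wise-IoU — the keyword behind this search — augments the plain IoU loss with a distance-based focusing factor computed from the smallest enclosing box. A simplified forward-pass sketch of the v1 formulation (several papers above use v3, which adds a dynamic non-monotonic gradient-gain term on top; the box format and names here are assumptions):

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def wiou_v1_loss(pred, gt):
    """Wise-IoU v1 style loss: (1 - IoU) scaled by R = exp(d^2 / c^2),
    where d is the distance between box centers and c the diagonal of the
    smallest enclosing box (treated as a constant, i.e. detached from the
    gradient in the original formulation)."""
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    d2 = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    wg = max(pred[2], gt[2]) - min(pred[0], gt[0])   # enclosing width
    hg = max(pred[3], gt[3]) - min(pred[1], gt[1])   # enclosing height
    r = np.exp(d2 / (wg ** 2 + hg ** 2))
    return r * (1.0 - box_iou(pred, gt))
```

Because R grows with the center offset, poorly localized boxes receive larger gradients than the plain (1 − IoU) loss would give them, which is the intuition the abstracts above appeal to when citing improved bounding-box regression.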

28 pages, 5402 KB  
Article
Real-Time Strawberry Ripeness Classification and Counting: An Optimized YOLOv8s Framework with Class-Aware Multi-Object Tracking
by Oluwasegun Moses Ogundele, Niraj Tamrakar, Jung-Hoo Kook, Sang-Min Kim, Jeong-In Choi, Sijan Karki, Timothy Denen Akpenpuun and Hyeon Tae Kim
Agriculture 2025, 15(18), 1906; https://doi.org/10.3390/agriculture15181906 - 9 Sep 2025
Viewed by 810
Abstract
Accurate fruit counting is crucial for data-driven decision-making in modern precision agriculture. In strawberry cultivation, a labor-intensive sector, automated, scalable yield estimation is especially critical. However, dense foliage, variable lighting, visual ambiguity of ripeness stages, and fruit clustering pose significant challenges. To overcome these, we developed a real-time multi-stage framework for strawberry detection and counting by optimizing a YOLOv8s detector and integrating a class-aware tracking system. The detector was enhanced with a lightweight C3x module, an additional detection head for small objects, and the Wise-IOU (WIoU) loss function, thereby improving performance against occlusion. Our final model achieved a 92.5% mAP@0.5, outperforming the baseline while reducing the number of parameters by 27.9%. This detector was integrated with the ByteTrack multiple object tracking (MOT) algorithm. Our system enabled accurate, automated fruit counting in complex greenhouse environments. When validated on video data, results showed a strong correlation with ground-truth counts (R2 = 0.914) and a low mean absolute percentage error (MAPE) of 9.52%. Counting accuracy was highest for ripe strawberries (R2 = 0.950), confirming the value for harvest-ready estimation. This work delivers an efficient, accurate, and resource-conscious solution for automated yield monitoring in commercial strawberry production. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

20 pages, 8561 KB  
Article
LCW-YOLO: An Explainable Computer Vision Model for Small Object Detection in Drone Images
by Dan Liao, Rengui Bi, Yubi Zheng, Cheng Hua, Liangqing Huang, Xiaowen Tian and Bolin Liao
Appl. Sci. 2025, 15(17), 9730; https://doi.org/10.3390/app15179730 - 4 Sep 2025
Viewed by 1267
Abstract
Small targets in drone imagery are often difficult to accurately locate and identify due to scale imbalance and limitations, such as pixel representation and dynamic environmental interference, and the balance between detection accuracy and resource consumption of the model also poses challenges. Therefore, we propose an interpretable computer vision framework based on YOLOv12m, called LCW-YOLO. First, we adopt multi-scale heterogeneous convolutional kernels to improve the lightweight channel-level and spatial attention combined context (LA2C2f) structure, enhancing spatial perception capabilities while reducing model computational load. Second, to enhance feature fusion capabilities, we propose the Convolutional Attention Integration Module (CAIM), enabling the fusion of original features across channels, spatial dimensions, and layers, thereby strengthening contextual attention. Finally, the model incorporates Wise-IoU (WIoU) v3, which dynamically allocates loss weights for detected objects. This allows the model to adjust its focus on samples of average quality during training based on object difficulty, thereby improving the model’s generalization capabilities. According to experimental results, LCW-YOLO eliminates 0.4 M parameters and improves mAP@0.5 by 3.3% on the VisDrone2019 dataset when compared to YOLOv12m. And the model improves mAP@0.5 by 1.9% on the UAVVaste dataset. In the task of identifying small objects with drones, LCW-YOLO, as an explainable AI (XAI) model, provides visual detection results and effectively balances accuracy, lightweight design, and generalization capabilities. Full article
(This article belongs to the Special Issue Explainable Artificial Intelligence Technology and Its Applications)

38 pages, 13994 KB  
Article
Post-Heuristic Cancer Segmentation Refinement over MRI Images and Deep Learning Models
by Panagiotis Christakakis and Eftychios Protopapadakis
AI 2025, 6(9), 212; https://doi.org/10.3390/ai6090212 - 2 Sep 2025
Viewed by 906
Abstract
Lately, deep learning methods have greatly improved the accuracy of brain-tumor segmentation, yet slice-wise inconsistencies still limit reliable use in clinical practice. While volume-aware 3D convolutional networks achieve high accuracy, their memory footprint and inference time may limit clinical adoption. This study proposes a resource-conscious pipeline for lower-grade-glioma delineation in axial FLAIR MRI that combines a 2D Attention U-Net with a guided post-processing refinement step. Two segmentation backbones, a vanilla U-Net and an Attention U-Net, are trained on 110 TCGA-LGG axial FLAIR patient volumes under various loss functions and activation functions. The Attention U-Net, optimized with Dice loss, delivers the strongest baseline, achieving a mean Intersection-over-Union (mIoU) of 0.857. To mitigate slice-wise inconsistencies inherent to 2D models, a White-Area Overlap (WAO) voting mechanism quantifies the tumor footprint shared by neighboring slices. The WAO curve is smoothed with a Gaussian filter to locate its peak, after which a percentile-based heuristic selectively relabels the most ambiguous softmax pixels. Cohort-level analysis shows that removing merely 0.1–0.3% of ambiguous low-confidence pixels lifts the post-processing mIoU above the baseline while improving segmentation for two-thirds of patients. The proposed refinement strategy holds great potential for further improvement, offering a practical route for integrating deep learning segmentation into routine clinical workflows with minimal computational overhead. Full article

21 pages, 13392 KB  
Article
YOLO-HDEW: An Efficient PCB Defect Detection Model
by Chuanwang Song, Yuanteng Zhou, Yinghao Ma, Qingshuo Qi, Zhaoyu Wang and Keyong Hu
Electronics 2025, 14(17), 3383; https://doi.org/10.3390/electronics14173383 - 26 Aug 2025
Viewed by 897
Abstract
To address the challenge of detecting small defects in Printed Circuit Boards (PCBs), a YOLO-HDEW model based on the enhanced YOLOv8 architecture is proposed. A high-resolution detection layer is introduced at the P2 feature level to improve sensitivity to small targets. Depthwise Separable Convolution (DSConv) is used for downsampling, reducing parameter complexity. An Edge-enhanced Multi-scale Parallel Attention mechanism (EMP-Attention) is integrated to capture multi-scale and edge features. The EMP mechanism is incorporated into the C2f module to form the C2f-EMP module, and dynamic non-monotonic Wise-IoU (W-IoU) loss is employed to enhance bounding box regression. The model is evaluated on the PKU-Market-PCB, DeepPCB, and NEU-DET datasets, with experimental results showing that YOLO-HDEW achieves 98.1% accuracy, 91.6% recall, 90.3% mAP@0.5, and 61.7% mAP@0.5:0.95, surpassing YOLOv8 by 1.5%, 2.3%, 1.2%, and 1.9%, respectively. Additionally, the model demonstrates strong generalization performance on the DeepPCB and NEU-DET datasets. These results indicate that YOLO-HDEW significantly improves detection accuracy while maintaining a manageable model size, offering an effective solution for PCB defect detection. Full article
(This article belongs to the Section Artificial Intelligence)

18 pages, 2565 KB  
Article
Rock Joint Segmentation in Drill Core Images via a Boundary-Aware Token-Mixing Network
by Seungjoo Lee, Yongjin Kim, Yongseong Kim, Jongseol Park and Bongjun Ji
Buildings 2025, 15(17), 3022; https://doi.org/10.3390/buildings15173022 - 25 Aug 2025
Viewed by 506
Abstract
The precise mapping of rock joint traces is fundamental to the design and safety assessment of foundations, retaining structures, and underground cavities in building and civil engineering. Existing deep learning approaches either impose prohibitive computational demands for on-site deployment or disrupt the topological continuity of subpixel lineaments that govern rock mass behavior. This study presents BATNet-Lite, a lightweight encoder–decoder architecture optimized for joint segmentation on resource-constrained devices. The encoder introduces a Boundary-Aware Token-Mixing (BATM) block that separates feature maps into patch tokens and directionally pooled stripe tokens, and a bidirectional attention mechanism subsequently transfers global context to local descriptors while refining stripe features, thereby capturing long-range connectivity with negligible overhead. A complementary Multi-Scale Line Enhancement (MLE) module combines depth-wise dilated and deformable convolutions to yield scale-invariant responses to joints of varying apertures. In the decoder, a Skeletal-Contrastive Decoder (SCD) employs dual heads to predict segmentation and skeleton maps simultaneously, while an InfoNCE-based contrastive loss enforces their topological consistency without requiring explicit skeleton labels. Training leverages a composite focal Tversky and edge IoU loss under a curriculum-thinning schedule, improving edge adherence and continuity. Ablation experiments confirm that BATM, MLE, and SCD each contribute substantial gains in boundary accuracy and connectivity preservation. By delivering topology-preserving joint maps with small parameters, BATNet-Lite facilitates rapid geological data acquisition for tunnel face mapping, slope inspection, and subsurface digital twin development, thereby supporting safer and more efficient building and underground engineering practice. Full article

19 pages, 2306 KB  
Article
Optimized Adaptive Multi-Scale Architecture for Surface Defect Recognition
by Xueli Chang, Yue Wang, Heping Zhang, Bogdan Adamyk and Lingyu Yan
Algorithms 2025, 18(8), 529; https://doi.org/10.3390/a18080529 - 20 Aug 2025
Cited by 1 | Viewed by 646
Abstract
Detection of defects on steel surfaces is crucial for industrial quality control. To address the issues of structural complexity, high parameter volume, and poor real-time performance in current detection models, this study proposes a lightweight model based on an improved YOLOv11. The model first reconstructs the backbone network by introducing a Reversible Connected Multi-Column Network (RevCol) to effectively preserve multi-level feature information. Second, the lightweight FasterNet is embedded into the C3k2 module, utilizing Partial Convolution (PConv) to reduce computational overhead. Additionally, a Group Convolution-driven EfficientDetect head is designed to maintain high-performance feature extraction while minimizing consumption of computational resources. Finally, a novel WISEPIoU loss function is developed by integrating WISE-IoU and POWERFUL-IoU to accelerate model convergence and optimize the accuracy of bounding box regression. Experiments on the NEU-DET dataset demonstrate that the improved model reduces parameters by 39.1% and computational complexity by 49.2% compared with the baseline, with an mAP@0.5 of 0.758 and real-time performance of 91 FPS. On the DeepPCB dataset, the model exhibits the same 39.1% and 49.2% reductions in parameters and computation, respectively, with mAP@0.5 = 0.985 and real-time performance of 64 FPS. The study validates that the proposed lightweight framework effectively balances accuracy and efficiency, and proves to be a practical solution for real-time defect detection in resource-constrained environments. Full article
(This article belongs to the Special Issue Visual Attributes in Computer Vision Applications)

19 pages, 8903 KB  
Article
LSH-YOLO: A Lightweight Algorithm for Helmet-Wear Detection
by Zhao Liu, Fuwei Wang, Weimin Wang, Shenyi Cao, Xinhao Gao and Mingxin Chen
Buildings 2025, 15(16), 2918; https://doi.org/10.3390/buildings15162918 - 18 Aug 2025
Viewed by 512
Abstract
This work addresses the high computational cost and excessive parameter count associated with existing helmet-wearing detection models in complex construction scenarios. This paper proposes a lightweight helmet detection model, LSH-YOLO (Lightweight Safety Helmet) based on improvements to YOLOv8. First, the KernelWarehouse (KW) dynamic convolution is introduced to replace the standard convolution in the backbone and bottleneck structures. KW dynamically adjusts convolution kernels based on input features, thereby enhancing feature extraction and reducing redundant computation. Based on this, an improved C2f-KW module is proposed to further strengthen feature representation and lower computational complexity. Second, a lightweight detection head, SCDH (Shared Convolutional Detection Head), is designed to replace the original YOLOv8 Detect head. This modification maintains detection accuracy while further reducing both computational cost and parameter count. Finally, the Wise-IoU loss function is introduced to further enhance detection accuracy. Experimental results show that LSH-YOLO increases mAP50 by 0.6%, reaching 92.9%, while reducing computational cost by 63% and parameter count by 19%. Compared to YOLOv8n, LSH-YOLO demonstrates clear advantages in computational efficiency and detection performance, significantly lowering hardware resource requirements. These improvements make the model highly suitable for deployment in resource-constrained environments for real-time intelligent monitoring, thereby advancing the fields of industrial edge computing and intelligent safety surveillance. Full article
(This article belongs to the Special Issue AI in Construction: Automation, Optimization, and Safety)
