Search Results (199)

Search Parameters:
Keywords = ASPP

22 pages, 4891 KB  
Article
Optimization of Visual Detection Algorithms for Elevator Landing Door Safety-Keeper Bolts
by Chuanlong Zhang, Zixiao Li, Jinjin Li, Lin Zou and Enyuan Dong
Machines 2025, 13(9), 790; https://doi.org/10.3390/machines13090790 - 1 Sep 2025
Abstract
As the safety requirements of elevator systems continue to rise, the detection of loose bolts and the high-precision segmentation of anti-loosening lines have become critical challenges in elevator landing door inspection. Traditional manual inspection and conventional visual detection often fail to meet the requirements of high precision and robustness under real-world conditions such as oil contamination and low illumination. This paper proposes two improved algorithms for detecting loose bolts and segmenting anti-loosening lines in elevator landing doors. For small-bolt detection, we introduce the DS-EMA model, an enhanced YOLOv8 variant that integrates depthwise-separable convolutions and an Efficient Multi-scale Attention (EMA) module. The DS-EMA model achieves a 2.8 percentage point improvement in mAP over the YOLOv8n baseline on our self-collected dataset, while reducing parameters from 3.0 M to 2.8 M and maintaining real-time throughput at 126 FPS. For anti-loosening-line segmentation, we develop an improved DeepLabv3+ by adopting a MobileViT backbone, incorporating a Global Attention Mechanism (GAM) and optimizing the ASPP dilation rate. The revised model increases the mean IoU to 85.8% (a gain of 5.4 percentage points) while reducing parameters from 57.6 M to 38.5 M. Comparative experiments against mainstream lightweight models, including YOLOv5n, YOLOv6n, YOLOv7-tiny, and DeepLabv3, demonstrate that the proposed methods achieve superior accuracy while balancing efficiency and model complexity. Moreover, compared with recent lightweight variants such as YOLOv9-tiny and YOLOv11n, DS-EMA achieves comparable mAP while delivering notably higher recall, which is crucial for safety inspection. Overall, the enhanced YOLOv8 and DeepLabv3+ provide robust and efficient solutions for elevator landing door safety inspection, delivering clear practical application value. Full article
(This article belongs to the Section Machines Testing and Maintenance)
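Several results in this listing hinge on ASPP's dilated (atrous) convolutions widening the receptive field without adding parameters. As an illustrative sketch (not code from any of the listed papers), the effective spatial extent of a kernel axis with dilation rate d follows a simple formula:

```python
def dilated_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Effective spatial extent of one kernel axis with a given dilation rate.

    A dilated kernel inserts (dilation - 1) gaps between taps, so a 3-tap
    kernel with dilation 6 spans 13 input positions while still using only
    3 weights per axis.
    """
    return dilation * (kernel_size - 1) + 1

# The classic DeepLab ASPP rates with 3x3 kernels:
for rate in (1, 6, 12, 18):
    print(rate, dilated_kernel_extent(3, rate))  # extents 3, 13, 25, 37
```

This is why tuning ASPP dilation rates, as the DeepLabv3+ variant above does, trades multi-scale context against coverage gaps at a fixed parameter budget.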

21 pages, 3725 KB  
Article
Pruning-Friendly RGB-T Semantic Segmentation for Real-Time Processing on Edge Devices
by Jun Young Hwang, Youn Joo Lee, Ho Gi Jung and Jae Kyu Suhr
Electronics 2025, 14(17), 3408; https://doi.org/10.3390/electronics14173408 - 27 Aug 2025
Abstract
RGB-T semantic segmentation using thermal and RGB images simultaneously is actively being researched to robustly recognize the surrounding environment of vehicles regardless of challenging lighting and weather conditions. It is important for such networks to operate in real time on edge devices. Since the transformer-based approaches to which most recent RGB-T semantic segmentation studies belong are very difficult to run on edge devices, this paper considers only CNN-based RGB-T semantic segmentation networks that can run on edge devices in real time. Although EAEFNet shows the best performance among CNN-based networks on edge devices, its inference speed is too slow for real-time operation, and even when channel pruning is applied, the speed improvement is minimal. The analysis of EAEFNet identifies the intermediate fusion of RGB and thermal features and the high complexity of the decoder as the main causes. To address these issues, this paper proposes a network using a ResNet encoder with an early-fused four-channel input and a U-Net decoder structure. To improve decoder performance, bilinear upsampling is replaced with PixelShuffle, and mini Atrous Spatial Pyramid Pooling (ASPP) and Progressive Transposed Module (PTM) modules are applied. Since the proposed network is primarily composed of convolutional layers, channel pruning is confirmed to be effectively applicable; it significantly improves inference speed and enables real-time operation on the neural processing unit (NPU) of edge devices. The proposed network is evaluated on the MFNet dataset, one of the most widely used public datasets for RGB-T semantic segmentation, and achieves performance comparable to EAEFNet while operating at over 30 FPS on an embedded board equipped with the Qualcomm QCS6490 SoC. Full article
(This article belongs to the Special Issue New Insights in 2D and 3D Object Detection and Semantic Segmentation)
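The pruning-friendly design above relies on standard channel pruning: rank output channels by a saliency score and drop the weakest. A minimal, hypothetical sketch of L1-norm channel ranking (the toy weight layout and keep ratio are illustrative assumptions, not the paper's procedure):

```python
def select_channels(channel_weights, keep_ratio):
    """Rank output channels by the L1 norm of their filter weights and
    return the (sorted) indices of the channels to keep.

    channel_weights: list of flattened per-channel weight lists (toy layout).
    """
    scores = [sum(abs(w) for w in ch) for ch in channel_weights]
    n_keep = max(1, int(len(channel_weights) * keep_ratio))
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    return sorted(ranked[:n_keep])

# Four channels; the two with the largest L1 norms survive a 50% prune.
weights = [[0.1, -0.1], [1.0, 1.0], [-0.5, 0.5], [0.01, 0.02]]
print(select_channels(weights, 0.5))  # -> [1, 2]
```

Because the pruned network is almost entirely convolutional, this kind of channel removal shrinks every downstream layer's input as well, which is where the inference-speed gain comes from.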

20 pages, 5187 KB  
Article
IceSnow-Net: A Deep Semantic Segmentation Network for High-Precision Snow and Ice Mapping from UAV Imagery
by Yulin Liu, Shuyuan Yang, Guangyang Zhang, Minghui Wu, Feng Xiong, Pinglv Yang and Zeming Zhou
Remote Sens. 2025, 17(17), 2964; https://doi.org/10.3390/rs17172964 - 27 Aug 2025
Abstract
Accurate monitoring of snow and ice cover is essential for climate research and disaster management, but conventional remote sensing methods often struggle in complex terrain and fog-contaminated conditions. To address the challenges of high-resolution UAV-based snow and ice segmentation—including visual similarity, fragmented spatial distributions, and terrain shadow interference—we introduce IceSnow-Net, a U-Net-based architecture enhanced with three key components: (1) a ResNet50 backbone with atrous convolutions to expand the receptive field, (2) an Atrous Spatial Pyramid Pooling (ASPP) module for multi-scale context aggregation, and (3) an auxiliary path loss for deep supervision to enhance boundary delineation and training stability. The model was trained and validated on UAV-captured orthoimagery from Ganzi Prefecture, Sichuan, China. The experimental results demonstrate that IceSnow-Net achieved excellent performance compared to other models, attaining a mean Intersection over Union (mIoU) of 98.74%, while delivering 27% higher computational efficiency than U-Mamba. Ablation studies further validated the individual contributions of each module. Overall, IceSnow-Net provides an effective and accurate solution for cryosphere monitoring in topographically complex environments using UAV imagery. Full article
(This article belongs to the Special Issue Recent Progress in UAV-AI Remote Sensing II)
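The mIoU figure reported above is the per-class Intersection-over-Union averaged across classes. A self-contained sketch of the metric on flat label lists (toy data, not the paper's evaluation code):

```python
def mean_iou(pred, gt, num_classes):
    """Mean Intersection-over-Union over classes present in pred or gt."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, g in zip(pred, gt) if p == c and g == c)
        union = sum(1 for p, g in zip(pred, gt) if p == c or g == c)
        if union:                      # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious)

# Two classes (0 = background, 1 = snow/ice), one pixel mislabeled:
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], 2))  # (1/2 + 2/3) / 2 ≈ 0.583
```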

21 pages, 2799 KB  
Article
Few-Shot Leukocyte Classification Algorithm Based on Feature Reconstruction Network with Improved EfficientNetV2
by Xinzheng Wang, Cuisi Ou, Guangjian Pan, Zhigang Hu and Kaiwen Cao
Appl. Sci. 2025, 15(17), 9377; https://doi.org/10.3390/app15179377 - 26 Aug 2025
Abstract
Deep learning has excelled in image classification largely due to large, professionally labeled datasets. However, in the field of medical imaging, data annotation often relies on experienced experts, especially in tasks such as white blood cell classification, where the staining methods for different cells vary greatly and the number of samples in certain categories is relatively small. To evaluate leukocyte classification performance with limited labeled samples, a few-shot learning method based on a Feature Reconstruction Network with Improved EfficientNetV2 (FRNE) is proposed. First, this paper presents a feature extractor based on the improved EfficientNetV2 architecture. To enhance the receptive field and extract multi-scale features effectively, the network incorporates an ASPP module with dilated convolutions at different dilation rates. This enhancement improves the model's spatial reconstruction capability during feature extraction. Subsequently, the support set and query set are processed by the feature extractor to obtain the respective feature maps. A feature reconstruction-based classification method is then applied: ridge regression reconstructs the query feature map using features from the support set. By analyzing the reconstruction error, the model determines the likelihood of the query sample belonging to a particular class, without requiring additional modules or extensive parameter tuning. Evaluated on the LDWBC and Raabin datasets, the proposed method achieves accuracy improvements of 3.67% and 1.27%, respectively, compared to the method with the strongest overall accuracy (OA) on both datasets among all compared approaches. Full article
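The ridge-regression reconstruction step described above has a closed form. A minimal NumPy sketch under stated assumptions (toy two-dimensional features and an illustrative regularization weight, not the paper's implementation): each query is reconstructed from every class's support features, and the class with the smallest reconstruction error wins.

```python
import numpy as np

def reconstruction_error(support, query, lam=0.1):
    """Squared error of ridge-reconstructing `query` (d,) from rows of `support` (n, d)."""
    S = np.asarray(support, dtype=float)
    q = np.asarray(query, dtype=float)
    gram = S @ S.T + lam * np.eye(S.shape[0])   # regularized Gram matrix
    alpha = np.linalg.solve(gram, S @ q)        # closed-form ridge coefficients
    return float(np.sum((q - alpha @ S) ** 2))

def classify(supports_by_class, query, lam=0.1):
    """Assign the query to the class whose support set reconstructs it best."""
    return min(supports_by_class,
               key=lambda c: reconstruction_error(supports_by_class[c], query, lam))

# Hypothetical leukocyte classes spanning orthogonal feature directions:
supports = {"neutrophil": [[1.0, 0.0]], "lymphocyte": [[0.0, 1.0]]}
print(classify(supports, [0.9, 0.1]))  # -> neutrophil
```

No extra parameters are learned at classification time, which matches the abstract's claim that the method needs no additional modules or extensive tuning.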

21 pages, 6925 KB  
Article
U2-LFOR: A Two-Stage U2 Network for Light-Field Occlusion Removal
by Mostafa Farouk Senussi, Mahmoud Abdalla, Mahmoud SalahEldin Kasem, Mohamed Mahmoud and Hyun-Soo Kang
Mathematics 2025, 13(17), 2748; https://doi.org/10.3390/math13172748 - 26 Aug 2025
Abstract
Light-field (LF) imaging transforms occlusion removal by using multiview data to reconstruct hidden regions, overcoming the limitations of single-view methods. However, this advanced capability often comes at the cost of increased computational complexity. To overcome this, we propose the U2-LFOR network, an end-to-end neural network designed to remove occlusions in LF images without compromising performance, addressing the inherent complexity of LF imaging while ensuring practical applicability. The architecture employs Residual Atrous Spatial Pyramid Pooling (ResASPP) at the feature extractor to expand the receptive field, capture localized multiscale features, and enable deep feature learning with efficient aggregation. A two-stage U2-Net structure enhances hierarchical feature learning while maintaining a compact design, ensuring accurate context recovery. A dedicated refinement module, using two cascaded residual blocks (ResBlock), restores fine details to the occluded regions. Experimental results demonstrate its competitive performance, achieving an average Peak Signal-to-Noise Ratio (PSNR) of 29.27 dB and Structural Similarity Index Measure (SSIM) of 0.875, which are two widely used metrics for evaluating reconstruction fidelity and perceptual quality, on both synthetic and real-world LF datasets, confirming its effectiveness in accurate occlusion removal. Full article
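PSNR, one of the two metrics quoted above, is a simple function of mean squared error. A quick sketch on toy 8-bit values (not the paper's evaluation pipeline):

```python
import math

def psnr(reference, distorted, max_val=255.0):
    """Peak Signal-to-Noise Ratio: 10 * log10(MAX^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")             # identical signals
    return 10.0 * math.log10(max_val ** 2 / mse)

# A uniform error of 10 gray levels gives an MSE of 100:
print(round(psnr([100, 100], [110, 90]), 2))  # ≈ 28.13 dB
```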

39 pages, 4783 KB  
Article
Sparse-MoE-SAM: A Lightweight Framework Integrating MoE and SAM with a Sparse Attention Mechanism for Plant Disease Segmentation in Resource-Constrained Environments
by Benhan Zhao, Xilin Kang, Hao Zhou, Ziyang Shi, Lin Li, Guoxiong Zhou, Fangying Wan, Jiangzhang Zhu, Yongming Yan, Leheng Li and Yulong Wu
Plants 2025, 14(17), 2634; https://doi.org/10.3390/plants14172634 - 24 Aug 2025
Abstract
Plant disease segmentation has achieved significant progress with the help of artificial intelligence. However, deploying high-accuracy segmentation models in resource-limited settings faces three key challenges, as follows: (A) Traditional dense attention mechanisms incur quadratic computational complexity growth (O(n²d)), rendering them ill-suited for low-power hardware. (B) Naturally sparse spatial distributions and large-scale variations in the lesions on leaves necessitate models that concurrently capture long-range dependencies and local details. (C) Complex backgrounds and variable lighting in field images often induce segmentation errors. To address these challenges, we propose Sparse-MoE-SAM, an efficient framework based on an enhanced Segment Anything Model (SAM). This deep learning framework integrates sparse attention mechanisms with a two-stage mixture of experts (MoE) decoder. The sparse attention dynamically activates key channels aligned with lesion sparsity patterns, reducing self-attention complexity while preserving long-range context. Stage 1 of the MoE decoder performs coarse-grained boundary localization; Stage 2 achieves fine-grained segmentation by leveraging specialized experts within the MoE, significantly enhancing edge discrimination accuracy. The expert repository, comprising standard convolutions, dilated convolutions, and depthwise separable convolutions, dynamically routes features through optimized processing paths based on input texture and lesion morphology. This enables robust segmentation across diverse leaf textures and plant developmental stages. Further, we design a sparse attention-enhanced Atrous Spatial Pyramid Pooling (ASPP) module to capture multi-scale contexts for both extensive lesions and small spots.
Evaluations on three heterogeneous datasets (PlantVillage Extended, CVPPP, and our self-collected field images) show that Sparse-MoE-SAM achieves a mean Intersection-over-Union (mIoU) of 94.2%—surpassing standard SAM by 2.5 percentage points—while reducing computational costs by 23.7% compared to the original SAM baseline. The model also demonstrates balanced performance across disease classes and enhanced hardware compatibility. Our work validates that integrating sparse attention with MoE mechanisms sustains accuracy while drastically lowering computational demands, enabling the scalable deployment of plant disease segmentation models on mobile and edge devices. Full article
(This article belongs to the Special Issue Advances in Artificial Intelligence for Plant Research)
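The sparse attention described above cuts cost by attending over only the strongest responses. A toy single-query sketch of top-k sparse attention (the top-k selection rule is a common simplification, not necessarily the paper's exact channel-activation scheme):

```python
import math

def sparse_attention_row(query, keys, values, k):
    """One query row of top-k sparse attention: keep the k largest scaled
    dot-product scores, softmax over only those, and mix their values."""
    d = len(query)
    scores = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in keys]
    top = sorted(range(len(keys)), key=scores.__getitem__, reverse=True)[:k]
    m = max(scores[i] for i in top)                      # stabilize the softmax
    w = {i: math.exp(scores[i] - m) for i in top}
    z = sum(w.values())
    return [sum(w[i] * values[i][j] for i in top) / z for j in range(len(values[0]))]

# With k=2, the orthogonal key (index 1) is dropped from the softmax entirely:
print(sparse_attention_row([1.0, 0.0],
                           [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]],
                           [[1.0], [100.0], [1.0]], k=2))  # -> [1.0]
```

The softmax and value mixing now touch only k entries per query instead of all n, which is the source of the complexity reduction the abstract claims.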

18 pages, 1956 KB  
Article
FCNet: A Transformer-Based Context-Aware Segmentation Framework for Detecting Camouflaged Fruits in Orchard Environments
by Ivan Roy Evangelista, Argel Bandala and Elmer Dadios
Technologies 2025, 13(8), 372; https://doi.org/10.3390/technologies13080372 - 20 Aug 2025
Abstract
Fruit segmentation is an essential task due to its importance in accurate disease prevention, yield estimation, and automated harvesting. However, accurate object segmentation in agricultural environments remains challenging due to visual complexities such as background clutter, occlusion, small object size, and color–texture similarities that lead to camouflaging. Traditional methods often struggle to detect partially occluded or visually blended fruits, leading to poor detection performance. In this study, we propose a context-aware segmentation framework designed for orchard-level mango fruit detection. We integrate multiscale feature extraction based on PVTv2 architecture, a feature enhancement module using Atrous Spatial Pyramid Pooling (ASPP) and attention techniques, and a novel refinement mechanism employing a Position-based Layer Normalization (PLN). We conducted a comparative study against established segmentation models, employing both quantitative and qualitative evaluations. Results demonstrate the superior performance of our model across all metrics. An ablation study validated the contributions of the enhancement and refinement modules, with the former yielding performance gains of 2.43%, 3.10%, 5.65%, 4.19%, and 4.35% in S-measure, mean E-measure, weighted F-measure, mean F-measure, and IoU, respectively, and the latter achieving improvements of 2.07%, 1.93%, 6.85%, 4.84%, and 2.73%, in the said metrics. Full article

17 pages, 2482 KB  
Article
Coastline Identification with ASSA-Resnet Based Segmentation for Marine Navigation
by Yuhan Wang, Weixian Li, Zhengxun Zhou and Ning Wu
Appl. Sci. 2025, 15(16), 9113; https://doi.org/10.3390/app15169113 - 19 Aug 2025
Abstract
Real-time and accurate segmentation of coastlines is of paramount importance for the safe navigation of unmanned surface vessels (USVs). Classical methods such as U-Net and DeepLabV3 have been proven effective in coastline segmentation tasks. However, their performance degrades substantially in real-world scenarios due to variations in lighting and environmental conditions, particularly water surface reflections. This paper proposes an enhanced ResNet-50 model, namely ASSA-ResNet, for coastline segmentation in vision-based marine navigation. ASSA-ResNet integrates Atrous Spatial Pyramid Pooling (ASPP) to expand the model's receptive field and incorporates a Global Channel Spatial Attention (GCSA) module to suppress interference from water reflections. Through feature pyramid fusion, ASSA-ResNet reinforces the semantic representation of features at various scales to ensure precise boundary delineation. The performance of ASSA-ResNet is validated on a dataset encompassing diverse brightness conditions and scenarios. Notably, a mean Pixel Accuracy (mPA) of 98.90% and a mean Intersection over Union (mIoU) of 98.17% are achieved on the self-constructed dataset, with corresponding values of 99.18% and 98.39% on the USVInland unmanned vessel dataset. Comparative analyses on the self-constructed dataset reveal that ASSA-ResNet outperforms the U-Net model by 1.78% in mPA and 2.9% in mIoU, and the DeepLabV3 model by 1.85% in mPA and 3.19% in mIoU. On the USVInland dataset, ASSA-ResNet exhibits superior performance compared to U-Net, with improvements of 0.41% in mPA and 0.12% in mIoU, while surpassing DeepLabV3 by 0.33% in mPA and 0.21% in mIoU. Full article
(This article belongs to the Section Marine Science and Engineering)

25 pages, 3910 KB  
Article
Design and Comparative Experimental Study of Air-Suction Multi-Arm Potato Planter
by Xiaoxin Zhu, Pinyan Lyu, Qiang Gao, Haiqin Ma, Yuxuan Chen, Yu Qi, Jicheng Li and Jinqing Lyu
Agriculture 2025, 15(16), 1714; https://doi.org/10.3390/agriculture15161714 - 8 Aug 2025
Abstract
China ranks as the world's leading potato (Solanum tuberosum L.) producer, but poor seeding machinery performance limits the input–output ratio of potato cultivation and impedes sustainable development. We developed an advanced air-suction multi-arm potato planter (ASPP) that incorporated integrated side-deep fertilization, automated seed feeding, negative-pressure seed filling, seed transportation, positive-pressure seed delivery, soil covering, and compaction. The study proposes a negative-pressure seed extraction mechanism that minimizes seed damage by precisely controlling suction pressure, while the near-zero-speed seed delivery mechanism synchronizes seed release with ground speed, reducing bounce-induced spacing errors. Furthermore, the structural configuration and operation principle of the ASPP were systematically elucidated, and key performance parameters and optimal values were identified. We conducted a randomized complete block design plot trial comparing the ASPP with the spoon-belt potato planter (SBPP) and spoon-chain potato planter (SCPP), evaluating sowing quality, seedling emergence rate (ER), potato yield (PY), and comprehensive economic benefits. The results revealed that the plant spacing index (PSI), missed-seeding index (MI), re-seeding index (RI), and coefficient of variation (CV) of the ASPP were 90.05%, 3.78%, 2.32%, and 7.93%, respectively. The mean ER values for the ASPP, SBPP, and SCPP were 94.76%, 85.42%, and 83.46%, respectively, with the ASPP showing improvements of 10.93% and 13.54% over the SBPP and SCPP. However, the SBPP and SCPP exhibited greater emergence uniformity than the ASPP. The mean PY values were 37,205.25, 32,973.75, and 34,620 kg·ha−1 for the ASPP, SBPP, and SCPP, respectively; the ASPP outperformed the SBPP and SCPP by 12.83% and 7.47%. Overall, the ASPP demonstrated balanced and superior performance across the above-mentioned indicators, demonstrating its potential to enable precision agriculture in tuber crop cultivation. Full article
(This article belongs to the Section Agricultural Technology)
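Among the sowing-quality indicators above, the coefficient of variation (CV) summarizes plant-spacing uniformity. A one-function sketch on hypothetical spacing measurements (the values are invented for illustration, not trial data):

```python
import statistics

def spacing_cv(spacings_cm):
    """Coefficient of variation of plant spacings: sample std dev / mean."""
    return statistics.stdev(spacings_cm) / statistics.mean(spacings_cm)

# Spacings of 18, 20 and 22 cm: mean 20, sample std dev 2 -> CV = 10%.
print(spacing_cv([18, 20, 22]))  # -> 0.1
```

A lower CV, like the ASPP's 7.93%, means more uniform seed spacing along the row.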

19 pages, 2987 KB  
Article
Predicting Range Shifts in the Distribution of Arctic/Boreal Plant Species Under Climate Change Scenarios
by Yan Zhang, Shaomei Li, Yuanbo Su, Bingyu Yang and Xiaojun Kou
Diversity 2025, 17(8), 558; https://doi.org/10.3390/d17080558 - 7 Aug 2025
Abstract
Climate warming is anticipated to significantly alter the distribution and composition of plant species in the Arctic, thereby cascading through food webs and affecting both associated fauna and entire ecosystems. To elucidate the trend in plant distribution in response to climate change, we employed the MaxEnt model to project the future ranges of 25 representative Arctic and Circumpolar plant species (including grasses and shrubs). Species distribution data, in conjunction with bioclimatic variables derived from climate projections of three selected General Circulation Models (GCMs), ESM2, IPSL, and MPIE, were utilized to fit the MaxEnt models. Subsequently, we predicted the potential distributions of these species under three Shared Socioeconomic Pathways (SSPs)—SSP126, SSP245, and SSP585—across a timeline spanning 2010, 2050, 2100, 2200, 2250, and 2300 AD. Range shift indices were applied to quantify changes in plant distribution and range sizes. Our results show that the ranges of nearly all species are projected to diminish progressively over time, with a more pronounced rate of reduction under higher emission scenarios. The species are generally expected to shift northward, with the distances of these shifts positively correlated with both the time intervals from the current state and the intensity of thermal forcing associated with the SSPs. Arctic species (A_Spps) are anticipated to face higher extinction risks compared to Boreal–Arctic species (B_Spps). Additional indices, such as range gain, loss, and overlap, consistently corroborate these patterns. Notably, the peak range shift speeds differ markedly between SSP245 and SSP585, with the latter extending beyond 2100 AD. In conclusion, under all SSPs, A_Spps are generally expected to experience more significant range shifts than B_Spps. 
In the SSP585 scenario, all species are projected to face substantial range reductions, with Arctic species being more severely affected and consequently facing the highest extinction risks. These findings provide valuable insights for developing conservation recommendations for polar plant species and have significant ecological and socioeconomic implications. Full article
(This article belongs to the Section Plant Diversity)

22 pages, 2420 KB  
Article
BiEHFFNet: A Water Body Detection Network for SAR Images Based on Bi-Encoder and Hybrid Feature Fusion
by Bin Han, Xin Huang and Feng Xue
Mathematics 2025, 13(15), 2347; https://doi.org/10.3390/math13152347 - 23 Jul 2025
Abstract
Water body detection in synthetic aperture radar (SAR) imagery plays a critical role in applications such as disaster response, water resource management, and environmental monitoring. However, it remains challenging due to complex background interference in SAR images. To address this issue, a bi-encoder and hybrid feature fuse network (BiEHFFNet) is proposed for achieving accurate water body detection. First, a bi-encoder structure based on ResNet and Swin Transformer is used to jointly extract local spatial details and global contextual information, enhancing feature representation in complex scenarios. Additionally, the convolutional block attention module (CBAM) is employed to suppress irrelevant information of the output features of each ResNet stage. Second, a cross-attention-based hybrid feature fusion (CABHFF) module is designed to interactively integrate local and global features through cross-attention, followed by channel attention to achieve effective hybrid feature fusion, thus improving the model’s ability to capture water structures. Third, a multi-scale content-aware upsampling (MSCAU) module is designed by integrating atrous spatial pyramid pooling (ASPP) with the Content-Aware ReAssembly of FEatures (CARAFE), aiming to enhance multi-scale contextual learning while alleviating feature distortion caused by upsampling. Finally, a composite loss function combining Dice loss and Active Contour loss is used to provide stronger boundary supervision. Experiments conducted on the ALOS PALSAR dataset demonstrate that the proposed BiEHFFNet outperforms existing methods across multiple evaluation metrics, achieving more accurate water body detection. Full article
(This article belongs to the Special Issue Advanced Mathematical Methods in Remote Sensing)
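The composite loss above pairs Active Contour loss with Dice loss; the latter penalizes the overlap deficit between predicted probabilities and the ground-truth mask. A minimal soft-Dice sketch (toy masks; the smoothing constant is an illustrative assumption):

```python
def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), with eps guarding empty masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)

# One false-positive pixel out of four: overlap 1, total mass 3 -> loss ≈ 1/3.
print(dice_loss([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0]))
```

Because Dice scores region overlap rather than per-pixel error, it is less dominated by the large water/background class than cross-entropy, which suits thin water boundaries in SAR imagery.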

21 pages, 2919 KB  
Article
A Feasible Domain Segmentation Algorithm for Unmanned Vessels Based on Coordinate-Aware Multi-Scale Features
by Zhengxun Zhou, Weixian Li, Yuhan Wang, Haozheng Liu and Ning Wu
J. Mar. Sci. Eng. 2025, 13(8), 1387; https://doi.org/10.3390/jmse13081387 - 22 Jul 2025
Abstract
The accurate extraction of navigational regions from images of navigational waters plays a key role in ensuring on-water safety and the automation of unmanned vessels. Nonetheless, current technological methods encounter significant challenges in addressing fluctuations in water surface illumination, reflective disturbances, and surface undulations, among other disruptions, making it challenging to achieve rapid and precise boundary segmentation. To cope with these challenges, in this paper, we propose a coordinate-aware multi-scale feature network (GASF-ResNet) for water segmentation. The method integrates the Global Grouping Coordinate Attention (GGCA) module in the four downsampling branches of ResNet-50, thus enhancing the model's ability to capture target features and improving the feature representation. To expand the model's receptive field and boost its capability in extracting features of multi-scale targets, the Atrous Spatial Pyramid Pooling (ASPP) technique is used. Combined with multi-scale feature fusion, this effectively enhances the expression of semantic information at different scales and improves the segmentation accuracy of the model in complex water environments. The experimental results show that the mean pixel accuracy (mPA) and mean intersection over union (mIoU) of the proposed method on the self-made dataset and on the USVInland unmanned ship dataset are 99.31% and 98.61%, and 98.55% and 99.27%, respectively, significantly better than those obtained by existing mainstream models. These results help overcome the background interference caused by water surface reflection and uneven lighting in aquatic environments and enable the accurate segmentation of the water area for the safe navigation of unmanned vessels, which is of great value for the stable operation of unmanned vessels in complex environments. Full article
(This article belongs to the Section Ocean Engineering)

12 pages, 3688 KB  
Article
Automated Traumatic Bleeding Detection in Whole-Body CT Using 3D Object Detection Model
by Rizki Nurfauzi, Ayaka Baba, Taka-aki Nakada, Toshiya Nakaguchi and Yukihiro Nomura
Appl. Sci. 2025, 15(15), 8123; https://doi.org/10.3390/app15158123 - 22 Jul 2025
Abstract
Traumatic injury remains a major cause of death worldwide, with bleeding being one of its most critical and life-threatening consequences. Whole-body computed tomography (WBCT) has become a standard diagnostic method in trauma settings; however, timely interpretation remains challenging for acute care physicians. In this study, we propose a new automated method for detecting traumatic bleeding in CT images using a three-dimensional object detection model enhanced with an atrous spatial pyramid pooling (ASPP) module. Furthermore, we incorporate a false positive (FP) reduction approach based on multi-organ segmentation, as developed in our previous study. The proposed method was evaluated on a multi-institutional dataset of delayed-phase contrast-enhanced CT images using a six-fold cross-validation approach. It achieved a maximum sensitivity of 90.0% with 587.3 FPs per case and a sensitivity of 70.0% with 46.9 FPs per case, outperforming previous segmentation-based methods. In addition, the average processing time was reduced to 4.2 ± 1.1 min. These results suggest that the proposed method enables rapid and accurate bleeding detection, demonstrating its potential for clinical application in emergency trauma care. Full article
(This article belongs to the Special Issue Research Progress in Medical Image Analysis)
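A 3-D analogue of the ASPP idea used above, parallel `Conv3d` branches with different dilations over a CT feature volume, might look like the following sketch (channel counts and rates are assumptions for illustration):

```python
import torch
import torch.nn as nn

class ASPP3D(nn.Module):
    """Parallel 3-D dilated convolutions over a volumetric feature map;
    padding == dilation preserves the (D, H, W) extent."""
    def __init__(self, in_ch, out_ch, rates=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv3d(in_ch, out_ch, 3, padding=r, dilation=r, bias=False)
             for r in rates]
        )
        self.fuse = nn.Conv3d(out_ch * len(rates), out_ch, 1, bias=False)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

vol = torch.randn(1, 64, 16, 32, 32)   # (N, C, D, H, W) feature volume
out = ASPP3D(64, 64)(vol)
print(tuple(out.shape))                # (1, 64, 16, 32, 32)
```

Bleeding sites vary widely in size, which is the usual motivation for mixing small and large dilations in one detection head.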

21 pages, 4008 KB  
Article
Enhancing Suburban Lane Detection Through Improved DeepLabV3+ Semantic Segmentation
by Shuwan Cui, Bo Yang, Zhifu Wang, Yi Zhang, Hao Li, Hui Gao and Haijun Xu
Electronics 2025, 14(14), 2865; https://doi.org/10.3390/electronics14142865 - 17 Jul 2025
Viewed by 472
Abstract
Lane detection is a key technology in automatic driving environment perception, and its accuracy directly affects vehicle positioning, path planning, and driving safety. In this study, an enhanced real-time model for lane detection based on an improved DeepLabV3+ architecture is proposed to address the challenges posed by complex dynamic backgrounds and blurred road boundaries in suburban road scenarios. To address the lack of feature correlation in the traditional Atrous Spatial Pyramid Pooling (ASPP) module of the DeepLabV3+ model, we propose an improved LC-DenseASPP module. First, inspired by DenseASPP, the number of dilated convolution layers is reduced from six to three by adopting a dense connection to enhance feature reuse, significantly reducing computational complexity. Second, the convolutional block attention module (CBAM) attention mechanism is embedded after the LC-DenseASPP dilated convolution operation. This effectively improves the model’s ability to focus on key features through the adaptive refinement of channel and spatial attention features. Finally, an image-pooling operation is introduced in the last layer of the LC-DenseASPP to further enhance the ability to capture global context information. DySample is introduced to replace bilinear upsampling in the decoder, ensuring model performance while reducing computational resource consumption. The experimental results show that the model achieves a good balance between segmentation accuracy and computational efficiency, with a mean intersection over union (mIoU) of 95.48% and an inference speed of 128 frames per second (FPS). Additionally, a new lane-detection dataset, SubLane, is constructed to fill the gap in the research field of lane detection in suburban road scenarios. Full article
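One way to read the dense-connection idea: each dilated layer consumes the input plus all earlier branch outputs, so three layers still compose a wide range of receptive fields. A minimal sketch of that connectivity (the growth rate and dilation rates are assumptions, and the CBAM and image-pooling stages are omitted for brevity):

```python
import torch
import torch.nn as nn

class DenseDilated(nn.Module):
    """DenseASPP-style stack: layer i receives the concatenation of the
    input and every previous layer's output, so later dilated convs
    build on already-enlarged receptive fields."""
    def __init__(self, in_ch, growth, rates=(3, 6, 12)):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for r in rates:
            self.layers.append(
                nn.Conv2d(ch, growth, 3, padding=r, dilation=r, bias=False))
            ch += growth   # the next layer also sees this output

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)

x = torch.randn(1, 64, 32, 32)
y = DenseDilated(64, growth=32)(x)   # 64 + 3 * 32 = 160 output channels
print(tuple(y.shape))                # (1, 160, 32, 32)
```

Feature reuse through the dense concatenations is what lets the layer count drop from six to three without losing scale coverage.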

15 pages, 3012 KB  
Article
Deep Learning-Based Layout Analysis Method for Complex Layout Image Elements
by Yunfei Zhong, Yumei Pu, Xiaoxuan Li, Wenxuan Zhou, Hongjian He, Yuyang Chen, Lang Zhong and Danfei Liu
Appl. Sci. 2025, 15(14), 7797; https://doi.org/10.3390/app15147797 - 11 Jul 2025
Viewed by 675
Abstract
The layout analysis of elements is indispensable in graphic design, as effective layout design not only facilitates the delivery of visual information but also enhances the overall esthetic appeal to the audience. The combination of deep learning and graphic design has gradually become a popular research direction in recent years. However, even in the era of rapid development of artificial intelligence, layout analysis still requires manual participation. To address this problem, this paper proposes a method for analyzing the layout of complex layout image elements based on an improved DeepLabv3+ model. The method reduces the number of model parameters and the training time by replacing the backbone network. To improve multi-scale semantic feature extraction, the dilation rates of the ASPP module are fine-tuned, and the model is trained on a self-constructed movie poster dataset. The experimental results show that the improved DeepLabv3+ model achieves a better segmentation effect on the self-constructed poster dataset, with mIoU reaching 75.60%. Compared with classical models such as FCN, PSPNet, and DeepLabv3, the improved model effectively reduces the number of model parameters and the training time while maintaining accuracy. Full article
(This article belongs to the Special Issue Engineering Applications of Hybrid Artificial Intelligence Tools)
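Why the dilation rates are worth fine-tuning: a k×k kernel with dilation r covers an effective extent of k + (k−1)(r−1) pixels, so the chosen rates directly set the scales an ASPP branch responds to. A quick check:

```python
def effective_kernel(k: int, r: int) -> int:
    """Effective spatial extent of a k x k convolution with dilation r."""
    return k + (k - 1) * (r - 1)

# common 3x3 ASPP rates and the extent each branch covers
for r in (6, 12, 18):
    print(f"rate {r:2d} -> {effective_kernel(3, r)} px")
```

For rates 6, 12, and 18 the 3×3 branches span 13, 25, and 37 pixels respectively; tuning the rates shifts those spans toward the element sizes that actually occur in the target layouts.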
