Search Results (199)

Search Parameters:
Keywords = visual spatial boundary

23 pages, 15968 KB  
Article
YOLOv8n-RMB: UAV Imagery Rubber Milk Bowl Detection Model for Autonomous Robots’ Natural Latex Harvest
by Yunfan Wang, Lin Yang, Pengze Zhong, Xin Yang, Chuanchuan Su, Yi Zhang and Aamir Hussain
Agriculture 2025, 15(19), 2075; https://doi.org/10.3390/agriculture15192075 - 3 Oct 2025
Viewed by 394
Abstract
Natural latex harvesting is pushing the boundaries of unmanned agricultural production, with integrated robots, both the fixed and mobile tapping robots now widely deployed in forests, collecting rubber milk in hilly and mountainous regions. Given the harsh working conditions and complex natural environments surrounding rubber trees, real-time, precise assessment of rubber milk yield status has emerged as a key requirement for improving the efficiency and autonomous management of such large-scale automatic tapping robots. However, traditional manual detection of rubber milk yield status is limited by complex terrain, dense forest backgrounds, the irregular surface geometry of rubber milk, and the frequent occlusion of rubber milk bowls (RMBs) by vegetation. To address this issue, this study presents YOLOv8n-RMB, an unmanned aerial vehicle (UAV) imagery method for detecting rubber milk yield state in unstructured field environments that replaces manual observation. The proposed method improves the original YOLOv8n through structural enhancements across the backbone, neck, and head of the network. First, a receptive field attention convolution (RFAConv) module is embedded in the backbone to improve the model's ability to extract target-relevant features in visually complex environments. Second, a bidirectional feature pyramid network (BiFPN) is applied in the neck to strengthen feature fusion across multiple spatial scales. Third, the content-aware dynamic upsampling module DySample is adopted in the head to enhance the reconstruction of spatial details and the preservation of object boundaries. Finally, the detection framework is integrated with the BoT-SORT tracking algorithm to achieve continuous multi-object association and dynamic state monitoring based on the filling status of RMBs. Experimental evaluation shows that YOLOv8n-RMB achieves an AP@0.5 of 94.9%, an AP@0.5:0.95 of 89.7%, a precision of 91.3%, and a recall of 91.9%, improvements of 2.7%, 2.9%, 3.9%, and 9.7%, respectively, over the original YOLOv8n. The total parameter count stays within 3.0 million, and the computational cost is limited to 8.3 GFLOPs, so the model can perform yield assessment in the resource-limited computing environments of both fixed and mobile tapping robots in rubber plantations.
(This article belongs to the Special Issue Plant Diagnosis and Monitoring for Agricultural Production)
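Since the paper builds on the stock YOLOv8 pipeline, a minimal sketch of the training-plus-tracking workflow is possible, assuming the `ultralytics` package. The model YAML, dataset config, and video names below are hypothetical, and the paper's RFAConv/BiFPN/DySample modules would need to be registered as custom layers before such a YAML could load:

```python
# Hypothetical sketch: training a modified YOLOv8n and tracking bowls with BoT-SORT.
# "yolov8n-rmb.yaml", "rmb.yaml", and "uav_footage.mp4" are placeholder names.
from ultralytics import YOLO

model = YOLO("yolov8n-rmb.yaml")                      # custom architecture (hypothetical file)
model.train(data="rmb.yaml", epochs=300, imgsz=640)   # dataset config (hypothetical)

# BoT-SORT is a built-in tracker option in ultralytics, matching the paper's
# detection + multi-object-tracking pipeline for monitoring bowl filling status.
results = model.track(source="uav_footage.mp4", tracker="botsort.yaml", persist=True)
for r in results:
    print(r.boxes.cls, r.boxes.conf)                  # per-frame class ids and confidences
```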

14 pages, 4145 KB  
Article
The Spatial Logic of Privacy: Uncovering Privacy Patterns in Shared Housing Environments
by Ana Moreira and Francisco Serdoura
Buildings 2025, 15(19), 3532; https://doi.org/10.3390/buildings15193532 - 1 Oct 2025
Viewed by 158
Abstract
In response to the growing relevance of shared housing models such as co-living and co-housing, this study investigates how spatial configuration affects the experience and negotiation of privacy in shared domestic environments. While privacy is often treated as a subjective or cultural concern, this research adopts a spatial perspective to examine its morphological underpinnings. Using space syntax methods, the study analyses contemporary shared housing models, focusing on three shared housing developments in Barcelona. Through Visual Graph Analysis (VGA), spatial parameters, including integration, through-vision, control, and controllability values, are applied to assess the degree of accessibility, visibility, and spatial separation within and between private and communal areas. The results reveal distinct configurational patterns that correlate with different privacy gradients, identifying how spatial arrangement enables or restricts autonomy and co-presence among residents. The study concludes that privacy in shared housing is not only a matter of design intention but is embedded in the spatial logic of dwelling morphology: exposed and controlled spaces provide less privacy but enhance sociability, while spatial elements such as boundaries and transitions play an important role in managing the gradation of privacy. These findings offer a framework for understanding and designing shared living environments that are better attuned to the complexities of everyday privacy needs.
(This article belongs to the Special Issue Emerging Trends in Architecture, Urbanization, and Design)
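For readers unfamiliar with VGA, the integration measure the study applies can be sketched from first principles: build a visibility graph over grid cells of a plan and invert the relative asymmetry of each cell's visual mean depth. This is a generic space-syntax sketch using networkx, not the authors' tool; `visible` stands in for a line-of-sight test, and the common D-value normalization is omitted:

```python
# Minimal VGA-style integration sketch (simplified; not the authors' software).
import itertools
import networkx as nx

def integration_values(cells, visible):
    """cells: list of (x, y) grid points; visible(a, b) -> bool line-of-sight test."""
    g = nx.Graph()
    g.add_nodes_from(cells)
    g.add_edges_from((a, b) for a, b in itertools.combinations(cells, 2) if visible(a, b))
    k = len(cells)
    result = {}
    for node in cells:
        depths = nx.shortest_path_length(g, node)           # visual step depth to all cells
        md = sum(depths.values()) / (k - 1)                 # visual mean depth
        ra = 2 * (md - 1) / (k - 2)                         # relative asymmetry
        result[node] = 1 / ra if ra > 0 else float("inf")   # higher = more integrated
    return result
```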

24 pages, 1826 KB  
Article
Cloud and Snow Segmentation via Transformer-Guided Multi-Stream Feature Integration
by Kaisheng Yu, Kai Chen, Liguo Weng, Min Xia and Shengyan Liu
Remote Sens. 2025, 17(19), 3329; https://doi.org/10.3390/rs17193329 - 29 Sep 2025
Viewed by 284
Abstract
Cloud and snow often share comparable visual and structural patterns in satellite observations, making their accurate discrimination and segmentation particularly challenging. To overcome this, we design an innovative Transformer-guided architecture with complementary feature-extraction capabilities. The encoder adopts a dual-path structure, integrating a Transformer Encoder Module (TEM) for capturing long-range semantic dependencies and a ResNet18-based convolutional branch for detailed spatial representation. A Feature-Enhancement Module (FEM) is introduced to promote bidirectional interaction and adaptive feature integration between the two pathways. To improve delineation of object boundaries, especially in visually complex areas, we embed a Deep Feature-Extraction Module (DFEM) at the deepest layer of the convolutional stream. This component refines channel-level information to highlight critical features and enhance edge clarity. Additionally, to address noise from intricate backgrounds and ambiguous cloud-snow transitions, we incorporate both a Transformer Fusion Module (TFM) and a Strip Pooling Auxiliary Module (SPAM) in the decoding phase. These modules collaboratively enhance structural recovery and improve segmentation robustness. Extensive experiments on the CSWV and SPARCS datasets show that our method consistently outperforms state-of-the-art baselines, demonstrating its strong effectiveness and applicability in real-world cloud- and snow-detection scenarios.
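A dual-path encoder of the kind described, one convolutional branch for spatial detail and one Transformer branch for long-range context, can be sketched as below. The dimensions, patch embedding, and fusion-by-concatenation are illustrative assumptions, not the paper's exact TEM/FEM design:

```python
# Illustrative dual-path encoder sketch (PyTorch); sizes are assumptions.
import torch
import torch.nn as nn
from torchvision.models import resnet18

class DualPathEncoder(nn.Module):
    def __init__(self, dim=256, patch=16):
        super().__init__()
        cnn = resnet18(weights=None)
        self.cnn = nn.Sequential(*list(cnn.children())[:-2])   # -> (B, 512, H/32, W/32)
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)  # patch embedding
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=4)
        self.proj = nn.Conv2d(512 + dim, dim, 1)                # simple concat fusion

    def forward(self, x):
        b, _, h, w = x.shape
        local = self.cnn(x)                                     # detailed spatial features
        tokens = self.embed(x).flatten(2).transpose(1, 2)       # (B, N, dim)
        ctx = self.transformer(tokens)                          # long-range dependencies
        ctx = ctx.transpose(1, 2).reshape(b, -1, h // 16, w // 16)
        ctx = nn.functional.interpolate(ctx, size=local.shape[-2:])  # align resolutions
        return self.proj(torch.cat([local, ctx], dim=1))        # fused representation

feats = DualPathEncoder()(torch.randn(1, 3, 256, 256))          # -> (1, 256, 8, 8)
```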

43 pages, 7808 KB  
Article
GeoJSEval: An Automated Evaluation Framework for Large Language Models on JavaScript-Based Geospatial Computation and Visualization Code Generation
by Guanyu Chen, Haoyue Jiao, Shuyang Hou, Ziqi Liu, Lutong Xie, Shaowen Wu, Huayi Wu, Xuefeng Guan and Zhipeng Gui
ISPRS Int. J. Geo-Inf. 2025, 14(10), 382; https://doi.org/10.3390/ijgi14100382 - 28 Sep 2025
Viewed by 517
Abstract
With the widespread adoption of large language models (LLMs) in code generation tasks, geospatial code generation has emerged as a critical frontier in the integration of artificial intelligence and geoscientific analysis. This growing trend underscores the urgent need for systematic evaluation methodologies to assess the generation capabilities of LLMs in geospatial contexts. In particular, geospatial computation and visualization tasks in the JavaScript environment rely heavily on the orchestration of diverse frontend libraries and ecosystems, posing elevated demands on a model's semantic comprehension and code synthesis capabilities. To address this challenge, we propose GeoJSEval, the first multimodal, function-level automatic evaluation framework for LLMs in JavaScript-based geospatial code generation tasks. The framework comprises three core components: a standardized test suite (GeoJSEval-Bench), a code submission engine, and an evaluation module. It includes 432 function-level tasks and 2071 structured test cases, spanning five widely used JavaScript geospatial libraries that support spatial analysis and visualization functions, as well as 25 mainstream geospatial data types. GeoJSEval enables multidimensional quantitative evaluation across metrics such as accuracy, output stability, resource consumption, execution efficiency, and error type distribution. Moreover, it integrates boundary testing mechanisms to enhance robustness and evaluation coverage. We conduct a comprehensive assessment of 20 state-of-the-art LLMs using GeoJSEval, uncovering significant performance disparities and bottlenecks in spatial semantic understanding, code reliability, and function invocation accuracy. GeoJSEval offers a foundational methodology, evaluation resource, and practical toolkit for the standardized assessment and optimization of geospatial code generation models, with strong extensibility and promising applicability in real-world scenarios. This manuscript represents the peer-reviewed version of our earlier preprint previously made available on arXiv.
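Although GeoJSEval targets JavaScript, its function-level execution loop can be illustrated with a small Python harness that runs a generated function under Node.js and compares the output against an oracle. This is a hedged sketch of the general idea, not GeoJSEval's actual API:

```python
# Sketch of a function-level test runner for model-generated JavaScript.
import json
import subprocess
import tempfile

def run_case(generated_js: str, call_expr: str, expected):
    """generated_js: model output defining a function; call_expr: e.g. 'bufferArea(poly, 10)'."""
    harness = f"{generated_js}\nconsole.log(JSON.stringify({call_expr}));"
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(harness)
        path = f.name
    try:
        out = subprocess.run(["node", path], capture_output=True, text=True, timeout=30)
        if out.returncode != 0:
            # last stderr line as a crude error-type label for the error distribution
            return {"pass": False, "error": out.stderr.splitlines()[-1] if out.stderr else "runtime"}
        return {"pass": json.loads(out.stdout) == expected, "error": None}
    except subprocess.TimeoutExpired:
        return {"pass": False, "error": "timeout"}
```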

38 pages, 14848 KB  
Article
Image Sand–Dust Removal Using Reinforced Multiscale Image Pair Training
by Dong-Min Son, Jun-Ru Huang and Sung-Hak Lee
Sensors 2025, 25(19), 5981; https://doi.org/10.3390/s25195981 - 26 Sep 2025
Viewed by 393
Abstract
This study proposes an image-enhancement method to address the challenges of low visibility and color distortion in images captured during yellow sandstorms for image-sensor-based outdoor surveillance systems. The technique combines traditional image processing with deep learning to improve image quality while preserving color consistency during transformation. Conventional methods can partially improve color representation and reduce blurriness in sand–dust environments; however, they are limited in their ability to restore fine details and sharp object boundaries. In contrast, the proposed method incorporates Retinex-based processing into the training phase, enabling enhanced clarity and sharpness in the restored images. The proposed framework comprises three main steps. First, a cycle-consistent generative adversarial network (CycleGAN) is trained with unpaired images to generate synthetically paired data. Second, CycleGAN is retrained using these generated images along with clear images obtained through multiscale image decomposition, allowing the model to transform dust-interfered images into clear ones. Finally, color preservation is achieved by selecting the A and B chrominance channels from the small-scale model to maintain the original color characteristics. The experimental results confirmed that the proposed method effectively restores image color and removes sand–dust interference, thereby providing enhanced visual quality under sandstorm conditions. Specifically, it outperformed algorithm-based dust-removal methods such as Sand-Dust Image Enhancement (SDIE), Chromatic Variance Consistency Gamma and Correction-Based Dehazing (CVCGCBD), and Rank-One Prior (ROP+), as well as machine-learning-based methods including Fusion strategy and the Two-in-One Low-Visibility Enhancement Network (TOENet), achieving a Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE) score of 17.238, which demonstrates improved perceptual quality, and a Local Phase Coherence-Sharpness Index (LPC-SI) value of 0.973, indicating enhanced sharpness. Both metrics showed superior performance compared to conventional methods. When applied to Closed-Circuit Television (CCTV) systems, the proposed method is expected to mitigate the adverse effects of color distortion and image blurring caused by sand–dust, thereby effectively improving visual clarity in practical surveillance applications.
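The Retinex-based processing folded into the training phase can be illustrated with a generic single-scale Retinex decomposition (not necessarily the authors' exact variant), which separates reflectance detail from a blurred illumination estimate:

```python
# Generic single-scale Retinex sketch: reflectance = log(image) - log(illumination).
import cv2
import numpy as np

def single_scale_retinex(img_bgr: np.ndarray, sigma: float = 80.0) -> np.ndarray:
    img = img_bgr.astype(np.float64) + 1.0            # avoid log(0)
    illumination = cv2.GaussianBlur(img, (0, 0), sigma)
    reflectance = np.log(img) - np.log(illumination)  # detail/edge component
    # stretch back to a displayable 8-bit range
    r = (reflectance - reflectance.min()) / (reflectance.max() - reflectance.min() + 1e-8)
    return (r * 255).astype(np.uint8)
```

In a pipeline like the paper's, a result of this kind could serve as the clear, detail-preserving target when retraining CycleGAN on synthetically paired sand–dust images.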

21 pages, 7458 KB  
Article
Dynamic and Lightweight Detection of Strawberry Diseases Using Enhanced YOLOv10
by Huilong Jin, Xiangrong Ji and Wanming Liu
Electronics 2025, 14(19), 3768; https://doi.org/10.3390/electronics14193768 - 24 Sep 2025
Viewed by 335
Abstract
Strawberry cultivation faces significant challenges from pests and diseases, which are difficult to detect due to complex natural backgrounds and the high visual similarity between targets and their surroundings. This study proposes an advanced and lightweight detection algorithm, YOLO10-SC, based on the YOLOv10 model, to address these challenges. The algorithm integrates the convolutional block attention module (CBAM) to enhance feature representation by focusing on critical disease-related information while suppressing irrelevant data. Additionally, the Spatial and Channel Reconstruction Convolution (SCConv) module is incorporated into the C2f module to improve the model's ability to distinguish subtle differences among various pest and disease types. The introduction of DySample, an ultra-lightweight dynamic upsampler, further enhances feature boundary smoothness and detail preservation, ensuring efficient upsampling with minimal computational resources. Experimental results demonstrate that YOLO10-SC outperforms the original YOLOv10 and other mainstream algorithms in precision, recall, mAP50, F1 score, and FPS while reducing model parameters, GFLOPs, and size. These improvements significantly enhance detection accuracy and efficiency, making the model well-suited for real-time applications in natural agricultural environments. The proposed algorithm offers a robust solution for strawberry pest and disease detection, contributing to the advancement of smart agriculture.
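The CBAM block the algorithm integrates is a standard published module (channel attention followed by spatial attention), so a textbook sketch is shown below; channel counts are illustrative:

```python
# Textbook CBAM sketch in PyTorch.
import torch
import torch.nn as nn

class CBAM(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))                    # channel attn: avg-pooled path
        mx = self.mlp(x.amax(dim=(2, 3)))                     # channel attn: max-pooled path
        x = x * torch.sigmoid(avg + mx).view(b, c, 1, 1)      # reweight channels
        s = torch.cat([x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))             # reweight spatial locations

y = CBAM(64)(torch.randn(2, 64, 32, 32))                      # same shape out
```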

29 pages, 19475 KB  
Article
Fine-Scale Grassland Classification Using UAV-Based Multi-Sensor Image Fusion and Deep Learning
by Zhongquan Cai, Changji Wen, Lun Bao, Hongyuan Ma, Zhuoran Yan, Jiaxuan Li, Xiaohong Gao and Lingxue Yu
Remote Sens. 2025, 17(18), 3190; https://doi.org/10.3390/rs17183190 - 15 Sep 2025
Viewed by 615
Abstract
Grassland classification via remote sensing is essential for ecosystem monitoring and precision management, yet conventional satellite-based approaches are fundamentally constrained by coarse spatial resolution. To overcome this limitation, we harness high-resolution UAV multi-sensor data, integrating multi-scale image fusion with deep learning to achieve fine-scale grassland classification that satellites cannot provide. First, four categories of UAV data, including RGB, multispectral, thermal infrared, and LiDAR point cloud, were collected, and a fused image tensor consisting of 10 channels (NDVI, VCI, CHM, etc.) was constructed through orthorectification and resampling. For feature-level fusion, four deep fusion networks were designed. Among them, the MultiScale Pyramid Fusion Network, utilizing a pyramid pooling module, effectively integrated spectral and structural features, achieving optimal performance in all six image-fusion evaluation metrics, including information entropy (6.84), spatial frequency (15.56), and mean gradient (12.54). Subsequently, training and validation datasets were constructed by integrating visual interpretation samples. Four backbone networks, including UNet++, DeepLabV3+, PSPNet, and FPN, were employed, and attention modules (SE, ECA, and CBAM) were introduced separately to form 12 model combinations. Results indicated that the UNet++ network combined with the SE attention module achieved the best segmentation performance on the validation set, with a mean Intersection over Union (mIoU) of 77.68%, overall accuracy (OA) of 86.98%, F1-score of 81.48%, and Kappa coefficient of 0.82. In the categories of Leymus chinensis and Puccinellia distans, producer's accuracy (PA)/user's accuracy (UA) reached 86.46%/82.30% and 82.40%/77.68%, respectively. Whole-image prediction confirmed the model's ability to identify patch boundaries coherently. In conclusion, this study provides a systematic approach for integrating multi-source UAV remote sensing data with intelligent grassland interpretation, offering technical support for grassland ecological monitoring and resource assessment.
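Assembling the fused multi-channel tensor can be sketched in a few lines; the function names and the subset of channels below (six of the ten the abstract lists) are assumptions for illustration, and all rasters are presumed already orthorectified and resampled to a common grid:

```python
# Simplified sketch of stacking co-registered UAV rasters into a fusion tensor.
import numpy as np

def ndvi(nir: np.ndarray, red: np.ndarray) -> np.ndarray:
    return (nir - red) / (nir + red + 1e-8)          # standard NDVI definition

def build_fusion_tensor(rgb, nir, red, thermal, dsm, dtm):
    """All inputs are (H, W) float arrays on the same grid; rgb is (H, W, 3)."""
    chm = dsm - dtm                                  # canopy height model from LiDAR surfaces
    channels = [
        rgb[..., 0], rgb[..., 1], rgb[..., 2],
        ndvi(nir, red),
        thermal,
        chm,
    ]
    return np.stack(channels, axis=-1)               # (H, W, C) input to the fusion network
```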

24 pages, 7007 KB  
Article
M4MLF-YOLO: A Lightweight Semantic Segmentation Framework for Spacecraft Component Recognition
by Wenxin Yi, Zhang Zhang and Liang Chang
Remote Sens. 2025, 17(18), 3144; https://doi.org/10.3390/rs17183144 - 10 Sep 2025
Viewed by 500
Abstract
With the continuous advancement of on-orbit services and space intelligence sensing technologies, the efficient and accurate identification of spacecraft components has become increasingly critical. However, complex lighting conditions, background interference, and limited onboard computing resources present significant challenges to existing segmentation algorithms. To address these challenges, this paper proposes a lightweight spacecraft component segmentation framework for on-orbit applications, termed M4MLF-YOLO. Based on the YOLOv5 architecture, we propose a refined lightweight design strategy that aims to balance segmentation accuracy and resource consumption in satellite-based scenarios. MobileNetV4 is adopted as the backbone network to minimize computational overhead. Additionally, a Multi-Scale Fourier Adaptive Calibration Module (MFAC) is designed to enhance multi-scale feature modeling and boundary discrimination capabilities in the frequency domain. We also introduce a Linear Deformable Convolution (LDConv) to explicitly control the spatial sampling span and distribution of the convolution kernel, thereby linearly adjusting the receptive field coverage to improve feature extraction while effectively reducing computational costs. Furthermore, the efficient C3-Faster module is integrated to enhance channel interaction and feature fusion efficiency. A high-quality spacecraft image dataset, comprising both real and synthetic images, was constructed, covering various backgrounds and component types, including solar panels, antennas, payload instruments, thrusters, and optical payloads. Environment-aware preprocessing and enhancement strategies were applied to improve model robustness. Experimental results demonstrate that M4MLF-YOLO achieves excellent segmentation performance while maintaining low model complexity, with precision reaching 95.1% and recall reaching 88.3%, representing improvements of 1.9% and 3.9% over YOLOv5s, respectively. The mAP@0.5 also reached 93.4%. In terms of lightweight design, the model parameter count and computational complexity were reduced by 36.5% and 24.6%, respectively. These results validate that the proposed method significantly enhances deployment efficiency while preserving segmentation accuracy, showcasing promising potential for satellite-based visual perception applications.

18 pages, 2138 KB  
Article
Park Visitors and Birds Connected by Trade-Offs and Synergies of Ecosystem Services
by Yichao Chen, Liyan Zhang, Zhengkai Zhang, Siwei Chen, Bei Yu and Yu Wang
Animals 2025, 15(17), 2619; https://doi.org/10.3390/ani15172619 - 6 Sep 2025
Viewed by 1528
Abstract
Parks serve as vital components of green infrastructure within urban ecosystems, providing recreational opportunities that not only enhance human well-being but also support bird diversity. However, the shared use of park spaces by humans and birds inevitably leads to spatial overlap and competition between the two groups. Consequently, addressing the diverse needs of both groups and balancing the ecosystem services provided to each has become an urgent and critical issue. In this study, we conducted bird and social surveys in an urban park and employed the SolVES and MaxEnt models to investigate the spatial patterns of cultural ecosystem services (CES), supporting ecosystem services (SES), and bird plumage color CES in the park. We then analyzed the trade-offs and synergies between different ecosystem service relationship pairs, as well as the factors influencing them, using bivariate spatial autocorrelation and geographical detector analyses. Our results indicated a synergistic relationship between the recreational value of park CES and both park SES and bird plumage color CES. High-coverage vegetation areas along main roads promoted synergy, benefiting visitors' appreciation of cultural services, bird roosting, and the supply of plumage color CES. Meanwhile, trade-offs were observed between the aesthetic value of park CES, park SES, and bird plumage color CES, primarily in fitness plazas where noise levels exceeded 70 dB; visitors reacted more strongly to such disturbances than birds did. Furthermore, the colonization of colorful insectivorous birds enhanced the visual aesthetic value while simultaneously increasing the number of bird-feeding guilds and strengthening ecosystem stability. Our study suggests that planting tall trees, especially along park boundaries, expanding the perimeter green separation zone, and incorporating micro-water landscapes will both improve avian CES and provide a more pleasant environment for park visitors.
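The bivariate spatial autocorrelation step can be sketched with PySAL's bivariate Moran's I; the abstract names the method but not its implementation, and the `ces`/`ses` arrays below are hypothetical per-grid-cell service values standing in for the mapped SolVES/MaxEnt outputs:

```python
# Tentative bivariate Moran's I sketch using libpysal/esda (not the authors' code).
import numpy as np
from libpysal.weights import lat2W
from esda.moran import Moran_BV

rng = np.random.default_rng(0)
ces = rng.random(400)                 # hypothetical recreational CES value per cell
ses = rng.random(400)                 # hypothetical supporting-service value per cell
w = lat2W(20, 20)                     # rook-contiguity weights on a 20x20 grid

mbv = Moran_BV(ces, ses, w)           # permutation-based bivariate Moran's I
print(mbv.I, mbv.p_z_sim)             # positive I suggests synergy, negative a trade-off
```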

15 pages, 2654 KB  
Article
The Evaluation of a Deep Learning Approach to Automatic Segmentation of Teeth and Shade Guides for Tooth Shade Matching Using the SAM2 Algorithm
by KyeongHwan Han, JaeHyung Lim, Jin-Soo Ahn and Ki-Sun Lee
Bioengineering 2025, 12(9), 959; https://doi.org/10.3390/bioengineering12090959 - 6 Sep 2025
Cited by 1 | Viewed by 736
Abstract
Accurate shade matching is essential in restorative and prosthetic dentistry yet remains difficult due to subjectivity in visual assessments. We develop and evaluate a deep learning approach for the simultaneous segmentation of natural teeth and shade guides in intraoral photographs using four fine-tuned variants of Segment Anything Model 2 (SAM2: tiny, small, base plus, and large) and a UNet baseline trained under the same protocol. Spatial performance was assessed using the Dice Similarity Coefficient (DSC), the Intersection over Union (IoU), and the 95th-percentile Hausdorff distance normalized by the ground-truth equivalent diameter (HD95). Color consistency within masks was quantified by the coefficient of variation (CV) of the CIELAB components (L*, a*, b*), and perceptual color difference was measured using CIEDE2000 (ΔE00). On a held-out test set, all SAM2 variants achieved high overlap accuracy; SAM2-large performed best (DSC: 0.987 ± 0.006; IoU: 0.975 ± 0.012; HD95: 1.25 ± 1.80%), followed by SAM2-small (0.987 ± 0.008; 0.974 ± 0.014; 2.96 ± 11.03%), SAM2-base plus (0.985 ± 0.011; 0.971 ± 0.021; 1.71 ± 3.28%), and SAM2-tiny (0.979 ± 0.015; 0.959 ± 0.028; 6.16 ± 11.17%). UNet reached a DSC of 0.972 ± 0.020, an IoU of 0.947 ± 0.035, and an HD95 of 6.54 ± 16.35%. The CV distributions for all prediction models closely matched the ground truth (e.g., GT L*: 0.164 ± 0.040; UNet: 0.144 ± 0.028; SAM2-small: 0.164 ± 0.038; SAM2-base plus: 0.162 ± 0.039). The full-mask ΔE00 was low across models, reported as median (mean ± SD): UNet: 0.325 (0.487 ± 0.364); SAM2-tiny: 0.162 (0.410 ± 0.665); SAM2-small: 0.078 (0.126 ± 0.166); SAM2-base plus: 0.072 (0.198 ± 0.417); SAM2-large: 0.065 (0.167 ± 0.257). These ΔE00 values lie well below the ≈1 just-noticeable-difference threshold on average, indicating close chromatic agreement between predictions and annotations. Within a single dataset and training protocol, fine-tuned SAM2, especially its larger variants, provides robust spatial accuracy, boundary reliability, and color fidelity suitable for clinical shade-matching workflows, while UNet offers a competitive convolutional baseline. These results indicate technical feasibility rather than clinical validation; broader baselines and external, multi-center evaluations are needed to determine suitability for routine shade-matching workflows.
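The spatial and color metrics reported here are standard, so a compact sketch is possible under stated assumptions: binary masks as boolean arrays, colors compared as matched CIELAB pixel arrays via scikit-image's CIEDE2000 implementation. The HD95 normalization by equivalent diameter is omitted for brevity:

```python
# Sketch of DSC, IoU, and mean CIEDE2000 under the assumptions stated above.
import numpy as np
from skimage.color import deltaE_ciede2000

def dice(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    return 2 * inter / (pred.sum() + gt.sum() + 1e-8)

def iou(pred: np.ndarray, gt: np.ndarray) -> float:
    inter = np.logical_and(pred, gt).sum()
    return inter / (np.logical_or(pred, gt).sum() + 1e-8)

def mean_delta_e00(lab_pred: np.ndarray, lab_gt: np.ndarray) -> float:
    """Mean CIEDE2000 distance between matched (N, 3) CIELAB pixel arrays."""
    return float(deltaE_ciede2000(lab_pred, lab_gt).mean())
```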

26 pages, 7650 KB  
Article
ACD-DETR: Adaptive Cross-Scale Detection Transformer for Small Object Detection in UAV Imagery
by Yang Tong, Hui Ye, Jishen Yang and Xiulong Yang
Sensors 2025, 25(17), 5556; https://doi.org/10.3390/s25175556 - 5 Sep 2025
Viewed by 1330
Abstract
Small object detection in UAV imagery remains challenging due to complex aerial perspectives and the presence of dense, small targets with blurred boundaries. To address these challenges, we propose ACD-DETR, an adaptive end-to-end Transformer detector tailored for UAV-based small object detection. The framework introduces three core modules: the Multi-Scale Edge-Enhanced Feature Fusion Module (MSEFM) to preserve fine-grained details; the Omni-Grained Boundary Calibrator (OG-BC) for boundary-aware semantic fusion; and the Dynamic Position Bias Attention-based Intra-scale Feature Interaction (DPB-AIFI) to enhance spatial reasoning. Furthermore, we introduce ACD-DETR-SBA+, a fusion-enhanced variant that removes OG-BC and DPB-AIFI while deploying densely connected Semantic–Boundary Aggregation (SBA) modules to intensify boundary–semantic fusion. This design sacrifices computational efficiency in exchange for higher detection precision, making it suitable for resource-rich deployment scenarios. On the VisDrone2019 dataset, ACD-DETR achieves 50.9% mAP@0.5, outperforming the RT-DETR-R18 baseline by 3.6 percentage points, while reducing parameters by 18.5%. ACD-DETR-SBA+ further improves accuracy to 52.0% mAP@0.5, demonstrating the benefit of SBA-based fusion. Extensive experiments on the VisDrone2019 and DOTA datasets demonstrate that ACD-DETR achieves a state-of-the-art trade-off between accuracy and efficiency, while ACD-DETR-SBA+ achieves further performance improvements at higher computational cost. Ablation studies and visual analyses validate the effectiveness of the proposed modules and design strategies.
(This article belongs to the Section Remote Sensors)

34 pages, 3473 KB  
Article
Workspace Definition in Parallelogram Manipulators: A Theoretical Framework Based on Boundary Functions
by Luis F. Luque-Vega, Jorge A. Lizarraga, Dulce M. Navarro, Jose R. Navarro, Rocío Carrasco-Navarro, Emmanuel Lopez-Neri, Jesús Antonio Nava-Pintor, Fabián García-Vázquez and Héctor A. Guerrero-Osuna
Technologies 2025, 13(9), 404; https://doi.org/10.3390/technologies13090404 - 5 Sep 2025
Viewed by 528
Abstract
Robots with parallelogram mechanisms are widely employed in industrial applications due to their mechanical rigidity and precise motion control. However, the analytical definition of feasible workspace regions free from self-collisions remains an open challenge, especially considering the nonlinear and composite nature of such regions. This work introduces a mathematical model grounded in a collision theorem that formalizes boundary functions based on joint variables and geometric constraints. These functions explicitly define the envelope of safe configurations by evaluating relative positions between critical structural components. Using the MinervaBotV3 as a case study, the symbolic joint-space boundaries and their corresponding geometric regions in both 2D and 3D are computed and visualized. The feasible region is refined through centroid-based scaling to introduce safety margins and avoid singularities. The results show that this framework enables analytically continuous workspace representations, improving trajectory planning and reliability in constrained environments. Future work will extend this method to spatial mechanisms and real-time implementations in hybrid robotic systems.
(This article belongs to the Special Issue Collaborative Robotics and Human-AI Interactions)
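The boundary-function idea generalizes to any set of joint-space inequality constraints, so an abstract-level sketch follows. The constraint used below is hypothetical, not one of the paper's parallelogram-specific functions, and the centroid-based scaling mirrors the safety-margin step the abstract describes:

```python
# Abstract-level sketch: feasible region from boundary functions g_i(q) <= 0,
# then shrink the region toward its centroid for a safety margin.
import numpy as np

def feasible_mask(q1, q2, boundary_fns):
    """q1, q2: meshgrid joint angles; boundary_fns: list of g(q1, q2) <= 0 constraints."""
    mask = np.ones_like(q1, dtype=bool)
    for g in boundary_fns:
        mask &= g(q1, q2) <= 0.0
    return mask

def scale_toward_centroid(points: np.ndarray, factor: float = 0.9) -> np.ndarray:
    """Shrink boundary points toward their centroid to introduce a safety margin."""
    c = points.mean(axis=0)
    return c + factor * (points - c)

q1, q2 = np.meshgrid(np.linspace(-np.pi, np.pi, 200), np.linspace(-np.pi, np.pi, 200))
# hypothetical self-collision constraint: keep the two links from folding together
mask = feasible_mask(q1, q2, [lambda a, b: np.cos(a - b) - 0.8])
```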

18 pages, 4398 KB  
Article
Connectivity Evaluation of Fracture-Cavity Reservoirs in S91 Unit
by Yunlong Xue, Yinghan Gao and Xiaobo Peng
Appl. Sci. 2025, 15(17), 9738; https://doi.org/10.3390/app15179738 - 4 Sep 2025
Cited by 1 | Viewed by 602
Abstract
Carbonate fracture–cavity reservoirs are significant oil and gas reservoirs globally, and their efficient development is influenced by the connectivity between fracture–cavity units within the reservoir. These reservoirs primarily consist of large caves, dissolution holes, and natural fractures, which serve as the primary storage and flow spaces. The S91 unit of the Tarim Oilfield is a karstic fracture–cavity reservoir with shallow coverage; it exhibits significant heterogeneity in the fracture–cavity reservoirs and complex connectivity between the fracture–cavity bodies. By integrating static and dynamic data, including geology, well logging, seismic, and production dynamics, a set of static and dynamic connectivity evaluation workflows was developed for highly heterogeneous fracture–cavity reservoirs. The methods include using structural gradient tensors and stratigraphic continuity attributes to delineate the boundaries of caves and holes; performing RGB fusion analysis of coherence, curvature, and variance attributes to characterize large-scale fault development features; applying ant-tracking algorithms and fracture simulation techniques to identify the distribution and density characteristics of fracture zones; utilizing 3D visualization technology to describe the spatial relationship between fracture–cavity units, large-scale faults, and fracture development zones; and combining dynamic data to verify interwell connectivity. This workflow provides a key geological basis for optimizing well network deployment, improving water and gas injection efficiency, predicting residual oil distribution, and formulating adjustment measures, thereby improving the development efficiency of such complex reservoirs.
(This article belongs to the Special Issue Advances in Geophysical Exploration)
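The RGB fusion of seismic attributes amounts to normalizing three co-registered attribute slices and mapping them to color channels; a simplified sketch follows, assuming the coherence, curvature, and variance volumes have already been computed in interpretation software:

```python
# Simplified RGB attribute-fusion sketch over a single horizon slice.
import numpy as np

def norm01(a: np.ndarray) -> np.ndarray:
    return (a - a.min()) / (a.max() - a.min() + 1e-8)

def rgb_fuse(coherence: np.ndarray, curvature: np.ndarray, variance: np.ndarray) -> np.ndarray:
    """Each input is an (H, W) attribute slice; output is an (H, W, 3) composite in
    which co-located highs across all three attributes appear bright, flagging
    candidate large-scale fault zones."""
    return np.dstack([norm01(coherence), norm01(curvature), norm01(variance)])
```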

23 pages, 3606 KB  
Article
Dual-Stream Attention-Enhanced Memory Networks for Video Anomaly Detection
by Weishan Gao, Xiaoyin Wang, Ye Wang and Xiaochuan Jing
Sensors 2025, 25(17), 5496; https://doi.org/10.3390/s25175496 - 4 Sep 2025
Viewed by 1021
Abstract
Weakly supervised video anomaly detection (WSVAD) aims to identify unusual events using only video-level labels. However, current methods face several key challenges, including ineffective modelling of complex temporal dependencies, indistinct feature boundaries between visually similar normal and abnormal events, and high false alarm rates caused by an inability to distinguish salient events from complex background noise. This paper proposes a novel method that systematically enhances feature representation and discrimination to address these challenges. The proposed method first builds robust temporal representations by employing a hierarchical multi-scale temporal encoder and a position-aware global relation network to capture both local and long-range dependencies. The core of this method is the dual-stream attention-enhanced memory network, which achieves precise discrimination by learning distinct normal and abnormal patterns via dual memory banks, while utilising bidirectional spatial attention to mitigate background noise and focus on salient events before memory querying. The models underwent a comprehensive evaluation utilising solely RGB features on two demanding public datasets, UCF-Crime and XD-Violence. The experimental findings indicate that the proposed method attains state-of-the-art performance, achieving 87.43% AUC on UCF-Crime and 85.51% AP on XD-Violence. This result demonstrates that the proposed "attention-guided prototype matching" paradigm effectively resolves the aforementioned challenges, enabling robust and precise anomaly detection.
(This article belongs to the Section Sensing and Imaging)
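The memory-bank querying at the core of the method can be sketched as cosine-similarity attention over learned prototypes. The single bank and sizes below are simplifying assumptions; the paper uses separate normal and abnormal banks plus bidirectional spatial attention before querying:

```python
# Sketch of prototype-matching memory read for snippet features (PyTorch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MemoryBank(nn.Module):
    def __init__(self, num_items: int = 60, dim: int = 512):
        super().__init__()
        self.items = nn.Parameter(torch.randn(num_items, dim))  # learned prototypes

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        """feats: (B, T, D) snippet features -> memory-augmented features (B, T, 2D)."""
        sim = F.normalize(feats, dim=-1) @ F.normalize(self.items, dim=-1).t()
        read = torch.softmax(sim, dim=-1) @ self.items          # weighted prototype read
        return torch.cat([feats, read], dim=-1)

out = MemoryBank()(torch.randn(2, 32, 512))                     # -> (2, 32, 1024)
```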

15 pages, 1292 KB  
Article
Lightweight Semantic Segmentation for AGV Navigation: An Enhanced ESPNet-C with Dual Attention Mechanisms
by Jianqi Shu, Xiang Yan, Wen Liu, Haifeng Gong, Jingtai Zhu and Mengdie Yang
Electronics 2025, 14(17), 3524; https://doi.org/10.3390/electronics14173524 - 3 Sep 2025
Viewed by 572
Abstract
Efficient navigation of Automated Guided Vehicles (AGVs) in dynamic warehouse environments requires real-time, accurate path segmentation algorithms. However, traditional semantic segmentation models suffer from excessive parameters and high computational costs, limiting their deployment on resource-constrained embedded platforms. A lightweight image segmentation algorithm is proposed, built on an improved ESPNet-C architecture, combining Spatial Group-wise Enhance (SGE) and Efficient Channel Attention (ECA) with a dual-branch upsampling decoder. On our custom warehouse dataset, the model attains 90.5% mIoU with 0.425 M parameters and runs at ~160 FPS, reducing the parameter count by a factor of 116–136 and computational cost by 70–92% compared with DeepLabV3+. The proposed model improves boundary coherence by 22% under uneven lighting and achieves 90.2% mIoU on the public BDD100K benchmark, demonstrating strong generalization beyond warehouse data. These results highlight its suitability as a real-time visual perception module for AGV navigation in resource-constrained environments and offer practical guidance for designing lightweight semantic segmentation models for embedded applications.
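Of the two attention blocks added, ECA is the simpler to sketch: global average pooling followed by a 1-D convolution across channels, with no dimensionality reduction. The kernel size below is illustrative:

```python
# Standard ECA sketch in PyTorch.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k: int = 3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)

    def forward(self, x):                         # x: (B, C, H, W)
        w = x.mean(dim=(2, 3))                    # squeeze to (B, C)
        w = self.conv(w.unsqueeze(1)).squeeze(1)  # local cross-channel interaction
        return x * torch.sigmoid(w)[..., None, None]

y = ECA()(torch.randn(2, 32, 64, 64))             # same shape, channel-reweighted
```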
