Search Results (277)

Search Parameters:
Keywords = contour attention

16 pages, 2692 KB  
Article
Improved UNet-Based Detection of 3D Cotton Cup Indentations and Analysis of Automatic Cutting Accuracy
by Lin Liu, Xizhao Li, Hongze Lv, Jianhuang Wang, Fucai Lai, Fangwei Zhao and Xibing Li
Processes 2025, 13(10), 3144; https://doi.org/10.3390/pr13103144 - 30 Sep 2025
Abstract
With the advancement of intelligent technology and the rise in labor costs, manual identification and cutting of 3D cotton cup indentations can no longer meet modern demands. The increasing variety of 3D cotton cup shapes driven by personalized requirements makes fixed cutting molds inefficient, leading to a large number of molds and high costs. Therefore, this paper proposes a UNet-based indentation segmentation algorithm to automatically extract 3D cotton cup indentation data. By incorporating the VGG16 network and the Leaky-ReLU activation function into the UNet model, the method improves the model’s generalization capability, convergence speed, and detection speed, and reduces the risk of overfitting. Additionally, attention mechanisms and an Atrous Spatial Pyramid Pooling (ASPP) module are introduced to enhance the network’s spatial feature extraction ability. Experiments conducted on a self-made 3D cotton cup dataset demonstrate a precision of 99.53%, a recall of 99.69%, an mIoU of 99.18%, and an mPA of 99.73%, meeting practical application requirements. The extracted 3D cotton cup indentation contour data are automatically input into an intelligent CNC cutting machine to cut 3D cotton cups. The cutting results over 400 data points show an error of 0.20 mm ± 0.42 mm, meeting the cutting accuracy requirements for flexible-material 3D cotton cups. This study may serve as a reference for machine vision, image segmentation, improvements to deep learning architectures, and automated cutting machinery for flexible materials such as fabrics. Full article
(This article belongs to the Section Automation Control Systems)
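The headline numbers in this abstract (precision, recall, mIoU, mPA) are standard segmentation metrics. A minimal numpy sketch of how mIoU and mPA fall out of predicted and ground-truth label maps — the tiny arrays below are illustrative, not data from the paper:

```python
import numpy as np

def seg_metrics(pred, gt, num_classes=2):
    """Per-class IoU and per-class pixel accuracy from integer label maps,
    averaged over classes to give mIoU and mPA."""
    ious, accs = [], []
    for c in range(num_classes):
        p, g = (pred == c), (gt == c)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        ious.append(inter / union if union else 1.0)     # IoU for class c
        accs.append(inter / g.sum() if g.sum() else 1.0)  # pixel accuracy for class c
    return float(np.mean(ious)), float(np.mean(accs))

pred = np.array([[0, 1], [1, 1]])   # toy 2x2 prediction
gt   = np.array([[0, 1], [0, 1]])   # toy 2x2 ground truth
miou, mpa = seg_metrics(pred, gt)
```

The same definitions scale directly to full-resolution indentation masks.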

21 pages, 7001 KB  
Article
CGNet: Remote Sensing Instance Segmentation Method Using Contrastive Language–Image Pretraining and Gated Recurrent Units
by Hui Zhang, Zhao Tian, Zhong Chen, Tianhang Liu, Xueru Xu, Junsong Leng and Xinyuan Qi
Remote Sens. 2025, 17(19), 3305; https://doi.org/10.3390/rs17193305 - 26 Sep 2025
Abstract
Instance segmentation in remote sensing imagery is a significant application area within computer vision, holding considerable value in fields such as land planning and aerospace. Targets in remote sensing images are often small in scale, the contours of different target categories can be remarkably similar, and the background is complex, containing considerable noise interference. It is therefore essential for the network model to use background and internal instance information more effectively. To fully adapt to these characteristics of remote sensing images, a network named CGNet is proposed, combining an enhanced backbone with a contour–mask branch. The network employs gated recurrent units to iterate the contour and mask branches and adopts an attention head for branch fusion. Additionally, to address the common issues of missed detections and misdetections, a supervised backbone network using contrastive pretraining for feature supplementation is introduced. The proposed method has been experimentally validated on the NWPU VHR-10 and SSDD datasets, achieving average precision metrics of 68.1% and 67.4%, respectively, which are 0.9% and 3.2% higher than those of the suboptimal methods. Full article
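CGNet's branch iteration relies on gated recurrent units. A minimal numpy sketch of one GRU update, as a generic illustration of the mechanism — the weights, dimensions, and number of refinement steps below are invented, not the paper's:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One gated-recurrent-unit update: gates decide how much of the previous
    state (e.g. a running contour/mask estimate) to keep versus refresh."""
    z = sigmoid(x @ Wz + h @ Uz)              # update gate
    r = sigmoid(x @ Wr + h @ Ur)              # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)  # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d = 4
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(6)]
h = np.zeros(d)
for _ in range(3):  # iterative refinement, in the spirit of CGNet's branches
    h = gru_step(rng.standard_normal(d), h, *params)
```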

12 pages, 2144 KB  
Article
Microvascular ALT-Flap Reconstruction for Distal Forearm and Hand Defects: Outcomes and Single-Case Application of a Bone-Anchored Venous Anastomosis
by Adrian Matthias Vater, Matthias Michael Aitzetmüller-Klietz, Philipp Edmund Lamby, Julia Stanger, Rainer Meffert, Karsten Schmidt, Michael Georg Jakubietz and Rafael Gregor Jakubietz
J. Clin. Med. 2025, 14(19), 6807; https://doi.org/10.3390/jcm14196807 - 26 Sep 2025
Abstract
Background: Reconstruction of distal forearm and hand soft tissue defects remains a complex surgical challenge due to the functional and aesthetic significance of the region. Several flap options have been established, such as the posterior interosseous artery flap (PIA) or temporalis fascia flap (TFF), yet the anterolateral thigh flap (ALT) has gained increasing attention for its versatility and favorable risk profile. Methods: We retrospectively analyzed 12 patients (7 males, 5 females; mean age 51.8 years) who underwent free microvascular ALT reconstruction for distal forearm and hand defects between May 2020 and May 2025. Etiologies included infection, chemical burns, explosion injuries, and traffic accidents. The mean defect size was 75.4 cm², and the average operative time was 217 min. Secondary flap thinning was performed in eight cases. In one patient without available recipient veins, a pedicle vein was anastomosed using a coupler device anchored into a cortical window of the distal radius to establish venous outflow via the bone marrow. Results: All flaps demonstrated complete survival with successful integration. Minor complications included transient venous congestion in one case and superficial wound dehiscence in four cases. Functional outcomes were favorable, with postoperative hand function rated as very good in 10 of 12 patients at follow-up. The bone-anchored venous anastomosis provided effective venous drainage in the salvage case. Conclusions: The free microvascular ALT is a reliable and highly adaptable method for distal forearm and hand reconstruction. It provides excellent soft tissue coverage, allows for secondary contouring, and achieves both functional and aesthetic goals. Furthermore, intraosseous venous anastomosis using a coupler device might represent a novel adjunct expanding reconstructive options in cases with absent or unusable recipient veins. Full article
(This article belongs to the Special Issue Microsurgery: Current and Future Challenges)

21 pages, 4721 KB  
Article
Automated Brain Tumor MRI Segmentation Using ARU-Net with Residual-Attention Modules
by Erdal Özbay and Feyza Altunbey Özbay
Diagnostics 2025, 15(18), 2326; https://doi.org/10.3390/diagnostics15182326 - 13 Sep 2025
Abstract
Background/Objectives: Accurate segmentation of brain tumors in Magnetic Resonance Imaging (MRI) scans is critical for diagnosis and treatment planning due to their life-threatening nature. This study aims to develop a robust and automated method capable of precisely delineating heterogeneous tumor regions while improving segmentation accuracy and generalization. Methods: We propose Attention Res-UNet (ARU-Net), a novel Deep Learning (DL) architecture integrating residual connections, Adaptive Channel Attention (ACA), and Dimensional-space Triplet Attention (DTA) modules. The encoding module efficiently extracts and refines relevant feature information by applying ACA to the lower layers of convolutional and residual blocks. The DTA is fixed to the upper layers of the decoding module, decoupling channel weights to better extract and fuse multi-scale features, enhancing both performance and efficiency. Input MRI images are pre-processed using Contrast Limited Adaptive Histogram Equalization (CLAHE) for contrast enhancement, denoising filters, and Linear Kuwahara filtering to preserve edges while smoothing homogeneous regions. The network is trained using categorical cross-entropy loss with the Adam optimizer on the BTMRII dataset, and comparative experiments are conducted against baseline U-Net, DenseNet121, and Xception models. Performance is evaluated using accuracy, precision, recall, F1-score, Dice Similarity Coefficient (DSC), and Intersection over Union (IoU) metrics. Results: Baseline U-Net showed significant performance gains after adding residual connections and ACA modules, with DSC improving by approximately 3.3%, accuracy by 3.2%, IoU by 7.7%, and F1-score by 3.3%. ARU-Net further enhanced segmentation performance, achieving 98.3% accuracy, 98.1% DSC, 96.3% IoU, and a superior F1-score, representing additional improvements of 1.1–2.0% over the U-Net + Residual + ACA variant. Visualizations confirmed smoother boundaries and more precise tumor contours across all six tumor classes, highlighting ARU-Net’s ability to capture heterogeneous tumor structures and fine structural details more effectively than both baseline U-Net and other conventional DL models. Conclusions: ARU-Net, combined with an effective pre-processing strategy, provides a highly reliable and precise solution for automated brain tumor segmentation. Its improvements across multiple evaluation metrics over U-Net and other conventional models highlight its potential for clinical application and contribute novel insights to medical image analysis research. Full article
(This article belongs to the Special Issue Advances in Functional and Structural MR Image Analysis)
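Channel attention of the kind ACA performs is, at its core, reweighting feature channels from pooled global context. A squeeze-and-excitation-style numpy sketch — a generic stand-in, not the paper's exact ACA design, with illustrative weights:

```python
import numpy as np

def channel_attention(feat, W1, W2):
    """Channel reweighting: global average pool per channel, a small bottleneck
    MLP, then a sigmoid gate that scales each channel of the feature map."""
    c = feat.shape[0]
    squeeze = feat.reshape(c, -1).mean(axis=1)                    # (C,) global context
    gate = 1.0 / (1.0 + np.exp(-(np.maximum(squeeze @ W1, 0) @ W2)))  # (C,) in (0, 1)
    return feat * gate[:, None, None]

rng = np.random.default_rng(1)
feat = rng.standard_normal((8, 16, 16))        # C x H x W feature map
W1 = rng.standard_normal((8, 2))               # bottleneck down-projection
W2 = rng.standard_normal((2, 8))               # up-projection back to C channels
out = channel_attention(feat, W1, W2)
```

Because the gate lies in (0, 1), attention can only suppress channels, never amplify them.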

28 pages, 13681 KB  
Article
Occlusion-Aware Caged Chicken Detection Based on Multi-Scale Edge Information Extractor and Context Fusion
by Fei Pan, Fang Huang, Luping Zhang, Huadong Yin, Ying Ruan, Daizhuang Yang and Shuheng Wang
Animals 2025, 15(18), 2669; https://doi.org/10.3390/ani15182669 - 12 Sep 2025
Abstract
Due to the complex environment of caged chicken coops, uneven illumination and severe occlusion in the coops lead to unsatisfactory accuracy of chicken detection. In this study, we construct an image dataset in the production environment of caged chickens using a head and neck co-annotation method and a multi-stage co-enhancement strategy, and we propose Chicken-YOLO, an occlusion-aware caged chicken detection model based on multi-scale edge information extractor and context fusion for the severe occlusion and poor illumination situations. The model enhances chicken feather texture and crown contour features via the multi-scale edge information extractor (MSEIExtractor), optimizes downsampling information retention through integrated context-guided downsampling (CGDown), and improves occlusion perception using the detection head with the multi-scale separation and enhancement attention module (DHMSEAM). Experiments demonstrate that Chicken-YOLO achieves the best detection performance among mainstream models, exhibiting 1.7% and 1.6% improvements in mAP50 and mAP50:95, respectively, over the baseline model YOLO11n. Moreover, the improved model achieves higher mAP50 than the superior YOLO11s while using only 58.8% of its parameters and 42.3% of its computational cost. On the two specialized test sets—one for poor illumination cases and the other for multiple occlusion cases—Chicken-YOLO’s performance improves significantly, with mAP50 increasing by 3.0% and 1.8%, respectively. This suggests that the model enhances target capture capability under poor illumination and maintains better contour continuity in occlusion cases, verifying its robustness against complex disturbances. Full article
(This article belongs to the Section Animal System and Management)
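Edge-information extraction of the kind the MSEIExtractor performs builds on classical gradient operators. A plain single-scale Sobel magnitude in numpy — the paper's module is multi-scale and learned, so this shows only the underlying building block on a synthetic step edge:

```python
import numpy as np

def sobel_edges(img):
    """Gradient-magnitude edge map via 3x3 Sobel kernels (valid convolution)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)  # horizontal gradient
    ky = kx.T                                                   # vertical gradient
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.hypot(gx, gy)

img = np.zeros((8, 8))
img[:, 4:] = 1.0                 # synthetic vertical step edge
edges = sobel_edges(img)         # strong response only near the step
```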

23 pages, 10375 KB  
Article
Extraction of Photosynthetic and Non-Photosynthetic Vegetation Cover in Typical Grasslands Using UAV Imagery and an Improved SegFormer Model
by Jie He, Xiaoping Zhang, Weibin Li, Du Lyu, Yi Ren and Wenlin Fu
Remote Sens. 2025, 17(18), 3162; https://doi.org/10.3390/rs17183162 - 12 Sep 2025
Abstract
Accurate monitoring of the coverage and distribution of photosynthetic (PV) and non-photosynthetic vegetation (NPV) in the grasslands of semi-arid regions is crucial for understanding the environment and addressing climate change. However, the extraction of PV and NPV information from Unmanned Aerial Vehicle (UAV) remote sensing imagery is often hindered by challenges such as low extraction accuracy and blurred boundaries. To overcome these limitations, this study proposed an improved semantic segmentation model, designated SegFormer-CPED. The model was developed based on the SegFormer architecture, incorporating several synergistic optimizations. Specifically, a Convolutional Block Attention Module (CBAM) was integrated into the encoder to enhance early-stage feature perception, while a Polarized Self-Attention (PSA) module was embedded to strengthen contextual understanding and mitigate semantic loss. An Edge Contour Extraction Module (ECEM) was introduced to refine boundary details. Concurrently, the Dice Loss function was employed to replace the Cross-Entropy Loss, thereby more effectively addressing the class imbalance issue and significantly improving both the segmentation accuracy and boundary clarity of PV and NPV. To support model development, a high-quality PV and NPV segmentation dataset for the Hengshan grassland was also constructed. Comprehensive experimental results demonstrated that the proposed SegFormer-CPED model achieved state-of-the-art performance, with an mIoU of 93.26% and an F1-score of 96.44%. It significantly outperformed classic architectures and surpassed all leading frameworks benchmarked here. Its high-fidelity maps can bridge field surveys and satellite remote sensing. Ablation studies verified the effectiveness of each improved module and its synergistic interplay. Moreover, this study successfully utilized SegFormer-CPED to perform fine-grained monitoring of the spatiotemporal dynamics of PV and NPV in the Hengshan grassland, confirming that the model-estimated fPV and fNPV were highly correlated with ground survey data. The proposed SegFormer-CPED model provides a robust and effective solution for the precise, semi-automated extraction of PV and NPV from high-resolution UAV imagery. Full article
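The switch from Cross-Entropy to Dice Loss matters because Dice normalises by region size. A minimal numpy version showing why it behaves well when a class (e.g. sparse NPV pixels) is rare — the arrays are synthetic:

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss: 1 - 2|P∩G| / (|P| + |G|). Because both terms are sums
    over the class region, a rare class contributes on equal footing with a
    dominant background class, unlike plain cross-entropy."""
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

target = np.zeros(100)
target[:5] = 1.0            # rare foreground: only 5% of pixels
good = target.copy()        # perfect prediction -> loss near 0
bad = np.zeros(100)         # "all background" prediction -> loss near 1
```

A pixel-wise accuracy of 95% would rate `bad` highly; Dice correctly penalises it.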

18 pages, 2065 KB  
Article
Phoneme-Aware Augmentation for Robust Cantonese ASR Under Low-Resource Conditions
by Lusheng Zhang, Shie Wu and Zhongxun Wang
Symmetry 2025, 17(9), 1478; https://doi.org/10.3390/sym17091478 - 8 Sep 2025
Abstract
Cantonese automatic speech recognition (ASR) faces persistent challenges due to its nine lexical tones, extensive phonological variation, and the scarcity of professionally transcribed corpora. To address these issues, we propose a lightweight and data-efficient framework that leverages weak phonetic supervision (WPS) in conjunction with two phoneme-aware augmentation strategies. (1) Dynamic Boundary-Aligned Phoneme Dropout progressively removes entire IPA segments according to a curriculum schedule, simulating real-world phenomena such as elision, lenition, and tonal drift while ensuring training stability. (2) Phoneme-Aware SpecAugment confines all time- and frequency-masking operations within phoneme boundaries and prioritizes high-attention regions, thereby preserving intra-phonemic contours and formant integrity. Built on the Whistle encoder—which integrates a Conformer backbone, Connectionist Temporal Classification–Conditional Random Field (CTC-CRF) alignment, and a multi-lingual phonetic space—the approach requires only a grapheme-to-phoneme lexicon and Montreal Forced Aligner outputs, without any additional manual labeling. Experiments on the Cantonese subset of Common Voice demonstrate consistent gains: Dynamic Dropout alone reduces phoneme error rate (PER) from 17.8% to 16.7% with 50 h of speech and 16.4% to 15.1% with 100 h, while the combination of the two augmentations further lowers PER to 15.9%/14.4%. These results confirm that structure-aware phoneme-level perturbations provide an effective and low-cost solution for building robust Cantonese ASR systems under low-resource conditions. Full article
(This article belongs to the Section Computer)
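The key constraint in Phoneme-Aware SpecAugment is that masks never cross phoneme boundaries. A simplified numpy sketch under that constraint — the attention-based prioritisation of high-attention regions is omitted, and the segment spans and mask sizes are invented:

```python
import numpy as np

def phoneme_masked_specaugment(spec, boundaries, rng, max_frac=0.5):
    """Time masking confined to phoneme segments: each mask is drawn entirely
    inside one [start, end) frame span, so no mask straddles a boundary and
    intra-phonemic contours on either side stay intact."""
    out = spec.copy()
    for start, end in boundaries:
        seg_len = end - start
        width = rng.integers(0, int(seg_len * max_frac) + 1)  # mask width per segment
        if width:
            t0 = rng.integers(start, end - width + 1)         # fits inside the segment
            out[:, t0:t0 + width] = 0.0
    return out

rng = np.random.default_rng(0)
spec = np.ones((80, 30))  # mel bins x frames (dummy spectrogram)
masked = phoneme_masked_specaugment(spec, [(0, 10), (10, 22), (22, 30)], rng)
```

In practice the boundary spans would come from forced-alignment output such as the Montreal Forced Aligner mentioned in the abstract.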

21 pages, 8753 KB  
Article
PowerStrand-YOLO: A High-Voltage Transmission Conductor Defect Detection Method for UAV Aerial Imagery
by Zhenrong Deng, Jun Li, Junjie Huang, Shuaizheng Jiang, Qiuying Wu and Rui Yang
Mathematics 2025, 13(17), 2859; https://doi.org/10.3390/math13172859 - 4 Sep 2025
Abstract
Broken or loose strands in high-voltage transmission conductors constitute critical defects that jeopardize grid reliability. Unmanned aerial vehicle (UAV) inspection has become indispensable for their timely discovery; however, conventional detectors falter in the face of cluttered backgrounds and the conductors’ diminutive pixel footprint, yielding sub-optimal accuracy and throughput. To overcome these limitations, we present PowerStrand-YOLO—an enhanced YOLOv8 derivative tailored for UAV imagery. The method is trained on a purpose-built dataset and integrates three technical contributions. (1) A C2f_DCNv4 module is introduced to strengthen multi-scale feature extraction. (2) An EMA attention mechanism is embedded to suppress background interference and emphasize defect-relevant cues. (3) The original loss function is superseded by Shape-IoU, compelling the network to attend closely to the geometric contours and spatial layout of strand anomalies. Extensive experiments demonstrate 95.4% precision, 96.2% recall, and 250 FPS. Relative to the baseline YOLOv8, PowerStrand-YOLO improves precision by 3% and recall by 6.8% while accelerating inference. Moreover, it also demonstrates competitive performance on the VisDrone2019 dataset. These results establish the improved framework as a more accurate and efficient solution for UAV-based inspection of power transmission lines. Full article
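Shape-IoU extends plain box IoU with shape- and scale-aware terms; the base quantity it starts from, in pure Python (toy boxes, not from the dataset):

```python
def box_iou(a, b):
    """Intersection-over-union for axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])   # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])   # intersection bottom-right
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

iou = box_iou((0, 0, 2, 2), (1, 0, 3, 2))  # two 2x2 boxes overlapping by half
```

Plain IoU is zero for disjoint boxes regardless of how far apart they are, which is one motivation for the shape- and distance-aware variants used as losses.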

18 pages, 4265 KB  
Article
Hybrid-Recursive-Refinement Network for Camouflaged Object Detection
by Hailong Chen, Xinyi Wang and Haipeng Jin
J. Imaging 2025, 11(9), 299; https://doi.org/10.3390/jimaging11090299 - 2 Sep 2025
Abstract
Camouflaged object detection (COD) seeks to precisely detect and delineate objects that are concealed within complex and ambiguous backgrounds. However, due to subtle texture variations and semantic ambiguity, it remains a highly challenging task. Existing methods that rely solely on either convolutional neural network (CNN) or Transformer architectures often suffer from incomplete feature representations and the loss of boundary details. To address the aforementioned challenges, we propose an innovative hybrid architecture that synergistically leverages the strengths of CNNs and Transformers. In particular, we devise a Hybrid Feature Fusion Module (HFFM) that harmonizes hierarchical features extracted from CNN and Transformer pathways, ultimately boosting the representational quality of the combined features. Furthermore, we design a Combined Recursive Decoder (CRD) that adaptively aggregates hierarchical features through recursive pooling/upsampling operators and stage-wise mask-guided refinement, enabling precise structural detail capture across multiple scales. In addition, we propose a Foreground–Background Selection (FBS) module, which alternates attention between foreground objects and background boundary regions, progressively refining object contours while suppressing background interference. Evaluations on four widely used public COD datasets, CHAMELEON, CAMO, COD10K, and NC4K, demonstrate that our method achieves state-of-the-art performance. Full article
(This article belongs to the Section Computer Vision and Pattern Recognition)

20 pages, 17453 KB  
Article
Generative Denoising Method for Geological Images with Pseudo-Labeled Non-Matching Datasets
by Huan Zhang, Chunlei Wu, Jing Lu and Wenqi Zhao
Appl. Sci. 2025, 15(17), 9620; https://doi.org/10.3390/app15179620 - 1 Sep 2025
Abstract
Accurate prediction of oil and gas reservoirs requires precise river morphology. However, geological sedimentary images are often degraded by scattered non-structural noise from data errors or printing, which distorts river structures and complicates reservoir interpretation. To address this challenge, we propose GD-PND, a generative framework that leverages pseudo-labeled non-matching datasets to enable geological denoising via information transfer. We first construct a non-matching dataset by deriving pseudo-noiseless images via automated contour delineation and region filling on geological images of varying morphologies, thereby reducing reliance on manual annotation. The proposed style transfer-based generative model for noiseless images employs cyclic training with dual generators and discriminators to transform geological images into outputs with well-preserved river structures. Within the generator, the excitation networks of global features integrated with multi-attention mechanisms can enhance the representation of overall river morphology, enabling preliminary denoising. Furthermore, we develop an iterative denoising enhancement module that performs comprehensive refinement through recursive multi-step pixel transformations and associated post-processing, operating independently of the model. Extensive visualizations confirm intact river courses, while quantitative evaluations show that GD-PND achieves slight improvements, with the chi-squared mean increasing by up to 466.0 (approximately 1.93%), significantly enhancing computational efficiency and demonstrating its superiority. Full article

10 pages, 505 KB  
Article
Gaze Dispersion During a Sustained-Fixation Task as a Proxy of Visual Attention in Children with ADHD
by Lionel Moiroud, Ana Moscoso, Eric Acquaviva, Alexandre Michel, Richard Delorme and Maria Pia Bucci
Vision 2025, 9(3), 76; https://doi.org/10.3390/vision9030076 - 1 Sep 2025
Abstract
Aim: The aim of this preliminary study was to explore visual attention in children with ADHD using eye-tracking and to identify a relevant quantitative proxy of their attentional control. Methods: Twenty-two children diagnosed with ADHD (aged 7 to 12 years) and 24 sex- and age-matched control participants with typical development performed a visual sustained-fixation task using an eye-tracker. Fixation stability was estimated by calculating the bivariate contour ellipse area (BCEA) as a continuous index of gaze dispersion during the task. Results: Children with ADHD showed a significantly higher BCEA than control participants (p < 0.001), reflecting increased gaze instability. The impairment in gaze fixation persisted even in the absence of visual distractors, suggesting intrinsic attentional dysregulation in ADHD. Conclusions: Our results provide preliminary evidence that eye-tracking, coupled with BCEA analysis, provides a sensitive and non-invasive tool for quantifying the visual attentional resources of children with ADHD. If replicated and extended, gaze instability as an indicator of visual attention in children could have a major impact in clinical settings, assisting clinicians. This analysis focuses on overall gaze dispersion rather than fine eye micro-movements such as microsaccades. Full article
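BCEA summarises gaze dispersion as the area of an ellipse covering a given proportion of fixation samples under a bivariate-normal assumption. A numpy sketch using one common parameterisation, k = −ln(1 − p) — other conventions exist, and the gaze data here is synthetic:

```python
import numpy as np

def bcea(x, y, p=0.682):
    """Bivariate contour ellipse area of gaze samples x, y (degrees): the area
    of the ellipse covering proportion p of fixations. Larger BCEA means less
    stable gaze. Assumes bivariate-normal dispersion with k = -ln(1 - p)."""
    k = -np.log(1.0 - p)
    sx, sy = np.std(x, ddof=1), np.std(y, ddof=1)   # horizontal/vertical SD
    rho = np.corrcoef(x, y)[0, 1]                    # horizontal-vertical correlation
    return 2.0 * k * np.pi * sx * sy * np.sqrt(1.0 - rho ** 2)

rng = np.random.default_rng(0)
steady = bcea(rng.normal(0, 0.2, 500), rng.normal(0, 0.2, 500))   # stable fixation
unsteady = bcea(rng.normal(0, 1.0, 500), rng.normal(0, 1.0, 500)) # dispersed gaze
```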

14 pages, 13449 KB  
Article
Multi-View Edge Attention Network for Fine-Grained Food Image Segmentation
by Chengxu Liu, Guorui Sheng, Weiqing Min, Xiaojun Wu and Shuqiang Jiang
Foods 2025, 14(17), 3016; https://doi.org/10.3390/foods14173016 - 28 Aug 2025
Abstract
Precisely identifying and delineating food regions automatically from images, a task known as food image segmentation, is crucial for enabling applications in food science such as automated dietary logging, accurate nutritional analysis, and food safety monitoring. However, accurately segmenting food images, particularly delineating food edges with precision, remains challenging due to the wide variety and diverse forms of food items, frequent inter-food occlusion, and ambiguous boundaries between food and backgrounds or containers. To overcome these challenges, we proposed a novel method called the Multi-view Edge Attention Network (MVEANet), which focuses on enhancing the fine-grained segmentation of food edges. The core idea behind this method is to integrate information obtained from observing food from different perspectives to achieve a more comprehensive understanding of its shape and specifically to strengthen the processing capability for food contour details. Rigorous testing on two large public food image datasets, FoodSeg103 and UEC-FoodPIX Complete, demonstrates that MVEANet surpasses existing state-of-the-art methods in segmentation accuracy, performing exceptionally well in depicting clear and precise food boundaries. This work provides the field of food science with a more accurate and reliable tool for automated food image segmentation, offering strong technical support for the development of more intelligent dietary assessment, nutritional research, and health management systems. Full article
(This article belongs to the Special Issue Food Computing-Enabled Precision Nutrition)

12 pages, 2172 KB  
Article
Instance Segmentation Method for Insulators in Complex Backgrounds Based on Improved SOLOv2
by Ze Chen, Yangpeng Ji, Xiaodong Du, Shaokang Zhao, Zhenfei Huo and Xia Fang
Sensors 2025, 25(17), 5318; https://doi.org/10.3390/s25175318 - 27 Aug 2025
Abstract
To precisely delineate the contours of insulators in complex transmission line images obtained from Unmanned Aerial Vehicle (UAV) inspections and thereby facilitate subsequent defect analysis, this study proposes an instance segmentation framework predicated upon an enhanced SOLOv2 model. The proposed framework integrates a preprocessed edge channel, generated through the Non-Subsampled Contourlet Transform (NSCT), which augments the model’s capability to accurately capture the edges of insulators. Moreover, the input image resolution to the network is heightened to 1200 × 1600, permitting more detailed extraction of edges. Rather than the original ResNet + FPN architecture, the improved HRNet is utilized as the backbone to effectively harness multi-scale feature information, thereby enhancing the model’s overall efficacy. In response to the increased input size, there is a reduction in the network’s channel count, concurrent with an increase in the number of layers, ensuring an adequate receptive field without substantially escalating network parameters. Additionally, a Convolutional Block Attention Module (CBAM) is incorporated to refine mask quality and augment object detection precision. Furthermore, to bolster the model’s robustness and minimize annotation demands, a virtual dataset is crafted utilizing the fourth-generation Unreal Engine (UE4). Empirical results reveal that the proposed framework exhibits superior performance, with AP0.50 (90.21%), AP0.75 (83.34%), and AP[0.50:0.95] (67.26%) on a test set consisting of images supplied by the power grid. This framework surpasses existing methodologies and contributes significantly to the advancement of intelligent transmission line inspection. Full article
(This article belongs to the Special Issue Recent Trends and Advances in Intelligent Fault Diagnostics)

18 pages, 7165 KB  
Article
Dual-Path Enhanced YOLO11 for Lightweight Instance Segmentation with Attention and Efficient Convolution
by Qin Liao, Jianjun Chen, Fei Wang, Md Harun Or Rashid, Taihua Xu and Yan Fan
Electronics 2025, 14(17), 3389; https://doi.org/10.3390/electronics14173389 - 26 Aug 2025
Viewed by 790
Abstract
Instance segmentation stands as a foundational technology in real-world applications such as autonomous driving, where the inherent trade-off between accuracy and computational efficiency remains a key barrier to practical deployment. To tackle this challenge, we propose a dual-path enhanced framework based on YOLO11l. In this framework, two improved models, YOLO-SA and YOLO-SD, are developed to enable high-performance lightweight instance segmentation. The core innovation lies in balancing precision and efficiency through targeted architectural advancements. For YOLO-SA, we embed the parameter-free SimAM attention mechanism into the C3k2 module, yielding a novel C3k2SA structure. This design leverages neural inhibition principles to dynamically enhance focus on critical regions (e.g., object contours and semantic key points) without adding to model complexity. For YOLO-SD, we replace standard backbone convolutions with lightweight SPD-Conv layers (featuring spatial awareness) and adopt DySample in place of nearest-neighbor interpolation in the upsampling path. This dual modification minimizes information loss during feature propagation while accelerating feature extraction, directly optimizing computational efficiency. Experimental validation on the Cityscapes dataset demonstrates the effectiveness of our approach: YOLO-SA increases mAP from 0.401 to 0.410 with negligible overhead; YOLO-SD achieves a slight mAP improvement over the baseline while reducing parameters by approximately 5.7% and computational cost by 1.06%. These results confirm that our dual-path enhancements effectively reconcile accuracy and efficiency, offering a practical, lightweight solution tailored for resource-constrained real-world scenarios. Full article
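The two mechanisms behind YOLO-SA and YOLO-SD can be sketched numerically. Below is a NumPy rendition of the parameter-free SimAM gate (an energy-based per-pixel attention weighting) and of the space-to-depth rearrangement that underlies SPD-Conv; the function names and the (C, H, W) layout are assumptions for illustration, not the authors' code.

```python
import numpy as np

def simam(x, e_lambda=1e-4):
    """Parameter-free SimAM attention over a (C, H, W) feature map."""
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)
    d = (x - mu) ** 2                          # per-pixel deviation energy
    v = d.sum(axis=(1, 2), keepdims=True) / n  # per-channel variance estimate
    e_inv = d / (4 * (v + e_lambda)) + 0.5     # inverse energy: higher = more salient
    return x * (1.0 / (1.0 + np.exp(-e_inv)))  # sigmoid gate, no learned parameters

def space_to_depth(x, s=2):
    """SPD rearrangement: (C, H, W) -> (C*s*s, H/s, W/s) with no information loss."""
    c, h, w = x.shape
    x = x.reshape(c, h // s, s, w // s, s)
    return x.transpose(0, 2, 4, 1, 3).reshape(c * s * s, h // s, w // s)
```

A non-strided convolution applied after `space_to_depth` halves spatial resolution while keeping every input pixel, which is the information-preserving property SPD-Conv relies on, in contrast to strided convolution that discards fine detail.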
(This article belongs to the Special Issue Knowledge Representation and Reasoning in Artificial Intelligence)

24 pages, 7604 KB  
Article
Ginseng-YOLO: Integrating Local Attention, Efficient Downsampling, and Slide Loss for Robust Ginseng Grading
by Yue Yu, Dongming Li, Shaozhong Song, Haohai You, Lijuan Zhang and Jian Li
Horticulturae 2025, 11(9), 1010; https://doi.org/10.3390/horticulturae11091010 - 25 Aug 2025
Viewed by 886
Abstract
Understory-cultivated Panax ginseng possesses high pharmacological and economic value; however, its visual quality grading predominantly relies on subjective manual assessment, constraining industrial scalability. To address challenges including fine-grained morphological variations, boundary ambiguity, and complex natural backgrounds, this study proposes Ginseng-YOLO, a lightweight and deployment-friendly object detection model for automated ginseng grade classification. The model is built on the YOLOv11n (You Only Look Once, version 11 nano) framework and integrates three complementary components: (1) C2-LWA, a cross-stage local window attention module that enhances discrimination of key visual features, such as primary root contours and fibrous textures; (2) ADown, a non-parametric downsampling mechanism that substitutes convolution operations with parallel pooling, markedly reducing computational complexity; and (3) Slide Loss, a piecewise IoU-weighted loss function designed to emphasize learning from samples with ambiguous or irregular boundaries. Experimental results on a curated multi-grade ginseng dataset indicate that Ginseng-YOLO achieves a Precision of 84.9%, a Recall of 83.9%, and an mAP@50 of 88.7%, outperforming YOLOv11n and other state-of-the-art variants. The model maintains a compact footprint, with 2.0 M parameters, 5.3 GFLOPs, and a 4.6 MB model size, supporting real-time deployment on edge devices. Ablation studies further confirm the synergistic contributions of the proposed modules in enhancing feature representation, architectural efficiency, and training robustness. Successful deployment on the NVIDIA Jetson Nano demonstrates practical real-time inference capability under limited computational resources. This work provides a scalable approach for intelligent grading of forest-grown ginseng and offers methodological insights for the design of lightweight models in medicinal plant and agricultural applications. Full article
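The Slide Loss component reweights training samples by their IoU relative to a mean IoU μ, so that hard samples near the boundary between positives and negatives receive extra emphasis. A minimal sketch of one common piecewise form is shown below; the exact thresholds and exponential weights are assumptions, since the abstract does not spell them out.

```python
import math

def slide_weight(iou, mu):
    """Piecewise sample weight as a function of IoU, centered on mean IoU mu.

    Easy positives (high IoU) are down-weighted exponentially; samples just
    below mu get a boosted constant weight; clear negatives keep weight 1.
    The 0.1 margin and exponential form are illustrative assumptions.
    """
    if iou <= mu - 0.1:
        return 1.0                    # clearly low-IoU samples: baseline weight
    elif iou < mu:
        return math.exp(1.0 - mu)     # boundary band: boosted weight > 1
    else:
        return math.exp(1.0 - iou)    # high-IoU samples: weight decays with IoU
```

This weight would multiply the per-sample classification loss, concentrating gradient signal on ambiguous boundary samples rather than on easy ones.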
(This article belongs to the Section Medicinals, Herbs, and Specialty Crops)
