Search Results (192)

Search Parameters:
Keywords = quantitative image quality metrics

12 pages, 5510 KiB  
Article
Image Fusion of High-Resolution DynaCT and T2-Weighted MRI for Image-Guided Programming of dDBS
by Fadil Al-Jaberi, Matthias Moeskes, Martin Skalej, Melanie Fachet and Christoph Hoeschen
Brain Sci. 2025, 15(5), 521; https://doi.org/10.3390/brainsci15050521 - 19 May 2025
Abstract
Objectives: This study aimed to develop a semi-automated registration method for aligning preoperative non-contrast T2-weighted MRI with postoperative high-resolution cone-beam CT (DynaCT) in patients undergoing directional deep brain stimulation (dDBS) surgery targeting the subthalamic nucleus (STN). The aim was to facilitate image-guided programming of DBS devices and postoperative verification of the alignment of segmented contacts. Materials and Methods: A dataset of ten patients undergoing bilateral dDBS implantation was retrospectively collected, including DynaCT (acquired postoperatively) and non-contrast T2-weighted MRI (obtained preoperatively). A semi-automated registration method was used, employing manual initialization due to the dissimilar anatomical information between DynaCT and T2-weighted MRI. Image visualization, initial alignment using a centered transformation initializer, and single-resolution image registration were performed using the Simple Insight Toolkit (SimpleITK) library. Manual alignment based on anatomical landmarks and evaluation metrics such as the Target Registration Error (TRE) were used to assess alignment accuracy. Results: The registration method successfully aligned all images. Quantitative evaluation revealed an average mean TRE of 1.48 mm across all subjects, indicating satisfactory alignment quality. Multiplanar reformations (MPRs) based on electrode-oriented normal vectors visualized the segmented contacts for accurate electrode placement. Conclusions: The developed method demonstrated successful registration between preoperative non-contrast T2-weighted MRI and postoperative DynaCT, despite their dissimilar anatomical information. This approach facilitates the accurate alignment crucial for DBS programming and postoperative verification, potentially reducing DBS programming time. The study underscores the importance of image quality, manual initialization, and semi-automated registration methods for successful multimodal image registration in dDBS procedures targeting the STN.
(This article belongs to the Section Sensory and Motor Neuroscience)
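The registration pipeline described in this abstract (centered transform initialization, single-resolution registration in SimpleITK, landmark-based TRE) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the file names, similarity metric (Mattes mutual information), optimizer settings, and landmark coordinates are all assumed placeholders.

```python
import numpy as np
import SimpleITK as sitk

# Preoperative MRI as fixed image, postoperative DynaCT as moving image (placeholder paths).
fixed = sitk.ReadImage("t2_mri.nii.gz", sitk.sitkFloat32)
moving = sitk.ReadImage("dynact.nii.gz", sitk.sitkFloat32)

# Initial alignment with a centered transform initializer.
initial = sitk.CenteredTransformInitializer(
    fixed, moving, sitk.Euler3DTransform(),
    sitk.CenteredTransformInitializerFilter.GEOMETRY)

# Single-resolution rigid registration; Mattes mutual information is a common
# multimodal similarity metric (the study may have used a different one).
reg = sitk.ImageRegistrationMethod()
reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
reg.SetOptimizerAsRegularStepGradientDescent(learningRate=1.0, minStep=1e-4,
                                             numberOfIterations=200)
reg.SetInterpolator(sitk.sitkLinear)
reg.SetInitialTransform(initial, inPlace=False)
final_tx = reg.Execute(fixed, moving)

# Target Registration Error over manually picked landmark pairs (physical coordinates in mm).
fixed_pts = [(10.0, 20.0, 30.0)]      # placeholder landmarks in MRI space
moving_pts = [(11.0, 19.5, 31.0)]     # corresponding landmarks in DynaCT space
mapped = [final_tx.TransformPoint(p) for p in fixed_pts]
tre = np.mean([np.linalg.norm(np.subtract(m, q)) for m, q in zip(mapped, moving_pts)])
print(f"mean TRE: {tre:.2f} mm")
```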

19 pages, 2577 KiB  
Article
Deep Learning Models for Multi-Part Morphological Segmentation and Evaluation of Live Unstained Human Sperm
by Peiran Lei, Mozafar Saadat, Mahdieh Gol Hassani and Chang Shu
Sensors 2025, 25(10), 3093; https://doi.org/10.3390/s25103093 - 14 May 2025
Viewed by 170
Abstract
To perform accurate computer vision quality assessments of sperm used within reproductive medicine, a clear separation of each sperm component from the background is critical. This study systematically evaluates and compares the performance of Mask R-CNN, YOLOv8, YOLO11, and U-Net in multi-part sperm segmentation, focusing on the head, acrosome, nucleus, neck, and tail. This study conducts a quantitative analysis using a dataset of live, unstained human sperm, employing multiple metrics, including IoU, Dice, Precision, Recall, and F1 Score. The results indicate that Mask R-CNN outperforms other models in segmenting smaller and more regular structures (head, nucleus, and acrosome). In particular, it achieves a slightly higher IoU than YOLOv8 for the nucleus and surpasses YOLO11 for the acrosome, highlighting its robustness. For the neck, YOLOv8 performs comparably to or slightly better than Mask R-CNN, suggesting that single-stage models can rival two-stage models under certain conditions. For the morphologically complex tail, U-Net achieves the highest IoU, demonstrating the advantage of global perception and multi-scale feature extraction. These findings provide insights into model selection for sperm segmentation tasks, facilitating the optimization of segmentation architectures and advancing applications in assisted reproduction and biological image analysis.
(This article belongs to the Section Biosensors)
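As a rough illustration of the overlap metrics listed in this abstract (IoU, Dice, Precision, Recall, F1), a minimal NumPy sketch for one predicted/ground-truth mask pair might look like this; it is not the authors' evaluation code, and the toy masks are placeholders.

```python
import numpy as np

def overlap_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> dict:
    """Per-class overlap metrics for binary masks (True = pixel belongs to the part)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    iou = tp / (tp + fp + fn + eps)
    dice = 2 * tp / (2 * tp + fp + fn + eps)
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return {"IoU": iou, "Dice": dice, "Precision": precision, "Recall": recall, "F1": f1}

# Toy example; in practice one mask pair per part (head, acrosome, nucleus, neck, tail).
pred = np.zeros((64, 64), bool); pred[10:30, 10:30] = True
gt = np.zeros((64, 64), bool); gt[12:32, 12:32] = True
print(overlap_metrics(pred, gt))
```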

18 pages, 8552 KiB  
Article
PID-NET: A Novel Parallel Image-Dehazing Network
by Wei Liu, Yi Zhou, Dehua Zhang and Yi Qin
Electronics 2025, 14(10), 1906; https://doi.org/10.3390/electronics14101906 - 8 May 2025
Viewed by 233
Abstract
Image dehazing is a critical task in image restoration, aiming to retrieve clear images from hazy scenes. This process is vital for various applications, including machine recognition, security monitoring, and aerial photography. Current dehazing algorithms often encounter challenges in multi-scale feature extraction, detail preservation, effective haze removal, and maintaining color fidelity. To address these limitations, this paper introduces a novel Parallel Image-Dehazing Network (PID-Net). PID-Net uniquely combines a Convolutional Neural Network (CNN) for precise local feature extraction and a Vision Transformer (ViT) to capture global contextual information, overcoming the shortcomings of methods relying solely on either local or global features. A multi-scale CNN branch effectively extracts diverse local details through varying receptive fields, thereby enhancing the restoration of fine textures and details. To optimize the ViT component, a lightweight attention mechanism with CNN compensation is integrated, maintaining performance while minimizing the parameter count. Furthermore, a Redundant Feature Filtering Module is incorporated to filter out noise and haze-related artifacts, promoting the learning of subtle details. Our extensive experiments on public datasets demonstrated PID-Net’s significant superiority over state-of-the-art dehazing algorithms in both quantitative metrics and visual quality.

26 pages, 7753 KiB  
Article
Decoupling Urban Street Attractiveness: An Ensemble Learning Analysis of Color and Visual Element Contributions
by Tao Wu, Zeyin Chen, Siying Li, Peixue Xing, Ruhang Wei, Xi Meng, Jingkai Zhao, Zhiqiang Wu and Renlu Qiao
Land 2025, 14(5), 979; https://doi.org/10.3390/land14050979 - 1 May 2025
Viewed by 269
Abstract
Constructing visually appealing public spaces has become an important issue in contemporary urban renewal and design. Existing studies mostly focus on single dimensions (e.g., vegetation ratio), lacking a large-scale integrated analysis of urban color and visual elements. To address this gap, this study employs semantic segmentation and color computation on a massive street-view image dataset encompassing 56 cities worldwide, comparing eight machine learning models in predicting Visual Aesthetic Perception Scores (VAPSs). The results indicate that LightGBM achieves the best overall performance. To unpack this “black-box” prediction, we adopt an interpretable ensemble approach by combining LightGBM with Shapley Additive Explanations (SHAP). SHAP assigns each feature a quantitative contribution to the model’s output, enabling transparent, post hoc explanations of how individual color metrics and visual elements drive VAPS. Our findings suggest that the vegetation ratio contributes the most to VAPS, but once greening surpasses a certain threshold, a “saturation effect” emerges and it can no longer continuously enhance visual appeal. An excessive Sky Visibility Ratio can reduce VAPS. Moderate road visibility may increase spatial layering and vibrancy, whereas overly dense buildings significantly degrade overall aesthetic quality. While keeping the dominant color focused, moderate color saturation and complexity can increase the attractiveness of street views more effectively than overly uniform color schemes. Our research not only offers a comprehensive quantitative basis for urban visual aesthetics, but also underscores the importance of balancing color composition and visual elements, offering practical recommendations for public space planning, design, and color configuration.
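A minimal sketch of the interpretable-ensemble idea described above (LightGBM regression plus SHAP attribution) is shown below; the feature names, synthetic data, and hyperparameters are invented for illustration and do not reflect the study's dataset or model configuration.

```python
import lightgbm as lgb
import numpy as np
import pandas as pd
import shap

# Toy stand-in for the street-view feature table: visual-element ratios and color metrics
# (column names are illustrative only).
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "vegetation_ratio": rng.uniform(0, 0.6, 500),
    "sky_ratio": rng.uniform(0, 0.5, 500),
    "building_ratio": rng.uniform(0, 0.7, 500),
    "color_saturation": rng.uniform(0, 1, 500),
    "color_complexity": rng.uniform(0, 1, 500),
})
y = 2 * X["vegetation_ratio"] - X["building_ratio"] + rng.normal(0, 0.1, 500)  # fake VAPS

model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05).fit(X, y)

# SHAP assigns each feature an additive contribution to every prediction, which is how
# the paper attributes VAPS to individual color metrics and visual elements.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
mean_abs = np.abs(shap_values).mean(axis=0)
for name, v in sorted(zip(X.columns, mean_abs), key=lambda t: -t[1]):
    print(f"{name:>18s}: mean |SHAP| = {v:.3f}")
```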

20 pages, 49431 KiB  
Article
Generative Adversarial Network-Based Lightweight High-Dynamic-Range Image Reconstruction Model
by Gustavo de Souza Ferreti, Thuanne Paixão and Ana Beatriz Alvarez
Appl. Sci. 2025, 15(9), 4801; https://doi.org/10.3390/app15094801 - 25 Apr 2025
Viewed by 309
Abstract
The generation of High-Dynamic-Range (HDR) images is essential for capturing details at various brightness levels, but current reconstruction methods based on deep learning techniques often require significant computational resources, limiting their applicability on devices with moderate resources. In this context, this paper presents a lightweight architecture for reconstructing HDR images from three Low-Dynamic-Range inputs. The proposed model is based on Generative Adversarial Networks and replaces traditional convolutions with depthwise separable convolutions, reducing the number of parameters while maintaining high visual quality and minimizing luminance artifacts. The evaluation of the proposal is conducted through quantitative, qualitative, and computational cost analyses based on the number of parameters and FLOPs. Regarding the qualitative analysis, a comparison between the models was performed using samples that present reconstruction challenges. The proposed model achieves a PSNR-μ of 43.51 dB and an SSIM-μ of 0.9917, delivering quality metrics comparable to HDR-GAN while reducing the computational cost by 6× in FLOPs and 7× in the number of parameters and using approximately half the GPU memory, demonstrating an effective balance between visual fidelity and efficiency.
(This article belongs to the Special Issue Advances in Image Recognition and Processing Technologies)
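The parameter saving from swapping a standard convolution for a depthwise separable one, as this abstract describes, can be checked with a few lines of PyTorch; the channel and kernel sizes below are arbitrary examples, not the model's actual configuration.

```python
import torch.nn as nn

def count_params(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

in_ch, out_ch, k = 64, 64, 3

# Standard convolution, as used in a typical fusion block.
standard = nn.Conv2d(in_ch, out_ch, k, padding=1)

# Depthwise separable replacement: per-channel spatial filtering followed by a 1x1 pointwise mix.
separable = nn.Sequential(
    nn.Conv2d(in_ch, in_ch, k, padding=1, groups=in_ch),  # depthwise
    nn.Conv2d(in_ch, out_ch, 1),                          # pointwise
)

print(count_params(standard), count_params(separable))
# roughly k*k*in*out versus k*k*in + in*out -> a large reduction for 3x3 kernels
```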

23 pages, 14157 KiB  
Article
A Spatial–Frequency Combined Transformer for Cloud Removal of Optical Remote Sensing Images
by Fulian Zhao, Chenlong Ding, Xin Li, Runliang Xia, Caifeng Wu and Xin Lyu
Remote Sens. 2025, 17(9), 1499; https://doi.org/10.3390/rs17091499 - 23 Apr 2025
Viewed by 421
Abstract
Cloud removal is a vital preprocessing step for optical remote sensing images (RSIs), directly enhancing image quality and providing a high-quality data foundation for downstream tasks, such as water body extraction and land cover classification. Existing methods attempt to combine spatial and frequency features for cloud removal, but they rely on shallow feature concatenation or simplistic addition operations, which fail to establish effective cross-domain synergistic mechanisms. These approaches lead to edge blurring and noticeable color distortions. To address this issue, we propose a spatial–frequency collaborative enhancement Transformer network named SFCRFormer, which significantly improves cloud removal performance. The core of SFCRFormer is the spatial–frequency combined Transformer (SFCT) block, which implements cross-domain feature reinforcement through a dual-branch spatial attention (DBSA) module and a frequency self-attention (FreSA) module to effectively capture global context information. The DBSA module enhances the representation of spatial features by decoupling spatial–channel dependencies via parallelized feature refinement paths, surpassing the performance of traditional single-branch attention mechanisms in maintaining the overall structure of the image. FreSA leverages the fast Fourier transform to convert features into the frequency domain, using frequency differences between object and cloud regions to achieve precise cloud detection and fine-grained removal. To further enhance the features extracted by DBSA and FreSA, we design the dual-domain feed-forward network (DDFFN), which effectively improves the detail fidelity of the restored image by multi-scale convolution for local refinement and frequency transformation for global structural optimization. A composite loss function, incorporating Charbonnier loss and Structural Similarity Index (SSIM) loss, is employed to optimize model training and balance pixel-level accuracy with structural fidelity. Experimental evaluations on public datasets demonstrate that SFCRFormer outperforms state-of-the-art methods across various quantitative metrics, including PSNR and SSIM, while delivering superior visual results.
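A hedged sketch of the composite loss named in the abstract (Charbonnier plus SSIM) is given below; the SSIM here uses a simplified uniform window and the weighting alpha is an assumed placeholder, since the abstract does not state the actual formulation.

```python
import torch
import torch.nn.functional as F

def charbonnier(x, y, eps=1e-3):
    """Charbonnier loss: a differentiable, outlier-robust relative of L1."""
    return torch.sqrt((x - y) ** 2 + eps ** 2).mean()

def ssim(x, y, C1=0.01 ** 2, C2=0.03 ** 2, win=11):
    """Simplified single-scale SSIM with a uniform window (inputs scaled to [0, 1])."""
    mu_x = F.avg_pool2d(x, win, 1, win // 2)
    mu_y = F.avg_pool2d(y, win, 1, win // 2)
    var_x = F.avg_pool2d(x * x, win, 1, win // 2) - mu_x ** 2
    var_y = F.avg_pool2d(y * y, win, 1, win // 2) - mu_y ** 2
    cov = F.avg_pool2d(x * y, win, 1, win // 2) - mu_x * mu_y
    s = ((2 * mu_x * mu_y + C1) * (2 * cov + C2)) / ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))
    return s.mean()

def composite_loss(pred, target, alpha=0.85):
    # Pixel-level accuracy (Charbonnier) balanced against structural fidelity (1 - SSIM);
    # alpha is an assumed weighting, not taken from the paper.
    return alpha * charbonnier(pred, target) + (1 - alpha) * (1 - ssim(pred, target))

pred, target = torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64)
print(composite_loss(pred, target).item())
```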

26 pages, 44793 KiB  
Article
3D Reconstruction of Asphalt Pavement Macro-Texture Based on Convolutional Neural Network and Monocular Image Depth Estimation
by Xinliang Liu and Chao Yin
Appl. Sci. 2025, 15(9), 4684; https://doi.org/10.3390/app15094684 - 23 Apr 2025
Viewed by 308
Abstract
The 3D reconstruction of asphalt pavement macrotexture holds significant engineering value for pavement quality assessment and performance monitoring. However, conventional 3D reconstruction methods face challenges, such as high equipment costs and operational complexity, limiting their widespread application in engineering practice. Meanwhile, current deep learning-based monocular image reconstruction for pavement texture remains in its early stages. To address these technical limitations, this study systematically prepared four types of asphalt mixture specimens (AC, SMA, OGFC, and PA) with a total of 14 gradations. High-precision equipment was used to simultaneously capture 2D RGB images and 3D RGB-D point cloud data of the surface texture. An innovative multi-scale feature fusion CNN model was developed based on an encoder–decoder architecture, along with an optimized training strategy for model parameters. For performance evaluation, multiple metrics were employed, including root mean square error (RMSE = 0.491), relative error (REL = 0.102), and accuracy at different thresholds (δ = 1/2/3: 0.931, 0.979, 0.990). The results demonstrate strong correlations between the reconstructed texture’s mean texture depth (MTD) and friction coefficient (f8) with actual measurements (0.913 and 0.953, respectively), outperforming existing methods. This confirms that the proposed CNN model achieves precise 3D reconstruction of asphalt pavement macrotexture, effectively supporting skid resistance evaluation. To validate engineering applicability, field tests were conducted on pavements with various gradations. The model exhibited excellent robustness under different conditions. Furthermore, based on extensive field data, this study established a quantitative relationship between MTD and friction coefficient, developing a more accurate pavement skid resistance evaluation system to support maintenance decision-making.
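The evaluation metrics quoted above (RMSE, REL, and threshold accuracies δ) follow the usual monocular depth-estimation conventions; a small NumPy sketch under the common assumption of a 1.25^i threshold base is shown below, with synthetic data standing in for real texture depth maps.

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Standard monocular depth-estimation metrics: RMSE, REL, and delta accuracies."""
    pred, gt = pred.ravel(), gt.ravel()
    valid = gt > 0                       # ignore pixels without ground-truth depth
    pred, gt = pred[valid], gt[valid]
    rmse = np.sqrt(np.mean((pred - gt) ** 2))
    rel = np.mean(np.abs(pred - gt) / gt)
    ratio = np.maximum(pred / gt, gt / pred)
    deltas = {f"delta<1.25^{i}": np.mean(ratio < 1.25 ** i) for i in (1, 2, 3)}
    return {"RMSE": rmse, "REL": rel, **deltas}

# Toy example with synthetic depth maps (for illustration only).
gt = np.random.uniform(0.2, 2.0, (256, 256))
pred = gt * np.random.uniform(0.9, 1.1, gt.shape)
print(depth_metrics(pred, gt))
```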

21 pages, 13496 KiB  
Article
Advancing Interior Design with AI: Controllable Stable Diffusion for Panoramic Image Generation
by Wanggong Yang, Congcong Wang, Luxiang Liu, Shuying Dong and Yifei Zhao
Buildings 2025, 15(8), 1391; https://doi.org/10.3390/buildings15081391 - 21 Apr 2025
Viewed by 444
Abstract
AI-driven technologies have significantly advanced panoramic image generation in interior design; however, existing methods often lack controllability and consistency in rendering high-quality, coherent panoramas. To address these limitations, the study proposes CSD-Pano, a controllable and stable diffusion framework tailored for panoramic interior design generation. The study also introduces PSD-4, a curated dataset of panoramic scenes covering diverse interior decoration styles to support training and evaluation. CSD-Pano enables fine-grained control over aesthetic attributes, layout coherence, and stylistic consistency. Furthermore, the study designs a panoramic loss function that enhances spatial coherence, geometric alignment, and perceptual fidelity. Extensive qualitative and quantitative experiments demonstrate that CSD-Pano achieves superior performance compared to existing baselines, with significant improvements in SSIM and LPIPS metrics. These results validate the effectiveness of our approach in advancing automated panoramic interior design.
(This article belongs to the Section Architectural Design, Urban Science, and Real Estate)

22 pages, 7958 KiB  
Article
Depth Upsampling with Local and Nonlocal Models Using Adaptive Bandwidth
by Niloufar Salehi Dastjerdi and M. Omair Ahmad
Electronics 2025, 14(8), 1671; https://doi.org/10.3390/electronics14081671 - 20 Apr 2025
Viewed by 166
Abstract
The rapid advancement of 3D imaging technology and depth cameras has made depth data more accessible for applications such as virtual reality and autonomous driving. However, depth maps typically suffer from lower resolution and quality compared to color images due to sensor limitations. This paper introduces an improved approach to guided depth map super-resolution (GDSR) that effectively addresses key challenges, including the suppression of texture copying artifacts and the preservation of depth discontinuities. The proposed method integrates both local and nonlocal models within a structured framework, incorporating an adaptive bandwidth mechanism that dynamically adjusts guidance weights. Instead of relying on fixed parameters, this mechanism utilizes a distance map to evaluate patch similarity, leading to enhanced depth recovery. The local model ensures spatial smoothness by leveraging neighboring depth information, preserving fine details within small regions. On the other hand, the nonlocal model identifies similarities across distant areas, improving the handling of repetitive patterns and maintaining depth discontinuities. By combining these models, the proposed approach achieves more accurate depth upsampling with high-quality depth reconstruction. Experimental results, conducted on several datasets and evaluated using various objective metrics, demonstrate the effectiveness of the proposed method through both quantitative and qualitative assessments. The approach consistently delivers improved performance over existing techniques, particularly in preserving structural details and visual clarity. An ablation study further confirms the individual contributions of key components within the framework. These results collectively support the conclusion that the method is not only robust and accurate but also adaptable to a range of real-world scenarios, offering a practical advancement over current state-of-the-art solutions.
(This article belongs to the Special Issue Image and Video Processing for Emerging Multimedia Technology)

18 pages, 3766 KiB  
Article
Self-Supervised Multiscale Contrastive and Attention-Guided Gradient Projection Network for Pansharpening
by Qingping Li, Xiaomin Yang, Bingru Li and Jin Wang
Sensors 2025, 25(8), 2560; https://doi.org/10.3390/s25082560 - 18 Apr 2025
Viewed by 317
Abstract
Pansharpening techniques are crucial in remote sensing image processing, with deep learning emerging as the mainstream solution. In this paper, the pansharpening problem is formulated as two optimization subproblems, with a solution proposed based on multiscale contrastive learning combined with attention-guided gradient projection networks. First, an efficient and generalized Spectral–Spatial Universal Module (SSUM) is designed and applied to the spectral and spatial enhancement modules (SpeEB and SpaEB). Then, the multiscale high-frequency features of PAN and MS images are extracted using the discrete wavelet transform (DWT). These features are combined with contrastive learning and residual connections to progressively balance spectral and spatial information. Finally, high-resolution multispectral images are generated through multiple iterations. Experimental results verify that the proposed method outperforms existing approaches in both visual quality and quantitative evaluation metrics.
(This article belongs to the Section Sensor Networks)
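The multiscale high-frequency extraction step mentioned in the abstract can be illustrated with PyWavelets; this is only a one-level sketch with an assumed Haar wavelet, not the paper's actual preprocessing.

```python
import numpy as np
import pywt

def highfreq_subbands(img: np.ndarray, wavelet: str = "haar"):
    """One-level 2D DWT: returns the horizontal/vertical/diagonal detail subbands.

    These high-frequency subbands are the kind of multiscale detail the paper
    extracts from PAN and MS images before contrastive learning.
    """
    _, (lh, hl, hh) = pywt.dwt2(img, wavelet)
    return lh, hl, hh

# Toy panchromatic patch; repeating the transform on the low-pass band yields a multiscale pyramid.
pan = np.random.rand(128, 128)
lh, hl, hh = highfreq_subbands(pan)
print(lh.shape, hl.shape, hh.shape)   # each subband is roughly half the input resolution
```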

27 pages, 7975 KiB  
Article
Improving the Efficiency and Quality of Sustainable Industrial CT by Optimizing Scanning Parameters
by Íñigo Fonfría, Ibon Holgado, Naiara Ortega, Ainhoa Castrillo and Soraya Plaza
Sensors 2025, 25(8), 2440; https://doi.org/10.3390/s25082440 - 12 Apr 2025
Cited by 1 | Viewed by 319
Abstract
Industrial Computed Tomography (CT) is a widely used Non-Destructive Testing (NDT) technique for evaluating internal and external geometries with high accuracy. However, its integration into industrial workflows is often hindered by long scan times and high energy consumption, raising sustainability concerns. This study introduces a novel approach to improving CT efficiency by integrating real-time energy consumption monitoring into the scanning process. A power measurement device was used to correlate scan parameters with energy usage and image quality, enabling a data-driven approach to parameter optimization. Results show that higher voltages improve image quality by up to 32%, as measured by the Contrast-to-Noise Ratio (CNR) among other image quality metrics, while reducing overall energy consumption by up to 61%. The results presented support the optimization of CT scan parameters by providing quantitative guidelines to balance efficiency, image quality, and sustainability. Additionally, deviations in dimensional measurements obtained through CT scans were compared against reference data from a Coordinate Measuring Machine (CMM), with differences of up to ±45 μm. The findings contribute to enhancing CT performance while minimizing environmental impact.
(This article belongs to the Special Issue Sensors in Nondestructive Testing)
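For reference, a minimal sketch of the Contrast-to-Noise Ratio used as an image-quality metric in this study is given below; the ROI definitions and the particular CNR variant are assumptions, since the abstract does not specify them.

```python
import numpy as np

def contrast_to_noise_ratio(img: np.ndarray, roi_signal, roi_background) -> float:
    """CNR between a feature ROI and a background ROI in a CT slice.

    CNR = |mean_signal - mean_background| / std_background
    (one common definition; other variants pool the noise of both ROIs).
    """
    signal = img[roi_signal]
    background = img[roi_background]
    return abs(signal.mean() - background.mean()) / background.std()

# Illustrative example with a synthetic slice; the ROI slices are placeholders.
slice_ = np.random.normal(100.0, 5.0, (512, 512))
slice_[200:300, 200:300] += 40.0                       # simulated dense feature
cnr = contrast_to_noise_ratio(slice_,
                              (slice(200, 300), slice(200, 300)),
                              (slice(0, 100), slice(0, 100)))
print(f"CNR = {cnr:.1f}")
```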

23 pages, 57584 KiB  
Article
Pix2Next: Leveraging Vision Foundation Models for RGB to NIR Image Translation
by Youngwan Jin, Incheol Park, Hanbin Song, Hyeongjin Ju, Yagiz Nalcakan and Shiho Kim
Technologies 2025, 13(4), 154; https://doi.org/10.3390/technologies13040154 - 11 Apr 2025
Viewed by 364
Abstract
This paper proposes Pix2Next, a novel image-to-image translation framework designed to address the challenge of generating high-quality Near-Infrared (NIR) images from RGB inputs. Our method leverages a state-of-the-art Vision Foundation Model (VFM) within an encoder–decoder architecture, incorporating cross-attention mechanisms to enhance feature integration. This design captures detailed global representations and preserves essential spectral characteristics, treating RGB-to-NIR translation as more than a simple domain transfer problem. A multi-scale PatchGAN discriminator ensures realistic image generation at various detail levels, while carefully designed loss functions couple global context understanding with local feature preservation. We performed experiments on the RANUS and IDD-AW datasets to demonstrate Pix2Next’s advantages in quantitative metrics and visual quality, substantially improving the FID score compared to existing methods. Furthermore, we demonstrate the practical utility of Pix2Next by showing improved performance on a downstream object detection task using generated NIR data to augment limited real NIR datasets. The proposed method enables the scaling up of NIR datasets without additional data acquisition or annotation efforts, potentially accelerating advancements in NIR-based computer vision applications.
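The FID improvement claimed above rests on the Fréchet distance between feature distributions of real and generated images; a minimal sketch of that formula is shown below, with random placeholder features instead of the Inception activations used in practice.

```python
import numpy as np
from scipy import linalg

def frechet_distance(feat_real: np.ndarray, feat_gen: np.ndarray) -> float:
    """Fréchet distance between two sets of feature vectors (rows = samples).

    FID applies this to Inception-v3 activations of real versus generated images;
    the features here are random placeholders that only demonstrate the formula.
    """
    mu1, mu2 = feat_real.mean(0), feat_gen.mean(0)
    c1 = np.cov(feat_real, rowvar=False)
    c2 = np.cov(feat_gen, rowvar=False)
    covmean = linalg.sqrtm(c1 @ c2)
    if np.iscomplexobj(covmean):          # numerical noise can yield tiny imaginary parts
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(c1 + c2 - 2 * covmean))

real = np.random.randn(500, 64)
gen = np.random.randn(500, 64) + 0.1
print(frechet_distance(real, gen))
```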

29 pages, 16314 KiB  
Article
A Novel Framework for Real ICMOS Image Denoising: LD-NGN Noise Modeling and a MAST-Net Denoising Network
by Yifu Luo, Ting Zhang, Ruizhi Li, Bin Zhang, Nan Jia and Liping Fu
Remote Sens. 2025, 17(7), 1219; https://doi.org/10.3390/rs17071219 - 29 Mar 2025
Viewed by 300
Abstract
Intensified complementary metal-oxide semiconductor (ICMOS) imaging involves multiple steps, including photoelectric conversion and photoelectric multiplication, each of which introduces noise that significantly impacts image quality. To address the issues of insufficient denoising performance and poor model generalization in ICMOS image denoising, this paper proposes a systematic solution. First, we established an experimental platform to collect real ICMOS images and introduced a novel noise generation network (LD-NGN) that accurately simulates the strong sparsity and spatial clustering of ICMOS noise, generating a multi-scene paired dataset. Additionally, we proposed a new noise evaluation metric, KL-Noise, which allows a more precise quantification of noise distribution. Based on this, we designed a denoising network specifically for ICMOS images, MAST-Net, and trained it using the multi-scene paired dataset generated by LD-NGN. By capturing multi-scale features of image pixels, MAST-Net effectively removes complex noise. The experimental results show that our method outperforms both traditional methods and denoisers trained with other noise generators, qualitatively and quantitatively. The denoised images achieve a peak signal-to-noise ratio (PSNR) of 35.38 dB and a structural similarity index (SSIM) of 0.93. This optimization provides support for tasks such as image preprocessing, target recognition, and feature extraction.
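The abstract does not define KL-Noise precisely; as a loose analogue, one can compare the histograms of real and synthesized noise residuals with a KL divergence, alongside PSNR, as in the sketch below (all data here are synthetic placeholders, not the paper's metric).

```python
import numpy as np

def psnr(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio between a denoised image and its clean reference."""
    mse = np.mean((x - y) ** 2)
    return 10 * np.log10(data_range ** 2 / mse)

def kl_divergence(real_noise: np.ndarray, synth_noise: np.ndarray, bins: int = 256) -> float:
    """KL divergence between the histograms of two noise residual images.

    A generic way to score how closely a noise generator matches real sensor noise;
    the paper's KL-Noise metric is not specified in detail, so this is only an analogue.
    """
    lo = min(real_noise.min(), synth_noise.min())
    hi = max(real_noise.max(), synth_noise.max())
    p, _ = np.histogram(real_noise, bins=bins, range=(lo, hi), density=True)
    q, _ = np.histogram(synth_noise, bins=bins, range=(lo, hi), density=True)
    p, q = p + 1e-10, q + 1e-10          # avoid log(0)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

real = np.random.poisson(3.0, (256, 256)).astype(float)
synth = np.random.poisson(3.2, (256, 256)).astype(float)
print(kl_divergence(real, synth))
```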

22 pages, 4060 KiB  
Article
Quantitative Analysis of the Labeling Quality of Biological Images for Semantic Segmentation Based on Attribute Agreement Analysis
by Rong Xiang, Xinyu Yuan, Yi Zhang and Xiaomin Zhang
Agriculture 2025, 15(7), 680; https://doi.org/10.3390/agriculture15070680 - 22 Mar 2025
Viewed by 331
Abstract
Semantic segmentation in biological images is increasingly common, particularly in smart agriculture, where deep learning model precision is tied to image labeling quality. However, research has largely focused on improving models rather than analyzing image labeling quality. We proposed a method for quantitatively assessing labeling quality in semantically segmented biological images using attribute agreement analysis. This method evaluates labeling variation, including internal, external, and overall labeling quality, and labeling bias between labeling results and standards through case studies of tomato stem and group-reared pig images, which vary in labeling complexity. The process involves the following three steps: confusion matrix calculation, Kappa value determination, and labeling quality assessment. Initially, two labeling workers were randomly selected to label ten images from each category twice, according to the requirements of the attribute agreement analysis method. Confusion matrices for each image’s dual labeling results were calculated, followed by Kappa value computation. Finally, labeling quality was evaluated by comparing Kappa values against quality criteria. We also introduced a contour ring method to enhance Kappa value differentiation in imbalanced sample scenarios. Three types of representative images were used to test the performance of the proposed method. The results show that attribute agreement analysis effectively quantifies image labeling quality, and the contour ring method improves Kappa value differentiation. The attribute agreement analysis method allows for quantitative analysis of labeling quality based on image labeling difficulty, and Kappa values can also be used as a metric of image labeling difficulty. Dynamic analysis of image labeling variations over time needs further research.
(This article belongs to the Section Digital Agriculture)
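The three-step procedure described above (confusion matrix, Kappa value, quality assessment) hinges on Cohen's kappa; a minimal sketch with a toy confusion matrix is shown below, not the authors' implementation.

```python
import numpy as np

def cohens_kappa(confusion: np.ndarray) -> float:
    """Cohen's kappa from a square confusion matrix of two labeling passes."""
    n = confusion.sum()
    po = np.trace(confusion) / n                                         # observed agreement
    pe = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n ** 2  # chance agreement
    return (po - pe) / (1 - pe)

# Toy pixel-level confusion matrix (rows: labels from pass 1, columns: labels from pass 2).
cm = np.array([[9200.0, 300.0],
               [250.0, 1250.0]])
print(f"kappa = {cohens_kappa(cm):.3f}")  # compared against agreement criteria to judge labeling quality
```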

15 pages, 4373 KiB  
Article
Deep Supervised Attention Network for Dynamic Scene Deblurring
by Seok-Woo Jang, Limin Yan and Gye-Young Kim
Sensors 2025, 25(6), 1896; https://doi.org/10.3390/s25061896 - 18 Mar 2025
Viewed by 354
Abstract
In this study, we propose a dynamic scene deblurring approach using a deep supervised attention network. While existing deep learning-based deblurring methods have significantly outperformed traditional techniques, several challenges remain: (1) Invariant weights: Small convolutional neural network (CNN) models struggle to address the spatially variant nature of dynamic scene deblurring, making it difficult to capture the necessary information. A more effective architecture is needed to better extract valuable features. (2) Limitations of standard datasets: Current datasets often suffer from low data volume, unclear ground truth (GT) images, and a single blur scale, which hinders performance. To address these challenges, we propose a multi-scale, end-to-end recurrent network that utilizes supervised attention to recover sharp images. The supervised attention mechanism focuses the model on features most relevant to ambiguous information as data are passed between networks at different scales. Additionally, we introduce new loss functions to overcome the limitations of the peak signal-to-noise ratio (PSNR) estimation metric. By incorporating a fast Fourier transform (FFT), our method maps features into frequency space, aiding in the recovery of lost high-frequency details. Experimental results demonstrate that our model outperforms previous methods in both quantitative and qualitative evaluations, producing higher-quality deblurring results.
(This article belongs to the Section Sensor Networks)
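The FFT-based idea in this abstract, penalizing differences in frequency space to recover high-frequency detail, can be sketched as a simple frequency-domain loss in PyTorch; the loss weighting below is an assumed placeholder, not the paper's formulation.

```python
import torch

def frequency_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """L1 distance between FFT representations of restored and sharp images.

    Penalizing differences in frequency space encourages recovery of high-frequency
    detail that purely pixel-space (PSNR-oriented) losses tend to blur.
    """
    pred_f = torch.fft.rfft2(pred, norm="ortho")
    target_f = torch.fft.rfft2(target, norm="ortho")
    return (pred_f - target_f).abs().mean()

# Toy usage: combine with an ordinary pixel loss when training a deblurring network.
pred = torch.rand(2, 3, 128, 128, requires_grad=True)
sharp = torch.rand(2, 3, 128, 128)
loss = torch.nn.functional.l1_loss(pred, sharp) + 0.1 * frequency_loss(pred, sharp)
loss.backward()
print(loss.item())
```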
