Search Results (74)

Search Parameters:
Keywords = no-reference image quality assessment

41 pages, 3893 KB  
Review
Research Progress on Color Image Quality Assessment
by Minjuan Gao, Chenye Song, Qiaorong Zhang, Xuande Zhang, Yankang Li and Fujiang Yuan
J. Imaging 2025, 11(9), 307; https://doi.org/10.3390/jimaging11090307 - 8 Sep 2025
Viewed by 597
Abstract
Image quality assessment (IQA) aims to measure the consistency between an objective algorithm's output and subjective human perception. This article focuses on this complex relationship in the context of color images—color image quality assessment (CIQA). The review systematically investigates CIQA applications in image compression, processing optimization, and domain-specific scenarios, analyzes benchmark datasets and assessment metrics, and categorizes CIQA algorithms into full-reference (FR), reduced-reference (RR), and no-reference (NR) methods, evaluating color images within a newly developed CIQA framework. Among FR and NR approaches, FR methods leverage reference images with machine learning, visual perception models, and mathematical frameworks, while NR methods utilize distortion-only features through feature fusion and extraction techniques. Specialized CIQA algorithms have also been developed for robotics, low-light, and underwater imaging. Despite this progress, challenges remain in cross-domain adaptability, generalization, and contextualized assessment. Future directions may include prototype-based cross-domain adaptation, fidelity–structure balancing, spatiotemporal consistency integration, and CIQA–restoration synergy to meet emerging demands. Full article
(This article belongs to the Section Color, Multi-spectral, and Hyperspectral Imaging)

39 pages, 12437 KB  
Article
Optimizing Deep Learning-Based Crack Detection Using No-Reference Image Quality Assessment in a Mobile Tunnel Scanning System
by Chulhee Lee, Donggyou Kim and Dongku Kim
Sensors 2025, 25(17), 5437; https://doi.org/10.3390/s25175437 - 2 Sep 2025
Viewed by 698
Abstract
The mobile tunnel scanning system (MTSS) enables efficient tunnel inspection; however, motion blur (MB) generated at high travel speeds remains a major factor undermining the reliability of deep-learning-based crack detection. This study focuses on investigating how horizontally oriented MB in MTSS imagery affects the crack-detection performance of convolutional neural networks (CNNs) and proposes a data-centric quality-assurance framework that leverages no-reference image quality assessment (NR-IQA) to optimize model performance. By intentionally applying MB to both public and real-world MTSS datasets, we analyzed performance changes in ResNet-, VGG-, and AlexNet-based models and established the correlations between four NR-IQA metrics (BRISQUE, NIQE, PIQE, and CPBD) and performance (F1 score). As the MB intensity increased, the F1 score of ResNet34 dropped from 89.43% to 4.45%, confirming the decisive influence of image quality. PIQE and CPBD exhibited strong correlations with F1 (−0.87 and +0.82, respectively), emerging as the most suitable indicators for horizontal MB. Using thresholds of PIQE ≤ 20 and CPBD ≥ 0.8 to filter low-quality images improved the AlexNet F1 score by 1.46%, validating the effectiveness of the proposed methodology. The proposed framework objectively assesses MTSS data quality and optimizes deep learning performance, enhancing the reliability of intelligent infrastructure maintenance systems. Full article
(This article belongs to the Section Intelligent Sensors)
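The quality-gating step this abstract describes (keep only images with PIQE ≤ 20 and CPBD ≥ 0.8) can be sketched as a simple threshold filter. The `piqe_fn` and `cpbd_fn` scoring callables are hypothetical stand-ins here, not the paper's implementation; in practice they would come from an NR-IQA library or a custom implementation.

```python
# Sketch of NR-IQA-based quality gating: keep only images whose scores
# pass both thresholds. Note the opposite polarities: lower PIQE means
# better quality, higher CPBD means sharper.

def filter_by_quality(images, piqe_fn, cpbd_fn, piqe_max=20.0, cpbd_min=0.8):
    """Return the subset of images that pass both NR-IQA thresholds."""
    kept = []
    for img in images:
        if piqe_fn(img) <= piqe_max and cpbd_fn(img) >= cpbd_min:
            kept.append(img)
    return kept
```

With dummy scorers, only images that satisfy both thresholds survive the filter, which is the training-set curation step the paper credits for the F1 improvement.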

19 pages, 8091 KB  
Article
Leveraging Synthetic Degradation for Effective Training of Super-Resolution Models in Dermatological Images
by Francesco Branciforti, Kristen M. Meiburger, Elisa Zavattaro, Paola Savoia and Massimo Salvi
Electronics 2025, 14(15), 3138; https://doi.org/10.3390/electronics14153138 - 6 Aug 2025
Viewed by 504
Abstract
Teledermatology relies on digital transfer of dermatological images, but compression and resolution differences compromise diagnostic quality. Image enhancement techniques are crucial to compensate for these differences and improve quality for both clinical assessment and AI-based analysis. We developed a customized image degradation pipeline simulating common artifacts in dermatological images, including blur, noise, downsampling, and compression. This synthetic degradation approach enabled effective training of DermaSR-GAN, a super-resolution generative adversarial network tailored for dermoscopic images. The model was trained on 30,000 high-quality ISIC images and evaluated on three independent datasets (ISIC Test, Novara Dermoscopic, PH2) using structural similarity and no-reference quality metrics. DermaSR-GAN achieved statistically significant improvements in quality scores across all datasets, with up to 23% enhancement in perceptual quality metrics (MANIQA). The model preserved diagnostic details while doubling resolution and surpassed existing approaches, including traditional interpolation methods and state-of-the-art deep learning techniques. Integration with downstream classification systems demonstrated up to 14.6% improvement in class-specific accuracy for keratosis-like lesions compared to original images. Synthetic degradation represents a promising approach for training effective super-resolution models in medical imaging, with significant potential for enhancing teledermatology applications and computer-aided diagnosis systems. Full article
(This article belongs to the Section Computer Science & Engineering)
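A synthetic degradation pipeline of the kind this abstract describes (blur, downsampling, noise, compression) can be sketched with plain numpy. The parameters, function name, and the coarse quantization standing in for compression are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def degrade(img, scale=2, noise_sigma=0.02, levels=32, seed=0):
    """Degrade a float image in [0, 1]: 3x3 box blur, downsample, noise, quantize.

    A hypothetical stand-in for the paper's degradation pipeline.
    """
    rng = np.random.default_rng(seed)
    # 3x3 box blur via circular shifts (keeps the sketch dependency-free)
    blurred = sum(np.roll(np.roll(img, i, 0), j, 1)
                  for i in (-1, 0, 1) for j in (-1, 0, 1)) / 9.0
    low_res = blurred[::scale, ::scale]                        # naive downsampling
    noisy = low_res + rng.normal(0, noise_sigma, low_res.shape)
    quantized = np.round(noisy * (levels - 1)) / (levels - 1)  # crude "compression"
    return np.clip(quantized, 0.0, 1.0)
```

Pairing each clean image with its degraded copy yields the (low-quality, high-quality) training pairs a super-resolution GAN needs without collecting real degraded data.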

37 pages, 10762 KB  
Article
Evaluating Adversarial Robustness of No-Reference Image and Video Quality Assessment Models with Frequency-Masked Gradient Orthogonalization Adversarial Attack
by Khaled Abud, Sergey Lavrushkin and Dmitry Vatolin
Big Data Cogn. Comput. 2025, 9(7), 166; https://doi.org/10.3390/bdcc9070166 - 25 Jun 2025
Cited by 1 | Viewed by 1475
Abstract
Neural-network-based models have made considerable progress in many computer vision areas over recent years. However, many works have exposed their vulnerability to malicious input data manipulation—that is, to adversarial attacks. Although many recent works have thoroughly examined the adversarial robustness of classifiers, the robustness of Image Quality Assessment (IQA) methods remains understudied. This paper addresses this gap by proposing FM-GOAT (Frequency-Masked Gradient Orthogonalization Attack), a novel white box adversarial method tailored for no-reference IQA models. Using a novel gradient orthogonalization technique, FM-GOAT uniquely optimizes adversarial perturbations against multiple perceptual constraints to minimize visibility, moving beyond traditional lp-norm bounds. We evaluate FM-GOAT on seven state-of-the-art NR-IQA models across three image and video datasets, revealing significant vulnerability to the proposed attack. Furthermore, we examine the applicability of adversarial purification methods to the IQA task, as well as their efficiency in mitigating white box adversarial attacks. By studying the activations from models’ intermediate layers, we explore their behavioral patterns in adversarial scenarios and discover valuable insights that may lead to better adversarial detection. Full article

34 pages, 10241 KB  
Review
A Comprehensive Benchmarking Framework for Sentinel-2 Sharpening: Methods, Dataset, and Evaluation Metrics
by Matteo Ciotola, Giuseppe Guarino, Antonio Mazza, Giovanni Poggi and Giuseppe Scarpa
Remote Sens. 2025, 17(12), 1983; https://doi.org/10.3390/rs17121983 - 7 Jun 2025
Viewed by 909
Abstract
The advancement of super-resolution and sharpening algorithms for satellite images has significantly expanded the potential applications of remote sensing data. In the case of Sentinel-2, despite significant progress, the lack of standardized datasets and evaluation protocols has made it difficult to fairly compare existing methods and advance the state of the art. This work introduces a comprehensive benchmarking framework for Sentinel-2 sharpening, designed to address these challenges and foster future research. It analyzes several state-of-the-art sharpening algorithms, selecting representative methods ranging from traditional pansharpening to ad hoc model-based optimization and deep learning approaches. All selected methods have been re-implemented within a consistent Python-based (Version 3.10) framework and evaluated on a suitably designed, large-scale Sentinel-2 dataset. This dataset features diverse geographical regions, land cover types, and acquisition conditions, ensuring robust training and testing scenarios. The performance of the sharpening methods is assessed using both reference-based and no-reference quality indexes, highlighting strengths, limitations, and open challenges of current state-of-the-art algorithms. The proposed framework, dataset, and evaluation protocols are openly shared with the research community to promote collaboration and reproducibility. Full article

20 pages, 27964 KB  
Article
Delving into Underwater Image Utility: Benchmark Dataset and Prediction Model
by Jiapeng Liu, Yi Liu and Qiuping Jiang
Remote Sens. 2025, 17(11), 1906; https://doi.org/10.3390/rs17111906 - 30 May 2025
Cited by 1 | Viewed by 883
Abstract
High-quality underwater images are essential for both human visual perception and machine analysis in marine vision applications. Although significant progress has been achieved in Underwater Image Quality Assessment (UIQA), almost all existing UIQA methods focus on perception-oriented image quality and cannot gauge the utility of underwater images for use in machine vision applications. To address this issue, we focus in this work on the problem of automatic underwater image utility assessment (UIUA). We first construct a large-scale Object Detection-oriented Underwater Image Utility Assessment (OD-UIUA) dataset, which includes 1200 raw underwater images, 12,000 enhanced results produced by 10 representative underwater image enhancement (UIE) algorithms, and 13,200 underwater image utility scores (UIUSs) covering all raw and enhanced images. Based on this newly constructed dataset, we then train a deep UIUA network (DeepUIUA) that can automatically and accurately predict UIUS. To the best of our knowledge, this is the first benchmark dataset for UIUA and the first model focusing on this specific problem. We comprehensively compare the performance of our DeepUIUA model with that of 14 state-of-the-art no-reference image quality assessment (NR-IQA) methods using the OD-UIUA dataset as the benchmark. Extensive experiments show that DeepUIUA outperforms the existing NR-IQA methods in assessing UIUS. The OD-UIUA dataset and the source code of our DeepUIUA model will be released. Full article
(This article belongs to the Special Issue Advanced Techniques for Water-Related Remote Sensing (Second Edition))

25 pages, 11184 KB  
Article
Comparative Evaluation of Multimodal Large Language Models for No-Reference Image Quality Assessment with Authentic Distortions: A Study of OpenAI and Claude.AI Models
by Domonkos Varga
Big Data Cogn. Comput. 2025, 9(5), 132; https://doi.org/10.3390/bdcc9050132 - 16 May 2025
Cited by 3 | Viewed by 3764
Abstract
This study presents a comparative analysis of several multimodal large language models (LLMs) for no-reference image quality assessment, with a particular focus on images containing authentic distortions. We evaluate three models developed by OpenAI and three models from Claude.AI, comparing their performance in estimating image quality without reference images. Our results demonstrate that these LLMs outperform traditional methods based on hand-crafted features. However, more advanced deep learning models, especially those based on deep convolutional networks, surpass LLMs in performance. Notably, we make a unique contribution by publishing the processed outputs of the LLMs, providing a transparent and direct comparison of their quality assessments based solely on the predicted quality scores. This work underscores the potential of multimodal LLMs in image quality evaluation, while also highlighting the continuing advantages of specialized deep learning approaches. Full article
(This article belongs to the Special Issue Advances in Natural Language Processing and Text Mining)

25 pages, 104688 KB  
Article
No-Reference Quality Assessment of Infrared Image Colorization with Color–Spatial Features
by Dian Sheng, Weiqi Jin, Xia Wang and Li Li
Electronics 2025, 14(6), 1126; https://doi.org/10.3390/electronics14061126 - 12 Mar 2025
Cited by 1 | Viewed by 958
Abstract
LDANet represents an innovative no-reference quality assessment model specifically engineered to evaluate colorized infrared images. This is a crucial task for various applications, and existing methods often fail to capture color-specific distortions. The proposed model distinguishes itself by uniquely combining color feature extraction through latent Dirichlet allocation (LDA) with spatial feature extraction enhanced by multichannel and spatial attention mechanisms. It employs a dual-feature approach that facilitates thorough assessment of both color fidelity and detail preservation in colorized images. The architecture of LDANet encompasses two critical components: an LDA-based color feature extraction module which meticulously analyzes and learns color distribution patterns, and a spatial feature extraction module that leverages an inception network bolstered by attention mechanisms to effectively capture multiscale spatial characteristics. Rigorous experimental validation conducted on a specialized dataset of colorized infrared images demonstrates that LDANet significantly outperforms existing leading no-reference image quality assessment methods. This study reports the effectiveness of integrating color-specific features within a quality assessment framework tailored for infrared image colorization, representing a meaningful advancement in this domain. These findings emphasize the essential role of color feature integration in the evaluation of colorized infrared images, providing a robust tool for optimizing colorization algorithms and enhancing their practical applications. Full article

21 pages, 3281 KB  
Article
Multi-Space Feature Fusion and Entropy-Based Metrics for Underwater Image Quality Assessment
by Baozhen Du, Hongwei Ying, Jiahao Zhang and Qunxin Chen
Entropy 2025, 27(2), 173; https://doi.org/10.3390/e27020173 - 6 Feb 2025
Viewed by 1194
Abstract
In marine remote sensing, underwater images play an indispensable role in ocean exploration owing to their rich, intuitive information content. However, underwater images often suffer from color shifts, loss of detail, and reduced clarity, degrading image quality. It is therefore critical to study precise and efficient methods for assessing underwater image quality. This paper proposes a no-reference multi-space feature fusion and entropy-based metric for underwater image quality assessment (MFEM-UIQA). Considering the color shifts of underwater images, a chrominance difference map is created from the chrominance space and statistical features are extracted from it. Moreover, exploiting the information representation capability of entropy, entropy-based multi-channel mutual information features are extracted to further characterize chrominance. For the luminance space, contrast features are extracted from gamma-corrected luminance images together with luminance uniformity features. In addition, logarithmic Gabor filtering is applied to the luminance images for subband decomposition, and entropy-based mutual information between subbands is captured. Furthermore, underwater image noise features, multi-channel dispersion information, and visibility features are extracted to jointly represent the perceptual features. Experiments demonstrate that the proposed MFEM-UIQA surpasses state-of-the-art methods. Full article
(This article belongs to the Collection Entropy in Image Analysis)
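The entropy features this abstract builds on reduce, at their simplest, to the Shannon entropy of a channel's intensity histogram. A minimal sketch, with the function name and bin count chosen for illustration rather than taken from the paper:

```python
import numpy as np

def channel_entropy(channel, bins=256):
    """Shannon entropy (in bits) of an image channel's intensity histogram."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, bins))
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins to avoid log(0)
    return float(-(p * np.log2(p)).sum())
```

A flat (constant) channel scores 0 bits, while a channel using all 256 levels equally scores the maximum of 8 bits, which is why entropy works as a compact information-content feature.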

19 pages, 25413 KB  
Article
No-Reference Image Quality Assessment with Moving Spectrum and Laplacian Filter for Autonomous Driving Environment
by Woongchan Nam, Taehyun Youn and Chunghun Ha
Vehicles 2025, 7(1), 8; https://doi.org/10.3390/vehicles7010008 - 21 Jan 2025
Cited by 3 | Viewed by 1418
Abstract
The increasing integration of autonomous driving systems into modern vehicles heightens the significance of Image Quality Assessment (IQA), as it pertains directly to vehicular safety. In this context, the development of metrics that can emulate the Human Visual System (HVS) in assessing image quality assumes critical importance. Given that blur is often the primary aberration in images captured by aging or deteriorating camera sensors, this study introduces a No-Reference (NR) IQA model termed BREMOLA (Blind/Referenceless Model via Moving Spectrum and Laplacian Filter). This model is designed to sensitively respond to varying degrees of blur in images. BREMOLA employs the Fourier transform to quantify the decline in image sharpness associated with increased blur. Subsequently, deviations in the Fourier spectrum arising from factors such as nighttime lighting or the presence of various objects are normalized using the Laplacian filter. Experimental application of the BREMOLA model demonstrates its capability to differentiate between images processed with a 3 × 3 average filter and their unprocessed counterparts. Additionally, the model effectively mitigates the variance introduced in the Fourier spectrum due to variables like nighttime conditions, object count, and environmental factors. Thus, BREMOLA presents a robust approach to IQA in the specific context of autonomous driving systems. Full article
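The two ingredients the abstract names, a Fourier-domain sharpness measure and a Laplacian-based normalization, can be illustrated with plain numpy. This is a sketch of those ingredients only, not the BREMOLA model itself; the radius fraction, kernel, and function names are assumptions.

```python
import numpy as np

def high_freq_energy_ratio(gray, radius_frac=0.25):
    """Share of Fourier spectrum energy outside a low-frequency disk.

    Blur concentrates energy at low frequencies, so this ratio drops.
    """
    f = np.fft.fftshift(np.fft.fft2(gray))
    power = np.abs(f) ** 2
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.hypot(yy - h / 2, xx - w / 2)
    low = dist <= radius_frac * min(h, w)
    return float(power[~low].sum() / power.sum())

def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian response; drops as blur increases."""
    lap = (-4 * gray[1:-1, 1:-1]
           + gray[:-2, 1:-1] + gray[2:, 1:-1]
           + gray[1:-1, :-2] + gray[1:-1, 2:])
    return float(lap.var())
```

Both measures separate a sharp image from its 3 × 3 average-filtered counterpart, which mirrors the discrimination experiment the abstract reports.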

30 pages, 82967 KB  
Article
Pansharpening Techniques: Optimizing the Loss Function for Convolutional Neural Networks
by Rocco Restaino
Remote Sens. 2025, 17(1), 16; https://doi.org/10.3390/rs17010016 - 25 Dec 2024
Cited by 1 | Viewed by 1271
Abstract
Pansharpening is a traditional image fusion problem where the reference image (or ground truth) is not accessible. Machine-learning-based algorithms designed for this task require an extensive optimization phase of network parameters, which must be performed using unsupervised learning techniques. The learning phase can either rely on a companion problem where ground truth is available, such as by reproducing the task at a lower scale or using a pretext task, or it can use a reference-free cost function. This study focuses on the latter approach, where performance depends not only on the accuracy of the quality measure but also on the mathematical properties of these measures, which may introduce challenges related to computational complexity and optimization. The evaluation of the most recognized no-reference image quality measures led to the proposal of a novel criterion, the Regression-based QNR (RQNR), which has not been previously used. To mitigate computational challenges, an approximate version of the relevant indices was employed, simplifying the optimization of the cost functions. The effectiveness of the proposed cost functions was validated through the reduced-resolution assessment protocol applied to a public dataset (PairMax) containing images of diverse regions of the Earth’s surface. Full article

12 pages, 660 KB  
Proceeding Paper
NR-IQA with Gaussian Derivative Filter, Convolutional Block Attention Module, and Spatial Pyramid Pooling
by Jyothi Sri Vadlamudi and Sameeulla Khan Md
Eng. Proc. 2024, 82(1), 20; https://doi.org/10.3390/ecsa-11-20482 - 26 Nov 2024
Viewed by 428
Abstract
Gaussian derivatives offer valuable capabilities for analyzing image characteristics such as structure, edges, texture, and features, which are essential aspects of image quality assessment. Recently, Convolutional Neural Networks (CNNs) have gained importance in computer vision applications, including the image quality assessment domain. Because Gaussian derivatives play a major role in assessing image quality, this work combines their characteristics with CNNs to better extract features for assessing image quality. While CNNs have demonstrated their ability to handle distortion effectively, they are limited in their capacity to capture features at different scales, making them inadequate for significant variations in object size. Consequently, the concept of spatial pyramid pooling (SPP) is introduced to address this limitation in image quality assessment (IQA). SPP pools the spatial feature maps from the highest convolutional layers into a feature representation of fixed length. Additionally, by incorporating a convolutional block attention module (CBAM) and local importance pooling (LIP), the proposed no-reference image quality assessment method demonstrates improved accuracy, generalization, and efficiency on the IVC database compared to conventional IQA methods, while achieving competitive performance on other datasets. Full article
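The spatial pyramid pooling idea described above can be sketched in a few lines: max-pool a feature map over grids of several sizes and concatenate, so feature maps of any spatial size yield a vector of fixed length. The pyramid levels below are illustrative, not the paper's configuration.

```python
import numpy as np

def spp(feature_map, levels=(1, 2, 4)):
    """Pool an (H, W, C) feature map into a fixed-length vector via SPP."""
    h, w, c = feature_map.shape
    pooled = []
    for n in levels:
        # split the map into an n x n grid and max-pool each cell per channel
        row_edges = np.linspace(0, h, n + 1).astype(int)
        col_edges = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = feature_map[row_edges[i]:row_edges[i + 1],
                                   col_edges[j]:col_edges[j + 1]]
                pooled.append(cell.max(axis=(0, 1)))
    return np.concatenate(pooled)
```

With levels (1, 2, 4) the output length is always C × (1 + 4 + 16) regardless of input size, which is exactly what lets a CNN head accept images without resizing.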

19 pages, 3896 KB  
Article
No-Reference Quality Assessment Based on Dual-Channel Convolutional Neural Network for Underwater Image Enhancement
by Renzhi Hu, Ting Luo, Guowei Jiang, Zhiqiang Lin and Zhouyan He
Electronics 2024, 13(22), 4451; https://doi.org/10.3390/electronics13224451 - 13 Nov 2024
Viewed by 822
Abstract
Underwater images are important for underwater vision tasks, yet their quality often degrades during imaging, prompting the development of Underwater Image Enhancement (UIE) algorithms. This paper proposes a Dual-Channel Convolutional Neural Network (DC-CNN)-based quality assessment method to evaluate the performance of different UIE algorithms. Specifically, inspired by intrinsic image decomposition, the enhanced underwater image is decomposed, based on the Retinex theory, into reflectance carrying color information and illumination carrying texture information. Afterward, a DC-CNN with two branches learns color and texture features from reflectance and illumination, respectively, reflecting the distortion characteristics of enhanced underwater images. To integrate the learned features, a feature fusion module and an attention mechanism are employed to align them efficiently with human visual perception characteristics. Finally, a quality regression module establishes the mapping relationship between the extracted features and quality scores. Experimental results on two public enhanced underwater image datasets (i.e., UIQE and SAUD) show that the proposed DC-CNN method outperforms a variety of existing quality assessment methods. Full article

22 pages, 11597 KB  
Article
MRI Super-Resolution Analysis via MRISR: Deep Learning for Low-Field Imaging
by Yunhe Li, Mei Yang, Tao Bian and Haitao Wu
Information 2024, 15(10), 655; https://doi.org/10.3390/info15100655 - 19 Oct 2024
Cited by 4 | Viewed by 3220
Abstract
This paper presents a novel MRI super-resolution analysis model, MRISR. Through the utilization of generative adversarial networks for the estimation of degradation kernels and the injection of noise, we have constructed a comprehensive dataset of high-quality paired high- and low-resolution MRI images. The MRISR model seamlessly integrates VMamba and Transformer technologies, demonstrating superior performance across various no-reference image quality assessment metrics compared with existing methodologies. It effectively reconstructs high-resolution MRI images while meticulously preserving intricate texture details, achieving a fourfold enhancement in resolution. This research endeavor represents a significant advancement in the field of MRI super-resolution analysis, contributing a cost-effective solution for rapid MRI technology that holds immense promise for widespread adoption in clinical diagnostic applications. Full article

19 pages, 1708 KB  
Article
No-Reference Image Quality Assessment Combining Swin-Transformer and Natural Scene Statistics
by Yuxuan Yang, Zhichun Lei and Changlu Li
Sensors 2024, 24(16), 5221; https://doi.org/10.3390/s24165221 - 12 Aug 2024
Cited by 8 | Viewed by 3829
Abstract
No-reference image quality assessment aims to evaluate image quality in line with human subjective perception. Current methods struggle to attend to global and local information simultaneously and lose information when images are resized. To address these issues, we propose a model that combines Swin-Transformer and natural scene statistics. The model uses Swin-Transformer to extract multi-scale features; incorporates a feature enhancement module and deformable convolution to improve feature representation and adapt to structural variations in images; and applies dual-branch attention to focus on key areas, aligning the assessment more closely with human visual perception. The natural scene statistics compensate for the information loss caused by image resizing. Additionally, a normalized loss function accelerates model convergence and enhances stability. We evaluate the model on six standard image quality assessment datasets (both synthetic and authentic) and show that it achieves advanced results across multiple datasets. Compared to the advanced DACNN method, our model achieved Spearman rank correlation coefficients of 0.922 and 0.923 on the KADID and KonIQ datasets, respectively, representing improvements of 1.9% and 2.4%. It demonstrated outstanding performance in handling both synthetic and authentic scenes. Full article
(This article belongs to the Section Sensing and Imaging)
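The Spearman rank correlation coefficients quoted above are the standard way IQA papers compare predicted scores against subjective ratings: the Pearson correlation of the two score vectors' ranks. A minimal sketch (no tie handling; production code would use a library implementation):

```python
import numpy as np

def srcc(predicted, subjective):
    """Spearman rank correlation between two score vectors (ties ignored)."""
    rp = np.argsort(np.argsort(predicted))   # ranks of predicted scores
    rs = np.argsort(np.argsort(subjective))  # ranks of subjective scores
    return float(np.corrcoef(rp, rs)[0, 1])
```

Because only ranks matter, SRCC rewards a model for ordering images correctly by quality even if its raw scores are on a different scale than the human ratings.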
