Article

Full-Reference Image Quality Assessment Based on an Optimal Linear Combination of Quality Measures Selected by Simulated Annealing

Domonkos Varga
Ronin Institute, Montclair, NJ 07043, USA
J. Imaging 2022, 8(8), 224; https://doi.org/10.3390/jimaging8080224
Submission received: 15 July 2022 / Revised: 17 August 2022 / Accepted: 18 August 2022 / Published: 21 August 2022
(This article belongs to the Section Image and Video Processing)

Abstract

Digital images can be distorted or contaminated by noise at various stages of image acquisition, transmission, and storage. Thus, the development of algorithms that can evaluate the perceptual quality of digital images consistently with human quality judgement is an active research topic. In this study, an image quality assessment (IQA) method is introduced that predicts the perceptual quality of a digital image by optimally combining several IQA metrics. To be more specific, an optimization problem is first defined using the weighted sum of a few IQA metrics. Subsequently, the optimal values of the weights are determined by minimizing the root mean square error between the predicted and ground-truth scores using the simulated annealing algorithm. The resulting optimization-based IQA metrics were assessed and compared to other state-of-the-art methods on four large, widely applied benchmark IQA databases. The numerical results empirically corroborate that the proposed approach is able to surpass other competing IQA methods.

1. Introduction

Nowadays, people increasingly communicate through media in the form of audio, video, and digital images. Therefore, image quality assessment (IQA) has found many applications and has become a popular topic in the research community [1]. IQA methods evaluate the perceptual quality of digital images and support, among others, image enhancement [2], restoration [3], steganography [4], and denoising algorithms [5]. Further, IQA is also necessary for benchmarking many image processing and computer-vision algorithms [6,7,8]. In the literature, IQA is classified into two groups, i.e., subjective and objective IQA. Specifically, subjective IQA deals with the collection of users’ quality ratings for a set of digital images, either in a laboratory [1] or in an online crowd-sourcing experiment [9]. An image’s perceptual quality is expressed as a mean opinion score (MOS), which is the arithmetic mean of the individual quality scores. As a result, subjective IQA provides quality-labelled images that serve as training or test data for objective IQA [10]. In turn, objective IQA deals with algorithms and mathematical models that are able to predict the quality of a given image. Conventionally, objective IQA is divided into three classes [11]—full-reference (FR) [12], reduced-reference (RR) [13], and no-reference (NR) [14]—with respect to the availability of the reference (distortion-free) images. As the names indicate, FR-IQA methods have full access to the reference images. In contrast, NR-IQA algorithms evaluate image quality without any information about the reference images [15], and RR-IQA algorithms have partial information about them.

1.1. Contribution

The development of objective FR-IQA algorithms can also involve fusion-based strategies that take already existing FR-IQA metrics and try to create a “super evaluator”. Recently, many complex fusion-based approaches have been published in the literature [16,17,18,19]. The main contribution of this paper is also a fusion-based approach. Namely, we demonstrate that a linear combination of several already existing FR-IQA metrics, optimized with a simulated annealing (SA) algorithm using a root mean square error (RMSE) objective, is able to produce well-performing fusion-based FR-IQA metrics. To be more specific, a linear combination of 16 FR-IQA metrics is used in an optimization problem to select FR-IQA metrics and find their weights via an SA algorithm that minimizes the RMSE of the prediction. Unlike the approach of Oszust [20], we apply simulated annealing instead of a genetic algorithm to perform the fusion of FR-IQA metrics, since simulated annealing usually achieves better results for continuous function approximation than basic genetic algorithms, which modify only one or two genes at a given location [21]. The proposed fusion-based metrics were evaluated on large, popular, and widely accepted IQA benchmark databases, namely LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25].

1.2. Organization

The rest of this paper is organized as follows. In Section 2, an overview of the current state of FR-IQA is given. Next, the proposed fusion-based metric is introduced in Section 3. Our experimental results, together with the description of the applied benchmark IQA databases, evaluation environment, and performance indices, are given in Section 4. Finally, conclusions are drawn in Section 5.

2. Literature Review

In this paper, we follow the classification of FR-IQA algorithms presented in [26]. To be specific, Ding et al. [26] categorized existing FR-IQA algorithms into five distinct classes, i.e., error visibility, structural similarity, information theoretic, learning-based, and fusion-based methods.
Error visibility methods measure a distance between the pixels of the distorted and the reference images to quantify perceptual quality degradation. The representative method of this class is the mean squared error (MSE), which, in the context of FR-IQA, is the average squared difference between the reference and the distorted images [27]. Another well-known example is the peak signal-to-noise ratio (PSNR), which is commonly applied to assess the reconstruction quality of lossy compression codecs [28]. Although both MSE and PSNR have low computational costs and a clear, well-understood physical meaning, they often mismatch subjective perceptions of visual quality.
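As a concrete illustration of these error-visibility measures, a minimal Python sketch is given below; the random 8-bit test images and the 255 peak value are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def mse(reference: np.ndarray, distorted: np.ndarray) -> float:
    # Average squared difference between the reference and distorted images.
    diff = reference.astype(np.float64) - distorted.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    # Peak signal-to-noise ratio in decibels; higher means less distortion.
    err = mse(reference, distorted)
    return float("inf") if err == 0 else 10.0 * np.log10(peak ** 2 / err)

# Illustrative 8-bit test images: a random reference and a noisy copy.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
dist = np.clip(ref.astype(np.int16) + rng.integers(-10, 11, size=ref.shape), 0, 255).astype(np.uint8)
print(f"MSE = {mse(ref, dist):.2f}, PSNR = {psnr(ref, dist):.2f} dB")
```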
Structural similarity methods measure the similarity between corresponding regions of the distorted and reference images using sliding windows and correlation measures. The representative and first published method of this class is the structural similarity index (SSIM) [29], which has become extremely popular in the research community, with many extensions and applications [30], and has inspired numerous variants. For example, the wavelet-domain structural similarity [31] computes SSIM in the wavelet domain to quantify perceptual quality. This work was extended by Sampat et al. [32] to the complex wavelet domain. In [33], information content was utilized as weights in the pooling process of local image quality scores. In contrast, Wang et al. [34] extended SSIM to multi-scale processing to improve perceptual quality estimation. Li and Bovik [35] elaborated an FR-IQA metric by taking the average of SSIMs computed over three different regions of an image, namely edges, textures, and smooth regions. Kolaman and Yadid-Pecht [36] proposed an extension of SSIM to color images by representing the red, green, and blue color channels with quaternions. Later, SSIM was also extended to hyperspectral images [37].
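To make the sliding-window idea concrete, the following sketch computes SSIM with the implementation shipped in scikit-image; the choice of library and the random test images are assumptions, as the cited works do not prescribe any particular implementation.

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(1)
ref = rng.integers(0, 256, size=(128, 128), dtype=np.uint8)
dist = np.clip(ref.astype(np.int16) + rng.integers(-20, 21, size=ref.shape), 0, 255).astype(np.uint8)

# data_range is the dynamic range of the pixel values (255 for 8-bit images);
# full=True also returns the local similarity map produced by the sliding
# window, whose average gives the global SSIM score.
score, ssim_map = structural_similarity(ref, dist, data_range=255, full=True)
print(f"SSIM = {score:.4f}, local map shape = {ssim_map.shape}")
```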
Information theoretic methods approach the FR-IQA task from the point of view of information communication. For example, Sheikh et al. [38,39] compared the information content of the reference and distorted images; perceptual quality was quantified by how much information is shared between the reference and distorted images. In contrast, Larson and Chandler [25] classified image distortions as near-threshold and supra-threshold and devised a quality index for each distortion type. Finally, the overall perceptual quality was determined based on the quality scores of the near-threshold and supra-threshold distortions.
As the terminology suggests, learning-based methods rely on a specific machine learning algorithm to create a quality model from training images; the obtained model is then tested on previously unseen images. For instance, Liang et al. [40] implemented a special convolutional neural network containing two paths, one for the reference image and the other for the distorted image. This network was trained on 224 × 224-pixel image patches sampled simultaneously from the reference and distorted images, and the perceptual quality of a distorted image was estimated by the average score of the considered patches. Kim and Lee [41] devised a similar network, but it predicts a visual sensitivity map that is multiplied by an error map calculated directly from the reference and the distorted images to estimate perceptual image quality. Ahn et al. [42] further improved the idea of Kim and Lee [41] by implementing an end-to-end trained convolutional neural network with three inputs, i.e., the reference image, the distorted image, and a spatial error map. Similar to [41], a distortion-sensitivity map was predicted from the inputs and later multiplied by the spatial error map to estimate perceptual image quality. In contrast to the previously mentioned methods, Ding et al. [43] extracted a set of feature maps from the reference and the distorted images using the Sobel operator, log Gabor filters, and local pattern analysis. Subsequently, the extracted feature maps were compared, and from the resulting similarity scores a feature vector was compiled that was mapped onto perceptual quality scores with a trained support vector regressor. Tang et al. [44] took a similar approach, but the authors employed a different set of features (phase congruency maps [45], gradient magnitude maps, and log Gabor maps), and the similarity scores of the feature maps were mapped onto perceptual quality with a trained random forest regressor.
Fusion-based FR-IQA methods utilize existing FR-IQA metrics to create a new FR-IQA algorithm. Okarma [46] first suggested the idea of combined methods, proposing a combined metric using the product and power of MS-SSIM [34], VIF [38], and R-SVD [47]. This approach was developed further in [19], where the optimal exponents in the product were determined by using MATLAB’s fminsearch command. In [48], Oszust took a similar approach, but the author applied the scores of traditional FR-IQA metrics as predictor variables in a lasso regression. Instead of lasso regression, Yuan et al. [49] used kernel ridge regression in a similar layout. The work of Lukin et al. [50] exhibits the properties of both learning-based and fusion-based methods. Specifically, the authors created a training and a test set from the images of an IQA benchmark database; the scores of several traditional FR-IQA metrics were then used as image features, and a neural network was trained to estimate perceptual image quality. Amirshahi et al. [51] elaborated a special fusion-based FR-IQA metric relying on a pretrained convolutional neural network. Namely, the authors ran a reference-distorted image pair through an AlexNet [52] network and compared the activation maps with the help of a traditional FR-IQA metric; the resulting scores were aggregated to obtain a single score for perceptual image quality. Bakurov et al. [53] revisited the classical SSIM [29] and MS-SSIM [34] metrics by applying evolutionary and swarm intelligence optimization methods to find optimal hyperparameters for SSIM and MS-SSIM instead of the original settings. Fusion-based metrics have also been proposed for remote sensing images [54], stitched panoramic images [55], and 3D image quality assessment [18].
For more detailed studies about FR-IQA, we refer readers to the book by Xu et al. [56] and to the study of Pedersen and Hardeberg [57]. Further, Zhang et al. [58] provide an evaluation of several state-of-the-art FR-IQA algorithms on various IQA benchmark databases. Zhai and Min provided a comprehensive overview of classical algorithms in [59]. For the quality assessment of screen content images [60], Min et al. gave an overview in [61].

3. Proposed Method

As already mentioned, an FR-IQA metric should deliver perceptual quality scores consistent with human judgement using both the distorted and reference images. Let us express the aggregated decision of n different FR-IQA metrics by a weighted sum:
Q = \sum_{i=1}^{n} \alpha_i q_i,   (1)
where q_i (i = 1, 2, ..., n) stands for the quality score provided by the ith FR-IQA metric. Further, α = (α_1, α_2, ..., α_n) is a real vector of weights whose values are found via an optimization procedure to ensure an effective fusion of FR-IQA metrics. Specifically, the optimization-based fusion was carried out in our study using n = 16 open-source FR-IQA metrics: FSIM [62], FSIMc [62], GSM [63], IFC [38], IFS [64], IW-SSIM [33], MAD [25], MS-SSIM [34], NQM [65], PSNR, RFSIM [66], SFF [67], SR-SIM [12], SSIM [29], VIF [39], and VSI [68].
In the literature, Pearson’s linear correlation coefficient (PLCC), Spearman’s rank-order correlation coefficient (SROCC), Kendall’s rank-order correlation coefficient (KROCC), and the root mean square error (RMSE) are often considered to characterize the consistency between the ground-truth quality scores of an IQA benchmark database and the quality scores predicted by an FR-IQA metric [22]. From these performance indices, the RMSE was applied as the objective function in the proposed optimization-based metric. Figure 1 and Figure 2 depict flowcharts demonstrating the compilation of the proposed fusion-based metrics and their application for FR-IQA.
Formally, the optimization problem can be written as
\min_{\alpha} \, RMSE(F(Q_p, \beta), S), \quad \text{subject to} \; \alpha_i \in \mathbb{R}, \; n \in \mathbb{N}, \; \beta \geq 0,   (2)
where Q_p is a vector containing the quality scores of a set of images obtained by Equation (1), and S contains the corresponding ground-truth scores. Further, prior to the calculation of the RMSE, a non-linear regression is also applied [22], since a non-linear relationship exists between the ground-truth and predicted scores. Formally, it can be written as
Q = \beta_1 \left( \frac{1}{2} - \frac{1}{1 + e^{\beta_2 (Q_p - \beta_3)}} \right) + \beta_4 Q_p + \beta_5,   (3)
where β_1, ..., β_5 stand for the parameters of the regression model. In addition, Q and Q_p are the fitted and predicted scores, respectively. Since four large, widely accepted IQA benchmark databases are used in this paper, i.e., LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25], four optimization-based fusion FR-IQA metrics are proposed, one for each database. To this end, approximately 20% of the reference images were randomly selected from a given benchmark IQA database. More precisely, Q and S were compiled from those distorted images whose reference counterparts were randomly selected. Although 20% is a common choice for parameter setting in the literature [69,70], some researchers have applied 30% [62] or 80% [71] for parameter tuning. However, we evaluate all the fusion-based metrics on all the databases to demonstrate results independent of the database.
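For illustration, the following Python sketch fits the five-parameter logistic mapping of Equation (3) with SciPy's curve_fit; the paper's implementation used MATLAB, so this is only an analogous sketch, and the synthetic predicted scores and MOS values are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic5(qp, b1, b2, b3, b4, b5):
    # Q = b1 * (1/2 - 1 / (1 + exp(b2 * (Qp - b3)))) + b4 * Qp + b5
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (qp - b3)))) + b4 * qp + b5

# Synthetic predicted scores and noisy ground-truth MOS values (assumptions).
rng = np.random.default_rng(2)
qp = np.linspace(0.0, 1.0, 200)
mos = logistic5(qp, 3.0, 8.0, 0.5, 1.0, 2.0) + 0.05 * rng.standard_normal(qp.size)

# Rough initial guess for beta_1..beta_5, then least-squares fitting.
beta0 = [mos.max(), 1.0, float(qp.mean()), 1.0, float(mos.mean())]
beta, _ = curve_fit(logistic5, qp, mos, p0=beta0, maxfev=20000)
rmse = float(np.sqrt(np.mean((logistic5(qp, *beta) - mos) ** 2)))
print("fitted beta:", np.round(beta, 3), "RMSE:", round(rmse, 4))
```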
Next, the optimization problem described by Equation (2) was solved to determine the α_i weights of Equation (1). Since the number of possible solutions increases exponentially with the number of considered FR-IQA metrics, simulated annealing (SA) [72,73] was used to solve the above-described optimization task. SA is a probabilistic optimization technique for estimating the global optimum of a given function. The stochastic nature of this algorithm enables the usage of nonlinear objective functions where many other methods do not operate well. SA was inspired by the physical process of heating a material and then slowly decreasing the temperature to eliminate imperfections from the material; hence, minimizing the system’s energy is the main goal. More precisely, SA randomly generates a new point at each iteration. The new point’s distance from the current point, i.e., the extent of the search, is determined by a probability distribution with a scale proportional to the temperature. All new points that reduce the objective are accepted by the algorithm, but points that increase the objective can also be accepted with a pre-defined probability. This property prevents SA from getting stuck in local minima in early iterations. In our implementation, SA was performed using MATLAB R2020a with the Global Optimization Toolbox, using α_i = 0 for i = 1, 2, ..., n as the initial point and defining no lower or upper bounds for the method. After 100 runs of SA, the best solution, α_d^best, was selected, where d denotes the database from which 20% of the reference images were chosen randomly.
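A minimal Python sketch of this weight search is given below. SciPy's dual_annealing stands in for MATLAB's simulated annealing routine, wide box bounds replace the unbounded search (dual_annealing requires bounds), the metric-score matrix is synthetic, and the nonlinear remapping of Equation (3) is omitted from the objective for brevity.

```python
import numpy as np
from scipy.optimize import dual_annealing

# Synthetic stand-in data: one column of metric scores per FR-IQA measure.
rng = np.random.default_rng(3)
n_images, n_metrics = 150, 16
M = rng.random((n_images, n_metrics))                  # q_i scores
S = M @ rng.normal(0.0, 100.0, n_metrics) + rng.normal(0.0, 1.0, n_images)  # ground truth

def objective(alpha: np.ndarray) -> float:
    # RMSE between the linearly combined scores of Equation (1) and the
    # ground-truth scores; the paper additionally remaps the combined scores
    # through Equation (3) before computing the RMSE.
    qp = M @ alpha
    return float(np.sqrt(np.mean((qp - S) ** 2)))

bounds = [(-2000.0, 2000.0)] * n_metrics               # wide box instead of "no bounds"
result = dual_annealing(objective, bounds, x0=np.zeros(n_metrics), seed=42, maxiter=500)
print("best RMSE:", round(result.fun, 4))
print("near-zero weights (deselected metrics):", int(np.sum(np.abs(result.x) < 1.0)))
```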
At the end of the SA optimization processes using the LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25] databases, the following FR-IQA metrics were obtained; they are codenamed LCSA, referring to the fact that they are linear combinations of FR-IQA measures selected by simulated annealing:
LCSA1 (using α_LIVE^best) = 561.0123·VSI + 281.826·FSIMc − 116.1501·IFC − 846.6376·MAD + 349.6191·MS-SSIM − 262.6766·NQM + 41.6348·PSNR − 308.9426·SSIM + 722.4479·VIF,   (4)
LCSA2 (using α_TID2013^best) = 1774.8368·VSI + 467.5433·FSIMc − 332.1863·GSM − 63.4379·IFC + 84.7954·IW-SSIM − 346.5585·MAD − 126.5188·NQM + 381.0923·PSNR − 626.9841·SSIM + 380.3341·VIF + 524.6484·IFS + 342.7968·SFF,   (5)
LCSA3 (using α_TID2008^best) = 1253.2402·VSI + 217.0877·IW-SSIM − 168.1779·MAD − 75.6832·NQM + 276.9035·PSNR − 28.5915·RFSIM − 454.7619·SSIM + 203.0893·VIF + 500.4323·IFS − 153.3686·SFF,   (6)
LCSA4 (using α_CSIQ^best) = 266.3256·FSIM − 119.8937·FSIMc − 15.6937·IW-SSIM − 529.1806·MAD − 656.4991·MS-SSIM − 73.009·NQM + 381.0923·PSNR − 626.9841·SSIM + 380.3341·VIF + 524.6484·IFS + 342.7968·SFF.   (7)
The corresponding β vectors are as follows:
β_LIVE = (106.1735, 36.8421, 30.0447, 15.7705, 139.3613),
β_TID2013 = (56.413, 193.7249, 14.9834, 147.7736, 89.8778),
β_TID2008 = (13.4153, 115.9834, 45.4464, 22.0253, 269.7624),
β_CSIQ = (13.5361, 105.4132, 70.1095, 150.7645, 11.5291).
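At test time, applying an LCSA metric amounts to evaluating the linear combination above. The sketch below instantiates LCSA1 with the weights of Equation (4) as reconstructed here; the metric scores in the example are placeholders that would, in practice, come from running the corresponding FR-IQA implementations on a reference-distorted image pair.

```python
# Weights of Equation (4) (alpha_LIVE^best), keyed by metric name.
LCSA1_WEIGHTS = {
    "VSI": 561.0123, "FSIMc": 281.826, "IFC": -116.1501, "MAD": -846.6376,
    "MS-SSIM": 349.6191, "NQM": -262.6766, "PSNR": 41.6348,
    "SSIM": -308.9426, "VIF": 722.4479,
}

def lcsa1(scores: dict) -> float:
    # Linear combination of the selected FR-IQA metric scores.
    return sum(w * scores[name] for name, w in LCSA1_WEIGHTS.items())

# Placeholder metric scores for one reference-distorted image pair.
scores = {"VSI": 0.95, "FSIMc": 0.93, "IFC": 4.2, "MAD": 0.12, "MS-SSIM": 0.94,
          "NQM": 0.80, "PSNR": 32.5, "SSIM": 0.91, "VIF": 0.88}
print(f"LCSA1 raw score: {lcsa1(scores):.2f}")  # then mapped via Eq. (3) with beta_LIVE
```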

4. Results

In this section, our experimental results are presented. First, the applied IQA benchmark databases and evaluation protocol are described in Section 4.1. Next, Section 4.2 presents a comparison to other competing state-of-the-art methods on four large IQA benchmark databases, i.e., LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25].

4.1. Applied IQA Benchmark Databases and Evaluation Protocol

The main properties of the applied IQA benchmark databases are outlined in Table 1. These databases consist of a set of reference images whose visual quality is considered perfect and flawless. Further, distorted images are generated artificially from the reference images using different distortion types (i.e., JPEG compression noise, JPEG2000 compression noise, salt-and-pepper noise, motion blur, Gaussian noise, Poisson noise, etc.) at different distortion levels. Figure 3 depicts the empirical MOS distributions of the applied benchmark databases.
In the literature, PLCC, SROCC, and KROCC are widely used and accepted to characterize the performance of FR-IQA methods. They are measured between the ground-truth scores of an IQA benchmark database and the predicted scores. Moreover, prior to the calculation of PLCC, a non-linear regression is also applied [22], since a non-linear relationship exists between the ground-truth and predicted scores; this relationship was defined by Equation (3), where Q and Q_p are the fitted and predicted scores, respectively. PLCC between vectors x and y of length m is defined as
PLCC(x, y) = \frac{\bar{x}^T \bar{y}}{\sqrt{\bar{x}^T \bar{x}} \sqrt{\bar{y}^T \bar{y}}},
where x̄ and ȳ are the mean-subtracted versions of vectors x and y, respectively. On the other hand, SROCC can be defined as
SROCC(x, y) = 1 - \frac{6 \sum_{i=1}^{m} (x_i - y_i)^2}{m (m^2 - 1)},
where x_i and y_i are the ith entries of vectors x and y, respectively. In contrast, KROCC uses the number of concordant pairs (m_c) and the number of discordant pairs (m_d) between vectors x and y and is defined as
KROCC(x, y) = \frac{m_c - m_d}{\frac{1}{2} m (m - 1)}.
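A small sketch computing the three indices is given below; PLCC follows the mean-subtracted inner-product form above, SciPy's implementations are used for the rank correlations, and the test vectors are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr, kendalltau

def plcc(x: np.ndarray, y: np.ndarray) -> float:
    # Pearson's linear correlation via mean-subtracted inner products.
    xc, yc = x - x.mean(), y - y.mean()
    return float(xc @ yc / np.sqrt((xc @ xc) * (yc @ yc)))

# Illustrative predicted and ground-truth score vectors.
rng = np.random.default_rng(4)
predicted = rng.random(100)
ground_truth = 0.8 * predicted + 0.2 * rng.random(100)

srocc, _ = spearmanr(predicted, ground_truth)
krocc, _ = kendalltau(predicted, ground_truth)
print(f"PLCC = {plcc(predicted, ground_truth):.4f}, SROCC = {srocc:.4f}, KROCC = {krocc:.4f}")
```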
As already mentioned, the proposed fusion-based metrics were implemented using MATLAB R2020a and its Global Optimization Toolbox. The computer configuration applied in our experiments is summarized in Table 2.

4.2. Comparison to the State-of-the-Art

In this subsection, the proposed fusion-based metrics are compared to several state-of-the-art FR-IQA methods whose original source codes were made publicly available by the authors. Moreover, we reimplemented the fusion-based SSIM-CNN [51] method in MATLAB R2020a (available at: https://github.com/Skythianos/SSIM-CNN (accessed on 12 May 2022)). The PLCC, SROCC, and KROCC performance comparisons of the proposed fusion-based FR-IQA metrics with the state-of-the-art are summarized in Table 3 and Table 4. Specifically, Table 3 demonstrates the results on LIVE [22] and TID2013 [23], while Table 4 contains the obtained results for the TID2008 [24] and CSIQ [25] databases. The obtained results clearly show that the proposed LCSA metrics are able to outperform the state-of-the-art. Specifically, the LCSA metric that was parameter-tuned on database d always delivers the highest correlation values on that database, while another LCSA metric not parameter-tuned on d usually provides the second-best results.
Table 5 illustrates the direct and weighted averages of the correlation values measured on LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25]. From the direct averages, it can be clearly seen that the proposed LCSA2 and LCSA4 provide the best results in two out of three performance indices, while LCSA3 produces the second-best KROCC value. The weighted averages are biased towards those FR-IQA measures that perform well on TID2013 [23], since it is the largest of the applied benchmarks. Accordingly, LCSA2 is the best-performing method in this respect, providing the best SROCC and KROCC results. Further, LCSA4 delivers the second-best PLCC and KROCC values, while LCSA3’s performance is equivalent to that of LCSA4 in terms of SROCC and KROCC.
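The direct and weighted averaging can be reproduced from Tables 1, 3, and 4; for example, the following sketch recovers the direct (0.930) and weighted (0.917) average SROCC of LCSA2 reported in Table 5, using the distorted-image counts of Table 1 as weights (weighting by database size is an assumption consistent with the reported values).

```python
import numpy as np

# Distorted-image counts from Table 1 (LIVE, TID2013, TID2008, CSIQ).
counts = np.array([779, 3000, 1700, 866])
# SROCC values of LCSA2 on the four databases (Tables 3 and 4).
srocc_lcsa2 = np.array([0.962, 0.903, 0.906, 0.949])

print(f"direct average:   {srocc_lcsa2.mean():.3f}")                       # 0.930
print(f"weighted average: {np.average(srocc_lcsa2, weights=counts):.3f}")  # 0.917
```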
In the following, we examine the performance of the proposed and the other state-of-the-art methods on the individual distortion types of the applied IQA benchmark databases. The distortion types and their abbreviations used by the databases are summarized in Table 6. Further, Table 7, Table 8, Table 9 and Table 10 contain detailed results on the different distortion types of LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25], respectively. To be more specific, SROCC values are given for each individual distortion type.

5. Conclusions

In this study, we presented a novel fusion-based FR-IQA metric using simulated annealing. Specifically, an optimization problem was solved based on the weighted sum of several FR-IQA metrics by minimizing the root mean squared error between the predicted and ground-truth perceptual quality scores. The evaluation of the proposed fusion-based metrics on four large publicly available and widely accepted IQA benchmark databases empirically corroborated that the proposed metrics are able to produce competitive results compared to the state-of-the-art in terms of various performance indices, such as PLCC, SROCC, and KROCC. Future research could involve other optimization techniques and their combination for improved perceptual quality prediction. Another direction is the generalization of the proposed method for other types of media.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

In this paper, the following publicly available benchmark databases were used: 1. LIVE: https://live.ece.utexas.edu/research/quality/subjective.htm (accessed on 12 May 2022), 2. TID2013: http://www.ponomarenko.info/tid2013.htm (accessed on 12 May 2022), 3. TID2008: http://www.ponomarenko.info/tid2008.htm (accessed on 12 May 2022), and 4. CSIQ: https://isp.uv.es/data_quality.html (accessed on 12 May 2022).

Acknowledgments

We thank the anonymous reviewers for their careful reading of our manuscript and their many insightful comments and suggestions.

Conflicts of Interest

The author declares no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
FR-IQA    full-reference image quality assessment
IQA       image quality assessment
KROCC     Kendall’s rank order correlation coefficient
MOS       mean opinion score
MSE       mean squared error
NR-IQA    no-reference image quality assessment
PLCC      Pearson’s linear correlation coefficient
PSNR      peak signal-to-noise ratio
RMSE      root mean square error
RR-IQA    reduced-reference image quality assessment
SA        simulated annealing
SROCC     Spearman’s rank order correlation coefficient
SSIM      structural similarity index

References

1. Chubarau, A.; Akhavan, T.; Yoo, H.; Mantiuk, R.K.; Clark, J. Perceptual image quality assessment for various viewing conditions and display systems. Electron. Imaging 2020, 2020, 67-1.
2. Tao, L.; Zhu, C.; Xiang, G.; Li, Y.; Jia, H.; Xie, X. LLCNN: A convolutional neural network for low-light image enhancement. In Proceedings of the 2017 IEEE Visual Communications and Image Processing (VCIP), St. Petersburg, FL, USA, 10–13 December 2017; pp. 1–4.
3. Rehman, A.; Rostami, M.; Wang, Z.; Brunet, D.; Vrscay, E.R. SSIM-inspired image restoration using sparse representation. EURASIP J. Adv. Signal Process. 2012, 2012, 1–12.
4. Setiadi, D.R.I.M. PSNR vs SSIM: Imperceptibility quality assessment for image steganography. Multimed. Tools Appl. 2021, 80, 8423–8444.
5. Goyal, B.; Gupta, A.; Dogra, A.; Koundal, D. An adaptive bitonic filtering based edge fusion algorithm for Gaussian denoising. Int. J. Cogn. Comput. Eng. 2022, 3, 90–97.
6. Wang, Z. Applications of objective image quality assessment methods [applications corner]. IEEE Signal Process. Mag. 2011, 28, 137–142.
7. Kalender, W.A. Computed Tomography: Fundamentals, System Technology, Image Quality, Applications; John Wiley & Sons: Hoboken, NJ, USA, 2011.
8. Kaur, B.; Dogra, A.; Goyal, B. Comparative Analysis of Bilateral Filter and its Variants for Magnetic Resonance Imaging. Open Neuroimaging J. 2020, 13, 21–29.
9. Saupe, D.; Hahn, F.; Hosu, V.; Zingman, I.; Rana, M.; Li, S. Crowd workers proven useful: A comparative study of subjective video quality assessment. In Proceedings of the QoMEX 2016: 8th International Conference on Quality of Multimedia Experience, Lisbon, Portugal, 6–8 June 2016.
10. Lin, H.; Hosu, V.; Saupe, D. KADID-10k: A large-scale artificially distorted IQA database. In Proceedings of the 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 5–7 June 2019; pp. 1–3.
11. Ciocca, G.; Corchs, S.; Gasparini, F.; Schettini, R. How to assess image quality within a workflow chain: An overview. Int. J. Digit. Libr. 2014, 15, 1–25.
12. Zhang, L.; Li, H. SR-SIM: A fast and high performance IQA index based on spectral residual. In Proceedings of the 2012 19th IEEE International Conference on Image Processing, Orlando, FL, USA, 30 September–3 October 2012; pp. 1473–1476.
13. Soundararajan, R.; Bovik, A.C. RRED indices: Reduced reference entropic differencing for image quality assessment. IEEE Trans. Image Process. 2011, 21, 517–526.
14. Min, X.; Zhai, G.; Gu, K.; Liu, Y.; Yang, X. Blind image quality estimation via distortion aggravation. IEEE Trans. Broadcast. 2018, 64, 508–517.
15. Min, X.; Gu, K.; Zhai, G.; Liu, J.; Yang, X.; Chen, C.W. Blind quality assessment based on pseudo-reference image. IEEE Trans. Multimed. 2017, 20, 2049–2062.
16. Bouida, A.; Khelifi, M.; Beladgham, M.; Hamlili, F.Z. Monte Carlo Optimization of a Combined Image Quality Assessment for Compressed Images Evaluation. Trait. Du Signal 2021, 38, 281–289.
17. Merzougui, N. Multi-measures fusion based on multi-objective genetic programming for full-reference image quality assessment. arXiv 2017, arXiv:1801.06030.
18. Okarma, K. On the usefulness of combined metrics for 3D image quality assessment. In Image Processing & Communications Challenges 6; Springer: Berlin/Heidelberg, Germany, 2015; pp. 137–144.
19. Okarma, K.; Lech, P.; Lukin, V.V. Combined Full-Reference Image Quality Metrics for Objective Assessment of Multiply Distorted Images. Electronics 2021, 10, 2256.
20. Oszust, M. Full-reference image quality assessment with linear combination of genetically selected quality measures. PLoS ONE 2016, 11, e0158333.
21. Soares, S.; Antunes, C.H.; Araújo, R. Comparison of a genetic algorithm and simulated annealing for automatic neural network ensemble development. Neurocomputing 2013, 121, 498–511.
22. Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Process. 2006, 15, 3440–3451.
23. Ponomarenko, N.; Ieremeiev, O.; Lukin, V.; Egiazarian, K.; Jin, L.; Astola, J.; Vozel, B.; Chehdi, K.; Carli, M.; Battisti, F.; et al. Color image database TID2013: Peculiarities and preliminary results. In Proceedings of the European Workshop on Visual Information Processing (EUVIP), Paris, France, 10–12 June 2013; pp. 106–111.
24. Ponomarenko, N.; Lukin, V.; Zelensky, A.; Egiazarian, K.; Carli, M.; Battisti, F. TID2008—A database for evaluation of full-reference visual quality assessment metrics. Adv. Mod. Radioelectron. 2009, 10, 30–45.
25. Larson, E.C.; Chandler, D.M. Most apparent distortion: Full-reference image quality assessment and the role of strategy. J. Electron. Imaging 2010, 19, 011006.
26. Ding, K.; Ma, K.; Wang, S.; Simoncelli, E.P. Comparison of full-reference image quality models for optimization of image processing systems. Int. J. Comput. Vis. 2021, 129, 1258–1281.
27. Sara, U.; Akter, M.; Uddin, M.S. Image quality assessment through FSIM, SSIM, MSE and PSNR—A comparative study. J. Comput. Commun. 2019, 7, 8–18.
28. Saupe, D.; Hamzaoui, R.; Hartenstein, H. Fractal Image Compression: An Introductory Overview; Universität Wien Fakultät für Informatik: Wien, Austria, 1997.
29. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
30. Wang, S.; Rehman, A.; Wang, Z.; Ma, S.; Gao, W. SSIM-motivated rate-distortion optimization for video coding. IEEE Trans. Circuits Syst. Video Technol. 2011, 22, 516–529.
31. Liu, L.; Wang, Y.; Wu, Y. A wavelet-domain structure similarity for image quality assessment. In Proceedings of the 2009 2nd International Congress on Image and Signal Processing, Tianjin, China, 17–19 October 2009; pp. 1–5.
32. Sampat, M.P.; Wang, Z.; Gupta, S.; Bovik, A.C.; Markey, M.K. Complex wavelet structural similarity: A new image similarity index. IEEE Trans. Image Process. 2009, 18, 2385–2401.
33. Wang, Z.; Li, Q. Information content weighting for perceptual image quality assessment. IEEE Trans. Image Process. 2010, 20, 1185–1198.
34. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; Volume 2, pp. 1398–1402.
35. Li, C.; Bovik, A.C. Three-component weighted structural similarity index. In Proceedings of the Image Quality and System Performance VI; International Society for Optics and Photonics: San Jose, CA, USA, 2009; Volume 7242, p. 72420Q.
36. Kolaman, A.; Yadid-Pecht, O. Quaternion structural similarity: A new quality index for color images. IEEE Trans. Image Process. 2011, 21, 1526–1536.
37. Zhu, R.; Zhou, F.; Xue, J.H. MvSSIM: A quality assessment index for hyperspectral images. Neurocomputing 2018, 272, 250–257.
38. Sheikh, H.R.; Bovik, A.C.; De Veciana, G. An information fidelity criterion for image quality assessment using natural scene statistics. IEEE Trans. Image Process. 2005, 14, 2117–2128.
39. Sheikh, H.R.; Bovik, A.C. Image information and visual quality. IEEE Trans. Image Process. 2006, 15, 430–444.
40. Liang, Y.; Wang, J.; Wan, X.; Gong, Y.; Zheng, N. Image quality assessment using similar scene as reference. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 3–18.
41. Kim, J.; Lee, S. Deep learning of human visual sensitivity in image quality assessment framework. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1676–1684.
42. Ahn, S.; Choi, Y.; Yoon, K. Deep learning-based distortion sensitivity prediction for full-reference image quality assessment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 344–353.
43. Ding, Y.; Zhao, Y.; Zhao, X. Image quality assessment based on multi-feature extraction and synthesis with support vector regression. Signal Process. Image Commun. 2017, 54, 81–92.
44. Tang, Z.; Zheng, Y.; Gu, K.; Liao, K.; Wang, W.; Yu, M. Full-reference image quality assessment by combining features in spatial and frequency domains. IEEE Trans. Broadcast. 2018, 65, 138–151.
45. Kovesi, P. Image features from phase congruency. Videre J. Comput. Vis. Res. 1999, 1, 1–26.
46. Okarma, K. Combined full-reference image quality metric linearly correlated with subjective assessment. In Proceedings of the International Conference on Artificial Intelligence and Soft Computing; Springer: Berlin/Heidelberg, Germany, 2010; pp. 539–546.
47. Mansouri, A.; Aznaveh, A.M.; Torkamani-Azar, F.; Jahanshahi, J.A. Image quality assessment using the singular value decomposition theorem. Opt. Rev. 2009, 16, 49–53.
48. Oszust, M. Image quality assessment with lasso regression and pairwise score differences. Multimed. Tools Appl. 2017, 76, 13255–13270.
49. Yuan, Y.; Guo, Q.; Lu, X. Image quality assessment: A sparse learning way. Neurocomputing 2015, 159, 227–241.
50. Lukin, V.V.; Ponomarenko, N.N.; Ieremeiev, O.I.; Egiazarian, K.O.; Astola, J. Combining full-reference image visual quality metrics by neural network. In Proceedings of the Human Vision and Electronic Imaging XX, San Francisco, CA, USA, 17 March 2015; Volume 9394, pp. 172–183.
51. Amirshahi, S.A.; Pedersen, M.; Beghdadi, A. Reviving traditional image quality metrics using CNNs. In Proceedings of the Color and Imaging Conference, Albuquerque, NM, USA, 4–8 November 2018; Volume 2018, pp. 241–246.
52. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
53. Bakurov, I.; Buzzelli, M.; Schettini, R.; Castelli, M.; Vanneschi, L. Structural similarity index (SSIM) revisited: A data-driven approach. Expert Syst. Appl. 2022, 189, 116087.
54. Okarma, K. Combined visual quality metric of remote sensing images based on neural network. In Radioelectronic and Computer Systems; National Aerospace University: Kharkiv, Ukraine, 2020; pp. 4–15.
55. Okarma, K.; Chlewicki, W.; Kopytek, M.; Marciniak, B.; Lukin, V. Entropy-Based Combined Metric for Automatic Objective Quality Assessment of Stitched Panoramic Images. Entropy 2021, 23, 1525.
56. Xu, L.; Lin, W.; Kuo, C.C.J. Visual Quality Assessment by Machine Learning; Springer: Berlin/Heidelberg, Germany, 2015.
57. Pedersen, M.; Hardeberg, J.Y. Full-reference image quality metrics: Classification and evaluation. Found. Trends Comput. Graph. Vis. 2012, 7, 1–80.
58. Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. A comprehensive evaluation of full reference image quality assessment algorithms. In Proceedings of the 2012 19th IEEE International Conference on Image Processing, Orlando, FL, USA, 30 September–3 October 2012; pp. 1477–1480.
59. Zhai, G.; Min, X. Perceptual image quality assessment: A survey. Sci. China Inf. Sci. 2020, 63, 1–52.
60. Min, X.; Ma, K.; Gu, K.; Zhai, G.; Wang, Z.; Lin, W. Unified blind quality assessment of compressed natural, graphic, and screen content images. IEEE Trans. Image Process. 2017, 26, 5462–5474.
61. Min, X.; Gu, K.; Zhai, G.; Yang, X.; Zhang, W.; Le Callet, P.; Chen, C.W. Screen content quality assessment: Overview, benchmark, and beyond. ACM Comput. Surv. 2021, 54, 1–36.
62. Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A feature similarity index for image quality assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386.
63. Liu, A.; Lin, W.; Narwaria, M. Image quality assessment based on gradient similarity. IEEE Trans. Image Process. 2011, 21, 1500–1512.
64. Chang, H.W.; Zhang, Q.W.; Wu, Q.G.; Gan, Y. Perceptual image quality assessment by independent feature detector. Neurocomputing 2015, 151, 1142–1152.
65. Damera-Venkata, N.; Kite, T.D.; Geisler, W.S.; Evans, B.L.; Bovik, A.C. Image quality assessment based on a degradation model. IEEE Trans. Image Process. 2000, 9, 636–650.
66. Zhang, L.; Zhang, L.; Mou, X. RFSIM: A feature based image quality assessment metric using Riesz transforms. In Proceedings of the 2010 IEEE International Conference on Image Processing, Hong Kong, China, 26–29 September 2010; pp. 321–324.
67. Chang, H.W.; Yang, H.; Gan, Y.; Wang, M.H. Sparse feature fidelity for perceptual image quality assessment. IEEE Trans. Image Process. 2013, 22, 4007–4018.
68. Zhang, L.; Shen, Y.; Li, H. VSI: A visual saliency-induced index for perceptual image quality assessment. IEEE Trans. Image Process. 2014, 23, 4270–4281.
69. Shi, C.; Lin, Y. Full reference image quality assessment based on visual salience with color appearance and gradient similarity. IEEE Access 2020, 8, 97310–97320.
70. Shi, C.; Lin, Y. Image Quality Assessment Based on Three Features Fusion in Three Fusion Steps. Symmetry 2022, 14, 773.
71. Wu, J.; Lin, W.; Shi, G. Image quality assessment with degradation on spatial structure. IEEE Signal Process. Lett. 2014, 21, 437–440.
72. Van Laarhoven, P.J.; Aarts, E.H. Simulated annealing. In Simulated Annealing: Theory and Applications; Springer: Berlin/Heidelberg, Germany, 1987; pp. 7–15.
73. Kirkpatrick, S.; Gelatt, C.D., Jr.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
74. Yu, X.; Bampis, C.G.; Gupta, P.; Bovik, A.C. Predicting the quality of images compressed after distortion in two steps. IEEE Trans. Image Process. 2019, 28, 5757–5770.
75. Temel, D.; AlRegib, G. CSV: Image quality assessment based on color, structure, and visual system. Signal Process. Image Commun. 2016, 48, 92–103.
76. Ding, K.; Ma, K.; Wang, S.; Simoncelli, E.P. Image quality assessment: Unifying structure and texture similarity. arXiv 2020, arXiv:2004.07728.
77. Zhang, X.; Feng, X.; Wang, W.; Xue, W. Edge strength similarity for image quality assessment. IEEE Signal Process. Lett. 2013, 20, 319–322.
78. Temel, D.; AlRegib, G. ReSIFT: Reliability-weighted sift-based image quality assessment. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 2047–2051.
79. Yang, G.; Li, D.; Lu, F.; Liao, Y.; Yang, W. RVSIM: A feature similarity method for full-reference image quality assessment. EURASIP J. Image Video Process. 2018, 2018, 1–15.
80. Temel, D.; AlRegib, G. Perceptual image quality assessment through spectral analysis of error representations. Signal Process. Image Commun. 2019, 70, 37–46.
Figure 1. In the offline optimization stage, the proposed fusion-based metric is obtained using 20% of the reference images together with their distorted counterparts. Next, a simulated annealing (SA) optimization process selects FR-IQA metrics and assigns weights to them. The resulting metric is codenamed LCSA-IQA to reflect that it is a linear combination of selected FR-IQA metrics whose weights were assigned by simulated annealing.
Figure 2. The optimal linear combination of the selected FR-IQA metrics is applied to estimate perceptual image quality.
Figure 3. Empirical MOS distributions in the used benchmark IQA databases: (a) LIVE, (b) TID2013, (c) TID2008, and (d) CSIQ.
Table 1. Summary of benchmark databases used in this study.

                          LIVE [22]   TID2013 [23]   TID2008 [24]   CSIQ [25]
No. of reference images   29          25             25             30
No. of distorted images   779         3000           1700           866
No. of distortions        5           24             17             6
No. of levels             5           5              4              4-5
No. of observers          161         917            838            35
Resolution                768 × 512   512 × 384      512 × 384      500 × 500
Table 2. Computer configuration applied in our experiments.

Computer model     STRIX Z270H Gaming
Operating system   Windows 10
Memory             15 GB
CPU                Intel(R) Core(TM) i7-7700K CPU 4.20 GHz (8 cores)
GPU                Nvidia GeForce GTX 1080
Table 3. PLCC, SROCC, and KROCC performance comparison of the proposed fusion-based FR-IQA metrics on LIVE and TID2013 databases with the state-of-the-art. The best results are typed in bold, and the second best results are underlined.

                   LIVE [22]             TID2013 [23]
FR-IQA Metric      PLCC   SROCC  KROCC   PLCC   SROCC  KROCC
2stepQA [74]       0.937  0.932  0.828   0.736  0.733  0.550
CSV [75]           0.967  0.959  0.834   0.852  0.848  0.657
DISTS [76]         0.954  0.954  0.811   0.759  0.711  0.524
ESSIM [77]         0.963  0.962  0.840   0.740  0.797  0.627
FSIM [62]          0.960  0.963  0.833   0.859  0.802  0.629
FSIMc [62]         0.961  0.965  0.836   0.877  0.851  0.667
GSM [63]           0.944  0.955  0.831   0.789  0.787  0.593
IFC [38]           0.927  0.926  0.758   0.554  0.539  0.394
IFS [64]           0.959  0.960  0.825   0.879  0.870  0.679
IW-SSIM [33]       0.952  0.956  0.817   0.832  0.778  0.598
MAD [25]           0.967  0.967  0.842   0.827  0.778  0.600
MS-SSIM [34]       0.941  0.951  0.804   0.794  0.785  0.604
NQM [65]           0.912  0.909  0.741   0.690  0.643  0.474
PSNR               0.872  0.876  0.687   0.616  0.646  0.467
ReSIFT [78]        0.961  0.962  0.838   0.630  0.623  0.471
RFSIM [66]         0.935  0.940  0.782   0.833  0.774  0.595
RVSIM [79]         0.641  0.630  0.495   0.763  0.683  0.520
SFF [67]           0.963  0.965  0.836   0.871  0.851  0.658
SR-SIM [12]        0.955  0.962  0.829   0.859  0.800  0.631
SSIM [29]          0.941  0.951  0.804   0.618  0.616  0.437
SSIM-CNN [51]      0.965  0.963  0.838   0.759  0.752  0.566
SUMMER [80]        0.967  0.959  0.833   0.623  0.622  0.472
VIF [39]           0.941  0.964  0.828   0.774  0.677  0.515
VSI [68]           0.948  0.952  0.805   0.900  0.894  0.677
LCSA1              0.974  0.974  0.857   0.820  0.788  0.607
LCSA2              0.846  0.962  0.828   0.916  0.903  0.731
LCSA3              0.947  0.969  0.843   0.770  0.821  0.647
LCSA4              0.967  0.970  0.844   0.859  0.823  0.649
Table 4. PLCC, SROCC, and KROCC performance comparison of the proposed fusion-based FR-IQA metrics on TID2008 and CSIQ databases with the state-of-the-art. The best results are typed in bold, and the second best results are underlined.

                   TID2008 [24]          CSIQ [25]
FR-IQA Metric      PLCC   SROCC  KROCC   PLCC   SROCC  KROCC
2stepQA [74]       0.757  0.769  0.574   0.841  0.849  0.655
CSV [75]           0.852  0.848  0.657   0.933  0.933  0.766
DISTS [76]         0.705  0.668  0.488   0.930  0.930  0.764
ESSIM [77]         0.658  0.876  0.696   0.814  0.933  0.768
FSIM [62]          0.874  0.881  0.695   0.912  0.924  0.757
FSIMc [62]         0.876  0.884  0.699   0.919  0.931  0.769
GSM [63]           0.782  0.781  0.578   0.896  0.911  0.737
IFC [38]           0.575  0.568  0.424   0.837  0.767  0.590
IFS [64]           0.879  0.869  0.678   0.958  0.958  0.817
IW-SSIM [33]       0.842  0.856  0.664   0.804  0.921  0.753
MAD [25]           0.831  0.829  0.639   0.950  0.947  0.797
MS-SSIM [34]       0.838  0.846  0.648   0.899  0.913  0.739
NQM [65]           0.608  0.624  0.461   0.743  0.740  0.564
PSNR               0.447  0.489  0.346   0.853  0.809  0.599
ReSIFT [78]        0.627  0.632  0.484   0.884  0.868  0.695
RFSIM [66]         0.865  0.868  0.678   0.912  0.930  0.765
RVSIM [79]         0.789  0.743  0.566   0.923  0.903  0.728
SFF [67]           0.871  0.851  0.658   0.964  0.960  0.826
SR-SIM [12]        0.859  0.799  0.631   0.925  0.932  0.773
SSIM [29]          0.669  0.675  0.485   0.812  0.812  0.606
SSIM-CNN [51]      0.770  0.737  0.551   0.952  0.946  0.794
SUMMER [80]        0.817  0.823  0.623   0.826  0.830  0.658
VIF [39]           0.808  0.749  0.586   0.928  0.920  0.754
VSI [68]           0.898  0.896  0.709   0.928  0.942  0.785
LCSA1              0.886  0.874  0.685   0.966  0.956  0.819
LCSA2              0.896  0.906  0.727   0.897  0.949  0.800
LCSA3              0.923  0.921  0.755   0.964  0.961  0.827
LCSA4              0.906  0.909  0.737   0.977  0.973  0.857
Table 5. PLCC, SROCC, and KROCC performance comparison of the proposed fusion-based FR-IQA metrics with the state-of-the-art. The best results are typed in bold, the second best results are underlined.

                   Direct Average        Weighted Average
FR-IQA Metric      PLCC   SROCC  KROCC   PLCC   SROCC  KROCC
2stepQA [74]       0.818  0.821  0.652   0.781  0.783  0.605
CSV [75]           0.901  0.897  0.729   0.877  0.873  0.694
DISTS [76]         0.837  0.816  0.647   0.792  0.759  0.582
ESSIM [77]         0.794  0.892  0.733   0.756  0.857  0.691
FSIM [62]          0.901  0.893  0.729   0.883  0.860  0.689
FSIMc [62]         0.908  0.908  0.743   0.893  0.885  0.710
GSM [63]           0.853  0.859  0.685   0.821  0.823  0.638
IFC [38]           0.723  0.700  0.542   0.644  0.625  0.473
IFS [64]           0.919  0.914  0.750   0.900  0.893  0.715
IW-SSIM [33]       0.857  0.878  0.708   0.846  0.840  0.664
MAD [25]           0.894  0.880  0.720   0.862  0.838  0.667
MS-SSIM [34]       0.868  0.874  0.699   0.838  0.839  0.659
NQM [65]           0.738  0.729  0.560   0.703  0.684  0.516
PSNR               0.697  0.705  0.525   0.634  0.654  0.480
ReSIFT [78]        0.776  0.771  0.622   0.705  0.700  0.550
RFSIM [66]         0.886  0.878  0.705   0.865  0.841  0.663
RVSIM [79]         0.779  0.740  0.577   0.777  0.723  0.558
SFF [67]           0.917  0.908  0.745   0.895  0.880  0.703
SR-SIM [12]        0.900  0.873  0.716   0.880  0.838  0.675
SSIM [29]          0.760  0.764  0.583   0.698  0.700  0.518
SSIM-CNN [51]      0.861  0.849  0.687   0.814  0.800  0.626
SUMMER [80]        0.808  0.809  0.647   0.745  0.746  0.582
VIF [39]           0.863  0.828  0.671   0.825  0.765  0.605
VSI [68]           0.919  0.921  0.744   0.909  0.908  0.716
LCSA1              0.912  0.898  0.742   0.877  0.857  0.688
LCSA2              0.889  0.930  0.772   0.899  0.917  0.751
LCSA3              0.901  0.918  0.768   0.859  0.885  0.725
LCSA4              0.927  0.919  0.772   0.901  0.885  0.725
Table 6. Distortion types used in the applied benchmark IQA databases (LIVE [22], TID2013 [23], TID2008 [24], and CSIQ [25]).

Abbreviation | Description                                          | Databases
AGN          | additive Gaussian noise                              | LIVE, TID2013, TID2008, CSIQ
ANC          | additive noise in color components                   | TID2013, TID2008, CSIQ
SCN          | spatially correlated noise                           | TID2013, TID2008
MN           | masked noise                                         | TID2013, TID2008
HFN          | high-frequency noise                                 | TID2013, TID2008
IN           | impulse noise                                        | TID2013, TID2008
QN           | quantization noise                                   | TID2013, TID2008
FF           | simulated fast fading Rayleigh channel               | LIVE
GB           | Gaussian blur                                        | LIVE, TID2013, TID2008, CSIQ
GCD          | global contrast decrement                            | CSIQ
DEN          | image denoising                                      | TID2013, TID2008
JPEG         | JPEG compression noise                               | LIVE, TID2013, TID2008, CSIQ
JP2K         | JPEG2000 compression noise                           | LIVE, TID2013, TID2008, CSIQ
JGTE         | JPEG transmission errors                             | TID2013, TID2008
J2TE         | JPEG2000 transmission errors                         | TID2013, TID2008
NEPN         | non-eccentricity pattern noise                       | TID2013, TID2008
BLOCK        | local block-wise distortions of different intensity  | TID2013, TID2008
MS           | mean shift                                           | TID2013, TID2008
CC           | contrast change                                      | TID2013, TID2008
CCS          | change of color saturation                           | TID2013
MGN          | multiplicative Gaussian noise                        | TID2013
CN           | comfort noise                                        | TID2013
LCNI         | lossy compression of noisy images                    | TID2013
ICQD         | image color quantization with dither                 | TID2013
CA           | chromatic aberration                                 | TID2013
SSR          | sparse sampling and reconstruction                   | TID2013
Table 7. Comparison on LIVE’s [22] distortion types. SROCC values are given. The highest values are typed in bold, while the second highest ones are underlined.

Distortion  FSIM   FSIMc  IFS    MS-SSIM  SFF    VIF    VSI    LCSA1  LCSA2  LCSA3  LCSA4
AGN         0.965  0.972  0.988  0.973    0.986  0.986  0.984  0.976  0.961  0.962  0.965
FF          0.950  0.952  0.940  0.947    0.953  0.965  0.943  0.984  0.978  0.988  0.980
GB          0.971  0.971  0.967  0.954    0.975  0.973  0.953  0.978  0.989  0.997  0.996
JPEG        0.983  0.984  0.978  0.982    0.979  0.985  0.976  0.974  0.973  0.964  0.965
JP2K        0.972  0.970  0.969  0.963    0.967  0.970  0.960  0.952  0.969  0.967  0.978
All         0.963  0.965  0.960  0.951    0.965  0.964  0.952  0.974  0.962  0.969  0.970
Table 8. Comparison on TID2013’s [23] distortion types. SROCC values are given. The highest values are typed in bold, while the second highest ones are underlined.

Distortion  FSIM   FSIMc  IFS    MS-SSIM  SFF    VIF    VSI    LCSA1  LCSA2  LCSA3  LCSA4
AGN         0.897  0.910  0.938  0.865    0.907  0.899  0.946  0.908  0.932  0.925  0.925
ANC         0.821  0.854  0.854  0.773    0.817  0.830  0.871  0.846  0.854  0.853  0.857
SCN         0.875  0.890  0.934  0.854    0.898  0.884  0.937  0.908  0.940  0.933  0.915
MN          0.794  0.809  0.796  0.807    0.819  0.845  0.770  0.792  0.769  0.811  0.801
HFN         0.898  0.904  0.914  0.860    0.898  0.897  0.920  0.904  0.914  0.909  0.903
IN          0.807  0.825  0.839  0.763    0.787  0.854  0.874  0.574  0.795  0.790  0.728
QN          0.872  0.881  0.834  0.871    0.861  0.785  0.875  0.854  0.886  0.844  0.863
GB          0.955  0.955  0.966  0.967    0.968  0.965  0.961  0.954  0.956  0.959  0.970
DEN         0.930  0.933  0.918  0.927    0.909  0.891  0.948  0.917  0.937  0.913  0.937
JPEG        0.932  0.934  0.929  0.927    0.927  0.919  0.954  0.921  0.930  0.929  0.932
JP2K        0.958  0.959  0.961  0.950    0.957  0.952  0.971  0.950  0.965  0.957  0.953
JGTE        0.846  0.861  0.893  0.848    0.883  0.841  0.922  0.854  0.891  0.863  0.859
J2TE        0.891  0.892  0.901  0.889    0.871  0.876  0.923  0.909  0.916  0.913  0.916
NEPN        0.792  0.794  0.784  0.797    0.767  0.772  0.806  0.826  0.815  0.815  0.822
BLOCK       0.549  0.553  0.100  0.480    0.179  0.531  0.171  0.452  0.353  0.328  0.185
MS          0.753  0.749  0.658  0.791    0.665  0.628  0.770  0.554  0.678  0.455  0.620
CC          0.469  0.468  0.447  0.463    0.469  0.839  0.475  0.535  0.448  0.631  0.423
CCS         0.275  0.836  0.826  0.410    0.827  0.310  0.810  0.712  0.829  0.813  0.813
MGN         0.847  0.857  0.879  0.779    0.843  0.847  0.912  0.875  0.900  0.882  0.875
CN          0.912  0.914  0.904  0.853    0.901  0.895  0.924  0.911  0.923  0.904  0.906
LCNI        0.947  0.949  0.943  0.907    0.926  0.920  0.956  0.951  0.958  0.945  0.957
ICQD        0.876  0.882  0.901  0.856    0.880  0.841  0.884  0.891  0.903  0.891  0.900
CA          0.872  0.893  0.886  0.878    0.879  0.885  0.891  0.862  0.873  0.870  0.874
SSR         0.957  0.958  0.956  0.948    0.952  0.935  0.963  0.948  0.957  0.965  0.955
All         0.802  0.851  0.870  0.785    0.851  0.677  0.894  0.788  0.903  0.821  0.823
Table 9. Comparison on TID2008’s [24] distortion types. SROCC values are given. The highest values are typed in bold, while the second highest ones are underlined.

Distortion  FSIM   FSIMc  IFS    MS-SSIM  SFF    VIF    VSI    LCSA1  LCSA2  LCSA3  LCSA4
AGN         0.857  0.876  0.917  0.809    0.873  0.880  0.923  0.887  0.916  0.906  0.905
ANC         0.853  0.893  0.896  0.805    0.863  0.876  0.912  0.887  0.890  0.893  0.889
SCN         0.848  0.871  0.931  0.821    0.894  0.870  0.930  0.894  0.915  0.936  0.918
MN          0.802  0.826  0.802  0.811    0.837  0.868  0.773  0.782  0.733  0.857  0.817
HFN         0.909  0.916  0.922  0.869    0.912  0.908  0.925  0.901  0.909  0.922  0.917
IN          0.745  0.772  0.814  0.691    0.748  0.833  0.830  0.396  0.729  0.752  0.618
QN          0.856  0.873  0.797  0.859    0.845  0.797  0.873  0.825  0.859  0.855  0.854
GB          0.947  0.947  0.960  0.956    0.962  0.954  0.953  0.933  0.944  0.953  0.963
DEN         0.960  0.962  0.949  0.958    0.938  0.916  0.969  0.936  0.956  0.964  0.963
JPEG        0.928  0.929  0.928  0.932    0.932  0.917  0.962  0.921  0.942  0.939  0.937
JP2K        0.977  0.978  0.978  0.970    0.977  0.971  0.985  0.975  0.991  0.986  0.977
JGTE        0.871  0.876  0.874  0.868    0.857  0.859  0.916  0.886  0.914  0.893  0.904
J2TE        0.854  0.856  0.878  0.861    0.839  0.850  0.894  0.889  0.885  0.911  0.901
NEPN        0.749  0.751  0.704  0.738    0.697  0.762  0.770  0.831  0.773  0.805  0.796
BLOCK       0.849  0.846  0.087  0.755    0.537  0.832  0.630  0.826  0.631  0.742  0.672
MS          0.672  0.655  0.522  0.734    0.523  0.510  0.671  0.460  0.383  0.554  0.497
CC          0.648  0.651  0.627  0.638    0.646  0.819  0.656  0.630  0.604  0.732  0.577
All         0.881  0.884  0.869  0.846    0.851  0.749  0.896  0.874  0.906  0.921  0.909
Table 10. Comparison on CSIQ’s [25] distortion types. SROCC values are given. The highest values are typed in bold, while the second highest ones are underlined.

Distortion  FSIM   FSIMc  IFS    MS-SSIM  SFF    VIF    VSI    LCSA1  LCSA2  LCSA3  LCSA4
AGN         0.926  0.936  0.959  0.947    0.947  0.958  0.964  0.965  0.971  0.967  0.976
ANC         0.923  0.937  0.953  0.933    0.955  0.951  0.964  0.912  0.948  0.962  0.969
GB          0.973  0.973  0.962  0.971    0.975  0.975  0.968  0.983  0.972  0.971  0.981
GCD         0.942  0.944  0.949  0.953    0.954  0.935  0.950  0.975  0.959  0.972  0.963
JPEG        0.965  0.966  0.966  0.963    0.964  0.971  0.962  0.967  0.983  0.981  0.979
JP2K        0.968  0.970  0.971  0.968    0.976  0.967  0.969  0.956  0.950  0.941  0.950
All         0.924  0.931  0.958  0.913    0.960  0.920  0.942  0.956  0.949  0.961  0.973