Article

Center-Emphasized Visual Saliency and a Contrast-Based Full Reference Image Quality Index

Department of Computer Science and Engineering, Kyung Hee University Global Campus, Yongin 17104, Korea
*
Author to whom correspondence should be addressed.
Symmetry 2019, 11(3), 296; https://doi.org/10.3390/sym11030296
Submission received: 28 December 2018 / Revised: 14 February 2019 / Accepted: 22 February 2019 / Published: 26 February 2019

Abstract

Objective image quality assessment (IQA) is imperative in the current multimedia-intensive world, in order to assess the visual quality of an image at close to a human level of ability. Many attributes, such as color intensity, structure, sharpness, contrast, and the presence of objects, draw human attention to an image. Psychological vision research suggests that human vision is biased toward the center of an image and of the display screen. As a result, if the center part contains visually salient information, it draws even more attention, and any distortion in that part is perceived more readily than distortions elsewhere. To the best of our knowledge, previous IQA methods have not considered this fact. In this paper, we propose a full-reference image quality assessment (FR-IQA) approach using visual saliency and contrast, giving extra attention to the center by increasing the sensitivity of the similarity maps in that region. We evaluated our method on three large-scale benchmark databases used by most current IQA researchers (TID2008, CSIQ, and LIVE), which together contain 3345 distorted images with 28 different kinds of distortion. Our method is compared with 13 state-of-the-art approaches. This comparison reveals a stronger correlation between our method's scores and human-evaluated values. The predicted quality scores are consistent in both distortion-specific and distortion-independent cases. Moreover, its fast processing makes it applicable to real-time applications.

1. Introduction

Computer-based automatic image quality assessment (IQA) has been sought after for decades because numerous image and video applications need this assessment to automate their quality maintenance. To date, IQA research has advanced significantly; however, it remains an active area of research that aims to bring methods closer to human-level ability. In the literature, there are three principal IQA approaches. No-reference image quality assessment (NR-IQA) uses a single distorted image without any reference image, whereas in reduced-reference IQA (RR-IQA), partial information about a reference image is given. The third category is full-reference IQA (FR-IQA), where the complete reference image is given along with the distorted one. In this paper, we deal with FR-IQA.
Table 1 presents a brief survey of the state-of-the-art IQA approaches that we compare in this paper. We made a separate column for the pooling strategy because recent IQA research tends to combine multiple features, where pooling plays an important role. Early pixel-based, fast IQA methods such as the mean squared error (MSE) and the peak signal-to-noise ratio (PSNR) consider neither the human visual system (HVS) nor any other aspect of human perception. Thus, those approaches fail to achieve a good correlation with human assessment [1,2]. Two images with the same PSNR or MSE may be perceived in totally different ways by a human observer. Humans, however, are the ultimate receivers of images; as a result, the search for methods that achieve a closer correlation with human judgment is ongoing. Wang et al., in their revolutionary work on the structural similarity index (SSIM) [3], argued that human visual perception is highly sensitive to structural information. The SSIM index incorporates luminance, contrast, and structural comparison information and achieves a very good correlation with the mean opinion scores (MOS) of human observers. Inspired by the success of SSIM, several extended versions, such as the multi-scale structural similarity for image quality assessment (MS-SSIM) [4] and information content weighting for perceptual image quality assessment (IW-SSIM) [5], were proposed by the same research group. IW-SSIM uses an image pyramid to decompose the original and distorted images into versions at varying scales, and then computes the information content from the images. Finally, it finds the quality score using the information content as a weighting function.
Based on the shared information between the reference and distorted images, Sheikh et al. proposed the information fidelity criterion (IFC) [6] and the visual information fidelity (VIF) [7]. The most apparent distortion (MAD) approach [8] separates images based on the distortion and applies either a detection-based or an appearance-based strategy. Some methods, such as the noise quality measure (NQM) [9] and the visual signal-to-noise ratio (VSNR) [10], take the HVS into account by incorporating interactions among different visual signals. In contrast, other approaches, including the popular feature similarity index (FSIM) [11], emphasize phase congruency [12,13,14]. FSIM uses the image gradient as a secondary feature, and local quality maps are weighted by phase congruency to obtain the final score. The image gradient has been used effectively in a number of other works [15,16]. Xue et al., in their gradient magnitude similarity deviation (GMSD) [17], used the gradient magnitude with a different pooling strategy, applying the standard deviation, and Alaei et al. adopted a similar approach for assessing document images [18]. Both examples prove the effectiveness of standard deviation pooling; however, the authors of GMSD showed that standard deviation (SD) pooling is not effective for all types of methods. Wang et al. proposed the multi-scale contrast similarity deviation (MCSD) method [19], which can be regarded as a continuation of SSIM and MS-SSIM, since it also uses the root mean square (RMS) contrast similarity; however, they employed standard deviation pooling for the final score.
Meanwhile, inspired by vision-related psychological research, visual saliency (VS)-based IQA methods [20,21], which utilize different kinds of visual saliency [22,23,24], have attracted researchers’ attention. In the visual saliency index (VSI) method [20], VS is used as both a quality map and the weighting function at the pooling stage. The spectral residual similarity index (SR-SIM) [24] uses the spectral residual saliency, which makes the approach very fast while maintaining a competitive correlation with the mean opinion score. Combining VS with other features has also become popular [25,26]. Li et al. proposed an approach that combines VS and FSIM [25] while, recently, Jia et al. used contrast and spectral residual saliency together with summation-based SD pooling [26].
In the context of the HVS, center bias in early eye movements is an established fact in psychological vision research [27,28,29,30]. Bindemann found that eye movement is biased not only to the scene center but also to the screen center [31]. As a result, if a scene appears at the center of the screen, it will receive the most attention. For example, in Figure 1, the human eye will first move to the Block05 region and if that part has visually important information then it will attract even more attention. As a result, people will be more sensitive to the distortions in this region. To the best of our knowledge, there is no research in IQA considering this center bias for quality assessment.
In this paper, we propose a new method for IQA which accounts for the center emphasis in HVS. In the proposed method, we first obtain both the contrast and VS similarity maps for the entire image. To give center emphasis, we find the VS similarity map of the mid-region and apply element-wise multiplication in the mid-part to raise the similarity deviation there. However, for the contrast similarity, we apply element-wise squaring in the center part. Contrast is a local quality map, so we do not calculate the contrast of the mid-area separately. On the other hand, VS is a global quality map, and thus it is calculated differently in the mid-region. The final score is obtained by performing weighted summation of the standard deviations on both of the similarity maps; further details with mathematical equations are given in Section 3.
We evaluated our proposed method on three popular benchmark databases for IQA research and compared it with 13 other state-of-the-art approaches. The results, in terms of correlations with human-evaluated scores, show that our proposed method outperforms the other approaches while requiring a reasonable amount of processing time.
This paper is organized as follows. Section 2 describes some underlying theories and related techniques. Section 3 explains the proposed center-emphasized assessment approach, and the results with relevant discussions are presented in Section 4. Finally, the paper is concluded in Section 5.

2. Background

In this section, we briefly review the underlying theories on which the content of this paper relies, including the spectral residual visual saliency similarity, contrast similarity, standard deviation pooling, and relevant evaluation metrics.

2.1. Spectral Residual Visual Saliency Similarity

In the human visual system (HVS), some interesting or salient regions of an image receive more attention than other parts. Detection of image saliency is itself an active field in vision research. Humans are more sensitive to these salient parts, and any distortion in them attracts more intense attention, which makes saliency an important feature for IQA methods. Many saliency detection techniques are available [32], among which spectral residual saliency detection [22] is a very fast approach. We adopt the saliency map generator described in the SR-SIM [24] and the visual saliency plus contrast (VSP) [26] approaches.
For an image f(x, y), according to Reference [22], the spectral residual saliency (SRS) is computed as follows:
$$M_f(u,v) = \mathrm{abs}\big[F\{f(x,y)\}(u,v)\big] \quad (1)$$
$$A_f(u,v) = \mathrm{angle}\big[F\{f(x,y)\}(u,v)\big] \quad (2)$$
$$LM(u,v) = \log\{M_f(u,v)\} \quad (3)$$
$$R_f(u,v) = LM(u,v) - h_n(u,v) * LM(u,v) \quad (4)$$
$$SRS(x,y) = g(x,y) * \big[F^{-1}\{\exp(R_f + jA_f)\}(x,y)\big]^2, \quad (5)$$
where F and F^{-1} denote the Fourier and inverse Fourier transforms, respectively; abs(·) and angle(·) return the magnitude and argument of a complex number, respectively; h_n(u, v) is an n × n mean filter; g(x, y) is a Gaussian filter; and * denotes the convolution operation.
In this way, we calculate the SRS for both the reference and distorted images, denoted by SRS_r(x, y) and SRS_d(x, y), respectively. Then, the spectral residual visual saliency similarity, SRVSS(r, d), is calculated as:
$$SRVSS(r,d) = \frac{2\, SRS_r(x,y) \odot SRS_d(x,y) + c_1}{SRS_r(x,y)^2 + SRS_d(x,y)^2 + c_1}, \quad (6)$$
where ⊙ denotes element-wise multiplication, the squares are applied element-wise, and c_1 is a positive constant used to increase calculation stability.
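For illustration, the following Python/NumPy sketch shows how the spectral residual saliency of Equations (1)–(5) and the similarity map of Equation (6) could be computed. It is not the authors' released MATLAB implementation; the mean-filter size n, the Gaussian smoothing parameter, and the constant c1 are assumed values chosen only for demonstration.

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(img, n=3, sigma=2.5):
    """Spectral residual saliency map, Eqs. (1)-(5); n and sigma are assumed values."""
    F = np.fft.fft2(img)
    M = np.abs(F)                          # magnitude spectrum, Eq. (1)
    A = np.angle(F)                        # phase spectrum, Eq. (2)
    LM = np.log(M + 1e-12)                 # log amplitude, Eq. (3); epsilon avoids log(0)
    R = LM - uniform_filter(LM, size=n)    # spectral residual, Eq. (4): LM minus its local mean
    srs = np.abs(np.fft.ifft2(np.exp(R + 1j * A))) ** 2   # Eq. (5) before smoothing
    return gaussian_filter(srs, sigma=sigma)               # Gaussian smoothing g(x, y)

def similarity_map(a, b, c):
    """Generic similarity map of the form used in Eqs. (6) and (9)."""
    return (2.0 * a * b + c) / (a ** 2 + b ** 2 + c)

def srvss(ref, dist, c1=0.01):
    """Spectral residual visual saliency similarity map, Eq. (6); c1 is an assumed constant."""
    return similarity_map(spectral_residual_saliency(ref),
                          spectral_residual_saliency(dist), c1)
```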

2.2. Contrast Similarity

Contrast is a basic perceptual attribute of an image [33] that varies greatly over the image, and the contrast map (CM) contains the spatial distribution of these varying values. Many ways of calculating CMs have been devised; in this paper, we adopt the RMS contrast used in SSIM, since it achieves better performance for natural images. The RMS contrast C_X of an image signal X is given by:
$$C_X = \left[\frac{1}{N-1}\sum_{i=1}^{N}(x_i - \mu_X)^2\right]^{1/2}, \quad (7)$$
where N is the total number of pixels in the image, x_i is the intensity of pixel i, and μ_X is the mean intensity, defined as:
$$\mu_X = \frac{1}{N}\sum_{i=1}^{N} x_i. \quad (8)$$
Again, using Equation (7), we obtain the contrast maps C_r and C_d for the reference and distorted images, respectively, and find the contrast similarity CS(r, d) as follows:
$$CS(r,d) = \frac{2\, C_r \odot C_d + c_2}{C_r^2 + C_d^2 + c_2}, \quad (9)$$
where, as in Equation (6), ⊙ denotes element-wise multiplication, the squares are applied element-wise, and c_2 is a positive constant used to increase calculation stability.
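The local RMS contrast map and the contrast similarity of Equations (7)–(9) could be sketched as follows. Since the CM describes a spatial distribution, the sketch computes the RMS contrast in a sliding window; the window size is an assumption not given in the text, and c2 is again an illustrative constant.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def rms_contrast_map(img, win=7):
    """Local RMS contrast map following Eqs. (7)-(8), computed in a win x win
    sliding window (the window size is an assumption)."""
    img = img.astype(np.float64)
    mu = uniform_filter(img, size=win)                    # local mean, Eq. (8)
    var = uniform_filter(img ** 2, size=win) - mu ** 2    # E[x^2] - (E[x])^2
    return np.sqrt(np.maximum(var, 0.0))                  # local RMS contrast, Eq. (7)

def contrast_similarity(ref, dist, c2=0.01):
    """Contrast similarity map, Eq. (9); c2 is an assumed stability constant."""
    cr, cd = rms_contrast_map(ref), rms_contrast_map(dist)
    return (2.0 * cr * cd + c2) / (cr ** 2 + cd ** 2 + c2)
```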

2.3. Standard Deviation Pooling

As discussed in the introduction, standard deviation (SD) pooling achieves very good performance in specific cases and has been adopted by several successful methods. Jia et al. [26] conducted an experiment with several other combinations of pooling and found that SD pooling provides the best correlation. The final quality score (QS) is calculated using the following equation:
$$QS = \frac{1}{w_1 + w_2}\big[\, w_1 \times \mathrm{std}\{SRVSS(r,d)\} + w_2 \times \mathrm{std}\{CS(r,d)\}\,\big], \quad (10)$$
where w_1 and w_2 are weighting factors that specify the relative importance of SRVSS and CS, respectively. The standard deviations in the above equation are defined as:
$$\mathrm{std}(SRVSS(r,d)) = \left[\frac{1}{M}\sum_{i=1}^{M}(SRVSS_i - \mu_{SRVSS})^2\right]^{1/2} \quad (11)$$
$$\mathrm{std}(CS(r,d)) = \left[\frac{1}{M}\sum_{i=1}^{M}(CS_i - \mu_{CS})^2\right]^{1/2}, \quad (12)$$
where M is the total number of elements in the similarity maps; SRVSS_i and CS_i are the i-th elements; and μ_SRVSS and μ_CS are the mean values of SRVSS(r, d) and CS(r, d), respectively, given by:
$$\mu_{SRVSS} = \frac{1}{M}\sum_{i=1}^{M} SRVSS_i \quad (13)$$
$$\mu_{CS} = \frac{1}{M}\sum_{i=1}^{M} CS_i \quad (14)$$
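A minimal sketch of the weighted standard-deviation pooling of Equations (10)–(14) is given below; the weights w1 and w2 are placeholders, not the values used by the authors.

```python
import numpy as np

def sd_pool_score(srvss_map, cs_map, w1=1.0, w2=1.0):
    """Weighted standard-deviation pooling, Eqs. (10)-(14).
    w1 and w2 are placeholder weights."""
    # np.std implements Eqs. (11)-(14): the population standard deviation about the map mean.
    return (w1 * np.std(srvss_map) + w2 * np.std(cs_map)) / (w1 + w2)
```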

2.4. Evaluation Metrics

The performance of an IQA method is usually measured by the root mean square error and by several correlation coefficients between the objective scores and the subjective scores, i.e., the human-evaluated values, usually given as mean opinion scores (MOS) or differential mean opinion scores (DMOS). However, to apply a linear correlation, the two compared quantities should be on the same scale and linearly related [34]. To ensure fairness, a logistic mapping function is therefore applied to the objective scores before the linear correlation measurements. We use the following nonlinear regression model, as suggested by Sheikh et al. [35]:
$$q' = \beta_1\left(\frac{1}{2} - \frac{1}{1 + \exp(\beta_2(q - \beta_3))}\right) + \beta_4 q + \beta_5, \quad (15)$$
where q is the objective score calculated by an IQA method, q′ is the mapped value, and β_i, i = 1, 2, 3, 4, 5, are parameters that are tuned based on the relationship between the objective and subjective scores. We utilized the MATLAB function nlinfit to find the optimal parameters. After the mapping is done, the subjective scores are used together with these mapped scores to find the correlation coefficients.
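In Python, the same mapping could be fitted with scipy.optimize.curve_fit instead of MATLAB's nlinfit; the initial parameter guess p0 below is an assumption chosen only to make the sketch self-contained.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic_map(q, b1, b2, b3, b4, b5):
    """Five-parameter logistic regression of Eq. (15)."""
    return b1 * (0.5 - 1.0 / (1.0 + np.exp(b2 * (q - b3)))) + b4 * q + b5

def fit_and_map(objective, subjective):
    """Fit the betas on (objective, subjective) pairs and return the mapped scores."""
    o = np.asarray(objective, dtype=float)
    s = np.asarray(subjective, dtype=float)
    p0 = [s.max(), 10.0, o.mean(), 0.1, 0.1]   # assumed initial guess
    betas, _ = curve_fit(logistic_map, o, s, p0=p0, maxfev=20000)
    return logistic_map(o, *betas)
```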
One of the widely adopted basic correlations is Pearson’s linear correlation coefficient (PLCC) which is defined as follows:
$$PLCC(o,s) = \frac{\sum_{i=1}^{m}(o_i - \mu_o)(s_i - \mu_s)}{\left[\sum_{i=1}^{m}(o_i - \mu_o)^2\right]^{1/2}\left[\sum_{i=1}^{m}(s_i - \mu_s)^2\right]^{1/2}}, \quad (16)$$
where m is the number of distorted images; o and s are vectors of the objective and subjective scores, respectively; and μ_o and μ_s are the mean scores, defined by:
$$\mu_o = \frac{1}{m}\sum_{i=1}^{m} o_i \quad (17)$$
$$\mu_s = \frac{1}{m}\sum_{i=1}^{m} s_i. \quad (18)$$
In our case, the objective scores o are actually the mapped scores obtained with Equation (15). If the nonlinear mapping in Equation (15) is to be avoided, rank-order coefficients can be used. The most popular one, Spearman’s rank-order correlation coefficient (SROCC), is defined as:
$$SROCC(o,s) = PLCC(\mathrm{rank}(o), \mathrm{rank}(s)). \quad (19)$$
The function rank(·) applied to a vector returns a rank vector, where the i-th entry contains the relative rank of the i-th item in the original vector.
Another popularly adopted rank order coefficient is Kendall’s rank-order correlation coefficient (KROCC), which is given as below:
$$KROCC(o,s) = \frac{C - D}{m(m-1)/2}, \quad (20)$$
where C is the number of concordant pairs, i.e., pairs ranked consistently by the objective and subjective scores, and D is the number of discordant pairs.
The root mean square error (RMSE) is also commonly adopted and is defined as:
$$RMSE(o,s) = \left[\frac{1}{m}\sum_{i=1}^{m}(o_i - s_i)^2\right]^{1/2} \quad (21)$$
Larger values of PLCC, SROCC, and KROCC indicate that the corresponding method is better. On the other hand, a smaller RMSE value is a sign of a superior IQA method.
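The four metrics can be computed directly with SciPy, as in the sketch below. Note that scipy.stats.kendalltau applies a tie correction, so it may differ slightly from Equation (20) when ties are present.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr, kendalltau

def iqa_metrics(mapped_scores, subjective):
    """PLCC, SROCC, KROCC, and RMSE as defined in Eqs. (16)-(21).
    mapped_scores are the objective scores after the Eq. (15) mapping."""
    o = np.asarray(mapped_scores, dtype=float)
    s = np.asarray(subjective, dtype=float)
    plcc, _ = pearsonr(o, s)                 # Eq. (16)
    srocc, _ = spearmanr(o, s)               # Eq. (19): PLCC of the rank vectors
    krocc, _ = kendalltau(o, s)              # Eq. (20): concordant vs. discordant pairs
    rmse = np.sqrt(np.mean((o - s) ** 2))    # Eq. (21)
    return plcc, srocc, krocc, rmse
```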

3. Proposed Center-Emphasized Quality Assessment

The general flow diagram of our proposed method is presented in Figure 2. At first, the center parts of both the reference and distorted images are extracted. To do this, we split the image into a 3 × 3 grid of blocks as shown in Figure 1, and the fifth block, which lies in the middle both horizontally and vertically, is taken as the center area. If the original image dimensions are (H × W), then the corresponding dimensions of the center block are (H_mid × W_mid), where:
$$H_{mid} = \left\lfloor \frac{H}{3} \right\rfloor \quad \text{and} \quad W_{mid} = \left\lfloor \frac{W}{3} \right\rfloor. \quad (22)$$
The center block is defined as a rectangular area identified by the two corner points (x_min, y_min) and (x_max, y_max), where:
$$x_{min} = \left\lfloor \frac{H}{3} \right\rfloor, \quad y_{min} = \left\lfloor \frac{W}{3} \right\rfloor, \quad x_{max} = \left\lfloor \frac{H}{3} \right\rfloor + H_{mid}, \quad \text{and} \quad y_{max} = \left\lfloor \frac{W}{3} \right\rfloor + W_{mid}. \quad (23)$$
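As a small illustration, the center block defined by Equations (22) and (23) can be selected in NumPy as follows (a sketch, not the authors' implementation):

```python
def center_block(arr):
    """Return index slices for the central block (Block05 in Figure 1)
    of a 2-D map, using the corner points of Eqs. (22)-(23)."""
    H, W = arr.shape[:2]
    h_mid, w_mid = H // 3, W // 3          # Eq. (22)
    x_min, y_min = H // 3, W // 3          # Eq. (23)
    return (slice(x_min, x_min + h_mid), slice(y_min, y_min + w_mid))
```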
First, the saliency similarity maps for the full images and for the middle (center) crops are found using Equations (1)–(6) and are denoted VSS and VSS_mid, respectively. Simultaneously, the contrast similarity map for the full-size images, CS, is also obtained. As discussed before, we do not derive a CS map for the middle crops.
Then, we increase the sensitivity of the center area within both maps. Let VSS(mid) and CS(mid) be the center areas of VSS and CS, respectively. The updated middle parts are determined as follows:
$$VSS(mid) = VSS(mid) \odot VSS_{mid} \quad (24)$$
$$CS(mid) = CS(mid) \odot CS(mid), \quad (25)$$
where ⊙ is the element-wise multiplication.
With the updated middle portions, we obtain the finalized maps VSS_final and CS_final and, using Equation (10), calculate the final quality score of the proposed method, CEQI, as:
$$CEQI = \frac{1}{w_1 + w_2}\big[\, w_1 \times \mathrm{std}(VSS_{final}) + w_2 \times \mathrm{std}(CS_{final})\,\big]. \quad (26)$$
The MATLAB code is publicly available to test the algorithm and can be found online at http://layek.khu.ac.kr/CEQI.
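For illustration only, the sketches above can be assembled into an end-to-end NumPy version of Equations (24)–(26); it is not the published MATLAB code, and the weights and constants are placeholders.

```python
import numpy as np

def ceqi(ref, dist, w1=1.0, w2=1.0, c1=0.01, c2=0.01):
    """Center-emphasized quality index, Eqs. (24)-(26), assembled from the
    sketches above. All constants are illustrative assumptions; see the
    authors' MATLAB code for the actual parameter values."""
    vss = srvss(ref, dist, c1)                  # full-image saliency similarity, Eq. (6)
    cs = contrast_similarity(ref, dist, c2)     # full-image contrast similarity, Eq. (9)

    sl = center_block(vss)                      # Block05 region, Eqs. (22)-(23)
    vss_mid = srvss(ref[sl], dist[sl], c1)      # saliency similarity of the center crop
    vss[sl] = vss[sl] * vss_mid                 # Eq. (24): raise center sensitivity
    cs[sl] = cs[sl] * cs[sl]                    # Eq. (25): element-wise squaring

    return (w1 * np.std(vss) + w2 * np.std(cs)) / (w1 + w2)   # Eq. (26)
```

In this sketch, score = ceqi(ref_gray, dist_gray) would be called with two equal-sized grayscale arrays; color handling and parameter tuning are left to the released implementation.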

4. Results and Analysis

Experiments were carried out on three popular benchmark databases for IQA research—TID2008 [36], CSIQ [37] and LIVE [38]. Our approach was compared with 13 other state-of-the-art approaches as listed in Table 1. Basic information about the databases is given in Table 2 and the distortion information is recorded in Table 3.
For performance comparison, we use four commonly adopted metrics—Spearman’s rank-order correlation coefficient (SROCC), Kendall’s rank-order correlation coefficient (KROCC), Pearson’s linear correlation coefficient (PLCC), and the root mean square error (RMSE)—which we defined in Section 2.4.
Table 4 compares the four metrics among the different IQA models for all three databases. The top three values for each metric are set in boldface with light-gray shading; the top value is colored blue, the second highest red, and the third highest black. For RMSE, the coloring is reversed (i.e., the lowest value is colored blue, and so on), since a lower RMSE implies a better method. We see that, for the biggest database, TID2008, our proposed method outperforms all other methods in all metrics. For the other two databases, it achieves competitive performance. To obtain the overall performance, we calculated the weighted averages of the SROCC, KROCC, PLCC, and RMSE, using the numbers of distorted images as weights, as proposed in Reference [5]. It can be noticed that, compared with VSI and VSP, our approach shows better prediction accuracy, with overall SROCC, KROCC, and PLCC values that are higher by (1.09, 0.30), (2.44, 0.39), and (2.19, 0.22) percentage points, respectively. The overall ranking based on performance is shown in Table 5.
Table 6 compares the SROCC performance for all distortion types; please refer to Table 3 for a description of the abbreviations. We see that different methods perform better for different distortions and performance even varies between databases. This is the case because images are not affected equally by a specific type of distortion—it depends on the color, salient regions, and perhaps a combination of many other factors. Still, distortion-wise comparison gives us a good understanding of whether an IQA method is biased to some noise type or not. It can be seen that the proposed CEQI performs consistently well for all types of distortion; it is not too biased to any specific type of distortion, while retaining an average performance within the top 3 methods.
Figure 3 shows scatter plots of the predicted scores for different IQA approaches with the MOS/DMOS values on the TID2008 database. These results show that CEQI’s prediction is consistent compared to other methods, while providing a better correlation. We do not include PSNR because its predictions are very inconsistent. NQM is also discarded for the same reason, although its performance is not as inconsistent as PSNR.
Although the prime consideration for an IQA model is its prediction performance, a low computational cost is also desirable, especially for real-time systems. We evaluated the various IQA models with MATLAB R2017b on a computer equipped with an Intel(R) Core(TM) i5-4670 CPU running at 3.40 GHz and 16 GB of RAM. The MATLAB codes provided by the authors were used, and the elapsed time was recorded using the traditional tic-toc functions. The results of these tests are shown in Table 7. As expected, PSNR has the lowest computation time. Surprisingly, the gradient magnitude similarity deviation (GMSD) model can process 263.05 images per second with satisfactory performance (rank 4, as shown in Table 5). VIF shows very good performance on the LIVE database, where it is the best-performing IQA, but it can only process 1.79 images per second on average, which makes it inappropriate for real-time systems or systems with low processing capability. On the other hand, CEQI takes 15.27 ms to process an image and is capable of processing 65.51 images per second. This frame rate meets the needs of almost all kinds of real-time systems.

5. Conclusions

In this paper, we considered the center bias of the HVS and proposed a full-reference image quality assessment method, CEQI, combining visual saliency and contrast. We placed extra emphasis on the center part of the image so that any degradation within the center region has a greater effect on the score than degradation in other regions. The proposed approach was compared with other state-of-the-art IQA models and outperforms the competing methods in most cases. For individual distortion types, the proposed method gives consistent scores. Additionally, its running time is suitable for real-time applications. The center emphasis makes the method more balanced and robust. We believe that this center emphasis will enhance the performance of other existing IQA models, including no-reference and reduced-reference approaches. In our future work, we will investigate these possibilities.

Author Contributions

This paper represents the result of collaborative teamwork. Conceptualization, M.A.L.; Funding acquisition, E.-N.H.; Software, M.A.L. and A.F.M.S.U.; Visualization, M.A.L. and L.P.T.; Writing—original draft, M.A.L.; Writing—review & editing, M.A.L., E.-N.H. and T.C.

Funding

This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00294, Service mobility support distributed cloud technology).

Acknowledgments

This work was supported by Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT) (No. 2017-0-00294, Service mobility support distributed cloud technology).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Girod, B. Psychovisual aspects of image processing: What’s wrong with mean squared error? In Proceedings of the Seventh Workshop on IEEE Multidimensional Signal Processing, Lake Placid, NY, USA, 23–25 September 1991; p. 2. [Google Scholar]
  2. Wang, Z.; Bovik, A.C. Mean squared error: Love it or leave it? A new look at signal fidelity measures. IEEE Signal Process. Mag. 2009, 26, 98–117. [Google Scholar] [CrossRef]
  3. Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
  4. Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; IEEE: New York, NY, USA, 2003; Volume 2, pp. 1398–1402. [Google Scholar]
  5. Wang, Z.; Li, Q. Information content weighting for perceptual image quality assessment. IEEE Trans. Image Process. 2011, 20, 1185–1198. [Google Scholar] [CrossRef] [PubMed]
  6. Sheikh, H.R.; Bovik, A.C.; De Veciana, G. An information fidelity criterion for image quality assessment using natural scene statistics. IEEE Trans. Image Process. 2005, 14, 2117–2128. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Sheikh, H.R.; Bovik, A.C. A visual information fidelity approach to video quality assessment. In Proceedings of the First International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Doubletree Paradise Valley Resort, Scottsdale, AZ, USA, 23–25 January 2005; pp. 23–25. [Google Scholar]
  8. Larson, E.C.; Chandler, D.M. Most apparent distortion: Full-reference image quality assessment and the role of strategy. J. Electron. Imaging 2010, 19, 011006. [Google Scholar]
  9. Damera-Venkata, N.; Kite, T.D.; Geisler, W.S.; Evans, B.L.; Bovik, A.C. Image quality assessment based on a degradation model. IEEE Trans. Image Process. 2000, 9, 636–650. [Google Scholar] [CrossRef] [PubMed]
  10. Chandler, D.M.; Hemami, S.S. VSNR: A wavelet-based visual signal-to-noise ratio for natural images. IEEE Trans. Image Process. 2007, 16, 2284–2298. [Google Scholar] [CrossRef] [PubMed]
  11. Zhang, L.; Zhang, L.; Mou, X.; Zhang, D. FSIM: A feature similarity index for image quality assessment. IEEE Trans. Image Process. 2011, 20, 2378–2386. [Google Scholar] [CrossRef] [PubMed]
  12. Kovesi, P. Image features from phase congruency. Videre 1999, 1, 1–26. [Google Scholar]
  13. Liu, Z.; Laganière, R. Phase congruence measurement for image similarity assessment. Pattern Recognit. Lett. 2007, 28, 166–172. [Google Scholar] [CrossRef]
  14. Saha, A.; Wu, Q.J. Perceptual image quality assessment using phase deviation sensitive energy features. Signal Process. 2013, 93, 3182–3191. [Google Scholar] [CrossRef]
  15. Chen, G.H.; Yang, C.L.; Xie, S.L. Gradient-based structural similarity for image quality assessment. In Proceedings of the 2006 IEEE International Conference on Image Processing, Atlanta, GA, USA, 8–11 October 2006; IEEE: New York, NY, USA, 2006; pp. 2929–2932. [Google Scholar]
  16. Zhu, J.; Wang, N. Image quality assessment by visual gradient similarity. IEEE Trans. Image Process. 2012, 21, 919–933. [Google Scholar] [PubMed]
  17. Xue, W.; Zhang, L.; Mou, X.; Bovik, A.C. Gradient magnitude similarity deviation: A highly efficient perceptual image quality index. IEEE Trans. Image Process. 2014, 23, 684–695. [Google Scholar] [CrossRef] [PubMed]
  18. Alaei, A.; Conte, D.; Raveaux, R. Document image quality assessment based on improved gradient magnitude similarity deviation. In Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR), Nancy, France, 23–26 August 2015; pp. 176–180. [Google Scholar]
  19. Wang, T.; Zhang, L.; Jia, H.; Li, B.; Shu, H. Multiscale contrast similarity deviation: An effective and efficient index for perceptual image quality assessment. Signal Process. Image Commun. 2016, 45, 1–9. [Google Scholar] [CrossRef] [Green Version]
  20. Zhang, L.; Shen, Y.; Li, H. VSI: A visual saliency-induced index for perceptual image quality assessment. IEEE Trans. Image Process. 2014, 23, 4270–4281. [Google Scholar] [CrossRef] [PubMed]
  21. Ma, Q.; Zhang, L. Saliency-based image quality assessment criterion. In International Conference on Intelligent Computing; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1124–1133. [Google Scholar]
  22. Hou, X.; Zhang, L. Saliency detection: A spectral residual approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2007 (CVPR’07), Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8. [Google Scholar]
  23. Duan, L.; Wu, C.; Miao, J.; Qing, L.; Fu, Y. Visual saliency detection by spatially weighted dissimilarity. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 20–25 June 2011; pp. 473–480. [Google Scholar]
  24. Zhang, L.; Li, H. SR-SIM: A fast and high performance IQA index based on spectral residual. In Proceedings of the 2012 19th IEEE International Conference on Image Processing (ICIP), Orlando, FL, USA, 30 September–3 October 2012; pp. 1473–1476. [Google Scholar]
  25. Li, A.; She, X.; Sun, Q. Color image quality assessment combining saliency and fsim. In Proceedings of the Fifth International Conference on Digital Image Processing (ICDIP 2013), Beijing, China, 21–22 April 2013; International Society for Optics and Photonics: Bellingham, WA, USA, 2013; Volume 8878, p. 88780. [Google Scholar]
  26. Jia, H.; Zhang, L.; Wang, T. Contrast and visual saliency similarity-induced index for assessing image quality. IEEE Access 2018, 6, 65885–65893. [Google Scholar] [CrossRef]
  27. Langford, R.C. How people look at pictures, a study of the psychology of perception in art. J. Educ. Psychol. 1936, 27, 397–398. [Google Scholar] [CrossRef]
  28. Mannan, S.; Ruddock, K.; Wooding, D. Fixation sequences made during visual examination of briefly presented 2D images. Spat. Vis. 1997, 11, 157–178. [Google Scholar] [CrossRef] [PubMed]
  29. Parkhurst, D.; Law, K.; Niebur, E. Modeling the role of salience in the allocation of overt visual attention. Vis. Res. 2002, 42, 107–123. [Google Scholar] [CrossRef] [Green Version]
  30. Tatler, B.W. The central fixation bias in scene viewing: Selecting an optimal viewing position independently of motor biases and image feature distributions. J. Vis. 2007, 7, 4. [Google Scholar] [CrossRef] [PubMed]
  31. Bindemann, M. Scene and screen center bias early eye movements in scene viewing. Vis. Res. 2010, 50, 2577–2587. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  32. Cong, R.; Lei, J.; Fu, H.; Cheng, M.M.; Lin, W.; Huang, Q. Review of visual saliency detection with comprehensive information. arXiv, 2018; arXiv:1803.03391. [Google Scholar] [CrossRef]
  33. Peli, E. Contrast in complex images. JOSA A 1990, 7, 2032–2040. [Google Scholar] [CrossRef]
  34. Ding, Y. General Framework of Image Quality Assessment. In Visual Quality Assessment for Natural and Medical Image; Springer: Berlin/Heidelberg, Germany, 2018; pp. 45–62. [Google Scholar]
  35. Sheikh, H.R.; Sabir, M.F.; Bovik, A.C. A statistical evaluation of recent full reference image quality assessment algorithms. IEEE Trans. Image Proc. 2006, 15, 3440–3451. [Google Scholar] [CrossRef]
  36. Ponomarenko, N.; Lukin, V.; Zelensky, A.; Egiazarian, K.; Carli, M.; Battisti, F. TID2008—A database for evaluation of full-reference visual quality assessment metrics. Adv. Modern Radioelectron. 2009, 10, 30–45. [Google Scholar]
  37. Larson, E.C.; Chandler, D. Categorical Image Quality (CSIQ) Database. 2010. Available online: http://vision.eng.shizuoka.ac.jp/ (accessed on 23 December 2018).
  38. Sheikh, H. LIVE Image Quality Assessment Database Release 2. 2005. Available online: http://live.ece.utexas.edu/research/quality (accessed on 20 December 2018).
Figure 1. The image ’Sailing.bmp’ is split into nine blocks, where Block05 is the center area.
Figure 2. Flow diagram of the proposed center-emphasized approach.
Figure 3. Scatter plots of the mean opinion scores (MOS/DMOS) versus scores predicted by different methods on the TID2008 database. The black curves are obtained by the nonlinear fitting based on Equation (15).
Table 1. Overview of the compared image quality assessment (IQA) methods.
IQA Method | Principal Consideration | Pooling Used | Comments
PSNR | Pixel-by-pixel error | Average | Primitive method, does not consider HVS, poor correlation with humans, low computation, widely used.
[3] SSIM | Luminance, contrast, structure | Average | Milestone method to consider structural information that better represents HVS.
[4] MS-SSIM | Multi-scale structure | Weighted sum | An extension to SSIM that incorporates variations of viewing conditions and is capable of multi-scale assessment.
[5] IW-SSIM | Information content extraction | Weighted by information content | Main emphasis is on information content extraction, which is applicable to other methods as well; uses an image pyramid.
[8] MAD | Most visible distortion | Weighted product | A novel strategy consisting of two phases to detect the most apparent distortion.
[11] FSIM | Phase congruency, gradient magnitude | Weighted average | A state-of-the-art IQA approach using phase congruency and the gradient magnitude weighted by phase congruency to calculate the final score.
[17] GMSD | Image gradient | Standard deviation | Very fast assessment approach after PSNR, showing competitive performance with other state-of-the-art approaches.
[20] VSI | Visual saliency (VS) | Weighted average | Introduced visual saliency for IQA, where VS is used for both the quality map and the weighting function at the pooling stage.
[19] MCSD | Multi-scale contrast | Standard deviation | Another fast method next to GMSD, but providing better results.
[7] VIF | Visual information extraction | Average | Quantifies the extracted reference information from a distorted image.
[9] NQM | Distortion and noise | Squared sum | Considers HVS using distortion and noise; better than PSNR.
[24] SR-SIM | Saliency and gradient | Weighted average | Uses spectral residual saliency and the image gradient, which shows competitive performance.
[26] VSP | Saliency and contrast | Weighted sum of deviations | Uses deviation pooling in a combined method.
CEQI (Proposed) | Saliency, contrast, center emphasis | Weighted sum of deviations | Gives special attention to the center area.
PSNR: peak signal-to-noise ratio; SSIM: structural similarity index; MS-SSIM: multi-scale structural similarity; IW-SSIM: information content weighted structural similarity; MAD: most apparent distortion; FSIM: feature similarity index; GMSD: gradient magnitude similarity deviation; VSI: visual saliency index; MCSD: multi-scale contrast similarity deviation; VIF: visual information fidelity; NQM: noise quality measure; SR-SIM: spectral residual similarity index; VSP: visual saliency plus contrast; CEQI: center-emphasized quality index; HVS: human visual system.
Table 2. Basic information about the databases used for the experiments.
Dataset | Reference Images | Distorted Images | Distortion Types | No. of Subjects
TID2008 | 25 | 1700 | 17 | 838
CSIQ | 30 | 866 | 6 | 35
LIVE | 29 | 779 | 5 | 161
Table 3. Types of distortion used in each database.
TID2008 | CSIQ | LIVE | Type of Distortion | Abbreviation
Y | Y | Y | Additive Gaussian noise | AGN
Y | - | - | Additive noise in color components | ANC
Y | - | - | Spatially correlated noise | SCN
Y | - | - | Masked noise | MN
Y | - | - | High frequency noise | HFN
Y | - | - | Impulse noise | IN
Y | - | - | Quantization noise | QN
Y | Y | Y | Gaussian blur | GB
Y | - | - | Image denoising | IDN
Y | Y | Y | JPEG compression | JPEG
Y | Y | Y | JPEG2000 compression | JP2K
Y | - | - | JPEG transmission errors | JGTE
Y | - | - | JPEG2000 transmission errors | J2TE
Y | - | - | Non-eccentricity pattern noise | NEPN
Y | - | - | Local block-wise distortions of different intensity | LBD
Y | - | - | Mean shift (intensity shift) | MS
Y | Y | - | Contrast change | CTC
- | - | Y | Fast fading Rayleigh | FF
- | Y | - | Additive pink Gaussian noise | AWPN
Table 4. Performance comparison of IQA methods on three databases.
Dataset | Metric | PSNR | SSIM | MS-SSIM | IW-SSIM | MAD | FSIM | GMSD | VSI | MCSD | VIF | NQM | SRSIM | VSP | CEQI (Prop.)
TID2008 | SROCC | 0.5245 | 0.7749 | 0.8542 | 0.8559 | 0.8340 | 0.8805 | 0.8907 | 0.8979 | 0.8911 | 0.7491 | 0.6243 | 0.8913 | 0.9001 | 0.9069
TID2008 | KROCC | 0.3696 | 0.5768 | 0.6568 | 0.6636 | 0.6445 | 0.6946 | 0.7092 | 0.7123 | 0.7133 | 0.5860 | 0.4608 | 0.7149 | 0.7215 | 0.7307
TID2008 | PLCC | 0.5309 | 0.7732 | 0.8451 | 0.8579 | 0.8306 | 0.8738 | 0.8788 | 0.8762 | 0.8844 | 0.8084 | 0.6085 | 0.8867 | 0.8962 | 0.9014
TID2008 | RMSE | 1.1372 | 0.8511 | 0.7173 | 0.6895 | 0.7473 | 0.6525 | 0.6404 | 0.6466 | 0.6263 | 0.7899 | 1.0649 | 0.6205 | 0.5953 | 0.5810
CSIQ | SROCC | 0.8388 | 0.8755 | 0.9132 | 0.9212 | 0.9466 | 0.9242 | 0.9570 | 0.9422 | 0.9592 | 0.9194 | 0.7436 | 0.9318 | 0.9579 | 0.9563
CSIQ | KROCC | 0.6351 | 0.6900 | 0.7386 | 0.7522 | 0.7963 | 0.7561 | 0.8122 | 0.7850 | 0.8171 | 0.7532 | 0.5648 | 0.7718 | 0.8171 | 0.8138
CSIQ | PLCC | 0.8276 | 0.8612 | 0.8991 | 0.9144 | 0.9502 | 0.9120 | 0.9541 | 0.9279 | 0.9560 | 0.9257 | 0.7433 | 0.9250 | 0.9589 | 0.9565
CSIQ | RMSE | 0.1474 | 0.1334 | 0.1149 | 0.1063 | 0.0818 | 0.1077 | 0.0786 | 0.0979 | 0.0770 | 0.0993 | 0.1756 | 0.0998 | 0.0745 | 0.0766
LIVE | SROCC | 0.8765 | 0.9460 | 0.9512 | 0.9604 | 0.9567 | 0.9610 | 0.9546 | 0.9464 | 0.9603 | 0.9719 | 0.8545 | 0.9558 | 0.9573 | 0.9577
LIVE | KROCC | 0.7012 | 0.8057 | 0.8181 | 0.8379 | 0.8290 | 0.8380 | 0.8236 | 0.8000 | 0.8350 | 0.8571 | 0.6938 | 0.8190 | 0.8297 | 0.8307
LIVE | PLCC | 0.9132 | 0.9385 | 0.9468 | 0.9515 | 0.9493 | 0.9492 | 0.9511 | 0.9431 | 0.9540 | 0.9723 | 0.8773 | 0.9453 | 0.9523 | 0.9534
LIVE | RMSE | 9.4197 | 7.9838 | 7.4380 | 7.1116 | 7.2690 | 7.2762 | 7.1374 | 7.6856 | 6.9329 | 5.4030 | 11.0941 | 7.5434 | 7.0506 | 6.9730
OVERALL | SROCC | 0.6986 | 0.8468 | 0.8954 | 0.9008 | 0.8954 | 0.9134 | 0.9246 | 0.9221 | 0.9269 | 0.8523 | 0.7171 | 0.9190 | 0.9300 | 0.9330
OVERALL | KROCC | 0.5262 | 0.6678 | 0.7214 | 0.7335 | 0.7326 | 0.7493 | 0.7660 | 0.7543 | 0.7723 | 0.7018 | 0.5507 | 0.7576 | 0.7748 | 0.7787
OVERALL | PLCC | 0.7091 | 0.8404 | 0.8864 | 0.8976 | 0.8926 | 0.9040 | 0.9172 | 0.9073 | 0.9211 | 0.8824 | 0.7158 | 0.9123 | 0.9270 | 0.9292
OVERALL | RMSE | 3.1880 | 2.6501 | 2.4304 | 2.3246 | 2.3899 | 2.3528 | 2.3015 | 2.4609 | 2.2377 | 1.8981 | 3.6237 | 2.4095 | 2.2549 | 2.2270
– For each row, the first, second and third-ranked performances are highlighted respectively in blue, red and black colors; – For SROCC, KROCC and PLCC metrics, the higher the value, the better the method whereas for RMSE a lower score is better.
Table 5. Overall performance ranking of the compared IQA methods.
IQA | SROCC | KROCC | PLCC | RMSE
PSNR | 14 | 14 | 14 | 13
SSIM | 12 | 12 | 12 | 12
MS-SSIM | 9 | 10 | 10 | 10
IW-SSIM | 8 | 8 | 8 | 6
MAD | 10 | 9 | 9 | 8
FSIM | 7 | 7 | 7 | 7
GMSD | 4 | 4 | 4 | 5
VSI | 5 | 6 | 6 | 11
MCSD | 3 | 3 | 3 | 3
VIF | 11 | 11 | 11 | 1
NQM | 13 | 13 | 13 | 14
SRSIM | 6 | 5 | 5 | 9
VSP | 2 | 2 | 2 | 4
CEQI (Proposed) | 1 | 1 | 1 | 2
– Number 1 indicates the best performing method and 14 is the worst.
Table 6. Distortion-wise SROCC performance comparison of the IQA methods on three databases.
Dataset | Dist. | PSNR | SSIM | MS-SSIM | IW-SSIM | MAD | FSIM | GMSD | VSI | MCSD | VIF | NQM | SRSIM | VSP | CEQI (Prop.)
TID2008 | AGN | 0.9115 | 0.8109 | 0.8086 | 0.7869 | 0.8384 | 0.8562 | 0.9180 | 0.9240 | 0.9187 | 0.8797 | 0.7679 | 0.8999 | 0.9202 | 0.9203
TID2008 | ANC | 0.9068 | 0.8029 | 0.8054 | 0.7920 | 0.8307 | 0.8527 | 0.8977 | 0.9118 | 0.8898 | 0.8757 | 0.7596 | 0.8952 | 0.8969 | 0.9014
TID2008 | SCN | 0.9229 | 0.8145 | 0.8209 | 0.7714 | 0.8680 | 0.8487 | 0.9128 | 0.9351 | 0.9210 | 0.8698 | 0.7720 | 0.9084 | 0.9058 | 0.9060
TID2008 | MN | 0.8487 | 0.7795 | 0.8107 | 0.8088 | 0.7336 | 0.8021 | 0.7347 | 0.8011 | 0.7321 | 0.8683 | 0.7071 | 0.7906 | 0.7762 | 0.7835
TID2008 | HFN | 0.9323 | 0.8774 | 0.8734 | 0.8703 | 0.8875 | 0.9153 | 0.9173 | 0.9258 | 0.9180 | 0.9075 | 0.9030 | 0.9197 | 0.9190 | 0.9204
TID2008 | IN | 0.9177 | 0.6732 | 0.6907 | 0.6465 | 0.0579 | 0.7452 | 0.6611 | 0.8298 | 0.6893 | 0.8327 | 0.7771 | 0.7667 | 0.6981 | 0.7176
TID2008 | QN | 0.8700 | 0.8531 | 0.8589 | 0.8177 | 0.8160 | 0.8564 | 0.8875 | 0.8731 | 0.8952 | 0.8813 | 0.8317 | 0.8354 | 0.8897 | 0.8809
TID2008 | GB | 0.8673 | 0.9544 | 0.9563 | 0.9636 | 0.9196 | 0.9472 | 0.8968 | 0.9529 | 0.8880 | 0.9540 | 0.8846 | 0.9549 | 0.9296 | 0.9320
TID2008 | IDN | 0.9381 | 0.9530 | 0.9582 | 0.9473 | 0.9433 | 0.9603 | 0.9752 | 0.9693 | 0.9766 | 0.9183 | 0.9450 | 0.9667 | 0.9693 | 0.9706
TID2008 | JPEG | 0.9011 | 0.9252 | 0.9322 | 0.9207 | 0.9327 | 0.9370 | 0.9525 | 0.9616 | 0.9486 | 0.9168 | 0.9075 | 0.9411 | 0.9445 | 0.9467
TID2008 | JP2K | 0.8301 | 0.9630 | 0.9700 | 0.9738 | 0.9707 | 0.9773 | 0.9795 | 0.9848 | 0.9787 | 0.9709 | 0.9531 | 0.9804 | 0.9774 | 0.9762
TID2008 | JGTE | 0.7664 | 0.8678 | 0.8681 | 0.8588 | 0.8661 | 0.8708 | 0.8621 | 0.9160 | 0.7681 | 0.8585 | 0.7359 | 0.8877 | 0.8893 | 0.8985
TID2008 | J2TE | 0.7765 | 0.8577 | 0.8606 | 0.8203 | 0.8394 | 0.8544 | 0.8825 | 0.8942 | 0.8946 | 0.8501 | 0.7412 | 0.8907 | 0.8704 | 0.8752
TID2008 | NEPN | 0.5931 | 0.7107 | 0.7377 | 0.7724 | 0.8298 | 0.7492 | 0.7601 | 0.7699 | 0.7986 | 0.7619 | 0.6800 | 0.7672 | 0.7647 | 0.7727
TID2008 | LBD | 0.5851 | 0.8462 | 0.7560 | 0.7634 | 0.7970 | 0.8494 | 0.8967 | 0.6288 | 0.8933 | 0.8324 | 0.3367 | 0.7789 | 0.8404 | 0.8295
TID2008 | MS | 0.7076 | 0.7231 | 0.7338 | 0.7067 | 0.5163 | 0.6720 | 0.6486 | 0.6714 | 0.5350 | 0.5096 | 0.5440 | 0.5731 | 0.6736 | 0.7177
TID2008 | CTC | 0.6126 | 0.4417 | 0.6381 | 0.6301 | 0.3236 | 0.6481 | 0.6346 | 0.6557 | 0.5932 | 0.8403 | 0.8263 | 0.6482 | 0.5695 | 0.5399
TID2008 | AVG | 0.8169 | 0.8150 | 0.8282 | 0.8147 | 0.7630 | 0.8437 | 0.8481 | 0.8591 | 0.8376 | 0.8546 | 0.7690 | 0.8473 | 0.8491 | 0.8523
CSIQ | AWGN | 0.9344 | 0.8974 | 0.9471 | 0.9380 | 0.9542 | 0.9262 | 0.9676 | 0.9637 | 0.9674 | 0.9575 | 0.9387 | 0.9631 | 0.9665 | 0.9680
CSIQ | JPEG | 0.9008 | 0.9543 | 0.9631 | 0.9660 | 0.9614 | 0.9652 | 0.9651 | 0.9615 | 0.9670 | 0.9703 | 0.9525 | 0.9670 | 0.9689 | 0.9685
CSIQ | JP2K | 0.9307 | 0.9605 | 0.9682 | 0.9682 | 0.9752 | 0.9684 | 0.9717 | 0.9692 | 0.9746 | 0.9671 | 0.9629 | 0.9772 | 0.9778 | 0.9777
CSIQ | AGPN | 0.9315 | 0.8924 | 0.9330 | 0.9057 | 0.9568 | 0.9233 | 0.9502 | 0.9636 | 0.9479 | 0.9509 | 0.9114 | 0.9519 | 0.9516 | 0.9525
CSIQ | GB | 0.9359 | 0.9608 | 0.9711 | 0.9781 | 0.9681 | 0.9728 | 0.9712 | 0.9679 | 0.9747 | 0.9744 | 0.9583 | 0.9767 | 0.9788 | 0.9777
CSIQ | CTC | 0.8861 | 0.7925 | 0.9528 | 0.9540 | 0.9210 | 0.9420 | 0.9037 | 0.9505 | 0.9509 | 0.9345 | 0.9478 | 0.9530 | 0.9324 | 0.9354
CSIQ | AVG | 0.9199 | 0.9097 | 0.9559 | 0.9517 | 0.9561 | 0.9497 | 0.9549 | 0.9627 | 0.9638 | 0.9591 | 0.9453 | 0.9648 | 0.9627 | 0.9633
LIVE | JP2K | 0.9506 | 0.9762 | 0.9802 | 0.9791 | 0.9775 | 0.9819 | 0.9823 | 0.9700 | 0.9825 | 0.9738 | 0.9702 | 0.9727 | 0.9815 | 0.9813
LIVE | JPEG | 0.9361 | 0.9598 | 0.9626 | 0.9602 | 0.9464 | 0.9625 | 0.9607 | 0.9534 | 0.9613 | 0.9568 | 0.9469 | 0.9546 | 0.9615 | 0.9614
LIVE | AWGN | 0.8643 | 0.9801 | 0.9845 | 0.9807 | 0.9904 | 0.9798 | 0.9847 | 0.9881 | 0.9889 | 0.9899 | 0.8242 | 0.9865 | 0.9887 | 0.9900
LIVE | GB | 0.0359 | 0.9517 | 0.9733 | 0.9838 | 0.9692 | 0.9832 | 0.9751 | 0.9703 | 0.9728 | 0.9826 | 0.8453 | 0.9782 | 0.9804 | 0.9804
LIVE | FF | 0.9306 | 0.9643 | 0.9690 | 0.9674 | 0.9748 | 0.9707 | 0.9658 | 0.9644 | 0.9723 | 0.9778 | 0.7929 | 0.9666 | 0.9763 | 0.9767
LIVE | AVG | 0.7435 | 0.9664 | 0.9739 | 0.9742 | 0.9717 | 0.9756 | 0.9737 | 0.9692 | 0.9756 | 0.9762 | 0.8759 | 0.9717 | 0.9777 | 0.9780
– For each row, the first, second and third-ranked performances are highlighted respectively in blue, red and black colors. – The acronyms for distortions are defined in Table 3, AVG refers to average performance over all noises in a database.
Table 7. Running time comparison of the IQA models.
IQA | Running Time (ms) | Images per Second
PSNR | 3.47 | 288.49
SSIM | 7.47 | 133.88
MS-SSIM | 25.71 | 38.90
IW-SSIM | 157.85 | 6.34
MAD | 696.00 | 1.44
FSIM | 115.41 | 8.66
GMSD | 3.80 | 263.05
VSI | 64.12 | 15.60
MCSD | 6.34 | 157.70
VIF | 560.19 | 1.79
NQM | 106.59 | 9.38
SRSIM | 10.29 | 97.20
VSP | 10.75 | 93.03
CEQI (Proposed) | 15.27 | 65.51
