No-Reference Video Quality Assessment Using the Temporal Statistics of Global and Local Image Features
Abstract
1. Introduction
2. Literature Review
3. Proposed Method
3.1. Global Features
- Blur: This refers to the parts of an image that are out of focus. With too much blur, edges are no longer distinct; consequently, the amount of blur is an important element of human perceptual judgment. Due to its low computational complexity, the metric of Crété-Roffet et al. [56] was chosen in our model to characterize the amount of blur in a video frame (a simplified code sketch of its re-blur principle is given after this list). A video sequence's blur was defined as the average of all video frames' blur.
- Colorfulness (CF): This is a characteristic of human visual perception that describes whether an image or image region appears more or less chromatic [57]. In [58], it was pointed out that humans tend to prefer more colorful scenes. In our model, we adopted the definition of the colorfulness of a video frame proposed by Hasler and Suesstrunk [59]:
$$\mathrm{CF} = \sqrt{\sigma_{rg}^{2} + \sigma_{yb}^{2}} + 0.3\,\sqrt{\mu_{rg}^{2} + \mu_{yb}^{2}},$$
where $rg = R - G$ and $yb = \frac{1}{2}(R + G) - B$ are the opponent color channels, and $\sigma$ and $\mu$ denote their standard deviations and means (see the code sketch after this list). As a quality-aware feature for a video sequence, the average of all video frames' colorfulness was taken.
- Vividness was suggested as a color attribute by Berns [60], and it describes the degree of departure of a color from neutral black. Berns' model can be expressed by the following formula:
$$V^{*} = \sqrt{(L^{*})^{2} + (a^{*})^{2} + (b^{*})^{2}},$$
where $L^{*}$, $a^{*}$, and $b^{*}$ are the CIELAB coordinates of a pixel. In this study, the vividness of an image was defined as the average of the per-pixel $V^{*}$ values, and the average over all video frames was taken as the video-level feature.
- Heaviness is a color attribute that describes how heavy or light a color appears. In this study, the heaviness of an image was defined as the average of the per-pixel H values calculated from the CIELAB channels. As a quality-aware feature for a video sequence, the average of all video frames' heaviness was taken.
- Depth is also a color attribute, but it characterizes the degree of departure of a given color from neutral white; in Berns' model [60], it is formally given as
$$D^{*} = \sqrt{(100 - L^{*})^{2} + (a^{*})^{2} + (b^{*})^{2}}.$$
In this study, the depth of an image was defined as the average of the per-pixel $D^{*}$ values (a code sketch covering both vividness and depth follows this list). As a quality-aware feature for a video sequence, the average of all video frames' depth was taken.
- The spatial information (SI) of a video frame is defined with the help of the non-maximum suppression (NMS) algorithm [64,65]. Namely, a video frame is characterized by the number of local extrema detected at three different thresholds $T$ (three fixed threshold values were considered in this study). More specifically, NMS is carried out on a filtered version of the video frame, and the surviving local extrema are counted at each threshold; an extrema-counting sketch is given after this list.
- Temporal information was defined using the difference between consecutive video frames. Namely, the standard deviations of all difference maps were determined, and their arithmetic mean was taken as a video-level quality-aware feature (see the code sketch after this list).
- The color gradient magnitude (CGM) map of an RGB digital image is defined as
$$\mathrm{CGM}(i,j) = \sqrt{\sum_{c \in \{R,G,B\}} \left( G_{x,c}(i,j)^{2} + G_{y,c}(i,j)^{2} \right)},$$
where $G_{x,c}$ and $G_{y,c}$ denote the horizontal and vertical derivatives of color channel $c$. The mean of the CGM map characterizes a single video frame, and the average over all frames gives a video-level feature.
- In addition to the mean of the CGM, the standard deviation of the CGM is also considered a quality-aware feature for a single video frame. As in the previous point, the average of all video frames' standard deviations was used to characterize the whole video sequence (see the CGM sketch after this list).
- Sharpness determines the amount of detail in an image. It is most visible at image edges, and many approaches measure it via the step response. In our model, we estimated the sharpness of a video frame using image gradients. Namely, the gradient magnitude map ($GM$) was calculated as
$$GM(i,j) = \sqrt{G_{x}(i,j)^{2} + G_{y}(i,j)^{2}},$$
where $G_{x}$ and $G_{y}$ are the horizontal and vertical derivatives of the grayscale frame.
- Michelson contrast: By definition, contrast corresponds to the difference in luminance that makes an object noticeable in an image [66]. Humans tend to appreciate images with higher contrast, since they can better distinguish differences in intensity. In our model, we incorporated two different quantifications of contrast, i.e., Michelson and root mean square (RMS) contrast. The Michelson contrast of a still image is determined as follows:
$$C_{M} = \frac{I_{max} - I_{min}}{I_{max} + I_{min}},$$
where $I_{max}$ and $I_{min}$ denote the highest and lowest intensities in the image.
- The RMS contrast of an image of size $M \times N$ corresponds to the standard deviation of its intensities [67]:
$$C_{RMS} = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I_{i,j} - \bar{I} \right)^{2}},$$
where $\bar{I}$ is the mean intensity. Code sketches for both contrast measures are given after this list.
- The mean of an image summarizes the contribution of the individual pixel intensities over the entire image. Further, the mean is inversely proportional to the amount of haze. In our study, the average of all video frames' mean intensities was considered as a quality-aware feature.
- Entropy: This can be viewed as a measure of disorder in a digital image; at the same time, it is a statistical feature that gives the average information content of an image [54]. Further, the entropy of an image tends to increase as the intensity of noise or the degradation level increases [68]. The entropy ($E$) of an 8-bit-depth grayscale image can be given as
$$E = -\sum_{i=0}^{255} p_{i} \log_{2} p_{i},$$
where $p_{i}$ is the probability of intensity level $i$, estimated from the normalized histogram (see the code sketch after this list).
- The perception-based image quality evaluator (PIQE) [69] is an opinion-unaware image quality estimator that does not require any training data. Further, it estimates perceptual quality only from salient image regions. First, an input image is divided into non-overlapping 16×16-sized blocks. The identification of salient blocks is carried out with the help of mean subtracted contrast normalized (MSCN) coefficients. Moreover, noise and artifact quantification are also carried out with MSCN coefficients. In our study, the average of all video frames' PIQE scores was considered as a quality-aware feature.
- The naturalness image quality evaluator (NIQE) [20] is also an opinion-unaware image quality estimator that needs no training data. Namely, it quantifies image quality as the distance between the NSS features of an input image and the NSS features of a model obtained from pristine (distortion-free) images. The applied NSS features are modeled as multivariate Gaussian distributions. In our study, the average of all video frames' NIQE scores was considered as a quality-aware feature; a generic temporal-pooling skeleton is sketched after this list.
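The following minimal NumPy/OpenCV sketch illustrates the re-blur principle behind the metric of Crété-Roffet et al. [56]: if strong low-pass filtering barely changes the local variations, the frame was already blurry. This is a simplified illustration, not the authors' reference implementation; the 9-tap box filters and the pooling over directions are assumptions based on the published description.

```python
import numpy as np
import cv2

def blur_metric(gray):
    """Simplified re-blur estimate; returns a value in [0, 1], higher = more blur."""
    f = gray.astype(np.float64)
    b_h = cv2.blur(f, (9, 1))             # strong horizontal low-pass
    b_v = cv2.blur(f, (1, 9))             # strong vertical low-pass
    d_f_h = np.abs(np.diff(f, axis=1))    # variations of the original frame
    d_f_v = np.abs(np.diff(f, axis=0))
    d_b_h = np.abs(np.diff(b_h, axis=1))  # variations after re-blurring
    d_b_v = np.abs(np.diff(b_v, axis=0))
    v_h = np.maximum(0.0, d_f_h - d_b_h)  # variation removed by re-blurring
    v_v = np.maximum(0.0, d_f_v - d_b_v)
    s_h = (d_f_h.sum() - v_h.sum()) / max(d_f_h.sum(), 1e-12)
    s_v = (d_f_v.sum() - v_v.sum()) / max(d_f_v.sum(), 1e-12)
    return max(s_h, s_v)
```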
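The colorfulness formula above translates directly into NumPy. A minimal sketch, assuming the frame is an H×W×3 RGB array:

```python
import numpy as np

def colorfulness(frame_rgb):
    """Hasler-Suesstrunk colorfulness [59] of one RGB frame."""
    r, g, b = (frame_rgb[..., k].astype(np.float64) for k in range(3))
    rg = r - g                            # red-green opponent channel
    yb = 0.5 * (r + g) - b                # yellow-blue opponent channel
    sigma = np.hypot(rg.std(), yb.std())  # combined standard deviation
    mu = np.hypot(rg.mean(), yb.mean())   # combined mean
    return sigma + 0.3 * mu
```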
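Vividness and depth can be computed per pixel from the CIELAB coordinates and then averaged, as described above. A sketch using scikit-image's rgb2lab (the choice of conversion routine is an assumption; any sRGB-to-CIELAB conversion works):

```python
import numpy as np
from skimage.color import rgb2lab

def vividness_and_depth(frame_rgb):
    """Frame-level Berns vividness V* and depth D* [60] (means over pixels)."""
    lab = rgb2lab(frame_rgb)                   # L* in [0, 100]; a*, b* signed
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    v = np.sqrt(L**2 + a**2 + b**2)            # departure from neutral black
    d = np.sqrt((100.0 - L)**2 + a**2 + b**2)  # departure from neutral white
    return float(v.mean()), float(d.mean())
```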
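The SI feature counts local extrema that survive non-maximum suppression. The paper's exact pre-filter could not be recovered here, so the sketch below simply counts local maxima of a (possibly pre-filtered) frame above a threshold T; the 3×3 neighborhood is an assumption.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def count_local_maxima(gray, threshold):
    """Number of pixels that equal the maximum of their 3x3 neighborhood
    (an NMS-style extrema count) and exceed `threshold`."""
    f = gray.astype(np.float64)
    peaks = (f == maximum_filter(f, size=3)) & (f > threshold)
    return int(peaks.sum())
```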
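The temporal information feature is the mean of the standard deviations of consecutive frame-difference maps; a minimal sketch:

```python
import numpy as np

def temporal_information(frames):
    """frames: iterable of equally sized 2-D grayscale arrays."""
    stds, prev = [], None
    for frame in frames:
        f = frame.astype(np.float64)
        if prev is not None:
            stds.append((f - prev).std())  # std of one difference map
        prev = f
    return float(np.mean(stds)) if stds else 0.0
```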
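The CGM and sharpness features both reduce to gradient magnitudes. The sketch below uses Sobel derivatives (an assumption; any discrete derivative operator fits the definitions above), and pooling the sharpness map by its mean is likewise an assumption:

```python
import numpy as np
from scipy.ndimage import sobel

def cgm_stats(frame_rgb):
    """Mean and standard deviation of the color gradient magnitude map."""
    f = frame_rgb.astype(np.float64)
    sq = np.zeros(f.shape[:2])
    for c in range(3):  # accumulate squared derivatives over R, G, B
        sq += sobel(f[..., c], axis=1)**2 + sobel(f[..., c], axis=0)**2
    cgm = np.sqrt(sq)
    return float(cgm.mean()), float(cgm.std())

def sharpness(gray):
    """Mean gradient magnitude of a grayscale frame (sharpness proxy)."""
    f = gray.astype(np.float64)
    gm = np.hypot(sobel(f, axis=1), sobel(f, axis=0))
    return float(gm.mean())
```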
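Michelson contrast, RMS contrast, and entropy follow their formulas directly:

```python
import numpy as np

def michelson_contrast(gray):
    """(I_max - I_min) / (I_max + I_min), on a float copy of the frame."""
    f = gray.astype(np.float64)
    lo, hi = f.min(), f.max()
    return float((hi - lo) / (hi + lo)) if hi + lo > 0 else 0.0

def rms_contrast(gray):
    """Standard deviation of the pixel intensities."""
    return float(gray.astype(np.float64).std())

def entropy(gray_u8):
    """Shannon entropy (bits) of an 8-bit grayscale frame."""
    hist = np.bincount(gray_u8.ravel(), minlength=256).astype(np.float64)
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log2(0) is taken as 0
    return float(-(p * np.log2(p)).sum())
```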
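Finally, every frame-level feature above is turned into a video-level feature by temporal averaging. A generic skeleton, assuming OpenCV for decoding and an externally supplied per-frame metric (e.g., a PIQE or NIQE implementation, which is not provided here):

```python
import cv2
import numpy as np

def temporal_pool(video_path, frame_metric):
    """Average `frame_metric` (a callable mapping a BGR frame to a scalar)
    over all frames of the video at `video_path`."""
    cap = cv2.VideoCapture(video_path)
    scores = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        scores.append(frame_metric(frame))
    cap.release()
    return float(np.mean(scores)) if scores else float("nan")
```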
3.2. Local Features
4. Results
4.1. Datasets and Protocol
4.2. Parameter Study
4.3. Comparison to the State-of-the-Art Methods
5. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning
---|---
AGGD | asymmetric generalized Gaussian distribution
BDT | binary decision tree
CF | colorfulness
CGM | color gradient magnitude
CNN | convolutional neural network
E | entropy
ET | extra tree
FR-VQA | full-reference video quality assessment
GAM | generalized additive model
GPR | Gaussian process regressor
GRU | gated recurrent unit
HVS | human visual system
IQA | image quality assessment
LIVE | Laboratory for Image and Video Engineering
MOS | mean opinion score
NIQE | naturalness image quality evaluator
NMS | non-maximum suppression
NN | neural network
NR-IQA | no-reference image quality assessment
NR-VQA | no-reference video quality assessment
NSS | natural scene statistics
PIQE | perception-based image quality evaluator
PLCC | Pearson's linear correlation coefficient
RBF | radial basis function
RMS | root mean square
RNN | recurrent neural network
RR-VQA | reduced-reference video quality assessment
SI | spatial information
SROCC | Spearman's rank order correlation coefficient
SVR | support vector regressor
VQA | video quality assessment
VQC | video quality challenge
References
- Hewage, C.T.; Ahmad, A.; Mallikarachchi, T.; Barman, N.; Martini, M.G. Measuring, Modelling and Integrating Time-Varying Video Quality in End-to-End Multimedia Service Delivery: A Review and Open Challenges. IEEE Access 2022, 10, 60267–60293.
- Saupe, D.; Hahn, F.; Hosu, V.; Zingman, I.; Rana, M.; Li, S. Crowd workers proven useful: A comparative study of subjective video quality assessment. In Proceedings of the QoMEX 2016: 8th International Conference on Quality of Multimedia Experience, Lisbon, Portugal, 6–8 June 2016.
- Men, H.; Hosu, V.; Lin, H.; Bruhn, A.; Saupe, D. Subjective annotation for a frame interpolation benchmark using artefact amplification. Qual. User Exp. 2020, 5, 8.
- Brunnstrom, K.; Hands, D.; Speranza, F.; Webster, A. VQEG validation and ITU standardization of objective perceptual video quality metrics [Standards in a Nutshell]. IEEE Signal Process. Mag. 2009, 26, 96–101.
- Winkler, S. Video quality measurement standards—Current status and trends. In Proceedings of the 2009 IEEE 7th International Conference on Information, Communications and Signal Processing (ICICS), Macau, China, 8–10 December 2009; pp. 1–5.
- Gadiraju, U.; Möller, S.; Nöllenburg, M.; Saupe, D.; Egger-Lampl, S.; Archambault, D.; Fisher, B. Crowdsourcing versus the laboratory: Towards human-centered experiments using the crowd. In Evaluation in the Crowd. Crowdsourcing and Human-Centered Experiments; Springer: Berlin/Heidelberg, Germany, 2017; pp. 6–26.
- 5G Experimentation Environment for 3rd Party Media Services, D2.9: Continuous QoS/QoE Monitoring Engine Development - Initial. 2022. Available online: https://www.5gmediahub.eu/wp-content/uploads/2022/06/D2.9_submitted.pdf (accessed on 29 October 2022).
- Shahid, M.; Rossholm, A.; Lövström, B.; Zepernick, H.J. No-reference image and video quality assessment: A classification and review of recent approaches. EURASIP J. Image Video Process. 2014, 2014, 40.
- Ghadiyaram, D.; Chen, C.; Inguva, S.; Kokaram, A. A no-reference video quality predictor for compression and scaling artifacts. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 3445–3449.
- De Cesarei, A.; Loftus, G.R. Global and local vision in natural scene identification. Psychon. Bull. Rev. 2011, 18, 840–847.
- Bae, S.H.; Kim, M. A novel image quality assessment with globally and locally consilient visual quality perception. IEEE Trans. Image Process. 2016, 25, 2392–2406.
- Wang, H.; Qu, H.; Xu, J.; Wang, J.; Wei, Y.; Zhang, Z. Combining Statistical Features and Local Pattern Features for Texture Image Retrieval. IEEE Access 2020, 8, 222611–222624.
- Chang, H.W.; Du, C.Y.; Bi, X.D.; Chen, K.; Wang, M.H. LG-IQA: Integration of Local and Global Features for No-Reference Image Quality Assessment. Displays 2022, 75, 102334.
- Varga, D. A Human Visual System Inspired No-Reference Image Quality Assessment Method Based on Local Feature Descriptors. Sensors 2022, 22, 6775.
- Rosten, E.; Drummond, T. Fusing points and lines for high performance tracking. In Proceedings of the Tenth IEEE International Conference on Computer Vision (ICCV'05), Beijing, China, 17–21 October 2005; Volume 2, pp. 1508–1515.
- Hosu, V.; Hahn, F.; Jenadeleh, M.; Lin, H.; Men, H.; Szirányi, T.; Li, S.; Saupe, D. The Konstanz natural video database (KoNViD-1k). In Proceedings of the 2017 IEEE Ninth International Conference on Quality of Multimedia Experience (QoMEX), Erfurt, Germany, 31 May–2 June 2017; pp. 1–6.
- Sinno, Z.; Bovik, A.C. Large-scale study of perceptual video quality. IEEE Trans. Image Process. 2018, 28, 612–627.
- Kossi, K.; Coulombe, S.; Desrosiers, C.; Gagnon, G. No-reference video quality assessment using distortion learning and temporal attention. IEEE Access 2022, 10, 41010–41022.
- Srivastava, A.; Lee, A.B.; Simoncelli, E.P.; Zhu, S.C. On advances in statistical modeling of natural images. J. Math. Imaging Vis. 2003, 18, 17–33.
- Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212.
- Mittal, A.; Moorthy, A.K.; Bovik, A.C. No-reference image quality assessment in the spatial domain. IEEE Trans. Image Process. 2012, 21, 4695–4708.
- Kundu, D.; Ghadiyaram, D.; Bovik, A.C.; Evans, B.L. No-reference quality assessment of tone-mapped HDR pictures. IEEE Trans. Image Process. 2017, 26, 2957–2971.
- Men, H.; Lin, H.; Saupe, D. Empirical evaluation of no-reference VQA methods on a natural video quality database. In Proceedings of the 2017 IEEE Ninth International Conference on Quality of Multimedia Experience (QoMEX), Erfurt, Germany, 31 May–2 June 2017; pp. 1–3.
- Smola, A.J.; Schölkopf, B. A tutorial on support vector regression. Stat. Comput. 2004, 14, 199–222.
- Xu, J.; Ye, P.; Liu, Y.; Doermann, D. No-reference video quality assessment via feature learning. In Proceedings of the 2014 IEEE International Conference on Image Processing (ICIP), Paris, France, 27–30 October 2014; pp. 491–495.
- Saad, M.A.; Bovik, A.C. Blind quality assessment of videos using a model of natural scene statistics and motion coherency. In Proceedings of the 2012 Conference Record of the Forty-Sixth Asilomar Conference on Signals, Systems and Computers (ASILOMAR), Pacific Grove, CA, USA, 4–7 November 2012; pp. 332–336.
- Yan, P.; Mou, X. No-reference video quality assessment based on perceptual features extracted from multi-directional video spatiotemporal slices images. In Proceedings of Optoelectronic Imaging and Multimedia Technology V, International Society for Optics and Photonics, Beijing, China, 11–13 October 2018; Volume 10817, pp. 335–344.
- Lemesle, A.; Marion, A.; Roux, L.; Gouaillard, A. NARVAL: A no-reference video quality tool for real-time communications. Electron. Imaging 2019, 2019, 213-1–213-7.
- Dalal, N.; Triggs, B. Histograms of oriented gradients for human detection. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, USA, 20–25 June 2005; Volume 1, pp. 886–893.
- Nussbaumer, H.J. The fast Fourier transform. In Fast Fourier Transform and Convolution Algorithms; Springer: Berlin/Heidelberg, Germany, 1981; pp. 80–111.
- Wang, Z.; Li, Q. Video quality assessment using a statistical model of human visual speed perception. JOSA A 2007, 24, B61–B69.
- Li, Y.; Po, L.M.; Cheung, C.H.; Xu, X.; Feng, L.; Yuan, F.; Cheung, K.W. No-reference video quality assessment with 3D shearlet transform and convolutional neural networks. IEEE Trans. Circuits Syst. Video Technol. 2015, 26, 1044–1057.
- Lim, W.Q. The discrete shearlet transform: A new directional transform and compactly supported shearlet frames. IEEE Trans. Image Process. 2010, 19, 1166–1180.
- Wang, C.; Su, L.; Zhang, W. COME for no-reference video quality assessment. In Proceedings of the 2018 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR), Miami, FL, USA, 10–12 April 2018; pp. 232–237.
- Agarla, M.; Celona, L.; Schettini, R. No-reference quality assessment of in-capture distorted videos. J. Imaging 2020, 6, 74.
- Korhonen, J. Two-level approach for no-reference consumer video quality assessment. IEEE Trans. Image Process. 2019, 28, 5923–5938.
- Agarla, M.; Celona, L.; Schettini, R. An Efficient Method for No-Reference Video Quality Assessment. J. Imaging 2021, 7, 55.
- Dupond, S. A thorough review on the current advance of neural network structures. Annu. Rev. Control 2019, 14, 200–230.
- Chen, P.; Li, L.; Ma, L.; Wu, J.; Shi, G. RIRNet: Recurrent-in-recurrent network for video quality assessment. In Proceedings of the 28th ACM International Conference on Multimedia, Seattle, WA, USA, 12–16 October 2020; pp. 834–842.
- Li, D.; Jiang, T.; Jiang, M. Quality assessment of in-the-wild videos. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2351–2359.
- Cho, K.; Van Merriënboer, B.; Bahdanau, D.; Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. arXiv 2014, arXiv:1409.1259.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Zhang, A.X.; Wang, Y.G. Texture Information Boosts Video Quality Assessment. In Proceedings of the ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 23–27 May 2022; pp. 2050–2054.
- Li, D.; Jiang, T.; Jiang, M. Unified quality assessment of in-the-wild videos with mixed datasets training. Int. J. Comput. Vis. 2021, 129, 1238–1257.
- Guan, X.; Li, F.; Zhang, Y.; Cosman, P.C. End-to-End Blind Video Quality Assessment Based on Visual and Memory Attention Modeling. IEEE Trans. Multimed. 2022, 1–16.
- Lou, Y.; Caruana, R.; Gehrke, J. Intelligible models for classification and regression. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 12–16 August 2012; pp. 150–158.
- Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32.
- Seeger, M. Gaussian processes for machine learning. Int. J. Neural Syst. 2004, 14, 69–106.
- Wright, S.; Nocedal, J. Numerical Optimization; Springer: New York, NY, USA, 1999.
- Loh, W.Y. Regression trees with unbiased variable selection and interaction detection. Stat. Sin. 2002, 12, 361–386.
- Geurts, P.; Ernst, D.; Wehenkel, L. Extremely randomized trees. Mach. Learn. 2006, 63, 3–42.
- Zhu, Y.; Li, C.; Tang, J.; Luo, B. Quality-aware feature aggregation network for robust RGBT tracking. IEEE Trans. Intell. Veh. 2020, 6, 121–130.
- Liu, L.; Hua, Y.; Zhao, Q.; Huang, H.; Bovik, A.C. Blind image quality assessment by relative gradient statistics and adaboosting neural network. Signal Process. Image Commun. 2016, 40, 1–15.
- Liu, L.; Liu, B.; Huang, H.; Bovik, A.C. No-reference image quality assessment based on spatial and spectral entropies. Signal Process. Image Commun. 2014, 29, 856–863.
- Xue, W.; Mou, X.; Zhang, L.; Bovik, A.C.; Feng, X. Blind image quality assessment using joint statistics of gradient magnitude and Laplacian features. IEEE Trans. Image Process. 2014, 23, 4850–4862.
- Crété-Roffet, F.; Dolmiere, T.; Ladret, P.; Nicolas, M. The blur effect: Perception and estimation with a new no-reference perceptual blur metric. In Proceedings of the SPIE Electronic Imaging Symposium Conference on Human Vision and Electronic Imaging, San Jose, CA, USA, 12 February 2007; Volume 6492, pp. 196–206.
- Palus, H. Colorfulness of the image: Definition, computation, and properties. In Proceedings of Lightmetry and Light and Optics in Biomedicine 2004, SPIE, Warsaw, Poland, 20 April 2006; Volume 6158, pp. 42–47.
- Yendrikhovskij, S.; Blommaert, F.J.; de Ridder, H. Optimizing color reproduction of natural images. In Proceedings of the Color and Imaging Conference, Society for Imaging Science and Technology, Scottsdale, AZ, USA, 17–20 November 1998; Volume 1998, pp. 140–145.
- Hasler, D.; Suesstrunk, S.E. Measuring colorfulness in natural images. In Proceedings of Human Vision and Electronic Imaging VIII, SPIE, Santa Clara, CA, USA, 17 June 2003; Volume 5007, pp. 87–95.
- Berns, R.S. Extending CIELAB: Vividness, depth, and clarity. Color Res. Appl. 2014, 39, 322–330.
- Midtfjord, H.B.; Green, P.; Nussbaum, P. Vividness as a colour appearance attribute. In Proceedings of the Color and Imaging Conference, Society for Imaging Science and Technology, Washington, DC, USA, 16 June 2019; Volume 2019, pp. 308–313.
- Chetverikov, D. Fundamental structural features in the visual world. In Fundamental Structural Properties in Image and Pattern Analysis; Citeseer: University Park, PA, USA, 1999.
- Ou, L.C.; Luo, M.R.; Woodcock, A.; Wright, A. A study of colour emotion and colour preference. Part III: Colour preference modeling. Color Res. Appl. 2004, 29, 381–389.
- Neubeck, A.; Van Gool, L. Efficient non-maximum suppression. In Proceedings of the 18th International Conference on Pattern Recognition (ICPR'06), IEEE, Hong Kong, China, 20–24 August 2006; Volume 3, pp. 850–855.
- Hosang, J.; Benenson, R.; Schiele, B. Learning non-maximum suppression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4507–4515.
- Campbell, F.W.; Robson, J.G. Application of Fourier analysis to the visibility of gratings. J. Physiol. 1968, 197, 551.
- Peli, E. Contrast in complex images. JOSA A 1990, 7, 2032–2040.
- Andre, T.; Antonini, M.; Barlaud, M.; Gray, R.M. Entropy-based distortion measure for image coding. In Proceedings of the 2006 IEEE International Conference on Image Processing, Atlanta, GA, USA, 8–11 October 2006; pp. 1157–1160.
- Venkatanath, N.; Praneeth, D.; Bh, M.C.; Channappayya, S.S.; Medasani, S.S. Blind image quality evaluation using perception based features. In Proceedings of the 2015 Twenty First National Conference on Communications (NCC), Mumbai, India, 27 February–1 March 2015; pp. 1–6.
- Ghosh, K.; Sarkar, S.; Bhaumik, K. A possible mechanism of zero-crossing detection using the concept of the extended classical receptive field of retinal ganglion cells. Biol. Cybern. 2005, 93, 1–5.
- Ghosh, K.; Sarkar, S.; Bhaumik, K. Understanding image structure from a new multi-scale representation of higher order derivative filters. Image Vis. Comput. 2007, 25, 1228–1238.
- Patil, S.B.; Patil, B. Automatic Detection of Microaneurysms in Retinal Fundus Images using Modified High Boost Filtering, Line Detectors and OC-SVM. In Proceedings of the 2020 IEEE International Conference on Industry 4.0 Technology (I4Tech), Pune, India, 13–15 February 2020; pp. 148–153.
- Li, Q.; Lin, W.; Fang, Y. No-reference image quality assessment based on high order derivatives. In Proceedings of the 2016 IEEE International Conference on Multimedia and Expo (ICME), Seattle, WA, USA, 11–15 July 2016; pp. 1–6.
- Poynton, C.A. A Technical Introduction to Digital Video; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1996.
- Thomee, B.; Shamma, D.A.; Friedland, G.; Elizalde, B.; Ni, K.; Poland, D.; Borth, D.; Li, L.J. YFCC100M: The new data in multimedia research. Commun. ACM 2016, 59, 64–73.
- Xu, L.; Lin, W.; Kuo, C.C.J. Visual Quality Assessment by Machine Learning; Springer: Berlin/Heidelberg, Germany, 2015.
- Rohaly, A.M.; Corriveau, P.J.; Libert, J.M.; Webster, A.A.; Baroncini, V.; Beerends, J.; Blin, J.L.; Contin, L.; Hamada, T.; Harrison, D.; et al. Video quality experts group: Current results and future directions. In Proceedings of Visual Communications and Image Processing 2000, SPIE, Perth, Australia, 30 May 2000; Volume 4067, pp. 742–753.
- Mittal, A. Natural Scene Statistics-Based Blind Visual Quality Assessment in the Spatial Domain. Ph.D. Thesis, The University of Texas at Austin, Austin, TX, USA, 2013.
- Saad, M.A.; Bovik, A.C.; Charrier, C. Blind prediction of natural video quality. IEEE Trans. Image Process. 2014, 23, 1352–1365.
- Mittal, A.; Saad, M.A.; Bovik, A.C. A completely blind video integrity oracle. IEEE Trans. Image Process. 2015, 25, 289–300.
- Dendi, S.V.R.; Channappayya, S.S. No-reference video quality assessment using natural spatiotemporal scene statistics. IEEE Trans. Image Process. 2020, 29, 5612–5624.
- Men, H.; Lin, H.; Saupe, D. Spatiotemporal feature combination model for no-reference video quality assessment. In Proceedings of the 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), Cagliari, Italy, 29 May–1 June 2018; pp. 1–3.
- Ebenezer, J.P.; Shang, Z.; Wu, Y.; Wei, H.; Bovik, A.C. No-reference video quality assessment using space-time chips. In Proceedings of the 2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP), Tampere, Finland, 21–24 September 2020; pp. 1–6.
- Tu, Z.; Wang, Y.; Birkbeck, N.; Adsumilli, B.; Bovik, A.C. UGC-VQA: Benchmarking blind video quality assessment for user generated content. IEEE Trans. Image Process. 2021, 30, 4449–4464.
- Hosu, V.; Lin, H.; Sziranyi, T.; Saupe, D. KonIQ-10k: An Ecologically Valid Database for Deep Learning of Blind Image Quality Assessment. IEEE Trans. Image Process. 2020, 29, 4041–4056.
- Ying, Z.; Niu, H.; Gupta, P.; Mahajan, D.; Ghadiyaram, D.; Bovik, A. From patches to pictures (PaQ-2-PiQ): Mapping the perceptual space of picture quality. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3575–3585.
- Ebenezer, J.P.; Shang, Z.; Wu, Y.; Wei, H.; Sethuraman, S.; Bovik, A.C. ChipQA: No-reference video quality prediction via space-time chips. IEEE Trans. Image Process. 2021, 30, 8059–8074.
Feature Index | Description |
---|---|
f1–f6 | Temporally pooled BRISQUE [21] statistics |
f7–f12 | Temporally pooled OG-IQA [53] statistics |
f13–f18 | Temporally pooled SSEQ [54] statistics |
f19–f24 | Temporally pooled GM-LOG-BIQA [55] statistics |
f25–f40 | Perceptual features |
f41–f46 | Temporally pooled Bilaplacian features’ statistics |
f47–f52 | Temporally pooled high-boost features’ statistics |
f53–f58 | Temporally pooled derivative features’ statistics |
Computer model | Z590 D
---|---
CPU | Intel(R) Core(TM) i7-11700F @ 2.50 GHz (8 cores)
Memory | 31.9 GB
GPU | Nvidia GeForce RTX 3090
Method | PLCC | SROCC |
---|---|---|
NVIE [78] | 0.404 | 0.333 |
V.BLIINDS [79] | 0.661 | 0.694 |
VIIDEO [80] | 0.301 | 0.299 |
3D-MSCN [81] | 0.401 | 0.370 |
ST-Gabor [81] | 0.639 | 0.628 |
3D-MSCN + ST-Gabor [81] | 0.653 | 0.640 |
FC Model [82] | 0.492 | 0.472 |
STFC Model [82] | 0.639 | 0.606 |
STS-SVR [27] | 0.680 | 0.673 |
STS-MLP [27] | 0.407 | 0.420 |
ChipQA-0 [83] | 0.697 | 0.694 |
ChipQA [87] | 0.763 | 0.763 |
KonCept512 [84,85] | 0.749 | 0.735 |
PaQ-2-PiQ [84,86] | 0.601 | 0.613 |
FLG-VQA | 0.787 | 0.783 |
Method | PLCC | SROCC |
---|---|---|
NVIE [78] | 0.447 | 0.459 |
V.BLIINDS [79] | 0.690 | 0.703 |
VIIDEO [80] | −0.006 | −0.034 |
3D-MSCN [81] | 0.502 | 0.510 |
ST-Gabor [81] | 0.591 | 0.599 |
3D-MSCN + ST-Gabor [81] | 0.675 | 0.677 |
FC Model [82] | - | - |
STFC Model [82] | - | - |
STS-SVR [27] | - | - |
STS-MLP [27] | - | - |
ChipQA-0 [83] | 0.669 | 0.697 |
ChipQA [87] | 0.723 | 0.719 |
KonCept512 [84,85] | 0.728 | 0.665 |
PaQ-2-PiQ [84,86] | 0.668 | 0.644 |
FLG-VQA | 0.733 | 0.731 |
Method | PLCC (Direct Average) | SROCC (Direct Average) | PLCC (Weighted Average) | SROCC (Weighted Average)
---|---|---|---|---
NVIE [78] | 0.426 | 0.396 | 0.418 | 0.374 |
V.BLIINDS [79] | 0.676 | 0.698 | 0.671 | 0.697 |
VIIDEO [80] | 0.148 | 0.133 | 0.200 | 0.190 |
3D-MSCN [81] | 0.452 | 0.440 | 0.434 | 0.416 |
ST-Gabor [81] | 0.615 | 0.613 | 0.623 | 0.618 |
3D-MSCN + ST-Gabor [81] | 0.664 | 0.659 | 0.660 | 0.652 |
FC Model [82] | - | - | - | - |
STFC Model [82] | - | - | - | - |
STS-SVR [27] | - | - | - | - |
STS-MLP [27] | - | - | - | - |
ChipQA-0 [83] | 0.683 | 0.696 | 0.688 | 0.695 |
ChipQA [87] | 0.743 | 0.741 | 0.750 | 0.749 |
KonCept512 [84,85] | 0.739 | 0.700 | 0.742 | 0.712 |
PaQ-2-PiQ [84,86] | 0.635 | 0.629 | 0.623 | 0.623 |
FLG-VQA | 0.760 | 0.757 | 0.769 | 0.766 |