Exploring Metrics to Establish an Optimal Model for Image Aesthetic Assessment and Analysis
Abstract
1. Introduction
- For RSRL, besides the external F-measure, an internal metric called the disentanglement measure (D-measure) is defined, which quantifies how disentangled the nodes of the final FC layer are.
- By combining the F-measure with the D-measure into an FD measure, an algorithm is proposed for determining the optimal model among the many re-trained models generated by RSRL, where these models are score prediction models.
- The effectiveness of the proposed method is validated by comparing the performances of many re-trained models with different CNN structures.
- The FFP and the AIR of the model with respect to an image are defined from the activated feature maps. Analyzing the FFP and AIR of images is found to be useful for understanding the internal properties of the model related to human aesthetics and for validating its external performance on aesthetic assessment.
2. Disentanglement Measure
- Normalizing the outputs of the J nodes of the final FC layer over the sample set so that each node's outputs have zero mean and unit variance;
- Calculating the correlation matrix R of the nodes based on the normalized outputs;
- Calculating and sorting the eigenvalues of R in a descending order, denoted λ1 ≥ … ≥ λJ, and obtaining the corresponding eigenvectors, denoted v1, …, vJ;
- Calculating the factor loading amj = √λm · vmj of factor m (latent variable) against node j, j ∈ [1, J];
- Calculating the two-norm of the difference between the factor-loading vectors of the two nodes j1 and j2;
- Calculating the minimum of this two-norm regarding one node j against all of the other nodes jj;
- Calculating the mean of these minima regarding all of the nodes;
- Defining the D-measure from this mean, where a larger value indicates more disentangled nodes.
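The steps above can be sketched in NumPy. The helper name `d_measure` and the final step (taking the mean of the per-node minima directly as the D-measure, without extra scaling) are illustrative assumptions, not the paper's exact formula:

```python
import numpy as np

def d_measure(fc_outputs):
    # fc_outputs: (N samples, J nodes) activations of the final FC layer
    N, J = fc_outputs.shape
    # 1. Normalize each node's outputs to zero mean and unit variance
    X = (fc_outputs - fc_outputs.mean(0)) / (fc_outputs.std(0) + 1e-12)
    # 2. Correlation matrix R of the J nodes
    R = np.corrcoef(X, rowvar=False)
    # 3. Eigen-decomposition, eigenvalues sorted in descending order
    lam, V = np.linalg.eigh(R)
    order = np.argsort(lam)[::-1]
    lam, V = lam[order], V[:, order]
    # 4. Factor loadings: A[m, j] = sqrt(lam_m) * v_m[j]
    A = np.sqrt(np.clip(lam, 0, None))[:, None] * V.T
    # 5-6. For each node j, the minimum two-norm distance between its
    # loading vector and any other node's loading vector
    dists = np.linalg.norm(A[:, :, None] - A[:, None, :], axis=0)
    np.fill_diagonal(dists, np.inf)
    min_d = dists.min(axis=1)
    # 7-8. D-measure: mean of the per-node minima
    return min_d.mean()
```

A larger value means the nodes load on factors that are farther apart, i.e., the final FC layer is more disentangled.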
3. Establishing an Optimal Model for Image Aesthetic Assessment via RSRL
4. Extracting FFP and AIR
- Obtaining the most activated feature map and the sum of the feature maps of the final convolutional layer of the CNN for an image I(x, y), and normalizing and resizing both to the size of I, where x and y indicate the pixel coordinates;
- Extracting FFP and AIR;
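The two steps can be sketched as follows, assuming concrete definitions the outline leaves open: the FFP is taken as the peak of the most activated map, the AIR as a thresholded region of the summed maps, and the names `extract_ffp_air` and `thresh` as well as the nearest-neighbour resize are illustrative choices:

```python
import numpy as np

def extract_ffp_air(feature_maps, image_hw, thresh=0.5):
    # feature_maps: (C, h, w) activations of the final convolutional layer
    C, h, w = feature_maps.shape
    H, W = image_hw
    # Most activated map = map with the largest total activation
    top = feature_maps[feature_maps.sum(axis=(1, 2)).argmax()]
    total = feature_maps.sum(axis=0)

    def norm_resize(m):
        # Normalize to [0, 1], then nearest-neighbour resize to the
        # image size (a stand-in for a proper bilinear resize)
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)
        ys = (np.arange(H) * h // H).clip(0, h - 1)
        xs = (np.arange(W) * w // W).clip(0, w - 1)
        return m[np.ix_(ys, xs)]

    top_r, total_r = norm_resize(top), norm_resize(total)
    # FFP: pixel where the most activated map peaks
    ffp = np.unravel_index(top_r.argmax(), top_r.shape)
    # AIR: binary mask where the summed map exceeds the threshold
    air = total_r > thresh
    return ffp, air
```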
5. Experimental Results and Analysis
5.1. Establishing Optimal Model
- Type a: Fine-Tuned AlexNet
- 1–end-3 layers: transferring layers 1–end-3 of AlexNet
- end-2 layer ‘fc’: 8 fully connected layer, each node corresponding to a score class of 2–9
- end-1 layer ‘softmax’: Softmax
- end layer ‘classoutput’: Classification Output
- Type b: Changed AlexNet
- 1–end-9 layers: transferring layers 1–end-5 of AlexNet
- end-8 layer ‘batchnorm_1’: Batch normalization with 4096 channels
- end-7 layer ‘relu_1’: ReLU
- end-6 layer ‘dropout’: 50% dropout
- end-5 layer ‘fc_1’: 32 fully connected layer
- end-4 layer ‘batchnorm_2’: Batch normalization with 32 channels
- end-3 layer ‘relu_2’: ReLU
- end-2 layer ‘fc_2’: 8 fully connected layer, each node corresponding to a score class of 2–9
- end-1 layer ‘softmax’: Softmax
- end layer ‘classoutput’: Classification Output
- Type c: Only 1 × 1 convolutions CNN
- 1 layer ‘imageinput’: 227 × 227 × 3 images with ‘zerocenter’ normalization
- 2 layer ‘conv_1’: 94 1 × 1 × 3 convolutions with stride [8 8] and padding [0 0 0 0]
- 3 layer ‘batchnorm_1’: Batch normalization with 94 channels
- 4 layer ‘relu_1’: ReLU
- 5 layer ‘conv_2’: 36 1 × 1 × 94 convolutions with stride [4 4] and padding [0 0 0 0]
- 6 layer ‘batchnorm_2’: Batch normalization with 36 channels
- 7 layer ‘relu_2’: ReLU
- 8 layer ‘conv_3’: 36 1 × 1 × 36 convolutions with stride [1 1] and padding [0 0 0 0]
- 9 layer ‘batchnorm_3’: Batch normalization with 36 channels
- 10 layer ‘relu_3’: ReLU
- 11 layer ‘fc_1’: 36 fully connected layer
- 12 layer ‘fc_2’: 8 fully connected layer, each node corresponding to a score class of 2–9
- 13 layer ‘softmax’: Softmax
- 14 layer ‘classoutput’: Classification Output
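Since every kernel in the Type c network is 1 × 1, the spatial size shrinks only through the strides; the standard convolution output formula, floor((in − k + 2p)/s) + 1, traces the sizes down the stack:

```python
# Output spatial size of a convolution with kernel k, stride s, padding p
def conv_out(size, k, s, p=0):
    return (size - k + 2 * p) // s + 1

size = 227                   # 'imageinput': 227 x 227 x 3
size = conv_out(size, 1, 8)  # 'conv_1': 1 x 1, stride [8 8] -> 29
size = conv_out(size, 1, 4)  # 'conv_2': 1 x 1, stride [4 4] -> 8
size = conv_out(size, 1, 1)  # 'conv_3': 1 x 1, stride [1 1] -> 8
print(size)  # 8: the fully connected layers see an 8 x 8 x 36 volume
```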
5.2. Extracting FFP and AIR
6. Conclusions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
CNN: convolutional neural network
RSRL: repetitively self-revised learning
D-measure: disentanglement measure
FD measure: combination of the F-measure with the D-measure
FFP: first fixation perspective
AIR: assessment interest region
Type a: fine-tuned AlexNet
Type b: changed AlexNet
Type c: only 1 × 1 convolutions CNN
© 2022 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dai, Y. Exploring Metrics to Establish an Optimal Model for Image Aesthetic Assessment and Analysis. J. Imaging 2022, 8, 85. https://doi.org/10.3390/jimaging8040085