A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition
Abstract
:1. Introduction
2. Related Works
2.1. Handcrafted Features for Recognizing Style and Artist
2.2. CNN-Based Features for Recognizing Style and Artist
2.2.1. Fusioned Features
2.2.2. CNN-Based Features
2.2.3. Gram Matrix
2.2.4. Features for Fine-Grained Region
2.3. CNN-Based Feature for Recognizing Artistic Media
3. Building Saliency-Based Patch Sampling Strategy from Human Study
3.1. Building the Distribution of Saliency Scores of Patches
3.1.1. Review of Saliency Estimation Schemes
3.1.2. Efficient Computation of Saliency Score of Patches
3.1.3. Building the Distribution of Saliency Scores
3.2. Building the Distribution of Saliency Scores of gtPatches
3.2.1. Capturing gtPatches from Mechanical Turks
3.2.2. Building the Distribution of Saliency Scores of gtPatches
3.3. Capturing Expected gtPatches from an Input Image
4. Structure of Our Classifier
5. Experiment
5.1. Data Collection
5.2. Experimental Setup
5.3. Training
5.4. Experiments
5.4.1. Experiment 1. Baseline Experiment
5.4.2. Experiment 2: Comparison on Patch Sampling Schemes
5.4.3. Experiment 3: Comparison on Recent Deep Learning-Based Recognition Methods
6. Analysis
- (RQ1) Our sampling strategy shows best accuracies on confusion matrices.
- (RQ2) Our sampling strategy shows a significant different accuracies than the other sampling strategies.
- (RQ3) Why pastel shows worst accuracies for four media recognition through three approaches.
- (RQ4) Three sampling strategies show consistent recognition and confusion patterns.
6.1. Analysis1: Analysis of the Performance Compared to the Sampling Strategies
6.2. Analysis2: Analysis of Statistically Significant Difference of the Accuracies
6.3. Analysis3: Analysis on the Poor Accuracy of Pastel
6.4. Analysis4: Analysis on the Consistency of the Sampling Strategies
6.4.1. Analysis on the Consistency for Confusion Pattern
- strong distinguishing:
- weak distinguishing:
- weak confusing:
- strong confusing:
- strong match: ’s from three datasets belong to the same confusion type.
- weak match: ’s from three datasets belong to the same side of confusing or distinguishing, but they can be either strong or weak.
- weak mismatch: ’s from three datasets belong to the opposite sides, but they are all weak distinguishing or confusing.
- strong mismatch: ’s from three datasets do not belong to the one of three above cases.
6.4.2. Analysis on the Consistency for Recognition Pattern
- strong unrecognizing:
- weak unrecognizing:
- weak recognizing:
- strong recognizing:
7. Conclusions and Future Work
Author Contributions
Funding
Conflicts of Interest
References
- Liu, G.; Yan, Y.; Ricci, E.; Yang, Y.; Han, Y.; Winkler, S.; Sebe, N. Inferring painting style with multi-task dictionary learning. In Proceedings of the International Conference on Artificial Intelligence 2015, Buenos Aires, Argentina, 25–31 July 2015; pp. 2162–2168. [Google Scholar]
- Florea, C.; Toca, C.; Gieseke, F. Artistic movement recognition by boosted fusion of color structure and topographic description. In Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), Santa Rosa, CA, USA, 24–31 March 2017; pp. 569–577. [Google Scholar]
- Liao, Z.; Gao, L.; Zhou, T.; Fan, X.; Zhang, Y.; Wu, J. An Oil Painters Recognition Method Based on luster Multiple Kernel Learning Algorithm. IEEE Access 2019, 7, 26842–26854. [Google Scholar] [CrossRef]
- Karayev, S.; Trentacoste, M.; Han, H.; Agarwala, A.; Darrell, T.; Hertzmann, A.; Winnemoeller, H. Recognizing image style. In Proceedings of the British Machine Vision Conference 2014, Nottingham, UK, 1–5 September 2014; pp. 1–20. [Google Scholar]
- Bar, Y.; Levy, N.; Wolf, L. Classification of artistic styles using binarized features derived from a deep neural network. In Proceedings of the Workshop at the European Conference on Computer Vision 2014, Zurich, Switzerland, 6–7 September 2014; pp. 71–84. [Google Scholar]
- Zhong, S.; Huang, X.; Xiao, Z. Fine-art painting classification via two-channel dual path networks. Int. J. Mach. Learn. Cybern. 2020, 11, 137–152. [Google Scholar] [CrossRef]
- Cetinic, E.; Lipic, T.; Grgic, S. Fine-tuning convolutional neural networks for fine art classification. Expert Syst. Appl. 2018, 114, 107–118. [Google Scholar] [CrossRef]
- van Noord, N.; Hendriks, E.; Postma, E. Toward Discovery of the Artist’s Style: Learning to recognize artists by their artworks. IEEE Signal Process. Mag. 2015, 32, 46–54. [Google Scholar] [CrossRef]
- Tan, W.; Chan, C.; Aguirre, H.; Tanaka, K. Ceci nést pas une pipe: A deep convolutional network for fine-art paintings classification. In Proceedings of the IEEE International Conference on Image Processing 2016, Phoenix, AZ, USA, 25–28 September 2016; pp. 3703–3707. [Google Scholar]
- Cetinic, E.; Grgic, S. Genre classification of paintings. In Proceedings of the International Symposium ELMAR 2016, Zadar, Croatia, 2–14 September 2016; pp. 201–204. [Google Scholar]
- Lecoutre, A.; Negrevergne, B.; Yger, F. Recognizing art styles automatically in painting with deep learning. In Proceedings of the Asian Conference on Machine Learning 2017, Seoul, Korea, 15–17 November 2017; pp. 327–342. [Google Scholar]
- Strezoski, G.; Worring, M. OmniArt: Multi-task deep learning for artistic data analysis. arXiv 2017, arXiv:1708.00684. [Google Scholar]
- Gatys, L.; Ecker, A.; Bethge, M. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Bhubaneswar, India, 27–28 July 2016; pp. 2414–2423. [Google Scholar]
- Sun, T.; Wang, Y.; Yang, J.; Hu, X. Convolution neural networks with two pathways for image style recognition. IEEE Trans. Image Process. 2017, 26, 4102–4113. [Google Scholar] [CrossRef] [PubMed]
- Chu, W.-T.; Wu, Y.-L. Deep correlation features for image style classification. In Proceedings of the 24th ACM International Conference on Multimedia, Amsterdam, The Netherlands, 15–19 October 2016; pp. 402–406. [Google Scholar]
- Chu, W.-T.; Wu, Y.-L. Image Style Classification based on Learnt Deep Correlation Features. IEEE Trans. Multimed. 2018, 20, 2491–2502. [Google Scholar] [CrossRef]
- Chen, L.; Yang, J. Recognizing the Style of Visual Arts via Adaptive Cross-layer Correlation. In Proceedings of the 27th ACM International Conference on Multimedia, Nice, France, 21–25 October 2019; pp. 2459–2467. [Google Scholar]
- Kong, S.; Shen, X.; Lin, Z.; Mech, R.; Fowlkes, C. Photo aesthetics ranking network with attributes and content adaptation. In Proceedings of the European Conference on Computer Vision 2016, Amsterdam, The Netherlands, 11–14 October 2016; pp. 662–679. [Google Scholar]
- Lu, X.; Lin, Z.; Jin, H.; Yang, J.; Wang, J.Z. RAPID: Rating pictorial aesthetics using deep learning. In Proceedings of the ACM International Conference on Multimedia 2014, Mountain View, CA, USA, 18–19 June 2014; pp. 457–466. [Google Scholar]
- Lu, X.; Lin, Z.; Shen, X.; Mech, R.; Wang, J.Z. Deep multi-patch aggregation network for image style, aesthetics, and quality estimation. In Proceedings of the IEEE International Conference on Computer Vision 2015, Santiago, Chile, 11–18 December 2015; pp. 990–998. [Google Scholar]
- Mai, L.; Jin, H.; Liu, F. Composition-preserving deep photo aesthetics assessment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 497–506. [Google Scholar]
- Anwer, R.; Khan, F.; Weijer, J.V.; Laaksonen, J. Combining holistic and part-based deep representations for computational painting categorization. In Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, New York, NY, USA, 6–9 June 2016; pp. 339–342. [Google Scholar]
- Yang, H.; Min, K. A Multi-Column Deep Framework for Recognizing Artistic Media. Electronics 2019, 8, 1277. [Google Scholar] [CrossRef] [Green Version]
- Sandoval, C.; Pirogova, E.; Lech, M. Two-Stage Deep Learning Approach to the Classification of Fine-Art Paintings. IEEE Access 2019, 7, 41770–41781. [Google Scholar] [CrossRef]
- Bianco, S.; Mazzini, D.; Napoletano, P.; Schettini, R. Multitask Painting Categorization by Deep Multibranch Neural Network. Expert Syst. Appl. 2019, 135, 90–101. [Google Scholar] [CrossRef] [Green Version]
- Mensink, T.; van Gemert, J. The Rijksmuseum Challenge: Museum-centered visual recognition. In Proceedings of the ACM International Conference on Multimedia Retrieval 2014, Glasgow, UK, 1–4 April 2014; p. 451. [Google Scholar]
- Mao, H.; Cheung, M.; She, J. DeepArt: Learning joint representations of visual arts. In Proceedings of the ACM International Conference on Multimedia 2017, Mountain View, CA, USA, 23–27 October 2017; pp. 1183–1191. [Google Scholar]
- Gupta, A.K.; Seal, A.; Prasad, M.; Khanna, P. Salient object detection techniques in computer vision—A survey. Entropy 2020, 20, 1174. [Google Scholar] [CrossRef] [PubMed]
- Cheng, M.-M.; Mitra, N.J.; Huang, X.; Torr, P.H.S.; Hu, S.-M. Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 37, 569–581. [Google Scholar] [CrossRef] [Green Version]
- Lu, S.; Mahadevan, V.; Vasconcelos, N. Learning optimal seeds for diffusion-based salient object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, Columbus, OH, USA, 23–28 June 2014; pp. 2790–2797. [Google Scholar]
- Zhang, J.; Sclaroff, S.; Lin, Z.; Shen, X.; Price, B.; Mech, R. Minimum barrier salient object detection at 80 fps. In Proceedings of the IEEE International Conference on Computer Vision 2015, Santiago, Chile, 7–13 December 2015; pp. 1404–1412. [Google Scholar]
- Zhu, W.; Liang, S.; Wei, Y.; Sun, J. Saliency optimization from robust background detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, OH, USA, 23–28 June 2014; pp. 2814–2821. [Google Scholar]
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. In Proceedings of the Advances in Neural Information Processing Systems 2012, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105. [Google Scholar]
- Simonyan, K.; Andrew, Z. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556. [Google Scholar]
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2015, Boston, MA, USA, 7–12 June 2015; pp. 1–9. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Huang, G.; Liu, Z.; Maaten, L.V.D.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708. [Google Scholar]
- Tan, M.; Le, Q. EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. In Proceedings of the International Conference on Machine Learning 2019, Long Beach, CA, USA, 9–15 June1 2019; pp. 6105–6114. [Google Scholar]
- Yang, H.; Min, K. A Deep Approach for Classifying Artistic Media from Artworks. KSII Trans. Internet Inf. Syst. 2019, 13, 2558–2573. [Google Scholar]
Module | Time per Epoch (min) | ||
---|---|---|---|
YMSet+ | Wiki4 | Wiki10 | |
AlexNet | |||
VGGNet | |||
GoogLeNet | |||
ResNet | |||
DenseNet | |||
EfficientNet |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Yang, H.; Min, K. A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition. Electronics 2021, 10, 1053. https://doi.org/10.3390/electronics10091053
Yang H, Min K. A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition. Electronics. 2021; 10(9):1053. https://doi.org/10.3390/electronics10091053
Chicago/Turabian StyleYang, Heekyung, and Kyungha Min. 2021. "A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition" Electronics 10, no. 9: 1053. https://doi.org/10.3390/electronics10091053
APA StyleYang, H., & Min, K. (2021). A Saliency-Based Patch Sampling Approach for Deep Artistic Media Recognition. Electronics, 10(9), 1053. https://doi.org/10.3390/electronics10091053