Deep Learning Model with Transfer Learning to Infer Personal Preferences in Images
Abstract
Featured Application
1. Introduction
2. Proposed Model
2.1. Deep Convolutional Neural Networks
2.2. Transfer Learning
2.3. Grad-CAM
2.4. Three Image Databases
3. Experiment Results
4. Conclusions and Further Work
Author Contributions
Funding
Conflicts of Interest
References
| | Fashion-MNIST | LFW | Indoor Scene Recognition |
|---|---|---|---|
| Preferred | 57.92% (40,545/70,000) | 55.93% (5034/9000) | 54.97% (8246/15,000) |
| Non-preferred | 42.08% (29,455/70,000) | 44.07% (3966/9000) | 45.03% (6754/15,000) |
Fashion-MNIST preference classification results:

| Data Set | Model | TP | FP | FN | TN | Accuracy (TP + TN)/(TP + FP + FN + TN) |
|---|---|---|---|---|---|---|
| Training dataset | Model 1 | 31,889/60,000 | 2876/60,000 | 602/60,000 | 24,633/60,000 | 94.20% (56,522/60,000) |
| Training dataset | Model 2 | 25,512/50,000 | 3238/50,000 | 391/50,000 | 20,859/50,000 | 92.74% (46,371/50,000) |
| Test dataset | Model 1 | 5229/10,000 | 137/10,000 | 651/10,000 | 4083/10,000 | 93.12% (9312/10,000) |
| Test dataset | Model 2 | 5347/10,000 | 60/10,000 | 433/10,000 | 4160/10,000 | 95.07% (9507/10,000) |
| Validation dataset | Model 1 | 5280/10,000 | 53/10,000 | 755/10,000 | 3932/10,000 | 92.12% (9212/10,000) |
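The accuracy column in the table above follows directly from the confusion-matrix counts. A minimal sketch of the computation (plain Python; the function name is illustrative):

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Correct accuracy = (TP + TN) / (TP + FP + FN + TN)."""
    return (tp + tn) / (tp + fp + fn + tn)

# Model 1 on the Fashion-MNIST training set (counts from the table above)
print(f"{accuracy(31889, 2876, 602, 24633):.2%}")  # prints 94.20%
```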
LFW and Indoor Scene Recognition dataset classification results:

| Model | Dataset | TP | FP | FN | TN | Accuracy (TP + TN)/(TP + FP + FN + TN) |
|---|---|---|---|---|---|---|
| Model 1 | LFW | 3657/9000 | 780/9000 | 1377/9000 | 3186/9000 | 76.03% (6843/9000) |
| Model 1 | Indoor Scene Recognition | 6935/15,000 | 1105/15,000 | 1311/15,000 | 5649/15,000 | 83.89% (12,584/15,000) |
| Model 2 | LFW | 3643/9000 | 699/9000 | 1397/9000 | 3267/9000 | 76.78% (6910/9000) |
| Model 2 | Indoor Scene Recognition | 6608/15,000 | 1332/15,000 | 1638/15,000 | 5422/15,000 | 80.20% (12,030/15,000) |
LFW and Indoor Scene Recognition classification results through transfer learning:

| Dataset | Model | Set | TP | FP | FN | TN | Accuracy (TP + TN)/(TP + FP + FN + TN) |
|---|---|---|---|---|---|---|---|
| LFW | Model 1 | Training | 3823/8000 | 526/8000 | 652/8000 | 2999/8000 | 85.28% (6822/8000) |
| LFW | Model 2 | Training | 3811/8000 | 522/8000 | 664/8000 | 3003/8000 | 85.18% (6814/8000) |
| Indoor Scene Recognition | Model 1 | Training | 5988/12,000 | 575/12,000 | 499/12,000 | 4938/12,000 | 91.05% (10,926/12,000) |
| Indoor Scene Recognition | Model 2 | Training | 5912/12,000 | 447/12,000 | 375/12,000 | 5066/12,000 | 91.48% (10,978/12,000) |
| LFW | Model 1 | Test | 465/1000 | 70/1000 | 94/1000 | 371/1000 | 83.60% (836/1000) |
| LFW | Model 2 | Test | 451/1000 | 64/1000 | 108/1000 | 377/1000 | 82.80% (828/1000) |
| Indoor Scene Recognition | Model 1 | Test | 1599/3000 | 219/3000 | 160/3000 | 1022/3000 | 87.36% (2621/3000) |
| Indoor Scene Recognition | Model 2 | Test | 1585/3000 | 138/3000 | 174/3000 | 1103/3000 | 89.60% (2688/3000) |
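The transfer-learning gains in the table above come from reusing features learned on one domain and refitting only the classifier head on the target domain. A toy sketch of that "freeze features, refit head" step, using NumPy logistic regression on stand-in feature vectors (all names and data here are illustrative, not the paper's actual pipeline):

```python
import numpy as np

def refit_head(features, labels, lr=0.1, epochs=200):
    """Train a logistic-regression head on frozen feature vectors.

    features: (N, D) outputs of a frozen feature extractor
    labels:   (N,) binary preference labels in {0, 1}
    """
    rng = np.random.default_rng(0)
    w, b = rng.normal(0.0, 0.01, features.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # sigmoid
        grad = p - labels                              # dL/dlogit for BCE loss
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

# Two Gaussian clusters standing in for frozen CNN features of two classes
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-1, 1, (100, 8)), rng.normal(1, 1, (100, 8))])
y = np.concatenate([np.zeros(100), np.ones(100)])
w, b = refit_head(X, y)
acc = ((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
```

Only the head parameters `w` and `b` are updated; the feature extractor that produced `X` would stay frozen, which is what makes transfer learning cheap on small target datasets.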
Grad-CAM agreement with the subject's preference decisions (LFW test set: 1000; Indoor Scene Recognition test set: 3000):

| | LFW Preferred (465) | LFW Non-Preferred (371) | Indoor Preferred (1585) | Indoor Non-Preferred (1103) |
|---|---|---|---|---|
| Grad-CAM results match the subject's preference decision | 67.10% (312/465) | 64.15% (238/371) | 66.68% (1057/1585) | 60.74% (581/1103) |
| Grad-CAM results do not match the subject's preference decision | 32.90% (153/465) | 35.85% (133/371) | 33.32% (528/1585) | 39.26% (522/1103) |
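The table above compares Grad-CAM attention maps with the subjects' preference decisions. The core of Grad-CAM (Selvaraju et al.) is a ReLU-filtered, gradient-weighted sum of the last convolutional feature maps; a minimal NumPy sketch with random stand-in tensors (in practice the activations and gradients come from a trained CNN):

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """activations, gradients: (K, H, W) feature maps of the last conv layer
    and the gradients of the class score w.r.t. those maps."""
    # Global-average-pool the gradients: one importance weight per channel
    weights = gradients.mean(axis=(1, 2))                              # (K,)
    # Weighted sum over channels, then ReLU to keep positive evidence only
    cam = np.maximum((weights[:, None, None] * activations).sum(axis=0), 0.0)
    # Normalize to [0, 1] for display as a heat map
    return cam / cam.max() if cam.max() > 0 else cam

rng = np.random.default_rng(0)
A = rng.standard_normal((512, 14, 14))  # stand-in conv activations
G = rng.standard_normal((512, 14, 14))  # stand-in gradients
heatmap = grad_cam(A, G)                # (14, 14), values in [0, 1]
```

The resulting heat map is upsampled to the input resolution and overlaid on the image; here, the highlighted region is what was compared against the subject's stated preference cue.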
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Oh, J.; Kim, M.; Ban, S.-W. Deep Learning Model with Transfer Learning to Infer Personal Preferences in Images. Appl. Sci. 2020, 10, 7641. https://doi.org/10.3390/app10217641