Few-Shot Fine-Grained Image Classification: A Comprehensive Review
Abstract
1. Introduction
2. Problem, Datasets, and Categorization of FSFGIC Methods
2.1. Problem Formulation
2.2. A Taxonomy of the Existing Feature Representation Learning for FSFGIC
2.3. Benchmark Datasets
3. Methods on FSFGIC
3.1. Data Augmentation Techniques for FSFGIC
3.2. Local and/or Global Deep Feature Representation Learning Based FSFGIC Methods
3.2.1. Optimization-Based Local and/or Global Deep Feature Representation Learning
3.2.2. Metric-Based Local and/or Global Deep Feature Representation Learning
3.3. Class Representation Learning Based FSFGIC Methods
3.3.1. Optimization-Based Class Representation Learning
3.3.2. Metric-Based Class Representation Learning
3.4. Task-Specific Feature Representation Learning Based FSFGIC Methods
3.4.1. Optimization-Based Task-Specific Feature Representation Learning
3.4.2. Metric-Based Task-Specific Feature Representation Learning
3.5. Comparison of Experimental Results
4. Summary and Discussions
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhang, Y.; Tang, H.; Jia, K. Fine-grained visual categorization using meta-learning optimization with sample selection of auxiliary data. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 233–248. [Google Scholar]
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-Ucsd Birds-200-2011 Dataset; California Institute of Technology: Pasadena, CA, USA, 2011. [Google Scholar]
- Nilsback, M.E.; Zisserman, A. Automated flower classification over a large number of classes. In Proceedings of the Indian Conference on Computer Vision, Graphics & Image Processing, Bhubaneswar, India, 16–19 December 2008; pp. 722–729. [Google Scholar]
- Maji, S.; Rahtu, E.; Kannala, J.; Blaschko, M.; Vedaldi, A. Fine-grained visual classification of aircraft. arXiv 2013, arXiv:1306.5151. [Google Scholar]
- Smith, L.B.; Slone, L.K. A developmental approach to machine learning? Front. Psychol. 2017, 8, 2124. [Google Scholar] [CrossRef]
- Zhu, Y.; Liu, C.; Jiang, S. Multi-attention Meta Learning for Few-shot Fine-grained Image Recognition. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 11–17 July 2020; pp. 1090–1096. [Google Scholar]
- Li, W.; Wang, L.; Xu, J.; Huo, J.; Gao, Y.; Luo, J. Revisiting local descriptor based image-to-class measure for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7260–7268. [Google Scholar]
- Dong, C.; Li, W.; Huo, J.; Gu, Z.; Gao, Y. Learning task-aware local representations for few-shot learning. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, Yokohama, Japan, 11–17 July 2020; pp. 716–722. [Google Scholar]
- Cao, S.; Wang, W.; Zhang, J.; Zheng, M.; Li, Q. A few-shot fine-grained image classification method leveraging global and local structures. Int. J. Mach. Learn. Cybern. 2022, 13, 2273–2281. [Google Scholar] [CrossRef]
- Abdelaziz, M.; Zhang, Z. Learn to aggregate global and local representations for few-shot learning. Multimed. Tools Appl. 2023, 82, 32991–33014. [Google Scholar] [CrossRef]
- Zhu, H.; Koniusz, P. EASE: Unsupervised discriminant subspace learning for transductive few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9068–9078. [Google Scholar]
- Li, Y.; Bian, C.; Chen, H. Generalized ridge regression-based channelwise feature map weighted reconstruction network for fine-grained few-shot ship classification. IEEE Trans. Geosci. Remote. Sens. 2023, 61, 1–10. [Google Scholar] [CrossRef]
- Hu, Z.; Shen, L.; Lai, S.; Yuan, C. Task-adaptive Feature Disentanglement and Hallucination for Few-shot Classification. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 3638–3648. [Google Scholar] [CrossRef]
- Zhou, Z.; Luo, L.; Zhou, S.; Li, W.; Yang, X.; Liu, X.; Zhu, E. Task-Related Saliency for Few-Shot Image Classification. IEEE Trans. Neural Netw. Learn. Syst. 2023, early access. [CrossRef] [PubMed]
- Chen, C.; Yang, X.; Xu, C.; Huang, X.; Ma, Z. Eckpn: Explicit class knowledge propagation network for transductive few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 6596–6605. [Google Scholar]
- Guo, Y.; Ma, Z.; Li, X.; Dong, Y. Atrm: Attention-based task-level relation module for gnn-based fewshot learning. arXiv 2021, arXiv:2101.09840. [Google Scholar]
- Shen, Z.; Liu, Z.; Qin, J.; Savvides, M.; Cheng, K.T. Partial is better than all: Revisiting fine-tuning strategy for few-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 9594–9602. [Google Scholar]
- Shi, B.; Li, W.; Huo, J.; Zhu, P.; Wang, L.; Gao, Y. Global-and local-aware feature augmentation with semantic orthogonality for few-shot image classification. Pattern Recognit. 2023, 142, 109702. [Google Scholar] [CrossRef]
- Jiang, Z.; Kang, B.; Zhou, K.; Feng, J. Few-shot classification via adaptive attention. arXiv 2020, arXiv:2008.02465. [Google Scholar]
- Song, H.; Deng, B.; Pound, M.; Özcan, E.; Triguero, I. A fusion spatial attention approach for few-shot learning. Inf. Fusion 2022, 81, 187–202. [Google Scholar] [CrossRef]
- Huang, X.; Choi, S.H. Sapenet: Self-attention based prototype enhancement network for few-shot learning. Pattern Recognit. 2023, 135, 109170. [Google Scholar] [CrossRef]
- Zhang, C.; Cai, Y.; Lin, G.; Shen, C. Deepemd: Few-shot image classification with differentiable earth mover’s distance and structured classifiers. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12200–12210. [Google Scholar]
- Wu, H.; Zhao, Y.; Li, J. Selective, structural, subtle: Trilinear spatial-awareness for few-shot fine-grained visual recognition. In Proceedings of the IEEE International Conference on Multimedia and Expo, Shenzhen, China, 5–9 July 2021; pp. 1–6. [Google Scholar]
- Liu, Y.; Zhu, L.; Wang, X.; Yamada, M.; Yang, Y. Bilaterally normalized scale-consistent sinkhorn distance for few-shot image classification. IEEE Trans. Neural Netw. Learn. Syst. 2023, early access. [CrossRef]
- Zhao, J.; Lin, X.; Zhou, J.; Yang, J.; He, L.; Yang, Z. Knowledge-based fine-grained classification for few-shot learning. In Proceedings of the IEEE International Conference on Multimedia and Expo, London, UK, 6–10 July 2020; pp. 1–6. [Google Scholar]
- Sun, X.; Xv, H.; Dong, J.; Zhou, H.; Chen, C.; Li, Q. Few-shot learning for domain-specific fine-grained image classification. IEEE Trans. Ind. Electron. 2020, 68, 3588–3598. [Google Scholar] [CrossRef]
- Huang, H.; Zhang, J.; Zhang, J.; Wu, Q.; Xu, J. Compare more nuanced: Pairwise alignment bilinear network for few-shot fine-grained learning. In Proceedings of the IEEE International Conference on Multimedia and Expo, Shanghai, China, 8–12 July 2019; pp. 91–96. [Google Scholar]
- Zheng, Z.; Feng, X.; Yu, H.; Li, X.; Gao, M. BDLA: Bi-directional local alignment for few-shot learning. Appl. Intell. 2023, 53, 769–785. [Google Scholar] [CrossRef]
- Ruan, X.; Lin, G.; Long, C.; Lu, S. Few-shot fine-grained classification with spatial attentive comparison. Knowl.-Based Syst. 2021, 218, 106840. [Google Scholar] [CrossRef]
- Chen, Y.; Zheng, Y.; Xu, Z.; Tang, T.; Tang, Z.; Chen, J.; Liu, Y. Cross-domain few-shot classification based on lightweight Res2Net and flexible GNN. Knowl.-Based Syst. 2022, 247, 108623. [Google Scholar] [CrossRef]
- Zhang, H.; Torr, P.; Koniusz, P. Few-shot learning with multi-scale self-supervision. arXiv 2020, arXiv:2001.01600. [Google Scholar]
- Wei, X.S.; Wang, P.; Liu, L.; Shen, C.; Wu, J. Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples. IEEE Trans. Image Process. 2019, 28, 6116–6125. [Google Scholar] [CrossRef] [PubMed]
- Park, S.J.; Han, S.; Baek, J.W.; Kim, I.; Song, J.; Lee, H.B.; Han, J.J.; Hwang, S.J. Meta variance transfer: Learning to augment from the others. In Proceedings of the International Conference on Machine Learning, Virtually, 12–18 July 2020; pp. 7510–7520. [Google Scholar]
- Yang, L.; Li, L.; Zhang, Z.; Zhou, X.; Zhou, E.; Liu, Y. DPGN: Distribution propagation graph network for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 13390–13399. [Google Scholar]
- Qi, H.; Brown, M.; Lowe, D.G. Low-shot learning with imprinted weights. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5822–5830. [Google Scholar]
- Hu, Y.; Gripon, V.; Pateux, S. Leveraging the feature distribution in transfer-based few-shot learning. In Proceedings of the International Conference on Artificial Neural Networks, Bratislava, Slovakia, 14–17 September 2021; pp. 487–499. [Google Scholar]
- Liu, X.; Zhou, K.; Yang, P.; Jing, L.; Yu, J. Adaptive distribution calibration for few-shot learning via optimal transport. Inf. Sci. 2022, 611, 1–17. [Google Scholar] [CrossRef]
- Karlinsky, L.; Shtok, J.; Harary, S.; Schwartz, E.; Aides, A.; Feris, R.; Giryes, R.; Bronstein, A.M. Repmet: Representative-based metric learning for classification and few-shot object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5197–5206. [Google Scholar]
- Wertheimer, D.; Tang, L.; Hariharan, B. Few-shot classification with feature map reconstruction networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8012–8021. [Google Scholar]
- Zhang, W.; Zhao, Y.; Gao, Y.; Sun, C. Re-abstraction and perturbing support pair network for few-shot fine-grained image classification. Pattern Recognit. 2023, 148, 110158. [Google Scholar] [CrossRef]
- He, X.; Lin, J.; Shen, J. Weakly-supervised Object Localization for Few-shot Learning and Fine-grained Few-shot Learning. arXiv 2020, arXiv:2003.00874. [Google Scholar]
- Doersch, C.; Gupta, A.; Zisserman, A. Crosstransformers: Spatially-aware few-shot transfer. Adv. Neural Inf. Process. Syst. 2020, 33, 21981–21993. [Google Scholar]
- Huang, H.; Wu, Z.; Li, W.; Huo, J.; Gao, Y. Local descriptor-based multi-prototype network for few-shot learning. Pattern Recognit. 2021, 116, 107935. [Google Scholar] [CrossRef]
- Li, X.; Wu, J.; Sun, Z.; Ma, Z.; Cao, J.; Xue, J.H. BSNet: Bi-similarity network for few-shot fine-grained image classification. IEEE Trans. Image Process. 2020, 30, 1318–1331. [Google Scholar] [CrossRef] [PubMed]
- Zhu, P.; Gu, M.; Li, W.; Zhang, C.; Hu, Q. Progressive point to set metric learning for semi-supervised few-shot classification. In Proceedings of the IEEE International Conference on Image Processing, Abu Dhabi, United Arab Emirates, 25–28 October 2020; pp. 196–200. [Google Scholar]
- Hao, F.; He, F.; Cheng, J.; Wang, L.; Cao, J.; Tao, D. Collect and select: Semantic alignment metric learning for few-shot learning. In Proceedings of the IEEE international Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8460–8469. [Google Scholar]
- Huang, H.; Zhang, J.; Zhang, J.; Xu, J.; Wu, Q. Low-rank pairwise alignment bilinear network for few-shot fine-grained image classification. IEEE Trans. Multimed. 2020, 23, 1666–1680. [Google Scholar] [CrossRef]
- Li, Y.; Li, H.; Chen, H.; Chen, C. Hierarchical representation based query-specific prototypical network for few-shot image classification. arXiv 2021, arXiv:2103.11384. [Google Scholar]
- Pahde, F.; Puscas, M.; Klein, T.; Nabi, M. Multimodal prototypical networks for few-shot learning. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Virtually, 5–9 January 2021; pp. 2644–2653. [Google Scholar]
- Huang, S.; Zhang, M.; Kang, Y.; Wang, D. Attributes-guided and pure-visual attention alignment for few-shot recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 2–9 February 2021; Volume 35, pp. 7840–7847. [Google Scholar]
- Wang, R.; Zheng, H.; Duan, X.; Liu, J.; Lu, Y.; Wang, T.; Xu, S.; Zhang, B. Few-Shot Learning with Visual Distribution Calibration and Cross-Modal Distribution Alignment. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 23445–23454. [Google Scholar]
- Achille, A.; Lam, M.; Tewari, R.; Ravichandran, A.; Maji, S.; Fowlkes, C.C.; Soatto, S.; Perona, P. Task2vec: Task embedding for meta-learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 6430–6439. [Google Scholar]
- Lee, H.B.; Lee, H.; Na, D.; Kim, S.; Park, M.; Yang, E.; Hwang, S.J. Learning to balance: Bayesian meta-learning for imbalanced and out-of-distribution tasks. arXiv 2019, arXiv:1905.12917. [Google Scholar]
- He, Y.; Liang, W.; Zhao, D.; Zhou, H.Y.; Ge, W.; Yu, Y.; Zhang, W. Attribute surrogates learning and spectral tokens pooling in transformers for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9119–9129. [Google Scholar]
- Peng, S.; Song, W.; Ester, M. Combining domain-specific meta-learners in the parameter space for cross-domain few-shot classification. arXiv 2020, arXiv:2011.00179. [Google Scholar]
- Perrett, T.; Masullo, A.; Burghardt, T.; Mirmehdi, M.; Damen, D. Meta-learning with context-agnostic initialisations. In Proceedings of the Asian Conference on Computer Vision, Virtually, 30 November–4 December 2020; pp. 70–86. [Google Scholar]
- Li, W.; Xu, J.; Huo, J.; Wang, L.; Gao, Y.; Luo, J. Distribution consistency based covariance metric networks for few-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Honolulu, HI, USA, 27 January–1 February 2019; Volume 33, pp. 8642–8649. [Google Scholar]
- Tseng, H.Y.; Lee, H.Y.; Huang, J.B.; Yang, M.H. Cross-domain few-shot classification via learned feature-wise transformation. arXiv 2020, arXiv:2001.08735. [Google Scholar]
- Lee, D.H.; Chung, S.Y. Unsupervised embedding adaptation via early-stage feature reconstruction for few-shot classification. In Proceedings of the International Conference on Machine Learning, Virtually, 18–24 July 2021; pp. 6098–6108. [Google Scholar]
- Xue, Z.; Duan, L.; Li, W.; Chen, L.; Luo, J. Region comparison network for interpretable few-shot image classification. arXiv 2020, arXiv:2009.03558. [Google Scholar]
- Liu, Y.; Zheng, T.; Song, J.; Cai, D.; He, X. Dmn4: Few-shot learning via discriminative mutual nearest neighbor neural network. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtually, 22 February–1 March 2022; Volume 36, pp. 1828–1836. [Google Scholar]
- Li, X.; Yu, L.; Fu, C.W.; Fang, M.; Heng, P.A. Revisiting metric learning for few-shot image classification. Neurocomputing 2020, 406, 49–58. [Google Scholar] [CrossRef]
- Welinder, P.; Branson, S.; Mita, T.; Wah, C.; Schroff, F.; Belongie, S.; Perona, P. Caltech-UCSD Birds 200; California Institute of Technology: Pasadena, CA, USA, 2010. [Google Scholar]
- Khosla, A.; Jayadevaprakash, N.; Yao, B.; Li, F.F. Novel dataset for fine-grained image categorization: Stanford dogs. In Proceedings of the CVPR Workshop on Fine-Grained Visual Categorization, Colorado Springs, CO, USA, 20–25 June 2011; Citeseer: State College, PA, USA, 2011; Volume 2. [Google Scholar]
- Krause, J.; Stark, M.; Deng, J.; Fei-Fei, L. 3D object representations for fine-grained categorization. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, NSW, Australia, 1–8 December 2013; pp. 554–561. [Google Scholar]
- Van Horn, G.; Branson, S.; Farrell, R.; Haber, S.; Barry, J.; Ipeirotis, P.; Perona, P.; Belongie, S. Building a bird recognition APP and large scale dataset with citizen scientists: The fine print in fine-grained dataset collection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 595–604. [Google Scholar]
- Xiao, J.; Hays, J.; Ehinger, K.A.; Oliva, A.; Torralba, A. Sun database: Large-scale scene recognition from abbey to zoo. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 3485–3492. [Google Scholar]
- Yu, X.; Zhao, Y.; Gao, Y.; Xiong, S.; Yuan, X. Patchy image structure classification using multi-orientation region transform. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 12741–12748. [Google Scholar]
- Afrasiyabi, A.; Lalonde, J.F.; Gagné, C. Associative alignment for few-shot image classification. In Proceedings of the European Conference on Computer Vision, Virtually, 23–28 August 2020; pp. 18–35. [Google Scholar]
- Hilliard, N.; Phillips, L.; Howland, S.; Yankov, A.; Corley, C.D.; Hodas, N.O. Few-shot learning with metric-agnostic conditional embeddings. arXiv 2018, arXiv:1802.04376. [Google Scholar]
- Zhang, M.; Wang, D.; Gai, S. Knowledge distillation for model-agnostic meta-learning. In Proceedings of the 24th European Conference on Artificial Intelligence, Virtually, 29 August–8 September 2020; pp. 1355–1362. [Google Scholar]
- Pahde, F.; Nabi, M.; Klein, T.; Jahnichen, P. Discriminative hallucination for multi-modal few-shot learning. In Proceedings of the IEEE International Conference on Image Processing, Athens, Greece, 7–10 October 2018; pp. 156–160. [Google Scholar]
- Xian, Y.; Sharma, S.; Schiele, B.; Akata, Z. f-vaegan-d2: A feature generating framework for any-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 10275–10284. [Google Scholar]
- Xu, J.; Le, H.; Huang, M.; Athar, S.; Samaras, D. Variational feature disentangling for fine-grained few-shot classification. In Proceedings of the IEEE International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 8812–8821. [Google Scholar]
- Luo, Q.; Wang, L.; Lv, J.; Xiang, S.; Pan, C. Few-shot learning via feature hallucination with variational inference. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Virtually, 5–9 January 2021; pp. 3963–3972. [Google Scholar]
- Tsutsui, S.; Fu, Y.; Crandall, D. Meta-reinforced synthetic data for one-shot fine-grained visual recognition. arXiv 2019, arXiv:1911.07164. [Google Scholar]
- Pahde, F.; Jähnichen, P.; Klein, T.; Nabi, M. Cross-modal hallucination for few-shot fine-grained recognition. arXiv 2018, arXiv:1806.05147. [Google Scholar]
- Wang, Y.; Xu, C.; Liu, C.; Zhang, L.; Fu, Y. Instance credibility inference for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 12836–12845. [Google Scholar]
- Chen, M.; Fang, Y.; Wang, X.; Luo, H.; Geng, Y.; Zhang, X.; Huang, C.; Liu, W.; Wang, B. Diversity transfer network for few-shot learning. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 10559–10566. [Google Scholar]
- Schwartz, E.; Karlinsky, L.; Shtok, J.; Harary, S.; Marder, M.; Kumar, A.; Feris, R.; Giryes, R.; Bronstein, A. Delta-encoder: An effective sample synthesis method for few-shot object recognition. Adv. Neural Inf. Process. Syst. 2018, 31, 2850–2860. [Google Scholar]
- Wang, C.; Song, S.; Yang, Q.; Li, X.; Huang, G. Fine-grained few shot learning with foreground object transformation. Neurocomputing 2021, 466, 16–26. [Google Scholar] [CrossRef]
- Lupyan, G.; Ward, E.J. Language can boost otherwise unseen objects into visual awareness. Proc. Natl. Acad. Sci. USA 2013, 110, 14196–14201. [Google Scholar] [CrossRef] [PubMed]
- Tokmakov, P.; Wang, Y.X.; Hebert, M. Learning compositional representations for few-shot recognition. In Proceedings of the IEEE International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6372–6381. [Google Scholar]
- Chen, W.Y.; Liu, Y.C.; Kira, Z.; Wang, Y.C.F.; Huang, J.B. A closer look at few-shot classification. arXiv 2019, arXiv:1904.04232. [Google Scholar]
- Snell, J.; Swersky, K.; Zemel, R. Prototypical networks for few-shot learning. Adv. Neural Inf. Process. Syst. 2017, 30, 4080–4090. [Google Scholar]
- Flores, C.F.; Gonzalez-Garcia, A.; van de Weijer, J.; Raducanu, B. Saliency for fine-grained object recognition in domains with scarce training data. Pattern Recognit. 2019, 94, 62–73. [Google Scholar] [CrossRef]
- Tavakoli, H.R.; Borji, A.; Laaksonen, J.; Rahtu, E. Exploiting inter-image similarity and ensemble of extreme learners for fixation prediction using deep features. Neurocomputing 2017, 244, 10–18. [Google Scholar] [CrossRef]
- Zhang, X.; Wei, Y.; Feng, J.; Yang, Y.; Huang, T.S. Adversarial complementary learning for weakly supervised object localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 1325–1334. [Google Scholar]
- Liao, Y.; Zhang, W.; Gao, Y.; Sun, C.; Yu, X. ASRSNet: Automatic Salient Region Selection Network for Few-Shot Fine-Grained Image Classification. In Proceedings of the International Conference on Pattern Recognition and Artificial Intelligence, Paris, France, 1–3 June 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 627–638. [Google Scholar]
- Chen, Q.; Yang, R. Learning to distinguish: A general method to improve compare-based one-shot learning frameworks for similar classes. In Proceedings of the IEEE International Conference on Multimedia and Expo, Shanghai, China, 8–12 July 2019; pp. 952–957. [Google Scholar]
- Huynh, D.; Elhamifar, E. Compositional fine-grained low-shot learning. arXiv 2021, arXiv:2105.10438. [Google Scholar]
- Zhang, W.; Sun, C. Corner detection using second-order generalized Gaussian directional derivative representations. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 43, 1213–1224. [Google Scholar] [CrossRef] [PubMed]
- Zhang, W.; Sun, C.; Gao, Y. Image intensity variation information for interest point detection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 9883–9894. [Google Scholar] [CrossRef] [PubMed]
- Jing, J.; Liu, S.; Wang, G.; Zhang, W.; Sun, C. Recent advances on image edge detection: A comprehensive review. Neurocomputing 2022, 503, 259–271. [Google Scholar] [CrossRef]
- Zhang, W.; Zhao, Y.; Breckon, T.P.; Chen, L. Noise robust image edge detection based upon the automatic anisotropic Gaussian kernels. Pattern Recognit. 2017, 63, 193–205. [Google Scholar] [CrossRef]
- Jing, J.; Gao, T.; Zhang, W.; Gao, Y.; Sun, C. Image feature information extraction for interest point detection: A comprehensive review. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 4694–4712. [Google Scholar] [CrossRef]
- Zhang, W.; Sun, C.; Breckon, T.; Alshammari, N. Discrete curvature representations for noise robust image corner detection. IEEE Trans. Image Process. 2019, 28, 4444–4459. [Google Scholar] [CrossRef]
- Zhang, W.; Sun, C. Corner detection using multi-directional structure tensor with multiple scales. Int. J. Comput. Vis. 2020, 128, 438–459. [Google Scholar] [CrossRef]
- Shui, P.L.; Zhang, W.C. Corner detection and classification using anisotropic directional derivative representations. IEEE Trans. Image Process. 2013, 22, 3204–3218. [Google Scholar] [CrossRef] [PubMed]
- He, J.; Kortylewski, A.; Yuille, A. CORL: Compositional representation learning for few-shot classification. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 3890–3899. [Google Scholar]
- Arjovsky, M.; Bottou, L. Towards principled methods for training generative adversarial networks. arXiv 2017, arXiv:1701.04862. [Google Scholar]
- Xian, Y.; Lorenz, T.; Schiele, B.; Akata, Z. Feature generating networks for zero-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5542–5551. [Google Scholar]
- Verma, V.K.; Arora, G.; Mishra, A.; Rai, P. Generalized zero-shot learning via synthesized examples. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4281–4289. [Google Scholar]
- Das, D.; Moon, J.; George Lee, C. Few-shot image recognition with manifolds. In Proceedings of the Advances in Visual Computing: International Symposium, San Diego, CA, USA, 5–7 October 2020; pp. 3–14. [Google Scholar]
- Lyu, Q.; Wang, W. Compositional Prototypical Networks for Few-Shot Classification. arXiv 2023, arXiv:2306.06584. [Google Scholar] [CrossRef]
- Luo, X.; Chen, Y.; Wen, L.; Pan, L.; Xu, Z. Boosting few-shot classification with view-learnable contrastive learning. In Proceedings of the IEEE International Conference on Multimedia and Expo, Shenzhen, China, 5–9 July 2021; pp. 1–6. [Google Scholar]
- Chen, X.; Wang, G. Few-shot learning by integrating spatial and frequency representation. In Proceedings of the Conference on Robots and Vision, Burnaby, BC, Canada, 26–28 May 2021; pp. 49–56. [Google Scholar]
- Ji, Z.; Chai, X.; Yu, Y.; Pang, Y.; Zhang, Z. Improved prototypical networks for few-shot learning. Pattern Recognit. Lett. 2020, 140, 81–87. [Google Scholar] [CrossRef]
- Hu, Y.; Pateux, S.; Gripon, V. Squeezing backbone feature distributions to the max for efficient few-shot learning. Algorithms 2022, 15, 147. [Google Scholar] [CrossRef]
- Chobola, T.; Vašata, D.; Kordík, P. Transfer learning based few-shot classification using optimal transport mapping from preprocessed latent space of backbone neural network. In Proceedings of the AAAI Workshop on Meta-Learning and MetaDL Challenge, Virtually, 9 February 2021; pp. 29–37. [Google Scholar]
- Zagoruyko, S.; Komodakis, N. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer. arXiv 2016, arXiv:1612.03928. [Google Scholar]
- Yang, X.; Nan, X.; Song, B. D2N4: A discriminative deep nearest neighbor neural network for few-shot space target recognition. IEEE Trans. Geosci. Remote. Sens. 2020, 58, 3667–3676. [Google Scholar] [CrossRef]
- Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 499–515. [Google Scholar]
- Simon, C.; Koniusz, P.; Nock, R.; Harandi, M. Adaptive subspaces for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 4136–4145. [Google Scholar]
- Triantafillou, E.; Zemel, R.; Urtasun, R. Few-shot learning through an information retrieval lens. Adv. Neural Inf. Process. Syst. 2017, 30, 2252–2262. [Google Scholar]
- Liu, B.; Cao, Y.; Lin, Y.; Li, Q.; Zhang, Z.; Long, M.; Hu, H. Negative margin matters: Understanding margin in few-shot classification. In Proceedings of the European Conference on Computer Vision, Virtually, 23–28 August 2020; pp. 438–455. [Google Scholar]
- Woo, S.; Park, J.; Lee, J.Y.; Kweon, I.S. Cbam: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018; pp. 3–19. [Google Scholar]
- Gu, Q.; Luo, Z.; Zhu, Y. A Two-Stream Network with Image-to-Class Deep Metric for Few-Shot Classification. In Proceedings of ECAI 2020, Santiago de Compostela, Spain, 29 August–8 September 2020; pp. 2704–2711. [Google Scholar]
- Zhang, B.; Li, X.; Ye, Y.; Huang, Z.; Zhang, L. Prototype completion with primitive knowledge for few-shot learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3754–3762. [Google Scholar]
- Jaakkola, T.; Haussler, D. Exploiting generative models in discriminative classifiers. Adv. Neural Inf. Process. Syst. 1998, 11, 487–493. [Google Scholar]
- Finn, C.; Abbeel, P.; Levine, S. Model-agnostic meta-learning for fast adaptation of deep networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 1126–1135. [Google Scholar]
- Wang, J.; Wu, J.; Bai, H.; Cheng, J. M-nas: Meta neural architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; Volume 34, pp. 6186–6193. [Google Scholar]
- Tseng, H.Y.; Chen, Y.W.; Tsai, Y.H.; Liu, S.; Lin, Y.Y.; Yang, M.H. Regularizing meta-learning via gradient dropout. In Proceedings of the Asian Conference on Computer Vision, Virtually, 30 November–4 December 2020. [Google Scholar]
- Zhou, F.; Wu, B.; Li, Z. Deep meta-learning: Learning to learn in the concept space. arXiv 2018, arXiv:1802.03596. [Google Scholar]
- Tian, P.; Li, W.; Gao, Y. Consistent meta-regularization for better meta-knowledge in few-shot learning. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 7277–7288. [Google Scholar] [CrossRef] [PubMed]
- Antoniou, A.; Storkey, A.J. Learning to learn by self-critique. Adv. Neural Inf. Process. Syst. 2019, 32. [Google Scholar]
- Gowda, K.; Krishna, G. The condensed nearest neighbor rule using the concept of mutual nearest neighborhood. IEEE Trans. Inf. Theory 1979, 25, 488–490. [Google Scholar] [CrossRef]
- Ye, M.; Guo, Y. Deep triplet ranking networks for one-shot recognition. arXiv 2018, arXiv:1804.07275. [Google Scholar]
- Li, X.; Song, Q.; Wu, J.; Zhu, R.; Ma, Z.; Xue, J.H. Locally-Enriched Cross-Reconstruction for Few-Shot Fine-Grained Image Classification. IEEE Trans. Circuits Syst. Video Technol. 2023, 33, 7530–7540. [Google Scholar] [CrossRef]
- Huang, H.; Zhang, J.; Yu, L.; Zhang, J.; Wu, Q.; Xu, C. TOAN: Target-oriented alignment network for fine-grained image categorization with few labeled samples. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 853–866. [Google Scholar] [CrossRef]
- Zhou, X.; Zhang, Y.; Wei, Q. Few-Shot Fine-Grained Image Classification via GNN. Sensors 2022, 22, 7640. [Google Scholar] [CrossRef]
- Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D. Matching networks for one shot learning. Adv. Neural Inf. Process. Syst. 2016, 29, 3637–3645. [Google Scholar]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Liu, Y.; Bai, Y.; Che, X.; He, J. Few-Shot Fine-Grained Image Classification: A Survey. In Proceedings of the 2022 4th International Conference on Natural Language Processing (ICNLP), Xi’an, China, 25–27 March 2022; pp. 201–211. [Google Scholar]
Dataset | Object Type | # Images | # Categories
---|---|---|---
CUB-200-2010 [63] | Birds | 6033 | 200
CUB-200-2011 [2] | Birds | 11,788 | 200
Stanford Dogs [64] | Dogs | 20,580 | 120
Stanford Cars [65] | Cars | 16,185 | 196
FGVC-Aircraft [4] | Aircraft | 10,000 | 100
NABirds [66] | Birds | 48,562 | 555
SUN397 [67] | Scenes | 108,754 | 397
Oxford 102 Flowers [3] | Flowers | 8189 | 102
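The 1-shot and 5-shot accuracies reported on these benchmarks follow the standard N-way K-shot episodic protocol: each evaluation episode samples N classes, K labeled support images per class, and a disjoint set of query images per class. A minimal sketch of such an episode sampler, with illustrative function and variable names (not from any specific paper in this review):

```python
import random

def sample_episode(images_by_class, n_way=5, k_shot=1, q_queries=16, seed=0):
    """Sample one N-way K-shot episode from a class-indexed image list.

    Returns per-class support and query image ids; the two sets are
    disjoint, as required by the episodic evaluation protocol.
    """
    rng = random.Random(seed)
    classes = rng.sample(sorted(images_by_class), n_way)
    support, query = {}, {}
    for c in classes:
        idx = rng.sample(images_by_class[c], k_shot + q_queries)
        support[c], query[c] = idx[:k_shot], idx[k_shot:]
    return support, query

# Toy index standing in for a fine-grained dataset:
# 10 classes with 30 image ids each.
toy = {c: list(range(c * 100, c * 100 + 30)) for c in range(10)}
sup, qry = sample_episode(toy, n_way=5, k_shot=1, q_queries=16)
```

Reported accuracies are then averaged over many such randomly sampled episodes.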
Category | Type | Methods | Published in | Backbone | CUB-2010 1-shot | CUB-2010 5-shot | CUB-2011 1-shot | CUB-2011 5-shot | Dogs 1-shot | Dogs 5-shot | Cars 1-shot | Cars 5-shot
---|---|---|---|---|---|---|---|---|---|---|---|---
LG 1 | O 4 | MattML [6] | IJCAI 2020 | Conv-64F | - | - | 66.29 | 80.34 | 54.84 | 71.34 | 66.11 | 82.80
 | | P-Transfer [17] | AAAI 2021 | ResNet-12 | - | - | 73.88 | 87.81 | - | - | - | -
 | | GLFA [18] | PR 2023 | ResNet-12 | - | - | 76.52 | 90.27 | - | - | - | -
 | M 5 | PABN [27] | ICME 2019 | Bilinear CNN | - | - | 66.71 | 76.81 | 55.47 | 66.65 | 56.80 | 68.78
 | | DeepEMD [22] | CVPR 2020 | ResNet-12 | - | - | 75.65 | 88.69 | - | - | - | -
 | | Adaptive Attention [19] | arXiv 2020 | Conv-64F | 64.51 | 78.62 | - | - | 61.74 | 77.37 | 70.73 | 87.72
 | | MMN [25] | ICME 2020 | ResNet-18 | - | - | 72.5 | 86.1 | - | - | - | -
 | | SACN [29] | KBS 2021 | Conv-32F | - | - | 71.50 | 79.77 | 64.30 | 71.65 | 68.23 | 78.70
 | | S3Net [23] | ICME 2021 | Conv-64F | 64.27 | 78.02 | 72.30 | 84.23 | 63.56 | 77.54 | 71.19 | 84.40
 | | LCCRN [129] | TCSVT 2023 | ResNet-12 | - | - | 82.97 | 93.63 | - | - | 87.04 | 96.19
 | | SAPENet [21] | PR 2023 | Conv-64F | - | - | 70.38 | 84.47 | - | - | - | -
CR 2 | O | PCM [32] | TIP 2019 | Bilinear CNN | - | - | 42.10 | 62.48 | 28.78 | 46.92 | 29.63 | 52.28
 | | DPGN [34] | CVPR 2020 | ResNet-12 | - | - | 75.71 | 91.48 | - | - | - | -
 | | ADC [37] | Information Sciences 2022 | ResNet-12 | - | - | 80.2 | 91.42 | - | - | - | -
 | M | MACO [70] | arXiv 2018 | Conv-32F | - | - | 60.76 | 74.96 | - | - | - | -
 | | SAML [46] | ICCV 2019 | Conv-64F | - | - | 69.35 | 81.37 | - | - | - | -
 | | DN4 [7] | CVPR 2019 | Conv-64F | 53.15 | 81.90 | - | - | 45.73 | 66.33 | 61.51 | 89.60
 | | LRPABN [47] | TMM 2020 | Bilinear CNN | - | - | 67.97 | 78.04 | 54.52 | 67.12 | 63.11 | 72.63
 | | TSNN [118] | ECAI 2020 | Conv-64F | 57.02 | 70.33 | 48.62 | 63.45 | - | - | - | -
 | | Centroid [69] | ECCV 2020 | ResNet-18 | - | - | 74.22 | 88.65 | - | - | - | -
 | | BSNet [44] | TIP 2020 | Conv-64F | - | - | 62.84 | 85.39 | 43.42 | 71.90 | 40.89 | 86.88
 | | CTX [42] | NIPS 2020 | ResNet-34 | - | - | - | 84.06 | - | - | - | -
 | | D2N4 [112] | TGRS 2020 | Conv-64F | 56.85 | 77.78 | - | - | 47.74 | 70.76 | 59.46 | 86.76
 | | FRN [39] | arXiv 2020 | ResNet-12 | - | - | 83.55 | 92.92 | - | - | - | -
 | | Neg-Cosine [116] | ECCV 2020 | ResNet-18 | - | - | 72.66 | 89.40 | - | - | - | -
 | | PPSML [45] | ICIP 2020 | Conv-64F | 63.43 | 78.76 | - | - | 52.16 | 72.00 | 71.71 | 90.02
 | | AGAM [50] | AAAI 2021 | ResNet-12 | - | - | 79.58 | 87.17 | - | - | - | -
 | | PN+VLCL [106] | ICME 2021 | WRN | 71.21 | 85.08 | - | - | - | - | - | -
 | | ECKPN [15] | CVPR 2021 | ResNet-12 | - | - | 77.43 | 92.21 | - | - | - | -
 | | QPN [48] | arXiv 2021 | Conv-64F | - | - | 66.04 | 82.85 | 53.69 | 70.98 | 63.91 | 89.27
 | | LMPNet [43] | PR 2021 | ResNet-12 | 65.59 | 68.19 | - | - | 61.89 | 68.21 | 68.31 | 80.27
 | | ProtoComNet [119] | CVPR 2021 | ResNet-12 | - | - | 93.20 | 94.90 | - | - | - | -
 | | TOAN [130] | TCSVT 2021 | ResNet-256 | - | - | 67.17 | 82.09 | 51.83 | 69.83 | 76.62 | 89.57
 | | EASE+SIAMESE [11] | CVPR 2022 | WRN | - | - | 91.68 | 94.12 | - | - | - | -
 | | CPN [105] | arXiv 2023 | ResNet-12 | - | - | 87.29 | 92.54 | - | - | - | -
 | | RaPSPNet [40] | PR 2023 | Conv-64F | 67.54 | 83.73 | 73.53 | 91.21 | 55.77 | 73.58 | 71.39 | 92.60
TR 3 | O | DEML+Meta-SGD [124] | arXiv 2018 | ResNet-50 | - | - | 66.95 | 77.11 | - | - | - | -
 | | CosML [55] | arXiv 2020 | Conv-64F | 46.89 | 66.15 | - | - | - | - | 47.74 | 60.17
 | | ANIL+CM [125] | TNNLS 2021 | ResNet-12 | - | - | 59.89 | 74.35 | - | - | - | -
 | | CA-MAML++ [56] | ACCV 2020 | ResNet-18 | - | - | 43.3 | 57.9 | - | - | - | -
 | | M-NAS [122] | AAAI 2020 | Conv-64F | - | - | 58.76 | 72.22 | - | - | - | -
 | | GNN [131] | Sensors 2022 | GNN | - | - | 61.1 | 78.6 | 49.8 | 65.3 | - | -
 | M | CovaMNet [57] | AAAI 2019 | Conv-64F | 52.42 | 63.76 | - | - | 49.10 | 63.04 | 56.65 | 71.33
 | | ATL-Net [8] | IJCAI 2020 | Conv-64F | 60.91 | 77.05 | - | - | 54.49 | 73.20 | 67.95 | 89.16
 | | DPGN+ATRM [16] | arXiv 2021 | ResNet-12 | - | - | 77.53 | 90.39 | - | - | - | -
 | | DMN4 [61] | arXiv 2021 | Conv-64F | - | - | 78.36 | 92.16 | - | - | - | -
 | | TRSN-T [14] | TNNLS 2023 | ResNet-12 | - | - | 93.58 | 95.09 | - | - | - | -

1 LG: local and/or global deep feature representation learning based methods; 2 CR: class representation learning based methods; 3 TR: task-specific feature representation learning based methods; 4 O: optimization-based; 5 M: metric-based. Accuracy (%) is reported on the CUB-200-2010, CUB-200-2011, Stanford Dogs, and Stanford Cars datasets; "-" indicates that no result was reported.
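Many of the metric-based entries in the table above share a common evaluation core: embed support and query images, form one representative per class, and assign each query to its nearest class by a distance in feature space. A minimal nearest-prototype sketch of that core (a generic ProtoNet-style illustration with synthetic embeddings, not the implementation of any specific method listed):

```python
import numpy as np

def prototype_accuracy(support_feats, support_labels, query_feats, query_labels):
    """Nearest-prototype episode accuracy: each class prototype is the mean
    of its support embeddings; queries are assigned to the class whose
    prototype is closest in squared Euclidean distance."""
    classes = np.unique(support_labels)
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in classes])
    # Pairwise squared distances between queries and prototypes.
    d = ((query_feats[:, None, :] - protos[None, :, :]) ** 2).sum(axis=-1)
    pred = classes[d.argmin(axis=1)]
    return float((pred == query_labels).mean())

# Synthetic 2-way 1-shot episode with well-separated 2-D embeddings.
sup = np.array([[0.0, 0.0], [10.0, 10.0]])
sup_y = np.array([0, 1])
qry = np.array([[0.5, -0.2], [9.8, 10.3], [0.1, 0.4]])
qry_y = np.array([0, 1, 0])
print(prototype_accuracy(sup, sup_y, qry, qry_y))  # 1.0
```

The methods surveyed differ mainly in how the embeddings are produced and how the distance or class representative is defined, while the episode-level accuracy computation follows this pattern.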
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Ren, J.; Li, C.; An, Y.; Zhang, W.; Sun, C. Few-Shot Fine-Grained Image Classification: A Comprehensive Review. AI 2024, 5, 405-425. https://doi.org/10.3390/ai5010020