Image–Text (IT)-Prompt: Prompt-Based Learning Framework Empowered by the Cluster-Based Nearest Class Mean (C-NCM) for Rehearsal-Free Contrastive Language–Image Pretraining (CLIP)-Based Continual Learning
Abstract
1. Introduction
- We introduce IT-Prompt, a novel prompt-based continual learning method for pretrained CLIP models. It learns prompts in both the image and text modalities to exploit the correlation between visual and textual representations, effectively extending the pretrained model's knowledge. IT-Prompt preserves the zero-shot capabilities of the pretrained CLIP model and mitigates the catastrophic forgetting that afflicts previous CLIP-based continual learning methods.
- Building on IT-Prompt, we further introduce a C-NCM classifier, which removes the Softmax classifier's dependence on storing and retraining samples from old tasks, significantly improving training efficiency and reducing resource consumption.
- The proposed method outperforms state-of-the-art methods across multiple benchmark datasets, with performance improvements of around 10% on various task sequences of CIFAR-100 and TinyImageNet.
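Since the paper's exact C-NCM procedure (detailed in Section 3.4) is not reproduced here, the following is only a minimal sketch of the general idea behind a cluster-based nearest class mean classifier: cluster each class's features into several prototypes, then classify by the most similar prototype. The function names (`fit_cncm`, `predict_cncm`), the use of cosine similarity on L2-normalized features, and the default cluster count are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fit_cncm(features, labels, n_clusters=3, n_iter=20, seed=0):
    """Fit per-class cluster prototypes: L2-normalize the features, then run
    a small k-means (with cosine similarity) inside each class. The resulting
    cluster means act as multiple prototypes per class."""
    rng = np.random.default_rng(seed)
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    centroids, centroid_labels = [], []
    for c in np.unique(labels):
        x = feats[labels == c]
        k = min(n_clusters, len(x))
        centers = x[rng.choice(len(x), size=k, replace=False)]
        for _ in range(n_iter):
            assign = np.argmax(x @ centers.T, axis=1)  # nearest center by cosine
            centers = np.stack([
                x[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
                for j in range(k)
            ])
            centers /= np.linalg.norm(centers, axis=1, keepdims=True)
        centroids.append(centers)
        centroid_labels.extend([c] * k)
    return np.vstack(centroids), np.asarray(centroid_labels)

def predict_cncm(features, centroids, centroid_labels):
    """Label each sample with the class of its most similar cluster prototype."""
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    return centroid_labels[np.argmax(feats @ centroids.T, axis=1)]
```

Because classification reduces to a nearest-prototype lookup over stored class means, adding a new task only requires computing prototypes for its classes; no old-task samples need to be stored or replayed, which matches the rehearsal-free motivation above.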
2. Related Work
2.1. Continual Learning
2.1.1. Traditional Continual Learning Methods
2.1.2. Prompt-Based Continual Learning Methods
2.1.3. CLIP-Based Continual Learning Methods
2.2. Classifier
3. Proposed Method
3.1. Problem Formulation
3.2. Overall Framework
3.3. Task-Specific Prompt Learning
3.3.1. Task-Specific Prompt Inference
3.3.2. Construction of the Prompt
3.3.3. Optimization of the Prompt
3.4. C-NCM Classifier
3.4.1. Feature Preprocessing
3.4.2. C-NCM Classifier
4. Experiments and Analysis
4.1. Experiment Settings
4.1.1. Datasets
4.1.2. Evaluation Metrics
- Final Average Accuracy (FAA) is the average accuracy over all tasks after the last task has been learned. Let A_{t,i} denote the accuracy on task i measured after training on task t, with T tasks in total:
  FAA = (1/T) · Σ_{i=1..T} A_{T,i}.
- Cumulative Average Accuracy (CAA) is the average of the historical FAA values recorded after learning each task:
  CAA = (1/T) · Σ_{t=1..T} FAA_t,
  where FAA_t is the FAA measured after task t. CAA reflects the overall performance across the incremental learning process, which can also be denoted "Inc-Acc". A larger CAA indicates greater learning capacity and less forgetting.
- Final Forgetting Maximum (FFM) quantifies catastrophic forgetting by averaging, over all but the final task, the maximum performance drop observed after the final task has been learned:
  FFM = (1/(T−1)) · Σ_{i=1..T−1} ( max_{t ∈ {i,…,T−1}} A_{t,i} − A_{T,i} ).
  A lower FFM indicates greater learning capacity and reduced forgetting.
- F1-score. This metric provides deeper insight into model performance; a higher F1-score signifies better overall effectiveness. It is defined as
  F1 = 2·TP / (2·TP + FP + FN),
  where TP (true positives) counts correctly predicted positive cases, FP (false positives) counts cases incorrectly predicted as positive, and FN (false negatives) counts positive cases the model fails to identify.
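The three accuracy-based metrics above can be computed directly from the task-accuracy matrix. The sketch below assumes a lower-triangular matrix A with A[t, i] the accuracy (%) on task i after training on task t; the function name is illustrative.

```python
import numpy as np

def continual_metrics(A):
    """Compute FAA, CAA, and FFM from a task-accuracy matrix.

    A[t, i] = accuracy (%) on task i evaluated after training on task t,
    defined for i <= t, with T tasks in total.
    """
    A = np.asarray(A, dtype=float)
    T = A.shape[0]
    # FAA after task t: mean accuracy over the tasks seen so far.
    faa_per_step = [A[t, : t + 1].mean() for t in range(T)]
    faa = faa_per_step[-1]              # Final Average Accuracy
    caa = float(np.mean(faa_per_step))  # average of the historical FAA values
    # FFM: for each old task, the best accuracy achieved before the final
    # task minus its accuracy after the final task, averaged over old tasks.
    ffm = (float(np.mean([A[i:T - 1, i].max() - A[T - 1, i]
                          for i in range(T - 1)]))
           if T > 1 else 0.0)
    return faa, caa, ffm
```

For example, with three tasks whose accuracies decay as training proceeds, `continual_metrics` returns the final average, its running mean, and the mean maximum drop in one pass over the matrix.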
4.1.3. Compared Methods
4.2. Results
- Additional Evaluation Results. Table 3 presents a performance comparison of different continual learning methods on CIFAR100-T10 and ImageNet-R-T10 using the newly introduced FFM and F1-score metrics. The results demonstrate the superiority of the proposed IT-Prompt: it achieves the lowest FFM on both datasets, indicating significantly reduced catastrophic forgetting compared to other methods, and simultaneously attains the highest F1-score, highlighting its ability to maintain robust classification performance throughout incremental learning.
4.3. Analysis
4.3.1. Impact of Pretrained Paradigm
4.3.2. Impact of Pretrained Datasets
4.3.3. Impact of Classifiers
4.3.4. Hyperparameters for Prompt
4.3.5. Effectiveness of Text Guide Component
5. Conclusions and Future Work
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Implementation Details
Appendix A.1. Pretrained Models
Appendix A.2. Training Regime
References
- Zhou, D.W.; Wang, Q.W.; Qi, Z.H.; Ye, H.J.; Zhan, D.C.; Liu, Z. Deep class-incremental learning: A survey. arXiv 2023, arXiv:2302.03648. [Google Scholar]
- Wang, Z.; Zhang, Z.; Lee, C.Y.; Zhang, H.; Sun, R.; Ren, X.; Su, G.; Perot, V.; Dy, J.; Pfister, T. Learning to prompt for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 139–149. [Google Scholar]
- Garg, S.; Farajtabar, M.; Pouransari, H.; Vemulapalli, R.; Mehta, S.; Tuzel, O.; Shankar, V.; Faghri, F. Tic-clip: Continual training of clip models. arXiv 2023, arXiv:2310.16226. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning transferable visual models from natural language supervision. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 8748–8763. [Google Scholar]
- Zhou, K.; Yang, J.; Loy, C.C.; Liu, Z. Learning to prompt for vision-language models. Int. J. Comput. Vis. 2022, 130, 2337–2348. [Google Scholar] [CrossRef]
- Ke, Z.; Liu, B. Continual learning of natural language processing tasks: A survey. arXiv 2022, arXiv:2211.12701. [Google Scholar]
- Lee, K.Y.; Zhong, Y.; Wang, Y.X. Do pre-trained models benefit equally in continual learning? In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 2–7 January 2023; pp. 6485–6493. [Google Scholar]
- Hayes, T.L.; Krishnan, G.P.; Bazhenov, M.; Siegelmann, H.T.; Sejnowski, T.J.; Kanan, C. Replay in deep learning: Current approaches and missing biological elements. Neural Comput. 2021, 33, 2908–2950. [Google Scholar] [CrossRef] [PubMed]
- Mai, Z.; Li, R.; Kim, H.; Sanner, S. Supervised contrastive replay: Revisiting the nearest class mean classifier in online class-incremental continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3589–3599. [Google Scholar]
- Pellegrini, L.; Graffieti, G.; Lomonaco, V.; Maltoni, D. Latent replay for real-time continual learning. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 10203–10209. [Google Scholar]
- Rolnick, D.; Ahuja, A.; Schwarz, J.; Lillicrap, T.; Wayne, G. Experience replay for continual learning. In Advances in Neural Information Processing Systems, Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Curran Associates Inc.: Red Hook, NY, USA, 2019; Volume 32. [Google Scholar]
- Hayes, T.L.; Kanan, C. Selective replay enhances learning in online continual analogical reasoning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3502–3512. [Google Scholar]
- Ho, S.; Liu, M.; Du, L.; Gao, L.; Xiang, Y. Prototype-guided memory replay for continual learning. IEEE Trans. Neural Netw. Learn. Syst. 2023, 35, 10973–10983. [Google Scholar] [CrossRef] [PubMed]
- Ding, Y.; Liu, L.; Tian, C.; Yang, J.; Ding, H. Don’t stop learning: Towards continual learning for the clip model. arXiv 2022, arXiv:2207.09248. [Google Scholar]
- Zheng, Z.; Ma, M.; Wang, K.; Qin, Z.; Yue, X.; You, Y. Preventing zero-shot transfer degradation in continual learning of vision-language models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 19125–19136. [Google Scholar]
- Tang, Y.M.; Peng, Y.X.; Zheng, W.S. When prompt-based incremental learning does not meet strong pretraining. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 1706–1716. [Google Scholar]
- Wang, R.; Duan, X.; Kang, G.; Liu, J.; Lin, S.; Xu, S.; Lü, J.; Zhang, B. Attriclip: A non-incremental learner for incremental knowledge learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 3654–3663. [Google Scholar]
- Wang, L.; Xie, J.; Zhang, X.; Huang, M.; Su, H.; Zhu, J. Hierarchical decomposition of prompt-based continual learning: Rethinking obscured sub-optimality. In Advances in Neural Information Processing Systems, Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Curran Associates Inc.: Red Hook, NY, USA, 2023; Volume 36. [Google Scholar]
- Wang, Z.; Zhang, Z.; Ebrahimi, S.; Sun, R.; Zhang, H.; Lee, C.Y.; Ren, X.; Su, G.; Perot, V.; Dy, J.; et al. Dualprompt: Complementary prompting for rehearsal-free continual learning. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer: Cham, Switzerland, 2022; pp. 631–648. [Google Scholar]
- Khan, M.G.Z.A.; Naeem, M.F.; Van Gool, L.; Stricker, D.; Tombari, F.; Afzal, M.Z. Introducing language guidance in prompt-based continual learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 4–6 October 2023; pp. 11463–11473. [Google Scholar]
- Smith, J.S.; Karlinsky, L.; Gutta, V.; Cascante-Bonilla, P.; Kim, D.; Arbelle, A.; Panda, R.; Feris, R.; Kira, Z. Coda-prompt: Continual decomposed attention-based prompting for rehearsal-free continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 11909–11919. [Google Scholar]
- Kirkpatrick, J.; Pascanu, R.; Rabinowitz, N.; Veness, J.; Desjardins, G.; Rusu, A.A.; Milan, K.; Quan, J.; Ramalho, T.; Grabska-Barwinska, A.; et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl. Acad. Sci. USA 2017, 114, 3521–3526. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Hoiem, D. Learning without forgetting. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 2935–2947. [Google Scholar] [CrossRef] [PubMed]
- Shi, Y.; Zhou, K.; Liang, J.; Jiang, Z.; Feng, J.; Torr, P.H.; Bai, S.; Tan, V.Y. Mimicking the oracle: An initial phase decorrelation approach for class incremental learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 16722–16731. [Google Scholar]
- Aljundi, R.; Babiloni, F.; Elhoseiny, M.; Rohrbach, M.; Tuytelaars, T. Memory aware synapses: Learning what (not) to forget. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 139–154. [Google Scholar]
- Zeng, G.; Chen, Y.; Cui, B.; Yu, S. Continual learning of context-dependent processing in neural networks. Nat. Mach. Intell. 2019, 1, 364–372. [Google Scholar] [CrossRef]
- Zhao, H.; Fu, Y.; Kang, M.; Tian, Q.; Wu, F.; Li, X. MgSvF: Multi-Grained Slow versus Fast Framework for Few-Shot Class-Incremental Learning. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 46, 1576–1588. [Google Scholar] [CrossRef] [PubMed]
- Arani, E.; Sarfraz, F.; Zonooz, B. Learning fast, learning slow: A general continual learning method based on complementary learning system. arXiv 2022, arXiv:2201.12604. [Google Scholar]
- Van De Ven, G.M.; Li, Z.; Tolias, A.S. Class-incremental learning with generative classifiers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 3611–3620. [Google Scholar]
- Zhou, D.W.; Wang, F.Y.; Ye, H.J.; Ma, L.; Pu, S.; Zhan, D.C. Forward compatible few-shot class-incremental learning. In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 9046–9056. [Google Scholar]
- Wang, F.Y.; Zhou, D.W.; Liu, L.; Ye, H.J.; Bian, Y.; Zhan, D.C.; Zhao, P. Beef: Bi-compatible class-incremental learning via energy-based expansion and fusion. In Proceedings of the Eleventh International Conference on Learning Representations, Virtual, 25–29 April 2022. [Google Scholar]
- Li, X.; Zhou, Y.; Wu, T.; Socher, R.; Xiong, C. Learn to grow: A continual structure learning framework for overcoming catastrophic forgetting. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 9–15 June 2019; pp. 3925–3934. [Google Scholar]
- Konishi, T.; Kurokawa, M.; Ono, C.; Ke, Z.; Kim, G.; Liu, B. Parameter-level soft-masking for continual learning. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; pp. 17492–17505. [Google Scholar]
- Van de Ven, G.M.; Siegelmann, H.T.; Tolias, A.S. Brain-inspired replay for continual learning with artificial neural networks. Nat. Commun. 2020, 11, 4069. [Google Scholar] [CrossRef] [PubMed]
- Van de Ven, G.M.; Tolias, A.S. Generative replay with feedback connections as a general strategy for continual learning. arXiv 2018, arXiv:1809.10635. [Google Scholar]
- Bagus, B.; Gepperth, A. An investigation of replay-based approaches for continual learning. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–9. [Google Scholar]
- Tiwari, R.; Killamsetty, K.; Iyer, R.; Shenoy, P. Gcr: Gradient coreset based replay buffer selection for continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 99–108. [Google Scholar]
- Wang, Y.; Huang, Z.; Hong, X. S-prompts learning with pre-trained transformers: An occam’s razor for domain incremental learning. In Advances in Neural Information Processing Systems, Proceedings of the 36th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; Curran Associates Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 5682–5695. [Google Scholar]
- Thengane, V.; Khan, S.; Hayat, M.; Khan, F. Clip model is an efficient continual learner. arXiv 2022, arXiv:2210.03114. [Google Scholar]
- Jha, S.; Gong, D.; Yao, L. Clap4clip: Continual learning with probabilistic finetuning for vision-language models. In Advances in Neural Information Processing Systems, Proceedings of the 38th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; Curran Associates Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 129146–129186. [Google Scholar]
- Yu, J.; Zhuge, Y.; Zhang, L.; Hu, P.; Wang, D.; Lu, H.; He, Y. Boosting continual learning of vision-language models via mixture-of-experts adapters. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 23219–23230. [Google Scholar]
- Pan, Y.; Yuan, Z.; Wu, X.; Li, Z.; Xu, C. TMM-CLIP: Task-guided Multi-Modal Alignment for Rehearsal-Free Class Incremental Learning. In Proceedings of the 6th ACM International Conference on Multimedia in Asia, Auckland, New Zealand, 3–6 December 2024; pp. 1–7. [Google Scholar]
- Hou, S.; Pan, X.; Loy, C.C.; Wang, Z.; Lin, D. Learning a unified classifier incrementally via rebalancing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 831–839. [Google Scholar]
- Van de Ven, G.M.; Tolias, A.S. Three scenarios for continual learning. arXiv 2019, arXiv:1904.07734. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Li, X.L.; Liang, P. Prefix-Tuning: Optimizing Continuous Prompts for Generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 1–6 August 2021. [Google Scholar] [CrossRef]
- Lin, H.; Zhang, B.; Feng, S.; Li, X.; Ye, Y. PCR: Proxy-based contrastive replay for online class-incremental continual learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 24246–24255. [Google Scholar]
- Hendrycks, D.; Basart, S.; Mu, N.; Kadavath, S.; Wang, F.; Dorundo, E.; Desai, R.; Zhu, T.; Parajuli, S.; Guo, M.; et al. The many faces of robustness: A critical analysis of out-of-distribution generalization. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 8340–8349. [Google Scholar]
- Wah, C.; Branson, S.; Welinder, P.; Perona, P.; Belongie, S. The Caltech-UCSD Birds-200-2011 Dataset; California Institute of Technology: Pasadena, CA, USA, 2011. [Google Scholar]
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images; University of Toronto: Toronto, ON, Canada, 2009. [Google Scholar]
- Zhou, J.; Wei, C.; Wang, H.; Shen, W.; Xie, C.; Yuille, A.; Kong, T. ibot: Image bert pre-training with online tokenizer. arXiv 2021, arXiv:2111.07832. [Google Scholar]
- Caron, M.; Touvron, H.; Misra, I.; Jégou, H.; Mairal, J.; Bojanowski, P.; Joulin, A. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 9650–9660. [Google Scholar]
- Chen, X.; Xie, S.; He, K. An empirical study of training self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 9640–9649. [Google Scholar]
FAA and CAA (%) on CIFAR-100 and TinyImageNet; Tn denotes a split into n incremental tasks, and "–" marks results not reported.

CIFAR-100:

| Method | T10 FAA | T10 CAA | T20 FAA | T20 CAA | T50 FAA | T50 CAA |
|---|---|---|---|---|---|---|
| Continual-CLIP | 66.72 | 75.17 | 66.72 | 75.95 | 66.72 | 76.49 |
| LwF-VR | 70.75 | 78.81 | 63.54 | 74.54 | 59.45 | 71.02 |
| ZSCL | 73.65 | 82.15 | 69.58 | 80.39 | 67.36 | 79.92 |
| AttriCLIP | 81.40 | – | – | – | – | – |
| IT-Prompt (Softmax+Replay) | 88.41 | 91.98 | 88.05 | 92.32 | 85.88 | 91.79 |
| IT-Prompt (C-NCM) | 87.01 | 90.88 | 88.15 | 92.82 | 86.08 | 92.09 |

TinyImageNet:

| Method | T5 FAA | T5 CAA | T10 FAA | T10 CAA | T20 FAA | T20 CAA |
|---|---|---|---|---|---|---|
| Continual-CLIP | 66.43 | 70.49 | 66.43 | 70.55 | 66.43 | 70.51 |
| LwF-VR | 70.89 | 77.56 | 67.05 | 74.12 | 63.89 | 69.94 |
| ZSCL | 73.57 | 80.27 | 71.62 | 78.61 | 68.30 | 77.18 |
| AttriCLIP | – | – | – | – | – | – |
| IT-Prompt (Softmax+Replay) | 82.11 | 84.80 | 81.72 | 84.63 | 77.64 | 82.45 |
| IT-Prompt (C-NCM) | 81.01 | 83.90 | 82.22 | 85.03 | 78.04 | 83.34 |
FAA and CAA (%) on ImageNet-R and CUB-200.

ImageNet-R:

| Method | T5 FAA | T5 CAA | T10 FAA | T10 CAA | T20 FAA | T20 CAA |
|---|---|---|---|---|---|---|
| Continual-CLIP | 73.97 | 78.91 | 73.97 | 79.81 | 73.97 | 80.36 |
| iCaRL | 80.67 | 86.46 | 78.03 | 84.66 | 73.70 | 81.74 |
| LwF-VR | 80.35 | 86.93 | 76.47 | 84.41 | 75.57 | 81.25 |
| ZSCL | 82.40 | 87.21 | 80.03 | 86.35 | 78.67 | 85.08 |
| IT-Prompt (Softmax+Replay) | 84.12 | 84.89 | 84.84 | 86.94 | 84.76 | 85.77 |
| IT-Prompt (C-NCM) | 82.22 | 82.99 | 83.44 | 85.89 | 85.66 | 85.87 |

CUB-200:

| Method | T5 FAA | T5 CAA | T10 FAA | T10 CAA | T20 FAA | T20 CAA |
|---|---|---|---|---|---|---|
| Continual-CLIP | 51.12 | 60.35 | 51.12 | 62.13 | 51.12 | 63.79 |
| iCaRL | 58.37 | 70.46 | 48.39 | 64.57 | 49.48 | 64.52 |
| LwF-VR | 58.84 | 70.36 | 48.69 | 63.23 | 48.88 | 63.85 |
| ZSCL | 63.89 | 73.84 | 56.44 | 70.40 | 52.93 | 66.13 |
| IT-Prompt (Softmax+Replay) | 81.00 | 83.25 | 75.54 | 80.16 | 74.72 | 79.75 |
| IT-Prompt (C-NCM) | 80.07 | 82.77 | 78.20 | 80.36 | 77.66 | 79.15 |
FFM (lower is better) and F1-score (higher is better) on CIFAR100-T10 and ImageNet-R-T10.

| Method | CIFAR100-T10 FFM | CIFAR100-T10 F1 | ImageNet-R-T10 FFM | ImageNet-R-T10 F1 |
|---|---|---|---|---|
| Continual-CLIP | 6.78 | 0.43 | 5.89 | 0.49 |
| iCaRL | 5.66 | 0.52 | 3.32 | 0.61 |
| LwF-VR | 5.23 | 0.56 | 3.12 | 0.65 |
| ZSCL | 4.28 | 0.63 | 2.37 | 0.71 |
| IT-Prompt | 3.16 | 0.84 | 1.58 | 0.79 |
FAA and CAA (%) with different pretrained models (PTMs).

| PTM | ImageNet-R-T10 FAA | ImageNet-R-T10 CAA | CUB-200-T10 FAA | CUB-200-T10 CAA |
|---|---|---|---|---|
| iBOT | 71.33 | 73.62 | 75.90 | 78.44 |
| DINO | 68.11 | 71.70 | 75.47 | 78.68 |
| MoCo | 63.77 | 68.26 | 61.52 | 67.51 |
| CLIP | 83.11 | 84.72 | 76.37 | 81.11 |
FAA and CAA (%) with PTMs trained on different pretraining datasets.

| PTM | ImageNet-R-T10 FAA | ImageNet-R-T10 CAA | CUB-200-T10 FAA | CUB-200-T10 CAA |
|---|---|---|---|---|
| iBOT-21k | 70.83 | 73.23 | 71.64 | 75.26 |
| iBOT-1k | 71.33 | 73.62 | 75.90 | 78.44 |
| CLIP-1k | 84.84 | 86.94 | 78.20 | 80.36 |
| CLIP-DataComp | 84.63 | 86.95 | 77.28 | 79.37 |
| CLIP-WIT | 83.11 | 84.72 | 77.37 | 80.12 |
FAA and CAA (%) on CUB-200 for different prompt lengths.

| Prompt Length | T5 FAA | T5 CAA | T10 FAA | T10 CAA | T20 FAA | T20 CAA |
|---|---|---|---|---|---|---|
| L-10 | 80.02 | 82.18 | 77.82 | 79.41 | 75.92 | 78.11 |
| L-20 | 80.07 | 82.77 | 78.20 | 80.36 | 77.66 | 79.15 |
| L-40 | 81.11 | 83.85 | 78.87 | 80.65 | 77.77 | 79.93 |
Jiao, L.; Fu, W.; Chen, X. Image–Text (IT)-Prompt: Prompt-Based Learning Framework Empowered by the Cluster-Based Nearest Class Mean (C-NCM) for Rehearsal-Free Contrastive Language–Image Pretraining (CLIP)-Based Continual Learning. Appl. Sci. 2025, 15, 2966. https://doi.org/10.3390/app15062966