LLM Fine-Tuning: Concepts, Opportunities, and Challenges
Abstract
1. Introduction
2. Core Concepts of LLM Fine-Tuning
2.1. Key Techniques of Fine-Tuning in LLMs
2.2. Evolution of LLM Fine-Tuning
- The Foundations of Fine-Tuning (2017–2018): From Pre-training to Initial Comprehension
- Advancing Fine-Tuning (2019–2024): Task-Specific Adaptation and Enhanced Comprehension
- Breakthrough Phase (December 2024–): TFT and the Self-Evolved Comprehension
3. Applications and Opportunities
3.1. Applications
3.2. Opportunities
4. Challenges and Future Directions
4.1. Challenges
- Challenges in Model Behavior and Task Adaptation
- Challenges of Resource Efficiency and Scalability
- Challenges of Interpretability and Trustworthiness
- Ethical Risks and Implications of Advanced Fine-Tuning Techniques
4.2. Future Directions
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Turing, A.M. Computing machinery and intelligence. Mind 1950, 59, 433–460. [Google Scholar]
- Russell, S.J.; Norvig, P. Artificial Intelligence: A Modern Approach; Pearson: Sydney, Australia, 2016. [Google Scholar]
- Sutton, R.S.; Barto, A.G. Reinforcement Learning: An Introduction; MIT Press: Cambridge, MA, USA, 1998; Volume 1. [Google Scholar]
- Schmidhuber, J. Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Trans. Auton. Ment. Dev. 2010, 2, 230–247. [Google Scholar]
- Newell, A.; Simon, H.A. Computer science as empirical inquiry: Symbols and search. In ACM Turing Award Lectures; Association for Computing Machinery: New York, NY, USA, 2007; p. 1975. Available online: https://dl.acm.org/doi/10.1145/360018.360022 (accessed on 30 March 2025).
- LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444. [Google Scholar] [PubMed]
- Schaeffer, R.; Miranda, B.; Koyejo, S. Are emergent abilities of large language models a mirage? Adv. Neural Inf. Process. Syst. 2023, 36, 55565–55581. [Google Scholar]
- Buttazzo, G. Artificial consciousness: Utopia or real possibility? Computer 2001, 34, 24–30. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Kahneman, D. Thinking, Fast and Slow; Macmillan: London, UK, 2011. [Google Scholar]
- Gadamer, H.G. Aesthetics and hermeneutics. In The Continental Aesthetics Reader; Routledge: Oxfordshire, UK, 1960; pp. 181–186. [Google Scholar]
- Clark, A.; Karmiloff-Smith, A. The cognizer’s innards: A psychological and philosophical perspective on the development of thought. Mind Lang. 1993, 8, 487–519. [Google Scholar]
- Gadamer, H.G. Philosophical Hermeneutics; University of California Press: Berkeley, CA, USA, 1977. [Google Scholar]
- Heidegger, M. Being and Time; Macquarrie, J.; Robinson, E., Translators; Harper and Row: New York, NY, USA, 1962. Available online: http://pdf-objects.com/files/Heidegger-Martin-Being-and-Time-trans.-Macquarrie-Robinson-Blackwell-1962.pdf (accessed on 6 March 2025).
- Kintsch, W. Comprehension: A Paradigm for Cognition; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
- Ricoeur, P. Interpretation Theory: Discourse and the Surplus of Meaning; TCU Press: Fort Worth, TX, USA, 1976. [Google Scholar]
- Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Online, 3–10 March 2021; pp. 610–623. [Google Scholar]
- Howard, J.; Ruder, S. Universal language model fine-tuning for text classification. arXiv 2018, arXiv:1801.06146. [Google Scholar]
- Vaswani, A. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Devlin, J. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Lake, B.; Baroni, M. Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. In Proceedings of the International Conference on Machine Learning, PMLR, Stockholm, Sweden, 10–15 July 2018; pp. 2873–2882. [Google Scholar]
- Petroni, F.; Rocktäschel, T.; Lewis, P.; Bakhtin, A.; Wu, Y.; Miller, A.H.; Riedel, S. Language models as knowledge bases? arXiv 2019, arXiv:1909.01066. [Google Scholar]
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- Bender, E.M.; Koller, A. Climbing towards NLU: On meaning, form, and understanding in the age of data. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–7 July 2020; pp. 5185–5198. [Google Scholar]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A survey of large language models. arXiv 2023, arXiv:2303.18223. [Google Scholar]
- Susnjak, T.; Hwang, P.; Reyes, N.H.; Barczak, A.L.; McIntosh, T.R.; Ranathunga, S. Automating research synthesis with domain-specific large language model fine-tuning. ACM Trans. Knowl. Discov. Data 2024, 19, 1–39. [Google Scholar] [CrossRef]
- Mallery, J.C.; Hurwitz, R.; Duffy, G. Hermeneutics: From Textual Explication to Computer Understanding? 1986. Available online: https://www.researchgate.net/publication/2769192_Hermeneutics_From_Textual_Explication_to_Computer_Understanding (accessed on 4 March 2025).
- Chen, M.; Herrera, F.; Hwang, K. Cognitive computing: Architecture, technologies and intelligent applications. IEEE Access 2018, 6, 19774–19783. [Google Scholar] [CrossRef]
- Liu, J.; Hao, Y.; He, Z.; Chen, M.; Hu, L.; Wei, G. BigFiberNet: LLMs and Fabric Computing Empowered Large-scale Non-disturbance Mobile Sensing Networks. IEEE Netw. 2024. Available online: https://ieeexplore.ieee.org/document/10804831 (accessed on 2 March 2025).
- Liu, J.; Chen, M. FaGeL: Fabric LLMs Agent empowered Embodied Intelligence Evolution with Autonomous Human-Machine Collaboration. arXiv 2024, arXiv:2412.20297. [Google Scholar]
- Qu, G.; Chen, Q.; Wei, W.; Lin, Z.; Chen, X.; Huang, K. Mobile edge intelligence for large language models: A contemporary survey. IEEE Commun. Surv. Tutor. 2025. Available online: https://arxiv.org/html/2407.18921v2 (accessed on 2 March 2025).
- Wang, F.Y.; Miao, Q.; Li, X.; Wang, X.; Lin, Y. What does ChatGPT say: The DAO from algorithmic intelligence to linguistic intelligence. IEEE/CAA J. Autom. Sin. 2023, 10, 575–579. [Google Scholar] [CrossRef]
- Liu, Y.; Wu, F.; Liu, Z.; Wang, K.; Wang, F.; Qu, X. Can language models be used for real-world urban-delivery route optimization? Innovation 2023, 4, 100520. [Google Scholar]
- Wu, H.; Chen, X.; Huang, K. Resource management for low-latency cooperative fine-tuning of foundation models at the network edge. arXiv 2024, arXiv:2407.09873. [Google Scholar]
- White, J.; Fu, Q.; Hays, S.; Sandborn, M.; Olea, C.; Gilbert, H.; Elnashar, A.; Spencer-Smith, J.; Schmidt, D.C. A prompt pattern catalog to enhance prompt engineering with chatgpt. arXiv 2023, arXiv:2302.11382. [Google Scholar]
- Lewis, P.; Perez, E.; Piktus, A.; Petroni, F.; Karpukhin, V.; Goyal, N.; Küttler, H.; Lewis, M.; Yih, W.t.; Rocktäschel, T.; et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Adv. Neural Inf. Process. Syst. 2020, 33, 9459–9474. [Google Scholar]
- Lv, K.; Yang, Y.; Liu, T.; Gao, Q.; Guo, Q.; Qiu, X. Full parameter fine-tuning for large language models with limited resources. arXiv 2023, arXiv:2306.09782. [Google Scholar]
- Houlsby, N.; Giurgiu, A.; Jastrzebski, S.; Morrone, B.; De Laroussilhe, Q.; Gesmundo, A.; Attariyan, M.; Gelly, S. Parameter-efficient transfer learning for NLP. In Proceedings of the International Conference on Machine Learning, PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 2790–2799. [Google Scholar]
- Han, Z.; Gao, C.; Liu, J.; Zhang, J.; Zhang, S.Q. Parameter-efficient fine-tuning for large models: A comprehensive survey. arXiv 2024, arXiv:2403.14608. [Google Scholar]
- Valipour, M.; Rezagholizadeh, M.; Kobyzev, I.; Ghodsi, A. Dylora: Parameter efficient tuning of pre-trained models using dynamic search-free low-rank adaptation. arXiv 2022, arXiv:2210.07558. [Google Scholar]
- Zhang, S.; Dong, L.; Li, X.; Zhang, S.; Sun, X.; Wang, S.; Li, J.; Hu, R.; Zhang, T.; Wu, F.; et al. Instruction tuning for large language models: A survey. arXiv 2023, arXiv:2308.10792. [Google Scholar]
- Christiano, P.F.; Leike, J.; Brown, T.; Martic, M.; Legg, S.; Amodei, D. Deep reinforcement learning from human preferences. Adv. Neural Inf. Process. Syst. 2017, 30, 4302–4310. Available online: https://dl.acm.org/doi/pdf/10.5555/3294996.3295184 (accessed on 4 March 2025).
- Stiennon, N.; Ouyang, L.; Wu, J.; Ziegler, D.; Lowe, R.; Voss, C.; Radford, A.; Amodei, D.; Christiano, P.F. Learning to summarize with human feedback. Adv. Neural Inf. Process. Syst. 2020, 33, 3008–3021. [Google Scholar]
- Radford, A. Improving Language Understanding by Generative Pre-Training. 2018. Available online: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf (accessed on 6 March 2025).
- Ding, N.; Lv, X.; Wang, Q.; Chen, Y.; Zhou, B.; Liu, Z.; Sun, M. Sparse low-rank adaptation of pre-trained language models. arXiv 2023, arXiv:2311.11696. [Google Scholar]
- Zadouri, T.; Üstün, A.; Ahmadian, A.; Ermiş, B.; Locatelli, A.; Hooker, S. Pushing mixture of experts to the limit: Extremely parameter efficient moe for instruction tuning. arXiv 2023, arXiv:2309.05444. [Google Scholar]
- Lin, Y.; Ma, X.; Chu, X.; Jin, Y.; Yang, Z.; Wang, Y.; Mei, H. Lora dropout as a sparsity regularizer for overfitting control. arXiv 2024, arXiv:2404.09610. [Google Scholar]
- Pfeiffer, J.; Kamath, A.; Rücklé, A.; Cho, K.; Gurevych, I. Adapterfusion: Non-destructive task composition for transfer learning. arXiv 2020, arXiv:2005.00247. [Google Scholar]
- Liu, X.; Ji, K.; Fu, Y.; Tam, W.L.; Du, Z.; Yang, Z.; Tang, J. P-tuning v2: Prompt tuning can be comparable to fine-tuning universally across scales and tasks. arXiv 2021, arXiv:2110.07602. [Google Scholar]
- Li, X.L.; Liang, P. Prefix-tuning: Optimizing continuous prompts for generation. arXiv 2021, arXiv:2101.00190. [Google Scholar]
- Ma, F.; Zhang, C.; Ren, L.; Wang, J.; Wang, Q.; Wu, W.; Quan, X.; Song, D. Xprompt: Exploring the extreme of prompt tuning. arXiv 2022, arXiv:2210.04457. [Google Scholar]
- Li, J.; Aitken, W.; Bhambhoria, R.; Zhu, X. Prefix propagation: Parameter-efficient tuning for long sequences. arXiv 2023, arXiv:2305.12086. [Google Scholar]
- Huang, J.; Lin, F.; Yang, J.; Wang, X.; Ni, Q.; Wang, Y.; Tian, Y.; Li, J.; Wang, F. From prompt engineering to generative artificial intelligence for large models: The state of the art and perspective. Chin. J. Intell. Sci. Technol. 2024, 6, 115–133. [Google Scholar]
- Wei, J.; Bosma, M.; Zhao, V.Y.; Guu, K.; Yu, A.W.; Lester, B.; Du, N.; Dai, A.M.; Le, Q.V. Finetuned language models are zero-shot learners. arXiv 2021, arXiv:2109.01652. [Google Scholar]
- Chung, H.W.; Hou, L.; Longpre, S.; Zoph, B.; Tay, Y.; Fedus, W.; Li, Y.; Wang, X.; Dehghani, M.; Brahma, S.; et al. Scaling instruction-finetuned language models. J. Mach. Learn. Res. 2024, 25, 1–53. [Google Scholar]
- Wu, T.; Zhu, B.; Zhang, R.; Wen, Z.; Ramchandran, K.; Jiao, J. Pairwise proximal policy optimization: Harnessing relative feedback for llm alignment. arXiv 2023, arXiv:2310.00212. [Google Scholar]
- Rafailov, R.; Sharma, A.; Mitchell, E.; Manning, C.D.; Ermon, S.; Finn, C. Direct preference optimization: Your language model is secretly a reward model. Adv. Neural Inf. Process. Syst. 2024, 36, 53728–53741. [Google Scholar]
- Gulcehre, C.; Le Paine, T.; Srinivasan, S.; Konyushkova, K.; Weerts, L.; Sharma, A.; Siddhant, A.; Ahern, A.; Wang, M.; Gu, C.; et al. Reinforced self-training (rest) for language modeling. arXiv 2023, arXiv:2308.08998. [Google Scholar]
- Yuan, Z.; Yuan, H.; Tan, C.; Wang, W.; Huang, S.; Huang, F. Rrhf: Rank responses to align language models with human feedback without tears. arXiv 2023, arXiv:2304.05302. [Google Scholar]
- Luong, T.Q.; Zhang, X.; Jie, Z.; Sun, P.; Jin, X.; Li, H. Reft: Reasoning with reinforced fine-tuning. arXiv 2024, arXiv:2401.08967. [Google Scholar]
- Liu, J.; Wang, Y.; Lin, Z.; Chen, M.; Hao, Y.; Hu, L. Natural Language Fine-Tuning. arXiv 2024, arXiv:2412.20382. [Google Scholar]
- Guo, D.; Yang, D.; Zhang, H.; Song, J.; Zhang, R.; Xu, R.; Zhu, Q.; Ma, S.; Wang, P.; Bi, X.; et al. Deepseek-r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv 2025, arXiv:2501.12948. [Google Scholar]
- Muennighoff, N.; Yang, Z.; Shi, W.; Li, X.L.; Fei-Fei, L.; Hajishirzi, H.; Zettlemoyer, L.; Liang, P.; Candès, E.; Hashimoto, T. s1: Simple test-time scaling. arXiv 2025, arXiv:2501.19393. [Google Scholar]
- Yuan, J.; Gao, H.; Dai, D.; Luo, J.; Zhao, L.; Zhang, Z.; Xie, Z.; Wei, Y.X.; Wang, L.; Xiao, Z.; et al. Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention. arXiv 2025, arXiv:2502.11089. [Google Scholar]
- Sriram, A.; Jun, H.; Satheesh, S.; Coates, A. Cold fusion: Training seq2seq models together with language models. arXiv 2017, arXiv:1708.06426. [Google Scholar]
- Lewis, M. Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv 2019, arXiv:1910.13461. [Google Scholar]
- Guo, D.; Zhu, Q.; Yang, D.; Xie, Z.; Dong, K.; Zhang, W.; Chen, G.; Bi, X.; Wu, Y.; Li, Y.; et al. DeepSeek-Coder: When the Large Language Model Meets Programming–The Rise of Code Intelligence. arXiv 2024, arXiv:2401.14196. [Google Scholar]
- DeepSeek vs. OpenAI, xAI, and Anthropic: A Comparative Evaluation by FlagEval. 2025. Available online: https://hub.baai.ac.cn/view/43898 (accessed on 6 March 2025).
- Broughel, J. The Tradeoffs Between Energy Efficiency, Consumer Preferences, and Economic Growth; The Center for Growth and Opportunity: Logan, UT, USA, 2025. [Google Scholar]
- Liu, R.; Gao, J.; Zhao, J.; Zhang, K.; Li, X.; Qi, B.; Ouyang, W.; Zhou, B. Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling. arXiv 2025, arXiv:2502.06703. [Google Scholar]
- George, A.; Jose, A.; Ashik, F.; Prabhakar, P.; Pati, P.B.; Parida, S. Enhancing Legal Decision Making: WRIT Case Outcome Prediction with LegalBERT Embeddings and AdaBoost Classifier. In Proceedings of the 2024 IEEE International Conference on Contemporary Computing and Communications (InC4), Bangalore, India, 15–16 March 2024; IEEE: New York, NY, USA, 2024; Volume 1, pp. 1–6. [Google Scholar]
- Tian, Y.; Lin, F.; Li, Y.; Zhang, T.; Zhang, Q.; Fu, X.; Huang, J.; Dai, X.; Wang, Y.; Tian, C.; et al. UAVs Meet LLMs: Overviews and Perspectives Toward Agentic Low-Altitude Mobility. arXiv 2025, arXiv:2501.02341. [Google Scholar]
- Tang, Y.; Han, X.; Li, X.; Yu, Q.; Hao, Y.; Hu, L.; Chen, M. Minigpt-3d: Efficiently aligning 3d point clouds with large language models using 2d priors. In Proceedings of the 32nd ACM International Conference on Multimedia, Melbourne, Australia, 28 October–1 November 2024; pp. 6617–6626. [Google Scholar]
- Elshin, D.; Karpachev, N.; Gruzdev, B.; Golovanov, I.; Ivanov, G.; Antonov, A.; Skachkov, N.; Latypova, E.; Layner, V.; Enikeeva, E.; et al. From general LLM to translation: How we dramatically improve translation quality using human evaluation data for LLM finetuning. In Proceedings of the Ninth Conference on Machine Translation, Miami, FL, USA, 12–13 November 2024; pp. 247–252. [Google Scholar]
- Wang, X.; Zhou, W.; Zu, C.; Xia, H.; Chen, T.; Zhang, Y.; Zheng, R.; Ye, J.; Zhang, Q.; Gui, T.; et al. Instructuie: Multi-task instruction tuning for unified information extraction. arXiv 2023, arXiv:2304.08085. [Google Scholar]
- Gupta, P.; Jiao, C.; Yeh, Y.T.; Mehri, S.; Eskenazi, M.; Bigham, J.P. InstructDial: Improving zero and few-shot generalization in dialogue through instruction tuning. arXiv 2022, arXiv:2205.12673. [Google Scholar]
- Bražinskas, A.; Nallapati, R.; Bansal, M.; Dreyer, M. Efficient few-shot fine-tuning for opinion summarization. arXiv 2022, arXiv:2205.02170. [Google Scholar]
- Varia, S.; Wang, S.; Halder, K.; Vacareanu, R.; Ballesteros, M.; Benajiba, Y.; John, N.A.; Anubhai, R.; Muresan, S.; Roth, D. Instruction tuning for few-shot aspect-based sentiment analysis. arXiv 2022, arXiv:2210.06629. [Google Scholar]
- Bill, D.; Eriksson, T. Fine-Tuning an LLM Using Reinforcement Learning from Human Feedback for a Therapy Chatbot Application. 2023. Available online: https://www.diva-portal.org/smash/get/diva2:1782678/FULLTEXT01.pdf (accessed on 3 March 2025).
- Maharjan, J.; Garikipati, A.; Singh, N.P.; Cyrus, L.; Sharma, M.; Ciobanu, M.; Barnes, G.; Thapa, R.; Mao, Q.; Das, R. OpenMedLM: Prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models. Sci. Rep. 2024, 14, 14156. [Google Scholar] [CrossRef]
- Lehman, E.; Johnson, A. Clinical-T5: Large language models built using MIMIC clinical text. PhysioNet 2023, 101, 215–220. [Google Scholar]
- Ross, E.; Kansal, Y.; Renzella, J.; Vassar, A.; Taylor, A. Supervised Fine-Tuning LLMs to Behave as Pedagogical Agents in Programming Education. arXiv 2025, arXiv:2502.20527. [Google Scholar]
- Gao, L.; Lu, J.; Shao, Z.; Lin, Z.; Yue, S.; Ieong, C.; Sun, Y.; Zauner, R.J.; Wei, Z.; Chen, S. Fine-tuned large language model for visualization system: A study on self-regulated learning in education. IEEE Trans. Vis. Comput. Graph. 2025, 31, 514–524. [Google Scholar] [CrossRef]
- Latif, E.; Zhai, X. Fine-tuning ChatGPT for automatic scoring. Comput. Educ. Artif. Intell. 2024, 6, 100210. [Google Scholar]
- Iacovides, G.; Konstantinidis, T.; Xu, M.; Mandic, D. FinLlama: LLM-Based Financial Sentiment Analysis for Algorithmic Trading. In Proceedings of the 5th ACM International Conference on AI in Finance, Brooklyn, NY, USA, 14–17 November 2024; pp. 134–141. [Google Scholar]
- Chen, W.; Wang, Q.; Long, Z.; Zhang, X.; Lu, Z.; Li, B.; Wang, S.; Xu, J.; Bai, X.; Huang, X.; et al. Disc-finllm: A chinese financial large language model based on multiple experts fine-tuning. arXiv 2023, arXiv:2310.15205. [Google Scholar]
- Ni, S.; Cheng, H.; Yang, M. Pre-training, Fine-tuning and Re-ranking: A Three-Stage Framework for Legal Question Answering. In Proceedings of the ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; IEEE: New York, NY, USA, 2025; pp. 1–5. [Google Scholar]
- Satterfield, N.; Holbrook, P.; Wilcox, T. Fine-tuning llama with case law data to improve legal domain performance. OSF 2024. [Google Scholar] [CrossRef]
- Guan, C.; Chin, A.; Vahabi, P. Enhancing news summarization with elearnfit through efficient in-context learning and efficient fine-tuning. arXiv 2024, arXiv:2405.02710. [Google Scholar]
- Wang, Z.; Cheng, J.; Cui, C.; Yu, C. Implementing BERT and fine-tuned RobertA to detect AI generated news by ChatGPT. arXiv 2023, arXiv:2306.07401. [Google Scholar]
- Wang, Y.; Zhang, Z.; Wang, J.; Fan, D.; Xu, Z.; Liu, L.; Hao, X.; Bhat, V.; Li, X. GEXIA: Granularity Expansion and Iterative Approximation for Scalable Multi-grained Video-language Learning. arXiv 2024, arXiv:2412.07704. [Google Scholar]
- Lee, S.H.; Wang, J.; Fan, D.; Zhang, Z.; Liu, L.; Hao, X.; Bhat, V.; Li, X. NowYouSee Me: Context-Aware Automatic Audio Description. arXiv 2024, arXiv:2412.10002. [Google Scholar]
- Li, Z.; Chen, C.; Xu, T.; Qin, Z.; Xiao, J.; Sun, R.; Luo, Z.Q. Entropic distribution matching in supervised fine-tuning of LLMs: Less overfitting and better diversity. arXiv 2024, arXiv:2408.16673. [Google Scholar]
- Lin, H.; Huang, B.; Ye, H.; Chen, Q.; Wang, Z.; Li, S.; Ma, J.; Wan, X.; Zou, J.; Liang, Y. Selecting large language model to fine-tune via rectified scaling law. arXiv 2024, arXiv:2402.02314. [Google Scholar]
- Ash, M.G. Gestalt Psychology in German Culture, 1890–1967: Holism and the Quest for Objectivity; Cambridge University Press: Cambridge, UK, 1998. [Google Scholar]
- Chen, J.; Zhang, A.; Shi, X.; Li, M.; Smola, A.; Yang, D. Parameter-efficient fine-tuning design spaces. arXiv 2023, arXiv:2301.01821. [Google Scholar]
- Hu, Z.; Wang, L.; Lan, Y.; Xu, W.; Lim, E.P.; Bing, L.; Xu, X.; Poria, S.; Lee, R.K.W. Llm-adapters: An adapter family for parameter-efficient fine-tuning of large language models. arXiv 2023, arXiv:2304.01933. [Google Scholar]
- Zhou, H.; Wan, X.; Vulić, I.; Korhonen, A. Autopeft: Automatic configuration search for parameter-efficient fine-tuning. Trans. Assoc. Comput. Linguist. 2024, 12, 525–542. [Google Scholar] [CrossRef]
- Zhu, K.; Zhao, Q.; Chen, H.; Wang, J.; Xie, X. Promptbench: A unified library for evaluation of large language models. J. Mach. Learn. Res. 2024, 25, 1–22. [Google Scholar]
- Jebali, M.S.; Valanzano, A.; Murugesan, M.; Veneri, G.; De Magistris, G. Leveraging the Regularizing Effect of Mixing Industrial and Open Source Data to Prevent Overfitting of LLM Fine Tuning. In Proceedings of the International Joint Conference on Artificial Intelligence 2024 Workshop on AI Governance: Alignment, Morality, and Law, Jeju, Republic of Korea, 3–9 August 2024. [Google Scholar]
- Liu, T.; Dong, Z.; Zhang, L.; Wang, H.; Gao, J. Mitigating Heterogeneous Token Overfitting in LLM Knowledge Editing. arXiv 2025, arXiv:2502.00602. [Google Scholar]
- Schramowski, P.; Turan, C.; Andersen, N.; Rothkopf, C.A.; Kersting, K. Large pre-trained language models contain human-like biases of what is right and wrong to do. Nat. Mach. Intell. 2022, 4, 258–268. [Google Scholar] [CrossRef]
- Luo, W.; Keung, J.W.; Yang, B.; Ye, H.; Goues, C.L.; Bissyande, T.F.; Tian, H.; Le, B. When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair. arXiv 2024, arXiv:2412.01072. [Google Scholar]
- Giannini, F.; Franzè, G.; Pupo, F.; Fortino, G. A sustainable multi-agent routing algorithm for vehicle platoons in urban networks. IEEE Trans. Intell. Transp. Syst. 2023, 24, 14830–14840. [Google Scholar] [CrossRef]
- Tran, K.T.; Dao, D.; Nguyen, M.D.; Pham, Q.V.; O’Sullivan, B.; Nguyen, H.D. Multi-Agent Collaboration Mechanisms: A Survey of LLMs. arXiv 2025, arXiv:2501.06322. [Google Scholar]
Data Usage Method | Key Techniques/Models | Release Time | Technical Focus | Developer
---|---|---|---|---
SFT | Traditional SFT: BERT [20], GPT [45] | 2018 | Directly fine-tunes models on annotated datasets | Google, OpenAI
SFT | Full Fine-tuning [25] | 2020 | Data quality strongly affects performance; risk of overfitting | Raffel et al. (Google)
SFT | LoRA: DyLoRA [41], AdaLoRA [46], MoLORA [47], LoRA Dropout [48] | 2022, 2023, 2023, 2024 | Approaches full fine-tuning performance through low-rank decomposition | Valipour et al.
SFT | Adapter-based: Serial Adapter [39], AdapterFusion [49] | 2019, 2020 | Small adapter parameter budgets, well suited to multi-task scenarios | Houlsby et al.; Pfeiffer et al.
SFT | Prompt Tuning: P-Tuning v2 [50], Prefix-Tuning [51], XPrompt [52], Prefix Propagation [53], Prompt Engineering [54] | 2021, 2021, 2022, 2023, 2024 | Suited to few-shot scenarios; sensitive to task complexity | Liu et al.
SFT | FLAN (Fine-tuned Language Net): FLAN [55], FLAN-T5 [56] | 2021, 2024 | Enhances model generalization across tasks | Wei et al. (Google)
RLHF | PPO (Proximal Policy Optimization) [39,57] | 2023 | Introduces trust-region constraints to prevent excessive policy updates | Wu et al.
RLHF | DPO (Direct Preference Optimization) [58] | 2024 | Optimizes the policy directly on preference data, bypassing an explicit reward model | Rafailov et al.
RLHF | ReST (Reinforced Self-Training) [59] | 2023 | Iteratively generates high-quality samples to fine-tune the model | Gulcehre et al.
RLHF | RRHF (Rank Responses to align Human Feedback) [60] | 2023 | Aligns with human preferences via a ranking loss, without reinforcement learning | Yuan et al.
RFT | ReFT [61] | 2024 | Explores multiple CoT paths to optimize non-differentiable objectives | Luong et al. (ByteDance)
RFT | RFT | 2024 | Uses reinforcement fine-tuning to build expert models from high-quality tasks | OpenAI
RFT | Grok-3 | 2025 | Self-correction mechanism and reinforcement learning capability | xAI
TFT | NLFT [62] | 2024 (Dec.) | Compares token probabilities under different prompts, using natural language as the supervisory signal | Liu et al. (EPIC Lab)
TFT | DeepSeek-R1 [63] | 2025 (Jan.) | Releases distilled models for low-cost, high-performance inference | DeepSeek-AI
TFT | TTS [64] | 2025 (Feb.) | Introduces budget forcing for effective scaling at test time (details undisclosed) | Stanford (Fei-Fei Li’s team)
TFT | NSA (Native Sparse Attention) [65] | 2025 (Feb.) | Hardware-aligned sparse attention improves training and inference efficiency | DeepSeek-AI
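Several of the PEFT rows above (DyLoRA, AdaLoRA, MoLORA, LoRA Dropout) build on the same low-rank decomposition idea: the pre-trained weight matrix stays frozen while a trainable update ΔW = BA of small rank r is learned. The following PyTorch sketch illustrates that mechanism only; the class name, the rank r = 8, the scaling α = 16, and the 768-dimensional layer are illustrative assumptions, not values taken from the cited papers.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    y = base(x) + (alpha / r) * x @ A^T @ B^T, where A is (r, d_in) and B is (d_out, r)."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # pre-trained weights stay frozen
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Usage: wrap a projection layer; only lora_A and lora_B receive gradients.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 10, 768))
```

Because B is zero-initialized, the wrapped layer reproduces the frozen model exactly at the start of fine-tuning, and only the r(d_in + d_out) adapter parameters are trained, which is what lets these methods approach full fine-tuning quality at a fraction of the cost.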
Domain Type | Domain Target | Domain-Specific Instruction Fine-Tuned LLMs | Fine-Tuning Techniques | Base Model
---|---|---|---|---
General | Machine translation | YandexMT-GPT [75] | SFT | YandexGPT
General | Information extraction | InstructUIE [76] | Multi-task Instruction Tuning | FlanT5
General | Dialogue | InstructDial [77] | Instruction Tuning | T0
General | Text summarization | FewShotSummarizer [78] | Adapter | BART
General | Sentiment analysis | bangla-bert [79] | SFT | BERT
General | Chatbot | PsychRLHF-Chatbot [80] | RLHF | Llama-7b
Medical | Medical Q&A | OpenMedLM [81] | Prompt tuning | OS Yi 34B
Medical | Clinical decision support | Clinical-T5 [82] | SFT | T5
Education | Pedagogical agents | GuideLM [83] | SFT | ChatGPT-4o
Education | Educational question generation | EduQG [84] | RFT | Google FLAN-T5
Education | Automatic scoring | EduScoreGPT [85] | Task Fine-tuning | GPT-3.5
Finance | Risk assessment | FinLlama [86] | LoRA | Llama2 7B
Finance | Automated customer support | DISC-FinLLM [87] | RLHF | Baichuan-13B
Law | Legal consultation | RoBERTa-LQA [88] | Task Fine-tuning | RoBERTa
Law | Legal document analysis | FTLlama3-Legal [89] | SFT | Llama3
Journalism | News summarization | LLaMa2-EFit [90] | LoRA | LLaMa2
Journalism | Fake news detection | FT-RoBERTa-FND [91] | SFT | RoBERTa, BERT
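Most of the SFT-based systems in this table follow the same supervised recipe: concatenate a domain instruction with its reference answer and minimize next-token cross-entropy on the answer span only. The sketch below shows that loss masking in PyTorch; the function name, the -100 ignore-index convention (borrowed from common Hugging Face practice), and the toy tensor shapes are assumptions for illustration, not details from the cited works.

```python
import torch
import torch.nn.functional as F

def sft_loss(logits: torch.Tensor, input_ids: torch.Tensor, prompt_len: int) -> torch.Tensor:
    """Next-token cross-entropy over the response span only; prompt tokens are masked out."""
    labels = input_ids.clone()
    labels[:, :prompt_len] = -100             # ignore instruction/prompt tokens in the loss
    shift_logits = logits[:, :-1, :]          # position t predicts token t+1
    shift_labels = labels[:, 1:]
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
    )

# Usage with any causal LM that produces [batch, seq, vocab] logits (toy values here):
logits = torch.randn(2, 12, 32000, requires_grad=True)
input_ids = torch.randint(0, 32000, (2, 12))
loss = sft_loss(logits, input_ids, prompt_len=5)
loss.backward()  # an optimizer step on the model's trainable parameters would follow
```

Masking the prompt keeps the model from being trained to regenerate its own instructions, so gradient signal concentrates on the domain-specific responses that distinguish, say, a legal or clinical assistant from the base model.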
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).