Transformers and State-Space Models: Fine-Tuning Techniques for Solving Differential Equations
Abstract
1. Introduction
- Benchmark and Protocol: We extend the AGDES package to generate a more diverse collection of differential equations and their solutions. We also provide code with standardized prompts and evaluation scripts, enabling reproducible fine-tuning and comparison between models (a minimal sketch of such a prompt/target record appears after this list).
- Empirical Study: We fine-tune four different types of LLMs and report their solution accuracy and generalization ability.
- Insights and Analysis: We examine the error patterns of the models and discuss the need to develop new metrics to assess the quality of the solutions generated in LaTeX format.
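To make the benchmark protocol concrete, the following minimal sketch shows how an equation–solution pair could be rendered to LaTeX with SymPy and wrapped in a standardized prompt/target record for fine-tuning. The prompt wording and field names are illustrative assumptions, not the actual AGDES code or the exact prompts used in the paper.

```python
# Illustrative sketch only: the prompt template and record fields are
# assumptions, not the exact AGDES/paper fine-tuning protocol.
import sympy as sp

x = sp.symbols("x")
y = sp.Function("y")

def build_record(equation, solution):
    """Wrap an equation-solution pair in a standardized prompt/target record."""
    return {
        "prompt": f"Solve the differential equation: {sp.latex(equation)}",
        "target": sp.latex(solution),
    }

# Example: the equation y'(x) = 2x, solved symbolically by SymPy.
eq = sp.Eq(y(x).diff(x), 2 * x)
sol = sp.dsolve(eq, y(x))          # Eq(y(x), C1 + x**2)
record = build_record(eq, sol)
print(record["prompt"])            # Solve the differential equation: \frac{d}{d x} y{\left(x \right)} = 2 x
print(record["target"])            # y{\left(x \right)} = C_{1} + x^{2}
```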
2. Related Work
2.1. LLMs in Mathematics
2.2. Solving Differential Equations with LLMs
3. Materials and Methods
3.1. Dataset Description
- Polynomial equations, of the form $y' = P(x)$, where $P(x)$ is a polynomial in $x$.
- Separable-variable equations, of the form $y' = f(x)$, where $f(x)$ is an integrable function.
- Homogeneous equations of second and third order, i.e., $y'' + a\,y' + b\,y = 0$ and $y''' + a\,y'' + b\,y' + c\,y = 0$.
- Inhomogeneous equations of second and third order, i.e., $y'' + a\,y' + b\,y = f(x)$ and $y''' + a\,y'' + b\,y' + c\,y = f(x)$, with $f(x)$ being an arbitrary function (an illustrative generation sketch follows this list).
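As a rough illustration of how the classes listed above could be instantiated, the sketch below samples equations with SymPy and obtains symbolic reference solutions via dsolve. The coefficient ranges, class labels, and sampling choices are assumptions for demonstration, not the generation procedure actually used by AGDES.

```python
# Hedged sketch: one possible way to instantiate the equation classes listed
# above with SymPy. Coefficient ranges, class labels, and sampling choices are
# illustrative assumptions, not the actual AGDES generation code.
import random
import sympy as sp

x = sp.symbols("x")
y = sp.Function("y")

def random_poly(max_degree=2):
    """Random polynomial in x with small integer coefficients."""
    return sum(random.randint(-5, 5) * x**k for k in range(max_degree + 1))

def sample_equation(kind):
    a, b = random.randint(-5, 5), random.randint(-5, 5)
    if kind == "separable":        # y' = f(x)
        return sp.Eq(y(x).diff(x), random_poly())
    if kind == "homog2":           # y'' + a y' + b y = 0
        return sp.Eq(y(x).diff(x, 2) + a * y(x).diff(x) + b * y(x), 0)
    if kind == "inhomog2":         # y'' + a y' + b y = f(x)
        return sp.Eq(y(x).diff(x, 2) + a * y(x).diff(x) + b * y(x), random_poly())
    raise ValueError(f"unknown class: {kind}")

for kind in ("separable", "homog2", "inhomog2"):
    eq = sample_equation(kind)
    sol = sp.dsolve(eq, y(x))      # symbolic reference solution
    print(kind, "->", sp.latex(sol))
```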
3.2. Description of Models and Fine-Tuning Protocols
3.3. Metric Description
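For orientation, BLEU (see the Abbreviations) is one natural candidate for scoring generated LaTeX solutions against references. The sketch below computes a BLEU-style score over ad hoc LaTeX tokens with NLTK; the tokenizer and smoothing choice are illustrative assumptions and are not claimed to match the metric configuration used in the paper.

```python
# Minimal sketch of a BLEU-style score over tokenized LaTeX strings using NLTK.
# The tokenizer and smoothing below are illustrative assumptions, not the exact
# metric configuration of the paper.
import re
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def latex_tokens(s):
    """Split a LaTeX string into commands, numbers, letters, and symbols."""
    return re.findall(r"\\[A-Za-z]+|\d+|[A-Za-z]|[{}^_+\-*/=()]", s)

reference = r"y{\left(x \right)} = C_{1} e^{- x} + C_{2} e^{2 x}"
candidate = r"y{\left(x \right)} = C_{1} e^{-x} + C_{2} e^{2x}"

score = sentence_bleu(
    [latex_tokens(reference)],            # list of reference token lists
    latex_tokens(candidate),
    smoothing_function=SmoothingFunction().method1,
)
print(f"BLEU over LaTeX tokens: {score:.3f}")
```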
4. Results
5. Discussion and Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
LLM | Large Language Model |
BLEU | Bilingual Evaluation Understudy |
PINN | Physics-Informed Neural Network |
STEM | Science, Technology, Engineering, and Mathematics |
SSM | State-Space Model |
References
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language Models are Few-Shot Learners. In Advances in Neural Information Processing Systems; Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., Lin, H., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2020; Volume 33, pp. 1877–1901. [Google Scholar]
- OpenAI; Achiam, J.; Adler, S.; Agarwal, S.; Ahmad, L.; Akkaya, I.; Aleman, F.L.; Almeida, D.; Altenschmidt, J.; Altman, S.; et al. GPT-4 Technical Report. arXiv 2024, arXiv:2303.08774. [Google Scholar]
- Howard, J.; Ruder, S. Universal Language Model Fine-tuning for Text Classification. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 328–339. [Google Scholar] [CrossRef]
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 5485–5551. [Google Scholar]
- Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016. [Google Scholar]
- Zhang, C.; Bengio, S.; Hardt, M.; Recht, B.; Vinyals, O. Understanding deep learning (still) requires rethinking generalization. Commun. ACM 2021, 64, 107–115. [Google Scholar] [CrossRef]
- Lewkowycz, A.; Andreassen, A.; Dohan, D.; Dyer, E.; Michalewski, H.; Ramasesh, V.; Slone, A.; Anil, C.; Schlag, I.; Gutman-Solo, T.; et al. Solving quantitative reasoning problems with language models. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS’22, Red Hook, NY, USA, 28 November 2022. [Google Scholar]
- Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 2023, 24, 240:1–240:113. [Google Scholar]
- Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
- Cuomo, S.; Di Cola, V.S.; Giampaolo, F.; Rozza, G.; Raissi, M.; Piccialli, F. Scientific Machine Learning Through Physics–Informed Neural Networks: Where we are and What’s Next. J. Sci. Comput. 2022, 92, 88. [Google Scholar] [CrossRef]
- Shalyt, M.; Elimelech, R.; Kaminer, I. ASyMOB: Algebraic Symbolic Mathematical Operations Benchmark. arXiv 2025, arXiv:2505.23851. [Google Scholar] [CrossRef]
- Zakharov, V.; Surkov, A.; Koltcov, S. AGDES: A Python package and an approach to generating synthetic data for differential equation solving with LLMs. Procedia Comput. Sci. 2025, 258, 1169–1178. [Google Scholar] [CrossRef]
- Hendrycks, D.; Burns, C.; Kadavath, S.; Arora, A.; Basart, S.; Tang, E.; Song, D.; Steinhardt, J. Measuring Mathematical Problem Solving with the MATH Dataset. In Proceedings of the Advances in Neural Information Processing Systems, Online, 6–14 December 2021; Volume 34, pp. 5852–5864. [Google Scholar]
- Cobbe, K.; Kosaraju, V.; Bavarian, M.; Chen, M.; Jun, H.; Kaiser, L.; Plappert, M.; Tworek, J.; Hilton, J.; Nakano, R.; et al. Training Verifiers to Solve Math Word Problems. arXiv 2021, arXiv:2110.14168. [Google Scholar] [CrossRef]
- Nye, M.; Andreassen, A.J.; Gur-Ari, G.; Michalewski, H.; Austin, J.; Bieber, D.; Dohan, D.; Lewkowycz, A.; Bosma, M.; Luan, D.; et al. Show Your Work: Scratchpads for Intermediate Computation with Language Models. arXiv 2021, arXiv:2112.00114. [Google Scholar] [CrossRef]
- Wei, J.; Wang, X.; Schuurmans, D.; Bosma, M.; Ichter, B.; Xia, F.; Chi, E.; Le, Q.V.; Zhou, D. Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. In Advances in Neural Information Processing Systems; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 24824–24837. [Google Scholar]
- Kojima, T.; Gu, S.S.; Reid, M.; Matsuo, Y.; Iwasawa, Y. Large Language Models are Zero-Shot Reasoners. In Advances in Neural Information Processing Systems; Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A., Eds.; Curran Associates, Inc.: Red Hook, NY, USA, 2022; Volume 35, pp. 22199–22213. [Google Scholar]
- Zelikman, E.; Wu, Y.; Mu, J.; Goodman, N.D. STaR: Self-taught reasoner bootstrapping reasoning with reasoning. In Proceedings of the 36th International Conference on Neural Information Processing Systems, NIPS’22, Red Hook, NY, USA, 9 December 2022. [Google Scholar]
- Wang, X.; Wei, J.; Schuurmans, D.; Le, Q.V.; Chi, E.H.; Narang, S.; Chowdhery, A.; Zhou, D. Self-Consistency Improves Chain of Thought Reasoning in Language Models. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Yao, S.; Yu, D.; Zhao, J.; Shafran, I.; Griffiths, T.L.; Cao, Y.; Narasimhan, K. Tree of thoughts: Deliberate problem solving with large language models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, NIPS’23, Red Hook, NY, USA, 10–16 December 2023. [Google Scholar]
- Gao, L.; Madaan, A.; Zhou, S.; Alon, U.; Liu, P.; Yang, Y.; Callan, J.; Neubig, G. PAL: Program-aided Language Models. In Machine Learning Research, Proceedings of the 40th International Conference on Machine Learning, Honolulu, HI, USA, 23–29 July 2023; Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J., Eds.; PMLR: New York, NY, USA, 2023; Volume 202, pp. 10764–10799. [Google Scholar]
- Schick, T.; Dwivedi-Yu, J.; Dessi, R.; Raileanu, R.; Lomeli, M.; Hambro, E.; Zettlemoyer, L.; Cancedda, N.; Scialom, T. Toolformer: Language Models Can Teach Themselves to Use Tools. In Proceedings of the Thirty-Seventh Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023. [Google Scholar]
- Luo, H.; Sun, Q.; Xu, C.; Zhao, P.; Lou, J.G.; Tao, C.; Geng, X.; Lin, Q.; Chen, S.; Tang, Y.; et al. WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct. In Proceedings of the Thirteenth International Conference on Learning Representations, Singapore, 24–28 April 2025. [Google Scholar]
- Madaan, A.; Tandon, N.; Gupta, P.; Hallinan, S.; Gao, L.; Wiegreffe, S.; Alon, U.; Dziri, N.; Prabhumoye, S.; Yang, Y.; et al. Self-Refine: Iterative Refinement with Self-Feedback. arXiv 2023, arXiv:2303.17651. [Google Scholar] [CrossRef]
- Chen, W.; Ma, X.; Wang, X.; Cohen, W.W. Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks. arXiv 2023, arXiv:2211.12588. [Google Scholar] [CrossRef]
- Lu, P.; Bansal, H.; Xia, T.; Liu, J.; Li, C.; Hajishirzi, H.; Cheng, H.; Chang, K.W.; Galley, M.; Gao, J. MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts. arXiv 2024, arXiv:2310.02255. [Google Scholar] [CrossRef]
- Zhao, C.; Tan, Z.; Ma, P.; Li, D.; Jiang, B.; Wang, Y.; Yang, Y.; Liu, H. Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens. arXiv 2025, arXiv:2508.01191. [Google Scholar]
- Ling, Z.; Fang, Y.; Li, X.; Huang, Z.; Lee, M.; Memisevic, R.; Su, H. Deductive Verification of Chain-of-Thought Reasoning. arXiv 2023, arXiv:2306.03872. [Google Scholar]
- Zhang, H.; Da, J.; Lee, D.; Robinson, V.; Wu, C.; Song, W.; Zhao, T.; Raja, P.; Zhuang, C.; Slack, D.; et al. A Careful Examination of Large Language Model Performance on Grade School Arithmetic. arXiv 2024, arXiv:2405.00332. [Google Scholar] [CrossRef]
- Yang, K.; Swope, A.M.; Gu, A.; Chalamala, R.; Song, P.; Yu, S.; Godil, S.; Prenger, R.; Anandkumar, A. LeanDojo: Theorem Proving with Retrieval-Augmented Language Models. arXiv 2023, arXiv:2306.15626. [Google Scholar]
- Nikankin, Y.; Reusch, A.; Mueller, A.; Belinkov, Y. Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics. arXiv 2025, arXiv:2410.21272. [Google Scholar]
- Nasr, M.; Carlini, N.; Hayase, J.; Jagielski, M.; Cooper, A.F.; Ippolito, D.; Choquette-Choo, C.A.; Wallace, E.; Tramèr, F.; Lee, K. Scalable Extraction of Training Data from (Production) Language Models. arXiv 2023, arXiv:2311.17035. [Google Scholar] [CrossRef]
- Morris, J.X.; Sitawarin, C.; Guo, C.; Kokhlikyan, N.; Suh, G.E.; Rush, A.M.; Chaudhuri, K.; Mahloujifar, S. How much do language models memorize? arXiv 2025, arXiv:2505.24832. [Google Scholar] [CrossRef]
- Lample, G.; Charton, F. Deep Learning for Symbolic Mathematics. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 26–30 April 2020. [Google Scholar]
- Huang, X.; Shen, Q.; Hu, Y.; Gao, A.; Wang, B. LLMs for Mathematical Modeling: Towards Bridging the Gap between Natural and Mathematical Languages. In Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, NM, USA, 29 April–4 May 2025; pp. 2678–2710. [Google Scholar] [CrossRef]
- Sun, J.; Liu, Y.; Zhang, Z.; Schaeffer, H. Towards a foundation model for partial differential equations: Multioperator learning and extrapolation. Phys. Rev. E 2025, 111, 035304. [Google Scholar] [CrossRef]
- Du, M.; Chen, Y.; Wang, Z.; Nie, L.; Zhang, D. Large language models for automatic equation discovery of nonlinear dynamics. Phys. Fluids 2024, 36, 097121. [Google Scholar] [CrossRef]
- Kuriam, J. Differential-Equations. 2016. Available online: https://github.com/JaKXz/Differential-Equations (accessed on 9 August 2025).
- Filippov, A. Sbornik Zadach Po Differentsialnym Uravneniiam [Collection of Problems on Differential Equations]; Regular and Chaotic Dynamics: Izhevsk, Russia, 2000. [Google Scholar]
- Lu, Q.; Dou, D.; Nguyen, T. ClinicalT5: A Generative Language Model for Clinical Text. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, 7–11 December 2022; pp. 5436–5443. [Google Scholar] [CrossRef]
- Li, Y.; Harrigian, K.; Zirikly, A.; Dredze, M. Are Clinical T5 Models Better for Clinical Text? arXiv 2024, arXiv:2412.05845. [Google Scholar] [CrossRef]
- Athugodage, M.; Mitrofanova, O.; Gudkov, V. Transfer Learning for Russian Legal Text Simplification. In Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024, Torino, Italy, 20 May 2024; pp. 59–69. [Google Scholar]
- Zhang, W.; Shen, H.; Lei, T.; Wang, Q.; Peng, D.; Wang, X. GLQA: A Generation-based Method for Legal Question Answering. In Proceedings of the 2023 International Joint Conference on Neural Networks (IJCNN), Queensland, Australia, 18–23 June 2023; pp. 1–8. [Google Scholar] [CrossRef]
- Poornima, A.; Nagaraja, K.V.; Venugopalan, M. Legal Contract Analysis and Risk Assessment Using Pre-Trained Legal-T5 and Law-GPT. In Proceedings of the 2025 3rd International Conference on Integrated Circuits and Communication Systems (ICICACS), Raichur, India, 21–22 February 2025; pp. 1–8. [Google Scholar] [CrossRef]
- Li, Y.; Wang, S.; Ding, H.; Chen, H. Large Language Models in Finance: A Survey. In Proceedings of the Fourth ACM International Conference on AI in Finance, ICAIF’23, New York, NY, USA, 27–29 November 2023; pp. 374–382. [Google Scholar] [CrossRef]
- Xia, Y.; Huang, Y.; Qiu, Q.; Zhang, X.; Miao, L.; Chen, Y. A Question and Answering Service of Typhoon Disasters Based on the T5 Large Language Model. ISPRS Int. J. Geo-Inf. 2024, 13, 165. [Google Scholar] [CrossRef]
- Luo, J.; Chen, Z.; Chen, W.; Lu, H.; Lyu, F. A study on the application of the T5 large language model in encrypted traffic classification. Peer-Netw. Appl. 2025, 18, 15. [Google Scholar] [CrossRef]
- Liao, X.; Zhu, B.; He, J.; Liu, G.; Zheng, H.; Gao, J. A Fine-Tuning Approach for T5 Using Knowledge Graphs to Address Complex Tasks. arXiv 2025, arXiv:2502.16484. [Google Scholar] [CrossRef]
- Gu, A.; Dao, T. Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv 2024, arXiv:2312.00752. [Google Scholar]
- Waleffe, R.; Byeon, W.; Riach, D.; Norick, B.; Korthikanti, V.; Dao, T.; Gu, A.; Hatamizadeh, A.; Singh, S.; Narayanan, D.; et al. An Empirical Study of Mamba-based Language Models. arXiv 2024, arXiv:2406.07887. [Google Scholar] [CrossRef]
- Zhang, H.; Zhu, Y.; Wang, D.; Zhang, L.; Chen, T.; Wang, Z.; Ye, Z. A Survey on Visual Mamba. Appl. Sci. 2024, 14, 5683. [Google Scholar] [CrossRef]
- Schiff, Y.; Kao, C.H.; Gokaslan, A.; Dao, T.; Gu, A.; Kuleshov, V. Caduceus: Bi-directional equivariant long-range DNA sequence modeling. In Proceedings of the 41st International Conference on Machine Learning, JMLR.org, ICML’24, Vienna, Austria, 21–27 July 2024. [Google Scholar]
- Erol, M.H.; Senocak, A.; Feng, J.; Chung, J.S. Audio Mamba: Bidirectional State Space Model for Audio Representation Learning. IEEE Signal Process. Lett. 2024, 31, 2975–2979. [Google Scholar] [CrossRef]
- Ma, H.; Chen, Y.; Zhao, W.; Yang, J.; Ji, Y.; Xu, X.; Liu, X.; Jing, H.; Liu, S.; Yang, G. A Mamba Foundation Model for Time Series Forecasting. arXiv 2024, arXiv:2411.02941. [Google Scholar] [CrossRef]
- DeepSeek-AI; Guo, D.; Yang, D.; Zhang, H.; Song, J.; Zhang, R.; Xu, R.; Zhu, Q.; Ma, S.; Wang, P.; et al. DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. arXiv 2025, arXiv:2501.12948. [Google Scholar]
- Microsoft; Abouelenin, A.; Ashfaq, A.; Atkinson, A.; Awadalla, H.; Bach, N.; Bao, J.; Benhaim, A.; Cai, M.; Chaudhary, V.; et al. Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs. arXiv 2025, arXiv:2503.01743. [Google Scholar]
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, ACL’02, Philadelphia, PA, USA, 6–12 July 2002; pp. 311–318. [Google Scholar] [CrossRef]
- Wang, B.; Wu, F.; Ouyang, L.; Gu, Z.; Zhang, R.; Xia, R.; Zhang, B.; He, C. Image over Text: Transforming Formula Recognition Evaluation with Character Detection Matching. arXiv 2025, arXiv:2409.03643. [Google Scholar]
- Jung, K.; Kim, N.J.; Ryu, H.G.; Hyeon, S.; Lee, S.J.; Lee, H.J. TeXBLEU: Automatic Metric for Evaluate LaTeX Format. In Proceedings of the ICASSP 2025—2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 6–11 April 2025; pp. 1–5. [Google Scholar] [CrossRef]
- Mahdavi, M.; Zanibbi, R. Tree-based structure recognition evaluation for math expressions. In Proceedings of the GREC, Sydney, Australia, 20–21 September 2019; pp. 9–10. [Google Scholar]
Class | Number of Equations |
---|---|
Polynomial (generated) | 7600 |
With separable variables (generated) | 242,753 |
Second-order homogeneous (generated) | 9405 |
Third-order homogeneous (generated) | 6498 |
Second-order inhomogeneous (generated) | 15,751 |
Third-order inhomogeneous (generated) | 6369 |
Manually collected from textbooks | 1078 |
Model | Separable Variables | Inhomog. 2nd Order | Homog. 2nd Order | Polynomial | Inhomog. 3rd Order | Homog. 3rd Order | Hand-Labeled |
---|---|---|---|---|---|---|---|
Mamba 130M | 0.47230 (±0.39) | 0.3298 (±0.17) | 0.1922 (±0.15) | 0.1013 (±0.08) | 0.5048 (±0.21) | 0.2162 (±0.19) | 0.6346 (±0.36) |
Mamba 2.8B | 0.2666 (±0.19) | 0.2735 (±0.17) | 0.5496 (±0.46) | 0.1688 (±0.24) | 0.4525 (±0.19) | 0.3433 (±0.41) | 0.6068 (±0.38) |
T5 | 0.7715 (±0.08) | 0.5328 (±0.06) | 0.1477 (±0.05) | 0.6030 (±0.17) | 0.5142 (±0.08) | 0.08163 (±0.02) | 0.08653 (±0.11) |
Phi-4-mini | 0.9102 (±0.12) | 0.9216 (±0.14) | 0.9140 (±0.07) | 0.9219 (±0.006) | 0.9009 (±0.39) | 0.9107 (±0.02) | 0.9279 (±0.23) |
Deepseek Qwen | 0.2715 (±0.11) | 0.2773 (±0.16) | 0.2681 (±0.05) | 0.2731 (±0.24) | 0.2728 (±0.27) | 0.2807 (±0.02) | 0.2391 (±0.07) |
Model | Separable Variables | Inhomog. 2nd Order | Homog. 2nd Order | Polynomial | Inhomog. 3rd Order | Homog. 3rd Order | Hand-Labeled |
---|---|---|---|---|---|---|---|
Mamba 130M | 0.999994 (±0.0001) | 1 (±0) | 0.999937 (±0.0004) | 1 (±0) | 0.999999 (±0.00001) | 1 (±0) | 0.999995 (±0.00004) |
Mamba 2.8B | 1 (±0) | 1 (±0.00001) | 1 (±0) | 1 (±0) | 1 (±0) | 1 (±0) | 0.999968 (±0.00014) |
T5 | 0.753553 (±0.015) | 0.733906 (±0.008) | 0.795395 (±0.005) | 0.884749 (±0.0569) | 0.741548 (±0.008) | 0.790599 (±0.009) | 0.791326 (±0.061) |
Phi-4-mini | 0.926765 (±0.083) | 0.982052 (±0.046) | 0.943722 (±0.078) | 0.999815 (±0.002) | 0.957788 (±0.078) | 0.917312 (±0.08) | 0.787775 (±0.05) |
Deepseek Qwen | 0.926764 (±0.08) | 0.982052 (±0.017) | 0.943722 (±0.076) | 0.999815 (±0.0004) | 0.957787 (±0.036) | 0.917312 (±0.08) | 0.787775 (±0.051) |
Equation Type | Mamba 130M | Mamba 2.8B | T5 | Phi-4-mini | Deepseek Qwen |
---|---|---|---|---|---|
separable variables | redundant terms of form , arbitrary symbols and words from natural language | redundant arbitrary math expressions | absence of “\” and “{}” in math functions; arbitrary math expressions | incorrect functions (for example, arctg instead of cos) | incorrect functions (e.g., exponential instead of trigonometric) |
inhomog. 2nd order | long redundant arbitrary math expressions | redundant arbitrary math expressions | absence of “\” and “{}” in math functions | incorrect coefficients in math functions | redundant expressions |
homog. 2nd order | redundant terms of form , arbitrary symbols | redundant terms of form and | absence of “\” and “{}” in math functions | incorrect coefficients in exponential functions | no typical mistakes |
polynomial | many redundant terms of form | many redundant terms of form and | absence of “\” and “{}” in math functions | incorrect coefficients in power functions | many redundant power functions |
inhomog. 3rd order | long redundant arbitrary math expressions | redundant arbitrary terms with trigonometric functions | absence of “\” and “{}” in math functions; shorter expression than the true answers | generation of shorter expression containing parts of true answers | generation of shorter expression containing parts of true answers |
homog. 3rd order | redundant arbitrary math expressions, symbols and words from natural language | redundant arbitrary math terms | absence of “\” before math functions and “{}” in math functions | incorrect coefficients in exponential functions | no typical mistakes |
hand-labeled | arbitrary math expressions | redundant arbitrary math expressions and symbols | absence of “\” and “{}” in math functions; arbitrary math expressions | arbitrary math functions | arbitrary math functions |
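Several of the error patterns above are purely notational (a missing "\" before a function name, or missing braces) rather than mathematical, which motivates the call for LaTeX-aware metrics. The sketch below applies a tolerant, entirely illustrative normalization before string comparison; the rules are assumptions for demonstration, not a metric proposed in the paper.

```python
# Illustrative sketch (not from the paper): a tolerant normalization that
# forgives some notational errors listed above -- a missing "\" before a
# function name, or missing/extra braces -- before comparing strings.
import re

FUNCS = ("sin", "cos", "tan", "exp", "ln", "log", "sqrt")

def normalize_latex(s):
    s = re.sub(r"\s+", "", s)                            # drop whitespace
    s = s.replace(r"\left", "").replace(r"\right", "")   # ignore sizing commands
    for f in FUNCS:                                      # restore a missing "\"
        s = re.sub(rf"(?<!\\){f}", rf"\\{f}", s)         # (naive: no word-boundary check)
    return s.replace("{", "").replace("}", "")           # ignore braces entirely

generated = "y(x) = C_1 e^{-x} + sin(x)"                 # missing "\" before sin
reference = r"y{\left(x\right)} = C_{1} e^{- x} + \sin{\left(x \right)}"
print(normalize_latex(generated) == normalize_latex(reference))   # True
```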
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Citation: Ignatenko, V.; Surkov, A.; Zakharov, V.; Koltcov, S. Transformers and State-Space Models: Fine-Tuning Techniques for Solving Differential Equations. Sci 2025, 7, 130. https://doi.org/10.3390/sci7030130