Increasing the Reliability of Software Systems Using a Large-Language-Model-Based Solution for Onboarding
Abstract
1. Introduction
2. Materials and Methods
2.1. The Literature and the State of the Art
2.2. Materials Used
- Chat Interface: Users interact with a chatbot that uses a sophisticated conversational model. This model is able to answer questions and address concerns in a natural and engaging way.
- Confluence Integration: The chatbot is linked to Confluence, which is a popular knowledge management platform. This allows the chatbot to access and retrieve a vast amount of information relevant to newcomers.
- State-of-the-Art Model: The project relies on a cutting-edge conversational model, ensuring that users receive up-to-date and accurate information.
- Internal Infrastructure: To guarantee data security, the entire system, including the chatbot model, runs on CERN’s internal infrastructure. This means no data leave CERN’s secure network.
- Simplified Access to Information: Newcomers will not need to search through various resources; they can simply ask the chatbot their questions and receive answers directly (a minimal sketch of this question-to-answer flow is given after this list).
- Improved Onboarding Experience: By providing a user-friendly and informative platform, the chatbot can significantly improve the onboarding experience for newcomers to BC.
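To make this flow concrete, the sketch below shows how a newcomer's question could be routed through the system: the question is matched against Confluence, the retrieved excerpts are folded into a prompt, and the prompt is sent to a model served on internal infrastructure. The endpoint URLs, the `search_confluence` helper, and the response fields are illustrative assumptions rather than the actual implementation.

```python
import requests

# Hypothetical internal endpoints; the real service names and URLs are not
# part of this description and are used only to illustrate the flow.
CONFLUENCE_SEARCH_URL = "https://confluence.internal.example/rest/api/search"
LLM_INFERENCE_URL = "https://llm.internal.example/v1/generate"


def search_confluence(question: str, limit: int = 3) -> list[str]:
    """Retrieve excerpts of Confluence pages that match the user's question."""
    response = requests.get(
        CONFLUENCE_SEARCH_URL,
        params={"cql": f'text ~ "{question}"', "limit": limit},
        timeout=10,
    )
    response.raise_for_status()
    # The exact response shape depends on the Confluence version; the
    # "excerpt" field is assumed here.
    return [hit.get("excerpt", "") for hit in response.json().get("results", [])]


def answer_question(question: str) -> str:
    """Combine retrieved documentation with the question and ask the model."""
    context = "\n\n".join(search_confluence(question))
    prompt = (
        "Answer the newcomer's question using only the documentation below.\n\n"
        f"Documentation:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    response = requests.post(LLM_INFERENCE_URL, json={"prompt": prompt}, timeout=60)
    response.raise_for_status()
    return response.json()["text"]
```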
3. Implementation Details
3.1. Technology Stack and Model Selection
3.2. Prompt Engineering
- The system prompt: gives the model context about the style in which it should answer. It should influence the model’s responses only stylistically and should not be used as a source of information for the user.
- The context prompt: this is used to inject the pieces of data that the system uses to provide answers. The process is described in a later section, but the idea is that the user’s input is transformed into a query that is matched against the base of documentation provided in the system. In this way, the model can give back “informed answers” by consulting the essential bits of documentation.
- The user prompt: this focuses the model on the question or statement given by the user.
- The assistant prompt: this is left empty, as the model essentially acts as a smart completion tool, filling in the text after the assistant prompt, which is then displayed to the user in the interface (a minimal sketch of how these sections are assembled follows this list).
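As an illustration of how the four sections fit together, the sketch below concatenates them into a single prompt string. The `<|system|>`-style delimiters and the example strings are placeholders; the actual separators depend on the chat template of the model that is deployed.

```python
def build_prompt(system: str, context: str, user: str) -> str:
    """Assemble the system, context, and user prompts into one string.

    The assistant section is intentionally left empty: the model completes
    the text after the final marker, and that completion is what the chat
    interface shows to the user.
    """
    return (
        f"<|system|>\n{system}\n"
        f"<|context|>\n{context}\n"
        f"<|user|>\n{user}\n"
        f"<|assistant|>\n"  # left empty on purpose; the model fills this in
    )


prompt = build_prompt(
    system="You are a helpful onboarding assistant. Answer concisely.",
    context="Excerpts of internal documentation retrieved for this question.",
    user="How do I request access to the test environment?",
)
```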
3.3. Key Processes
3.3.1. Documentation Ingestion and Preparation
3.3.2. Memory
3.3.3. Back-End and Deployment
4. Results and Discussion
- Memory: Can the model recall bits and pieces from previous points in the conversation?
- Context: Does the model provide good information based on the queries? Are the relevant articles retrieved so that the user is pointed to the right corner of the internal documentation?
- Accuracy: Is the correct information given back to the user? Does the response make sense for a given query? (One illustrative way to encode such checks as concrete test cases is sketched after this list.)
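The sketch below encodes each evaluation case as a small record that covers the three criteria; the field values are hypothetical and only illustrate the shape such a test set could take.

```python
from dataclasses import dataclass


@dataclass
class EvaluationCase:
    """One manual evaluation case covering memory, context, and accuracy."""
    question: str              # the query sent to the chatbot
    follow_up: str             # a later turn that relies on earlier answers (memory)
    expected_article: str      # the documentation page that should be retrieved (context)
    expected_answer_hint: str  # a key fact the answer must contain (accuracy)


cases = [
    EvaluationCase(
        question="How do I get access to the build server?",
        follow_up="And who approves that request?",
        expected_article="Onboarding/Build-Server-Access",
        expected_answer_hint="access request form",
    ),
]
```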
- Retrieval-augmented generation (RAG): This method is already used in our chatbot, as one of the main goals of the implementation was to help with domain-specific knowledge when newcomers onboard. We currently search a database of existing knowledge to provide context (knowledge retrieval), but there are a few more techniques that could be explored (one of them is sketched after this list):
  - LLM augmentor: small modules inside an LLM architecture are tuned for a specific task.
  - High-entropy word spotting and replacement: words with high entropy (i.e., words about which the model is highly uncertain) are replaced by synonyms in order to obtain better answers from the model.
  - Decompose and query: a technique in which the prompt is decomposed into sub-queries that are each treated separately by the model in order to obtain more precise answers.
- Self-reflection: This is a technique assessed by Renze and Guven [18] that tries to mimic the process humans go through when self-reflecting. A prompt such as “solve this problem, walk me through the steps” may yield a more accurate response than “solve this problem”, as the model appears to course-correct in the middle of the text generation process by using the previous steps it described.
- More advanced implementations such as ChatProtect identify the individual sentences in which the model contradicts itself and regenerate those sentences so that the answer is more consistent [19].
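As a sketch of the decompose-and-query idea mentioned above, the snippet below splits a question into sub-questions, answers each one against its own retrieved context, and merges the partial answers. The `ask_model` and `retrieve_context` callables stand in for the deployed inference endpoint and the documentation search; they are assumptions for illustration, not part of the current implementation.

```python
def decompose_and_query(question: str, ask_model, retrieve_context) -> str:
    """Answer a question by decomposing it into sub-queries."""
    # Step 1: ask the model to split the question into independent sub-questions.
    decomposition = ask_model(
        "Split the following question into short, independent sub-questions, "
        f"one per line:\n{question}"
    )
    sub_questions = [line.strip() for line in decomposition.splitlines() if line.strip()]

    # Step 2: answer each sub-question with its own retrieved context.
    partial_answers = []
    for sub in sub_questions:
        context = retrieve_context(sub)
        partial_answers.append(
            ask_model(f"Documentation:\n{context}\n\nQuestion: {sub}\nAnswer:")
        )

    # Step 3: ask the model to merge the partial answers into one concise reply.
    merged = "\n".join(f"- {answer}" for answer in partial_answers)
    return ask_model(
        f"Combine these partial answers into one concise reply to '{question}':\n{merged}"
    )
```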
Potential Impact Evaluation
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Ju, A.; Sajnani, H.; Kelly, S.; Herzig, K. A case study of onboarding in software teams: Tasks and strategies. In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), Madrid, Spain, 22–30 May 2021; pp. 613–623.
2. Sharma, G.G.; Stol, K.J. Exploring onboarding success, organizational fit, and turnover intention of software professionals. J. Syst. Softw. 2020, 159, 110442.
3. Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; pp. 38–45.
4. Stojkovic, J.; Choukse, E.; Zhang, C.; Goiri, I.; Torrellas, J. Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference. arXiv 2024, arXiv:2403.20306.
5. Balfroid, M.; Vanderose, B.; Devroey, X. Towards LLM-Generated Code Tours for Onboarding. In Proceedings of the Workshop on NL-based Software Engineering (NLBSE’24), Lisbon, Portugal, 20 April 2024.
6. Jain, S.M. Hugging Face. In Introduction to Transformers for NLP: With the Hugging Face Library and Models to Solve Problems; Springer: Berlin/Heidelberg, Germany, 2022; pp. 51–67.
7. Wu, Y.; Sun, Z.; Yuan, H.; Ji, K.; Yang, Y.; Gu, Q. Self-Play Preference Optimization for Language Model Alignment. arXiv 2024, arXiv:2405.00675.
8. Dubois, Y.; Li, C.X.; Taori, R.; Zhang, T.; Gulrajani, I.; Ba, J.; Guestrin, C.; Liang, P.S.; Hashimoto, T.B. AlpacaFarm: A simulation framework for methods that learn from human feedback. arXiv 2024, arXiv:2305.14387.
9. Zhao, J.; Zhang, Z.; Chen, B.; Wang, Z.; Anandkumar, A.; Tian, Y. GaLore: Memory-efficient LLM training by gradient low-rank projection. arXiv 2024, arXiv:2403.03507.
10. White, J.; Fu, Q.; Hays, S.; Sandborn, M.; Olea, C.; Gilbert, H.; Elnashar, A.; Spencer-Smith, J.; Schmidt, D.C. A prompt pattern catalog to enhance prompt engineering with ChatGPT. arXiv 2023, arXiv:2302.11382.
11. Gao, Y.; Xiong, Y.; Gao, X.; Jia, K.; Pan, J.; Bi, Y.; Dai, Y.; Sun, J.; Wang, H. Retrieval-augmented generation for large language models: A survey. arXiv 2023, arXiv:2312.10997.
12. Pan, J.J.; Wang, J.; Li, G. Survey of vector database management systems. arXiv 2023, arXiv:2310.14021.
13. Topsakal, O.; Akinci, T.C. Creating large language model applications utilizing LangChain: A primer on developing LLM apps fast. In Proceedings of the International Conference on Applied Engineering and Natural Sciences 2023, Konya, Turkey, 10–12 July 2023; Volume 1, pp. 1050–1056.
14. Karau, H.; Lublinsky, B. Scaling Python with Ray; O’Reilly Media, Inc.: Newton, MA, USA, 2022.
15. Lathkar, M. Getting started with FastAPI. In High-Performance Web Apps with FastAPI: The Asynchronous Web Framework Based on Modern Python; Springer: Berlin/Heidelberg, Germany, 2023; pp. 29–64.
16. Ji, Z.; Yu, T.; Xu, Y.; Lee, N.; Ishii, E.; Fung, P. Towards mitigating LLM hallucination via self reflection. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, 6–10 December 2023; pp. 1827–1843.
17. Tonmoy, S.; Zaman, S.; Jain, V.; Rani, A.; Rawte, V.; Chadha, A.; Das, A. A comprehensive survey of hallucination mitigation techniques in large language models. arXiv 2024, arXiv:2401.01313.
18. Renze, M.; Guven, E. Self-Reflection in LLM Agents: Effects on Problem-Solving Performance. arXiv 2024, arXiv:2405.06682.
19. Mündler, N.; He, J.; Jenko, S.; Vechev, M. Self-contradictory hallucinations of large language models: Evaluation, detection and mitigation. arXiv 2023, arXiv:2305.15852.
| Time Spent Onboarding | Potential Days Gained |
|---|---|
| 30% | 39.3 |
| 25% | 32.75 |
| 15% | 19.65 |
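One reading consistent with these figures is that the “Potential Days Gained” column applies the saved share of onboarding time to a fixed onboarding period of roughly 131 working days; this baseline is inferred from the values rather than stated explicitly.

```latex
\[
\text{days gained} = p \times T_{\text{onboarding}}, \qquad
0.30 \times 131 = 39.3, \quad
0.25 \times 131 = 32.75, \quad
0.15 \times 131 = 19.65.
\]
```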