Article

LLM Based Chatbot for Farm-to-Fork Blockchain Traceability Platform

1
School of Technology and Management, Polytechnic of Leiria, 2411-901 Leiria, Portugal
2
Computer Science and Communication Research Centre (CIIC), 2411-901 Leiria, Portugal
3
Sensefinity, 1749-016 Lisboa, Portugal
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(19), 8856; https://doi.org/10.3390/app14198856
Submission received: 17 July 2024 / Revised: 28 August 2024 / Accepted: 27 September 2024 / Published: 2 October 2024
(This article belongs to the Section Computing and Artificial Intelligence)

Abstract
Blockchain technology has been used to great effect in farm-to-fork traceability projects. However, this technology presents a steep learning curve when it comes to its user interface. To minimize this difficulty, we created a solution based on a Large Language Model (LLM) conversational agent. Our implementation follows a Retrieval-Augmented Generation (RAG) approach, starting with an existing knowledge base that is prepared and processed with an embedding model and stored in a vector database. Non-textual media such as images and videos are aggregated with the embeddings to enrich the user experience. User queries are combined with a proximity search in the vector database and fed into an LLM that considers the conversation history with the user in its replies. Given the asynchronous nature of these models, we implemented a similarly asynchronous scheme using Server-Sent Events that delivers the models’ replies to a UI that supports multimodal media types, such as images and videos, by rendering these resources inline. The end solution allows users to interact with advanced technologies through a natural language interface; this in turn empowers food traceability projects to overcome their natural difficulty in engaging early adopters.

1. Introduction

Conversational agents (CAs) play a crucial role in today’s systems, especially those where the interaction between the end user and a platform requires real-time support and assistance. A CA allows seamless communication between the parties using, for example, a chatbot, where the agent can answer common queries from platform users without the need for a dedicated human support team. This automation also enables the platform to scale the technical support it can provide to end users. Nowadays, Large Language Models (LLMs) are transforming how we interact with technology, enabling the creation of more efficient and natural communication tools. From a practical point of view, LLMs can handle tasks like text generation, question answering, and language translation, which are key features of systems that support a CA.
Agenda Blockchain is a project that integrates a consortium of companies with the main objective of driving the national development of technological solutions using blockchain [1], taking advantage of the business opportunities and innovations this technology offers. Agenda Blockchain is subdivided into various Work Packages.
This project is part of Work Package 1—Agriculture and Agro-food—which aims to enable farm-to-fork traceability through the development of specialized Internet of Things (IoT) solutions and their integration with Distributed Ledger Technology (DLT)/blockchain information systems. With this technology, it is possible to monitor the entire food process, from production to the consumer’s plate, ensuring transparency at every stage. The end goal of a farm-to-fork traceability solution is to build consumer confidence by providing an immutable store of all the relevant data that pertains to a particular food product.
One example is livestock: sensor data for its entire lifespan is stored in an immutable ledger along with any relevant interventions (such as medication administered by a veterinarian), together with sensor data collected after slaughter and processing, including transportation to the final store. This is where distributed ledgers shine, allowing a consumer to fully understand the lifecycle of the produce they are consuming.
Thus, this project aims to contribute to the digital transformation of the agri-food sector, promoting traceability, efficiency, and transparency in the supply chain, as well as reinforcing the national blockchain industry and positioning Portugal as a global leader in this technology. It is important to state that the overall project is ongoing and that this research effort addresses a specific identified necessity: supporting users who may have low technical skills as they interact with a complex technological stack. The full traceability solution is being built atop Hyperledger [2], a permissioned blockchain, with several organizations in the consortium deploying a participating node.
Oracles are commonly used mechanisms to consume IoT sensor network data and consistently create transactions on the chain. Oracles are trusted third-party services that provide external data to smart contracts, allowing the blockchain to interact with real-world information.
Furthermore, a service layer is being developed as a first layer of abstraction between the blockchain and the necessary business processes.
The general objective of our research efforts is to create a conversational agent (CA) that enables the user of a farm-to-fork traceability platform, which utilizes advanced technologies like distributed ledgers, to extract useful information regarding the produce they intend to buy. Blockchain technology, being in its early stages, does not provide a good user experience. From the need to manage a wallet to the necessity of understanding transaction costs, almost all aspects of this space are poorly suited to the average consumer, who is the ultimate end user of the underlying project. The recent advances in Artificial Intelligence (AI) provide a great opportunity to bridge this gap. To achieve this goal, we propose an architecture that constructs a knowledge base (using information provided in PDF files or another computer-readable format available within the supply-chain platform) and subsequently makes it available to a Large Language Model (LLM) integrated into a user support conversational agent, so that the user can interact with the platform with the help of this chatbot.
Thus, this project aims to provide the platform’s frontend with the capability to answer user questions in an adapted manner, whether through text, images, video, or any other computer-readable resource, to ensure a better user experience.
The remaining chapters of this paper are organized as follows: in the Background chapter, we discuss the relevant technologies and their behavior. The next chapter, Related Works, showcases other researchers’ work in line with this project. The Architecture chapter presents our system design proposal, including a view of its flow and function. Following is the Development chapter, where we detail our implementation, and the Results and Discussion chapter, where we present our tests; finally, we present the findings in the Conclusion chapter, along with some suggestions for future work.

2. Background

In this chapter, we present the most relevant background concepts that underpin the work developed in this project.

2.1. Large Language Models

Large Language Models (LLMs) are a type of Artificial Intelligence created using Machine Learning techniques, specifically Deep Learning [3]. These models can recognize, extract, summarize, and generate text based on the knowledge acquired during their training, obtained from large datasets. They share the common ability to process and generate text like natural language, a capacity referred to as Natural Language Processing (NLP) [3]. However, the performance of these models depends not only on the quantity of data used during training but also on the quality of the information [1].
These models respond to prompts, which initiate the beginning of a conversation. In the context of LLMs, a prompt can be defined as a text input provided to a language model to guide it in text generation. It functions as a stimulus for the model, directing text generation based on the provided input. Crafting these prompts to ensure an optimal response from a model is therefore crucial and corresponds to an area called prompt engineering [3]. There is no perfect mechanism (recipe) that can create prompts suitable for all possible cases; however, there are several best practices that help achieve consistently good results.
Prompt engineering involves designing and optimizing the prompts used with NLP systems such as ChatGPT, chatbots, or virtual assistants. This means crafting clear, concise, and effective instructions to obtain the desired response. Prompt engineering is just one part of the model optimization process; another essential part is choosing how the text is generated. For instance, we can alter how the model selects each subsequent token when generating text. By adjusting these decoding parameters, we can reduce repetition in the generated text and make it read more like human-written text. This complements fine-tuning, in which the model itself is adapted through additional training [3].
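To make these practices concrete, the sketch below shows a prompt that pairs explicit system instructions and constraints with the user’s question. The wording and function name are our own illustration, not the exact prompt used in the project.

```python
def build_prompt(question: str) -> str:
    # System instructions define the role and constrain the answer scope,
    # following common prompt-engineering best practices.
    system = (
        "You are a support assistant for a farm-to-fork traceability platform. "
        "Answer concisely and only about the platform; if you do not know, say so."
    )
    return f"{system}\n\nQuestion: {question}\nAnswer:"

prompt = build_prompt("How can I log in to the platform?")
```

A template like this keeps the instructions stable across queries while only the question varies, which makes the model’s behavior more predictable.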

2.2. Embeddings

In mathematics, a vector is a set of numbers that defines a point in a multidimensional space. Machine Learning algorithms leverage this concept to search for similar vectors and, by extension, similar objects.
Embeddings [4] are representations of values or objects such as text, images, and audio to be consumed by AI models based on Machine Learning. These vector representations make data analysis operations faster and more efficient.
Transposing this concept to the context of Large Language Models, each word can be converted into an embedding, which means that sentences, paragraphs, and articles can be searched and analyzed as shown in Figure 1.
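As a minimal illustration of proximity in a vector space, the toy vectors below (hand-picked, far smaller than the hundreds of dimensions a real embedding model emits) show how cosine similarity ranks semantically related words closer together:

```python
import math

# Toy 4-dimensional "embeddings"; real models emit 768+ dimensions.
embeddings = {
    "cow":     [0.9, 0.1, 0.0, 0.2],
    "cattle":  [0.8, 0.2, 0.1, 0.3],
    "invoice": [0.0, 0.9, 0.8, 0.1],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

# Semantically related words sit closer together in the vector space.
similar = cosine_similarity(embeddings["cow"], embeddings["cattle"])
distant = cosine_similarity(embeddings["cow"], embeddings["invoice"])
```

Here `similar` comes out much larger than `distant`, which is exactly the property that similarity search over embeddings exploits.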

2.3. Vector Databases

The history of vector databases dates to the early 2000s, when researchers at the University of California, Berkeley, began developing a new type of database aimed at storing and querying high-dimensional vectors. The first commercial vector database was launched in 2010 by VectorWise, which was later acquired by Actian in 2011 [5].
In recent years, there has been growing interest in vector databases due to advancements in Artificial Intelligence and Machine Learning applications. These applications generate and utilize high-dimensional vectors to represent data. One of the initial uses of vector databases was to store and query high-dimensional data efficiently, such as images, text, and sensor data.
Currently, there are several popular vector databases available, including Pinecone [6], Chroma [7], and FAISS (Facebook AI Similarity Search) [8]. These vector databases are driving the industry towards a future where understanding data is not just a challenge but an opportunity.

2.4. Retrieval-Augmented Generation

Retrieval-Augmented Generation (RAG) [9] is a technique used to enhance the accuracy and reliability of generative Artificial Intelligence models with facts sourced from external data sets.
RAG provides models with sources of information that can be cited and consulted by the user, thus enhancing confidence in the searches conducted. In addition to enhancing confidence in the user, this technique helps the model eliminate ambiguity in the user’s query, meaning it reduces the possibility of the model making a wrong prediction, a phenomenon called hallucination. These hallucinations refer to the generation of nonsensical, grammatically inconsistent, or even incorrect words or phrases. There are several factors that can contribute to these inconsistencies, including insufficient model training data and low-quality or contaminated data [10]. Furthermore, the lack of context or constraints provided to the model can cause these hallucinations, which can have a significant impact resulting in inaccurate or wrong responses. Thus, one of the main challenges is to develop strategies and techniques that minimize the occurrence of these hallucinations, ensuring more accurate and reliable responses [10].
With the RAG technique, users gain the ability to have conversations with data repositories, thus opening a whole new range of possibilities and experiences. For example, a model fed with a database of medical records can be a good assistant for a doctor or nurse, just as financial analysts would benefit from an assistant fed with market data.
All companies can transform their information, be it technical manuals, FAQs, videos, or even knowledge base records, to feed and enhance an LLM. These knowledge bases can be transformed, for example, into customer support systems or training systems, thus increasing productivity.
As the name suggests, the RAG technique consists of two parts, one of retrieval and the other of generation. However, it is easier to understand if we think of it as a sequence of four steps (the first step is conducted once, and the other three steps are conducted as many times as needed). Figure 2 illustrates the RAG architecture.
There are two phases that happen at different times. First, the Bootstrap Phase is where we bootstrap the knowledge that the system will use for context. The first step (1) begins with data cleaning and formatting to define the chunks that will be passed to the embedding model. The resulting embeddings are stored in a vector database. The second phase—Conversational—is where the system interacts with the user by responding to questions using the context stored in the first phase. In the second step (2), a question is created, for example, a question asked by a user. Then in the third step (3), the user’s question is enriched with data collected from the vector database using similarity searches. At this point, the context of the document stored in the vector database is added to the query before it is sent to the Large Language Model (LLM). Finally, in the fourth step (4), a response is generated by the LLM.
The data sources supporting the knowledge bases can be from diverse origins such as PDF files, Word documents, Excel documents, Web pages, or other documents.
After extracting the text, it is necessary to split it into chunks. Next, we must map these chunks to vectors of floating-point numbers, typically with 768 or 1024 dimensions, although higher values are possible. These vectors, called embeddings [4], convert a piece of text into a numerical representation in a vector space. Once the embeddings are created, they need to be stored in a vector database. Storing this type of data in a vector database allows for efficient retrieval of relevant information and provides additional context to queries. This helps ensure more accurate responses while avoiding “nonsensical” responses, termed “hallucinations” [4].
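The bootstrap and retrieval steps can be sketched end to end. The snippet below substitutes a bag-of-words counter for a real embedding model and a plain list for a vector database, purely to illustrate the flow; the corpus and all names are our own invention.

```python
import math
from collections import Counter

# Illustrative corpus standing in for the platform's support documents.
docs = [
    "Livestock sensor data is recorded on the ledger from birth to slaughter.",
    "Transport containers report temperature and humidity during shipping.",
    "Consumers scan a QR code to view the full history of a product.",
]

def embed(text):
    # Stand-in for a learned embedding model: a sparse bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

# Bootstrap phase: embed every chunk once and keep it alongside its text.
index = [(doc, embed(doc)) for doc in docs]

def retrieve(question, k=1):
    # Conversational phase: embed the question and run a similarity search.
    q = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

context = retrieve("What do transport containers report?")[0]
```

In a production RAG pipeline, `embed` would call a model such as one from HuggingFace and `index` would be a vector database like FAISS, but the shape of the flow is the same.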

2.5. Application Architecture

Modern web development builds on design patterns provided by a rich landscape of web frameworks and meta-frameworks. These frameworks can focus on static or dynamic web pages, and can be classified as client-side, concentrating on the user interface, or server-side, prioritizing business rules, architecture, and security.
Nowadays, there are multiple frameworks, each with its own approach and advantages; examples include Vue [11], Angular [12], React [13], and Ruby on Rails [14]. These frameworks tend to implement the most common design patterns, such as server-side rendering, hydration, or static site generation, driving the industry to a state where web development becomes easier and more streamlined. When an application is divided into a request-processing backend and an interactive browser frontend, establishing communication between the components becomes necessary.
For a long time, HTTP has been a staple and go-to choice when deciding which communication protocol to implement; however, the need for a protocol that allows for real-time communication has been increasing for the past few years. This need for real-time communication was also felt when developing the application [15].
For our proposed solution, the usage of Server-Sent Events (SSE) communication is a perfect fit, being defined as a protocol “suited for unidirectional communication from a server to a client” [15].
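On the wire, SSE is a plain-text protocol: each event is one or more `data:` lines terminated by a blank line. A minimal framing helper illustrates this (the JSON payload schema shown is our own illustration, not the project’s exact message format):

```python
import json

def sse_frame(payload):
    # An SSE event is one or more "data:" lines followed by a blank line.
    return f"data: {json.dumps(payload)}\n\n"

frame = sse_frame({"type": "text", "content": "Hello"})
```

A browser `EventSource` (or a fetch-based stream reader) parses each such frame as one event, which is what lets the UI render the model’s reply as it is generated.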

3. Related Work

The way people communicate with companies has been evolving at a rapid pace. For years, communication between individuals and businesses was conducted in person or over the phone. With the internet’s emergence, many communication channels began to arise, such as email, social media, mobile applications, or even through filling out forms.
More recently, with the advent of real-time messaging, there has been a paradigm shift in how people communicate with companies. Faced with these changes, companies have had to adapt and start using intelligent tools to enhance the quality and availability of their customer service. It is in this scenario that chatbots become integrated into companies’ strategies [16]. According to EeuWen [17], chatbots are “intelligent software programs that communicate with the user in natural language via chat and can be used for commercial purposes”.
According to McTear et al. [18], a chatbot operates by recognizing the text sent by the user, interpreting the words and their meaning, formulating a response, or, if the message is unclear, interacting with the user to clarify it, constructing a response, and displaying the response.
There are two types of chatbots for interacting with users: rule-based chatbots and AI-based chatbots. Rule-based chatbots operate based on predefined rules. When they receive a message from the user, they analyze the words and respond according to their programming. If the user uses unknown words, the chatbot will not be able to respond.
AI-based chatbots, on the other hand, can learn and understand natural language, as well as provide responses with the appropriate information. Since these chatbots learn from Artificial Intelligence algorithms and conversation histories, the more they interact with users, the more they improve their accuracy [19].
At the forefront of this technology there are several examples, such as OpenAI, a company recognized for its advancements in Artificial Intelligence and one of the main drivers in the development of advanced chatbots, including the Generative Pre-trained Transformer (GPT). Meta’s Llama 2 is an open LLM whose main difference from other LLMs is that it is freely available to anyone. Claude 3 is the third generation of AI models released by Anthropic. Google Gemini, formerly known as Bard, is a chatbot based on the LaMDA language model family.

4. Proposed Architecture

In this chapter we introduce, analyze, and discuss the proposed architecture for our research problem.

4.1. General Architecture

Firstly, we should focus on the general architecture that was implemented. Figure 3 represents the simplified flow and function of the overall application.
The process begins with the submission of the user prompt (in the frontend), which is sent to the backend. To answer the user prompt, the backend needs to find the right answer in the knowledge database. The information is extracted from supporting documents of the Agri-food platform. These documents are available in a digital format and were parsed and stored in our local database. Those data are combined with the initial user prompt and fed to the LLM that generates an adequate answer, which is sent to the frontend through streaming.
The frontend updates the chat history as it receives the answer, creating a new visual component and inserting it into the chat page.

4.2. Backend Architecture

Regarding the backend, it contains several fundamental components that were chosen to integrate the proposed architecture, which are illustrated in Figure 4.
The process begins with the loading and processing of documents stored in a predefined directory (Step 1). These documents can include texts, images, video links, or any other type of information necessary to answer user queries. These documents are then divided into chunks, facilitating processing and preserving context in subsequent steps (Step 2).
Next, these chunks are transformed into embeddings (Step 3). These embeddings are then stored in a vector database (Step 4), where they are accessed and compared later. These steps comprise the first phase—Bootstrap—of the system and they happen initially and every time the underlying knowledge base is changed and needs to be reprocessed.
When a user starts interacting with the system, the second phase begins—Conversational—where the previously processed and stored knowledge is used to augment the system responses. The phase starts when a question is received by the API (Step 5); it is converted into embeddings (Step 6). This allows the question and context to be represented similarly to the previously loaded documents, enabling comparison. Subsequently, a similarity search is conducted between the question’s embeddings and the data in the vector database (Step 7). This process aims to find the most relevant documents (or parts) that can answer the question received by the API.
The most relevant results obtained from the vector database (Step 8) are processed to construct the complete prompt. This prompt includes the obtained results from the database, the history of messages exchanged between the user and the model, and the user’s question (Step 9). This prompt is then sent to the LLM (Step 10). The LLM uses these data to generate and return an appropriate response to the received question in a streaming format (Step 11). As data from the LLM arrive, they are processed individually. A buffer is created to process and validate the data received via streaming and subsequently send it to the API.
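Steps 8 to 10 can be sketched as a single prompt-assembly function; the field layout below is our own illustration of how retrieved chunks, chat history, and the new question might be merged, not the system’s exact prompt:

```python
def assemble_prompt(system, history, context_chunks, question):
    # Merge retrieved chunks, the chat history, and the new question
    # into a single prompt for the LLM.
    lines = [system, "", "Context:"]
    lines += [f"- {chunk}" for chunk in context_chunks]
    lines.append("")
    for role, text in history:
        lines.append(f"{role}: {text}")
    lines.append(f"user: {question}")
    lines.append("assistant:")
    return "\n".join(lines)

prompt = assemble_prompt(
    system="Answer using only the context below.",
    history=[("user", "Hi"), ("assistant", "Hello! How can I help?")],
    context_chunks=["Containers are registered on the Shipments page."],
    question="How do I register a container?",
)
```

Keeping the history inside the prompt is what allows a stateless LLM API to appear conversational across turns.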
The entire cycle is necessary each time the original documents are altered. Without changes to the documents, subsequent API calls trigger the flow from point (Step 5).

4.3. Frontend Architecture

The aim of the frontend is to create the visual and interactive elements of the web application, namely the chatbot interface that serves as the user interface for the conversational agent. Next, we describe the frontend architecture and its processes. Figure 5 illustrates the various components implemented and the flow between them, including the required integration with the backend.
The processes related to the frontend application begin with an initial request made by the browser (Step 1) to a server that provides the Web application (React Web Application [13]). Then, the application is initialized (Step 2), which involves setting up various libraries necessary for the application, mainly, Redux [20] for state management, i18n for internationalization, and Grommet [21] for user interface components.
Before the application is functional, a request is made to the backend server to obtain a session key (Step 3), which serves as a Universally Unique IDentifier (UUID) for the user’s chat session within the context of the two applications (backend and frontend application). This obtained session key is stored in Redux [20] (Step 4), which is responsible for the global state management of the application, in order to identify which machine is requesting an answer when sent to the backend.
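The session-key issuance can be sketched as follows; the in-memory store and function names are our own illustration of the idea rather than the actual backend code:

```python
import uuid

# Sketch of the session-key logic: the backend issues a UUID per chat
# session and later uses it to key that session's chat history.
sessions = {}

def create_session():
    key = str(uuid.uuid4())
    sessions[key] = []  # empty chat history for the new session
    return key

key = create_session()
```

The frontend stores the returned key (in Redux, in our case) and attaches it to every subsequent request so the backend can resolve the correct history.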
With the session key stored and the application initialized, the web page presented to the user is rendered (Step 5) in the user’s browser. From this point, the user can interact with the application and submit a prompt. When the user submits a prompt (Step 6), it, along with the session key, is sent to the backend through a Server-Sent Events (SSE) request (Step 7). This technology allows the server to send real-time updates to the frontend application, enabling us to receive and display the response as it is being generated by the LLM, constituted by the message and its type.
As the backend responses arrive, they are analyzed and processed by the function responsible for handling communication with the backend (Step 8). The received messages are then used to update the message history stored in the Redux state (Step 9).
When the message history is updated, the user interface is re-rendered to show the updated history with the received message, which involves generating a component for each message in the history (Step 10). This component will be responsible for presenting all the content that was received from the backend for the message, which can include text, links, images, and videos.

5. Development

In this chapter, some relevant implementation details are provided, starting with the process of ingestion of the static documents that serve as the system’s body of knowledge. This solution component is managed by the backend.
The construction of the documents knowledge database is the first step in providing a reliable source of information for the developed solution. Initially, a function checks whether the documents in a specified directory are up to date. This function uses the docs_folder variable, which stores the directory path from the .env file. If the documents are not up to date, they are loaded, split into smaller chunks for processing, and then indexed in a vector database. In our case, we used the FAISS [8] vector database. If the documents are already up to date, the existing vector database is used. This step, as we can see in Figure 6, is crucial for effective document processing and subsequent information retrieval.
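One possible way to implement such an up-to-date check (our own sketch; the project’s actual mechanism may differ) is to fingerprint the documents directory and rebuild the index only when the digest changes:

```python
import hashlib
from pathlib import Path

def docs_fingerprint(docs_folder):
    # Hash the name, size, and modification time of every document; if the
    # digest differs from the stored one, the vector database is rebuilt.
    h = hashlib.sha256()
    for p in sorted(Path(docs_folder).glob("**/*")):
        if p.is_file():
            stat = p.stat()
            h.update(f"{p.name}:{stat.st_size}:{stat.st_mtime_ns}".encode())
    return h.hexdigest()
```

Comparing the digest against one saved alongside the vector database avoids re-embedding an unchanged corpus on every startup.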
To process the documents, a function receives a list of document chunks and the type of database to use (e.g., FAISS or ChromaDB). It creates embeddings using a local model, mistralai/Mixtral-8x7B-Instruct-v0.1, provided by HuggingFace [22], and either loads an existing database or creates a new one, if the list of document chunks is provided.
The communication with the frontend is conducted through a REST API (Backend API) developed with FastAPI [23], a framework for building web APIs in Python. All communication is conducted in JSON format. The frontend initiates a session by requesting a session key from the backend. When a user asks a question in the chatbot (frontend), the frontend sends a POST request to the API, which processes the question and returns the response in a streaming format, as we can see in Figure 7. The function responsible for generating responses receives the vector database instance, user input, and a session key. It retrieves the user’s chat history and sets the vector database to retriever mode to return results relevant to the user’s question. Then, the language model is instantiated, and a prompt is constructed with system instructions, user instructions, and the chat history. The model then processes this prompt and generates a response, which is sent back to the frontend in streaming mode, allowing for real-time interaction with the user.
As the language model generates text blocks, they are processed and sent to the frontend by the Backend API. Buffers play a crucial role in generating the response context. A buffer stores and compares text blocks to ensure a continuous data flow. Regular expressions are applied to the text blocks to define data types (e.g., text, link, video, or picture) that enrich the final JSON object sent to the frontend. The user’s chat history is managed by the backend using an SQLite database, which stores all messages exchanged during a chat session. These messages are used to provide context to the LLM. If the database does not exist, it is created, and a table for chat history is set up. When a chat session ends or a new message is received, the history is saved to the database. Previous messages are loaded from the database when a new question is asked, ensuring continuity in the conversation. The history can also be deleted if necessary, ensuring efficient management of all interactions between the user and the application.
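The regular-expression typing of buffered blocks can be sketched as follows; the patterns and the resulting JSON shape are our own illustration of the idea, not the exact rules used by the backend:

```python
import re

# Tag each complete buffered block as a picture, video, link, or plain
# text before it is sent to the frontend as a JSON object.
IMAGE_RE = re.compile(r"\.(png|jpe?g|gif)$", re.IGNORECASE)
VIDEO_RE = re.compile(r"(youtube\.com|youtu\.be|\.mp4$)", re.IGNORECASE)
URL_RE = re.compile(r"^https?://", re.IGNORECASE)

def classify(block):
    block = block.strip()
    if URL_RE.search(block):
        if IMAGE_RE.search(block):
            kind = "picture"
        elif VIDEO_RE.search(block):
            kind = "video"
        else:
            kind = "link"
    else:
        kind = "text"
    return {"type": kind, "content": block}
```

The frontend then switches on the `type` field to choose a suitable web component for each message.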
As the frontend receives the language model’s answer, it stores the message, verifies what type of message it is, text, image, or video, and generates a suitable component (web component) for its presentation on the web page, providing different levels of interactions for the end users, besides presenting a link to the resource.
We can see the final product in Figure 8, where a conversation is started and the user asks what the project is about, with a response message returned by the LLM.

6. Results and Discussion

There are several components to the approach presented in this paper, and in this section, we will showcase our tests for the most relevant ones and present a discussion on the results.

6.1. Vector Databases Testing

Several options exist to store vector data, and it is not an objective of this research work to perform a broad analysis of this tech space. To fulfill our requirements, we implemented some performance testing on two of the most used vector databases: ChromaDB and FAISS.
Each test started with the databases configured without any preloaded data. In each test, we used a single 38 KB text file containing the user manual for the preliminary version of the traceability platform. Using this single file ensures consistency across results. The time taken to store and load embeddings was monitored through the application’s log, enabling a detailed analysis of each database’s performance.
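A stripped-down version of such a benchmark, using a pickled list of vectors as a stand-in for a real vector index, illustrates the store/load timing methodology:

```python
import pickle
import tempfile
import time
from pathlib import Path

# Stand-in corpus: 100 vectors of 768 dimensions, mimicking stored embeddings.
vectors = [[float(i + j) for j in range(768)] for i in range(100)]
path = Path(tempfile.mkdtemp()) / "index.pkl"

t0 = time.perf_counter()
path.write_bytes(pickle.dumps(vectors))      # timed write (store)
write_s = time.perf_counter() - t0

t0 = time.perf_counter()
loaded = pickle.loads(path.read_bytes())     # timed read (load)
read_s = time.perf_counter() - t0

size_kb = path.stat().st_size / 1024         # on-disk footprint
```

The real benchmark measures FAISS and ChromaDB through the application’s logs rather than a pickle file, but the three quantities compared (write time, read time, and storage footprint) are the same.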
As can be seen in Table 1, FAISS’s storage footprint is only 10% of that of ChromaDB. On the other hand, ChromaDB outperforms FAISS in both read and write speeds, with the reading operation being more significant.
This performance testing showed that, depending on the actual use case, we can target either storage efficiency or operational performance. For our purposes, storage efficiency was the deciding factor; therefore, based on the results, FAISS was our choice.

6.2. Large Language Models Testing and Validation

Large Language Models (LLMs) used for Retrieval-Augmented Generation (RAG) tend to perform better when their fine-tuning process is geared towards instruction compliance. For the purpose of this research effort, we pre-selected three instruct models and one generic model: Mixtral-8x7B-Instruct-v0.1, Mistral-7B-Instruct-v0.2, Meta-Llama-3-8B-Instruct, and Google Gemma-2B.
To test these models, we devised a set of questions in English (EN) and Portuguese (PT) and used FAISS to store the static user support documentation for the RAG proximity search. Following are some examples of the questions asked of the models:
  • [EN] What is the purpose of this platform?
  • [EN] How can I log in to the platform?
  • [EN] Is there any video about impact reports?
  • [EN] How can I change the owner of a transport container?
  • [PT] Quais são os modos de transporte disponíveis? (What transport modes are available?)
Given that our specific purpose does not benefit from testing against general benchmarks, we performed a series of prompts, and we manually graded their answer precision on a scale of 1–5. As presented in Table 2, the Mistral models largely outperformed both Meta and Google models.
This evaluation method is inherently subjective. However, for the specific purpose of end-user support, a degree of subjectivity is warranted, since it mirrors how our target audience will judge the responses.

6.3. Frontend and Usability

Our use case is very end-user-focused, and our proof-of-concept should be a responsive application. To validate these assumptions, we tested our implementation both in terms of technical characteristics, using Google’s Lighthouse (Google Lighthouse: https://developer.chrome.com/docs/lighthouse/overview (accessed on 25 June 2024)) testing tool, and in terms of usability, using the System Usability Scale [24] questionnaire.
In Table 3, we can see the average results from a series of tests conducted on our proof-of-concept. The results were within our expectations, given that our application architecture and implementation are efficient in terms of web technology and that the chosen technological stack (React + Grommet) provides a well-established set of default behaviors that conform to current web standards.
For our usability testing, we asked several external users with no prior knowledge of this project to test our application and fill out a System Usability Scale [24] questionnaire. Table 4 shows the results, with an overall average of 80.7; one user scored our application much lower than the rest, which may indicate that our user interface needs tweaking to provide better informational cues for first-time users.
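The SUS scores in Table 4 follow the standard scoring rule: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5 to reach a 0–100 scale. A minimal sketch of the computation (the ten-item response vectors passed to `sus_score` would be hypothetical; the per-user totals below are taken from Table 4):

```python
from statistics import mean

def sus_score(responses):
    """Compute a System Usability Scale score from ten 1-5 Likert responses.

    Odd-numbered items contribute (response - 1); even-numbered items
    contribute (5 - response); the sum is scaled to 0-100 by * 2.5.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses in the range 1-5")
    total = sum(r - 1 if i % 2 == 0 else 5 - r
                for i, r in enumerate(responses))
    return total * 2.5

# Per-user SUS scores from Table 4; their mean is the reported 80.7 average.
table4 = [72.5, 57.5, 92.5, 80, 90, 92.5, 80]
print(round(mean(table4), 1))  # 80.7
```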

6.4. Discussion

Our proof-of-concept demonstrated the viability of implementing a Retrieval-Augmented Generation (RAG) based Large Language Model (LLM) approach to provide support for end users of a farm-to-fork traceability blockchain solution. This innovative approach addresses a critical challenge in the adoption of blockchain technology in the agricultural supply chain.
Blockchain technology offers numerous advantages for traceability solutions aimed at end consumers, including enhanced transparency, immutability of records, and increased trust in the food supply chain. However, the complexity of blockchain systems often presents a steep learning curve for users unfamiliar with the technology. This complexity can potentially hinder widespread adoption, which is crucial for the success of farm-to-fork initiatives.
By integrating a conversational agent powered by RAG-based LLM technology, we can significantly reduce the barriers to entry for users. This AI-driven support system can interpret user queries, access relevant information from the blockchain, and present it in an easily understandable format. The conversational interface allows users to interact with the complex blockchain system using natural language, effectively smoothing the learning curve and promoting better user adoption.
While our full traceability solution is still in the research phase, the approach detailed in this paper provides a solid foundation for pilot implementations. By incorporating this user-friendly AI interface from the outset, we ensure that potential users’ technological barriers are being addressed proactively. This consideration is crucial in building the trust necessary for the successful implementation and adoption of blockchain-based traceability solutions in the farm-to-fork space.
Furthermore, this approach allows for continuous improvement based on user interactions. As the system gathers more data on user queries and concerns, it can be refined to provide increasingly targeted and helpful responses. This iterative process not only enhances the user experience but also provides valuable insights into user needs and behaviors, which can inform future developments in the traceability solution.

7. Conclusions and Future Work

Technologies like blockchain are often touted as a solution to a fair number of procedural problems, and that can be true in several instances. However, these technologies also add friction to the user experience, as they have not yet reached the point where they are transparent to the common user. This project focuses on easing that friction by providing a mechanism to guide users on a blockchain-enabled platform.
This project shows that a RAG approach is a feasible way to create a conversational agent that provides user support, based on dynamically loaded static documents, for a farm-to-fork traceability platform being developed under the umbrella of the previously mentioned Agenda Blockchain project.
Our main difficulty was navigating the sea of technological and architectural options that currently exist in the ML space, separating fact from fiction, and building a simple and reliable system that can not only provide a good user experience but also evolve, together with its parent platform, as the required user support documentation grows.
Overall, we were able to develop a chatbot whose main limitation is its documentation: its knowledge domain is extremely malleable, and it can provide up-to-date information without retraining the language model.
Large Language Models and the Retrieval-Augmented Generation approach are areas of ongoing development that show no signs of slowing down. Our research provided a proof-of-concept for a specific niche business case, but further work is required to adapt it to real-world deployments. We suggest that further testing be performed with actual end users of the platform and that this testing incorporate upcoming model releases as well as new approaches to context-aware conversational agents.

Author Contributions

Conceptualization, V.T., M.M., R.G. and O.R.; methodology, V.T., M.M. and R.G.; validation, J.F., J.B. (José Benzinho), J.B. (Joel Batista) and L.P.; formal analysis, J.F., J.B. (José Benzinho), J.B. (Joel Batista) and L.P.; investigation, J.F., J.B. (José Benzinho), J.B. (Joel Batista) and L.P.; writing—original draft preparation, J.F., J.B. (José Benzinho), J.B. (Joel Batista) and L.P.; writing—review and editing, V.T., M.M., R.G. and O.R.; supervision, V.T., M.M., R.G. and O.R.; funding acquisition, V.T. and O.R. All authors have read and agreed to the published version of the manuscript.

Funding

This work was financially supported by Project BlockchainPT–Decentralize Portugal with Blockchain Agenda, WP 2: Health and Wellbeing, 02/C05-i01.01/2022.PC644918095-00000033, funded by the Portuguese Recovery and Resilience Program (PRR), the Portuguese Republic and the European Union (EU) under the framework of the Next Generation EU Program.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

Author Orlando Remédios was employed by the company Sensefinity. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Blockchain. Available online: https://www.blockchain.com/ (accessed on 14 June 2024).
  2. Aggarwal, S.; Kumar, N. Hyperledger. In Advances in Computers; Elsevier: Amsterdam, The Netherlands, 2021; Volume 121, pp. 323–343. [Google Scholar] [CrossRef]
  3. Chockalingam, A.; Patel, A.; Verma, S.; Yeung, T. A Beginner’s Guide to Large Language Models. 2023. Available online: https://resources.nvidia.com/en-us-large-language-model-ebooks (accessed on 25 March 2024).
  4. Borwankar, N. Retrieval-Augmented Generation, Step by Step. Available online: https://www.infoworld.com/article/3712860/retrieval-augmented-generation-step-by-step.html (accessed on 16 March 2024).
  5. Yelikar, S. Understanding Similarity or Semantic Search and Vector Databases. Medium. Available online: https://medium.com/@sudhiryelikar/understanding-similarity-or-semantic-search-and-vector-databases-5f9a5ba98acb (accessed on 24 March 2024).
  6. Pinecone. Home—Pinecone Docs. Available online: https://docs.pinecone.io/home (accessed on 13 June 2024).
  7. Chroma. Chroma Docs. Available online: https://docs.trychroma.com/ (accessed on 13 June 2024).
  8. Faiss. Welcome to Faiss Documentation. Available online: https://faiss.ai/index.html (accessed on 13 June 2024).
  9. Merritt, R. What Is Retrieval-Augmented Generation, aka RAG? NVIDIA Blog. Available online: https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/ (accessed on 25 March 2024).
  10. da Silva Duque-Pereira, I.; de Moura, S.A. Compreendendo a Inteligência Artificial Generativa na Perspectiva da Linguagem. 2023, 2023–2033. [Google Scholar] [CrossRef]
  11. Vue. Glossary|Vue.js. Available online: https://vuejs.org/glossary/ (accessed on 17 June 2024).
  12. Angular. What is Angular? Angular. Available online: https://angular.dev/overview (accessed on 17 June 2024).
  13. React. React Reference Overview—React. Available online: https://react.dev/reference/react (accessed on 17 June 2024).
  14. Ruby. Ruby on Rails Guides. Available online: https://guides.rubyonrails.org/v7.0/ (accessed on 17 June 2024).
  15. Estep, E. Mobile HTML5: Efficiency and Performance of WebSockets and Server-Sent Events; KTH, School of Information and Communication Technology (ICT): Stockholm, Sweden, 2013. [Google Scholar]
  16. Drift. The Chatbots Report (2018): Reshaping Online Experiences. Available online: https://www.drift.com/blog/chatbots-report/ (accessed on 23 March 2024).
  17. Van Eeuwen, M. Mobile Conversational Commerce: Messenger Chatbots as the Next Interface between Businesses and Consumers. Master’s Thesis, University of Twente, Enschede, The Netherlands, 2017. [Google Scholar]
  18. McTear, M.; Callejas, Z.; Griol, D. Creating a Conversational Interface Using Chatbot Technology. In The Conversational Interface; Springer: Berlin/Heidelberg, Germany, 2016. [Google Scholar] [CrossRef]
  19. Sengupta, R.; Lakshman, S. Conversational Chatbots-Let’s Chat. 2017. Available online: https://www2.deloitte.com/content/dam/Deloitte/in/Documents/strategy/in-strategy-innovation-conversational-chatbots-lets-chat-final-report-noexp.pdf (accessed on 25 March 2024).
  20. Redux. Usage Guides Index|Redux. Available online: https://redux.js.org/usage/ (accessed on 17 June 2024).
  21. Grommet. Available online: https://v2.grommet.io/docs (accessed on 17 June 2024).
  22. Hugging Face. Models—Hugging Face. Available online: https://huggingface.co/models (accessed on 20 May 2024).
  23. FastAPI. FastAPI Documentation—DevDocs. Available online: https://devdocs.io/fastapi/ (accessed on 14 June 2024).
  24. Lewis, J.R. The System Usability Scale: Past, Present, and Future. Int. J. Hum.–Comput. Interact. 2018, 34, 577–590. [Google Scholar] [CrossRef]
Figure 1. Embeddings and word visualization.
Figure 2. RAG diagram.
Figure 3. General architecture.
Figure 4. Backend architecture elements.
Figure 5. Frontend architecture.
Figure 6. Document processing steps—in Bootstrap Phase.
Figure 7. Communication steps between frontend and backend.
Figure 8. Frontend application.
Table 1. Vector database performance testing.

            Storage Size (Average)   Store Speed (Average)   Read Speed (Average)
FAISS       59 KB                    4.41 s                  0.88 s
ChromaDB    652 KB                   4.01 s                  0.46 s
Table 2. LLM RAG precision testing.

Model                         Precision (1–5)
Mixtral-8x7B-Instruct-v0.1    4
Mistral-7B-Instruct-v0.2      5
Meta-Llama-3-8B-Instruct      2
Google Gemma-2B               1
Table 3. Frontend Lighthouse testing results.

Lighthouse Metric          Average Result
Performance                96.3
Accessibility              100
Best Practices             100
First Contentful Paint     1.4
Total Blocking Time        2.6
Cumulative Layout Shift    0
Speed Index                1.4
Table 4. System Usability Scale testing results.

End User    SUS Score
1           72.5
2           57.5
3           92.5
4           80
5           90
6           92.5
7           80
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.


