Large language models (LLMs) are incredibly powerful, but they have one big problem: they tend to hallucinate; that is, they give very confident answers that are factually or logically incorrect. As explained in a previous post, LLMs suffer from serious limitations that make them prone to hallucination by default.
Fortunately, solutions to mitigate this risk are starting to emerge, such as retrieval-augmented generation (RAG) frameworks and observability tools. The former supplements an LLM's internal representation of information with contextual data stored in a vector database, while the latter helps track and evaluate LLM-powered apps using feedback functions.
Yet, to apply them effectively, one must understand why models hallucinate, the potential sources of hallucinations, and the scenarios where this risk is amplified. That's the topic of this post.
Sources of LLM hallucinations
LLMs can hallucinate due to various factors:
- Erroneous encoding – If the encoder misunderstands the input text (i.e. training data passed through an embedding model and turned into vectors), it will create misleading vector representations. As a result, the model's decoder will generate misleading text outputs.
- Erroneous decoding – Decoders of LLM-powered chatbots (e.g. ChatGPT, Bard) produce a ranked list of candidate tokens with associated probabilities. But they don't always select the token with the highest probability, because that would yield very flat answers. Instead, they sometimes randomly pick lower-ranked tokens to generate more creative responses. The more randomness one introduces, the more creative the model outputs become, but also the more likely they are to be hallucinations.
- Exposure bias – During the training phase, typical generation models use ground-truth input text (i.e. books, articles, websites, and other text-based sources) as the basis for predicting a sequence of tokens. This is no longer the case during inference, when subsequent generations build on the synthetic text the model itself has produced. We may observe a decline in the generated text's quality as a result of this "synthetic generation based on synthetic input".
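The trade-off between randomness and creativity described in the decoding bullet above can be sketched with temperature sampling. This is a minimal illustration, not any particular model's actual decoder; the logits and function name are hypothetical:

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Sample a token index from raw scores (logits) with temperature scaling.

    temperature -> 0: almost always picks the top-ranked token (flat, safe answers)
    temperature > 1: lower-ranked tokens are picked more often (more creative
    output, but also a higher chance of hallucinated content).
    """
    # Scale logits by temperature, then apply a numerically stable softmax.
    scaled = [score / max(temperature, 1e-6) for score in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Draw one index according to the resulting probability distribution.
    r = random.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r <= cumulative:
            return i
    return len(probs) - 1
```

At a very low temperature, `sample_token([2.0, 1.0, 0.1], temperature=0.01)` essentially always returns index 0 (the top token); at a high temperature, the same logits produce a near-uniform spread across all three tokens.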
Forms of hallucinations
LLM hallucinations take two primary forms:
- Intrinsic hallucination: this occurs when the generated output manipulates information in a way that contradicts the content of the source material. This form of hallucination is characterized by the model distorting facts present in the input data. Here is an illustration: if we asked "who was the 44th President of the United States?" and the model replied "Richard Nixon", it would be a case of distorted information, because the dataset on which the model was trained certainly includes the right answer: Barack Obama.
- Extrinsic hallucination: this occurs when the generated output introduces additional information that cannot be directly inferred from the source material. For instance, suppose we ask ChatGPT "what is TruLens?". The only reasonable answer would be that it does not know, because this tool was released after its latest training update. Any other answer is likely to be an extrinsic hallucination.
How TruLens can help
TruLens is an open-source tool for tracking and evaluating LLM-powered applications across their lifecycle. It is particularly valuable for addressing LLM hallucinations through feedback functions. A feedback function takes as input the generated text from an LLM-powered app, along with some metadata, and returns a score. Here are some feedback functions that are particularly useful for testing whether RAG apps are hallucination-free:
- Context relevance: is the retrieved context relevant to the query? TruLens allows users to evaluate their apps using LLM-based evaluation or by measuring the embedding distance between the query and context embeddings.
- Groundedness: is the response supported by the context? TruLens checks the groundedness of the LLM’s responses by splitting LLM responses into claims and then evaluating each against verified text.
- Question/answer relevance: is the answer relevant to the question? TruLens evaluates the relevance of the final response to the user's input, ensuring that LLM-generated answers directly address the user's query and reducing the likelihood of hallucinations.
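Conceptually, a feedback function is just a scorer over generated text. The toy sketch below illustrates a groundedness-style check; it is not the TruLens API, and the substring matching is a deliberately naive stand-in for the LLM-based claim evaluation TruLens actually performs:

```python
def groundedness_score(response: str, context: str) -> float:
    """Toy groundedness feedback: the fraction of claims in the response
    that appear in the retrieved context.

    Illustrative only -- TruLens splits responses into claims and evaluates
    each with an LLM, not with substring matching.
    """
    # Naive claim splitting: treat each sentence as one claim.
    claims = [c.strip() for c in response.split(".") if c.strip()]
    if not claims:
        return 0.0
    # Count claims literally present in the context (case-insensitive).
    supported = sum(1 for c in claims if c.lower() in context.lower())
    return supported / len(claims)
```

For example, given the context "Barack Obama was the 44th President of the United States.", a response that repeats that fact and then adds "He was born on Mars." would score 0.5: one claim grounded, one extrinsic hallucination.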
In addition, TruLens allows for continuous monitoring of LLM-powered applications throughout their lifecycle. By adding an observability layer, users can identify and rectify hallucinations as they occur, making the application more reliable and trustworthy.
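A minimal sketch of what such an observability layer might look like follows. This is illustrative only: TruLens ships its own recorders and dashboard, not this hypothetical `FeedbackRecorder` class.

```python
import time

class FeedbackRecorder:
    """Toy observability layer: wrap an LLM app, record each call together
    with feedback scores so hallucinations can be spotted as they occur.

    Illustrative sketch only -- not the TruLens API.
    """

    def __init__(self, app, feedbacks):
        self.app = app              # callable: query -> response
        self.feedbacks = feedbacks  # dict: name -> fn(query, response) -> float
        self.records = []

    def __call__(self, query):
        response = self.app(query)
        # Score the response with every registered feedback function.
        scores = {name: fn(query, response) for name, fn in self.feedbacks.items()}
        self.records.append({
            "query": query,
            "response": response,
            "scores": scores,
            "timestamp": time.time(),
        })
        return response
```

Wrapping an app this way leaves its behavior unchanged while accumulating a log of (query, response, score) records that can be inspected for low-scoring, likely-hallucinated outputs.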
In short, TruLens offers a comprehensive set of tools to evaluate and mitigate hallucinations in LLM-powered applications, enhancing their overall quality and reliability. It provides a crucial mechanism for ensuring that these applications produce accurate and relevant responses.
This article was co-authored by Annie Pang, Shayak Sen and Lofred Madzou.