Evaluate with any LLM: TruLens + LiteLLM

Written by: Joshua Reini
Category: AI Infrastructure, AI Observability, Foundation Models, LLMs

We’re excited to announce the integration of TruLens with LiteLLM, which lets you run evaluations with the wide range of models supported by the LiteLLM interface. Models available via LiteLLM include GPT-4, Llama-2, Claude-2, Cohere Command Nightly and more. With LiteLLM, you can now access the full suite of TruLens LLM evaluations, including groundedness, context relevance, toxicity and more, using the model best suited to your organization.

How do you get started?

Using LiteLLM for automated LLM app evaluations is easy. After importing the LiteLLM provider from TruLens, we select a model, and with it the model provider, just by specifying the model name. Then we can choose from the library of out-of-the-box feedback functions available from TruLens.
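As a minimal sketch of this step, assuming the `trulens_eval` package as it existed around the time of this post (import paths may differ in other releases), the helper below builds a relevance feedback function backed by a LiteLLM model. The helper name `build_relevance_feedback` is hypothetical, and the import is deferred so the function can be defined without TruLens installed.

```python
def build_relevance_feedback(model_engine):
    """Build a relevance feedback function evaluated by a LiteLLM model.

    `model_engine` is a LiteLLM model name (for example "gpt-4" or
    "claude-2"). Assumes the trulens_eval package and the API key for
    the chosen model provider are available at call time.
    """
    # Imported lazily so this sketch can be defined without TruLens installed.
    from trulens_eval import Feedback, LiteLLM

    # Choosing the model provider is just a matter of naming the model.
    provider = LiteLLM(model_engine=model_engine)

    # Score how relevant the app's output is to its input.
    return Feedback(provider.relevance).on_input_output()
```

Calling `build_relevance_feedback("gpt-4")` would then return a feedback function that can be handed to a recorder; swapping in another model should only require changing the name.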

After we’ve set up our feedback function(s) with TruLens and LiteLLM, we can create our recorder, tru_recorder, by wrapping our LLM application (in this case named chain) and passing in the feedback function(s).
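The wrapping step might look like the sketch below, assuming the application is a LangChain chain and the `trulens_eval` package from around the time of this post. The helper name `build_recorder` and the default `app_id` value are hypothetical, chosen only for illustration.

```python
def build_recorder(chain, feedbacks, app_id="Chain1_ChatApplication"):
    """Wrap an LLM application in a TruLens recorder.

    `chain` is the LangChain application to record, `feedbacks` is a
    list of TruLens feedback functions, and `app_id` is a label for the
    app (the default here is a made-up example).
    """
    # Imported lazily so this sketch can be defined without TruLens installed.
    from trulens_eval import TruChain

    # The recorder wraps the app and runs the feedback functions on its calls.
    return TruChain(chain, app_id=app_id, feedbacks=feedbacks)
```

With a chain and a list of feedback functions in hand, `tru_recorder = build_recorder(chain, [relevance])` would produce the recorder used in the next step.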

Once we’ve done so, we can use tru_recorder as a context manager for our application. Critically, every call to our application made inside the recorder’s context will now be evaluated by our choice of model from LiteLLM.
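The context-manager pattern can be sketched as follows. The function `run_recorded` is a hypothetical helper; `StandInRecorder` and `echo_chain` are stand-ins that let the sketch run without TruLens or a model, so they record and evaluate nothing. With a real recorder and chain, the same `with` block is where the evaluated call would happen.

```python
def run_recorded(tru_recorder, chain, prompt_input):
    """Call the app inside the recorder's context so the call is captured."""
    with tru_recorder as recording:
        llm_response = chain(prompt_input)
    return llm_response, recording


class StandInRecorder:
    """Hypothetical stand-in for a TruLens recorder (evaluates nothing)."""

    def __init__(self):
        self.entered = 0

    def __enter__(self):
        self.entered += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions raised by the app


def echo_chain(prompt_input):
    """Hypothetical stand-in for the LLM application."""
    return f"echo: {prompt_input}"


response, recording = run_recorded(
    StandInRecorder(), echo_chain, "What is TruLens?"
)
```

Keeping the application call inside the `with` block is the important part: calls made outside the recorder's context are ordinary calls to the app.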

Want to try it hands-on? Run it yourself in the Colab notebook.


October 4, 2023

Joshua Reini

DevRel Data Scientist

Josh is a Developer Relations Data Scientist @ TruEra, focused on growing a community of ML practitioners passionate about improving the AI Quality of their work. He enjoys building new applications that leverage XAI techniques to make ML models more trustworthy, more fair and more robust.