From Dec 1 to 11, TruEra hosted its first-ever hackathon in partnership with Google, Zilliz, LlamaIndex and LabLabAI. We had 3361 participants divided into 300 teams. Over ten days, talented hackers used Google Vertex, TruLens, Milvus and LlamaIndex to build and iterate on 40 LLM-based applications. As we worked with these teams on their applications, a few themes emerged:
- Theme 1: Quick feedback and logging accelerated development. Reliable, comprehensive evaluations connected to a system for tracking experiments allowed teams to understand the performance of their app with every change they made. This allowed them to quickly find new failure modes and iterate to find the best possible solution.
- Theme 2: After evaluating the RAG triad (answer relevance, context relevance and groundedness), teams expanded their evaluation coverage with evals for harmlessness and helpfulness. This was particularly critical because the winning apps were customer-facing.
- Theme 3: After establishing that their application met out-of-the-box criteria for honesty, harmlessness and helpfulness, the winning teams tuned their applications with custom evals. As subject matter experts in the domain of their app, the developers used that knowledge to write business requirements and turn them into evaluations of their app.
- Theme 4: AI was used for good! The winning teams made it easier to learn, assisted those of us with diverse cognitive abilities, and helped solve complex immigration challenges.
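To make the loop behind Themes 1 to 3 concrete, here is a minimal sketch in plain Python of scoring every app version with the same set of evals and comparing the results. It is not the TruLens API, and it is not any team's actual code: `Record`, `Leaderboard`, `token_overlap` and `concise` are hypothetical names, the overlap score is only a toy stand-in for a real groundedness eval, and the 50-word limit is an invented example of a business requirement turned into a custom eval.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical eval signature: (question, context, answer) -> score in [0, 1].
Feedback = Callable[[str, str, str], float]


@dataclass
class Record:
    question: str
    context: str
    answer: str


def token_overlap(question: str, context: str, answer: str) -> float:
    """Toy groundedness proxy: share of answer tokens that appear in the context."""
    answer_tokens = set(answer.lower().split())
    if not answer_tokens:
        return 0.0
    context_tokens = set(context.lower().split())
    return len(answer_tokens & context_tokens) / len(answer_tokens)


def concise(question: str, context: str, answer: str, max_words: int = 50) -> float:
    """Hypothetical custom eval: the answer must stay under a word limit."""
    return 1.0 if len(answer.split()) <= max_words else 0.0


@dataclass
class Leaderboard:
    feedbacks: Dict[str, Feedback]
    results: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def evaluate(self, version: str, records: List[Record]) -> Dict[str, float]:
        """Average every eval over the records and store the scores per version."""
        scores = {
            name: mean(fn(r.question, r.context, r.answer) for r in records)
            for name, fn in self.feedbacks.items()
        }
        self.results[version] = scores
        return scores

    def best(self, metric: str) -> str:
        """Return the version with the highest average score on one metric."""
        return max(self.results, key=lambda v: self.results[v][metric])
```

Calling `evaluate` after each change to a prompt, retriever or model, then `best` on the metric that matters most, mirrors the quick feedback loop described above. A real setup would replace the toy functions with LLM-based evals and persist the results.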
We received fantastic submissions, but three really stood out.
🥉Winner #3: Study Buddy App
Team: Yesid Leonardo López Sierra, Juan Carlos Ortize Drada, Sara Ortiz Drada and Steven Quintana.
Study Buddy is an innovative iOS application designed to enhance the studying experience for students. It offers unique functionalities, such as a Q&A chatbot that enables students to query any image or PDF in their course materials, and automated flashcard creation that instantly converts sections of their course materials into custom flashcards. The app is powered by Google Cloud Platform (GCP), combining several models (e.g. Cloud Vision for OCR, PaLM for natural language processing tasks, and Google Cloud Text-to-Speech), and uses Zilliz as its vector database. TruLens allowed the Study Buddy team to identify a need for post-processing the output of the application's OCR step, which dramatically improved answer relevance and lowered app latency.
🥈Winner #2: SimplifAI
Team: Sara Diaz del Ser, Joel Weiss, Ismael Delgado, Amalia Cid, Paulina Aguiló and Alberto Garcia Garcia.
SimplifAI is a user-friendly web application that effortlessly converts any given English text into ‘Plain English,’ a simplified form of writing designed to enhance understanding for people with diverse cognitive abilities. It can also help students whose mother tongue is not English. As a result, its user base is potentially wide, ranging from individuals with learning or mental disabilities to caregivers, educators, and any professional aiming to communicate with a diverse audience. SimplifAI uses a tech stack similar to Study Buddy's.
The SimplifAI team showcased a full range of evaluation metrics, including BLEU and language match, along with custom evaluations that confirmed the simplicity of the resulting text, a key business requirement. Finally, to explain which tokens in the LLM response contributed most to its complexity, the team used an integrated gradients technique available from TruLens-explain.
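As an illustration of what a custom simplicity evaluation could look like, here is a minimal sketch that scores text by average sentence length and average word length. This is an assumption on our part, not the SimplifAI team's actual metric: the function name `simplicity_score` and both thresholds are hypothetical, and a production eval would more likely rely on an established readability formula or an LLM-based judge.

```python
import re


def simplicity_score(
    text: str,
    max_words_per_sentence: float = 20.0,  # hypothetical threshold
    max_chars_per_word: float = 6.0,       # hypothetical threshold
) -> float:
    """Return a score in [0, 1]; 1.0 means short sentences and short words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0

    avg_sentence_len = len(words) / len(sentences)
    avg_word_len = sum(len(w) for w in words) / len(words)

    # Each factor is 1.0 at or below its threshold and shrinks above it.
    sentence_part = min(1.0, max_words_per_sentence / avg_sentence_len)
    word_part = min(1.0, max_chars_per_word / avg_word_len)
    return sentence_part * word_part
```

Comparing the score of the original text with the score of the simplified output gives a quick check that a rewrite actually moved in the right direction.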
🥇Winner #1: Ask Priya
Team: Bassim Eldath, Fadil Faizal and Phuc Nguyen.
Ask Priya is a pioneering AI chatbot designed to streamline the process of acquiring US immigration information, leveraging data from the United States Citizenship and Immigration Services (USCIS) website. Ask Priya is a classic retrieval-augmented generation (RAG) app whose answers are grounded in indexed USCIS data and evaluated using TruLens feedback functions. Considering the continuous and massive flows of immigrants seeking to move to the U.S. and the complexity of its immigration policy, Ask Priya addresses a pressing issue.
During development, the Ask Priya team added a “development mode”, which accelerated iteration by allowing the team to test the application while receiving feedback from TruLens on the quality of the LLM responses. On top of the RAG triad to assess hallucinations, the team also built in language match feedback functions as their app seeks to serve a diverse set of users.
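The idea of a development mode that returns eval scores alongside each answer can be sketched as follows. This is not the Ask Priya implementation and does not use the TruLens API: `ask`, `language_match` and `guess_language` are hypothetical names, and the stopword lists are a deliberately crude stand-in for a real language detector.

```python
import re
from typing import Callable, Dict, Optional, Tuple

# Tiny illustrative stopword lists; a real app would use a language detector.
STOPWORDS = {
    "en": {"the", "and", "is", "to", "of", "how", "do", "i", "a", "for"},
    "es": {"el", "la", "y", "es", "de", "que", "como", "para", "un", "una"},
}


def guess_language(text: str) -> Optional[str]:
    """Pick the language whose stopwords occur most often, or None if none occur."""
    tokens = re.findall(r"[a-záéíóúñü]+", text.lower())
    counts = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


def language_match(question: str, answer: str) -> float:
    """1.0 if question and answer appear to share a language, else 0.0."""
    q_lang, a_lang = guess_language(question), guess_language(answer)
    return 1.0 if q_lang is not None and q_lang == a_lang else 0.0


def ask(
    app: Callable[[str], str], question: str, dev_mode: bool = False
) -> Tuple[str, Dict[str, float]]:
    """Call the app; in development mode, also return eval scores for the answer."""
    answer = app(question)
    feedback = {"language_match": language_match(question, answer)} if dev_mode else {}
    return answer, feedback
```

With `dev_mode=True`, each test question comes back with its scores attached, so a low language match is visible immediately instead of only in a later batch evaluation.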
Other remarkable submissions
Two other applications came very close to the winning teams and are worth presenting. First, we have Huddle, a professional networking app that connects users with interesting professionals selected based on their professional backgrounds, work experiences, and objectives. If there’s a match, the app schedules a video call for both parties, thereby streamlining the networking process. Customer Chatbot API also made a great impression. This app improves customer engagement and optimizes the online shopping experience through more personalized product recommendations and conversational search. Both apps used the RAG triad from TruLens to validate the lack of hallucinations in their final product.
This blog post was co-authored by Lofred Madzou and Josh Reini.