Using Evaluations to Optimize a RAG Pipeline: from Chunkings and Embeddings to LLMs | by Christy Bergman | Jul, 2024

Best practices RAG with Milvus vector database, part 2

Image created by author using https://www.bing.com/images/create. Content credentials: Generated with AI ∙ July 9, 2024 at 10:04 AM.

Retrieval Augmented Generation (RAG) is a useful technique for using your own data in an AI-powered chatbot. In this blog post, I’ll walk through three key strategies for getting the most out of RAG and evaluate each to find the best-performing combination.

For readers who just want the TL;DR conclusion: the largest RAG accuracy improvement came from exploring different chunking strategies.

  • 89% Improvement by changing the Chunking Strategy 📦
  • 20% Improvement by changing the Embedding Model 🤖
  • 6% Improvement by changing the LLM 🧪
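
Since chunking is the knob with the biggest payoff, it is worth sweeping first. As a minimal, self-contained sketch of what a chunk-size sweep looks like, here is a toy fixed-size splitter with overlap (real pipelines typically use a library splitter such as LangChain’s; the sizes and 10% overlap below are illustrative, not the winning settings from the experiments):

```python
def chunk_text(text: str, chunk_size: int, overlap: int) -> list[str]:
    """Split text into fixed-size chunks, each overlapping the previous one."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

# Hypothetical corpus: a stand-in for the Milvus documentation pages.
doc = "Milvus is an open-source vector database. " * 50

for size in (128, 256, 512):
    chunks = chunk_text(doc, size, overlap=size // 10)  # 10% overlap, a common default
    print(f"chunk_size={size}: {len(chunks)} chunks")
```

Each chunk size produces a different number of (and different boundaries for) retrieval units, which is exactly what the evaluation below scores.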

Let’s dive into each strategy and find the best performers for a real-world RAG application using RAG component evaluations! 🚀📚

I’ll use Milvus’s public documentation web pages as the source data and Ragas as the evaluation method. See my previous blog post about how to use Ragas. The rest of this blog is organized as follows:

  1. Text Chunking Strategies
  2. Embedding Models
  3. LLM (Generative) Models
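
Each section evaluates pipeline variants on the same question set, so the evaluation loop itself has a simple shape. Ragas computes LLM-judged metrics such as faithfulness and context recall; as a self-contained illustration (with hypothetical document IDs, not real Ragas calls), here is a toy retrieval hit-rate comparison between two pipeline variants:

```python
def hit_rate(retrieved: list[list[str]], ground_truth: list[str]) -> float:
    """Fraction of questions whose ground-truth passage appears in the retrieved set."""
    hits = sum(gt in docs for docs, gt in zip(retrieved, ground_truth))
    return hits / len(ground_truth)

# Two hypothetical pipeline variants, each retrieving for the same 3 questions.
ground_truth = ["doc_a", "doc_b", "doc_c"]
variant_small_chunks = [["doc_a", "doc_x"], ["doc_b", "doc_y"], ["doc_z", "doc_c"]]
variant_large_chunks = [["doc_a", "doc_x"], ["doc_y", "doc_z"], ["doc_z", "doc_w"]]

print("small chunks:", hit_rate(variant_small_chunks, ground_truth))  # higher is better
print("large chunks:", hit_rate(variant_large_chunks, ground_truth))
```

Swap in Ragas metrics in place of `hit_rate` and the loop stays the same: run every chunking, embedding, and LLM variant over identical questions, then compare aggregate scores.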