A deep dive into advanced indexing, pre-retrieval, retrieval, and post-retrieval techniques to enhance RAG performance
Have you ever asked a generative AI app, like ChatGPT, a question and found the answer incomplete, outdated, or just plain wrong? What if there were a way to fix this and make AI more accurate? There is! It’s called Retrieval-Augmented Generation, or just RAG. Introduced by Lewis et al. in their seminal paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks, RAG has swiftly become a cornerstone technique for making the outputs of Large Language Models (LLMs) more reliable and trustworthy.

LLMs have been shown to store factual knowledge in their parameters. This is referred to as parametric memory, and it is rooted in the data the LLM was trained on. RAG extends the knowledge of an LLM by giving it access to an external information store, or knowledge base. This knowledge base is also referred to as non-parametric memory, because it is not stored in the model parameters. In 2024, RAG is one of the most widely used techniques in generative AI applications.
60% of LLM applications utilize some form of RAG
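
To make the split between parametric and non-parametric memory concrete, here is a minimal sketch of the retrieve-then-augment idea in plain Python. Everything in it is an assumption for illustration: the three-sentence `KNOWLEDGE_BASE`, the word-overlap `score` function (real systems typically use embeddings or BM25 instead), and the helper names `retrieve` and `build_prompt`. No LLM is called; the sketch stops at the augmented prompt that would be sent to one.

```python
import re
from collections import Counter

# Toy knowledge base standing in for non-parametric memory.
# The contents are illustrative, not a real document store.
KNOWLEDGE_BASE = [
    "RAG was introduced by Lewis et al. in the paper "
    "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.",
    "Parametric memory is the factual knowledge an LLM stores in its parameters.",
    "Non-parametric memory is an external knowledge base the LLM can consult at query time.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query, doc):
    """Count the tokens shared by query and document (a crude relevance proxy)."""
    q_counts = Counter(tokenize(query))
    d_counts = Counter(tokenize(doc))
    return sum(min(q_counts[t], d_counts[t]) for t in q_counts)


def retrieve(query, docs, k=2):
    """Return the k documents with the highest overlap score."""
    ranked = sorted(docs, key=lambda doc: score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Augment the user question with retrieved context before generation."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    print(build_prompt("What is non-parametric memory?", KNOWLEDGE_BASE))
```

The prompt printed at the end is what an LLM would receive in place of the bare question. The model's parametric memory still does the language work, while the facts it is asked to rely on come from the external store.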