Low-Hanging Fruit for RAG Search

Low-Hanging Fruit for RAG Search - jxnl.co


For a collection of advanced Retrieval-Augmented Generation (RAG) techniques, this is a very resourceful repo. Many topics are covered, like:
- Metadata Filtering: Apply filters based on attributes like date, source, author, or document type.
- Similarity...

Rohan Paul

x.com

GitHub - infiniflow/ragflow: RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

infiniflow github.com


6 low hanging fruit to improve RAG in a few days https://t.co/XQzs2hnt6Q
1. Synthetic Data for Baseline Metrics
2. Adding Date Filters
3. Improving Thumbs Up/Down Copy
4. Tracking Average Cosine Distance and Cohere Reranking Score
5. ...

jason liu x.com
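Two of the items in the list above, date filters and tracking average cosine distance, can be sketched in a few lines of Python. This is a minimal illustration only, not the author's implementation: the chunk layout (`vec`, `date`, `text` keys) and the function names `search` and `average_distance` are hypothetical, and the vectors are toy values standing in for real embeddings.

```python
import math
from datetime import date


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(query_vec, chunks, start=None, end=None, k=3):
    """Apply an optional date filter, then rank the survivors by cosine distance."""
    candidates = [
        c for c in chunks
        if (start is None or c["date"] >= start)
        and (end is None or c["date"] <= end)
    ]
    scored = [(cosine_distance(query_vec, c["vec"]), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0])  # smallest distance first
    return scored[:k]


def average_distance(results):
    """Mean cosine distance of a result list; None when nothing was retrieved."""
    if not results:
        return None
    return sum(d for d, _ in results) / len(results)


# Hypothetical toy corpus: 2-d vectors stand in for real embeddings.
chunks = [
    {"text": "Q1 report", "vec": [1.0, 0.0], "date": date(2023, 1, 15)},
    {"text": "Q2 report", "vec": [1.0, 1.0], "date": date(2023, 4, 15)},
    {"text": "Holiday memo", "vec": [0.0, 1.0], "date": date(2023, 12, 1)},
]

results = search([1.0, 0.0], chunks, start=date(2023, 4, 1), k=2)
avg = average_distance(results)
```

Logging `avg` per query over time is one way to notice queries for which the index holds nothing close; what counts as "too far" depends on the embedding model and is not specified in the source.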

I think the biggest mistake in improving the system is that most people spend too much time on the synthesis step without understanding whether the data is being retrieved correctly. To avoid this:

Create synthetic questions for each text chunk in your database

Use these questions to test your retrieval system

Calculate

Systematically Improving Your RAG - jxnl.co
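The first two steps of that highlight can be sketched as a small evaluation loop. Several things here are assumptions: the last step is cut off in the source, so the metric used below (recall@k, the fraction of questions whose source chunk appears in the top k results) is only one plausible choice; the synthetic questions are hard-coded where a real setup would presumably generate them with a language model; and `retrieve` is a hypothetical keyword-overlap stand-in for a real retrieval system.

```python
def tokenize(text):
    """Lowercase, drop basic punctuation, and split into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())


def retrieve(question, chunks, k=2):
    """Toy retriever: rank chunk ids by word overlap with the question."""
    q = tokenize(question)
    ranked = sorted(
        chunks,
        key=lambda chunk_id: len(q & tokenize(chunks[chunk_id])),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(synthetic_questions, chunks, k=2):
    """Fraction of questions whose originating chunk is among the top k results."""
    hits = sum(
        1
        for question, source_id in synthetic_questions
        if source_id in retrieve(question, chunks, k)
    )
    return hits / len(synthetic_questions)


# Hypothetical chunks and one hand-written "synthetic" question per chunk.
chunks = {
    "c1": "Refunds are processed within five business days.",
    "c2": "The API rate limit is 100 requests per minute.",
    "c3": "Support is available on weekdays from 9 to 5.",
}
synthetic_questions = [
    ("How long do refunds take to be processed?", "c1"),
    ("What is the API rate limit?", "c2"),
    ("When is support available?", "c3"),
]

score = recall_at_k(synthetic_questions, chunks, k=1)
```

Because each question is tied to the chunk it was written from, the score isolates retrieval quality from answer synthesis, which is the point the highlight makes.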

RAG (Retrieval-Augmented Generation) is the most common pattern for building chatbots that answer questions about specific documents.

The Problem: Models like ChatGPT don't know about your company's internal documents or about events that happened after their 2023 training cut-off. Asking about either tends to produce hallucinated answers.
The Architecture:
1. Retrieve: When a