Systematically Improving Your RAG - jxnl.co
1. Synthetic Data for Baseline Metrics
Synthetic data can be used to establish baseline precision and recall metrics for your reverse search. The simplest kind of synthetic data is to take existing text chunks, generate synthetic questions, and verify that when we query our synthetic questions, the sourced text chunk is retrieved correctly.
Benefi...
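The check described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: `stub_question` is a hypothetical stand-in for an LLM call that would write a question from a chunk, `retrieve` is a toy word-overlap ranker standing in for your real search, and the sample chunks are made up. It measures only recall@k, meaning the fraction of synthetic questions whose source chunk comes back in the top k results.

```python
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(query: str, chunks: list[str], k: int) -> list[int]:
    """Toy retriever (assumption): rank chunks by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: len(q & tokenize(chunks[i])),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(
    chunks: list[str],
    make_question: Callable[[str], str],
    k: int = 1,
) -> float:
    """Fraction of synthetic questions whose source chunk is in the top k."""
    hits = 0
    for i, chunk in enumerate(chunks):
        question = make_question(chunk)  # synthetic question for chunk i
        if i in retrieve(question, chunks, k):
            hits += 1
    return hits / len(chunks)


def stub_question(chunk: str) -> str:
    """Hypothetical placeholder for an LLM that writes a question from a chunk."""
    return "What does the text say about " + " ".join(chunk.split()[:3]) + "?"


# Made-up example chunks, purely for illustration.
chunks = [
    "Invoices are due within thirty days of receipt.",
    "Refunds require a signed return form.",
    "Support hours run from nine to five on weekdays.",
]

baseline = recall_at_k(chunks, stub_question, k=1)
print(f"recall@1 = {baseline:.2f}")
```

In practice you would swap `stub_question` for a real question generator and `retrieve` for your own search, then track the resulting number as the baseline to beat. Questions built from the chunk's own words, as in this stub, make retrieval artificially easy, so a real generator should paraphrase.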
Low-Hanging Fruit for RAG Search - jxnl.co
RAG (Retrieval-Augmented Generation) is the most common pattern for building chatbots that answer questions about specific documents.
The Problem: Models like ChatGPT don't know about your company's internal documents or about events that happened after their 2023 training cut-off. Asking about either tends to produce hallucinations.
The Architecture:
Retrieve: When a