Low-Hanging Fruit for RAG Search - jxnl.co
I think the biggest mistake when improving a RAG system is that most people spend too much time on the synthesis step without first checking whether the right data is being retrieved at all. To avoid this:
- Create synthetic questions for each text chunk in your database
- Use these questions to test your retrieval system
- Calculate precision and recall to evaluate retrieval quality (a minimal sketch follows below)
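A minimal sketch of that loop, assuming a hypothetical `generate_question` LLM call and a `retrieve` function over your own index; recall@k here is simply the fraction of synthetic questions whose source chunk comes back in the top k:

```python
from typing import Callable

def evaluate_retrieval(
    chunks: dict[str, str],                     # chunk_id -> chunk text
    generate_question: Callable[[str], str],    # e.g. an LLM call: chunk text -> question
    retrieve: Callable[[str, int], list[str]],  # query, k -> list of retrieved chunk_ids
    k: int = 5,
) -> float:
    """Fraction of synthetic questions whose source chunk is retrieved in the top k."""
    hits = 0
    for chunk_id, text in chunks.items():
        question = generate_question(text)   # synthetic question answerable by this chunk
        retrieved_ids = retrieve(question, k)
        if chunk_id in retrieved_ids:         # did the source chunk come back?
            hits += 1
    return hits / len(chunks)
```

If this score is low, no amount of work on the synthesis prompt will fix the answers, which is the point of measuring retrieval first.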
Systematically Improving Your RAG - jxnl.co
Nicolay Gerold added
Michael Iversen added
Retrieval-augmented generation (RAG) is a technique that enhances text generation by retrieving and incorporating external knowledge.
Ben Auffarth • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs
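A rough sketch of that retrieve-then-generate flow; `vector_store` and `llm` are hypothetical stand-ins for whatever retriever and model you use:

```python
def rag_answer(question: str, vector_store, llm, k: int = 3) -> str:
    # Retrieve external knowledge relevant to the question.
    docs = vector_store.search(question, k=k)
    # Incorporate the retrieved text into the prompt the model sees.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        f"Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # Generate an answer grounded in the retrieved context.
    return llm.complete(prompt)
```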
Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines, but evaluating them and quantifying your pipeline performance can be hard. This is where Ragas comes in.
explodinggradients • GitHub - explodinggradients/ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Nicolay Gerold added
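For reference, the `evaluate()` entry point from the Ragas quickstart looks roughly like this; the metric names and dataset columns have shifted across versions, so treat it as a sketch and check the current docs:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One row per question: the generated answer plus the retrieved contexts it was based on.
data = {
    "question": ["When was the first Super Bowl?"],
    "answer": ["The first Super Bowl was held on January 15, 1967."],
    "contexts": [[
        "The First AFL-NFL World Championship Game was played on January 15, 1967."
    ]],
}

results = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(results)  # per-metric scores for the pipeline
```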
A couple off the top of my head:
- LLM in the loop with preference optimization
- synthetic data generation
- cross modality "distillation" / dictionary remapping
- constrained decoding (sketched after this card)
r/MachineLearning - Reddit
Nicolay Gerold added
Additional LLM paradigms beyond RAG
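Of the items above, constrained decoding is the most concrete to show in code. A minimal sketch, assuming a Hugging Face-style causal LM and tokenizer; the fixed allow-list constraint here is deliberately simplistic (real systems constrain with grammars or schemas):

```python
import torch

def constrained_greedy_decode(
    model, tokenizer, prompt: str, allowed_token_ids: set[int], max_new_tokens: int = 32
) -> str:
    """Greedy decoding where only tokens in `allowed_token_ids` can be emitted."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(input_ids).logits[:, -1, :]      # next-token logits
            mask = torch.full_like(logits, float("-inf"))
            mask[:, list(allowed_token_ids)] = 0.0           # keep only allowed tokens
            next_id = torch.argmax(logits + mask, dim=-1, keepdim=True)
            input_ids = torch.cat([input_ids, next_id], dim=-1)
            if next_id.item() == tokenizer.eos_token_id:
                break
    return tokenizer.decode(input_ids[0], skip_special_tokens=True)
```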
There are two basic approaches to creating AI datasets. In the first one, which is typical of the case we have been studying, a pool of open works is purposefully chosen to ensure license compliance. The second approach creates the dataset by scraping the “raw internet” and relying on copyright exceptions. LAION, a dataset of 400 million image-text pairs...
Alek Tarkowski • Filling the governance vacuum related to the use of information commons for AI training
madisen added