Low-Hanging Fruit for RAG Search - jxnl.co
I think the biggest mistake when improving a RAG system is that most people spend too much time on the synthesis step without first checking whether the right data is being retrieved at all. To avoid this:
- Create synthetic questions for each text chunk in your database
- Use these questions to test your retrieval system
- Calculate precision and recall to evaluate retrieval quality (a minimal sketch follows below)
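A minimal sketch of that loop, assuming a hypothetical `generate_question` LLM call and a `retrieve` function over your own index; recall@k here is simply the fraction of synthetic questions whose source chunk comes back in the top k:

```python
from typing import Callable

def evaluate_retrieval(
    chunks: dict[str, str],                     # chunk_id -> chunk text
    generate_question: Callable[[str], str],    # e.g. an LLM call: chunk text -> question
    retrieve: Callable[[str, int], list[str]],  # query, k -> list of retrieved chunk_ids
    k: int = 5,
) -> float:
    """Fraction of synthetic questions whose source chunk is retrieved in the top k."""
    hits = 0
    for chunk_id, text in chunks.items():
        question = generate_question(text)   # synthetic question answerable by this chunk
        retrieved_ids = retrieve(question, k)
        if chunk_id in retrieved_ids:         # did the source chunk come back?
            hits += 1
    return hits / len(chunks)
```

If this score is low, no amount of work on the synthesis prompt will fix the answers, which is the point of measuring retrieval first.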
Systematically Improving Your RAG - jxnl.co
Nicolay Gerold added
Michael Iversen added
Retrieval-augmented generation (RAG) is a technique that enhances text generation by retrieving and incorporating external knowledge.
Ben Auffarth • Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs
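A rough sketch of that retrieve-then-generate flow; `vector_store` and `llm` are hypothetical stand-ins for whatever retriever and model you use:

```python
def rag_answer(question: str, vector_store, llm, k: int = 3) -> str:
    # Retrieve external knowledge relevant to the question.
    docs = vector_store.search(question, k=k)
    # Incorporate the retrieved text into the prompt the model sees.
    context = "\n\n".join(doc.text for doc in docs)
    prompt = (
        f"Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # Generate an answer grounded in the retrieved context.
    return llm.complete(prompt)
```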
Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines, but evaluating them and quantifying your pipeline performance can be hard. This is where Ragas comes in.
explodinggradients • GitHub - explodinggradients/ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
Nicolay Gerold added
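For reference, the `evaluate()` entry point from the Ragas quickstart looks roughly like this; the metric names and dataset columns have shifted across versions, so treat it as a sketch and check the current docs:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy

# One row per question: the generated answer plus the retrieved contexts it was based on.
data = {
    "question": ["When was the first Super Bowl?"],
    "answer": ["The first Super Bowl was held on January 15, 1967."],
    "contexts": [[
        "The First AFL-NFL World Championship Game was played on January 15, 1967."
    ]],
}

results = evaluate(Dataset.from_dict(data), metrics=[faithfulness, answer_relevancy])
print(results)  # per-metric scores for the pipeline
```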
A couple off the top of my head:
- LLM in the loop with preference optimization
- synthetic data generation
- cross modality "distillation" / dictionary remapping
- constrained decoding (sketched after this card)
r/MachineLearning - Reddit
Nicolay Gerold added
Additional LLM paradigms beyond RAG
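Of the items above, constrained decoding is the most concrete to show in code. A minimal sketch, assuming a Hugging Face-style causal LM and tokenizer; the fixed allow-list constraint here is deliberately simplistic (real systems constrain with grammars or schemas):

```python
import torch

def constrained_greedy_decode(
    model, tokenizer, prompt: str, allowed_token_ids: set[int], max_new_tokens: int = 32
) -> str:
    """Greedy decoding where only tokens in `allowed_token_ids` can be emitted."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(input_ids).logits[:, -1, :]      # next-token logits
            mask = torch.full_like(logits, float("-inf"))
            mask[:, list(allowed_token_ids)] = 0.0           # keep only allowed tokens
            next_id = torch.argmax(logits + mask, dim=-1, keepdim=True)
            input_ids = torch.cat([input_ids, next_id], dim=-1)
            if next_id.item() == tokenizer.eos_token_id:
                break
    return tokenizer.decode(input_ids[0], skip_special_tokens=True)
```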
There are two basic approaches to creating AI datasets. In the first one, which is typical of the case we have been studying, a pool of open works is purposefully chosen to ensure license compliance. The second approach creates the dataset by scraping the “raw internet” and relying on copyright exceptions. LAION, a dataset of 400 million image-text pairs...
Alek Tarkowski • Filling the governance vacuum related to the use of information commons for AI training
madisen added