
Data Machina #222

Ragas is a framework that helps you evaluate your Retrieval Augmented Generation (RAG) pipelines. RAG denotes a class of LLM applications that use external data to augment the LLM’s context. There are existing tools and frameworks that help you build these pipelines but evaluating it and quantifying your pipeline performance can be hard. This is... See more
explodinggradients • GitHub - explodinggradients/ragas: Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
ANY
LLM of your choice, statistical methods, or NLP models that runs
locally on your machine
:
- G-Eval
- Summarization
- Answer Relevancy
- Faithfulness
- Contextual Recall
- Contextual Precision
- RAGAS
- Hallucination
- Toxicity
- Bias
- etc.
GitHub - confident-ai/deepeval: The LLM Evaluation Framework
A framework of automated and human-in-the-loop evaluations (e.g., code-based scripts, LLM-as-judge checks) that measures whether an AI product “actually works,” supporting rapid learning cycles and alignment with user and business goals.