Pytest for LLM Apps!
DeepEval turns LLM evaluation into a two-line test suite, helping you identify the best models, prompts, and architecture for your AI workflow.
Works with any framework, including LlamaIndex, LangChain, CrewAI, and more.
100% open-source, ...
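To make the "Pytest for LLM apps" idea concrete, here is a minimal sketch of an LLM check written as an ordinary pytest test. It deliberately does not use DeepEval's API: `EvalCase`, `keyword_overlap`, and `fake_llm_app` are hypothetical names invented for illustration, and the word-overlap score is a toy stand-in for a real evaluation metric.

```python
import re
from dataclasses import dataclass


@dataclass
class EvalCase:
    """One evaluation example (hypothetical type, not part of DeepEval)."""
    prompt: str
    actual_output: str


def keyword_overlap(case: EvalCase) -> float:
    """Toy relevancy score: fraction of prompt words that reappear in the answer."""
    prompt_words = set(re.findall(r"[a-z0-9]+", case.prompt.lower()))
    output_words = set(re.findall(r"[a-z0-9]+", case.actual_output.lower()))
    if not prompt_words:
        return 0.0
    return len(prompt_words & output_words) / len(prompt_words)


def fake_llm_app(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline (assumption)."""
    return "The refund policy allows a full refund within 30 days."


def test_refund_answer_is_relevant():
    # pytest collects functions named test_*; a failed assert fails the test.
    prompt = "What is the refund policy?"
    case = EvalCase(prompt=prompt, actual_output=fake_llm_app(prompt))
    assert keyword_overlap(case) >= 0.5
```

Saved in a file such as `test_llm_app.py` (a hypothetical name), this runs with plain `pytest`. Swapping the toy score for a real metric and `fake_llm_app` for an actual model call keeps the same shape: build a test case, score it, assert against a threshold.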