GitHub - confident-ai/deepeval: The LLM Evaluation Framework

"My benchmark for large language models"
https://t.co/YZBuwpL0tl
Nice post, but even more than the 100 tests specifically, the GitHub code looks excellent: a full-featured test evaluation framework, easy to extend with further tests and to run against many...
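As a rough illustration of what extending such a framework with one more test can look like, here is a minimal sketch using deepeval (linked at the top). The metric choice, threshold, and example strings are assumptions for illustration only, and the exact API may differ between versions.

```python
# Minimal sketch of a single LLM evaluation test with deepeval.
# Assumed API based on the project's documented quickstart; names and
# signatures may vary between releases.
from deepeval import assert_test
from deepeval.test_case import LLMTestCase
from deepeval.metrics import AnswerRelevancyMetric

def test_refund_policy_answer():
    # Hypothetical input/output pair; in practice actual_output would come
    # from calling your own model or RAG pipeline.
    test_case = LLMTestCase(
        input="What if these shoes don't fit?",
        actual_output="We offer a 30-day full refund at no extra cost.",
        retrieval_context=[
            "All customers are eligible for a 30-day full refund at no extra cost."
        ],
    )
    # Assumed metric and threshold; adding another test is just another
    # function like this one, so the suite grows test by test.
    assert_test(test_case, [AnswerRelevancyMetric(threshold=0.7)])
```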
GitHub - Shubhamsaboo/awesome-llm-apps: Collection of awesome LLM apps with AI Agents and RAG using OpenAI, Anthropic, Gemini and open-source models.
GitHub - arthur-ai/bench: A tool for evaluating LLMs