GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.
github.com