GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

github.com
Aarush Sah
x.com

GitHub - arthur-ai/bench: A tool for evaluating LLMs