GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

github.com

Testing framework for LLM Part

GitHub - AnswerDotAI/rerankers

Humanity's Last Exam

maggieappleton.com