GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

A new v0.4.0 release of lm-evaluation-harness is available!

New updates and features include:

Internal refactoring

Config-based task creation and configuration

Easier import and sharing of externally-defined task config YAMLs

Support for Jinja2 prompt design, easy modification of prompts, and prompt imports from Promptsource

More advanced configuration opt
