GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

GitHub - arthur-ai/bench: A tool for evaluating LLMs

GitHub - BrunoScaglione/langtest: Deliver safe & effective language models

Long-Context Retrieval Models with Monarch Mixer

GitHub - confident-ai/deepeval: The LLM Evaluation Framework

GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers

GitHub - kaistAI/CoT-Collection: [Under Review] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning