GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.

github.com

Judge Arena: Benchmarking LLMs as Evaluators

AtlaAI · huggingface.co

Long-Context Retrieval Models with Monarch Mixer

Prompt Engineering

kaggle.com

Ben Auffarth, Generative AI with LangChain: Build large language model (LLM) apps with Python, ChatGPT, and other LLMs

GitHub - HandsOnLLM/Hands-On-Large-Language-Models: Official code repo for the O'Reilly Book - "Hands-On Large Language Models"

github.com

GitHub - deepseek-ai/DeepSeek-Coder: DeepSeek Coder: Let the Code Write Itself

deepseek-ai · github.com