GitHub - sqrkl/lm-evaluation-harness: A framework for few-shot evaluation of language models.
GitHub - arthur-ai/bench: A tool for evaluating LLMs
BA Builder added
Take a look at our official page for user documentation and examples: langtest.org
Key Features
- Generate and execute more than 50 distinct types of tests with only one line of code
- Test all aspects of model quality: robustness, bias, representation, fairness and accuracy.
- Automatically augment training data based on test results (for select models)
- Sup
GitHub - BrunoScaglione/langtest: Deliver safe & effective language models
Nicolay Gerold added
Welcome to prompttools created by Hegel AI! This repo offers a set of open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts. The core idea is to enable developers to evaluate using familiar interfaces like code, notebooks, and a local playground...
In just a few lines of code, you can t
Testing framework for LLM Part
Nicolay Gerold added
Text embeddings are a critical piece of many pipelines, from search, to RAG, to vector databases and more. Most embedding models are BERT/Transformer-based and typically have short context lengths (e.g., 512 tokens). That’s only about two pages of text, but documents can be very long – books, legal cases, TV screenplays, code repositories, etc. can be tens...
Long-Context Retrieval Models with Monarch Mixer
Nicolay Gerold added
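As an illustration of the short context limit described in the excerpt above, here is a minimal sketch of the usual workaround: splitting a long document into overlapping windows before embedding. The function name `chunk_tokens`, the 64-token overlap, and the whitespace split standing in for a real tokenizer are all assumptions made for this example and do not come from the linked post.

```python
def chunk_tokens(tokens, max_len=512, overlap=64):
    """Split a token list into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that text cut at a
    boundary still appears intact in at least one window.
    """
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must satisfy 0 <= overlap < max_len")

    step = max_len - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_len])
        # Stop once a window reaches the end of the document.
        if start + max_len >= len(tokens):
            break
    return chunks


# Assumption: whitespace-separated words stand in for model tokens.
document = "word " * 1200
tokens = document.split()
chunks = chunk_tokens(tokens)
print(len(chunks), [len(c) for c in chunks])
```

Each window would then be embedded separately, which is the overhead that long-context retrieval models aim to avoid.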
ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine:
- G-Eval
- Summarization
- Answer Relevancy
- Faithfulness
- Contextual Recall
- Contextual Precision
- RAGAS
- Hallucination
- Toxicity
- Bias
- etc.
GitHub - confident-ai/deepeval: The LLM Evaluation Framework
Nicolay Gerold added
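To give a rough sense of what two of the retrieval metrics listed above measure, here is a simplified, purely set-based sketch of precision and recall over retrieved context. This is not deepeval's implementation or API; the library's own metrics may be computed differently (for example with an LLM judge). The function names and document IDs are hypothetical.

```python
def simple_context_precision(retrieved, relevant):
    """Fraction of retrieved items that are actually relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(retrieved)


def simple_context_recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved."""
    if not relevant:
        return 0.0
    found = sum(1 for doc in relevant if doc in retrieved)
    return found / len(relevant)


# Hypothetical document IDs, for illustration only.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3", "d5"}

precision = simple_context_precision(retrieved, relevant)
recall = simple_context_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A retriever that returns many irrelevant chunks lowers precision, while one that misses needed chunks lowers recall.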
LLM-PowerHouse: A Curated Guide for Large Language Models with Custom Training and Inferencing
Welcome to LLM-PowerHouse, your ultimate resource for unleashing the full potential of Large Language Models (LLMs) with custom training and inferencing. This GitHub repository is a comprehensive and curated guide designed to empower developers, researche...
ghimiresunil • GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.
Nicolay Gerold added
One of the focus areas at Together Research is new architectures for long context, improved training, and inference performance over the Transformer architecture. Spinning out of a research program from our team and academic collaborators, with roots in signal processing-inspired sequence models, we are excited to introduce the StripedHyena models...
Paving the way to efficient architectures: StripedHyena-7B, open source models offering a glimpse into a world beyond Transformers
Nicolay Gerold added
Repository for the paper "The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning", including 1.84M CoT rationales extracted across 1,060 tasks.
Paper link: https://arxiv.org/abs/2305.14045
kaistAI • GitHub - kaistAI/CoT-Collection: [Under Review] The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
Nicolay Gerold added