GitHub - Giskard-AI/giskard: The testing framework for ML models, from tabular to LLMs
Take a look at our official page for user documentation and examples: langtest.org
Key Features
- Generate and execute more than 50 distinct types of tests with only 1 line of code
- Test all aspects of model quality: robustness, bias, representation, fairness and accuracy
- Automatically augment training data based on test results (for select models)
- Sup
GitHub - BrunoScaglione/langtest: Deliver safe & effective language models
Nicolay Gerold added
Welcome to prompttools, created by Hegel AI! This repo offers a set of open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts. The core idea is to enable developers to evaluate using familiar interfaces like code, notebooks, and a local playground.
In just a few lines of code, you can t
Testing framework for LLM Part
Nicolay Gerold added
Creative AI Lab
creative-ai.org
Isabelle Levent added
lakera.ai
An Overview of Lakera Guard: Bringing Enterprise-Grade Security to LLMs with One Line of Code
At Lakera, we supercharge AI developers by enabling them to swiftly identify and eliminate their AI applications' security threats so that they can focus on building the most exciting applications securely.
Businesses around the world are in
Testing framework for LLM Part
Nicolay Gerold added
baserun.ai
Testing & Observability Platform for LLM Apps
From prompt playground to end-to-end tests, baserun helps you ship your LLM apps with confidence and speed.
Testing framework for LLM Part
Nicolay Gerold added
GitHub - AI4Finance-Foundation/FinRobot: FinRobot: An Open-Source AI Agent Platform for Financial Applications using LLMs
Steve Werber added
GitHub - arthur-ai/bench: A tool for evaluating LLMs
BA Builder added
Creating test suites - bench documentation
BA Builder added
Best way to manage AI experiments. Very generic and extensible. Should make something similar.