GitHub - BrunoScaglione/langtest: Deliver safe & effective language models
Evaluation metrics powered by ANY LLM of your choice, statistical methods, or NLP models that run locally on your machine (a usage sketch follows the link below):
- G-Eval
- Summarization
- Answer Relevancy
- Faithfulness
- Contextual Recall
- Contextual Precision
- RAGAS
- Hallucination
- Toxicity
- Bias
- etc.
GitHub - confident-ai/deepeval: The LLM Evaluation Framework
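A minimal sketch of scoring one of these metrics with deepeval, assuming the package is installed and an OpenAI key is configured as the evaluation model; the question and answer strings are invented for illustration:

```python
# Hedged sketch: runs a single Answer Relevancy check with deepeval.
# Assumes `pip install deepeval` and an OpenAI API key in the environment;
# the input/output strings below are made up.
from deepeval import evaluate
from deepeval.metrics import AnswerRelevancyMetric
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    input="What if these shoes don't fit?",               # user query
    actual_output="You can return them within 30 days.",  # LLM answer
)
metric = AnswerRelevancyMetric(threshold=0.7)  # pass if score >= 0.7
evaluate(test_cases=[test_case], metrics=[metric])  # prints per-metric results
```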
Welcome to prompttools, created by Hegel AI! This repo offers a set of open-source, self-hostable tools for experimenting with, testing, and evaluating LLMs, vector databases, and prompts. The core idea is to enable developers to evaluate using familiar interfaces like code, notebooks, and a local playground…
In just a few lines of code, you can…
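For example, a small grid experiment sketched against prompttools' documented experiment interface; the models, prompt, and temperatures are placeholder choices:

```python
# Hedged sketch of a prompttools experiment comparing two models x two
# temperatures on one prompt. Assumes `pip install prompttools` and an
# OpenAI key in the environment; argument order follows the project's docs.
from prompttools.experiment import OpenAIChatExperiment

messages = [[{"role": "user", "content": "Summarize RAG in one sentence."}]]
models = ["gpt-3.5-turbo", "gpt-4"]
temperatures = [0.0, 1.0]

experiment = OpenAIChatExperiment(models, messages, temperature=temperatures)
experiment.run()        # executes every (model, temperature) combination
experiment.visualize()  # renders a results table (DataFrame in notebooks)
```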
LLM-PowerHouse: A Curated Guide for Large Language Models with Custom Training and Inferencing
Welcome to LLM-PowerHouse, your ultimate resource for unleashing the full potential of Large Language Models (LLMs) with custom training and inferencing. This GitHub repository is a comprehensive and curated guide designed to empower developers, researche…
ghimiresunil • GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.
GitHub - arthur-ai/bench: A tool for evaluating LLMs
promptfoo is a tool for testing and evaluating LLM output quality… A config sketch follows the feature list below.
With promptfoo, you can:
- Systematically test prompts & models against predefined test cases
- Evaluate quality and catch regressions by comparing LLM outputs side-by-side
- Speed up evaluations with caching and concurrency
- Score outputs automatically by defining test cases
- Use as a…
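A minimal promptfooconfig.yaml sketch covering those features; promptfoo evaluations are driven by a YAML config plus the `promptfoo eval` CLI, and the prompt, providers, and assertion values here are placeholders:

```yaml
# Hedged sketch of a promptfoo config: one prompt template, two providers,
# and one test case with an automatic assertion. Values are illustrative.
prompts:
  - "Translate this into French: {{text}}"
providers:
  - openai:gpt-3.5-turbo
  - openai:gpt-4
tests:
  - vars:
      text: "Hello, world"
    assert:
      - type: contains
        value: "Bonjour"
```

Running `promptfoo eval` then scores both providers against the assertion and shows their outputs side by side.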
baserun.ai
Testing & Observability Platform for LLM Apps
From prompt playground to end-to-end tests, baserun helps you ship your LLM apps with confidence and speed.
DeepEval is a tool for easy and efficient LLM testing. It aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.
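A hedged sketch of that unit-test style, assuming deepeval is installed; the question, answer, context, and threshold are invented:

```python
# Hedged sketch: a pytest-style DeepEval test, run with `deepeval test run`.
# Assumes `pip install deepeval`; strings and threshold are illustrative.
from deepeval import assert_test
from deepeval.metrics import HallucinationMetric
from deepeval.test_case import LLMTestCase

def test_answer_is_grounded():
    test_case = LLMTestCase(
        input="When was the first iPhone released?",
        actual_output="The first iPhone was released in 2007.",
        context=["The first iPhone went on sale on June 29, 2007."],
    )
    # Fails the test if the hallucination score exceeds the threshold.
    assert_test(test_case, [HallucinationMetric(threshold=0.5)])
```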
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with LLMs faster and more controllable by co-designing the frontend language and the runtime system.
The core features of SGLang include:
- A Flexible Front-End Language: This allows for easy programming of LLM applications with multiple ch…
sgl-project • GitHub - sgl-project/sglang: SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
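A small sketch of the front-end language, following the project's documented decorator API; the backend choice and prompt strings are placeholders:

```python
# Hedged sketch of an SGLang program: the @sgl.function decorator defines a
# generation flow, and sgl.gen() marks where the runtime fills in text.
# Assumes `pip install "sglang[openai]"` and an OpenAI key; strings invented.
import sglang as sgl

@sgl.function
def qa(s, question):
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("answer", max_tokens=64, temperature=0))

sgl.set_default_backend(sgl.OpenAI("gpt-3.5-turbo"))
state = qa.run(question="What does co-designing language and runtime buy you?")
print(state["answer"])  # the text captured by the "answer" slot
```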
SGLang also has a fast JSON decoding feature that uses a finite state machine to constrain generation.
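That constrained decoding is exposed through a regex argument on gen(); a hedged sketch, where the JSON-shaped regex is a toy example and FSM-backed decoding applies on SGLang's local runtime backends:

```python
# Hedged sketch of SGLang's regex-constrained generation: the runtime compiles
# the regex into a finite state machine and masks out any token that would
# leave the accepted language, so the output is valid JSON by construction.
import sglang as sgl

@sgl.function
def character(s, name):
    s += name + " as JSON: "
    s += sgl.gen(
        "profile",
        max_tokens=64,
        regex=r'\{"name": "[A-Za-z ]+", "age": [0-9]+\}',  # toy JSON schema
    )
```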