GitHub - confident-ai/deepeval: The LLM Evaluation Framework

DeepEval is a tool for easy and efficient LLM testing. It aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.

Testing framework for LLM Part

Nicolay Gerold added
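
To make the "as easy as writing Python unit tests" idea in the DeepEval highlight concrete, here is a minimal sketch using only the standard library. It does not use DeepEval's API: `fake_rag_app` and `keyword_coverage` are hypothetical names invented for illustration, and the keyword-matching metric is a toy stand-in for the model-graded metrics a real evaluation framework would provide.

```python
import unittest


def fake_rag_app(question: str) -> str:
    """Hypothetical stand-in for an LLM/RAG application; returns a canned answer."""
    canned = {
        "What is the refund window?": "You can request a refund within 30 days of purchase.",
    }
    return canned.get(question, "I don't know.")


def keyword_coverage(answer: str, expected_keywords: list[str]) -> float:
    """Toy metric (an assumption, not a DeepEval metric):
    fraction of expected keywords found in the answer, case-insensitive."""
    if not expected_keywords:
        return 1.0
    text = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


class TestRagAnswers(unittest.TestCase):
    def test_refund_answer_mentions_key_facts(self):
        # Treat the LLM output like any other function result under test.
        answer = fake_rag_app("What is the refund window?")
        score = keyword_coverage(answer, ["refund", "30 days"])
        # Pass only if the score clears a chosen threshold.
        self.assertGreaterEqual(score, 0.5)
```

Saved as a test module, this can be run with `python -m unittest`. The shape is the point: an input, the application's output, a scored metric, and a threshold assertion.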

GitHub - arthur-ai/bench: A tool for evaluating LLMs

BA Builder added

Deep-ML

deep-ml.com

and added

Our extensive test over 25 LLMs (including APIs and open-sourced models) shows that, while top commercial LLMs present a strong ability of acting as agents in complex environments, there is a significant disparity in performance between them and open-sourced competitors.

Xiao Liu • AgentBench: Evaluating LLMs as Agents

Darren LI added

LLM-PowerHouse: A Curated Guide for Large Language Models with Custom Training and Inferencing

Welcome to LLM-PowerHouse, your ultimate resource for unleashing the full potential of Large Language Models (LLMs) with custom training and inferencing. This GitHub repository is a comprehensive and curated guide designed to empower developers, researche...

ghimiresunil • GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

Nicolay Gerold added

GitHub - MadcowD/ell: A language model programming library.

github.com

added

Amplify Partners surveyed 800+ AI engineers to bring transparency to the AI engineering space. The report is concise, yet it offers a wealth of insight into the technologies and methods companies use to implement AI products.

Highlights

👉 Top AI use cases are code intelligence, data extraction and workflow a...

Feed | LinkedIn

added

A couple off the top of my head:

LLM in the loop with preference optimization

synthetic data generation

cross modality "distillation" / dictionary remapping

constrained decoding

r/MachineLearning - Reddit

Nicolay Gerold added

Additional LLM paradigms beyond RAG