GitHub - confident-ai/deepeval: The LLM Evaluation Framework
DeepEval β Itβs a tool for easy and efficient LLM testing. Deepeval aims to make writing tests for LLM applications (such as RAG) as easy as writing Python unit tests.
Testing framework for LLM Part
Nicolay Gerold added
GitHub - arthur-ai/bench: A tool for evaluating LLMs
GitHub - arthur-ai/bench: A tool for evaluating LLMs
BA Builder added
Deep-ML
deep-ml.comOur extensive test over 25 LLMs (including APIs and open-sourced models) shows that, while top commercial LLMs present a strong ability of acting as agents in complex environments, there is a significant disparity in performance between them and open-sourced competitors.
Xiao Liu β’ AgentBench: Evaluating LLMs as Agents
Darren LI added
LLM-PowerHouse: A Curated Guide for Large Language Models with Custom Training and Inferencing
Welcome to LLM-PowerHouse, your ultimate resource for unleashing the full potential of Large Language Models (LLMs) with custom training and inferencing. This GitHub repository is a comprehensive and curated guide designed to empower developers, researche... See more
Welcome to LLM-PowerHouse, your ultimate resource for unleashing the full potential of Large Language Models (LLMs) with custom training and inferencing. This GitHub repository is a comprehensive and curated guide designed to empower developers, researche... See more
ghimiresunil β’ GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.
Nicolay Gerold added
GitHub - MadcowD/ell: A language model programming library.
github.comAmplify Partners was running a survey among 800+ AI engineers to bring transparency to the AI Engineering space. The report is concise, yet it provides a wealth of insights into the technologies and methods employed by companies for the implementation of AI products.
Highlights
π Top AI use cases are code intelligence, data extraction and workflow a... See more
Highlights
π Top AI use cases are code intelligence, data extraction and workflow a... See more
Feed | LinkedIn
a couple of the top of my head:
- LLM in the loop with preference optimization
- synthetic data generation
- cross modality "distillation" / dictionary remapping
- constrained decoding
r/MachineLearning - Reddit
Nicolay Gerold added
Additional LLM paradigms beyond RAG