GitHub - amoffat/HeimdaLLM: Verify LLM output

amoffat github.com


Patterns for Building LLM-based Systems & Products

Eugene Yan eugeneyan.com

llama.ttf

fuglede.github.io

GitHub - abradburne/lume

github.com

A couple off the top of my head:

LLM in the loop with preference optimization

synthetic data generation

cross modality "distillation" / dictionary remapping

constrained decoding

r/MachineLearning - Reddit

GitHub - gregpr07/browser-use: Make websites accessible for AI agents

github.com

DataTrove

DataTrove is a library to process, filter, and deduplicate text data at a very large scale. It provides a set of prebuilt, commonly used processing blocks with a framework to easily add custom functionality.

DataTrove processing pipelines are platform-agnostic, running out of the box locally or on a Slurm cluster. Its (relatively) low memory...

huggingface • GitHub - huggingface/datatrove: Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
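
To make the "processing blocks" idea in the DataTrove description concrete, here is a minimal pure-Python sketch of a filter block and a deduplication block chained into a pipeline. It does not use DataTrove and is not its API: the names `min_length_filter`, `exact_dedup`, and `run_pipeline` are hypothetical, chosen only for illustration, and the dedup is a simple exact match on normalized text rather than whatever methods the library provides.

```python
import hashlib
from typing import Callable, Iterable, Iterator, List

# A "block" (hypothetical name) is a function from a document stream to a document stream.
Block = Callable[[Iterable[str]], Iterator[str]]


def min_length_filter(min_chars: int) -> Block:
    """Build a block that drops documents shorter than min_chars after stripping."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        for doc in docs:
            if len(doc.strip()) >= min_chars:
                yield doc
    return block


def exact_dedup() -> Block:
    """Build a block that drops documents whose normalized text was already seen."""
    def block(docs: Iterable[str]) -> Iterator[str]:
        seen = set()
        for doc in docs:
            # Normalize case and whitespace, then hash so only digests are kept in memory.
            normalized = " ".join(doc.lower().split())
            key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
            if key not in seen:
                seen.add(key)
                yield doc
    return block


def run_pipeline(docs: Iterable[str], blocks: List[Block]) -> List[str]:
    """Chain the blocks lazily, then materialize the surviving documents."""
    stream: Iterable[str] = docs
    for block in blocks:
        stream = block(stream)
    return list(stream)


docs = [
    "Hello   world, this is a test.",
    "hi",
    "hello world, this is a test.",
    "Another long enough document.",
]
result = run_pipeline(docs, [min_length_filter(10), exact_dedup()])
print(result)
```

Because each block only consumes and yields an iterator, custom steps can be added by writing one more function with the same shape, which is the general design the description points at.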