GitHub - NVIDIA/NeMo-Curator: Scalable toolkit for data curation

github.com

Models All The Way Down

knowingmachines.org

Lila Shroff: Datasets as Imagination

GitHub - elicit/machine-learning-list: A curriculum for learning about foundation models, from scratch to the frontier

github.com

GitHub - databonsai/databonsai: clean & curate your data with LLMs.