togethercomputer/RedPajama-Data-V2 · Datasets at Hugging Face

huggingface.co

Datasets as Imagination

Lila Shroff, joinreboot.org

Sarah Drinkwater and added

alibaba GitHub - alibaba/data-juicer: A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️ 🍸 🍹 🍷

Alek Tarkowski Filling the governance vacuum related to the use of information commons for AI training

madisen added

Open X-Embodiment: Robotic Learning Datasets and RT-X Models

Darren LI added

Eric Siegel Predictive Analytics

DataStax Retrieval Augmented Generation (RAG) Explained: Understanding Key Concepts

GitHub - NVIDIA/NeMo-Curator: Scalable toolkit for data curation

Joel Gurin Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation (Business Books)