GitHub - alibaba/data-juicer: A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷

Arctic-SnowCoder: Demystifying High-Quality Data in Code Pretraining
abs: https://t.co/5VOVrPCn5A
New...

Short Musings on AI Engineering and "Failed AI Projects"
sh-reya.com