open source projects

madisen and

Dolma: 3 Trillion Token Open Corpus for Language Model Pretraining

Luca Soldaini, blog.allenai.org

Jason Barrett Prado: DAOs are interesting, likely, and terrifying
