GitHub - huggingface/datatrove: Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks.
updated 8mo ago
Nicolay Gerold added