GitHub - Unstructured-IO/unstructured: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
GitHub - jxnl/instructor-classify
github.com
LLM data - Anna’s Archive
annas-archive.org
Gitingest
gitingest.com