GitHub - Unstructured-IO/unstructured: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
