jina-ai/reader: Convert any URL to an LLM-friendly input ... - GitHub

github.com

GitHub - alibaba/data-juicer: A one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷 (Providing higher-quality, richer, and more "digestible" data for large language models!)

Lumina: The AI Powered Research Suite

lumina-chat.com

Andrés added

GitHub - gregpr07/browser-use: Make websites accessible for AI agents

github.com

Anton Gorshkov added

GitHub - Unstructured-IO/unstructured: Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

GitHub - ghimiresunil/LLM-PowerHouse-A-Curated-Guide-for-Large-Language-Models-with-Custom-Training-and-Inferencing: LLM-PowerHouse: Unleash LLMs' potential through curated tutorials, best practices, and ready-to-use code for custom training and inferencing.

Inoreader – Build your own newsfeed

inoreader.com

GitHub - CambioML/uniflow-llm-based-pdf-extraction-text-cleaning-data-clustering: LLM-based text extraction from unstructured data like PDFs, Words and HTMLs. Transform and cluster the text into your desired format. Less information loss, more interpretation, and faster...

GitHub - run-llama/llama-hub: A library of data loaders for LLMs made by the community -- to be used with LlamaIndex and/or LangChain