
GitHub - databonsai/databonsai: clean & curate your data with LLMs.

Cleaning data sucks, so I spent a month automating it using agents and LLMs.
If you'd like to use it or have feedback, let me know down below! https://t.co/FN8DHifMS1
Arav Kumar, x.com

Crawl the web in an LLM-friendly style!
Introducing Crawl4AI 🤖🕷️, a web data crawler that extracts semantically labeled chunks into JSON, along with clean HTML and markdown for RAG, fine-tuning, and AI chatbots.
This open-source tool offers efficient crawling and multi-URL support. ...
Unclecode (Hossein), x.com

I started compiling a list of handy repos to gather data for LLMs: https://t.co/hrHx1Naay6