
GitHub - databonsai/databonsai: clean & curate your data with LLMs.

Firecrawl
firecrawl.dev
GitHub - elicit/machine-learning-list: A curriculum for learning about foundation models, from scratch to the frontier
github.com
GitHub - mit-han-lab/streaming-llm: Efficient Streaming Language Models with Attention Sinks
github.com
GitHub - virattt/ai-hedge-fund: An AI Hedge Fund Team
github.com
Flora AI
florafauna.ai
Data-Juicer: A One-Stop Data Processing System for Large Language Models
Data-Juicer is a one-stop data processing system to make data higher-quality, juicier, and more digestible for LLMs. This project is being actively updated and maintained, and we will periodically enhance and add more features and data recipes. We welcome you to join us in pro...